Design - Best Practices

Adrianna · Sep 23, 2002

I'm getting ready to head a pretty large database consolidation effort. I don't know much about data warehousing but I'm about to have to learn.

Anyway, I have many many spreadsheets and databases that need to be consolidated to eliminate user redundancy. Many of the tables and spreadsheets share a hand full of fields, but I don't want to consolidate everything into one huge table because I'm worried about speed, security, and other issues. The main table already contains 109 fields, so I was going to create 1 to one tables linking the common data (which I use as primary keys). A few of the tables also have one to many tables attached to them.

So, I'm wondering how one would go about approaching such a huge project. I've ready a few posting that express the negative side of one-to-one relationships, and I'm wondering if there is another way to handle the data, if someone has a good example, or if someone has tips that they might want to share!

Thanks!

RichMorrison · Sep 23, 2002

You wrote
<<
The main table already contains 109 fields, so I was going to create 1 to one tables linking the common data (which I use as primary keys).
>>

This statement fills me with awe and horror.

If a table represents an entity, then what type of entity has 100+ attributes that are of business interest ?

That's a rhetorical questions. The answer is "ain't none"

IMO, you have to start from ground zero with a business oriented data model. The table and relationships come later.

RichM

The_Doc_Man · Sep 23, 2002

First, this kind of design is important to get as close to right as possible - but in Access, you can make minor mistakes and still recover. So don't obsess too much over the problem.

Second, the idea of having several one-to-one tables makes sense in certain ways, but you require specific conditions for it to make sense.

- The purists would tell you not to do this at all because, if all 109 fields are truly related to the prime key according to normalization rules, then they should all be kept together. However, I tend to be somewhat of a pragmatist.

- You would split a table if some fields in the big (109 field) table are almost NEVER used. OR if some group of fields is frequently blank. (You might also hear this data layout called "sparse.") If the ratio of populated fields to total records is less than about 50% as a first approximation, it often pays to accumulate the sparse fields in a separate one-to-one table. Particularly if you rarely reference the sparse fields for typical reports.

- Splitting one table into two or more tables that maintain a one-to-one relationship adds a little extra storage because the prime keys have to be replicated. BUT - if the split doesn't produce too many very short tables, the cost of the replication is not that much. And it can be offset by the speed improvements brought about when processing sub-set tables. Plus space improvements. The extra primary key space can be offset by the space savings in the number of records that are mostly either empty or hold pointers to null strings. This is because even a null string takes up space in a record. Namely, the space of the string descriptors that reveal the string to be null. So if you save space on empty records, you are saving real space.

- If you choose to do this, make the "sparse" table as though it were the "many" side of the relationship. Select a JOIN type where you take ALL the records of the main table and only the records of the secondary table if they match. In other words, treat the sparse side of the relationship as a potential "many" even if it is never more than one-to-one.

- Speed improvements come about because if your main table has a lot fewer than 109 fields, more of it will fit inside a 2Kbyte buffer at once.. And that 2Kb is the apparent buffer size used by Access when scanning tables.

- You would NEVER split out fields that are referenced by more than 50% of the queries, forms, and reports you have designed. The more you use them, the less likely you are to move them aside to a secondary table.

- Since a one-to-one relationship provides a pretty much unequivocal joining, queries based on such situations won't take that much more time to be generated.

Third, one of these one-to-one tables must always be the base of joins to your real "many to one" situations. ONLY one table should be the basis for all other joins. The tables that are sparse should NEVER be the basis for a join if the main table makes more sense as that base.

These represent some thoughts on the subject that you suggested might apply to you. They are opinions to be taken as such, and more than slightly involved with a person philosophy that sometimes avoids doing the "scrupulously right" thing when a more practical but technically questionable thing is available.

Be aware that many viewpoints exist on this subject and that not all folks are equally receptive to my ideas on when you can relax the "stringently correct" rules to get things done.

Adrianna · Sep 23, 2002

The_Doc_Man

Wow, that was great. I'd actually considered a lot of your point before, but I wasn't sure if they were considered poor practice, of if they were okay as long as they met by certain standards. You've helped to clear up many of my questions. I do have a primary table that contains my "core data", a "sparse" one-to-one relationship for data that is only run on certian [systems], and a many-to-one table gathering data about the [locations] of the [systems]. I was having difficulty getting the one-to-one table to function correctly without having to manually enter the new [locations] into the table. That's when I was tasked with relating even more data into this database.

The idea is that the [systems] link everything, then accreditation is run (many fields to track), then architecture is tracked, the location information (one-to-many), federal regulation compliance, and an budgeting. So, I'm trying to figure out how to interconnect this information (avoiding one HUGE table) inorder to maintain required security, speed, and relationships.

So, althought some of the data is sparse at the moment, it will eventually be converted into SQL 2000, provided a cold fussion or APS.NET front, and hopefully data will be provided for all aspects of the systems

So....looks like I have a huge task before me.

I really appreciate your advice!

The_Doc_Man · Sep 24, 2002

In reading my prior post, I see I explained something a bit weakly.

You would use a linkage that looks like a "one-to-many' rather than a "one-to-one" because "one-to-many" is technically incorrect as a description, even though it is an industry-standard term. It is really "one-to-{something-other-than-one}", which includes "one-to-nothing" - the "one-to-one" match would exclude records from the primary if the sparse table had no data, but that isn't what you always want. You can use "isnull" or the Nz() function to handle fields from a non-existing element of the sparse portion of your data.

Design - Best Practices

Adrianna

Registered User.

RichMorrison

Registered User.

The_Doc_Man

Immoderate Moderator

Adrianna

Registered User.

The_Doc_Man

Immoderate Moderator

Similar threads

Users who are viewing this thread