When to Split into Multiple Tables?

eyal8r

Registered User.
Local time
Yesterday, 19:14
Joined
Aug 30, 2007
Messages
76
How do I know when I should split a db into multiple tables? Right now- everything is in one table. Is there a rule of thumb or method to determine when I should split it up?

For example- MY DB is running queries on properties for sale. Would I split all the Active, Sold, Expired, Off Market listings into separate tables, since I will be frequently queriing based off of the status of the property? ie- I'll be running avg prices of homes that are in each of these status. If that's the case, how do I know what relationship to give it/how to set it up (use the MLS # as the primary key in each, or do I need a foreign key or something else?)

Is there a general rule of thumb, or guide to use when designing a db?

THANKS!
 
Do a search on the subject of normalizing tables. This should help you understand when a table needs to be split. There are several 'degrees' of normalization. You should probably be aiming for what's called '3rd normal form' for use with access.

As a starting primer, there's a link to a pdf tutorial on the topic on the thread http://www.access-programmers.co.uk...00211&highlight=tutorial+normalization&page=2 by Kraj.
 
When you are finished reading up on normalization, consider queries as your friend.

Would I split all the Active, Sold, Expired, Off Market listings into separate tables,

Depends. If those are mutually exclusive conditions of a single property, then NO do not split. Instead have a status field that uses A, S, E, or O (for example) to indicate the status of the property. Then you can use queries to select only those properties that meet your requirements. Queries save you LOTS of time and trouble. Splitting a table when you really needed to is bad enough. Doing it when you didn't need to is even worse because of the data integrity issues implied by such actions.
 
ok- read that, and have a better feel for it now. So- seems like to me-

Status (Active, Sold, Etc) does NOT need it's own table- I'll just include that in nearly every query I run. However- Realtor's info (name, code, phone #, etc) DOES need it's own table- especially since a realtor can represent multiple properties, in multiple status at any given time. And since MLS# is the primary key (never a repeat- is unique to each property, no matter what the status or price, etc), property address and attributes (stories, size, bed/bath, pool, fireplace, etc) can all stay in one table. Right? So, I'll always use MLS# as a primary key for all the tables.

Additionally- Subdivision may need it's own table, because it can fall within multiple grids/areas at the same time- especially when the user enters in the wrong info ;)

Am I on the right track here?
 
So, I'll always use MLS# as a primary key for all the tables

I'm not sure what 'MLS' represents in the real world so it's hard for me to comment. I would doubt that it would be the primary key for all the tables. It may be the key field that links many of the tables but each table should have their own primary key.

For example, the primary key of your Realtor table might be RealtorID (autonumber). The primary key of your PropertyForSale Table might be PropertyForSaleID (autonumber). You might also need a junction table if more than one realtor is involved in a transaction for a property. This PropertyTransaction table would have its own primary key (PropertyTransactionID (autonumber) and would have a fields to contains the foreign keys from the Realtor and PropertyForSale tables.

Similarly, you might also need a junction table to show which Realtor(s) is/are representing certain property(ies).

It all depends on exactly what you're wanting to track and how you plan to use the data. My rule of thumb is to try to identify all the real world 'entities' and give them a table of their own. In your case entities might include 'Realtors', 'Properties', 'Customers'. Then you figure out how those tables relate to each other and use foreign keys to link them ( or Junction tables if many to many relationships are involved). Some of the junction tables might represent the things you are wanting to track (e.g., transactions, showings, etc).
 
I wouldn't use the MLS number as the PK. I would use an autonumber and keep the MLS number as a data field. That gives you the flexibility of including properties that do not have MLS numbers.
 

Users who are viewing this thread

Back
Top Bottom