Is a new table needed? Advice needed

sueviolet · Sep 9, 2002

I have a table that I'm not really happy with. At the moment it contains fields both of totals and averages

Example: % fish and average fish

Do you think a should just create 2 tables, one for totals and one for averages? Or is quite fine the way it is?

Just trying to think of any confusion or problems that might arise in the future.

Thanks

Rich · Sep 10, 2002

If these are calculated values then they shouldn't be stored at all

sueviolet · Sep 10, 2002

Thanks Rich

Can you please tell me why?

(I didn't design this database - always hesitent to change someone elses work)

The_Doc_Man · Sep 11, 2002

In general, if you can recompute some number only from data found in a given record (non-totals query) or a given aggregate (totals query), you do not store the computed value. This is because you save LOTS of space as the tables get larger.

For every computation that is in the query, you can dump one "computed" field in the underlying table. If the table has a lot of records, that adds up fast.

In general, it is a bad idea to do ANY kind of stored aggregate value because of the ability to go back and edit the items in the record. What would you do if someone came back to that record and "cooked the books" so to speak? In other works, they gave you a count of 100 fish and 51 of them were Yellowfin Tuna and then they EDIT a percentage of 40% for the tuna catch. You would never know something was wrong unless you had some other routine to do sanity checks behind the scenes. If you never store the computed ratio, you can't have it manually edited out from under you.

Now, the other side of this coin is a matter of practicality. Modern computers are so fast that they can outrun the fastest disks we have. So in the time between each disk seek, they might as well take some time to recompute totals, percentages, and the like.

Actually, there is a normalization rule that applies to this case as well. Suppose the table is a record of fish caught in a fleet of fishing boats.

FishCatch
-- date (PK part 1)
-- boatID (PK part 2)
-- tuna
-- carp
-- trout
-- mahimahi
-- grouper
-- eel
-- coelocanth
-- totalfish (= computed sum of all named categories)
-- percenttuna ( = computed ratio of tuna count / totalfish )

(This isn't a good structure for MANY reasons, but it will do for the point I wish to make...)

In this table, the field 'percenttuna' is an entry that does not entirely depend on the primary key. Ditto 'totalfish'. These two fields depend on the date, the boat, AND the values in the other fields. The normalization rule says that anything not dependent on the primary key doesn't belong in that table. It belongs elsewhere. In this case, you could do it in a query that did the computation for you. Which means that, if the query isn't active, the data in question is NOWHERE.

Now, the CORRECT way to do the above is

Table FishCatch
-- BoatID (PK part 1)
-- Date (PK part 2)
-- FishID (PK part 3)
-- FishCount

Then do a sum grouping by boatID and date across all fish types to get fish totals. Compute your percentages based on the count for a particular type and the total you computed from the other query.

The idea of using a query is also good because both tables and SELECT queries (with or without computations) simply are ways to provide record streams. They are almost identically equal - except that queries can do things that tables really can't.

sueviolet · Sep 11, 2002

Thanks for the examples and taking the time to answer - much appreciated

Is a new table needed? Advice needed

sueviolet

Registered User.

Rich

Guest

sueviolet

Registered User.

The_Doc_Man

Immoderate Moderator

sueviolet

Registered User.

Similar threads

Users who are viewing this thread