View Full Version : table normalization
geokes 06-24-2003, 04:54 PM My web designer and I (both amateur database architects:)) got into a big fight about the following problem.
We need to store information about computer parts. Each category of parts (Processor, memory, hard disk etc.) has a different set of parameters:
PROCESSOR: core speed, bus speed, cache, packaging
MEMORY: capacity, bus speed, packaging
HARD DISK: capacity, interface type, rotation speed, access time
....
We can't just make one big table with all the parameter fields, because most of them will be empty for each record.
What is the best way to normalize this table?
(We are looking for independent opinions so I'm not posting our solutions yet.)
raskew 06-24-2003, 05:44 PM Just a rough sketch:
tblComputers:
-ComputerID - (PK, Autonumber)
-ComputerName - Text
-ProcessorID – (FK, Long)
-MemoryID – (FK, Long)
-HardDriveID – (FK, Long)
tblProcessors
ProcessorID – (PK, Autonumber)
-CoreSpeed – (Long)
-BusSpeed – (Long)
-Cache – (Long)
etc..
tblMemory
MemoryID – (PK, Autonumber)
-Capacity – (long)
-BusSpeed – (long)
etc..
tblHardDrive
HardDriveID – (PK, Autonumber)
-Capacity – (long)
-InterfaceType – (text)
-Rotation speed – (long)
-AccessTime – (long)
etc..
Now starting relating those, e.g.:
[tblComputers].[ProcessorID] ->[tblProcessors].[ProcessorID]
[tblComputers].[MemoryID] -> [tblMemory].[MemoryID]
[tblComputers].[HardDrive] -> [HardDrive].[HardDriveID]
...and you’ll be cooking!
Pat Hartman 06-25-2003, 12:50 PM Manufacturing systems and order entry systems do not keep separate tables for each end item. So, I don't think that you should either. You can use several generalized text fields that hold similar pieces of data for each type of component. I would use two tables:
tblComputer:
ComputerID (autonumber)
MfgID (foreign key)
MfgModelNum
MfgSerialNum
ShortTextDesc
AssignedToID (foreign key)
PurchaseDate
AssignedDate
tblComponents:
ComponentID (autonumber)
ComputerID (foreign key)
ComponentTypeID (foreign key)
MfgID
MfgModelNum
MfgSerialNum
ShortTextDesc
PurchaseDate
AssignedDate
Speed
Capacity
InerfaceTypeID (foreign key)
etc.
ComponentTypes would be - internal hard drive, laser printer, modem, external hard drive, etc.
InterfaceTypes would be - USB, Serial, Parallel, etc.
raskew 06-25-2003, 04:44 PM Pat-
I'll bow to your expertise with manufacturing systems, but think you may be suggesting a spreadsheet. For example, you list Speed as a field, but looking at the poster's example and my example, under tblProcessors you have two versions of Speed (CoreSpeed & BusSpeed) and under tblHardDrive, RotationSpeed.
These are all apples and oranges. In order for a Speed field to make sense, it's going to need an identifier which, in your example, equates to additional columns.
What do you think?
Bob
Pat Hartman 06-26-2003, 02:00 PM I think the question should be, "what is the user going to do with the data?". If it is simply for documentation, generic fields will work fine. Obviously there is no comparison of Bus speed to rotation speed. The idea is to examine the data that needs to be stored and determine how many columns need to be used and to give them meaningful generic names. Another alternative, one that is closer to the normalized ideal, is to treat all the attributes of a component as the many side of a one-to-many relationship. In that case you would have an attributes table that would contain only four fields -
AttributeID (autonumber primary key)
ComponentID (foreign key)
AttributeTypeID (foreign key)
AttributeValue (text description of the attribute)
The fully normalized structure allows you to define new AttributeTypeID's for each new component. There is certainly justification for this type of structure. New devices are constantly being released and who knows what attributes they will have. This structure will allow them to be defined without any programming changes.
geokes 06-27-2003, 04:51 AM Thanks for all the ideas. The data will be used on a website to sell computer parts as well as whole systems. A computer system as a whole is a composit part, incorporating other parts.
Parameters are neccessary to display as info on the details page and to check compatibility (BusSpeed of CPU should match Bus Speed of Motherboard and RAM).
What do you think about the table/subtable solution:
tblParts:
-PartID - (PK, Autonumber)
-PartName - Text
-PartDescription – Text
-PartPrice – Text
tblProcessors - (PK, LookUp from tblParts)
CoreSpeed - Number
BusSpeed - Number
Cache - Number
etc.
tblHardDrive
tblProcessors - (PK, LookUp from tblParts)
Capacity - Number
InterfaceType - Number
AccessTime - Number
etc.
All the subtables have one-to-one relationship to tblParts.
geokes 06-27-2003, 05:04 AM An additional consideration. How to store the part list for a computer. My idea was to create a many-to-many relationship of tblParts to itself through a linking table:
tblComputers:
PartID
ComponentID
Example:
tblParts:
PartID PartName …
1 P4 2.4
2 P4 3.0
3 256MB
4 512MB
5 20GB
6 40GB
7 Workstation
8 Server
tblComputers
PartID ComponentID
7 1
7 3
7 6
...
8 2
8 4
8 6
...
Pat Hartman 06-27-2003, 12:55 PM You are making a great deal of work for yourself plus causing serious on-going maintenance requirements. As soon as a new type of component is developed you (or your successor) will have a choice, corrupt one of the existing tables so you can add the new component or add a new table plus the cooresponding forms/webpages/queries, etc. What do you think is going to happen when you are told that the item needs to be added to your web page immediately?
|
|