Dear Doc Man
While I applaud anyone's idea to help others understand how to go about designs, there are only three or four REAL rules regarding standards.
1. Determine if your company has any standards.
2. If so, follow them.
3. When in doubt, see rule 2.
4. When no rules exist, THEN you get to make some choices.
I think that I'm addressing case 4 as my students need rules to give them a quality standard that will get them started. If you do read my rules, however, you'll see that the introduction broadly agrees with the above.
I would add, however, that just as all useful applications always need to be improved this is also true for most rules.
> I have not tried to download your rules because, to be blunt, anyone's rules are wrong sometimes. I work with the government a lot and I can tell you right now, even their best data naming standards are trash most of the time. They make it impossible to program without the ability to touch-type.
If you had read my guidelines though you'd know that they are not just naming standards (though suggested naming standards are included).
> I'll also state categorically that primary keys SHOULD NOT be autonumber if there is ever any chance that they will take on meaning outside of the guarantee of uniqueness.
Thank you for this comment. This is exactly the type of feedback I need.
> Because, you see, autonumbered keys take up space with no relevant meaning. If another key adds meaning, I'll take the meaningful key every time.
This is the advantage referred to in :-
2/ The table will be slightly more compact because it will avoid the need for a meaningless primary key field.
> Let's take a parts inventory in which we have an absolute guarantee that the SKU number is unique. Autonumber? I don't need no steenkin' autonumber. (With apologies to lovers of old Humphrey Bogart movies.). If two different parts come in with the same SKU, send back the parts for corrections to their labels. In this case, the uniqueness of the SKU lets you catch someone else's error.
Yes. The primary key is doubling as a unique constraint in this case. The advantages of using a unique constraint are detailed earlier in this thread as Advantage 2 (which I'll re-state at the end of this message). But I think you've identified a new argument for meaningful primary keys.
"If a unique constraint is required the overhead of an extra index is avoided as the primary key unique index doubles as a unique constraint for the meaningful key". I've added a new advantage for meaningful keys as "3" below.
> Let's take a compound-key situation in which the prime key is a date and, say, a store number. (OK, it's contrived.) If I make that table with an autonumber, I still need indexes on store and date. PLUS the implied index for the autonumber. Waste of space.
Another point that I see falling out of this is that since you cannot create a composite alternate key (only a composite primary key) the task of maintaining a unique constraint composed of two attributes becomes more difficult.
I'll include this one in advantage "3" at the end of this message.
> You use keys when it makes sense to do so. You CHOOSE keys when they support your functions. If the above date/store table is something that is referenced a lot via foreign keys, then OK an autonumber might be useful. But if this table is making the references itself i.e. IT contains a lot of foreign keys and no tables reference it via foreign key, why waste the space? Or the time?
I think you are making three points here.
* Advantage 2 of meaningful keys once more (compactness).
* A new advantage for meaningless primary keys as they will potentially execute lookups faster (in your example).
But I'm not sure that there is a time advantage, would appreciate if you can elaborate upon that.
> Don't get me wrong. Standards exist precisely because without them, folks tend to go off on tangents. OK, worthy cause here. Keeping everyone on the same page. Good thing to do for large teams of programmers.
I completely agree with you here. I'd even say that standards are essential when programming in a team environment. But once again, this isn't just a naming convention but a complete set of quality design rules and this type of standard has another use - the education of people learning to develop applications so that they start their career with a set of "good habits".
> BUT, standards also tend to stifle creativity when they go too far. So my advice to you is, put standards in their proper place. They are great for beginners, great for companies that have lots of staff turnover, lousy for individuals, and often cumbersome when they are extensive enough. Don't forget that when you have externally imposed standards, for a large enough shop you just created a new overhead job called the "standards compliance" officer.
I hate to be regulated in anything but I do realise that in society we'd have anarchy without laws. And we all hate the policeman (until he saves us from something terrible). I've seen lots of terrible databases!
> OK, if you check my reputation here, you'll find that I'm a pragmatist. I've been in the programming business 35+ years. Standards come and go. I can't get worked up about them.
I greatly appreciate you sharing your experience. I find it hard to get worked up about anything these days (except the congestion charge in London)!
So here's the advantages/disadvantages of meaningful/meaningless primary keys. If you (or anybody else) can add to the list I would be most grateful. This may not be a "right and wrong" issue but with the advantages and disadvantages clearly stated it is possible for each to choose their path. This issue seems to be clouded with faith issues though it is great to get some logical mileage out of this discussion.
Here we go then with the advantages/disadvantages so far:-
Arguments for Meaningless keys
========================
1/ A primary key should never change and a key that contains meaningful data will always be subject to potential change. While most RDBMS support cascading updates a major change in the primary key can result in the need for extensive system re-design.
2/ A primary key may never be Null so it will be impossible to add a record if the field chosen for the primary key is not known at the time of data entry. In this respect it is less flexible than an alternate key (uniquely constrained index).
3/ Because the primary key is meaningless it is possible to choose a data type for it that will optimise indexed lookups (eg the Autonumber data type in a Jet database).
Arguments for Meaningful keys
=======================
1/ Reporting is easier as a join is avoided when the only information needed from another table is the value of the foreign key.
2/ The table will be more compact because it will avoid the need for a meaningless primary key field and its index.
3/ It isn't possible to create a composite alternate key (only a composite primary key). For this reason the task of maintaining a unique constraint composed of two attributes becomes more difficult and involves the maintanance of three indexes rather than one.
Any more suggestions for this list would be gratefully appreciated. I intend to publish these advantages/disadvantages in the next version of the standard.
** I'm moving the discussion of meaningless vs meaningful primary keys to: "Theory and practice of Database Design" as I think that's a more appropriate forum for this particular issue. VBA coding related issues can stay here **
Mike Smart