Meaningless vs Meaningful Primary Keys

Mike Smart · Feb 22, 2007

Greetings

We've been discussing this subject in the VBA coding forum and I've moved it here as it seems more appropriate. It was sparked by the publication of a set of design standards for Access applications on my web site www.learnaccessvba.com in which I advocate meaningless primary keys.

This subject has been discussed in the past but generally from a computer-science perspective rather than a real-world business needs perspective. There's no simple Right/Wrong answer to it but a set of advantages and disadvantages to each approach. The discussions so far have yielded the following specific advantages and disadvantages and the purpose of this thread is to discover more. Here's the presently identified pros and cons of each approach:

Arguments for Meaningless keys
========================
1/ A primary key should never change and a key that contains meaningful data will always be subject to potential change. While most RDBMS support cascading updates a major change in the primary key can result in the need for extensive system re-design.
2/ A primary key may never be Null so it will be impossible to add a record if the field chosen for the primary key is not known at the time of data entry. In this respect it is less flexible than an alternate key (uniquely constrained index).
3/ Because the primary key is meaningless it is possible to choose a data type for it that will optimise indexed lookups (eg the Autonumber data type in a Jet database).
4/ Business needs may dictate that meaninful data that was unique when the system was designed must change its status to non-unique at a later date. This can result in the need for extensive system re-design if meaningful keys have been implemented.

Arguments for Meaningful keys
=======================
1/ Reporting is easier as a join is avoided when the only information needed from another table is the value of the foreign key.
2/ The table will be more compact because it will avoid the need for a meaningless primary key field and its index.
3/ It isn't possible to create a composite alternate key (only a composite primary key). For this reason the task of maintaining a unique constraint composed of two attributes becomes more difficult and involves the maintanance of three indexes rather than one.

I would like carefully considered and logical argument for any new advantages/disadvantages. The advantages/disadvantages should be within the scope of real-world business applications.

I really appreciate any feedback provided and the findings will be written into the next edition of the quality standard which will be published on my web site.

Thanks in advance!

KKilfoil · Feb 26, 2007

Proposed argument FOR meaningless key:

4) Sometimes, it is impossible/difficult to find a meaningful key that is GUARANTEED to be unique. I have encountered several cases where an assumed-to-be-unique key turned out out to be non-unique when more data was merged/linked in from other sources, completely messing up an otherwise good design. I suspect that this will be quite common in dbs that need to track the history of stuff.

5) Meaningless keys remove the temptation to use the PK field to contain multiple non-atomic fragments: "Since we have to have it, lets combine this concept and that concept and perhaps the first three characters of a third concept to 'guarantee' (see above) uniqueness." Violates normalization rules and/or duplication of data, but I have seen it done repeatedly.

Mike Smart · Feb 26, 2007

KKilfoil said:
Proposed argument FOR meaningless key:

> 4) Sometimes, it is impossible/difficult to find a meaningful key that is GUARANTEED to be unique. I have encountered several cases where an assumed-to-be-unique key turned out out to be non-unique when more data was merged/linked in from other sources, completely messing up an otherwise good design. I suspect that this will be quite common in dbs that need to track the history of stuff.

You are absolutely correct. Today's "unique" key is tomorrow's problem. I worked on a £40 Million + project once and when I was asked to comment on the database found that the primary key of the customer table was FirstName + LastName + DateOfBirth... really, this is not a joke! When I challenged the developer he said "well, maybe once in a million times it could be a problem" ... indeed! Great point.

> 5) Meaningless keys remove the temptation to use the PK field to contain multiple non-atomic fragments: "Since we have to have it, lets combine this concept and that concept and perhaps the first three characters of a third concept to 'guarantee' (see above) uniqueness." Violates normalization rules and/or duplication of data, but I have seen it done repeatedly.

Well, really that is a normalisation error but a very common one. In my classes I often tell my students (who usually already have a sick database) that a common problem is the "meaningful part number" where different bits of the number mean different things and this breaks atomicity and always eventually causes problems. You can see the light bulbs going on in their heads. This is one of the most common database design errors. I'm not sure it properly counts as an argument for meaningless keys though as it is only a "temptation" so I won't add it to the list but still an excellent point.

Thanks for your feedback.

MarkK · Feb 27, 2007

Mike Smart said:
I would like carefully considered and logical argument for any new advantages/disadvantages. The advantages/disadvantages should be within the scope of real-world business applications.

I like this topic to the extent that it illustrates important principles, but it also leaves me feeling like you're selling something. So we discuss on this web site so you can refine and publish on your web site?
Are you bringing something here, or taking something away?
Mark

boblarson · Feb 27, 2007

Also, as mentioned in this post by Jon, the site admin, your link in your signature is actually not allowed by this forum. It would be appreciated if you would follow the rules and remove it from your signature.

http://www.access-programmers.co.uk/forums/showthread.php?t=68868

Jon said:
No links in signature tags

--------------------------------------------------------------------------------

Please note that we do not allow links in signature tags advertising your own site. We respectly request you to follow this rule.

__________________
Regards,

Jon

The_Doc_Man · Mar 1, 2007

Argument FOR meaningful keys

Where appropriate to the nature of the table, there is a more intuitive "look and feel," when the key is capable of having meaning, to use a meaningful key.

For instance, part order numbers in a complex inventory. (Not your suppliers' numbers - YOUR numbers as would be printed in your catalog.)

I happen to have a minor quibble with a comment above, and it is only minor...

Meaningless keys remove the temptation to use the PK field to contain multiple non-atomic fragments

Well, really that is a normalisation error

My quibble is this: Compounded prime keys that are segmented can still be valid candidate keys even when non-atomic. It is a normalization error if and only if the compound prime key should not have been considered as a candidate key in the first place. In the example from KKilFoil, the theme was that by randomly selecting extra fields, the result becomes unique (usually by accident.)

I agree that when such a thing happens, it is always bad (because the inclusion of something else "just to make it unique" violates the rule about the rest of the record depending on the entire prime key, whatever it is. If the elements of the prime key include something that should have been treated as a dependent element, it should not have been a candidate key or part of a candidate key.

The distinction is that the error is the composition of the key, not in the fact of its being non-atomic.

Case in point, and it makes sense... my alma mater course catalogs have class names such as CHEM 101, PHYS 230, MATH 1, ENGL 1A, etc etc

There is at least an argument (admittedly not perfect) that having a compound key consisting of department code and course designator would make a good key since - as pointed out in other discussion - you don't need to look up anything when the key is also meaningful. So, for example, when printing out a student's course load, you just have a list of courses by department and number. No lookups.

Obviously, past a certain point you beat the dead horse, but I contend that making a database too abstract makes it too hard to maintain. My reason for this is that databases, like programs, have to be maintained by people. People want to be able to remember what something means. This is why you have mnemonic variable names in programs. A similar concept applies to the database field names and keys. Divorce yourself too far from the meaning of your data and you have not done ANYONE a favor.

Mike Smart · Mar 2, 2007

Dear Doc Man

Thank you for your interesting and well presented post.

The_Doc_Man said:
Argument FOR meaningful keys
Where appropriate to the nature of the table, there is a more intuitive "look and feel," when the key is capable of having meaning, to use a meaningful key. For instance, part order numbers in a complex inventory. (Not your suppliers' numbers - YOUR numbers as would be printed in your catalog.)

If primary keys are meaningless they are always invisible to the user of the system. In this context there is no look or feel. I understand that your comment relates to the presentation to the user of the data contained within a (meaningful) primary key. The meaningless argument maintains that the key is merely an internal system mechanism within a database with the sole purpose of enforcing referential integrity. In the meaningless paradigm the actual data will be presented (and look and feel) to the user in a manner that is identical to the meaningful paradigm. The only difference is that it is no fulfilling the dual-purpose of data store and primary key.

If I have misunderstood your point I'd be grateful if you would expand upon this.

Case in point, and it makes sense... my alma mater course catalogs have class names such as CHEM 101, PHYS 230, MATH 1, ENGL 1A, etc etc

There is at least an argument (admittedly not perfect) that having a compound key consisting of department code and course designator would make a good key since - as pointed out in other discussion - you don't need to look up anything when the key is also meaningful. So, for example, when printing out a student's course load, you just have a list of courses by department and number. No lookups.

I'm assuming that this is a re-statement of argument (1) for meaningful keys stated in the first post. If you feel that the point is not adequately summarised in argument(1) please let me know.

I think that we are in agreement about the importance of atomicity of data. In your example above you would advocate splitting the two data elements DepartmentCode and CourseDesignator, rather than break normal form, and I'd completely agree with you.

Obviously, past a certain point you beat the dead horse, but I contend that making a database too abstract makes it too hard to maintain. My reason for this is that databases, like programs, have to be maintained by people. People want to be able to remember what something means. This is why you have mnemonic variable names in programs. A similar concept applies to the database field names and keys. Divorce yourself too far from the meaning of your data and you have not done ANYONE a favor.

The Douglas Bader quotation at the beginning of "The Rules" document is that "rules are for the guidance of wise men and the obedience of fools". Sometimes there is a case for de-normalisation and one should never be afraid to break the rules (provided that the need to break the rules can be logically defended). I'm in total agreement with you on this.

The argument for meaningless keys, however, is that they do not form a part of the meaningful data contained within the table, they exist merely as a system mechanism to enforce referential integrigy. In the meaningless paradigm the information system maintains exactly the same degree of readability with meaningless keys as it does with meaningful ones, the only difference is that the meaningful data does not fulfil the dual role of primary key and information store.

I can't see a new argument in what you have described for meaningful keys that is not covered by the three already defined but if you disagree I'd appreciate your wording of the new advantage so that I can more easily understand it.

Best Regards

The_Doc_Man · Mar 2, 2007

My biggest objection to meaningless keys is that when a meaningful candidate key exists, you are reinventing wheels. When no valid candidate key exists, a meaningless but unique key is indeed a good solution.

If a meaningless key exists and always remains hidden, you can "fake" a look and feel based on an apparent (but not actual) key field. But since I am first and foremost a pragmatist, I will ALWAYS look for a meaningful candidate key before assuming that I should start from a meaningless key. Part of that comes from the concept that any significant database is a model of your business actions. If you track something in your business on paper, you have identified a candidate key for your database. (Your model HAD to start somewhere, after all...)

Let's consider, for example, a simple supplier table. Here, in MOST cases, you will probably have proper grounds for a meaningless key since suppliers rarely have inherently unique identifying numbers or codes. A number assigned chronologically works as well as any other.

BUT suppliers have catalogs with unique product numbers - so a compound key consisting of the meaningless supplier number plus the meaningful product number could be used for a combined "available products" table. I don't think we disagree conceptually, but I see you as hard-pedaling the meaningless key concept when it is merely one more tool in the arsenal.

One problem I have with meaningless keys is visual validation. If the key has no meaning, I can't sight-validate it. Where the key has meaning, I can look at the key plus other fields for the record and detect inconsistencies. Again, I guess this is a pragmatist approach, but validation is impossible when dealing with meaningless keys - because they have no inherent meaning against which to be validated. And to me, THAT is a very important tool in the arsenal.

The argument for meaningless keys, however, is that they do not form a part of the meaningful data contained within the table, they exist merely as a system mechanism to enforce referential integrigy.

I don't care that the meaningless key isn't part of the meaningful data in the table. I care that it is part of the table. A future maintainer will hopefully know that it is meaningless. But when I look at a lot of the kids coming out of schools these days, I'm not making any bets. (Sorry if that sounds a bit harsh to anyone, but I must be true to myself first, and it is an honest opinion.) Again, that is part of my argument about meaning, and I'm pedaling that one as hard as you are pedaling the one about meaningless keys.

I have taught database courses in the past (none recently) and I have made it a point to suggest meaningful keys first but UNIQUE keys always. So I don't think we are that far apart, we just emphasize differently. vive le' difference!

KKilfoil · Mar 2, 2007

In my opinion, a truely unique meaningful (MF) key might be better than a meaningless (ML) key, unless performance requirements dictate limiting the PK to an autonumber-sized entity.

However, finding a truely-unique, under all possible foreseeable circumstances, MF can be difficult, and inexperienced developers are prone to misidentify a MF's uniqueness, leading to horrible problems later.

The 'safe' choice is the ML, such as the autonumber, since it is maintained as unique by JET's rules.

So, for me, examples of MF that I would be willing to use are Social Insurance Numbers (I'm Canadian), OTHER databases' tables' ML numbers (for distributed databases, if you want to call that meaningful), and internal 'catalog numbers' that are 100% guaranteed to remain unique under all forseeable circumstances.

So, in nearly every instance, I tend towards meaningless numbers.

Case in point, and it makes sense... my alma mater course catalogs have class names such as CHEM 101, PHYS 230, MATH 1, ENGL 1A, etc etc

There is at least an argument (admittedly not perfect) that having a compound key consisting of department code and course designator would make a good key since - as pointed out in other discussion - you don't need to look up anything when the key is also meaningful. So, for example, when printing out a student's course load, you just have a list of courses by department and number. No lookups.

In the above example, I may have asked myself : Is there ANY chance that the college (uni?) might re-use an old course number? Is there any chance that this database might survive a merger with another college, or at least its' data? Do I feel like managing a multi-field key (as I presume Doc_Man was suggesting [Department] and [CourseNum] as two separate fields)?

I suspect that I would have still applied an autonumber PK to this table. Of course, your milage may vary.

The_Doc_Man · Mar 3, 2007

Do I feel like managing a multi-field key

{best Clint Eastwood/Dirty Harry impersonation} Well, do ya, punk? {/impersonation}

Seriously, we are missing the point. When there is a natural key, it almost manages itself.

Also, the college in question had a rule that once a course was named, it could not be changed in description because of historical accreditation and credit transferability. Be that as it may, the point was that the combination of department and course number was a "natural" element of the business model and presented itself as a candidate key.

Also remember, I ALWAYS come at this (well, almost always) from a pragmatic point of view, not the purist point of view. I will denormalize a table at the drop of a hat if it means that my performance gets better by a lot.

I guess to me there is a bright line in the sand: the existence of a natural, valid candidate key with meaning. No natural key? Autonumber in a heartbeat. Natural key? Use it unless it is too long to be efficient OR if it is so non-uniform in nature that, though unique, it is unmanageable.

Mike Smart · Mar 3, 2007

So, for me, examples of MF that I would be willing to use are Social Insurance Numbers (I'm Canadian), OTHER databases' tables' ML numbers (for distributed databases, if you want to call that meaningful), and internal 'catalog numbers' that are 100% guaranteed to remain unique under all forseeable circumstances.

So, in nearly every instance, I tend towards meaningless numbers.

Hi Kilfoil

I completely agree with you and the points you make. The key words in your statement above are "forseeable circumstances".

I thought you'd enjoy the following anecdote showing how some very clever people (who had even been to university)! managed to get things very, very wrong over and over again:

When I was about five years old we got our first telephone in a place called Huyton that had a population of around 50,000. The telephone number was Huyton 2835. The phone people knew that it was inconceivable that more than ten thousand people in our town would ever own a telephone - that would be more than one telephone for every two households... clearly ridiculous. The number was thus 100% guaranteed to never change.

A few years later they had to change the whole numbering system. The impossible had happened and almost everybody wanted a telephone. Huyton became area 489 so the number changed to 489-2835. Now there were up to ten thousand areas in the UK - each having ten thousand possible numbers. That's ten million combinations - surely future proof. At last we had a number that was 100% guaranteed to never change.

A few years later it became clear that even that wasn't going to work for much longer. The unthinkable had happened again and the country was running out of numbers. So they created a new prefix called City code. The Liverpool area now became code 051 so our number changed to 051-489-2835. The clever telephone men had finally got it right!

But the clever men were wrong again! Somebody invented the mobile telephone and even ten digits wasn't enough anymore. They added a digit to the city prefix and the number became 0151-489-2835. At last we had a system that would last for ever... a system that was 100% guaranteed to never change.

We left the area about this time and were surprised to find that after a few years the telephone number was re-cycled by the telephone company and now it points to somebody elses phone. This became such a problem for people that they invented the "geographically independed phone number" and now I have a number that will stay with me "for life" AND is 100% guaranteed to never change.

I wonder if there will be a sequel to this (true) anecdote or will there be yet another change in the future? One thing is for sure, if I'd designed a database all of those years ago to track telephone numbers we'd still have the same primary key and it would still identify us and all of the phone numbers we'd had during the interim.

... If only all things in life were so reliable!

Mike Smart · Mar 3, 2007

The_Doc_Man said:
{best Clint Eastwood/Dirty Harry impersonation} Well, do ya, punk? {/impersonation}

Seriously, we are missing the point. When there is a natural key, it almost manages itself.

Also, the college in question had a rule that once a course was named, it could not be changed in description because of historical accreditation and credit transferability. Be that as it may, the point was that the combination of department and course number was a "natural" element of the business model and presented itself as a candidate key.

Also remember, I ALWAYS come at this (well, almost always) from a pragmatic point of view, not the purist point of view. I will denormalize a table at the drop of a hat if it means that my performance gets better by a lot.

I guess to me there is a bright line in the sand: the existence of a natural, valid candidate key with meaning. No natural key? Autonumber in a heartbeat. Natural key? Use it unless it is too long to be efficient OR if it is so non-uniform in nature that, though unique, it is unmanageable.

Hi again Doc Man

What I'm trying to establish in this thread are demonstrable advantages to the use of the meaninful/meaningless paradigm. So far we've identified all of the advantages and disadvantages mentioned by many people (in the summary at the beginning of this thread). Once there are no new ideas I intend to put my own weightings upon the advangtages/disadvantages of each approach and logically argue the choice of each weighting. Others can do the same and decide upon their own weightings. There may, at that point, be a compelling case for either approach - I still have an open mind.

The object of the exercise is to identify the best approach from a pragmatic, business-orientated perspective and positively not to consider computer-science type arguments. You can review the advantages/disadvantages at the beginning of this thread (I've editied it to add one new one identified within the thread too) and take a view upon which approach currently seems to provide the most business benefits (the most pragmatic approach).

I'd still like very much to hear anything new to add more benefits to the list either for or against each approach if tangible business benefits can be demonstrated that have not yet been identified.

Thanks for the great input.

Mike Smart · Mar 3, 2007

Hi Doc Man

I thought you'd appreciate a point-by-point examination of the issues you raise:

The_Doc_Man said:
My biggest objection to meaningless keys is that when a meaningful candidate key exists, you are reinventing wheels.

I think this is a re-statement of advantage 2 for meaningful keys?

Part of that comes from the concept that any significant database is a model of your business actions. If you track something in your business on paper, you have identified a candidate key for your database. (Your model HAD to start somewhere, after all...)

A business consists of entities and the relationships between them along with attributes of the entities, of course. But the keys are a component of the relationships and not the data itself. I am having difficulty taking the above statement and mapping it to a direct business benefit in using meaningful keys.

Let's consider, for example, a simple supplier table. Here, in MOST cases, you will probably have proper grounds for a meaningless key since suppliers rarely have inherently unique identifying numbers or codes. A number assigned chronologically works as well as any other. BUT suppliers have catalogs with unique product numbers - so a compound key consisting of the meaningless supplier number plus the meaningful product number could be used for a combined "available products" table.

I think this is a re-statement of advantage 2 and 3 for meaningful keys?

I don't think we disagree conceptually, but I see you as hard-pedaling the meaningless key concept when it is merely one more tool in the arsenal.

I'm still maintaining an open mind upon the best approach until all of the advantages/disadvantages of each approach are clearly identified.

One problem I have with meaningless keys is visual validation. If the key has no meaning, I can't sight-validate it. Where the key has meaning, I can look at the key plus other fields for the record and detect inconsistencies. Again, I guess this is a pragmatist approach, but validation is impossible when dealing with meaningless keys - because they have no inherent meaning against which to be validated. And to me, THAT is a very important tool in the arsenal.

The Primary/Foreign key pairs allow sight-validation (from the schema) of the relationships in the system... but nothing else... in the meaningless paradigm. Depending upon the ER modelling tool in use it may not be apparent which fields are alternate keys (fields having a unique constraint). I use ER-Win and such fields are clearly marked in the schema but the simpler tools in Access doesn't provide this information. You have made a good point (that hadn't ocurred to me) but this advantage simply works-around a limitation of the Access schema tools so it could more be an argument for using professional schema design tools - in other words it is Access toolset specific rather than database design specific. If there was a requirement to use the Access schema design tools AND have visual identification of alternate keys this could be simply provided by an appropriate naming convention (though this isn't something I advocate myself).

I don't care that the meaningless key isn't part of the meaningful data in the table. I care that it is part of the table. A future maintainer will hopefully know that it is meaningless. But when I look at a lot of the kids coming out of schools these days, I'm not making any bets. (Sorry if that sounds a bit harsh to anyone, but I must be true to myself first, and it is an honest opinion.)

I think that this is more an argument for training rather than design.

Again, that is part of my argument about meaning, and I'm pedaling that one as hard as you are pedaling the one about meaningless keys.

Once again, I have an open mind at the moment and am only trying to identify the advantages and disadvantages of each approach before reaching a decision.

I have taught database courses in the past (none recently) and I have made it a point to suggest meaningful keys first but UNIQUE keys always. So I don't think we are that far apart, we just emphasize differently. vive le' difference!

I don't think we disagree at all. Advantage 4 for meaningless keys addresses the possiblity that meaningful data may change its status from unique to non-unique at a future date. If you agree with this it follows that to ensure unique keys meaningful data would always have to be combined with a meaningless key. It is difficult for me to see the business advantage of doing this but I would value any expansion upon these comments that may identify a new benefit of this approach.

The_Doc_Man · Mar 3, 2007

I think I have identified part of my objection here.

I'm still maintaining an open mind upon the best approach

The above statement presumes there is a single best approach. My experience has told me there is not. Every problem is the same. Every problem is different. When you say things like "best" approach you step into a realm that has NEVER worked - the one-size-fits-all theory of programming. And when you do, you become evolutionary deadwood.

I think that this is more an argument for training rather than design.

NO amount of training will prepare a kid for having to wade elbows deep into bad code, bad design, bad thought processes. RELEVANCE will assist the junior programmer given "a big opportunity to shine." When the field names, form names, table keys, etc. have no RELEVANCE to the problem at hand, that kid is going to find that his shine is badly tarnished. And it will take a huge pot-load of work to overcome someone else's non-work. (Sort of a "conservation of information" analog to conservation of energy.)

Thus speaks the voice of personal experience from many years ago when I was that kid given the "chance to shine." Ten days of no sleep and too much caffeine later, I had the problem solved. And resolved to NEVER AGAIN allow that kind of crap to happen to anyone who followed me.

You speak of the issue of unique keys changing format or relevance. Well, at that point you revisit the problem. Because as any good programmer knows, a change in the problem usually requires a change to its solution. I have no doubt that when a key loses relevance or uniqueness, it is time for a change. But at least in my world, you don't always have time to concentrate on a design that will last forever. You get fired or reassigned or at least formally reprimanded for wasting time on something that should be simpler than you made it. And if you have a pointy-haired boss, s/he will NEVER see a meaningless key as valid. This is a cold, hard reality that I have encountered too often personally for it to be just bad luck. When I have gone to conventions, my colleagues report the same phenomena. I just think that you are stepping into an ivory tower. Don't take that as dismissal. Just add it as a sanity-check.

Perhaps we are having that worst of all "religious" discussions - purist vs. pragmatist. I've already made my position known in THAT arena. If so, then we will have to agree to disagree on the relative strengths of the various influences that lead one to choose MF/ML keys.

Mike Smart · Mar 3, 2007

The_Doc_Man said:
The above statement presumes there is a single best approach. My experience has told me there is not. Every problem is the same. Every problem is different. When you say things like "best" approach you step into a realm that has NEVER worked - the one-size-fits-all theory of programming. And when you do, you become evolutionary deadwood.

I think you are agreeing that for every case there is a best approach. Quantifying the pros and cons of each approach will help others to quickly identify the best approach for a given set of circumstances. This is what I hope to achieve in this thread.

NO amount of training will prepare a kid for having to wade elbows deep into bad code, bad design, bad thought processes. RELEVANCE will assist the junior programmer given "a big opportunity to shine." When the field names, form names, table keys, etc. have no RELEVANCE to the problem at hand, that kid is going to find that his shine is badly tarnished. And it will take a huge pot-load of work to overcome someone else's non-work. (Sort of a "conservation of information" analog to conservation of energy.)

This issue we're discussing here is but two lines from the eleven page design standard and naming of all of the elements you mention is completely covered. A database created in line with the recommendations is a delight to work with and very self-documenting. In this thread I'm focusing upon simply the issue of choosing the most efficient primary key.

You speak of the issue of unique keys changing format or relevance. Well, at that point you revisit the problem. Because as any good programmer knows, a change in the problem usually requires a change to its solution. I have no doubt that when a key loses relevance or uniqueness, it is time for a change.

I'm sure you agree that it is good practice to design a system that, as far as possible, will be easily maintainable as unknown future needs must be supported. This is argument 1 and 4 in favour of meaningless keys - avoidance of major system re-design as needs change.

But at least in my world, you don't always have time to concentrate on a design that will last forever. You get fired or reassigned or at least formally reprimanded for wasting time on something that should be simpler than you made it.

In my experience a system underpinned by a properly designed database will be finished faster than one underpinned by a poor database. I often see poorly implemented systems that were originally justified by a "quick and dirty" approach but then wasted months of effort stepping around the design deficiencies.

And if you have a pointy-haired boss, s/he will NEVER see a meaningless key as valid. This is a cold, hard reality that I have encountered too often personally for it to be just bad luck. When I have gone to conventions, my colleagues report the same phenomena.

.

In the design standard I've writen "A very important responsibility of your job as a development professional is to educate and inform (managers) so that this (poor design justified by quick and dirty approach) doesn't happen". I strongly believe this and have managed to successfully educate and inform many managers over the years who have then bought into a quality approach.

Perhaps we are having that worst of all "religious" discussions - purist vs. pragmatist. I've already made my position known in THAT arena. If so, then we will have to agree to disagree on the relative strengths of the various influences that lead one to choose MF/ML keys.

I do hope that you do not think I am a purist! As stated many times this thread focuses upon the business benefits of each approach and not any dogma or computer-science arguments.

Thanks again for your feedback, it is much appreciated.

The_Doc_Man · Mar 5, 2007

Well, as long as we have thoroughly aired out the laundry here so that everyone knows some of the arguments, they can decide for themselves. I'll leave it at that and avoid the temptation to get in a final shot.

KKilfoil · Mar 5, 2007

Advantage FOR meaningful (MF) keys:
(Assuming all of the previously-stated strenuous criteria for selecting a MF key are met)

The resulting db structure should be easier to understand, particularly for people other than the original developer(s). This can be really handy if the db survives for a number of years and then requires maintenance. MF keys are somewhat self-documenting.

This can be especially useful in a novice/learning environment, when you have people learning about normalisation for the first time, as the whole 'PK/FK' concept is somewhat easier to visualize and understand when a MF key is employed.

Mike Smart · Mar 5, 2007

KKilfoil said:
Advantage FOR meaningful (MF) keys:
(Assuming all of the previously-stated strenuous criteria for selecting a MF key are met)

The resulting db structure should be easier to understand, particularly for people other than the original developer(s). This can be really handy if the db survives for a number of years and then requires maintenance. MF keys are somewhat self-documenting.

This can be especially useful in a novice/learning environment, when you have people learning about normalisation for the first time, as the whole 'PK/FK' concept is somewhat easier to visualize and understand when a MF key is employed.

Hi Kilfoil

Whether a schema is easier to understand is subjective. My opinion (as a professional educator) is that it is more difficult to understand a schema with meaningful primary keys ... but as that is subjective too it is arguable. I'll make my argument by considering the following schema (that meets my published design standards, a schema that didn't could well be more difficult to read) :-

Here are the reasons why I feel that the meaningless paradigm schema is more readable (and describable to my students):

1/ It is immediately apparent which are the primary keys. The key is always the field bearing the name of the table + ID.
2/ If is immediately apparent which are the foreign keys. These are the only fields that are not prefixed by the table name, and that also end in "ID".
3/ There is no confusion about where the data is actually stored. In the MF tables a novice may think that EmployeePayrollNumber and CustomerCompanyVATNumber were stored in the InvoiceMF table. Being clear upon where the data actually is makes third normal form principles easier to explain (and understand).

I would find it far harder to teach my classes from the second schema than the first and actually feel that the ML schema is more natural and readable (and far less confusing) than the second.

But this is subjective and I am not offended at all if you disagree with me.

The_Doc_Man · Mar 6, 2007

1/ It is immediately apparent which are the primary keys. The key is always the field bearing the name of the table + ID.

No. YOUR NAMING convention seemingly makes it apparent, but the content of the field that is the actual key could be MF or ML (your nomenclature).

2/ If is immediately apparent which are the foreign keys. These are the only fields that are not prefixed by the table name, and that also end in "ID"

No. Again, your NAMING convention perhaps has this effect. The content of the key has nothing to do with your naming style.

There is no confusion about where the data is actually stored.

Again, no. This is still an issue with NAMING conventions, not contents. All I need to do is note with a prefix that a particular use of a field is prime key or foreign key as pkUserCode or fkUCode or whatever. And by the way, the data are stored where you find them.

There is not a reason in the world why you couldn't use other conventions. But you are confusing the issue between key name and key meaning. And if that was not your intent, Mike, I must tell you that you fell down on this job. Your argument in your response to KKilFoil VERY DEFINITELY confuses the line between key name and key meaning.

I'm not out to get you but if you are going to try to teach this to kids, then I have to toss in my two cents' worth. I'm eventually going to have one of those kids as my apprentice. I don't want to have to fire a kid because you taught him/her based on confusion of name and meaning.

There are valid reasons to choose naming conventions including my own favorite, relevance to the topic. BUT the name of a field is NOT the whole story. Your last post at least SEEMS to indicate that it is so. And there is where I have SERIOUS problems. Just 'tain't that way, Mike. And my 35+ years of experience and authorship of petroleum-related programs used on three continents and 60 sites world-wide are all I have to back me on this.

Don't misunderstand. I respect the teaching profession and applaud you for going that way. Heck, my stepdaughter is an Elementary teacher, grades 4-5 (USA). I wish you good luck in what you do. But if this is an example of what you are doing, you will need more than just luck. You'll need a good, critical proof-reader for your lesson plans.

Mike Smart · Mar 6, 2007

The_Doc_Man said:
No. YOUR NAMING convention seemingly makes it apparent, but the content of the field that is the actual key could be MF or ML (your nomenclature).

No. Again, your NAMING convention perhaps has this effect. The content of the key has nothing to do with your naming style.

Again, no. This is still an issue with NAMING conventions, not contents. All I need to do is note with a prefix that a particular use of a field is prime key or foreign key as pkUserCode or fkUCode or whatever. And by the way, the data are stored where you find them.

You need to read my post again! Especially the line:

"I'll make my argument by considering the following schema (that meets my published design standards, a schema that didn't could well be more difficult to read)"

My design standard is holistic and the naming convention is, indeed, a major contributor to the readability of the sample schemas. The use of meaninful keys would undermine the naming conventions (and has done in the meaningful schema that does not, and cannot meet my design standard) and I did qualify these points appropriately. But this is not the point being discussed - we're focusing merely upon the advantages and disadvantages of meaningless/meaningful keys. The example simply illustrates that wheter either approach is more readable is in the eye of the beholder (iow it is subjective).

I'm not out to get you but if you are going to try to teach this to kids Heck, my stepdaughter is an Elementary teacher, grades 4-5 (USA)... I wish you good luck in what you do.

Do they really teach Access to Elementary children! My courses are all done for adults from large companies. Companies that book my courses are almost always looking for the solutions to problems that are, in the main, caused by universally poor database design.

But we digress... The only important thing to come from the previous post is that whether either database is more readable than the other is subjective, in other words, on this single point at least, everybody is right!

And I don't think you are out to get me! I've greatly enjoyed reading your posts.

Meaningless vs Meaningful Primary Keys

Registered User.

Registered User.

Registered User.

bit cruncher

Smeghead

Immoderate Moderator

Registered User.

Immoderate Moderator

Registered User.

Immoderate Moderator

Registered User.

Registered User.

Registered User.

Immoderate Moderator

Registered User.

Immoderate Moderator

Registered User.

Registered User.

Immoderate Moderator

Registered User.

Similar threads

Users who are viewing this thread