Text as Primary Key vs Auto Number

Access Autonumber in Increment mode is notorious for reseeding when a whole record is inserted. This can be worked around but it does expose a weakness in Autonumber. We do not have absolute control over the mechanisms controlling it.

Galaxiom

Are you suggesting that the insertion of a new record or should I say records can cause the AutoNumber to go out of control when this is done pragmatically.

Can you clarify this just a little.
 
Button

I thought the topic was Text V Auto Number as a Primary Key. I always use AutoNumber and that is my choice. The reason I do so is based upon my experiences with other Access Databases where Natural Keys were the preference. In many cases a second and third field was added and then as a last resort an AutoNumber was added to form a Composite Key in order to gain some form of uniqueness. Can you imagine using 4 or more fields in order to create a unique reference. I have seen it, worked with it, and hate it to bits. If that is what you propose then that is up to you.

So how do you expect your end users to identify the information in such a table? Your auto-number key won't help them uniquely identify anything because you've already told us you intend to hide it from them. Yet you don't appear to be suggesting any alternative. That's why I'm saying your point is irrelevant to the discussion. By introducing an arbitrary and therefore irrelevant surrogate key you aren't addressing the critically important issue of unique identification.

If a table lacks any natural key at all then users are left in the abysmal situation of working with duplicate data. Many of us are familiar with the messes created by database systems that don't support data integrity. Even when you use a surrogate key a natural key is still just as important, so introducing a surrogate doesn't in any way relieve the database designer from the responsibility of identifying and implementing the natural key for that table - i.e. you would still need to create key constraints and indexes for the natural key as well as the surrogate.
 
Well, I basically agree with Rainlover.

If there is a need for some other candidate data within that table to be functionally unique, and in my experience it tends to be a compound series of columns, then I constrain it with a unique index. I have no problem with doing this nor see any real downside to it. It's not unknown for these indexes to require tweaking from time to time as assumptions or the underlying data change and in some cases I have multiple unique indexes on a table, they're rare, but they do sometimes occur.


Users don't have to worry about duplicate data because the application handles it for them and where duplicates should not be able to occur the relevant indexes or constraints are applied to the table and ensure that it isn't an issue.

Ultimately I have no real issue with natural Primary keys, but I do think that surrogate values should always be used as foreign keys. It simplifies the design as they'll never be modified and it makes the foreign key explicit.

All my tables use a tablenameId column [as primary keys] and all foreign Key columns are named for the primary key to which they refer.

e.g. An Account table has the following columns:
AccountId
CustomerId
Date
SortCode
AccountNumber
Balance
CompanyId

Knowing nothing else about the database I'd suggest that you can infer a lot about the immediate structure of that database based on those columns and that one sentence description.

If you think that you shouldn't be able to add the same date, sortcode and Accountnumber combination to the table, try it and see what happens.

The surrogate keys from a user point of view are meaningless, they exist purely to simplify maintenance of the database and improve performance. I don't care if they're sequential, have gaps etc and, like Rainlover, I rarely expose my users to these columns, certainly not in the application layer so that there is no chance that someone will try to assign meaning to them.
 
tehNellie,

Well said. I agree with what you have said. For the reasons you have given I agree that the choice of "primary" key is essentially irrelevant and uninteresting. What really matter are the natural keys and how they are used. I.e. how is information in the database uniquely identifiable by users of that database? Are the natural keys an accurate representation of the reality the database is supposed to model?

As you point out, the choice of foreign keys is important too. Foreign keys don't have to reference the primary key of a table, nor do they necessarily have to reference a natural key. It is not implicit that using a surrogate key always improves performance however. The overuse of surrogates can have a disastrous effect on performance. Surrogates don't necessarily simplify maintenance either because they increase the number of constraints and indexes that have to be managed (because as you rightly say, the natural key usually still has to be indexed as well). Surrogates also introduce certain unique problems and complexities: e.g. the extra logic and processor cycles necessary to lookup the natural key from the surrogate or vice versa.

I'm not opposed to the use of surrogates. I just think they should be used prudently and selectively. I see no value at all in being dogmatic about "always" using a surrogate. Do what is right for each particular situation.
 
The only thing I would add (regarding tehNellie's comments) is that if the PK is going to be used elsewhere as an FK, then DEFINITELY I would make the PK a surrogate in order to simplify the implementation of Access requirements for parent/child table relationships. I.e. you must have a PK on a parent in order to link a child to its parent.

I still have no objections to using natural keys when available and appropriate i.e. stating the obvious, if it ain't a candidate key based on the rules for selecting candidate keys, don't try to make it one. I have to bang on stuff to make it fit all too often anyway. I don't need to compound my problems by mucking about with funky keys.

As far as letting users see the PK even when it is a surrogate: The question is still "So what could they do with this PK if they knew it?" If the answer is "nothing" then I don't care and will often put the PK in a write-locked text box just so they could report it to me if there was a problem.
 
Are you suggesting that the insertion of a new record or should I say records can cause the AutoNumber to go out of control when this is done pragmatically.

I have found that inserting a complete record including the autonumber field (eg restoring a deleted record from a backup) is a reliable way to cause the autonumber to reseed at the inserted record.
 
The only thing I would add (regarding tehNellie's comments) is that if the PK is going to be used elsewhere as an FK, then DEFINITELY I would make the PK a surrogate in order to simplify the implementation of Access requirements for parent/child table relationships. I.e. you must have a PK on a parent in order to link a child to its parent.

I still have no objections to using natural keys when available and appropriate i.e. stating the obvious, if it ain't a candidate key based on the rules for selecting candidate keys, don't try to make it one. I have to bang on stuff to make it fit all too often anyway. I don't need to compound my problems by mucking about with funky keys.

As far as letting users see the PK even when it is a surrogate: The question is still "So what could they do with this PK if they knew it?" If the answer is "nothing" then I don't care and will often put the PK in a write-locked text box just so they could report it to me if there was a problem.

it's not experienced developers that have the problem, is it, really.

It's inexperienced developers, and the clients themselves, who expect to see meaningful data in the tables. Probably compounded by MS who give us lookup fields to facilitate such bad practice.


bit in bold. Yes, I often find it useful to know what the "ID" value actually is. Sometimes I hide it, but sometimes, I just grey it out. Depends on the app, really
 
The only thing I would add (regarding tehNellie's comments) is that if the PK is going to be used elsewhere as an FK, then DEFINITELY I would make the PK a surrogate in order to simplify the implementation of Access requirements for parent/child table relationships. I.e. you must have a PK on a parent in order to link a child to its parent.

I don't see why this condition should make the slightest difference whether a natural of surrogate key is used. If a natuaral key is available then the process of linking is exactly the same.

Indeed it is also quite simple to link tables on a compund key and there can be reasons to use them too.

Moreover I have already resoundingly debunked the claims for adding superfluous surrogate keys in post #46. My points stand unopposed because none of those promoting such use has addressed any of them.
 
I don't see why this condition should make the slightest difference whether a natural of surrogate key is used. If a natuaral key is available then the process of linking is exactly the same.

Indeed it is also quite simple to link tables on a compund key and there can be reasons to use them too.

Moreover I have already resoundingly debunked the claims for adding superfluous surrogate keys in post #46. My points stand unopposed because none of those promoting such use has addressed any of them.

not debunk, just preference. I like the idea of a surrogate key for managing relationships, as it never needs changing, thereby avoiding any possibility of cascading updates. I like the efficiency of single field numeric keys. But the nature of the keys does not add anything to the integrity of the dbs structure at all.
 
From tehNellie's post:

If there is a need for some other candidate data within that table to be functionally unique, and in my experience it tends to be a compound series of columns, then I constrain it with a unique index.

From Galaxiom's comments on my comments:

I don't see why this condition should make the slightest difference whether a natural of surrogate key is used. If a natuaral key is available then the process of linking is exactly the same.

tehNellie's post implied compound natural keys. I use a surrogate SINGLE key and just constrain the compound keys for disallowing duplicates as needed. Perhaps I should have emphasized that point. That is why my method is simpler for linking. Otherwise, if the parent table has a 3-field compound key, who wants to duplicate that key through every table? It sort of defeats the purpose of normalization.
 
I use a surrogate SINGLE key and just constrain the compound keys for disallowing duplicates as needed. Perhaps I should have emphasized that point. That is why my method is simpler for linking. Otherwise, if the parent table has a 3-field compound key, who wants to duplicate that key through every table? It sort of defeats the purpose of normalization.

I do not disagree with surrogate keys where the alternative is a wide compound key. I notice you jumped to three fields in your comment which virtually shows you acknowledge that a two field compound key is justifiable.

That of course can be true as I showed in my earlier post.

What I object to are claims that every table should be keyed with a surrogate PK. I simply isn't true.

Most design decisions have both costs and benefits and adding unnecessary keys costs processor time and storage space. That can be a reason to avoid adding a surrogate key that serve no important purpose.
 
I do not disagree with surrogate keys where the alternative is a wide compound key. I notice you jumped to three fields in your comment which virtually shows you acknowledge that a two field compound key is justifiable.

That of course can be true as I showed in my earlier post.

What I object to are claims that every table should be keyed with a surrogate PK. I simply isn't true.

Most design decisions have both costs and benefits and adding unnecessary keys costs processor time and storage space. That can be a reason to avoid adding a surrogate key that serve no important purpose.

Probably a good example would be a us state table. Do you add a numeric code to each stare, or do you stick with their standard 2 letter codes.

Daft as it seems, I am so used to using numerics, i would probably add an autonumber!
 
Probably a good example would be a us state table. Do you add a numeric code to each stare, or do you stick with their standard 2 letter codes.

Daft as it seems, I am so used to using numerics, i would probably add an autonumber!

What do you gain by doing that?
 
Simply repeating your unsubstantiated assertion in no way lends the slightest support to that assertion. Indeed it further calls into question its validity since its claimant, when pressed, has been unable to offer a single advancement on what is essentially, "I said so".
A waste of space so far.


The PK, by definition, is indexed. I also indexed fields that need to be searched. There are often several.
Same rule applies for either method.


In the case of my work environment (a small financial institution), the Member field (Long) is pervasive in almost every transaction. It is an excellent natural key that uniquely identifies the client.
That is nice to hear, but adds nothing to the discussion.


If I included a surrogate key as the PK of the Member table the Member field in the Members table would still have to be indexed to provide rapid search facilities by Member. Moreover I would have to translate the the Member field from the text reports that are parsed to obtain the original data and convert them to the utterly pointless surrogate key.
I am not sure what you are saying here but it sort of sounds like you are doing all this work instead of the computer which these days is more than capable of doing all this in a fraction. If you could explain exactly what you are talking about and that you are willing to work with me I would put this on a test bench that would record the times over a long period of time. Say overnight.

All queries by Member would have to include a join to the Members table.
I don’t see a problem with this.

There is no advantage whatsoever joining to the member table by an artificial key. Hence a redundant field and its index is being maintained for no purpose. I believe your case is shot to pieces.

I still stand by using AutoNumber as a preferred Primary Key. I have never seen this fault. I have seen on many occasions difficulties with a Natural Key.

I will point out for others. Not you Galaxiom as I am sure you do understand this. A KEY is used to uniquely identify a record. If Table A had a record using an Autonumber and that number was 12345654321 then that record would be related to other table where the Foreign Key is also 12345654321. This could happen with a variety of Tables.

I do not expect the AutoNumber to be sequential. I do not see a need for the AutoNumber to be seen and therefore knowingly used by the user.

I can and do use the Number behind the scenes.
 
So how do you expect your end users to identify the information in such a table? Your auto-number key won't help them uniquely identify anything because you've already told us you intend to hide it from them. Yet you don't appear to be suggesting any alternative. That's why I'm saying your point is irrelevant to the discussion. By introducing an arbitrary and therefore irrelevant surrogate key you aren't addressing the critically important issue of unique identification.

If a table lacks any natural key at all then users are left in the abysmal situation of working with duplicate data. Many of us are familiar with the messes created by database systems that don't support data integrity. Even when you use a surrogate key a natural key is still just as important, so introducing a surrogate doesn't in any way relieve the database designer from the responsibility of identifying and implementing the natural key for that table - i.e. you would still need to create key constraints and indexes for the natural key as well as the surrogate.

Button

You ask a good question.

"So how do you expect your end users to identify the information in such a table?"

Can you explain for the exact same situation how do you expect your end users to identify the information in such a table.
 
Button

You ask a good question.

"So how do you expect your end users to identify the information in such a table?"

Can you explain for the exact same situation how do you expect your end users to identify the information in such a table.

With a Natural Key of course. I asked the question only because you seemed to be implying that there was no need to bother with natural keys because one could always use a surrogate instead.

Natural keys are essential in good database design. They are no less essential just because you use surrogate keys. So it's wrong to imply (if that was what you meant) that using a surrogate somehow relieves the database designer from the work of selecting or designing or managing the use of natural keys.
 
With a Natural Key of course. I asked the question only because you seemed to be implying that there was no need to bother with natural keys because one could always use a surrogate instead.

Natural keys are essential in good database design. They are no less essential just because you use surrogate keys. So it's wrong to imply (if that was what you meant) that using a surrogate somehow relieves the database designer from the work of selecting or designing or managing the use of natural keys.

You appear to be suggesting that I am proposing something that I am not and then you refer to your own Database in an incorrect manner.

Just because you somehow use a field in your searching you give it the status of Primary. I am sure if you were asked to describe what a Primary key is then your answer would have something more to do with the defining of each Record within the table.

A primary Key is a unique identifier.

If you were to use a person's name then this would not be good. John Smith. You could not use either one on its' own and even when you combine them they are still not unique. So maybe we could add the DOB, their Mother's maiden name. So if this was a simple table you would have the following Fields. FirstName, LastName, DOB, AddressLOne, AddressLTwo, City, Spouse and PostalCode.

A difficult table to deal with. Best solution is to add an Autonumber. Without the Autonumber you most likely would need the entire field that form the primary. This could be four fields. Now this is messy. Setting up the Indexes could bring the computer to a halt, according to some.
 
What I would like to know is in a situation where multiple Fields are required to locate/create something unique about a record do you actually use this in your searching. That is, do you use a combination of say 4 fields or whatever the count has to be to make a search.

And how would you attach a sub form. Again would you use the multiple fields to find a parent child situation.

It has been so long since I have done this that I honestly have forgotten how to.
 
What I would like to know is in a situation where multiple Fields are required to locate/create something unique about a record do you actually use this in your searching. That is, do you use a combination of say 4 fields or whatever the count has to be to make a search.

Yes of course. Users initiate searches and they will presumably want to search on a variety of things and not just the key attributes. Users will not generally need to search for things based on a surrogate key, so if the surrogate key is used to support users' queries at all then it requires an extra lookup to be performed based on other attributes (in technical terms it is sometimes called a "bookmark lookup"). I expect you know this but I'm not sure where your question is leading.

In response to your previous post, it's correct that of course any candidate key (not just the "primary" key) is a unique identifier for data in a table. Users usually need to be able to identify the information in a table and use unique identifying attributes to relate that information in the table to real things that are of interest to them in the world outside the database. To do that they need unique identifiers that work in reality - i.e. natural key attributes that are recognisable and useable by human beings and are guaranteed by key constraints within the database. Normally the natural key will be indexed because it is typically used in searches and because unique indexing is usually the best (or only) way to guarantee that uniqueness is enforced.

Surrogate keys are another thing altogether. They don't replace natural keys. They simply solve certain technical issues for database developers while being of no special relevance to users of the database.
 
Yes of course. Users initiate searches and they will presumably want to search on a variety of things and not just the key attributes. Users will not generally need to search for things based on a surrogate key, so if the surrogate key is used to support users' queries at all then it requires an extra lookup to be performed based on other attributes (in technical terms it is sometimes called a "bookmark lookup"). I expect you know this but I'm not sure where your question is leading. In response to your previous post, it's correct that of course any candidate key (not just the "primary" key) is a unique identifier for data in a table. Users usually need to be able to identify the information in a table and use unique identifying attributes to relate that information in the table to real things that are of interest to them in the world outside the database. To do that they need unique identifiers that work in reality - i.e. natural key attributes that are recognisable and useable by human beings and are guaranteed by key constraints within the database. Normally the natural key will be indexed because it is typically used in searches and because unique indexing is usually the best (or only) way to guarantee that uniqueness is enforced. Surrogate keys are another thing altogether. They don't replace natural keys. They simply solve certain technical issues for database developers while being of no special relevance to users of the database.
Do you think that there is anything stopping me with my AutoNumber from doing the exact same search as you can. Of course there is nothing whatsoever. So why do I create something different with the AutoNumber. Take all those Fields that you use to create something unique. We could have any number of these for each Table. In fact Table_B may have four fields to create a reference to Table_A. Then if we look at the other side there may be Five fields to create a unique reference to Table_C. Instead of using all of these for our join we use the AutoNumber that has been assigned to that record.

You may be thinking that I am being ridiculous with my joins and I would have to agree. I would go further to say that it would be unlikely that you would create such a messy design. But I have seen this in many situations. Apply this design to 20 tables. View this in the Relationships window. You have a mess where I have a tidy setup one only join between two tables.

Teaching someone how to use AutoNumber is relatively easy. Creating a Unique key with a combination of fields is a little difficult.

The follow up from here is to explain how you would go about creating / selecting a Primary Key or as in many cases designing one by joining fields together.
 

Users who are viewing this thread

Back
Top Bottom