Index on Autonumber key column

syswizard · May 31, 2013

Can anyone explain why one would want a unique index on an ID key ?
It can't be effectively used in joins and is never used in a where clause.
So why have one ?

rzw0wr · May 31, 2013

None of that is true.
I don't know where you get this from.
I use them every time I make a query and in about every where clause.
I would think you need to read up on table normalization a little more.

Dale

syswizard · May 31, 2013

I'm talking about a SURROGATE key (ID), not a known Primary Key like CustomerID which would be indexed.
I guess my real question is: can an Autonumber column ever have duplicates ?

Galaxiom · Jun 1, 2013

An autonumber field cannot have duplicates. It is automatically indexed as part of ensuring it does't have duplicates.

If you have CustomerID as the Primary Key then there is no point even having the autonumber field at all.

Autonumbers are typically used as the PK so that the value in any other field such as CustomerID can be easily changed.

ButtonMoon · Jun 1, 2013

syswizard said:
I'm talking about a SURROGATE key (ID), not a known Primary Key like CustomerID which would be indexed.
I guess my real question is: can an Autonumber column ever have duplicates ?

Yes an auto-numbered column can have duplicates unless you have a uniqueness constraint to prevent that. Unfortunately the Access UI doesn't deal very well with key constraints other than the primary key. You can define them using SQL's UNIQUE keyword but the only indication that they are unique in the UI is via an index with the property of "no duplicates". Go figure!

An auto-numbered column is typically used for a surrogate key. That usually does means it's a good idea to index it because it will often (though not necessarily always) be referenced by foreign keys and in joins.

jdraw · Jun 1, 2013

In Access terms an autonumber "data type" is supposedly guaranteed to be unique. It is not guaranteed to have sequential, nor positive values -- only uniqueness. And you can only have 1 autonumber field per table.

I said "supposedly" above, because if an autonumber repeats (attempts to create a duplicate),
see http://allenbrowne.com/ser-40.html

Your question of surrogate key (ID) not a PK is interesting. I'm trying to think of the case where you would use a surrogate autonumber in a unique index and not have it as a PK. Perhaps you have a sample.

I can see the Surrogate autonumber as PK, and having other fields (multiple) forming a unique compound index to prevent duplicates of the multifield value. I do this quite often when resolving many to many relationships.

I agree with ButtonMoon that the auto-numbered surrogate column is typically used as the primary key.

ButtonMoon · Jun 2, 2013

Hi Pat,

It seems like you are both muddling terminology and making certain assumptions about what is happening in the OP's database. A primary key is a candidate key. By convention when a table has more than one such key then one of them is designated as primary and any others are called secondary keys or alternate keys. That's essentially just a matter of convention and convenience though. The primary key is supposed to designate one key as being a "preferred" identifier or having some special significance known to the designer or user. The choice of primary key is therefore of no theoretical or practical consequence because a primary key is not different to any other key unless you choose to make it so.

Very often the primary key is the one referenced by foreign keys in other tables but that isn't invariably the case. Some database designers adopt the convention of designating a natural key as "primary", regardless of whether a foreign key will actually reference it or not and even if such a table also has a surrogate. There's nothing fundamentally wrong with doing that. As with other design conventions, what matters is that the community using the convention understands it and can apply it correctly.

Regrettably some DBMSs use keys designated in the system as "primary" to influence or even control certain features such as storage or index creation. This is extremely unfortunate because it both violates the principle of physical database independence and actually goes against the original point of a primary key being the preferred identifier of the user: if the user is directed to choose a key based on what the software will do internally with that choice then maybe it won't be his/her preferred key at all.

In the present discussion I don't think there's enough information about the physical implementation to make a final judgement on syswizard's choice of indexes. Syswizard isn't necessarily wrong when he says that the surrogate isn't used in joins and searches because it is unfortunately a little too common to see surrogates pointlessly added in cases where they will probably never be used (at least judging by a few posts in this and similar fora). He definitely isn't wrong to suspect that autonumbered columns could contain duplicates if they don't have constraints or indexes that disallow that.

Just my 0.02. HTH

ButtonMoon · Jun 3, 2013

Pat Hartman said:
Surrogate keys are not generally referred to as Candidate keys since they do not exist in the underlying data. They are artificially created because no suitable Candidate key exists or because the database designer prefers to use artifical keys rather than natural keys.

A surrogate key is a minimal superkey and therefore by definition it is a candidate key. I've not before heard anyone say that a surrogate isn't a candidate key, which is a strange misuse of terminology. In terms of the relational model it is certainly incorrect to imagine that primary keys could be something other than candidates. Candidate keys are a fundamental feature of the RM and if a table lacks any candidate keys then it is not a relation.

You seem to be under the impression that you can't enforce RI with a non-primary candidate key. I don't know why you think that. I know you can enforce RI against non-primary keys in Jet and you can do the same with any DBMS that supports Standard SQL, including Oracle, SQL Server and MySQL. A foreign key constraint can reference any key* and the logic and behaviour is exactly the same whether that key happens to be "primary" or not. I suggest you try it out for yourself.

(*Technically, SQL-style "foreign key" constraints can sometimes reference sets of columns that aren't keys at all - but that's off-topic for the moment.)

Pat Hartman said:
This is an Access forum so my remarks are biased toward how Jet/ACE work and in the absense of a statement to the contrary all posts should be considered to be asking about Access/Jet/ACE.

I assumed the OP was asking about Access but not necessarily about Jet/ACE. In any case, all my remarks apply equally to Jet/ACE as far as I'm aware.

gemma-the-husky · Jun 3, 2013

syswizard

consider this somewhat differently. the indexed autonumber is ideal for managing relationships between multiple tables.

let's say you have a customers table, with an autonumber key. a sales order table, with the customerid as its foreign key, and its own autonumber key. and a sales order item table with the salesorder id as ITS foreign key.

ie.

customer 1 ... n sales orders (joined on autonumber customerID key)
sales orders 1.... n sales order lines (joined on autonumber salesorderID key)

now this does mean that in order to establish the customer name for an order line you have to reference up through the parent sales order to the parent customer - but this is just a natural join.

you could easily find, say, all order lines for a given customer, for a given range of order dates. you just need to include tables up to the customers table in order to extract the data.

the use of single field integer (ie autonumber) keys is highly efficient, compared with multiple-field keys, or text kets, and also divorces the content of the join - meaning you no longer need cascading updates when the (say) customer name changes.

it's a good trade-off.

gemma-the-husky · Jun 3, 2013

syswizard said:
I'm talking about a SURROGATE key (ID), not a known Primary Key like CustomerID which would be indexed.
I guess my real question is: can an Autonumber column ever have duplicates ?

if the foregoing hasn't clarified this, then the answer to this is BY DEFINITION - NO - an autonumber is defined NOT to have duplicates.

A certain bug in access occasionally produces a situation where the next number seed fails., and duplciates could arise, although general a unque constraint prevents the addition of the duplicate. but this perverse, and is not expected by any means.

to take your examnple though, what do you mean by "customerid" - is this "The account number" or something else "real"? if so what happens when you ever need to modify the value for an existing customer?

if it is just a sequential number, then it is tantamount to an "autonumber"

ButtonMoon · Jun 3, 2013

gemma-the-husky said:
if the foregoing hasn't clarified this, then the answer to this is BY DEFINITION - NO - an autonumber is defined NOT to have duplicates.

To settle this just try a very simple test. In Access 2010:

Code:

CREATE TABLE tbl1 (x AUTOINCREMENT NOT NULL);
INSERT INTO tbl1 (x) VALUES (1);
INSERT INTO tbl1 (x) VALUES (1);

Result: two rows with the same value in the autoincrement column.

Now drop the table and recreate it with a key constraint:

Code:

CREATE TABLE tbl1 (x AUTOINCREMENT NOT NULL UNIQUE);
INSERT INTO tbl1 (x) VALUES (1);
INSERT INTO tbl1 (x) VALUES (1);

Result: second insert fails.

Autoincrement columns are not implicitly unique. Constraints are required to guarantee uniqueness.

There is also another related problem. Inserting a value of N to an autoincrement column resets the seed value so that the next generated value will be N+1 - even if that value already exists.

gemma-the-husky · Jun 4, 2013

partial quote from buttonmoon

There is also another related problem. Inserting a value of N to an autoincrement column resets the seed value so that the next generated value will be N+1 - even if that value already exists.

no - this is just not correct. The seed definitely does not reset in this case. try it and see.

I stand by what I said. In normal use an autonumber field will increment without issues. often access will try to set the autonumber as the PK, in which case it becomes unique. the behaviour that causes the seed to "lose" its correct value, is a bug - definitely problematical, but not intended"

yes, it is possible to "poke" an autonumber into a table, (but only with an append query) and this could produce duplicates in an unindexed field, but it won't happen by accident.

Galaxiom · Jun 4, 2013

gemma-the-husky said:
no - this is just not correct. The seed definitely does not reset in this case. try it and see.

I have personally seen this happen on two occasions. It doesn't always happen but it certainly can.

It is a major issue to be aware of when inserting records recovered from backups. Suddenly the autonumber starts trying to create duplicated records.

It is one of the reason I prefer to use other incremental numbering techniques.

ButtonMoon · Jun 4, 2013

gemma-the-husky said:
no - this is just not correct. The seed definitely does not reset in this case. try it and see.[/QUOTE]

I tried it and got the result I described (in Access 2007). Example code follows. After running this I opened the table and added a new row in the grid in the usual way. The autogenerated value was 2 and this caused an error "The changes you requested to the table were not successful because they would create duplicate values in the index...". The same error occurred when I tried to insert again because the next value generated was 3.

Dave, what are you doing differently? Maybe it was fixed in some versions? Maybe you could post some code to reproduce what works for you.

Code:

CREATE TABLE tbl1 (x AUTOINCREMENT NOT NULL PRIMARY KEY, z INT NOT NULL); INSERT INTO tbl1 (x,z) VALUES (3,0); INSERT INTO tbl1 (x,z) VALUES (2,0); INSERT INTO tbl1 (x,z) VALUES (1,0);

syswizard · Jun 4, 2013

gemma-the-husky said:
if the foregoing hasn't clarified this, then the answer to this is BY DEFINITION - NO - an autonumber is defined NOT to have duplicates.

A certain bug in access occasionally produces a situation where the next number seed fails., and duplciates could arise, although general a unque constraint prevents the addition of the duplicate. but this perverse, and is not expected by any means.

Thanks for that Gemma. I have been amazed from this thread on all of the misconceptions of table design....zowie, no wonder there have been so many software disasters recently that were publicized.
An Autonumber column as surrogate key SHOULD never have duplicates as it should always be increasing in value. And I contend that an index on such column is totally worthless and wasteful. It consumes space and slows-down inserts.

jdraw · Jun 4, 2013

As others have said, a autonumber is something specific in Access. An autonumber can be used as a Surrogate, but a Surrogate does not have to be an Autonumber. As Galaxiom in Post #16 said he prefers to use a different incremental numbering scheme.

An Autonumber column as surrogate key SHOULD never have duplicates as it should always be increasing in value.

Why must it always be increasing? If you are using the Surrogate as PK, it is unique
(definition of pk here http://databases.about.com/cs/administration/g/primarykey.htm )
Here's a link that shows some additional thinking about Access' autonumbers.

http://www.utteraccess.com/wiki/index.php/Autonumbers

syswizard · Jun 4, 2013

It doesn't have to be increasing...it's just that a simple implementation of autonumber would always remember the most recent highest value. A more sophisticated one would search for gaps, etc.

Galaxiom · Jun 4, 2013

syswizard said:
An Autonumber column as surrogate key SHOULD never have duplicates as it should always be increasing in value. And I contend that an index on such column is totally worthless and wasteful. It consumes space and slows-down inserts.

That suggestion is totally wrong. Any field, autonumber or otherwise, that requires value be unique should have a unique index. Likewise if a combination of field values needs to be unique then it should have a composite unique index enforced at the table level.

Best practice does not leave data integrity of the tables to chance.

ButtonMoon · Jun 4, 2013

Pat Hartman said:
A foreign key constraint can reference any key* and the logic and behaviour is exactly the same whether that key happens to be "primary" or not. I suggest you try it out for yourself.

Click to expand...

I have. Since you seem to be able to, please post a working example so we can all learn something.

Pat, the error message you posted speaks for itself: "No unique index found for the referenced field". If the referenced column(s) have a unique index then you can enforce referential integrity using those columns. It doesn't have to be the primary key though. Any key will do.

ButtonMoon · Jun 6, 2013

Pat Hartman said:
You are the one who said it was wasteful to have an index on the surrogate key.

I think you are addressing syswizard who started the thread. I didn't say an index was wasteful. I said it was usually a good idea to index a surrogate key and also that an index or constraint is required to guarantee uniqueness of a key based on a key-generator like auto-numbering. I'll let syswizard respond to the points you made because generally I agree with what you've said, except in these aspects:

Of course it can't be used effectively to relate tables if you don't give it a unique index and once you go that route, why not make it the PK? After all, its reason for existence was to provide uniqueness because there wasn't a viable candidate key!

Once you add an autonumber to a table, it doesn't just become A candidate key, it becomes THE candidate key or there is no point to adding it at all.

Those arguments don't really follow. One of the very common reasons for using a surrogate key is where stability is required for very practical reasons but the values of other keys in the table are expected to be somewhat changeable. The surrogate doesn't remove the need for the other keys however. E.F.Codd (who introduced the concept of surrogates to the relational model) says:

"Database users may cause the system to generate or delete a surrogate, but they have no control over its value, nor is its value ever displayed to them."

It is usually assumed that business users do not see surrogate keys (their separation from the business domain is virtually the definition of a surrogate key) and therefore those users usually will need some other key with which to identify the information that's of interest to them.

A very simple example is a table of user names. A user may have a login name which is required to be unique but may change if the user requests it. An internally generated surrogate user identifier is therefore added to mitigate some of the issues that may arise from changing a key. The surrogate is very probably the key referenced by foreign keys in other tables but both keys (login name and surrogate) are still equally important from a data integrity point of view and would be enforced by unique key constraints in the database. The choice of "primary" key in a table with more than one key is in principle of very little importance. What matters is how they are used and that they are properly implemented to ensure uniqueness is enforced.

Index on Autonumber key column

Registered User.

I will always be a newbie

Registered User.

Super Moderator

Registered User.

Super Moderator

Registered User.

Registered User.

Super Moderator

Super Moderator

Registered User.

Super Moderator

Super Moderator

Registered User.

Registered User.

Super Moderator

Registered User.

Super Moderator

Registered User.

Registered User.

Similar threads

Users who are viewing this thread