Solved Should (some) non-keys in tables be set to "indexed?"

HalloweenWeed · Aug 16, 2023

Recently I have heard from a forum member that one should not mark a table field as "indexed" before adding a query that selects records based on said FK field, claiming that Access will (sometimes?) create a duplicate index (if you create the relationship before marking the FK as indexed or adding a query to select records based on it), and was told that Access will automatically index said field. This question is somewhat related, as I am rebuilding my tables from scratch in a new immature database. I have a field that I have marked as "required" and do NOT "allow zero length," it may become hundreds of records in length, and this field will be used by the interface to select the record, but it is not the FK (nor the PK). Should I mark it "indexed: Yes (no duplicates)?" The field is type "short text." I am thinking it may enhance performance of the database when the user selects the record, when the table becomes many records in length. The user will probably be using a combobox to select the record, and it will be sorted by this field. Thank you for any input regarding this matter.

ebs17 · Aug 16, 2023

An index is a database object built into the definition of a table. Both are objects of the database in which the real table is actually located.
An index cannot be created by a "normal" query (Data Manipulation Language as a subset of SQL), but it can be created with DDL (Data Definition Language). Anyone who uses something like this knows what they are dealing with.

An index is comparable to the index in a book. It supports reading processes and thereby accelerates them. An index slows down writing processes because the index tree has to be maintained in addition to the data. In a typical database, however, read operations predominate, so that an index used is mostly useful.

An index is useful for fields or combinations of fields that are used for comparison, that is, where operations such as linking, filtering, sorting, grouping, and partial aggregation are applied.
An index is particularly important when linking tables via relationships, so when creating a relationship with referential integrity set, Access automatically indexes the foreign key field. However, this index is not visible in a superficial view (index window). Indexing the foreign key field separately would therefore be an unnecessary duplication.

In addition, there is a setting in the options in Access, where fields with specific designations (ID, key) are automatically indexed to support beginners and those who do not know. It is better to switch off this setting immediately in order to be able to implement your own index planning without disruptions.

To reiterate, select queries don't create indexes, but they can use them if they exist.

CJ_London · Aug 16, 2023

Sounds like the member is misinformed or you have misunderstood

access will automatically create indexes where they are specified- see File>Options>object designers

As a default they usually specify 'ID' and some others (can't remember what). So any field created with a name ending in ID (or whatever is in the list) will have an index automatically created

They will also automatically create an index for a field specified as the primary key - and this is where duplicates can occur. Access will of course create an index for any field ending in ID, including those that are designated a foreign key - but that is at the time of creation and you are unlikely to go into the index form and add another one. Perhaps a good reason to name your foreign keys with a different suffix.

You will see I have effectively turned autoindex off by leaving it blank but to illustrate how it works, I added ID then I created a table with fields called SID, PID and AIDT.

You can see from the index view (indexes button on the ribbon), there are two indexes automatically created. Not for AIDT becuase it doesn't end in ID

I then changed the PID field to primary key and a primary key index is created - note also the change in the allow duplicates and required properties

So now we have a duplicate index and you need to removes the ones you don't want.

So it is the primary key, not foreign key that potentially gets duplicated - because they have different properties.

My advice is to disable autoindex and this won't happen.

General rule: for indexing

use numbers, not text for primary and foreign keys (better performance) - but see below - a few hundred records such as US state abbreviations would be OK as text
Index all fields that are regularly joined on or used as criteria or sorted - with the exception of..
fields with a limited range of values and roughly evenly split between them (no real performance gain, but performance overhead in maintaining them

other considerations
Where fields are frequently null, set the ignore nulls value to yes
Tables with only a few hundred records probably won't benefit from indexing

CJ_London · Aug 16, 2023

ebs17 said:
However, this index is not visible in a superficial view (index window).

So where is it visible?

HalloweenWeed · Aug 16, 2023

CJ_London said:
...

You can see from the index view (indexes button on the ribbon), there are two indexes automatically created. Not for AIDT becuase it doesn't end in ID
View attachment 109462

Thank you @CJ_London . But I did not have such "indexes" button. So I looked in the ribbon commands, and added it under db "tools" tab, "custom" group. But still, I selected a table in a previous database that I know has multiple indexes and the button remains greyed out and non-selectable.

EDIT: I tried turning on show hidden & system objects in Objects ribbon, no joy.

HalloweenWeed · Aug 16, 2023

Otherwise, I believe I got my answer: there is no significant advantage in marking it indexed. TY all.

ebs17 · Aug 16, 2023

Show indexes via code.

Code:

Sub ShowIndexes(TableName As String)
    Dim db As DAO.Database
    Dim idx As DAO.Index
   
    Set db = CurrentDb
    With db.TableDefs(TableName)
       For Each idx In .Indexes
           Debug.Print idx.Name, idx.Fields
       Next idx
    End With
End Sub

ebs17 · Aug 16, 2023

HalloweenWeed said:
this field will be used by the interface to select the record ... The user will probably be using a combobox to select the record, and it will be sorted by this field.

Filtering and sorting are typical operations where an index will result in a performance improvement.

CJ_London · Aug 16, 2023

HalloweenWeed said:
But still, I selected a table in a previous database that I know has multiple indexes and the button remains greyed out and non-selectable.

It’s in the table design ribbon - and won’t work on linked tables

HalloweenWeed · Aug 16, 2023

CJ_London said:
It’s in the table design ribbon - and won’t work on linked tables

Ah, I see now. TY.

The_Doc_Man · Aug 16, 2023

The original question includes whether certain indexes are of any value. Many reasons exist for adding an index. One way or another, all of those ways come back to looking for something - i.e. searching the table. Whether searching via a combo box, or filtering something in a report or form, or having a formal relationship back to some field to support a JOIN clause, you are searching for some specific value. Therefore, though many ways exist that use indexes, they are always present to support a search operation.

The answer to the index's usefulness can be estimated by computing the cardinality of the index. The word cardinality in database terms roughly means "how many records do I expect to be returned when searching for a given value of this indexed field?"

Cardinality can be ROUGHLY computed by counting the number of unique values in the field in question and divide that into the number of records in the table. (This is a ROUGH calculation... there are other ways, particularly if you have a predictably biased distribution of values.) The bigger this quotient, the less value there is in the index for most purposes. If you get back 50% of the table, the index would not do much good in searching. If your search returns 1 record, then your index was the PK and is the ideal search field. All indexes will have cardinality between a unique return and 50% return (unless you place an index on a field that is essentially constant.)

Why is a 50% cardinality not so good? Because if you are using the index, you have to go to the index, then find the next matching index entry, then follow the pointer to the record in question. But if you didn't use the index (instead doing what is called a "relation scan"), you just find the next record and if it matches, use it - but if not, find the next record in turn until you get a match. Flipping attention between the index and the table isn't awfully expensive but when dealing with a 50% return on a search, the two methods are not terribly far apart.

If you are dealing with composite indexes, the cardinality of any ONE of the indexes might be a moderate-to-high number. However, if the combination of indexes in this composite key has small cardinality, it might still be useful. Expressed as percentages, cardinalities multiply. So if index A has about 10% cardinality and index B also has 10% cardinality, as long as A and B are independent of each other, you might see a 1% net cardinality, which is a significant reduction of record returns.

The remaining question on utility is "how often will you actually use it for searching by any of the normal means of searching?" If the answer is "many times per session" then you can still get some advantage out of a high-percentage return. If the answer is "maybe once per week, if that much" then perhaps you don't care even about a moderate cardinality.

But, you say... Why do I care? Because if you have lots of INSERT INTO or UPDATE or DELETE operations in that table, EVERY CHANGE requires EVERY INDEX to be rewritten. So... indexes cost you in table maintenance/overhead. Lightly used or rarely used indexes don't need to be there. But heavily used indexes, even with poor cardinality, might be worth the effort of keeping them around. It's all relative to usage, so the DB's designer has to decide on utility.

Galaxiom · Aug 17, 2023

The_Doc_Man said:
If you get back 50% of the table, the index would not do much good in searching. I

This is why RDMS like SQL Server support Filtered Indexes. They are configured by the designer to exclude values that cause high cardinality, providing benefit where the index would be an advantage, without the overheads of maintaining an index on common values.

Cardinality is also important factor for the query optimiser. As a table grows the frequency of particular values can change altering the 'Statistics" of the table. The statistics are used by the query optimiser to calculate the plan. When they get stale they need to be updated and the query plan rebuilt. This is done with specific commands in a DBMS.

In Access, editing the query, probably compacting and definitely decompiling the database or copying the objects to a new database will force the query plans to be rebuilt. I don't know what Access (actually the ACE engine) does for statistics. I would assume they do exist in ACE and would be updated in a Compact and Repair of the back end.

The_Doc_Man · Aug 17, 2023

G., I've never seen an article on any statistics kept by the Jet or Ace engines. Not saying there aren't any, but I've never seen them in any non-trivial discussion. I would suspect that they have to be kept within the individual tables in some hidden sub-structure.

ebs17 · Aug 17, 2023

Attached is a script by Michael Kaplan and Julianne Lee for the last point.

MarkK · Aug 17, 2023

Pat Hartman said:
Please do not summarize my posts when you do not understand them.

It is difficult for me to see my own ignorances. Please be patient and kind with me while I learn from you.

The_Doc_Man · Aug 18, 2023

Pat Hartman said:
@The_Doc_Man I have an old book for Jet 3.5. It has a wealth of information. Sadly there isn't an ACE version. "Jet Database Engine Programmer's Guide, second edition" It talks about statistics and that they are used by the Query Optimizer. They also mentioned that statistics are also requested via ODBC when working with linked tables which I thought was interesting.

If you read the history, Ace is merely a new version (not a total rewrite) of Jet. So at least some of it might be valid.

isladogs · Aug 18, 2023

ebs17 said:
Attached is a script by Michael Kaplan and Julianne Lee for the last point.

That's a great article as you would expect whenever Michael Kaplan was involved.
I'm only surprised that there was no mention of viewing the query execution plan using the undocumented JET ShowPlanfeature.

Show Plan - Go Faster

Use Jet ShowPlan to view the execution plan of Access queries. This can help optimise query execution

isladogs.co.uk

Perhaps that feature wasn't available at the time that article was written?

isladogs · Aug 18, 2023

Pat Hartman said:
@isladogs Looks like the link to Susan's article is now pointing to some AI stuff.

Thanks for letting me know

Hmm. That's a pity if the old article has been lost.

Anyway, easily solved.
I have just updated the link to an archived copy stored on the Wayback Machine website
I also copied the page myself a few years ago just in case it disappeared. So I have also added a PDF of my copy

isladogs · Aug 18, 2023

Refresh the webpage using Ctrl+F5 to clear the cache

Solved Should (some) non-keys in tables be set to "indexed?"

Member

Well-known member

Super Moderator

Super Moderator

Member

Member

Well-known member

Well-known member

Super Moderator

Member

Immoderate Moderator

Super Moderator

Immoderate Moderator

Well-known member

Attachments

bit cruncher

Immoderate Moderator

Access MVP / VIP

Access MVP / VIP

Access MVP / VIP

Similar threads

Users who are viewing this thread