Indexing. When does it happen. (1 Viewer)

RainLover

VIP From a land downunder
Local time
Tomorrow, 08:34
Joined
Jan 5, 2009
Messages
5,041
What is indexing?
Indexing is a way of sorting a number of records on multiple fields. Creating an index on a field in a table creates another data structure which holds the field value, and pointer to the record it relates to. This index structure is then sorted, allowing Binary Searches to be performed on it.
The downside to indexing is that these indexes require additional space on the disk, since the indexes are stored together in a table using the MyISAM engine, this file can quickly reach the size limits of the underlying file system if many fields within the same table are indexed.

http://stackoverflow.com/questions/1108/how-does-database-indexing-work

That was just an introduction and a link to the source.
:) :) :) :)

What I would like to know is when is the Index applied. A Field needs to be reindexed whenever a record is either created or edited.

Is this index applied on the saving of a record or perhaps when required. Namely when you do a search.

I am trying to understand what the affect is of reindexing. If it is on save then there should be no change in the overall speed of performing searches etc,

However if it is done when the index is required for searching then it could make a considerable difference to the speed. If several fields are indexed it would do the exact opposite to what we wanted.

If you know the correct answer as to when the indexing is applied please let me know. Then if you know of a link then that would be better still.
 

RainLover

VIP From a land downunder
Local time
Tomorrow, 08:34
Joined
Jan 5, 2009
Messages
5,041
CJ

Just to clarify my thinking.

When you add a new record there is a slight delay while Access updates the indexing for that field.

When you do a search or even simply open the records it becomes faster than a similar situation where there is no indexing.

When we add a new record the delay would be so small that it would not be noticed. Bearing in mind that we are not typing at high speed.

Therefore I would be safe in saying that indexing does not slow the database down unless there are too many fields that are indexed. But that is another discussion.
 

spikepl

Eledittingent Beliped
Local time
Tomorrow, 00:34
Joined
Nov 3, 2010
Messages
6,142
Therefore I would be safe in saying that indexing does not slow the database down unless there are too many fields that are indexed.
Not quite. Indexing is an overhead imposed on INSERT and UPDATE, and a benefit for SELECT. For most DB's SELECT is the operation performed the most, so there an index is beneficial. There arew instances where it is best not to have indexes at first, but to do whatever needs to be done and then index the final table.

Note also that Access automatically creates indexes for fields beginning or ending with ID, CODE, KEY or NUM. On top of that, it also creates indexes for primary keys. So if you have a field called MyID as PK, then the silly thing creates 2 indexes!
 

CJ_London

Super Moderator
Staff member
Local time
Today, 23:34
Joined
Feb 19, 2013
Messages
16,642
When you add a new record there is a slight delay while Access updates the indexing for that field.
correct, the more indexes, the bigger the dataset, the longer it takes - but still pretty quick. If you are inserting or updating many records the update of indexes seems to be faster pro rata

When you do a search or even simply open the records it becomes faster than a similar situation where there is no indexing.
Correct: a good example of this is relatively easing demonstrated when searching a string field. If you search Like '....*' then the index is used if it exists, if you search Like '*...*' or Like '*...' then the index is not used. Best practice when building a search form is to not include the first * in the code, but train the user to use it when required. - And you can't search ADODB recordsets with an initial * anyway.

Access automatically creates indexes for fields beginning or ending with ID, CODE, KEY or NUM
Not quite - these are set in options>object designers>table design view. So you can add/delete the options you require.

So if you have a field called MyID as PK, then the silly thing creates 2 indexes!
I'd forgotten that! But then I consistently use PK/FK anyway. So followup questions - in the context of a PK is there a benefit of having two indexes, or better to remove the non PK index (which presumable reduces bloat)? and if there are two indexes does Access optimise their use?
 

Cronk

Registered User.
Local time
Tomorrow, 08:34
Joined
Jul 4, 2013
Messages
2,774
So followup questions - in the context of a PK is there a benefit of having two indexes, or better to remove the non PK index (which presumable reduces bloat)? and if there are two indexes does Access optimise their use?
The PK is obviously unique. The second is not. So I consider it useless and delete it when I remember.
 

Galaxiom

Super Moderator
Staff member
Local time
Tomorrow, 08:34
Joined
Jan 20, 2009
Messages
12,854
Note also that Access automatically creates indexes for fields beginning or ending with ID, CODE, KEY or NUM. On top of that, it also creates indexes for primary keys. So if you have a field called MyID as PK, then the silly thing creates 2 indexes!


You can turn off this feature. Clear out the box under:

Options > Object Designers > AutoIndex on Import/Create.
 

Galaxiom

Super Moderator
Staff member
Local time
Tomorrow, 08:34
Joined
Jan 20, 2009
Messages
12,854
When we add a new record the delay would be so small that it would not be noticed. Bearing in mind that we are not typing at high speed.

Therefore I would be safe in saying that indexing does not slow the database down unless there are too many fields that are indexed.

The indexing task, even for one index, can be a very considerable proportion of the load when inserting or updating a record. In some cases the indexing can be the larger part of the task.

Unlike in the table itself where the new record is simply tacked on to a page wherever it suits the engine, the index records are in a specific order and records that come after it need to be moved so it can be inserted in the right place.

Updating the value in an indexed field also requires the same process.

While this is obviously not going to be noticed where there are a small number of users compared to the available resources, it becomes critically important on a large system with many users and records.

Where a large number of records are to be inserted as a batch, the overall performance can sometimes be improved by dropping the indexes before beginning the insert and completely rebuilding them afterwards, especially if the new records are not going to be at the end of the index.
 

gemma-the-husky

Super Moderator
Staff member
Local time
Today, 23:34
Joined
Sep 12, 2006
Messages
15,677
I haven't read every post in this.

It's worth investigating b-trees.

If you understand binary trees, then a b-tree is somewhat similar in concept. A binary tree has two descendants at each node. If records are added in a non-random manner, or from a skewed root value, then the tree will become very tall. A balanced binary tree addresses this by maintaining the height at an optimum level, by re-centring the root node when necessary.

A b-tree is similar but each node consists of perhaps 100 items.

So the root can have 100 children, each with 100 items. After 3 levels of tree, we already can store 1m records in a tree of height 2. In fact the nodes fill to less than full capacity I think (so at least 50, but less than 100), because of the tree balancing algorithm.

The same principles about tree rebalancing are used to maintain the b-tree at an optimum height.

Because of this structure records can be pinpointed from their index value with a very smaller number of reads, which makes them very efficient.

So what happens is that when you update a record to a table, all the indexes need to be rebalanced. I doubt if this happens at "insert" stage, more likely at "update" stage.

http://en.wikipedia.org/wiki/B-tree

Re-indexing should never be necessary, because the b-tree will always be optimally balanced.

The problem will come with deleted records, which is why we need C&R. Deleted records will only be logically deleted (or at least the index entry will not be removed initially). I am pretty sure that compact and repair will physically delete the deleted records, and then reconstructs the indexes for the retained items.

[on reading the wiki article, I think my explanation is actually incorrect in a few places, but hopefully it's not too bad!]
 
Last edited:

Users who are viewing this thread

Top Bottom