Database Bloat (1 Viewer)

Foster2019

New member
Local time
Today, 10:19
Joined
Sep 30, 2019
Messages
7
Hi people,

So I read an article posted by theDBguy on 11-03-2018 about database bloat, and I think the conclusion was that append queries and make table queries both cause similar size bloat, and that make table queries where faster.

My questions are,

If I have a table with several columns that would only ever have 20 rows that would keep the same primary key. But all the data in the table except for the primary key changed every week (some cells with empty values) what would cause less bloat.

A) Delete the whole table each time and create a new one using a template and the copy object method, keeping the primary key and having empty values in all other cells. Before using an update query to update the records

B) Delete the whole table each time and create a new one using a template and the copy object method, with no rows or data and using the open recordset add new method

C) Using an update query to null all the values except for the primary key before inserting the new data via the update method keeping the primary key

D) Using a delete query to delete all the rows but not the table then using the open recordset add new method creating a new primary key.

I’ve read that deleting tables every time could cause other problems like instability as mentioned in the article I was reading, but I don’t think I fully understand what that means?

Is deleting rows of data the same as deleting all the values and then updating the empty values when it comes to database bloat? And is deleting a primary key going to cause more bloat than any other column?
The table hasn’t got any relationships.

I hope the above makes sense :/
It’s the end of a long day

Thanks people:)
 

theDBguy

I’m here to help
Staff member
Local time
Today, 03:19
Joined
Oct 29, 2018
Messages
21,358
Hi. If I understood your question correctly, then I would say using an UPDATE query would probably cause less bloat, if any at all.
 

June7

AWF VIP
Local time
Today, 02:19
Joined
Mar 9, 2014
Messages
5,425
What is purpose of a table where every value would change except primary key?

I use 'temp' tables - tables are permanent but records are temporary - and have not experienced bloat or instability issues.

If db is split, these 'temp' tables must be located in frontend. If it's not split then would certainly be unstable for multiple users.

Deleting and creating objects is a design change. If db is deployed as an executable, that approach might not be possible - I've never had to work with accde file.
 
Last edited:

isladogs

MVP / VIP
Local time
Today, 10:19
Joined
Jan 14, 2017
Messages
18,186
The thread being referenced is probably https://www.access-programmers.co.uk/forums/showthread.php?t=302259

I agree you should use an update query for this. Just one ….no need to null the existing values first...just replace using the update. There is absolutely no benefit to using any other method in your case

In that thread, I was the one who mentioned instability through repeated use of make table queries. I've experienced 'unrecognised database format' errors in several cases which, for me, were always fixed by decompiling. You might not be so lucky.

@June
You can create, edit and delete tables, queries and macros in ACCDE files (unless fully locked down).
You may be thinking of changes to forms, reports and code, all of which are disabled
 
Last edited:

The_Doc_Man

Immoderate Moderator
Staff member
Local time
Today, 05:19
Joined
Feb 28, 2001
Messages
27,001
Actually, the mechanism is less important than the final effect. Because of sharing issues, Access never actually re-uses deleted space until you Compact & Repair the DB. Instead, it marks the record as "deleted" (displayed as #deleted#) so that the user who comes behind you no longer tries to touch that record. Remember that user A can delete (technically, "mark for delete") a record that was opened at the time by user B. If the locking is set up correctly, this will happen almost transparently.

The issue of doing an UPDATE is that the physical record doesn't get updated. The physical record gets REPLACED by its updated equivalent. If you did a DELETE first and followed it with an INSERT INTO, Access still doesn't garbage collect because of sharing issues.

Stated another way, if you do the UPDATE, it isn't a true "update in place" because of the potential that the currently targeted record might be open in someone else's open report, so the update marks the old record and writes a new record corresponding to the one just updated.

Net result in ANY order that you do it? The old record is marked awaiting the next C&R and the new record is threaded into the table's list of known records. This activity has two effects: (1) Bloat occurs due to the growth of records marked for deletion and (2) tables opened directly or through a query lacking an ORDER BY or GROUP BY clause will appear to be written in an order other than sorted by prime key.

NOTA BENE: Opening a table in datasheet view on your screen actually opens an implied "generic" query that would provide records in PK order. When I say "open a table" I mean through any method involving a recordset, express or implied, as long as the opening method doesn't impose order on its target.
 

Pat Hartman

Super Moderator
Staff member
Local time
Today, 06:19
Joined
Feb 19, 2002
Messages
42,981
Doc,
I think that Access updates "in place" as long as the changed record is less than or equal to the original version in length.
Access will NOT update locked records, i.e. records that are dirtied by a form or locked by a bulk update query. I think you get a message and the query doesn't run if the record is enqueued by a different action query. But if the record is dirtied by a form, you probably will get to update it and the form will get the conflict error when the user attempts to save. The whole thing has to do with how Jet/ACE identify deadlocks and handle them. I know much more about how mainframe databases do this than how Jet/ACE does it.

The reason I think that Access updates in place is because if it didn't, records would appear to be out of order all the time when you open a table directly since that process returns a recordset in physical order UNLESS you sort it. And since most people think that tables are sorted, there can't be a lot of record movement going on. I've posted instructions several times to prove this to people. You need a row count and physical record size long enough to fill a sector to see any impact. A "sector" or whatever a physical record is in the mind of Jet/ACE, is the amount of data handled by a single I/O operation. Then you have to update an "early" record and make it longer than it was originally. Access cannot put it back where it was so it moves it to the end of the file. Other RDBMS' use more sophisticated methods involving embedded free space. Once you force Access to move a record, it will appear to be missing when you open the table directly but you will find it if you scroll to the end or apply a sort.

I agree that this is a strange situation where 20 primary keys remain but are associated with different data. But, given that description, I would just update the table.

@June,
You are not running into bloat because either:
1. you have compact on close set. This is dangerous and not advised.
2. you compact frequently enough that the small amount of data you are deleting/adding isn't causing much of a problem.
 

Foster2019

New member
Local time
Today, 10:19
Joined
Sep 30, 2019
Messages
7
Thanks people,

Yes Isladogs, that’s the article I mentioned, I should have posted a link. It had been a long day and I wasn’t really thinking. :(

I agree it’s a weird setup,

The primary key represents people and the columns are time slots, the code checks the column to see if the person is available before assigning them a task. Other code picks up later to put into a printable excel rota.

Thanks for the information about how access handles data when it is deleted; I feel that I have a better understanding now. I think I’ll stick to using an update query. :)

I’ll make sure I keep the temp tables in the front end, knowing that is really useful.
 

Pat Hartman

Super Moderator
Staff member
Local time
Today, 06:19
Joined
Feb 19, 2002
Messages
42,981
You can't get rid of temp tables entirely so when you need them, segregate them in a "template" database. The "template" contains empty versions of the tables. Your app deletes the existing version of the database and copies the template to the old name. That leaves all the links and queries intact.
 

The_Doc_Man

Immoderate Moderator
Staff member
Local time
Today, 05:19
Joined
Feb 28, 2001
Messages
27,001
Pat, I respectfully disagree. Not trying to step on your toes here, but I have looked into this in the past because I had to have answers for some government wonks who knew what they wanted but didn't know doodlum squat about databases.

Here is the short answer: Access CAN'T do an "update in place" in a shared environment because users might not have set dbSeeChanges in their recordset options. If I recall correctly, that isn't the default. Access has to retain the values at the very least for the currently open record. The simplest way out is to NEVER assume that the user has selected the dbSeeChanges option, so Access will never do UPDATE-in-place.

In a split shared DB, your FE makes a temp copy of what it is going to look at, but if you open your recordset and then someone else behind you opens an overlapping set, neither of you will see changes immediately. The delay is based on the FE's auto-refresh interval, one of the options. Until that auto-refresh interval trips, you will be able to see things in your set that aren't there any more in your co-worker's set. Both sets of records are valid, each for their time of snapshot. Access won't know or indicate that a change has occurred until it goes back to do the update or until that auto-refresh detects a change. And an update in place would make that nearly impossible to recognize.

Following is a more in-depth discussion of why this is so. Folks not interested in this level of nuts and bolts might not care to read further.

Consider the order of operations of SQL: https://www.periscopedata.com/blog/sql-query-order-of-operations

1. Get source tables & queries identified (the FROM clauses)
2. Identify the records to be used (the WHERE clauses)
3. Form aggregates (the GROUP BY clauses)
4. Identify aggregates to be used (the HAVING clauses)
5. Identify final data (the SELECT clause for SELECTs; the field lists for UPDATE, INSERT queries)
6. Sort returned records (the ORDER BY clause)
7. Apply limits (such as TOP N clauses)

If you look at this, once SQL has formed the list of working records, what happens after that moment is on a parallel course. The records selected by the FROM/WHERE combination are now on their own, at least briefly, when it comes to grouping, forming aggregates, filtering aggregates, etc.

As to whether records are changed in order of appearance frequently? I have observed this. Remember, opening a table in Datasheet View implies a hidden query that happens to apply the PK in an ORDER BY statement. You have to check for record scrambling in something that doesn't come through that implied ORDER BY. Further, if the table is large enough and updates are sparse enough, you won't see many changes anyway.

FWIW, I had to use some single-table queries for my biggest project, and diddling around, I took out the ORDER BY clause from that query. Since that wasn't the table's "implied query" datasheet view, ordering wasn't automatic and I could see the real order. The tables get pretty jumbled pretty quickly. In fact, it seemed based on a couple of experiments I ran that the table slowly gravitated to chronological order of update, most recent changes last. A C&R didn't scramble them back into PK order, either.

OK, it's not an earth-shaking thing, and if your queries have an ORDER BY clause you never see it anyway.

One last reason why Access cannot do UPDATE-in-place: It would among other things violate the ANSI standard (1992) for cursors for "INSENSITIVE" cursors (i.e. you didn't specify dbSeeChanges).

This quote comes from (Second Informal Review Draft) ISO/IEC 9075:1992, Database Language SQL- July 30, 1992 and I have underscored one key phrase.

If a cursor is open, and the current SQL-transaction makes a change
to SQL-data other than through that cursor, and the <declare cur-
sor> for that cursor specified INSENSITIVE, then the effect of
that change will not be visible through that cursor before it is
closed
. Otherwise, whether the effect of such a change will be
visible through that cursor before it is closed is implementation-
dependent.

In essence, that can't happen if you do UPDATE-in-place. "Cursor" in this context of course is the same as a Recordset object.

More that you wanted to know? You WERE warned earlier.
 

Pat Hartman

Super Moderator
Staff member
Local time
Today, 06:19
Joined
Feb 19, 2002
Messages
42,981
opening a table in Datasheet View implies a hidden query that happens to apply the PK in an ORDER BY statement.
it doesn't actually do that. But people get fooled because C&R rewrites each table in PK sequence so when you open a table, the query that runs, which has no order by initially (unless you've saved one) returns the rows in physical order which happens to be by PK UNTIL a record gets moved because it was updated and couldn't be put back in place.

@Doc, I am happy to be corrected if I am wrong and I can't really argue the rest of the points with you. I can only offer empirical examples that what I described is true. My testing was not extensive and multiple people were not updating the same sets of records at the same time. So, Access may not update records in situ in all cases even though that is what I saw it do in my testing.

Are you talking about Jet/ACE or SQL Server?
 

The_Doc_Man

Immoderate Moderator
Staff member
Local time
Today, 05:19
Joined
Feb 28, 2001
Messages
27,001
Both ACE and SQL claim to be more or less compliant with ANSI 92 SQL standards. My comments were about any SQL that claims such compliance.

As to whether C&R re-orders the tables to PK order, I have to admit I did not think it did that, though that is a minor issue to be honest. And as you point out, the longer it has been since you did a C&R, the more likely it is that record order has been scrambled by updating, thus rendering the question moot.
 

Pat Hartman

Super Moderator
Staff member
Local time
Today, 06:19
Joined
Feb 19, 2002
Messages
42,981
Good, I hoped you were talking about SQL Server and I was talking about Jet/ACE.

In order to validate what I said about the sorting by PK (although this might actually be documented in the Jet Programmer's Guide), you have to update a couple of records in a BIG file and make them move from the first 20 rows to the end of the file. You will only see this if you open the table without sorting it - ever. Access saves sorts so if you ever sorted the table, it will still apply.
 

jcoogan

New member
Local time
Today, 06:19
Joined
Mar 9, 2023
Messages
1
Here's another possibility that took me forever to discover. I am using Microsoft 365 subscription, ver 2302. Check menu: "File, Options, Current Database". Look for "Track name Autocorrect info" and uncheck it.

I >think< the idea behind this option is that Access will auto correct other instances of objects behind the scenes when you change an object name. Example: You have two queries that are linked and you change the name of one, then Access will automagically update the name change in the separate join query, preventing it from breaking. Pretty cool, but also possibly pretty expensive in terms of database bloat because Access has to write all these object references somewhere!

In my case I have a database with code only, data tables are in a separate linked database. The code database was bloating from 3000k to the 2gig limit in 20 min! My code database is doing a lot of database object and recordset object creation and deletion. When I unchecked the option named "Track name Autocorrect info" it resolved the issue and now the code database only grows by about 500k.
 

theDBguy

I’m here to help
Staff member
Local time
Today, 03:19
Joined
Oct 29, 2018
Messages
21,358
Here's another possibility that took me forever to discover. I am using Microsoft 365 subscription, ver 2302. Check menu: "File, Options, Current Database". Look for "Track name Autocorrect info" and uncheck it.

I >think< the idea behind this option is that Access will auto correct other instances of objects behind the scenes when you change an object name. Example: You have two queries that are linked and you change the name of one, then Access will automagically update the name change in the separate join query, preventing it from breaking. Pretty cool, but also possibly pretty expensive in terms of database bloat because Access has to write all these object references somewhere!

In my case I have a database with code only, data tables are in a separate linked database. The code database was bloating from 3000k to the 2gig limit in 20 min! My code database is doing a lot of database object and recordset object creation and deletion. When I unchecked the option named "Track name Autocorrect info" it resolved the issue and now the code database only grows by about 500k.
Hi. Welcome to AWF!

Thanks for sharing. Yours might be a special case.
 

isladogs

MVP / VIP
Local time
Today, 10:19
Joined
Jan 14, 2017
Messages
18,186
@jcoogan
I see you've also posted exactly the same observations in threads at several forums
I also think there may be something unusual about your setup.
I have used the Name AutoCorrect feature for years and never had the bloating issue you describe.
 

ebs17

Well-known member
Local time
Today, 11:19
Joined
Feb 7, 2020
Messages
1,883
The Name AutoCorrect feature is a performance brake and should be switched off.

Who needs something like that when you have a plan for your objects and their names and won't change them afterwards?
 

isladogs

MVP / VIP
Local time
Today, 10:19
Joined
Jan 14, 2017
Messages
18,186
Even with the best planning, there are situations when renaming objects is appropriate.
There are many occasions when Name Autocorrect is very useful.
For example when taking over a badly designed database to improve its design.

As long as the feature is used with care, it works well.
When it was first introduced around 2000, the feature was problematic.
Although that was fixed in the next couple of version, many people turn it off when not actually currently needed.

As for a performance brake, that should rarely, if ever, be an issue as it won't be used when other processes such as complex queries are being run
 

ebs17

Well-known member
Local time
Today, 11:19
Joined
Feb 7, 2020
Messages
1,883
Even with the best planning, there are situations when renaming objects is appropriate.
I would distinguish between being in the process of developing and maintaining an application, and being in the process of doing real work.

If I'm constantly renaming while working and have to implement that via the application, I'd better rethink my confused way of working.
 

isladogs

MVP / VIP
Local time
Today, 10:19
Joined
Jan 14, 2017
Messages
18,186
So would I if that was what I did. Who, apart from yourself, said anything about constantly renaming?

@jcoogan did refer to "a lot of database object and recordset object creation and deletion." which is in itself a bad idea and will definitely cause significant bloating
 

Users who are viewing this thread

Top Bottom