Question about best practice on overwriting records vs deleting records (1 Viewer)

Benginner2212 · Nov 29, 2023

I have a database tracking various audio, video and network connections. Each cable is identified with an alphanumeric code that consists of a 3 characters and a four digit number, the source and destination of the cable and the equipment that is connected by the cable.

Currently, our practice is that we delete the record for a given cable number from the database. Which so far hasn't been a problem, but I recently finished a project where my cable numbers were in the 9000 range. Due to limitations caused by other ways that we document how equipment is wired connected, the numeric portion of the cable number can't exceed 9999. To avoid going having cable numbers that exceed 9999, I was thinking that it would be best to overwrite records with new data instead of completely deleting the old records from the database. Then given the user an option to view a report the lists all of the decommissioned cable numbers that are present in the database so that they could effetely re-use a cable number.

I know it's a basic question but I am just curious if overwriting records could eventually cause the record or database to become corrupt?

June7 · Nov 29, 2023

If you don't care about history, then see no reason not to reuse. Instead of deleting, Null fields except the identifier. Next time user needs to enter a new record, present 'blank' record for completion if one exists. I have done this.

Benginner2212 · Nov 29, 2023

I have error checking on the all of the fields of the record to try an ensure that good data has been entered by the user before a record is saved. I am concerned that if I were to set the other fields to Null that I would be getting Null errors when I tried to pull up the record for a decommissioned cable number. Could I just overwrite the fields with other data like "N/C" or "DECOM"?

June7 · Nov 29, 2023

Sure. So then pull up a record that has "N/C" in a particular field, if any. If none exist, present user with a completely new record row.

The_Doc_Man · Nov 29, 2023

Two points to consider. First, your question about database problems if you "reuse" a slot? It depends, but usually not an issue UNLESS you somehow were using that letter/digit combination as a single-field primary key. (Prime keys should not be modified, as a theoretical thing.)

Second, from a mechanical standpoint, there is no practical difference between doing an UPDATE vs. doing a DELETE/APPEND combination. Both will have the same effect internally to the database.

Pat Hartman · Nov 29, 2023

Overwriting old data is poor practice. If you don't need the old data, delete it or archive it. Then create a procedure that finds the gaps and adds new values in the gaps where old records used to exist. In either case, you have no history so you need to make sure that won't be a problem.

We have no idea what the database is all about. You have limited one of the values to four digits and are not willing or able to increase the number of digits. How would you determine that some data is old and can be deleted? How many active connections would you have at one time? What happens when you remove the connection? does the equipment also get removed?

Pat Hartman · Nov 29, 2023

The_Doc_Man said:
Both will have the same effect internally to the database.

And will cause bloat so you have to remember to schedule periodic C&R events.

The_Doc_Man · Nov 30, 2023

True enough, Pat. Like I said, an UPDATE cycle or a DELETE/APPEND cycle have about the same effect. However, the OP's database doesn't seem very large and the "churn" rate doesn't sound that high, so this might not be AS bad as some we've seen.

The more critical question is, regarding the old data, do they NEED to remember what WAS, or do they only care about what IS? From the early posts, it doesn't seem so.

@Benginner2212 - the question is, will you need to know anything about prior uses of your special ID codes? Is history of any value in what you do. Because given the question you asked, re-using a slot has one negative effect of importance... it complicates your ability to look at history. But this is not a mechanical issue - it is a theoretical/practical issue.

If you have a LOT of updating then you will now and then run into something called "bloat" - and it can be a problem if you let it run away with your DB. When you update records (because you are reclaiming them), Access effectively does the equivalent of a simultaneous DELETE (of the old record) and an INSERT (of the new record.) It has to do that in that specific way because of something called a "rollback" - a way to undo a failed transaction. So what happens is that the space occupied by the deleted record becomes temporarily lost until you do a Compact & Repair (C&R) action. For small records, this bloat takes a long time to build up to an important amount. However, the faster you update and the more you update, the sooner you will find that your DB has doubled in size and really needs the C&R.

Benginner2212 · Nov 30, 2023

@Pat Hartman
@ The_Doc_Man

I'm a broadcast engineer for my state's local PBS station and the purpose of of the database is what cables are connection various pieces of equipment together and the function of each cable that is connected to a piece of equipment.

All of the cables are given a 7 digit alphanumerical code that represents a category that is assigned to the cable based on the function of the cable and 4 digit number to give the cable a unique identifier in that category. So if a cable is going to be used for sending HD video from point A to point B that cable falls into our category of HD video cables and gets the 3 character prefix of HDV. Typically, when we run a new cable we just add 1 to the four digit number of the last cable that was ran in a given category. So because the last HD cable that was ran has the cable number of HDV2204 the new cable that is ran will be entered into the database as HDV2205.

The record in the database will indicate that HDV2205 connects the video output of Satellite Receiver A that is in Rack 24 is connected to Port 221 of our video router.

The reason that we can't exceed a four digit number in any given cable category is due to the fact that also use Auto Cad to create drawings of our various systems. So we have a drawing that shows all the equipment that is connected to the inputs of our video router. My boss is very particular about how those auto cad drawings are done and because all of our current drawings use the 7 digit alphanumeric code he doesn't want to get away from the 7 digit alphanumeric code so that the drawings remain uniform. I know that is a silly reason to be unwilling to change but it is a hill that my boss will die on so I've given up on trying to get him to change.

We consider any cable that is still in the racks as a active cable, so if we take out a piece of equipment but leave the cable in the rack we'll note in the database that one end of a cable is not connected. And then we consider any cable that is completely removed from the racks as a decommissioned cable and delete the cable from the database.

Going back to my example of the Satellite Receiver. PBS decides that is going to stop sending us content via Satellite receiver A. Since, we no longer need Satellite Receiver A, we'll remove the the Satellite Receiver from rack 24 and remove the cable from both the video output of the Satellite Receiver and port 221 of the video router. Then we'll pull the cable HDV2205 out of the racks completely, coil up the cable, cut off the labels, throw it in out bin of cables to be used at a later date and delete HDV2205 from the database.

We do keep a history in a roundabout way. When the main navigation page of the database is closed. a function is called that creates a backup of the database for that day and saves the backup file to a shared network folder. My IT person has that shared network folder setup to automatically delete any file that is older than six months. So we effectively have a six month history of changes to the database. Which is more than plenty for us because we aren't constantly swapping out equipment and adding or removing cables. But beyond that, I don't know if keeping a history would benefit us since we removed cables when we remove equipment. If we swap out equipment or move equipment, we just update the current record in the database with the changes.

Pat, your suggestion that I either delete or archive old data and then create a procedure to find gaps in the cable numbers is ironic because I was doing something similar before and was cautioned against that by an other access user because if something didn't get deleted properly I could potentially create a duplicate record and corrupts the database.. But I am thinking I might not have been doing it in the way your suggesting. I am not using the cable numbers as a primary key, but rather letting Access assign the primary key to the records. So the way that I was doing that was first I'd delete the old data from the database, a query would run that identified gaps in the cable numbers and then a new record would be created when a cable number was reused.

To me, it sounds like you are suggesting that I write the new data to the old record, which I would think wouldn't be possible if the record is deleted.

I am confused about the_doc_man's comments about bloat and rollback. Does that simultaneous delete and insert only happen when I change information on a record in the database or does it also happen when I delete one record and then create a new record?

And if I set up something to run a compact and restore of the database, would I run that every time the database is closed or would it be a periodic thing?

Pat Hartman · Nov 30, 2023

No, I would delete the old record. the numeric suffix of 2205 is added at the time you put the pieces together and end up with HDV2205. There should be a unique index on the two fields "HDV" and "2205". To do this, you need to use the indexes dialog. You cannot create a two column index using the Table design view. In the indexes dialog, find the first totally blank row and add a name for the index. add the first column and mark the index as unique. then move to the next row. leave the index name blank and add the second column. You can create indexes of up to 10 fields using this method.

So you just increment the numeric sequence number as normal. Then check to see if it is > 9999. If it is, then you have to run your gap query to find the first gap. Use that number instead of the generated number as the new sequence.

Normally databases grow gradually. If you are frequently deleting records or updating them and potentially making them longer, then you run into the bloat issue because Access does not reuse the space for the deleted record or for the original smaller record. That means that instead of C&R monthly, you might need to do it weekly. Just watch how your db grows to see what the frequency you need is.

What I don't like about the reuse is that if you keep printed records of ANYTHING, they become suspect because you might have reused the number. I don't like deleting history so I have another thought. Keep the concatenated sequence number in a separate field with a non-unique. Then when you archive the record rather than delete it, remove the contents of the two working fields. You should add an ArchivedDT field so that the missing values make sense. This lets you keep all the old data and it doesn't interfere with the new data.

Using the second method, if you look for HDV2205 using the concatenated field, you might get two records whenever the field had been reused. So, you need to carefully consider which situations call for using the two working fields rather than the concatenated field for searching.

ebs17 · Nov 30, 2023

"HDV" and "9999"
Can't you open a new number range with a new alphanumeric code ("H2V"?)?

Benginner2212 · Dec 5, 2023

ebs17 said:
"HDV" and "9999"
Can't you open a new number range with a new alphanumeric code ("H2V"?)?

While that would be the easiest solution, my boss is against anything would affect the uniformity of the drawings and layout of the physical layout of the printed copy of the database and would veto the change. I am lucky that I was able to convince him that switching to an outlook database was a good idea.

Benginner2212 · Dec 5, 2023

Pat Hartman said:
No, I would delete the old record. the numeric suffix of 2205 is added at the time you put the pieces together and end up with HDV2205. There should be a unique index on the two fields "HDV" and "2205". To do this, you need to use the indexes dialog. You cannot create a two column index using the Table design view. In the indexes dialog, find the first totally blank row and add a name for the index. add the first column and mark the index as unique. then move to the next row. leave the index name blank and add the second column. You can create indexes of up to 10 fields using this method.

So you just increment the numeric sequence number as normal. Then check to see if it is > 9999. If it is, then you have to run your gap query to find the first gap. Use that number instead of the generated number as the new sequence.

Normally databases grow gradually. If you are frequently deleting records or updating them and potentially making them longer, then you run into the bloat issue because Access does not reuse the space for the deleted record or for the original smaller record. That means that instead of C&R monthly, you might need to do it weekly. Just watch how your db grows to see what the frequency you need is.

What I don't like about the reuse is that if you keep printed records of ANYTHING, they become suspect because you might have reused the number. I don't like deleting history so I have another thought. Keep the concatenated sequence number in a separate field with a non-unique. Then when you archive the record rather than delete it, remove the contents of the two working fields. You should add an ArchivedDT field so that the missing values make sense. This lets you keep all the old data and it doesn't interfere with the new data.

Using the second method, if you look for HDV2205 using the concatenated field, you might get two records whenever the field had been reused. So, you need to carefully consider which situations call for using the two working fields rather than the concatenated field for searching.

I think that I will have to go with the second solution that you've suggested. Right now the field for the "HDV" is a combo box from a table that contains all of the categories of cables that is separate from the table that contains the field for "2205" I don't think that I could create a unique index for the field that contains "2205" since it would contain duplicate entries.

I think I could add a field that contained the concatenated HDV2205 from the two working fields and either archive the records or search/print the records based of the concatenated field.

What happens when I archive a record? Does all the data for a record get archived or just what I tell access to archive? For example if the record for HDV2205 has a primary key of 321, and all of the other fields are populated with the source rack, destination rack and descriptions get archived or would it just be "HDV" and "2205" that get archived? If I reuse the number HDV2205 would the primary key for that record be 321 or would it have a new primary key?

June7 · Dec 5, 2023

Depends what you mean by "archive". I think in this case suggestion is to just set the value of a field ArchiveDT with date record is "archived". Then the record can be excluded from queries that filter out the "archived" records. This record is not deleted and PK 321 is still in table. If you reuse HDV2205, this would be creating a new record with its own new PK. This retains history, which you did not seem to care about.

An alternative was described in post 2 which everyone else here seems to reject.

Pat Hartman · Dec 5, 2023

Benginner2212 said:
I don't think that I could create a unique index for the field that contains "2205" since it would contain duplicate entries.

Read my suggestion again. That is NOT what I told you to do. I told you to create a multi-field unique index and I even gave you directions on how to do it since it is not obvious from the interface.

Benginner2212 said:
What happens when I archive a record?

Depends on what you mean by archive. YOU are in control over what archive means. To most people, it means deleting the record AND related data and appending them to archive tables in a separate database. To others, it may mean leaving the record in place but adding an ArchiveDT field to hold the date that the record was made obsolete.

@June7 brought up post #2 again. The problem with deleting and/or overwriting is that anything that has been printed becomes suspect. If you NEVER print anything, you don't have a problem. If you do print reports or invoices or ANYTHING that includes the generated code, I would NEVER delete the record and writing over the old record is the same concept in this case as deleting because the record in the table bears no relationship to anything that might have been printed in the past.

That is why I gave you a second option which allows you to not only keep history (i.e. archive in place) but to reuse the user generated codes in a new record.

June7 · Dec 5, 2023

Reusing a record worked for my db. If user aborted data entry (rare event) after record identifier was created (not autonumber PK), I saved that record for the next data entry, usually the same day so chronology remained valid even when there were multiple users. But chronology was not critical. Also, I allowed for VOIDING records and not reusing the identifier in cases of lost document or request was later withdrawn. I could not have gaps in this identifier assignment.

Another situation in same db was very old project numbers were getting reassigned (controlled by another office) but they would be a completely different project name. So with the combination of project number and name, reports could be distinguished.

Benginner2212 · Dec 5, 2023

Pat Hartman said:
Read my suggestion again. That is NOT what I told you to do. I told you to create a multi-field unique index and I even gave you directions on how to do it since it is not obvious from the interface.

I am sorry that I misunderstood your suggestion. I tried to create a unique index based on the two working fields that I am using for the cable category and the cable number but I get an error message that states I can't do it because it would create duplicate records. However I was able to create an non-unique index with the two working fields. I am going to try and follow the second suggestion that you made and create a field that concatenates the two working fields and give that field a non unique index.

The_Doc_Man · Dec 5, 2023

Benginner2212 said:
I am confused about the_doc_man's comments about bloat and rollback. Does that simultaneous delete and insert only happen when I change information on a record in the database or does it also happen when I delete one record and then create a new record?

Both an UPDATE and a combo of DELETE/INSERT have essentially the same effect in terms of bloat. DELETE/INSERT is just slower.

When you do an UPDATE, there is a rule relating to what is called a "rollback" (database slang of "undo"). For a brief moment, both the original record and the new record co-exist in different places in the physical layout of the file on the disk. If the updater task decides to abort the update action for some reason, the "old" record has to return to the table unchanged and the "new" record with all changes is discarded.

We have reason to believe that tables are actually a linked list of records, not an end-to-end array of records. So when the UPDATE actually executes for a given record, the old record is unlinked from the table's list and the new record is added to the table's list. But in a shared table, it is possible that someone else is currently running a query having that old record in place. So the old record can't be removed (yet). When that hypothetical SELECT query is done, the unlink can take effect. But because of that delay, the new (updated) record cannot take its place so it gets tacked onto the end of the table's linked list.

If you do this as an UPDATE, that unlink/link pair of operations is, from the program viewpoint, simultaneous - even though we know the computer MUST do things one instruction at a time. When you do the DELETE/INSERT operation as two discrete queries, the same steps are performed, but the insertion (2nd step) doesn't have to happen right away. However, the rule about not unthreading a record when someone else is running a query against that recordset still applies. So again, the INSERT tacks the new record at the end of the table's list.

Since we don't actually have Open Source code visibility for Access, I have to qualify the above as "one man's educated guess" as to what really happens behind the scenes. But over the years we have seen tidbits here and there to suggest that the above isn't terribly far from reality.

Pat Hartman · Dec 5, 2023

Benginner2212 said:
I am sorry that I misunderstood your suggestion. I tried to create a unique index based on the two working fields that I am using for the cable category and the cable number but I get an error message that states I can't do it because it would create duplicate records.

Try reading the directions one more time. Let me repeat it for you.

Pat Hartman said:
To do this, you need to use the indexes dialog. You cannot create a two column index using the Table design view. In the indexes dialog, find the first totally blank row and add a name for the index. add the first column and mark the index as unique. then move to the next row. leave the index name blank and add the second column. You can create indexes of up to 10 fields using this method.

June likes the reuse idea. I disagree. I think it is worse than deleting, but that is just an opinion, especially since I believe that both are wrong. The real issue is do you have paperwork that references the generated ids? You have not actually addressed this question.

Pat Hartman · Dec 5, 2023

The_Doc_Man said:
Both an UPDATE and a combo of DELETE/INSERT have essentially the same effect in terms of bloat. DELETE/INSERT is just slower.

You are assuming the delete/insert are adjacent. In fact, they are very unlikely to be. One record is deleted/cleared when the cable is taken out of service. Some other cable lis added, having NOTHING to do with the one delete/disabled, and which almost certainly has a different three letter prefix so the new record would be extremely unlikely to replace the one that was just taken out of service.

Question about best practice on overwriting records vs deleting records (1 Viewer)

Member

AWF VIP

Member

AWF VIP

Immoderate Moderator

Super Moderator

Super Moderator

Immoderate Moderator

Member

Super Moderator

Well-known member

Member

Member

AWF VIP

Super Moderator

AWF VIP

Member

Immoderate Moderator

Super Moderator

Super Moderator

Similar threads

Users who are viewing this thread