How do we historicize some rows in data tables? (2 Viewers)

amorosik

Active member
Local time
Today, 16:54
Joined
Apr 18, 2020
Messages
778
Suppose we have our beautiful program that operates on three tables: Documents, Customers, and Phone Calls
After several years, the Documents table will contain many records that will likely no longer be useful and will unnecessarily burden the archive
How do we historicize all documents dated before 2025 in the DOCUMENTS table and then physically move them to another database, while still being able to retrieve the historicized information?
 
An append query, followed by a delete query, both with appropriate parameters, would do the job. You can retain access to the archived data by linking it to the production database.

I am not sure, though, what you mean by "historicize" them. How is that different from moving them to the archive database?
 
Why do you need to move them to another database? Data doesn't take too much space on disk. You can set a default value for date in your search form, so when users search for something, it shows the data from 2025 and later as default. One bonus is that if for some rare reasons they need to refer to old data, they simply can change the default value and search.
 
My solution has been to save the historic data in the tables to XML, which is easily retrievable, when required.
 
As KitaYama has pointed out, there is no need to move the rows to another table. The value of an Historicized column of Boolean (Yes/No) data type could be set to True. In its normal day to day operations the database would then use those rows WHERE NOT Historicized.

The attached little demo file illustrates a similar approach to mark records as deleted. One or more records marked as deleted can be restored at any time should this be necessary. You'll see that the demo includes an option to delete multiple rows where the value in a column of date/time data type is prior to a date entered into an unbound text box in a dialogue form. This is analogous to marking rows as historicized in your case.

The demo does include an option to move a single deleted row to a backup table, as you are proposing. You could easily amend this to move multiple rows. One or more rows can be restored from the backup table at any time if necessary. Note that to avoid primary key conflicts if the database is compacted and repaired, and a backed up record is then restored, the ContactID primary key of the Contacts table in the demo is not an autonumber, but is computed when a row is inserted.
 

Attachments

My colleagues have offered a few thoughts on the subject. I'll take a different direction just to cover some bases. Let's assume for the sake of discussion that you really DO need to remove - but somehow preserve - old data.

The problem, of course, is the intertwining relationships that can exist sometimes between two or more tables. You therefore have to create a "deconvolution" plan that starts by identifying tables that depend on other tables (and why).

Simple case: An invoice table with a child table of invoice line-item details, one/many parent/child relationship, with relational integrity enabled. You can't archive the invoices first, because removing the invoice entries would violate relational integrity for the line-item table. So you would think that have to archive the CHILD table (invoice details) first. EXCEPT that you then need to consider what format you are using to store the details. Your plan needs to take the master DB and the archive destination into account.

If you are planning to keep a separate Access BE file as the particular archiving destination AND you wanted to maintain relational integrity in that file, too, then you would have to (a) COPY the invoice table to the archive, (b) copy the inventory details to the archive, (c) delete the archive details from the live DB, (d) delete the invoice (parent) records from the live DB - all the while using a restrictive WHERE clause that would only copy or delete records eligible to be archived... which means that the invoice table (the original parent) has to have a field to drive the archive step in the first place. If there is a date field and you are archiving by record's age, no biggie. Now, if there are multiple inter-table relationships that confuse this issue, you have to do the same type of analysis but with extra tables to complicate your plans. The general rule is "archive the top of the hierarchy first, remove the top of the hierarch last."

But there can be complications. In my big security-tracker DB, a complication was that each "required action" entry applied to every server on site, over 1500 total. So I had to know the state of each server for each required action. There was more to it than that, but this part illustrates an "oddball" case. I was allowed to archive "required action" entries after 6 months - but only six months after ALL of the servers had been appropriately updated. If even ONE server was still pending for action, the whole action entry was OPEN. Which meant that the date of issuance of the action notice wasn't important - it was the date of closure in the junction table that tracked individual compliance. The tables ALL had a Y/N field for "MaintSelect" (selected for maintenance) so that I could run a couple of queries to select (or de-select) eligible records, then run the appropriate export and removal queries. I.e. I built indicator fields into the tables so that when it was time to do maintenance, I could mark the records. Once the marking was done, the copying and removal were almost trivial.

IF you are actually NOT planning to use a BE file, you could do what I did for a situation where the action history would become irrelevant AND there was no need to record every single detail, I built a query to combine data and exported the query records to a spreadsheet. The removal step didn't change, but the archival step did, because I could export everything at once to the spreadsheet.

The archival step will always depend on what you choose as the storage mechanism. The removal step hardly ever seems to change, in principle. The only question in removal is whether you can use an already existing field or you need a "selected for action" field as the mechanical method of record selection.
 

Users who are viewing this thread

  • plog
Back
Top Bottom