Checking the integrity of db after processing

adrian.stock22

Registered User.
Local time
Today, 11:32
Joined
Mar 21, 2004
Messages
57
Hi, All,

Background (questions: see below)

---------------

I have a db whose one and only table has recently been split up into three tables as follows:

1 Main table (containing contact column; name, tel, fax, email, etc)
2 Memo table (containing long fields of type Memo)
3 Non-memo table (containing short fields of non-Memo type)

We now have to submit Table 1 to a data cleansing company, which inserts columns about telephone and fax suppression (people who we are not allowed to phone or fax), validity of address, etc.

When we first did this (when the three tables where still only one, which should have made processing errors less likely), one of the tables came back in mangled form. Data were all right for, say, ten or 20 records, then came records where "DNA splicing" had taken place, i.e. where parts of one record had been spliced into another record. However, it had been done in such a way that it was not easy to detect by visual inspection and comparison with the original table, especially since many records looked all right.

Typical splicing error: the memo field of one record had replaced the memo field of another record, but NOT that a text field of one record had replaced the memo field of another record or another text field of another record.

----------------------------

The data cleansing company (DCC) told us that it was not their fault but that we had made mistakes in the way in which we submitted our db for 'automatic' on-line cleansing. This may or may not be the case, but certainly the initial instructions from the DCC were misleading or unclear. Therefore

(1) we are now taking measures to prevent this from happening again and

(2) we have to check on tables which come back from the DCC to make sure they have not been corrupted again (just in case it IS the DCC's fault), since this would be absolutely disastrous for us.

Step 1: we divided our one and single table in to three tables, linked by ID as Primary Key, as above. So now only Table 1 has to be sent for cleansing, but we are, of course, concerned, just in case when returned, the records in table 2 and 3 are no longer linked to the correct records in Table 1.

Step 2: we altered Table 1 so that its structure corresponds as closely as possible to what is taken as normal by DCC.

Step 3: We removed all commas from our Table 1 before converting it into a csv file.
Note: We submit our table as a comma delimited file (csv file), using double quotes as text qualifiers. We had been told by DCC (perhaps quite mistakenly) that the occurrence of commas in our memo fields had screwed up their program.

----------------------------------------

We are now experimenting with a variety of formats for submitting our data and receiving them back.

We need a convenient way of checking whether the csv file which we receive back and which we convert into an Access table and substitute for our original Table 1 has not been corrupted.

*** We want to be sure that none of the records in Table 1 on its own have been spliced up and wrongly spliced together as described above.

*** Secondly we want to be sure that the fields in Table 2 (memo) and Table 3 are still related to the correct fields in Table 1.

We know that this SHOULD be the case if we and if DCC have done their jobs correctly, but then things have gone wrong before and therefore we want to check and confirm this positvely, rather than lose our invaluable database.

Question 1
---------------

Are there any routine procedures, perhaps incorporated into Access, to compare the original related Tables with the Tables which we received back.

Is there perhaps some ready-made utility to do this job?


Question 2
---------------

If there is nothing ready-made, we are thinking of doing the following:

Create a query which shows 26 hand-picked records (i.e. one screenful) of the 4000 records in the database, and shows of each record a few memorable fields taken from each of the three tables (as much as can be seen on one screen from left to right without scrolling).

Apply this query to the original db and to the cleaned db, make screen-prints of the results, print them and compare them manually.

Is that a good and workable idea, or is there a better approach to the problem?

Thanks for your help.

Adrian
 

Users who are viewing this thread

Back
Top Bottom