Merging 3 databases shortcut?

Chrisopia

Registered User.
Local time
Today, 04:27
Joined
Jul 18, 2008
Messages
279
Alas, we finally have a way to share a central database, but I'm now left with a laborious task of merging the currently 3 separate databases.

Combining them isn't a problem, but what I need is a short cut to match the ID's and merge the records into 1.

For example, my table for address has a customer ID for Database A, B and C.

Now what that means is I have 3 records for the same address. I've already ran a duplicate wizard, and it shows me the 3 entries for each. Now what I would do is manually type the customer ID's into 1 record and delete the other 2.
This is obviously quite time consuming and prone to typing errors.

There must be a way to merge them, with an update query or something??
 
In general this is what I would do. Let's say you have 3 tables (called A, B & C) and the way to match them all is on one field (f). I would copy and paste table A as the new main table (lets call this D).

Next I would create an APPEND query that puts all data unique to B into D. The query would be a LEFT JOIN from B into D on field f. I would bring down all fields from B and make sure they are set to append to the correct fields in D. Next I would bring down field f from D make sure its not getting appended anywhere and then place this criteria uner it "Is Null".

Running that query will place all the records not in D but in B into D. I would then edit that query replacing B with C. That will get all the unique records into D from all 3 tables.
 
Your first issue is coming up with a consolidated customer table. Start by creating a new table with the identical schema of the existing tables. Then add a new column (long integer) to hold the "old" primary key and a second new column (long integer) to hold a reference number that you will use when consolidating. Run append queries to append all the data from each customer table to the new one. Except - the existing autonumber PK will be appended to the "old" PK column and you will allow Jet to generate a new autonumber PK for the consolidated table.

Once all the data is in one table, you have to start weeding out dups. You can't delete them at this point but what you need to do is to choose the one that will become the official customer record and then place that records new PK into the reference field.

Once the customer table is cleaned up, you need to deal with the transactions. To validate your conversion, you will need to know where each transaction record came from so if your schema doesn't already include a column for this, you will need to add one. Now the append queries for the transactions will have to populate the new "source" column. They will join to the new customer column on the old PK/FK values but append the new PK value as the FK. This FK will be the actual PK of the customer or the value from the reference column if it is populated.

Once the tables are converted, you need to create queries that pull out the data from each "source" and match that set of data to a query that pulls the same set of data from the original source database. If the data doesn't match exactly, you need to sort out the conversion issue.

After the conversion is validated, you can delete the extra columns from the customer table that were just needed for conversion.
 
Thanks Pat, it's stage 2 I'm stuck with in your method. But it mostly makes sense.

And Plog, that is a good idea, but the customer ID is used in another table. I can only convert and update the Customer ID to a single one when I've done, what Pat describes in step 2... hmmm
 
Is there a way to do, for example;

IF [Company Name] = Duplicate and Duplicate-Count = 3, then if ID-A Is Null then (take ID where ID-A Is Not Null), If ID-B is Null then (take ID where ID-B is not null) , IF ID-C is Null then (take ID where ID-C is Not null)

Then Delete 2 of the duplicates...

I wish Access was more English, but I guess my shortened lamen equation is more complicated than it looks...
 
Personally I would create a database and import the tables into this database and then start manipulating the data. I do not know how many record but I would create a boolean Delete Flag in each table and flag the records to be deleted and test initially for all duplicates and then for duplicates in the non-deleted records.

Simon
 
Is there a way to do, for example;

IF [Company Name] = Duplicate and Duplicate-Count = 3, then if ID-A Is Null then (take ID where ID-A Is Not Null), If ID-B is Null then (take ID where ID-B is not null) , IF ID-C is Null then (take ID where ID-C is Not null)

Then Delete 2 of the duplicates...

I wish Access was more English, but I guess my shortened lamen equation is more complicated than it looks...
You can do anything you want if you want to write code. The outline I gave you requires no code but it does require a manual review to identify duplicates. Identifying duplicates is a whole process in and of itself. For smallish sets of data, it is more cost effective to scan manually since many dups are not identical. As a human, you would identify Pat Hartman as a dup of Patricia Hartman but the algorithm to do the same thing would be quite complex.
 
Thanks pat, but the data would be exactly identical as the data has existed on 3 databases.
My usual method would be to type the ID's of the other 2 records into the blank columns, then delete the other 2 leaving one behind.

I should have mentioned I have around 14,000 address to compress (around 4,500 unique address entries)... this would take a long time to do hence I would need VBA to do it and avoid errors.

isn't there a simple method of checking the next 2 records and merging the blank columns?

Heres an example:
Address, ID-A, ID-B, ID-C
23 Castle Street, 1203, (blank), (blank)
23 Castle Street, (blank), 799, (blank)
23 Castle Street, (blank), (blank), 8213

Should become:
23 Castle Street, 1203, 799, 8213
 
I've done it!

It may be the long way around, and I'm sure there may be an SQL that can do it in one... but heres what I did:

1) Make a query to find the duplicates, set by [Address 1], I called it tblAddDup,
- apparently Access works faster by first making a count query and using that to list the other duplicates, rather than the duplicate function?

2) With that list, make 3 separate queries, 1 for each of the separate IDs... Using the "Is Not Null" criteria under each ID heading.
-This effectively breaks the list apart again, only showing 1 entry for each Address 1 duplicate. I called these qryID-A, qryID-B, qryID-C...

3) Make an Append query, using the query to find duplicates (tblAddDup). Using the 3 queries in step 3, each linked via [Address 1] with option 3 selected for each link.
- This shows all from the duplicate list and only those from the ID query.

4) The append is to a new table, with the same structure. Match everything to the relevant headings, [Address 1] to Address 1, [Postcode] to Postcode etc...
But only use the ID from each of the three queries:
[ID-A] from qryID-A, [ID-B] from qryID-B, and [ID-C] from qryID-C.

5) The final destination table (which I called tblAddress Final) has "Address 1" as its primary key. So when I run an append query to take all from the append in step 4 and place it into tblAddress Final, it will pop up a message that X records could not be sent due to key violations - therefore only importing one of each record.

6) run a similar append query, using the opposite of step 2 (where instead of count >1, I use count <2). This finds any remaining records that only had 1 entry or weren't duplicates.

And done! If anyone knows how to condense this, feel free to add!
 

Users who are viewing this thread

Back
Top Bottom