Procedures for importing and duplicate checking?

AndrewS

Registered User.
Local time
Today, 17:32
Joined
Feb 21, 2017
Messages
30
A request to see if I don't need to reinvent the wheel ...

We've been using Eventbrite for managing bookings on workshops etc and I've been asked to look at importing that data into our Access contacts database. I can download the booking data from the Eventbrite website in Excel format, and it's no problem just uploading it. But I'd like to minimise manual process and avoid duplicates.

So what I'm looking for is if anyone has any procedures (I couldn't see anything in the Code Repository forum here) that they're willing to share, and that would cover uploading from Excel, checking for duplicates and (this is my feeling as to the way to go), presenting the user with lists of current records and possible duplicates in the new data, so they can accept or reject possible matches.

I appreciate this is a bit cheeky, but I'm sure I'm not the first person to need to do something like this, and even something vaguely similar would be useful to get my thinking on the right lines.

Cheers.
 
I don't think you'll find something pre-written, as most excel imports processes are bespoke to the data layout coming in and the layout required within your system.

The easiest way is normally to have some form of staging table that you upload the new data into. You then query that against your existing data.

You can present this query result as a list box perhaps, depending on the number of records. You can then select the list box items you want to import, and do an append query, for either just the selected items, or the selected items and any that aren't matched (new data?)
 
To add to Minty's comments, and to give an example, I had a case of taking Excel data of military awards to individuals and trying to sort out duplicate personnel. While military service number was one identifying field, some had enlisted multiple times (with different service numbers) and at one stage when different military services had in the past issued the same set of service numbers to respective recruits.

Then there is the possibility of different forms of the same name eg William/Bill. On top of that, there will be typos.
 
Further to the comments by Minty and Cronk which I agree with, many people have spent careers cleansing data. A staging type table is often part of the solution, and you can probably make use of unique composite index.
An issue you will face is to clearly identify what makes a duplicate, or said differently - is person A the same person as B or C?

I found this re temp/staging table in a response to a similar question:
"Most likely you use a temp table when you are importing records from elsewhere (like, say, Excel, or another db, perhaps maintained by someone else). Then a form is used to manipulate the record data (manually, with queries or with VBA procedures) - adding, deleting, modifying records/data until the records are in the condition you want them to add to a "permanent" table."

Good luck.
 
Last edited:
..One particular bank, the comma separated values do not format correctly in to MS Access, and I don't know why.
...
Do the commas been placed correct in the file, when you look at it in program like Notesblok?
If the values are easy to mask, then you could post a part of the original file, where you know there is a problem.
You have not written what exactly goes wrong, but what I suspect is that there may be some hidden characters in the file, like Tab, Return etc..
 

Users who are viewing this thread

Back
Top Bottom