Deleting Repeted Records in a table (in mass)

sibemol · May 14, 2010

I have a database with thousands of records for each of the following fields: Medical Record Numbers (MRN), Names, DOB, Gender, Ethnic Group.

In ALL patients each row is repeated 9-10 times.

I would like to delete ALL repeated records based on MRN
in a way that it would allow me to have name, DOB, gender, ethnic group only once per MRN.

Could someone advice how to most quickly do this? Thanks.

PNGBill · May 14, 2010

A delete query is the way.

First, I suggest you make a copy of the table or database, which ever is convenient.

Second, make a select query that shows all the records to delete including the one to leave.

Then edit the query and ensure the result no longer includes the records you want to delete. The number should have reduced by the number of records you want to retain.

Change this query to a delete query, run same and check if all is correct. If so, delete the delete query and when you are really confident, delete the copy table.
A delete query should only delete the records that it would have returned had it been a select query.

Only tricky part may be where you reduce the records returned in your query.

Do you need help with this part, or any other?

sibemol · May 14, 2010

"Second, make a select query that shows all the records to delete including the one to leave."

How do you do that without having to read 10,000 rows? I do not know how to command acces to delete all repeated records without me having to choose each one individually.

Do you think a SELECT DISTINCT command could help here? Thanks.

PNGBill · May 14, 2010

This process needs to be done step by step or you could end up with a "dogs breakfast".

If you first do a select query that returns all records then you know how many records you have.

Then when you change this query so you return only the ones to delete then the diff in the count should be the qty of records you want to retain.

May be step one is to do some queries to analyse your data so you have proper information as to how many records you have and should have.

Supply some table names and field names and I will help you through some of these simple queries.

Remember, when you do a select query (common query) it will show you a number on the bottom of the screen. This is the number of records the query has selected. Nothing hard with this part.

sibemol · May 14, 2010

I have a table (imported from an Excel sheet) with 9,500 Rows. I have the following Fields: MRN, Person Name, Birth Date, Gender, Ethnic Group.

As I reviewed this table I realized each row is being repeated 8-10 times.

e.g.:

11111, Juan Moreno, 1/1/2010, Male, Hispanic
11111, Juan Moreno, 1/1/2010, Male, Hispanic
11111, Juan Moreno, 1/1/2010, Male, Hispanic
11111, Juan Moreno, 1/1/2010, Male, Hispanic
11111, Juan Moreno, 1/1/2010, Male, Hispanic
11111, Juan Moreno, 1/1/2010, Male, Hispanic
11111, Juan Moreno, 1/1/2010, Male, Hispanic
11111, Juan Moreno, 1/1/2010, Male, Hispanic

What I whish to do is to delete all the repeated rows, so that I end up with just one row for each patient.

11111, Juan Moreno, 1/1/2010, Male, Hispanic

This will allow me to then make the MRN the Primary Key. I have made copies of the table and the entire database, I am safe, in case anything goes spook!

PNGBill · May 14, 2010

Sounds good. and just one table.

I think you should add your primary Key now.

Insert a row at the top of your table in design view and call it what you want - say RecordID - and make it an auto number.

When you save the table you should then have distinct records all with an id from 1 to 9500.

You can delete this field later if you want another primary key.

Then make a query like this one:

SELECT TableMedical.RecordID, TableMedical.TelphoneNumber, TableMedical.Name, TableMedical.BirthDate, TableMedical.Sex
FROM TableMedical;

and the result should give you 9,500 records.

PNGBill · May 14, 2010

I have found this on Access help:

Automatically delete duplicate records from a table@import url(/Office.css);Automatically delete duplicate records from a table

In this procedure, you create a copy of the structure of the table that contains duplicates, make primary keys of all the fields that contain duplicates, and then run an append query from the original table to the new table. Because fields that are primary keys can't contain duplicate records, this procedure produces a table without duplicate records.
To create a new table

In the Database window, click Tables under Objects.
Click the name of the table you want to delete duplicate records from.
Click Copy on the toolbar.
Click Paste on the toolbar.
In the Paste Table As dialog box, type a name for the copied table, click Structure Only, and then click OK.
Open the new table in Design view, and select the field(s) that contained duplicates in the table you copied.
Click Primary Key on the toolbar to create a primary key based on the selected fields.
Save and close the table.

To append only unique records to the new table

Create a new query based on the original table containing duplicates.
In query Design view, click the Query Type on the toolbar, and then click Append Query.
In the Append dialog box, click the name of the new table from the Table Name list, and then click OK.
Include all the fields from the original table by dragging the asterisk (*) to the query design grid.
Click Run on the toolbar.
Click Yes when you receive the message that you're about to append rows.
Click Yes when you receive the message that Microsoft Access can't append all the records in the append query. This transfers only unique records to your new table and discards the duplicates.
To see the results, open the table by clicking Tables under Objects in the Database window and clicking Open on the Database window toolbar.
When you're sure the new table has the correct unique records, you can delete the original table, and then rename the new table using the name of the original

Study this and I will get back to you soon.

PNGBill · May 14, 2010

This query will return just one record for each of the duplicate records in your table and append the data to your new table:

INSERT INTO TableMedicalBackup ( TelphoneNumber, Name, BirthDate, Sex )
SELECT TableMedical.TelphoneNumber, TableMedical.Name, TableMedical.BirthDate, TableMedical.Sex
FROM TableMedical
GROUP BY TableMedical.TelphoneNumber, TableMedical.Name, TableMedical.BirthDate, TableMedical.Sex;

The diff between this and the access help is:

Make a new table with the field names etc and include the primary key as you would want.

Use this query to append the unique records from the old table to the new table and each record will automatically get it's new primary key.

do not use the old tables primary key in your append query.

sibemol · May 14, 2010

It worked!!!!!
It took me a while to figure it out, however, by following each of the steps, it worked!!! It deleted 5343 rows with repetitious MRNs.
Thanks a lot!

Deleting Repeted Records in a table (in mass)

sibemol

Registered User.

PNGBill

Win10 Office Pro 2016

sibemol

Registered User.

PNGBill

Win10 Office Pro 2016

sibemol

Registered User.

PNGBill

Win10 Office Pro 2016

PNGBill

Win10 Office Pro 2016

PNGBill

Win10 Office Pro 2016

sibemol

Registered User.

Similar threads

Users who are viewing this thread