eliminating redundant records

xxyy · May 21, 2007

Dear All,

I have a table with several fields.

Concentrating only on few fields I'd like to sort out redundant information (unnecessary records).

(It is similar to using Ecxel: Data>Filter>Advanced filter>Unique records only)

So, I'd like to ask for your help how can I reduce my table to unique records.

thx

Csaba

Dennisk · May 21, 2007

use a query and set the output rows to unique either from the the query property sheet or by using the Distinct keyword.

Although I would recommend you create another table with the unique data and if it on the one side of a one-to-many relationship and the one table is using an autonumber as a PK then you will have to update the FKs on the many side.

xxyy · May 21, 2007

Hi Dennisk,

Sorry, I'm just beginner.

On the property sheet I found the following entries:

description
format
input mask
caption
smart tags
+
display control

Maybe I didn't use the property sheet you mentioned. Could you help please how/where to set the 'unique' property for output rows?

Thanks.

xxyy

Dennisk · May 21, 2007

thats the property for the column not the query. Click in the table window of the query, or select View/SQL and add the DISTINCT keyword after the SELECT keyword

jal · May 22, 2007

If you need to actually DELETE the redundant records, you're best off using the following GROUP BY method (I got it from Molinaro's book) instead of DISTINCT. First, add an autonumber column if you don't have one, by pasting the following into the sql editor and running it.
ALTER TABLE Customers ADD COLUMN ID1 COUNTER
Now assuming you want to remove all rows where first-last name repeats do this:
DELETE FROM Customers
WHERE ID1 NOT IN
SELECT MIN(ID1) FROM Cusomters GROUP BY LastName, FirstName
Or to get a preview of which rows will be deleted, first do this:
SELECT * FROM Customers
WHERE ID1 NOT IN
SELECT MIN(ID1) FROM Cusomters GROUP BY LastName, FirstName
EXPLANATION: The GROUP BY clause is used for working with rows that have values in common (for example GROUP BY State would be used in a sales report to total sales for all rows for items sold in the California stores. Thus the GROUP BY clause detects where a value is repeated/duplicated in a column (in this case detects all rows that have "California" in the State column). If we have
ProductName QtySold Day
widgets 5 Monday
widgets 10 Monday
tape 1 Tuesday
tape 2 Wednesday
And we write
Select ProductName, sum(qtySol)
Group By ProductName
We'll get
widgets 15
tape 3
But what constitutes a duplicate entry if the Group By is multiple? For example if we have
Select ProductName, Day, sum(qtySold)
Group By ProductName, Day
In this case a duplciate entry is anytime we have a matching pair for ProductName and Day (such as Widgets-Monday repeated). Thus the first two rows constitute a dup entry
widgets 5 Monday
widgets 10 Monday
With the result of the Select being this:
widgets 15 Monday (aggregated dups)
tape 1 Tuesday (no dups to aggregate because no name-day dup)
tape 2 Wednesday (no dups to aggregate because no name-day dup)
Now I can use this principle to eliminate duplicates from any table, for example these dups:
Acct Cpt
23123 99285
23123 99285
23123 99285
34444 12001
34444 12001
34444 12001

Access always remembers which rows are in each dup set and thus knows the line ID# (called ID1) for each row in the current dup set. To remove all dups without removing the original entry, keep any ONE of the dup entries (we chose to keep the one with the lowest line-ID#, given by Min(ID1), but we could have chosen Max(ID1) or Median(ID1). This explains the original example which is similar to the following example.
DELETE FROM AllAccts
WHERE ID1 NOT IN
SELECT MIN(ID1) FROM AllAccts GROUP BY Acct, CPT

xxyy · May 22, 2007

Dear Denisk and Jal, thank for your help!!!

eliminating redundant records

xxyy

Registered User.

Dennisk

AWF VIP

xxyy

Registered User.

Dennisk

AWF VIP

jal

Registered User.

xxyy

Registered User.

Similar threads

Users who are viewing this thread