Hide duplicate data across entire report (1 Viewer)

dbasch · Feb 10, 2003

Hello,
How do I remove duplicate records from a report accross the entire report? According to the Access help I can only remove duplicate records from each page using 'Hide Duplicates'. Any suggestions?
Thanks,
Derek Basch

FoFa · Feb 11, 2003

Base the report on a query and remove the duplicates in your query.

dbasch · Feb 11, 2003

Ok, this brings me back to my original problem. I was unable to utilize the DISTINCT function to remove duplicates because:

If the SELECT clause contains more than one field, the combination of values from all fields must be unique for a given record to be included in the results.

So, the DISTINCT function is worthless if you are returning more that one field. My solutions is to assign an Auto Number primary key to the source table. Then split the souce table into two tables, one with JUST the DISTINCT field (A) and the other with all the rest of the fields (B). Then, take the two tables and RIGHT JOIN them on the distinct field 'A RIGHT JOIN B'.

It seems like alot of ungraceful hacking that im having to do to make up for the shortcomings of 'DISTINCT' and 'Hide Duplicates'.

So, here is my question/problem. The source table mentioned above is created using a Make Table query based upon the product of a UNION query. In order to implemen my above strategies I need to insert an 'Auto Number' primary key into either the UNION or the Make Table queries.

Is it possible to add an 'AutoNumber' field to UNION or MakeTable queries?

I found this:

http://www.fabalou.com/Access/Queries/querycreator.htm

which seems to indicate that it can be done.

Thanks everyone,
Derek Basch

FoFa · Feb 11, 2003

Use Group By and any other agregate functions you need. Distinct is a limit use verb.

If the SELECT clause contains more than one field, the combination of values from all fields must be unique for a given record to be included in the results.

I hate to point out the obvious, but if the given value is NOT unique, than it is a duplicate. Your original question did ask how to remove duplicates. (I am just teasing you don't get upset). Disinct works on all fields in a query. Order By along with it's other agregate functions like MAX, MIN, SUM, COUNT, etc. allow you to change how that functions. Of course one of the effects of UNION is to remove duplicate entries also, if you want duplicate entries, you must use UNION ALL. UNION works like Distinct does. You can use the Group By on the product of the UNION to combine rows you want combined. Why not post an example of what you are trying to do?

dbasch · Feb 11, 2003

AHAH! GROUP BY! That is what I needed. Thanks FoFa.

GROUP BY Clause
Combines records with identical values in the specified field list into a single record. A summary value is created for each record if you include an SQL aggregate function, such as Sum or Count, in the SELECT statement.

dbasch · Feb 11, 2003

Nevermind, scratch the below!

Upon further investigation I discovered these gems:

All fields in the SELECT field list must either be included in the GROUP BY clause or be included as arguments to an SQL aggregate function.

groupfieldlist - The names of up to 10 fields used to group records. The order of the field names in groupfieldlist determines the grouping levels from the highest to the lowest level of grouping.

So, I still cant group by just one field! Well, here is an example of what im working with. Maybe you will see a solution that I dont:

[/code]
SELECT [ACT].[Create Date], [ACT].[Company], [ACT].[Record Creator], Qualified() AS Qualified, [ACT].[Lead Source]
FROM ACT LEFT JOIN [Web And Referral Leads] ON [ACT].[Company]=[Web And Referral Leads].[Company]
WHERE ((([ACT].[Create Date])>#12/31/2001# And ([ACT].[Create Date])<#12/31/2002#) And [ACT].[Record Creator]<>"Barry Bowman" And [ACT].[Record Creator]<>"Jorge Yant" And ([ACT].[Company]) Is Not Null);

I would like to remove ONLY duplicate [ACT].[Company] instances.

Thanks again everyone,
Derek Basch

Pat Hartman · Feb 12, 2003

You have a concept problem with what constitutes a duplicate. When you join the two tables involved in a 1-to-many relationship, the 1-side data is presented with each many-side record that it relates to. That doesn't mean that it has been duplicated.

In looking at your query, I see that you have [Web And Referral Leads] included but you are not selecting any columns from it nor are you specifically using it for selection criteria. If you really are seeing duplicates (all columns in a row), then that is what is causing them.

Finally, when working with 1-to-many relationships, it is usually better to make a main report to show the 1-side data and a sub-report to show the many-side data. That will eliminate other situations that look like you have duplicate data.

dbasch · Feb 12, 2003

Pat,
Thanks for the reply.

You have a concept problem with what constitutes a duplicate. When you join the two tables involved in a 1-to-many relationship, the 1-side data is presented with each many-side record that it relates to. That doesn't mean that it has been duplicated.

That makes sense. So, I still can't use the GROUP BY statement to group the 1 side data that is represented multiple times.

This is because the GROUP BY statement requires that I include every field declared in the SELECT statement. I have multiple fields so GROUP BY groups only records with all similar fields. It is impossible to GROUP BY just the [ACT].[Company] field.

In looking at your query, I see that you have [Web And Referral Leads] included but you are not selecting any columns from it nor are you specifically using it for selection criteria. If you really are seeing duplicates (all columns in a row), then that is what is causing them.

You are correct again. I am not seeing duplicate records I am seeing duplicate field data.

Finally, when working with 1-to-many relationships, it is usually better to make a main report to show the 1-side data and a sub-report to show the many-side data. That will eliminate other situations that look like you have duplicate data.

This also makes sense. Now I just have to figure out how to do it! I am actually creating my report from the UNION of two LEFT JOINS. The Qualified query below and an Unqualified query:

Code:

SELECT [Web And Referral Leads].[Company], [Web And Referral Leads].[Date], Unqualified() AS Qualified, [Web And Referral Leads].[Type of Lead] AS [Lead Source]
FROM [Web And Referral Leads] LEFT JOIN ACT ON [Web And Referral Leads].[Company]=[ACT].[Company]
WHERE ((([Web And Referral Leads].[Company]) Is Not Null) And (([Web And Referral Leads].[Date])>#12/31/2001# And ([Web And Referral Leads].[Date])<#12/31/2002#) And (([ACT].[Company]) Is Null));

It seems to me that this throws a wrench in using a subreport because the 1:M relationships are lost in the UNION. I can't see a way to do this without removing duplicate field data prior to the UNION? Sorry for all the huge messages everyone and thank you so much for your continued patience

Pat Hartman · Feb 12, 2003

1. Why are you referencing [Web And Referral Leads] in the query when you are not using it for anything? If a company occurs more than once in [Web And Referral Leads] you will be duplicating data.

2. What makes you think that the 1:m relationship is impacted by the union?

3. What do you expect to be eliminated by a Group By? The Group By would eliminate rows, NOT blank out columns. If you want to blank out columns because it looks neater, you MUST use a report.

dbasch · Feb 12, 2003

Pat,

1. Why are you referencing [Web And Referral Leads] in the query when you are not using it for anything? If a company occurs more than once in [Web And Referral Leads] you will be duplicating data.

I am using it.

A company is considered qualified if:

1) It is in the ACT table
2) It is in the ACT table and in the [Web And Referral Leads] table

FROM ACT LEFT JOIN [Web And Referral Leads] ON [ACT].[Company]=[Web And Referral Leads].[Company]

The above LEFT JOIN from the 'Qualified' query returns the desired result set. If a company name is in the ACT table more than once it returns all the records of that company from the ACT table. 1:M, as you pointed out. When their are multiple company records in the ACT table I would like to group them.

A company is considered unqualified if:

1) It is in the [Web And Referral Leads] table but not in the ACT table.

FROM [Web And Referral Leads] LEFT JOIN ACT ON [Web And Referral Leads].[Company]=[ACT].[Company]
.... AND [ACT].[Company] Is Null;

The above LEFT JOIN and AND from the 'Unqualified' query returns the desired result set. If a company name is in the [Web And Referral Leads] table more than once it returns all the records of that company from the [Web And Referral Leads] table. When their are multiple company records in the [Web And Referral Leads] table I would like to group them.

Expressions in the SELECT statements of both queries insert a true or false value into the 'Qualified' column. True for Qualified and False for Unqualified.

I then perform a UNION of both qeuries:

Code:

SELECT [Qualified Leads].Company, [Qualified Leads].[Create Date] AS [Date], [Qualified Leads].Qualified, [Qualified Leads].[Lead Source]
FROM [Qualified Leads]

UNION SELECT [Unqualified Leads].Company,  [Unqualified Leads].[Date] AS [Date], [Unqualified Leads].Qualified,  [Unqualified Leads].[Lead Source]
FROM [Unqualified Leads];

And there you have it! The (almost) desired result set. Everything would work perfect if I could group [COLOR=red]ONLY[/COLOR] the company names in the Qualified and Unqualified queries.

[QUOTE]2. What makes you think that the 1:m relationship is impacted by the union? [/QUOTE]Inexperience and a touch of ineptitude. :p

Thanks a million Pat and I hope this doesn't make your head hurt like mine does after writing this.

Regards,
Derek Basch

dbasch · Feb 13, 2003

No Go Joe

Looks like everyone got sick of dealing with this one. Bummer

Moderator:

could you please rename this thread to 'Grouping duplicate field data' and move it to the 'queries' section?

Derek

Pat Hartman · Feb 17, 2003

I've been away playing bridge. You keep saying:

And there you have it! The (almost) desired result set. Everything would work perfect if I could group ONLY the company names in the Qualified and Unqualified queries.

What do you expect to happen when you group by ONLY company? How would Jet know which rows to eliminate when they contain different values? Why not show us what the current recordset looks like and and then show us what you would like to actually see.

dbasch · Feb 17, 2003

What do you expect to happen when you group by ONLY company? How would Jet know which rows to eliminate when they contain different values?

I see what you mean. How can Jet know which company records field values to include with a grouped company record. It cant. I guess this would explain why Jet only allows for grouping by all fields included in the SELECT statement.

Why not show us what the current recordset looks like and and then show us what you would like to actually see

Here you go. Here is an example original recordset:

Code:

Company      Date
THX          12/12/2002
BZF          1/17/2001
FRT          3/24/2002
FRT          7/4/2002

And what I would like to see:

Code:

Company      Date
THX          12/12/2002
BZF          1/17/2001
FRT          3/24/2002

As discussed above, it is subjective what date should be included with the grouped FRT company record because such a thing isn't really possible.

I found an example query in the Access help file that kinda illustrates my issue:

The following SQL statement shows how you can use DISTINCTROW to produce a list of companies that have at least one order but without any details about those orders:

SELECT DISTINCTROW CompanyName
FROM Customers INNER JOIN Orders
ON Customers.CustomerID = Orders.CustomerID
ORDER BY CompanyName;

What I need is:
a list of companies that have at least one order with details about those orders

Thanks again,
Derek Basch

directormac · Feb 17, 2003

Some clariciation is perhaps needed...

I'm just getting back to the list and back to MS Access at all after a LONG hiatus, so forgive me if I'm posting the obvious, please.

HOWEVER, I'm wondering about a couple things you've said along the way...

A company is considered qualified if:

1) It is in the ACT table
2) It is in the ACT table and in the [Web And Referral Leads] table

<SNIP>

A company is considered unqualified if:

1) It is in the [Web And Referral Leads] table but not in the ACT table.

It seems to me that the only part needed of the above is "if it is in the ACT table." Simpliying your statements a bit, where X = "Is Qualified":

Code:

1:  X if (Y)
2:  X if (Y &&  Z)
3: !X if (Z && !Y)

So, all possible positive results for #2 #3 are already solved for by #1, are they not? A lead is qualified if they are on the ACT table, unqualified if they are NOT in the ACT Table. Maybe I'm just missing something...

Here's my next question. You started this thread by asking about a REPORT, and you keep saying things like:

And what I would like to see

So do you really need to eliminate the "duplicate" company names from the table/query, or is just that you don't want to see the same names on the report over and over? If THAT's the case, the grouping functions within the design of the report can make it pretty without any further SQL wrangling on your part.

Apologies to all if this is totally unhelpful. Still, glad to be back on the forums...

--Returned Mac

dbasch · Feb 18, 2003

So, all possible positive results for #2 #3 are already solved for by #1, are they not?

Almost. You are correct that rule #2 is redundent.

unqualified if they are NOT in the ACT Table

Rule #3 however, is still needed because ( && !Y) must also be satisfied. So revised it would be:

Code:

1:  X if (Y)
2: !X if (Z && !Y)

A company is considered qualified if:

1) It is in the ACT table 

A company is considered unqualified if: 

1) It is in the [Web And Referral Leads] table [COLOR=red]and[/COLOR] not in the ACT table.

So do you really need to eliminate the "duplicate" company names from the table/query, or is just that you don't want to see the same names on the report over and over?

The prior case. I need to remove the "duplicate" company fields from the original query so the the report will sum/total correctly.

Apologies to all if this is totally unhelpful. Still, glad to be back on the forums...

Actually it was quite helpfull. It made me realize that one of the logic rules was redundent.

.

The root of my problem still lies in how to group records with duplicate company fields while "bringing along" one of the grouped records non-company fields. Make sense? No one seems to get it

Every grouping or distinct function works on either a singular field or the combination of a set of fields. I need to apply these functions (I think) to a singular field in a set of fields.
Thanks for all the help everyone,
Derek Basch

Pat Hartman · Feb 18, 2003

If you don't care which of the non-company field values are returned (somewhat curious to my mind, but OK), then you can group by the company field and use the First() function for the others.

SELECT First([ACT].[Create Date]), [ACT].[Company], First([ACT].[Record Creator]), Qualified() AS Qualified, First([ACT].[Lead Source])
FROM ACT LEFT JOIN [Web And Referral Leads] ON [ACT].[Company]=[Web And Referral Leads].[Company]
WHERE ((([ACT].[Create Date])>#12/31/2001# And ([ACT].[Create Date])<#12/31/2002#) And [ACT].[Record Creator]<>"Barry Bowman" And [ACT].[Record Creator]<>"Jorge Yant" And ([ACT].[Company]) Is Not Null)
Group By [ACT].[Company], Qualified();

First() simply returns the value from the first company record in the recordset. The problem is somewhat more complicated if you want the earliest date for example. Post back if that is the situation since Min() will NOT give the same results as First().

directormac · Feb 18, 2003

Hmmm...

Ok, in re: your selection logic, I'm still not convinced test #3 is required to find out if the company is qualified or not . Note that this is different from those tests performed to see if the particular record should be in the recordset or not. It seems that if we have already run X if Y on all records within the already selected recordset then any record which has not been positively tagged for X can be assumed as !X. Instead of asking "Are they qualified or not qualified?" you are simply asking "are they qualified?"

Did that make any sense at all?

I think part of the problem in getting the best possible help here may still be in communicating the problem fully. I for one suspect that you need to break the Company info out into a separate table to get this thing working. For clarification, though, break it down for us further. You start with something like:

TABLE: ACT
FIELD: (PK) LeadID [autonum]
FIELD: Create Date [date/time]
FIELD: Company [text]
FIELD: (FK) Record Creator [long int]
FIELD: (FK) Lead Source [long int]

TABLE: Web Leads
FIELD: (PK) WebLeadID [autonum]
FIELD: Create Date [date/time]
FIELD: Company [text]
FIELD: (FK) Type of Lead [long int]

Yes? Correct that where I'm wrong. Then give us a quick sample recordset for EACH of those tables, and then the recordset you want to achieve (as before).

Whoops, I see while I've been typing this that Pat has probably answered the question (as usual). Ignore me if she has. (Hey Pat, long time no see!)

--Slow Mac

dbasch · Feb 18, 2003

Eureka!

Eureka! That is perfect Pat! Finally it works! Thank you so much!

Directormac:

Ok, in re: your selection logic, I'm still not convinced test #3 is required to find out if the company is qualified or not . Note that this is different from those tests performed to see if the particular record should be in the recordset or not.

Yes, technically test #3 would not be required if the records were only in one table. However they are in two, so I must check that a record from the [Web And Referral Leads] table does not exist in the ACT table, which would make the record qualified. I would have to take the UNION of the qualified and unqualified record sets to be able to simplify the logic to:

Code:

if !x then x

Make sense?

Thank you everyone and I hope I can return the favor to the forum soon.

Derek Basch

Hide duplicate data across entire report (1 Viewer)

dbasch

Registered User.

FoFa

Registered User.

dbasch

Registered User.

FoFa

Registered User.

dbasch

Registered User.

dbasch

Registered User.

Pat Hartman

Super Moderator

dbasch

Registered User.

Pat Hartman

Super Moderator

dbasch

Registered User.

dbasch

Registered User.

Pat Hartman

Super Moderator

dbasch

Registered User.

directormac

Occasional Presence

dbasch

Registered User.

Pat Hartman

Super Moderator

directormac

Occasional Presence

dbasch

Registered User.

Similar threads

Users who are viewing this thread