records appearing in previous queries

Garren

New member
Local time
Today, 04:23
Joined
Jun 29, 2007
Messages
6
Hello everyone.

I use Duplicates Wizard to run about 14 queries which will show all possible combinations of duplicated records in an access table. Can anyone tell me how to add an extra column in reports 2 onwards displaying either a "yes" or a blank cell, being the answer to the question "does this record appear in any previous reports?"

Hope you can help.

Thanks.
 
Maybe we should be looking at how you're getting the duplicates to begin with. It worries me that there are so many ways there can be a duplicate.
Sounds like your table needs to be normalised. Can you give a quick explanation of where the data is coming from and what kind of dups you're getting? A cut down sample of the db would be most helpful. I'm using 2000 so if you're using something newer than that please convert the posted db.
 
Thanks for your reply. The data comes from our clients' accounts payable systems and we're specifically looking for invoices that have been entered into the system twice and therefore paid twice. Report 1 will do a duplicate test on supplier number, invoice number, invoice date and amount; Report 2 will do it again but this time leave out invoice number, Report 3 will leave out say supplier number etc etc so that over 14 different reports you have every possible way that an invoice may have been wrongly duplicated in a system. With say 100,000 lines of invoice data there are duplicated invoices in there and it's our task to locate them.

The problem I have is that in say report 2 which has the invoice number left out as a dupes test (thereby showing me dupes on the other 3 criteria) is that records which appeared in report 1 where all 4 criteria were duplicated will also appear in report 2. Since I've already seen the record in Report 1 I'll ignore it in Report 2 but in order to ignore it I need to know that it appeared in the previous report. Having a column asking the question "does this record appear in previous reports?" will save us a lot of time in scanning down the lists of duplicated invoices in reports 2 to 14 since these lists can be several thousand lines. I hope this answers your question.
 
I see where you are going. Actually, there are other ways to do this. In essence, you are looking for votes on whether something is or is not duplicate. My question is whether you are explaining the problem correctly. (Yeah, I know, that's an awfully pretentious question, but then again, I'm know to be an awfully pretentious guy.)

Seems to me that if the invoice has been paid twice, you can't leave out anything as a consideration for identifying what was paid. If you did, how do you know it was paid for this invoice and not some other one? In other words, if you are going to manage payments, you need uniqueness of transaction. If you are not enforcing uniqueness of transaction, you are playing loose with fiduciary responsibility (to your clients). And if in accepting a payment you don't maintain adequate records in the first place to verify the status of the account and to track its payments, you are inviting problems.

MY problem is that from here, I wonder whether it is your explanation that is the problem or your business rules that are lax.

OK, if that seems even the least bit harsh, please note that from the outside it simply appears to be playing loosely with something that shouldn't be treated loosely in my book. It's your business and your business model, so if your explanation was right, then OK, that's the hand you've been dealt. But I just have conceptualization issues, I guess.

As to the issue of whether you have duplicate payments, I question the logic of how you determine duplication, since it seems as though in your "incomplete" match cases, you still don't know whether the payment was duplicated or applies to a different billing cycle or what it references. And finding an overlap in query A that is missing one factor will still not tell you whether an overlap in query B represents a duplicate payment.

You can treat this as "invoice defines the payment" or "month + account number defines the payment" or several other things that define a payment. But what you store must always remove any ambiguity regarding what really got paid. Otherwise, you will NEVER know who owes you what amount and when it is overdue.
 
A couple more questions.
1. The table of invoices you're checking, is it static? I mean do you get a
new table of data each week, month, whatever, to run your reports against?
Or is this table continually getting new records added to it.
2. Do the records have a unique key field?
3. You say you have 14 qrys. Are they in sets? For example, you say you find dups for supplier, inv. nbr, inv. date, and amount. Then the second qry is a subset of those fields. So do you have like 5 or 6 different sets or ?
 
Thanks chaps for replying.

I'm sorry if I've caused any confusion but let me make a few things clear: my company does not make or receive any of the duplicated payments. Our clients are large multinational companies who process and pay several hundred thousand invoices per annum. When they are processing so many invoices it is inevitable that a certain percentage of those invoices are (due to human error) input into their system twice. For example, invoice number 1234 from Bob Smith Limited dated 5/6/06 for £687.70 might be mistakenly input again by the clerk some weeks later with the same details or one of the other details might be different. It might go in again as invoice number 01234 or as B.Smith Ltd or the amount might be slightly different or the date different etc etc.

These companies give us a download from their accounts payable software with the specific instruction to identify and investigate invoices that have been keyed into their system twice in whatever combination. Please take it as read that I know the duplicates are there: our task is to find them.

With larger clients we use an external consultant which has powerful software to look for duplicates and the tables they give can do what I'm asking. However, this costs us money and for smaller clients I've discovered that Access has a find duplicates wizard. Once we have received the data from our clients it is static and will not change. There are also other invoice details in the download per invoice such as posting date, currency and a unique reference number but the core details to identify duplicated entries are supplier number, invoice number, amount and invoice date. The reason why there are so many reports is because in no particular order this allows for all the possible combinations of the four key details. Report 1 shows me all records in the data where all four details are identical, which is straitghtforward. Because Report 2 is tested on invoice number, invoice date and amount but leaves out supplier number, this will show invoices in the data where these 3 are identical but the supplier number is not. We will then investigate whether the client has paid the same invoice to Bob Smith Limited and again to B.Smith Ltd, who are of course the same company.

This is repeated along the remaining reports but each time changing the combination of criteria for duplicates testing. However, the problem I wish to solve is that Report 2 will also contain records from Report 1, Report 3 will contain records from Reports 1 and 2 and so on and so forth all the way up to Report 14. With all due respect everyone, there's no need to ask why there are duplicated invoices in the data: I know there are and I know why they've happened and I know how to find them. All I need to know is how to write a macro which will link all the queries together and will display a column in Reports 2 to 14 which will display either a "Y" or blank cell to the question "is this record displayed in any previous reports?".

The reason why I need this is becuase we have to page down each Report deciding whether or not a pair of records represents the same invoice duplicated in the system. Some of these Reports contain tens of thousands of lines and it would be much easier if we could skip by the records we know to be in previous reports and have therefore already been looked at.

Sorry to be so long winded!
 
Thanks chaps for replying.

I'm sorry if I've caused any confusion but let me make a few things clear: my company does not make or receive any of the duplicated payments. Our clients are large multinational companies who process and pay several hundred thousand invoices per annum. When they are processing so many invoices it is inevitable that a certain percentage of those invoices are (due to human error) input into their system twice. For example, invoice number 1234 from Bob Smith Limited dated 5/6/06 for £687.70 might be mistakenly input again by the clerk some weeks later with the same details or one of the other details might be different. It might go in again as invoice number 01234 or as B.Smith Ltd or the amount might be slightly different or the date different etc etc.

These companies give us a download from their accounts payable software with the specific instruction to identify and investigate invoices that have been keyed into their system twice in whatever combination. Please take it as read that I know the duplicates are there: our task is to find them.

With larger clients we use an external consultant which has powerful software to look for duplicates and the tables they give can do what I'm asking. However, this costs us money and for smaller clients I've discovered that Access has a find duplicates wizard. Once we have received the data from our clients it is static and will not change. There are also other invoice details in the download per invoice such as posting date, currency and a unique reference number but the core details to identify duplicated entries are supplier number, invoice number, amount and invoice date. The reason why there are so many reports is because in no particular order this allows for all the possible combinations of the four key details. Report 1 shows me all records in the data where all four details are identical, which is straitghtforward. Because Report 2 is tested on invoice number, invoice date and amount but leaves out supplier number, this will show invoices in the data where these 3 are identical but the supplier number is not. We will then investigate whether the client has paid the same invoice to Bob Smith Limited and again to B.Smith Ltd, who are of course the same company.

This is repeated along the remaining reports but each time changing the combination of criteria for duplicates testing. However, the problem I wish to solve is that Report 2 will also contain records from Report 1, Report 3 will contain records from Reports 1 and 2 and so on and so forth all the way up to Report 14. With all due respect everyone, there's no need to ask why there are duplicated invoices in the data: I know there are and I know why they've happened and I know how to find them. All I need to know is how to write a macro which will link all the queries together and will display a column in Reports 2 to 14 which will display either a "Y" or blank cell to the question "is this record displayed in any previous reports?".

The reason why I need this is becuase we have to page down each Report deciding whether or not a pair of records represents the same invoice duplicated in the system. Some of these Reports contain tens of thousands of lines and it would be much easier if we could skip by the records we know to be in previous reports and have therefore already been looked at.

Sorry to be so long winded!
 
Thanks chaps for replying.

I'm sorry if I've caused any confusion but let me make a few things clear: my company does not make or receive any of the duplicated payments. Our clients are large multinational companies who process and pay several hundred thousand invoices per annum. When they are processing so many invoices it is inevitable that a certain percentage of those invoices are (due to human error) input into their system twice. For example, invoice number 1234 from Bob Smith Limited dated 5/6/06 for £687.70 might be mistakenly input again by the clerk some weeks later with the same details or one of the other details might be different. It might go in again as invoice number 01234 or as B.Smith Ltd or the amount might be slightly different or the date different etc etc.

These companies give us a download from their accounts payable software with the specific instruction to identify and investigate invoices that have been keyed into their system twice in whatever combination. Please take it as read that I know the duplicates are there: our task is to find them.

With larger clients we use an external consultant which has powerful software to look for duplicates and the tables they give can do what I'm asking. However, this costs us money and for smaller clients I've discovered that Access has a find duplicates wizard. Once we have received the data from our clients it is static and will not change. There are also other invoice details in the download per invoice such as posting date, currency and a unique reference number but the core details to identify duplicated entries are supplier number, invoice number, amount and invoice date. The reason why there are so many reports is because in no particular order this allows for all the possible combinations of the four key details. Report 1 shows me all records in the data where all four details are identical, which is straitghtforward. Because Report 2 is tested on invoice number, invoice date and amount but leaves out supplier number, this will show invoices in the data where these 3 are identical but the supplier number is not. We will then investigate whether the client has paid the same invoice to Bob Smith Limited and again to B.Smith Ltd, who are of course the same company.

This is repeated along the remaining reports but each time changing the combination of criteria for duplicates testing. However, the problem I wish to solve is that Report 2 will also contain records from Report 1, Report 3 will contain records from Reports 1 and 2 and so on and so forth all the way up to Report 14. With all due respect everyone, there's no need to ask why there are duplicated invoices in the data: I know there are and I know why they've happened and I know how to find them. All I need to know is how to write a macro which will link all the queries together and will display a column in Reports 2 to 14 which will display either a "Y" or blank cell to the question "is this record displayed in any previous reports?".

The reason why I need this is becuase we have to page down each Report deciding whether or not a pair of records represents the same invoice duplicated in the system. Some of these Reports contain tens of thousands of lines and it would be much easier if we could skip by the records we know to be in previous reports and have therefore already been looked at.

Sorry to be so long winded!
 
I don't see any way to do what you're asking using the queries you have.
Unfortunately, it doesn't always work out like that.
I put together a different approach in the attached db. It will double the
nbr of queries, but in the end you will only have a single copy of each potential duplicate. In the example, it checks all 4 fields and then just the 3 fields. If you look at tblInvoices and then tblDups you'll see that you only get the items you need to check. To run it yourself, clear tblDups and then just run the queries ending in "B". The "A" queries are Select queries finding the dups based on the criteria, and the "B" queries only pull the dups from the "A" queries where the records don't already exist in tblDups. It's not the most elegant solution, something written in VB would work better and I'll look at that if you want. It would probably be about what the outside consultant is doing for you.
The id field in my table would be equal to the unique reference nbr you said existed. I added the Sort Group field to give us a way to tell which query produced the records. Take a look and we'll go from here.
 

Attachments

Thanks for your help.

It might make a difference if you know that when we receive the CD from the external consultant the Reports are tables rather than queries. Perhaps they do this becuase they don't want us copying how to run the dupes reports!

If possible, I would like to be able to keep the Reports to the current number and if that means writing a VB I would appreciate your help greatly. To give you a visual idea of what I need I've copied a few lines into Excel from Report 3 provided by our consultant. You'll see at the end of the record there are two columns; one saying whether the record appears in previous reports and the other naming the report. This Report was 13,000 lines long and is testing duplicates based on invoice number, invoice date and gross amount but leaving out vendor number. When we flick down this Report we will ignore the "Y"'s but will look at the blanks because as you can see the other records have already appeared in Report 2. If you look at the first set of blanks you will see that the vendor number for each invoice is different and we will therefore wish to find those documents to see if they are indeed the same invoice. Without the "Earlier Reports" column we would have to look at each set of duplicates even though we've probably already looked at it in Reports 1 and 2.

This Excel spreadsheet is exactly what I'm looking to achieve but obviously in an Access query or table if need be.

Thanks.
 

Attachments

Users who are viewing this thread

Back
Top Bottom