Query is duplicating data

TiffBrassy

Registered User.
Local time
Today, 15:28
Joined
Jan 22, 2014
Messages
10
UGH!!!

Ok, I created a query using 2 tables, [tbl_transaction_details_b] and [Siemens_Contact_Info1]. Currently it is pulling 38 records (which is right and the same number of records in tbl_transaction_details_b). But when I add Line Number to the query (which is in [Siemens_Contact_Info1] I go from 38 records to 1019. Here is the SQL view before adding line number:

SELECT Siemens_Contact_Info1.[Mail Code], tbl_transaction_details_b.INVNUMBER, Format([INVDATE],"mm/dd/yyyy") AS InvoiceDate, "USD" AS InvoiceCurrency, Null AS [Canadian Tax Registration ID], tbl_transaction_details_b.PONUM, Null AS [Account No], Null AS [Payment Terms], tbl_transaction_details_b.SHIPTOCUSTOMERNAME, tbl_transaction_details_b.SHIPTOADDR1, tbl_transaction_details_b.SHIPTOCITY, tbl_transaction_details_b.SHIPTOSTATE, tbl_transaction_details_b.SHIPTOPOSTALCD, "USA" AS [Ship To Country], Null AS [Charge1 Name], Null AS [Charge1 Value], Null AS [Charge2 Name], Null AS [Charge2 Value], Null AS [Charge3 Name], Null AS [Charge3 Value], Null AS [Charge4 Name], Null AS [Charge4 Value], Null AS [Charge5 Name], Null AS [Charge5 Value], NZ([BASECHARGES])+NZ([OVERAGECHARGES])+NZ([EXTENDED_AMOUNT]) AS [Gross Amount], tbl_transaction_details_b.INVOICE_LINE_TAX_AMOUNT, "0" AS [Total Charges], NZ([BASECHARGES])+NZ([OVERAGECHARGES])+NZ([EXTENDED_AMOUNT])+NZ([INVOICE_LINE_TAX_AMOUNT]) AS [Invoice Total], Null AS Line_Number, tbl_transaction_details_b.MAKE_MODEL AS [Product/ Service No], "METER" AS Description, tbl_transaction_details_b.SERIALNUMBER AS [Siemens Part Number], tbl_transaction_details_b.QUANTITY, "EA" AS UNIT, tbl_transaction_details_b.Rate, tbl_transaction_details_b.INVOICE_LINE_TAX_AMOUNT AS [Sales and Use Tax], Null AS GST, Null AS PST, Null AS HST, Null AS QST, NZ([BASECHARGES])+NZ([OVERAGECHARGES])+NZ([EXTENDED_AMOUNT])+NZ([INVOICE_LINE_TAX_AMOUNT]) AS [Line Total]
FROM tbl_transaction_details_b LEFT JOIN Siemens_Contact_Info1 ON tbl_transaction_details_b.PONUM=Siemens_Contact_Info1.PO_NUM
GROUP BY Siemens_Contact_Info1.[Mail Code], tbl_transaction_details_b.INVNUMBER, Format([INVDATE],"mm/dd/yyyy"), "USD", Null, tbl_transaction_details_b.PONUM, Null, Null, tbl_transaction_details_b.SHIPTOCUSTOMERNAME, tbl_transaction_details_b.SHIPTOADDR1, tbl_transaction_details_b.SHIPTOCITY, tbl_transaction_details_b.SHIPTOSTATE, tbl_transaction_details_b.SHIPTOPOSTALCD, "USA", Null, Null, Null, Null, Null, Null, Null, Null, Null, Null, NZ([BASECHARGES])+NZ([OVERAGECHARGES])+NZ([EXTENDED_AMOUNT]), tbl_transaction_details_b.INVOICE_LINE_TAX_AMOUNT, "0", NZ([BASECHARGES])+NZ([OVERAGECHARGES])+NZ([EXTENDED_AMOUNT])+NZ([INVOICE_LINE_TAX_AMOUNT]), Null, tbl_transaction_details_b.MAKE_MODEL, "METER", tbl_transaction_details_b.SERIALNUMBER, tbl_transaction_details_b.QUANTITY, "EA", tbl_transaction_details_b.Rate, tbl_transaction_details_b.INVOICE_LINE_TAX_AMOUNT, Null, Null, Null, Null, NZ([BASECHARGES])+NZ([OVERAGECHARGES])+NZ([EXTENDED_AMOUNT])+NZ([INVOICE_LINE_TAX_AMOUNT]), tbl_transaction_details_b.TRANSACTION_TYPE
HAVING (((tbl_transaction_details_b.TRANSACTION_TYPE) Not Like "INV-INTRL LEASE RENT"));

What do I need to do in order to pull the line number for each record? Oh the line number is determined by the EQUIP_SEG which is also on [Siemens_Contact_Info1]

Thanks in advance.
 
I'd suggest you throw out all fields not pertinent to your problem and show that query.
 
...Currently it is pulling 38 records

Not exactly. Its returning 38 rows of data, but its finding 1019 records (or more). The discrepancy is because of your GROUP BY clause.

When you use a GROUP BY clause you (for lack of a better term) roll duplicates into one record. It does this by defining duplicate records as any records that have the same data that you have listed in your SELECT and GROUP BY clauses. When you added [Line Number] to those clauses all those records now have to match on an additional field, which they don't, so they all show.

Like spike, I didn't look at your query because its too complex and unwieldly. I did however see these 2 field names: [Charge3 Value] and [Charge5 Value]. Numerated field names are a sign of a poor table structure. You shouldn't number fields, instead you should break them off into another table and have each be its own record.

If you want to post your table structure we can help, but I think the issue you are experiencing is the tip of the poorly structured iceberg.
 
I agree with spikepl and plog too much fluff -- What exactly is the problem? What have you done to solve it? Why all the Null fields--are you trying to create an input for something else?
 
Thank you spikepl, plog, and jdraw for responding. My very first post I was advised to show the code for my query and then ask the question. It was a simple query and question. So this time I showed the code and it is too complex and unwieldly. I think I am finding a happy median now. I truly appreciate the constructive criticism.

@jdraw, I am trying to create an input for something else, hence the null fields and @plog, [Charge3 Value] is a required field name for the output. @spikepl, I should not have included all of that in my original question. Plog, I do think you are on to something and you understand my question. However, when I remove the GROUP BY clause it returns 1000+ records again. Do I need to create another table to piece this together?

Thanks in advance.
 
Thank you spikepl, plog, and jdraw for responding. My very first post I was advised to show the code for my query and then ask the question.

It isnt all about the query, it is about the format too.... Peopel trying to help you generaly dont want to be :banghead:

Therefor when you post code or sql you make sure it is READABLE, as you should already have done before you even posted because READABLE code is maintainable code and will often times point you into the direction of your problem(s).

This is your query, without the group by but in a more readable format and wrapped in code wraps ([cade] at the start and [/cade] at the end replace a by o obviously :) - or click the # button on the post menu)
Code:
SELECT Siemens_Contact_Info1.[Mail Code]
     , tbl_transaction_details_b.INVNUMBER
     , Format([INVDATE],"mm/dd/yyyy") AS InvoiceDate
     , "USD" AS InvoiceCurrency
     , Null AS [Canadian Tax Registration ID]
     , tbl_transaction_details_b.PONUM
     , Null AS [Account No]
     , Null AS [Payment Terms]
     , tbl_transaction_details_b.SHIPTOCUSTOMERNAME
     , tbl_transaction_details_b.SHIPTOADDR1
     , tbl_transaction_details_b.SHIPTOCITY
     , tbl_transaction_details_b.SHIPTOSTATE
     , tbl_transaction_details_b.SHIPTOPOSTALCD
     , "USA" AS [Ship To Country]
     , Null AS [Charge1 Name]
     , Null AS [Charge1 Value]
     , Null AS [Charge2 Name]
     , Null AS [Charge2 Value]
     , Null AS [Charge3 Name]
     , Null AS [Charge3 Value]
     , Null AS [Charge4 Name]
     , Null AS [Charge4 Value]
     , Null AS [Charge5 Name]
     , Null AS [Charge5 Value]
     , NZ([BASECHARGES])+NZ([OVERAGECHARGES])+NZ([EXTENDED_AMOUNT]) AS [Gross Amount]
     , tbl_transaction_details_b.INVOICE_LINE_TAX_AMOUNT
     , "0" AS [Total Charges]
     , NZ([BASECHARGES])+NZ([OVERAGECHARGES])+NZ([EXTENDED_AMOUNT])+NZ([INVOICE_LINE_TAX_AMOUNT]) AS [Invoice Total]
     , Null AS Line_Number
     , tbl_transaction_details_b.MAKE_MODEL AS [Product/ Service No]
     , "METER" AS Description
     , tbl_transaction_details_b.SERIALNUMBER AS [Siemens Part Number]
     , tbl_transaction_details_b.QUANTITY
     , "EA" AS UNIT
     , tbl_transaction_details_b.Rate
     , tbl_transaction_details_b.INVOICE_LINE_TAX_AMOUNT AS [Sales and Use Tax]
     , Null AS GST
     , Null AS PST
     , Null AS HST
     , Null AS QST
     , NZ([BASECHARGES])+NZ([OVERAGECHARGES])+NZ([EXTENDED_AMOUNT])+NZ([INVOICE_LINE_TAX_AMOUNT]) AS [Line Total]
FROM      tbl_transaction_details_b 
LEFT JOIN Siemens_Contact_Info1 ON tbl_transaction_details_b.PONUM=Siemens_Contact_In fo1.PO_NUM
Where     tbl_transaction_details_b.TRANSACTION_TYPE Not Like "INV-INTRL LEASE RENT";


This query now clearly shows you
1) have 2 tables
2) Pull most your fields from the transaction table and only 1 from Contactinfo

As Plog already stated the Line Number is making unique lines, which is probably a result of a dodgy design or a wrong query.
 
I see that you are building data for input to some other program, so you can be forgiven for having to conform to the requirements of said program. Thus the comments about structuring issues are clearly not your problem. They are the problem of the poor folks who have to work with that other program. But whoever designed that stuff had a "spreadsheet mindset." They should try to at least make it into the 20th century if they can't make it into the 21st. That is a ledger-book (paper) format last used by Bob Cratchett, I believe, working for Marley and Scrooge. But enough hum-bug.

When you tell SQL to return records, it returns every possible record it can find, but if there are any one-many combinations in the actual or implied JOIN, you will get one record for every COMBINATION of fields in the set of records you are retrieving. When you define record components in a way that cannot be combined without glossing over the distinct records, Access will balk - hence the 1000+ records you mentioned.

I'm going to give you advice on how to do this in a manageable way using the "divide and conquer" method of query building.

See every one of those "NULL" cases? Get rid of them (for a moment). They are clutter for us AND FOR YOU. You'll bring them back later when you get the rest of the query built.

Build a query or sequence of queries that recover only the non-null data. Use a layered query approach to retrieve the unique records and let a different layer do the "GROUP" by portion. Do not include the things in that GROUP BY query that cannot be grouped. I.e. if you group by a field that is the prime key of the table, you get one record for each key value regardless of the GROUP BY clause. It is only when you group by things that are NOT unique that you can effectively combine multiple records with the SQL Aggregate functions like COUNT, SUM, MAX, MIN, etc.

When you are done, you can then build a single overarching query that recovers your data and as a side-effect, returns an empty string, zero number, or whatever you need for the things where you were returning nulls. As to the idea of actually returning a null, I'm not sure that I ever would do so. You only ask for Access and SQL to balk because most SQLs I've ever seen really don't like nulls. But if you need that, you need it. Do it last because the nulls don't have to be injected into the output until the final query is run.
 
Thanks The Doc Man!!!! You are THE MAN!!! Turns out I was on the same track as you as far as deleting those fields that were not immediately neccessary. I had the thought of incorporating the MAX function but was stuck on when and where to use it. Then I read your post!!! Ding Ding Ding Ding! So I removed the GROUPED BY on those fields that could not be grouped and changed it to MAX OF and I finally was able to return 38 records with the corresponding Line Number!!!

Your reference to Bob Cratchett was my comic relief in the office today. We needed the laugh.

Thanks a bunch!
 
Last edited:

Users who are viewing this thread

Back
Top Bottom