How to avoid Cartesian Joins

Herie

New member
Local time
Today, 23:24
Joined
Oct 3, 2012
Messages
2
I receive various data from an outside source to create reporting. Unfortunately non of this excel table has a primary key.
I have to connect the data via 'booking ID' field which has in all 3 tables muliple entries.

I entered SELECT DISTINCT in the SQL script. When I add the required fields they still deliver more rows than anticipated.

The SQL for 2 table currently is:
SELECT DISTINCT QUBE.booking_id, QUBE.cost_centre_category_name, QUBE.start_date, QUBE.amount, QUBE.costing_regular_local, QUBE.costing_offered_local, QUBE.saved, QUBE.lost, QUBE.no_of_seg_per_costing, QUBE.company_code, QUBE.company_name, QUBE.account_code, QUBE.account_name, QUBE.cost_centre_category, QUBE.cost_centre_group_code, QUBE.cost_centre_group_name, QUBE.cost_centre_code, QUBE.cost_centre_name, QUBE.number_of_pax, QUBE.booking_name, QUBE.[passenger(s)_name], QUBE.profile_unique, QUBE.costing_unique, QUBE.item_unique, QUBE.trn_number, QUBE.trn_date, QUBE.trn_type, QUBE.booking_int_flag, QUBE.booking_travel, QUBE.first_date, QUBE.last_date, QUBE.ordered_date, QUBE.segment_travel_type, QUBE.seg_provider_code, QUBE.provider_name, QUBE.preferred_provider_flag, QUBE.chain_code, QUBE.chain_name, QUBE.supplier_code, QUBE.supplier_name, QUBE.details, QUBE.segment_class, QUBE.segment_line, QUBE.segment_status, QUBE.service_code, QUBE.service_name, QUBE.first_city, QUBE.final_city, QUBE.destination, QUBE.destination_nation, QUBE.airline_designator, QUBE.ticket_number, QUBE.costing_deal_code, QUBE.no_rooms_cars, QUBE.reference, QUBE.costing_reason_code, QUBE.costing_reason_name, QUBE.miles, QUBE.ordered_by, QUBE.ordered_method, QUBE.trn_count, QUBE.costing_travel_type, QUBE.city_pair, QUBE.employee_type, QUBE.ticket_cc_fee, QUBE.best_fare_selected, QUBE.costing_exchange_ticket, QUBE.seg_first_nation_code, QUBE.seg_final_nation_code, QUBE.CLASS_GROUPING_NAME, QUBE.segment_flag, [CAR TAR].[Car type], [CAR TAR].Incidentals, [CAR TAR].[Days away]
FROM QUBE LEFT JOIN [CAR TAR] ON QUBE.booking_id = [CAR TAR].[Booking ID]
GROUP BY QUBE.booking_id, QUBE.cost_centre_category_name, QUBE.start_date, QUBE.amount, QUBE.costing_regular_local, QUBE.costing_offered_local, QUBE.saved, QUBE.lost, QUBE.no_of_seg_per_costing, QUBE.company_code, QUBE.company_name, QUBE.account_code, QUBE.account_name, QUBE.cost_centre_category, QUBE.cost_centre_group_code, QUBE.cost_centre_group_name, QUBE.cost_centre_code, QUBE.cost_centre_name, QUBE.number_of_pax, QUBE.booking_name, QUBE.[passenger(s)_name], QUBE.profile_unique, QUBE.costing_unique, QUBE.item_unique, QUBE.trn_number, QUBE.trn_date, QUBE.trn_type, QUBE.booking_int_flag, QUBE.booking_travel, QUBE.first_date, QUBE.last_date, QUBE.ordered_date, QUBE.segment_travel_type, QUBE.seg_provider_code, QUBE.provider_name, QUBE.preferred_provider_flag, QUBE.chain_code, QUBE.chain_name, QUBE.supplier_code, QUBE.supplier_name, QUBE.details, QUBE.segment_class, QUBE.segment_line, QUBE.segment_status, QUBE.service_code, QUBE.service_name, QUBE.first_city, QUBE.final_city, QUBE.destination, QUBE.destination_nation, QUBE.airline_designator, QUBE.ticket_number, QUBE.costing_deal_code, QUBE.no_rooms_cars, QUBE.reference, QUBE.costing_reason_code, QUBE.costing_reason_name, QUBE.miles, QUBE.ordered_by, QUBE.ordered_method, QUBE.trn_count, QUBE.costing_travel_type, QUBE.city_pair, QUBE.employee_type, QUBE.ticket_cc_fee, QUBE.best_fare_selected, QUBE.costing_exchange_ticket, QUBE.seg_first_nation_code, QUBE.seg_final_nation_code, QUBE.CLASS_GROUPING_NAME, QUBE.segment_flag, [CAR TAR].[Car type], [CAR TAR].Incidentals, [CAR TAR].[Days away]
HAVING (((QUBE.cost_centre_category_name) Like "*SBU*"));

Who has any constructive suggestions, how to get the correct amount of entries in my queries?
 
Till someone comes along, a bit of loud thinking :
1) You mentioned 3 tables, there appears to be only 2 tables in the SQL query posted, QUBE and CAR TAR.

2) When you say, there are no primary columns in the tables, assume, bookingid is not unique, in either of the tables.

3) Is it possible to have a seperate booking table with bookingid as primary key ( perhaps a look in to normalization )?

4) If 3) is not possible,
a) Which of the tables QUBE and CAR TAR is likely to have all the bookingid's & the other table likely to have bookingid's as a subset of this ? If it is for eg. table QUBE, then perhaps, get the distinct bookingid's from this table via a query & then use it to join with the other two tables QUBE and CAR TAR and vice-a-versa, in case it is table CAR TAR .

b) If 4a) is not the case, then, perhaps, use a union query on both the tables, to get distinct bookingid's & then use it in a query with the two tables.

Edit :
5) A look at the data & the logic of the query.

Thanks
 
Last edited:
Thanks recyan.

to 1) correct, currently only 2 tables are connected because of the issue of multiplying the entries.

to 2) I requested to have a field in each table included which would have a primery key from a 4th table (travellers unique profile id) and I am hoping this will solve the problem

to 3) the bookingId is not unique as I have the same booking ID for flight, hotel, car and TM fees.

to 4) the QUBE data is the main data and has included flight, hotel, car and other charges/fees. In the CAR TAR and HOTEL TAR the incidentals (chargebacks) are listed separately which I require for the reporting.

I hope this explains the data and the reason a bit more.

Thanks for any help.
 
1) In the QUBE table, perhaps, the bookingid along with some other field/s must be unique.
2) If 1) is the case, then, do your other tables also have similar fields ?
4) think, if you post your tables with some dummy data & the resultant data that you want your query to give, perhaps someone should be able to help you.

Thanks
 
From your description in #3, it sounds like you are trying to join tables from separate 1-many relationships simply because they have the same ID and that is why you are getting a Cartesian Product.

Rather than creating a monolithic query, use subreports for each data set to show cars, hotels, and incidentals independently.
 

Users who are viewing this thread

Back
Top Bottom