Top/Max/Subqueries ... losing my mind

Cait

Registered User.
Local time
Today, 22:25
Joined
Feb 25, 2008
Messages
15
I've looked at so many options for writing this query that I can't see the forest for the trees and hoping you kind folks will point me in the right direction.

I have a table called services that includes:

service_id
prop_id
service_year (ie: 2007, 2008)
service_date (ie: 3/3/08)

service_id is a unique ID for each service which relates to data I will need to pull from another table.

prop_id relates to a property ID in another table, not unique as multiple services against one property.

The problem that I have is that each prop_id can have more than one service_date for the same service_year. I need to be able to find the latest service_date on a prop_id and its associated service_year and service_id.

Any pointers would be greatly appreciated. BTW, using this in Access 2003 & 2007
 
query1 GroupBy on prop_id and service_year,and Max on Service_date
query2 join query1 on all fields back to the table and pull in the service_id

brian
 
This is close, but not quite. Each financial year, I generate a service_id and service_year for that financial year on each prop_id. In some cases, the service_date might actually be null because we haven't done at least one service in that financial year and that's something I need to know. In other cases, we may do multiple services in one financial year.

Query1 pulls data as expected including null service_dates

But Query2 when linking prop_id, service_year and service_date back to Query1 will not pull null service_dates. I not only need to know that max service_date and its associated data, but if we've done a service in that financial year at all (and its associated data). :confused:
 
Last edited:
I have been working this problem, hoping to find the answer without someone doing the work for me.

Here's what I've found bearing in mind that each service year will automatically have 3027 records generated against it (which = the total number of properties), plus any duplicate services done in that service year ...

Using Brian's method:

SELECT Query1.prop_id, Query1.service_year, Query1.MaxServiceDate, tblServices.service_id FROM Query1 INNER JOIN tblServices ON (Query1.MaxServiceDate = tblServices.service_date) AND (Query1.service_year = tblServices.service_year) AND (Query1.prop_id = tblServices.prop_id) WHERE (((Query1.service_year)=2007))
ORDER BY Query1.prop_id;

This produces only 2893 records where I would expect 3027. I get the service_id on all records but no records with a null service_date. RIGHT JOIN produces same results.

LEFT JOIN produces the 3027 records I expect (the largest service_date for each property, failing that a null service_date), but fails to pull the service_id on null service_dates and I need that service_id even on null dates to link to another table.

-------------------

I did a bit fiddling last night and have a query that pulls the largest service_date, failing that the null service_date with its associated service_id with on all 3027 property_id. This is exactly what I expect, but ... it takes 1:11 to run. On a table of 8,000+ records, that's unworkable. This is what I wrote:

SELECT tblServices.prop_id, tblServices.service_year, tblServices.service_date, tblServices.service_id
FROM tblServices
WHERE (((tblServices.service_id) In (SELECT TOP 1 service_id FROM tblServices as tblServices1 WHERE tblServices1.prop_id = tblServices.prop_id AND tblServices1.service_year = 2007 ORDER BY tblServices1.service_year DESC, tblServices1.service_date DESC)))
ORDER BY tblServices.prop_id;

If I remove the tblServices1.service_year = 2007 from the subquery, the query runs instantaneously; however, I may need to examine other services years other than current (ie: greatest) service_year.

Any help would be greatly appreciated. If I can crack this, then it's my last step to finally being able to normalise a horrendous table I've inherited.
 
Last edited:
You never mentioned Nulls.

Do an other query selecting all fields and criteria Is Null on Service_Date
then Union this with the 2nd query.
Union queries are covered in Help but probably is
Select* from Query2
Union
Select * from query3;

You may be able to directly Union queries 2 and 3 together in 1 query if that makes sense ie just a 3rd query not 3 and 4

Brian
 
Hmm, got what I expected: all 3027 properties ids with a max service_date or null, as appropriate, and its service_id. Was also able to UNION in the 3rd query rather than having 4th. Now just have to do some serious testing with union query and all the table joins I'll have to do to pull the rest of the information.

So thank you, Brian, for your advice! I would have never gone that route without a push in the right direction.

Previously, the app was running on a local machine with one user. Now it's going to need to be split into a FE and BE to be run across a network. Am I likely to take a performance hit on UNION query which will likely serve as a base in many other queries designed to link associated tables?
 
I don't know the answer to that. What I would advise is to start a thread in the General forum labelled e.g. "performance concerns" and give details of the size of the tables and any other information you can come up with say number of records expected in the queries. Hopefully one of the gurus on the site will respond, if they need specific info they will ask.

Brian
 
Hi Cait
I realised late last night that there is a circumstance where the UNION query approach does not work. I presume that you could have
service_id prop_id Year Date
1..............1..........2007..Null
2..............1..........2007..03/03/2007

both woould be returned using the Union approach so I tried a few things and came up with a change to query2 as below, it works on my small test data but you should test it.

Code:
SELECT Table1.service_id, Query1.prop_id, Query1.Service_year, Query1.MaxOfService_date
FROM Table1 INNER JOIN Query1 ON (Table1.prop_id = Query1.prop_id) AND (Table1.Service_year = Query1.Service_year)
WHERE (((Query1.MaxOfService_date)=[table1].[service_date])) OR (((Query1.MaxOfService_date) Is Null));

Sorry about that hope you haven't wasted too much time, I think this situation arose because the Is NNull situation arose after the first thoughts and I rushed the response. That's my excuse and I'm sticking to it :)

Brian
 
Yes, you are right, Brian, that situation above could indeed happen. Not sure when I can do more work on this database, but I will keep you updated as I progress. Thank you again for all your help!
 
Cait, why do you have a service date and a service year. Having a service_year is redundant data when the year can be calculated from the Service_Date.

Dallr
 
Two reasons:

1) I inherited it that way

2) We work off financial years rather than calendar, which makes selection a tad easier.

This table contains a lot of information that can be normalised further, but I'm still thinking and testing my way through scenarios.

Once I have a bit more time to dedicate to this, I will probably drop the service years altogether.
 
I assumed that the service _year meant the year the service was due and that it might actually be in a different calendar year else why not Year_Serviced.

Brian
 
I assumed that the service _year meant the year the service was due and that it might actually be in a different calendar year else why not Year_Serviced.

Spot on, Brian. It's the year the service is due and it should be in the same financial year as service_date, although that's not always the case and that's where the null service_dates come from.

By law, we have to service each year (within 365 days) and I am assume it was easier for the person who designed this to search on null service_dates in a service year to find properties that hadn't been serviced within requirements.

Since I've taken it over, been using datediff date() - max(service_date) and finding overdue services that way which rather does make service years redundant.
 
Last edited:
"else why not Year_Serviced"
It all depends on the Business rules that need to be applied. However, from the brieft description noted by Cait, I still say that the service_year is irrelevant.

If you want to find Property that are overdue for a service then Use the following SQL.

Code:
SELECT YT.Prop_ID
     , DateDiff("m",Max(YT.Service_Date),Date()) AS MonthsOverdue
FROM YourTable AS YT
GROUP BY YT.Prop_ID
HAVING (((DateDiff("m",Max(YT.Service_Date),Date()))>=12));

If you want to find if there is a missed year in any one of your property then use the following SQL. The SQL is based on the one table you mentioned and used only two fields: Prop_ID and Service_Date everything else is derived.

You can replace the table "Yourtable" in the SQL with the name of your actual table in of your database and it should run one time.

Code:
SELECT YourTable1.Prop_ID 
     ,YourTable1.YearServiced
     , Yourtable2.YearServiced
     , ((Yourtable2.YearServiced - YourTable1.YearServiced) -1) As YearsMissed
FROM (SELECT YT.Prop_ID
           , Year([Service_Date]) AS YearServiced
           , (SELECT Count(Prop_ID) FROM YourTable YTS WHERE YTS.Prop_ID = YT.Prop_ID AND YTS.Service_Date <= YT.Service_Date) AS DerivedAutoNumber
FROM YourTable AS YT) AS YourTable2 RIGHT JOIN (SELECT YT.Prop_ID
           , Year([Service_Date]) AS YearServiced
           , (SELECT Count(Prop_ID) FROM YourTable YTS WHERE YTS.Prop_ID = YT.Prop_ID AND YTS.Service_Date <= YT.Service_Date) AS DerivedAutoNumber
FROM YourTable AS YT) AS YourTable1 ON (YourTable2.DerivedAutoNumber -1 = YourTable1.DerivedAutoNumber) AND (YourTable2.Prop_ID = YourTable1.Prop_ID)
WHERE ((Yourtable2.YearServiced - YourTable1.YearServiced) -1) >= 1
ORDER BY Yourtable1.Prop_ID , Yourtable1.YearServiced

I have also uploaded a sample database for your review. I hope it is attached properly.

dallr
 

Attachments

"else why not Year_Serviced"
It all depends on the Business rules that need to be applied. However, from the brieft description noted by Cait, I still say that the service_year is irrelevant.

If you want to find Property that are overdue for a service then Use the following SQL.

Code:
SELECT YT.Prop_ID
     , DateDiff("m",Max(YT.Service_Date),Date()) AS MonthsOverdue
FROM YourTable AS YT
GROUP BY YT.Prop_ID
HAVING (((DateDiff("m",Max(YT.Service_Date),Date()))>=12));

I do agree. You needn't have a service_year field to find overdue services. Been using a similar query to the above, only calculating off days rather than months OR looking for null service_dates which are classed as overdue too.

Code:
SELECT YourTable1.Prop_ID 
     ,YourTable1.YearServiced
     , Yourtable2.YearServiced
     , ((Yourtable2.YearServiced - YourTable1.YearServiced) -1) As YearsMissed
FROM (SELECT YT.Prop_ID
           , Year([Service_Date]) AS YearServiced
           , (SELECT Count(Prop_ID) FROM YourTable YTS WHERE YTS.Prop_ID = YT.Prop_ID AND YTS.Service_Date <= YT.Service_Date) AS DerivedAutoNumber
FROM YourTable AS YT) AS YourTable2 RIGHT JOIN (SELECT YT.Prop_ID
           , Year([Service_Date]) AS YearServiced
           , (SELECT Count(Prop_ID) FROM YourTable YTS WHERE YTS.Prop_ID = YT.Prop_ID AND YTS.Service_Date <= YT.Service_Date) AS DerivedAutoNumber
FROM YourTable AS YT) AS YourTable1 ON (YourTable2.DerivedAutoNumber -1 = YourTable1.DerivedAutoNumber) AND (YourTable2.Prop_ID = YourTable1.Prop_ID)
WHERE ((Yourtable2.YearServiced - YourTable1.YearServiced) -1) >= 1
ORDER BY Yourtable1.Prop_ID , Yourtable1.YearServiced

Now this is an interesting bit of kit and warrants some serious study on my part. Thank you!

As I mentioned above, each year a service_id and service_year are auto-generated for each property. We don't really need this auto generation of service_id and service_year because we can find overdues using datediff.

What you haven't seen in this service table is the rest of the rubbish which currently relates (though should be normalised out) to a service_id and service_year.

When a property hasn't been serviced it is because the tenant has not allowed us access to carry out the work. By refusing to do so, they are breaking the law and it requires us to record a specific series of date-based actions in order to comply with legislation. Those actions include: 1st class letter, recorded letter, home visit, solicitor's letter, etc.

At present, all of those date-based actions are in the services table and should be taken out and put in a separate table called NoAccess. I'm having trouble how deciding how to best relate NoAccess data. If I relate NoAccess to Services, then I must create a service_id each year for each property, otherwise you can't record NoAccess dates (kinda like what came first: chicken or the egg). If I relate NoAccess to Property, then I have to have a way of closing down the NoAccess record once a tenant has allowed a service. At the moment because service and no access are in one table, no access is closed when a service_date is entered.
 
I'd be a little bit cautious about abandoning Service_Year without talking to the business people.
It could be that in Dec 2008 work is slack and the big man says lets start some of 2009's early. Service_year seems to be only loose;y not directly related to Service_date.

All things are possible in business.

Brian
 
It could be that in Dec 2008 work is slack and the big man says lets start some of 2009's early. Service_year seems to be only loose;y not directly related to Service_date.”

Even if the aforementioned can happen, the first set of SQL I posted in my last response will cover this without needing to store the year separately. Because, when you do a “datediff()” on the last service date “max(Serviced_date)” The property would not show up on your listing as needing a service. Eg.
Prop_DI……….Service_Date
2……………………….01/01/2008
2……………………….12/29/2008

So even if by chance someone wants to service the same property twice in one year It would not be a problem.

That being said, I agree with Brianwarnock when he stated that we must know the entire business rules in order to have a better knowledge regarding if to keep service_year in or out.


Cait
It might be best to start a new thread specific to your normalization issues so that this thread would not be cluttered with two distinct issues. I am new to this forum so I am not sure what might be best. Maybe Brainwarnok can advice on the course of action to take since he has been here longer than me and would know it’s workings/Operations/ guidelines better.

Make sure when you are stating your business case it is very detailed and with possibly a DB attachment .

Regards
Dallr
 
Dallr is correct that if Cait wants to change the direction of the trhread to discuss jis normalistion/business issues it is better to start a new thread with that as the title as it will attract people with specific experience. I don't think we are there yet, and I dont think there is likely to be more to be said on this thread, tho' it is not for me to close it.

Brian
 
I don't think we are there yet
Hum.....

Don't mean to get touchy here but, I never stated that I was not up to the task. Maybe I being judged on my post count in this forum..... :(

I am above average at normalization to say the least. One can say "self praise is NO praise". :) Although that might can be the case, I am confident in my own ability.

All i was alluding to is that it might be best to use a separate post to distinguish the issues of the tread. Most forums encourage this distinction on issues.

Regards
Dallr
 
I think that you misunderstood something.
I did not say that we were not upto the task of discussing normalisation or anything else, but that the thread was not at the stage to break it away into a new thread, the purpose of a more speciffically targeted thread is that very experienced people, of which you may be one, may get involved.

Brian
 

Users who are viewing this thread

Back
Top Bottom