Average of 150 values query

donnie_darko

New member
Local time
Today, 11:43
Joined
May 12, 2011
Messages
7
Hi...first time poster so be gentle :cool:

I have a table in the following format:

Code1 Code2 Reserve
1 5 678
1 9 23
......

Code1 takes values from 1 to 945 and Code2 takes values from 1 to 3000 (the table has 945*3,000=2,835,000 lines)

I need to summarise by Code1 for the average of the highest 150 Reserve values for each Code1 i.e. a table like:

Code1 Reserve
1 587
2 456
...
945 673

(e.g. 587 is the average of the highest 150 Reserve values for Code1)

Anyone have any ideas how to do this in a query? I tried messing about with SELECT TOP but couldn't get the right syntax.

Thanks in advance.
 
This requires multiple queries firstly you need to rank the data, call this queryrank, replace tablename with your table

Code:
SELECT x.code1, x.reserve, (select count(*) + 1
           from [I]tablename[/I]
          where code1 = x.code1
            and reserve > x.reserve ) AS rank
FROM [I]tablename[/I] AS x
ORDER BY x.code1, x.reserve DESC;
Then select as required

Code:
SELECT qryrank.code1, Avg(qryrank.reserve) AS AvgOfreserve
FROM qryrank
WHERE (((qryrank.rank)<151))
GROUP BY qryrank.code1;


BTW you only run the 2nd query


Brian
 
Thanks for the reply Brian - I'll give your code a try in work tomorrow.
 
I tried this code and it's not giving the correct answers. I think this is due to the fact that reserve, for the most part, has a value of zero so the RANK function won't be able to deal with this. In an example file with 3000 lines Reserve was non-zero for only 62 of these lines. Anyone have any idea how to get round this?
 
In what way is it not giving the correct answer.
If the top reserve was say 300 and then you had 149 at 0 then the average would be 2 the problem i realise is that if say there were 299 at 0 the average would be 1, ie if the records at the 150 count are of equal value and therefore rank they will be selected and thus averaged, it doesn't have to be 0 it handles that just like any other number.

hmm not sure how to handle this.

We need to create a running number within group for the criteria.

Thinks!!

Brian
 
I think that you have to decide how you want to handle equal values at the 150 mark. If there are only a couple and there is not a big change in value then the difference to the average probably wont matter, but that is your decision.

If there are a lot and/or if we now drop to zero then that is different. see simple example based on top5.

You could decide to ignore 0 by changing the code to this

Code:
SELECT x.code1, x.reserve, (select count(*) + 1
           from tablename
          where code1 = x.code1
            and reserve > x.reserve ) AS rank
FROM tablename AS x
where x.reserve <> 0
ORDER BY x.code1, x.reserve DESC;

If none of the above suits then I think you must do it all in code and I am sorry I cannot help.
Perhaps a repost in the VBA forum , mentioning this thread, would help.

Brian

Brian
 

Attachments

Thanks Brian. I need to find the top 5% values so have to take account of all 150 values when calculating the average. The only way i can think of is to sort and then add an index from 1 to 3000 for each code1 - can that only be done by using vba code?
 
I have to ask...
Does your table not have a single primary key field?
You mention Code1, Code2, Reserve.
Is the primary key a combination of Code1 and Code2? Or do you have another, for example autonumber, column?
(It so often makes such solutions simpler.)
 
For example, assuming a single identifying column, PKID, then something like just:

Code:
SELECT T.Code1, Avg(T.Reserve) AS AvgRes
FROM tblTableName AS T
WHERE T.PKID In
    (
     SELECT TOP 150 T2.PKID FROM tblTableName T2  WHERE T2.Code1 = T.Code1 ORDER BY Code2 DESC
    )
GROUP BY T.Code1

On such a large table, these queries are always going to tax the engine a bit.
Make sure you've indexed the vital fields (Code1 for sure).
 
Code:
SELECT T.Code1, Avg(T.Reserve) AS AvgRes
FROM tblTableName AS T
WHERE T.PKID In
    (
     SELECT TOP 150 T2.PKID FROM tblTableName T2  WHERE T2.Code1 = T.Code1 ORDER BY Code2 DESC
    )
GROUP BY T.Code1

Joining Code1 & Code2 together (e.g. Code1_Code2) would be a unique identifier. Should I create a field with this identifier and call it PKID and use the code above?
 
Yes or just insert an Autonumber field into the table as the PKID if the pkid is not needed for any other reason.

Brian

PS shouldn't the order by be code1 not code2 I'm sure its a typo.
 
Last edited:
Yes or just insert an Autonumber field into the table as the PKID if the pkid is not needed for any other reason.

Brian

PS shouldn't the order by be code1 not code2 I'm sure its a typo.

I'm rubbish at Access so how would I create an Autonumber field for each code1 i.e. For Code1=1, autonumber from 1 to 3000, Code1=2, autonumber from 1 to 3000 etc?
 
>> Joining Code1 & Code2 together (e.g. Code1_Code2) would be a unique identifier.

It might well be - but if you created a new field holding that, them your data is not atomic. You're repeating the same information in fact.

Adding a primary key index over those two fields wouldn't help you with the query.
Any attempt to concatenate them in the query to achieve the same - although entirely possible - would be a nightmare performance-wise.

An autonumber field is as simple to add as could be.
Just open the table in design view add a new column (that PKID) and select its type as autonumber. Select it as the primary key field.
It will be automatically populated. The values in it are unrealted to the other columns (as it should be). The combination of Code1 & Code2 remains as a candidate key, but for performance reasons you're creating another unique identifier.

Cheers.
 
I was interrupted this morning by a phone call from my wife whose in hospital, having reread Leigh's code I think I am coeect in saying that Top also pulls in all values that Tie at the point of selection ie 150, so the code sufferes from the same problem as mine, even if it is neater.

I would love to be wrong.

Brian
 
Hope your wife is doing OK!

Yes, that had crossed my mind.
Although Jet defaults to what would be "WITH TIES" in another SQL dialect - and it intrinsically locked to that - it's not difficult to push further.
When there is a tie, the decision inevitably rests with the developer (informed by the client/user) to decide upon what criteria (i.e. other field) would break that tie.
For example if there was more than one Code2 values the same at the 150th position (I don't see how the business model we're decribing in this thread even allows it, given that Code1 & Code2 are a candidate key) then we could fall back to the larger Reserve value (or smaller if preferred) or just make sure we grab only one (even if it's random / abitrary) using the new PKID by including those in the ORDER BY clause of the subquery.
e.g.
Code:
SELECT T.Code1, Avg(T.Reserve) AS AvgRes
FROM tblTableName AS T
WHERE T.PKID In
    (
     SELECT TOP 150 T2.PKID FROM tblTableName T2  WHERE T2.Code1 = T.Code1 ORDER BY T2.Code2 DESC[COLOR=red], T2.Reserve, T2.PKID[/COLOR]
    )
GROUP BY T.Code1

Cheers.
 
The order by needs to be

ORDER BY T2.Code1 DESC, T2.Reserve DESC, T2.PKID DESC

I suspect Leigh is creating "air code" I never challenge MVPs without first testing.

Brian
 
Well, I wasn't assuming anything about the desired order particularly. Just that they be made unique. Yes if the higher values were wanted then they absolutely need to be DESCending.
 
I'm rubbish at Access so how would I create an Autonumber field for each code1 i.e. For Code1=1, autonumber from 1 to 3000, Code1=2, autonumber from 1 to 3000 etc?

Thanks for the responses so far. Can anyone answer the question above?
 
As mentioned above - an autonumber is completely independent of all other columns in the table. The code suggested doesn't require that it's related or varies depending on the other columns. You want it to be independent. You're keeping your Code1 and Code 2 columns. It's simply a way of uniquely identifying a row.

Add the autonumber column as I described above.

Cheers.
 

Users who are viewing this thread

Back
Top Bottom