Ranking Problem

kriemer

Registered User.
Local time
Yesterday, 22:14
Joined
Aug 28, 2005
Messages
38
I know this has been asked and answered, but I can't figure out the answers to meet my requirements so hopefully someone can help me.

I have a database which consists of several stock tickers sorted on "Date" and and calculated value ("ABS"). What I require is to be able to generate a "Ranking Value" from the ABS value, where the highest ABS = 1, next highest =2, etc. for any particular date group, then starts again. I currently create the ranking in Excel and re-import the results into Access. I have reached the 65.5k record Excel limit and now need to do all the work in Access

The code I use to generate my un-ranked table is as follows:
Code:
INSERT INTO [Ranking ABS Access] ( Ticker, [Date/Time], Equationtype, Calcgain, ABS )
SELECT [Main program walkforward explore table].Ticker, [Main program walkforward explore table].[Date/Time], [Main program walkforward explore table].Equationtype, [Main program walkforward explore table].Calcgain, Abs([PositionScore]) AS ABS
FROM [Main program walkforward explore table]
ORDER BY [Main program walkforward explore table].[Date/Time], Abs([PositionScore]) DESC;

I have found the following code which by description does what I need (though it will take someone much smarter than I am to translate it to my needs) albeit for a differently structured database:
Code:
select a.Dept, a.Subdept, a.Amount, count(*) as rank
from
(
select Dept, Subdept, Amount
from joe
group by Dept, Subdept, Amount
) as a
inner join
(
select Dept, Subdept, Amount
from joe
group by Dept, Subdept, Amount
) as b on a.Dept = b.Dept and a.Amount <= b.Amount
group by a.Dept, a.Subdept, a.Amount
order by a.Dept, a.Subdept, count(*)

Any and all help/directions/suggestions/and even questions will be greatly appreciated.
 
Well, assuming you want to rank abs across dates, not within date, one way to approach it is to have a query that generates the ranks for each abs score.

Something like
MyRankQuery

SELECT MyTable.ABS, DCount("*","MyTable","[ABS]<=" & [ABS]) AS Rank
FROM MyTable
GROUP BY MyTable.ABS, DCount("*","MyTable","[ABS]<=" & [ABS]);


Then make another query that joins the table to the first query.
FinalQuery

SELECT MyTable.MyDate, MyTable.ABS, MyRankQuery.Rank
FROM MyTable INNER JOIN MyRankQuery ON MyTable.ABS = MyRankQuery.ABS;

Note: that if there are two or more identical ABS scores then they will both be ranked the same (so tied for 3rd means the rank given to both is 4th)
 
CraigDolphin

You were right the 2nd time, I'm looking to rank ABS within a date group.

What to do?

Thanks for getting back to me.

k
 
Last edited:
Try:

SELECT MyTable.MyDate, MyTable.ABS, DCount("*","MyTable","[ABS]<=" & [ABS] & " AND [MyDate]=#" & [MyDate] & "#") AS Rank
FROM MyTable
GROUP BY MyTable.MyDate, MyTable.ABS, DCount("*","MyTable","[ABS]<=" & [ABS] & " AND [MyDate]=#" & [MyDate] & "#")
ORDER BY MyTable.MyDate, MyTable.ABS;
 
I'm Flumoxed

Craig;

I modified your code to suit the specifics of my database (table name, etc.) as follows:
Code:
SELECT [1 ABS scores to excel].Date/Time, [1 ABS scores to excel].ABS, DCount("*","1 ABS scores to excel","[ABS]<=" & [ABS] & " AND [Date/Time]=#" & [Date/Time] & "#") AS Rank
FROM [1 ABS scores to excel]
GROUP BY [1 ABS scores to excel].Date/Time, [1 ABS scores to excel].ABS, DCount("*","1 ABS scores to excel","[ABS]<=" & [ABS] & " AND [Date/Time]=#" & [Date/Time] & "#")
ORDER BY [1 ABS scores to excel].Date/Time, [1 ABS scores to excel].ABS;

Nothing happens when I execute the query. It just runs on and and on..

Can you see what I've done wrong?

Thanks in advance.

k
 
Last edited:
Well, for starters, having a field named Date/Time is an extremely bad idea. The slash mark is a special character and can cause odd things to happen.

Try changing your table name and field names to something that does not use spaces, and does not use special characters.

It may take a while for the query to execute if you have a lot of records in the table. I did test the sql on a sample db so it should work. See attached
 

Attachments

Craig,

The database is comprised of >65k records, and growing, and so the problem is indeed that the ranking routine takes a long, long, long time to complete. When I filter to reduce the number of records the ranking completes (so I know your code does work).

Are there any other ranking techniques that would run faster given the size of my database?

I could send you a copy of the database if you like to see how it is structured.

Many thanks again;

k
 
Not that I am aware of....about the only thing I can recommend doing is to create an index on the date and ABS fields. This may improve performance.

Also, if the data is going across a network, this might be slowing things down too.

VBA is almost always slower than SQL. You may have to live with the fact that you're asking the computer to do a lot of thinking/work to a lot of data. It's going to take some time.
 
Craig,

How would an index on Date and ABS fields help improve calculation performance?

Sorry for being obtuse.


Thanks

k
 
Index's help by creating a very fast way for the db engine to find records.

Think about scanning through a table, line by line to find a value. That's your db's situation without an index. Each read of each line is an operation that has to happen.

Now, for comparison, consider if someone had gone through and pre-sorted your table and gave you two cards to look at: one with values starting at 0 and ending halfway through the dataset, the other starting at half-way, and ending at the highest number.

You know the value you're looking for is one the first card because it is less than the half value. So, in one step, we've eliminated half the records.

But it gets better, there's another tier of card to consider. Let's say there's three second level cards that represent the range of values on the first card of the first tier.

Each card represents one third of the range of values. So you now know that your value must be in the last card of this tier.

Now, we're down to having to sort only 1/6 of the possible records after two operations.

You guessed it, we're not done yet. There's another tier of several cards that each represent a portion of the range of numbers on the third card of the second tier. Let's say there's another 3 cards at that tier.

As before, the data is ordered logically, and you can skip straight to the correct card because you know the value you're looking for.

Bingo, three steps has reduced the number of records to search to 1/18th of the number you started with. Now all you have to do is read through the records on that third tier card, and you're home free.

Generally, indexes have as many tiers as needed to ensure that the bottom-tier card have only a relatively small amount of data on them. From memory it's only about 8kb of data. And the usual terminology to describe cards in indices is 'pages'.

Hopefully you can see that indexes are much like card files. Like any method of organizing information, it can vastly speed up read-times, but there is a cost. Generally, whenever you have an index, it slows down write speeds to the table because each time you edit, delete, or add a new record to the table, the index must also be updated to reflect the new data.

Hope it helps
 
Indexing sounds like a valuable "trick".

2 remaining questions:

1] Is it possible to index a query result (as would be the case with ABS values)?

2] Your code ranks lowest to highest. How do I rank the highest ABS value as 1, next highest as 2, etc.

Many thanks. I hope this has been of help to others as well.

k
 
To reverse the order, you'd change the <= to >=, and probably change the Order by for ABS to DESC.

Code:
SELECT MyTable.MyDate, MyTable.ABS, DCount("*","MyTable","[ABS]>=" & [ABS] & " AND [MyDate]=#" & [MyDate] & "#") AS Rank
FROM MyTable
GROUP BY MyTable.MyDate, MyTable.ABS, DCount("*","MyTable","[ABS]>=" & [ABS] & " AND [MyDate]=#" & [MyDate] & "#")
ORDER BY MyTable.MyDate, MyTable.ABS DESC;

As for indexing a query, no. Remember, a query is simply a view of the data in the tables.

If you put the appropriate indices on your table, the db will automatically know to use them when figuring out how best to go about retrieving the data for your query.
 
The indexing reduced the run time from forever to about 15 minutes.

Wonderful^3; works a treat!

Kevin
 
Well, that's good to hear :) 15 minutes is still a long time to sit and twiddle your thumbs though!
 
I agree, normally 15 minutes is a computer eternity; but this is just one of many scheduled events that take place in the wee hours of the am. With Access providing the ranking I am able to remove an Excel item from the scheduler. I believe that the additional time in Access is more than justified by a simplification of the scheduler.

k
 

Users who are viewing this thread

Back
Top Bottom