Query extracting observations between start and end date

fboehlandt

Registered User.
Local time
Today, 02:18
Joined
Sep 5, 2008
Messages
90
Hello everyone,
I have the following normalized table containing the return observations for different dates and various investment funds. The table looks as follows:

[Fund1] [Code1] [Date1] [Return1]

[Fund1] [Code1] [Date2] [Return2]

[Fund1] [Code1] [Date3] [Return3]

...

[Fund1] [Code1] [DateN1] [ReturnN1]

[Fund2] [Code2] [Date1] [Return1]

[Fund2] [Code2] [Date2] [Return2]

[Fund2] [Code2] [Date3] [Return3]

...

[Fund2] [Code2] [DateN2] [ReturnN2]

...

[FundM] [CodeM] [Date1] [Return1]

[FundM] [CodeM] [Date2] [Return2]

[FundM] [CodeM] [Date3] [Return3]

...

[FundM] [CodeM] [DateNM] [ReturnNM]



where M is the total number of funds and Nk denotes the number of observations per fund, [Fund] is the fund's name and
Code:
 some arbitrary serial number unique to each fund. Note that not all funds have the same number of observations (i.e. different reporting start and end dates).
I would like to run a query so as to extract funds with a continuous performance track record between a specified start and end date. I came up with the following query that works. However, it is very slow and not very elegant. In addition, it assumes that if a fund has a record for the start date and the end date, then the record must have a continuous track record between those dates (which, luckily, is true):

[CODE]SELECT Performance.*
FROM Performance
WHERE Performance.Code IN
(SELECT Performance.Code
FROM Performance
WHERE Performance.Code IN
(SELECT Performance.Code
FROM Performance
WHERE Performance.MM_DD_YYYY=#startdate#) 
AND Performance.MM_DD_YYYY=#enddate#)
AND Performance.MM_DD_YYYY BETWEEN #startdate# AND #enddate#;

I hope anyone's got a more efficient query to do this. Any help is greatly appreciated...
 
Just an observation (I think you had another thread where I stated this, but not sure it was yours) -

Your table structure is NOT NORMALIZED. Having more than one table with the same fields means it is not. It should have one table with an identifier which makes the distinction between records as to whatever makes the difference.
Never mind - I think I read it wrong. You were showing ROWS of data as an example and not from different tables, correct? If so, sorry about the above in gray.
 
Last edited:
And then I just went back and found out that it IS you who has a non-normalized structure with PerformanceA and PerformanceB as two tables with the exact same fields. So, I stand by my assertion in the gray above, that your table structure is NOT NORMALIZED.
Normalize.png
 
As per my other post:
Originally, there were to access databases. Each database contains the same tables with identical fields but different (unique) records. I created a new database containing all tables from the two previous databases. Lastly, I used a simple Union query to combine the two 'Performance' tables (which are normalized). Is there a better way?
The structure of the two tables are as outlined above. I seek to combine the two tables (identical design) 'PerfromanceA' and 'PerformanceB' into 'Performance' (i.e. combining the two databases). I thought to use a Union query but may be that is not appropriate?

http://www.access-programmers.co.uk/forums/editpost.php?do=editpost&p=1027038
 
Please tell us about the tables.
What fields in the table?
What is the Primary Key?
As Bob said----Normalize your tables. Life with database becomes a lot less hectic.

Merry Christmas
 
Merry christmas,
here the table 'Information':

[Fundname] [Fundcode] [Description]

Here, Fundcode is the primary key. Like autonumber, fundcode is a random unique code assigned to each investment fund.

The table 'Performance' looks as follows:

[Fundname] [Fundcode] [Date] [Return]

Here, Fundcode would be the foreign key to link the two tables (1:n relationship). So far, so good.

Now, I have two distinct databases. Their structure is idenitcal, i.e. they both contain tables 'Information' and 'Performance'. The differences are the distinct records in each database. For ease of data sourcing I would like to combine the two databases. The resulting tables would then be table 'InformationNew':

[Fundname] [Fundcode] [Description] [Database]

and the 'PerformanceNew' table retains the fields of the original tables:

[Fundname] [Fundcode] [Date] [Return]

and contains the return observations for funds from both databases. Assume that the Fundcode is unique across the two databases. Let#s take it from here: What are your thoughts on this?
 
You don't want FundName in both Tables if FundCode is the PK.
You can get FundName via a query.

What's with the 2 databases, I thought we were talking about 2 tables????

You'll need a composite key on Performance to make records uniquely identifiable.
 
Again:
Originally, there were two databases. Now, it is one. I have imported all tables from two databases into one. Then, I ran a UNION query to combine the tables. Don't worry about the primary keys now. This is not relevant. I don't seem to get my point across so I will see if I cannot narrow down the problem before posting it again. Thanks for your input
 
I don't understand your query, but then I am a simple guy, I thought that all you would need would be

Code:
SELECT Performance.*
FROM Performance
WHERE  Performance.MM_DD_YYYY BETWEEN #startdate# AND #enddate#;

Brian
 
Not quite,
that would yield all investment funds and their respective performance between start and end date. Not all fund will have a continuous track record between start and end date (i.e. some observations will be missing). I need all funds with continuous track record only. Thus, I run one query to check for all funds that have a return observations for start date, another query to check for all funds that have a return observation for end date, and lastly a query to extract the return obervations for funds that belong to both previous queries. Instead of running the queries one after the other, I use
Code:
IN (SELECT...IN (SELECT...)...)
Hope that clarifies?
 
:o
Sure It made me go back and look at your query, I totally missed the ) after both the start and end dates in the Where clause, guess the old eyes are getting worse.

Brian
 
You have presumably chosen subqueries over cascading queries as they appear more elegant, however i remember reading a discussion on the merits of both and one of the MVPs stated that subqueries can constrain the optimisation by the compiler.
In your case all 3 queries are ging to read all of the file testing the date field(s), with cascading queries you would Join the 2nd and 3rd on the Fund, joins are known to be efficient.
How about 2 queries which I think also remove the luckily element you mention in Post 1.

SELECT fund
FROM Table
WHERE (((fundate) Between [Startdate] And [Enddate]))
GROUP BY fund
HAVING (((Count(fund))=DateDiff("d",[Startdate],[Enddate])+1));

This selects all the funds that have an appearance count between those dates equal to the number of days, inclusive.
Then join this back to the original table to extract any other info needed.

I'm not a performace wizard but think this may be faster.

Brian
 
Another approach if you are happy that a fund that is there on the start and end dates meets the criteria is to cobine those two queries along the lines of

Select Fund
From performance
where funddate=[startdate] or funddate = [enddate]
group by Fund
Having Count(Fund)=2

Brian
 

Users who are viewing this thread

Back
Top Bottom