Using Inconsistent Date Time as Join

mbreu996

I wish to use a date time stamp as the join between 2 tables in order to perform various queries. My issue is this:

Table1 - this is a time series table; its date time stamps may be at any interval (every 15 minutes, every 30 minutes, or no consistent interval at all). This table has many more records than Table2, as it holds data from stream height recorders collected at a much higher frequency than Table2, which only has date times from site visits (these may occur anywhere from once or twice a day to only once a month).

Table2 - this table has date time stamps that could be any date time; whenever a person visits a site, they record the current date time.

I have been able to join two tables on a date time stamp before, but the way I did it was to have Table1's date times fall on 15 minute intervals rounded to the 5th decimal place, and the same for Table2. This is not flexible enough; I want to be able to join Table1 and Table2 based on "the nearest date time match" when both tables have inconsistent times that don't match exactly.
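For context, that rounding step works because Access stores a date/time as a Double (days since 30 Dec 1899), so rounding to the 5th decimal place aligns values to roughly the nearest second. A minimal sketch in Access SQL, using a hypothetical field name DateTimeStamp:

SELECT CDate(Round(CDbl([DateTimeStamp]), 5)) AS RoundedTime
FROM Table1;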

My best stab at how to tackle this would be to use an update query that, for each value in Table2, writes the nearest date time match from Table1 to another field (call it field1) in the same record in Table2. Then I would use field1 of Table2 to join to Table1, rather than the original date time in Table2, which has nearly a 0% chance of an exact match in Table1. The problem with my idea is that I do not know how to take a date time from one table, look for the nearest match in another, and write that date time back to the table.

Perhaps this is better accomplished using VBA? Maybe there is a better solution than I can visualize? Maybe I am just lacking the simple skill of finding the nearest times?

I sincerely appreciate your assistance
 
Try this to get the matching values between Table1 and Table2:

query1
SELECT Table2.dtefldb, Table1.dteflda, (Abs([dtefldB]-[dtefldA])) AS Expr1, Table2.txtsite
FROM Table1, Table2;

query2
SELECT Query1.dtefldb, Min(Query1.Expr1) AS MinOfExpr1, Query1.txtsite
FROM Query1
GROUP BY Query1.dtefldb, Query1.txtsite;

query3
SELECT Query2.txtsite, Query2.dtefldb, Query1.dteflda
FROM Query2 INNER JOIN Query1 ON (Query2.dtefldb = Query1.dtefldb) AND (Query2.MinOfExpr1 = Query1.Expr1) AND (Query2.txtsite = Query1.txtsite);

Forgot to say that you cannot use this to do an update; you would need to create a table. But you can just join the tables and Query3:

SELECT Query3.txtsite, Table2.dtefldb, Table1.dteflda, Table1.txtflda
FROM (Query3 INNER JOIN Table2 ON (Query3.dtefldb = Table2.dtefldb) AND (Query3.txtsite = Table2.txtsite)) INNER JOIN Table1 ON Query3.dteflda = Table1.dteflda;
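If you did decide to go the stored-table route that Brian mentions (and, further down the thread, advises against on normalisation grounds), a make-table query is one minimal sketch; the table name tblMatched is hypothetical:

SELECT Query3.txtsite, Query3.dtefldb, Query3.dteflda INTO tblMatched
FROM Query3;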


Brian
 
Awesome! Thanks for such a quick reply!!!
 
Excellent. Thanks a million. It will be interesting to see how long this takes once Table1 has years and years of data - millions or even tens of millions of records. Any tips for optimizing this once it gets huge?
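One possible way to keep Query1's Cartesian product manageable as Table1 grows might be to only consider readings within, say, a day of each visit before taking the minimum. The one-day window below is an assumption; it only works if a reading always exists within that window of every visit:

SELECT Table2.dtefldb, Table1.dteflda, Abs([dtefldB]-[dtefldA]) AS Expr1, Table2.txtsite
FROM Table1, Table2
WHERE Abs([dtefldB]-[dtefldA]) < 1;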
 
No, I haven't. I would add that storing the data as you had planned would not comply with the rules of normalisation, and might lead to inaccuracies if data is added late; that is why I resisted the idea of creating a table.

Will you be able to remove data that is not required?

Hopefully as you gain experience you may come up with an operational solution.

Brian
 
I think my structure is correct. Table1 contains a time series of stream heights collected by stage recorders at various stream sites. I use a combo ID of the SiteID and DateTime to uniquely identify each record - there can never be 2 simultaneous stages at the same site. Table2 is the header information from a site visit - what site was visited and what time.
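For illustration only, that kind of composite key could be declared in Access DDL roughly as below; the field names (SiteID, ReadingTime, StageHeight) are placeholders rather than the actual ones:

CREATE TABLE Table1 (
    SiteID TEXT(50) NOT NULL,
    ReadingTime DATETIME NOT NULL,
    StageHeight DOUBLE,
    CONSTRAINT pkSiteTime PRIMARY KEY (SiteID, ReadingTime)
);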

The resulting queries serve the purpose of linking the field observations with what the sensors record.

Could you elaborate where I have gone astray with my structure?
 
I forgot to mention that I need to preserve all the data in Table1 because it is used for multiple things - not just the scenario I mention.
 
You do bring up an excellent point though, now that I think about it more. Perhaps I can implement an operational solution such as deleting records from Table1 once all necessary operations have been performed on it, everything has been QC'd, and the data is more than 2 years old or something like that. If I don't do this, 10 years from now that query may take an incredibly long time to complete.
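If that kind of archival purge is ever implemented, a delete query along these lines is a minimal sketch, assuming the timestamp field is named dteflda as in the air code and that anything older has already been copied to an archive:

DELETE FROM Table1
WHERE [dteflda] < DateAdd("yyyy", -2, Date());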

I really appreciate your insight on this issue it has been very valuable.

Best Wishes,
-Mark
 
Hi
I wasn't suggesting that your current table structure was not correct, only that storing a calculated field can lead to problems of data integrity; sorry that was not clear.

More importantly, I overlooked an important point in my air code: you have the site in both tables, therefore the first query should join on it.
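A minimal sketch of that corrected first query, assuming the site field is called txtsite in both tables as in the air code:

SELECT Table2.dtefldb, Table1.dteflda, Abs([dtefldB]-[dtefldA]) AS Expr1, Table2.txtsite
FROM Table1 INNER JOIN Table2 ON Table1.txtsite = Table2.txtsite;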

I created a rough example; see attached.

Brian
 


Brian,

I really appreciate you posting that example; thank you for your assistance, and for pointing out the issue of normalisation. I'm not sure what I can do to remedy this given the nature of the data; hopefully it doesn't come back to haunt me.

Cheers,
-Mark
 
