Identifying timeseries data gaps (1 Viewer)

Berlinomatt

New member
Local time
Today, 11:41
Joined
Aug 10, 2020
Messages
14
Hi once again,

I had posted this in another thread but coupled it with another question but I think its better to separate them out as both seem quite complex (at least for me).

I have a very large dataset in the form of timeseries data for different weather station locations. Unfortnately there are often gaps due to technical issues with the station. The weather stations record every 5 minutes and what I'm hoping to do is identify the start and stop point of every gap that exceeds a 30 minutes time period. At the moment every field where there was any gap is populated with -9999. If the gap is less than 30 mins we will populate the field with zeros instead of -9999. The main priority however is to identify the start and stop point of these gaps as they need to be excluded from the analysis. Here an example of the data:
Trigger.JPG


Many thanks once again!!!
 

neuroman9999

Member
Local time
Today, 03:41
Joined
Aug 17, 2020
Messages
827
I'm more of a code writer than a query writer, however I'm wondering if you can do this the same way one would do like a record numbering query. in that process, the query statement, when transacted for every record, looks at the previous record to see what the record number is. It makes me wonder if you can do that here, in terms of looking at the stop/start point that is relevant in order to get what you need....
 

CJ_London

Super Moderator
Staff member
Local time
Today, 08:41
Joined
Feb 19, 2013
Messages
16,521
this can be done in a query as demonstrated in your previous post on this subject and in the crosspost.

You need to be clearer about what this means

is to identify the start and stop point of these gaps as they need to be excluded from the analysis.
do you want a query to identify these? - since solutions have already been provided in your previous post/crosspost it would appear not. If not, in what form do you require the identification?
 

arnelgp

..forever waiting... waiting for jellybean!
Local time
Today, 16:41
Joined
May 7, 2009
Messages
19,094
basicall what you need is a Cartesian query.
there are 5 tables on the sample.
the fictitious data is Table1 where we want to know which time is missing.
we can only Query for 1 day because it takes much time (as if access is not responding, when we take
more than a day).

qry_days are cartesian of tables h, d, m, y (hour table, day table, month table and year table).
qryTable1 is just an ordinary select query from Table1.

we use Left join qry_days against qryTable1.
see them all in design view.

the final query is qryFinal where you can identify which time does not exists from table1 (on date Oct 7, 2018, note i used "text").
 

Attachments

  • timeGap.zip
    45.6 KB · Views: 422

Berlinomatt

New member
Local time
Today, 11:41
Joined
Aug 10, 2020
Messages
14
Thank you very much for your detailed answers, especially @arnelgp

I think maybe I did not explain myself clearly enough so apologies for that.

When there is a gap in the recording of data the gap does not appear in the timeseries but only as a '-9999' in the parameter column. Initially the gap appeared in the timeseries but for ease of analysis these were populated with accurate timestamps but then '-9999' as a parameter value. Therefore what I need the query to be able to do is identfy a succession of '-9999' entries in consective 5 minute timestamps that continues for a period longer than 30 minutes in a given column (here pcp_mm).

Apologies for not making that clearer earlier

1599739855225.png
 

MajP

You've got your good things, and you've got mine.
Local time
Today, 04:41
Joined
May 21, 2018
Messages
8,439
can you post a table? It can be in excel or a database? At this point I think it may be easier to simply read the data and then write to a table the IDs of the records in the time gap. Then you can query based on that. If you can add a field "Gap" and make it boolean then you could write to the table fields in the the gap. This would be a little faster and the query faster.
 

arnelgp

..forever waiting... waiting for jellybean!
Local time
Today, 16:41
Joined
May 7, 2009
Messages
19,094
see Query1 esp the column, Expr1.
it is counting how many gap there is.
therefore you only filter it (> 6, for 30 or more gaps).
 

Attachments

  • timeGap2.zip
    32.2 KB · Views: 416

Berlinomatt

New member
Local time
Today, 11:41
Joined
Aug 10, 2020
Messages
14
If I initially run this query I get an error stating 'Duplicate output alias Expr1'. I if modify the code so that the second Expr1 is Expr7 I get errors in the database
Code:
SELECT min_weather.[station name] AS Expr1, min_weather.date AS Expr2, min_weather.year AS Expr3, min_weather.month AS Expr4, min_weather.day AS Expr5, min_weather.pcp_mm AS Expr6, Switch([pcp_mm]<>-9999,0,True,DCount("1","min_weather","[station name]='" & [station name] & "' and [year]=" & [year] & " and [month]=" & [month] & " and [day]=" & [day] & " and [pcp_mm]=-9999")) AS Expr1
FROM min_weather;
 

Berlinomatt

New member
Local time
Today, 11:41
Joined
Aug 10, 2020
Messages
14
Here a small subset...
 

Attachments

  • Database7.accdb
    576 KB · Views: 429

arnelgp

..forever waiting... waiting for jellybean!
Local time
Today, 16:41
Joined
May 7, 2009
Messages
19,094
Query1 will count the Gaps.
on Query2 it will filter Query1 where the Gap Count > 6 (more than 30 min).
 

Attachments

  • Database7.zip
    53.3 KB · Views: 438

Users who are viewing this thread

Top Bottom