External Data Parsing

The one adjustment I'd probably want to make to your code is to store the racedata.txt file location in a table field, viewed and accessed via a form field called '[FileLocation]'. This is just in case the file location changes. Are you able to advise the coding change to achieve this?

Simple. In the code, just change this line;

strFileName = "C:\YourFolder\racedata.txt"

to;

strFileName = Me![FileLocation]

assuming that [FileLocation] is a field in the record source of the form that the code is attached to. You may have to add quote delimiters if the path in the table field does not include the quotes.
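If you want to guard against an empty or invalid path, a slightly more defensive version of that line might look like this sketch (the Nz fallback and the message text are my own additions, not part of the original code):

```vba
Dim strFileName As String

' Read the path from the bound [FileLocation] field,
' substituting an empty string if the field is Null
strFileName = Nz(Me![FileLocation], "")

' Check the path before trying to open the file, so a bad
' entry gives a message rather than a runtime error
If Len(strFileName) = 0 Or Len(Dir(strFileName)) = 0 Then
    MsgBox "The file path in [FileLocation] is missing or invalid."
    Exit Sub
End If
```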

Glad it worked for you.
 
OK thanks again Sean!

By the way, I've often seen the Me! expression used ... does this mean the current form?
 

Yes. It's a keyword that can be used if you're writing code in a form module. So instead of;

Forms!YourFormName!YourControl

you can simply write;

Me!YourControl
 
ok Sean ... that works great too ... !

One (last) question ... I need the query to delete any existing contents of tblTEMP (or perhaps even delete the entire table?) else it seems if the command button I've created is clicked twice or more the existing records get edited/prefixed with additional code somehow.
 

Very true. If you loop through the same set of records again the results are going to be garbage because the records have already been modified once. The general idea for the process would be;

1) Import the text file into tblTemp.
2) Loop through the records and add the prefix to each row.
3) Run your other queries or code to parse the data out to other tables.
4) Delete all the records from tbltemp
5) Import a new text file and start the process over

For step 4 you can simply use code like the following at the appropriate step within the process;

Code:
Dim strSQL As String
 
strSQL = "Delete * From tblTemp;"
 
CurrentDb.Execute strSQL, dbFailOnError
 
On the other hand, it just occurred to me that you might want to import multiple text files into the table before running the loop. In that case, divide the code into two different procedures (run, for example, by two different command buttons). One procedure would be for the TransferText method, which you could run as many times as you want with different text files, and it would just add more records to the table each time. The other procedure would run the loop after you've transferred all the necessary text files. You would still want to delete all the records from tblTemp after the data is parsed out to other tables.
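That split might look something like the following sketch (the import specification name "ImportSpec" and the button names are placeholders, not from the original code):

```vba
' Procedure 1: run once per text file - each run just appends
' more rows to tblTemp
Private Sub cmdImport_Click()
    DoCmd.TransferText acImportDelim, "ImportSpec", "tblTemp", Me![FileLocation]
End Sub

' Procedure 2: run once after all the files are in
Private Sub cmdProcess_Click()
    ' ...loop through tblTemp and add the prefix (existing code)...
    ' ...run the parsing queries to move the data to the other tables...

    ' Finally, empty the temp table ready for the next batch
    CurrentDb.Execute "Delete * From tblTemp;", dbFailOnError
End Sub
```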
 
Opening a recordset against a temporary table is not a reliable solution.

It is imperative to realise that the concept of "Next Record" has no useful meaning in a table because records have no intrinsic order. Only by ordering in a query according to some value in a field do the records have a defined sequence.

While records *may* appear in the recordset in the order they were added to the temporary table this cannot be relied upon and to operate in this way is outside of the scope of the nature and definition of a table.

I would definitely avoid the import and read the text directly line by line parsing the data and writing into tables.
 
G is right

if there is nothing in the file by which you can uniquely identify a row, and thereby reconstruct the initial order, the only solution is to read and process the file a line at a time



------
[added later]

Furthermore, one suggestion above was to change the data. Instead, add another column to the table, and populate that with whatever you were going to prefix the data with. Then your process becomes "safe" and re-runnable.
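As a rough illustration of that idea (assuming a text column named LinePrefix has already been added to tblTemp; the variables below are placeholders for however you identify the rows belonging to one header):

```vba
Dim strSQL As String
Dim strPrefix As String
Dim lngFirstRow As Long, lngLastRow As Long

' Write the prefix into its own column instead of altering LineText,
' so re-running the process never corrupts the imported data
strSQL = "UPDATE tblTemp SET LinePrefix = '" & strPrefix & "' " & _
         "WHERE ID BETWEEN " & lngFirstRow & " AND " & lngLastRow & ";"
CurrentDb.Execute strSQL, dbFailOnError
```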
 
Code:
Private Sub Oops(Cancel As Integer)

Dim strSQL As String
strSQL = "Delete Egg From Face"
CurrentDb.Execute strSQL

End Sub

:)

Galaxiom and Dave make a good point that I had not properly considered when I posted that method. They are right in that it would not be fully reliable. It would probably be a good idea to look further at their suggestions.
 
For some reason I didn't notice that this was the same thread I posted on this morning until after I posted. Oh well. Now that I have written it, it might as well stand. It cannot be said too many times.

(I recall a thread recently where we told the same person over and over again and they still wondered why the table came out in a random order.)


There is one huge vulnerability in the code supplied by Beetle.
It imports the text to a temporary table and then creates a recordset on that table.

Be aware that tables are unordered lists of data. They are not spreadsheets. Consequently records in a table have no intrinsic sequence so you should not rely on those records appearing in the recordset in the same order as they appeared in the original text file.

Although the records often do import in the original order, this behaviour is not within the specification of a table and as such it is only luck that maintains the original sequence.

It is quite likely that on subsequent imports the records will become disordered because Access often fills the deleted records of an existing table somewhat haphazardly.

You might then be tempted to create a fresh table each time but it is important to realise that you would still be working outside the specification of a table.

If the order of records matters it is imperative that they carry some field that can be used to recreate the order. This must be added before or during the import, which is why reading a text file line by line is often used to import data rather than a simple TransferText.

When opening a recordset, if the order of the records matters it is imperative that the recordset be built on an ordered query rather than directly on a table.
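In practice that means opening the recordset on a query with an explicit ORDER BY rather than on the bare table, e.g. (a sketch assuming a LineNumber field was written during the import):

```vba
Dim rs As DAO.Recordset

' Order explicitly - never rely on the physical order of the table
Set rs = CurrentDb.OpenRecordset( _
    "SELECT * FROM tblTemp ORDER BY LineNumber;")
```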
 
On the other hand, it just occurred to me that you might want to import multiple text files into the table before running the loop. In that case, divide the code into two different procedures (run by , for example, two different command buttons). One procedure would be for the TransferText method, which you could run as many times as you want with different text files and it would just add more records to the table each time. The other procedure would run the loop after you've transferred all the necessary text files. You would still want to delete all the records from tblTemp after the data is parsed out to other tables.

Multiple files not envisaged ...
 

Food for thought ...

Just to clarify from my perspective that the ORDER that the records appear in the freshly-populated temp table is NOT in any way important ... what's important is that the correct association/prefix is made between each header record and all associated sub-records. The reason the order isn't important is because I have a bank of queries that subsequently analyse and parse the sub-records, and two of these fields are DATE of arrival and TIME of arrival, so I can then correctly sort the sub-records using these fields.

Also, as far as I am aware the racedata.txt file will always produce its output in 'groups', i.e. Header followed by all associated sub-records, then any new Header and its associated sub-records, etc. So I think Sean's code is safe on this count; perhaps you'd all be kind enough to confirm.

I think it was Dave who mentioned that putting the 'prefix' into a separate table field might be a good idea, and I kinda like that idea if the code can be modified in such a fashion?
 
Paul

the order IS important. The bird-detail lines belong to the preceding race-header line.

The question is, if the order of these rows was randomised, would you be able to then pick any bird-detail row and know just from the data in that row, which race it belonged to.

I think not, and that is the reason why you need to import the data a row at a time, to maintain that order.

What we are saying is that if you could not reconstruct the original correspondence of headers to details directly from the data, without making any assumptions about the order of the data, then it is potentially unreliable. Therefore, in order to do that, the bird-detail rows MUST include details about the race-header row.

If you read in the data a row at a time, you can add this linking information at that time. That is the point.


There is another reason you cannot use an import spec - and that is because the header and detail rows have different formats. If the data content is not the same, there is no way you can AUTOMATICALLY "slice" up each row, using the same template. Ideally every row should have exactly the same layout.

One other solution that might help a little is to have only one race per download.

But I think you should be looking at reading the data in a row at a time.
 
Unfortunately I cannot control the number of races written into the racedata.txt file ... this is the hardware manufacturer's choice/control.

It's also frustrating that the racedata.txt file has no common/linking text in both the header and sub-records ... again this is the hardware manufacturer's choice/control.

I pretty much get what you're saying insofar as it's safer to read the data record by record, appending each record into a table, thereby making sure that 1) the original ordering of the data is preserved and 2) sub-records are correctly matched to headers.

What would the coding adjustment look like?
 
I don't know what code you have at present - the logic of the suggested coding would be as follows.

You need routines to slice a race-header and bird-detail line into the different fields, and use SQL insert statements to add these to your data tables.


Code:
open file
do
    read a line

    determine what sort of row it is

    if race-header then
        save header details
        put race details in race-table
    end if

    if bird-details then
        using last header details put bird details into bird-race-table
    end if

loop until end of file
close file
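One possible VBA rendering of that outline is sketched below. The header test, the field positions and the table names tblRace and tblBird are all placeholders for your actual layout, and IsHeaderLine is a hypothetical function you would write yourself:

```vba
Dim strLine As String
Dim strHeaderKey As String

Open "C:\YourFolder\racedata.txt" For Input As #1
Do While Not EOF(1)
    Line Input #1, strLine

    If IsHeaderLine(strLine) Then
        ' Remember which race the following detail lines belong to
        strHeaderKey = Mid(strLine, 56, 2)
        CurrentDb.Execute "INSERT INTO tblRace (RaceCode, RawLine) VALUES ('" & _
            strHeaderKey & "', '" & Replace(strLine, "'", "''") & "');", dbFailOnError
    Else
        ' Detail line: store it together with the last header's key
        CurrentDb.Execute "INSERT INTO tblBird (RaceCode, RawLine) VALUES ('" & _
            strHeaderKey & "', '" & Replace(strLine, "'", "''") & "');", dbFailOnError
    End If
Loop
Close #1
```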
 
Post #15 shows Sean's code that I'm using at present, and which seems to work ... I tried to copy it into this posting without success.
 
Sean

Just now analysing the output in tblTEMP ... I actually need to preserve any and all leading spaces

e.g.

T)GB10 N099519HC927209400J13103612050000000000000000000000000000000000000000

... needs to include the leading spaces (two in the particular instance)
T) GB10 N099519HC927209400J13103612050000000000000000000000000000000000000000

There's a very good reason for this ... how should your code be adapted to keep the full text strings with any and all leading and intermediate spaces?




added) actually I think the leading spaces are lost when tblTEMP is first created? I can work around this ...
 
OK, I've taken another stab at this. This method reads through the text file line by line and adds a new record to tblTemp each time, including a LineNumber field and a separate field for the prefix. I'm not sure if you will need the LineNumber field or not but I included it anyway as it could be used for sorting. So in my test, tblTemp now includes the following fields;

LineNumber (Long Integer Number field)
LinePrefix (Text field)
LineText (Memo field)

Even though I am using a Temp table in this example, you may be able to expand upon the code to read each line of the text file and insert the appropriate data from each line directly into your other tables if you want.

Also, to get it to work right I ended up opening the file just to determine the value for the Header from the first line, then close the file, then re-open it and loop through. This could probably be improved upon as well, but I haven't used this method in the past so I'm winging it a little bit. If there are any fatal flaws in this method hopefully G or Dave will point them out.

The resulting data set looks like the attached image. It also preserves the leading spaces.

Anyway, here is the code;
Code:
Dim intPos As Integer
Dim lngCount As Long
Dim rs As DAO.Recordset
Dim strLineData As String
Dim strFileName As String
Dim strHeader As String
Dim strPrefix As String
Dim strDelete As String
 
strFileName = "C:\YourFolder\racedata.txt"
 
'Delete any existing records from the Temp table
strDelete = "Delete * From tblTemp;"
CurrentDb.Execute strDelete, dbFailOnError
 
'Set the record counter to 0
lngCount = 0
 
'Open the recordset
Set rs = CurrentDb.OpenRecordset("tblTemp")
 
'Read the text file to determine the value for the Header
'from the first line then close the file
Open strFileName For Input As #1
Line Input #1, strLineData
strHeader = Left(strLineData, 19)
Close #1
 
'Re-open the file and read each line, inserting new records in the table as we go
Open strFileName For Input As #1
Do While Not EOF(1)
    Line Input #1, strLineData
    lngCount = lngCount + 1
    If Left(strLineData, 19) = strHeader Then
        If Mid(strLineData, 56, 1) <> "0" Then
            intPos = 56
        Else
            intPos = 57
        End If
        strPrefix = Mid(strLineData, intPos, 1) & ")"
    End If
    With rs
        .AddNew
        !LineNumber = lngCount
        !LinePrefix = strPrefix
        !LineText = strLineData
        .Update
    End With
Loop
 
'Close the file and the recordset
Close #1
rs.Close

Maybe this method will be more reliable.
 

Attachments

  • tblTemp.jpg (101.3 KB)
Thanks Sean ... I'll give this a roadtest as soon as possible and give feedback!
 
Sean ... that too works well!


However I have consulted the manufacturer's data protocol document once more and my definition of what constitutes a Header record needs tweaking slightly, as follows:
  1. If character 56 = "T" this denotes a Training flight header record. Where this arises all subsequent characters will be "*" (including character 58). We can use character 56 to define the prefix for all associated sub-records.
  2. If character 58 = " " (i.e. blank) this denotes a Race flight header record and we can use characters 56 & 57 to define the prefix for all associated sub-records
    • Previous rule was to look for a numeric value in character 57 but I've now discovered that the sub-records can populate a "1" into this character (my example just had "0") so this now effectively invalidates our current rule.
  3. On next blank in character 58 this indicates an additional Race flight header record and we can again use characters 56 & 57 to define the prefix for all associated sub-records
  4. Treat sub-records for each header set in just the same way as before
  5. Now that we are putting the prefix into a separate field I don't think we now need the ")" which was only there as a data separator.
If we can tweak as per above I really think the job's a goodun! :)
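If I've read those rules correctly, the header test inside Sean's loop might be adjusted along these lines (a sketch only; the character positions 56-58 are taken straight from the rules above):

```vba
Dim strC56 As String, strC58 As String

strC56 = Mid(strLineData, 56, 1)
strC58 = Mid(strLineData, 58, 1)

If strC56 = "T" Then
    ' Rule 1: Training flight header - prefix from character 56 only
    strPrefix = strC56
ElseIf strC58 = " " Then
    ' Rules 2 & 3: Race flight header - prefix from characters 56 and 57,
    ' with no trailing ")" now that the prefix lives in its own field
    strPrefix = Mid(strLineData, 56, 2)
End If
' Any other line is a sub-record and keeps the last strPrefix
```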
 
