Validate headers of source data

  • Thread starter Thread starter Deleted member 125916
  • Start date Start date
D

Deleted member 125916

Guest
Hi all,

I'm working with several different source datasets. Mostly this are TXT files.

When I import the data in a new table with a specification, I want to validate if the dataset is the correct dataset be validating the headers in the source data.

I have created a table with the headers, that consists of 92 columns. Below 3 example columns of the header data:

tbl_Tickets_Headers
Field1 Field2 Field3
StagnationId TicketTypeId SequenceNr

Now I want to check during the import of the imported data has a line in it that is equal to the data in the table headers. If not, the import stops and informs the user that the data set is not correct. This to prevent that incorrect data will be inserted into the databse.

Anyone has a tip of how this can be done by VBA or maybe just with queries?


Some extra information:
  • The data is import into the database via VBA DoCmd.TransferText into a new created table, using an inport specification.
  • If the data is okay, it will be appeneded to an existing table that is being used to process the data furter.
  • After that the import table is being deleted.
Thanks in advance!
 
Last edited by a moderator:
I would "simply" open the file using the OPEN command or a File system object and (double) check the first line for having the proper column headers.
 
I have created a table with the headers, that consists of 92 columns. Below 3 example columns of the header data:

I don't know if I completely follow what your table holds. You said 92 columns but then only showed 1 column of data and said those were the columns. So the sample data you listed are column names in this table? Or are they values in your table? Is each file header the name of a field in this table or a value in this table?

Answer this: how many different file headers are valid? Post an example of a valid file header.
 
Allright, let me be clear on that ;)

The TXT file is imported in a new table. The headers of the source data are within the data, so the headers of the new table are like Field1, Field2, .... , Field92

I have created a seperate table with all valid headers as data. The headers of this table are also like Field1, Field2, .... , Field92. This table is in the attached database.
 

Attachments

Whenever I've done this I've always done as namliam suggests

I would "simply" open the file using the OPEN command or a File system object and (double) check the first line for having the proper column headers.

Once you've done that you can link it to your table.

edit:
Given that you also want to do this:
Now I want to check during the import of the imported data has a line in it that is equal to the data in the table headers. If not, the import stops and informs the user that the data set is not correct. This to prevent that incorrect data will be inserted into the databse.

In that case I've written VBA to read in the whole file and import it into the table rather than just create a linked table so I can sanity check each parameter as it comes in
 
I'm not familiar with this Open command (yet). I am mostly have been using simple tables/query databases. But working my way now in smarter handling of data and error handling. This data validation is the 1st step in that direction.

So a simple example would be much apreciated. I will google it also in the meantime.

Thanks in advance.
 
I would "simply" open the file using the OPEN command or a File system object and (double) check the first line for having the proper column headers.

have you tried looking into this?
 
Just to help:

dim IFN as integer
iFN = FreeFile
Open "C:\myfile.txt" For Input As #iFN

will open myfile

Then

dim msLine as string
Line Input #iFN, msLine

will read a whole line into msLine

Then finally you can use something like this to read each value out at a time where the delimiter is ';' and mlPos and msLine are module level variables

Code:
Function GetValue() As Variant
Dim str As String
Dim lPos2 As Long
  If mlPos <= Len(msLine) Then
    lPos2 = InStr(mlPos, msLine, ";")
    If lPos2 > 0 Then
      str = Mid$(msLine, mlPos, lPos2 - mlPos)
      mlPos = lPos2 + 1
    Else 'Read to the end of the line
      str = Right$(msLine, Len(msLine) - mlPos + 1)
      mlPos = Len(msLine) + 1
    End If
  Else
    str = ""
  End If
  If str = "" Then
    GetValue = Null
  Else
    GetValue = str
  End If
End Function
 
Thanks a lot RichP & namliam

This will certainly help me. As soon if I fully understand it and have it working I will post the way I have done it and mark this thread as solved.

Have a great weekend!
 
Hi all,

I been busy with it. But encountered the following problem with checking the file before importing:
Sometimes the headers in the source data are not on the first line. This is due the system (SAP) of our customer we work with. Several blank records before the actual hearder line and data is starting.
Before checking if the headers are okay, I import the data into a table that is created at the import step. The import is done with the following VBA command:

1. Import data into table TESTFILE
Code:
DoCmd.TransferText acImportDelim, "SPEC_0001", "TESTFILE", "c:\path\testfile.txt", False, ""

Next step is to delete the empty records, this is done with the following command:

2. Delete empty records from table TESTFILE
Code:
DoCmd.RunSQL "DELETE TESTFILE.*, TESTFILE.field1 FROM TESTFILE WHERE ((TESTFILE.field1) is Null)"

After this step the data is ready to be inserted in the table that can be used for furter data processing, this is done with the following command:

3. Append data from table TESTFILE to table TESTFILE_RAW
Code:
DoCmd.RunSQL "INSERT INTO TESTFILE_RAW SELECT TESTFILE.* FROM TESTFILE"

After this step the table TESTFILE that is used for import is deleted:

4. Delete table TESTFILE
Code:
DoCmd.DeleteObject acTable, "TESTFILE"

So my goal is to validate the data before step 3, when the data is actually going to be moved to the table with correct data.
As the headernames are in the data of table TESTFILE, I just need a check if a certain record is in the dataset. This record is the one with the headernames of the source file. If this record exists, it will be removed from the source data and step 3 can be executed. If this record is not found, the macro stops.

Thanks in advance!
 
Problem with your approach is that column headers are always text values, thus you need to import all the columns as text inorder to be able to validate them all.

Like I said the best way to do this is to check the file even before you even touch it, to prevent any risk of data-corruption.
Something along the lines of:
Code:
Dim I as integer, myString as string
Open "Yourfile.txt" for input as #1
Line input #1, myString
I = 1
Do while not eof(1) and I < 10 ' or however many lines you want to check
    if mystring = "headerline" then i = 999
    i = i + 1
    Line input #1, myString
Loop
if I = 999 then 
    'do import here
else 
    ' error no/wrong header found
endif
** Disclaimer ** above is aircode, untested and typed straight onto the forum it may need some tweaking
 
Hi namilam,

Thanks a lot for your patience with me.

I see that that the code you made is going into the right direction. That saves me indeed some programming after the import.

Now the next challange:
The part "headerline" of my biggest and daily datadump consist of 92 colums. The text of the header lines consists of 1327 characters. That is way more then the type string can hold. Any suggestion on that?
 
cant you identify the header with only the first 100 or something characters?
 
That could be indeed a solution, but then again, if someone removes a column or change the order of the export from the client system it will not see that. But for now the adjust to string size is okay. Thanks!
 
Hi namliam,

The check only 100 chars is not an option, as the script takes the whole first line from the the source data. And the type string can hold more then 255 chars in vba, upto 2 billion chars or so. Not sure. It is just how you define it in VBA.

To define the string that needs to be checked can be done as follows:

Code:
myHeader = "Somestring" _
& "Someotherstring" _
& "Anotherstring"

This can be done quite some times, but also has it's limitations.

Thanks a lot for all help! I consider this thread as solved!
 

Users who are viewing this thread

Back
Top Bottom