Getting close, but missing a step or two. Have a series of text files that need to be cleaned up and reformatted. The issue is two fold. Need to remove hard return wrapping so that each line is complete and then format to a table structure. I built code that imports fine if no wrapping. I tried this link and seemed to work, except in my implementation it is putting in delimiters, so that if a row of data has a comma, it treats that as a separate field rather than a single field. I thought that fixed width would mean that it ignores potential delimiter characters.
http://www.access-programmers.co.uk/forums/showthread.php?t=176573
The other problem is with the logic so that sometimes in is concatenating a new line to the old line. There may be exceptions due to typos and older files that used a tab rather than space, but seems in general like the indication of a wrap termination is that a new record will begin with either a number period and space or a letter (a, b, c, d only) a period and a space. The next part of building the table is tricky because the question needs to be repeated for each response as shown below in phase 2.
Example from plain text file (xxx.txt):
1. This is line one sentence
a. response a
b. response b
c. response c
d. response d
2. This is line two sentence
and it wraps
a. response a
b. response b
c. response c is
also wrapping
d. response d
----------------------------------
Phase 1 result should be:
-----------------------------------
1. This is line one sentence
a. response a
b. response b
c. response c
d. response d
2. This is line two sentence and it wraps
a. response a
b. response b
c. response c is also wrapping
d. response d
----------------------------------
Phase 2 result should be an access table with several cols:
-----------------------------------
Here is the code that I'm trying to use for unwrapping.
For reference, here is code that works fine for files that are properly formatted to begin with (1 row per question/response and no wrapping), in case I need to have a two step process, but perhaps the above code would be able to do the formatting this does as well.
http://www.access-programmers.co.uk/forums/showthread.php?t=176573
The other problem is with the logic so that sometimes in is concatenating a new line to the old line. There may be exceptions due to typos and older files that used a tab rather than space, but seems in general like the indication of a wrap termination is that a new record will begin with either a number period and space or a letter (a, b, c, d only) a period and a space. The next part of building the table is tricky because the question needs to be repeated for each response as shown below in phase 2.
Example from plain text file (xxx.txt):
1. This is line one sentence
a. response a
b. response b
c. response c
d. response d
2. This is line two sentence
and it wraps
a. response a
b. response b
c. response c is
also wrapping
d. response d
----------------------------------
Phase 1 result should be:
-----------------------------------
1. This is line one sentence
a. response a
b. response b
c. response c
d. response d
2. This is line two sentence and it wraps
a. response a
b. response b
c. response c is also wrapping
d. response d
----------------------------------
Phase 2 result should be an access table with several cols:
-----------------------------------
Code:
QNO Question Response Answer
1 This is line one sentence a response a
1 This is line one sentence b response b
1 This is line one sentence c response c
1 This is line one sentence d response d
2 This is line two sentence and it wraps a response a
2 This is line two sentence and it wraps b response b
2 This is line two sentence and it wraps c response c is also wrapping
2 This is line two sentence and it wraps d response d
Code:
Private Sub btnUnwrapText_Click()
'Create a new text file that has removed the wrapped text
'assumes that wrapping only occurs on second line. If
'wrapping exceeds two lines, code won't work. Haven't had
'a chance to see how to dynamically resolve this issue.
'20150520
'http://www.access-programmers.co.uk/forums/showthread.php?t=217125
Dim db As DAO.Database
Dim RS As DAO.Recordset
Dim stSQL As String
Dim stLine As String
Dim stUnWrapped As String
Dim fso As New FileSystemObject
Dim ts As TextStream
Dim stImport As String
Dim stTableName As String
stImport = selectFile("Text")
stTableName = Mid(stImport, InStrRev(stImport, "\") + 1)
stTableName = Mid(stTableName, 1, InStr(stTableName, ".") - 1)
stTableName = Replace(stTableName, "-", "_")
stSQL = "Select * FROM " & FileNameWithExt(stImport) & " "
Set db = OpenDatabase(FilePath(stImport), False, False, "Text; HDR=No")
Set RS = db.OpenRecordset(stSQL)
Set ts = fso.OpenTextFile(Replace(stImport, ".", "_Rev."), ForWriting, True)
Do Until RS.EOF
stLine = RS.Fields(0)
RS.MoveNext
If Not RS.EOF Then
stUnWrapped = RS.Fields(0)
End If
If Mid(stUnWrapped, 2, 1) <> "." Then
If Left(stUnWrapped, 1) <> Chr(42) Then
stLine = stLine & " " & stUnWrapped
Else
RS.MovePrevious
End If
Else
RS.MovePrevious
End If
ts.WriteLine stLine
RS.MoveNext
Loop
RS.Close
Set RS = Nothing
Set db = Nothing
End Sub
Code:
Sub btnImportExamText_Click()
'Read the text file containing the exam questions
'parse out the line and reformat to enable data to
'flow into the XMLExam table. Once in the table,
'can view the questions as well as export to XML format
'20150518
'sources:
'http://stackoverflow.com/questions/938796/read-lines-from-a-text-file-but-skip-the-first-two-lines
'http://www.wiseowl.co.uk/blog/s211/readline.htm
'again, we need this strange thing to exist so that ...
Dim RS As DAO.Recordset
Dim fso As New FileSystemObject
'the file we're going to read from
Dim ts As TextStream
'... we can open a text file with reference to it
Dim ThisLine As String
Dim i As Integer
Dim stImport As String
Dim stTableName As String
Dim stQNo As String
Dim intQNo As Integer
Dim intDelim As Integer
Dim stText1 As String
Dim stType As String
Dim stText2 As String
Dim stCorrect As Boolean
Dim dupCheck As Integer
Dim stSQL As String
Set RS = CurrentDb.OpenRecordset("XMLExam")
stImport = selectFile("Text")
stTableName = Mid(stImport, InStrRev(stImport, "\") + 1)
stTableName = Mid(stTableName, 1, InStr(stTableName, ".") - 1)
stTableName = Replace(stTableName, "-", "_")
'Set ts = fso.OpenTextFile("Z:\EXAM\COURSE\COURSE5D15.txt", ForReading)
Set ts = fso.OpenTextFile(stImport, ForReading)
'keep reading in lines till no more
i = 0
intQNo = 1
Do Until ts.AtEndOfStream
ThisLine = ts.ReadLine
i = i + 1
' Debug.Print "Line " & i, ThisLine
intDelim = InStr(ThisLine, ".")
stQNo = Left([ThisLine], intDelim)
If Left(stQNo, 1) >= 0 And Left(stQNo, 1) < 10 Then
stQuestionno = Left(stQNo, intDelim - 1)
stText1 = Mid(ThisLine, intDelim + 1)
'Trial and error determined that Readline goes to next line. Need to
'move to a next line if the current line begins with a number, because
'it is a question and the response answer (A,B,C,D) begins on the
'following line. Originally tried .skipline to skip a line, but that
'ended up going over the next line meaning would miss a row altogether.
ThisLine = ts.ReadLine
intDelim = InStr(ThisLine, ".")
stText2 = Mid(ThisLine, intDelim + 1)
Else
stQuestionno = stQuestionno
stText1 = stText1
End If
stType = Left([ThisLine], intDelim)
Select Case Left(stType, 1)
Case "A" To "Z"
stType = Left(stType, InStr(stType, ".") - 1)
stCorrect = False
Case Chr(42)
stType = Mid(stType, 2, 1)
stCorrect = True
Case Else
stType = ""
stCorrect = False
End Select
If Me.optAutoCalc = True Then
intQNo = intQNo
Else
intQNo = stQuestionno
End If
'avoid duplicates
dupCheck = DCount("ProdCode & Text1 & Type & SourceFormat", "XMLExam", "ProdCode = '" & stTableName & "' AND Text1 = '" & Trim(stText1) & "' AND [Type] = '" & stType & "' AND SourceFormat = 'Text'")
If dupCheck = 0 Then
RS.AddNew
RS!ProductCode = stTableName
RS!Category = "EXAM"
RS!displayQuestions = Me.txtDisplayQuestions
RS!passMinimum = Me.txtpassMinimum
RS!questionNumber = intQNo
RS!text1 = Trim(stText1)
RS!type = UCase(stType)
RS!Text2 = Trim(Mid(ThisLine, 4))
RS!correct = stCorrect
RS.Update
End If
'Increment Auto Question Number assuming D is last response of current question
If Me.optAutoCalc = True And stType = "D" Then
intQNo = intQNo + 1
End If
Loop
'close down the file
ts.Close
RS.Close
If DCount("ProdCode", "XMLExam", "ProdCode = '" & stTableName & "' AND SourceFormat Is Null") > 0 Then
If Me.optAutoCalc Then
stSQL = "UPDATE XMLExam SET XMLExam.displayQuestions = " & intQNo - 1 & " " & _
"WHERE XMLExam.ProdCode = '" & stTableName & "'" & " " & _
"AND XMLExam.SourceFormat Is Null"
CurrentDb.Execute stSQL
End If
If Me.optLastNumberInFile Then
stSQL = "UPDATE XMLExam SET XMLExam.displayQuestions = " & intQNo & " " & _
"WHERE XMLExam.ProdCode = '" & stTableName & "'" & " " & _
"AND XMLExam.SourceFormat Is Null"
CurrentDb.Execute stSQL
End If
stSQL = "UPDATE XMLExam SET XMLExam.SourceFormat = 'Text' " & _
"WHERE XMLExam.ProdCode = '" & stTableName & "'" & " " & _
"AND XMLExam.SourceFormat Is Null"
CurrentDb.Execute stSQL
End If
Me.Requery
End Sub