Solved How to transform this data? (1 Viewer)

Mike Krailo

Well-known member
Local time
Today, 00:47
Joined
Mar 28, 2020
Messages
1,331
I have a text file full of data that appears like the following example data. I'm looking for a way to transform that data into proper tabular data in Excel so that I can apply the resulting data as the source for mail merge labels. The end goal is to create proper hard drive labels.

Code:
Title = ms1 /c0/e38/s0
SN = WBM3D8MB0000C1163DKB
Manufacturer = SEAGATE
ModelNumber = ST2400MM0149   
RawSize = 2.182 TB

Title = ms1 /c0/e38/s1
SN = WBM3J6HK0000C1163DHA
Manufacturer = SEAGATE
ModelNumber = ST2400MM0149   
RawSize = 2.182 TB

Title = ms1 /c0/e38/s2
SN = WBM3BYZC0000C113A8DU
Manufacturer = SEAGATE
ModelNumber = ST2400MM0149   
RawSize = 2.182 TB

Title = ms1 /c0/e38/s3
SN = WBM3J6Z50000C116K5UL
Manufacturer = SEAGATE
ModelNumber = ST2400MM0149   
RawSize = 2.182 TB
 
Maybe RegEx could work here?
 
Paste the existing single column data into the spreadsheet.
Put the whole lot into an array, and process the single source array into another destination multidimensional array.

Dump the destination array into your spreadsheet final destination worksheet.

It will work very quickly and with such fixed layouts should be a doddle, using basic string manipulation vba functions.
 
OK, let me see if I can do that Minty. Going to have to learn how to do the array thing.
 
I would try and just read the file line by line.
Split and test the word before the = and populate respective column.
 
I figured out how to do the dynamic array in excel. That is an awesome function. I used the following simple function to get the actual file to convert to the required format. The rest was as Minty said, was easy to do.

Code:
=WRAPROWS(A1:A179,6)
 
Here is a working solution. It read the text file and produced this

MikeK.png
 

Attachments

For those interested
Code:
Private RS As DAO.Recordset
Public Function GetFile() As String
   ' Set up the File Dialog.
   Dim fdialog As FileDialog
   Set fdialog = Application.FileDialog(msoFileDialogFilePicker)
   With fdialog
      ' Allow user to make multiple selections in dialog box
      .AllowMultiSelect = False
      ' Set the title of the dialog box.
      .Title = "Please select a file"
      ' Clear out the current filters, and add your own.
      .Filters.Clear
      .Filters.Add "Text File", "*.txt"
      '.Filters.Add "All Files", "*.*"
      ' Show the dialog box. If the .Show method returns True, the
      ' user picked at least one file. If the .Show method returns
      ' False, the user clicked Cancel.
      If .Show = True Then
         'Loop through each file selected and add it to our list box.
         If fdialog.SelectedItems(1) <> vbNullString Then
           GetFile = fdialog.SelectedItems(1)
         End If
      Else
         MsgBox "You clicked Cancel in the file dialog box."
      End If
   End With
End Function


Public Sub ReadLineByLine(strFile As String)
' ***************************************************
' * Open a Text File And Loop Through It            *
' ***************************************************
  Dim IntFile As Integer
  Dim StrIn As String
 
  Dim TheTitle As String
  Dim TheSN As String
  Dim TheManufacturer As String
  Dim TheModelNumber As String
  Dim TheRawSize As String
  Dim TempArray() As String
 
  IntFile = FreeFile()
  Set RS = CurrentDb.OpenRecordset("tblEquip")
  Open strFile For Input As #IntFile
 
  Do While Not EOF(IntFile)
     'Move to title
     Do While Not EOF(IntFile)
       If InStr(StrIn, "Title = ") Then Exit Do
       Line Input #IntFile, StrIn
     Loop
    TheTitle = FindAfter(StrIn, "Title = ")
    Line Input #IntFile, StrIn
      
       TheSN = FindAfter(StrIn, "SN = ")
    Line Input #IntFile, StrIn
      TheManufacturer = FindAfter(StrIn, "Manufacturer = ")
    Line Input #IntFile, StrIn
      TheModelNumber = FindAfter(StrIn, "ModelNumber = ")
    Line Input #IntFile, StrIn
      TheRawSize = FindAfter(StrIn, "Rawsize = ")
    InsertEquipment TheTitle, TheSN, TheManufacturer, TheModelNumber, TheRawSize
  Loop
  Close #IntFile
  RS.Close
End Sub
Public Function FindAfter(ByVal SearchIn, ByVal SearchAfter) As String
  'This will find the text after a given search for text. The text is considered as a single item if no more than one space in the text
  'Do not know if this is any faster, probably not. Makes the code a little more reuseable
  'SEGMENT : CLASS7                                            Print Date 04-07-2018 12:40:37     Page        3 of 2000
  'If SearchAfter = "SEGMENT :" then it returns CLASS7, if "Print Date" it returns "04-07-2018 12:40:37", and "Page" returns "3 or 2000"
  SearchIn = CleanAndRemoveSpaces(SearchIn)
  FindAfter = Trim(Split(SearchIn, SearchAfter)(1))
  FindAfter = Trim(Split(FindAfter, "  ")(0))
End Function
Public Function FindBefore(ByVal SearchIn, SearchBefore) As String
  'This will find the text before. The text is considered as a single item if no more than one space in the text
  Dim aBefore() As String
  SearchIn = CleanAndRemoveSpaces(SearchIn)
  FindBefore = Trim(Split(SearchIn, SearchBefore)(0))
  aBefore = Split(FindBefore, "  ")
  'find last element to left
  FindBefore = Trim(aBefore(UBound(aBefore)))
End Function
Public Sub InsertEquipment(TheTitle As String, TheSN As String, TheManufacturer As String, TheModelNumber As String, TheRawSize As String)
  RS.AddNew
    RS!EquipTitle = TheTitle
    RS!EquipSn = TheSN
    RS!equipManufacturer = TheManufacturer
    RS!equipModelNumber = TheModelNumber
    RS!equipSize = TheRawSize
  RS.Update
  Exit Sub
errlbl:
  Debug.Print Err.Number & " " & Err.Description
  Resume Next
End Sub
 
I was on the right lines then?
However @MajP how do you avoid the blank line between the data?
I cannot see how that is done? :(

Edit: Never mind, just spotted it. :)
 
MajP, that was similar to what I had in mind if I was doing it in access, I just needed it for Excel though. But that is a nice Access way to do it. Very much appreciate posting that.
 
Why not doing it the Excel way?
Paste all in Excel worksheet. It should be in one column, let's say "A".
In column "B", to the right of the first occurrence of "title", use the Mid function to extract from the text to the left from the 9th character, use the length of the text for the third parameter by simply using the length function, the first formula should look like:
=MID(A1,9,LEN(A1))
In column "C", to the right of the formula you just added, enter something similar:
=MID(A2,6,LEN(A1))
I hope you see the pattern.

Once the 5 columns have formula (from B to F), select that portion and a little below, like:
1702577600715.png

Then grab the handle in the lower right corner and fill down. It'll look like this:
1702577734910.png


The rest should be just filtering removing empty cells.
1702577820247.png


Ok, then just paste into another sheet. To be sure, select your range and use the GoTo Special
1702578042523.png


Then, choose Formulas and paste in another sheet. With enough practice, this is done super fast. For example, in the english version of Excel, to go to the GoTo Special, I'd use Alt + hfdsf + enter. In the spanish version, I'd use Alt + ohuru + enter
 
Last edited:
MajP, that was similar to what I had in mind if I was doing it in access, I just needed it for Excel though. But that is a nice Access way to do it. Very much appreciate posting that
Almost none of that code is Access specific. You could run in Excel, and simply insert into a worksheet. Or run in access and export to Excel if that is an option.
 
Last edited:
The Excel method is pretty simple but it requires a strongly formatted source file. Access and VBA can overcome some of the lax formatting problems.
 
Same code run in Excel only real change is the insert to the sheet and removal of the recordset code.
Code:
Private Sub InsertEquipment(TheTitle As String, TheSN As String, TheManufacturer As String, TheModelNumber As String, TheRawSize As String)
  Dim startRow As Long
  Dim rng As Range
  startRow = Cells(Rows.Count, 1).End(xlUp).Row
  If Not IsEmpty(Range("A1")) Then startRow = startRow + 1
  Range(Cells(startRow, 1), Cells(startRow, 1)).Value = TheTitle
  Range(Cells(startRow, 2), Cells(startRow, 2)).Value = TheSN
  Range(Cells(startRow, 3), Cells(startRow, 3)).Value = TheManufacturer
  Range(Cells(startRow, 4), Cells(startRow, 4)).Value = TheModelNumber
  Range(Cells(startRow, 5), Cells(startRow, 5)).Value = TheRawSize
  Exit Sub
errlbl:
  Debug.Print Err.Number & " " & Err.Description
  Resume Next
End Sub
 

Attachments

The Excel method is pretty simple but it requires a strongly formatted source file. Access and VBA can overcome some of the lax formatting problems.
I've learned the hard way that spending 5~10 minutes formatting my data in Excel for further operation will not bring the joy and satisfaction of a VBA flexing, but it gives me more time to spend with friends and family (if one-off).
 
I would be willing to bet if the file is large the array method will be significantly quicker than directly reading/writing the cells in Excel.
I have done quite a lot of text data manoeuvring in Excel and the performance difference is at least 10 fold.
 
I've learned the hard way that spending 5~10 minutes formatting my data in Excel for further operation will not bring the joy and satisfaction of a VBA flexing, but it gives me more time to spend with friends and family (if one-off).
The problem is that it is users who actually import the data as part of the daily operation of an application and they typically have no control over the format of the files received so you expect them to format the new file every day?
 
Here is a far better version of the original code to help some in what @Pat Hartman stated
Code:
Public Sub ReadLineByLine(strFile As String)
' ***************************************************
' * Open a Text File And Loop Through It            *
' ***************************************************
  Dim IntFile As Integer
  Dim StrIn As String

  Dim TheTitle As String
  Dim TheSN As String
  Dim TheManufacturer As String
  Dim TheModelNumber As String
  Dim TheRawSize As String
  Dim searchFor As String
  Dim LastElementFound As Boolean

  IntFile = FreeFile()
  Set RS = CurrentDb.OpenRecordset("tblEquip")
  Open strFile For Input As #IntFile

  Do While Not EOF(IntFile)
   
    LastElementFound = False
    searchFor = "Title = "
    If IsFound(StrIn, searchFor) Then
      TheTitle = FindAfter(StrIn, searchFor)
      If Not EOF(IntFile) Then Line Input #IntFile, StrIn
     End If
    searchFor = "SN = "
    If IsFound(StrIn, searchFor) Then
      TheSN = FindAfter(StrIn, searchFor)
      If Not EOF(IntFile) Then Line Input #IntFile, StrIn
    End If
    searchFor = "Manufacturer = "
    If IsFound(StrIn, searchFor) Then
      TheManufacturer = FindAfter(StrIn, searchFor)
      If Not EOF(IntFile) Then Line Input #IntFile, StrIn
    End If
    searchFor = "ModelNumber = "
    If IsFound(StrIn, searchFor) Then
      TheModelNumber = FindAfter(StrIn, searchFor)
      If Not EOF(IntFile) Then Line Input #IntFile, StrIn
    End If
    searchFor = "RawSize = "
    If IsFound(StrIn, searchFor) Then
      TheRawSize = FindAfter(StrIn, searchFor)
      If Not EOF(IntFile) Then Line Input #IntFile, StrIn
      LastElementFound = True
    End If
    If LastElementFound Then InsertEquipment TheTitle, TheSN, TheManufacturer, TheModelNumber, TheRawSize
    If Not LastElementFound And Not EOF(IntFile) Then Line Input #IntFile, StrIn
  Loop
  Close #IntFile
  RS.Close
End Sub
Public Function IsFound(ByVal SearchIn, ByVal searchFor) As Boolean
  IsFound = InStr(SearchIn, searchFor)
End Function

It is not bullet proof but here is the pretty dirty file and it provided the correct results. The only requirement is that the leading identifiers are correct and all elements exist in some order.

Code:
1234 Title = ms1 /c0/e38/s0
Junk In between
Junk in between
abcd SN = WBM3D8MB0000C1163DKB
Junk In between
Junk in between
JUNK JUNK Manufacturer = SEAGATE  AND JUNK
Junk In between
Junk in between
12345 ModelNumber = ST2400MM0149  
Junk In between
Junk in between
vbnm RawSize = 2.182 TB

Junk In between
Junk in between
!@#$%
Title = ms1 /c0/e38/s1
SN = WBM3J6HK0000C1163DHA
Junk In between
Junk in between
Manufacturer = SEAGATE
ModelNumber = ST2400MM0149  
RawSize = 2.182 TB

Junk In Between
Junk In Between

Title = ms1 /c0/e38/s2
SN = WBM3BYZC0000C113A8DU
Manufacturer = SEAGATE
ModelNumber = ST2400MM0149  
Junk In between
Junk in between
RawSize = 2.182 TB

Title = ms1 /c0/e38/s3
SN = WBM3J6Z50000C116K5UL
Manufacturer = SEAGATE
ModelNumber = ST2400MM0149  
RawSize = 2.182 TB
 

Users who are viewing this thread

Back
Top Bottom