Parsing To A Character Type (1 Viewer)

mstorer

Registered User.
Local time
Today, 01:38
Joined
Nov 30, 2001
Messages
95
Hi Everyone -

I have a string field from which I need to extract a number. The field is not split into fixed length units though. For Example:

00150
002250ABC
0032500XYZ

I am only interested in the numeric portion in bold.

I can omit the first three numbers easily enough, but since the value I want varies in length and there is no specific delimiter, I don't know to to get the instr() function to go to the first non-numberic value. Perhaps I'm barking up the wrong tree with instr()?

Thanks for any assistance!

- Matt
 

boblarson

Smeghead
Local time
Yesterday, 17:38
Joined
Jan 12, 2001
Messages
32,059
How about using

Val(Mid([YourFieldNamehereInBrackets],4))
 

mstorer

Registered User.
Local time
Today, 01:38
Joined
Nov 30, 2001
Messages
95
Thank you Bob. A long day and not enough coffee can sometimes make a mountain out of a mole hill. For whatever reason, I was fearing I'd have to write a function that looped to the first character.... very inefficient with the size of the database. And I use the Val()function all the time!!!

Thanks again for the carity!

- Matt
 

deafmetal1

Senior Chief
Local time
Today, 09:38
Joined
May 2, 2008
Messages
30
Great thread, however I have a similar issue as follows:

I am parsing as well using INSTR.

The data string is formatted as follows:

P/-/9 or P/1/9

I want it to grab the string irregardless of the middle character in between the delimiters (slashes) and yet take into account there must be a number 0-9 after the second slash, so that I'd parse any of the below examples:

P/-/6
P/1/6
P/8/7

etc...

I can write a thousand lines of code to grab P/-/0 then look for P/1/0, then P/2/0, etc... but that is WAY inefficient, and like I said, don't care what the middle character is as long as it's looking for the number after the 2nd slash. Reason is there are other lines in the message that may have a P/ in it.

I tried Like and nested IF statements to no avail. Was thinking defining a string with the criteria, then using INSTR(strSTRING, strCRITERIA). That on a path to success? Thank you in advance!!

Rob
 

deafmetal1

Senior Chief
Local time
Today, 09:38
Joined
May 2, 2008
Messages
30
Bob,

Sorry to confuse you, I confuse myself some days!!

Let me try this; In the emails/messages I parse, I look for a line that looks like this:

P/-/####

Sometimes it will have a number in between, as such: P/#/####

There is more data in the lines, and they end with a //.

I have it figured out to find these lines based simply on starting with P/ and ending in //. But if there are any other accidental P/ in a different line that ends with a // it grabs it mistakenly.

Here's the existing code:
Code:
If InStr(strFileToParse, "P/") <> 0 Then
          strFileToParse = Trim(Right(strFileToParse, Len(strFileToParse) - InStr(strFileToParse, "P/") + 1))
            If InStr(strFileToParse, "//") <> 0 Then
            strLineToParse = CleanString(Left(strFileToParse, InStr(strFileToParse, "//") + 1))
            strP = strLineToParse
End If
Else
End If
It was grabbing extraneous stuff with just the P/, so I wanted to extend to the right a bit to ensure I get the right line.

That's where I get the P/*/# (asterich represents any singular char, will either be a dash or a number, the # is a number 0 thru 9)

That help clarify any further? Wish I could just use wildcards for characters and numbers in the InStr statement above.

R,
Rob
 

raskew

AWF VIP
Local time
Yesterday, 19:38
Joined
Jun 2, 2001
Messages
2,734
Doesn't sound as though you tried the function provided.

As written, it returns the value of the character(s) following the 2nd "/", without consideration to what might be between the slashes. Is that not what you are after?

Bob
 

gemma-the-husky

Super Moderator
Staff member
Local time
Today, 01:38
Joined
Sep 12, 2006
Messages
15,738
you could use the split function, then test each item in the resulting array with val until you get a value <> 0

this would mean that A\12\123XY would return 12 as the first value.
but A\AB\123XY would return 123
 

deafmetal1

Senior Chief
Local time
Today, 09:38
Joined
May 2, 2008
Messages
30
That would mean I'd have to understand it, and it confuses me, but am trying it now. Th eonly reason I didn't was because I didn't understand the syntax: ? fFindNthOccur(x, " ", 3)

Will try and report back findings, wasn't trying to be an ignorant requester of help, my skill level of understanding sometimes hampers my ability.

Sorry to matt for hijacking his original thread!!!
 

raskew

AWF VIP
Local time
Yesterday, 19:38
Joined
Jun 2, 2001
Messages
2,734
Hi -

The ...
x = "P/-/6"
? val(mid(x, fFindNthOccur(x, "/", 2)+1))
6

...is the method used to test the function from the immediate (debug window) (Ctrl + G)

The first line (x = "P/-/6") followed by a carriage return simplifies the "P/-/6" so you can just refer to 'x', rather than repeating the whole thing.

The next line tests the function. '?' represents print. The call, as shown above, returns the numeric value of the character(s) immediately following the second slash in x -- in this case it's 6. Were x = "P/8/62A5", the same call would return 62. The fFindNthOccur() function is actually returning the location of the second slash in the string, which in this case is 4. The Mid() function is then used to return a value starting with 4 + 1. Were you using just the Mid() call, it'd read: ? val(mid(x, 5))

Using the example you showed (? fFindNthOccur(x, " ", 3)), the function returns the location of the 3rd space in the string, e.g. if
x = "The quick brown fox", the above call would return 16

Hope that clarifies it somewhat.

Best Wishes - Bob
 
Last edited:

deafmetal1

Senior Chief
Local time
Today, 09:38
Joined
May 2, 2008
Messages
30
Bob,

That's an EXCELLENT explanation, and I truly appreciate you taking the time to do so!

I have fiddled with your code, it works great, I get the value after the 2nd Slash every time, but at the point that I'm using this code, I'm trying to get the parser to find the P/#/### line, not so much process the data after the 2nd slash yet. So although using the code against a line that starts with P/-/6, I get a resulting 6, I am having it grab the entire line based on the starting data (P/-/6), then later on I grab the internals of that line and store the "6" in a table.

Make sense?
 

raskew

AWF VIP
Local time
Yesterday, 19:38
Joined
Jun 2, 2001
Messages
2,734
Hi Chief -

Make sense?

Not entirely, I'm a tad slow.

Please provide a list of examples, showing in bold that numbers you're attempting to isolate. If some of the examples wouldn't be valid (should be overlooked), please include them anyway.

Are these strings coming from a field or what? Please clarify.

Best wishes - Bob (SGM, USA, Ret)
 

deafmetal1

Senior Chief
Local time
Today, 09:38
Joined
May 2, 2008
Messages
30
Bob,

Great to see another Senior Enlisted!

Maybe if I start from ground zero, it will help. I'm parsing messages out of Outlook. The parser stores the file to parse as an object, then I have a Public Sub that looks within each object for specific lines. Once it finds that specific line to parse, which ends in a "//", it then passes the line to parse to a line specific function which strips the specific data fields and appends them to applicable tables within the db.

The whole thing works fine if the line has a very unique set name, but some of the messages have lines that simply start with one letter. In the previous examples, a P. Below is an example of one unique line:

MSGID/MSG TYPE/SENDER/SERIAL #//

So in this case, I have the parser use InStr(strFileToParse, "MSGID/") and a delimiter function to find the "//" to grab the specific line. Then I pass that line to a function that will pull out the specific items after each individual "/". In this case I send the MSG TYPE, the SENDER and the SERIAL # to a table. After some reverse engineering, this is easy.

The hard part is single letter lines, such as S/, P/, E/ etc... In the above example line, the second field ends in E/. So if I was parsing with my InStr method above simply using "E/", it would grab the E/ and all after that to the "//", although from the wrong set, because it exists previous to the real E/ line I want later in the message.

So I started trying to find ways to grab more from the line I want to parse. So if the E/ lines always looks like this: E/BROWN/FOX/, I could simply adjust my InStr statement to include all of that. I don't have that option with my P line.

It is always like this: P/(either a "-" or a single number)/#####(4 - 5 digit number)/

For example: P/1/5555/-/-/-/-/-/-// (the other dashes here represent other fields of data that may be populated in this line, but are unimportant to discuss at the moment but get stripped from the line after parsing.

Either way I just want it to look for that routine, hopefully with some kind of wildcard routine, and capture that line so I can then strip the specific sets. Using P/ isn't working, so I have to move to the right a little bit to ensure I get the right line for post parsing data stripping.

That help Bob?

Thanks Again Shipmate!!
 

raskew

AWF VIP
Local time
Yesterday, 19:38
Joined
Jun 2, 2001
Messages
2,734
Hi -

Did I mention slow?

Would you provide a series of examples, with the appropriate items to extract shown in Bold.

What I'm trying to see is the logic of if you detect MSG ID, you do this, but if you detect P/ you do that.

Appreciate your explanations but a bunch of examples (or a zip file, if that's feasible), in the format that you'd process them, would be more self-explanatory.

Thanks, Bob
 

deafmetal1

Senior Chief
Local time
Today, 09:38
Joined
May 2, 2008
Messages
30
I have an example base DB attached. Included in the zipfile is also a pic of the VBA references, and 4 test emails. Reccomend creating an Outlook folder in your Inbox called MSG_PARSER or whatever. Not sure if you can just drag and drop the emails into it, though you should be able to. You'll be able to set the folder location in the parser itself.

The parser has 4 entries already in it's tables that you can empty and play with. They are from the 4 emails provided. You'll notice the P/ issue if you open up the MSG_Data table and look in the PNUM column. That illustrates the issue if I just use "P/" in my InStr method. Hence why I was trying to somehow incorporate more of the P line structure into my parsing of the line.

That any better?

Thanks a TON!! This is a 6+ month project by a VBA illiterate [me]. (unfortunately with this stripped example, you don't get to see all the stuff I've learned here from the TREMENDOUS help I've received. But you'll see some, like the Outlook folder option.)
 

Attachments

  • MSG PARSER.zip
    90 KB · Views: 114

raskew

AWF VIP
Local time
Yesterday, 19:38
Joined
Jun 2, 2001
Messages
2,734
Hi -

Outlook not being my strong-suit, I was unable to drag & drop the messages to folder MSG_Data, apparently since the messages are Read Only.

Having said that, I still don't see the problem. Given the examples in table Msg_Data, the function I provided will, in every case take the string in field pLine and correctly return the number shown in field pNum.

Code:
x = "P/-/9999/Test/Test/Test//"
? format(val(mid(x, fFindNthOccur(x, "/", 2)+1)), "0000")
9999

x = "P/-/1234/Test/Test/Test//"
? format(val(mid(x, fFindNthOccur(x, "/", 2)+1)), "0000")
1234

x = "P/4/0052/Test/Test/Test//"
? format(val(mid(x, fFindNthOccur(x, "/", 2)+1)), "0000")
0052

Since your 'numbers' are stored as text and may contain preceding zeros, I added the format() wrapper.

What am I missing? Please post an example string which may occur, where the above wouldn't work.

Bob
 
Last edited:

deafmetal1

Senior Chief
Local time
Today, 09:38
Joined
May 2, 2008
Messages
30
If that's the case, then how do I have the Public Sub solve for x? In your intermediate code, you're providing the PLine data. Whereas my problem is finding and grabbing the PLine first, I'll worry about delimiting and getting specific fields from the line later. I need to get the PLine every time first. In the 4th example email (Problem Child), that's the one that shows where an errant P/ ruins my method.

That or I'm missing how best to implement your code. There are more than that numerical field that I need to grab from the PLine, I'm just simplifying for the sake of trying to reliably grab that line within an object every time.
 

raskew

AWF VIP
Local time
Yesterday, 19:38
Joined
Jun 2, 2001
Messages
2,734
Take a look at this, copied from the 4 test messages:

Code:
Test 1
x = "MSGID/TEST-MESSAGE/ME-MYSELF-AND-I/001//P/-/9999/Test/Test/Test//"
? mid(x, instr(x, "//")+2)
P/-/9999/Test/Test/Test//

Test 2
x = "MSGID/TEST-MESSAGE/ME-MYSELF-AND-I/002//P/-/1234/Test/Test/Test//"
? mid(x, instr(x, "//")+2)
P/-/9999/Test/Test/Test//

Test 3
x = "MSGID/TEST-MESSAGE/ME-MYSELF-AND-I/003//P/4/0052/Test/Test/Test//"
? mid(x, instr(x, "//")+2)
P/4/0052/Test/Test/Test//

Test 4
x = "MSGID/TEST-MESSAGE/FROM-A-PIMP/023//P/-/1234/Test/Test/Test//"
? mid(x, instr(x, "//")+2)
P/-/1234/Test/Test/Test//

I don't see the "errant P" in Test 4.

Bob
 

deafmetal1

Senior Chief
Local time
Today, 09:38
Joined
May 2, 2008
Messages
30
Bob,

Open up the MSG_Data table, look under the PNUM column for Test #4 data. You'll find the Value of ERROR_FORMAT (because intNthDelimitorNumber:=3 doesn't exist due to grabbing the wrong line of data). Further, if you look at the actual PLINE it parsed (P/023//), it grabbed the wrong line due to the errant P/ in the originator field of test #4 (FROM-A-PIMP/). [Lame example, I know.]

The importance is all in my tables, I then export the data to Excel for Pivot Tabling. I understand the text format you mentioned above (appreciate the wrapper, but unnecessary), I just whipped together a sample db. In my live one, the import table is text, and the final data table is set to Single Number with 4 decimal places. There are some numbers I snag that I don't want to lose the zeros.

So if I read your code above correctly, you're counting the second instance ending in // to get the PLine? This may be doable, but for messages with more than one particular Line that may get repeated without a predictable sequence, that may not work so well, unless I come up with a flexible loop. Interesting.
 
Last edited:

raskew

AWF VIP
Local time
Yesterday, 19:38
Joined
Jun 2, 2001
Messages
2,734
How does (a verbatim copy of the Test 4 message):

MSGID/TEST-MESSAGE/FROM-A-PIMP/023//
P/-/1234/Test/Test/Test//

...turn into what is shown in the table? The message format is identical (with the exception that there is a '-', rather than an integer between the first two slashes) to Test 1 - 3 messages.

Is this your code doing this?

Bob
 

Users who are viewing this thread

Top Bottom