Odd-numbered appearances of text within a cell

TomH

Registered User.
Local time
Today, 17:12
Joined
Nov 3, 2008
Messages
111
HI all.

A friend of mine asked for some help on this and I haven't been able to figure it out. He extracted some text from a PDF file, pasting it into Excel, and needs to find certain pieces of information. Problem is, the text came out as a long string instead of being broken into cells.

So, what we need to do is find, within the text, the 1st, 3rd, 5th, 7th, 9th, 11th and 13th times the word "Principal" appears, and then report back the name that follows. Within the text, it would read something like

(1234 Principal John Doe)

The number in front of it will change or be in a different format, there are other parentheses in the text (varying number of times within the text), and the name changes, of course. We want to extract the name following the word "Principal"... but only the odd-numbered times they appear. Some cells will have as few as six "Principal" entries, others as many as fourteen.

I hope this is enough information. Any help is greatly appreciated.
 
Just use a nested find ? i.e.
=FIND("Principal", V11,FIND("Principal",V11)+1)

Should find the start of the second one...

Could also do something simular in VBA if you pref.
 
Thanks for the response.

That gets me to the first and second iterations of the word "Principal"... so would I have to have 14 nested FINDs? Can I even do that? Also, what I want reported back is the name that follows, and those names will vary in length.

Isn't there a way of asking for the position of the, for example, 5th time a text string appears in a cell without the nesting?
 
No there isnt, indeed the hard(er) part is going to be extracting the name.

Is the 7th principal always the last?

Like I said, if you prefer you can also do something with VBA.... along the lines of:
Code:
Sub Get7thPrincipalName()
    Dim i As Integer, y As Double
    Dim tst As String
    tst = "(1234 Principal John1 Doe)(1234 Principal John2 Doe)(1234 Principal John3 Doe)(1234 Principal4 John Doe)(1234 Principal John5 Doe)(1234 Principal John6 Doe)(1234 Principal John7 Doe)(1234 Principal John8 Doe)(1234 Principal John9 Doe)"
    y = 1
    For i = 1 To 7
        y = InStr(y, tst, "Principal") + 9
    Next i
    Debug.Print Mid(tst, y)
End Sub
 
Thinking about it for 2 more seconds, perhaps the split function is something of use here.

Code:
Sub testSplit()
    Dim i As Integer, y As Double
    Dim arr() As String
    Dim tst As String
    tst = "(1234 Principal John1 Doe)(1234 Principal John2 Doe)(1234 Principal John3 Doe)(1234 Principal4 John Doe)(1234 Principal John5 Doe)(1234 Principal John6 Doe)(1234 Principal John7 Doe)(1234 Principal John8 Doe)(1234 Principal John9 Doe)"
    y = 1
    arr() = Split(tst, "Principal")
    For i = LBound(arr) To UBound(arr)

        Debug.Print i, arr(i)
    Next i
    Debug.Print "Number 7 only" & arr(7)
End Sub
 
Thanks guys... with your help I have figured it out.

To find the starting points for the subsequent "Principal" entries, I use the example above for the second one, and laid them out in a single row. To find the ending point, I used the FIND function to find the close parentheses that follows the name, using the appropriate related starting point as in the first FIND function, so that I find the right one in the text. Then, to get the name that follows, I used the MID function with the starting point being the starting point of "Principal" plus 10 characters, and the length being the position of the close parentheses minus the "Principal" starting point +10. Worked great. The key was the FIND function.

Some of my entries only report back six, others have as many as fourteen... but I can eliminate the error messages and then sort out the names I need from there.

THANKS for your help.
 

Users who are viewing this thread

Back
Top Bottom