Separating Hyperlink in a Long String

RFabert

Registered User.
Local time
Today, 11:23
Joined
Jun 27, 2012
Messages
31
Hi,

I have another string processing question:

Here are sample rows of the data I am working with:

http://www.thedailyworld.com/ View The Daily world website
http://chroniclingamerica.loc.gov/lccn/sn87093039/issues/ <b>View digital version 1904-1922 online.</b>;"http://www.leavenworthecho.com/ Leavenworth Echo website"




http://www.sos.wa.gov/library/docs/iii/charts/kingcounty.htm <b>Newspaper publishing history chart</b> <img align=absmiddle src= http://www.sos.wa.gov/library/docs/iii/olympiansm.jpg > NotScoped
http://www.sos.wa.gov/library/docs/iii/charts/kingcounty.htm <b>Newspaper publishing history chart</b> <img align=absmiddle src= http://www.sos.wa.gov/library/docs/iii/olympiansm.jpg > NotScoped
http://www.sos.wa.gov/library/docs/iii/charts/kingcounty.htm <b>Newspaper publishing history chart</b> <img align=absmiddle src= http://www.sos.wa.gov/library/docs/iii/olympiansm.jpg > NotScoped
http://www.sos.wa.gov/library/docs/iii/charts/kingcounty.htm <b>Newspaper publishing history chart</b> <img align=absmiddle src= http://www.sos.wa.gov/library/docs/iii/olympiansm.jpg > NotScoped
http://www.sos.wa.gov/library/docs/iii/charts/kingcounty.htm <b>Newspaper publishing history chart</b> <img align=absmiddle src= http://www.sos.wa.gov/library/docs/iii/olympiansm.jpg > NotScoped
http://www.sos.wa.gov/library/docs/iii/charts/kingcounty.htm <b>Newspaper publishing history chart</b> <img align=absmiddle src= http://www.sos.wa.gov/library/docs/iii/olympiansm.jpg > NotScoped
http://seattletimes.nwsource.com/html/home/ View Seattle Times website;"http://www.sos.wa.gov/library/docs/iii/charts/seattletimesmap.htm <b>Newspaper publishing history chart</b> <img align=absmiddle src= http://www.sos.wa.gov/library/docs/iii/olympiansm.jpg > NotScoped"
http://www.sos.wa.gov/library/docs/iii/charts/bothellreporter.htm <b>Newspaper publishing history chart</b> <img align=absmiddle src= http://www.sos.wa.gov/library/docs/iii/olympiansm.jpg > NotScoped;"http://www.pnwlocalnews.com/north_king/bkn/ View website"
http://www.sos.wa.gov/library/docs/iii/charts/bothellreporter.htm <b>Newspaper publishing history chart</b> <img align=absmiddle src= http://www.sos.wa.gov/library/docs/iii/olympiansm.jpg > NotScoped;"http://www.pnwlocalnews.com/north_king/bkn/ View website"
http://www.sos.wa.gov/library/docs/iii/charts/bothellreporter.htm <b>Newspaper publishing history chart</b> <img align=absmiddle src= http://www.sos.wa.gov/library/docs/iii/olympiansm.jpg > NotScoped;"http://www.pnwlocalnews.com/north_king/bkn/ View website"
http://www.sos.wa.gov/library/docs/iii/charts/bothellreporter.htm <b>Newspaper publishing history chart</b> <img align=absmiddle src= http://www.sos.wa.gov/library/docs/iii/olympiansm.jpg > NotScoped;"http://www.pnwlocalnews.com/north_king/bkn/ View website"


I am trying to separate the sos chart hyperlink : http://www.sos.wa.gov/library/docs/iii/charts/seattletimesmap.htm

Some have no chart and some hyperlinks are in the middle of the string, rather than in the beginning of the string.


I would like to use Excel Functions on the worksheet rather than VBA.

Thank you!
 
Not sure I completely understand.

Are the first two lines going to have the hyperlink extracted? They don't seem to have the .sos part?

Also, rows have 2 hyperlinks? Which one to extract?

What pattern can we look for?
 
Hi NBVC,

only the hyperlinks with the sos chart path are needed. Other hyperlinks need to be removed from the string. The rest of the rows will have "No Chart Listed" placed into the field. HEre's what I've been working with...the problem is that I have to repeat it and I'd like to be able to do it all in one or two steps:

=IF(ISERROR(FIND("sos",E2))=TRUE, "No Chart Listed ", LEFT(E2,FIND("<b>",E2)-1))
=IF(ISERROR(FIND(";",S2))=TRUE,S2,RIGHT(S2,LEN(S2)-(SEARCH(";",S2)+1)))
 
Not sure what you mean by "I have to repeat it"

Those formulas look pretty good to me, you just need to copy them down...

You wanted both formulas combined into one?

something like:

Code:
=IF(ISERROR(FIND("sos",A1)),"No Chart Listed ",IF(ISERROR(FIND(";",LEFT(A1,FIND("<b>",A1)-1))),LEFT(A1,FIND("<b>",A1)-1),RIGHT(LEFT(A1,FIND("<b>",A1)-1),LEN(LEFT(A1,FIND("<b>",A1)-1))-(SEARCH(";",LEFT(A1,FIND("<b>",A1)-1))+1))))

or a shorter version:

Code:
=IF(ISERROR(FIND("sos",A1)),"No Chart Listed ",TRIM(MID(A1,FIND("http://www.sos.",A1),FIND("<b>",A1)-FIND("http://www.sos.",A1))))
 
Last edited:
I have to repeat the second one because there is often a lengthy list of hyperlinks in one line. It has to be repeated for each new ;
 
NBVC,

The first solution doesn't work and the second one does, other than not including all the types of paths.

Thank you for your efforts!
 

Users who are viewing this thread

Back
Top Bottom