Web Scraping

YNWA

Registered User.
Local time
Today, 05:57
Joined
Jun 2, 2009
Messages
905
Hi,

Would someone be willing to create an Excel file that will scrape all the courses and their details from a university's website?

I've been trying to learn from YouTube, but I just do not get it at all.

Thanks
 

plog

Banishment Pending
Local time
Yesterday, 23:57
Joined
May 11, 2011
Messages
11,675
Would someone be willing to create an Excel file that will scrape all the courses and their details from a university's website?

That I've got to see. I've worked with some Excel gurus (that's not a compliment) who could do some truly stupid-magnificent things with Excel that it wasn't intended for (building web pages, a time clock for employees, a touch-screen login app for patients at a doctor's office). If a site existed for Excel abominations I could spend days on it, kind of like thedailywtf.com or clientsfromhell.com.

With that being said, if you can provide the URL of the site, I can look at it and see if I can scrape it with a more apt programming language, which would put the data in an Excel file, rather than using Excel to do the scraping.
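
As a rough illustration of that split (a general-purpose language does the scraping, Excel only opens the result), here is a minimal Python sketch that writes rows to CSV, which Excel opens directly. The column names, the example.com URLs and the rows themselves are placeholders made up for this sketch, not data from the real site.

```python
import csv
import io

# Placeholder rows: in a real run these would come from the scraped pages.
# Column names, URLs and values are invented for illustration only.
courses = [
    {"subject": "Accounting",
     "course": "Business Economics BSc Honours",
     "url": "https://example.com/courses/accounting/business-economics"},
    {"subject": "Music",
     "course": "Music Production",
     "url": "https://example.com/courses/music/music-production"},
]


def write_courses(rows, out):
    """Write course dictionaries as CSV to any open text file-like object."""
    writer = csv.DictWriter(out, fieldnames=["subject", "course", "url"])
    writer.writeheader()
    writer.writerows(rows)


# An in-memory buffer keeps the sketch self-contained; for a real file use
# open("courses.csv", "w", newline="") instead.
buffer = io.StringIO()
write_courses(courses, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Saving the same text as courses.csv gives a file that opens in Excel with one course per row.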
 

YNWA

Hi, it's http://www.westminster.ac.uk/courses/undergraduate and I need to go into each subject (e.g. Accounting), then go to each sub-course (e.g. Business Economics BSc Honours) and scrape the data from there.

So essentially I need the data from a page like this: http://www.westminster.ac.uk/course...-time/u09fubec-bsc-honours-business-economics

It can be done using macros, see here: http://www.youtube.com/watch?v=qbOdUaf4yfI but it has confused me.

Thanks
Will
 

plog

It can be done using macros

Like I said, I'm sure there are people who could create an Excel file to operate a pacemaker, but that doesn't mean it's the right tool for the job.

Since this involves scraping multiple pages, it's probably going to take a few hours to code (at least for me, anyway), so I'd recommend going to a freelance site and hiring someone to do it.
 

YNWA

Now that's a great line...

The last link the OP provided is very comprehensive, and the author appears to pull off the operation. So I am confused why the OP can't keep repeating the procedure until it works.

I don't get how it can be done. In his example, all the states are nicely listed in a numbered sequence and all in one place.

But in my example I have to click Accounting, then click a course, then scrape the course page, then go back and click another course, and so on.

The web pages are not numbered nicely, so each page has a different text name, not a different number.

Do you think the only way to scrape the site is to scrape each course page itself and do that for every course?

If so, it's probably easier and quicker for me to just copy and paste what I need from each course page.
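
One common way around un-numbered pages is to read the links off the listing page instead of guessing URLs. The sketch below uses only Python's standard library; the HTML snippet, the example.com address and the /courses/ prefix are assumptions for illustration, and the real site's markup will differ.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the absolute URL of every <a href="..."> in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative links against the page they were found on.
            self.links.append(urljoin(self.base_url, href))


# Hypothetical listing-page markup; a real run would download the page.
SAMPLE_HTML = """
<ul>
  <li><a href="/courses/accounting/business-economics">Business Economics</a></li>
  <li><a href="/courses/accounting/finance-management">Finance Management</a></li>
  <li><a href="/about">About us</a></li>
</ul>
"""


def course_links(html, base_url, prefix="/courses/"):
    """Return only the links that live under the given path prefix."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    wanted = urljoin(base_url, prefix)
    return [link for link in collector.links if link.startswith(wanted)]


links = course_links(SAMPLE_HTML, "https://example.com/courses/undergraduate")
for link in links:
    print(link)
```

Because the URLs come from the page itself, it does not matter that they are named with text rather than numbers.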
 

YNWA

If I was faced with this type of challenge I would practice the author's technique until I could replicate his results, then tweak it to suit my needs.

I can replicate his results for one page, but I don't see how it's any use when I can copy and paste the individual course pages quicker.

I would need a tool that scrapes the whole /courses/ directory. His method works because he can simply increment a number in the URL by 1 to reach the next page, so it's easy to do.

If one course is called Finance Management and another is Music Production, how do I tell it to go to Music after Finance?

Especially if Finance Management is in the /courses/accounting/ folder and music is in the /courses/music/ folder?
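
A crawler does not need to know that music comes after finance: it can start at the top page, follow every link it finds, and keep only URLs under /courses/. The Python sketch below shows that traversal with a queue and a "seen" set. FAKE_SITE and get_links are hypothetical stand-ins for downloading a page and extracting its hrefs, and the example.com URLs are invented.

```python
from collections import deque
from urllib.parse import urljoin

# Stand-in for the real site: each URL maps to the hrefs found on that page.
FAKE_SITE = {
    "https://example.com/courses/": ["accounting/", "music/", "/about"],
    "https://example.com/courses/accounting/": ["finance-management"],
    "https://example.com/courses/music/": ["music-production"],
    "https://example.com/courses/accounting/finance-management": [],
    "https://example.com/courses/music/music-production": [],
}


def get_links(url):
    """Hypothetical helper: return the raw hrefs on a page (from FAKE_SITE)."""
    return FAKE_SITE.get(url, [])


def crawl(start_url, get_links, prefix):
    """Visit every page reachable from start_url whose URL starts with prefix."""
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue:
        url = queue.popleft()
        visited.append(url)
        for href in get_links(url):
            # Resolve relative links, so folders like accounting/ and music/
            # are handled the same way.
            absolute = urljoin(url, href)
            if absolute.startswith(prefix) and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited


pages = crawl("https://example.com/courses/", get_links,
              "https://example.com/courses/")
for page in pages:
    print(page)
```

In a real run, get_links would download each page and pull out its links, and the course pages (the ones at the bottom of each folder) are where the actual details would be scraped.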
 
