Grabbing Web Information Into Database

DataCraft

Hi All, I am rather new to Access and am building a personal book cataloger. I've so far created the main table and form, but...

I have two third-party library programs that allow you to type in an ISBN and the software searches, say Amazon, and grabs particular missing data: author, cover jpg and summary etc, and imports the info to the program. I'd like to replicate this process with Access. If I can type in an author, title or ISBN to Access, I'd like it to search Amazon to gather and input the missing data to my assigned fields. Is this possible? Or am I stuck to typing in the information separately. Thanks.
 
..Is this possible? Or am I stuck to typing in the information separately. Thanks.
It is possible, but only if your programming skills are at a certain level; otherwise it is a big hurdle.
In your program, you open IE as silent, navigate and send a search string to Amazon, wait until it replies, and then search the returned string for the information you need.
And then you put the information into your database.
Everything is done by code you have to write or find on the Internet.
The bold words are keywords you can use when searching the Internet for information/sample code.
Below is a link showing how to use the keywords.
https://www.google.dk/search?q=Crea...Qbnw4PQAQ#q=open+internet+explorer+silent+vba
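The steps above can be sketched in VBA roughly like this (a minimal sketch using late binding; the parsing step is up to you, and any real Amazon URL and page structure would need to be inspected first - they are not shown here):

```vba
' Sketch: open IE hidden ("silent"), load a page, and return its HTML.
' Late binding, so no extra references are needed.
Public Function GetPageHtml(ByVal strUrl As String) As String
    Dim ie As Object
    Set ie = CreateObject("InternetExplorer.Application")
    ie.Visible = False                        ' silent - no window shown
    ie.Navigate strUrl
    Do While ie.Busy Or ie.ReadyState <> 4    ' 4 = READYSTATE_COMPLETE
        DoEvents                              ' wait for the page to load
    Loop
    GetPageHtml = ie.Document.body.innerHTML
    ie.Quit
    Set ie = Nothing
End Function
```

From there you would search the returned string (InStr/Mid, or walk ie.Document with the HTML object model) for the author, summary, etc., and write the pieces into your table with a recordset.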
 
Hi DataCraft!

Many years ago, I created an Access 2007 app to keep stats on NHL hockey games. It connected to the NHL site and READ the pages reporting scores, etc.
M. JHB is very polite in saying 'it is a big hurdle'; I am a very experienced programmer and it took me something like 100 hrs. to get it to work. I will try to find something about it.

Good luck, JLCantara.
 
100 hrs. isn't that much if you start from scratch, even if you are an experienced programmer.
Getting ideas; searching, finding and sorting information; writing code, testing and finding errors, testing again - time flies very fast! :D
Afterwards, when everything runs and the code is lined up, you can hardly understand why it took so long, but !!! :D :D
 
Hello Spike!!!

Since my NHL app was lost, your ref. will be useful if I need to read web pages again.

Good day, JLCantara.
 
Hi guys, whilst scraping the HTML of a book supplier is a workable answer, I think for books (especially) scraping HTML is like building a steam-powered car.

I would suggest you instead look at APIs, which supply information in XML format. For example this one: http://isbndb.com/account/logincreate (sorry, I am assuming it is free, but I do not have an account).

Also Amazon Web Services (free to join, with minimal costs based on usage): http://docs.aws.amazon.com/AWSECommerceService/latest/DG/EX_LookupbyISBN.html
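For the API route, the fetch-and-parse step looks roughly like this in VBA (a hedged sketch: the endpoint URL and the XML element name below are placeholders, not a real service's format - isbndb and Amazon each have their own request format and require an API key, so check their documentation):

```vba
' Sketch: fetch book metadata as XML over HTTP and pull out one field.
' "https://example.com/api/book?isbn=" is a placeholder endpoint, and
' "//Title" is an illustrative element name - adjust both to the real API.
Public Function GetBookTitle(ByVal strIsbn As String) As String
    Dim http As Object, dom As Object, node As Object
    Set http = CreateObject("MSXML2.XMLHTTP.6.0")
    http.Open "GET", "https://example.com/api/book?isbn=" & strIsbn, False
    http.send
    Set dom = CreateObject("MSXML2.DOMDocument.6.0")
    If dom.LoadXML(http.responseText) Then
        Set node = dom.SelectSingleNode("//Title")
        If Not node Is Nothing Then GetBookTitle = node.Text
    End If
End Function
```

Because the response is structured XML rather than page markup, pulling each field into your table is a matter of one SelectSingleNode per field and a recordset update - no screen scraping involved.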
 
Darbid

When I propose a solution, I know what I am talking about! 'I suggest this, but I didn't try it.' Amazon - minimal cost: how much? Dunno.
And how do you read the XML file? Dunno.

So...
 
Let's make my answer a little more generic then.

HTML scraping is not a good idea, for many reasons, including that the owner of the site might not like it, may add code to prevent it, and every change made to the scraped part of the website will have to be matched by changes in your code.

Using web services (an API) in XML or JSON format is by far more popular and could be said to be the generally accepted way for a developer to solve this problem.

Specifically I have never called a book metadata API so I cannot comment on any specific API.

I am a user of Amazon Web Services (AWS) and I am paying nothing for the membership for the first year. AWS charges based on usage and offers many products, such as databases, NoSQL storage and many more, all with different pricing based on usage.

I hope that clears up a few things.
 
Hi Darbid!

Reading your #9, I noticed that I was picky. I am presently in a bad mood because I have been having all kinds of problems since I switched to Access 2013 (Office 365)!

If a site owner uses HTML format for his info display, well, too bad. Here in Québec, many companies display info that cannot be 'read': they use 'pictures'!!!!

Since web pages are generated by an app, their format is stable: so no problem with that.

Anyway, whatever the format of what you read (HTML, HHHTML, XXXML, etc.), you need to know the tags' names AND STRUCTURE. So the work ends up almost the same...

Good day, JLC.
 
Darbid has a point. Screen scraping is a measure of last resort. It is not stable, because changes can be made to a site without any warning; also, some sites' TOS directly forbid scraping. Web services serving XML are a feature provided by many sites, and the data tends not to be buried in prettifying elements, so it is surely preferable to scraping.

Besides, the OP has never returned, so all this is pretty academic...
 
