how to extract data from websites

liza (Newbie, joined 24 January 2006)
Posted: 24 January 2006 at 3:57am

Hi,

I would like to know how data can be automatically extracted from websites, such as from ASP pages.

Is it possible to automatically select an item from a drop-down list, type into the text boxes, and click the submit button, and then repeat this for every item in the drop-down list?

It seems like it should be possible. Otherwise, how do those shopping comparison websites do it?
 
Thanks in advance.
 
Liza

dpyers (Senior Member, joined 12 May 2003)
Posted: 24 January 2006 at 9:51pm

In ASP, you can use ServerXMLHTTP to fetch a page into a string, which you can then search (screen scrape). You can also use it to post form data, although some sites check HTTP_REFERER to ensure that form data is posted only from their own domain.
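
As a rough illustration of the idea (in Python rather than ASP, using only the standard library), the sketch below pulls the option values out of a drop-down in an HTML string and prepares one form POST per option. The URL, the field names (`item`, `qty`), and the sample HTML are all made up for the example, so a real page will differ. The Referer header is set only because some sites check it.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode
from urllib.request import Request, urlopen


class OptionCollector(HTMLParser):
    """Collects the value of every <option> inside one named <select>."""

    def __init__(self, select_name):
        super().__init__()
        self.select_name = select_name
        self.in_target = False
        self.values = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "select" and attrs.get("name") == self.select_name:
            self.in_target = True
        elif tag == "option" and self.in_target and attrs.get("value"):
            # Options with an empty value (placeholders) are skipped.
            self.values.append(attrs["value"])

    def handle_endtag(self, tag):
        if tag == "select":
            self.in_target = False


def option_values(html, select_name):
    """Return the non-empty option values of the named drop-down."""
    parser = OptionCollector(select_name)
    parser.feed(html)
    return parser.values


def build_post(url, fields, referer):
    """Prepare (but do not send) a form POST with a Referer header."""
    data = urlencode(fields).encode("ascii")
    return Request(url, data=data, headers={"Referer": referer}, method="POST")


def fetch(request):
    """Send a prepared request and return the response body as a string."""
    with urlopen(request) as response:
        return response.read().decode("utf-8", errors="replace")


# Hypothetical page content; in practice this string would come from fetch().
SAMPLE_HTML = """
<form action="/search.asp" method="post">
  <select name="item">
    <option value="">Choose...</option>
    <option value="A100">Widget</option>
    <option value="B200">Gadget</option>
  </select>
  <input type="text" name="qty">
  <input type="submit" value="Go">
</form>
"""

FORM_URL = "http://example.com/search.asp"  # assumed, not a real endpoint

# One prepared POST per item in the drop-down list.
requests_to_send = [
    build_post(FORM_URL, {"item": value, "qty": "1"}, referer=FORM_URL)
    for value in option_values(SAMPLE_HTML, "item")
]
```

Each prepared request could then be passed to `fetch()` and the returned string searched for the data of interest.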

Many of the comparison sites use an API to pick up a feed from the sites they compare.

Lead me not into temptation... I know the short cut, follow me.

liza (Newbie, joined 24 January 2006)
Posted: 27 January 2006 at 7:40am

Thanks for your response.

Could you also point me toward resources for learning about the things you mentioned, such as websites or books?
 
Thanks!

dpyers (Senior Member, joined 12 May 2003)
Posted: 28 January 2006 at 12:20am

There's an example of ServerXMLHTTP here: http://new2asp.com/Section_Files/PullStrings/PullStrings.asp

You can use either GET or POST. A GET will usually require adding a query string to the URL, while a POST sends the form data in the request body.
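
To make the difference between the two request methods concrete, here is a small Python sketch (standard library only, nothing is actually sent). With GET the form fields travel in the URL as a query string, and with POST they travel in the request body. The URL and field names are placeholders, not a real endpoint.

```python
from urllib.parse import urlencode
from urllib.request import Request

fields = {"item": "A100", "qty": "1"}        # hypothetical form fields
base_url = "http://example.com/search.asp"   # assumed, not a real endpoint

# GET: the form data is appended to the URL as a query string.
get_request = Request(base_url + "?" + urlencode(fields), method="GET")

# POST: the same data is sent in the request body instead.
post_request = Request(
    base_url,
    data=urlencode(fields).encode("ascii"),
    method="POST",
)
```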

You can probably contact the sites you want to get the info from to find out what APIs they use. There are industry-specific XML feeds and often site-specific ones.
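
If a site does offer an XML feed, reading it is usually simpler than scraping HTML. The Python sketch below parses a small feed with the standard library. The feed layout (`products`, `product`, `sku`, `name`, `price`) is invented for the example, since each real feed defines its own element names.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed; a real one would be downloaded from the site's API.
SAMPLE_FEED = """<products>
  <product sku="A100"><name>Widget</name><price currency="USD">9.99</price></product>
  <product sku="B200"><name>Gadget</name><price currency="USD">24.50</price></product>
</products>
"""


def parse_feed(xml_text):
    """Turn each <product> element into a plain dictionary."""
    root = ET.fromstring(xml_text)
    return [
        {
            "sku": product.get("sku"),
            "name": product.findtext("name"),
            "price": float(product.findtext("price")),
        }
        for product in root.iter("product")
    ]


products = parse_feed(SAMPLE_FEED)
```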

Always get permission from the sites before you either screen scrape them or use their API. You'll be exposed to legal liability if you don't. They can also complain to your web host and your ISP about content theft, bandwidth theft, etc. You could wake up one morning and find yourself on a bad-guy list with no website and no internet connection.


Edited by dpyers - 28 January 2006 at 12:21am
