Problem in 'onchange="new Ajax.Request('...')' calls.
Hi Everybody,
I have a site to scrape that has three drop-down boxes, for country, province and city respectively. The country and province drop-downs have onchange="new Ajax.Request('...')" calls. When I select a country, the province options are populated automatically, and when I select a province, the city options are populated, all without a page refresh.
The populated values are visible in the browser, but when I view the HTML source it is unchanged and the new values are not there. I also looked at the Last Response and found escaped code like "\u003Cp id=\"city_select\"\u003E\n\n\t\u003Cselect id=\"city\" name=\"city\...". I suspect this is why the new values don't show up in the HTML source.
I need to loop through all the countries, provinces and cities, and build a URL for each so I can scrape the useful data.
But in screen-scraper my Extractor Pattern fails to match the HTML.
Can anyone please help me find a solution?
Thanks in advance.
Vivek.
Problem in 'onchange="new Ajax.Request('...')' calls.
Thanks a lot for your reply. Your suggestion works.
Problem in 'onchange="new Ajax.Request('...')' calls.
Vivek,
You're going to have to proxy the site using screen-scraper's [url=http://www.screen-scraper.com/support/docs/proxy_server_overview.php]proxy[/url]. While you have your browser connected through the proxy, click on a few of the options and you'll notice that new requests and responses will appear in the proxy transactions list.
You'll want to make scrapeable files out of a few of the transactions, but I would recommend learning what's being requested so you can reuse the same requests wherever possible.
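To picture what reusing a request means: once the proxy shows which parameter an Ajax call sends, the same URL can be rebuilt for every option value. Here is a rough sketch in plain Python (not screen-scraper's own scripting). The endpoint http://example.com/provinces and the parameter name country_id are made up for illustration; the real ones are whatever appears in your proxy transactions.

```python
from urllib.parse import urlencode

def build_ajax_urls(base_url, param_name, option_values):
    """Return one request URL per option value.

    base_url and param_name are hypothetical; copy the real ones
    from the transactions recorded by the proxy.
    """
    return [f"{base_url}?{urlencode({param_name: value})}" for value in option_values]

# Hypothetical country ids, as they might appear in the country drop-down.
urls = build_ajax_urls("http://example.com/provinces", "country_id", ["1", "2"])
```

The same idea repeats one level down: each province value taken from a response feeds the request that returns the cities.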
Also, remember Ajax code won't necessarily come down with any mark-up and can look somewhat cryptic. A deft hand at regular expressions may be necessary.
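As an illustration of the cryptic part, the fragment you quoted looks like HTML escaped as a JavaScript string (\u003C is "<", \u003E is ">"). Below is a minimal sketch in plain Python, outside screen-scraper, that decodes such a string and pulls out the options with a regular expression. The sample response is modelled on your fragment, but the option values and city names are invented, and it assumes the escaped text uses only JSON-compatible escapes.

```python
import json
import re

# Hypothetical response body modelled on the fragment quoted in the thread;
# the <option> entries are invented for illustration.
raw = (
    r'\u003Cp id=\"city_select\"\u003E\n\n\t'
    r'\u003Cselect id=\"city\" name=\"city\"\u003E\n'
    r'\u003Coption value=\"101\"\u003EToronto\u003C/option\u003E\n'
    r'\u003Coption value=\"102\"\u003EOttawa\u003C/option\u003E\n'
    r'\u003C/select\u003E\u003C/p\u003E'
)

def decode_js_string(escaped):
    """Decode \\uXXXX, \\", \\n and \\t by parsing the text as a JSON string."""
    return json.loads('"' + escaped + '"')

# Captures the value attribute and the visible label of each <option>.
OPTION_RE = re.compile(r'<option\s+value="([^"]*)"[^>]*>(.*?)</option>', re.S)

def extract_options(html):
    """Return a list of (value, label) pairs found in the decoded HTML."""
    return OPTION_RE.findall(html)

html = decode_js_string(raw)
options = extract_options(html)
```

In screen-scraper itself the equivalent is to write the Extractor Pattern against the escaped text exactly as it appears in the Last Response, rather than against the HTML you see in the browser.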
-Scott