Scraping ASP.NET sites and Going to Next Page
Hi,
I have completed all the tutorials and was very keen to run my first scraping project. However, I have hit a brick wall: the site I am trying to scrape (http://www.totaljobs.co.uk) is an ASP.NET site, and the button for the next page uses an ASPX postback / JavaScript.
Can someone please explain the steps I need to perform to go to the next page? I have read http://blog.screen-scraper.com/2008/06/04/scraping-aspnet-sites/ and other topics in the forum, but I am still struggling.
The results page works fine when I pass ~#INDUSTRY#~, but the next page fails with "Warning! Received a status code of: 500." Please help.
Here are the parameters on the results page:
Keywords:
Industry: ~#INDUSTRY#~
Sort: 2
From: /JobSearch/AdvancedJobSearch.aspx
Here are the parameters on the next page:
Keywords:
Industry: ~#INDUSTRY#~
Sort: 2
From: /JobSearch/AdvancedJobSearch.aspx
__EVENTTARGET: srpPager
__EVENTARGUMENT: 4 (this is the page number)
hdnSearchResults: BJF,A,C8y0E,G/J,ZAo,db2,gC7,hg2,Jvo-,GPO-,DVu-,B8F-,d,B etc--
__VIEWSTATE: /wEPDwUKMTk3MDU1OTM0Mg9kFgwCAg8WAh4JaW5uZXJodG1sZWQC etc---
advancedRefineSearch$txtKeywords:
advancedRefineSearch$txtLocation:
advancedRefineSearch$ddlRadius: 5
advancedRefineSearch$ddlSalary: 1|0
advancedRefineSearch$ddlContractType: 0
advancedRefineSearch$ddlIndustrySector: 18
advancedRefineSearch$rblCompanyType: 0
ddlSort: 2
Job Sites
Hi,
I'm currently working on a project for a client who is looking at scraping the site you mentioned.
Have you been able to get this to work?
spwizard,
Have a look at this blog entry for more ideas on scraping .Net sites.
-Scott
The trick is to get your HTTP request to look exactly like the one your proxy captured. In .NET, the VIEWSTATE can be dynamic too, so you may need to scrape it from each response and set it with a variable in the next request.
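To make that advice concrete, here is a minimal sketch in plain Python (standard library only, not screen-scraper's own scripting API). It pulls every hidden field, including __VIEWSTATE, out of the results page and rebuilds the postback for the next page. The names HiddenFieldParser and build_postback are hypothetical, and the HTML in SAMPLE_HTML is invented for illustration; only the parameter names srpPager, hdnSearchResults and ddlSort come from the post above.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs from <input type="hidden"> tags."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def build_postback(html, event_target, event_argument, extra=None):
    """Return the form fields for a postback (hypothetical helper)."""
    parser = HiddenFieldParser()
    parser.feed(html)
    # Start from the hidden fields of the page just received, so the
    # __VIEWSTATE sent back is always the current one.
    payload = dict(parser.fields)
    payload["__EVENTTARGET"] = event_target
    payload["__EVENTARGUMENT"] = str(event_argument)
    payload.update(extra or {})
    return payload


# Invented stand-in for the results page; a real page carries far
# longer values.
SAMPLE_HTML = """
<form method="post" action="results.aspx">
  <input type="hidden" name="__VIEWSTATE" value="abc123+/=" />
  <input type="hidden" name="hdnSearchResults" value="BJF,A" />
  <input type="hidden" name="__EVENTTARGET" value="" />
  <input type="hidden" name="__EVENTARGUMENT" value="" />
</form>
"""

# Ask the pager for page 4, as in the captured request.
payload = build_postback(SAMPLE_HTML, "srpPager", 4, {"ddlSort": "2"})

# URL-encode the body: characters such as "+", "/" and "=" in the
# view state must be escaped before they are posted.
body = urlencode(payload)
print(body)
```

The visible form fields (the advancedRefineSearch$... values) would go in through the extra argument. If the 500 persists, it may be worth comparing the encoded body field by field against the request the proxy captured.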