Any way to clear cookies without the enterprise edition?
Hi!
I am trying to scrape information from http://opportunities.osteopathic.org/search/search.cfm?searchType=1
The search session establishes a random CFID, CFTOKEN, and JSESSIONID.
I am scraping the cookies and reusing them in my scrapes but, unfortunately, it does not work. I get the following log:
Starting scraper.
Running scraping session: AOA IM Residency
Processing scripts before scraping session begins.
Processing script: "Initializing AOA Session"
Scraping file: "Search Results"
Search Results: Resolved URL: http://opportunities.osteopathic.org/search/search_results.cfm?CFID=44828&CFTOKEN=c06d0e5ad8245290-5EEEF88D-CA60-8432-1A2837A4EEF25725&jsessionid=f030fc2272bb71cb8e83463c6ce7a272a168
Search Results: Sending request.
Search Results: Warning! Received a status code of: 500.
Search Results: Processing scripts before all pattern applications.
Search Results: Applying extractor pattern: PROGRAM ID HOSPITAl ID
Search Results: Extracting data for pattern "PROGRAM ID HOSPITAl ID"
Search Results: The pattern did not find any matches.
Search Results: PROGRAM ID HOSPITAl ID: Processing scripts once if no matches.
Search Results: PROGRAM ID HOSPITAl ID: Processing scripts after all pattern applications.
Search Results: Warning! No matches were made by any of the extractor patterns associated with this scrapeable file.
Processing scripts after scraping session has ended.
Scraping session "AOA IM Residency" finished.
I assume that I need to clear the cookies prior to the scrape but, without the enterprise edition, I am unable to use session.clearCookies();
Any way around this? Am I headed in the right direction?
Thanks!
Steven
Thanks! That seemed to fix the issue!
Screen-scraper is one step ahead of me :)
Steven,
Sometimes it's best not to take shortcuts. screen-scraper will handle the passing of cookies, provided it has a chance to receive them from the server.
You should probably have at least three scrapeable files.
1. Search by (with POST parameters)
   http://opportunities.osteopathic.org/search/search.cfm?searchType=1
2. States (with POST parameters)
   http://opportunities.osteopathic.org/search/search.cfm
3. Search results (redirects to...)
   http://opportunities.osteopathic.org/search/search_process.cfm
   (...)
   http://opportunities.osteopathic.org/search/search_results.cfm?CFID=65977&CFTOKEN=9f4f299627ac6717-707B1BCF-C756-42D3-68DB31E5C2A2FC9E&jsessionid=f0302a178783b5206f364347c465b5e29226
So, notice that if you allow screen-scraper to process the first three pages, it will automatically follow the redirect to the fourth page with the CFID, CFTOKEN, and jsessionid intact.
-Scott
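
For anyone who wants to see the same idea outside screen-scraper, here is a minimal Python sketch of what Scott describes: request the pages in order with one cookie-aware client and let it carry the session cookies and follow the final redirect. This is not screen-scraper code. The helper names make_session and walk are made up for illustration, and the POST field names are not given in this thread, so they would have to be copied from the site's own forms.

```python
import http.cookiejar
import urllib.parse
import urllib.request

BASE = "http://opportunities.osteopathic.org/search/"

# The three pages in the order listed above; the last one is expected
# to redirect to search_results.cfm.
STEPS = [
    BASE + "search.cfm?searchType=1",
    BASE + "search.cfm",
    BASE + "search_process.cfm",
]


def make_session():
    """Return (opener, jar). The opener stores cookies it receives and
    sends them back on later requests (hypothetical helper)."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def walk(opener, steps, forms=None):
    """Request each URL in order and return the final URL after redirects.

    forms is an optional list with one entry per step: a dict of POST
    fields, or None for a plain GET. Real field names are an assumption
    the caller must supply.
    """
    forms = forms or [None] * len(steps)
    final_url = None
    for url, fields in zip(steps, forms):
        data = urllib.parse.urlencode(fields).encode() if fields else None
        with opener.open(url, data=data, timeout=30) as resp:
            resp.read()
            final_url = resp.geturl()  # URL after any redirect was followed
    return final_url
```

Calling walk(opener, STEPS, forms) with the right form fields should end on the search_results.cfm URL, with whatever session identifiers the server issued for that run. Nothing is copied from an earlier session, which is the shortcut that appears to have caused the 500 in the original log.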