Screen-Scraper Freezes On Last Response Tab
I'm trying to scrape the page at https://locator.chase.com. Every time I try to view the last response tab for the scrapeable file, screen-scraper freezes and CPU usage maxes out at 100%.
I've taken the following actions to try to fix it:
- turned off tidy
- cleared and rejected cookies
- increased memory allocation for screen-scraper
- cleaned the response before applying the extractor pattern
- limited the response displayed in the last response tab by applying a few methods to the result of "scrapeableFile.getContentAsString();". Changing the length of the displayed response didn't fix the problem.
Is there some way I can troubleshoot more thoroughly and get more information from screen-scraper about how and why this is happening? I've looked through the session-level API and haven't seen anything particularly useful. Secondarily, is there any way I can apply my own "TidyHTML" functionality to a scrapeableFile, or change how screen-scraper parses the response from the server before the scrapeableFile is called? I know I can change the response before the extractor pattern is applied, but doing that did not fix this problem. (This would be particularly useful when parsing JSON responses.)
Screen-Scraper Version 6.0.67a - Enterprise Edition
Windows 7 64 bit version
In version 6.0.61a or newer
In version 6.0.61a or newer you can add a MaximumDisplayedLastResponseLength line to screen-scraper.properties to limit how much of the last response is displayed.
The problem you describe is that the last response is too big, so you need to set the limit low enough that it works. Finding the right value will just be trial and error. Once done, to see the full response, you may need to load the page in a browser and then view the source of that page.
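Another way to see the full response without rendering it in the last response tab might be to write it to a file and open that in an editor or browser. Below is a minimal sketch in plain Java. The class and method names (ResponseDump, dumpToTempFile) are made up for illustration. It assumes you already have the response as a String, for example from scrapeableFile.getContentAsString(); how you wire it into a screen-scraper script is left open.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of screen-scraper.
public class ResponseDump {

    // Writes the given response text to a new temp file as UTF-8
    // and returns the path so it can be logged or opened manually.
    public static Path dumpToTempFile(String content) {
        try {
            Path target = Files.createTempFile("last-response-", ".html");
            Files.write(target, content.getBytes(StandardCharsets.UTF_8));
            return target;
        } catch (IOException e) {
            // Rethrow unchecked so callers don't need a throws clause.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Sample content standing in for a real server response.
        Path written = dumpToTempFile("<html><body>sample</body></html>");
        System.out.println("Response written to " + written.toAbsolutePath());
    }
}
```

The file lands in the system temp directory, so the response can be inspected even while the display limit is set very low.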
Still freezing - but problem is fixed
Hi Jason. Thank you for the response.
I added
MaximumDisplayedLastResponseLength=100
to the screen-scraper properties file and restarted screen-scraper. Although it did make the last response much smaller, I still managed to crash screen-scraper somewhere between viewing the last response, changing the HTML Tidy settings, and attempting to click on the scrape name. However, with another restart of the program, everything is working fine now. Thanks.