Concurrent Scrape: Out of memory

I'm having memory problems when running concurrent scrapes.

Here is my setup:
Windows 7
4 GB RAM
Quad-core 2.6 GHz CPU
1 GB max memory allocation in screen-scraper

I'm invoking screen-scraper from Java.

Java side:
Read input from file and save it to list. Close stream.

// Loops over roughly 50 keywords; each iteration spawns one scrape.
for (String keyword : keyList)
{
    // Declared outside the try block so the finally block can see it.
    RemoteScrapingSession remoteScrapingSession = null;
    try
    {
        remoteScrapingSession = new RemoteScrapingSession("TEST");
        remoteScrapingSession.setVariable("KEYWORD", keyword);

        // Lazy scrape, so the sessions run concurrently.
        remoteScrapingSession.setDoLazyScrape(true);
        remoteScrapingSession.scrape();
    }
    finally
    {
        // Be sure to disconnect from the server.
        if (remoteScrapingSession != null)
        {
            remoteScrapingSession.disconnect();
        }
    }
}

On the screen scraper side:
Before scraping session begins:
Initialize the CSV stream and save it to a session variable
Initialize a few session variables
Read input from a file and save contents to list. Close stream.

First Scrapeable Page:
for loop to invoke product page

Product scrapeable page:
for loop to invoke details page

Details scrapeable page:
scrape contents, write them out to csv

Always at the end:
Script to close writer stream and clear all session variables.

When I run the Java file through Eclipse and monitor the memory usage on the web interface, it goes to 100% very fast. At 100% it is able to run for a while, then I get an out-of-heap-memory error. It is not the 50 concurrent scrapes, as I've tried with just a couple and it does the same thing. There is a memory leak somewhere. I'm not sure if I'm using RemoteScrapingSession correctly.
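
For anyone who wants to cross-check the percentage shown in the web interface, here is a minimal sketch of computing heap usage from inside a JVM. It assumes the web interface reports used heap as a share of the maximum heap, which this thread does not confirm. `HeapUsage` and its method names are made up for illustration; only the `Runtime` calls are standard Java. It measures the JVM it runs in, so it would have to run inside the screen-scraper process to reflect the server's heap.

```java
public class HeapUsage {

    /**
     * Share of the maximum heap currently in use, as a percentage.
     * totalBytes is the heap currently reserved, freeBytes the unused
     * part of that reservation, maxBytes the ceiling (the -Xmx value).
     */
    public static double percentUsed(long totalBytes, long freeBytes, long maxBytes) {
        long usedBytes = totalBytes - freeBytes;
        return 100.0 * usedBytes / maxBytes;
    }

    /** Same calculation, fed from the running JVM. */
    public static double currentPercentUsed() {
        Runtime rt = Runtime.getRuntime();
        return percentUsed(rt.totalMemory(), rt.freeMemory(), rt.maxMemory());
    }
}
```

Logging `HeapUsage.currentPercentUsed()` before and after a scrape gives a number that can be compared across installs without relying on the web interface.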

I'm exiting Eclipse after I run the Java program. Does this matter?

Any help is appreciated.

Test-Scraper, How much memory

Test-Scraper,

How much memory are you allocating to screen-scraper? It sounds like you're already avoiding the recursion memory issue by using for loops in your scraping sessions. I wouldn't imagine the data from the text files that you're saving to memory is very large.

Take a look at this blog entry and see if you can't employ some of the techniques it talks about.

Try tinkering with turning lazy scrape on and off.

If you'd like you can send me your scraping session and I can take a look at it.

-Scott

Gets even weirder

OK, I tried everything. I increased screen-scraper's memory to 1.5 GB and then to 2 GB: still 100% memory usage. Next I ran just one scrape, and memory still jumped to over 40%. Then I took out scripts one by one to see which one was the culprit. Still no go. Finally I cut the session down to a single scrapeable file that requests the home page and nothing else. Before I run the scrape, memory is at 4%. When I run the scrape with that one scrapeable file, memory jumps to 19%. I then downloaded the sample shopping site scrape from the tutorial and it ran fine, going from 4% to 5%.

The next thing I did was create a new scraping session with one scrapeable file pointing to the same website again. Memory jumps to 19% again.
I then tried creating a new scraping session with just 1 scrapeable file pointing to the same website on a different install of screen scraper and it works fine.

screen-scraper 5.5.35a x64: memory jumps to 19%.
screen-scraper 5.5.34a x86: memory goes to 4%. Since this version was not up to date, I updated it to rule out the version difference as the cause.
screen-scraper 5.5.35a x86: memory goes to 4%.

So after all this, it seems my computer is having a problem with the 64-bit version of screen-scraper.
I will use the 32-bit version for now until I can find a fix.
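
Since the difference shows up only between the x64 and x86 installs, it may help to confirm which kind of JVM each install is actually running on. The sketch below is one way to check from Java. `JvmBitness` is a hypothetical helper name; `os.arch` is a standard system property, while `sun.arch.data.model` is specific to Oracle/OpenJDK-style JVMs and may be absent on others.

```java
public class JvmBitness {

    /**
     * Turns the two property values into a readable label.
     * dataModel is the value of "sun.arch.data.model" (may be null),
     * osArch the value of "os.arch".
     */
    public static String describe(String dataModel, String osArch) {
        if ("32".equals(dataModel) || "64".equals(dataModel)) {
            return dataModel + "-bit";
        }
        // Fallback when the vendor-specific property is missing.
        if (osArch != null && osArch.contains("64")) {
            return "64-bit (guessed from os.arch)";
        }
        return "unknown";
    }

    /** Label for the JVM this code is running in. */
    public static String current() {
        return describe(System.getProperty("sun.arch.data.model"),
                        System.getProperty("os.arch"));
    }
}
```

This only reports which JVM is in use; it does not by itself explain the 19% versus 4% difference.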

1 gig for screen scraper

I have allocated 1 GB to screen-scraper. From what I've read on the forums, anything above 1 GB is not really worth it.

Turning lazy off is not an option for me, because I need it to run concurrently.

I will try increasing screen-scraper's memory to 1.5 GB and then 2 GB and see how it goes.
If I still have problems I'll send you my scrapes.
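
On the point about needing lazy scrape for concurrency: concurrency can also come from the client side, by running blocking scrapes on a fixed-size thread pool so that only a set number are in flight at once. The following is a rough sketch of that idea, not a tested recipe for screen-scraper. `BoundedScrapes` and `ScrapeTask` are hypothetical names; `ScrapeTask` stands in for the "create session, set KEYWORD, scrape, disconnect" code from the first post, and the sketch assumes a non-lazy `scrape()` call blocks until the session finishes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BoundedScrapes {

    /** Hypothetical stand-in for one blocking scrape of a single keyword. */
    public interface ScrapeTask {
        void run(String keyword) throws Exception;
    }

    /**
     * Runs one task per keyword on a pool of maxConcurrent threads, so at
     * most maxConcurrent tasks execute at the same time. Returns how many
     * tasks finished without throwing.
     */
    public static int runAll(List<String> keywords, int maxConcurrent, ScrapeTask task) {
        ExecutorService pool = Executors.newFixedThreadPool(maxConcurrent);
        List<Future<?>> futures = new ArrayList<>();
        int succeeded = 0;
        try {
            for (String keyword : keywords) {
                // The lambda returns a value, so it is submitted as a Callable
                // and may throw checked exceptions.
                futures.add(pool.submit(() -> {
                    task.run(keyword);
                    return null;
                }));
            }
            for (Future<?> future : futures) {
                try {
                    future.get(); // wait for this task to finish
                    succeeded++;
                } catch (ExecutionException e) {
                    // This keyword's task threw; count it as failed and continue.
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break; // stop waiting if the caller was interrupted
                }
            }
        } finally {
            pool.shutdown();
        }
        return succeeded;
    }
}
```

With the 50 keywords from the first post, `BoundedScrapes.runAll(keyList, 5, task)` would keep at most five scrapes running at a time; the pool size of 5 is an arbitrary example value.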

Test-Scraper, Thank you for

Test-Scraper,

Thank you for taking the time to troubleshoot this issue as you have. We would be very appreciative if you wouldn't mind emailing me (scottw [@] screen-scraper) the URL to the site where you are getting the 19% memory jump in the 64 bit version of screen-scraper.

Thanks,
Scott