screen-scraper public support

Questions and answers regarding the use of screen-scraper. Anyone can post. Monitored occasionally by screen-scraper staff.

Nested DataSet

The page I'm trying to scrape is here

As you can see there are two major categories, Repayment Mortgages and Interest Only Mortgages. I'm only interested in scraping repayment mortgages.

Within the repayment mortgages section there are groups of mortgage categories. Each category can contain 1-2 individual mortgages.

OutOfMemoryError

I get an OutOfMemoryError when using the method scrapeableFile.saveFileOnRequest( filePath ).

My max memory allocation in settings is set to 1024 MB.

My session runs fine without the saveFileOnRequest method, but if I try to save the files (which are .jpg images) via saveFileOnRequest, memory usage jumps to 100% within a couple of seconds.

Here is the sequence:
Category Page (after each pattern match) -> Product Page

Product Page (after each pattern match) -> Details Page
Product Page (once if pattern match) -> Next Page

Next Page (after each pattern match) -> Details Page
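
One possible workaround, if the images can be fetched outside of saveFileOnRequest, is to stream each file to disk in small chunks so that a whole image is never held in memory. The sketch below is plain Java and not screen-scraper API: the names StreamingSave and copyInChunks are made up for illustration, and it makes no claim about how saveFileOnRequest works internally.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper: copy a stream to a file using a small fixed buffer. */
final class StreamingSave {

    /** Copies everything from 'in' to 'target' 8 KB at a time and returns the byte count. */
    static long copyInChunks(InputStream in, Path target) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        try (OutputStream out = Files.newOutputStream(target)) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
                total += read;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for an image download: 20,000 bytes of dummy data.
        Path target = Files.createTempFile("image", ".jpg");
        long written = copyInChunks(new ByteArrayInputStream(new byte[20000]), target);
        System.out.println("wrote " + written + " bytes to " + target);
        Files.delete(target);
    }
}
```

In a real session the input stream would come from the image URL rather than from a byte array.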

Beginner question: limit the number of requests

Hi,
Is there a way to limit the number of requests screen-scraper makes at once?
For example: request 5 pages, stop for 10 seconds, request another 5 pages. Or, better, make the 5 and the 10 random numbers.
You guessed it: this is to mimic normal browsing of a site, so that you don't get banned and the site's server doesn't get overloaded.

Thanks in advance,
Titzu
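
The "random batch, random pause" idea could be sketched roughly like this. It is plain Java, not screen-scraper API: RequestThrottle and pauseBeforeNextRequest are hypothetical names, and the batch and pause ranges are example values. A script would call the method once before each request and sleep for the returned number of milliseconds.

```java
import java.util.Random;

/** Hypothetical helper: after a random-sized batch of requests, ask for a random pause. */
final class RequestThrottle {
    private final Random random;
    private final int minBatch;
    private final int maxBatch;
    private final long minPauseMs;
    private final long maxPauseMs;
    private int remainingInBatch;

    RequestThrottle(Random random, int minBatch, int maxBatch, long minPauseMs, long maxPauseMs) {
        if (minBatch < 1 || maxBatch < minBatch || minPauseMs < 0 || maxPauseMs < minPauseMs) {
            throw new IllegalArgumentException("invalid batch or pause range");
        }
        this.random = random;
        this.minBatch = minBatch;
        this.maxBatch = maxBatch;
        this.minPauseMs = minPauseMs;
        this.maxPauseMs = maxPauseMs;
        this.remainingInBatch = nextBatchSize();
    }

    private int nextBatchSize() {
        return minBatch + random.nextInt(maxBatch - minBatch + 1);
    }

    /** Call once before each request. Returns 0 inside a batch, or a pause in ms between batches. */
    long pauseBeforeNextRequest() {
        if (remainingInBatch > 0) {
            remainingInBatch--;
            return 0L;
        }
        // The current request counts as the first one of the new batch.
        remainingInBatch = nextBatchSize() - 1;
        return minPauseMs + (long) (random.nextDouble() * (maxPauseMs - minPauseMs + 1));
    }

    public static void main(String[] args) throws InterruptedException {
        // Short pauses so the demo finishes quickly; real values might be several seconds.
        RequestThrottle throttle = new RequestThrottle(new Random(), 3, 7, 50L, 150L);
        for (int request = 1; request <= 12; request++) {
            long pause = throttle.pauseBeforeNextRequest();
            if (pause > 0) {
                System.out.println("pausing " + pause + " ms");
                Thread.sleep(pause);
            }
            System.out.println("request " + request);
        }
    }
}
```

For the example in the question, the ranges would be something like 3 to 7 requests per batch and 8000 to 12000 ms per pause.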

Tutorial 2 + iteration

I'm sort of a newbie when it comes to programming, but I'm trying to apply the second tutorial (webshop) to an online car market. There is a lot of material, so I think I have to iterate to get through it. I tried to modify the setup in the tutorial with this script: http://community.screen-scraper.com/Next_Page_Memory_Conscious, but I have run into some problems.

I'm extracting the number of the next page as HAS_NEXT_PAGE and running the script after each pattern match, but the loop does not seem to start. This is the output I'm getting:

Search results: Applying extractor pattern: HAS_NEXT_PAGE

dropdown date

Hey,

I'm new to screen scraping. I'd like to know how to scrape a drop-down date form.
For example:
I have a drop-down day - month - year form.
I'd like to scrape the days, and then a name and price for each specific day - month - year.
I know how to scrape one day, but not the whole month or year for a specific detail.
Can somebody help me?

best regards

Oliver
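
One way to go from a single day to a whole month is to generate every date in that month and submit the form once per date. Below is a minimal sketch of the date generation only. DateFormValues and daysOf are hypothetical names, java.time needs Java 8 or later, and how the day, month and year values are passed to the form depends on the site.

```java
import java.time.LocalDate;
import java.time.YearMonth;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: list every date of a month so each can be submitted to the form. */
final class DateFormValues {

    /** Returns all dates of the given month, in order, handling month length and leap years. */
    static List<LocalDate> daysOf(int year, int month) {
        YearMonth yearMonth = YearMonth.of(year, month);
        List<LocalDate> days = new ArrayList<>();
        for (int day = 1; day <= yearMonth.lengthOfMonth(); day++) {
            days.add(yearMonth.atDay(day));
        }
        return days;
    }

    public static void main(String[] args) {
        for (LocalDate date : daysOf(2014, 2)) {
            // These three values would be set as the day, month and year form fields.
            System.out.println(date.getDayOfMonth() + " - " + date.getMonthValue() + " - " + date.getYear());
        }
    }
}
```

Looping the month argument from 1 to 12 extends the same idea to a whole year.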

Write to csv

Hi,
Is there a way to set the formatting of the csv column before writing?

An example would be if I have a header called "TEST" and I write out 6-8614 it automatically changes the format of the "TEST" column to date and in the csv the value is 06/01/14.
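
A CSV file itself carries no column formatting, so the conversion to a date most likely happens in the spreadsheet program that opens the file (assumed here to be Excel or something similar). A common workaround is to write the value as a text formula, ="6-8614", which such programs display as literal text. A minimal sketch, with CsvText and asSpreadsheetText as made-up names:

```java
/** Hypothetical helper: write a value so that spreadsheet programs keep it as text. */
final class CsvText {

    /** Wraps the value as ="value" and then quotes that formula as a single CSV field. */
    static String asSpreadsheetText(String value) {
        // Inside the formula, a double quote is escaped by doubling it.
        String formula = "=\"" + value.replace("\"", "\"\"") + "\"";
        // The whole formula is then quoted as a CSV field, doubling quotes again.
        return "\"" + formula.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        System.out.println("TEST");
        System.out.println(asSpreadsheetText("6-8614"));
    }
}
```

The trade-off is that other programs reading the same CSV will see the formula text rather than the bare value, so this only makes sense when the file is meant for a spreadsheet.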

write DAY_OF_YEAR

Hi,
How can I get and write out the current day of the year (0-365)? The same as in PHP:

$dayofyear = date("z");

Thanks

edit:
Never mind, I found it.

http://community.screen-scraper.com/node/1108
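
For anyone else looking for this: a minimal sketch of one way to get the same value in Java, which is not necessarily the approach described at the link above. java.time (Java 8 or later) counts days from 1, so subtracting 1 gives the zero-based value that PHP's date("z") returns. DayOfYear and zeroBased are made-up names.

```java
import java.time.LocalDate;

/** Hypothetical helper: zero-based day of the year, like PHP's date("z"). */
final class DayOfYear {

    /** January 1st gives 0; December 31st gives 364, or 365 in a leap year. */
    static int zeroBased(LocalDate date) {
        return date.getDayOfYear() - 1;
    }

    public static void main(String[] args) {
        System.out.println(zeroBased(LocalDate.now()));
    }
}
```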

Issue : Download file generated at run time

The screen-scraper application is unable to download a file generated at run time. I tried to download the file using the downloadFile method, but the downloaded file contains HTML text instead of the expected data.

Steps to download file
----------------------
1. Login to web site.
2. Go to file download page
3. Click on the file download link (a hyperlink redirecting to someURL/index.php?G1=value&G2=value&G=value). This PHP script processes the requested file and then provides an option to download it.

For this one application in particular:

an IF problem

I would like to avoid writing the scraped data to a csv file if the "date" variable is zero.
The script is:
out.write( session.getVariable( "CODE" )+ "," );
out.write( session.getVariable( "WIND" )+ "," );
out.write( session.getVariable( "TMAX" )+ "," );
out.write( session.getVariable( "TMIN" )+ "," );
out.write( session.getVariable( "WIND") + "," );
out.write( session.getVariable( "DATE" )+ "," );

I tried with:
if (dataRecord.get("DATE") != null)
{
out.write( session.getVariable( "CODE" )+ "," );
out.write( session.getVariable( "WIND" )+ "," );
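
A possible reason the null check does not help: a scraped value of zero is probably the text "0" rather than null, so != null is still true. Below is a minimal sketch of a stricter check in plain Java. RowFilter, hasUsableDate and writeRow are hypothetical names, and whether DATE lives in the dataRecord or in a session variable is an assumption that depends on where the script runs.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

/** Hypothetical helper: only write a CSV row when DATE holds a real value. */
final class RowFilter {

    /** False for null, blank, or zero-like text such as "0", "00" or "0.0". */
    static boolean hasUsableDate(Object date) {
        if (date == null) {
            return false;
        }
        String text = date.toString().trim();
        return !text.isEmpty() && !text.matches("0+(\\.0+)?");
    }

    /** Writes the fields as one comma-separated line, or nothing when the date is unusable. */
    static void writeRow(Writer out, Object date, String... fields) throws IOException {
        if (!hasUsableDate(date)) {
            return;
        }
        out.write(String.join(",", fields));
        out.write("\n");
    }

    public static void main(String[] args) throws IOException {
        StringWriter out = new StringWriter();
        writeRow(out, "0", "A1", "12", "0");          // skipped: DATE is zero
        writeRow(out, "20140601", "A2", "9", "20140601"); // written
        System.out.print(out);
    }
}
```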

UTF-8 in Linux

I moved my machine over to Linux and found that, with UTF-8, I'm now getting a squiggly A character in front of a £. I'm using the same settings as I was in Windows, where I never had this issue. The characters keep appearing in the debug log. Are there any settings that need to be changed specifically for Linux?
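
Assuming the extra character is Â, this looks like the classic case of UTF-8 bytes being decoded as ISO-8859-1 (or a similar single-byte charset) somewhere along the way, for example when the JVM's default charset differs between the two machines. The sketch below only reproduces the symptom in plain Java. MojibakeDemo, misdecode and repair are made-up names, and it does not identify which screen-scraper setting is responsible.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical demo: how a UTF-8 pound sign turns into two characters when misdecoded. */
final class MojibakeDemo {

    /** Encodes text as UTF-8, then wrongly decodes those bytes as ISO-8859-1. */
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    /** Reverses the mistake: takes the ISO-8859-1 bytes back and decodes them as UTF-8. */
    static String repair(String garbled) {
        byte[] bytes = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The JVM default charset is worth comparing between the Windows and Linux machines.
        System.out.println("default charset: " + Charset.defaultCharset());
        String garbled = misdecode("\u00A3");      // \u00A3 is the pound sign
        System.out.println(garbled.length());      // one character has become two
        System.out.println(repair(garbled).equals("\u00A3"));
    }
}
```

If the default charsets do differ, starting the JVM with -Dfile.encoding=UTF-8 is one thing to try, though whether that applies here is a guess.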