screen-scraper support for licensed users

Questions and answers regarding the use of screen-scraper. Only licensed Professional and Enterprise Edition users can post; anyone can read. Licensed users, please contact support with your registered email address for access. This forum is monitored closely by screen-scraper staff. Posts generally receive a response within one business day.

You do not have adobe flash player installed.

Hi Guys,

I'm trying to scrape a website, but all I get back is the message: "You do not have adobe flash player installed."

Any idea how I can solve this problem?

Cheers,

Radek

Strip HTML incomplete when JavaScript is involved...

I have a piece of scraped content that looks like this:

<font color="#000000"><SCRIPT language="JavaScript"> var scolor='#fc050b'; var shimmercount=shimmercount+3; eval('var shimmercolor' +shimmercount+ '="' +scolor+ '"');  document.write("<span id='" + shimmercount + "animate'><b>"); </SCRIPT>Brian2112</b></span></font>

I expect to see only:

Brian2112

but I end up with:

var scolor='#fc050b'; var shimmercount=shimmercount+3; eval('var shimmercolor' +shimmercount+ '="' +scolor+ '"');  document.write(""); Brian2112
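
The result above is consistent with only the tags being removed, which leaves the text inside the SCRIPT element behind. A minimal sketch of one workaround in plain Java, assuming the scraped value is available as a string: remove whole script blocks first, then strip the remaining tags. The class and method names are made up for illustration, and this is ordinary regex handling, not screen-scraper's built-in Strip HTML option.

```java
public class StripMarkup {

    // Removes <script>...</script> blocks together with their contents,
    // then removes any remaining tags and trims surrounding whitespace.
    public static String stripScriptsAndTags(String html) {
        // (?i) = case-insensitive, (?s) = let '.' match line breaks too
        String withoutScripts =
            html.replaceAll("(?is)<script\\b[^>]*>.*?</script\\s*>", "");
        String withoutTags = withoutScripts.replaceAll("(?s)<[^>]+>", "");
        return withoutTags.trim();
    }

    public static void main(String[] args) {
        // Shortened version of the markup from the post
        String scraped = "<font color=\"#000000\"><SCRIPT language=\"JavaScript\"> "
            + "document.write(\"<span id='x'><b>\"); </SCRIPT>"
            + "Brian2112</b></span></font>";
        System.out.println(stripScriptsAndTags(scraped)); // prints Brian2112
    }
}
```

Regex-based stripping is fragile on irregular markup, so treat this as a workaround for this kind of snippet rather than a general HTML cleaner.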

Double Spaces in path to PDF changed to single space in scrape

The screen I'm scraping has about 30 PDFs. I'm able to scrape them all except 4.
The 4 I can't scrape have a double space in the path.
If I view source in IE I get this:
onclick="javascipt: viewFile('8363_501a Agency Jumbo  04-18-11.pdf')">501a Agency Jumbo  04-18-11.pdf
There is a double space after the word Jumbo.

In screen-scraper, under Last Response, I get this:
onclick="javascipt: viewFile('8363_501a Agency Jumbo 04-18-11.pdf')">501a Agency Jumbo 04-18-11.pdf

There is only a single space after Jumbo.

URL with colon in it.

For some reason, the following URL won't return the same HTML as a browser will:

http://www.dlcmgmt.com/property/output/center/detail/id:1439

I checked whether cookies or anything similar were involved, but you can paste that URL into any browser and it works. Run in a scrape, it doesn't. My only thought is that the extra colon is the cause. I tried using %3A instead, but to no avail. Thanks in advance.

HelloRequest followed by an unexpected handshake message.

I'm running the latest alpha (15a) and I see this message on one server, but not on my local machine. The failing machine is behind a firewall; my local machine is not.

This site has an underscore in the domain name, but we also tried using an IP address and got the same error. Any idea what this means?

(2) Login Page: An input/output error occurred while connecting to 'https://pa_allegheny.uslandrecords.com/palr/'. The message was HelloRequest followed by an unexpected handshake message.

Convert date from mmm dd, yyyy (May 12, 2011) to yyyy-mm-dd - Help Appreciated

I have developed the following script, but I cannot get the if statements to work in the month and date sections. Any suggestions would be helpful. Thanks.

//reformat date from mmm dd, yyyy to yyyy-mm-dd

try{
    // Local reference to variables
    String rvwDate = dataRecord.get("Date");

    if(rvwDate != null){
        // Split the Original Date on the " " character into three parts
        String[] dateParts = rvwDate.split(" ");
        session.log( "Month is : " + dateParts[0] );
        String mmValue;
        // Translate the month into MM format
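
The script is cut off here, so the failing if statements are not visible. One common cause in Java is comparing strings with == instead of equals(), but that is only a guess. As an alternative that avoids the month-by-month comparisons altogether, here is a minimal sketch using the standard SimpleDateFormat class. The class and method names are hypothetical, and the input is assumed to look like "May 12, 2011".

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

public class DateReformat {

    // Converts text such as "May 12, 2011" into "2011-05-12".
    // Throws ParseException when the text does not match the expected format.
    public static String toIsoDate(String text) throws ParseException {
        // Locale.ENGLISH so that month abbreviations are read as English names
        SimpleDateFormat input = new SimpleDateFormat("MMM d, yyyy", Locale.ENGLISH);
        input.setLenient(false); // reject impossible dates such as "Feb 31, 2011"
        SimpleDateFormat output = new SimpleDateFormat("yyyy-MM-dd", Locale.ENGLISH);

        Date parsed = input.parse(text.trim());
        return output.format(parsed);
    }

    public static void main(String[] args) throws ParseException {
        System.out.println(toIsoDate("May 12, 2011")); // prints 2011-05-12
    }
}
```

Inside a screen-scraper script, the same two formatter objects could be applied to the value read from dataRecord, with the parse call wrapped in the existing try block.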

Re-using session variable as a pattern parameter

I'm basically trying to search for the value of a session variable in a load of text that I'm trying to extract. Would something like this work:

name="ProductID" value="~@SKU@~" />

Basically, I have a load of size variables that all look the same, with the same markup. I need to find the size that I have saved as a session variable, and then find the SKU using the known size from that variable.

So in practice:

session.variable = size 12

Problem with loop

Hi,

I am trying to loop through a scrape to avoid the stack problems I was getting with recursion. For some reason, the loop starts, then a new loop starts, then another, but none of them advance through the loop numbers until I cancel the scrape. At that point all the loops run through, but they do not do what they are meant to. I have shortened the code below. Any help would be appreciated.

for(int i = 1; i < 9; i++)
{
    litreIteration = i;

    if ( litreIteration == 1) { litre="00"; session.log("in the litre 00"); }
    else if ( litreIteration == 2) { litre="10"; }
    else if ( litreIteration == 3) { litre="11"; }
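
The excerpt stops here, so the cause of the repeated loops cannot be seen. As a side note on the if/else chain only, a minimal sketch of the same mapping as a lookup table in plain Java. The class and method names are hypothetical, and only the three codes visible in the post are included.

```java
public class LitreCodes {

    // Only the three codes shown in the post; the remaining entries
    // are not visible there and are deliberately left out.
    static final String[] CODES = { "00", "10", "11" };

    // Returns the code for a 1-based iteration number,
    // or null when no code is known for that iteration.
    public static String codeFor(int iteration) {
        int index = iteration - 1;
        if (index < 0 || index >= CODES.length) {
            return null;
        }
        return CODES[index];
    }

    public static void main(String[] args) {
        for (int i = 1; i <= CODES.length; i++) {
            System.out.println("iteration " + i + " -> litre " + codeFor(i));
        }
    }
}
```

This does not address the nested-loop behaviour described in the post; it only shortens the mapping from iteration number to litre code.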

Maximum heap size too large

Hi,

I am having a big problem. I was running into the limit on the number of scripts on the stack, so I increased the maximum memory and set maximumScriptsOnStack to 3000. Now screen-scraper will not start; it says that the JVM could not be started because the maximum heap size may be too large. Is there a way to reset this setting? If I can't open the program, I cannot change it back.

I reinstalled screen-scraper.

Regards,
Seamus

Way to exit scrape from script

Hi,

I was looking at the documentation and wondering whether there is a command to exit a scrape from a script. I want to be able to exit the scrape if an extractor pattern finds something on a page, as it means the information does not meet the criteria for a quote. Is there a command to stop the scrape from a script?

Apologies, I found the answer while I was looking up nullOrEmptyString: the session.stopScraping() function.

Regards,
Seamus
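
For anyone with the same question, a minimal sketch of how that check might look. To keep it runnable outside screen-scraper, it uses a small stand-in Session interface; only the stopScraping() name comes from the post above, and log() mirrors the session.log calls used in other posts. The DISQUALIFIER field name and the class name are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class StopOnDisqualifier {

    // Hypothetical stand-in for the session object available to scripts.
    public interface Session {
        void log(String message);
        void stopScraping();
    }

    // Stops the scrape when the extractor pattern captured a disqualifying value.
    // Returns true if the scrape was asked to stop.
    public static boolean stopIfDisqualified(Session session, Map<String, String> dataRecord) {
        String reason = dataRecord.get("DISQUALIFIER"); // hypothetical field name
        if (reason == null || reason.trim().isEmpty()) {
            return false; // nothing matched, carry on scraping
        }
        session.log("Does not meet the quote criteria: " + reason);
        session.stopScraping();
        return true;
    }

    public static void main(String[] args) {
        Session demo = new Session() {
            public void log(String message) { System.out.println(message); }
            public void stopScraping() { System.out.println("stopScraping() called"); }
        };
        Map<String, String> record = new HashMap<String, String>();
        record.put("DISQUALIFIER", "Not eligible");
        stopIfDisqualified(demo, record);
    }
}
```

In a real script the session and dataRecord objects are supplied by screen-scraper, so only the body of stopIfDisqualified would be needed.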