screen-scraper public support

Questions and answers regarding the use of screen-scraper. Anyone can post. Monitored occasionally by screen-scraper staff.

Writing scraped data to a MySQL database?

Has anyone written scraped data to a MySQL database?

Do you have a script that you could share with us?

(I have not been able to find documentation that walks a person through this.)

Appreciatively,
Peter
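
A minimal sketch of one way to do this, not an official screen-scraper script: plain JDBC with a parameterized INSERT. The class name, the table name (books), the column names, the JDBC URL, and the credentials are all assumptions made for illustration, and the MySQL JDBC driver jar would need to be on the classpath that screen-scraper uses.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; none of these names come from screen-scraper itself.
class ScrapedRowWriter {

    // Builds "INSERT INTO table (a, b) VALUES (?, ?)" for the given columns.
    // Table and column names are not escaped, so they must come from your
    // own code and never from scraped text.
    static String buildInsertSql(String table, List<String> columns) {
        if (columns.isEmpty()) {
            throw new IllegalArgumentException("at least one column is required");
        }
        String names = String.join(", ", columns);
        String marks = String.join(", ", Collections.nCopies(columns.size(), "?"));
        return "INSERT INTO " + table + " (" + names + ") VALUES (" + marks + ")";
    }

    // Inserts one scraped record. Values are bound as parameters, so quotes
    // in scraped text cannot break the statement.
    static int insertRow(Connection conn, String table, Map<String, String> row)
            throws SQLException {
        List<String> columns = new ArrayList<>(row.keySet());
        try (PreparedStatement stmt = conn.prepareStatement(buildInsertSql(table, columns))) {
            for (int i = 0; i < columns.size(); i++) {
                stmt.setString(i + 1, row.get(columns.get(i)));
            }
            return stmt.executeUpdate();
        }
    }

    // Example wiring; the URL, user, and password are placeholders.
    static void example() throws SQLException {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("title", "Some Book Title");
        row.put("price", "12.95");
        try (Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/scrapes", "user", "password")) {
            insertRow(conn, "books", row);
        }
    }
}
```

Inside a screen-scraper script, the values placed in the map would presumably come from session.getVariable( ... ) calls instead of the hard-coded strings shown in example().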

Invoking screen-scraper from Internet Explorer?

Is it possible to invoke a screen-scraper scraping session from Internet Explorer?

For example, could you use a Links entry or a toolbar button in Internet Explorer (e.g., coded in JavaScript) to send the current web URL to screen-scraper and execute a scraping session using a specific scrapeable file? If so, do you have any coding examples?

A little context may help: bookstore websites present a unique web page for each book, but they present the information in a standard format (which makes it scrapeable).

Best regards,
Peter

Ignoring HTML codes

I'm using an evaluation copy of screen-scraper professional (Version 2.7.2).

I would like to set up an extraction pattern that pulls the URL out of the sample below (without the extra HTML code). I reviewed other entries in this forum and saw a reference to a "Strip HTML" checkbox under the Advanced tab for a given extraction pattern, but I do not see that checkbox listed.

Is there also a way to do this with a regular expression? Please describe.

Appreciatively,
Peter
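
On the regular-expression part, a rough sketch only: the sample HTML did not come through in the post, so this assumes the URL sits in the href attribute of an anchor tag. It uses the standard java.util.regex classes; the class and method names are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of screen-scraper.
class HrefExtractor {

    // Matches href="..." or href='...' inside an <a> tag and captures only
    // the URL (group 2). Group 1 remembers which quote character was used.
    private static final Pattern HREF = Pattern.compile(
            "<a\\b[^>]*?\\bhref\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE);

    // Returns the first link URL found in the HTML, or null if there is none.
    static String firstUrl(String html) {
        Matcher m = HREF.matcher(html);
        return m.find() ? m.group(2) : null;
    }

    // Removes anything that looks like a tag and trims the remaining text.
    static String stripTags(String html) {
        return html.replaceAll("<[^>]*>", "").trim();
    }
}
```

A regular expression like this is fine for predictable markup such as a fixed page template, but it is not a general HTML parser and will miss unquoted attribute values.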

_____________

HTTP Client - Security Problem

Hello Screen Scrapers,

I have created a scraping session to connect to a secure site. When I visit the site using IE, it checks that my Trusted Connect Smart Card is inserted, prompts me to choose a key (the digital certificate on the smart card) from a list, and asks me to enter my PIN.

If I configure my scraping session to use Internet Explorer as the HTTP client and run the scrape from the GUI, it prompts me for the key and PIN and happily scrapes away.

Image Verification

Hi guys, just trying screen-scraper out and am impressed so far.

One of the pages I want to scrape is behind a login with image verification (i.e., you need to enter some text generated in an image to log in). Is there a way to work around this? Maybe something like this: SS loads the image, displays it or saves it to a location, waits for my input after I view the image, then moves on? Or are there other ways to handle this? Help much appreciated, many thanks.

Scraping through Advanced Search and Java

I am looking at a gold mine of Canadian Business information and I would love to build a scrape that allows me to submit a set of variables to invoke a Detailed Search on the following page:

http://strategis.ic.gc.ca/app/ccc/search/cccSearch.do?language=eng&porta...

The result of the search is a URL, example:

Java Null error despite scripts working and data extracted

I have a simple scraping session pulling two pieces of data from one table row. The extractor patterns are working correctly (when I apply each pattern to the last scraped data, it displays correctly for both extractor patterns). I have also edited each token multiple times to verify that "Save in session variable" is checked for both.

I am running my Write_Data_To_File script after each pattern is applied.

No matter what I check, the log always shows:

java.lang.NullPointerException BSF info:null at line: 0 column: columnNo

Please suggest what may be causing this?
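
One possible cause, offered as a guess rather than a diagnosis: if the script runs after each pattern is applied, it may ask for a session variable that has not been set yet. Assuming session.getVariable returns null in that case, passing that null to Writer.write(String) throws a NullPointerException. The sketch below shows a null-safe way to write values; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.Writer;

// Hypothetical helper, not part of screen-scraper.
class SafeWrite {

    // Converts a possibly-null value to text so it can be written safely.
    static String textOrEmpty(Object value) {
        return value == null ? "" : value.toString();
    }

    // Writes one tab-separated line; null values become empty fields
    // instead of causing a NullPointerException.
    static void writeLine(Writer out, Object... values) throws IOException {
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                out.write("\t");
            }
            out.write(textOrEmpty(values[i]));
        }
        out.write("\n");
    }
}
```

If this is the cause, running the write script once after both patterns have been applied, rather than after each one, may also avoid the error.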

The log file

I was wondering if there is a way to turn off the log text under the scraping sessions tab. I am hitting about 50,000 records, and rendering that much text to the screen will take up a lot of computer resources.

Multiple Server Instances

It appears possible to have multiple instances of the server on one machine -- but can they each have their own database?

We have one machine which is a staging AND a production box with a different web server instance/environment for each. We'd like a database of scrapes for staging and then when they're QA'd, we'd like to move those scrapes up to production.

Right now it looks like all instances of SS on one box point to the same database.

Are there any official recommendations for something like this?

NullPointerException Error

Running this script:

// Output a message to the log so we know that we'll be writing the text out to a file.
session.log( "Writing data to a file." );

// Create a FileWriter object that we'll use to write out the text.
out = new FileWriter( "hockey_stats.txt" );

// Write out the text.
out.write( session.getVariable( "modano_name" ) );
//out.write( session.getVariable( "modano_goal" ) );
//out.write( session.getVariable( "modano_assist" ) );

// Close the file.
out.close();

with this pattern text: