screen-scraper support for licensed users

Questions and answers regarding the use of screen-scraper. Only licensed Professional and Enterprise Edition users can post; anyone can read. Licensed users, please contact support with your registered email address for access. This forum is monitored closely by screen-scraper staff. Posts generally receive a response within one business day.

How many threads can a .NET external program run with the screen-scraper Windows service?

I created a VB.NET program that issues calls to run scraping sessions. Now I want to create about 3 threads in the program. The threads run concurrently and each thread will run a list of scraping sessions one after the other. The reason I want to do this is because a few of the scrapes take several minutes to complete and I want to run other scrapes at the same time, so that the long-running scrapes don't hold them up. Can the screen-scraper Windows service handle more than one request at a time? What is the maximum number of threads it can handle?

Gary Frank on 06/22/2010 at 9:33 am
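
The layout described in this question (a few worker threads, each running its own list of scraping sessions one after another) can be sketched as follows. This is only an illustration in Python, not the poster's VB.NET code: `run_session` is a hypothetical stand-in for whatever call asks screen-scraper to run a session, and the sketch says nothing about how many concurrent requests the Windows service will actually accept.

```python
import threading


def run_session(name, log, lock):
    # Hypothetical placeholder: a real program would ask the screen-scraper
    # server to run the named scraping session here and wait for it to finish.
    with lock:
        log.append(name)


def run_queue(session_names, log, lock):
    # Each worker thread runs its own sessions strictly one after the other.
    for name in session_names:
        run_session(name, log, lock)


def run_all(queues):
    # Start one thread per queue so a long scrape in one queue
    # does not hold up the sessions waiting in the other queues.
    log = []
    lock = threading.Lock()
    workers = [
        threading.Thread(target=run_queue, args=(queue, log, lock))
        for queue in queues
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return log


if __name__ == "__main__":
    # Assumed session names, for illustration only.
    queues = [["long_scrape_a"], ["short_1", "short_2"], ["short_3", "short_4"]]
    print(run_all(queues))
```

Sessions within one queue always finish in order; the interleaving between queues is not deterministic.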

Ajax Results page

Hi guys-

I have started working on another website for a client and am having a little difficulty. I am able to walk through all the steps via the proxy, but can't figure out which page is posting the results table "Melhores ofertas por pessoa" (best offers per person). The results table loads completely only after all the airline rates load. Any ideas on where to look?

The website address is

"http://passagens-aereas.viajanet.com.br/trechos-nacionais/passagem-aerea_rio-de-janeiro_sao-paulo"

jsncochran on 06/21/2010 at 11:31 pm

New line character should be replaced by single space character

While scraping a site I noticed the problem that a "new line" character is being replaced by an empty character.

Example:
The last response:

We love the chill attitude these jeans have. Roll 'em up or roll 'em<br />down for the ultimate laissez-faire look.

The extracted data:
We love the chill attitude these jeans have. Roll 'em up or roll 'emdown for the ultimate laissez-faire look.

edgar on 06/16/2010 at 3:49 pm
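
The behavior this post asks for (a line break turning into a single space instead of vanishing) can be illustrated with a small post-processing step. This is a sketch in Python rather than a screen-scraper script, and it assumes the break arrives as a `<br />` tag as in the example above; `br_to_space` is a hypothetical helper name.

```python
import re

# Matches <br>, <br/>, or <br /> in any letter case, plus surrounding whitespace.
BR_TAG = re.compile(r"\s*<br\s*/?>\s*", re.IGNORECASE)


def br_to_space(html_fragment):
    # Replace every line-break tag with exactly one space, then trim the ends.
    return BR_TAG.sub(" ", html_fragment).strip()


if __name__ == "__main__":
    raw = "Roll 'em up or roll 'em<br />down for the ultimate laissez-faire look."
    print(br_to_space(raw))
```

Applied before the tags are stripped, this keeps "roll 'em" and "down" as separate words.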

Can't find webrefid key value

Hi Support,

I can't figure out how the "webrefid" key value is generated. I had a similar problem in the past with another website, and I was able to figure it out by looking at the proxy session log for a redirect, but this time I can't find any clue in the log. The screen scraping is configured in the following steps:

Step 1

http://www.smartlaw.org/index.cfm

As you suggested in another post, I set the parameters as follows:

Adrianjay on 06/12/2010 at 6:08 pm

Referrer

Just curious-- Is there any subtle "referrer" difference between letting scrapeableFiles run in sequence versus chaining them together with scripts? Moreover, would a single script calling several scrapeableFiles share the same referrer for each scrapeableFile's request (as if each scrapeableFile gets pushed onto the stack and then popped before the next one runs)?

It *looks* like none of those things matter, that the referrer is always exactly what you'd expect it to be, given the scraped order of files...

timv on 06/08/2010 at 9:18 pm

500 null / JRun Servlet Error

Hi Support:

I'm trying to scrape the following site:

http://www.smartlaw.org/

but can't seem to get past the first page.

On page:

http://www.smartlaw.org/

I input the following:

Step 1: Residential Lease-Tenant

Step 2: -No Preference

Step 3: English/ Click Find a Lawyer

but receive the following error message on the screen:

500 null

and

JRun Servlet Error

on the tab of the screen.

Adrianjay on 06/04/2010 at 7:19 pm

Bookstore scrape

Hi Guys-

I recently started working on a new website that needs scraping. The client wants to scrape each book based on the term, department, course, and section. I tried walking through the steps via the proxy, but can't figure out how the dropdowns are being populated. I figured I would scrape all that information, then loop it back through to get the books for each department, course, and section. Any ideas on how to proceed?

Here is the website:

jsncochran on 06/04/2010 at 12:59 am

Memory Leak

There seems to be a pretty significant memory leak in the newest alpha version of the screen-scraper workbench (4.5.64a). I don't have to do much before the memory usage hits 100% and stays there. I do have quite a few scrapes in the workbench, and I have been working with some large ones, but I haven't had this kind of trouble before. Let me know if you need more details.

chrishathaway on 06/03/2010 at 2:30 pm

6 comments

issues doing scraping

Hi,
I'm trying to scrape a site by passing it a list of securities from a .csv file. The security name is at the top of the page and the owners are below. I want to pass each security to the site and have it return the security as well as the list of owners. I made progress with getting the security returned along with its owners by creating two extractor patterns. The problem I'm having is that if I send a list of securities, it returns results for the last one only, whereas I want it to return each security along with its owners.

dapor on 05/31/2010 at 5:01 am
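
The goal stated in this post (one result per security, each with its owners) amounts to appending a record on every pass through the list instead of keeping only the latest one. The sketch below shows that pattern in Python. It is not screen-scraper code and does not diagnose the poster's actual session: `lookup_owners` is a hypothetical placeholder for the page request and extractor patterns, and the CSV is assumed to hold one security per row in the first column.

```python
import csv
import io


def lookup_owners(security):
    # Hypothetical placeholder: a real scrape would request the page for this
    # security and return the owners matched by the extractor pattern.
    return [f"{security} owner 1", f"{security} owner 2"]


def scrape_all(csv_text):
    # Collect one record per security so earlier results are never overwritten.
    results = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row or not row[0].strip():
            continue  # skip blank lines in the input file
        security = row[0].strip()
        results.append({"security": security, "owners": lookup_owners(security)})
    return results


if __name__ == "__main__":
    # Assumed sample input, for illustration only.
    for record in scrape_all("AAA\nBBB\nCCC\n"):
        print(record["security"], record["owners"])
```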

DB corruption... :(

Hey guys. I know this used to be a problem with alpha updates back in the 20a-30a range, but have there been any recent issues with SS dropping the scraping sessions from the database? Two times in a row this evening, I opened my workbench and my scrapes were gone, leaving only the scripts.

To be fair, I had Actions enabled at the time, and SS had frozen/died on the first DB drop I experienced.

Any insider info on possible things to avoid?

timv on 05/24/2010 at 10:30 pm

8 comments
