screen-scraper public support

Questions and answers regarding the use of screen-scraper. Anyone can post. Monitored occasionally by screen-scraper staff.

Problem with SSL and SS Pro

I am trying to scrape a site that uses SSL, for which I have a username and password. I set up the proxy to use port 8777 and then set up IE to use a proxy server on port 8777 for HTTP and HTTPS requests.

After doing this, I get multiple problems. The browser tells me the certificate does not match (it pulls the cert from screen-scraper.com), so I accept it anyway. Then it either gives me an application error or sends me back to the login page.

Any help you can provide would be great.

BTW, I have tried Firefox with this as well.

Can SS extract this 'BrowsePDFServlet' Object?

Hello, and thanks in advance for your help.

I'm trying to scrape specific files from the US Patent site. They are presented openly one by one, but of course I would like to search through them programmatically.

At this page:
http://portal.uspto.gov/external/portal/!ut/p/_s.7_0_A/7_0_CH/.cmd/ad/.ar/sa.getBib/.c/6_0_69/.ce/7_0_1ET/.p/5_0_18L/.d/1?selectedTab=ifwtab&isSubmitted=isSubmitted&dosnum=09893809

You can find this code:

Another Beginner question (sub extractions)

Hi,

First off, I would like to thank you for looking at this post. I have a little problem and am wondering if anyone can lend a helping hand.

The problem is best described with an example, so here goes.

I am looking at a site that lists a DVD. It shows the name of the DVD and either the price or a note that it is out of stock. I want to create a session that extracts the name of the DVD and then either the price OR the comment "out of stock", whichever is there.
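One way to picture the "price OR out of stock" requirement is as a single capture group with two alternatives. The sketch below uses plain Java regular expressions, not screen-scraper's extractor-pattern syntax, and the HTML markup (`<h2>`, `class="status"`) and the class name `DvdListing` are invented for illustration; a real page will need its own pattern.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DvdListing {
    // Hypothetical markup: <h2>NAME</h2> <span class="status">PRICE or "out of stock"</span>
    // Group 1 is the DVD name; group 2 is either a price or the out-of-stock note.
    private static final Pattern ITEM = Pattern.compile(
        "<h2>(.*?)</h2>\\s*<span class=\"status\">\\s*"
            + "(\\$?\\d+(?:\\.\\d{2})?|out of stock)\\s*</span>",
        Pattern.CASE_INSENSITIVE);

    // Returns {name, priceOrStatus}, or null when the snippet does not match.
    static String[] extract(String html) {
        Matcher m = ITEM.matcher(html);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1).trim(), m.group(2) };
    }

    public static void main(String[] args) {
        String[] samples = {
            "<h2>Blade Runner</h2> <span class=\"status\">$9.99</span>",
            "<h2>Alien</h2> <span class=\"status\">Out of stock</span>"
        };
        for (String html : samples) {
            String[] result = extract(html);
            System.out.println(result[0] + " -> " + result[1]);
        }
    }
}
```

Keeping both alternatives in one group means the calling code always reads the same field and only has to check whether it holds a price or the stock note.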

Broken pipe

Hi,

Screen-scraper just stops scraping without giving any indication of why. We call screen-scraper from a PHP script. Execution time and memory are both set to high values in php.ini. I see a lot of broken pipe errors in screen-scraper's error.log. Screen-scraper is run through Tor + Privoxy. Any ideas?

Error in method invocation - Script doesn't write to data file

I'm pulling the data I want, but I can't write it to a data file.

Here is my script (I'm not a programmer, but I can normally work my way around with examples):

// Output a message to the log so we know that we'll be writing the text out to a file.
session.log( "Writing data to a file." );

// Create a FileWriter object that we'll use to write out the text.
out = new FileWriter( "estellamntrnchre.csv" );

// Write out the text.
out.write(

optimising

OK, I'm a bit pissed because I can't optimise my client-side code enough. I've broken it down:

I get the IDs, then the data objects separately.
Why oh why am I still getting
"Exception in thread main java.lang.OutOfMemoryError"

Three variables read from three text files to generate URLs?

I'm trying to read three sets of variables from three .txt files, in order to generate the URLs for scraping.

So, for instance, I have three text files:
1) a list of domains
2) a list of dates (yyyy-mm-dd format)
3) a list of search types (a and n are the available types that I currently use)

I want to be able to create a URL that will incorporate each domain for each of the dates and search types.

Here's a sample URL:
http://sitetoscrape.com/detail/?ns={DOMAIN}&date={DATE}&net=9&changes=15&act={SEARCH TYPE}
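Generating one URL per domain, date, and search type amounts to three nested loops over the three lists. The sketch below is plain Java rather than a screen-scraper script; the class name `UrlCombiner`, the helper names, and the sample values are assumptions for illustration, and only the URL template comes from the post. It assumes the values need no URL encoding.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class UrlCombiner {
    // Template from the post, with the three placeholders replaced by %s.
    static final String TEMPLATE =
        "http://sitetoscrape.com/detail/?ns=%s&date=%s&net=9&changes=15&act=%s";

    // Builds one URL for every (domain, date, search type) combination.
    static List<String> buildUrls(List<String> domains, List<String> dates, List<String> types) {
        List<String> urls = new ArrayList<>();
        for (String domain : domains) {
            for (String date : dates) {
                for (String type : types) {
                    urls.add(String.format(TEMPLATE, domain, date, type));
                }
            }
        }
        return urls;
    }

    // Reads a text file, one value per line, skipping blank lines.
    static List<String> readNonBlankLines(Path file) throws IOException {
        List<String> values = new ArrayList<>();
        for (String line : Files.readAllLines(file)) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                values.add(trimmed);
            }
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        List<String> domains, dates, types;
        if (args.length == 3) {
            // Paths to the three text files, passed on the command line.
            domains = readNonBlankLines(Paths.get(args[0]));
            dates = readNonBlankLines(Paths.get(args[1]));
            types = readNonBlankLines(Paths.get(args[2]));
        } else {
            // Made-up sample data so the sketch runs without any files.
            domains = Arrays.asList("example.com", "example.org");
            dates = Arrays.asList("2006-06-17");
            types = Arrays.asList("a", "n");
        }
        for (String url : buildUrls(domains, dates, types)) {
            System.out.println(url);
        }
    }
}
```

The number of URLs is the product of the three list sizes, so large input files multiply quickly.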

Simple VBScript date formatting question

It is my understanding that I have to pass a date to a POST parameter as a string, not a date.

If today's date is 6/18, I can get a string "06/17/2006" using:

Call RunnableScrapingSession.SetVariable( "YESTR_DATE", CStr(DateAdd( "d", -1, Date) ) )

but I am stuck there.

1. How can I get yesterday's date into a MMDDYY string? ("061706")

2. Is there an entry-level VBScript site or book that can be recommended? What I seem to find online are just re-hashes of the Microsoft site.

Thanks,

Jim
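The question asks about VBScript, where the usual approach is to zero-pad `Month()` and `Day()` and take the last two digits of `Year()`. For comparison, here is a minimal sketch of the same MMDDYY formatting in Java; the class and method names are invented for illustration, and this is not a screen-scraper API.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

class YesterdayStamp {
    // MM = two-digit month, dd = two-digit day, yy = two-digit year.
    private static final DateTimeFormatter MMDDYY = DateTimeFormatter.ofPattern("MMddyy");

    // Returns the day before the given date as an MMDDYY string.
    static String dayBefore(LocalDate today) {
        return today.minusDays(1).format(MMDDYY);
    }

    public static void main(String[] args) {
        // The example from the post: today is 6/18/2006.
        System.out.println(dayBefore(LocalDate.of(2006, 6, 18))); // prints 061706
        // The same for the actual current date.
        System.out.println(dayBefore(LocalDate.now()));
    }
}
```

Passing the date in as a parameter, rather than reading the clock inside the method, makes the result easy to check against a known day.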

Larger scrape patterns

I had a problem with SS last night that forced me to scrap my whole project and start again :(

I had an extractor pattern that basically looked like this:

A

B

...

Each block (A, B...) was basically copied and pasted from the scraped page itself, and then modified to contain a single token. So in this example, I'm getting two tokens out of the HTML.

Easy question from a new user

Hey folks,

First of all, thanks for taking a look at my query. Secondly, just as background information, I don't know any Java or VBScript (I remember some C/C++, but not too much).

Q: Is there any way to use a wildcard in the Extractor Patterns?

DK