screen-scraper public support

Questions and answers regarding the use of screen-scraper. Anyone can post. Monitored occasionally by screen-scraper staff.

Logging in to Google Analytics & YouTube

Hello

I am demoing the pro version of screen-scraper for a project that requires logging in to my Google Analytics and YouTube accounts in order to scrape a bunch of data. However, after working through tutorial #2 and reviewing the forums and other documentation, I am still unable to figure out how to get screen-scraper to log in to these sites. Any help you can provide is greatly appreciated.

Google Analytics Login Page:

While Loop

Wondering if anyone has an example of doing a "while" or "until" type of loop.

I have a table that I scrape from multiple pages. On each page it has a different number of rows. I'd like to extract each row.

I see that I can't use a sub-extractor pattern for this, but need to run a manual extraction instead.

I'm just not experienced enough with this to follow the example of manual extraction.
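As a sketch of the "loop until no more rows" idea in plain Java (the HTML and row pattern below are invented for illustration; inside screen-scraper you would use its own extractor API, but the loop shape is the same):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RowLoop {
    // Extracts the contents of every <td> in every <tr>, however many
    // rows the page happens to have.
    static List<String> extractRows(String html) {
        List<String> rows = new ArrayList<String>();
        Matcher m = Pattern.compile("<tr><td>(.*?)</td></tr>").matcher(html);
        while (m.find()) {      // runs once per row; stops when no match remains
            rows.add(m.group(1));
        }
        return rows;
    }

    public static void main(String[] args) {
        String page = "<tr><td>a</td></tr><tr><td>b</td></tr><tr><td>c</td></tr>";
        System.out.println(extractRows(page)); // one list entry per row
    }
}
```

The key point is that the loop's exit condition is "no more matches", so a table with 3 rows and a table with 30 rows are handled by the same code.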

Writing to a CSV

When I scrape a list of websites for data, it writes the data to a CSV just fine (actually, when the number is more than 3 digits, like 4,000, it puts the "4" in column A and the "000" in column B, but I don't mind that), but it doesn't skip a line when there is no data to be found. So if the second website has no data but every other site does, all my information is offset by one. I would rather have a 0 when no data is found, like a placeholder, so my websites and data match up.

I have attached my code.

Thanks!
Dan

FileWriter out = null;

try
{
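Both symptoms described above can be handled at write time. The "4" / "000" split happens because the scraped number contains a comma ("4,000") and the field isn't quoted, and the offset happens because nothing is written for an empty result. A hedged sketch (file name and field names are invented; only the quoting and the "0" placeholder are the point):

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

public class CsvHelper {
    // Quote every field so an embedded comma (e.g. "4,000") stays in one
    // column, and substitute "0" when the scrape found nothing.
    static String csvField(String value) {
        if (value == null || value.trim().length() == 0) {
            value = "0"; // placeholder keeps each row aligned with its website
        }
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    static void writeRow(Writer out, String site, String number) throws IOException {
        out.write(csvField(site) + "," + csvField(number) + "\n");
    }

    public static void main(String[] args) throws IOException {
        Writer out = new StringWriter(); // the attached script would use its FileWriter here
        writeRow(out, "example.com", "4,000");
        writeRow(out, "no-data.example", null); // empty result still produces a row
        System.out.print(out);
    }
}
```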

Any way to clear cookies without the enterprise edition?

Hi!

I am trying to scrape information from http://opportunities.osteopathic.org/search/search.cfm?searchType=1
The search session establishes a random CFID, TOKEN, and JSESSIONID.
I am scraping the cookies and reusing them in my scrapes, but unfortunately it does not work. I get the following:

Starting scraper.
Running scraping session: AOA IM Residency
Processing scripts before scraping session begins.
Processing script: "Initializing AOA Session"
Scraping file: "Search Results"
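For what it's worth, pulling the session tokens out of raw Set-Cookie headers so they can be re-sent looks like the sketch below (the header text is a made-up sample; normally screen-scraper carries cookies across requests in the same scraping session automatically, so manual handling is only needed when that breaks down):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CookieGrab {
    // Pulls name=value pairs such as CFID, TOKEN and JSESSIONID out of raw
    // Set-Cookie header text so they can be re-sent on later requests.
    static Map<String, String> parseCookies(String headers) {
        Map<String, String> cookies = new HashMap<String, String>();
        Matcher m = Pattern.compile("Set-Cookie: ([^=\\s]+)=([^;\\r\\n]*)").matcher(headers);
        while (m.find()) {
            cookies.put(m.group(1), m.group(2));
        }
        return cookies;
    }

    public static void main(String[] args) {
        String headers = "Set-Cookie: CFID=1001; path=/\r\n"
                       + "Set-Cookie: TOKEN=ab12cd; path=/\r\n"
                       + "Set-Cookie: JSESSIONID=xyz42; path=/\r\n";
        System.out.println(parseCookies(headers));
    }
}
```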

Input from CSV scrape results has next page how to iterate?

I'm scraping using a file of zipcodes to build the URL, and the scrape works when the results fit on one page. When there is a next page, I can't figure out how to get the script to scrape all of the next pages BEFORE moving on to the next zipcode from the CSV. So the result I get is 1 page scraped per zipcode, even when a zipcode has many pages.

Here is the code that I'm working with so far. If anyone can help, please jump in!

// If using an offset, this number should be the first search results page's offset, be it 0 or 1.
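The ordering the post is after, exhausting every results page before advancing to the next zipcode, is just a nested loop. A plain-Java sketch under the assumption that a "next page" match sets a flag (`pagesFor` is a stub invented for illustration; a real script would get this from an extractor pattern):

```java
import java.util.Arrays;
import java.util.List;

public class ZipPager {
    // Stub standing in for "did the results page have a next-page link?".
    // In a real scraping session this state would come from a session
    // variable set (or not set) by the next-page extractor pattern.
    static int pagesFor(String zip) { return zip.equals("90210") ? 3 : 1; }

    static int scrapeAllPages(List<String> zips) {
        int pagesScraped = 0;
        for (String zip : zips) {               // outer loop: one zipcode at a time
            int page = 1;
            boolean hasNext = true;
            while (hasNext) {                   // inner loop: drain every results page
                pagesScraped++;                 // (a real script would scrape the file here)
                hasNext = page < pagesFor(zip); // stop when no next-page link matched
                page++;
            }
        }
        return pagesScraped;
    }

    public static void main(String[] args) {
        System.out.println(scrapeAllPages(Arrays.asList("90210", "10001")));
    }
}
```

The inner `while` only exits when a page has no next-page link, so the outer loop never advances to the next zipcode early.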

maximum number of scripts allowed on the stack was reached

Hello all,

I'm trying to scrape a site that has around 600 pages of search results with 10 rows of data per page. For each row of data I'm following the URL and grabbing the results.

The problem is that I'm encountering this error:

Scraping Session: ERROR--halting the scraping session because the maximum number of scripts allowed on the stack was reached. Current number on the stack is: 50.

To work around this, I have created the following script based on the recommendations in the blog post (http://blog.screen-scraper.com/2008/07/07/large-data/), but I'm still getting the error.
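The stack error quoted above typically comes from a "next page" script that re-invokes itself, leaving one frame on the script stack per page; at 600 pages that blows past the limit of 50. The shape of the iterative alternative from the blog post, sketched in plain Java (the session-variable comments describe what a real script would do; they are not executed here):

```java
public class IterativePaging {
    // A next-page script that calls itself leaves one frame on the script
    // stack per page. Driving all pages from a single loop in one script
    // keeps the stack depth constant no matter how many pages there are.
    static int scrapePages(int totalPages) {
        int scraped = 0;
        for (int page = 1; page <= totalPages; page++) {
            // in a real script: set a PAGE session variable here, then
            // scrape the search-results file for that page
            scraped++;
        }
        return scraped;
    }

    public static void main(String[] args) {
        System.out.println(scrapePages(600)); // all 600 pages, one stack frame
    }
}
```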

Adding Sub-Extractor Patterns isn't returning results.

I am attempting to scrape a web page for information regarding residency programs. Currently, I have an extractor pattern (DATARECORD) and the following sub-extractor patterns:
SPECIALTY
PROGRAMNAME
ADDRESS1
ADDRESS2
PHONE
APPROVED
OGM1
OGM2
OGM3
SETTING

Any sub-extractor patterns added after these do not record information when the scraping session is run. Oddly, when I first set up the new sub-extractors and run a test of the pattern, results do come up; when I run the same test after running the scraping session, there are no results.


Thanks!

Steven

Redirection

I'm currently trying to get a number that is written in the title of a certain website (compete.com) and write it out to a CSV. When I test the extractor pattern (peppercom.com ~@UVs@~ UVs for September 2011 | Compete) it works: I see 35 where it says ~@UVs@~, sequence 0. But when I start the scraping session (to actually write the file out), instead of requesting the site I want (http://siteanalytics.compete.com/peppercom.com/), it automatically redirects me to http://www.compete.com/ie6/. I found that the redirect target is written in the page just below the URL I'm interested in.

Download CSV - Getting [Binary Data]

I am trying to pull down a CSV file from Google AdSense. The problem is that the data doesn't show; instead I get [Binary Data]. If I view the data returned in Fiddler it looks good. I upgraded to the newest version, 5.5.22a, and still get the problem. I have a feeling it is because the content-type in the headers is set to unknown. Below is the full response. Any ideas?

Thanks, Jason

HTTP/1.1 200 OK
Set-Cookie: AdSenseLocaleSession=en_US; Path=/adsense/v3/; Secure; HttpOnly
Content-Type: unknown; charset=UTF-16LE
Content-Disposition: attachment; filename=report.csv
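Given those headers, the body is most likely ordinary text encoded as UTF-16LE under an unrecognised content type, which would explain the [Binary Data] display. If the raw response bytes can be reached, decoding them with the charset named in the header recovers the CSV; a minimal sketch (the sample bytes here are fabricated):

```java
import java.nio.charset.Charset;

public class DecodeReport {
    // The header says charset=UTF-16LE, so the "binary" body is really
    // text in that encoding. Decoding the raw bytes with the right
    // charset yields the readable CSV.
    static String decode(byte[] raw) {
        return new String(raw, Charset.forName("UTF-16LE"));
    }

    public static void main(String[] args) {
        byte[] raw = "date,earnings\n".getBytes(Charset.forName("UTF-16LE"));
        System.out.print(decode(raw));
    }
}
```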

Can I use the variables declared in a script in another script?

I tried creating a public class and declaring the variables inside it as public static, but I'm getting this error when the script is called:

An error occurred while processing the script: 2 Once if pattern matches
The error message was: class bsh.ParseException (line 1): public-- Encountered "public" at line 1, column 1.

Thanks in advance
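The ParseException quoted above is the interpreter rejecting the `public` modifier at the top of the script; scripts are run as interpreted statements, not compiled class files. The usual way to pass a value from one screen-scraper script to another is a session variable set in the first script and read in the second. The `Session` class below is only a hypothetical stand-in so the sketch runs outside the tool; real scripts receive a `session` object from the runtime:

```java
import java.util.HashMap;
import java.util.Map;

public class SessionDemo {
    // Hypothetical stand-in for screen-scraper's session object, just
    // so this example is self-contained and runnable.
    static class Session {
        private final Map<String, Object> vars = new HashMap<String, Object>();
        void setVariable(String name, Object value) { vars.put(name, value); }
        Object getVariable(String name) { return vars.get(name); }
    }

    public static void main(String[] args) {
        Session session = new Session();
        session.setVariable("CITY", "Boston");      // done in the first script
        Object city = session.getVariable("CITY");  // done in a later script
        System.out.println(city);
    }
}
```

Because the variable lives on the session rather than in the script's own scope, no class or `public static` field is needed to share it.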