screen-scraper public support

Questions and answers regarding the use of screen-scraper. Anyone can post. Monitored occasionally by screen-scraper staff.

Logging in to Google Analytics & YouTube

Hello

I am demoing the pro version of screen-scraper for a project that requires logging in to my Google Analytics and YouTube accounts in order to scrape a bunch of data. However, after working through tutorial #2 and reviewing the forums and other documentation, I am still unable to figure out how to get screen-scraper to log in to these sites. Any help you can provide is greatly appreciated.

Google Analytics Login Page:

While Loop

Wondering if anyone has an example of doing a "while" or "until" type of loop.

I have a table that I scrape from multiple pages. On each page it has a different number of rows. I'd like to extract each row.

I see that I can't use a sub-extractor pattern for this, but need to run a manual extraction instead.

I'm just not experienced enough with this to follow the example of manual extraction.
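As a sketch of the "loop until no more rows" idea in plain Java (the HTML and row pattern below are invented for illustration; inside screen-scraper you would use its own extractor API, but the loop shape is the same):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RowLoop {
    // Extracts the contents of every <td> in every <tr>, however many
    // rows the page happens to have.
    static List<String> extractRows(String html) {
        List<String> rows = new ArrayList<String>();
        Matcher m = Pattern.compile("<tr><td>(.*?)</td></tr>").matcher(html);
        while (m.find()) {      // runs once per row; stops when no match remains
            rows.add(m.group(1));
        }
        return rows;
    }

    public static void main(String[] args) {
        String page = "<tr><td>a</td></tr><tr><td>b</td></tr><tr><td>c</td></tr>";
        System.out.println(extractRows(page)); // one list entry per row
    }
}
```

The key point is that the loop's exit condition is "no more matches", so a table with 3 rows and a table with 30 rows are handled by the same code.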

Writing to a CSV

When I scrape a list of websites for data, it writes the data to a CSV just fine (actually, when the number is more than 3 digits, like 4,000, it puts the "4" in column A and the "000" in column B, but I don't mind that), but it doesn't skip a line when there is no data to be found. So if the second website has no data but every other site does, all my information is offset by one. I would rather have a 0 when no data is found, like a placeholder, so my websites and data match up.

I have attached my code.

Thanks!
Dan

FileWriter out = null;

try
{
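Both symptoms described above can be handled at write time. The "4" / "000" split happens because the scraped number contains a comma ("4,000") and the field isn't quoted, and the offset happens because nothing is written for an empty result. A hedged sketch (file name and field names are invented; only the quoting and the "0" placeholder are the point):

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

public class CsvHelper {
    // Quote every field so an embedded comma (e.g. "4,000") stays in one
    // column, and substitute "0" when the scrape found nothing.
    static String csvField(String value) {
        if (value == null || value.trim().length() == 0) {
            value = "0"; // placeholder keeps each row aligned with its website
        }
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    static void writeRow(Writer out, String site, String number) throws IOException {
        out.write(csvField(site) + "," + csvField(number) + "\n");
    }

    public static void main(String[] args) throws IOException {
        Writer out = new StringWriter(); // the attached script would use its FileWriter here
        writeRow(out, "example.com", "4,000");
        writeRow(out, "no-data.example", null); // empty result still produces a row
        System.out.print(out);
    }
}
```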

Any way to clear cookies without the enterprise edition?

Hi!

I am trying to scrape information from http://opportunities.osteopathic.org/search/search.cfm?searchType=1
The search session establishes a random CFID, TOKEN, and JSESSIONID.
I am scraping the cookies and reusing them in my scrapes, but unfortunately it does not work. I get the following:

Starting scraper.
Running scraping session: AOA IM Residency
Processing scripts before scraping session begins.
Processing script: "Initializing AOA Session"
Scraping file: "Search Results"
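For what it's worth, pulling the session tokens out of raw Set-Cookie headers so they can be re-sent looks like the sketch below (the header text is a made-up sample; normally screen-scraper carries cookies across requests in the same scraping session automatically, so manual handling is only needed when that breaks down):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CookieGrab {
    // Pulls name=value pairs such as CFID, TOKEN and JSESSIONID out of raw
    // Set-Cookie header text so they can be re-sent on later requests.
    static Map<String, String> parseCookies(String headers) {
        Map<String, String> cookies = new HashMap<String, String>();
        Matcher m = Pattern.compile("Set-Cookie: ([^=\\s]+)=([^;\\r\\n]*)").matcher(headers);
        while (m.find()) {
            cookies.put(m.group(1), m.group(2));
        }
        return cookies;
    }

    public static void main(String[] args) {
        String headers = "Set-Cookie: CFID=1001; path=/\r\n"
                       + "Set-Cookie: TOKEN=ab12cd; path=/\r\n"
                       + "Set-Cookie: JSESSIONID=xyz42; path=/\r\n";
        System.out.println(parseCookies(headers));
    }
}
```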

Input from CSV scrape results has next page how to iterate?

I'm scraping using a file of zipcodes to build the URL, and the scrape works when the results fit on one page. When there is a next page, I can't figure out how to get the script to scrape all of the next pages BEFORE moving on to the next zipcode from the CSV. So the result I get is 1 page scraped per zipcode, even when a zipcode has many pages.

Here is the code that I'm working with so far. If anyone can help, please jump in!

// If using an offset, this number should be the first search results page's offset, be it 0 or 1.
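The ordering the post is after, exhausting every results page before advancing to the next zipcode, is just a nested loop. A plain-Java sketch under the assumption that a "next page" match sets a flag (`pagesFor` is a stub invented for illustration; a real script would get this from an extractor pattern):

```java
import java.util.Arrays;
import java.util.List;

public class ZipPager {
    // Stub standing in for "did the results page have a next-page link?".
    // In a real scraping session this state would come from a session
    // variable set (or not set) by the next-page extractor pattern.
    static int pagesFor(String zip) { return zip.equals("90210") ? 3 : 1; }

    static int scrapeAllPages(List<String> zips) {
        int pagesScraped = 0;
        for (String zip : zips) {               // outer loop: one zipcode at a time
            int page = 1;
            boolean hasNext = true;
            while (hasNext) {                   // inner loop: drain every results page
                pagesScraped++;                 // (a real script would scrape the file here)
                hasNext = page < pagesFor(zip); // stop when no next-page link matched
                page++;
            }
        }
        return pagesScraped;
    }

    public static void main(String[] args) {
        System.out.println(scrapeAllPages(Arrays.asList("90210", "10001")));
    }
}
```

The inner `while` only exits when a page has no next-page link, so the outer loop never advances to the next zipcode early.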

maximum number of scripts allowed on the stack was reached

Hello all,

I'm trying to scrape a site that has around 600 pages of search results with 10 rows of data per page. For each row of data I'm following the URL and grabbing the results.

The problem is that I'm encountering this error:

Scraping Session: ERROR--halting the scraping session because the maximum number of scripts allowed on the stack was reached. Current number on the stack is: 50.

To work around this, I have created the following script based on the recommendations in the blog post (http://blog.screen-scraper.com/2008/07/07/large-data/), but I'm still getting the error.
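The stack error quoted above typically comes from a "next page" script that re-invokes itself, leaving one frame on the script stack per page; at 600 pages that blows past the limit of 50. The shape of the iterative alternative from the blog post, sketched in plain Java (the session-variable comments describe what a real script would do; they are not executed here):

```java
public class IterativePaging {
    // A next-page script that calls itself leaves one frame on the script
    // stack per page. Driving all pages from a single loop in one script
    // keeps the stack depth constant no matter how many pages there are.
    static int scrapePages(int totalPages) {
        int scraped = 0;
        for (int page = 1; page <= totalPages; page++) {
            // in a real script: set a PAGE session variable here, then
            // scrape the search-results file for that page
            scraped++;
        }
        return scraped;
    }

    public static void main(String[] args) {
        System.out.println(scrapePages(600)); // all 600 pages, one stack frame
    }
}
```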

Adding Sub-Extractor Patterns isn't returning results.

I am attempting to scrape a web page for information regarding residency programs. Currently, I have an extractor pattern (DATARECORD) and the following sub-extractor patterns:
SPECIALTY
PROGRAMNAME
ADDRESS1
ADDRESS2
PHONE
APPROVED
OGM1
OGM2
OGM3
SETTING

Any sub-extractor patterns added after these do not record information when the scraping session is run. Oddly, when I first set up the new sub-extractors and run a test of the pattern, results do come up; when I run the same test after running the scraping session, there are no results.


Thanks!

Steven

Redirection

I'm currently trying to get a number that is written in the title of a certain website (compete.com) and write it out to a CSV. When I test the extractor pattern (peppercom.com ~@UVs@~ UVs for September 2011 | Compete) it works: I see 35 where it says ~@UVs@~, sequence 0. But when I start the scraping session (to actually write the file out), instead of requesting the site I want (http://siteanalytics.compete.com/peppercom.com/), it automatically redirects me to http://www.compete.com/ie6/. I found that the redirect target is written in the page just below the URL I'm interested in.

Download CSV - Getting [Binary Data]

I am trying to pull down a CSV file from Google AdSense. The problem is that the data doesn't show; instead I get [Binary Data]. If I view the data returned in Fiddler it looks good. I upgraded to the newest version, 5.5.22a, and still get the problem. I have a feeling it is because the content-type in the headers is set to unknown. Below is the full response. Any ideas?

Thanks, Jason

HTTP/1.1 200 OK
Set-Cookie: AdSenseLocaleSession=en_US; Path=/adsense/v3/; Secure; HttpOnly
Content-Type: unknown; charset=UTF-16LE
Content-Disposition: attachment; filename=report.csv
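Given those headers, the body is most likely ordinary text encoded as UTF-16LE under an unrecognised content type, which would explain the [Binary Data] display. If the raw response bytes can be reached, decoding them with the charset named in the header recovers the CSV; a minimal sketch (the sample bytes here are fabricated):

```java
import java.nio.charset.Charset;

public class DecodeReport {
    // The header says charset=UTF-16LE, so the "binary" body is really
    // text in that encoding. Decoding the raw bytes with the right
    // charset yields the readable CSV.
    static String decode(byte[] raw) {
        return new String(raw, Charset.forName("UTF-16LE"));
    }

    public static void main(String[] args) {
        byte[] raw = "date,earnings\n".getBytes(Charset.forName("UTF-16LE"));
        System.out.print(decode(raw));
    }
}
```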

Can I use the variables declared in a script in another script?

I tried creating a public class and declaring the variables inside it as public static, but I'm getting this error when the script is called:

An error occurred while processing the script: 2 Once if pattern matches
The error message was: class bsh.ParseException (line 1): public-- Encountered "public" at line 1, column 1.

Thanks in advance
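The ParseException quoted above is the interpreter rejecting the `public` modifier at the top of the script; scripts are run as interpreted statements, not compiled class files. The usual way to pass a value from one screen-scraper script to another is a session variable set in the first script and read in the second. The `Session` class below is only a hypothetical stand-in so the sketch runs outside the tool; real scripts receive a `session` object from the runtime:

```java
import java.util.HashMap;
import java.util.Map;

public class SessionDemo {
    // Hypothetical stand-in for screen-scraper's session object, just
    // so this example is self-contained and runnable.
    static class Session {
        private final Map<String, Object> vars = new HashMap<String, Object>();
        void setVariable(String name, Object value) { vars.put(name, value); }
        Object getVariable(String name) { return vars.get(name); }
    }

    public static void main(String[] args) {
        Session session = new Session();
        session.setVariable("CITY", "Boston");      // done in the first script
        Object city = session.getVariable("CITY");  // done in a later script
        System.out.println(city);
    }
}
```

Because the variable lives on the session rather than in the script's own scope, no class or `public static` field is needed to share it.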