screen-scraper support for licensed users

Questions and answers regarding the use of screen-scraper. Only licensed Professional and Enterprise Edition users can post; anyone can read. Licensed users please contact support with your registered email address for access. This forum is monitored closely by screen-scraper staff. Posts are generally responded to in one business day.

Alpha update progress slow

I'm updating to an alpha version (because of http://community.screen-scraper.com/node/2298) and the progress is extremely slow. I ran the update on my laptop and it took a minute or two. Following the same steps on a server that has a much faster and more reliable connection, the progress bar only reached 10% after 2 hours. I tried the update on another server and it was slightly faster, but nowhere near as fast as it should be. I'm not sure how to troubleshoot an issue like this. Is there a way to log this and get more details to you?

exdap on 10/28/2014 at 3:25 pm

Reading data from MySQL table

Hi, I'm adding data to a MySQL database. There are 2 tables I want to add data to: one that creates an auto-incrementing id, and another that has a foreign key on the auto-incrementing id of the first table.

I can create records and add them to the first table. I'm using the SqlDataManager to do this:

for (i = 0; i < dataset_name.getNumDataRecords(); i++) {
    data_record = dataset_name.getDataRecord(i);
    dm.addData("db_table_name", data_record);
    dm.commit("db_table_name");
    dm.flush();
}

oldgarvice on 10/22/2014 at 5:14 pm

SQL SSH Connection

I am trying to connect to MySQL over SSH.

So far I have:

1) Before scraping session begins, I call the script SQL_SSH_CONNECTOR. It looks like this:

import com.screenscraper.datamanager.sql.*;

// SshDataSource
ds = new SshDataSource( "[email protected]", "SSH_pass_here" );
ds.setDriverClassName( "com.mysql.jdbc.Driver" );
ds.setUsername( "mysql_user_here" );
ds.setPassword( "mysql_user_pass_here" );

ds.setUrl( SshDataSource.MYSQL, 3306, "database_name_here" );

// Create Data Manager
dm = new SqlDataManager( ds, session );

// Build Schemas For all Tables

oldgarvice on 10/20/2014 at 12:45 pm

Double spaces in csv file changed to a single

In a scrape I use (with HTML tidy off) there are URLs that point to images. Some of these have single spaces and some have double spaces in their paths. When I copy and paste from the 'test pattern' button, the data correctly contains 2 spaces. When the data is shown in the scrape log, it is shown with just one.

from datarecord in scrape log:
/components/com_vehiclemanager/photos/4739FD7F-98F8-FA8B-9E4C-BE46C0A9D71D_2008 08 MAN 26 440 6X4 Chassis cab (2)_450_600.JPG"

(note only one space after the word cab)

Pasted from the test pattern for that token:

jas777 on 10/20/2014 at 10:21 am

scraping data for different keywords searched in one click

I am working on a project and I am facing a problem in deciding how to proceed.

I want to extract data for 160 companies from uspto.gov. I don't want to enter a search keyword and extract data for one company at a time. Instead, I want to extract all the data with a single button click. It should go through all the company names automatically, and for each company extract the total number of patents in each year. For example:
For Google: 1500 patents in 2006, 1100 in 2007, 100 in 2008 etc.

Is it possible to automate this so that it runs from a single button click?

patela7014 on 10/17/2014 at 12:12 am
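
The per-company, per-year totals described in the post above could be accumulated along these lines. This is only a sketch in plain Java, assuming each scraped patent has already been reduced to a company name and an issue year; the class name PatentTally, the method tally, and the sample records are hypothetical and are not part of screen-scraper or uspto.gov.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PatentTally {

    /**
     * Counts patents per company and per year.
     * Each record is a two-element array: {companyName, issueYear}.
     * (Hypothetical input shape, chosen only for this illustration.)
     */
    public static Map<String, Map<Integer, Integer>> tally(List<String[]> records) {
        Map<String, Map<Integer, Integer>> counts = new TreeMap<>();
        for (String[] record : records) {
            String company = record[0];
            int year = Integer.parseInt(record[1]);
            // Create the per-company map on first sight, then add 1 to that year's count.
            counts.computeIfAbsent(company, k -> new TreeMap<>())
                  .merge(year, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample records, not real patent data.
        List<String[]> records = Arrays.asList(
            new String[] {"Google", "2006"},
            new String[] {"Google", "2006"},
            new String[] {"Google", "2007"}
        );
        System.out.println(tally(records));
    }
}
```

In a real run, the list of records would come from looping over the 160 company names and scraping the results for each one; that part depends on the scraping session and is not shown here.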

3 comments

Proxy Server Having Problems With SSL

I've been having a lot of problems trying to scrape SSL webpages when using the proxy server. Sometimes I can circumvent the SSL by removing the "s" from "https" in the URL. Other times, I can work around it by using Google Chrome's Inspect Element feature, which lets me see the transactions. I can then hand-build the GET request in screen-scraper, using the "addHTTPHeader" method and having it run before the file is scraped.

GenericPause on 10/16/2014 at 1:46 pm

Ignoring html tags like <b></b> while scraping

Hello Everyone,

I am trying to scrape the data from the following line:

Inventors:
Murphy; Bruce W. (Pyrmont, AU), Iversen; Grim H. (Trondheim, NO)

With a normal extractor pattern, I would do the following:

Inventors:
~@INVENTORS@~

But the results also include the HTML tags, which I want to ignore.

patela7014 on 10/08/2014 at 3:27 pm
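
One possible way to drop tags such as <b></b> from an extracted value is to post-process the string with a regular expression. The following is a minimal sketch in plain Java, assuming the INVENTORS value is available as a string; the class name TagStripper, the method stripTags, and the sample markup are hypothetical, and this is not a screen-scraper API.

```java
import java.util.regex.Pattern;

public class TagStripper {

    // Matches anything that looks like an HTML tag, e.g. <b> or </b>.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    /** Removes HTML tags and collapses runs of whitespace into single spaces. */
    public static String stripTags(String html) {
        if (html == null) {
            return "";
        }
        String withoutTags = TAG.matcher(html).replaceAll("");
        return withoutTags.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        // Assumed markup: the names wrapped in <b> tags.
        String raw = "<b>Murphy; Bruce W.</b> (Pyrmont, AU), "
                   + "<b>Iversen; Grim H.</b> (Trondheim, NO)";
        System.out.println(stripTags(raw));
        // prints: Murphy; Bruce W. (Pyrmont, AU), Iversen; Grim H. (Trondheim, NO)
    }
}
```

Note that the whitespace collapsing also turns double spaces into single ones, so that step should be left out if the original spacing matters.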

scraping next link pages (navigation problem)

I am working on a project to scrape data automatically.

The website I am scraping is "www.uspto.gov".

These are the two links:

http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=%2Fnetahtml%2FPTO%2Fsearch-adv.htm&r=0&f=S&l=50&d=PTXT&OS=%22social+networking%22&RS=%22social+networking%22&Query=%22social+networking%22&TD=6908&Srch1=%22social+networking%22&NextList2=Next+50+Hits

patela7014 on 09/16/2014 at 1:13 pm

The message was java.lang.RuntimeException: Could not generate DH keypair.

When I go to this page (https://www.greenmountainenergy.com/for-home/products/comed/) I receive this error: An input/output error occurred while connecting to 'http://www.greenmountainenergy.com/for-home/products/comed/'. The message was java.lang.RuntimeException: Could not generate DH keypair.

I found this topic that might explain the underlying java issue: https://community.qualys.com/thread/1407

Thanks, Jeremy

exdap on 06/23/2014 at 8:12 am

3 comments

Scraping confusing Ajax? site

Hi
I am attempting to scrape this site: http://davidevansagricultural.co.uk.temp.realssl.com/Used/UsedTractors/tabid/67/Default.aspx

I can see the HTML for the results page (above), and I can see the data I want to extract from each of the relevant product detail pages, for example here:
http://davidevansagricultural.co.uk.temp.realssl.com/Used/UsedTractors/tabid/67/ctl/Detail/mid/469/xmid/870/xmfid/7/Default.aspx

But I cannot find how to get the post data to create the link from the search results page to the product detail page.

jas777 on 06/20/2014 at 8:08 am
