screen-scraper support for licensed users
Alpha update progress slow
I'm updating to an alpha version (because of http://community.screen-scraper.com/node/2298) and the progress is extremely slow. I ran the update on my laptop and it took a minute or two. Following the same steps on a server that has a much faster and more reliable connection, the progress bar only reached 10% after 2 hours. I tried the update on another server and it was slightly faster, but nowhere near as fast as it should be. I'm not sure how to troubleshoot an issue like this. Is there a way to log this and get more details to you?
Reading data from MySQL table
Hi, I'm adding data to a MySQL database. There are 2 tables I want to add data to: one that creates an auto-incrementing id, and another that has a foreign key on the auto-incrementing id of the first table.
I can create records and add them to the first table. I'm using the SqlDataManager to do this:
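// Loop header assumed; the snippet as posted starts inside the loop.
for (i = 0; i < dataset_name.getNumDataRecords(); i++) {
    // Add each scraped record to the first table and write it out.
    data_record = dataset_name.getDataRecord(i);
    dm.addData( "db_table_name", data_record);
    dm.commit("db_table_name");
    dm.flush();
}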
SQL SSH Connection
I am trying to SSH into MySQL.
So far I have:
1) Before the scraping session begins, I call the script SQL_SSH_CONNECTOR. It looks like this:
// SshDataSource
ds = new SshDataSource( "[email protected]", "SSH_pass_here" );
ds.setDriverClassName( "com.mysql.jdbc.Driver" );
ds.setUsername( "mysql_user_here" );
ds.setPassword( "mysql_user_pass_here" );
ds.setUrl( SshDataSource.MYSQL, 3306, "database_name_here" );
// Create Data Manager
dm = new SqlDataManager( ds, session );
// Build Schemas For all Tables
Double spaces in CSV file changed to a single space
In a scrape I use (with HTML tidy off) there are URLs that point to images. Some of these have single spaces and some have double spaces in the paths. When I copy and paste out of the 'test pattern' button, the data (correctly) has 2 spaces in it. When the data is shown in the scrape log, it is shown with just one.
from datarecord in scrape log:
/components/com_vehiclemanager/photos/4739FD7F-98F8-FA8B-9E4C-BE46C0A9D71D_2008 08 MAN 26 440 6X4 Chassis cab (2)_450_600.JPG"
(note only one space after the word cab)
Pasted from the test pattern for that token:
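One way to check whether the extracted value itself still holds both spaces, whatever the log displays, is to percent-encode the spaces before writing the value out. This is a minimal sketch in plain Java, not screen-scraper API: the class name, the method name, and the sample fragment are made up for illustration, and it assumes the log display (rather than the extracted value) is what merges the spaces.

```java
final class SpaceMarker {
    // Replace every space with %20 so a viewer cannot merge consecutive spaces.
    static String encodeSpaces(String path) {
        return path.replace(" ", "%20");
    }

    public static void main(String[] args) {
        // Hypothetical fragment with two spaces after "cab",
        // built by concatenation so both spaces are explicit.
        String path = "Chassis cab" + " " + " " + "(2)_450_600.JPG";
        System.out.println(encodeSpaces(path));
    }
}
```

If the encoded value shows two `%20` in a row, the data is intact and only the log view is collapsing whitespace.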
scraping data for different keywords searched in one click
I am working on a project and I am facing a problem in deciding how to proceed.
I want to extract data for 160 companies from uspto.gov. I don't want to enter a search keyword and extract data for one company at a time. Instead, I want to extract all the data with a single button click. It should go through all the company names automatically and, for each company, extract the total number of patents in each year. For example:
For Google: 1500 patents in 2006, 1100 in 2007, 100 in 2008 etc.
Is it possible to automate this so that it all runs from one button click?
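The usual shape of a solution is a loop: keep the company names in a list and turn each one into a search value, then run the same search once per entry. The sketch below is plain Java and only covers the "one query per company" step. The class and method names are hypothetical, the sample company names are placeholders, and the quoted, form-encoded query shape is an assumption about what the search form expects.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CompanyQueries {
    // Wrap the phrase in double quotes and form-encode it for use in a URL.
    static String encodeQuery(String phrase) {
        try {
            return URLEncoder.encode("\"" + phrase + "\"", "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available; rethrow unchecked just in case.
            throw new IllegalStateException(e);
        }
    }

    // Build one encoded query per company, keeping the input order.
    static Map<String, String> buildQueries(List<String> companies) {
        Map<String, String> queries = new LinkedHashMap<String, String>();
        for (String company : companies) {
            queries.put(company, encodeQuery(company));
        }
        return queries;
    }

    public static void main(String[] args) {
        // Placeholder names; a real run would load all 160 from a file.
        List<String> companies = Arrays.asList("Google", "Example Corp");
        for (Map.Entry<String, String> e : buildQueries(companies).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

Per-year totals would need a second, nested loop over the years, with the year added to each query in whatever syntax the site's search supports.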
Proxy Server Having Problems With SSL
I've been having a lot of problems trying to scrape SSL webpages when using the proxy server. Sometimes I can circumvent the SSL by removing the "s" from "https" in the URL. Other times, I can work around it by using Google Chrome and its Inspect Element feature, which lets me see the transactions; I can then hand-build the GET request in screen-scraper using the "addHTTPHeader" method and have it run before the file is scraped.
Ignoring html tags like <b></b> while scraping
Hello Everyone,
I am trying to scrape the data from the following line:
Murphy; Bruce W. (Pyrmont, AU), Iversen; Grim H. (Trondheim, NO)
Normally, I would scrape it with a token like this:
~@INVENTORS@~
but the results also include the HTML tags, which I want to ignore.
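One common approach is to extract the whole line, tags included, and strip the tags afterwards. Here is a minimal sketch in plain Java using a regular expression. The class and method names are made up, and the placement of the `<b>` tags in the sample is an assumption, since the line as posted shows no markup. A regex like this is fine for simple inline tags but is not a general HTML parser.

```java
final class TagStripper {
    // Remove anything that looks like <...> and trim the ends.
    static String stripTags(String html) {
        return html.replaceAll("<[^>]*>", "").trim();
    }

    public static void main(String[] args) {
        // Assumed markup: inventor names wrapped in <b></b>.
        String raw = "<b>Murphy; Bruce W.</b> (Pyrmont, AU), "
                   + "<b>Iversen; Grim H.</b> (Trondheim, NO)";
        System.out.println(stripTags(raw));
    }
}
```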
scraping next link pages (navigation problem)
I am working on a project to scrape the data automatically.
I am scraping a website called "www.uspto.gov".
These are the two links:
http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=%2Fnetahtml%2FPTO%2Fsearch-adv.htm&r=0&f=S&l=50&d=PTXT&OS=%22social+networking%22&RS=%22social+networking%22&Query=%22social+networking%22&TD=6908&Srch1=%22social+networking%22&NextList2=Next+50+Hits
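When a "next page" link is hard to follow, it often helps to decode the query string of each link and compare the parameters side by side to see which ones change between pages. The following is a rough sketch in plain Java, not screen-scraper API: the class and method names are hypothetical, and the sample URL uses a placeholder host with a few parameter names borrowed from the link above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class QueryParams {
    // Split the part after '?' into decoded name/value pairs, in order.
    static Map<String, String> parse(String url) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        int q = url.indexOf('?');
        if (q < 0 || q == url.length() - 1) {
            return params; // no query string
        }
        for (String pair : url.substring(q + 1).split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(decode(name), decode(value));
        }
        return params;
    }

    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available; rethrow unchecked just in case.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Placeholder host; parameter names taken from the link in the post.
        String url = "http://example.com/search?l=50"
                   + "&Query=%22social+networking%22&NextList2=Next+50+Hits";
        for (Map.Entry<String, String> e : parse(url).entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

Running this on both links and comparing the output shows which parameters the site uses to move from one page of results to the next.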
The message was java.lang.RuntimeException: Could not generate DH keypair.
When I go to this page (https://www.greenmountainenergy.com/for-home/products/comed/) I receive this error: An input/output error occurred while connecting to 'http://www.greenmountainenergy.com/for-home/products/comed/'. The message was java.lang.RuntimeException: Could not generate DH keypair.
I found this topic that might explain the underlying java issue: https://community.qualys.com/thread/1407
Thanks, Jeremy
Scraping confusing Ajax? site
Hi
I am attempting to scrape this site: http://davidevansagricultural.co.uk.temp.realssl.com/Used/UsedTractors/tabid/67/Default.aspx
I can see the HTML for the results page (above), and I can see the data I want to take out of each of the relevant product detail pages, for example here:
http://davidevansagricultural.co.uk.temp.realssl.com/Used/UsedTractors/tabid/67/ctl/Detail/mid/469/xmid/870/xmfid/7/Default.aspx
But I cannot find how to get the post data to create the link from the search results page to the product detail page.