screen-scraper support for licensed users
execute page javascript before file is scraped
Is there a way to tell screen-scraper to execute all page javascript before scraping the file, much like a browser would?
The problem is that a number of sites I scrape use the same framework, Taleo, which builds the page from a JavaScript function called at the end of the page. My instincts tell me that this is purely an anti-scraper countermeasure. I've tried to find an "accessible" or scriptless version of the same page, to no avail.
Trying to get to grips with SOAP and it's not working as expected
I followed the example on the site showing how to call screen-scraper using the DLL. That worked fine, and I proved that I can call the scrapes.
However, I wanted to do more, i.e. take the log file of the scrape and check it for errors so that I am automatically warned of problems...
So I took the SOAP example (it warns that the WSDL code is not generated quite right and that [] needs to be changed to [][], but I am not sure how I cast that in a response)...
Anyway, that warning didn't seem to affect me, as I don't want the data set, only the log file... Now these are the lines of code
Skip Output if File Exists and Looping
Hello Support:
I’m scraping a dynamic webpage where real estate properties are added throughout the day. I would like to output the data for the new properties as they are uploaded. The name of each output file will be the address of the property. I don’t want to overwrite the existing file if a property has already been written. Since properties will be uploaded to the webpage over the course of the day, I want screen-scraper to loop throughout the day.
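The "write only if the file does not exist yet" part can be sketched in plain Java. This is a minimal sketch, not screen-scraper's own API: the class name `PropertyFileWriter`, the file-naming rule, and the sample row are all hypothetical, and it assumes a Java version that has `java.nio.file`.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class PropertyFileWriter {

    // Turn an address into a file name: every run of characters other than
    // letters, digits, or hyphens becomes one underscore (hypothetical rule).
    static String toFileName(String address) {
        return address.trim().replaceAll("[^A-Za-z0-9-]+", "_") + ".csv";
    }

    // Write the row only if no file for this address exists yet.
    // Returns true if a new file was created, false if it was skipped.
    static boolean writeIfNew(Path outputDir, String address, String csvRow) {
        Path target = outputDir.resolve(toFileName(address));
        byte[] bytes = (csvRow + System.lineSeparator()).getBytes(StandardCharsets.UTF_8);
        try {
            // CREATE_NEW fails when the file already exists, so an existing
            // file is never overwritten.
            Files.write(target, bytes, StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("properties");
        System.out.println(writeIfNew(dir, "12 Main St", "12 Main St,250000")); // true
        System.out.println(writeIfNew(dir, "12 Main St", "12 Main St,260000")); // false
    }
}
```

Because the existence check and the file creation happen in one call, a property scraped twice in the same day is skipped the second time, which makes it safe to rerun the scrape in a loop.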
The Sequence is as follows:
Time Stamp in Name of the CSV OutputFile
Hello Support:
I'm trying to include the date in the name of the csv output file. I added this to the top of my script:
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;

// Returns the current date and time as a formatted string.
String getDateTime()
{
    DateFormat dateFormat = new SimpleDateFormat("yyyy-MM-d HH:mm:ss");
    Date date = new Date();
    return dateFormat.format(date);
}
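A note on the pattern above: it produces colons, which Windows does not allow in file names, and a single "d" leaves the day unpadded. A minimal sketch of a file-name-safe variant follows. The class name `TimestampedName`, the "listings" prefix, and the pattern are illustrative choices, not anything taken from screen-scraper.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

class TimestampedName {

    // Build a name of the form <prefix>_yyyy-MM-dd_HH-mm-ss.csv.
    // Hyphens replace the colons so the name is valid on Windows as well.
    static String csvName(String prefix, Date when) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd_HH-mm-ss", Locale.ROOT);
        return prefix + "_" + fmt.format(when) + ".csv";
    }

    public static void main(String[] args) {
        System.out.println(csvName("listings", new Date()));
    }
}
```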
Then on the output file I wrote the date as follows:
screen-scraper and linux-vserver
We're trying to run screen-scraper enterprise 5.5 (latest update) on a linux-vserver guest, and we're getting errors, apparently because the screen-scraper server (or HSQLDB, or both) is trying to bind and connect to 127.0.0.1. Is there a configuration directive to control this? Can I tell it to use the interface that is accessible to the vserver guest? We're open to any approach to solve this one.
This page: http://linux-vserver.org/Problematic_Programs talks about several applications that use hardcoded references to localhost as 127.0.0.1 (which is what we think is happening with screen-scraper).
update to the newest version
Hey Guys,
Maybe my question is stupid, but I searched the forum and couldn't find an answer.
When I upgrade the GUI-less screen-scraper using the link from the website, but haven't done so recently, do I get all the changes up to the newest update, or do I have to download and replace the files for each update?
Not sure if I was clear enough for my question to be understood :)
Cheers,
Radek
IndexOutOfBoundsException Error
Hi there,
I have a script that writes the record set to a database using a for( i = 0; i < dataSet.getNumDataRecords(); i++ ) loop.
Everything was working fine until I introduced these changes in the script:
1. At the start of the database write script (before the for loop), I store the first data record:
CurrentDataRecord = dataSet.getDataRecord( 0 );
2. I store one of the values of that first data record (a date) in a session variable
3. I call a JavaScript script that uses that session variable to check the date difference with today's date
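One general point about step 1: reading element 0 before the loop fails on an empty collection, whereas the loop itself would simply not run. Whether that is what triggers the exception here is an assumption, since the rest of the post is not shown. The sketch below uses a plain `java.util.List` as a stand-in for the data set, and `FirstRecordGuard` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FirstRecordGuard {

    // Return the first record, or null when there are no records at all.
    // Calling records.get(0) on an empty list would throw
    // IndexOutOfBoundsException instead.
    static Map<String, String> firstOrNull(List<Map<String, String>> records) {
        return records.isEmpty() ? null : records.get(0);
    }

    public static void main(String[] args) {
        List<Map<String, String>> records = new ArrayList<>();
        System.out.println(firstOrNull(records)); // null, no exception

        Map<String, String> record = new HashMap<>();
        record.put("DATE", "2011-03-01"); // hypothetical field name and value
        records.add(record);
        System.out.println(firstOrNull(records).get("DATE")); // 2011-03-01
    }
}
```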
Recommendation of anonymous proxies source?
Hi,
I am using the manual proxy pool way of anonymizing as shown here:
http://community.screen-scraper.com/anonymization_via_manual_proxy_pools
And I am getting a list of proxies from here:
http://www.textproxylists.com/proxy.php?anonymous
However, when I filter them for 7-second connection timeouts as per the example, I end up with only around 30 usable proxies out of a list of around 900 servers :-S
I am wondering if anybody can suggest a better source for a list of good anonymous proxies.
Many thanks,
boga
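For reference, the filtering step described above (keep only proxies that accept a connection within a timeout) can be sketched in plain Java. This is not the code from the linked example: `ProxyFilter` and the sample address are hypothetical, entries are assumed to be in `host:port` form, and a successful TCP connect only shows that the server is reachable, not that it is anonymous or fast.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ProxyFilter {

    // True if a TCP connection to host:port opens within timeoutMillis.
    static boolean connectsWithin(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException | IllegalArgumentException e) {
            // Timeout, refused connection, unknown host, or invalid port.
            return false;
        }
    }

    // Keep only the "host:port" entries that connect within the timeout.
    // Malformed entries are dropped.
    static List<String> filter(List<String> proxies, int timeoutMillis) {
        List<String> usable = new ArrayList<>();
        for (String entry : proxies) {
            int colon = entry.lastIndexOf(':');
            if (colon < 0) {
                continue;
            }
            String host = entry.substring(0, colon).trim();
            int port;
            try {
                port = Integer.parseInt(entry.substring(colon + 1).trim());
            } catch (NumberFormatException e) {
                continue;
            }
            if (connectsWithin(host, port, timeoutMillis)) {
                usable.add(entry);
            }
        }
        return usable;
    }

    public static void main(String[] args) {
        // 203.0.113.10 is a documentation-only address, used as a placeholder.
        List<String> candidates = Arrays.asList("203.0.113.10:8080");
        System.out.println(filter(candidates, 7000));
    }
}
```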
Confused about why using Java...
Hi,
I have programmed in other languages before (Visual Basic, PHP, SQL...), but all of them had large function libraries, so you could do anything you wanted with strings, dates, etc.
If I understand correctly, in Java, if I want to compare two dates and find the difference in days between them, I have to write a function myself because one doesn't already exist?
I am wondering if I should use JavaScript instead for screen-scraper scripting. In what cases would I want to use one or the other?
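On the date question: Java 8 and later ship a library call for this in `java.time`, so no hand-written function is needed there. The sketch below assumes Java 8 or newer and ISO-formatted dates (yyyy-MM-dd). Whether the interpreter embedded in a given screen-scraper version can use `java.time` is an assumption, and `DateDiff` is a hypothetical name.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

class DateDiff {

    // Number of whole days from one ISO date to another.
    // The result is negative when the second date is earlier than the first.
    static long daysBetween(String fromIso, String toIso) {
        return ChronoUnit.DAYS.between(LocalDate.parse(fromIso), LocalDate.parse(toIso));
    }

    public static void main(String[] args) {
        System.out.println(daysBetween("2011-03-01", "2011-03-15")); // 14
        // Difference between a stored date and today's date:
        System.out.println(ChronoUnit.DAYS.between(LocalDate.parse("2011-03-01"), LocalDate.now()));
    }
}
```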
Cookie defaults changed? OUCH?
Had a scrape stop working. Changed the cookie dropdown from "According to cookie spec" to "Accept all cookies" and it suddenly began working. It's happened on multiple scrapes hitting multiple sites. Neither of these scrapes changed recently, and it started to happen after our 5 to 5.5 update.
Did the default selection for that change, or did the behavior of "according to cookie spec" change?
I cannot remember what the setting was previously without reverting to an old version of screen-scraper and re-importing an old version of the file.
This impacts a ton of my scrapes. :(