screen-scraper public support
Scraping multiple items from a part of a page
I would like to scrape multiple items from part of a page
Simplified example:
Drinks:
Beer
Water
Juice
Food:
Hamburger
Fries
Chicken
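One way to picture the desired result is a map from each heading to the items listed under it. The following is a minimal sketch in plain Java, assuming the relevant part of the page has already been reduced to lines of text and that a heading is any line ending in a colon. The SectionGrouper class and its group method are hypothetical names for illustration, not part of screen-scraper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SectionGrouper {
    // Groups each line under the most recent heading. A heading is assumed
    // to be any line ending with ':' (an assumption made for this sketch).
    static Map<String, List<String>> group(List<String> lines) {
        Map<String, List<String>> sections = new LinkedHashMap<>();
        List<String> current = null;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            if (line.endsWith(":")) {
                // Start a new section, keyed by the heading without the colon.
                current = new ArrayList<>();
                sections.put(line.substring(0, line.length() - 1), current);
            } else if (current != null) {
                current.add(line); // item belongs to the current heading
            }
        }
        return sections;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "Drinks:", "Beer", "Water", "Juice",
            "Food:", "Hamburger", "Fries", "Chicken");
        // prints {Drinks=[Beer, Water, Juice], Food=[Hamburger, Fries, Chicken]}
        System.out.println(group(lines));
    }
}
```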
Beginner help.
Hi,
I am learning to use screen-scraper and encountered a problem.
Some of my results in a table are links and some just plain text. For example:
1. John
2. Tom
3. Greg
4. Brian
How do I get rid of html elements?
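A common approach is to strip the tags from each extracted value with a regular expression. This is a minimal sketch in plain Java. The stripTags helper is a hypothetical name, not a screen-scraper function, and a pattern like this is only adequate for simple markup such as a link wrapped around a name.

```java
class TagStripper {
    // Removes anything that looks like an HTML tag, then trims whitespace.
    static String stripTags(String value) {
        return value.replaceAll("<[^>]*>", "").trim();
    }

    public static void main(String[] args) {
        // A linked result and a plain-text result both reduce to the bare name.
        System.out.println(stripTags("<a href=\"/user/1\">John</a>")); // prints John
        System.out.println(stripTags("Tom"));                            // prints Tom
    }
}
```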
Inconsistent characters in XML output
My XML parser says I have a bad character but it doesn't know me that well.
Let me see if I can explain:
I am having the same issue on all versions of screen-scraper (3.0.67a, 3.0.70a and currently 4.0) running on Windows XP Pro. I have several scrapes that worked fine until a few days ago; they are still working in production but not on my development box.
screen-scraper server model goes wrong on linux pc
We cannot find the original source of an exception.
When screen-scraper is invoked from the command line to add a task,
an exception is caught by an interpreted Java script that we are calling.
The exception is not very descriptive:
java.lang.ClassNotFoundException: com.mysql.jdbc.Driver
It would at first glance appear that CLASSPATH is not set, but it is:
# CLASSPATH=$HOME/ss/mysql-connector-java-5.1.5-bin.jar:$CLASSPATH java -jar screen-scraper.jar -s "generic - call-script" -p "scrapeFile=foo.quote"
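One thing worth checking: when the JVM is started with java -jar, the CLASSPATH environment variable is ignored and the JAR file is the source of all user classes, so the setting above may never reach the script. A small diagnostic, sketched here in plain Java (the DriverCheck class is a hypothetical name), prints the effective class path and reports whether the driver class can be loaded:

```java
class DriverCheck {
    // Returns true if the named class is visible to the class loader
    // that loaded DriverCheck, false otherwise.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Shows what the JVM actually uses as its class path.
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
        System.out.println("driver loadable: " + isLoadable("com.mysql.jdbc.Driver"));
    }
}
```

Running the same two checks from inside the failing script would show whether the connector JAR is visible in that environment.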
incremental URL scraping
I know this kind of relates to the first two tutorials, but I've been having a semi-difficult time trying to set it up, so I thought I would ask here and see if anyone had any input.
Basically, I would like to scrape some info from a site based on incremental URLs, e.g. http://www.ccc.com/item.php?kasi=00001. I want the scraper to get a few pieces of information (book name, author, price, etc.) from each page, from kasi=00001 to kasi=00200.
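The URL list itself can be generated with a simple loop and zero-padded formatting. This is a minimal sketch in plain Java; the IncrementalUrls class and its build method are hypothetical names, and how each URL is then handed to a scrapeable file is left out.

```java
import java.util.ArrayList;
import java.util.List;

class IncrementalUrls {
    // Builds baseUrl + a five-digit, zero-padded counter for first..last inclusive.
    static List<String> build(String baseUrl, int first, int last) {
        List<String> urls = new ArrayList<>();
        for (int i = first; i <= last; i++) {
            urls.add(baseUrl + String.format("%05d", i));
        }
        return urls;
    }

    public static void main(String[] args) {
        List<String> urls = build("http://www.ccc.com/item.php?kasi=", 1, 200);
        System.out.println(urls.get(0));               // http://www.ccc.com/item.php?kasi=00001
        System.out.println(urls.get(urls.size() - 1)); // http://www.ccc.com/item.php?kasi=00200
    }
}
```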
Force multi-part?
I have a form of POST parameters and a funky URL like:
Which, when requested just like that, ends up being a POST. But the form needs to post as multi-part. If I add a GET parameter to the list of params it does multi-part, but the site doesn't like that extra parameter. I tried adding a file upload as a parameter and screen-scraper just spins and spins and spins.
like condition in Javascript
Hi,
I would like to extract data that includes a certain word in a variable. However, when I do this
unfortunately the productname variable includes all HTML until it finds the word Roundup (which could be quite a way down the page).
Is there a way I could test in my script whether productname has the word Roundup in it? i.e.
String pagestart = session.getVariable( "productname" );
if ( pagestart.equals( "Roundup" ) )
{
    // DO THIS
}
Then I would be able to capture the whole product name and only go forward with the scrape if the test is true.
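Note that equals is true only when the whole string is exactly "Roundup". A "like"-style test is usually written with String.contains instead. The following is a minimal sketch in plain Java; the WordTest class, the containsWord helper, and the sample value are hypothetical, and the comparison is made case-insensitive as an assumption about what is wanted.

```java
import java.util.Locale;

class WordTest {
    // Case-insensitive substring test; returns false for null input.
    static boolean containsWord(String text, String word) {
        if (text == null || word == null) {
            return false;
        }
        return text.toLowerCase(Locale.ROOT).contains(word.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        String productname = "Roundup Weed Killer 1L"; // hypothetical sample value
        if (containsWord(productname, "roundup")) {
            // Only continue with the scrape when the word is present.
            System.out.println("match: " + productname);
        }
    }
}
```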
setting dynamic parameters
Hello,
I am setting some dynamic parameters through a script by using the scrapeableFile.addHTTPParameter() function. However, when I run another script after this one that uses the scrapeableFile.getCurrentPOSTData() function to get the POST parameters (that's what I assume it does) and print them to the session's log, the function returns null.
Scrape file parameters in basic version 4.0
I just upgraded from v3 to v4 of screen-scraper basic and now my scraping session doesn't work because there are no "Parameters" and I can't re-add them.
Clicking "Add Parameter" changes the width of the columns but doesn't actually add anything.
I need to submit some values by POST to log into the site. Why have parameters stopped working and is there an alternative way to do it in the basic version?
Thanks.
Where are proxy & scraping session data stored on compu
I had to migrate to a new computer and thought that I had backed up everything. After installing screen-scraper and starting it, none of my proxy and scraping sessions appear in the new installation, even after I COPIED the entire program folder from "program files".
Where do I find that old data so I can bring back my hours and hours of work setting up these sessions?
Robert
