screen-scraper public support

How can I match a token conditionally, or pre-parse the page to remove certain noise words so that my extractor pattern matches?

This is my extractor pattern:

<td class="prd-select"><input type="checkbox" value="~@SKU@~" name="aComparedProducts[]" /> </td>
<td class="prd-img"><a href="~@COMPAREPRODUCTURL@~"><img src="~@PRODUCTIMG@~" alt="~@IGNORE2@~" height="~@HEIGHT@~" width="~@WIDTH@~" /></a> </td>
<td class="prd-details">
<a href="~@PRODUCTURL@~">LG ~@MODELNO@~ ~@IGNORE2@~</a>

~@PRODUCTTITLE@~

<img src="~@RESERVEANDCOLLECTIMG@~" alt="~@RESERVEANDCOLLECTALT@~" />

JulianGuppy on 08/20/2010 at 1:02 pm
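
One way to approach the pre-parsing half of this question is to strip the noise words from the page text before the pattern is applied. The sketch below is plain Java and does not use any screen-scraper API; the class name, the method name, and the sample noise word "NEW" are all hypothetical, since the post does not say which words get in the way.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper: removes whole-word occurrences of the given noise
// words (and any whitespace that follows them) from a piece of text.
class NoiseWordFilter {

    static String stripNoiseWords(String text, List<String> noiseWords) {
        if (noiseWords.isEmpty()) {
            return text;
        }
        // Quote each word so regex metacharacters in it are treated literally.
        String alternation = noiseWords.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        // \b keeps the match to whole words; \s* swallows the trailing gap.
        Pattern noise = Pattern.compile("\\b(?:" + alternation + ")\\b\\s*");
        return noise.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        // "NEW" is an assumed noise word, used only for illustration.
        System.out.println(stripNoiseWords("LG NEW 42LD450 LCD TV", List.of("NEW")));
    }
}
```

With the noise words gone, a literal prefix such as "LG " followed by the model-number token has a stable shape to match against.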

5.0 Installation problem

After downloading version 5.0 of the basic edition and attempting to restart screen-scraper, a message box titled "Startup Error" appears.

It contains the text:
java.lang.NoClassDefFoundError: com/screenscraper/util/DataMain
at com.screenscraper.controller.ControllerMain.main(ControllerMain.java:544)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)

itworked on 08/13/2010 at 12:52 am

Simple variable question

In a screen scraping session I capture a URL using ~@SITE_URL@~ in an extractor pattern.

I then invoke a script from that extractor pattern that executes "after each pattern match". Within the script, when I reference SITE_URL using getVariable, it comes back as null. I would like to use the contents of SITE_URL as ~#SITE_URL#~ for my next scrapeable file URL, but I can't because it is null, even though the original scrapeable file filled it in.

Thanks for the help.

itworked on 08/13/2010 at 12:48 am

choosing a category in a form

Hello,
I would like to post to a form containing a select field where I have to choose from different categories. The choice of category will be different every time and depends on the subject I want to post (it is a directory). I want to know whether it is possible to run a keyword-based search before choosing a category and posting the form.
If you have a hint or any code that could help, it would be awesome!
Thanks in advance
hanlin

hanlin on 08/05/2010 at 8:06 am
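
A minimal sketch of a keyword-based choice, assuming the select field's options have already been collected as value/label pairs. Everything here is hypothetical (the class, the method, and the sample categories); it is plain Java rather than any screen-scraper API, and it only illustrates the idea of scoring each label against the subject text.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: picks the option whose label shares the most words
// with the subject text. Returns null when no label word occurs at all.
class CategoryChooser {

    static String choose(String subject, Map<String, String> optionsByValue) {
        String haystack = subject.toLowerCase(Locale.ROOT);
        String bestValue = null;
        int bestScore = 0;
        for (Map.Entry<String, String> option : optionsByValue.entrySet()) {
            int score = 0;
            for (String word : option.getValue().toLowerCase(Locale.ROOT).split("\\W+")) {
                // Skip short words such as "and"; a simple substring test is
                // used, so this is a rough heuristic, not exact word matching.
                if (word.length() > 3 && haystack.contains(word)) {
                    score++;
                }
            }
            if (score > bestScore) {
                bestScore = score;
                bestValue = option.getKey();
            }
        }
        return bestValue;
    }

    public static void main(String[] args) {
        // Sample categories are invented for illustration only.
        Map<String, String> options = new LinkedHashMap<>();
        options.put("1", "Computers and Internet");
        options.put("2", "Travel and Hotels");
        System.out.println(choose("Cheap hotels in Paris", options));
    }
}
```

The returned option value is what would then be submitted as the select field's POST parameter.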

How to pass multiple command-line parameters

The tutorial on the command line (http://community.screen-scraper.com/Tutorial_3_Page_3_Using_the_Command_Line) says that to pass a parameter via the command line, you use the form:

jre/bin/java -jar screen-scraper.jar -s "Hello World" --params "TEXT_TO_SUBMIT=Hello+World"

My question is: how do you specify multiple parameters? What do you use as delimiters? Semicolons? Or do you pass multiple '--params' options?

Is the following correct:

jre/bin/java -jar screen-scraper.jar -s "Hello World" --params "TEXT_TO_SUBMIT=Hello+World;TEXT2=Hi+World"

-or-

rchiang on 08/04/2010 at 7:34 am

Extractor pattern doesn't work

I'm desperate! I am trying to scrape a simple page with an extractor pattern and a sub-extractor pattern, but nothing works!
Here is the link: http://www.lesinrocks.com/musique/concerts/detail-concert/concert/festival-all-stars/

What I want is the description part:

Gils on 08/03/2010 at 4:08 pm

a script to auto-increment int value after every scrape?

Hi,

This might be more of a Java question than a screen-scraping one. Is it possible to have a Java script that assigns an auto-incrementing value to each row of scraped data? I am scraping product information and just need a simple product_id: product #1 gets "1", product #2 gets "2", and so on.

Thank you very much for any suggestions in advance!

Test_Scraper on 07/30/2010 at 4:22 am
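
The counting itself is a few lines of plain Java. The sketch below keeps a counter object alive across rows and stamps each row with the next value; the class name, the PRODUCT_ID key, and the use of a Map to stand in for a scraped row are all assumptions made for illustration, not screen-scraper's own API. In a real scraping session the counter would have to live somewhere that persists between pattern matches.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical counter that hands out consecutive product ids.
class ProductIdCounter {
    private int next;

    ProductIdCounter(int start) {
        this.next = start;
    }

    // Returns the current id, then advances the counter by one.
    int nextId() {
        return next++;
    }

    // Returns copies of the rows, each with a PRODUCT_ID field added.
    static List<Map<String, String>> numberRows(List<Map<String, String>> rows,
                                                ProductIdCounter counter) {
        List<Map<String, String>> numbered = new ArrayList<>();
        for (Map<String, String> row : rows) {
            Map<String, String> copy = new LinkedHashMap<>(row);
            copy.put("PRODUCT_ID", String.valueOf(counter.nextId()));
            numbered.add(copy);
        }
        return numbered;
    }

    public static void main(String[] args) {
        List<Map<String, String>> rows = new ArrayList<>();
        rows.add(new LinkedHashMap<>(Map.of("TITLE", "first product")));
        rows.add(new LinkedHashMap<>(Map.of("TITLE", "second product")));
        System.out.println(numberRows(rows, new ProductIdCounter(1)));
    }
}
```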

Passing "%" in parameter

I want to make a call to a URL with parameters containing the percent symbol (%), like so:

http://www.someurl.com?id=%99%9D%9B%9C%98

If I try putting "%99%9D%9B%9C%98" as-is in the Parameters tab, each '%' gets encoded again, so that the actual URL being called is:

http://www.someurl.com?id=%2599%259D%259B%259C%2598

If I try using something like java.net.URLDecoder.decode(id, "UTF-8") on the parameter before passing it to the scraper, the actual URL changes to:

http://www.someurl.com?id=%EF%BF%BD%EF%BF%BD

What is the correct way of doing this?

rchiang on 07/26/2010 at 1:37 am
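
Both results described above can be reproduced with the standard java.net classes, which may help narrow down what is happening. The sketch below (Java 10 or later, for the Charset overloads) shows that encoding an already-escaped value turns every '%' into "%25", that a UTF-8 decode loses the bytes because 0x99, 0x9D and so on are not valid UTF-8 on their own, and that a single-byte charset such as ISO-8859-1 survives the round trip. The class and method names are hypothetical, and whether screen-scraper lets you choose the charset it uses when it encodes parameters is an assumption to verify.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo of how percent-escapes behave under re-encoding.
class PercentEncodingDemo {

    // Encodes a value that is already escaped: each '%' becomes "%25".
    static String encodeAgain(String escaped) {
        return URLEncoder.encode(escaped, StandardCharsets.UTF_8);
    }

    // Decodes the escapes with a charset, then re-encodes with the same one.
    // The original escapes come back only if the charset can represent
    // every decoded byte as a character.
    static String roundTrip(String escaped, Charset charset) {
        String decoded = URLDecoder.decode(escaped, charset);
        return URLEncoder.encode(decoded, charset);
    }

    public static void main(String[] args) {
        String id = "%99%9D%9B%9C%98";
        System.out.println(encodeAgain(id));                          // double-encoded
        System.out.println(roundTrip(id, StandardCharsets.UTF_8));      // bytes replaced
        System.out.println(roundTrip(id, StandardCharsets.ISO_8859_1)); // bytes preserved
    }
}
```

If the tool always encodes whatever it is given, handing it the ISO-8859-1 decoded string only helps when its encoding step uses that same charset.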

Extractor Pattern - Script id

I would like to extract specific values that are part of a script on a Web page.
A short version of the page is shown below:

[...] 
< script id="script_1" type="text/javascript"> 
< !-- 
//configuration 
var fallback = new Object(); 
var parameters = 'address=Main+56;zipcode=1234'; 
< /script> 
[...]

(had to insert a space before the word script, to make the code visible on this forum)
These values are only stored in the script, and are not visible as HTML on the page.

zenith004 on 07/16/2010 at 4:35 am
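
Since the values sit in an ordinary string literal inside the script, a regular expression over the raw page source is one way to get at them. The sketch below is plain Java, not a screen-scraper extractor pattern; the class and method names are hypothetical, and it assumes the assignment always has the shape shown in the post (a single-quoted literal with ';' between pairs and '=' inside each pair).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: pulls the key/value pairs out of the
// "var parameters = '...';" line of the page source.
class ScriptParameterExtractor {

    // Captures everything between the single quotes of the assignment.
    private static final Pattern PARAMETERS =
            Pattern.compile("var\\s+parameters\\s*=\\s*'([^']*)'");

    static Map<String, String> extract(String pageSource) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher matcher = PARAMETERS.matcher(pageSource);
        if (!matcher.find()) {
            return result; // no such assignment on the page
        }
        for (String pair : matcher.group(1).split(";")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                // Values are kept as written, so "Main+56" stays undecoded.
                result.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String source = "var fallback = new Object();\n"
                + "var parameters = 'address=Main+56;zipcode=1234';";
        System.out.println(extract(source));
    }
}
```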

Is HTML Tidy Permanently Turned on in Basic Edition 5.0?

Is Tidy HTML permanently turned on in Screen Scraper basic edition 5.0?

I have just tried making a new scraping session, and even though I have disabled Tidy HTML under Options>Settings, the following line appears when I look at the last response from the server:

Jakob P on 07/15/2010 at 2:39 pm
