screen-scraper public support

Questions and answers regarding the use of screen-scraper. Anyone can post. Monitored occasionally by screen-scraper staff.

Input Method

I was just wondering whether screen-scraper can read input from something other than a CSV file. Can it read from a table in a database, for example?

Jason

Having trouble trying to scrape an ASPX site. Looking for ideas...

I'm trying to scrape an ASPX site, and I'm looking for direction on what to try next. I have already successfully scraped the first and second pages of the site, and I have captured and passed along the Event Validation and View State variables. I now have a page of 500 results, but they are displayed 20 at a time, so I'm trying to scrape the page that is displayed when the "Next 20 results" button is clicked. It keeps returning "11|pageRedirect||/Error.aspx|"
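The response quoted above has the pipe-delimited shape that ASP.NET AJAX uses for partial (UpdatePanel) postbacks: length|type|id|content|, where the number is the length of the content field. Here "/Error.aspx" is 11 characters long, so the server is asking the client to redirect to its error page rather than returning the next 20 results. A minimal sketch of splitting such a response in plain Java follows. The class and method names are made up for illustration and are not part of screen-scraper. It does not explain why the server rejects the request. It only makes the redirect target easy to detect in a script.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: splits an ASP.NET AJAX partial-postback response into its parts. */
class DeltaResponse {

    /** One "length|type|id|content|" record. */
    static final class Part {
        final String type;
        final String id;
        final String content;

        Part(String type, String id, String content) {
            this.type = type;
            this.id = id;
            this.content = content;
        }
    }

    /** Parses as many well-formed records as possible and stops at the first malformed one. */
    static List<Part> parse(String body) {
        List<Part> parts = new ArrayList<>();
        int pos = 0;
        while (pos < body.length()) {
            int lenEnd = body.indexOf('|', pos);
            if (lenEnd < 0) break;
            int typeEnd = body.indexOf('|', lenEnd + 1);
            if (typeEnd < 0) break;
            int idEnd = body.indexOf('|', typeEnd + 1);
            if (idEnd < 0) break;

            int length;
            try {
                length = Integer.parseInt(body.substring(pos, lenEnd));
            } catch (NumberFormatException e) {
                break; // not a delta response (for example, a full HTML page)
            }
            int contentEnd = idEnd + 1 + length;
            if (length < 0 || contentEnd > body.length()) break;

            parts.add(new Part(
                    body.substring(lenEnd + 1, typeEnd),
                    body.substring(typeEnd + 1, idEnd),
                    body.substring(idEnd + 1, contentEnd)));
            pos = contentEnd + 1; // skip the '|' that closes the record
        }
        return parts;
    }

    /** Returns the redirect target if the response contains a pageRedirect record, else null. */
    static String redirectTarget(String body) {
        for (Part p : parse(body)) {
            if ("pageRedirect".equals(p.type)) return p.content;
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(redirectTarget("11|pageRedirect||/Error.aspx|"));
    }
}
```

An error redirect on a paging request often means the server did not accept the posted form as a valid postback. That is only a guess from the response format. Comparing every POST parameter and header of the failing request with what the browser sends when the button is clicked is usually the quickest check.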

Can't edit scripts in Linux, but can in Windows

Hi, I've downloaded the Professional Edition and recently switched from Windows to Linux. I imported the scripts from the Windows version and they run, but I cannot edit them. I can create a new script or delete a script, but I can't edit or rename one.

It appears I have read, write and execute permissions for the directory in which screen-scraper is installed. However, I'm new to Linux, so perhaps there is something I've missed.

Currently I'm managing by editing in a version running under Windows (via VirtualBox), then exporting and importing into the Linux version to run it.

Multiple values extracted by Sub Extractor

Hi,

Is it possible to have a sub extractor extract multiple instances of data in a block of HTML?

I have extracted the following typical HTML, and each block has a similar structure. I want to pull out the city, date and 3rd number from each block; however, my sub extractor only matches the first instance. For each state extracted with the main extractor there are between 1 and 6 cities. Below is the output for NSW with only 2 cities, Sydney and Newcastle.

Thanks........Chris
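The general technique for getting every city out of one state block is to apply the inner pattern repeatedly to the text the main extractor matched, instead of applying it once. The sketch below does this in plain Java with a regular expression. The HTML from the post is not shown here, so the row layout, the dates and the numbers are all invented for illustration. Only the city names Sydney and Newcastle come from the post. It does not use any screen-scraper API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: pulls city, date and third number out of every row in one state block. */
class CityRows {

    // Hypothetical row layout: city, date, then three numeric cells.
    private static final Pattern ROW = Pattern.compile(
            "<tr>\\s*<td>([^<]+)</td>\\s*<td>([^<]+)</td>"
            + "\\s*<td>[^<]*</td>\\s*<td>[^<]*</td>\\s*<td>([^<]+)</td>");

    /** Returns one {city, date, thirdNumber} array per matching row. */
    static List<String[]> extract(String stateBlock) {
        List<String[]> rows = new ArrayList<>();
        Matcher m = ROW.matcher(stateBlock);
        // find() resumes after the previous match, so every row is visited.
        while (m.find()) {
            rows.add(new String[] { m.group(1), m.group(2), m.group(3) });
        }
        return rows;
    }

    public static void main(String[] args) {
        // Invented sample data standing in for the NSW block.
        String nsw =
                "<tr><td>Sydney</td><td>29/11/2009</td><td>4</td><td>7</td><td>12</td></tr>\n"
              + "<tr><td>Newcastle</td><td>29/11/2009</td><td>1</td><td>3</td><td>5</td></tr>";
        for (String[] row : extract(nsw)) {
            System.out.println(String.join(",", row));
        }
    }
}
```

Matching a single time returns only the first city, which is the behaviour described above. Looping until no further match is found returns them all.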

Start with minimized focus

First of all, thank you for such a great product!

I am using the free Basic version on a MythTV PVR. To run the scraping jobs every hour, I have a cron job that triggers the scraping session, but the scraping session only runs if the workbench is running, so I load it at startup. The problem is that the workbench loads last and gets the focus, which moves it into the foreground, in front of the PVR frontend app.

CSV Input File - looping through records and processing one record at a time in scrapeable file

Hi

My scraping session depends on an initial URL or several web pages as input. I've created a CSV as an input file, using the script that reads the data from the CSV. The only challenge I have is that the scrapeable file only kicks off once the entire CSV has been read, so the scrape starts with only the last URL. I have used the CSV input script from your example scripts.
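The symptom described above (only the last URL is scraped) is what happens when a loop reads every row first and the scrape is triggered once after the loop ends. The usual fix is to trigger the scrape inside the loop, once per row. The sketch below shows that shape in plain Java. The scrape step is passed in as a callback so the example runs on its own. The class name, method name and sample URLs are made up for illustration. In a screen-scraper script the callback body would be whatever call sets the URL variable and runs the scrapeable file. I believe that is session.setVariable followed by session.scrapeFile, but please check this against the API documentation for your version.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.function.Consumer;

/** Sketch: hands each URL in a CSV to a scrape step as soon as its row is read. */
class UrlFeeder {

    /**
     * Reads one row at a time, takes the first column as the URL and
     * invokes the scrape step before reading the next row.
     * Returns the number of URLs processed.
     */
    static int forEachUrl(BufferedReader reader, Consumer<String> scrape) {
        int count = 0;
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                String url = line.split(",", 2)[0].trim();
                if (url.isEmpty()) continue; // skip blank rows
                scrape.accept(url);          // runs inside the loop, once per row
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        // Invented sample input; a real script would wrap a FileReader instead.
        String csv = "http://www.example.com/a,first\nhttp://www.example.com/b,second\n";
        int n = forEachUrl(new BufferedReader(new StringReader(csv)),
                url -> System.out.println("scraping " + url));
        System.out.println(n + " URLs processed");
    }
}
```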

Changing the URL of a ScrapeableFile Dynamically

My scenario is kind of similar to the one in http://community.screen-scraper.com/node/1263, in which the URL I post to gets redirected to another URL that contains dynamic entries in its path. If I manually add scrapeable files to the scraping session I'm building using the Proxy Session, the URL path will obviously have the static values that were created at that point in time.

Extracting data and preserving hierarchy information

I'm trying to figure out how to loop through a page structure and still maintain some information about the hierarchy. I have extracted a dataset, and within the dataset are a number of entries that are categorized like this.

Resolved URLs - I don't want it!

Hi all, I've been looking all around for help on the forum to this particular problem. I've found many answers on here in the past, but this one is escaping me and 2 other guys in the room.

Basically, I have directly entered a URL into a scrapeable file.

The URL has 4 brackets (2 pairs) [] [] in the address.

They are part of the parameters for page 2, 3, 4 and so on....

When screen-scraper uses the Resolved URL, no matter what we try, the brackets are replaced by %5B and %5D.
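For reference, %5B and %5D are the percent-encoded forms of [ and ], and the hex digits are case-insensitive, so %5b means the same as %5B. A server that decodes its query string in the usual way should see the same parameter whether or not the brackets are encoded, although a server that compares the raw URL text would not. The sketch below shows the round trip with the standard java.net classes. The parameter name page[] is made up for illustration, and the example says nothing about how screen-scraper builds its Resolved URL.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

/** Sketch: shows that [ and ] round-trip through %5B and %5D. */
class BracketEncoding {

    static String encode(String raw) {
        try {
            return URLEncoder.encode(raw, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String name = "page[]"; // hypothetical parameter name
        String encoded = encode(name);
        System.out.println(encoded);
        System.out.println(decode(encoded));
        System.out.println(decode("page%5b%5d")); // lower-case hex decodes the same way
    }
}
```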

write current date

This is a very simple question for you, but not for me...
I need to write the current date to a CSV file. How can I do that?
out.write(???????????)
Thanks a lot for your answer.

Fabio

PS: Ideally the script would save the CSV file as "current date.csv", like 20091129.
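One way to get a date in the 20091129 form is java.text.SimpleDateFormat with the pattern "yyyyMMdd". The same string can then serve as the value passed to out.write and as the file name. The sketch below assumes that out is a java.io.FileWriter (the post does not say what it is) and uses made-up class and method names. It writes into the current working directory.

```java
import java.io.FileWriter;
import java.io.IOException;
import java.text.SimpleDateFormat;
import java.util.Date;

/** Sketch: writes the current date into a CSV file named after that date. */
class DatedCsv {

    /** Formats a date as yyyyMMdd, for example 20091129. */
    static String stamp(Date when) {
        return new SimpleDateFormat("yyyyMMdd").format(when);
    }

    /** Builds a file name such as 20091129.csv. */
    static String fileName(Date when) {
        return stamp(when) + ".csv";
    }

    public static void main(String[] args) throws IOException {
        Date now = new Date();
        // Assumption: "out" is a FileWriter; try-with-resources closes it.
        try (FileWriter out = new FileWriter(fileName(now))) {
            out.write(stamp(now) + "\n");
        }
    }
}
```

SimpleDateFormat uses the machine's default time zone, so the date rolls over at local midnight.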