screen-scraper public support
Why is Tidy HTML Failing?
I have created a series of scrapeable files with extractor patterns that depend upon Tidy HTML successfully cleaning responses. A recursive pattern is being used to obtain sub category data that appears differently as subsequent sub categories are obtained. As such, re-writing my logic to handle responses which have not been Tidy'd successfully isn't an option.
Multiple fields in a data record
Hi,
I've been using SS for a while now (thanks for a great tool) but I'm having a bit of difficulty extracting data on multiple sizes. For example, here I've got a page with multiple rows, each row containing a table (example below) which I extract into a datarecord. In this example table, I need to extract the names of the comics from the table cell below the one labelled comics. In this instance we have Adam Bloom, Ben Norris and John Fothergill (MC), but in the next table I may have 1, 2, 3, 4 or 5 comics, or in the second example 6. Any ideas?
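One way to think about a variable number of names is a two-pass extraction: first capture the whole cell under the "comics" label, then split the captured text into individual names. The sketch below shows the idea in plain Python, not in screen-scraper's own scripting API. The HTML layout (names separated by `<br>` tags) and the names `SAMPLE_ROW`, `CELL_PATTERN` and `extract_comics` are assumptions, since the example table from the post is not shown here.

```python
import re

# Hypothetical markup: an assumed layout, the real table may differ.
SAMPLE_ROW = """
<table>
  <tr><td>Comics</td></tr>
  <tr><td>Adam Bloom<br>Ben Norris<br>John Fothergill (MC)</td></tr>
</table>
"""

# Pass 1: capture everything inside the cell that follows the "Comics" label.
CELL_PATTERN = re.compile(
    r"<td>\s*Comics\s*</td>\s*</tr>\s*<tr>\s*<td>(.*?)</td>",
    re.IGNORECASE | re.DOTALL,
)


def extract_comics(html):
    """Return the list of comic names found in one row's table."""
    match = CELL_PATTERN.search(html)
    if match is None:
        return []
    # Pass 2: split the captured cell on <br> tags, however many there are.
    parts = re.split(r"<br\s*/?>", match.group(1), flags=re.IGNORECASE)
    return [name.strip() for name in parts if name.strip()]


print(extract_comics(SAMPLE_ROW))
```

Because the second pass splits rather than matching a fixed number of fields, the same code handles one name or six without changes.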
I installed SS and now many internet sites cannot be reached?
I installed SS last night on my WinXP Pro machine and worked through the first 2 tutorials without problems. Thought "Wow! This is just what I need!" This morning though I was unable to access several sites (Yahoo, etc.). It seems that SS made modifications to my internet connection that I cannot undo. Some sites work and some time out. There doesn't seem to be any rhyme or reason, but it is consistent. It's as if the SS proxy server installed some weird layer that is blocking certain IP addresses. It has definitely slowed down my access time to others as well.
404 Error when using downloadFile()
I am trying to scrape images from a website. Normally the image is seen through a Java applet. According to the transaction logs, the site takes in several GET parameters including an image ID, then a .jar file (presumably the applet) is downloaded along with a few other files, and finally the image is downloaded. Visually, I see the applet load, and then the image appears in the applet. When I try to download the image directly, however, without first visiting the page with the applet, an HTTP 404 error is returned.
Ability to Scrape this Site
I am having difficulty obtaining the URL required to identify the products, as the part URL is called using Java in the HTML code. I have identified the unique ID of the product but can't see how to use that to identify all the products and display the results page to get the further information.
Is it still possible to scrape the following site, or is there something I am missing?
Installing Professional Edition in Linux
Hi,
I have tried to install and launch the Professional Edition trial pack on a Linux machine.
It installed in /application/screenscraper.
On my machine, Java is installed in /software/java/jdk1.5.0_12.
Well, now I can start the server with the ./server start command.
But when I try to launch screen-scraper with ./screen-scraper (in the install location, /application/screenscraper), it gives the following error.
Grabbing Incremental Session Variable Changes
Hi everyone, I'm a newbie to Screen Scraper, though not to the concept and I have to say it's the best product out there I've found and the community support I've seen in the forums is really good too!
I'm struggling to get some basic information though: I need to grab and use a couple of session variables during a scrape, but I can't work out how to reference them in VB. I'll explain the two things I'm having difficulty doing:
1st problem: Grabbing all results up to 10 pages where there are fewer than 10 pages
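The usual shape of this kind of loop is: keep requesting pages until either the cap of 10 is reached or the current page reports that there is no next page. The sketch below is plain Python rather than VB or screen-scraper's own API, and `fetch_page`, `make_fake_site` and `scrape_all` are hypothetical names used only for illustration.

```python
MAX_PAGES = 10  # upper limit taken from the question


def scrape_all(fetch_page, max_pages=MAX_PAGES):
    """Collect results page by page, stopping early when pages run out.

    fetch_page(n) is a hypothetical callable returning (results, has_next).
    """
    collected = []
    page = 1
    while page <= max_pages:
        results, has_next = fetch_page(page)
        collected.extend(results)
        if not has_next:
            break  # fewer pages exist than the cap allows
        page += 1
    return collected


def make_fake_site(total_pages):
    """Build a stand-in fetch_page for a site with total_pages pages."""
    def fetch_page(n):
        return ["result-%d" % n], n < total_pages
    return fetch_page


print(len(scrape_all(make_fake_site(3))))   # site has only 3 pages
print(len(scrape_all(make_fake_site(25))))  # site has more than 10 pages
```

In a real scrape, the "has next page" flag would typically come from whether a next-page link was matched on the current page.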
SS Ability - Help needed
Hello, I am in need of a proof of concept to find out if SS can fulfill a need for data capture. I need to navigate to a site, submit form data (from a file or query) and capture the results page, either by parsing the data (to a SQL DB) or by saving the HTML to parse later.
Odd behaviour in SS, hanging, delete_me file, etc...
Greetings, I've been using Screen Scraper for almost a year now (ver 3.0) and I've never had much of a problem with the SS application itself. Occasional weird behavior, which does sometimes happen, has always been fixable with a restart of SS.
To Ignore particular strings using RegEX
Hi everybody,
I have a site to scrape where I am using a token to scrape the required ads.
In some records the data present are
the token value
Please help me write a regular expression in order to get all the required data except
Thank you,
Vivek.
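A common regex technique for "match everything except certain strings" is a negative lookahead. The sketch below shows it in plain Python; the values in `UNWANTED` are hypothetical placeholders, because the strings to be ignored are not shown in the post. Java's `java.util.regex` supports the same `(?!...)` syntax, but whether a screen-scraper token accepts it in this form is an assumption.

```python
import re

# Hypothetical unwanted values; the post does not show the real ones.
UNWANTED = ["N/A", "Not specified"]

# (?!...) is a negative lookahead: the match fails when the whole value
# is exactly one of the unwanted strings.
PATTERN = re.compile(
    r"^(?!(?:" + "|".join(re.escape(s) for s in UNWANTED) + r")$).+$"
)


def keep_wanted(values):
    """Return only the values that are not one of the unwanted strings."""
    return [v for v in values if PATTERN.match(v)]


print(keep_wanted(["2 bed flat", "N/A", "Not specified", "N/A parking"]))
```

The `$` inside the lookahead means only exact matches are rejected, so a value that merely starts with an unwanted string is still kept.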