screen-scraper support for licensed users
Nested error: java.lang.NullPointerException while importing an sss file
Hi,
I'm on version 5.0.21a and noticed that I cannot import sss files any longer because I get a "Nested error: java.lang.NullPointerException" message. This is very bad because now I cannot deploy changed scrapes.
Could you please reply to this as soon as possible? Thank you in advance.
Kind regards,
Edgar
Multithreading: optimal sessions/threads mix
I have a setup that should allow multiple users to scrape simultaneously, and I was wondering how best to configure it.
Should I install more instances running fewer threads each, or fewer instances running more threads each? Does the answer depend on the number of CPU cores?
Improve scrape speed
Hi,
I set up a scraping session that parses a relatively simple web page and returns about 100 rows of XML content: 20 nodes of 4 elements, so about 20 pattern matches. I call the scrape from C# using a RemoteScrapingSession object. I debugged the code and noticed that running the scrape command on this object takes about 10-12 s. The machine is a 4 quad 3 GHz. I'm in the EU and scraping a US site.
What can I do to improve the scrape time and retrieve the data faster for the user?
removing session from GUI-less OS
Hey guys,
One more question about Linux: how can I remove a session on a GUI-less OS? Is using SOAP the only way?
response content type
Hello,
I have a scraper that is spidering links that are discovered on arbitrary web sites. I try to filter out obvious URLs that I don't want to spider, e.g. ones that end in .doc, .pdf, etc. However, it is sometimes unavoidable: I still hit the occasional binary file, and screen-scraper tries to scrape it.
Is there a way to tell screen-scraper to fail fast if the content type of the response is something other than "text/xxx"?
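A minimal sketch of the kind of check being asked about, written in plain Java rather than against any screen-scraper API. The class and method names (`ContentTypeCheck`, `isTextContentType`, `fetchContentType`) are hypothetical, and whether such a helper can be hooked into a scraping session before the response body is processed is an assumption. It issues a HEAD request to read the Content-Type header and only treats `text/*` responses as scrapeable.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.Locale;

// Hypothetical helper, not part of screen-scraper.
final class ContentTypeCheck {

    // Returns true when a Content-Type header value names a text/* media type.
    // Parameters such as "; charset=UTF-8" are ignored. A missing header is
    // treated as non-text, which is an assumption: some servers omit it.
    static boolean isTextContentType(String headerValue) {
        if (headerValue == null) {
            return false;
        }
        int semicolon = headerValue.indexOf(';');
        String mediaType = (semicolon >= 0 ? headerValue.substring(0, semicolon) : headerValue)
                .trim()
                .toLowerCase(Locale.ROOT);
        return mediaType.startsWith("text/");
    }

    // Asks the server for headers only (HEAD), so a large binary body is never
    // downloaded. Works for http/https URLs only; may return null.
    static String fetchContentType(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("HEAD");
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(5000);
        try {
            return conn.getContentType();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        // Offline demonstration of the header check on sample values.
        String[] samples = {"text/html; charset=UTF-8", "application/pdf", null};
        for (String sample : samples) {
            System.out.println(sample + " -> " + isTextContentType(sample));
        }
    }
}
```

A spider could call `fetchContentType` on each discovered link and skip it when `isTextContentType` returns false. This costs one extra request per link, and some servers answer HEAD requests incorrectly, so the extension filter is still worth keeping as a first pass.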
5.0.19a bug?
I just updated my Professional license to 5.0.19a and I am having a strange problem. When I double-click on Extractor Pattern tokens, in some instances it highlights part of the token and then random characters to the right of the token, instead of bringing up the Editing Token screen.
Am I the only one having this issue?
Andrew
Enabling javascript in a scrape
Hi
The issue we face while scraping a website is that when we capture the session through Firefox, we get the correct, expected response. However, when we replay the captured scrapeable file, we get an HTML page as the response stating
How can we enable JavaScript in screen-scraper, just as we enable JS in IE or FF?
Sub-Sub Extractors?
I am having some issues figuring out the best way to do this. I am scraping content that is essentially in a table laid out similar to this.
Circuit City (Level 1)
  Plasma (Level 2)
    50" (Level 3)
    46" (Level 3)
    42" (Level 3)
  LCD (Level 2)
    55" (Level 3)
    37" (Level 3)
Best Buy (Level 1)
  Plasma (Level 2)
    42" (Level 3)
    50" (Level 3)
    58" (Level 3)
I can get a pattern to match Level 1 and then the first Level 2 under that section, but that's it.
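One way to think about the three-level layout above, sketched in plain Java and independent of how the extractor patterns themselves are written: if every row can be matched individually and tagged with its level, a single pass in document order can rebuild the nesting by remembering the most recent Level 1 and Level 2 rows. The `LevelGrouping` and `Row` names are hypothetical, and the assumption that each row's level can be detected is not something the post confirms.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of screen-scraper.
final class LevelGrouping {

    // One flat row in document order; level is 1, 2 or 3.
    static final class Row {
        final int level;
        final String text;

        Row(int level, String text) {
            this.level = level;
            this.text = text;
        }
    }

    // Rebuilds Level 1 -> Level 2 -> list of Level 3 from the flat rows.
    // Repeated Level 1 names (or Level 2 names within one Level 1) overwrite
    // the earlier entry; that is acceptable for a sketch.
    static Map<String, Map<String, List<String>>> group(List<Row> rows) {
        Map<String, Map<String, List<String>>> tree = new LinkedHashMap<>();
        Map<String, List<String>> currentLevel1 = null;
        List<String> currentLevel2 = null;
        for (Row row : rows) {
            switch (row.level) {
                case 1:
                    currentLevel1 = new LinkedHashMap<>();
                    tree.put(row.text, currentLevel1);
                    currentLevel2 = null; // a new Level 1 closes the open Level 2
                    break;
                case 2:
                    if (currentLevel1 == null) {
                        throw new IllegalArgumentException("Level 2 before any Level 1: " + row.text);
                    }
                    currentLevel2 = new ArrayList<>();
                    currentLevel1.put(row.text, currentLevel2);
                    break;
                case 3:
                    if (currentLevel2 == null) {
                        throw new IllegalArgumentException("Level 3 before any Level 2: " + row.text);
                    }
                    currentLevel2.add(row.text);
                    break;
                default:
                    throw new IllegalArgumentException("Unknown level: " + row.level);
            }
        }
        return tree;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row(1, "Circuit City"), new Row(2, "Plasma"), new Row(3, "50\""),
                new Row(2, "LCD"), new Row(3, "55\""));
        System.out.println(group(rows));
    }
}
```

The design choice here is to avoid one large pattern that must match a whole Level 1 block at once, and instead match small rows and let the ordering carry the structure.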
linux multi instances
Hi Guys,
While I'm trying to figure out multi-threading, I decided to install screen-scraper on a Linux machine.
After a good few hours, screen-scraper was installed on Ubuntu 10.04, but...
I'm totally new to Linux, and I'm wondering how to start more than one screen-scraper server. Do I have to install screen-scraper several times? On Windows I had 10 instances and was able to configure the ports for each server. Can someone tell me how to do that on Linux?
Regards,
Radek
Multithreading same scraping session
Hi,
We have problems with multithreaded runs of the same scraping session.
We have a script that creates an xmlWriter and stores it in the session. After this initialization script, a scrape is run that gathers data from a site.
After the scrape is run, another script gets the data from the scrapeSession object and writes it to a file using the xmlWriter.
I then ran the same scraping session multithreaded (10 to 20 threads). Often the result files contained text indicating that the data had been written by several threads.
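Output mixed from several threads is the usual symptom of a writer or output file that ends up shared between concurrent runs. Whether that is what happens in this setup is an assumption. The sketch below is plain Java, not screen-scraper code, and `PerRunWriterSketch`, `runOnce`, and `runAll` are hypothetical names. It shows the general remedy: each run opens its own writer on its own uniquely named file, so nothing is shared between threads.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for concurrent runs of one scraping session.
final class PerRunWriterSketch {

    // One run: opens its own writer on a file named after the run id,
    // writes its rows, and closes the writer before returning.
    static Path runOnce(Path outputDir, int runId, int rows) throws IOException {
        Path file = outputDir.resolve("run-" + runId + ".xml");
        try (BufferedWriter writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            writer.write("<rows run=\"" + runId + "\">");
            writer.newLine();
            for (int i = 0; i < rows; i++) {
                writer.write("  <row run=\"" + runId + "\" index=\"" + i + "\"/>");
                writer.newLine();
            }
            writer.write("</rows>");
            writer.newLine();
        }
        return file;
    }

    // Starts `runs` concurrent runs on a fixed pool and waits for all of them.
    static List<Path> runAll(Path outputDir, int runs, int rows) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(runs);
        try {
            List<Future<Path>> futures = new ArrayList<>();
            for (int id = 0; id < runs; id++) {
                final int runId = id;
                Callable<Path> task = () -> runOnce(outputDir, runId, rows);
                futures.add(pool.submit(task));
            }
            List<Path> files = new ArrayList<>();
            for (Future<Path> future : futures) {
                files.add(future.get()); // rethrows any failure from the run
            }
            return files;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("scrape-out");
        for (Path file : runAll(dir, 10, 5)) {
            System.out.println(file + " -> " + Files.readAllLines(file).size() + " lines");
        }
    }
}
```

If all runs must end up in a single file, the alternative is to keep one writer and serialize access to it (for example with a `synchronized` block around each complete record), at the cost of threads waiting on each other while writing.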