screen-scraper support for licensed users

Questions and answers regarding the use of screen-scraper. Only licensed Professional and Enterprise Edition users can post; anyone can read. Licensed users please contact support with your registered email address for access. This forum is monitored closely by screen-scraper staff. Posts are generally responded to in one business day.

Anyone here an experienced Reddit scraper?

I am starting a large project there, scraping posts for word usage and looking for correlations between seemingly unrelated subreddits. Does anyone here have advice about scraping the site? I anticipate this project will run a few months, if not a year.

My main concern is avoiding detection. Of course I am going to use dummy accounts, but what is the best practice for bandwidth limiting? I don't actually need to crawl the whole site, so I don't need to crawl fast; I just need to do it efficiently.
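
For what it's worth, the kind of throttling I have in mind is just a small script run "Before file is scraped" that sleeps for a randomized interval; the numbers below are guesses on my part, not anything recommended by screen-scraper:

// Run "Before file is scraped": wait a randomized 5-15 seconds so
// requests don't hit the site at a fixed, detectable cadence.
long delayMs = 5000 + (long)( Math.random() * 10000 );
session.log( "Pausing " + delayMs + " ms before the next request" );
Thread.sleep( delayMs );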

Which modern-day browser works best with Workbench's Proxy Server

Running a proxy server in the workbench and using a browser like Chrome to view the pages being retrieved does not work well with HTTPS sites. The browser gives the usual warning about SSL certificates (since the workbench is acting as a proxy), but if you choose to ignore it and proceed, Chrome returns an ERR_SSL_PROTOCOL_ERROR message.

How do I use the workbench proxy with a browser to scrape these sites?

Invoking pattern after each pattern match of another pattern

My question is exactly the same as posted here: https://support.screen-scraper.com/node/1695

I'm trying to use the first of those methods: invoking a script "after each pattern match" of the level-1 session variable OPTION1, using a script that contains the following:

// Invoked "after each pattern match" of OPTION1: re-apply the
// "Text Variants" extractor pattern to the text that OPTION1 matched.
levelOneString = dataRecord.get( "OPTION1" );
scrapeableFile.extractData( levelOneString, "Text Variants" );

This script is indeed processed "after each pattern match" of "OPTION1"; however, the extractor pattern "Text Variants" fails to find any matches for the session variable "OPTION2".
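
A debugging loop along these lines is roughly what I expected to be able to run over the result, assuming the DataSet returned by extractData exposes getNumDataRecords() and getDataRecord() as described in the API documentation ("OPTION2" is just the token name inside my own "Text Variants" pattern):

// Capture the DataSet from the call above and log whatever it finds.
textVariants = scrapeableFile.extractData( levelOneString, "Text Variants" );
session.log( "Text Variants matched " + textVariants.getNumDataRecords() + " record(s)" );
for( int i = 0; i < textVariants.getNumDataRecords(); i++ )
{
    record = textVariants.getDataRecord( i );
    session.log( "OPTION2: " + record.get( "OPTION2" ) );
}

If the pattern were matching, I'd expect that loop to log one record per variant.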

Not getting the data for the next page

Hello,

I can't get the data after changing the page number.

For example:
https://www.curreycodealers.com/searchadv.aspx?searchterm=Product%20Search

You can see all of the items under this page, but when you click on page 2 you do not get the page 2 items; instead, you still get the same items that are on page 1.

I have added parameters such as:
__EVENTTARGET
__EVENTARGUMENT
__VIEWSTATE
IsSubmit
pagenum
pagesize
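
For reference, this is roughly how I'm setting them in a script run before the results page is scraped; the control IDs and values below are placeholders rather than the site's real ones, and __VIEWSTATE itself is captured from the page-1 response with an extractor pattern and saved to a session variable rather than hard-coded:

// Run before the results page is scraped: set the ASP.NET postback fields
// as session variables so the page's POST parameters can reference them
// as ~#EVENTTARGET#~, ~#PAGENUM#~, and so on (placeholder values only).
session.setVariable( "EVENTTARGET", "ctl00$SomePagerControl" );  // placeholder control ID
session.setVariable( "EVENTARGUMENT", "" );
session.setVariable( "ISSUBMIT", "true" );
session.setVariable( "PAGENUM", "2" );
session.setVariable( "PAGESIZE", "24" );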

Please help me out.

Thanks

Does screen-scraper support IPv6?

I apologize for the rather broad question. Our organization is preparing for IPv6 by assessing our solution portfolio. Does screen-scraper support IPv6? Can we configure IPv6 addresses in our configuration properties and in our scraping sessions?

Output to Google Sheets?

I would like to schedule something in screen-scraper Server to run automatically and auto-publish the results on the web, then use Google Sheets' IMPORTHTML, IMPORTXML, or IMPORTDATA to pull the data into a Google Sheet.

Currently, my screen-scraper sessions export to a CSV file. However, Google Sheets requires that the data be published on the web in order to pull it in automatically (for obvious reasons).
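
The workaround I'm considering is a script run "After scraping session ends" that copies the exported CSV into a directory served by a web server, so the sheet can fetch it by URL; the paths below are assumptions about my own setup, not anything screen-scraper provides:

// Run "After scraping session ends": copy the exported CSV into a
// web-served directory so Google Sheets can pull it by URL.
import java.io.*;

in = new FileInputStream( "output/results.csv" );           // where my session writes its CSV (assumed)
out = new FileOutputStream( "/var/www/html/results.csv" );  // document root of a local web server (assumed)
byte[] buffer = new byte[8192];
int read;
while( ( read = in.read( buffer ) ) != -1 )
{
    out.write( buffer, 0, read );
}
in.close();
out.close();
session.log( "Published CSV for Google Sheets to pick up" );

A formula like =IMPORTDATA( "http://my-server/results.csv" ) should then pull the file into the sheet on Google's normal refresh schedule.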

Upgrading Enterprise screen-scraper from 5.5 to 7.0: export/import of sessions works the first time, but they're gone after a restart

It looks like all of the scraping sessions disappear when I copy the resource/db folder back after the 7.0 installation.

I would guess this is due to an incompatibility between versions.

Is there any trick I can use without having to rebuild them all?

When I export and import the sessions and then ignore the updates for scripts, they show up initially, but after restarting screen-scraper they're gone again.

Any ideas?

Facing an issue with the Northampton council site

I am trying to capture the URL below:

http://planning.northamptonboroughcouncil.com/planning/search-applications

I have captured the pages, but when I run the session and then check the response in a browser, I get this result:

{"KeyObjects":[],"TotalRows":0}

Also, the search page displayed in the browser shows:

An error has occured retrieving the list of search criteria.

Transport error calling server. . error

Planning Applications
Loading...

Searching

I am using screen-scraper version 7.0a.

Kindly help me with this.

Linux command-line scrape not finding the scraping session

I have installed screen-scraper Enterprise on a CentOS Linux server, but I can't get screen-scraper to recognize the scraping session. The session was developed on a Mac and exported with the same name as the one I am trying to use on the Linux command line. The .sss file is in the screen-scraper installation directory. I have tried running with and without the -w option, with no luck. I have examined the XML and the "name" element matches the file name. Where is screen-scraper looking for the .sss file? I would be happy to put it there.

Selecting a proxy session causes the UI to freeze

If I select a proxy session, the Professional edition interface freezes. I'm on the latest beta, but it also happened with the latest released version, and with version 6.0 as well.

It doesn't happen with all proxy sessions, and I can't figure out the cause. I have been able to determine that if I create a proxy session, open a few pages in Firefox, then close and reopen screen-scraper, the proxy session may freeze the UI when selected. This only happens after a close and reopen; if I select something else and then go back to the same proxy session before closing, it does not freeze.