screen-scraper support for licensed users
Error Opening PDF after Download
I am downloading a PDF: https://campaigns.documatix.com/campaign/imagelib/Reun2/branch100.pdf
If I go to the site and save it manually, I can open it.
When I use screen-scraper, it saves a file, but when I try to open that file
I get an error saying it's in the wrong format.
This is what I am running:
session.downloadFile("https://campaigns.documatix.com/campaign/imagelib/Reun2/branch100.pdf", session.getVariable("DOWNLOAD_PATH"));
Thanks.
Bart
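One way to narrow this down is to check whether the saved file is a PDF at all. A PDF normally begins with the bytes %PDF-, so if the downloaded file starts with something else (for example an HTML error or login page), the problem lies in what was written to disk rather than in the PDF viewer. Below is a minimal standalone Java sketch of that check. The class and method names are made up for illustration, and it does not use any screen-scraper API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class PdfHeaderCheck {
    // A well-formed PDF normally begins with the ASCII bytes "%PDF-".
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the given bytes start with the PDF signature. */
    public static boolean startsWithPdfMagic(byte[] head) {
        if (head == null || head.length < PDF_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < PDF_MAGIC.length; i++) {
            if (head[i] != PDF_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    /** Reads the first few bytes of a file and checks them against the signature. */
    public static boolean looksLikePdf(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return startsWithPdfMagic(in.readNBytes(PDF_MAGIC.length));
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java PdfHeaderCheck <downloaded-file>");
            return;
        }
        Path file = Path.of(args[0]);
        System.out.println(file + (looksLikePdf(file)
                ? " starts with %PDF-"
                : " does not start with %PDF-"));
    }
}
```

If the check fails, opening the saved file in a text editor usually shows what was actually downloaded.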
Not able to decipher the encoded date parameters
When a from date and a to date are selected, both dates are encrypted and sent to the council's site as parameters. We are trying to decipher this encoding so that we can send correctly encoded from and to dates as dynamic parameters. Can you please help us do so?
For reference, the URL is:
http://www.ashford.gov.uk/online_planning/DCCore/default.aspx?newsearch=true
Go to Planning Applications >> select the Date tab
How to delete a scrape session from database via command-line
What is the command-line equivalent of deleting a scrape session from the screen-scraper database? In the GUI, you just right-click the session you want and choose "delete" from the pop-up menu. How do you do the same thing from the command line on Linux?
XML file with Shift_JIS character set
Hi Guys,
I am scraping a website that uses the Shift_JIS character set and I am receiving data, but I have what is probably a simple problem to fix. In the screen-scraper log I get this result:
//_LINK_: KN2300060600394539
//ENTITY_NAME: KOKO・TOYOTA
//ENTITY_ADDRESS: 〒471-0034 愛知県豊田市小坂本町4丁目1−4
//PHONE: 0838-26-5200
//_LINK_: KN3500060700059393
//ENTITY_NAME: ファミリーtoyota
//ENTITY_ADDRESS: 〒758-0011 山口県萩市大字椿東無田ケ原2884−1
//PHONE: 0120-060861
//_LINK_: KN2307011300001766
//ENTITY_NAME: トヨタすまいるライフ株式会社/レジデンス・THE・TOYOTAマンションパビリオン
//ENTITY_ADDRESS: 〒471-0878 愛知県豊田市下林町1丁目3−3−1501
//PHONE: 0565-37-8567
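The post is cut off before the actual problem is stated, so the following is only a guess at a common cause: text decoded or written with a character set other than the one the data really uses. As a minimal standalone Java sketch (not screen-scraper API; the class and method names are made up for illustration), this shows decoding bytes with one explicit charset and re-encoding them with another, for example Shift_JIS to UTF-8. It assumes the Java runtime ships the Shift_JIS charset, which full JDK and JRE distributions normally do.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ShiftJisTranscode {
    /**
     * Decodes the input bytes using the source charset, then re-encodes
     * the resulting text using the target charset.
     */
    public static byte[] transcode(byte[] input, Charset from, Charset to) {
        String text = new String(input, from);
        return text.getBytes(to);
    }

    public static void main(String[] args) {
        // Assumption: this runtime supports Shift_JIS.
        if (!Charset.isSupported("Shift_JIS")) {
            System.err.println("Shift_JIS is not available in this Java runtime");
            return;
        }
        Charset shiftJis = Charset.forName("Shift_JIS");

        // Three katakana characters (the word "Toyota"), written as Unicode escapes.
        String sample = "\u30c8\u30e8\u30bf";

        byte[] sjisBytes = sample.getBytes(shiftJis);
        byte[] utf8Bytes = transcode(sjisBytes, shiftJis, StandardCharsets.UTF_8);

        System.out.println("Shift_JIS bytes: " + sjisBytes.length);
        System.out.println("UTF-8 bytes:     " + utf8Bytes.length);
        System.out.println("Round trip ok:   "
                + sample.equals(new String(utf8Bytes, StandardCharsets.UTF_8)));
    }
}
```

When the output is an XML file, the encoding named in the XML declaration has to match the charset actually used to write the bytes.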
Pretty post data
On this site the pretty post data is hidden internally, so we cannot get the parameters through the screen-scraper tool, although the raw post data does contain them. The URL below is one example of this type:
http://www.centralbedfordshire.gov.uk/PLANTECH/DCWebPages/acolnetcgi.gov?ACTION=UNWRAP&RIPNAME=Root.pgesearch
Please help us find out where on the site these parameters get passed internally.
How to get jsessionId from request header to set it into the whole session
Request Header Value
Cookie JSESSIONID=5C2B1867AB978DED75E17566E5D7CA6C
The value of the JSESSIONID changes randomly on the site. How do we capture it through the screen-scraper tool? We tried setting this cookie in the session, but because its value keeps changing, the session fails. Please help us capture this JSESSIONID.
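As a rough illustration of the capture step, here is a minimal standalone Java sketch that pulls the JSESSIONID value out of a Cookie or Set-Cookie header string with a regular expression. It is plain Java, not screen-scraper API: the class and method names are made up for illustration, and how the header text is obtained and how the value is stored back into the session are left out.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class JsessionIdExtractor {
    // Matches "JSESSIONID=<value>", where the value runs up to the next
    // semicolon or whitespace. The cookie name is matched case-insensitively.
    private static final Pattern JSESSIONID =
            Pattern.compile("(?i)\\bJSESSIONID=([^;\\s]+)");

    /** Returns the JSESSIONID value found in the header text, if any. */
    public static Optional<String> extract(String headerValue) {
        if (headerValue == null) {
            return Optional.empty();
        }
        Matcher m = JSESSIONID.matcher(headerValue);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Sample header value taken from the post above, with typical
        // cookie attributes appended for illustration.
        String header = "JSESSIONID=5C2B1867AB978DED75E17566E5D7CA6C; Path=/; HttpOnly";
        System.out.println(extract(header).orElse("(no JSESSIONID found)"));
        // prints 5C2B1867AB978DED75E17566E5D7CA6C
    }
}
```

Because the value changes, it has to be extracted again from each new response rather than hard-coded.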
Memory Problem
Hi,
I'm running into a memory problem with one of my scrapes.
I scraped this site 3 months ago and it worked fine.
Now when I scrape it, memory usage shoots up really high after a couple of minutes.
My specs:
Win 7 ultimate x64
intel i5 760@260ghz
4gb ram
1024mb memory allocation in screen scraper settings
Things I've tried:
Stripped the Write Details script down to log only the SKU.
Turned tidy off.
Limited the size of the details page to 1.5 MB (it's around 2.5 MB originally) via scrapeableFile.setMaxResponse();
Ran it on a different computer; same result.
reformatDate question
Hi -
Can someone comment on the extent to which the reformatDate utility implements Sun's SimpleDateFormat class? It works well for reformatting dates that consist of just numbers, like 04/19/2011, but it doesn't correctly parse Apr-19-2011. Should it? Or is this beyond the scope of the utility?
Thanks.
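For comparison, this is how java.text.SimpleDateFormat itself handles both inputs when given a matching pattern and an English locale. It is a minimal standalone sketch: the class and method names are made up for illustration, and it says nothing about how the reformatDate utility is implemented internally. One assumption worth checking is the locale, since the MMM pattern only parses "Apr" when month names are in English.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;

public class ReformatDateSketch {
    /**
     * Parses a date string with one SimpleDateFormat pattern and formats it
     * with another. Locale.ENGLISH is used so that month abbreviations such
     * as "Apr" are recognised regardless of the machine's default locale.
     */
    public static String reformat(String date, String inPattern, String outPattern) {
        SimpleDateFormat in = new SimpleDateFormat(inPattern, Locale.ENGLISH);
        in.setLenient(false); // reject impossible dates instead of rolling them over
        SimpleDateFormat out = new SimpleDateFormat(outPattern, Locale.ENGLISH);
        try {
            return out.format(in.parse(date));
        } catch (ParseException e) {
            throw new IllegalArgumentException(
                    "Cannot parse '" + date + "' with pattern '" + inPattern + "'", e);
        }
    }

    public static void main(String[] args) {
        // Numeric date, as in the post.
        System.out.println(reformat("04/19/2011", "MM/dd/yyyy", "yyyy-MM-dd")); // prints 2011-04-19
        // Abbreviated month name: needs MMM rather than MM in the input pattern.
        System.out.println(reformat("Apr-19-2011", "MMM-dd-yyyy", "yyyy-MM-dd")); // prints 2011-04-19
    }
}
```

If Apr-19-2011 parses here with MMM-dd-yyyy but not through the utility, the difference is likely in the pattern or locale being passed rather than in SimpleDateFormat.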
https assistance
I'm doing some work with the Android marketplace, and all the pages are HTTPS. When I look at the progress tab, there aren't any scrapeable pages, although you can view the source in the browser and it's fine. How can I get past this issue, please?
Removing session variables in latest version
I recently upgraded my version of SS and a key feature no longer works as it used to. To summarise, I used to reset session variables to blank by overwriting each one with something that was always null. Since the recent update this no longer removes the session variables, because the null values are now ignored.
The best way to describe this is through an example:
Category -> Subcategory -> Child category.