Tor & Polipo processes left running in memory
Hello.
I am successfully using Tor and Polipo from screen-scraper (with the Java library that you guys kindly provided in another post).
My only nuisance is that my "Shutdown Tor & Polipo" script is set to run "After the scraping session ends". While I am developing, every time I force-stop the scraping session, the shutdown script does not get to run, so I end up with one "Tor" and one "Polipo" process left running in memory. The count increases every time I force-stop the session (or it stops due to an error). Later I have to manually quit each of those processes from the Mac OS X Activity Monitor.
Having the "Shutdown Tor & Polipo" script run "Before the scraping session begins" solves this, but then I get to the scrapes without Tor & Polipo running :-(
I will eventually export this scraping session to a Linux EC2 server where it will run on a schedule, so I am worried about this happening over there, where I will have less control over it.
Can you think of a way to avoid this?
Thank you very much,
Boga
In the newest versions of screen-scraper, there is another option for when scripts run on the scraping session: "always at the end". If you set the stop-Tor script to run "always at the end", it should still run if you stop the scrape early.
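For the development case, where stray processes have already piled up, a small standalone cleanup can save the trip to Activity Monitor. The following is only a sketch, not part of screen-scraper or its Tor library: the class name `StrayProcessCleaner` and its methods are hypothetical, it assumes Java 9 or later (for `ProcessHandle`), and it assumes the executables are literally named `tor` and `polipo`. Note that it matches by name only, so on a machine where Tor and Polipo run as system services it would ask those to exit as well.

```java
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

public class StrayProcessCleaner {

    // True when the last path component of the executable equals the given name.
    // Assumes '/' as the path separator (Mac OS X and Linux).
    static boolean isNamed(Optional<String> command, String name) {
        return command
                .map(path -> path.substring(path.lastIndexOf('/') + 1))
                .map(file -> file.equalsIgnoreCase(name))
                .orElse(false);
    }

    // Lists the processes visible to this JVM whose executable has the given name.
    static List<ProcessHandle> findByName(String name) {
        return ProcessHandle.allProcesses()
                .filter(p -> isNamed(p.info().command(), name))
                .collect(Collectors.toList());
    }

    // Requests normal termination of every match and returns how many were asked.
    static int terminateAll(String name) {
        List<ProcessHandle> matches = findByName(name);
        matches.forEach(ProcessHandle::destroy);
        return matches.size();
    }

    public static void main(String[] args) {
        // Assumed executable names; adjust to whatever your installation uses.
        for (String name : new String[] {"tor", "polipo"}) {
            System.out.println(name + ": asked " + terminateAll(name) + " process(es) to exit");
        }
    }
}
```

Run it by hand after a force-stopped session. The "always at the end" setting remains the proper fix for normal runs.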
awesome :-)
thank you,
Boga.
One question regarding Polipo & Tor.
In the Linux installation, Polipo & Tor autostart by default when the machine boots. A couple of daily Polipo and Tor scripts are also installed automatically in the cron.daily folder, I believe for tidying up and maintenance.
Is it OK to leave Polipo & Tor running, or am I supposed to remove them from autostart? If they are always running, will that conflict with the screen-scraper Tor library when it calls them?
cheers,
boga
Should be fine to leave them up. The library is meant to allow for multiple concurrent tor threads.
Hi again.
I noticed something weird. I have set several scripts to run "always at the end": one to close the database connection, one to shut down Tor & Polipo, and one to log the end of the scraping session. Now, when I click the stop button in the workbench in the middle of the scraping session, those scripts are not run, and the scraping session instead continues at some other point. I thought the stop button should halt everything and run whatever is set to run at the end. What could be happening?
The code that seems to keep running even after I click the stop button is in a script set to execute "before scraping session begins", inside a while (resultSet.next()) loop.
Disregard. I just read another post in the forum explaining why execution sometimes doesn't stop immediately when clicking the stop button.
cheers,
boga
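For anyone landing here with the same symptom: a long-running loop inside a script only ends early if it checks for a stop request itself. The sketch below illustrates that pattern in plain Java. Everything in it is hypothetical (the `StopAwareLoop` class, the `stopRequested` supplier, and the in-memory rows standing in for a `ResultSet`). In a real screen-scraper script the check would be whatever stop test the scripting API exposes, which I believe is `session.shouldStopScraping()`, but please treat that name as an assumption and confirm it in the API docs.

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.BooleanSupplier;
import java.util.function.Consumer;

public class StopAwareLoop {

    // Processes rows one by one, checking the stop flag before each row.
    // Returns the number of rows handled before stopping or running out.
    static int processUntilStopped(Iterator<String> rows,
                                   BooleanSupplier stopRequested,
                                   Consumer<String> handler) {
        int processed = 0;
        while (rows.hasNext()) {
            if (stopRequested.getAsBoolean()) {
                break; // leave the loop so the end-of-session scripts can run
            }
            handler.accept(rows.next());
            processed++;
        }
        return processed;
    }

    public static void main(String[] args) {
        List<String> rows = List.of("row1", "row2", "row3", "row4");
        int[] checks = {0};
        // Hypothetical stop signal: reports "stop" once two rows have been handled.
        int done = processUntilStopped(rows.iterator(),
                () -> checks[0]++ >= 2,
                row -> System.out.println("processing " + row));
        System.out.println("handled " + done + " row(s) before the stop request");
    }
}
```

In a real script the loop condition would stay `while (resultSet.next())`, with the stop check as the first statement in the body.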