Using HTTrack to capture and Screen-Scraper for continual scraping
Does anyone here have experience copying a whole site (in my case, several small-to-medium subreddits) with a tool like HTTrack Website Copier, and then scraping that local copy on an ongoing basis with Screen-Scraper?
Is HTTrack the best tool for this, or is there something better? HTTrack used to work well years ago, but it seems flaky now: it has a lot of trouble capturing only what you want, and instead tries to capture the whole internet, or a whole site even when you only want part of it. I've been running into exactly this with reddit, where I only want a few subreddits, and only one of them is particularly large or active.
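For reference, this is roughly the kind of scoped invocation I've been attempting (command-line HTTrack; the subreddit name, output directory, and filter patterns are placeholders, and I may have the scan-rule ordering wrong, since HTTrack applies the last matching filter):

```sh
# Exclude everything by default, then re-include only the one subreddit.
httrack "https://old.reddit.com/r/MySubreddit/" \
        -O ./reddit-mirror \
        "-*" \
        "+old.reddit.com/r/MySubreddit/*" \
        -v
```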
And if I do capture with HTTrack, do I need to run a webserver on my computer to host the copy and then scrape it over HTTP, or can I somehow just scrape the files straight from the directories on my hard drive and still traverse all the levels and links?
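In case it helps frame the question, here's a rough sketch of what I mean by scraping the files directly off disk, with no webserver involved (Python with BeautifulSoup; the mirror directory name is hypothetical, and this relies on HTTrack rewriting internal links to relative file paths, which I believe it does by default):

```python
# Walk an HTTrack mirror directly on the filesystem -- no local
# webserver, just directory traversal and link resolution on disk.
from pathlib import Path
from bs4 import BeautifulSoup  # pip install beautifulsoup4

MIRROR_ROOT = Path("./reddit-mirror")  # hypothetical HTTrack output dir

def internal_links(page: Path):
    """Yield other mirrored pages that this page links to.

    HTTrack rewrites internal links to relative file paths, so each
    href can be resolved against the page's own directory on disk.
    """
    soup = BeautifulSoup(page.read_text(encoding="utf-8", errors="replace"),
                         "html.parser")
    for a in soup.find_all("a", href=True):
        href = a["href"].split("#")[0]           # drop in-page anchors
        if not href or href.startswith(("http:", "https:")):
            continue                             # skip un-rewritten external links
        target = (page.parent / href).resolve()
        if target.is_file() and target.suffix in (".html", ".htm"):
            yield target

# Traverse every saved page in the mirror and follow its local links.
for page in MIRROR_ROOT.rglob("*.html"):
    for linked in internal_links(page):
        print(page, "->", linked)
```

That's the general idea, anyway. What I don't know is whether Screen-Scraper itself can be pointed at local files the same way, or whether it expects everything to come over HTTP.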