Wix Site - Proving Tricky

Hi
I have managed to get the urls that I need from the search results and I have set up a product page DATARECORD template and sub-extractors.

However, the problem seems to be that after the first product, the Wix widget does not load on subsequent product pages. I assume there is a setting but I have been unable to debug.

I have uploaded the SSS file; hopefully it will explain my issue better.

Thanks for your help here

Jason

Jason,

On the first page, look at the source for a tag with id "wix-warmup-data". If you copy the contents up to the closing tag, you have a block of JSON data. This is hard to read, but paste it somewhere like https://jsonformatter.org/ to make it much more readable.

That JSON has entries for all the listings on that site, so you don't need to paginate (the site has all the data on page 1, but you need to click "next" in the browser for it to render).

Your extractor pattern is already getting the URLs from that JSON, so I think you are already reaching all the listings. If it were me, though, I would just parse all that JSON; I don't think you'd even need to request the details page to get all that data (though maybe I missed something).
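As a rough illustration of the "just parse all that JSON" idea, here is a minimal Python sketch that runs outside screen-scraper. It pulls the contents of the tag with id "wix-warmup-data" out of a saved page source and walks the parsed JSON for anything that looks like a URL. The function names are made up for this example, and because the layout of the Wix JSON is not described in this thread, the walker is deliberately generic rather than tied to particular keys.

```python
import json
import re

# Matches any element whose id is "wix-warmup-data" and captures its contents
# up to the matching closing tag (the tag name itself is not assumed).
WARMUP_PATTERN = re.compile(
    r'<(\w+)[^>]*\bid="wix-warmup-data"[^>]*>(.*?)</\1>',
    re.DOTALL,
)


def extract_warmup_data(html):
    """Return the parsed JSON from the warm-up tag, or None if the tag is absent."""
    match = WARMUP_PATTERN.search(html)
    if match is None:
        return None
    return json.loads(match.group(2))


def collect_urls(node):
    """Recursively gather every string value that looks like an absolute URL."""
    urls = []
    if isinstance(node, dict):
        for value in node.values():
            urls.extend(collect_urls(value))
    elif isinstance(node, list):
        for value in node:
            urls.extend(collect_urls(value))
    elif isinstance(node, str) and node.startswith(("http://", "https://")):
        urls.append(node)
    return urls
```

Calling extract_warmup_data on the page source and then collect_urls on the result gives a flat list of candidate links. Which of those are actually listing pages would still need checking against the real data.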

Data is erratic?

Thanks for coming back to me. I have looked at this again. Sometimes the 'warmup' data appears and other times it does not.

Unfortunately, not all the data is in the 'warm up' for every product (the free-text description, for example).

Even when the urls appear in the warmup data for the search results page, the data in subsequent pages does not load (The Wix widget does not load)

I am wondering if it is a timing thing?

Thanks

Jason

The way the site shows only 18 results, and you have to click "load more"? That's done via JavaScript, so it has to be there. Can you replicate times when it's not?

If you do need data from the details page, maybe it's better to just request it like you were. I was just seeing a lot of data in the JSON and assumed maybe that was adequate.

It must be timing/speed

The 'Wix Widget' simply does not load 95% of the time (a guess) when called by SS, or does not load in time. Could I try some type of delay or similar?

Thanks

Jason,

I can run it and it consistently works, so I would theorize either:

  1. The site is blocking you somehow, so you may need a proxy
  2. A difference in our screen-scraper setups causes different results

I updated the attached scrape. Could you run that, and send me the whole log? At the beginning I added some logging about your setup, and if there is an error it will show up lower.

Didn't seem to work

Thank You - I have run it and the warm up data looks a little thin on the ground?

See below.

Could it be my set up?
Thanks
Jason

Starting scraper.
Running scraping session: KFS
Processing scripts before scraping session begins.
Processing script: "KFSInitialisation"
=========================================================
=================== Log Variables with Message ===============
screen-scraper Instance Information
=================== Static Values ================
Java Vendor: Oracle Corporation
Java Version: 1.8.0_66
OS Architecture: amd64
OS Name: Windows 10
OS Version: 10.0
Scrape HTTP Client: AsyncNingScrapingHttpClient
SS Connection Timeout: 180 seconds
SS Edition: Professional
SS Extractor Timeout: 30000 milliseconds
SS Max Concurrent Scraping Sessions: 5
SS Maximum Memory: 256 MB
SS Memory Use: 7%
SS Run Mode: Workbench
SS Version: 7.0
======== Message logged at: 03/07/2022 14:18:21.758 GMT ========
=========================================================
Processing script: "ZuniversalCloseCSVWriter"
Scraping file: "KFSSearchResults"
KFSSearchResults: Requesting URL: https://www.kfs-uk.com/used-machinery?page=1
KFSSearchResults: Sorry, tidying HTML failed. Returning the original HTML.
KFSSearchResults: Processing scripts before all pattern applications.
KFSSearchResults: Extracting data for pattern "Warm up data"
KFSSearchResults: The following data elements were found:
Warm up data--DataRecord 0:
JSON={"platform":{"ssrPropsUpdates":[{"comp-kzfxcoi2":{"isValid":false},"comp-kzfxcoj0":{"isValid":false},"comp-kzfxcoj5":{"isValid":false}}],"ssrStyleUpdates":[{"comp-kzfxcok61":{"visibility":"hidden !important"}}]},"appsWarmupData":{},"ooi":{"failedInSsr":{"TPASection_j88o57ll":true}}}
KFSSearchResults: Warm up data: Processing scripts after a pattern application.
Processing script: "KFS - results parse JSON"
KFSSearchResults: Warm up data: Processing scripts once if pattern matches.
KFSSearchResults: Warm up data: Processing scripts after all pattern applications.
Processing scripts after scraping session has ended.
Processing scripts always to be run at the end.
Scraping session "KFS" finished.
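To make "thin on the ground" concrete, here is a small Python sketch, run outside screen-scraper, that summarizes the JSON string captured in the log above. It only looks at two keys that appear in that log line, "appsWarmupData" and "ooi.failedInSsr"; reading the latter as "components that failed to render on the server" is a guess from the key name, not something confirmed in this thread. The function name is hypothetical.

```python
import json


def summarize_warmup(raw_json):
    """Report whether app data is present and which components are flagged as failed."""
    data = json.loads(raw_json)
    apps = data.get("appsWarmupData") or {}
    failed = (data.get("ooi") or {}).get("failedInSsr") or {}
    return {
        "has_app_data": bool(apps),
        "failed_in_ssr": sorted(key for key, flag in failed.items() if flag),
    }


# The JSON value exactly as it appears in the "Warm up data" DataRecord above.
LOGGED_JSON = (
    '{"platform":{"ssrPropsUpdates":[{"comp-kzfxcoi2":{"isValid":false},'
    '"comp-kzfxcoj0":{"isValid":false},"comp-kzfxcoj5":{"isValid":false}}],'
    '"ssrStyleUpdates":[{"comp-kzfxcok61":{"visibility":"hidden !important"}}]},'
    '"appsWarmupData":{},"ooi":{"failedInSsr":{"TPASection_j88o57ll":true}}}'
)

if __name__ == "__main__":
    print(summarize_warmup(LOGGED_JSON))
```

For the logged value, "appsWarmupData" is an empty object, so has_app_data comes back False and the only flagged entry is "TPASection_j88o57ll".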

Would you be willing to run it in screen-scraper version 7.0.14a?

If you don't want to update a production installation of screen-scraper, you can install it to a new directory and then:

  1. Update the resource/conf/screen-scraper.properties file, and assign an unused port to
    • ServerPort
    • ProxyPort
    • DatabasePort
    • SOAPPort
    • WebServerShutdownPort
  2. In screen-scraper settings, check the box to "update to unstable versions" (it's very stable)
  3. Under "options" click "check for updates"
  4. It would be a good idea to import the scrape I made again (in case importing to an older version altered something)
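For step 1, one way to find ports that are currently unused is to ask the operating system for them. The Python sketch below, run outside screen-scraper, does that for the five settings listed above. The function name is hypothetical, and printing the result as Name=port lines assumes the properties file uses the usual Java key=value syntax, so check it against the existing entries in your file before pasting anything in.

```python
import socket

# Setting names taken from the list in step 1.
PORT_SETTINGS = [
    "ServerPort",
    "ProxyPort",
    "DatabasePort",
    "SOAPPort",
    "WebServerShutdownPort",
]


def find_unused_ports(names):
    """Ask the OS for one free local TCP port per setting name."""
    sockets = []
    assigned = {}
    try:
        for name in names:
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            # Binding to port 0 lets the OS pick a port that is free right now.
            sock.bind(("127.0.0.1", 0))
            sockets.append(sock)
            assigned[name] = sock.getsockname()[1]
    finally:
        # Sockets are held open until the end so no port is handed out twice.
        for sock in sockets:
            sock.close()
    return assigned


if __name__ == "__main__":
    for name, port in find_unused_ports(PORT_SETTINGS).items():
        print(f"{name}={port}")
```

The ports are only guaranteed to be free at the moment the script runs, so it is worth starting the new installation soon afterwards.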