Remote scraping sessions called from BeanShell

I seem to be able to call a remote session OK and pass session variables back and forth, as long as it's not in lazy mode.

thread = new com.screenscraper.scraper.RemoteScrapingSession("fe - a2 user details","sshost1",8778);
thread.setDoLazyScrape(true); //everything works fine with this line commented out.

thread.setVariable("THREAD_COMPLETE","NO");
String status = "bob";
thread.setVariable("STATUS",status);
session.log(status + thread.getVariable("STATUS")); //this line works on lazyScrape
thread.scrape();
//session.pause(20000);
session.log(status + thread.getVariable("STATUS")); //this line causes a nullPointerException on a lazyScrape
while (!status.equals("---FINISHED")) {
    if (!session.isRunning())
        break;
    session.breakpoint();
    status = thread.getVariable("STATUS");
    session.breakpoint();
    session.log(status);
}
session.log(status);

The remote session I'm calling just calls this script:

session.pause(10000);
session.setVariable("STATUS","---FINISHED");

If I use a lazy scrape in the parent script I get ": java.lang.NullPointerException BSF info: null at line: 0 column: columnNo" as soon as I try to use thread.getVariable. I can access the same variable right before the scrape is started, though...

I've tried inserting a 20-second pause to make sure the child script is finished before I try to read the session variable, but I get the same result. It's only when lazyScrape is turned off that I can access the session variables...

Is this the way it normally works, or is it something to do with using it from BeanShell? The remote session is definitely running; I can see it in the web interface running on the other computer...

Is the 20-second pause too long? If the remote session pauses for only 10 seconds, it seems like the 20-second pause in the parent scrape would make the remote session finish before the parent checks the value. If Remote is finished when Parent checks in, you may very well get a NullPointerException like that...

Conceptually this should work, but I'm not sure how to gracefully catch the scrape before it's finished.

Maybe Remote should finish its execution by going into a loop, pausing every 10 or 20 seconds. The while-loop condition could check a session variable, waiting to see when it's no longer null. That way, Parent can check Remote's STATUS variable whenever it gets to it. Once Parent sees that Remote's STATUS is finished, Parent can set the while-loop session variable so Remote can exit its otherwise infinite loop.
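
Something like this untested sketch is what I have in mind (PARENT_DONE is just a made-up session variable name):

// Remote's final script: signal completion, then hold the session open
// until Parent sets PARENT_DONE to anything non-null.
session.setVariable("STATUS", "---FINISHED");
while (session.getVariable("PARENT_DONE") == null) {
    session.pause(10000); // check again every 10 seconds
}

// Parent, after thread.scrape(): poll Remote's STATUS, then release it.
String status = "" + thread.getVariable("STATUS");
while (!"---FINISHED".equals(status)) {
    session.pause(10000);
    status = "" + thread.getVariable("STATUS");
}
thread.setVariable("PARENT_DONE", "YES"); // Remote sees non-null and exits its loop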

That seems to me to be the only way to avoid the race condition, where Parent has to check Remote before Remote finishes. If you can make it work, it would be devilishly clever, and you would get an award from me personally, in the form of little ASCII characters spelling "congratulations".

Alas, no award for me today...

It seems to be the other way around...

If setDoLazyScrape = false, then I can access the session vars of the remote thread after it's finished, though obviously not while it's running, since the parent thread is suspended.

If setDoLazyScrape = true... I've tried every combination of pauses I can think of. If I set the child to a long pause so the parent tries to access its session variables while it's still running, I get the NullPointerException. If I make sure the parent thread waits for the remote thread to finish before trying to access its variables, same thing. It's almost like the RemoteScrapingSession object goes out of scope as soon as a lazyScrape is started. Actually, that might be something worth testing... back in a minute... The object itself isn't out of scope; it's just that all its variables are nulled.

I've been looking around a bit further and found the SOAP interface, which seems to be a much more thorough way of doing this sort of thing. It even has an isRunning() method, so now I'm trying to make that work, which leads me to yet another question...

I'm using the Java example in the documentation, but I'm trying to use it within the SS workbench. I've run the command line given in the example and it successfully creates the com.screenscraper.soapclient file structure (in the SS program folder). Now I seem to be stuck on importing...

package com.screenscraper.soapclient; //line 7

import com.screenscraper.soapclient.SOAPInterface;
import com.screenscraper.soapclient.SOAPInterfaceService;
import com.screenscraper.soapclient.SOAPInterfaceServiceLocator;

import java.rmi.RemoteException;
import javax.xml.rpc.ServiceException;

SOAPInterfaceService service = new SOAPInterfaceServiceLocator(); //line 16
SOAPInterface soap = service.getSOAPInterface();

This results in: The error message was: Encountered "." at line 7, column 12.

Line 7 is the 'package' line. If I comment it out I get: The error message was: Typed variable declaration : Class: SOAPInterfaceService not found in namespace : at Line: 16.

I've never really used Java outside of screen-scraper, so I'm not really sure whether it should work inside BeanShell or not... any ideas?

Ah, okay. Outside of BeanShell, you type "package yada.yada" in order to name the way that other programs can import your own code. For instance, the DataSet object is at com.screenscraper.common.DataSet, so where the DataSet class is defined, it says "package com.screenscraper.common;".

In BeanShell, your code is never "importable" by other code, so it sort of ignores the "package" keyword.

Use "import" instead. This way your code goes and looks up the soapclient, instead of trying to claim that it *is* the soapclient package:

import com.screenscraper.soapclient.*;

Err, sorry for jumping the gun... I actually ran the wrong script and didn't see any errors, so I assumed it had worked... but it hadn't...

this is the exact code I'm running:


import com.screenscraper.soapclient.*;

import com.screenscraper.soapclient.SOAPInterface;
import com.screenscraper.soapclient.SOAPInterfaceService;
import com.screenscraper.soapclient.SOAPInterfaceServiceLocator;

import java.rmi.RemoteException;
import javax.xml.rpc.ServiceException;

SOAPInterfaceService service = new SOAPInterfaceServiceLocator(); //line 12
SOAPInterface soap = service.getSOAPInterface();

which gives:

The error message was: Typed variable declaration : Class: SOAPInterfaceService not found in namespace : at Line: 12.

The stub files compiled to: C:\Program Files\screen-scraper enterprise edition\com\screenscraper\soapclient
using the command line exactly as it's written (copied and pasted) on the Java SOAP page in the documentation...

any other ideas?

You're worth your weight in gold, Tim... I'm feeling a bit guilty for bombarding you with so many questions, so I bought an enterprise licence today to salve the guilt...

So now that the guilt trip is dealt with... next question... lol...

I notice, looking through the files it creates in the 'com' directory, that they are riddled all the way through with the hostname/port of the SOAP server... Is there any way to abstract the hostname out of those files so you can assign it dynamically? Otherwise it seems that every time you want to use a different host you'd have to run the command line to create a new set of files and then import them again... I'm not even sure that's possible from within a BeanShell script...
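
What I'm hoping for is something like the sketch below. I'm only guessing that the Axis-generated locator has an overload that takes an endpoint URL, and the path is a guess too (it should match whatever WSDL URL the stubs were generated from, just with a different host/port):

import java.net.URL;
import com.screenscraper.soapclient.*;

String host = "sshost2";  // made-up hostname
int port = 8779;          // made-up SOAP port

// Guessed endpoint path -- match it to the WSDL URL used to generate the stubs.
URL endpoint = new URL("http://" + host + ":" + port + "/axis/services/SOAPInterface");

SOAPInterfaceServiceLocator service = new SOAPInterfaceServiceLocator();
// If Axis generated the usual URL overload, this overrides the baked-in address.
SOAPInterface soap = service.getSOAPInterface(endpoint);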

*bump*

Tim, have you been having skiing accidents again?

Hi,

Tim is actually out of town, and emailed asking if I would jump in and help on this one.

First, it might help to clarify the first scenario with lazy vs. non-lazy scraping. If you're invoking screen-scraper remotely and you indicate that it should be a lazy scrape, then once you invoke the "scrape" method your remote Java application actually disconnects from screen-scraper. That is, there can be no communication between the two after that point.
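
For reference, here's a rough sketch of the non-lazy pattern, using the same calls from your script above (plus disconnect to release the connection):

remote = new com.screenscraper.scraper.RemoteScrapingSession("fe - a2 user details", "sshost1", 8778);
remote.setDoLazyScrape(false);                   // non-lazy: scrape() blocks until the remote session finishes
remote.setVariable("STATUS", "started");
remote.scrape();                                 // returns only once the remote session is done
session.log("" + remote.getVariable("STATUS"));  // safe to read here
remote.disconnect();                             // release the connection when finished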

SOAP may be an option for you, but if you're trying to use SOAP within screen-scraper it may be that you want to take a different approach entirely. I suppose that, technically, you could use the SOAP client in a screen-scraper script, but I honestly can't think of why you'd need to.

As such, could you perhaps describe just a bit more what you're trying to accomplish? Are you simply wanting to run many scraping sessions in parallel and get data from them as they progress?

Thanks,

Todd