I was wondering if there is any way I could return the data from a previous scrape and pass the cookie to the user, so the user can log into a session with his/her data already entered.
Seamus,
Yes, this would be possible. All screen-scraper does is replicate the requests a normal browser would make to a server, so you can pause between requests, ask a person for their credentials, and then pass those credentials along in the next request to the server.
First, you'll want to proxy the site and walk through all of the steps you will be asking your users to do. Then, create scrapeable files out of the appropriate transactions. Test thoroughly to make sure everything is working.
Then, study the request that is made just prior to when a person logs in. Note the key/value pairs of any cookies being set.
Next, study the request that is made when a person submits their login credentials. Confirm that the cookies being passed are the same cookies you noted from the previous transaction. If they do not match, see the caveat below**. Also note all parameters being passed and the referrer URL used.
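If you want to check the cookies by machine rather than by eye, here is a rough Python sketch that works outside of screen-scraper. It assumes you have copied the raw Cookie header values from the two proxied requests. The function names and the sample header values are made up for illustration.

```python
from http.cookies import SimpleCookie


def parse_cookie_header(header):
    """Turn a raw Cookie header value into a {name: value} dict."""
    jar = SimpleCookie()
    jar.load(header)
    return {name: morsel.value for name, morsel in jar.items()}


def compare_cookies(pre_login, login):
    """Report cookies the login request sends that the pre-login step did not set."""
    shared = set(login) & set(pre_login)
    return {
        # Present at login but never seen before: possibly set by client-side script.
        "missing": sorted(set(login) - set(pre_login)),
        # Same name in both requests, but the value differs.
        "changed": sorted(k for k in shared if login[k] != pre_login[k]),
    }


# Hypothetical header values copied from the proxy session.
pre = parse_cookie_header("JSESSIONID=abc123; lang=en")
login = parse_cookie_header("JSESSIONID=abc123; lang=en; jsflag=1")
print(compare_cookies(pre, login))
```

Anything listed under "missing" is a candidate for the caveat at the end of this post.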
Now, you're going to want to split your scraping session into two. Make the first one the "pre-login" scraping session, which ends with the scrapeable file that just precedes the login. Make the second one the "post-login" scraping session, which starts with the scrapeable file that handles the login credentials.
Create a script that is called after the last pre-login scrapeable file runs, that is, the one just before the login scrapeable file. In this script, use session.getCookies to iterate over the current cookies and save each one as a session variable.
Depending on how you are invoking screen-scraper, you can either return the current cookies to your external application as session variables or, if you are not invoking from an external application, write their values to a local text file.
Along with the cookies, also return or write out the key/value pairs of any volatile parameters (those that change each session) to be used when logging in, plus the referrer that the login scrapeable file will be expecting. Then stop your scraping session.
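For the external-application side, a minimal Python sketch of holding on to those pieces might look like the following. This is not screen-scraper code. The JSON layout, the file path, and the function names are my own assumptions, and the cookies are assumed to have already been handed over as plain name/value pairs.

```python
import json
import time
from pathlib import Path


def save_login_state(path, cookies, volatile_params, referrer):
    """Write everything the login request will need to a local file."""
    state = {
        "cookies": cookies,          # name/value pairs from the pre-login session
        "params": volatile_params,   # per-session values such as hidden form tokens
        "referrer": referrer,        # URL the login request is expected to come from
        "saved_at": time.time(),     # lets you check later how long the pause took
    }
    Path(path).write_text(json.dumps(state, indent=2), encoding="utf-8")
    return state


def load_login_state(path):
    """Read the saved state back before starting the post-login session."""
    return json.loads(Path(path).read_text(encoding="utf-8"))
```

A plain text file works just as well; JSON is used here only because it round-trips the key/value pairs without extra parsing.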
Now that you have saved the important pieces, you can prompt your user for their credentials. Take their credentials and add them to your already saved data. Then start the second scraping session and pass all of the data that you have saved to its first scrapeable file.
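To make it concrete what the login request ends up carrying, here is a hedged Python sketch that assembles one by hand with the standard library. In practice the login scrapeable file does this for you. The URLs, the form field names ("username", "password"), and the shape of the saved data are all assumptions for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_login_request(login_url, state, username, password,
                        user_field="username", pass_field="password"):
    """Combine saved cookies, volatile parameters, and referrer with the credentials."""
    params = dict(state["params"])   # copy so the saved data is left untouched
    params[user_field] = username
    params[pass_field] = password
    cookie_header = "; ".join(f"{k}={v}" for k, v in state["cookies"].items())
    return Request(
        login_url,
        data=urlencode(params).encode("ascii"),
        headers={
            "Cookie": cookie_header,
            "Referer": state["referrer"],
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


# Hypothetical saved data and credentials; nothing is sent over the network here.
state = {
    "cookies": {"JSESSIONID": "abc123"},
    "params": {"token": "xyz"},
    "referrer": "https://example.com/login",
}
req = build_login_request("https://example.com/do_login", state, "alice", "s3cret")
print(req.get_method(), req.full_url)
```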
Because HTTP is stateless, the web server you're targeting won't know that you just took a break to ask the user for their credentials. So long as you resume within the time the server keeps a session active (typically at least 15 minutes), the server will take what you send it and fulfill your request without blinking an eye.
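If you want to guard against the user taking too long, a small Python check along these lines could work. It assumes you recorded a timestamp when the pre-login session ended, and the 15-minute figure is only the typical value mentioned above, not something the server guarantees.

```python
import time

# Assumption: the target server keeps idle sessions for about 15 minutes.
SESSION_TIMEOUT_SECONDS = 15 * 60


def session_still_fresh(saved_at, now=None, timeout=SESSION_TIMEOUT_SECONDS):
    """Return True if the pause since saved_at is shorter than the timeout."""
    if now is None:
        now = time.time()
    return (now - saved_at) < timeout
```

If this returns False, the safer move is to rerun the pre-login scraping session and collect fresh cookies before logging in.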
-Scott
**If the cookies of the pre-login request do not match the cookies of the login request, it is likely that the missing cookies are being set via JavaScript on the client. If this is the case, you will need to track down how the cookies are being set and replicate that in a script in order to store the resulting key/value pairs. Make use of the new "Detect Javascript Cookies" button under the proxy transactions tab.
Thanks Scott.