Short tutorial for a newbie?

Hi,

I have just discovered screen-scraper.

After a couple of days I have begun to familiarize myself with it.

However, I have no previous experience with programming, scripting, or anything related whatsoever. So it seems I am a little bit too green, and I simply can't figure out how to think in order to come up with a solution for a specific site.

There is a site I would like to extract data from, so I wonder if someone experienced could be so nice as to guide me in getting started?

I ran through the first couple of tutorials, but I can't get the right solution for the site that particularly interests me. I simply lack the expertise.

I would appreciate any help.

Could you post the link for

Could you post the link for the site? Not really sure what we're looking at yet...

The scripting involved in SS is minimal, but it can help to have an understanding of how the programming philosophy works in order to derive your own solutions.

Depending on the platform you're working on (Windows, most likely, maybe Mac, or possibly a Unix/Linux setup), SS will be able to provide you with a few different programming languages. My personal favorite is Python, but you'll get the best support if you use Java (as in, "interpreted Java", not "JavaScript"), since screen-scraper itself is written in Java. That makes it the most stable choice, and it is in fact the company's preferred language for development.

So, that being said, the main thing to understand in most modern programming languages is that you have something called an 'object'. In Java, the names of Object types are (by standard convention) always capitalized, so that you write "String" instead of "string". It seems like a petty difference, but it makes an Object type easy to recognize at a glance.

In Java, you've got access to basic "methods" or "functions", which are just specialized words to mean "some programming code that someone else already wrote that I can run in MY programming code". Java prefers the word "method", but both are usable.
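To make the idea of a method concrete, here is a minimal sketch written as a standalone Java program rather than as a screen-scraper script. Math.max is a method that ships with Java; the class name MethodDemo and the method greet are made up for this illustration.

```java
public class MethodDemo {
    // A method we wrote ourselves (hypothetical name): takes a name, returns a greeting.
    static String greet(String name) {
        return "Hello, " + name + "!";
    }

    public static void main(String[] args) {
        // Math.max is code someone else already wrote; we only call it.
        int bigger = Math.max(3, 7);
        System.out.println(bigger);

        // Calling our own method looks exactly the same.
        System.out.println(greet("world"));
    }
}
```

Either way, the pattern is the same: a name, then parentheses holding whatever values the method needs.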

In order to get access to more kinds of methods and Objects than the default set, you have to use "import" statements in your code before you try to make use of whatever extra methods you want access to.

For instance, Java allows for String (notice the capital letter on "String", since Strings are Objects) manipulation by default; that is to say that you have control over Strings whenever you want, without the need for these "import" statements which I've mentioned. Strings are just a sequence of characters wrapped up in double quotes, such as "I'm a hound dog.". It could contain anything: "123412341234", "oh happy day!!, !@#$Asdf", or possibly even shorthand for a "tab" character, or a "new line" character: "first line\n second line\t that was a tab character"
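As a rough illustration of those String literals, the sketch below declares a few of them, including the "\n" and "\t" shorthands. It is a standalone Java program, and the class and method names (StringLiterals, twoLines) are invented for the example.

```java
public class StringLiterals {
    // Returns a String containing the "new line" shorthand.
    static String twoLines() {
        return "first line\nsecond line";
    }

    public static void main(String[] args) {
        String plain = "I'm a hound dog.";
        String digits = "123412341234";   // still a String, not a number
        String tabbed = "name\tvalue";    // \t becomes one tab character

        System.out.println(plain);
        System.out.println(digits);
        System.out.println(tabbed);
        System.out.println(twoLines());   // shows up as two separate lines
    }
}
```

No import statement is needed here, because String is part of Java's default set.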

However, if you wanted to deal with other kinds of data types, like an "ArrayList" or some other thing, you have to write
    import java.util.ArrayList;
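Here is a small sketch of what that import buys you, again as a standalone Java program. The class name ImportDemo, the method sampleItems, and the values in the list are assumptions made up for illustration.

```java
import java.util.ArrayList;

public class ImportDemo {
    // Builds a small list of Strings; only possible because of the import above.
    static ArrayList<String> sampleItems() {
        ArrayList<String> items = new ArrayList<String>();
        items.add("first");
        items.add("second");
        items.add("third");
        return items;
    }

    public static void main(String[] args) {
        ArrayList<String> items = sampleItems();
        System.out.println(items.size());   // how many entries the list holds
        System.out.println(items.get(0));   // entries are numbered starting at 0
    }
}
```

If you remove the import line, the compiler will complain that it doesn't know what an ArrayList is.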

Now that you've got access to any given Object type via this sort of import, or if you're just using one built into Java's default setup (as in, where there's no imports needed), you can tell Java to make a live version of said Object. The 'access' to the Object type doesn't automatically give you the ability to play with it. You have to create a live copy of the Object (note the parentheses after the type name, which Java requires):
    String myStringVariable = new String();

That follows a pattern of something like this:
    ObjectName nameTheInstanceOfTheObject = new ObjectName();

Using the word "new" is just the way to tell Java that you want a "new" instance of the given Object type to the right-hand side of the word "new". The left-hand side of the equals sign is just saying that you want to declare that [whatever results from the right-hand side] will be put into a variable named whatever you decide to name it. In summary, that String example would result in a variable named 'myStringVariable' which contains a new String, which happens to come into existence empty (i.e., "", where there's nothing between those two quotes).

After that, you're free to use the variable however you want:
    myStringVariable = myStringVariable + "woah";

If you were to examine this variable NOW, it would no longer be empty, but would be ([empty] + "woah"). If you were to do this same line again, you'd come out with "woahwoah".
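Put together as a standalone Java program, the create-then-append steps might look like the sketch below. The class name NewStringDemo and the method appendTwice are invented for the example.

```java
public class NewStringDemo {
    // Creates an empty String, then appends the same text to it twice.
    static String appendTwice(String suffix) {
        String myStringVariable = new String();          // starts out empty: ""
        myStringVariable = myStringVariable + suffix;    // now holds suffix once
        myStringVariable = myStringVariable + suffix;    // now holds suffix twice
        return myStringVariable;
    }

    public static void main(String[] args) {
        System.out.println(appendTwice("woah"));   // prints woahwoah
    }
}
```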

There's tons to explain about programming languages, but that's a decent start, I think... You just have to know that ANY Object has its own 'methods' that you can access. String variables have methods for isolating 'sub'-strings. For example, myStringVariable.substring(2,5) returns the 3rd-through-5th characters of a String: positions are counted starting from 0, and the character at the end position (5) is not included. The String has to be at least 5 characters long for that call to work. If you wanted to save that result until later, you would have to make a new variable out of it:
    String aSubString = myStringVariable.substring(2,5);

Notice that the '.' is the way that Java (and most programming languages) represents the action of accessing a method that belongs to the Object's type. Every Object has different methods... Integers don't have a 'substring' method, for example, because that doesn't make sense.
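A quick sketch of the dot notation in action, as a standalone Java program. Keep in mind that substring counts positions from 0 and leaves out the character at the end position. The class name SubstringDemo and the method middle are made up for this example.

```java
public class SubstringDemo {
    // Returns the characters at positions 2, 3 and 4 (position 5 is excluded).
    static String middle(String text) {
        return text.substring(2, 5);
    }

    public static void main(String[] args) {
        String myStringVariable = "woahwoah";
        String aSubString = myStringVariable.substring(2, 5);
        System.out.println(aSubString);               // prints ahw

        // Different Object types offer different methods.
        Integer number = Integer.valueOf(42);
        System.out.println(number.intValue() + 1);    // Integers can do arithmetic
        // number.substring(2, 5) would not compile: Integer has no such method.
    }
}
```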

For questions about types of Objects, I would recommend checking out the actual Java API documentation, which is just a big directory of all the Classes and methods that ship with Java (it won't list any methods you write yourself, of course). You should know that when people refer to a "Class", they are referring to the template version of an Object. For instance, the 'String' Class is what you use to create a new 'String' Object. Objects are the actual pieces of data in the computer's memory, while Classes are just the blueprints that specify what the Object will be once you or I create a 'new' copy of the Object.
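To show the blueprint-versus-Object distinction, here is a minimal sketch as a standalone Java program. The Counter Class is purely hypothetical; it is written once, and then two separate Objects are created from it, each keeping its own data.

```java
public class ClassVsObject {
    // The Class: a blueprint describing what every Counter Object will hold and do.
    static class Counter {
        private int count = 0;

        void increment() {
            count = count + 1;
        }

        int getCount() {
            return count;
        }
    }

    public static void main(String[] args) {
        // Two Objects built from the same blueprint.
        Counter first = new Counter();
        Counter second = new Counter();

        first.increment();
        first.increment();
        second.increment();

        // Each Object has its own count in memory.
        System.out.println(first.getCount());    // prints 2
        System.out.println(second.getCount());   // prints 1
    }
}
```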

Java API: http://java.sun.com/javase/6/docs/api/

To know more about the String class specifically, you can find it in the bottom-left part of that webpage I just linked to. When you click on it, you'll see a good list of all the methods in the String class, and explanations of them.

If you can supply a URL for the site you're after, we can talk some specifics of what you'll need to do with it.

Tim

Here is the link 2

Sorry for double posting. I spotted some typing mistakes, so I corrected them. :)
______________________________________________________________________________________________________________________

Hello Tim,

Thank you for the consideration and the detailed explanation. This is exactly what I was hoping for.

Here is a link to a yellow-pages-type website with a search already executed. In this case I searched for all companies from Vilnius that list a website address (advanced search). [Maybe you will find this example interesting for a specific tutorial?]

http://www.visalietuva.lt/en/imones/rezultatai?de=t&he=&qu=&co=&no=&pc=&pn=&st=&hf=&ht=&re=460&di=461&ci=&zi=&qw=t

This seems to be the only way to get results, as I did not find a way to get a full list of the companies, even after selecting the city I need.

I tried to apply the logic of the second tutorial. However, as you will see for yourself, the address pattern of the search results changes at least twice while paging through the results. Getting to each company's detail sub-page is another problem: the source of the search results page contains only part of the detail page's address. When I make the logical assumption and append that fragment to the page address (or simply check the address by browsing to the company's detail page), the link I build simply does not work. Again the question is: why?

So I simply need an extraction pattern for every company in the search results that captures the following info: company name, registration code, telephone, address, and field of activity.

Maybe you can advise me? If I get it working and am able to use it in similar situations in the future, I would love to buy this software. Its potential for flexibility is unlimited, but you have to be prepared to use it. By the way, some sort of exercises for newbies would be a great addition for the community and a perfect marketing move ;)

Oh yes, and another thing: screen-scraper Basic is lagging very much on my PC, even though I have a brand new laptop running Windows Vista. I am using interpreted Java.

I will wait for your answer, or we can continue the discussion via IM or Skype.

Best,
Roman