SOAPing it Up

Introduction to Web Services
In distributed computing, the geeks like to see the world divided into “service providers” and “service users”. The service provider typically owns a “software component” that does some unit of work. The service user has a system that does a complete task (involving many units of work), for which it needs the services of one or more such providers.
Let us take the example of Google (which effectively uses Web Services currently). Google provides the service of searching the Internet. This service is used by many clients, either directly or integrated with other functionality inside their own applications. Google charges clients on a per-search basis (maybe just a few cents per search, but it adds up to a substantial amount if traffic is good).
The beauty of it is that Google can provide the service over the Internet using Web Services technology. The clients who use Google’s search engine need not be connected to Google’s servers through a LAN or WAN. Other companies like Reuters (providing real-time news on demand) or Amazon (providing e-commerce through web sites) could also offer web services. Some Wall Street firms could offer a service that provides stock quotes over the Internet.
Any application that needs to incorporate these functions into its design simply acts as a client to the web service.
The mechanics of Web Services
To access the Google Web Service, the essential information that a client needs to provide is the “search string” (the text that you give the search engine). The string should travel from the client’s machine across the Internet to Google’s server. Google’s server would do its searching business and get some content, which is the result of the search. This result needs to travel from Google’s server back to the client’s machine. This is essentially the equivalent of executing the method
String doGoogleSearch(String searchString).
The essential difference here is that the logic for the method would exist on Google’s servers while the method needs to be executed from some client location. The two are only connected through the Internet.
To make this work, the Web Service client and provider need to communicate using HTTP (the language of the Internet, of course). Whoaaa… you would say… HTTP is used for browsers to communicate with applications. What is this all about, two applications talking to each other using HTTP? That’s the new way that HTTP is being used by Web Services.
So let us imagine that a client sends Google the information about which method it wants to execute and the parameters to that method. The HTTP traffic would look like this:

POST /EndorsementSearch HTTP/1.1
Host: www.google.com
Content-Type: text/xml; charset="utf-8"
Content-Length: 261
SOAPAction: "http://www.google.com/Search"

<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
    <SOAP-ENV:Body>
        <m:DoGoogleSearch xmlns:m="http://google.com/service/Search">
            <searchString>Sammy Sosa</searchString>
        </m:DoGoogleSearch>
    </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
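As a rough illustration, here is how a Java client might put such a request together by hand. This is only a sketch under assumptions: the class and method names are made up for this example, the endpoint is the illustrative one from the traffic above (not a real Google URL), and it uses the java.net.http classes from Java 11. The request is built but never sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper, not part of any SOAP library.
class SoapRequestSketch {
    static final String SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/";

    // Escape the characters that would otherwise break the XML text.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Wrap the search string in a SOAP envelope like the one shown above.
    static String buildEnvelope(String searchString) {
        return "<SOAP-ENV:Envelope xmlns:SOAP-ENV=\"" + SOAP_ENV + "\">"
             + "<SOAP-ENV:Body>"
             + "<m:DoGoogleSearch xmlns:m=\"http://google.com/service/Search\">"
             + "<searchString>" + escapeXml(searchString) + "</searchString>"
             + "</m:DoGoogleSearch>"
             + "</SOAP-ENV:Body>"
             + "</SOAP-ENV:Envelope>";
    }

    // Build (but do not send) the HTTP POST that carries the envelope.
    // Host and Content-Length are filled in by the HTTP client itself.
    static HttpRequest buildRequest(String searchString) {
        return HttpRequest.newBuilder(URI.create("http://www.google.com/EndorsementSearch"))
                .header("Content-Type", "text/xml; charset=utf-8")
                .header("SOAPAction", "\"http://www.google.com/Search\"")
                .POST(HttpRequest.BodyPublishers.ofString(buildEnvelope(searchString)))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildRequest("Sammy Sosa");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(buildEnvelope("Sammy Sosa"));
    }
}
```

In practice a SOAP client library would generate the envelope for you; the point here is only that the whole thing is ordinary text over an ordinary HTTP POST.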

As you can see, SOAP is just a specific XML format to make the content understandable to service providers and users.
If you ignore the extraneous material, you’ll see that the essential information is the name of the Web Service (http://google.com/service/Search), the method to be executed (doGoogleSearch()) and the parameter passed (“searchString”, whose value is “Sammy Sosa”).
You’ll admit that this looks pretty different from the regular HTTP traffic a browser deals with, where the content is HTML, more like

<html> <head>blah.. blah.. </head></html>

The Google servers should have an outward-facing Web Service that can interpret this kind of incoming traffic, parse out the essential information, execute the relevant method and transmit the result back. This is done by the “SOAP engine”. In Java, the SOAP engine is usually a special servlet that traps all HTTP messages hitting a specified URL pattern. Let’s say the SOAP engine on our Google web server handles all URLs that match the pattern /service/* and assumes that each one is a SOAP request (as opposed to an HTML request).
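To make the engine’s job concrete, here is a minimal sketch of the parse-and-dispatch step in plain Java, using only the JDK’s DOM parser. It is not how any real SOAP engine is implemented; the class name, the handle() method and the canned doGoogleSearch() body are all assumptions for illustration, and the servlet plumbing around it is left out.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical, stripped-down "SOAP engine": request XML in, response XML out.
class SoapEngineSketch {
    static final String SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/";
    static final String SERVICE_NS = "http://google.com/service/Search";

    // Stand-in for the real search logic that would live on the server.
    static String doGoogleSearch(String searchString) {
        return "results for " + searchString;
    }

    static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Return the first child that is an element, skipping whitespace text nodes.
    static Element firstChildElement(Element parent) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                return (Element) n;
            }
        }
        throw new IllegalArgumentException("SOAP Body is empty");
    }

    // Parse the envelope, find the method element, run the method, wrap the result.
    static String handle(String requestXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(requestXml)));
            Node body = doc.getElementsByTagNameNS(SOAP_ENV, "Body").item(0);
            if (body == null) {
                throw new IllegalArgumentException("No SOAP Body found");
            }
            Element call = firstChildElement((Element) body);
            String method = call.getLocalName();
            if (!"DoGoogleSearch".equals(method)) {
                throw new IllegalArgumentException("Unknown method: " + method);
            }
            Node param = call.getElementsByTagName("searchString").item(0);
            if (param == null) {
                throw new IllegalArgumentException("Missing searchString parameter");
            }
            String result = doGoogleSearch(param.getTextContent());
            return "<SOAP-ENV:Envelope xmlns:SOAP-ENV=\"" + SOAP_ENV + "\">"
                 + "<SOAP-ENV:Body>"
                 + "<m:" + method + "Response xmlns:m=\"" + SERVICE_NS + "\">"
                 + "<searchResults>" + escapeXml(result) + "</searchResults>"
                 + "</m:" + method + "Response>"
                 + "</SOAP-ENV:Body>"
                 + "</SOAP-ENV:Envelope>";
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Not a readable SOAP request", e);
        }
    }
}
```

A real engine would look up the target class and method from its configuration rather than hard-coding one method name, but the flow (parse, dispatch, wrap) is the same.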
The SOAP engine would execute the relevant method doGoogleSearch(…), get the result (in this case “LARGE STRING SEARCH RESULT”) and transmit it back as the HTTP response. The response would look like this:

<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
<SOAP-ENV:Body>
<m:DoGoogleSearchResponse xmlns:m="http://google.com/service/Search">
<searchResults>LARGE STRING SEARCH RESULT</searchResults>
</m:DoGoogleSearchResponse>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>

The client can extract the search result from the SOAP message and do whatever it likes with it.
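For instance, a Java client could pull the result out of the response with the JDK’s built-in DOM parser. This is a sketch only; the class name and the extractResult() method are made up for this example, and a SOAP client library would normally do this step for you.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical client-side helper for reading the SOAP response.
class SoapResponseReader {
    static final String SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/";

    // Find <searchResults> inside the SOAP Body and return its text.
    static String extractResult(String responseXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(responseXml)));
            NodeList bodies = doc.getElementsByTagNameNS(SOAP_ENV, "Body");
            if (bodies.getLength() == 0) {
                throw new IllegalArgumentException("No SOAP Body found");
            }
            NodeList results = ((Element) bodies.item(0)).getElementsByTagName("searchResults");
            if (results.getLength() == 0) {
                throw new IllegalArgumentException("No searchResults element found");
            }
            return results.item(0).getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Not a readable SOAP response", e);
        }
    }

    public static void main(String[] args) {
        String response =
              "<SOAP-ENV:Envelope xmlns:SOAP-ENV=\"" + SOAP_ENV + "\">"
            + "<SOAP-ENV:Body>"
            + "<m:DoGoogleSearchResponse xmlns:m=\"http://google.com/service/Search\">"
            + "<searchResults>LARGE STRING SEARCH RESULT</searchResults>"
            + "</m:DoGoogleSearchResponse>"
            + "</SOAP-ENV:Body>"
            + "</SOAP-ENV:Envelope>";
        System.out.println(extractResult(response)); // prints LARGE STRING SEARCH RESULT
    }
}
```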
If you look at the HTTP traffic, you will see that
1. It is normal text traffic travelling over TCP/IP as HTTP (no custom data formatting).
2. It follows the HTTP request/response style (so SOAP engines can use the servlet/PHP/ASP architecture).
3. It can flow through the regular HTTP ports (80 etc.) that normal web servers use (this lets Web Services pass through most firewalls).
4. The messages going back and forth are XML, and their interpretation can be standardised and shared (between Web Service providers and users) by using common DTDs.
Java, .NET, ASP and PHP are all capable of doing the above four things, so Web Services can effectively connect components across these disparate systems.
That’s what all these guys are harping about. Web Services can connect components in disparate worlds together to build a complete application.
Some caveats, however:
1. Web Services make sense only if disparate systems connected by the Internet are involved. If you need two servers, both using Java, to talk to each other on a LAN, there is no need for Web Services; we might as well use RMI or CORBA.
2. The Web Service provider should build a Web Service only if it is likely to be useful to many external entities. There’s no point in creating Web Services that the clients don’t need (they could write their own code to do it).
3. Web Services are slow. The slow response that we normally experience for traffic between browsers and applications will now appear within and between applications performing their tasks.
4. Web Services define only the structure of the communication. They do not provide built-in infrastructure for security and payment. For example, Google will have to figure out the security/authentication required to identify the service users who are allowed to use its service. Because the service is out on the World Wide Web, anyone can become a user of it unless restricted by specific means. Google will also have to figure out how to charge its customers for providing the service.

A deeper look at Java Web Services and SOAP

Let us say you invoke a method Foo getFoo(String blah). In this scenario the code that invokes getFoo() is called the client, while the class that implements this method is called the server. This is easily done when both client and server objects reside on the same JVM. (Please note that client and server are relative terms.)
However, if the client and server reside on separate JVMs, as may be the case with EJBs, in Java we need to do a remote invocation using RMI-IIOP. But RMI is a Java-specific technology and will not work when, say, the client uses Java and the server uses .NET.
This is where Web Services come in. With Web Services you can have different technologies communicating with each other. This kind of communication is useful when integrating different systems, as in EAI (Enterprise Application Integration).
Web Services use HTTP, the same protocol used between browser and server, except that here there is no browser as the client; the client and server are both applications (typically application servers) communicating via HTTP.
If the client system needs to invoke a method on the server system, let’s say Foo getFoo(String blah), then the client needs to send the server the “blah” parameter and get the “Foo” object in return, all through HTTP.
So, as you can see, it is pretty much a request-response model.
So let’s say the client creates a small piece of XML

<param1>blah</param1>

and sends it across via HTTP to the server, which reads it, creates a Foo object and sends back a response, again as XML.
So, as you can see, the data is sent as strings embedded in XML, which both server and client can convert to and from Java objects. (Here both server and client use Java technologies, not disparate systems.) This conversion to XML is also serialization, just easier to read than what RMI produces during serialization. This technique is called XML over HTTP.
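The round trip described above can be sketched in a few lines of Java. Everything specific here is an assumption made for illustration: the article does not say what fields Foo has, so this Foo carries a single name, and the <foo>/<name> element names are invented (which is exactly the kind of ad hoc agreement plain XML over HTTP forces on both sides). The HTTP hop itself is left out; serve() stands in for the server.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class XmlOverHttpSketch {
    // Hypothetical Foo with one field; the real Foo is not specified.
    static final class Foo {
        final String name;
        Foo(String name) { this.name = name; }
    }

    static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Read the text of the first element with the given tag name.
    static String textOf(String xml, String tag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName(tag);
            if (nodes.getLength() == 0) {
                throw new IllegalArgumentException("Missing element: " + tag);
            }
            return nodes.item(0).getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Not readable XML", e);
        }
    }

    // "Serialization": Java object to XML string.
    static String toXml(Foo foo) {
        return "<foo><name>" + escapeXml(foo.name) + "</name></foo>";
    }

    // "Deserialization": XML string back to a Java object.
    static Foo fromXml(String xml) {
        return new Foo(textOf(xml, "name"));
    }

    // Server side: read the parameter, build a Foo, send it back as XML.
    static String serve(String requestXml) {
        String blah = textOf(requestXml, "param1");
        return toXml(new Foo("foo-for-" + blah));
    }

    public static void main(String[] args) {
        String responseXml = serve("<param1>blah</param1>");
        System.out.println(responseXml);               // <foo><name>foo-for-blah</name></foo>
        System.out.println(fromXml(responseXml).name); // foo-for-blah
    }
}
```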
The problem with this technique is that the client and server need to agree on the XML syntax used for data transfer.
This is where SOAP comes in. It is basically structured XML with rules for what the XML looks like, which part of it holds the parameters, where the result lies, etc. It’s just an envelope around the real data XML, hence the name SOAP envelope. It structures the data so that both client and server can interpret the HTTP requests and responses.
The way it works is:
The server publishes a file called the WSDL (Web Services Description Language, pronounced “wiz-dull”), which describes the service’s methods and the structure of the XML for the data involved.
If there is a service available at a server URL, say http://myserver.com/abc, then the client gets the WSDL from the server by requesting http://myserver.com/abc?wsdl, and generates a Java class from it using the utility classes that ship with a SOAP client library, usually a jar file. The generated class is like the RMI stub for EJBs and acts as a proxy: you invoke methods on the stub and it does the communication for you, so you don’t need to worry about what is on the server, you just work with the WSDL you got. The SOAP client jar needs to be on the client classpath at run time as well, not only when creating the proxy.
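To show what “the stub does the communication for you” means, here is a hand-rolled stand-in for a generated stub, built with java.lang.reflect.Proxy. It is a sketch under assumptions, not what any WSDL tool actually generates: the SearchService interface, the createProxy() method and the element names are hypothetical, and the HTTP call is replaced by a transport function (request XML in, response XML out) so the example runs without a server.

```java
import java.io.IOException;
import java.io.StringReader;
import java.lang.reflect.Proxy;
import java.util.function.Function;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class SoapProxySketch {
    // Hypothetical interface that a WSDL-to-Java tool might generate.
    interface SearchService {
        String doGoogleSearch(String searchString);
    }

    static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Pull the text of <searchResults> out of the response XML.
    static String readResult(String responseXml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(responseXml)))
                    .getElementsByTagName("searchResults").item(0).getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Not a readable SOAP response", e);
        }
    }

    // 'transport' stands in for the HTTP POST to the SOAP server.
    static SearchService createProxy(Function<String, String> transport) {
        return (SearchService) Proxy.newProxyInstance(
            SearchService.class.getClassLoader(),
            new Class<?>[] { SearchService.class },
            (proxy, method, args) -> {
                if (method.getDeclaringClass() == Object.class) {
                    // toString/equals/hashCode are not remote calls in this sketch.
                    throw new UnsupportedOperationException(method.getName());
                }
                // Turn the local method call into a SOAP request...
                String request =
                      "<SOAP-ENV:Envelope xmlns:SOAP-ENV=\"http://schemas.xmlsoap.org/soap/envelope/\">"
                    + "<SOAP-ENV:Body>"
                    + "<m:" + method.getName() + " xmlns:m=\"http://google.com/service/Search\">"
                    + "<searchString>" + escapeXml((String) args[0]) + "</searchString>"
                    + "</m:" + method.getName() + ">"
                    + "</SOAP-ENV:Body></SOAP-ENV:Envelope>";
                // ...send it, and turn the SOAP response back into a return value.
                return readResult(transport.apply(request));
            });
    }

    public static void main(String[] args) {
        // Fake server: always answers with the same canned response.
        SearchService service = createProxy(request ->
            "<Envelope><Body><searchResults>LARGE STRING SEARCH RESULT</searchResults></Body></Envelope>");
        System.out.println(service.doGoogleSearch("Sammy Sosa"));
    }
}
```

The calling code only ever sees service.doGoogleSearch(...); all the XML and transport details stay inside the proxy, which is the convenience a generated stub gives you.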
The server party has a tougher job that includes installing the web service (the SOAP engine). Just as there is a SOAP client on the client side, the SOAP engine, usually a war file, acts as the SOAP server on the server side.
The server guy has manually written a Java class with the implementation of the getFoo() method, which he intends to make available through the web service. For this he deploys the war file on the web server. Then he writes a deployment descriptor called a WSDD file, in which he describes the Java class containing the getFoo() method, and configures the deployed SOAP engine to read it. So it is the SOAP engine in the war file that reads the deployment descriptor (WSDD) and makes the service available.
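The class the server guy writes is just ordinary Java; nothing in it needs to know about SOAP. A minimal sketch might look like the following, where the FooService name, the shape of Foo and the body of getFoo() are all assumptions made up for this example.

```java
// Hypothetical service implementation: plain Java, no SOAP-specific code.
class FooService {
    // Hypothetical Foo with a single value; the real Foo is not specified.
    public static final class Foo {
        private final String value;
        public Foo(String value) { this.value = value; }
        public String getValue() { return value; }
    }

    // The method the deployment descriptor would expose through the SOAP engine.
    public Foo getFoo(String blah) {
        return new Foo("Foo for " + blah);
    }

    public static void main(String[] args) {
        System.out.println(new FooService().getFoo("blah").getValue()); // Foo for blah
    }
}
```

The SOAP engine, driven by the deployment descriptor, is what maps incoming requests onto getFoo() and turns the returned Foo into XML.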
When the web server, say Tomcat, is running with the SOAP server deployed, it is listening for HTTP. The client makes an HTTP request to the SOAP server by opening a connection to the server’s HTTP port, just like any standalone application can. The deployment descriptor can be configured through a web console similar to those of WebLogic or WebSphere.
Sometimes you do not need a deployment descriptor (WSDD file) at all: you can simply rename the .java file to .jws. The SOAP engine (SOAP server) can read and analyze the .jws file once you point it to the file through the web console. You can use the same console to point the SOAP engine to your deployment descriptor, too.
UDDI
This is an optional feature that allows you to dynamically discover web services. For example, if you want to know today’s interest rates, there may be multiple web services providing that information, so to get one you like, you look it up in the UDDI, which is like a registry. UDDI is mostly not practical, because usually the client has to know very well which server it’s going to talk to and the format of the SOAP request that server expects, so it’s hard to dynamically search for and use web services this way.

About cuppajavamattiz
Matty Jacob - Avid technical blogger with interests in J2EE, Web Application Servers, Web frameworks, Open source libraries, Relational Databases, Web Services, Source control repositories, ETL, IDE Tools and related technologies.
