Session 34: Remote procedure calls

RPC (not in text)
  basic concept
  problems
  RMI and CORBA
  java.rmi

RPC

Basic concept

Communicating across a network is tricky business: You have to define a protocol, and then each side of the communication must implement the protocol flawlessly. There's a lot of room for error, and you might hope that there would be an easier way of doing things.

RPC (remote procedure call) is an attempt to ease this difficulty. The idea behind RPC is to treat each request as a ``procedure call'', so that a program can send a request to a server simply by calling a function. This should make things simpler for the programmer, sparing them the details of network protocols.

RPC has four steps.

RPC hides the business of creating protocols and the corresponding implementation code, which normally is simply a pain to get right.
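Short of a real RPC system, a hand-rolled sketch can show what is being hidden. Everything here is invented for illustration (the class RpcSketch, the methods clientStub and serverDispatch, the procedure number 1), and the "network" is just a byte array handed from one method to the other; a real system would move those bytes over a socket.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class RpcSketch {
	// Hypothetical procedure number that both sides agree on.
	static final int PROC_INCREMENT = 1;

	// The procedure as it exists on the "server".
	static int increment(int i) {
		return i + 1;
	}

	// Server side: decode the request, run the procedure, encode the reply.
	static byte[] serverDispatch(byte[] request) {
		try {
			DataInputStream in = new DataInputStream(new ByteArrayInputStream(request));
			int proc = in.readInt();
			int arg = in.readInt();
			if (proc != PROC_INCREMENT) {
				throw new IllegalArgumentException("unknown procedure " + proc);
			}
			ByteArrayOutputStream bytes = new ByteArrayOutputStream();
			new DataOutputStream(bytes).writeInt(increment(arg));
			return bytes.toByteArray();
		} catch (IOException e) {
			throw new UncheckedIOException(e);
		}
	}

	// Client side: to the caller this looks like an ordinary function.
	static int clientStub(int i) {
		try {
			ByteArrayOutputStream bytes = new ByteArrayOutputStream();
			DataOutputStream out = new DataOutputStream(bytes);
			out.writeInt(PROC_INCREMENT);
			out.writeInt(i);
			// Handing the bytes over directly stands in for the network.
			byte[] reply = serverDispatch(bytes.toByteArray());
			return new DataInputStream(new ByteArrayInputStream(reply)).readInt();
		} catch (IOException e) {
			throw new UncheckedIOException(e);
		}
	}

	public static void main(String[] args) {
		System.out.println(clientStub(41));
	}
}
```

The caller of clientStub never sees the encoding or decoding. An RPC system would generate both halves of that plumbing for you.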

I'd give you an example of a real RPC system here, except that there's really no single standard that people are following. And the most popular ones are written to work with many existing programming languages and systems, so they end up being ugly and complicated.

Problems

There are a few problems with RPC. First, it's not all that it's cracked up to be. Nobody's come up with an implementation that's particularly simple (though admittedly this could be just a matter of somebody developing a decent system where it's the pervasive fundamental technique, instead of something added as an afterthought). Also, the concept of procedure calls may sound like a general abstraction of network communication, but it's a little limiting. For example, while you might think of a Web server as responding to clients calling a getWebPage() procedure, which takes a URL as a parameter and returns a file, it's not clear that this is the best way to understand the process.

The second problem is that of efficiency: RPC may be too simplistic. You can imagine that I might want to send an array as a parameter. With RPC, the entire array would be passed whether or not the server needs it, and from the programmer's point of view, this is the easiest thing to do. RPC hides the fact that a tremendous amount of communication has to happen in order to move this array from one location to another.
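To get a feel for the hidden cost, the sketch below counts the bytes needed to ship a whole array versus a single element. It uses Java's built-in serialization as a stand-in for whatever encoding an RPC system would use; the class name ArrayCost and the helper serializedSize are made up for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

class ArrayCost {
	// Number of bytes Java serialization produces for the given value.
	static int serializedSize(Object value) {
		try {
			ByteArrayOutputStream bytes = new ByteArrayOutputStream();
			try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
				out.writeObject(value);
			}
			return bytes.size();
		} catch (IOException e) {
			throw new UncheckedIOException(e);
		}
	}

	public static void main(String[] args) {
		int[] data = new int[1_000_000];
		// Passing the array as a parameter means encoding every element.
		System.out.println(serializedSize(data) + " bytes to pass the whole array");
		// If the server only needs one element, this is all it would take.
		System.out.println(serializedSize(Integer.valueOf(data[0])) + " bytes to pass one element");
	}
}
```

Both calls would look equally cheap in the source code of an RPC client, which is exactly the problem.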

The final problem is more technical: How can you handle pointers? You can't simply pass them on to the server, since the server has a different memory space. On the other hand, if you make the client send the information pointed to by each pointer on to the server, you may send an impractically large amount of information. (And, besides, with some programming languages (e.g., C), you can never be exactly sure what is a pointer and what isn't.)
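The "follow every pointer" option can be seen with Java's built-in serialization, which does exactly that with object references. In this sketch (the names PointerCost, Cell, and chain are invented for illustration), passing just the head of a linked list drags the whole list along with it.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class PointerCost {
	// A linked-list cell: the "next" field plays the role of a pointer.
	static class Cell implements Serializable {
		int value;
		Cell next;

		Cell(int value, Cell next) {
			this.value = value;
			this.next = next;
		}
	}

	// Build a list holding 1, 2, ..., length and return its head.
	static Cell chain(int length) {
		Cell head = null;
		for (int i = length; i >= 1; i--) {
			head = new Cell(i, head);
		}
		return head;
	}

	// Number of bytes needed to send the value and everything it points to.
	static int serializedSize(Object value) {
		try {
			ByteArrayOutputStream bytes = new ByteArrayOutputStream();
			try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
				out.writeObject(value);
			}
			return bytes.size();
		} catch (IOException e) {
			throw new UncheckedIOException(e);
		}
	}

	public static void main(String[] args) {
		// In each case we "pass" a single Cell, the head of the list.
		for (int length : new int[] {1, 10, 100}) {
			System.out.println(length + " cells reachable: "
				+ serializedSize(chain(length)) + " bytes");
		}
	}
}
```

The parameter is one Cell each time, but the amount of data sent grows with everything reachable from it.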

RMI and CORBA

With object-oriented languages came the object-oriented equivalent of RPC, called RMI (remote method invocation). CORBA was among the initial RMI systems, and it has gained quite a bit of popularity.

In CORBA, you can have an object reference, which would simply be an implementation of an interface, where each method's implementation would in fact act like RPC: it would take the parameters, build them into a request, and send it on to the server. This object reference contains within it the location and name of the corresponding remote object, so the method invocation can happen transparently.
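This is not CORBA, but the shape of an object reference can be sketched in plain Java with java.lang.reflect.Proxy. All names here (ObjectReferenceSketch, Counter, sendRequest, the host "somehost") are assumptions made for the illustration, and the "server" is just a map in the same program.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

class ObjectReferenceSketch {
	// The interface shared by the client and the server.
	interface Counter {
		int add(int amount);
	}

	// The real object, which lives on the "server".
	static class CounterImpl implements Counter {
		private int total = 0;

		public int add(int amount) {
			total += amount;
			return total;
		}
	}

	// Stand-in for the server: remote objects are found by name.
	static final Map<String, Object> SERVER_OBJECTS = new HashMap<>();

	// Stand-in for the network. A real system would connect to host and send
	// the name, the method, and the arguments; here we invoke directly.
	static Object sendRequest(String host, String name, Method method, Object[] args)
			throws Exception {
		Object target = SERVER_OBJECTS.get(name);
		if (target == null) {
			throw new IllegalStateException("no object named " + name + " on " + host);
		}
		return method.invoke(target, args);
	}

	// Build an object reference: it implements Counter, remembers the location
	// and name of the remote object, and turns every method call into a request.
	static Counter reference(String host, String name) {
		InvocationHandler handler =
			(proxy, method, args) -> sendRequest(host, name, method, args);
		return (Counter) Proxy.newProxyInstance(
			Counter.class.getClassLoader(), new Class<?>[] { Counter.class }, handler);
	}

	public static void main(String[] args) {
		SERVER_OBJECTS.put("counter", new CounterImpl());
		Counter ref = reference("somehost", "counter");
		ref.add(5);
		System.out.println(ref.add(37));
	}
}
```

The client holds only the small reference, and the state (the running total) stays on the server side.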

One of the nice things about RMI is that it solves the pointer problem with RPC pretty cleanly: If you have a pointer to an object, RMI can simply send an object reference. That way, you avoid the problem of following pointers, which would lead to huge chunks of data being sent from place to place using RPC.

We're not going to look at CORBA either: being a cross-language, cross-platform system makes it ugly and complicated too.

java.rmi

Instead, we'll look at java.rmi, a library built by Sun into Java to facilitate RMI. This particular implementation of RMI turns out to be pretty clean to use.

RMI is still a four-step process. In the first step, we define the interface that will be implemented by the remote object. In our example, we'll have an interface that simply contains a method that computes another number based on a given number i.

interface NumberServer extends java.rmi.Remote {
	public int getNumber(int i) throws java.rmi.RemoteException;
}

The second step is defining the remote object that should exist on the server. We'll just have the getNumber() method add 1 to i. (This is really stupid - we're afraid of adding 1 to a number on the client, since it's such a hideous amount of computation, so we're letting the server handle it. Pretend something more dramatic is happening on the server side that justifies all the extra communication.)

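class IncrServer extends java.rmi.server.UnicastRemoteObject implements NumberServer {
	// Extending UnicastRemoteObject exports the object so that it can receive
	// remote calls; its constructor can throw RemoteException.
	IncrServer() throws java.rmi.RemoteException {
		super();
	}
	public int getNumber(int i) throws java.rmi.RemoteException {
		return i + 1;
	}
}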

Now we register the server object on the server by executing a program containing the following code. Notice how we're naming the object using a URL.

String url = "rmi://sunfac3/Increment";
java.rmi.Naming.bind(url, new IncrServer());

We also run a command that generates some stub and skeleton code in order to get this to work. The stub code will also implement the NumberServer interface.

In the final step, we write a program that accesses the server.

String url = "rmi://sunfac3/Increment";
NumberServer server = (NumberServer) java.rmi.Naming.lookup(url);
int i = server.getNumber(41);
System.out.println(i);

In the second line, server is just an instance of the stub code. That line contacts the naming registry to fetch the stub, but nothing is sent to the IncrServer object yet. The real communication occurs in the third line, when 41 is sent to the IncrServer object on the server. It returns 42, so that is what this program fragment will print.
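For experimenting without a second machine, the four steps can be squeezed into one program. This is only a sketch under several assumptions: the class RmiDemo and its helpers are invented here, the registry is created inside the same JVM on an anonymous port (so the lookup is a local call rather than a trip to sunfac3), the object is exported explicitly with UnicastRemoteObject.exportObject, and the host name is forced to 127.0.0.1 so that the demo does not depend on name lookup. The call to getNumber still travels through the stub and a loopback socket.

```java
import java.rmi.NotBoundException;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

// Step 1: the remote interface.
interface NumberServer extends Remote {
	int getNumber(int i) throws RemoteException;
}

// Step 2: the object that lives on the server.
class IncrServer implements NumberServer {
	public int getNumber(int i) throws RemoteException {
		return i + 1;
	}
}

class RmiDemo {
	static int callThroughRmi(int argument) {
		// Assumption: keep all traffic on the loopback interface.
		System.setProperty("java.rmi.server.hostname", "127.0.0.1");
		// Keep a strong reference so the exported object is not garbage collected.
		IncrServer impl = new IncrServer();
		Registry registry = null;
		try {
			// Step 3: export the object and register its stub under a name.
			registry = LocateRegistry.createRegistry(0);
			NumberServer stub = (NumberServer) UnicastRemoteObject.exportObject(impl, 0);
			registry.rebind("Increment", stub);

			// Step 4: the client looks up the name and calls through the stub.
			NumberServer server = (NumberServer) registry.lookup("Increment");
			return server.getNumber(argument);
		} catch (RemoteException | NotBoundException e) {
			throw new IllegalStateException("RMI demo failed", e);
		} finally {
			// Unexport everything so that the JVM is free to exit.
			unexportQuietly(impl);
			if (registry != null) {
				unexportQuietly(registry);
			}
		}
	}

	static void unexportQuietly(Remote object) {
		try {
			UnicastRemoteObject.unexportObject(object, true);
		} catch (RemoteException e) {
			// The object was never exported, so there is nothing to undo.
		}
	}

	public static void main(String[] args) {
		System.out.println(callThroughRmi(41));
	}
}
```

With a real server, steps 3 and 4 would live in separate programs on separate machines, and the client would reach the registry with Naming.lookup and an rmi:// URL as in the fragments above.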