Better Scaling with New I/O

With J2SE Version 1.4, Java finally has a scalable I/O API. Not that the old API was an absolute failure (Java's tremendous success in the application server market refutes this), but some of the old API's properties led to drastic restrictions. The worst one was the blocking I/O.

To write data over a socket, you have to call the write() method of an associated OutputStream. This call returns only after all the bytes you passed in have been written. If the send buffers are full and the connection is slow, this might take a while. If your program operates with only a single thread, other connections have to wait, even if they're ready to process write() calls. To work around this problem, you have to associate a thread with each socket. This way one thread can work while another one is blocked due to I/O-related tasks.

Threads aren't as heavyweight as real processes. But, depending on the underlying platform, they're not resource savers either. Each thread uses a certain amount of memory and, apart from that, many threads imply many thread-context switches, which aren't cheap.

Java needed a new API to separate the all-too-happy marriage of socket and thread. This finally happened with the new I/O API (java.nio.*).

In this article I show how you can write a simple Web server with both the new and the old API. Since HTTP, the Web's protocol, is not as trivial as it used to be, I'll implement only a few central features. Therefore, the programs shown here are neither secure nor protocol-conforming.

Old School Httpd
Let's look at the old-school HTTP server first (see Listing 1). (Listings 1-5 can be downloaded from www.sys-con.com/java/sourcec.cfm.) Since I need only a single class for this realization, it's quickly explained. In the main() method, a ServerSocket is instantiated and bound to port 8080. Of course, you'd usually bind a Web server to port 80, but on Unix systems you can only do this with superuser rights. Fortunately, not everyone has them, which is why I chose to use port 8080.

public static void main(String[] args) throws IOException {
    ServerSocket serverSocket = new ServerSocket(8080);
    // args[0] is the number of Httpd threads to start
    int threads = Integer.parseInt(args[0]);
    for (int i = 0; i < threads; i++) {
        new Httpd(serverSocket);
    }
}

Then a number of Httpd objects are created and initialized with the shared ServerSocket. In the Httpd's constructor, I make sure all instances have a meaningful name, set a default protocol, and start the server by executing the start() method of its superclass Thread. This leads to an asynchronous call to the run() method, which contains an infinite loop.

In this infinite loop, the ServerSocket's blocking accept() method is called. When a client connects to port 8080 of the server, the accept() method returns a Socket object. Associated with each socket are an InputStream and an OutputStream. Both are used in the following call to the handleRequest() method. In this method the client's request is read and checked, and an appropriate response is sent back. If it's a legitimate request, the requested file is sent back using sendFile(). If it's not, the client will receive a corresponding error message (sendError()). To keep things simple, I won't discuss the specifics of the protocol.

while (true) {
    ...
    socket = serverSocket.accept();
    ...
    handleRequest();
    ...
    socket.close();
}

Now let's think about this implementation for a second. Does it perform well? On the whole, yes. Certainly I could optimize the request parsing; the StringTokenizer doesn't have a reputation for being extremely fast. But at least I disabled Nagle's algorithm (the TCP_NODELAY socket option), which is unsuitable for short connections, and the sending of the file is buffered. But even more important, all threads operate independently of each other. The native, and therefore fast, accept() method decides which thread accepts a new connection. Apart from the ServerSocket object, the threads don't share any resources that might need to be synchronized. This solution is fast but, unfortunately, not very scalable, as threads are definitely a limited resource.
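
The request parsing mentioned above could look roughly like the following sketch. It is not taken from Listing 1: the class name RequestLine, its fields, and the "HTTP/0.9" default for requests without a protocol version are assumptions made for illustration.

```java
import java.util.StringTokenizer;

// Hypothetical helper, not part of the article's listings.
final class RequestLine {
    final String method;
    final String path;
    final String protocol;

    private RequestLine(String method, String path, String protocol) {
        this.method = method;
        this.path = path;
        this.protocol = protocol;
    }

    // Splits a request line such as "GET /index.html HTTP/1.0" on spaces.
    static RequestLine parse(String line) {
        StringTokenizer tokens = new StringTokenizer(line, " ");
        if (tokens.countTokens() < 2) {
            throw new IllegalArgumentException("Malformed request line: " + line);
        }
        String method = tokens.nextToken();
        String path = tokens.nextToken();
        // Assumed default when the client sends no protocol version.
        String protocol = tokens.hasMoreTokens() ? tokens.nextToken() : "HTTP/0.9";
        return new RequestLine(method, path, protocol);
    }
}
```

A server would call RequestLine.parse() on the first line read from the socket's InputStream and answer with an error if the method or path is not acceptable.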

Nonblocking Httpd
Let's look at another solution that uses the new I/O package. It's a bit more complicated and requires the cooperation of different threads. It consists of four classes (see Figure 1):

  1. NIOHttpd (see Listing 2)
  2. Acceptor (see Listing 3)
  3. Connection (see Listing 4)
  4. ConnectionSelector (see Listing 5)
NIOHttpd basically launches the server. Just as in Httpd, a server socket is bound to port 8080. The important difference is that this time I use a java.nio.channels.ServerSocketChannel instead of a ServerSocket. I need to open the channel with a factory method before binding it explicitly to the port using the bind() method. Then I instantiate a ConnectionSelector and an Acceptor. In doing so, the ConnectionSelector is registered with the Acceptor. In addition, the Acceptor is provided with the ServerSocketChannel.

public static void main(String[] args) throws IOException {
    ServerSocketChannel ssc = ServerSocketChannel.open();
    ssc.socket().bind(new InetSocketAddress(8080));
    ConnectionSelector cs = new ConnectionSelector();
    new Acceptor(ssc, cs);
}

Figure 2 depicts the concurrent execution of the Acceptor and ConnectionSelector threads. To understand the interaction between the two threads, let's first take a closer look at the Acceptor. Its task is to accept incoming connections and register them with the ConnectionSelector. The constructor already calls the superclass's start() method, since the required infinite loop is in the run() method. In this loop a blocking accept() method is called that will eventually return a SocketChannel - almost exactly as in Httpd, except that this time it's a ServerSocketChannel's accept() method, not a ServerSocket's. Finally, with the obtained socketChannel as an argument, a Connection object is created and registered with the ConnectionSelector using its queue() method.

while (true) {
    ...
    socketChannel = serverSocketChannel.accept();
    connectionSelector.queue(new Connection(socketChannel));
    ...
}

To summarize: the Acceptor can only accept and register connections with a ConnectionSelector in an endless loop.

Like Acceptor, the ConnectionSelector is also a thread. In its constructor a queue is instantiated and a java.nio.channels.Selector is opened using the factory method Selector.open(). The Selector is probably the most important part of the server. It allows me to register connections and to obtain a list of those connections that are ready for reading or writing.

After the start() method is called in the constructor, the endless loop in run() is executed. In this loop I call the Selector's select() method. This method blocks until either one of the registered connections is ready for I/O operations or the Selector's wakeup() method is called.

while (true) {
    ...
    int i = selector.select();
    registerQueuedConnections();
    ...
    // handle connections...
}

It's crucial to understand that while the ConnectionSelector thread executes select(), no Acceptor thread can register connections with the Selector, because the corresponding methods are synchronized. Therefore I use a queue, to which the Acceptor thread adds connections as needed.

public void queue(Connection connection) {
    synchronized (queue) {
        queue.add(connection);
    }
    selector.wakeup();
}

Right after queuing a connection, the Acceptor calls the Selector's wakeup() method. This causes the ConnectionSelector thread to resume execution and return from the blocking select() call. Since the Selector is not blocked anymore, the ConnectionSelector can now register the connection from the queue. It happens the following way in registerQueuedConnections():

if (!queue.isEmpty()) {
    synchronized (queue) {
        while (!queue.isEmpty()) {
            Connection connection =
                (Connection) queue.remove(queue.size() - 1);
            connection.register(selector);
        }
    }
}

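
The handoff between the two threads can be tried out in isolation with the following sketch. It is an illustration under stated assumptions rather than the code from Listing 5: the class WakeupHandoff and its methods are hypothetical, plain strings stand in for Connection objects, and nothing is actually registered with the Selector.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Selector;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the Acceptor/ConnectionSelector handoff.
final class WakeupHandoff {
    private final Selector selector;
    private final List<String> queue = new ArrayList<>();

    WakeupHandoff() throws IOException {
        selector = Selector.open();
    }

    // Producer side (the Acceptor's role): enqueue, then wake the selector thread.
    void queue(String connection) {
        synchronized (queue) {
            queue.add(connection);
        }
        selector.wakeup();
    }

    // Consumer side: block in select(), then drain whatever was queued.
    List<String> selectOnce() throws IOException {
        selector.select();
        synchronized (queue) {
            List<String> drained = new ArrayList<>(queue);
            queue.clear();
            return drained;
        }
    }

    // Runs one producer thread against the calling thread and returns what was drained.
    static List<String> demo() {
        try {
            WakeupHandoff handoff = new WakeupHandoff();
            Thread producer = new Thread(() -> handoff.queue("connection-1"));
            producer.start();
            List<String> drained = new ArrayList<>();
            while (drained.isEmpty()) {
                drained = handoff.selectOnce();
            }
            producer.join();
            handoff.selector.close();
            return drained;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

Because the item is added to the queue before wakeup() is called, the consumer finds it regardless of whether wakeup() happens before or during select(). A wakeup() issued while no select() is in progress makes the next select() return immediately.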
Selector Registration Using Keys
At this point I have to focus on the Connection's register() method. Until now I've talked about a connection that's registered with a Selector. This is a bit simplified. Instead, a java.nio.channels.SocketChannel object is registered with a Selector, but only for specific I/O operations. After registration, a java.nio.channels.SelectionKey is returned. This key can be associated with arbitrary objects using its attach() method. To associate a connection with a key, I attach the Connection object to the key. By doing so I can indirectly obtain a Connection from the Selector.

public void register(Selector selector)
        throws IOException {
    key = socketChannel.register(selector,
        SelectionKey.OP_READ);
    key.attach(this);
}

Getting back to the ConnectionSelector, the select() method's return value indicates how many connections are ready for I/O operations. If the return value is zero, I skip the rest and return to the select() call. Otherwise, I iterate over the selection keys, which I obtain as a Set by calling selectedKeys(). From the keys I get the previously attached Connection objects and call their readRequest() or writeResponse() methods. Which method is actually called depends on whether the connections were registered for read or write operations.
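
The registration and dispatch steps can be exercised without a network using a java.nio.channels.Pipe, whose source end is a selectable channel. The following is a minimal sketch, not the code from Listing 5: the class SelectedKeysDemo is hypothetical, and a StringBuilder plays the role of the attached Connection object.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

// Hypothetical demo: register a channel, attach a handler, dispatch on readiness.
final class SelectedKeysDemo {
    static String demo() {
        byte[] request = "GET /".getBytes(StandardCharsets.US_ASCII);
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            try {
                // Only nonblocking channels may be registered with a Selector.
                pipe.source().configureBlocking(false);
                SelectionKey key = pipe.source().register(selector, SelectionKey.OP_READ);
                StringBuilder received = new StringBuilder();
                key.attach(received); // stands in for the Connection object

                pipe.sink().write(ByteBuffer.wrap(request));

                while (received.length() < request.length) {
                    if (selector.select() == 0) {
                        continue; // nothing ready, select again
                    }
                    Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                    while (keys.hasNext()) {
                        SelectionKey ready = keys.next();
                        keys.remove(); // the selected-key set is not cleared automatically
                        if (ready.isReadable()) {
                            ByteBuffer buffer = ByteBuffer.allocate(64);
                            ((Pipe.SourceChannel) ready.channel()).read(buffer);
                            buffer.flip();
                            ((StringBuilder) ready.attachment())
                                .append(StandardCharsets.US_ASCII.decode(buffer));
                        }
                    }
                }
                return received.toString();
            } finally {
                pipe.sink().close();
                pipe.source().close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In the server, the attachment would be a Connection, and the isReadable() branch would call readRequest() while an isWritable() branch would call writeResponse().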

This eventually brings me back to the Connection class. It represents the connection and handles all the protocol's specifics. In its constructor the provided SocketChannel is set to nonblocking mode. This is essential for the server. Then a couple of default values are set and the buffer requestLineBuffer is allocated. As the allocation of direct buffers is somewhat expensive and I'm using a new buffer for each connection, I use java.nio.ByteBuffer.allocate() instead of ByteBuffer.allocateDirect(). If I reused the buffers, direct buffers could prove to be more efficient.

public Connection(SocketChannel socketChannel)
        throws IOException {
    this.socketChannel = socketChannel;
    ...
    socketChannel.configureBlocking(false);
    requestLineBuffer = ByteBuffer.allocate(512);
    ...
}

After all initializations are done and the SocketChannel is ready for reading, the readRequest() method is called by the ConnectionSelector. Using socketChannel.read(requestLineBuffer), all available bytes are read into the buffer. If the full line can't be read, I return to the calling ConnectionSelector and thus allow another connection to take over. However, if the whole line is read, it's time to interpret the request just as I did in Httpd. If it's a legitimate request, I create a java.nio.channels.FileChannel for the requested file and call the method prepareForResponse().
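
The "return until the whole line has arrived" logic can be sketched as follows. This is an illustration, not the code from Listing 4: the class RequestLineReader and its offer() method are hypothetical, and byte arrays passed to offer() stand in for the bytes that socketChannel.read(requestLineBuffer) would deliver on successive calls.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that accumulates bytes until a full request line is present.
final class RequestLineReader {
    // Heap buffer of the same size as in the article; a longer line would
    // make put() throw BufferOverflowException in this sketch.
    private final ByteBuffer buffer = ByteBuffer.allocate(512);

    // Appends newly arrived bytes. Returns the request line without its
    // line terminator once a line feed has been seen, otherwise null.
    String offer(byte[] chunk) {
        buffer.put(chunk);
        // Rescanning from the start keeps the sketch short; real code
        // would remember how far it has already looked.
        for (int i = 0; i < buffer.position(); i++) {
            if (buffer.get(i) == '\n') {
                int end = (i > 0 && buffer.get(i - 1) == '\r') ? i - 1 : i;
                return new String(buffer.array(), 0, end, StandardCharsets.US_ASCII);
            }
        }
        return null; // incomplete: give other connections a turn
    }
}
```

A null result corresponds to returning to the ConnectionSelector; a non-null result is the point at which the request would be interpreted.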

private void prepareForResponse() throws IOException {
    StringBuffer responseLine = new StringBuffer(128);
    ...
    responseLineBuffer = ByteBuffer.wrap(
        responseLine.toString().getBytes("ASCII")
    );
    key.interestOps(SelectionKey.OP_WRITE);
    key.selector().wakeup();
}

prepareForResponse() builds the response line and (if necessary) headers as well as error messages, and writes this data to responseLineBuffer. This ByteBuffer, created with the factory method ByteBuffer.wrap(byte[]), is a thin wrapper around a byte array. After generating the data that I want to write, I need to notify the ConnectionSelector that from now on I want to write data rather than read it. This is achieved by calling the selection key's method interestOps(SelectionKey.OP_WRITE). To guarantee that the selector quickly realizes the connection's change of interest, I call its wakeup() method.

Now the ConnectionSelector calls the connection's writeResponse() method. First, the responseLineBuffer is written to the socket channel. If the entire content of the buffer can be written, and if I still have to send the requested file, I call the transferTo() method of the FileChannel that I opened before. transferTo() potentially transfers data very efficiently from a file to a channel. How efficiently depends on the underlying operating system. In any case, only as many bytes are transferred as can be written to the target channel without blocking. To be on the safe side and to ensure fairness between connections, I set an upper limit of 64KB.
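
The capped transfer can be sketched as below. This is an illustration under stated assumptions, not the code from Listing 4: the class ChunkedTransfer, the method transferChunk(), and the temporary 200,000-byte file are hypothetical, and a channel wrapped around a ByteArrayOutputStream stands in for the socket channel.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo of transferTo() with a 64KB upper limit per call.
final class ChunkedTransfer {
    static final long MAX_CHUNK = 64 * 1024;

    // Transfers at most MAX_CHUNK bytes starting at position and
    // returns the number of bytes actually transferred.
    static long transferChunk(FileChannel file, long position,
                              WritableByteChannel target) throws IOException {
        long remaining = file.size() - position;
        return file.transferTo(position, Math.min(MAX_CHUNK, remaining), target);
    }

    // Copies a temporary file chunk by chunk and returns the bytes received.
    static long demo() {
        try {
            Path tmp = Files.createTempFile("nio-demo", ".bin");
            try {
                Files.write(tmp, new byte[200_000]);
                ByteArrayOutputStream sink = new ByteArrayOutputStream();
                try (FileChannel file = FileChannel.open(tmp, StandardOpenOption.READ);
                     WritableByteChannel target = Channels.newChannel(sink)) {
                    long position = 0;
                    while (position < file.size()) {
                        // In the server, each iteration would be one
                        // writeResponse() call made by the selector thread.
                        position += transferChunk(file, position, target);
                    }
                }
                return sink.size();
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a nonblocking socket channel as the target, transferTo() may return fewer bytes than requested, or zero, which is why the position has to be tracked by the caller between calls.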

If all data is transferred, close() does the clean-up work. Here, the deregistering of the Connection is important. This is achieved by calling the selection key's cancel() method.

public void close() {
    ...
    if (key != null) key.cancel();
    ...
}

Again I wonder: How does this realization perform? And again I can answer: it performs well.

In principle, one Acceptor and one ConnectionSelector are sufficient to keep an arbitrary number of connections open. Thus this implementation shines in the category of scalability. But as the two threads have to communicate through the synchronized queue() method, they might block each other. There are two ways out of this dilemma:

  1. A better implementation of the queue
  2. Multiple Acceptor/ConnectionSelector pairs

The first solution could be implemented with a LinkedQueue (see Concurrent Programming in Java by Doug Lea). This data structure is synchronized with two independent locks - one for the head and one for the tail. This ensures that adding and removing threads don't block each other. Only if the queue is empty is there a possibility of mutual blocking, but this can be avoided with an extra check.

In comparison to this elegant approach, my second solution qualifies for the "brute force" category. The load is balanced across multiple Acceptor/ConnectionSelector pairs; the synchronization problem isn't solved, but it is somewhat reduced. Unfortunately, this causes additional costs for context switches. Even so, fewer threads are needed than with Httpd.
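
The two-lock idea can be sketched as follows. This is not Doug Lea's LinkedQueue; it is a minimal illustration of the same principle, and the class TwoLockQueue with its put() and poll() methods is hypothetical. Instead of an extra check for the empty case, this sketch uses a permanent dummy node and a volatile next reference, so that producers and consumers never touch the same lock.

```java
// Hypothetical two-lock FIFO queue: producers lock the tail, consumers lock the head.
final class TwoLockQueue<T> {
    private static final class Node<E> {
        E value;
        volatile Node<E> next; // volatile: published by put(), read by poll()

        Node(E value) {
            this.value = value;
        }
    }

    private Node<T> head = new Node<>(null); // dummy node, guarded by headLock
    private Node<T> tail = head;             // guarded by tailLock
    private final Object headLock = new Object();
    private final Object tailLock = new Object();

    // Appends a value; contends only with other producers.
    void put(T value) {
        Node<T> node = new Node<>(value);
        synchronized (tailLock) {
            tail.next = node;
            tail = node;
        }
    }

    // Removes and returns the oldest value, or null if the queue is empty;
    // contends only with other consumers.
    T poll() {
        synchronized (headLock) {
            Node<T> first = head.next;
            if (first == null) {
                return null;
            }
            T value = first.value;
            first.value = null; // the removed node becomes the new dummy
            head = first;
            return value;
        }
    }
}
```

In the server, the Acceptor would call put() and the ConnectionSelector would call poll() until it returns null, followed as before by the wakeup() call on the Selector.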

One disadvantage of NIOHttpd, in comparison to Httpd, is that a new Connection object with buffers is created for each request. This costs additional CPU cycles in the garbage collector. How large these extra costs are depends on the VM. However, Sun doesn't tire of emphasizing that with Hotspot, short-lived objects are not a problem anymore.

Comparative Number Games
How much better does NIOHttpd scale than Httpd? Let's play with a couple of numbers, but before I go in medias res, be warned: the formulas and the numbers I'm going to arrive at are highly speculative. Only the concepts' performance is estimated. Important context variables like thread synchronization, context switches, paging, hard disk speed, and caches are not considered.

First I estimate how long it takes to process r simultaneous requests for files with size s bytes, if the client bandwidth is b bytes/second. In the case of Httpd, this obviously depends directly on the number of threads t, as only t requests can be processed at a time. I assume that a corresponding formula looks like Formula 1. c is the constant cost for parsing, etc., that has to be paid for every request. In addition, I assume I can read data faster from the disk than I can write it to the socket, my bandwidth is greater than the sum of the clients' bandwidth, and the CPU is not fully utilized. Therefore the server-side bandwidth, caches, and hard disk speed are not part of the equation.

However, NIOHttpd is not dependent on t. The transfer time l depends mostly on the client bandwidth b, the size of the file s, and the previously mentioned constant costs c. This leads to Formula 2, which estimates the minimum transfer time for NIOHttpd.

The quotient d (see Formula 3) is of interest since it measures the relationship of the performances of NIOHttpd and Httpd.

After closer examination (...and some rows of data), it becomes apparent that if s, b, t, and c are constant, d grows toward a limit. This limit can be easily calculated using Formula 4, which measures the limit of d for r -> ∞.

Thus, besides the number of threads and constant costs, the connection's length s/b has tremendous influence on d. The longer the connection exists, the smaller d is, and the greater the advantage of NIOHttpd over Httpd. Table 1 and Figure 3 show that NIOHttpd can be 126 times faster than Httpd, given that c = 10 ms, t = 100, s = 1 MB, and b = 8 KB/s. NIOHttpd has a big advantage if the connection stays open for a long time. If the connection is short, e.g., in a local 100Mb network, the advantage is only 10%, provided the files are large. If the files are small, the difference won't be detectable.

In these calculations it's assumed that the constant costs of NIOHttpd and Httpd are about the same and no new costs are introduced by the different ways the servers have been implemented. As mentioned before, this comparison only holds under ideal conditions.

This is sufficient, however, to give you the idea that either concept might be beneficial. It should be noted that most Web files are small, but HTTP 1.1 clients try to keep the connection open as long as possible (with a keep-alive or persistent connection). Often, connections that will never again transfer any data are kept open. In a server with one thread per connection this leads to an incredible waste of resources. So, especially for HTTP servers, the scalability can be increased dramatically by using the new I/O API.

Conclusion
With the new I/O API you can build highly scalable servers. In comparison to the old API, it's a bit more complex and requires a better understanding of multithreading and synchronization. Also, the documentation needs improvement. But if you've gotten over these hurdles, the new API proves to be a useful and necessary improvement of the Java 2 platform.

References

  • HTTP 1.1: www.w3.org/Protocols/rfc2616/rfc2616.html
  • Lea, D. (1999). Concurrent Programming in Java: Design Principles and Patterns. Second Edition. Addison-Wesley. http://gee.cs.oswego.edu/dl/cpj

About the Author
Hendrik Schreiber develops data synchronization solutions utilizing SyncML and J2ME/J2EE for Nexthaus in Raleigh, North Carolina. He is also co-author and author of two German Java-related books published by Addison-Wesley.
