By Mike Edwards, Tim Ellison
October 6, 2004 12:00 AM EDT
The Async IO package is designed to provide fast and scalable input/output (IO) for Java applications using sockets and files. It provides an alternative to the original synchronous IO classes available in the java.io and java.net packages, where scalability is limited by the inherent "one thread per IO object" design. It also provides an alternative to the New IO package (java.nio), where performance and scalability are limited by the polling design of the select() method.
As its name implies, the Async IO package provides asynchronous IO operations, where the application requests an IO operation from the system, the operation is executed by the system asynchronously from the application, and the system then informs the application when the operation is complete. The Async IO package supports a number of styles of application programming and gives the application designer considerable freedom in the management of the number of threads used to handle IO operations and also in the design of the components that handle the asynchronous notifications.
Why Java Applications Need the Async IO Package
The question "Why do Java applications need the Async IO package?" can be answered in two words: performance and scalability.
Performance and scalability are key attributes of the IO system for IO-intensive applications. IO-intensive applications are typically, although not exclusively, server-side applications. Server-side applications are characterized by the need to handle many network connections to many clients and also by the need to access many files to serve requests from those clients. The existing standard Java facilities for handling network connections and files do not serve the needs of server-side applications adequately. The java.io and java.net packages provide synchronous IO capabilities, which require a one-thread-per-IO-connection style of design. This limits scalability, since running thousands of threads on a server imposes significant overhead on the operating system. The New IO package, java.nio, addresses the scalability issue of the one-thread-per-IO-connection design, but the New IO select() mechanism limits performance.
Current operating systems, such as Windows, AIX and Linux, provide facilities for fast, scalable IO based on the use of asynchronous notifications of IO operations taking place in the operating system layers. For example, Windows and AIX have IO Completion Ports, while Linux has the sys_epoll facility. The Async IO package aims to make these fast and scalable IO facilities available to Java applications through a package that provides IO capabilities linked to an asynchronous style of programming.
The current version of the Async IO package, com.ibm.io.async, is designed as an extension to the Java 2 Standard Edition 1.4, which can in principle be provided on any hardware and software platform. The platforms currently supported by the package include Windows, AIX, Linux, and Solaris.
Elements of the Async IO Package
The major elements of the Async IO package are the classes AsyncFileChannel, AsyncSocketChannel, and AsyncServerSocketChannel. The channels represent asynchronous versions of files, sockets, and server sockets. These fundamental classes are designed to be similar in naming and in operation to the channel classes of the New IO package, which is good news for Java programmers familiar with the New IO package.
AsyncFileChannels and AsyncSocketChannels provide asynchronous read and write methods against the underlying file or socket. An asynchronous operation is a request to the system to perform the operation, where the method returns immediately to the calling application regardless of whether the operation has taken place or not. Instead of providing a return value that gives information about the operation, such as the number of bytes read/written, asynchronous read and write operations return objects that implement the IAsyncFuture interface.
The IAsyncFuture interface is another important component of the Async IO package. First, an IAsyncFuture represents the state of the asynchronous operation, most importantly whether the operation has completed or not. Second, the IAsyncFuture provides methods that return the result of the operation once it has completed. An IAsyncFuture can also throw exceptions, rather than return the normal outcome of the operation, if something goes wrong during the operation.
The application uses one of three methods to find out whether a particular operation has completed:
- Polling: Calls the isCompleted() method of the IAsyncFuture, which returns true once the operation is complete
- Blocking: Uses the waitForCompletion() method of the IAsyncFuture, which can be used either to wait for a specified period or to wait indefinitely for the operation to complete
- Callback: Uses the addCompletionListener() method of the IAsyncFuture, so the application can register a method that's called back by the system when the operation completes
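The three styles can be sketched with standard Java classes. The Async IO package itself is not used here. As an assumption made purely for illustration, java.util.concurrent.CompletableFuture stands in for IAsyncFuture, with isDone(), get( timeout ), and whenComplete( ... ) playing roughly the roles of isCompleted(), waitForCompletion(), and addCompletionListener(). The startFakeRead method is hypothetical and simply simulates an operation that finishes after a short delay.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

class CompletionStyles {

    // Hypothetical stand-in for an asynchronous read: completes with a
    // byte count after a short delay on a background thread.
    static CompletableFuture<Integer> startFakeRead(int bytes) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return bytes;
        });
    }

    // Polling: check the future repeatedly until it reports completion.
    static int poll(CompletableFuture<Integer> future) {
        while (!future.isDone()) {
            Thread.yield(); // a real application would do useful work here
        }
        return future.join();
    }

    // Blocking: wait up to timeoutMillis for the result.
    static int block(CompletableFuture<Integer> future, long timeoutMillis) {
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException | ExecutionException | TimeoutException e) {
            throw new IllegalStateException("operation did not complete normally", e);
        }
    }

    // Callback: register a listener that runs when the operation completes.
    // The latch exists only so this demo can return the value to its caller.
    static int callback(CompletableFuture<Integer> future) {
        AtomicInteger result = new AtomicInteger(-1);
        CountDownLatch done = new CountDownLatch(1);
        future.whenComplete((bytes, error) -> {
            if (error == null) {
                result.set(bytes);
            }
            done.countDown();
        });
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result.get();
    }

    public static void main(String[] args) {
        System.out.println("polling:  " + poll(startFakeRead(2048)));
        System.out.println("blocking: " + block(startFakeRead(2048), 5000));
        System.out.println("callback: " + callback(startFakeRead(2048)));
    }
}
```

Polling keeps the calling thread busy, blocking parks it, and only the callback style frees the thread entirely, which is why the callback style matters most for servers with many connections.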
Data Formats Supported by Asynchronous Read and Write Operations
The read and write operations supplied by the Async IO package use the ByteBuffer class to hold the data. This class is the same as the one used in the New IO package. One difference between the Async IO package and the New IO package is that the ByteBuffers used for the Async IO package must be Direct ByteBuffers. Direct ByteBuffers have the memory for their content allocated in native memory outside the Java Heap. This provides better performance for IO operations since the operating system code can access the data in the buffer memory directly, without the need for copying.
ByteBuffers can be viewed as buffers supporting other primitive types, such as int, float, or char, using methods such as bytebuffer.asIntBuffer(). ByteBuffers also have a series of methods that support the reading and writing of primitive types at arbitrary locations in the ByteBuffer, using methods like bytebuffer.putLong(index, aLong).
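A short sketch of these ByteBuffer features, using only the standard java.nio classes. The class name, buffer size, and values are arbitrary choices made for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.IntBuffer;

class DirectBufferDemo {

    // Allocates a direct buffer and writes three ints through an int view.
    static ByteBuffer fillWithInts() {
        ByteBuffer bytes = ByteBuffer.allocateDirect(32); // native memory, outside the Java heap
        IntBuffer ints = bytes.asIntBuffer();              // view that shares the same memory
        ints.put(10).put(20).put(30);                      // moves the view's position only
        return bytes;
    }

    public static void main(String[] args) {
        ByteBuffer bytes = fillWithInts();
        System.out.println("direct: " + bytes.isDirect());

        // Absolute get: reads the second int (byte offset 4) without moving the position.
        System.out.println("second int: " + bytes.getInt(4));

        // Absolute put and get of a long at byte offset 16.
        bytes.putLong(16, 123456789L);
        System.out.println("long at 16: " + bytes.getLong(16));

        System.out.println("position: " + bytes.position());
    }
}
```

Writes made through the view are visible in the underlying ByteBuffer because both share the same memory, while each keeps its own position and limit.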
Simple Examples of Async IO Read and Write Operations
Listing 1 shows the use of an AsyncSocketChannel as a client socket that involves connecting the socket to a remote server and then performing a read operation. In this example, the blocking style is used to wait for asynchronous operations to complete.
Listing 2 is a program fragment that shows the use of a callback to receive the notification of the completion of an asynchronous operation. This fragment shows just some of the methods of a class that is handling socket IO. It's assumed that an AsyncSocketChannel has already been opened and connected, that a direct ByteBuffer is available, and that an object named "state" tracks the state of the IO.
When the IO operation is requested (channel.read( ... )) an IAsyncFuture is returned. The next step is to give the IAsyncFuture a callback method by calling the addCompletionListener( ... ) method. The callback method gets called when the operation completes. The callback method is the futureCompleted( ... ) method that forms part of a class that implements the ICompletionListener interface.
In this example, the class with the callback is the same as the class that makes the read request, so "this" is used as the first parameter of the addCompletionListener( ... ) method. The signature of the futureCompleted( ... ) method is fixed. Its first parameter is an IAsyncFuture object that represents the operation. Its second parameter is an object that holds the application state, which is associated with the IAsyncFuture by passing it as the second parameter of the addCompletionListener( ... ) method (in this example, the object called "state").
The futureCompleted( ... ) method is called when the operation completes. It is possible that the operation is complete before the completion listener is added to the future. If this happens, the futureCompleted( ... ) method is called directly from the addCompletionListener( ... ) method, without any delay.
The futureCompleted( ... ) method receives the future object relating to the completed operation, plus the application state object.
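The same callback-with-state pattern can be sketched with the standard java.nio.channels.AsynchronousFileChannel class (available since Java 7), which is used here only as an analogue and is not the Async IO package's own API. In this sketch the CompletionHandler interface plays the role of ICompletionListener, its completed( ... ) method plays the role of futureCompleted( ... ), and the attachment parameter carries the "state" object. The class and field names are hypothetical, and the file is assumed to fit in the 64-byte buffer.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.channels.CompletionHandler;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.CountDownLatch;

class CallbackReader implements CompletionHandler<Integer, CallbackReader.State> {

    // Application state passed with the read request and handed back to the callback.
    static final class State {
        final ByteBuffer buffer = ByteBuffer.allocateDirect(64);
        final CountDownLatch done = new CountDownLatch(1);
        volatile String text;
        volatile Throwable error;
    }

    // Requests the read; "this" is the listener and "state" travels with the operation.
    void startRead(AsynchronousFileChannel channel, State state) {
        channel.read(state.buffer, 0L, state, this);
        // The buffer is not touched here: the operation may still be in progress.
    }

    @Override
    public void completed(Integer bytesRead, State state) {
        state.buffer.flip(); // the operation has finished, so the buffer can be used
        state.text = StandardCharsets.UTF_8.decode(state.buffer).toString();
        state.done.countDown();
    }

    @Override
    public void failed(Throwable error, State state) {
        state.error = error;
        state.done.countDown();
    }

    // Writes a small temporary file, reads it back asynchronously, and returns its text.
    static String readBack(String content) {
        try {
            Path file = Files.createTempFile("async-demo", ".txt");
            try {
                Files.write(file, content.getBytes(StandardCharsets.UTF_8));
                try (AsynchronousFileChannel channel =
                         AsynchronousFileChannel.open(file, StandardOpenOption.READ)) {
                    State state = new State();
                    new CallbackReader().startRead(channel, state);
                    state.done.await(); // the demo waits only so it can return the result
                    if (state.error != null) {
                        throw new IllegalStateException(state.error);
                    }
                    return state.text;
                }
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readBack("hello async"));
    }
}
```

Keeping the buffer inside the state object means the callback has everything it needs without reaching back into the class that issued the request.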
Beyond the Basics: Multi Read/Write Operations and Timeouts
The previous sections described the basic functions available as part of the Java Async IO package. The package also supplies more advanced interfaces for asynchronous IO. The first advanced interface supplies the capability to perform read and write operations using multiple buffers for the data. The second advanced interface provides a time-out on the asynchronous IO operation.
Both the multi read/write operations and the time-out facility are provided by the AsyncSocketChannelHelper and AsyncFileChannelHelper classes. This is done to keep the interface to the AsyncFileChannel and AsyncSocketChannel classes as straightforward as possible.
Create an AsyncSocketChannelHelper object by wrapping an existing AsyncSocketChannel. An AsyncFileChannelHelper is created by wrapping an existing AsyncFileChannel object. All operations on the channel helper object apply to the underlying asynchronous channel.
The multi read/write operations take ByteBuffer arrays as input and return IAsyncMultiFuture objects. IAsyncMultiFuture objects differ from IAsyncFuture objects only in that they have a getBuffers() method that returns the ByteBuffer arrays involved in the operation in place of the getBuffer() method, which relates to the single buffer read/write operations. The multi read/write operations are useful for applications that need to send or receive data that's best handled by multiple buffers, perhaps where different elements of the data are handled by different application components (see Listing 3).
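The idea of filling several buffers in one operation can be illustrated with the standard java.nio.channels.FileChannel class, which supports gathering writes and scattering reads over ByteBuffer arrays. Note the assumptions: this sketch is synchronous and does not use AsyncSocketChannelHelper or IAsyncMultiFuture, and the "header" and "body" split is a hypothetical example of data handled by different application components.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MultiBufferDemo {

    // Writes header and body with gathering writes, then reads them back
    // with scattering reads into two separate buffers.
    static String[] roundTrip(String header, String body) {
        byte[] h = header.getBytes(StandardCharsets.UTF_8);
        byte[] b = body.getBytes(StandardCharsets.UTF_8);
        long total = h.length + b.length;
        try {
            Path file = Files.createTempFile("multi-demo", ".bin");
            try (FileChannel channel = FileChannel.open(file,
                     StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                // Heap buffers are acceptable to standard NIO; the Async IO
                // package described in the article requires direct buffers.
                ByteBuffer[] out = { ByteBuffer.wrap(h), ByteBuffer.wrap(b) };
                long written = 0;
                while (written < total) {
                    written += channel.write(out); // gathering write
                }

                channel.position(0);
                ByteBuffer[] in = { ByteBuffer.allocateDirect(h.length),
                                    ByteBuffer.allocateDirect(b.length) };
                long read = 0;
                while (read < total) {
                    long n = channel.read(in);     // scattering read
                    if (n < 0) {
                        break;                     // unexpected end of file
                    }
                    read += n;
                }
                in[0].flip();
                in[1].flip();
                return new String[] { decode(in[0]), decode(in[1]) };
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static String decode(ByteBuffer buffer) {
        return StandardCharsets.UTF_8.decode(buffer).toString();
    }

    public static void main(String[] args) {
        String[] parts = roundTrip("HDR1", "payload bytes");
        System.out.println(parts[0] + " | " + parts[1]);
    }
}
```

The first buffer is filled before the second is touched, so sizing the first buffer to a fixed-length header leaves the body cleanly separated in the second.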
The time-out operations provided by the AsyncSocketChannelHelper and AsyncFileChannelHelper classes are versions of the basic read and write operations that have a time-out period applied to them. The basic read and write operations of asynchronous channels can in principle take forever to complete. This is particularly a problem for an application that uses the callback technique to get notified that the operation is complete, since the callback might never get called if the operation does not complete. The use of the time-out versions of the operations guarantees that the IAsyncFuture will complete when the time-out expires, even if the underlying read/write operation does not complete. If the time-out expires, the IAsyncFuture completes with an AsyncTimeoutException. In addition, the underlying operation is cancelled (equivalent to invoking the IAsyncFuture cancel(future) method).
Note that using the time-out versions of read and write is different from using the IAsyncFuture waitForCompletion( timeout ) method (see Listing 4). waitForCompletion provides a time-out for the wait on the completion of the IAsyncFuture. If this time-out expires, control is returned to the application, but the IAsyncFuture is not completed and the underlying read/write operation is still underway. By contrast, if the time-out expires on the AsyncChannelHelper read/write methods, the IAsyncFuture is completed (with an AsyncTimeoutException) and the underlying operation is cancelled.
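The distinction between a time-out on the wait and a time-out on the operation can be sketched with java.util.concurrent.CompletableFuture, again as a stand-in for IAsyncFuture rather than the package's own classes. Here get( timeout ) is assumed to correspond to waitForCompletion( timeout ), and orTimeout( ... ), which requires Java 9 or later, is assumed to correspond loosely to the helper classes' timed operations. One difference to keep in mind is that orTimeout only completes the future and does not cancel any underlying work.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class TimeoutStyles {

    // Timed wait: only the wait gives up. Returns whether the future is done afterwards.
    static boolean doneAfterTimedWait() {
        CompletableFuture<Integer> future = new CompletableFuture<>(); // never completed
        try {
            future.get(50, TimeUnit.MILLISECONDS);
        } catch (TimeoutException expected) {
            // Control returns to the caller, but the operation is still pending.
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
        return future.isDone();
    }

    // Timed operation: the future itself completes exceptionally when the time-out expires.
    // Returns the exception the future completed with, or null if it completed normally.
    static Throwable causeAfterTimedOperation() {
        CompletableFuture<Integer> future =
            new CompletableFuture<Integer>().orTimeout(50, TimeUnit.MILLISECONDS);
        try {
            future.get(); // waits until the time-out completes the future
            return null;
        } catch (ExecutionException e) {
            return e.getCause();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("done after timed wait: " + doneAfterTimedWait());
        System.out.println("cause after timed operation: " + causeAfterTimedOperation());
    }
}
```

In the first case any callback registered on the future would still be waiting. In the second case the callback would fire with the time-out exception, which is the guarantee the timed operations are meant to give.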
An important point about operations that time out is that the state of the channel is left indeterminate. Once an operation is cancelled, it's unlikely that the channel can be used again and the safe option is for the application to close the channel.
Asynchronous IO Thread Management
If you write an application program that uses the callback method to get notifications that asynchronous IO operations have completed, you need to understand which Java threads are used to run the callbacks. The threads used to run the callbacks will run application code. If your application code needs the threads to have any special characteristics, such as specific context information or security settings, this could cause problems for your application code unless your application carefully controls the actual threads that are used to run the callbacks.
The threading design of the Async IO package is outlined in Figure 1. Applications make requests to the package for Async IO operations. The requests are passed to the operating system's IO functions. When the operations complete, notifications of their completion are passed back to the Async IO package and are initially held in an IO Completion Queue. The Async IO package has a set of one or more Java threads that it uses to process the notifications in the IO Completion Queue. Notifications are taken from the Completion Queue, and the IAsyncFuture related to the operation is marked as completed. If a Callback Listener has been registered on the IAsyncFuture, the Callback Listener method is called. Once the CallBack Listener method finishes, the thread returns to the Async IO package and is used to process other notifications from the Completion Queue.
By default, the Async IO package uses its own Result Thread Manager to manage the threads that handle the callbacks. It allocates a number of threads, typically equal to the number of processors on the system. These threads are vanilla Java threads with no special characteristics. However, the application can control the threads in one of two ways.
The application can override the default Result Thread Manager by calling the setResultThreadManager(IResultThreadManager) method of the AbstractAsyncChannel class. The application must supply its own manager class that implements the IResultThreadManager interface, which defines the full life cycle for threads used by the Async IO package. The IResultThreadManager interface provides control over the policies applied to the result threads, including the timing of creation and destruction, the minimum and maximum numbers of threads, plus the technique used for creation and destruction of the threads.
Alternatively, the application can use the default IResultThreadManager implementation provided by the Async IO package, but control the nature of the threads used to handle results and callbacks. This is done by supplying the default IResultThreadManager implementation with an application-defined IThreadPool object, by calling the setThreadPool( IThreadPool ) method on the IResultThreadManager. This allows the application to control the nature of the threads used in the Result Thread Manager. For example, application data can be attached to the thread or specific security settings applied to the thread, or the threads used in the IResultThreadManager can be cached by the IThreadPool.
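The effect of supplying an application-defined pool can be sketched with standard java.util.concurrent classes. This is an illustration under stated assumptions, not the IThreadPool or IResultThreadManager API: an ExecutorService built from a custom ThreadFactory plays the role of the application's thread pool, a CompletableFuture stands in for the asynchronous operation, and the thread-name prefix is a hypothetical way of marking threads that the application controls.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

class CallbackThreads {

    // Creates daemon threads with a recognizable name. An application could
    // also attach context data or apply security settings at this point.
    static ThreadFactory namedFactory(String prefix) {
        AtomicInteger counter = new AtomicInteger();
        return task -> {
            Thread thread = new Thread(task, prefix + counter.incrementAndGet());
            thread.setDaemon(true);
            return thread;
        };
    }

    // Runs a completion callback on the application's pool and returns
    // the name of the thread that executed it.
    static String callbackThreadName(String prefix) {
        // One thread per processor, mirroring the default described in the article.
        int threads = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(threads, namedFactory(prefix));
        try {
            CompletableFuture<Integer> operation = new CompletableFuture<>();
            CompletableFuture<String> callback =
                operation.thenApplyAsync(bytes -> Thread.currentThread().getName(), pool);
            operation.complete(2048); // stand-in for the system signalling completion
            return callback.join();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("callback ran on: " + callbackThreadName("io-callback-"));
    }
}
```

Because the callback is dispatched explicitly to the pool, application code never runs on a thread whose characteristics the application did not choose.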
Performance
Performance is one of the important reasons for using the Async IO package. How does its performance stack up against the original synchronous Java IO and also against the New IO package?
Performance is a complex issue, but a simple test provides some guidance. The test uses Socket IO with multiple clients communicating with a single server. Each client performs repeated operations, writing 256 bytes to the server and reading a 2,048 byte response from the server. For the test, the clients are always the same code, but three variations of the server code are used:
- Synchronous Server, using the original Java IO classes
- New IO Server, using the New IO classes
- Asynchronous IO Server, using the Async IO package
We ran the tests with a Windows 2000 single processor server system and a Windows Server 2003 four-way system running the clients, connected via a 100Mb Ethernet network, with varying numbers of client sockets each performing a connect followed by 50 read/write cycles with the server. The results are shown in Table 1, which provides the data for the average time in microseconds to complete each read/write cycle, quoted with and without the startup time included. The startup time is the time taken for the client socket to connect to the server before any data is transmitted.
(If you're surprised that the four-way server system is used to drive the client side for this test, it's used to ensure that the very large number of clients can be created successfully.)
The last two cases involve running with a number of inactive client sockets, which are connected to the server but are not transmitting any data during the test. This is more typical of a real Web server. These inactive sockets are a load for the server to handle alongside the active sockets.
This shows that the Async IO, New IO, and Sync servers are all similar in terms of average times in lightly loaded situations. The failure of the Sync server to handle the case of 7,000 total clients shows its limitations in terms of scalability. The figures for the New IO server show that performance suffers as the number of clients rises. In particular, the New IO server shows a marked rise in the overhead for starting up new connections as the number of connections rises. The Async IO server manages to achieve reasonably stable performance right through the range tested, both for startup time and for the read/write cycle time.
These simple tests show that the Async IO package is able to deliver on its promise of performance and scalability and can form part of the solution for server applications intended to handle many thousands of clients.
Pitfalls to Avoid
As with the use of any API, there are some aspects of the Async IO API that you need to think about to avoid problems.
You need to be careful with the use of the ByteBuffers that are used in the read and write methods of asynchronous channels. Because the IO operations occur asynchronously, there is the potential for the Async IO package to use the ByteBuffers at the same time as the application code. The rule to follow in order to avoid trouble is that the application code should not access the ByteBuffers from the time that an asynchronous read or write operation is requested until the point that the Async IO package signals that the operation is complete. Any attempt by the application to access the ByteBuffers before the operation is complete could cause unpredictable results.
Asynchronous channels provide facilities for the cancellation of asynchronous IO operations. These include the explicit cancel() method available on the futures returned by operations on asynchronous channels, and also the implicit cancellation that takes place as part of the time-out of an IO operation on an AsyncSocketChannelHelper or AsyncFileChannelHelper. If an operation is cancelled, the underlying channel (file or socket) is left in an indeterminate state. Because of this, your application should not attempt to perform any more operations on the channel once cancellation has occurred. The best thing to do is to close the channel as soon as possible.
The performance of read and write operations using Async IO is designed to be as close as possible to the performance of equivalent synchronous IO operations. However, there is some extra overhead involved in running an asynchronous operation compared with a synchronous operation, associated with setting up and executing the asynchronous notifications. The implication of this is that asynchronous reads and writes involving very small packets of data (i.e., a few bytes only) are going to have a significantly higher overhead than synchronous equivalents. You should take this into account when designing your application to use Async IO.
Summary
The Java Async IO package provides valuable facilities for fast, scalable Socket and File IO, which are an alternative to the use of java.io and java.nio facilities in client-side and server-side applications. The package also assists the program design by providing an event-driven interface for IO operations that is simple to use.
Copyright © 2004 SYS-CON Media, Inc. — All Rights Reserved.
Syndicated stories and blog feeds, all rights reserved by the author.
More Stories By Mike Edwards
Dr. Mike Edwards is a strategic planner in the IBM Java Technologies group in Hursley, England. He is responsible for technical planning for future IBM products including the IBM Java SDKs and for Web services–related products. Before working on Async IO, Mike was involved in the planning of Java SDK 1.4.0 and was a member of the Expert Group for JSR 059, which defined the specification for J2SE 1.4.0 and JSR 051, which created the New IO package. Mike received his PhD in Elementary Particle Physics from Birmingham University.
More Stories By Tim Ellison
Tim Ellison is a senior software engineer and strategic planner in the emerging technologies team at IBM Hursley Java Technologies group. He has contributed to the implementation of Smalltalk, IBM VisualAge Micro Edition, Eclipse, and the Java SDK over a period of 20 years. His interests are in new ways to apply object technology to difficult problems.
Mike Edwards 10/26/04 10:55:01 AM EDT
Paul, Please email me directly if you would like to discuss your question about NIO in more detail - I'd prefer to keep this discussion thread dedicated to Async IO. Yours, Mike.
Paul 10/25/04 10:38:17 AM EDT
Mike: Your article is great! I used NIO for a socket server, could you help me out with a question? NIO sends messages as bytes between client and server, and I have many samples that use it to deliver string messages, acting as an HTTP server. How can I deliver and parse a message wrapped in an object instead of only a string? Could you give me some clues or any samples? Regards, Thank you very much. Paul
Mike Edwards 10/25/04 08:27:39 AM EDT
Bret, Your question about why NIO performs less well than the original synchronous IO is an interesting one. Fundamentally, NIO is less about performance and more about scalability. Synchronous IO demands one thread per socket and most operating systems limit the number of threads. New IO allows many sockets per thread and so allows a much greater number of sockets per application. The figures in our article show this lack of scalability of synchronous IO. In terms of performance, New IO has to do the same read and write calls to the operating system that are done by synchronous IO. However, New IO requires the use of the Selector and the management of the key sets - this is an overhead. Synchronous IO by contrast has the overhead of thread switching between the many threads. At low numbers of sockets, the difference in the overheads is not significant, except that the setup time for putting a new channel into the Selector makes New IO slower to add a new channel (note: our code caches the threads used by synchronous IO). At high numbers of sockets, the time to insert a channel into the Selector climbs, as does the time to do the Select operation, due to the data structures used to hold the select list. Thread switch time does not increase as much, which makes New IO performance look worse at high numbers of sockets. We shall look to make our performance test code available on the AIO4J site, so that you can take a look at how the server code compares between Sync IO, New IO and AIO4J. Yours, Mike.
Bret Hansen 10/23/04 12:17:20 PM EDT
So your test shows that the nio package is slower than the original synchronous API. Can you explain why? I haven't looked at your code yet. Bret
Mike Edwards 10/13/04 03:11:59 AM EDT
Csaba, Yours, Mike.
Csaba 10/12/04 05:12:32 AM EDT
Nevermind, found it...
Csaba 10/12/04 05:10:31 AM EDT
Where are the tables/images for this article? I was really interested in that comparison chart, but couldn't find the link...