|By Mike Edwards, Tim Ellison||
|October 6, 2004 12:00 AM EDT||
The Async IO package is designed to provide fast and scalable input/output (IO) for Java applications using sockets and files. It provides an alternative to the original synchronous IO classes available in the java.io and java.net packages, where scalability is limited by the inherent "one thread per IO object" design. It also provides an alternative to the New IO package (java.nio), where performance and scalability are limited by the polling design of the select() method.
As its name implies, the Async IO package provides asynchronous IO operations, where the application requests an IO operation from the system, the operation is executed by the system asynchronously from the application, and the system then informs the application when the operation is complete. The Async IO package supports a number of styles of application programming and gives the application designer considerable freedom in the management of the number of threads used to handle IO operations and also in the design of the components that handle the asynchronous notifications.
Why Java Applications Need the Async IO Package
The question "Why do Java applications need the Async IO package?" can be answered in two words: performance and scalability.
Performance and scalability are key attributes of the IO system for IO-intensive applications. IO-intensive applications are typically, although not exclusively, server-side applications. Server-side applications are characterized by the need to handle many network connections to many clients and also by the need to access many files to serve requests from those clients. The existing standard Java facilities for handling network connections and files do not serve the needs of server-side applications adequately. The java.io and java.net packages provide synchronous IO capabilities, which require a one-thread-per-IO-connection style of design, which limits scalability since running thousands of threads on a server imposes significant overhead on the operating system. The New IO package, java.nio, addresses the scalability issue of the one-thread-per-IO-connection design, but the New IO select() mechanism limits performance.
Current operating systems, such as Windows, AIX and Linux, provide facilities for fast, scalable IO based on the use of asynchronous notifications of IO operations taking place in the operating system layers. For example, Windows and AIX have IO Completion Ports, while Linux has the sys_epoll facility. The Async IO package aims to make these fast and scalable IO facilities available to Java applications through a package that provides IO capabilities linked to an asynchronous style of programming.
The current version of the Async IO package, com.ibm.io.async, is designed as an extension to the Java 2 Standard Edition 1.4, which can in principle be provided on any hardware and software platform. The platforms currently supported by the package include Windows, AIX, Linux, and Solaris.
Elements of the Async IO Package
The major elements of the Async IO package are the classes AsyncFileChannel, AsyncSocketChannel, and AsyncServerSocketChannel. The channels represent asynchronous versions of files, sockets, and server sockets. These fundamental classes are designed to be similar in naming and in operation to the channel classes of the New IO package. Good news for Java programmers familiar with the New IO package.
AsyncFileChannels and AsyncSocketChannels provide asynchronous read and write methods against the underlying file or socket. An asynchronous operation is a request to the system to perform the operation, where the method returns immediately to the calling application regardless of whether the operation has taken place or not. Instead of providing a return value that gives information about the operation, such as the number of bytes read/written, asynchronous read and write operations return objects that implement the IAsyncFuture interface.
The IAsyncFuture interface is another important component of the Async IO package. First an IAsyncFuture represents the state of the asynchronous operation - most important, whether the operation has completed or not. Second, the IAsyncFuture provides methods that return the result of the operation once it has completed. An IAsyncFuture can throw exceptions as well as the normal outcome of the operation, if something goes wrong during the operation.
The application uses one of three methods to find out whether a particular operation has completed:
- Polling: Calls the isCompleted() method of the IAsyncFuture, which returns true once the operation is complete
- Blocking: Uses the waitForCompletion() method of the IAsyncFuture, which can be used either to wait for a specified period or to wait indefinitely for the operation to complete
- Callback: Uses the addCompletionListener() method of the IAsyncFuture, so the application can register a method that's called back by the system when the operation completes
Data Formats Supported by Asynchronous Read and Write Operations
The read and write operations supplied by the Async IO package use the ByteBuffer class to hold the data. This class is the same as the one used in the New IO package. One difference between the Async IO package and the New IO package is that the ByteBuffers used for the Async IO package must be Direct ByteBuffers. Direct ByteBuffers have the memory for their content allocated in native memory outside the Java Heap. This provides better performance for IO operations since the operating system code can access the data in the buffer memory directly, without the need for copying.
ByteBuffers can be viewed as buffers supporting other primitive types, such as Int, Float, or Char, using methods such as bytebuffer.asIntBuffer(). ByteBuffers also have a series of methods that support the reading and writing of primitive types at arbitrary locations in the ByteBuffer using methods like bytebuffer.putLong( index, aLong).
Simple Examples of Async IO Read and Write Operations
Listing 1 shows the use of an AsyncSocketChannel as a client socket that involves connecting the socket to a remote server and then performing a read operation. In this example, the blocking style is used to wait for asynchronous operations to complete.
Listing 2 is a program fragment that shows the use of a callback to receive the notification of the completion of an asynchronous operation. This fragment shows just some of the methods of a class that is handling socket IO. It's assumed that an AsyncSocketChannel has already been opened and connected, that a direct ByteBuffer is available, and that an object named "state" tracks the state of the IO.
When the IO operation is requested (channel.read( ... )) an IAsyncFuture is returned. The next step is to give the IAsyncFuture a callback method by calling the addCompletionListener( ... ) method. The callback method gets called when the operation completes. The callback method is the futureCompleted( ... ) method that forms part of a class that implements the ICompletionListener interface.
In this example, the class with the callback is the same as the class that makes the read request (so "this" is used as the first parameter in the addCompletionListener method). The signature of the futureCompleted ( ... ) method is fixed: its parameters are an IAsyncFuture object that represents the operation and, second, an object that holds the application state, which is associated with the IAsync-Future through the addCompletion-Listener( ... ) method where it forms the second parameter (in this example, we use the object called "state").
The futureCompleted( ... ) method is called when the operation completes. It is possible that the operation is complete before the completion listener is added to the future. If this happens, the futureCompleted( ... ) method is called directly from the addCompletionListener( ... ) method, without any delay.
The futureCompleted( ... ) method receives the future object relating to the completed operation, plus the application state object.
Beyond the Basics: Multi Read/Write Operations and Timeouts
The previous sections described the basic functions available as part of the Java Async IO package. The package also supplies more advanced interfaces for asynchronous IO. The first advanced interface supplies the capability to perform read and write operations using multiple buffers for the data. The second advanced interface provides a time-out on the asynchronous IO operation.
Both the multi read/write operations and the time-out facility are provided by the AsyncSocketChannelHelper and AsyncFileChannelHelper classes. This is done to keep the interface to the Async-FileChannel and AsyncSocketChannel classes as straightforward as possible.
Create an AsyncSocketChannelHelper object by wrapping an existing AsyncSocketChannel. An AsyncFileChannelHelper is created by wrapping an existing AsyncFileChannel object. All operations on the channel helper object apply to the underlying asynchronous channel.
The multi read/write operations take ByteBuffer arrays as input and return IAsyncMultiFuture objects. IAsyncMultiFuture objects differ from IAsyncFuture objects only in that they have a getBuffers() method that returns the ByteBuffer arrays involved in the operation in place of the getBuffer() method, which relates to the single buffer read/write operations. The multi read/write operations are useful for applications that need to send or receive data that's best handled by multiple buffers, perhaps where different elements of the data are handled by different application components (see Listing 3).
The time-out operations provided by the AsyncSocketChannelHelper and AsyncFileChannelHelper classes are versions of the basic read and write operations that have a time-out period applied to them. The basic read and write operations of asynchronous channels can in principle take forever to complete. This is particularly a problem for an application that uses the callback technique to get notified that the operation is complete, since the callback might never get called if the operation does not complete. The use of the time-out versions of the operations guarantees that the IAsyncFuture will complete when the time-out expires, even if the underlying read/write operation does not complete. If the time-out expires, the IAsyncFuture completes with an AsyncTimeoutException. In addition, the underlying operation is cancelled (equivalent to invoking the IAsyncFuture cancel(future) method).
Note that using the time-out versions of read and write are different from using the IAsyncFuture waitForCompletion( timeout ) method (see Listing 4). waitForCompletion provides a time-out for the wait on the completion of the IAsyncFuture. If this time-out expires, control is returned to the application, but the IAsyncFuture is not completed and the underlying read/write operation is still underway. By contrast, if the time-out expires on the AsyncChannelHelper read/write methods, the IAsyncFuture is completed (with an AsyncTimeoutException) and the underlying operation is cancelled.
An important point about operations that time out is that the state of the channel is left indeterminate. Once an operation is cancelled, it's unlikely that the channel can be used again and the safe option is for the application to close the channel.
Asynchronous IO Thread Management
If you write an application program that uses the callback method to get notifications that asynchronous IO operations have completed, you need to understand which Java threads are used to run the callbacks. The threads used to run the callbacks will run application code. If your application code needs the threads to have any special characteristics, such as specific context information or security settings, this could cause problems for your application code unless your application carefully controls the actual threads that are used to run the callbacks.
The threading design of the Async IO package is outlined in Figure 1. Applications make requests to the package for Async IO operations. The requests are passed to the operating system's IO functions. When the operations complete, notifications of their completion are passed back to the Async IO package and are initially held in an IO Completion Queue. The Async IO package has a set of one or more Java threads that it uses to process the notifications in the IO Completion Queue. Notifications are taken from the Completion Queue, and the IAsyncFuture related to the operation is marked as completed. If a Callback Listener has been registered on the IAsyncFuture, the Callback Listener method is called. Once the CallBack Listener method finishes, the thread returns to the Async IO package and is used to process other notifications from the Completion Queue.
By default, the Async IO package uses its own Result Thread Manager to manage the threads that handle the callbacks. It allocates a number of threads, typically equal to the number of processors on the system. These threads are vanilla Java threads with no special characteristics. However, the application can control the threads in one of two ways.
The application can override the default Result Thread Manager by calling the setResultThreadManager(IResult-ThreadManager) method of the Abstract- AsyncChannel class. The application must supply its own manager class that implements the IResultThreadManager interface, which defines the full life cycle for threads used by the Async IO package. The IResultThreadManager interface provides control over the policies applied to the result threads, including the timing of creation and destruction, the minimum and maximum numbers of threads, plus the technique used for creation and destruction of the threads.
Alternatively, the application can use the default IResultThreadManager implementation provided by the Async IO package, but control the nature of the threads used to handle results and callbacks. This is done by supplying the default IResultThreadManager implementation with an application-defined IThreadPool object, by calling the set-ThreadPool( IThreadPool ) method on the IResultThreadManager. This allows the application to control the nature of the threads used in the Result Thread Manager. For example, application data can be attached to the thread or specific security settings applied to the thread, or the threads used in the IResultThreadManager can be cached by the IThreadPool.
Performance is one of the important reasons for using the Async IO package. How does its performance stack up against the original synchronous Java IO and also against the New IO package?
Performance is a complex issue, but a simple test provides some guidance. The test uses Socket IO with multiple clients communicating with a single server. Each client performs repeated operations, writing 256 bytes to the server and reading a 2,048 byte response from the server. For the test, the clients are always the same code, but three variations of the server code are used:
- Synchronous Server, using the original Java IO classes
- New IO Server, using the New IO classes
- Asynchronous IO Server, using the Async IO package
We ran the tests with a Windows 2000 single processor server system and a Windows Server 2003 four-way system running the clients, connected via a 100Mb Ethernet network, with varying numbers of client sockets each performing a connect followed by 50 read/write cycles with the server. The results are shown in Table 1, which provides the data for the average time in microseconds to complete each read/write cycle, quoted with and without the startup time included. The startup time is the time taken for the client socket to connect to the server before any data is transmitted.
(If you're surprised that the four-way server system is used to drive the client side for this test, it's used to ensure that the very large number of clients can be created successfully.)
The last two cases involve running with a number of inactive client sockets, which are connected to the server but are not transmitting any data during the test. This is more typical of a real Web server. These inactive sockets are a load for the server to handle alongside the active sockets.
This shows the Async IO, New IO, and Sync servers are all similar in terms of average times in lightly loaded situations. The failure of the Sync server to handle the case of 7,000 total clients shows its limitations in terms of scalability. The figures for the New IO server show that the performance suffers as the number of clients rise. In particular the New IO server shows a marked rise in the overhead for starting up new connections as the number of connections rises. The Async IO server manages to achieve reasonably stable performance right through the range tested, both for startup time and for the read/write cycle time.
These simple tests show that the Async IO package is able to deliver on its promise of performance and scalability and can form part of the solution for server applications intended to handle many thousands of clients.
Pitfalls to Avoid
As with the use of any API, there are some aspects of the Async IO API that you need to think about to avoid problems.
You need to be careful with the use of the ByteBuffers that are used in the read and write methods of asynchronous channels. Because the IO operations occur asynchronously, there is the potential for the Async IO package to use the ByteBuffers at the same time as the application code. The rule to follow in order to avoid trouble is that the application code should not access the ByteBuffers from the time that an asynchronous read or write operation is requested until the point that the Async IO package signals that the operation is complete. Any attempt by the application to access the ByteBuffers before the operation is complete could cause unpredictable results.
Asynchronous channels provide facilities for the cancellation of asynchronous IO operations. These include the explicit cancel() method available on the futures returned by operations on asynchronous channels, and also the implicit cancellation that takes place as part of the time-out of an IO operation on an AsyncSocketChannelHelper or AsyncFileChannelHelper. If an operation is cancelled, the under-lying channel (file or socket) is left in an indeterminate state. Because of this, your application should not attempt to perform any more operations on the channel once cancellation has occurred. The best thing to do is to close the channel as soon as possible.
The performance of read and write operations using Async IO is designed to be as close as possible to the performance of equivalent synchronous IO operations. However, there is some extra overhead involved in running an asynchronous operation compared with a synchronous operation, associated with setting up and executing the asynchronous notifications. The implication of this is that asynchronous reads and writes involving very small packets of data (i.e., a few bytes only) are going to have a significantly higher overhead than synchronous equivalents. You should take this into account when designing your application to use Async IO.
The Java Async IO package provides valuable facilities for fast, scalable Socket and File IO, which are an alternative to the use of java.io and java.nio facilities in client-side and server-side applications. The package also assists the program design by providing an event-driven interface for IO operations that is simple to use.
|Mike Edwards 10/26/04 10:55:01 AM EDT|
Please email me directly if you would like to discuss your question about NIO in more detail - I'd prefer to keep this discussion thread dedicated to Async IO.
|Paul 10/25/04 10:38:17 AM EDT|
Your article is great! I used NIO for a socket server, could you help me out a qustion?
NIO send message by Byte between client and server, I got many samples with it to delever string message acting as HTTP server. HOw can I deliver and parse the message wrapped in an object instead of only string? could you give me some clues or any samples?
Thanks you very much.
|Mike Edwards 10/25/04 08:27:39 AM EDT|
Your question about why NIO performs less well than the original synchronous IO is an interesting one.
Fundamentally, NIO is less about performance and more about scalability. Synchronous IO demands one thread per socket and most operating systems limit the number of threads. New IO allows many sockets per thread and so allows a much greater number of sockets per application. The figures in our article show this lack of scalability of synchronous IO.
In terms of performance, New IO has to do the same read and write calls to the operating system that are done by synchronous IO. However, New IO requires the use of the Selector and the management of the key sets - this is an overhead. Synchronous IO by contrast has the overhead of thread switching between the many threads. At low numbers of sockets, the difference in the overheads is not significant, except that the setup time for putting a new channel into the Selector makes New IO slower to add a new channel (note: our code caches the threads used by synchronous IO). At high number of sockets, the time to insert a channel into the Selector climbs as does the time to do the Select operation, due to the data structures used to hold the select list. Thread switch time does not increase as much - so making New IO performance look worse at high numbers of sockets.
We shall look to make our performance test code available on the AIO4J site, so that you can take a look at how the server code compares between Sync IO, New IO and AIO4J.
|Bret Hansen 10/23/04 12:17:20 PM EDT|
So your test shows that the nio package is slower than the original synchronous API.
Can you explain why? I haven't looked at your code yet.
|Mike Edwards 10/13/04 03:11:59 AM EDT|
|Csaba 10/12/04 05:12:32 AM EDT|
Nevermind, found it...
|Csaba 10/12/04 05:10:31 AM EDT|
Where are the tables/images for this article ? I was really interested in that comparison chart, but couldn't find the link...
The Internet of Things will challenge the status quo of how IT and development organizations operate. Or will it? Certainly the fog layer of IoT requires special insights about data ontology, security and transactional integrity. But the developmental challenges are the same: People, Process and Platform. In his session at @ThingsExpo, Craig Sproule, CEO of Metavine, demonstrated how to move beyond today's coding paradigm and shared the must-have mindsets for removing complexity from the develo...
Jul. 25, 2016 01:45 AM EDT Reads: 1,195
SYS-CON Events announced today that MangoApps will exhibit at the 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. MangoApps provides modern company intranets and team collaboration software, allowing workers to stay connected and productive from anywhere in the world and from any device.
Jul. 25, 2016 01:30 AM EDT Reads: 1,243
Basho Technologies has announced the latest release of Basho Riak TS, version 1.3. Riak TS is an enterprise-grade NoSQL database optimized for Internet of Things (IoT). The open source version enables developers to download the software for free and use it in production as well as make contributions to the code and develop applications around Riak TS. Enhancements to Riak TS make it quick, easy and cost-effective to spin up an instance to test new ideas and build IoT applications. In addition to...
Jul. 25, 2016 12:30 AM EDT Reads: 1,868
The 19th International Cloud Expo has announced that its Call for Papers is open. Cloud Expo, to be held November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA, brings together Cloud Computing, Big Data, Internet of Things, DevOps, Digital Transformation, Microservices and WebRTC to one location. With cloud computing driving a higher percentage of enterprise IT budgets every year, it becomes increasingly important to plant your flag in this fast-expanding business opportuni...
Jul. 25, 2016 12:15 AM EDT Reads: 2,478
"We've discovered that after shows 80% if leads that people get, 80% of the conversations end up on the show floor, meaning people forget about it, people forget who they talk to, people forget that there are actual business opportunities to be had here so we try to help out and keep the conversations going," explained Jeff Mesnik, Founder and President of ContentMX, in this SYS-CON.tv interview at 18th Cloud Expo, held June 7-9, 2016, at the Javits Center in New York City, NY.
Jul. 24, 2016 11:30 PM EDT Reads: 1,257
“delaPlex Software provides software outsourcing services. We have a hybrid model where we have onshore developers and project managers that we can place anywhere in the U.S. or in Europe,” explained Manish Sachdeva, CEO at delaPlex Software, in this SYS-CON.tv interview at @ThingsExpo, held June 7-9, 2016, at the Javits Center in New York City, NY.
Jul. 24, 2016 11:00 PM EDT Reads: 1,495
From wearable activity trackers to fantasy e-sports, data and technology are transforming the way athletes train for the game and fans engage with their teams. In his session at @ThingsExpo, will present key data findings from leading sports organizations San Francisco 49ers, Orlando Magic NBA team. By utilizing data analytics these sports orgs have recognized new revenue streams, doubled its fan base and streamlined costs at its stadiums. John Paul is the CEO and Founder of VenueNext. Prior ...
Jul. 24, 2016 10:45 PM EDT Reads: 1,979
IoT is rapidly changing the way enterprises are using data to improve business decision-making. In order to derive business value, organizations must unlock insights from the data gathered and then act on these. In their session at @ThingsExpo, Eric Hoffman, Vice President at EastBanc Technologies, and Peter Shashkin, Head of Development Department at EastBanc Technologies, discussed how one organization leveraged IoT, cloud technology and data analysis to improve customer experiences and effi...
Jul. 24, 2016 10:00 PM EDT Reads: 1,952
The Internet of Things will challenge the status quo of how IT and development organizations operate. Or will it? Certainly the fog layer of IoT requires special insights about data ontology, security and transactional integrity. But the developmental challenges are the same: People, Process and Platform and how we integrate our thinking to solve complicated problems. In his session at 19th Cloud Expo, Craig Sproule, CEO of Metavine, will demonstrate how to move beyond today's coding paradigm ...
Jul. 24, 2016 09:45 PM EDT Reads: 2,121
Internet of @ThingsExpo, taking place November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA, is co-located with the 19th International Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world and ThingsExpo Silicon Valley Call for Papers is now open.
Jul. 24, 2016 09:00 PM EDT Reads: 2,464
Big Data engines are powering a lot of service businesses right now. Data is collected from users from wearable technologies, web behaviors, purchase behavior as well as several arbitrary data points we’d never think of. The demand for faster and bigger engines to crunch and serve up the data to services is growing exponentially. You see a LOT of correlation between “Cloud” and “Big Data” but on Big Data and “Hybrid,” where hybrid hosting is the sanest approach to the Big Data Infrastructure pro...
Jul. 24, 2016 07:45 PM EDT Reads: 1,853
"My role is working with customers, helping them go through this digital transformation. I spend a lot of time talking to banks, big industries, manufacturers working through how they are integrating and transforming their IT platforms and moving them forward," explained William Morrish, General Manager Product Sales at Interoute, in this SYS-CON.tv interview at 18th Cloud Expo, held June 7-9, 2016, at the Javits Center in New York City, NY.
Jul. 24, 2016 07:30 PM EDT Reads: 2,056
A critical component of any IoT project is what to do with all the data being generated. This data needs to be captured, processed, structured, and stored in a way to facilitate different kinds of queries. Traditional data warehouse and analytical systems are mature technologies that can be used to handle certain kinds of queries, but they are not always well suited to many problems, particularly when there is a need for real-time insights.
Jul. 24, 2016 07:30 PM EDT Reads: 1,694
With 15% of enterprises adopting a hybrid IT strategy, you need to set a plan to integrate hybrid cloud throughout your infrastructure. In his session at 18th Cloud Expo, Steven Dreher, Director of Solutions Architecture at Green House Data, discussed how to plan for shifting resource requirements, overcome challenges, and implement hybrid IT alongside your existing data center assets. Highlights included anticipating workload, cost and resource calculations, integrating services on both sides...
Jul. 24, 2016 07:00 PM EDT Reads: 1,931
"We are a well-established player in the application life cycle management market and we also have a very strong version control product," stated Flint Brenton, CEO of CollabNet,, in this SYS-CON.tv interview at 18th Cloud Expo, held June 7-9, 2016, at the Javits Center in New York City, NY.
Jul. 24, 2016 06:45 PM EDT Reads: 1,762
Unless your company can spend a lot of money on new technology, re-engineering your environment and hiring a comprehensive cybersecurity team, you will most likely move to the cloud or seek external service partnerships. In his session at 18th Cloud Expo, Darren Guccione, CEO of Keeper Security, revealed what you need to know when it comes to encryption in the cloud.
Jul. 24, 2016 05:00 PM EDT Reads: 2,357
We're entering the post-smartphone era, where wearable gadgets from watches and fitness bands to glasses and health aids will power the next technological revolution. With mass adoption of wearable devices comes a new data ecosystem that must be protected. Wearables open new pathways that facilitate the tracking, sharing and storing of consumers’ personal health, location and daily activity data. Consumers have some idea of the data these devices capture, but most don’t realize how revealing and...
Jul. 24, 2016 05:00 PM EDT Reads: 2,033
What are the successful IoT innovations from emerging markets? What are the unique challenges and opportunities from these markets? How did the constraints in connectivity among others lead to groundbreaking insights? In her session at @ThingsExpo, Carmen Feliciano, a Principal at AMDG, will answer all these questions and share how you can apply IoT best practices and frameworks from the emerging markets to your own business.
Jul. 24, 2016 04:15 PM EDT Reads: 1,551
Ask someone to architect an Internet of Things (IoT) solution and you are guaranteed to see a reference to the cloud. This would lead you to believe that IoT requires the cloud to exist. However, there are many IoT use cases where the cloud is not feasible or desirable. In his session at @ThingsExpo, Dave McCarthy, Director of Products at Bsquare Corporation, will discuss the strategies that exist to extend intelligence directly to IoT devices and sensors, freeing them from the constraints of ...
Jul. 24, 2016 03:45 PM EDT Reads: 1,726
You think you know what’s in your data. But do you? Most organizations are now aware of the business intelligence represented by their data. Data science stands to take this to a level you never thought of – literally. The techniques of data science, when used with the capabilities of Big Data technologies, can make connections you had not yet imagined, helping you discover new insights and ask new questions of your data. In his session at @ThingsExpo, Sarbjit Sarkaria, data science team lead ...
Jul. 24, 2016 03:30 PM EDT Reads: 910