|By Mike Edwards, Tim Ellison||
|October 6, 2004 12:00 AM EDT||
The Async IO package is designed to provide fast and scalable input/output (IO) for Java applications using sockets and files. It provides an alternative to the original synchronous IO classes available in the java.io and java.net packages, where scalability is limited by the inherent "one thread per IO object" design. It also provides an alternative to the New IO package (java.nio), where performance and scalability are limited by the polling design of the select() method.
As its name implies, the Async IO package provides asynchronous IO operations, where the application requests an IO operation from the system, the operation is executed by the system asynchronously from the application, and the system then informs the application when the operation is complete. The Async IO package supports a number of styles of application programming and gives the application designer considerable freedom in the management of the number of threads used to handle IO operations and also in the design of the components that handle the asynchronous notifications.
Why Java Applications Need the Async IO Package
The question "Why do Java applications need the Async IO package?" can be answered in two words: performance and scalability.
Performance and scalability are key attributes of the IO system for IO-intensive applications. IO-intensive applications are typically, although not exclusively, server-side applications. Server-side applications are characterized by the need to handle many network connections to many clients and also by the need to access many files to serve requests from those clients. The existing standard Java facilities for handling network connections and files do not serve the needs of server-side applications adequately. The java.io and java.net packages provide synchronous IO capabilities, which require a one-thread-per-IO-connection style of design. This limits scalability, since running thousands of threads on a server imposes significant overhead on the operating system. The New IO package, java.nio, addresses the scalability issue of the one-thread-per-IO-connection design, but the New IO select() mechanism limits performance.
Current operating systems, such as Windows, AIX and Linux, provide facilities for fast, scalable IO based on the use of asynchronous notifications of IO operations taking place in the operating system layers. For example, Windows and AIX have IO Completion Ports, while Linux has the sys_epoll facility. The Async IO package aims to make these fast and scalable IO facilities available to Java applications through a package that provides IO capabilities linked to an asynchronous style of programming.
The current version of the Async IO package, com.ibm.io.async, is designed as an extension to the Java 2 Standard Edition 1.4, which can in principle be provided on any hardware and software platform. The platforms currently supported by the package include Windows, AIX, Linux, and Solaris.
Elements of the Async IO Package
The major elements of the Async IO package are the classes AsyncFileChannel, AsyncSocketChannel, and AsyncServerSocketChannel. The channels represent asynchronous versions of files, sockets, and server sockets. These fundamental classes are designed to be similar in naming and in operation to the channel classes of the New IO package, which is good news for Java programmers familiar with that package.
AsyncFileChannels and AsyncSocketChannels provide asynchronous read and write methods against the underlying file or socket. An asynchronous operation is a request to the system to perform the operation, where the method returns immediately to the calling application regardless of whether the operation has taken place or not. Instead of providing a return value that gives information about the operation, such as the number of bytes read/written, asynchronous read and write operations return objects that implement the IAsyncFuture interface.
The IAsyncFuture interface is another important component of the Async IO package. First, an IAsyncFuture represents the state of the asynchronous operation, most importantly whether the operation has completed or not. Second, the IAsyncFuture provides methods that return the result of the operation once it has completed. If something goes wrong during the operation, the IAsyncFuture can throw an exception instead of returning the normal outcome.
The application uses one of three methods to find out whether a particular operation has completed:
- Polling: Calls the isCompleted() method of the IAsyncFuture, which returns true once the operation is complete
- Blocking: Uses the waitForCompletion() method of the IAsyncFuture, which can be used either to wait for a specified period or to wait indefinitely for the operation to complete
- Callback: Uses the addCompletionListener() method of the IAsyncFuture, so the application can register a method that's called back by the system when the operation completes
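The three styles in the list above can be sketched with the standard JDK class java.util.concurrent.CompletableFuture. This is only an analogy, not the Async IO API: isDone(), get(timeout, unit), and thenAccept() stand in for isCompleted(), waitForCompletion(), and addCompletionListener(), and the startOperation() helper and the value 128 are hypothetical placeholders for a real IO request.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class CompletionStyles {

    // Hypothetical stand-in for an asynchronous IO request: it completes on
    // another thread and yields a byte count.
    static CompletableFuture<Integer> startOperation() {
        return CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(50); // simulate IO latency
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return 128;
        });
    }

    // Polling: repeatedly ask whether the operation is done.
    static int poll() {
        CompletableFuture<Integer> future = startOperation();
        while (!future.isDone()) {
            Thread.yield(); // a real application would do other work here
        }
        return future.join();
    }

    // Blocking: wait for completion, up to a limit.
    static int block() {
        try {
            return startOperation().get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Callback: register code that runs when the operation completes.
    static int callback() {
        AtomicReference<Integer> seen = new AtomicReference<>();
        CompletableFuture<Void> done = startOperation().thenAccept(seen::set);
        done.join(); // only so that this demo can return the value
        return seen.get();
    }

    public static void main(String[] args) {
        System.out.println(poll() + " " + block() + " " + callback());
    }
}
```

Polling suits a loop that already has other work to do, blocking suits a dedicated thread, and the callback style is the one that lets a small number of threads serve many channels.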
Data Formats Supported by Asynchronous Read and Write Operations
The read and write operations supplied by the Async IO package use the ByteBuffer class to hold the data. This class is the same as the one used in the New IO package. One difference between the Async IO package and the New IO package is that the ByteBuffers used for the Async IO package must be Direct ByteBuffers. Direct ByteBuffers have the memory for their content allocated in native memory outside the Java Heap. This provides better performance for IO operations since the operating system code can access the data in the buffer memory directly, without the need for copying.
ByteBuffers can be viewed as buffers supporting other primitive types, such as int, float, or char, using methods such as bytebuffer.asIntBuffer(). ByteBuffers also have a series of methods that support the reading and writing of primitive types at arbitrary locations in the ByteBuffer, using methods like bytebuffer.putLong(index, aLong).
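The buffer handling described above uses only the standard java.nio.ByteBuffer class, so it can be shown without the Async IO package. The following is a minimal sketch; the class and method names (DirectBufferDemo, viewRoundTrip, and so on) are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.IntBuffer;

public class DirectBufferDemo {

    // Allocate a buffer whose content lives in native memory, outside the Java heap.
    static ByteBuffer newDirect(int capacity) {
        return ByteBuffer.allocateDirect(capacity);
    }

    // Write ints through an int view, then read one back through the byte buffer.
    static int viewRoundTrip() {
        ByteBuffer bytes = newDirect(16);
        IntBuffer ints = bytes.asIntBuffer(); // a view that shares the same memory
        ints.put(0, 7).put(1, 11);
        return bytes.getInt(4);               // the second int starts at byte offset 4
    }

    // Absolute put and get at an arbitrary byte index; the position does not move.
    static long absoluteRoundTrip(long value) {
        ByteBuffer bytes = newDirect(32);
        bytes.putLong(8, value);
        return bytes.getLong(8);
    }

    public static void main(String[] args) {
        System.out.println(newDirect(16).isDirect());
        System.out.println(viewRoundTrip());
        System.out.println(absoluteRoundTrip(123456789L));
    }
}
```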
Simple Examples of Async IO Read and Write Operations
Listing 1 shows the use of an AsyncSocketChannel as a client socket that involves connecting the socket to a remote server and then performing a read operation. In this example, the blocking style is used to wait for asynchronous operations to complete.
Listing 2 is a program fragment that shows the use of a callback to receive the notification of the completion of an asynchronous operation. This fragment shows just some of the methods of a class that is handling socket IO. It's assumed that an AsyncSocketChannel has already been opened and connected, that a direct ByteBuffer is available, and that an object named "state" tracks the state of the IO.
When the IO operation is requested (channel.read( ... )) an IAsyncFuture is returned. The next step is to give the IAsyncFuture a callback method by calling the addCompletionListener( ... ) method. The callback method gets called when the operation completes. The callback method is the futureCompleted( ... ) method that forms part of a class that implements the ICompletionListener interface.
In this example, the class with the callback is the same as the class that makes the read request (so "this" is used as the first parameter in the addCompletionListener method). The signature of the futureCompleted( ... ) method is fixed: its parameters are an IAsyncFuture object that represents the operation and, second, an object that holds the application state, which is associated with the IAsyncFuture through the addCompletionListener( ... ) method where it forms the second parameter (in this example, we use the object called "state").
The futureCompleted( ... ) method is called when the operation completes. It is possible that the operation is complete before the completion listener is added to the future. If this happens, the futureCompleted( ... ) method is called directly from the addCompletionListener( ... ) method, without any delay.
The futureCompleted( ... ) method receives the future object relating to the completed operation, plus the application state object.
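The same listener-plus-state pattern exists in the standard JDK (Java 7 and later) as java.nio.channels.CompletionHandler, whose completed(result, attachment) method plays a role comparable to futureCompleted(future, state). The sketch below uses that JDK API rather than the Async IO package, and reads a temporary file instead of a socket so that it is self-contained. The State class, the 64-byte buffer size, and the file name prefix are illustrative assumptions.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.channels.CompletionHandler;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class CallbackReadDemo
        implements CompletionHandler<Integer, CallbackReadDemo.State> {

    // Application state handed to the callback, like the "state" object in the text.
    static final class State {
        final ByteBuffer buffer = ByteBuffer.allocateDirect(64);
        final CountDownLatch done = new CountDownLatch(1);
        volatile int bytesRead = -1;
        volatile Throwable error;
    }

    // Called by a JDK-managed thread when the read completes normally.
    @Override
    public void completed(Integer result, State state) {
        state.bytesRead = result;
        state.done.countDown();
    }

    // Called instead of completed() if the read fails.
    @Override
    public void failed(Throwable exc, State state) {
        state.error = exc;
        state.done.countDown();
    }

    // Writes content (at most 64 bytes) to a temp file and reads it back via a callback.
    static String readWithCallback(String content) {
        try {
            Path file = Files.createTempFile("aio-demo", ".txt");
            try {
                Files.write(file, content.getBytes(StandardCharsets.UTF_8));
                try (AsynchronousFileChannel channel =
                         AsynchronousFileChannel.open(file, StandardOpenOption.READ)) {
                    State state = new State();
                    // Request the read; the state object travels with the operation.
                    channel.read(state.buffer, 0, state, new CallbackReadDemo());
                    // The latch exists only so this demo can return a value.
                    if (!state.done.await(5, TimeUnit.SECONDS)) {
                        throw new IllegalStateException("read did not complete");
                    }
                    if (state.error != null) {
                        throw new IllegalStateException(state.error);
                    }
                    state.buffer.flip();
                    byte[] bytes = new byte[state.buffer.remaining()];
                    state.buffer.get(bytes);
                    return new String(bytes, StandardCharsets.UTF_8);
                }
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readWithCallback("hello async"));
    }
}
```

In a real server the requesting thread would not wait on a latch; it would return immediately and let the callback drive the next step.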
Beyond the Basics: Multi Read/Write Operations and Timeouts
The previous sections described the basic functions available as part of the Java Async IO package. The package also supplies more advanced interfaces for asynchronous IO. The first advanced interface supplies the capability to perform read and write operations using multiple buffers for the data. The second advanced interface provides a time-out on the asynchronous IO operation.
Both the multi read/write operations and the time-out facility are provided by the AsyncSocketChannelHelper and AsyncFileChannelHelper classes. This is done to keep the interface to the AsyncFileChannel and AsyncSocketChannel classes as straightforward as possible.
Create an AsyncSocketChannelHelper object by wrapping an existing AsyncSocketChannel. An AsyncFileChannelHelper is created by wrapping an existing AsyncFileChannel object. All operations on the channel helper object apply to the underlying asynchronous channel.
The multi read/write operations take ByteBuffer arrays as input and return IAsyncMultiFuture objects. IAsyncMultiFuture objects differ from IAsyncFuture objects only in that they have a getBuffers() method that returns the ByteBuffer arrays involved in the operation in place of the getBuffer() method, which relates to the single buffer read/write operations. The multi read/write operations are useful for applications that need to send or receive data that's best handled by multiple buffers, perhaps where different elements of the data are handled by different application components (see Listing 3).
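The idea of one operation spanning several buffers can be illustrated with the gathering write and scattering read of the standard java.nio.channels.FileChannel. Note the assumptions: this sketch is synchronous and uses heap buffers for brevity, whereas the Async IO helper classes are asynchronous and require direct buffers. The header/body split and all names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MultiBufferDemo {

    // Writes a header and a body from separate buffers with gathering writes,
    // then reads them back into separate buffers with scattering reads.
    static String[] roundTrip(String header, String body) {
        try {
            Path file = Files.createTempFile("multi-demo", ".bin");
            try (FileChannel channel = FileChannel.open(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                ByteBuffer[] out = {
                    ByteBuffer.wrap(header.getBytes(StandardCharsets.UTF_8)),
                    ByteBuffer.wrap(body.getBytes(StandardCharsets.UTF_8))
                };
                long total = out[0].remaining() + out[1].remaining();
                long written = 0;
                while (written < total) {
                    written += channel.write(out); // drains the buffers in array order
                }

                // One destination buffer per message element.
                ByteBuffer[] in = {
                    ByteBuffer.allocate(out[0].capacity()),
                    ByteBuffer.allocate(out[1].capacity())
                };
                channel.position(0);
                long read = 0;
                while (read < total) {
                    long n = channel.read(in);     // fills the buffers in array order
                    if (n < 0) {
                        break;                     // unexpected end of file
                    }
                    read += n;
                }
                return new String[] {
                    new String(in[0].array(), StandardCharsets.UTF_8),
                    new String(in[1].array(), StandardCharsets.UTF_8)
                };
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] parts = roundTrip("HDR:", "payload");
        System.out.println(parts[0] + " | " + parts[1]);
    }
}
```

Each component of the application can then own its buffer, for example one component parsing the header while another handles the body.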
The time-out operations provided by the AsyncSocketChannelHelper and AsyncFileChannelHelper classes are versions of the basic read and write operations that have a time-out period applied to them. The basic read and write operations of asynchronous channels can in principle take forever to complete. This is particularly a problem for an application that uses the callback technique to get notified that the operation is complete, since the callback might never get called if the operation does not complete. The use of the time-out versions of the operations guarantees that the IAsyncFuture will complete when the time-out expires, even if the underlying read/write operation does not complete. If the time-out expires, the IAsyncFuture completes with an AsyncTimeoutException. In addition, the underlying operation is cancelled (equivalent to invoking the IAsyncFuture cancel(future) method).
Note that using the time-out versions of read and write is different from using the IAsyncFuture waitForCompletion( timeout ) method (see Listing 4). waitForCompletion provides a time-out for the wait on the completion of the IAsyncFuture. If this time-out expires, control is returned to the application, but the IAsyncFuture is not completed and the underlying read/write operation is still underway. By contrast, if the time-out expires on the AsyncChannelHelper read/write methods, the IAsyncFuture is completed (with an AsyncTimeoutException) and the underlying operation is cancelled.
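The distinction between a timed wait and a timed operation can be demonstrated with the standard CompletableFuture class (orTimeout requires Java 9 or later). This is an analogy under stated assumptions, not the Async IO API: get(timeout, unit) stands in for waitForCompletion( timeout ), and orTimeout stands in for the helper classes' time-out operations. Unlike the helper classes, orTimeout does not cancel any underlying work.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimeoutStyles {

    // Timed wait: the wait gives up, but the future itself stays pending.
    static boolean pendingAfterTimedWait() {
        CompletableFuture<Integer> never = new CompletableFuture<>();
        try {
            never.get(50, TimeUnit.MILLISECONDS);
        } catch (TimeoutException expected) {
            // Control returns to the caller; the operation is still "underway".
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
        return !never.isDone();
    }

    // Timed operation: the future itself completes, exceptionally, on expiry.
    static boolean failedAfterOperationTimeout() {
        CompletableFuture<Integer> never =
            new CompletableFuture<Integer>().orTimeout(50, TimeUnit.MILLISECONDS);
        try {
            never.join();
            return false; // not reached: nothing ever completes this future normally
        } catch (CompletionException e) {
            return never.isDone() && e.getCause() instanceof TimeoutException;
        }
    }

    public static void main(String[] args) {
        System.out.println(pendingAfterTimedWait());
        System.out.println(failedAfterOperationTimeout());
    }
}
```

With the first style a registered callback may never run; with the second it is guaranteed to run, with an exception as the outcome.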
An important point about operations that time out is that the state of the channel is left indeterminate. Once an operation is cancelled, it's unlikely that the channel can be used again and the safe option is for the application to close the channel.
Asynchronous IO Thread Management
If you write an application program that uses the callback method to get notifications that asynchronous IO operations have completed, you need to understand which Java threads are used to run the callbacks. The threads used to run the callbacks will run application code. If your application code needs the threads to have any special characteristics, such as specific context information or security settings, this could cause problems for your application code unless your application carefully controls the actual threads that are used to run the callbacks.
The threading design of the Async IO package is outlined in Figure 1. Applications make requests to the package for Async IO operations. The requests are passed to the operating system's IO functions. When the operations complete, notifications of their completion are passed back to the Async IO package and are initially held in an IO Completion Queue. The Async IO package has a set of one or more Java threads that it uses to process the notifications in the IO Completion Queue. Notifications are taken from the Completion Queue, and the IAsyncFuture related to the operation is marked as completed. If a Callback Listener has been registered on the IAsyncFuture, the Callback Listener method is called. Once the CallBack Listener method finishes, the thread returns to the Async IO package and is used to process other notifications from the Completion Queue.
By default, the Async IO package uses its own Result Thread Manager to manage the threads that handle the callbacks. It allocates a number of threads, typically equal to the number of processors on the system. These threads are vanilla Java threads with no special characteristics. However, the application can control the threads in one of two ways.
The application can override the default Result Thread Manager by calling the setResultThreadManager(IResultThreadManager) method of the AbstractAsyncChannel class. The application must supply its own manager class that implements the IResultThreadManager interface, which defines the full life cycle for threads used by the Async IO package. The IResultThreadManager interface provides control over the policies applied to the result threads, including the timing of creation and destruction, the minimum and maximum numbers of threads, plus the technique used for creation and destruction of the threads.
Alternatively, the application can use the default IResultThreadManager implementation provided by the Async IO package, but control the nature of the threads used to handle results and callbacks. This is done by supplying the default IResultThreadManager implementation with an application-defined IThreadPool object, by calling the setThreadPool( IThreadPool ) method on the IResultThreadManager. This allows the application to control the nature of the threads used in the Result Thread Manager. For example, application data can be attached to the thread or specific security settings applied to the thread, or the threads used in the IResultThreadManager can be cached by the IThreadPool.
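The idea of the application supplying the threads that run callbacks can be sketched with standard java.util.concurrent classes. This does not use IResultThreadManager or IThreadPool, whose exact signatures are not shown in this article; instead, an ExecutorService built from a custom ThreadFactory stands in for an application-defined thread pool, and the thread name prefix is a hypothetical example of a "special characteristic".

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

public class CallbackThreads {

    // Factory that gives callback threads a recognisable name and daemon status.
    // Context data or security settings could be applied at the same point.
    static ThreadFactory namedFactory(String prefix) {
        AtomicInteger counter = new AtomicInteger();
        return runnable -> {
            Thread t = new Thread(runnable, prefix + "-" + counter.incrementAndGet());
            t.setDaemon(true);
            return t;
        };
    }

    // Runs a completion callback on the application-controlled pool and
    // reports the name of the thread that executed it.
    static String callbackThreadName(String prefix) {
        // One thread per processor, mirroring the default described in the text.
        int threads = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(threads, namedFactory(prefix));
        try {
            CompletableFuture<Integer> operation = new CompletableFuture<>();
            CompletableFuture<String> name =
                operation.thenApplyAsync(bytes -> Thread.currentThread().getName(), pool);
            operation.complete(256); // simulate the IO completion notification
            return name.join();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(callbackThreadName("io-result"));
    }
}
```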
Performance is one of the important reasons for using the Async IO package. How does its performance stack up against the original synchronous Java IO and also against the New IO package?
Performance is a complex issue, but a simple test provides some guidance. The test uses Socket IO with multiple clients communicating with a single server. Each client performs repeated operations, writing 256 bytes to the server and reading a 2,048 byte response from the server. For the test, the clients are always the same code, but three variations of the server code are used:
- Synchronous Server, using the original Java IO classes
- New IO Server, using the New IO classes
- Asynchronous IO Server, using the Async IO package
We ran the tests with a Windows 2000 single processor server system and a Windows Server 2003 four-way system running the clients, connected via a 100Mb Ethernet network, with varying numbers of client sockets each performing a connect followed by 50 read/write cycles with the server. The results are shown in Table 1, which provides the data for the average time in microseconds to complete each read/write cycle, quoted with and without the startup time included. The startup time is the time taken for the client socket to connect to the server before any data is transmitted.
(If you're surprised that the four-way server system is used to drive the client side for this test, it's used to ensure that the very large number of clients can be created successfully.)
The last two cases involve running with a number of inactive client sockets, which are connected to the server but are not transmitting any data during the test. This is more typical of a real Web server. These inactive sockets are a load for the server to handle alongside the active sockets.
This shows that the Async IO, New IO, and Sync servers are all similar in terms of average times in lightly loaded situations. The failure of the Sync server to handle the case of 7,000 total clients shows its limitations in terms of scalability. The figures for the New IO server show that performance suffers as the number of clients rises. In particular, the New IO server shows a marked rise in the overhead for starting up new connections as the number of connections rises. The Async IO server manages to achieve reasonably stable performance right through the range tested, both for startup time and for the read/write cycle time.
These simple tests show that the Async IO package is able to deliver on its promise of performance and scalability and can form part of the solution for server applications intended to handle many thousands of clients.
Pitfalls to Avoid
As with the use of any API, there are some aspects of the Async IO API that you need to think about to avoid problems.
You need to be careful with the use of the ByteBuffers that are used in the read and write methods of asynchronous channels. Because the IO operations occur asynchronously, there is the potential for the Async IO package to use the ByteBuffers at the same time as the application code. The rule to follow in order to avoid trouble is that the application code should not access the ByteBuffers from the time that an asynchronous read or write operation is requested until the point that the Async IO package signals that the operation is complete. Any attempt by the application to access the ByteBuffers before the operation is complete could cause unpredictable results.
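The buffer rule above applies equally to the asynchronous channels in the standard JDK (Java 7 and later), so it can be shown with java.nio.channels.AsynchronousFileChannel. The sketch below is an illustration using that JDK class, not the Async IO package; the temporary file, the 64-byte buffer, and the names are assumptions made for the example.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BufferDiscipline {

    // Writes content (at most 64 bytes) to a temp file, reads it back
    // asynchronously, and touches the buffer only after completion.
    static String readThenTouch(String content) {
        try {
            Path file = Files.createTempFile("buf-demo", ".txt");
            try {
                Files.write(file, content.getBytes(StandardCharsets.UTF_8));
                try (AsynchronousFileChannel channel =
                         AsynchronousFileChannel.open(file, StandardOpenOption.READ)) {
                    ByteBuffer buffer = ByteBuffer.allocateDirect(64);
                    Future<Integer> pending = channel.read(buffer, 0);

                    // Do NOT flip, read, or reuse 'buffer' here: the operation
                    // may still be writing into it on another thread.

                    pending.get(5, TimeUnit.SECONDS); // completion has been signalled

                    // From this point the buffer belongs to the application again.
                    buffer.flip();
                    byte[] bytes = new byte[buffer.remaining()];
                    buffer.get(bytes);
                    return new String(bytes, StandardCharsets.UTF_8);
                }
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException | InterruptedException
                 | ExecutionException | TimeoutException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readThenTouch("buffer ownership"));
    }
}
```

A simple way to enforce the rule is to treat the buffer as owned by the IO layer from the request until the completion signal, and to hand it to application code only from the completion path.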
Asynchronous channels provide facilities for the cancellation of asynchronous IO operations. These include the explicit cancel() method available on the futures returned by operations on asynchronous channels, and also the implicit cancellation that takes place as part of the time-out of an IO operation on an AsyncSocketChannelHelper or AsyncFileChannelHelper. If an operation is cancelled, the underlying channel (file or socket) is left in an indeterminate state. Because of this, your application should not attempt to perform any more operations on the channel once cancellation has occurred. The best thing to do is to close the channel as soon as possible.
The performance of read and write operations using Async IO is designed to be as close as possible to the performance of equivalent synchronous IO operations. However, there is some extra overhead involved in running an asynchronous operation compared with a synchronous operation, associated with setting up and executing the asynchronous notifications. The implication of this is that asynchronous reads and writes involving very small packets of data (i.e., a few bytes only) are going to have a significantly higher overhead than synchronous equivalents. You should take this into account when designing your application to use Async IO.
The Java Async IO package provides valuable facilities for fast, scalable Socket and File IO, which are an alternative to the use of java.io and java.nio facilities in client-side and server-side applications. The package also assists the program design by providing an event-driven interface for IO operations that is simple to use.
|Mike Edwards 10/26/04 10:55:01 AM EDT|
Please email me directly if you would like to discuss your question about NIO in more detail - I'd prefer to keep this discussion thread dedicated to Async IO.
|Paul 10/25/04 10:38:17 AM EDT|
Your article is great! I used NIO for a socket server. Could you help me with a question?
NIO sends messages as bytes between client and server. I have found many samples that use it to deliver string messages, acting as an HTTP server. How can I deliver and parse a message wrapped in an object instead of only a string? Could you give me some clues or any samples?
Thank you very much.
|Mike Edwards 10/25/04 08:27:39 AM EDT|
Your question about why NIO performs less well than the original synchronous IO is an interesting one.
Fundamentally, NIO is less about performance and more about scalability. Synchronous IO demands one thread per socket and most operating systems limit the number of threads. New IO allows many sockets per thread and so allows a much greater number of sockets per application. The figures in our article show this lack of scalability of synchronous IO.
In terms of performance, New IO has to do the same read and write calls to the operating system that are done by synchronous IO. However, New IO requires the use of the Selector and the management of the key sets, which is an overhead. Synchronous IO by contrast has the overhead of thread switching between the many threads. At low numbers of sockets, the difference in the overheads is not significant, except that the setup time for putting a new channel into the Selector makes New IO slower to add a new channel (note: our code caches the threads used by synchronous IO). At high numbers of sockets, the time to insert a channel into the Selector climbs, as does the time to do the Select operation, due to the data structures used to hold the select list. Thread switch time does not increase as much, which makes New IO performance look worse at high numbers of sockets.
We shall look to make our performance test code available on the AIO4J site, so that you can take a look at how the server code compares between Sync IO, New IO and AIO4J.
|Bret Hansen 10/23/04 12:17:20 PM EDT|
So your test shows that the nio package is slower than the original synchronous API.
Can you explain why? I haven't looked at your code yet.
|Mike Edwards 10/13/04 03:11:59 AM EDT|
|Csaba 10/12/04 05:12:32 AM EDT|
Nevermind, found it...
|Csaba 10/12/04 05:10:31 AM EDT|
Where are the tables/images for this article ? I was really interested in that comparison chart, but couldn't find the link...