Welcome!

Java IoT Authors: Elizabeth White, Yeshim Deniz, Tim Hinds, Douglas Lyon, Stackify Blog

Related Topics: Java IoT

Java IoT: Article

Java Programming: The Java Async IO Package

Fast, scalable IO for sockets and files

The Async IO package is designed to provide fast and scalable input/output (IO) for Java applications using sockets and files. It provides an alternative to the original synchronous IO classes available in the java.io and java.net packages, where scalability is limited by the inherent "one thread per IO object" design. It also provides an alternative to the New IO package (java.nio), where performance and scalability are limited by the polling design of the select() method.

As its name implies, the Async IO package provides asynchronous IO operations, where the application requests an IO operation from the system, the operation is executed by the system asynchronously from the application, and the system then informs the application when the operation is complete. The Async IO package supports a number of styles of application programming and gives the application designer considerable freedom in the management of the number of threads used to handle IO operations and also in the design of the components that handle the asynchronous notifications.

Why Java Applications Need the Async IO Package
The question "Why do Java applications need the Async IO package?" can be answered in two words: performance and scalability.

Performance and scalability are key attributes of the IO system for IO-intensive applications. IO-intensive applications are typically, although not exclusively, server-side applications. Server-side applications are characterized by the need to handle many network connections to many clients and also by the need to access many files to serve requests from those clients. The existing standard Java facilities for handling network connections and files do not serve the needs of server-side applications adequately. The java.io and java.net packages provide synchronous IO capabilities, which require a one-thread-per-IO-connection style of design, which limits scalability since running thousands of threads on a server imposes significant overhead on the operating system. The New IO package, java.nio, addresses the scalability issue of the one-thread-per-IO-connection design, but the New IO select() mechanism limits performance.

Current operating systems, such as Windows, AIX and Linux, provide facilities for fast, scalable IO based on the use of asynchronous notifications of IO operations taking place in the operating system layers. For example, Windows and AIX have IO Completion Ports, while Linux has the sys_epoll facility. The Async IO package aims to make these fast and scalable IO facilities available to Java applications through a package that provides IO capabilities linked to an asynchronous style of programming.

The current version of the Async IO package, com.ibm.io.async, is designed as an extension to the Java 2 Standard Edition 1.4, which can in principle be provided on any hardware and software platform. The platforms currently supported by the package include Windows, AIX, Linux, and Solaris.

Elements of the Async IO Package
The major elements of the Async IO package are the classes AsyncFileChannel, AsyncSocketChannel, and AsyncServerSocketChannel. The channels represent asynchronous versions of files, sockets, and server sockets. These fundamental classes are designed to be similar in naming and in operation to the channel classes of the New IO package. Good news for Java programmers familiar with the New IO package.

AsyncFileChannels and AsyncSocketChannels provide asynchronous read and write methods against the underlying file or socket. An asynchronous operation is a request to the system to perform the operation, where the method returns immediately to the calling application regardless of whether the operation has taken place or not. Instead of providing a return value that gives information about the operation, such as the number of bytes read/written, asynchronous read and write operations return objects that implement the IAsyncFuture interface.

The IAsyncFuture interface is another important component of the Async IO package. First an IAsyncFuture represents the state of the asynchronous operation - most important, whether the operation has completed or not. Second, the IAsyncFuture provides methods that return the result of the operation once it has completed. An IAsyncFuture can throw exceptions as well as the normal outcome of the operation, if something goes wrong during the operation.

The application uses one of three methods to find out whether a particular operation has completed:

  • Polling: Calls the isCompleted() method of the IAsyncFuture, which returns true once the operation is complete
  • Blocking: Uses the waitForCompletion() method of the IAsyncFuture, which can be used either to wait for a specified period or to wait indefinitely for the operation to complete
  • Callback: Uses the addCompletionListener() method of the IAsyncFuture, so the application can register a method that's called back by the system when the operation completes
Which method an application uses to find out about the completion of an operation is driven by the overall design of the application, although the most truly "asynchronous" designs would tend to use the callback method. If an application uses the callback method, the thread that requests the IO operation carries on to do more work, while the callback is normally handled on a separate thread.

Data Formats Supported by Asynchronous Read and Write Operations
The read and write operations supplied by the Async IO package use the ByteBuffer class to hold the data. This class is the same as the one used in the New IO package. One difference between the Async IO package and the New IO package is that the ByteBuffers used for the Async IO package must be Direct ByteBuffers. Direct ByteBuffers have the memory for their content allocated in native memory outside the Java Heap. This provides better performance for IO operations since the operating system code can access the data in the buffer memory directly, without the need for copying.

ByteBuffers can be viewed as buffers supporting other primitive types, such as Int, Float, or Char, using methods such as bytebuffer.asIntBuffer(). ByteBuffers also have a series of methods that support the reading and writing of primitive types at arbitrary locations in the ByteBuffer using methods like bytebuffer.putLong( index, aLong).

Simple Examples of Async IO Read and Write Operations
Listing 1 shows the use of an AsyncSocketChannel as a client socket that involves connecting the socket to a remote server and then performing a read operation. In this example, the blocking style is used to wait for asynchronous operations to complete.

Listing 2 is a program fragment that shows the use of a callback to receive the notification of the completion of an asynchronous operation. This fragment shows just some of the methods of a class that is handling socket IO. It's assumed that an AsyncSocketChannel has already been opened and connected, that a direct ByteBuffer is available, and that an object named "state" tracks the state of the IO.

When the IO operation is requested (channel.read( ... )) an IAsyncFuture is returned. The next step is to give the IAsyncFuture a callback method by calling the addCompletionListener( ... ) method. The callback method gets called when the operation completes. The callback method is the futureCompleted( ... ) method that forms part of a class that implements the ICompletionListener interface.

In this example, the class with the callback is the same as the class that makes the read request (so "this" is used as the first parameter in the addCompletionListener method). The signature of the futureCompleted ( ... ) method is fixed: its parameters are an IAsyncFuture object that represents the operation and, second, an object that holds the application state, which is associated with the IAsync-Future through the addCompletion-Listener( ... ) method where it forms the second parameter (in this example, we use the object called "state").

The futureCompleted( ... ) method is called when the operation completes. It is possible that the operation is complete before the completion listener is added to the future. If this happens, the futureCompleted( ... ) method is called directly from the addCompletionListener( ... ) method, without any delay.

The futureCompleted( ... ) method receives the future object relating to the completed operation, plus the application state object.

Beyond the Basics: Multi Read/Write Operations and Timeouts
The previous sections described the basic functions available as part of the Java Async IO package. The package also supplies more advanced interfaces for asynchronous IO. The first advanced interface supplies the capability to perform read and write operations using multiple buffers for the data. The second advanced interface provides a time-out on the asynchronous IO operation.

Both the multi read/write operations and the time-out facility are provided by the AsyncSocketChannelHelper and AsyncFileChannelHelper classes. This is done to keep the interface to the Async-FileChannel and AsyncSocketChannel classes as straightforward as possible.

Create an AsyncSocketChannelHelper object by wrapping an existing AsyncSocketChannel. An AsyncFileChannelHelper is created by wrapping an existing AsyncFileChannel object. All operations on the channel helper object apply to the underlying asynchronous channel.

The multi read/write operations take ByteBuffer arrays as input and return IAsyncMultiFuture objects. IAsyncMultiFuture objects differ from IAsyncFuture objects only in that they have a getBuffers() method that returns the ByteBuffer arrays involved in the operation in place of the getBuffer() method, which relates to the single buffer read/write operations. The multi read/write operations are useful for applications that need to send or receive data that's best handled by multiple buffers, perhaps where different elements of the data are handled by different application components (see Listing 3).

The time-out operations provided by the AsyncSocketChannelHelper and AsyncFileChannelHelper classes are versions of the basic read and write operations that have a time-out period applied to them. The basic read and write operations of asynchronous channels can in principle take forever to complete. This is particularly a problem for an application that uses the callback technique to get notified that the operation is complete, since the callback might never get called if the operation does not complete. The use of the time-out versions of the operations guarantees that the IAsyncFuture will complete when the time-out expires, even if the underlying read/write operation does not complete. If the time-out expires, the IAsyncFuture completes with an AsyncTimeoutException. In addition, the underlying operation is cancelled (equivalent to invoking the IAsyncFuture cancel(future) method).

Note that using the time-out versions of read and write are different from using the IAsyncFuture waitForCompletion( timeout ) method (see Listing 4). waitForCompletion provides a time-out for the wait on the completion of the IAsyncFuture. If this time-out expires, control is returned to the application, but the IAsyncFuture is not completed and the underlying read/write operation is still underway. By contrast, if the time-out expires on the AsyncChannelHelper read/write methods, the IAsyncFuture is completed (with an AsyncTimeoutException) and the underlying operation is cancelled.

An important point about operations that time out is that the state of the channel is left indeterminate. Once an operation is cancelled, it's unlikely that the channel can be used again and the safe option is for the application to close the channel.

Asynchronous IO Thread Management
If you write an application program that uses the callback method to get notifications that asynchronous IO operations have completed, you need to understand which Java threads are used to run the callbacks. The threads used to run the callbacks will run application code. If your application code needs the threads to have any special characteristics, such as specific context information or security settings, this could cause problems for your application code unless your application carefully controls the actual threads that are used to run the callbacks.

The threading design of the Async IO package is outlined in Figure 1. Applications make requests to the package for Async IO operations. The requests are passed to the operating system's IO functions. When the operations complete, notifications of their completion are passed back to the Async IO package and are initially held in an IO Completion Queue. The Async IO package has a set of one or more Java threads that it uses to process the notifications in the IO Completion Queue. Notifications are taken from the Completion Queue, and the IAsyncFuture related to the operation is marked as completed. If a Callback Listener has been registered on the IAsyncFuture, the Callback Listener method is called. Once the CallBack Listener method finishes, the thread returns to the Async IO package and is used to process other notifications from the Completion Queue.

By default, the Async IO package uses its own Result Thread Manager to manage the threads that handle the callbacks. It allocates a number of threads, typically equal to the number of processors on the system. These threads are vanilla Java threads with no special characteristics. However, the application can control the threads in one of two ways.

The application can override the default Result Thread Manager by calling the setResultThreadManager(IResult-ThreadManager) method of the Abstract- AsyncChannel class. The application must supply its own manager class that implements the IResultThreadManager interface, which defines the full life cycle for threads used by the Async IO package. The IResultThreadManager interface provides control over the policies applied to the result threads, including the timing of creation and destruction, the minimum and maximum numbers of threads, plus the technique used for creation and destruction of the threads.

Alternatively, the application can use the default IResultThreadManager implementation provided by the Async IO package, but control the nature of the threads used to handle results and callbacks. This is done by supplying the default IResultThreadManager implementation with an application-defined IThreadPool object, by calling the set-ThreadPool( IThreadPool ) method on the IResultThreadManager. This allows the application to control the nature of the threads used in the Result Thread Manager. For example, application data can be attached to the thread or specific security settings applied to the thread, or the threads used in the IResultThreadManager can be cached by the IThreadPool.

Performance
Performance is one of the important reasons for using the Async IO package. How does its performance stack up against the original synchronous Java IO and also against the New IO package?

Performance is a complex issue, but a simple test provides some guidance. The test uses Socket IO with multiple clients communicating with a single server. Each client performs repeated operations, writing 256 bytes to the server and reading a 2,048 byte response from the server. For the test, the clients are always the same code, but three variations of the server code are used:

  • Synchronous Server, using the original Java IO classes
  • New IO Server, using the New IO classes
  • Asynchronous IO Server, using the Async IO package
The server code is as similar as possible, but the differences implied by the different programming models of the IO packages are built into the code. Most notably, the threading design of the Synchronous Server is one-thread-per-client, while the number of threads used for the New IO and Asynchronous Server is determined by the number of processors on the server system and is much smaller than the number of clients. An important feature of the server code is that the caches of the common objects used by the server code are used as much as possible - notably the ByteBuffers (in all three cases) and the threads (in the case of the Sync Server). This is done to reduce the startup times as much as possible for each socket; this reflects a common practice for typical server designs.

We ran the tests with a Windows 2000 single processor server system and a Windows Server 2003 four-way system running the clients, connected via a 100Mb Ethernet network, with varying numbers of client sockets each performing a connect followed by 50 read/write cycles with the server. The results are shown in Table 1, which provides the data for the average time in microseconds to complete each read/write cycle, quoted with and without the startup time included. The startup time is the time taken for the client socket to connect to the server before any data is transmitted.

(If you're surprised that the four-way server system is used to drive the client side for this test, it's used to ensure that the very large number of clients can be created successfully.)

The last two cases involve running with a number of inactive client sockets, which are connected to the server but are not transmitting any data during the test. This is more typical of a real Web server. These inactive sockets are a load for the server to handle alongside the active sockets.

This shows the Async IO, New IO, and Sync servers are all similar in terms of average times in lightly loaded situations. The failure of the Sync server to handle the case of 7,000 total clients shows its limitations in terms of scalability. The figures for the New IO server show that the performance suffers as the number of clients rise. In particular the New IO server shows a marked rise in the overhead for starting up new connections as the number of connections rises. The Async IO server manages to achieve reasonably stable performance right through the range tested, both for startup time and for the read/write cycle time.

These simple tests show that the Async IO package is able to deliver on its promise of performance and scalability and can form part of the solution for server applications intended to handle many thousands of clients.

Pitfalls to Avoid
As with the use of any API, there are some aspects of the Async IO API that you need to think about to avoid problems.

You need to be careful with the use of the ByteBuffers that are used in the read and write methods of asynchronous channels. Because the IO operations occur asynchronously, there is the potential for the Async IO package to use the ByteBuffers at the same time as the application code. The rule to follow in order to avoid trouble is that the application code should not access the ByteBuffers from the time that an asynchronous read or write operation is requested until the point that the Async IO package signals that the operation is complete. Any attempt by the application to access the ByteBuffers before the operation is complete could cause unpredictable results.

Asynchronous channels provide facilities for the cancellation of asynchronous IO operations. These include the explicit cancel() method available on the futures returned by operations on asynchronous channels, and also the implicit cancellation that takes place as part of the time-out of an IO operation on an AsyncSocketChannelHelper or AsyncFileChannelHelper. If an operation is cancelled, the under-lying channel (file or socket) is left in an indeterminate state. Because of this, your application should not attempt to perform any more operations on the channel once cancellation has occurred. The best thing to do is to close the channel as soon as possible.

The performance of read and write operations using Async IO is designed to be as close as possible to the performance of equivalent synchronous IO operations. However, there is some extra overhead involved in running an asynchronous operation compared with a synchronous operation, associated with setting up and executing the asynchronous notifications. The implication of this is that asynchronous reads and writes involving very small packets of data (i.e., a few bytes only) are going to have a significantly higher overhead than synchronous equivalents. You should take this into account when designing your application to use Async IO.

Summary
The Java Async IO package provides valuable facilities for fast, scalable Socket and File IO, which are an alternative to the use of java.io and java.nio facilities in client-side and server-side applications. The package also assists the program design by providing an event-driven interface for IO operations that is simple to use.

Resources

  • The C10K Problem Page contains a comprehensive discussion of the need for fast, scalable IO for servers and the facilities available to provide this on various systems: www.kegel.com/c10k.html
  • Pattern-Oriented Software Architecture: www.cs.wustl.edu/~schmidt/POSA/
  • New IO APIs have a full description of the standard Java New IO package: http://java.sun.com/j2se/1.4.2/docs/guide/nio/index.html
  • Download the Async IO package to try out in your applications: www.alphaworks.ibm.com/tech/aio4j
  • More Stories By Mike Edwards

    Dr. Mike Edwards is a strategic planner in the IBM Java Technologies group in Hursley, England. He is responsible for technical planning for future IBM products including the IBM Java SDKs and for Web services–related products. Before working on Async IO, Mike was involved in the planning of Java SDK 1.4.0 and was a member of the Expert Group for JSR 059, which defined the specification for J2SE 1.4.0 and JSR 051, which created the New IO package. Mike received his PhD in Elementary Particle Physics from Birmingham University.

    More Stories By Tim Ellison

    Tim Ellison is a senior software engineer and strategic planner in the emerging technologies team at IBM Hursley Java Technologies group. He has contributed to the implementation of Smalltalk, IBM VisualAge Micro Edition, Eclipse, and the Java SDK over a period of 20 years. His interests are in new ways to apply object technology to difficult problems.

    Comments (7)

    Share your thoughts on this story.

    Add your comment
    You must be signed in to add a comment. Sign-in | Register

    In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


    @ThingsExpo Stories
    Leading companies, from the Global Fortune 500 to the smallest companies, are adopting hybrid cloud as the path to business advantage. Hybrid cloud depends on cloud services and on-premises infrastructure working in unison. Successful implementations require new levels of data mobility, enabled by an automated and seamless flow across on-premises and cloud resources. In his general session at 21st Cloud Expo, Greg Tevis, an IBM Storage Software Technical Strategist and Customer Solution Architec...
    Nordstrom is transforming the way that they do business and the cloud is the key to enabling speed and hyper personalized customer experiences. In his session at 21st Cloud Expo, Ken Schow, VP of Engineering at Nordstrom, discussed some of the key learnings and common pitfalls of large enterprises moving to the cloud. This includes strategies around choosing a cloud provider(s), architecture, and lessons learned. In addition, he covered some of the best practices for structured team migration an...
    Product connectivity goes hand and hand these days with increased use of personal data. New IoT devices are becoming more personalized than ever before. In his session at 22nd Cloud Expo | DXWorld Expo, Nicolas Fierro, CEO of MIMIR Blockchain Solutions, will discuss how in order to protect your data and privacy, IoT applications need to embrace Blockchain technology for a new level of product security never before seen - or needed.
    Imagine if you will, a retail floor so densely packed with sensors that they can pick up the movements of insects scurrying across a store aisle. Or a component of a piece of factory equipment so well-instrumented that its digital twin provides resolution down to the micrometer.
    In his keynote at 18th Cloud Expo, Andrew Keys, Co-Founder of ConsenSys Enterprise, provided an overview of the evolution of the Internet and the Database and the future of their combination – the Blockchain. Andrew Keys is Co-Founder of ConsenSys Enterprise. He comes to ConsenSys Enterprise with capital markets, technology and entrepreneurial experience. Previously, he worked for UBS investment bank in equities analysis. Later, he was responsible for the creation and distribution of life settle...
    BnkToTheFuture.com is the largest online investment platform for investing in FinTech, Bitcoin and Blockchain companies. We believe the future of finance looks very different from the past and we aim to invest and provide trading opportunities for qualifying investors that want to build a portfolio in the sector in compliance with international financial regulations.
    No hype cycles or predictions of a gazillion things here. IoT is here. You get it. You know your business and have great ideas for a business transformation strategy. What comes next? Time to make it happen. In his session at @ThingsExpo, Jay Mason, an Associate Partner of Analytics, IoT & Cybersecurity at M&S Consulting, presented a step-by-step plan to develop your technology implementation strategy. He also discussed the evaluation of communication standards and IoT messaging protocols, data...
    Coca-Cola’s Google powered digital signage system lays the groundwork for a more valuable connection between Coke and its customers. Digital signs pair software with high-resolution displays so that a message can be changed instantly based on what the operator wants to communicate or sell. In their Day 3 Keynote at 21st Cloud Expo, Greg Chambers, Global Group Director, Digital Innovation, Coca-Cola, and Vidya Nagarajan, a Senior Product Manager at Google, discussed how from store operations and ...
    In his session at 21st Cloud Expo, Raju Shreewastava, founder of Big Data Trunk, provided a fun and simple way to introduce Machine Leaning to anyone and everyone. He solved a machine learning problem and demonstrated an easy way to be able to do machine learning without even coding. Raju Shreewastava is the founder of Big Data Trunk (www.BigDataTrunk.com), a Big Data Training and consulting firm with offices in the United States. He previously led the data warehouse/business intelligence and B...
    "IBM is really all in on blockchain. We take a look at sort of the history of blockchain ledger technologies. It started out with bitcoin, Ethereum, and IBM evaluated these particular blockchain technologies and found they were anonymous and permissionless and that many companies were looking for permissioned blockchain," stated René Bostic, Technical VP of the IBM Cloud Unit in North America, in this SYS-CON.tv interview at 21st Cloud Expo, held Oct 31 – Nov 2, 2017, at the Santa Clara Conventi...
    A strange thing is happening along the way to the Internet of Things, namely far too many devices to work with and manage. It has become clear that we'll need much higher efficiency user experiences that can allow us to more easily and scalably work with the thousands of devices that will soon be in each of our lives. Enter the conversational interface revolution, combining bots we can literally talk with, gesture to, and even direct with our thoughts, with embedded artificial intelligence, whic...
    When shopping for a new data processing platform for IoT solutions, many development teams want to be able to test-drive options before making a choice. Yet when evaluating an IoT solution, it’s simply not feasible to do so at scale with physical devices. Building a sensor simulator is the next best choice; however, generating a realistic simulation at very high TPS with ease of configurability is a formidable challenge. When dealing with multiple application or transport protocols, you would be...
    Smart cities have the potential to change our lives at so many levels for citizens: less pollution, reduced parking obstacles, better health, education and more energy savings. Real-time data streaming and the Internet of Things (IoT) possess the power to turn this vision into a reality. However, most organizations today are building their data infrastructure to focus solely on addressing immediate business needs vs. a platform capable of quickly adapting emerging technologies to address future ...
    We are given a desktop platform with Java 8 or Java 9 installed and seek to find a way to deploy high-performance Java applications that use Java 3D and/or Jogl without having to run an installer. We are subject to the constraint that the applications be signed and deployed so that they can be run in a trusted environment (i.e., outside of the sandbox). Further, we seek to do this in a way that does not depend on bundling a JRE with our applications, as this makes downloads and installations rat...
    Widespread fragmentation is stalling the growth of the IIoT and making it difficult for partners to work together. The number of software platforms, apps, hardware and connectivity standards is creating paralysis among businesses that are afraid of being locked into a solution. EdgeX Foundry is unifying the community around a common IoT edge framework and an ecosystem of interoperable components.
    DX World EXPO, LLC, a Lighthouse Point, Florida-based startup trade show producer and the creator of "DXWorldEXPO® - Digital Transformation Conference & Expo" has announced its executive management team. The team is headed by Levent Selamoglu, who has been named CEO. "Now is the time for a truly global DX event, to bring together the leading minds from the technology world in a conversation about Digital Transformation," he said in making the announcement.
    In this strange new world where more and more power is drawn from business technology, companies are effectively straddling two paths on the road to innovation and transformation into digital enterprises. The first path is the heritage trail – with “legacy” technology forming the background. Here, extant technologies are transformed by core IT teams to provide more API-driven approaches. Legacy systems can restrict companies that are transitioning into digital enterprises. To truly become a lead...
    Digital Transformation (DX) is not a "one-size-fits all" strategy. Each organization needs to develop its own unique, long-term DX plan. It must do so by realizing that we now live in a data-driven age, and that technologies such as Cloud Computing, Big Data, the IoT, Cognitive Computing, and Blockchain are only tools. In her general session at 21st Cloud Expo, Rebecca Wanta explained how the strategy must focus on DX and include a commitment from top management to create great IT jobs, monitor ...
    "Cloud Academy is an enterprise training platform for the cloud, specifically public clouds. We offer guided learning experiences on AWS, Azure, Google Cloud and all the surrounding methodologies and technologies that you need to know and your teams need to know in order to leverage the full benefits of the cloud," explained Alex Brower, VP of Marketing at Cloud Academy, in this SYS-CON.tv interview at 21st Cloud Expo, held Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clar...
    The IoT Will Grow: In what might be the most obvious prediction of the decade, the IoT will continue to expand next year, with more and more devices coming online every single day. What isn’t so obvious about this prediction: where that growth will occur. The retail, healthcare, and industrial/supply chain industries will likely see the greatest growth. Forrester Research has predicted the IoT will become “the backbone” of customer value as it continues to grow. It is no surprise that retail is ...