High-Performance Java for Compute-Intensive Applications

Reduce the risk of finding developers

Like any teenager, Java has changed considerably since its early days, and many of these changes now allow programmers to use Java for specific kinds of high-performance computing (HPC) applications. This article discusses the changes to the Java platform that have improved its performance and looks at examples showing that Java has become, and will continue to evolve as, a good fit for compute-intensive applications.

The Evolution of Java
Java has made significant progress from interpreted bytecode. The first and primary enhancement to Java performance came with Just-in-Time (JIT) compilers. A JIT compiler enables higher performance by compiling the Java bytecode into native machine code on the fly. While Java bytecode is platform-independent and key to Java's portability, the generated native code is specific to the architecture it executes on. The HotSpot JIT compiler was introduced as a standard part of the Sun Virtual Machine (VM) implementation with version 1.3. IBM and other platform vendors have their own JIT compilers. To appreciate how much JIT compiling aids performance, run a Java application with the -Xint command-line switch, which turns off the HotSpot JIT compiler and forces interpreted-only execution, and compare the results with a normal run.
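A minimal sketch of such a comparison follows. The class name JitDemo and its square-root workload are hypothetical, chosen only to give the VM a hot loop to compile; actual timings depend entirely on the machine and VM, so none are quoted here.

```java
public class JitDemo {

    // Hypothetical CPU-bound workload: sum of square roots of 1..n.
    static double work(int n) {
        double sum = 0.0;
        for (int i = 1; i <= n; i++) {
            sum += Math.sqrt(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 20000000;
        // Repeat the same work several times so the JIT has a chance to
        // compile the hot loop; later rounds are typically the fastest.
        for (int round = 1; round <= 5; round++) {
            long start = System.nanoTime();
            double result = work(n);
            long elapsedMs = (System.nanoTime() - start) / 1000000L;
            System.out.println("round " + round + ": " + elapsedMs
                    + " ms (result " + result + ")");
        }
    }
}
```

Running the class once as `java JitDemo` and once as `java -Xint JitDemo` should make the difference between compiled and interpreted execution visible in the per-round times.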

Wide support of 64-bit Java occurred with version 1.4, where the growing new x86-64 architecture (AMD64 and Intel EM64T) was supported along with Itanium 2 IA-64 and 64-bit Sparc. Sun took care to maintain full backwards-compatibility, avoiding changes in primitive data types or array indexes so that all existing Java programs would run the same in either 32-bit or 64-bit VMs, ensuring faster adoption.

The Java platform has always supported a robust threading model. After early cleanup that thankfully included deprecating the stop() and suspend()/resume() methods, it's relatively easy to create a multithreaded Java application. Developers also have considerable control over the EventQueue in AWT and Swing user interface applications. On the other hand, early Java was designed to be completely portable between hardware environments. The Sun JVM managed all the threading itself, and these green threads allowed a multithreaded environment to evolve for platforms without native thread support. Where native threads are supported, performance is much better and scales across multiple CPUs. As a result, all modern JDK distributions include native threads, and some give developers the choice of which paradigm to use (for example, Blackdown on Linux).

With the emergence of grid computing in the late 1990s, Java started to look more attractive to developers. For most large multi-use networks, dictating consistent operating environments is an impossible task, so a grid is inherently heterogeneous, often in both hardware and software. Even though the market still didn't think of Java as a high-performance computing tool, it clearly held advantages in these mixed environments. Numerous grid frameworks were written in pure Java to ensure portability, though some have native code parts and will support spawning native code applications on the grid. However, a pure Java grid manager with a pure Java application could be run on almost any environment without developers needing to install and configure compilers, rebuild native components, or worry about the details of any individual unit on the grid.

Throughout the lifetime of the Java platform, common desktop hardware has expanded to an unforeseen degree. In 1995, most desktop computers were running Intel 486 CPUs with clock speeds below 100MHz. The introduction of Windows 95 later that year prompted many users to upgrade to 8MB or even 16MB of memory. Asking these computers to run bytecode in a virtual machine with software-rendered GUI elements would certainly have pushed the processor and memory to their limits. No one will ever claim that Java 1.0 was fast. High-performance computing was a decade away from the desktop and still relegated to Unix servers. Six major Java releases later, the current environment consists of dual-core, low-power CPUs running around 3GHz with multiple gigabytes of memory. Entire operating systems can now be run in virtual machines on a typical laptop. Linux is a household word, and developers are looking for ways to take advantage of quad-core desktops. One result of this continued improvement in hardware is that standard benchmarks that took a minute to run in 1999 now take only seconds. For an application developer, this means that in the same amount of time they can now run multiple models instead of one, or run an order of magnitude more data through one model.

Besides improved performance, Java has also had to overcome its initial lack of support for numerical analysis, a definite requirement for compute-intensive applications. Early Java was missing obvious analysis requirements like a complex number implementation and matrix operations. Several open source projects emerged to fill the void and the foundations of linear algebra as well as more advanced algorithms became available. Unfortunately, many of these early projects achieved their fundamental "1.0" goals and ceased to be actively developed. Larger consortium-style projects such as Jakarta Mathematics and JScience.org took their place, extending the focus well beyond linear algebra and matrix operations. With these advances, the adoption of Java for numerical analysis increased, moving outside academia and early adopters to commercial industries. As more commercial customers became interested in Java for numerical analysis, vendors looked to replace open source libraries with commercial libraries. In 2002, Visual Numerics became the first vendor to release a commercial Java numerical analysis library by porting the industry-leading IMSL Numerical Library to Java, naming the new library set the JMSL Numerical Library.
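To make the gap concrete, the following is a minimal sketch of the kind of complex number type that developers had to write or borrow themselves. The class name Complex and its method names are hypothetical; this is not the API of the JMSL Numerical Library or of any of the open source projects mentioned above.

```java
public final class Complex {
    private final double re;
    private final double im;

    public Complex(double re, double im) {
        this.re = re;
        this.im = im;
    }

    public double re() { return re; }
    public double im() { return im; }

    // (a + bi) + (c + di) = (a + c) + (b + d)i
    public Complex plus(Complex o) {
        return new Complex(re + o.re, im + o.im);
    }

    // (a + bi)(c + di) = (ac - bd) + (ad + bc)i
    public Complex times(Complex o) {
        return new Complex(re * o.re - im * o.im, re * o.im + im * o.re);
    }

    // Modulus, computed with Math.hypot to avoid intermediate overflow.
    public double abs() {
        return Math.hypot(re, im);
    }

    @Override
    public String toString() {
        return re + (im < 0 ? " - " : " + ") + Math.abs(im) + "i";
    }

    public static void main(String[] args) {
        Complex a = new Complex(1.0, 2.0);
        Complex b = new Complex(3.0, 4.0);
        System.out.println("a * b = " + a.times(b));
        System.out.println("|b|   = " + b.abs());
    }
}
```

Instances are immutable, so they can be shared safely between threads, at the cost of allocating a new object for every operation.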

Current Examples of Java for Compute-Intensive Applications
The performance improvements and broader numerical analysis capabilities added to Java since its early days have widened its practical use for compute-intensive work. The following two examples demonstrate the high-performance capabilities of Java in different programming environments. The first is a shared-memory threaded application and the second is a grid-based application. Both show that Java is a viable language for certain compute-intensive applications.

The first example is a pure Java application for the financial services industry, a major consumer of leading-edge technologies like Java. While this industry is challenged with simply keeping up with the extremely large volume of transactions that occur daily, the realm of financial engineering poses a different yet equally critical challenge. This discipline involves complex mathematical models where the number of transactions is not a concern. Instead, the number and complexity of the mathematical computations involved are the main factors that strain application performance. An example of this type of computation is calculating the value of an American-style put option on a single stock. There's no simple equation to price such an instrument - the movement of the stock over time must be simulated, and all potential exercise dates for the option must be considered. (Figure 1)

Visual Numerics developed a Java program and demonstrated it as part of an Intel keynote address at the JavaOne conference in April 2007 with Fujitsu and BEA. The application calculates the value of an American-style put option on a single stock using a Least Squares Monte Carlo simulation, hypothesizing four million potential price paths that the stock could take over a two-year period (the time during which the option may be exercised). The multithreaded Java application split the calculation work across the 16 available processor cores (eight dual-core Itanium 2 processors) on a Fujitsu PrimeQuest 520 system and returned a value in seconds. This kind of simulation, with the required calculations, couldn't even be attempted on a computer running a standard 32-bit desktop system. This particular application could be scaled further for systems with more cores, either by increasing the number of simulations to gain more accuracy or by adding more securities to the calculation. The latter would form the core component for potentially evaluating a complete American-style option portfolio.
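The idea of splitting simulated price paths across cores can be sketched as follows. This is not the Visual Numerics application and it is not Least Squares Monte Carlo: to stay short, it prices a simpler European-style put (exercisable only at expiry) under an assumed geometric Brownian motion model. The class name ParallelPutPricer and the market parameters (spot 100, strike 100, 5% rate, 20% volatility) are hypothetical; only the four million paths and the two-year horizon are taken from the description above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelPutPricer {

    // Simulates `paths` terminal stock prices and returns the sum of the
    // undiscounted put payoffs max(strike - price, 0).
    static double payoffSum(long paths, long seed, double spot, double strike,
                            double rate, double vol, double years) {
        Random rng = new Random(seed);
        double drift = (rate - 0.5 * vol * vol) * years;
        double diffusion = vol * Math.sqrt(years);
        double sum = 0.0;
        for (long i = 0; i < paths; i++) {
            double terminal = spot * Math.exp(drift + diffusion * rng.nextGaussian());
            sum += Math.max(strike - terminal, 0.0);
        }
        return sum;
    }

    // Splits the paths across a fixed pool of worker threads, then combines
    // the partial sums and discounts the average payoff to today.
    static double price(long totalPaths, int threads, double spot, double strike,
                        double rate, double vol, double years) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            long perThread = totalPaths / threads;
            List<Future<Double>> parts = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                // Each worker gets its own generator and seed (a simplification;
                // production code would want a proper parallel RNG scheme).
                long seed = 12345L + t;
                // The last worker also takes any remainder paths.
                long paths = (t == threads - 1)
                        ? totalPaths - perThread * (threads - 1)
                        : perThread;
                Callable<Double> task =
                        () -> payoffSum(paths, seed, spot, strike, rate, vol, years);
                parts.add(pool.submit(task));
            }
            double total = 0.0;
            for (Future<Double> part : parts) {
                total += part.get();
            }
            return Math.exp(-rate * years) * total / totalPaths;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("simulation failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int threads = Runtime.getRuntime().availableProcessors();
        double value = price(4000000L, threads, 100.0, 100.0, 0.05, 0.20, 2.0);
        System.out.println("Estimated put value: " + value
                + " using " + threads + " threads");
    }
}
```

Because each worker owns its random number generator and the partial sums are combined only at the end, the threads share no mutable state. That independence is what lets this style of simulation scale with the number of cores.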

A second example is a pure Java Monte Carlo risk management application developed by Visual Numerics and run on a remote grid environment. The application leveraged the grid to run a larger number of simulations than could be run on a desktop system, providing a more accurate analysis of the portfolio's volatility. The program was launched from a 32-bit Windows-based desktop system in Tokyo and executed on a 64-bit Xeon-based grid environment in Seoul, Korea, using N*Grid management software. The application ran 500,000,000 samples on the grid and returned the result in seconds. As with the first example, this application highlighted Java's computational capabilities and its suitability for high-performance, compute-intensive work. In this case, Java's portability across heterogeneous hardware and operating systems also made it easy to move the application onto the grid.

Predictions for 2008 & Beyond
With its performance improvements and support from commercial numerical analysis library vendors, Java should no longer be considered too slow to support compute-intensive applications. Numerous successful commercial examples serve as proof that the portability and ease of use of Java, combined with its steadily improving performance, make it a fully viable language for high-performance computing.

With the barriers to using Java for compute-intensive applications removed, organizations should expect more widespread use of Java in high-performance areas in 2008 and beyond. In fact, organizations should actively look for areas where Java can be used for their compute-intensive applications. Java is one of the most popular programming languages available today, if not the most popular, so the pool of Java programmers is both wide and deep. Increased adoption of Java in the classroom means that the next generation of programmers will be Java-knowledgeable. Writing applications in Java instead of a more traditional HPC language therefore reduces the risk of being unable to find developers to support these important compute-intensive applications in the future.

More Stories By Edward Stewart

Edward Stewart is the product manager for the IMSL Numerical Libraries. He has experience in many quantitative areas including quantification and interpretation of statistics and probability, coordination and analysis of large data sets, frequency domain time series analysis, partial differential equations, finite difference numerical modeling, and nonlinear dynamics. Ed received his Ph.D. in physical ocean science and engineering from the University of Delaware.
