We Are Made to Persist. That's How We Find Out Who We Are

In Java's early years, the language received a lot of flak from its opponents over performance. Java turns its .class file bytecodes into machine instructions (MI) at runtime, something that costs cycles and is slower than a fully compiled language that creates the MI as part of the development stage. While to a certain extent this is true, the performance delta has all but been removed with the use of just-in-time (JIT) compilers that cache machine instructions in the VM and do other clever tricks to ensure the JVM runtime speed has very little slack. There was a time when JIT had to be switched off for debugging as it interfered with the ability to map stack and heap information back to the original source. However, even this is no longer true in the newer JVMs that can run in high-performance debug modes with no significant difference between having -Xdebug there or not.

The garbage collector is another area for which the Java language has received criticism. The original concept is that programmers don't have to worry about freeing memory; they simply create objects at will and let the JVM determine when an object is no longer required. It is certainly a lot simpler than de-allocating heap memory manually in a C program; however, because it's based on presumptive algorithms, it's frequently, and unfairly, blamed for JVM memory bloat or performance problems. Modern garbage collectors, though, are very efficient and can reclaim memory in small segments, using short incremental pauses so that the program carries on almost uninterrupted (www.research.ibm.com/metronome/). They also defragment memory as they go along, and JSR 1, the Real-Time Specification for Java, introduces new APIs that allow fully deterministic garbage collection and high-resolution time management, although they do introduce the problem of how to manage immortal memory.

When a Java program is launched, the java command takes a -classpath option that points to a set of directories containing the .class files and other resources required by the main class. The VM loads classes on demand when they are first referenced, searching the directories (or .jars if they are zipped up) and loading the bytecodes before compiling them. In a textbook "Hello World" program, considerably more time is spent loading the JRE classes required for the user's code to run than is spent loading the main class. As the base classes are loaded, they need memory allocated for them and just-in-time compilation into machine instructions, and other steps such as bytecode verification and linkage all take JVM cycles to perform. Once the JVM is up and running it can rerun the scenario quickly because the bootstrap work has been done and the VM is fully warmed up. When the user exits the JVM, however, all of that work is thrown away. The next time they run the same Java program, all the steps required to load the base JRE classes, allocate memory for them, and so forth take place de novo. Developers of server-side JVMs like Shiraz (www.haifa.il.ibm.com/projects/systems/rs/persistent.html) that run on z/OS solve this by allowing the sharing of classes between JVMs, providing good scalability. Apple's JVMs use a flavor of class sharing known as Java Shared Archive (JSA), while Java 5 introduced formal Class Data Sharing that allows the sharing of base classes between VMs and, in the future, user-defined classes.
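The on-demand behavior described above can be observed from inside a program. The following is a minimal sketch rather than a measurement of startup cost: the class names LazyLoadingDemo and Helper are made up for illustration, and what it shows directly is class initialization, the last step after loading and linking. Most JVMs defer the loading itself in the same way, which can be checked separately by launching with the -verbose:class option.

```java
import java.util.ArrayList;
import java.util.List;

class LazyLoadingDemo {
    // Records the order in which things happen so it can be inspected later.
    static final List<String> events = new ArrayList<>();

    // Hypothetical helper class. Its static initializer runs only when the
    // class is first actively used, not when the program starts.
    static class Helper {
        static {
            events.add("Helper initialized");
        }

        static String greet() {
            return "hello";
        }
    }

    static List<String> run() {
        events.add("before first use");
        // The first call to a static method of Helper triggers its initialization.
        String greeting = Helper.greet();
        events.add("after first use: " + greeting);
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

On a first run the "Helper initialized" entry lands between the other two, which is the point: nothing is done for Helper until the program actually touches it, and every fresh JVM has to repeat that work.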

All of this is great news for Java: it has high-performance JITs that can work in debug conditions, garbage collectors that don't freeze the JVM each time a global sweep of the heap has to be done to determine unreferenced objects, and the sharing of data between JVMs to assist scalability. There's one area left though that I'd like to see tackled - serialization and rehydration of a JVM's state. The advantage would be that a program could be exited and re-started on subsequent re-opens as though it had just been temporarily paused. There are problems though that any such solution would encounter.

One is that the Java runtime used by the JVM when it's re-opened might be different from the one in use when it was saved. In this case, the classes might have changed physical shape, with instance variables added, removed, or renamed, or any number of other changes that mean the serialized instances from the first save can't be mapped to the new class shape. Binary serialization is very brittle and unforgiving when any kind of class shape change occurs. What would be nice is a programmatic way to mutate instances to a new class shape. If the author of a class changes its shape, there could be a method called by the JVM that allows them to deserialize old-shaped instances and map them into new objects.
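To make the idea of mapping old-shaped instances to a new shape concrete, here is a small sketch under stated assumptions. It does not use the JVM's built-in serialization (which offers serialVersionUID and a private readObject method as partial hooks); instead it hand-rolls a version tag in front of the data. The names ShapeMigrationDemo, Person, writeV1, and migrateFromV1 are all hypothetical, as is the scenario of a single name field being split into two.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class ShapeMigrationDemo {
    // Current (version 2) shape: the single "name" field of version 1
    // has been split into firstName and lastName.
    static final class Person {
        final String firstName;
        final String lastName;

        Person(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }
    }

    // Writes the old version 1 layout: a version tag, then one name string.
    static byte[] writeV1(String name) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(1);
            out.writeUTF(name);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Writes the current version 2 layout: a version tag, then two strings.
    static byte[] writeV2(Person p) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(2);
            out.writeUTF(p.firstName);
            out.writeUTF(p.lastName);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // The class author's migration hook: maps an old-shaped value to the new shape.
    static Person migrateFromV1(String name) {
        int space = name.indexOf(' ');
        if (space < 0) {
            return new Person(name, "");
        }
        return new Person(name.substring(0, space), name.substring(space + 1));
    }

    // Reads either layout and always returns an object of the current shape.
    static Person read(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int version = in.readInt();
            if (version == 1) {
                return migrateFromV1(in.readUTF());
            }
            if (version == 2) {
                String first = in.readUTF();
                String last = in.readUTF();
                return new Person(first, last);
            }
            throw new IllegalArgumentException("Unknown shape version " + version);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Person p = read(writeV1("Ada Lovelace"));
        System.out.println(p.firstName + " / " + p.lastName);
    }
}
```

The design choice worth noting is that the reader, not the writer, carries the knowledge of every historical shape, so data saved by an older release stays readable as long as the class author keeps the corresponding migration method around.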

The second problem is that some objects hold handles or pointers to resources outside the JVM that won't necessarily exist next time round. These could be references to files, sockets, GUI widgets, or anything else where the object interacts with the platform in some way. This could be solved by giving the VM a clearly defined life cycle in which objects are called back on save and load; something like a GUI widget could keep all of its data on save except the actual window handle, and then re-create itself from this state when the VM is reloaded. There would still be the problem of what to do with something that might not be there any more, such as a file on disk that is no longer present, but if the VM had a clearly defined API cycle through which objects were saved and restored, and class authors adhered to it, this could be dealt with.
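Java's existing object serialization already has a small-scale version of this callback idea, and it can serve as a sketch of what a VM-wide life cycle might look like. In the example below a field marked transient is left out of the saved state, and the private readObject method, which the serialization machinery calls on load, re-creates it. The Widget class, the openNativeWindow method, and the integer "handle" are stand-ins invented for illustration; no real windowing toolkit is involved, and this works per object graph rather than for a whole VM.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class HandleRestoreDemo {
    // Stand-in for the platform: hands out a fresh "window handle" each time.
    // A real toolkit would call into the operating system here.
    static int nextHandle = 100;

    static int openNativeWindow() {
        return nextHandle++;
    }

    static class Widget implements Serializable {
        private static final long serialVersionUID = 1L;

        final String title;          // ordinary state, saved as-is
        transient int windowHandle;  // platform handle, never saved

        Widget(String title) {
            this.title = title;
            this.windowHandle = openNativeWindow();
        }

        // Load callback: restore the saved fields, then acquire a new handle
        // because the old one is meaningless in the new process.
        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            windowHandle = openNativeWindow();
        }
    }

    // Saves the widget to a byte array and loads it back, as a stand-in
    // for exiting and re-opening the program.
    static Widget saveAndRestore(Widget widget) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(widget);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Widget) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Widget original = new Widget("Editor");
        Widget restored = saveAndRestore(original);
        System.out.println(restored.title + " has a new handle: "
                + (restored.windowHandle != original.windowHandle));
    }
}
```

What this does not cover is the harder case raised above: a resource that cannot be re-created, such as a file that has since been deleted. A fuller life cycle would need a way for the load callback to report that failure rather than silently continuing.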

Java has matured in its release cycle since the early days, and new versions now appear every 18 months rather than every week. For every Java program that runs again and again all year with the same set of unchanged classes, we need to think about how to optimize for the common scenario: nothing in the JRE has changed, it's on the same computer that ran it last time round, and the user wants it to come up as fast as any native program. I think the whole area of VM object persistence needs to be looked at again to see whether this time around it can be made to work, benefiting programmers with more control and flexibility and users with faster and more reliable programs.

More Stories By Joe Winchester

Joe Winchester, Editor-in-Chief of Java Developer's Journal, was formerly JDJ's longtime Desktop Technologies Editor and is a software developer working on development tools for IBM in Hursley, UK.
