JFluid: A New Way to Profile Java Applications

Anyone who develops production applications eventually spends some time profiling. JFluid is an experimental new technology for profiling Java code. It was developed at Sun Microsystems Laboratories and can be a handy tool in your profiling toolbox.

Your application should run fast and not overconsume valuable resources such as memory. For production applications it's important to "scale well" by running quickly and within a reasonable memory footprint when the workload increases. Profiling tools help identify bottlenecks in your code - sections that contribute the most to execution time and resource consumption. Profiling can reveal unanticipated facts about application performance. In the world of Java application development, all too often performance assumptions tend to be wrong.

Until now there have been two ways to profile Java code. You can modify the code by hand or use a profiling tool that leverages the Java Virtual Machine Profiler Interface (JVMPI). The first approach (e.g., inserting calls to System.currentTimeMillis()) is tedious and error prone. The second approach requires learning one of the many JVMPI profiling tools. Some are open source; others are commercial products. JFluid provides a third alternative.
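
To make the first approach concrete, here is a minimal sketch of hand instrumentation with System.currentTimeMillis(). The class ManualTiming and the method sumOfSquares are hypothetical names invented for this illustration; they are not part of the article's sample code.

```java
// Hand-instrumentation sketch: time one method with System.currentTimeMillis().
// ManualTiming and sumOfSquares are hypothetical names used only for illustration.
class ManualTiming {

    // Stand-in for the code whose cost we want to measure.
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();            // timing call inserted by hand
        long result = sumOfSquares(1000000);
        long elapsed = System.currentTimeMillis() - start;  // timing call inserted by hand

        System.out.println("result = " + result);
        System.out.println("elapsed = " + elapsed + " ms");
    }
}
```

Every method of interest needs its own pair of timing calls, and they all have to be removed again afterward, which is why this approach becomes tedious and error prone as soon as more than a few methods are involved.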

Features
First, an important note: currently JFluid works only with its own Java Virtual Machine (JVM). This JVM is a slightly modified version of Sun's v1.4.2 HotSpot JVM and is fully compatible with the unmodified JVM. The only difference is a small internal profiling API. Replacing the standard JVM with the JFluid JVM is easy - see the "Installation" section of this article.

The JFluid tool is a Swing application that is currently a bit rough around the edges - hopefully it will improve over time. The application to be profiled can be started by the tool. Alternatively, you can attach the tool to a JVM that is already running. Attaching to a running JVM is useful when you want to profile applications that run in a container such as a Web or application server. An important JFluid feature is that no special command-line settings are required when starting the JVM to which you eventually attach. Until the JFluid tool is attached, the application runs at full speed with no profiling overhead. Think about the implication of that. It means that applications can be profiled in their normal deployed environment without advance preparation and without incurring any overhead when profiling is not being done. In other words, JFluid allows profiling to be turned on and off at will.

Minimizing overhead is a cornerstone of JFluid's design. Its CPU profiling allows you to profile a subset of the application's methods. By not profiling the rest of the methods you can dramatically reduce profiling overhead. To profile a group of methods, first select one or more methods manually. JFluid treats each selected method as the root of a "call graph". Starting at the root method JFluid determines which methods are called by it, repeating the process recursively until the entire call subgraph is identified. Only the methods that belong to the call subgraph are instrumented and profiled; the rest of your code runs unchanged at full speed. Method selections can be changed arbitrarily while the program is running. This allows you to perform a "drill down" with decreasing overhead, or to successively profile different code areas for which there would be too much overhead if profiled together. Tools that allow only a selection of methods for profiling by name or package are unable to provide this functionality in such an easy-to-use form.
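
The idea of expanding root methods into a call subgraph can be sketched as a simple reachability walk. This is only an illustration under stated assumptions: the call graph is given as a ready-made map of method names, whereas JFluid has to discover the calls itself; the class CallSubgraph and the method names PrimeNumbers.format, Unrelated.task, and Unrelated.helper are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch of selecting a call subgraph from root methods (not JFluid's actual code).
class CallSubgraph {

    // Returns the roots plus every method reachable from them through the calls map.
    static Set<String> reachableFrom(Map<String, List<String>> calls, List<String> roots) {
        Set<String> selected = new LinkedHashSet<>(roots);
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String method = pending.pop();
            for (String callee : calls.getOrDefault(method, Collections.emptyList())) {
                // add() returns false for methods already selected, so cycles terminate.
                if (selected.add(callee)) {
                    pending.push(callee);
                }
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Hypothetical call graph: caller -> direct callees.
        Map<String, List<String>> calls = new HashMap<>();
        calls.put("PrimeServlet.doPost",
                Arrays.asList("PrimeNumbers.getEratosthenes", "java.lang.String.split"));
        calls.put("PrimeNumbers.getEratosthenes", Arrays.asList("PrimeNumbers.format"));
        calls.put("Unrelated.task", Arrays.asList("Unrelated.helper"));

        // Only methods in this set would be instrumented; the rest run unchanged.
        System.out.println(reachableFrom(calls, Arrays.asList("PrimeServlet.doPost")));
    }
}
```

In this sketch Unrelated.task and Unrelated.helper are never selected, which mirrors how code outside the chosen subgraph keeps running at full speed.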

JFluid also tracks memory allocations with an eye on minimizing overhead. By default JFluid does not record complete information on every object allocation; instead it does statistical sampling by keeping a counter for each class. It decrements the class's counter when an object is allocated. When the allocation counter reaches zero, JFluid does detailed profiling and resets the counter. Detailed profiling of an allocation consists of capturing a stack trace and monitoring whether the object is still on the heap, in other words the object's liveness. By default the counter is 10 so 10% of object allocations are tracked. In production applications, particularly server applications, the number of objects allocated is so high that the information that is discarded is usually not significantly different from what is retained. The allocation counter can be decreased to improve precision or increased to reduce overhead.
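
The countdown scheme described above can be sketched in a few lines. This is a simplified model, not JFluid's implementation: the class AllocationSampler and its method shouldTrack are hypothetical, and classes are identified by name in an ordinary map.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of per-class countdown sampling of allocations (hypothetical names throughout).
class AllocationSampler {
    private final int interval;                                   // e.g. 10 -> track 1 in 10
    private final Map<String, Integer> countdown = new HashMap<>();

    AllocationSampler(int interval) {
        if (interval < 1) {
            throw new IllegalArgumentException("interval must be >= 1");
        }
        this.interval = interval;
    }

    // Called once per allocation; returns true when this allocation gets detailed profiling.
    boolean shouldTrack(String className) {
        int remaining = countdown.getOrDefault(className, interval) - 1;
        if (remaining == 0) {
            countdown.put(className, interval);                   // reset the counter
            return true;
        }
        countdown.put(className, remaining);
        return false;
    }

    public static void main(String[] args) {
        AllocationSampler sampler = new AllocationSampler(10);
        int tracked = 0;
        for (int i = 0; i < 1000; i++) {
            if (sampler.shouldTrack("PrimeNumbers")) {
                tracked++;
            }
        }
        System.out.println(tracked + " of 1000 allocations tracked"); // prints "100 of 1000 allocations tracked"
    }
}
```

With an interval of 1 every allocation is tracked, which corresponds to complete profiling at the highest overhead.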

For tracked allocations JFluid records the age of each object, where "age" means number of garbage collections the object has survived. It also reports the number of different ages for all tracked allocations of a class; this is called "surviving generations." The surviving generations value is useful for detecting memory leaks caused by objects that are constantly generated, but only partially garbage collected. In other words, the group of objects grows steadily. "Steadily" is the key word: it is not a group of objects that has been generated once within a short period of time in the past and remains fixed since then. Nor is it a group of objects that may grow for quite some time, but then get collected at once. In both of those cases, the number of different ages of objects within a group would be small, no matter how young or old the objects themselves are. It is typically in a leaking object group that the number of different ages grows steadily.
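
As a small worked example of the metric, the following sketch counts the distinct ages among the live tracked objects of one class. The class SurvivingGenerations and the age values are made up for illustration; they are not data from JFluid.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

// Sketch: "surviving generations" as the number of distinct ages among live objects.
class SurvivingGenerations {

    // ages holds, for each live tracked object, the number of garbage collections it has survived.
    static int count(List<Integer> ages) {
        return new HashSet<>(ages).size();
    }

    public static void main(String[] args) {
        // Allocated together long ago and never freed: old, but only one distinct age.
        List<Integer> allocatedOnce = Arrays.asList(12, 12, 12, 12);
        // New survivors keep accumulating between collections: many distinct ages.
        List<Integer> leaking = Arrays.asList(1, 3, 5, 8, 12);

        System.out.println(count(allocatedOnce)); // prints 1
        System.out.println(count(leaking));       // prints 5
    }
}
```

Both groups contain objects of age 12, yet only the second shows the steadily growing spread of ages that points to a leak.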

JFluid can also tell you where in your code an object was allocated. However, it does not help answer the question of why an object has not been garbage collected. In other words, no display is provided to show which objects are pointing to the suspected leaking object. Sometimes just knowing where an object was allocated is enough of a clue to help you figure out why it has not been garbage collected. When that is not enough, a heap graph would be helpful. We have yet to see what JFluid is going to offer in this area.

In addition to the instrumentation of CPU usage and heap allocations, JFluid provides four monitors. These graphs show thread count, current heap size and usage, time spent doing garbage collection, and the count of the number of different ages for all heap objects. The count of the number of different ages is the surviving generations value for the entire heap.

Installation
All the necessary .zip files are on the JFluid Web site. Support is included for SPARC Solaris, x86 Solaris, Linux, and Windows. Installation consists of two parts: replacing the JVM and installing the tool.

To replace the JVM, download the binary files (libjvm.so or jvm.dll, depending on your platform) and put them in place of the corresponding files in your standard JVM. Alternatively, you can download a complete JDK that has already been modified to support JFluid.

To install the JFluid tool, unzip the file into any directory. Edit the shell script or batch file that's used to start the JFluid tool so that your JVM and JFluid directories are specified, and you are ready to go.

Sample Program
To demonstrate JFluid I wrote a simple prime number generator that uses the Sieve of Eratosthenes. Its method accepts a single integer parameter and returns all prime numbers less than or equal to that integer. (The source code for this article can be downloaded from www.sys-con.com/java/sourcec.cfm.) The code is poorly written on purpose so that I can demonstrate some of JFluid's features. To test JFluid's ability to connect to a container, I also wrote a poorly implemented servlet that uses the prime number class; the servlet wastes an atrocious amount of heap space. Finally, I created a small .html form that invokes the servlet.
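
For readers unfamiliar with the algorithm, a straightforward Sieve of Eratosthenes might look like the sketch below. It is not the article's downloadable sample, which is deliberately inefficient; the class SieveSketch and the method primesUpTo are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a plain Sieve of Eratosthenes (not the article's intentionally poor sample code).
class SieveSketch {

    // Returns all prime numbers less than or equal to max.
    static List<Integer> primesUpTo(int max) {
        List<Integer> primes = new ArrayList<>();
        if (max < 2) {
            return primes;
        }
        boolean[] composite = new boolean[max + 1];
        for (int i = 2; (long) i * i <= max; i++) {
            if (!composite[i]) {
                // Cross out multiples of i, starting at i*i (smaller ones are already marked).
                for (int j = i * i; j <= max; j += i) {
                    composite[j] = true;
                }
            }
        }
        for (int i = 2; i <= max; i++) {
            if (!composite[i]) {
                primes.add(i);
            }
        }
        return primes;
    }

    public static void main(String[] args) {
        System.out.println(primesUpTo(30)); // prints [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
    }
}
```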

I ran JFluid on both a SunBlade 2000 running Solaris 9 and a Pentium3-based PC running Windows 2000. All screenshots in this article are from the PC, which is where I ran the example profiling session.

Example Profiling Session
After modifying my JVM with the JFluid files, I started the Tomcat servlet container. After starting the JFluid tool I modified its settings, which include two different CLASSPATH values. The first is for the main application class loader; the second is for any other class loader that gets used by your code. For JFluid to be able to find and instrument the classes of my servlet, I had to specify the servlet's classes in the path for the other class loader (see Figure 1). In the Instrumentation tab of the settings dialog I turned on profiling of the core Java classes, which is off by default to reduce overhead.

To begin a profiling session I chose Run > Attach from the menu. JFluid prompted me for the name of the working directory for the JVM process; with my Tomcat installation this is the directory in which I started Tomcat. After specifying the directory I clicked on the console window that was displayed when I started Tomcat; that allowed me to send the JVM a signal by pressing Ctrl-Break on my keyboard. I then returned to JFluid and verified that it had connected to the JVM. This process is awkward but painless. The steps for connecting to a JVM running on a Unix system are a bit different; I have some tips on that below.

After JFluid attached to the JVM I clicked the Monitors tab to see the graph of heap usage. This graph is always available, even when no detailed instrumentation of memory allocations is being done. My next task was to look into the slow performance of the servlet. Using the Select and add root method > From binary menu entry, I specified my PrimeServlet.doPost() method. With that done, I chose Instrument > Selected root methods transitively from the menu and I was ready to go. In a browser window I brought up my HTML form, typed 123456 as the maximum value, and then clicked Submit. After a pause my browser displayed the results.

Meanwhile, JFluid was profiling my code. Clicking on the Profile > Get latest results menu entry took me to the CPU results tab, which displayed the window in Figure 2. The top line showed that my doPost() method took 409 milliseconds (ms). Of that time, however, only 2.16 ms were actually spent in the code of doPost() itself; the rest of the time was spent in methods called by doPost(). In particular, the java.lang.String.split() method took up a huge amount of time: 243 ms. That's almost twice the amount of time used by PrimeNumbers.getEratosthenes() to generate the prime numbers. Drilling a little deeper, I saw that most of the time in PrimeNumbers.getEratosthenes() was actually spent in a method I wrote to convert the answer to a string. Wasteful string processing was hurting performance.

In addition to wasting CPU cycles, the code was also wasting heap space. Using the Monitors tab to watch the graph of heap usage I noticed that the heap continued to grow as I used the servlet to request additional prime numbers. Some of that growth was expected, but the servlet was caching answers, so requests with a previously requested value should not have caused much additional memory usage. But they did. This was due to a bug I put in on purpose: the cache was broken because the key used to add to the cache was not the same key used to request entries from the cache. So if 123456 is requested four times, there will be four instances of PrimeNumbers in the cache, even though only one is needed. The extra instances of PrimeNumbers cannot be garbage collected from the heap because the cache holds references to them.
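
The following sketch shows one way a cache can end up in this state. It is an assumption made for illustration, not the servlet's actual code: the class BrokenCacheDemo, the key type RequestKey, and the int[] stand-in for the cached result are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a cache whose store key never matches its lookup key (hypothetical code).
class BrokenCacheDemo {

    // Hypothetical key type. It does not override equals() or hashCode(),
    // so two RequestKey objects built from the same number are different map keys.
    static final class RequestKey {
        final int max;
        RequestKey(int max) { this.max = max; }
    }

    private final Map<Object, int[]> cache = new HashMap<>();

    int[] lookup(int max) {
        // Lookup uses an Integer key...
        int[] cached = cache.get(Integer.valueOf(max));
        if (cached == null) {
            cached = new int[] { max };              // stand-in for the expensive result
            // ...but the result is stored under a RequestKey, so the lookup above never hits.
            cache.put(new RequestKey(max), cached);
        }
        return cached;
    }

    int size() {
        return cache.size();
    }

    public static void main(String[] args) {
        BrokenCacheDemo demo = new BrokenCacheDemo();
        for (int i = 0; i < 4; i++) {
            demo.lookup(123456);
        }
        System.out.println(demo.size()); // prints 4: one entry per request, all kept reachable by the map
    }
}
```

Using the same key for both get() and put(), for example the Integer, would leave a single entry in the map no matter how often the same value is requested.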

To identify this memory leak I selected Object liveness from the Instrument menu. This turned off JFluid's CPU profiling and turned on detailed profiling of object allocations. Since this was a small application, I set the allocation counter to 1 to get complete profiling. To reduce the overhead incurred, I set the depth of the stack traces it records to 3. Then I went back to my browser and made four more requests for prime numbers.

When I clicked on Profile > Get latest results, the Memory results tab displayed with the list of objects allocated since my selection of Object liveness. The line of interest was this one:

4 live obj. - 4 alloc obj. - 3.8 avg. age - 2 surv. gen. - 4 total alloc obj. - PrimeNumbers

Each time I requested a prime number I used the same value: 123456. Only one live instance of the PrimeNumbers class should exist. There are three extra objects because of the cache bug. After "live objects," the other figures reported are:

  • alloc obj.: The number of allocations being tracked
  • avg. age: The average number of garbage collections the live objects have survived
  • surv. gen.: The number of different ages of the live objects
  • total alloc obj.: All allocations, including those that are not being tracked

These numbers provide clues when you are looking for memory leaks. Double-clicking the line of figures brings up a window that shows the different stack traces that were captured for each profiled object. In my example application the problem was obvious from the number of live objects reported. Unfortunately, memory leak detection is not always so simple. This is why the average age and surviving generations values are displayed. As described in the "Features" section, classes with large values for these figures are potential sources of memory leaks.

Using JFluid on Unix Systems
The only difference between running JFluid on a Windows system and a Unix system is the method of attaching to a running JVM. On Unix systems the JFluid tool sends a SIGQUIT signal to the running JVM to establish a connection. To do that the JFluid client must be running with adequate privileges. For example, when testing on Solaris 9, I was attaching the JFluid client to the JVM used by Sun ONE Web Server v6.1. By default, that JVM was run by the user ID nobody. So I had to run the JFluid tool as nobody, not as another user.

Sun Microsystems Laboratories
4150 Network Circle
Santa Clara, CA 95054
Web: http://research.sun.com
Phone: 800 555-9SUN

Conclusion
JFluid is an experimental but powerful tool. It provides detailed profiling information on demand, allowing you to turn profiling on and off at will. When profiling is on you can control how much overhead is incurred. The JFluid tool lacks some features and polish, but it's easy to learn and installation is quick.

References

  • JFluid: http://research.sun.com/projects/jfluid/
  • JFluid research: http://research.sun.com/techrep/2003/abstract-125.html
  • JVMPI: http://java.sun.com/j2se/1.4.2/docs/guide/jvmpi/jvmpi.html

By Gregg Sporar

Gregg Sporar is a Staff Engineer in the Services Division of Sun Microsystems and is a Sun Certified Java Developer.
