Java: Case Study

From 500 to 1500 Managed JVMs Without Increasing Staff

IBM Tivoli Composite Application Manager (ITCAM) case study

Recently I listened to a large enterprise customer talk about their four years of experience with IBM Tivoli Composite Application Manager (ITCAM). They noted that ITCAM had proved extremely valuable for proactively troubleshooting production applications, which is what I expected to hear. After all, ITCAM enables you to do both:

End-to-end monitoring, so you can better understand how your customers are experiencing the application, and

Detailed diagnostic monitoring, which you'll need for the truly gnarly performance problems.

However, I wasn't expecting the customer to be excited about ITCAM's new agent installation and deployment features.  Yes, I bet you did a double-take as you read that.

The new ITCAM combines the configuration and installation of the end-to-end monitoring agents and detailed data collectors into a single console.  The number of install panels has been reduced and clearer instructions for creating differently configured data collectors are available. The console also simplifies the application of a new profile across existing collectors.

Nice stuff, but why the enthusiasm?

The answer had a lot to do with how the customer delegates IT roles and responsibilities, and with their desire to go from 500 to 1500 managed JVMs (Java Virtual Machines) in a relatively short timeframe without increasing staff.

Challenges caused by people, process and technology
Like most large enterprises, the customer has separate development, engineering, and operations teams involved in deploying and managing their web applications. Getting the ITCAM monitoring solution installed in their production environment was initially challenging because the development team had responsibility for, and control of, the process for deploying anything to the WebSphere installation in production. This meant that the monitoring solution, particularly the data collection agents, had to be included in the development team's deployment process.

At the time, the development team was not interested in using ITCAM's deployment facilities, in part because the console was designed for operations experts. As a result of this mismatch, the operations team had to create internal processes and best practices for configuring, deploying, and maintaining the agents and data collectors within the production installation.

One can imagine the additional effort required to manage the deployment process across the silos as the number of managed web applications grew from 10 four years ago to the approximately 500 that are managed today. Yet it was clear to the business and the IT teams that the predictive and problem-resolution benefits obtained from analyzing data from ITCAM's agents were worth the effort. However, it's also easy to see that the effort is a drain on the total benefits received.

Now, consider their plans to expand management to another 1000 applications. Their enthusiasm for ITCAM's new streamlined deployment capabilities becomes even clearer.

Solution benefits are maximized by people, collaboration and technology
After fifteen years of following the application management space and writing solution case studies, I can say that this type of story is not unique.  The value a customer receives from their management solution is not only dependent on the management features. It also depends on the solution's administrative technology, the variety of stakeholders that must use the solution, and the amount of process effort it takes for those stakeholders to collaborate.

So here's my take on how to maximize the benefit from faster deployment of monitoring capabilities, using the ITCAM customer's experience as a guide.

The new streamlined agent configuration and deployment makes it easier for non-operations groups to include the monitoring capabilities in their deployment packages. This may remove the installation tasks from the operations team's to-do list (depending on how your organization has delegated deployment responsibilities). Yet it can also increase the importance of optimizing the configuration of the monitoring agents as part of your application deployment process.

This optimization task must be done regardless of which group has the deployment responsibility or what deployment technology you are using. Also, do not make the mistake of believing that virtualization and image management tools will eliminate this task. Virtualization tools dramatically speed up your image deployment times, but they do nothing to ensure that the configuration of that image is optimal. Those tools will deploy resource-hogging software just as rapidly as highly efficient software. If your application image contains a poorly configured monitoring agent, it is likely to cause a dreaded "performance impacting event." In other words, the solution you are using to solve an application performance problem is actually adding to the problem.

The best way to avoid that situation is through collaboration between the operational monitoring and application development teams while the agent is being configured. While ITCAM provides much more information about how collector configurations will impact application performance, not all enterprise applications are the same. This means trade-off decisions will have to be made about what to collect and when to collect it for every enterprise application. This is where combining operational experience from monitoring other enterprise applications with knowledge of the application's design maximizes the benefit.
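
To make the "what to collect and when" trade-off concrete, here is a minimal sketch of how two teams might agree on an overhead budget for an application and check a proposed collector profile against it before the profile goes into a deployment package. This is not ITCAM's configuration format or API. The profile fields, the overhead percentages, and the budget are all hypothetical values chosen for illustration, and real figures would have to come from your own measurements.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical overhead, as a percentage of response time, for each kind of
# data a collector might gather. Illustrative assumptions only, not
# measurements from ITCAM or any other product.
ASSUMED_OVERHEAD_PCT = {
    "request_counts": 0.5,
    "method_traces": 4.0,
    "heap_snapshots": 6.0,
}


@dataclass(frozen=True)
class CollectorProfile:
    """A made-up collector profile: what to collect and how often."""
    name: str
    collect: Tuple[str, ...]   # keys of ASSUMED_OVERHEAD_PCT
    sampling_rate: float       # fraction of requests sampled, 0.0 to 1.0


def estimated_overhead_pct(profile: CollectorProfile) -> float:
    """Sum the assumed cost of each collected item, scaled by sampling rate."""
    if not 0.0 <= profile.sampling_rate <= 1.0:
        raise ValueError("sampling_rate must be between 0.0 and 1.0")
    total = sum(ASSUMED_OVERHEAD_PCT[item] for item in profile.collect)
    return total * profile.sampling_rate


def fits_budget(profile: CollectorProfile, budget_pct: float) -> bool:
    """True if the profile's estimated overhead stays within the agreed budget."""
    return estimated_overhead_pct(profile) <= budget_pct


# Two hypothetical profiles: a light one for everyday monitoring and a heavy
# one that collects everything on every request.
everyday = CollectorProfile("everyday", ("request_counts", "method_traces"), 0.1)
deep_dive = CollectorProfile(
    "deep-dive", ("request_counts", "method_traces", "heap_snapshots"), 1.0
)

for profile in (everyday, deep_dive):
    verdict = "within budget" if fits_budget(profile, budget_pct=2.0) else "over budget"
    print(profile.name, verdict)
```

In this sketch the operations team would supply the overhead estimates from experience with other applications, while the development team would judge which data the application actually needs. A profile that exceeds the budget is a prompt for a conversation, not necessarily a rejection.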

Additionally, when both teams share ITCAM's console, collaboration is easier because both teams share a context in which to have a meaningful exchange. Collaboration becomes a quick call, an exchange of text messages, or a post to an internal wiki or project blog page, instead of a top-down process enforced by the iron will of a CIO.

Thus the value of doing this collaborative monitoring optimization is twofold. First, it can prevent 'egg-on-face' situations where the cure is adding to the problem. Second, it can dramatically reduce the amount of time development and operations teams must spend on problem resolution. Time that is taken away from developing new software! Time that is taken away from predictive analysis and preventative maintenance!

That's how something as simple as a streamlined deployment can help an enterprise go from 500 to 1500 proactively managed applications.

More Stories By Jasmine Noel

Jasmine Noel is a founding partner of Ptak, Noel & Associates. She has over 15 years of experience analyzing and consulting on IT management issues. She currently focuses on the technologies and processes that organizations require to design, engineer, and manage the performance and service quality of business applications, workloads, and services. Noel served previously as director of systems and applications management at Hurwitz Group, where she formulated and managed the company’s research agenda. She was also a senior analyst at D.H. Brown Associates, where her responsibilities included technology trend analysis in the network and systems management space. Noel is regularly quoted in, and has contributed articles to, several leading publications and content portals on various IT management topics. She holds a bachelor of science from the Massachusetts Institute of Technology and a master of science from the University of Southern California.