@DevOpsSummit: Blog Post

Software Quality Metrics for Your Continuous Delivery Pipeline | Part 3

It is safe to say that insightful logging and performance are two opposite goals

Let me ask you a question: would you say that you have implemented logging correctly for your application? Correct in the sense that it will provide you with all the insights you require to keep your business going once your users are struck by errors? And in a way that does not adversely impact your application performance? Honestly, I bet you have not. Today I will explain why you should turn off logging completely in production because of its limitations:

  • Relies on Developers
  • Lacks Context
  • Impacts Performance

Intrigued? Bear with me and I will show you how you can still establish and maintain a healthy and useful logging strategy for your deployment pipeline, from development to production, guided by metrics.

What Logging Can Do for You
Developers, including myself, often write log messages because they are lazy. Why should I set a breakpoint and fire up a debugger if it is so much more convenient to dump something to my console via a simple println()? This simple yet effective mechanism also works on headless machines where no IDE is installed, such as staging or production environments:

System.out.println("Been here, done that.");

Careful coders would use a logger to prevent random debug messages from appearing in production logs and additionally use guard statements to prevent unnecessary parameter construction:

if (logger.isDebugEnabled()) {
    // The guard skips the string concatenation when debug output is disabled.
    logger.debug("Entry number: " + i + " is " + String.valueOf(entry[i]));
}
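The same effect can be had without an explicit guard when the logging API accepts a lazily evaluated message. The following is a minimal sketch, assuming Java 8 or later and the JDK's own java.util.logging, whose Logger.fine(Supplier<String>) only invokes the supplier if the FINE level is enabled. The class name LazyLogging, the helper describe() and the logger name are hypothetical and chosen for illustration only:

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

class LazyLogging {
    // Counts how often the (potentially expensive) message is actually built.
    static final AtomicInteger BUILT = new AtomicInteger();

    // Hypothetical helper that builds the debug message for one entry.
    static String describe(int i, Object[] entry) {
        BUILT.incrementAndGet();
        return "Entry number: " + i + " is " + String.valueOf(entry[i]);
    }

    // Logs every entry at FINE level and returns how many messages were built.
    static int logEntries(Logger logger, Object[] entry) {
        BUILT.set(0);
        for (int i = 0; i < entry.length; i++) {
            final int index = i;
            // The lambda is only evaluated if FINE is enabled for this logger.
            logger.fine(() -> describe(index, entry));
        }
        return BUILT.get();
    }

    public static void main(String[] args) {
        Logger logger = Logger.getLogger("demo.lazy"); // hypothetical logger name
        Object[] entry = {"a", "b", "c"};

        logger.setLevel(Level.INFO);
        System.out.println("messages built at INFO: " + logEntries(logger, entry));

        logger.setLevel(Level.FINE);
        System.out.println("messages built at FINE: " + logEntries(logger, entry));
    }
}
```

With the level set to INFO, no message is constructed at all; with FINE, one message is built per entry. Whether a supplier or a classic guard reads better is a matter of taste; the point is that the cost of building the message is only paid when it will be logged.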

Anyway, the point about logging is that traces of log messages allow developers to better understand what their program is doing during execution. Does my program take this branch or that branch? Which statements were executed before that exception was thrown? I have done this at least a million times, and most likely so have you:

if (condition) {
    logger.debug("7: yeah!");
} else {
    logger.debug("8: DAMN!!!");
}

Test Automation Engineers, usually developers by trade, equally use logging to better understand how the code under test complies with their test scenarios:

class AgentSpec extends spock.lang.Specification {

    // logger and expected are assumed to be defined elsewhere in the spec
    def "Agent.compute()"() {
        given:
        def agent = AgentPool.getAgent()

        when:
        def result = agent.compute(TestFixtures.getData())
        logger.debug("result: " + result)

        then:
        result == expected
    }
}

Logging is, undoubtedly, a helpful tool during development and I would argue that developers should use it as freely as possible if it helps them to understand and troubleshoot their code.

In production, application logging is useful for tracking certain events, such as the occurrence of a particular exception, but it usually fails to deliver what it is so often mistakenly used for: as a mechanism for analyzing application failures in production. Why?

Because approaches to achieving this goal with logging are naturally brittle: their usefulness depends heavily on developers, messages are without context, and if not designed carefully, logging may severely slow down your application.

Secretly, what you are really hoping to get from your application logs, in one form or another, is something like this:

A logging strategy that delivers out-of-the-box using dynaTrace: user context, all relevant data in place, zero config

The Limits of Logging
Logging Relies on Developers
Let's face it: logging is, inherently, a developer-centric mechanism. The usefulness of your application logs stands or falls with your developers. A best practice for logging in production says: "don't log too much" (see Optimal Logging @ Google testing blog). This sounds sensible, but what does it actually mean? If we recall the basic motivation behind logging in production, we could equally rephrase this as "log just enough information about a failure to enable you to take adequate action". So, what would it take for your developers to provide such actionable insights? Developers would need to correctly anticipate where in the code errors will occur in production. They would also need to collect every relevant bit of information along an execution path that carries these insights and, last but not least, present it in a meaningful way so that others can understand it, too. Developers are, no doubt, a critical factor in the practicality of your application logs.

Logging Lacks Context
Logging during development is so helpful because developers and testers usually examine smaller, co-located units of code that are executed in a single thread. It is fairly easy to maintain an overview under such simulated conditions, such as a test scenario:

13:49:59 INFO - Registered user 'foo'.
13:49:59 INFO - User 'foo' has logged in.
13:49:59 INFO - User 'foo' has logged out.

But how can you reliably identify an entire failing user transaction in a real-life scenario, that is, in a heavily multi-threaded environment with multiple tiers that serve piles of distributed log files? I say, hardly at all. Sure, you can go mine for certain isolated events, but you cannot easily extract causal relationships from an incoherent, distributed set of log messages:

13:49:59 INFO - User 'foo' has logged in.
13:49:59 INFO - User 'bar' has logged in.
13:50:00 SEVERE org.hibernate.exception.JDBCConnectionException: could not execute query
at org.hibernate.exception.SQLStateConverter.convert(

After all, the ability to identify such contexts is key to deciding why a particular user action failed.
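To make the missing context tangible, here is a minimal sketch that treats the excerpt above as plain data and tries to attribute each line to a user. The class LogContext and the method linesFor() are hypothetical names, and timestamps are omitted for brevity:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LogContext {
    // Returns the log lines that mention the given user name in quotes.
    static List<String> linesFor(List<String> log, String user) {
        List<String> hits = new ArrayList<>();
        for (String line : log) {
            if (line.contains("'" + user + "'")) {
                hits.add(line);
            }
        }
        return hits;
    }

    // The log excerpt from the text, without timestamps.
    static List<String> sampleLog() {
        return Arrays.asList(
            "INFO - User 'foo' has logged in.",
            "INFO - User 'bar' has logged in.",
            "SEVERE org.hibernate.exception.JDBCConnectionException: could not execute query");
    }

    public static void main(String[] args) {
        List<String> log = sampleLog();
        System.out.println("foo: " + linesFor(log, "foo"));
        System.out.println("bar: " + linesFor(log, "bar"));
        // The SEVERE line appears in neither result: nothing in the message
        // itself tells us whose transaction failed.
    }
}
```

Each user's login can be found easily, but the exception line matches neither of them. Unless the developer anticipated the need and put an identifying key into every message along the execution path, the failure cannot be tied back to a user action.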

Logging Impacts Performance
What is a thorough logging strategy worth if your users cannot use your application because it is terribly slow? In case you did not know, logging, especially during peak load times, may severely slow down your application. Let's have a quick look at some of the reasons:

Writing log messages from the application's memory to persistent storage, usually to the file system, demands substantial I/O (see Top Performance Mistakes when moving from Test to Production: Excessive Logging). Traditional logger implementations wrote files by issuing synchronous I/O requests, which put the calling thread into a wait state until the log message was fully written to disk.

In some cases, the logger itself may become a considerable bottleneck: in the Log4j library (up to version 1.2), every single log activity results in a call to an internal template method, AppenderSkeleton.doAppend(), that is synchronized for thread safety (see Multithreading issues - doAppend is synchronised?). The practical implication is that threads that log to the same Appender, for example a FileAppender, must queue up behind any other threads writing logs. Consequently, the application spends valuable time waiting in synchronization instead of doing whatever it was actually designed to do. This hurts performance, especially in heavily multi-threaded environments such as web application servers.
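The serialization effect is easy to reproduce. The following is a simplified sketch and not Log4j's actual code: SketchAppender is a hypothetical stand-in whose doAppend() is synchronized, and it records how many threads are ever inside the method at the same time:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

class SynchronizedAppenderDemo {
    // Hypothetical stand-in for an appender with a synchronized append method.
    static class SketchAppender {
        private final AtomicInteger inside = new AtomicInteger();
        private final AtomicInteger maxInside = new AtomicInteger();
        private int written;

        synchronized void doAppend(String message) {
            int now = inside.incrementAndGet();
            maxInside.accumulateAndGet(now, Math::max);
            written++; // stands in for the actual write to a file
            inside.decrementAndGet();
        }

        int maxConcurrentWriters() { return maxInside.get(); }

        synchronized int written() { return written; }
    }

    // Starts the given number of threads, each logging to the same appender.
    static SketchAppender run(int threads, int messagesPerThread) {
        SketchAppender appender = new SketchAppender();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                try {
                    start.await(); // release all threads at once
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int m = 0; m < messagesPerThread; m++) {
                    appender.doAppend("thread " + id + " message " + m);
                }
            });
            workers[t].start();
        }
        start.countDown();
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return appender;
    }

    public static void main(String[] args) {
        SketchAppender appender = run(4, 1000);
        System.out.println("messages written: " + appender.written());
        System.out.println("max threads inside doAppend: " + appender.maxConcurrentWriters());
    }
}
```

No matter how many threads are started, at most one of them is ever inside doAppend(); all others wait. With a real appender, the time spent inside the method includes the I/O, so the waiting time grows with the number of logging threads.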

These performance effects can be vastly amplified when exception logging comes into play: exception data, such as the error message, the stack trace and any piggy-backed exceptions ("initial cause exceptions"), greatly increases the amount of data that needs to be logged. Additionally, once a system is in a faulty state, the same exceptions tend to appear over and over again, further hurting application performance. We once observed a 30% drain on CPU resources due to more than 180,000 exceptions being thrown in only 5 minutes on one of our application servers (see Performance Impact of Exceptions: Why Ops, Test and Dev need to care). Had we written these exceptions to the file system, they would have thrashed I/O, filled up our disk space in no time and considerably increased our response times.
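How much more data an exception produces compared to a plain message can be estimated with a few lines of code. This is a rough sketch: the class ExceptionLogSize, the helper render() and the two sample exceptions are made up for illustration, and the actual sizes depend on the depth of the call stack at the point of failure:

```java
import java.io.PrintWriter;
import java.io.StringWriter;

class ExceptionLogSize {
    // Renders a throwable the way a logger typically would: message,
    // stack trace and all chained causes.
    static String render(Throwable t) {
        StringWriter buffer = new StringWriter();
        PrintWriter writer = new PrintWriter(buffer);
        t.printStackTrace(writer);
        writer.flush();
        return buffer.toString();
    }

    // Hypothetical failure with an initial cause attached.
    static Throwable sample() {
        Exception cause = new IllegalStateException("connection pool exhausted");
        return new RuntimeException("could not execute query", cause);
    }

    public static void main(String[] args) {
        Throwable failure = sample();
        String rendered = render(failure);
        System.out.println("message only:    " + failure.getMessage().length() + " chars");
        System.out.println("with stack trace: " + rendered.length() + " chars");
    }
}
```

Even for this trivial example the rendered exception is several times longer than its message, and in an application server with deep call stacks the difference is far larger. Multiply that by thousands of exceptions per minute and the I/O load adds up quickly.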

Consequently, it is safe to say that insightful logging and performance are two opposing goals: if you want one, you have to compromise on the other.

More Stories By Martin Etmajer

Martin Etmajer has 10+ years of experience as a developer and software architect, as well as in maintaining highly available and performant cluster environments. In his current role, Martin works as a Technology Strategist for Dynatrace with a focus on Continuous Delivery and DevOps. He speaks at technology conferences and meetups and publishes articles on this blog. Reach him at @metmajer
