Welcome!

Java IoT Authors: Elizabeth White, Liz McMillan, Yeshim Deniz, Pat Romanski, Yakov Fain

Related Topics: @DevOpsSummit, Java IoT, Microservices Expo, Linux Containers, @CloudExpo, @BigDataExpo

@DevOpsSummit: Blog Post

Software Quality Metrics for Your Continuous Delivery Pipeline | Part 3

It is safe to say that insightful logging and performance are two opposite goals

Let me ask you a question: would you say that you have implemented logging correctly for your application? Correct in the sense that it will provide you with all the insights you require to keep your business going once your users are struck by errors? And in a way that does not adversely impact your application performance? Honestly, I bet you have not. Today I will explain why you should turn off logging completely in production because of its limitations:

  • Relies on Developers
  • Lacks Context
  • Impacts Performance

Intrigued? Bear with me and I will show you how you can still establish and maintain a healthy and useful logging strategy for your deployment pipeline, from development to production, guided by metrics.

What Logging Can Do for You
Developers, including myself, often write log messages because they are lazy. Why should I set a breakpoint and fire up a debugger if it is so much more convenient to dump something to my console via a simple println()? This simple yet effective mechanism also works on headless machines where no IDE is installed, such as staging or production environments:

System.out.println("Been here, done that.");

Careful coders would use a logger to prevent random debug messages from appearing in production logs and additionally use guard statements to prevent unnecessary parameter construction:

if (logger.isDebugEnabled()) {
logger.debug("Entry number: " + i + " is " + String.valueOf(entry[i]));
}

Anyways, the point about logging is that that traces of log messages allow developers to better understand what their program is doing in execution. Does my program take this branch or that branch? Which statements were executed before that exception was thrown? I have done this at least a million of times, and most likely so have you:

if (condition) {
logger.debug("7: yeah!")
} else {
logger.debug("8: DAMN!!!")
}

Test Automation Engineers, usually developers by trade, equally use logging to better understand how the code under test complies with their test scenarios:

class AgentSpec extends spock.lang.Specification {

def "Agent.compute()"() {
def agent = AgentPool.getAgent()

when:
def result = agent.compute(TestFixtures.getData())

then:
logger.debug("result: "  + result);
result == expected
}
}

Logging is, undoubtedly, a helpful tool during development and I would argue that developers should use it as freely as possible if it helps them to understand and troubleshoot their code.

In production, application logging is useful for tracking certain events, such as the occurrence of a particular exception, but it usually fails to deliver what it is so often mistakenly used for: as a mechanism for analyzing application failures in production. Why?

Because approaches to achieving this goal with logging are naturally brittle: their usefulness depends heavily on developers, messages are without context, and if not designed carefully, logging may severely slow down your application.

Secretly, what you are really hoping to get from your application logs, in the one or the other form, is something like this:

A logging strategy that delivers out-of-the-box using dynaTrace: user context, all relevant data in place, zero config

The Limits of Logging
Logging Relies on Developers
Let's face it: logging is, inherently, a developer-centric mechanism. The usefulness of your application logs stands and falls with your developers. A best practice for logging in production says: "don't log too much" (see Optimal Logging @ Google testing blog). This sounds sensible, but what does this actually mean? If we recall the basic motivation behind logging in production, we could equally rephrase this as "log just enough information you need to know about a failure that enables you to take adequate actions". So, what would it take your developers to provide such actionable insights? Developers would need to correctly anticipate where in the code errors would occur in production. They would also need to collect any relevant bits of information along an execution path that bear these insights and, last but not least, present them in a meaningful way so that others can understand, too. Developers are, no doubt, a critical factor to the practicality of your application logs.

Logging Lacks Context
Logging during development is so helpful because developers and testers usually examine smaller, co-located units of code that are executed in a single thread. It is fairly easy to maintain an overview under such simulated conditions, such as a test scenario:

13:49:59 INFO com.company.product.users.UserManager - Registered user ‘foo'.
13:49:59 INFO com.company.product.users.UserManager - User ‘foo' has logged in.
13:49:59 INFO com.company.product.users.UserManager - User ‘foo' has logged out.

But how can you reliably identify an entire failing user transaction in a real-life scenario, that is, in a heavily multi-threaded environment with multiple tiers that serve piles of distributed log files? I say, hardly at all. Sure, you can go mine for certain isolated events, but you cannot easily extract causal relationships from an incoherent, distributed set of log messages:

13:49:59 INFO com.company.product.users.UserManager - User ‘foo' has logged in.
13:49:59 INFO com.company.product.users.UserManager - User ‘bar' has logged in.
...
13:49:60 SEVERE org.hibernate.exception.JDBCConnectionException: could not execute query
at org.hibernate.exception.SQLStateConverter.convert(SQLStateConverter.java:99)
...

After all, the ability to identify such contexts is key to deciding why a particular user action failed.

Logging Impacts Performance
What is a thorough logging strategy worth if your users cannot use your application because it is terribly slow? In case you did not know, logging, especially during peak load times, may severely slow down your application. Let's have a quick look at some of the reasons:

Writing log messages from the application's memory to persistent storage, usually to the file system, demands substantial I/O (see Top Performance Mistakes when moving from Test to Production: Excessive Logging). Traditional logger implementations wrote files by issuing synchronous I/O requests, which put the calling thread into a wait state until the log message was fully written to disk.

In some cases, the logger itself may cause a decent bottle-neck: in the Log4j library (up to version 1.2), every single log activity results in a call to an internal template method Appender.doAppend() that is synchronized for thread-safety (see Multithreading issues - doAppend is synchronised?). The practical implication of this is that threads, which log to the same Appender, for example a FileAppender, must queue up with any other threads writing logs. Consequently, the application spends valuable time waiting in synchronization instead of doing whatever the app was actually designed to do. This will hurt performance, especially in heavily multi-thread environments like web application servers.

These performance effects can be vastly amplified when exception logging comes into play: exception data, such as error message, stack trace and any other piggy-backed exceptions ("initial cause exceptions") greatly increase the amount of data that needs to be logged. Additionally, once a system is in a faulty state, the same exceptions tend to appear over and over again, further hurting application performance. We had once monitored a 30% drawdown on CPU resources due to more than 180,000 exceptions being thrown in only 5 minutes on one of our application servers (see Performance Impact of Exceptions: Why Ops, Test and Dev need to care). If we had written these exceptions to the file system, they would have trashed I/O, filled up our disk space in no time and had considerably increased our response times.

Subsequently, it is safe to say that insightful logging and performance are two opposite goals: if you want the one, then you have to make a compromise on the other.

For more logging tips click here for the full article.

More Stories By Martin Etmajer

Leveraging his outstanding technical skills as a lead software engineer, Martin Etmajer has been a key contributor to a number of large-scale systems across a range of industries. He is as passionate about great software as he is about applying Lean Startup principles to the development of products that customers love.

Martin is a life-long learner who frequently speaks at international conferences and meet-ups. When not spending time with family, he enjoys swimming and Yoga. He holds a master's degree in Computer Engineering from the Vienna University of Technology, Austria, with a focus on dependable distributed real-time systems.

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


@ThingsExpo Stories
SYS-CON Events announced today that Super Micro Computer, Inc., a global leader in Embedded and IoT solutions, will exhibit at SYS-CON's 18th International Cloud Expo®, which will take place on June 7-9, 2016, at the Javits Center in New York City, NY. Supermicro (NASDAQ: SMCI), the leading innovator in high-performance, high-efficiency server technology, is a premier provider of advanced server Building Block Solutions® for Data Center, Cloud Computing, Enterprise IT, Hadoop/Big Data, HPC and ...
18th Cloud Expo, taking place June 7-9, 2016, at the Javits Center in New York City, NY, will feature technical sessions from a rock star conference faculty and the leading industry players in the world. Cloud computing is now being embraced by a majority of enterprises of all sizes. Yesterday's debate about public vs. private has transformed into the reality of hybrid cloud: a recent survey shows that 74% of enterprises have a hybrid cloud strategy. Meanwhile, 94% of enterprises are using some...
SoftLayer operates a global cloud infrastructure platform built for Internet scale. With a global footprint of data centers and network points of presence, SoftLayer provides infrastructure as a service to leading-edge customers ranging from Web startups to global enterprises. SoftLayer's modular architecture, full-featured API, and sophisticated automation provide unparalleled performance and control. Its flexible unified platform seamlessly spans physical and virtual devices linked via a world...
SYS-CON Events announced today that BMC Software has been named "Siver Sponsor" of SYS-CON's 18th Cloud Expo, which will take place on June 7-9, 2015 at the Javits Center in New York, New York. BMC is a global leader in innovative software solutions that help businesses transform into digital enterprises for the ultimate competitive advantage. BMC Digital Enterprise Management is a set of innovative IT solutions designed to make digital business fast, seamless, and optimized from mainframe to mo...
"What we see what happens when you have a completely networked society and the potential to now drive the value creation and the collaboration and the ecosystems that are possible when you start to be able to connect people and industries together in ways that have never been possible before," explained Esmeralda Swartz, VP of Marketing Enterprise & Cloud at Ericsson, in this SYS-CON.tv interview at @ThingsExpo, held November 3-5, 2015, at the Santa Clara Convention Center in Santa Clara, CA.
Companies can harness IoT and predictive analytics to sustain business continuity; predict and manage site performance during emergencies; minimize expensive reactive maintenance; and forecast equipment and maintenance budgets and expenditures. Providing cost-effective, uninterrupted service is challenging, particularly for organizations with geographically dispersed operations.
Cloud computing delivers on-demand resources that provide businesses with flexibility and cost-savings. The challenge in moving workloads to the cloud has been the cost and complexity of ensuring the initial and ongoing security and regulatory (PCI, HIPAA, FFIEC) compliance across private and public clouds. Manual security compliance is slow, prone to human error, and represents over 50% of the cost of managing cloud applications. Determining how to automate cloud security compliance is critical...
The Internet of Things (IoT) is growing rapidly by extending current technologies, products and networks. By 2020, Cisco estimates there will be 50 billion connected devices. Gartner has forecast revenues of over $300 billion, just to IoT suppliers. Now is the time to figure out how you’ll make money – not just create innovative products. With hundreds of new products and companies jumping into the IoT fray every month, there’s no shortage of innovation. Despite this, McKinsey/VisionMobile data...
The IoTs will challenge the status quo of how IT and development organizations operate. Or will it? Certainly the fog layer of IoT requires special insights about data ontology, security and transactional integrity. But the developmental challenges are the same: People, Process and Platform. In his session at @ThingsExpo, Craig Sproule, CEO of Metavine, will demonstrate how to move beyond today's coding paradigm and share the must-have mindsets for removing complexity from the development proc...
SYS-CON Events announced today TechTarget has been named “Media Sponsor” of SYS-CON's 18th International Cloud Expo, which will take place on June 7–9, 2016, at the Javits Center in New York City, NY, and the 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. TechTarget is the Web’s leading destination for serious technology buyers researching and making enterprise technology decisions. Its extensive global networ...
SYS-CON Events announced today that MangoApps will exhibit at SYS-CON's 18th International Cloud Expo®, which will take place on June 7-9, 2016, at the Javits Center in New York City, NY. MangoApps provides modern company intranets and team collaboration software, allowing workers to stay connected and productive from anywhere in the world and from any device. For more information, please visit https://www.mangoapps.com/.
SYS-CON Events announced today that IBM Cloud Data Services has been named “Bronze Sponsor” of SYS-CON's 18th Cloud Expo, which will take place on June 7-9, 2016, at the Javits Center in New York City, NY. IBM Cloud Data Services offers a portfolio of integrated, best-of-breed cloud data services for developers focused on mobile computing and analytics use cases.
SYS-CON Events announced today that Commvault, a global leader in enterprise data protection and information management, has been named “Bronze Sponsor” of SYS-CON's 18th International Cloud Expo, which will take place on June 7–9, 2016, at the Javits Center in New York City, NY, and the 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Commvault is a leading provider of data protection and information management...
The essence of data analysis involves setting up data pipelines that consist of several operations that are chained together – starting from data collection, data quality checks, data integration, data analysis and data visualization (including the setting up of interaction paths in that visualization). In our opinion, the challenges stem from the technology diversity at each stage of the data pipeline as well as the lack of process around the analysis.
SYS-CON Events announced today that Alert Logic, Inc., the leading provider of Security-as-a-Service solutions for the cloud, will exhibit at SYS-CON's 18th International Cloud Expo®, which will take place on June 7-9, 2016, at the Javits Center in New York City, NY. Alert Logic, Inc., provides Security-as-a-Service for on-premises, cloud, and hybrid infrastructures, delivering deep security insight and continuous protection for customers at a lower cost than traditional security solutions. Ful...
A strange thing is happening along the way to the Internet of Things, namely far too many devices to work with and manage. It has become clear that we'll need much higher efficiency user experiences that can allow us to more easily and scalably work with the thousands of devices that will soon be in each of our lives. Enter the conversational interface revolution, combining bots we can literally talk with, gesture to, and even direct with our thoughts, with embedded artificial intelligence, wh...
In his session at 18th Cloud Expo, Bruce Swann, Senior Product Marketing Manager at Adobe, will discuss how the Adobe Marketing Cloud can help marketers embrace opportunities for personalized, relevant and real-time customer engagement across offline (direct mail, point of sale, call center) and digital (email, website, SMS, mobile apps, social networks, connected objects). Bruce Swann has more than 15 years of experience working with digital marketing disciplines like web analytics, social med...
Designing IoT applications is complex, but deploying them in a scalable fashion is even more complex. A scalable, API first IaaS cloud is a good start, but in order to understand the various components specific to deploying IoT applications, one needs to understand the architecture of these applications and figure out how to scale these components independently. In his session at @ThingsExpo, Nara Rajagopalan is CEO of Accelerite, will discuss the fundamental architecture of IoT applications, ...
SYS-CON Events announced today that Tintri Inc., a leading producer of VM-aware storage (VAS) for virtualization and cloud environments, will exhibit at the 18th International CloudExpo®, which will take place on June 7-9, 2016, at the Javits Center in New York City, New York, and the 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA.
SYS-CON Events announced today that ContentMX, the marketing technology and services company with a singular mission to increase engagement and drive more conversations for enterprise, channel and SMB technology marketers, has been named “Sponsor & Exhibitor Lounge Sponsor” of SYS-CON's 18th Cloud Expo, which will take place on June 7-9, 2016, at the Javits Center in New York City, New York. “CloudExpo is a great opportunity to start a conversation with new prospects, but what happens after the...