Welcome!

Java IoT Authors: APM Blog, Jnan Dash, Elizabeth White, Stackify Blog, Liz McMillan

Related Topics: @DevOpsSummit, Linux Containers, Containers Expo Blog

@DevOpsSummit: Blog Feed Post

Causation and Correlation in #Monitoring | @DevOpsSummit #DevOps #APM

Some vendors use time-based correlation to connect events across multiple observed data sets

Causation and Correlation: What Difference Does It Make?
By Joseph Hoffman

Causation and Correlation: What’s the difference? Why does it matter? Does it matter? Some vendors use time-based correlation to connect events across multiple observed data sets, and claim there’s a connection between two observed data sets. This type of approach is flawed and can lead to wildly inaccurate conclusions, which itself leads to wasted time by technical teams and high cost to the business.

In the chart below we observe the Number of Requests and the Memory Utilization. Wow, the increase in the Number of Requests is causing memory utilized to go up. Quick, call Engineering and get them on this right away.

We can easily see that there’s a correlation between the Number of requests and the memory size. Based on this data it would be relatively easy to conclude that the increase in Requests is causing memory to increase. But the truth is simply that these two observed values have nothing to do with each other. After talking to the system operators, we learn that a large memory intensive analysis job was running.   Once they shut down the Analysis job, things changed. Look at the same measures after a few minutes have passed. Same conclusion? Probably not.

A far superior approach is to use causation, instead of a correlation. A causative based approach would provide us 100% confidence about whether an observed data point does cause another observed data point.

So why doesn’t everyone use Causation? Partly because truly using causation is harder to implement in an APM product and many customers don’t understand the difference. Let’s look at what these terms mean and why they’re similar but distinctly and so importantly different.

Correlation & Causation
Studies show (okay, not really, I just made this up) that when I sleep more I make more money. The assumption is sleeping causes one to make money.   But I’m sure most of us realize that sleeping more is likely letting me be more awake during the day, being a more productive worker and thus producing more outcome which leads to higher income. But if I took the connection between sleep and income directly, I would conclude I should just sleep 24/7 and make lots of money. As we all know, the sleep itself isn’t directly generating the income, but it’s related in some manner. So, we can say that there’s a correlation between sleep quantity and income capacity.

Statisticians measure the relationship between two events and assign a value called the Correlation Coefficient. This is the statistical measure of the degree to which changes to one observed value predict changes to another observed value. Think sleep and income as we discussed earlier. The Correlation Coefficient is defined as a value between -1 and +1. There can also be negative correlation. For example, in the winter, the longer my wife leaves the front door open to talk to the neighbor the colder the house gets. So, there’s a negative correlation between the door open time and the house temperature. In contrast, every time I turn the stereo volume knob to the right, the volume increases, a positive correlation.

The Correlation Coefficient simply measures the strength of the association between two values or events. If I see a direct pattern of one value changing in sync when another value changes, then I can assign a high coefficient of correlation. But that is still not causation. Why is it so important to have causation? Let’s look at another example of what happens when correlation is used incorrectly.

Incorrect Conclusions
Just because we observe one value as going up and observe another value go up, does not mean they have anything to do with each other. The Number of Requests and Memory example above is a good example, and understandably confusing.  Let’s look at a more obvious example. Tyler Vigen[1] did some great work using US Census data to look for spurious correlations in random public data and came up with some great conclusions. Did you know that margarine consumption can cause divorce?[2]

Clearly there’s no connection between margarine consumption and divorce rates but this is the exact conclusion that some APM vendors make when they use correlation to determine what’s wrong with your Enterprise Data Center.

A better way
There are many products on the market which use correlation but claim confidence in the conclusions like its causation. The APM market is no exception. We’ve probably all heard the statement “correlation does not imply causation”. It’s not uncommon for products to claim a relationship between two observables simply because they happened at the same time. The Request/Memory and the Divorce/Margarine stories are good examples. Multi-threaded computing systems are another example of where time based correlation can be dangerous and lead to very wrong conclusions.

But what if I was measuring the environment in a way that allowed me to directly connect event A with event B, not because their time stamps lined up, but because when A happened it created a unique identifier which only showed up in things (such as B) which it directly affected and not in C which it did not affect. This gives me 100% confidence that A caused B.  Event A did trigger event B and not event C, even though the time stamps all line up together. This is called causation.

Why does this all matter
In complex multi-threaded computing systems, there’s lots of things going on at once. And, as a Performance Engineer, if I’m going to get to the root cause of my system problems and fix the user response time complaints, I must have an ability to know for certain which events cause what other events particularly as transactions  cross threads, processes, machines and even data centers. This allows for quickly resolving issues and accurately getting to the root cause even in complex systems. The following diagram shows a relatively simply transaction that flowed across two Java tiers and an Oracle database. This is from a single user click which was reported to be taking too long. By using a causation based approach to analyzing the various events in this multi-tier environment, I am able to easily see that there’s 168 calls into Oracle, all for a single user click. This is highlighted in the red oval. Note (the red arrow) that this makes up 84% of the total users response time. This cannot be done without causation.

If I attempt to use correlation to resolve this type of problem, I would not find this type of problem at all. I would simply be guessing as to what activity on the downstream tiers was caused by the upstream user click event. But only when being able to use causation can I determine what events (trips to the database) are happening as part of a single user click event.

Have confidence and don’t risk your business’s future
The simple fact that the Dynatrace AppMon product captures every transaction all the time and uses a tagging approach across every remoting call, gives the performance engineer causation based data, which gives them confidence and hard facts on what is causing system problems. Being able to point the Dev team directly to the root cause is priceless when time, money and your business reputation is on the line. So why gamble your business with correlation, when you can have the best, causation.

[1] http://www.tylervigen.com/spurious-correlations

[2] http://www.bbc.com/news/magazine-27537142

The post Causation and Correlation: What difference does it make? appeared first on Dynatrace blog – monitoring redefined.

@DevOpsSummit at Cloud Expo taking place June 6-8, 2017, at Javits Center, New York City, and is co-located with the 20th International Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world.

DevOps at Cloud Expo / @ThingsExpo 2017 New York 
(June 6-8, 2017, Javits Center, Manhattan)

DevOps at Cloud Expo / @ThingsExpo 2017 Silicon Valley
(October 31 - November 2, 2017, Santa Clara Convention Center, CA)

Download Show Prospectus ▸ Here

The widespread success of cloud computing is driving the DevOps revolution in enterprise IT. Now as never before, development teams must communicate and collaborate in a dynamic, 24/7/365 environment. There is no time to wait for long development cycles that produce software that is obsolete at launch. DevOps may be disruptive, but it is essential.

@DevOpsSummit will expand the DevOps community, enable a wide sharing of knowledge, and educate delegates and technology providers alike. Recent research has shown that DevOps dramatically reduces development time, the amount of enterprise IT professionals put out fires, and support time generally. Time spent on infrastructure development is significantly increased, and DevOps practitioners report more software releases and higher quality. Sponsors of @DevOpsSummit will benefit from unmatched branding, profile building and lead generation opportunities through:

  • Featured on-site presentation and ongoing on-demand webcast exposure to a captive audience of industry decision-makers.
  • Showcase exhibition during our new extended dedicated expo hours
  • Breakout Session Priority scheduling for Sponsors that have been guaranteed a 35-minute technical session
  • Online advertising in SYS-CON's i-Technology Publications
  • Capitalize on our Comprehensive Marketing efforts leading up to the show with print mailings, e-newsletters and extensive online media coverage.
  • Unprecedented PR Coverage: Editorial Coverage on DevOps Journal
  • Tweetup to over 75,000 plus followers
  • Press releases sent on major wire services to over 500 industry analysts.

For more information on sponsorship, exhibit, and keynote opportunities, contact Carmen Gonzalez by email at events (at) sys-con.com, or by phone 201 802-3021.

The World's Largest "Cloud Digital Transformation" Event

@CloudExpo / @ThingsExpo 2017 New York 
(June 6-8, 2017, Javits Center, Manhattan)

@CloudExpo / @ThingsExpo 2017 Silicon Valley
(Oct. 31 - Nov. 2, 2017, Santa Clara Convention Center, CA)

Full Conference Registration Gold Pass and Exhibit Hall ▸ Here

Register For @CloudExpo ▸ Here via EventBrite

Register For @ThingsExpo ▸ Here via EventBrite

Register For @DevOpsSummit ▸ Here via EventBrite

Sponsorship Opportunities

Sponsors of Cloud Expo @ThingsExpo will benefit from unmatched branding, profile building and lead generation opportunities through:

  • Featured on-site presentation and ongoing on-demand webcast exposure to a captive audience of industry decision-makers
  • Showcase exhibition during our new extended dedicated expo hours
  • Breakout Session Priority scheduling for Sponsors that have been guaranteed a 35 minute technical session
  • Online targeted advertising in SYS-CON's i-Technology Publications
  • Capitalize on our Comprehensive Marketing efforts leading up to the show with print mailings, e-newsletters and extensive online media coverage
  • Unprecedented Marketing Coverage: Editorial Coverage on ITweetup to over 100,000 plus followers, press releases sent on major wire services to over 500 industry analysts

For more information on sponsorship, exhibit, and keynote opportunities, contact Carmen Gonzalez (@GonzalezCarmen) today by email at events (at) sys-con.com, or by phone 201 802-3021.

Secrets of Sponsors and Exhibitors ▸ Here
Secrets of Cloud Expo Speakers ▸ Here

All major researchers estimate there will be tens of billions devices - computers, smartphones, tablets, and sensors - connected to the Internet by 2020. This number will continue to grow at a rapid pace for the next several decades.

With major technology companies and startups seriously embracing Cloud strategies, now is the perfect time to attend @CloudExpo@ThingsExpo, June 6-8, 2017, at the Javits Center in New York City, NY and October 31 - November 2, 2017, Santa Clara Convention Center, CA. Learn what is going on, contribute to the discussions, and ensure that your enterprise is on the right path to Digital Transformation.

Track 1. FinTech
Track 2. Enterprise Cloud | Digital Transformation
Track 3. DevOps, Containers & Microservices 
Track 4. Big Data | Analytics
Track 5. Industrial IoT
Track 6. IoT Dev & Deploy | Mobility
Track 7. APIs | Cloud Security
Track 8. AI | ML | DL | Cognitive Computing

Delegates to Cloud Expo @ThingsExpo will be able to attend 8 simultaneous, information-packed education tracks.

There are over 120 breakout sessions in all, with Keynotes, General Sessions, and Power Panels adding to three days of incredibly rich presentations and content.

Join Cloud Expo @ThingsExpo conference chair Roger Strukhoff (@IoT2040), June 6-8, 2017, at the Javits Center in New York City, NY and October 31 - November 2, 2017, Santa Clara Convention Center, CA for three days of intense Enterprise Cloud and 'Digital Transformation' discussion and focus, including Big Data's indispensable role in IoT, Smart Grids and (IIoT) Industrial Internet of Things, Wearables and Consumer IoT, as well as (new) Digital Transformation in Vertical Markets.

Financial Technology - or FinTech - Is Now Part of the @CloudExpo Program!

Accordingly, attendees at the upcoming 20th Cloud Expo @ThingsExpo June 6-8, 2017, at the Javits Center in New York City, NY and October 31 - November 2, 2017, Santa Clara Convention Center, CA will find fresh new content in a new track called FinTech, which will incorporate machine learning, artificial intelligence, deep learning, and blockchain into one track.

Financial enterprises in New York City, London, Singapore, and other world financial capitals are embracing a new generation of smart, automated FinTech that eliminates many cumbersome, slow, and expensive intermediate processes from their businesses.

FinTech brings efficiency as well as the ability to deliver new services and a much improved customer experience throughout the global financial services industry. FinTech is a natural fit with cloud computing, as new services are quickly developed, deployed, and scaled on public, private, and hybrid clouds.

More than US$20 billion in venture capital is being invested in FinTech this year. @CloudExpo is pleased to bring you the latest FinTech developments as an integral part of our program, starting at the 20th International Cloud Expo June 6-8, 2017 in New York City and October 31 - November 2, 2017 in Silicon Valley.

@CloudExpo is accepting submissions for this new track, so please visit www.CloudComputingExpo.com for the latest information.

Speaking Opportunities

The upcoming 20th International @CloudExpo@ThingsExpo, June 6-8, 2017, at the Javits Center in New York City, NY and October 31 - November 2, 2017, Santa Clara Convention Center, CA announces that its Call For Papers for speaking opportunities is open.

Submit your speaking proposal today! ▸ Here

Our Top 100 Sponsors and the Leading "Digital Transformation" Companies

(ISC)2, 24Notion (Bronze Sponsor), 910Telecom, Accelertite (Gold Sponsor), Addteq, Adobe (Bronze Sponsor), Aeroybyte, Alert Logic, Anexia, AppNeta, Avere Systems, BMC Software (Silver Sponsor), Bsquare Corporation (Silver Sponsor), BZ Media (Media Sponsor), Catchpoint Systems (Silver Sponsor), CDS Global Cloud, Cemware, Chetu Inc., China Unicom, Cloud Raxak, CloudBerry (Media Sponsor), Cloudbric, Coalfire Systems, CollabNet, Inc. (Silver Sponsor), Column Technologies, Commvault (Bronze Sponsor), Connect2.me, ContentMX (Bronze Sponsor), CrowdReviews (Media Sponsor) CyberTrend (Media Sponsor), DataCenterDynamics (Media Sponsor), Delaplex, DICE (Bronze Sponsor), EastBanc Technologies, eCube Systems, Embotics, Enzu Inc., Ericsson (Gold Sponsor), FalconStor, Formation Data Systems, Fusion, Hanu Software, HGST, Inc. (Bronze Sponsor), Hitrons Solutions, IBM BlueBox, IBM Bluemix, IBM Cloud (Platinum Sponsor), IBM Cloud Data Services/Cloudant (Platinum Sponsor), IBM DevOps (Platinum Sponsor), iDevices, Industrial Internet of Things Consortium (Association Sponsor), Impinger Technologies, Interface Masters, Intel (Keynote Sponsor), Interoute (Bronze Sponsor), IQP Corporation, Isomorphic Software, Japan IoT Consortium, Kintone Corporation (Bronze Sponsor), LeaseWeb USA, LinearHub, MangoApps, MathFreeOn, Men & Mice, MobiDev, New Relic, Inc. (Bronze Sponsor), New York Times, Niagara Networks, Numerex, NVIDIA Corporation (AI Session Sponsor), Object Management Group (Association Sponsor), On The Avenue Marketing, Oracle MySQL, Peak10, Inc., Penta Security, Plasma Corporation, Pulzze Systems, Pythian (Bronze Sponsor), Cosmos, RackN, ReadyTalk (Silver Sponsor), Roma Software, Roundee.io, Secure Channels Inc., SD Times (Media Sponsor), SoftLayer (Platinum Sponsor), SoftNet Solutions, Solinea Inc., SpeedyCloud, SSLGURU LLC, StarNet, Stratoscale, Streamliner, SuperAdmins, TechTarget (Media Sponsor), TelecomReseller (Media Sponsor), Tintri (Welcome Reception Sponsor), TMCnet (Media Sponsor), Transparent Cloud Computing Consortium, Veeam, Venafi, Violin Memory, VAI Software, Zerto

About SYS-CON Media & Events
SYS-CON Media (www.sys-con.com) has since 1994 been connecting technology companies and customers through a comprehensive content stream - featuring over forty focused subject areas, from Cloud Computing to Web Security - interwoven with market-leading full-scale conferences produced by SYS-CON Events. The company's internationally recognized brands include among others Cloud Expo® (@CloudExpo), Big Data Expo® (@BigDataExpo), DevOps Summit (@DevOpsSummit), @ThingsExpo® (@ThingsExpo), Containers Expo (@ContainersExpo) and Microservices Expo (@MicroservicesE).

Cloud Expo®, Big Data Expo® and @ThingsExpo® are registered trademarks of Cloud Expo, Inc., a SYS-CON Events company.

More Stories By Dynatrace Blog

Building a revolutionary approach to software performance monitoring takes an extraordinary team. With decades of combined experience and an impressive history of disruptive innovation, that’s exactly what we ruxit has.

Get to know ruxit, and get to know the future of data analytics.

@ThingsExpo Stories
"MobiDev is a software development company and we do complex, custom software development for everybody from entrepreneurs to large enterprises," explained Alan Winters, U.S. Head of Business Development at MobiDev, in this SYS-CON.tv interview at 21st Cloud Expo, held Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA.
In his session at 21st Cloud Expo, Carl J. Levine, Senior Technical Evangelist for NS1, will objectively discuss how DNS is used to solve Digital Transformation challenges in large SaaS applications, CDNs, AdTech platforms, and other demanding use cases. Carl J. Levine is the Senior Technical Evangelist for NS1. A veteran of the Internet Infrastructure space, he has over a decade of experience with startups, networking protocols and Internet infrastructure, combined with the unique ability to it...
"Space Monkey by Vivent Smart Home is a product that is a distributed cloud-based edge storage network. Vivent Smart Home, our parent company, is a smart home provider that places a lot of hard drives across homes in North America," explained JT Olds, Director of Engineering, and Brandon Crowfeather, Product Manager, at Vivint Smart Home, in this SYS-CON.tv interview at @ThingsExpo, held Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA.
Widespread fragmentation is stalling the growth of the IIoT and making it difficult for partners to work together. The number of software platforms, apps, hardware and connectivity standards is creating paralysis among businesses that are afraid of being locked into a solution. EdgeX Foundry is unifying the community around a common IoT edge framework and an ecosystem of interoperable components.
"IBM is really all in on blockchain. We take a look at sort of the history of blockchain ledger technologies. It started out with bitcoin, Ethereum, and IBM evaluated these particular blockchain technologies and found they were anonymous and permissionless and that many companies were looking for permissioned blockchain," stated René Bostic, Technical VP of the IBM Cloud Unit in North America, in this SYS-CON.tv interview at 21st Cloud Expo, held Oct 31 – Nov 2, 2017, at the Santa Clara Conventi...
"Akvelon is a software development company and we also provide consultancy services to folks who are looking to scale or accelerate their engineering roadmaps," explained Jeremiah Mothersell, Marketing Manager at Akvelon, in this SYS-CON.tv interview at 21st Cloud Expo, held Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA.
Coca-Cola’s Google powered digital signage system lays the groundwork for a more valuable connection between Coke and its customers. Digital signs pair software with high-resolution displays so that a message can be changed instantly based on what the operator wants to communicate or sell. In their Day 3 Keynote at 21st Cloud Expo, Greg Chambers, Global Group Director, Digital Innovation, Coca-Cola, and Vidya Nagarajan, a Senior Product Manager at Google, discussed how from store operations and ...
Large industrial manufacturing organizations are adopting the agile principles of cloud software companies. The industrial manufacturing development process has not scaled over time. Now that design CAD teams are geographically distributed, centralizing their work is key. With large multi-gigabyte projects, outdated tools have stifled industrial team agility, time-to-market milestones, and impacted P&L stakeholders.
Gemini is Yahoo’s native and search advertising platform. To ensure the quality of a complex distributed system that spans multiple products and components and across various desktop websites and mobile app and web experiences – both Yahoo owned and operated and third-party syndication (supply), with complex interaction with more than a billion users and numerous advertisers globally (demand) – it becomes imperative to automate a set of end-to-end tests 24x7 to detect bugs and regression. In th...
"Cloud Academy is an enterprise training platform for the cloud, specifically public clouds. We offer guided learning experiences on AWS, Azure, Google Cloud and all the surrounding methodologies and technologies that you need to know and your teams need to know in order to leverage the full benefits of the cloud," explained Alex Brower, VP of Marketing at Cloud Academy, in this SYS-CON.tv interview at 21st Cloud Expo, held Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clar...
"There's plenty of bandwidth out there but it's never in the right place. So what Cedexis does is uses data to work out the best pathways to get data from the origin to the person who wants to get it," explained Simon Jones, Evangelist and Head of Marketing at Cedexis, in this SYS-CON.tv interview at 21st Cloud Expo, held Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA.
SYS-CON Events announced today that CrowdReviews.com has been named “Media Sponsor” of SYS-CON's 22nd International Cloud Expo, which will take place on June 5–7, 2018, at the Javits Center in New York City, NY. CrowdReviews.com is a transparent online platform for determining which products and services are the best based on the opinion of the crowd. The crowd consists of Internet users that have experienced products and services first-hand and have an interest in letting other potential buye...
SYS-CON Events announced today that Telecom Reseller has been named “Media Sponsor” of SYS-CON's 22nd International Cloud Expo, which will take place on June 5-7, 2018, at the Javits Center in New York, NY. Telecom Reseller reports on Unified Communications, UCaaS, BPaaS for enterprise and SMBs. They report extensively on both customer premises based solutions such as IP-PBX as well as cloud based and hosted platforms.
It is of utmost importance for the future success of WebRTC to ensure that interoperability is operational between web browsers and any WebRTC-compliant client. To be guaranteed as operational and effective, interoperability must be tested extensively by establishing WebRTC data and media connections between different web browsers running on different devices and operating systems. In his session at WebRTC Summit at @ThingsExpo, Dr. Alex Gouaillard, CEO and Founder of CoSMo Software, presented ...
WebRTC is great technology to build your own communication tools. It will be even more exciting experience it with advanced devices, such as a 360 Camera, 360 microphone, and a depth sensor camera. In his session at @ThingsExpo, Masashi Ganeko, a manager at INFOCOM Corporation, introduced two experimental projects from his team and what they learned from them. "Shotoku Tamago" uses the robot audition software HARK to track speakers in 360 video of a remote party. "Virtual Teleport" uses a multip...
A strange thing is happening along the way to the Internet of Things, namely far too many devices to work with and manage. It has become clear that we'll need much higher efficiency user experiences that can allow us to more easily and scalably work with the thousands of devices that will soon be in each of our lives. Enter the conversational interface revolution, combining bots we can literally talk with, gesture to, and even direct with our thoughts, with embedded artificial intelligence, whic...
SYS-CON Events announced today that Evatronix will exhibit at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Evatronix SA offers comprehensive solutions in the design and implementation of electronic systems, in CAD / CAM deployment, and also is a designer and manufacturer of advanced 3D scanners for professional applications.
Leading companies, from the Global Fortune 500 to the smallest companies, are adopting hybrid cloud as the path to business advantage. Hybrid cloud depends on cloud services and on-premises infrastructure working in unison. Successful implementations require new levels of data mobility, enabled by an automated and seamless flow across on-premises and cloud resources. In his general session at 21st Cloud Expo, Greg Tevis, an IBM Storage Software Technical Strategist and Customer Solution Architec...
To get the most out of their data, successful companies are not focusing on queries and data lakes, they are actively integrating analytics into their operations with a data-first application development approach. Real-time adjustments to improve revenues, reduce costs, or mitigate risk rely on applications that minimize latency on a variety of data sources. In his session at @BigDataExpo, Jack Norris, Senior Vice President, Data and Applications at MapR Technologies, reviewed best practices to ...
An increasing number of companies are creating products that combine data with analytical capabilities. Running interactive queries on Big Data requires complex architectures to store and query data effectively, typically involving data streams, an choosing efficient file format/database and multiple independent systems that are tied together through custom-engineered pipelines. In his session at @BigDataExpo at @ThingsExpo, Tomer Levi, a senior software engineer at Intel’s Advanced Analytics gr...