Case Study: It Takes More than a Tool

Swarovski’s 10 Requirements for Creating an APM Culture

Swarovski - the leading producer of cut crystal in the world - relies on its eCommerce store just as much as other companies in the highly competitive eCommerce environment. Swarovski's story is no different from others in this space: they started with "Let's build a website to sell our products online" a couple of years ago and quickly progressed to "We sell to 60 million annual visitors across 23 countries in six languages." There were bumps along the road, and they realized that it takes more than just a bunch of servers and tools to keep the site running.

Why APM and why you need more than just a tool
Swarovski relies on Intershop's eCommerce platform and faced several challenges as they rapidly grew. Their challenges required them to apply Application Performance Management (APM) practices to ensure they could fulfill the business requirements to keep pace with customer growth while maintaining an excellent user experience. The most insightful comment I heard was from René Neubacher, Senior eBusiness Technology Consultant at Swarovski: "APM is not just about software. APM is a culture, a mindset and a set of business processes. APM software supports that."

René recently discussed their journey to APM: what their initial problems were, what requirements they ended up having for APM, and which tools they needed to support their APM strategy. They have since reached the next level of maturity by establishing a Performance Center of Excellence. This allows them to tackle application performance proactively throughout the organization instead of putting out fires reactively in production.

This article describes the challenges they faced, the questions that arose and the new-generation APM requirements that paved the way forward in their performance journey.

The Challenge!
Swarovski had traditional system monitoring in place on all their systems across their delivery chain, including web servers, application servers, SAP, database servers, external systems and the network. Knowing that each individual component is up and running 99.99% of the time is great but no longer sufficient. How might individual component outages impact the user experience of their online shoppers? Who is actually responsible for the end user experience, and how should you monitor the complete delivery chain and not just the individual components? These and other questions came up when the eCommerce site attracted more customers, which was quickly followed by more complaints about their user experience.

APM includes getting a holistic view of the complete delivery chain and requires someone to be responsible for end user experience.

Questions that had no answers
In addition to "Who is responsible in case users complain?" the other questions that needed to be urgently addressed included:

  • How often is the service desk called before IT knows that there is a problem?
  • How much time is spent in searching for system errors versus building new features?
  • Do we have a process to find the root-cause when a customer reports a problem?
  • How do we visualize our services from the customer's point of view?
  • How much revenue, brand image and productivity are at risk or lost while IT is searching for the problem?
  • What to do when someone says "it's slow"?

The Ten Requirements
These unanswered questions triggered the need to move away from traditional system monitoring and develop the requirements for new generation APM and user experience management.

1: Support State-of-the-Art Architecture
Based on their current system architecture it was clear that Swarovski needed an approach that was able to work in their architecture, now and in the future. The rise of more interactive Web 2.0 and mobile applications had to be factored in to allow monitoring end users from many different devices and regardless of whether they used a web application or mobile native application as their access point.

Transactions need to be followed from the browser all the way back to the database. It is important to support distributed transactions. This approach also helps to spot architectural and deployment problems immediately.
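To make the idea of following a transaction across tiers more concrete, here is a minimal Python sketch. It assumes every tier tags its timing record with a shared transaction ID; the `Span` record, its field names and the sample values are hypothetical and are not taken from Swarovski's setup or from any particular APM product.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """One tier's share of a single end-to-end transaction (hypothetical record)."""
    transaction_id: str  # shared by every tier that handles the same click
    tier: str            # e.g. "browser", "app", "database"
    duration_ms: float   # time spent in this tier only


def slowest_tier_per_transaction(spans):
    """Group spans by transaction ID and return the slowest tier of each one."""
    by_transaction = defaultdict(list)
    for span in spans:
        by_transaction[span.transaction_id].append(span)
    return {
        transaction_id: max(group, key=lambda s: s.duration_ms).tier
        for transaction_id, group in by_transaction.items()
    }


# Made-up sample data: two clicks, each traced through several tiers.
spans = [
    Span("t1", "browser", 120.0),
    Span("t1", "app", 80.0),
    Span("t1", "database", 950.0),
    Span("t2", "browser", 400.0),
    Span("t2", "app", 60.0),
]
result = slowest_tier_per_transaction(spans)
print(result)
```

Because every record carries the same transaction ID, a slow click can be attributed to a specific tier instead of being inferred from separate per-component monitors.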

2: 100% transactions and clicks - No Averages
Based on their experience, Swarovski knew that looking at average values or sampled data would not be helpful when customers complained about bad performance. Responding to a customer complaint with "Our average user has no problem right now - sorry for your inconvenience" is not what you want your helpdesk engineers to use as a standard phrase. Averages and sampling also hide the real problems you have in your system. Check out the blog post Why Averages Suck by Michael Kopp for more detail.

Measuring end user performance of every customer interaction allows for quick identification of regional problems with CDNs, third parties or latency.

Having 100% of user interactions and transactions available makes it easy to identify the root cause for individual users.
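The point about averages can be illustrated with a small Python sketch. The response times and the 3-second threshold below are made-up illustration values, not Swarovski measurements, and the `summarize` helper is a hypothetical name.

```python
import statistics


def summarize(response_times_ms, slow_threshold_ms):
    """Compare the average with what individual users actually experienced."""
    slow = [t for t in response_times_ms if t > slow_threshold_ms]
    return {
        "mean_ms": statistics.mean(response_times_ms),
        "median_ms": statistics.median(response_times_ms),
        "max_ms": max(response_times_ms),
        "slow_requests": len(slow),
    }


# Made-up sample: 98 fast page loads and 2 very slow ones.
samples = [200] * 98 + [8000] * 2
summary = summarize(samples, slow_threshold_ms=3000)
print(summary)
```

With these numbers the mean is 356 ms, well below the threshold, although two users waited 8 seconds. Only the full set of measurements reveals those two requests.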

3: Business Visibility
As the business had a growing interest in the success of the eCommerce platform, IT had to demonstrate to the business what it took to fulfill their requirements and how business requirements are impacted by the availability or the lack of investment in the application delivery chain.

Correlating the number of visits and performance with incoming orders illustrates the measurable impact of performance on revenue and what it takes to support business requirements.
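One simple way to put a number on such a relationship is a correlation coefficient. The following Python sketch computes the Pearson correlation between page load time and conversion rate (orders divided by visits). All figures are invented for illustration, and the variable names are assumptions; the article does not say which statistic Swarovski's dashboards use.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_x * var_y)


# Invented hourly figures: page load time, visits and orders.
page_load_s = [1.2, 1.5, 2.0, 3.5, 5.0]
visits = [1000, 1000, 1000, 1000, 1000]
orders = [50, 48, 40, 25, 12]

conversion = [o / v for o, v in zip(orders, visits)]
r = pearson(page_load_s, conversion)
print(f"correlation between load time and conversion: {r:.2f}")
```

A value close to -1 in data like this would indicate that slower pages go hand in hand with fewer orders. Correlation alone does not prove causation, so it is a starting point for a conversation with the business rather than a final answer.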

4: Impact of 3rd Parties and CDNs
It was important to not only track transactions involving their own Data Center but all user interactions with their web site - even those delivered through CDNs or third parties. All of these interactions make up the user experience and therefore all of it needs to be analyzed.

Seeing the actual load impact of third-party components or content delivered from CDNs enables IT to pinpoint user experience problems that originate outside their own data center.

5: Across the life cycle - supporting collaboration and tearing down silos
The APM initiative was started because Swarovski reacted to problems happening in production. Fixing these problems in production is only the first step. Their ultimate goal is to become proactive by finding and fixing problems in development or testing, before they spill over into production. Instead of relying on different sets of tools with different capabilities, the requirement is to use one single solution that is designed to be used across the application lifecycle (Developer Workstation, Continuous Integration, Testing, Staging and Production). This makes it easier to share application performance data between lifecycle stages, allowing individuals not only to look at data from other stages easily but also to compare data to verify the impact and behavior of code changes between version updates.

Continuously catching regressions in Development by analyzing unit and performance tests allows application teams to become more proactive.

Pinpointing integration and scalability issues, continuously, in acceptance and load testing makes testing more efficient and prevents problems from reaching production.
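Catching regressions continuously boils down to comparing the measurements of a new build against a known-good baseline. Here is a minimal Python sketch of such a check, as it might run in a Continuous Integration job. The test names, timings and the 10% tolerance are assumptions chosen for illustration.

```python
def find_regressions(baseline_ms, current_ms, tolerance=0.10):
    """Return tests whose current time exceeds the baseline by more than tolerance.

    Both arguments map a test name to its measured duration in milliseconds.
    Tests missing from the current run are skipped rather than reported.
    """
    regressions = {}
    for name, base in baseline_ms.items():
        now = current_ms.get(name)
        if now is None:
            continue
        if now > base * (1 + tolerance):
            regressions[name] = (base, now)
    return regressions


# Invented timings from the previous release and from the current build.
baseline = {"checkout": 100.0, "search": 50.0}
current = {"checkout": 130.0, "search": 52.0}

regressions = find_regressions(baseline, current)
for name, (base, now) in regressions.items():
    print(f"{name}: {base:.0f} ms -> {now:.0f} ms")
```

A build step could fail whenever the returned dictionary is not empty, which stops a slowdown before it reaches staging or production.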

6: Down to the source code
In order to speed up problem resolution Swarovski's operations and development teams require as much code-level insight as possible - not only for their own engineers who are extending the Intershop eCommerce Platform but also for Intershop to improve their product. Knowing what part of the application code is not performing well with which input parameters or under which specific load on the system eliminates tedious reproduction of the problem. The requirement is to lower the Mean Time To Repair (MTTR) from as much as several days down to only a couple of hours.

The SAP Connector turned out to have a performance problem. This method-level detailed information was captured without changing any code.

7: Zero/Acceptable overhead
"Who are we kidding? There is nothing like zero overhead especially when you need 100% coverage!" These were René's words when he explained this requirement. And he is right: once you start collecting information from a production system you add a certain amount of overhead. A better term for this would be "imperceptible overhead" - overhead that's so small, you don't notice it.

What is the exact number? It depends on your business and your users. The number should be worked out from the impact on the end user experience, rather than from the additional CPU, memory or network bandwidth required in the data center. Swarovski knew they had to achieve less than 2% overhead on page load times in production, as anything more would have hurt their business, and that's what they achieved.
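The overhead budget can be expressed as simple arithmetic on page load times. The following Python sketch uses the 2% figure from the text; the function names and the sample timings are hypothetical.

```python
def overhead_percent(baseline_ms, instrumented_ms):
    """Relative increase in page load time caused by monitoring, in percent."""
    if baseline_ms <= 0:
        raise ValueError("baseline must be positive")
    return (instrumented_ms - baseline_ms) / baseline_ms * 100.0


def within_budget(baseline_ms, instrumented_ms, budget_percent=2.0):
    """True if the measured overhead stays within the agreed budget."""
    return overhead_percent(baseline_ms, instrumented_ms) <= budget_percent


# Invented example: a page that loads in 2000 ms without monitoring.
print(within_budget(2000.0, 2030.0))  # 30 ms extra is 1.5% of 2000 ms
print(within_budget(2000.0, 2060.0))  # 60 ms extra is 3% of 2000 ms
```

Measuring against page load time rather than server CPU keeps the budget tied to what the shopper experiences.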

8: Centralized data collection and administration
Running a distributed eCommerce application that may be extended to additional geographical locations requires an APM system with a centralized data collection and administration option. It is not feasible to collect different types of performance information separately from different systems, servers or even data centers, because that would require either multiple analysis tools or a transformation of the data into a single format before it could be analyzed properly.

Instead, Swarovski required a single unified APM system. Central administration is equally important: it eliminates the need to rely on remote IT administrators for changes to the monitored system, including simple tasks such as changing the level of captured data or upgrading to a new version.

Storing and accessing performance data in a single, centralized repository enables fast and powerful analytics and visualization. For example, system metrics such as CPU utilization can be correlated with end-user response time or database execution time - all displayed on one single dashboard.
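Once all metrics live in one repository, lining them up becomes a simple join on the timestamp. Here is a minimal Python sketch under that assumption. The series names, the minute-resolution timestamps and the values are invented, and `join_by_timestamp` is a hypothetical helper rather than part of any APM product.

```python
def join_by_timestamp(**series):
    """Return one row per timestamp that is present in every given series.

    Each keyword argument maps a metric name to a {timestamp: value} dict.
    At least one series must be passed.
    """
    common = set.intersection(*(set(s) for s in series.values()))
    return [
        {"timestamp": ts, **{name: s[ts] for name, s in series.items()}}
        for ts in sorted(common)
    ]


# Invented samples: CPU utilization and end-user response time per minute.
cpu = {"10:00": 35, "10:01": 92, "10:02": 40}
response = {"10:00": 210, "10:01": 1850}

rows = join_by_timestamp(cpu_percent=cpu, response_ms=response)
for row in rows:
    print(row)
```

With data from separate tools the same join would first need clock alignment and format conversion, which is the extra work a central repository avoids.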

9: Auto-Adapting Instrumentation without digging through code
As the majority of the application code is not developed in-house but provided by Intershop, it is mandatory to get insight into the application without doing any manual code changes. The APM system must auto-adapt to changes so that no manual configuration change is necessary when a new version of the application is deployed.

This means Swarovski can focus on making their applications contribute positively to business outcomes rather than spending time maintaining IT systems.
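The general idea of gaining timing insight without editing application source can be sketched in Python by wrapping a class's methods at runtime. This is only an analogy: the article does not describe how the APM product instruments the Intershop code, and `SapConnector`, `fetch_price` and `instrument` are hypothetical names.

```python
import functools
import inspect
import time


def _timed(func, label, timings):
    """Return a wrapper that records how long each call to func takes."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            timings.setdefault(label, []).append(time.perf_counter() - start)
    return wrapper


def instrument(cls, timings):
    """Wrap every public plain method defined on cls; its source stays untouched."""
    for name, attr in list(vars(cls).items()):
        if name.startswith("_") or not inspect.isfunction(attr):
            continue
        setattr(cls, name, _timed(attr, f"{cls.__name__}.{name}", timings))


class SapConnector:
    """Hypothetical stand-in for third-party code that cannot be edited."""

    def fetch_price(self, sku):
        return {"sku": sku, "price": 42}


timings = {}
instrument(SapConnector, timings)
answer = SapConnector().fetch_price("A-1")
print(answer, list(timings))
```

Because the wrapper discovers methods by inspecting the class, a new version with additional methods is covered by the same call, which mirrors the requirement that no manual configuration change should be needed after a deployment.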

10: Ability to extend
Their application runs in an always-growing and ever-changing IT environment. What was once deployed on physical boxes might be moved to virtualized environments or even into a public cloud environment.

Whatever the extension may be, the APM solution must be able to adapt to these changes. It must also be extensible to consume new types of data sources, e.g., performance metrics from Amazon Cloud Services or VMware, Cassandra or other Big Data solutions, or even legacy mainframe applications, and bring these metrics into the centralized data repository to provide new insights into the application's performance.

Extending the application monitoring capabilities to Amazon EC2, Microsoft Windows Azure, a public or private cloud enables the analysis of the performance impact of these virtualized environments on end user experience.

The Solution and the Way Forward
Needless to say, Swarovski took the first step by implementing APM as a new process and mindset in their organization. They are now in the next phase, implementing a Performance Center of Excellence. This allows them to move from Reactive Performance Troubleshooting to Proactive Performance Prevention.

Stay tuned for more blog posts on the Performance Center of Excellence and how you can build one in your own organization. The key message is that it is not just about using a bunch of tools. It is about living and breathing performance throughout the organization. If you are interested in this, check out the blogs by Steve Wilson: Proactive vs Reactive: How to prevent problems instead of fixing them faster and Performance in Development is the Chief Cornerstone.

More Stories By Andreas Grabner

Andreas Grabner has been helping companies improve their application performance for 15+ years. He is a regular contributor within Web Performance and DevOps communities and a prolific speaker at user groups and conferences around the world. Reach him at @grabnerandi
