Click here to close now.




















Welcome!

Java IoT Authors: Liz McMillan, Pat Romanski, VictorOps Blog, Cloud Best Practices Network, Elizabeth White

Related Topics: Java IoT, Microservices Expo, IoT User Interface, Agile Computing

Java IoT: Article

Don’t Let Load Balancers Ruin Your Holiday Business

Proactive performance management

An eCommerce site that crashes seven times during the Christmas season being down for up to five hours each time it crashes is a site that loses a lot of money and reputation. It happened to one of our customers who told this story at our annual performance conference earlier this month. Among the several reasons that led to these crashes I want to share more details on one of them that I see more often with other websites as well. Load balancers on a round-robin instead of least-busy can easily lead to app server crashes caused by heap memory exhaustion. Let's dig into some details on how to identify these problems and how to avoid it.

The Symptom: Crashing Tomcat Instances
The website is deployed on six Tomcats with three front-end Apache Web Servers. During peak load hours individual Tomcat instances started showing growing response times and a growing number of requests in the Tomcat processing queue. After a while these instances crashed due to out-of-memory exceptions and with that also brought down the rest of the site as load couldn't be handled any more with the remaining servers. Figure 1 shows the actual flow of transactions through the system highlighting unevenly distributed response time in the application servers and functional errors being reported on all tiers (red-colored server icon):

Even with equally distributed load (Round Robin Load Balancer Setting) one of the Tomcats spiked in Response Time Contribution before crashing

Once the App Server started rejecting incoming connections we can observe the first ripple effect of errors. We can see a very high number of exceptions in the database layer, exceptions thrown between application tiers with the web app responding with HTTP 500s:

Within 30 minutes the application serves 43000 pages with an HTTP 500 Response correlating to Exceptions in the Database and Inter-Tier Communication

The Root Cause: Inefficient Database Statements and Connection Pool Usage
The exceptions caught in the Database Layer (JDBC) were already a very good hint of the root cause of this problem. A closer look at the Exceptions shows that connection pools are exhausted, which causes problems in the different components of the application:

Exhausted Connection Pool causes Exceptions that impact Data Access Layer as well as Widget Rendering

Looking at the performance breakdown by application layer reveals how much performance impact connection pooling has on the overall transaction response time:

Due to the connection pool problem a single request had to wait 3.8s on average to obtain a connection from the pool

Now - it was not only the size of the pool that was the problem - but - several very inefficient database statements that took long to execute for certain business transactions of the application. This caused the application server to hold on to the connection longer than normal. As the load balancer was configured with Round Robin the app server still got additional requests served. Eventually - just by the random nature of incoming requests - one app server received several of these requests that executed these inefficient database calls. Once the connection pool was exhausted the application started throwing exceptions that ultimately also led to a crash of the JVM. Once the first app server crashed, it didn't take too long to take the other app servers down as well.

The Solution: Optimizing App and Load Balancer
The problem was fixed by looking at the slowest database statements and optimizing them for performance by, e.g, adding indices on the database or making the SQL statements more efficient. They also optimized the pool size to accommodate the expected load during peak hours.

 

They started by optimizing SQL Statements that took long to execute and those that got executed several times within the same transaction

They also changed the load balancer setting from Round-Robin to Least-Busy, which was the preferred setting from the LB vendor but had simply forgotten to configure in the production environment.

The Result: Site Has Not Been Down Since
Since they made the changes to the application and the load balancer the site has never gone down since. Now - the next holiday season is coming up and they are ready for the next seasonal spikes. Even though they are really confident that everything will work without any problems, they learned their lesson and are approaching performance proactively through proper load testing.

Next Steps: Proactive Performance Management
The lesson learned was that these problems could have been found prior to the holiday shopping season by doing proper load testing. They did load testing before but never encountered this problem because of two reasons: 1) they didn't test using expected peak volumes for long enough sessions and 2) didn't use a tool that simulated real customer behavior variations (too few scripts and the scripts were too simple) that tested their highly interactive web site.

Their strategy for proactive performance management is that they:

  1. Perform load tests on the production system during low traffic hours (2 a.m.-6 a.m.), accepting the risk of minor sales losses in case of a crash, versus major sales losses during the holiday shopping season.
  2. Multiply the hourly load test volume by 2.5 since their actual peaks are 10 hours long.
  3. Use a load testing service that uses real browsers in different locations around the U.S.
  4. Use an APM solution that identifies problems within the application while running the load test.

If you want to read more on common performance problems that are not found prior to moving to production check out my recent series of blogs: Supersized Content, Deployment Mistakes or Excessive Logging

More Stories By Michael Kopp

Michael Kopp has over 12 years of experience as an architect and developer in the Enterprise Java space. Before coming to CompuwareAPM dynaTrace he was the Chief Architect at GoldenSource, a major player in the EDM space. In 2009 he joined dynaTrace as a technology strategist in the center of excellence. He specializes application performance management in large scale production environments with special focus on virtualized and cloud environments. His current focus is how to effectively leverage BigData Solutions and how these technologies impact and change the application landscape.

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


@ThingsExpo Stories
The Internet of Things is in the early stages of mainstream deployment but it promises to unlock value and rapidly transform how organizations manage, operationalize, and monetize their assets. IoT is a complex structure of hardware, sensors, applications, analytics and devices that need to be able to communicate geographically and across all functions. Once the data is collected from numerous endpoints, the challenge then becomes converting it into actionable insight.
Consumer IoT applications provide data about the user that just doesn’t exist in traditional PC or mobile web applications. This rich data, or “context,” enables the highly personalized consumer experiences that characterize many consumer IoT apps. This same data is also providing brands with unprecedented insight into how their connected products are being used, while, at the same time, powering highly targeted engagement and marketing opportunities. In his session at @ThingsExpo, Nathan Treloar, President and COO of Bebaio, will explore examples of brands transforming their businesses by t...
With the Apple Watch making its way onto wrists all over the world, it’s only a matter of time before it becomes a staple in the workplace. In fact, Forrester reported that 68 percent of technology and business decision-makers characterize wearables as a top priority for 2015. Recognizing their business value early on, FinancialForce.com was the first to bring ERP to wearables, helping streamline communication across front and back office functions. In his session at @ThingsExpo, Kevin Roberts, GM of Platform at FinancialForce.com, will discuss the value of business applications on wearable ...
With the proliferation of connected devices underpinning new Internet of Things systems, Brandon Schulz, Director of Luxoft IoT – Retail, will be looking at the transformation of the retail customer experience in brick and mortar stores in his session at @ThingsExpo. Questions he will address include: Will beacons drop to the wayside like QR codes, or be a proximity-based profit driver? How will the customer experience change in stores of all types when everything can be instrumented and analyzed? As an area of investment, how might a retail company move towards an innovation methodolo...
The Internet of Things (IoT) is about the digitization of physical assets including sensors, devices, machines, gateways, and the network. It creates possibilities for significant value creation and new revenue generating business models via data democratization and ubiquitous analytics across IoT networks. The explosion of data in all forms in IoT requires a more robust and broader lens in order to enable smarter timely actions and better outcomes. Business operations become the key driver of IoT applications and projects. Business operations, IT, and data scientists need advanced analytics t...
While many app developers are comfortable building apps for the smartphone, there is a whole new world out there. In his session at @ThingsExpo, Narayan Sainaney, Co-founder and CTO of Mojio, will discuss how the business case for connected car apps is growing and, with open platform companies having already done the heavy lifting, there really is no barrier to entry.
Contrary to mainstream media attention, the multiple possibilities of how consumer IoT will transform our everyday lives aren’t the only angle of this headline-gaining trend. There’s a huge opportunity for “industrial IoT” and “Smart Cities” to impact the world in the same capacity – especially during critical situations. For example, a community water dam that needs to release water can leverage embedded critical communications logic to alert the appropriate individuals, on the right device, as soon as they are needed to take action.
SYS-CON Events announced today that HPM Networks will exhibit at the 17th International Cloud Expo®, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. For 20 years, HPM Networks has been integrating technology solutions that solve complex business challenges. HPM Networks has designed solutions for both SMB and enterprise customers throughout the San Francisco Bay Area.
SYS-CON Events announced today that Micron Technology, Inc., a global leader in advanced semiconductor systems, will exhibit at the 17th International Cloud Expo®, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. Micron’s broad portfolio of high-performance memory technologies – including DRAM, NAND and NOR Flash – is the basis for solid state drives, modules, multichip packages and other system solutions. Backed by more than 35 years of technology leadership, Micron's memory solutions enable the world's most innovative computing, consumer,...
As more intelligent IoT applications shift into gear, they’re merging into the ever-increasing traffic flow of the Internet. It won’t be long before we experience bottlenecks, as IoT traffic peaks during rush hours. Organizations that are unprepared will find themselves by the side of the road unable to cross back into the fast lane. As billions of new devices begin to communicate and exchange data – will your infrastructure be scalable enough to handle this new interconnected world?
Through WebRTC, audio and video communications are being embedded more easily than ever into applications, helping carriers, enterprises and independent software vendors deliver greater functionality to their end users. With today’s business world increasingly focused on outcomes, users’ growing calls for ease of use, and businesses craving smarter, tighter integration, what’s the next step in delivering a richer, more immersive experience? That richer, more fully integrated experience comes about through a Communications Platform as a Service which allows for messaging, screen sharing, video...
SYS-CON Events announced today that Pythian, a global IT services company specializing in helping companies leverage disruptive technologies to optimize revenue-generating systems, has been named “Bronze Sponsor” of SYS-CON's 17th Cloud Expo, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. Founded in 1997, Pythian is a global IT services company that helps companies compete by adopting disruptive technologies such as cloud, Big Data, advanced analytics, and DevOps to advance innovation and increase agility. Specializing in designing, imple...
In his session at @ThingsExpo, Lee Williams, a producer of the first smartphones and tablets, will talk about how he is now applying his experience in mobile technology to the design and development of the next generation of Environmental and Sustainability Services at ETwater. He will explain how M2M controllers work through wirelessly connected remote controls; and specifically delve into a retrofit option that reverse-engineers control codes of existing conventional controller systems so they don't have to be replaced and are instantly converted to become smart, connected devices.
SYS-CON Events announced today that IceWarp will exhibit at the 17th International Cloud Expo®, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. IceWarp, the leader of cloud and on-premise messaging, delivers secured email, chat, documents, conferencing and collaboration to today's mobile workforce, all in one unified interface
WebRTC has had a real tough three or four years, and so have those working with it. Only a few short years ago, the development world were excited about WebRTC and proclaiming how awesome it was. You might have played with the technology a couple of years ago, only to find the extra infrastructure requirements were painful to implement and poorly documented. This probably left a bitter taste in your mouth, especially when things went wrong.
Too often with compelling new technologies market participants become overly enamored with that attractiveness of the technology and neglect underlying business drivers. This tendency, what some call the “newest shiny object syndrome,” is understandable given that virtually all of us are heavily engaged in technology. But it is also mistaken. Without concrete business cases driving its deployment, IoT, like many other technologies before it, will fade into obscurity.
As more and more data is generated from a variety of connected devices, the need to get insights from this data and predict future behavior and trends is increasingly essential for businesses. Real-time stream processing is needed in a variety of different industries such as Manufacturing, Oil and Gas, Automobile, Finance, Online Retail, Smart Grids, and Healthcare. Azure Stream Analytics is a fully managed distributed stream computation service that provides low latency, scalable processing of streaming data in the cloud with an enterprise grade SLA. It features built-in integration with Azur...
Akana has announced the availability of the new Akana Healthcare Solution. The API-driven solution helps healthcare organizations accelerate their transition to being secure, digitally interoperable businesses. It leverages the Health Level Seven International Fast Healthcare Interoperability Resources (HL7 FHIR) standard to enable broader business use of medical data. Akana developed the Healthcare Solution in response to healthcare businesses that want to increase electronic, multi-device access to health records while reducing operating costs and complying with government regulations.
For IoT to grow as quickly as analyst firms’ project, a lot is going to fall on developers to quickly bring applications to market. But the lack of a standard development platform threatens to slow growth and make application development more time consuming and costly, much like we’ve seen in the mobile space. In his session at @ThingsExpo, Mike Weiner, Product Manager of the Omega DevCloud with KORE Telematics Inc., discussed the evolving requirements for developers as IoT matures and conducted a live demonstration of how quickly application development can happen when the need to comply wit...
The Internet of Everything (IoE) brings together people, process, data and things to make networked connections more relevant and valuable than ever before – transforming information into knowledge and knowledge into wisdom. IoE creates new capabilities, richer experiences, and unprecedented opportunities to improve business and government operations, decision making and mission support capabilities.