
Carefree, Scalable Infrastructure by @HoardingInfo | @DevOpsSummit [#DevOps]

Using log analysis to monitor a scaling environment

Article by Chris Riley, for Logentries

There is a whole lot of talk about this DevOps thing: pushing teams to move faster, increasing the focus on results, and doing so with ever better quality. But the elephant in the room is how we go from immutable infrastructure to scalable environments that move with the code, not against it. Making infrastructure move at the speed of code takes more than orchestration tools. Operations needs to be confident that it can let the meta-application scale infrastructure on demand without ending up with a huge mess of tangled servers.

Virtual machines (VMs) help make infrastructure more flexible. Machines can now be treated much like any file, albeit a very large one: they can be moved around and copied on demand. And if you think about it, in this highly flexible model VMs should have very short life spans. They should live and die based on the current application load. Or, for complete end-to-end testing, a full set of infrastructure should be provisioned for each test run and then deleted when the test completes. When a developer asks for a machine, the response should be "you will have it in minutes," not days. Even better: "you can do it yourself," without burdening IT operations with every request, while still maintaining oversight of infrastructure. But that vision is not how it usually goes.

The reality is that most of your VMs have been up and running for months, and they are not as agile as we want them to be.

Are you a server hugger?
As we all know, we are creatures of habit. We are accustomed to thinking of servers as the physical things they used to be, and thus as something you set and forget. And if you are not running your own datacenter and/or do not control the virtualization layer, you may have limited control over the performance of copying, moving, and spinning up VMs, so you do not do it often. This can be solved with a well-managed private cloud, or a high-powered public cloud designed for exactly this, like spot instances on AWS.

Scalable Infrastructure with Log Analysis

But the other huge, and perhaps most pressing, reason we don't free our VMs is that we are afraid of server sprawl. Server sprawl is a very real problem. It drives up costs; if you are using a cloud provider, it makes it hard to know which servers are handling which workloads; and it can simply waste a tremendous amount of resources.

How many rogue VMs are currently in your environment?

In any case, most of us want to avoid this situation as much as we can. An environment scaled free-rein by some meta-level orchestration layer is a bit frightening, and rightly so.

The trick to making it all happen is log analysis.

Herd Servers with Log Analysis
Normally you think of log analysis as something added to VMs after they are created, in order to monitor the logs each VM produces. But it also gives you the option of letting your server farm run free, without the fear of losing track of it.

And the method is quite simple: create a gold-master VM, or an orchestration script, with your log agent pre-installed and configured with the appropriate settings. When the configuration changes, such as updates and installs, make that change only on the gold-master scripts or VMs, never in production. That change may or may not trigger the replacement of machines already provisioned.
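The gold-master step above might be sketched roughly like this. The config fields, token mechanism, and endpoint are illustrative assumptions, not any particular log vendor's format:

```python
# Hypothetical sketch: render a per-VM log-agent config from a gold-master
# template at provisioning time. All field names here are invented for
# illustration; a real agent would have its own config format.
GOLD_MASTER_AGENT_TEMPLATE = {
    "endpoint": "logs.example.com:443",
    "token": "{PROVISIONING_TOKEN}",   # injected per environment
    "hostname": "{VM_NAME}",           # filled in at first boot
    "tags": ["auto-provisioned"],
}

def render_agent_config(template: dict, vm_name: str, token: str) -> dict:
    """Substitute per-VM values into the gold-master agent template."""
    rendered = dict(template)  # never mutate the gold master itself
    rendered["hostname"] = template["hostname"].replace("{VM_NAME}", vm_name)
    rendered["token"] = template["token"].replace("{PROVISIONING_TOKEN}", token)
    return rendered

config = render_agent_config(
    GOLD_MASTER_AGENT_TEMPLATE, "web-useast1-20150301-prod", "abc123")
print(config["hostname"])  # web-useast1-20150301-prod
```

The point of the sketch: per-VM values are filled in at provisioning time, while the template itself changes only when the gold master does.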

All your VMs will be provisioned from this gold master, and when done correctly they will automatically be linked to your log analysis platform. Now here is the gotcha: naming conventions. Before taking this approach you need a strong, universal, and easy-to-understand naming convention.

This is important for easier management, but also for the ability to remote into machines without much guessing. And if you identify a machine in a log file, you want to know from the name alone its purpose, location, creation date, and workload type. I'm not suggesting your names get absurdly long, like some file names; only that a name tells you enough about an isolated machine that you can take the next step.
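As a hypothetical illustration, a convention encoding those four attributes could look like this. The field order and separators are assumptions, not a standard:

```python
from datetime import date

# Illustrative naming convention: purpose-location-date-workload.
def make_vm_name(purpose: str, location: str, created: date, workload: str) -> str:
    """Build a machine name that encodes the attributes discussed above."""
    return f"{purpose}-{location}-{created:%Y%m%d}-{workload}"

def parse_vm_name(name: str) -> dict:
    """Recover the attributes from a name seen in a log file."""
    purpose, location, created, workload = name.split("-")
    return {"purpose": purpose, "location": location,
            "created": created, "workload": workload}

name = make_vm_name("web", "useast1", date(2015, 3, 1), "prod")
print(name)                             # web-useast1-20150301-prod
print(parse_vm_name(name)["workload"])  # prod
```

Because the name is machine-parseable, the same convention serves humans remoting into a box and scripts grouping log streams by workload.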

As part of provisioning, this will require something like Sysprep on Windows, or an orchestration tool on Linux, to make the necessary dynamic changes to each machine's admin accounts, network configuration, and machine and host names.

Here is where log analysis comes in to help again. You can actually take the log files from your virtualization layer to associate server provisioning information with individual servers. This way, even if you did not proactively create a good information architecture for your VMs, you can associate machines with their logs and be able to ask questions about the details, respond to an issue, or take an action on it.
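A rough sketch of that association, with invented log formats, joining hypervisor provisioning events to a VM's own log entries by machine name:

```python
# Sketch: associate provisioning events from the virtualization layer's
# logs with each VM's application log stream, keyed on the VM name.
# Both log shapes below are invented for illustration.
provisioning_log = [
    {"vm": "web-useast1-20150301-prod", "event": "created", "host": "esx-07"},
    {"vm": "db-useast1-20150302-prod", "event": "created", "host": "esx-02"},
]
app_logs = [
    {"vm": "web-useast1-20150301-prod", "msg": "500 error on /checkout"},
]

def annotate(app_entries: list, prov_entries: list) -> list:
    """Attach provisioning metadata to each application log entry."""
    by_vm = {e["vm"]: e for e in prov_entries}
    return [{**entry, "provisioned_on": by_vm.get(entry["vm"], {}).get("host")}
            for entry in app_entries]

annotated = annotate(app_logs, provisioning_log)
print(annotated[0]["provisioned_on"])  # esx-07
```

Even without a perfect naming scheme, this kind of join lets you answer "where did this machine come from?" straight from the logs.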

For the Advanced
More advanced implementations will likely have custom services on the VMs sending more detailed logs to the analysis platform. And many organizations will consider using Docker as the container layer instead of full virtualization.

The other advanced scenario, or possibly challenge, is keeping track of machine IPs. This is especially important if your model allows front-end and back-end developers to access machines in the ever-changing farm; they will need some way to identify a machine's IP quickly.

This requires some smarts at the network layer. If you are using virtualization like VMware, it is possible to snapshot multi-tier environments, including the network layer, and make them portable as well. That way all IPs are maintained but contained within individual environments, with each VLAN isolated from all others, so all you need to know is the environment name. However, this complicates any configuration changes to the gold master.

Or you can make sure your orchestration or configuration-management scripts treat IP allocations as variable, but record all the details in your log platform, as suggested in this post. There are also some new orchestration-as-a-service tools that will do variable IP allocation for you, and this is an area that is sure to improve.
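The record-it-in-the-log-platform approach could be sketched like this. The event format and helper names are assumptions:

```python
from typing import Optional

# Sketch: every time orchestration assigns an IP, emit a structured log
# entry; the latest entry per machine name is its current address.
allocation_log: list = []

def record_allocation(vm_name: str, ip: str) -> None:
    """Log an IP allocation as a structured event."""
    allocation_log.append({"event": "ip_allocated", "vm": vm_name, "ip": ip})

def current_ip(vm_name: str) -> Optional[str]:
    """Latest allocation wins, since IPs vary across re-provisions."""
    for entry in reversed(allocation_log):
        if entry["vm"] == vm_name:
            return entry["ip"]
    return None

record_allocation("web-useast1-20150301-prod", "10.0.1.15")
record_allocation("web-useast1-20150301-prod", "10.0.1.42")  # re-provisioned
print(current_ip("web-useast1-20150301-prod"))  # 10.0.1.42
```

A developer who knows only the machine's name can then query the log platform for its current IP, no matter how many times the farm has churned.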

It is not terribly easy to get this engine running, and the most complex part is planning for change management, not the log analysis. For example, do you allow outdated VMs to keep running in the datacenter, or do you automatically kill them and replace them with new ones based on an updated gold master?
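One possible replacement policy, sketched with an assumed gold-master versioning scheme and an arbitrary grace window:

```python
# Sketch of one change-management policy: replace any VM whose gold-master
# version lags the current one by more than a grace window. The versioning
# scheme and threshold are assumptions, not a prescription.
CURRENT_GOLD_MASTER = 7
MAX_VERSION_LAG = 1  # allow VMs one version behind before forced replacement

def needs_replacement(vm: dict) -> bool:
    """True if this VM was built from a gold master that is too old."""
    return CURRENT_GOLD_MASTER - vm["gold_master_version"] > MAX_VERSION_LAG

fleet = [
    {"name": "web-useast1-20150301-prod", "gold_master_version": 7},
    {"name": "web-useast1-20150215-prod", "gold_master_version": 5},
]
stale = [vm["name"] for vm in fleet if needs_replacement(vm)]
print(stale)  # ['web-useast1-20150215-prod']
```

Whatever the policy, encoding it as an explicit rule like this keeps the kill-and-replace decision out of individual heads and in the automation.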

Do it now or do it later, but unshackling your infrastructure is a must if you want to move further and further into the DevOps framework. And the way to make sure highly scalable environments do not spin out of everyone's control is to build in log analysis. Log analysis that automatically attaches itself to every machine provisioned helps everyone see the entire environment without necessarily touching a single VM. It is not easy setting your infrastructure free, but when you do, the benefits of better application performance, better tools for developers, and better control over your "datacenter" make it worth it. And log analysis is the way to ease the tension.

