Why Rule-Based Log Correlation Is Almost a Good Idea... (Part 5)

Performance tolls - why you can't correlate 100% of your logs

On top of the combinatorial explosion in the number of static correlation rules, it is impossible to correlate 100% of your logs: it is simply too expensive and impractical. Read on...

A correlation engine works really hard, even when dealing with a limited set of scenarios:

  • Each scenario requires lots of rules and exceptions, and most of these rules need to be interpreted further as dozens, if not hundreds, of simple checks and tests. For example, you may want to flag loops with a simple rule such as "IP Origin" = "IP Destination". If you have 1 000 logs, this means that for each log you need to do 1 000 tests. Imagine having a million logs, and therefore a trillion tests, which is not uncommon on a medium-sized infrastructure over a couple of days.
  • Each scenario requires state information to be kept and managed for hours, or even days. For example, you may want to be alerted if A happens, then B happens within 1 day, and then C happens within 1 day after that. This means that lots of state information needs to be kept over a 2-day span. The engine constantly monitors for A; as soon as A happens, it starts the clock and monitors for B; and if B happens within that day, a new countdown starts to look for C. All the while, it keeps monitoring for A so as to start a new A-then-B-then-C condition. If A happens a lot, and B also happens frequently after A, the engine will need to store lots of A-then-B state information while monitoring for all the required C events.

This requires:

  • Extremely powerful servers to run these rules and their corresponding "if then else" tests and checks when there are lots of logs.
  • Vast amounts of RAM to hold state information in memory and speed up processing while avoiding swapping to disk.

So the correlation engine needs to be fed very carefully: don't give it more than it can chew, or it will essentially run out of resources and die.

Knowing which logs need to be in scope is an important part of tuning a correlation engine.

No, you cannot ask your correlation solution to manage all of your logs. It's not designed for that. Managing only the most critical ones is already a daunting task for it.
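To make the state-keeping cost described above more concrete, here is a minimal Python sketch of an "A, then B within a window, then C within a window" tracker. It is an illustration only, not how any particular correlation product works: the class name `SequenceTracker`, the per-key grouping (for example per user or per host), and the choice to consume pending A entries when a B arrives are all assumptions made for this example.

```python
from collections import defaultdict

DAY = 24 * 3600  # window length in seconds, matching the "within 1 day" example


class SequenceTracker:
    """Hypothetical tracker for: A, then B within `window`, then C within `window`."""

    def __init__(self, window=DAY):
        self.window = window
        # Per key: timestamps of A events still waiting for a B.
        self.waiting_for_b = defaultdict(list)
        # Per key: timestamps of B events (that followed an A) still waiting for a C.
        self.waiting_for_c = defaultdict(list)

    def _expire(self, pending, now):
        # Drop partial matches whose countdown has run out.
        for key in list(pending):
            pending[key] = [t for t in pending[key] if now - t <= self.window]
            if not pending[key]:
                del pending[key]

    def feed(self, timestamp, key, kind):
        """Process one event in time order; return True if it completes A -> B -> C."""
        self._expire(self.waiting_for_b, timestamp)
        self._expire(self.waiting_for_c, timestamp)
        if kind == "A":
            self.waiting_for_b[key].append(timestamp)
        elif kind == "B" and key in self.waiting_for_b:
            # Simplification: every open A for this key advances to "waiting for C",
            # and a new countdown starts from the time of B.
            open_a = self.waiting_for_b.pop(key)
            self.waiting_for_c[key].extend([timestamp] * len(open_a))
        elif kind == "C" and key in self.waiting_for_c:
            del self.waiting_for_c[key]
            return True
        return False

    def state_size(self):
        """Number of partial matches currently held in memory."""
        return (sum(len(v) for v in self.waiting_for_b.values())
                + sum(len(v) for v in self.waiting_for_c.values()))
```

Every A that has not yet expired occupies an entry, and every A followed by a B occupies one until C arrives or the window closes, so `state_size()` grows with the frequency of A and B events. Note that even the expiry pass in this sketch touches all pending state on every event, which is itself part of the processing cost.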

Correlation load for one simple correlation rule over one hour
As an example, let's have a closer look at Attack Scenario 1 - Identity Theft that we elaborated on, and put a threshold of 1 hour for flagging Identity Theft.

Assumptions - at that time, we have:

o 5 logins/sec on average from local logins
o 5 logins/sec on average from VPN logins
o 1 000 events/sec on average across the total infrastructure
o Logs kept for a total of 1 week - for reporting etc.

Total data space of 1 000 * 3 600 * 24 * 7 = 604 800 000 events

For each local login event - which happens 5 times per second:

o Look in the VPN login events - for the past 3 600 seconds - and check whether that same person logged in through VPN

Total VPN login data space of 3 600 * 5 = 18 000 VPN login events

Total of 18 000 * 5 = 90 000 checks per second

Size of the 1-hour data space in which to perform the 90 000 checks:

o 1 000 events/sec * 3 600 = 3 600 000 events

So, for this one correlation rule:

  • Among the 604 million records total for the past week
  • Among the 3.6 million records for the past hour
  • The Correlation Engine needs to perform:

o Reads: 90 000 database reads and checks per second
o Writes: 1 000 record writes and inserts per second, at the same time

  • And at the same time, it must continue collecting, parsing and normalizing, reporting, alerting, "signing of logs", housekeeping, allowing users to log in and use the tool, and so on.
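The arithmetic above can be reproduced with a few lines of Python. This is only a sketch of the back-of-the-envelope calculation; the function name `hourly_rule_load` and its parameter names are made up for this example, and the default values are the assumptions stated in the text.

```python
SECONDS_PER_HOUR = 3600
SECONDS_PER_WEEK = 3600 * 24 * 7


def hourly_rule_load(local_logins_per_sec=5, vpn_logins_per_sec=5,
                     events_per_sec=1000, window_sec=SECONDS_PER_HOUR):
    """Naive load estimate for the one-hour Identity Theft rule."""
    # VPN logins that fall inside the sliding window.
    vpn_logins_in_window = vpn_logins_per_sec * window_sec
    return {
        # All events retained for one week.
        "events_kept_for_week": events_per_sec * SECONDS_PER_WEEK,
        "vpn_logins_in_window": vpn_logins_in_window,
        # Each local login is compared against every VPN login in the window.
        "checks_per_sec": vpn_logins_in_window * local_logins_per_sec,
        # All events of any kind inside the window.
        "events_in_window": events_per_sec * window_sec,
        # Every incoming event is also written to the store.
        "writes_per_sec": events_per_sec,
    }


if __name__ == "__main__":
    for name, value in hourly_rule_load().items():
        print(f"{name}: {value:,}")
```

Changing a single input, for instance doubling the login rates, shows how quickly the naive check count grows: it scales with the product of the two login rates and the window length.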

Correlation load for one complex global rule over one day
Imagine now a complex correlation rule that requires the engine to run a lookup 100 times per second, each time over the full 1-day sliding window.

The assumptions are then:

  • Full data space: same 1 week = 604 800 000 events
  • One-day sliding window data space: 1 000 * 3 600 * 24 = 86 400 000 events
  • For each correlation rule: number of tests = 100 times per second, look into each record in the 1-day sliding window data space, so 100 * 86 400 000 = 8 640 000 000 tests per second

That's more than 8 billion reads per second!!!
Sure, you can use tricks and shortcuts to avoid doing all 8.64 billion checks, but that gives an idea of the searching power required... for this one scenario!!!
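As one illustration of the kind of shortcut mentioned above, the sketch below applies a common indexing technique to the simpler one-hour Identity Theft rule: instead of scanning every VPN login in the window for each local login, it keeps a dictionary keyed by user. This is an assumption about what such a trick could look like, not a description of how any specific engine is implemented; `VpnLoginIndex` and `naive_check` are hypothetical names.

```python
WINDOW = 3600  # one-hour threshold, in seconds


class VpnLoginIndex:
    """Hypothetical per-user index of the most recent VPN login."""

    def __init__(self, window=WINDOW):
        self.window = window
        self.last_vpn_login = {}  # user -> timestamp of most recent VPN login

    def record_vpn_login(self, user, timestamp):
        self.last_vpn_login[user] = timestamp

    def vpn_login_within_window(self, user, now):
        # One dictionary lookup instead of a scan over the whole window.
        last = self.last_vpn_login.get(user)
        return last is not None and 0 <= now - last <= self.window


def naive_check(vpn_logins, user, now, window=WINDOW):
    """Scan every (user, timestamp) VPN login, as the naive count assumes."""
    return any(u == user and 0 <= now - t <= window for u, t in vpn_logins)
```

The index trades memory for time: it must be kept up to date on every VPN login and pruned of stale users, and it only helps rules that boil down to an exact match on a key. Rules that compare arbitrary fields, or that need every matching event rather than the latest one, do not reduce so easily.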

Imagine having to enrich this one correlation rule with geolocation information, or somehow adding a dynamic dimension to it.

Imagine having hundreds or thousands of correlation rules. What would be the impact on the number of database reads and on load?

This is just not practical, and you cannot always solve this problem by throwing more hardware at it.

Did you know that APT attacks can last weeks or months? Stay tuned for what this means for static rule-based correlation...

More Stories By Gorka Sadowski

Gorka is a natural born entrepreneur with a deep understanding of Technology, IT Security and how these create value in the Marketplace. He is today offering innovative European startups the opportunity to benefit from the Silicon Valley ecosystem accelerators. Gorka spent the last 20 years initiating, building and growing businesses that provide technology solutions to the Industry. From General Manager Spain, Italy and Portugal for LogLogic, defining Next Generation Log Management and Security Forensics, to Director Unisys France, bringing Cloud Security service offerings to the market, from Director of Emerging Technologies at NetScreen, defining Next Generation Firewall, to Director of Performance Engineering at INS, removing WAN and Internet bottlenecks, Gorka has always been involved in innovative Technology and IT Security solutions, creating successful Business Units within established Groups and helping launch breakthrough startups such as KOLA Kids OnLine America, a social network for safe computing for children, SourceFire, a leading network security solution provider, or Ibixis, a boutique European business accelerator.
