Welcome!

Java IoT Authors: Zakia Bouachraoui, Elizabeth White, Liz McMillan, Pat Romanski, Yeshim Deniz

Related Topics: @DevOpsSummit, Java IoT

@DevOpsSummit: Blog Feed Post

Logagent-js – Alternative to Logstash, Filebeat, Fluentd, Rsyslog? | @DevOpsSummit #DevOps

What is the easiest way to parse, ship and analyze my web server logs?

Logagent-js - Alternative to logstash, filebeat, fluentd, rsyslog?
By Stefan Thies

What is the easiest way to parse, ship and analyze my web server logs? You should know that I’m a Node.js fan boy and not very thrilled with the idea of running a heavy process like Logstash on my low memory server, hosting my private Ghost Blog. I looked into Filebeat, a very light-weight log forwarder written in Go with an impressively low memory footprint of only a few MB, but Filebeat ships only unparsed log lines to Elasticsearch.  In other words, it sort of still needs Logstash to parse web server logs, which include many fields and numeric values!  Of course, structuring logs is essential for analytics.  The setup for rsyslog with elasticsearch and regex parsers is a bit more time consuming but very efficient compared to Logstash. Are there any better alternatives? Having a quick setup, well structured logs and a low memory footprint?

Guess what?  There is! Meet logagent-js – a log parser and shipper with log patterns for a number of popular log formats – from various Docker Images including Nginx, Apache, Linux and Mac system logs, to Elasticsearch, Redis, Solr, MongoDB and more. Logagent-js detects the log format automatically using the built-in pattern definitions (and also lets you provide your own, custom patterns).

Logagent-js includes a command line tool with default settings for Logsene as the Elasticsearch backend for storing the shipped logs.  Logsene is compatible with the Elasticsearch API, but can do much more, such as role-based access control, account sharing for DevOps teams,  ad-hoc charts in the Logsene UI, alerts on logs, and finally it integrates Kibana to ease the life of everybody dealing with log data!

Now let’s see what I run on my private blog site: logagent-js as single command to tail, parse and ship logs, all with less than 40 MB of RAM. Compare that to Logstash, which would not even start with just 40 MB of JVM heap.  Logagent-js can be installed as a command line tool with npm, which is included in Node.js (>0.12):

npm i logagent-js -g

Logagent-js needs only the Logsene Token as a parameter to ship logs to Logsene. When running it as a background process or daemon, it makes sense to limit the Node.js memory with  –max-old-space-size=60 to 100 MB, just in case.  Without such setting Node.js could consume more memory to improve performance in a long running process:

node --max-old-space-size=60 /usr/local/bin/logagent -s -t your-logsene-token-here logs/access_log &

You can also run logagent-js as upstart or systemd service, of course.

A few seconds after you start it you’ll see all your logs, parsed and structured into fields, with correct timestamps, numeric fields, etc., all without any additional configuration! A real gift and a huge time time saver for busy ops people!

Logsene-create-chart

Charting Logs
Next, let’s create some fancy charts with data from our logs. Logsene has ad-hoc charting functions (look for the little blue chart icons in the above screenshot) that let you draw Pie, Area, Line, Spline, Bar, and other types of charts. Logsene is smart and automatically provides chooses Pie charts to display distinct values and bar/line charts for numeric values over time.

Bildschirmfoto 2016-01-20 um 10.11.37

In the above screenshot we see the top viewed pages and the distribution of HTTP status codes.  We were able to generate these charts literally with just a few mouse clicks. The charts use the current query, so we could search for specific URLs and exclude e.g. images, stylesheets or traffic from robots using Logsene’s query language e.g. ‘NOT css AND NOT jpg AND NOT png AND NOT seoscanners’ or, more simply: -css -jpg -png -seoscanners).

Kibana Dashboards
If you prefer Kibana dashboards then you’ll need more complex Elasticsearch queries to remove Stylesheets, JavaScripts or other URLs from the top list. Open Kibana 4 in the Logsene UI and create a visualistaion to filter specific URLs – a ‘Terms Query’ can use regular expressions to Exclude and Include Filters.

Bildschirmfoto 2016-01-20 um 10.21.29

This visualization could be saved and added to a Kibana dashboard. If you know Kibana this takes a few minutes per visualization.  The result is a stored dashboard that could be shared with colleagues, which might not know how to create such dashboards.

Alert Me
The final thing I usually do is define alert queries e.g. to get notified about a growing number of HTTP error messages. For my private blog I use e-mail notifications, but Logsene integrates well with PagerDuty, HipChat, Slack or arbitrary WebHooks.

There are even more options like using Grafana with Logsene, or shipping logs automatically when using Docker.

Finally, a few more words about  logagent-js, which I consider a ‘swiss army knife’ for logs.  It integrates seamlessly with Logsene, while at the same time it can also work with other log destinations. It provides what I believe is a good compromise in terms of performance and setup time – I’d say it’s somewhere between rsyslog and logstash.

All tools for log processing require memory for this processing, but looking at the initial memory usage after starting the tools gives you an impression of the minimum resource usage.  Here are some numbers taking from my server:

Contributions to the pattern library for even more log formats are welcome – we are happy to help with additional log formats or input sources beside the existing inputs (standard input, file, Heroku, CloudFoundry and syslog UDP). Feel free to contact me @seti321 or @sematext to get up and running with your special setup!

If you don’t want to run and manage your own Elasticsearch cluster but would like to use Kibana for log and data analysis, then give Logsene a quick try by registering here – we do all the backend heavy lifting so you can focus on what you want to get out of your data and not on infrastructure.  There’s no commitment and no credit card required.

We are happy to answer questions or receive feedback – please drop us a line or get us @sematext.

Filed under: Logging Tagged: elasticsearch, ELK, log analytics, log management, logging, logsene

Read the original blog entry...

More Stories By Sematext Blog

Sematext is a globally distributed organization that builds innovative Cloud and On Premises solutions for performance monitoring, alerting and anomaly detection (SPM), log management and analytics (Logsene), and search analytics (SSA). We also provide Search and Big Data consulting services and offer 24/7 production support for Solr and Elasticsearch.

IoT & Smart Cities Stories
Nicolas Fierro is CEO of MIMIR Blockchain Solutions. He is a programmer, technologist, and operations dev who has worked with Ethereum and blockchain since 2014. His knowledge in blockchain dates to when he performed dev ops services to the Ethereum Foundation as one the privileged few developers to work with the original core team in Switzerland.
René Bostic is the Technical VP of the IBM Cloud Unit in North America. Enjoying her career with IBM during the modern millennial technological era, she is an expert in cloud computing, DevOps and emerging cloud technologies such as Blockchain. Her strengths and core competencies include a proven record of accomplishments in consensus building at all levels to assess, plan, and implement enterprise and cloud computing solutions. René is a member of the Society of Women Engineers (SWE) and a m...
Andrew Keys is Co-Founder of ConsenSys Enterprise. He comes to ConsenSys Enterprise with capital markets, technology and entrepreneurial experience. Previously, he worked for UBS investment bank in equities analysis. Later, he was responsible for the creation and distribution of life settlement products to hedge funds and investment banks. After, he co-founded a revenue cycle management company where he learned about Bitcoin and eventually Ethereal. Andrew's role at ConsenSys Enterprise is a mul...
Whenever a new technology hits the high points of hype, everyone starts talking about it like it will solve all their business problems. Blockchain is one of those technologies. According to Gartner's latest report on the hype cycle of emerging technologies, blockchain has just passed the peak of their hype cycle curve. If you read the news articles about it, one would think it has taken over the technology world. No disruptive technology is without its challenges and potential impediments t...
If a machine can invent, does this mean the end of the patent system as we know it? The patent system, both in the US and Europe, allows companies to protect their inventions and helps foster innovation. However, Artificial Intelligence (AI) could be set to disrupt the patent system as we know it. This talk will examine how AI may change the patent landscape in the years to come. Furthermore, ways in which companies can best protect their AI related inventions will be examined from both a US and...
In his general session at 19th Cloud Expo, Manish Dixit, VP of Product and Engineering at Dice, discussed how Dice leverages data insights and tools to help both tech professionals and recruiters better understand how skills relate to each other and which skills are in high demand using interactive visualizations and salary indicator tools to maximize earning potential. Manish Dixit is VP of Product and Engineering at Dice. As the leader of the Product, Engineering and Data Sciences team at D...
Bill Schmarzo, Tech Chair of "Big Data | Analytics" of upcoming CloudEXPO | DXWorldEXPO New York (November 12-13, 2018, New York City) today announced the outline and schedule of the track. "The track has been designed in experience/degree order," said Schmarzo. "So, that folks who attend the entire track can leave the conference with some of the skills necessary to get their work done when they get back to their offices. It actually ties back to some work that I'm doing at the University of San...
When talking IoT we often focus on the devices, the sensors, the hardware itself. The new smart appliances, the new smart or self-driving cars (which are amalgamations of many ‘things'). When we are looking at the world of IoT, we should take a step back, look at the big picture. What value are these devices providing. IoT is not about the devices, its about the data consumed and generated. The devices are tools, mechanisms, conduits. This paper discusses the considerations when dealing with the...
Bill Schmarzo, author of "Big Data: Understanding How Data Powers Big Business" and "Big Data MBA: Driving Business Strategies with Data Science," is responsible for setting the strategy and defining the Big Data service offerings and capabilities for EMC Global Services Big Data Practice. As the CTO for the Big Data Practice, he is responsible for working with organizations to help them identify where and how to start their big data journeys. He's written several white papers, is an avid blogge...
Dynatrace is an application performance management software company with products for the information technology departments and digital business owners of medium and large businesses. Building the Future of Monitoring with Artificial Intelligence. Today we can collect lots and lots of performance data. We build beautiful dashboards and even have fancy query languages to access and transform the data. Still performance data is a secret language only a couple of people understand. The more busine...