Log Management on SmartOS featuring DTrace and Node.js

Machine-generated log data is a rich source of information that is used more and more frequently outside of the traditional scenarios of "developer debugging." Today, log data is used across the business for production monitoring, product usage tracking, application performance management, mobile analytics, and more.

Logentries surveyed its community of 25,000+ users on how they use logs and found some very interesting use cases (check out the full results here).

There are a few reasons why we believe logs have suddenly become a very popular choice for such use cases:

  • Real Time Log Management as a Service: Log management solutions today can consume high volumes of log data in real time, parse metrics out of the events, perform statistical analysis on them, and visualize the results to show trends and valuable statistics. You can now do this as a service, so there is no need to fire up large numbers of instances to crunch through your data.

  • Logs are easy to create: Adding logs to your system is dead simple. You do not need to spend weeks on integration, as can be the case with expensive and specialized BI solutions. Simply add the logs to your code in a well-structured format, and make sure to capture the data relevant to your use case. For example, if you want to track user behavior in your web app, it's as simple as installing a JS library in your client code to track which buttons or web pages users are clicking on. Every time a user performs an action, a log event is created capturing details such as 'timestamp', 'user id', 'page clicked', 'response time', 'context', etc.

  • Logs are decoupled from your system: One of the beautiful simplicities of using your logs as data is that your system does not end up being tightly coupled with your APM tool or web analytics solution. What do I mean by tightly coupled? If you are using an APM tool, for example, you generally have to integrate its monitoring libraries or agents into your system so that it is instrumented and the APM tool can start to capture system traces, performance metrics and resource usage information. This can not only impact your application performance, but also means that your application is essentially locked into using this solution unless you are prepared to rip the library out of your application code. With logs, this isn't the case. You simply log your events to disk or to syslog, for example, and then use a log management solution to extract and visualize the important data. If you decide you don't like your logging provider, you can simply send your logs to another service or solution, without the need to rip out any libraries or interfere with your application source code.

  • Logs can visualize whatever data you add into them: With log data you are only really limited by your imagination: what you use your logs for depends on what you put into them. Internally at Logentries, a few things we use our logs for include tracking user sign-ups and feature usage, identifying performance threshold breaches, understanding system resource usage, tracking marketing campaigns via pixel tracking, and visualizing total $$$ sales per day. The list goes on.

  • Logs can be generated from every component and device in your stack: Logs can be used to give a complete end-to-end view of your system and are generally produced by every component in all layers of your stack. I recently wrote a blog post on how logs are particularly useful when trying to get visibility into cloud components that can otherwise be considered black boxes. In short, the post outlines how cloud services that you cannot instrument with traditional APM solutions produce log data that can be used to get visibility into those cloud components and services. Furthermore, you can now also capture logs from your users' web browsers or mobile devices in real time, giving true end-to-end visibility of your application from the client device, through your middleware components and all the way to the database, so that you can also track events through complex stacks.

  • Logs maintain the evidence: Finally, and most important of all in my opinion, dashboards based on log data have an important property that does not exist when creating dashboards with many other approaches: your logs maintain the evidence! This means that if there is a spike in the number of sign-ups or an increase in the number of customers using a particular feature, you can quickly validate what caused that change. Validating your data can be particularly painstaking when using APM, web analytics tools or home-grown metrics dashboards, as it generally involves speaking with the developer responsible for the integration to make sure that the data is correct. With logs, you simply drill down into the individual log events to quickly investigate what caused the change.
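To make the "easy to create" and "maintain the evidence" points concrete, here is a minimal sketch in plain Node.js. The helper names (logEvent, countByPage) and the field names are hypothetical and are not part of any Logentries library; the sketch only illustrates writing well-structured events and keeping the raw events behind each aggregated count.

```javascript
// Hypothetical helper: builds one structured log line for a user action.
function logEvent(userId, pageClicked, responseTimeMs, context) {
  var event = {
    timestamp: new Date().toISOString(),
    userId: userId,
    pageClicked: pageClicked,
    responseTimeMs: responseTimeMs,
    context: context
  };
  return JSON.stringify(event);
}

// Hypothetical aggregation: counts events per page but keeps the raw lines,
// so any number on a dashboard can be traced back to the events behind it.
function countByPage(lines) {
  var counts = {};
  lines.forEach(function (line) {
    var page = JSON.parse(line).pageClicked;
    if (!counts[page]) counts[page] = { count: 0, events: [] };
    counts[page].count += 1;
    counts[page].events.push(line);
  });
  return counts;
}

var lines = [
  logEvent('u1', '/signup', 120, 'web'),
  logEvent('u2', '/signup', 340, 'web'),
  logEvent('u1', '/pricing', 95, 'mobile')
];

var summary = countByPage(lines);
console.log(summary['/signup'].count);  // 2
console.log(summary['/signup'].events); // the raw events behind that count
```

In a real deployment the lines would be written to disk or syslog and the aggregation would be done by the log management service; the in-memory version above is only meant to show the shape of the data.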

Given the above, the question is how you can easily generate and collect useful log data from your Joyent infrastructure. Below we give you some quick tips on getting log data from the different layers of your Joyent environment, from the OS layer all the way up to your application.

Configuring Logentries on SmartOS

SmartOS ships with Rsyslog, so you can easily forward your logs to a third-party logging service or centralized syslog server. If you don't want to delve into the world of syslog, you can simply use a log collection agent (the Logentries agent runs out of the box on SmartOS) to grab these logs and then analyze them, set up alerts, extract metrics and pin them to a centralized dashboard.

Using Rsyslog or the Logentries SmartOS collector will allow you to collect any log data already being produced on your system, including web server logs, database logs, or logs from any other system process.

Logentries DTrace Support

In some cases you might want a bit more insight than the existing logs on your system provide. Enter DTrace… or, to be more accurate, enter logentries-dtrace, a simple script that allows you to stream your DTrace metrics into Logentries. This can be a nice way to correlate your system performance or resource usage information with your 'standard' log information, i.e. the logs mentioned above that are collected from the server via either Rsyslog or a collector agent.

Capturing Slow Request Response Time via DTrace

This can be useful if you want to cross-correlate errors or exceptions from a system process log with resource usage data (e.g. out of memory) as captured by DTrace. Sending all this information to a logging service allows you to build metrics dashboards based on error rates or resource usage stats, so that you can see where spikes occur. The beauty of logs, however, is that you can always drill down from any dashboard to investigate the exact moment or event that caused the issue, as well as what was going on in the system beforehand.

Raw GC Time
GC Time over Time in Nanoseconds

With DTrace you can profile your system for events that are important for understanding its characteristics, such as garbage collection, memory usage, latency in network requests and even user-defined probes.

You should consider DTrace when you want logging with minimal impact on your application, as the kernel-level aggregation of events can mean a smaller impact than application logging that is done in userspace.

How Logentries-dtrace works

Using logentries-dtrace is as simple as:

$ npm install logentries-dtrace --save

var logdtrace = require("logentries-dtrace")('LOGENTRIES_TOKEN', 'trace.d', collectioninterval)

There is even a set of example D-Scripts to help you get up and running.

See https://github.com/No9/logentries-dtrace

Logentries Node.js support

Logentries has been a long-term supporter of Node.js and its community, providing support not only for DTrace but also for Node.js application logging in general.

If you have access to your application source code, you can also use logs to track app-level performance and usage metrics.

App Features Used Over Past 5 Min

In this area Logentries has provided support for Winston. Winston is a multi-transport logging library, where a transport is a storage device or logging service. Logging via Winston to Logentries can be achieved using the Logentries-node library that can be registered as a transport for Winston.

If you prefer Bunyan for application-level logging, we also provide a generic stream interface to Logentries called logentries-stream, which enables you to pass data from your Bunyan logs, or from any other stream data source, directly to Logentries.
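The idea of a generic stream interface can be sketched with Node's built-in stream module. This is not the logentries-stream API: createCollectorStream is a hypothetical stand-in that collects records in memory where a real forwarding stream would send them over the network. It assumes the logger writes one JSON record per line, as Bunyan does.

```javascript
var stream = require('stream');

// Hypothetical stand-in for a log-forwarding stream: parses each
// newline-delimited JSON record and stores it in the given array
// instead of shipping it to a logging service.
function createCollectorStream(collected) {
  return new stream.Writable({
    write: function (chunk, encoding, callback) {
      chunk.toString().split('\n').forEach(function (line) {
        if (line.trim() !== '') collected.push(JSON.parse(line));
      });
      callback();
    }
  });
}

var collected = [];
var out = createCollectorStream(collected);

// Any source that writes JSON lines to a stream can be plugged in here.
out.write(JSON.stringify({ level: 30, msg: 'user signed up', userId: 'u1' }) + '\n');
out.write(JSON.stringify({ level: 50, msg: 'payment failed', userId: 'u2' }) + '\n');
out.end();

console.log(collected.map(function (rec) { return rec.msg; }));
```

Because the destination is just a writable stream, the application code does not change if the stream is later swapped for one that forwards to a different service, which is the decoupling argument made earlier in this post.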

In summary, logs give you visibility across all layers of your software stack, from the OS all the way up to the application tier. And if you take advantage of some logging best practices you can very easily use them to give you visibility into everything from system security to performance as well as product or application usage patterns. No need for complex monitoring solutions or expensive BI tools - get back to basics with your logs and you can find almost everything you need.



Post written by Trevor Parsons, Co-founder & Chief Scientist, Logentries and Anton Whalley