The so-called writings of Michael Gorsuch.

Introducing logtrend, or HTTP Request Trending With EventMachine

I have this problem of not wanting to demonstrate code unless it is prettied up. It probably stems from inexperience and the fact that I work with people much smarter than myself. I told myself that its time to start showing off as soon as something seems useful, so here you go. In this era of social coding, the sooner something is released, the sooner the feedback comes.

The following code demonstrates a simple way to trend the volume of HTTP requests in your environment.

Our Problem

I assume you run a web app of reasonable volume, and stream your access logs to disk. If you’ve ever been fire fighting and scanning logs looking for long running or more-frequent-than-usual requests, you know it is hard and often futile work.

One Possible Solution

Here is logtrend, my contribution to putting these days behind us.

The idea is simple, give logtrend a log to follow, define categories for different types of events, and let it automatically build graphs as you define them. You can then check up on your graphs when you things start to behave funny and see if traffic patterns have changed. For example, perhaps you have a customer abusing your API or hammering on an entry point that is especially expensive.

This can probably be better demonstrated via the following code:

require 'rubygems'
require 'logtrend'

# Invoke this to begin trending your data...
LogTrend::Base.run("/var/log/httpd-access") do |lt|  
  # Set new locations for our graphs and rrds.  defaults to '.'
  lt.rrd_dir = '/tmp/rrd'
  lt.graphs_dir = '/tmp/graphs'
  
  # Add some things to trend.  An RRD is built for each one of these items.
  # Each time we read a line from the log file, we pass it to the block.
  # If your block returns true, we count that as a hit.
  # Every minute, the RRD is updated with the hits for the previous period.
  lt.add_trend(:total) {|line| line.match /.*/}
  lt.add_trend(:fbod) {|line| line.match /fogbugz.com/}
  lt.add_trend(:kod) {|line| line.match /kilnhg.com/}
  lt.add_trend(:long) do |line|
    # Let us pretend that request time is in seconds
    # and is the last item on the log line
    request_time = line.split.last.to_i
    request_time > 10
  end
  
  # Build a graph displaying some of the items we are trending
  # Label it as :requests_per_minute
  lt.add_graph(:requests_per_minute) do |g|
    g.add_point :area, :total, "#333333"
    g.add_point :line, :fbod, "#0066cc"
    g.add_point :line, :kod, "#993333"
  end

  # Build a second graph for our long running queries
  lt.add_graph(:long_requests) do |g|
    g.add_point :area, :long, '#000000'
  end
end

Once started, logtrend will watch the logfile in real time, outputting graphs every minute.

Here is a small sample of our first graph (numbers and other sensitive data have been edited out):

The black background represents the total requests for the given service, and the yellow line represents all API requests coming in. Notice the spike? Fortunately it did not cause any service issues, but had it, we would have had actionable intelligence.

Further Notes

Installation should be a matter of sudo gem install logtrend. Feel free to fork and patch as you see fit. There are still some rough edges, but I have been using it in production for over a week without any issue.

The system is currently retaining a year’s worth of data, but only rendering graphs for the past 24 hours. Expect this to be improved in an upcoming release.

logtrend is built with the intention of staying simple, so it probably won’t ever be a good fit for trending more complicated data than already mentioned.

Credit

As always, this was built on the shoulders of giants. Abstraction upon abstraction made a difficult problem very easy to solve. The most obvious contributions come from the flexibility of Ruby + the awesomeness of EventMachine.

Jordan Sissel’s eventmachine-tail made watching the files a cinch, and Thiago Morello’s rrd-ffi made graph generation and data retention pain free.