
Logging in a Docker Hosting World

Docker is reinventing the way we package and deploy our applications, bringing new challenges to hosting. In this blog post I will provide a recipe for logging your Docker-packaged applications.

by nick.schuch

Goals

Going into this I had 2 major goals:

  • Zero remote console logins - What is the number one reason for a developer or sysadmin to log in to the console of a remote environment? To inspect the logs. If we expose our logs via an API, we immediately cut out the vast majority of our remote logins.
  • Aggregation - I personally despise having to log into multiple hosts to inspect the logs for a clustered service. It can leave you "grumpy" at the situation well before you start on the task you actually came to do.
    If all our logs are in one place, we never have to access multiple machines to analyse the data.

Components

The following are the components which make up a standard logging pipeline.

Storage

I have started with the core piece of the puzzle: the storage.
This component is in charge of:

  • Receiving the logs
  • Reliably storing them for a retention period (e.g. 3 months)
  • Exposing them via an interface (API / UI)

Some open source options:

  • Logstash (more specifically the ELK stack)
  • Graylog

Some services where you can get this right out of the box:

  • Loggly
  • AWS CloudWatch Logs
  • Papertrail

These services don't require Docker-based container hosting. You can use them right now on your existing infrastructure.

However, they do become a key component when hosting Docker-based infrastructure because we are constantly rolling out new containers in place of the old ones.

Collector

This is an extremely simple service tasked with collecting all the logs and pushing them to the remote storage service.

Simple doesn't mean unimportant, though. I highly recommend you set up monitoring for this component.
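
To make the collector's job concrete, here is a minimal sketch in Go (not one of the components from this post). It assumes log lines arrive on STDIN, for example piped from `docker logs -f <container>`, and that the storage service accepts plain HTTP POSTs at a hypothetical https://logs.example.com/ingest endpoint.

```go
package main

import (
	"bufio"
	"bytes"
	"log"
	"net/http"
	"os"
)

// endpoint is a placeholder for whatever your storage component exposes
// (an ELK HTTP input, Loggly, Papertrail, etc.).
const endpoint = "https://logs.example.com/ingest"

func main() {
	// Read log lines from STDIN, e.g. piped from `docker logs -f <container>`.
	scanner := bufio.NewScanner(os.Stdin)
	for scanner.Scan() {
		// Ship each line to the remote storage service. A production
		// collector would batch, retry and report failures to monitoring.
		resp, err := http.Post(endpoint, "text/plain", bytes.NewReader(scanner.Bytes()))
		if err != nil {
			log.Printf("failed to ship log line: %v", err)
			continue
		}
		resp.Body.Close()
	}
	if err := scanner.Err(); err != nil {
		log.Fatalf("reading logs: %v", err)
	}
}
```

In practice you would lean on an existing shipper (a Logstash forwarder, Fluentd, the CloudWatch Logs agent and so on) rather than hand-rolling one, but the shape of the job is the same.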

Visualiser

In most cases the "storage" component provides its own interface for interacting with the logged data.

In addition, we can write applications that consume the "storage" component's API and provide a command line experience.
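
As a rough illustration (not one of the tools mentioned in this post), here is a tiny Go sketch of such a command line experience. It assumes an ELK-style storage backend with Elasticsearch on localhost, a "logs-*" index pattern and container_name / @timestamp fields; all of those names are assumptions for the example.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
)

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: logs <container_name>")
		os.Exit(1)
	}

	// Ask the storage backend (Elasticsearch, in an ELK setup) for the most
	// recent entries tagged with the given container name. The "logs-*"
	// index pattern and the container_name / @timestamp fields are
	// assumptions about how the collector tags each entry.
	search := fmt.Sprintf(
		"http://localhost:9200/logs-*/_search?q=container_name:%s&size=20&sort=@timestamp:desc",
		url.QueryEscape(os.Args[1]),
	)

	resp, err := http.Get(search)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// Print the raw JSON response; a real CLI would format the hits nicely.
	body, _ := io.ReadAll(resp.Body)
	fmt.Println(string(body))
}
```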

Implementation

So how do we implement these components in a Docker hosting world? The key to our implementation is the Docker API.

As a simple example, we can:

  • Start a container with an "echo" command
  • Query the Docker logs API via the Docker CLI (a rough equivalent using the Docker Go SDK is sketched below)
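
The sketch below shows that flow using the Docker Go SDK rather than the CLI, purely to illustrate that `docker logs` is a thin client over the daemon's logs API. The exact signatures vary between SDK versions (for example the platform argument to ContainerCreate and the names of the option structs), and the busybox image is assumed to already be pulled, so treat it as a sketch rather than copy-paste code.

```go
package main

import (
	"context"
	"os"

	"github.com/docker/docker/api/types"
	"github.com/docker/docker/api/types/container"
	"github.com/docker/docker/client"
	"github.com/docker/docker/pkg/stdcopy"
)

func main() {
	ctx := context.Background()

	// Connect to the local Docker daemon using the standard environment
	// variables (DOCKER_HOST and friends).
	cli, err := client.NewClientWithOpts(client.FromEnv, client.WithAPIVersionNegotiation())
	if err != nil {
		panic(err)
	}

	// Start a throwaway container that simply echoes a line to STDOUT.
	created, err := cli.ContainerCreate(ctx, &container.Config{
		Image: "busybox",
		Cmd:   []string{"echo", "Hello from inside the container"},
	}, nil, nil, nil, "")
	if err != nil {
		panic(err)
	}
	if err := cli.ContainerStart(ctx, created.ID, types.ContainerStartOptions{}); err != nil {
		panic(err)
	}

	// Query the same logs API that `docker logs` talks to. Follow the
	// stream until the container exits so we don't race the echo.
	logs, err := cli.ContainerLogs(ctx, created.ID, types.ContainerLogsOptions{
		ShowStdout: true,
		ShowStderr: true,
		Follow:     true,
	})
	if err != nil {
		panic(err)
	}
	defer logs.Close()

	// The stream multiplexes STDOUT and STDERR; stdcopy splits them apart.
	if _, err := stdcopy.StdCopy(os.Stdout, os.Stderr, logs); err != nil {
		panic(err)
	}
}
```

Run against a local daemon, this prints the same line you would get from `docker logs`.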

This means we can pick up all the logs for a service, provided the processes inside the container print to STDOUT instead of logging to a file.

With this in mind, we developed a logs pipeline along these lines and open sourced some of the components.

Conclusion

I feel like we have achieved a lot by doing this.

Here are some takeaways:

  • The logs pipeline is generic and not Drupal-specific
  • We didn't reinvent the wheel on how logs are shipped to remote services
  • Some interesting projects were built along the way that can be used standalone