Managing configuration with envdir, remotely

envdir runs another program with environment modified according to files in a specified directory.

envdir is a component of daemontools subsequently ported to Python that does one thing: it reads files from a specified directory and injects the names and values of those files as environment variables into the environment of a specified process (which is subsequently run).

“That’s it” but it turns out to be quite handy.

Hard coded secrets and secret settings files

How do you manage environment specific configuration for deployed apps? Or secrets? You could put everything in a configuration file, in Django it’d be settings.py and commit it and hope no ever finds the database passwords and API keys you’ve added. (Hint: no). Or, to break things out by environment you could use distinct configuration files, named after each target environment. This works better, but you still have the issue of secrets like passwords.

Here a common solution is to import from a non-source controlled file. This is obviously better for security as it’s not committed to your repo, but managing the secrets file isn’t obvious. You’ll probably have to log into the server and edit the file in frickin’ nano.

This breaks down a bit if you use different releases and need to start copying the settings file. You can edit the copy in the source cache directory, but then you need to make a new deployment just to activate the new settings. And if you edit the file in the latest release directory you’ll lose these updates without duplicating said update in the cached directory.

Keeping settings outside the project

A better solution is to keep these settings outside of the project and its directory altogether. They can be sourced from a file kept in a specified location or you can use the system’s environment to feed configuration values to the process.

The choice for me comes down to two questions: compatability with the application and ease of use for managing the values.

Having worked with Heroku-deployed apps enough and Foreman/Honcho for running Django apps locally, both the style of storing and managing the values by envirionment variable is very appealing.

envdir and the process

I want to motivate envdir by working backwards, from running the process, that is.

To run an app with envdir you specify the directory where you configuration values are stored and the process to run.

envdir /path/to/directory/ process

That’s it.

So to run, say, gunicorn:

envdir /var/apps/myapp/config gunicorn coolapp.wsgi:application --port=5000

That’s the only change necessary for getting these values into your application.

Storing variables using envdir

The next question is how do you store them. envdir relies on a NoSQL database called the file system for persistent local key/value storage.

This diagram from the Python port is useful:

$ tree envs/prod/
envs/prod/
├── DJANGO_SETTINGS_MODULE
├── MYSITE_DEBUG
├── MYSITE_DEPLOY_DIR
├── MYSITE_SECRET_KEY
└── PYTHONSTARTUP

0 directories, 3 files
$ cat envs/prod/DJANGO_SETTINGS_MODULE
mysite.settings

In case you missed it, you have a directory with a bunch of files, the name of each file represents the name of the environment variable, and the value of the variable is stored as the file content.

Managing the values

Okay, running this is all very simple, but why use a littany of little files instead of a single file?

The first reason for me was that there’s already a well tested tool that does what I want. I could find or write a tool to read these values from a single file (I’ve used shell scripts and custom Django manage.py scripts to do this), of course. The second issue is that it ends up being simpler in the end working with many little files. The file system protects against accidental duplication which is one nice feature.

It’s also easier to manage remotely and that’s ultimately the winner for me. I mentioned using Heroku, and while most of the projects I work on are not on Heroku, I covet the CLI-based control I have when I’m working on something else. Especially with configuration.

heroku config

I want that everywhere! And with Fabric coupled with envdir, we can do it.

fab production config
fab staging config.set:DEBUG=False

The solution below, which would be the contents of fabfile/config.py in a project fabfile module, makes great use of envdir’s simple storage layout to read and update configuration values. A sister module, services includes some simple tasks to manage application services and here you assume that services.restart causes any affected services to reload the configuration.

from fabric.api import task, run
from fabric.context_managers import cd, hide
from fabric.state import env

from fabfile import services


@task(default=True)
def list():
    """Lists the environment variables for the app user"""
    with hide('running', 'stdout'):
        with cd(env.config_dir):
            environment = {
                var: run("cat {}".format(var))
                for var in run("ls").split()
            }

    longest_key = max([len(i) for i in environment.keys()]) + 1
    padding = 30 if longest_key < 30 else longest_key
    print_str = "} ".format(padding)
    for var in sorted(environment.keys()):
        print(print_str.format(var, environment[var]))


@task
def set(**kwargs):
    """Add one or more shell variables to the app environment"""
    with hide('running', 'stdout'):
        with cd(env.config_dir):
            for var, value in kwargs.items():
                run("echo '{0}' > {1}".format(value, var))
    services.restart()


@task
def remove(*args):
    """Remove configuration variables from the app environment"""
    success = 0
    with hide('running', 'stdout'):
        with cd(env.config_dir):
            envvars = [var for var in run("ls").split()]
            for env_key in args:
                if env_key not in envvars:
                    print("No such variable {0}".format(env_key))
                    continue
                run("rm {0}".format(env_key))
                success += 1
    if success:
        services.restart()

Constraints and alternatives

If you have multiple servers, you’re going to have keep multiple copies of the configuration set. Fabric will run against multiple machines of course, but the script here may be insufficient for doing just that. For instance you’d probably want to know if a value was not set on one server or failed to remove on another, etc.

Some of you reading this may be nodding and thinking, ah, “If only he knew about etcd or consul!” I do, and while they look like really fine tools, they’re really geared toward cloud deployments. Not just deployments on cloud servers, but large multi-server deployments in which new nodes are created and destroyed like our hopes and dreams. With one or two servers it’d still be useful but it’d also be yet-another-service. Let’s keep it simple for the simple cases.

Hey! Before you go, checkout my book, Django Standalone Apps! It's a bit rough around the edges as I'm writing it in public, which means chapters get released periodically. But I'd love if you'd check it out (it's free to read online).