Although containerization and Docker have been part of our lives for a couple of years, there are still no strict guidelines on how to separate a container from its configuration. The topic of application/data/configuration separation goes beyond containerized applications. It is part of the more general problem of how to deploy application logic together with its configuration for a specific environment (development/acceptance/production) so that we end up with a fully working instance (we won't focus on data separation in this article).

Definitions

Let's start by defining a few things first:

  • application code - the logic, with all the code necessary (network or database connectivity support, models, etc.) to deliver business value.
  • application configuration - the core settings of the application that do not vary between environments (e.g. which modules the application starts).
  • environment configuration - everything that is likely to vary between environments (staging, production, developer, etc.), including secrets (e.g. API credentials).
  • environment - the fully operating application, configured to work in a given context (e.g. Development or Production).

Now, I'd like to consider three approaches to providing environment configuration to the containerized application.

Approach 1: Docker image per environment with embedded configuration

In this approach, each environment has its own dedicated image with the environment configuration embedded.

Pros

  • Configuration and code can be stored in a single repository.
  • No additional development is required to supply the environment configuration.

Cons

  • The image is neither reusable nor portable.
  • Multiple separate Docker images are required for a single application version.
  • It affects the Continuous Deployment and testing approach, and we gain none of the benefits of containers (e.g. promoting the same image from development to production guarantees there are no discrepancies between the application in different environments).
  • Changing configuration (e.g. log level or API endpoint address) requires a release.

Approach 2: Configuration for every environment embedded in the Docker image

In this approach, we have only a single image per application, containing the configurations of all environments. This image is deployed to every environment, and the proper configuration is selected when the application runs.
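As a rough illustration of how the selection at run time might work, here is a minimal Python sketch. The variable name APP_ENV, the settings and the URLs are all assumptions made up for this example; they do not come from any particular project.

```python
import os

# Hypothetical per-environment settings that would be baked into the image.
EMBEDDED_CONFIGS = {
    "development": {"log_level": "DEBUG", "api_url": "http://localhost:8080"},
    "acceptance": {"log_level": "INFO", "api_url": "https://api.acceptance.example.com"},
    "production": {"log_level": "WARNING", "api_url": "https://api.example.com"},
}


def select_config(environ=None):
    """Return the embedded configuration named by the APP_ENV variable."""
    environ = os.environ if environ is None else environ
    # APP_ENV is an assumed variable name; default to development if unset.
    env_name = environ.get("APP_ENV", "development")
    if env_name not in EMBEDDED_CONFIGS:
        # Fail fast instead of silently starting with the wrong settings.
        raise ValueError(f"Unknown environment: {env_name!r}")
    return EMBEDDED_CONFIGS[env_name]
```

Starting the container with APP_ENV=production would then make select_config() return the production block. Note that an environment missing from the embedded table (say dev-1) cannot be served without rebuilding the image.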

Pros

  • Single image for each application version - mitigates the discrepancies between environments.
  • Configuration and code can be stored in a single repository.
  • Running a container in a specific environment can be achieved fairly easily (e.g. with an environment variable).

Cons

  • Security: anyone with access to the image will be able to pull it and see the configuration/secrets.
  • As the project grows, new environments appear (e.g. performance, dev-1, dev-2), resulting in an explosion of config groups. This also makes managing deployments of the app hard: adding a new environment requires building and releasing a new image.
  • Changing configuration (e.g. log level or API endpoint address) requires a release.

Approach 3: Configuration supplied to the container

In this approach, an image contains only application code and application configuration. The environment configuration is delivered by the environment that runs the container (e.g. via a k8s ConfigMap, environment variables, or a mounted volume with config files).
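On the application side, this could be sketched as follows. It is only one possible shape, assuming the non-secret settings arrive as a JSON file mounted into the container and a secret arrives in an environment variable; the function name, the API_KEY variable and the file path are hypothetical.

```python
import json
import os
from pathlib import Path


def load_environment_config(config_path, environ=None):
    """Read environment configuration supplied from outside the image."""
    environ = os.environ if environ is None else environ
    config = {}
    path = Path(config_path)
    if path.is_file():
        # For example a file mounted from a volume or a k8s ConfigMap.
        config.update(json.loads(path.read_text(encoding="utf-8")))
    # The secret is never stored in the image; API_KEY is an assumed name.
    api_key = environ.get("API_KEY")
    if api_key is None:
        raise RuntimeError("API_KEY must be supplied by the environment")
    config["api_key"] = api_key
    return config
```

A call such as load_environment_config("/etc/app/config.json") at startup (the path is again an assumption) would pick up whatever the environment mounted there, so changing the log level means updating the mounted file rather than releasing a new image.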

Pros

  • Single image for each application version - mitigates the discrepancies between environments.
  • It scales smoothly as the app naturally expands into more environments over its lifetime.
  • Good separation of application and configuration.
  • Production secrets/config can be properly secured (limited access).
  • Changes in configuration (e.g. log level, enabling/disabling metrics, endpoint URL/key) can be made without releasing a new image.

Cons

  • Configuration and application code are stored in separate repositories.
  • Requires developing an approach to deliver the configuration to containers running in different environments.

Recommendations and comments

I won't elaborate here on why having a container image per environment with embedded configuration is bad - please just look at its cons (and if you still don't see it, please do it again and again and again...) and read more in the references provided at the end of this post. While the third approach gives us the most flexible solution and is probably the best choice for most applications, I wouldn't cross out the second approach completely. In some situations it might actually be a good choice, especially when we extend it to a "hybrid approach", where one image still contains all environment configurations, but some settings (e.g. log level) can be quickly overridden by supplying them externally to the container. To make that choice, please answer the questions below honestly and compare your answers with the pros and cons of every approach.
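To make the "hybrid approach" more concrete, here is a minimal Python sketch under a few assumptions: the defaults for every environment ship inside the image, and any known setting can be overridden by an environment variable with an agreed prefix. The names APP_ENV and APP_CFG_ and the settings themselves are invented for illustration.

```python
import os

# Hypothetical defaults for every environment, shipped inside the image.
IMAGE_DEFAULTS = {
    "development": {"log_level": "DEBUG", "metrics_enabled": "false"},
    "production": {"log_level": "WARNING", "metrics_enabled": "true"},
}

# Assumed naming convention for overrides, e.g. APP_CFG_LOG_LEVEL=DEBUG.
OVERRIDE_PREFIX = "APP_CFG_"


def resolve_config(environ=None):
    """Start from the embedded defaults, then apply external overrides."""
    environ = os.environ if environ is None else environ
    env_name = environ.get("APP_ENV", "development")
    # Copy so the embedded defaults stay untouched; an unknown
    # environment name raises KeyError here.
    config = dict(IMAGE_DEFAULTS[env_name])
    for name, value in environ.items():
        if name.startswith(OVERRIDE_PREFIX):
            key = name[len(OVERRIDE_PREFIX):].lower()
            # Only settings that already exist may be overridden.
            if key in config:
                config[key] = value
    return config
```

With this in place, restarting the production container with APP_CFG_LOG_LEVEL=DEBUG changes the log level without a release, while everything else still comes from the image.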

Questions to answer:

  • Can you afford a release every time the configuration needs changing?

    • How many configurations will change frequently?
    • Within what time frame do you need to be able to change a specific configuration?
    • When you release a new version, will that be only this application, or are there other dependencies you need to upgrade?
    • What is the chance that some configuration must be changed in a matter of seconds ("no time for release coz prod is burning")?
  • How many environments will there be? How frequently will they change?

    • Can you afford a release every time a new environment is required?
    • Will that be a bottleneck to spawning new environments?
  • How will production configuration/secrets be protected from people who should not have access to them?

    • Should access to some configuration/secrets be limited?
    • Who will have access to the docker images and repository?
    • Who will have access to the configuration and repository?
  • How will the deployment be performed?

    • Is the image released in isolation, or is it part of some bigger release (e.g. another application that depends on the image is released at the same time)?
    • How error-proof (idiot-proof) should the deployment be?

Summary

As we can see, #1 and #3 are two completely opposite approaches, while #2 is a compromise between pure evil and a more complicated but mature approach. If you are hesitating between these approaches, consider their pros and cons and decide which aspects are critical for your application. I will end with some of my favourite quotes from the references:

Running a Docker container in production should be the assembly of an image with various configurations and secrets.

Config is correctly isolated from codebase if and only if the codebase is ready to be made public without security compromises.

Environment-specific configuration doesn't belong in the image, it belongs only in the running container.

Config varies substantially across environments, code does not.

Well designed application should have separated code, configuration and data.

References