Set up a Docker in Docker Agent Template

Issues

  • I need to use Docker Multi-stage builds on my build agents based on Docker Agent Templates.
  • I need to use a different Docker version than the one provided on my build agents based on Docker Agent Templates.
  • I need to run Docker commands with a local bind mount (-v $(pwd):/app) on my build agents based on Docker Agent Templates.
  • I do not want “dangling” containers to last after my builds terminate.
  • I do not want other builds to see or interact with my containers (“re-entrant build”).
  • My build, running on an Agent Template, cannot use Docker with the following error:
Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock:
Get http://%2Fvar%2Frun%2Fdocker.sock/v1.35/containers/json:
dial unix /var/run/docker.sock: connect: permission denied
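
If you hit the permission error above, it usually means that the user running your build inside the agent container is not allowed to use the shared Docker socket. You can confirm this from a build shell with the following illustrative commands:

# Show the UID/GID and group memberships of the build user
id
# Show the owner, group and permissions of the shared socket
ls -l /var/run/docker.sock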

Environment

  • CloudBees Jenkins Enterprise 1.x (CJEv1)

Resolution

These issues are caused by the configuration of the default Docker Agent Template in CJEv1, a pattern referred to as “Docker on Docker (DonD)”, “sibling containers”, or “docker.sock socket sharing”.

This default “docker” template works by sharing the file /var/run/docker.sock and the workspace between your build agent container and the worker machine, so that your build can talk to the worker machine’s Docker Engine. It enables Docker layer caching by default, which improves build times.

Docker On Docker Diagram
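
Outside of CJE, this pattern boils down to mounting the worker machine’s socket into the agent container. An illustrative sketch (your-agent-image is a placeholder, and this is not the exact command CJE runs):

# "Sibling containers": the agent talks to the worker machine's Docker Engine
docker run -v /var/run/docker.sock:/var/run/docker.sock your-agent-image docker ps

This also explains the bind mount issue listed above: with a shared socket, a flag such as -v $(pwd):/app is resolved by the worker machine’s engine, so the path must exist on the worker machine, not inside your agent container.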

To solve these issues, you need to configure a new Docker Agent Template that provides a self-service, ephemeral Docker Engine, which your builds will use instead of the worker machine’s Docker Engine.

This pattern is referred to as “Docker in Docker (DinD)”, or “nested containerization”.

Docker in Docker diagram

This article covers the following steps to achieve this:

  1. Prepare a custom Docker Image
  2. Configure a new Docker Agent Template
  3. Validate the setup
  4. Configure the caching of Docker images
  5. Discussion and implications of this solution

1. Prepare a custom Docker Image

The solution provided in this article assumes that you are familiar with building a custom Docker Image and hosting it on a Docker registry. If you are not, start with https://docs.docker.com/docker-hub/ and https://training.play-with-docker.com/dev-landing/.

You need to create an image with the following elements:

  • Dockerfile:
# Official Docker Image from https://hub.docker.com/_/docker/
# Set the Docker version you want to use
FROM docker:18.02-dind

# Defining default variables and build arguments
ARG user=jenkins
ARG group=jenkins
ARG uid=1000
ARG gid=1000
ARG jenkins_user_home=/home/${user}

ENV JENKINS_USER_HOME=${jenkins_user_home} \
  LANG=C.UTF-8 \
  JAVA_HOME=/usr/lib/jvm/java-1.8-openjdk \
  PATH=${PATH}:/usr/local/bin:/usr/lib/jvm/java-1.8-openjdk/jre/bin:/usr/lib/jvm/java-1.8-openjdk/bin \
  DOCKER_IMAGE_CACHE_DIR=/docker-cache \
  AUTOCONFIGURE_DOCKER_STORAGE=true

# Install required packages for running a Jenkins agent
RUN apk add --no-cache \
  bash \
  curl \
  ca-certificates \
  git \
  openjdk8 \
  unzip \
  tar \
  tini

# Set up default user for jenkins
RUN addgroup -g ${gid} ${group} \
  && adduser \
    -h "${jenkins_user_home}" \
    -u "${uid}" \
    -G "${group}" \
    -s /bin/bash \
    -D "${user}" \
  && echo "${user}:${user}" | chpasswd

# Adding the default user to groups used by Docker engine
# "docker" for avoiding sudo, and "dockremap" if you enable user namespacing
RUN addgroup docker \
  && addgroup ${user} docker \
  && addgroup ${user} dockremap

# Custom start script
COPY ./entrypoint.bash /usr/local/bin/entrypoint.bash

# These folders should not be part of the Docker image layers
VOLUME ${jenkins_user_home} /docker-cache /tmp

# Default working directory
WORKDIR ${jenkins_user_home}

# Define the "default" entrypoint command executed on the container as PID 1
ENTRYPOINT ["/sbin/tini","-g","--","bash","/usr/local/bin/entrypoint.bash"]

  • entrypoint.bash:
#!/bin/bash
#
# This entrypoint is dedicated to start Docker Engine (in Docker)
# For CloudBees Jenkins Enterprise 1.x Agent Templates
# Cf. https://support.cloudbees.com/hc/en-us/articles/115001626487-Customize-entrypoint-on-CJE-Agent-Docker-images

set -e

if [ $# -gt 0 ]
then
  ## Default Docker flags. They can be overridden via the DOCKER_OPTS environment variable
  DOCKER_OPTS="${DOCKER_OPTS:-"--bip=192.168.0.1/24"}"

  if [ "${AUTOCONFIGURE_DOCKER_STORAGE}" = "true" ]
  then
    # Default settings: we let the script autodetect the "best"
    # storage driver for Docker based on the host kernel features
    if [ -f "/proc/filesystems" ]
    then
      ## Best case: we can use OverlayFS (the "overlay2" driver for Docker)
      if grep -q overlay /proc/filesystems
      then
        STORAGE_DRIVER="overlay2"
      # overlay not available: let's try aufs, Docker's "original" layer FS
      elif grep -q aufs /proc/filesystems
      then
        STORAGE_DRIVER="aufs"
      else
        # Other drivers exist (btrfs, zfs) but they need advanced configuration
        # Check https://docs.docker.com/storage/storagedriver/select-storage-driver/
        # We fall back to the worst case, "vfs": slow but works everywhere
        STORAGE_DRIVER="vfs"
      fi
      DOCKER_OPTS+=" --storage-driver ${STORAGE_DRIVER}"
    fi
  fi

  echo "== Starting Docker Engine with the following options: ${DOCKER_OPTS}"
  /usr/local/bin/dockerd-entrypoint.sh ${DOCKER_OPTS} >/docker.log 2>&1 &

  # Wait for Docker to start by checking its local unix socket
  ## Wait 1 second to let all processes and file handles be created
  echo "== Waiting for Docker Engine process to start"
  sleep 1

  ## Try reaching the unix socket 6 times, waiting 5s between tries
  curl -X GET -sS -o /dev/null --fail \
    --retry 6 --retry-delay 5 \
    --unix-socket /var/run/docker.sock \
    http://localhost/_ping || (cat /docker.log && exit 1)

  ## Last check: the /_ping endpoint should answer "OK" on stdout
  [ "$(curl -sS -X GET --unix-socket /var/run/docker.sock http://localhost/_ping)" == "OK" ]

  echo "== Docker Engine started and ready"

  # Load any "tar-gzipped" Docker image from the local cache
  if [ -n "${DOCKER_IMAGE_CACHE_DIR}" ] && [ -d "${DOCKER_IMAGE_CACHE_DIR}" ]
  then
    echo "== Variable 'DOCKER_IMAGE_CACHE_DIR' found and pointing to an existing Directory"
    echo "== Loading following .tar files in Docker:"
    find "${DOCKER_IMAGE_CACHE_DIR}" -type f -name "*.gz" -print \
      -exec sh -c 'gunzip --stdout "$1" | docker load' -- {} \;
  fi

  # The second argument is the java command line generated by CJE (passed as a single argument)
  shift
  echo "== Launching the following user-provided command: ${*}"
  exec /bin/sh -c "$@"
fi

For the next steps, we assume that this image is built and hosted under the name registry.yourcompany.com/cje-dind-agent:1.0.0.
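
For example, you can build and push it from the directory containing the Dockerfile and entrypoint.bash, assuming you are already authenticated against your registry:

# Build the image and push it to your private registry
docker build -t registry.yourcompany.com/cje-dind-agent:1.0.0 .
docker push registry.yourcompany.com/cje-dind-agent:1.0.0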

2. Configure a new Docker Agent Template

Create a new Agent Template, either in your Operations Center or in one of your Managed Masters.

Configure this template with the following settings; if you are unsure about a setting, refer to the settings in the default docker Agent Template in your Operations Center:

  • Additional labels: docker-in-docker is a good start. You can add docker if you want to make this template the default, but take care to disable the default one. You can also include the Docker version if you need strict version compliance: docker-18.02 in our example.
  • Resources: CPU shares and Memory need to be adapted to your needs. Use the same values as the default Docker Agent Template to start.
  • Java Options: as for Resources, use the values of the default Docker Agent Template or tune them to your needs.
  • Filesystem: “Remote root directory”: /jenkins
  • Definition: “Image” must be set to your Docker image name: registry.yourcompany.com/cje-dind-agent:1.0.0 in our example.
  • Options:
    • Remove ALL existing options unless you know what you are doing.
    • The path /var/run/docker.sock must not be shared.
    • Add a “Launch in privileged mode” option
    • Add a “Use a custom Docker shell command” option, and set the field to DUMMY
    • Add a “Parameter”, and set the “Key” to rm and the “Value” to true
    • Add a “Volume” option, set the host path to a folder on the VM that will contain the cache, let’s say /DATA/docker-cache, and set “Container Path” to /docker-cache

Configuration of the new Agent Template
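
For reference, these options make the platform launch the agent container in a way that is conceptually close to the following docker run invocation (an illustrative sketch, not the exact command CJE generates):

# Ephemeral (--rm), privileged agent container with a persistent image cache
docker run --rm --privileged \
  -v /DATA/docker-cache:/docker-cache \
  registry.yourcompany.com/cje-dind-agent:1.0.0

Note the absence of a -v /var/run/docker.sock:/var/run/docker.sock volume: each build gets its own ephemeral Docker Engine instead of the worker machine’s.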

3. Validate the setup

You can validate that this setup works by executing the following Pipeline; the build output should show the full Docker Engine configuration. Verify that the version is 18.02 (or whichever version is in your Dockerfile’s FROM instruction):

pipeline {
  agent {
      label 'docker-in-docker'
  }
  stages{
    stage('Validate Docker') {
      steps {
        sh 'docker info'
      }
    }
  }
}
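
To check the exact engine version, you can also extract it directly; docker version supports Go-template formatting, and with the image above this should print a version starting with 18.02:

# Print only the version reported by the Docker Engine (server side)
docker version --format '{{.Server.Version}}'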

4. Configure the caching of Docker images

There is a big downside to this configuration: a fresh Docker Engine is spawned for each build. This means that cached Docker layers are not reusable across your builds.

However, this setup provides a caching facility: at startup, the agent preloads any file with the extension tar.gz stored in the /docker-cache directory of the agent container.

In our example, we map this folder to the worker machine folder /DATA/docker-cache: persistence is therefore handled per worker machine, as it was before.

Here is a Pipeline example that asks for a Docker image name, pulls it, and stores it in the cache:

pipeline {
  agent {
      label 'docker-in-docker'
  }
  parameters {
    string(name: 'DOCKER_IMAGE_TO_CACHE', defaultValue: 'alpine', description: 'Which Docker Image do you want to cache?')
  }
  stages{
    stage('Cache Docker Image') {
      steps {
        sh '''
        ARCHIVE_NAME="$(echo "${DOCKER_IMAGE_TO_CACHE}" | sed -e 's#/#-#g' -e 's#:#_#g').tar.gz"
        CACHE_DIR=/docker-cache
        ls -l "${CACHE_DIR}"
        docker pull "${DOCKER_IMAGE_TO_CACHE}"
        docker save "${DOCKER_IMAGE_TO_CACHE}" | gzip > "${CACHE_DIR}/${ARCHIVE_NAME}"
        '''
      }
    }
  }
}
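
On a subsequent build scheduled on the same worker machine, the entrypoint loads the archive before your steps run, so the image is already available locally. You can verify this with a step like the following (alpine being the default image of the example above):

# The image should be listed without any prior "docker pull"
docker images alpine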

Please note that we store “gzipped” images to avoid wasting too much disk space. This has an impact on agent startup time, since each archive must be decompressed when loaded. If you have plenty of hard drive space, you can remove the “gzip/gunzip” steps and store the cache as plain “tar” archives, as sketched below.
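
If you do remove the compression, the loading logic in entrypoint.bash must be adjusted as well, since it only looks for *.gz files. A minimal sketch of the variant, assuming archives are stored uncompressed with a .tar extension:

# In entrypoint.bash: load plain tar archives instead of gzipped ones
find "${DOCKER_IMAGE_CACHE_DIR}" -type f -name "*.tar" -print \
  -exec sh -c 'docker load --input "$1"' -- {} \;

In the caching Pipeline, replace the docker save ... | gzip pipeline with docker save --output and a plain .tar archive name.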

5. Discussion and implications of this solution

You may want to read this excellent blog article: https://jpetazzo.github.io/2015/09/03/do-not-use-docker-in-docker-for-ci/.

This article underlines two points about “DinD” (Docker in Docker):

  • “The bad”: we launch the agent containers with the --privileged flag, which means these containers can do anything on the worker machine.
    However, “DonD” has the same implication: Docker runs as root and can access anything on the host.
    So moving from “DonD” to “DinD” does not change anything on this topic. For a deeper dive, see http://blog.loof.fr/2018_01_14_archive.html.
  • “The ugly / the worse”: the file storage issues relate to the state of Docker back in 2015.
    Recent Docker storage driver implementations (aufs, overlayfs, and devicemapper) do not have these performance or corruption issues.

The entrypoint.bash script takes care of auto-detecting the best storage option at container startup:

  • The driver overlay2 is selected by default if the worker kernel supports it.
  • Otherwise, the driver aufs is selected if supported by the worker kernel.
  • If neither overlay nor aufs is supported by your kernel, the vfs driver is used as a fallback.
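
You can check in advance which of these filesystems a worker machine’s kernel supports; this is the same test the script performs:

# Run on the worker machine: prints the supported layered filesystems, if any
grep -E 'overlay|aufs' /proc/filesystems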

You can find more information in the Docker storage driver documentation: https://docs.docker.com/storage/storagedriver/select-storage-driver/

Tested product/plugin versions

The latest update of this article has been tested with:

  • CloudBees Jenkins Enterprise 1.x (CJEv1)
  • Docker 18.02
