Technical Insights: Docker at California’s Child Welfare Digital Service

By Thomas Ramirez

This is part one of a three-part series about the work we performed at CWDS. This installment focuses on containers, and more specifically on our experience with Docker containers.

Note: Containers are a tool and deployment model that make it possible to isolate applications into compact, lightweight execution environments that share an underlying operating system. Typically measured in megabytes, containers use far fewer resources than virtual machines, start up almost immediately, and can be packed far more densely on the same hardware. Docker specifically brings cloud-like flexibility to any infrastructure capable of running containers.

I’m writing this month’s blog post from Momentum. It’s been a great week so far with lots of awesome content from so many presenters. Samuel and I were able to share a little about containers, Docker in particular, along with our capabilities and experiences with this tool. This week I gave a brief overview of what we did for the Intake project for CWDS (where we helped create a system that supports emergency response workers with information to make better decisions for children’s safety and well-being), and I want to give a little more detail from the techy side here.

Anyhow, there are four main things [container functions] you can do with Docker: create Docker images, push those images to a repository like Docker Hub, pull images from a repository, and run the images. For CWDS, we did all four. Since I just listed all this stuff we can do with images, I’d better explain what a Docker image is. Probably the easiest way to describe an image is to say it is like a snapshot of a Virtual Machine (VM) but without the Operating System (OS).
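To make those four operations concrete, here is roughly what they look like from the command line. This is just a sketch; the image name, tag, and port below are made up for illustration and are not the actual names we used at CWDS.

    # Build an image from the Dockerfile in the current directory
    docker build -t my-org/intake-app:1.0 .

    # Push the image to a repository such as Docker Hub
    docker push my-org/intake-app:1.0

    # Pull the image down on another machine
    docker pull my-org/intake-app:1.0

    # Run a container from the image, mapping a port to the host
    docker run -d -p 8080:8080 my-org/intake-app:1.0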

Tech Tip 1: Creating a Docker image is similar to making a layer cake in that you define the parts of the image and stack the parts on top of each other. This means that if you want to change the top layer, you just replace it, but if you want to change some lower layer, you have to rebuild it and all the layers above it. Okay, maybe it’s not exactly like a layer cake but I don’t bake so I’m good with the analogy. Where creating these images is unlike making a layer cake is that each layer is the delta (or difference) from the layer before it. This means you have to be careful about how you form the layers.

Let me give you a somewhat silly example to show what I mean. Let’s say we want to put a directory listing into a file and include that file in the image. One way to accomplish this is to take the following three steps: 1. copy the directory into the image, 2. create the new file, and then 3. delete the directory. Here’s the problem: let’s say you have 50 MB of files in the directory. In the first step, you copy the directory into the image, so the layer for that step adds 50 MB worth of files and the image is now 50 MB. In the second step, you put a listing of the 50 MB directory into a file. Let’s say the file is 2 KB, so the second layer is 2 KB and the image is now 50 MB + 2 KB. Finally, you delete the initial 50 MB directory. But because each layer is only a delta from the one before it, the third layer merely records that the directory was removed; it cannot reach back and shrink the first layer. The total image stays at roughly 50 MB + 2 KB even though the filesystem you get when you run the image contains only the 2 KB file.
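Here is what that looks like as a Dockerfile. This is a rough, illustrative sketch (the directory and file names are made up), followed by one common way to avoid the problem, a multi-stage build, which I won’t cover in depth here.

    # Naive version: the 50 MB directory gets baked into a layer forever
    FROM alpine:3.19
    COPY big-dir /tmp/big-dir                  # layer: ~50 MB of files
    RUN ls -lR /tmp/big-dir > /listing.txt     # layer: the ~2 KB listing
    RUN rm -rf /tmp/big-dir                    # layer: only a deletion marker; the 50 MB stays below

    # Multi-stage version: only the small listing file is copied into the final image
    FROM alpine:3.19 AS builder
    COPY big-dir /tmp/big-dir
    RUN ls -lR /tmp/big-dir > /listing.txt

    FROM alpine:3.19
    COPY --from=builder /listing.txt /listing.txt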

Tech Tip 2: Another trick about working with Docker containers is that you need to remember where you are, particularly when you are debugging from the command line. Since the environment in the container is largely independent from the environment outside of the container, the available directory structure may be unrelated and tools that are available outside of the container may not be available inside the container or may be different versions.
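For example, if a container seems to be misbehaving, it usually pays to open a shell inside it and look around rather than assume it matches your host. A quick sketch (the container name is made up, and this assumes the image actually includes a shell like sh):

    # Open an interactive shell inside the running container
    docker exec -it intake-app sh

    # Inside the container: see which OS, tools, and versions are really there
    cat /etc/os-release
    which curl
    env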

While the use of Docker containers adds complexity to some parts of the development process, it also simplifies deployment and provides a stronger assurance of consistency between environments. This is particularly true when the controlling Dockerfile and docker-compose.yml files (which I have not discussed here) are themselves in the code repository and versioned as part of the code.
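To give a sense of what gets versioned alongside the application, here is a minimal, illustrative docker-compose.yml; the service names, images, and ports are invented for the example and are not what we used at CWDS.

    version: "3.8"
    services:
      app:
        build: .                 # built from the Dockerfile in this same repository
        ports:
          - "8080:8080"
      db:
        image: postgres:13       # supporting services pinned to a specific version
        environment:
          POSTGRES_PASSWORD: example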