Container-based application design encourages certain principles. One of these principles is that there should just be one process running in a container. That is to say, a Docker container should have just one program running inside it.
Docker is efficient at creating and starting containers. It allocates PID (Process ID) 1 to the process running inside the container.
Why is one process per container recommended? And is it possible to run multiple processes in a container?
Why should a container have only one process?
Books and documentation often recommend sticking to the principle of one container, one process. Many container images on Docker Hub run just one process.
But why is this the case? Why is it recommended for a container to run just one process? There are various benefits to running one process per container:
Isolation: By placing each process in a separate container, you gain the benefits of isolating the process so that it can’t interfere with others.
Easier to scale: When a container consists of just one single process, it is easier to scale the application by creating more instances of the container.
If an application consists of a web server and a database, it is better to run two containers – one for the web server and one for the database – so that they can be scaled independently. For example, if the web server is serving a lot of traffic, it can be scaled separately from the database.
Easier to build and test: When processes are isolated, it makes it easier to build container images because there is less work to do. Developers can build container images without being impacted by others who may be working on another process inside the same container.
Components can be upgraded independently: If you separate your application components into different containers, you can maintain and patch them separately. For example, if a new version of your web server is released, you can deploy the latest version, without impacting processes running in other containers.
Better reusability: One of the benefits of a container-based application is that it can be run for different purposes and in different environments, just by changing its configuration. This makes a container like a building block. However, when a container has more than one process, it is significantly more specialised and the potential for reuse is limited.
Easier to collect logs: If there is a single foreground process running in a container, it is easier to identify logs that are sent to standard out. If there are multiple processes running, you will either need to do some work to tell the different processes' logs apart in standard out, or you will need to redirect the logs to separate files. Neither is a nice solution.
Simpler to manage with Docker: Docker watches your application’s process (PID 1), and uses this to report the event that your container has stopped. So it knows if your application has terminated nicely, or if it has crashed horrifically! If multiple processes are running inside the container, this becomes harder, and you might need to monitor your processes manually.
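To make the idea of watching a single process concrete, here is a minimal Python sketch of how a parent can tell a clean exit from a crash by looking at its child's exit status. It is only an analogy, not Docker's implementation, and the names `describe_exit` and `run_and_describe` are made up for this illustration. It assumes a POSIX system, where Python reports a process killed by a signal as a negative return code.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a child's return code into a human-readable status."""
    if returncode == 0:
        return "exited cleanly"
    if returncode < 0:
        # On POSIX, subprocess reports death-by-signal as a negative number.
        return f"killed by {signal.Signals(-returncode).name}"
    return f"failed with exit code {returncode}"


def run_and_describe(command):
    """Run one command to completion and describe how it ended."""
    completed = subprocess.run(command)
    return describe_exit(completed.returncode)


# Example: a child process that exits with status 2.
print(run_and_describe([sys.executable, "-c", "raise SystemExit(2)"]))
```

With one process there is one exit status to interpret. With several processes in the same container, something has to decide which of them determines whether the container counts as healthy, stopped or crashed.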
Can you run multiple processes in a container?
If you need to run multiple processes, it’s generally recommended to place each process in its own container. You can run multiple containers manually, or you can use an orchestration tool like Docker Compose, or Kubernetes (check out our page of awesome Kubernetes learning resources).
For an application that consists of two processes, like a web server and a database, these can run in two separate containers, and use container networking to communicate.
However, if you have an application which consists of two very tightly-coupled processes, you might have a requirement to run multiple services inside the same container. You can do this using a couple of different approaches:
Use a process manager which can run multiple processes: You can set the container’s entrypoint to a specialised program which is capable of running and managing multiple processes.
One example of this is supervisord. You can use supervisord as your container entrypoint, which will then load the services that you need. You can do this from an init script. There are also several other options for multi-process containers.
Write your own startup or wrapper script: You can write your own startup script which starts multiple processes, but then you will need to handle things like shutdown signals and zombie processes appropriately, and clean them up when they finish.
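As a rough illustration of the second approach, here is a minimal sketch of a wrapper script written in Python. It starts several commands, forwards SIGTERM and SIGINT to them, and when any one of them exits it stops the others and waits for them so they do not linger as zombies. The function names, the placeholder commands in `main` and the timeout values are assumptions made for this example. It only reaps the processes it started itself, so orphaned grandchildren that end up re-parented to PID 1 are not handled.

```python
import signal
import subprocess
import sys
import threading
import time


def run_all(commands, poll_interval=0.2, grace_seconds=10.0):
    """Start every command, wait for the first exit, then stop the rest.

    Returns the return code of the first process that exited.
    """
    children = [subprocess.Popen(command) for command in commands]

    def forward(signum, _frame):
        # Pass the shutdown signal on to every child that is still running.
        for child in children:
            if child.poll() is None:
                child.send_signal(signum)

    # signal.signal() may only be called from the main thread.
    if threading.current_thread() is threading.main_thread():
        signal.signal(signal.SIGTERM, forward)
        signal.signal(signal.SIGINT, forward)

    # Poll until any child has exited.
    exit_code = None
    while exit_code is None:
        for child in children:
            code = child.poll()
            if code is not None:
                exit_code = code
                break
        else:
            time.sleep(poll_interval)

    # One process is gone, so ask the others to stop as well.
    for child in children:
        if child.poll() is None:
            child.terminate()

    # Wait for each child so it is reaped; kill any that ignore SIGTERM.
    for child in children:
        try:
            child.wait(timeout=grace_seconds)
        except subprocess.TimeoutExpired:
            child.kill()
            child.wait()

    return exit_code


def main():
    # Placeholder commands; replace them with the two tightly coupled services.
    commands = [
        [sys.executable, "-c", "import time; time.sleep(60)"],
        [sys.executable, "-c", "import time; time.sleep(60)"],
    ]
    code = run_all(commands)
    # Map "killed by signal N" to the common 128 + N exit status convention.
    sys.exit(code if code >= 0 else 128 - code)
```

Calling `main()` at the bottom of the file and using the script as the container's entrypoint would make it PID 1. Treating the first exit as the end of the whole container is a design choice in this sketch: it lets Docker see the failure and apply its restart policy, rather than leaving a half-working container running.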
Think: One concern per container
Current thinking in container-based application design is one concern, one container. This is sometimes referred to as the single concern principle:
Each container addresses a single concern, and does it well.
Some applications may need to spawn multiple processes. If you follow the one-process rule, you may think that these need to be placed in separate containers, but this can add unnecessary complexity.
So instead of being overly concerned with the number of processes running in your container, it is better to think of a container as providing one concern or job. A web server, database and message broker can all be thought of as single concerns.
If you follow this approach, you will find that your containers are easier to build and manage, and you will benefit from increased flexibility and scalability.