Skip to content

5.4. Software Containers

What are software containers?

Software containers are lightweight, stand-alone packages that include everything needed to run a piece of software, including the code, runtime, system tools, system libraries, and settings. Containers are isolated from each other and the host system, yet they share the OS kernel, making them more efficient and faster than traditional virtual machines. Containers ensure consistency across different development and deployment environments, addressing the "it works on my machine" headache.

Why do you need software containers?

  • Package system dependencies: Containers encapsulate all dependencies required by an application, such as specific versions of libraries and other software, ensuring that it can run in any environment without issues.
  • Share reusable software artifacts: Containers allow for the creation and sharing of reusable software artifacts. For instance, you can create a container image with a configured environment that can be shared with other developers, ensuring everyone works with the same setup.
  • Make your package runnable from any system: Containers abstract away the underlying operating system, allowing applications to run uniformly across different environments. This is particularly useful in heterogeneous environments where systems may be running different versions of operating systems or have different configurations.

While Python packages can be challenging to install across different systems, Docker ensures consistency between operating systems and environments. Running a Docker image is often more straightforward and faster than managing Python packages directly on the system.

Which tool should you use for creating containers?

The most common tool for creating containers is Docker. Docker simplifies the process of building, running, and managing containers. It uses Dockerfile to automate the deployment of applications in lightweight and portable containers.

To install Docker, follow these steps:

  1. Visit the Docker website and download the Docker Desktop application for your operating system.
  2. Follow the installation instructions provided on the website.
  3. Once installed, open a terminal or command prompt and verify the installation by running docker --version.

Docker offers some paid services, such as Docker Hub private repositories and Docker Enterprise Edition, which provide additional features like enhanced security, user management, and automated image builds. It's important to contact your IT administrators to see if these solutions are available in your organization and to explore potential alternatives.

Which tool should you use for hosting containers?

Containers are software packages that must be hosted on a container registry. You can use either Docker Hub or GitHub Packages to host your containers.

To use GitHub Packages, follow these steps:

  1. Create a GitHub account and sign in.
  2. Navigate to your repository where you want to host your container.
  3. In the repository, go to "Packages" and follow the instructions to publish your container image.

You must then login to the system from the command-line and tag with image by adding ghcr.io/[user] at the beginning, such in: ghcr.io/fmind/mlops-python-package.

Which image should you create for your MLOps projects?

To start using Docker for your MLOps projects, you can define a simple image in a Dockerfile in your project repository as follows:

# https://docs.docker.com/engine/reference/builder/

FROM python:3.12
COPY dist/*.whl .
RUN pip install *.whl
CMD ["bikes", "--help"]

You can then build and push this image with the following commands:

poetry build # build the wheel file
docker build --tag=bikes:latest .
docker run --rm bikes:latest

Which tips and tricks should you use to optimize your container workflow?

Software container additional resources