5.3. CI/CD Workflows
What is CI/CD?
CI/CD is a cornerstone of modern software development that combines Continuous Integration and Continuous Delivery or Deployment. It automates the process of integrating code from multiple contributors, testing it, and preparing it for release.
- Continuous Integration (CI) is the practice of frequently merging all developers' code changes into a central repository. After each merge, an automated build and a series of automated tests are run to detect integration issues early.
- Continuous Delivery (CD) extends CI by automatically preparing every code change for release: after the build stage, changes are deployed to a testing or staging environment, and the release to production typically requires a manual approval.
- Continuous Deployment goes a step further: every change that passes all stages of the pipeline is automatically released to customers, with no manual approval step.
The primary goal is to make software development faster, more reliable, and less error-prone by automating the entire release process.
What is a CI/CD workflow?
A CI/CD workflow, often called a pipeline, is the automated sequence of steps that takes code from a developer's machine to the production environment. This pipeline typically includes stages for building the application, running a comprehensive suite of automated tests (unit, integration, security), and deploying the application. By automating this path, teams can release new features and fixes to users with speed and confidence.
Why are CI/CD workflows essential for MLOps?
In MLOps, CI/CD workflows are critical for managing the complexity of machine learning systems. They provide several key benefits:
- Ensure Code and Model Quality: CI/CD acts as a gatekeeper, enforcing quality standards for both code and models. By running automated checks for code style, typing, security, and test coverage, it prevents regressions and maintains a healthy codebase.
- Automate Repetitive Tasks: Workflows automate tedious but crucial tasks like dependency installation, testing, packaging, and publishing. This frees up AI/ML engineers to focus on higher-value activities like model development and performance tuning.
- Enhance Reproducibility: By codifying the build, test, and deployment process, CI/CD ensures that every version of your ML system is built and deployed in a consistent, reproducible manner. This is vital for tracking experiments and complying with regulatory requirements.
- Improve Collaboration and Visibility: Centralized workflows provide a clear, shared understanding of the project's health. They generate reports on code quality, test results, and deployment status, making it easier for team members to collaborate and maintain high standards.
Which CI/CD solution should you use?
While many CI/CD solutions exist, GitHub Actions is a powerful and convenient choice for projects hosted on GitHub. It is deeply integrated with the GitHub platform, allowing you to build, test, and deploy your code directly from your repository.
To create a workflow, you define a YAML file in the .github/workflows directory of your project. This file specifies the triggers (e.g., a pull request), the jobs to run, and the individual steps within each job.
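As a rough illustration of that structure, a minimal workflow file might look like the sketch below. The file name, workflow name, and job identifier are hypothetical placeholders; only the top-level keys (name, on, jobs) and the step keywords (uses, run) come from the GitHub Actions workflow syntax.

```yaml
# .github/workflows/hello.yml (hypothetical file name)
name: Hello                 # display name shown in the GitHub UI
on:                         # triggers: events that start the workflow
  pull_request:             # run on pull requests
jobs:                       # one or more jobs
  hello:                    # job identifier (hypothetical)
    runs-on: ubuntu-latest  # runner that executes the job
    steps:                  # steps run in order within the job
      - uses: actions/checkout@v4  # a step that calls an existing action
      - run: echo "Hello from CI"  # a step that runs a shell command
```

Each step either calls an action with uses or executes a command with run. The workflows later in this section follow the same layout.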
What are the essential workflows for an MLOps project?
For a typical MLOps project, you should establish two primary workflows: one for verification and another for publication.
Verification Workflow
This workflow runs on every pull request to ensure that code changes meet quality standards before being merged into the main branch.
name: Check
on:
  pull_request:
    branches:
      - '*'
concurrency:
  cancel-in-progress: true
  group: ${{ github.workflow }}-${{ github.ref }}
jobs:
  checks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: ./.github/actions/setup
      - run: uv sync --group=check
      - run: uv run just check-code
      - run: uv run just check-type
      - run: uv run just check-format
      - run: uv run just check-security
      - run: uv run just check-coverage
Workflow Breakdown:
- name: The workflow's name, "Check," as it appears in the GitHub UI.
- on: Triggers the workflow on pull requests (pull_request). The '*' branch filter matches any target branch whose name does not contain a slash; use '**' to match every branch name.
- concurrency: Ensures that only one run of this workflow is active per Git ref at a time (for a pull request, github.ref is the pull request's merge ref). If a new commit is pushed, the previous run is canceled.
- jobs.checks.steps: Defines the sequence of steps to execute.
  - actions/checkout@v4: Checks out the repository code.
  - ./.github/actions/setup: Runs a reusable composite action to set up the environment (e.g., install Python and uv).
  - uv sync --group=check: Installs all dependencies required for the verification checks.
  - uv run just check-*: Executes a series of checks for code quality, type safety, formatting, security vulnerabilities, and test coverage.
Publication Workflow
This workflow is triggered when a new release is created. It handles building and publishing the project artifacts, such as documentation and a Docker container.
name: Publish
on:
  release:
    types: [edited, published]
env:
  DOCKER_IMAGE: ghcr.io/fmind/mlops-python-package
concurrency:
  cancel-in-progress: true
  group: publish-workflow
jobs:
  pages:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: ./.github/actions/setup
      - run: uv sync --group=doc
      - run: uv run just doc
      - uses: JamesIves/github-pages-deploy-action@v4
        with:
          folder: docs/
          branch: gh-pages
  packages:
    permissions:
      packages: write
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: ./.github/actions/setup
      - run: uv sync --only-dev
      - run: uv run just package
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - uses: docker/setup-buildx-action@v3
      - uses: docker/build-push-action@v6
        with:
          push: true
          context: .
          cache-to: type=gha
          cache-from: type=gha
          tags: |
            ${{ env.DOCKER_IMAGE }}:latest
            ${{ env.DOCKER_IMAGE }}:${{ github.ref_name }}
Workflow Breakdown:
- on: Triggers the workflow when a release is edited or published.
- env.DOCKER_IMAGE: Defines an environment variable for the Docker image name for easy reuse.
- jobs.pages: A job dedicated to building and deploying the project's documentation to GitHub Pages.
- jobs.packages: A job for building the package and publishing the Docker image.
  - permissions: Grants the job write permission on the packages scope, allowing it to push to the GitHub Container Registry (ghcr.io).
  - docker/login-action: Logs into the container registry.
  - docker/build-push-action: Builds the Docker image, tags it with latest and the release tag (github.ref_name), and pushes it to the registry. Using a container ensures a consistent, portable environment for running the ML model.
How can you avoid repeating steps in CI/CD workflows?
To follow the DRY (Don't Repeat Yourself) principle, you can encapsulate common sequences of steps into reusable composite actions. These are stored within your repository, typically in the .github/actions directory.
For example, a setup action can handle installing Python and project dependencies, ensuring every workflow starts with a consistent environment.
.github/actions/setup/action.yml:
name: Setup
description: Setup for project workflows
runs:
  using: composite
  steps:
    - name: Install uv
      uses: astral-sh/setup-uv@v5
      with:
        enable-cache: true
    - name: Setup Python
      uses: actions/setup-python@v5
      with:
        python-version-file: .python-version
You can then use this action in any workflow with a single line: - uses: ./.github/actions/setup. This makes your workflows cleaner, more modular, and easier to maintain.
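Composite actions can also accept parameters. The sketch below is one possible extension of the setup action, not the project's actual file: it assumes a hypothetical group input that selects which uv dependency group to install. Note that run steps inside a composite action must declare their shell explicitly.

```yaml
# .github/actions/setup/action.yml (sketch; the `group` input is hypothetical)
name: Setup
description: Setup for project workflows
inputs:
  group:
    description: Dependency group to install with uv
    required: false
    default: check
runs:
  using: composite
  steps:
    - name: Install uv
      uses: astral-sh/setup-uv@v5
      with:
        enable-cache: true
    - name: Setup Python
      uses: actions/setup-python@v5
      with:
        python-version-file: .python-version
    - name: Install dependencies
      shell: bash                   # required for run steps in composite actions
      env:
        GROUP: ${{ inputs.group }}  # pass the input through the environment
      run: uv sync --group="$GROUP"
```

A calling workflow would then pass the input with a with block:

```yaml
steps:
  - uses: actions/checkout@v4
  - uses: ./.github/actions/setup
    with:
      group: doc  # overrides the default dependency group
```

Passing the input through an environment variable, rather than interpolating it directly into the command, is a common precaution against script injection.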
You can also find thousands of pre-built actions on the GitHub Marketplace to integrate with third-party services and streamline your workflows.
What are some best practices for CI/CD in MLOps?
- Automate Everything: Automate all manual steps in your ML lifecycle, including data validation, model training, evaluation, and deployment, to reduce human error and increase velocity.
- Manage Secrets Securely: Use encrypted secrets to store sensitive information like API keys, passwords, and cloud credentials. GitHub Actions provides a secure way to manage secrets at the repository or organization level.
- Master GitHub Actions Syntax: A deep understanding of the workflow syntax, including contexts, expressions, and triggers, will allow you to build highly dynamic and powerful pipelines.
- Use Concurrency Strategically: The concurrency key is essential for managing workflow runs efficiently, preventing race conditions, and saving resources by canceling outdated jobs.
- Leverage the GitHub CLI: Use the gh command-line tool to interact with your workflows, check run status, and trigger them manually (e.g., gh workflow run ...), streamlining your development loop.
- Implement Branch Protection Rules: Protect your main branch by requiring status checks (like your verification workflow) to pass before pull requests can be merged. This is a critical safeguard for maintaining a stable and deployable project.
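Several of these practices can be combined in a single workflow. The following is a minimal sketch under stated assumptions: the workflow name, the train.yml file name, the just train recipe, and the API_KEY secret are all hypothetical and would need to exist in your project. It shows a manual trigger (needed for gh workflow run), a concurrency group, and a secret exposed to one step only.

```yaml
# .github/workflows/train.yml (hypothetical workflow)
name: Train
on:
  workflow_dispatch:  # manual trigger; required for `gh workflow run`
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true  # a newer run on the same ref cancels the older one
jobs:
  train:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: ./.github/actions/setup
      - run: uv run just train  # hypothetical just recipe
        env:
          API_KEY: ${{ secrets.API_KEY }}  # hypothetical secret, scoped to this step
```

With such a file on the default branch, the workflow could be started from a terminal with gh workflow run train.yml. Scoping the secret to the single step that needs it, instead of the whole job, limits its exposure.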