6.4. Templates

What is a code template?

A code template is a predefined, reusable project structure that acts as a blueprint for creating new projects. It standardizes foundational components like configuration files, directory layouts, and setup scripts for essential tools such as linters, formatters, and testing frameworks.

By establishing a consistent baseline, templates let developers customize project-specific details, such as the project's name, description, or dependencies, while ensuring that engineering best practices are followed from the start.

For instance, the authors of this course provide the Cookiecutter MLOps Package, which scaffolds new MLOps projects based on the principles taught here. This section explains how to leverage and adapt such templates for your own work.

Why are code templates essential for MLOps?

In MLOps, where speed and reliability are critical, templates are indispensable for scaling operations efficiently. They offer several key advantages:

  • Standardize Best Practices: Enforce uniform architecture, tooling, and coding standards across all projects, making them easier to maintain and integrate.
  • Accelerate Development: Automate the repetitive setup process, allowing teams to bypass initial configuration and immediately focus on the core business problem.
  • Promote Focused Work: Separate the concerns of infrastructure and application logic. Template maintainers can focus on improving the foundational framework, while project developers concentrate on building features.

As AI/ML development increasingly resembles a factory assembly line, templates ensure that every new project is built quickly and to a high standard of quality.

What are the best tools for creating code templates?

Cookiecutter

Cookiecutter is a widely used tool for scaffolding projects in the Python ecosystem. It uses a simple command-line interface to generate a new project from a template.

cookiecutter [template-directory-or-url]

The command uses a cookiecutter.json file within the template to prompt the user for variables, which are then injected into the project files.

Cruft

Cruft is a companion to Cookiecutter that manages template updates. When you create a project with Cruft, it records the template's location and commit in a .cruft.json file inside the project, allowing you to pull in improvements and bug fixes over time.

Initialize a new project with Cruft:

cruft create [template-repository-url]

Update the project with the latest template changes:

cruft update

How do you pass variables into a code template?

Cookiecutter uses the Jinja2 templating engine to embed variables directly into files and filenames. These variables are defined in the cookiecutter.json file, which acts as the template's public interface.

When you run cookiecutter, it reads this file, asks you for input for each variable, and uses your answers to render the final project files.

Example of a variable in a Python file:

# The placeholder "{{ cookiecutter.project_name }}" will be replaced
# with the value you provide during generation.
project_name = "{{ cookiecutter.project_name }}"
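
To make the substitution step concrete, here is a minimal sketch of placeholder replacement. It is a simplified stand-in, not Cookiecutter's implementation: Cookiecutter renders files with the full Jinja2 engine, while this example only handles plain `{{ cookiecutter.<key> }}` placeholders with a regular expression. The `render` helper and the `context` dictionary are hypothetical names chosen for illustration.

```python
import re

# Matches simple placeholders such as {{ cookiecutter.project_name }}.
# Assumption: no filters, method calls, or control blocks (Jinja2 handles those).
PLACEHOLDER = re.compile(r"\{\{\s*cookiecutter\.(\w+)\s*\}\}")


def render(text: str, context: dict[str, str]) -> str:
    """Replace each {{ cookiecutter.<key> }} placeholder with its context value."""
    return PLACEHOLDER.sub(lambda match: context[match.group(1)], text)


# Hypothetical answers collected from the user during generation.
context = {"project_name": "helloworld"}

# The same mechanism applies to file contents and to file or directory names.
print(render('project_name = "{{ cookiecutter.project_name }}"', context))
print(render("{{cookiecutter.project_name}}/README.md", context))
```

The point of the sketch is that a placeholder is just text until generation time, which is why it can appear in paths as well as in file contents.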

Example cookiecutter.json file:

This file defines the template's variables and their default values. You can even use variables to define other variables.

{
    "user": "fmind",
    "name": "MLOps Project",
    "repository": "{{cookiecutter.name.lower().replace(' ', '-')}}",
    "package": "{{cookiecutter.repository.replace('-', '_')}}",
    "license": "MIT",
    "version": "0.1.0",
    "description": "A new MLOps project.",
    "python_version": "3.13",
    "mlflow_version": "2.20.3"
}
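
The derived defaults for "repository" and "package" are ordinary string operations. The sketch below mirrors them in plain Python so the results are easy to inspect; Cookiecutter itself evaluates the expressions through Jinja2, and the `derive_defaults` function is a hypothetical helper, not part of any library.

```python
def derive_defaults(name: str) -> dict[str, str]:
    """Mirror the derived values from the cookiecutter.json example in plain Python."""
    # "repository": lowercase the name and replace spaces with hyphens.
    repository = name.lower().replace(" ", "-")
    # "package": replace hyphens with underscores to get an importable name.
    package = repository.replace("-", "_")
    return {"name": name, "repository": repository, "package": package}


defaults = derive_defaults("MLOps Project")
print(defaults["repository"])  # mlops-project
print(defaults["package"])     # mlops_project
```

Chaining the variables this way means a user only has to answer one prompt to get consistent repository and package names.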

How should you structure a Cookiecutter template?

A well-structured Cookiecutter template repository has two main components:

  1. The Template Directory: A single directory whose name contains a variable, like {{cookiecutter.repository}}. Everything inside this directory—files, subdirectories, and their content—will be rendered into the new project.
  2. Configuration and Hooks: Files that control the generation process but are not part of the final project. These include:
    • cookiecutter.json: Defines the variables for the template.
    • hooks/: A directory for scripts that run before or after generation.
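
The layout described above can be checked with a few lines of Python. This is a rough sketch under stated assumptions: the `check_layout` and `find_template_dirs` helpers are hypothetical, and the "exactly one templated directory" rule simply mirrors the description in this section rather than Cookiecutter's own lookup logic.

```python
import json
import tempfile
from pathlib import Path


def find_template_dirs(root: Path) -> list[str]:
    """List top-level directories whose name contains a cookiecutter variable."""
    return sorted(
        path.name
        for path in root.iterdir()
        if path.is_dir() and "{{" in path.name and "cookiecutter" in path.name
    )


def check_layout(root: Path) -> list[str]:
    """Return a list of problems; an empty list means the layout looks fine."""
    problems = []
    if not (root / "cookiecutter.json").is_file():
        problems.append("missing cookiecutter.json")
    if len(find_template_dirs(root)) != 1:
        problems.append("expected exactly one templated directory")
    return problems


# Build a throwaway template repository to exercise the check.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "cookiecutter.json").write_text(json.dumps({"repository": "demo"}))
    (root / "hooks").mkdir()
    (root / "{{cookiecutter.repository}}").mkdir()
    print(check_layout(root))  # []
```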

For a complete, real-world example, explore the cookiecutter-mlops-package template created by this course's authors.

Initialize this template package:

cookiecutter gh:fmind/cookiecutter-mlops-package

For advanced techniques, refer to the Advanced Usage section of the Cookiecutter documentation.

What should a good code template include and exclude?

A template should provide project scaffolding, not a finished application. The goal is to give developers a head start without imposing a rigid implementation.

What to Include (The Scaffolding):

  • Task Automation: A justfile or Makefile to automate common commands.
  • Linters & Formatters: Configurations for tools like Ruff to enforce code quality.
  • Testing Frameworks: Setup for pytest to enable immediate testing.
  • Project Metadata: A pyproject.toml file to manage dependencies and project settings.
  • CI/CD Pipelines: Basic workflow files for services like GitHub Actions.

What to Exclude (Project-Specific Logic):

  • Source Code: Avoid including specific application logic or architectural patterns. The template should be agnostic to how a developer chooses to solve their problem.
  • Tests: Do not include tests tied to a specific implementation.

How do you keep a project synchronized with its template?

To prevent "project drift" and ensure your project benefits from the latest template improvements, always initialize it with Cruft.

When the template is updated, run the following command inside your project directory:

cruft update

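Cruft fetches the latest version of the template, computes the diff against the version your project was generated from, and applies it to your working tree. It does not open a pull request itself: review the result, resolve any conflicts with Git, and commit the changes (through a pull request if your workflow requires one).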
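
To see which template version a project is pinned to, you can read the .cruft.json file that Cruft keeps in the project root. The sketch below assumes the file contains "template" and "commit" keys, which matches the format as commonly documented; check it against your Cruft version. The `read_template_pin` helper is hypothetical, and the commit hash in the sample is a placeholder, not a real revision.

```python
import json
import tempfile
from pathlib import Path


def read_template_pin(project_dir: Path) -> tuple[str, str]:
    """Return (template location, pinned commit) recorded in .cruft.json."""
    state = json.loads((project_dir / ".cruft.json").read_text())
    return state["template"], state["commit"]


# Hypothetical state file; the commit hash is a placeholder.
sample_state = {
    "template": "https://github.com/fmind/cookiecutter-mlops-package",
    "commit": "0123456789abcdef0123456789abcdef01234567",
    "context": {"cookiecutter": {"name": "MLOps Project"}},
}

with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    (project / ".cruft.json").write_text(json.dumps(sample_state))
    template, commit = read_template_pin(project)
    print(f"{template} @ {commit[:7]}")
```

Comparing the pinned commit with the template's latest commit is a quick way to tell whether a project has drifted.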

How can you demonstrate a template's usage?

The best way to illustrate a template's power and flexibility is to create one or more reference implementations. These are fully functional demo repositories generated from the template.

Reference implementations serve multiple purposes:

  • Provide a Live Demo: Show a practical, real-world application of the template.
  • Act as Documentation: Serve as a clear example for developers to follow.
  • Serve as a Testbed: Use the demo repository to develop and validate new features before backporting them to the template.

What is the best way to improve a code template?

The most effective way to evolve a template is through an iterative refinement loop, often called "dogfooding" (i.e., eating your own dog food).

  1. Generate: Create a new project from your template.
  2. Implement: Build a feature or fix a bug in the generated project.
  3. Backport: Once the changes are validated, move them back into the template itself.

This feedback loop ensures that your template remains practical, robust, and aligned with real-world needs.

How can you automatically test a code template?

Automated testing is critical to ensure a template doesn't break as it evolves. With pytest-cookies, you can write tests that automatically generate a project and verify the output.

# Test that the project generates successfully
def test_bake_project(cookies):
    result = cookies.bake(extra_context={"project_name": "helloworld"})

    assert result.exit_code == 0
    assert result.exception is None
    assert result.project_path.name == "helloworld"
    assert result.project_path.is_dir()

You can also use a library like pytest-shell-utilities to run shell commands and validate that setup tasks in the generated project work as expected.

def test_assert_good_exitcode(shell):
    ret = shell.run("exit", "0")
    assert ret.returncode == 0

def test_assert_bad_exitcode(shell):
    ret = shell.run("exit", "1")
    assert ret.returncode == 1
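
If you prefer not to add another plugin, the standard library can play a similar role. The following is a minimal sketch, not the pytest-shell-utilities API: `run_in_project` is a hypothetical helper that runs a command inside a given directory (for example, a freshly generated project) and returns the completed process for inspection.

```python
import subprocess
import sys
from pathlib import Path


def run_in_project(project_dir: Path, *command: str) -> subprocess.CompletedProcess:
    """Run a command inside project_dir and capture its output as text."""
    return subprocess.run(
        command,
        cwd=project_dir,
        capture_output=True,
        text=True,
        check=False,  # let the caller assert on the return code
    )


# Demonstration with the current directory standing in for a generated project.
result = run_in_project(Path.cwd(), sys.executable, "-c", "print('ok')")
print(result.returncode, result.stdout.strip())
```

In a template test, `project_dir` would be the path of the baked project, and the command would be one of the project's setup or check tasks.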

How do you run automated tasks after generation?

Cookiecutter hooks are Python or shell scripts that execute automatically before or after project generation. They are perfect for cleanup tasks or conditional logic.

A common use case is removing files that are not needed based on the user's choices during setup.

Example post_gen_project.py hook script:

This script removes a requirements.txt file if the user chose a package manager other than pip.

import os
import shutil

# Paths to remove, rendered by Jinja2 before the hook runs.
# An entry renders to an empty string when its condition is false.
REMOVE_PATHS = [
    "{% if cookiecutter.packaging != 'pip' %}requirements.txt{% endif %}",
]

for path in REMOVE_PATHS:
    path = path.strip()
    if path and os.path.exists(path):
        if os.path.isfile(path):
            os.unlink(path)
        else:
            # rmtree also removes non-empty directories, unlike os.rmdir
            shutil.rmtree(path)
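
Hooks that run before generation are a natural place for input validation. The sketch below shows one possible check for a package name; `is_valid_package` is a hypothetical helper, and the rule it enforces (a lowercase Python identifier that is not a keyword) is an assumption about what a template might want, not a Cookiecutter requirement.

```python
import keyword


def is_valid_package(name: str) -> bool:
    """True if name is a lowercase Python identifier that is not a keyword."""
    return name.isidentifier() and not keyword.iskeyword(name) and name == name.lower()


# Try the check on a few candidate names.
for candidate in ["mlops_project", "mlops-project", "class", "MLOps"]:
    print(candidate, is_valid_package(candidate))
```

In an actual hooks/pre_gen_project.py, the name would come from a rendered placeholder such as "{{ cookiecutter.package }}", and the script would exit with a non-zero status when the check fails, which causes Cookiecutter to abort generation.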

What is the difference between using a template and forking a repository?

Although they seem similar, templates and forks serve fundamentally different purposes.

  • Template: Use a template to start many new, independent projects from a shared baseline. Each new project is a distinct entity and does not share history with the template. The goal is standardization.
  • Fork: Create a fork to make a single, related copy of an existing repository. A fork is typically used to propose changes back to the original project (the "upstream") or as a starting point for a closely related but distinct project. The goal is contribution or parallel development.
