6.4. Templates
What is a code template?
A code template provides a standardized framework for initiating new projects, particularly beneficial in environments where multiple projects share common elements. This structure usually includes predefined configurations for common tools like linters, unit testing frameworks, and code formatters. By doing so, each new project can be customized with specific details such as the project name, description, and operating environment, while maintaining a consistent approach to software development.
For instance, the authors of this course provide the Cookiecutter MLOps Package for free to quickly generate new MLOps Projects based on the principles described in this course. This section explains how to leverage similar code templates and adapt them for your organization.
Why do you need to create code templates?
- Align code base and best practices: Creating a code template helps enforce uniformity and adherence to best practices across all projects within an organization
- Augment your productivity: Templates streamline the project setup process, significantly reducing the time and effort needed to start new projects
- Focus on the main problem: Code templates allow developers to concentrate on solving the business problem rather than getting bogged down by repetitive setup tasks. The template maintainers can focus on enhancing the template itself, ensuring it incorporates the latest and most efficient solutions.
As AI/ML projects become more standardized, much like assembly-line production, automating their creation speeds up delivery and raises the baseline quality of the initial setup.
Which tools should you use to create code templates?
Cookiecutter
Cookiecutter is a widely used tool in the Python community for creating project templates. It provides a straightforward command-line interface to generate new projects from user-defined templates.
cookiecutter [template-directory]
This command processes the template directory or repository containing a cookiecutter.json file and potentially other template files, prompting the user for input on defined variables.
Cruft
Cruft complements Cookiecutter by helping manage updates to projects created from a Cookiecutter template. It tracks changes in the template and can apply these changes to existing projects, helping maintain consistency and up-to-date practices across all projects.
Initializing a new project with Cruft:
cruft create [template-repository-url]
Updating the project afterwards:
cruft update
How can you pass variables to replace inside a code template?
Variables in Cookiecutter templates are managed using Jinja2, a template engine for Python. Jinja2 allows for dynamic content generation using placeholders in template files, which are replaced by actual values at generation time based on user input or default values defined in cookiecutter.json.
Example of using Cookiecutter variables in code:
# Define a variable in your Python script that uses Cookiecutter variables
project_name = "{{ cookiecutter.project_name }}"
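To make the substitution step concrete, here is a minimal sketch that mimics it with the standard library only. It is a simplified stand-in, not Jinja2 itself: the `render` function and its regular expression are hypothetical and handle only plain `{{ cookiecutter.<name> }}` placeholders, with no filters, method calls, or control blocks.

```python
import re

# Hypothetical, simplified stand-in for Jinja2: it handles only plain
# "{{ cookiecutter.<name> }}" placeholders.
PLACEHOLDER = re.compile(r"\{\{\s*cookiecutter\.(\w+)\s*\}\}")


def render(template: str, context: dict[str, str]) -> str:
    """Replace each placeholder with its value from the context."""

    def substitute(match: re.Match) -> str:
        key = match.group(1)
        if key not in context:
            raise KeyError(f"undefined template variable: {key}")
        return context[key]

    return PLACEHOLDER.sub(substitute, template)


if __name__ == "__main__":
    line = 'project_name = "{{ cookiecutter.project_name }}"'
    print(render(line, {"project_name": "MLOps Project"}))
```

Raising an error on an undefined variable is a design choice of this sketch: it surfaces typos in placeholder names early instead of silently writing an empty string.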
The cookiecutter.json file is where all default values for the variables in a template are defined. When a new project is generated, Cookiecutter prompts the user to input values for these variables or accept the defaults as specified in the JSON file.
Here is an example of defining variables in the cookiecutter.json file, which includes project metadata and configuration defaults:
{
    "user": "fmind",
    "name": "MLOps Project",
    "repository": "{{cookiecutter.name.lower().replace(' ', '-')}}",
    "package": "{{cookiecutter.repository.replace('-', '_')}}",
    "license": "MIT",
    "version": "0.1.0",
    "description": "TODO",
    "python_version": "3.12",
    "mlflow_version": "2.19.0"
}
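The `repository` and `package` defaults in this file are derived from earlier values. The sketch below reproduces the same string operations in plain Python to show the defaults a user would be offered after accepting `MLOps Project` as the name. The function name `derive_defaults` is illustrative and not part of Cookiecutter.

```python
def derive_defaults(name: str) -> dict[str, str]:
    """Mirror the string expressions used in the cookiecutter.json example."""
    # Same expression as "repository": lowercase, spaces become hyphens.
    repository = name.lower().replace(" ", "-")
    # Same expression as "package": hyphens become underscores.
    package = repository.replace("-", "_")
    return {"name": name, "repository": repository, "package": package}


if __name__ == "__main__":
    print(derive_defaults("MLOps Project"))
```

Deriving one value from another keeps the repository name and the importable package name consistent without asking the user to type both.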
How should you structure a cookiecutter template repository?
A cookiecutter template repository typically consists of two primary levels of structure:
- Template Files and Folders: These are the directories and files that will be generated and are identifiable by their names containing cookiecutter variables (e.g., {{cookiecutter.project_slug}}). All such template files and folders are contained within a single root directory (e.g., {{cookiecutter.repository}}) to facilitate easy generation.
- Supporting Files: These include additional resources such as documentation, scripts for automating setup tasks, or configuration files necessary for the template itself but not part of the generated project files. These files reside outside the main template folder.
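The two levels can be illustrated with a small sketch that builds a toy template repository in a temporary directory and locates its template root. The helper `find_template_root` and the file names used here are hypothetical, and the "exactly one root" rule is an assumption of this sketch rather than a statement about Cookiecutter's internals.

```python
import tempfile
from pathlib import Path


def find_template_root(repo: Path) -> Path:
    """Return the single templated root directory of a template repository."""
    if not (repo / "cookiecutter.json").is_file():
        raise FileNotFoundError("cookiecutter.json is missing from the repository root")
    # The template root is the directory whose name contains a cookiecutter variable.
    roots = [
        p for p in repo.iterdir()
        if p.is_dir() and "cookiecutter" in p.name and "{{" in p.name and "}}" in p.name
    ]
    if len(roots) != 1:
        raise ValueError(f"expected exactly one template root, found {len(roots)}")
    return roots[0]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        # Supporting files live next to the template root.
        (repo / "cookiecutter.json").write_text('{"repository": "demo"}')
        (repo / "README.md").write_text("Template documentation")
        # Template files live inside the templated root directory.
        root = repo / "{{cookiecutter.repository}}"
        root.mkdir()
        (root / "pyproject.toml").write_text("")
        print(find_template_root(repo).name)
```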
Refer to the cookiecutter-mlops-package template by the authors of this course for a practical example of how to structure a template repository.
Initializing this template package:
cookiecutter gh:fmind/cookiecutter-mlops-package
For advanced structuring techniques and best practices, refer to the Advanced Usage section of the Cookiecutter documentation.
What should you include and exclude from the code template?
What to Include:
- Automation tasks: Tools like PyInvoke to automate common development tasks.
- Linters: Tools such as Ruff to ensure code quality and style consistency.
- Unit test configurations: Setup configurations for tools like pytest to facilitate testing.
- Project configuration files: Such as pyproject.toml for managing project dependencies and settings.
What to Exclude:
- Specific source code or tests: Avoid including actual code or tests that imply a specific programming style or architecture. This allows developers to apply their preferred coding practices without constraints.
How should you keep the project updated with the code template?
To stay aligned with the original template as it evolves, create your project with cruft create and periodically run:
cruft update
This will help integrate enhancements and bug fixes from the template into your project, utilizing Git's capabilities to manage any conflicts that arise.
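Cruft records which template version a project was generated from in a `.cruft.json` file at the project root. As a rough illustration, the sketch below reads that file and reports the linked template and revision. It assumes the file contains `template` and `commit` keys; the helper name `describe_template_link` and the sample values are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def describe_template_link(project_dir: Path) -> str:
    """Summarize the template link stored in .cruft.json (assumed keys)."""
    state = json.loads((project_dir / ".cruft.json").read_text())
    # Assumption: "template" holds the template URL, "commit" the template revision.
    return f"{state['template']} @ {state['commit'][:7]}"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp)
        # Hypothetical sample content, for illustration only.
        sample = {"template": "https://example.com/template.git", "commit": "0123456789abcdef"}
        (project / ".cruft.json").write_text(json.dumps(sample))
        print(describe_template_link(project))
```

A summary like this can help when auditing many repositories to see which ones lag behind the current template revision.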
How should you illustrate the usage of a code template?
Creating one or several demo repositories based on the template serves multiple purposes:
- Demonstrates practical application: Show how the template can be used in a real-world scenario.
- Encourages experimentation: Provides a base for others to experiment with different coding styles or architectural approaches.
- Facilitates specific feature development: Use the demo to develop features that might not be universally required but are useful for some projects.
How should you change and improve the code template?
The most effective approach to develop and refine a code template is to:
- Generate a new project using the code template.
- Implement and test your changes in the generated project.
- Once validated, backport these changes to the template itself.
This iterative approach helps ensure that the template remains robust and functional across different use cases.
How can you test the generation of a code template automatically?
Utilizing pytest-cookies, you can automatically test the generation process of your Cookiecutter template:
def test_bake_project(cookies):
    # "repo_name" must match a variable defined in the template's cookiecutter.json.
    result = cookies.bake(extra_context={"repo_name": "helloworld"})
    assert result.exit_code == 0
    assert result.exception is None
    assert result.project_path.name == "helloworld"
    assert result.project_path.is_dir()
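A generation test can also check that no template placeholder survived rendering. The following sketch uses only the standard library and could be called on `result.project_path` inside a test like the one shown here. The helper name `find_unrendered` is hypothetical, and the check is a heuristic, since a generated file may legitimately contain the text `{{ cookiecutter.`.

```python
import re
import tempfile
from pathlib import Path

# Matches the start of an unrendered placeholder such as "{{ cookiecutter.name }}".
UNRENDERED = re.compile(r"\{\{\s*cookiecutter\.")


def find_unrendered(project_dir: Path) -> list[Path]:
    """Return paths whose name or text content still contains a placeholder."""
    leftovers = []
    for path in sorted(project_dir.rglob("*")):
        if UNRENDERED.search(path.name):
            leftovers.append(path)
        elif path.is_file():
            try:
                text = path.read_text(encoding="utf-8")
            except UnicodeDecodeError:
                continue  # skip binary files
            if UNRENDERED.search(text):
                leftovers.append(path)
    return leftovers


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp)
        (project / "good.py").write_text('name = "helloworld"\n')
        (project / "bad.py").write_text('name = "{{ cookiecutter.name }}"\n')
        print([p.name for p in find_unrendered(project)])
```

In a pytest-cookies test, the equivalent assertion would be that `find_unrendered(result.project_path)` returns an empty list.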
Additionally, pytest-shell-utilities can be used to run shell commands post-generation to validate setup tasks:
def test_assert_good_exitcode(shell):
    ret = shell.run("exit", "0")
    assert ret.returncode == 0

def test_assert_bad_exitcode(shell):
    ret = shell.run("exit", "1")
    assert ret.returncode == 1
How can you perform automated actions after the code generation?
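Cookiecutter's hook mechanism allows the execution of Python or shell scripts after project generation. Post-generation hooks are stored in the template's hooks directory (for example, hooks/post_gen_project.py) and run from the root of the generated project. This is useful for performing setup tasks or cleaning up unnecessary files based on the user's choices.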
Example of a post-generation hook script:
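import os
import shutil

# Each entry is rendered by Jinja2 to a path or to an empty string.
REMOVE_PATHS = [
    "{% if cookiecutter.packaging != 'pip' %}requirements.txt{% endif %}",
]

for path in REMOVE_PATHS:
    path = path.strip()
    if path and os.path.exists(path):
        if os.path.isfile(path):
            os.unlink(path)  # remove a single file
        else:
            shutil.rmtree(path)  # remove a directory, even if it is not empty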