5.1. Task Automation

What is task automation?

Task automation refers to the process of automating repetitive and manual command-line tasks using software tools. This enables tasks to be performed with minimal human intervention, increasing efficiency and accuracy. A common example of task automation in software development is the use of make, a utility that automates the execution of predefined tasks like configure, build, and install within a project repository. By executing a simple command:

make configure build install

developers can streamline the compilation and installation process of software projects, saving time and reducing the likelihood of errors.

Why do you need task automation?

Task automation is essential for several reasons:

Don't repeat yourself: Automating tasks helps in avoiding the repetition of similar tasks, ensuring that you spend your time on tasks that require your unique skills and insights.
Share common actions: It enables teams to share a common set of tasks, ensuring consistency and reliability across different environments and among different team members.
Avoid typing mistakes: Automation reduces the chances of errors that can occur when manually typing commands or performing repetitive tasks, leading to more reliable outcomes.

Embracing task automation is a step towards improving efficiency for programmers. The initial effort in setting up automation pays off by saving time and reducing errors, making it a valuable practice in software development.

Which tools should you use to automate your tasks?

While Make is a ubiquitous and powerful tool for task automation, its syntax can be challenging due to its cryptic symbols (e.g., $*, $%, :=, ...) and strict formatting rules, such as the requirement that recipe lines be indented with tabs rather than spaces. This complexity can make Make intimidating for newcomers.

For those seeking a more approachable alternative, Just, a command runner written in Rust, offers a simpler, Make-inspired syntax for defining and running tasks. Here is an example showcasing how to build a Python package (wheel file) using Just:

# run package tasks
[group('package')]
package: package-build

# build package constraints
[group('package')]
package-constraints constraints="constraints.txt":
    uv pip compile pyproject.toml --generate-hashes --output-file={{constraints}}

# build python package
[group('package')]
package-build constraints="constraints.txt": clean-build package-constraints
    uv build --build-constraint={{constraints}} --require-hashes --wheel

This example illustrates how tasks are defined in a justfile: each recipe has a name, optional parameters with default values, dependencies listed after the colon, and indented commands that reference parameters with {{...}}. Developers can then execute a task from their terminal:

# execute the package build task
just package-build
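
The recipe syntax used in the package example (parameters with defaults, dependencies after the colon, and {{...}} interpolation) may be easier to see in isolation. The following is a minimal sketch, not part of the project: the recipe names greet and demo and the parameter name are hypothetical and chosen only for illustration.

```just
# hypothetical recipes, for illustration only

# print a greeting; `name` falls back to "world" when no argument is passed
greet name="world":
    @echo "hello {{name}}"

# dependencies listed after the colon (here `greet`) run before the recipe body
demo: greet
    @echo "done"
```

With this file, `just greet` uses the default value, `just greet team` overrides it positionally, and `just demo` runs greet first and then its own command.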

How can you configure your task automation system?

Configuring your task automation system with Just is straightforward. It can be installed as a Python dependency through:

uv add --group dev rust-just

Then, to configure Just for your project, create a justfile in your repository:

# https://just.systems/man/en/

# REQUIRES

docker := require("docker")
find := require("find")
rm := require("rm")
uv := require("uv")

# SETTINGS

set dotenv-load := true

# VARIABLES

PACKAGE := "bikes"
REPOSITORY := "bikes"
SOURCES := "src"
TESTS := "tests"

# DEFAULTS

# display help information
default:
    @just --list

# IMPORTS

import 'tasks/check.just'
import 'tasks/clean.just'
import 'tasks/commit.just'
import 'tasks/doc.just'
import 'tasks/docker.just'
import 'tasks/format.just'
import 'tasks/install.just'
import 'tasks/mlflow.just'
import 'tasks/package.just'
import 'tasks/project.just'

This configuration file allows you to define general settings like dotenv-load and project-specific variables like the project SOURCES and TESTS locations. Detailed documentation and more configuration options can be found on the Just website.
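
As a sketch of how these two kinds of configuration reach a recipe: justfile variables are interpolated with {{...}}, while values loaded from a .env file by dotenv-load arrive as ordinary environment variables. The recipe name show-config and the variable MLFLOW_TRACKING_URI are assumptions made for this example; they are not defined anywhere in the project files shown here.

```just
# load variables from a .env file into the environment of every recipe
set dotenv-load := true

SOURCES := "src"

# hypothetical recipe: print one justfile variable and one environment variable
show-config:
    @echo "sources: {{SOURCES}}"
    @echo "tracking uri: ${MLFLOW_TRACKING_URI:-unset}"
```

The `${...:-unset}` fallback is used because Just runs recipes with `sh -cu` by default, so referencing an unset variable directly would make the recipe fail.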

How should you organize your tasks in your project folder?

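For an MLOps project, it's advisable to organize tasks into categories and place them within a tasks/ directory at the root of your repository, with one file per category such as checks, cleaning, commits, and container management. The imports in the justfile above reflect this structure: tasks/check.just, tasks/clean.just, tasks/commit.just, tasks/doc.just, tasks/docker.just, tasks/format.just, tasks/install.just, tasks/mlflow.just, tasks/package.just, and tasks/project.just.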

Each module, like check, can define multiple tasks. For example:

# run check tasks
[group('check')]
check: check-code check-type check-format check-security check-coverage

# check code quality
[group('check')]
check-code:
    uv run ruff check {{SOURCES}} {{TESTS}}

# check code coverage
[group('check')]
check-coverage numprocesses="auto" cov_fail_under="80":
    uv run pytest --numprocesses={{numprocesses}} --cov={{SOURCES}} --cov-fail-under={{cov_fail_under}} {{TESTS}}

# check code format
[group('check')]
check-format:
    uv run ruff format --check {{SOURCES}} {{TESTS}}

# check code security
[group('check')]
check-security:
    uv run bandit --recursive --configfile=pyproject.toml {{SOURCES}}

# check unit tests
[group('check')]
check-test numprocesses="auto":
    uv run pytest --numprocesses={{numprocesses}} {{TESTS}}

# check code typing
[group('check')]
check-type:
    uv run mypy {{SOURCES}} {{TESTS}}

These tasks can then be invoked from the command line as needed, providing a structured and efficient way to manage and execute project-related tasks.

# run the code checker
just check-code
# run tests with a parameter
just check-test 1
# run the code and format checker
just check-code check-format
# run all the check tasks in the module
just check
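
Other modules can follow the same pattern. As an illustration, here is a rough sketch of what tasks/clean.just might contain. The clean-build recipe name comes from the package example earlier in this section, but its body, the clean-cache recipe, and the dist/ and build/ folder names are assumptions, not the project's actual file.

```just
# hypothetical tasks/clean.just, for illustration only

# run clean tasks
[group('clean')]
clean: clean-build clean-cache

# clean build folders (assumed to be dist/ and build/)
[group('clean')]
clean-build:
    rm -rf dist/ build/

# clean python bytecode caches
[group('clean')]
clean-cache:
    find . -type d -name __pycache__ -exec rm -rf {} +
```

Because package-build lists clean-build as a dependency, a module like this would run automatically before each package build.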