4.5. Formatting

What is code formatting?

Code formatting is the practice of applying a consistent layout to your source code. It governs visual aspects such as indentation, line length, whitespace, quote style, and the placement of line breaks and comments. Broader style guides also cover conventions such as variable naming, which automated formatters leave untouched. The goal is to make the code uniform and predictable, which improves its readability.

Why is code formatting crucial?

Consistent formatting is a cornerstone of professional software development for several key reasons:

Improves Readability: Well-formatted code is visually organized, making it easier for developers to read, understand, and navigate complex logic.
Streamlines Collaboration: When all team members follow the same formatting rules, stylistic inconsistencies disappear. The codebase feels familiar to everyone, which reduces friction and onboarding time.
Enhances Maintainability: A uniform style keeps diffs focused on behavior rather than layout, which makes it easier to spot bugs, apply updates, and refactor code. It also prevents trivial debates (e.g., tabs vs. spaces), allowing the team to focus on solving real problems.

What is the standard formatting convention for Python?

PEP 8 is the style guide for the code in Python's standard library, and most of the Python community has adopted it as the de facto convention. It provides guidelines for everything from code layout to naming conventions. Adhering to PEP 8 is highly recommended because it keeps your code familiar to other Python developers.

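While it's best to stick to the defaults, some PEP 8 rules can be adjusted. For example, PEP 8 limits lines to 79 characters (72 for comments and docstrings), a value chosen with narrow terminals and side-by-side editor windows in mind, but it allows teams to agree on a limit of up to 99 characters. Black and Ruff default to 88, and some projects choose 100.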
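
To make the effect of the limit concrete, here is a minimal sketch that reports the lines exceeding a chosen length. The function name find_long_lines and the sample source are illustrative assumptions, not part of PEP 8 or of any tool; in a real project the formatter and linter perform this check for you.

```python
def find_long_lines(source: str, limit: int = 79) -> list[tuple[int, int]]:
    """Return (line_number, length) pairs for lines longer than `limit`."""
    return [
        (number, len(line))
        for number, line in enumerate(source.splitlines(), start=1)
        if len(line) > limit
    ]


# Hypothetical sample: the second line is 85 characters long.
sample = "short = 1\n" + "x = " + "1" * 81 + "\n"

print(find_long_lines(sample, limit=79))  # [(2, 85)]: too long for the PEP 8 limit
print(find_long_lines(sample, limit=88))  # []: accepted with a limit of 88
```

The same file can therefore be compliant or not depending on the limit the team agrees on, which is why the value should be set once in the project configuration.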

What is the difference between formatting and linting?

Although often used together, formatting and linting serve different purposes:

Formatting automatically rewrites your code to conform to a specific style guide. Its primary goal is to ensure visual consistency and readability. A formatter is deterministic and leaves no room for case-by-case judgment: the same input always produces the same output.
Linting analyzes your code to detect programmatic errors, potential bugs, stylistic issues, and "code smells." Its primary goal is to improve code quality and prevent errors. It flags issues but often requires manual intervention to fix them.

Tools like Ruff can perform both formatting and linting, providing a comprehensive solution for code quality.
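
The distinction can be illustrated with the standard library alone. The sketch below is not how Ruff works internally; it assumes a hypothetical snippet, a hand-written "formatted" version of it, and a toy lint rule (unused_imports) written only for this example. It shows that formatting changes the layout but not the program, while linting reports a problem that formatting leaves in place.

```python
import ast

# Hypothetical input: inconsistent spacing plus an unused import.
before = "import os\nimport sys\nx = {  'a':1,'b':2 }\nprint( sys.argv,x )\n"

# Roughly what a formatter would produce: only the layout changes.
after = 'import os\nimport sys\n\nx = {"a": 1, "b": 2}\nprint(sys.argv, x)\n'


def same_program(left: str, right: str) -> bool:
    """Compare two sources by syntax tree, ignoring layout and quote style."""
    return ast.dump(ast.parse(left)) == ast.dump(ast.parse(right))


def unused_imports(source: str) -> list[str]:
    """Toy lint rule: return imported names that are never referenced."""
    tree = ast.parse(source)
    imported = [
        alias.asname or alias.name.split(".")[0]
        for node in ast.walk(tree)
        if isinstance(node, (ast.Import, ast.ImportFrom))
        for alias in node.names
    ]
    used = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return [name for name in imported if name not in used]


print(same_program(before, after))  # True: formatting kept the program intact
print(unused_imports(after))        # ['os']: the lint issue survives formatting
```

Removing the unused import is a decision about the program itself, which is why it belongs to the linter rather than the formatter.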

Which tools should you use to format a Python codebase?

black (a code formatter) and isort (an import sorter) were the traditional choices, but Ruff now covers both in a single tool. Its formatter is designed as a drop-in replacement for black, and its import-sorting rules (the I rule set) replace isort. Ruff is also extremely fast, so adopting it both simplifies and speeds up your toolchain.

You can install and run Ruff to format your entire codebase with these commands:

# Add Ruff to the project's "check" dependency group
uv add --group check ruff

# Sort and organize all import statements
uv run ruff check --select I --fix src/ tests/

# Format all source code files
uv run ruff format src/ tests/
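
If you prefer to call both steps from a single script (for example from a task runner or a CI job), the commands above can be wrapped as in the sketch below. The function names build_ruff_commands and run_ruff are hypothetical. The fix=False branch relies on Ruff's --check option for ruff format, which is not mentioned above: it reports files that would change without rewriting them.

```python
import subprocess


def build_ruff_commands(paths: list[str], fix: bool = True) -> list[list[str]]:
    """Return the import-sorting and formatting commands for `paths`.

    With fix=True the files are rewritten in place; with fix=False the
    commands only report problems, which suits a CI check.
    """
    if fix:
        return [
            ["uv", "run", "ruff", "check", "--select", "I", "--fix", *paths],
            ["uv", "run", "ruff", "format", *paths],
        ]
    return [
        ["uv", "run", "ruff", "check", "--select", "I", *paths],
        ["uv", "run", "ruff", "format", "--check", *paths],
    ]


def run_ruff(paths: list[str], fix: bool = True) -> int:
    """Run the commands in order and stop at the first failure."""
    for command in build_ruff_commands(paths, fix=fix):
        completed = subprocess.run(command, check=False)
        if completed.returncode != 0:
            return completed.returncode
    return 0
```

Calling run_ruff(["src/", "tests/"]) reproduces the two commands shown above, and run_ruff(["src/", "tests/"], fix=False) returns a non-zero code when any file is not yet sorted or formatted.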

How can you automate formatting?

Manually running commands is inefficient. The best practice is to configure your code editor to format your code automatically every time you save a file. This "set it and forget it" approach ensures your code is always compliant without any extra effort.

For VS Code, you can install the Ruff extension and add the following to your [project].code-workspace file:

{
    "settings": {
        // Enable format on save for all files
        "editor.formatOnSave": true,
        // Specific settings for Python files
        "[python]": {
            // Run code actions like organizing imports on save
            "editor.codeActionsOnSave": {
                "source.organizeImports": "explicit"
            },
            // Set Ruff as the default formatter for Python
            "editor.defaultFormatter": "charliermarsh.ruff"
        }
    },
    "extensions": {
        // Recommend the Ruff extension to anyone opening the project
        "recommendations": [
            "charliermarsh.ruff"
        ]
    }
}

When should you customize formatting rules?

To maximize productivity and avoid style debates, it is highly recommended to adopt the default settings of your chosen formatter. This is often called the "zero-configuration" principle.

However, if your project requires specific adjustments, you can configure Ruff in your pyproject.toml file. Common customizations include line length and docstring conventions.

[tool.ruff]
# Set the maximum line length
line-length = 100

[tool.ruff.format]
# Enable formatting of code snippets within docstrings
docstring-code-format = true

[tool.ruff.lint.pydocstyle]
# Set the docstring style expected by the pydocstyle (D) lint rules: google, numpy, or pep257
convention = "google"
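
To show what these two docstring settings act on, here is a small function documented in the Google convention with a doctest-style example. The function scale is hypothetical and exists only for illustration. With docstring-code-format enabled, Ruff formats the code after the >>> prompt in the same way as regular code; the convention setting tells the docstring lint rules to expect sections such as Args and Returns.

```python
def scale(values: list[float], factor: float = 2.0) -> list[float]:
    """Multiply every value by a constant factor.

    Args:
        values: Numbers to scale.
        factor: Multiplier applied to each number.

    Returns:
        A new list with each value multiplied by `factor`.

    Example:
        >>> scale([1.0, 2.5], factor=2.0)
        [2.0, 5.0]
    """
    return [value * factor for value in values]
```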

How can you disable formatting for specific lines?

On rare occasions, you may need to prevent the formatter from altering a specific block of code where the default formatting reduces readability. You can achieve this in two ways:

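Implicitly: Add a trailing comma after the last item of a list, dictionary, set, tuple, or argument list. This "magic trailing comma" forces the formatter to keep each item on a separate line, even when everything would fit on one line.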

# The trailing comma prevents this dict from being collapsed into one line
items = {
    "a": 1,
    "b": 2,
    "c": 3,
}

Explicitly: Wrap the code block with # fmt: off and # fmt: on comments to tell the formatter to leave it untouched. For a single statement, a trailing # fmt: skip comment has the same effect.

# fmt: off
# This block will not be formatted
not_formatted      = 3
also_not_formatted = 4
# fmt: on
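
A typical case where this pays off is tabular data such as a small matrix, where hand-aligned columns are easier to read than the formatter's layout. The sketch below is illustrative: the names ROTATE_90, SCALE, and apply are assumptions made for this example.

```python
# fmt: off
# Hand-aligned columns make the structure of the matrix obvious.
ROTATE_90 = [
    [0, -1,  0],
    [1,  0,  0],
    [0,  0,  1],
]
# fmt: on

# A trailing "fmt: skip" comment protects a single statement instead of a block.
SCALE = [2,   2,   1]  # fmt: skip


def apply(matrix: list[list[int]], vector: list[int]) -> list[int]:
    """Multiply a square matrix by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


print(apply(ROTATE_90, [1, 0, 0]))  # [0, 1, 0]
```

Keep such regions small: everything between the two comments is excluded from formatting, including code that is added there later.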