1.3 Poetry
What is Poetry?
Poetry stands out as a contemporary tool for Python package and dependency management, aiming to streamline the process of defining, managing, and packaging project dependencies. It fulfills the need for a unified tool capable of handling project setup, dependency resolution, and package distribution, proving to be an essential asset for Python developers.
What is a package?
A Python package is a set of Python modules grouped together that can be installed and used within your projects. Python packages help you manage the functionality of Python by allowing you to add and utilize external libraries and frameworks that are not part of the standard Python library.
Poetry simplifies the management of these packages by handling them as dependencies. When using Poetry, developers can easily specify which packages are needed for their projects through a pyproject.toml
file. Poetry ensures that all specified dependencies are installed in the correct versions, maintaining a stable and conflict-free environment for development. Here’s an example of specifying dependencies with Poetry:
[tool.poetry]
name = "example-project"
version = "0.1.0"
description = "An example project to demonstrate Poetry"
[tool.poetry.dependencies]
python = "^3.8"
requests = "^2.25.1"
[tool.poetry.dev-dependencies]
pytest = "^5.2"
You will learn more on how to construct and publish Python Package in the Package section of this course.
Why do you need a package manager?
In the Python ecosystem, the distribution and installation of software through packages is a standard practice. These packages, often available in Wheel or zip formats, encapsulate source code along with vital metadata. Manually handling these packages and their dependencies can quickly become cumbersome, underscoring the need for package managers. Tools like Poetry automate these processes, boosting productivity and guaranteeing consistent environments across development and deployment.
By default, Poetry will download and install Python packages from Pypi, a repository of software for the Python programming language. If needed, other Python repositories can be configured to providing extra source of dependencies.
Why should you use Poetry in your project?
Incorporating Poetry into your project brings several key advantages:
- Improved Environment Management: Poetry streamlines the management of different project environments, promoting consistent development practices.
- Simplified Package Building and Distribution: It provides a unified workflow for building, distributing, and installing packages, easing the complexities usually associated with these tasks.
- Uniform Project Metadata: Poetry employs a standardized approach to defining project metadata, including dependencies, authors, and versioning, through a
pyproject.toml
file. This standardization ensures clarity and uniformity.
Compared to traditional approaches that involve pip, venv, and manual dependency management, Poetry offers a more cohesive and friendly experience, merging multiple package and environment management tasks into a single, simplified process.
How can you install Poetry on your system?
Poetry can be installed through various methods to accommodate different preferences and system setups. The recommended way is via pipx, which installs Poetry in an isolated environment to avoid conflicts with other project dependencies. Confirming the installation is as simple as running poetry --version
in the terminal, which will display the installed Poetry version.
# Install pipx on your system
python -m pip install pipx
# install poetry using pipx
pipx install poetry
At the time of writing, the latest version of Poetry is 1.8.2.
How can you use Poetry for your MLOps project?
Integrating Poetry into your MLOps project involves several key steps designed to configure and prepare your development environment:
- Begin by creating a new project directory and navigate into it.
- Run
poetry init
in your terminal. This command starts an interactive guide to help set initial project parameters, such as package name, version, description, author, and dependencies. This step generates apyproject.toml
file, crucial for your project's configuration under Poetry. - Run
poetry install
to install the project dependencies and source code. This will let you access your project code throughpoetry shell
and its command-line utilities withpoetry run
.
The pyproject.toml
file plays a central role in defining your project’s dependencies and settings, with further configuration details available in the Poetry documentation.
# https://python-poetry.org/docs/pyproject/
# PROJECT
[tool.poetry]
name = "bikes"
version = "1.0.0"
description = "Predict the number of bikes available."
repository = "https://github.com/fmind/mlops-python-package"
documentation = "https://fmind.github.io/mlops-python-package/"
readme = "README.md"
license = "CC BY"
keywords = ["mlops", "python", "package"]
packages = [{ include = "bikes", from = "src" }]
[tool.poetry.scripts]
bikes = 'bikes.scripts:main'
[tool.poetry.dependencies]
python = "^3.12"
[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
At the end of the installation process, a poetry.lock
file is generated with all the project dependencies that have been installed. You can remove and regenerate the poetry.lock
file if you wish to update the list of dependencies.
What is a Python virtual environment?
A Python virtual environment is an isolated space where you can manage Python packages for specific projects without affecting the global Python installation. This setup allows different projects to have their own dependencies, regardless of what dependencies every other project has.
Poetry enhances the management of virtual environments by automatically creating and managing them for each project. It handles the installation of required packages into these isolated environments, ensuring that dependencies for one project do not interfere with another. This process is seamless and automatic, simplifying development workflows and reducing dependency conflicts.
How can you make your Poetry project easier to manage?
To enhance your Poetry management experience, consider creating a poetry.toml
file in your project's root with specific virtual environment configurations:
# https://python-poetry.org/docs/configuration/
[virtualenvs]
in-project = true
prefer-active-python = true
These configurations ensure that Poetry creates virtual environments directly within your project directory and prioritizes the active Python interpreter. This practice is aligned with optimal environment management standards. More details are available in the Poetry documentation on environment management.
How can you install dependencies for your project with Poetry?
Poetry differentiates between main (production) and development dependencies, offering an organized approach to dependency management. To add dependencies, use the following commands:
# For main dependencies
$ poetry add pandas scikit-learn
# For development dependencies
$ poetry add -G dev ipykernel
Executing these commands updates the pyproject.toml
file, accurately managing and versioning your project's dependencies.
In production, you can decide to install only the main dependencies using this command:
poetry install --only main
What is the difference between main and dev dependencies in Poetry?
In Poetry, dependencies are divided into two types: main dependencies and development (dev) dependencies.
Main Dependencies: These are essential for your project's production environment—your application can't run without them. For example, libraries like Pandas or XGBoost would be main dependencies for an MLOps project.
Development Dependencies: These are used only during development and testing, such as testing frameworks (e.g., pytest) or linters (e.g., ruff). They are not required in production.
Here’s a simple example in a pyproject.toml
file:
[tool.poetry.dependencies]
flask = "^2.0.1" # Main dependency
[tool.poetry.dev-dependencies]
pytest = "^6.2.4" # Development dependency`
This setup helps keep production environments lean by excluding unnecessary development tools.
Can you use Poetry to download Python dependencies from your own organization's code repository?
Poetry supports incorporating custom package repositories, including private or organizational ones. This capability allows for the use of proprietary packages in conjunction with those from the public PyPI. Adding a custom repository and setting up authentication is facilitated by Poetry's configuration commands, offering secure and adaptable dependency management.
For a deeper understanding of Poetry's features, including advanced configurations, package specifications, and command instructions, consult the official Poetry documentation. These resources offer detailed insights into leveraging Poetry to its fullest potential in your Python projects.