Python for Data Science Series - Python environments with pyenv and poetry

Introduction

Python is a versatile programming language widely embraced in fields such as web development, data science, machine learning, and scientific computing. However, juggling different Python installations and dependencies can quickly become overwhelming. In this post, we will explore how tools like pyenv and poetry simplify this process by managing Python versions and project dependencies. Let's embark on this journey of optimizing code environments together!

Why Python?

According to the 2022 Stack Overflow Developer Survey, Python is one of the most widely used programming languages today. Of 71,467 responses, 68% of developers said that they love the language and plan to continue working with it, and approximately 12,000 of those who have not yet had the chance to use it expressed interest in starting to develop with it. Its popularity is mainly due to its simple syntax, expressiveness, and versatility. We can use Python to create almost any kind of software, from web applications to scientific computing.

Python also has the PyPI package index, a Python package repository where you can find many useful packages (TensorFlow, pandas, NumPy, etc.) that facilitate and speed up your project's development. You can also publish your own packages on PyPI and share them with the community. The ecosystem keeps growing fast, and big companies like Google, Facebook, and IBM contribute by adding new packages.

info

The Python Package Index, abbreviated as PyPI (/ˌpaɪpiˈaɪ/) and also known as the Cheese Shop (a reference to the Monty Python's Flying Circus sketch "Cheese Shop"), is the official third-party software repository for Python. It is analogous to the CPAN repository for Perl and to the CRAN repository for R.[1]

Python Dependency hell

Well, it sounds like Python is amazing! However, if you have been using Python for a while, you may have already noticed that handling different Python installations and dependencies (packages) can be a nightmare. This issue is commonly known as dependency hell, a term for the frustration that arises from problems managing a project's dependencies.

Dependency hell in Python often happens because, by default, pip installs every package into one environment that all projects share, and because pip historically lacked a full dependency resolver (a backtracking resolver only became the default in pip 20.3). As a result, updating or uninstalling a dependency for one project can break others.

On top of that, Python cannot keep two versions of the same library side by side in a single site-packages directory. This leads to conflicts when two projects require different versions of the same library, or when the globally installed version does not match what a project needs.
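To see the "one version per library" limitation in practice, the sketch below lists what is installed in the current environment using only the standard library. The helper name `installed_versions` is hypothetical, and the output depends entirely on your own machine; the point is that each distribution name maps to a single version.

```python
import sysconfig
from importlib import metadata


def installed_versions() -> dict:
    """Map each installed distribution name to the one version found for it."""
    versions = {}
    for dist in metadata.distributions():
        meta = dist.metadata
        name = meta.get("Name") if meta is not None else None
        version = meta.get("Version") if meta is not None else None
        if name and version:  # skip entries with broken metadata
            # The first match on sys.path is the one Python would import.
            versions.setdefault(str(name).lower(), str(version))
    return versions


if __name__ == "__main__":
    # Directory where third-party packages for this interpreter are installed.
    print("site-packages:", sysconfig.get_paths()["purelib"])
    for name, version in sorted(installed_versions().items()):
        print(f"{name}=={version}")
```

If two projects sharing this environment needed different versions of the same package, only one of them could be satisfied at a time.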

Thus, having tools that enable us to isolate and manage our project's dependencies is highly convenient. In this post, I will show you how to use pyenv and poetry to create your code environments.

Virtual environments to the rescue!

A Python virtual environment is a separate folder where only your project's dependencies (packages) are located. Each virtual environment has its own Python binary (which matches the version of the binary that was used to create the environment) and its own independent set of installed Python packages in its site directories. That is a very convenient way to prevent dependency hell.
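As a rough illustration of what "a separate folder with its own Python binary" means, the following sketch creates a throwaway virtual environment with the standard library's `venv` module and reads back its `pyvenv.cfg` file. This is not how poetry does it internally; the helper name `create_env` and the temporary location are assumptions made for the example.

```python
import tempfile
import venv
from pathlib import Path


def create_env(target: Path) -> dict:
    """Create a virtual environment at `target` and return its pyvenv.cfg settings."""
    # with_pip=False keeps the sketch fast and avoids any network access.
    venv.create(target, with_pip=False)
    settings = {}
    for line in (target / "pyvenv.cfg").read_text().splitlines():
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cfg = create_env(Path(tmp) / ".venv")
        # "home" points at the interpreter used to create the environment.
        print("created from:", cfg.get("home"))
        print("python version:", cfg.get("version"))
```

The `home` and `version` entries record which interpreter the environment was created from, which is why an environment always matches the Python version that built it.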

tip

Python virtual environments let projects that target different Python versions coexist on the same machine, so you can test your application using different Python versions. They also keep your project's dependencies isolated, so they don't interfere with the dependencies of other projects.

There are different tools out there that can be used to create Python virtual environments. In this post, I will show you how to use pyenv and poetry. However, you can also try other tools, such as virtualenv or Anaconda, and choose the one you feel most comfortable with based on your experience.

Pyenv

pyenv is a command-line tool that allows you to install and run multiple versions of Python on the same machine. For those who come from a JavaScript background, pyenv is very similar to nvm.

Setup & get started with pyenv

To install pyenv on macOS, follow the installation steps in the pyenv documentation, which also covers alternative installation methods.

After installing pyenv, you can install any Python version by running the command pyenv install <version>.

tip

If you are not sure which versions are available to install on your machine, you can run the command pyenv install --list.

You can run the command pyenv versions to check which Python versions have been installed.


To set the default version of Python to be used, you can run the command pyenv global <version>. This version will be used when you run python or python3 in your terminal.


As an alternative to the pyenv global command, you may sometimes want to set a specific version of Python to be used within a specific folder. You can do so by creating a .python-version file in the folder with the version you want to use, or by running the command pyenv local <version>. pyenv will then use this version when you run python or python3 in that folder.


To check which Python version pyenv is currently using, you can run the command pyenv version.
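The way pyenv chooses between a global and a local version can be modeled in a few lines. The sketch below is a simplified model, not pyenv's actual implementation: it checks the PYENV_VERSION environment variable first, then the nearest .python-version file in the current folder or any parent, and finally a global version file. The function names are hypothetical, and the sketch only reads the first version listed in a file.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def _first_token(path: Path) -> Optional[str]:
    """Return the first whitespace-separated entry of a version file, if any."""
    tokens = path.read_text().split()
    return tokens[0] if tokens else None


def resolve_version(start: Path, env: Mapping[str, str], global_file: Path) -> Optional[str]:
    """Simplified model of how a Python version is selected for a folder."""
    # 1. An explicit override through the environment wins.
    if env.get("PYENV_VERSION"):
        return env["PYENV_VERSION"]
    # 2. Otherwise use the nearest .python-version, walking up the parents.
    start = start.resolve()
    for folder in [start, *start.parents]:
        candidate = folder / ".python-version"
        if candidate.is_file():
            return _first_token(candidate)
    # 3. Otherwise fall back to the global version file.
    if global_file.is_file():
        return _first_token(global_file)
    return None  # pyenv itself would fall back to the system Python here


if __name__ == "__main__":
    # Assumption: the default pyenv root is ~/.pyenv.
    global_file = Path.home() / ".pyenv" / "version"
    print(resolve_version(Path.cwd(), os.environ, global_file))
```

This is why a .python-version file created by pyenv local also applies to every subfolder of the project.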

Poetry

Poetry is a tool that allows you to manage your project's dependencies and facilitates the process of packaging for distribution. It resolves your project dependencies and makes sure that there are no conflicts between them.

Poetry integrates with the PyPI package index to find and install your environment's dependencies, and works with pyenv, which sets your project's Python runtime.

To install poetry, follow the installation steps in the Poetry documentation. Once the installation succeeds, we can move forward with the rest of the tutorial.

To ask poetry to create a new project, we use the command poetry new <project name>. This will create a new folder with the name <project name> and a pyproject.toml file inside it.


If you already have a project and you want to use poetry to manage its dependencies, you can use the command poetry init. Poetry will then add the pyproject.toml file to your project.


The main file of your poetry project is the pyproject.toml file. This file defines your project's dependencies (Python packages) and holds the required metadata for packaging. Poetry updates this file every time a new Python package is added. By sharing this file with others, you allow them to recreate your project environment and run your application. To do so, they need to have poetry installed and run the command poetry install in the folder where the pyproject.toml file is located.

Now we can start adding dependencies to our project. To do so, we use the command poetry add <package name>.


Now our pyproject.toml file looks like:

[tool.poetry]
name = "myproject"
version = "0.1.0"
description = ""
authors = ["Henry Ruiz <henry.ruiz.tamu@gmail.com>"]

[tool.poetry.dependencies]
python = "^3.10"
numpy = "^1.23.1"
pandas = "^1.4.3"

[tool.poetry.dev-dependencies]
pytest = "^5.2"

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"

Let's review the sections of that file:

  • [tool.poetry]: This section contains informational metadata about our package, such as the package name, description, author details, etc. Most of the config values here are optional unless you're planning on publishing this project as an official PyPI package.
  • [tool.poetry.dependencies]: This section defines the dependencies of your project, that is, the Python packages that your project requires to run. We can also edit this section manually if needed.
  • [tool.poetry.dev-dependencies]: This section defines the dev dependencies of your project. These dependencies are not required for your project to run, but they are useful for development. (Poetry 1.2 and later prefer the [tool.poetry.group.dev.dependencies] section for this purpose.)
  • [build-system]: This is rarely a section you'll need to touch unless you upgrade your version of Poetry.
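The version strings in the dependency sections, such as ^1.23.1, are caret constraints: they allow updates that do not change the leftmost non-zero component of the version. The sketch below is an illustrative re-implementation of that rule, not Poetry's own code. The function name `caret_bounds` is hypothetical, and it assumes plain numeric versions with at most three components.

```python
def caret_bounds(constraint: str) -> tuple:
    """Translate a caret constraint into (inclusive lower, exclusive upper) bounds."""
    parts = [int(p) for p in constraint.lstrip("^").split(".")]
    given = len(parts)
    parts += [0] * (3 - given)  # pad to major.minor.patch
    # The leftmost non-zero component is the one that must not change;
    # if every given component is zero, the last given one is used.
    pivot = next((i for i in range(given) if parts[i] != 0), given - 1)
    upper = parts[:pivot] + [parts[pivot] + 1] + [0] * (2 - pivot)
    return ".".join(map(str, parts)), ".".join(map(str, upper))


if __name__ == "__main__":
    for spec in ("^3.10", "^1.23.1", "^1.4.3", "^5.2"):
        low, high = caret_bounds(spec)
        print(f"{spec:>8}  means  >={low},<{high}")
```

For example, numpy = "^1.23.1" accepts any 1.x release from 1.23.1 onward but not 2.0.0, while python = "^3.10" accepts any Python 3 release from 3.10 onward.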

To see your project's dependencies in a nicer format, you can use the command poetry show --tree. This command draws a graph of all of our dependencies as well as the dependencies of our dependencies.


If we are not sure at some point that we have the latest version of a dependency, we can tell poetry to check our package repository for a newer version by using the --latest option (poetry show --latest).

If we list our folder content, we will see that not only the pyproject.toml file is created, but also some other folders and files. So, let's take a look at the contents of the myproject folder.

├── .venv
│   ├── .gitignore
│   ├── bin
│   ├── lib
│   └── pyvenv.cfg
├── README.rst
├── myproject
│   └── __init__.py
├── poetry.lock
├── pyproject.toml
└── tests
    ├── __init__.py
    └── test_myproject.py

5 directories, 11 files
  • .venv: This folder is created by poetry when it creates a virtual environment. It isolates the project from the system environment and provides a clean environment for your project. It contains the Python interpreter and your project's dependencies.
  • poetry.lock: When poetry finishes installing the dependencies, it writes all of the packages and their exact versions to the poetry.lock file, locking the project to those specific versions.

note

Notice that this folder structure is created only if the poetry new myproject command was executed. When poetry is initialized within a folder that already exists (using the poetry init command), only the pyproject.toml file and, once dependencies are installed, the .venv folder are created.

tip

You should commit the poetry.lock file to your project repo so that everyone working on the project is locked to the same versions of the dependencies. For more info, see the Poetry basic usage documentation.

Building our project and publishing it only requires running the poetry build and poetry publish commands, so it is pretty intuitive. The publish command uploads our package to PyPI, so other developers can easily install it with pip.

Hands-on tutorial

Creating a python package using poetry

In this section, you will learn how to create a simple python package named style_image with poetry. This simple python package takes two images, the style image, and the content image, and performs style transfer. "Style transfer is an optimization technique used to take two images—a content image and a style reference image (such as an artwork by a famous painter)—and blend them together, so the output image looks like the content image, but “painted” in the style of the style reference image."[2]

For our style_image package, we will use the magenta/arbitrary-image-stylization-v1-256 model available in TensorFlow Hub under the hood.

So, let's do it!!

We will start by creating a new project called style_image using the command poetry new style_image.


Installing package dependencies

Next, we install the dependencies of our project by running:

poetry add tensorflow-hub tensorflow numpy pillow validators "typer[all]"

We will see that there is an error trying to install tensorflow:

Creating virtualenv style-image in /Users/haruiz/temp/style_image/.venv
Using version ^0.12.0 for tensorflow-hub
Using version ^2.9.1 for tensorflow
Using version ^1.23.1 for numpy
Using version ^9.2.0 for Pillow
Using version ^0.20.0 for validators
Using version ^0.6.1 for typer

Updating dependencies
Resolving dependencies... (4.2s)

SolverProblemError

The current project's Python requirement `(>=3.10,<4.0)` is not compatible with some of the required packages Python requirement:
- tensorflow-io-gcs-filesystem requires Python `>=3.7, <3.11`, so it will not be satisfied for Python `>=3.11,<4.0`

The great thing is that poetry generally explains how to fix such errors. For the error above, caused by tensorflow-io-gcs-filesystem, poetry suggests restricting the python property to >=3.10,<3.11 in the pyproject.toml file.
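The reason the solver rejects the original requirement can be reproduced with a small range check. The sketch below is a simplified model of the comparison, not Poetry's solver: each requirement is written as an inclusive lower bound and an exclusive upper bound, and the helper name `is_within` is hypothetical.

```python
# A Python requirement as (inclusive lower bound, exclusive upper bound).
PROJECT_ORIGINAL = ((3, 10), (4, 0))     # >=3.10,<4.0
PROJECT_RESTRICTED = ((3, 10), (3, 11))  # >=3.10,<3.11
TF_IO_GCS = ((3, 7), (3, 11))            # >=3.7,<3.11


def is_within(project, package) -> bool:
    """True when every Python version the project allows is also allowed by the package."""
    (project_low, project_high), (package_low, package_high) = project, package
    return package_low <= project_low and project_high <= package_high


if __name__ == "__main__":
    # The project claims to support 3.11 and later, which the package does not.
    print("original  :", is_within(PROJECT_ORIGINAL, TF_IO_GCS))
    # After restricting the project, its range sits inside the package's range.
    print("restricted:", is_within(PROJECT_RESTRICTED, TF_IO_GCS))
```

The original project range promises support for Python 3.11 and above, which the package cannot satisfy, so narrowing the project's range is what resolves the conflict.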

tip

Make sure you always check the output in the terminal.

So the pyproject.toml file is now:

[tool.poetry]
name = "style_image"
version = "0.1.0"
description = ""
authors = ["Henry Ruiz <henryruiz22@gmail.com>"]
readme = "README.md"

[tool.poetry.dependencies]
python = ">=3.10,<3.11"

[tool.poetry.dev-dependencies]
pytest = "^5.2"
black = "^22.6.0"

[tool.poetry.scripts]
style_image = "style_image.main:app"

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"

We can then try to install the dependencies again:

poetry add tensorflow-hub tensorflow numpy pillow validators "typer[all]"

After installing the dependencies, our pyproject.toml file is now:

[tool.poetry]
name = "style_image"
version = "0.1.0"
description = ""
authors = ["Henry Ruiz <henryruiz22@gmail.com>"]
readme = "README.md"

[tool.poetry.dependencies]
python = ">=3.10,<3.11"
tensorflow-hub = "^0.12.0"
numpy = "^1.23.1"
Pillow = "^9.2.0"
tensorflow = "^2.9.1"
validators = "^0.20.0"
typer = {extras = ["all"], version = "^0.6.1"}

[tool.poetry.dev-dependencies]
pytest = "^5.2"
black = "^22.6.0"

[tool.poetry.scripts]
style_image = "style_image.main:app"

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
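The [tool.poetry.scripts] entry above maps a command name to a reference of the form module:object. When the package is installed, a small launcher imports the module and calls the object. The sketch below shows how such a reference can be resolved with the standard library; the helper name `load_entry_point` is hypothetical, and it uses json:dumps as a stand-in target so that it runs without the style_image package.

```python
from importlib import import_module
from typing import Any


def load_entry_point(reference: str) -> Any:
    """Resolve a 'module:object' reference to the Python object it names."""
    module_name, _, attribute_path = reference.partition(":")
    target = import_module(module_name)
    # The object part may be dotted, for example "SomeClass.method".
    for attribute in filter(None, attribute_path.split(".")):
        target = getattr(target, attribute)
    return target


if __name__ == "__main__":
    # Stand-in for "style_image.main:app", which needs the package installed.
    dumps = load_entry_point("json:dumps")
    print(dumps({"package": "style_image"}))
```

With the entry from the pyproject.toml file, running style_image in a terminal would import style_image.main and call its app object in the same way.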

Coding our style_image package

At this point, we are ready to start coding. Let's create the folder structure below and replace the code in each .py file with the code from this repository: https://github.com/haruiz/style_image

├── README.md
├── README.rst
├── data
│   └── content_image.jpg
├── main.py
├── poetry.lock
├── pyproject.toml
├── style_image
│   ├── __init__.py
│   ├── core
│   │   ├── __init__.py
│   │   └── style_image.py
│   ├── main.py
│   └── util
│       ├── __init__.py
│       ├── __pycache__
│       └── image_utils.py
├── stylized_image.png
└── tests
    ├── __init__.py
    └── test_style_image.py

Code:

from style_image import StyleImage

if __name__ == "__main__":
    # Paths to the input images, relative to the project root
    content_image_path = "data/content_image.jpg"
    style_image_path = "data/style_image.jpg"

    # Load the style image, apply it to the content image, and save the result
    stylized_image = (
        StyleImage(style_image_path)
        .transfer(content_image_path, output_image_size=800)
        .save("stylized_image.jpg")
    )

Running your scripts using the virtual environment

Notice that if you want to execute the main.py file, or any other script, using the Python environment you just created, you need to run the command poetry run python main.py. This tells poetry to run main.py with the Python environment created for the style_image package. If you feel more comfortable running python main.py directly, you can activate the environment by running the command poetry shell. The environment will then stay active for all the commands you run in that shell session.
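A quick way to confirm which interpreter a script is actually using is to ask Python itself. The sketch below relies only on the standard library; the function names are hypothetical, and the paths it prints depend on your machine.

```python
import sys


def in_virtualenv() -> bool:
    """True when the running interpreter belongs to a virtual environment."""
    # Inside a virtual environment, sys.prefix points at the environment,
    # while sys.base_prefix still points at the Python it was created from.
    return sys.prefix != sys.base_prefix


def describe_interpreter() -> str:
    """Summarize the version and location of the running interpreter."""
    version = ".".join(map(str, sys.version_info[:3]))
    kind = "virtual environment" if in_virtualenv() else "base installation"
    return f"Python {version} from {sys.executable} ({kind})"


if __name__ == "__main__":
    print(describe_interpreter())
```

Saving this as a script and running it once with python and once with poetry run python should show the difference, with the second run pointing at the interpreter inside the project's virtual environment.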

Visual Studio Code and PyCharm both work with poetry and pyenv, and they usually recognize the Python environment created by poetry automatically.

Publishing our package to PyPI

Publishing our package to PyPI should be straightforward: we just run the poetry publish command. Since this is just a demo, we are going to publish our package to the TestPyPI repository, https://test.pypi.org/. However, the steps should be the same for the production index, https://pypi.org/.


If the publish command is successful, you will be able to find the package in the TestPyPI repository.

That is all! We are done! You can check the links below for more information about poetry.

Thanks for your support, and don't forget to share!

Some useful resources

References