The following is a sample from the new book Effective Python.
Building larger and more complex programs often leads you to rely on various packages from the Python community. You’ll find yourself running
pip to install packages like
pytz, numpy, and many others.
The problem is that by default
pip installs new packages in a global location. That causes all Python programs on your system to be affected by these installed modules. In theory, this shouldn’t be an issue. If you install a package and never
import it, how could it affect your programs?
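To see where these global installs land, you can ask the interpreter itself. A minimal sketch using the standard sysconfig module (the exact directory varies by platform and Python build):

```python
import sysconfig

# The "purelib" path is where pip places pure-Python packages by default
# when no virtual environment is active.
global_site_packages = sysconfig.get_paths()["purelib"]
print(global_site_packages)
```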
The trouble comes from transitive dependencies: the packages that the packages you install depend on. For example, you can see what the
Sphinx package depends on after installing it by asking pip3:
$ pip3 show Sphinx
---
Name: Sphinx
Version: 1.2.2
Location: /usr/local/lib/python3.4/site-packages
Requires: docutils, Jinja2, Pygments
If you install another package like
flask, you can see that it too depends on the Jinja2 package:
$ pip3 show flask
---
Name: Flask
Version: 0.10.1
Location: /usr/local/lib/python3.4/site-packages
Requires: Werkzeug, Jinja2, itsdangerous
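On modern Python (3.8 and later, so newer than the 3.4 shown in these transcripts), the same metadata is available programmatically through the standard importlib.metadata module. A sketch, querying pip only because it is almost always installed:

```python
from importlib import metadata  # Python 3.8+; earlier versions need the
                                # third-party importlib-metadata backport

# Look up the installed version of a package.
version = metadata.version("pip")
print(version)

# requires() returns the package's declared dependencies,
# or None if the package declares none.
deps = metadata.requires("pip")
print(deps)
```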
The conflict arises as the dependencies of Sphinx and
flask diverge over time. Perhaps right now they both require the same version of
Jinja2 and everything is fine. But six months or a year from now,
Jinja2 may release a new version that makes breaking changes to users of the library. If you update your global version of
Jinja2 with pip install --upgrade, you may find that
Sphinx breaks while
flask keeps working.
The cause of this breakage is that Python can only have a single global version of a module installed at a time. If one of your installed packages must use the new version and another package must use the old version, your system isn’t going to work properly.
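This single-version constraint follows from how the interpreter imports modules: the first import caches the module object in sys.modules, and every later import of the same name returns that same cached object. A minimal demonstration:

```python
import sys
import json
import json as second_reference

# Python caches each imported module in sys.modules, so every import
# statement for the same name yields the exact same module object.
# Only one version of a module can therefore be loaded per interpreter.
assert json is second_reference
assert sys.modules["json"] is json
print(json.__name__)
```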
Such breakage can even happen when package maintainers try their best to preserve API compatibility between releases. New versions of a library can subtly change behaviors that API consuming code relies on. Users on a system may upgrade one package to a new version but not others, breaking dependencies. There’s a constant risk of the ground moving beneath your feet.
These difficulties are magnified when you collaborate with other developers who do their work on separate computers. It’s reasonable to assume that the versions of Python and global packages they have installed on their machines will be slightly different than your own. This can cause frustrating situations where a codebase works perfectly on one programmer’s machine and is completely broken on another’s.
The solution to all of these problems is a tool called
pyvenv, which provides virtual environments. Since Python 3.4, the
pyvenv command-line tool is available by default along with the Python installation (it’s also accessible with
python -m venv). Prior versions of Python require installing a separate package (with
pip install virtualenv) and using a command-line tool called virtualenv.
pyvenv allows you to create isolated versions of the Python environment. Using
pyvenv, you can have many different versions of the same package installed on the same system at the same time without conflicts. This lets you work on many different projects and use many different tools on the same computer.
pyvenv does this by installing explicit versions of packages and their dependencies into completely separate directory structures. This makes it possible to reproduce a Python environment that you know will work with your code. It’s a reliable way to avoid surprising breakages.
Here’s a quick tutorial on how to use
pyvenv effectively. Before using the tool, it's important to know the meaning of the
python3 command-line tool on your system. On my computer,
python3 is located in the
/usr/local/bin directory and evaluates to version 3.4.2.
$ which python3
/usr/local/bin/python3
$ python3 --version
Python 3.4.2
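The same information is exposed to running programs through the sys module, which can be handy for sanity checks in scripts:

```python
import sys

# sys.executable is the full path of the running interpreter, and
# sys.version_info breaks the version number into comparable fields.
print(sys.executable)
print(sys.version_info)
```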
To demonstrate the setup of my environment, I can test that running a command to import the
pytz module doesn’t cause an error. This works because I already have the
pytz package installed as a global module.
$ python3 -c 'import pytz'
$
Now I use
pyvenv to create a new virtual environment called
myproject. Each virtual environment must live in its own unique directory. The result of the command is a tree of directories and files.
$ pyvenv /tmp/myproject
$ cd /tmp/myproject
$ ls
bin include lib pyvenv.cfg
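The same environment can also be created programmatically with the standard library's venv module, which is the machinery behind the pyvenv command. A sketch using a temporary directory rather than /tmp/myproject (with_pip=False skips bootstrapping pip, which keeps the example fast):

```python
import os
import tempfile
import venv

# Create a virtual environment inside a fresh temporary directory.
env_dir = os.path.join(tempfile.mkdtemp(), "myproject")
venv.create(env_dir, with_pip=False)

# Every environment gets a pyvenv.cfg file recording, among other
# things, the "home" of the base interpreter it was created from.
config_path = os.path.join(env_dir, "pyvenv.cfg")
print(open(config_path).read())
```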
To start using the virtual environment, I use the
source command from my shell on the bin/activate script. The script modifies all of my environment variables to match the virtual environment. It also updates my command-line prompt to include the virtual environment name (
'myproject') to make it extremely clear what I’m working on.
$ source bin/activate
(myproject)$
After activation, you can see that the path to the
python3 command-line tool has moved to within the virtual environment directory.
(myproject)$ which python3
/tmp/myproject/bin/python3
(myproject)$ ls -l /tmp/myproject/bin/python3
... -> /tmp/myproject/bin/python3.4
(myproject)$ ls -l /tmp/myproject/bin/python3.4
... -> /usr/local/bin/python3.4
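A program can also detect at runtime whether it's running inside a virtual environment: sys.prefix points at the environment's directory, while sys.base_prefix (available since Python 3.3) still points at the underlying installation. A sketch:

```python
import sys

# Inside a virtual environment, sys.prefix differs from sys.base_prefix;
# outside one, the two are the same path.
in_virtual_env = sys.prefix != sys.base_prefix
print("prefix:", sys.prefix)
print("base prefix:", sys.base_prefix)
print("virtual environment active:", in_virtual_env)
```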
This ensures that changes to the outside system will not affect the virtual environment. Even if the outer system upgrades its default
python3 to version 3.5, my virtual environment will still explicitly point at version 3.4.
The virtual environment I created with
pyvenv starts with no packages installed except for pip and
setuptools. Trying to use the
pytz package that was installed as a global module in the outside system will fail because it’s unknown to the virtual environment.
(myproject)$ python3 -c 'import pytz'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: No module named 'pytz'
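Rather than letting an import fail with a traceback, a program can probe for a package first with importlib.util.find_spec, which returns None when a module isn't visible in the current environment. A sketch (the absent module name below is made up for illustration):

```python
import importlib.util

def is_available(module_name):
    # find_spec returns None when the module cannot be found in the
    # current environment, without actually importing it.
    return importlib.util.find_spec(module_name) is not None

print(is_available("json"))                    # stdlib, always present
print(is_available("no_such_module_example"))  # hypothetical, absent
```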
I can use
pip to install the
pytz module into my virtual environment.
(myproject)$ pip3 install pytz
Once it’s installed, I can verify it’s working with the same test import command.
(myproject)$ python3 -c 'import pytz'
(myproject)$
When you’re done with a virtual environment and want to go back to your default system, you use the
deactivate command. This restores your environment to the system defaults, including the location of the
python3 command-line tool.
(myproject)$ deactivate
$ which python3
/usr/local/bin/python3
If you ever want to work in the
myproject environment again, you can just run
source bin/activate in the directory like before.
Once you have a virtual environment, you can continue installing packages with
pip as you need them. Eventually, you may want to copy your environment somewhere else. For example, say you want to reproduce your development environment on a production server. Or maybe you want to clone someone else’s environment on your own machine so you can run their code.
pyvenv makes these situations easy. You can use the
pip freeze command to save all of your explicit package dependencies into a file. By convention, this file is named requirements.txt:
(myproject)$ pip3 freeze > requirements.txt
(myproject)$ cat requirements.txt
numpy==1.8.2
pytz==2014.4
requests==2.3.0
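The freeze format is simple enough to process directly: one name==version pin per line. A sketch that parses such text into a dictionary (parse_requirements is a hypothetical helper, and the sample content mirrors the output above):

```python
def parse_requirements(text):
    # Map each "name==version" line to a (name, version) entry,
    # skipping blank lines and comments.
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, version = line.partition("==")
        pins[name] = version
    return pins

sample = """\
numpy==1.8.2
pytz==2014.4
requests==2.3.0
"""
print(parse_requirements(sample))
```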
Now imagine you’d like to have another virtual environment that matches the
myproject environment. You can create a new directory like before using pyvenv and activate it:
$ pyvenv /tmp/otherproject
$ cd /tmp/otherproject
$ source bin/activate
(otherproject)$
The new environment will have no extra packages installed.
(otherproject)$ pip3 list
pip (1.5.6)
setuptools (2.1)
You can install all of the packages from the first environment by running
pip install on the
requirements.txt that you generated with the
pip freeze command.
(otherproject)$ pip3 install -r /tmp/myproject/requirements.txt
This command will crank along for a little while as it retrieves and installs all of the packages required to reproduce the first environment. Once it’s done, listing the set of installed packages in the second virtual environment will produce the same list of dependencies found in the first virtual environment.
(otherproject)$ pip3 list
numpy (1.8.2)
pip (1.5.6)
pytz (2014.4)
requests (2.3.0)
setuptools (2.1)
A requirements.txt file is ideal for collaborating with others through a revision control system. You can commit changes to your code at the same time you update your list of package dependencies, ensuring that they move in lockstep.
The gotcha with virtual environments is that moving them breaks everything because all of the paths, like
python3, are hard-coded to the environment’s install directory. But that doesn’t matter. The whole purpose of virtual environments is to make it easy to reproduce the same setup. Instead of moving a virtual environment directory, just
freeze the old one, create a new one somewhere else, and reinstall everything from the requirements.txt file.
Things to Remember
- Virtual environments allow you to use
pip to install many different versions of the same package on the same machine without conflicts.
- Virtual environments are created with
pyvenv, enabled with
source bin/activate, and disabled with deactivate.
- You can dump all of the requirements of an environment with
pip freeze. You can reproduce the environment by supplying the requirements.txt file to
pip install -r.
- In versions of Python before 3.4, the
pyvenv tool must be downloaded and installed separately. The command-line tool is called virtualenv instead of pyvenv.