Introduction to Python pathlib
One of the first things many beginner Python tutorials teach is how to read or write files using patterns like:
with open('file.txt') as f:
    f.read()
And then walk directories with libraries such as os:
import os
g = os.walk('.')
next(g)
This walks the directory tree recursively, top-down by default, and leaves to you, the programmer, the joy of writing your own logic for searching files or listing files of a certain type.
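To make that concrete, here is a minimal sketch of the kind of helper you end up writing on top of os.walk. The function name find_by_extension and the sample directory layout are made up for illustration, and the tree is built in a temporary directory so no real files are touched.

```python
import os
import tempfile


def find_by_extension(root, extension):
    """Return sorted paths under root whose file names end with extension."""
    matches = []
    # os.walk yields a (dirpath, dirnames, filenames) tuple per directory.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(extension):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


# Build a throwaway tree: top.yaml, a/mid.txt and a/b/deep.yaml.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, 'a', 'b'))
    sample_files = (
        'top.yaml',
        os.path.join('a', 'mid.txt'),
        os.path.join('a', 'b', 'deep.yaml'),
    )
    for rel in sample_files:
        with open(os.path.join(root, rel), 'w') as f:
            f.write('')
    found = [os.path.relpath(p, root) for p in find_by_extension(root, '.yaml')]
```

It works, but every path is a plain string and the filtering logic is all ours to maintain.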
Other calls such as os.rename or os.path.join treat paths as plain strings, and it is easy to slip into assembling them manually with string concatenation and os.sep.
Not much fun.
Neither of these approaches is wrong. But they’re cumbersome and code can get thorny quickly.
Today we’re going to take a look at pathlib, which was introduced in Python 3.4 and simplifies our work with files tremendously.
We’ll showcase pathlib’s most popular features.
Let’s start with the basics: Reading and writing text files.
We’re going to use read_text and write_text on our Path instance.
Let’s throw in some yaml.
from pathlib import Path
import yaml
path = Path('tmp/file.yaml')
contents = path.read_text()
# safe_load parses plain data only; yaml.load() needs an explicit Loader in PyYAML 6+
yaml_content = yaml.safe_load(contents)
# yaml_content
# {'name': 'Radu', 'uses': 'linux', 'twitter': 'wooptoo'}
yaml_content['name'] = 'Rad'
path.write_text(yaml.dump(yaml_content))
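If you want to try the same read, modify, write cycle without installing PyYAML, the sketch below swaps in the standard library's json module and a temporary directory. The file name settings.json and its contents are invented for the example.

```python
import json
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / 'settings.json'

    # write_text creates (or overwrites) the file and closes it for us.
    path.write_text(json.dumps({'name': 'Radu', 'uses': 'linux'}), encoding='utf-8')

    # read_text returns the whole file as a single string.
    data = json.loads(path.read_text(encoding='utf-8'))
    data['name'] = 'Rad'
    path.write_text(json.dumps(data), encoding='utf-8')

    result = json.loads(path.read_text(encoding='utf-8'))
```

Passing encoding explicitly is a choice made here to keep the behaviour the same across platforms; both methods fall back to the locale default when it is omitted.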
What if we want to rename our file? We’ll reuse the path from the previous example.
path = Path('tmp/file.yaml')
new_path = path.with_name('myfile.yaml')
# PosixPath('tmp/myfile.yaml')
path.rename(new_path)
path.exists()
# False
new_path.exists()
# True
Our file was renamed in a few easy steps, in a nice object-oriented fashion.
path.with_suffix('.txt') is very similar: it changes only the file’s extension and keeps the original file name. Like with_name, it just returns a new path object; nothing changes on disk until rename is called.
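Since with_name and with_suffix only compute new paths, they can be explored without touching the filesystem. Here is a small sketch using PurePosixPath, the filesystem-free sibling of PosixPath, picked here so the results look the same on any operating system.

```python
from pathlib import PurePosixPath

path = PurePosixPath('tmp/file.yaml')

# Both calls return new objects; the original path is unchanged.
renamed = path.with_name('myfile.yaml')   # swaps the whole final component
retyped = path.with_suffix('.txt')        # swaps only the extension

# The pieces these methods operate on are available as attributes.
parent, name, stem, suffix = path.parent, path.name, path.stem, path.suffix
```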
PosixPath is the concrete class of our path instance on Linux and macOS (on Windows it would be WindowsPath). It enables all sorts of operations, like creating a new directory, checking for existence, checking for file type, getting size, checking for user, group, permissions, etc.
Basically everything we would previously do with the os and os.path modules.
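A few of those operations, sketched against a temporary directory. The folder and file names (reports, summary.txt, nope.txt) are placeholders chosen for the example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    base = Path(tmp_dir)

    folder = base / 'reports'
    folder.mkdir()                                  # create a directory

    report = folder / 'summary.txt'
    report.write_text('hello', encoding='utf-8')    # five ASCII characters

    checks = {
        'folder_is_dir': folder.is_dir(),           # file type checks
        'report_is_file': report.is_file(),
        'missing_exists': (base / 'nope.txt').exists(),
        'size_in_bytes': report.stat().st_size,     # stat() mirrors os.stat()
    }
```

The object returned by stat() also carries st_mode, st_uid and st_gid for permission and ownership checks.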
Let’s fetch all our yaml files now with glob:
from pathlib import Path
tmp = Path('tmp')
g = tmp.glob('*.yaml')
list(g)
# [PosixPath('tmp/myfile4.yaml'),
# PosixPath('tmp/myfile3.yaml'),
# PosixPath('tmp/myfile2.yaml'),
# PosixPath('tmp/myfile.yaml')]
And we can take it from there. glob and rglob are super handy for this kind of stuff: glob matches inside the given directory, while rglob also searches every sub-directory below it.
Of course Python had the glob module before, but having it under pathlib’s umbrella is extremely handy.
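The difference between the two is easiest to see side by side. This sketch builds a small tree in a temporary directory (the file names are invented) and runs the same pattern through both methods.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    tmp = Path(tmp_dir)
    (tmp / 'nested').mkdir()
    (tmp / 'top.yaml').touch()
    (tmp / 'notes.txt').touch()
    (tmp / 'nested' / 'deep.yaml').touch()

    # glob only looks at the direct children of tmp.
    direct = sorted(p.relative_to(tmp).as_posix() for p in tmp.glob('*.yaml'))

    # rglob applies the pattern in tmp and in every directory below it.
    recursive = sorted(p.relative_to(tmp).as_posix() for p in tmp.rglob('*.yaml'))
```

Both methods return generators and make no promise about ordering, which is why the results are sorted here.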
Next we can iterate through sub-directories using rglob with patterns like ./
Or we can just use iterdir, which is nicer when we only need the immediate children of a directory.
tmp = Path('tmp')
dgen = tmp.iterdir()
list(dgen)
# [PosixPath('tmp/snappysnaps'),
# PosixPath('tmp/cameraphotos'),
# PosixPath('tmp/yamlfiles')]
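Keep in mind that iterdir yields files as well as directories, and it does not descend into sub-directories. A sketch of filtering down to directories only, with made-up names in a temporary directory:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    tmp = Path(tmp_dir)
    (tmp / 'cameraphotos').mkdir()
    (tmp / 'yamlfiles').mkdir()
    (tmp / 'readme.txt').touch()
    (tmp / 'yamlfiles' / 'inner.yaml').touch()

    # Every direct child, files included; inner.yaml is one level too deep.
    everything = sorted(p.name for p in tmp.iterdir())

    # Keep only the children that are directories.
    subdirs = sorted(p.name for p in tmp.iterdir() if p.is_dir())
```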
And one last thing which I’m going to touch upon is traversing folders, and even creating new paths.
tmp = Path('tmp')
new_path = tmp / 'pictures' / 'camera'
# PosixPath('tmp/pictures/camera')
new_path.mkdir(parents=True)
new_path.exists()
# True
This wizardry is done by overloading the / operator of the Path instance.
In Python 3 this is the special __truediv__ method.
p = Path('tmp')
p / 'myfile.txt' == p.__truediv__('myfile.txt')
# True
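If the operator feels too magical, joinpath does the same job as a regular method call. The short sketch below compares the two, again with PurePosixPath so nothing touches the disk. It also shows that a plain string on the left-hand side works, which is handled by the companion __rtruediv__ method.

```python
from pathlib import PurePosixPath

base = PurePosixPath('tmp')

with_operator = base / 'pictures' / 'camera'
with_method = base.joinpath('pictures', 'camera')

# A str on the left of / falls back to PurePosixPath.__rtruediv__.
from_string = 'tmp' / PurePosixPath('pictures')
```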
In a nutshell, these are the most common cases which you’ll use on a daily basis if you work with files a lot.
We’ve managed to replace three different tools (the os module, the glob module and the built-in open/read pattern) with pathlib, which gives us a neat developer interface.
Since pathlib is part of the Python standard library, there’s very little reason not to use it for new projects. More on the library can be read in the official pathlib documentation.