Python is a computer programming language that is easy to learn and use. It is one of the most popular programming languages out there. In this digital age, where everyone is looking for ways to automate their business, Python is on the rise.
One of the many things that Python developers have to do over and over again is looping through files in a directory.
In this article, we will show you a few different ways to loop over each file in a directory, both with and without importing additional packages.
Iterate over files in a given directory using os
The os module provides a unified interface to many common operating system features across different platforms. Depending on the platform on which the program is running, the os module automatically loads the right implementation (whether it's posix or nt) and performs the proper system call.
os is part of the Python standard library, so using it adds no extra dependencies to your project. Below is a small code snippet where we iterate through the entries of a given directory with os.listdir() and print the names of those ending in .py:
import os

for filename in os.listdir('/home/linuxpip'):
    # Print only the entries whose name ends with .py
    if filename.endswith(".py"):
        print(filename)
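Keep in mind that os.listdir() returns every entry in the directory, subdirectories included, and in no guaranteed order. Here is a minimal sketch, assuming you only want regular files with a given extension. The helper name list_files_with_suffix is hypothetical and not part of the os module.

```python
import os


def list_files_with_suffix(directory, suffix):
    """Return the sorted names of regular files in `directory` ending with `suffix`."""
    matches = []
    for filename in os.listdir(directory):
        full_path = os.path.join(directory, filename)
        # os.listdir() also returns subdirectories, so keep regular files only.
        if os.path.isfile(full_path) and filename.endswith(suffix):
            matches.append(filename)
    # os.listdir() makes no ordering guarantee, so sort for stable output.
    return sorted(matches)
```

You could then call list_files_with_suffix('/home/linuxpip', '.py') to get the same names as the loop above, minus any directory that happens to end in .py.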
Alternatively, you can use os.walk() to loop through files in a directory and all of its subdirectories. Just remember that os.walk() yields a 3-tuple for each directory it visits: dirpath, dirnames, and filenames.
import os

if __name__ == "__main__":
    for (root, dirs, files) in os.walk('/home/linuxpip', topdown=True):
        print("The files are: ")
        print(files)
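The filenames in each tuple are bare names, not full paths, so they usually need to be joined with dirpath before you can open them. Here is a minimal sketch of that step. The generator name walk_file_paths is hypothetical.

```python
import os


def walk_file_paths(top):
    """Yield the full path of every file under `top`, including subdirectories."""
    for root, dirs, files in os.walk(top, topdown=True):
        for name in files:
            # `files` holds bare names; join with `root` to get a usable path.
            yield os.path.join(root, name)
```

For example, `for path in walk_file_paths('/home/linuxpip'): print(path)` would print one full path per line.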
Iterate over files in a given directory using pathlib
pathlib (Python 3.4+) is the newer, object-oriented way to interact with the filesystem in a unified way.
Why pathlib when you already have os, you may ask. The problem is that os treats paths as strings, so you can't get any further details about a specific path without writing a few more lines of code. Besides that, the os module doesn't natively let you find paths that match a given pattern inside a hierarchy. Plus, pathlib offers a more streamlined approach to managing and interacting with filesystem paths across different operating systems.
In order to loop over files in a given directory, you can simply use Path.iterdir().
from pathlib import Path

my_dir = Path("/home/linuxpip")
# iterdir() yields a Path object for every entry in the directory, in arbitrary order
for path in my_dir.iterdir():
    print(path)
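The native pattern matching mentioned earlier is available through Path.glob() and Path.rglob(). Here is a minimal sketch, assuming you want only .py files. The function name find_python_files and its recursive flag are hypothetical.

```python
from pathlib import Path


def find_python_files(directory, recursive=False):
    """Return sorted Path objects for the .py files in `directory`."""
    base = Path(directory)
    # glob() matches direct children only; rglob() also descends into subdirectories.
    matcher = base.rglob if recursive else base.glob
    # A directory could also be named something.py, so keep regular files only.
    return sorted(p for p in matcher("*.py") if p.is_file())
```

Calling find_python_files("/home/linuxpip", recursive=True) would then cover the whole directory tree, much like os.walk() combined with an endswith() check.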
On older Python versions, even Python 2, you can still use pathlib by installing a backport such as pathlib2 with pip.
pathlib vs os.path
Pathlib has a more intuitive syntax, whereas I find os.path old and clunky at times. A Path object can perform filesystem operations on its own by calling its methods, while you need to call a bunch of different os.path functions to do the same thing. Pathlib allows you to easily traverse paths using the parent attribute. Meanwhile, os.path has to rely on directory names and path strings. On top of that, pathlib allows you to iterate over directories and perform pattern matching natively, whereas os.path does not. Finally, every pathlib Path object has multiple useful methods and attributes for performing filesystem operations or reading path details, for which you would otherwise need additional libraries such as glob or shutil alongside os.path.
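To make the comparison concrete, here is a small sketch that asks the same questions about one path with both modules. The example path is made up, and PurePosixPath is used (an assumption of this sketch) so that it behaves the same on any operating system and never touches the disk.

```python
import os.path
from pathlib import PurePosixPath

raw = "/home/linuxpip/project/main.py"

# os.path: every question about the path is a separate function call on a string.
os_parent = os.path.dirname(raw)
os_name = os.path.basename(raw)
os_ext = os.path.splitext(raw)[1]

# pathlib: the same details are attributes of a single object.
path = PurePosixPath(raw)
print(path.parent)         # /home/linuxpip/project
print(path.parent.parent)  # /home/linuxpip
print(path.name, path.stem, path.suffix)  # main.py main .py
```

In real code you would normally use Path rather than PurePosixPath, which adds the methods that actually read the filesystem, such as exists() and iterdir().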
For example, joining paths with os.path has to be written like this:
os.path.join(os.getcwd(), "processed_data", "output.xlsx")
With pathlib, you can simply use the / operator to join paths, which makes the code noticeably more readable.
from pathlib import Path
Path.cwd() / "processed_data" / "output.xlsx"
We hope this article helped you learn how to loop through files in a directory efficiently. We've also written a few other guides for fixing common Python errors, such as Timeout in Python requests, Python Unresolved Import in VSCode or "IndexError: List Index Out of Range" in Python. If you have any suggestions, please feel free to leave a comment below.