How to loop through each file in a directory in Python

Python is a computer programming language that is easy to learn and use. It is one of the most popular programming languages out there. In this digital age, where everyone is looking for ways to automate their business, Python is on the rise.

One of the many things that Python developers have to do over and over again is looping through files in a directory.

In this article, we will show you a few different ways to loop over each file in a directory, both with and without importing additional packages.

Iterate over files in a given directory using os

The os module provides a unified interface to many common operating system features across different platforms.

Depending on the platform on which the program is running, the os module automatically loads the right implementation (whether it’s posix or nt) and performs the proper system call.

os is part of the Python standard library, so using it adds no extra dependencies. Below is a small code snippet where we iterate through the entries in a given directory with os.listdir() and print the names of the Python files.

import os

for filename in os.listdir('/home/linuxpip'):
    # Print only the names that end in .py
    if filename.endswith(".py"):
        print(filename)
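Keep in mind that os.listdir() returns the names of subdirectories as well as files, and that the names are relative to the directory you passed in. Here is a minimal sketch of narrowing the result down to regular files with full paths. The helper name list_files and its suffix parameter are our own, made up for illustration; they are not part of the os module.

```python
import os
import tempfile


def list_files(directory, suffix=""):
    """Return sorted full paths of the regular files in `directory`.

    `list_files` and `suffix` are hypothetical names used for illustration.
    """
    paths = []
    for name in os.listdir(directory):
        full_path = os.path.join(directory, name)
        # Skip subdirectories; keep regular files whose name ends with `suffix`.
        if os.path.isfile(full_path) and name.endswith(suffix):
            paths.append(full_path)
    return sorted(paths)


if __name__ == "__main__":
    # Demo on a throwaway directory so the snippet runs anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        open(os.path.join(tmp, "script.py"), "w").close()
        open(os.path.join(tmp, "notes.txt"), "w").close()
        os.mkdir(os.path.join(tmp, "subdir"))
        for path in list_files(tmp, suffix=".py"):
            print(path)
```

Joining each name with the directory matters because os.path.isfile() would otherwise resolve the bare name against the current working directory.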

Alternatively, you can use os.walk() to loop through the files in a directory and all of its subdirectories. Just remember that os.walk() yields a 3-tuple for every directory it visits: dirpath, dirnames, and filenames.

import os
if __name__ == "__main__":
    for (root, dirs, files) in os.walk('/home/linuxpip', topdown=True):
        print("The files are: ")
        print(files)
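The filenames that os.walk() hands back are bare names, so you usually need to join them with dirpath before opening them. A small sketch of that pattern follows; walk_files is a hypothetical helper name of ours, not something provided by os.

```python
import os
import tempfile


def walk_files(top):
    """Yield the full path of every file under `top`, recursively.

    `walk_files` is a hypothetical helper name used for illustration.
    """
    for root, dirs, files in os.walk(top):
        for name in files:
            # `files` holds bare names; join with `root` to get a usable path.
            yield os.path.join(root, name)


if __name__ == "__main__":
    # Demo on a throwaway directory tree so the snippet runs anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        os.makedirs(os.path.join(tmp, "a", "b"))
        open(os.path.join(tmp, "top.txt"), "w").close()
        open(os.path.join(tmp, "a", "b", "deep.txt"), "w").close()
        for path in sorted(walk_files(tmp)):
            print(path)
```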

Iterate over files in a given directory using pathlib

pathlib (Python 3.4+) is the newer way to interact with the filesystem in a unified way.

Why pathlib when you already have os, you may ask. The problem is that os treats paths as plain strings, so you can’t get any further details about a specific path without writing a few more lines of code. Besides that, the os module doesn’t natively let you find paths that match a given pattern inside a hierarchy; you need a separate module such as glob for that. Plus, pathlib offers a more streamlined approach to managing and interacting with filesystem paths across different operating systems.

In order to loop over files in a given directory, you can simply use Path.iterdir().

from pathlib import Path
my_dir = Path("/home/linuxpip")
# Iterate over my_dir (not the built-in dir function)
for path in my_dir.iterdir():
    print(path)
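Path.iterdir() yields subdirectories too, and pathlib can also do the pattern matching mentioned earlier through Path.glob() and Path.rglob(). The following is a rough sketch, assuming you want only .py files; the function name python_files and its recursive flag are hypothetical names chosen for this example.

```python
from pathlib import Path
import tempfile


def python_files(directory, recursive=False):
    """Return sorted Path objects for the .py files in `directory`.

    `python_files` and `recursive` are hypothetical names for this sketch.
    """
    base = Path(directory)
    # glob() matches inside `base` only; rglob() also descends into subdirectories.
    matches = base.rglob("*.py") if recursive else base.glob("*.py")
    # A directory could also be named like "x.py", so keep regular files only.
    return sorted(p for p in matches if p.is_file())


if __name__ == "__main__":
    # Demo on a throwaway directory tree so the snippet runs anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        (base / "pkg").mkdir()
        (base / "main.py").touch()
        (base / "pkg" / "util.py").touch()
        (base / "readme.txt").touch()
        for path in python_files(base, recursive=True):
            print(path)
```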

On older Python versions, including Python 2, you can still use pathlib by installing a backport with pip.

pathlib vs os.path

Pathlib has a more intuitive syntax, whereas I find os.path old and clunky at times. A pathlib object can perform filesystem operations on its own by calling its methods, while you need to call a bunch of different os.path functions to do the same thing. Pathlib lets you easily move up a path using the parent attribute, whereas os.path has to rely on directory names and path strings. On top of that, pathlib allows you to iterate over directories and perform pattern matching natively, which os.path does not. Finally, every pathlib Path object has multiple useful methods and attributes for performing filesystem operations or reading path details, things for which you would otherwise need additional modules such as glob or shutil alongside os.path.

For example, joining paths with os.path looks like this:
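To make the comparison concrete, here is a short illustration of a few path attributes. The path itself is made up, and PurePosixPath is used only so that the example prints the same thing on every operating system without touching the real filesystem.

```python
from pathlib import PurePosixPath

# A made-up path; PurePosixPath never accesses the disk.
path = PurePosixPath("/home/linuxpip/processed_data/output.xlsx")

print(path.name)                 # output.xlsx
print(path.suffix)               # .xlsx
print(path.stem)                 # output
print(path.parent)               # /home/linuxpip/processed_data
print(path.parent.parent)        # /home/linuxpip
print(path.with_suffix(".csv"))  # /home/linuxpip/processed_data/output.csv
```

With os.path, the same details would come from separate calls such as os.path.basename(), os.path.splitext(), and os.path.dirname().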

os.path.join(os.getcwd(), "processed_data", "output.xlsx")

With pathlib, you can simply use the / operator to join paths, which really levels up code readability.

Path.cwd() / "processed_data" / "output.xlsx"

We hope this article helped you learn how to loop through files in a directory efficiently. We’ve also written a few other guides for fixing common Python errors, such as Timeout in Python requests, Python Unresolved Import in VSCode or “IndexError: List Index Out of Range” in Python. If you have any suggestions, please feel free to leave a comment below.
