How to use glob() to find files recursively?


Question

This is what I have:

glob(os.path.join('src','*.c'))

but I want to search the subfolders of src. Something like this would work:

glob(os.path.join('src','*.c'))
glob(os.path.join('src','*','*.c'))
glob(os.path.join('src','*','*','*.c'))
glob(os.path.join('src','*','*','*','*.c'))

But this is obviously limited and clunky.

3/20/2019 12:35:38 AM

Accepted Answer

Python 3.5+

Since you're on a recent Python, use pathlib.Path.glob from the pathlib module.

from pathlib import Path

for filename in Path('src').glob('**/*.c'):
    print(filename)

If you don't want to use pathlib, use glob.glob instead, but remember to pass recursive=True; without it, ** is not expanded across directory levels.
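
A minimal sketch of the glob.glob variant, assuming the same src layout as in the question; find_c_sources is a hypothetical helper name, not part of the glob module.

```python
import glob
import os


def find_c_sources(top):
    """Return every *.c file under top, at any depth (hypothetical helper)."""
    # '**' spans zero or more directory levels only when recursive=True.
    pattern = os.path.join(top, '**', '*.c')
    return glob.glob(pattern, recursive=True)


for filename in find_c_sources('src'):
    print(filename)
```

Unlike Path.glob, which yields Path objects lazily, glob.glob returns a list of plain strings.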

To match files whose names begin with a dot (.), such as hidden files on Unix-based systems, use the os.walk solution below; glob.glob wildcards skip such names by default.
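
To see the dot-file difference in practice, here is a small sketch that builds a throwaway directory (the file names visible.c and .hidden.c are made up for illustration) and compares glob.glob with os.walk plus fnmatch.filter.

```python
import fnmatch
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as top:
    # One ordinary file and one hidden (dot-prefixed) file.
    for name in ('visible.c', '.hidden.c'):
        open(os.path.join(top, name), 'w').close()

    # glob wildcards do not match a leading dot.
    globbed = [os.path.basename(p)
               for p in glob.glob(os.path.join(top, '**', '*.c'), recursive=True)]

    # fnmatch does not treat a leading dot specially.
    walked = [name
              for _root, _dirs, files in os.walk(top)
              for name in fnmatch.filter(files, '*.c')]

print(sorted(globbed))  # ['visible.c']
print(sorted(walked))   # ['.hidden.c', 'visible.c']
```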

Older Python versions

For older Python versions, use os.walk to recursively walk a directory and fnmatch.filter to match against a simple expression:

import fnmatch
import os

matches = []
for root, dirnames, filenames in os.walk('src'):
    for filename in fnmatch.filter(filenames, '*.c'):
        matches.append(os.path.join(root, filename))
5/13/2019 6:26:27 PM

Similar to other solutions, but using fnmatch.fnmatch instead of glob, since os.walk already listed the filenames:

import os, fnmatch


def find_files(directory, pattern):
    for root, dirs, files in os.walk(directory):
        for basename in files:
            if fnmatch.fnmatch(basename, pattern):
                filename = os.path.join(root, basename)
                yield filename


for filename in find_files('src', '*.c'):
    print('Found C source:', filename)

Also, using a generator allows you to process each file as it is found, instead of finding all the files first and then processing them.
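
One way to see the benefit is to stop at the first match. The sketch below repeats the generator so it runs on its own; first_match is a hypothetical helper added for illustration.

```python
import fnmatch
import os


def find_files(directory, pattern):
    """Yield paths under directory whose basename matches pattern."""
    for root, dirs, files in os.walk(directory):
        for basename in files:
            if fnmatch.fnmatch(basename, pattern):
                yield os.path.join(root, basename)


def first_match(directory, pattern):
    """Return the first matching path, or None (hypothetical helper)."""
    # next() pulls a single item, so the walk stops once a file matches.
    return next(find_files(directory, pattern), None)


print(first_match('src', '*.c'))
```

With a list-building version, the whole tree would be walked before the first result became available.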


Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow