Find out number of capture groups in Python regular expressions


Question

Is there a way to determine how many capture groups there are in a given regular expression?

I would like to be able to do the following:

def groups(regexp, s):
    """ Returns the first result of re.findall, or an empty default

    >>> groups(r'(\d)(\d)(\d)', '123')
    ('1', '2', '3')
    >>> groups(r'(\d)(\d)(\d)', 'abc')
    ('', '', '')
    """
    import re
    m = re.search(regexp, s)
    if m:
        return m.groups()
    return ('',) * num_of_groups(regexp)

This allows me to do stuff like:

first, last, phone = groups(r'(\w+) (\w+) ([\d\-]+)', 'John Doe 555-3456')

However, I don't know how to implement num_of_groups. (Currently I just work around it.)

EDIT: Following the advice from rslite, I replaced re.findall with re.search.

sre_parse seems like the most robust and comprehensive solution, but requires tree traversal and appears to be a bit heavy.

MizardX's regular expression seems to cover all bases, so I'm going to go with that.

5/23/2017 11:47:28 AM

Accepted Answer

import re

def num_groups(regex):
    return re.compile(regex).groups
7/17/2014 8:38:09 AM
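
To connect this back to the question, here is a minimal sketch of how the `groups` helper could be completed using the compiled pattern's `groups` attribute. The name `num_of_groups` is taken from the question; the rest is just one way to wire it together, not necessarily what the asker ended up using.

```python
import re

def num_of_groups(regexp):
    # Pattern.groups holds the number of capturing groups in the pattern.
    # Non-capturing groups such as (?:...) are not counted.
    return re.compile(regexp).groups

def groups(regexp, s):
    """Return the groups of the first match, or empty strings if nothing matches."""
    m = re.search(regexp, s)
    if m:
        return m.groups()
    # No match: build a tuple of empty strings with one slot per capture group.
    return ('',) * num_of_groups(regexp)

first, last, phone = groups(r'(\w+) (\w+) ([\d\-]+)', 'John Doe 555-3456')
print(first, last, phone)
print(groups(r'(\d)(\d)(\d)', 'abc'))
```

Because the count comes from the compiled pattern rather than from a match, it is available even when the search fails, which is the case the question cares about.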

# Requires a successful match: re.search returns None when nothing matches.
f_x = re.search(...)
len_groups = len(f_x.groups())
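
A small sketch of the match-based approach with a guard for the no-match case may help show its limitation. The helper name `count_groups_from_match` is hypothetical and exists only for illustration.

```python
import re

def count_groups_from_match(regexp, s):
    # Hypothetical helper: count groups via a match object.
    m = re.search(regexp, s)
    if m is None:
        # Without a match there is no match object to inspect.
        return None
    # m.groups() has one entry per capture group; groups that did not
    # participate in the match appear as None, so the length is still
    # the total number of capture groups.
    return len(m.groups())

print(count_groups_from_match(r'(\d)(x)?', '1'))
print(count_groups_from_match(r'(\d)', 'abc'))
```

Since this returns nothing useful when the search fails, it cannot supply the default tuple the question asks for; counting from the compiled pattern avoids that.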

Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow