Python non-greedy regexes


Question

How do I write a Python regex like "(.*)" so that, given "a (b) c (d) e", Python matches "b" instead of "b) c (d"?

I know that I can use "[^)]" instead of ".", but I'm looking for a more general solution that keeps my regex a little cleaner. Is there any way to tell Python "hey, match this as soon as possible"?

10/24/2011 9:12:45 PM

Accepted Answer

You seek the all-powerful '*?'

http://docs.python.org/3/howto/regex.html#greedy-versus-non-greedy

the non-greedy qualifiers *?, +?, ??, or {m,n}? [...] match as little text as possible.
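To see what each of those qualifiers does, here is a small sketch comparing every greedy form with its non-greedy counterpart. The sample string "aaaa" and the variable names are made up for illustration; they are not from the docs.

```python
import re

text = "aaaa"  # made-up sample input, not from the docs

# Greedy qualifiers followed by their non-greedy counterparts.
patterns = [r"a*", r"a*?", r"a+", r"a+?", r"a?", r"a??", r"a{2,3}", r"a{2,3}?"]

# Record what each pattern consumes when matched at the start of `text`.
results = {p: re.match(p, text).group() for p in patterns}

for pattern, matched in results.items():
    print(f"{pattern!r:12} -> {matched!r}")
```

The greedy forms take as many "a" characters as the qualifier allows, while the "?"-suffixed forms take the minimum the qualifier permits (possibly nothing at all for *? and ??).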

4/10/2019 12:03:07 PM

>>> import re
>>> x = "a (b) c (d) e"
>>> re.search(r"\(.*\)", x).group()
'(b) c (d)'
>>> re.search(r"\(.*?\)", x).group()
'(b)'
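If you want just the inner text ("b") rather than the parentheses too, one option is to put a capture group around the non-greedy part. This is only a sketch; the variable names are my own, and the last line uses the negated character class mentioned in the question for comparison.

```python
import re

x = "a (b) c (d) e"

# A capture group inside the escaped parentheses returns only the inner text.
first = re.search(r"\((.*?)\)", x).group(1)

# With a single group, findall returns that group's text for every match.
all_inner = re.findall(r"\((.*?)\)", x)

# The negated character class from the question gives the same result here.
all_inner_negated = re.findall(r"\(([^)]*)\)", x)

print(first, all_inner, all_inner_negated)
```

For this input both approaches agree, so choosing between them is mostly a matter of readability.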

According to the docs:

The '*', '+', and '?' qualifiers are all greedy; they match as much text as possible. Sometimes this behavior isn’t desired; if the RE <.*> is matched against '<H1>title</H1>', it will match the entire string, and not just '<H1>'. Adding '?' after the qualifier makes it perform the match in non-greedy or minimal fashion; as few characters as possible will be matched. Using .*? in the previous expression will match only '<H1>'.
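The example from that passage can be tried directly. A minimal sketch, with variable names of my own choosing:

```python
import re

html = "<H1>title</H1>"

# Greedy: .* runs to the end, then backtracks only as far as the last '>'.
greedy = re.match(r"<.*>", html).group()

# Non-greedy: .*? stops at the first '>' that lets the pattern succeed.
lazy = re.match(r"<.*?>", html).group()

# findall with the non-greedy form picks out each tag separately.
tags = re.findall(r"<.*?>", html)

print(greedy, lazy, tags)
```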


Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow