I ran into a small problem using Python Regex.
Suppose this is the input:
(zyx)bc
What I'm trying to achieve is to obtain whatever is between parentheses as a single match, and any character outside the parentheses as an individual match. The desired result would be along the lines of:
['zyx', 'b', 'c']
The order of matches should be kept.
I've tried obtaining this with Python 3.3, but can't seem to figure out the correct Regex. So far I have:
from re import findall
matches = findall(r'\((.*?)\)|\w', '(zyx)bc')
print(matches) yields the following:
['zyx', '', '']
Any ideas what I'm doing wrong?
From the documentation of re.findall:
If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group.
While your regexp is matching the string three times, the (.*?) group is empty for the last two matches. If you want the output of the other half of the regexp, you can add a second group:
>>> re.findall(r'\((.*?)\)|(\w)', '(zyx)bc')
[('zyx', ''), ('', 'b'), ('', 'c')]
Alternatively, you could remove all the groups to get a simple list of strings again:
>>> re.findall(r'\(.*?\)|\w', '(zyx)bc')
['(zyx)', 'b', 'c']
You would need to manually remove the parentheses though.
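As a minimal sketch of that last step (the variable name cleaned is only illustrative), the parentheses can be sliced off any match that starts with one:

```python
import re

# No capture groups, so findall returns the full text of each match.
matches = re.findall(r'\(.*?\)|\w', '(zyx)bc')

# Matches from the first branch always start with '(' and end with ')',
# so slicing off the first and last character removes the parentheses.
cleaned = [m[1:-1] if m.startswith('(') else m for m in matches]
print(cleaned)  # ['zyx', 'b', 'c']
```

This relies on \w never matching a parenthesis, so only the parenthesised matches are sliced.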
Let's take a look at how your pattern is parsed, using re.DEBUG:
branch
  literal 40
  subpattern 1
    min_repeat 0 65535
      any None
  literal 41
or
  in
    category category_word
Ouch, there's only one subpattern in there, but re.findall only pulls out subpatterns if one exists! Giving the second branch its own subpattern fixes that:
>>> a = re.findall(r'\((.*?)\)|(.)', '(zyx)bc', re.DEBUG); a
branch
  literal 40
  subpattern 1
    min_repeat 0 65535
      any None
  literal 41
or
  subpattern 2
    any None
[('zyx', ''), ('', 'b'), ('', 'c')]
Now we just have to make this into the format you want, taking whichever element of each tuple is non-empty:
>>> [i[0] if i[0] != '' else i[1] for i in a]
['zyx', 'b', 'c']
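If you would rather keep the original single-group pattern, one possible alternative is re.finditer, which yields match objects instead of strings, so you can choose between the group and the whole match yourself. This is only a sketch; split_groups is a hypothetical helper name, not something from the re module:

```python
import re

def split_groups(text):
    """Return parenthesised runs as one item and other word chars singly."""
    result = []
    for m in re.finditer(r'\((.*?)\)|\w', text):
        # group(1) is None when the \w branch matched, because the
        # capture group did not take part in the match.
        if m.group(1) is not None:
            result.append(m.group(1))
        else:
            result.append(m.group(0))
    return result

print(split_groups('(zyx)bc'))  # ['zyx', 'b', 'c']
```

Since matches are produced left to right, the order of the input is kept.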