How to replace only part of the match with python re.sub


Question

I need to match two cases with one regular expression and do a replacement:

'long.file.name.jpg' -> 'long.file.name_suff.jpg'

'long.file.name_a.jpg' -> 'long.file.name_suff.jpg'

I'm trying the following:

re.sub('(\_a)?\.[^\.]*$' , '_suff.',"long.file.name.jpg")

But this cuts off the extension '.jpg', and I'm getting

long.file.name_suff. instead of long.file.name_suff.jpg. I understand that this is because of the [^.]*$ part, but I can't exclude it, because I have to find the last occurrence of '_a' to replace, or the last '.'.

Is there a way to replace only part of the match?

5/4/2010 8:12:28 AM

Accepted Answer

 re.sub(r'(?:_a)?\.([^.]*)$', r'_suff.\1', "long.file.name.jpg")

?: starts a non-capturing group, so (?:_a) matches the _a but does not number it as a group; the following question mark makes it optional.

So in English, this says: match the ending .<anything>, whether or not it is preceded by _a, and capture only the <anything> so the replacement can put it back with \1.
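To check that the pattern above handles both cases from the question, here is a minimal sketch. The helper name add_suffix is hypothetical and exists only for this illustration; the pattern and replacement are the ones from the answer.

```python
import re

# Pattern from the answer: optional "_a", then the final ".ext".
# Group 1 captures only the extension, so the replacement can restore it.
PATTERN = re.compile(r'(?:_a)?\.([^.]*)$')

def add_suffix(filename):
    # add_suffix is a hypothetical helper name used only in this sketch.
    return PATTERN.sub(r'_suff.\1', filename)

results = [add_suffix(name) for name in ("long.file.name.jpg", "long.file.name_a.jpg")]
print(results)
```

Both inputs should come out as long.file.name_suff.jpg, because the dots earlier in the name cannot satisfy the $ anchor.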

Another way to do this would be to use a lookbehind. Mentioning this because they're super useful, but I didn't know of them for 15 years of doing REs.
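The answer does not show the lookaround version. For this particular problem the variant that seems to fit is a lookahead rather than a lookbehind, since the extension comes after the part being replaced. A rough sketch under that assumption (the helper name add_suffix_lookahead is made up for the example):

```python
import re

# Match an optional "_a" only at the spot where the final ".ext" follows.
# The lookahead (?=...) checks for the extension without consuming it,
# so the replacement text does not need to restore anything.
LOOKAHEAD = re.compile(r'(?:_a)?(?=\.[^.]*$)')

def add_suffix_lookahead(filename):
    # count=1 keeps only the first match. Without it, an empty match
    # right before the dot can be replaced a second time after "_a".
    return LOOKAHEAD.sub('_suff', filename, count=1)

print(add_suffix_lookahead("long.file.name.jpg"))
print(add_suffix_lookahead("long.file.name_a.jpg"))
```

The trade-off is that the replacement string gets simpler, but the pattern can match an empty string, which is why the count=1 guard is there.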

5/23/2017 11:47:08 AM

Put a capture group around the part that you want to preserve, and then include a reference to that capture group within your replacement text.

re.sub(r'(_a)?\.([^.]*)$', r'_suff.\2', "long.file.name.jpg")
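If counting group numbers gets error-prone, the same idea can be written with a named group. This is only a sketch of that variation, not part of the original answer; the group name ext and the helper name add_suffix_named are chosen for illustration.

```python
import re

def add_suffix_named(filename):
    # (?P<ext>...) names the captured extension; \g<ext> refers back to it
    # in the replacement, so the group's position in the pattern no longer matters.
    return re.sub(r'(?:_a)?\.(?P<ext>[^.]*)$', r'_suff.\g<ext>', filename)

print(add_suffix_named("long.file.name_a.jpg"))
```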

Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow