How to extract the substring between two markers?


Let's say I have a string 'gfgfdAAA1234ZZZuijjk' and I want to extract just the '1234' part.

I only know the few characters directly before and after the part I am interested in: AAA comes right before 1234, and ZZZ comes right after it.

With sed it is possible to do something like this with a string:

echo "$STRING" | sed -e "s|.*AAA\(.*\)ZZZ.*|\1|"

And this will give me 1234 as a result.

How can I do the same thing in Python?

10/10/2018 10:41:52 PM

Accepted Answer

Using regular expressions (see the re module documentation for further reference):

import re

text = 'gfgfdAAA1234ZZZuijjk'

m = re.search('AAA(.+?)ZZZ', text)
if m:
    found = m.group(1)

# found: 1234


or:

import re

text = 'gfgfdAAA1234ZZZuijjk'

try:
    found = re.search('AAA(.+?)ZZZ', text).group(1)
except AttributeError:
    # AAA, ZZZ not found in the original string
    found = '' # apply your error handling

# found: 1234
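
If the markers are not fixed strings like AAA and ZZZ, they may contain characters that have a special meaning in a regular expression. A minimal sketch of a reusable helper, assuming that case; the function name between is hypothetical and not part of the re module:

```python
import re

def between(text, start, end):
    """Return the shortest non-empty substring between start and end, or None."""
    # re.escape makes the markers literal, so '[' or '.' are not read as regex syntax.
    pattern = re.escape(start) + '(.+?)' + re.escape(end)
    m = re.search(pattern, text)
    # re.search returns None when there is no match, so check before calling group().
    return m.group(1) if m else None

print(between('gfgfdAAA1234ZZZuijjk', 'AAA', 'ZZZ'))  # 1234
print(between('price: [42] USD', '[', ']'))           # 42
print(between('no markers here', 'AAA', 'ZZZ'))       # None
```

Note that (.+?) needs at least one character between the markers and, by default, does not match across newlines.
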
10/8/2013 3:50:59 PM

>>> s = 'gfgfdAAA1234ZZZuijjk'
>>> start = s.find('AAA') + 3
>>> end = s.find('ZZZ', start)
>>> s[start:end]
'1234'

Then you can use regexps with the re module as well, if you want, but that's not necessary in your case.
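
One caveat with the find approach: str.find returns -1 when the marker is missing, so the slice can silently return the wrong text instead of failing. A minimal sketch that checks for this, assuming you want None when either marker is absent; the function name between_find is hypothetical:

```python
def between_find(s, start, end):
    """Return the text between start and the next end marker, or None."""
    i = s.find(start)
    if i == -1:
        return None          # start marker not present
    i += len(start)          # skip past the start marker itself
    j = s.find(end, i)       # look for the end marker only after the start marker
    if j == -1:
        return None          # end marker not present after the start marker
    return s[i:j]

print(between_find('gfgfdAAA1234ZZZuijjk', 'AAA', 'ZZZ'))  # 1234
print(between_find('gfgfd1234ZZZuijjk', 'AAA', 'ZZZ'))     # None
```

Using len(start) instead of a hard-coded 3 keeps the helper correct for markers of any length.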

Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow