Regular expression matching everything except a given regular expression


Question

I am trying to figure out a regular expression which matches any string that doesn't start with mpeg. A generalization of this is matching any string which doesn't start with a given regular expression.

I tried something like the following:

[^m][^p][^e][^g].* 

The problem is that this requires the string to contain at least 4 characters. I could not find a good way to handle shorter strings, or a way to generalize the approach to an arbitrary prefix pattern.

I will be using this in Python.

2/8/2018 12:01:33 AM

Accepted Answer

^(?!mpeg).*

This uses a negative lookahead to match only strings whose beginning does not match mpeg. Essentially, it requires that the position at the start of the string is not one from which the regex mpeg could successfully match. It therefore matches anything that doesn't start with mpeg and rejects anything that does.
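As a minimal sketch of how this behaves in Python, and of how it might generalize to "doesn't start with a given regular expression": the helper name not_starting_with is made up for illustration, and wrapping the prefix in a non-capturing group and using re.DOTALL are assumptions about what is wanted, not part of the original answer.

```python
import re

def not_starting_with(prefix_pattern):
    """Compile a regex matching any string that does not start with prefix_pattern.

    Hypothetical helper: the prefix is wrapped in a non-capturing group so
    that alternations such as "mpeg|avi" stay inside the lookahead.
    """
    # re.DOTALL lets "." match newlines, so multi-line strings match in full.
    return re.compile(r"^(?!(?:%s)).*" % prefix_pattern, re.DOTALL)

no_mpeg = not_starting_with("mpeg")

print(bool(no_mpeg.match("mpeg4 video")))  # False: starts with mpeg
print(bool(no_mpeg.match("mp")))           # True: shorter than 4 characters
print(bool(no_mpeg.match("")))             # True: empty string
```

Unlike the character-class attempt in the question, the lookahead consumes no characters, so strings shorter than the prefix still match.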

However, I'd be curious about the context in which you're using this - there might be other options aside from regex which would be either more efficient or more readable, such as...

if not inputstring.startswith("mpeg"):
    pass  # handle strings that don't start with "mpeg"
12/11/2013 6:29:42 AM

Don't lose your mind over regex.

if mystring[:4] == "mpeg":
    print("do something")

Or use startswith() with the "not" keyword:

if not mystring.startswith("mpeg"):
    print("do something")
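A short sketch of the startswith() approach on short and empty strings, which were the sticking point in the question. The function name lacks_prefix and the sample strings are made up for illustration.

```python
def lacks_prefix(text, prefix="mpeg"):
    """Return True when text does not begin with prefix."""
    # str.startswith returns False for strings shorter than the prefix,
    # so no separate length check is needed.
    return not text.startswith(prefix)

for candidate in ["mpeg4 video", "mp", "", "jpeg"]:
    print(repr(candidate), lacks_prefix(candidate))
```

startswith() also accepts a tuple of prefixes, for example lacks_prefix(name, ("mpeg", "avi")), which covers several fixed prefixes without a regex.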

Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow