I'm just learning Python, and I can't seem to figure out regular expressions.
r1 = re.compile("$.pdf")
if r1.match("spam.pdf"):
    print 'yes'
else:
    print 'no'
I want this code to print 'yes', but it obstinately prints 'no'. I've also tried each of the following:
r1 = re.compile(r"$.pdf")
r1 = re.compile("$ .pdf")
r1 = re.compile('$.pdf')
if re.match("$.pdf", "spam.pdf")
r1 = re.compile(".pdf")
Plus countless other variations. I've been searching for quite a while, but can't find/understand anything that solves my problem. Can someone help out a newbie?
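You've tried all the variations except the one that works. The $ goes at the end of the pattern. Also, you'll want to escape the period so it actually matches a period (usually it matches any character). Note that match() only matches at the start of the string, so use r1.search() rather than r1.match() with this pattern: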
r1 = re.compile(r"\.pdf$")
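To see why escaping the period matters, here is a minimal sketch. It uses search() rather than match(), since match() only looks at the start of the string. The file names and the `results` dictionary are made up for illustration.

```python
import re

unescaped = re.compile(r".pdf$")  # "." matches any character except a newline
escaped = re.compile(r"\.pdf$")   # "\." matches only a literal period

# Hypothetical file names, chosen to show the difference.
results = {}
for name in ["spam.pdf", "spam_pdf", "spampdf"]:
    results[name] = (bool(unescaped.search(name)), bool(escaped.search(name)))
    print(name, results[name])
```

Only the escaped pattern rejects "spam_pdf" and "spampdf"; the unescaped one accepts all three names.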
However, an easier and clearer way to do this is using the string's .endswith() method:
if filename.endswith(".pdf"):
    # do something
That way you don't have to decipher the regular expression to understand what's going on.
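A small sketch of the .endswith() approach follows. The helper name `is_pdf` and the decision to ignore case are assumptions for illustration, not part of the original question.

```python
def is_pdf(filename):
    # Lower-case first so "REPORT.PDF" is accepted too (an assumption
    # about what you want; drop .lower() for a case-sensitive check).
    return filename.lower().endswith(".pdf")

print(is_pdf("spam.pdf"))    # True
print(is_pdf("REPORT.PDF"))  # True
print(is_pdf("spam.txt"))    # False

# endswith() also accepts a tuple of suffixes.
print("photo.jpeg".endswith((".jpg", ".jpeg")))  # True
```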
The difference between re.match() and re.search() is shown clearly in the Python documentation section called "search() vs. match()".
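As a quick illustration of that difference, the sketch below applies the same pattern with both methods. The variable names are only for this example.

```python
import re

pattern = re.compile(r"\.pdf$")

# match() only tries at position 0, and "spam.pdf" does not start with ".pdf".
at_start = pattern.match("spam.pdf")

# search() scans the whole string and finds ".pdf" at the end.
anywhere = pattern.search("spam.pdf")

print(at_start)          # None
print(anywhere.group())  # .pdf

# match() can still work if the pattern itself covers the start of the string.
whole = re.match(r".*\.pdf$", "spam.pdf")
print(whole.group())     # spam.pdf
```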
Also, the special characters in a regular expression mean something different from how you are trying to use them (see Regular Expression Syntax for details):
^ matches the beginning:
(Caret.) Matches the start of the string, and in MULTILINE mode also matches immediately after each newline.
$ matches the end:
Matches the end of the string or just before the newline at the end of the string, and in MULTILINE mode also matches before a newline. foo matches both 'foo' and 'foobar', while the regular expression foo$ matches only 'foo'. More interestingly, searching for foo.$ in 'foo1\nfoo2\n' matches 'foo2' normally, but 'foo1' in MULTILINE mode; searching for a single $ in 'foo\n' will find two (empty) matches: one just before the newline, and one at the end of the string.
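The behaviour described in the quoted documentation can be checked directly. This is a minimal sketch using the documentation's own example strings; the variable names are made up here.

```python
import re

text = "foo1\nfoo2\n"

# Without MULTILINE, $ matches only at the end of the string
# (or just before a trailing newline).
normal = re.search(r"foo.$", text)

# With MULTILINE, $ also matches before every newline.
multi = re.search(r"foo.$", text, re.MULTILINE)

print(normal.group())  # foo2
print(multi.group())   # foo1

# A lone $ finds two empty matches in "foo\n":
# one just before the newline and one at the end of the string.
positions = [m.start() for m in re.finditer(r"$", "foo\n")]
print(positions)       # [3, 4]
```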
The solution you are looking for may be:
import re

r1 = re.compile(r"\.pdf$")  # regular expression corrected
if r1.search("spam.pdf"):  # re.match() replaced with re.search()
    print("yes")
else:
    print("no")
which checks whether the string ends with ".pdf". It does the same job as .endswith(), but if kindall's answer works for you, choose it (it is cleaner, as you may not need regular expressions at all).