Regular Expressions

Regular expressions, often referred to as REs or regexes, are bits of small and highly specialized programming language embedded inside Python. The PyDocs actually contains an entire page on how to use regex. We will be touching on some of it's functionality... it'll be up to you to utilize the PyDocs and take your knowledge to the next level.

PyDocs HOWTO

https://docs.python.org/3/howto/regex.html

PyDocs Re

https://docs.python.org/2/library/re.html

Using Regular Expressions

re.compile: compiles regular expression into pattern objects which allows for repeated use

re.match: apply pattern to the start of a string, return match or None

re.search: scan through a string, return match or None

re.findall: find all matches and return them as a list

Python re search and compile example:

import re

search_str = "This is a string to search"
re_search = re.compile("string")
matched_obj = re_search.search(search_str)

print matched_obj
<_sre.SRE_Match object at 0x02CBFA68>

Practical Example

import re

patterns = ['Enterprise', 'Death Star']
pharse = "The Enterprise is the flagship of the Federation."

for pattern in patterns:
    print 'Looking for "{}" in "{}": '.format(pattern, pharse)

    if re.search(pattern, pharse):
        print "found a match!"
    else:
        print "no match!"
Looking for "Enterprise" in "The Enterprise is the flagship of the Federation.":
found a match!
Looking for "Death Star" in "The Enterprise is the flagship of the Federation.":
no match!

Python re search start() and end():

import re

pattern = "Dave"
text = "I'm sorry Dave, I'm afarid I can't do that."

match = re.search(pattern, text)

start = match.start()
end = match.end()

print 'Found "{}" in "{}" from {} to {} ("{}")'.format(match.re.pattern, match.string, start, end, text[start:end])
Found "Dave" in "I'm sorry Dave, I'm afarid I can't do that." from 10 to 14 ("Dave")

Python re match example:

Python's re.match is similar to re.search. The main difference being that match searches only at the start of the string whereas search will apply the pattern to the entire string.

import re

text = "I turned myself into a pickle. I'm Pickle Riiiiick."
text2 = "I did not turn myself into a pickle.  I am not Pickle Riiiiick."

match = re.match("I turned myself", text)
match2 = re.match("I did not", text2)

if match == None:
    print 'Could not find "{}" in "{}"'.format(match.re.pattern, match.string)
else:
    print 'Found "{}" in "{}"'.format(match.re.pattern, match.string)

if match2 == None:
    print 'Cound not find "{}" in "{}"'.format(match2.re.pattern, match2.string)
else:
    print 'Found "{}" in "{}"'.format(match2.re.pattern, match2.string)
Found "I turned myself" in "I turned myself into a pickle. I'm Pickle Riiiiick"
Found "I did not" in "I did not turn myself into a pickle.  I am not Pickle Riiiiick."