Outline for March 1
Reading: text, §9.2
Due: Homework 4, on March 4 at 11:55pm
- “Raw” string notation: backslash not handled specially; put “r” before string
- Useful functions/methods [recomp.py, renocomp.py, regroup.py]
- re.compile(str) compiles the pattern str and returns a compiled pattern object (here, pc = re.compile(str))
- pc.match(str) returns a match object if compiled pattern pc matches at the beginning of string str, and None otherwise
- pc.search(str) returns a match object for the first place in string str where pattern pc matches, and None if it matches nowhere
- pc.findall(str) returns a list of all non-overlapping substrings of the string str that match the pattern pc
- m.group() returns the substring that the pattern matched, where m is the match object returned by pc.match(str) or pc.search(str)
- m.start() returns the starting position of the match
- m.end() returns the ending position of the match (the index just past the last matched character)
- m.span() returns tuple (start, end) positions of match
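A minimal sketch of the functions and methods above, assuming Python 3. The sample string and the variable names (text, first, all_numbers, and so on) are made up for illustration and are not taken from recomp.py, renocomp.py, or regroup.py. Note that group(), start(), end(), and span() are called on the match object that match() or search() returns, not on the compiled pattern.

```python
import re

# Raw string: the backslash is kept as-is, so r"\n" is two characters.
assert len("\n") == 1 and len(r"\n") == 2

text = "room 12, floor 3"        # hypothetical sample input
pc = re.compile(r"\d+")          # compile the pattern once, reuse it below

at_start = pc.match(text)        # None: text does not begin with a digit
first = pc.search(text)          # match object for the first run of digits
all_numbers = pc.findall(text)   # ['12', '3']

# Always check for None before using a match object.
if first is not None:
    matched = first.group()      # '12'
    start = first.start()        # 5
    end = first.end()            # 7 (one past the last matched character)
    span = first.span()          # (5, 7)
    print(matched, start, end, span)
```

Because match() and search() return None on failure, calling group() on the result without the None check raises AttributeError when nothing matches.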
- Useful abbreviations
- \d matches any digit; for ASCII text, same as [0-9]
- \s matches any whitespace character; for ASCII text, same as [\ \t\n\r\f\v]
- \w matches any alphanumeric character and underscore; for ASCII text, same as [a-zA-Z0-9_]
- \D matches any character except a digit; inverse of \d
- \S matches any character except a whitespace character; inverse of \s
- \W matches any character except an alphanumeric character or underscore; inverse of \w
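- \b matches a word boundary, that is, the empty position between a \w character and a \W character (or the start or end of the string); a word is a sequence of alphanumeric characters and underscores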
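The abbreviations above can be tried out with a short sketch like the following, assuming Python 3. The sample strings and variable names are invented for illustration.

```python
import re

sample = "Call 555-0199 at 9am, ok_then?"    # hypothetical sample input

digits = re.findall(r"\d+", sample)       # ['555', '0199', '9']
words = re.findall(r"\w+", sample)        # ['Call', '555', '0199', 'at', '9am', 'ok_then']
non_words = re.findall(r"\W+", sample)    # [' ', '-', ' ', ' ', ', ', '?']
chunks = re.findall(r"\S+", sample)       # ['Call', '555-0199', 'at', '9am,', 'ok_then?']
pieces = re.split(r"\s+", sample)         # same as chunks for this input
only_digits = re.sub(r"\D", "", "555-0199")   # '5550199'

# \b keeps "at" from matching inside "cat" or "attic".
loose = re.findall(r"at", "cat at attic")       # ['at', 'at', 'at']
strict = re.findall(r"\bat\b", "cat at attic")  # ['at']

print(digits, words, strict)
```

Every pattern here is written as a raw string; without the r prefix, "\b" would be read by Python as a backspace character before the regular expression engine ever saw it.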