Outline for February 27, 2019
Reading: text, §11
Homework: due March 8, 2019 at 11:59pm
Modifying parameter lists
Not directly [modpar1.py]
Using lists [modpar2.py]
Why it works
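A minimal sketch of the idea, not the contents of modpar1.py or modpar2.py: the function names and values here are hypothetical. Assigning to a parameter only rebinds the local name, while changing an element of a list changes the one list object that both the caller and the function refer to.

```python
def try_to_change(n):
    # Rebinds the local name n only; the caller's variable is untouched.
    n = n + 1


def change_first(items):
    # Changes the list object itself, which the caller's variable also names.
    items[0] = items[0] + 1


count = 5
try_to_change(count)
print(count)     # 5

values = [5]
change_first(values)
print(values)    # [6]
```

Wrapping a value in a one-element list is one way to let a function change it; returning the new value and reassigning it in the caller is the other common choice.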
Pattern matching
Regular expressions
Atoms: letters, digits
Match any character except newline: .
Match any of a set of characters: [0123456789], [0-9]; match any character not in the set: [^0123456789]
Repetition: * (zero or more), + (one or more), {m,n} (between m and n times); matching is greedy; put ? after any of these and it matches as few characters as possible
Match start, end of string: ^, $; $ also matches just before a newline at the end of the string (and at the end of every line in multiline mode)
Grouping: (, )
Escape metacharacters: \
“Raw” string notation: backslash not handled specially; put “r” before string
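The pieces of regular expression syntax listed above can be tried out with a short sketch like the following. The sample strings and variable names are made up for illustration.

```python
import re

text = "room 101, room 2024"

# Sets: [0-9] matches one digit, [^0-9] matches one non-digit.
first_digit = re.search(r"[0-9]", text).group()       # "1"
first_nondigit = re.search(r"[^0-9]", text).group()   # "r"

# Repetition: + is one or more, {1,3} is between 1 and 3 times.
numbers = re.findall(r"[0-9]+", text)                 # ["101", "2024"]
short_runs = re.findall(r"[0-9]{1,3}", text)          # ["101", "202", "4"]

# Greedy versus non-greedy: adding ? makes .* match as little as possible.
tagged = "<b>bold</b>"
greedy = re.search(r"<.*>", tagged).group()           # "<b>bold</b>"
lazy = re.search(r"<.*?>", tagged).group()            # "<b>"

# Anchors: ^ matches at the start of the string, $ at the end.
starts_with_room = re.search(r"^room", text) is not None   # True
ends_with_2024 = re.search(r"2024$", text) is not None     # True

# Grouping: parentheses capture parts of the match.
pair = re.search(r"(room) ([0-9]+)", text)
word, number = pair.group(1), pair.group(2)           # "room", "101"

# Escapes: \$ and \. match a literal dollar sign and a literal period.
price = re.search(r"\$[0-9]+\.[0-9]{2}", "cost: $3.50").group()   # "$3.50"

# Raw strings: r"\." is the same two characters as "\\."
same = (r"\." == "\\.")                               # True
```

Writing every pattern as a raw string avoids having to double each backslash.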
Useful functions/methods [recomp.py, renocomp.py, regroup.py]
re.compile(str) compiles the pattern str into a pattern object pc (that is, pc = re.compile(str))
pc.match(str) returns None if the compiled pattern pc does not match at the beginning of the string str; otherwise it returns a match object
pc.search(str) returns None if the pattern pc does not match any part of the string str; otherwise it returns a match object for the first match
pc.findall(str) returns a list of the substrings of the string str that match the pattern pc
m.group() returns the substring that the pattern pc matched, where m is the match object returned by pc.match(str) or pc.search(str)
m.start() returns the starting position of the match
m.end() returns the ending position of the match (the index just past the last matched character)
m.span() returns the tuple (start, end) of positions of the match
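A short sketch of how these calls fit together, using a made-up pattern and input line rather than the contents of recomp.py, renocomp.py, or regroup.py. Note that group, start, end, and span are called on the match object that match or search returns, not on the compiled pattern.

```python
import re

pc = re.compile(r"[0-9]+")
line = "abc 123 def 4567"

at_start = pc.match(line)    # None: the line does not begin with a digit
m = pc.search(line)          # match object for the first run of digits

matched = m.group()          # "123"
begin, finish = m.start(), m.end()   # 4, 7
positions = m.span()         # (4, 7)
all_numbers = pc.findall(line)       # ["123", "4567"]

# The same search without compiling first, using the module-level function.
m2 = re.search(r"([a-z]+) ([0-9]+)", line)
letters, digits = m2.group(1), m2.group(2)   # "abc", "123"

print(at_start, matched, positions, all_numbers, letters, digits)
```

Because match and search return None on failure, code that goes on to call group should check the result first, for example with "if m is not None:".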
Useful abbreviations
\d matches any digit; same as [0-9]
\s matches any space character; same as [ \t\n\r\f\v]
\w matches any alphanumeric character and underscore; same as [a-zA-Z0-9_]
\D matches any character except a digit; inverse of \d
\S matches any character except a space character; inverse of \s
\W matches any character except an alphanumeric character or underscore; inverse of \w
\b matches a word boundary; a word is a sequence of alphanumeric characters or underscores (the characters that \w matches)
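The abbreviations can be checked with a small sketch like this one; the sample strings are invented. One caveat: on ordinary Python 3 strings, \d, \s, and \w also match non-ASCII digits, whitespace, and letters, so the bracketed equivalents above are exact only for ASCII text or when the re.ASCII flag is used.

```python
import re

sample = "a1 b_2!"

digits = re.findall(r"\d", sample)        # ['1', '2']
spaces = re.findall(r"\s", sample)        # [' ']
word_chars = re.findall(r"\w", sample)    # ['a', '1', 'b', '_', '2']

non_digits = re.findall(r"\D", sample)    # ['a', ' ', 'b', '_', '!']
non_spaces = re.findall(r"\S", sample)    # ['a', '1', 'b', '_', '2', '!']
non_words = re.findall(r"\W", sample)     # [' ', '!']

# \b keeps "cat" from matching inside "concat" or "category".
whole_words = re.findall(r"\bcat\b", "cat concat category")   # ['cat']

print(digits, spaces, word_chars, non_digits, non_spaces, non_words, whole_words)
```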
Matt Bishop
Department of Computer Science
University of California at Davis
Davis, CA 95616-8562 USA
Last modified: Version of February 27, 2019 at 12:31AM
Winter Quarter 2019