Software Carpentry
Regular Expressions


You Can Skip This Lecture If...

A Simple Example

This or That


Escaping Special Characters

Raw Strings


Making Something Optional

Character Sets


Special Cases


Extracting Matches

Match Objects

Match Groups

Reversing Columns


Finding Title Case Words

Finding All Matches

Reference Material

But Wait, There's More



Exercise 17.1:

By default, regular expression matches are greedy: the first part of the RE matches as much as it can, then the second part, and so on, with each part giving back only as much as the rest of the pattern needs in order to match. As a result, if you apply the RE «X(.*)X(.*)» to the string "XaX and XbX", the first group will contain "aX and Xb", and the second group will be empty.
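
The greedy behaviour described above can be checked with a short sketch using Python's standard re module. The pattern and input string come from the exercise; the variable names are only illustrative, and the sketch shows the default behaviour rather than the solution.

```python
import re

# Pattern and input string taken from the exercise text.
pattern = r"X(.*)X(.*)"
text = "XaX and XbX"

# The first group grabs as much as it can, leaving only the
# final "X" for the literal in the middle of the pattern.
match = re.search(pattern, text)
first, second = match.groups()

print(repr(first))   # 'aX and Xb'
print(repr(second))  # ''
```

Printing with repr makes the empty second group visible as a pair of quotes.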

It's also possible to make REs match reluctantly, i.e., to have the parts match as little as possible, rather than as much. Find out how to do this, and then modify the RE in the previous paragraph so that the first group winds up containing "a", and the second group " and XbX".

Exercise 17.2:

What is the easiest way to write a case-insensitive regular expression? (Hint: read the documentation on compilation options.)

Exercise 17.3:

What does the VERBOSE option do when compiling a regular expression? Use it to rewrite some of the REs in this lecture in a more readable way.

Exercise 17.4:

What does the DOTALL option do when compiling a regular expression? Use it to get rid of the call to string.split in the example that finds words ending in vowels.
