Natural Language Parsing with Regular Expressions
Compiling and Matching

Before you dive into more complex syntax parsing, you’ll begin with a refresher on basic regular expressions in Python using the re module.

The first method you will explore is .compile(). This method takes a regular expression pattern as an argument and compiles the pattern into a regular expression object, which you can later use to find matching text. The regular expression object below matches a run of exactly 4 upper- or lowercase letters.

import re
regular_expression_object = re.compile("[A-Za-z]{4}")

Regular expression objects have a .match() method that takes a string of text as an argument and looks for a single match to the regular expression that starts at the beginning of the string. To see if your regular expression matches the string "Toto", you can do the following:

result = regular_expression_object.match("Toto")

If .match() finds a match that starts at the beginning of the string, it will return a match object. The match object lets you know what piece of text the regular expression matched, and at what index the match begins and ends. If there is no match, .match() will return None.
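As a minimal sketch of this behavior, the following reuses the compiled pattern from this lesson. The sample strings "Totoro" and "42 Toto" are illustrative assumptions and not part of the exercise.

```python
import re

regular_expression_object = re.compile("[A-Za-z]{4}")

# A match at the beginning of the string returns a match object
result = regular_expression_object.match("Toto")
print(result.start(), result.end())  # 0 4

# On a longer string, .match() still succeeds, but the match
# covers only the first four letters
longer_result = regular_expression_object.match("Totoro")
print(longer_result.span())  # (0, 4)

# The string does not begin with four letters, so there is no match
no_match = regular_expression_object.match("42 Toto")
print(no_match)  # None
```

Because .match() returns None when nothing matches, it is usually worth checking the result before calling any methods on it.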

With the match object stored in result, you can access the matched text by calling result.group(0). If your regex contains capture groups, you can access the text each one captured by calling .group() with that group's number as an argument. Groups are numbered from 1, in the order their opening parentheses appear.
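To illustrate numbered capture groups, here is a short sketch. The pattern name_pattern and the sample string "Dorothy Gale" are hypothetical and chosen only for this example.

```python
import re

# Hypothetical pattern: two capture groups separated by a single space
name_pattern = re.compile(r"([A-Za-z]+) ([A-Za-z]+)")

result = name_pattern.match("Dorothy Gale")

# Guard against None before calling .group()
if result is not None:
    print(result.group(0))  # Dorothy Gale (the entire match)
    print(result.group(1))  # Dorothy (first capture group)
    print(result.group(2))  # Gale (second capture group)
```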

Instead of compiling the regular expression first and then looking for a match in separate lines of code, you can simplify your match to one line:

result = re.match("[A-Za-z]{4}", "Toto")

With this syntax, re’s .match() method takes a regular expression pattern as the first argument and the string to search as the second argument.
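The two approaches can be compared side by side in a small sketch. The list words is a hypothetical example, added to show one situation where compiling once and reusing the object reads more cleanly.

```python
import re

pattern = "[A-Za-z]{4}"

# Two-step form: compile first, then match
compiled_result = re.compile(pattern).match("Toto")

# One-line form: pattern and string passed together
one_line_result = re.match(pattern, "Toto")

# Both forms find the same text
print(compiled_result.group(0) == one_line_result.group(0))  # True

# Reusing one compiled object across several strings (hypothetical data)
words = ["Toto", "Oz", "Dorothy"]
compiled = re.compile(pattern)
matching_words = [word for word in words if compiled.match(word)]
print(matching_words)  # ['Toto', 'Dorothy']
```

Note that "Dorothy" is kept because its first four letters satisfy the pattern, while "Oz" is too short to match.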

Instructions

1.

The re module has been imported for you at the top of the workspace. Use .compile() to create a regular expression object named regular_expression that will match any 7-character string of word characters.

2.

Use regular_expression’s .match() method to check if the regex matches the string stored in character_1. Save the result to result_1 and print it.

3.

Access the match in result_1 using its .group() method with an argument of 0. Save the result to match_1 and print it.

4.

In one line, use re’s .match() method with a regular expression that will match any 7-character string of word characters, and check if the regex matches the string stored in character_2. Save the result to result_2 and print it. Was a match found?
