Introduction to Regular Expressions in Python

Basics of Regex, Common Patterns, and Practical Examples for Pattern Matching and Data Validation

Learn the fundamentals of regular expressions (regex) in Python. This tutorial covers the basics, common patterns, and practical examples for pattern matching and data validation using Python’s re module.

Published: February 9, 2024

Modified: February 9, 2025

Keywords: Python regular expressions, regex in Python tutorial, pattern matching Python

Introduction

Regular expressions (regex) describe text patterns that you can search for, match against, and use to transform strings. Python’s built-in re module implements them, so nothing extra needs to be installed. In this tutorial, we’ll cover the basics of regex, explore common patterns, and work through practical examples of pattern matching and data validation.



What are Regular Expressions?

Regular expressions are sequences of characters that define a search pattern. They are widely used for tasks such as:

  • Validating input (e.g., email addresses, phone numbers)
  • Searching and extracting specific patterns from text
  • Replacing or modifying substrings within a larger string
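As a first taste of the validation use case listed above, here is a minimal sketch. It assumes a hypothetical rule (a phone number written as 555-123-4567) and uses re.fullmatch(), which succeeds only when the entire string matches the pattern. The names PHONE_PATTERN and is_valid_phone are illustrative, not part of the re module.

```python
import re

# Hypothetical rule for illustration: three digits, three digits, four digits,
# separated by hyphens (e.g. 555-123-4567).
PHONE_PATTERN = re.compile(r"\d{3}-\d{3}-\d{4}")

def is_valid_phone(value):
    """Return True only if the whole string matches the pattern."""
    return PHONE_PATTERN.fullmatch(value) is not None

print(is_valid_phone("555-123-4567"))       # True
print(is_valid_phone("call 555-123-4567"))  # False: extra text before the number
```

For validation, fullmatch() is usually a safer choice than search(), because search() would accept any string that merely contains a valid-looking fragment.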

Basic Syntax and Functions in Python’s re Module

Python’s re module offers several key functions:

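  • re.search(): Scans the string and returns a match object for the first location where the pattern matches, or None if there is no match.
  • re.match(): Checks for a match only at the beginning of the string; returns a match object or None.
  • re.findall(): Returns a list of all non-overlapping matches, scanning the string from left to right.
  • re.sub(): Returns a copy of the string with each match replaced by a given string (or by the result of a function).
  • re.split(): Splits the string at each match of the pattern and returns the pieces as a list.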
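The difference between re.match() and re.search() is easy to miss, and re.split() is not demonstrated later in this tutorial, so the short sketch below shows both. The sample strings are made up for illustration.

```python
import re

text = "The quick brown fox"

# re.match() only looks at the start of the string, so "fox" is not found.
at_start = re.match(r"fox", text)
# re.search() scans the whole string and finds it.
anywhere = re.search(r"fox", text)

print(at_start)          # None
print(anywhere.group())  # fox

# re.split() splits at every match of the pattern
# (here: a comma followed by optional whitespace).
parts = re.split(r",\s*", "apple, banana,cherry")
print(parts)             # ['apple', 'banana', 'cherry']
```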

Practical Examples

Searching for a Pattern

Use re.search() to locate a pattern in a string:

import re

text = "The quick brown fox jumps over the lazy dog."
pattern = r"fox"
match = re.search(pattern, text)
if match:
    print("Match found:", match.group())
else:
    print("No match found.")

Output:

Match found: fox
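A match object carries more than the matched text. The sketch below, which reuses the same sentence, shows how to read the match position and how the re.IGNORECASE flag makes the search case-insensitive.

```python
import re

text = "The quick brown fox jumps over the lazy dog."

# re.IGNORECASE lets the uppercase pattern match the lowercase word.
match = re.search(r"FOX", text, flags=re.IGNORECASE)
if match:
    print(match.group())                    # fox
    print(match.span())                     # (16, 19)
    print(text[match.start():match.end()])  # fox
```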

Finding All Occurrences

Use re.findall() to extract all matches of a pattern:

import re

text = "apple, banana, cherry, apple, banana"
pattern = r"apple"
matches = re.findall(pattern, text)
print("All matches:", matches)

Output:

All matches: ['apple', 'apple']
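re.findall() becomes more useful with patterns that are not fixed words. The sketch below uses a made-up sentence and a simple date pattern. Note that when the pattern contains capturing groups, findall() returns tuples of the captured parts rather than the full matches.

```python
import re

text = "Order 12 shipped on 2024-02-09; order 7 ships on 2025-02-09."

# No groups: each result is the full matched substring.
dates = re.findall(r"\d{4}-\d{2}-\d{2}", text)
print(dates)       # ['2024-02-09', '2025-02-09']

# With groups: each result is a tuple of the captured parts.
date_parts = re.findall(r"(\d{4})-(\d{2})-(\d{2})", text)
print(date_parts)  # [('2024', '02', '09'), ('2025', '02', '09')]
```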

Replacing Patterns

Use re.sub() to replace matched patterns with a new string:

import re

text = "The price is $100. The discount price is $80."
pattern = r"\$\d+"
new_text = re.sub(pattern, "REDACTED", text)
print("Updated text:", new_text)

Output:

Updated text: The price is REDACTED. The discount price is REDACTED.
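The replacement does not have to be a fixed string. As a sketch on the same text, a backreference such as \1 can reuse part of the match, and a function can compute each replacement. The helper halve is a hypothetical name introduced here for illustration.

```python
import re

text = "The price is $100. The discount price is $80."

# \1 in the replacement refers to the first captured group (the digits).
relabeled = re.sub(r"\$(\d+)", r"\1 USD", text)
print(relabeled)  # The price is 100 USD. The discount price is 80 USD.

def halve(match):
    """Hypothetical helper: return the matched amount cut in half."""
    amount = int(match.group(1))
    return f"${amount // 2}"

# When the replacement is a function, it is called once per match
# and its return value is used as the replacement text.
halved = re.sub(r"\$(\d+)", halve, text)
print(halved)     # The price is $50. The discount price is $40.
```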

Using Groups for Extraction

Parentheses in a pattern create capturing groups, which let you extract specific parts of a match:

import re

text = "My email is alice@example.com."
pattern = r"(\w+)@(\w+\.\w+)"
match = re.search(pattern, text)
if match:
    username, domain = match.groups()
    print("Username:", username)
    print("Domain:", domain)

Output:

Username: alice
Domain: example.com
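The pattern above works for this simple address, but \w+ does not match dots, so a username like first.last would be cut short. The sketch below uses a deliberately loose pattern (an assumption for illustration, not a complete email validator) together with named groups, which let you refer to captured parts by name instead of by position.

```python
import re

text = "Contact: first.last@mail.example.com."

# Assumed, loose pattern for illustration only; real email rules are more complex.
pattern = r"(?P<username>[\w.+-]+)@(?P<domain>[\w-]+(?:\.[\w-]+)+)"
match = re.search(pattern, text)
if match:
    print(match.group("username"))  # first.last
    print(match.group("domain"))    # mail.example.com
    print(match.groupdict())        # {'username': 'first.last', 'domain': 'mail.example.com'}
```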

Tips and Best Practices

  • Keep It Simple:
    Start with simple patterns and gradually build complexity. Overly complex regex can be hard to read and maintain.

  • Test Your Patterns:
    Use online tools like regex101.com to test and debug your regular expressions interactively.

  • Document Your Regex:
    When writing complex patterns, add comments or break them into smaller parts for clarity.

  • Use Raw Strings:
    Prefix regex patterns with r to avoid issues with escape sequences (e.g., r"\d+").
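Two of the tips above can be illustrated in a few lines. The first part of this sketch documents a pattern with the re.VERBOSE flag, which ignores whitespace in the pattern and allows inline comments. The second part shows what can go wrong without a raw string. The sample strings are made up.

```python
import re

# re.VERBOSE ignores whitespace in the pattern and allows # comments.
date_pattern = re.compile(
    r"""
    (?P<year>\d{4})   # four-digit year
    -
    (?P<month>\d{2})  # two-digit month
    -
    (?P<day>\d{2})    # two-digit day
    """,
    re.VERBOSE,
)

match = date_pattern.search("Published on 2024-02-09.")
print(match.groupdict())  # {'year': '2024', 'month': '02', 'day': '09'}

# Without the r prefix, "\b" is a backspace character, not a word boundary.
without_raw = re.findall("\bfox\b", "the fox")
with_raw = re.findall(r"\bfox\b", "the fox")
print(without_raw)  # []
print(with_raw)     # ['fox']
```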

Conclusion

Regular expressions are an indispensable tool for text processing in Python. By mastering the basics and experimenting with practical examples, you can efficiently validate inputs, extract meaningful data, and transform text to meet your needs. With practice, you’ll find that regex can greatly simplify many common text processing tasks.

Happy coding, and enjoy harnessing the power of regular expressions in Python!

Citation

BibTeX citation:
@online{kassambara2024,
  author = {Kassambara, Alboukadel},
  title = {Introduction to {Regular} {Expressions} in {Python}},
  date = {2024-02-09},
  url = {https://www.datanovia.com/learn/programming/python/additional-tutorials/regex.html},
  langid = {en}
}
For attribution, please cite this work as:
Kassambara, Alboukadel. 2024. “Introduction to Regular Expressions in Python.” February 9, 2024. https://www.datanovia.com/learn/programming/python/additional-tutorials/regex.html.