I'm a huge fan of RegEx - by which I mean, I have used it an inordinate number of times when starting out with Python. 

Whether it was search() for pattern matching or findall() for every conceivable use case, "import re" was a standard of sorts in any code I wrote.

Until, of course, shit went downhill during a production incident at my day job and I *really* understood the gravity of misusing RegEx!

So what exactly is RegEx?

RegEx, short for regular expression, is a sequence of characters that forms a search pattern.

In Python, the built-in re module lets us work with regular expressions. We use it along with one of the functions listed below to perform pattern matching & replacement:

- findall

- search

- split

- sub
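To make the list above a little more concrete, here's a minimal sketch of all four functions on one made-up string. The sample text and the date pattern are my own assumptions for illustration, not something from a real project.

```python
import re

# Hypothetical sample text, invented purely for illustration
text = "Order 66 shipped on 2021-03-04; order 67 ships on 2021-03-09."
date_pattern = r"\d{4}-\d{2}-\d{2}"

# findall: every non-overlapping match, returned as a list of strings
dates = re.findall(date_pattern, text)

# search: the first match anywhere in the string, or None if there is none
first = re.search(date_pattern, text)
first_date = first.group() if first else None

# split: break the string wherever the pattern matches
parts = re.split(r";\s*", text)

# sub: replace every match with a replacement string
masked = re.sub(date_pattern, "<date>", text)

print(dates)       # ['2021-03-04', '2021-03-09']
print(first_date)  # 2021-03-04
print(parts)       # ['Order 66 shipped on 2021-03-04', 'order 67 ships on 2021-03-09.']
print(masked)      # Order 66 shipped on <date>; order 67 ships on <date>.
```

Note that search() returns a match object rather than a string, which is why the sketch checks for None before calling group().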

What do we use it for?

From the above, it is pretty evident that the major use cases for RegEx are pattern matching & replacement.

How do we use it?

To use the re module in Python, you first need to import it with the line below:

import re

Once you import the module, you use one of the functions above (based on what you want to achieve).

The Python docs obviously explain this way better than I do, but I found this W3Schools tutorial super useful when learning.

Catastrophic much?

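The most common problems with using RegEx involve backtracking & backreferences. Backtracking is what the engine does when a partial match fails: it goes back and tries other ways of matching the same text, and with certain patterns the number of ways to try explodes as the input grows. Here's a Stack Overflow page differentiating the two & linking to another great resource.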
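If you'd like to see backtracking blow up for yourself, here's a small sketch. The pattern (a+)+$ is a textbook example of nested quantifiers, not the regex from any real incident, and the helper name time_match is my own. I've kept the inputs short so it finishes quickly; the timings will vary by machine, but the risky column should grow sharply with every couple of extra characters.

```python
import re
import time

# Nested quantifiers: the inner a+ and the outer (...)+ can split a run of
# "a"s in exponentially many ways, and the engine tries them all before failing.
risky = re.compile(r"(a+)+$")

# Accepts the same strings, but there is only one way to consume the "a"s.
safer = re.compile(r"a+$")


def time_match(pattern, text):
    """Return (match result, seconds taken) for pattern.match(text)."""
    start = time.perf_counter()
    result = pattern.match(text)
    return result, time.perf_counter() - start


for n in (14, 16, 18, 20):
    text = "a" * n + "b"  # the trailing "b" guarantees the match fails
    _, risky_seconds = time_match(risky, text)
    _, safer_seconds = time_match(safer, text)
    print(f"n={n:2d}  risky={risky_seconds:.4f}s  safer={safer_seconds:.6f}s")
```

Be careful about raising n much further: each extra "a" roughly doubles the work for the risky pattern, which is exactly how a harmless-looking regex ends up pinning a CPU.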

 

If that isn't convincing enough, some real-life examples of backtracking causing significant outages are linked below:

- The Cloudflare outage on July 2, 2019

This is one of the more recent incidents caused by a regular expression backtracking enormously and starving the CPU. It brought down Cloudflare's WAF, core proxying, & CDN functionality.

- Moment library (versions older than 2.15.2)

Moment is a lightweight JS date library for parsing, validating, manipulating, and formatting dates. Versions older than 2.15.2 were prone to ReDoS (regular expression denial of service), per this vulnerability reported by Snyk. A patched release addressed the issue.

Obviously, this does not vilify RegEx (or users of RegEx!). It's a great tool for getting stuff done when used carefully & sparingly. But if you're anything like I was as a beginner, I'd definitely suggest exercising caution when using it in your code!
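As a rough idea of what "carefully & sparingly" can look like in practice, here's a sketch under my own assumptions: plain string methods for simple checks, and a length cap before any regex touches input you don't control. The function names and the MAX_INPUT_LENGTH value are hypothetical, picked only for illustration.

```python
import re

# Hypothetical cap; choose a limit that fits your own inputs
MAX_INPUT_LENGTH = 1_000


def is_error_line(line):
    """A simple prefix check does not need a regex at all."""
    return line.startswith("ERROR")


def parse_key_value(line):
    """Split 'key = value' on the first '=' using str.partition."""
    key, separator, value = line.partition("=")
    if not separator:
        return None  # no '=' found
    return key.strip(), value.strip()


def guarded_search(pattern, text):
    """Refuse overly long input before handing it to the regex engine."""
    if len(text) > MAX_INPUT_LENGTH:
        raise ValueError("input too long for regex matching")
    return pattern.search(text)


print(is_error_line("ERROR: disk full"))        # True
print(parse_key_value("timeout = 30"))          # ('timeout', '30')
print(guarded_search(re.compile(r"\d+"), "abc 123").group())  # 123
```

A length cap doesn't make a risky pattern safe, it only limits how much damage it can do, so it works best alongside simpler patterns rather than instead of them.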

©2021 by divya-mohan.com.