The basics of regular expressions — explained simply
Regular expressions (regex for short) are something that all of us, as developers, will encounter at some point.
For many, they remain a form of gibberish. Some developers avoid them simply because they feel too hard to get your head around.
Whatever your case, here is a quick and simple guide to help you decode the mysteries of regular expressions.
What’s the big deal with regular expressions?
A regular expression is essentially a way to search through a string of text.
Why do you need to do this?
The most common reasons for using regular expressions are input validation and advanced find-and-replace operations.
They are also useful for verifying the structure of strings, extraction, replacement, rearrangement and splitting strings into tokens.
When you deal with a lot of data, regular expressions give you the ability to manipulate and transform that data into something meaningful.
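To make those uses concrete, here is a minimal sketch in Python using its built-in re module. The sample log line and the helper name looks_like_date are made up for this illustration, and the patterns use syntax that this guide has not explained yet. The point is only to show what validation, extraction, rearrangement and splitting look like in practice.

```python
import re

# Hypothetical sample data, invented for this illustration.
log_line = "2024-01-15 ERROR disk full; 2024-01-16 WARN disk at 91%"

# Validation: does the whole string have the shape YYYY-MM-DD?
# (This checks the shape only, not whether the date really exists.)
def looks_like_date(text):
    return re.fullmatch(r"\d{4}-\d{2}-\d{2}", text) is not None

# Extraction: pull every date-shaped substring out of the text.
dates = re.findall(r"\d{4}-\d{2}-\d{2}", log_line)

# Replacement and rearrangement: rewrite YYYY-MM-DD as DD/MM/YYYY.
rearranged = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", log_line)

# Splitting into tokens: break the line apart on semicolons and whitespace.
tokens = re.split(r"[;\s]+", log_line)

print(looks_like_date("2024-01-15"))
print(dates)
print(rearranged)
print(tokens)
```

Other languages offer the same four operations under different function names, so treat this as one possible illustration and not the only way to do it.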
If you think about it, a regular expression is a version of if/else conditions (along with a few other conditional parameters), but for strings. Its main task is to identify text that fits a particular pattern.
So without further ado, let’s dive right into the world of regular expressions.
Starting with the basics
There are three major concepts when it comes to regular expressions — alternatives, grouping and quantification.
An alternative (also called alternation) tells the interpreter that the match can be either one pattern or the other. This is signified with a vertical bar: |
For example:
color|colour
This regex will pick up both versions of the spelling.
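Here is a small sketch of that alternation in Python's re module. The sample sentence is made up for the illustration. Note that most regex engines treat a space inside a pattern as a literal character, so the working pattern has no spaces around the bar.

```python
import re

# Alternation: match either spelling.
pattern = re.compile(r"color|colour")

# Hypothetical sample text for this illustration.
text = "The color red and the colour blue."
found = pattern.findall(text)
print(found)

# With spaces around the bar, the alternatives become "color " (trailing
# space) and " colour" (leading space), so the bare word no longer matches.
spaced = re.search(r"color | colour", "colour")
print(spaced)
```

The first print shows both spellings being picked up, while the second prints None because the spaces became part of the pattern.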
Grouping marks part of a regular expression as a single unit, which defines how that part relates to the pattern around it. This is done with a pair of parentheses: ()
Remember the parenthesis rules from your old algebra classes? Well, it's the same concept in regular expressions.
For example:
(3+5)(1+1) = 16
(3+5) is counted as one group and (1+1) is another. Both get evaluated first before the rest of the equation can proceed. This is because the parentheses ( ) act as a scope for the numbers.
In regular expressions, this idea remains the same.
For example:
col(o|ou)r
(o|ou) is the group that gets treated as a single unit. In this case, it says that either o or ou can appear in strings beginning with col and ending in r.
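To see why the parentheses matter, here is a hedged sketch in Python's re module that compares the grouped pattern with the same characters written without parentheses. The word list is invented for the illustration. Without the group, the vertical bar splits the whole pattern in two, so it means "colo" or "our" instead of the intended spellings.

```python
import re

# Grouped: "col", then either "o" or "ou", then "r".
grouped = re.compile(r"col(o|ou)r")

# Ungrouped: the bar now separates "colo" from "our".
ungrouped = re.compile(r"colo|our")

# Hypothetical test words for this illustration.
words = ["color", "colour", "colr", "our"]

# fullmatch requires the entire word to fit the pattern.
grouped_hits = [w for w in words if grouped.fullmatch(w)]
ungrouped_hits = [w for w in words if ungrouped.fullmatch(w)]

# In Python, a parenthesized group also captures the text it matched.
captured = grouped.fullmatch("colour").group(1)

print(grouped_hits)
print(ungrouped_hits)
print(captured)
```

The grouped pattern accepts only the two spellings, while the ungrouped one accepts only "our" from this list, which is rarely what you want.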