Using Regular Expressions in PHP
PHP provides several functions that work with regular expressions, commonly shortened to regex in programming.
Introduction
In general, regular expressions perform a simple but crucial task: finding a text string that meets certain requirements (a particular combination of characters).
A regular expression defines a set of rules and checks whether a piece of text complies with them. Have you ever wondered how a form input knows whether a field has too many or too few characters, whether they are all numbers, or whether the value starts with a character that it shouldn’t?
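As a rough illustration of that kind of input check, here is a minimal sketch. The helper name isValidUsername and the rules it enforces (3 to 16 characters, starting with a letter) are assumptions made up for this example, not requirements from any real form.

```php
<?php
// Hypothetical helper: accepts a username that is 3 to 16 characters long,
// starts with a letter, and contains only letters, digits, or underscores.
function isValidUsername(string $username): bool
{
    // ^ anchors the start of the string and $ the end; the D modifier makes $
    // match only at the very end (not before a trailing newline).
    // preg_match() returns 1 on a match, 0 on no match, and false on error.
    return preg_match('/^[a-zA-Z][a-zA-Z0-9_]{2,15}$/D', $username) === 1;
}

var_dump(isValidUsername('maria_92')); // bool(true)
var_dump(isValidUsername('9lives'));   // bool(false): starts with a digit
var_dump(isValidUsername('ab'));       // bool(false): too short
```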
Functions that use Regex
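The following PHP functions work with regular expressions; an asterisk (*) marks the most commonly used ones. The preg_ functions belong to the PCRE extension, while the filter_ functions accept a pattern through the FILTER_VALIDATE_REGEXP filter: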
- preg_match() *
- preg_match_all()
- preg_grep()
- preg_filter()
- preg_replace() *
- preg_replace_callback()
- preg_replace_callback_array()
- preg_split() *
- preg_quote() *
- preg_last_error()
- preg_last_error_msg()
- filter_input() *
- filter_var()
- filter_var_array()
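To give a feel for the functions marked with an asterisk, here is a short sketch that calls a few of them. All sample strings and variable names are made up for illustration.

```php
<?php
// preg_match(): returns 1 on a match, 0 on no match, false on error.
$hasDigit = preg_match('/[0-9]/', 'room 42'); // 1

// preg_match_all(): returns the number of matches; $matches[0] lists them.
$count = preg_match_all('/[0-9]+/', '3 apples, 12 pears', $matches);
// $count is 2 and $matches[0] is ['3', '12']

// preg_replace(): replaces every match with the replacement text.
$collapsed = preg_replace('/\s+/', ' ', 'too    many   spaces'); // 'too many spaces'

// preg_split(): splits the string wherever the pattern matches.
$parts = preg_split('/\s*,\s*/', 'red , green,blue'); // ['red', 'green', 'blue']

// preg_quote(): escapes regex metacharacters so text can be used literally
// inside a pattern; the second argument is the delimiter to escape as well.
$escaped = preg_quote('example.com', '/'); // 'example\.com'

// filter_var() with FILTER_VALIDATE_REGEXP: returns the value when it
// matches the pattern, or false when it does not.
$zip = filter_var('28013', FILTER_VALIDATE_REGEXP, [
    'options' => ['regexp' => '/^[0-9]{5}$/'],
]); // '28013'
```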
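Regular expressions in PHP are written as text strings enclosed in delimiters, most commonly two forward slashes (/). Any character that is not alphanumeric, a backslash, or whitespace can serve as the delimiter.
For example, the expression [a-zA-Z] must be written as “/[a-zA-Z]/” before it is passed as a parameter to the function we want to use.
Patterns copied from other tools or languages often come without delimiters, and in PHP that causes problems: a pattern that starts with a letter or digit triggers a warning and the function returns false, while a pattern such as [a-zA-Z] is read with [ and ] as bracket-style delimiters, so it no longer means what was intended.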
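The effect of delimiters can be seen in a short sketch like the one below. The sample strings are made up, and the @ operator is used only to keep the expected warning out of the output.

```php
<?php
// Delimited with forward slashes: looks for any single letter.
$ok = preg_match('/[a-zA-Z]/', 'Hello'); // 1

// Another delimiter, #, is handy when the pattern itself contains slashes.
$alsoOk = preg_match('#^https?://#', 'https://regex101.com/'); // 1

// No slashes: PHP treats [ and ] as bracket-style delimiters, so the pattern
// becomes the literal text "a-zA-Z" and no longer matches 'Hello'.
$misread = preg_match('[a-zA-Z]', 'Hello'); // 0

// A pattern starting with a letter has no valid delimiter at all:
// PHP emits a warning and preg_match() returns false.
$failed = @preg_match('abc', 'abc'); // false
```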
Websites that can help you with regular expressions
There are some websites that are very good for creating your regular expressions, as they check what you are selecting as you write the expression. You can also test to see what the expression captures from a string of characters and what it doesn’t.
https://regexr.com/
https://regex101.com/
Some examples of regex syntax:
[a-zA-Z] – matches a single letter from a to z, in lowercase or uppercase.
\s – matches a whitespace character.
x{3,} – matches 3 or more consecutive occurrences of x, where x stands for any character or group.
[a-z]* – matches 0 or more characters between a and z.
\w – matches any word character (a letter, a digit, or an underscore); \W matches any character that is not a word character. For ASCII text, \w is equivalent to [a-zA-Z0-9_], which can be used when an explicit range is preferred.
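The syntax examples above can be tried out with a short sketch such as the following. The sample strings and variable names are invented for illustration only.

```php
<?php
// [a-zA-Z]: at least one letter is present.
$letter = preg_match('/[a-zA-Z]/', '123a'); // 1

// \s: no whitespace in this string, so there is no match.
$space = preg_match('/\s/', 'no_spaces'); // 0

// x{3,}: the first run of three or more x characters is captured.
$threeX = preg_match('/x{3,}/', 'xx-xxxx', $run); // 1, $run[0] is 'xxxx'

// [a-z]*: zero characters is still a valid match, so this succeeds
// with an empty string even though the subject has no letters.
$zeroOrMore = preg_match('/[a-z]*/', '123', $empty); // 1, $empty[0] is ''

// \w and \W split the same string into word and non-word characters.
preg_match_all('/\w/', 'a_1 !', $word);    // $word[0] is ['a', '_', '1']
preg_match_all('/\W/', 'a_1 !', $nonWord); // $nonWord[0] is [' ', '!']
```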
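Therefore, if we wanted to create a regex for a .com domain, it would be:
\w+\.com or [a-zA-Z0-9_]+\.com. The dot is escaped with a backslash because an unescaped dot matches any character. There are also domains with hyphens (-), which we could add to the character class later.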
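One detail worth checking in a pattern like this is the dot, since an unescaped dot matches any character. The sketch below compares both forms; the helper name isDotComDomain, the ^ and $ anchors, and the sample strings are assumptions added for this example.

```php
<?php
// Hypothetical helper: true when $host is one or more word characters
// followed by a literal ".com" and nothing else.
function isDotComDomain(string $host): bool
{
    // \. matches a literal dot; D makes $ match only at the very end.
    return preg_match('/^[a-zA-Z0-9_]+\.com$/D', $host) === 1;
}

var_dump(isDotComDomain('example.com')); // bool(true)
var_dump(isDotComDomain('example.org')); // bool(false)

// With an unescaped dot, any character is accepted in that position.
$loose  = preg_match('/^\w+.com$/', 'exampleXcom');  // 1
$strict = preg_match('/^\w+\.com$/', 'exampleXcom'); // 0
```

Supporting hyphenated domains would mean adding - to the character class, for example [a-zA-Z0-9_-]+.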
You can learn more about the rules at https://www3.ntu.edu.sg/home/ehchua/programming/howto/Regexe.html