7. Regular Expressions

7.1. script [regexp-01]

In the PHP course, we used the following code to illustrate PHP 7 regular expressions:


<?php

// strict type for function parameters
declare (strict_types=1);

// regular expressions in PHP
// extract the different fields from a string
// the pattern: a sequence of digits surrounded by any characters
// we only want to extract the sequence of digits
$pattern = "/(\d+)/";
// compare the string to the pattern
comparePatternToString($pattern, "xyz1234abcd");
comparePatternToString($pattern, "12 34");
comparePatternToString($pattern, "abcd");

// the pattern: a sequence of digits surrounded by any characters
// We want the sequence of digits as well as the fields that follow and precede it
$pattern = "/^(.*?)(\d+)(.*?)$/";
// we match the string against the pattern
comparePatternToString($pattern, "xyz1234abcd");
comparePatternToString($pattern, "12 34");
comparePatternToString($pattern, "abcd");

// the pattern - a date in dd/mm/yy format
$pattern = "/^\s*(\d\d)\/(\d\d)\/(\d\d)\s*$/";
comparePatternToString($pattern, "10/05/97");
comparePatternToString($pattern, "  04/04/01  ");
comparePatternToString($pattern, "5/1/01");

// the pattern - a decimal number
$pattern = "/^\s*([+|-]?)\s*(\d+\.\d*|\.\d+|\d+)\s*$/";
comparePatternToString($pattern, "187.8");
comparePatternToString($pattern, "-0.6");
comparePatternToString($pattern, "4");
comparePatternToString($pattern, ".6");
comparePatternToString($pattern, "4.");
comparePatternToString($pattern, " + 4");

// end
exit;

// --------------------------------------------------------------------------
function comparePatternToString(string $pattern, string $string): void {
  // compares the string $string to the pattern $pattern
  // compare the string to the pattern
  $fields = [];
  $match = preg_match($pattern, $string, $fields);
  // display results
  print "\nResults($pattern,$string)\n";
  if ($match) {
    for ($i = 0; $i < count($fields); $i++) {
      print "fields[$i]=$fields[$i]\n";
    }
  } else {
    print "The string [$string] does not match the pattern [$pattern]\n";
  }
}

We convert this code to JavaScript as follows:


'use strict';

/// Regular expressions in JavaScript
// extract the different fields from a string
// the pattern: a sequence of digits surrounded by any characters
// we only want to extract the sequence of digits
let pattern = /(\d+)/;
// compare the string to the pattern
comparePatternToString(pattern, "xyz1234abcd");
comparePatternToString(pattern, "12 34");
comparePatternToString(pattern, "abcd");

// the pattern: a sequence of digits surrounded by any characters
// We want the sequence of digits as well as the fields that come before and after it
pattern = /^(.*?)(\d+)(.*?)$/;
// we match the string against the pattern
comparePatternToString(pattern, "xyz1234abcd");
comparePatternToString(pattern, "12 34");
comparePatternToString(pattern, "abcd");

// the pattern - a date in dd/mm/yy format
pattern = /^\s*(\d\d)\/(\d\d)\/(\d\d)\s*$/;
comparePatternToString(pattern, "10/05/97");
comparePatternToString(pattern, "  04/04/01  ");
comparePatternToString(pattern, "5/1/01");

// the pattern - a decimal number
pattern = /^\s*([+|-]?)\s*(\d+\.\d*|\.\d+|\d+)\s*$/;
comparePatternToString(pattern, "187.8");
comparePatternToString(pattern, "-0.6");
comparePatternToString(pattern, "4");
comparePatternToString(pattern, ".6");
comparePatternToString(pattern, "4.");
comparePatternToString(pattern, " + 4");

// --------------------------------------------------------------------------
function comparePatternToString(pattern, string) {
  // compares the string [string] to the pattern [pattern]
  console.log(`----------- string=${string}, pattern=${pattern}`)
  // compare the string to the pattern
  const result1 = pattern.exec(string);
  console.log(`comparison with exec=`, result1);
  // another way to do it
  const result2 = string.match(pattern);
  console.log(`comparison with match=`, result2);
}

Comments

  • the PHP and JavaScript code are very similar;
  • line 7: in JavaScript, a regular expression is not a string but an object. Do not put quotes or apostrophes around the expression;
  • lines 41 and 44: [regexp.exec] and [string.match] are two ways to obtain the same result.
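
To illustrate the remark about line 7, here is a minimal sketch that is not part of the original script. It assumes only the standard [RegExp] constructor; the names [literal], [built] and [notARegex] are chosen for the illustration. It shows that a regular expression literal is an object, that the same pattern can be built from a string (in which case the backslash must be doubled), and that putting quotes around the literal yields an ordinary string with no [exec] method:

```javascript
'use strict';

// a regular expression literal: no quotes around it
const literal = /(\d+)/;
// the same pattern built from a string: the backslash must be doubled
const built = new RegExp("(\\d+)");

// a regular expression is an object, whichever way it was created
console.log("typeof literal:", typeof literal);
console.log("literal instanceof RegExp:", literal instanceof RegExp);
console.log("same source text:", literal.source === built.source);

// with quotes around it, we only get a string, which has no [exec] method
const notARegex = "/(\\d+)/";
console.log("typeof notARegex:", typeof notARegex);
console.log("typeof notARegex.exec:", typeof notARegex.exec);
```

The literal form is usually the more readable one when the pattern is known in advance; the constructor is mainly useful when the pattern has to be assembled at run time.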

Execution


[Running] C:\myprograms\laragon-lite\bin\nodejs\node-v10\node.exe -r esm "c:\Data\st-2019\dev\es6\javascript\regexp\regexp-01.js"
type of a regular expression: object
----------- string=xyz1234abcd, pattern=/(\d+)/
comparison with exec= [ '1234',
'1234',
index: 3,
input: 'xyz1234abcd',
groups: undefined ]
comparison with match= [ '1234',
'1234',
index: 3,
input: 'xyz1234abcd',
groups: undefined ]
----------- string=12 34, pattern=/(\d+)/
comparison with exec= [ '12', '12', index: 0, input: '12 34', groups: undefined ]
comparison with match= [ '12', '12', index: 0, input: '12 34', groups: undefined ]
----------- string=abcd, pattern=/(\d+)/
comparison with exec= null
comparison with match= null
----------- string=xyz1234abcd, pattern=/^(.*?)(\d+)(.*?)$/
comparison with exec= [ 'xyz1234abcd',
'xyz',
'1234',
'abcd',
index: 0,
input: 'xyz1234abcd',
groups: undefined ]
comparison with match= [ 'xyz1234abcd',
'xyz',
'1234',
'abcd',
index: 0,
input: 'xyz1234abcd',
groups: undefined ]
----------- string=12 34, pattern=/^(.*?)(\d+)(.*?)$/
comparison with exec= [ '12 34',
'',
'12',
' 34',
index: 0,
input: '12 34',
groups: undefined ]
comparison with match= [ '12 34',
'',
'12',
' 34',
index: 0,
input: '12 34',
groups: undefined ]
----------- string=abcd, pattern=/^(.*?)(\d+)(.*?)$/
comparison with exec= null
comparison with match= null
----------- string=10/05/97, pattern=/^\s*(\d\d)\/(\d\d)\/(\d\d)\s*$/
comparison with exec= [ '10/05/97',
'10',
'05',
'97',
index: 0,
input: '10/05/97',
groups: undefined ]
comparison with match= [ '10/05/97',
'10',
'05',
'97',
index: 0,
input: '10/05/97',
groups: undefined ]
----------- string=  04/04/01  , pattern=/^\s*(\d\d)\/(\d\d)\/(\d\d)\s*$/
comparison with exec= [ '  04/04/01  ',
'04',
'04',
'01',
index: 0,
input: '  04/04/01  ',
groups: undefined ]
comparison with match= [ '  04/04/01  ',
'04',
'04',
'01',
index: 0,
input: '  04/04/01  ',
groups: undefined ]
----------- string=5/1/01, pattern=/^\s*(\d\d)\/(\d\d)\/(\d\d)\s*$/
comparison with exec= null
comparison with match= null
----------- string=187.8, pattern=/^\s*([+|-]?)\s*(\d+\.\d*|\.\d+|\d+)\s*$/
comparison with exec= [ '187.8',
'',
'187.8',
index: 0,
input: '187.8',
groups: undefined ]
comparison with match= [ '187.8',
'',
'187.8',
index: 0,
input: '187.8',
groups: undefined ]
----------- string=-0.6, pattern=/^\s*([+|-]?)\s*(\d+\.\d*|\.\d+|\d+)\s*$/
comparison with exec= [ '-0.6', '-', '0.6', index: 0, input: '-0.6', groups: undefined ]
comparison with match= [ '-0.6', '-', '0.6', index: 0, input: '-0.6', groups: undefined ]
----------- string=4, pattern=/^\s*([+|-]?)\s*(\d+\.\d*|\.\d+|\d+)\s*$/
comparison with exec= [ '4', '', '4', index: 0, input: '4', groups: undefined ]
comparison with match= [ '4', '', '4', index: 0, input: '4', groups: undefined ]
----------- string=.6, pattern=/^\s*([+|-]?)\s*(\d+\.\d*|\.\d+|\d+)\s*$/
comparison with exec= [ '.6', '', '.6', index: 0, input: '.6', groups: undefined ]
comparison with match= [ '.6', '', '.6', index: 0, input: '.6', groups: undefined ]
----------- string=4., pattern=/^\s*([+|-]?)\s*(\d+\.\d*|\.\d+|\d+)\s*$/
comparison with exec= [ '4.', '', '4.', index: 0, input: '4.', groups: undefined ]
comparison with match= [ '4.', '', '4.', index: 0, input: '4.', groups: undefined ]
----------- string= + 4, pattern=/^\s*([+|-]?)\s*(\d+\.\d*|\.\d+|\d+)\s*$/
comparison with exec= [ ' + 4', '+', '4', index: 0, input: ' + 4', groups: undefined ]
comparison with match= [ ' + 4', '+', '4', index: 0, input: ' + 4', groups: undefined ]

For a pattern without the [g] flag, as is the case here, the [regexp.exec] and [string.match] methods return the same results:

  • [null] if the string does not match the pattern;
  • an array t if there is a match, with:
    • t[0]: the part of the string matching the whole pattern;
    • t[1]: the part of the string matching the first pair of parentheses in the pattern;
    • t[2]: the part of the string matching the second pair of parentheses in the pattern;
    • t.index: the position in the string at which the match starts;
    • t.input: the entire string in which the pattern was searched for;
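
As an illustration of how this array can be used, here is a minimal sketch that is not part of the original script. The helper [extractDateFields] and the constant [datePattern] are hypothetical names introduced for the example; the pattern itself is the dd/mm/yy pattern of [regexp-01]:

```javascript
'use strict';

// the date pattern used in [regexp-01]
const datePattern = /^\s*(\d\d)\/(\d\d)\/(\d\d)\s*$/;

// hypothetical helper: returns the three fields of a dd/mm/yy date,
// or null if the string does not match the pattern
function extractDateFields(string) {
  const t = datePattern.exec(string);
  // [exec] returns null when there is no match
  if (t === null) {
    return null;
  }
  // t[0] is the whole match, t[1] to t[3] are the three parenthesized groups
  const [, day, month, year] = t;
  // [index] and [input] are properties of the array, not numbered elements
  return { day, month, year, index: t.index, input: t.input };
}

console.log(extractDateFields("  04/04/01  "));
console.log(extractDateFields("5/1/01"));
```

Testing the result against [null] before reading t[1], t[2], ... avoids a TypeError when the string does not match.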

7.2. script [regexp-02]

Sometimes you don’t want to extract elements from the tested string, but only want to know if it matches the pattern:


'use strict';

/// regular expressions in JavaScript
// extract the different fields from a string
// the pattern: a sequence of digits surrounded by any characters
// we only want to extract the sequence of digits
let pattern = /\d+/;
console.log("Type of a regular expression: ", typeof (pattern));
// compare the string to the pattern
comparePatternToString(pattern, "xyz1234abcd");
comparePatternToString(pattern, "12 34");
comparePatternToString(pattern, "abcd");

// the pattern: a sequence of digits surrounded by arbitrary characters
// We want the sequence of digits as well as the fields that follow and precede it
pattern = /^.*?\d+.*?$/;
// we compare the string to the pattern
comparePatternToString(pattern, "xyz1234abcd");
comparePatternToString(pattern, "12 34");
comparePatternToString(pattern, "abcd");

// the pattern - a date in dd/mm/yy format
pattern = /^\s*\d\d\/\d\d\/\d\d\s*$/;
comparePatternToString(pattern, "10/05/97");
comparePatternToString(pattern, "  04/04/01  ");
comparePatternToString(pattern, "5/1/01");

// the pattern - a decimal number
pattern = /^\s*[+|-]?\s*\d+\.\d*|\.\d+|\d+\s*$/;
comparePatternToString(pattern, "187.8");
comparePatternToString(pattern, "-0.6");
comparePatternToString(pattern, "4");
comparePatternToString(pattern, ".6");
comparePatternToString(pattern, "4.");
comparePatternToString(pattern, " + 4");

// --------------------------------------------------------------------------
function comparePatternToString(pattern, string) {
  // test
  const matches = pattern.test(string);
  // compare the string [string] to the pattern [pattern]
  console.log(`----------- string=${string}, pattern=${pattern}, matches=${matches}`);
}

Comments

  • [regexp-02] reuses the code from [regexp-01] with the following differences:
    • we no longer want to extract elements from the tested string. Therefore, we have removed the parentheses from the regular expressions used;
    • line 40: we use the [regexp.test] method to determine whether a string matches a regular expression;
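
One caveat about removing the parentheses, offered as an observation rather than as part of the original course: in the decimal number pattern, the parentheses around the three alternatives also limited the scope of the [|] operator. Without them, [^] applies only to the first alternative and [$] only to the last one, so strings that are not decimal numbers can pass the test. A non-capturing group [(?:...)] keeps the grouping without extracting anything. The minimal sketch below compares the two forms; the names [loose] and [grouped] and the two extra test strings are chosen for the illustration:

```javascript
'use strict';

// pattern of [regexp-02]: the three alternatives are no longer grouped
const loose = /^\s*[+|-]?\s*\d+\.\d*|\.\d+|\d+\s*$/;
// same pattern with a non-capturing group around the alternatives
const grouped = /^\s*[+|-]?\s*(?:\d+\.\d*|\.\d+|\d+)\s*$/;

// two strings from the script, then two strings that are not decimal numbers
for (const string of ["187.8", " + 4", "abc.6xyz", "x 12"]) {
  console.log(`string=${string}, loose=${loose.test(string)}, grouped=${grouped.test(string)}`);
}
```

Both patterns accept the strings tested in [regexp-02], which is why the difference does not show up in its results; only the grouped form rejects "abc.6xyz" and "x 12".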

The results of the execution are as follows:


[Running] C:\myprograms\laragon-lite\bin\nodejs\node-v10\node.exe -r esm "c:\Data\st-2019\dev\es6\cours\regexp\regexp-02.js"
Type of a regular expression:  object
----------- string=xyz1234abcd, pattern=/\d+/, matches=true
----------- string=12 34, pattern=/\d+/, matches=true
----------- string=abcd, pattern=/\d+/, matches=false
----------- string=xyz1234abcd, pattern=/^.*?\d+.*?$/, matches=true
----------- string=12 34, pattern=/^.*?\d+.*?$/, matches=true
----------- string=abcd, pattern=/^.*?\d+.*?$/, matches=false
----------- string=10/05/97, pattern=/^\s*\d\d\/\d\d\/\d\d\s*$/, matches=true
----------- string=  04/04/01  , pattern=/^\s*\d\d\/\d\d\/\d\d\s*$/, matches=true
----------- string=5/1/01, pattern=/^\s*\d\d\/\d\d\/\d\d\s*$/, matches=false
----------- string=187.8, pattern=/^\s*[+|-]?\s*\d+\.\d*|\.\d+|\d+\s*$/, matches=true
----------- string=-0.6, pattern=/^\s*[+|-]?\s*\d+\.\d*|\.\d+|\d+\s*$/, matches=true
----------- string=4, pattern=/^\s*[+|-]?\s*\d+\.\d*|\.\d+|\d+\s*$/, matches=true
----------- string=.6, pattern=/^\s*[+|-]?\s*\d+\.\d*|\.\d+|\d+\s*$/, matches=true
----------- string=4., pattern=/^\s*[+|-]?\s*\d+\.\d*|\.\d+|\d+\s*$/, matches=true
----------- string= + 4, pattern=/^\s*[+|-]?\s*\d+\.\d*|\.\d+|\d+\s*$/, matches=true

[Done] exited with code=0 in 0.269 seconds