Saturday, November 05, 2011

Regular Expressions in Bash

Traditionally, external tools such as grep, sed, awk and perl have been used to match a string against a regular expression, but the Bash shell has this functionality built into it as well!

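In Bash (version 3.0 and later), the =~ operator inside a [[ ... ]] conditional matches the string on the left against an extended regular expression on the right. The conditional returns 0 if the string matches the pattern, 1 if it does not, and 2 if the regular expression is syntactically invalid. Capturing groups are saved in the array variable BASH_REMATCH: element 0 holds the portion of the string matched by the entire expression, and element n holds the text matched by the nth parenthesized group.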
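As a minimal sketch of those return values, the snippet below runs one pattern against a matching string, a non-matching string, and then tries a deliberately malformed pattern. The helper name match_status, the sample strings, and the pattern are assumptions made up for illustration; they are not part of the script that follows.

```shell
#!/bin/bash

# match_status is a hypothetical helper: it prints the exit status of
# [[ string =~ regex ]] for the given string ($1) and regex ($2).
match_status() {
    local status=0
    [[ $1 =~ $2 ]] || status=$?
    echo "$status"
}

# Illustrative pattern: lowercase key, an equals sign, lowercase value.
pattern='^([a-z]+)=([a-z]+)$'

echo "match:         $(match_status 'foo=bar' "$pattern")"
echo "no match:      $(match_status 'foobar' "$pattern")"
# '[a-z' is an unterminated bracket expression, so it is not a valid regex.
echo "invalid regex: $(match_status 'foo=bar' '[a-z')"

# BASH_REMATCH is filled in by the most recent successful match
# in the current shell.
if [[ 'foo=bar' =~ $pattern ]]; then
    key=${BASH_REMATCH[1]}
    value=${BASH_REMATCH[2]}
    echo "key=$key value=$value"
fi
```

Storing the pattern in a variable and expanding it unquoted on the right of =~ keeps the parentheses and other regex metacharacters away from the shell's own parsing.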

The following script matches a string against a regex and prints out the capturing groups:

#!/bin/bash

if [ $# -lt 2 ]
then
    echo "Usage: $0 regex string" >&2
    exit 1
fi

regex=$1
input=$2

if [[ $input =~ $regex ]]
then
    echo "$input matches regex: $regex"

    # print out capturing groups
    for (( i=0; i<${#BASH_REMATCH[@]}; i++ ))
    do
        echo -e "\tGroup[$i]: ${BASH_REMATCH[$i]}"
    done
else
    echo "$input does not match regex: $regex"
fi

Example usage:
sharfah@starship:~> matcher.sh '(.*)=(.*)' foo=bar
foo=bar matches regex: (.*)=(.*)
    Group[0]: foo=bar
    Group[1]: foo
    Group[2]: bar
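
One detail of the script worth illustrating is that $regex is left unquoted on the right-hand side of =~. In Bash 3.2 and later, quoting that operand makes Bash treat it as a literal string rather than as a regular expression. The sketch below shows the difference; the variable names and sample values are assumptions chosen for illustration.

```shell
#!/bin/bash

# Illustrative values: as a regex, the dot in 'a.c' matches any character.
regex='a.c'
input='abc'

unquoted=no
quoted=no

# Unquoted: $regex is interpreted as a regular expression, so "abc" matches.
[[ $input =~ $regex ]] && unquoted=yes

# Quoted: "$regex" is matched literally, and the text "a.c" does not
# occur in "abc", so there is no match.
[[ $input =~ "$regex" ]] && quoted=yes

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

Quoting is still useful when part of the pattern should be taken literally, for example a file name that contains dots.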