Regex lookarounds tutorial

Regex lookarounds are a common way to find a particular phrase anywhere in a text based on the text that surrounds it.

There are two types of lookarounds:

  • Lookbehind, which matches a phrase that is preceded by a user-specified text.

    • Positive lookbehind is written as (?<=a)something and can be combined with any other regex construct.

      The above expression matches any "something" that is immediately preceded by "a". The "a" itself is not part of the match.

    • Negative lookbehind is written as (?<!a)something and matches a "something" that is not immediately preceded by "a".

  • Lookahead, which matches a phrase that is followed by a user-specified text.

    • Positive lookahead is written as something(?=a) and matches a "something" that is immediately followed by "a".

    • Negative lookahead is written as something(?!a) and matches a "something" that is not immediately followed by "a".
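
The four lookaround forms above can be tried out with a short sketch. It uses Python's re module as a stand-in (an assumption; the same patterns should behave alike in other engines that support lookarounds), and the helper name starts and the two sample strings are made up for illustration.

```python
import re

def starts(pattern, text):
    """Return the start index of every match of pattern in text."""
    return [m.start() for m in re.finditer(pattern, text)]

# Hypothetical sample strings: "something" preceded or followed by "a" or "b".
behind = "asomething bsomething"
ahead = "somethinga somethingb"

positive_lookbehind = starts(r"(?<=a)something", behind)  # preceded by "a"
negative_lookbehind = starts(r"(?<!a)something", behind)  # not preceded by "a"
positive_lookahead = starts(r"something(?=a)", ahead)     # followed by "a"
negative_lookahead = starts(r"something(?!a)", ahead)     # not followed by "a"

print(positive_lookbehind)  # [1]
print(negative_lookbehind)  # [12]
print(positive_lookahead)   # [0]
print(negative_lookahead)   # [11]

# Lookarounds are zero-width: the "a" is checked but never included in the match.
matched = re.search(r"(?<=a)something", behind).group()
print(matched)  # something
```

Each pattern finds exactly one of the two occurrences, and the matched text is always just "something".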

For example:

Given the text below, we need to extract the value that corresponds to each name:

  • Provider: .NET Runtime
  • Level 2
  • Task 0
  • Keywords 0x80000000000000

We do this by creating one regex for each name:

  • (?<=Provider:\s).+(?=\n)

    Matches one or more characters (".+") that are preceded by "Provider:" plus a whitespace character ("\s") and followed by a newline character ("\n"). The backslash "\" introduces escape sequences such as "\s" and "\n".

  • (?<=Level\s).+(?=\n)

    Matches one or more characters (".+") that are preceded by "Level" plus a whitespace character and followed by a newline character. This expression can also be written as (?<=Level\s)\d+(?=\n), where "\d+" matches one or more (+) decimal digits.

  • (?<=Task\s).+(?=\n)

    Likewise matches one or more characters that are preceded by "Task" plus a whitespace character and followed by a newline character.

  • (?<=Keywords\s).+(?=)

    Here ".+" matches everything up to the end of the string. We don't require a newline character after the value because there is none. The empty lookahead (?=) always succeeds, so it can be omitted.

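    The expression can also be written as (?<=Keywords\s)\d+x\d+(?=), which matches one or more decimal digits, the character "x", and then one or more decimal digits. The "x" must not be escaped, because "\x" starts a hexadecimal character escape. However, this works only if we know the structure of the value; a hexadecimal value that contains the digits A-F would need [0-9A-Fa-f]+ in place of the second "\d+".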
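
The expressions above can be checked against the sample text with a minimal sketch. It uses Python's re module rather than the .NET engine (an assumption; Python additionally requires fixed-width lookbehinds, which all of these are), and it assumes the four lines are joined by "\n" with no trailing newline. The names text, patterns and values are illustrative only.

```python
import re

# The sample text, assumed to be newline-separated with no trailing newline.
text = (
    "Provider: .NET Runtime\n"
    "Level 2\n"
    "Task 0\n"
    "Keywords 0x80000000000000"
)

# One pattern per name. The empty lookahead (?=) always succeeds,
# so it is left out of the Keywords pattern here.
patterns = {
    "Provider": r"(?<=Provider:\s).+(?=\n)",
    "Level": r"(?<=Level\s).+(?=\n)",
    "Task": r"(?<=Task\s).+(?=\n)",
    "Keywords": r"(?<=Keywords\s).+",
}

values = {name: re.search(p, text).group() for name, p in patterns.items()}
for name, value in values.items():
    print(f"{name} -> {value}")

# Stricter variants that rely on knowing the structure of the value.
level = re.search(r"(?<=Level\s)\d+(?=\n)", text).group()
keywords = re.search(r"(?<=Keywords\s)\d+x\d+", text).group()
print(level)     # 2
print(keywords)  # 0x80000000000000
```

Because the lookarounds are zero-width, each result contains only the value, without the name in front of it or the newline after it.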

Use this free .NET RegEx tester to test your expressions.