How to: Strip Invalid Characters from a String

The following example uses the static Regex.Replace method to strip invalid characters from a string.

Example

You can use the CleanInput method defined in this example to strip potentially harmful characters that have been entered into a text field that accepts user input. In this case, CleanInput strips out all nonalphanumeric characters except periods (.), at symbols (@), and hyphens (-), and returns the remaining string. However, you can modify the regular expression pattern so that it strips out any characters that should not be included in an input string.

using System;
using System.Text.RegularExpressions;

public class Example
{
static string CleanInput(string strIn)
{
// Replace invalid characters with empty strings.
try {
return Regex.Replace(strIn, @"[^\w\.@-]", "",
RegexOptions.None, TimeSpan.FromSeconds(1.5));
}
// If we timeout when replacing invalid characters,
// we should return Empty.
catch (RegexMatchTimeoutException) {
return String.Empty;
}
}
}

Imports System.Text.RegularExpressions

Module Example
Function CleanInput(strIn As String) As String
' Replace invalid characters with empty strings.
Try
Return Regex.Replace(strIn, "[^\w\.@-]", "")
' If we timeout when replacing invalid characters,
' we should return String.Empty.
Catch e As RegexMatchTimeoutException
Return String.Empty
End Try
End Function
End Module


The regular expression pattern [^\w\.@-] matches any character that is not a word character, a period, an @ symbol, or a hyphen. A word character is any letter, decimal digit, or punctuation connector such as an underscore. Any character that matches this pattern is replaced by String.Empty, which is the string defined by the replacement pattern. To allow additional characters in user input, add those characters to the character class in the regular expression pattern. For example, the regular expression pattern [^\w\.@-\\%] also allows a percentage symbol and a backslash in an input string.