De-entitizing Special XML Characters

There are several tricky issues around the serialization of special XML characters inside SQL Server 2005. This was highlighted to me earlier today when I had a question about how to output a URL value that caontains an ampersand ('&') character from the XML data type. Consider the following query:

 declare @x xml
set @x=''
select @x.query( '"https://www.example.com&asp"' )

When we execute this, it returns the following:

 https://www.example.com & asp

Notice that the ampersand character is still entitized, making this an invalid URL! Since this is a simple string value, we would expect that it be de-entitized on output. But, what you need to understand here is that the output of an XML data type query method is always another XML data type. So, even though the output appears to be a simple string value, it is actually a top-level XML text node and thus is required to have entitized contents. The correct way to output this URL is to use the XML data type value method. Taking the query from above and replacing the query method with an equivalent call to the value method, we get:

 declare @x xml
set @x=''
select @x.value( '"https://www.example.com&asp"', 'nvarchar(50)' )

This will now return us a valid URL value:

 https://www.example.com&asp
-
Disclaimer:
This posting is provided “AS IS” with no waranties, and confers no rights.
Use of included script samples are subject to the terms specified at https://www.microsoft.com/info/cpyright.htm.