Database Programming: A New String Concatenation Contender for SQL Server 2008

UPDATED 11:11p 7 September 2009: Based on my latest exchange with Adam (see the comments for this post), compound operators should be avoided for string concatenation due to the ORDER BY bug discused in KB 287515. The XML variant below is the preferred approach for string concatenation, but the compound operator remains a viable alternative in other scenarios.

UPDATED 7 September 2009: See Adam Machanic's comments on this post for a link to a discussion of a SQL Server 2008 ORDER BY bug (KB 287515) which could influence the sequencing of your results under certain scenarios. The XML variant might be better after all (the ORDER BY issue is something of an edge case (functions applied to ORDER BY columns) and should be repaired eventually, but the current risk is non-trivial); compound operators are still of use in scenarios where sequencing of results is not an issue (aggregates, etc.).

Back in March of last year, we had a discussion which concluded here (and was briefly revisited here) regarding string concatenation techniques. I’ve uncovered a new SQL Server 2008-based contender which I’d like to share.

This approach is based on the new-in-SQL-Server-2008 += compound operator, which allows all order of iterative processing (copied from linked BOL article):

Operator Link to more information Action

+=

+= (Add EQUALS) (Transact-SQL)

Adds some amount to the original value and sets the original value to the result.

-=

-= (Subtract EQUALS) (Transact-SQL)

Subtracts some amount from the original value and sets the original value to the result.

*=

*= (Multiply EQUALS) (Transact-SQL)

Multiplies by an amount and sets the original value to the result.

/=

/= (Divide EQUALS) (Transact-SQL)

Divides by an amount and sets the original value to the result.

%=

%= (Modulo EQUALS) (Transact-SQL)

Divides by an amount and sets the original value to the modulo.

&=

&= (Bitwise AND EQUALS) (Transact-SQL)

Performs a bitwise AND and sets the original value to the result.

^=

^= (Bitwise Exclusive OR EQUALS) (Transact-SQL)

Performs a bitwise exclusive OR and sets the original value to the result.

|=

|= (Bitwise OR EQUALS) (Transact-SQL)

Performs a bitwise OR and sets the original value to the result.

Since the use of compound operators restricts us to SQL Server 2008, the snippet below exploits the also-new-in-SQL-Server-2008 row constructor capability (scroll to Example B in the latest BOL examples for the INSERT statement). In my testing, nine distinct INSERT statements each consumed 6% of the queries resources; the two statements below were each 9% of the total query cost. This is a pretty impressive economy for such a small bed of data, so those of you writing long seed scripts should definitely take this into account!

Two uses of row constructors as well as the use of the compound operator are highlighted in the code snippet below (the previous XML-based contender is also included for reference):

-- lay the groundwork

DECLARE @Results NVARCHAR(4000)

-- drop and create the tables
IF EXISTS (SELECT 1 FROM sys.tables WHERE name = 'Parent')
    DROP TABLE Parent

CREATE TABLE Parent
(
    ParentID INT
   ,ParentString VARCHAR(100)
)
IF EXISTS (SELECT 1 FROM sys.tables WHERE name = 'Child')
    DROP TABLE Child

CREATE TABLE Child
(
    ChildID INT
   ,ParentID INT
)

-- populate the tables
-- since this is SQL Server 2008 only, we can use row constructors
INSERT Parent VALUES
    (1, 'Parent 1 String'),
    (2, 'Parent 2 String'),
    (3, 'Parent 3 String')

INSERT Child VALUES
    (1, 1),
    (2, 1),
    (2, 2),
    (2, 3),
    (3, 1),
    (3, 3)

-- show the data

SELECT  Child.ChildID, ISNULL(Parent.ParentString, '')
FROM    Child
INNER   JOIN Parent
ON      Child.ParentID = Parent.ParentID
ORDER BY
        Child.ChildID, Child.ParentID

-- Concatenate a string via the new compound operator
-- Pivot Parent values into 1 column for 1 base row

SET @Results = N''

SELECT  @Results += ',' + Parent.ParentString
FROM    Child
JOIN    Parent
ON      Child.ParentID = Parent.ParentID
WHERE   Child.ChildID = 2 -- must specify 1 base row. could be a correlated subquery
ORDER BY
        Child.ParentID

-- Two ways to display the result without the leading comma
SELECT RIGHT(@Results,LEN(@Results)-1)
SELECT STUFF(@Results,1,1,'')

-- Concatenate via XML
-- Pivot Parent values into 1 column for 1 base row
SELECT  STUFF(( SELECT [text()]= ',' + ISNULL(Parent.ParentString, '') + ''
FROM    Child
JOIN    Parent
ON      Child.ParentID = Parent.ParentID
WHERE   Child.ChildID = 2 -- must specify 1 base row. could be a correlated subquery
ORDER BY
        Child.ParentID
FOR XML PATH('')), 1,1, '') AS Parent_CSV

The query plans for each query that the XML-based call includes a statistically insignificant (at this volume) UDX call for the XML instantiation:

Query Plan For Compound Operator Variant

query plan for compound operator variant

 

Query Plan For XML Variant

query plan for XML variant

For this reason, I’d expect the compound operator-based query to scale better with a large volume of data. Once one becomes familiar with compound operators, which are commonplace in nGL languages, this construction strikes me as both more maintainable and more elegant.

On a related note, I continue to find the subtle programmability enhancements to SQL Server 2008 to be very cool.

Of course, these are my biases. What do you think?

-wp


this copyrighted material was originally posted at http://blogs.technet.com/wardpond

the author and his employer are pleased to provide this content for you at that site, and via rss, free of charge and without advertising.

the author welcomes and appreciates links to and citations of his work. however, if you are viewing the full text of this article at any other website, be aware that its author does not endorse and is not compensated by any advertising or access fees you may be subjected to outside the original web and rss sites