Thursday, August 20, 2009

Using CTE


This post is a part of Mohd Nizamuddin very interesting article
Sending multiple rows to the Database from an Application
Albert

Table valued function using Number List



First I will explain the pieces of code, which form the building blocks of the final table valued function.
We need to create a number list using the CTE as below
      ;WITH
            L0 AS(SELECT 1 AS c UNION ALL SELECT 1),
            L1 AS(SELECT 1 AS c FROM L0 AS A, L0 AS B),
            L2 AS(SELECT 1 AS c FROM L1 AS A, L1 AS B),
            L3 AS(SELECT 1 AS c FROM L2 AS A, L2 AS B),
            L4 AS(SELECT 1 AS c FROM L3 AS A, L3 AS B),
            Numbers AS(SELECT ROW_NUMBER() OVER(ORDER BY c) AS Number FROM L4)
      SELECT * FROM Numbers

This CTE is creating the list of numbers from 1 to POWER(POWER(POWER(POWER(2, 2), 2), 2), 2), i.e. until 65536.
Now Consider the below code snippet, where @list and @delim variables have been assigned.

 DECLARE
 @list NVARCHAR(MAX), @delim NCHAR(1)
 SELECT @list = 'aaa,bbbbb,cccc,dddd', @delim = ','
 
                ;WITH
              L0 AS(SELECT 1 AS
c UNION ALL SELECT 1),
              L1 AS(SELECT 1 AS c FROM L0 AS A, L0 AS B),
              L2 AS(SELECT 1 AS c FROM L1 AS A, L1 AS B),
              L3 AS(SELECT 1 AS c FROM L2 AS A, L2 AS B),
              L4 AS(SELECT 1 AS c FROM L3 AS A, L3 AS B),
              Numbers AS(SELECT ROW_NUMBER() OVER(ORDER BY c) AS Number FROM L4)
         SELECT
         @list List,
         SUBSTRING(@list, Number, CHARINDEX(@delim,
@list + @delim, Number) - Number) AS   Value,
         Number AS StartingFrom,
         CHARINDEX(@delim, @list + @delim, Number) AS DelimeterPosition
         FROM Numbers
         WHERE Number <= CONVERT(INT, LEN(@list))
         AND SUBSTRING(@delim + @list, Number,
1) = @delim

The SUBSTRING statement, cuts characters from @list starting from character position (1, 5, 11 and 16).
SUBSTRING(@list, Number, CHARINDEX(@delim, @list + @delim, Number) - Number)

The number of characters to be cut is decided by CHARINDEX which will
return 4, 10, 15, 20 in each row, where it finds the delimiter character.
CHARINDEX(@delim, @list + @delim, Number) - Number
The above SELECT only works until the number of characters present in the
@list variable due to the condition
Number <= CONVERT(INT, LEN(@list))

The duplicate values are filtered out from the output list by the "WHERE" condition created
using the SUBSTRING function which will only return a value when it finds the delimiter
SUBSTRING(@delim + @list, Number, 1) = @delim

The output of the code snippet above would be:









ListValueStarting FromDelimiter Position
aaa,bbbbb,cccc,ddddaaa14
aaa,bbbbb,cccc,ddddbbbbb510
aaa,bbbbb,cccc,ddddcccc1115
aaa,bbbbb,cccc,dddddddd1620


Table valued function using Numbered List: Implementation
Now combining all the above explained pieces of SQL, we create our
table valued function which will parse the string and return a table having two columns viz. ID and Data.
      CREATE FUNCTION [dbo].[TableFormDelimetedString]
      (
            @param      NVARCHAR(MAX),
            @delimeter  NCHAR(1)
      )
      RETURNS @tmp TABLE
      (
            ID      INT      
        IDENTITY    
          (1,    
          1),
            Data        Varchar(MAX)
      )
      BEGIN
 
            ;WITH
                  L0   AS(SELECT 1 AS c   UNION   ALL   SELECT   1),
                  L1   AS(SELECT 1 AS c FROM L0 AS   A,   L0   AS   B),
                  L2   AS(SELECT 1 AS c FROM L1 AS   A,   L1   AS   B),
                  L3   AS(SELECT 1 AS c FROM L2 AS   A,   L2   AS   B),
                  L4   AS(SELECT 1 AS c FROM L3 AS   A,   L3   AS   B),
                  Numbers AS(SELECT ROW_NUMBER() OVER(ORDER BY c) AS Number FROM L4)
            INSERT INTO
               @tmp (Data)
            SELECT
             LTRIM(RTRIM(CONVERT(NVARCHAR(4000),
             SUBSTRING(@param, Number,
             CHARINDEX(@delimeter, @param + @delimeter, Number) - Number)
          ))) AS Value
         FROM   Numbers
         WHERE    Number <=
           CONVERT(INT,     LEN(@param))
          AND  SUBSTRING(@delimeter+ @param, Number, 1) = @delimeter
     RETURN
    END
Table valued function using Numbered List: Usage
So if we now invoke the above function like
SELECT * FROM [TableFormDelimetedString]('Andy:Roger:Thomas:Rob:Victor',':')

We will obtain the following result set:








IDData
1Andy
2Roger
3Thomas
4Rob
5Victor

Table valued function using recursive CTE



Here again I will first explain the pieces of code,which form
the building blocks of the final table valued function.
As we know, in a recursive CTE, we have one anchor part and one recursive part.
But if we create a CTE having only the anchor part it would look something like
     DECLARE
      @list NVARCHAR(MAX), @delim NCHAR(1) 
      SELECT @list = 'aaa,bbbbb,cccc,dddd', @delim
= ','   
      ;WITH CTETable (start, stop) AS  
      (
       SELECT start = CONVERT(bigint, 1), stop = CHARINDEX(@delim, @list +@delim, 1)
      )
       SELECT @list List, LTRIM(RTRIM(SUBSTRING(@list,
start,
              CASE
              WHEN
stop > 0
              THEN
stop - start
              ELSE
0
              END
               )))
AS Data
         start AS StartingFrom, stop
AS   DelimiterPosition
       FROM CTETable
The output of the SQL above will be like



ListValueStarting FromDelimiter Position
aaa,bbbbb,cccc,ddddaaa14



Now by adding a recursive member to the above CTE, which iterates over the stop variable, the SQL looks like
    DECLARE
    @list NVARCHAR(MAX),
    @delim NCHAR(1)
 
    SELECT @list = 'aaa,bbbbb,cccc,dddd', @delim = ',' 
    ;WITH CTETable (start, stop)  
        AS 
        (
        SELECT start = CONVERT(bigint, 1), stop = CHARINDEX(@delim, @list +@delim, 1)
        UNION ALL       -- added for recursive part of CTE
        SELECT start = stop + 1, stop = CHARINDEX(@delim, @list +  
        @delim, stop+ 1) FROM CTETable WHERE
stop > 0 -- added for recursive part of CTE
        )
        SELECT @list List, LTRIM(RTRIM(SUBSTRING(@list,
start,
        CASE
        WHEN stop > 0
        THEN stop - start
        ELSE 0
        END
        ))) AS Data
        start AS StartingFrom, stop AS DelimiterPosition
        FROM CTETable
        WHERE stop > 0

And gives the following result set






ListValueStarting FromDelimiter Position
aaa,bbbbb,cccc,ddddaaa14
aaa,bbbbb,cccc,ddddbbbbb510
aaa,bbbbb,cccc,ddddcccc1115
aaa,bbbbb,cccc,dddddddd1620


Table valued function using recursive CTE: Implementation
Finally we create a table valued function from the above code blocks, which looks like
      CREATE FUNCTION [dbo].[TableFormDelimetedStringWithoutNumberList]
      (@list NVARCHAR(MAX),
      @delim  NCHAR(1)   = ','
      )
       RETURNS @tmp TABLE
       (
        ID  INT IDENTITY   (1, 1),
        Data Varchar(MAX)
       )
      BEGIN
       ;WITH CTETable (start, stop)
        AS
        (
         SELECT start = CONVERT(bigint, 1),
           stop = CHARINDEX(@delim, @list + @delim)
         UNION ALL   -- added for recursive part of CTE
         SELECT start = stop + 1,
         stop = CHARINDEX(@delim, @list + @delim, stop + 1) -- added for recursive part of CTE
         FROM CTETable
         WHERE  stop > 0
        )
       INSERT INTO @tmp (Data)
        SELECT LTRIM(RTRIM(SUBSTRING(@list,
             start,
            CASE
            WHEN stop > 0
            THEN
            stop - start
            ELSE
            0
            END))) AS Data
            FROM CTETable
        WHERE stop > 0
     RETURN
    END
Table valued function using recursive CTE: Usage
So if we now invoke the above function like
     SELECT * FROM[TableFormDelimetedStringWithoutNumberList]('Andy:Roger:Thomas:Rob:Victor',':')
We will obtain the following result set:







IDData
1Andy
2Roger
3Thomas
4Rob
5Victor
Why I like these two implementations is because the looping has been handled
by the SQL server database engine itself, which would definitely be more efficient
than explicit SQL looping code written by a developer.

2 comments:

  1. Congratulations for the premier.
    Very interesting stuff...Why did you reach this topic? used it in the "real" world? did you checked its performance against other method/s?

    ReplyDelete
  2. Sometimes the demands of our customers
    is slipping beyond the bounds of reasonable,
    but for our brother programmer –
    this is not a barrier, so that's how it was ...

    Client gave us their database
    (may God give him good health), ...
    There was an interesting table.
    In this table column was - Street. Customer or its employees (not the point) in this field [Street] pasted the entire address (City, street number, street, POB and many other interesting types of mutations of the city with the street or even the company name and street number - a curious thing ..) .
    It was necessary to seek methods of decoding .
    Well I will not awaken your memories Diego and move on to a case.

    It was then that I found this interesting article.
    My tests gave no bad results:
    The first method can be used freely as a method of Split on the strings at DB level ( set, select, insert and updates statements).
    The second method is faster
    - Recursive, but in case of a large amount of data, this method limits the maximum number of recursions to 100 after my attempts to change the number of recursions and overloading the server will still get a maximum of 100 recursions.
    And not any client will give permissions to set the maximum number of recursions. So i found the recursion less useful but more powerful on small amount of data.
    Both methods will work in 2005, not later. But more than anything here interesting operator WITH and CTE method of POW - very original solution)

    ReplyDelete