Tuesday, 3 February 2015

SQL Query: Convert Row data into Column

Convert row data to column in SQL Server

Today I was asked by my colleague to transform data from a vertical staging table into a horizontal table. I mean transform rows to column. I used PIVOT and resolved it. But got into situation where I am getting trouble to move data if the data field repeats itself.
Here is the test data that I am working on:
CREATE TABLE STAGING 
(
    ENTITYID INT,
    PROPERTYNAME VARCHAR(25),
    PROPERTYVALUE VARCHAR(25)
)

INSERT INTO STAGING VALUES (1, 'NAME', 'DONNA')
INSERT INTO STAGING VALUES (1, 'SPOUSE', 'HENRY')
INSERT INTO STAGING VALUES (1, 'CHILD', 'JACK')
INSERT INTO STAGING VALUES (2, 'CHILD', 'KAYALA')
I used PIVOT to show row data as columns:
SELECT * FROM 
(SELECT ENTITYID, PROPERTYNAME, PROPERTYVALUE FROM STAGING) AS T
PIVOT (MAX(PROPERTYVALUE) FOR PROPERTYNAME IN (NAME, SPOUSE, CHILD)) AS T2
The output is:
ENTITYID    NAME    SPOUSE  CHILD
1           DONNA   HENRY   JACK
2           NULL    NULL    KAYALA
But he wanted the output something like:
ENTITYID    NAME    SPOUSE  CHILD   CHILD
1           DONNA   HENRY   JACK    KAYALA
bottom line is that there can be more than one CHILD attribute coming into the staging table. And we need to consider this and move all the CHILDREN to columns.
Is this possible?
Solution:
 CREATE TABLE STAGING 
(
    ENTITYID INT,
    PROPERTYNAME VARCHAR(25),
    PROPERTYVALUE VARCHAR(25)
)

INSERT INTO STAGING VALUES (1, 'NAME', 'DONNA')
INSERT INTO STAGING VALUES (1, 'SPOUSE', 'HENRY')
INSERT INTO STAGING VALUES (1, 'CHILD', 'JACK')
INSERT INTO STAGING VALUES (1, 'CHILD', 'KAYALA')


SELECT * FROM
(
SELECT ENTITYID, PROPERTYNAME = PROPERTYNAME + CAST(ROW_NUMBER() OVER(PARTITION BY ENTITYID, PROPERTYNAME ORDER BY PROPERTYVALUE) AS VARCHAR(5)),PROPERTYVALUE
FROM STAGING   
) AS T
PIVOT (MAX(PROPERTYVALUE) FOR PROPERTYNAME IN (NAME1, SPOUSE1, CHILD1, CHILD2, CHILD3, CHILD4, CHILD5)) AS T2

Explanation 
You can add a row number to the propertyname that will allow you to do what you want:
SELECT * FROM
(
SELECT ENTITYID
       , PROPERTYNAME = PROPERTYNAME + CAST(ROW_NUMBER() OVER(PARTITION BY ENTITYID, PROPERTYNAME ORDER BY PROPERTYVALUE) AS VARCHAR(5))
      ,PROPERTYVALUE
FROM #STAGING   
) AS T
PIVOT (MAX(PROPERTYVALUE) FOR PROPERTYNAME IN (NAME1, SPOUSE1, CHILD1, CHILD2, CHILD3, CHILD4, CHILD5)) AS T2
I'm assuming here that the ENTITYID ties the children to the parent, ie all children for the same person have ENTITYID of 1, but your example shows a 2 for Kayala.
Here is a Demo: SQL Fiddle
If you only want the numbers for the CHILD fields you could put this:
PROPERTYNAME = CASE WHEN PROPERTYNAME LIKE '%CHILD%' THEN PROPERTYNAME + CAST(ROW_NUMBER() OVER(PARTITION BY ENTITYID, PROPERTYNAME ORDER BY PROPERTYVALUE) AS VARCHAR(5))                                                   ELSE PROPERTYNAME END
Then remove the number from the other fields in your IN() statement.
Bonus Question- Do the above dynamically: We don't want to assume that people only have one spouse or 2.3 children, so we do the whole bit dynamically:
DECLARE @cols AS NVARCHAR(MAX),
    @query  AS NVARCHAR(MAX)

SELECT @cols = STUFF((SELECT ',' + PROPERTYNAME
                    FROM (SELECT DISTINCT PROPERTYNAME = PROPERTYNAME + CAST(ROW_NUMBER() OVER(PARTITION BY ENTITYID, PROPERTYNAME ORDER BY PROPERTYVALUE) AS VARCHAR(5))
                          FROM STAGING )sub
                    ORDER BY CASE WHEN PROPERTYNAME LIKE '%NAME%' THEN 1
                        WHEN PROPERTYNAME LIKE '%SPOUSE%' THEN 2
                        WHEN PROPERTYNAME LIKE '%CHILD%' THEN 3
                    ELSE 4
                    END
                    ,RIGHT(PROPERTYNAME,1) 
                  FOR XML PATH(''), TYPE
            ).value('.', 'NVARCHAR(MAX)') 
        ,1,1,'')


SET @query = 'SELECT * FROM
                (
                SELECT ENTITYID, PROPERTYNAME = PROPERTYNAME + CAST(ROW_NUMBER() OVER(PARTITION BY ENTITYID, PROPERTYNAME ORDER BY PROPERTYVALUE) AS VARCHAR(5)),PROPERTYVALUE
                FROM STAGING   
                ) AS T
                PIVOT (MAX(PROPERTYVALUE) FOR PROPERTYNAME IN ('+@cols+')) AS T2

'
EXEC(@query)
Note: The ordering will only work for spouses 1-9 and children 1-9, you can adjust that to suit, but it's arbitrary anyway.

0 Comments:

Post a Comment

Subscribe to Post Comments [Atom]

<< Home