Adding Subtotals using GROUPING SETS
I recently had to convert an existing report constructed against SQL Server 2000 to our new SQL Server 2008 dimensional data warehouse. The output of the report procedure was a single set of aggregated totals by different levels of the date dimension. After reviewing the code for some time, which consisted of several pages of stream of consciousness SQL and the creation of a multitude of temp tables, I arrived at the conclusion that there just had to be a better way. That’s when I stumbled upon a little documented enhancement in SQL Server 2008 called GROUPING SETS.
Using the AdventureWorksDW sample database as a source, the use of GROUPING SETS lets us produce output like below using a single SQL statement.
How is this done? Lets first consider the case where we want to generate an aggregate by month. Using a GROUP BY statement, this result set is produced by the following:
SELECT ordDate.CalendarYear,
ordDate.CalendarQuarter,
ordDate.MonthNumberOfYear,
ordDate.EnglishMonthName,
SUM(fact.SalesAmount) AS SalesAmount
FROM dbo.FactInternetSales fact
INNER JOIN dbo.DimTime ordDate
ON fact.OrderDateKey = ordDate.TimeKey
GROUP BY
ordDate.CalendarYear,
ordDate.CalendarQuarter,
ordDate.MonthNumberOfYear,
ordDate.EnglishMonthName
SELECT ordDate.CalendarYear,
ordDate.CalendarQuarter,
ordDate.MonthNumberOfYear,
ordDate.EnglishMonthName,
SUM(fact.SalesAmount) AS SalesAmount
FROM dbo.FactInternetSales fact
INNER JOIN dbo.DimTime ordDate
ON fact.OrderDateKey = ordDate.TimeKey
GROUP BY
GROUPING SETS (
(
ordDate.CalendarYear,
ordDate.CalendarQuarter,
ordDate.MonthNumberOfYear,
ordDate.EnglishMonthName
)
)
The two statements produce the same exact output. However, the use of GROUPING SETS lets me specify another vector aggregate, say just Year and Quarter like below
SELECT ordDate.CalendarYear,
ordDate.CalendarQuarter,
ordDate.MonthNumberOfYear,
ordDate.EnglishMonthName,
SUM(fact.SalesAmount) AS SalesAmount
FROM dbo.FactInternetSales fact
INNER JOIN dbo.DimTime ordDate
ON fact.OrderDateKey = ordDate.TimeKey
GROUP BY
GROUPING SETS (
(
ordDate.CalendarYear,
ordDate.CalendarQuarter,
ordDate.MonthNumberOfYear,
ordDate.EnglishMonthName
),
(
ordDate.CalendarYear,
ordDate.CalendarQuarter
)
)
This statement produces aggregates at two different levels, Month and Quarter, of the time dimension.
To add Yearly totals and GrandTotals, two new tuples are added to the GROUPING SETS function, (ordDate.CalendarYear), () so that the final statement looks like the following.
SELECT ordDate.CalendarYear,
ordDate.CalendarQuarter,
ordDate.MonthNumberOfYear,
ordDate.EnglishMonthName,
SUM(fact.SalesAmount) AS SalesAmount
FROM dbo.FactInternetSales fact
INNER JOIN dbo.DimTime ordDate
ON fact.OrderDateKey = ordDate.TimeKey
GROUP BY
GROUPING SETS(
(
ordDate.CalendarYear,
ordDate.CalendarQuarter,
ordDate.MonthNumberOfYear,
ordDate.EnglishMonthName
), — Month summary
(
ordDate.CalendarYear,
ordDate.CalendarQuarter
), — Quarter summary
(
ordDate.CalendarYear
), — Year summary
() — Grand Totals
)
Quite beautiful.