Archive

Archive for September, 2010

Generate, Top Count, and SubTotals Revisited

September 12, 2010 1 comment

In my previous post, I used Generate, TopCount, and EXISTING to produce a list of the top products and the total for all other products associated with a category. The problem this post deals with is only slightly more complicated. Rather than address the top products associated with a product category, in this example, we want to display the top products associated with a sales region and in addition to the top products, we also want to display the totals for the current region’s top 5 products. The approach used in the previous post took advantage of the fact that each product was associated with one and only one product category. Since a product can be purchased in more than one region, a slightly different approach has to be used here.  Like before, the Generate statement combined with the TopCount function can be used to produce the top products for each region, [Region Top Products]. However, unlike the previous post, the sales region has to be included in the second argument of the Generate and included in the resultant set to identify the context of the current region since a single product can be returned more than once.

WITH SET [Region Top Products] AS
    Generate(
        { [Sales Territory].[Sales Territory].[Region].Members },
        [Sales Territory].[Sales Territory].CurrentMember
        * TopCount( 
            { [Product].[Product Categories].[Product].Members },
            5,
            ( [Measures].[Internet Sales Amount], [Sales Territory].[Sales Territory].CurrentMember )
        )
    )

The region subtotals are also slightly more difficult to return. Like before, a new calculated member in the [Product].[Product Categories] hierarchy needs to be created to represent the top 5 product totals. However, once again since both the identity of the region AND the product is relevant, both have to be accounted for. To do so, this region subtotal member is defined as an aggregate over the intersection of the [Region Top Products] set defined above and the crossjoin of the current sales region and products. Since this crossjoin must be re-evaluated in the context of the current sales region in each iteration, the EXISTING declarative must once again be used. The resultant definition is given below.

MEMBER [Product].[Product Categories].[All].[Top 5 Product Total] AS
    Aggregate(
        Intersect(
            [Sales Territory].[Sales Territory].CurrentMember
            * EXISTING { [Product].[Product Categories].[Product].Members },
            [Region Top Products]
        ),
        [Measures].[Internet Sales Amount]
    )

And lastly, to include the subtotals together with the top product list, two sets are combined and ordered by the sales region name. The complete final query is returned below.

WITH SET [Region Top Products] AS
    Generate(
        { [Sales Territory].[Sales Territory].[Region].Members },
        [Sales Territory].[Sales Territory].CurrentMember
        * TopCount( 
            { [Product].[Product Categories].[Product].Members },
            5,
            ( [Measures].[Internet Sales Amount], [Sales Territory].[Sales Territory].CurrentMember )
        )
    )
MEMBER [Product].[Product Categories].[All].[Top 5 Product Total] AS
    Aggregate(
        Intersect(
            [Sales Territory].[Sales Territory].CurrentMember
            * EXISTING { [Product].[Product Categories].[Product].Members },
            [Region Top Products]
        ),
        [Measures].[Internet Sales Amount]
    )
SELECT    {
        [Measures].[Internet Sales Amount]
    } ON COLUMNS,
    Order(
        [Region Top Products]
        + CrossJoin(
            { [Sales Territory].[Sales Territory].[Region].Members },
            { [Product].[Product Categories].[Top 5 Product Total] }
        ),
        [Sales Territory].[Sales Territory].CurrentMember.MEMBER_CAPTION,
        BDESC
    ) ON ROWS
FROM    [Adventure Works]
Advertisements
Categories: MDX

Generate, TopCount, and All Others

September 4, 2010 Leave a comment

Recently, there was a post on the Analysis Services forum that I made a very sad attempt at answering. The question asked was how to construct an MDX statement that shows the Top 2 children associated with a parent and also show the amount associated with all other children as an aggregated total. After struggling with this problem and posting several real sorry attempts at solving, I think with a whole lot of guidance from Stefano, a decent, easily explainable solution has been derived. I’ll try to explain here and hope others can use the approach to solve similar problems.

This problem is not a new but one that Chris Webb (and likely many others) have written about. His approach was applicable to MSAS but with the ability to evaluate sets using the current context with the EXISTING declarative, the approach given in that post is overly complex now in 2005 and beyond. I apologize in advance to anyone if this solution has already been posted but I searched and couldn’t find any references. Not that I searched all that hard. Plus, I thought it pretty cool so writing about it again is not such a bad thing.

The problem sounds simple enough. First step is to calculate the top products for each category member. Using the Adventure Works as an example, a Generate statement combined with the TopCount function accomplishes what we wish to achieve. So to calculate the Top 2 products associated with each category member, we can create a named set, [Top Items], defined as the following.

WITH SET [Top Items] AS 
        Generate(
                { [Product].[Category].[Category].Members },
                TopCount(
                        EXISTING [Product].[Product].[Product].members,
                        2,
                        [Measures].[Reseller Sales Amount]
                )
        )

There are a couple points to make about the above definition. One, the EXISTING declarative before [Product].[Product].[Product].Members set is critical. If not used, the TopCount set is evaluated only once and those top products are the only ones returned in the final results. That’s not what we want. Second point to make is that the current [Product].[Category] member is NOT included in the resultant set. Stefano’s solution (and mine on the post) added it in the second argument of the Generate function and then used the Extract function to only return the [Product].[Product] hierarchy members. Just not including the category hierarchy in the second argument of the Generate statement accomplishes the same result without the penalty of performing a CrossJoin and then an Extract.

Once a set of the top two products associated with each category member has been constructed, the next step is to determine the aggregate of all other products associated with each category. To achieve this objective, a new calculated member, [Product].[Product].[All].[All Other Products], is defined as the Aggregate of all products less those included in the [Top Items] set. Again, an EXISTING declarative is necessary so that the [Product].[Product].[Product].Members set is evaluated under the context of the current Category member.

MEMBER [Product].[Product].[All].[All Other Products] AS
        Aggregate( { EXISTING { [Product].[Product].[Product].Members } - [Top Items] } )

And lastly, to wrap the whole thing up, a set of [Top Items] and the [All Other Products] member is created and CrossJoin-ed with Categories along the rows. Adding the [All Products] member to the set just to verify that the results are correct verifies the approach.

WITH SET [Top Items] AS 
        Generate(
                { [Product].[Category].[Category].Members },
                TopCount(
                        EXISTING [Product].[Product].[Product].members,
                        2,
                        ( [Measures].[Reseller Sales Amount] )
                )
        )
MEMBER [Product].[Product].[All].[All Other Products] AS
        Aggregate( { EXISTING { [Product].[Product].[Product].Members } - [Top Items] } )
SELECT        { 
                [Measures].[Reseller Sales Amount] 
        } ON COLUMNS,
        {
                { [Product].[Category].[Category].Members }
                * {
                        [Top Items], 
                        [Product].[Product].[All Other Products], 
                        [Product].[Product].[All] 
                } 
        } ON ROWS
FROM        [Adventure Works]

Voila! And very efficiently as well.

Technorati Tags: ,,,

Categories: MDX