There have recently been a couple of questions in microsoft.public.sqlserver.olap on how to implement a dimension where members can have multiple parents. I did this for a client a number of years ago and thought I would share the technique here.
Before I start, I need to stress that this technique does have down sides, it will take longer to process your cubes and having the same members in multiple positions in a dimension can be confusing to some users. If you can implement your design using multiple hierarchies or making use of attributes in Analysis Services 2005, you should do so. It may be possible to implement the design using attributes in Analysis Services 2005 or Virtual Dimensions in Analysis Services 2000, especially if you implement an “Unknown“ or “Other“ member to catch the data that does not fit. Or, if you do not need to drill down through your alternate hierarchies, you should look into implementing them as calculated members. This technique really only works with parent-child dimensions. I have build a leveled dimension with partial alternate hierarchies, but I did this by writing a stored procedure that “leveled“ the parent-child structure, this was done purely for performance reasons as the dimension was still conceptually a parent-child dimension.
So, lets start by define what I mean by a “partial alternate hierarchy“. It's basically a situation where you want one or more members to appear in more than one position in your navigational hierarchy. The example below was posted on the OLAP newsgroup and as someone pointed out, it could possibly be implemented as 2 hierarchies. One would be a regional hierarchy (Western Europe) and the other would be an EU/Non-EU hierarchy.
But what about the following:
The European countries appear twice as they are additionally classified as EU and "Non-EU", but South American Countries only appear once. The EU/Non-EU branches are what I am calling a "partial hierarchy".
Analysis Services 2000 Solution
This technique will not work with optimized dimensions in Analysis Services 2000 as it relies on the inner join between the dimension and fact tables to generate the data for the alternate hierarchies. The table below basically outlines how you need to set up the dimension table in order to get partial hierarchies working in Analysis Services 2000.
ID, ParentId, LocationId, UnaryOperator, Name
1, 1, 1, +, Europe
2, 1, 2, +, Western Europe
3, 2, 3, +, Switzerland
4, 2, 4, +, France
5, 2, 5, +, Spain
6, 1, 6, ~, EU
7, 6, 5, +, Spain
8, 6, 4, +, France
9, 1, 9, ~, Non EU
10, 9, 3, +, Switzerland
11, 11, 11, +, South America
12, 11, 12, +, Argentina
13, 11, 13, +, Columbia
14, 11, 14, +, Peru
You set up a parent-child dimension using the ID as the key column, the ParentId as the Parent column and the name as the name column. If you want to avoid doubling up measure amounts you need to make sure you pay attention to the unary operators and make sure that the parents of any duplicated members do not in turn rollup to their parent.
The key to this are the records with the id's of 7,8 and 10, You will notice that the name and LocationID columns are identical to the records 5,6 and 2 respectively.
When you link the dimension to the fact table it needs to be linked to the LocationID column and not the ID column as would be the normal practice with a Parent-Child dimension. For this configuration to work it is also important that the schema is not optimized for this dimension as the schema optimization requires that the key column for the dimension is unique and matches the key in the fact table which is not the case here.
A simple Analysis Services 2000 schema
Notice that the data for the European countries appears twice, but due to the way the unary operators are setup, they only count once towards All Locations
Analysis Services 2005 Solution
In Analysis Services 2005, the same solution was not possible. AS 2005 requires that you link a dimension to the fact table using it's logical primary key and when you create a parent child dimension it links the parent attribute to the logical key also. I experimented for a while trying to get the same sort of solution as I had in Analysis Services 2000, but could not get it to work. I even tried to migrate the AS 2000 cubes, but had trouble getting my AS 2005 Virtual PC to see the cubes on the host operating system. Not to worry though, there was another technique I thought I should be able to make work…
Many-to-many dimensions - this is an exciting new feature, although partial hierarchies are not really a "text book" usage of it. In order to create the bridging table I set up a named query based on the dimension table which only included the Id and LocationId columns. I then created a measure group on this bridging table and linked the Location dimension to the Sales fact table through it using the many-to-many feature.
A simple Analysis Services 2005 many-to-many schema, implementing an alternate hierarchy.
Dimension Usage in Analysis Services 2005
Update (27 Oct): There was one vital piece of information missing from the Analysis Services 2005 solution. I needed to setup my bridging table as both a fact and a dimension.
When ever you are setting up a measure group to join a many to many dimension, it needs to be linked to 2 dimensions. One for each side of the “Many-to-Many“ relationship. So, for each duplicated dimension member in “Dim Location“
- Is a standard Many to many relationship, using “Sales To Loc“ as the intermediate measure group
- Is a regular relationship joining the Key (locationID) of the “Sales To Loc“ dimension to the LocationID in the fact table
- Is a regular relationship joining the Key (ID) of the “Dim Location“ dimension to the ID column in the “Sales to Loc“ fact table
- is a regular relationship joining the Key (locationID) of the “Sales to Loc“ dimensions to the locationID in the “Sales to Loc“ fact table
Update (13 May 09): some of the relationships in the above table were clarrified
The granularity for relationship 3 is set at the ID column of the “Dim Location“ dimension and the granularity for relationships 2 and 4 is set at the primary key for the “Sales to Loc“ dimension. The logical primary key of the “Sales to Loc dimension is set as the combination of the LocationId and DimId, this is the only attribute defined for this dimension.
Note: I have also hidden the “Sales to Loc“ dimensions and the count measure that is created as part of the intermediate “Sales to Loc“ measure group.
I have not seen any documentation on this, but my theory is as follows. Many-to-Many relationships are implemented as 2 Many-to-1 relationships. In the diagram above there are many of 2 for each 4 and many 3 for each 4.This is purely my personal theory, so if anyone can confirm or deny it please let me know - but it makes sense to me.
This is the same data as in the AS 2000 solution
There are further improvements that can be made to this solution. Back in 2002, when I first implemented this, we used the parent-child table as a meta-data. We developed a GUI utility for the client and set up a stored procedure to convert the parent child structure into a leveled dimension. This gave us the logical benefits of a parent-child dimension, with the performance of a leveled dimension.
The AS 2005 solution also lends itself nicely toward setting up a separate bridging table which would hold the alternate hierarchies, joining the primary hierarchy in by unioning the dimension table to this one with a view or named query in the DSV.
The design presented here is not meant to represent a best practice, it has been simplified and compacted to try and clearly illustrate how to get partial alternate hierarchies to work.
Update: 1 Aug 2007 - Fixed images, the gallery location appears to have been changed after the upgrade of the blogging engine.