Wednesday, March 18, 2015 #

DAX Studio 2.1 Released

Today I am pleased to announce the release of the latest update to DAX Studio – v2.1.0

You can get it from the releases page on codeplex:

Below is an outline of what’s new in 2.1. A big thanks to Daniele Perilli for his assistance with the graphics and Marco Russo for his work on the Query Plans and Server Timings tabs plus his help with testing this release.

UI Refresh

Thanks to assistance from Daniele Perilli on the graphics side we now have a lot more consistency in our ribbon and we’ve moved all the buttons onto the Home tab and moved the database dropdown to the metadata pane. This has enabled us to fit all the buttons you’d use regularly onto the home ribbon.


Metadata Search

We now support searching in the metadata pane. If you hover over the little magnifying glass in the top right of the metadata pane, the search control will fly out, and as you type, the metadata will filter down to only show objects containing the characters you've typed.

The icon changes to a green colour when a search is active so that you know you are not looking at the full set of metadata.


Search and Replace

We now support searching for text within a query document, including highlighting all matching text.


And we support replacing text


Both search and replace include the ability to do case-sensitive searches, regular expressions, wildcards and full word matches. These dialogs use the same hotkeys as Visual Studio: Ctrl-F for Find and Ctrl-H for Replace.

Improved Server Timings Tab

We now show a much nicer view of the aggregate timing details as well as showing the detailed scan events with their timings. You also have the option of showing cache and internal events although these are hidden by default.


Improved Query Plan tab

The query plans are now pre-processed to make them easier to read, and the total number of records for each line is split out so that it can be clearly seen and so that you can sort by this column to find the operations which are traversing large numbers of rows.


Save Query Results to a File

Thanks to the codeplex user mogular who submitted an initial code sample for this feature. You can now export query results to either a comma-separated (csv) file or a tab-delimited (txt) file by choosing the File output option.


Using Semantic Versioning for releases

I’ve never really had a strict way of assigning version numbers, but I was not really happy with using the full 4 part version that is generated by .Net. So I’ve decided to start using a variation of Semantic Versioning for the releases. This uses a 3 part version number <major>.<minor>.<patch> where:

<major> – gets updated for significant or breaking changes

<minor> – gets updated for new features

<patch> – gets updated for fixes to existing features

and updates to a higher number reset the lower ones.


There have been numerous small fixes but some of the notable ones are:

  • Excel Add-in Regional Support – there was a bug in v2.0 which caused the Excel add-in to fail to load correctly on PCs that had regional settings that did not use a period (.) as the decimal separator. This has now been fixed and the Excel add-in should work regardless of the regional settings on the PC
  • New Version notification – this was actually silently failing in v2.0
  • Updates to syntax highlighting definitions
  • Improvements to the Excel “Linked Table” output 

Known Issues

  • "Linked Table" output for the Excel add-in only works for PowerPivot models if there is at least one table in the model that was created by linking data from an Excel table. We have not been able to find a workaround for this yet; it looks like it might be due to a bug in the Excel object model.
  • I’ve temporarily removed the “Quick Access” buttons in the title bar as they are not working consistently at the moment.

Posted On Wednesday, March 18, 2015 7:17 AM | Comments (4)

Monday, February 2, 2015 #

The Care and Feeding of SSAS in Production - SQL Saturday 365 Melbourne

We are not far away from SQL Saturday #365, which is occurring on Feb 7th, 2015. If you live in or around Melbourne it's worth considering making the effort to come along. We have a great line-up of 30 different sessions with speakers from 7 different countries, including 16 MVPs.

This year I’m doing a talk called “The Care and Feeding of Analysis Services in Production”. I was partly inspired by the “Accidental DBA” topics that I’ve seen people doing for the relational engine and figured it was time someone did something similar for SSAS.

I’m going to talk about both Tabular and Multi-Dimensional, so there should be something in there for everyone.

Below is the session abstract:

A lot of the information you'll find on Analysis Services is focused around the initial creation of databases and models, but once you have a solution deployed to production, then what? In this session we will look at what it takes to run an Analysis Services server in production. What are the basics you need to know about how the server works, including things like threading, memory usage and locking? How can you monitor the health of your server? What tools can you use to find out what's happening on your server? We'll have a look at what you should be monitoring to make sure your system is running properly and run through what to investigate when things don't run as smoothly as you'd like.

Posted On Monday, February 2, 2015 6:31 AM | Comments (0)

Monday, December 8, 2014 #

DAX Studio v2–Christmas comes early

Read the following in your best movie trailer voice….

It’s been a long time coming….

19 months in the making….

5,000 lines of XAML code….

7,000 lines of C# code…

With the ability to run both inside of Excel and as a Standalone program

Version 2 of DAX Studio is finally here.

Maybe that's a bit melodramatic, but maybe you get a hint of how exciting it is to finally be able to share this with you all. Version 2 is pretty close to a full re-write of the user interface, and in the process there have been a number of difficult hurdles to overcome. So it's taken a bit longer than anticipated, but it's finally ready.

Below is a screenshot of the new user interface, which I think you’ll agree looks pretty slick.


You can download the latest version from under the downloads tab. The documentation tab has also been updated to cover all the new features in v2.

Some of the exciting new features are:

  • An “Office 2013” ribbon window
  • True tabular metadata (modelled on the Power View metadata pane)
  • Integrated tracing support
  • Bracket matching
  • Version update notification
  • comment / uncomment support
  • a single universal installer
  • plus many more 

But probably more importantly the code has been re-architected in a more modular structure so it should be easier to extend and improve going forward. There are still plenty of features that I’d like to add and it should be possible to do some smaller releases now that the major re-structuring is complete.

enjoy :)

Posted On Monday, December 8, 2014 11:18 PM | Comments (6)

Monday, November 17, 2014 #

SQL Saturday #365–Melbourne Feb 7 2015

Things have been a little quiet around here recently; one of the reasons for that is that I've been hard at work as a committee member for SQL Saturday 365, which will be held in Melbourne, Australia on Feb 7th 2015.

Following on from Melbourne’s first highly successful event earlier this year, next year's event promises to be bigger and better.

Located again at the Caulfield Campus of Monash University, the event will have a mixture of local, interstate and international speakers. And for the very first time there will be full day pre-con sessions available for a very reasonable price on the Friday before the event.

You can find out more information about the event here:

You can register to attend here:

If you would like to present a session at SQL Saturday you can submit a proposal here:

Pre Cons
This year there will also be 3 Pre-Con Full Day Sessions on the Friday. These present amazing value for money, with a full day of training by top experts for only $315 ($265 for early-bird registrations). The session details and registration links are as follows:

BIML Bootcamp
with Reeves Smith, Peter Avenant, Warwick Rudd and Paul Schmidt

Mastering Execution Plan Analysis
with Paul White

Practical Power BI
with Peter ter Braake

Full details of the PreCon sessions and how to register are in the links.

Posted On Monday, November 17, 2014 9:37 PM | Comments (0)

Monday, July 28, 2014 #

The perils of calculating an Average of Averages

I've seen questions around issues calculating averages come up a few times in various forums and it came up again last week and I feel that there is some benefit in walking through the details of this issue. For many of you the following will be nothing new, but I'm hoping that this may serve as a reference that you can point to when you get requests for this sort of calculation.

The core issue here is really a fundamental mathematical one. Personally I see it surfacing most often in DAX and MDX as those are languages that I spend a lot of time with, but also because of their multi-dimensional natures you need to be able to write generic calculations that will work regardless of how the end users slice and dice the data.

The discussions invariably start with a statement like the following:

"I have a calculated measure that calculates an average, but my totals are calculating incorrectly"

There are 2 different issues I see relating to this.

The first one is trying to use the AVG() function in MDX. Basically, if you want an average calculation that works with all your different dimensions, avoid this function. The AVG function in MDX calculates the average over a fixed set. You may be able to use it in a static query, but to calculate an average in your MDX script, simply create two base measures - a sum and a count - then divide the sum by the count. This is not as much of an issue in DAX, as the built-in AVERAGE, AVERAGEA and AVERAGEX functions generally work as expected.
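
As a sketch of that sum/count approach (the measure names here are assumptions for illustration, not from an actual model), the calculation in an MDX script might look something like this:

```mdx
// Base measures [Sales Amount] and [Sales Count] are assumed to exist
// as physical measures in the cube.
CREATE MEMBER CURRENTCUBE.[Measures].[Avg Sale Amount] AS
    IIF(
        [Measures].[Sales Count] = 0,
        NULL,
        [Measures].[Sales Amount] / [Measures].[Sales Count]
    ),
    FORMAT_STRING = "Currency";
```

Because both base measures aggregate naturally over whatever the user slices by, the division happens last, so the average is correct at every level, including grand totals.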

The other sort of question that I see is related to how the totals are calculated and the question is usually something like the following:

"I have an average measure calculated by doing sum / count - which produces the correct average for each row, but the total is calculated as "sum of sums" / "sum of counts" and my user wants to see it as the average of all the averages."

And to put it bluntly this requirement is invalid. You should never "total" a series of averages by averaging them. The easiest way to explain why this is the case is to illustrate with some data. So let's have a look at a few scenarios.

The first problem you will see with the "average of averages" approach is that it gives too much weight to outlying amounts.

Product     Count     Amount     Average
-------     -----     ------     -------
Bikes           1      1,000    1,000.00
Helmets     1,000     10,000       10.00

Given the data above, how should we calculate the total average? If we do the "average of averages" approach we have:

(1000 + 10) / 2 = 505

If we take the SUM(Amount) / SUM(Count) approach we get the following:

11000 / 1001 = 10.99

This is an extreme example to prove a point, but which do you think is correct? Should the 1 bike we sold for $1000 skew the average to $505 or should the fact that it was just one product out of 1001 mean that the average should only be $10.99?

Your business user might be happy seeing a higher average amount, but what if the situation was reversed and we had sold 1000 bikes and just one helmet? This would make the "average of averages" still equal 505 while recalculating the average at the total level would give us $999.01 - I know which calculation I think is giving a better indication of the total average sales.

It's possible that you may be thinking at this point that this is not so much of a big deal for you because you don't have that sort of variability in your data. However that is only the start of the issues. If you are still unsure about the evils of averaging averages then read on because it only gets worse.

To show the next nasty side effect we need to look at just a little bit more data. Take the following 4 records for example where we have data split between 2 cities and 2 product categories

City         Category
---------    --------
Melbourne    Bikes
Melbourne    Helmets
Seattle      Bikes
Seattle      Helmets

When we group the data by City we get the following results. The "Total" line is where the average is recalculated at the total level, whereas the "Avg of Averages" line is where I've taken the average of the 2 City averages.


Now let's have a look at what happens to the figures when we group the data by the product category. Notice that the Total line has remained unchanged, but the "Avg of Averages" is now different!


This sort of behaviour - where the figures reported for total and sub-totals will vary depending on how the data is sliced and diced - will be the death of your BI project.

Trust - the most important "feature" of any BI project

I would argue that possibly the most important "feature" of any BI solution is trust. You can always add features and missing functionality, but it can be very difficult to win back the trust of your users once it's been lost. And nothing will erode the trust of your users faster than seeing inconsistent results.

It's not just straight Averages that are the issue

Any time you are mixing calculations that do sums and divisions you need to be careful of the order of operations. Ratios, percentages and moving averages are just a few examples of other calculation types where you need to take care of the order in which you add and divide things.
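
As a sketch in DAX (the Sales and Product table and column names are assumptions for illustration only), the safe pattern is always to aggregate first and divide last:

```dax
-- Aggregate first, divide last: correct at every level of the report
Avg Amount := DIVIDE ( SUM ( Sales[Amount] ), SUM ( Sales[Count] ) )

-- Anti-pattern: averaging the per-category averages produces totals
-- that change depending on how the data is sliced
Avg of Averages := AVERAGEX ( VALUES ( Product[Category] ), [Avg Amount] )
```

The first measure will give the same total regardless of whether the user groups by city, category or anything else; the second will not.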

Posted On Monday, July 28, 2014 7:18 AM | Comments (1)

Thursday, July 3, 2014 #

The case of the vanishing KPIs

I was contacted today with an interesting issue: we had a tabular model that had some KPIs which were not showing up in Power View.

The first thing I checked was the version setting on the model. KPI support was not added to tabular models until SP1. If your model is set to a compatibility level of RTM (1100), Power View will detect this and will effectively not ask for metadata about the KPIs.

However in this case, when we checked the database properties from SSMS, the compatibility setting appeared to be correctly set to SP1 (1103).


So the next thing I did was to open a profiler trace and look at the metadata queries that Power View executed as it started up. Excel treats SSAS Tabular models as if they were multi-dimensional models and queries the metadata using a number of different DISCOVER queries against different schema rowsets. When SSAS Tabular was developed a new schema rowset was introduced called DISCOVER_CSDL_METADATA which is what DAX clients like Power View use to populate their field browser windows.

Checking the command I could see that it was correctly requesting a version 2.0 recordset. If the model was set to a compatibility setting of RTM (1100) or if there was a problem detecting the compatibility setting of the model you may see a 1.0 in the version restriction. Version 1.0 CSDL will not include KPI information. This is so that client tools can specify the version of metadata which they know how to handle.


At this point it looks like Power View and SSAS are correctly talking to each other, but we are still no closer to understanding why the KPIs are visible in Excel, but not in Power View.

The next thing to look at was the actual response returned by the metadata query, to see if there was anything strange in there. To do that I took the RestrictionList and PropertyList elements from the profiler trace and inserted them into the Restrictions and Properties elements in the query below. I also had to remove the LocaleIdentifier and DataSourceInfo elements from the PropertyList as these related to a different session. Below is an example of a DISCOVER_CSDL_METADATA command which can be run from an XMLA window in SSMS.

<Discover xmlns="urn:schemas-microsoft-com:xml-analysis">
    <RequestType>DISCOVER_CSDL_METADATA</RequestType>
    <Restrictions>
        <RestrictionList>
            <CATALOG_NAME>Adventure Works Tabular</CATALOG_NAME>
        </RestrictionList>
    </Restrictions>
    <Properties>
        <PropertyList>
            <Catalog>Adventure Works Tabular</Catalog>
        </PropertyList>
    </Properties>
</Discover>

(you can simply replace the 2 references to the catalog name in the XMLA above to run this against one of your models)

When I searched in the results for "KPI" I came across the following interesting piece of data.


Notice the Hidden="true" attribute? It turns out that the original developer decided to hide the measures before creating the KPI which resulted in the KPI itself being hidden. Setting the Hidden property to false on the measure fixed this issue. Mystery solved.

So although the end solution turned out to be something simple, I thought it might be interesting to share the process.

A footnote

Note that we still have one minor issue: in Excel we can now see both the KPI and the measure, while in Power View we only see the KPI. My suspicion is that this may be a bug in the MDSCHEMA_MEASURES rowset, which Excel uses to find out what measures a model has. My opinion is that, in order to be consistent with Power View, measures which are used for KPI values should not also be displayed as "normal" measures.

Posted On Thursday, July 3, 2014 7:33 AM | Comments (0)

Friday, May 23, 2014 #

BI Survey 14

It's BI Survey time again :)

If you haven't done this before here is a little background on it from the guys that run it:

The BI Survey, published by BARC, is the world's largest and most comprehensive annual survey of the real world experiences of business intelligence software users. Now in its fourteenth year, The BI Survey regularly attracts around 3000 responses from a global audience. It provides an invaluable resource to companies deciding which software to select and to vendors who want to understand the needs of the market.

The Survey is funded by its readers, not by the participant vendors. As with the previous thirteen editions, no vendors have been involved in any way with the formulation of The BI Survey. Unlike most other surveys, it is not commissioned, sponsored or influenced by vendors.

Here is a link to the survey:

If you take the survey you will get access to a summary of the results. By helping to promote the survey here I'll get access to some more detailed results including some country specific analysis so it will be interesting to see the results.

Posted On Friday, May 23, 2014 6:45 AM | Comments (0)

Friday, May 16, 2014 #

Function Folding in #PowerQuery

Looking at a typical Power Query query you will notice that it's made up of a number of small steps. As an example, take a look at the query I did in my previous post about joining a fact table to a slowly changing dimension. It was roughly built up of the following steps:

  1. Get all records from the fact table
  2. Get all records from the dimension table
  3. Do an outer join between these two tables on the business key (resulting in an increase in the row count, as there are multiple records in the dimension table for each business key)
  4. Filter out the excess rows introduced in step 3
  5. Remove extra columns that are not required in the final result set

If Power Query was to execute a query like this literally, following the same steps in the same order, it would not be overly efficient, particularly if your two source tables were quite large. However Power Query has a feature called function folding, where it can take a number of these small steps and push them down to the data source. The degree of function folding that can be performed depends on the data source. As you might expect, relational data sources like SQL Server, Oracle and Teradata support folding, but so do some of the other sources like OData, Exchange and Active Directory.

To explore how this works I took the data from my previous post and loaded it into a SQL database. Then I converted my Power Query expression to source its data from that database. Below is the resulting Power Query which I edited by hand so that the whole thing can be shown in a single expression:

let
    SqlSource = Sql.Database("localhost", "PowerQueryTest"),
    BU = SqlSource{[Schema="dbo",Item="BU"]}[Data],
    Fact = SqlSource{[Schema="dbo",Item="fact"]}[Data],
    Source = Table.NestedJoin(Fact,{"BU_Code"},BU,{"BU_Code"},"NewColumn"),
    LeftJoin = Table.ExpandTableColumn(Source, "NewColumn"
                                  , {"BU_Key", "StartDate", "EndDate"}
                                  , {"BU_Key", "StartDate", "EndDate"}),
    BetweenFilter = Table.SelectRows(LeftJoin, each (([Date] >= [StartDate]) and ([Date] <= [EndDate])) ),
    RemovedColumns = Table.RemoveColumns(BetweenFilter,{"StartDate", "EndDate"})
in
    RemovedColumns

If the above query was run step by step in a literal fashion you would expect it to run two queries against the SQL database doing "SELECT * …" from both tables. However a profiler trace shows just the following single SQL query:

select [_].[BU_Code],
       ... -- remaining columns elided from the trace
from
(
    select [$Outer].[BU_Code],
           ... -- remaining columns elided from the trace
    from [dbo].[fact] as [$Outer]
    left outer join
    (
        select [_].[BU_Key] as [BU_Key],
            [_].[BU_Code] as [BU_Code2],
            [_].[BU_Name] as [BU_Name],
            [_].[StartDate] as [StartDate],
            [_].[EndDate] as [EndDate]
        from [dbo].[BU] as [_]
    ) as [$Inner] on ([$Outer].[BU_Code] = [$Inner].[BU_Code2] or [$Outer].[BU_Code] is null and [$Inner].[BU_Code2] is null)
) as [_]
where [_].[Date] >= [_].[StartDate] and [_].[Date] <= [_].[EndDate]

The resulting query is a little strange; you can probably tell that it was generated programmatically. But if you look closely you'll notice that every single part of the Power Query formula has been pushed down to SQL Server. Power Query itself ends up just constructing the query and passing the results back to Excel; it does not do any of the data transformation steps itself.

So now you can feel a bit more comfortable showing Power Query to your less technical colleagues, knowing that the tool will do its best to fold all the small steps in Power Query down to the most efficient query that it can against the source systems.

Posted On Friday, May 16, 2014 7:40 AM | Comments (0)

Monday, May 5, 2014 #

#PowerQuery – Joining to a Slowly Changing Dimension

I blogged previously about how to look up a surrogate key for a slowly changing dimension using DAX. This post is about how to do the same thing using Power Query.

I'm going to start off with the same 2 tables that I used in the previous blog post. One is a fact table and the other is my BU (Business Unit) table. I started by clicking on each table of data in Excel and choosing the "From Table" data source option.


And for each table I unchecked the "Load to worksheet" option and then clicked apply & save.


Once I had done that for both tables my Power Query tool pane looked like the following: I have two queries defined, but neither of them is loading any data directly.


Now that we have our two source queries, we want to use the Merge option in Power Query to join them together.


The Merge option in Power Query is how you join matching rows in two tables together. I chose "Fact" as my first table because, for each row in the Fact table, I want to find the matching BU_Key from the BU table.


You'll notice that at this point we can only choose columns for an equality match; there are no options for us to test that the Date in Fact is between the StartDate and EndDate in the BU table.

When we click on OK we end up with a result like the following which has our original rows from the Fact table and then a column called "NewColumn" which contains the 1 or more rows from the BU table which matched on the BU_Code column.


If we click on the little double arrow button in the header of the NewColumn column you get the following options:


We can choose to either expand or aggregate the rows in the nested table. Because we want to look up the BU_Key we tick that, as well as the StartDate and EndDate columns, as we will need those later.

That gives us a result like the following:


Now we are getting close, but we still have one major issue. We now have 16 rows instead of our original 8, because each row in the Fact table is matching multiple rows in the BU table, as we have not done any filtering based on the start and end dates yet. Clicking on the filter button at the top of the "Date" column, it initially looks like doing a date filter and choosing the "Between" option would be a solution.


But that only gives us the option to select fixed date values from our data, not references to another column.


One solution would be to put in fixed dates and then manually edit the filter in the formula bar, but I wanted to see how far I could get without resorting to doing any advanced editing. The solution I came up with involved some minor code, but it can be done without manually editing the formula.

What I ended up doing was inserting a new custom column which we can then use to filter out the rows we don't want. So from the "Insert" tab on the ribbon I chose the "Insert Custom Column" option:


Then I entered the following expression to create a new column called "DateFilter" which will return a value of True if the Date from the current Fact row was between the StartDate and EndDate from the BU table.

= ( ( [Date] >= [NewColumn.StartDate] ) and ( [Date] <= [NewColumn.EndDate] ) )


That gives us the following result:


Then to filter down to just the "True" values we just need to click on the dropdown in the header of the "DateFilter" column and select the "TRUE" value in our filter.


We are now back to our original 8 rows.


Then we just need to do a little clean-up. By holding the Ctrl key while clicking on the green columns above we can remove those columns. Then I renamed "NewColumn.BU_Key" to BU_Key, clicked on the "Date" column and set its type to Date (which somehow did not get detected correctly). We now end up with our finished table, which we could choose to load into Excel or directly into a Power Pivot model.


Below is the Power Query Formula that was created as a result of the above steps. (this is just the merge query excluding the 2 source queries for "BU" and "Fact")

let
    Source = Table.NestedJoin(Fact,{"BU_Code"},BU,{"BU_Code"},"NewColumn"),
    #"Expand NewColumn" = Table.ExpandTableColumn(Source
            , "NewColumn"
            , {"BU_Key", "StartDate", "EndDate"}
            , {"NewColumn.BU_Key", "NewColumn.StartDate", "NewColumn.EndDate"}),
    InsertedCustom = Table.AddColumn(
            #"Expand NewColumn", "DateFilter"
            , each ( ( [Date] >= [NewColumn.StartDate] ) and ( [Date] <= [NewColumn.EndDate] ) )),
    FilteredRows = Table.SelectRows(InsertedCustom, each ([DateFilter] = true)),
    RemovedColumns = Table.RemoveColumns(
            FilteredRows,{"BU_Code", "NewColumn.StartDate", "NewColumn.EndDate", "DateFilter"}),
    RenamedColumns = Table.RenameColumns(RemovedColumns,{{"NewColumn.BU_Key", "BU_Key"}})
in
    RenamedColumns

If you want to manually tweak things you can go into the Advanced Editor and manually edit the formula to combine all three queries into one and you can also do away with the custom column and just do the between filtering inline. The following query shows the single query solution.

let
    Fact1 = Excel.CurrentWorkbook(){[Name="Fact"]}[Content],
    BU1 = Excel.CurrentWorkbook(){[Name="BU"]}[Content],
    Join = Table.NestedJoin(Fact1,{"BU_Code"},BU1,{"BU_Code"},"NewColumn"),
    #"Expand NewColumn" = Table.ExpandTableColumn(Join
            , "NewColumn"
            , {"BU_Key", "StartDate", "EndDate"}
            , {"NewColumn.BU_Key", "NewColumn.StartDate", "NewColumn.EndDate"}),
    FilteredRows = Table.SelectRows(#"Expand NewColumn"
            , each ( ( [Date] >= [NewColumn.StartDate] ) and ( [Date] <= [NewColumn.EndDate] ) )),
    RemovedColumns = Table.RemoveColumns(
            FilteredRows,{"BU_Code", "NewColumn.StartDate", "NewColumn.EndDate"}),
    RenamedColumns = Table.RenameColumns(RemovedColumns,{{"NewColumn.BU_Key", "BU_Key"}}),
    ChangedType = Table.TransformColumnTypes(RenamedColumns,{{"Date", type datetime}})
in
    ChangedType

If you are curious you can download the workbook I used for this blog post from my OneDrive:

Posted On Monday, May 5, 2014 9:53 PM | Comments (2)

Tuesday, April 22, 2014 #

Implementing Column Security with #SSAS Tabular and #DAX

Out of the box, Analysis Services (both Tabular and Multi-dimensional) has great support for horizontal, or row based, security. An example of this is where you would give User1 access to all data where the Country is "Australia" and give User2 access to all data where the Country is "United States". This covers a large percentage of the security requirements that most people have.

But neither technology has great support for vertical or column based security. This sort of requirement is most common in privacy scenarios. One example of this would be a model with medical data. It may be acceptable to show all your users demographic data such as the state they live in or their gender. But only a specific subset of users should have access to see individual patient details such as their name or phone number.

One approach would be to simply create 2 models, one with the secure information and one without. While this works, it doubles your processing time and doubles any maintenance activities and takes up double the storage.

Looking at the features in SSAS you may be tempted to try using perspectives. At first glance they appear to do what we want - allowing us to hide a subset of columns. But perspectives are not a security feature. All they do is show a subset of the metadata to the user; the user still has to have access to the full model, and the hidden columns are still fully query-able from DAX and MDX. Trying to use perspectives for security is like putting a "Keep Out" sign on your front door, but then not actually locking it…

To explore this issue further I created a very simple database in SQL Server which has a Patient table and a FactPatient table which look like the following:


What I want to do is to create a model where only specific people can see the PatientName column. Because we can't restrict access to specific columns in a single table, I created 2 views over the Patient table: vPatient, which has every column except PatientName, and vPatientSensitive, which has the PatientID and PatientName.
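
A minimal sketch of those two views (the demographic column names State and Gender are assumptions for illustration; the post only names PatientID and PatientName):

```sql
CREATE VIEW dbo.vPatient AS
SELECT PatientID,
       State,    -- assumed demographic column
       Gender    -- assumed demographic column
FROM dbo.Patient;
GO

CREATE VIEW dbo.vPatientSensitive AS
SELECT PatientID,
       PatientName
FROM dbo.Patient;
GO
```

The key point of the split is that every non-sensitive column lives in vPatient, while vPatientSensitive carries only the key and the column we want to secure.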


At this point I then created a tabular model bringing in FactPatient, vPatient and vPatientSensitive.

If you create your relationships in the default manner you will end up with something like the following:


This works great for the role which has access to the sensitive information, but if you create a role which does not give access to any of the rows in vPatientSensitive, these users can't see any data.

The reason for this is that the filter context flows down through the chain of one-to-many relationships.


So if a role has no access to any rows in vPatientSensitive, this flows through the relationships to also filter vPatient and FactPatient resulting in this role not being able to see any data.

Because the relationship between vPatient and vPatientSensitive is actually a 1:1, we can reverse the direction of the relationship as per the following diagram:


Now we are getting closer. Our secured role works again, but we've now introduced a problem with our role that has full access. When they browse the data they see the following with the same amounts repeated for every patient name.


If you take another look at our relationships you'll notice that it now looks like a many-to-many relationship. And there is a well-established pattern for dealing with many-to-many relationships using CALCULATE( <expression>, <intermediate table> ).

So we could try something like CALCULATE( SUM( FactPatient[Amount] ), vPatientSensitive ) - however we can't just use this expression as-is: if vPatientSensitive is restricted then we are back to our original scenario where restricted users can't see any data. So we need to check whether the current user has access to the sensitive data before applying this expression. We can do this with COUNTROWS( ALL( vPatientSensitive ) ).

Then our full expression for a measure over the FactPatient[Amount] column becomes:

Total Amount :=
IF (
    COUNTROWS ( ALL ( vPatientSensitive ) ) > 0,
    CALCULATE ( SUM ( FactPatient[Amount] ), vPatientSensitive ),
    SUM ( FactPatient[Amount] )
)

To test this design I set up a number of different roles.

The FullAccess role has no filters applied on any tables.


and can see all the patient data including the PatientName.


The NoSensitive role can see all the facts, but cannot see any columns from the vPatientSensitive table.


So when they run the same query as the FullAccess role all they see is the following where the PatientName column from vPatientSensitive only shows blank values:


It's also possible to mix and match this approach with standard row-based security. So we could limit a role to only seeing data from a particular state and also give them access to the sensitive data:



Or we could restrict the role to a particular state and deny access to the sensitive information



If you want to have a play with this solution yourself I've uploaded both the tabular project and a T-SQL script which will build the source database to a folder on my OneDrive.

Posted On Tuesday, April 22, 2014 11:20 PM | Comments (2)

Thursday, April 17, 2014 #

Running MDX Studio against SQL 2012

Even though MDX Studio has not been updated since SQL 2008 it’s still a fantastic tool for working with MDX. However if you have only installed SQL 2012 (or later) on your machine then you may get errors like the following:

System.IO.FileNotFoundException: Could not load file or assembly 'Microsoft.AnalysisServices, Version=, Culture=neutral, PublicKeyToken=89845dcd8080cc91' or one of its dependencies. The system cannot find the file specified.
File name: 'Microsoft.AnalysisServices, Version=, Culture=neutral, PublicKeyToken=89845dcd8080cc91'

There are two ways to address this issue:

1. Install either the SQL 2008 or SQL 2008 R2 version of AMO (which is part of the SQL Server feature pack)

2. Configure assembly redirection via a config file.

You can download a copy of the mdxstudio.exe.config file from my OneDrive, or save the following XML to a file of that name (the file needs to be in the same folder as MDXStudio.exe). This file redirects the 2008 / 2008 R2 version of Microsoft.AnalysisServices.dll (v10.0.0.0) to the SQL 2012 version (v11.0.0.0); to redirect to later versions it's just a matter of changing the newVersion attribute (assuming that the new library is backward compatible).

<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <dependentAssembly>
        <assemblyIdentity name="Microsoft.AnalysisServices" culture="neutral" publicKeyToken="89845dcd8080cc91" />
        <bindingRedirect oldVersion="10.0.0.0" newVersion="11.0.0.0" />
      </dependentAssembly>
      <dependentAssembly>
        <assemblyIdentity name="Microsoft.AnalysisServices.AdomdClient" culture="neutral" publicKeyToken="89845dcd8080cc91" />
        <bindingRedirect oldVersion="10.0.0.0" newVersion="11.0.0.0" />
      </dependentAssembly>
    </assemblyBinding>
  </runtime>
</configuration>
Posted On Thursday, April 17, 2014 6:19 AM | Comments (1)

Wednesday, April 9, 2014 #

#DAX – Joining to a Slowly Changing Dimension

The following is one of the scenarios that I showed during my “Drop your DAX” talk at SQL Saturday #296 in Melbourne.

Currently SSAS Tabular and PowerPivot models can only have a relationship based on a single column. So what do you do when you need to join based on multiple columns?

Ideally you would solve this during your ETL. With a type 2 slowly changing dimension you typically want to insert the surrogate key for the dimension into the fact table. As you may know, "type 2" dimensions can have one or more records for a given business key, each with different effective start and end dates. To find the correct surrogate key you have to match on the business key and a date, where the date falls between the effective start and end dates.
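Spelled out as pseudo-code, the lookup is straightforward. Here's a minimal Python sketch (for illustration only - the function name is mine, and the column names mirror the simplified BU dimension used later in this post):

```python
from datetime import date

def lookup_surrogate_key(dim_rows, bu_code, fact_date):
    """Return the surrogate key of the dimension row whose effective
    date range contains fact_date for the given business key."""
    for row in dim_rows:
        if (row["BU_Code"] == bu_code
                and row["StartDate"] <= fact_date < row["EndDate"]):
            return row["BU_Key"]
    return None  # no matching dimension record
```

Note the half-open interval (inclusive start, exclusive end) - this matches the >= StartDate and < EndDate conditions used in the DAX below, and avoids a fact date matching two adjacent dimension records.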

This is pretty easy to do in SQL, but what do you do if your two tables are not only on separate servers, but also on different types of servers. I had this situation recently. We were building a quick proof of concept and we had an existing type 2 dimension in Teradata, but the fact table was coming from an Oracle server. One option would have been to stage both sets of data locally and do the join there, but that was going to take time and resources and this was just meant to be a quick proof-of-concept.

So is there a way of fixing this in the model using DAX?

To demonstrate this I’m going to work with the following simplified dimension:


And the following simple fact table:


What I wanted to do was to create a calculated column in the fact table to look up the surrogate key from the dimension table. Creating a filter expression to find the matching records was not too hard.

FILTER(BU,Fact[BU_Code] = BU[BU_Code] && Fact[Date] >= BU[StartDate] && Fact[Date] < BU[EndDate])

But FILTER returns a table, not a scalar value, so you can't use it directly in a calculated column. My first attempt involved using the powerful CALCULATE function to return a single value from the BU_Key column.

= CALCULATE (
    VALUES ( BU[BU_Key] ),
    FILTER (
        BU,
        'Fact'[BU_Code] = BU[BU_Code]
            && 'Fact'[Date] >= BU[StartDate]
            && 'Fact'[Date] < BU[EndDate]
    )
)
This produced the results I wanted, but when I tried to create a relationship against this column I got the following error:


Sounds a bit strange, doesn't it? The issue here is that CALCULATE() transforms the row context into a filter context in order to evaluate the expression, and relationships influence the filter context. Because CALCULATE() needs to be able to follow all the relationships in order to return a result, you can't create a relationship over a column that uses CALCULATE(). That's where the circular dependency comes from.

The solution is to switch the expression to use one of the iterator functions like MINX or MAXX. I've used MINX here, but it does not matter which one you use, as the FILTER should only return a single row.

= MINX (
    FILTER (
        BU,
        'Fact'[BU_Code] = BU[BU_Code]
            && 'Fact'[Date] >= BU[StartDate]
            && 'Fact'[Date] < BU[EndDate]
    ),
    BU[BU_Key]
)

Now we can create a relationship over our calculated column.

However it’s not all chocolates and roses…

While this is a handy technique to have in your tool chest for quick prototypes, there are some downsides that you should be aware of.

Tabular processing happens in 3 phases:

1. Data acquisition

2. Data compression

3. Evaluation of calculations

You'll notice 2 issues here. Firstly, calculated columns are evaluated after data acquisition and compression, so adding them will increase your processing time. Secondly, calculated columns are not compressed, which results in slower scan speeds. So this technique has a negative impact on both processing and query performance. For a long-term solution it's probably better to look at an ETL-based approach.

Posted On Wednesday, April 9, 2014 7:18 AM | Comments (4)

Monday, March 31, 2014 #

How to build your own SSAS Resource Governor with PowerShell

A few weeks ago I posted a way to manually find and kill long running SSAS queries. In this post I’ll take it a step further and show you an automated solution. The idea behind this technique is inspired by a blog post Chris Webb did some years ago. Chris implemented his solution using SSIS while this version uses PowerShell.

You might ask - why create a Powershell version?

Well, it does a little bit more: it logs the cancelled queries and sends the user an email. It also uses membership in an Active Directory group to control who bypasses the governor, which makes that list easy to maintain separately from the script. Scripts are also easy to view and change, as all you need is a text editor. And I suspect there is less overhead in running a PowerShell script than in starting an SSIS package.

This script works with both Tabular and Multi-dimensional instances. And it could easily be extended to monitor multiple instances. 

The basic operation of the resource governor script is as follows:

1. Query $system.DISCOVER_COMMANDS for any commands that have been running longer than x minutes or seconds (excluding any XMLA commands: processing, deploying, tracing, etc.).

2. Look up the session for these commands in $system.DISCOVER_SESSIONS to find the LoginName and the database being queried.

3. Check whether the user is in the bypass list for the resource governor. If they are, log this fact and move on; otherwise cancel the query, log its details to a table in SQL Server, and send the user a nice email.

This script then gets run every minute during business hours by a SQL Agent job.


Before running this script you need to download invoke-sqlcmd2 from the Technet Gallery. It’s a nice lightweight way of running SQL commands from Powershell.

And to use the script as-is you will need to run the following script to create some logging tables in a SQL database and create a SQL Agent job.


If you are interested in running this script on one of your servers, you might want to start by commenting out the bits that cancel the query and send the user an email, and just let it log the actions it would have taken. Once you do start cancelling queries you'll want to monitor the log tables. In some cases you will discover opportunities to improve your cubes; in other cases you will be able to assist the users with a better way to achieve their desired result.

Below is the full script, but you can also download it from here

All the key variables are declared at the top of the script. You should just need to change these to suit your environment in order to use this script.

<#
    Resource Governor script for Microsoft SQL Server Analysis Services
    Automatically cancels queries that have been running longer than the maximum allowed time
    Author : Darren Gosbell
    Date   : 9 Mar 2014
    Idea from -
#>

##### initialization variables #####

$servers = "localhost\tabular" ,"localhost\multidim"
$threshold = 300   # in seconds

$sqlInstance = "localhost"
$bypassADGroup = "CN=ImportantGroup,OU=Distribution Lists,OU=Shared Mailboxes,DC=mycompany,DC=com"

$cancelEmailSubject = "Analysis Services Long Running Query Cancellation"
$cancelEmailFrom = ""
$cancelEmailBcc = ""
$cancelEmailServer = ""
$supportEmail = ""


# load the ADOMD.NET client library
[System.Reflection.Assembly]::LoadWithPartialName("Microsoft.AnalysisServices.adomdclient") > $null

# load the invoke-sqlcmd2 cmdlet
. $PSScriptRoot\invoke-sqlcmd2.ps1

## ============ Start Helper Functions =================
Function Send-Email( $to, $subject, $body )
{
    $emailFrom = $cancelEmailFrom
    $bcc = $cancelEmailBcc
    $smtpServer = $cancelEmailServer
    $msg = new-object Net.Mail.MailMessage
    $msg.From = $emailFrom
    $msg.To.Add($to)
    if ($bcc) { $msg.Bcc.Add($bcc) }
    $msg.Subject = $subject
    $msg.Body = $body
    $smtp = new-object Net.Mail.SmtpClient($smtpServer)
    $smtp.Send($msg)
}

foreach ($svr in $servers)
{
    $connStr = "data source=$svr"
    [Microsoft.AnalysisServices.adomdclient.adomdconnection]$cnn = new-object Microsoft.AnalysisServices.adomdclient.adomdconnection($connStr)
    $cmd = new-Object Microsoft.AnalysisServices.AdomdClient.AdomdCommand
    $cmd.Connection = $cnn
    $cnn.Open()

    # commands that have exceeded the time threshold (threshold is in seconds, the DMV reports milliseconds)
    $qryLongCmd = @"
    SELECT SESSION_SPID, COMMAND_START_TIME, COMMAND_ELAPSED_TIME_MS, COMMAND_TEXT
    FROM `$system.discover_commands
    WHERE COMMAND_ELAPSED_TIME_MS > $($threshold * 1000)
"@

    $qrySessions = @"
    SELECT SESSION_SPID, SESSION_USER_NAME, SESSION_CURRENT_DATABASE
    FROM `$system.discover_sessions
"@

    # get a list of current commands that exceeded the time threshold
    $cmd.CommandText = $qryLongCmd
    $da = new-Object Microsoft.AnalysisServices.AdomdClient.AdomdDataAdapter($cmd)
    $dsCmd = new-Object System.Data.DataSet
    $da.Fill($dsCmd) > $null
    # filter out any xmla commands that start with '<'
    $drCmd = $dsCmd.Tables[0].rows | where {$_.COMMAND_TEXT.StartsWith("<") -eq $false }

    if (@($drCmd).Count -eq 0)
    {
        write-host "no excessive queries found"
        continue
    }

    # get a list of the current sessions
    $cmd.CommandText = $qrySessions
    $da = new-Object Microsoft.AnalysisServices.AdomdClient.AdomdDataAdapter($cmd)
    $dsSess = new-Object System.Data.DataSet
    $da.Fill($dsSess) > $null

    # Lookup the session information for each long running command
    foreach ($c in $drCmd)
    {
        $s = $dsSess.Tables[0].select("SESSION_SPID = $($c.SESSION_SPID)")
        $c | Add-Member -NotePropertyName "SESSION_USER_NAME" -NotePropertyValue $s.SESSION_USER_NAME
        $c | Add-Member -NotePropertyName "SESSION_CURRENT_DATABASE" -NotePropertyValue $s.SESSION_CURRENT_DATABASE
        $c | Add-Member -NotePropertyName "COMMAND_ELAPSED_TIME" -NotePropertyValue $([System.Timespan]::FromMilliseconds($c.COMMAND_ELAPSED_TIME_MS))
        $user = $s.SESSION_USER_NAME.Replace("ACCOUNT-01\","")
        $srchr = (New-Object DirectoryServices.DirectorySearcher "(&(ObjectClass=user)(Name=$user))")
        $srchr.PropertiesToLoad.Add("mail") > $null
        $srchr.PropertiesToLoad.Add("memberof") > $null
        $ad = $srchr.FindOne()
        $InPriorityGroup = $ad.Properties["memberof"] -Contains $bypassADGroup
        $c | Add-Member -NotePropertyName "InPriorityGroup" -NotePropertyValue $InPriorityGroup
        $c | Add-Member -NotePropertyName "Email" -NotePropertyValue $($ad.Properties["mail"])
    }

    # kill any sessions that were returned
    foreach ($spid in $drCmd)
    {
        if ($spid.InPriorityGroup -eq $true)
        {
            write-output "Bypassing SPID: $($spid.SESSION_SPID) as it is in the Workload Priority Group"
            $bypassLogCmd = "INSERT INTO tb_WorkloadBypassLog (UserLogin, CommandDurationMS, SessionDatabase, SPID) VALUES ('$($spid.SESSION_USER_NAME)', '$($spid.COMMAND_ELAPSED_TIME_MS)', '$($spid.SESSION_CURRENT_DATABASE)', '$($spid.SESSION_SPID)')"
            invoke-sqlcmd2 -ServerInstance $sqlInstance -Database "OlapTrace" -Query $bypassLogCmd -As None > $null
            continue
        }

        $eml = $spid.Email
        write-progress "Cancelling SPID $($spid.SESSION_SPID)"
        # log the cancellation attempt
        $qry = $spid.COMMAND_TEXT.Replace("'","''")
        $insertCmd = "INSERT INTO tb_CancelledQueries (UserLogin, CommandStart, CommandDurationMS, SessionDatabase, Query, Email) VALUES ('$($spid.SESSION_USER_NAME)', '$($spid.COMMAND_START_TIME)', '$($spid.COMMAND_ELAPSED_TIME_MS)', '$($spid.SESSION_CURRENT_DATABASE)', '$qry', '$($spid.Email)')"

        # Send email notification to end user
        $msg = @"
Your query against the '$($spid.SESSION_CURRENT_DATABASE)' Analysis Services database has consumed excessive resources
and has been marked for cancellation by the workload management system.

Please wait a few minutes or try requesting a smaller set of data.

For assistance with structuring efficient queries or advice about resource management,
please forward this email and a brief outline of what you are trying to achieve to $supportEmail
"@

        # if we have an email registered in AD send the user a notification
        if ($eml)
        {
            Send-Email $eml $cancelEmailSubject $msg
        }

        # cancel the query
        $cmd.CommandText = "<Cancel xmlns=`"urn:schemas-microsoft-com:xml-analysis`"><SPID>$($spid.SESSION_SPID)</SPID></Cancel>"
        $cmd.ExecuteNonQuery() > $null

        # log the cancellation
        invoke-sqlcmd2 -ServerInstance $sqlInstance -Database "OlapTrace" -Query $insertCmd -As None > $null
    }

    $cnn.Close()
}
Posted On Monday, March 31, 2014 6:04 AM | Comments (0)

Sunday, March 23, 2014 #

Extending the PowerQuery date table generator to include ISO Weeks

Chris Webb and Matt Mason have both blogged about formulas for generating a date table using PowerQuery, but both of those posts focus on the standard year-month-day calendar. I've been doing a little work with some week-based calculations and thought I would see how hard it would be to extend this sort of approach to generate some columns for a week hierarchy.

The ISO Week standard is part of ISO 8601 and defines a week as starting on Monday and ending on Sunday. That in itself is not very hard. The tricky bit comes into play when you go to assign each week to a year. Because weeks don’t fit evenly into years you need to either move some days from the end of December forward or move a few days of January back to the prior year.

The way to do this is as follows:

  • Find the week that contains January 4 as that is always the first week of the year.
  • If Jan 4 is before Thursday then any January days prior to Monday are allocated to the previous year.
  • If Jan 4 is after Thursday then any December days at the start of the week are treated as being part of the current year.
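These rules can be sanity-checked with a few lines of Python (a sketch for verification only - the function name is mine; Python's built-in date.isocalendar() implements the same ISO 8601 logic and should always agree):

```python
from datetime import date, timedelta

def iso_week_year(d):
    """Derive (ISO year, ISO week number) from the rules above:
    week 1 is the week (Mon-Sun) containing Jan 4, and a date's
    ISO year is the calendar year of the Thursday in its week."""
    thursday = d + timedelta(days=3 - d.weekday())     # Monday=0 .. Sunday=6
    jan4 = date(thursday.year, 1, 4)                   # Jan 4 is always in week 1
    week1_monday = jan4 - timedelta(days=jan4.weekday())
    week = (thursday - week1_monday).days // 7 + 1
    return thursday.year, week
```

For example, 29 December 2014 is a Monday whose Thursday falls on 1 January 2015, so it belongs to week 1 of ISO year 2015, exactly as the second bullet describes.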

I've also taken this a step further and created a small inline function that figures out the 4-4-5 period and quarter that a given week falls into. I'm using a function which returns a record so that I can return both the period and the quarter from the one function, which I think is pretty cool.
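For reference, the same week-to-period mapping can be expressed in a few lines of Python (illustrative only - the function name period_445 is mine, and the breakpoints simply mirror the ones in the fnPeriod445a function in the M code further down):

```python
def period_445(week_num):
    """Return the 4-4-5 period (P) and quarter (Q) for an ISO week number,
    using the same breakpoints as the fnPeriod445a M function."""
    breaks = [
        (5, 1, 1), (9, 2, 1), (14, 3, 1), (18, 4, 1),
        (22, 5, 2), (27, 6, 2), (31, 7, 3), (35, 8, 3),
        (40, 9, 3), (44, 10, 4), (48, 11, 4),
    ]
    for upper, p, q in breaks:
        if week_num < upper:     # first breakpoint the week falls under
            return {"P": p, "Q": q}
    return {"P": 12, "Q": 4}     # weeks 48 and above
```

Returning a small record (here a dict, in M a record of P and Q) is what lets a single function feed both the period and quarter columns.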

The following is an extension of Matt Mason's method; he has some great screenshots of how to use the function, so if you have not seen that post it's definitely worth checking out.

Basically you start a new blank query, switch to the advanced mode and then paste in the following and invoke it:

let CreateDateTable = (StartDate as date, EndDate as date, optional Culture as nullable text) as table =>
  let
    DayCount = Duration.Days(Duration.From(EndDate - StartDate)),
    Source = List.Dates(StartDate,DayCount,#duration(1,0,0,0)),
    TableFromList = Table.FromList(Source, Splitter.SplitByNothing()),
    ChangedType = Table.TransformColumnTypes(TableFromList,{{"Column1", type date}}),
    RenamedColumns = Table.RenameColumns(ChangedType,{{"Column1", "Date"}}),
    InsertYear = Table.AddColumn(RenamedColumns, "Year", each Date.Year([Date])),
    InsertQuarter = Table.AddColumn(InsertYear, "QuarterOfYear", each Date.QuarterOfYear([Date])),
    InsertMonth = Table.AddColumn(InsertQuarter, "MonthOfYear", each Date.Month([Date])),
    InsertDay = Table.AddColumn(InsertMonth, "DayOfMonth", each Date.Day([Date])),
    InsertDayInt = Table.AddColumn(InsertDay, "DateInt", each [Year] * 10000 + [MonthOfYear] * 100 + [DayOfMonth]),
    InsertMonthName = Table.AddColumn(InsertDayInt, "MonthName", each Date.ToText([Date], "MMMM", Culture), type text),
    InsertCalendarMonth = Table.AddColumn(InsertMonthName, "MonthInCalendar", each (try(Text.Range([MonthName],0,3)) otherwise [MonthName]) & " " & Number.ToText([Year])),
    InsertCalendarQtr = Table.AddColumn(InsertCalendarMonth, "QuarterInCalendar", each "Q" & Number.ToText([QuarterOfYear]) & " " & Number.ToText([Year])),
    InsertDayWeek = Table.AddColumn(InsertCalendarQtr, "DayInWeek", each Date.DayOfWeek([Date],1)+1),
    InsertDayName = Table.AddColumn(InsertDayWeek, "DayOfWeekName", each Date.ToText([Date], "dddd", Culture), type text),
    InsertWeekEnding = Table.AddColumn(InsertDayName, "WeekEndingFriday", each Date.EndOfWeek([Date],6), type date),
    InsertCurrentThursday = Table.AddColumn(InsertWeekEnding, "CurrentThursday", each Date.AddDays([Date], -Date.DayOfWeek([Date],1) + 3), type date),
    InsertISOWeekJan4 = Table.AddColumn(InsertCurrentThursday, "ISOWeekJan4", each Date.FromText(Number.ToText(Date.Year([CurrentThursday])) & "-01-04"), type date),
    InsertISOWeekYear = Table.AddColumn(InsertISOWeekJan4, "ISOWeekYear", each Date.Year([CurrentThursday])),
    InsertISOWeekFirstMon = Table.AddColumn(InsertISOWeekYear, "ISOWeekFirstMon", each
        if [CurrentThursday] < [ISOWeekJan4]
        then Date.AddDays([CurrentThursday],-3)
        else Date.AddDays([ISOWeekJan4], - Date.DayOfWeek([ISOWeekJan4],1) )
      ,type date),
    InsertISOWeekNum = Table.AddColumn(InsertISOWeekFirstMon, "ISOWeekNum", each Number.RoundUp(((Duration.Days(Duration.From([Date] - [ISOWeekFirstMon]))+1) /7 )), type number),
    InsertISOWeekID = Table.AddColumn(InsertISOWeekNum, "ISOWeekID", each [ISOWeekYear] * 100 + [ISOWeekNum], type number),
    InsertISOWeekName = Table.AddColumn(InsertISOWeekID, "ISOWeekName", each Text.From([ISOWeekYear]) & "W" & Text.End( "0" & Text.From(([ISOWeekNum]*10) + [DayInWeek]),3)),
    InsertISOWeekNameLong = Table.AddColumn(InsertISOWeekName, "ISOWeekNameLong", each Text.From([ISOWeekYear]) & "-W" & Text.End( "0" & Text.From([ISOWeekNum]),2) & "-" & Text.From([DayInWeek])),

    fnPeriod445a = (weekNum) => let
      Periods = {
            {(x)=>x<5,  [P=1,Q=1]},
            {(x)=>x<9,  [P=2,Q=1]},
            {(x)=>x<14, [P=3,Q=1]},
            {(x)=>x<18, [P=4,Q=1]},
            {(x)=>x<22, [P=5,Q=2]},
            {(x)=>x<27, [P=6,Q=2]},
            {(x)=>x<31, [P=7,Q=3]},
            {(x)=>x<35, [P=8,Q=3]},
            {(x)=>x<40, [P=9,Q=3]},
            {(x)=>x<44, [P=10,Q=4]},
            {(x)=>x<48, [P=11,Q=4]},
            {(x)=>true, [P=12,Q=4]}
      },
      Result = List.First(List.Select(Periods, each _{0}(weekNum))){1}
    in Result,

    InsertPeriod445 = Table.AddColumn(InsertISOWeekNameLong, "Period445Record", each fnPeriod445a([ISOWeekNum])),
    ExpandPeriod445 = Table.ExpandRecordColumn(InsertPeriod445, "Period445Record", {"P","Q"}, {"Period445", "Quarter445"}),
    RemovedColumns = Table.RemoveColumns(ExpandPeriod445,{"CurrentThursday", "ISOWeekFirstMon"})
  in
    RemovedColumns
in
    CreateDateTable

Posted On Sunday, March 23, 2014 10:11 PM | Comments (2)

Monday, March 10, 2014 #

SQL Saturday Melbourne – 5 April 2014

I'm pretty excited to be part of the first SQL Saturday to be held in my home town of Melbourne. It's being held on April 5, 2014 at the Monash University Caulfield Campus. If you are interested in the event you can find out more here. I think there may still be places open if you would like to go but have not registered yet. It's a great place to meet up with like-minded people and learn about SQL Server from some of the best and brightest in the industry - oh, and I'll be there too :)

Posted On Monday, March 10, 2014 7:47 PM | Comments (0)