James Rogers

BI from the trenches...
posts - 13, comments - 25, trackbacks - 0

My Links

News

Archives

Post Categories

Thursday, October 13, 2011

SSAS, MPP and PDW

 

I cut my DBA teeth on IBM's DB2 UDB EEE (enough acronyms) back in the 90's. I have always appreciated the scalability that is available with a MPP architecture. So a couple of years ago I was excited to hear that SQL Server was going to have a MPP architecture available. Now I am at SQLPass Summit 2011 and seeing the PDW (Parallel Data Warehouse) and the hardware offerings by Dell and HP to scale that up and out. However, I was disappointed to learn that this technology does not extend to Analysis Services.
     I spend a great deal of my time tuning SSAS and can say that it scales very well as is and there is a plethora of information out there to facilitate those efforts. However, let's dream a little here. Imagine a multi-node/single-instance/shared-nothing installation of SSAS. You can distribute/partition the cubes across the nodes and then within the nodes in a manner that allows you to isolate hot-spots of data on their own hardware so that data can stay cached. Meanwhile, other queries on data that is less-often accessed or more ad-hoc in nature can process with resources allocated on separate nodes for that and the majority of queries do not suffer as much from cache turnover and the overall system load. Ahhh, it is good to dream. We could take that a step further and wish that would could mix Vertipaq storage for that more frequently used/"hot" data node and traditional Analysis Services storage for the other data where aggregations may have more value in helping the queries. Additionally, processing of mining models could be further isolated from ongoing activities and processing activities could take advantage of the MPP architecture and resulting parallelism to allow for more processing in shorter time than on a similar, single box solution. Could even go further and dedicate a node to ROLAP partitions for more real-time requirements while limiting the burden of such queries on the overall system performance.
     I realize that MS probably doesn't have much of a target market for such dreams but it would be cool nonetheless and provide some real scalability options for big data analytics implementations. Like always, the devil is in the details. OK - back to reality.
  • Share This Post:
  • Share on Twitter
  • Share on Facebook
  • Share on Technorati

Posted On Thursday, October 13, 2011 12:01 PM | Feedback (2) | Filed Under [ Analysis Services SQLServer ]

Monday, December 20, 2010

Analysis Services (SSAS) - Unexpected Internal Error when processing (ProcessUpdate). Workaround/Resolution

 /* EDIT - This problem has been fixed in the latest SQL Server 2008 R2 Cumulative Update package. It was identified that changing the aggregation design without reprocessing the aggregations prior to a ProcessUpdate on the dimensions causes this problem and it is a bug */

Many implementations require the use of ProcessUpdate to support Type 1 slowly changing dimensions. ProcessUpdate drops all of the affected indexes and aggregations in partitions affected by data that changes in the Dimension on which the ProcessUpdate is being performed. Twice now I have had situations where the processing fails with "Internal error: An unexpected exception occurred." Any subsequent ProcessUpdate processing will also fail with the same error. In talking with Microsoft the issue is corrupt indexes for the Dimension(s) being processed in the partitions of the affected measure group. I cannot guarantee that the following will correct your problem but it did in my case and saved us quite a bit of down time.
 
Workaround: ProcessIndexes on the entire cube that is being processed and throwing the error. This corrected the problem on both 2008 and 2008 R2.
 
Pros: 
  1. Does not require a complete rebuild of the data (ProcessFull) for either the Dimension or Cube.
  2. User access can continue while this ProcessIndexes in underway.
 
Cons:
  1. Can take a long time, especially on large cubes with many partitions, dimensions and/or aggregations.
  2. Query Performance is usually severely impacted due to the memory and CPU requirements for Aggregation and Index building
 
 <Parallel>
    <Process xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2" xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2" xmlns:ddl100_100="http://schemas.microsoft.com/analysisservices/2008/engine/100/100" xmlns:ddl200="http://schemas.microsoft.com/analysisservices/2010/engine/200" xmlns:ddl200_200="http://schemas.microsoft.com/analysisservices/2010/engine/200/200">
      <Object>
        <DatabaseID>MyDatabase</DatabaseID>
        <CubeID>MyCube</CubeID>
      </Object>
      <Type>ProcessIndexes</Type>
      <WriteBackTableCreation>UseExisting</WriteBackTableCreation>
    </Process>
 </Parallel>
</Batch>
 
The cube where the corruption exists can be found by having Profiler running while the ProcessUpdate is executing. The first partition that displays the "The Job has ended in failure." message in the TextData column will be part of the cube/measuregroup that has the corruption. You can try to run ProcessIndexes on just that measure group. This may correct the problem and save additional time if you have other large measure groups in the cube that are not affected by the corruption.
 
Remember to execute your normal ProcessUpdate batch after the successful completion of the ProcessIndexes. The ProcessIndexes does not pick up data changes.
 
Things that did not work:
  1. ProcessClearIndexes - why this doesn't work and ProcessIndexes does is unclear at this point.
  2. ProcessFull on the partition in question. In my latest case, this would clear up the problem for that partition. However, the next partition the ProcessUpdate touched that had data in it would generate and error. This leads me to believe the corruption problem will exist in all partitions in the affected measure group that have data in them.
 
NOTE: I experience this problem in both a SQL 2008 and SQL 2008 R2 Analysis Services environment, on separate built from the same relational database. This leads me to believe that some data condition in the tables used for the Dimension processing caused the corruption since the two environments were on physically separate hardware. I am waiting on Microsoft to analyze the dumps to give us more insight into what actually caused the corruption and will update this post accordingly.
  • Share This Post:
  • Share on Twitter
  • Share on Facebook
  • Share on Technorati

Posted On Monday, December 20, 2010 5:51 AM | Feedback (0) | Filed Under [ Analysis Services SQLServer ]

Tuesday, November 02, 2010

SSAS Usage-Based Optimization, OlapQueryLog (Enabling and Disabling), Aggregations and Structure Changes.

Analysis Services has a useful feature called Usage-Based Optimization. This feature is used to design aggregations based on actual user queries using Business Intelligence Development Studio (BIDS). Information on how to configure (or disable) the query logging required for this is fairly hard to find.
 
The Analysis Services Properties identified by "Log \ QueryLog" in the Analysis Services Properties window are used to configure the query logging used by Usage-Based Optimization:
 
 
 
The definition of these properties can be found in Books On Line (BOL) and on MSDN.
 
Enabling Usage-Based Optimization:
 
1.       Set CreateQueryLogTable to true
2.       Set the value of QueryLogConnectionString appropriately (database specified in the connection string must already exist)
3.       Set the value of QueryLogSampling to the appropriate value for your environment (the default value of 10 means every 10th query will be logged). The lower the value of this property, the higher the overhead of query logging will be on the system.
4.       Set the value of QueryLogTableName appropriately.
5.       Click "OK"
 
Disabling Usage-Based Optimization:
 
1.       Set the value of QueryLogSampling to "0". This will stop query logging.
2.       Click "OK"
 
Advantages of Usage-Based Optimization and logging:
 
1.       If you have an environment where users are submitting repetitive queries with similar tuples/slicers - the usage based-optimization will probably do a good job of helping design useful aggregations.
2.       Very little detailed knowledge of the cube structure and querying practices is needed to create aggregations.
3.       Aggregations will be designed (using BIDS or BIDS Helper utilities) based on actual user queries and not just gather requirements.
 
Disadvantages of Usage-Based Optimization and logging:
 
1.       Can create overhead on the system is sampling is too frequent.
2.       Resulting aggregation designs may not help as much as desired or may not help the average use-case.
3.       The logging table must be maintained and kept at a manageable size:
a.       Any structure changes will cause analysis services to issue a delete against the logging table for any queries containing the cube being changed/affected. This can cause cube structure changes to take a long time if the logging table is extremely large.
b.      Utilities from BIDS (or BIDS Helper functionality) can be mislead by older log data that may no longer apply to the current usage patterns of the database.
 
  • Share This Post:
  • Share on Twitter
  • Share on Facebook
  • Share on Technorati

Posted On Tuesday, November 02, 2010 3:31 AM | Feedback (0) | Filed Under [ Analysis Services ]

Tuning Analysis Services (SSAS) for better Parallel Aggregation/Index Processing

One of the key performance concepts in Analysis Services is the design and use of good Aggregations to support user queries. However, Analysis Services (Both 2005 and 2008) by default is not configured very well to take advantage of multi-processor environments for parallel processing of these indexes, particularly for large MOLAP partitions. Keep in mind, using a large portion of available processors for aggregation building is only desirable in environments where a processing window is allocated and user activity is minimized or prevented entirely for that window. If you have small datasets and processing time is not a problem then making changes to support better parallel processing may not be needed or recommended. Changing of these settings is best done under guidance of Microsoft.
 
Problem Isolation: Using a combination of SQL Server Profiler and watching the Task Manager from windows allows simple detection of the key symptom of this problem: seeing only one processor (or processor core) being utilized while processing of aggregations is taking place. To test this, issue a multi-partition ProcessIndexes XMLA statement such as the one below and watch your Profiler and Task Manager (could also use Perfmon as well).
 
 <Parallel>
    <Processxmlns:xsd="http://www.w3.org/2001/XMLSchema"xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2"xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2"xmlns:ddl100_100="http://schemas.microsoft.com/analysisservices/2008/engine/100/100">
      <Object>
        <DatabaseID>Adventure Works DW 2008</DatabaseID>
        <CubeID>Adventure Works</CubeID>
        <MeasureGroupID>Fact Internet Sales 1</MeasureGroupID>
        <PartitionID>Internet_Sales_2001</PartitionID>
      </Object>
      <Type>ProcessIndexes</Type>
      <WriteBackTableCreation>UseExisting</WriteBackTableCreation>
    </Process>
    <Processxmlns:xsd="http://www.w3.org/2001/XMLSchema"xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2"xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2"xmlns:ddl100_100="http://schemas.microsoft.com/analysisservices/2008/engine/100/100">
      <Object>
        <DatabaseID>Adventure Works DW 2008</DatabaseID>
        <CubeID>Adventure Works</CubeID>
        <MeasureGroupID>Fact Internet Sales 1</MeasureGroupID>
        <PartitionID>Internet_Sales_2002</PartitionID>
      </Object>
      <Type>ProcessIndexes</Type>
      <WriteBackTableCreation>UseExisting</WriteBackTableCreation>
    </Process>
    <Processxmlns:xsd="http://www.w3.org/2001/XMLSchema"xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2"xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2"xmlns:ddl100_100="http://schemas.microsoft.com/analysisservices/2008/engine/100/100">
      <Object>
        <DatabaseID>Adventure Works DW 2008</DatabaseID>
        <CubeID>Adventure Works</CubeID>
        <MeasureGroupID>Fact Internet Sales 1</MeasureGroupID>
        <PartitionID>Internet_Sales_2003</PartitionID>
      </Object>
      <Type>ProcessIndexes</Type>
      <WriteBackTableCreation>UseExisting</WriteBackTableCreation>
    </Process>
    <Processxmlns:xsd="http://www.w3.org/2001/XMLSchema"xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2"xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2"xmlns:ddl100_100="http://schemas.microsoft.com/analysisservices/2008/engine/100/100">
      <Object>
        <DatabaseID>Adventure Works DW 2008</DatabaseID>
        <CubeID>Adventure Works</CubeID>
        <MeasureGroupID>Fact Internet Sales 1</MeasureGroupID>
        <PartitionID>Internet_Sales_2004</PartitionID>
     </Object>
      <Type>ProcessIndexes</Type>
      <WriteBackTableCreation>UseExisting</WriteBackTableCreation>
    </Process>
 </Parallel>
</Batch>
 
If, during the course of processing you see only one partition processing at a time (in the Profiler window) and that also correlates to mdsmdsrv.exe only using the equivalent of one CPU (i.e. ~50% on a 2-CPU machine) then tuning the SSAS server properties to allow better parallel aggegation/index processing might be an advantage for the environment (see original statement regarding processing windows and data sets in the first paragraph).
 
Configuration Changes: The configuration properties of AggregationMemoryLimitMax and AggregationMemoryLimitMin (these are in the OLAP \ Process section of the Analysis Services Properties Window):
 
 
The default value of 80 for AggregationMemoryLimitMax will prevent more than one partition from being processed, especially for large cubes/partitions with large aggregations. This is because a setting of 80 means up to 80% of the available memory can be used for building aggregations on the partition. That leaves only 20% of the available memory for other partitions, which is not enough to start another partition until the partition currently being processed is completed. I use the following formula for setting AggregationMemoryLimitMax and AggregationMemoryLimitMin (this is only used for reference, the proper settings vary from environment to environment based on hardware resources, business requirements and availability requirements):
 
AggregationMemoryLimitMax=ceiling(100/(<number processing cores>-2))
AggregationMemoryLimitMin=ceiling(100/((<number processing cores>-2)*1.5))
 
Using (<number processing cores>+1) helps to leave enough memory for at least one processor core for non-SSAS activities (such as paging, SQL Server activities).
 
On a 16-core machine the AggregationMemoryLimitMax would come out to 7 which means up to 14 (100/7) partitions could process aggregations in parallel.   The AggregationMemoryLimitMin would come to 5 which means no more than 20 (100/5) partitions could process in parallel.
 
Result:
 
Processing time for rebuilding aggregations and indexes is significantly reduced in multi-processor environments, provided the environment does not have other resource constraints. The degree of improvement will depend on the number of processing cores and amount of memory available to the system for the Aggregation processing. However, on large mult-core, multi-processor servers with large partitions and aggregations, this can make the difference between being able to process the aggregations frequent enough to help user queries or having the user queries suffer longer durations due to missing indexes and/or aggregations that are not processed.
 
  • Share This Post:
  • Share on Twitter
  • Share on Facebook
  • Share on Technorati

Posted On Tuesday, November 02, 2010 12:33 AM | Feedback (1) | Filed Under [ Analysis Services ]

Friday, June 11, 2010

MDX using EXISTING, AGGREGATE, CROSSJOIN and WHERE

 

It is a well-published approach to using the EXISTING function to decode AGGREGATE members and nested sub-query filters.  Mosha wrote a good blog on it here and a more recent one here.  The use of EXISTING in these scenarios is very useful and sometimes the only option when dealing with multi-select filters.  However, there are some limitations I have run across when using the EXISTING function against an AGGREGATE member:
 
  1. The AGGREGATE member must be assigned to the Dimension.Hierarchy being detected by the EXISTING function in the calculated measure.
  1. The AGGREGATE member cannot contain a crossjoin from any other dimension or hierarchy or EXISTING will not be able to detect the members in the AGGREGATE member.
 
Take the following query (from Adventure Works DW 2008):
 
With
  member [Week Count] as 'count(existing([Date].[Fiscal Weeks].[Fiscal Week].members))'
   member [Date].[Fiscal Weeks].[CM] as 'AGGREGATE({[Date].[Fiscal Weeks].[Fiscal Week].&[47]&[2004],[Date].[Fiscal Weeks].[Fiscal Week].&[48]&[2004],[Date].[Fiscal Weeks].[Fiscal Week].&[49]&[2004],[Date].[Fiscal Weeks].[Fiscal Week].&[50]&[2004]})'  
select
  {[Week Count]} on columns
from
  [Adventure Works]
 
  where
  [Date].[Fiscal Weeks].[CM]
 
Here we are attempting to count the existing fiscal weeks in slicer.  This is useful to get a per-week average for another member. Many applications generate queries in this manner (such as Oracle OBIEE).  This query returns the correct result of (4) weeks.
Now let's put a twist in it.  What if the querying application submits the query in the following manner:
 
With
  member [Week Count] as 'count(existing([Date].[Fiscal Weeks].[Fiscal Week].members))'
   member [Customer].[Customer Geography].[CM] as 'AGGREGATE({[Date].[Fiscal Weeks].[Fiscal Week].&[47]&[2004],[Date].[Fiscal Weeks].[Fiscal Week].&[48]&[2004],[Date].[Fiscal Weeks].[Fiscal Week].&[49]&[2004],[Date].[Fiscal Weeks].[Fiscal Week].&[50]&[2004]})'  
select
  {[Week Count]} on columns
from
  [Adventure Works]
 
  where
  [Customer].[Customer Geography].[CM]
 
Here we are attempting to count the existing fiscal weeks in slicer.  However, the AGGREGATE member is built on a different dimension (in name) than the one EXISTING is trying to detect.  In this case the query returns (174) which is the total number of [Date].[Fiscal Weeks].[Fiscal Week].members defined in the dimension.
 
Now another twist, the AGGREGATE member will be named appropriately and contain the hierarchy we are trying to detect with EXISTING but it will be cross-joined with another hierarchy:
 
With
  member [Week Count] as 'count(existing([Date].[Fiscal Weeks].[Fiscal Week].members))'
   member [Date].[Fiscal Weeks].[CM] as 'AGGREGATE({[Date].[Fiscal Weeks].[Fiscal Week].&[47]&[2004],[Date].[Fiscal Weeks].[Fiscal Week].&[48]&[2004],[Date].[Fiscal Weeks].[Fiscal Week].&[49]&[2004],[Date].[Fiscal Weeks].[Fiscal Week].&[50]&[2004]}*
   {[Customer].[Customer Geography].[Country].&[Australia],[Customer].[Customer Geography].[Country].&[United States]})' 
select
  {[Week Count]} on columns
from
  [Adventure Works] 
  where
  [Date].[Fiscal Weeks].[CM]
 
Once again, we are attempting to count the existing fiscal weeks in slicer.  Again, in this case the query returns (174) which is the total number of [Date].[Fiscal Weeks].[Fiscal Week].members defined in the dimension. However, in 2008 R2 this query returns the correct result of 4 and additionally , the following will return the count of existing countries as well (2):
 
With
  member [Week Count] as 'count(existing([Date].[Fiscal Weeks].[Fiscal Week].members))'
  member [Country Count] as 'count(existing([Customer].[Customer Geography].[Country].members))'
 member [Date].[Fiscal Weeks].[CM] as 'AGGREGATE({[Date].[Fiscal Weeks].[Fiscal Week].&[47]&[2004],[Date].[Fiscal Weeks].[Fiscal Week].&[48]&[2004],[Date].[Fiscal Weeks].[Fiscal Week].&[49]&[2004],[Date].[Fiscal Weeks].[Fiscal Week].&[50]&[2004]}*
   {[Customer].[Customer Geography].[Country].&[Australia],[Customer].[Customer Geography].[Country].&[United States]})' 
select
  {[Week Count]} on columns
from
  [Adventure Works] 
  where
  [Date].[Fiscal Weeks].[CM]
 
2008 R2 seems to work as long as the AGGREGATE member is on at least one of the hierarchies attempting to be detected (i.e. [Date].[Fiscal Weeks] or [Customer].[Customer Geography]). If not, it seems that the engine cannot find a "point of entry" into the aggregate member and ignores it for calculated members.
 
One way around this would be to put the sets from the AGGREGATE member explicitly in the WHERE clause (slicer).  I realize this is only supported in SSAS 2005 and 2008.  However, after talking with Chris Webb (his blog is here and I highly recommend following his efforts and musings) it is a far more efficient way to filter/slice a query:
 
With
  member [Week Count] as 'count(existing([Date].[Fiscal Weeks].[Fiscal Week].members))'
   select
  {[Week Count]} on columns
from
  [Adventure Works] 
  where
  ({[Date].[Fiscal Weeks].[Fiscal Week].&[47]&[2004],[Date].[Fiscal Weeks].[Fiscal Week].&[48]&[2004],[Date].[Fiscal Weeks].[Fiscal Week].&[49]&[2004],[Date].[Fiscal Weeks].[Fiscal Week].&[50]&[2004]}
  ,{[Customer].[Customer Geography].[Country].&[Australia],[Customer].[Customer Geography].[Country].&[United States]})
 
This query returns the correct result of (4) weeks.  Additionally, we can count the cross-join members of the two hierarchies in the slicer:
 
With
  member [Week Count] as 'count(existing([Date].[Fiscal Weeks].[Fiscal Week].members)*existing([Customer].[Customer Geography].[Country].members))'
   select
  {[Week Count]} on columns
from
  [Adventure Works] 
  where
  ({[Date].[Fiscal Weeks].[Fiscal Week].&[47]&[2004],[Date].[Fiscal Weeks].[Fiscal Week].&[48]&[2004],[Date].[Fiscal Weeks].[Fiscal Week].&[49]&[2004],[Date].[Fiscal Weeks].[Fiscal Week].&[50]&[2004]}
  ,{[Customer].[Customer Geography].[Country].&[Australia],[Customer].[Customer Geography].[Country].&[United States]})
 
We get the correct number of (8) here.

 

  • Share This Post:
  • Share on Twitter
  • Share on Facebook
  • Share on Technorati

Posted On Friday, June 11, 2010 9:43 PM | Feedback (1) | Filed Under [ Analysis Services ]

Wednesday, November 04, 2009

PerformancePoint Filter Query suggestions

 
 As I have moved forward with PerformancePoint I have made extensive use of queries for filter building. This approach has a couple of distinct advantages over the other options.
  1. Allows filters to be dynamic and pick up new data as it becomes available. One good example of this is a date filter. We use it to keep our filters as a dynamic set of n time periods from the current time period back (i.e. lastperiods(52,exists([Date].[Week].[Week].members,[Date].[Current Date].&[1]).item(0)))
  2. It allows the query to be tuned or tweaked to exclude empty members (there are some pitfalls here, noted later).
 
If you need to build a tree filter based on a hierarchy, a good query is as follows: HIERARCHIZE([Dimension].[Hierarchy].members).
 
A couple of things to keep in mind.
 
  1. Keep your query results below 500 members. PerformancePoint doesn't seem to handle more than that very well, although I have seen cases where we have gone above that.
  2. Be careful using MDX funtions that require cell inspection (such as NONEMPTY, filter, etc). This is problematic if you have a large cube/measure group. This can cause a scan of the measure group partitions to evaluate your query. NONEMPTY is especially bad about this and is compounded because you cannot chain filters (i.e. I can't pass a date filter into an item filter).
    1. Use an additional member filter in the query that causes the query to be more specific (i.e. nonempty items for the current week's sales)
      1. NONEMPTY(HIERARCHIZE([Dimension].[Hierarchy].members),(exists([Date].[Week].[Week].members,[Date].[Current Date].&[1]).item(0),[Sales Amount]))
    1. Use a very small measure group (at the query's required granularity) to satisfy the request. This requires cube changes and may not be an option. I don't recommend this as a first option because those measure groups would have little usefulness to other analysis being done in the cube as well as the down stream impact on the ETL process and cube processing.
  • Share This Post:
  • Share on Twitter
  • Share on Facebook
  • Share on Technorati

Posted On Wednesday, November 04, 2009 1:14 PM | Feedback (0) |

SQL Azure shows promise for small data applications - not there for BI yet.

 

I am currently sitting in a seminar on BI in the Cloud by John Welch at the PASS Summit 2009.  SQL Server Azure looks promising for a variety of applications.  Currently it only supports SQL Server relational database services but future plans to include the BI stack of SQL Server. 
           I have yet to see how loading of data is handled and security is handled but it seems to have a lot of potential for small data sets (< 10 GB). 
 
Couple of points:
  1. Apparently data manipulation is slower than normal SQL Server. 
  2. Queries are limited to 5 minutes
  3. Max database size is 10GB
  4. Bulk Insert not available yet but available in an upcoming CTP
  5. No control over physical database implementation (files,filegroups,partitions, etc…)
  6. No Windows auth supported
  • Share This Post:
  • Share on Twitter
  • Share on Facebook
  • Share on Technorati

Posted On Wednesday, November 04, 2009 12:49 PM | Feedback (1) |

Tuesday, August 12, 2008

PerformancePoint Dashboard Performance, KPI Properties and Parameter Passing - Issue Found - Solved in SP1

In PerformancePoint 2007 you have the capability to setup custom properties and pass them into dashboard objects. This is especially useful for simplifying dashboards where you are linking scorecard KPIs to report views. Alyson Powell Erwin outlines how to do this in the PerformancePoint MSDN blog:
 
 
This is very handy. By setting up a custom property that contains measure names for the KPI you can use one report for each data source and pass the value of the custom property into the [Measures] endpoint on the report view. If you have a scorecard with 20 KPIs from the same data source and each KPI displays two report views it trims the number of reports required from 40 to 2.
 
Issue Found - Solved in SP1
 
However, I noticed something going awry when doing my performance testing while running Profiler against the Analysis Services database. Each time I clicked on a KPI it would execute the query for the two report views as expected AND submit a query for the scorecard with NO FILTERS. This basically amounts to running the scorecard query for all time because we pass the org and date filters in from the dashboard filter selections. This obviously has a negative effect on overall query performance from the Analysis Services server. However, in testing PerformancePoint 2007 SP1 I found that this problem has been addressed although I saw no mention of it specifically in the SP1 release notes. It is easy for this issue to go unnoticed because of the good caching done by Analysis Services - provided there is sufficient CPU and memory on the Analysis Services server.
  • Share This Post:
  • Share on Twitter
  • Share on Facebook
  • Share on Technorati

Posted On Tuesday, August 12, 2008 10:45 PM | Feedback (0) |

Tuesday, March 25, 2008

Filtering PerformancePoint Reports and Scorecards by Dimensions Different Than the Filter

In my earlier post NON EMPTY Filtering on PerformancePoint Scorecard, I mentioned that you may be able filter scorecards and/or reports by dimensions that are not the same as the filter by using a cross join or other MDX querying technique.  This is true if the other dimension is in the same cube as the query or member selection used by the filter.  Consider the following:

Cube:                 Sales

Dimensions:     Organizational Hierarchy (lowest level is Site)

                            Site (this contains attributes that only apply to the site)

 

Filter: Org (Member Selection of Organizational Hierarchy)

 

Report: Sales by State/Province

 

Approach 1:  We could put the Organizational Hierarchy dimension into the Sales by State/Province and pass the Org filter value directly into that.  However, this may limit the flexibility of the report and its applicability to other dashboards.  Additionally, if we were to use the report without passing a value into it we may complicate the resulting query and unnecessarily degrade the report’s query performance.

 

Approach 2: We apply a MDX function to the filter when it is mapped to the report filter to only show States/Provinces that apply to the Organizational Hierarchy (which are often geographically defined).  The resulting Filter Formula would look something like the following:

 

NONEMPTY([Site].[Geography].[State].members,<<SourceValue>>)

 

A key cube design reminder here – make sure your cubes have a high performing default measure if you do not specify a specific measure in the NONEMPTY filter expression like the following:

 

NONEMPTY(NONEMPTY([Site].[Geography].[State].members,<<SourceValue>>),[Measures].[Site Count])

 

 I like to use something that only returns one row in the base fact table per granular dimension value (i.e. one row per site).   I generally set aside a measure group just to support this type of filtering or I set up a cube just to support PerformancePoint filters (to be discussed in a later post).  This requires additional education on the part of publisher so they understand that the measure group or cube has very little analytical value.  At a bare minimum our aggregation design needs to support the dimensions involved in the filter expression unless we have very small cubes.

One thing that would be nice would be if we could filter filters (no that is not a typo). Ideally, if we had a date filter, we would want to filter a MDX query or Member Selection filter (such as org) to only show members that have values for the applied date filter.  This has value in industry Same-Store or Comparables dashboards.


 

  • Share This Post:
  • Share on Twitter
  • Share on Facebook
  • Share on Technorati

Posted On Tuesday, March 25, 2008 9:09 PM | Feedback (1) |

Wednesday, February 13, 2008

Every Data Warehouse Solution Needs one of these

I have had the pleasure (or pain, depending on your point of view) of working on several large data warehousing projects over the years and I have found that there are many key ingredients to a successful project.  Most of these are well documented by Ralph Kimball and Bill Inmon so I won't reiterate those here.  However, as an architect I have come to really value the role of the Business Analyst in these projects.  It is a no brainer that the Business Analyst is the go-to person for requirements but I believe that there should be a closer tie in the project between the architect(s) and the analyst(s).  Often, we architects tend to guide questions by our knowledge of the data and infrastructure.  This by itself may not lead to the best solution.  While I have learned to put on the "business" hat over the years, having an analyst (or more than one) on the project who can focus on asking questions from a business perspective not only frees me (the architect) up to focus on making a scalable, maintainable and extensible solution.  The resulting brainstorming with the analysts fosters an environment where we can say "how do we solve for this?" instead of "we can't do that because the data...."  Most questions can be answered and most problems can be solved.  Having the analysts and architects working closely on the same team (not just treating the analyst as a "customer") results in a solution that answers the business's questions more completely while still maintaining a good solution. BTW - every team needs a Workdog!
  • Share This Post:
  • Share on Twitter
  • Share on Facebook
  • Share on Technorati

Posted On Wednesday, February 13, 2008 10:45 PM | Feedback (1) |

Powered by: