My blog is moving
I've neglected blogging far too long. I had what you might call a crisis of faith. That and a ton of work! Time to reboot, and I think a more modern blog platform is in order.
So I'm starting again here: https://davidwimbush.wordpress.com/.
I'd like to thank Tony and all at SQL Blogcasts for enabling me to get started.
http://sqlblogcasts.com/blogs/davidwimbush/archive/2015/06/27/my-blog-is-moving.aspx
Tags: goodbye
SQL Developer Opportunity – London and work from home
The killer feature in SQL Server 2016
http://sqlblogcasts.com/blogs/simons/archive/2015/06/10/the-killer-feature-in-sql-server-2016.aspx
Creating test machines in VSO – slight bug – fixed
Tags: Continuous Integration Visual Studio Online ALM
Dynamic Partition Creation in SSAS multidimensional using the SSAS Partition Manager |
There are plenty of blog entries out there that outline how to dynamically create partitions in SSAS multidimensional databases. However, all of them suffer the same drawback: they always hard-code some part of the logic into the programming language in the form of hard-coded names or hard-coded partition queries. None of them provide a generic solution that you could just pick up and apply to any project. Perhaps this is a reflection of the usual implementation language of choice: SSIS. Let's face it, SSIS is famous for being brittle; packages break when moved to a new environment. They are rarely reusable without modification. Generating SSIS packages from BIML has solved some of these problems, but even BIML does not handle C# script tasks well; it is rather like painting a room through a keyhole!
Like many business intelligence consultants, I take example scripts from project to project, each time extending their functionality a little more or adapting them to meet new requirements. For many years, I have used C# and AMO to create partitions and process cubes. Indeed, I have presented on this at SQLBits. I do find the AMO object model extremely intuitive and C# much easier to code than doing the equivalent partition creation logic in an SSIS package.
After several iterations, I now have an entirely generic piece of C# code which will work in any situation. The only configuration required is to change the content of a set of tables and edit two views so they generate the correct MDX member keys to match the date dimension in your cube. Fortunately, this is very simple and very much in the realm of the average business intelligence consultant as it is pure T-SQL.
Once configured, the SSAS Partition Manager will add partitions to all cubes in all databases listed in the SsasPartitions view, so it will work out of the box even in multi-cube, multi-database environments. The application will also delete partitions that are no longer required, thus removing redundant data from the cube.
To illustrate, I have chosen to partition each measure group in the AdventureWorks sample cube by different criteria. So the Internet Customers measure group is partitioned by year, Internet Sales by quarter, Internet Orders by month, and Reseller Orders by week. Finally, the Sales Order measure group is partitioned by a hybrid scheme whereby years 2014 onwards have monthly partitions and prior years are partitioned by quarter. Of course, all this partitioning is completely over the top for the tiny amount of data held by the AdventureWorks sample cube, but it does illustrate the power of the SSAS Partition Manager.
All this is encapsulated in the SsasPartitionDefinitions view which allows you to review your configuration (shown below).
Note that the PartitionStartDate is set to 2011-01-01 for all partition schemes, but this could easily be changed by editing the SsasMeasureGroup table.
The PartitionSliceIsRange column is a powerful new feature of my latest implementation and discussed below.
Correctly setting the partition slice property of each partition helps the storage engine eliminate the partitions that do not contain relevant data. Referred to as partition elimination, this greatly increases the performance of MDX queries as not as much data needs to be read by the SSAS storage engine. This is one of the main reasons why cubes should be partitioned. However, this does rely on the end-user using the dimension upon which the partitioning is based in their MDX queries. Partitioning by the date dimension is the typical approach. But be warned, cubes often contain several date dimensions all playing different roles (e.g. order date, delivery date, ship date etc.), so be careful to base your partitioning strategy on the right date dimension. I have known situations where the cube designer inadvertently picked the wrong date dimension and so the storage engine could never employ partition elimination. Big mistake! Huge!
The auto-slice functionality in SSAS does not always get it right. During processing, SSAS internally identifies the range of data that is contained in each partition by using the Min and Max DataIDs to set the partition slice. Unfortunately, DataIDs are assigned during dimension processing as new members are encountered and we have no explicit control over how they are allocated, so because SSAS just looks at the minimum and maximum value of the DataID in the partition, you can end up reading partitions that don’t contain relevant data. Thus it is always best practice to explicitly assign a value to the partition slice property of each partition.
You can set the partition slice property of a partition to be a single MDX member, or a set of members. However, the MDX range operator (i.e. the colon :) is not allowed. So, for example, the following two statements work fine:
{[Date].[Calendar].[Month].&[2014]&[4]}
{[Date].[Calendar].[Date].&[20140101],[Date].[Calendar].[Date].&[20140102],[Date].[Calendar].[Date].&[20140103],[Date].[Calendar].[Date].&[20140104],[Date].[Calendar].[Date].&[20140105],-- etc. --,[Date].[Calendar].[Date].&[20140131]}
Whereas the following range statement will cause the partition processing to fail:
{[Date].[Calendar].[Date].&[20140101] : [Date].[Calendar].[Date].&[20140131]}
Of course the above MDX statements were generated from the Adventure Works cube date dimension.
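Because the range operator is disallowed, a range slice has to be written out as an explicit member set. The tool itself is C#, but the expansion logic can be sketched in Python as follows (member keys assume the Adventure Works yyyymmdd date-key convention; the function name is illustrative):

```python
from datetime import date, timedelta

def slice_members(start: date, end: date) -> str:
    """Expand a date range into an explicit MDX member set, since the
    ':' range operator is not allowed in a partition slice property.
    Keys follow the Adventure Works [Date].&[yyyymmdd] convention."""
    days = (end - start).days + 1
    members = (
        f"[Date].[Calendar].[Date].&[{(start + timedelta(d)):%Y%m%d}]"
        for d in range(days)
    )
    return "{" + ",".join(members) + "}"

print(slice_members(date(2014, 1, 1), date(2014, 1, 3)))
# {[Date].[Calendar].[Date].&[20140101],[Date].[Calendar].[Date].&[20140102],[Date].[Calendar].[Date].&[20140103]}
```

In the real tool this string is generated by the SsasPartitions view in T-SQL, not in application code; the sketch just shows the shape of the output.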
If you get your partition slice wrong, you will get an error such as: "Errors in the OLAP storage engine: The restrictions imposed on partition slice were violated". Note that you will ONLY get such errors if the partition contains over 4096 rows, which is the IndexBuildThreshold defined in the msmdsrv.ini file. In the Adventure Works database, partitions rarely get above 5000 rows, so the partition slice setting is ignored. In reality, the Adventure Works database has such low data volumes there is no need to partition the cube, but then it is just a sample database!
For many years now, I have used a set of SQL tables to hold the definition of each partitioning scheme required in the cube. A single view then combines the content of these tables with the date dimension to create a list of partitions that need to be created in the cube. This partition definition view even generates the MDX required to set the partition slice property correctly. Thus the SSAS Partition Manager program is entirely generic; it simply connects to the SSAS database and compares the partitions it finds in each measure group with the list of desired partitions in the partition definition view: SsasPartitions.
Instead of holding one row per partition in a static table, we get the partition definition view to combine the underlying partition definition tables to generate the list of partitions we need. A CROSS JOIN to the date dimension simply fills in the gaps. The view is entirely dynamic, giving you all the partitions you need for today. Tomorrow the view will add on any new partitions as they are required by the partitioning scheme: monthly, quarterly, yearly.
This view is read by the SSAS Partition Manager program which compares the partitions in the cube with the ones listed by the view. If the partition does not exist in the cube, then it is created. Note that the C# program does not update existing partitions. It simply creates new ones or deletes redundant partitions.
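The compare-and-act step above amounts to a set difference. A minimal Python sketch of the idea (the actual implementation is C# against AMO; names here are illustrative):

```python
def plan_changes(existing: set[str], desired: set[str]) -> tuple[set[str], set[str]]:
    """Compare partitions found in a measure group with those listed by
    the SsasPartitions view: create what's missing, delete what's
    redundant, and leave matching partitions untouched (no updates)."""
    to_create = desired - existing   # in the view but not in the cube
    to_delete = existing - desired   # in the cube but no longer in the view
    return to_create, to_delete

# A 2013 quarterly scheme rolling forward into Q4:
create, delete = plan_changes({"2013Q1", "2013Q2", "2012Q4"},
                              {"2013Q1", "2013Q2", "2013Q3", "2013Q4"})
```

Here `create` would be the two new quarters and `delete` the expired `2012Q4` partition, which is exactly the rolling behaviour described above.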
The clever thing about this view is not only how it uses the date dimension to generate a list of all partitions up to the present day, but also how it uses it again to generate the list of keys for MDX ranges. Note the content of the PartitionSlice column in the case of the Internet Orders measure group. This was generated because the PartitionSliceIsRange flag is set to True (1).
The SsasPartitions view only generates partitions between the PartitionStartDate and today's date. Unfortunately, the AdventureWorks date dimension table ends in 2014, so we never see any new partitions being created as we roll over to a new week, month, quarter or year. Clearly this would not be the case in a true production cube!
Once the SSAS Partition Manager has been run against the AdventureWorks cube, this is what you see in SSMS:
I provide the T-SQL for these database objects as a SQL Server Data Tools (SSDT) project called SsasPartitionDefinition which is part of the download for the SSAS Partition Manager hosted on Codeplex. Note that I include DimDate from the AdventureWorksDW2014 database just so my SSDT project compiles correctly.
The Populate_Partition_Definitions.sql file populates the three partition definition tables with the partitioned measure groups from the AdventureWorks cube. This can easily be adapted to your own cube(s).
Note that this table structure supports many measure groups in many cubes in many databases, so with one run of the SSAS Partition Manager you can create partitions in hundreds of cubes, all with the correct partition slice set!
To get yourself up and running with the SSAS Partition Manager this is all you have to do:
All this is explained in detail on the following page: Configuring the SSAS Partition Manager
The first step taken by the SSAS Partition Manager is to validate your configuration. Thus it will check that all the measure groups exist in the relevant cubes in the relevant databases. It even checks the measure groups have a template partition with its where clause set to WHERE 1=0. All this is logged along with the creation/deletion of each partition. So check the log files to ensure your initial configuration is correct.
The SSAS Partition Manager is available for download from Codeplex. I would be glad to have your feedback on the project via the project’s discussions tab.
Tags: SSAS C# SQL Server 2012 AMO SSAS Partition Manager
SSAS Partition Manager for both SSAS Tabular Models and SSAS Multidimensional Cubes
Today I am proud to announce the release of SSAS Partition Manager that can dynamically create partitions in both SSAS Tabular models and SSAS Multidimensional cubes using a wide variety of partitioning schemes (yearly, monthly, weekly etc.).
What is more, for OLAP cubes it correctly sets the MDX in the PartitionSlice property of the partition so you don't have to rely on SQL Server Analysis Services' auto-slice functionality which has known bugs and limitations.
SSAS Partition Manager is a production-ready application which is very easy to configure and can be easily integrated into your deployment scripts and regular data warehouse processing jobs. Indeed, a full set of unit and integration tests has been prepared for the SSAS Partition Manager to ensure the quality of the code.
SSAS Partition Manager provides the ability to both deploy a database (either Tabular or Multidimensional), add the partitions and process the newly deployed database all with one simple command.
Using command-line options you can:
The SSAS Partition Manager is an entirely generic piece of C# code which will work in any situation. With a tiny amount of configuration, you can start using the SSAS Partition Manager with your multidimensional cube and/or tabular models very quickly. You should never need to modify the C# code. All configuration is done in T-SQL, so it is very much in the realm of the average business intelligence consultant. No specialist skills required!
The key to understanding the SSAS Partition Manager is to know that it creates partitions based on the content of a view in the SQL database called SsasPartitions which tells it what type of partitions to create and how many. When it connects to the cube or model, it expects to find a template partition which has its where clause set to WHERE 1=0 which it clones to form the new partition. That's it! The SSAS Partition Manager does the rest.
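The clone step can be pictured as swapping the template's placeholder predicate for a real one. A hedged Python sketch (the tool does this in C# via AMO's `Partition.Clone`; the function and query strings below are purely illustrative):

```python
def clone_template_query(template_query: str, where_clause: str) -> str:
    """Build a new partition's source query from the template partition,
    whose WHERE clause is the 'WHERE 1=0' placeholder, by substituting
    the real date-range predicate for the new partition."""
    if "WHERE 1=0" not in template_query:
        raise ValueError("template partition must use the WHERE 1=0 placeholder")
    return template_query.replace("WHERE 1=0", where_clause)

# Illustrative monthly partition for January 2014:
q = clone_template_query(
    "SELECT * FROM FactInternetSales WHERE 1=0",
    "WHERE OrderDateKey BETWEEN 20140101 AND 20140131",
)
```

The `WHERE 1=0` convention means the template itself always processes to an empty partition, so it is safe to leave deployed alongside the real partitions.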
The SSAS Partition Manager starts by reading from its config file the connection string to the SQL Server database and a flag which indicates which server environment to connect to (i.e. DEV, UAT or PROD). It then connects to the SQL Server database and queries the SsasServer table for the name of the SSAS server using the server environment flag. Thus, DEV would connect to your localhost, UAT to your UAT server, and so on.
The program then connects to the SSAS server using the Analysis Management Objects (AMO) library Microsoft.AnalysisServices.
The SSAS Partition Manager then queries the SsasPartitions view to find out what partitions are needed. This view lists all the partitions in all the measure groups in all the cubes/models in all the databases on the SSAS server. Thus the SSAS Partition Manager can handle a multi-cube/model, multi-database environment very easily.
The SSAS Partition Manager will add partitions to all cubes/models in all databases listed in the SsasPartitions view, so it will work out of the box even in multi-cube, multi-database environments. The SSAS Partition Manager will also delete partitions which are not listed in the SsasPartitions view with the exception of any template partition (i.e. a partition with the word template in its name). Thus you can implement a rolling-partition scheme whereby old data is removed from the cube automatically.
The program logic is comprehensively tested by a full set of unit and integration tests before each release to ensure the quality of the code, so you can rest assured it won't go wrong on your server.
The SSAS Partition Manager relies on the presence of a few key database objects. You can choose which database you put these in yourself. The main thing is that you customize the SsasDateDimWrapper view to work with your own date dimension or equivalent. All this is explained in Configuring the SSAS Partition Manager.
SSAS Partition Manager replaces the OlapPartitionManager which I discussed in previous posts as it is more generic and caters for both Tabular and Multidimensional databases.
So read the documentation and go to the download page and start using it today!
Tags: SSAS C# AMO Multidimensional Tabular SSAS Partition Manager
PowerQuery – The power of M
http://sqlblogcasts.com/blogs/simons/archive/2015/05/18/powerquery-the-power-of-m.aspx
Tags: BI Excel PowerBI PowerQuery
SSAS: Automating OLAP cube deployment, dynamic partitioning and processing
Imagine being able to deploy your OLAP database, dynamically partition the cube and then process the new database all in one step from an .asDatabase file with minimal configuration. Well now you can! Today’s release of the OlapPartitionManager provides a set of features which allow you to do just that. What is more, for the first time you can easily integrate the build and deployment of OLAP databases into your continuous integration tasks so you can write regression tests that check the figures in your cube.
OLAP cube deployment is usually done using the Analysis Services Deployment Wizard. However, the wizard only deploys the partitions that were part of the original cube design; it will not dynamically add the new partitions required by your partitioning scheme.
I introduced the OlapPartitionManager in my previous post: Dynamic Partition Creation in SSAS multidimensional using the OlapPartitionManager generic C# AMO app. The OlapPartitionManager allows each measure group to have its own partitioning scheme and out of the box it supports the following partitioning schemes.
Hybrid partitioning schemes are where some years use one scheme and other years use another. For example, previous years could be partitioned by quarter whereas the current year is partitioned by day. Matrix partitioning schemes are where you partition by date and some other dimension. Note that the last two options require some configuration on your part which is all explained on Configuring the OlapPartitionManager.
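To make the hybrid idea concrete, here is a small Python sketch of how such a scheme enumerates partition periods (the real logic lives in the OlapPartitions T-SQL view against the date dimension; the function, labels and switch-year parameter are illustrative):

```python
from datetime import date

def hybrid_periods(start: date, today: date, switch_year: int) -> list[str]:
    """Hybrid partitioning sketch: quarterly partitions for years before
    switch_year, monthly partitions from switch_year onwards, generating
    only periods between the start date and today."""
    periods: list[str] = []
    for year in range(start.year, today.year + 1):
        if year < switch_year:
            # Quarterly: stop at the current quarter in the current year.
            last_q = 4 if year < today.year else (today.month - 1) // 3 + 1
            periods += [f"{year}Q{q}" for q in range(1, last_q + 1)]
        else:
            # Monthly: stop at the current month in the current year.
            last_m = 12 if year < today.year else today.month
            periods += [f"{year}{m:02d}" for m in range(1, last_m + 1)]
    return periods
```

For example, with a 2014 switch year and a "today" of mid-February 2014, this yields the four 2013 quarters plus the January and February 2014 monthly partitions, mirroring the hybrid scheme described above.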
All of this power is easy to configure by changing two SQL views and the content of three tables - well within the realms of the average business intelligence consultant!
Today’s release takes the project a step further by allowing you to deploy a fresh copy of your OLAP database, partition all of the cubes and then process the database using one simple command-line such as:
OlapPartitionManager.exe /d AdventureWorks /a "C:\Dev\Samples\AdventureWorks.asdatabase" /p ProcessDeployed
The above command-line will deploy a cube called AdventureWorks using the AdventureWorks.asdatabase OLAP database definition file onto the SSAS server listed in the OlapServer table. The OlapPartitionManager will then dynamically add all the partitions listed in the OlapPartitions view. Finally it will process the new OLAP database. That’s it! Dead simple! Exactly how to set everything up is all explained on Configuring the OlapPartitionManager.
There is no need for you to delve into the C# code behind the OlapPartitionManager as all the functionality is easily configurable as explained on Configuring the OlapPartitionManager. However, if you do wish to delve a little deeper, you will notice that the new DeployDatabase method is simply a new implementation of some code I released as part of this blog post: SSIS: Deploying OLAP cubes using C# script tasks and AMO
Deploy a new OLAP database and add partitions:
OlapPartitionManager.exe /d AdventureWorks /a "C:\Dev\Samples\AdventureWorks.asdatabase"
Deploy a new OLAP database, add partitions and process the newly deployed OLAP database:
OlapPartitionManager.exe /d AdventureWorks /a "C:\Dev\Samples\AdventureWorks.asdatabase" /p ProcessDeployed
Deploy a new OLAP database, override the standard connection string to the Data Mart, add partitions and process the newly deployed OLAP database:
OlapPartitionManager.exe /d AdventureWorks /a "C:\Dev\Samples\AdventureWorks.asdatabase" /c "Provider=SQLNCLI11.1;Data Source=localhost;Integrated Security=SSPI;Initial Catalog=AdventureWorksDW2014" /p ProcessDeployed
Perform a full process on all OLAP databases found in the OlapPartitions view:
OlapPartitionManager.exe /p ProcessAll
All of the arguments are explained on OlapPartitionManager Command-Line Options
So download your copy and start using the OlapPartitionManager today!
Tags: SSAS C# OlapPartitionManager
Continuous delivery and Azure SQL Database – achieving the impossible
Tags: Agile Security Azure SSDT Continuous Integration Automation
Dynamic Partition Creation in SSAS multidimensional using the OlapPartitionManager generic C# AMO app |
There are plenty of blog entries out there that outline how to dynamically create partitions in SSAS multidimensional databases. However, all of them suffer the same drawback: they always hard-code some part of the logic into the programming language in the form of hard-coded names or hard-coded partition queries. None of them provide a generic solution that you could just pick up and apply to any project. Perhaps this is a reflection of the usual implementation language of choice: SSIS. Let's face it, SSIS is famous for being brittle; packages break when moved to a new environment. They are rarely reusable without modification. Generating SSIS packages from BIML has solved some of these problems, but even BIML does not handle C# script tasks well; it is rather like painting a room through a keyhole!
Like many business intelligence consultants, I take example scripts from project to project, each time extending their functionality a little more or adapting them to meet new requirements. For many years, I have used C# and AMO to create partitions and process cubes. Indeed, I have presented on this at SQLBits. I do find the AMO object model extremely intuitive and C# much easier to code than doing the equivalent partition creation logic in an SSIS package.
After several iterations, I now have an entirely generic piece of C# code which will work in any situation. The only configuration required is to change the content of a set of tables and edit two views so they generate the correct MDX member keys to match the date dimension in your cube. Fortunately, this is very simple and very much in the realm of the average business intelligence consultant as it is pure T-SQL.
Once configured the OlapPartitionManager will add partitions to all cubes in all databases listed in the OlapPartitions view, so it will work out of the box even in multi-cube, multi-database environments. The application will also delete partitions that are no longer required, thus removing redundant data from the cube.
To illustrate, I have chosen to partition each measure group in the AdventureWorks sample cube by different criteria. So the Internet Customers measure group is partitioned by year, the Internet Sales partition by quarter, Internet Orders and Reseller Orders partitioned by month, the Reseller Orders partitioned by week. Finally, the Sales Order measure group is partitioned by a hybrid scheme whereby years 2014 onwards have monthly partitions and prior years are partitioned by quarter. Of course, all this partitioning is completely over the top for the tiny amount of data held by the AdventureWorks sample cube, but does illustrate the power of the OlapPartitionManager.
All this is encapsulated in the OlapPartitionDefinitions view which allows you to review your configuration (shown below).
Note that the PartitionStartDate is set to 2011-01-01 for all partition schemes, but this could easily be changed by editing the OlapMeasureGroup table.
The PartitionSliceIsRange column is a powerful new feature of my latest implementation and is discussed below.
Correctly setting the partition slice property of each partition helps the storage engine eliminate partitions that do not contain relevant data. Referred to as partition elimination, this greatly improves the performance of MDX queries because the SSAS storage engine reads far less data, and it is one of the main reasons why cubes should be partitioned. However, it does rely on end users including the dimension on which the partitioning is based in their MDX queries. Partitioning by the date dimension is the typical approach, but be warned: cubes often contain several date dimensions playing different roles (e.g. order date, delivery date, ship date), so be careful to base your partitioning strategy on the right date dimension. I have known situations where the cube designer inadvertently picked the wrong date dimension, so the storage engine could never employ partition elimination. Big mistake! Huge!
The auto-slice functionality in SSAS does not always get it right. During processing, SSAS internally identifies the range of data contained in each partition by using the minimum and maximum DataIDs to set the partition slice. Unfortunately, DataIDs are assigned during dimension processing as new members are encountered, and we have no explicit control over how they are allocated. Because SSAS looks only at the minimum and maximum DataID in the partition, you can end up reading partitions that do not contain relevant data. It is therefore always best practice to explicitly assign a value to the partition slice property of each partition.
You can set the partition slice property of a partition to a single MDX member or a set of members. However, the MDX range operator (i.e. the colon) is not allowed. So, for example, the following two statements work fine:
{[Date].[Calendar].[Month].&[2014]&[4]}
{[Date].[Calendar].[Date].&[20140101],[Date].[Calendar].[Date].&[20140102],[Date].[Calendar].[Date].&[20140103],[Date].[Calendar].[Date].&[20140104],[Date].[Calendar].[Date].&[20140105],-- etc. --,[Date].[Calendar].[Date].&[20140131]}
Whereas the following range statement will cause the partition processing to fail:
{[Date].[Calendar].[Date].&[20140101] : [Date].[Calendar].[Date].&[20140131]}
Of course, the above MDX statements were generated from the Adventure Works cube's date dimension.
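Because the range operator is off-limits, a valid slice for a range of dates has to enumerate every member explicitly. The following Python sketch shows one way to expand a date range into such a set; it is an illustration of the idea, not the tool's actual code, though the member keys follow the Adventure Works convention of `[Date].[Calendar].[Date].&[yyyymmdd]`.

```python
from datetime import date, timedelta

def slice_set(start: date, end: date) -> str:
    """Build an explicit MDX member set for a date range, since the ':'
    range operator is not allowed in a partition slice."""
    members, d = [], start
    while d <= end:
        members.append(f"[Date].[Calendar].[Date].&[{d:%Y%m%d}]")
        d += timedelta(days=1)
    return "{" + ",".join(members) + "}"

print(slice_set(date(2014, 1, 1), date(2014, 1, 3)))
# → {[Date].[Calendar].[Date].&[20140101],[Date].[Calendar].[Date].&[20140102],[Date].[Calendar].[Date].&[20140103]}
```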
If you get your partition slice wrong, you will get an error such as: "Errors in the OLAP storage engine: The restrictions imposed on the partition slice were violated". Note that you will ONLY get such errors if the partition contains over 4096 rows, which is the IndexBuildThreshold defined in the msmdsrv.ini file. In the Adventure Works database, partitions rarely get above 5000 rows, so the partition slice setting is ignored. In reality, the Adventure Works database has such low data volumes that there is no need to partition the cube, but then it is just a sample database!
For many years now, I have used a set of SQL tables to hold the definition of each partitioning scheme required in the cube. A single view then combines the content of these tables with the date dimension to create a list of partitions that need to be created in the cube. This partition definition view even generates the MDX required to set the partition slice property correctly. Thus the OlapPartitionManager program is entirely generic; it simply connects to the SSAS database and compares the partitions it finds in each measure group with the list of desired partitions in the partition definition view: OlapPartitions.
Instead of holding one row per partition in a static table, we get the partition definition view to combine the underlying partition definition tables to generate the list of partitions we need. A CROSS JOIN to the date dimension simply fills in the gaps. The view is entirely dynamic, giving you all the partitions you need for today. Tomorrow the view will add on any new partitions as they are required by the partitioning scheme: monthly, quarterly, yearly.
This view is read by the OlapPartitionManager program which compares the partitions in the cube with the ones listed by the view. If the partition does not exist in the cube, then it is created. Note that the C# program does not update existing partitions. It simply creates new ones or deletes redundant partitions.
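The compare step itself is just a set difference. Here is a minimal Python sketch of that reconciliation logic (the real tool does this in C# against AMO objects; the function and partition names are illustrative): partitions in the view but not the cube are created, partitions in the cube but no longer in the view are deleted, and matching partitions are left untouched.

```python
def reconcile(existing: set, desired: set):
    """Diff the cube's current partitions against the OlapPartitions view.

    Returns (to_create, to_delete); existing partitions are never updated,
    mirroring the behaviour described above. Sketch only, not the C# code.
    """
    to_create = sorted(desired - existing)   # in the view, missing from the cube
    to_delete = sorted(existing - desired)   # in the cube, dropped from the view
    return to_create, to_delete

create, delete = reconcile(
    existing={"Sales 2013Q4", "Sales 2014Q1"},
    desired={"Sales 2014Q1", "Sales 2014Q2"},
)
# create == ["Sales 2014Q2"], delete == ["Sales 2013Q4"]
```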
The clever thing about this view is not only how it uses the date dimension to generate a list of all partitions up to the present day, but also how it uses it again to generate the list of keys for MDX ranges. Note the content of the PartitionSlice column in the case of the Internet Orders measure group; this was generated because the PartitionSliceIsRange flag is set to True (1).
The OlapPartitions view only generates partitions between the PartitionStartDate and today's date. Unfortunately, the AdventureWorks date dimension table ends in 2014, so we never see any new partitions being created as we roll over to a new week, month, quarter or year. Clearly this would not be the case in a true production cube!
Once the OlapPartitionManager has been run against the AdventureWorks cube, this is what you see in SSMS:
I provide the T-SQL for these database objects as a SQL Server Data Tools (SSDT) project called OlapPartitionDefinition which is part of the download for the OlapPartitionManager hosted on Codeplex. Note that I include DimDate from the AdventureWorksDW2014 database just so my SSDT project compiles correctly.
The Populate_OLAP_Partition_Definitions.sql file populates the three partition definition tables with the partitioned measure groups from the AdventureWorks cube. This can easily be adapted to your own cube(s).
Note that this table structure supports many measure groups in many cubes in many databases, so with one run of the OlapPartitionManager you can create partitions in hundreds of cubes, all with the correct partition slice set!
To get yourself up and running with the OlapPartitionManager this is all you have to do:
All this is explained in detail on the following page: Configuring the OlapPartitionManager
The first step taken by the OlapPartitionManager is to validate your configuration: it checks that all the measure groups exist in the relevant cubes in the relevant databases, and even checks that each measure group has a template partition whose WHERE clause is set to WHERE 1=0. All this is logged, along with the creation and deletion of each partition, so check the log files to ensure your initial configuration is correct.
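A validation pass like that can be sketched in a few lines. The Python below is a hypothetical illustration of the checks described above, not the tool's actual API: it verifies that a configured measure group exists and that its template partition's query ends in WHERE 1=0 so the template never loads data.

```python
def validate_measure_group(template_queries: dict, measure_group: str):
    """Return a list of configuration errors for one measure group.

    template_queries maps measure group name -> template partition query.
    Illustrative sketch of the validation described in the post.
    """
    errors = []
    if measure_group not in template_queries:
        errors.append(f"measure group '{measure_group}' not found in cube")
    elif not template_queries[measure_group].rstrip().upper().endswith("WHERE 1=0"):
        errors.append(f"template partition for '{measure_group}' must end with WHERE 1=0")
    return errors

queries = {"Internet Sales": "SELECT * FROM FactInternetSales WHERE 1=0"}
assert validate_measure_group(queries, "Internet Sales") == []   # valid config
assert validate_measure_group(queries, "Reseller Sales") != []   # missing group
```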
The OlapPartitionManager is available for download from Codeplex. I would be glad to have your feedback on the project via the project’s discussions tab.
Tags: SSAS C# T-SQL SQL Server 2012 SSAS AMO C# AMO OlapPartitionManager |
Non Parallelizable operations in SQL Server |
|
Download Azure Publish Settings file/Publish Profile |
Tags: Azure Automation |
Backup to azure |
http://sqlblogcasts.com/blogs/simons/archive/2015/03/27/backup-to-azure.aspx
|
Managing Encryption Keys in SQL – Azure Key Vault |
|
Comparison of database systems |
http://sqlblogcasts.com/blogs/simons/archive/2014/12/19/comparison-of-database-systems.aspx
|
Unable to determine the workspace – TF.exe – are you using the wrong version of TF.exe |
Tags: Dev ops Build TFS Continuous Integration |
The Data Power Hour - A more engaging community experience |
Closer to home, the Data Power Hour rolls up in Leeds on 22nd Oct! I am really proud of coming up with this concept, and when it was piloted in Manchester it got rave reviews. I wanted to offer the community something different from the usual format of two speakers in a night, and I also wanted it to coincide with Manchester becoming a PASS Chapter. So how does it work....
Well, there is a one-hour session on a given topic, as you would expect at most local events; in Leeds we will be looking at indexing. The second hour is the Power Hour itself, in which attendees work through a one-hour lab with mentoring from experienced user group attendees. This runs in small groups: in Manchester we had five groups of four people plus a mentor. We finished the evening with a summary of what had been done and asked each group to share the key learning point of the evening.
I would be very happy to share the format with other Chapters/User Groups. Indeed, I am currently syncing up with James Rowland Jones to see if there is an opportunity to make use of the material across the PASS organisation. Just to be clear, PASS has not endorsed this in any way; I am working with James to see how this can scale to more Chapters globally and to be put in touch with the relevant people within the organisation.
The type of feedback I received in Manchester was....
I can't stress enough the value that the mentors bring to this. I get a number of Manchester Chapter attendees who always want to help more with the Chapter, and the Power Hour provides the perfect opportunity to make use of enthusiastic attendees who want to help and share knowledge.
So I am looking forward to the second running of this in Leeds, and I will be reaching out to Chapter/User Group leaders in the UK and Europe to see if they will let this event run at their locations in 2014/15.
So you heard the Data Power Hour here first!
And I hope it takes flight!
Thanks
Chris
|
SQL Saturday in Holland |
October and November represent a busy time for the SQL Server community. For those in Holland, next week I will be conducting a precon about Data Warehousing. If you need to learn how to create a Data Warehouse and use the supporting technologies to drive a solution, this precon is for you. By the end of the day you will have seen a Data Warehouse being built utilising SSIS, T-SQL, Data Quality Services, Master Data Services and a few Azure resources to augment your data. If you want more details then visit http://www.sqlsaturday.com/336/eventhome.aspx for information. If Data Warehousing is not your forte, my good friends Denny Cherry, Kevin Kline, Marco Russo and Laerte JR Poltronieri are also conducting a variety of sessions that provide great coverage across the SQL Server stack!
At the same event, I have a one-hour DBA session looking at controlling resources in SQL Server using Resource Governor and Policy-Based Management. These features are often overlooked in the product stack, but they can give you greater control of your SQL Server boxes. I always look forward to this event in Holland. It is one of my favourite events, and I look forward to mingling with the attendees during the event!
See you there
Chris
http://sqlblogcasts.com/blogs/testas/archive/2014/09/26/sql-saturday-in-holland.aspx
|