28 May 2014

Cube Partitioning

Each measure group in a cube is divided into partitions, where a partition defines a portion of the fact data that is loaded into a measure group. SQL Server Analysis Services (SSAS) on SQL Server 2008 R2 Standard Edition allows only one partition per measure group, while multiple partitions are allowed in the Enterprise Edition. 

Partitions are completely transparent to the end user, but they have an important impact on performance and scalability. For example, partitions can be processed separately and in parallel. They can have different aggregation designs, although this was not implemented in System Center 2012 – Service Manager.

You can reprocess a partition without affecting all the other partitions in a measure group. Also, SSAS automatically scans only the partitions that contain the necessary data for a query, which can vastly improve query performance.

Cube partitioning is performed on every data warehouse maintenance job run, which is hourly by default. The specific process module that runs is named Manage Cube Partitions. It always runs after the Create Mart Partitions step.

This dependency data is stored in the infra.moduletriggercondition table.

The main dynamic link library (DLL), which handles partitioning, is in the warehouse utility DLL, Microsoft.EnterpriseManagement.Warehouse.Utility, in the PartitionUtil class. Specifically, there is a ManagePartitions () method in the class that handles all partition maintenance. 

The data warehouse maintenance DLL, Microsoft.EnterpriseManagement.Warehouse.Maintenance, and the data warehouse online analytical processing (OLAP) DLL, Microsoft.EnterpriseManagement.Warehouse.Olap, both call into Microsoft.EnterpriseManagement.Warehouse.Utility to handle partitions during maintenance and cube deployment

This is why actual partition handling is in the common warehouse utility DLL to avoid duplicating logic or code.

Cube Partitioning Maintenance performs the following tasks:
  • Create partitions
  • Delete partitions    
  • Update partition boundaries


To do this, the Structured Query Language (SQL) table etl.TablePartition is read to determine all the fact partitions that have been created for a measure group. The following actions occur:

Start cube processing for each measure group in the cube

Get all partitions from the Each measure group in a cube is divided into partitions, where a partition defines a portion of the fact data that is loaded into a measure group. 

SQL Server Analysis Services (SSAS) on SQL Server 2008 R2 Standard Edition allows only one partition per measure group, while multiple partitions are allowed in the Enterprise Edition. 
Partitions are completely transparent to the end user, but they have an important impact on performance and scalability.

For example, partitions can be processed separately and in parallel. They can have different aggregation designs, although this was not implemented in System Center 2012 – Service Manager. 

You can reprocess a partition without affecting all the other partitions in a measure group. Also, SSAS automatically scans only the partitions that contain the necessary data for a query, which can vastly improve query performance.

Cube partitioning is performed on every data warehouse maintenance job run, which is hourly by default. The specific process module that runs is named Manage Cube Partitions. It always runs after the Create Mart Partitions step. 

This dependency data is stored in the infra.moduletriggercondition table.

The main dynamic link library (DLL), which handles partitioning, is in the warehouse utility DLL, Microsoft.EnterpriseManagement.Warehouse.Utility, in the PartitionUtil class. Specifically, there is a ManagePartitions () method in the class that handles all partition maintenance.

The data warehouse maintenance DLL, Microsoft.EnterpriseManagement.Warehouse.Maintenance, and the data warehouse online analytical processing (OLAP) DLL, Microsoft.EnterpriseManagement.Warehouse.Olap, both call into Microsoft.EnterpriseManagement.Warehouse.Utility to handle partitions during maintenance and cube deployment.

This is why actual partition handling is in the common warehouse utility DLL to avoid duplicating logic or code.

Cube Partitioning Maintenance performs the following tasks:
  • Create partitions
  • Delete partitions
  • Update partition boundaries
To do this, the Structured Query Language (SQL) table etl.TablePartition is read to determine all the fact partitions that have been created for a measure group.

The following actions occur:
  • Start cube processing for each measure group in the cube
  • Get all partitions from the etl.TablePartition table for the measure group
  • Delete any partitions that exist in the measure group but that are missing from the etl.TablePartition table
  • Add any new partitions that have been created and that exist only in the etl.TablePartition table
  • Update any partition that might have changed by matching each partition to the RangeStartDate and RangeEndDate in the etl.TablePartition table
Remember the following about cube processing:

Only measure groups that are targeted at facts contain multiple partitions in SQL Server 2008 R2 Standard Edition. By default, all measure groups and dimensions contain only one partition. Therefore, the partition does not have any boundary conditions.

The partition boundaries are defined by a query binding that is based on date keys that match up to the date keys for the corresponding fact partition in the etl.TablePartition table.

Get all partitions from the etl.TablePartition table for the measure group table for the measure group

Delete any partitions that exist in the measure group but that are missing from the etl.TablePartition table

Add any new partitions that have been created and that exist only in the etl.TablePartition table
Update any partition that might have changed by matching each partition to the RangeStartDate and RangeEndDate in the etl.TablePartition table

Remember the following about cube processing:

Only measure groups that are targeted at facts contain multiple partitions in SQL Server 2008 R2 Standard Edition. By default, all measure groups and dimensions contain only one partition. Therefore, the partition does not have any boundary conditions.

The partition boundaries are defined by a query binding that is based on date keys that match up to the date keys for the corresponding fact partition in the etl.TablePartition table.

No comments:

Post a Comment