29 May 2014

Creating and Maintaining Partition Cube

You can create additional partitions for a measure group to improve processing performance. Having multiple partitions allows you to allocate fact data across a corresponding number of physical data files on local as well as remote servers.

In Analysis Services, partitions can be processed independently and in parallel, giving you more control over processing workloads on the server.

Partitions can be created in SQL Server Data Tools during model design, or after the solution is deployed using SQL Server Management Studio or XMLA. We recommend that you choose one approach only.


 If you alternate between tools, you might find that changes made to a deployed database in SQL Server Management Studio are overwritten when you subsequently redeploy the solution from SQL Server Data Tools.

Notes
Check whether you have either the business intelligence edition or enterprise edition. Standard edition does not support multiple partitions. To check the edition, right-click the server node in SQL Server Management Studio and choose Reports | General. For more information about feature availability,

From the outset, it's important to understand that partitions must share the same aggregation design if you want to merge them later. Partitions can be merged only if they have identical aggregation designs and storage modes.

Explore the data in Data Source View (DSV) to understand the range and depth of the data you are partitioning. For example, if partitioning by date, you can sort on a date column to determine the upper and lower bounds of each partition.

The most important consideration when creating partitions is to segment the data so that there are no duplicate rows. Data must be stored in one, and only one, partition to avoid double counting any rows. As such, it's common to partition by DATE so that you can define clear boundaries between each partition.
You can use either technique to distribute the fact data across multiple partitions. The following techniques can be used to segment the data.

Choose an approach

Technique
Recommendations
Partitions can be sourced from SQL queries. During processing, the SQL query is to retrieve the data. The query's WHERE clause provides the filter that segments the data for each partition. Analysis Services generates the query for you, but you must fill in the WHERE clause to properly segment the data.

The primary advantage of this approach is the ease with which you can partition data from a single source table. If all of the source data originates from a large fact table, you can build queries that filter that data into discrete partitions, without having to create additional data structures in the Data Source View (DSV).

One disadvantage is that using queries will break the binding between the partition and the DSV. If you later update the DSV in the Analysis Services project, such as adding columns to the fact table, you must manually edit the queries for each partition to include the new column. The second approach, discussed next, does not have this disadvantage.
You can bind a partition to a table, named query, or view in the DSV. As the basis of a partition, all three are functionally equivalent. The entire table, named query, or view provides all of the data to a single partition.

Using a table, view, or named query places all of the data selection logic in the DSV, which can be easier to manage and maintain over time. An important advantage to this approach is that table bindings are preserved.

If you update the source table later, you do not have to modify the partitions that use it. Secondly, all of the tables, named queries and views exist in a common work space, making updates more convenient than having to open and edit partition queries individually


No comments:

Post a Comment