Warm tip: This article is reproduced from stackoverflow.com, please click
mdx ssas

Why would you want to use Process Data vs Process Full for processing a partition

发布于 2020-05-08 01:19:03

As stated here : Process data will process data only without building aggregations or indexes. If there is data in the partition, it will be dropped before re-populating the partition with source data. On the other hand, for Process Full Analysis Services will also drop all data in the partition, and then process the partition. So, if attribute hierarchies connected to the partition are set to AttributeHierarchyOptimizedState = true Analysis Services will build indexes for the attribute hierarchy to improve query performance. I was able to confirm this behavior by running following query against Adventure Works database:

SELECT
DIMENSION_NAME, ATTRIBUTE_NAME, ATTRIBUTE_INDEXED,
ATTRIBUTE_COUNT_MIN, ATTRIBUTE_COUNT_MAX
FROM SystemRestrictSchema($system.discover_partition_dimension_stat
        ,DATABASE_NAME = 'AdventureWorksDW2014Multidimensional-EE'
        ,CUBE_NAME = 'Adventure Works'
        ,MEASURE_GROUP_NAME = 'Internet Sales'
        ,PARTITION_NAME = 'Internet_Sales_2013') 

When I Process data ATTRIBUTE_INDEXED column is set to false for all attributes. When I do process full ATTRIBUTE_INDEXED column is set to true for the most of the attributes.

So I was wondering why would someone want to process data only, and hence sacrifice query performance, to gain what? Thanks.

Questioner
vldmrrdjcc
Viewed
21
Ferdipux 2020-03-02 15:23

In a nutshell -- to reduce cube downtime and cube preparation time. At cost, as you stated, query performance degradation in period between ProcessData and ProcessIndex.
Processing usually takes cube offline, at least on committing changes, which can take considerable amount of time. To reduce cube offline time due to processing, especially on big cubes, partitions (parts) of cube with updated data are processed with Process Date option. Having completed this processing, new data in cube is available for queries, but with performance penalty. After that Process Indexes on affected partitions is started, which builds all indexes and aggregations on the objects.
From my experience, it is used and has sense on big projects only. Another finding - running Process Default on cube builds missing indexes and aggregations without specifying exact measure group etc.