Optimize your datatypes for SSAS

Not everyone is aware of the fact that not all data types are supported in SSAS cubes / models. If you use one of the data types that are not supported by SSAS, SSAS will cast all your unsupported data types to supported data types when the cube is being processed.

This will cause your cube processing time to increase. And you will use more CPU than necessary.

Microsoft has provided a list of supported data types here

To avoid these kinds of problems there are several things you can do.

  • Use views as source for your cubes. Then cast the values in these views. SQL Server is better than SSAS to cast values. And it is more likely that your SQL Server has more power that your SSAS server.
  • Make sure your DWH only uses supported SSAS data types. But that’s not to likely is it?
Advertisements

Hardware specification for SSAS

Top things you should consider when buying new hardware for your SSAS server.

CPU
When you query a SSAS multidimensional cube your query might get split between the storage engine (raw data) and the formula engine (complex calculations). The formula engine is single threaded. So it is important to choose a CPU with high frequency. 3.4 Ghz will be much better than 2.1 Ghz.

The CPU should also have as much cache as possible.

Disk
I prefer to always have DAS (Directly Attached Storage) rather than using a SAN. At least 2 fast SSD disks in proper RAID should be perfect. Or you could consider flash memory like Fusion-IO (SanDisk). SSAS multidimensional uses Windows File Cache quite a lot.

I am not using RAID with failover. Because if the server crashes I can create a new virtual server in no-time. And then deploy the cubes to it.

Memory
You should choose as fast memory as possible. And make sure you have enough of it 🙂 SSAS Tabular reads the model into memory. So this will increase you query performance quite a lot.

NUMA
I will not talk too much about NUMA since it’s quite complicated. SSAS Multidimensional should perform quite well using NUMA. SSAS Tabular might have problems on a server with NUMA. I have read some forum posts telling that a query runs faster on a Surface Pro than a high-end server. And that the problem might be because of NUMA. On of the solutions on this is to install a virtual server on top of the server. And then bind one of the sockets to this.

At one of our customer we ended up buying this server (august 2015).

DL380 gen9
2 x Intel Xeon E5-2643v3, 3.4GHz, 6-core, 20MB L3 cache
120GB Memory 2133MHz
2 x 800GB SSD drive
10Gb Ethernet

Please note that this server specification might not suit your needs.

You should always identify bottlenecks before you decide on a new server specification. And then scale the server for your needs.

Marco Russo has written a nice article on “Optimize Hardware Settings for Analysis Services Tabular”

Using subqueries in MDX

Did you know that you can write subqueries in MDX? It might be handy if you have some complex dimension filters.

Lets take a basic sample

SELECT [Measures].[Internet Total Sales] ON 0,
[Date].[Calendar Year].[Calendar Year] ON 1
FROM [Adventure Works Internet Sales Model]

And the result will look like this

Subqueries 1

If you only want to get sales for the year of 2007 it can be achieved with a subquery.

SELECT [Measures].[Internet Total Sales] ON 0,
[Date].[Calendar Year].[Calendar Year] ON 1
FROM 
( 
	SELECT [Date].[Calendar Year].&[2007] ON 0
	FROM [Adventure Works Internet Sales Model]
)

Subqueries 2

Yes I know this can be achieved in simpler ways 🙂 But I had some problems figuring out better samples

Useful filters for your time dimension (DAX / Tabular)

It is always nice to have some time filters in your Tabular model. Because of reuse I try to create most of these filters in the data source. But if that is not an option you can off course add them directly to your Tabular model. So here are some useful samples.

The first one will give you a flag on the last date with sales.

First lets create a new measure.

Last Sales Date:=LASTNONBLANK('Date'[Date]; CALCULATE(SUM('Internet Sales'[Sales Amount])))

And then add a new column in the date dimension. I my sample I will call this column “Last Date With Sales”

=IF('Date'[Date] = calculate('Date'[Last Sales Date];ALL('Date'));1;0)

So if you need the last date with sales in your report you can add the dimension attribute “Last Date With Sales” and then filter the value to 1. This is quite handy if you are using date functions such as TotalYTD or TotalMTD. Please note that you can use the FIRSTNONBLANK functions to. This will return the first date with sales.

The next flag will be a flag that indicates tomorrows date.

To make this flag I will create a new column and add the following expression

=IF('Date'[Date] = TODAY()-1;1;0)

To test this in AdventureWork you first have to figure out how many days to subtract. This can be done by using the following SQL

select datediff(day, '2010/01/02', getdate())

And then change your expression to something like this

=IF('Date'[Date] = TODAY()-1983;1;0)

To get the current month you can add a column like this

=IF(Month('Date'[Date]) = MONTH(TODAY()-1) && YEAR('Date'[Date]) = YEAR(TODAY()-1);1;0)

And to get the current year you can add a column like this

=IF(YEAR('Date'[Date]) = YEAR(TODAY()-1);1;0)

Use you inactive relations with “USERELATIONSHIP” in DAX

In Tabular models you can only have one active relationship between two tables. But you can have multiple inactive ones. In the image below you can see how the active relations are marked with a solid line. And then inactive ones are marked as dotted lines.

Relationsships

So how do you use your inactive ones?

I start with a basic query that gives me the sales per year. Because I did not specify a relationsship the active relationship is used. In this case the order date.

EVALUATE
 (
    ADDCOLUMNS (
        VALUES ( 'Date'[Calendar Year] ),
        "Total sales", 'Internet Sales'[Internet Total Sales]
    )
)

And the result

Relation 1

So if I want to use an inactive relationship too I need to use the USERELATIONSHIP function. Below is an example on how to get both sales per order year and sales per shipment year in the same query.

DEFINE
    MEASURE 'Internet Sales'[Total Sales By ShipmentDate] =
        CALCULATE (
            'Internet Sales'[Internet Total Sales],
            USERELATIONSHIP ( 'Internet Sales'[ShipDateKey], 'Date'[DateKey] )
        )
EVALUATE
 (
    ADDCOLUMNS (
        VALUES ( 'Date'[Calendar Year] ),
        "Total sales", 'Internet Sales'[Internet Total Sales],
        "Total sales by ShipmentDate", 'Internet Sales'[Total Sales By ShipmentDate]
    )
)

And the result will look like this

Relation 2

Search for dimension members in DAX using the “search” function

If you want to search for specific dimension members you can use the “Search” function in DAX. The “Search” function also support wildcards.

So if you want sales for every products that contains the word “Mountain” you can write a query like this.

EVALUATE
FILTER (
    CALCULATETABLE (
        ADDCOLUMNS (
            VALUES ( 'Product'[Product Name] ),
            "Sum of Sales Amount", 'Internet Sales'[Internet Total Sales]
        )
    ),
    'Internet Sales'[Internet Total Sales] <> 0
        && SEARCH ( "*Mountain*", 'Product'[Product Name], 1, 0 ) <> 0
)

And the result will look like this

Search 1

Below is a more complex sample. It returns all products that contains the word “Mountain”. But it also returns the percentage of the total for all products containing the word “mountain”

DEFINE
    MEASURE 'Internet Sales'[Total] =
        CALCULATE (
            'Internet Sales'[Internet Total Sales],
            FILTER (
                ALL ( 'Product'[Product Name] ),
                SEARCH ( "*Mountain*", 'Product'[Product Name], 1, 0 ) <> 0
            )
        )
EVALUATE
FILTER (
    CALCULATETABLE (
        ADDCOLUMNS (
            VALUES ( 'Product'[Product Name] ),
            "Sum of Sales Amount", 'Internet Sales'[Internet Total Sales],
            "Percent of all mountain", FORMAT (
                DIVIDE (
                    'Internet Sales'[Internet Total Sales],
                    'Internet Sales'[Total]
                ),
                "Percent"
            )
        )
    ),
    'Internet Sales'[Internet Total Sales] <> 0
        && SEARCH ( "*Mountain*", 'Product'[Product Name], 1, 0 ) <> 0
)

And the results look like this

Search 2

Search for dimensions members in MDX using inStr()

Sometimes it might be handy to search dimensions members in MDX. The MDX below will show you an example on how to get all products that contains the name ‘Bike’

SELECT [Measures].[Sales Amount] on 0,
( 
  Filter( 
	[Product].[Product].[Product].ALLMEMBERS, 
	Instr( [Product].[Product].currentmember.Properties( 'Member_Caption' ), 'Bike' )  > 0  
  )
) on 1
from [Adventure Works]

And here is some other examples

List products that does not contain the word ‘Bike’

Instr( [Product].[Product].currentmember.Properties( 'Member_Caption' ), 'Bike' )  = 0  

List products that is named ‘Bike Wash – Dissolver’

Instr( [Product].[Product].currentmember.Properties( 'Member_Caption' ), 
'Bike Wash - Dissolver' )  = 1

Cheers,
Sindre