
Dossier Performance Troubleshooting Guide



Dossier Performance Troubleshooting Guide

Since MicroStrategy 10.9, Visual Insight Dashboards (usually known as VI) have been renamed to Dossiers. Throughout this document, the term “Dossier” also refers to Visual Insight Dashboards. This document is intended for MicroStrategy customers who are concerned about their Dossiers' performance during execution and manipulation (changing selectors or filters).

Performance-wise, it is always a good idea to keep the following principle in mind: the larger the dossier (more chapters, pages, visualizations, datasets, and derived elements, and more data displayed in each visualization), the longer it will take to execute on the server and to render on the client. Throughout this document you will find recommendations and considerations that will help improve your dossiers' performance, reducing both execution and manipulation (changing selectors or filters) time.

Table of Contents

Query details
Data combination setting
Cube partitioning
    Enable partitioning in MTDI Cubes
    Enable partitioning in OLAP Cubes
Direct cubes over view reports
Limit the use of derived attributes or metrics
Turn on dossier cache
Delete unused objects
Avoid duplicated datasets
Avoid images on thresholds
Load chapters on demand
Push-down Derived Attributes for live connection cubes
Performance implications of large and complex dossiers
Connect Live performance considerations


Query details

Starting in MicroStrategy 11, a new setting lets you see how long each visualization takes to execute and which query is sent to the Data Warehouse (or which request is made to an in-memory Intelligent Cube). The setting is available on each visualization: when you hover over the visualization, three dots appear on it:

Click the three dots and a menu is displayed, where you will see the Query Details… option.

Once you click on it, the query details are displayed, including how many steps were required to calculate the visualization and how long it took to execute. Here is an example.


The first marked section is Time Spent, which tells you how long the visualization took to execute; the second marked section shows the query that was executed, along with additional information such as the number of rows returned in each step and the tables accessed. With this feature you can find which visualization consumes most of the time in your dossier. Create a copy of the dossier, delete that visualization from the copy, and check whether performance improves; if it does, review the definition of that visualization and find a way to optimize it or simplify its calculations.
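The ranking step above can be sketched in a few lines. This is a hypothetical illustration only: the visualization names and timings are invented, and the timings are assumed to have been read manually from the Query Details window rather than from any real API.

```python
# Hypothetical sketch: given Time Spent values collected from Query Details
# for each visualization, rank them to find the best optimization candidate.
def slowest_visualizations(timings, top_n=3):
    """Return the top_n (name, seconds) pairs, slowest first."""
    return sorted(timings.items(), key=lambda kv: kv[1], reverse=True)[:top_n]

# Illustrative timings, not from a real dossier.
timings = {
    "Revenue by Region": 1.8,
    "Cost Trend": 12.4,      # stands out: candidate for optimization
    "Profit Heatmap": 3.1,
    "KPI Summary": 0.4,
}

for name, seconds in slowest_visualizations(timings):
    print(f"{name}: {seconds:.1f}s")
```

The visualization at the top of this ranking is the one to copy-and-delete first when testing.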

Data combination setting

Make sure the data combination setting is set to “Use minimum relations to join attributes using mostly binary relations (faster, less memory usage)”. To apply this change, follow the steps in the data_combinations video attached to this article.

Cube partitioning

When Intelligent Cubes of 2 million rows or more are used as data sources for a dossier, it is always recommended to turn on partitioning. This improves the publishing time of the Intelligent Cubes and the processing time for derived metrics, derived attributes, and aggregation calculations. There are two main settings involved in cube partitioning:

• Partition Attribute: the partition attribute should be the attribute with the highest cardinality in the cube (that is, the attribute with the most distinct elements; in the Tutorial project, for example, Item or Customer).


• Number of Partitions: the number of partitions is based on the number of CPUs on the Intelligence Server and should always be the number of CPUs divided by 2. For instance, if the Intelligence Server has 16 CPUs, the number of partitions should be 8.
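The two rules above can be expressed as a small helper. This is a conceptual sketch, not a MicroStrategy API: the attribute cardinalities are sample numbers you would obtain yourself (for example with a count-distinct query per attribute).

```python
def recommend_partitioning(cardinalities, cpu_count):
    """Apply the guide's rules: highest-cardinality attribute, CPUs / 2 partitions."""
    partition_attribute = max(cardinalities, key=cardinalities.get)
    num_partitions = max(1, cpu_count // 2)
    return partition_attribute, num_partitions

# Illustrative distinct-element counts per attribute.
cardinalities = {"Year": 5, "Region": 8, "Customer": 120_000, "Item": 35_000}

attr, parts = recommend_partitioning(cardinalities, cpu_count=16)
print(attr, parts)  # → Customer 8
```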

Partitioning depends on the type of cube: MTDI (Multi-table Data Import, introduced in 10.x) or OLAP (the first generation of in-memory cubes, introduced in 9.x). Identifying the type is simple: if the cube was created through the Data Import web interface (the “Add External Data” section in MicroStrategy Web), it is an MTDI cube.

If the cube was created through MicroStrategy Developer using schema objects (attributes) and application objects (metrics), it is an OLAP cube. For more information about MTDI vs. OLAP cubes and their differences, take a look at this KB article and this PDF document.

Enable partitioning in MTDI Cubes

There are two ways to enable cube partitioning in MTDI cubes. One method selects the partition attribute for you automatically; the steps are detailed in the mtdi_partition_automatic video attached to this article. To select the partition attribute manually, follow the steps in the mtdi_partition_manual_attribute video attached to this article. In both cases, once the settings are changed the cube needs to be republished.

Enable partitioning in OLAP Cubes

To enable cube partitioning in OLAP cubes, follow the steps in the olap_partition video attached to this article. After these changes you will have to save the cube and republish it.

Direct cubes over view reports


It is always recommended to use Intelligent Cubes directly in your dossier and to avoid view reports (MicroStrategy reports built on top of Intelligent Cubes). The only reason to use a view report in a dossier is if you need prompts. Whenever you use a view report instead of the Intelligent Cube, the MicroStrategy Analytical Engine (MicroStrategy AE) spends extra time searching for and fetching the report definition before applying it to the Intelligent Cube. When you use the cube directly, the definition is applied to the cube immediately and the AE saves that time.

Limit the use of derived attributes or metrics

Derived elements such as derived attributes and derived metrics are calculated on the fly by the MicroStrategy AE. Even when the data comes from an Intelligent Cube, MicroStrategy still needs to calculate the derived elements, and depending on the data size and complexity these calculations can sometimes take minutes. Therefore, create derived attributes or metrics only when the business definition requires them; whenever possible, define all attributes and metrics in the Intelligent Cube or in the MicroStrategy schema and application objects. This can increase your Intelligent Cube or report execution time, but end-user performance will improve.

For instance, say you have an Intelligent Cube with 2 million rows, 10 attributes, and 5 metrics, and its publishing time is 1 hour. You use this cube in a dossier and create 15 derived attributes and 15 derived metrics. The dossier takes 1 minute to execute, which is not acceptable to end users who want it to run in less than 10 seconds. If you instead define those 15 derived attributes and 15 derived metrics in the Intelligent Cube, the cube publishing time may increase (say, to 1 hour and 20 minutes), but without all the extra run-time calculations the dossier now executes in 5 seconds. All the extra calculations were pushed to the Intelligent Cube, which improves performance for end users.

To identify the derived elements in a dossier, look at the datasets section of the dossier. Derived metrics show an fx icon:
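The trade-off described above can be modeled in plain terms: a derived metric computed on every execution versus the same metric precomputed once when the cube is published. This is a pure-Python illustration of the concept; the row data and the profit formula are invented for the example.

```python
# Sample rows standing in for cube data (illustrative values).
rows = [
    {"revenue": 100.0, "cost": 60.0},
    {"revenue": 250.0, "cost": 90.0},
]

def execute_dossier(data):
    # On-the-fly derived metric: this calculation runs on EVERY dossier
    # execution and manipulation, like a derived metric evaluated by the AE.
    return [{**r, "profit": r["revenue"] - r["cost"]} for r in data]

def publish_cube(data):
    # Pushed into the cube: the calculation runs ONCE at publish time;
    # every later execution just reads the precomputed column.
    for r in data:
        r["profit"] = r["revenue"] - r["cost"]
    return data

cube = publish_cube(rows)
print(cube[0]["profit"])  # → 40.0
```

Publishing takes slightly longer, but each execution afterwards avoids the repeated calculation, mirroring the 1 hour 20 minutes vs. 5 seconds example above.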


Derived attributes also show the fx icon:

Turn on dossier cache

Every request to the Data Warehouse (or Warehouse) costs time and resources on the Intelligence Server. Caches and Intelligent Cubes are an easy way to avoid requesting the same data from the Warehouse repeatedly. Even when your dossier has prompts or security filters, it is recommended to turn on caches: they not only speed up requests but also avoid unnecessary repeated requests to the Warehouse. To explain this in more detail, let's review the following caching scenarios.

Scenario 1


You have a report with a prompt, and user1 logs into MicroStrategy Web and requests it. The Intelligence Server sends the request to the Warehouse; once the data comes back, if caching is enabled, the Intelligence Server generates a cache that is stored both in memory and in a local file. If the same user requests the same report with the same prompt answer, the Intelligence Server does not go to the Warehouse again and instead retrieves the cache stored in memory. In this scenario, whenever a user requests the same data with the same prompt answer, caching prevents your Intelligence Server from sending repeat queries to the Warehouse.

Scenario 2

You have a report with a prompt, and user1 logs into MicroStrategy Web and requests it; the steps of Scenario 1 occur if caching is enabled. Now imagine that during the Warehouse execution another user, user2, logs in and requests the same report with the same prompt answer. If caching is enabled, the job for user2 is queued; once user1's job finishes, the Intelligence Server (IServer) serves user2 the same cache generated for user1. If caching is not enabled, the IServer queries the Warehouse again for the same data, meaning two jobs run concurrently against the Warehouse. Extrapolate this to 10 users requesting the same report with the same prompt answer and you get 10 concurrent Warehouse jobs, putting additional stress and overhead on your resources. With caching enabled, only one job is sent to the Warehouse and the other 9 are queued until the first request finishes.

As these scenarios show, enabling caching can significantly improve your IServer's performance and stability. Caching can also be enabled for documents and Dossiers/VIs. You might ask why the IServer needs a cache for documents and Dossiers/VIs: even when the report caches have already been generated, the IServer still performs a virtual join across all datasets, calculates all derived attributes and metrics, and formats the data for presentation in grids, graphs, visualizations, thresholds, and so on. For this reason, document caches should also be enabled for additional performance. To verify the report, document, and dossier cache settings (the HTML and XML/Flash/HTML5 checkboxes for dossiers), check the project configuration in the following section in Developer:
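The two scenarios can be sketched as a simple lookup keyed by report and prompt answer. This is a conceptual model only, not how the IServer is implemented: the report names and the counter are invented for illustration.

```python
# Conceptual model of Scenarios 1 and 2: the cache key combines report and
# prompt answer, so identical requests share one cache entry and only the
# first one reaches the Warehouse.
warehouse_queries = 0
cache = {}

def run_report(report_id, prompt_answer):
    global warehouse_queries
    key = (report_id, prompt_answer)
    if key not in cache:
        # Cache miss: this is the one job that actually hits the Warehouse.
        warehouse_queries += 1
        cache[key] = f"data for {report_id}/{prompt_answer}"
    # Cache hit: served from memory, no Warehouse round trip.
    return cache[key]

run_report("Sales", "2023")   # user1 -> Warehouse query
run_report("Sales", "2023")   # user2, same prompt answer -> cache hit
run_report("Sales", "2024")   # different prompt answer -> new Warehouse query
print(warehouse_queries)      # → 2
```

Ten users with the same prompt answer would still produce only one Warehouse query under this model, which is exactly the point of Scenario 2.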


For additional information on document cache settings, click the Help button at the bottom of the window. In the Storage section you can control where these cache files are stored locally, and under the Maintenance section you can configure how long the caches remain valid, among other settings.


Report and document caches can also be enabled or disabled at the document and report level, so make sure they are enabled there. If the Default checkbox is selected, the project-level settings are used; if the Default checkbox is changed, the report/document-level setting applies. For reports, these settings are located under the Data -> Report Caching Options and Format -> Document Properties menus in Developer:


For documents:

For Dossiers/VIs, the cache is enabled automatically as long as the project settings are enabled for XML, HTML, and HTML5 documents. You can verify the status of report and document caches in the Cache Monitor; if a cache was generated for a report, it will be displayed there:


If you need to check the caches for documents/dossiers/VIs, go to the Documents section of the Cache monitor:

In both monitors you can control which cache information is displayed by right-clicking the monitor's white area (not a specific cache) and selecting View Options.


For more information about the meaning of the Cache Monitor columns, click the Help button. Note: even documents/dossiers/VIs based on Intelligent Cubes need caches enabled; if the data comes from a cube but document/dossier caching is not enabled, you can still experience performance problems due to formatting or derived-element calculations.

Delete unused objects

To achieve the best performance possible, it is important to delete all unused objects: datasets, visualizations, tabs (for VIs), chapters and pages (for Dossiers), and derived elements (derived metrics and derived attributes). All of these require valuable processing time to render, hence the need to delete the unused ones. For example, if you are a MicroStrategy developer/user/analyst asked to build a dossier based on an existing one, first determine what information from the existing dossier will actually be needed. Then create a copy of the existing dossier and start making changes to it: verify which datasets, visualizations, tabs, chapters, pages, and derived elements (derived metrics and derived attributes) are used and which are not, delete the ones that are not needed, and then continue with the necessary modifications, adding datasets, visualizations, derived elements, tabs, chapters, and pages as required.

Avoid duplicated datasets

As explained in the “Turn on dossier cache” section, whenever a dossier is executed the IServer creates a virtual dataset by joining all datasets included in the dossier. Hence, the more datasets, the longer your dossier can take to execute. A good performance practice is to avoid duplicated datasets and merge them into a single dataset. For instance, say you have a dossier with 3 datasets that takes 1 minute to run, and users expect it to open in less than 10 seconds.

You then analyze the datasets in Developer and realize that all of them are reports coming from the same intelligent cube. The steps of this analysis are:

• In Developer, right-click the dossier and select “Search for Components…”.


• In the next window, select a dataset, right-click it, and select “Search for Components…”.

• You can then see the cube that the dataset uses as a data source, as well as the cube's location. Perform the same exercise for each dataset to determine whether they come from the same cube.


• Another method is to search for the cube's dependents by right-clicking the cube and selecting “Search for Dependents…”.

• This shows which datasets depend on the cube; you can compare this result with the result of the first step.


The previous example details two methods for finding duplicated datasets in a dossier. Based on the results, we can conclude that the dossier design does not follow performance best practices: it is not recommended to have multiple datasets in a dossier when they can be merged by using the source Intelligent Cube. When this situation occurs, replace the datasets with the source Intelligent Cube; in this example, replace the datasets Cost, Profit, and Revenue with the cube “Category_Cube” by following the steps in the replace_datasets video attached to this article. After these changes you will notice that the dossier's execution time is greatly reduced; in this example it went from 1 minute to 10 seconds. In conclusion, whenever possible, reduce the number of datasets in a dossier by following this method.
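The duplicate check above amounts to grouping datasets by their source cube and flagging any cube with more than one dependent. The sketch below mirrors the Cost/Profit/Revenue example; the mapping itself is something you would build by hand from the "Search for Components…" results, not from an API.

```python
from collections import defaultdict

# dataset -> source cube, as discovered via "Search for Components..."
# (names follow the example in this section; values are illustrative).
datasets = {
    "Cost": "Category_Cube",
    "Profit": "Category_Cube",
    "Revenue": "Category_Cube",
    "Inventory": "Supply_Cube",
}

# Group the datasets by their source cube.
by_cube = defaultdict(list)
for dataset, cube in datasets.items():
    by_cube[cube].append(dataset)

# Any cube backing more than one dataset is a merge candidate.
merge_candidates = {cube: ds for cube, ds in by_cube.items() if len(ds) > 1}
print(merge_candidates)  # → {'Category_Cube': ['Cost', 'Profit', 'Revenue']}
```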

Avoid images on thresholds

Thresholds are a versatile MicroStrategy feature: with a simple visual cue, they immediately help end users identify actions to take or items to pay attention to. In this example we have a simple grid with 4 rows:


Let’s say that the end user wants to replace the cost metric value with a red down arrow like this one on rows where cost is greater than 2 million:

You apply this change in your Dossier and you end up with the following grid:

In the previous example, using an image in a threshold does not look like a big problem; however, if you had, say, 8 metrics and 1,000 rows, you would be rendering 8,000 metric cells, which can significantly impact performance.

Using the same example, instead of using an image we can replace it with a ↓ symbol; if you apply the change in the threshold in the previous example you'll end up with the following output.


Using symbols instead of images in thresholds can significantly improve performance: symbols are characters, like letters, so rendering them is much faster than rendering images. Symbols also look cleaner and provide more formatting flexibility; in the image above, compare the cell spacing in Category between the image threshold and the symbol threshold and note how the image version's formatting looks off. Symbols also let you control their size via the font size setting, which you cannot do with an image. You can easily find symbols online; the following page offers many searchable symbols: http://www.i2symbol.com/symbols/ In conclusion, use symbols in thresholds whenever possible, and use images only as a last resort.
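The symbol-based threshold from this section reduces to a tiny formatting rule: replace the cost value with a text arrow when it crosses the cutoff. This sketch is illustrative only; the 2-million cutoff follows the example above, and the formatting function is not part of any MicroStrategy API.

```python
def cost_indicator(cost, cutoff=2_000_000):
    """Render the cost cell: a ↓ character above the cutoff, the value otherwise.

    A character renders far faster than an image and inherits font sizing,
    which is the whole argument of this section.
    """
    return "\u2193" if cost > cutoff else f"{cost:,.0f}"

print(cost_indicator(2_500_000))  # → ↓
print(cost_indicator(1_200_000))  # → 1,200,000
```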


Load chapters on demand

Starting with MicroStrategy 10.11, you can load dossier chapters on demand by enabling the setting of the same name. With it enabled, only the chapter selected by default is rendered initially, which can significantly improve a dossier's initial loading performance.

For instance, say you have a dossier with 8 chapters, each with 5 pages. If Load chapters on demand is disabled (it is disabled by default) and the end user executes the dossier without a cache, all 8 chapters and 40 pages are processed and rendered the first time the dossier opens. If you enable the setting and, say, chapter 3 is selected by default, then only that one chapter and its 5 pages are rendered, reducing the initial execution time considerably. To enable the setting, follow the steps in the enable_load_chapters_on_demand video attached to this article. Note that with this setting enabled a cache is generated per chapter, so navigating between chapters can take longer when the cache for the selected chapter has not yet been generated; this is the trade-off for a quick first execution.
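The on-demand behaviour described above is classic lazy evaluation with a per-chapter cache. The sketch below is a conceptual model, not the IServer's implementation: the `Dossier` class, chapter names, and the `rendered` tracking list are all invented for illustration.

```python
rendered = []  # tracks which chapters have actually been rendered

def render_chapter(name):
    rendered.append(name)          # stands in for the expensive render step
    return f"<{name} content>"

class Dossier:
    """Lazy-loading dossier: chapters render only when first opened."""

    def __init__(self, chapters, default):
        self._chapters = chapters
        self._cache = {}           # per-chapter cache, as described above
        self.open_chapter(default) # only the default chapter renders up front

    def open_chapter(self, name):
        if name not in self._cache:
            self._cache[name] = render_chapter(name)  # first visit: render
        return self._cache[name]                      # later visits: cached

d = Dossier([f"Chapter {i}" for i in range(1, 9)], default="Chapter 3")
print(rendered)          # → ['Chapter 3']  (the other 7 chapters wait)
d.open_chapter("Chapter 5")
print(rendered)          # → ['Chapter 3', 'Chapter 5']
```

The first navigation into each chapter pays the render cost once; revisiting a chapter is served from its cache, matching the trade-off noted above.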

Push-down Derived Attributes for live connection cubes

Prior to 10.11, derived attributes for “Connect Live” cubes were always evaluated in memory: the IServer had to fetch the data into memory and then calculate the derived attributes, which could result in longer execution times depending on the complexity of the calculations.


Starting with 10.11, derived attributes are evaluated at the source when all the required columns come from the same table, which covers the majority of use cases. This feature is only supported when the function used in the derived attribute is also supported by the data source. It enables more Big Data scenarios, where moving large amounts of data into memory is not practical. Time-derived columns such as “Month of timestamp” or “Day of timestamp”, and other out-of-the-box time-based derived attributes, are now calculated at the source. The feature also supports multiple attribute forms, provided the ID attribute form is supported by the data source; any other attribute form that is not supported will not be calculated by the data source and will be displayed as blank data in the dossier.
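The difference between the two evaluation modes can be seen in the shape of the query. The SQL strings below are generic illustrations for a time-derived attribute like "Month of timestamp"; the actual SQL MicroStrategy generates depends on the data source, and the table and column names are invented.

```python
# Pre-10.11 behaviour (in memory): fetch the raw rows, then derive the
# attribute and aggregate on the Intelligence Server.
fetch_all_sql = "SELECT order_ts, revenue FROM orders"

# 10.11+ behaviour (pushed down): the time function and the aggregation are
# evaluated by the data source itself, provided it supports the function,
# so far less data moves over the wire.
push_down_sql = (
    "SELECT EXTRACT(MONTH FROM order_ts) AS order_month, "
    "SUM(revenue) AS revenue "
    "FROM orders "
    "GROUP BY EXTRACT(MONTH FROM order_ts)"
)

print(push_down_sql)
```

With push-down, the result set is at most 12 rows per year of data instead of one row per order, which is the Big Data win this section describes.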

For more information about this feature please check the following pages: https://microstrategy.atlassian.net/wiki/spaces/TECServerAnalyticRelQueryEng/pages/154086377/F6621+Derived+Attribute+Push+Down https://microstrategy.atlassian.net/wiki/spaces/TECServerAnalyticRelQueryEng/pages/252117026/Multi-Form+Derived+Attribute+push+down+in+single+connect+live+dataset

Performance implications of large and complex dossiers

Performance-wise, it is always a good idea to keep the following principle in mind: the larger the dossier (more chapters, pages, visualizations, datasets, and derived elements, and more data displayed in each visualization), the longer it will take to execute on the server and to render on the client. There are clear trade-offs between content size and performance; if you have two or more separate business stories to tell, try to put them in two or more separate dossiers. This is why it is so important to include only the necessary components in a dossier: anything (chapters, pages, visualizations, datasets, derived elements) that is not needed to tell the business story should be deleted.

Connect Live performance considerations

To understand the performance considerations for live-connected Intelligent Cubes, it is necessary to understand how many steps the process involves; each step takes time, so the more steps, the longer a dossier takes to execute. For a dossier pulling data from a live-connected Intelligent Cube, the following steps are performed on execution:

1. The request is received by the Intelligence Server.
2. The Intelligence Server fetches the live-connected Intelligent Cube definition from the metadata and analyzes the requested information against that definition.
3. The Intelligence Server generates a query based on the requested information and the definition from the last step.
4. The Intelligence Server executes the query against the Data Warehouse.
5. The Data Warehouse processes the request.
6. Once the Data Warehouse has processed the request, the Intelligence Server starts transferring the data from the Data Warehouse.
7. Once the data is transferred, the Intelligence Server performs any additional operations requested, such as subtotals, formatting, etc.
8. The data is rendered in the end user's browser.

Now let’s see the steps when the dossier is pulling data from an In-Memory Intelligent Cube.

1. The request is received by the Intelligence Server.
2. The Intelligence Server fetches the In-Memory Intelligent Cube definition from memory and the metadata.
3. The Intelligence Server performs any additional operations requested, such as subtotals, formatting, derived elements, etc.
4. The data is rendered in the end user's browser.

As you can see, the live-connected workflow takes 8 steps while the In-Memory Intelligent Cube workflow takes 4; the In-Memory workflow is clearly faster. There is, of course, a trade-off: since the In-Memory technology keeps the data in memory, this approach has RAM implications. However, if memory is the only concern, you should still use In-Memory and simply add more RAM to the Intelligence Server; try not to fall back to Connect Live just because memory is the main concern. If your dossier is powered by live-connected datasets, every visualization in the dossier becomes a job that goes through the 8 steps detailed above, and most of the time performance would be better with an In-Memory solution, unless you have a very powerful in-memory data warehouse that can outperform the MicroStrategy In-Memory Intelligent Cubes. Even in that situation, you also need to take into consideration the data transfer between your Data Warehouse and the Intelligence Server.