rajesh-pillai
A collection of QlikView tips and tricks from the Qlik Community. The required attribution and source has been provided in the document. This document has about 90+ tips, tricks and solutions for common qlikview challenges.
TekAcademyLabs.com
QlikView Tips and Tricks
June 9, 2014
QlikView Tips and Tricks is a document whose content is aggregated from the Qlik Community. The source of each piece of content is cited to avoid plagiarism.
http://tekacademylabs.com/
Contents
1. Load XML field from database (Oracle)
2. Hash functions and Collisions
   Hash functions
   Hash functions in QlikView
   Detecting changed records
   Hash collisions
   The birthday problem
   Calculating the probability of a hash collision
3. Autonumber Vs AutonumberHash Vs Autonumberhash128
4. Loosely coupled tables
5. Circular References
6. Incremental Load
7. Three types of Incremental Load
8. Qlikview Associative data model
9. The magic of variables
10. The QlikView Cache
11. Null handling in QlikView
12. Text searches
13. Automatic Number interpretation
15. Colors in Chart
16. Aggregations and Function Classes
17. It's all Aggregations
18. Dimensions and Measures
19. Qlikview Quoteology
20. The Crosstable Load
21. On Boolean fields and functions
22. The Dual() function
23. A primer on Section Access
24. Data reduction using multiple fields
25. Color, state and vectors
26. Handling multiple languages
27. Dynamically selecting timeframes
28. The Only() function
29. AND and OR
30. To JOIN or not to JOIN
31. Canonical Date
32. Linking to two or more dates
33. IntervalMatch and Slowly Changing Dimension
34. The Calculation engine
35. Symbol Tables and Bit stuffed pointers
36. Basics for Complex authorization
37. Generic Keys
38. Generate missing data in QlikView
39. Strategies for creating key tables
40. Recipe for a Gantt chart
41. Relative Calendar Fields
42. Master Calendar
43. Year Over Year Comparisons
44. Redefining the week numbers
45. Preceding Load
46. Macros are BAD
47. Recipe for Pareto Analysis
48. Monte Carlo Methods
49. A myth about COUNT distinct
50. Unbalanced n level hierarchies
51. Hierarchies
52. Loops in the Script
53. IntervalMatch
54. Counters in the Load
55. Synthetic Keys
56. Data types in QlikView
57. The nature of Dual flags
58. Don't Join, use ApplyMap instead
59. Slowly Changing Dimension
60. Search, but what shall you find?
71. Cyclic or Sequential
72. The magic of Dollar Expansion
73. When should the Aggr function not be used
74. Recipe for memory statistics analysis
75. The Table Viewer
76. Creating a Scatter Chart
77. Fiscal Year
78. The Master Time Table
79. Create reference dates for intervals
80. Fact table with mixed granularity
81. How to populate a sparsely populated field
82. Calculated Dimensions
83. Finding Null
84. Creating intervals from a single date
85. Why don't my dates work
86. Master table with multiple roles
87. Rounding errors
88. Generic Load
89. Clarity Vs. Speed
90. Green is the Colour
91. Joins
92. On format codes for numbers and date
93. The Aggregation Scope
1. Load XML field from database (Oracle)
Source: http://community.qlik.com/thread/8453
SQL SELECT extract(XML_DOCUMENTO, '/', 'xmlns="http://www.portalfiscal.inf.br/cte"').getClobVal() AS XML_DOCUMENTO
FROM XML_DOCUMENTO_FISCAL;
This way the XML documents are extracted as text (CLOB), and the data can then be used in the load that creates the QVD.
2. Hash functions and Collisions
Source: http://www.qlikfix.com/2014/03/11/hash-functions-collisions/
I'm currently updating my materials for the upcoming Masters
Summit for QlikView in Chicago, and thought I'd share a little bit with you. In my session on
data modeling, I explain how you can deal with various types of Slowly Changing
Dimensions in QlikView. One of the techniques I explain is using hash functions to detect
changes in (historical) records. During the previous events, this always led to two questions
from the audience:
What exactly are hash functions and hashes?
And, from those who already know the answer to the first question: Aren't you worried
about hash collisions?
Today I will answer both questions and hopefully give you some insight into hash functions,
their usefulness in QlikView and the risks of hash collisions.
Hash functions
A hash function is an algorithm that maps data of arbitrary length to data of a fixed length.
The value returned by a hash function is like a fingerprint of the input value, and is called a
hash value or simply hash. For example, all of the text above can be translated into the
following MD5 hash: 357799131ceffdd43cc0fe9f52b36eeb.
You will notice that this hash is much shorter than the original string used to generate
it. Besides that, if only a single character in the text is changed, this will lead to a completely
different hash. This property makes hash functions very useful for comparing things, for
example files, but also historical versions of a record.
A hash function is deterministic, meaning that the same input value should always lead to
the same hash value. Typically, a hash function is a one-way function: you cannot decode
the original input value based on the hash value alone. Besides that, a good hash function is
also uniform, which means that each hash value should have the same probability of being
picked. The image at the top of this post illustrates a very simple hash function. Each of the
four input values is mapped to a unique output value.
Hash functions in QlikView
In QlikView, the following hash functions are available:
Hash128(): a 128 bit hash function that returns a 22 character string.
Hash160(): a 160 bit hash function that returns a 27 character string.
Hash256(): a 256 bit hash function that returns a 43 character string.
The number of bits determines the output range of the function. A 128 bit hash can store
2^128 (or, 340.282.366.920.938.000.000.000.000.000.000.000.000) different combinations.
160 and 256 bit can store even more combinations (2^160 and 2^256 respectively).
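The string lengths listed above are consistent with an encoding that packs 6 bits into each character, as Base64 does. That encoding is an assumption made here for illustration; the text itself does not say how QlikView renders its hashes. A quick check of the arithmetic, with `chars_needed` as a hypothetical helper name:

```python
import math

def chars_needed(bits, bits_per_char=6):
    """Characters required to hold `bits` bits at `bits_per_char` bits each."""
    return math.ceil(bits / bits_per_char)

for bits in (128, 160, 256):
    # 2 ** bits is the number of distinct values a hash of that width can take.
    print(f"{bits} bit: {chars_needed(bits)} characters, {2 ** bits} possible values")
```

Python integers have arbitrary precision, so the full value of 2^128 is printed rather than a rounded figure.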
Besides these functions, QlikView also has
the AutoNumberHash128() and AutoNumberHash256() functions. These functions
basically take the output of the Hash128() and Hash256() functions and pass it through
the AutoNumber() function. While I think they have a nicer syntax than the
regular AutoNumber() (you can supply a comma-separated list of fields instead of a
concatenated string), the usefulness of these functions eludes me.
Detecting changed records
Consider a QlikView application containing the following Employee table:
Now, assume we get some new, changed data and want to quickly determine which rows
have changed:
As you can see, Jim has moved to another office. How can we detect that this row has
changed? We could compare each field in the table to each previous version of the field, but
as we are only interested in detecting if the row has changed, using a hash function is a more
elegant solution. Using Hash128(Name, Position, Office) we can calculate a hash value for
each row:
The hash value for Dwight's record hasn't changed, because the record hasn't changed either.
Jim's changed record, however, does have a different hash value than the previous one. Once
we've detected this we can do further processing on these records. This will be the topic of a
future blog post. Or, if you don't want to wait for that, my data modeling session at
the Masters Summit for QlikView.
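The comparison described above can be sketched outside QlikView. In this minimal Python example, MD5 (a 128 bit hash) stands in for Hash128(); the algorithm behind Hash128() is not stated in the text, so that substitution is an assumption. The row values and the `row_hash` helper are hypothetical placeholders for the Employee tables, which were shown as images.

```python
import hashlib

def row_hash(*fields):
    """128 bit fingerprint of a row; MD5 is a stand-in for Hash128() here."""
    # A separator keeps ("ab", "c") and ("a", "bc") from producing the same input.
    joined = "\x1f".join(str(f) for f in fields)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()

# Hypothetical rows: (Name, Position, Office)
previous = [("Jim", "Sales Rep", "Office A"), ("Dwight", "Sales Rep", "Office A")]
current = [("Jim", "Sales Rep", "Office B"), ("Dwight", "Sales Rep", "Office A")]

old_hashes = {row[0]: row_hash(*row) for row in previous}

# A row counts as changed when its hash differs from the stored one.
# Rows whose Name is new would be flagged as well, since .get() returns None.
changed = [row[0] for row in current if old_hashes.get(row[0]) != row_hash(*row)]
print(changed)
```

Only one hash per row has to be stored and compared, however many fields the row has.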
Hash collisions
As noted before, a hash function is an algorithm that maps data of arbitrary length to data of a
fixed length. When different input values lead to the same output hash value, this is known as
a hash collision. Consider the following, simplified hash function:
In this example, both Michael and Toby get the same hash value of 2. It's easy to see what
the problem is here: there are 5 input values and only 4 possible hash values. The input
domain is greater than the output range.
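The same effect can be reproduced with any function that maps 5 inputs onto 4 outputs. The sketch below uses a made-up `toy_hash` (sum of character codes modulo 4), which is not the function from the illustration, and a name list in which "Pam" is a placeholder for the fifth input value.

```python
def toy_hash(value, buckets=4):
    """Toy hash: sum of character codes, reduced to `buckets` possible values."""
    return sum(ord(ch) for ch in value) % buckets

names = ["Michael", "Toby", "Jim", "Dwight", "Pam"]  # "Pam" is a placeholder

by_hash = {}
for name in names:
    by_hash.setdefault(toy_hash(name), []).append(name)

# With 5 inputs and 4 possible outputs, at least one group must hold 2+ names.
collisions = {h: group for h, group in by_hash.items() if len(group) > 1}
print(collisions)
```

Which names collide depends entirely on the toy function; that at least two of them collide does not.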
Now, you may think "this isn't a problem for me, the number of input values I deal with is
much less than 2^128, let alone 2^256". It's a simple assumption to make, but also a wrong
one, as hash collisions can occur long before the number of input values reaches the range of
the hash function.
The birthday problem
Imagine you're in a room with a group of people. How many people do you think need to be
in that room before the probability of two people sharing the same birthday reaches 50%?
There are (excluding leap years) 365 days in a year, so maybe 185? 200?
The answer is 23. Surprising, isn't it? If we raise the number of people to 75, the probability
of at least two people sharing a birthday rises to more than 99.9%. This is known as the birthday
problem.
As this is a QlikView blog and not a math blog, I won't go through the complete solution and
proof. Basically, instead of calculating the probability that two people in a group share a
birthday, the trick is to calculate the probability that no one in the group shares a birthday.
This is much easier to calculate. The result is then subtracted from 1, which gives the
probability that at least two people in the group share a birthday.
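The complement trick described above takes only a few lines to compute. This is a minimal sketch assuming 365 equally likely birthdays; the function name `p_shared_birthday` is illustrative.

```python
def p_shared_birthday(people, days=365):
    """Probability that at least two of `people` share a birthday."""
    p_all_distinct = 1.0
    for i in range(people):
        # Person i must avoid the i birthdays that are already taken.
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct

for n in (22, 23, 75):
    print(n, round(p_shared_birthday(n), 4))
```

The probability first passes 50% when the group grows from 22 to 23 people.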
Calculating the probability of a hash collision
If you looked closely at the previous example, you may see that the people can be considered
input values and that their birthdays can be considered hash values. When two people share
the same birthday, it's a hash collision! If we understand this, then we can apply the same
logic to determine the probability of a hash collision in our data sets. To calculate the
approximate probability of a hash collision we can use the following formula:
I created a small Excel workbook to calculate the probability of a hash collision. Now, it's
good to realize that Excel only uses 15 significant digits. As these probabilities are very
small, this means that Excel is unable to calculate probabilities for small numbers of input values.
So, in the example below, I calculated the probability that 1 quadrillion (that's a 1 with 15
zeroes) input values could lead to a hash collision when using a 128 bit hash.
The probability of this happening is around 1 in 680 million. Or, to put it in perspective:
Now, there is a small caveat with this calculation. It assumes the hash functions used in
QlikView lead to a uniform output, meaning each value has the same probability. This may
not be the case.
On the other hand, we are not comparing a quadrillion records, we are only comparing two.
When calculating the probability of a hash collision with just 2 records and a 128 bit hash
using an online high precision calculator, the result is 2.938735877055718769922E-39
(roughly 1 in 340 undecillion). Or, to put it in perspective again, this is less likely than a single person
winning the lottery, getting hit by a meteorite, getting attacked by a shark -and- becoming
president of the USA in their lifetime.
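Both figures quoted in this section can be reproduced with the standard birthday-bound approximation p ≈ n(n-1) / (2 · 2^bits), which holds when p is small. The formula image from the original post is not available here, so treating this as the formula the author used is an assumption; it does match the quoted results. Exact fractions sidestep the precision limits mentioned for Excel.

```python
from fractions import Fraction

def collision_probability(n, bits):
    """Approximate P(at least one collision) among n values: n(n-1) / (2 * 2^bits)."""
    return Fraction(n * (n - 1), 2 * 2 ** bits)

p_quadrillion = collision_probability(10 ** 15, 128)  # 1 quadrillion input values
p_two_records = collision_probability(2, 128)         # comparing just two records

print(f"1 in {float(1 / p_quadrillion):.3g}")
print(f"{float(p_two_records):.6e}")
```

For two records the expression reduces to exactly 1 / 2^128, assuming a uniform hash.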
Switch to a 160 bit hash and the likelihood of a collision becomes lower than the combined
probability of all events in the chart above. Now, just because it is very unlikely doesn't mean
that it can't happen (see: Law of large numbers), but I like those odds!
3. Autonumber Vs AutonumberHash Vs Autonumberhash128
As I understand it, autonumber stores the expression value and gives it a unique integer value,
whereas autonumberhash128 stores just the hash value (in 128 bits) of the corresponding
expression value. Therefore, autonumberhash128 should be more efficient in data storage
(particularly when the expression value is large), and so the document size is reduced.
Other notes:
Having our new baby (AKA the mini QlikView addict) around has meant very little time for
anything, let alone blogging. So in order to ensure I at least manage the odd post or 2 I
thought it would be good to start a new series of short posts on different qlikview functions
and their uses. To kick things off I have decided to take a look at the autonumber() function
and the closely related autonumberhash128() and autonumberhash256(). All 3 functions do a
very similar thing so let's look at autonumber() first and then consider how the other 2
functions differ.
Autonumber() can be considered a lookup function. It takes a passed expression and looks up
the value in a lookup table. If the expression value isn't found then it is added to the table and
assigned an integer value which is returned. If the expression value is found then it returns
the integer value that is assigned against it. Simply put, autonumber() converts each unique
expression value into a unique integer value.
Autonumber() is only useful within the QlikView script and has the following syntax:
autonumber(expression [, index])
The passed expression can be any string, numeric value or most commonly a field within a
loaded table. The passed index is optional and can again be any string or numeric value. For
each distinct value within the passed index, QlikView will create a separate lookup table and
so the same passed expression values will result in a different returned integer if a different
index is specified.
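The lookup behaviour described above can be mimicked in a few lines of Python. This is only a sketch of the described semantics, not QlikView's implementation: the class name `AutoNumber` and the sample key values are hypothetical, numbering starts at 1 as a later comment in this section states, and null handling is not modelled.

```python
from collections import defaultdict

class AutoNumber:
    """Sketch of the described behaviour: one lookup table per index value."""

    def __init__(self):
        # index value -> {expression value -> assigned integer}
        self._tables = defaultdict(dict)

    def __call__(self, value, index=None):
        table = self._tables[index]
        if value not in table:
            # Unseen value: add it and assign the next integer, starting at 1.
            table[value] = len(table) + 1
        return table[value]

autonumber = AutoNumber()
print(autonumber("A-100"), autonumber("B-200"), autonumber("A-100"))
print(autonumber("B-200", index="orders"))  # separate table for this index
```

The same value "B-200" maps to a different integer under the "orders" index because each index keeps its own lookup table.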
So how exactly are the 3 autonumber functions different? Autonumber() stores the expression
value in its lookup table whereas autonumberhash128() stores just the 128 bit hash value of
the expression value. As I'm sure you can guess, autonumberhash256() stores the
256 bit hash value of the expression value.
Why on earth would I want to use any of these functions? Well, the answer is quite simply
efficiency. Key fields between two or more tables in QlikView are most efficient if they
contain only consecutive integer values starting from 0. All 3 of the autonumber functions
allow you to convert any data value and type into a unique integer value, and so using them for
key fields allows you to maintain optimum efficiency within your data model.
A final word of warning. All 3 of the autonumber functions have one pitfall, the lookup
table(s) exist only whilst the current script execution is active. After the script completes, the
lookup table is destroyed and so the same expression value may be assigned different integer
values in different script executions. This means that the autonumber functions can't be used
for key fields within incremental loads.
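The pitfall is easy to reproduce with a simulation of two script executions. The sketch below assumes, as described above, that integers are handed out in order of first appearance and that the lookup table starts empty on every run; `number_keys` and the key values are hypothetical.

```python
def number_keys(values):
    """Assign 1, 2, 3, ... to distinct values in order of first appearance."""
    lookup = {}
    for value in values:
        lookup.setdefault(value, len(lookup) + 1)
    return lookup

# Two "script executions" that see the same values in a different order.
first_run = number_keys(["Alpha", "Beta", "Gamma"])
second_run = number_keys(["Gamma", "Alpha", "Beta"])

print(first_run["Gamma"], second_run["Gamma"])
```

A key stored in a QVD during the first run would therefore no longer match the key generated for the same value in the second run.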
2 comments:
1. Mike Taylor, 8 March 2014 11:13
Nice, simple explanation. Can you clarify how the autonumber function deals with null values?
I had some issues recently and resorted back to using the original values where I had one table
with nulls (which were assigned an autonumber of 0) and I was trying to join to another table
that had no null values.
2. Matthew Fryer, 24 April 2014 15:00
Hi Mike
First of all, how autonumber() will behave depends on if it is a true null or a zero length
string. For true nulls, the result of the autonumber() will also be null. You can prove this by
running the following script:
LOAD autonumber(null()) AS field AUTOGENERATE 1;
By adding "field" to a list box you will see no records. Being a true null and the fact that null
values don't really exist in QlikView (they are simply the absence of a value) means that
when using it for a key field, QlikView will not associate a null in one table to a null in the
other.
For a zero length string, autonumber() will assign it a value as it would any other passed
value. The number assigned to the zero length string will depend on the order it appears in the
values that are passed to autonumber(). You can see this by running the following script:
LOAD autonumber('') AS field AUTOGENERATE 1;
The result will be a single value in "field" of "1". Autonumber() is 1 indexed and so I'm not
sure where you are getting your 0 value back.
4. Loosely coupled tables
Source: http://community.qlik.com/thread/104608
NOTES:
Loosely coupled tables are created automatically when a data model (of three or more tables) that
includes circular references is loaded into QlikView. This prevents the circular references from creating a loop
in the QlikView internal logic. These loosely coupled tables need to be handled in order to visualize
data in a way that is expected and understandable.
See Article Circular References
Any table can also be made loosely coupled interactively (in the document properties) or via macros. Additionally,
it is possible to declare loosely coupled tables explicitly in the script via the Loosen Table statement.
The normal QlikView associative logic is disconnected internally for loosely coupled tables. This
means that selections in one field do not associate through to the other fields in the table. This is very
useful for avoiding circular references in the data structure in various scenarios. For more examples of
how this feature can be used, please refer to the QlikView Reference Manual - Intentionally Creating
Loosely Coupled Tables.
One or more QlikView internal data tables can be explicitly declared loosely coupled during script
execution by using a Loosen Table statement.
The use of one or more Loosen Table statements in the script will make QlikView disregard any setting
of tables as loosely coupled made before the script execution.
The syntax is:
Loosen Table[s] tablename [ , tablename2 ...]
Either syntax: Loosen Table and Loosen Tables can be used.
Example:
Table1:
Select * from Trans;
Loosen table Table1;
Note!
Should QlikView find circular references in the data structure which cannot be broken by tables
declared loosely coupled interactively or explicitly in the script, one or more additional tables will be
forced loosely coupled until no circular references remain. When this happens, the Loop
Warning dialog gives a warning.
5. Circular References
Source: http://community.qlik.com/blogs/qlikviewdesignblog/2013/06/25/circular-references
There are two Swedish car brands, Volvo and SAAB. Or, at least, there used to be... SAAB was made
in Trollhättan and Volvo was and still is made in Gothenburg.
Two fictive friends, Albert and Herbert, live in Trollhättan and Gothenburg, respectively. Albert
drives a Volvo and Herbert drives a SAAB.
If the above information is stored in a tabular form, you get the following three tables:
Logically, these tables form a circular reference: The first two tables are linked through City; the next two through Person; the
last and the first through Car. Further, the data forms an anomaly: Volvo implies Gothenburg;
Gothenburg implies Herbert; and Herbert implies SAAB. Hence, Volvo implies SAAB, which doesn't
make sense. This means that you have ambiguous results from the logical inference - different results
depending on whether you evaluate clockwise or counterclockwise.
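The ambiguity can be made concrete with a small Python sketch built from the facts stated above. The dictionary and function names are illustrative, and which direction is called "clockwise" is an arbitrary choice made here.

```python
# The three tables, built from the facts stated in the text.
car_city = {"Volvo": "Gothenburg", "SAAB": "Trollhättan"}         # where each car is made
person_city = {"Albert": "Trollhättan", "Herbert": "Gothenburg"}  # where each friend lives
person_car = {"Albert": "Volvo", "Herbert": "SAAB"}               # what each friend drives

def people_clockwise(car):
    """Car -> City (where it is made) -> Person (who lives there)."""
    city = car_city[car]
    return {person for person, home in person_city.items() if home == city}

def people_counterclockwise(car):
    """Car -> Person (who drives it)."""
    return {person for person, driven in person_car.items() if driven == car}

print(people_clockwise("Volvo"), people_counterclockwise("Volvo"))
```

Starting from the same selection (Volvo), the two paths around the loop point to different people, which is the ambiguity a loosely coupled table is meant to prevent.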
If you load these tables into QlikView, the circular reference will be identified and you will get the
following data model:
To avoid ambiguous results, QlikView marks one of the tables as loosely coupled, which means that
the logical inference cannot propagate through this table. In the document properties you can decide
which table to use as the loosely coupled table. You will get different results from the logical inference
depending on which you choose.
So what did I do wrong? Why did I get a circular reference?
It is not always obvious why they occur, but when I encounter circular references I always look for
fields that are used in several different roles at the same time. One obvious example is if you have a
table listing external organizations and this table is used in several roles: as Customers, as Suppliers
and as Shippers. If you load the table only once and link to all three foreign keys, you will most likely
get a circular reference. You need to break the circular reference and the solution is of course to load
the table several times, once for each role.
In the above data model you have a similar case. You can think of Car as "Car produced in the city" or
"Car that our friend drives". And you can think of City as "City where car is produced" or "City where
our friend lives". Again, you should break the circular reference by loading a table twice. One possible
solution is the following:
In real life circular references are not as obvious as this one. I once encountered a data model with
many tables where I at first could not figure out what to do, but after some analyzing, the problem
boiled down to the interaction between three fields: Customers, Machines and Devices. A customer
had bought one or several machines; a device could be connected to some of the machine types
but not to all; and a customer had bought some devices. Hence, the device field could have two roles:
Devices that the customer actually had bought; and devices that would fit the machine that the customer
had bought, i.e. devices that the customer potentially could buy. Two roles. The solution was to load
the device table twice using different names.
Bottom line: Avoid circular references. But you probably already knew that...
The post assumes that the reader knows what the Logical Inference engine does.
The Logical Inference engine is the core of QlikView. It evaluates which field values are possible,
given the selection. Basically it first evaluates which records are possible, and then the result of the
evaluation "propagates" into the next table via the possible values of the key field, and then the next
table is evaluated. It is this propagation that is disabled by the loosely coupled table.
Read more about Logical Inference
under http://community.qlik.com/blogs/qlikviewdesignblog/2013/07/15/logical-inference-and-aggregations
5. Logical Inference and Aggregations
Every time you click, QlikView recalculates everything.
Everything.
A new selection implies a new situation: Other field values than before are possible; other
summations need to be made; the charts and the KPIs get other values than before. The state vectors
and the objects are invalidated. Everything needs to be recalculated since this is what the user demands.
Well, there is of course a cache also, so that QlikView doesn't have to recalculate something which
has been calculated before. So it isn't quite true that everything is recalculated: If a calculation has
been made before, the result is simply fetched from the cache. But it is true that nothing is
pre-calculated. There is no need for that. Everything can be done in real-time.
QlikView is an on-demand calculation engine.
In principle, there are two steps in the recalculation of data: The logical inference in the data
model, and the calculations of all objects, including sheet labels and alerts.
The logical inference is done first. The goal is to figure out which field values in the symbol tables are
possible and which records in the data tables are possible, given the new selection. There is no
number crunching involved - it is a purely logical process. The result is stored in the state vectors.
Think of it as if the selection propagates from one table in the data model to all other tables. Table by
table is evaluated and QlikView figures out which values and records are possible, and which are
excluded.
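The propagation can be pictured with a small Python model. This is only a sketch under stated assumptions: the two tables, the field names and the `infer` helper are hypothetical, and QlikView's real engine works on symbol tables and state vectors rather than lists of dictionaries.

```python
# Toy data model: two tables linked by the key field "CustomerID" (hypothetical data).
customers = [
    {"CustomerID": 1, "Country": "SE"},
    {"CustomerID": 2, "Country": "US"},
    {"CustomerID": 3, "Country": "SE"},
]
orders = [
    {"OrderID": 10, "CustomerID": 1},
    {"OrderID": 11, "CustomerID": 2},
    {"OrderID": 12, "CustomerID": 3},
    {"OrderID": 13, "CustomerID": 2},
]

def infer(selection_field, selected_value):
    """Return the possible values per field after a selection in the customers table."""
    # Step 1: mark which records are possible in the table that holds the selection.
    possible_customers = [r for r in customers if r[selection_field] == selected_value]
    # Step 2: the possible key values carry the result into the next table.
    possible_keys = {r["CustomerID"] for r in possible_customers}
    possible_orders = [r for r in orders if r["CustomerID"] in possible_keys]
    return {
        "CustomerID": sorted(possible_keys),
        "OrderID": sorted(r["OrderID"] for r in possible_orders),
    }

print(infer("Country", "SE"))  # {'CustomerID': [1, 3], 'OrderID': [10, 12]}
```

No sums are computed here. The result is only a set of possible and excluded values, which is what the aggregation step works on afterwards.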
When the logical inference is done, QlikView starts to evaluate all exposed objects. List boxes and
dimensions in charts must be populated and sorted. All expressions in charts, in text boxes, in
labels, in alerts must be calculated. Objects that are on other sheets, minimized or hidden, are
however not calculated.
The calculations are always aggregations based on the data records that have been marked as
possible by the logical inference engine. I.e., the objects do not persist any data on their own.
The calculation phase is usually the phase that takes time: often over 90% of the response time is
due to calculations. The calculations are asynchronous and multi-threaded on several levels: First of
all, every object is calculated in its own thread. Secondly, in the 64-bit version, many aggregations,
e.g. Sum(), are calculated using several threads, so that a sum in one single object can be calculated
quickly using several CPUs.
Finally, when an object has been calculated, it is rendered. Since the calculation is asynchronous and
multi-threaded, some objects are rendered long before other objects are ready.
And when an object has been rendered, you can click again. And everything is repeated.
HIC
If you want to read more about QlikView internals, see Symbol Tables and Bit-Stuffed
Pointers and Colors, states and state vectors.
6. Incremental Load
Source: http://www.resultdata.com/qlikview-incremental-loads-and-qvds/
QlikView, by design, includes many new and innovative technologies such as the associative data
model and highly effective data compression algorithms, which make possible its state-of-the-art
in-memory technology. QlikView allows us to load and keep all the data in memory for evaluation,
analysis and reporting. If you've worked with QlikView you understand the value of this approach,
but it sometimes comes with a price. Very large data sets can take a long time to load, bogging
down the performance of your QlikView documents over time. This month we will take a look at a
way to minimize the load time of very large data sets and increase your performance using
incremental data loads.
What is an Incremental Load?
Incremental load is a term that describes loading only new or changed records from the database. It is
a common task with databases and can greatly reduce the time needed to load data into your
QlikView application. The bulk of the data needed will already be available within your application
and only the new or changed data will be necessary to complete the picture. Incremental loads are
possible through the use of .QVD files.
What is a QVD file?
A QVD file is a native QlikView file format optimized and compacted for speed when reading data
from within a QlikView load script. Reading data from a QVD file can be 10-100 times faster than
reading records directly from other data sources. A QVD file contains a single table of data from a
QlikView application. While that may seem somewhat restricting, remember that the table can be the
result of a concatenation or a join, so the structure you create in the application can greatly increase
its use. You can also include all calculations and manipulations in the script that creates your QVD file,
which further increases load performance at run time.
How could you use a QVD file?
There are several uses for a QVD file and in many cases more than one will be applicable at the same
time.
Decreasing Load Time
Decreasing Database Server Loads
Joining Data from Different QlikView Applications
Incremental Data Loading
Decreasing Load Time
By saving data to and loading large amounts of data from a QVD file you eliminate most of the time
used during load by using an optimized and compressed data file. By scripting all of your
concatenation, joining, calculations and data manipulations in the file used to create the QVD you
will increase your performance even more.
Decreasing Database Server Loads
By isolating your large data volumes and loading them from QVD files you will reduce the processing
on the database server at load time and dramatically reduce the load time of your scripts as well.
You only need to provide data since the last load of your QVD to your QlikView document during
refresh. The fresher the data in your QVD, the less data is needed from the database server.
Joining Data from Different QlikView Applications
Once you've formatted and manipulated your data and have it working just the way you want, you
can save that table to a QVD and use the same vetted structure in other QlikView documents. While
it is true that you could copy and paste your load script into another QlikView document, by using a
QVD file instead you have the added advantage of dramatically faster loading. As your scripts
become more and more complex based on the business questions asked by the users, you can
standardize your scripts and maintain the logic in one place. This increases your ability to create a
single version of the truth.
Incremental Data Loading
By adding business logic to the creation of your QVD files you can extend that logic to all of the
QlikView Applications that use that data, creating a more dynamic loading scenario. Let's say you
schedule your QVD loads monthly, after the close of business for the previous month. Your
application now only needs to load data for the current period directly from the database and then
load all previous periods from your QVD file.
Incremental Load Scenarios
The structure of your data, available evaluation fields and how you choose to store your data in
QVDs will determine your particular scenario but here are a few examples to get you started
thinking.
Daily Incremental Reloads of a QVD
This scenario requires a date field in the data that identifies the creation or update time of all records.
We can retrieve the last modified/created date from the existing QVD, use that date to retrieve new
records from the database and then concatenate the previously saved records from the QVD file to
our current data and save the QVD file again.
1. Load the latest (max) modified date from your previously saved QVD. If you have not yet
created the QVD then use the current date.
2. Select only the records from the database where the last modified date is between the last
modified date you retrieved in step one and right now.
3. Concatenate all the data from the QVD file where there is no match in new data table on the
unique ID field. This allows QlikView to only add the records that do not exist and accounts
for updated records as well as new records.
4. Save the resulting data set by overwriting the QVD file with all of the records in the new data
set.
This scenario will force QlikView into Fast mode instead of Super-Fast mode but will still be
significantly faster than loading all data from the database. You may also need to extend this logic to
your production QlikView Application if it needs to retrieve data since the last daily load.
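The four steps can be modelled outside QlikView to check the logic. The following Python sketch is an illustration only: rows are plain dictionaries, and the field names `ID`, `Modified` and `Amount` as well as the `incremental_reload` helper are hypothetical.

```python
from datetime import date

def incremental_reload(stored_rows, source_rows, key="ID", stamp="Modified"):
    """Merge new and changed source rows into the previously stored rows."""
    # Step 1: latest modification date already saved (current date if nothing is stored).
    last_loaded = max((r[stamp] for r in stored_rows), default=date.today())
    # Step 2: only the source rows modified since then.
    fresh = [r for r in source_rows if r[stamp] >= last_loaded]
    # Step 3: add stored rows that have no match on the unique key in the fresh rows.
    fresh_keys = {r[key] for r in fresh}
    merged = fresh + [r for r in stored_rows if r[key] not in fresh_keys]
    # Step 4: the caller overwrites the stored file with the merged rows.
    return merged

stored = [
    {"ID": 1, "Modified": date(2014, 6, 1), "Amount": 100},
    {"ID": 2, "Modified": date(2014, 6, 2), "Amount": 200},
]
source = [
    {"ID": 1, "Modified": date(2014, 6, 1), "Amount": 100},
    {"ID": 2, "Modified": date(2014, 6, 3), "Amount": 250},  # updated record
    {"ID": 3, "Modified": date(2014, 6, 3), "Amount": 300},  # new record
]
merged = incremental_reload(stored, source)
print(sorted(r["ID"] for r in merged))  # [1, 2, 3]
```

Record 2 appears once in the result, with the updated amount, because the stored copy is skipped in step 3.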
Daily/Monthly/Yearly Stacked QVDs
At the close of each Day, Month or Year (Month and/or Year being the most popular) you will create a
QVD containing that period's data. Each QVD will be named so that the data in it is clearly
identified by the name (e.g., 3-1-2010.qvd or 3-2010.qvd or 2010.qvd). You may wish to use a
combination approach such as saving data from previous years in yearly QVDs and data within the
current year in monthly QVDs. This will give you the option of loading only the appropriate data into
your QlikView Applications. Depending on the target audience for your application you may need
different combinations of data. One application might require all available data while others may only
require a specific number of years past. A more analytic application may only require yearly and/or
monthly data while others will require up-to-the-minute data. This approach will give you flexibility
for all of those scenarios.
Another advantage of this approach is that the daily, monthly or yearly data can be loaded in
Super-Fast mode since no date evaluation is needed. Only the data needed to supplement the application
since the last saved QVD file, if any, will be read directly from the database.
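As a rough illustration of the combination approach, the Python sketch below lists which stacked files an application would read. The naming pattern follows the examples above (2010.qvd, 3-2010.qvd); the function name and its parameters are hypothetical.

```python
def stacked_qvd_files(first_year, current_year, current_month):
    """Yearly files for closed years, monthly files for closed months of the current year."""
    yearly = [f"{year}.qvd" for year in range(first_year, current_year)]
    monthly = [f"{month}-{current_year}.qvd" for month in range(1, current_month)]
    return yearly + monthly

# In March 2010, with history starting in 2008:
print(stacked_qvd_files(2008, 2010, 3))
# ['2008.qvd', '2009.qvd', '1-2010.qvd', '2-2010.qvd']
```

Data for the month still open (March 2010 in this example) is not in any file yet and would come from the database.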
7. Three types of Incremental Load
Source: http://www.resultdata.com/three-types-of-qlikview-incremental-loads/
Large transactional tables can take a significant amount of time to reload. Taking advantage of
QlikView's incremental load logic can reduce that time considerably. An incremental load is a process
of simply appending new records or updating specific records in an existing QVD. There are three key
scripting options available for incremental loads.
Insert Only
Insert & Update
Insert, Update, & Delete
For a detailed review of QVDs and the concept of incremental loads, please review the article
"Incremental Loads and QVDs".
SET UP
Each of these three scenarios is designed to run once an INITIAL LOAD has occurred. An initial load is
a task that creates the source QVDs. These QVDs from then on can be optimized to reload with one
of the following incremental load scripts. Since an incremental load is designed to pull only new or
altered data, a source QVD is needed to hold all non-modified information and must exist before an
incremental load can run.
INSERT ONLY
For an INSERT ONLY scenario, there is the assumption that new data will not create duplicate
records. There is no set way to determine NEW data, so this must be reviewed case by case. Once a
method for finding new records is determined, the reload process is a simple three-step process.
1. Load all NEW data from the data source
2. Concatenate this data with a load of all data from the QVD file
3. Store the entire table out to the QVD file
As long as the QVD is named the same, this will overwrite the previous QVD so the process can
repeat for the next reload.
INSERT & UPDATE
The INSERT & UPDATE scenario also takes new data from the source but it also pulls in updated
records. Additional precautions need to be taken in order to avoid duplicate records. During the load
from the QVD, exclude records where there is a match on the primary key. This will ensure that the
updated records will not be duplicated.
1. Load all NEW and UPDATED data from the data source
2. Concatenate this data with a load of only the missing records from the QVD file
3. Store the entire table out to the QVD file
Example of Script
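// New and updated records from the database.
// vDate must expand to a value the database accepts as a date.
Data:
SQL SELECT
    PrimaryKey,
    A,
    B,
    C
FROM DB_Table
WHERE ModifyDate >= $(vDate);

// Records from the QVD whose key was not just loaded from the database.
CONCATENATE (Data)
LOAD
    PrimaryKey,
    A,
    B,
    C
FROM Data.qvd (qvd)
WHERE NOT Exists(PrimaryKey);

STORE Data INTO Data.qvd (qvd);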
Using the Exists() function keeps the QVD from loading the obsolete records since the UPDATED
version is currently in memory.
INSERT, UPDATE, & DELETE
An INSERT, UPDATE, & DELETE script is very similar to the load process of the INSERT & UPDATE,
however there is an additional step needed to remove deleted records. The most effective method is
to load all the PrimaryKeys from the source and then apply an inner join. This will achieve the delete
process.
1. Load all NEW and UPDATED data from the data source
2. Concatenate this data with a load of only the missing records from the QVD file
3. Inner join all PrimaryKeys from the data source
4. Store the entire table out to the QVD file
Example of Script
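// New and updated records from the database.
// vDate must expand to a value the database accepts as a date.
Data:
SQL SELECT
    PrimaryKey,
    A,
    B,
    C
FROM DB_Table
WHERE ModifyDate >= $(vDate);

// Records from the QVD whose key was not just loaded from the database.
CONCATENATE (Data)
LOAD
    PrimaryKey,
    A,
    B,
    C
FROM Data.qvd (qvd)
WHERE NOT Exists(PrimaryKey);

// Keep only the keys that still exist in the database (removes deleted records).
INNER JOIN (Data)
SQL SELECT
    PrimaryKey
FROM DB_Table;

STORE Data INTO Data.qvd (qvd);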
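The effect of the inner join on the key list can be checked with a small Python model. This is a sketch only; the rows and the `apply_deletes` helper are hypothetical stand-ins for the merged table and the list of PrimaryKeys currently in the source.

```python
def apply_deletes(merged_rows, source_keys, key="PrimaryKey"):
    """Keep only rows whose key still exists in the source (the inner-join step)."""
    alive = set(source_keys)
    return [row for row in merged_rows if row[key] in alive]

merged = [
    {"PrimaryKey": 1, "A": "x"},
    {"PrimaryKey": 2, "A": "y"},  # deleted in the database since the last reload
    {"PrimaryKey": 3, "A": "z"},
]
remaining = apply_deletes(merged, source_keys=[1, 3])
print([row["PrimaryKey"] for row in remaining])  # [1, 3]
```

Only the key column is read from the source for this step, so it stays cheap even when the table is wide.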
Very large data sets can take a long time to load and greatly affect the performance of your QlikView
documents over time. Combining QVD files with incremental loads makes reloads faster and uses
fewer system resources.
8. QlikView Associative data model
Source: http://community.qlik.com/blogs/theqlikviewblog/2010/08/16/qlikview-is-associative-to-its-very-core
One thing we're trying to do a better job of at QlikTech is communicating the associative nature of
QlikView. I've seen lots of conversations taking place online (for example on the QlikCommunity
site as well as Donald Farmer's blog and Curt Monash's blog). So I tapped into the brains of Dan
English, our Global Product Manager for OEM and Integration for his explanation, and I'm sharing it
with you here.
First and foremost we should clear up the semantics. If one uses the Wikipedia definition of an
associative model of data then it is correct to say that QlikView does not store data in an associative
format. However, QlikTech uses the word associative in an entirely different sense. When we say
that QlikView is associative we mean that at a data engine level QlikView creates and maintains
real-time associations among all result sets, creating a cohesive and intuitive view of business
information.
We describe QlikView's architecture as associative to differentiate it from query-based business
intelligence tools. With all query-based BI tools (whether ROLAP, MOLAP, or HOLAP) each individual
result set is returned from the underlying data engine without any inherent association back to the
data repository as a whole, or to any other query result set (see figure below).
When we say QlikView is associative, we aren't talking just about QlikView's intuitive user
interface (the UI that utilizes green for selected data, white for associated data, and gray for
unassociated data to illustrate relationships hidden in business information). (See this QlikView blog
post.) We're talking about a revolution in data engine architecture, in that:
Every data point in a QlikView document shares a common selection state. With QlikView's
data engine, each and every discrete data point in a given QlikView document, whether it is
part of an aggregated result set (e.g., straight table, pivot table, chart, etc.) or unaggregated
data (e.g., data in a list box), shares a common selection state (e.g., universe of included and
excluded data).
All data points are constantly updated based on the selection state. All the data points in a
QlikView document are continually and instantaneously updated based on changes the user
makes to the selection state. The associations among result sets are maintained 100% by the
underlying data engine, which is built on a column-store, in-memory architecture.
QlikView's associative architecture delivers unprecedented flexibility
Why is QlikView's associative engine so important? One might argue that a real-time query tool gives
you the capability to answer any question you want. After all, within the limits of the tool's user
interface, you can define any result set you want, right? We maintain that the answers to real-world
business questions are almost never exposed in the result set of a single query. Almost always the
answer can only be extracted by examining the relationships of two or more associated result sets,
often aggregated along completely different dimensionality.
The bottom line: QlikView represents a fundamentally different class of analytic engine. All
associations are based on the data model set up when the QlikView document is developed. Those
associations are used to update every single result set in real time each and every time the user
changes the selection state. This is the source of QlikView's associative magic.
9. The magic of variables
Source: http://community.qlik.com/blogs/qlikviewdesignblog/2013/11/04/the-magic-of-variables
Variables can be used in many ways in QlikView. They can have static values or they can be
calculated. But when are they calculated? At script run-time or when the user clicks? And how
should they be called? With or without dollar expansion?
One basic way to assign a value to a variable is to use a Let statement in the script:
Let vToday = Num(Today()) ;
This will calculate the expression and assign it to the variable when the script is run. This is exactly
what you want if you want to use a variable as a numeric parameter in your expressions.
But if you want the expression to be evaluated at a later stage, e.g. every time the user clicks, what
should you do then? One way is to store the expression as a string in the variable, using either the
Set or the Let statement or by defining it in the Document Properties -> Variables:
Set vSales = Sum(Sales) ;
Let vSales = 'Sum(Sales)' ;
In neither case will the expression be calculated. The variable will contain the string 'Sum(Sales)',
which subsequently can be used in an expression using a dollar expansion: $(vSales).
With a dollar expansion, QlikView will substitute the $(vSales) with Sum(Sales) before the
expression with the dollar expansion is evaluated. Some of you will recognize this as an old style
assembler macro expansion. The subsequent calculation will be made based on the evaluation of the
resulting expression. Note the two steps: (1) Variable expansion; and (2) Expression evaluation.
In the chart above, you can see the result of using a normal variable reference (the first expression)
or using a dollar expansion (the second expression). In the second expression, the variable is
expanded and the numbers are calculated correctly.
But this is just the beginning.
It is also possible to calculate the variable value, i.e. determine how it should be expanded, by using
an initial equal sign in the variable definition.
Let vSales2 = '=Sum(Sales)';
In this case, the variable value is calculated after each click, whereupon the dollar expansion in the
chart expression is made, and finally the expression is evaluated. This means that the evaluation of
Sum(Sales) is done before the variable expansion. Note the three steps: (1) Variable calculation; (2)
Variable expansion; and (3) Expression evaluation.
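The difference between the two-step and the three-step evaluation can be imitated in a few lines of Python. This is a toy model, not QlikView's engine: the `sales` data, the `evaluate`, `expand` and `chart` helpers are hypothetical, and the model assumes that a variable with a leading equal sign is calculated once over all rows rather than per chart row.

```python
import re

# Hypothetical Sales values per chart dimension value.
sales = {"A": [10, 20], "B": [5]}
variables = {"vSales": "Sum(Sales)", "vSales2": "=Sum(Sales)"}

def evaluate(expression, rows):
    """Minimal stand-in for expression evaluation: Sum(Sales) or a number literal."""
    if expression == "Sum(Sales)":
        return sum(rows)
    return float(expression)

def expand(expression):
    """Dollar expansion: replace $(name) with the variable's text."""
    def substitute(match):
        value = variables[match.group(1)]
        if value.startswith("="):
            # Leading equal sign: calculate the variable first, then expand the result.
            all_rows = [v for rows in sales.values() for v in rows]
            return str(evaluate(value[1:], all_rows))
        return value
    return re.sub(r"\$\((\w+)\)", substitute, expression)

def chart(expression):
    """Expand once, then evaluate the resulting expression for every dimension value."""
    expanded = expand(expression)
    return {dim: evaluate(expanded, rows) for dim, rows in sales.items()}

print(chart("$(vSales)"))   # {'A': 30, 'B': 5}
print(chart("$(vSales2)"))  # {'A': 35.0, 'B': 35.0}
```

With `vSales` the text Sum(Sales) reaches the chart and is aggregated per row. With `vSales2` the chart only sees the already calculated number.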
The table below summarizes the three methods.
With the above, you can do almost magical things. You can for instance make conditional
calculations that depend on e.g. selections, client platform or user.
Example:
Create a field [Field to Aggregate] containing the names of two other numeric fields:
'Quantity' and 'Sales'
Create a variable vConditionalAggregationField = '=Only([Field to Aggregate])'
Create a chart with an expression = Sum($(vConditionalAggregationField))
The calculation in a chart will now toggle between Sum(Quantity) and Sum(Sales) depending on your
selection.
The use of variables is an extremely powerful tool that you can use to create flexible applications.
Use it, but with caution. Too much magic behind the curtains can be confusing.
10. The QlikView Cache
Source: http://community.qlik.com/blogs/qlikviewdesignblog/2014/04/14/the-qlikview-cache
QlikView has a very efficient, patented caching algorithm that effectively eliminates the calculation
time for calculations that have been made before. In other words, if you use the back button in the
toolbar, or if you happen to make a selection that you have made before, you usually get the result
immediately. No calculation is necessary.
But how does it work? What is used as lookup ID?
For each combination of data set and selection or data sub-set and expression QlikView calculates a
digital fingerprint that identifies the context. This is used as lookup ID and stored in the cache
together with the result of the calculation.
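The idea of a fingerprint used as lookup ID can be sketched in Python. The actual QlikView algorithm is patented and not described in the post, so everything here is an assumption made for illustration: the use of SHA-256, the way the context is serialized, and the `fingerprint` and `cached_calculate` helpers.

```python
import hashlib

cache = {}   # fingerprint -> stored result
calls = []   # expressions that actually had to be calculated

def fingerprint(selection, expression):
    """Identify the context: the current selection plus the exact expression text."""
    context = repr(sorted(selection.items())) + "|" + expression
    return hashlib.sha256(context.encode("utf-8")).hexdigest()

def cached_calculate(selection, expression, calculate):
    """Return the cached result for this context, calculating it only on a miss."""
    key = fingerprint(selection, expression)
    if key not in cache:
        calls.append(expression)
        cache[key] = calculate()
    return cache[key]

selection = {"Year": 2014}
first = cached_calculate(selection, "Sum(Sales)", lambda: 42)
second = cached_calculate(selection, "Sum(Sales)", lambda: 42)    # served from the cache
third = cached_calculate(selection, "Sum( Sales )", lambda: 42)   # different text, new entry
print(calls)  # ['Sum(Sales)', 'Sum( Sales )']
```

Because the expression text is part of the context in this model, two spellings of the same formula produce two cache entries.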
Here "calculation" means both the Logical Inference and Chart calculation - or in fact, any expression
anywhere. This means that both intermediate and final results of a selection are stored.
There are some peculiarities you need to know about the cache:
The cache is global. It is used for all users and all documents. A cache entry does not belong
to one specific document or one user only. So, if a user makes a selection that another user
already has made, the cache is used. And if you have the same data in two different apps,
one single cache entry can be used for both documents.
Memory is not returned, when the document is unloaded. Cache entries will usually not be
purged until the RAM usage is close to or has reached the lower working set limit. QlikView
will then purge some entries and re-use the memory for other cache entries. This behavior
sometimes makes people believe there is a memory leak in the product. But have no fear: it
should be this way. So, you do not need to restart the service to clear the cache.
The oldest cache entries are not purged first. Instead several factors are used to calculate a
priority for each cache entry; factors like RAM usage, cost to calculate it again and time since
the most recent usage. Entries with a combined low priority will be purged when needed.
Hence, an entry that is cheap to calculate again will easily be purged, also if it recently was
used. And another value that is expensive to recalculate or just uses a small amount of RAM
will be kept for a much longer time.
The cache is not cleared when running macros, which I have seen some people claim.
You need to write your expression exactly right. If the same expression is used in several
places, it should be written exactly the same way (capitalization, same number of spaces,
etc.); otherwise it will not be considered to be the same expression. If you do, there should
be no big performance difference between repeating the formula, referring to a different
expression using the label of the expression or using the Column() function.
The cache efficiently speeds up QlikView. Basically it is a way to trade memory against CPU-time: If
you put more memory in your server, you will be able to re-use more calculations and thus use less
CPU-time.
11. Null handling in QlikView Source: http://community.qlik.com/docs/DOC-3155
Refer: Null and Nothing.pdf
12. Text searches
Source: http://community.qlik.com/blogs/qlikviewdesignblog/2013/10/16/text-searches
One of the strengths of QlikView is its search engine. With it, you can find pieces of information in a
fraction of a second and select the found field values. The response is immediate, which is necessary
for the user experience. Without it, you would easily get an empty result set without understanding
why.
Search strings can be made in many different ways, and QlikView will respond differently depending
on how the search string is defined. Normally you just enter a text, and QlikView will match this
against the beginning of the words in the field values. If several strings are entered, QlikView will
return the union of the matches of each of the strings.
But if you instead use a wildcard in your search string, the evaluation of the search string will be
made in a different way: the entire search string with the wild card will be matched against the
entire field value, sometimes yielding more matches, sometimes fewer.
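The two evaluation modes can be compared with a small Python model. It is a sketch under assumptions: the field values are made up, matching is treated as case-insensitive, and `normal_search` and `wildcard_search` are hypothetical helpers, not QlikView functions.

```python
from fnmatch import fnmatchcase

values = ["Blue Jeans", "Jeans Blue", "Bluebird", "Sky"]

def normal_search(search, field_values):
    """Match each search term against the beginning of any word; return the union."""
    terms = search.lower().split()
    return [
        value for value in field_values
        if any(word.startswith(term) for term in terms for word in value.lower().split())
    ]

def wildcard_search(pattern, field_values):
    """Match the whole pattern (with * and ?) against the entire field value."""
    lowered = pattern.lower()
    return [value for value in field_values if fnmatchcase(value.lower(), lowered)]

print(normal_search("blu", values))     # ['Blue Jeans', 'Jeans Blue', 'Bluebird']
print(wildcard_search("blu*", values))  # ['Blue Jeans', 'Bluebird']
print(normal_search("e", values))       # []
print(wildcard_search("*e*", values))   # ['Blue Jeans', 'Jeans Blue', 'Bluebird']
```

The wildcard version returns fewer matches for "blu*" because "Jeans Blue" does not begin with "blu", and more matches for "*e*" because the letter may appear anywhere in the value.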
If you want to create more complex search strings (and e.g. store them in actions or bookmarks) you
can do this too. Just use (, |, & and double quotes to define the syntax.
In all the above cases, the search and the selection are made in one and the same field. But
sometimes you want to make the selection in one field, but make the search in another. This can be
done using the associated search, which is an indirect search method. Start with the field where you
want to make the selection, enter the search string, and click on the small chevron to the right. You
will then get a list of other fields containing this search string. By clicking the desired match, you will
narrow down the number of matches in the primary list to show just the relevant values. You can
then make your selection by hitting Enter.
Further, did you know that:
In the user preferences and in the list box properties, you can define how a default search
string should be created, but this does not affect how it is evaluated, only how it is created.
Once created, you can add or remove wild cards as you please.
When you make a search and save the resulting selection in a bookmark, the bookmark will
contain the search string and not the list of selected values. When the bookmark is applied,
it will perform the search and select the found values. If data has changed, this may imply a
different search result than before.
You can use the same search string in many places: In list boxes, in Set analysis, in the
Advanced search dialog, in actions and in bookmarks.
Bottom line: The search string is a powerful tool that helps you find the values you want. Use it.
13. Automatic Number interpretation
Source: http://community.qlik.com/blogs/qlikviewdesignblog/2013/07/07/automatic-number-interpretation
I have in several previous blog posts written about the importance of interpreting dates and numbers
correctly, e.g. in "Why don't my dates work?". These posts have emphasized the use of interpretation
functions in the script, e.g. Date#().
But most of the time, you don't need any interpretation functions, since there is an automatic
interpretation that kicks in before that.
So, how does that work?
In most cases when QlikView encounters a string, it tries to interpret the string as a number. It
happens in the script when field values are loaded; it happens when strings are used in
where-clauses, or in formulae in GUI objects, or as function parameters. This is a good thing: QlikView
would otherwise not be able to interpret dates or decimal numbers in these situations.
QlikView needs an interpretation algorithm since it can mix data from different sources, some typed,
some not. For example, when you load a date from a text file, it is always a string: there are no data
types in text files, it is all text. But when you want to link this field to a date from a database, which
usually is a typed field, you would run into problems unless you have a good interpretation
algorithm.
For loaded fields, QlikView uses the automatic interpretation when appropriate (See table: In a text
file, all fields are text - also the ones with dates and timestamps.) QlikView does not use any
automatic interpretation for QVD or QVX files, since the interpretation already is done. It was done
when these files were created.
The logic for the interpretation is straightforward: QlikView compares the encountered string with
the information defined in the environment variables for numbers and dates in the beginning of the
script. In addition, QlikView will also test for a number with decimal point and for a date with the ISO
date format.
If a match is found, the field value is stored in a dual format (see Data Types in QlikView) using the
string as format. If no match is found, the field value is stored as text.
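The matching logic can be approximated in Python. This is a simplified sketch, not QlikView's algorithm: the list of date formats stands in for the DateFormat environment variable plus the ISO format and is an assumption, the `interpret` helper is hypothetical, and a dual is modelled as a (text, number) pair. Dates are numbered as days since 1899-12-30.

```python
from datetime import date, datetime

EPOCH = date(1899, 12, 30)               # day zero of the date serial numbers
DATE_FORMATS = ["%m/%d/%Y", "%Y-%m-%d"]  # assumed DateFormat setting, plus ISO

def interpret(text):
    """Return (text, number) if the string reads as a number or date, else (text, None)."""
    try:
        return (text, float(text))  # decimal point and scientific notation
    except ValueError:
        pass
    for fmt in DATE_FORMATS:
        try:
            day = datetime.strptime(text, fmt).date()
        except ValueError:
            continue
        return (text, float((day - EPOCH).days))
    return (text, None)  # no match: the value stays text

print(interpret("2013-01-01"))  # ('2013-01-01', 41275.0)
print(interpret("2013-01-32"))  # ('2013-01-32', None)
print(interpret("1E100"))       # ('1E100', 1e+100)
```

The second call shows why a string that only looks like a date, such as a day 32, is not treated as a number.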
An example: A where-clause in the script:
Where Date > '2013-01-01' will make a correct comparison
The field Date is a dual that is compared to a string. QlikView automatically interprets the string on
the right hand side and makes a correct numeric date comparison. QlikView does not (at this stage)
interpret the content of the field on the left hand side of the comparison. The interpretation should
already have been done.
A second example: The IsNum() function
IsNum('2013-01-01') will evaluate as True
IsNum('2013-01-32') will evaluate as False
In both cases, strings are used as parameters. The first will be considered a number, since it can be
interpreted as a date, but the second will not.
A third example: String concatenation
Month(Year & '-' & Month & '-' & Day) will recognize correct dates and return the dual month
value.
Here the fields Year, Month and Day are concatenated with delimiters to form a valid date format.
Since the Month() function expects a number (a date), the automatic number interpretation kicks in
before the Month() function is evaluated, and the date is recognized.
A final example: The Dual() function
Dual('Googol - A large number', '1E100') will evaluate to a very large number
Here the second parameter of Dual() is a string, but QlikView expects a number. Hence: automatic
interpretation. Here, you can see that scientific notation is automatically interpreted. This
sometimes causes problems, since strings that really are strings in some cases get interpreted as
numbers. In such cases you need to wrap the field in a text function.
With this, I hope that the QlikView number handling is a little clearer.
14. Why don't my dates work?
Source: http://community.qlik.com/blogs/qlikviewdesignblog/2013/02/19/why-don-t-my-dates-work
A common recurring question on the QlikCommunity forum is around dates that don't work. Here
is some help with fixing the three most common causes. If you encounter such a question on the
forum, just link to this post in your answer.
1. Incorrect Date Interpretation
When data is loaded into QlikView, dates are often read as strings. QlikView then tries to recognize a
pattern in the string that looks like the date format specified in the DateFormat environment
variable. This sometimes fails and then you need to use the Date#() function to help QlikView
understand that it is a date.
How do I know that a date is correctly interpreted? That's easy. Just format it as a number and see
what you get. (List box properties -> Number -> Integer)
The question is now what your list box looks like. If you have a number which is roughly 40000
(usually right-aligned), then you are all set. But if you still have a date stamp (usually left-aligned),
then you need to use the Date#() function in the script. See QlikView Date fields.
2. Linking integer dates with fractional dates
You have a date in two different tables, and you want to use this date as a key, but it doesn't seem
to work. Then you should suspect that you have true dates (integers) in one table and timestamps
(fractional numbers) in the other, but the formatting of the dates hides this fact.
How do I know whether this is the case? That's easy. Just format it as a timestamp and see what you
get. (List box properties -> Number -> TimeStamp)
The question is now what your list box looks like. If you have timestamps where hours, minutes and
seconds are all zero, then you are all set. But if you have numbers in these places, then you need to
use the Floor() function in the script to get integer dates. See QlikView Date fields.
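Why the key fails, and why taking the floor fixes it, can be seen with plain numbers. The values in this Python sketch are hypothetical; 41324 is the serial number of 2013-02-19, and 0.75 of a day is 18:00.

```python
import math

order_date = 41324      # 2013-02-19 stored as a true (integer) date
shipped_at = 41324.75   # 2013-02-19 18:00 stored as a timestamp

def date_key(value):
    """Drop the time of day so that the value can link to an integer date."""
    return math.floor(value)

print(order_date == shipped_at)            # False: the values do not link
print(order_date == date_key(shipped_at))  # True: same day after flooring
```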
3. Incorrect date comparisons
The most subtle error is however the one with timestamps in comparisons, e.g.
Where Date = '2011-12-31';
Will this work? Yes, provided that the date format inside the string is recognized by QlikView, i.e.
that it corresponds to the date format specified in the environment variable DateFormat in the
beginning of the script.
It becomes even more complex if you use variables. Then it is important to use quotes correctly. The
following will work:
Let vToday = Today();
Where Date = '$(vToday)';
but the following will not:
Where Date = $(vToday);
The reason is that the $(vToday) will expand to the date, and then the comparison will be e.g.
Where Date = 2/19/2013;
So the date (which is approximately 40000) will be compared to 2 divided by 19 divided by 2013,
which of course is not what you want.
My recommendation is to always use numeric variables for dates. They always work - quotes or no
quotes:
Let vToday = Num(Today());
Where Date = $(vToday);
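The three variants can be replayed in Python by treating dollar expansion as plain text substitution. This is a sketch; the `expand` helper is hypothetical and the date text assumes an M/D/YYYY date format, as in the example above.

```python
from datetime import date

EPOCH = date(1899, 12, 30)  # day zero of the date serial numbers

def expand(text, variables):
    """Replace every $(name) with the variable's value as text."""
    for name, value in variables.items():
        text = text.replace("$(" + name + ")", str(value))
    return text

today = date(2013, 2, 19)
as_text = {"vToday": "2/19/2013"}                # Let vToday = Today();
as_number = {"vToday": (today - EPOCH).days}     # Let vToday = Num(Today());

print(expand("Where Date = '$(vToday)';", as_text))  # Where Date = '2/19/2013';
print(expand("Where Date = $(vToday);", as_text))    # Where Date = 2/19/2013;
print(expand("Where Date = $(vToday);", as_number))  # Where Date = 41324;
print(2 / 19 / 2013)  # a tiny fraction, nowhere near 41324
```

The unquoted text variant ends up as an arithmetic expression, while the numeric variable expands to the serial number that the date field actually holds.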
15. Colors in Chart
Source: http://community.qlik.com/blogs/qlikviewdesignblog/2012/12/04/colors-in-charts
It is not uncommon that users want specific products or customers to be displayed in specific colors.
The most obvious way to do this is to change the colors in the chart properties. This is in fact quite
easy if you use the copy and paste functions found when you right-click a color button. Just copy one
button and paste on another, and you have moved the color.
This way you can assign which color is to be used for the different values of the field. However, a
prerequisite for this to work is that the order of the field values doesn't change.
A more robust way is to use color functions. Usually, you want to set the color of a bar, line or
bubble and this is done by using the Background Color on the Expression tab:
By the way, don't use Visual cues. This feature is old and not very versatile. Use color functions as
described here instead.
In the picture above, both the product ID and the color are hard-coded in the expression. However,
if you want to define colors for many products, the if-function will not be manageable. Then it is
better to store this information in a table, either in the database, in an Excel sheet, or as an inline
statement in a scriptlet that is included in the script. Hence,
1. Create your color definition table and store it in an appropriate place. The Red, Green and
Blue columns hold the different color components and define the color uniquely.
2. Load the color definitions into a mapping table:
ProductColors:
Mapping Load ProductID, Rgb(Red,Green,Blue) as ProductColor From ProductColors;
3. Use this mapping table when loading the products table, creating a new field for the product
color:
Applymap('ProductColors', ProductID , lightgray()) as ProductColor
The third parameter, here lightgray(), defines which color the unlisted products should get. If
you instead use null(), the unlisted products will be multicolored according to the color
settings in the chart properties.
4. Finally, use this field as product color in the charts:
This way it is easy to define which color specific products, customers, or other dimensions should
get.
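The lookup-with-default behavior of steps 2 and 3 can be sketched outside QlikView. The Python below is an analogy only: the product IDs and colors are invented, apply_map is a hypothetical stand-in for Applymap(), and the RGB value chosen for the default is a placeholder rather than a value taken from the source.

```python
# Hypothetical color definition table: ProductID -> (Red, Green, Blue).
product_colors = {
    "P1": (255, 0, 0),
    "P2": (0, 128, 0),
}

# Placeholder default; the exact RGB value behind lightgray() is an assumption.
LIGHT_GRAY = (192, 192, 192)

def apply_map(mapping, key, default):
    """Return the mapped value, or the default for unlisted keys."""
    return mapping.get(key, default)

products = ["P1", "P2", "P3"]
colored = {p: apply_map(product_colors, p, LIGHT_GRAY) for p in products}
print(colored)
```

Here P3 is not listed in the color table, so it receives the default color, just as unlisted products do through the third parameter.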
Which colors to use? Oh, that is a completely different topic:
Stephen Few has a number of good general recommendations.
Adam Bellerby has some recommendations on how to avoid problems for color blind users.
Shima Auzins suggests using colors as warning signals.
HIC
16. Aggregations and Function Classes Source: http://community.qlik.com/blogs/qlikviewdesignblog/2014/05/19/function-classes
A typical QlikView application may have one million records in the data, one hundred rows in a pivot
table and a single number, a KPI, in a gauge or text box. Although different in magnitudes, all three
numbers may still represent all data. The numbers are just different aggregation levels.
There are many functions in QlikView that can help you write the necessary formulae to calculate
aggregated KPIs. Some will collapse many records into one value, others will not. Today I will write
about the different function classes, and how you can combine them.
The Scalar Functions constitute the first class. Typical for these is that they are
one-to-one functions, i.e. they take a single value as parameter and return a single value (of
the dual data type). Examples: Left(), If(), Num(), Date(), Year(), Subfield(), etc.
The Aggregation Functions constitute the second class. These are many-to-one
functions, i.e. they use the values from many records as input and collapse these into one
single value that summarizes all records. Examples: Sum(), Count(), Avg(), Min(), Only(),
Concat(), etc.
Aggregation functions are special: You must use one to collapse several records into one number,
which means that you need them in pretty much any formula in QlikView: In Chart expressions, in
Text boxes, in Labels, etc. If you don't write an aggregation function in your expression, QlikView will
assign one for you: It will use the Only() function.
Scalar functions can be used both inside and outside the aggregation function:
Date( Min( Date ) )
Money( Sum( If( Group='A', Amount ) ) )
There is one restriction: You can normally not use an aggregation function inside another
aggregation function. Hence, you usually need every field reference to be wrapped in exactly
one aggregation function.
The next function class has only one member: The Aggr Function. In spite of its name, it is
not an aggregation function. It is a many-to-many function, rather like a tensor
or a matrix in mathematics. It converts an n-tuple (table) with N records to an n-tuple with M
records. In other words: It returns an array of values. Regard it as a virtual straight table with
one expression and one or several dimensions.
Most places in QlikView demand that you write your expression so that it returns one single value.
This means that you must wrap the Aggr function in an aggregation function to get a meaningful
result. The only exception is if you use the Aggr function to define a calculated dimension or field.
This means that you have two aggregation steps; one nested in the other:
Avg( Aggr( Sum( Amount ), Month ) )
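The two aggregation steps can be made explicit in a short Python sketch. The records are invented for the illustration, and the code only mimics what the expression computes; it is not how QlikView evaluates it internally.

```python
from collections import defaultdict

# Hypothetical records: (Month, Amount).
records = [("Jan", 100), ("Jan", 50), ("Feb", 30), ("Mar", 60), ("Mar", 60)]

# Inner step, like Aggr( Sum( Amount ), Month ): one sum per distinct Month.
sum_per_month = defaultdict(int)
for month, amount in records:
    sum_per_month[month] += amount

# Outer step, like Avg( ... ): collapse the per-month values into one number.
average_monthly_sum = sum(sum_per_month.values()) / len(sum_per_month)

print(dict(sum_per_month))
print(average_monthly_sum)
```

The result is the average of three monthly totals, which is a different number from the plain average of the five individual amounts.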
Charts complicate the matters slightly: A chart is like a For-Next loop where the number of distinct
dimension values determines the number of loops. In each loop, the expression must return one
value only, and this is the value used for the bar/slice/pivot table row.
However, sometimes you need values from other rows in the chart, and it could even be that you
need values from several rows. To solve this, there are two additional classes of functions that
should be used together:
The Chart Inter-record Functions return values fetched from other rows in the
chart. Some of these can return several values, i.e. an array of values. These functions are
only meaningful inside a chart or Aggr() function. Examples: Above(), Below(), Top(), etc.
The Range Functions are functions that can collapse a chart inter-record array
into one single value. Examples: RangeSum(), RangeMin(), RangeMax(), etc.
Example:
RangeSum( Above( Sum( Amount ), 0, 12 ) )
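How the two function classes cooperate can be sketched in Python. The functions above and range_sum below are simplified stand-ins written for this illustration, the row values are invented, and a window of 3 rows is used instead of 12 to keep the output short.

```python
# Hypothetical chart rows: one Sum( Amount ) value per row, in chart order.
row_values = [10, 20, 30, 40, 50]

def above(values, row, offset, count):
    """Values from `count` rows, starting `offset` rows above `row`
    and moving upwards. Rows outside the chart yield None."""
    result = []
    for step in range(count):
        i = row - offset - step
        result.append(values[i] if 0 <= i < len(values) else None)
    return result

def range_sum(values):
    """Sum a list of values, treating missing values (None) as 0."""
    return sum(v for v in values if v is not None)

# A window of 3 rows instead of 12, to keep the example small.
rolling = [range_sum(above(row_values, r, 0, 3)) for r in range(len(row_values))]
print(rolling)
```

The inter-record step returns an array per row, and the range step collapses that array into the single value the chart needs for that row.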
Bottom line: Know your functions. It will help you write correct expressions.
17. It's all Aggregations Source: http://community.qlik.com/blogs/qlikviewdesignblog/2013/08/06/it-s-all-aggregations
I often see incorrect expressions being used in the QlikCommunity forum. Expressions that seem to
work correctly, but really don't.
So, let me make this clear: Calculations in QlikView are aggregations.
It doesn't matter if it is a measure in a chart, or a calculated object label, or a show condition for an
object, or a calculated color, or an advanced search: all expressions in the user interface are
evaluated as aggregations. (Except calculated dimensions.)
This means that it is correct to use the Sum() function in an expression, since this is an aggregation
function - a function that uses several records as input. But if you omit the aggregation function or
use a scalar function only, e.g. RangeSum(), you can get an unexpected behavior.
Basically, all field references should be wrapped in an aggregation function. The Aggr() function and
some constructions using the total qualifier can even have several layers of aggregations.
But if the created expression does not contain an aggregation function, the expression is ill-formed
and potentially incorrect.
Examples:
=Sum(Amount)
=Count(OrderID)
These are both correct aggregations. Amount is wrapped in the Sum() function which will sum
several records of the field Amount. OrderID is wrapped in the Count() function, which will count the
records where OrderID has a value.
=Only(OrderID)
This is also a correct aggregation. OrderID is wrapped in the Only() function, which will return the
OrderID if there is only one value, otherwise NULL.
=OrderID
A single field reference is not an aggregation, so this is an ill-formed expression. But QlikView will
not throw an error. Instead it will use the Only() function to interpret the field reference. I.e., if there
is only one value, this value will be used. But if there are several possible values, NULL will be used.
So, it depends on the circumstances whether an expression without aggregation function is correct
or not.
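The behavior described for Only() can be written down as a small Python function. This is a sketch of the rule as stated in the text, with None standing in for NULL; it is not QlikView's implementation.

```python
def only(values):
    """Return the value if exactly one distinct non-null value exists,
    otherwise None (standing in for NULL)."""
    distinct = {v for v in values if v is not None}
    return next(iter(distinct)) if len(distinct) == 1 else None

print(only([2014, 2014, 2014]))  # one possible value
print(only([2013, 2014]))        # several possible values
print(only([]))                  # no possible values
```

A naked field reference therefore gives a usable value only when the current selection narrows the field down to a single possible value.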
=If(Year=Year(Today()), Sum(Amount1), Sum(Amount2))
Here, both the amounts are correctly wrapped in the Sum() function. But the first parameter of the
if() function, the condition, is not. Hence, this is an ill-formed expression. If it is used in a place where
there are several possible Years, the field reference will evaluate to NULL and the condition will be
evaluated as FALSE, which is not what you want. Instead, you probably want to wrap the Year in the
Min() or Max() function.
=ProductGroup= 'Shoes'
=IsNull(ProductGroup)
These expressions can both be used as show conditions or as advanced searches. However, since
there are no aggregation functions, the expressions are ill-formed. If you want to test whether there
exists Shoes or NULL values among the field values, you probably want to use the following instead:
=Count(If(ProductGroup= 'Shoes', ProductGroup))>0
=NullCount(ProductGroup)>0
Conclusions:
An aggregation function is a function that returns a single value describing some property
of several records in the data.
All UI expressions, except calculated dimensions, are evaluated as aggregations.
All field references in expressions must be wrapped in an aggregation function. If they
aren't, QlikView will use the Only() function.
18. Dimensions and Measures Source: http://community.qlik.com/blogs/qlikviewdesignblog/2013/03/25/dimensions-and-measures
To make a chart in QlikView (or in any Business Intelligence tool, for that matter) you need to know
what Dimensions and Measures are. But not all people have a clear picture of the difference between
the two. So this week's post will try to straighten out what's what.
When you make a chart, you should start by asking yourself "What do I want to show?" The answer
is usually Sales, Quantity or some other number. This is your Measure. In QlikView we have
traditionally called this an "Expression", but "Measure" is really the correct word. (There are
expressions that are not measures, e.g. expressions used as labels, or as sort order definitions).
The second question you should ask yourself is "How many times should this be calculated? Per
what do I want to show this measure?" The answer could be once per Month, per Customer, per
Supplier or something similar. This is your Dimension.
In the bar chart below, you have one bar per month, and a general rule is that you always have one
data point per distinct dimensional value: But depending on which visualization form you have
chosen, it can be a row (in a table), a point (in a scatter chart) or a slice (in a pie chart).
Measures
A database or a QlikView app can consist of thousands or millions of records that each contains a
small piece of information. A Measure is simply a calculation that can be made over multiple records
in this data set. The calculation always returns one single value that summarizes all relevant records.
This type of calculation is called an aggregation. There are several aggregation functions: Sum(),
Count(), Min(), Max(), etc.
Examples:
Each record contains a sales number. Then Sum(Sales) is a relevant measure that calculates
the total sales value.
Each record represents an order and OrderID is the key. Then Count(OrderID) is a relevant
measure that calculates the number of orders.
A Measure can be used almost anywhere in QlikView: In charts, in text boxes, as label for objects, in
gauges, etc. Typical measures are all KPIs: Revenue, Number of orders, Performance, Cost, Quantity,
Gross Margin, etc.
Once again: A Measure is always based on an aggregation. Always!
Dimensions
Contrary to Measures, dimensions are descriptive attributes, typically textual fields or discrete
numbers. A dimension is always an array of distinct values and the measure will be calculated once
per element in the array.
Example:
The field Customer is used as dimension. The individual customers will then be listed and
the measure will be calculated once per customer.
Typical dimensions are Customer, Product, Location, Supplier, Activity, Time, Color, Size, etc.
Like a For-Next loop
You can regard a chart like a For-Next loop: The Dimension is the loop variable; the calculations will
be made once per dimensional value. So the Dimension determines how many
rows/bars/points/slices the chart will have. The Measure is what is calculated in each loop.
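The loop analogy can be spelled out literally. The Python below is an illustration with invented records; the field names Customer and Sales follow the examples in the text.

```python
# Hypothetical records: (Customer, Sales).
records = [("Alfa", 10), ("Beta", 5), ("Alfa", 7), ("Gamma", 3), ("Beta", 5)]

# The dimension: the distinct values of Customer, in order of appearance.
customers = list(dict.fromkeys(customer for customer, _ in records))

# One loop pass per dimensional value; the measure is evaluated in each pass.
chart = {}
for customer in customers:
    chart[customer] = sum(sales for c, sales in records if c == customer)

print(chart)
```

Five records produce three chart rows, because the number of rows follows the number of distinct dimensional values and not the number of records.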
Several Dimensions
If you have two or three dimensions in a chart, the dimensional values no longer form an array, but
instead a matrix or a cube, where the measures are calculated once per cell in the cube.
SQL
You can also compare a chart with an SQL SELECT statement. The GROUP BY symbols are the
dimensions and the aggregations are the Measures.
With this, I hope that the difference between Dimensions and Measures is a little clearer.
19. Qlikview Quoteology Source: http://community.qlik.com/blogs/qlikviewdesignblog/2013/04/09/quoteology
In all programming environments there is a need for quotation marks, and QlikView is no exception.
But which symbol should you use? " ", [ ], ` ` or ' '? This post will try to explain the differences
between the different quotation marks.
When creating the script or an expression in QlikView, you need to reference fields, explicit values
and variables. To do this correctly, you sometimes need to write the string inside a pair of quotation
marks. One common case is when a field name contains a symbol that prevents QlikView from
parsing it correctly, like a space or a minus sign.
For example, if you have a field called Unit Cost, then
Load Unit Cost
will cause a syntax error since QlikView expects an "as" or a comma after the word "Unit". If you
instead write
Load [Unit Cost]
QlikView will load the field Unit Cost. Finally, if you write
Load 'Unit Cost'
QlikView will load the text string "Unit Cost" as field value. Hence, it is important that you choose
the correct quotation mark.
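The difference between a field reference and a literal can be mimicked in Python, where the same text acts either as a lookup key or as a plain string. The rows below are invented, and the comparison is an analogy only.

```python
# Hypothetical source table with a field name that contains a space.
source_rows = [{"Unit Cost": 12.5}, {"Unit Cost": 8.0}]

# Like Load [Unit Cost]: the name is a field reference, so values are fetched.
field_values = [row["Unit Cost"] for row in source_rows]

# Like Load 'Unit Cost': the text itself becomes the value on every row.
literal_values = ["Unit Cost" for _ in source_rows]

print(field_values)
print(literal_values)
```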
So, what are the rules? Which quote should I use? Single? Double? Square brackets?
There are three basic rules:
1. Single quotes are used to quote literals, e.g. strings that should be used as field values.
2. Inside a Load statement, to the left of the "as", double quotes are used to quote source field
references, i.e. names of fields.
3. Double quotes can always be substituted by square brackets or by grave accents.
With these three rules, most cases are covered. However, they don't cover everything, so I'll
continue:
In the script, but outside a Load statement, double quotes denote a variable reference and
not a field reference. If double quotes are used, the enclosed string will be interpreted as a
variable and the value of the variable will be used.
A general rule in QlikView is that field references inside a Load must refer to the fields in the input
table, i.e. the source of the Load statement. They are source field references, or in-context field
references. Aliases and fields that are created in the Load cannot be referred to since they do not exist
in the source. There are however a couple of exceptions: the functions Peek() and Exists(). The first
parameters of these functions refer to fields that either have already been created or are in
the output of the Load. These are out-of-context field references.
Out-of-context field references and table references, e.g. the parameters in NoOfRows() and
Peek(), should be regarded as literals and therefore need single quotes.
Finally, in many places you are free to use any of the four quotation methods, e.g.
o Inside a Set statement, to the right of the =
o Inside a Load statement, to the right of the as
o In places where QlikView expects a file name, a URL or a table name
o Defining the beginning and end of an inline table
o For the first parameter of Peek() or Exists() when used inside a Load
I have deliberately chosen not to say anything about SELECT statements. The reason is that the rules
depend on which database and which ODBC/OLEDB you have. But usually, rules 1-3 apply there also.
With this, I hope that the QlikView quoteology is a little clearer.
20. The Crosstable Load Source: http://community.qlik.com/blogs/qlikviewdesignblog/2014/03/24/crosstable
There are a number of prefixes in QlikView that help you load and transform data. One of them is
the Crosstable transformation.
Whenever you have a crosstable of data, the Crosstable prefix can be used to transform the data
and create the desired fields. A crosstable is basically a matrix where one of the fields is displayed
vertically and another is displayed horizontally. In the input table below you have one column per
month and one row per product.
But if you want to analyze this data, it is much easier to have all numbers in one field and all months
in another, i.e. in a three-column table. It is not very practical to have one column per month, since
you want to use Month as dimension and Sum(Sales) as measure.
Enter the Crosstable prefix.
It converts the data to a table with one column for Month and another for Sales. Another way to
express it is to say that it takes field names and converts these to field values. If you compare it
to the Generic prefix, you will find that they in principle are each other's inverses.
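What the transformation does to the data can be sketched in Python. The function unpivot and the sample numbers are made up for this illustration; only the field names Product, Month and Sales come from the text.

```python
# Hypothetical input crosstable: one row per product, one column per month.
crosstable = [
    {"Product": "A", "Jan 2014": 10, "Feb 2014": 20, "Mar 2014": 30},
    {"Product": "B", "Jan 2014": 5, "Feb 2014": 0, "Mar 2014": 15},
]

def unpivot(rows, qualifiers, attribute_field, data_field):
    """Turn every non-qualifier column into its own output row."""
    result = []
    for row in rows:
        for name, value in row.items():
            if name in qualifiers:
                continue
            out = {q: row[q] for q in qualifiers}
            out[attribute_field] = name  # the field name becomes a field value
            out[data_field] = value
            result.append(out)
    return result

sales = unpivot(crosstable, ["Product"], "Month", "Sales")
for record in sales:
    print(record)
```

Two input rows with three month columns each become six rows in a three-column table, which is the shape needed to use Month as dimension and Sum(Sales) as measure.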
The syntax is
Crosstable (Month, Sales) Load Product, [Jan 2014], [Feb 2014], [Mar 2014], ... From ... ;
There are however a couple of things worth noting:
Usually the input data has only one column as qualifier field, serving as internal key (Product in the
above example). But you can have several. If so, all qualifying fields must be listed before the
attribute fields, and the third parameter to the Crosstable prefix must be used to define the
number of qualifying fields.
It is not possible to have a preceding Load or a prefix in front of the Crosstable keyword.
Auto-concatenate will however work.