Multidimensional (Multivariate) Data Visualization
—— IV Course Spring’14
Graduate Course of UCAS
May 9th, 2014
Data by Dimensionality
• 1-D (Linear, Set and Sequences): SeeSoft, Info Mural
• 2-D (Map): GIS, ArcView, PageMaker
• 3-D (Shape, the World): CAD, Medical, Architecture
• n-D (Relational, Statistical): Spotfire, Tableau
• Temporal: LifeLines, Palantir
• Tree (Hierarchy): Cone/Cam/Hyperbolic
• Network (Graph): Pajek, JUNG
Relational Data Model
• Represent data as a table
  – Each row (tuple) represents a single record
  – Each record is a fixed-length tuple
  – Each column (attribute) represents a single variable
  – Each attribute has a name and a data type
• A database is a collection of tables
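The relational model above can be sketched in a few lines of Python. This is only an illustration: the `Student` and `Course` tables and their values are hypothetical, chosen to match the student/course examples used later in the deck.

```python
from typing import NamedTuple

class Student(NamedTuple):
    """One record: a fixed-length tuple whose attributes have names and types."""
    name: str
    age: int
    gpa: float

class Course(NamedTuple):
    code: str
    title: str

# A table is a collection of records (rows) sharing the same attributes (columns).
students = [
    Student("Alice", 23, 3.7),   # hypothetical data
    Student("Bob", 25, 3.2),
]
courses = [Course("IV", "Information Visualization")]

# A database is a collection of named tables.
database = {"students": students, "courses": courses}

# Reading a column: the same attribute taken from every row.
ages = [row.age for row in database["students"]]

# The attribute names of a table.
columns = Student._fields
```

Using `NamedTuple` keeps each record fixed-length and gives every attribute both a name and a declared type, which mirrors the four properties listed on the slide.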
Statistical Data Model
• Dimensions: Nominal/Ordinal variable describing data
  – Dates, categories of values (independent variables)
• Measures: Interval/Ratio that can be aggregated
  – Numbers to be analyzed (dependent variables)
  – Aggregate as sum, count, average, std. deviation
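As a rough sketch of the dimension/measure split, the snippet below groups records by one dimension and aggregates one measure per group. The records, the field names and the `aggregate` helper are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical records: country and year are dimensions, medals is a measure.
records = [
    {"country": "A", "year": 2008, "medals": 10},
    {"country": "A", "year": 2012, "medals": 14},
    {"country": "B", "year": 2008, "medals": 4},
    {"country": "B", "year": 2012, "medals": 6},
]

def aggregate(rows, dimension, measure):
    """Group rows by a dimension and aggregate one measure per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[dimension]].append(row[measure])
    return {
        key: {
            "sum": sum(values),
            "count": len(values),
            "average": mean(values),
            "std": pstdev(values),  # population standard deviation
        }
        for key, values in groups.items()
    }

by_country = aggregate(records, "country", "medals")
by_year = aggregate(records, "year", "medals")
```

Swapping the `dimension` argument changes how the same measure is broken down, which is the basic operation behind the pivot tables and OLAP cubes discussed later.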
Data by Variable/Measurement Types
• N - Nominal (labels)
  – Fruits: Apples, oranges, …
• O - Ordinal
  – Sanitation of restaurants: A/B/C
• Q - Interval (No zero measure)
  – Date: Jan. 19, 2006; Location (LAT 33.98, LONG -118.45)
  – Like a geometric point. Cannot compare directly
  – Only differences (i.e. intervals) may be compared
• Q - Ratio (zero fixed)
  – Physical measurement: Length, Mass, Temp, …
  – Counts and amounts
  – Like a geometric vector, origin is meaningful
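The interval/ratio distinction shows up directly in Python's `datetime` module, which is used here only as a convenient analogy: two dates (points) can be subtracted but not added, while the resulting difference behaves like a ratio quantity. The second date and the lengths are hypothetical values.

```python
from datetime import date

# Interval data: dates have no natural zero, so only differences are meaningful.
d1 = date(2006, 1, 19)
d2 = date(2014, 5, 9)
interval = d2 - d1            # a timedelta: a valid, comparable quantity
days_between = interval.days

# Adding two dates is undefined, like adding two geometric points.
try:
    d1 + d2
    points_can_be_added = True
except TypeError:
    points_can_be_added = False

# Ratio data: zero is fixed, so ratios are meaningful (like vectors).
length_a_m = 2.0              # hypothetical lengths in metres
length_b_m = 8.0
ratio = length_b_m / length_a_m   # "b is four times as long as a"

# A difference of interval values is itself a ratio value and can be scaled.
doubled_days = (interval * 2).days
```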
Multivariate Data and Analysis
• Definitions
  – Multivariate analysis is based on the statistical principle of multivariate statistics, which involves observation and analysis of more than one statistical outcome variable at a time.
  – Multivariate statistics is a form of statistics encompassing the simultaneous observation and analysis of more than one outcome variable.
• Multivariate Data: three main components
  – Objects: Items of interest (students, courses, terms, …)
  – Attributes: Characteristics or properties of data (name, age, GPA, number, date, …)
  – Relations: How two or more objects relate (student takes course, course during term, …)
Example
[Figure: London Olympic Game Performance tables, annotated with Objects (Entries/Cases), Metadata, Attributes (Measures/Variables), and Relationship among multiple objects & tables]
Example
Multivariate Data Classification
• Number of outcome/dependent variables per entry/case
  – 1 - Univariate data
  – 2 - Bivariate data
  – 3 - Trivariate data
  – >3 - Hypervariate data
Univariate Data Visualization
• Put independent variable/cases (Country) on x-axis
• Put dependent variable/measures (#gold medal) on y-axis
Bivariate Data Visualization
Trivariate Data Visualization
Represent each variable in separate charts
[Figure: separate charts of horsepower, mileage and price over the cases]
Hypervariate Data Visualization
• 4~20 variables/measures
• nD -> 2D projection (3D): in maths, MDS/PCA/…
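To make the nD -> 2D projection concrete, here is a minimal PCA sketch in pure Python. It assumes power iteration with deflation as the eigen-solver, which is one simple choice among several; the `pca_2d` function and the 4-D sample rows are hypothetical. MDS is not shown.

```python
import math
import random

def pca_2d(rows, iterations=200, seed=0):
    """Project n-D rows onto their first two principal components."""
    n, dims = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    centered = [[r[j] - means[j] for j in range(dims)] for r in rows]
    # Sample covariance matrix (dims x dims).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(dims)]
           for i in range(dims)]
    rng = random.Random(seed)

    def matvec(m, v):
        return [sum(m[i][j] * v[j] for j in range(dims)) for i in range(dims)]

    components = []
    for _ in range(2):
        # Power iteration: repeatedly multiply by the matrix and renormalize.
        vec = [rng.uniform(-1.0, 1.0) for _ in range(dims)]
        for _ in range(iterations):
            nxt = matvec(cov, vec)
            norm = math.sqrt(sum(x * x for x in nxt))
            if norm == 0.0:
                break
            vec = [x / norm for x in nxt]
        eigenvalue = sum(a * b for a, b in zip(vec, matvec(cov, vec)))
        components.append(vec)
        # Deflation: subtract the found component so the next one can emerge.
        cov = [[cov[i][j] - eigenvalue * vec[i] * vec[j] for j in range(dims)]
               for i in range(dims)]
    return [[sum(c[j] * comp[j] for j in range(dims)) for comp in components]
            for c in centered]

# Hypothetical 4-D data: the first three variables are strongly correlated.
data = [
    [2.0, 4.1, 6.0, 1.0],
    [4.0, 7.9, 12.1, 1.5],
    [6.0, 12.2, 17.9, 0.5],
    [8.0, 15.8, 24.2, 1.2],
    [10.0, 20.1, 29.8, 0.8],
]
projected = pca_2d(data)  # five (pc1, pc2) pairs, ready for a 2D scatterplot
```

In practice a numerical library would normally be used for the eigen-decomposition; the hand-rolled version is only meant to show what the projection does.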
Hypervariate Data Visualization
• More visual channels: ~10 variables
  – x, y, color hue, saturation, value, size, shape, orientation, rotation, texture, etc.
[Figure: a tensor field by tile visualization]
Hypervariate Data Visualization
• Separate charts, multiple views on different variables
[Figure: grid of charts, cases against variables]
Hypervariate Data Visualization
• TableLens
  – Turn spreadsheet into statistical data graphics
  – Leverage the basic bar and scatterplot design
  – Change nominal values to scatterplots
  – Change quantitative values to bars
TableLens
Focus + Context
Hypervariate Data Visualization
TableLens video (0:00~5:00)
InfoZoom video
However, spreadsheet-like visualizations show no correlation among variables
Scatterplot Matrix
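A scatterplot matrix places one scatterplot in every cell of a variables-by-variables grid. The sketch below only computes the panel contents, not the drawing. The `scatterplot_matrix` helper, the added Pearson coefficient per panel and the car values are assumptions for illustration; the variable names are borrowed from the trivariate example.

```python
import math
from itertools import product

def pearson(xs, ys):
    """Pearson correlation coefficient (assumes neither column is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def scatterplot_matrix(columns):
    """Return one panel per ordered pair of variables, keyed by (row, col)."""
    names = list(columns)
    panels = {}
    for (r, y_name), (c, x_name) in product(enumerate(names), repeat=2):
        panels[(r, c)] = {
            "x": x_name,
            "y": y_name,
            "points": list(zip(columns[x_name], columns[y_name])),
            "r": pearson(columns[x_name], columns[y_name]),
        }
    return panels

# Hypothetical car data.
cars = {
    "horsepower": [90, 110, 150, 200],
    "mileage": [35, 30, 24, 18],
    "price": [15000, 18000, 26000, 40000],
}
panels = scatterplot_matrix(cars)
```

With n variables the grid has n x n panels, so every pairwise relationship is visible at once, which is what the spreadsheet-like views cannot show.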
Pivot Table: Flexibly aggregating spreadsheets
[Figure: a Data Table and the Pivot Table derived from it]
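A pivot table can be sketched as a two-level grouping followed by an aggregate. The `pivot_table` function, its parameter names and the sales rows below are hypothetical and not tied to any particular spreadsheet product.

```python
from collections import defaultdict

def pivot_table(rows, index, columns, values, aggfunc=sum):
    """Aggregate `values` for every (index, columns) combination."""
    cells = defaultdict(list)
    for row in rows:
        cells[(row[index], row[columns])].append(row[values])
    table = defaultdict(dict)
    for (r, c), vals in cells.items():
        table[r][c] = aggfunc(vals)
    return {r: dict(cs) for r, cs in table.items()}

# Hypothetical flat data table.
sales = [
    {"region": "East", "quarter": "Q1", "sales": 100},
    {"region": "East", "quarter": "Q1", "sales": 50},
    {"region": "East", "quarter": "Q2", "sales": 70},
    {"region": "West", "quarter": "Q1", "sales": 30},
    {"region": "West", "quarter": "Q2", "sales": 90},
]

totals = pivot_table(sales, index="region", columns="quarter", values="sales")
counts = pivot_table(sales, "region", "quarter", "sales", aggfunc=len)
```

Changing `index`, `columns` or `aggfunc` re-aggregates the same flat table without touching the data, which is the flexibility the slide title refers to.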
OLAP Cubes: Multidimensional analytics in BI and Data Management
Slice
Dice
OLAP Cubes
Drill-down
Pivot
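The four OLAP operations named on these slides can be sketched on a tiny cube stored as a dictionary. Everything here is an assumption for illustration: the dimension names, the month-to-quarter hierarchy, the cell values and the function names.

```python
from collections import defaultdict

DIMS = ("product", "region", "month")
# Hypothetical cube: one coordinate per dimension maps to a sales measure.
cube = {
    ("pen", "east", "Jan"): 5, ("pen", "east", "Feb"): 7,
    ("pen", "west", "Jan"): 3, ("ink", "east", "Jan"): 2,
    ("ink", "west", "Apr"): 9,
}
QUARTER = {"Jan": "Q1", "Feb": "Q1", "Mar": "Q1", "Apr": "Q2"}

def slice_(cube, dim, value):
    """Slice: fix one dimension to a single value and drop it from the keys."""
    i = DIMS.index(dim)
    return {k[:i] + k[i + 1:]: v for k, v in cube.items() if k[i] == value}

def dice(cube, **allowed):
    """Dice: keep cells whose coordinates are in the allowed sets."""
    return {k: v for k, v in cube.items()
            if all(k[DIMS.index(d)] in vals for d, vals in allowed.items())}

def roll_up_months(cube):
    """Roll-up: aggregate months into quarters (the coarser level)."""
    out = defaultdict(int)
    for (p, r, m), v in cube.items():
        out[(p, r, QUARTER[m])] += v
    return dict(out)

def drill_down(cube, product, region, quarter):
    """Drill-down: expand one quarterly cell back into its monthly cells."""
    return {m: v for (p, r, m), v in cube.items()
            if (p, r, QUARTER[m]) == (product, region, quarter)}

def pivot(cube, order):
    """Pivot: reorder the dimensions, e.g. to swap rows and columns."""
    idx = [DIMS.index(d) for d in order]
    return {tuple(k[i] for i in idx): v for k, v in cube.items()}

january = slice_(cube, "month", "Jan")
pens_east = dice(cube, product={"pen"}, region={"east"})
quarterly = roll_up_months(cube)
months_of_q1 = drill_down(cube, "pen", "east", "Q1")
by_month_first = pivot(cube, ("month", "product", "region"))
```

Roll-up is included because drill-down is its inverse: the quarterly total for pens in the east breaks back down into the January and February cells.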
Polaris: Multi-dimensional data visualization with extended Pivot Tables
Tableau: Commercial version of Polaris
Video demo: Tableau visualization of OLAP cube
Still missing something on multidimensional data?
No multidimensional relationships!
Attribute Explorer
• Attribute histogram
• All objects on all attribute scales
• Interaction with attribute limits
Attribute Explorer
• Inter-relations between attributes – brushing
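Brushing across attribute histograms can be sketched as follows: limits set on one attribute select a subset of objects, and that subset is then re-binned on every other attribute so it can be highlighted there. The `histogram` and `brush` helpers and the house-like objects are hypothetical. Values are assumed to lie inside each histogram's range.

```python
def histogram(values, lo, hi, bins):
    """Count values into equal-width bins over [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        i = min(int((v - lo) / width), bins - 1)  # top edge goes in last bin
        counts[i] += 1
    return counts

def brush(objects, limits):
    """Return the objects that fall inside every (low, high) attribute limit."""
    return [o for o in objects
            if all(lo <= o[a] <= hi for a, (lo, hi) in limits.items())]

# Hypothetical objects with two attributes.
objects = [
    {"price": 100, "rooms": 2},
    {"price": 150, "rooms": 3},
    {"price": 220, "rooms": 3},
    {"price": 300, "rooms": 5},
    {"price": 380, "rooms": 4},
]

# Brush the price attribute, then look at the effect on the rooms histogram.
selected = brush(objects, {"price": (100, 250)})
rooms_all = histogram([o["rooms"] for o in objects], 2, 5, 3)
rooms_selected = histogram([o["rooms"] for o in selected], 2, 5, 3)
```

Drawing `rooms_selected` on top of `rooms_all` in a different color gives the linked highlight: a limit set on price becomes visible on the rooms scale.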
Attribute Explorer
• Color-encoded sensitivity
Attribute Explorer
Old-fashioned Video Demo!
Parallel Coordinate
Parallel Coordinate
• Sample multivariate data
Parallel Coordinate
• First data entry
[Figure: polyline across the parallel axes V1 V2 V3 V4 V5]
Parallel Coordinate
• Second data entry
[Figure: polyline across the parallel axes V1 V2 V3 V4 V5]
Parallel Coordinate
• Third data entry
[Figure: polyline across the parallel axes V1 V2 V3 V4 V5]
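The step-by-step construction on these slides, one polyline per data entry across the axes V1 to V5, can be sketched as a mapping from rows to vertex lists. The `parallel_coordinates` function, the canvas size and the three sample rows are assumptions. Each axis is scaled min-max independently, and y grows with the value (a screen renderer would usually flip it).

```python
def parallel_coordinates(rows, width=400.0, height=100.0):
    """Map each n-D row to a polyline of (x, y) vertices, one per axis."""
    dims = len(rows[0])
    lows = [min(r[j] for r in rows) for j in range(dims)]
    highs = [max(r[j] for r in rows) for j in range(dims)]
    spacing = width / (dims - 1)          # equally spaced vertical axes
    polylines = []
    for r in rows:
        line = []
        for j, v in enumerate(r):
            span = highs[j] - lows[j]
            # Normalize to [0, 1]; a constant axis is drawn at mid-height.
            t = (v - lows[j]) / span if span else 0.5
            line.append((j * spacing, t * height))
        polylines.append(line)
    return polylines

# Hypothetical entries with five variables V1..V5.
entries = [
    [1, 10, 5, 0, 3],
    [2, 30, 5, 8, 1],
    [3, 20, 5, 4, 2],
]
polylines = parallel_coordinates(entries)
```

Each row becomes a polyline with one vertex per axis, so adding a variable adds one axis without changing the dimensionality of the drawing.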
Case Study: VLSI Chip Dataset
• The Dataset:
  – Production data for 473 batches of a VLSI chip
  – 16 process parameters:
    – X1: The yield: % of produced chips that are useful
    – X2: The quality of the produced chips (speed)
    – X3 … X12: 10 types of defects (zero defects shown at top)
    – X13 … X16: 4 physical parameters
• The Objective:
  – Raise the yield (X1) and maintain high quality (X2)
A. Inselberg, Multidimensional Detective, Proceedings of IEEE Symposium on Information Visualization (InfoVis '97), 1997.
Case Study: VLSI Chip Dataset
• Overview
Case Study: VLSI Chip Dataset
• Top Yield & Quality
[Figure annotations: Defects; Splits]
Case Study: VLSI Chip Dataset
• Zero Defect: not the highest yield and quality
Case Study: VLSI Chip Dataset
• Best quality: some defects are necessary!
Parallel Coordinate Demo
Parallel Set
• How about categorical data?
Live Demo
Star Plot (Radar Map)
• Rotate coordinate from Parallel Coordinate
Star Plot (Radar Map)
• Single-view vs. Multiple-view
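A star plot can be read as parallel coordinates with the axes rotated around a common center: axis j sits at angle 2πj/n and each value becomes a radius along it. The sketch below computes the polygon vertices for one entry. The `star_plot` function, the axis ranges and the sample row are assumptions.

```python
import math

def star_plot(row, lows, highs, radius=1.0):
    """Return the closed polygon of one entry, one vertex per radial axis."""
    n = len(row)
    vertices = []
    for j, v in enumerate(row):
        t = (v - lows[j]) / (highs[j] - lows[j])   # normalized to [0, 1]
        angle = 2 * math.pi * j / n                # axis j rotated around center
        vertices.append((t * radius * math.cos(angle),
                         t * radius * math.sin(angle)))
    return vertices + vertices[:1]                 # repeat first vertex to close

# Hypothetical entry with four attributes, each ranging over 0..10.
polygon = star_plot([10, 5, 0, 5], lows=[0, 0, 0, 0], highs=[10, 10, 10, 10])
```

In a single view all entries share one set of axes. In multiple views each entry gets its own small star, and the same vertex computation is reused per entry.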
Star Coordinate
• Use data point instead of polyline in Star Plots
  – Accumulate data value along a vector parallel to the axis
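The accumulation rule can be sketched as a vector sum: each normalized value is multiplied by the unit vector of its axis, and the results are added to give a single 2D point. The `star_coordinates` function and the sample rows are assumptions, with axes spaced evenly around the circle.

```python
import math

def star_coordinates(row, lows, highs):
    """Map one n-D entry to a single 2D point by summing along the axes."""
    n = len(row)
    x = y = 0.0
    for j, v in enumerate(row):
        t = (v - lows[j]) / (highs[j] - lows[j])   # normalized to [0, 1]
        angle = 2 * math.pi * j / n
        x += t * math.cos(angle)                   # step along axis j
        y += t * math.sin(angle)
    return (x, y)

lows, highs = [0, 0, 0, 0], [10, 10, 10, 10]
point_a = star_coordinates([10, 5, 0, 5], lows, highs)
point_b = star_coordinates([10, 10, 10, 10], lows, highs)
point_c = star_coordinates([0, 0, 0, 0], lows, highs)
```

Because many n-D entries collapse onto the same 2D point (here `point_b` and `point_c` both land on the origin), the projection is ambiguous on its own.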
Summary
• Multivariate Data Model
  – Statistical and relational
  – Univariate, Bivariate, Trivariate, Hypervariate
• Multivariate Data Visualization
  – Charts, scatterplot, spreadsheet and spreadsheet-like visualization
  – Scatterplot matrix, pivot table, OLAP cube, Polaris and Tableau
  – Parallel Coordinate and Parallel Set
  – Star plot (Radar map) and star coordinate
Questions?
What’s Next – Multivariate Data Visualization Fun Demos
Final Project Checkpoint
Are you ready?
Team coordinators/leaders, please find Hanpengyu now
Fun Visualizations and Demos
FLINA: Flexible Linked Axes for Multivariate Data Visualization
Chernoff Faces
Mosaic Plot
Dust & Magnet
Untangling Euler Diagram