30
1 UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY Lecture 9: Visualization TIES443: Introduction to DM 1 Lecture 9 Lecture 9 V isualization isualization Department of Mathematical Information Technology University of Jyväskylä Mykola Pechenizkiy Course webpage: http://www.cs.jyu.fi/~mpechen/TIES443 TIES443 November 17, 2006 UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY Lecture 9: Visualization TIES443: Introduction to DM 2 Topics for today Topics for today Purpose of visualization in DM/KDD Visualization of data 1D, 2D, 3D, 3+D Visualization of data mining results Model visualization, model performance visualization Visualization of data mining processes Application … Interactive data mining and visualization Time-series visualization Data mining as part of visualization system or vice versa?

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL ...mpechen/courses/TIES443/handouts/lecture09.pdf · TIES443: Introduction to DM Lecture 9: Visualization 3 Sources • Books:

  • Upload
    others

  • View
    7

  • Download
    0

Embed Size (px)

Citation preview

Page 1: UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL ...mpechen/courses/TIES443/handouts/lecture09.pdf · TIES443: Introduction to DM Lecture 9: Visualization 3 Sources • Books:

1

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 1

Lecture 9Lecture 9

VVisualizationisualization

Department of Mathematical Information TechnologyUniversity of Jyväskylä

Mykola Pechenizkiy

Course webpage: http://www.cs.jyu.fi/~mpechen/TIES443

TIES443

November 17, 2006

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 2

Topics for todayTopics for today

• Purpose of visualization in DM/KDD

• Visualization of data

– 1D, 2D, 3D, 3+D

• Visualization of data mining results

– Model visualization, model performance visualization

• Visualization of data mining processes

• Application …

– Interactive data mining and visualization

– Time-series visualization

– Data mining as part of visualization system or vice versa?

Page 2: UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL ...mpechen/courses/TIES443/handouts/lecture09.pdf · TIES443: Introduction to DM Lecture 9: Visualization 3 Sources • Books:

2

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 3

SourcesSources

• Books:– “Information Visualization in DM and KD” Fayyad et al.– “Information Visualization: Beyond the Horizon”: 2nd ed, C. Chen– “The Visual Display of Quantitative Information”, 2nd ed, E. Tufte

• People:– Edward Tufte, Ben Shneiderman, Daniel A. Keim, Marti Hearst,

Tamara Munzner, and other

• Excellent handouts and recommended reading on information visualization by Tamara Munzner– http://www.cs.ubc.ca/~tmm/courses/cpsc533c-04-spr/

• Some papers:– Visual Data Mining Framework for Convenient Identification of

Useful Knowledge– http://www.cs.uic.edu/~kzhao/Papers/06_ICDM_Zhao_Visual.pdf

• Lots of other sources …

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 4

Visualization & related fieldsVisualization & related fields

• User Interface Studies

• Computer Vision

• Cognitive Science

• Computer-Aided Design

• Geometric Modelling

• Computer Graphics

• Signal Processing

• Approximation Theory

Fig. by Jack van Wijk from “Views on Visualization ”

Page 3: UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL ...mpechen/courses/TIES443/handouts/lecture09.pdf · TIES443: Introduction to DM Lecture 9: Visualization 3 Sources • Books:

3

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 5

Purpose of Visualization in DM/KDDPurpose of Visualization in DM/KDD

• Data Visualization– Qualitative overview of large and complex data sets

– Data summarization and (interactive) exploration

– Regions of interest identification

– Support humans in looking for structures, features, patterns, trends, and anomalies in data

– Perceptual capabilities of the human visual system

• Model (and its performance) Visualization

• Process Visualization

• Disadvantages? – requires human eyes

– can be misleading

"The purpose of computing is insight, not numbers"Richard Hamming

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 6

Visualization Techniques CategorizationVisualization Techniques Categorization

• Based on task at hand– To explore data– To confirm a hypothesis

• Based on structure of underlying data set• Based on the dimension of the display

– Stationary– Animated– Interactive

• Geometric vs. Symbolic– Results of physical models, simulations, computations

– Nonnumeric data• Networks, relations as graphs; icons, arrays

• 1D-2D-3D, stereoscopic, virtual reality vs. 3+D• Static vs. Dynamic

Page 4: UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL ...mpechen/courses/TIES443/handouts/lecture09.pdf · TIES443: Introduction to DM Lecture 9: Visualization 3 Sources • Books:

4

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 7

Some Basic Visualization TerminologySome Basic Visualization Terminology

• Visualization

• Interaction

• Graphical attributes

• Mapping

• Rendering

• Field

• Scalar, Vector, Tensor

• Isosurface

• Glyph

• …

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 8

Perception in VisualizationPerception in Visualization

• What graphical primitives do humans detect quickly?

• What graphical attributes can be accurately measured?

• How many distinct values for a particular attribute can be used without confusion?

• How do we combine primitives to recognize complex phenomena?

• Human perception and information theory

Page 5: UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL ...mpechen/courses/TIES443/handouts/lecture09.pdf · TIES443: Introduction to DM Lecture 9: Visualization 3 Sources • Books:

5

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 9

The most of following slides for The most of following slides for datadata

visualization were adopted from the visualization were adopted from the presentation presentation ““Visualization and Data Visualization and Data

MiningMining”” by G. by G. PiatetskyPiatetsky--ShapiroShapiro(available from (available from www.kdnuggets.comwww.kdnuggets.com))

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 10

Data Visualization OutlineData Visualization Outline

• Graphical excellence and lie factor

• Representing data in 1,2, and 3-D

• Representing data in 4+ dimensions– Parallel coordinates

– Scatterplots

– Stick figures

Page 6: UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL ...mpechen/courses/TIES443/handouts/lecture09.pdf · TIES443: Introduction to DM Lecture 9: Visualization 3 Sources • Books:

6

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 11

Napoleon Invasion of Russia, 1812Napoleon Invasion of Russia, 1812

Napoleon

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 12

© www.odt.org , from http://www.odt.org/Pictures/minard.jpg, used by permission

Page 7: UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL ...mpechen/courses/TIES443/handouts/lecture09.pdf · TIES443: Introduction to DM Lecture 9: Visualization 3 Sources • Books:

7

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 13

Snow’s Cholera Map, 1855

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 14

Asia at nightAsia at night

Page 8: UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL ...mpechen/courses/TIES443/handouts/lecture09.pdf · TIES443: Introduction to DM Lecture 9: Visualization 3 Sources • Books:

8

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 15

Bad Visualization: misleading Y -axis

21242003

21212002

21202001

21052000

21101999

SalesYear

Sales

2095

2100

2105

2110

2115

2120

2125

2130

1999 2000 2001 2002 2003

Sales

Y-Axis scale gives WRONG

impression of big change

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 16

Better VisualizationBetter Visualization

21242003

21212002

21202001

21052000

21101999

SalesYear

Sales

0

500

1000

1500

2000

2500

3000

1999 2000 2001 2002 2003

Sales

Axis from 0 to 2000 scale gives

correct impression of small change

Page 9: UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL ...mpechen/courses/TIES443/handouts/lecture09.pdf · TIES443: Introduction to DM Lecture 9: Visualization 3 Sources • Books:

9

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 17

Lie Factor=14.8

(E.R. Tufte, (E.R. Tufte, ““The Visual Display of Quantitative InformationThe Visual Display of Quantitative Information””, 2nd edition), 2nd edition)

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 18

Lie FactorLie Factor

==

dataineffectofsize

graphicinshowneffectofsizeFactorLie

8.14528.0

833.7

18

)0.185.27(6.0

)6.03.5(

==

=

Tufte requirement: 0.95<Lie Factor<1.05

(E.R. Tufte, (E.R. Tufte, ““The Visual Display of Quantitative InformationThe Visual Display of Quantitative Information””, 2nd edition), 2nd edition)

Page 10: UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL ...mpechen/courses/TIES443/handouts/lecture09.pdf · TIES443: Introduction to DM Lecture 9: Visualization 3 Sources • Books:

10

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 19

• Give the viewer – the greatest number of ideas

– in the shortest time

– with the least ink in the smallest space.

• Tell the truth about the data!

(E.R. Tufte, (E.R. Tufte, ““The Visual Display of Quantitative InformationThe Visual Display of Quantitative Information””, 2nd , 2nd

edition)edition)

TufteTufte’’ss Principles of Graphical ExcellencePrinciples of Graphical Excellence

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 20

11--D (Univariate) DataD (Univariate) Data

• Representations

Page 11: UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL ...mpechen/courses/TIES443/handouts/lecture09.pdf · TIES443: Introduction to DM Lecture 9: Visualization 3 Sources • Books:

11

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 21

22--D (D (BivariateBivariate) Data) Data

• Scatter plot, …

price

mileage

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 22

33--D Data (projection)D Data (projection)

price

Page 12: UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL ...mpechen/courses/TIES443/handouts/lecture09.pdf · TIES443: Introduction to DM Lecture 9: Visualization 3 Sources • Books:

12

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 23

33--D image (requires 3D image (requires 3--D blue and red glasses)D blue and red glasses)

Taken by Mars Rover Spirit, Jan 2004

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 24

Visualizing in 4+ DimensionsVisualizing in 4+ Dimensions

• Scatterplots

• Parallel coordinates

• Chernoff faces

• Stick figures

• Star glyphs

• …

Page 13: UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL ...mpechen/courses/TIES443/handouts/lecture09.pdf · TIES443: Introduction to DM Lecture 9: Visualization 3 Sources • Books:

13

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 25

Multiple ViewsMultiple Views

Give each variable its own display

A B C D E

1 4 1 8 3 5

2 6 3 4 2 1

3 5 7 2 4 3

4 2 6 3 1 5

A B C D E

1

2

3

4

Problem: does not show correlations

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 26

Scatterplot MatrixScatterplot Matrix

Represent each possiblepair of variables in theirown 2-D scatterplot(car data)

Q: Useful for what?A: linear correlations

(e.g. horsepower & weight)

Q: Misses what?A: multivariate effects

For 13 dimensions (columns) :

Number of histograms = 13

Number of scatterplots = C(13,2) = 78

Page 14: UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL ...mpechen/courses/TIES443/handouts/lecture09.pdf · TIES443: Introduction to DM Lecture 9: Visualization 3 Sources • Books:

14

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 27

Parallel Coordinates Parallel Coordinates

• Encode variables along a horizontal row• Vertical line specifies values

Dataset in a Cartesian coordinates

Same dataset in parallel coordinates

Invented by

Alfred Inselberg

while at IBM, 1985

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 28

Example: Visualizing Iris DataExample: Visualizing Iris Data

sepal length

sepal width

petal length

petal width

5.1 3.5 1.4 0.2

4.9 3 1.4 0.2

... ... ... ...

5.9 3 5.1 1.8

Iris setosa

Iris versicolor

Iris virginica

Page 15: UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL ...mpechen/courses/TIES443/handouts/lecture09.pdf · TIES443: Introduction to DM Lecture 9: Visualization 3 Sources • Books:

15

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 29

Parallel Coordinates: 2 DParallel Coordinates: 2 D

Sepal

Length

5.1

Sepal

Width

3.5

sepal

length

sepal

width

petal

length

petal

width

5.1 3.5 1.4 0.2

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 30

Parallel Coordinates: 4 DParallel Coordinates: 4 D

Sepal

Length

5.1

Sepal

Width

Petal

length

Petal

Width

3.5

sepal

length

sepal

width

petal

length

petal

width

5.1 3.5 1.4 0.2

1.40.2

Page 16: UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL ...mpechen/courses/TIES443/handouts/lecture09.pdf · TIES443: Introduction to DM Lecture 9: Visualization 3 Sources • Books:

16

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 31

5.1

3.5

1.40.2

Parallel Visualization of Iris dataParallel Visualization of Iris data

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 32

Parallel CoordinatesParallel Coordinates: Correllation

Hyperdimensional Data Analysis Using Parallel Coordinates. Edward J. Wegman.

Journal of the American Statistical Association, 85(411), Sep 1990, p 664-675.

Page 17: UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL ...mpechen/courses/TIES443/handouts/lecture09.pdf · TIES443: Introduction to DM Lecture 9: Visualization 3 Sources • Books:

17

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 33

Hierarchical Parallel CoordinatesCoordinates

Hierarchical Parallel Coordinates for Visualizing Large Multivariate Data Sets. Fua,

Ward, and Rundensteiner, IEEE Visualization 99

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 34

Chernoff FacesChernoff Faces

Encode different variables’ values in characteristicsof human face

http://www.cs.uchicago.edu/~wiseman/chernoff/

http://hesketh.com/schampeo/projects/Faces/chernoff.htmlCute applets:

Page 18: UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL ...mpechen/courses/TIES443/handouts/lecture09.pdf · TIES443: Introduction to DM Lecture 9: Visualization 3 Sources • Books:

18

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 35

Chernoff faces, exampleChernoff faces, example

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 36

Stick FiguresStick Figures

• Two variables are mapped to X, Y axes

• Other variables are mapped to limb lengths and angles

• Texture patterns can show data characteristics

Page 19: UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL ...mpechen/courses/TIES443/handouts/lecture09.pdf · TIES443: Introduction to DM Lecture 9: Visualization 3 Sources • Books:

19

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 37

Stick figures, exampleStick figures, example

census data showingage, income, sex,education, etc.

Closed figures correspond to women and we can see more of them on the left.

Note also a young woman with high income

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 38

Star Glyphs from Star Glyphs from XmdvXmdv

• Two variables are mapped to X, Y axes

• Other variables are mapped to limb lengths and angles

• Texture patterns can show data characteristics

Page 20: UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL ...mpechen/courses/TIES443/handouts/lecture09.pdf · TIES443: Introduction to DM Lecture 9: Visualization 3 Sources • Books:

20

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 39

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 40

Model VisualizationModel Visualization

Page 21: UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL ...mpechen/courses/TIES443/handouts/lecture09.pdf · TIES443: Introduction to DM Lecture 9: Visualization 3 Sources • Books:

21

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 41

DM Model Visualization: WhyDM Model Visualization: Why

• Understanding the model– allow a user to discuss and explain the logic behind the model

• also with colleagues, customers, and other users

– more than just comprehension; it also involves business context • financial indicators like profit and cost

– visualization of the DM output in a meaningful way – allowing the user to interact with the visualization

• simple “what if” questions can be answered

– 3 whales: representation, interaction, and integration

• Trusting the model– good quantitative measures of "trust"– the overall goal of "visualizing trust" – understanding the

limitations of the model

• Comparing Different Models using Visualization – as Input-Output Mappings, as Algorithms, as Processes

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 42

Model Visualization: Performance EvaluationModel Visualization: Performance Evaluation

Cost-curves (Holt)

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

PC(+) − Probability Cost

No

rmal

ized

Ex

pec

ted

Co

st

Page 22: UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL ...mpechen/courses/TIES443/handouts/lecture09.pdf · TIES443: Introduction to DM Lecture 9: Visualization 3 Sources • Books:

22

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 43

Visualization of DM/KDD ProcessVisualization of DM/KDD Process

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 44

ApplicationsApplications ……

• Bioinformatics

• Visual genomics

• Social networks

• Process mining

• Time series visualization

Page 23: UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL ...mpechen/courses/TIES443/handouts/lecture09.pdf · TIES443: Introduction to DM Lecture 9: Visualization 3 Sources • Books:

23

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 45

Visualization of Clusters of Visualization of Clusters of MicroarrayMicroarray DataData

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 46

Visual GenomicsVisual Genomics

Pictures by Christoph W. Sensen

www.visualgenomics.ca/index.php?option=com_content&task=section&id=15&Itemid=143

Page 24: UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL ...mpechen/courses/TIES443/handouts/lecture09.pdf · TIES443: Introduction to DM Lecture 9: Visualization 3 Sources • Books:

24

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 47

The Result of Process MiningThe Result of Process Mining

www.processmining.orgFrom presentation by Wil van der Aalst, TU/e

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 48

Design Studies

Page 25: UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL ...mpechen/courses/TIES443/handouts/lecture09.pdf · TIES443: Introduction to DM Lecture 9: Visualization 3 Sources • Books:

25

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 49

Focus+Context

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 50

• Spiral Axis = serial attributes are encoded as line thickness

• Radii = periodic attributes

Carlis & Konstan. UIST-98Independently rediscovered by

Weber, Alexa & Müller InfoVis-01

But dates back to 1888!

Monday

Tuesday

Wednesday

Thursday

Friday

Saturday

Sunday

One year

of power

demand

data

Time Series SpiralsTime Series Spirals

(c) Eamonn Keogh, [email protected]

Page 26: UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL ...mpechen/courses/TIES443/handouts/lecture09.pdf · TIES443: Introduction to DM Lecture 9: Visualization 3 Sources • Books:

26

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 51

Power Consumption

van Wijk and van Selow, Cluster and Calender based Visualization of Time Series

Data, InfoVis99, http://www.win.tue.nl/˜vanwijk/clv.pdf

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 52

Chimpanzees Monthly Food Intake1980-1988

January

April

July

October

Time Series SpiralsTime Series Spirals

The spokes are months, and spiral guide lines are years

112 types of food Simple and intuitive, but works only with periodic data, and if the period is known(c) Eamonn Keogh, [email protected]

Page 27: UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL ...mpechen/courses/TIES443/handouts/lecture09.pdf · TIES443: Introduction to DM Lecture 9: Visualization 3 Sources • Books:

27

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 53

• Current width = strength of theme

• River width = global strength

• Color mapping (similar themes/ same color family)

• Time axis

• External events can be linked

A company’s patent activity

1988 to 1998

Havre, Hetzler, Whitney & NowellInfoVis 2000

ThemeRiverThemeRiver

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 54

oil

• Simple and intuitive• Many extensions possible• Scalability is still an issue

Castro confiscates American oil refineries

Fidel Castro’s speeches 1960-1961

dot.com stocks 1999-2002

ThemeRiverThemeRiver

(c) Eamonn Keogh, [email protected]

Page 28: UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL ...mpechen/courses/TIES443/handouts/lecture09.pdf · TIES443: Introduction to DM Lecture 9: Visualization 3 Sources • Books:

28

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 55

CommentsComments• Simple and intuitive

• Highly dynamic

exploration

• Query power may be

limited and simplistic

• Limited scalability

Hochheiser, and Shneiderman

TimeSearcherTimeSearcher

(c) Eamonn Keogh, [email protected]

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 56

10001000101001000101010100001

01010001010111011110101101001

01110100101010011101010101001

01001010101110101010010101010

11010101001011001011101111010

00111000010100001001110101000

11100001010101100101110101

01011001011110011010010000100

01010011011010111000010101011

10111110001101101101111110100

11001001000110100011110011011

01000101111000101101001101100

11010000001001100010011100000

11101001100101100001010010

VizTreeVizTree

Which set of bit strings is generated by a human and which one is generated by a computer?

the frequencies of all triplets are encoded in the thickness of branches …

“humans usually try to fake randomness by alternating patterns”

adopted from ex. by Eamonn Keogh

Page 29: UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL ...mpechen/courses/TIES443/handouts/lecture09.pdf · TIES443: Introduction to DM Lecture 9: Visualization 3 Sources • Books:

29

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 57

Zoom in

The “trick” on the

previous slide only

works for discrete data,

but time series are real

valued.

we will discuss how

we can SAX up a

time series to make

it discrete during the

next tutorial on time

series mining

Overview Details 1

Details 2

Ben Shneiderman

VizTreeVizTree

“Overview, zoom & filter, details on demand”– main principles of visualization

(c) Eamonn Keogh, [email protected]

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 58

• Purpose of visualization in DM/KDD

• Visualization of data - Many methods– 1D, 2D, 3D

– Visualization is possible in more than 3-D

• Visualization of data mining results– Model visualization, model performance visualization

• Visualization of data mining processes

• Application …– Interactive and integrative data mining and visualization

– Time-series, symbolic, networks visualization

Visualization SummaryVisualization Summary

What else did you get from this lecture?What else did you get from this lecture?

One picture may worth 1000 words!One picture may worth 1000 words!

Page 30: UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL ...mpechen/courses/TIES443/handouts/lecture09.pdf · TIES443: Introduction to DM Lecture 9: Visualization 3 Sources • Books:

30

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 59

Additional SlidesAdditional Slides

UNIVERSITY OF JYVÄSKYLÄ DEPARTMENT OF MATHEMATICAL INFORMATION TECHNOLOGY

Lecture 9: VisualizationTIES443: Introduction to DM 60

Visualization SoftwareVisualization Software

Free and Open-source

• Ggobi

• Xmdv

• Some techniques are available in WEKA and YALE as you have seen already

• Many more - see www.kdnuggets.com/software/visualization.html

Matlab has also many nice integrated tools for viz.