69
1 / 69 Data Visualization Prof. Tulasi Prasad Sariki SCSE, VIT, Chennai www.learnersdesk.weebly.com

Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

  • Upload
    others

  • View
    15

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

1 / 69

Data Visualization

Prof. Tulasi Prasad SarikiSCSE, VIT, Chennai

www.learnersdesk.weebly.com

Page 2: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

2 / 69

Outline

● What is Data Visualization?● Why do we need Data Visualization?● Goals of Data Visualization● Characteristics of effective graphical displays● Different Types of Data ● Gestalt Principles of Visual Perception

Page 3: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

3 / 69

?

Page 4: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

4 / 69

02:54

Page 5: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

5 / 69

What is Data Visualization?

● Data visualization is a general term that describes any effort to help people understand the significance of data by placing it in a visual context.

● Patterns, trends and correlations that might go undetected in text-based data can be exposed and recognized easier with data visualization software.

Page 6: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

6 / 69

Why Data Visualization?

● Did you know that 25% of your brain power is connected to visual stimulus, and 70% of our sensory receptors are in our eyes?

● No wonder we "get the picture" faster when presenting information visually

Page 7: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

7 / 69

Why Data Visualization?

● A picture is worth 1000 words.● A picture can also be worth 1000 data points.

– In 1973, the statistician Francis Anscombe demonstrated the importance of graphing data.

– The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed.

Page 8: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

8 / 69

Why Data Visualization?

● Simple Summary Statistics of Anscombe's Quartet Data Table

Page 9: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

9 / 69

Why Data Visualization?

● Graph of Anscombe's Quartet Data Table

Page 10: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

10 / 69

Goals of Data Visualization

● A primary goal of data visualization is to communicate information clearly and efficiently via statistical graphics, plots and information graphics.

● Numerical data may be encoded using dots, lines, or bars, to visually communicate a quantitative message.

Page 11: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

11 / 69

Goals of Data Visualization

● Effective visualization helps users analyze and reason about data and evidence. It makes complex data more accessible, understandable and usable.

● Users may have particular analytical tasks, such as making comparisons or understanding causality, and the design principle of the graphic follows the task.

● Tables are generally used where users will look up a specific measurement, while charts of various types are used to show patterns or relationships in the data for one or more variables

Page 12: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

12 / 69

Characteristics of effective graphical displays

● show the data● induce the viewer to think about the

substance rather than about methodology, graphic design, the technology of graphic production or something else

● avoid distorting what the data has to say● present many numbers in a small space

Page 13: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

13 / 69

Characteristics of effective graphical displays

● make large data sets coherent● encourage the eye to compare different

pieces of data● reveal the data at several levels of detail,

from a broad overview to the fine structure● serve a reasonably clear purpose:

description, exploration, tabulation or decoration

● be closely integrated with the statistical and verbal descriptions of a data set

Page 14: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

14 / 69

Characteristics of effective graphical displays

● The greatest value of a picture is when it forces us to notice what we never expected to see. - John Tukey

Page 15: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

15 / 69

Different Types of Data

Page 16: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

16 / 69

Different Types of Data

Page 17: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

17 / 69

Different Types of Data

Page 18: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

18 / 69

Quantitative Comparison

● Use sparingly● No more than six components.● Not useful when values of each component

are similar

Page 19: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

19 / 69

Quantitative Comparison

● Bar graph● Best for comparing categories.● Best Practices

– Make bars and columns wider than the space between them.

– Do not allow grid lines to pass through columns or bars.

– Use a single font type on a graph.

Page 20: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

20 / 69

Quantitative Comparison

● Stacked bar graph

Page 21: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

21 / 69

Quantitative Comparison

● Group Bar Plot or Clustered bar graph

Page 22: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

22 / 69

Quantitative Comparison

● Bubble Charts

Page 23: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

23 / 69

Quantitative Comparison

● Pictogram Chart

– For discrete data

Page 24: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

24 / 69

Different Types of Data

Page 25: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

25 / 69

Quantitative Relational

● Line Charts

– For identifying

trends.

Page 26: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

26 / 69

Quantitative Relational

● Line Charts

– For identifying

trends

Page 27: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

27 / 69

Quantitative Relational

Scatter Plots- For testing and identifying relationships, and statistical correlations

Page 28: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

28 / 69

Quantitative Relational

● Surface plots

– Topography, Density Functions that have two dependent variables

Page 29: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

29 / 69

Quantitative Relational

● Surface plots

– Topography, Density Functions that have two dependent variables

Page 30: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

30 / 69

Quantitative Relational

● Heat Map

Page 31: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

31 / 69

Quantitative Relational

● Co-occurrence matrix

/ Heat map

Page 32: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

32 / 69

Quantitative Relational and Comparison

● Area Graph

Page 33: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

33 / 69

Quantitative Relational and Comparison Stacked Area Graph

Page 34: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

34 / 69

Different Types of Data

Page 35: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

35 / 69

Distributions

Box and Whisker Plot

Page 36: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

36 / 69

Distributions

Page 37: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

37 / 69

Distributions

● Histograms

– A histogram is a plot that lets you discover, and show, the underlying frequency distribution

(shape) of a set of continuous data.

Page 38: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

38 / 69

Distributions

Page 39: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

39 / 69

Different Types of Data

Page 40: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

40 / 69

Qualitative Data: Textual Structures

Page 41: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

41 / 69

Qualitative Data: Textual Structures

● Word Tree

Page 42: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

42 / 69

Different Types of Data

Page 43: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

43 / 69

Qualitative Relational Structures● Source Directed Node Link Diagram

Page 44: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

44 / 69

Qualitative Relational Structures

● Linear Layout

Page 45: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

45 / 69

Qualitative Relational Structures

● Chord Diagram– visualises the inter-relationships between

entities. The connections between entities are used to display that they share something in common.

– The size of the arc is proportional to the importance of the flow.

Page 46: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

46 / 69

Different Types of Data

Page 47: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

47 / 69

Hierarchical Structures

● Vertical Trees

Page 48: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

48 / 69

Hierarchical Structures

● Horizontal Trees

Page 49: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

49 / 69

Hierarchical Structures

● Multi-Directional

Trees● Websites as Graphs

Page 50: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

50 / 69

Hierarchical Structures

● Radial Trees

Page 51: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

51 / 69

Hierarchical Structures● Rectangular Tree Maps

Page 52: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

52 / 69

Hierarchical Structures

Page 53: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

53 / 69

Hierarchical Structures

● Circular Tree Maps

Page 54: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

54 / 69

Different Types of Data

Page 55: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

55 / 69

Temporal Structures● Time Lines

Page 56: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

56 / 69

Temporal Structures

Page 57: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

57 / 69

Temporal Structures

Page 58: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

58 / 69

Different Types of Datas

Page 59: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

59 / 69

Spatial Structures: Maps

Page 60: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

60 / 69

Spatial Structures: Maps

Page 61: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

61 / 69

Spatial Structures: HeatMap

Page 62: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

62 / 69

Different Types of Data

Page 63: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

63 / 69

Spatial-Temporal Structures

Charles Minard's map of Napoleon's disastrous Russian campaign of 1812. The graphic is notable for its representation in two dimensions of six types of data: the number of Napoleon's troops; distance; temperature; the latitude and longitude; direction of travel; and location relative to specific dates

Page 64: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

64 / 69

Spatial-Temporal Structures

Page 65: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

65 / 69

Visual Variables● Visual variables for quantitative data (quantities)

Time

Size

Value

Position

Page 66: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

66 / 69

Visual variables for qualitative data (category)

Colour

Orientation

Shape

Texture

Page 67: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

67 / 69

Class Activity

● Find a visualization online.● Answer the following questions:

– In one or two sentences, what story does it tell? Identify the data.

– What type of data is it?– How many dimensions are being visually

mapped? Identify the visual variables used.– Identify the type of visualization, or

methods used.– If it is interactive, describe the interaction,

and the data revealed.

Page 68: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

68 / 69

Page 69: Data Visualization - Tulasi Prasad Sariki · – The Anscombe's Quartet shows how four sets of data with identical simple summary statistics can vary considerably when graphed. 8

69 / 69

Thank You