Visualisation - Module 3 - Big Data

  • Published on
    13-Apr-2017


Transcript

  • Post-academic course Big Data

    Joris Klerkx, Research Manager, PhD. joris.klerkx@cs.kuleuven.be

    Visualisation: Big Data. IVPV - Instituut voor Permanente Vorming, 28-05-2015

    1

  • Augment group - HCI research lab, Dept. of Computer Science, KU Leuven. https://augmenthuman.wordpress.com

    2

  • Erik Duval, 11/9/1965 - 12/3/2016

    3

  • Our mission

    To augment the human intellect (Engelbart, 1962)

    4

    By augmenting human intellect we mean increasing the capability of a man to approach a complex problem situation, to gain comprehension to suit his particular needs, and to derive solutions to problems.

  • Design, build and evaluate relevant tools and technologies that help users to become better in their daily life & work (Duval, 2015)

    Our mission

    5

  • What are relevant user actions?

    How can we capture signals? How can we store them?

    How can we create a meaningful feedback loop?

    Our Research

    Physiological, behavioural signals

    Sensors, (self-)trackers

    Information visualization

    Scalable infrastructure

    6

  • Application Domains

    Technology-Enhanced Learning

    Media Consumption

    Science 2.0

    (e)Health

    7

  • Slides will be posted to Slideshare & Zephyr

    8

  • http://www.hearts.com/ecolife/cut-paper-consumption-protect-forests/

    9

  • Big Data

    10

  • Big data

    11

  • Big data

    insights

    12

  • Better Human Understanding

    13

  • A mental model represents what a person thinks is true, but it isn't necessarily true

    14

  • UNDERSTANDING OF THEIR MENTAL MODELS

    15

  • Wouter Walgraeve - http://www.slideshare.net/wouterwalgraeve/mental-models-as-information-radiators

    16

  • 17

  • 18

  • ?

    19

  • "The idea that business is strictly a numbers affair has always struck me as preposterous. For one thing, I've never been particularly good at numbers, but I think I've done a reasonable job with feelings. And I'm convinced that it is feelings and feelings alone that account for the success of the Virgin brand in all of its myriad forms." -- Richard Branson

    20

  • Gut feeling

    21

  • What your gut feeling says

    What the facts say

    22

  • What your gut feeling says

    What the facts say

    Confirmation bias

    Undervalued, Overvalued, Foolish

    23

  • Big data

    insights

    data-driven insights

    24

  • 25

  • Big data

    insights

    data-driven insights

    Meaningful

    26

  • Defining visualization

    27

  • Definition

    28

    Information Visualization is the use of interactive visual representations to amplify cognition [Card et al.]

  • algorithm

    human

    29

  • Information Visualisation is the use of interactive visual representations to amplify cognition [Card et al.]

    Definition

    30

  • http://www.demorgen.be/dm/nl/5403/Internet/article/detail/1890428/2014/05/18/Twitteractiviteit-verraadt-je-politieke-profiel.dhtml

    31

  • Facilitate human interaction for exploration with and understanding of big data

    32

  • Data visualization

    Slide source: John Stasko

    Scientific visualization

    Information visualization

    33

  • Scientific visualisation

    Specifically concerned with data that has a well-defined representation in 2D or 3D space (e.g., from simulation mesh or scanner).

    Slide source: Robert Putman

    34

  • Information Visualisation

    Concerned with data that does not have a well-defined representation in 2D or 3D space (i.e., abstract data)

    35

  • Dispersion (Backstrom & Kleinberg)

    36

  • The role of visualisation

    37

  • Big data

    insights

    data-driven insights

    Meaningful

    38

  • By Longlivetheux - Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=37705247

    39

  • https://medium.com/@angelamorelli/3-powerful-lessons-i-have-learnt-as-an-information-designer-cb028940254#.mkgb0h2cc

    40

  • The Role of visualisation

    Brehmer, M.; Munzner, T., "A Multi-Level Typology of Abstract Visualization Tasks," IEEE Transactions on Visualization and Computer Graphics, vol. 19, no. 12, pp. 2376-2385, Dec. 2013

    41

  • Explore

    Data insights: a visualization (Gregor Aisch)

    42

  • http://www.visual-analytics.eu/faq

    Also: Visual Analytics

    43


  • Visualizing Big Data

    44

  • Multiple data sources with varied data types

    Diverse data

    I talk GeoJSON

    I talk custom XML

    I talk Apache logs

    45

  • millions of records

    Tall data

    46

  • http://dataclysm.org

    Example: 51 million ratings

    47


  • Example: 51 million ratings

    48

  • http://dataclysm.org

    Example: 51 million ratings

    49

  • http://dataclysm.org

    Example: 51 million ratings

    50

  • http://dataclysm.org 51

  • Cluttered displays

    Heer, J. & Kandel, S. (2012), Interactive Analysis of Big Data, XRDS, 19 (1)

    52

  • Cluttered displays

    Binned density scatterplot

    Hexagonal instead of rectangular

    Heer, J. & Kandel, S. (2012), Interactive Analysis of Big Data, XRDS, 19 (1)

    53
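
    A minimal sketch of the binning idea behind a density scatterplot: instead of drawing every point, count the points that fall into each cell and colour the cell by that count. For brevity this sketch uses square bins rather than the hexagonal ones the slide recommends; the sample points, `bin_points` and `bin_size` are assumptions for illustration.

```python
from collections import Counter


def bin_points(points, bin_size):
    """Count points per square bin; keys are (column, row) bin indices."""
    counts = Counter()
    for x, y in points:
        # Floor division maps each coordinate to the index of its bin.
        counts[(int(x // bin_size), int(y // bin_size))] += 1
    return counts


# Hypothetical sample data, not taken from the slides.
points = [(0.2, 0.3), (0.4, 0.1), (0.9, 0.8), (1.5, 0.2), (1.7, 0.4), (0.1, 0.9)]
density = bin_points(points, bin_size=1.0)

for cell, n in sorted(density.items()):
    print(cell, n)
```

    The display then needs one mark per bin rather than one per record, so the drawing cost depends on the number of bins and not on the number of rows.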

  • Multi-variate data with 100s to 1000s of variables

    Wide data

    54

  • http://www.perceptualedge.com/blog/?p=2046

    In this day of so-called Big Data, organizations are scrambling to implement new software and hardware to increase the amount of data that they collect and store. In so doing they are unwittingly making it harder to find the needles of useful information in the rapidly growing mounds of hay. If you don't know how to differentiate signals from noise, adding more noise only makes matters worse.

    55

  • Avoid the All-You-Can-Eat buffet! (Ben Fry)

    56

  • Visualizations might help reveal multidimensional patterns

    Use the power of the machine to find a proxy in the data that predicts the selected variables

    Depending on their specific questions, domain experts might select a subset of variables they are interested in

    57
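
    One possible reading of "find a proxy in the data that predicts the selected variables" is to rank the other variables by how strongly they correlate with the selected one. The sketch below does that with a plain Pearson correlation. The column names and values are hypothetical, and correlation is only one of many ways to search for a proxy.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long lists (assumes non-constant input)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def rank_proxies(columns, target):
    """Sort all other variables by absolute correlation with the target variable."""
    scores = {
        name: pearson(values, columns[target])
        for name, values in columns.items()
        if name != target
    }
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical wide table: one selected variable and three candidates.
columns = {
    "selected": [1, 2, 3, 4, 5],
    "a": [2, 4, 6, 8, 10],
    "b": [5, 3, 4, 1, 2],
    "c": [1, 0, 1, 0, 1],
}
ranking = rank_proxies(columns, "selected")
for name, r in ranking:
    print(name, round(r, 2))
```

    The domain expert still chooses the selected variable and judges whether a highly ranked candidate makes sense; the machine only narrows down what to look at.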

  • Example: 4 million messages/day on OKCupid

    http://dataclysm.org 58

  • Each dot at 90% transparency

    http://dataclysm.org 59
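
    Why transparency helps with overplotting can be worked out directly. Assuming standard alpha compositing of identically coloured dots, a dot at 90% transparency has opacity 0.1, and n overlapping dots combine to an opacity of 1 - (1 - 0.1)^n. The function name below is an assumption for illustration.

```python
def stacked_opacity(alpha, n):
    """Opacity after drawing n overlapping dots, each with opacity alpha."""
    return 1 - (1 - alpha) ** n


# 90% transparency corresponds to an opacity (alpha) of 0.1 per dot.
for n in (1, 5, 10, 50):
    print(n, round(stacked_opacity(0.1, n), 3))
```

    Sparse regions stay faint while dense regions approach full colour, so the darkness of an area becomes a rough proxy for the number of records in it.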

  • http://dataclysm.org 60

  • http://dataclysm.org 61

  • http://dataclysm.org 62

  • Multiple views on the data allow exploration of patterns

    63

  • The strength of visualization

    64

  • Anscombe's quartet http://en.wikipedia.org/wiki/Anscombe's_quartet

    Enables discovery of visual patterns in data sets

    Graphics reveal data (Tufte, 2001)

    65

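
    The point of the quartet can be checked numerically. The sketch below uses the standard published values of the four data sets and computes the mean of y and the correlation between x and y for each; the helper names are assumptions for illustration.

```python
from math import sqrt

X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def mean(values):
    return sum(values) / len(values)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))


# Per data set: (mean of x, mean of y, correlation between x and y).
summary = {name: (mean(x), mean(y), pearson(x, y)) for name, (x, y) in QUARTET.items()}
for name, (mx, my, r) in summary.items():
    print(name, round(mx, 2), round(my, 2), round(r, 2))
```

    The summary statistics agree almost exactly across the four sets, yet the scatterplots look completely different, which is why plotting the data is needed to tell them apart.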

  • World Population Growth

    A tremendous change occurred with the industrial revolution: whereas it had taken all of human history until around 1800 for world population to reach one billion, the second billion was achieved in only 130 years (1930), the third billion in less than 30 years (1959), the fourth billion in 15 years (1974), and the fifth billion in only 13 years (1987). During the 20th century alone, the population in the world has grown from 1.65 billion to 6 billion.

    Seeing is understanding

    66

  • Facilitates understanding

    http://www.bbc.co.uk/news/world-15391515

    67

  • Facilitates human interaction for exploration and understanding

    http://www.bbc.co.uk/news/world-15391515

    68

  • http://www.informationisbeautiful.net/visualizations/how-many-gigatons-of-co2/

    Tells stories

    69


  • T. Nagel, M. Maitan, E. Duval, A. Vande Moere, J. Klerkx, K. Kloeckl, and C. Ratti. Touching transport - a case study on visualizing metropolitan public transit on interactive tabletops. In AVI2014: 12th ACM International Working Conference on Advanced Visual Interfaces, pages 281-288, 2014.

    http://www.youtube.com/watch?v=wQpTM7ASc-w

    Facilitates human interaction for exploration and understanding

    70

  • Will there be enough food?

    Communicates insights easily

    Triggers impact

    http://www.footprintnetwork.org/en/index.php/gfn/page/earth_overshoot_day/

    71

  • http://terror.periscopic.com

    Shows patterns & triggers questions

    72


  • Interactivity allows comparison

    73

  • http://blog.stephenwolfram.com/2012/03/the-personal-analytics-of-my-life/

    Shows trends & anomalies in the data, therefore triggers questions

    74

  • Helps to find stories, see trends

    Belgium

    Brazil

    USA

    India

    75

  • Sentiment analysis in an enterprise social network (Slack)

    Shows patterns

    76

  • http://deredactie.be/cm/vrtnieuws/grafiek/interactief/1.2248561

    77

  • Reader Client

    Tracking Service

    WebSockets

    Database

    Engagement data, mouse data

    10,065 sessions were tracked

    9,674 sessions were used in the analysis

    391 sessions were removed from the analysis (noise)

    78

  • Visualizing Reader Activity

    Each square is a slide

    Each row represents a navigation pattern through the slides

    Column 1 shows the absolute number of readers

    Column 2 shows the percentage of readers

    79
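
    The table described above can be derived from raw session logs by grouping sessions that visit the slides in the same order. A minimal sketch, assuming each session is recorded as the ordered list of slide numbers visited; the sessions and the `pattern_table` helper are hypothetical, not the tracking service's actual format.

```python
from collections import Counter

# Hypothetical sessions: each one is the ordered list of slides a reader visited.
sessions = [
    [1, 2, 3, 4],
    [1, 2, 3, 4],
    [1, 2, 3, 4],
    [1, 2],
    [1, 2],
    [1, 2, 3, 4, 1],
]


def pattern_table(sessions):
    """Return rows of (navigation pattern, number of readers, percentage of readers)."""
    counts = Counter(tuple(session) for session in sessions)
    total = len(sessions)
    return [(pattern, n, 100 * n / total) for pattern, n in counts.most_common()]


rows = pattern_table(sessions)
for pattern, n, pct in rows:
    print(pattern, n, f"{pct:.1f}%")
```

    Each row corresponds to one row of the visualization: the pattern gives the sequence of squares, and the two numbers fill columns 1 and 2.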

  • 262 readers (2.7%) go through all slides completely, after which they quickly return to the first slide to look at it once more.

    Reading time per slide

    Readers spend about 75 seconds (avg) on the first slide to study what information is available.

    80

    Shows patterns

  • Sentiment analysis in an enterprise social network (Slack)

    Triggers questions & creates awareness

    Disclaimer: Should we trust NLP-algorithms? 81

  • Empowers users to make informed decisions

    Positive Badges

    Negative Badges

    82

  • Show errors in the data

    http://woutervds.github.io/InfoVisPostgraduwhat/

    83

  • Show errors in the data

    84

  • Khaled Bachour, Frederic Kaplan, Pierre Dillenbourg, "An Interactive Table for Supporting Participation Balance in Face-to-Face Collaborative Learning," IEEE Transactions on Learning Technologies, vol. 3, no. 3, pp. 203-213, July-September, 2010

    Creates awareness

    85

  • http://infosthetics.com/

    http://visualizing.org

    http://www.visualcomplexity.com/vc/

    http://visual.ly/

    http://flowingdata.com

    http://www.infovis-wiki.net

    86

  • Visualizing (big) data

    Guidelines & Facts

    88

  • How many circles?

    89

  • Humans have advanced perceptual abilities

    Our brains make us extremely good at recognizing visual patterns

    90

  • Humans have little short-term memory

    Our brain remembers relatively little of what we perceive.

    91

  • Humans have little short-term memory

    Most of us can only hold three to seven chunks of data at the same time.

    92

  • Recognition

    Identify previously learned information

    93

  • Humans have advanced perceptual abilities

    Our brains make us extremely good at recognizing visual patterns

    Humans have little short-term memory

    Our brains remember relatively little of what we perceive

    Externalize data by using interactive, visual encodings

    Promote recognition rather than recall

    94

  • https://www.youtube.com/watch?v=og7bzN0DhpI (9:51 - 11:22)

    95

  • 96

  • The centrality of human activity in the process is key

    97

  • Explore

    Data insights: a visualization (Gregor Aisch)

    98

  • "It's not a magical algorithm that finds the insight for you. You have to look at the overview, you have to decide what you zoom in to, what you filter out. And then you click to get the details." Ben Shneiderman, 2011

    99

  • http://www.bbc.com/future/bespoke/20140724-flight-risk/

    Overview first, zoom & filter, details-on-demand

    100


  • Overview first, zoom & filter, details-on-demand

    http://www.student.kuleuven.be/~r0580868/

    101


  • https://postgraduwhatblog.wordpress.com/2016/02/13/infovis-van-de-week-1-wouter/

    Overview first, zoom & filter, details-on-demand

    102


  • Visual Information Seeking Mantra

    103
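
    The three steps of the mantra (overview first, zoom & filter, details-on-demand) map naturally onto three operations over a data set. The sketch below is one way to express them; the records, field names and function names are all hypothetical, and a real tool would bind these operations to interactive views.

```python
from collections import Counter

# Hypothetical records; the field names are assumptions for illustration.
RECORDS = [
    {"group": "A", "year": 2012, "value": 1},
    {"group": "A", "year": 2013, "value": 0},
    {"group": "B", "year": 2012, "value": 3},
    {"group": "B", "year": 2013, "value": 2},
    {"group": "C", "year": 2013, "value": 0},
]


def overview(records, key):
    """Overview first: aggregate the whole data set into totals per key."""
    totals = Counter()
    for record in records:
        totals[record[key]] += record["value"]
    return totals


def zoom_and_filter(records, **criteria):
    """Zoom & filter: keep only the records matching every given field value."""
    return [r for r in records if all(r[k] == v for k, v in criteria.items())]


def details(records, index):
    """Details-on-demand: return the full record the user selected."""
    return records[index]


totals = overview(RECORDS, "group")
selected = zoom_and_filter(RECORDS, group="B")
print(totals)
print(selected)
print(details(selected, 0))
```

    The order matters: the user decides what to zoom into after seeing the overview, in line with the quote that no magical algorithm finds the insight for you.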

  • Real data is ugly and needs to be cleaned

    Pre-process your data

    http://hcil2.cs.umd.edu/trs/2011-34/2011-34.pdf

    http://www.netmagazine.com/features/seven-dirty-secrets-data-visualisation

    https://code.google.com/p/google-refine/

    http://vis.stanford.edu/wrangler/

    104

  • http://nieuws.vtm.be/verkiezingen/gemeente?province=P1&city=G73

    Always check & pre-process your data

    105

    Elections 14/10/12
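
    What checking and pre-processing might look like in practice: normalise inconsistent labels, parse numbers written in different notations, and run a sanity check before plotting. The rows below are invented for illustration and are not the election data shown on the slide; `clean_row` is a hypothetical helper.

```python
def clean_row(row):
    """Normalise a label and parse a percentage that may use a comma or a % sign."""
    # Collapse repeated whitespace and unify the case of the label.
    name = " ".join(row["party"].split()).upper()
    raw = row["share"].strip().replace("%", "").replace(",", ".")
    try:
        share = float(raw)
    except ValueError:
        share = None  # keep the row, but mark the value as missing
    return {"party": name, "share": share}


# Hypothetical raw rows with typical inconsistencies.
raw_rows = [
    {"party": "  party a", "share": "41,5%"},
    {"party": "Party  B", "share": "38.5"},
    {"party": "party c", "share": "n/a"},
]
cleaned = [clean_row(row) for row in raw_rows]

# Sanity check: shares that were parsed must not add up to more than 100%.
known_total = sum(row["share"] for row in cleaned if row["share"] is not None)
assert known_total <= 100, "shares exceed 100%, check the source data"

for row in cleaned:
    print(row)
```

    A check like the final assertion is cheap and catches the kind of error that otherwise only shows up once the chart is published.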

  • Forget about 3D graphs (on a 2D screen)

    Occlusion

    Complex to interact with

    Doesn't add anything to the data

    106

  • Source: Stephen Few

    What if we need to add a 3rd variable?

    107

  • Use small coordinated graphs to add variables

    108

    Forget about 3D graphs

    Source: Stephen Few

  • Which student has more blogposts?

    Size & angle are difficult to compare

    Without labels & legends, impossible to show exact quantitative differences

    Limited short-term (visual) memory

    109

  • Source: Stephen Few

    Save the pies for dessert (S. Few)

    Try using either of the pies to put the slices in order by size

    110

  • deredactie.be

    demorgen.be

    vtm.be

    Elections 14/10/12

    111

  • Obviously there are exceptions to the rule

    http://themetapicture.com/the-sunny-side-of-the-pyramid/

    112

  • Use Common Sense

    [Charts: three bar charts of Student 1's activity (blogposts, tweets, comments on blogs, reports submitted) on a 0 to 30 axis, with the categories in different orders]

    113

  • Use Common Sense

    What are you comparing? What story do you get from it?

    [Charts: activity of Students 1 to 4 (blogposts, tweets, comments on blogs, reports submitted), once in absolute counts on a 0 to 60 axis and once as percentages on a 0% to 100% axis]

    114
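
    The difference between the two charts is only a normalisation step, but it changes what can be compared. A minimal sketch with invented counts (the student data and `to_percentages` are assumptions for illustration): two students with very different totals end up with identical percentage profiles.

```python
# Hypothetical activity counts, not the values from the slide.
counts = {
    "Student 1": {"blogposts": 10, "tweets": 20, "comments on blogs": 5, "reports submitted": 5},
    "Student 2": {"blogposts": 2, "tweets": 4, "comments on blogs": 1, "reports submitted": 1},
}


def to_percentages(row):
    """Convert absolute counts into each category's share of the row total."""
    total = sum(row.values())
    return {category: 100 * value / total for category, value in row.items()}


percentages = {student: to_percentages(row) for student, row in counts.items()}
for student in counts:
    print(student, sum(counts[student].values()), percentages[student])
```

    The absolute chart answers "who is most active?", while the percentage chart answers "how does each student divide their effort?". Picking the wrong one tells a different story from the same data.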

  • Which graph makes it easier to focus on the pattern of change through time, instead of the individual values?

    Choose the graph that answers your questions about your data

    Source: Stephen Few

    115

  • vtm.be

    deredactie.be

    nieuwsblad.be

    Elections 14/10/12

    Communicate the correct story

    116

  • Don't use visualisations to mislead

    117

  • Don't use visualisations to mislead

    118

  • Source: Stephen Few 119

  • Source: Stephen Few 120

  • 121

  • http://fellinlovewithdata.com/research/deceptive-visualizations 122


  • http://fellinlovewithdata.com/research/deceptive-visualizations 123


  • How much better are the drinking water conditions in Willowtown as compared to Silvatown?

    http://fellinlovewithdata.com/research/deceptive-visualizations

    124

  • Storytelling with visualisation

    125

  • Visualization tasks

    Brehmer, M.; Munzner, T., "A Multi-Level Typology of Abstract Visualization Tasks," IEEE Transactions on Visualization and Computer Graphics, vol. 19, no. 12, pp. 2376-2385, Dec. 2013

    126

  • http://www.ted.com/talks/hans_rosling_shows_the_best_stats_you_ve_ever_seen.html

    127

  • Human Perception

    128

    Our brains make us extremely good at recognizing visual patterns

  • Source: Katrien Verbert 129

  • Source...