1 “Big Picture” Mixed-Initiative Visual Analytics of Big Data Michelle Zhou IBM Research, Almaden [email protected] http://blog.threestory.com/

“Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)


DESCRIPTION

Information graphics have been used for thousands of years to help illustrate ideas and communicate information. However, it requires skills and time to hand craft high-quality, customized information graphics for specific situations (e.g., data characteristics and user tasks). The problem becomes more acute when we must deal with big data. To address this problem, we are researching and developing mixed-initiative visual analytic systems that leverage both the intelligence of humans and machines to aid users in deriving insights from massive data. On the one hand, such a system automatically guides users to perform their data analytic tasks by recommending suitable visualization and discovery paths in context. On the other hand, users interactively explore, verify, and improve visual analytic results, which in turn helps the system to learn from users' behavior and improve its quality over time. In this talk, I will present key technologies that we have developed in building mixed-initiative visual analytic systems, including feature-based visualization recommendation and optimization-based approaches to dynamic data transformation for more effective visualization. I will also use concrete applications to demonstrate the use and value of mixed-initiative visual analytic systems, and discuss existing challenges and future directions in this area.


Page 1: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

“Big Picture”: Mixed-Initiative Visual Analytics of Big Data

Michelle Zhou IBM Research, Almaden

[email protected]

http://blog.threestory.com/

Page 2: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

Outline

Definitions

– Big data

– Mixed-initiative visual analytics

Challenges and Goals

Our Approaches

– Key technologies

– Use cases

Future Directions

Page 3: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

Definitions

“Big Picture”: Mixed-Initiative Visual Analytics of Big Data

– Volume: 210-million customers, 10-billion transactions, 850 TB of data…

– Velocity: 100,000 tweets, 684,478 FB shares, 204 million emails per minute (www.domo.com)

– Variety

– Veracity: rumors, incomplete data

Page 4: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

Definitions

“Big Picture”: Mixed-Initiative Visual Analytics of Big Data

– user-initiative: “Tell me more about my customers”

– system-initiative: “Here is your customer summary. I also suggest …”

(image: dougblakely.com)

Page 5: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

Key Challenges

“Tell me about … ”

– How to visually summarize large volumes of heterogeneous data to quickly discover meaningful insights

“What do they mean?”

– How to visually explain discovered insights (complex + abstract) and guide exploration

“This does not look right, I want to … do it again”

– How to allow users to correct analytic results and adopt previous analytic steps

Page 6: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

Our Goals

Combine advanced data analytics and interactive visualization to help end users

– Derive and consume insights

– Explore various analytic paths and trust derived insights

– Discover opportunities to compensate for and improve insights and analytic processes

Page 7: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

“Big Picture”: Overview

[Architecture diagram components: Input: User Actions; Expressive UI & Action Interpreter; Analytic Requests; Visual Analytics “Concierge”; Output: Interactive Visualization; Alternative Visualization; Visualization Examples; Data; Analytic Recipes; User Models; Analytic Engines]

Page 8: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

“Big Picture”: Our Focus

[Architecture diagram components: Input: User Actions; Expressive UI & Action Interpreter; Analytic Requests; Visual Analytics “Concierge”; Output: Interactive Visualization; Alternative Visualization; Visualization Examples; Data; Analytic Recipes; User Models; Analytic Engines]

Page 9: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

“Big Picture”: Our Focus

[Architecture diagram components: Analytic Requests; Visual Analytics Concierge (Visualization Recommender, Data Transformer, Insight Revision & Provenance); Output: Interactive Visualization; Alternative Visualization; Visualization Examples; Data; Analytic Recipes; User Models; Analytic Engines]

Page 10: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

“Big Picture”: Our Focus

[Architecture diagram components: Analytic Requests; Visual Analytics Concierge (Visualization Recommender, Data Transformer, Insight Revision & Provenance); Output: Interactive Visualization; Alternative Visualization; Visualization Examples; Data; Analytic Recipes; User Models; Analytic Engines]

Page 11: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

Data Transformation: Motivation

Quality of data to be visualized affects the quality of visualization

Data problems:

– “Dirty”, noisy data

– Large data variance

– “Plain” raw data

Resulting visualization problems:

– Distorted, illegible visualization

– “Messy” visualization without insights

Page 12: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

Example 1: “Dirty” Noisy Data

Task: Show houses on a map

[Figure panels: Original visualization; After separating noise]

Page 13: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

Example 2: Large Data Variance

Task: Summarize houses by styles and towns

[Figure panels: Original visualization; After normalization]

Page 14: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

Example 3: “Plain” Raw Data

Task: Correlate house price and towns under $1.5M

[Figure panels: Original visualization; After ordering towns. Axes: Price, Town (ordered)]

Page 15: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

Example 4: “Plain” Raw Data

Task: What is my emotional style?

After semantic-temporal segmentation [Pan et al. IUI 2013]

Original visualization

Page 16: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

Technical Challenges

Determine proper data transformation for different visualization situations

– Difficult to predict visualization situations involving multiple factors: data, user, and types of visualization

Certain situations require multiple data transformations

Balance multiple, potentially conflicting factors

– Quality of visualization and performance

Page 17: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

Our Approach

Optimization-based approach to automatically derive data transformations that maximize visualization quality

[Pipeline diagram components: Original Data (D); Data Retrieval; Data Transformer; Transformed Data; Visualization Generation; Visualization Recommender; Visualization Type (Vt)]

Input: Original data D, Visualization type Vt

Output: A set of transformation operators Op = {…, op[i], …}

where reward ∑ desirability(D, Vt, Op) is maximized

[Wen and Zhou IUI 2008, InfoVis 2008]

Page 18: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

Measuring Visualization Desirability

$$\sum \mathrm{Desirability}(D, V_t, Op) = \sum \mathrm{Visual\_Quality}(D, V_t, Op) - \sum \mathrm{Cost}(Op)$$

Visual quality metrics:

– visual legibility

– visual pattern recognizability

– visual fidelity

– visual continuity

Cost: time cost of data transformations

$$\chi(D, V_t) = 1 - \left(\lambda_1 \times \mathrm{complexity}(D) + \lambda_2 \times \mathrm{density}(D) / \beta(V_t)\right)$$

[Wen and Zhou IUI 2008, InfoVis 2008]
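As a worked illustration, the slide's metric can be read as χ(D, Vt) = 1 − (λ1 × complexity(D) + λ2 × density(D) / β(Vt)), and desirability as total visual quality minus total transformation cost. The sketch below assumes that reading. The slide does not define complexity, density, or β, so they are passed in as plain numbers, and the default weights are arbitrary.

```python
def chi(complexity, density, beta, lam1=0.5, lam2=0.5):
    """Quality metric under the assumed reading of the slide:
    1 - (lam1 * complexity(D) + lam2 * density(D) / beta(Vt))."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    return 1.0 - (lam1 * complexity + lam2 * density / beta)

def desirability(quality_scores, op_costs):
    """Total visual quality minus total time cost of the transformations."""
    return sum(quality_scores) - sum(op_costs)

# Example with made-up inputs: 1 - (0.5 * 0.4 + 0.5 * 0.6 / 2.0) = 0.65
example_chi = chi(complexity=0.4, density=0.6, beta=2.0)
example_total = desirability([example_chi, 0.8], [0.1, 0.05])
```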

Page 19: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

Data Transformation: What’s Next

What additional desirability metrics should we consider?

How to perform data transformation in context (incremental transformation)?

How to scale out to support exabytes of data for different user tasks and situations?

Page 20: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

“Big Picture”: Our Focus

[Architecture diagram components: Analytic Requests; Visual Analytics Concierge (Visualization Recommender, Data Transformer, Insight Revision & Provenance); Output: Interactive Visualization; Alternative Visualization; Visualization Examples; Data; Analytic Recipes; User Models; Analytic Engines]

Page 21: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

Visualization Recommendation: Motivation

Visually encoding data requires skills and time and is influenced by a number of factors

– Characteristics of data

– User tasks

– Device

– User interaction behavior

Page 22: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

Visualization Recommendation: Types

Data-driven recommendation

– Dynamically recommend suitable visualizations based on data, display, and user tasks

[Diagram: Display; Display + new data + task]

Behavior-driven recommendation

– Dynamically track user interactions and detect behavior patterns to recommend suitable visualizations in context

[Diagram: Display; Display + user behavior]

Page 23: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

Data-Driven Visualization Recommendation

Two situations

– Single display

– Multiple, consecutive displays

Multiple methods

– Rule based

– Planning based

– Machine learning based

[Figure: Visualization Recommendation] Adopted from [Roth et al., CHI 94]

[Mackinlay ’86; Roth & Mattis CHI ’94; Zhou & Feiner InfoVis 96; Zhou & Feiner CHI 98; Zhou IJCAI 99; Zhou & Chen InfoVis 02; Zhou & Chen IJCAI03; Wen & Zhou InfoVis 05]

Page 24: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

Behavior-Driven Visualization Recommendation

Observation

– Users tend to stay with an unsuitable visualization, or compensate with a large number of interactions, instead of changing the visualization

Goal

– Detect user interaction patterns and make pattern-based visualization recommendations

Page 25: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

Behavior-Driven Visualization Recommendation: Example 1

Display: Bridge problems by State by year

User interactions: click on each state to examine the structural bridge problems

Page 26: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

Behavior-Driven Visualization Recommendation: Example 1

Pattern: Scan

Visualization Recommendation: line charts for direct comparison

[Gotz and Wen IUI 2009]

Page 27: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

Behavior-Driven Visualization Recommendation: Example 2

Display: Map of the Market

User interactions: repeatedly change time windows for two industries

Time Window: 26 weeks, 52 weeks

Page 28: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

Behavior-Driven Visualization Recommendation: Example 2

Pattern: Flip

Visualization Recommendation: line chart for direct trend comparison

[Line chart: series Utility and Networking]

[Gotz and Wen IUI 2009]

Page 29: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

Behavior-Driven Visualization Recommendation: Pattern-Based Approach

[Pipeline diagram: User Interactions; Pattern Detection (Scan, Flip, Swap, …); Pattern-Task Matching (Task Features); Pattern-Data Matching (Data Features); Example Match; Visualization Recommendations]

[Gotz and Wen IUI 2009]
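The pattern-detection step can be sketched on a log of user interactions. This is a simplified guess based on the two slide examples, not the algorithm of [Gotz and Wen IUI 2009]. Here "Scan" means the same action applied to a run of different items, and "Flip" means the same action alternating between two values. The event format, the action names, and the `min_len` threshold are hypothetical.

```python
def detect_pattern(events, min_len=4):
    """Classify the most recent min_len events as 'flip', 'scan', or None.

    events: list of (action, value) tuples, oldest first.
    """
    recent = events[-min_len:]
    if len(recent) < min_len:
        return None
    if len({action for action, _ in recent}) != 1:
        return None  # mixed actions: no pattern in this sketch
    values = [value for _, value in recent]
    distinct = set(values)
    alternating = all(values[i] != values[i + 1] for i in range(len(values) - 1))
    if len(distinct) == 2 and alternating:
        return "flip"  # e.g. toggling between two time windows
    if len(distinct) == len(values):
        return "scan"  # e.g. inspecting one state after another
    return None

# Recommendations taken from the two slide examples.
RECOMMENDATION = {
    "scan": "line charts for direct comparison",
    "flip": "line chart for direct trend comparison",
}

scan_log = [("inspect", s) for s in ["CA", "NY", "TX", "WA"]]
flip_log = [("set_time_window", w)
            for w in ["26 weeks", "52 weeks", "26 weeks", "52 weeks"]]
scan_result = detect_pattern(scan_log)
flip_result = detect_pattern(flip_log)
```

A detected pattern would then be matched against task and data features before a recommendation is shown; that matching is omitted here.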

Page 30: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

Recommending Visual Interactions

Automatically annotate and suggest follow-on user interactions based on displayed visual features

[Figure panels: Original display; Annotated display]

[Kandogan VAST’2012]

Page 31: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

Recommending Visual Interactions

Grid-based approach to detect salient visual features in a display, and then annotate

Clusters

Outliers

Trends

[Kandogan VAST’2012]
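A grid-based detector for point displays might look like the following. It is a minimal sketch, not the method of [Kandogan VAST’2012]: the cell size, the density threshold, and the rule that an outlier is a lone point with no occupied neighboring cell are all assumptions. Trend detection is left out.

```python
from collections import Counter

def grid_features(points, cell=1.0, dense=3):
    """Bin 2D points into a grid and flag dense cells and isolated points.

    Returns (clusters, outliers) as sets of grid-cell coordinates.
    """
    counts = Counter((int(x // cell), int(y // cell)) for x, y in points)
    clusters = {c for c, n in counts.items() if n >= dense}
    outliers = set()
    for (cx, cy), n in counts.items():
        if n != 1:
            continue
        neighbors = [(cx + dx, cy + dy)
                     for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                     if (dx, dy) != (0, 0)]
        if not any(nb in counts for nb in neighbors):
            outliers.add((cx, cy))  # lone point with empty surroundings
    return clusters, outliers

sample_points = [(0.1, 0.1), (0.2, 0.3), (0.5, 0.9), (0.4, 0.4), (5.5, 5.5)]
clusters, outliers = grid_features(sample_points)
```

The flagged cells could then be mapped back to screen regions and annotated in the display.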

Page 32: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

Visualization Recommendation: What’s Next

Recommend a suitable heterogeneous visualization as a consecutive display

Recommend the composition of two or more existing visualizations

[Figure: composing two existing visualizations (“+ = ?”); labels: Vehicle Group, Vehicle Age, Cost]

[Wen, Zhou & Aggarwal, InfoVis05; Heer & Robertson, InfoVis07]

[Yang, Li, & Zhou 2013]

Page 33: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

Visualization Recommendation: What’s Next

“Individualized” (hyper-personalized), adaptive visualization

– By cognitive style and personality [Gardner 1983]

Big 5 Personality Model:

O: inventive/curious vs. consistent/cautious

C: efficient/organized vs. easy-going/careless

E: outgoing/energetic vs. solitary/reserved

A: friendly/compassionate vs. cold/unkind

N: sensitive/nervous vs. secure/confident

– By one’s emotional/affective states (e.g., Sadness, Optimism, Trust)

Page 34: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

“Big Picture”: Our Focus

[Architecture diagram components: Analytic Requests; Visual Analytics Concierge (Visualization Recommender, Data Transformer, Insight Revision & Provenance); Output: Interactive Visualization; Alternative Visualization; Visualization Examples; Data; Analytic Recipes; User Models; Analytic Engines]

Page 35: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

Insight Revision and Provenance

Insight revision

– Users amend derived insights to correct analytic mistakes or make personalized adjustments

Insight provenance

– Users record interactions and insight for continuation and reuse

[Figure labels: Neuroticism (high, low); Extroversion (low, high); User Actions: Zoom In2, Edit2, Query, Filter]

Page 36: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

Insight Revision

A crowd-powered approach to insight revision

– Users amend various types of text analytics mistakes

– Adopting multi-user consistent inputs

Correcting sentiment classification error

[Hu et al. INTERACT 2013]

Page 37: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

Insight Revision

A crowd-powered approach to insight revision

– Users amend various types of text analytics mistakes

– Adopting multi-user consistent inputs

Correcting summarization label error

[Hu et al. INTERACT 2013]
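"Adopting multi-user consistent inputs" can be sketched as a simple agreement rule: a machine-assigned label is replaced only when enough users independently submit the same correction. This is an illustrative assumption, not the documented policy of the system in [Hu et al. INTERACT 2013]; the function name and both thresholds are hypothetical.

```python
from collections import Counter

def consensus_label(machine_label, user_corrections, min_votes=3, min_share=0.6):
    """Adopt the most common user correction only if it is sufficiently agreed on.

    min_votes: minimum number of users proposing the same label.
    min_share: minimum fraction of all corrections that label must hold.
    """
    if not user_corrections:
        return machine_label
    label, votes = Counter(user_corrections).most_common(1)[0]
    if votes >= min_votes and votes / len(user_corrections) >= min_share:
        return label
    return machine_label  # too little agreement: keep the analytic result

adopted = consensus_label("negative", ["positive", "positive", "positive", "neutral"])
kept = consensus_label("negative", ["positive", "neutral"])
```

Requiring agreement across several users is one way to limit the effect of a single mistaken or abusive amendment.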

Page 38: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

Insight Provenance

[Gotz and Zhou InfoVis 2009]

An action-based approach to insight provenance

– “Actions” capture observable and semantically meaningful user interactions

• Three types of actions: Exploration | Insight | Meta

– “Action trails” capture the sequence of actions leading to an insight for insight provenance

[Equation: an action trail τ combines one or more exploration actions (A_Exploration) with a concluding insight action (A_Insight)]
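One way to picture action trails is to split a logged action sequence wherever an insight action occurs, so that each trail holds the exploration actions that led to that insight. The sketch below is a simplified assumption, not the model of [Gotz and Zhou InfoVis 2009]: the `Action` record, the action names, and the choice to skip meta actions are all illustrative.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    kind: str  # "exploration", "insight", or "meta"
    name: str  # illustrative names, e.g. "query", "filter", "bookmark"

def extract_trails(log):
    """Return trails: one or more exploration actions followed by an insight action."""
    trails, current = [], []
    for action in log:
        if action.kind == "exploration":
            current.append(action)
        elif action.kind == "insight":
            if current:  # a trail needs at least one exploration step
                trails.append(current + [action])
            current = []
        # meta actions are skipped in this sketch
    return trails

session = [
    Action("exploration", "query"),
    Action("exploration", "filter"),
    Action("meta", "undo"),
    Action("insight", "bookmark"),
    Action("exploration", "zoom in"),
]
session_trails = extract_trails(session)
trail_names = [[a.name for a in t] for t in session_trails]
```

Stored trails like these could later be replayed or reused, which is the continuation-and-reuse goal stated for insight provenance.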

Page 39: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

Insight Provenance [Gotz and Zhou InfoVis 2009]

Page 40: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

Insight Revision and Provenance: What’s Next

Balance crowd input and personalized adjustments

– Reconcile diverse user amendments vs. prevent potential system abuse

Detect and learn different types of logical structures from user interactions

– Automatically infer and predict user interaction patterns to better support and anticipate user tasks

Page 41: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

Summary

“Big data” is of high volume, heterogeneous, and often “dirty”

It requires both users and computers to take initiatives for effective visual analytics of big data

User: “Tell me what’s in my data”

System: “Here is the “big picture” of your data. I also suggest you look into …”

User: “Something is wrong… should be…” / “Remember what I have done so far”

System: “I incorporated your and others’ feedback. Please continue …”

Data Transformation; Visualization Recommendation; Insight Revision & Provenance

(image: dougblakely.com)

Page 42: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

Acknowledgements

IBM Research, Almaden

– Eser Kandogan, Fei Wang, Huahai Yang, Liang Gou, Ying Xuan, Eben Haber, Yunyao Li

IBM T. J. Watson

– David Gotz, Zhen Wen, Shimei Pan, Jie Lu, Min Chen*, Sheng Ma*, Peter Kissa*, Vikram Aggarwal*

IBM Research, China

– Shixia Liu*, Nan Cao, Yangqiu Song*, Weihong Qian

Summer interns

– Ying Feng (Indiana University)

– Basak Alper (UC Santa Barbara)

– Mengdie Hu (Georgia Tech)

– Jian Zhao (University of Toronto)

Page 43: “Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keynote)

References of Our Work

David Gotz and Michelle X. Zhou: Characterizing users' visual analytic activity for insight provenance. Information Visualization 8(1): 42-55, 2009.

David Gotz and Zhen Wen: Behavior-driven visualization recommendation. IUI 2009: 315-324.

Eser Kandogan: Just-in-time annotation of clusters, outliers, and trends in point-based data visualizations. IEEE VAST 2012: 73-82.

Mengdie Hu, Huahai Yang, Michelle X. Zhou, Liang Gou, Yunyao Li, and Eben Haber: OpinionBlocks: A Crowd-Powered, Self-Improving Interactive Visual Analytic System for Understanding Opinion Text. To appear in Proc. INTERACT 2013.

Zhen Wen and Michelle X. Zhou: Evaluating the Use of Data Transformation for Information Visualization. IEEE Trans. Vis. Comp. Graph. 14(6): 1309-1316, 2008.

Zhen Wen and Michelle X. Zhou: An optimization-based approach to dynamic data transformation for smart visualization. IUI 2008: 70-79

Zhen Wen, Michelle X. Zhou, and Vikram Aggarwal: An Optimization-based Approach to Dynamic Visual Context Management. INFOVIS 2005: 25-32.

Huahai Yang, Yunyao Li, and Michelle X. Zhou: A Crowd-sourced Study: Understanding Users’ Comprehension and Preferences for Composing Information Graphics. In Submission to TOCHI 2013.

Michelle X. Zhou and Min Chen: Automated Generation of Graphic Sketches by Example. IJCAI 2003: 65-74

Michelle X. Zhou, Min Chen, and Ying Feng: Building a Visual Database for Example-based Graphics Generation. INFOVIS 2002: 23-30.

Michelle X. Zhou, Sheng Ma, and Ying Feng: Applying machine learning to automated information graphics generation. IBM Systems Journal 41(3): 504-523 (2002)