Analyse, Collaborate and Publish Statistics for Measuring Progress in our Society using Storytelling

Prof. Mikael Jern

NCVA – National Center for Visual Analytics, ITN, Linköping University, 60174 Norrköping, Sweden

[email protected] http://ncva.itn.liu.se

Abstract. Official statistics such as demographics, environment, health, social-economy and education from national or regional territories are a rich and important source of information for many aspects of life. Web-enabled geovisual analytics is a technique that can help illustrate comprehensive statistical data that are hard for the eye to perceive or interpret. In this paper, we introduce “storytelling” as a means for the author to 1) select spatio-temporal and multivariate statistical data, 2) explore and discern trends and patterns, 3) orchestrate and describe metadata, 4) collaborate with colleagues to confirm findings and 5) finally publish the essential insight and knowledge gained, embedded as a dynamic visualization “Vislet” in a blog or web page. The author can guide the reader in the directions of both context and discovery, while the reader at the same time follows the analyst’s way of logical reasoning. We are moving away from a clear distinction between authors and readers, which affects the process through which knowledge is created and the traditional models that support editorial work. Value no longer relies solely on the content but also on the ability to access this information.

1 Introduction

We live in a data-rich world where people have become familiar with notions like GDP and sustainable development; statistics that compare countries’ economic performance often hit the news headlines. This paper reflects a challenging applied research task: to stimulate, at a global level, an exchange of best practices through geovisual analytics reasoning. Tools are introduced to help gather and share national and local initiatives aimed at measuring economic, social and environmental developments and to engage policy makers, statisticians and the public in collaborative activities. The global dimension of such a task calls for building a repository of progress indicators, where experts and public users can use geovisual analytics tools to compare situations for countries, regions or local communities. At the same time, people want to see statistics that describe the places where they live and capture the quality of their own lives, taking into account a broader perspective beyond the economic one. The geographical level to which statistics refer is, therefore, increasingly important. Understanding the variety in regional economic structures and performance [12] is essential knowledge for initiating development that could improve regional competitiveness and in turn increase national growth. The results from our research [6] make these variations more visible, providing region-by-region indicators in the form of motion graphs and maps that could lead to better identification of areas that are outperforming or lagging behind. Patterns of growth and the persistence of inequalities are analyzed over time, highlighting the factors responsible for them. How can such significant knowledge about these statistical facts be shared with and published to analysts and citizens?

The paper introduces tools [3] for an integrated statistics analysis, collaboration and publication process that facilitates storytelling and is aimed at producing statistical news content in support of an automatic authoring process. The author should simply press a button to publish the knowledge gained from a visual interactive discovery process. We present our latest research, which focuses on the most ancient of social rituals, “storytelling”: telling a story about a region’s development over time and shaping the measure of economic growth and well-being. Such discoveries more engagingly draw us into reflections about how life is lived, and can be improved, from region to region. In addition, they let the reader dynamically participate in this process and help advance research critical to the dissemination of official statistics by means of web-enabled tools. The result is a platform for disseminating embedded dynamic statistics data visualization joined together with the analytics sense-making metadata (the story), publishable in any web page such as a blog or wiki. Publishing official statistics through assisted content creation with emphasis on visualization and metadata represents a key advantage of our storytelling and has the potential to change the terms and structures for learning.

Geovisual analytics tools [1, 2] address the challenge of advancing research critical to the visualization of statistics data by facilitating seamless integration of visual exploration, collaboration and dissemination. The global dimension of such a task calls for building a repository of progress indicators, where experts and public users can use geovisual analytics tools to compare situations for countries, regions or local communities. A storytelling mechanism enables the transition of tedious statistics data into heterogeneous, open and communicative sense-making news entities with integrated contextual metadata that emphasize content creation aspects and where dynamic embedded temporal visualization could engage the user.

We build upon previous research [8] and our web-enabled Statistics eXplorer [4] platform, which is emerging as a de facto standard in the statistics community for exploring and communicating statistics data. A novel storytelling mechanism is introduced for the author to: 1) import regional statistical data; 2) explore and make discoveries through trends and patterns and derive insight, where the gained knowledge is the foundation for 3) creating a story that can be 4) shared with colleagues to reach consensus and trust. Visual discoveries are captured into snapshots together with descriptive metadata and hyperlinks in relation to the analytics reasoning. The author gets feedback from colleagues, adapts the story and 5) finally publishes “tell-a-story” to the community using a “Vislet” that is embedded in blogs or Web pages.

2 Related Work

Volumes of official national and sub-national statistical data are today generated by statistics offices all over the world and stored in public databases such as the World Bank or OECD Stat, but they are not used as effectively as one would wish. Little focus has been given to making sophisticated web-enabled geovisual analytics technologies accessible to statisticians and to advancing research on collaborative dissemination to the public.

The importance of a capacity to snapshot explorative sessions and then reuse them for presentation and evaluation within the same environment was demonstrated early by MacEachren [11] and Jern [6, 8] in geovisualization, which incorporated features to capture and reuse interactions and integrate them into electronic documents. Another effort was made by the Visual Inquiry Toolkit [5], which allows users to place pertinent clusters into a “pattern-basket” to be reused in the visualization process. Robinson [14] describes a method called “Re-Visualization” and a related tool, ReVise, that captures and re-uses analysis sessions. Keel [10] describes a visual analytics system of computational agents that support the exchange of task-relevant information and incremental discoveries of relationships and knowledge among team members, commonly referred to as sense-making. Wohlfart [15] describes a storytelling approach combined with interactive volume visualization and an annotated animation.

Many capture and reuse approaches are limited to use within the same application environment, which may well require a software license, and are not always easily accessible to team members without installing external software [9]. Increased computer security practice for statisticians could also limit this possibility.

Research has so far focused on tools that explore data [1], while methods that communicate gained knowledge with clarity, precision and efficiency have not received the same attention. Geovisual analytics tools should share discoveries with colleagues and efficiently communicate relevant knowledge to the public, with the goal of advancing research critical to educational communication and publishing.

3 System Implementation

The conceptual data model for our Statistics eXplorer platform can be seen as a data cube with three dimensions: space, time and indicators. The spatial dimension is represented by the regions, and the indicators are various demographic measurements (GDP growth, elderly dependency rate, etc.). Time is the data acquisition period. The general method for finding a value in the cube is by its position (space, time, indicator), and fast access time is essential for motion graphs. Space-time-indicator awareness means that the data cube can be analysed and visualized across all three dimensions simultaneously. Statistics eXplorer performs this task by integrating and time-linking all its motion graphs (figure 1): choropleth map, parallel coordinates, scatter plot, table lens, data grid, pie glyphs, time graph, etc.
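As an illustration only, the positional lookup in a space-time-indicator cube can be sketched in Python as follows. This is a minimal sketch, not the GAV Flash implementation (which is written in ActionScript): the class name `DataCube`, its methods, the flat-list storage layout and the region and indicator labels are all assumptions made for this example.

```python
from typing import Dict, List


class DataCube:
    """Toy space-time-indicator cube (hypothetical; not the GAV Flash data model)."""

    def __init__(self, regions: List[str], times: List[int], indicators: List[str]):
        # Map each label to its index along the corresponding dimension.
        self.regions = {name: i for i, name in enumerate(regions)}
        self.times = {t: i for i, t in enumerate(times)}
        self.indicators = {name: i for i, name in enumerate(indicators)}
        size = len(regions) * len(times) * len(indicators)
        # One flat list gives constant-time access by position.
        self.values: List[float] = [float("nan")] * size

    def _offset(self, region: str, time: int, indicator: str) -> int:
        r = self.regions[region]
        t = self.times[time]
        k = self.indicators[indicator]
        return (r * len(self.times) + t) * len(self.indicators) + k

    def set(self, region: str, time: int, indicator: str, value: float) -> None:
        self.values[self._offset(region, time, indicator)] = value

    def get(self, region: str, time: int, indicator: str) -> float:
        return self.values[self._offset(region, time, indicator)]

    def time_slice(self, time: int, indicator: str) -> Dict[str, float]:
        """All regions for one indicator at one time step (one animation frame)."""
        return {name: self.get(name, time, indicator) for name in self.regions}


# Usage with made-up labels and values.
cube = DataCube(["A", "B"], [2007, 2008], ["gdp_growth", "elderly_dependency"])
cube.set("A", 2008, "gdp_growth", 1.5)
cube.set("B", 2008, "gdp_growth", 2.5)
frame = cube.time_slice(2008, "gdp_growth")
```

A view such as the choropleth map would request one `time_slice` per time step, which is why constant-time access by position matters for motion graphs.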

Statistics eXplorer is customized from our GAV Flash class library [16], which is programmed in Adobe’s object-oriented language ActionScript and includes a collection of common geo- and information visualization representations. Statistical data are effectively analysed through the use of time-linked views controlled by a time slider. Complex patterns can be detected through a number of different visual representations simultaneously, each of which is best suited to highlight different statistical patterns and can help stimulate the analytical visual thinking process so characteristic of geovisual analytics reasoning. All graphs are time-linked, which is important for the synthesis of animation within explorative statistical data analysis.
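One common way to realise time-linked views is an observer pattern in which a time slider notifies every registered view when the time step changes. The Python sketch below illustrates that idea only; the names `TimeSlider`, `View`, `link` and `on_time_changed` are hypothetical and do not describe the actual GAV Flash classes.

```python
from typing import Callable, List, Optional


class TimeSlider:
    """Broadcasts the current time step to every linked view (hypothetical sketch)."""

    def __init__(self, time_steps: List[int]):
        self.time_steps = time_steps
        self.index = 0
        self._listeners: List[Callable[[int], None]] = []

    def link(self, listener: Callable[[int], None]) -> None:
        self._listeners.append(listener)

    def set_index(self, index: int) -> None:
        if not 0 <= index < len(self.time_steps):
            raise IndexError("time step index out of range")
        self.index = index
        step = self.time_steps[index]
        # Every linked view is updated for the same time step.
        for listener in self._listeners:
            listener(step)

    def play(self) -> None:
        """Step through all time steps in order, as an animation would."""
        for i in range(len(self.time_steps)):
            self.set_index(i)


class View:
    """Stand-in for a map, scatter plot or other visual representation."""

    def __init__(self, name: str):
        self.name = name
        self.current_step: Optional[int] = None
        self.redraws = 0

    def on_time_changed(self, step: int) -> None:
        self.current_step = step
        self.redraws += 1  # a real view would redraw itself here


slider = TimeSlider([1990, 1991, 1992])
map_view, scatter_view = View("map"), View("scatter plot")
slider.link(map_view.on_time_changed)
slider.link(scatter_view.on_time_changed)
slider.set_index(2)
```

Because the views never talk to each other directly, a new representation can be added to the layout simply by linking it to the slider.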

Interactive features that support the analytical reasoning process include tooltips, brushing, highlight, visual inquiry and conditioned statistics filter mechanisms that can discover outliers and simultaneously update all views. Of particular interest are the common information visualization methods table lens and parallel coordinates, which are to a great extent unknown to the statistics community. They are extended with special features that are important to statistics exploration, for example comparing the profiles of selected regions, motion to see these profiles change over time, frequency histograms and filter operations based on percentile statistics. The Flash-based enhanced parallel coordinates plot and table lens have gradually proven to be not only functional but also productive in visualizing patterns for multivariate (6-12) statistical indicators [9].

Collaboration is achieved through a mechanism in GAV Flash (figure 2) that supports the storage of interactive events in an analytical reasoning process through “memorized interactive visualization views” or “snapshots”. These can be captured at any time during an explorative data analysis process, and capturing them becomes an important task of the storytelling authoring and analytical reasoning process.

3.1 Snapshot

When exploring and making sense of comprehensive statistics data, we need a coherent cognitive workspace to hang our discoveries on for organizing and navigating our thoughts. The GAV Flash toolkit includes such means by capturing, saving and packaging the results of a Statistics eXplorer “gain insight” process in a series of “snapshots” that could help the analyst to highlight views of particular interest and subsequently guide other analysts to follow important discoveries. The snapshot tool creates a single or a continuous series (story) of visualization captures during the exploration process. In a typical scenario the analyst has selected relevant attributes, a time step (temporal data), regions-of-interest, colour class values and filter conditions for selected attributes, and finally highlights the “discovery” from a certain angle (viewing properties).

The analyst requests a snapshot with the Capture function, which results in a snapshot class operation scanning through all its connected GAV Flash components for properties to be captured. Each of these properties is then parsed into XML and written to a file that also contains details on which data (attributes and GIS regions) were used and a unique name for each component. When a snapshot is activated, the saved state of the snapshot class is read from the XML file and its nodes are parsed back into component properties. The previously marked properties are then applied to set the state of the application.
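The capture and restore cycle described above can be sketched with Python's standard `xml.etree.ElementTree` module. This is a sketch under stated assumptions: the element and attribute names (`snapshot`, `component`, `property`, `data`) and the `Component` class are invented for illustration, and the actual GAV Flash XML schema is not given in this paper.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


class Component:
    """Stand-in for a visual component with a unique name and capturable properties."""

    def __init__(self, name: str, **properties: str):
        self.name = name
        self.properties: Dict[str, str] = dict(properties)


def capture(components: List[Component], data_source: str) -> str:
    """Scan all components and serialize their properties into one XML string."""
    root = ET.Element("snapshot", {"data": data_source})  # hypothetical schema
    for comp in components:
        node = ET.SubElement(root, "component", {"name": comp.name})
        for key, value in comp.properties.items():
            prop = ET.SubElement(node, "property", {"name": key})
            prop.text = value
    return ET.tostring(root, encoding="unicode")


def restore(xml_text: str, components: List[Component]) -> None:
    """Parse the XML nodes back into component properties, matched by unique name."""
    by_name = {comp.name: comp for comp in components}
    root = ET.fromstring(xml_text)
    for node in root.findall("component"):
        comp = by_name.get(node.get("name"))
        if comp is None:
            continue  # the snapshot mentions a component that is not in this layout
        for prop in node.findall("property"):
            comp.properties[prop.get("name")] = prop.text or ""


# Usage with made-up component names and property values.
map_view = Component("map", indicator="fertility", time_step="1990")
scatter = Component("scatter", x="fertility", y="age_0_14")
saved = capture([map_view, scatter], "world.xml")
map_view.properties["time_step"] = "2008"  # the analyst keeps exploring
restore(saved, [map_view, scatter])         # activating the snapshot resets the state
```

Matching components by unique name is what allows a snapshot to be applied to a freshly loaded application whose components start in their default state.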

3.2 Storytelling

Storytelling, in our context, is about telling a story on the subject of statistics data and the related analytics reasoning about how the knowledge was gained. Storytelling within this participative web context could more engagingly draw the user into exciting reflections and sometimes change a perspective altogether. The story is placed in the hands of those who need it, e.g. policy and decision makers and teachers, but also informed citizens. Dynamic visual storytelling is a way of telling stories through interactive web-enabled visualization. Our proposed novel storytelling technology could advance research critical to collaboration and dissemination of digital media and enable a leap in understanding by the audience so as to grasp how statistical indicators may influence our society.

Fig. 1. Statistics eXplorer and Vislets are developed from GAV Flash components customized

and optimized to sustain real-time coordinated time-linked views [17] that are simultaneously updated with changing regional statistics data for every new time step. The user can stop the time animation and start interacting with the statistics data at any time step.

Statisticians with diverse backgrounds and expertise participate in a creative discovery process that transforms statistical data into knowledge. Storytelling tools integrate this geovisual analytics process with collaborative means that streamline a knowledge exchange process of developing a shared understanding with other statisticians; after consensus has been reached, the story can be published. The snapshot mechanism helps the author of a story to highlight data views of particular interest and subsequently guide others to important visual discoveries.

Fig. 2. The storytelling mechanism allows a sharable story to be created and saved in XML. Readers can then import the story and follow the analyst’s way of reasoning through descriptive text and hyperlinks that instantiate snapshots in the visual representation.

The author creates a single or a discrete series of captures during the explorative process by selecting relevant indicators, regions-of-interest, colour schema, filter conditions focusing on the data-of-interest or a time step for temporal statistics.

Hypertext, meaning "more than just text", provides a richer functionality than simple metatext by allowing the reader to click on key words and learn about topics in the story. A story hyperlink is here a reference in the story metatext that links to an external URL or a captured snapshot. To insert a hyperlink in the metatext, the author selects the text; a “Link” button then becomes visible and two options appear: a) new capture (snapshot) or b) link to an external URL.

Before the actual capture is done, the user navigates, for example, the map view to a particular country, selects an indicator, selects indicators for the scatter plot and selects a time step. A new view such as parallel axes can also be added to the story. A “Capture” is made and all preferred states are saved. When the story is later read, hyperlinks can be activated and eXplorer will display the state of the corresponding snapshot.
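The relation between metatext, hyperlinks and snapshots can be made concrete with a small data structure. The Python sketch below is an assumption-laden illustration: `Story`, `Hyperlink` and their methods are hypothetical names, and a snapshot is reduced to a plain dictionary of saved states rather than the XML used by eXplorer.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Hyperlink:
    """A span of metatext that targets either a snapshot or an external URL."""
    text: str
    snapshot_id: Optional[str] = None
    url: Optional[str] = None

    def __post_init__(self) -> None:
        # Exactly one of the two targets must be given.
        if (self.snapshot_id is None) == (self.url is None):
            raise ValueError("a hyperlink targets exactly one of snapshot or URL")


@dataclass
class Story:
    metatext: str
    snapshots: Dict[str, Dict[str, str]] = field(default_factory=dict)
    links: List[Hyperlink] = field(default_factory=list)

    def _check_text(self, link_text: str) -> None:
        if link_text not in self.metatext:
            raise ValueError("link text must be part of the story metatext")

    def capture(self, snapshot_id: str, state: Dict[str, str], link_text: str) -> None:
        """Option a): save the current view states and link the selected text to them."""
        self._check_text(link_text)
        self.snapshots[snapshot_id] = dict(state)
        self.links.append(Hyperlink(link_text, snapshot_id=snapshot_id))

    def link_url(self, url: str, link_text: str) -> None:
        """Option b): link the selected text to an external URL."""
        self._check_text(link_text)
        self.links.append(Hyperlink(link_text, url=url))

    def follow(self, link_text: str) -> Optional[Dict[str, str]]:
        """Return the saved state for a snapshot link, or None for an external link."""
        for link in self.links:
            if link.text == link_text:
                if link.snapshot_id is not None:
                    return self.snapshots[link.snapshot_id]
                return None
        raise KeyError(link_text)


# Usage with made-up metatext and states.
story = Story("Fertility fell sharply while the young population shrank.")
story.capture("s1", {"view": "map", "time_step": "1960"}, "Fertility fell sharply")
story.link_url("http://example.org/background", "young population")
state = story.follow("Fertility fell sharply")
```

Copying the state at capture time means that later exploration by the author cannot silently change what a reader sees when following the link.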

Hyperlinks that instantiate a Statistics eXplorer state are a central feature of our

storytelling mechanism together with associated descriptive text that could guide the

reader in the analyst’s way of thinking. While it’s true that a picture is often worth a

thousand words, sometimes a few words and a snapshot provide the difference

between a pretty picture and understanding. This focus on publishing through assisted content creation with emphasis on visualization and metadata represents a novel

advantage of our storytelling.

A Statistics eXplorer story can also have several chapters where data source,

contents relating to indicators and visual layout can change. For example, in “ageing

population for Europe during 1990-2008” (figure 3), chapter 1 includes a map linked

to a scatter plot while chapter 2 has the same map but here linked to both scatter plot

and a parallel axes (parallel coordinates, profile plot) to simultaneously analyse three

selected regions from different visual perspectives.

3.3 Publisher and Vislets

A Vislet is a standalone Flash application (widget) assembled from low-level GAV

Flash components in a class library and Adobe Flex GUI tools and is represented by,

for example, a single map view or a composite time-linked map and scatter plot view

(figure 3). A Vislet facilitates the transition of selected tedious statistics data into

heterogeneous and communicative sense-making news entities with integrated

metadata and dynamic embedded animated visualization that could engage the user.

Publisher (figure 4) is the server tool that imports a story and generates the HTML

code that represents the Vislet and metadata.

First, the user selects an appropriate visual representation for the Vislet, e.g. map, scatter plot, parallel axes, table lens or time graph. Then the size of the Vislet window with metadata is set and Publisher generates the HTML code. This code is manually copied and pasted into a web page. The Vislet can now be opened in the reader’s Web browser and dynamically communicate the story.
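The exact HTML that Publisher emits is not listed in this paper. As a rough sketch of what such a generator might look like, the Python function below builds a conventional Flash `object` embed from the chosen view and window size. The function name, the `story` and `view` FlashVars keys and the example URLs are all assumptions made for illustration.

```python
from html import escape
from urllib.parse import urlencode

# View names taken from the text; treating them as the complete set is an assumption.
ALLOWED_VIEWS = {"map", "scatter plot", "parallel axes", "table lens", "time graph"}


def vislet_embed_html(swf_url: str, story_url: str, view: str,
                      width: int, height: int) -> str:
    """Return an HTML snippet that embeds a Vislet (hypothetical parameter names)."""
    if view not in ALLOWED_VIEWS:
        raise ValueError(f"unknown view: {view}")
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # FlashVars is a URL-encoded query string handed to the Flash application.
    flash_vars = urlencode({"story": story_url, "view": view})
    swf = escape(swf_url, quote=True)
    return (
        f'<object type="application/x-shockwave-flash" data="{swf}" '
        f'width="{width}" height="{height}">'
        f'<param name="movie" value="{swf}"/>'
        f'<param name="FlashVars" value="{escape(flash_vars, quote=True)}"/>'
        "</object>"
    )


# Usage with placeholder URLs.
snippet = vislet_embed_html(
    "http://example.org/vislet.swf", "http://example.org/story.xml", "map", 600, 400
)
```

Escaping every value before it is placed in an attribute keeps the generated snippet well-formed even when a story URL contains characters such as `&` or quotes.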

A Publisher server maintains the Vislet Flash (SWF) files together with a story repository, statistical data and regional shape maps. The Vislets run locally in the client’s Flash Player and can thus achieve dynamic interactive performance.

Interactive features in a Vislet are available in all visualizations, including tooltips, brushing, highlight, filters that can discover outliers and dynamic multiple-linked views. Several specialist colour legend tasks are supported, e.g. showing outliers based on the 5th and 95th percentiles in certain colours, or dynamic sliders that control class values (figure 3).
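A percentile-based outlier legend can be sketched as follows. The paper does not specify how the percentiles are computed, so linear interpolation between closest ranks is assumed here; the function names and class labels are likewise hypothetical.

```python
import math
from typing import Dict, List


def percentile(sorted_values: List[float], p: float) -> float:
    """Percentile p (0 to 100) of a sorted, non-empty list, by linear interpolation."""
    if not sorted_values:
        raise ValueError("need at least one value")
    rank = (len(sorted_values) - 1) * p / 100.0
    lower, upper = math.floor(rank), math.ceil(rank)
    if lower == upper:
        return sorted_values[lower]
    weight = rank - lower
    return sorted_values[lower] * (1 - weight) + sorted_values[upper] * weight


def classify_outliers(values: Dict[str, float],
                      low_p: float = 5.0, high_p: float = 95.0) -> Dict[str, str]:
    """Label each region as a low outlier, a high outlier or normal."""
    ordered = sorted(values.values())
    low = percentile(ordered, low_p)
    high = percentile(ordered, high_p)
    classes: Dict[str, str] = {}
    for region, value in values.items():
        if value < low:
            classes[region] = "low outlier"
        elif value > high:
            classes[region] = "high outlier"
        else:
            classes[region] = "normal"
    return classes


# Usage with synthetic values 1..21 for regions r1..r21.
synthetic = {f"r{i}": float(i) for i in range(1, 22)}
classes = classify_outliers(synthetic)
```

A colour legend would then map the two outlier classes to dedicated colours and leave the remaining regions to the ordinary class colours.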

4 Case Study

Our case study is based on a World eXplorer “explore, discover trends and gain insight” scenario and is used here to explain the process: creating a story with snapshots and metatext, saving the story, and finally using Publisher to load the story and generate the HTML code that is placed in a web site to create a Vislet:

I. Create new or select existing Story in World eXplorer

World eXplorer is used to 1) select spatio-temporal and multivariate statistical data; 2) explore and discern trends and patterns; 3) orchestrate and create snapshots in the Story Editor for important discoveries and associate these snapshots with hyperlinks in the metatext; 4) save the story in XML format; 5) collaborate with colleagues to confirm the findings.

II. Use Publisher to customize your Vislet

Set Vislet properties in the Publisher (figure 4) panel e.g. height, width, graphics

layout for visualization and metadata, background and text colour.

III. Choose visual representations and copy HTML code

Select visual representations (map, scatter plot, table lens, parallel coordinates or time graph) in the drop-down menus. The HTML code is now automatically generated by Publisher and copied when you press “Copy”. This HTML code, when pasted into a web page or blog, will load your Vislet with associated metadata and hyperlinks using the settings you have chosen (figure 4).

IV. Paste the HTML-code into your web page

Paste the HTML code either into a web page or directly into a CMS or blog. The

resulting Vislet can be evaluated at: http://www.ncomva.com/?page_id=307.

Fig. 3. Fertility Rates vs. Population age 0-14, 1960-2008. A Vislet with metadata is embedded in a web page using three time-linked views: map, scatter plot and time graph. Only data relevant for the story is included. The user can select countries and see snapshots.

Fig. 4. Statistics Publisher imports stories and produces HTML code representing Vislets

7 Conclusions and future development

The technique introduced in this paper allows the analyst (author) to communicate with interested readers through visual discoveries captured into snapshots together with descriptive text. Selected indicators and visual representations can be published together with their metadata, thus facilitating the comprehension of statistical information by non-expert readers. We believe that this advanced storytelling technology can be very useful for media, as some examples of using Statistics eXplorer to tell a story have already shown. At the same time, the Vislet technique applied to, in this case, World eXplorer can help develop agile on-line publications that draw attention to recent trends and inequalities. Reviews from our partners (OECD, Sweden and Denmark Statistics, Eurostat, Italy Statistics, Goteborg City) who have evaluated the platform and tool highlight the following features:

eXplorer can easily be customized by a statistics organisation, requiring only regional boundaries (shape file) and associated indicator data;

eXplorer is a comprehensive tool for advanced users, while the Vislet approach is regarded as painless and more attractive to the public;

Encourages collaboration between statistics analysts and users of statistics;

Possibility to capture, save and open discoveries (snapshots) with attached analytics reasoning metadata, e.g. storytelling;

IT expertise is not required to publish interactive visualization embedded in blogs or web pages;

Possible strategic tool for news media to publish statistics news on the web;

Easy import of external statistical data into eXplorer;

Ability to have dynamic time-linked views and see the multi-dimensionality of regional development;

Increased expectations in terms of user experience;

Will encourage more educational use of official statistics.

References

1. Andrienko G., Andrienko N.: Visual Exploration of Spatial Distribution of Temporal Behaviors, In Proceedings of IEEE IV2005.

2. Geovisual Analytics: http://geoanalytics.net/GeoVisualAnalytics08/.

3. Geovisual Analytics tools: http://ncva.itn.liu.se/explorer/tools

4. OECD eXplorer. 2010 http://www.oecd.org/gov/regional/statisticsindicators/explorer

5. Guo D., Chen J., MacEachren A.M., Liao K.: A visualization system for space-time and multivariate patterns, IEEE Visualization and Computer Graphics, Vol 12, No 6, 2006.

6. Jern M.: Smart Documents for Web-Enabled Collaboration, Published in “Digital Content Creation”, Vince J. A. and R. A. Earnshaw (Eds) Springer Verlag, June 2001.

7. Jern M., Franzén J.: “GeoAnalytics – Exploring spatio-temporal and multivariate data”, Reviewed proceedings, IV 2006, London, published by IEEE Computer Society.

8. Jern M., Rogstadius J., Åström T., Ynnerman A.: “Visual Analytics presentation tools applied in HTML Documents”, Reviewed proceedings, IV08, London, July 2008, published by IEEE Computer Society.

9. Jern M., Thygesen L., Brezzi M.: “A web-enabled Geovisual Analytics tool applied to OECD Regional Data”, Reviewed Proceedings in Eurographics 2009, Munchen.

10. Keel P.: Collaborative Visual Analytics: Inferring from the Spatial Organisation and Collaborative use of information, VAST 2006, pp. 137-144, IEEE.

11. MacEachren A.M., Brewer I., et al.: Geovisualization to mediate collaborative work: Tools to support different-place knowledge construction and decision-making. In 20th International cartographic conference, Beijing, China. 2001.

12. OECD web site: http://www.oecd.org/GOV/regionaldevelopment

13. Roberts, J. C.: Exploratory Visualization with Multiple Linked Views, Exploring Geovisualization, J. Dykes, A.M. MacEachren, M.-J. Kraak (Editors) 2004.

14. Robinson A.: Re-Visualization: Interactive Visualization of the Progress of Visual Analysis, workshop proceedings, VASDS. 2006.

15. Wohlfart, M., Hauser, H.: Story Telling for Presentation in Volume Visualization, EuroVis2007.

16. GAV Flash class library http://ncva.itn.liu.se/tools