TrelliscopeJS
Modern Approaches to Data Exploration with Trellis Display
Ryan Hafen
Hafen Consulting, LLC
Purdue University
@hafenstats
http://bit.ly/trelliscopejs1
All examples in this talk are reproducible after installing and loading the following packages:
install.packages(c("tidyverse", "gapminder", "rbokeh", "visNetwork", "plotly"))
devtools::install_github("hafen/trelliscopejs")
library(tidyverse)
library(gapminder)
library(rbokeh)
library(visNetwork)
library(trelliscopejs)
TrelliscopeJS is an htmlwidget
TrelliscopeJS is a layout engine for collections of htmlwidgets
TrelliscopeJS is a framework for creating interactive displays of small multiples
Small Multiples
A series of similar plots, usually each based on a different slice of data, arranged in a grid
"For a wide range of problems in data presentation, small multiples are the best design solution." Edward Tufte (Envisioning Information)
This idea was formalized and popularized in S/S-PLUS and subsequently R with the trellis and lattice packages
Advantages of Small Multiple Displays
source:
Avoid overplotting
Work with big or high-dimensional data
It is often critical to the discovery of a new insight to be able to see multiple things at once
Our brains are good at perceiving simple visual features like color or shape or size, and they do it amazingly fast without any conscious effort
We can tell immediately when a part of an image is different from the rest, without really having to focus on it
In my experience, small multiples are much more effective than more flashy things like animation, linked brushing, custom interactive vis, etc.
Trelliscope: Interactive Small Multiple Display
source:
Small multiple displays are useful when visualizing data in detail
But the number of panels in a display can be potentially very large, too large to view all at once
It can also be difficult to specify a meaningful order in which panels are displayed
Trelliscope is a general solution that allows small multiple displays to come alive by providing the ability to interactively sort and filter the panels based on summary statistics (cognostics) automatically computed for each panel
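The idea behind cognostics can be sketched outside of any viewer: compute one row of summary statistics per panel, then sort or filter on those statistics. The following is a minimal illustration using dplyr and the gapminder data that appears later in the talk. The column names `mean_life_exp` and `life_exp_gain` are made up for this sketch and are not quantities that TrelliscopeJS computes itself.

```r
library(dplyr)
library(gapminder)

# One row of summary statistics ("cognostics") per would-be panel (country)
cogs <- gapminder %>%
  group_by(country, continent) %>%
  summarise(
    mean_life_exp = mean(lifeExp),
    life_exp_gain = max(lifeExp) - min(lifeExp),  # spread over the period
    .groups = "drop"
  )

# Sorting decides which panels come first in a display ...
cogs_sorted <- arrange(cogs, desc(life_exp_gain))

# ... and filtering decides which panels are shown at all
cogs_africa <- filter(cogs_sorted, continent == "Africa")

head(cogs_africa)
```

In the interactive viewer the same sort and filter operations are applied to the panels themselves rather than to a printed table.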
TrelliscopeJS
JavaScript Library: trelliscopejs-lib
  Built using React
  Pure JavaScript
  Interface agnostic
R Package: trelliscopejs
  htmlwidget interface to trelliscopejs-lib
  Evolved from CRAN "trelliscope" package (part of DeltaRho project)
Gapminder Example
Suppose we want to understand mortality over time for each country
https://www.gapminder.org/
glimpse(gapminder)
Observations: 1,704
Variables: 6
$ country   <fctr> Afghanistan, Afghanistan, Afghanistan, Afghanistan, Afgh...
$ continent <fctr> Asia, Asia, Asia, Asia, Asia, Asia, Asia, Asia, Asia, As...
$ year      <int> 1952, 1957, 1962, 1967, 1972, 1977, 1982, 1987, 1992, ...
$ lifeExp   <dbl> 28.801, 30.332, 31.997, 34.020, 36.088, 38.438, 39.854, ...
$ pop       <int> 8425333, 9240934, 10267083, 11537966, 13079460, 14880372,...
$ gdpPercap <dbl> 779.4453, 820.8530, 853.1007, 836.1971, 739.9811, 786.1134...
qplot(year, lifeExp, data = gapminder, color = country, geom = "line")
Yikes! There are a lot of countries...
qplot(year, lifeExp, data = gapminder, color = continent, group = country, geom = "line")
I can't see what's going on...
qplot(year, lifeExp, data = gapminder, color = continent, group = country, geom = "line") +
  facet_wrap(~ continent, nrow = 1)  # nrow value assumed
That helped a little...
p <- qplot(year, lifeExp, data = gapminder, color = continent, group = country, geom = "line") +
  facet_wrap(~ continent, nrow = 1)  # nrow value assumed
plotly::ggplotly(p)
This helps, but there is still too much overplotting...
(and hovering for additional info is too much work, and we can only see more info one panel at a time)
qplot(year, lifeExp, data = gapminder) +
  xlim(1948, 2011) + ylim(10, 95) + theme_bw() +
  facet_wrap(~ country + continent)
From ggplot2 Faceting to Trelliscope
Turning a ggplot2 faceted display into a Trelliscope display is as easy as changing:
facet_wrap()
or:
facet_grid()
to:
facet_trelliscope()
qplot(year, lifeExp, data = gapminder) +
  xlim(1948, 2011) + ylim(10, 95) + theme_bw() +
  facet_trelliscope(~ country + continent, nrow = 2, ncol = 7, width = 300)
qplot(year, lifeExp, data = gapminder) +
  xlim(1948, 2011) + ylim(10, 95) + theme_bw() +
  facet_trelliscope(~ country + continent,
    nrow = 2, ncol = 7, width = 300,  # layout values assumed to match the display above
    as_plotly = TRUE)
Plotting in the Tidyverse
Gapminder Example from "R for Data Science"
country_model <- function(df)
  lm(lifeExp ~ year, data = df)
by_country <- gapminder %>%
  group_by(country, continent) %>%
  nest() %>%
  mutate(
    model = map(data, country_model),
    resid_mad = map_dbl(model, function(x) mad(resid(x))))
by_country
Example adapted from "R for Data Science"
# A tibble: 142 x 5
       country continent              data    model resid_mad
        <fctr>    <fctr>            <list>   <list>     <dbl>
 1 Afghanistan      Asia <tibble [12 x 4]> <S3: lm>       ...
 2     Albania    Europe <tibble [12 x 4]> <S3: lm>       ...
 3     Algeria    Africa <tibble [12 x 4]> <S3: lm>       ...
 4      Angola    Africa <tibble [12 x 4]> <S3: lm>       ...
 5   Argentina  Americas <tibble [12 x 4]> <S3: lm>       ...
 6   Australia   Oceania <tibble [12 x 4]> <S3: lm>       ...
 7     Austria    Europe <tibble [12 x 4]> <S3: lm>       ...
 8     Bahrain      Asia <tibble [12 x 4]> <S3: lm>       ...
 9  Bangladesh      Asia <tibble [12 x 4]> <S3: lm>       ...
10     Belgium    Europe <tibble [12 x 4]> <S3: lm>       ...
# ... with 132 more rows
One row per group
Per-group data and models as "list-columns"
Excerpt from "R for Data Science"
Plotting the Fit for Each Country
Plotting the Data and Model Fit for a Group
We'll use the rbokeh package to make a plot function and apply it to the first row of our data
country_plot <- function(data, model) {
  figure(xlim = c(1948, 2011), ylim = c(10, 95), tools = NULL) %>%
    ly_points(year, lifeExp, data = data, hover = data) %>%
    ly_abline(model)
}
country_plot(by_country$data[[1]], by_country$model[[1]])
Let's Apply This Function to Every Row!
by_country <- by_country %>%
  mutate(plot = map2_plot(data, model, country_plot))
by_country
# A tibble: 142 x 6
       country continent              data    model resid_mad          plot
        <fctr>    <fctr>            <list>   <list>     <dbl>        <list>
 1 Afghanistan      Asia <tibble [12 x 4]> <S3: lm>       ... <S3: rbokeh>
 2     Albania    Europe <tibble [12 x 4]> <S3: lm>       ... <S3: rbokeh>
 3     Algeria    Africa <tibble [12 x 4]> <S3: lm>       ... <S3: rbokeh>
 4      Angola    Africa <tibble [12 x 4]> <S3: lm>       ... <S3: rbokeh>
 5   Argentina  Americas <tibble [12 x 4]> <S3: lm>       ... <S3: rbokeh>
 6   Australia   Oceania <tibble [12 x 4]> <S3: lm>       ... <S3: rbokeh>
 7     Austria    Europe <tibble [12 x 4]> <S3: lm>       ... <S3: rbokeh>
 8     Bahrain      Asia <tibble [12 x 4]> <S3: lm>       ... <S3: rbokeh>
 9  Bangladesh      Asia <tibble [12 x 4]> <S3: lm>       ... <S3: rbokeh>
10     Belgium    Europe <tibble [12 x 4]> <S3: lm>       ... <S3: rbokeh>
# ... with 132 more rows
Plots as list-columns!!!
by_country %>%
  trelliscope(name = "by_country_lm", nrow = 2, ncol = 4)  # nrow/ncol values assumed
Recap: TrelliscopeJS in the Tidyverse
Create a data frame with one row per group, typically using Tidyverse group_by() and nest() operations
Add a column of plots
  TrelliscopeJS provides purrr map functions map_plot(), map2_plot(), pmap_plot() that you can use to create these
  You can use any graphics system to create the plot objects (ggplot2, htmlwidgets, lattice)
Optionally add more columns to the data frame that will be used as cognostics: metrics with which you can interact with the panels
  All atomic columns will be automatically used as cognostics
  Map functions map_cog(), map2_cog(), pmap_cog() can be used for convenience to create columns of cognostics
Simply pass the data frame into trelliscope()
With plots as columns, TrelliscopeJS provides nearly effortless detailed, flexible, interactive visualization in the Tidyverse
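The recap can be sketched end to end. What follows is a minimal sketch rather than the talk's exact code: it draws panels with ggplot2, adds two extra atomic columns (`slope` and `r_squared`, names chosen for this illustration) that would be picked up as cognostics, and only builds the display when trelliscopejs is installed and the session is interactive. The layout values in the `trelliscope()` call are arbitrary.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(ggplot2)
library(gapminder)

# One row per group, with each group's data held in a list-column
by_country <- gapminder %>%
  group_by(country, continent) %>%
  nest() %>%
  ungroup() %>%
  mutate(
    model = map(data, ~ lm(lifeExp ~ year, data = .x)),
    # Atomic columns such as these would be used as cognostics
    slope = map_dbl(model, ~ coef(.x)[["year"]]),
    r_squared = map_dbl(model, ~ summary(.x)$r.squared)
  )

# Draws one panel from one group's data
panel_fn <- function(d) {
  ggplot(d, aes(year, lifeExp)) +
    geom_point() +
    geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
    theme_bw()
}

# Add the plot column and hand the data frame to trelliscope()
if (interactive() && requireNamespace("trelliscopejs", quietly = TRUE)) {
  by_country %>%
    mutate(panel = trelliscopejs::map_plot(data, panel_fn)) %>%
    trelliscopejs::trelliscope(name = "by_country_sketch", nrow = 2, ncol = 4)
}
```

Sorting the resulting display by `r_squared` would surface the countries where a straight line describes life expectancy poorly.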
Order the data frame to set initial ordering of display
by_country %>%
  arrange(-resid_mad) %>%
  trelliscope(name = "by_country_lm", nrow = 2, ncol = 4)  # nrow/ncol values assumed
Filter the data to only include plots you want in the display
by_country %>%
  filter(continent == "Africa") %>%
  trelliscope(name = "by_country_africa_lm", nrow = 2, ncol = 4)  # nrow/ncol values assumed
Images as Panels
pokemon <- read_csv("http://bit.ly/plot_pokemon") %>%
  mutate_at(vars(matches("_id$")), as.character) %>%
  mutate(panel = img_panel(url_image))
pokemon
Showing 1 to 10 of 801 entries
   pokemon           id species_id height weight base_experience type_1 type_2 attack
1  bulbasaur         1  1          7      69     64              grass  poison 49
2  ivysaur           2  2          10     130    142             grass  poison 62
3  venusaur          3  3          20     1000   236             grass  poison 82
4  venusaur-mega     4  3          24     1555   281             grass  poison 100
5  charmander        5  4          6      85     62              fire          52
6  charmeleon        6  5          11     190    142             fire          64
7  charizard         7  6          17     905    240             fire   flying 84
8  charizard-mega-x  8  6          17     1105   285             fire   dragon 130
9  charizard-mega-y  9  6          17     1005   285             fire   flying 104
10 squirtle          10 7          5      90     63              water         48
trelliscope(pokemon, name = "pokemon", nrow = 3, ncol = 6,  # nrow/ncol values assumed
  state = list(labels = c("pokemon", "pokedex")))
data source, blog post
htmlwidgets as Panels
Example: Network Vis with visNetwork htmlwidget
# Numeric values in this block are assumed; only the structure of the calls is from the slide
library(visNetwork)
nnodes <- 100
nnedges <- 1000
nodes <- data.frame(
  id = 1:nnodes,
  label = 1:nnodes,
  value = rep(1, nnodes))
edges <- data.frame(
  from = sample(1:nnodes, nnedges, replace = TRUE),
  to = sample(1:nnodes, nnedges, replace = TRUE)) %>%
  group_by(from, to) %>%
  summarise(value = n())
network_plot <- function(id, hide_select = TRUE) {
  style <- ifelse(hide_select, "visibility: hidden; position: absolute", "")
  visNetwork(nodes, edges) %>%
    visIgraphLayout(layout = "layout_in_circle") %>%
    visNodes(fixed = TRUE,
      scaling = list(min = 20, max = 50,
        label = list(min = 35, max = 70, drawThreshold = 1, maxVisible = 100))) %>%
    visEdges(scaling = list(min = 5, max = 30)) %>%
    visOptions(
      highlightNearest = list(enabled = TRUE, degree = 0,
        hideColor = "rgba(200,200,200,0.2)"),
      nodesIdSelection = list(selected = as.character(id), style = style))
}
network_plot(1, hide_select = FALSE)
Trelliscope display with one panel per node
We create a one-row-per-node data frame, with the number of nodes connected to and the total number of connections as cognostics, and add a plot panel column
nodedat <- edges %>%
  group_by(from) %>%
  summarise(n_nodes = n(), tot_conns = sum(value)) %>%
  rename(id = from) %>%
  arrange(-n_nodes) %>%
  mutate(panel = map_plot(id, network_plot))
nodedat
# A tibble: ... x 4
      id n_nodes tot_conns            panel
   <int>   <int>     <int>           <list>
     ...     ...       ... <S3: visNetwork>
# ... with more rows
nodedat %>%
  arrange(-n_nodes) %>%
  trelliscope(name = "connections", nrow = 2, ncol = 4)  # nrow/ncol values assumed
Larger Trelliscope Displays
instadf %>%
  arrange(-likes_count) %>%
  trelliscope(name = "posts", width = 320, height = 320, nrow = 3, ncol = 6,
    state = list(labels = c("caption", "post_link", "likes_count")))
blog post
Trelliscope Displays as Apps
If you have an app that has multiple inputs and produces a plot output, the idea is simply to enumerate all possible inputs as rows of a data frame, add the plot that corresponds to these parameters as a column, and plot it
Trelliscope displays are most useful as exploratory plots to guide the data scientist (because they can be created rapidly)
However, in many cases Trelliscope displays can be used as interactive applications for end-users, domain experts, etc., with the bonus that they are much easier to create than a custom app
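To make the "enumerate all inputs" idea concrete, here is a minimal sketch, assuming a hypothetical app with two inputs: a country and a variable to plot. The `make_plot()` helper, the `inputs` and `app_df` names, and the choice of variables are illustrative and not from the talk.

```r
library(dplyr)
library(purrr)
library(ggplot2)
library(gapminder)

# Every combination of the app's inputs becomes one row
inputs <- expand.grid(
  country = levels(gapminder$country),
  variable = c("lifeExp", "gdpPercap"),
  stringsAsFactors = FALSE
) %>%
  as_tibble()

# The "server" logic: draw the plot for one combination of inputs
make_plot <- function(country, variable) {
  d <- gapminder[gapminder$country == country, ]
  ggplot(d, aes(x = year, y = .data[[variable]])) +
    geom_point() +
    labs(title = country, y = variable) +
    theme_bw()
}

# One plot per row, stored as a list-column next to the inputs
app_df <- inputs %>%
  mutate(panel = map2(country, variable, make_plot))
```

For an actual display, the `panel` column would be built with `map2_plot()` and the data frame passed to `trelliscope()`, so that filtering on `country` and `variable` plays the role of the app's input widgets.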
library(shiny)
library(ggplot2)
library(gapminder)

# Server: redraw the life expectancy plot for the selected country
server <- function(input, output) {
  output$countryPlot <- renderPlot({
    qplot(year, lifeExp, data = subset(gapminder, country == input$country)) +
      xlim(1948, 2011) + ylim(10, 95) + theme_bw()
  })
}

choices <- sort(unique(gapminder$country))

# UI: a country selector next to the plot
ui <- fluidPage(
  titlePanel("Gapminder Life Expectancy"),
  sidebarLayout(
    sidebarPanel(
      selectInput("country", label = "Select country: ",
        choices = choices, selected = "Afghanistan")
    ),
    mainPanel(
      plotOutput("countryPlot", height = "500px")
    )
  )
)

runApp(list(ui = ui, server = server))
Scaling Trelliscope
Just because you can't look at all panels in a display doesn't mean it isn't useful or practical to make a large display. It is in fact beneficial, because you get an unprecedented level of detail in your displays, and every corner of your data can be conceptually viewed
One insight is all you need for a display to serve a purpose (provided it is quick to create)
We used the previous implementation of Trelliscope to visualize millions of subsets of terabytes of data
What is needed to scale in the Tidyverse?
sparklyr is the natural solution
But we need a few things...
sparklyr support for list-columns (nested data frames and arbitrary R objects)
sparklyr support for remote procedure calls (run arbitrary R code on the data)
Fast random access to rows of a sparklyr data frame
A TrelliscopeJS deferred panel rendering scheme (render on-the-fly rather than all panels up front)
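As a rough illustration of what deferred rendering means, the sketch below stores a function per panel instead of a finished plot and only calls it when a panel is requested. This is a conceptual base-R sketch with hypothetical names (`make_deferred_panels()`, `render_panel()`), using the built-in `iris` data as a stand-in. It does not show how TrelliscopeJS implements or plans to implement the feature.

```r
# Build one closure per group; nothing is computed until a closure is called
make_deferred_panels <- function(df, group_col, panel_fn) {
  groups <- split(df, df[[group_col]], drop = TRUE)
  lapply(groups, function(d) {
    force(d)                       # capture this group's data in the closure
    function() panel_fn(d)
  })
}

# Render a single panel on demand, by group name
render_panel <- function(panels, key) panels[[key]]()

# Count how often panels are actually rendered
n_rendered <- 0
summarise_panel <- function(d) {
  n_rendered <<- n_rendered + 1
  range(d$Sepal.Length)            # stand-in for an expensive plot
}

panels <- make_deferred_panels(iris, "Species", summarise_panel)
length(panels)   # 3 deferred panels, one per species
n_rendered       # still 0: nothing has been rendered yet

render_panel(panels, "setosa")
n_rendered       # 1: only the requested panel was rendered
```

The cost of creating the display is then proportional to the number of panels actually viewed rather than the number of panels that exist.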
What's Next
trelliscopejs
Automatic cognostics: automatically compute useful cognostics based on the context of what is being plotted (e.g. if a scatterplot has a model fit superposed, add model diagnostics cognostics)
Automatic handling of axis limits: "same", "sliced", "free" (underway; currently "same" limits need to be hard-coded)
When axes are "same", only show axes on plot margins instead of every panel (underway for ggplot2)
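The three axis-limit modes can be made concrete with a small calculation. The sketch below follows the lattice convention for these terms, which is an assumption since the talk does not define them: "same" uses one global range, "free" uses each panel's own range, and "sliced" gives each panel its own range but with a common width. The helper `panel_limits()` is hypothetical and not part of trelliscopejs.

```r
# Compute per-panel axis limits for a numeric vector `x` split by `group`
panel_limits <- function(x, group, mode = c("same", "sliced", "free")) {
  mode <- match.arg(mode)
  ranges <- lapply(split(x, group), range)    # per-panel (min, max)
  if (mode == "free") return(ranges)
  if (mode == "same") {
    global <- range(x)
    return(lapply(ranges, function(r) global))
  }
  # "sliced": centre each panel on its own midpoint, using the widest span
  span <- max(vapply(ranges, diff, numeric(1)))
  lapply(ranges, function(r) mean(r) + c(-0.5, 0.5) * span)
}

x <- c(1, 3, 10, 20, 100, 104)
g <- c("a", "a", "b", "b", "c", "c")

panel_limits(x, g, "same")$a     # 1 104
panel_limits(x, g, "free")$b     # 10 20
panel_limits(x, g, "sliced")$a   # -3 7
```

With "sliced" limits, a unit of distance means the same thing in every panel even though each panel is centred on its own data.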
trelliscopejs-lib
More visual filters for cognostics (dates, geographic, bivariate relationships, etc.)
Bookmarkable / sharable state
View multiple panels side-by-side
Support for receiving panels from other endpoints
For More Information
Twitter: @hafenstats
Blog: http://ryanhafen.com/blog
Documentation: http://hafen.github.io/trelliscopejs
GitHub: https://github.com/hafen/trelliscopejs