Graphical Perception of Continuous Quantitative Maps: the Effects of Spatial Frequency and Colormap Design

Khairi Reda, Indiana University–Purdue University Indianapolis, Indianapolis, IN, USA, redak@iu.edu
Pratik Nalawade, Indiana University–Purdue University Indianapolis, Indianapolis, IN, USA, pnalawad@iupui.edu
Kate Ansah-Koi, Indiana University–Purdue University Indianapolis, Indianapolis, IN, USA, kayansah@iupui.edu

ABSTRACT
Continuous ‘pseudocolor’ maps visualize how a quantitative attribute varies smoothly over space. These maps are widely used by experts and lay citizens alike for communicating scientific and geographical data. A critical challenge for designers of these maps is selecting a color scheme that is both effective and aesthetically pleasing. Although there exist empirically grounded guidelines for color choice in segmented maps (e.g., choropleths), continuous maps are significantly understudied, and their color-coding guidelines are largely based on expert opinion and design heuristics; many of these guidelines have yet to be verified experimentally. We conducted a series of crowdsourced experiments to investigate how the perception of continuous maps is affected by colormap characteristics and spatial frequency (a measure of data complexity). We find that spatial frequency significantly impacts the effectiveness of color encoding, but the precise effect is task-dependent. While rainbow schemes afforded the highest accuracy in quantity estimation irrespective of spatial complexity, divergent colormaps significantly outperformed other schemes in tasks requiring the perception of high-frequency patterns. We interpret these results in relation to current practices, and devise new and more granular guidelines for color mapping in continuous maps.

ACM Classification Keywords
H.5.m. Information Interfaces and Presentation (e.g. HCI): Miscellaneous

Author Keywords
Scalar field visualization; continuous colormaps; perception

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
CHI 2018, April 21–26, 2018, Montreal, QC, Canada
© 2018 Copyright held by the owner/author(s). Publication rights licensed to ACM. ISBN 978-1-4503-5620-6/18/04...$15.00
DOI: https://doi.org/10.1145/3173574.3173846

INTRODUCTION
Continuous, ‘pseudocolor’ maps visualize how a quantitative attribute varies smoothly over space by mapping data intervals to color gradients. These maps support a range of graphical tasks, from quantity estimation (e.g., estimating air temperature at a specific location), to the comprehension of patterns and structures throughout the image. Continuous maps are common in scientific publications, especially in the physical and climate sciences. However, they are also widely used to disseminate weather information to lay citizens, particularly during inclement conditions. Naturally, the choice of colormap affects the visual appearance of the image and potentially impacts data perception. While this choice is occasionally dictated by convention, often there is no clear agreement on what colormap to use. For example, designers of weather maps employ one of several color schemes to illustrate the geographic distribution of temperatures; some choose the popular rainbow scheme while others might employ a diverging blue-to-red scale.

A large body of research has been devoted to understanding how color encoding affects people’s perception of information in discrete maps [9]. Cartographers have analyzed the effectiveness of various color schemes for choropleths [29, 7], and contributed robust guidelines and tools for designing segmented colormaps [5, 6, 15].

By contrast, the graphical perception of continuous maps for smooth spatial data remains significantly understudied. The few extant studies have produced inconclusive evidence and inconsistent colormap recommendations [8]. For example, while some studies found rainbow colormaps to be accurate for surface interpretation [23, 18], others indicate rainbow to be ineffective, especially as compared to diverging color schemes [3]. Because of these inconsistencies, color encoding advice is largely based on expert opinion and designer intuition, rather than being grounded in empirical evidence. Existing guidelines (such as those discouraging the use of rainbow [4]) are often at odds with the visualization practices of the scientific community, and sometimes even contradict established results [41]. Further research is thus needed to validate current practices [22], and to establish evidence-based guidelines for color-coding in continuous spatial data.

We study the graphical perception of continuous maps, investigating the impact of Colormap design and Spatial frequency, a measure of spatial variance. We conducted three crowdsourced experiments to test the effectiveness of commonly prescribed colormaps, comparing their accuracy under different kinds of tasks and at increasing levels of spatial frequency. Results indicate that spatial frequency impacts the effectiveness of color encoding, but the precise effect is dependent on the task. We find that rainbow colormaps afford the highest accuracy
in a quantity estimation task, irrespective of spatial frequency. However, for pattern-matching tasks, we find that divergent colormaps significantly outperform other schemes when the underlying data exhibits high spatial variance. These results have significant design implications, and suggest complementary perceptual roles for hue-varying and divergent schemes. We distill these findings into new color mapping guidelines for continuous maps, accounting for task and spatial complexity of the data.

RELATED WORK
Color mapping involves the transformation of quantitative or categorical attributes into color by means of a colormap. To be of practical use, color mapping must enable the viewer to deduce quantities, distributions, and patterns present in the original data [43]. Researchers outline a number of properties believed to contribute to effective colormap design [39]. A good colormap sequence should be naturally orderable (e.g., from cool blue to warm red) so as to perceptually reflect the order of the originally mapped quantities. Additionally, a colormap should only reflect actual differences in the data, without creating artificial boundaries in color [34]. Using these broad principles, researchers developed design tools to provide colormap recommendations for designers. For example, ColorBrewer suggests a set of carefully crafted and validated palettes [15]. Similarly, Colorgorical enables users to generate categorical colormaps on demand using perceptual optimizations, while allowing for user-provided constraints [13]. The majority of these tools, however, are intended for crafting segmented colormaps, and are primarily aimed at discrete map representations (e.g., choropleths). One exception, the PRAVDAColor tool [2], provides color mapping advice for continuous maps based on the data’s spatial frequency and the intended task. However, unlike ColorBrewer, this advice is based on design heuristics that have not been verified.

Design Strategies for Continuous Colormaps
Researchers have proposed a number of handpicked and procedurally generated colormaps for continuous data. For instance, Herman and Levkowitz devised a greyscale that maximizes CIELAB differences within the gradient, finding that it reduces estimation errors by 20% compared to a linearly interpolated scale [17]. Greyscale ramps are thought to be effective at revealing shapes and forms. However, they are susceptible to large simultaneous contrast shifts, making them less useful for quantity estimation [41]. One alternative to greyscales involves varying hues instead of lightness, typically via a gradation based on the electromagnetic spectrum. The result is a vivid, fully saturated colormap that looks like a rainbow. Although popular in scientific visualizations, rainbow has been the subject of much critique in the visualization community. Experts argue that the order of hues in rainbow is not readily apparent, making it unsuitable for encoding interval data [33]. Moreover, rainbow introduces sharp visual boundaries around its yellow regions, which can be misinterpreted by viewers who might infer nonexistent features in the data [4].

Given the above limitations, researchers proposed many alternatives to rainbow. For instance, ‘Spiral’ colormaps comprise a limited hue rotation combined with monotonically increasing luminance. The result is a colormap that spirals up in the hue cone while simultaneously gaining luminance [42]. Similarly, cubehelix incorporates sinusoidal RGB variations accompanied by a monotonic buildup in luminance [14]. Kindlmann et al. propose an isoluminant version of rainbow [21], while Moreland advocates for diverging colormaps, which incorporate two opposing hues at the endpoints while passing through an unsaturated tone (typically white) [24]. These colormaps are thought to provide ‘perceptually uniform’ alternatives to rainbow by exerting control over luminance while providing a level of hue variation. Although strongly favored by visualization experts, evidence of their effectiveness remains inconclusive (e.g., see [23, 18] vs. [3]). Our study compares these different design strategies by testing a representative sample of colormaps, including rainbow, Spiral, and diverging schemes.

Empirical Evaluations of Continuous Colormaps
It is generally recognized that colormap designs should be adaptive to the intended graphical task [31]. Ware argues that colormaps should monotonically increase their luminance when the goal is to comprehend shapes and spatial features [41]. By contrast, when the goal is to estimate quantities, a colormap should be designed to reduce simultaneous contrast effects by registering non-monotonic variation in at least one of the three opponent-process channels. Ware confirms this latter hypothesis, finding spectral ramps (i.e., rainbow) to be the most accurate in quantity estimation, but finds little support for the shape perception theory. He then suggests that Spiral colormaps would be ideal for both quantity estimation and form perception [41]. A study by Borkin et al. finds diverging ramps to be significantly more accurate than rainbow when diagnosing heart disease from arterial scans [3]. However, Borkin et al.’s results contradict two earlier studies, which found spectral schemes to be more accurate in both quantity and surface interpretation [23, 18]. Such inconsistencies highlight a limitation in current literature: prior studies employed widely varying test conditions and tasks, thus complicating their comparison. Our work directly addresses this limitation. We evaluate colormaps under comparable experimental conditions and in a range of tasks, from quantity estimation to form and pattern comprehension.

Effects of Spatial Frequency
Spatial frequency is a measure of the level of variance (or the amount of information) that is present in a degree of visual angle. Maps with sharp edges and small features will generally convey more information, and thus exhibit higher spatial frequency components. Conversely, maps with broad, smooth surfaces contain less spatial variation, and thus exhibit lower spatial frequency. Spatial frequency is thought to have a critical role in visual perception; some vision theories suggest that the visual cortex operates on a code of spatial frequency, as opposed to a code of straight lines and edges [11, 12]. Moreover, spatial frequency is inversely proportional to the average size of visual features in the scene, and size is known to impact color perception [35, 36]. Given these factors, spatial frequency is likely to affect the perception of continuous maps, and possibly modulate the effectiveness of color schemes to varying degrees. Rogowitz et al. argue for two colormap design strategies depending on spatial frequency: they recommend ramps with monotonically increasing luminance for datasets containing high spatial frequency, and hue- or saturation-varying ramps for low-frequency data [32]. Put differently, we would expect sequential and spiral colormaps (e.g., cubehelix) to perform better in scalar fields that have rough surfaces and narrow features. Conversely, we can expect hue-varying ramps (e.g., rainbow) to work well with maps that have broad surfaces and low variance. This guideline is consistent with color difference experiments that tested viewers’ sensitivity to frequency-modulated Gabor patches [19]. However, it is doubtful whether such experimental results (and the ensuing guideline) generalize to visual analysis tasks on actual scalar fields.

[Figure 1 panels: scalar fields f1 to f5 with median frequencies 3, 5, 7, 9, and 11; power spectrum plots; height distribution plot ranging from low terrain to high terrain]
Figure 1: Five example scalar fields used as stimuli in this study. The fields represent the height of procedurally generated terrain (brighter is higher), and are ordered according to their median spatial frequency (cycles per 8° of visual angle‡). Log-log plots depict the power spectra of each scalar field (combined in the rightmost plot to aid comparison). The position of the median frequency, which splits the power distribution into two approximately equal halves, is illustrated with a vertical line. Although different with respect to spatial frequency characteristics, the maps are very similar in their distribution of height amplitudes (shown in the top-right plot).

Summary
Color-coding guidelines for continuous maps often come in the form of advice that discourages the use of rainbow [4, 25] and suggests perceptually uniform alternatives [24]. This clinical omnibus advice is largely based on expert intuition and design heuristics. However, the literature paints a more complex picture, and suggests the choice of color encoding should be based on both task [31, 37] and data characteristics, including spatial frequency [32, 2]. This paper provides a first experimental account of the impact of spatial frequency on people’s ability to estimate quantities and perceive patterns in quantitative maps. By studying how spatial complexity modulates the effectiveness of colormap designs, we can devise more nuanced guidelines that are responsive to both data characteristics and viewers’ information needs (i.e., tasks).

HYPOTHESES
Building on prior research, we developed three hypotheses:

H1: We expect colormaps comprising large hue variations to be perceived more accurately in scalar fields containing low spatial frequency. Conversely, we expect ramps with monotonically increasing luminance to yield higher accuracy in high-frequency data. These predictions are based on the contrast sensitivity of our visual perceptual system, which responds more robustly to chromatic and hue variation when assessing broad, smooth surfaces, and to lightness differences when resolving small features [32].

H2: In quantity estimation tasks (experiment 1), where the goal is to estimate quantities at specific locations, we expect colormaps having substantial hue variation to perform better. This conjecture assumes that hue-varying ramps will register sinusoidal variations along the chromatic opponent-process channels. Such non-monotonic variations reduce simultaneous contrast shifts, because they are less likely to systematically weigh chromatic processing in a particular direction [41].

H3: In tasks requiring the comprehension of forms and structures (experiments 2 and 3), we expect colormaps having monotonically increasing luminance to perform best. Our visual system infers surface information largely from shading cues and luminance variation [27]. Therefore, colormaps that exert linear control over their luminance can be expected to portray forms and structures more effectively.

Although there is existing evidence to back H2 (see an earlier study by Ware [41]), our work aims to replicate and extend these results to account for spatial frequency. To that end, H1 provides a broader (yet untested) prediction of how spatial frequency might impact the performance of continuous colormaps.

METHODOLOGY
In the following sections, we present the results of three crowdsourced experiments to test the above hypotheses. Specifically, we investigate whether the effectiveness of commonly prescribed colormaps is modulated by the spatial frequency of the data, and the degree to which this relationship is influenced by variations in luminance, hue, and saturation within the color ramp. Each experiment tests one specific task against nine colormaps and at increasing levels of spatial frequency. The first experiment measures participants’ ability to estimate values at specific locations in the map. The second experiment tests participants’ accuracy in comparing gradients in larger map swaths. The third and final experiment is aimed at evaluating participants’ ability to perceive and match longitudinal patterns in the map. Before delving into the details, we describe our experimental design and stimulus generation procedure.

[Figure 2 panels: greyscale, singlehue, bodyheat, cubehelix, extbodyheat, coolwarm, spectral, rainbow, and blueyellow colormaps applied to scalar fields of increasing spatial frequency; profile plots of luminance, saturation, and CIELAB a (green-red) and b (blue-yellow)]
Figure 2: We tested nine colormaps, selected to encompass a variety of design characteristics (illustrated by variation in luminance, saturation, and/or hue). Each colormap was tested with multiple scalar fields, corresponding to increasing levels of spatial frequency.

Stimuli
We employ digital elevation models (DEM) as stimuli for the three experiments. A DEM represents land elevation, with cells in the 2D scalar field representing terrain height at the corresponding locations. To maintain precise control over spatial frequency and task difficulty, we synthetically generate DEMs using Perlin noise [28], mixing five octaves of the noise function to produce seemingly realistic terrain. The resulting DEMs are then normalized so that their heights span the entire colormap range. All generated maps were 820×630 pixels in size (approximately 16°×13° of visual angle‡).
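The pixel-to-angle conversion behind these dimensions follows the viewing assumptions stated in the paper's footnote (96 DPI rendering, 30-inch viewing distance). A minimal sketch is given below; the function name `visual_angle_deg` and its parameter names are illustrative and do not come from the paper.

```python
import math


def visual_angle_deg(size_px, dpi=96.0, viewing_distance_in=30.0):
    """Estimated visual angle (in degrees) of an object `size_px` pixels wide.

    Assumes the footnote's viewing conditions: images rendered at 96 DPI
    and viewed from 30 inches. Both defaults are assumptions, not measurements.
    """
    size_in = size_px / dpi  # physical size in inches
    half_angle = math.atan((size_in / 2.0) / viewing_distance_in)
    return math.degrees(2.0 * half_angle)


if __name__ == "__main__":
    # Map width, map height, and the 410-pixel spatial frequency window.
    for pixels in (820, 630, 410):
        print(pixels, "px ->", round(visual_angle_deg(pixels), 2), "degrees")
```

Under these assumptions, the 820-pixel map width comes out at roughly 16° and the 410-pixel window at roughly 8°, in line with the approximate figures quoted in the text.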

By varying the scale of the noise function, we obtain scalar fields with different spatial frequency characteristics. The latter is measured by first computing a Fast Fourier Transform (FFT) over the scalar field, and calculating the relative contribution of each carrier frequency from the magnitude of its FFT vector. The result can be illustrated with a power spectrum plot for each individual scalar field, with spatial frequency on the x-axis and the contribution of the frequency component on the y-axis. Low-frequency fields exhibit a more pronounced right skew in their power spectra. We compute the position of the median spatial frequency, which splits the power spectrum into approximately equal halves, and use it to order the fields. Fields with larger median frequencies indicate more varied and complex terrain structure. This procedure enabled us to synthesize scalar fields with very similar height distributions, while providing precise control over spatial frequency. Figure 1 illustrates examples generated using this method.
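The median-frequency measurement can be sketched with NumPy as follows. This is one plausible reading of the procedure, not the authors' code: it assumes power is the squared FFT magnitude, that frequencies are pooled by radial distance from the origin of the 2D spectrum, and that the result is reported in cycles per 410 pixels. The function name `median_spatial_frequency` is hypothetical.

```python
import numpy as np


def median_spatial_frequency(field, window_px=410):
    """Median radial frequency of a 2D scalar field, in cycles per `window_px`.

    Assumptions (not confirmed by the paper): power = |FFT|^2, the mean (DC)
    component is removed, and 2D frequencies are pooled by radial distance.
    """
    field = np.asarray(field, dtype=float)
    field = field - field.mean()                 # drop the DC component
    power = np.abs(np.fft.fft2(field)) ** 2      # power per frequency bin

    fy = np.fft.fftfreq(field.shape[0])          # vertical, cycles per pixel
    fx = np.fft.fftfreq(field.shape[1])          # horizontal, cycles per pixel
    radius = np.hypot(fy[:, None], fx[None, :]).ravel()

    # Sort bins by radial frequency and find where cumulative power hits 50%.
    order = np.argsort(radius)
    cumulative = np.cumsum(power.ravel()[order])
    idx = np.searchsorted(cumulative, 0.5 * cumulative[-1])
    return radius[order][idx] * window_px


if __name__ == "__main__":
    # A pure horizontal grating with 5 cycles per 410 pixels (10 across 820).
    x = np.arange(820)
    grating = np.tile(np.sin(2 * np.pi * 5 * x / 410), (64, 1))
    print(median_spatial_frequency(grating))
```

For a pure sinusoidal grating, all power sits at a single radial frequency, so the median coincides with the grating frequency. For noise-based terrain, the power is spread over many bins and the median summarizes where the bulk of it lies.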

Colormaps
We chose nine commonly used colormaps (listed in Table 1 and illustrated in Figure 2). In addition to a greyscale baseline, the colormaps selected reflect five design strategies:

– Sequential: monotonically increasing luminance over a limited number of hues (singlehue, bodyheat)
– Spiral: monotonically increasing luminance with multiple hues (cubehelix, extbodyheat)
– Diverging with uniformly-stepped luminance (coolwarm, spectral)
– Diverging with uniformly-stepped saturation (blueyellow)
– Fully saturated hues (rainbow)

‡ Following [36], we derive expected visual angle measurements from pixel dimensions by assuming standard web viewing conditions: W3C-compliant browsers render HTML images at 96 DPI [40], and automatically remap this to compensate for actual display resolution. We assume a viewing distance of 30 inches. Thus, the estimated visual angle for an object of size S pixels is $\theta = 2\tan^{-1}\left(\frac{(S/2)/96}{30}\right)$.

All colormaps were interpolated in the CIELAB color space, with the exception of coolwarm, blueyellow, and cubehelix; these were interpolated (as originally intended) in a polar form of the LAB space [24], in the HSL space, and using a tapered RGB helix [14], respectively.
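To illustrate what interpolating in CIELAB involves, the sketch below converts two sRGB endpoints to CIELAB, blends them linearly, and converts back. It uses the standard sRGB and CIELAB formulas with a D65 white point; the choice of white point, the out-of-gamut clipping, and all function names are assumptions made for illustration, and the paper does not describe its own implementation.

```python
WHITE_D65 = (0.95047, 1.0, 1.08883)  # assumed reference white
DELTA = 6 / 29


def _srgb_to_linear(c):
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def _linear_to_srgb(c):
    c = min(max(c, 0.0), 1.0)  # clip out-of-gamut values (a simplification)
    return 12.92 * c if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055


def _f(t):
    return t ** (1 / 3) if t > DELTA ** 3 else t / (3 * DELTA ** 2) + 4 / 29


def _f_inv(t):
    return t ** 3 if t > DELTA else 3 * DELTA ** 2 * (t - 4 / 29)


def rgb_to_lab(rgb):
    """Convert an sRGB triple in [0, 1] to CIELAB (L*, a*, b*)."""
    r, g, b = (_srgb_to_linear(c) for c in rgb)
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b
    fx, fy, fz = (_f(v / w) for v, w in zip((x, y, z), WHITE_D65))
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def lab_to_rgb(lab):
    """Convert CIELAB (L*, a*, b*) back to an sRGB triple in [0, 1]."""
    lightness, a, b = lab
    fy = (lightness + 16) / 116
    fx, fz = fy + a / 500, fy - b / 200
    x, y, z = (w * _f_inv(v) for v, w in zip((fx, fy, fz), WHITE_D65))
    r = 3.2404542 * x - 1.5371385 * y - 0.4985314 * z
    g = -0.9692660 * x + 1.8760108 * y + 0.0415560 * z
    bl = 0.0556434 * x - 0.2040259 * y + 1.0572252 * z
    return tuple(_linear_to_srgb(c) for c in (r, g, bl))


def interpolate_lab(rgb_start, rgb_end, t):
    """Color at position t in [0, 1] on a ramp that is linear in CIELAB."""
    lab0, lab1 = rgb_to_lab(rgb_start), rgb_to_lab(rgb_end)
    return lab_to_rgb(tuple(u + t * (v - u) for u, v in zip(lab0, lab1)))


if __name__ == "__main__":
    # Midpoint of a black-to-white greyscale ramp interpolated in CIELAB.
    print(interpolate_lab((0, 0, 0), (1, 1, 1), 0.5))
```

The midpoint of a black-to-white ramp built this way has L* = 50, which corresponds to an sRGB grey noticeably darker than the naive 0.5 midpoint obtained by interpolating RGB values directly. This difference is the motivation for interpolating in a perceptual space.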

Colormap | Description | Luminance control | Hues | Design strategy
Greyscale | Linear black to white ramp, interpolated in the CIELAB color space | monotonic increase | - | luminance ramp
Singlehue | Monotonically increasing luminance over a single blue hue (from ColorBrewer [15]) | monotonic increase | blue | sequential
Bodyheat | Monotonically increasing luminance with a limited hue profile, similar to a heated metal filament | monotonic increase | red, yellow | sequential
Cubehelix | Monotonically increasing luminance with sinusoidal RGB rotation [14] | monotonic increase | sinusoidal RGB | spiral
Ext-bodyheat | Monotonically increasing luminance based on bodyheat, but augmented with additional blue and purple hues in the low regions [41] | monotonic increase | blue, red, yellow | spiral
Cool-warm | Diverging, with blue and red at the endpoints and soft white at the middle [24]. Uniform luminance steps with darker ends and a bright midpoint | uniform, mid peak | blue, red | diverging
Spectral | Diverging, multi-hue, encompassing a subset of the rainbow with a yellow middle [15]. Uniform luminance steps with darker ends and a bright midpoint | uniform, mid peak | limited RGB | diverging
Blue-yellow | Diverging, uniformly-stepped saturation with blue and yellow ends and 75% grey in the middle | - | blue, yellow | diverging
Rainbow | Fully saturated hue gradation (blue, green, yellow, red), interpolated in CIELAB | - | saturated RGB | hue rotation

Table 1: The nine colormaps evaluated in this study.

Experimental Design
We investigate two independent variables: Colormap and Spatial Frequency. Colormap comprised nine distinct categories (see above), whereas Spatial Frequency is a continuous variable representing the number of cycles in 410 pixels (i.e., half the width of our map stimulus, or approximately 8° of visual angle‡). We sampled spatial frequency at five intervals: 3, 5, 7, 9, 11. To systematically study the effect of this variable on the different colormaps, we opted for a factorial design, testing all possible 9×5 Colormap and Spatial Frequency combinations. Given the sheer number of combinations, we opted for a mixed design to make the study feasible. Participants were randomly assigned to one of three experimental conditions (illustrated in Table 2). Each condition comprised 3 of the 9 colormaps (i.e., between-subject) and all 5 frequency levels (within-subject). In effect, every participant saw 3 colormap × 5 spatial frequency combinations. Participants completed multiple trials with each combination.

[Figure 3 panels, each with an elevation (feet) color scale. Experiment 1 prompt: "Click on a point that has an elevation of exactly 750 feet." Experiment 2 prompt: "Terrain is steeper when there is larger change in elevation between adjacent points. Compare the steepness of terrain inside the two boxes, then click on the box that is steeper on average." Experiment 3 prompt: "Imagine a line from A to B. Select the elevation profile below that most closely matches its slope."]
Figure 3: Example stimuli from the three experiments. In experiment 1, participants indicated their response by clicking a point on the map matching a specified elevation. In experiment 2, participants were prompted to select the steeper of the two boxes. In experiment 3, participants were asked to identify the pattern corresponding to the terrain profile between two horizontally displaced markers.

To equalize task difficulty, stimuli for a given spatial frequency trial were derived from the same base scalar field, with different colormaps applied. This arrangement enables us to make direct comparisons between the colormaps for a given frequency. However, it also meant that participants would see the same map three times, albeit with different colormaps. To prevent learning, scalar fields were flipped either horizontally or vertically, resulting in three unique map reflections. The order of colormap presentation was fully counterbalanced across participants to minimize residual learning or fatigue effects.

Condition | Colormaps tested | Spatial frequencies tested
1 | greyscale, cubehelix, rainbow | 3, 5, 7, 9, 11
2 | singlehue, extbodyheat, spectral | 3, 5, 7, 9, 11
3 | bodyheat, coolwarm, blueyellow | 3, 5, 7, 9, 11

Table 2: Three experimental conditions; each included 3 of 9 colormaps (i.e., between-subject variation) and all 5 levels of spatial frequency.

EXPERIMENT 1: QUANTITY ESTIMATION
The first experiment tests participants’ ability to identify locations on the map matching specified elevations. Participants were instructed to “Click on a point that has an elevation of exactly [H] feet”. Five different values for H were tested: 0, 250, 500, 750, and 1000 feet. These values correspond to the three quartiles of the color scale, as well as the min and max.

Participants
We recruited 90 participants from Amazon Mechanical Turk (50 females, 40 males) with a mean age of 34.64 years (SD = 9.53 years). Participants were first screened for color-vision deficiency using a 14-panel Ishihara test, and had to correctly guess the number in 12 of the 14 panels to qualify. We restricted the study to participants with a screen resolution of at least 1280×800 to ensure the experimental interface would fit their display. Participants received a base reward of $0.50 and a maximum bonus of $3.00 based on the percentage of correctly solved tasks (for a possible total of $3.50).

Procedure
After signing up for the study, participants were directed to an external link that displayed the experiment within a web interface. Participants entered their MTurk ID and were presented with an information sheet about the study. They were then presented with the color-vision qualification test. Those who successfully passed the test were given a set of 6 training trials and provided with feedback on their accuracy. Participants had to identify a location that is within a 5% margin from the specified height before proceeding to the next training trial.

The main portion of the experiment consisted of 3 rounds, one with each of the 3 colormaps. In each round, participants saw the five spatial frequency levels in ascending order, providing a progression from simple to more complex maps. The order of colormap presentation was fully counterbalanced across participants using a Latin square design. Participants completed 5 trials with each colormap and spatial frequency combination, corresponding to the 5 tested quantities (0, 250, 500, 750, and 1000), presented in random order. A color scale was displayed to the right of the map, and the range of the scale was fixed at [0–1000] feet (see Figure 3). In each trial, participants first saw the question and clicked on ‘Show Map’ to reveal the stimulus. They then indicated their response by clicking on the map to mark their selected location, and clicked ‘Next’. To aid participants in accurately selecting locations, the mouse cursor was changed to a crosshair with a hollowed-out center (so as not to obscure the focal pixel).
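One way to produce counterbalanced presentation orders is a cyclic Latin square, sketched below. The text does not specify which Latin square was used or how participants were mapped to its rows, so the cyclic construction and the helper names (`latin_square_orders`, `order_for_participant`) are illustrative assumptions.

```python
def latin_square_orders(items):
    """Build a cyclic Latin square: row r is the item list rotated by r.

    Each item appears exactly once in every row and every column.
    """
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


def order_for_participant(participant_index, items):
    """Assign presentation orders round-robin (hypothetical assignment rule)."""
    orders = latin_square_orders(items)
    return orders[participant_index % len(orders)]


# Colormaps of condition 1 in Table 2.
condition_1 = ["greyscale", "cubehelix", "rainbow"]
orders = latin_square_orders(condition_1)
```

With three colormaps this yields three orders, so each colormap is shown first, second, and third equally often when participants are spread evenly across the rows.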

Results
We computed an ‘error’ measurement for each response by taking the absolute difference between the requested elevation and the elevation at the point clicked by the participant. We then applied the following log transform [10, 16]:

\[ \log_2(\text{error}) = \log_2\left(\lvert \text{judged percent} - \text{true percent} \rvert + \tfrac{1}{8}\right) \]
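The transform above can be written as a short function. This sketch assumes that elevations are first expressed as a percentage of the fixed [0–1000] feet scale, which the formula's "percent" terms suggest but the text does not state outright. The helper `feet_to_percent` is hypothetical.

```python
import math

SCALE_MAX_FEET = 1000.0  # range of the color scale used in experiment 1


def feet_to_percent(feet):
    """Express an elevation as a percentage of the color scale (assumption)."""
    return 100.0 * feet / SCALE_MAX_FEET


def log_error(judged_percent, true_percent):
    """Log-transformed absolute error; the 1/8 offset keeps log2 defined at zero error."""
    return math.log2(abs(judged_percent - true_percent) + 1 / 8)


# A click on terrain at 530 ft when 500 ft was requested is a 3-point error.
example = log_error(feet_to_percent(530), feet_to_percent(500))
```

A perfect response maps to log2(1/8) = -3, which is the floor of the transformed measure.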

We removed three participants from the analysis (amounting to 3.3% of subjects) because their overall accuracy was two standard deviations below the mean accuracy for all participants (M = 85.36%, STD = 9.48%). We analyze the results by fitting the log of error to a linear mixed-effects model comprising two fixed effects (colormap, frequency) and two random effects. The first random effect accounts for individual variations among participants, and the second for intra-trial variations (recall that trials comprised different test elevations).

[Figure 4 plot: one panel per colormap (greyscale, singlehue, bodyheat, cubehelix, extbodyheat, coolwarm, rainbow, spectral, blueyellow); x-axis: spatial frequency (3, 5, 7, 9, 11); y-axis: log(error); series: experiment, model.]

Figure 4. Mean log of error in quantity estimation (experiment 1 vs. model). Ribbons represent 95% CIs of the experimental results.

[Figure 5 plots: log(error) by spatial frequency (3, 5, 7, 9, 11) with one series per colormap, and log(error) by colormap.]

Figure 5. Mean log error in quantity estimation by colormap (left) and by colormap × spatial frequency. Intervals are 95% CIs.

Figure 4 illustrates the mean log error by colormap and spatial frequency (the dashed trendline represents the model). The experimental results are combined in Figure 5 to ease comparison. A likelihood ratio test indicates the overall model is significant (χ²(18) = 14817, p < 0.001). To test for interaction between spatial frequency and colormap, we fit a reduced model that accounts for both frequency and colormap but not their interaction. There was no significant difference between the full and the reduced model (χ²(8) = 10.695, p = 0.219), thus ruling out an interaction between colormap and spatial frequency. We will therefore interpret the reduced model, which accounts for both factors independently. Table 3 lists the model coefficients.

The model predicts that a step-increase in spatial frequency yields a 0.11 increase in the log of estimation error. The difference in estimation error between the highest (f=11) and lowest (f=3) frequency levels is approximately 0.9 orders of magnitude. The effect of color encoding was equally evident: all the colormaps were significantly better than greyscale. However, the gain in accuracy was markedly different between the colormaps. Rainbow had the largest impact on estimation accuracy, reducing error by approximately 2.3 orders of magnitude compared to greyscale. The runner-up was spectral, which also contains substantial hue variation. However, spectral reduced error by 1.75 orders of magnitude only. On the other hand, Spiral colormaps (extbodyheat, cubehelix), which comprise multiple hues over a monotonically increasing luminance, decreased estimation errors by approximately 1.3–1.4 orders of magnitude, compared to 0.8–1.2 for Diverging ramps (blueyellow, coolwarm). Sequential schemes (singlehue and bodyheat) had the least impact on error, with a mere improvement of 0.26–0.71 orders of magnitude relative to greyscale.

Coefficient         Estimate   |t value|   p
(Intercept)           1.77       6.628
singlehue            -0.26       2.253
bodyheat             -0.71       6.118
cubehelix            -1.27      16.467
extbodyheat          -1.39      12.073
coolwarm             -1.18      10.122
rainbow              -2.33      30.264
spectral             -1.75      15.172
blueyellow           -0.80       6.871
Spatial Frequency     0.11      16.967

Table 3. Effects of colormap and spatial frequency on the log of error in quantity estimation. The intercept represents greyscale as colormap (*** = p < 0.001, ** = p < 0.01, * = p < 0.05).

[Figure 6 plot: x-axis: spatial frequency (3, 5, 7, 9, 11); y-axis: avg. gradient.]

Figure 6. Average local gradient (i.e., terrain slope) at locations selected by participants. Error bars are 95% CIs.
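To make the reduced model concrete, the fixed-effect part can be evaluated directly from the Table 3 estimates. The sketch below is illustrative only: it ignores the random effects, and it assumes that spatial frequency enters the model as the raw level (3 to 11), which is consistent with the reported 0.9 difference between f=11 and f=3 (8 × 0.11 ≈ 0.9) but is not stated explicitly.

```python
# Fixed-effect estimates copied from Table 3 (greyscale is the baseline).
INTERCEPT = 1.77
FREQUENCY_SLOPE = 0.11
COLORMAP_EFFECT = {
    "greyscale": 0.0, "singlehue": -0.26, "bodyheat": -0.71,
    "cubehelix": -1.27, "extbodyheat": -1.39, "coolwarm": -1.18,
    "rainbow": -2.33, "spectral": -1.75, "blueyellow": -0.80,
}


def predicted_log_error(colormap, frequency):
    """Additive prediction: no colormap x frequency interaction term."""
    return INTERCEPT + COLORMAP_EFFECT[colormap] + FREQUENCY_SLOPE * frequency


# Rank colormaps by predicted error at the highest frequency level.
ranking = sorted(COLORMAP_EFFECT, key=lambda c: predicted_log_error(c, 11))
```

Because the model is additive, the ranking is the same at every frequency level, which is the sense in which the relative effectiveness of the colormaps is stable.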

The fact that we did not find interaction between colormap and spatial frequency implies that the relative effectiveness of the different colormaps is stable across all spatial frequency levels tested. Rainbow is thus expected to be the most accurate colormap for quantity estimation, regardless of how spatially complex the data is. However, estimation accuracy will decrease comparably for all colormaps as the data becomes more spatially varied. This could reflect a combination of perceptual and motor difficulty in locating and clicking the intended location, due to the larger local gradients encountered in high-frequency maps (see Figure 6).

EXPERIMENT 2: GRADIENT PERCEPTION
The second experiment tests participants’ accuracy in comparing and judging the steepness of gradients. The ability to judge how fast the encoded quantities change between adjacent map locations is important in many contexts.

Participants
We recruited 126 participants (50 females, 74 males, 2 others) with a mean age of 35.62 years (STD = 9.55 years). Participants had an overall success rate of 67.75% (STD = 11.41%). Ten participants (7.9% of subjects) were dropped from the analysis because their overall accuracy was worse than chance, having correctly answered less than 50% of trials in a two-alternative forced-choice experiment.

[Figure 7 plot: one panel per colormap (greyscale, singlehue, bodyheat, cubehelix, extbodyheat, coolwarm, rainbow, spectral, blueyellow); x-axis: spatial frequency (3, 5, 7, 9, 11); y-axis: p(success); series: experiment, model.]

Figure 7. Probability of successful gradient judgment (experiment 2 vs. model). Ribbons represent 95% CIs of the experimental results.

[Figure 8 plots: proportion correct by spatial frequency (3, 5, 7, 9, 11) with one series per colormap, and proportion correct by colormap.]

Figure 8. Percentage of correctly answered trials in a gradient perception task (experiment 2). Intervals are 95% CIs.

Procedure
Each trial consisted of a map with two squares juxtaposed on top (see Figure 3). Participants were prompted to “compare the steepness of terrain inside the two boxes” and “click on the box that is steeper on average”. The two boxes were identical in size (175×175 pixels, or 3.5°×3.5° of visual angle). However, terrain steepness, which was calculated by taking the average first derivative within each box, was varied systematically. The gradient-ratio between the flatter and the steeper boxes was fixed to one of four levels: 0.8, 0.83, 0.86, 0.9 (±0.05). A lower ratio implies a larger and potentially more perceptible slope difference, making the task easier. However, the two boxes encompassed terrain with identical height ranges to reduce variability in the appearance of their peaks (a potential confound in slope judgment [26]).
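The steepness measure and gradient-ratio described above could be computed along the following lines. This is a sketch under stated assumptions: the "average first derivative" is taken here as the mean gradient magnitude from forward differences, and boxes are given as (top, left) corners with a shared side length. The exact derivative operator used in the study is not specified, and the function names are hypothetical.

```python
import math


def mean_gradient(field, top, left, size):
    """Average gradient magnitude inside a square box of a 2D field (list of rows).

    Uses forward differences between neighbouring cells (an assumption).
    """
    total, count = 0.0, 0
    for r in range(top, top + size - 1):
        for c in range(left, left + size - 1):
            dy = field[r + 1][c] - field[r][c]
            dx = field[r][c + 1] - field[r][c]
            total += math.hypot(dx, dy)
            count += 1
    return total / count


def gradient_ratio(field, box_a, box_b, size):
    """Ratio of the flatter box's steepness to the steeper box's (at most 1).

    Raises ZeroDivisionError if both boxes are perfectly flat.
    """
    ga = mean_gradient(field, box_a[0], box_a[1], size)
    gb = mean_gradient(field, box_b[0], box_b[1], size)
    return min(ga, gb) / max(ga, gb)


# Toy terrain: slope of 1 per cell on the left half, 2 per cell on the right.
terrain = [[0, 1, 2, 3, 3, 5, 7, 9] for _ in range(4)]
ratio = gradient_ratio(terrain, (0, 0), (0, 4), 4)
```

In the toy terrain the ratio is 0.5, far easier than the 0.8 to 0.9 levels used in the experiment, where the two boxes differ in steepness by only 10 to 20 percent.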

Participants first completed a set of 6 training trials that included feedback before proceeding to the main trials. The order of stimuli was similar to the previous experiment: the study consisted of 3 rounds, one with each of the 3 colormaps the participant was assigned to see. Each round encompassed all 5 spatial frequency levels. Participants completed 4 trials with each colormap and frequency combination, spanning a range of easy to difficult tests (a total of 60 trials). The order of colormap presentation was fully counterbalanced across participants.

Results
Figure 7 illustrates participants’ probability of correctly identifying the steeper gradient. The experimental data is shown separately in Figure 8. We fit the results to a logistic regression model comprising two fixed effects (colormap, frequency). The model also included two random effects to account for individual differences among participants and intra-trial variations (recall that trials varied in difficulty). The model essentially predicts the odds of correctly identifying the steeper gradient. A likelihood ratio test indicates the overall model is significant (χ²(17) = 42165, p < 0.001). The model correctly predicts 74.67% of outcomes. We find significant interaction between colormap and spatial frequency (χ²(8) = 27.81, p < 0.001). Table 4 shows model coefficients.

(a) Main effects
Coef.          Est.    |z|     p
(Intercept)    0.80    0.516
singlehue      1.14    0.415
bodyheat       0.94    0.183
cubehelix      0.90    0.354
extbodyheat    1.13    0.388
coolwarm       0.66    1.316
rainbow        0.65    1.354
spectral       0.92    0.259
blueyellow     0.91    0.281
Frequency      1.15    4.633

(b) Interaction effects (colormap × frequency)
Coef.          Est.    |z|     p
singlehue      0.96    0.910
bodyheat       1.04    1.020
cubehelix      1.06    1.303
extbodyheat    1.03    0.596
coolwarm       1.14    3.008
rainbow        1.13    2.876
spectral       1.06    1.330
blueyellow     1.12    2.471

Table 4. Main effects of colormap and spatial frequency on success odds in gradient judgment (a) and their interaction (b). Coefficients shown correspond to the exponentiated model estimates to reflect odds ratios. The intercept represents greyscale (*** = p < 0.001, ** = p < 0.01, * = p < 0.05).

The main-effect coefficients for all colormaps were not significant, indicating that all colormaps perform comparably to greyscale at low spatial frequencies. Participants are thus unlikely to benefit from the use of color when judging gradients in low-variance data. The main effect of spatial frequency, however, is significant. The model estimates that a step-increase in spatial frequency improves the odds of correct judgment by 15%. Estimating gradients appears to be easier in maps with more complex spatial structures.

The model indicates several noteworthy interactions. Although the use of color had no significant effect in low-frequency maps, several colormaps significantly outperformed greyscale at high frequency. The divergent coolwarm improved participants’ success odds by 14% for every step-increase in spatial frequency. Similarly, rainbow and blueyellow increased the odds by approximately 13% and 12%, respectively. Notably, these three colormaps contain substantial variation in saturation (coolwarm and blueyellow) or hue (rainbow). All other colormaps tested were not reliably different from greyscale.
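The way these odds ratios combine can be illustrated with a small calculation based on the Table 4 estimates. This is a rough sketch rather than the fitted model: it ignores both random effects, covers only greyscale and the three colormaps with significant interactions, and assumes spatial frequency is coded as the raw level (3 to 11), which the text does not confirm.

```python
# Exponentiated estimates copied from Table 4 (greyscale is the baseline).
INTERCEPT_ODDS = 0.80
FREQUENCY_OR = 1.15
MAIN_OR = {"greyscale": 1.0, "coolwarm": 0.66, "rainbow": 0.65, "blueyellow": 0.91}
INTERACTION_OR = {"greyscale": 1.0, "coolwarm": 1.14, "rainbow": 1.13, "blueyellow": 1.12}


def success_probability(colormap, frequency):
    """Convert multiplied odds ratios into a probability of a correct judgment."""
    per_step = FREQUENCY_OR * INTERACTION_OR[colormap]  # odds multiplier per frequency step
    odds = INTERCEPT_ODDS * MAIN_OR[colormap] * per_step ** frequency
    return odds / (1.0 + odds)


low = {c: success_probability(c, 3) for c in MAIN_OR}
high = {c: success_probability(c, 11) for c in MAIN_OR}
```

Under these assumptions the colormaps sit close to greyscale at the lowest frequency and pull ahead at the highest, because the interaction term compounds with every frequency step.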

EXPERIMENT 3: PATTERN PERCEPTION
Having tested accuracy in quantity estimation and gradient perception, we now evaluate participants’ ability to integrate these two skills. Experiment 3 required participants to extract a longitudinal pattern from the map and match it to an external representation, a task originally devised by Hyslop [18].

Participants
We recruited 165 participants (79 females, 84 males, 2 others). The mean participant age was 36.04 years (STD = 11.71). Overall, participants had a mean success rate of 78.51% in matching the correct pattern (STD = 19.31%). We dropped seven participants from the analysis (4.2% of subjects) whose overall accuracy was two standard deviations below the mean.

[Figure 9 plot: one panel per colormap (greyscale, singlehue, bodyheat, cubehelix, extbodyheat, coolwarm, rainbow, spectral, blueyellow); x-axis: spatial frequency (3, 5, 7, 9, 11); y-axis: p(success); series: experiment, model.]

Figure 9. Probability of successful pattern matching (experiment 3 vs. model). Ribbons denote 95% CIs of the experimental data.

[Figure 10 plots: proportion correct by spatial frequency (3, 5, 7, 9, 11) with one series per colormap, and proportion correct by colormap.]

Figure 10. Percentage of correctly answered trials in experiment 3. Intervals are 95% CIs.
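The exclusion rule used in experiments 1 and 3 (dropping participants whose overall accuracy falls two standard deviations below the mean) can be sketched as below. The sample standard deviation and the strict "below the cutoff" comparison are assumptions; the participant IDs and accuracies are made-up illustration values, not study data.

```python
import statistics


def exclude_low_performers(accuracy_by_participant, n_sd=2.0):
    """Keep participants whose accuracy is at or above mean - n_sd * SD.

    Uses the sample standard deviation (an assumption; the paper does not say
    whether the sample or population formula was applied).
    """
    values = list(accuracy_by_participant.values())
    cutoff = statistics.mean(values) - n_sd * statistics.stdev(values)
    kept = {pid: acc for pid, acc in accuracy_by_participant.items() if acc >= cutoff}
    return kept, cutoff


# Hypothetical accuracies: nine typical participants and one clear outlier.
accuracies = {f"p{i}": 0.85 for i in range(9)}
accuracies["p9"] = 0.20
kept, cutoff = exclude_low_performers(accuracies)
```

Note that the outlier itself lowers the mean and inflates the standard deviation, so with small samples this rule can only flag fairly extreme participants.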

Procedure
Participants first completed a set of 6 training trials that included feedback before proceeding to the main experiment. Each trial consisted of a map with two markers labeled A and B (see Figure 3). The markers were horizontally displaced by 350 pixels (7° of visual angle). Participants were given the following prompt: “Imagine a line from A to B. Select the elevation profile below that most closely matches its slope”. They then selected a choice among a set of 6 patterns, including the actual elevation profile and 5 other distractors. Distractors were generated from the same map so as to reflect similar spatial frequency characteristics, and had to be 65–70% similar to the actual profile (as measured by dynamic time warping [20]). Additionally, profiles and distractors were selected to not have peaks or valleys at the endpoints. These criteria, determined after a pilot, ensure similar task difficulty across the trials.
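For readers unfamiliar with dynamic time warping, the textbook dynamic-programming formulation is sketched below. It returns a distance between two elevation profiles; how the study converted such a distance into the 65–70% similarity criterion is not described, so no normalization is attempted here. The function name `dtw_distance` is hypothetical.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences.

    cost[i][j] holds the cheapest cumulative alignment cost of a[:i] and b[:j].
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # advance in a only
                                    cost[i][j - 1],      # advance in b only
                                    cost[i - 1][j - 1])  # advance in both
    return cost[n][m]


# Two hypothetical elevation profiles sampled along the A-to-B line.
profile = [100, 300, 500, 300, 100]
distractor = [100, 100, 300, 500, 300, 100]
distance = dtw_distance(profile, distractor)
```

Because warping lets one sample align with several, a profile that is merely stretched or delayed stays close to the original, which makes the measure suitable for grading how confusable a distractor is.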

The order of stimuli was similar to the previous two experiments: the study consisted of 3 rounds, one with each of the 3 colormaps the participant was assigned to see, and encompassing the 5 spatial frequency levels. Thus, every participant saw 3×5 colormap and frequency combinations, and completed 3 pattern matching trials with each combination, for a total of 45 trials. As in the previous experiments, the order of colormap presentation was fully counterbalanced.

Results
We fit the results to a logistic regression model comprising two fixed effects (colormap, frequency) and two random effects to account for individual differences and intra-trial variations. Figure 9 shows the odds of successful profile matching. The experimental results are illustrated separately in Figure 10. A likelihood ratio test indicates the model is significant (χ²(17) = 39467, p < 0.001). We find significant interaction between colormap and spatial frequency (χ²(8) = 18.131, p < 0.05); the relative effectiveness of the colormaps appears to vary with spatial frequency.

(a) Main effects
Coef.          Est.    |z|     p
(Intercept)    9.03    6.502
singlehue      1.34    0.649
bodyheat       0.80    0.503
cubehelix      0.77    0.681
extbodyheat    0.81    0.466
coolwarm       0.44    1.868
rainbow        0.98    0.064
spectral       0.57    1.284
blueyellow     0.73    0.697
Frequency      0.93    2.058

(b) Interaction effects (colormap × frequency)
Coef.          Est.    |z|     p
singlehue      0.98    0.450
bodyheat       1.02    0.456
cubehelix      1.09    1.781
extbodyheat    1.04    0.797
coolwarm       1.17    3.143
rainbow        1.01    0.259
spectral       1.11    2.198
blueyellow     1.09    1.683

Table 5. Main effects of colormap and spatial frequency on success odds in experiment 3 (a) and their interaction (b). Coefficients depict exponentiated model estimates to reflect odds ratios. The intercept corresponds to greyscale (*** = p < 0.001, ** = p < 0.01, * = p < 0.05; a fourth marker denotes p < 0.1).

Overall, we find a significant detrimental main effect of spatial frequency on pattern perception, as indicated by a 0.93 Frequency coefficient (Table 5a). This translates to a 7% drop in the odds of correctly matching the profile for every step-increase in spatial frequency. The main effect coefficients for all colormaps were not significant, indicating that the use of color at low spatial frequency is unlikely to improve pattern perception as compared to a plain greyscale ramp.

Colormap performance begins to diverge at high spatial frequency. Only two colormaps have significant and large enough odds-ratio coefficients (i.e., > 1.07) to overcome the frequency-induced perceptual difficulty: spectral and coolwarm increased the odds of correct pattern matching by 11% and 17%, respectively, for every step-increase in spatial frequency (after adjusting for frequency effects alone). Additionally, blueyellow and cubehelix were associated with a 9% improvement, but the advantage was not reliable (p < 0.1). On the other hand, extbodyheat, bodyheat, and rainbow had small (and insignificant) odds-ratio coefficients (0.98–1.04 < 1.07), indicating that, similar to greyscale, they are associated with lower success odds in complex maps.
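The 1.07 threshold mentioned above follows from the frequency coefficient: an interaction odds ratio must exceed 1/0.93 ≈ 1.075 for the net per-step odds multiplier to rise above 1. The short check below uses the Table 5 estimates; it is an arithmetic illustration that ignores statistical significance and the random effects.

```python
# Exponentiated estimates copied from Table 5.
FREQUENCY_OR = 0.93
INTERACTION_OR = {
    "singlehue": 0.98, "bodyheat": 1.02, "cubehelix": 1.09, "extbodyheat": 1.04,
    "coolwarm": 1.17, "rainbow": 1.01, "spectral": 1.11, "blueyellow": 1.09,
}

BREAK_EVEN = 1.0 / FREQUENCY_OR  # interaction OR needed to offset the frequency penalty


def net_step_odds_ratio(colormap):
    """Per-step change in success odds for a colormap (greyscale has no interaction term)."""
    return FREQUENCY_OR * INTERACTION_OR.get(colormap, 1.0)


net = {name: round(net_step_odds_ratio(name), 3) for name in INTERACTION_OR}
```

Cubehelix and blueyellow also clear the break-even value numerically (0.93 × 1.09 ≈ 1.01), but their interaction terms were not reliably different from zero, which is why the text singles out only coolwarm and spectral.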

In short, only two of the tested colormaps (coolwarm and spectral) appear to reliably support pattern perception at high spatial frequency. Both consist of a diverging ramp with uniformly-stepped luminance. All other colormaps (including greyscale) suffered as data complexity increased.

DISCUSSION AND GUIDELINES
Our work sheds new light on how spatial complexity impacts the perception of continuous color-coded maps. The experiments also led to some surprising findings that are at odds with current guidelines. We interpret these results and accordingly devise new task- and frequency-aware color mapping guidelines (indicated by ★). We also rank the tested colormaps and summarize our guidelines in Figure 11.

[Figure 11 diagram: colormaps ranked from better to worse for each task. Quantity estimation: ranking unaffected by spatial frequency. Gradient perception and pattern perception: no significant difference at low spatial frequency (0.38 cycle/deg); rankings shown at high spatial frequency (1.38 cycle/deg). Color map guidelines: (1) Maximize range of saturated hues, regardless of spatial frequency. (2) At high spatial frequency: fully-saturated hues or diverging ramps with chroma variation. (3) At high spatial frequency: diverging ramps with uniformly-stepped luminance.]

Figure 11. Model-derived colormap ranking and guidelines by task and spatial frequency (* = p < 0.05; a second marker denotes p < 0.1, relative to greyscale).

Quantity Estimation
Our first hypothesis (H1) predicts hue- and saturation-varying ramps to be more accurate at low spatial frequencies, and ramps with monotonically increasing luminance to be more accurate at high frequencies. As discussed, H1 is based on the relative contrast-sensitivity of our visual system [32]. A quantity estimation task (experiment 1) shows no interaction between colormap and spatial frequency. While increased spatial complexity is associated with higher estimation error, the effect is similar across all colormaps. We thus reject H1.

On the other hand, results provide support for H2, which predicts that hue-varying ramps will lead to more accurate estimation. Indeed, the top performing colormaps (rainbow and spectral) contain substantial hue variation. Results from experiment 1 thus replicate earlier findings by Ware [41], but also extend them to show that spatial frequency has no apparent impact on the effectiveness of hue-varying ramps. Our data shows that rainbow and spectral are the most accurate among the colormaps tested, even at the highest levels of spatial frequency. Altogether, these results lend further support to the theory that lookup errors in color-coded maps are largely caused by systematic simultaneous contrast shifts [41], rather than being affected by contrast sensitivity modulation [32]. These shifts are best counteracted with colormaps that vary non-monotonically along one or more perceptual channels.

A corollary result is that mixing monotonic luminance with hue variation would lead to significant accuracy loss. Indeed, data from experiment 1 indicates that rainbow is approximately an order of magnitude more accurate than extbodyheat and cubehelix. These Spiral colormaps are designed to be more accurate rainbow alternatives for interval data [4, 24]. On the contrary, we find that they reduce accuracy compared to a purely hue-varying ramp. This finding suggests that, when estimating a continuously coded spatial quantity, people benefit most from a large dynamic hue range. Incorporating monotonically increasing lightness within the colormap would necessarily reduce the hue range, thereby diminishing accuracy.

★ Guideline 1: We recommend maximizing hue variation to improve quantity estimation, irrespective of spatial frequency.

Gradient Perception
Gradient perception allows people to distinguish how quickly the encoded attribute changes between adjacent locations, an essential skill when evaluating the distribution and variance of spatial data. We find that the task is strongly modulated by the data’s spatial complexity: increased spatial frequency appears to enhance the perception of gradients. This is unsurprising, as maps with jagged surfaces are likely to exhibit more pronounced (and thus more perceptible) differences in slope. Colormap effectiveness was also impacted by spatial frequency: color encoding did not help participants distinguish gradients at low frequency levels, as all colormaps showed similar performance to greyscale. However, three colormaps demonstrated a significant advantage at high frequencies. Coolwarm, rainbow, and blueyellow improved perception odds by 12–14% for every step-increase in spatial frequency. All three employed one of two design strategies: a diverging ramp with varying saturation or a fully saturated hue rotation.

The above results contradict H1, which predicts hue- and saturation-varying colormaps to perform better at low frequencies. In fact, we see the opposite. The results also do not support H3, which predicts better performance for monotonically-luminant ramps in structure perception tasks. In fact, all three top-performing ramps exhibit non-monotonic luminance.

★ Guideline 2: For tasks requiring gradient perception at high spatial frequency, we recommend a range of fully saturated hues (e.g., rainbow) or diverging, chroma-varying ramps (e.g., coolwarm or blueyellow).

Pattern Perception
Experiment 3 prompted participants to match the elevation profile along a horizontal path with an external pattern. We expected colormaps with monotonically increasing luminance to be more accurate at this task (H3), but results were not entirely consistent with this prediction. While all tested colormaps had comparable performance at low spatial frequency, only two colormaps, coolwarm and spectral, gave participants higher odds of successfully matching the pattern at high frequency. Both colormaps comprise a diverging ramp with uniformly-stepped (though not strictly monotonic) luminance. By contrast, sequential and spiral ramps performed just as poorly as greyscale in complex maps, and so did rainbow.

The above results are consistent with Moreland’s argument that diverging ramps provide “maximal perceptual resolution” (through increasing and decreasing luminance intervals) [24], potentially enabling high-frequency patterns to be resolved more easily. Our results may also explain why diverging schemes performed better in medical diagnosis [3]: we suspect such tasks to require the analysis of potentially high-frequency features (e.g., small tissue aberrations).

★ Guideline 3: We recommend diverging ramps with equidistant luminance steps (e.g., coolwarm and spectral) to support the perception of longitudinal patterns at high spatial frequency. Rainbow, Sequential, and Spiral schemes should be avoided in complex maps, especially if the task involves the analysis and matching of fine-grained features.

Yet Another Look at the Rainbow
Results of experiments 1 and 2 may shed light on why rainbow remains a popular choice among scientists [25], despite being considered a bad choice by the visualization community [4, 30]. Our data reveals that, counterintuitively, rainbow is robust for estimating a smoothly varying quantitative attribute, regardless of spatial complexity. Moreover, rainbow provides good support for gradient estimation at high spatial frequency. These two tasks correspond to elementary visual analytic primitives, including characterizing distributions, determining ranges, and filtering [1]. Moreover, studies show that when experts attempt to form a mental model about a visualization, they first go through a time-consuming process of extracting quantitative data “at a rather detailed level” [38]. For instance, a weather forecaster will look up pressure and wind changes, estimating current readings at landmark locations in the map before making a forecast. Our data suggests that rainbow provides good support for these tasks, making it a potentially reasonable choice for weather forecasters.

Critique of rainbow centers on its tendency to create sharp visual boundaries, particularly around its yellow regions [4]. Experts also criticize the use of fully saturated hues [24], which result in non-uniform perceptual steps within the color ramp. The common intuition is that these two factors combined will inevitably distort the perception of quantities. We do not see evidence to support this hypothesis. In fact, to the contrary, attempts to ‘linearize’ the rainbow by monotonically increasing the luminance of hues could reduce estimation accuracy by up to an order of magnitude.

★ Guideline 4: Rather than entirely discouraging the use of rainbow, we suggest that it can be a reasonable design choice for conveying spatial distributions and variances, and in tasks that require quantitative, as opposed to geometric, precision. However, rainbow has a number of limitations. The use of green and red hues is problematic for people with color deficiency. Moreover, rainbow is probably ineffective at revealing high-frequency patterns. Interestingly, these shortcomings are balanced by diverging ramps (e.g., coolwarm), which, although quantitatively inaccurate, appear to support pattern perception at high spatial frequency. We thus argue that hue-varying and diverging colormaps support orthogonal tasks in continuous maps, and should therefore be considered as complementary rather than mutually exclusive choices.

LIMITATIONS AND FUTURE WORK
There are some limitations to our work that should be considered. First, as with other crowdsourced graphical perception studies, we gain access to a larger pool of participants but sacrifice some experimental control [16]. Particularly relevant to our study are the variations in participants’ monitors, including color calibration and display resolution, as well as the illumination conditions in their homes or offices, all of which can impact color perception. We could not control these factors, but attempted to counteract their variation by involving a larger sample (N=381). Although we expect crowdsourcing to improve the ecological validity of results and guidelines, uncontrolled variations can potentially reduce our ability to detect small but otherwise significant differences in performance between tested conditions. Future lab studies should therefore attempt to replicate our findings with added controls.

Second, our study employed a limited set of tasks designed to measure elementary perceptual operators, including quantity estimation, gradient perception, and pattern matching. There is an opportunity to test higher-level tasks that mimic scientific analyses more closely, including the identification and comparison of larger map features (e.g., fronts, ridges). Additionally, some of the tasks we tested could be re-evaluated in more authentic formulations. For instance, a metric task could require participants to estimate the quantity at a specific location on the map. This formulation is arguably more realistic than the task we tested, which simply asked participants to click any location thought to match a specified quantity.

Third, our analysis was focused exclusively on spatial frequency, and there are good reasons to consider this factor [11, 32]. However, there are also additional data characteristics to consider, including, for instance, the distribution of amplitudes within the map. Such factors will influence the distribution of colors in the image and may thus impact perception.

Lastly, we limited our study to synthetically generated scalar fields to precisely vary spatial frequency while controlling for other confounds. However, synthetic stimuli may also introduce (unknown) perceptual or cognitive biases. Therefore, additional studies are needed to replicate our findings with datasets from real-world domains (e.g., meteorology, geophysics, or oceanography) and with domain experts. We also restricted this study to participants with normal color vision. Therefore, our results may not generalize to approximately 5% of the population who have some form of color deficiency.

CONCLUSIONS
We conducted three experiments to investigate the effects of spatial frequency and colormap characteristics on the perception of continuous pseudocolor maps. Our results indicate that spatial frequency impacts judgment of the encoded quantities and structures. While viewers’ quantity estimation accuracy exhibited a predictable response, increased data complexity had a more nuanced effect on gradient and pattern comprehension, the impact of which was dependent on the colormap used. Designers should therefore consider both the type of task and the spatial complexity of the underlying data. We re-examined current guidelines and devised new recommendations for color-coding of continuous spatial data.

REFERENCES
1. Robert Amar, James Eagan, and John Stasko. 2005. Low-level components of analytic activity in information visualization. In Information Visualization, 2005. INFOVIS 2005. IEEE Symposium on. IEEE, 111–117.

2. Lawrence D. Bergman, Bernice E. Rogowitz, and Lloyd A. Treinish. 1995. A rule-based tool for assisting colormap selection. In Proceedings of the 6th conference on Visualization ’95. IEEE Computer Society, 118.

3. Michelle Borkin, Krzysztof Gajos, Amanda Peters, Dimitrios Mitsouras, Simone Melchionna, Frank Rybicki, Charles Feldman, and Hanspeter Pfister. 2011. Evaluation of artery visualizations for heart disease diagnosis. IEEE Transactions on Visualization and Computer Graphics 17, 12 (2011), 2479–2488.

4. David Borland and Russell M. Taylor II. 2007. Rainbow color map (still) considered harmful. IEEE Computer Graphics and Applications 27, 2 (2007).

5. Cynthia A. Brewer. 1994. Visualization in Modern Cartography. Elsevier Science, Chapter Color use guidelines for mapping and visualization.

6. Cynthia A. Brewer. 1996. Guidelines for selecting colors for diverging schemes on maps. The Cartographic Journal 33, 2 (1996), 79–86.

7. Cynthia A. Brewer, Alan M. MacEachren, Linda W. Pickle, and Douglas Herrmann. 1997. Mapping mortality: Evaluating color schemes for choropleth maps. Annals of the Association of American Geographers 87, 3 (1997), 411–438.

8. Roxana Bujack, Terece L. Turton, Francesca Samsel, Colin Ware, David H. Rogers, and James Ahrens. 2017. The Good, the Bad, and the Ugly: A Theoretical Framework for the Assessment of Continuous Colormaps. IEEE Transactions on Visualization and Computer Graphics (2017).

9. William S. Cleveland and Robert McGill. 1983. A color-caused optical illusion on a statistical graph. The American Statistician 37, 2 (1983), 101–105.

10. William S. Cleveland and Robert McGill. 1984. Graphical perception: Theory, experimentation, and application to the development of graphical methods. J. Amer. Statist. Assoc. 79, 387 (1984), 531–554.

11 Russel De Valois and Karen De Valouis 1990 SpatialVision Oxford University Press

12 Russell L De Valois Duane G Albrecht and Lisa GThorell 1978 Cortical cells bar and edge detectors orspatial frequency filters In Frontiers in visual scienceSpringer 544ndash556

13. Connor C. Gramazio, David H. Laidlaw, and Karen B. Schloss. 2017. Colorgorical: Creating discriminable and preferable color palettes for information visualization. IEEE Transactions on Visualization and Computer Graphics 23, 1 (2017), 521–530.

14. D. A. Green. 2011. A colour scheme for the display of astronomical intensity images. Bulletin of the Astronomical Society of India 39 (June 2011), 289–295.

15. Mark Harrower and Cynthia A. Brewer. 2003. ColorBrewer.org: an online tool for selecting colour schemes for maps. The Cartographic Journal 40, 1 (2003), 27–37.

16. Jeffrey Heer and Michael Bostock. 2010. Crowdsourcing graphical perception: using Mechanical Turk to assess visualization design. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 203–212.

17. G. T. Herman and H. Levkowitz. 1992. Color scales for image data. IEEE Computer Graphics and Applications 12, 1 (1992), 72–80.

18. Michael D. Hyslop. 2006. A comparison of spectral color and greyscale continuous-tone map perception. Master’s thesis. Michigan State University.

19. Alan David Kalvin, Bernice E. Rogowitz, Adar Pelah, and Aron Cohen. 2000. Building perceptual color maps for visualizing interval data. In Human Vision and Electronic Imaging V, Vol. 3959. International Society for Optics and Photonics, 323–336.

20. Eamonn Keogh and Chotirat Ann Ratanamahatana. 2005. Exact indexing of dynamic time warping. Knowledge and Information Systems 7, 3 (2005), 358–386.

21. Gordon Kindlmann, Erik Reinhard, and Sarah Creem. 2002. Face-based luminance matching for perceptual colormap generation. In Proceedings of the IEEE Conference on Visualization ’02. IEEE Computer Society, 299–306.

22. Robert Kosara. 2016. An Empire Built On Sand: Reexamining What We Think We Know About Visualization. In Proceedings of the Beyond Time and Errors on Novel Evaluation Methods for Visualization. ACM, 162–168.

23. Mark P. Kumler and Richard E. Groop. 1990. Continuous-tone mapping of smooth surfaces. Cartography and Geographic Information Systems 17, 4 (1990), 279–289.

24. Kenneth Moreland. 2009. Diverging color maps for scientific visualization. In International Symposium on Visual Computing. Springer, 92–103.

25. Kenneth Moreland. 2016. Why We Use Bad Color Maps and What You Can Do About It. Electronic Imaging 2016, 16 (2016), 1–6.

26. Lace Padilla, P. Samuel Quinan, Miriah Meyer, and Sarah H. Creem-Regehr. 2017. Evaluating the Impact of Binning 2D Scalar Fields. IEEE Transactions on Visualization and Computer Graphics 23, 1 (2017), 431–440.

27. Stephen E. Palmer. 1999. Vision science: Photons to phenomenology. MIT Press.

28. Ken Perlin. 1985. An image synthesizer. ACM SIGGRAPH Computer Graphics 19, 3 (1985), 287–296.

29. Linda Williams Pickle. 2003. Usability testing of map designs. In Proceedings of Symposium on the Interface of Computing Science and Statistics. 42–56.

30. P. Samuel Quinan and Miriah Meyer. 2016. Visually comparing weather features in forecasts. IEEE Transactions on Visualization and Computer Graphics 22, 1 (2016), 389–398.

31. Penny L. Rheingans. 2000. Task-based color scale design. In 28th AIPR Workshop: 3D Visualization for Data Exploration and Decision Making. International Society for Optics and Photonics, 35–43.

32. Bernice E. Rogowitz and Lloyd A. Treinish. 1994. Using perceptual rules in interactive visualization. In IS&T/SPIE 1994 International Symposium on Electronic Imaging: Science and Technology. International Society for Optics and Photonics, 287–295.

33. Bernice E. Rogowitz, Lloyd A. Treinish, Steve Bryson, and others. 1996. How not to lie with visualization. Computers in Physics 10, 3 (1996), 268–273.

34. Samuel Silva, Beatriz Sousa Santos, and Joaquim Madeira. 2011. Using color in visualization: A survey. Computers & Graphics 35, 2 (2011), 320–333.

35. Maureen Stone, Danielle Albers Szafir, and Vidya Setlur. 2014. An engineering model for color difference as a function of size. In Color and Imaging Conference, Vol. 2014. Society for Imaging Science and Technology, 253–258.

36. Danielle Albers Szafir. 2017. Modeling Color Difference for Visualization Design. IEEE Transactions on Visualization and Computer Graphics (2017).

37. Christian Tominski, Georg Fuchs, and Heidrun Schumann. 2008. Task-driven color coding. In Information Visualisation, 2008. IV’08. 12th International Conference. IEEE, 373–380.

38. J. Gregory Trafton, Susan S. Kirschenbaum, Ted L. Tsui, Robert T. Miyamoto, James A. Ballas, and Paula D. Raymond. 2000. Turning pictures into numbers: extracting and generating information from complex visualizations. International Journal of Human-Computer Studies 53, 5 (2000), 827–850.

39. Bruce E. Trumbo. 1981. A theory for coloring bivariate statistical maps. The American Statistician 35, 4 (1981), 220–226.

40. W3C. 2016. CSS Values and Units Module Level 3. (2016). http://www.w3.org/TR/css3-values/#absolute-lengths

41. Colin Ware. 1988. Color sequences for univariate maps: Theory, experiments and principles. IEEE Computer Graphics and Applications 8, 5 (1988), 41–49.

42. Colin Ware. 2012. Information visualization: perception for design. Elsevier.

43. Liang Zhou and Charles D. Hansen. 2016. A survey of colormaps in visualization. IEEE Transactions on Visualization and Computer Graphics 22, 8 (2016), 2051–2069.

  • Introduction
  • Related Work
    • Design Strategies for Continuous Colormaps
    • Empirical Evaluations of Continuous Colormaps
    • Effects of Spatial Frequency
    • Summary
  • Hypotheses
  • Methodology
    • Stimuli
    • Colormaps
    • Experimental Design
  • Experiment 1: Quantity Estimation
    • Participants
    • Procedure
    • Results
  • Experiment 2: Gradient Perception
    • Participants
    • Procedure
    • Results
  • Experiment 3: Pattern Perception
    • Participants
    • Procedure
    • Results
  • Discussion and Guidelines
    • Quantity Estimation
    • Gradient Perception
    • Pattern Perception
    • Yet Another Look at the Rainbow
  • Limitations and Future Work
  • Conclusions
  • References
A large body of research has been devoted to understanding how color encoding affects […]

in a quantity estimation task, irrespective of spatial frequency. However, for pattern-matching tasks, we find that divergent colormaps significantly outperform other schemes when the underlying data exhibits high spatial variance. These results have significant design implications and suggest complementary perceptual roles for hue-varying and divergent schemes. We distill these findings into new color mapping guidelines for continuous maps, accounting for task and spatial complexity of the data.

RELATED WORK
Color mapping involves the transformation of quantitative or categorical attributes into color by means of a colormap. To be of practical use, color mapping must enable the viewer to deduce quantities, distributions, and patterns present in the original data [43]. Researchers outline a number of properties believed to contribute to effective colormap design [39]. A good colormap sequence should be naturally orderable (e.g., from cool blue to warm red) so as to perceptually reflect the order of the originally mapped quantities. Additionally, a colormap should only reflect actual differences in the data, without creating artificial boundaries in color [34]. Using these broad principles, researchers developed design tools to provide colormap recommendations for designers. For example, ColorBrewer suggests a set of carefully crafted and validated palettes [15]. Similarly, Colorgorical enables users to generate categorical colormaps on demand using perceptual optimizations, while allowing for user-provided constraints [13]. The majority of these tools, however, are intended for crafting segmented colormaps and are primarily aimed at discrete map representations (e.g., choropleths). One exception, the PRAVDAColor tool [2], provides color mapping advice for continuous maps based on the data’s spatial frequency and the intended task. However, unlike ColorBrewer, this advice is based on design heuristics that have not been verified.

Design Strategies for Continuous Colormaps
Researchers have proposed a number of handpicked and procedurally generated colormaps for continuous data. For instance, Herman and Levkowitz devised a greyscale that maximizes CIELAB differences within the gradient, finding that it reduces estimation errors by 20% compared to a linearly interpolated scale [17]. Greyscale ramps are thought to be effective at revealing shapes and forms. However, they are susceptible to large simultaneous contrast shifts, making them less useful for quantity estimation [41]. One alternative to greyscales involves varying hues instead of lightness, typically via a gradation based on the electromagnetic spectrum. The result is a vivid, fully saturated colormap that looks like a rainbow. Although popular in scientific visualizations, rainbow has been the subject of much critique in the visualization community. Experts argue that the order of hues in rainbow is not readily apparent, making it unsuitable for encoding interval data [33]. Moreover, rainbow introduces sharp visual boundaries around its yellow regions, which can be misinterpreted by viewers who might infer nonexistent features in the data [4].

Given the above limitations, researchers proposed many alternatives to rainbow. For instance, ‘Spiral’ colormaps comprise a limited hue rotation combined with monotonically-increasing luminance. The result is a colormap that spirals up in the hue cone while simultaneously gaining luminance [42]. Similarly, cubehelix incorporates sinusoidal RGB variations accompanied by a monotonic buildup in luminance [14]. Kindlmann et al. propose an isoluminant version of rainbow [21], while Moreland advocates for diverging colormaps, which incorporate two opposing hues at the endpoints while passing through an unsaturated tone (typically white) [24]. These colormaps are thought to provide ‘perceptually uniform’ alternatives to rainbow by exerting control over luminance while providing a level of hue variation. Although these alternatives are strongly favored by visualization experts, evidence of their effectiveness remains inconclusive (e.g., see [23, 18] vs. [3]). Our study compares these different design strategies by testing a representative sample of colormaps, including rainbow, Spiral, and diverging schemes.

Empirical Evaluations of Continuous Colormaps
It is generally recognized that colormap designs should be adaptive to the intended graphical task [31]. Ware argues that colormaps should monotonically increase their luminance when the goal is to comprehend shapes and spatial features [41]. By contrast, when the goal is to estimate quantities, a colormap should be designed to reduce simultaneous contrast effects by registering non-monotonic variation in at least one of the three opponent-process channels. Ware confirms this latter hypothesis, finding spectral ramps (i.e., rainbow) to be the most accurate in quantity estimation, but finds little support for the shape perception theory. He then suggests that Spiral colormaps would be ideal for both quantity estimation and form perception [41]. A study by Borkin et al. finds diverging ramps to be significantly more accurate than rainbow when diagnosing heart disease from arterial scans [3]. However, Borkin et al.’s results contradict two earlier studies, which found spectral schemes to be more accurate in both quantity and surface interpretation [23, 18]. Such inconsistencies highlight a limitation in current literature: prior studies employed widely varying test conditions and tasks, thus complicating their comparison. Our work directly addresses this limitation. We evaluate colormaps under comparable experimental conditions and in a range of tasks, from quantity estimation to form and pattern comprehension.

Effects of Spatial Frequency
Spatial frequency is a measure of the level of variance (or the amount of information) that is present in a degree of visual angle. Maps with sharp edges and small features will generally convey more information, and thus exhibit higher spatial frequency components. Conversely, maps with broad, smooth surfaces contain less spatial variation, and thus exhibit lower spatial frequency. Spatial frequency is thought to have a critical role in visual perception; some vision theories suggest that the visual cortex operates on a code of spatial frequency, as opposed to a code of straight lines and edges [11, 12]. Moreover, spatial frequency is inversely proportional to the average size of visual features in the scene, and size is known to impact color perception [35, 36]. Given these factors, spatial frequency is likely to affect the perception of continuous maps, and possibly modulate the effectiveness of color schemes to varying degrees. Rogowitz et al. argue for two colormap design strategies depending on spatial frequency: they recommend ramps with monotonically increasing luminance for datasets containing high spatial frequency, and hue- or saturation-varying ramps for low-frequency data [32]. Put differently, we would expect sequential and spiral colormaps (e.g., cubehelix) to perform better in scalar fields that have rough surfaces and narrow features. Conversely, we can expect hue-varying ramps (e.g., rainbow) to work well with maps that have broad surfaces and low variance. This guideline is consistent with color difference experiments that tested viewers’ sensitivity to frequency-modulated Gabor patches [19]. However, it is doubtful whether such experimental results (and the ensuing guideline) generalize to visual analysis tasks on actual scalar fields.

[Figure 1 panels: scalar fields f1 to f5 with median frequencies f1 = 3, f2 = 5, f3 = 7, f4 = 9, f5 = 11; power spectra; height distribution from low terrain to high terrain.]

Figure 1. Five example scalar fields used as stimuli in this study. The fields represent the height of procedurally generated terrain (brighter is higher), and are ordered according to their median spatial frequency (cycles per 8° of visual angle‡). Log-log plots depict the power spectra of each scalar field (combined in the rightmost plot to aid comparison). The position of the median frequency, which splits the power distribution into two approximately equal halves, is illustrated with a vertical line. Although different with respect to spatial frequency characteristics, the maps are very similar in their distribution of height amplitudes (shown in the top-right plot).

Summary
Color-coding guidelines for continuous maps often come in the form of advice that discourages the use of rainbow [4, 25] and suggests perceptually uniform alternatives [24]. This clinical, omnibus advice is largely based on expert intuition and design heuristics. However, the literature paints a more complex picture, and suggests the choice of color encoding should be based on both task [31, 37] and data characteristics, including spatial frequency [32, 2]. This paper provides a first experimental account of the impact of spatial frequency on people’s ability to estimate quantities and perceive patterns in quantitative maps. By studying how spatial complexity modulates the effectiveness of colormap designs, we can devise more nuanced guidelines that are responsive to both data characteristics and viewers’ information needs (i.e., tasks).

HYPOTHESES
Building on prior research, we developed three hypotheses:

H1: We expect colormaps comprising large hue variations to be perceived more accurately in scalar fields containing low spatial frequency. Conversely, we expect ramps with monotonically increasing luminance to yield higher accuracy in high-frequency data. These predictions are based on the contrast sensitivity of our visual perceptual system, which responds more robustly to chromatic and hue variation when assessing broad, smooth surfaces, and to lightness differences when resolving small features [32].

H2: In quantity estimation tasks (experiment 1), where the goal is to estimate quantities at specific locations, we expect colormaps having substantial hue variation to perform better. This conjecture assumes that hue-varying ramps will register sinusoidal variations along the chromatic opponent-process channels. Such non-monotonic variations reduce simultaneous contrast shifts, because they are less likely to systematically weigh chromatic processing in a particular direction [41].

H3: In tasks requiring the comprehension of forms and structures (experiments 2 and 3), we expect colormaps having monotonically increasing luminance to perform best. Our visual system infers surface information largely from shading cues and luminance variation [27]. Therefore, colormaps that exert linear control over their luminance can be expected to portray forms and structures more effectively.

Although there is existing evidence to back H2 (see an earlier study by Ware [41]), our work aims to replicate and extend these results to account for spatial frequency. To that end, H1 provides a broader (yet untested) prediction of how spatial frequency might impact the performance of continuous colormaps.

METHODOLOGY
In the following sections, we present the results of three crowdsourced experiments to test the above hypotheses. Specifically, we investigate whether the effectiveness of commonly prescribed colormaps is modulated by the spatial frequency of the data, and the degree to which this relationship is influenced by variations in luminance, hue, and saturation within the color ramp. Each experiment tests one specific task against nine colormaps and at increasing levels of spatial frequency. The first experiment measures participants’ ability to estimate values at specific locations in the map. The second experiment tests participants’ accuracy in comparing gradients in larger map swaths. The third and final experiment is aimed at evaluating participants’ ability to perceive and match longitudinal patterns in the map. Before delving into the details, we describe our experimental design and stimulus generation procedure.

[Figure 2 panels: the nine colormaps (greyscale, singlehue, bodyheat, cubehelix, extbodyheat, coolwarm, spectral, rainbow, blueyellow) applied to fields of increasing spatial frequency, with profiles of luminance, saturation, and CIELAB a (green-red) and b (blue-yellow).]

Figure 2. We tested nine colormaps selected to encompass a variety of design characteristics (illustrated by variation in luminance, saturation, and/or hue). Each colormap was tested with multiple scalar fields corresponding to increasing levels of spatial frequency.

Stimuli
We employ digital elevation models (DEM) as stimuli for the three experiments. A DEM represents land elevation, with cells in the 2D scalar field representing terrain height at the corresponding locations. To maintain precise control over spatial frequency and task difficulty, we synthetically generate DEMs using Perlin noise [28], mixing five octaves of the noise function to produce seemingly realistic terrain. The resulting DEMs are then normalized so that their heights span the entire colormap range. All generated maps were 820×630 pixels in size (approximately 16°×13° of visual angle‡).
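The generation step described above can be sketched as follows. This is a minimal illustration, not the authors' generator: it assumes NumPy, a classic gradient-noise formulation with Perlin's quintic fade curve, and an amplitude that halves with each octave. The names `perlin2d`, `synth_dem`, and `base_cell_px` are hypothetical, and the lattice cell size stands in for the "scale of the noise function".

```python
import numpy as np

def fade(t):
    # Perlin's quintic interpolant: 6t^5 - 15t^4 + 10t^3
    return t * t * t * (t * (t * 6.0 - 15.0) + 10.0)

def perlin2d(height, width, cell_px, rng):
    """One layer of 2D gradient noise; lattice cells are cell_px pixels wide."""
    ys = np.arange(height) / cell_px
    xs = np.arange(width) / cell_px
    # Lattice must extend one node past the last cell touched by the image.
    ny, nx = int(ys[-1]) + 2, int(xs[-1]) + 2
    angles = rng.uniform(0.0, 2.0 * np.pi, size=(ny, nx))
    grad_y, grad_x = np.sin(angles), np.cos(angles)  # random unit gradients

    y, x = np.meshgrid(ys, xs, indexing="ij")
    y0, x0 = np.floor(y).astype(int), np.floor(x).astype(int)
    fy, fx = y - y0, x - x0  # position inside the lattice cell, in [0, 1)

    def corner(iy, ix, dy, dx):
        # Dot product of a corner's gradient with the offset from that corner.
        return grad_y[iy, ix] * dy + grad_x[iy, ix] * dx

    n00 = corner(y0, x0, fy, fx)
    n01 = corner(y0, x0 + 1, fy, fx - 1.0)
    n10 = corner(y0 + 1, x0, fy - 1.0, fx)
    n11 = corner(y0 + 1, x0 + 1, fy - 1.0, fx - 1.0)

    u, v = fade(fx), fade(fy)
    top = n00 + u * (n01 - n00)
    bottom = n10 + u * (n11 - n10)
    return top + v * (bottom - top)

def synth_dem(height=630, width=820, base_cell_px=200.0, octaves=5, seed=0):
    """Mix several octaves of noise, then normalize heights to [0, 1]."""
    rng = np.random.default_rng(seed)
    field = np.zeros((height, width))
    for k in range(octaves):
        # Each octave doubles the frequency and halves the amplitude (assumed).
        field += 0.5 ** k * perlin2d(height, width, base_cell_px / 2 ** k, rng)
    field -= field.min()
    return field / field.max()
```

Under these assumptions, lowering `base_cell_px` shrinks the terrain features and therefore raises the spatial frequency of the field, while the final normalization makes every field span the full colormap range.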

By varying the scale of the noise function, we obtain scalar fields with different spatial frequency characteristics. The latter is measured by first computing a Fast Fourier Transform (FFT) over the scalar field, and calculating the relative contribution of each carrier frequency from the magnitude of its FFT vector. The result can be illustrated with a power spectrum plot for each individual scalar field, with spatial frequency on the x-axis and the contribution of the frequency component on the y-axis. Low-frequency fields exhibit a more pronounced right skew in their power spectra. We compute the position of the median spatial frequency, which splits the power spectrum into approximately equal halves, and use it to order the fields. Fields with larger median frequencies indicate more varied and complex terrain structure. This procedure enabled us to synthesize scalar fields with very similar height distributions, while providing precise control over spatial frequency. Figure 1 illustrates examples generated using this method.
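One way to compute such a median frequency is sketched below. It is an illustration under stated assumptions rather than the authors' procedure: it uses NumPy's 2D FFT, takes the squared magnitude as each component's contribution, collapses the 2D spectrum onto radial frequency, and expresses the result in cycles per `window_px` pixels (410 px in this study). The function name and parameters are hypothetical.

```python
import numpy as np

def median_spatial_frequency(field, window_px=410):
    """Radial frequency (cycles per window_px) that splits total power in half."""
    centered = field - field.mean()            # drop the DC component
    power = np.abs(np.fft.fft2(centered)) ** 2  # contribution per frequency bin

    # Frequency of every FFT bin in cycles per pixel, along each axis.
    fy = np.fft.fftfreq(field.shape[0])
    fx = np.fft.fftfreq(field.shape[1])
    radial = np.hypot(fy[:, None], fx[None, :]) * window_px

    # Accumulate power from low to high radial frequency.
    order = np.argsort(radial, axis=None)
    sorted_freq = radial.ravel()[order]
    cumulative = np.cumsum(power.ravel()[order])

    # First bin at which at least half of the total power has been reached.
    idx = np.searchsorted(cumulative, 0.5 * cumulative[-1])
    return sorted_freq[idx]
```

For example, a field made of a single horizontal sinusoid with 5 cycles per 410 pixels has all of its power at that one frequency, so the function returns 5.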

Colormaps
We chose nine commonly-used colormaps (listed in Table 1 and illustrated in Figure 2). In addition to a greyscale baseline, the colormaps selected reflect five design strategies:

– Sequential: monotonically increasing luminance over a limited number of hues (singlehue, bodyheat)
– Spiral: monotonically increasing luminance with multiple hues (cubehelix, extbodyheat)
– Diverging with uniformly-stepped luminance (coolwarm, spectral)
– Diverging with uniformly-stepped saturation (blueyellow)
– Fully saturated hues (rainbow)

‡ Following [36], we derive expected visual angle measurements from pixel dimensions by assuming standard web viewing conditions: W3C-compliant browsers render HTML images at 96 DPI [40] and automatically remap this to compensate for actual display resolution. We assume a viewing distance of 30 inches. Thus, the estimated visual angle for an object of size S pixels is $\theta = 2\tan^{-1}\left(\frac{S/2}{96 \times 30}\right)$.
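The visual angle estimates quoted for the stimuli follow from the footnoted formula. The short sketch below evaluates it under the stated assumptions (96 DPI rendering, 30-inch viewing distance); the function name is illustrative.

```python
import math

def visual_angle_deg(size_px, dpi=96.0, distance_in=30.0):
    """Visual angle (degrees) subtended by size_px pixels at the assumed DPI and distance."""
    half_size_in = (size_px / 2.0) / dpi  # half the object's physical size, in inches
    return math.degrees(2.0 * math.atan(half_size_in / distance_in))

# Map width, map height, and half-width used as the spatial frequency window.
for pixels in (820, 630, 410):
    print(pixels, round(visual_angle_deg(pixels), 1))
```

The three values come out near 16°, 13°, and 8°, matching the approximate figures given for the map dimensions and the 410-pixel frequency window.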

All colormaps were interpolated in the CIELAB color space, with the exception of coolwarm, blueyellow, and cubehelix; these were interpolated (as originally intended) in a polar form of the LAB space [24], in the HSL space, and using a tapered RGB helix [14], respectively.

| Colormap | Description | Design strategy | Luminance control | Hues |
|---|---|---|---|---|
| Greyscale | Linear black to white ramp, interpolated in the CIELAB color space | luminance ramp | monotonic increase | - |
| Singlehue | Monotonically increasing luminance over a single blue hue (from ColorBrewer [15]) | sequential | monotonic increase | blue |
| Bodyheat | Monotonically increasing luminance with a limited hue profile, similar to a heated metal filament | sequential | monotonic increase | red, yellow |
| Cubehelix | Monotonically increasing luminance with sinusoidal RGB rotation [14] | spiral | monotonic increase | sinusoidal RGB |
| Ext-bodyheat | Monotonically increasing luminance based on bodyheat, but augmented with additional blue and purple hues in the low regions [41] | spiral | monotonic increase | blue, red, yellow |
| Cool-warm | Diverging with blue and red at the endpoints and soft white at the middle [24]. Uniform luminance steps with darker ends and a bright midpoint | diverging | uniform, mid peak | blue, red |
| Rainbow | Fully saturated hue gradation (blue, green, yellow, red), interpolated in CIELAB | hue rotation | - | saturated RGB |
| Spectral | Diverging, multi-hue, encompassing a subset of the rainbow with a yellow middle [15]. Uniform luminance steps with darker ends and a bright midpoint | diverging | uniform, mid peak | limited RGB |
| Blue-yellow | Diverging, uniformly-stepped saturation with blue and yellow ends and 75% grey in the middle | diverging | - | blue, yellow |

Table 1. The nine colormaps evaluated in this study.

Experimental Design
We investigate two independent variables: Colormap and Spatial Frequency. Colormap comprised nine distinct categories (see above), whereas Spatial Frequency is a continuous variable representing the number of cycles in 410 pixels (i.e., half the width of our map stimulus, or approximately 8° of visual angle‡). We sampled spatial frequency at five intervals: 3, 5, 7, 9, 11. To systematically study the effect of this variable on the different colormaps, we opted for a factorial design, testing all possible 9×5 Colormap and Spatial Frequency combinations. Given the sheer number of combinations, we opted for a mixed design to make the study feasible. Participants were randomly assigned to one of three experimental conditions (illustrated in Table 2). Each condition comprised 3 of the 9 colormaps (i.e., between-subject) and all 5 frequency levels (within-subject). In effect, every participant saw 3 colormap × 5 spatial frequency combinations. Participants completed multiple trials with each combination.

[Figure 3 panels, each with a color scale labeled elevation (feet). Experiment 1 prompt: “Click on a point that has an elevation of exactly 750 feet.” Experiment 2 prompt: “Terrain is steeper when there is larger change in elevation between adjacent points. Compare the steepness of terrain inside the two boxes, then click on the box that is steeper on average.” Experiment 3 prompt: “Imagine a line from A to B. Select the elevation profile below that most closely matches its slope.”]

Figure 3. Example stimuli from the three experiments. In experiment 1, participants indicated their response by clicking a point on the map matching a specified elevation. In experiment 2, participants were prompted to select the steeper of the two boxes. In experiment 3, participants were asked to identify the pattern corresponding to the terrain profile between two horizontally displaced markers.

To equalize task difficulty, stimuli for a given spatial frequency trial were derived from the same base scalar field, with different colormaps applied. This arrangement enables us to make direct comparisons between the colormaps for a given frequency. However, it also meant that participants would see the same map three times, albeit with different colormaps. To prevent learning, scalar fields were flipped either horizontally or vertically, resulting in three unique map reflections. The order of colormap presentation was fully counterbalanced across participants to minimize residual learning or fatigue effects.

| Condition | Colormaps tested | Spatial frequencies tested |
|---|---|---|
| 1 | greyscale, cubehelix, rainbow | 3, 5, 7, 9, 11 |
| 2 | singlehue, extbodyheat, spectral | 3, 5, 7, 9, 11 |
| 3 | bodyheat, coolwarm, blueyellow | 3, 5, 7, 9, 11 |

Table 2. Three experimental conditions; each included 3 of 9 colormaps (i.e., between-subject variation) and all 5 levels of spatial frequency.

EXPERIMENT 1: QUANTITY ESTIMATION
The first experiment tests participants’ ability to identify locations on the map matching specified elevations. Participants were instructed to “Click on a point that has an elevation of exactly [H] feet”. Five different values for H were tested: 0, 250, 500, 750, and 1000 feet. These values correspond to the three quartiles of the color scale, as well as the min and max.

Participants
We recruited 90 participants from Amazon Mechanical Turk (50 females, 40 males) with a mean age of 34.64 (STD = 9.53 years). Participants were first screened for color-vision deficiency using a 14-panel Ishihara test, and had to correctly guess the number in 12 of the 14 panels to qualify. We restricted the study to participants with a screen resolution of at least 1280×800 to ensure the experimental interface would fit their display. Participants received a base reward of $0.50 and a maximum bonus of $3.00 based on the percentage of correctly solved tasks (for a possible total of $3.50).

Procedure
After signing up for the study, participants were directed to an external link that displayed the experiment within a web interface. Participants entered their MTurk ID and were presented with an information sheet about the study. They were then presented with the color-vision qualification test. Those who successfully passed the test were given a set of 6 training trials and provided with feedback on their accuracy. Participants had to identify a location that is within a 5% margin from the specified height before proceeding to the next training trial.

The main portion of the experiment consisted of 3 rounds, one with each of the 3 colormaps. In each round, participants saw the five spatial frequency levels in ascending order, providing a progression from simple to more complex maps. The order of colormap presentation was fully counterbalanced across participants using a Latin square design. Participants completed 5 trials with each colormap and spatial frequency combination, corresponding to the 5 tested quantities (0, 250, 500, 750, and 1000) presented in random order. A color scale was displayed to the right of the map, and the range of the scale was fixed at [0–1000] feet (see Figure 3). In each trial, participants first saw the question and clicked on ‘Show Map’ to reveal the stimulus. They then indicated their response by clicking on the map to mark their selected location, and clicked ‘Next’. To aid participants in accurately selecting locations, the mouse cursor was changed to a crosshair with a hollowed-out center (so as not to obscure the focal pixel).
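The trial structure described above (3 rounds × 5 frequencies × 5 elevations) can be sketched as a schedule generator. This is an illustrative reconstruction, not the study's code: the cyclic Latin square, the use of a participant index to pick the row, and all names below are assumptions.

```python
import random

# Condition-to-colormap assignment, as listed in Table 2.
CONDITIONS = {
    1: ["greyscale", "cubehelix", "rainbow"],
    2: ["singlehue", "extbodyheat", "spectral"],
    3: ["bodyheat", "coolwarm", "blueyellow"],
}
FREQUENCIES = [3, 5, 7, 9, 11]          # presented in ascending order
ELEVATIONS = [0, 250, 500, 750, 1000]   # presented in random order

def latin_square_order(items, participant_index):
    """Row of a cyclic Latin square: rotate the list by the participant index."""
    shift = participant_index % len(items)
    return items[shift:] + items[:shift]

def build_schedule(condition, participant_index, seed=None):
    """Return the ordered list of (colormap, frequency, elevation) trials."""
    rng = random.Random(seed)
    trials = []
    for colormap in latin_square_order(CONDITIONS[condition], participant_index):
        for frequency in FREQUENCIES:
            for elevation in rng.sample(ELEVATIONS, k=len(ELEVATIONS)):
                trials.append((colormap, frequency, elevation))
    return trials
```

Each participant therefore completes 75 trials, and across any three consecutive participant indices every colormap appears once in each round position.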

Results
We computed an ‘error’ measurement for each response by taking the absolute difference between the requested elevation and the elevation at the point clicked by the participant. We then applied the following log transform [10, 16]:

$$\log_2(\text{error}) = \log_2\left(\left|\text{judged percent} - \text{true percent}\right| + \frac{1}{8}\right)$$
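As a concrete reading of this transform, the sketch below converts elevations to percentages of the color scale before taking the log. That conversion, and the function and parameter names, are assumptions made for illustration; the source gives only the formula.

```python
import math

def log2_error(judged_ft, true_ft, scale_min=0.0, scale_max=1000.0):
    """Log-transformed estimation error, with elevations expressed as percent of the scale."""
    span = scale_max - scale_min
    judged_pct = 100.0 * (judged_ft - scale_min) / span
    true_pct = 100.0 * (true_ft - scale_min) / span
    # The 1/8 offset keeps the logarithm defined for a perfect response.
    return math.log2(abs(judged_pct - true_pct) + 1.0 / 8.0)
```

A perfect response yields log2(1/8) = -3, the floor of the measure, while a click that is off by 10 feet (1% of the scale) yields log2(1.125), roughly 0.17.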

We removed three participants from the analysis (amounting to 3.3% of subjects) because their overall accuracy was two standard deviations below the mean accuracy for all participants (M = 85.36%, STD = 9.48%). We analyze the results by fitting the log of error to a linear mixed-effects model comprising two fixed effects (colormap, frequency) and two random effects. The first random effect accounts for individual variations among participants, and the second for intra-trial variations (recall that trials comprised different test elevations).

[Figure 4 panels: one per colormap (greyscale, singlehue, bodyheat, cubehelix, extbodyheat, coolwarm, rainbow, spectral, blueyellow); x-axis: spatial frequency (3 to 11); y-axis: log(error); series: experiment vs. model.]

Figure 4. Mean log of error in quantity estimation (experiment 1 vs. model). Ribbons represent 95% CIs of the experimental results.

[Figure 5 panels: log(error) by colormap, and log(error) by spatial frequency (3 to 11) with one series per colormap.]

Figure 5. Mean log error in quantity estimation by colormap (left) and by colormap × spatial frequency. Intervals are 95% CIs.

Figure 4 illustrates the mean log error by colormap and spatial frequency (the dashed trendline represents the model). The experimental results are combined in Figure 5 to ease comparison. A likelihood ratio test indicates the overall model is significant (χ²(18) = 1481.7, p < 0.001). To test for interaction between spatial frequency and colormap, we fit a reduced model that accounts for both frequency and colormap, but not their interaction. There was no significant difference between the full and the reduced model (χ²(8) = 10.695, p = 0.219), thus ruling out an interaction between colormap and spatial frequency. We will therefore interpret the reduced model, which accounts for both factors independently. Table 3 illustrates the model coefficients.
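The p-value of the full-versus-reduced comparison can be checked by hand from the reported statistic. The sketch below is not the authors' analysis code; it only evaluates the chi-square upper tail, using the closed-form series that holds when the degrees of freedom are even, so no statistics library is needed. The function name is illustrative.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form below requires a positive, even df")
    half = x / 2.0
    # P(X > x) = exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

# Full vs. reduced model: likelihood ratio statistic 10.695 on 8 degrees of freedom.
print(round(chi2_sf_even_df(10.695, 8), 3))
```

The result is about 0.22, in line with the reported p = 0.219 and well above the conventional 0.05 threshold, which is why the interaction term is dropped.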

The model predicts that a step-increase in spatial frequency yields a 0.11 increase in the log of estimation error. The difference in estimation error between the highest (f=11) and lowest (f=3) frequency levels is approximately 0.9 orders of magnitude. The effect of color encoding was equally evident: all the colormaps were significantly better than greyscale. However, the gain in accuracy was markedly different between the colormaps. Rainbow had the largest impact on estimation accuracy, reducing error by approximately 2.3 orders of magnitude compared to greyscale. The runner-up was spectral, which also contains substantial hue variation. However, spectral reduced error by 1.75 orders of magnitude only. On the other hand, Spiral colormaps (extbodyheat, cubehelix), which comprise multiple hues over a monotonically increasing luminance, decreased estimation errors by approximately 1.3–1.4 orders of magnitude, compared to 0.8–1.2 for Diverging ramps (blueyellow, coolwarm). Sequential schemes (singlehue and bodyheat) had the least impact on error, with a mere improvement of 0.26–0.71 orders of magnitude relative to greyscale.

| Coefficient | Estimate | \|t value\| | p |
|---|---|---|---|
| (Intercept) | 1.77 | 6.628 | |
| singlehue | -0.26 | 2.253 | |
| bodyheat | -0.71 | 6.118 | |
| cubehelix | -1.27 | 16.467 | |
| extbodyheat | -1.39 | 12.073 | |
| coolwarm | -1.18 | 10.122 | |
| rainbow | -2.33 | 30.264 | |
| spectral | -1.75 | 15.172 | |
| blueyellow | -0.80 | 6.871 | |
| Spatial Frequency | 0.11 | 16.967 | |

Table 3. Effects of colormap and spatial frequency on the log of error in quantity estimation. The intercept represents greyscale as colormap (*** = p < 0.001, ** = p < 0.01, * = p < 0.05).

[Figure 6 axes: x-axis: spatial frequency (3 to 11); y-axis: average gradient.]

Figure 6. Average local gradient (i.e., terrain slope) at locations selected by participants. Error bars are 95% CIs.
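The reduced model's fixed effects can be turned into predictions directly from the coefficients in Table 3. The sketch below assumes that spatial frequency enters the model in its raw units (3 to 11) and ignores the random effects; the dictionary and function names are illustrative.

```python
# Fixed-effect estimates from Table 3 (greyscale is the reference level).
INTERCEPT = 1.77
FREQUENCY_SLOPE = 0.11
COLORMAP_EFFECT = {
    "greyscale": 0.0,
    "singlehue": -0.26,
    "bodyheat": -0.71,
    "cubehelix": -1.27,
    "extbodyheat": -1.39,
    "coolwarm": -1.18,
    "rainbow": -2.33,
    "spectral": -1.75,
    "blueyellow": -0.80,
}

def predicted_log_error(colormap, frequency):
    """Predicted log2 error for a colormap at a given spatial frequency."""
    return INTERCEPT + COLORMAP_EFFECT[colormap] + FREQUENCY_SLOPE * frequency

# Gap between the highest and lowest frequency levels: 0.11 * (11 - 3) = 0.88.
gap = predicted_log_error("rainbow", 11) - predicted_log_error("rainbow", 3)

# Colormaps ranked from most to least accurate at any fixed frequency.
ranking = sorted(COLORMAP_EFFECT, key=lambda name: predicted_log_error(name, 7))
```

Because the model has no interaction term, the 0.88 gap (the "approximately 0.9" quoted in the text) is the same for every colormap, and the ranking, which starts with rainbow and spectral, does not change with frequency.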

The fact that we did not find interaction between colormap and spatial frequency implies that the relative effectiveness of the different colormaps is stable across all spatial frequency levels tested. Rainbow is thus expected to be the most accurate colormap for quantity estimation, regardless of how spatially complex the data is. However, estimation accuracy will decrease comparably for all colormaps as the data becomes more spatially varied. This could reflect a combination of perceptual and motor difficulty in locating and clicking the intended location, due to the larger local gradients encountered in high-frequency maps (see Figure 6).
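The stability of the ranking follows directly from the additive form of the reduced model. The Python sketch below evaluates the fixed-effects part of that model with the Table 3 coefficients. It assumes that frequency enters as its raw level (3 to 11), which is consistent with the reported gap of roughly 0.9 between f=3 and f=11. The function names are illustrative and random effects are ignored.

```python
# Coefficients of the reduced (no-interaction) model, as reported in Table 3.
# Greyscale is the reference level, so its offset is 0.
INTERCEPT = 1.77
FREQ_SLOPE = 0.11
COLORMAP_OFFSET = {
    "greyscale": 0.0, "singlehue": -0.26, "bodyheat": -0.71,
    "cubehelix": -1.27, "extbodyheat": -1.39, "coolwarm": -1.18,
    "rainbow": -2.33, "spectral": -1.75, "blueyellow": -0.80,
}

def predicted_log_error(colormap, frequency):
    """Fixed-effects prediction; assumes frequency is the raw level (3..11)."""
    return INTERCEPT + COLORMAP_OFFSET[colormap] + FREQ_SLOPE * frequency

def ranking(frequency):
    """Colormaps ordered from lowest to highest predicted log error."""
    return sorted(COLORMAP_OFFSET, key=lambda c: predicted_log_error(c, frequency))

# Gap between the highest and lowest frequency levels for any one colormap.
span = predicted_log_error("greyscale", 11) - predicted_log_error("greyscale", 3)
print(round(span, 2))
# Without an interaction term the ordering cannot change with frequency.
print(ranking(3) == ranking(11))
```

Because every colormap shares the same frequency slope, the offsets alone determine the order, and rainbow stays first at every level.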

EXPERIMENT 2: GRADIENT PERCEPTION

The second experiment tests participants’ accuracy in comparing and judging the steepness of gradients. The ability to judge how fast the encoded quantities change between adjacent map locations is important in many contexts.

Participants

We recruited 126 participants (50 females, 74 males, 2 others) with a mean age of 35.62 years (STD = 9.55 years). Participants had an overall success rate of 67.75% (STD = 11.41%). Ten participants (7.9% of subjects) were dropped from the analysis because their overall accuracy was worse than chance, having correctly answered less than 50% of trials in a two-alternative forced-choice experiment.

[Figure 7: nine panels, one per colormap (greyscale, singlehue, bodyheat, cubehelix, extbodyheat, coolwarm, rainbow, spectral, blueyellow), plotting p(success) against spatial frequency (3, 5, 7, 9, 11) for the experiment and the model.]

Figure 7. Probability of successful gradient judgment (experiment 2 vs. model). Ribbons represent 95% CIs of the experimental results.

[Figure 8: proportion of correct responses, plotted by spatial frequency (3, 5, 7, 9, 11) and by colormap.]

Figure 8. Percentage of correctly answered trials in a gradient perception task (experiment 2). Intervals are 95% CIs.

Procedure

Each trial consisted of a map with two squares juxtaposed on top (see Figure 3). Participants were prompted to “compare the steepness of terrain inside the two boxes” and “click on the box that is steeper on average”. The two boxes were identical in size (175×175 pixels, or 3.5°×3.5° of visual angle). However, terrain steepness, which was calculated by taking the average first derivative within each box, was varied systematically. The gradient ratio between the flatter and the steeper boxes was fixed to one of four levels: 0.8, 0.83, 0.86, 0.9 (±0.05). A lower ratio implies a larger, and potentially more perceptible, slope difference, making the task easier. However, the two boxes encompassed terrain with identical height ranges to reduce variability in the appearance of their peaks (a potential confound in slope judgment [26]).
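The steepness measure and the gradient ratio described above can be sketched as follows. This is a minimal Python illustration, not the stimulus-generation code used in the study. It assumes that "average first derivative" means the mean gradient magnitude estimated with central differences, and the toy height field is hypothetical.

```python
import math

def mean_slope(field, x0, y0, size):
    """Average gradient magnitude inside a size-by-size box whose top-left
    corner is (x0, y0). Uses central differences, so the box must not touch
    the border of the field."""
    total = 0.0
    for y in range(y0, y0 + size):
        for x in range(x0, x0 + size):
            dx = (field[y][x + 1] - field[y][x - 1]) / 2.0
            dy = (field[y + 1][x] - field[y - 1][x]) / 2.0
            total += math.hypot(dx, dy)
    return total / (size * size)

def gradient_ratio(slope_a, slope_b):
    """Ratio of the flatter box to the steeper box (at most 1)."""
    return min(slope_a, slope_b) / max(slope_a, slope_b)

# Toy field: a ramp rising 1.6 units per pixel on the left half and
# 2.0 units per pixel on the right half (hypothetical, for illustration).
width, height = 40, 20
field = [[1.6 * x if x < 20 else 32.0 + 2.0 * (x - 20) for x in range(width)]
         for _ in range(height)]
left = mean_slope(field, 2, 2, 10)    # box inside the 1.6 ramp
right = mean_slope(field, 25, 2, 10)  # box inside the 2.0 ramp
print(gradient_ratio(left, right))
```

In this toy case the ratio comes out at about 0.8, which corresponds to the easiest of the four difficulty levels.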

Participants first completed a set of 6 training trials that included feedback before proceeding to the main trials. The order of stimuli was similar to the previous experiment: the study consisted of 3 rounds, one with each of the 3 colormaps the participant was assigned to see. Each round encompassed all 5 spatial frequency levels. Participants completed 4 trials with each colormap and frequency combination, spanning a range of easy to difficult tests (a total of 60 trials). The order of colormap presentation was fully counterbalanced across participants.

Results

Figure 7 illustrates participants’ probability of correctly identifying the steeper gradient. The experimental data is shown separately in Figure 8. We fit the results to a logistic regression model comprising two fixed effects (colormap, frequency). The model also included two random effects to account for individual differences among participants and intra-trial variations (recall that trials varied in difficulty). The model essentially predicts the odds of correctly identifying the steeper gradient. A likelihood ratio test indicates the overall model is significant (χ²(17) = 421.65, p < .001). The model correctly predicts 74.67% of outcomes. We find significant interaction between colormap and spatial frequency (χ²(8) = 27.81, p < .001). Table 4 shows model coefficients.

(a) Main effects

Coef.          Est.    |z|     p
(Intercept)    0.80    0.516
singlehue      1.14    0.415
bodyheat       0.94    0.183
cubehelix      0.90    0.354
extbodyheat    1.13    0.388
coolwarm       0.66    1.316
rainbow        0.65    1.354
spectral       0.92    0.259
blueyellow     0.91    0.281
Frequency      1.15    4.633

(b) Interaction effects (colormap x frequency)

Coef.          Est.    |z|     p
singlehue      0.96    0.910
bodyheat       1.04    1.020
cubehelix      1.06    1.303
extbodyheat    1.03    0.596
coolwarm       1.14    3.008
rainbow        1.13    2.876
spectral       1.06    1.330
blueyellow     1.12    2.471

Table 4. Main effects of colormap and spatial frequency on success odds in gradient judgment (a) and their interaction (b). Coefficients shown correspond to the exponentiated model estimates to reflect odds ratios. The intercept represents greyscale (∗∗∗ = p < .001, ∗∗ = p < .01, ∗ = p < .05).

The main-effect coefficients for all colormaps were not significant, indicating that all colormaps perform comparably to greyscale at low spatial frequencies. Participants are thus unlikely to benefit from the use of color when judging gradients in low-variance data. The main effect of spatial frequency, however, is significant. The model estimates that a step-increase in spatial frequency improves the odds of correct judgment by 15%. Estimating gradients appears to be easier in maps with more complex spatial structures.

The model indicates several noteworthy interactions. Although the use of color had no significant effect in low-frequency maps, several colormaps significantly outperformed greyscale at high frequency. The divergent coolwarm improved participants’ success odds by 14% for every step-increase in spatial frequency. Similarly, rainbow and blueyellow increased the odds by approximately 13% and 12%, respectively. Notably, these three colormaps contain substantial variation in saturation (coolwarm and blueyellow) or hue (rainbow). All other colormaps tested were not reliably different from greyscale.
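To see how a per-step odds ratio compounds across frequency levels, the Python sketch below combines the main-effect and interaction odds ratios from Table 4 for the three colormaps with significant interactions. It assumes that frequency enters the model as its raw level (3 to 11) and it ignores the random effects, so the numbers are illustrative rather than model predictions. The function name is hypothetical.

```python
# Odds ratios from Table 4 (greyscale is the reference level).
MAIN = {"coolwarm": 0.66, "rainbow": 0.65, "blueyellow": 0.91}
INTERACTION = {"coolwarm": 1.14, "rainbow": 1.13, "blueyellow": 1.12}

def odds_vs_greyscale(colormap, frequency):
    """Odds of a correct judgment relative to greyscale at the same frequency.

    Assumes frequency is the raw level (3..11) and ignores random effects.
    """
    return MAIN[colormap] * INTERACTION[colormap] ** frequency

for f in (3, 5, 7, 9, 11):
    row = ", ".join(f"{c}: {odds_vs_greyscale(c, f):.2f}" for c in MAIN)
    print(f"f={f}: {row}")
```

Values below 1 at low frequency should not be read as reliable deficits, since the main-effect estimates were not significant. The point of the sketch is the growth of the relative odds as frequency rises.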

EXPERIMENT 3: PATTERN PERCEPTION

Having tested accuracy in quantity estimation and gradient perception, we now evaluate participants’ ability to integrate these two skills. Experiment 3 required participants to extract a longitudinal pattern from the map and match it to an external representation, a task originally devised by Hyslop [18].

Participants

We recruited 165 participants (79 females, 84 males, 2 others). The mean participant age was 36.04 years (STD = 11.71). Overall, participants had a mean success rate of 78.51% in matching the correct pattern (STD = 19.31%). We dropped seven participants from the analysis (4.2% of subjects) whose overall accuracy was two standard deviations below the mean.

[Figure 9: nine panels, one per colormap (greyscale, singlehue, bodyheat, cubehelix, extbodyheat, coolwarm, rainbow, spectral, blueyellow), plotting p(success) against spatial frequency (3, 5, 7, 9, 11) for the experiment and the model.]

Figure 9. Probability of successful pattern matching (experiment 3 vs. model). Ribbons denote 95% CIs of the experimental data.

[Figure 10: proportion of correct responses, plotted by spatial frequency (3, 5, 7, 9, 11) and by colormap.]

Figure 10. Percentage of correctly answered trials in experiment 3. Intervals are 95% CIs.

Procedure

Participants first completed a set of 6 training trials that included feedback before proceeding to the main experiment. Each trial consisted of a map with two markers labeled A and B (see Figure 3). The markers were horizontally displaced by 350 pixels (7° of visual angle). Participants were given the following prompt: “Imagine a line from A to B. Select the elevation profile below that most closely matches its slope.” They then selected a choice among a set of 6 patterns, including the actual elevation profile and 5 distractors. Distractors were generated from the same map, so as to reflect similar spatial frequency characteristics, and had to be 65-70% similar to the actual profile (as measured by dynamic time warping [20]). Additionally, profiles and distractors were selected to not have peaks or valleys at the endpoints. These criteria, determined after a pilot, ensure similar task difficulty across the trials.
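The similarity criterion for distractors relies on dynamic time warping. The following Python sketch shows the textbook form of the distance, with absolute difference as the local cost and no warping window. How the study converted this distance into a 65-70% similarity score is not stated, so the sketch stops at the raw distance, and the example profiles are hypothetical.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences,
    using absolute difference as the local cost and no window constraint."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # a advances
                                     cost[i][j - 1],      # b advances
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]

profile = [0.0, 1.0, 2.0, 1.0, 0.0]
stretched = [0.0, 1.0, 1.0, 2.0, 1.0, 0.0]  # same shape, one sample repeated
shifted = [1.0, 2.0, 3.0, 2.0, 1.0]         # same shape, raised by 1
print(dtw_distance(profile, stretched), dtw_distance(profile, shifted))
```

A profile that is merely stretched in time has zero distance to the original, whereas a vertical offset does not, which is the property that makes the measure suitable for comparing elevation profiles by shape.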

The order of stimuli was similar to the previous two experiments: the study consisted of 3 rounds, one with each of the 3 colormaps the participant was assigned to see, and encompassing the 5 spatial frequency levels. Thus, every participant saw 3×5 colormap and frequency combinations, and completed 3 pattern matching trials with each combination, for a total of 45 trials. As in the previous experiments, the order of colormap presentation was fully counterbalanced.

Results

We fit the results to a logistic regression model comprising two fixed effects (colormap, frequency) and two random effects to account for individual differences and intra-trial variations. Figure 9 shows the odds of successful profile matching. The experimental results are illustrated separately in Figure 10. A likelihood ratio test indicates the model is significant (χ²(17) = 394.67, p < .001). We find significant interaction between colormap and spatial frequency (χ²(8) = 18.131, p < .05); the relative effectiveness of the colormaps appears to vary with spatial frequency.

(a) Main effects

Coef.          Est.    |z|     p
(Intercept)    9.03    6.502
singlehue      1.34    0.649
bodyheat       0.80    0.503
cubehelix      0.77    0.681
extbodyheat    0.81    0.466
coolwarm       0.44    1.868
rainbow        0.98    0.064
spectral       0.57    1.284
blueyellow     0.73    0.697
Frequency      0.93    2.058

(b) Interaction effects (colormap x frequency)

Coef.          Est.    |z|     p
singlehue      0.98    0.450
bodyheat       1.02    0.456
cubehelix      1.09    1.781
extbodyheat    1.04    0.797
coolwarm       1.17    3.143
rainbow        1.01    0.259
spectral       1.11    2.198
blueyellow     1.09    1.683

Table 5. Main effects of colormap and spatial frequency on success odds in experiment 3 (a) and their interaction (b). Coefficients depict exponentiated model estimates to reflect odds ratios. The intercept corresponds to greyscale (∗∗∗ = p < .001, ∗∗ = p < .01, ∗ = p < .05, with a further marker for p < .1).

Overall, we find a significant detrimental main effect of spatial frequency on pattern perception, as indicated by a 0.93 Frequency coefficient (Table 5a). This translates to a 7% drop in the odds of correctly matching the profile for every step-increase in spatial frequency. The main-effect coefficients for all colormaps were not significant, indicating that the use of color at low spatial frequency is unlikely to improve pattern perception as compared to a plain greyscale ramp.

Colormap performance begins to diverge at high spatial frequency. Only two colormaps have significant and large enough odds-ratio coefficients (i.e., > 1.07) to overcome the frequency-induced perceptual difficulty: spectral and coolwarm increased the odds of correct pattern matching by 11% and 17%, respectively, for every step-increase in spatial frequency (after adjusting for frequency effects alone). Additionally, blueyellow and cubehelix were associated with a 9% improvement, but the advantage was not reliable (p < .1). On the other hand, extbodyheat, bodyheat, and rainbow had small (and insignificant) odds-ratio coefficients (0.98–1.04 < 1.07), indicating that, similar to greyscale, they are associated with lower success odds in complex maps.
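The 1.07 threshold can be derived from the frequency coefficient: an interaction odds ratio has to exceed 1/0.93 for the net per-step effect to be positive. The Python sketch below works through that arithmetic with the Table 5 estimates. It is an illustration of how the two coefficients multiply, not a reproduction of the original analysis, and the function name is hypothetical.

```python
FREQUENCY_OR = 0.93  # main effect of frequency (greyscale baseline), Table 5a
INTERACTION_OR = {   # colormap x frequency odds ratios, Table 5b
    "singlehue": 0.98, "bodyheat": 1.02, "cubehelix": 1.09, "extbodyheat": 1.04,
    "coolwarm": 1.17, "rainbow": 1.01, "spectral": 1.11, "blueyellow": 1.09,
}

def net_step_odds_ratio(colormap):
    """Change in success odds per frequency step for a given colormap."""
    # Greyscale has no interaction term, so it falls back to 1.0.
    return FREQUENCY_OR * INTERACTION_OR.get(colormap, 1.0)

break_even = 1.0 / FREQUENCY_OR  # interaction OR needed to cancel the decline
print(f"break-even interaction OR: {break_even:.3f}")
for name in ["greyscale", *INTERACTION_OR]:
    print(f"{name:12s} {net_step_odds_ratio(name):.3f}")
```

Under this arithmetic cubehelix and blueyellow also land slightly above 1, but their interaction terms were not reliable, so only coolwarm and spectral are treated as overcoming the frequency-induced decline.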

In short, only two of the tested colormaps (coolwarm and spectral) appear to reliably support pattern perception at high spatial frequency. Both consist of a diverging ramp with uniformly-stepped luminance. All other colormaps (including greyscale) suffered as data complexity increased.

DISCUSSION AND GUIDELINES

Our work sheds new light on how spatial complexity impacts the perception of continuous color-coded maps. The experiments also led to some surprising findings that are at odds with current guidelines. We interpret these results and accordingly devise new task- and frequency-aware color mapping guidelines (indicated by ★). We also rank the tested colormaps and summarize our guidelines in Figure 11.

[Figure 11: colormaps ranked from better to worse for each task. Quantity estimation: ranking unaffected by spatial frequency. Gradient perception and pattern perception: rankings shown for low spatial frequency (0.38 cycle/deg) and high spatial frequency (1.38 cycle/deg), with annotations marking no significant difference. Color map guidelines: (1) Maximize range of saturated hues, regardless of spatial frequency. (2) At high spatial frequency: fully-saturated hues or diverging ramps with chroma variation. (3) At high spatial frequency: diverging ramps with uniformly-stepped luminance.]

Figure 11. Model-derived colormap ranking and guidelines, by task and spatial frequency (∗ = p < .05, with a further marker for p < .1, relative to greyscale).

Quantity Estimation

Our first hypothesis (H1) predicts hue- and saturation-varying ramps to be more accurate at low spatial frequencies, and ramps with monotonically increasing luminance to be more accurate at high frequencies. As discussed, H1 is based on the relative contrast-sensitivity of our visual system [32]. A quantity estimation task (experiment 1) shows no interaction between colormap and spatial frequency. While increased spatial complexity is associated with higher estimation error, the effect is similar across all colormaps. We thus reject H1.

On the other hand, results provide support for H2, which predicts that hue-varying ramps will lead to more accurate estimation. Indeed, the top-performing colormaps (rainbow and spectral) contain substantial hue variation. Results from experiment 1 thus replicate earlier findings by Ware [41], but also extend them to show that spatial frequency has no apparent impact on the effectiveness of hue-varying ramps. Our data shows that rainbow and spectral are the most accurate among the colormaps tested, even at the highest levels of spatial frequency. Altogether, these results lend further support to the theory that lookup errors in color-coded maps are largely caused by systematic simultaneous contrast shifts [41], rather than being affected by contrast sensitivity modulation [32]. These shifts are best counteracted with colormaps that vary non-monotonically along one or more perceptual channels.

A corollary result is that mixing monotonic luminance with hue variation would lead to significant accuracy loss. Indeed, data from experiment 1 indicates that rainbow is approximately an order of magnitude more accurate than extbodyheat and cubehelix. These Spiral colormaps are designed to be more accurate rainbow alternatives for interval data [4, 24]. On the contrary, we find that they reduce accuracy compared to a purely hue-varying ramp. This finding suggests that when estimating a continuously coded spatial quantity, people benefit most from a large dynamic hue range. Incorporating monotonically increasing lightness within the colormap would necessarily reduce the hue range, thereby diminishing accuracy.

★ Guideline 1: We recommend maximizing hue variation to improve quantity estimation, irrespective of spatial frequency.

Gradient Perception

Gradient perception allows people to distinguish how quickly the encoded attribute changes between adjacent locations, an essential skill when evaluating the distribution and variance of spatial data. We find that the task is strongly modulated by the data’s spatial complexity: increased spatial frequency appears to enhance the perception of gradients. This is unsurprising, as maps with jagged surfaces are likely to exhibit more pronounced (and thus more perceptible) differences in slope. Colormap effectiveness was also impacted by spatial frequency: color encoding did not help participants distinguish gradients at low frequency levels, as all colormaps showed similar performance to greyscale. However, three colormaps demonstrated a significant advantage at high frequencies. Coolwarm, rainbow, and blueyellow improved perception odds by 12-14% for every step-increase in spatial frequency. All three employed one of two design strategies: a diverging ramp with varying saturation, or a fully saturated hue rotation.

The above results contradict H1, which predicts hue- and saturation-varying colormaps to perform better at low frequencies. In fact, we see the opposite. The results also do not support H3, which predicts better performance for monotonically-luminant ramps in structure perception tasks. In fact, all three top-performing ramps exhibit non-monotonic luminance.

★ Guideline 2: For tasks requiring gradient perception at high spatial frequency, we recommend a range of fully saturated hues (e.g., rainbow) or diverging, chroma-varying ramps (e.g., coolwarm or blueyellow).

Pattern Perception

Experiment 3 prompted participants to match the elevation profile along a horizontal path with an external pattern. We expected colormaps with monotonically increasing luminance to be more accurate at this task (H3), but results were not entirely consistent with this prediction. While all tested colormaps had comparable performance at low spatial frequency, only two colormaps, coolwarm and spectral, gave participants higher odds of successfully matching the pattern at high frequency. Both colormaps comprise a diverging ramp with uniformly-stepped (though not strictly monotonic) luminance. By contrast, sequential and spiral ramps performed just as poorly as greyscale in complex maps, and so did rainbow.

The above results are consistent with Moreland’s argument that diverging ramps provide “maximal perceptual resolution” (through increasing and decreasing luminance intervals) [24], potentially enabling high-frequency patterns to be resolved more easily. Our results may also explain why diverging schemes performed better in medical diagnosis [3]: we suspect such tasks to require the analysis of potentially high-frequency features (e.g., small tissue aberrations).

★ Guideline 3: We recommend diverging ramps with equidistant luminance steps (e.g., coolwarm and spectral) to support the perception of longitudinal patterns at high spatial frequency. Rainbow, Sequential, and Spiral schemes should be avoided in complex maps, especially if the task involves the analysis and matching of fine-grained features.

Yet Another Look at the Rainbow

Results of experiments 1 and 2 may shed light on why rainbow remains a popular choice among scientists [25], despite being considered a bad choice by the visualization community [4, 30]. Our data reveals that, counterintuitively, rainbow is robust for estimating a smoothly varying quantitative attribute, regardless of spatial complexity. Moreover, rainbow provides good support for gradient estimation at high spatial frequency. These two tasks correspond to elementary visual analytic primitives, including characterizing distributions, determining ranges, and filtering [1]. Moreover, studies show that when experts attempt to form a mental model about a visualization, they first go through a time-consuming process of extracting quantitative data “at a rather detailed level” [38]. For instance, a weather forecaster will look up pressure and wind changes, estimating current readings at landmark locations in the map, before making a forecast. Our data suggests that rainbow provides good support for these tasks, making it a potentially reasonable choice for weather forecasters.

Critique of rainbow centers on its tendency to create sharp visual boundaries, particularly around its yellow regions [4]. Experts also criticize the use of fully saturated hues [24], which result in non-uniform perceptual steps within the color ramp. The common intuition is that these two factors combined will inevitably distort the perception of quantities. We do not see evidence to support this hypothesis. In fact, to the contrary, attempts to ‘linearize’ the rainbow by monotonically increasing the luminance of hues could reduce estimation accuracy by up to an order of magnitude.

★ Guideline 4: Rather than entirely discouraging the use of rainbow, we suggest that it can be a reasonable design choice for conveying spatial distributions and variances, and in tasks that require quantitative, as opposed to geometric, precision. However, rainbow has a number of limitations. The use of green and red hues is problematic for people with color deficiency. Moreover, rainbow is probably ineffective at revealing high-frequency patterns. Interestingly, these shortcomings are balanced by diverging ramps (e.g., coolwarm), which, although quantitatively inaccurate, appear to support pattern perception at high spatial frequency. We thus argue that hue-varying and diverging colormaps support orthogonal tasks in continuous maps, and should therefore be considered as complementary rather than mutually exclusive choices.

LIMITATIONS AND FUTURE WORK

There are some limitations to our work that should be considered. First, as with other crowdsourced graphical perception studies, we gain access to a larger pool of participants but sacrifice some experimental control [16]. Particularly relevant to our study are the variations in participants’ monitors, including color calibration and display resolution, as well as the illumination conditions in their homes or offices, all of which can impact color perception. We could not control these factors, but attempted to counteract their variation by involving a larger sample (N=381). Although we expect crowdsourcing to improve the ecological validity of results and guidelines, uncontrolled variations can potentially reduce our ability to detect small but otherwise significant differences in performance between tested conditions. Future lab studies should therefore attempt to replicate our findings with added controls.

Second, our study employed a limited set of tasks designed to measure elementary perceptual operators, including quantity estimation, gradient perception, and pattern matching. There is an opportunity to test higher-level tasks that mimic scientific analyses more closely, including the identification and comparison of larger map features (e.g., fronts, ridges). Additionally, some of the tasks we tested could be re-evaluated in more authentic formulations. For instance, a metric task could require participants to estimate the quantity at a specific location on the map. This formulation is arguably more realistic than the task we tested, which simply asked participants to click any location thought to match a specified quantity.

Third, our analysis was focused exclusively on spatial frequency, and there are good reasons to consider this factor [11, 32]. However, there are also additional data characteristics to consider, including, for instance, the distribution of amplitudes within the map. Such factors will influence the distribution of colors in the image, and may thus impact perception.

Lastly, we limited our study to synthetically generated scalar fields to precisely vary spatial frequency while controlling for other confounds. However, synthetic stimuli may also introduce (unknown) perceptual or cognitive biases. Therefore, additional studies are needed to replicate our findings with datasets from real-world domains (e.g., meteorology, geophysics, or oceanography) and with domain experts. We also restricted this study to participants with normal color vision. Therefore, our results may not generalize to approximately 5% of the population who have some form of color deficiency.

CONCLUSIONS

We conducted three experiments to investigate the effects of spatial frequency and colormap characteristics on the perception of continuous pseudocolor maps. Our results indicate that spatial frequency impacts judgment of the encoded quantities and structures. While viewers’ quantity estimation accuracy exhibited a predictable response, increased data complexity had a more nuanced effect on gradient and pattern comprehension, the impact of which was dependent on the colormap used. Designers should therefore consider both the type of task and the spatial complexity of the underlying data. We re-examined current guidelines and devised new recommendations for color-coding of continuous spatial data.

REFERENCES

1. Robert Amar, James Eagan, and John Stasko. 2005. Low-level components of analytic activity in information visualization. In Information Visualization, 2005. INFOVIS 2005. IEEE Symposium on. IEEE, 111–117.

2. Lawrence D. Bergman, Bernice E. Rogowitz, and Lloyd A. Treinish. 1995. A rule-based tool for assisting colormap selection. In Proceedings of the 6th conference on Visualization ’95. IEEE Computer Society, 118.

3. Michelle Borkin, Krzysztof Gajos, Amanda Peters, Dimitrios Mitsouras, Simone Melchionna, Frank Rybicki, Charles Feldman, and Hanspeter Pfister. 2011. Evaluation of artery visualizations for heart disease diagnosis. IEEE Transactions on Visualization and Computer Graphics 17, 12 (2011), 2479–2488.

4. David Borland and Russell M. Taylor II. 2007. Rainbow color map (still) considered harmful. IEEE Computer Graphics and Applications 27, 2 (2007).

5. Cynthia A. Brewer. 1994. Visualization in Modern Cartography. Elsevier Science, Chapter Color use guidelines for mapping and visualization.

6. Cynthia A. Brewer. 1996. Guidelines for selecting colors for diverging schemes on maps. The Cartographic Journal 33, 2 (1996), 79–86.

7. Cynthia A. Brewer, Alan M. MacEachren, Linda W. Pickle, and Douglas Herrmann. 1997. Mapping mortality: Evaluating color schemes for choropleth maps. Annals of the Association of American Geographers 87, 3 (1997), 411–438.

8. Roxana Bujack, Terece L. Turton, Francesca Samsel, Colin Ware, David H. Rogers, and James Ahrens. 2017. The Good, the Bad, and the Ugly: A Theoretical Framework for the Assessment of Continuous Colormaps. IEEE Transactions on Visualization and Computer Graphics (2017).

9. William S. Cleveland and Robert McGill. 1983. A color-caused optical illusion on a statistical graph. The American Statistician 37, 2 (1983), 101–105.

10. William S. Cleveland and Robert McGill. 1984. Graphical perception: Theory, experimentation, and application to the development of graphical methods. J. Amer. Statist. Assoc. 79, 387 (1984), 531–554.

11. Russell De Valois and Karen De Valois. 1990. Spatial Vision. Oxford University Press.

12. Russell L. De Valois, Duane G. Albrecht, and Lisa G. Thorell. 1978. Cortical cells: bar and edge detectors, or spatial frequency filters? In Frontiers in visual science. Springer, 544–556.

13. Connor C. Gramazio, David H. Laidlaw, and Karen B. Schloss. 2017. Colorgorical: Creating discriminable and preferable color palettes for information visualization. IEEE Transactions on Visualization and Computer Graphics 23, 1 (2017), 521–530.

14. D. A. Green. 2011. A colour scheme for the display of astronomical intensity images. Bulletin of the Astronomical Society of India 39 (June 2011), 289–295.

15. Mark Harrower and Cynthia A. Brewer. 2003. ColorBrewer.org: an online tool for selecting colour schemes for maps. The Cartographic Journal 40, 1 (2003), 27–37.

16. Jeffrey Heer and Michael Bostock. 2010. Crowdsourcing graphical perception: using mechanical turk to assess visualization design. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 203–212.

17. G. T. Herman and H. Levkowitz. 1992. Color scales for image data. IEEE Computer Graphics and Applications 12, 1 (1992), 72–80.

18. Michael D. Hyslop. 2006. A comparison of spectral color and greyscale continuous-tone map perception. Master’s thesis. Michigan State University.

19. Alan David Kalvin, Bernice E. Rogowitz, Adar Pelah, and Aron Cohen. 2000. Building perceptual color maps for visualizing interval data. In Human Vision and Electronic Imaging V, Vol. 3959. International Society for Optics and Photonics, 323–336.

20. Eamonn Keogh and Chotirat Ann Ratanamahatana. 2005. Exact indexing of dynamic time warping. Knowledge and Information Systems 7, 3 (2005), 358–386.

21. Gordon Kindlmann, Erik Reinhard, and Sarah Creem. 2002. Face-based luminance matching for perceptual colormap generation. In Proceedings of the IEEE Conference on Visualization ’02. IEEE Computer Society, 299–306.

22. Robert Kosara. 2016. An Empire Built On Sand: Reexamining What We Think We Know About Visualization. In Proceedings of the Beyond Time and Errors on Novel Evaluation Methods for Visualization. ACM, 162–168.

23. Mark P. Kumler and Richard E. Groop. 1990. Continuous-tone mapping of smooth surfaces. Cartography and Geographic Information Systems 17, 4 (1990), 279–289.

24. Kenneth Moreland. 2009. Diverging color maps for scientific visualization. In International Symposium on Visual Computing. Springer, 92–103.

25. Kenneth Moreland. 2016. Why We Use Bad Color Maps and What You Can Do About It. Electronic Imaging 2016, 16 (2016), 1–6.

26. Lace Padilla, P. Samuel Quinan, Miriah Meyer, and Sarah H. Creem-Regehr. 2017. Evaluating the Impact of Binning 2D Scalar Fields. IEEE Transactions on Visualization and Computer Graphics 23, 1 (2017), 431–440.

27. Stephen E. Palmer. 1999. Vision science: Photons to phenomenology. MIT press.

28. Ken Perlin. 1985. An image synthesizer. ACM SIGGRAPH Computer Graphics 19, 3 (1985), 287–296.

29. Linda Williams Pickle. 2003. Usability testing of map designs. In Proceedings of Symposium on the Interface of Computing Science and Statistics. 42–56.

30. P. Samuel Quinan and Miriah Meyer. 2016. Visually comparing weather features in forecasts. IEEE Transactions on Visualization and Computer Graphics 22, 1 (2016), 389–398.

31. Penny L. Rheingans. 2000. Task-based color scale design. In 28th AIPR Workshop: 3D Visualization for Data Exploration and Decision Making. International Society for Optics and Photonics, 35–43.

32. Bernice E. Rogowitz and Lloyd A. Treinish. 1994. Using perceptual rules in interactive visualization. In IS&T/SPIE 1994 International Symposium on Electronic Imaging: Science and Technology. International Society for Optics and Photonics, 287–295.

33. Bernice E. Rogowitz, Lloyd A. Treinish, Steve Bryson, and others. 1996. How not to lie with visualization. Computers in Physics 10, 3 (1996), 268–273.

34. Samuel Silva, Beatriz Sousa Santos, and Joaquim Madeira. 2011. Using color in visualization: A survey. Computers & Graphics 35, 2 (2011), 320–333.

35. Maureen Stone, Danielle Albers Szafir, and Vidya Setlur. 2014. An engineering model for color difference as a function of size. In Color and Imaging Conference, Vol. 2014. Society for Imaging Science and Technology, 253–258.

36. Danielle Albers Szafir. 2017. Modeling Color Difference for Visualization Design. IEEE Transactions on Visualization and Computer Graphics (2017).

37. Christian Tominski, Georg Fuchs, and Heidrun Schumann. 2008. Task-driven color coding. In Information Visualisation, 2008. IV’08. 12th International Conference. IEEE, 373–380.

38. J. Gregory Trafton, Susan S. Kirschenbaum, Ted L. Tsui, Robert T. Miyamoto, James A. Ballas, and Paula D. Raymond. 2000. Turning pictures into numbers: extracting and generating information from complex visualizations. International Journal of Human-Computer Studies 53, 5 (2000), 827–850.

39. Bruce E. Trumbo. 1981. A theory for coloring bivariate statistical maps. The American Statistician 35, 4 (1981), 220–226.

40. W3C. 2016. CSS Values and Units Module Level 3. (2016). http://www.w3.org/TR/css3-values/#absolute-lengths

41. Colin Ware. 1988. Color sequences for univariate maps: Theory, experiments and principles. IEEE Computer Graphics and Applications 8, 5 (1988), 41–49.

42. Colin Ware. 2012. Information visualization: perception for design. Elsevier.

43. Liang Zhou and Charles D. Hansen. 2016. A survey of colormaps in visualization. IEEE Transactions on Visualization and Computer Graphics 22, 8 (2016), 2051–2069.


[Figure 1 panels: five scalar fields with median frequencies f1 = 3, f2 = 5, f3 = 7, f4 = 9, f5 = 11, each with a log-log power spectrum; a combined power-spectrum plot; and a height distribution plot spanning low terrain to high terrain.]

Figure 1. Five example scalar fields used as stimuli in this study. The fields represent the height of procedurally generated terrain (brighter is higher) and are ordered according to their median spatial frequency (cycles per 8° of visual angle‡). Log-log plots depict the power spectra of each scalar field (combined in the rightmost plot to aid comparison). The position of the median frequency, which splits the power distribution into two approximately equal halves, is illustrated with a vertical line. Although different with respect to spatial frequency characteristics, the maps are very similar in their distribution of height amplitudes (shown in the top-right plot).

schemes to varying degrees. Rogowitz et al. argue for two colormap design strategies depending on spatial frequency: they recommend ramps with monotonically increasing luminance for datasets containing high spatial frequency, and hue- or saturation-varying ramps for low-frequency data [32]. Put differently, we would expect sequential and spiral colormaps (e.g., cubehelix) to perform better in scalar fields that have rough surfaces and narrow features. Conversely, we can expect hue-varying ramps (e.g., rainbow) to work well with maps that have broad surfaces and low variance. This guideline is consistent with color difference experiments that tested viewers' sensitivity to frequency-modulated Gabor patches [19]. However, it is doubtful whether such experimental results (and the ensuing guideline) generalize to visual analysis tasks on actual scalar fields.

Summary

Color-coding guidelines for continuous maps often come in the form of advice that discourages the use of rainbow [4, 25] and suggests perceptually uniform alternatives [24]. This clinical, omnibus advice is largely based on expert intuition and design heuristics. However, the literature paints a more complex picture and suggests the choice of color encoding should be based on both task [31, 37] and data characteristics, including spatial frequency [32, 2]. This paper provides a first experimental account of the impact of spatial frequency on people's ability to estimate quantities and perceive patterns in quantitative maps. By studying how spatial complexity modulates the effectiveness of colormap designs, we can devise more nuanced guidelines that are responsive to both data characteristics and viewers' information needs (i.e., tasks).

HYPOTHESES

Building on prior research, we developed three hypotheses:

H1: We expect colormaps comprising large hue variations to be perceived more accurately in scalar fields containing low spatial frequency. Conversely, we expect ramps with monotonically increasing luminance to yield higher accuracy in high-frequency data. These predictions are based on the contrast sensitivity of our visual perceptual system, which responds more robustly to chromatic and hue variation when assessing broad, smooth surfaces, and to lightness differences when resolving small features [32].

H2: In quantity estimation tasks (experiment 1), where the goal is to estimate quantities at specific locations, we expect colormaps having substantial hue variation to perform better. This conjecture assumes that hue-varying ramps will register sinusoidal variations along the chromatic opponent-process channels. Such non-monotonic variations reduce simultaneous contrast shifts because they are less likely to systematically weigh chromatic processing in a particular direction [41].

H3: In tasks requiring the comprehension of forms and structures (experiments 2 and 3), we expect colormaps having monotonically increasing luminance to perform best. Our visual system infers surface information largely from shading cues and luminance variation [27]. Therefore, colormaps that exert linear control over their luminance can be expected to portray forms and structures more effectively.

Although there is existing evidence to back H2 (see an earlier study by Ware [41]), our work aims to replicate and extend these results to account for spatial frequency. To that end, H1 provides a broader (yet untested) prediction of how spatial frequency might impact the performance of continuous colormaps.

METHODOLOGY

In the following sections, we present the results of three crowdsourced experiments to test the above hypotheses. Specifically, we investigate whether the effectiveness of commonly prescribed colormaps is modulated by the spatial frequency of the data, and the degree to which this relationship is influenced by variations in luminance, hue, and saturation within the color ramp. Each experiment tests one specific task against nine colormaps and at increasing levels of spatial frequency. The first experiment measures participants' ability to estimate values at specific locations in the map. The second experiment tests participants' accuracy in comparing gradients in larger map swaths. The third and final experiment is aimed at evaluating participants' ability to perceive and match longitudinal patterns in the map. Before delving into the details, we describe our experimental design and stimulus generation procedure.

[Figure 2 panels: the nine colormaps (greyscale, singlehue, bodyheat, cubehelix, extbodyheat, coolwarm, spectral, rainbow, blueyellow), each applied to scalar fields of increasing spatial frequency and profiled by luminance, saturation, and CIELAB a (green-red) and b (blue-yellow).]

Figure 2. We tested nine colormaps selected to encompass a variety of design characteristics (illustrated by variation in luminance, saturation, and/or hue). Each colormap was tested with multiple scalar fields corresponding to increasing levels of spatial frequency.

Stimuli

We employ digital elevation models (DEM) as stimuli for the three experiments. A DEM represents land elevation, with cells in the 2D scalar field representing terrain height at the corresponding locations. To maintain precise control over spatial frequency and task difficulty, we synthetically generate DEMs using Perlin noise [28], mixing five octaves of the noise function to produce seemingly realistic terrain. The resulting DEMs are then normalized so that their heights span the entire colormap range. All generated maps were 820×630 pixels in size (approximately 16°×13° of visual angle‡).

By varying the scale of the noise function, we obtain scalar fields with different spatial frequency characteristics. The latter is measured by first computing a Fast Fourier Transform (FFT) over the scalar field and calculating the relative contribution of each carrier frequency from the magnitude of its FFT vector. The result can be illustrated with a power spectrum plot for each individual scalar field, with spatial frequency on the x-axis and the contribution of the frequency component on the y-axis. Low-frequency fields exhibit a more pronounced right skew in their power spectra. We compute the position of the median spatial frequency, which splits the power spectrum into approximately equal halves, and use it to order the fields. Fields with larger median frequencies indicate more varied and complex terrain structure. This procedure enabled us to synthesize scalar fields with very similar height distributions while providing precise control over spatial frequency. Figure 1 illustrates examples generated using this method.
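The median-frequency measure described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes a square field, uses squared FFT magnitudes as the per-frequency contribution, bins frequencies by integer radius, and relies on a naive DFT (the helper names `dft`, `dft2`, and `median_frequency` are hypothetical) so that it runs without third-party libraries.

```python
import cmath
import math


def dft(values):
    """Naive O(n^2) discrete Fourier transform of a 1D sequence."""
    n = len(values)
    return [
        sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(values))
        for k in range(n)
    ]


def dft2(field):
    """2D transform: rows first, then columns (returned in row-major order)."""
    rows = [dft(row) for row in field]
    cols = [dft(col) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def median_frequency(field):
    """Radial frequency (cycles per field width) that halves the spectral power.

    Assumes a square field; the mean is removed so the DC term carries no power.
    """
    size = len(field)
    mean = sum(map(sum, field)) / (size * size)
    spectrum = dft2([[v - mean for v in row] for row in field])

    power = {}  # integer radial frequency -> summed squared magnitude
    for ky in range(size):
        for kx in range(size):
            fy = min(ky, size - ky)  # fold negative frequencies onto positive
            fx = min(kx, size - kx)
            radius = round(math.hypot(fx, fy))
            power[radius] = power.get(radius, 0.0) + abs(spectrum[ky][kx]) ** 2

    total = sum(power.values())
    running = 0.0
    for radius in sorted(power):
        running += power[radius]
        if running >= total / 2:
            return radius
    return 0


# Synthetic check: a field with three horizontal cycles across its width.
N = 16
wave = [[math.cos(2 * math.pi * 3 * x / N) for x in range(N)] for _ in range(N)]
print(median_frequency(wave))
```

For full-size stimuli a real FFT routine would replace the naive transform; the cumulative-power step stays the same. Note that the paper reports frequency in cycles per 410 pixels, whereas this sketch counts cycles per field width.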

Colormaps

We chose nine commonly-used colormaps (listed in Table 1 and illustrated in Figure 2). In addition to a greyscale baseline, the colormaps selected reflect five design strategies:

– Sequential: monotonically increasing luminance over a limited number of hues (singlehue, bodyheat).

– Spiral: monotonically increasing luminance with multiple hues (cubehelix, extbodyheat).

– Diverging with uniformly-stepped luminance (coolwarm, spectral).

– Diverging with uniformly-stepped saturation (blueyellow).

– Fully saturated hues (rainbow).

‡ Following [36], we derive expected visual angle measurements from pixel dimensions by assuming standard web viewing conditions: W3C-compliant browsers render HTML images at 96 DPI [40] and automatically remap this to compensate for actual display resolution. We assume a viewing distance of 30 inches. Thus, the estimated visual angle for an object of size S pixels is $\theta = 2\tan^{-1}\left(\frac{S/2}{96 \times 30}\right)$.
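The visual-angle conversion given in footnote ‡ (96 DPI, 30-inch viewing distance) can be checked with a short function. The function and constant names below are illustrative; only the 96 DPI and 30-inch values come from the paper.

```python
import math

CSS_DPI = 96              # CSS reference pixel density [40]
VIEWING_DISTANCE_IN = 30  # viewing distance assumed in the footnote, in inches


def visual_angle_deg(size_px, dpi=CSS_DPI, distance_in=VIEWING_DISTANCE_IN):
    """Visual angle (degrees) subtended by an object size_px pixels across."""
    half_size_in = (size_px / 2) / dpi
    return math.degrees(2 * math.atan(half_size_in / distance_in))


# Stimulus dimensions mentioned in the paper: half-map width, full map width,
# the experiment 2 boxes, and the experiment 3 marker separation.
for pixels in (410, 820, 175, 350):
    print(pixels, round(visual_angle_deg(pixels), 2))
```

Under these assumptions, 410 pixels comes out at roughly 8° and 820 pixels at roughly 16°, in line with the approximate figures quoted in the text.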

All colormaps were interpolated in the CIELAB color space, with the exception of coolwarm, blueyellow, and cubehelix; these were interpolated (as originally intended) in a polar form of the LAB space [24], in the HSL space, and using a tapered RGB helix [14], respectively.
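As an illustration of what interpolating a ramp in CIELAB involves, the sketch below linearly blends two CIELAB colors and converts each step to 8-bit sRGB. It is not the authors' code: the D65 white point, the XYZ-to-sRGB matrix, and the function names `lab_to_srgb` and `lab_ramp` are assumptions based on the standard CIELAB and sRGB definitions.

```python
WHITE_D65 = (0.95047, 1.0, 1.08883)  # reference white (X, Y, Z), assumed D65


def lab_to_srgb(L, a, b):
    """Convert a CIELAB color to 8-bit sRGB, clipping out-of-gamut values."""
    fy = (L + 16) / 116
    fx = fy + a / 500
    fz = fy - b / 200

    def inv_f(t):
        # Inverse of the CIELAB companding function.
        delta = 6 / 29
        return t ** 3 if t > delta else 3 * delta ** 2 * (t - 4 / 29)

    x, y, z = (w * inv_f(t) for w, t in zip(WHITE_D65, (fx, fy, fz)))

    # XYZ -> linear sRGB (standard matrix, rounded to four decimals).
    linear = (
        3.2406 * x - 1.5372 * y - 0.4986 * z,
        -0.9689 * x + 1.8758 * y + 0.0415 * z,
        0.0557 * x - 0.2040 * y + 1.0570 * z,
    )

    def encode(c):
        # Clip to gamut, apply the sRGB transfer curve, scale to 0-255.
        c = min(max(c, 0.0), 1.0)
        c = 12.92 * c if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055
        return round(255 * c)

    return tuple(encode(c) for c in linear)


def lab_ramp(start, end, steps):
    """Linearly interpolate between two CIELAB colors; returns sRGB triples."""
    ramp = []
    for i in range(steps):
        t = i / (steps - 1)
        lab = [s + t * (e - s) for s, e in zip(start, end)]
        ramp.append(lab_to_srgb(*lab))
    return ramp


# A greyscale ramp: black (L = 0) to white (L = 100) with no chroma.
greyscale = lab_ramp((0, 0, 0), (100, 0, 0), steps=5)
print(greyscale)
```

Because the blend happens in CIELAB, equal steps along the ramp correspond to approximately equal lightness steps, which is the motivation for interpolating there rather than in RGB.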

Colormap      Design strategy  Luminance control   Hues              Description
Greyscale     luminance ramp   monotonic increase  -                 Linear black to white ramp interpolated in the CIELAB color space
Singlehue     sequential       monotonic increase  blue              Monotonically increasing luminance over a single blue hue (from Color Brewer [15])
Bodyheat      sequential       monotonic increase  red, yellow       Monotonically increasing luminance with a limited hue profile similar to a heated metal filament
Cubehelix     spiral           monotonic increase  sinusoidal RGB    Monotonically increasing luminance with sinusoidal RGB rotation [14]
Ext-bodyheat  spiral           monotonic increase  blue, red, yellow Monotonically increasing luminance based on bodyheat but augmented with additional blue and purple hues in the low regions [41]
Cool-warm     diverging        uniform, mid peak   blue, red         Diverging with blue and red at the endpoints and soft white at the middle [24]. Uniform luminance steps with darker ends and a bright midpoint
Rainbow       hue rotation     -                   saturated RGB     Fully saturated hue gradation (blue, green, yellow, red) interpolated in CIELAB
Spectral      diverging        uniform, mid peak   limited RGB       Diverging multi-hue encompassing a subset of the rainbow with a yellow middle [15]. Uniform luminance steps with darker ends and a bright midpoint
Blue-yellow   diverging        -                   blue, yellow      Diverging uniformly-stepped saturation with blue and yellow ends and 75% grey in the middle

Table 1. The nine colormaps evaluated in this study.

Experimental Design

We investigate two independent variables: Colormap and Spatial Frequency. Colormap comprised nine distinct categories (see above), whereas Spatial Frequency is a continuous variable representing the number of cycles in 410 pixels (i.e., half the width of our map stimulus, or approximately 8° of visual angle‡). We sampled spatial frequency at five intervals: 3, 5, 7, 9, 11. To systematically study the effect of this variable on the different colormaps, we opted for a factorial design, testing all possible 9×5 Colormap and Spatial Frequency combinations. Given the sheer number of combinations, we opted for a mixed design to make the study feasible. Participants were randomly assigned to one of three experimental conditions (illustrated in Table 2). Each condition comprised 3 of the 9 colormaps (i.e., between-subject) and all 5 frequency levels (within-subject). In effect, every participant saw 3 colormap × 5 spatial frequency combinations. Participants completed multiple trials with each combination.

[Figure 3 panels: Experiment 1 ("Click on a point that has an elevation of exactly 750 feet"); Experiment 2 ("Terrain is steeper when there is larger change in elevation between adjacent points. Compare the steepness of terrain inside the two boxes, then click on the box that is steeper on average"); Experiment 3 ("Imagine a line from A to B. Select the elevation profile below that most closely matches its slope"). Each map is shown with a color scale labeled elevation (feet).]

Figure 3. Example stimuli from the three experiments. In experiment 1, participants indicated their response by clicking a point on the map matching a specified elevation. In experiment 2, participants were prompted to select the steeper of the two boxes. In experiment 3, participants were asked to identify the pattern corresponding to the terrain profile between two horizontally displaced markers.

To equalize task difficulty, stimuli for a given spatial frequency trial were derived from the same base scalar field with different colormaps applied. This arrangement enables us to make direct comparisons between the colormaps for a given frequency. However, it also meant that participants would see the same map three times, albeit with different colormaps. To prevent learning, scalar fields were flipped either horizontally or vertically, resulting in three unique map reflections. The order of colormap presentation was fully counterbalanced across participants to minimize residual learning or fatigue effects.

Condition  Colormaps tested                  Spatial frequencies tested
1          greyscale, cubehelix, rainbow     3, 5, 7, 9, 11
2          singlehue, extbodyheat, spectral  3, 5, 7, 9, 11
3          bodyheat, coolwarm, blueyellow    3, 5, 7, 9, 11

Table 2. Three experimental conditions, each included 3 of 9 colormaps (i.e., between-subject variation) and all 5 levels of spatial frequency.
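The mixed design in Table 2 can be made concrete with a small scheduling sketch. The condition contents and frequency levels are taken from the table; the assignment rule (rotating through conditions and through all six colormap orders) is a hypothetical stand-in, since the paper assigns participants randomly and does not publish its scheduling code.

```python
from itertools import permutations

# Colormaps per condition and frequency levels, as listed in Table 2.
CONDITIONS = {
    1: ("greyscale", "cubehelix", "rainbow"),
    2: ("singlehue", "extbodyheat", "spectral"),
    3: ("bodyheat", "coolwarm", "blueyellow"),
}
FREQUENCIES = (3, 5, 7, 9, 11)


def schedule(participant_index):
    """Return (condition, ordered list of (colormap, frequency) blocks).

    Hypothetical rule: participants rotate through the three conditions,
    then through the six possible colormap presentation orders.
    """
    condition = participant_index % 3 + 1
    orders = list(permutations(CONDITIONS[condition]))
    order = orders[(participant_index // 3) % len(orders)]
    # One round per colormap; frequencies ascend within each round.
    blocks = [(cmap, f) for cmap in order for f in FREQUENCIES]
    return condition, blocks


condition, blocks = schedule(0)
print(condition, blocks[:5])
```

Every participant ends up with 3 × 5 = 15 colormap and frequency combinations, and 18 consecutive participants cover each condition and presentation order exactly once.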

EXPERIMENT 1: QUANTITY ESTIMATION

The first experiment tests participants' ability to identify locations on the map matching specified elevations. Participants were instructed to "Click on a point that has an elevation of exactly [H] feet". Five different values for H were tested: 0, 250, 500, 750, and 1000 feet. These values correspond to the three quartiles of the color scale as well as the min and max.

Participants

We recruited 90 participants from Amazon Mechanical Turk (50 females, 40 males) with a mean age of 34.64 years (SD = 9.53 years). Participants were first screened for color-vision deficiency using a 14-panel Ishihara test and had to correctly guess the number in 12 of the 14 panels to qualify. We restricted the study to participants with a screen resolution of at least 1280×800 to ensure the experimental interface would fit their display. Participants received a base reward of $0.50 and a maximum bonus of $3.00 based on the percentage of correctly solved tasks (for a possible total of $3.50).

Procedure

After signing up for the study, participants were directed to an external link that displayed the experiment within a web interface. Participants entered their MTurk ID and were presented with an information sheet about the study. They were then presented with the color-vision qualification test. Those who successfully passed the test were given a set of 6 training trials and provided with feedback on their accuracy. Participants had to identify a location that is within a 5% margin from the specified height before proceeding to the next training trial.

The main portion of the experiment consisted of 3 rounds, one with each of the 3 colormaps. In each round, participants saw the five spatial frequency levels in ascending order, providing a progression from simple to more complex maps. The order of colormap presentation was fully counterbalanced across participants using a Latin square design. Participants completed 5 trials with each colormap and spatial frequency combination, corresponding to the 5 tested quantities (0, 250, 500, 750, and 1000) presented in random order. A color scale was displayed to the right of the map, and the range of the scale was fixed at [0–1000] feet (see Figure 3). In each trial, participants first saw the question and clicked on 'Show Map' to reveal the stimulus. They then indicated their response by clicking on the map to mark their selected location, and clicked 'Next'. To aid participants in accurately selecting locations, the mouse cursor was changed to a crosshair with a hollowed-out center (so as not to obscure the focal pixel).

Results

We computed an 'error' measurement for each response by taking the absolute difference between the requested elevation and the elevation at the point clicked by the participant. We then applied the following log transform [10, 16]:

$$\log_2(\text{error}) = \log_2\left(\left|\,\text{judged percent} - \text{true percent}\,\right| + \tfrac{1}{8}\right)$$
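A minimal sketch of this transform follows. It assumes that elevations are expressed as a percentage of the fixed 0–1000 feet color scale before differencing; the paper does not spell out that conversion, and the names `log_error` and `SCALE_MAX_FEET` are illustrative.

```python
import math

SCALE_MAX_FEET = 1000  # the color scale spans 0-1000 feet


def log_error(clicked_feet, target_feet):
    """Log-transformed absolute error, with elevations as percent of scale."""
    judged_percent = 100 * clicked_feet / SCALE_MAX_FEET
    true_percent = 100 * target_feet / SCALE_MAX_FEET
    return math.log2(abs(judged_percent - true_percent) + 1 / 8)


# A perfect response, and a response that is off by 100 feet (10% of scale).
print(log_error(750, 750))
print(log_error(850, 750))
```

The 1/8 offset keeps the transform defined for perfect responses, which map to log2(1/8) = -3 rather than negative infinity.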

We removed three participants from the analysis (amounting to 3.3% of subjects) because their overall accuracy was two standard deviations below the mean accuracy for all participants (M = 85.36%, SD = 9.48%). We analyze the results by fitting the log of error to a linear mixed-effects model comprising two fixed effects (colormap, frequency) and two random effects. The first random effect accounts for individual variations among participants, and the second for intra-trial variations (recall that trials comprised different test elevations).

[Figure 4 panels: one plot per colormap (greyscale, singlehue, bodyheat, cubehelix, extbodyheat, coolwarm, rainbow, spectral, blueyellow); x-axis: spatial frequency (3–11); y-axis: log(error); series: experiment vs. model.]

Figure 4. Mean log of error in quantity estimation (experiment 1 vs. model). Ribbons represent 95% CIs of the experimental results.

[Figure 5 panels: log(error) by colormap, and log(error) by spatial frequency (3–11) with one line per colormap.]

Figure 5. Mean log error in quantity estimation by colormap (left) and by colormap × spatial frequency. Intervals are 95% CIs.

Figure 4 illustrates the mean log error by colormap and spatial frequency (the dashed trendline represents the model). The experimental results are combined in Figure 5 to ease comparison. A likelihood ratio test indicates the overall model is significant (χ²(18) = 1481.7, p < 0.001). To test for interaction between spatial frequency and colormap, we fit a reduced model that accounts for both frequency and colormap, but not their interaction. There was no significant difference between the full and the reduced model (χ²(8) = 10.695, p = 0.219), thus ruling out an interaction between colormap and spatial frequency. We will therefore interpret the reduced model, which accounts for both factors independently. Table 3 illustrates the model coefficients.

The model predicts that a step-increase in spatial frequency yields a 0.11 increase in the log of estimation error. The difference in estimation error between the highest (f=11) and lowest (f=3) frequency levels is approximately 0.9 orders of magnitude. The effect of color encoding was equally evident: all the colormaps were significantly better than greyscale. However, the gain in accuracy was markedly different between the colormaps. Rainbow had the largest impact on estimation accuracy, reducing error by approximately 2.3 orders of magnitude compared to greyscale. The runner-up was spectral, which also contains substantial hue variation. However, spectral reduced error by 1.75 orders of magnitude only. On the other hand, Spiral colormaps (extbodyheat, cubehelix), which comprise multiple hues over a monotonically increasing luminance, decreased estimation errors by approximately 1.3–1.4 orders of magnitude, compared to 0.8–1.2 for Diverging ramps (blueyellow, coolwarm). Sequential schemes (singlehue and bodyheat) had the least impact on error, with a mere improvement of 0.26–0.71 orders of magnitude relative to greyscale.

Coefficient        Estimate  |t value|
(Intercept)           1.77     6.628
singlehue            -0.26     2.253
bodyheat             -0.71     6.118
cubehelix            -1.27    16.467
extbodyheat          -1.39    12.073
coolwarm             -1.18    10.122
rainbow              -2.33    30.264
spectral             -1.75    15.172
blueyellow           -0.80     6.871
Spatial Frequency     0.11    16.967

Table 3. Effects of colormap and spatial frequency on the log of error in quantity estimation. The intercept represents greyscale as colormap (∗∗∗ = p < 0.001, ∗∗ = p < 0.01, ∗ = p < 0.05).

[Figure 6: average gradient at selected locations (y-axis) by spatial frequency (x-axis, 3–11).]

Figure 6. Average local gradient (i.e., terrain slope) at locations selected by participants. Error bars are 95% CIs.
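To make the reduced model easier to read, the sketch below plugs the Table 3 estimates into the additive form the text describes. It covers fixed effects only (random effects are ignored) and assumes spatial frequency enters as its raw value of 3 to 11, which is consistent with the reported 0.9 difference between f=3 and f=11 (0.11 × 8 = 0.88); the function name is illustrative.

```python
INTERCEPT = 1.77        # greyscale baseline (Table 3)
FREQUENCY_SLOPE = 0.11  # change in log2 error per unit of spatial frequency
COLORMAP_OFFSET = {
    "greyscale": 0.0, "singlehue": -0.26, "bodyheat": -0.71,
    "cubehelix": -1.27, "extbodyheat": -1.39, "coolwarm": -1.18,
    "rainbow": -2.33, "spectral": -1.75, "blueyellow": -0.80,
}


def predicted_log_error(colormap, frequency):
    """Fixed-effects prediction of log2 error for a colormap and frequency."""
    return INTERCEPT + COLORMAP_OFFSET[colormap] + FREQUENCY_SLOPE * frequency


# Colormaps ordered from lowest to highest predicted error.
ranking = sorted(COLORMAP_OFFSET, key=COLORMAP_OFFSET.get)
print(ranking)
print(predicted_log_error("rainbow", 11) - predicted_log_error("rainbow", 3))
```

Because the model has no interaction term, the ranking is the same at every frequency level: the frequency term shifts all colormaps by the same amount.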

The fact that we did not find interaction between colormap and spatial frequency implies that the relative effectiveness of the different colormaps is stable across all spatial frequency levels tested. Rainbow is thus expected to be the most accurate colormap for quantity estimation, regardless of how spatially complex the data is. However, estimation accuracy will decrease comparably for all colormaps as the data becomes more spatially varied. This could reflect a combination of perceptual and motor difficulty in locating and clicking the intended location, due to the larger local gradients encountered in high-frequency maps (see Figure 6).

EXPERIMENT 2: GRADIENT PERCEPTION

The second experiment tests participants' accuracy in comparing and judging the steepness of gradients. The ability to judge how fast the encoded quantities change between adjacent map locations is important in many contexts.

Participants

We recruited 126 participants (50 females, 74 males, 2 others) with a mean age of 35.62 years (SD = 9.55 years). Participants had an overall success rate of 67.75% (SD = 11.41%). Ten participants (7.9% of subjects) were dropped from the analysis because their overall accuracy was worse than chance, having correctly answered fewer than 50% of trials in a two-alternative forced-choice experiment.

[Figure 7 panels: one plot per colormap (greyscale, singlehue, bodyheat, cubehelix, extbodyheat, coolwarm, rainbow, spectral, blueyellow); x-axis: spatial frequency (3–11); y-axis: p(success); series: experiment vs. model.]

Figure 7. Probability of successful gradient judgment (experiment 2 vs. model). Ribbons represent 95% CIs of the experimental results.

[Figure 8 panels: proportion correct by spatial frequency (3–11) with one line per colormap, and proportion correct by colormap.]

Figure 8. Percentage of correctly answered trials in a gradient perception task (experiment 2). Intervals are 95% CIs.

Procedure

Each trial consisted of a map with two squares juxtaposed on top (see Figure 3). Participants were prompted to "compare the steepness of terrain inside the two boxes" and "click on the box that is steeper on average". The two boxes were identical in size (175×175 pixels, or 3.5°×3.5° of visual angle). However, terrain steepness, which was calculated by taking the average first derivative within each box, was varied systematically. The gradient ratio between the flatter and the steeper boxes was fixed to one of four levels: 0.8, 0.83, 0.86, 0.9 (±0.05). A lower ratio implies a larger and potentially more perceptible slope difference, making the task easier. However, the two boxes encompassed terrain with identical height ranges to reduce variability in the appearance of their peaks (a potential confound in slope judgment [26]).
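The steepness measure and gradient ratio can be sketched as follows. The paper only states that steepness is the average first derivative within each box; using the mean gradient magnitude from forward differences is an assumption, and the function names are illustrative.

```python
import math


def mean_gradient(field, left, top, size):
    """Average gradient magnitude inside a size x size box (forward differences).

    The field must extend at least one cell beyond the box on the right and bottom.
    """
    total = 0.0
    for y in range(top, top + size):
        for x in range(left, left + size):
            dx = field[y][x + 1] - field[y][x]
            dy = field[y + 1][x] - field[y][x]
            total += math.hypot(dx, dy)
    return total / (size * size)


def gradient_ratio(field, box_a, box_b, size):
    """Ratio of the flatter box's mean gradient to the steeper box's (<= 1)."""
    g_a = mean_gradient(field, *box_a, size)
    g_b = mean_gradient(field, *box_b, size)
    return min(g_a, g_b) / max(g_a, g_b)


# Toy terrain: slope 0.8 per pixel on the left, 1.0 per pixel on the right.
heights = [[0.8 * x if x <= 5 else 4.0 + (x - 5) for x in range(12)] for _ in range(6)]
print(gradient_ratio(heights, (0, 0), (6, 0), size=4))
```

A stimulus generator could sample candidate box pairs and keep those whose ratio falls within the tolerance of one of the four target levels.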

Participants first completed a set of 6 training trials that included feedback before proceeding to the main trials. The order of stimuli was similar to the previous experiment: the study consisted of 3 rounds, one with each of the 3 colormaps the participant was assigned to see. Each round encompassed all 5 spatial frequency levels. Participants completed 4 trials with each colormap and frequency combination, spanning a range of easy to difficult tests (a total of 60 trials). The order of colormap presentation was fully counterbalanced across participants.

Results

Figure 7 illustrates participants' probability of correctly identifying the steeper gradient. The experimental data is shown separately in Figure 8. We fit the results to a logistic regression model comprising two fixed effects (colormap, frequency). The model also included two random effects to account for individual differences among participants and intra-trial variations (recall that trials varied in difficulty). The model essentially predicts the odds of correctly identifying the steeper gradient. A likelihood ratio test indicates the overall model is significant (χ²(17) = 421.65, p < 0.001). The model correctly predicts 74.67% of outcomes. We find significant interaction between colormap and spatial frequency (χ²(8) = 27.81, p < 0.001). Table 4 shows model coefficients.

(a) Main effects
Coef.         Est.   |z|
(Intercept)   0.80   0.516
singlehue     1.14   0.415
bodyheat      0.94   0.183
cubehelix     0.90   0.354
extbodyheat   1.13   0.388
coolwarm      0.66   1.316
rainbow       0.65   1.354
spectral      0.92   0.259
blueyellow    0.91   0.281
Frequency     1.15   4.633

(b) Interaction effects (colormap × frequency)
Coef.         Est.   |z|
singlehue     0.96   0.910
bodyheat      1.04   1.020
cubehelix     1.06   1.303
extbodyheat   1.03   0.596
coolwarm      1.14   3.008
rainbow       1.13   2.876
spectral      1.06   1.330
blueyellow    1.12   2.471

Table 4. Main effects of colormap and spatial frequency on success odds in gradient judgment (a) and their interaction (b). Coefficients shown correspond to the exponentiated model estimates to reflect odds ratios. The intercept represents greyscale (∗∗∗ = p < 0.001, ∗∗ = p < 0.01, ∗ = p < 0.05).

The main-effect coefficients for all colormaps were not significant, indicating that all colormaps perform comparably to greyscale at low spatial frequencies. Participants are thus unlikely to benefit from the use of color when judging gradients in low-variance data. The main effect of spatial frequency, however, is significant. The model estimates that a step-increase in spatial frequency improves the odds of correct judgment by 15%. Estimating gradients appears to be easier in maps with more complex spatial structures.

The model indicates several noteworthy interactions. Although the use of color had no significant effect in low-frequency maps, several colormaps significantly outperformed greyscale at high frequency. The divergent coolwarm improved participants' success odds by 14% for every step-increase in spatial frequency. Similarly, rainbow and blueyellow increased the odds by approximately 13% and 12%, respectively. Notably, these three colormaps contain substantial variation in saturation (coolwarm and blueyellow) or hue (rainbow). All other colormaps tested were not reliably different from greyscale.
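One way to read the odds ratios in Table 4 is to combine them multiplicatively and convert the resulting odds to a probability. The sketch below does this for greyscale and the three colormaps with significant interactions. It is an interpretation aid under stated assumptions, not the fitted model: it ignores the random effects and assumes frequency enters as its raw value (3 to 11) with the intercept at frequency zero, neither of which the paper states explicitly.

```python
INTERCEPT_ODDS = 0.80  # greyscale baseline odds (Table 4a)
FREQUENCY_OR = 1.15    # odds ratio per unit of spatial frequency (Table 4a)
MAIN_OR = {"greyscale": 1.00, "coolwarm": 0.66, "rainbow": 0.65, "blueyellow": 0.91}
INTERACTION_OR = {"greyscale": 1.00, "coolwarm": 1.14, "rainbow": 1.13, "blueyellow": 1.12}


def success_probability(colormap, frequency):
    """Fixed-effects probability of picking the steeper box."""
    per_step = FREQUENCY_OR * INTERACTION_OR[colormap]
    odds = INTERCEPT_ODDS * MAIN_OR[colormap] * per_step ** frequency
    return odds / (1 + odds)


for cmap in MAIN_OR:
    low, high = success_probability(cmap, 3), success_probability(cmap, 11)
    print(f"{cmap:10s} f=3: {low:.2f}  f=11: {high:.2f}")
```

Under these assumptions the colormaps start out close to greyscale at f=3 and pull ahead as frequency grows, which mirrors the pattern the text describes.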

EXPERIMENT 3: PATTERN PERCEPTION

Having tested accuracy in quantity estimation and gradient perception, we now evaluate participants' ability to integrate these two skills. Experiment 3 required participants to extract a longitudinal pattern from the map and match it to an external representation, a task originally devised by Hyslop [18].

Participants

We recruited 165 participants (79 females, 84 males, 2 others). The mean participant age was 36.04 years (SD = 11.71). Overall, participants had a mean success rate of 78.51% in matching the correct pattern (SD = 19.31%). We dropped seven participants from the analysis (4.2% of subjects) whose overall accuracy was two standard deviations below the mean.

[Figure 9 panels: one plot per colormap (greyscale, singlehue, bodyheat, cubehelix, extbodyheat, coolwarm, rainbow, spectral, blueyellow); x-axis: spatial frequency (3–11); y-axis: p(success); series: experiment vs. model.]

Figure 9. Probability of successful pattern matching (experiment 3 vs. model). Ribbons denote 95% CIs of the experimental data.

[Figure 10 panels: proportion correct by spatial frequency (3–11) with one line per colormap, and proportion correct by colormap.]

Figure 10. Percentage of correctly answered trials in experiment 3. Intervals are 95% CIs.
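The two-standard-deviation exclusion rule used here (and in experiment 1) is simple to express in code. The sketch below assumes the sample standard deviation and a strict "below the cutoff" comparison; the paper does not specify either detail, and the function name is illustrative.

```python
from statistics import mean, stdev


def excluded(accuracies, n_sd=2.0):
    """Indices of participants whose accuracy is more than n_sd SDs below the mean."""
    cutoff = mean(accuracies) - n_sd * stdev(accuracies)
    return [i for i, acc in enumerate(accuracies) if acc < cutoff]


# Nine consistent participants and one who answered mostly at random.
scores = [0.8] * 9 + [0.1]
print(excluded(scores))
```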

Procedure

Participants first completed a set of 6 training trials that included feedback before proceeding to the main experiment. Each trial consisted of a map with two markers, labeled A and B (see Figure 3). The markers were horizontally displaced by 350 pixels (7° of visual angle). Participants were given the following prompt: "Imagine a line from A to B. Select the elevation profile below that most closely matches its slope". They then selected a choice among a set of 6 patterns, including the actual elevation profile and 5 other distractors. Distractors were generated from the same map so as to reflect similar spatial frequency characteristics, and had to be 65–70% similar to the actual profile (as measured by dynamic warping [20]). Additionally, profiles and distractors were selected to not have peaks or valleys at the endpoints. These criteria, determined after a pilot, ensure similar task difficulty across the trials.
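For readers unfamiliar with the similarity measure, the sketch below computes a classic dynamic time warping distance between two elevation profiles, assuming that "dynamic warping" refers to this standard algorithm. How the authors turned a warping distance into the 65–70% similarity figure is not specified, so that conversion is deliberately left out.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of the first i items of a with the first j of b
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


profile = [0, 0, 1, 2, 1, 0]
shifted = [0, 1, 2, 2, 1, 0]  # same shape, peak stretched and shifted
taller = [0, 0, 1, 3, 1, 0]   # same timing, higher peak
print(dtw_distance(profile, shifted), dtw_distance(profile, taller))
```

Warping tolerates horizontal stretching, so a profile whose peak is merely shifted scores as closer than one whose peak height differs. That makes it a reasonable basis for choosing distractors that look plausible yet differ from the true profile.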

The order of stimuli was similar to the previous two experiments: the study consisted of 3 rounds, one with each of the 3 colormaps the participant was assigned to see, and encompassing the 5 spatial frequency levels. Thus, every participant saw 3×5 colormap and frequency combinations and completed 3 pattern matching trials with each combination, for a total of 45 trials. As in the previous experiments, the order of colormap presentation was fully counterbalanced.

Results

We fit the results to a logistic regression model comprising two fixed effects (colormap, frequency) and two random effects to account for individual differences and intra-trial variations. Figure 9 shows the odds of successful profile matching. The experimental results are illustrated separately in Figure 10. A likelihood ratio test indicates the model is significant (χ²(17) = 394.67, p < 0.001). We find significant interaction between colormap and spatial frequency (χ²(8) = 18.131, p < 0.05): the relative effectiveness of the colormaps appears to vary with spatial frequency.

(a) Main effects
Coef.         Est.   |z|
(Intercept)   9.03   6.502
singlehue     1.34   0.649
bodyheat      0.80   0.503
cubehelix     0.77   0.681
extbodyheat   0.81   0.466
coolwarm      0.44   1.868
rainbow       0.98   0.064
spectral      0.57   1.284
blueyellow    0.73   0.697
Frequency     0.93   2.058

(b) Interaction effects (colormap × frequency)
Coef.         Est.   |z|
singlehue     0.98   0.450
bodyheat      1.02   0.456
cubehelix     1.09   1.781
extbodyheat   1.04   0.797
coolwarm      1.17   3.143
rainbow       1.01   0.259
spectral      1.11   2.198
blueyellow    1.09   1.683

Table 5. Main effects of colormap and spatial frequency on success odds in experiment 3 (a) and their interaction (b). Coefficients depict exponentiated model estimates to reflect odds ratios. The intercept corresponds to greyscale (∗∗∗ = p < 0.001, ∗∗ = p < 0.01, ∗ = p < 0.05; a further marker denotes p < 0.1).

Overall, we find a significant detrimental main effect of spatial frequency on pattern perception, as indicated by a 0.93 Frequency coefficient (Table 5a). This translates to a 7% drop in the odds of correctly matching the profile for every step-increase in spatial frequency. The main effect coefficients for all colormaps were not significant, indicating that the use of color at low spatial frequency is unlikely to improve pattern perception as compared to a plain greyscale ramp.

Colormap performance begins to diverge at high spatial frequency. Only two colormaps have significant and large enough odds-ratio coefficients (i.e., > 1.07) to overcome the frequency-induced perceptual difficulty: spectral and coolwarm increased the odds of correct pattern matching by 11% and 17%, respectively, for every step-increase in spatial frequency (after adjusting for frequency effects alone). Additionally, blueyellow and cubehelix were associated with a 9% improvement, but the advantage was not reliable (p < 0.1). On the other hand, extbodyheat, bodyheat, and rainbow had small (and insignificant) odds-ratio coefficients (0.98–1.04 < 1.07), indicating that, similar to greyscale, they are associated with lower success odds in complex maps.
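The 1.07 threshold can be recovered from the Frequency main effect in Table 5a. The derivation below assumes the odds ratios combine multiplicatively, as in a standard logistic model; the notation is introduced here for illustration.

```latex
% Net per-step odds ratio for a colormap c:
\mathrm{OR}_{\text{net}}(c) = \mathrm{OR}_{\text{frequency}} \times \mathrm{OR}_{\text{interaction}}(c)
                            = 0.93 \times \mathrm{OR}_{\text{interaction}}(c)

% Performance holds or improves with frequency only if the net ratio exceeds 1:
\mathrm{OR}_{\text{net}}(c) > 1
  \iff \mathrm{OR}_{\text{interaction}}(c) > \frac{1}{0.93} \approx 1.075

% Examples using Table 5b:
% coolwarm:  0.93 \times 1.17 \approx 1.088   (net gain per step)
% spectral:  0.93 \times 1.11 \approx 1.032   (net gain per step)
% rainbow:   0.93 \times 1.01 \approx 0.939   (net loss per step)
```

Cubehelix and blueyellow (1.09) also clear the threshold numerically, but their interaction coefficients were not reliable, which is why the text singles out only spectral and coolwarm.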

In short, only two of the tested colormaps (coolwarm and spectral) appear to reliably support pattern perception at high spatial frequency. Both consist of a diverging ramp with uniformly-stepped luminance. All other colormaps (including greyscale) suffered as data complexity increased.
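The odds-ratio arithmetic behind this summary can be checked with a short sketch. It assumes the exponentiated coefficients of Table 5 combine multiplicatively, as they would in a logistic model with a colormap × frequency interaction: the per-step odds multiplier for greyscale is the Frequency coefficient alone, and for any other colormap it is that coefficient times the colormap's interaction coefficient. The function and variable names are illustrative and are not taken from the study's analysis code.

```python
# Exponentiated coefficients as reported in Table 5.
FREQUENCY_OR = 0.93  # main effect of one step-increase in spatial frequency
INTERACTION_OR = {   # colormap x frequency interaction terms
    "singlehue": 0.98, "bodyheat": 1.02, "cubehelix": 1.09,
    "extbodyheat": 1.04, "coolwarm": 1.17, "rainbow": 1.01,
    "spectral": 1.11, "blueyellow": 1.09,
}


def per_step_multiplier(colormap: str) -> float:
    """Net change in success odds per frequency step for a colormap."""
    if colormap == "greyscale":  # reference level: frequency effect only
        return FREQUENCY_OR
    return FREQUENCY_OR * INTERACTION_OR[colormap]


# Interaction coefficient needed to cancel the frequency penalty exactly.
break_even = 1 / FREQUENCY_OR  # roughly the "> 1.07" threshold in the text

for name in ["greyscale", *INTERACTION_OR]:
    m = per_step_multiplier(name)
    trend = "improves" if m > 1 else "degrades"
    print(f"{name:12s} x{m:.3f} per step ({trend})")
```

Under these assumptions, blueyellow and cubehelix (1.09) also clear the break-even point numerically, but, as noted above, their interaction terms were not statistically reliable.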

DISCUSSION AND GUIDELINES

Our work sheds new light on how spatial complexity impacts the perception of continuous color-coded maps. The experiments also led to some surprising findings that are at odds with current guidelines. We interpret these results and accordingly devise new task- and frequency-aware color mapping guidelines (indicated by ★). We also rank the tested colormaps and summarize our guidelines in Figure 11.

Figure 11: Model-derived colormap ranking and guidelines by task and spatial frequency, with significance relative to greyscale marked at p < 0.05 and p < 0.1. Panels: Quantity estimation (ranking unaffected by spatial frequency); Gradient perception and Pattern perception, each shown at low (0.38 cycles/deg) and high (1.38 cycles/deg) spatial frequency. Colormaps are ranked from better to worse, or annotated with "no significant difference". Colormap guidelines: (1) maximize the range of saturated hues, regardless of spatial frequency; (2) at high spatial frequency, fully saturated hues or diverging ramps with chroma variation; (3) at high spatial frequency, diverging ramps with uniformly-stepped luminance.

Quantity Estimation

Our first hypothesis (H1) predicts hue- and saturation-varying ramps to be more accurate at low spatial frequencies, and ramps with monotonically increasing luminance to be more accurate at high frequencies. As discussed, H1 is based on the relative contrast sensitivity of our visual system [32]. However, the quantity estimation task (experiment 1) shows no interaction between colormap and spatial frequency. While increased spatial complexity is associated with higher estimation error, the effect is similar across all colormaps. We thus reject H1.

On the other hand, our results provide support for H2, which predicts that hue-varying ramps will lead to more accurate estimation. Indeed, the top-performing colormaps (rainbow and spectral) contain substantial hue variation. Results from experiment 1 thus replicate earlier findings by Ware [41], but also extend them to show that spatial frequency has no apparent impact on the effectiveness of hue-varying ramps. Our data shows that rainbow and spectral are the most accurate among the colormaps tested, even at the highest levels of spatial frequency. Altogether, these results lend further support to the theory that lookup errors in color-coded maps are largely caused by systematic simultaneous contrast shifts [41], rather than being affected by contrast sensitivity modulation [32]. These shifts are best counteracted with colormaps that vary non-monotonically along one or more perceptual channels.

A corollary result is that mixing monotonic luminance with hue variation would lead to significant accuracy loss. Indeed, data from experiment 1 indicates that rainbow is approximately an order of magnitude more accurate than extbodyheat and cubehelix. These Spiral colormaps are designed to be more accurate rainbow alternatives for interval data [4, 24]. On the contrary, we find that they reduce accuracy compared to a purely hue-varying ramp. This finding suggests that, when estimating a continuously coded spatial quantity, people benefit most from a large dynamic hue range. Incorporating monotonically increasing lightness within the colormap would necessarily reduce the hue range, thereby diminishing accuracy.

★ Guideline 1: We recommend maximizing hue variation to improve quantity estimation, irrespective of spatial frequency.

Gradient Perception

Gradient perception allows people to distinguish how quickly the encoded attribute changes between adjacent locations, an essential skill when evaluating the distribution and variance of spatial data. We find that the task is strongly modulated by the data’s spatial complexity: increased spatial frequency appears to enhance the perception of gradients. This is unsurprising, as maps with jagged surfaces are likely to exhibit more pronounced (and thus more perceptible) differences in slope. Colormap effectiveness was also impacted by spatial frequency: color encoding did not help participants distinguish gradients at low frequency levels, as all colormaps showed similar performance to greyscale. However, three colormaps demonstrated a significant advantage at high frequencies. Coolwarm, rainbow, and blueyellow improved perception odds by 12–14% for every step-increase in spatial frequency. All three employed one of two design strategies: a diverging ramp with varying saturation, or a fully saturated hue rotation.

The above results contradict H1, which predicts hue- and saturation-varying colormaps to perform better at low frequencies. In fact, we see the opposite. The results also do not support H3, which predicts better performance for ramps with monotonic luminance in structure perception tasks. Indeed, all three top-performing ramps exhibit non-monotonic luminance.

★ Guideline 2: For tasks requiring gradient perception at high spatial frequency, we recommend a range of fully saturated hues (e.g., rainbow) or diverging chroma-varying ramps (e.g., coolwarm or blueyellow).

Pattern Perception

Experiment 3 prompted participants to match the elevation profile along a horizontal path with an external pattern. We expected colormaps with monotonically increasing luminance to be more accurate at this task (H3), but results were not entirely consistent with this prediction. While all tested colormaps had comparable performance at low spatial frequency, only two colormaps, coolwarm and spectral, gave participants higher odds of successfully matching the pattern at high frequency. Both colormaps comprise a diverging ramp with uniformly-stepped (though not strictly monotonic) luminance. By contrast, sequential and spiral ramps performed just as poorly as greyscale in complex maps, and so did rainbow.

The above results are consistent with Moreland’s argument that diverging ramps provide “maximal perceptual resolution” (through increasing and decreasing luminance intervals) [24], potentially enabling high-frequency patterns to be resolved more easily. Our results may also explain why diverging schemes performed better in medical diagnosis [3]: we suspect such tasks require the analysis of potentially high-frequency features (e.g., small tissue aberrations).

★ Guideline 3: We recommend diverging ramps with equidistant luminance steps (e.g., coolwarm and spectral) to support the perception of longitudinal patterns at high spatial frequency. Rainbow, Sequential, and Spiral schemes should be avoided in complex maps, especially if the task involves the analysis and matching of fine-grained features.

Yet Another Look at the Rainbow

Results of experiments 1 and 2 may shed light on why rainbow remains a popular choice among scientists [25], despite being considered a bad choice by the visualization community [4, 30]. Our data reveals that, counterintuitively, rainbow is robust for estimating a smoothly varying quantitative attribute, regardless of spatial complexity. Moreover, rainbow provides good support for gradient estimation at high spatial frequency. These two tasks correspond to elementary visual analytic primitives, including characterizing distributions, determining ranges, and filtering [1]. Studies also show that when experts attempt to form a mental model about a visualization, they first go through a time-consuming process of extracting quantitative data “at a rather detailed level” [38]. For instance, a weather forecaster will look up pressure and wind changes, estimating current readings at landmark locations in the map, before making a forecast. Our data suggests that rainbow provides good support for these tasks, making it a potentially reasonable choice for weather forecasters.

Critique of rainbow centers on its tendency to create sharp visual boundaries, particularly around its yellow regions [4]. Experts also criticize the use of fully saturated hues [24], which result in non-uniform perceptual steps within the color ramp. The common intuition is that these two factors combined will inevitably distort the perception of quantities. We do not see evidence to support this hypothesis. On the contrary, attempts to ‘linearize’ the rainbow by monotonically increasing the luminance of hues could reduce estimation accuracy by up to an order of magnitude.

★ Guideline 4: Rather than entirely discouraging the use of rainbow, we suggest that it can be a reasonable design choice for conveying spatial distributions and variances, and in tasks that require quantitative, as opposed to geometric, precision. However, rainbow has a number of limitations. The use of green and red hues is problematic for people with color deficiency. Moreover, rainbow is probably ineffective at revealing high-frequency patterns. Interestingly, these shortcomings are balanced by diverging ramps (e.g., coolwarm), which, although quantitatively inaccurate, appear to support pattern perception at high spatial frequency. We thus argue that hue-varying and diverging colormaps support orthogonal tasks in continuous maps, and should therefore be considered as complementary rather than mutually exclusive choices.

LIMITATIONS AND FUTURE WORK

There are some limitations to our work that should be considered. First, as with other crowdsourced graphical perception studies, we gain access to a larger pool of participants but sacrifice some experimental control [16]. Particularly relevant to our study are the variations in participants’ monitors, including color calibration and display resolution, as well as the illumination conditions in their homes or offices, all of which can impact color perception. We could not control these factors, but attempted to counteract their variation by involving a larger sample (N=381). Although we expect crowdsourcing to improve the ecological validity of results and guidelines, uncontrolled variations can potentially reduce our ability to detect small but otherwise significant differences in performance between tested conditions. Future lab studies should therefore attempt to replicate our findings with added controls.

Second, our study employed a limited set of tasks designed to measure elementary perceptual operators, including quantity estimation, gradient perception, and pattern matching. There is an opportunity to test higher-level tasks that mimic scientific analyses more closely, including the identification and comparison of larger map features (e.g., fronts, ridges). Additionally, some of the tasks we tested could be re-evaluated in more authentic formulations. For instance, a metric task could require participants to estimate the quantity at a specific location on the map. This formulation is arguably more realistic than the task we tested, which simply asked participants to click any location thought to match a specified quantity.

Third, our analysis was focused exclusively on spatial frequency, and there are good reasons to consider this factor [11, 32]. However, there are also additional data characteristics to consider, including, for instance, the distribution of amplitudes within the map. Such factors will influence the distribution of colors in the image and may thus impact perception.

Lastly, we limited our study to synthetically generated scalar fields to precisely vary spatial frequency while controlling for other confounds. However, synthetic stimuli may also introduce (unknown) perceptual or cognitive biases. Therefore, additional studies are needed to replicate our findings with datasets from real-world domains (e.g., meteorology, geophysics, or oceanography) and with domain experts. We also restricted this study to participants with normal color vision. Therefore, our results may not generalize to the approximately 5% of the population who have some form of color deficiency.

CONCLUSIONS

We conducted three experiments to investigate the effects of spatial frequency and colormap characteristics on the perception of continuous pseudocolor maps. Our results indicate that spatial frequency impacts judgment of the encoded quantities and structures. While viewers’ quantity estimation accuracy exhibited a predictable response, increased data complexity had a more nuanced effect on gradient and pattern comprehension, the impact of which was dependent on the colormap used. Designers should therefore consider both the type of task and the spatial complexity of the underlying data. We re-examined current guidelines and devised new recommendations for color-coding of continuous spatial data.

REFERENCES

1. Robert Amar, James Eagan, and John Stasko. 2005. Low-level components of analytic activity in information visualization. In Information Visualization, 2005. INFOVIS 2005. IEEE Symposium on. IEEE, 111–117.

2. Lawrence D. Bergman, Bernice E. Rogowitz, and Lloyd A. Treinish. 1995. A rule-based tool for assisting colormap selection. In Proceedings of the 6th conference on Visualization ’95. IEEE Computer Society, 118.

3. Michelle Borkin, Krzysztof Gajos, Amanda Peters, Dimitrios Mitsouras, Simone Melchionna, Frank Rybicki, Charles Feldman, and Hanspeter Pfister. 2011. Evaluation of artery visualizations for heart disease diagnosis. IEEE Transactions on Visualization and Computer Graphics 17, 12 (2011), 2479–2488.

4. David Borland and Russell M. Taylor II. 2007. Rainbow color map (still) considered harmful. IEEE Computer Graphics and Applications 27, 2 (2007).

5. Cynthia A. Brewer. 1994. Visualization in Modern Cartography. Elsevier Science, Chapter Color use guidelines for mapping and visualization.

6. Cynthia A. Brewer. 1996. Guidelines for selecting colors for diverging schemes on maps. The Cartographic Journal 33, 2 (1996), 79–86.

7. Cynthia A. Brewer, Alan M. MacEachren, Linda W. Pickle, and Douglas Herrmann. 1997. Mapping mortality: Evaluating color schemes for choropleth maps. Annals of the Association of American Geographers 87, 3 (1997), 411–438.

8. Roxana Bujack, Terece L. Turton, Francesca Samsel, Colin Ware, David H. Rogers, and James Ahrens. 2017. The Good, the Bad, and the Ugly: A Theoretical Framework for the Assessment of Continuous Colormaps. IEEE Transactions on Visualization and Computer Graphics (2017).

9. William S. Cleveland and Robert McGill. 1983. A color-caused optical illusion on a statistical graph. The American Statistician 37, 2 (1983), 101–105.

10. William S. Cleveland and Robert McGill. 1984. Graphical perception: Theory, experimentation, and application to the development of graphical methods. J. Amer. Statist. Assoc. 79, 387 (1984), 531–554.

11. Russell De Valois and Karen De Valois. 1990. Spatial Vision. Oxford University Press.

12. Russell L. De Valois, Duane G. Albrecht, and Lisa G. Thorell. 1978. Cortical cells: bar and edge detectors, or spatial frequency filters? In Frontiers in Visual Science. Springer, 544–556.

13. Connor C. Gramazio, David H. Laidlaw, and Karen B. Schloss. 2017. Colorgorical: Creating discriminable and preferable color palettes for information visualization. IEEE Transactions on Visualization and Computer Graphics 23, 1 (2017), 521–530.

14. D. A. Green. 2011. A colour scheme for the display of astronomical intensity images. Bulletin of the Astronomical Society of India 39 (June 2011), 289–295.

15. Mark Harrower and Cynthia A. Brewer. 2003. ColorBrewer.org: an online tool for selecting colour schemes for maps. The Cartographic Journal 40, 1 (2003), 27–37.

16. Jeffrey Heer and Michael Bostock. 2010. Crowdsourcing graphical perception: using Mechanical Turk to assess visualization design. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 203–212.

17. G. T. Herman and H. Levkowitz. 1992. Color scales for image data. IEEE Computer Graphics and Applications 12, 1 (1992), 72–80.

18. Michael D. Hyslop. 2006. A comparison of spectral color and greyscale continuous-tone map perception. Master’s thesis. Michigan State University.

19. Alan David Kalvin, Bernice E. Rogowitz, Adar Pelah, and Aron Cohen. 2000. Building perceptual color maps for visualizing interval data. In Human Vision and Electronic Imaging V, Vol. 3959. International Society for Optics and Photonics, 323–336.

20. Eamonn Keogh and Chotirat Ann Ratanamahatana. 2005. Exact indexing of dynamic time warping. Knowledge and Information Systems 7, 3 (2005), 358–386.

21. Gordon Kindlmann, Erik Reinhard, and Sarah Creem. 2002. Face-based luminance matching for perceptual colormap generation. In Proceedings of the IEEE Conference on Visualization ’02. IEEE Computer Society, 299–306.

22. Robert Kosara. 2016. An Empire Built On Sand: Reexamining What We Think We Know About Visualization. In Proceedings of the Beyond Time and Errors on Novel Evaluation Methods for Visualization. ACM, 162–168.

23. Mark P. Kumler and Richard E. Groop. 1990. Continuous-tone mapping of smooth surfaces. Cartography and Geographic Information Systems 17, 4 (1990), 279–289.

24. Kenneth Moreland. 2009. Diverging color maps for scientific visualization. In International Symposium on Visual Computing. Springer, 92–103.

25. Kenneth Moreland. 2016. Why We Use Bad Color Maps and What You Can Do About It. Electronic Imaging 2016, 16 (2016), 1–6.

26. Lace Padilla, P. Samuel Quinan, Miriah Meyer, and Sarah H. Creem-Regehr. 2017. Evaluating the Impact of Binning 2D Scalar Fields. IEEE Transactions on Visualization and Computer Graphics 23, 1 (2017), 431–440.

27. Stephen E. Palmer. 1999. Vision science: Photons to phenomenology. MIT Press.

28. Ken Perlin. 1985. An image synthesizer. ACM SIGGRAPH Computer Graphics 19, 3 (1985), 287–296.

29. Linda Williams Pickle. 2003. Usability testing of map designs. In Proceedings of Symposium on the Interface of Computing Science and Statistics. 42–56.

30. P. Samuel Quinan and Miriah Meyer. 2016. Visually comparing weather features in forecasts. IEEE Transactions on Visualization and Computer Graphics 22, 1 (2016), 389–398.

31. Penny L. Rheingans. 2000. Task-based color scale design. In 28th AIPR Workshop: 3D Visualization for Data Exploration and Decision Making. International Society for Optics and Photonics, 35–43.

32. Bernice E. Rogowitz and Lloyd A. Treinish. 1994. Using perceptual rules in interactive visualization. In IS&T/SPIE 1994 International Symposium on Electronic Imaging: Science and Technology. International Society for Optics and Photonics, 287–295.

33. Bernice E. Rogowitz, Lloyd A. Treinish, Steve Bryson, and others. 1996. How not to lie with visualization. Computers in Physics 10, 3 (1996), 268–273.

34. Samuel Silva, Beatriz Sousa Santos, and Joaquim Madeira. 2011. Using color in visualization: A survey. Computers & Graphics 35, 2 (2011), 320–333.

35. Maureen Stone, Danielle Albers Szafir, and Vidya Setlur. 2014. An engineering model for color difference as a function of size. In Color and Imaging Conference, Vol. 2014. Society for Imaging Science and Technology, 253–258.

36. Danielle Albers Szafir. 2017. Modeling Color Difference for Visualization Design. IEEE Transactions on Visualization and Computer Graphics (2017).

37. Christian Tominski, Georg Fuchs, and Heidrun Schumann. 2008. Task-driven color coding. In Information Visualisation, 2008. IV’08. 12th International Conference. IEEE, 373–380.

38. J. Gregory Trafton, Susan S. Kirschenbaum, Ted L. Tsui, Robert T. Miyamoto, James A. Ballas, and Paula D. Raymond. 2000. Turning pictures into numbers: extracting and generating information from complex visualizations. International Journal of Human-Computer Studies 53, 5 (2000), 827–850.

39. Bruce E. Trumbo. 1981. A theory for coloring bivariate statistical maps. The American Statistician 35, 4 (1981), 220–226.

40. W3C. 2016. CSS Values and Units Module Level 3. (2016). http://www.w3.org/TR/css3-values/#absolute-lengths

41. Colin Ware. 1988. Color sequences for univariate maps: Theory, experiments and principles. IEEE Computer Graphics and Applications 8, 5 (1988), 41–49.

42. Colin Ware. 2012. Information visualization: perception for design. Elsevier.

43. Liang Zhou and Charles D. Hansen. 2016. A survey of colormaps in visualization. IEEE Transactions on Visualization and Computer Graphics 22, 8 (2016), 2051–2069.


Figure 2: We tested nine colormaps (greyscale, singlehue, bodyheat, cubehelix, extbodyheat, coolwarm, rainbow, spectral, and blueyellow), selected to encompass a variety of design characteristics (illustrated by variation in luminance, saturation, and/or hue). Each colormap was tested with multiple scalar fields corresponding to increasing levels of spatial frequency. Plotted channels: luminance, saturation, CIELAB a (green-red), and CIELAB b (blue-yellow).

Stimuli

We employ digital elevation models (DEMs) as stimuli for the three experiments. A DEM represents land elevation, with cells in the 2D scalar field representing terrain height at the corresponding locations. To maintain precise control over spatial frequency and task difficulty, we synthetically generate DEMs using Perlin noise [28], mixing five octaves of the noise function to produce seemingly realistic terrain. The resulting DEMs are then normalized so that their heights span the entire colormap range. All generated maps were 820×630 pixels in size (approximately 16°×13° of visual angle‡).

By varying the scale of the noise function, we obtain scalar fields with different spatial frequency characteristics. The latter is measured by first computing a Fast Fourier Transform (FFT) over the scalar field and calculating the relative contribution of each carrier frequency from the magnitude of its FFT vector. The result can be illustrated with a power spectrum plot for each individual scalar field, with spatial frequency on the x-axis and the contribution of the frequency component on the y-axis. Low-frequency fields exhibit a more pronounced right skew in their power spectra. We compute the position of the median spatial frequency, which splits the power spectrum into approximately equal halves, and use it to order the fields. Fields with larger median frequencies indicate more varied and complex terrain structure. This procedure enabled us to synthesize scalar fields with very similar height distributions while providing precise control over spatial frequency. Figure 1 illustrates examples generated using this method.
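The median-frequency measure described above can be sketched in a few lines. This is a simplified illustration, not the authors' implementation: it works on a 1D slice rather than the full 2D field, uses a naive discrete Fourier transform from the standard library in place of an FFT, and sums magnitudes (as the text describes) rather than squared power. The helper names and the synthetic `profile` generator are hypothetical.

```python
import cmath
import math


def magnitude_spectrum(signal):
    """Naive DFT magnitudes for frequencies 1..N//2 (cycles per signal length)."""
    n = len(signal)
    mean = sum(signal) / n  # drop the DC term so a constant height offset is ignored
    mags = []
    for k in range(1, n // 2 + 1):
        coeff = sum((x - mean) * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        mags.append(abs(coeff))
    return mags


def median_frequency(signal):
    """Smallest frequency at which the cumulative magnitude reaches half the total."""
    mags = magnitude_spectrum(signal)
    half = sum(mags) / 2
    running = 0.0
    for k, m in enumerate(mags, start=1):
        running += m
        if running >= half:
            return k
    return len(mags)


def profile(freqs, n=128):
    """Hypothetical terrain slice: a sum of unit-amplitude sinusoids."""
    return [sum(math.sin(2 * math.pi * f * t / n) for f in freqs) for t in range(n)]


smooth = profile([2, 3, 4])     # energy concentrated at low frequencies
rugged = profile([8, 11, 14])   # energy shifted toward higher frequencies
print(median_frequency(smooth), median_frequency(rugged))
```

Ordering fields by this value places `smooth` before `rugged`, mirroring how fields with larger median frequencies are treated as more complex.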

Colormaps

We chose nine commonly-used colormaps (listed in Table 1 and illustrated in Figure 2). In addition to a greyscale baseline, the colormaps selected reflect five design strategies:

– Sequential: monotonically increasing luminance over a limited number of hues (singlehue, bodyheat)

– Spiral: monotonically increasing luminance with multiple hues (cubehelix, extbodyheat)

– Diverging with uniformly-stepped luminance (coolwarm, spectral)

– Diverging with uniformly-stepped saturation (blueyellow)

– Fully saturated hues (rainbow)

‡ Following [36], we derive expected visual angle measurements from pixel dimensions by assuming standard web viewing conditions. W3C-compliant browsers render HTML images at 96 DPI [40] and automatically remap this to compensate for actual display resolution. We assume a viewing distance of 30 inches. Thus, the estimated visual angle for an object of size S pixels is $\theta = 2\tan^{-1}\left(\frac{(S/2)/96}{30}\right)$.
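The visual-angle estimate given in footnote ‡ can be worked through numerically. The sketch below assumes the footnote's stated conditions (96 DPI, 30-inch viewing distance); the function names are illustrative. It also converts the study's frequency levels, defined as cycles per 410 pixels, into cycles per degree.

```python
import math

CSS_DPI = 96              # reference pixel density assumed for web content
VIEWING_DISTANCE_IN = 30  # viewing distance assumed in the footnote, in inches


def visual_angle_deg(size_px: float) -> float:
    """Estimated visual angle (degrees) subtended by an object size_px pixels wide."""
    half_size_in = (size_px / 2) / CSS_DPI
    return math.degrees(2 * math.atan(half_size_in / VIEWING_DISTANCE_IN))


def cycles_per_degree(cycles: float, span_px: float = 410) -> float:
    """Convert a frequency in cycles per span_px pixels into cycles per degree."""
    return cycles / visual_angle_deg(span_px)


print(round(visual_angle_deg(820), 1))  # map width
print(round(visual_angle_deg(630), 1))  # map height
print(round(cycles_per_degree(3), 2), round(cycles_per_degree(11), 2))
```

The 0.38 and 1.38 cycles/deg labels in Figure 11 match 3/8 and 11/8, i.e., the 410-pixel span rounded to 8°; the unrounded angle gives slightly lower values.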

All colormaps were interpolated in the CIELAB color space, with the exception of coolwarm, blueyellow, and cubehelix; these were interpolated (as originally intended) in a polar form of the LAB space [24], in the HSL space, and using a tapered RGB helix [14], respectively.

Colormap      Design strategy  Luminance control   Hues
Greyscale     luminance ramp   monotonic increase  -
Singlehue     sequential       monotonic increase  blue
Bodyheat      sequential       monotonic increase  red, yellow
Cubehelix     spiral           monotonic increase  sinusoidal RGB
Ext-bodyheat  spiral           monotonic increase  blue, red, yellow
Cool-warm     diverging        uniform, mid peak   blue, red
Rainbow       hue rotation     -                   saturated RGB
Spectral      diverging        uniform, mid peak   limited RGB
Blue-yellow   diverging        -                   blue, yellow

Greyscale: Linear black to white ramp interpolated in the CIELAB color space.

Singlehue: Monotonically increasing luminance over a single blue hue (from Color Brewer [15]).

Bodyheat: Monotonically increasing luminance with a limited hue profile, similar to a heated metal filament.

Cubehelix: Monotonically increasing luminance with sinusoidal RGB rotation [14].

Ext-bodyheat: Monotonically increasing luminance based on bodyheat, but augmented with additional blue and purple hues in the low regions [41].

Cool-warm: Diverging, with blue and red at the endpoints and soft white at the middle [24]. Uniform luminance steps with darker ends and a bright midpoint.

Rainbow: Fully saturated hue gradation (blue, green, yellow, red) interpolated in CIELAB.

Spectral: Diverging multi-hue encompassing a subset of the rainbow with a yellow middle [15]. Uniform luminance steps with darker ends and a bright midpoint.

Blue-yellow: Diverging, uniformly-stepped saturation with blue and yellow ends and 75% grey in the middle.

Table 1: The nine colormaps evaluated in this study.

Experimental Design

We investigate two independent variables: Colormap and Spatial Frequency. Colormap comprised nine distinct categories (see above), whereas Spatial Frequency is a continuous variable representing the number of cycles in 410 pixels (i.e., half the width of our map stimulus, or approximately 8° of visual angle‡). We sampled spatial frequency at five intervals: 3, 5, 7, 9, and 11. To systematically study the effect of this variable on the different colormaps, we opted for a factorial design, testing all possible 9×5 Colormap and Spatial Frequency combinations. Given the sheer number of combinations, we opted for a mixed design to make the study feasible. Participants were randomly assigned to one of three experimental conditions (illustrated in Table 2). Each condition comprised 3 of the 9 colormaps (i.e., between-subject) and all 5 frequency levels (within-subject). In effect, every participant saw 3 colormap × 5 spatial frequency combinations. Participants completed multiple trials with each combination.

Figure 3: Example stimuli from the three experiments. In experiment 1, participants indicated their response by clicking a point on the map matching a specified elevation. In experiment 2, participants were prompted to select the steeper of the two boxes. In experiment 3, participants were asked to identify the pattern corresponding to the terrain profile between two horizontally displaced markers. Example prompts: “Click on a point that has an elevation of exactly 750 feet” (experiment 1); “Terrain is steeper when there is larger change in elevation between adjacent points. Compare the steepness of terrain inside the two boxes, then click on the box that is steeper on average” (experiment 2); “Imagine a line from A to B. Select the elevation profile below that most closely matches its slope” (experiment 3). Each map is shown with a color scale labeled elevation (feet).

To equalize task difficulty, stimuli for a given spatial frequency trial were derived from the same base scalar field, with different colormaps applied. This arrangement enables us to make direct comparisons between the colormaps for a given frequency. However, it also meant that participants would see the same map three times, albeit with different colormaps. To prevent learning, scalar fields were flipped either horizontally or vertically, resulting in three unique map reflections. The order of colormap presentation was fully counterbalanced across participants to minimize residual learning or fatigue effects.

Condition  Colormaps tested                  Spatial frequencies tested
1          greyscale, cubehelix, rainbow     3, 5, 7, 9, 11
2          singlehue, extbodyheat, spectral  3, 5, 7, 9, 11
3          bodyheat, coolwarm, blueyellow    3, 5, 7, 9, 11

Table 2: Three experimental conditions; each included 3 of the 9 colormaps (i.e., between-subject variation) and all 5 levels of spatial frequency.
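The mixed design can be made concrete with a small sketch that builds one participant's session from the conditions in Table 2. Two details are assumptions made for illustration only: participants are assigned to conditions round-robin (the study assigned them randomly), and the six possible colormap orders are cycled through in sequence. Frequencies are presented in ascending order within each colormap round, as in the experiment 1 procedure. `session_plan` is a hypothetical helper.

```python
from itertools import permutations

# Conditions as listed in Table 2.
CONDITIONS = {
    1: ("greyscale", "cubehelix", "rainbow"),
    2: ("singlehue", "extbodyheat", "spectral"),
    3: ("bodyheat", "coolwarm", "blueyellow"),
}
FREQUENCIES = (3, 5, 7, 9, 11)


def session_plan(participant_index: int):
    """Return (condition, ordered list of (colormap, frequency) blocks)."""
    # Assumption: round-robin assignment stands in for random assignment.
    condition = participant_index % 3 + 1
    orders = list(permutations(CONDITIONS[condition]))  # 3! = 6 colormap orders
    order = orders[(participant_index // 3) % len(orders)]
    # One round per colormap; frequency ascends within each round.
    blocks = [(cmap, f) for cmap in order for f in FREQUENCIES]
    return condition, blocks


condition, blocks = session_plan(0)
print(condition, len(blocks), blocks[:5])
```

Each participant therefore sees 15 colormap × frequency blocks, and every six participants within a condition cover all colormap orders once.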

EXPERIMENT 1: QUANTITY ESTIMATION

The first experiment tests participants’ ability to identify locations on the map matching specified elevations. Participants were instructed to “Click on a point that has an elevation of exactly [H] feet”. Five different values for H were tested: 0, 250, 500, 750, and 1000 feet. These values correspond to the three quartiles of the color scale, as well as the min and max.

Participants

We recruited 90 participants from Amazon Mechanical Turk (50 females, 40 males) with a mean age of 34.64 years (SD = 9.53 years). Participants were first screened for color-vision deficiency using a 14-panel Ishihara test, and had to correctly identify the number in 12 of the 14 panels to qualify. We restricted the study to participants with a screen resolution of at least 1280×800 to ensure the experimental interface would fit their display. Participants received a base reward of $0.50 and a maximum bonus of $3.00 based on the percentage of correctly solved tasks (for a possible total of $3.50).

Procedure

After signing up for the study, participants were directed to an external link that displayed the experiment within a web interface. Participants entered their MTurk ID and were presented with an information sheet about the study. They were then presented with the color-vision qualification test. Those who successfully passed the test were given a set of 6 training trials and provided with feedback on their accuracy. Participants had to identify a location that is within a 5% margin from the specified height before proceeding to the next training trial.

The main portion of the experiment consisted of 3 rounds, one with each of the 3 colormaps. In each round, participants saw the five spatial frequency levels in ascending order, providing a progression from simple to more complex maps. The order of colormap presentation was fully counterbalanced across participants using a Latin square design. Participants completed 5 trials with each colormap and spatial frequency combination, corresponding to the 5 tested quantities (0, 250, 500, 750, and 1000), presented in random order. A color scale was displayed to the right of the map, and the range of the scale was fixed at [0–1000] feet (see Figure 3). In each trial, participants first saw the question and clicked on ‘Show Map’ to reveal the stimulus. They then indicated their response by clicking on the map to mark their selected location, and clicked ‘Next’. To aid participants in accurately selecting locations, the mouse cursor was changed to a crosshair with a hollowed-out center (so as not to obscure the focal pixel).

Results

We computed an ‘error’ measurement for each response by taking the absolute difference between the requested elevation and the elevation at the point clicked by the participant. We then applied the following log transform [10, 16]:

$\log_2(\text{error}) = \log_2\left(|\text{judged percent} - \text{true percent}| + \tfrac{1}{8}\right)$
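As a worked illustration of this transform, the sketch below computes the log error for a single response. It assumes that elevations are converted to percentages of the fixed [0, 1000] feet color scale before differencing; that conversion and the function name are assumptions made for the example, not details confirmed by the text.

```python
import math

SCALE_MAX_FT = 1000.0  # the color scale range was fixed at [0, 1000] feet


def log_error(requested_ft: float, clicked_ft: float) -> float:
    """log2 of the absolute percent error, offset by 1/8 so zero error stays finite."""
    # Assumption: percentages are taken relative to the full color scale.
    judged_pct = 100.0 * clicked_ft / SCALE_MAX_FT
    true_pct = 100.0 * requested_ft / SCALE_MAX_FT
    return math.log2(abs(judged_pct - true_pct) + 1 / 8)


print(log_error(750, 750))  # a perfect response gives log2(1/8)
print(log_error(750, 790))  # a response 4 percentage points off
```

The 1/8 offset bounds the measure from below at log2(1/8) = -3, so perfect responses do not produce an undefined logarithm.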

We removed three participants from the analysis (amounting to 3.3% of subjects) because their overall accuracy was two standard deviations below the mean accuracy for all participants (M = 85.36, SD = 9.48). We analyze the results by fitting the log of error to a linear mixed-effects model comprising two fixed effects (colormap, frequency) and two random effects. The first random effect accounts for individual variations among participants, and the second for intra-trial variations (recall that trials comprised different test elevations).

Figure 4: Mean log of error in quantity estimation (experiment 1 vs. model), plotted against spatial frequency (3, 5, 7, 9, 11) with one panel per colormap. Ribbons represent 95% CIs of the experimental results.

Figure 5: Mean log error in quantity estimation by colormap (left) and by colormap × spatial frequency. Intervals are 95% CIs.

Figure 4 illustrates the mean log error by colormap and spatial frequency (the dashed trendline represents the model). The experimental results are combined in Figure 5 to ease comparison. A likelihood ratio test indicates the overall model is significant (χ²(18) = 1481.7, p < 0.001). To test for interaction between spatial frequency and colormap, we fit a reduced model that accounts for both frequency and colormap but not their interaction. There was no significant difference between the full and the reduced model (χ²(8) = 10.695, p = 0.219), thus ruling out an interaction between colormap and spatial frequency. We will therefore interpret the reduced model, which accounts for both factors independently. Table 3 lists the model coefficients.

Coefficient          Estimate   |t value|   p
(Intercept)            1.77       6.628     ***
singlehue             -0.26       2.253     *
bodyheat              -0.71       6.118     ***
cubehelix             -1.27      16.467     ***
extbodyheat           -1.39      12.073     ***
coolwarm              -1.18      10.122     ***
rainbow               -2.33      30.264     ***
spectral              -1.75      15.172     ***
blueyellow            -0.80       6.871     ***
Spatial Frequency      0.11      16.967     ***

Table 3. Effects of colormap and spatial frequency on the log of error in quantity estimation. The intercept represents greyscale as colormap (*** = p < 0.001, ** = p < 0.01, * = p < 0.05).

Figure 6. Average local gradient (i.e., terrain slope) at locations selected by participants, by spatial frequency. Error bars are 95% CIs.

The model predicts that a step-increase in spatial frequency yields a 0.11 increase in the log of estimation error. The difference in estimation error between the highest (f=11) and lowest (f=3) frequency levels is approximately 0.9 orders of magnitude (base 2, given the log2 transform). The effect of color encoding was equally evident: all the colormaps were significantly better than greyscale. However, the gain in accuracy was markedly different between the colormaps. Rainbow had the largest impact on estimation accuracy, reducing error by approximately 2.3 orders of magnitude compared to greyscale. The runner-up was spectral, which also contains substantial hue variation. However, spectral reduced error by 1.75 orders of magnitude only. On the other hand, Spiral colormaps (extbodyheat, cubehelix), which comprise multiple hues over a monotonically increasing luminance, decreased estimation errors by approximately 1.3–1.4 orders of magnitude, compared to 0.8–1.2 for Diverging ramps (blueyellow, coolwarm). Sequential schemes (singlehue and bodyheat) had the least impact on error, with a mere improvement of 0.26–0.71 orders of magnitude relative to greyscale.
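The Table 3 estimates can be turned into predictions with a few lines of Python. This is a hedged sketch rather than the fitted model itself: it uses the fixed effects only, and it assumes that spatial frequency enters the model as the numeric level (3 to 11), which is consistent with the reported 0.11 per step and the roughly 0.9 difference between f=3 and f=11. All names are illustrative.

```python
# Fixed-effect estimates transcribed from Table 3 (reduced model, log2 units).
INTERCEPT = 1.77          # greyscale reference level
FREQUENCY_SLOPE = 0.11    # change in log error per step in spatial frequency
COLORMAP_OFFSET = {
    "greyscale": 0.0, "singlehue": -0.26, "bodyheat": -0.71,
    "cubehelix": -1.27, "extbodyheat": -1.39, "coolwarm": -1.18,
    "rainbow": -2.33, "spectral": -1.75, "blueyellow": -0.80,
}


def predicted_log_error(colormap: str, frequency: float) -> float:
    """Fixed-effects prediction; participant and trial random effects are ignored."""
    return INTERCEPT + COLORMAP_OFFSET[colormap] + FREQUENCY_SLOPE * frequency


def fold_reduction_vs_greyscale(colormap: str) -> float:
    """Approximate factor by which (error + 1/8) shrinks relative to greyscale."""
    return 2.0 ** (-COLORMAP_OFFSET[colormap])


for name in ("greyscale", "coolwarm", "rainbow"):
    low = predicted_log_error(name, 3)
    high = predicted_log_error(name, 11)
    factor = fold_reduction_vs_greyscale(name)
    print(f"{name:10s} f=3: {low:5.2f}  f=11: {high:5.2f}  reduction x{factor:.1f}")
```

Since the transform is base 2, a coefficient of -2.33 corresponds to roughly a fivefold reduction in the offset error, which gives a more concrete reading of "orders of magnitude" here.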

The fact that we did not find interaction between colormap and spatial frequency suggests that the relative effectiveness of the different colormaps is stable across all spatial frequency levels tested. Rainbow is thus expected to be the most accurate colormap for quantity estimation, regardless of how spatially complex the data is. However, estimation accuracy will decrease comparably for all colormaps as the data becomes more spatially varied. This could reflect a combination of perceptual and motor difficulty in locating and clicking the intended location, due to the larger local gradients encountered in high-frequency maps (see Figure 6).

EXPERIMENT 2: GRADIENT PERCEPTION
The second experiment tests participants’ accuracy in comparing and judging the steepness of gradients. The ability to judge how fast the encoded quantities change between adjacent map locations is important in many contexts.

Participants
We recruited 126 participants (50 females, 74 males, 2 others) with a mean age of 35.62 years (STD = 9.55 years). Participants had an overall success rate of 67.75% (STD = 11.41%). Ten participants (7.9% of subjects) were dropped from the analysis because their overall accuracy was worse than chance, having correctly answered less than 50% of trials in a two-alternative forced-choice experiment.

Figure 7. Probability of successful gradient judgment (experiment 2 vs. model), plotted against spatial frequency for each colormap. Ribbons represent 95% CIs of the experimental results.

Figure 8. Percentage of correctly answered trials in a gradient perception task (experiment 2). Intervals are 95% CIs.

Procedure
Each trial consisted of a map with two squares juxtaposed on top (see Figure 3). Participants were prompted to “compare the steepness of terrain inside the two boxes” and “click on the box that is steeper on average”. The two boxes were identical in size (175×175 pixels, or 3.5°×3.5° of visual angle). However, terrain steepness, which was calculated by taking the average first derivative within each box, was varied systematically. The gradient-ratio between the flatter and the steeper boxes was fixed to one of four levels: 0.8, 0.83, 0.86, 0.9 (±0.05). A lower ratio implies a larger, and potentially more perceptible, slope difference, making the task easier. However, the two boxes encompassed terrain with identical height ranges to reduce variability in the appearance of their peaks (a potential confound in slope judgment [26]).
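The gradient-ratio computation described above can be sketched as follows. This is an illustrative Python reading, not the authors' stimulus generator: it assumes "average first derivative" means the mean gradient magnitude estimated with central differences, and the function names and toy terrain are hypothetical.

```python
import math


def mean_slope(height, top, left, size):
    """Average gradient magnitude over a size x size box of a 2D height grid.

    Uses central differences, so the box must not touch the grid border.
    """
    rows, cols = len(height), len(height[0])
    if top < 1 or left < 1 or top + size > rows - 1 or left + size > cols - 1:
        raise ValueError("box must lie strictly inside the grid")
    total = 0.0
    for r in range(top, top + size):
        for c in range(left, left + size):
            dy = (height[r + 1][c] - height[r - 1][c]) / 2.0
            dx = (height[r][c + 1] - height[r][c - 1]) / 2.0
            total += math.hypot(dx, dy)
    return total / (size * size)


def gradient_ratio(height, box_a, box_b, size):
    """Ratio of the flatter box's mean slope to the steeper box's (at most 1).

    Assumes at least one of the two boxes is not perfectly flat.
    """
    slope_a = mean_slope(height, box_a[0], box_a[1], size)
    slope_b = mean_slope(height, box_b[0], box_b[1], size)
    return min(slope_a, slope_b) / max(slope_a, slope_b)


# Toy terrain: slope 1.0 per pixel up to column 9, then 0.8 per pixel.
terrain = [[c if c <= 9 else 9 + 0.8 * (c - 9) for c in range(20)] for _ in range(6)]
ratio = gradient_ratio(terrain, (1, 1), (1, 12), 4)
print(round(ratio, 3))
```

On this toy terrain the ratio comes out at about 0.8, which corresponds to the easiest of the four difficulty levels.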

Participants first completed a set of 6 training trials that included feedback before proceeding to the main trials. The order of stimuli was similar to the previous experiment: the study consisted of 3 rounds, one with each of the 3 colormaps the participant was assigned to see. Each round encompassed all 5 spatial frequency levels. Participants completed 4 trials with each colormap and frequency combination, spanning a range of easy to difficult tests (a total of 60 trials). The order of colormap presentation was fully counterbalanced across participants.

Results
Figure 7 illustrates participants’ probability of correctly identifying the steeper gradient. The experimental data is shown separately in Figure 8. We fit the results to a logistic regression model comprising two fixed effects (colormap, frequency). The model also included two random effects to account for individual differences among participants and intra-trial variations (recall that trials varied in difficulty). The model essentially predicts the odds of correctly identifying the steeper gradient. A likelihood ratio test indicates the overall model is significant (χ²(17) = 421.65, p < 0.001). The model correctly predicts 74.67% of outcomes. We find significant interaction between colormap and spatial frequency (χ²(8) = 27.81, p < 0.001). Table 4 shows model coefficients.

a. Main effects
Coef.          Est.    |z|     p
(Intercept)    0.80    0.516
singlehue      1.14    0.415
bodyheat       0.94    0.183
cubehelix      0.90    0.354
extbodyheat    1.13    0.388
coolwarm       0.66    1.316
rainbow        0.65    1.354
spectral       0.92    0.259
blueyellow     0.91    0.281
Frequency      1.15    4.633   ***

b. Interaction effects (colormap × frequency)
Coef.          Est.    |z|     p
singlehue      0.96    0.910
bodyheat       1.04    1.020
cubehelix      1.06    1.303
extbodyheat    1.03    0.596
coolwarm       1.14    3.008   **
rainbow        1.13    2.876   **
spectral       1.06    1.330
blueyellow     1.12    2.471   *

Table 4. Main effects of colormap and spatial frequency on success odds in gradient judgment (a) and their interaction (b). Coefficients shown correspond to the exponentiated model estimates to reflect odds ratios. The intercept represents greyscale (*** = p < 0.001, ** = p < 0.01, * = p < 0.05).

The main-effect coefficients for all colormaps were not significant, indicating that all colormaps perform comparably to greyscale at low spatial frequencies. Participants are thus unlikely to benefit from the use of color when judging gradients in low-variance data. The main effect of spatial frequency, however, is significant. The model estimates that a step-increase in spatial frequency improves the odds of correct judgment by 15%. Estimating gradients appears to be easier in maps with more complex spatial structures.

The model indicates several noteworthy interactions. Although the use of color had no significant effect in low-frequency maps, several colormaps significantly outperformed greyscale at high frequency. The divergent coolwarm improved participants’ success odds by 14% for every step-increase in spatial frequency. Similarly, rainbow and blueyellow increased the odds by approximately 13% and 12%, respectively. Notably, these three colormaps contain substantial variation in saturation (coolwarm and blueyellow) or hue (rainbow). All other colormaps tested were not reliably different from greyscale.
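To see how the odds ratios in Table 4 combine, the Python sketch below converts them into a predicted success probability. It is an illustration under stated assumptions, not the fitted model: random effects are ignored, and spatial frequency is assumed to enter as the numeric level (3 to 11). The function and dictionary names are hypothetical.

```python
# Exponentiated estimates transcribed from Table 4 (greyscale is the reference).
INTERCEPT_ODDS = 0.80
FREQUENCY_OR = 1.15
MAIN_OR = {
    "greyscale": 1.00, "singlehue": 1.14, "bodyheat": 0.94, "cubehelix": 0.90,
    "extbodyheat": 1.13, "coolwarm": 0.66, "rainbow": 0.65, "spectral": 0.92,
    "blueyellow": 0.91,
}
INTERACTION_OR = {
    "greyscale": 1.00, "singlehue": 0.96, "bodyheat": 1.04, "cubehelix": 1.06,
    "extbodyheat": 1.03, "coolwarm": 1.14, "rainbow": 1.13, "spectral": 1.06,
    "blueyellow": 1.12,
}


def success_probability(colormap: str, frequency: float) -> float:
    """Fixed-effects probability of picking the steeper box."""
    per_step = FREQUENCY_OR * INTERACTION_OR[colormap]  # odds multiplier per step
    odds = INTERCEPT_ODDS * MAIN_OR[colormap] * per_step ** frequency
    return odds / (1.0 + odds)


for name in ("greyscale", "coolwarm", "rainbow"):
    print(f"{name:10s} f=3: {success_probability(name, 3):.2f}"
          f"  f=11: {success_probability(name, 11):.2f}")
```

Under these assumptions greyscale and coolwarm start out close together at f=3 (both near 0.55) and separate as frequency rises, which matches the pattern of a non-significant main effect paired with a significant interaction.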

EXPERIMENT 3: PATTERN PERCEPTION
Having tested accuracy in quantity estimation and gradient perception, we now evaluate participants’ ability to integrate these two skills. Experiment 3 required participants to extract a longitudinal pattern from the map and match it to an external representation, a task originally devised by Hyslop [18].

Participants
We recruited 165 participants (79 females, 84 males, 2 others). The mean participant age was 36.04 years (STD = 11.71). Overall, participants had a mean success rate of 78.51% in matching the correct pattern (STD = 19.31%). We dropped seven participants from the analysis (4.2% of subjects) whose overall accuracy was two standard deviations below the mean.

Figure 9. Probability of successful pattern matching (experiment 3 vs. model), plotted against spatial frequency for each colormap. Ribbons denote 95% CIs of the experimental data.

Figure 10. Percentage of correctly answered trials in experiment 3. Intervals are 95% CIs.

Procedure
Participants first completed a set of 6 training trials that included feedback before proceeding to the main experiment. Each trial consisted of a map with two markers labeled A and B (see Figure 3). The markers were horizontally displaced by 350 pixels (7° of visual angle). Participants were given the following prompt: “Imagine a line from A to B. Select the elevation profile below that most closely matches its slope.” They then selected a choice among a set of 6 patterns, including the actual elevation profile and 5 other distractors. Distractors were generated from the same map so as to reflect similar spatial frequency characteristics, and had to be 65–70% similar to the actual profile (as measured by dynamic time warping [20]). Additionally, profiles and distractors were selected to not have peaks or valleys at the endpoints. These criteria, determined after a pilot, ensure similar task difficulty across the trials.
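For readers unfamiliar with the similarity measure, the following is a minimal Python sketch of the textbook dynamic time warping distance between two elevation profiles. It is not the indexing method of [20] nor the authors' code, and the source does not say how a distance was converted into the 65–70% similarity figure, so that step is deliberately left out.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance.

    cost[i][j] holds the cheapest alignment of a[:i] with b[:j], where each
    aligned pair contributes the absolute difference of its two samples.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # a advances alone
                                    cost[i][j - 1],      # b advances alone
                                    cost[i - 1][j - 1])  # both advance
    return cost[n][m]


profile = [0, 1, 2, 1, 0]
delayed = [0, 0, 1, 2, 1, 0]   # same shape, shifted by one sample
taller = [0, 2, 4, 2, 0]       # same timing, doubled amplitude

print(dtw_distance(profile, delayed))
print(dtw_distance(profile, taller))
```

The delayed copy has zero distance because warping absorbs the shift, while the taller profile does not, which is why the measure suits distractors that should resemble the target's shape without duplicating it.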

The order of stimuli was similar to the previous two experiments: the study consisted of 3 rounds, one with each of the 3 colormaps the participant was assigned to see, and encompassing the 5 spatial frequency levels. Thus, every participant saw 3×5 colormap and frequency combinations, and completed 3 pattern matching trials with each combination, for a total of 45 trials. As in the previous experiments, the order of colormap presentation was fully counterbalanced.

Results
We fit the results to a logistic regression model comprising two fixed effects (colormap, frequency) and two random effects to account for individual differences and intra-trial variations. Figure 9 shows the probability of successful profile matching. The experimental results are illustrated separately in Figure 10. A likelihood ratio test indicates the model is significant (χ²(17) = 394.67, p < 0.001). We find significant interaction between colormap and spatial frequency (χ²(8) = 18.131, p < 0.05): the relative effectiveness of the colormaps appears to vary with spatial frequency.

a. Main effects
Coef.          Est.    |z|     p
(Intercept)    9.03    6.502   ***
singlehue      1.34    0.649
bodyheat       0.80    0.503
cubehelix      0.77    0.681
extbodyheat    0.81    0.466
coolwarm       0.44    1.868   .
rainbow        0.98    0.064
spectral       0.57    1.284
blueyellow     0.73    0.697
Frequency      0.93    2.058   *

b. Interaction effects (colormap × frequency)
Coef.          Est.    |z|     p
singlehue      0.98    0.450
bodyheat       1.02    0.456
cubehelix      1.09    1.781   .
extbodyheat    1.04    0.797
coolwarm       1.17    3.143   **
rainbow        1.01    0.259
spectral       1.11    2.198   *
blueyellow     1.09    1.683   .

Table 5. Main effects of colormap and spatial frequency on success odds in experiment 3 (a) and their interaction (b). Coefficients depict exponentiated model estimates to reflect odds ratios. The intercept corresponds to greyscale (*** = p < 0.001, ** = p < 0.01, * = p < 0.05, . = p < 0.1).

Overall, we find a significant detrimental main effect of spatial frequency on pattern perception, as indicated by a 0.93 Frequency coefficient (Table 5a). This translates to a 7% drop in the odds of correctly matching the profile for every step-increase in spatial frequency. The main effect coefficients for all colormaps were not significant, indicating that the use of color at low spatial frequency is unlikely to improve pattern perception, as compared to a plain greyscale ramp.

Colormap performance begins to diverge at high spatial frequency. Only two colormaps have significant and large enough odds-ratio coefficients (i.e., > 1.07) to overcome the frequency-induced perceptual difficulty: spectral and coolwarm increased the odds of correct pattern matching by 11% and 17%, respectively, for every step-increase in spatial frequency (after adjusting for frequency effects alone). Additionally, blueyellow and cubehelix were associated with a 9% improvement, but the advantage was not reliable (p < 0.1). On the other hand, extbodyheat, bodyheat, and rainbow had small (and insignificant) odds-ratio coefficients (0.98–1.04 < 1.07), indicating that, similar to greyscale, they are associated with lower success odds in complex maps.
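One way to read the 1.07 threshold is as the break-even point of the frequency effect: in a logistic model with an interaction term, the per-step odds ratio for a given colormap is the product of the Frequency odds ratio and that colormap's interaction odds ratio, so the interaction must exceed 1/0.93 for the net effect to be positive. The short Python sketch below works through this arithmetic using the Table 5 values; the variable names are illustrative.

```python
FREQUENCY_OR = 0.93  # Table 5a: odds ratio per frequency step (greyscale reference)
INTERACTION_OR = {   # Table 5b: colormap x frequency odds ratios
    "singlehue": 0.98, "bodyheat": 1.02, "cubehelix": 1.09, "extbodyheat": 1.04,
    "coolwarm": 1.17, "rainbow": 1.01, "spectral": 1.11, "blueyellow": 1.09,
}

# An interaction odds ratio above this value offsets the frequency penalty.
break_even = 1.0 / FREQUENCY_OR

# Net odds multiplier per frequency step for each colormap.
net_or = {name: FREQUENCY_OR * value for name, value in INTERACTION_OR.items()}

print(f"break-even interaction odds ratio: {break_even:.3f}")
for name, value in sorted(net_or.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:12s} net odds ratio per step: {value:.3f}")
```

Note that cubehelix and blueyellow (1.09) also clear the break-even value numerically; they are set apart in the text because their interaction terms only reach p < 0.1.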

In short, only two of the tested colormaps (coolwarm and spectral) appear to reliably support pattern perception at high spatial frequency. Both consist of a diverging ramp with uniformly-stepped luminance. All other colormaps (including greyscale) suffered as data complexity increased.

DISCUSSION AND GUIDELINES
Our work sheds new light on how spatial complexity impacts the perception of continuous color-coded maps. The experiments also led to some surprising findings that are at odds with current guidelines. We interpret these results and accordingly devise new task- and frequency-aware color mapping guidelines (indicated by ★). We also rank the tested colormaps and summarize our guidelines in Figure 11.

Figure 11. Model-derived colormap ranking and guidelines by task and spatial frequency (* = p < 0.05, . = p < 0.1 relative to greyscale). Panels: quantity estimation (ranking unaffected by spatial frequency), gradient perception, and pattern perception, the latter two at low (0.38 cycle/deg) and high (1.38 cycle/deg) spatial frequency. Colormaps are ranked from better to worse, with some groups marked as showing no significant difference. Guidelines: (1) Maximize range of saturated hues, regardless of spatial frequency. (2) At high spatial frequency: fully-saturated hues or diverging ramps with chroma variation. (3) At high spatial frequency: diverging ramps with uniformly-stepped luminance.

Quantity Estimation
Our first hypothesis (H1) predicts hue- and saturation-varying ramps to be more accurate at low spatial frequencies, and ramps with monotonically increasing luminance to be more accurate at high frequencies. As discussed, H1 is based on the relative contrast-sensitivity of our visual system [32]. A quantity estimation task (experiment 1) shows no interaction between colormap and spatial frequency. While increased spatial complexity is associated with higher estimation error, the effect is similar across all colormaps. We thus reject H1.

On the other hand, results provide support for H2, which predicts that hue-varying ramps will lead to more accurate estimation. Indeed, the top performing colormaps (rainbow and spectral) contain substantial hue variation. Results from experiment 1 thus replicate earlier findings by Ware [41], but also extend them to show that spatial frequency has no apparent impact on the effectiveness of hue-varying ramps. Our data shows that rainbow and spectral are the most accurate among the colormaps tested, even at the highest levels of spatial frequency. Altogether, these results lend further support to the theory that lookup errors in color-coded maps are largely caused by systematic simultaneous contrast shifts [41], rather than being affected by contrast sensitivity modulation [32]. These shifts are best counteracted with colormaps that vary non-monotonically along one or more perceptual channels.

A corollary result is that mixing monotonic luminance with hue variation would lead to significant accuracy loss. Indeed, data from experiment 1 indicates that rainbow is approximately an order of magnitude more accurate than extbodyheat and cubehelix. These Spiral colormaps are designed to be more accurate rainbow alternatives for interval data [4, 24]. To the contrary, we find that they reduce accuracy compared to a purely hue-varying ramp. This finding suggests that, when estimating a continuously coded spatial quantity, people benefit most from a large dynamic hue range. Incorporating monotonically increasing lightness within the colormap would necessarily reduce the hue range, thereby diminishing accuracy.

★ Guideline 1: We recommend maximizing hue variation to improve quantity estimation, irrespective of spatial frequency.

Gradient Perception
Gradient perception allows people to distinguish how quickly the encoded attribute changes between adjacent locations, an essential skill when evaluating the distribution and variance of spatial data. We find that the task is strongly modulated by the data’s spatial complexity: increased spatial frequency appears to enhance the perception of gradients. This is unsurprising, as maps with jagged surfaces are likely to exhibit more pronounced (and thus more perceptible) differences in slope. Colormap effectiveness was also impacted by spatial frequency: color encoding did not help participants distinguish gradients at low frequency levels, as all colormaps showed similar performance to greyscale. However, three colormaps demonstrated significant advantage at high frequencies. Coolwarm, rainbow, and blueyellow improved perception odds by 12–14% for every step-increase in spatial frequency. All three employed one of two design strategies: a diverging ramp with varying saturation, or a fully saturated hue rotation.

The above results contradict H1, which predicts hue- and saturation-varying colormaps to perform better at low frequencies. In fact, we see the opposite. The results also do not support H3, which predicts better performance for monotonically-luminant ramps in structure perception tasks. In fact, all three top-performing ramps exhibit non-monotonic luminance.

★ Guideline 2: For tasks requiring gradient perception at high spatial frequency, we recommend a range of fully saturated hues (e.g., rainbow) or diverging chroma-varying ramps (e.g., coolwarm or blueyellow).

Pattern Perception
Experiment 3 prompted participants to match the elevation profile along a horizontal path with an external pattern. We expected colormaps with monotonically increasing luminance to be more accurate at this task (H3), but results were not entirely consistent with this prediction. While all tested colormaps had comparable performance at low spatial frequency, only two colormaps, coolwarm and spectral, gave participants higher odds of successfully matching the pattern at high frequency. Both colormaps comprise a diverging ramp with uniformly-stepped (though not strictly monotonic) luminance. By contrast, sequential and spiral ramps performed just as poorly as greyscale in complex maps, and so did rainbow.

The above results are consistent with Moreland’s argument that diverging ramps provide “maximal perceptual resolution” (through increasing and decreasing luminance intervals) [24], potentially enabling high-frequency patterns to be resolved more easily. Our results may also explain why diverging schemes performed better in medical diagnosis [3]: we suspect such tasks to require the analysis of potentially high-frequency features (e.g., small tissue aberrations).

★ Guideline 3: We recommend diverging ramps with equidistant luminance steps (e.g., coolwarm and spectral) to support the perception of longitudinal patterns at high spatial frequency. Rainbow, Sequential, and Spiral schemes should be avoided in complex maps, especially if the task involves the analysis and matching of fine-grained features.

Yet Another Look at the Rainbow
Results of experiments 1 and 2 may shed light on why rainbow remains a popular choice among scientists [25], despite being considered a bad choice by the visualization community [4, 30]. Our data reveals that, counterintuitively, rainbow is robust for estimating a smoothly varying quantitative attribute, regardless of spatial complexity. Moreover, rainbow provides good support for gradient estimation at high spatial frequency. These two tasks correspond to elementary visual analytic primitives, including characterizing distributions, determining ranges, and filtering [1]. Moreover, studies show that when experts attempt to form a mental model about a visualization, they first go through a time-consuming process of extracting quantitative data “at a rather detailed level” [38]. For instance, a weather forecaster will look up pressure and wind changes, estimating current readings at landmark locations in the map before making a forecast. Our data suggests that rainbow provides good support for these tasks, making it a potentially reasonable choice for weather forecasters.

Critique of rainbow centers on its tendency to create sharp visual boundaries, particularly around its yellow regions [4]. Experts also criticize the use of fully saturated hues [24], which result in non-uniform perceptual steps within the color ramp. The common intuition is that these two factors combined will inevitably distort the perception of quantities. We do not see evidence to support this hypothesis. In fact, to the contrary, attempts to ‘linearize’ the rainbow by monotonically increasing the luminance of hues could reduce estimation accuracy by up to an order of magnitude.

★ Guideline 4: Rather than entirely discouraging the use of rainbow, we suggest that it can be a reasonable design choice for conveying spatial distributions and variances, and in tasks that require quantitative, as opposed to geometric, precision. However, rainbow has a number of limitations. The use of green and red hues is problematic for people with color deficiency. Moreover, rainbow is probably ineffective at revealing high-frequency patterns. Interestingly, these shortcomings are balanced by diverging ramps (e.g., coolwarm), which, although quantitatively inaccurate, appear to support pattern perception at high spatial frequency. We thus argue that hue-varying and diverging colormaps support orthogonal tasks in continuous maps, and should therefore be considered as complementary rather than mutually exclusive choices.

LIMITATIONS AND FUTURE WORK
There are some limitations to our work that should be considered. First, as with other crowdsourced graphical perception studies, we gain access to a larger pool of participants but sacrifice some experimental control [16]. Particularly relevant to our study are the variations in participants’ monitors, including color calibration and display resolution, as well as the illumination conditions in their homes or offices, all of which can impact color perception. We could not control these factors, but attempted to counteract their variation by involving a larger sample (N=381). Although we expect crowdsourcing to improve the ecological validity of results and guidelines, uncontrolled variations can potentially reduce our ability to detect small but otherwise significant differences in performance between tested conditions. Future lab studies should therefore attempt to replicate our findings with added controls.

Second, our study employed a limited set of tasks designed to measure elementary perceptual operators, including quantity estimation, gradient perception, and pattern matching. There is an opportunity to test higher-level tasks that mimic scientific analyses more closely, including the identification and comparison of larger map features (e.g., fronts, ridges). Additionally, some of the tasks we tested could be re-evaluated in more authentic formulations. For instance, a metric task could require participants to estimate the quantity at a specific location on the map. This formulation is arguably more realistic than the task we tested, which simply asked participants to click any location thought to match a specified quantity.

Third, our analysis was focused exclusively on spatial frequency, and there are good reasons to consider this factor [11, 32]. However, there are also additional data characteristics to consider, including, for instance, the distribution of amplitudes within the map. Such factors will influence the distribution of colors in the image and may thus impact perception.

Lastly, we limited our study to synthetically generated scalar fields to precisely vary spatial frequency while controlling for other confounds. However, synthetic stimuli may also introduce (unknown) perceptual or cognitive biases. Therefore, additional studies are needed to replicate our findings with datasets from real-world domains (e.g., meteorology, geophysics, or oceanography) and with domain experts. We also restricted this study to participants with normal color vision. Therefore, our results may not generalize to the approximately 5% of the population who have some form of color deficiency.

CONCLUSIONS
We conducted three experiments to investigate the effects of spatial frequency and colormap characteristics on the perception of continuous pseudocolor maps. Our results indicate that spatial frequency impacts judgment of the encoded quantities and structures. While viewers’ quantity estimation accuracy exhibited a predictable response, increased data complexity had a more nuanced effect on gradient and pattern comprehension, the impact of which was dependent on the colormap used. Designers should therefore consider both the type of task and the spatial complexity of the underlying data. We re-examined current guidelines and devised new recommendations for color-coding of continuous spatial data.

REFERENCES
1. Robert Amar, James Eagan, and John Stasko. 2005. Low-level components of analytic activity in information visualization. In Information Visualization, 2005. INFOVIS 2005. IEEE Symposium on. IEEE, 111–117.

2. Lawrence D. Bergman, Bernice E. Rogowitz, and Lloyd A. Treinish. 1995. A rule-based tool for assisting colormap selection. In Proceedings of the 6th conference on Visualization ’95. IEEE Computer Society, 118.

3. Michelle Borkin, Krzysztof Gajos, Amanda Peters, Dimitrios Mitsouras, Simone Melchionna, Frank Rybicki, Charles Feldman, and Hanspeter Pfister. 2011. Evaluation of artery visualizations for heart disease diagnosis. IEEE Transactions on Visualization and Computer Graphics 17, 12 (2011), 2479–2488.

4. David Borland and Russell M. Taylor II. 2007. Rainbow color map (still) considered harmful. IEEE Computer Graphics and Applications 27, 2 (2007).

5. Cynthia A. Brewer. 1994. Visualization in Modern Cartography. Elsevier Science, Chapter Color use guidelines for mapping and visualization.

6. Cynthia A. Brewer. 1996. Guidelines for selecting colors for diverging schemes on maps. The Cartographic Journal 33, 2 (1996), 79–86.

7. Cynthia A. Brewer, Alan M. MacEachren, Linda W. Pickle, and Douglas Herrmann. 1997. Mapping mortality: Evaluating color schemes for choropleth maps. Annals of the Association of American Geographers 87, 3 (1997), 411–438.

8. Roxana Bujack, Terece L. Turton, Francesca Samsel, Colin Ware, David H. Rogers, and James Ahrens. 2017. The Good, the Bad, and the Ugly: A Theoretical Framework for the Assessment of Continuous Colormaps. IEEE Transactions on Visualization and Computer Graphics (2017).

9. William S. Cleveland and Robert McGill. 1983. A color-caused optical illusion on a statistical graph. The American Statistician 37, 2 (1983), 101–105.

10. William S. Cleveland and Robert McGill. 1984. Graphical perception: Theory, experimentation, and application to the development of graphical methods. J. Amer. Statist. Assoc. 79, 387 (1984), 531–554.

11. Russell De Valois and Karen De Valois. 1990. Spatial Vision. Oxford University Press.

12. Russell L. De Valois, Duane G. Albrecht, and Lisa G. Thorell. 1978. Cortical cells: bar and edge detectors, or spatial frequency filters? In Frontiers in Visual Science. Springer, 544–556.

13. Connor C. Gramazio, David H. Laidlaw, and Karen B. Schloss. 2017. Colorgorical: Creating discriminable and preferable color palettes for information visualization. IEEE Transactions on Visualization and Computer Graphics 23, 1 (2017), 521–530.

14. D. A. Green. 2011. A colour scheme for the display of astronomical intensity images. Bulletin of the Astronomical Society of India 39 (June 2011), 289–295.

15. Mark Harrower and Cynthia A. Brewer. 2003. ColorBrewer.org: an online tool for selecting colour schemes for maps. The Cartographic Journal 40, 1 (2003), 27–37.

16. Jeffrey Heer and Michael Bostock. 2010. Crowdsourcing graphical perception: using mechanical turk to assess visualization design. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 203–212.

17. G. T. Herman and H. Levkowitz. 1992. Color scales for image data. IEEE Computer Graphics and Applications 12, 1 (1992), 72–80.

18. Michael D. Hyslop. 2006. A comparison of spectral color and greyscale continuous-tone map perception. Master’s thesis. Michigan State University.

19. Alan David Kalvin, Bernice E. Rogowitz, Adar Pelah, and Aron Cohen. 2000. Building perceptual color maps for visualizing interval data. In Human Vision and Electronic Imaging V, Vol. 3959. International Society for Optics and Photonics, 323–336.

20. Eamonn Keogh and Chotirat Ann Ratanamahatana. 2005. Exact indexing of dynamic time warping. Knowledge and Information Systems 7, 3 (2005), 358–386.

21. Gordon Kindlmann, Erik Reinhard, and Sarah Creem. 2002. Face-based luminance matching for perceptual colormap generation. In Proceedings of the IEEE Conference on Visualization ’02. IEEE Computer Society, 299–306.

22. Robert Kosara. 2016. An Empire Built On Sand: Reexamining What We Think We Know About Visualization. In Proceedings of the Beyond Time and Errors on Novel Evaluation Methods for Visualization. ACM, 162–168.

23. Mark P. Kumler and Richard E. Groop. 1990. Continuous-tone mapping of smooth surfaces. Cartography and Geographic Information Systems 17, 4 (1990), 279–289.

24 Kenneth Moreland 2009 Diverging color maps forscientific visualization In International Symposium onVisual Computing Springer 92ndash103

25 Kenneth Moreland 2016 Why We Use Bad Color Mapsand What You Can Do About It Electronic Imaging2016 16 (2016) 1ndash6

26 Lace Padilla P Samuel Quinan Miriah Meyer andSarah H Creem-Regehr 2017 Evaluating the Impact ofBinning 2D Scalar Fields IEEE Transactions onVisualization and Computer Graphics 23 1 (2017)431ndash440

27 Stephen E Palmer 1999 Vision science Photons tophenomenology MIT press

28 Ken Perlin 1985 An image synthesizer ACMSIGGRAPH Computer Graphics 19 3 (1985) 287ndash296

29 Linda Williams Pickle 2003 Usability testing of mapdesigns In Proceedings of Symposium on the Interface ofComputing Science and Statistics 42ndash56

30 P Samuel Quinan and Miriah Meyer 2016 Visuallycomparing weather features in forecasts IEEETransactions on Visualization and Computer Graphics 221 (2016) 389ndash398

31 Penny L Rheingans 2000 Task-based color scale designIn 28th AIPR Workshop 3D Visualization for DataExploration and Decision Making International Societyfor Optics and Photonics 35ndash43

32 Bernice E Rogowitz and Lloyd A Treinish 1994 Usingperceptual rules in interactive visualization In ISTSPIE1994 International Symposium on Electronic ImagingScience and Technology International Society for Opticsand Photonics 287ndash295

33 Bernice E Rogowitz Lloyd A Treinish Steve Brysonand others 1996 How not to lie with visualizationComputers in Physics 10 3 (1996) 268ndash273

34 Samuel Silva Beatriz Sousa Santos and JoaquimMadeira 2011 Using color in visualization A surveyComputers Graphics 35 2 (2011) 320ndash333

35 Maureen Stone Danielle Albers Szafir and Vidya Setlur2014 An engineering model for color difference as afunction of size In Color and Imaging Conference Vol2014 Society for Imaging Science and Technology253ndash258

36 Danielle Albers Szafir 2017 Modeling Color Differencefor Visualization Design IEEE Transactions onVisualization and Computer Graphics (2017)

37 Christian Tominski Georg Fuchs and HeidrunSchumann 2008 Task-driven color coding InInformation Visualisation 2008 IVrsquo08 12thInternational Conference IEEE 373ndash380

38 J Gregory Trafton Susan S Kirschenbaum Ted L TsuiRobert T Miyamoto James A Ballas and Paula DRaymond 2000 Turning pictures into numbersextracting and generating information from complexvisualizations International Journal of Human-ComputerStudies 53 5 (2000) 827ndash850

39 Bruce E Trumbo 1981 A theory for coloring bivariatestatistical maps The American Statistician 35 4 (1981)220ndash226

40 W3C 2016 CSS Values and Units Module Level 3(2016)httpwwww3orgTRcss3-valuesabsolute-lengths

41 Colin Ware 1988 Color sequences for univariate mapsTheory experiments and principles IEEE ComputerGraphics and Applications 8 5 (1988) 41ndash49

42 Colin Ware 2012 Information visualization perceptionfor design Elsevier

43 Liang Zhou and Charles D Hansen 2016 A survey ofcolormaps in visualization IEEE Transactions onVisualization and Computer Graphics 22 8 (2016)2051ndash2069


[Figure 3 panels. Experiment 1 prompt: “Click on a point that has an elevation of exactly 750 feet.” Experiment 2 prompt: “Terrain is steeper when there is larger change in elevation between adjacent points. Compare the steepness of terrain inside the two boxes, then click on the box that is steeper on average.” Experiment 3 prompt: “Imagine a line from A to B. Select the elevation profile below that most closely matches its slope.” Color scale in each panel: elevation (feet).]

Figure 3. Example stimuli from the three experiments. In experiment 1, participants indicated their response by clicking a point on the map matching a specified elevation. In experiment 2, participants were prompted to select the steeper of the two boxes. In experiment 3, participants were asked to identify the pattern corresponding to the terrain profile between two horizontally displaced markers.

different colormaps, we opted for a factorial design testing all possible 9×5 Colormap and Spatial Frequency combinations. Given the sheer number of combinations, we opted for a mixed design to make the study feasible. Participants were randomly assigned to one of three experimental conditions (illustrated in Table 2). Each condition comprised 3 of the 9 colormaps (i.e., between-subject) and all 5 frequency levels (within-subject). In effect, every participant saw 3 colormap × 5 spatial frequency combinations. Participants completed multiple trials with each combination.

To equalize task difficulty, stimuli for a given spatial frequency trial were derived from the same base scalar field with different colormaps applied. This arrangement enables us to make direct comparisons between the colormaps for a given frequency. However, it also meant that participants would see the same map three times, albeit with different colormaps. To prevent learning, scalar fields were flipped either horizontally or vertically, resulting in three unique map reflections. The order of colormap presentation was fully counterbalanced across participants to minimize residual learning or fatigue effects.
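The reflection step described above can be sketched as follows. This is a minimal illustration, not the authors' stimulus code: it assumes the scalar field is a list of rows and that the three variants are the unflipped field, a horizontal flip, and a vertical flip. The function names are hypothetical.

```python
def flip_horizontal(field):
    """Mirror every row (left-right reflection)."""
    return [list(reversed(row)) for row in field]


def flip_vertical(field):
    """Reverse the order of the rows (top-bottom reflection)."""
    return [list(row) for row in reversed(field)]


def reflections(field):
    """Return three variants of one base field (assumed: as-is, h-flip, v-flip)."""
    unflipped = [list(row) for row in field]
    return [unflipped, flip_horizontal(field), flip_vertical(field)]


# Toy 3x3 elevation field in feet; real stimuli would be far larger.
base = [
    [0, 250, 500],
    [250, 500, 750],
    [600, 750, 1000],
]
variants = reflections(base)
```

Each variant contains exactly the same elevations, so task difficulty stays matched while the spatial layout differs between the three colormap presentations.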

| Condition | Colormaps tested | Spatial frequencies tested |
|---|---|---|
| 1 | greyscale, cubehelix, rainbow | 3, 5, 7, 9, 11 |
| 2 | singlehue, extbodyheat, spectral | 3, 5, 7, 9, 11 |
| 3 | bodyheat, coolwarm, blueyellow | 3, 5, 7, 9, 11 |

Table 2. Three experimental conditions; each included 3 of 9 colormaps (i.e., between-subject variation) and all 5 levels of spatial frequency.
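The mixed design in Table 2 can be written down as a small data structure. The sketch below is an illustration only: the condition table comes from the text, while the random-assignment helper and its names are assumptions. It checks that the three conditions together cover all 9×5 colormap and frequency cells.

```python
import random

# Condition -> colormaps, as listed in Table 2.
CONDITIONS = {
    1: ["greyscale", "cubehelix", "rainbow"],
    2: ["singlehue", "extbodyheat", "spectral"],
    3: ["bodyheat", "coolwarm", "blueyellow"],
}
FREQUENCIES = [3, 5, 7, 9, 11]


def assign_condition(rng):
    """Randomly pick one of the three conditions (hypothetical helper)."""
    return rng.choice(sorted(CONDITIONS))


def cells_for(condition):
    """All colormap x frequency cells one participant sees in a condition."""
    return [(cmap, freq) for cmap in CONDITIONS[condition] for freq in FREQUENCIES]


# Union over conditions: every one of the 9 x 5 cells is tested by some group.
all_cells = {cell for cond in CONDITIONS for cell in cells_for(cond)}
```

Colormap is thus a between-subject factor across conditions, while spatial frequency is fully within-subject.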

EXPERIMENT 1: QUANTITY ESTIMATION
The first experiment tests participants’ ability to identify locations on the map matching specified elevations. Participants were instructed to “Click on a point that has an elevation of exactly [H] feet”. Five different values for H were tested: 0, 250, 500, 750, and 1000 feet. These values correspond to the three quartiles of the color scale, as well as the min and max.

Participants
We recruited 90 participants from Amazon Mechanical Turk (50 females, 40 males) with a mean age of 34.64 (STD = 9.53 years). Participants were first screened for color-vision deficiency using a 14-panel Ishihara test, and had to correctly guess the number in 12 of the 14 panels to qualify. We restricted the study to participants with a screen resolution of at least 1280×800 to ensure the experimental interface would fit their display. Participants received a base reward of $0.50 and a maximum bonus of $3.00 based on the percentage of correctly solved tasks (for a possible total of $3.50).

Procedure
After signing up for the study, participants were directed to an external link that displayed the experiment within a web interface. Participants entered their MTurk ID and were presented with an information sheet about the study. They were then presented with the color-vision qualification test. Those who successfully passed the test were given a set of 6 training trials and provided with feedback on their accuracy. Participants had to identify a location that is within a 5% margin from the specified height before proceeding to the next training trial.

The main portion of the experiment consisted of 3 rounds, one with each of the 3 colormaps. In each round, participants saw the five spatial frequency levels in ascending order, providing a progression from simple to more complex maps. The order of colormap presentation was fully counterbalanced across participants using a Latin square design. Participants completed 5 trials with each colormap and spatial frequency combination, corresponding to the 5 tested quantities (0, 250, 500, 750, and 1000), presented in random order. A color scale was displayed to the right of the map, and the range of the scale was fixed at [0–1000] feet (see Figure 3). In each trial, participants first saw the question and clicked on ‘Show Map’ to reveal the stimulus. They then indicated their response by clicking on the map to mark their selected location, and clicked ‘Next’. To aid participants in accurately selecting locations, the mouse cursor was changed to a crosshair with a hollowed-out center (so as not to obscure the focal pixel).
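The trial structure above (3 rounds × 5 ascending frequencies × 5 randomly ordered target elevations) can be sketched as below. This is an assumption-laden illustration: the text does not say how the Latin square was built, so a simple cyclic rotation by participant index stands in for it, and all function names are hypothetical.

```python
import random

FREQUENCIES = [3, 5, 7, 9, 11]           # always shown in ascending order
ELEVATIONS = [0, 250, 500, 750, 1000]    # target heights in feet


def latin_square_order(colormaps, participant_index):
    """Cyclic rotation: across participants each colormap occupies each
    position equally often (assumed construction, not from the paper)."""
    k = participant_index % len(colormaps)
    return colormaps[k:] + colormaps[:k]


def build_trials(colormaps, participant_index, rng):
    """Return the ordered list of (colormap, frequency, target) trials."""
    trials = []
    for colormap in latin_square_order(colormaps, participant_index):
        for freq in FREQUENCIES:
            targets = ELEVATIONS[:]
            rng.shuffle(targets)  # random order of the 5 quantities
            trials.extend((colormap, freq, height) for height in targets)
    return trials


trials = build_trials(
    ["greyscale", "cubehelix", "rainbow"], participant_index=1, rng=random.Random(42)
)
```

Under these assumptions each participant completes 75 trials, which matches 3 colormaps × 5 frequencies × 5 quantities.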

Results
We computed an ‘error’ measurement for each response by taking the absolute difference between the requested elevation and the elevation at the point clicked by the participant. We then applied the following log transform [10, 16]:

$$\log_2(\text{error}) = \log_2\left(\lvert \text{judged percent} - \text{true percent} \rvert + \tfrac{1}{8}\right)$$
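As a worked illustration of the transform, the sketch below computes the log error for a single response. The conversion from feet to percent of the [0–1000] scale is an assumption made for the example, as are the function names; the 1/8 offset keeps the logarithm defined when the response is exact.

```python
import math


def elevation_to_percent(feet, scale_max=1000.0):
    """Express an elevation as a percentage of the color scale (assumed mapping)."""
    return 100.0 * feet / scale_max


def log_error(judged_percent, true_percent):
    """log2 of the absolute error plus 1/8, following the transform above."""
    return math.log2(abs(judged_percent - true_percent) + 1 / 8)


# A perfect response bottoms out at log2(1/8) = -3.
perfect = log_error(elevation_to_percent(750), elevation_to_percent(750))

# Clicking a point at 828.75 ft when asked for 750 ft is a 7.875-point miss.
miss = log_error(elevation_to_percent(828.75), elevation_to_percent(750))
```

Because the scale is logarithmic in base 2, a one-unit change in this measure corresponds to a doubling or halving of the offset error.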

Figure 4. Mean log of error in quantity estimation (experiment 1 vs. model), plotted against spatial frequency (3–11) in one panel per colormap (greyscale, singlehue, bodyheat, cubehelix, extbodyheat, coolwarm, rainbow, spectral, blueyellow). Ribbons represent 95% CIs of the experimental results.

Figure 5. Mean log error in quantity estimation by colormap (left) and by colormap × spatial frequency. Intervals are 95% CIs.

We removed three participants from the analysis (amounting to 3.3% of subjects) because their overall accuracy was two standard deviations below the mean accuracy for all participants (M = 85.36%, STD = 9.48%). We analyze the results by fitting the log of error to a linear mixed-effects model comprising two fixed effects (colormap, frequency) and two random effects. The first random effect accounts for individual variations among participants, and the second for intra-trial variations (recall that trials comprised different test elevations).
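The exclusion rule can be sketched as a simple cutoff at two standard deviations below the mean. This is a minimal illustration with made-up accuracy scores; whether the authors used the sample or population standard deviation is not stated, so the sample version is assumed, and the helper name is hypothetical.

```python
from statistics import mean, stdev


def exclude_low_accuracy(accuracy_by_participant, n_sd=2.0):
    """Drop participants whose accuracy is more than n_sd SDs below the mean."""
    scores = list(accuracy_by_participant.values())
    cutoff = mean(scores) - n_sd * stdev(scores)  # sample SD (assumption)
    kept = {pid: acc for pid, acc in accuracy_by_participant.items() if acc >= cutoff}
    return kept, cutoff


# Fabricated example scores (percent correct), for illustration only.
example = {
    "p1": 90, "p2": 88, "p3": 85, "p4": 92, "p5": 87,
    "p6": 89, "p7": 91, "p8": 86, "p9": 40,
}
kept, cutoff = exclude_low_accuracy(example)
```

Applied to the reported M = 85.36% and STD = 9.48%, the same rule would place the cutoff at about 85.36 − 2 × 9.48 ≈ 66.4%.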

Figure 4 illustrates the mean log error by colormap and spatial frequency (the dashed trendline represents the model). The experimental results are combined in Figure 5 to ease comparison. A likelihood ratio test indicates the overall model is significant (χ²(18) = 1481.7, p < 0.001). To test for interaction between spatial frequency and colormap, we fit a reduced model that accounts for both frequency and colormap, but not their interaction. There was no significant difference between the full and the reduced model (χ²(8) = 10.695, p = 0.219), thus ruling out an interaction between colormap and spatial frequency. We will therefore interpret the reduced model, which accounts for both factors independently. Table 3 shows the model coefficients.
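To make the likelihood ratio comparison concrete, the sketch below converts a χ² statistic into a p-value. It uses the closed-form survival function that holds for even degrees of freedom, which avoids any external statistics library. The function name is hypothetical, and this is not the authors' analysis code.

```python
import math


def chi2_sf_even_df(x, df):
    """P(X > x) for a chi-square variable with even df.

    For df = 2m: exp(-x/2) * sum_{i=0}^{m-1} (x/2)^i / i!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i   # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total


# Full vs. reduced model in experiment 1: chi-square(8) = 10.695.
p_interaction = chi2_sf_even_df(10.695, 8)
```

The result is approximately 0.22, in line with the reported p = 0.219, so dropping the interaction term does not significantly worsen the fit.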

The model predicts that a step-increase in spatial frequency yields a 0.11 increase in the log of estimation error. The difference in estimation error between the highest (f=11) and lowest (f=3) frequency levels is approximately 0.9 orders of magnitude. The effect of color encoding was equally evident: all the colormaps were significantly better than greyscale. However, the gain in accuracy was markedly different between the colormaps. Rainbow had the largest impact on estimation accuracy, reducing error by approximately 2.3 orders of magnitude compared to greyscale. The runner-up was spectral, which also contains substantial hue variation. However, spectral reduced error by only 1.75 orders of magnitude. On the other hand, Spiral colormaps (extbodyheat, cubehelix), which comprise multiple hues over a monotonically increasing luminance, decreased estimation errors by approximately 1.3–1.4 orders of magnitude, compared to 0.8–1.2 for Diverging ramps (blueyellow, coolwarm). Sequential schemes (singlehue and bodyheat) had the least impact on error, with a mere improvement of 0.26–0.71 orders of magnitude relative to greyscale.

| Coefficient | Estimate | Absolute t value | p |
|---|---|---|---|
| (Intercept) | 1.77 | 6.628 | *** |
| singlehue | -0.26 | 2.253 | * |
| bodyheat | -0.71 | 6.118 | *** |
| cubehelix | -1.27 | 16.467 | *** |
| extbodyheat | -1.39 | 12.073 | *** |
| coolwarm | -1.18 | 10.122 | *** |
| rainbow | -2.33 | 30.264 | *** |
| spectral | -1.75 | 15.172 | *** |
| blueyellow | -0.80 | 6.871 | *** |
| Spatial Frequency | 0.11 | 16.967 | *** |

Table 3. Effects of colormap and spatial frequency on the log of error in quantity estimation. The intercept represents greyscale as colormap (*** = p < 0.001, ** = p < 0.01, * = p < 0.05).

Figure 6. Average local gradient (i.e., terrain slope) at locations selected by participants, by spatial frequency (3–11). Error bars are 95% CIs.
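The Table 3 coefficients can be turned into predictions with a few lines of arithmetic. The sketch below is an illustration under two assumptions: spatial frequency enters the reduced model as its numeric level (3 to 11), and random effects are ignored. The names are hypothetical. Since the response is a base-2 logarithm, a difference of d units corresponds to a factor of 2^d in the offset error.

```python
# Fixed-effect estimates copied from Table 3 (greyscale is the baseline).
INTERCEPT = 1.77
FREQ_SLOPE = 0.11
COLORMAP_EFFECT = {
    "greyscale": 0.0, "singlehue": -0.26, "bodyheat": -0.71,
    "cubehelix": -1.27, "extbodyheat": -1.39, "coolwarm": -1.18,
    "rainbow": -2.33, "spectral": -1.75, "blueyellow": -0.80,
}


def predicted_log_error(colormap, frequency):
    """Additive (no-interaction) prediction of log2 error."""
    return INTERCEPT + COLORMAP_EFFECT[colormap] + FREQ_SLOPE * frequency


def error_ratio(log_difference):
    """Multiplicative change in (error + 1/8) implied by a log2 difference."""
    return 2 ** log_difference


# Highest vs. lowest frequency: 0.11 * (11 - 3) = 0.88, i.e. roughly 0.9.
frequency_gap = predicted_log_error("greyscale", 11) - predicted_log_error("greyscale", 3)

# Greyscale vs. rainbow at any frequency: 2.33 log2 units.
rainbow_gain = error_ratio(2.33)
```

Because the model has no interaction term, the gap between any two colormaps is the same at every frequency level.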

The fact that we did not find an interaction between colormap and spatial frequency implies that the relative effectiveness of the different colormaps is stable across all spatial frequency levels tested. Rainbow is thus expected to be the most accurate colormap for quantity estimation, regardless of how spatially complex the data is. However, estimation accuracy will decrease comparably for all colormaps as the data becomes more spatially varied. This could reflect a combination of perceptual and motor difficulty in locating and clicking the intended location, due to the larger local gradients encountered in high-frequency maps (see Figure 6).

EXPERIMENT 2: GRADIENT PERCEPTION
The second experiment tests participants’ accuracy in comparing and judging the steepness of gradients. The ability to judge how fast the encoded quantities change between adjacent map locations is important in many contexts.

Participants
We recruited 126 participants (50 females, 74 males, 2 others) with a mean age of 35.62 years (STD = 9.55 years). Participants had an overall success rate of 67.75% (STD = 11.41%). Ten participants (7.9% of subjects) were dropped from the analysis because their overall accuracy was worse than chance, having correctly answered fewer than 50% of trials in a two-alternative forced-choice experiment.

Figure 7. Probability of successful gradient judgment (experiment 2 vs. model), plotted against spatial frequency (3–11) in one panel per colormap (greyscale, singlehue, bodyheat, cubehelix, extbodyheat, coolwarm, rainbow, spectral, blueyellow). Ribbons represent 95% CIs of the experimental results.

Figure 8. Percentage of correctly answered trials in a gradient perception task (experiment 2), shown by spatial frequency and by colormap. Intervals are 95% CIs.

Procedure
Each trial consisted of a map with two squares juxtaposed on top (see Figure 3). Participants were prompted to “compare the steepness of terrain inside the two boxes” and “click on the box that is steeper on average”. The two boxes were identical in size (175×175 pixels, or 3.5°×3.5° of visual angle). However, terrain steepness, which was calculated by taking the average first derivative within each box, was varied systematically. The gradient-ratio between the flatter and the steeper boxes was fixed to one of four levels: 0.8, 0.83, 0.86, 0.9 (±0.05). A lower ratio implies a larger, and potentially more perceptible, slope difference, making the task easier. However, the two boxes encompassed terrain with identical height ranges to reduce variability in the appearance of their peaks (a potential confound in slope judgment [26]).
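The steepness measure and gradient-ratio can be sketched as below. This is a minimal illustration, not the authors' stimulus generator: it assumes the “average first derivative” is the mean gradient magnitude estimated with forward differences, and the helper names and the toy field are hypothetical.

```python
import math


def mean_gradient(field, top, left, size):
    """Average gradient magnitude over a size x size box (forward differences).

    The field must extend at least one row and column beyond the box.
    """
    total = 0.0
    for r in range(top, top + size):
        for c in range(left, left + size):
            dx = field[r][c + 1] - field[r][c]
            dy = field[r + 1][c] - field[r][c]
            total += math.hypot(dx, dy)
    return total / (size * size)


def gradient_ratio(field, box_a, box_b, size):
    """Ratio of the flatter box's mean gradient to the steeper box's."""
    flatter, steeper = sorted(
        (mean_gradient(field, *box_a, size), mean_gradient(field, *box_b, size))
    )
    if steeper == 0:
        raise ValueError("both boxes are perfectly flat")
    return flatter / steeper


# Toy terrain: slope of 8 ft/pixel on the left, 10 ft/pixel on the right.
field = [[8 * c if c <= 5 else 40 + 10 * (c - 5) for c in range(12)] for _ in range(4)]
ratio = gradient_ratio(field, box_a=(0, 0), box_b=(0, 7), size=3)
```

In this toy case the ratio is 0.8, corresponding to the easiest of the four difficulty levels listed above.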

Participants first completed a set of 6 training trials that included feedback before proceeding to the main trials. The order of stimuli was similar to the previous experiment: the study consisted of 3 rounds, one with each of the 3 colormaps the participant was assigned to see. Each round encompassed all 5 spatial frequency levels. Participants completed 4 trials with each colormap and frequency combination, spanning a range of easy to difficult tests (a total of 60 trials). The order of colormap presentation was fully counterbalanced across participants.

Results
Figure 7 illustrates participants’ probability of correctly identifying the steeper gradient. The experimental data is shown separately in Figure 8. We fit the results to a logistic regression model comprising two fixed effects (colormap, frequency). The model also included two random effects to account for individual differences among participants and intra-trial variations (recall that trials varied in difficulty). The model essentially predicts the odds of correctly identifying the steeper gradient. A likelihood ratio test indicates the overall model is significant (χ²(17) = 421.65, p < 0.001). The model correctly predicts 74.67% of outcomes. We find significant interaction between colormap and spatial frequency (χ²(8) = 27.81, p < 0.001). Table 4 shows the model coefficients.

(a) Main effects

| Coef. | Est. | Absolute z | p |
|---|---|---|---|
| (Intercept) | 0.80 | 0.516 | |
| singlehue | 1.14 | 0.415 | |
| bodyheat | 0.94 | 0.183 | |
| cubehelix | 0.90 | 0.354 | |
| extbodyheat | 1.13 | 0.388 | |
| coolwarm | 0.66 | 1.316 | |
| rainbow | 0.65 | 1.354 | |
| spectral | 0.92 | 0.259 | |
| blueyellow | 0.91 | 0.281 | |
| Frequency | 1.15 | 4.633 | *** |

(b) Interaction effects (colormap × frequency)

| Coef. | Est. | Absolute z | p |
|---|---|---|---|
| singlehue | 0.96 | 0.910 | |
| bodyheat | 1.04 | 1.020 | |
| cubehelix | 1.06 | 1.303 | |
| extbodyheat | 1.03 | 0.596 | |
| coolwarm | 1.14 | 3.008 | ** |
| rainbow | 1.13 | 2.876 | ** |
| spectral | 1.06 | 1.330 | |
| blueyellow | 1.12 | 2.471 | * |

Table 4. Main effects of colormap and spatial frequency on success odds in gradient judgment (a) and their interaction (b). Coefficients shown correspond to the exponentiated model estimates to reflect odds ratios. The intercept represents greyscale (*** = p < 0.001, ** = p < 0.01, * = p < 0.05).

The main-effect coefficients for all colormaps were not significant, indicating that all colormaps perform comparably to greyscale at low spatial frequencies. Participants are thus unlikely to benefit from the use of color when judging gradients in low-variance data. The main effect of spatial frequency, however, is significant. The model estimates that a step-increase in spatial frequency improves the odds of correct judgment by 15%. Estimating gradients appears to be easier in maps with more complex spatial structures.

The model indicates several noteworthy interactions. Although the use of color had no significant effect in low-frequency maps, several colormaps significantly outperformed greyscale at high frequency. The divergent coolwarm improved participants’ success odds by 14% for every step-increase in spatial frequency. Similarly, rainbow and blueyellow increased the odds by approximately 13% and 12%, respectively. Notably, these three colormaps contain substantial variation in saturation (coolwarm and blueyellow) or hue (rainbow). All other colormaps tested were not reliably different from greyscale.
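Odds ratios are easier to read once converted to probabilities. The sketch below does this for the Table 4 estimates. It is an illustration under assumptions: spatial frequency is taken to enter the model as its numeric level (3 to 11), random effects are ignored, and only a subset of colormaps is included. The names are hypothetical.

```python
# Exponentiated estimates from Table 4 (greyscale is the baseline, so 1.0).
INTERCEPT_ODDS = 0.80
FREQUENCY_OR = 1.15
MAIN_OR = {"greyscale": 1.00, "coolwarm": 0.66, "rainbow": 0.65, "blueyellow": 0.91}
INTERACTION_OR = {"greyscale": 1.00, "coolwarm": 1.14, "rainbow": 1.13, "blueyellow": 1.12}


def success_odds(colormap, frequency):
    """Predicted odds of picking the steeper box (fixed effects only)."""
    per_step = FREQUENCY_OR * INTERACTION_OR[colormap]
    return INTERCEPT_ODDS * MAIN_OR[colormap] * per_step ** frequency


def success_probability(colormap, frequency):
    """Convert odds to a probability: p = odds / (1 + odds)."""
    odds = success_odds(colormap, frequency)
    return odds / (1 + odds)


grey_low = success_probability("greyscale", 3)
grey_high = success_probability("greyscale", 11)
coolwarm_high = success_probability("coolwarm", 11)
```

Under these assumptions the colormaps start out close together at the lowest frequency and separate as frequency grows, which is the pattern the interaction terms describe.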

EXPERIMENT 3: PATTERN PERCEPTION
Having tested accuracy in quantity estimation and gradient perception, we now evaluate participants’ ability to integrate these two skills. Experiment 3 required participants to extract a longitudinal pattern from the map and match it to an external representation, a task originally devised by Hyslop [18].

Participants
We recruited 165 participants (79 females, 84 males, 2 others). The mean participant age was 36.04 years (STD = 11.71). Overall, participants had a mean success rate of 78.51% in matching the correct pattern (STD = 19.31%). We dropped seven participants from the analysis (4.2% of subjects) whose overall accuracy was two standard deviations below the mean.

Figure 9. Probability of successful pattern matching (experiment 3 vs. model), plotted against spatial frequency (3–11) in one panel per colormap (greyscale, singlehue, bodyheat, cubehelix, extbodyheat, coolwarm, rainbow, spectral, blueyellow). Ribbons denote 95% CIs of the experimental data.

Figure 10. Percentage of correctly answered trials in experiment 3, shown by spatial frequency and by colormap. Intervals are 95% CIs.

Procedure
Participants first completed a set of 6 training trials that included feedback before proceeding to the main experiment. Each trial consisted of a map with two markers labeled A and B (see Figure 3). The markers were horizontally displaced by 350 pixels (7° of visual angle). Participants were given the following prompt: “Imagine a line from A to B. Select the elevation profile below that most closely matches its slope”. They then selected a choice among a set of 6 patterns, including the actual elevation profile and 5 other distractors. Distractors were generated from the same map so as to reflect similar spatial frequency characteristics, and had to be 65–70% similar to the actual profile (as measured by dynamic time warping [20]). Additionally, profiles and distractors were selected to not have peaks or valleys at the endpoints. These criteria, determined after a pilot, ensure similar task difficulty across the trials.
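For readers unfamiliar with the measure, the sketch below implements the textbook dynamic time warping distance between two elevation profiles. It is an illustration only: the text does not say how the distance was converted into the 65–70% similarity score, so no normalization is attempted, and the indexing techniques from [20] are not reproduced here.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance with absolute-difference cost.

    Runs in O(len(a) * len(b)) time using a full cost matrix.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


profile = [100, 300, 500, 300, 100]
stretched = [100, 300, 300, 500, 300, 100]   # same shape, locally stretched
shifted = [200, 400, 600, 400, 200]          # same shape, 100 ft higher

same_shape = dtw_distance(profile, stretched)
offset = dtw_distance(profile, shifted)
```

Warping lets profiles with the same shape but slightly different pacing match at zero cost, which is why the measure suits comparing terrain profiles that share spatial frequency characteristics.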

The order of stimuli was similar to the previous two experiments: the study consisted of 3 rounds, one with each of the 3 colormaps the participant was assigned to see, and encompassing the 5 spatial frequency levels. Thus, every participant saw 3×5 colormap and frequency combinations, and completed 3 pattern matching trials with each combination, for a total of 45 trials. As in the previous experiments, the order of colormap presentation was fully counterbalanced.

Results
We fit the results to a logistic regression model comprising two fixed effects (colormap, frequency) and two random effects to account for individual differences and intra-trial variations. Figure 9 shows the odds of successful profile matching. The experimental results are illustrated separately in Figure 10. A likelihood ratio test indicates the model is significant (χ²(17) = 394.67, p < 0.001). We find significant interaction between colormap and spatial frequency (χ²(8) = 18.131, p < 0.05); the relative effectiveness of the colormaps appears to vary with spatial frequency.

(a) Main effects

| Coef. | Est. | Absolute z | p |
|---|---|---|---|
| (Intercept) | 9.03 | 6.502 | *** |
| singlehue | 1.34 | 0.649 | |
| bodyheat | 0.80 | 0.503 | |
| cubehelix | 0.77 | 0.681 | |
| extbodyheat | 0.81 | 0.466 | |
| coolwarm | 0.44 | 1.868 | · |
| rainbow | 0.98 | 0.064 | |
| spectral | 0.57 | 1.284 | |
| blueyellow | 0.73 | 0.697 | |
| Frequency | 0.93 | 2.058 | * |

(b) Interaction effects (colormap × frequency)

| Coef. | Est. | Absolute z | p |
|---|---|---|---|
| singlehue | 0.98 | 0.450 | |
| bodyheat | 1.02 | 0.456 | |
| cubehelix | 1.09 | 1.781 | · |
| extbodyheat | 1.04 | 0.797 | |
| coolwarm | 1.17 | 3.143 | ** |
| rainbow | 1.01 | 0.259 | |
| spectral | 1.11 | 2.198 | * |
| blueyellow | 1.09 | 1.683 | · |

Table 5. Main effects of colormap and spatial frequency on success odds in experiment 3 (a) and their interaction (b). Coefficients depict exponentiated model estimates to reflect odds ratios. The intercept corresponds to greyscale (*** = p < 0.001, ** = p < 0.01, * = p < 0.05, · = p < 0.1).

Overall, we find a significant detrimental main effect of spatial frequency on pattern perception, as indicated by a 0.93 Frequency coefficient (Table 5a). This translates to a 7% drop in the odds of correctly matching the profile for every step-increase in spatial frequency. The main effect coefficients for all colormaps were not significant, indicating that the use of color at low spatial frequency is unlikely to improve pattern perception as compared to a plain greyscale ramp.

Colormap performance begins to diverge at high spatial frequency. Only two colormaps have significant and large enough odds-ratio coefficients (i.e., > 1.07) to overcome the frequency-induced perceptual difficulty: spectral and coolwarm increased the odds of correct pattern matching by 11% and 17%, respectively, for every step-increase in spatial frequency (after adjusting for frequency effects alone). Additionally, blueyellow and cubehelix were associated with a 9% improvement, but the advantage was not reliable (p < 0.1). On the other hand, extbodyheat, bodyheat, and rainbow had small (and non-significant) odds-ratio coefficients (0.98–1.04 < 1.07), indicating that, similar to greyscale, they are associated with lower success odds in complex maps.
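The 1.07 threshold follows from the frequency coefficient: an interaction odds ratio must exceed 1/0.93 ≈ 1.075 for the net per-step effect to be positive. The sketch below works through that arithmetic with the Table 5 estimates. It only combines point estimates and says nothing about statistical reliability; the names are hypothetical.

```python
# Exponentiated estimates from Table 5 (greyscale is the baseline, so 1.0).
FREQUENCY_OR = 0.93
INTERACTION_OR = {
    "greyscale": 1.00, "singlehue": 0.98, "bodyheat": 1.02,
    "cubehelix": 1.09, "extbodyheat": 1.04, "coolwarm": 1.17,
    "rainbow": 1.01, "spectral": 1.11, "blueyellow": 1.09,
}


def net_odds_ratio_per_step(colormap):
    """Combined change in success odds per step-increase in frequency."""
    return FREQUENCY_OR * INTERACTION_OR[colormap]


# Interaction coefficient needed to cancel the frequency penalty exactly.
break_even = 1 / FREQUENCY_OR

# Colormaps whose point estimates nominally clear the break-even value.
improving = sorted(c for c in INTERACTION_OR if net_odds_ratio_per_step(c) > 1)
```

Four colormaps clear the threshold on point estimates alone, but only coolwarm and spectral do so with significant interaction terms, which is why the text singles out those two.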

In short, only two of the tested colormaps (coolwarm and spectral) appear to reliably support pattern perception at high spatial frequency. Both consist of a diverging ramp with uniformly-stepped luminance. All other colormaps (including greyscale) suffered as data complexity increased.

DISCUSSION AND GUIDELINES
Our work sheds new light on how spatial complexity impacts the perception of continuous color-coded maps. The experiments also led to some surprising findings that are at odds with current guidelines. We interpret these results and accordingly devise new task- and frequency-aware color mapping guidelines (indicated by ★). We also rank the tested colormaps and summarize our guidelines in Figure 11.

[Figure 11 panels. Colormaps are ranked from better to worse for three tasks: quantity estimation (ranking unaffected by spatial frequency), gradient perception, and pattern perception, the latter two at low (0.38 cycle/deg) and high (1.38 cycle/deg) spatial frequency, with brackets marking groups with no significant difference. Guidelines shown: (1) Maximize range of saturated hues, regardless of spatial frequency. (2) At high spatial frequency: fully-saturated hues or diverging ramps with chroma variation. (3) At high spatial frequency: diverging ramps with uniformly-stepped luminance.]

Figure 11. Model-derived colormap ranking and guidelines by task and spatial frequency (* = p < 0.05, · = p < 0.1, relative to greyscale).

Quantity Estimation
Our first hypothesis (H1) predicts hue- and saturation-varying ramps to be more accurate at low spatial frequencies, and ramps with monotonically increasing luminance to be more accurate at high frequencies. As discussed, H1 is based on the relative contrast-sensitivity of our visual system [32]. A quantity estimation task (experiment 1) shows no interaction between colormap and spatial frequency. While increased spatial complexity is associated with higher estimation error, the effect is similar across all colormaps. We thus reject H1.

On the other hand, results provide support for H2, which predicts that hue-varying ramps will lead to more accurate estimation. Indeed, the top performing colormaps (rainbow and spectral) contain substantial hue variation. Results from experiment 1 thus replicate earlier findings by Ware [41], but also extend them to show that spatial frequency has no apparent impact on the effectiveness of hue-varying ramps. Our data shows that rainbow and spectral are the most accurate among the colormaps tested, even at the highest levels of spatial frequency. Altogether, these results lend further support to the theory that lookup errors in color-coded maps are largely caused by systematic simultaneous contrast shifts [41], rather than being affected by contrast sensitivity modulation [32]. These shifts are best counteracted with colormaps that vary non-monotonically along one or more perceptual channels.

A corollary result is that mixing monotonic luminance with hue variation would lead to significant accuracy loss. Indeed, data from experiment 1 indicates that rainbow is approximately an order of magnitude more accurate than extbodyheat and cubehelix. These Spiral colormaps are designed to be more accurate rainbow alternatives for interval data [4, 24]. On the contrary, we find that they reduce accuracy compared to a purely hue-varying ramp. This finding suggests that, when estimating a continuously coded spatial quantity, people benefit most from a large dynamic hue range. Incorporating monotonically increasing lightness within the colormap would necessarily reduce the hue range, thereby diminishing accuracy.

★ Guideline 1: We recommend maximizing hue variation to improve quantity estimation, irrespective of spatial frequency.

Gradient Perception
Gradient perception allows people to distinguish how quickly the encoded attribute changes between adjacent locations, an essential skill when evaluating the distribution and variance of spatial data. We find that the task is strongly modulated by the data’s spatial complexity: increased spatial frequency appears to enhance the perception of gradients. This is unsurprising, as maps with jagged surfaces are likely to exhibit more pronounced (and thus more perceptible) differences in slope. Colormap effectiveness was also impacted by spatial frequency: color encoding did not help participants distinguish gradients at low frequency levels, as all colormaps showed similar performance to greyscale. However, three colormaps demonstrated a significant advantage at high frequencies. Coolwarm, rainbow, and blueyellow improved perception odds by 12–14% for every step-increase in spatial frequency. All three employed one of two design strategies: a diverging ramp with varying saturation, or a fully saturated hue rotation.

The above results contradict H1, which predicts hue- and saturation-varying colormaps to perform better at low frequencies. In fact, we see the opposite. The results also do not support H3, which predicts better performance for monotonically-luminant ramps in structure perception tasks. In fact, all three top-performing ramps exhibit non-monotonic luminance.

★ Guideline 2: For tasks requiring gradient perception at high spatial frequency, we recommend a range of fully saturated hues (e.g., rainbow) or diverging chroma-varying ramps (e.g., coolwarm or blueyellow).

Pattern Perception
Experiment 3 prompted participants to match the elevation profile along a horizontal path with an external pattern. We expected colormaps with monotonically increasing luminance to be more accurate at this task (H3), but results were not entirely consistent with this prediction. While all tested colormaps had comparable performance at low spatial frequency, only two colormaps, coolwarm and spectral, gave participants higher odds of successfully matching the pattern at high frequency. Both colormaps comprise a diverging ramp with uniformly-stepped (though not strictly monotonic) luminance. By contrast, sequential and spiral ramps performed just as poorly as greyscale in complex maps, and so did rainbow.

The above results are consistent with Moreland’s argument that diverging ramps provide “maximal perceptual resolution” (through increasing and decreasing luminance intervals) [24], potentially enabling high-frequency patterns to be resolved more easily. Our results may also explain why diverging schemes performed better in medical diagnosis [3]: we suspect such tasks require the analysis of potentially high-frequency features (e.g., small tissue aberrations).

★ Guideline 3: We recommend diverging ramps with equidistant luminance steps (e.g., coolwarm and spectral) to support the perception of longitudinal patterns at high spatial frequency. Rainbow, Sequential, and Spiral schemes should be avoided in complex maps, especially if the task involves the analysis and matching of fine-grained features.

Yet Another Look at the Rainbow

Results of experiments 1 and 2 may shed light on why rainbow remains a popular choice among scientists [25], despite being considered a bad choice by the visualization community [4, 30]. Our data reveal that, counterintuitively, rainbow is robust for estimating a smoothly varying quantitative attribute, regardless of spatial complexity. Moreover, rainbow provides good support for gradient estimation at high spatial frequency. These two tasks correspond to elementary visual analytic primitives, including characterizing distributions, determining ranges, and filtering [1]. Moreover, studies show that when experts attempt to form a mental model about a visualization, they first go through a time-consuming process of extracting quantitative data “at a rather detailed level” [38]. For instance, a weather forecaster will look up pressure and wind changes, estimating current readings at landmark locations in the map, before making a forecast. Our data suggest that rainbow provides good support for these tasks, making it a potentially reasonable choice for weather forecasters.

Critique of rainbow centers on its tendency to create sharp visual boundaries, particularly around its yellow regions [4]. Experts also criticize the use of fully saturated hues [24], which result in non-uniform perceptual steps within the color ramp. The common intuition is that these two factors combined will inevitably distort the perception of quantities. We do not see evidence to support this hypothesis. On the contrary, attempts to ‘linearize’ the rainbow by monotonically increasing the luminance of hues could reduce estimation accuracy by up to an order of magnitude.

★ Guideline 4: Rather than entirely discouraging the use of rainbow, we suggest that it can be a reasonable design choice for conveying spatial distributions and variances, and in tasks that require quantitative, as opposed to geometric, precision. However, rainbow has a number of limitations. The use of green and red hues is problematic for people with color vision deficiency. Moreover, rainbow is probably ineffective at revealing high-frequency patterns. Interestingly, these shortcomings are balanced by diverging ramps (e.g., coolwarm), which, although quantitatively inaccurate, appear to support pattern perception at high spatial frequency. We thus argue that hue-varying and diverging colormaps support orthogonal tasks in continuous maps, and should therefore be considered complementary rather than mutually exclusive choices.

LIMITATIONS AND FUTURE WORK

There are some limitations to our work that should be considered. First, as with other crowdsourced graphical perception studies, we gain access to a larger pool of participants but sacrifice some experimental control [16]. Particularly relevant to our study are the variations in participants’ monitors, including color calibration and display resolution, as well as the illumination conditions in their homes or offices, all of which can impact color perception. We could not control these factors, but attempted to counteract their variation by involving a larger sample (N=381). Although we expect crowdsourcing to improve the ecological validity of results and guidelines, uncontrolled variations can potentially reduce our ability to detect small but otherwise significant differences in performance between tested conditions. Future lab studies should therefore attempt to replicate our findings with added controls.

Second, our study employed a limited set of tasks designed to measure elementary perceptual operators, including quantity estimation, gradient perception, and pattern matching. There is an opportunity to test higher-level tasks that mimic scientific analyses more closely, including the identification and comparison of larger map features (e.g., fronts, ridges). Additionally, some of the tasks we tested could be re-evaluated in more authentic formulations. For instance, a metric task could require participants to estimate the quantity at a specific location on the map. This formulation is arguably more realistic than the task we tested, which simply asked participants to click any location thought to match a specified quantity.

Third, our analysis was focused exclusively on spatial frequency, and there are good reasons to consider this factor [11, 32]. However, there are also additional data characteristics to consider, including, for instance, the distribution of amplitudes within the map. Such factors will influence the distribution of colors in the image and may thus impact perception.

Lastly, we limited our study to synthetically generated scalar fields to precisely vary spatial frequency while controlling for other confounds. However, synthetic stimuli may also introduce (unknown) perceptual or cognitive biases. Therefore, additional studies are needed to replicate our findings with datasets from real-world domains (e.g., meteorology, geophysics, or oceanography) and with domain experts. We also restricted this study to participants with normal color vision. Therefore, our results may not generalize to the approximately 5% of the population who have some form of color vision deficiency.

CONCLUSIONS

We conducted three experiments to investigate the effects of spatial frequency and colormap characteristics on the perception of continuous pseudocolor maps. Our results indicate that spatial frequency impacts judgment of the encoded quantities and structures. While viewers’ quantity estimation accuracy exhibited a predictable response, increased data complexity had a more nuanced effect on gradient and pattern comprehension, the impact of which was dependent on the colormap used. Designers should therefore consider both the type of task and the spatial complexity of the underlying data. We re-examined current guidelines and devised new recommendations for color-coding of continuous spatial data.

REFERENCES

1. Robert Amar, James Eagan, and John Stasko. 2005. Low-level components of analytic activity in information visualization. In Information Visualization, 2005. INFOVIS 2005. IEEE Symposium on. IEEE, 111–117.

2. Lawrence D. Bergman, Bernice E. Rogowitz, and Lloyd A. Treinish. 1995. A rule-based tool for assisting colormap selection. In Proceedings of the 6th conference on Visualization ’95. IEEE Computer Society, 118.

3. Michelle Borkin, Krzysztof Gajos, Amanda Peters, Dimitrios Mitsouras, Simone Melchionna, Frank Rybicki, Charles Feldman, and Hanspeter Pfister. 2011. Evaluation of artery visualizations for heart disease diagnosis. IEEE Transactions on Visualization and Computer Graphics 17, 12 (2011), 2479–2488.

4. David Borland and Russell M. Taylor II. 2007. Rainbow color map (still) considered harmful. IEEE Computer Graphics and Applications 27, 2 (2007).

5. Cynthia A. Brewer. 1994. Visualization in Modern Cartography. Elsevier Science, Chapter Color use guidelines for mapping and visualization.

6. Cynthia A. Brewer. 1996. Guidelines for selecting colors for diverging schemes on maps. The Cartographic Journal 33, 2 (1996), 79–86.

7. Cynthia A. Brewer, Alan M. MacEachren, Linda W. Pickle, and Douglas Herrmann. 1997. Mapping mortality: Evaluating color schemes for choropleth maps. Annals of the Association of American Geographers 87, 3 (1997), 411–438.

8. Roxana Bujack, Terece L. Turton, Francesca Samsel, Colin Ware, David H. Rogers, and James Ahrens. 2017. The Good, the Bad, and the Ugly: A Theoretical Framework for the Assessment of Continuous Colormaps. IEEE Transactions on Visualization and Computer Graphics (2017).

9. William S. Cleveland and Robert McGill. 1983. A color-caused optical illusion on a statistical graph. The American Statistician 37, 2 (1983), 101–105.

10. William S. Cleveland and Robert McGill. 1984. Graphical perception: Theory, experimentation, and application to the development of graphical methods. J. Amer. Statist. Assoc. 79, 387 (1984), 531–554.

11. Russell De Valois and Karen De Valois. 1990. Spatial Vision. Oxford University Press.

12. Russell L. De Valois, Duane G. Albrecht, and Lisa G. Thorell. 1978. Cortical cells: bar and edge detectors, or spatial frequency filters? In Frontiers in visual science. Springer, 544–556.

13. Connor C. Gramazio, David H. Laidlaw, and Karen B. Schloss. 2017. Colorgorical: Creating discriminable and preferable color palettes for information visualization. IEEE Transactions on Visualization and Computer Graphics 23, 1 (2017), 521–530.

14. D. A. Green. 2011. A colour scheme for the display of astronomical intensity images. Bulletin of the Astronomical Society of India 39 (June 2011), 289–295.

15. Mark Harrower and Cynthia A. Brewer. 2003. ColorBrewer.org: an online tool for selecting colour schemes for maps. The Cartographic Journal 40, 1 (2003), 27–37.

16. Jeffrey Heer and Michael Bostock. 2010. Crowdsourcing graphical perception: using mechanical turk to assess visualization design. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 203–212.

17. G. T. Herman and H. Levkowitz. 1992. Color scales for image data. IEEE Computer Graphics and Applications 12, 1 (1992), 72–80.

18. Michael D. Hyslop. 2006. A comparison of spectral color and greyscale continuous-tone map perception. Master’s thesis. Michigan State University.

19. Alan David Kalvin, Bernice E. Rogowitz, Adar Pelah, and Aron Cohen. 2000. Building perceptual color maps for visualizing interval data. In Human Vision and Electronic Imaging V, Vol. 3959. International Society for Optics and Photonics, 323–336.

20. Eamonn Keogh and Chotirat Ann Ratanamahatana. 2005. Exact indexing of dynamic time warping. Knowledge and Information Systems 7, 3 (2005), 358–386.

21. Gordon Kindlmann, Erik Reinhard, and Sarah Creem. 2002. Face-based luminance matching for perceptual colormap generation. In Proceedings of the IEEE Conference on Visualization ’02. IEEE Computer Society, 299–306.

22. Robert Kosara. 2016. An Empire Built On Sand: Reexamining What We Think We Know About Visualization. In Proceedings of the Beyond Time and Errors on Novel Evaluation Methods for Visualization. ACM, 162–168.

23. Mark P. Kumler and Richard E. Groop. 1990. Continuous-tone mapping of smooth surfaces. Cartography and Geographic Information Systems 17, 4 (1990), 279–289.

24. Kenneth Moreland. 2009. Diverging color maps for scientific visualization. In International Symposium on Visual Computing. Springer, 92–103.

25. Kenneth Moreland. 2016. Why We Use Bad Color Maps and What You Can Do About It. Electronic Imaging 2016, 16 (2016), 1–6.

26. Lace Padilla, P. Samuel Quinan, Miriah Meyer, and Sarah H. Creem-Regehr. 2017. Evaluating the Impact of Binning 2D Scalar Fields. IEEE Transactions on Visualization and Computer Graphics 23, 1 (2017), 431–440.

27. Stephen E. Palmer. 1999. Vision science: Photons to phenomenology. MIT Press.

28. Ken Perlin. 1985. An image synthesizer. ACM SIGGRAPH Computer Graphics 19, 3 (1985), 287–296.

29. Linda Williams Pickle. 2003. Usability testing of map designs. In Proceedings of Symposium on the Interface of Computing Science and Statistics. 42–56.

30. P. Samuel Quinan and Miriah Meyer. 2016. Visually comparing weather features in forecasts. IEEE Transactions on Visualization and Computer Graphics 22, 1 (2016), 389–398.

31. Penny L. Rheingans. 2000. Task-based color scale design. In 28th AIPR Workshop: 3D Visualization for Data Exploration and Decision Making. International Society for Optics and Photonics, 35–43.

32. Bernice E. Rogowitz and Lloyd A. Treinish. 1994. Using perceptual rules in interactive visualization. In IS&T/SPIE 1994 International Symposium on Electronic Imaging: Science and Technology. International Society for Optics and Photonics, 287–295.

33. Bernice E. Rogowitz, Lloyd A. Treinish, Steve Bryson, and others. 1996. How not to lie with visualization. Computers in Physics 10, 3 (1996), 268–273.

34. Samuel Silva, Beatriz Sousa Santos, and Joaquim Madeira. 2011. Using color in visualization: A survey. Computers & Graphics 35, 2 (2011), 320–333.

35. Maureen Stone, Danielle Albers Szafir, and Vidya Setlur. 2014. An engineering model for color difference as a function of size. In Color and Imaging Conference, Vol. 2014. Society for Imaging Science and Technology, 253–258.

36. Danielle Albers Szafir. 2017. Modeling Color Difference for Visualization Design. IEEE Transactions on Visualization and Computer Graphics (2017).

37. Christian Tominski, Georg Fuchs, and Heidrun Schumann. 2008. Task-driven color coding. In Information Visualisation, 2008. IV’08. 12th International Conference. IEEE, 373–380.

38. J. Gregory Trafton, Susan S. Kirschenbaum, Ted L. Tsui, Robert T. Miyamoto, James A. Ballas, and Paula D. Raymond. 2000. Turning pictures into numbers: extracting and generating information from complex visualizations. International Journal of Human-Computer Studies 53, 5 (2000), 827–850.

39. Bruce E. Trumbo. 1981. A theory for coloring bivariate statistical maps. The American Statistician 35, 4 (1981), 220–226.

40. W3C. 2016. CSS Values and Units Module Level 3. (2016). http://www.w3.org/TR/css3-values/#absolute-lengths

41. Colin Ware. 1988. Color sequences for univariate maps: Theory, experiments and principles. IEEE Computer Graphics and Applications 8, 5 (1988), 41–49.

42. Colin Ware. 2012. Information visualization: perception for design. Elsevier.

43. Liang Zhou and Charles D. Hansen. 2016. A survey of colormaps in visualization. IEEE Transactions on Visualization and Computer Graphics 22, 8 (2016), 2051–2069.

  • Introduction
  • Related Work
    • Design Strategies for Continuous Colormaps
    • Empirical Evaluations of Continuous Colormaps
    • Effects of Spatial Frequency
    • Summary
  • Hypotheses
  • Methodology
    • Stimuli
    • Colormaps
    • Experimental Design
  • Experiment 1: Quantity estimation
    • Participants
    • Procedure
    • Results
  • Experiment 2: Gradient perception
    • Participants
    • Procedure
    • Results
  • Experiment 3: Pattern perception
    • Participants
    • Procedure
    • Results
  • Discussion and Guidelines
    • Quantity Estimation
    • Gradient Perception
    • Pattern Perception
    • Yet Another Look at the Rainbow
  • Limitations and Future Work
  • Conclusions
  • References

Figure 4. Mean log of error in quantity estimation (experiment 1 vs. model). Ribbons represent 95% CIs of the experimental results. [Panels, one per colormap: greyscale, singlehue, bodyheat, cubehelix, extbodyheat, coolwarm, rainbow, spectral, blueyellow. x-axis: spatial frequency (3, 5, 7, 9, 11); y-axis: log(error); series: experiment, model.]

Figure 5. Mean log error in quantity estimation by colormap (left) and by colormap × spatial frequency. Intervals are 95% CIs. [Axes: log(error) against colormap, and log(error) against spatial frequency (3, 5, 7, 9, 11) with one series per colormap: greyscale, singlehue, bodyheat, cubehelix, extbodyheat, coolwarm, rainbow, spectral, blueyellow.]

by fitting the log of error to a linear mixed-effects model comprising two fixed effects (colormap, frequency) and two random effects. The first random effect accounts for individual variations among participants, and the second for intra-trial variations (recall that trials comprised different test elevations).
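Written out, a model of this kind might take the following additive form. This is a sketch in our own notation, not an equation from the paper: $p$ indexes participants, $t$ indexes test elevations, $c(p,t)$ is the colormap shown, and $f_{p,t}$ is the spatial frequency level. A colormap-by-frequency interaction term would add one further slope per colormap.

```latex
\log(\mathrm{error}_{p,t}) = \beta_0 + \beta_{c(p,t)} + \beta_f \, f_{p,t} + u_p + v_t + \varepsilon_{p,t},
\qquad
u_p \sim \mathcal{N}(0, \sigma_u^2), \quad
v_t \sim \mathcal{N}(0, \sigma_v^2), \quad
\varepsilon_{p,t} \sim \mathcal{N}(0, \sigma^2)
```

Here $\beta_0$ corresponds to the greyscale baseline, $\beta_{c}$ is the offset for each other colormap, and $u_p$ and $v_t$ are the two random intercepts.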

Figure 4 illustrates the mean log error by colormap and spatial frequency (the dashed trendline represents the model). The experimental results are combined in Figure 5 to ease comparison. A likelihood ratio test indicates the overall model is significant (χ²(18) = 14817, p < 0.001). To test for interaction between spatial frequency and colormap, we fit a reduced model that accounts for both frequency and colormap, but not their interaction. There was no significant difference between the full and the reduced model (χ²(8) = 10.695, p = 0.219), thus ruling out an interaction between colormap and spatial frequency. We will therefore interpret the reduced model, which accounts for both factors independently. Table 3 lists the model coefficients.
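The p-value of the full-versus-reduced comparison can be reproduced from the reported statistic. The sketch below uses the closed-form upper tail of the chi-square distribution, which holds only for even degrees of freedom; the function name is ours, and the check assumes the statistic reads χ²(8) = 10.695.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with even df.

    For df = 2m the survival function is exp(-x/2) * sum_{k<m} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for positive even df")
    half = x / 2.0
    series = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * series


# Likelihood ratio test of the interaction: 8 extra parameters in the full model.
p_interaction = chi2_sf_even_df(10.695, 8)
print(round(p_interaction, 3))
```

The result agrees with the reported p = 0.219 to within rounding of the test statistic, which supports reading the comparison as non-significant.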

The model predicts that a step-increase in spatial frequency yields a 0.11 increase in the log of estimation error. The difference in estimation error between the highest (f=11) and lowest (f=3) frequency levels is approximately 0.9 orders of magnitude. The effect of color encoding was equally evident: all the colormaps were significantly better than greyscale. However, the gain in accuracy was markedly different between the colormaps. Rainbow had the largest impact on estimation accuracy, reducing error by approximately 2.3 orders of magnitude compared to greyscale. The runner-up was spectral, which also contains substantial hue variation. However, spectral reduced error by only 1.75 orders of magnitude. On the other hand, Spiral colormaps (extbodyheat, cubehelix), which comprise multiple hues over a monotonically increasing luminance, decreased estimation errors by approximately 1.3–1.4 orders of magnitude, compared to 0.8–1.2 for Diverging ramps (blueyellow, coolwarm). Sequential schemes (singlehue and

Coefficient          Estimate   |t value|   p
(Intercept)             1.77       6.628    ***
singlehue              -0.26       2.253    *
bodyheat               -0.71       6.118    ***
cubehelix              -1.27      16.467    ***
extbodyheat            -1.39      12.073    ***
coolwarm               -1.18      10.122    ***
rainbow                -2.33      30.264    ***
spectral               -1.75      15.172    ***
blueyellow             -0.80       6.871    ***
Spatial Frequency       0.11      16.967    ***

Table 3. Effects of colormap and spatial frequency on the log of error in quantity estimation. The intercept represents greyscale as colormap (*** = p < 0.001, ** = p < 0.01, * = p < 0.05).

Figure 6. Average local gradient (i.e., terrain slope) at locations selected by participants. Error bars are 95% CIs. [x-axis: spatial frequency (3, 5, 7, 9, 11); y-axis: average gradient.]

bodyheat) had the least impact on error, with a mere improvement of 0.26–0.71 orders of magnitude relative to greyscale.
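The figures quoted above follow directly from the Table 3 coefficients. The sketch below recomputes two of them under two assumptions that the text suggests but does not state outright: that frequency enters the model as its raw level (3 to 11), and that the logarithm is base 10, as the phrase "orders of magnitude" implies. Variable and function names are ours.

```python
# Reduced-model coefficients copied from Table 3 (greyscale is the intercept).
INTERCEPT = 1.77
FREQUENCY_SLOPE = 0.11
COLORMAP_OFFSET = {
    "greyscale": 0.0, "singlehue": -0.26, "bodyheat": -0.71,
    "cubehelix": -1.27, "extbodyheat": -1.39, "coolwarm": -1.18,
    "rainbow": -2.33, "spectral": -1.75, "blueyellow": -0.80,
}


def predicted_log_error(colormap, frequency):
    """Additive prediction: intercept + colormap offset + slope * frequency."""
    return INTERCEPT + COLORMAP_OFFSET[colormap] + FREQUENCY_SLOPE * frequency


# Gap between the highest and lowest frequency levels (same for every colormap,
# because the reduced model has no interaction term).
frequency_gap = predicted_log_error("rainbow", 11) - predicted_log_error("rainbow", 3)

# Rainbow's advantage over the Spiral ramp extbodyheat, in log units.
rainbow_vs_spiral = COLORMAP_OFFSET["extbodyheat"] - COLORMAP_OFFSET["rainbow"]

print(round(frequency_gap, 2), round(rainbow_vs_spiral, 2))
print(round(10 ** rainbow_vs_spiral, 1))  # error ratio, assuming a base-10 log
```

The frequency gap of 0.88 log units matches the "approximately 0.9 orders of magnitude" quoted in the text, which is what motivates the raw-level assumption.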

The fact that we did not find an interaction between colormap and spatial frequency implies that the relative effectiveness of the different colormaps is stable across all spatial frequency levels tested. Rainbow is thus expected to be the most accurate colormap for quantity estimation, regardless of how spatially complex the data is. However, estimation accuracy will decrease comparably for all colormaps as the data becomes more spatially varied. This could reflect a combination of perceptual and motor difficulty in locating and clicking the intended location, due to the larger local gradients encountered in high-frequency maps (see Figure 6).

EXPERIMENT 2: GRADIENT PERCEPTION

The second experiment tests participants’ accuracy in comparing and judging the steepness of gradients. The ability to judge how fast the encoded quantities change between adjacent map locations is important in many contexts.

Participants

We recruited 126 participants (50 females, 74 males, 2 others) with a mean age of 35.62 years (SD = 9.55 years). Participants had an overall success rate of 67.75% (SD = 11.41%). Ten participants (7.9% of subjects) were dropped from the analysis because their overall accuracy was worse than chance, having correctly answered less than 50% of trials in a two-alternative forced-choice experiment.

Figure 7. Probability of successful gradient judgment (experiment 2 vs. model). Ribbons represent 95% CIs of the experimental results. [Panels, one per colormap: greyscale, singlehue, bodyheat, cubehelix, extbodyheat, coolwarm, rainbow, spectral, blueyellow. x-axis: spatial frequency (3, 5, 7, 9, 11); y-axis: p(success); series: experiment, model.]

Figure 8. Percentage of correctly answered trials in a gradient perception task (experiment 2). Intervals are 95% CIs. [Axes: proportion correct against spatial frequency (3, 5, 7, 9, 11) with one series per colormap, and proportion correct against colormap: greyscale, singlehue, bodyheat, cubehelix, extbodyheat, coolwarm, rainbow, spectral, blueyellow.]

Procedure

Each trial consisted of a map with two squares juxtaposed on top (see Figure 3). Participants were prompted to “compare the steepness of terrain inside the two boxes” and “click on the box that is steeper on average”. The two boxes were identical in size (175×175 pixels, or 3.5°×3.5° of visual angle). However, terrain steepness, which was calculated by taking the average first derivative within each box, was varied systematically. The gradient ratio between the flatter and the steeper boxes was fixed to one of four levels: 0.8, 0.83, 0.86, 0.9 (±0.05). A lower ratio implies a larger and potentially more perceptible slope difference, making the task easier. However, the two boxes encompassed terrain with identical height ranges to reduce variability in the appearance of their peaks (a potential confound in slope judgment [26]).
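As a rough illustration of the steepness measure described above, the sketch below averages the gradient magnitude inside a square box of a height grid and forms the flatter-to-steeper ratio. It is a sketch under our own assumptions: the text does not say how the first derivative was discretized, so central differences and the gradient magnitude are our choices, and the function names and toy terrain are hypothetical.

```python
import math


def mean_slope(height, top, left, size):
    """Mean gradient magnitude over a size x size box, via central differences.

    `height` is a 2D list (rows of elevations). The box must sit at least one
    cell away from the grid border so that every neighbor exists.
    """
    total = 0.0
    for r in range(top, top + size):
        for c in range(left, left + size):
            dz_dx = (height[r][c + 1] - height[r][c - 1]) / 2.0
            dz_dy = (height[r + 1][c] - height[r - 1][c]) / 2.0
            total += math.hypot(dz_dx, dz_dy)
    return total / (size * size)


def gradient_ratio(height, box_a, box_b, size):
    """Ratio of the flatter box's mean slope to the steeper box's mean slope."""
    flatter, steeper = sorted(
        (mean_slope(height, *box_a, size), mean_slope(height, *box_b, size))
    )
    return flatter / steeper


# Toy terrain: slope 0.8 per column on the left half, 1.0 per column on the right.
terrain = [[0.8 * c if c < 10 else 8.0 + (c - 10) for c in range(20)] for _ in range(8)]
ratio = gradient_ratio(terrain, box_a=(1, 1), box_b=(1, 12), size=5)
print(round(ratio, 2))
```

In this toy case the ratio lands on the easiest of the four levels listed above; a stimulus generator would search for box placements whose ratio falls within the tolerance of a target level.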

Participants first completed a set of 6 training trials that included feedback before proceeding to the main trials. The order of stimuli was similar to the previous experiment: the study consisted of 3 rounds, one with each of the 3 colormaps the participant was assigned to see. Each round encompassed all 5 spatial frequency levels. Participants completed 4 trials with each colormap and frequency combination, spanning a range of easy to difficult tests (a total of 60 trials). The order of colormap presentation was fully counterbalanced across participants.

Results

Figure 7 illustrates participants’ probability of correctly identifying the steeper gradient. The experimental data is shown separately in Figure 8. We fit the results to a logistic regression model comprising two fixed effects (colormap, frequency). The model also included two random effects to account for individual differences among participants and intra-trial variations (recall that trials varied in difficulty). The model essentially predicts the odds of correctly identifying the steeper gradient. A likelihood ratio test indicates the overall model is significant (χ²(17) = 42165, p < 0.001). The

(a) Main effects

Coef.            Est.     |z|      p
(Intercept)      0.80    0.516
singlehue        1.14    0.415
bodyheat         0.94    0.183
cubehelix        0.90    0.354
extbodyheat      1.13    0.388
coolwarm         0.66    1.316
rainbow          0.65    1.354
spectral         0.92    0.259
blueyellow       0.91    0.281
Frequency        1.15    4.633    ***

(b) Interaction effects (colormap × frequency)

Coef.            Est.     |z|      p
singlehue        0.96    0.910
bodyheat         1.04    1.020
cubehelix        1.06    1.303
extbodyheat      1.03    0.596
coolwarm         1.14    3.008    **
rainbow          1.13    2.876    **
spectral         1.06    1.330
blueyellow       1.12    2.471    *

Table 4. Main effects of colormap and spatial frequency on success odds in gradient judgment (a) and their interaction (b). Coefficients shown correspond to the exponentiated model estimates to reflect odds ratios. The intercept represents greyscale (*** = p < 0.001, ** = p < 0.01, * = p < 0.05).

model correctly predicts 74.67% of outcomes. We find significant interaction between colormap and spatial frequency (χ²(8) = 2781, p < 0.001). Table 4 shows model coefficients.

The main-effect coefficients for all colormaps were not significant, indicating that all colormaps perform comparably to greyscale at low spatial frequencies. Participants are thus unlikely to benefit from the use of color when judging gradients in low-variance data. The main effect of spatial frequency, however, is significant. The model estimates that a step-increase in spatial frequency improves the odds of correct judgment by 15%. Estimating gradients appears to be easier in maps with more complex spatial structures.

The model indicates several noteworthy interactions. Although the use of color had no significant effect in low-frequency maps, several colormaps significantly outperformed greyscale at high frequency. The divergent coolwarm improved participants’ success odds by 14% for every step-increase in spatial frequency. Similarly, rainbow and blueyellow increased the odds by approximately 13% and 12%, respectively. Notably, these three colormaps contain substantial variation in saturation (coolwarm and blueyellow) or hue (rainbow). All other colormaps tested were not reliably different from greyscale.
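Because the coefficients are odds ratios, a percentage change in odds is not the same as a percentage-point change in accuracy. The sketch below converts between the two, using the frequency and coolwarm interaction coefficients from Table 4. The 60% baseline accuracy is a hypothetical value chosen for illustration, and the function names are ours.

```python
def probability_to_odds(p):
    """Odds in favor of success for a probability p in [0, 1)."""
    return p / (1.0 - p)


def odds_to_probability(odds):
    """Inverse of probability_to_odds."""
    return odds / (1.0 + odds)


def apply_odds_ratio(p, odds_ratio, steps=1):
    """Probability after multiplying the odds by `odds_ratio`, `steps` times."""
    return odds_to_probability(probability_to_odds(p) * odds_ratio ** steps)


FREQUENCY_OR = 1.15             # Table 4a: per-step effect for the greyscale baseline
COOLWARM_INTERACTION_OR = 1.14  # Table 4b: additional per-step effect for coolwarm

baseline = 0.60  # hypothetical starting accuracy, for illustration only
greyscale_next = apply_odds_ratio(baseline, FREQUENCY_OR)
coolwarm_next = apply_odds_ratio(baseline, FREQUENCY_OR * COOLWARM_INTERACTION_OR)
print(round(greyscale_next, 3), round(coolwarm_next, 3))
```

In a logistic model with a colormap-by-frequency interaction, the per-step multiplier for a given colormap is the product of the frequency coefficient and that colormap's interaction coefficient, which is why the two are multiplied for coolwarm.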

EXPERIMENT 3: PATTERN PERCEPTION

Having tested accuracy in quantity estimation and gradient perception, we now evaluate participants’ ability to integrate these two skills. Experiment 3 required participants to extract a longitudinal pattern from the map and match it to an external representation, a task originally devised by Hyslop [18].

Participants

We recruited 165 participants (79 females, 84 males, 2 others). The mean participant age was 36.04 years (SD = 11.71). Overall, participants had a mean success rate of 78.51% in matching the correct pattern (SD = 19.31%). We dropped

Figure 9. Probability of successful pattern matching (experiment 3 vs. model). Ribbons denote 95% CIs of the experimental data. [Panels, one per colormap: greyscale, singlehue, bodyheat, cubehelix, extbodyheat, coolwarm, rainbow, spectral, blueyellow. x-axis: spatial frequency (3, 5, 7, 9, 11); y-axis: p(success); series: experiment, model.]

Figure 10. Percentage of correctly answered trials in experiment 3. Intervals are 95% CIs. [Axes: proportion correct against spatial frequency (3, 5, 7, 9, 11) with one series per colormap, and proportion correct against colormap: greyscale, singlehue, bodyheat, cubehelix, extbodyheat, coolwarm, rainbow, spectral, blueyellow.]

seven participants from the analysis (4.2% of subjects) whose overall accuracy was two standard deviations below the mean.

Procedure

Participants first completed a set of 6 training trials that included feedback before proceeding to the main experiment. Each trial consisted of a map with two markers labeled A and B (see Figure 3). The markers were horizontally displaced by 350 pixels (7° of visual angle). Participants were given the following prompt: “Imagine a line from A to B. Select the elevation profile below that most closely matches its slope”. They then selected a choice among a set of 6 patterns, including the actual elevation profile and 5 other distractors. Distractors were generated from the same map so as to reflect similar spatial frequency characteristics, and had to be 65–70% similar to the actual profile (as measured by dynamic time warping [20]). Additionally, profiles and distractors were selected to not have peaks or valleys at the endpoints. These criteria, determined after a pilot, ensure similar task difficulty across the trials.
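For readers unfamiliar with the similarity measure mentioned above, here is a minimal sketch of the classic dynamic time warping distance between two sampled profiles. It is the textbook quadratic-time formulation with an absolute-difference cost, not the indexing method of [20], and it does not reproduce how the distance was converted into the 65–70% similarity criterion, which the text does not specify. The function name is ours.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences.

    cost[i][j] holds the cheapest cumulative cost of aligning a[:i] with b[:j],
    where each aligned pair contributes |a_i - b_j|.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


profile = [1, 2, 3]
stretched = [1, 2, 2, 3]   # same shape, locally stretched in time
shifted_up = [2, 3, 4]     # same shape, offset by one unit
print(dtw_distance(profile, stretched), dtw_distance(profile, shifted_up))
```

The stretched copy has zero distance because warping absorbs local timing differences, whereas a vertical offset does not vanish. That tolerance to horizontal stretching is what makes the measure suitable for comparing elevation profiles whose peaks are slightly displaced.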

The order of stimuli was similar to the previous two experiments: the study consisted of 3 rounds, one with each of the 3 colormaps the participant was assigned to see, and encompassing the 5 spatial frequency levels. Thus, every participant saw 3×5 colormap and frequency combinations, and completed 3 pattern matching trials with each combination, for a total of 45 trials. As in the previous experiments, the order of colormap presentation was fully counterbalanced.
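The trial counts and the counterbalancing scheme can be sketched as follows. The code enumerates every presentation order of three colormaps and builds one participant's trial list. The particular colormap triple, the function name, and the sequential ordering of frequencies within a round are our assumptions; the frequency levels are read from the figure axes.

```python
from itertools import permutations

FREQUENCY_LEVELS = [3, 5, 7, 9, 11]  # the five levels shown on the figure axes


def build_schedule(colormap_order, trials_per_cell):
    """One participant's trials: a round per colormap, each covering every level."""
    return [
        (colormap, frequency, trial)
        for colormap in colormap_order
        for frequency in FREQUENCY_LEVELS
        for trial in range(trials_per_cell)
    ]


# Hypothetical assignment of three colormaps to one group of participants.
assigned = ["coolwarm", "rainbow", "greyscale"]

# Full counterbalancing of three items means all 3! = 6 presentation orders.
orders = list(permutations(assigned))

pattern_schedule = build_schedule(orders[0], trials_per_cell=3)   # experiment 3
gradient_schedule = build_schedule(orders[0], trials_per_cell=4)  # experiment 2
print(len(orders), len(pattern_schedule), len(gradient_schedule))
```

Assigning participants to the six orders in rotation would give each order equal representation, which is the usual way to realize full counterbalancing in practice.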

Results

We fit the results to a logistic regression model comprising two fixed effects (colormap, frequency) and two random effects to account for individual differences and intra-trial variations. Figure 9 shows the odds of successful profile matching. The experimental results are illustrated separately in Figure 10. A likelihood ratio test indicates the model is significant (χ²(17) = 39467, p < 0.001). We find significant interaction between colormap and spatial frequency

(a) Main effects

Coef.            Est.     |z|      p
(Intercept)      9.03    6.502    ***
singlehue        1.34    0.649
bodyheat         0.80    0.503
cubehelix        0.77    0.681
extbodyheat      0.81    0.466
coolwarm         0.44    1.868    .
rainbow          0.98    0.064
spectral         0.57    1.284
blueyellow       0.73    0.697
Frequency        0.93    2.058    *

(b) Interaction effects (colormap × frequency)

Coef.            Est.     |z|      p
singlehue        0.98    0.450
bodyheat         1.02    0.456
cubehelix        1.09    1.781    .
extbodyheat      1.04    0.797
coolwarm         1.17    3.143    **
rainbow          1.01    0.259
spectral         1.11    2.198    *
blueyellow       1.09    1.683    .

Table 5. Main effects of colormap and spatial frequency on success odds in experiment 3 (a) and their interaction (b). Coefficients depict exponentiated model estimates to reflect odds ratios. The intercept corresponds to greyscale (*** = p < 0.001, ** = p < 0.01, * = p < 0.05, . = p < 0.1).

(χ²(8) = 18.131, p < 0.05): the relative effectiveness of the colormaps appears to vary with spatial frequency.

Overall, we find a significant detrimental main effect of spatial frequency on pattern perception, as indicated by a 0.93 Frequency coefficient (Table 5a). This translates to a 7% drop in the odds of correctly matching the profile for every step-increase in spatial frequency. The main effect coefficients for all colormaps were not significant, indicating that the use of color at low spatial frequency is unlikely to improve pattern perception as compared to a plain greyscale ramp.

Colormap performance begins to diverge at high spatial frequency. Only two colormaps have significant and large enough odds-ratio coefficients (i.e., > 1.07) to overcome the frequency-induced perceptual difficulty: spectral and coolwarm increased the odds of correct pattern matching by 11% and 17%, respectively, for every step-increase in spatial frequency (after adjusting for frequency effects alone). Additionally, blueyellow and cubehelix were associated with a 9% improvement, but the advantage was not reliable (p < 0.1). On the other hand, extbodyheat, bodyheat, and rainbow had small (and non-significant) odds-ratio coefficients (0.98–1.04 < 1.07), indicating that, similar to greyscale, they are associated with lower success odds in complex maps.
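The 1.07 threshold mentioned above is the interaction odds ratio needed to cancel the 0.93 frequency effect, since the net per-step multiplier for a colormap is the product of the two. The short sketch below works this out from the Table 5 coefficients; it deals in point estimates only and says nothing about their statistical reliability. Variable names are ours.

```python
FREQUENCY_OR = 0.93  # Table 5a: per-step odds ratio for the greyscale baseline

# Table 5b: colormap x frequency interaction odds ratios.
INTERACTION_OR = {
    "singlehue": 0.98, "bodyheat": 1.02, "cubehelix": 1.09, "extbodyheat": 1.04,
    "coolwarm": 1.17, "rainbow": 1.01, "spectral": 1.11, "blueyellow": 1.09,
}

# Interaction ratio at which the net per-step effect is exactly neutral.
break_even = 1.0 / FREQUENCY_OR

# Net per-step odds multiplier for each colormap.
net_or = {name: FREQUENCY_OR * ratio for name, ratio in INTERACTION_OR.items()}

# Colormaps whose point estimate implies improving odds as frequency rises.
improving = sorted(name for name, ratio in net_or.items() if ratio > 1.0)

print(round(break_even, 3))
print({name: round(ratio, 3) for name, ratio in net_or.items()})
print(improving)
```

Cubehelix and blueyellow clear the break-even point on their point estimates, but as noted above their interaction terms were not reliable, so only coolwarm and spectral can be said to offset the frequency penalty with confidence.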

In short, only two of the tested colormaps (coolwarm and spectral) appear to reliably support pattern perception at high spatial frequency. Both consist of a diverging ramp with uniformly-stepped luminance. All other colormaps (including greyscale) suffered as data complexity increased.

DISCUSSION AND GUIDELINES

Our work sheds new light on how spatial complexity impacts the perception of continuous color-coded maps. The experiments also led to some surprising findings that are at odds with current guidelines. We interpret these results and accordingly

Figure 11. Model-derived colormap ranking and guidelines by task and spatial frequency (* = p < 0.05, . = p < 0.1, relative to greyscale). [Panels: Quantity estimation (ranking unaffected by spatial frequency); Gradient perception and Pattern perception, each at low (0.38 cycle/deg) and high (1.38 cycle/deg) spatial frequency, with colormaps ordered from better to worse and several panels labeled "no significant difference". Guidelines shown: (1) Maximize range of saturated hues, regardless of spatial frequency. (2) At high spatial frequency: fully-saturated hues, or diverging ramps with chroma variation. (3) At high spatial frequency: diverging ramps with uniformly-stepped luminance.]

devise new task- and frequency-aware color mapping guidelines (indicated by ★). We also rank the tested colormaps and summarize our guidelines in Figure 11.

Quantity Estimation

Our first hypothesis (H1) predicts hue- and saturation-varying ramps to be more accurate at low spatial frequencies, and ramps with monotonically increasing luminance to be more accurate at high frequencies. As discussed, H1 is based on the relative contrast-sensitivity of our visual system [32]. A quantity estimation task (experiment 1) shows no interaction between colormap and spatial frequency. While increased spatial complexity is associated with higher estimation error, the effect is similar across all colormaps. We thus reject H1.

On the other hand, results provide support for H2, which predicts that hue-varying ramps will lead to more accurate estimation. Indeed, the top performing colormaps (rainbow and spectral) contain substantial hue variation. Results from experiment 1 thus replicate earlier findings by Ware [41], but also extend them to show that spatial frequency has no apparent impact on the effectiveness of hue-varying ramps. Our data shows that rainbow and spectral are the most accurate among the colormaps tested, even at the highest levels of spatial frequency. Altogether, these results lend further support to the theory that lookup errors in color-coded maps are largely caused by systematic simultaneous contrast shifts [41], rather than being affected by contrast sensitivity modulation [32]. These shifts are best counteracted with colormaps that vary non-monotonically along one or more perceptual channels.

A corollary result is that mixing monotonic luminance withhue variation would lead to significant accuracy loss Indeeddata from experiment 1 indicates that rainbow is approxi-mately an order of magnitude more accurate than extbodyheatand cubehelix These Spiral colormaps are designed to be moreaccurate rainbow alternatives for interval data [4 24] Con-trary we find that they reduce accuracy compared to a purelyhue-varying ramp This finding suggests that when estimat-ing a continuously coded spatial quantity people benefit mostfrom a large dynamic hue range Incorporating monotonicallyincreasing lightness within the colormap would necessarilyreduce the hue range thereby diminishing accuracy

F Guideline 1 We recommend maximizing hue variation toimprove quantity estimation irrespective of spatial frequency

Gradient Perception
Gradient perception allows people to distinguish how quickly the encoded attribute changes between adjacent locations, an essential skill when evaluating the distribution and variance of spatial data. We find that the task is strongly modulated by the data's spatial complexity: increased spatial frequency appears to enhance the perception of gradients. This is unsurprising, as maps with jagged surfaces are likely to exhibit more pronounced (and thus more perceptible) differences in slope. Colormap effectiveness was also impacted by spatial frequency: color encoding did not help participants distinguish gradients at low frequency levels, as all colormaps showed similar performance to greyscale. However, three colormaps demonstrated a significant advantage at high frequencies. Coolwarm, rainbow, and blueyellow improved perception odds by 12–14% for every step-increase in spatial frequency. All three employed one of two design strategies: a diverging ramp with varying saturation, or a fully saturated hue rotation.

The above results contradict H1, which predicts hue- and saturation-varying colormaps to perform better at low frequencies. In fact, we see the opposite. The results also do not support H3, which predicts better performance for monotonically luminant ramps in structure perception tasks. In fact, all three top-performing ramps exhibit non-monotonic luminance.

★ Guideline 2: For tasks requiring gradient perception at high spatial frequency, we recommend a range of fully saturated hues (e.g., rainbow) or diverging chroma-varying ramps (e.g., coolwarm or blueyellow).

Pattern Perception
Experiment 3 prompted participants to match the elevation profile along a horizontal path with an external pattern. We expected colormaps with monotonically increasing luminance to be more accurate at this task (H3), but results were not entirely consistent with this prediction. While all tested colormaps had comparable performance at low spatial frequency, only two colormaps, coolwarm and spectral, gave participants higher odds of successfully matching the pattern at high frequency. Both colormaps comprise a diverging ramp with uniformly stepped (though not strictly monotonic) luminance. By contrast, sequential and spiral ramps performed just as poorly as greyscale in complex maps, and so did rainbow.

The above results are consistent with Moreland's argument that diverging ramps provide "maximal perceptual resolution" (through increasing and decreasing luminance intervals) [24], potentially enabling high-frequency patterns to be resolved more easily. Our results may also explain why diverging schemes performed better in medical diagnosis [3]: we suspect such tasks require the analysis of potentially high-frequency features (e.g., small tissue aberrations).

★ Guideline 3: We recommend diverging ramps with equidistant luminance steps (e.g., coolwarm and spectral) to support the perception of longitudinal patterns at high spatial frequency. Rainbow, sequential, and spiral schemes should be avoided in complex maps, especially if the task involves the analysis and matching of fine-grained features.

Yet Another Look at the Rainbow
Results of experiments 1 and 2 may shed light on why rainbow remains a popular choice among scientists [25], despite being considered a bad choice by the visualization community [4, 30]. Our data reveal that, counterintuitively, rainbow is robust for estimating a smoothly varying quantitative attribute, regardless of spatial complexity. Moreover, rainbow provides good support for gradient estimation at high spatial frequency. These two tasks correspond to elementary visual analytic primitives, including characterizing distributions, determining ranges, and filtering [1]. Moreover, studies show that when experts attempt to form a mental model about a visualization, they first go through a time-consuming process of extracting quantitative data "at a rather detailed level" [38]. For instance, a weather forecaster will look up pressure and wind changes, estimating current readings at landmark locations in the map, before making a forecast. Our data suggest that rainbow provides good support for these tasks, making it a potentially reasonable choice for weather forecasters.

Critique of rainbow centers on its tendency to create sharp visual boundaries, particularly around its yellow regions [4]. Experts also criticize the use of fully saturated hues [24], which result in non-uniform perceptual steps within the color ramp. The common intuition is that these two factors combined will inevitably distort the perception of quantities. We do not see evidence to support this hypothesis. On the contrary, attempts to 'linearize' the rainbow by monotonically increasing the luminance of hues could reduce estimation accuracy by up to an order of magnitude.

★ Guideline 4: Rather than entirely discouraging the use of rainbow, we suggest that it can be a reasonable design choice for conveying spatial distributions and variances, and in tasks that require quantitative, as opposed to geometric, precision. However, rainbow has a number of limitations. The use of green and red hues is problematic for people with color deficiency. Moreover, rainbow is probably ineffective at revealing high-frequency patterns. Interestingly, these shortcomings are balanced by diverging ramps (e.g., coolwarm) which, although quantitatively inaccurate, appear to support pattern perception at high spatial frequency. We thus argue that hue-varying and diverging colormaps support orthogonal tasks in continuous maps, and should therefore be considered as complementary rather than mutually exclusive choices.
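Read together, the guidelines in this section amount to a small decision table keyed by task and spatial frequency. The sketch below is one hypothetical way to encode it; the names `RECOMMENDED` and `recommend`, the task labels, and the "low"/"high" frequency labels are illustrative assumptions, not part of the study materials.

```python
# Hypothetical lookup table paraphrasing Guidelines 1-3.
# Keys are (task, frequency level); values are colormaps the guidelines favor.
RECOMMENDED = {
    ("quantity", "low"): ["rainbow", "spectral"],
    ("quantity", "high"): ["rainbow", "spectral"],
    ("gradient", "high"): ["rainbow", "coolwarm", "blueyellow"],
    ("pattern", "high"): ["coolwarm", "spectral"],
}


def recommend(task, frequency):
    """Return the colormaps favored for a task and frequency level.

    An empty list means no tested colormap was reliably better than
    greyscale for that combination (gradient and pattern tasks at low
    spatial frequency).
    """
    return RECOMMENDED.get((task, frequency), [])
```

A table like this makes the complementarity argued in Guideline 4 visible: rainbow appears under quantity and gradient tasks, while the diverging coolwarm appears under gradient and pattern tasks.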

LIMITATIONS AND FUTURE WORK
There are some limitations to our work that should be considered. First, as with other crowdsourced graphical perception studies, we gain access to a larger pool of participants but sacrifice some experimental control [16]. Particularly relevant to our study are the variations in participants' monitors, including color calibration and display resolution, as well as the illumination conditions in their homes or offices, all of which can impact color perception. We could not control these factors, but attempted to counteract their variation by involving a larger sample (N=381). Although we expect crowdsourcing to improve the ecological validity of results and guidelines, uncontrolled variations can potentially reduce our ability to detect small but otherwise significant differences in performance between tested conditions. Future lab studies should therefore attempt to replicate our findings with added controls.

Second, our study employed a limited set of tasks designed to measure elementary perceptual operators, including quantity estimation, gradient perception, and pattern matching. There is an opportunity to test higher-level tasks that mimic scientific analyses more closely, including the identification and comparison of larger map features (e.g., fronts, ridges). Additionally, some of the tasks we tested could be re-evaluated in more authentic formulations. For instance, a metric task could require participants to estimate the quantity at a specific location on the map. This formulation is arguably more realistic than the task we tested, which simply asked participants to click any location thought to match a specified quantity.

Third, our analysis was focused exclusively on spatial frequency, and there are good reasons to consider this factor [11, 32]. However, there are also additional data characteristics to consider, including, for instance, the distribution of amplitudes within the map. Such factors will influence the distribution of colors in the image and may thus impact perception.

Lastly, we limited our study to synthetically generated scalar fields to precisely vary spatial frequency while controlling for other confounds. However, synthetic stimuli may also introduce (unknown) perceptual or cognitive biases. Therefore, additional studies are needed to replicate our findings with datasets from real-world domains (e.g., meteorology, geophysics, or oceanography) and with domain experts. We also restricted this study to participants with normal color vision. Therefore, our results may not generalize to the approximately 5% of the population who have some form of color deficiency.

CONCLUSIONS
We conducted three experiments to investigate the effects of spatial frequency and colormap characteristics on the perception of continuous pseudocolor maps. Our results indicate that spatial frequency impacts judgment of the encoded quantities and structures. While viewers' quantity estimation accuracy exhibited a predictable response, increased data complexity had a more nuanced effect on gradient and pattern comprehension, the impact of which was dependent on the colormap used. Designers should therefore consider both the type of task and the spatial complexity of the underlying data. We re-examined current guidelines and devised new recommendations for color-coding of continuous spatial data.

REFERENCES
1. Robert Amar, James Eagan, and John Stasko. 2005. Low-level components of analytic activity in information visualization. In Information Visualization, 2005. INFOVIS 2005. IEEE Symposium on. IEEE, 111–117.

2. Lawrence D. Bergman, Bernice E. Rogowitz, and Lloyd A. Treinish. 1995. A rule-based tool for assisting colormap selection. In Proceedings of the 6th conference on Visualization '95. IEEE Computer Society, 118.

3. Michelle Borkin, Krzysztof Gajos, Amanda Peters, Dimitrios Mitsouras, Simone Melchionna, Frank Rybicki, Charles Feldman, and Hanspeter Pfister. 2011. Evaluation of artery visualizations for heart disease diagnosis. IEEE Transactions on Visualization and Computer Graphics 17, 12 (2011), 2479–2488.

4. David Borland and Russell M. Taylor II. 2007. Rainbow color map (still) considered harmful. IEEE Computer Graphics and Applications 27, 2 (2007).

5. Cynthia A. Brewer. 1994. Visualization in Modern Cartography. Elsevier Science, Chapter Color use guidelines for mapping and visualization.

6. Cynthia A. Brewer. 1996. Guidelines for selecting colors for diverging schemes on maps. The Cartographic Journal 33, 2 (1996), 79–86.

7. Cynthia A. Brewer, Alan M. MacEachren, Linda W. Pickle, and Douglas Herrmann. 1997. Mapping mortality: Evaluating color schemes for choropleth maps. Annals of the Association of American Geographers 87, 3 (1997), 411–438.

8. Roxana Bujack, Terece L. Turton, Francesca Samsel, Colin Ware, David H. Rogers, and James Ahrens. 2017. The Good, the Bad, and the Ugly: A Theoretical Framework for the Assessment of Continuous Colormaps. IEEE Transactions on Visualization and Computer Graphics (2017).

9. William S. Cleveland and Robert McGill. 1983. A color-caused optical illusion on a statistical graph. The American Statistician 37, 2 (1983), 101–105.

10. William S. Cleveland and Robert McGill. 1984. Graphical perception: Theory, experimentation, and application to the development of graphical methods. J. Amer. Statist. Assoc. 79, 387 (1984), 531–554.

11. Russell De Valois and Karen De Valois. 1990. Spatial Vision. Oxford University Press.

12. Russell L. De Valois, Duane G. Albrecht, and Lisa G. Thorell. 1978. Cortical cells: bar and edge detectors, or spatial frequency filters? In Frontiers in Visual Science. Springer, 544–556.

13. Connor C. Gramazio, David H. Laidlaw, and Karen B. Schloss. 2017. Colorgorical: Creating discriminable and preferable color palettes for information visualization. IEEE Transactions on Visualization and Computer Graphics 23, 1 (2017), 521–530.

14. D. A. Green. 2011. A colour scheme for the display of astronomical intensity images. Bulletin of the Astronomical Society of India 39 (June 2011), 289–295.

15. Mark Harrower and Cynthia A. Brewer. 2003. ColorBrewer.org: an online tool for selecting colour schemes for maps. The Cartographic Journal 40, 1 (2003), 27–37.

16. Jeffrey Heer and Michael Bostock. 2010. Crowdsourcing graphical perception: using Mechanical Turk to assess visualization design. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 203–212.

17. G. T. Herman and H. Levkowitz. 1992. Color scales for image data. IEEE Computer Graphics and Applications 12, 1 (1992), 72–80.

18. Michael D. Hyslop. 2006. A comparison of spectral color and greyscale continuous-tone map perception. Master's thesis. Michigan State University.

19. Alan David Kalvin, Bernice E. Rogowitz, Adar Pelah, and Aron Cohen. 2000. Building perceptual color maps for visualizing interval data. In Human Vision and Electronic Imaging V, Vol. 3959. International Society for Optics and Photonics, 323–336.

20. Eamonn Keogh and Chotirat Ann Ratanamahatana. 2005. Exact indexing of dynamic time warping. Knowledge and Information Systems 7, 3 (2005), 358–386.

21. Gordon Kindlmann, Erik Reinhard, and Sarah Creem. 2002. Face-based luminance matching for perceptual colormap generation. In Proceedings of the IEEE Conference on Visualization '02. IEEE Computer Society, 299–306.

22. Robert Kosara. 2016. An Empire Built On Sand: Reexamining What We Think We Know About Visualization. In Proceedings of the Beyond Time and Errors on Novel Evaluation Methods for Visualization. ACM, 162–168.

23. Mark P. Kumler and Richard E. Groop. 1990. Continuous-tone mapping of smooth surfaces. Cartography and Geographic Information Systems 17, 4 (1990), 279–289.

24. Kenneth Moreland. 2009. Diverging color maps for scientific visualization. In International Symposium on Visual Computing. Springer, 92–103.

25. Kenneth Moreland. 2016. Why We Use Bad Color Maps and What You Can Do About It. Electronic Imaging 2016, 16 (2016), 1–6.

26. Lace Padilla, P. Samuel Quinan, Miriah Meyer, and Sarah H. Creem-Regehr. 2017. Evaluating the Impact of Binning 2D Scalar Fields. IEEE Transactions on Visualization and Computer Graphics 23, 1 (2017), 431–440.

27. Stephen E. Palmer. 1999. Vision science: Photons to phenomenology. MIT Press.

28. Ken Perlin. 1985. An image synthesizer. ACM SIGGRAPH Computer Graphics 19, 3 (1985), 287–296.

29. Linda Williams Pickle. 2003. Usability testing of map designs. In Proceedings of Symposium on the Interface of Computing Science and Statistics. 42–56.

30. P. Samuel Quinan and Miriah Meyer. 2016. Visually comparing weather features in forecasts. IEEE Transactions on Visualization and Computer Graphics 22, 1 (2016), 389–398.

31. Penny L. Rheingans. 2000. Task-based color scale design. In 28th AIPR Workshop: 3D Visualization for Data Exploration and Decision Making. International Society for Optics and Photonics, 35–43.

32. Bernice E. Rogowitz and Lloyd A. Treinish. 1994. Using perceptual rules in interactive visualization. In IS&T/SPIE 1994 International Symposium on Electronic Imaging: Science and Technology. International Society for Optics and Photonics, 287–295.

33. Bernice E. Rogowitz, Lloyd A. Treinish, Steve Bryson, and others. 1996. How not to lie with visualization. Computers in Physics 10, 3 (1996), 268–273.

34. Samuel Silva, Beatriz Sousa Santos, and Joaquim Madeira. 2011. Using color in visualization: A survey. Computers & Graphics 35, 2 (2011), 320–333.

35. Maureen Stone, Danielle Albers Szafir, and Vidya Setlur. 2014. An engineering model for color difference as a function of size. In Color and Imaging Conference, Vol. 2014. Society for Imaging Science and Technology, 253–258.

36. Danielle Albers Szafir. 2017. Modeling Color Difference for Visualization Design. IEEE Transactions on Visualization and Computer Graphics (2017).

37. Christian Tominski, Georg Fuchs, and Heidrun Schumann. 2008. Task-driven color coding. In Information Visualisation, 2008. IV'08. 12th International Conference. IEEE, 373–380.

38. J. Gregory Trafton, Susan S. Kirschenbaum, Ted L. Tsui, Robert T. Miyamoto, James A. Ballas, and Paula D. Raymond. 2000. Turning pictures into numbers: extracting and generating information from complex visualizations. International Journal of Human-Computer Studies 53, 5 (2000), 827–850.

39. Bruce E. Trumbo. 1981. A theory for coloring bivariate statistical maps. The American Statistician 35, 4 (1981), 220–226.

40. W3C. 2016. CSS Values and Units Module Level 3. (2016). http://www.w3.org/TR/css3-values/#absolute-lengths

41. Colin Ware. 1988. Color sequences for univariate maps: Theory, experiments and principles. IEEE Computer Graphics and Applications 8, 5 (1988), 41–49.

42. Colin Ware. 2012. Information visualization: perception for design. Elsevier.

43. Liang Zhou and Charles D. Hansen. 2016. A survey of colormaps in visualization. IEEE Transactions on Visualization and Computer Graphics 22, 8 (2016), 2051–2069.

  • Introduction
  • Related Work
    • Design Strategies for Continuous Colormaps
    • Empirical Evaluations of Continuous Colormaps
    • Effects of Spatial Frequency
    • Summary
  • Hypotheses
  • Methodology
    • Stimuli
    • Colormaps
    • Experimental Design
  • Experiment 1: Quantity estimation
    • Participants
    • Procedure
    • Results
  • Experiment 2: Gradient perception
    • Participants
    • Procedure
    • Results
  • Experiment 3: Pattern perception
    • Participants
    • Procedure
    • Results
  • Discussion and Guidelines
    • Quantity Estimation
    • Gradient Perception
    • Pattern Perception
    • Yet Another Look at the Rainbow
  • Limitations and Future Work
  • Conclusions
  • References

[Figure 7: one panel per colormap (greyscale, singlehue, bodyheat, cubehelix, extbodyheat, coolwarm, rainbow, spectral, blueyellow); x-axis: spatial frequency (3 to 11); y-axis: p(success); series: experiment vs. model.]

Figure 7. Probability of successful gradient judgment (experiment 2 vs. model). Ribbons represent 95% CIs of the experimental results.

[Figure 8: two panels showing the proportion of correct responses, one by spatial frequency (3 to 11) and one by colormap; color legend: greyscale, singlehue, bodyheat, cubehelix, extbodyheat, coolwarm, rainbow, spectral, blueyellow.]

Figure 8. Percentage of correctly answered trials in a gradient perception task (experiment 2). Intervals are 95% CIs.

Procedure
Each trial consisted of a map with two squares juxtaposed on top (see Figure 3). Participants were prompted to "compare the steepness of terrain inside the two boxes" and "click on the box that is steeper on average." The two boxes were identical in size (175×175 pixels, or 3.5°×3.5° of visual angle). However, terrain steepness, which was calculated by taking the average first derivative within each box, was varied systematically. The gradient ratio between the flatter and the steeper boxes was fixed to one of four levels: 0.8, 0.83, 0.86, 0.9 (±0.05). A lower ratio implies a larger and potentially more perceptible slope difference, making the task easier. However, the two boxes encompassed terrain with identical height ranges to reduce variability in the appearance of their peaks (a potential confound in slope judgment [26]).
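As a rough illustration of the steepness measure described above, the sketch below averages a finite-difference gradient magnitude inside each box and takes the flatter-to-steeper ratio. The text only says "average first derivative", so the forward-difference operator, the use of gradient magnitude, and the function names `mean_gradient` and `gradient_ratio` are assumptions made for this example.

```python
def mean_gradient(heights, top, left, size):
    """Average gradient magnitude inside a square window of a 2D height grid.

    heights is a list of rows; the window covers rows top..top+size-1 and
    columns left..left+size-1. Forward differences are used (an assumption).
    """
    total, count = 0.0, 0
    for r in range(top, top + size - 1):
        for c in range(left, left + size - 1):
            dx = heights[r][c + 1] - heights[r][c]  # change along x
            dy = heights[r + 1][c] - heights[r][c]  # change along y
            total += (dx * dx + dy * dy) ** 0.5
            count += 1
    return total / count


def gradient_ratio(heights, box_a, box_b, size):
    """Ratio of the flatter box's mean gradient to the steeper box's (0..1)."""
    ga = mean_gradient(heights, *box_a, size)
    gb = mean_gradient(heights, *box_b, size)
    steeper = max(ga, gb)
    if steeper == 0:
        return 1.0  # both boxes are flat, so they are equally steep
    return min(ga, gb) / steeper
```

Under this reading, a stimulus generator would search for box placements whose ratio falls near one of the four target levels.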

Participants first completed a set of 6 training trials that included feedback before proceeding to the main trials. The order of stimuli was similar to the previous experiment: the study consisted of 3 rounds, one with each of the 3 colormaps the participant was assigned to see. Each round encompassed all 5 spatial frequency levels. Participants completed 4 trials with each colormap and frequency combination, spanning a range of easy to difficult tests (a total of 60 trials). The order of colormap presentation was fully counterbalanced across participants.

Results
Figure 7 illustrates participants' probability of correctly identifying the steeper gradient. The experimental data is shown separately in Figure 8. We fit the results to a logistic regression model comprising two fixed effects (colormap, frequency). The model also included two random effects to account for individual differences among participants and intra-trial variations (recall that trials varied in difficulty). The model essentially predicts the odds of correctly identifying the steeper gradient. A likelihood ratio test indicates the overall model is significant (χ²(17) = 421.65, p < 0.001). The model correctly predicts 74.67% of outcomes. We find significant interaction between colormap and spatial frequency (χ²(8) = 27.81, p < 0.001). Table 4 shows model coefficients.

(a) Main effects
Coef.          Est.    |z|      p
(Intercept)    0.80    0.516
singlehue      1.14    0.415
bodyheat       0.94    0.183
cubehelix      0.90    0.354
extbodyheat    1.13    0.388
coolwarm       0.66    1.316
rainbow        0.65    1.354
spectral       0.92    0.259
blueyellow     0.91    0.281
Frequency      1.15    4.633    ∗∗∗

(b) Interaction effects (colormap × frequency)
Coef.          Est.    |z|      p
singlehue      0.96    0.910
bodyheat       1.04    1.020
cubehelix      1.06    1.303
extbodyheat    1.03    0.596
coolwarm       1.14    3.008    ∗∗
rainbow        1.13    2.876    ∗∗
spectral       1.06    1.330
blueyellow     1.12    2.471    ∗

Table 4. Main effects of colormap and spatial frequency on success odds in gradient judgment (a) and their interaction (b). Coefficients shown correspond to the exponentiated model estimates to reflect odds ratios. The intercept represents greyscale (∗∗∗ = p < 0.001, ∗∗ = p < 0.01, ∗ = p < 0.05).

The main-effect coefficients for all colormaps were not significant, indicating that all colormaps perform comparably to greyscale at low spatial frequencies. Participants are thus unlikely to benefit from the use of color when judging gradients in low-variance data. The main effect of spatial frequency, however, is significant. The model estimates that a step-increase in spatial frequency improves the odds of correct judgment by 15%. Estimating gradients appears to be easier in maps with more complex spatial structures.

The model indicates several noteworthy interactions. Although the use of color had no significant effect in low-frequency maps, several colormaps significantly outperformed greyscale at high frequency. The divergent coolwarm improved participants' success odds by 14% for every step-increase in spatial frequency. Similarly, rainbow and blueyellow increased the odds by approximately 13% and 12%, respectively. Notably, these three colormaps contain substantial variation in saturation (coolwarm and blueyellow) or hue (rainbow). All other colormaps tested were not reliably different from greyscale.
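To make the per-step interpretation concrete, the sketch below shows how odds ratios of this kind compound in a logistic model with a colormap-by-frequency interaction. Relative to greyscale at the same frequency, the Frequency main effect cancels, leaving the colormap's main-effect odds ratio times its interaction odds ratio raised to the number of steps. The function names are hypothetical, and counting steps from the model's baseline frequency is an assumption, since the coding of the frequency variable is not spelled out here.

```python
def odds_multiplier(main_or, interaction_or, steps):
    """Odds of success relative to greyscale at the same frequency step.

    main_or: exponentiated main-effect coefficient of the colormap.
    interaction_or: exponentiated colormap-by-frequency coefficient.
    steps: number of frequency steps above the model baseline (assumed coding).
    """
    return main_or * interaction_or ** steps


def odds_to_probability(odds):
    """Convert odds p / (1 - p) back to a probability p."""
    return odds / (1.0 + odds)
```

For example, with the coolwarm estimates reported for experiment 2 (0.66 and 1.14), `odds_multiplier(0.66, 1.14, 4)` exceeds 1, whereas the value at three steps does not; the exact crossover depends on the assumed step coding.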

EXPERIMENT 3: PATTERN PERCEPTION
Having tested accuracy in quantity estimation and gradient perception, we now evaluate participants' ability to integrate these two skills. Experiment 3 required participants to extract a longitudinal pattern from the map and match it to an external representation, a task originally devised by Hyslop [18].

Participants
We recruited 165 participants (79 females, 84 males, 2 others). The mean participant age was 36.04 years (SD = 11.71). Overall, participants had a mean success rate of 78.51% in matching the correct pattern (SD = 19.31). We dropped seven participants from the analysis (4.2% of subjects) whose overall accuracy was two standard deviations below the mean.

[Figure 9: one panel per colormap (greyscale, singlehue, bodyheat, cubehelix, extbodyheat, coolwarm, rainbow, spectral, blueyellow); x-axis: spatial frequency (3 to 11); y-axis: p(success); series: experiment vs. model.]

Figure 9. Probability of successful pattern matching (experiment 3 vs. model). Ribbons denote 95% CIs of the experimental data.

[Figure 10: two panels showing the proportion of correct responses, one by spatial frequency (3 to 11) and one by colormap; color legend: greyscale, singlehue, bodyheat, cubehelix, extbodyheat, coolwarm, rainbow, spectral, blueyellow.]

Figure 10. Percentage of correctly answered trials in experiment 3. Intervals are 95% CIs.

Procedure
Participants first completed a set of 6 training trials that included feedback before proceeding to the main experiment. Each trial consisted of a map with two markers labeled A and B (see Figure 3). The markers were horizontally displaced by 350 pixels (7° of visual angle). Participants were given the following prompt: "Imagine a line from A to B. Select the elevation profile below that most closely matches its slope." They then selected a choice among a set of 6 patterns, including the actual elevation profile and 5 distractors. Distractors were generated from the same map, so as to reflect similar spatial frequency characteristics, and had to be 65–70% similar to the actual profile (as measured by dynamic time warping [20]). Additionally, profiles and distractors were selected to not have peaks or valleys at the endpoints. These criteria, determined after a pilot, ensure similar task difficulty across the trials.
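For readers unfamiliar with the warping-based comparison used to screen distractors, the following is a minimal sketch of the classic dynamic time warping distance between two elevation profiles. It is a textbook formulation rather than the study's code: the function name `dtw_distance` and the absolute-difference local cost are assumptions, and the normalization that turns a distance into the 65–70% similarity figure is not specified in the text, so it is left out.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences.

    Runs in O(len(a) * len(b)) time using the standard recurrence:
    each cell adds the local cost to the cheapest of the three
    neighbouring alignments (insertion, deletion, match).
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local cost
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

Because warping lets one sample align with several, two profiles that differ only by a locally stretched segment score a distance of zero, which is why the measure suits profiles sampled along a path.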

The order of stimuli was similar to the previous two experiments: the study consisted of 3 rounds, one with each of the 3 colormaps the participant was assigned to see, and encompassing the 5 spatial frequency levels. Thus, every participant saw 3×5 colormap and frequency combinations, and completed 3 pattern matching trials with each combination, for a total of 45 trials. As in the previous experiments, the order of colormap presentation was fully counterbalanced.

Results
We fit the results to a logistic regression model comprising two fixed effects (colormap, frequency) and two random effects to account for individual differences and intra-trial variations. Figure 9 shows the odds of successful profile matching. The experimental results are illustrated separately in Figure 10. A likelihood ratio test indicates the model is significant (χ²(17) = 394.67, p < 0.001). We find significant interaction between colormap and spatial frequency (χ²(8) = 18.131, p < 0.05): the relative effectiveness of the colormaps appears to vary with spatial frequency.

(a) Main effects
Coef.          Est.    |z|      p
(Intercept)    9.03    6.502    ∗∗∗
singlehue      1.34    0.649
bodyheat       0.80    0.503
cubehelix      0.77    0.681
extbodyheat    0.81    0.466
coolwarm       0.44    1.868    .
rainbow        0.98    0.064
spectral       0.57    1.284
blueyellow     0.73    0.697
Frequency      0.93    2.058    ∗

(b) Interaction effects (colormap × frequency)
Coef.          Est.    |z|      p
singlehue      0.98    0.450
bodyheat       1.02    0.456
cubehelix      1.09    1.781    .
extbodyheat    1.04    0.797
coolwarm       1.17    3.143    ∗∗
rainbow        1.01    0.259
spectral       1.11    2.198    ∗
blueyellow     1.09    1.683    .

Table 5. Main effects of colormap and spatial frequency on success odds in experiment 3 (a) and their interaction (b). Coefficients depict exponentiated model estimates to reflect odds ratios. The intercept corresponds to greyscale (∗∗∗ = p < 0.001, ∗∗ = p < 0.01, ∗ = p < 0.05, . = p < 0.1).

Overall, we find a significant detrimental main effect of spatial frequency on pattern perception, as indicated by a 0.93 Frequency coefficient (Table 5a). This translates to a 7% drop in the odds of correctly matching the profile for every step-increase in spatial frequency. The main-effect coefficients for all colormaps were not significant, indicating that the use of color at low spatial frequency is unlikely to improve pattern perception as compared to a plain greyscale ramp.

Colormap performance begins to diverge at high spatial frequency. Only two colormaps have significant and large enough odds-ratio coefficients (i.e., > 1.07) to overcome the frequency-induced perceptual difficulty: spectral and coolwarm increased the odds of correct pattern matching by 11% and 17%, respectively, for every step-increase in spatial frequency (after adjusting for frequency effects alone). Additionally, blueyellow and cubehelix were associated with a 9% improvement, but the advantage was not reliable (p < 0.1). On the other hand, extbodyheat, bodyheat, and rainbow had small (and insignificant) odds-ratio coefficients (0.98–1.04 < 1.07), indicating that, similar to greyscale, they are associated with lower success odds in complex maps.
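The 1.07 threshold can be read as a break-even condition. As a sketch, assume the net per-step change in odds for a colormap is the product of the Frequency odds ratio (0.93) and that colormap's interaction odds ratio, written here as r (a symbol introduced for this example). The colormap offsets the frequency penalty only when that product exceeds 1:

```latex
\begin{align*}
0.93 \cdot r > 1 \quad &\Longleftrightarrow \quad r > \frac{1}{0.93} \approx 1.075 \\
\text{coolwarm:} \quad & 0.93 \times 1.17 \approx 1.088 \\
\text{spectral:} \quad & 0.93 \times 1.11 \approx 1.032 \\
\text{rainbow:} \quad & 0.93 \times 1.01 \approx 0.939
\end{align*}
```

Under this reading, coolwarm and spectral yield a net per-step gain, whereas rainbow still loses roughly 6% in odds with every frequency step.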

In short, only two of the tested colormaps (coolwarm and spectral) appear to reliably support pattern perception at high spatial frequency. Both consist of a diverging ramp with uniformly stepped luminance. All other colormaps (including greyscale) suffered as data complexity increased.

DISCUSSION AND GUIDELINES
Our work sheds new light on how spatial complexity impacts the perception of continuous color-coded maps. The experiments also led to some surprising findings that are at odds with current guidelines. We interpret these results and accordingly devise new task- and frequency-aware color mapping guidelines (indicated by ★). We also rank the tested colormaps and summarize our guidelines in Figure 11.

[Figure 11: colormap rankings (better to worse) with one guideline per task. Quantity estimation (ranking unaffected by spatial frequency): (1) Maximize range of saturated hues regardless of spatial frequency. Gradient perception, at low (0.38 cycle/deg) and high (1.38 cycle/deg) spatial frequency: (2) At high spatial frequency: fully saturated hues or diverging ramps with chroma variation. Pattern perception, at low (0.38 cycle/deg) and high (1.38 cycle/deg) spatial frequency: (3) At high spatial frequency: diverging ramps with uniformly stepped luminance. Panel annotation: "no significant difference".]

Figure 11. Model-derived colormap ranking and guidelines by task and spatial frequency (∗ = p < 0.05, . = p < 0.1, relative to greyscale).

Quantity EstimationOur first hypothesis (H1) predicts hue- and saturation-varyingramps to be more accurate at low spatial frequencies andramps with monotonically increasing luminance to be moreaccurate at high frequencies As discussed H1 is based onthe relative contrast-sensitivity of our visual system [32] Aquantity estimation task (experiment 1) shows no interactionbetween colormap and spatial frequency While increasedspatial complexity is associated with higher estimation errorthe effect is similar across all colormaps We thus reject H1

On the other hands results provide support for H2 whichpredicts that hue-varying ramps will lead to more accurateestimation Indeed the top performing colormaps (rainbowand spectral) contain substantial hue variation Results fromexperiment 1 thus replicate earlier findings by Ware [41] butalso extend them to show that spatial frequency have no ap-parent impact on the effectiveness of hue-varying ramps Ourdata shows that rainbow and spectral are the most accurateamong the colormaps tested even at the highest levels of spa-tial frequency Altogether these results lend further support tothe theory that lookup errors in color-coded maps are largelycaused by systematic simultaneous contrast shifts [41] ratherthan being affected by contrast sensitivity modulation [32]These shifts are best counteracted with colormaps that varynon-monotonically along one or more perceptual channels

A corollary result is that mixing monotonic luminance withhue variation would lead to significant accuracy loss Indeeddata from experiment 1 indicates that rainbow is approxi-mately an order of magnitude more accurate than extbodyheatand cubehelix These Spiral colormaps are designed to be moreaccurate rainbow alternatives for interval data [4 24] Con-trary we find that they reduce accuracy compared to a purelyhue-varying ramp This finding suggests that when estimat-ing a continuously coded spatial quantity people benefit mostfrom a large dynamic hue range Incorporating monotonicallyincreasing lightness within the colormap would necessarilyreduce the hue range thereby diminishing accuracy

F Guideline 1: We recommend maximizing hue variation to improve quantity estimation, irrespective of spatial frequency.

Gradient Perception
Gradient perception allows people to distinguish how quickly the encoded attribute changes between adjacent locations, an essential skill when evaluating the distribution and variance of spatial data. We find that the task is strongly modulated by the data’s spatial complexity: increased spatial frequency appears to enhance the perception of gradients. This is unsurprising, as maps with jagged surfaces are likely to exhibit more pronounced (and thus more perceptible) differences in slope. Colormap effectiveness was also impacted by spatial frequency: color encoding did not help participants distinguish gradients at low frequency levels, as all colormaps showed similar performance to greyscale. However, three colormaps demonstrated a significant advantage at high frequencies: coolwarm, rainbow, and blueyellow improved perception odds by 12–14% for every step-increase in spatial frequency. All three employed one of two design strategies: a diverging ramp with varying saturation, or a fully saturated hue rotation.

The above results contradict H1, which predicts hue- and saturation-varying colormaps to perform better at low frequencies. In fact, we see the opposite. The results also do not support H3, which predicts better performance for ramps with monotonic luminance in structure perception tasks. In fact, all three top-performing ramps exhibit non-monotonic luminance.
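Whether a given ramp has monotonic luminance, the property H3 relies on, can be checked numerically. The following is a minimal sketch and not the authors' procedure: it assumes a colormap supplied as sRGB triples in [0, 1], uses the standard sRGB relative-luminance formula, and builds two hypothetical ramps (a grey ramp and a fully saturated blue-to-red hue rotation standing in for a rainbow) purely for illustration.

```python
import colorsys


def relative_luminance(rgb):
    """Relative luminance (0..1) of an sRGB colour with channels in [0, 1]."""
    def linearize(c):
        # Undo the sRGB transfer curve before weighting the channels.
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def is_luminance_monotonic(ramp):
    """True if luminance never decreases, or never increases, along the ramp."""
    lum = [relative_luminance(c) for c in ramp]
    steps = [b - a for a, b in zip(lum, lum[1:])]
    return all(s >= 0 for s in steps) or all(s <= 0 for s in steps)


# Hypothetical ramps for illustration only (not the stimuli used in the study).
N = 64
grey_ramp = [(i / (N - 1),) * 3 for i in range(N)]
# Fully saturated hue rotation from blue (hue 2/3) to red (hue 0).
hue_ramp = [colorsys.hsv_to_rgb((2 / 3) * (1 - i / (N - 1)), 1.0, 1.0) for i in range(N)]

print(is_luminance_monotonic(grey_ramp))  # True
print(is_luminance_monotonic(hue_ramp))   # False
```

The hue rotation fails the check because luminance rises from blue to cyan, dips at green, peaks at yellow, and falls again toward red.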

F Guideline 2: For tasks requiring gradient perception at high spatial frequency, we recommend a range of fully saturated hues (e.g., rainbow) or diverging, chroma-varying ramps (e.g., coolwarm or blueyellow).

Pattern Perception
Experiment 3 prompted participants to match the elevation profile along a horizontal path with an external pattern. We expected colormaps with monotonically increasing luminance to be more accurate at this task (H3), but results were not entirely consistent with this prediction. While all tested colormaps had comparable performance at low spatial frequency, only two colormaps, coolwarm and spectral, gave participants higher odds of successfully matching the pattern at high frequency. Both colormaps comprise a diverging ramp with uniformly-stepped (though not strictly monotonic) luminance. By contrast, sequential and spiral ramps performed just as poorly as greyscale in complex maps, and so did rainbow.

The above results are consistent with Moreland’s argument that diverging ramps provide “maximal perceptual resolution” (through increasing and decreasing luminance intervals) [24], potentially enabling high-frequency patterns to be resolved more easily. Our results may also explain why diverging schemes performed better in medical diagnosis [3]: we suspect such tasks require the analysis of potentially high-frequency features (e.g., small tissue aberrations).

F Guideline 3: We recommend diverging ramps with equidistant luminance steps (e.g., coolwarm and spectral) to support the perception of longitudinal patterns at high spatial frequency. Rainbow, Sequential, and Spiral schemes should be avoided in complex maps, especially if the task involves the analysis and matching of fine-grained features.

Yet Another Look at the Rainbow
Results of experiments 1 and 2 may shed light on why rainbow remains a popular choice among scientists [25], despite being considered a bad choice by the visualization community [4, 30]. Our data reveal that, counterintuitively, rainbow is robust for estimating a smoothly varying quantitative attribute, regardless of spatial complexity. Moreover, rainbow provides good support for gradient estimation at high spatial frequency. These two tasks correspond to elementary visual analytic primitives, including characterizing distributions, determining ranges, and filtering [1]. Moreover, studies show that when experts attempt to form a mental model about a visualization, they first go through a time-consuming process of extracting quantitative data “at a rather detailed level” [38]. For instance, a weather forecaster will look up pressure and wind changes, estimating current readings at landmark locations in the map before making a forecast. Our data suggest that rainbow provides good support for these tasks, making it a potentially reasonable choice for weather forecasters.

Critique of rainbow centers on its tendency to create sharp visual boundaries, particularly around its yellow regions [4]. Experts also criticize the use of fully saturated hues [24], which result in non-uniform perceptual steps within the color ramp. The common intuition is that these two factors combined will inevitably distort the perception of quantities. We do not see evidence to support this hypothesis. On the contrary, attempts to ‘linearize’ the rainbow by monotonically increasing the luminance of hues could reduce estimation accuracy by up to an order of magnitude.

F Guideline 4: Rather than entirely discouraging the use of rainbow, we suggest that it can be a reasonable design choice for conveying spatial distributions and variances, and in tasks that require quantitative, as opposed to geometric, precision. However, rainbow has a number of limitations. The use of green and red hues is problematic for people with color deficiency. Moreover, rainbow is probably ineffective at revealing high-frequency patterns. Interestingly, these shortcomings are balanced by diverging ramps (e.g., coolwarm), which, although quantitatively inaccurate, appear to support pattern perception at high spatial frequency. We thus argue that hue-varying and diverging colormaps support orthogonal tasks in continuous maps, and should therefore be considered as complementary rather than mutually exclusive choices.
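Guidelines 1–3 can be condensed into a small lookup. The sketch below is one hedged reading of the recommendations above rather than a tool from the study: the function name, the task labels, and the boolean high-frequency flag are hypothetical conveniences, and the returned colormap names are simply the examples cited in the guidelines and the surrounding discussion.

```python
def recommend_colormaps(task, high_frequency):
    """Return example colormaps suggested by Guidelines 1-3 for a task.

    task: "quantity", "gradient", or "pattern" (hypothetical labels).
    high_frequency: True if the map has high spatial frequency.
    An empty list means no tested colormap was reliably better than greyscale.
    """
    if task == "quantity":
        # Guideline 1: maximize hue variation, irrespective of spatial frequency.
        return ["rainbow", "spectral"]
    if task == "gradient":
        # Guideline 2 applies at high spatial frequency only.
        return ["rainbow", "coolwarm", "blueyellow"] if high_frequency else []
    if task == "pattern":
        # Guideline 3 applies at high spatial frequency only.
        return ["coolwarm", "spectral"] if high_frequency else []
    raise ValueError(f"unknown task: {task!r}")


print(recommend_colormaps("pattern", high_frequency=True))  # ['coolwarm', 'spectral']
```

Treating spatial frequency as a two-level flag is a simplification; the study varied it over five levels.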

LIMITATIONS AND FUTURE WORK
There are some limitations to our work that should be considered. First, as with other crowdsourced graphical perception studies, we gain access to a larger pool of participants but sacrifice some experimental control [16]. Particularly relevant to our study are the variations in participants’ monitors, including color calibration and display resolution, as well as the illumination conditions in their homes or offices, all of which can impact color perception. We could not control these factors, but attempted to counteract their variation by involving a larger sample (N=381). Although we expect crowdsourcing to improve the ecological validity of results and guidelines, uncontrolled variations can potentially reduce our ability to detect small but otherwise significant differences in performance between tested conditions. Future lab studies should therefore attempt to replicate our findings with added controls.

Second, our study employed a limited set of tasks designed to measure elementary perceptual operators, including quantity estimation, gradient perception, and pattern matching. There is an opportunity to test higher-level tasks that mimic scientific analyses more closely, including the identification and comparison of larger map features (e.g., fronts, ridges). Additionally, some of the tasks we tested could be re-evaluated in more authentic formulations. For instance, a metric task could require participants to estimate the quantity at a specific location on the map. This formulation is arguably more realistic than the task we tested, which simply asked participants to click any location thought to match a specified quantity.

Third, our analysis was focused exclusively on spatial frequency, and there are good reasons to consider this factor [11, 32]. However, there are also additional data characteristics to consider, including, for instance, the distribution of amplitudes within the map. Such factors will influence the distribution of colors in the image, and may thus impact perception.

Lastly, we limited our study to synthetically generated scalar fields to precisely vary spatial frequency while controlling for other confounds. However, synthetic stimuli may also introduce (unknown) perceptual or cognitive biases. Therefore, additional studies are needed to replicate our findings with datasets from real-world domains (e.g., meteorology, geophysics, or oceanography) and with domain experts. We also restricted this study to participants with normal color vision. Therefore, our results may not generalize to approximately 5% of the population who have some form of color deficiency.

CONCLUSIONS
We conducted three experiments to investigate the effects of spatial frequency and colormap characteristics on the perception of continuous pseudocolor maps. Our results indicate that spatial frequency impacts judgment of the encoded quantities and structures. While viewers’ quantity estimation accuracy exhibited a predictable response, increased data complexity had a more nuanced effect on gradient and pattern comprehension, the impact of which was dependent on the colormap used. Designers should therefore consider both the type of task and the spatial complexity of the underlying data. We re-examined current guidelines and devised new recommendations for color-coding of continuous spatial data.

REFERENCES
1. Robert Amar, James Eagan, and John Stasko. 2005. Low-level components of analytic activity in information visualization. In Information Visualization, 2005. INFOVIS 2005. IEEE Symposium on. IEEE, 111–117.

2. Lawrence D. Bergman, Bernice E. Rogowitz, and Lloyd A. Treinish. 1995. A rule-based tool for assisting colormap selection. In Proceedings of the 6th conference on Visualization ’95. IEEE Computer Society, 118.

3. Michelle Borkin, Krzysztof Gajos, Amanda Peters, Dimitrios Mitsouras, Simone Melchionna, Frank Rybicki, Charles Feldman, and Hanspeter Pfister. 2011. Evaluation of artery visualizations for heart disease diagnosis. IEEE Transactions on Visualization and Computer Graphics 17, 12 (2011), 2479–2488.

4. David Borland and Russell M. Taylor II. 2007. Rainbow color map (still) considered harmful. IEEE Computer Graphics and Applications 27, 2 (2007).

5. Cynthia A. Brewer. 1994. Visualization in Modern Cartography. Elsevier Science, Chapter Color use guidelines for mapping and visualization.

6. Cynthia A. Brewer. 1996. Guidelines for selecting colors for diverging schemes on maps. The Cartographic Journal 33, 2 (1996), 79–86.

7. Cynthia A. Brewer, Alan M. MacEachren, Linda W. Pickle, and Douglas Herrmann. 1997. Mapping mortality: Evaluating color schemes for choropleth maps. Annals of the Association of American Geographers 87, 3 (1997), 411–438.

8. Roxana Bujack, Terece L. Turton, Francesca Samsel, Colin Ware, David H. Rogers, and James Ahrens. 2017. The Good, the Bad, and the Ugly: A Theoretical Framework for the Assessment of Continuous Colormaps. IEEE Transactions on Visualization and Computer Graphics (2017).

9. William S. Cleveland and Robert McGill. 1983. A color-caused optical illusion on a statistical graph. The American Statistician 37, 2 (1983), 101–105.

10. William S. Cleveland and Robert McGill. 1984. Graphical perception: Theory, experimentation, and application to the development of graphical methods. J. Amer. Statist. Assoc. 79, 387 (1984), 531–554.

11. Russell De Valois and Karen De Valois. 1990. Spatial Vision. Oxford University Press.

12. Russell L. De Valois, Duane G. Albrecht, and Lisa G. Thorell. 1978. Cortical cells: bar and edge detectors, or spatial frequency filters? In Frontiers in Visual Science. Springer, 544–556.

13. Connor C. Gramazio, David H. Laidlaw, and Karen B. Schloss. 2017. Colorgorical: Creating discriminable and preferable color palettes for information visualization. IEEE Transactions on Visualization and Computer Graphics 23, 1 (2017), 521–530.

14. D. A. Green. 2011. A colour scheme for the display of astronomical intensity images. Bulletin of the Astronomical Society of India 39 (June 2011), 289–295.

15. Mark Harrower and Cynthia A. Brewer. 2003. ColorBrewer.org: an online tool for selecting colour schemes for maps. The Cartographic Journal 40, 1 (2003), 27–37.

16. Jeffrey Heer and Michael Bostock. 2010. Crowdsourcing graphical perception: using Mechanical Turk to assess visualization design. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 203–212.

17. G. T. Herman and H. Levkowitz. 1992. Color scales for image data. IEEE Computer Graphics and Applications 12, 1 (1992), 72–80.

18. Michael D. Hyslop. 2006. A comparison of spectral color and greyscale continuous-tone map perception. Master’s thesis. Michigan State University.

19. Alan David Kalvin, Bernice E. Rogowitz, Adar Pelah, and Aron Cohen. 2000. Building perceptual color maps for visualizing interval data. In Human Vision and Electronic Imaging V, Vol. 3959. International Society for Optics and Photonics, 323–336.

20. Eamonn Keogh and Chotirat Ann Ratanamahatana. 2005. Exact indexing of dynamic time warping. Knowledge and Information Systems 7, 3 (2005), 358–386.

21. Gordon Kindlmann, Erik Reinhard, and Sarah Creem. 2002. Face-based luminance matching for perceptual colormap generation. In Proceedings of the IEEE Conference on Visualization ’02. IEEE Computer Society, 299–306.

22. Robert Kosara. 2016. An Empire Built On Sand: Reexamining What We Think We Know About Visualization. In Proceedings of the Beyond Time and Errors on Novel Evaluation Methods for Visualization. ACM, 162–168.

23. Mark P. Kumler and Richard E. Groop. 1990. Continuous-tone mapping of smooth surfaces. Cartography and Geographic Information Systems 17, 4 (1990), 279–289.

24. Kenneth Moreland. 2009. Diverging color maps for scientific visualization. In International Symposium on Visual Computing. Springer, 92–103.

25. Kenneth Moreland. 2016. Why We Use Bad Color Maps and What You Can Do About It. Electronic Imaging 2016, 16 (2016), 1–6.

26. Lace Padilla, P. Samuel Quinan, Miriah Meyer, and Sarah H. Creem-Regehr. 2017. Evaluating the Impact of Binning 2D Scalar Fields. IEEE Transactions on Visualization and Computer Graphics 23, 1 (2017), 431–440.

27. Stephen E. Palmer. 1999. Vision Science: Photons to Phenomenology. MIT Press.

28. Ken Perlin. 1985. An image synthesizer. ACM SIGGRAPH Computer Graphics 19, 3 (1985), 287–296.

29. Linda Williams Pickle. 2003. Usability testing of map designs. In Proceedings of Symposium on the Interface of Computing Science and Statistics. 42–56.

30. P. Samuel Quinan and Miriah Meyer. 2016. Visually comparing weather features in forecasts. IEEE Transactions on Visualization and Computer Graphics 22, 1 (2016), 389–398.

31. Penny L. Rheingans. 2000. Task-based color scale design. In 28th AIPR Workshop: 3D Visualization for Data Exploration and Decision Making. International Society for Optics and Photonics, 35–43.

32. Bernice E. Rogowitz and Lloyd A. Treinish. 1994. Using perceptual rules in interactive visualization. In IS&T/SPIE 1994 International Symposium on Electronic Imaging: Science and Technology. International Society for Optics and Photonics, 287–295.

33. Bernice E. Rogowitz, Lloyd A. Treinish, Steve Bryson, and others. 1996. How not to lie with visualization. Computers in Physics 10, 3 (1996), 268–273.

34. Samuel Silva, Beatriz Sousa Santos, and Joaquim Madeira. 2011. Using color in visualization: A survey. Computers & Graphics 35, 2 (2011), 320–333.

35. Maureen Stone, Danielle Albers Szafir, and Vidya Setlur. 2014. An engineering model for color difference as a function of size. In Color and Imaging Conference, Vol. 2014. Society for Imaging Science and Technology, 253–258.

36. Danielle Albers Szafir. 2017. Modeling Color Difference for Visualization Design. IEEE Transactions on Visualization and Computer Graphics (2017).

37. Christian Tominski, Georg Fuchs, and Heidrun Schumann. 2008. Task-driven color coding. In Information Visualisation, 2008. IV’08. 12th International Conference. IEEE, 373–380.

38. J. Gregory Trafton, Susan S. Kirschenbaum, Ted L. Tsui, Robert T. Miyamoto, James A. Ballas, and Paula D. Raymond. 2000. Turning pictures into numbers: extracting and generating information from complex visualizations. International Journal of Human-Computer Studies 53, 5 (2000), 827–850.

39. Bruce E. Trumbo. 1981. A theory for coloring bivariate statistical maps. The American Statistician 35, 4 (1981), 220–226.

40. W3C. 2016. CSS Values and Units Module Level 3. (2016). http://www.w3.org/TR/css3-values/#absolute-lengths

41. Colin Ware. 1988. Color sequences for univariate maps: Theory, experiments and principles. IEEE Computer Graphics and Applications 8, 5 (1988), 41–49.

42. Colin Ware. 2012. Information Visualization: Perception for Design. Elsevier.

43. Liang Zhou and Charles D. Hansen. 2016. A survey of colormaps in visualization. IEEE Transactions on Visualization and Computer Graphics 22, 8 (2016), 2051–2069.


[Figure 9: one panel per colormap (greyscale, singlehue, bodyheat, cubehelix, extbodyheat, coolwarm, rainbow, spectral, blueyellow). x-axis: spatial frequency (3, 5, 7, 9, 11); y-axis: p(success); series: experiment, model.]

Figure 9. Probability of successful pattern matching (experiment 3 vs. model). Ribbons denote 95% CIs of the experimental data.

[Figure 10: two plots of the proportion of correct trials by colorscale (greyscale, singlehue, bodyheat, cubehelix, extbodyheat, coolwarm, rainbow, spectral, blueyellow): one against spatial frequency (3, 5, 7, 9, 11) and one per colormap. y-axis: correct.]

Figure 10. Percentage of correctly answered trials in experiment 3. Intervals are 95% CIs.

seven participants from the analysis (4.2% of subjects) whose overall accuracy was two standard deviations below the mean.
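The exclusion rule can be sketched as follows. This is an assumed reading of the criterion, not the authors' script: participants are dropped when their overall accuracy falls more than two sample standard deviations below the group mean. The function name and the accuracy values are made up for illustration.

```python
from statistics import mean, stdev


def exclude_low_performers(accuracy_by_participant, n_sd=2.0):
    """Split participants into (kept, excluded) by overall accuracy.

    A participant is excluded when their accuracy is more than n_sd sample
    standard deviations below the group mean (assumed form of the criterion).
    """
    values = list(accuracy_by_participant.values())
    cutoff = mean(values) - n_sd * stdev(values)
    kept = {p: a for p, a in accuracy_by_participant.items() if a >= cutoff}
    excluded = {p: a for p, a in accuracy_by_participant.items() if a < cutoff}
    return kept, excluded


# Made-up accuracies for illustration; "p10" performs far below the rest.
accuracies = {f"p{i}": 0.85 for i in range(1, 10)}
accuracies["p10"] = 0.20
kept, excluded = exclude_low_performers(accuracies)
print(sorted(excluded))  # ['p10']
```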

Procedure
Participants first completed a set of 6 training trials that included feedback before proceeding to the main experiment. Each trial consisted of a map with two markers labeled A and B (see Figure 3). The markers were horizontally displaced by 350 pixels (7° of visual angle). Participants were given the following prompt: “Imagine a line from A to B. Select the elevation profile below that most closely matches its slope.” They then selected a choice among a set of 6 patterns, including the actual elevation profile and 5 distractors. Distractors were generated from the same map, so as to reflect similar spatial frequency characteristics, and had to be 65–70% similar to the actual profile (as measured by dynamic time warping [20]). Additionally, profiles and distractors were selected to not have peaks or valleys at the endpoints. These criteria, determined after a pilot, ensure similar task difficulty across the trials.
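For readers unfamiliar with dynamic time warping, the sketch below shows the classic dynamic-programming distance between two elevation profiles. It is a minimal illustration, not the indexing method of [20] nor the authors' code, and it does not attempt to reproduce the 65–70% similarity score, whose exact definition is not given here. The profile values are made up.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Made-up elevation profiles for illustration.
profile = [0.0, 0.2, 0.6, 0.9, 0.5, 0.1]
shifted = [0.0, 0.0, 0.2, 0.6, 0.9, 0.5, 0.1]   # same shape, delayed by one sample
different = [0.9, 0.5, 0.1, 0.0, 0.2, 0.6]

print(dtw_distance(profile, shifted))                                      # 0.0
print(dtw_distance(profile, different) > dtw_distance(profile, shifted))  # True
```

Because warping absorbs the one-sample delay, the shifted profile is at distance zero, which is why the measure suits profiles that share a shape but differ in alignment.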

The order of stimuli was similar to the previous two experiments: the study consisted of 3 rounds, one with each of the 3 colormaps the participant was assigned to see, and encompassing the 5 spatial frequency levels. Thus, every participant saw 3×5 colormap and frequency combinations and completed 3 pattern matching trials with each combination, for a total of 45 trials. As in the previous experiments, the order of colormap presentation was fully counterbalanced.
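The trial count follows from 3 colormaps × 5 frequency levels × 3 repetitions. A minimal sketch of one participant's schedule is given below. The function name and the order-index scheme are hypothetical, the frequency levels are assumed from the axis ticks (3 to 11) of Figures 9 and 10, and the fixed within-round ordering is a simplification.

```python
from itertools import permutations, product

FREQUENCY_LEVELS = [3, 5, 7, 9, 11]  # assumed from the figures' axis ticks
TRIALS_PER_COMBINATION = 3


def build_schedule(colormaps, order_index):
    """List (colormap, frequency, repetition) trials for one participant.

    colormaps: the 3 colormaps assigned to the participant.
    order_index: which of the 3! = 6 round orders to use; cycling it across
    participants is one way to counterbalance presentation order.
    """
    orders = list(permutations(colormaps))
    round_order = orders[order_index % len(orders)]
    schedule = []
    for cmap in round_order:  # one round per colormap
        for freq, rep in product(FREQUENCY_LEVELS, range(TRIALS_PER_COMBINATION)):
            schedule.append((cmap, freq, rep))
    return schedule


trials = build_schedule(["greyscale", "coolwarm", "rainbow"], order_index=4)
print(len(trials))  # 45
```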

Results
We fit the results to a logistic regression model comprising two fixed effects (colormap, frequency) and two random effects to account for individual differences and intra-trial variations. Figure 9 shows the odds of successful profile matching. The experimental results are illustrated separately in Figure 10. A likelihood ratio test indicates the model is significant (χ²(17) = 394.67, p < 0.001). We find a significant interaction between colormap and spatial frequency (χ²(8) = 18.131, p < 0.05); the relative effectiveness of the colormaps appears to vary with spatial frequency.

a. Main effects
Coef.         Est.   |z|     p
(Intercept)   9.03   6.502   < 0.001
singlehue     1.34   0.649
bodyheat      0.80   0.503
cubehelix     0.77   0.681
extbodyheat   0.81   0.466
coolwarm      0.44   1.868   < 0.1
rainbow       0.98   0.064
spectral      0.57   1.284
blueyellow    0.73   0.697
Frequency     0.93   2.058   < 0.05

b. Interaction effects (colormap × frequency)
Coef.         Est.   |z|     p
singlehue     0.98   0.450
bodyheat      1.02   0.456
cubehelix     1.09   1.781   < 0.1
extbodyheat   1.04   0.797
coolwarm      1.17   3.143   < 0.01
rainbow       1.01   0.259
spectral      1.11   2.198   < 0.05
blueyellow    1.09   1.683   < 0.1

Table 5. Main effects of colormap and spatial frequency on success odds in experiment 3 (a), and their interaction (b). Coefficients depict exponentiated model estimates to reflect odds ratios. The intercept corresponds to greyscale. The p column gives the significance level reached (p < 0.001, p < 0.01, p < 0.05, or p < 0.1); blank entries did not reach p < 0.1.
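One plausible way to write the model described in the Results is given below. It is an assumption rather than the authors' exact specification: greyscale is the reference level, $f$ is the spatial frequency step, $c$ is the colormap shown, and random intercepts $u_i$ (participant) and $v_t$ (trial) stand in for the two random effects.

```latex
\operatorname{logit}\Pr(\text{correct}_{i,t})
  = \beta_0 + \beta_{c} + \beta_{F}\, f + \beta_{c \times F}\, f + u_i + v_t,
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2), \quad v_t \sim \mathcal{N}(0, \sigma_v^2)
```

Under this form, the odds ratios reported in Table 5 are the exponentiated coefficients, for example $e^{\beta_F} = 0.93$ for the main effect of frequency.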

Overall, we find a significant detrimental main effect of spatial frequency on pattern perception, as indicated by a 0.93 Frequency coefficient (Table 5a). This translates to a 7% drop in the odds of correctly matching the profile for every step-increase in spatial frequency. The main effect coefficients for all colormaps were not significant, indicating that the use of color at low spatial frequency is unlikely to improve pattern perception as compared to a plain greyscale ramp.

Colormap performance begins to diverge at high spatial frequency. Only two colormaps have significant and large enough odds-ratio coefficients (i.e., > 1.07) to overcome the frequency-induced perceptual difficulty: spectral and coolwarm increased the odds of correct pattern matching by 11% and 17%, respectively, for every step-increase in spatial frequency (after adjusting for frequency effects alone). Additionally, blueyellow and cubehelix were associated with a 9% improvement, but the advantage was not reliable (p < 0.1). On the other hand, extbodyheat, bodyheat, and rainbow had small (and non-significant) odds-ratio coefficients (0.98–1.04 < 1.07), indicating that, similar to greyscale, they are associated with lower success odds in complex maps.
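The 1.07 threshold is the reciprocal of the 0.93 frequency coefficient: an interaction odds ratio must exceed 1/0.93 for the net per-step change in odds to stay above 1. The sketch below reproduces that arithmetic from the Table 5 values. The decimal placement follows the coefficients quoted in the text, and the p-values are computed from |z| under a standard normal approximation, which is an assumption about how significance was assessed.

```python
import math

FREQUENCY_OR = 0.93  # main effect of frequency (Table 5a), greyscale baseline

# Colormap x frequency interaction odds ratios and |z| values (Table 5b).
INTERACTION = {
    "singlehue":   (0.98, 0.450),
    "bodyheat":    (1.02, 0.456),
    "cubehelix":   (1.09, 1.781),
    "extbodyheat": (1.04, 0.797),
    "coolwarm":    (1.17, 3.143),
    "rainbow":     (1.01, 0.259),
    "spectral":    (1.11, 2.198),
    "blueyellow":  (1.09, 1.683),
}


def two_sided_p(z):
    """Two-sided p-value for a Wald z statistic under a standard normal."""
    return math.erfc(abs(z) / math.sqrt(2))


# An interaction coefficient must exceed 1 / 0.93 to cancel the frequency penalty.
break_even = 1 / FREQUENCY_OR
print(f"break-even odds ratio: {break_even:.3f}")  # 1.075

for name, (odds_ratio, z) in INTERACTION.items():
    net = FREQUENCY_OR * odds_ratio  # net change in odds per frequency step
    print(f"{name:12s} net={net:.3f} p={two_sided_p(z):.3f}")
```

Only coolwarm and spectral both clear the break-even value and reach p < 0.05; blueyellow and cubehelix clear it numerically but only at p < 0.1.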

In short, only two of the tested colormaps (coolwarm and spectral) appear to reliably support pattern perception at high spatial frequency. Both consist of a diverging ramp with uniformly-stepped luminance. All other colormaps (including greyscale) suffered as data complexity increased.

DISCUSSION AND GUIDELINES
Our work sheds new light on how spatial complexity impacts the perception of continuous color-coded maps. The experiments also led to some surprising findings that are at odds with current guidelines. We interpret these results and accordingly devise new task- and frequency-aware color mapping guidelines (indicated by F). We also rank the tested colormaps and summarize our guidelines in Figure 11.

[Figure 11: colormaps ranked from better to worse for each task; some rankings are annotated "no significant difference". Panels: Quantity estimation (ranking unaffected by spatial frequency); Gradient perception and Pattern perception, each at low spatial frequency (0.38 cycle/deg) and high spatial frequency (1.38 cycle/deg). Color map guidelines: (1) Maximize range of saturated hues regardless of spatial frequency. (2) At high spatial frequency: fully-saturated hues or diverging ramps with chroma variation. (3) At high spatial frequency: diverging ramps with uniformly-stepped luminance.]

Figure 11. Model-derived colormap ranking and guidelines by task and spatial frequency (significance markers denote p < 0.05 and p < 0.1, relative to greyscale).


REFERENCES1 Robert Amar James Eagan and John Stasko 2005

Low-level components of analytic activity in informationvisualization In Information Visualization 2005INFOVIS 2005 IEEE Symposium on IEEE 111ndash117

2 Lawrence D Bergman Bernice E Rogowitz and Lloyd ATreinish 1995 A rule-based tool for assisting colormapselection In Proceedings of the 6th conference onVisualizationrsquo95 IEEE Computer Society 118

3 Michelle Borkin Krzysztof Gajos Amanda PetersDimitrios Mitsouras Simone Melchionna Frank RybickiCharles Feldman and Hanspeter Pfister 2011 Evaluationof artery visualizations for heart disease diagnosis IEEETransactions on Visualization and Computer Graphics 1712 (2011) 2479ndash2488

4 David Borland and Russell M Taylor Ii 2007 Rainbowcolor map (still) considered harmful IEEE ComputerGraphics and Applications 27 2 (2007)

5 Cynthia A Brewer 1994 Visualization in ModernCartography Elsevier Science Chapter Color useguidelines for mapping and visualization

6 Cynthia A Brewer 1996 Guidelines for selecting colorsfor diverging schemes on maps The CartographicJournal 33 2 (1996) 79ndash86

7 Cynthia A Brewer Alan M MacEachren Linda W Pickleand Douglas Herrmann 1997 Mapping mortalityEvaluating color schemes for choropleth maps Annals ofthe Association of American Geographers 87 3 (1997)411ndash438

8 Roxana Bujack Terece L Turton Francesca SamselColin Ware David H Rogers and James Ahrens 2017The Good the Bad and the Ugly A TheoreticalFramework for the Assessment of Continuous ColormapsIEEE Transactions on Visualization and ComputerGraphics (2017)

9 William S Cleveland and William S Cleveland 1983 Acolor-caused optical illusion on a statistical graph TheAmerican Statistician 37 2 (1983) 101ndash105

10 William S Cleveland and Robert McGill 1984 Graphicalperception Theory experimentation and application tothe development of graphical methods J Amer StatistAssoc 79 387 (1984) 531ndash554

11 Russel De Valois and Karen De Valouis 1990 SpatialVision Oxford University Press

12 Russell L De Valois Duane G Albrecht and Lisa GThorell 1978 Cortical cells bar and edge detectors orspatial frequency filters In Frontiers in visual scienceSpringer 544ndash556

13 Connor C Gramazio David H Laidlaw and Karen BSchloss 2017 Colorgorical Creating discriminable andpreferable color palettes for information visualizationIEEE Transactions on Visualization and ComputerGraphics 23 1 (2017) 521ndash530

14 D A Green 2011 A colour scheme for the display ofastronomical intensity images Bulletin of theAstronomical Society of India 39 (June 2011) 289ndash295

15 Mark Harrower and Cynthia A Brewer 2003ColorBrewer org an online tool for selecting colourschemes for maps The Cartographic Journal 40 1(2003) 27ndash37

16 Jeffrey Heer and Michael Bostock 2010 Crowdsourcinggraphical perception using mechanical turk to assessvisualization design In Proceedings of the SIGCHIConference on Human Factors in Computing SystemsACM 203ndash212

17 GT Herman and H Levkowitz 1992 Color scales forimage data IEEE Computer Graphics and Applications12 1 (1992) 72ndash80

18 Michael D Hyslop 2006 A comparison of spectral colorand greyscale continuous-tone map perception Masterrsquosthesis Michigan State University

19 Alan David Kalvin Bernice E Rogowitz Adar Pelah andAron Cohen 2000 Building perceptual color maps forvisualizing interval data In Human Vision and ElectronicImaging V Vol 3959 International Society for Opticsand Photonics 323ndash336

20 Eamonn Keogh and Chotirat Ann Ratanamahatana 2005Exact indexing of dynamic time warping Knowledge andInformation Systems 7 3 (2005) 358ndash386

21 Gordon Kindlmann Erik Reinhard and Sarah Creem2002 Face-based luminance matching for perceptualcolormap generation In Proceedings of the IEEEConference on Visualization rsquo02 IEEE Computer Society299ndash306

22 Robert Kosara 2016 An Empire Built On SandReexamining What We Think We Know AboutVisualization In Proceedings of the Beyond Time andErrors on Novel Evaluation Methods for VisualizationACM 162ndash168

23 Mark P Kumler and Richard E Groop 1990Continuous-tone mapping of smooth surfacesCartography and Geographic Information Systems 17 4(1990) 279ndash289

24 Kenneth Moreland 2009 Diverging color maps forscientific visualization In International Symposium onVisual Computing Springer 92ndash103

25 Kenneth Moreland 2016 Why We Use Bad Color Mapsand What You Can Do About It Electronic Imaging2016 16 (2016) 1ndash6

26 Lace Padilla P Samuel Quinan Miriah Meyer andSarah H Creem-Regehr 2017 Evaluating the Impact ofBinning 2D Scalar Fields IEEE Transactions onVisualization and Computer Graphics 23 1 (2017)431ndash440

27 Stephen E Palmer 1999 Vision science Photons tophenomenology MIT press

28 Ken Perlin 1985 An image synthesizer ACMSIGGRAPH Computer Graphics 19 3 (1985) 287ndash296

29 Linda Williams Pickle 2003 Usability testing of mapdesigns In Proceedings of Symposium on the Interface ofComputing Science and Statistics 42ndash56

30 P Samuel Quinan and Miriah Meyer 2016 Visuallycomparing weather features in forecasts IEEETransactions on Visualization and Computer Graphics 221 (2016) 389ndash398

31 Penny L Rheingans 2000 Task-based color scale designIn 28th AIPR Workshop 3D Visualization for DataExploration and Decision Making International Societyfor Optics and Photonics 35ndash43

32 Bernice E Rogowitz and Lloyd A Treinish 1994 Usingperceptual rules in interactive visualization In ISTSPIE1994 International Symposium on Electronic ImagingScience and Technology International Society for Opticsand Photonics 287ndash295

33 Bernice E Rogowitz Lloyd A Treinish Steve Brysonand others 1996 How not to lie with visualizationComputers in Physics 10 3 (1996) 268ndash273

34 Samuel Silva Beatriz Sousa Santos and JoaquimMadeira 2011 Using color in visualization A surveyComputers Graphics 35 2 (2011) 320ndash333

35 Maureen Stone Danielle Albers Szafir and Vidya Setlur2014 An engineering model for color difference as afunction of size In Color and Imaging Conference Vol2014 Society for Imaging Science and Technology253ndash258

36 Danielle Albers Szafir 2017 Modeling Color Differencefor Visualization Design IEEE Transactions onVisualization and Computer Graphics (2017)

37 Christian Tominski Georg Fuchs and HeidrunSchumann 2008 Task-driven color coding InInformation Visualisation 2008 IVrsquo08 12thInternational Conference IEEE 373ndash380

38 J Gregory Trafton Susan S Kirschenbaum Ted L TsuiRobert T Miyamoto James A Ballas and Paula DRaymond 2000 Turning pictures into numbersextracting and generating information from complexvisualizations International Journal of Human-ComputerStudies 53 5 (2000) 827ndash850

39 Bruce E Trumbo 1981 A theory for coloring bivariatestatistical maps The American Statistician 35 4 (1981)220ndash226

40 W3C 2016 CSS Values and Units Module Level 3(2016)httpwwww3orgTRcss3-valuesabsolute-lengths

41 Colin Ware 1988 Color sequences for univariate mapsTheory experiments and principles IEEE ComputerGraphics and Applications 8 5 (1988) 41ndash49

42 Colin Ware 2012 Information visualization perceptionfor design Elsevier

43 Liang Zhou and Charles D Hansen 2016 A survey ofcolormaps in visualization IEEE Transactions onVisualization and Computer Graphics 22 8 (2016)2051ndash2069


[Figure 11: chart ranking the tested colormaps from better to worse for each task (quantity estimation, gradient perception, pattern perception) at low (0.38 cycle/deg) and high (1.38 cycle/deg) spatial frequency. The quantity estimation ranking is unaffected by spatial frequency; several gradient and pattern perception panels are marked “no significant difference”. Color map guidelines listed in the figure: (1) Maximize range of saturated hues, regardless of spatial frequency. (2) At high spatial frequency: fully-saturated hues or diverging ramps with chroma variation. (3) At high spatial frequency: diverging ramps with uniformly-stepped luminance.]

Figure 11. Model-derived colormap ranking and guidelines by task and spatial frequency (markers denote p < 0.05 (∗) and p < 0.1, relative to greyscale).

devise new task- and frequency-aware color mapping guidelines (indicated by ★). We also rank the tested colormaps and summarize our guidelines in Figure 11.

Quantity Estimation
Our first hypothesis (H1) predicts hue- and saturation-varying ramps to be more accurate at low spatial frequencies, and ramps with monotonically increasing luminance to be more accurate at high frequencies. As discussed, H1 is based on the relative contrast sensitivity of our visual system [32]. The quantity estimation task (experiment 1) shows no interaction between colormap and spatial frequency. While increased spatial complexity is associated with higher estimation error, the effect is similar across all colormaps. We thus reject H1.

On the other hand, the results provide support for H2, which predicts that hue-varying ramps will lead to more accurate estimation. Indeed, the top-performing colormaps (rainbow and spectral) contain substantial hue variation. Results from experiment 1 thus replicate earlier findings by Ware [41], but also extend them to show that spatial frequency has no apparent impact on the effectiveness of hue-varying ramps. Our data show that rainbow and spectral are the most accurate among the colormaps tested, even at the highest levels of spatial frequency. Altogether, these results lend further support to the theory that lookup errors in color-coded maps are largely caused by systematic simultaneous contrast shifts [41], rather than by contrast sensitivity modulation [32]. These shifts are best counteracted with colormaps that vary non-monotonically along one or more perceptual channels.

A corollary result is that mixing monotonic luminance with hue variation would lead to significant accuracy loss. Indeed, data from experiment 1 indicate that rainbow is approximately an order of magnitude more accurate than extbodyheat and cubehelix. These spiral colormaps are designed to be more accurate rainbow alternatives for interval data [4, 24]. On the contrary, we find that they reduce accuracy compared to a purely hue-varying ramp. This finding suggests that, when estimating a continuously coded spatial quantity, people benefit most from a large dynamic hue range. Incorporating monotonically increasing lightness within the colormap would necessarily reduce the hue range, thereby diminishing accuracy.

★ Guideline 1: We recommend maximizing hue variation to improve quantity estimation, irrespective of spatial frequency.
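The distinction between ramps with monotonic and non-monotonic luminance, which the discussion above relies on, can be checked numerically. The following is a minimal sketch, not the procedure used in the study: the two ramps are hypothetical control points chosen for illustration (they are not the colormaps tested), and luminance is approximated by the relative luminance of sRGB colors.

```python
# Illustrative sketch: the ramps below are hypothetical control points,
# not the colormaps evaluated in the experiments.

def srgb_to_linear(c):
    """Undo the sRGB transfer curve for one channel value in [0, 1]."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def relative_luminance(rgb):
    """Relative luminance Y of an sRGB color (Rec. 709 weights)."""
    r, g, b = (srgb_to_linear(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def is_monotonic(values):
    """True if the sequence never decreases or never increases."""
    pairs = list(zip(values, values[1:]))
    return all(a <= b for a, b in pairs) or all(a >= b for a, b in pairs)

# Hypothetical ramps sampled at a few control points (sRGB in [0, 1]).
greyscale_like = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.5), (1.0, 1.0, 1.0)]
rainbow_like = [(0.0, 0.0, 1.0), (0.0, 1.0, 1.0), (0.0, 1.0, 0.0),
                (1.0, 1.0, 0.0), (1.0, 0.0, 0.0)]

for name, ramp in [("greyscale_like", greyscale_like),
                   ("rainbow_like", rainbow_like)]:
    lum = [relative_luminance(c) for c in ramp]
    print(name, [round(y, 3) for y in lum], "monotonic:", is_monotonic(lum))
```

In this sketch the fully saturated hue rotation rises and falls in luminance along the ramp, whereas the grey ramp increases steadily. A perceptual lightness measure such as CIELAB L* could be substituted for relative luminance without changing the monotonicity test.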

Gradient Perception
Gradient perception allows people to distinguish how quickly the encoded attribute changes between adjacent locations, an essential skill when evaluating the distribution and variance of spatial data. We find that the task is strongly modulated by the data’s spatial complexity: increased spatial frequency appears to enhance the perception of gradients. This is unsurprising, as maps with jagged surfaces are likely to exhibit more pronounced (and thus more perceptible) differences in slope. Colormap effectiveness was also impacted by spatial frequency: color encoding did not help participants distinguish gradients at low frequency levels, as all colormaps showed similar performance to greyscale. However, three colormaps demonstrated a significant advantage at high frequencies: coolwarm, rainbow, and blueyellow improved perception odds by 12–14% for every step increase in spatial frequency. All three employed one of two design strategies: a diverging ramp with varying saturation, or a fully saturated hue rotation.

The above results contradict H1, which predicts hue- and saturation-varying colormaps to perform better at low frequencies. In fact, we see the opposite. The results also do not support H3, which predicts better performance for monotonically-luminant ramps in structure perception tasks. In fact, all three top-performing ramps exhibit non-monotonic luminance.

★ Guideline 2: For tasks requiring gradient perception at high spatial frequency, we recommend a range of fully saturated hues (e.g., rainbow) or diverging, chroma-varying ramps (e.g., coolwarm or blueyellow).
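The per-step improvement in odds reported for gradient perception can be unpacked with a little arithmetic. The sketch below assumes the figure is a multiplicative change in odds per frequency step, as in a logistic-regression-style model, which is an interpretation rather than something stated here. The baseline probability of 0.5 and the function names are hypothetical.

```python
# Illustrative arithmetic: how a per-step odds ratio compounds.
# Assumption: the effect is multiplicative in odds for each step increase.

def to_odds(p):
    """Convert a probability in (0, 1) to odds."""
    return p / (1.0 - p)

def to_probability(odds):
    """Convert odds back to a probability."""
    return odds / (1.0 + odds)

def shifted_probability(p_base, odds_ratio_per_step, steps):
    """Probability after applying the per-step odds ratio `steps` times."""
    return to_probability(to_odds(p_base) * odds_ratio_per_step ** steps)

P_BASE = 0.5  # hypothetical baseline probability of a correct judgment

for ratio in (1.12, 1.14):  # a 12% and a 14% improvement in odds per step
    for steps in (1, 2, 3):
        p = shifted_probability(P_BASE, ratio, steps)
        print(f"odds ratio {ratio} per step, {steps} step(s): p = {p:.3f}")
```

Because the effect compounds in odds rather than in probability, the gain in the probability of a correct response depends on the baseline and shrinks as the baseline approaches 1.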

Pattern Perception
Experiment 3 prompted participants to match the elevation profile along a horizontal path with an external pattern. We expected colormaps with monotonically increasing luminance to be more accurate at this task (H3), but results were not entirely consistent with this prediction. While all tested colormaps had comparable performance at low spatial frequency, only two colormaps, coolwarm and spectral, gave participants higher odds of successfully matching the pattern at high frequency. Both colormaps comprise a diverging ramp with uniformly-stepped (though not strictly monotonic) luminance. By contrast, sequential and spiral ramps performed just as poorly as greyscale in complex maps, and so did rainbow.

The above results are consistent with Moreland’s argument that diverging ramps provide “maximal perceptual resolution” (through increasing and decreasing luminance intervals) [24], potentially enabling high-frequency patterns to be resolved more easily. Our results may also explain why diverging schemes performed better in medical diagnosis [3]: we suspect such tasks require the analysis of potentially high-frequency features (e.g., small tissue aberrations).

★ Guideline 3: We recommend diverging ramps with equidistant luminance steps (e.g., coolwarm and spectral) to support the perception of longitudinal patterns at high spatial frequency. Rainbow, sequential, and spiral schemes should be avoided in complex maps, especially if the task involves the analysis and matching of fine-grained features.

Yet Another Look at the Rainbow
Results of experiments 1 and 2 may shed light on why rainbow remains a popular choice among scientists [25], despite being considered a bad choice by the visualization community [4, 30]. Our data reveal that, counterintuitively, rainbow is robust for estimating a smoothly varying quantitative attribute, regardless of spatial complexity. Moreover, rainbow provides good support for gradient estimation at high spatial frequency. These two tasks correspond to elementary visual analytic primitives, including characterizing distributions, determining ranges, and filtering [1]. Moreover, studies show that when experts attempt to form a mental model about a visualization, they first go through a time-consuming process of extracting quantitative data “at a rather detailed level” [38]. For instance, a weather forecaster will look up pressure and wind changes, estimating current readings at landmark locations in the map before making a forecast. Our data suggest that rainbow provides good support for these tasks, making it a potentially reasonable choice for weather forecasters.

Critique of rainbow centers on its tendency to create sharp visual boundaries, particularly around its yellow regions [4]. Experts also criticize the use of fully saturated hues [24], which result in non-uniform perceptual steps within the color ramp. The common intuition is that these two factors combined will inevitably distort the perception of quantities. We do not see evidence to support this hypothesis. In fact, to the contrary, attempts to ‘linearize’ the rainbow by monotonically increasing the luminance of hues could reduce estimation accuracy by up to an order of magnitude.

★ Guideline 4: Rather than entirely discouraging the use of rainbow, we suggest that it can be a reasonable design choice for conveying spatial distributions and variances, and in tasks that require quantitative as opposed to geometric precision. However, rainbow has a number of limitations. The use of green and red hues is problematic for people with color deficiency. Moreover, rainbow is probably ineffective at revealing high-frequency patterns. Interestingly, these shortcomings are balanced by diverging ramps (e.g., coolwarm), which, although quantitatively inaccurate, appear to support pattern perception at high spatial frequency. We thus argue that hue-varying and diverging colormaps support orthogonal tasks in continuous maps, and should therefore be considered as complementary rather than mutually exclusive choices.
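For readers who want to apply the guidelines programmatically, Guidelines 1 to 3 can be written as a small lookup table. This is only one possible encoding: the dictionary layout, key names, and function name are hypothetical. The colormap names are those cited as examples in the guidelines and discussion above, and an empty result stands for conditions where no colormap showed an advantage over greyscale.

```python
# Hypothetical encoding of Guidelines 1-3 as a lookup table.
# Keys are (task, spatial frequency level); values are the example
# colormaps named in the corresponding guideline or discussion.
GUIDELINES = {
    ("quantity", "low"): ["rainbow", "spectral"],    # Guideline 1
    ("quantity", "high"): ["rainbow", "spectral"],   # Guideline 1
    ("gradient", "high"): ["rainbow", "coolwarm", "blueyellow"],  # Guideline 2
    ("pattern", "high"): ["coolwarm", "spectral"],   # Guideline 3
}

def recommend(task, frequency):
    """Return example colormaps favored for a task and frequency level.

    An empty list means no tested colormap stood out from greyscale
    for that combination (gradient or pattern tasks at low frequency).
    """
    return GUIDELINES.get((task, frequency), [])

for task in ("quantity", "gradient", "pattern"):
    for frequency in ("low", "high"):
        print(task, frequency, recommend(task, frequency))
```

The table makes the complementarity argued in Guideline 4 visible: rainbow appears for quantity and gradient tasks but not for pattern perception, where diverging ramps are listed instead.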

LIMITATIONS AND FUTURE WORK
There are some limitations to our work that should be considered. First, as with other crowdsourced graphical perception studies, we gain access to a larger pool of participants but sacrifice some experimental control [16]. Particularly relevant to our study are the variations in participants’ monitors, including color calibration and display resolution, as well as the illumination conditions in their homes or offices, all of which can impact color perception. We could not control these factors, but attempted to counteract their variation by involving a larger sample (N=381). Although we expect crowdsourcing to improve the ecological validity of results and guidelines, uncontrolled variations can potentially reduce our ability to detect small but otherwise significant differences in performance between tested conditions. Future lab studies should therefore attempt to replicate our findings with added controls.

Second, our study employed a limited set of tasks designed to measure elementary perceptual operators, including quantity estimation, gradient perception, and pattern matching. There is an opportunity to test higher-level tasks that mimic scientific analyses more closely, including the identification and comparison of larger map features (e.g., fronts, ridges). Additionally, some of the tasks we tested could be re-evaluated in more authentic formulations. For instance, a metric task could require participants to estimate the quantity at a specific location on the map. This formulation is arguably more realistic than the task we tested, which simply asked participants to click any location thought to match a specified quantity.

Third, our analysis was focused exclusively on spatial frequency, and there are good reasons to consider this factor [11, 32]. However, there are also additional data characteristics to consider, including, for instance, the distribution of amplitudes within the map. Such factors will influence the distribution of colors in the image and may thus impact perception.

Lastly, we limited our study to synthetically generated scalar fields to precisely vary spatial frequency while controlling for other confounds. However, synthetic stimuli may also introduce (unknown) perceptual or cognitive biases. Therefore, additional studies are needed to replicate our findings with datasets from real-world domains (e.g., meteorology, geophysics, or oceanography) and with domain experts. We also restricted this study to participants with normal color vision. Therefore, our results may not generalize to the approximately 5% of the population who have some form of color deficiency.

CONCLUSIONS
We conducted three experiments to investigate the effects of spatial frequency and colormap characteristics on the perception of continuous pseudocolor maps. Our results indicate that spatial frequency impacts judgment of the encoded quantities and structures. While viewers’ quantity estimation accuracy exhibited a predictable response, increased data complexity had a more nuanced effect on gradient and pattern comprehension, the impact of which was dependent on the colormap used. Designers should therefore consider both the type of task and the spatial complexity of the underlying data. We re-examined current guidelines and devised new recommendations for color-coding of continuous spatial data.

REFERENCES
1. Robert Amar, James Eagan, and John Stasko. 2005. Low-level components of analytic activity in information visualization. In Information Visualization, 2005. INFOVIS 2005. IEEE Symposium on. IEEE, 111–117.

2. Lawrence D. Bergman, Bernice E. Rogowitz, and Lloyd A. Treinish. 1995. A rule-based tool for assisting colormap selection. In Proceedings of the 6th conference on Visualization ’95. IEEE Computer Society, 118.

3. Michelle Borkin, Krzysztof Gajos, Amanda Peters, Dimitrios Mitsouras, Simone Melchionna, Frank Rybicki, Charles Feldman, and Hanspeter Pfister. 2011. Evaluation of artery visualizations for heart disease diagnosis. IEEE Transactions on Visualization and Computer Graphics 17, 12 (2011), 2479–2488.

4. David Borland and Russell M. Taylor II. 2007. Rainbow color map (still) considered harmful. IEEE Computer Graphics and Applications 27, 2 (2007).

5. Cynthia A. Brewer. 1994. Visualization in Modern Cartography. Elsevier Science, Chapter Color use guidelines for mapping and visualization.

6. Cynthia A. Brewer. 1996. Guidelines for selecting colors for diverging schemes on maps. The Cartographic Journal 33, 2 (1996), 79–86.

7. Cynthia A. Brewer, Alan M. MacEachren, Linda W. Pickle, and Douglas Herrmann. 1997. Mapping mortality: Evaluating color schemes for choropleth maps. Annals of the Association of American Geographers 87, 3 (1997), 411–438.

8. Roxana Bujack, Terece L. Turton, Francesca Samsel, Colin Ware, David H. Rogers, and James Ahrens. 2017. The Good, the Bad, and the Ugly: A Theoretical Framework for the Assessment of Continuous Colormaps. IEEE Transactions on Visualization and Computer Graphics (2017).

9. William S. Cleveland and Robert McGill. 1983. A color-caused optical illusion on a statistical graph. The American Statistician 37, 2 (1983), 101–105.

10. William S. Cleveland and Robert McGill. 1984. Graphical perception: Theory, experimentation, and application to the development of graphical methods. J. Amer. Statist. Assoc. 79, 387 (1984), 531–554.

11. Russell De Valois and Karen De Valois. 1990. Spatial Vision. Oxford University Press.

12. Russell L. De Valois, Duane G. Albrecht, and Lisa G. Thorell. 1978. Cortical cells: bar and edge detectors, or spatial frequency filters? In Frontiers in visual science. Springer, 544–556.

13. Connor C. Gramazio, David H. Laidlaw, and Karen B. Schloss. 2017. Colorgorical: Creating discriminable and preferable color palettes for information visualization. IEEE Transactions on Visualization and Computer Graphics 23, 1 (2017), 521–530.

14. D. A. Green. 2011. A colour scheme for the display of astronomical intensity images. Bulletin of the Astronomical Society of India 39 (June 2011), 289–295.

15. Mark Harrower and Cynthia A. Brewer. 2003. ColorBrewer.org: an online tool for selecting colour schemes for maps. The Cartographic Journal 40, 1 (2003), 27–37.

16. Jeffrey Heer and Michael Bostock. 2010. Crowdsourcing graphical perception: using Mechanical Turk to assess visualization design. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 203–212.

17. G. T. Herman and H. Levkowitz. 1992. Color scales for image data. IEEE Computer Graphics and Applications 12, 1 (1992), 72–80.

18. Michael D. Hyslop. 2006. A comparison of spectral color and greyscale continuous-tone map perception. Master’s thesis. Michigan State University.

19. Alan David Kalvin, Bernice E. Rogowitz, Adar Pelah, and Aron Cohen. 2000. Building perceptual color maps for visualizing interval data. In Human Vision and Electronic Imaging V, Vol. 3959. International Society for Optics and Photonics, 323–336.

20. Eamonn Keogh and Chotirat Ann Ratanamahatana. 2005. Exact indexing of dynamic time warping. Knowledge and Information Systems 7, 3 (2005), 358–386.

21. Gordon Kindlmann, Erik Reinhard, and Sarah Creem. 2002. Face-based luminance matching for perceptual colormap generation. In Proceedings of the IEEE Conference on Visualization ’02. IEEE Computer Society, 299–306.

22. Robert Kosara. 2016. An Empire Built On Sand: Reexamining What We Think We Know About Visualization. In Proceedings of the Beyond Time and Errors on Novel Evaluation Methods for Visualization. ACM, 162–168.

23. Mark P. Kumler and Richard E. Groop. 1990. Continuous-tone mapping of smooth surfaces. Cartography and Geographic Information Systems 17, 4 (1990), 279–289.

24. Kenneth Moreland. 2009. Diverging color maps for scientific visualization. In International Symposium on Visual Computing. Springer, 92–103.

25. Kenneth Moreland. 2016. Why We Use Bad Color Maps and What You Can Do About It. Electronic Imaging 2016, 16 (2016), 1–6.

26. Lace Padilla, P. Samuel Quinan, Miriah Meyer, and Sarah H. Creem-Regehr. 2017. Evaluating the Impact of Binning 2D Scalar Fields. IEEE Transactions on Visualization and Computer Graphics 23, 1 (2017), 431–440.

27. Stephen E. Palmer. 1999. Vision science: Photons to phenomenology. MIT Press.

28. Ken Perlin. 1985. An image synthesizer. ACM SIGGRAPH Computer Graphics 19, 3 (1985), 287–296.

29. Linda Williams Pickle. 2003. Usability testing of map designs. In Proceedings of Symposium on the Interface of Computing Science and Statistics. 42–56.

30. P. Samuel Quinan and Miriah Meyer. 2016. Visually comparing weather features in forecasts. IEEE Transactions on Visualization and Computer Graphics 22, 1 (2016), 389–398.

31. Penny L. Rheingans. 2000. Task-based color scale design. In 28th AIPR Workshop: 3D Visualization for Data Exploration and Decision Making. International Society for Optics and Photonics, 35–43.

32. Bernice E. Rogowitz and Lloyd A. Treinish. 1994. Using perceptual rules in interactive visualization. In IS&T/SPIE 1994 International Symposium on Electronic Imaging: Science and Technology. International Society for Optics and Photonics, 287–295.

33. Bernice E. Rogowitz, Lloyd A. Treinish, Steve Bryson, and others. 1996. How not to lie with visualization. Computers in Physics 10, 3 (1996), 268–273.

34. Samuel Silva, Beatriz Sousa Santos, and Joaquim Madeira. 2011. Using color in visualization: A survey. Computers & Graphics 35, 2 (2011), 320–333.

35. Maureen Stone, Danielle Albers Szafir, and Vidya Setlur. 2014. An engineering model for color difference as a function of size. In Color and Imaging Conference, Vol. 2014. Society for Imaging Science and Technology, 253–258.

36. Danielle Albers Szafir. 2017. Modeling Color Difference for Visualization Design. IEEE Transactions on Visualization and Computer Graphics (2017).

37. Christian Tominski, Georg Fuchs, and Heidrun Schumann. 2008. Task-driven color coding. In Information Visualisation, 2008. IV’08. 12th International Conference. IEEE, 373–380.

38. J. Gregory Trafton, Susan S. Kirschenbaum, Ted L. Tsui, Robert T. Miyamoto, James A. Ballas, and Paula D. Raymond. 2000. Turning pictures into numbers: extracting and generating information from complex visualizations. International Journal of Human-Computer Studies 53, 5 (2000), 827–850.

39. Bruce E. Trumbo. 1981. A theory for coloring bivariate statistical maps. The American Statistician 35, 4 (1981), 220–226.

40. W3C. 2016. CSS Values and Units Module Level 3. (2016). http://www.w3.org/TR/css3-values/#absolute-lengths

41. Colin Ware. 1988. Color sequences for univariate maps: Theory, experiments and principles. IEEE Computer Graphics and Applications 8, 5 (1988), 41–49.

42. Colin Ware. 2012. Information visualization: perception for design. Elsevier.

43. Liang Zhou and Charles D. Hansen. 2016. A survey of colormaps in visualization. IEEE Transactions on Visualization and Computer Graphics 22, 8 (2016), 2051–2069.


The above result are consistent with Morelandrsquos argumentthat diverging ramps provide ldquomaximal perceptual resolutionrdquo(through increasing and decreasing luminance intervals) [24]potentially enabling high-frequency patterns to be resolvedmore easily Our results may also explain why divergingschemes performed better in medical diagnosis [3] we suspectsuch tasks to require the analysis of potentially high-frequencyfeatures (eg small tissue aberrations)

Guideline 3: We recommend diverging ramps with equidistant luminance steps (e.g., coolwarm and spectral) to support the perception of longitudinal patterns at high spatial frequency. Rainbow, Sequential, and Spiral schemes should be avoided in complex maps, especially if the task involves the analysis and matching of fine-grained features.

Yet Another Look at the Rainbow

Results of experiments 1 and 2 may shed light on why rainbow remains a popular choice among scientists [25], despite being considered a bad choice by the visualization community [4, 30]. Our data reveal that, counterintuitively, rainbow is robust for estimating a smoothly varying quantitative attribute, regardless of spatial complexity. Moreover, rainbow provides good support for gradient estimation at high spatial frequency. These two tasks correspond to elementary visual analytic primitives, including characterizing distributions, determining ranges, and filtering [1]. In addition, studies show that when experts attempt to form a mental model about a visualization, they first go through a time-consuming process of extracting quantitative data “at a rather detailed level” [38]. For instance, a weather forecaster will look up pressure and wind changes, estimating current readings at landmark locations in the map, before making a forecast. Our data suggest that rainbow provides good support for these tasks, making it a potentially reasonable choice for weather forecasters.

Critique of rainbow centers on its tendency to create sharp visual boundaries, particularly around its yellow regions [4]. Experts also criticize the use of fully saturated hues [24], which result in non-uniform perceptual steps within the color ramp. The common intuition is that these two factors combined will inevitably distort the perception of quantities. We do not see evidence to support this hypothesis. On the contrary, attempts to ‘linearize’ the rainbow by monotonically increasing the luminance of hues could reduce estimation accuracy by up to an order of magnitude.
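The non-uniform luminance steps mentioned above can be inspected numerically. The following is a minimal sketch under stated assumptions, not the colormaps tested in the study: `hsv_rainbow` is a hypothetical fully saturated rainbow built by sweeping HSV hue from blue to red, and lightness is computed with the standard sRGB-to-CIE L* conversion. All function names are illustrative.

```python
import colorsys


def srgb_to_linear(c):
    """Undo the sRGB transfer curve for one channel in [0, 1]."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def lightness(rgb):
    """CIE L* (0 to 100) of an sRGB colour, relative to the sRGB white point."""
    r, g, b = (srgb_to_linear(c) for c in rgb)
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b  # relative luminance
    delta = 6 / 29
    f = y ** (1 / 3) if y > delta ** 3 else y / (3 * delta ** 2) + 4 / 29
    return 116 * f - 16


def hsv_rainbow(t):
    """Assumed rainbow: hue sweeps from blue (t=0) to red (t=1), full saturation."""
    return colorsys.hsv_to_rgb((1 - t) * 240 / 360, 1.0, 1.0)


def grey_ramp(t):
    """Reference ramp whose lightness increases steadily."""
    return (t, t, t)


def lightness_profile(ramp, n=9):
    """Sample L* at n evenly spaced positions along a ramp."""
    return [lightness(ramp(i / (n - 1))) for i in range(n)]


def is_monotonic(values):
    """True if the sequence never changes direction."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    return all(d >= 0 for d in diffs) or all(d <= 0 for d in diffs)


if __name__ == "__main__":
    for name, ramp in [("rainbow", hsv_rainbow), ("grey", grey_ramp)]:
        profile = lightness_profile(ramp)
        print(name, [round(v) for v in profile], "monotonic:", is_monotonic(profile))
```

Under these assumptions, lightness along the rainbow rises and falls and peaks at yellow, whereas the grey ramp is monotonic. The sketch only characterizes the ramp itself; it says nothing about estimation accuracy, which is what the experiments measured.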

Guideline 4: Rather than entirely discouraging the use of rainbow, we suggest that it can be a reasonable design choice for conveying spatial distributions and variances, and in tasks that require quantitative, as opposed to geometric, precision. However, rainbow has a number of limitations. The use of green and red hues is problematic for people with color deficiency. Moreover, rainbow is probably ineffective at revealing high-frequency patterns. Interestingly, these shortcomings are balanced by diverging ramps (e.g., coolwarm), which, although quantitatively inaccurate, appear to support pattern perception at high spatial frequency. We thus argue that hue-varying and diverging colormaps support orthogonal tasks in continuous maps, and should therefore be considered complementary, rather than mutually exclusive, choices.
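As an illustration only, Guidelines 3 and 4 can be read as a small decision rule. The sketch below is one possible encoding, not a tool from the study: the function name `suggest_colormap`, the task labels, and the `color_deficient_audience` flag are all assumptions introduced here, and cases that these two guidelines do not address return no suggestion.

```python
from typing import NamedTuple, Optional


class Suggestion(NamedTuple):
    family: Optional[str]  # colormap family, or None if no suggestion applies
    note: str              # which guideline (if any) the suggestion comes from


def suggest_colormap(task: str,
                     high_frequency: bool,
                     color_deficient_audience: bool = False) -> Suggestion:
    """Hypothetical encoding of Guidelines 3 and 4.

    task is one of "quantity", "gradient", or "pattern" (labels assumed here).
    """
    if task == "pattern" and high_frequency:
        # Guideline 3: diverging ramps with equidistant luminance steps.
        return Suggestion("diverging",
                          "Guideline 3: e.g., coolwarm or spectral; avoid "
                          "rainbow, sequential, and spiral schemes")
    if task == "quantity" or (task == "gradient" and high_frequency):
        if color_deficient_audience:
            # Guideline 4 caveat: green and red hues are problematic.
            return Suggestion(None,
                              "Guideline 4 caveat: rainbow's green and red "
                              "hues are problematic for color deficiency")
        # Guideline 4: rainbow is a reasonable choice for quantitative tasks.
        return Suggestion("rainbow",
                          "Guideline 4: reasonable when quantitative "
                          "precision matters")
    return Suggestion(None, "not covered by Guidelines 3 and 4")


if __name__ == "__main__":
    print(suggest_colormap("pattern", high_frequency=True))
    print(suggest_colormap("quantity", high_frequency=False))
```

Returning no suggestion for uncovered cases is deliberate: the guidelines describe complementary strengths rather than a total ordering of colormaps.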

LIMITATIONS AND FUTURE WORK

There are some limitations to our work that should be considered. First, as with other crowdsourced graphical perception studies, we gain access to a larger pool of participants but sacrifice some experimental control [16]. Particularly relevant to our study are the variations in participants’ monitors, including color calibration and display resolution, as well as the illumination conditions in their homes or offices, all of which can impact color perception. We could not control these factors, but attempted to counteract their variation by involving a larger sample (N=381). Although we expect crowdsourcing to improve the ecological validity of results and guidelines, uncontrolled variations can potentially reduce our ability to detect small but otherwise significant differences in performance between tested conditions. Future lab studies should therefore attempt to replicate our findings with added controls.

Second, our study employed a limited set of tasks designed to measure elementary perceptual operators, including quantity estimation, gradient perception, and pattern matching. There is an opportunity to test higher-level tasks that mimic scientific analyses more closely, including the identification and comparison of larger map features (e.g., fronts, ridges). Additionally, some of the tasks we tested could be re-evaluated in more authentic formulations. For instance, a metric task could require participants to estimate the quantity at a specific location on the map. This formulation is arguably more realistic than the task we tested, which simply asked participants to click any location thought to match a specified quantity.

Third, our analysis was focused exclusively on spatial frequency, and there are good reasons to consider this factor [11, 32]. However, there are also additional data characteristics to consider, including, for instance, the distribution of amplitudes within the map. Such factors will influence the distribution of colors in the image, and may thus impact perception.

Lastly, we limited our study to synthetically generated scalar fields to precisely vary spatial frequency while controlling for other confounds. However, synthetic stimuli may also introduce (unknown) perceptual or cognitive biases. Therefore, additional studies are needed to replicate our findings with datasets from real-world domains (e.g., meteorology, geophysics, or oceanography), and with domain experts. We also restricted this study to participants with normal color vision. Therefore, our results may not generalize to the approximately 5% of the population who have some form of color deficiency.

CONCLUSIONS

We conducted three experiments to investigate the effects of spatial frequency and colormap characteristics on the perception of continuous pseudocolor maps. Our results indicate that spatial frequency impacts judgment of the encoded quantities and structures. While viewers’ quantity estimation accuracy exhibited a predictable response, increased data complexity had a more nuanced effect on gradient and pattern comprehension, the impact of which was dependent on the colormap used. Designers should therefore consider both the type of task and the spatial complexity of the underlying data. We re-examined current guidelines and devised new recommendations for color-coding of continuous spatial data.

REFERENCES

1. Robert Amar, James Eagan, and John Stasko. 2005. Low-level components of analytic activity in information visualization. In Information Visualization, 2005. INFOVIS 2005. IEEE Symposium on. IEEE, 111–117.

2. Lawrence D. Bergman, Bernice E. Rogowitz, and Lloyd A. Treinish. 1995. A rule-based tool for assisting colormap selection. In Proceedings of the 6th conference on Visualization ’95. IEEE Computer Society, 118.

3. Michelle Borkin, Krzysztof Gajos, Amanda Peters, Dimitrios Mitsouras, Simone Melchionna, Frank Rybicki, Charles Feldman, and Hanspeter Pfister. 2011. Evaluation of artery visualizations for heart disease diagnosis. IEEE Transactions on Visualization and Computer Graphics 17, 12 (2011), 2479–2488.

4. David Borland and Russell M. Taylor II. 2007. Rainbow color map (still) considered harmful. IEEE Computer Graphics and Applications 27, 2 (2007).

5. Cynthia A. Brewer. 1994. Visualization in Modern Cartography. Elsevier Science, Chapter Color use guidelines for mapping and visualization.

6. Cynthia A. Brewer. 1996. Guidelines for selecting colors for diverging schemes on maps. The Cartographic Journal 33, 2 (1996), 79–86.

7. Cynthia A. Brewer, Alan M. MacEachren, Linda W. Pickle, and Douglas Herrmann. 1997. Mapping mortality: Evaluating color schemes for choropleth maps. Annals of the Association of American Geographers 87, 3 (1997), 411–438.

8. Roxana Bujack, Terece L. Turton, Francesca Samsel, Colin Ware, David H. Rogers, and James Ahrens. 2017. The Good, the Bad, and the Ugly: A Theoretical Framework for the Assessment of Continuous Colormaps. IEEE Transactions on Visualization and Computer Graphics (2017).

9. William S. Cleveland and Robert McGill. 1983. A color-caused optical illusion on a statistical graph. The American Statistician 37, 2 (1983), 101–105.

10. William S. Cleveland and Robert McGill. 1984. Graphical perception: Theory, experimentation, and application to the development of graphical methods. J. Amer. Statist. Assoc. 79, 387 (1984), 531–554.

11. Russell De Valois and Karen De Valois. 1990. Spatial Vision. Oxford University Press.

12. Russell L. De Valois, Duane G. Albrecht, and Lisa G. Thorell. 1978. Cortical cells: bar and edge detectors, or spatial frequency filters? In Frontiers in visual science. Springer, 544–556.

13. Connor C. Gramazio, David H. Laidlaw, and Karen B. Schloss. 2017. Colorgorical: Creating discriminable and preferable color palettes for information visualization. IEEE Transactions on Visualization and Computer Graphics 23, 1 (2017), 521–530.

14. D. A. Green. 2011. A colour scheme for the display of astronomical intensity images. Bulletin of the Astronomical Society of India 39 (June 2011), 289–295.

15. Mark Harrower and Cynthia A. Brewer. 2003. ColorBrewer.org: an online tool for selecting colour schemes for maps. The Cartographic Journal 40, 1 (2003), 27–37.

16. Jeffrey Heer and Michael Bostock. 2010. Crowdsourcing graphical perception: using mechanical turk to assess visualization design. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 203–212.

17. G. T. Herman and H. Levkowitz. 1992. Color scales for image data. IEEE Computer Graphics and Applications 12, 1 (1992), 72–80.

18. Michael D. Hyslop. 2006. A comparison of spectral color and greyscale continuous-tone map perception. Master’s thesis. Michigan State University.

19. Alan David Kalvin, Bernice E. Rogowitz, Adar Pelah, and Aron Cohen. 2000. Building perceptual color maps for visualizing interval data. In Human Vision and Electronic Imaging V, Vol. 3959. International Society for Optics and Photonics, 323–336.

20. Eamonn Keogh and Chotirat Ann Ratanamahatana. 2005. Exact indexing of dynamic time warping. Knowledge and Information Systems 7, 3 (2005), 358–386.

21. Gordon Kindlmann, Erik Reinhard, and Sarah Creem. 2002. Face-based luminance matching for perceptual colormap generation. In Proceedings of the IEEE Conference on Visualization ’02. IEEE Computer Society, 299–306.

22. Robert Kosara. 2016. An Empire Built On Sand: Reexamining What We Think We Know About Visualization. In Proceedings of the Beyond Time and Errors on Novel Evaluation Methods for Visualization. ACM, 162–168.

23. Mark P. Kumler and Richard E. Groop. 1990. Continuous-tone mapping of smooth surfaces. Cartography and Geographic Information Systems 17, 4 (1990), 279–289.

24. Kenneth Moreland. 2009. Diverging color maps for scientific visualization. In International Symposium on Visual Computing. Springer, 92–103.

25. Kenneth Moreland. 2016. Why We Use Bad Color Maps and What You Can Do About It. Electronic Imaging 2016, 16 (2016), 1–6.

26. Lace Padilla, P. Samuel Quinan, Miriah Meyer, and Sarah H. Creem-Regehr. 2017. Evaluating the Impact of Binning 2D Scalar Fields. IEEE Transactions on Visualization and Computer Graphics 23, 1 (2017), 431–440.

27. Stephen E. Palmer. 1999. Vision science: Photons to phenomenology. MIT press.

28. Ken Perlin. 1985. An image synthesizer. ACM SIGGRAPH Computer Graphics 19, 3 (1985), 287–296.

29. Linda Williams Pickle. 2003. Usability testing of map designs. In Proceedings of Symposium on the Interface of Computing Science and Statistics. 42–56.

30. P. Samuel Quinan and Miriah Meyer. 2016. Visually comparing weather features in forecasts. IEEE Transactions on Visualization and Computer Graphics 22, 1 (2016), 389–398.

31. Penny L. Rheingans. 2000. Task-based color scale design. In 28th AIPR Workshop: 3D Visualization for Data Exploration and Decision Making. International Society for Optics and Photonics, 35–43.

32. Bernice E. Rogowitz and Lloyd A. Treinish. 1994. Using perceptual rules in interactive visualization. In IS&T/SPIE 1994 International Symposium on Electronic Imaging: Science and Technology. International Society for Optics and Photonics, 287–295.

33. Bernice E. Rogowitz, Lloyd A. Treinish, Steve Bryson, and others. 1996. How not to lie with visualization. Computers in Physics 10, 3 (1996), 268–273.

34. Samuel Silva, Beatriz Sousa Santos, and Joaquim Madeira. 2011. Using color in visualization: A survey. Computers & Graphics 35, 2 (2011), 320–333.

35. Maureen Stone, Danielle Albers Szafir, and Vidya Setlur. 2014. An engineering model for color difference as a function of size. In Color and Imaging Conference, Vol. 2014. Society for Imaging Science and Technology, 253–258.

36. Danielle Albers Szafir. 2017. Modeling Color Difference for Visualization Design. IEEE Transactions on Visualization and Computer Graphics (2017).

37. Christian Tominski, Georg Fuchs, and Heidrun Schumann. 2008. Task-driven color coding. In Information Visualisation, 2008. IV’08. 12th International Conference. IEEE, 373–380.

38. J. Gregory Trafton, Susan S. Kirschenbaum, Ted L. Tsui, Robert T. Miyamoto, James A. Ballas, and Paula D. Raymond. 2000. Turning pictures into numbers: extracting and generating information from complex visualizations. International Journal of Human-Computer Studies 53, 5 (2000), 827–850.

39. Bruce E. Trumbo. 1981. A theory for coloring bivariate statistical maps. The American Statistician 35, 4 (1981), 220–226.

40. W3C. 2016. CSS Values and Units Module Level 3. (2016). http://www.w3.org/TR/css3-values/#absolute-lengths

41. Colin Ware. 1988. Color sequences for univariate maps: Theory, experiments and principles. IEEE Computer Graphics and Applications 8, 5 (1988), 41–49.

42. Colin Ware. 2012. Information visualization: perception for design. Elsevier.

43. Liang Zhou and Charles D. Hansen. 2016. A survey of colormaps in visualization. IEEE Transactions on Visualization and Computer Graphics 22, 8 (2016), 2051–2069.




