
Informational Determinants of Large-area Hurricane Evacuations

Authors: Noah Dormady1*, Anthony Fasano2, Alfredo Roa-Henriquez3, Drew Flanagan1, William Welch1, and Dylan Wood4

Affiliations:

1John Glenn College of Public Affairs, The Ohio State University, 43210 Columbus, USA.

2Department of Physics, The Ohio State University, 43210 Columbus, USA.

3National Institute of Standards and Technology (NIST), 20899 Gaithersburg, USA.

4Civil and Environmental Engineering, University of Notre Dame, 46556, Notre Dame, USA.

*Correspondence to: [email protected], Phone: 614-688-1668.

Abstract: This study reports on two experiments to investigate the informational determinants of hurricane evacuation decisions (temporal and spatial). Whereas most observational and experimental studies in this domain address the public’s response to forecast information, this study addresses emergency management decisions. Using a subject sample of emergency managers and other public safety leaders, contrasted with a more typical university subject pool, this study presents an experimental design that overcomes the counterfactual problem present in all prior published experiments, by relying on an actual storm (Hurricane Rita) with a known outcome. Several methodological advancements are presented, including the use of an established numeracy instrument, integration of advanced hydrodynamic forecasts, and use of a loss aversion frame to improve generalizability. Results indicate that the availability of additional forecast information (e.g., wind speed, forecast tracks) significantly increases the probability and improves the timing of early voluntary evacuation. However, we observe that more numerate subjects are less likely to avoid relying upon forecast information that is characterized by probability (e.g., the uncertainty in the forecast track, sometimes referred to as the “cone of uncertainty”). Consequently, more numerate emergency managers are almost twice as likely as less numerate ones to provide additional evacuation time to their coastal communities, and they do so by longer than a typical workday (8.8 hours). Results also indicate that subjects knowingly over-evacuate large populations when making spatial mandatory evacuation orders. However, results indicate that numeracy mitigates this effect by more than half in terms of the population subject to mandatory evacuation.

Capsule: Hurricane evacuation experiments find numeracy a key factor in reducing unnecessary over-evacuation and improving early voluntary evacuation.

Early Online Release: This preliminary version has been accepted for publication in Bulletin of the American Meteorological Society, may be fully cited, and has been assigned DOI 10.1175/BAMS-D-21-0008.1. The final typeset copyedited article will replace the EOR at the above DOI when it is published. © 2021 American Meteorological Society

1. Introduction

Research evaluating household evacuation decisions in response to hurricane evacuation orders is extensive (Baker 1991; Dash and Gladwin 2007; Thompson et al. 2017). However, very little is known about how those evacuation orders are made by emergency managers (EMs) and other public safety professionals. When a hurricane is approaching, what explains the timing of voluntary evacuation orders? When mandatory evacuation orders are issued, why are some communities evacuated and others not? We know from ex post observational data that compliance with these orders varies considerably (Baker 1995; Dow and Cutter 2002; Fu et al. 2007; Huang et al. 2016; Pham et al. 2020; Wallace et al. 2016), and since at least 2005, some officials have resorted to scare tactics to enforce compliance, even urging non-compliant residents to write their social security numbers on their arms and abdomens with markers to facilitate body identification (Blome 2005; Keneally 2017; Mele 2016). Given the commonplace evacuation of non-exposed communities, there may be adverse societal consequences if residents come to expect some degree of hedging by EMs who knowingly over-evacuate.

From the limited EM decision making research, we have learned that when provided multiple types of storm forecast information (e.g., maximum wind speeds, storm path, probable areas for landfall), EMs are subject to many of the same cognitive errors and decision biases as non-professionals (Drake 2012; Wernstedt et al. 2019) with some notable exceptions (Hoss and Fischbeck 2016). Research finds that EMs, just like non-professionals, tend to focus overly on the forecasted path of the storm, known to forecasters as the “center track,” rather than the so-called “cone of uncertainty,” a tool which forecasters use to visually represent the probabilistic future track of a tropical storm, either to the left or right of its forecasted center track (Broad et al. 2007; Meyer et al. 2013; Sherman-Morris and Antonelli 2018). We have also learned that there can be anchoring bias, wherein some decision makers may “anchor” onto information with higher perceived severity (Losee et al. 2017). And, ex post observational studies suggest that physical properties of the local terrain appropriately factor into decisions (Gudishala and Wilmot 2017).

Some of these informational parameters of the storm forecast presented to EMs are implicitly probabilistic in nature (e.g., the cone of uncertainty). Because these parameters are probabilistic in nature, a decision maker’s ability to effectively utilize them may be tied to their ability to understand probability (i.e., statistical numeracy) (Cokely et al. 2012; Peters et al. 2006). Consequently, a few studies have considered the numeracy (Wernstedt et al. 2019) and critical thinking capacity (Peerbolte and Collins 2013) of EMs. To date, no one has evaluated whether numeracy influences which aspects of the storm forecast information EMs rely upon, and whether, in the absence of numeracy, decision makers avoid relying on information that is perceived to be probabilistic in nature. Moreover, no one has evaluated whether this relationship impacts social welfare by shaping earlier versus later or over- versus under-evacuation orders.

Methodologically, hypothetical surveys feature prominently in this domain (Baker 1995; Drake 2012; Wernstedt et al. 2019), but are subject to hypothetical bias (Loomis 2011). And unlike experiments, observational studies (Gudishala and Wilmot 2017; Regnier 2008) are limited by the absence of controlled counterfactuals. The experiments that do exist, again, predominantly focus on household decision making (Christensen and Ruch 1980; Losee et al. 2017; Meyer et al. 2013; Sherman-Morris and Antonelli 2018; Wu et al. 2014) and very few focus on the decisions of emergency managers (Wernstedt et al. 2019; Wu et al. 2015a,b). And, none of those evaluate the role of numeracy in spatial or temporal decision making.

The experiment presented here is unique in several important ways. Prior experimental designs were limited to cross-treatment comparisons without a counterfactual. In other words, they were able to analyze subject behavior across treatments, but unable to compare decisions to the storm’s ultimate outcome because the scenarios were fictitious. This experiment overcomes this limitation by replicating and obfuscating an actual historic hurricane (Hurricane Rita in the area of Houston and Galveston, Texas in 2005) so an assessment can be made relative to how the disaster actually transpired. Unlike experiments relying solely on student subjects (Losee et al. 2017; Wu et al. 2015a,b), this experiment also incorporates a sample of professional subjects including county EMs from select hurricane-observing states (excluding Texas) and public safety leaders.

In this two-stage experiment, we study both temporal and spatial dimensions. The stage 1 experiment evaluates the timing of voluntary evacuation order recommendations for coastal and low-lying communities in the context of experimentally-controlled forecast information. The stage 2 experiment evaluates subjects’ spatial decisions on which of the city of Houston’s established evacuation zones are to be mandatorily evacuated. Because this experiment overcame the absence of a counterfactual, subject decisions could be scored. This enables the design to be the first of its kind to integrate induced value theory (Smith 1976) with embedded decision scoring. In other words, while prior evacuation experiments were both hypothetical and had payment unconnected to decision performance, this experiment is not so limited, and incentivizes subjects to perform as they would in the field. The scoring functions are novel in design, accounting for both over- and under-evacuation.

Beyond this, the scoring functions utilized in this experiment further incentivize externally-valid decisions by building upon a loss (rather than a gain) frame (Tversky and Kahneman 1981). To induce the endowment effect, subjects were given the single-item battery of the Berlin Numeracy Test (Cokely et al. 2012). Thus, this experiment is the only experiment in this domain that can simultaneously incentivize externally-valid and real-case decisions while controlling for subject numeracy using an established risk literacy instrument. Additional controls are integrated from a post-experiment survey. Statistical analyses are presented to evaluate both stages.

2. Experimental Methods

Experiments are, by their very nature, abstractions of reality. They are designed for the purpose of testing theories regarding mechanisms or behaviors and are not meant to be inclusive of every facet of a physical or topographical area of study, which would otherwise be impractical and confuse subjects. Experiments can play an important role in evaluating risk behaviors because they can control for counterfactuals in a way that observational data cannot. In this domain, there have been a handful of experiments focusing on the evacuation decisions of households (Christensen and Ruch 1980; Losee et al. 2017; Meyer et al. 2013; Sherman-Morris and Antonelli 2018; Wu et al. 2014), but even fewer have focused on the decisions of emergency managers (Wernstedt et al. 2019; Wu et al. 2015a,b). Prior experimental designs were limited to cross-treatment comparisons without a counterfactual as they used fictional scenarios to cross-analyze subject behavior under various treatments, but had no measure of the “correctness” of the subjects’ decision making with respect to an actually observed event. This experiment overcomes this limitation by replicating an actual historic hurricane (Hurricane Rita, 2005),1 so an assessment of each decision can be made relative to how the disaster actually transpired.

1 Utilization of Hurricane Rita followed a consultative criteria-driven selection process that included review of numerous alternatives, consultation with engineering experts in storm surge and hydrodynamic modeling, and an informal interview with the Emergency Manager of Harris County, Texas. Our criteria required the storm to be of sufficient age and relatively low salience to prevent recall identification. They required the storm to have a relatively common track to further obfuscate recall identification. They required the storm to be a Category 5 that made landfall. For ease of presentation to subjects, they required the storm to be non-multijurisdictional (i.e., at-risk populations limited to a single state). Whereas prior evacuation experiments presented subjects with only straight-line forecasts (Wu et al. 2014), we identified a strong preference for a curvilinear track to improve external validity. Rita met each of these characteristics.

Subjects completed two experiments, stage 1 and stage 2. In the first stage, subjects were randomly assigned to one of four treatment groups in a between-subjects design online experiment.2 Subjects assumed a role as Senior Advisor at the Texas Office of Emergency Management to advise the Texas governor on making large-area evacuation decisions for a storm approaching from the Gulf. In stage 1, subjects made a voluntary evacuation recommendation for the coastal and low-lying areas in the Houston metro area. In stage 2, subjects made a spatial mandatory evacuation recommendation consistent with Harris County’s four established hurricane evacuation zones. Stage 2 used a within-subjects design. Across the two stages, subjects made a temporal voluntary decision followed by a spatial mandatory one.

2.1. Stage 1 Experiment

Subject entry into the experiment coincided with the time of the first forecast advisory by the National Hurricane Center (NHC) at which the storm made the transition to a Cat. 1 on the Saffir-Simpson Index (11am Tuesday, 9/20). At that time, the storm was located just south of Miami, Florida and headed west. Subjects were presented with actual hurricane forecast information from Rita, obfuscated by name to ‘Rebecca’ to avoid recall identification. Forecast information from the successive NHC advisories was progressively added across treatments. Advisories were presented in a series of nine decision periods, or rounds, that mapped directly onto the nine advisories issued for Rita before the NHC issued its critical ‘Hurricane Warning’ for the area (10am Thursday, 9/22). Subjects were given the opportunity to make the voluntary order recommendation each round. Once the recommendation was made, subjects exited the stage and were informed that a Hurricane Warning had been issued and hurricane-force winds were expected within 24-36 hours. Subjects not making a voluntary evacuation recommendation proceeded through all nine advisories, or rounds, and were then notified that the Hurricane Warning had been issued. Advisories were re-numbered to exclude early storm formation for decision simplicity.

2 Replication link and instructions are provided in SI-2. A description of the random assignment algorithm is provided in SI-5.

Treatment conditions selectively presented subjects with increased probabilistic forecast information. T1 represented a baseline control condition in which only historic and current storm information was provided: Historic Center Track; Current Center Location; Current Max Sustained Wind Speed. T2 added: Forecast Center Track; and Forecasted Watch and Warning Areas. T3 added: Cone of Uncertainty. T4 added: Forecasted Max Wind Speed. See Figure 2.
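The nesting of the four treatment conditions can be written out as a small data structure. This is only an illustrative sketch: it assumes that each treatment shows everything from the earlier treatments plus its own additions, as the wording "added" suggests, and the names `ADDED_BY_TREATMENT` and `information_shown` are hypothetical and not taken from the experiment software.

```python
# Information items introduced by each treatment, as listed in the text.
ADDED_BY_TREATMENT = {
    "T1": ["Historic Center Track", "Current Center Location",
           "Current Max Sustained Wind Speed"],
    "T2": ["Forecast Center Track", "Forecasted Watch and Warning Areas"],
    "T3": ["Cone of Uncertainty"],
    "T4": ["Forecasted Max Wind Speed"],
}
TREATMENT_ORDER = ["T1", "T2", "T3", "T4"]


def information_shown(treatment):
    """Return the items a subject in `treatment` would see.

    Assumption: treatments are cumulative, so T3 includes T1 and T2 items.
    """
    if treatment not in ADDED_BY_TREATMENT:
        raise ValueError(f"unknown treatment: {treatment}")
    shown = []
    for name in TREATMENT_ORDER:
        shown.extend(ADDED_BY_TREATMENT[name])
        if name == treatment:
            break
    return shown


for t in TREATMENT_ORDER:
    print(t, len(information_shown(t)), "items")
```

Under this reading, T1 shows three items and T4 shows all seven.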

2.2 Stage 2 Experiment

Stage 2 used a within-subjects design in which subjects were asked to make a spatial mandatory evacuation order recommendation. From an interactive zone map of the region, they were presented with seven possible mandatory evacuation configurations (see SI Sect. 4.2.11) and selected one. The seven configurations correspond to established Harris County evacuation areas.

Because today’s advanced hydrological models are often presented to decision makers in the form of best versus worst-case inundation scenarios, subjects were presented with three possible inundation maps, or maps of the maximum extent of flooding induced by the hurricane storm surge. The maps were produced by finite element analysis in the hydrodynamic model, DG-SWEM (Discontinuous Galerkin-Shallow Water Equations Model) (Dawson et al. 2011; Kubatko et al. 2006). The hydrodynamic model analysis was executed on a computational mesh grid known as TX2008, a grid of more than 2.8 million nodes developed for FEMA Flood Insurance Studies (FIS) on the Texas coastline by the U.S. Army Corps of Engineers. The grid represents the bathymetry, i.e., the surface of the earth that lies underwater, in the Gulf of Mexico and western North Atlantic Ocean as well as topography, the surface of earth above water, in coastal Texas. Nodal spacing varies significantly throughout: the coarsest spacing is roughly 19 to 24 km in the deep Atlantic Ocean and the finest is roughly 30 m in Texas (USACE 2011). Hydrodynamic forces in the model were based on tidal constituents enforced at the model boundaries as well as a quadratic drag law for wind stress, where observational wind speeds and air pressures (10 meters above sea level) for the hurricane were provided from data assimilation analysis by Oceanweather Inc., also developed for FEMA FIS studies in both Louisiana and Texas (USACE 2011, 2008). Hurricane Rita was simulated up until the time of NHC advisory number 20 for the storm, the first NHC advisory for which a hurricane warning was issued in the study area (Houston/Galveston, Texas) and consistent with the decision timing provided to subjects.

From this point, three different scenarios were modeled: a “center track” scenario, where the storm proceeded directly along the center track forecasted in NHC advisory number 20, and “veer-left” and “veer-right” scenarios, where the storm track deviated from the forecasted center track in either the left or right direction, based on the NHC specifications on the extent of the Cone of Uncertainty for its forecasted hurricane tracks. The corresponding inundation maps are generated by considering the maximum water surface elevations at each nodal point in the computational grid over the entirety of the simulation for each storm scenario modeled, and by drawing color where the depth of the water on land (i.e., initially dry areas) exceeds 0.15 m (roughly half of a foot). See Fig. 3.
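The map-generation rule described above (take the maximum water surface elevation at each node over the simulation, then flag initially dry nodes where the resulting depth exceeds 0.15 m) can be sketched as follows. This is a minimal illustration and not the DG-SWEM post-processing itself: the function name, the list-based data layout, the vertical datum, and the use of ground elevation above the initial water level as the test for "initially dry" are all assumptions.

```python
DEPTH_THRESHOLD_M = 0.15  # threshold stated in the text


def inundated_nodes(ground_elev, water_elev_series, initial_water_level=0.0):
    """Flag nodes that would be colored on an inundation map.

    ground_elev: ground elevation per node, in metres (hypothetical datum).
    water_elev_series: one list per time step, each holding the water
        surface elevation per node, in metres on the same datum.
    """
    flags = []
    for node, ground in enumerate(ground_elev):
        # Maximum water surface elevation at this node over the whole run.
        max_wse = max(step[node] for step in water_elev_series)
        # Assumption: a node is "initially dry" if it sits above the
        # initial water level.
        initially_dry = ground > initial_water_level
        depth_on_land = max_wse - ground
        flags.append(initially_dry and depth_on_land > DEPTH_THRESHOLD_M)
    return flags


# Three hypothetical nodes: open water, low-lying land, higher land.
ground = [-5.0, 0.5, 2.0]
series = [
    [0.0, 0.0, 0.0],
    [1.0, 0.9, 0.9],
    [0.4, 0.6, 0.5],
]
print(inundated_nodes(ground, series))  # [False, True, False]
```

Only the low-lying land node is flagged: the open-water node was never dry, and the peak surge does not reach the higher node.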

2.3 Scoring Functions & Numeracy Test

Advancements from the field of experimental economics can meaningfully inform disaster evacuation experiments. To date, all prior evacuation experiments have omitted the integration of induced value theory (see the work of Nobel Laureate Vernon Smith, 1976) into their designs. Value is typically induced by tying subject payment to decision-making performance, so that subjects make decisions that more closely approximate those they would make in the external context being evaluated. Because prior evacuation experiments omitted a counterfactual from their designs, they were not able to score the appropriateness of decisions, and, therefore, they were unable to induce value. Additionally, because they relied upon hypothetical scenarios, hypothetical bias (Loomis 2011) may have also been a concern. In other words, in previous experiments, subjects got paid either way and had no direct inducements to behave as they would in a real disaster.

This experiment not only overcomes these issues but takes the added step of building a scoring function (i.e., the scoring mechanism) that is tied to a loss frame (Kahneman and Tversky 1979; Tversky and Kahneman 1981). To induce the endowment effect, subjects were given the single-item battery of the Berlin Numeracy Test (Cokely et al. 2012)3 at the outset of the experiment. The test “specifically measures the range of statistical numeracy skill that is important for accurately interpreting and acting on information about risk—i.e., risk literacy” (Cokely et al. 2012, p. 37). Subjects were informed that by completing the assessment, they would be earning points that they would either retain, or lose, based on their decision performance in the experiment. They were informed that at the end of the experiment points would be converted to dollars for their subject payment. Subjects were not informed of their performance on the numeracy test and were given 200 points each for their completion of it. In this way, we were able to statistically account for numeracy using the same assessment that we used to generate the subject’s endowment.

3 Cokely et al.’s (2012) assessment is the following: Out of 1,000 people in a small town 500 are members of a choir. Out of these 500 members in the choir 100 are men. Out of the 500 inhabitants that are not in the choir 300 are men. What is the probability that a randomly drawn man is a member of the choir?
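For readers who want to check the item, the arithmetic follows directly from the counts given in the question: the probability is conditional on the person being a man, so the denominator is all men in the town and not the full population. A short sketch (variable names are illustrative only):

```python
from fractions import Fraction

men_in_choir = 100       # stated in the item
men_not_in_choir = 300   # stated in the item

# Condition on "is a man": only men enter the denominator.
all_men = men_in_choir + men_not_in_choir
p_choir_given_man = Fraction(men_in_choir, all_men)

print(p_choir_given_man, float(p_choir_given_man))  # 1/4 0.25
```

A common slip is to divide by the 500 choir members or the 1,000 inhabitants, which answers a different question.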


Subject payment ratios differed by subject type. Professional subjects’ conversion rate was 15 cents per point remaining at the end of the experiment. Student subjects’ rate was 6 cents. Rates were set to approximate $60/hr. based on the mean duration recorded during pilot experiments with graduate student subjects. Subjects also earned five additional dollars (student subjects three) for completing a post-experiment survey that provided additional explanatory variables. Mean total payouts were $24.47 and $11.62 for professional and student subjects, respectively.
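The conversion from points to dollars described above can be illustrated with a short sketch. The rates and survey bonuses come from the text; the function name, the integer-cent bookkeeping, and the absence of any rounding or minimum-payment rule are assumptions made for illustration.

```python
# Conversion rates and survey bonuses as stated in the text, in cents.
RATE_CENTS_PER_POINT = {"professional": 15, "student": 6}
SURVEY_BONUS_CENTS = {"professional": 500, "student": 300}


def payout_dollars(subject_type, points_remaining, completed_survey=True):
    """Convert points retained at the end of the experiment to dollars.

    Assumption: no rounding rule or minimum payment beyond what the
    text states.
    """
    cents = RATE_CENTS_PER_POINT[subject_type] * points_remaining
    if completed_survey:
        cents += SURVEY_BONUS_CENTS[subject_type]
    return cents / 100


# A subject who keeps the full 200-point endowment and completes the survey:
print(payout_dollars("professional", 200))  # 35.0
print(payout_dollars("student", 200))       # 15.0
```

Because the frame is a loss frame, these figures are the ceiling: each point lost in either stage lowers the payout by the per-point rate.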

Scoring functions were structured to coincide with best-possible outcomes given the realized impacts of Rita. Stage 1 decision scoring was straightforward, given observed landfall just north of the Houston Metro area. Voluntary evacuation orders for the coastal and low-lying areas (e.g., Galveston) were an appropriate decision, and an early voluntary evacuation recommendation was appropriate for those communities. Stage 2 decision scoring required the creation of an inundation map. While no official map was created or publicly released, our student team developed one from the post-disaster aerial imagery provided by NOAA’s Geodetic Survey (NOAA 2005) and the NHC’s Tropical Cyclone Report for Rita (Knabb et al. 2011). The report provides geographic inundation details based on geographic indicators, including flood insurance claims and high-water marks. Inundation was observed only for the coastal evacuation zone including Galveston, with minor wind-induced inland flooding.

The scoring function for each stage (text and mathematical derivation provided in SI-6) made use of 100 points, or half of the endowment. The stage 1 function accounted for each possible decision outcome and was designed to account for improved early evacuation as well as the adverse effects of false positives (Regnier 2008). The stage 2 function was developed to account for the at-risk populations expected to remain in each of the county’s four evacuation zones.4 Estimated populations remaining in each zone were presented to subjects, shown separately for each decision selection and through an interactive color-adaptive spatial map. The scoring function was designed to simultaneously account for both under- and over-evacuation. While no metric could accurately capture the relative value of each, we presented subjects with a value function that scored under-evacuation twice as adversely as over-evacuation.
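The actual stage 2 scoring function is derived in SI-6 and is not reproduced here. Purely as an illustration of the stated design principle (a loss frame in which under-evacuation counts twice as heavily as over-evacuation), one hypothetical form is a population-weighted penalty normalized to the stage's 100 points. The function name, zone labels, population figures, and normalization below are all assumptions.

```python
def stage2_points_lost(evacuated, flooded, remaining_pop,
                       stage_points=100.0, under_weight=2.0, over_weight=1.0):
    """Hypothetical loss-frame score: points deducted for a stage 2 decision.

    evacuated: set of zones placed under mandatory evacuation.
    flooded: set of zones that were actually inundated.
    remaining_pop: dict mapping zone -> estimated population remaining.
    """
    # People left in flooded zones that were not ordered to evacuate.
    under = sum(pop for zone, pop in remaining_pop.items()
                if zone in flooded and zone not in evacuated)
    # People ordered out of zones that did not flood.
    over = sum(pop for zone, pop in remaining_pop.items()
               if zone in evacuated and zone not in flooded)
    flooded_pop = sum(pop for zone, pop in remaining_pop.items()
                      if zone in flooded)
    dry_pop = sum(remaining_pop.values()) - flooded_pop
    # Worst case: miss every flooded zone and evacuate every dry one.
    worst = under_weight * flooded_pop + over_weight * dry_pop
    if worst == 0:
        return 0.0
    return stage_points * (under_weight * under + over_weight * over) / worst


# Hypothetical zones and populations; only the coastal zone floods.
pop = {"coastal": 100, "zone_2": 200, "zone_3": 300, "zone_4": 400}
flooded = {"coastal"}
print(stage2_points_lost({"coastal"}, flooded, pop))            # 0.0
print(round(stage2_points_lost(set(), flooded, pop), 1))        # 18.2
print(round(stage2_points_lost(set(pop), flooded, pop), 1))     # 81.8
```

In this sketch a perfect decision loses nothing, while evacuating all four zones is penalized for the large population moved unnecessarily, even though each person left in harm's way counts double.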

2.4 Video Instructions

Subject instructions were delivered as videos that took the form of a professional briefing, one preceding each stage. Instructions were presented by Darryl Anderson, Interoperability Coordinator for the U.S. Dept. of Homeland Security Office of Emergency Communications, and Commandant Ret. of the Ohio Highway Patrol Academy. Videos were professionally recorded and edited. Separate video tracks were developed for each treatment group. Videos also contained treatment-specific instructions for reading advisories taken from a different storm, Ophelia (2005). The experiment interface was coded to prevent fast-forwarding or skipping of the videos. Closed captioning was manually revised to 100% accuracy and auto-enabled.

2.5 Subject Sample

The subject pool included both professional subjects and a more typical student population. Professional subjects (N=81) consisted predominantly of emergency managers or assistant emergency managers drawn from coastal hurricane-observing states,5 excluding Texas to avoid memory bias. All valid publicly-listed emergency manager email addresses in those states received an invitation, yielding a net completion rate of 8.9%. Beyond EMs, a small group of subjects were invited from the Ohio State University Public Safety Leadership Academy, which provides leadership training to senior public safety officials (e.g., Chiefs and Sheriffs). Student subjects (N=227) were randomly drawn from the Ohio State University experimental economics subject pool. Oversampling was conducted for upper division and graduate-level students from disciplines more common to the emergency management field (e.g., public affairs, business, ROTC) to improve representativeness. Subjects were 81% and 44% male for professional and student subjects, respectively (<1% reported transgender or non-binary). They were 18.5% and 3.5% veteran, respectively. 67% of the professional subjects reported a college degree or higher.

4 Post-voluntary evacuation population remaining percentages were estimated in Dow and Cutter (2002), Fu et al. (2007), Pham et al. (2020), and Whitehead et al. (2001), which collectively find approximately 15-45% voluntary evacuation compliance rates. See SI-7 for detailed methodology for estimating remaining at-risk population.

5 These included AL, DE, FL, GA, LA, MD, NC, NJ, NY, SC, & VA. Emergency managers represent 74% of the professional subject population.

3. Results

3.1. Stage 1 Experiment: Temporal Voluntary Evacuation Decision

Hurricane forecast information increased voluntary evacuation recommendation times by between 16.6 and 22.8 hours, much-needed time for coastal communities to prepare for and implement evacuation measures. By evaluating treatment effects relative to the absence of forecast information (Treatment 1, or T1), a clean estimate of the individual contribution of each key informational parameter can be obtained, one that does not preclude inter-treatment comparison. At the most basic level and while holding numeracy constant, the social value of the forecast center track (T2) represented an expected value of 16.6 additional hours (p<.05) to coastal communities. Adding a cone of uncertainty (T3) extended this time to 19.1 hours (p<.05), and adding forecasted max wind speed (T4) extended this time to 22.8 hours (p<.01). The addition of the forecast center track (T2) increased the likelihood of early voluntary evacuation by a factor of 2.7 (n.s.). Addition of the cone (T3) increased this factor to 3.1 (p<.10), and addition of forecasted max wind speed (T4) increased it to 4.4 (p<.05).


These estimates were derived by regression analysis of the experiment data, presented in Table 1. Model 1 presents Tobit estimates and Model 2 presents survival regression estimates using the Cox Proportional Hazard (PH) model. Tobit models are specifically used to account for censoring effects in the data, in which observations are ‘censored’ or cut off at the initial and concluding rounds/advisories (Cameron and Trivedi 2010; Tobin 1958). Survival models (also called ‘Accelerated Failure Time’ or AFT models) are a type of regression model designed for evaluating dependent variables involving time until a binary event (in this case, time until evacuation recommendation). They allow us to treat evacuation as a “failure” event (a statistical term, not an evaluative one) where the modeled outcome is the hazard ratio, which is an estimate of the increased likelihood of this event occurring at any given advisory relative to the control. Cox models (Cox 1972; Cox and Oakes 1984) are an advanced class of these models. Results are separated by subject type; similar results are obtained for student subjects. The dependent variable in Model 1 is hours remaining prior to the issuance of the hurricane warning by the NHC (i.e., when early evacuation ends), which is left- and right-censored at start and end advisories. The AFT variable (or dependent variable) for Model 2 is the advisory in which the subject recommended voluntary evacuation. Robustness checks and alternative model specifications are provided in SI-3 at Tables SI-3.5-3.6. Demographics (e.g., sex, age, education, veteran status) were not robust explanatory factors.
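To illustrate how a Tobit model accounts for censoring at the first and last advisories, the following is a minimal sketch of a two-limit Tobit log-likelihood with a single treatment indicator. It is not the authors' estimation code. The data, the censoring bounds (0 and 60 hours), and the parameter values are hypothetical and chosen only to show the mechanics: censored observations contribute a tail probability, while uncensored observations contribute a normal density.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tobit_loglik(hours, treated, intercept, effect, sigma, lower, upper):
    """Log-likelihood of a two-limit Tobit model with one binary regressor."""
    total = 0.0
    for y, d in zip(hours, treated):
        mu = intercept + effect * d
        if y <= lower:
            # Left-censored: only know the latent value is at or below `lower`.
            total += math.log(normal_cdf((lower - mu) / sigma))
        elif y >= upper:
            # Right-censored: only know the latent value is at or above `upper`.
            total += math.log(normal_cdf((mu - upper) / sigma))
        else:
            # Uncensored: ordinary normal density contribution.
            total += math.log(normal_pdf((y - mu) / sigma) / sigma)
    return total


# Hypothetical toy data (NOT the experiment data): hours of advance notice,
# censored at 0 and 60, for four control and four treated subjects.
hours = [0.0, 0.0, 12.0, 18.0, 30.0, 24.0, 36.0, 60.0]
treated = [0, 0, 0, 0, 1, 1, 1, 1]
LOWER, UPPER = 0.0, 60.0

with_effect = tobit_loglik(hours, treated, 8.0, 28.0, 15.0, LOWER, UPPER)
no_effect = tobit_loglik(hours, treated, 22.0, 0.0, 15.0, LOWER, UPPER)
print("log-likelihood with treatment effect:", round(with_effect, 2))
print("log-likelihood without treatment effect:", round(no_effect, 2))
```

In practice the parameters would be chosen by maximizing this function numerically; the sketch only compares two candidate parameter sets on the toy data.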

Importantly, estimates in both models control for subject numeracy. Among professional subjects, numeracy improved hurricane voluntary evacuation recommendation times by an average of 8.8 hours (p<.10) and was associated with a hazard ratio of 1.7 (p<.10). Numeracy is only statistically significant for professional subjects, which is influenced by the fact that 25% of professionals successfully completed the numeracy assessment compared to 52% of students. We note that this also provides a highly robust validity check of the experimental results: Cokely et al. (2012), who introduced the Berlin Numeracy Test, obtain the exact same percentage accurate in their student sample. We find that numeracy is uncorrelated with education (ρ=.01 for college degree, ρ=.08 for post-graduate degree) and time taken (Cokely et al. 2018) (see SI-3 Table 3.10). Extrapolating from these results, we observe that more numerate emergency managers are almost twice as likely as less numerate ones to provide additional voluntary evacuation time to their coastal communities regardless of forecast information, and they do so by longer than a typical workday.
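For readers less familiar with survival models, the "almost twice as likely" reading follows from the standard form of the Cox proportional hazards model. The notation below is generic textbook notation rather than the authors' own, and treating numeracy as a binary indicator (pass or fail on the numeracy item) is an assumption based on the notes to Figure 1.

```latex
h(t \mid \mathbf{x}) = h_0(t)\,\exp\!\left(\mathbf{x}^{\top}\boldsymbol{\beta}\right),
\qquad
\mathrm{HR}_{\text{numeracy}}
  = \frac{h(t \mid \text{numerate})}{h(t \mid \text{not numerate})}
  = \exp\!\left(\beta_{\text{numeracy}}\right) \approx 1.7
```

Because the baseline hazard $h_0(t)$ cancels in the ratio, a hazard ratio of about 1.7 means that, at any given advisory and holding treatment constant, a numerate professional who has not yet recommended evacuation is estimated to be about 1.7 times as likely to do so as a less numerate one.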

Student subjects generally outperformed professional subjects, on average recommending voluntary evacuation approximately 1.3 advisories earlier. Students evacuated, on average, after 6.6 advisories; professionals did so after 7.9. Students were about 8 percentage points more likely to make an advance recommendation: 31.7% of students and 39.5% of professionals did not recommend voluntary evacuation prior to the NHC warning. Additional evacuation rate details by advisory and treatment are provided in SI-3 (see Table SI-3.9 and Figures SI-3.1-3.6).

Further refinement of results can be obtained by evaluating the post-experiment survey. Subjects were asked to identify the three informational attributes they relied upon most and then rank-order them. By self-identifying those informational criteria, subjects provided valuable information on the forecast elements that most influenced their decisions. Detailed summary statistics are provided in SI-3, along with a full suite of statistical tests of treatment equality (see Table SI-3.8). Subjects relied most heavily on the current center location in the absence of forecast information.

Of critical importance is the unwavering reliance by subjects on the forecast center track in all forecast treatments. Even with the addition of the cone of uncertainty, the forecast center track remained the informational attribute ascribed the greatest relative weight, or importance, and consistently so across subject type (all tests safely fail to reject the null). This is important and comports with Regnier (2008), Wernstedt et al. (2019), Wu et al. (2014), and several others whose findings suggest that both the general public and emergency managers make significant judgment errors by over-relying on the center track in evaluating forecast information. Consequently, after 2006, the NHC began providing separate advisory graphics both including and excluding the center track, allowing site visitors to self-select advisory graphics (Morrow et al. 2015). However, our results indicate the critical importance of one additional caveat.

That is, results indicate that numeracy plays an important role in influencing which informational attributes subjects rely upon. In a comparison of T2 and T3 mean values between subject types, we clearly observe that when provided a cone, professionals place half as much weight on it as students do. Recalling that observed student numeracy rates are twice that of professionals, we evaluated the relationship between numeracy and informational attribute importance. Bivariate logistic regression on subject data from forecast treatments reveals a positive and statistically significant relationship (β=1.121, p<.10 for professionals and β=0.717, p<.05 for students) between numeracy and the importance ranking of the cone of uncertainty. Predicted margins from these regressions indicate that professional subjects who received the cone are 27.2 percentage points more likely to rely upon the cone if they are more numerate (34.3 versus 61.5%). For students, this value is 17.5 percentage points (46.8 versus 64.3%).
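The link between the reported logit coefficients and the predicted margins can be checked with a few lines of arithmetic. The sketch below assumes numeracy enters each bivariate model as a binary indicator, so that exp(β) is the odds ratio between more and less numerate subjects; the function and variable names are illustrative and not taken from the study's replication code.

```python
import math


def shift_probability(p_base, beta):
    """Apply a logit coefficient to a baseline probability.

    Converts the baseline probability to odds, multiplies by the odds
    ratio exp(beta), and converts back to a probability.
    """
    odds = p_base / (1.0 - p_base)
    new_odds = odds * math.exp(beta)
    return new_odds / (1.0 + new_odds)


# Reported values: predicted reliance on the cone for less versus more
# numerate subjects, and the logit coefficient on numeracy.
groups = {
    "professionals": {"p_low": 0.343, "p_high": 0.615, "beta": 1.121},
    "students": {"p_low": 0.468, "p_high": 0.643, "beta": 0.717},
}

for name, g in groups.items():
    gap_points = (g["p_high"] - g["p_low"]) * 100.0
    relative = g["p_high"] / g["p_low"]
    implied = shift_probability(g["p_low"], g["beta"])
    print(
        f"{name}: gap = {gap_points:.1f} percentage points, "
        f"ratio = {relative:.2f}x, "
        f"probability implied by beta = {implied:.3f}"
    )
```

The implied probabilities land within rounding of the reported margins, which confirms that the 27.2 and 17.5 figures are differences in percentage points rather than relative increases.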

Taken together, these results provide at least some evidence that less numerate decision makers, as measured by a well-established risk literacy instrument, avoid the cone of uncertainty as an informational determinant. This indicates a predisposition toward relying upon informational attributes that they perceive to be less probabilistic in nature and paying greater attention to attributes such as the forecast center track, which are not directly presented as a function of statistical error.

3.2. Stage 2 Experiment: Spatial Mandatory Evacuation Decision

A mandatory evacuation order is an implicit spatial decision that involves two consequences of potential judgment error: over- and under-evacuated communities. At the mean, professional subjects evacuated a remaining population (i.e., those remaining after NHC warning) of just over 320K persons; students evacuated just over 285K persons (difference of 34K persons, p<.16). While three percent of student subjects chose to evacuate zero communities, all professional subjects evacuated at least one zone. 13.6 and 18.1 percent of professionals and students, respectively, evacuated only the coastal zone (i.e., the Galveston area). 50.6 and 46.7 percent of professionals and students, respectively, evacuated both the Coastal zone and Zone A, the nearest inland zone. Choropleth spatial density maps of evacuation decisions are presented in Figure 1. Stage 2 used a within-subjects design (i.e., all exposed to the same treatment) in which subjects were asked to make a spatial mandatory evacuation order recommendation. From an interactive zone map of the region, subjects selected one of seven possible mandatory evacuation configurations that correspond to established Harris County evacuation areas (see SI 4.2).

All subjects were provided three stormwater inundation projections that coincide with three potential storm track scenarios (i.e., veer-left, center track, veer-right). Post-experiment survey instruments provide further explanatory power. Subjects were asked to identify which of the three projection maps they relied upon most. Only 5% report relying on the optimistic (veer-right) projection, which was the most accurate relative to actual inundation. 59% report reliance on the center projection.


Subjects were also asked a post-experiment decision-making rationale question that features prominently in statistical robustness tests: whether they knowingly chose to evacuate any zones even though they thought the likelihood of human exposure to flooding was low (a variable we refer to as ‘human exposure’). While 56.5 percent of subjects reported in the affirmative, the difference between professional subjects and student subjects is not statistically significant on the whole. However, among the 35% of subjects who reported reliance on the veer-left (worst-case) projection, professional and student subjects differ significantly in self-admitting over-evacuation (p<.02 using a Wilcoxon test). Of these subjects relying on the worst-case projection, 42.8 percent of professionals and 70 percent of student subjects self-identified as over-evacuating.

Moreover, perceptions were not consistent with evacuated populations. Students evacuated an average of 124K persons when self-reporting over-evacuating, but this number is 166K among professionals. Of those subjects reporting reliance on the veer-left projection, the additional evacuated population for professional subjects associated with self-reported over-evacuating is over 200K persons, nearly twice the difference for student subjects (119K). Put simply, when subjects relied upon the worst-case projection, professional subjects were less likely than students to admit to over-evacuating, even though they generally evacuated larger populations. When professionals relying on the worst-case inundation projection did admit to over-evacuating, the additional population they evacuated was nearly twice that of students.

Regression analyses provide further explanation. Two main dependent variables were evaluated: a) a scoring function (detailed in SI-6) that accounts for both over- and under-evacuation and b) the total remaining population mandatorily evacuated. While we can confirm the absence of statistical collinearity in the models (meaning there is no pairwise relationship between reliance on one inundation projection over another and other explanatory variables), we also confirm that these inundation projections lack statistical robustness in all models, indicating that reported reliance on one inundation projection over another was less of an influencing factor for evacuation decisions than self-admitted over-evacuation (i.e., the ‘human exposure’ variable). Across the board, this variable is statistically associated with over-evacuating populations (negative coefficients in the scoring function models and positive in the population models, indicating over- rather than under-evacuation). This generally confirms that subjects knew that they were over-evacuating.

We similarly confirm that numeracy reduces over-evacuation but is only statistically robust among the student population. While numeracy is not statistically correlated with self-admitted over-evacuation, the favorable effects of numeracy offset more than half of the over-evacuation effects (67.3 percent population difference at p<.01). Put simply, the adverse social consequences of over-evacuation decisions among our sample of student subjects are significantly mitigated by numeracy. This is not the case, however, for professional subjects, who knowingly over-evacuated larger populations regardless of numeracy.
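The 67.3 percent figure appears to be the ratio of two population-model coefficients in Table 2. The sketch below reproduces that arithmetic under the assumption that the final pair of columns in the flattened table (Model 6, the population regression that includes the ‘human exposure’ variable) holds the relevant estimates; this column assignment is our reading of the table, and the function name is illustrative.

```python
# Coefficients read from Table 2, Model 6 (persons evacuated).
# The column assignment is an assumption based on the flattened table layout.
model6 = {
    "students": {"numeracy": -83208.360, "human_exposure": 123588.700},
    "professionals": {"numeracy": -24379.310, "human_exposure": 168162.800},
}


def offset_share(numeracy_coef, exposure_coef):
    """Share of the over-evacuation effect offset by the numeracy effect."""
    return -numeracy_coef / exposure_coef


for group, coefs in model6.items():
    share = offset_share(coefs["numeracy"], coefs["human_exposure"])
    print(f"{group}: numeracy offsets {share * 100:.1f}% of the over-evacuation effect")
```

For students the ratio matches the 67.3 percent reported above. The professional ratio is much smaller and rests on a numeracy coefficient that is not statistically significant, which is consistent with the statement that numeracy does not mitigate over-evacuation among professionals.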

4. Discussion of Results

Non-compliance with hurricane evacuation orders is a social problem. But therein lies a paradox: observational and experimental studies find that residents who had previously lived through a hurricane are often more likely to be non-compliant (Baker 1991; Meyer et al. 2013). Nearly five decades ago, Windham et al. (1977) referred to this as the “experience-adjustment paradox” and Meyer et al. (2013) refer to this as “false experience effects.” These experimental results might move the proverbial needle in explaining this paradox from the standpoint of informational parameters. Here, results suggest that less numerate decision makers avoid what they perceive to be probabilistic forecast information, and the consequence of this is less advance warning to communities, by more than a typical workday. Results also indicate that numeracy can play a role in significantly reducing the predilection to knowingly over-evacuate large areas of remaining populations. If residential populations come to expect that their public leaders’ risk considerations are asymmetric from their own, the important public trust that is necessary to avoid loss of life in major disasters can be called into question. Escalation effects can propagate if public safety leaders feel the need to scare increasingly distrustful populations into compliance.

These experimental results also serve to address a long-established conundrum in presenting tropical cyclone forecasts: the over-reliance on the forecast center track by both residential populations and public safety officials (Broad et al. 2007; Meyer et al. 2013; Sherman-Morris and Antonelli 2018). Our results highlight the fact that it is not necessarily that decision makers are over-reliant on the track line, but instead that the less numerate disregard what is presented as a function of statistical error (i.e., the cone). This result calls into question the judgment of media and others who, for more than a decade now, have selectively omitted the forecast center track and presented only the cone. What remains is what is otherwise disregarded by many less numerate decision makers. Given the absence of consensus regarding the presentation of scientific uncertainty to public safety practitioners, further research is needed to improve the presentation of probabilistic information.

This experiment has made several methodological advancements beyond existing experimental research in this domain. These include integration of endowment generation and scoring functions, overcoming the counterfactual challenge that exists in hypothetical experiments, use of a well-established numeracy metric, and integration of a professional subject population. One implication regarding the subject population deserves to be highlighted. Whereas existing experiments have consistently relied upon student subject pools to extrapolate public safety decisions, by including both student subjects and professional decision makers, side by side, our results paint in stark relief the potential validation challenge that is present in existing experimental works relying solely on student subjects.

Further research like ours is needed in this domain to extend these findings. This can include integration of additional physical and hydrodynamic forecast properties, including storm surge and infrastructure (e.g., levee) fragility modeling and probabilistic wind conditions. This can include simultaneous interactive experimental designs that integrate residential and public safety decisions to gauge escalation effects. This can also include eye-tracking studies to observe forecast parameters receiving the most visual attention. And, while this experiment was designed to provide greater analytical depth of a single storm, future research can extend this work to other trajectory classifications and perils.

While we caution against monolithic policy guidance on the basis of a single study, the results of this research motivate pragmatic policy questions deserving greater attention. These include the degree to which merit-based public safety decisions are moderated or influenced by additional social or contextual factors such as legal risk aversion (Nicholson 2007; Wilson and McCreight 2012). Public safety professionals enjoy a degree of insulation from legal liability that may shape their decisions involving risk. As we identify a predilection toward over-evacuation in the absence of risk literacy, further research may be warranted to strike the appropriate balance between insulating those decision makers and building accountability for moral hazard.

Additionally, this debate involving the presentation of scientific information is taxing for scientists who bear a disproportionate responsibility in this domain. One may argue that they are unequally yoked, responsible not only for developing important forecasts for public safety, but also responsible for ensuring that those forecasts are interpreted correctly in the absence of requisite scientific literacy, such as numeracy and ultimately risk literacy. An important public debate needs to occur to identify the relative responsibilities of both the scientific community and public sector leaders informed by that community.

Acknowledgments: The authors are grateful for research support and assistance from Darryl Anderson, Antonio Gil De Rubio-Cruz, Mehrzad Rahimi, Abdollah Shafieezadeh, Ethan Kubatko, Coral Wonderly, Kelly Lash, Tim Bailey, and Sam Stelnicki. This work was supported by National Science Foundation Grant # 1563372.

Data and materials availability: All data and code necessary for replication will be published to NSF DesignSafe and available from the authors.


References:

Baker, E., 1991: Hurricane evacuation behavior. International Journal of Mass Emergencies and Disasters, 9, 287–310.

Baker, E. J., 1995: Public response to hurricane probability forecasts. The Professional Geographer, 47, 137–147, https://doi.org/10.1111/j.0033-0124.1995.00137.x.

Blome, R., 2005: Staying? Better write your SS# on your body... NBC NEWS. http://www.nbcnews.com/id/9439537/ns/us_news-katrina_the_long_road_back/t/staying-better-write-your-ss-your-body/#.X1_QE5NKiYU.

Broad, K., A. Leiserowitz, J. Weinkle, and M. Steketee, 2007: Misinterpretations of the “cone of uncertainty” in Florida during the 2004 hurricane season. Bulletin of the American Meteorological Society, 88, 651–668, https://doi.org/10.1175/BAMS-88-5-651.

Cameron, A. C., and P. K. Trivedi, 2010: Microeconometrics using Stata. Stata Press, 706 pp.

Christensen, L., and C. E. Ruch, 1980: The effect of social influence on response to hurricane warnings. Disasters, 4, 205–210, https://doi.org/10.1111/j.1467-7717.1980.tb00273.x.

Cokely, E., M. Galesic, E. Schulz, S. Ghazal, and R. Garcia-Retamero, 2012: Measuring risk literacy: The Berlin numeracy test. Judgm Decis Mak, 7, 25–47.

Cokely, E. T., A. Feltz, S. Ghazal, J. N. Allan, D. Petrova, and R. Garcia-Retamero, 2018: Skilled Decision Theory: From Intelligence to Numeracy and Expertise. The Cambridge Handbook of Expertise and Expert Performance, K.A. Ericsson, R.R. Hoffman, A. Kozbelt, and A.M. Williams, Eds., Cambridge University Press, 476–505.

Cox, D. R., 1972: Regression Models and Life-Tables. Journal of the Royal Statistical Society: Series B (Methodological), 34, 187–202, https://doi.org/10.1111/j.2517-6161.1972.tb00899.x.

——, and D. Oakes, 1984: Analysis of survival data. Chapman and Hall, 201 pp.

Dash, N., and H. Gladwin, 2007: Evacuation decision making and behavioral responses: Individual and household. Nat. Hazards Rev., 8, 69–77, https://doi.org/10.1061/(ASCE)1527-6988(2007)8:3(69).

Dawson, C., E. J. Kubatko, J. J. Westerink, C. Trahan, C. Mirabito, C. Michoski, and N. Panda, 2011: Discontinuous Galerkin methods for modeling hurricane storm surge. Advances in Water Resources, 34, 1165–1176, https://doi.org/10.1016/j.advwatres.2010.11.004.

Dow, K., and S. L. Cutter, 2002: Emerging hurricane evacuation issues: Hurricane Floyd and South Carolina. Nat. Hazards Rev., 3, 12–18, https://doi.org/10.1061/(ASCE)1527-6988(2002)3:1(12).


Drake, L., 2012: Scientific prerequisites to comprehension of the tropical cyclone forecast: Intensity, track, and size. Weather and Forecasting, 27, 462–472, https://doi.org/10.1175/WAF-D-11-00041.1.

Fu, H., C. G. Wilmot, H. Zhang, and E. J. Baker, 2007: Modeling the hurricane evacuation response curve. Transportation Research Record, 2022, 94–102, https://doi.org/10.3141/2022-11.

Gudishala, R., and C. Wilmot, 2017: Modeling emergency managers’ hurricane evacuation decisions. Transportation Research Record, 2604, 82–87, https://doi.org/10.3141/2604-10.

Hoss, F., and P. Fischbeck, 2016: Increasing the value of uncertain weather and river forecasts for emergency managers. Bulletin of the American Meteorological Society, 97, 85–97, https://doi.org/10.1175/BAMS-D-13-00275.1.

Huang, S.-K., M. K. Lindell, and C. S. Prater, 2016: Who leaves and who stays? A review and statistical meta-analysis of hurricane evacuation studies. Environment and Behavior, 48, 991–1029, https://doi.org/10.1177/0013916515578485.

Kahneman, D., and A. Tversky, 1979: Prospect Theory: An Analysis of Decision under Risk. Econometrica, 47, 263, https://doi.org/10.2307/1914185.

Keneally, M., 2017: Mayor tells those not evacuating for hurricane to put Social Security numbers on arms. abc NEWS. https://abcnews.go.com/US/mayor-tells-evacuating-hurricane-harvey-put-social-security/story?id=49417573.

Knabb, R., J. Rhome, and D. Brown, 2011: Tropical Cyclone Report Hurricane Katrina 23-30 August 2005. National Hurricane Center.

Kubatko, E. J., J. J. Westerink, and C. Dawson, 2006: hp discontinuous Galerkin methods for advection dominated problems in shallow water flow. Computer Methods in Applied Mechanics and Engineering, 196, 437–451, https://doi.org/10.1016/j.cma.2006.05.002.

Loomis, J., 2011: What’s to know about hypothetical bias in stated preference valuation studies?: Hypothetical bias in states preference valuation studies. Journal of Economic Surveys, 25, 363–370, https://doi.org/10.1111/j.1467-6419.2010.00675.x.

Losee, J. E., K. Z. Naufel, L. Locker, and G. D. Webster, 2017: Weather warning uncertainty: High severity influences judgment bias. Weather, Climate, and Society, 9, 441–454, https://doi.org/10.1175/WCAS-D-16-0071.1.

Mele, C., 2016: How to Get People to Evacuate? Try Fear. The New York Times. https://www.nytimes.com/2016/10/07/us/how-to-get-people-to-evacuate-try-fear.html.

Meyer, R., K. Broad, B. Orlove, and N. Petrovic, 2013: Dynamic simulation as an approach to understanding hurricane risk response: Insights from the Stormview lab. Risk Analysis, 33, 1532–1552, https://doi.org/10.1111/j.1539-6924.2012.01935.x.


Morrow, B. H., J. K. Lazo, J. Rhome, and J. Feyen, 2015: Improving storm surge risk communication: Stakeholder perspectives. Bulletin of the American Meteorological Society, 96, 35–48, https://doi.org/10.1175/BAMS-D-13-00197.1.

Nicholson, W. C., 2007: Emergency Planning and Potential Liabilities for State and Local Governments. State and Local Government Review, 39, 44–56, https://doi.org/10.1177/0160323X0703900105.

NOAA, 2005: Hurricane Rita Images. NOAA National Geodetic Survey. https://geodesy.noaa.gov/storm_archive/storms/rita/index.html.

Peerbolte, S. L., and M. L. Collins, 2013: Disaster management and the critical thinking skills of local emergency managers: correlations with age, gender, education, and years in occupation. Disasters, 37, 48–60, https://doi.org/10.1111/j.1467-7717.2012.01291.x.

Peters, E., D. Västfjäll, P. Slovic, C. K. Mertz, K. Mazzocco, and S. Dickert, 2006: Numeracy and Decision Making. Psychol Sci, 17, 407–413, https://doi.org/10.1111/j.1467-9280.2006.01720.x.

Pham, E. O., C. T. Emrich, Z. Li, J. Mitchem, and S. L. Cutter, 2020: Evacuation departure timing during Hurricane Matthew. Weather, Climate, and Society, 12, 235–248, https://doi.org/10.1175/WCAS-D-19-0030.1.

Regnier, E., 2008: Public evacuation decisions and hurricane track uncertainty. Management Science, 54, 16–28.

Sherman-Morris, K., and K. B. Antonelli, 2018: Hurricane knowledge and interpretation of forecasted error cone and wind potential graphics. JEM, 16, 137, https://doi.org/10.5055/jem.2018.0363.

Smith, V., 1976: Experimental economics: Induced value theory. The American Economic Review, 66, 274–279.

Thompson, R. R., D. R. Garfin, and R. C. Silver, 2017: Evacuation from natural disasters: A systematic review of the literature. Risk Analysis, 37, 812–839, https://doi.org/10.1111/risa.12654.

Tobin, J., 1958: Estimation of Relationships for Limited Dependent Variables. Econometrica, 26, 24, https://doi.org/10.2307/1907382.

Tversky, A., and D. Kahneman, 1981: The framing of decisions and the psychology of choice. Science, 211, 453–458, https://doi.org/10.1126/science.7455683.

USACE, 2008: Flood Insurance Study: Southwestern Parishes, Louisiana. Intermediate Submission 2. USACE.

——, 2011: Flood Insurance Study: Coastal Counties, Texas Intermediate Submission 2. USACE.


Wallace, J. W., C. Poole, and J. A. Horney, 2016: The association between actual and perceived flood risk and evacuation from Hurricane Irene, Beaufort County, North Carolina. J Flood Risk Management, 9, 125–135, https://doi.org/10.1111/jfr3.12115.

Wernstedt, K., P. S. Roberts, J. Arvai, and K. Redmond, 2019: How emergency managers (mis?)interpret forecasts. Disasters, 43, 88–109, https://doi.org/10.1111/disa.12293.

Whitehead, J. C., B. Edwards, M. Van Willigen, J. R. Maiolo, K. Wilson, and K. T. Smith, 2001: Heading for higher ground: factors affecting real and hypothetical hurricane evacuation behavior. Environmental Hazards, 2, 133–142, https://doi.org/10.3763/ehaz.2000.0219.

Wilson, L. R., and R. McCreight, 2012: Public Emergency Laws & Regulations: Understanding Constraints & Opportunities. Journal of Homeland Security and Emergency Management, 9, 1547-7355.2034, https://doi.org/10.1515/1547-7355.2034.

Windham, G., E. Posey, and B. Spencer, 1977: Reactions to Storm Threat during Hurricane Eloise. Mississippi State University.

Wu, H.-C., M. K. Lindell, C. S. Prater, and C. D. Samuelson, 2014: Effects of track and threat information on judgments of hurricane strike probability. Risk Analysis, 34, 1025–1039, https://doi.org/10.1111/risa.12128.

——, ——, and ——, 2015a: Process tracing analysis of hurricane information displays. Risk Analysis, 35, 2202–2220, https://doi.org/10.1111/risa.12423.

——, ——, and ——, 2015b: Strike probability judgments and protective action recommendations in a dynamic hurricane tracking task. Nat Hazards, 79, 355–380, https://doi.org/10.1007/s11069-015-1846-z.


Main Text Figures and Tables:

Figure 1. Spatial Density of Evacuation Decisions
Notes: Choropleth maps identify spatial density of mandatory evacuation decisions made by subjects. Zonal division aligns with the Harris County Office of Emergency Management’s established hurricane evacuation planning zones. ‘Numerate’ / ‘Innumerate’ (which refers to more and less numerate) indicates subject was successful / unsuccessful on the single-item Berlin Numeracy Test. ‘Human Exposure’ indicates subject self-identified in post-experiment survey as knowingly over-evacuating zone(s) even though they thought the likelihood of inundation to be low.


Figure 2. Informational Conditions by Treatment
Panels: Control (T1); Treatment 2 (T2); Treatment 3 (T3); Treatment 4 (T4).

Figure 3. DG-SWEM Inundation Projections Provided to Subjects
Panels: Veer-Left Projection; Center Projection; Veer-Right Projection.


582

Table 1. Regression Estimates for Temporal Voluntary Evacuation

| | Model 1 (Tobit Regression): Professional Subjects | Model 1 (Tobit Regression): Student Subjects | Model 2 (Cox PH): Professional Subjects | Model 2 (Cox PH): Student Subjects |
|---|---|---|---|---|
| Treatment 2 | 16.556** (8.137) | 8.744 (7.000) | 2.721 (1.747) | 1.354 (0.442) |
| Treatment 3 | 19.079** (7.786) | 14.139** (6.871) | 3.354* (2.076) | 1.702* (0.542) |
| Treatment 4 | 22.765*** (8.491) | 18.374** (7.308) | 4.356** (2.829) | 1.989** (0.662) |
| Numeracy | 8.792* (4.965) | -4.281 (3.740) | 1.703* (0.054) | 0.908 (0.149) |
| Constant | 12.139* (7.254) | 29.491 (6.683) | n/a | n/a |
| Pseudo R² | 0.025 | 0.006 | n/a | n/a |
| Log-likelihood | -233.472 | -720.493 | -195.025 | -776.744 |
| Likelihood Ratio | 11.94** | 9.25* | 10.06** | 6.42 |
| Schoenfeld’s Resid. P-Value | n/a | n/a | 0.949 | 0.117 |
| N | 81 | 227 | 81 | 227 |

*p<.1, **p<.05, ***p<.01. Model 1 reports Tobit regression coefficients with standard errors in parentheses. Dependent variable is hours remaining prior to issuance of hurricane warning by NHC. 32 left-censored observations and 1 right-censored observation observed in the professional subject model. These values are 72 and 20 for the student subject model. Model 2 reports Cox proportional hazard ratios with Breslow method for tied failures. Accelerated failure time variable is advisory in which subject recommended voluntary evacuation. Both Cox models meet the P-H assumption using Schoenfeld’s residuals (p>0.10).
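The significance markers in Table 1 can be sanity-checked from the reported coefficients and standard errors. The following is a minimal sketch for the Model 1 professional-subject column only. It assumes a standard normal reference distribution for the ratio of coefficient to standard error, which may differ slightly from the distribution used by the authors' software; the names `MODEL1_PROFESSIONAL`, `two_sided_p`, and `stars` are hypothetical and introduced here for illustration.

```python
import math

# (coefficient, standard error) pairs copied from Table 1,
# Model 1 (Tobit), professional subjects.
MODEL1_PROFESSIONAL = {
    "Treatment 2": (16.556, 8.137),
    "Treatment 3": (19.079, 7.786),
    "Treatment 4": (22.765, 8.491),
    "Numeracy": (8.792, 4.965),
}


def two_sided_p(coef: float, se: float) -> float:
    """Two-sided p-value for coef/se.

    Assumption: a standard normal approximation, not necessarily the
    reference distribution used in the original estimation.
    """
    z = abs(coef / se)
    # 2 * (1 - Phi(z)) equals erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2.0))


def stars(p: float) -> str:
    """Map a p-value to the table's convention: *p<.1, **p<.05, ***p<.01."""
    if p < 0.01:
        return "***"
    if p < 0.05:
        return "**"
    if p < 0.1:
        return "*"
    return ""


if __name__ == "__main__":
    for name, (coef, se) in MODEL1_PROFESSIONAL.items():
        p = two_sided_p(coef, se)
        print(f"{name}: z = {coef / se:.3f}, p = {p:.4f} {stars(p)}")
```

Under this approximation, the four markers computed for the professional-subject Tobit column agree with those printed in Table 1. The same check does not transfer directly to the Model 2 columns, because hazard ratios are tested against 1 rather than 0.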


Table 2. Regression Estimates for Spatial Mandatory Evacuation

| | Model 3 (Tobit Regression): Professional Subjects | Model 3 (Tobit Regression): Student Subjects | Model 4 (Tobit Regression): Professional Subjects | Model 4 (Tobit Regression): Student Subjects | Model 5 (OLS Regression): Professional Subjects | Model 5 (OLS Regression): Student Subjects | Model 6 (OLS Regression): Professional Subjects | Model 6 (OLS Regression): Student Subjects |
|---|---|---|---|---|---|---|---|---|
| Veer-Left Projection | -0.204 (2.323) | -0.031 (1.419) | -1.302 (2.149) | 1.053 (1.385) | 6,107.000 (51,330.270) | 13,167.130 (29,887.840) | 28,986.910 (47,940.580) | -7,462.223 (29,073.550) |
| Veer-Right Projection | -1.125 (5.773) | 1.951 (2.856) | -0.224 (5.287) | 1.363 (2.756) | 11,035.810 (129,169.700) | -31,604.680 (59,314.800) | -8,149.545 (119,748.800) | -18,568.200 (57,036.150) |
| Numeracy | 1.859 (2.541) | 3.025** (1.334) | 0.966 (2.339) | 2.897** (1.283) | -41,397.370 (56,077.490) | -85,614.750*** (28,076.100) | -24,379.310 (52,141.250) | -83,208.360*** (26,967.320) |
| Human Exposure | | | -8.042*** (2.043) | -6.336*** (1.338) | | | 168,162.800*** (45,334.450) | 123,588.700*** (27,766.320) |
| Constant | 89.505*** (1.536) | 89.626*** (1.106) | 94.263*** (1.877) | 93.113*** (1.305) | 327,763.300*** (34,000.680) | 327,983.500*** (23,261.960) | 229,167.500*** (41,209.400) | 260,777.200*** (26,963.000) |
| Pseudo R² | 0.001 | 0.004 | 0.028 | 0.019 | n/a | n/a | n/a | n/a |
| R² | n/a | n/a | n/a | n/a | 0.007 | 0.042 | 0.159 | 0.120 |
| Log-likelihood | -268.548 | -726.293 | -261.262 | -715.217 | n/a | n/a | n/a | n/a |
| Likelihood Ratio | 0.57 | 5.48 | 15.14*** | 27.63*** | n/a | n/a | n/a | n/a |
| F | n/a | n/a | n/a | n/a | 0.19 | 3.25** | 3.61*** | 7.60*** |
| N | 81 | 227 | 81 | 227 | 81 | 227 | 81 | 227 |

*p<.1, **p<.05, ***p<.01. Models 3-4 report Tobit regression coefficients with standard errors in parentheses. The dependent variable is the scoring function value described in the methods section that accounts for both over- and under-evacuation rates. There are 3 (1) left-censored observations and 41 (11) right-censored observations in the student (professional) subjects’ models. Models 5-6 report ordinary least squares (OLS) estimates in which the dependent variable is the total population evacuated. In each model, the reference category excluded for comparison is subjects reporting greatest reliance on the center inundation map.
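To make the OLS columns easier to read, the sketch below combines the Model 6 student-subject estimates into fitted values of total population evacuated. It is illustrative only: it assumes each regressor enters as a 0/1 indicator with the center inundation map as the reference category, and the names `MODEL6_STUDENT` and `fitted_population` are hypothetical. The fitted values are point estimates and carry the uncertainty implied by the standard errors in Table 2.

```python
# OLS point estimates copied from Table 2, Model 6, student subjects.
# Units: persons subject to mandatory evacuation.
MODEL6_STUDENT = {
    "constant": 260_777.200,
    "veer_left": -7_462.223,
    "veer_right": -18_568.200,
    "numeracy": -83_208.360,
    "human_exposure": 123_588.700,
}


def fitted_population(coefs, *, veer_left=0, veer_right=0,
                      numeracy=0, human_exposure=0):
    """Fitted total population evacuated for one covariate profile.

    Assumption: every regressor is a 0/1 indicator, and the omitted
    category is greatest reliance on the center inundation map.
    """
    return (
        coefs["constant"]
        + coefs["veer_left"] * veer_left
        + coefs["veer_right"] * veer_right
        + coefs["numeracy"] * numeracy
        + coefs["human_exposure"] * human_exposure
    )


if __name__ == "__main__":
    profiles = {
        "baseline (center map)": {},
        "numerate": {"numeracy": 1},
        "human exposure": {"human_exposure": 1},
        "numerate + human exposure": {"numeracy": 1, "human_exposure": 1},
    }
    for label, flags in profiles.items():
        value = fitted_population(MODEL6_STUDENT, **flags)
        print(f"{label}: {value:,.1f}")
```

Swapping in the professional-subject column of Model 6 gives the corresponding fitted values for that sample, although the numeracy estimate there is not statistically significant.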
