8/18/2019 Precautionary Principle Papers
D R A F T
FAT TAILS RESEARCH PROGRAM
Mathematical Foundations for the Precautionary Principle
Nassim Nicholas Taleb
This is a companion piece (in progress) to the Precautionary Principle by Taleb, Read, Norman, Douady, and
Bar-Yam (2014, 2016); the ideas will be integrated in a future version.
CONTENTS
I On the Fallacy of Using State-Space Probabilities for Path Dependent Outcomes
II Principle of Probabilistic Sustainability
III Why Increase in "Benefits" Usually Increases the Risk of Ruin
IV Technical definition of Fat Tails
V On the Unreliability of Hypothesis testing for Risk Analysis
References
I. ON THE FALLACY OF USING STATE-SPACE PROBABILITIES FOR PATH DEPENDENT OUTCOMES
Remark 1. If you take the risk, any risk, repeatedly, the way to count is in exposure per lifespan.
Commentary 1. Ruin probabilities are in the time domain for a single agent and do not correspond to state-space (or ensemble) tail probabilities. Nor are expectations fungible between the two domains. Statements on the "overestimation" of tail events by agents derived from state-space estimations are accordingly flawed (we note that about all decision theory books have been equally flawed). Many theories of the "rationality" of agents are based on wrong estimation operators and/or probability measures.
[This is a special case of the conflation between a random variable and the payoff of a time-dependent, path-dependent derivative function.]
A. A simplified general case
Commentary 2. (Less technical translation)
Never cross a river if it is on average only 4 feet deep.
(Debate of the author with P. Jorion, 1997, and [2].)
Consider the extremely simplified example, the sequence of independent random variables $(X_i)_{i=1}^n = (X_1, X_2, \ldots, X_n)$
Correspondence: N N Taleb, email NNT1@nyu.edu
Fig. 1. Ensemble probability vs. time probability discussions, from Peters and Gell-Mann [1]. The treatment by option traders is done via the absorbing barrier. I have traditionally treated this in Dynamic Hedging and Antifragile as the conflation between X (a random variable) and f(X), a function of the r.v.; an option is path dependent and is not the underlying.
with support in the positive real numbers $(\mathbb{R}^+)$. The convergence theorems of classical probability theory address the behavior of the sum or average: $\lim_{n\to\infty} \frac{1}{n}\sum_i^n X_i = m$ by the (weak) law of large numbers (convergence in probability). As shown in Fig. 1, $n$ going to infinity produces convergence in probability to the true mean return $m$. Although the law of large numbers applies to draws $i$ that can be strictly separated by time, it assumes (some) independence, and certainly path independence.
Now consider $(X_{i,t})_{t=1}^T = (X_1, X_2, \ldots, X_T)$, where every variable $X_i$ is indexed by some unit of time $t$. Assume that the "time events" are drawn from the exact same probability distribution: $P(X_i) = P(X_{i,t})$.
We define the time probability as the evolution over time of a single agent $i$.
In the presence of terminal, that is, irreversible, ruin, every observation is now conditional on some attribute of the preceding one, and what happens at period $t$ depends on $t-1$, what happens at $t-1$ depends on $t-2$, etc. We now have path dependence.
Theorem 1 (space-time inequality). Assume that $\forall t$, $P(X_t = 0) > 0$ and $X_0 > 0$. Let $\mathbb{E}_N(X_t) < \infty$ be the state-space expectation for a static initial period $t$, and $\mathbb{E}_T(X_i)$ the time expectation for any agent $i$, both obtained through the weak law of large numbers. We have
$$\mathbb{E}_N(X_t) \geq \mathbb{E}_T(X_i) \tag{1}$$
Proof.
$$\forall t, \quad \lim_{n\to\infty} \frac{1}{n}\sum_i^n \mathbb{1}_{X_{i,t-1}>0}\, X_{i,t} = m\left(1 - \frac{1}{n}\sum_i^n \mathbb{1}_{X_{i,t-1}\leq 0}\right), \tag{2}$$
where $\mathbb{1}_{X_{t-1}>0}$ is the indicator function requiring survival at the previous period. Hence the limits over $n$ for successive $t$ show a decreasing temporal expectation: $\mathbb{E}_N(X_t) \leq \mathbb{E}_N(X_{t-1})$. We can actually prove divergence.
$$\forall i, \quad \lim_{T\to\infty} \frac{1}{T}\sum_t^T \mathbb{1}_{X_{i,t-1}>0}\, X_{i,t} = 0. \tag{3}$$
As we can see by making $T < \infty$ and recursing the law of iterated expectations, we get the inequality for all $T$.
In Eq. 3 we took the ensemble of risk takers expecting a return $m\left(1 - \frac{1}{n}\sum_i^n \mathbb{1}_{X_{i,t-1}=0}\right)$ in any period $t$, while every single risk taker is guaranteed to eventually go bust.
Note: we can also approach the proof more formally in a measure-theoretic way by showing that while the space sets $A$ for "nonruin" are disjoint, the time sets are not. For a measure $\nu$:
$$\nu\left(\bigcup^T A_t \setminus \bigcup_{i \leq t} A_i^c\right) \text{ does not necessarily equal } \nu\left(\bigcup^T A_t\right).$$
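The inequality of Theorem 1 can be illustrated with a minimal Monte Carlo sketch (the gamble below, a 1% per-period ruin probability against a 6% multiplicative gain, is an illustrative assumption, not the paper's calibration): the one-period ensemble (state-space) average stays above 1, while essentially every individual path followed in time is eventually absorbed at zero.

```python
import random

def simulate(n_agents=10_000, horizon=1_000, p_ruin=0.01, gain=1.06, seed=42):
    """Multiplicative gamble with a small per-period ruin probability.

    Returns (ensemble_mean_one_period, fraction_ruined_by_horizon).
    The one-period ensemble expectation exceeds 1 even though almost
    every individual path is eventually absorbed at zero.
    """
    rng = random.Random(seed)
    # State-space (ensemble) view: one period across many agents.
    one_period = [0.0 if rng.random() < p_ruin else gain for _ in range(n_agents)]
    ensemble_mean = sum(one_period) / n_agents
    # Time view: follow each agent until ruin or the horizon.
    ruined = 0
    for _ in range(n_agents):
        wealth = 1.0
        for _ in range(horizon):
            if rng.random() < p_ruin:
                wealth = 0.0
                ruined += 1
                break
            wealth *= gain
    return ensemble_mean, ruined / n_agents

ensemble_mean, frac_ruined = simulate()
```

With these numbers the ensemble mean per period is about $0.99 \times 1.06 > 1$, while the probability of surviving 1,000 periods is $0.99^{1000} \approx 4 \times 10^{-5}$.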
Commentary 3. Almost all psychology papers discussing the "overestimation" of tail risk (see the review in [3], itself an instance of the flawed reasoning and a history of it) are voided by the inequality in Theorem 1. Clearly they assume that an agent only exists for a single decision. Simply put, the original papers documenting the "bias" assume that the agents will never ever again make another decision in their remaining lives.
The usual solution to this path dependence, if it depends only on ruin, is done by introducing a function of $X$ to allow the ensemble (path independent) average to have the same properties as the time (path dependent) average, or survival-conditioned mean. The natural logarithm seems a good candidate. Hence $S_n = \sum_{i=1}^n \log(X_i)$ and $S_T = \sum_{t=1}^T \log(X_t)$ belong to the same probabilistic class; hence a probability measure on one is invariant with respect to the other, what is called ergodicity. In that sense, when analyzing performance and risk under conditions of ruin, it is necessary to use a logarithmic transformation of the variable [4], [1], or boundedness of the left tail [5], [6], while maximizing opportunity in the right tail.
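The ergodicity point can be made concrete with a small sketch (the 1.5/0.6 multiplicative gamble below is an illustrative assumption): the ensemble expectation per round exceeds 1, yet the expected logarithm is negative, so the time-average growth of any single path is negative.

```python
import math, random

# Illustrative gamble: wealth is multiplied by 1.5 or 0.6 with equal
# probability. Ensemble view: E[X] = 0.5*1.5 + 0.5*0.6 = 1.05 > 1.
# Time (ergodic) view: E[log X] = 0.5*(log 1.5 + log 0.6) < 0,
# so a single path shrinks at a steady exponential rate.
FACTORS = (1.5, 0.6)

ensemble_expectation = sum(FACTORS) / 2               # state-space view
expected_log = sum(math.log(f) for f in FACTORS) / 2  # time-average view

def time_average_growth(rounds=100_000, seed=7):
    """Realized per-round log growth of one simulated path."""
    rng = random.Random(seed)
    log_wealth = 0.0
    for _ in range(rounds):
        log_wealth += math.log(rng.choice(FACTORS))
    return log_wealth / rounds

g = time_average_growth()
```

The realized per-round log growth `g` converges to $\mathbb{E}[\log X] \approx -0.053$, not to $\log \mathbb{E}[X] \approx 0.049$: the logarithmic transformation is what makes the time behavior computable from the ensemble one.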
Commentary 4. What we showed here is that unless one takes a logarithmic transformation (or a similar smooth function producing $-\infty$ with the ruin set at $X = 0$), the two expectations diverge. The entire point of the precautionary principle is to avoid having to rely on logarithms or transformations by reducing the probability of ruin.
Commentary 5. In their magisterial paper, Peters and Gell-Mann [1] showed that the Bernoulli use of the logarithm wasn't for a concave "utility" function but, as with the Kelly criterion, to restore ergodicity. A bit of history:
• Bernoulli discovers logarithmic risk taking under the illusion of "utility".
• Kelly and Thorp recovered the logarithm for the maximal growth criterion as an optimal gambling strategy. Nothing to do with utility.
• Samuelson disses the logarithm as aggressive, not realizing that a semi-logarithm (or partial logarithm), i.e., one applied to a portion of wealth, can be used. From Menger to Arrow, via Chernoff and Samuelson, everyone in decision theory is shown to be making the mistake of ergodicity.
• Pitman in 1975 [7] shows that a Brownian motion subjected to an absorbing barrier at 0, with censored absorbing paths, becomes a three-dimensional Bessel process. The drift of the surviving paths is $\frac{1}{x}$, which integrates to a logarithm.
• Peters and Gell-Mann recover the logarithm for ergodicity and, in addition, put the Kelly-Thorp result on rigorous physical grounds.
• With Cirillo, this author [8] discovers the log as the unique smooth transformation to create a dual of the distribution in order to remove one-tail compact support, to allow the use of extreme value theory.
• We can show (Briys and Taleb, in progress) the necessity of the logarithmic transformation as simple ruin avoidance, which happens to be a special case of the HARA utility class.
B. Adaptation of Theorem 1 to Brownian Motion
The implications of the simplified discussion do not change if one uses richer models, such as a full stochastic process subjected to an absorbing barrier. And of course in a natural setting the eradication of all previous life can happen (i.e., $X_t$ can take an extreme negative value), not just act as a stopping condition.
The problem is invariant in real life if one uses a Brownian-motion style stochastic process subjected to an absorbing
barrier. In place of the simplified representation we would have, for a process subjected to $L$, an absorbing barrier from below, in the arithmetic version:
$$\forall i, \quad X_{i,t} = \begin{cases} X_{i,t-1} + Z_{i,t} & \text{if } X_{i,t-1} > L \\ 0 & \text{otherwise} \end{cases} \tag{4}$$
or, for a geometric process:
$$\forall i, \quad X_{i,t} = \begin{cases} X_{i,t-1}(1 + Z_{i,t}) \approx X_{i,t-1}\, e^{Z_{i,t}} & \text{if } X_{i,t-1} > L \\ 0 & \text{otherwise} \end{cases} \tag{5}$$
where $Z$ is a random variable.
Going to continuous time, and considering the geometric case, let $\tau = \inf\{t : X_{i,t} \leq L\}$ be the stopping time. The idea is to have the simple expectation of the stopping time match the remaining lifespan, or remain of the same order.
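A minimal simulation of the geometric process in Eq. (5) sketches how the stopping time $\tau$ behaves; the parameters below (Gaussian log-steps with a slightly negative drift, barrier $L = 0.5$) are illustrative assumptions, not values from the paper.

```python
import math, random

def first_passage(x0=1.0, barrier=0.5, mu=-0.01, sigma=0.1, horizon=10_000, seed=1):
    """One path of the geometric process X_t = X_{t-1} * exp(Z_t), Eq. (5),
    absorbed the first time it falls to the barrier L.
    Returns the stopping time tau, or None if not absorbed by the horizon."""
    rng = random.Random(seed)
    log_x, log_l = math.log(x0), math.log(barrier)
    for t in range(1, horizon + 1):
        log_x += rng.gauss(mu, sigma)  # Z_t ~ N(mu, sigma^2)
        if log_x <= log_l:
            return t
    return None

def mean_stopping_time(paths=2_000, **kw):
    """Fraction of paths absorbed and their mean stopping time."""
    taus = [first_passage(seed=s, **kw) for s in range(paths)]
    absorbed = [t for t in taus if t is not None]
    return len(absorbed) / paths, sum(absorbed) / max(len(absorbed), 1)

frac, mean_tau = mean_stopping_time()
```

With these drifted parameters the mean stopping time is of the order of $\log(X_0/L)/|\mu| \approx 69$ periods, which is the kind of quantity to be compared against the remaining lifespan.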
Working with a function of $X$ under stopping time: Let us digress to see how we can work with functions of ruin as an absorbing barrier. This is a bit more complicated but quite useful for calibrating ruin off a certain known shelf life. Assume a finite expectation for the stopping time, $\mathbb{E}^L(\tau)$.
III. WHY INCREASE IN "BENEFITS" USUALLY INCREASES THE RISK OF RUIN
Commentary 7. We show that the probability of ruin increases upon an action to "improve" a system, as the benefits-risk trade-off increases in the tails. In other words, anything that increases benefits, if it also increases uncertainty, leads to the violation of the equality between state-space and time expectations.
This is a general precautionary method for iatrogenics.
Why should uncertainty about climate models lead to a more conservative, more cautious, "greener" stance, even, or especially, if one disbelieves the models?
Why do the super-rich gain more from inequality than from gains in GDP, in proportion to how rich they are?
Why do derivatives "in the tails" depend more on changes in expected volatility than on changes in the mean (a point not well known by practitioners of mathematical finance (derivatives), who get periodically harmed by it)?
Why does an increase in the risk of deflation also necessarily increase the risk of hyperinflation?
Why should one worry about GMOs even if one accepts their purported benefits?
It is a necessary mathematical relation that remote parts of the distribution, the tails, are less sensitive to changes in the mean, and more sensitive to the other parameters controlling the dispersion and scale (which in the special case of the class of finite-variance distributions would be the variance) or those controlling the tail exponent.
Definition 1. Let $\Phi$ be a twice-differentiable continuous probability CDF with at least one unbounded tail, $\Phi \in C^2 : \mathcal{D} \to [0,1]$, with $s > 0$, where $\lambda$ is a slowly varying function with respect to $x$: $\forall t > 0$, $\lim_{x\to\pm\infty} \frac{\lambda(tx)}{\lambda(x)} = 1$. We have either
$$\Phi(x; l, s, \alpha) \triangleq \lambda\left(\frac{x-l}{s}, \alpha\right) Z\left(\frac{x-l}{s}, \alpha\right), \quad \mathcal{D} = [-\infty, x_0]$$
or
$$\Phi(x; l, s, \alpha) \triangleq 1 - \lambda\left(\frac{x-l}{s}, \alpha\right) Z\left(\frac{x-l}{s}, \alpha\right), \quad \mathcal{D} = [x_0, \infty) \tag{6}$$
where $x_0$ is the maximum (respectively minimum) value for the representation of the distribution, $l$ is the location and $s \in (0, \infty)$ the scale.
Intuitively, we are using any probability distribution, mapping the random variable $x \mapsto \frac{x-l}{s}$, and focusing only on the tails. Thanks to such focus on the tails only, the distribution does not necessarily need to be in the location-scale family (i.e., retain its properties after the transformation). We are factorizing the CDF into two functions, one of which becomes a constant for "large" values of $|x|$, given that we aim at isolating tail probabilities and disregard other portions of the distribution.
The distribution in (6) can accommodate $x_0 = \pm\infty$, in which case the function is whole; but all that is required is for it to be smooth and have the expressions above in either of the open "tail segments" concerned, which we define as values below (above) $K$.
A. Scale vs. Location
Let $\upsilon(K)^- \triangleq \frac{\partial \Phi(\cdot)}{\partial s}\big|_{x=K}$ and $\upsilon(K)^+ \triangleq \frac{\partial \bar{\Phi}(\cdot)}{\partial s}\big|_{x=K}$ be the "vega", that is, the sensitivity to scale, and $\delta(K)^- \triangleq \frac{\partial \Phi(\cdot)}{\partial l}\big|_{x=K}$ and $\delta(K)^+ \triangleq \frac{\partial \bar{\Phi}(\cdot)}{\partial l}\big|_{x=K}$ be the "delta", that is, the sensitivity to location, for the negative and positive tail respectively. Consider the tail probabilities at level $K$, defined as $P(x < K) \triangleq \Phi(K)$ and $P(x > K) \triangleq \bar{\Phi}(K)$.
We have the "vega", that is, the sensitivity to scale:
$$\upsilon(K)^- = -\frac{1}{s^2}(K - l)\left(\lambda\left(\frac{K-l}{s}, \alpha\right) Z^{(1,0)}\left(\frac{K-l}{s}, \alpha\right) + \lambda^{(1,0)}\left(\frac{K-l}{s}, \alpha\right) Z\left(\frac{K-l}{s}, \alpha\right)\right) \tag{7}$$
For clarity we are using the slot notation: $Z^{(1,0)}(\cdot, \cdot)$ refers to the first partial derivative of the function $Z$ with respect to the first argument (not the variable under concern), and $Z^{(0,1)}(\cdot, \cdot)$ that with respect to the second (by the chain rule, $Z^{(1,0)}\left(\frac{K-l}{s}, \alpha\right) = s\, \frac{\partial Z(\cdot,\cdot)}{\partial s}$).
$$\delta(K)^- = -\frac{1}{s}\left(\lambda\left(\frac{K-l}{s}, \alpha\right) Z^{(1,0)}\left(\frac{K-l}{s}, \alpha\right) + \lambda^{(1,0)}\left(\frac{K-l}{s}, \alpha\right) Z\left(\frac{K-l}{s}, \alpha\right)\right) \tag{8}$$
Thanks to the Karamata Tauberian theorem, we can refine $\upsilon(\cdot)$ and $\delta(\cdot)$. Given that $\lambda(\cdot)$ is a slowly varying function, we have:
$$\lambda\left(\frac{K-l}{s}, \alpha\right) \to \lambda, \quad \text{and} \quad \lambda^{(1,0)}\left(\frac{K-l}{s}, \alpha\right) \to 0.$$
Hence $\upsilon(K)^- = -\frac{1}{s^2}(K - l)\,\lambda\, Z^{(1,0)}\left(\frac{K-l}{s}, \alpha\right)$ and $\delta(K)^- = -\frac{1}{s}\,\lambda\, Z^{(1,0)}\left(\frac{K-l}{s}, \alpha\right)$, which allows us to prove the following theorem:
Theorem 2 (scale/location tradeoff). There exists, for any distribution with one unbounded tail, in $C^2$, twice differentiable below (above) a level $K^*$, s.t. for the unbounded tail under concern, $|K| \geq |K^*|$,
$$r = \frac{\upsilon(K)^-}{\delta(K)^-} = \frac{\upsilon(K)^+}{\delta(K)^+} \to \frac{K-l}{s}.$$
The theorem is trivial once we set up the probability distributions as we did in (6).
The physical interpretation of the theorem is that in the tail, the ruin probability is vastly more reactive to $s$, the parameter of dispersion, than to $l$, a shifting of the mean.
We note that our result is general: some papers (cite) saw the effect and applied it to specific tail distributions from extreme value theory, whereas our result does not require the particulars of the distribution.
1) Examples: For a LogNormal distribution with parameters $\mu$ and $\sigma$, we construct the change of variable $x \mapsto \frac{x-l}{s}$, ending with the CDF $\frac{1}{2}\operatorname{erfc}\left(\frac{\mu - \log\left(\frac{x-l}{s}\right)}{\sqrt{2}\sigma}\right)$. Now the right tail probability
$$\bar{\Phi}(K) = \frac{1}{2}\left(\operatorname{erf}\left(\frac{-\log(K-l) + \mu + \log(s)}{\sqrt{2}\sigma}\right) + 1\right)$$
has for derivatives
$$\upsilon(K)^+ = \frac{e^{-\frac{(-\log(K-l) + \mu + \log(s))^2}{2\sigma^2}}}{\sqrt{2\pi}\, s\, \sigma}, \qquad \delta(K)^+ = \frac{e^{-\frac{(-\log(K-l) + \mu + \log(s))^2}{2\sigma^2}}}{\sqrt{2\pi}\, \sigma\, (K-l)},$$
and the ratio $r = \frac{K-l}{s}$.
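The LogNormal example can be checked numerically; the sketch below uses illustrative parameter values and finite-difference derivatives (stdlib only) to recover the ratio $r = \frac{K-l}{s}$.

```python
import math

def survival(K, l=0.0, s=1.0, mu=0.0, sigma=1.0):
    """Right-tail probability of the shifted/scaled lognormal:
    P(X > K) = (1/2) * (erf((mu - log((K - l)/s)) / (sqrt(2)*sigma)) + 1)."""
    return 0.5 * (math.erf((mu - math.log((K - l) / s)) / (math.sqrt(2) * sigma)) + 1)

def central_diff(f, x, h=1e-6):
    """Two-sided finite-difference derivative of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)

K, l, s = 10.0, 0.0, 1.0
vega = central_diff(lambda ss: survival(K, l=l, s=ss), s)   # sensitivity to scale
delta = central_diff(lambda ll: survival(K, l=ll, s=s), l)  # sensitivity to location
ratio = vega / delta
```

With $K = 10$, $l = 0$, $s = 1$ the numerical ratio reproduces $\frac{K-l}{s} = 10$: far in the tail, the probability is an order of magnitude more sensitive to the scale than to the location.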
B. Shape vs. location
Discussion: the exponent determines the shape of the distribution; it expresses uncertainty by broadening the tails and has an effect on the variance for power laws (except for the Lévy Stable distribution).
As to the tail probability's sensitivity to the exponent, it is to be taken in the negative: a lower exponent means more uncertainty, so we reverse the sign.
$$\omega(K)^+ = -\frac{\partial \bar{\Phi}(\cdot)}{\partial \alpha}\bigg|_{x=K} = \lambda(K-l, \alpha)\, Z^{(0,1)}\left(\frac{K-l}{s}, \alpha\right) + \lambda^{(0,1)}(K-l, \alpha)\, Z\left(\frac{K-l}{s}, \alpha\right) \tag{9}$$
Theorem 3 (tail shape/location tradeoff). There exists, for a distribution with one unbounded tail, a level $K^*$ s.t. for the unbounded tail, $|K| \geq |K^*|$,
$$r = \frac{\omega(K)^+}{\delta(K)^+} = \frac{1}{\alpha}\,(K - l(s+1))\left(\log\left(\frac{K - l(s+1)}{s^2}\right) - 1\right)$$
Proof. For large positive deviations, with $K > l + s$, we can write, by Karamata's result for slowly varying functions, $\bar{\Phi}(K) \approx \lambda \left(\frac{K-l}{s}\right)^{-\alpha}$, hence $Z(\cdot) = \left(\frac{K-l}{s}\right)^{-\alpha}$.
$$\frac{\omega(K)^+}{\delta(K)^+} = s\,\frac{Z\left(\frac{K-l}{s}, \alpha\right)}{Z^{(1,0)}\left(\frac{K-l}{s}, \alpha\right)} + s\,\frac{Z^{(0,1)}\left(\frac{K-l}{s}, \alpha\right)}{Z^{(1,0)}\left(\frac{K-l}{s}, \alpha\right)} \tag{10}$$
C. Shape vs Scale
We note the role of shape vs scale:
$$\frac{\omega(K)^+}{\upsilon(K)^+} = \frac{s\,(K - l(s+1))\left(\log\left(\frac{K - l(s+1)}{s^2}\right) - 1\right)}{\alpha\,(K-l)}$$
Commentary 8. We can show the effect of changes in the tails on, say, "fragile" items, and the inapplicability of the average. For instance, what matters for the climate is not the average but the distribution of the extremes. Here the average effect comes from $l$, and the extremes from $s$ and, worse, $\alpha$.
I am working on the distribution of risk based on "stochastic $\alpha$".
IV. TECHNICAL DEFINITION OF FAT TAILS
Probability distributions range between extreme thin-tailed (Bernoulli) and extreme fat-tailed [?]. Among the categories of distributions that are often distinguished due to the convergence properties of moments are: 1) having a support that is compact but not degenerate, 2) subgaussian, 3) Gaussian, 4) subexponential, 5) power law with exponent greater than 3, 6) power law with exponent less than or equal to 3 and greater than 2, 7) power law with exponent less than or equal to 2. In particular, power law distributions have a finite mean only if the exponent is greater than 1, and have a finite variance only if the exponent exceeds 2.
Our interest is in distinguishing between cases where tail
events dominate impacts, as a formal definition of the bound-
ary between the categories of distributions to be considered
as Mediocristan and Extremistan. The natural boundary be-
tween these occurs at the subexponential class which has the
following property:
Let $X = (X_i)_{1 \leq i \leq n}$ be a sequence of independent and identically distributed random variables with support in $(\mathbb{R}^+)$, with cumulative distribution function $F$. The subexponential class of distributions is defined by [9], [10]:
$$\lim_{x\to+\infty} \frac{1 - F^{*2}(x)}{1 - F(x)} = 2$$
where $F^{*2} = F' * F$ is the cumulative distribution of $X_1 + X_2$, the sum of two independent copies of $X$. This implies that the probability that the sum $X_1 + X_2$ exceeds a value $x$ is twice the probability that either one separately exceeds $x$. Thus, every time the sum exceeds $x$, for large enough values of $x$, the value of the sum is due to either one or the other exceeding $x$ (the maximum over the two variables) and the other of them contributes negligibly.
More generally, it can be shown that the sum of $n$ variables is dominated by the maximum of the values over those variables in the same way. Formally, the following two properties are equivalent to the subexponential condition [11], [12]. For a given $n \geq 2$, let $S_n = \Sigma_{i=1}^n x_i$ and $M_n = \max_{1 \leq i \leq n} x_i$:
a) $\lim_{x\to\infty} \frac{P(S_n > x)}{P(X > x)} = n$,
b) $\lim_{x\to\infty} \frac{P(S_n > x)}{P(M_n > x)} = 1$.
Thus the sum $S_n$ has the same magnitude as the largest sample $M_n$, which is another way of saying that tails play the most important role.
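Properties (a) and (b) can be sketched by simulation for $n = 2$; the Pareto tail exponent and the threshold below are illustrative assumptions, and the estimates are Monte Carlo approximations of the two limits.

```python
import random

def pareto_sample(rng, alpha=1.0):
    """Pareto draw with survival P(X > x) = x^(-alpha) for x >= 1,
    via inverse transform sampling."""
    return rng.random() ** (-1.0 / alpha)

rng = random.Random(0)
n_draws, x = 500_000, 100.0
count_sum = count_max = count_one = 0
for _ in range(n_draws):
    a, b = pareto_sample(rng), pareto_sample(rng)
    if a + b > x:
        count_sum += 1
    if max(a, b) > x:
        count_max += 1
    if a > x:
        count_one += 1

ratio_max = count_sum / count_max   # property (b): tends to 1
ratio_one = count_sum / count_one   # subexponential definition: tends to 2
```

Far enough in the tail, the sum exceeds the threshold essentially only when the maximum does, so `ratio_max` sits near 1 while `ratio_one` sits near 2, matching the defining limit of the subexponential class.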
Intuitively, tail events in subexponential distributions should decline more slowly than in an exponential distribution, for which large tail events should be irrelevant. Indeed, one can show that subexponential distributions have no exponential moments:
$$\int_0^\infty e^{\varepsilon x}\, dF(x) = +\infty$$
for all values of $\varepsilon$ greater than zero. However, the converse isn't true, since distributions can have no exponential moments yet not satisfy the subexponential condition.
We note that if we choose to indicate deviations as negative values of the variable $x$, the same result holds by symmetry for extreme negative values, replacing $x \to +\infty$ with $x \to -\infty$. For two-tailed variables, we can separately consider the positive and negative domains.
V. ON THE UNRELIABILITY OF HYPOTHESIS TESTING FOR RISK ANALYSIS
The derivations in this section were motivated during the GMO debate by discussions of "evidence" based on p-values of around .05. Not only do we need much lower than .05 for any safety assessment of repeated exposure, but the p-value method and, sadly, the "power of test" have acute stochasticities reminiscent of fat-tail problems.
Commentary 9. Where we show that p-values are unreliable for risk analysis, hence say nothing about ruin probabilities. The so-called "scientific" studies are too speculative to be of any use for tail risk (which shows in their continuous evolution). The exception, of course, is negative empiricism.
Assume that we knew the "true" p-value, $p_s$; what would its realizations look like across various attempts on statistically identical copies of the phenomena? By true value $p_s$, we mean its expected value by the law of large numbers across an ensemble of $m$ possible samples for the phenomenon under scrutiny, that is $\frac{1}{m}\sum_{i \leq m} p_i \xrightarrow{P} p_s$ (where $\xrightarrow{P}$ denotes convergence in probability). A similar convergence argument can also be made for the corresponding "true median" $p_M$. The main result of the paper is that the distribution of $n$ small samples can be made explicit (albeit with special inverse functions), as well as its parsimonious limiting one for $n$ large, with no other parameter than the median value $p_M$. We were unable to get an explicit form for $p_s$, but we go around it with the use of the median. Finally, the distribution of the minimum p-value under $m$ trials can be made explicit, in a parsimonious formula allowing for the understanding of biases in scientific studies.
It turns out, as we can see in Fig. 4, that the distribution is extremely asymmetric (right-skewed), to the point where 75% of the realizations of a "true" p-value of .05 will be >.1 (and very possibly close to .2), with a standard deviation >.2 (sic) and a mean deviation of around .35 (sic, sic). Because of the excessive skewness, measures of dispersion in L1 and L2 (and higher norms) hardly vary with $p_s$, so the standard deviation is not proportional, meaning
Fig. 3. The different values for Eq. 11, showing convergence to the limiting distribution.
an in-sample .01 p-value has a significant probability of having a true value >.3.
So clearly we don't know what we are talking about when we talk about p-values.
Earlier attempts at an explicit meta-distribution in the literature are found in [13] and [14], though for situations of Gaussian subordination and less parsimonious parametrization. The severity of the problem of significance of the so-called "statistically significant" has been discussed in [15], and a remedy via Bayesian methods offered in [16], which in fact recommends the same tightening of standards, to p-values ≈ .01. But the gravity of the extreme skewness of the distribution of p-values is only apparent when one looks at the meta-distribution.
For notation, we use $n$ for the sample size of a given study and $m$ for the number of trials leading to a p-value.
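The skewness of realized p-values can be sketched with a stdlib-only Monte Carlo. Note the assumptions: a one-tailed z-test via `statistics.NormalDist` is used as a stand-in for the paper's t-test, and the effect size and sample size below are illustrative, chosen so the median p-value lands near .05.

```python
import random
from statistics import NormalDist, median

def simulate_p_values(n=20, effect=0.37, trials=20_000, seed=3):
    """One-tailed z-test p-values across statistically identical samples
    of size n drawn from N(effect, 1), tested against mean 0.
    (Illustrative z-test stand-in for the paper's t-test setting.)"""
    rng = random.Random(seed)
    nd = NormalDist()
    ps = []
    for _ in range(trials):
        m = sum(rng.gauss(effect, 1.0) for _ in range(n)) / n
        z = m * n ** 0.5
        ps.append(1.0 - nd.cdf(z))  # one-tailed p-value
    return ps

ps = simulate_p_values()
p_median, p_mean = median(ps), sum(ps) / len(ps)
```

Even in this toy setup the meta-distribution is strongly right-skewed: the mean realized p-value sits well above the median, so a study whose "typical" p-value is .05 routinely produces much larger realizations.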
A. Proofs and derivations
Proposition 1. Let $P$ be a random variable $\in [0, 1]$ corresponding to the sample-derived one-tailed p-value from the paired T-test statistic (unknown variance), with median value $M(P) = p_M \in [0, 1]$, derived from a sample of size $n$. The distribution across the ensemble of statistically identical copies of the sample has for PDF
$$\varphi(p; p_M) = \begin{cases} \varphi(p; p_M)_L & \text{for } p < \frac{1}{2} \\ \varphi(p; p_M)_H & \text{for } p > \frac{1}{2} \end{cases}$$
$$\varphi(p; p_M)_L = \lambda_p^{\frac{1}{2}(-n-1)} \sqrt{-\lambda_p(\lambda_{p_M}-1)(\lambda_p-1)\lambda_{p_M} - 2\sqrt{(1-\lambda_p)\lambda_p}\sqrt{(1-\lambda_{p_M})\lambda_{p_M}} + 1}\left(\frac{1}{\lambda_p} - \frac{2\sqrt{1-\lambda_p}\sqrt{\lambda_{p_M}}}{\sqrt{\lambda_p}\sqrt{1-\lambda_{p_M}}} + \frac{1}{1-\lambda_{p_M}} - 1\right)^{n/2}$$
$$\varphi(p; p_M)_H = \left(1-\lambda'_p\right)^{\frac{1}{2}(-n-1)}\left((\lambda'_p - 1)(\lambda_{p_M}-1) - \lambda'_p\,\lambda_{p_M} + 2\sqrt{(1-\lambda'_p)\lambda'_p}\sqrt{(1-\lambda_{p_M})\lambda_{p_M}} + 1\right)^{\frac{n+1}{2}} \tag{11}$$
where $\lambda_p = I^{-1}_{2p}\left(\frac{n}{2}, \frac{1}{2}\right)$, $\lambda_{p_M} = I^{-1}_{1-2p_M}\left(\frac{1}{2}, \frac{n}{2}\right)$, $\lambda'_p = I^{-1}_{2p-1}\left(\frac{1}{2}, \frac{n}{2}\right)$, and $I^{-1}_{(\cdot)}(\cdot, \cdot)$ is the inverse beta regularized function.
Remark 3. For $p = \frac{1}{2}$ the distribution doesn't exist in theory, but does in practice, and we can work around it with the sequence $p_{m_k} = \frac{1}{2} \pm \frac{1}{k}$, as in the graph showing convergence to the Uniform distribution on $[0, 1]$ in Figure 5. Also note that what is called the "null" hypothesis is effectively a set of measure 0.
Proof. Let $Z$ be a random normalized variable with realizations $\zeta$, from a vector $\vec{v}$ of $n$ realizations, with sample mean $m_v$ and sample standard deviation $s_v$, $\zeta = \frac{m_v - m_h}{\frac{s_v}{\sqrt{n}}}$ (where $m_h$ is the level it is tested against), hence assumed to follow a Student T distribution with $n$ degrees of freedom, and, crucially, supposed to deliver a mean of $\bar{\zeta}$:
$$f(\zeta; \bar{\zeta}) = \frac{\left(\frac{n}{(\bar{\zeta} - \zeta)^2 + n}\right)^{\frac{n+1}{2}}}{\sqrt{n}\; B\left(\frac{n}{2}, \frac{1}{2}\right)},$$
where $B(\cdot,\cdot)$ is the standard beta function. Let $g(\cdot)$ be the one-tailed survival function of the Student T distribution with zero mean and $n$ degrees of freedom:
$$g(\zeta) = P(Z > \zeta) = \begin{cases} \frac{1}{2}\, I_{\frac{n}{\zeta^2 + n}}\left(\frac{n}{2}, \frac{1}{2}\right) & \zeta \geq 0 \\ \frac{1}{2}\left(I_{\frac{\zeta^2}{\zeta^2 + n}}\left(\frac{1}{2}, \frac{n}{2}\right) + 1\right) & \zeta < 0 \end{cases}$$
Remark 4. For values of $p$ close to 0, $\varphi$ in Eq. 12 can be usefully calculated as:
$$\varphi(p; p_M) = \sqrt{2\pi}\, p_M \sqrt{\log\left(\frac{1}{2\pi p_M^2}\right)}\; e^{\sqrt{-\log\left(2\pi \log\left(\frac{1}{2\pi p^2}\right)\right) - 2\log(p)}\;\sqrt{-\log\left(2\pi \log\left(\frac{1}{2\pi p_M^2}\right)\right) - 2\log(p_M)}} + O(p^2). \tag{14}$$
The approximation works more precisely for the band of relevant values $0 < p < \frac{1}{2\pi}$.
From this we can get numerical results for convolutions of $\varphi$ using the Fourier transform or similar methods.
We can also get the distribution of the minimum p-value per $m$ trials across statistically identical situations, and thus get an idea of "p-hacking", defined as attempts by researchers to get the lowest p-value of many experiments, or to try until one of the tests produces statistical significance.
Proposition 3. The distribution of the minimum of $m$ observations of statistically identical p-values becomes (under the limiting distribution of Proposition 2):
$$\varphi_m(p; p_M) = m\, e^{\operatorname{erfc}^{-1}(2 p_M)\left(2\operatorname{erfc}^{-1}(2p) - \operatorname{erfc}^{-1}(2 p_M)\right)}\left(1 - \frac{1}{2}\operatorname{erfc}\left(\operatorname{erfc}^{-1}(2p) - \operatorname{erfc}^{-1}(2 p_M)\right)\right)^{m-1} \tag{15}$$
Proof. $P(p_1 > p,\, p_2 > p,\, \ldots,\, p_m > p) = \prod_{i=1}^m \bar{\Phi}(p_i) = \bar{\Phi}(p)^m$. Taking the first derivative we get the result.
Outside the limiting distribution, we integrate numerically for different values of $m$, as shown in Figure 6. So, more precisely, for $m$ trials, the expectation is calculated as:
$$E(p_{\min}) = \int_0^1 p\; m\, \varphi(p; p_M)\left(1 - \int_0^p \varphi(u, \cdot)\, du\right)^{m-1} dp$$
Fig. 6. The "p-hacking" value across m trials for pM = .15 and ps = .22.
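The effect of taking the minimum p-value over $m$ trials can be sketched the same way, again with a z-test stand-in and illustrative parameters (not the paper's setting): the median of the minimum drops by an order of magnitude after only a handful of attempts.

```python
import random
from statistics import NormalDist, median

def min_p_over_trials(m, n=20, effect=0.37, experiments=3_000, seed=11):
    """Median of the minimum one-tailed z-test p-value across m attempts,
    each attempt a fresh sample of size n from N(effect, 1).
    (Illustrative z-test stand-in for the paper's setting.)"""
    rng = random.Random(seed)
    nd = NormalDist()
    mins = []
    for _ in range(experiments):
        best = 1.0
        for _ in range(m):
            mbar = sum(rng.gauss(effect, 1.0) for _ in range(n)) / n
            best = min(best, 1.0 - nd.cdf(mbar * n ** 0.5))
        mins.append(best)
    return median(mins)

med1, med5, med15 = (min_p_over_trials(m) for m in (1, 5, 15))
```

A single attempt has a median p-value near .05; five attempts push the median of the minimum well below .01, which is the mechanical core of p-hacking.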
B. Inverse Power of Test
Let $\beta$ be the power of a test for a given p-value $p$, for random draws $X$ from an unobserved parameter $\theta$ and a sample size of $n$. To gauge the reliability of $\beta$ as a true measure of power, we perform an inverse problem: from the realizations $X_{\theta, p, n}$ we recover the distribution of $\beta^{-1}(X)$.
Proposition 4. Let $\beta_c$ be the projection of the power of the test from the realizations assumed to be Student T distributed and evaluated under the parameter $\theta$. We have
$$\Phi(\beta_c) = \begin{cases} \Phi(\beta_c)_L & \text{for } \beta_c < \frac{1}{2} \\ \Phi(\beta_c)_H & \text{for } \beta_c > \frac{1}{2} \end{cases}$$
where
$$\Phi(\beta_c)_L = \frac{\sqrt{1-\gamma_1}\;\gamma_1^{-\frac{n}{2}}\left(1 - \gamma_1\left(2\sqrt{\tfrac{1}{\gamma_3}-1}\,\sqrt{-(\gamma_1-1)\gamma_1} - 2\sqrt{-(\gamma_1-1)\gamma_1} + \gamma_1\left(2\sqrt{\tfrac{1}{\gamma_3}-1} - \tfrac{1}{\gamma_3}\right) - 1\right)\right)^{\frac{n+1}{2}}}{\sqrt{-(\gamma_1-1)\gamma_1}}$$
$$\Phi(\beta_c)_H = \frac{\sqrt{\gamma_2}\,(1-\gamma_2)^{-\frac{n}{2}}\; B\left(\tfrac{1}{2}, \tfrac{n}{2}\right)\left(\frac{-2\left(\sqrt{-(\gamma_2-1)\gamma_2} + \gamma_2\right)\sqrt{\tfrac{1}{\gamma_3}-1} + 2\sqrt{\tfrac{1}{\gamma_3}-1} + 2\sqrt{-(\gamma_2-1)\gamma_2} - 1}{\gamma_2 - 1} + \frac{1}{\gamma_3}\right)^{\frac{n+1}{2}}}{\sqrt{-(\gamma_2-1)\gamma_2}\; B\left(\tfrac{n}{2}, \tfrac{1}{2}\right)} \tag{17}$$
where $\gamma_1 = I^{-1}_{2\beta_c}\left(\frac{n}{2}, \frac{1}{2}\right)$, $\gamma_2 = I^{-1}_{2\beta_c - 1}\left(\frac{1}{2}, \frac{n}{2}\right)$, and $\gamma_3 = I^{-1}_{(1,\, 2 p_s - 1)}\left(\frac{n}{2}, \frac{1}{2}\right)$.
C. Application and Conclusion
• One can safely see that under such stochasticity for the realizations of p-values and the distribution of their minimum, to get what people mean by 5% confidence (and the inferences they draw from it), they need a p-value at least one order of magnitude smaller.
• The "power" of a test has the same problem, unless one either lowers p-values or sets the test at higher levels, such as .99.
SUMMARY AND CONCLUSIONS
We showed the fallacies committed in the name of "ratio-
nality" by various people such as Cass Sunstein or similar
persons in the verbalistic "evidence based" category.
REFERENCES
[1] O. Peters and M. Gell-Mann, "Evaluating gambles using dynamics," Chaos, vol. 26, no. 2, 2016. [Online]. Available: http://scitation.aip.org/content/aip/journal/chaos/26/2/10.1063/1.4940236
[2] N. N. Taleb, "Black swans and the domains of statistics," The American Statistician, vol. 61, no. 3, pp. 198–200, 2007.
[3] N. Barberis, "The psychology of tail events: Progress and challenges," American Economic Review, vol. 103, no. 3, pp. 611–616, 2013.
[4] O. Peters, "The time resolution of the St Petersburg paradox," Philosophical Transactions of the Royal Society of London A: Mathematical, Physical and Engineering Sciences, vol. 369, no. 1956, pp. 4913–4931, 2011.
[5] J. L. Kelly, "A new interpretation of information rate," Information Theory, IRE Transactions on, vol. 2, no. 3, pp. 185–189, 1956.
[6] D. Geman, H. Geman, and N. N. Taleb, "Tail risk constraints and maximum entropy," Entropy, vol. 17, no. 6, p. 3724, 2015. [Online]. Available: http://www.mdpi.com/1099-4300/17/6/3724
[7] J. W. Pitman, "One-dimensional Brownian motion and the three-dimensional Bessel process," Advances in Applied Probability, pp. 511–526, 1975.
[8] N. N. Taleb and P. Cirillo, "On the shadow moments of apparently infinite-mean phenomena," arXiv preprint arXiv:1510.06731, 2015.
[9] J. L. Teugels, "The class of subexponential distributions," The Annals of Probability, vol. 3, no. 6, pp. 1000–1011, 1975.
[10] E. Pitman, "Subexponential distribution functions," J. Austral. Math. Soc. Ser. A, vol. 29, no. 3, pp. 337–347, 1980.
[11] V. Chistyakov, "A theorem on sums of independent positive random variables and its applications to branching random processes," Theory of Probability & Its Applications, vol. 9, no. 4, pp. 640–648, 1964.
[12] P. Embrechts, C. M. Goldie, and N. Veraverbeke, "Subexponentiality and infinite divisibility," Probability Theory and Related Fields, vol. 49, no. 3, pp. 335–347, 1979.
[13] H. J. Hung, R. T. O'Neill, P. Bauer, and K. Kohne, "The behavior of the p-value when the alternative hypothesis is true," Biometrics, pp. 11–22, 1997.
[14] H. Sackrowitz and E. Samuel-Cahn, "P values as random variables—expected p values," The American Statistician, vol. 53, no. 4, pp. 326–331, 1999.
[15] A. Gelman and H. Stern, "The difference between "significant" and "not significant" is not itself statistically significant," The American Statistician, vol. 60, no. 4, pp. 328–331, 2006.
[16] V. E. Johnson, "Revised standards for statistical evidence," Proceedings of the National Academy of Sciences, vol. 110, no. 48, pp. 19313–19317, 2013.
Climate models and precautionary measures
Joseph Norman†, Rupert Read§, Yaneer Bar-Yam†, Nassim Nicholas Taleb∗
†New England Complex Systems Institute, §School of Philosophy, University of East Anglia, ∗School of Engineering, New York University
Forthcoming in Issues in Science and Technology
THE POLICY DEBATE with respect to anthropogenic climate change typically revolves around the accuracy of models. Those who contend that models make accurate predictions argue for specific policies to stem the foreseen damaging effects; those who doubt their accuracy cite a lack of reliable evidence of harm to warrant policy action.
These two alternatives are not exhaustive. One can sidestep the "skepticism" of those who question existing climate models by framing risk in the most straightforward possible terms, at the global scale. That is, we should ask "what would the correct policy be if we had no reliable models?"
We have only one planet. This fact radically constrains the kinds of risks that are appropriate to take at a large scale. Even a risk with a very low probability becomes unacceptable when it affects all of us; there is no reversing mistakes of that magnitude.
Without any precise models, we can still reason that polluting or altering our environment significantly could put us in uncharted territory, with no statistical track record and potentially large consequences. It is at the core of both scientific decision making and ancestral wisdom to take seriously absence of evidence when the consequences of an action can be large. And it is standard textbook decision theory that a policy should depend at least as much on uncertainty concerning the adverse consequences as it does on the known effects.
Further, it has been shown that in any system fraught with opacity, harm is in the dose rather than in the nature of the offending substance: it increases nonlinearly with the quantities at stake. Everything fragile has such a property. While some amount of pollution is inevitable, high quantities of any pollutant put us at a rapidly increasing risk of destabilizing the climate, a system that is integral to the biosphere. Ergo, we should build down CO2 emissions, even regardless of what climate models tell us.
This leads to the following asymmetry in climate policy. The scale of the effect must be demonstrated to be large enough to have impact. Once this is shown, and it has been, the burden of proof of absence of harm is on those who would deny it.
It is the degree of opacity and uncertainty in a system, as well as asymmetry in effect, rather than specific model predictions, that should drive the precautionary measures. Push a complex system too far and it will not come back. The popular belief that uncertainty undermines the case for taking seriously the 'climate crisis' that scientists tell us we face is the opposite of the truth. Properly understood, as driving the case for precaution, uncertainty radically underscores that case, and may even constitute it.
EXTREME RISK INITIATIVE —NYU SCHOOL OF ENGINEERING WORKING PAPER SERIES 2
Traditional decision-making strategies focus on the case where harm is localized and risk is easy to calculate from past data. Under these circumstances, cost-benefit analyses and mitigation techniques are appropriate. The potential harm from miscalculation is bounded.
On the other hand, the possibility of irreversible and widespread damage raises different questions about the nature of decision making and what risks can be reasonably taken. This is the domain of the PP.
Criticisms are often levied against those who argue for caution, portraying them as unreasonable and possibly even paranoid. Those who raise such criticisms are implicitly or explicitly advocating for a cost-benefit analysis, and necessarily so. Critics of the PP have also expressed concern that it will be applied in an overreaching manner, eliminating the ability to take reasonable risks that are needed for individual or societal gains. While indiscriminate use of the PP might constrain appropriate risk-taking, at the same time one can also make the error of suspending the PP in cases when it is vital.

Hence, a non-naive view of the precautionary principle is one in which it is only invoked when necessary, and only to prevent a certain variety of very precisely defined risks based on distinctive probabilistic structures. But, also, in such a view, the PP should never be omitted when needed.
The remainder of this section will outline the difference between the naive and non-naive approaches.
2.1 What we mean by a non-naive PP
Risk aversion and risk-seeking are both well-studied human behaviors. However, it is essential to distinguish the PP so that it is neither used naively to justify any act of caution, nor dismissed by those who wish to court risks for themselves or others.
The PP is intended to make decisions that ensure survival when statistical evidence is limited—because it has not had time to show up—by focusing on the adverse effects of "absence of evidence."
Table 1 encapsulates the central idea of the paper and shows the differences between decisions with a risk of harm (warranting regular risk management techniques) and decisions with a risk of total ruin (warranting the PP).
2.2 Harm vs. Ruin: When the PP is necessary
The purpose of the PP is to avoid a certain class of what, in probability and insurance, is called "ruin" problems [1]. A ruin problem is one where outcomes of risks have a non-zero probability of resulting in unrecoverable losses. An often-cited illustrative case is that of a gambler who loses his entire fortune and so cannot return to the game. In biology, an example would be a species that has gone extinct. For nature, "ruin" is ecocide: an irreversible termination of life at some scale, which could be planetwide. The large majority of variations that
  Standard Risk Management      Precautionary Approach
  ------------------------      ----------------------
  localized harm                systemic ruin
  nuanced cost-benefit          avoid at all costs
  statistical                   fragility based
  statistical                   probabilistic non-statistical
  variations                    ruin
  convergent probabilities      divergent probabilities
  recoverable                   irreversible
  independent factors           interconnected factors
  evidence based                precautionary
  thin tails                    fat tails
  bottom-up, tinkering          top-down engineered
  evolved                       human-made

Table 1: Two different types of risk and their respective characteristics compared
[Figure 1 here: probability of ruin (vertical axis, 0 to 1.0) plotted against exposure (horizontal axis, 0 to 10,000).]

Figure 1: Why Ruin is not a Renewable Resource. No matter how small the probability, in time, something bound to hit the ruin barrier is about guaranteed to hit it.
occur within a system, even drastic ones, fundamentally differ from ruin problems: a system that achieves ruin cannot recover. As long as the instance is bounded, e.g. a gambler can work to gain additional resources, there may be some hope of reversing the misfortune. This is not the case when it is global.
Our concern is with public policy. While an individual may be advised to not "bet the farm," whether or not he does so is generally a matter of individual preferences. Policy makers have a responsibility to avoid catastrophic harm for society as a whole; the focus is on the aggregate, not at the level of single individuals, and on global-systemic, not idiosyncratic, harm. This is the domain of collective "ruin" problems.
Precautionary considerations are relevant much more broadly than to ruin problems. For example, there was a precautionary case against cigarettes long before there was an open-and-shut evidence-based case against them. Our point is that the PP is a decisive consideration for ruin problems, while in a broader context precaution is not decisive and can be balanced against other considerations.
3 WHY RUIN IS SERIOUS BUSINESS
The risk of ruin is not sustainable. By the ruin theorems,if you incur a tiny probability of ruin as a "one-off" risk,
survive it, then do it again (another "one-off" deal), you will eventually go bust with probability 1. Confusion arises because it may seem that the "one-off" risk is reasonable, but that also means that an additional one is reasonable. This can be quantified by recognizing that the probability of ruin approaches 1 as the number of exposures to individually small risks, say one in ten thousand, increases (see Fig. 1). For this reason a strategy of risk taking is not sustainable and we must consider any genuine risk of total ruin as if it were inevitable.
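The arithmetic behind this can be sketched in a few lines. The per-exposure probability and the exposure counts below are illustrative assumptions, not figures from the paper:

```python
# If each exposure carries an independent ruin probability p, then after n
# exposures the probability of having hit ruin at least once is 1 - (1-p)^n,
# which tends to 1 as n grows -- the compounding of "one-off" risks.

def cumulative_ruin(p, n):
    """Probability of hitting ruin at least once in n independent exposures."""
    return 1 - (1 - p) ** n

p = 1e-4  # an assumed "one in ten thousand" risk per exposure
for n in (1, 1_000, 10_000, 100_000):
    print(n, cumulative_ruin(p, n))
```

With these numbers, a risk that looks negligible once becomes a near certainty after enough repetitions, which is the sense in which a tiny repeated ruin probability "is about guaranteed" to hit the barrier.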
The good news is that some classes of risk can be deemed to be practically of probability zero: the earth survived trillions of natural variations daily over 3 billion years, otherwise we would not be here. By recognizing that normal risks are not in the category of ruin problems, we recognize also that it is not necessary or even normal to take risks that involve a possibility of ruin.
3.1 PP is not Risk Management
It is important to contrast and not conflate the PP and risk management. Risk management involves various strategies to make decisions based upon accounting for the effects of positive and negative outcomes and their probabilities, as well as seeking means to mitigate harm and offset losses. Risk management strategies are important for decision-making when ruin is not at stake. However, the only risk management strategy of importance in the case of the PP is ensuring that actions which can result in ruin are not taken, or equivalently, modifying potential choices of action so that ruin is not one of the possible outcomes.
More generally, we can identify three layers associated with strategies for dealing with uncertainty and risk. The first layer is the PP, which addresses cases that involve potential global harm, whether probabilities are uncertain or known and whether they are large or small. The second is risk management, which addresses the case of known probabilities of well-defined, bounded gains and losses. The third is risk aversion or risk-seeking behavior, which reflects quite generally the role of personal preferences for individual risks when uncertainty is present.
3.2 Ruin is forever
A way to formalize the ruin problem in terms of the destructive consequences of actions identifies harm as not about the amount of destruction, but rather a measure of the integrated level of destruction over the time it persists. When the impact of harm extends to all future times, i.e. forever, then the harm is infinite. When the harm is infinite, the product of any non-zero probability and the harm is also infinite, and it cannot be balanced against any potential gains, which are necessarily finite. This strategy for evaluation of harm as involving the duration of destruction can be used for localized harms for better assessment in risk management. Our focus here is on the case where destruction is complete for a system or an irreplaceable aspect of a system.

Figure 2: A variety of temporal states for a process subjected to an absorbing barrier. Once the absorbing barrier is hit, the process terminates, regardless of its future potential.

Figure 2 shows ruin as an absorbing barrier, a point that does not allow recovery. For example, for humanity global devastation cannot be measured on a scale in which harm is proportional to level of devastation. The harm due to complete destruction is not the same as 10 times the destruction of 1/10 of the system. As the percentage of destruction approaches 100%, the assessment of harm diverges to infinity (instead of converging to a particular number) due to the value placed on a future that ceases to exist.
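One way to see the divergence is with a hypothetical harm curve that blows up as the fraction destroyed approaches 1. The functional form x/(1-x) below is purely illustrative; the paper argues for the divergence itself, not for any particular formula:

```python
# Illustrative harm as a function of the fraction x of the system destroyed:
# superlinear in x, and diverging as x -> 1 (total ruin), because a future
# that ceases to exist carries unbounded loss. The form x/(1-x) is an
# assumption made for the sketch only.

def harm(x):
    """Harm for fraction destroyed x, with 0 <= x < 1."""
    return x / (1.0 - x)

# Destroying the whole system is far worse than ten separate 10% losses:
print(10 * harm(0.1))  # ten 10% destructions, summed
print(harm(0.999))     # near-total destruction dwarfs that sum
```

Under any such curve, complete destruction is not "10 times the destruction of 1/10 of the system": the last increments of destruction dominate everything that came before.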
Because the "cost" of ruin is effectively infinite, cost-benefit analysis (in which the potential harm and potential gain are multiplied by their probabilities and weighed against each other) is no longer a useful paradigm. Even if probabilities are expected to be zero but have a non-zero uncertainty, then a sensitivity analysis that considers the impact of that uncertainty results in infinities as well. The potential harm is so substantial that everything else in the equation ceases to matter. In this case, we must do everything we can to avoid the catastrophe.
4 SCIENTIFIC METHODS AND THE PP
How well can we know either the potential consequences of policies or their probabilities? What does science say about uncertainty? To be helpful in policy decisions, science has to encompass not just expectations of potential benefit and harm but also their probability and uncertainty.
Just as the imperative of analysis of decision-making changes when there is infinite harm for a small, non-zero risk, so is there a fundamental change in the ability to apply scientific methods to the evaluation of that harm. This influences the way we evaluate both the possibility of and the risk associated with ruin.
plants and land insects are comparatively robust. It is not known to what extent these events are driven extrinsically, by meteor impacts, geological events including volcanos, or cascading events of coupled species extinctions, or combinations of them. The variability associated with mass extinctions, however, indicates that there are fat tail events that can affect the global biosphere. The major extinction events during the past 500 million years occur at intervals of millions of years [5]. While mass extinctions occur, the extent of that vulnerability is driven by both sensitivity to external events and connectivity among ecosystems.
The greatest impact of human beings on this natural system connectivity is through dramatic increases in global transportation. The impact of invasive species and rapid global transmission of diseases demonstrates the role of human activity in connecting previously much more isolated natural systems. The role of transportation and communication in connecting civilization itself is apparent in economic interdependence manifest in cascading financial crises that were not possible even a hundred years ago. The danger we are facing today is that we as a civilization are globally connected, and the fat tail of the distribution of shocks extends globally, to our peril.
Had nature not imposed sufficiently thin-tailed variations in the aggregate or macro level, we would not be here today. A single one of the trillions, perhaps the trillions of trillions, of variations over evolutionary history would have terminated life on the planet. Figures 1 and 2 show the difference between the two separate statistical properties. While tails can be fat for subsystems, nature remains predominantly thin-tailed at the level of the planet [6]. As connectivity increases the risk of extinction increases dramatically and nonlinearly [7].
6.1 Risk and Global Interventionism
Currently, global dependencies are manifest in the expressed concerns about policy maker actions that nominally appear to be local in their scope. In just recent months, headlines have been about Russia's involvement in Ukraine, the spread of Ebola in east Africa, expansion of ISIS control into Iraq, ongoing posturing in North Korea and Israeli-Palestinian conflict, among others. These events reflect upon local policy maker decisions that are justifiably viewed as having global repercussions. The connection between local actions and global risks compels widespread concern and global responses to alter or mitigate local actions. In this context, we point out that the broader significance and risk associated with policy actions that impact on global ecological and human survival is the essential point of the PP. Paying attention to the headline events without paying attention to these even larger risks is like being concerned about the wine being served on the Titanic.
Figure 5: Nonlinear response compared to linear response. The PP should be evoked to prevent impacts that result in complete destruction due to the nonlinear response of natural systems; it is not needed for smaller impacts where risk management methods can be applied.
7 FRAGILITY
We define fragility in the technical discussion in Appendix C as "is harmed by uncertainty," with the mathematical result that what is harmed by uncertainty has a certain type of nonlinear response to random events.
The PP applies only to the largest scale impacts due to the inherent fragility of systems that maintain their structure. As the scale of impacts increases, the harm increases non-linearly up to the point of destruction.
7.1 Fragility as Nonlinear Response
Everything that has survived is necessarily non-linear to harm. If I fall from a height of 10 meters I am injured more than 10 times as much as if I fell from a height of 1 meter, or more than 1000 times as much as if I fell from a height of 1 centimeter; hence I am fragile. In general, every additional meter, up to the point of my destruction, hurts me more than the previous one.
Similarly, if I am hit with a big stone I will be harmed a lot more than if I were pelted serially with pebbles of the same total weight.
Everything that is fragile and still in existence (that is, unbroken) will be harmed more by a certain stressor of intensity X than by k times a stressor of intensity X/k, up to the point of breaking. If I were not fragile (susceptible to harm more than linearly), I would be destroyed by accumulated effects of small events, and thus would not survive. This non-linear response is central for everything on planet earth.
This explains the necessity of considering scale when invoking the PP. Polluting in a small way does not warrant the PP because it is essentially less harmful than polluting in large quantities, since harm is non-linear.
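The stone-versus-pebbles comparison is just convexity of the harm function: h(X) > k · h(X/k) for convex h. A minimal sketch, using a quadratic harm function as an assumed illustration (the paper does not commit to any particular form):

```python
# Convex harm: one stressor of intensity X does more damage than k stressors
# of intensity X/k with the same total "weight". h(x) = x**2 is an assumed,
# purely illustrative convex choice.

def h(x):
    return x ** 2  # convex: large shocks hurt disproportionately

X, k = 10.0, 10
big_stone = h(X)          # one large stressor of intensity X
pebbles = k * h(X / k)    # k small stressors of intensity X/k
print(big_stone, pebbles) # the single large shock dominates
```

The same total dose, delivered as one concentrated shock, is an order of magnitude more harmful here, which is the fragility asymmetry the text describes.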
7.2 Why is fragility a general rule?
The statistical structure of stressors is such that small variations are much, much more frequent than large ones. Fragility is intimately connected to the ability to withstand small impacts and recover from them. This ability is what makes a system retain its structure. Every system has a threshold of impact beyond which it will be destroyed, i.e. its structure is not sustained.

Consider a coffee cup sitting on a table: there are millions of recorded earthquakes every year; if the coffee cup were linearly sensitive to earthquakes and accumulated their effects as small deteriorations of its form, it would not persist even for a short time as it would have been broken down due to the accumulated impact of small vibrations. The coffee cup, however, is non-linear to harm, so that the small or remote earthquakes only make it wobble, whereas one large one would break it forever.
This nonlinearity is necessarily present in everything fragile.
Thus, when impacts extend to the size of the system, harm is severely exacerbated by non-linear effects. Small impacts, below a threshold of recovery, do not accumulate for systems that retain their structure. Larger impacts cause irreversible damage. We should be careful, however, of actions that may seem small and local but then lead to systemic consequences.
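The coffee-cup contrast can be put in toy form. All numbers here are invented for the illustration; the point is only the difference between accumulating damage linearly and ignoring shocks below a recovery threshold:

```python
# Two toy response models for the coffee cup:
#  - a "linear" cup accumulates every small vibration and eventually fails;
#  - a thresholded cup ignores shocks below its recovery threshold and
#    only breaks on a single large one.

def linear_cup_survives(shocks, capacity=100.0):
    return sum(shocks) <= capacity   # damage accumulates forever

def threshold_cup_survives(shocks, threshold=50.0):
    return max(shocks) <= threshold  # sub-threshold shocks leave no trace

small_shocks = [0.1] * 10_000        # many tiny earthquakes
print(linear_cup_survives(small_shocks))     # accumulation destroys it
print(threshold_cup_survives(small_shocks))  # it only wobbles, and survives
```

Only the thresholded cup is consistent with the observed persistence of fragile objects, which is the argument for why fragility (non-linear response) is a general rule.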
7.3 Fragility, Dose response and the 1/n rule
Another area where we see non-linear responses to harm is the dose-response relationship. As the dose of some chemical or stressor increases, the response to it grows non-linearly. Many low-dose exposures do not cause great harm, but a single large dose can cause irreversible damage to the system, like overdosing on painkillers.
In decision theory, the 1/n heuristic is a simple rule in which an agent invests equally across n funds (or sources of risk) rather than weighting their investments according to some optimization criterion such as mean-variance or Modern Portfolio Theory (MPT), which dictates some amount of concentration in order to increase the potential payoff. The 1/n heuristic mitigates the risk of suffering ruin due to an error in the model; there is no single asset whose failure can bring down the ship. While the potential upside of the large payoff is dampened, ruin due to an error in prediction is avoided. This heuristic works best when the sources of variations are uncorrelated; in the presence of correlation or dependence between the various sources of risk, the total exposure needs to be reduced.
Hence, because of non-linearities, it is preferable to diversify our effect on the planet, e.g. distinct types of pollutants, across the broadest number of uncorrelated sources of harm, rather than concentrate them. In this way, we avoid the risk of an unforeseen, disproportionately harmful response to a pollutant deemed "safe" by virtue of responses observed only in relatively small doses.
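The effect of the 1/n rule on ruin can be sketched by Monte Carlo. Everything numeric below (the Pareto tail index, the barrier, the exposure) is an assumed illustration, not a calibration from the paper:

```python
import random

# Monte Carlo sketch of the 1/n rule: spread a fixed total exposure over n
# independent fat-tailed sources of harm vs. concentrating it in one, and
# count "ruin" events where total harm crosses a barrier.

def pareto(alpha, u):
    """Inverse-CDF sample from a Pareto with minimum 1 and tail index alpha."""
    return (1.0 - u) ** (-1.0 / alpha)

def ruin_frequency(n_sources, total_exposure=1.0, alpha=1.5,
                   barrier=20.0, trials=100_000, seed=42):
    rng = random.Random(seed)
    per_source = total_exposure / n_sources
    ruins = 0
    for _ in range(trials):
        harm = sum(per_source * pareto(alpha, rng.random())
                   for _ in range(n_sources))
        if harm > barrier:
            ruins += 1
    return ruins / trials

print(ruin_frequency(1))    # all exposure concentrated in one source
print(ruin_frequency(10))   # same total exposure spread 1/n over 10 sources
```

With these assumed parameters the diversified exposure crosses the ruin barrier markedly less often than the concentrated one, even though the expected total harm is identical, matching the text's argument for spreading effects across uncorrelated sources.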
8 THE LIMITATION OF TOP-DOWN ENGINEERING IN COMPLEX ENVIRONMENTS
In considering the limitations of risk-taking, a key question is whether or not we can analyze the potential outcomes of interventions and, knowing them, identify the associated risks. Can't we just "figure it out?" With such knowledge we can gain assurance that extreme problems such as global destruction will not arise.
Since the same issue arises for any engineering effort, we can ask: what is the state of the art of engineering? Does it enable us to know the risks we will encounter? Perhaps it can just determine the actions we should, or should not, take. There is justifiably widespread respect for engineering because it has provided us with innovations ranging from infrastructure to electronics that have become essential to modern life. What is not as well known by the scientific community and the public is that engineering approaches fail in the face of complex challenges, and this failure has been extensively documented by the engineering community itself [8]. The underlying reason for the failure is that complex environments present a wide range of conditions. Which conditions will actually be encountered is uncertain. Engineering approaches involve planning that requires knowledge of the conditions that will be encountered. Planning fails due to the inability to anticipate the many conditions that will arise.
This problem arises particularly for "real-time" systems that are dealing with large amounts of information and have critical functions in which lives are at risk. A classic example is the air traffic control system. An effort to modernize that system by traditional engineering methods cost $3-6 billion and was abandoned without changing any part of the system because of the inability to evaluate the risks associated with its implementation.
Significantly, the failure of traditional engineering to address complex challenges has led to the adoption of innovation strategies that mirror evolutionary processes, creating platforms and rules that can serve as a basis for safely introducing small incremental changes that are extensively tested in their real world context [8]. This strategy underlies the approach used by highly successful, modern, engineered-evolved, complex systems ranging from the Internet, to Wikipedia, to iPhone App communities.
9 SKEPTICISM AND PRECAUTION
We show in Figures 6 and 7 that an increase in uncertainty leads to an increase in the probability of ruin; hence the impact of "skepticism" on decisions should lead to increased, not decreased, conservatism in the presence of ruin. More skepticism about models implies more uncertainty about the tails, which necessitates
Figure 6: The more uncertain or skeptical one is of "scientific" models and projections, the higher the risk of ruin, which flies in the face of arguments of the style "skeptical of climate models." No matter how much the probability of benefits increases, ruin as an absorbing barrier, i.e. causing extinction without further recovery, can more than cancel them out. This graph assumes changes in uncertainty without changes in benefits (a mean-preserving sensitivity) – the next one isolates the changes in benefits.
Figure 7: The graph shows the asymmetry between benefits and harm and the effect on the ruin probabilities. It shows the effect on ruin probability of changes in the Information Ratio, that is, expected benefit divided by uncertainty (or signal divided by noise). Benefits are small compared to negative effects. Three cases are considered, two from Extremistan: extremely fat-tailed (α = 1), and less fat-tailed (α = 2), and one from Mediocristan.
more precaution about newly implemented techniques, or larger size of exposures. As we said, Nature might not be smart, but its longer track record means smaller uncertainty in following its logic.
Mathematically, more uncertainty about the future – or about a model – increases the scale of the distribution, hence thickens the "left tail" (as well as the "right one"), which raises the potential for ruin. The survival probability is reduced no matter what takes place in the right tail.
Hence skepticism about climate models should lead to more precautionary policies.
In addition, such increased uncertainty matters far more in Extremistan – and has benign effects in Mediocristan. Figure 7 shows the asymmetries between costs and benefits as far as ruin probabilities, and why these matter more for fat-tailed domains than thin-tailed ones. In thin-tailed domains, an increase in uncertainty changes the probability of ruin by several orders of magnitude, but the effect remains small: from say 10^-40 to 10^-30 is not quite worrisome. In fat-tailed domains, the effect is sizeable as we start with a substantially higher probability of ruin (which is typically underestimated, see [6]).
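The contrast can be sketched numerically. The barrier, the scale doubling, and the tail index below are assumed for illustration only, not values from the paper:

```python
import math

# Widening the scale of a distribution (more uncertainty) raises the
# probability of crossing a distant ruin barrier: by many orders of
# magnitude but from a negligible base for a thin (Gaussian) tail, and
# from an already sizeable base for a fat (Pareto) tail.

def gaussian_tail(barrier, scale):
    """P(X < -barrier) for a centered normal with the given scale."""
    return 0.5 * math.erfc(barrier / (scale * math.sqrt(2)))

def pareto_tail(barrier, scale, alpha=2.0):
    """P(loss > barrier) for a Pareto-tailed loss with tail index alpha."""
    return (scale / barrier) ** alpha

barrier = 10.0
for scale in (1.0, 2.0):  # doubling the uncertainty
    print("scale", scale,
          "thin:", gaussian_tail(barrier, scale),
          "fat:", pareto_tail(barrier, scale))
```

With these numbers the thin-tailed ruin probability jumps by many orders of magnitude yet stays negligible, while the fat-tailed one starts at percent level and multiplies, mirroring the 10^-40 versus 10^-30 contrast in the text.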
10 WHY SHOULD GMOS BE UNDER PP BUT NOT NUCLEAR ENERGY?
As examples that are relevant to the discussion of the different types of strategies, we consider the differences between concerns about nuclear energy and GM crops.

In short, nuclear exposure is nonlinear – and can be local (under some conditions) – while GMOs are not and present systemic risks even in small amounts.
10.1 Nuclear energy
Many are justifiably concerned about nuclear energy. It is known that the potential harm due to radiation release, core meltdowns and waste can be large. At the same time, the nature of these risks has been extensively studied, and the risks from local uses of nuclear energy have a scale that is much smaller than global. Thus, even though some uncertainties remain, it is possible to formulate a cost-benefit analysis of risks for local decision-making. The large potential harm at a local scale means that decisions about whether, how and how much to use nuclear energy, and what safety measures to use, should be made carefully so that decision makers and the public can rely upon them. Risk management is a very serious matter when potential harm can be large and should not be done casually or superficially. Those who perform the analysis must not only do it carefully, they must have the trust of others that they are doing it carefully. Nevertheless, the known statistical structure of the risks and the absence of global systemic consequences makes the cost-benefit analysis meaningful. Decisions can be made in the cost-benefit context—evoking the PP is not appropriate for small amounts of nuclear energy, as the local nature of the risks is not indicative of the circumstances to which the PP applies.
In large quantities, we should worry about an unseen risk from nuclear energy and invoke the PP. In small quantities, it may be OK—how small we should determine by direct analysis, making sure threats never cease to be local.
In addition to the risks from nuclear energy use itself, we must keep in mind the longer term risks associated with the storage of nuclear waste, which are
subject to selection over long times and survived. The success rate is tiny. Unlike GMOs, in nature there is no immediate replication of mutated organisms to become a large fraction of the organisms of a species. Indeed, any one genetic variation is unlikely to become part of the long term genetic pool of the population. Instead, just like any other genetic variation or mutation, transgenic transfers are subject to competition and selection over many generations before becoming a significant part of the population. A new genetic transfer engineered today is not the same as one that has survived this process of selection.
An example of the effect of transfer of biologically evolved systems to a different context is that of zoonotic diseases. Even though pathogens consume their hosts, they evolve to be less harmful than they would otherwise be. Pathogens that cause highly lethal diseases are selected against because their hosts die before they are able to transmit to others. This is the underlying reason for the greater dangers associated with zoonotic diseases—caused by pathogens that shift from the host that they evolved in to human beings, including HIV, Avian and Swine flu that transferred from monkeys (through chimpanzees), birds and hogs, respectively.
More generally, engineered modifications to ecological systems (through GMOs) are categorically and statistically different from bottom-up ones. Bottom-up modifications do not remove the crops from their long term evolutionary context, enabling the push and pull of the ecosystem to locally extinguish harmful mutations. Top-down modifications that bypass this evolutionary pathway unintentionally manipulate large sets of interdependent factors at the same time, with dramatic risks of unintended consequences. They thus result in fat-tailed distributions and place a huge risk on the food system as a whole.
For the impact of GMOs on health, the evaluation of whether the genetic engineering of a particular chemical (protein) into a plant is OK by the FDA is based upon considering limited existing knowledge of risks associated with that protein. The number of ways such an evaluation can be in error is large. The genetic modifications are biologically significant as the purpose is to strongly impact the chemical functions of the plant, modifying its resistance to other chemicals such as herbicides or pesticides, or affecting its own lethality to other organisms—i.e. its antibiotic qualities. The limited existing knowledge generally does not include long term testing of the exposure of people to the added chemical, even in isolation. The evaluation is independent of the ways the protein affects the biochemistry of the plant, including interactions among the various metabolic pathways and regulatory systems—and the impact of the resulting changes in biochemistry on the health of consumers. The evaluation is independent of its farm-ecosystem combination (i.e. pesticide resistant crops are subject to increased use of pesticides, which are subsequently present in the plant in larger concentrations and cannot be washed away). Rather than recognizing the limitations of current understanding, poorly grounded perspectives about the potential damage with unjustified assumptions are being made. Limited empirical validation of both essential aspects of the conceptual framework as well as specific conclusions are being used because testing is recognized to be difficult.
We should exert the precautionary principle here – our non-naive version – because we do not want to discover errors after considerable and irreversible environmental and health damage.
10.4 Red herring: How about the risk of famine without GMOs?
An argument used by those who advocate for GMOs is that they will reduce the hunger in the world. Invoking the risk of famine as an alternative to GMOs is a deceitful strategy, no different from urging people to play Russian roulette in order to get out of poverty.
The evocation of famine also prevents clear thinking about not just GMOs but also about global hunger. The idea that GMO crops will help avert famine ignores evidence that the problem of global hunger is due to poor economic and agricultural policies. Those who care about the supply of food should advocate for an immediate impact on the problem by reducing the amount of corn used for ethanol in the US, which burns food for fuel, consuming over 40% of the US crop that could provide enough food to feed 2/3 of a billion people [14].
One of the most extensively debated cases for GMOs is a variety of rice—"golden rice"—to which has been added a precursor of vitamin A as a potential means to alleviate this nutritional deficiency, which is a key medical condition affecting impoverished populations. Since there are alternatives, including traditional vitamin fortification, one approach is to apply a cost-benefit analysis comparing these approaches. Counter to this approach stand both the largely unknown risks associated with the introduction of GMOs, and the need and opportunities for more systemic interventions to alleviate not just malnutrition but poverty and hunger worldwide. While great attention should be placed on immediate needs, neglecting the larger scale risks is unreasonable [10]. Here science should adopt an unyielding rigor for both health benefit and risk assessment, including careful application of the PP. Absent such rigor, advocacy by the scientific community not only fails to be scientific, but also becomes subject to challenge for short term interests, not much different from corporate endorsers. Thus, cutting corners on tests, including tests without adequate consent or approvals performed on Chinese children [15], undermines scientific claims to humanitarian ideals. Given the promotion of "golden rice" by the agribusinesses that also promote biofuels, their interest in humanitarian impacts versus profits gained through wider acceptance of GMO technology can be legitimately questioned [16].
We can frame the problem in our probabilistic argument of Section 9. The asymmetry from adding another risk, here a technology (with uncertainty attending some of its outcomes), to solve a given risk (which can be solved by less complicated means) is illustrated in Figures 6 and 7. Model error, or errors from the technology itself, i.e., its iatrogenics, can turn a perceived "benefit" into a highly likely catastrophe, simply because an error from, say, "golden rice" or some such technology would have much worse outcomes than an equivalent benefit. Most of the discussions on "saving the poor from starvation" via GMOs miss the fundamental asymmetry shown in Figure 7.
10.5 GMOs in summary
In contrast to nuclear energy (which, as discussed in section 10.1 above, may or may not fall under the PP, depending on how and where (how widely) it is implemented), Genetically Modified Organisms, GMOs, fall squarely under the PP because of their systemic risk. The understanding of the risks is very limited and the scope of the impacts is global, both due to the engineering approach replacing an evolutionary approach, and due to the use of monoculture.
Labeling the GMO approach "scientific" betrays a very poor—indeed warped—understanding of probabilistic payoffs and risk management. A lack of observations of explicit harm does not show absence of hidden risks. Current models of complex systems only contain the subset of reality that is accessible to the scientist. Nature is much richer than any model of it. To expose an entire system to something whose potential harm is not understood because extant models do not predict a negative outcome is not justifiable; the relevant variables may not have been adequately identified.
Given the limited oversight that is taking place on GMO introductions in the US, and the global impact of those introductions, we are precisely in the regime of the ruin problem. A rational consumer should say: We do not wish to pay—or have our descendants pay—for errors made by executives of Monsanto, who are financially incentivized to focus on quarterly profits rather than long-term global impacts. We should exert the precautionary principle—our non-naive version—simply because we otherwise will discover errors with large impacts only after considerable damage.
10.6 Vaccination, Antibiotics, and Other Exposures
Our position is that while one may argue that vaccination is risky, or risky under some circumstances, it does not fall under the PP owing to the lack of systemic risk. The same applies to such interventions as antibiotics, provided the scale of their use remains local.
11 PRECAUTION AS POLICY AND NAIVE INTERVENTION
When there is a risk of ruin, obstructionism and policy inaction are important strategies, impeding the rapid headlong experimentation with global ruin by those with short-term, self-centered incentives and perspectives. Two approaches for policy action are well justified. In the first, actions that avoid the inherent sensitivity of the system to propagation of harm can be used to free the system to enable local decision-making and exploration with only local harm. This involves introducing boundaries, barriers and separations that inhibit propagation of shocks, preventing ruin for overly connected systems. In the second, where such boundaries don't exist or cannot be introduced due to other effects, there is a need for actions that are adequately evaluated as to their global harm. Scientific analysis of such actions, meticulously validated, is needed to prevent small risks from causing ruin.
What is not justified, and dangerous, are actions that are intended to prevent harm by additional intervention. The reason is that indirect effects are likely to create precisely the risks that one is intending to avoid.
When existing risks are perceived as having the potential for ruin, it may be assumed that any preventive measure is justified. There are at least two problems with such a perspective. First, localized harm is often mistaken for ruin, and the PP is wrongly invoked where risk management techniques should be employed. When a risk is not systemic, overreaction will typically cause more harm than benefits, like undergoing dangerous surgery to remove a benign growth. Second, even if the threat of ruin is real, taking specific (positive) action in order to ward off the perceived threat may introduce new systemic risks. It is often wiser to reduce or remove activity that is generating or supporting the threat and allow natural variations to play out in localized ways.
Preventive action should be limited to correcting situations by removing threats via negativa in order to bring them back in line with a statistical structure that avoids ruin. It is often better to remove structure or allow natural variation to take place rather than to add something additional to the system.
When one takes the opposite approach, taking specific action designed to diminish some perceived threat, one is almost guaranteed to induce unforeseen consequences. Even when there appears to be a direct link from a specific action to a specific preventive outcome, the web of causality extends in complex ways with consequences that are far from the intended goal. These unintended consequences may generate new vulnerabilities or strengthen the harm one is hoping to diminish. Thus, when possible, limiting fragilizing dependencies is better than imposing additional structure that increases the fragility of the system as a whole.
12 FALLACIOUS ARGUMENTS AGAINST PP
In this section we respond to a variety of arguments that have been made against the PP.
12.1 Crossing the road (the paralysis fallacy)
Many have countered the invocation of the PP with "nothing is ever totally safe." "I take risks crossing the road every day, so according to you I should stay home in a state of paralysis." The answer is that we don't cross the street blindfolded; we use sensory information to mitigate risks and reduce exposure to extreme shocks.
Even more importantly in the context of the PP, the probability distribution of death from road accidents at the population level is thin-tailed; I do not incur the risk of generalized human extinction by crossing the street—a human life is bounded in duration and its unavoidable termination is an inherent part of the bio-social system [17]. The error of my crossing the street at the wrong time and meeting an untimely demise in general does not cause others to do the same; the error does not spread. If anything, one might expect the opposite effect, that others in the system benefit from my mistake by adapting their behavior to avoid exposing themselves to similar risks. Equating risks a person takes with his or her own life with risking the existence of civilization is an inappropriate ego trip. In fact, the very idea of the PP is to avoid such a frivolous focus.
The paralysis argument is often used to present the PP as incompatible with progress. This is untrue: tinkering, bottom-up progress where mistakes are bounded is how progress has taken place in history. The non-naive PP simply asserts that the risks we take as we innovate must not extend to the entire system; local failure serves as information for improvement. Global failure does not.
This fallacy illustrates the misunderstanding between systemic and idiosyncratic risk in the literature. Individuals are fragile and mortal. The idea of sustainability is to strive to make systems as close to immortal as possible.
12.2 The Psychology of Risk and Thick Tailed Distributions
One concern about the utility of the PP is that its evocation may become commonplace because of risk aversion. Is it true that people overreact to small probabilities and the PP would feed into human biases? While we have carefully identified the scope of the domain of applicability of the PP, it is also helpful to review the evidence of risk aversion, which we find not to be based upon sound studies.

Certain empirical studies appear to support the existence of a bias toward risk aversion, claiming evidence that people choose to avoid risks that are beneficial, inconsistent with cost-benefit analyses. The relevant experiments ask people questions about single probability events, showing that people overreact to small probabilities. However, those researchers failed to include the consequences of the associated events, which humans underestimate. Thus, this empirical strategy as a way of identifying effectiveness of response to risk is fundamentally flawed [18].
The proper consideration of risk involves both probability and consequence, which should be multiplied together. Consequences in many domains have thick tails, i.e. much larger consequences can arise than are considered in traditional statistical approaches. Overreacting to small probabilities is not irrational when the effect is large, as the product of probability and harm is larger than expected from the traditional treatment of probability distributions.
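The point can be sketched numerically. In the sketch below, a Pareto distribution is used as a stand-in for fat-tailed harm; the tail index and sample size are illustrative assumptions, not values from the paper:

```python
# A hedged numerical sketch of "probability times consequence" under
# thick tails, using a Pareto distribution as a stand-in for fat-tailed
# harm. The tail index and sample size are illustrative choices.
import random

random.seed(42)
alpha = 1.5        # Pareto tail index; alpha < 2 implies infinite variance
n = 200_000

# Inverse-CDF sampling: X = (1 - U)**(-1/alpha) gives P(X > x) = x**(-alpha)
harms = [(1 - random.random()) ** (-1 / alpha) for _ in range(n)]

mean_harm = sum(harms) / n
worst_1pct = sorted(harms)[-n // 100:]
tail_share = sum(worst_1pct) / sum(harms)

print(f"mean harm: {mean_harm:.2f}")
print(f"share of total harm from the worst 1% of events: {tail_share:.0%}")
# A large share of the expected loss comes from rare, extreme events,
# so heavy weighting of small probabilities is not "irrational" here.
```

For a Pareto with tail index 1.5, the worst 1% of outcomes carry on the order of a fifth of the total harm, which is why a rational agent weighs small probabilities far more heavily than a thin-tailed model suggests.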
12.3 The Loch Ness fallacy
Many have countered that we have no evidence that the Loch Ness monster doesn't exist, and, to take the argument that evidence of absence is different from absence of evidence, we should act as if the Loch Ness monster existed. The argument is a corruption of the absence of evidence problem and certainly not part of the PP.
The relevant question is whether the existence of the Loch Ness monster has implications for decisions about actions that are being taken. We are not considering a decision to swim in Loch Ness. If the Loch Ness monster did exist, there would still be no reason to invoke the PP, as the harm it might cause is limited in scope to Loch Ness itself, and does not present the risk of ruin.
12.4 The fallacy of misusing the naturalistic fallacy
Some people invoke "the naturalistic fallacy," a philosophical concept that is limited to the moral domain. According to this critique, we should not claim that natural things are necessarily good; human innovation can be equally valid. We do not claim to use nature to derive a notion of how things "ought" to be organized. Rather, as scientists, we respect nature for the extent of its experimentation. The high level of statistical significance given by a very large sample cannot be ignored. Nature may not have arrived at the best solution to a problem we consider important, but there is reason to believe that it is smarter than our technology based only on statistical significance.
The question about what kinds of systems work (as demonstrated by nature) is different from the question about what working systems ought to do. We can take a lesson from nature—and time—about what kinds of organizations are robust against, or even benefit from, shocks, and in that sense systems should be structured in ways that allow them to function. Conversely, we cannot derive the structure of a functioning system from what we believe the outcomes ought to be.
To take one example, Cass Sunstein—who has written an article critical of the PP [19]—claims that there is a "false belief that nature is benign." However, his conceptual discussion fails to distinguish between thin and fat tails, local harm and global ruin. The method of analysis misses both the statistical significance of nature and the fact that it is not necessary to believe in the perfection of
nature, or in its "benign" attributes, but rather in its track record, its sheer statistical power as a risk evaluator and as a risk manager in avoiding ruin.
12.5 The "Butterfly in China" fallacy
The statement “if I move my finger to scratch my nose,
by the butterfly-in-China effect, owing to non-linearities, I may terminate life on earth," is known to be flawed. The explanation is not widely understood. The fundamental reason arises because of the existence of a wide range in levels of predictability and the presence of a large number of fine scale degrees of freedom for every large scale one [20]. Thus, the traditional deterministic chaos, for which the butterfly effect was named, applies specifically to low dimensional systems with a few variables in a particular regime. High dimensional systems, like the earth, have large numbers of fine scale variables for every large scale one. Thus, it is apparent that not all butterfly wing flaps can cause hurricanes. It is not clear that any one of them can, and, if small perturbations can influence large scale events, it happens only under specific conditions where amplification occurs.
Empirically, our thesis rebuts the butterfly fallacy with the argument that, in the aggregate, nature has experienced trillions of small variations and yet it survives. Therefore, we know that the effects of scratching one's nose fall into the thin-tailed domain and thus do not warrant the precautionary principle.
As described previously, barriers in natural systems lead to subsystems having a high degree of independence. Understanding how modern systems with a high degree of connectivity have cascading effects is essential for understanding when it is and isn't appropriate to use the PP.
12.6 The potato fallacy
Many species were abruptly introduced into the Old World starting in the 16th Century that did not cause environmental disasters (perhaps aside from diseases affecting Native Americans). Some use this observation in defense of GMOs. However, the argument is fallacious at two levels:
First, by the fragility argument, potatoes, tomatoes and similar "New World" goods were developed locally through progressive, bottom-up tinkering in a complex system in the context of its interactions with its environment. Had they had an impact on the environment, it would have caused adverse consequences that would have prevented their continual spread.
Second, a counterexample is not evidence in the risk domain, particularly when the evidence is that taking a similar action previously did not lead to ruin. Lack of ruin due to several or even many trials does not indicate safety from ruin in the next one. This is also the Russian roulette fallacy, detailed below.
12.7 The Russian roulette fallacy (the counterexamples in the risk domain)
The potato example, assuming potatoes had not been generated top-down by some engineers, would still not be sufficient. Nobody says "look, the other day there was no war, so we don't need an army," as we know better in real-life domains. Nobody argues that a giant Russian roulette with many barrels is "safe" and a great money-making opportunity because it didn't blow up someone's brains last time.
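The Russian roulette arithmetic can be sketched directly. With one bullet in six chambers, each round is survived with probability 5/6, and repetition makes ruin all but certain:

```python
# A minimal sketch of why "it didn't blow up last time" is not evidence
# of safety. With one bullet in six chambers, each round is survived
# with probability 5/6; repeated play makes ruin nearly certain.

p_round = 5 / 6  # per-round survival probability

survival = {n: p_round ** n for n in (1, 10, 50, 100)}
for n, p in survival.items():
    print(f"{n:>3} rounds: P(still alive) = {p:.6f}")
# By 50 rounds, survival probability is on the order of 1 in 10,000,
# regardless of how profitable each surviving round appeared.
```

This is the time-domain view of the head section: a per-round "positive expectation" says nothing about the survival of the single agent who plays repeatedly.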
There are many reasons a previous action may not have led to ruin while still having the potential to do so. If you attempt to cross the street with a blindfold and earmuffs on, you may make it across, but this is not evidence that such an action carries no risk.
More generally, one needs a large sample for claims of absence of risk in the presence of a small probability of ruin, while a single "n = 1" example would be sufficient to counter the claims of safety—this is the Black Swan argument [27]. Simply put, systemic modifications require a very long history in order for the evidence of lack of harm to carry any weight.
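How long a history is needed can be quantified. The sketch below uses the standard "rule of three" from statistics (an assumption of independent, identical trials, not a claim specific to this paper): if each trial carries ruin probability p, then n clean trials occur with probability (1 - p)**n, and roughly n ≈ 3/p clean trials are needed before a spotless record bounds p at the 95% level.

```python
# How many trials with zero observed ruin are needed before a clean
# record bounds the per-trial ruin probability below epsilon, at a
# given confidence level? Solve (1 - epsilon)**n <= 1 - confidence.
import math

def trials_needed(epsilon: float, confidence: float = 0.95) -> int:
    """Clean trials needed to conclude p < epsilon with the given confidence."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - epsilon))

for eps in (1e-2, 1e-4, 1e-6):
    print(f"to bound ruin probability below {eps:g}: "
          f"{trials_needed(eps):,} clean trials")
```

For a ruin probability of one in a million, roughly three million clean, independent trials are required: a "very long history" indeed, while a single ruin event suffices to refute the safety claim.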
12.8 The Carpenter Fallacy
Risk managers skeptical of the understanding of risk of biological processes, such as GMOs, by the experts are sometimes asked "are you a biologist?" But nobody asks a probabilist dealing with roulette sequences if he is a carpenter. To understand the gambler's ruin problem by roulette betting, we know to ask a probabilist, not a carpenter. No amount of expertise in carpentry can replace rigor in understanding the properties of long sequences of small probability bets. Likewise, no amount of expertise in the details of biological processes can be a substitute for probabilistic rigor.
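The gambler's ruin problem mentioned above has a classical closed form. The sketch below (with parameters chosen purely for illustration) shows what a probabilist sees that a carpenter does not: even a nearly fair repeated bet against a much larger bankroll makes ruin close to certain.

```python
# Classical gambler's ruin: a gambler with capital i bets one unit per
# round, winning each round with probability p, and stops at 0 (ruin)
# or at a target N. The closed form below is standard probability
# theory; the parameters are illustrative.

def ruin_probability(i: int, N: int, p: float) -> float:
    """P(hitting 0 before N) starting from capital i."""
    if p == 0.5:
        return 1 - i / N          # fair-game special case
    r = (1 - p) / p               # ratio of loss to win probability
    return (r**i - r**N) / (1 - r**N)

print(ruin_probability(10, 1000, 0.5))    # fair game, distant target: 0.99
print(ruin_probability(10, 1000, 0.495))  # slightly unfavorable: ~1.0
```

The lesson carries over: in a long sequence of small-probability bets, what matters is not the per-round odds but the absorbing barrier at zero.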
The context for evaluating risk is the extent of knowledge or lack of knowledge. Thus, when considering GMO risks, a key question is the extent to which we know the impacts of genetic changes in organisms. Claims that geneticists know these consequences as a basis for GMOs do not recognize that their knowledge is neither complete in its own domain nor is genetics complete as a body of knowledge. Geneticists do not know the developmental, physiological, medical, cognitive and environmental consequences of genetic changes in organisms. Indeed, most of these are not part of their training or competency. Neither are they trained in recognizing the impact of the limitations of knowledge on risk.
Some advocates dismiss the very existence of risk due to the role of scientific knowledge in GMOs. According to this view, scientists from Monsanto and similar companies can be trusted to provide safe foods without risk and even a question about risk is without basis. Scientific knowledge as a source of engineering innovation has a long tradition. At the same time, engineering itself is a different discipline and has different imperatives. While