Ethics for self-improving machines
J Storrs Hall
Mark Waser


Page 1: Ethics for  self-improving machines

Ethics for self-improving machines

J Storrs Hall
Mark Waser

Page 2: Ethics for  self-improving machines

Asimov's 3 Laws:

http://www.markzug.com/

1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.

2. A robot must obey orders given to it by human beings except where such orders would conflict with the First Law.

3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.

Page 3: Ethics for  self-improving machines

Asimov's robots didn't improve themselves.

But our AIs (we hope) will.

Page 4: Ethics for  self-improving machines

How do you design laws for something that will think in concepts you haven't heard of, and which you couldn't grasp if you had?

Page 5: Ethics for  self-improving machines

There is no chance that everybody will create their robots with any given set of laws anyhow!

Laws reflect goals (and thus values), which do NOT converge across humanity.

Page 6: Ethics for  self-improving machines

Axelrod's Evolution of Cooperation and decades of follow-on evolutionary game theory provide the theoretical underpinnings.

Be nice / don't defect
Retaliate
Forgive

“Selfish individuals, for their own selfish good, should be nice and forgiving.”
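The "be nice, retaliate, forgive" recipe is essentially Axelrod's TIT FOR TAT. As a concrete illustration (not from the slides; the 5/3/1/0 payoffs are just the conventional prisoner's-dilemma values), here is a minimal sketch of the strategy in a repeated game:

```python
# Minimal iterated prisoner's dilemma sketch (illustrative only; payoff values
# are the conventional T=5, R=3, P=1, S=0). TIT FOR TAT is nice (cooperates
# first), retaliates (copies a defection), and forgives (returns to
# cooperation as soon as the other player does).

PAYOFF = {  # (my move, their move) -> my payoff; 'C' = cooperate, 'D' = defect
    ('C', 'C'): 3, ('C', 'D'): 0,
    ('D', 'C'): 5, ('D', 'D'): 1,
}

def tit_for_tat(history):
    """Cooperate first, then mirror the opponent's previous move."""
    return 'C' if not history else history[-1][1]

def always_defect(history):
    return 'D'

def play(strategy_a, strategy_b, rounds=200):
    """Return total payoffs for two strategies over repeated play."""
    hist_a, hist_b = [], []          # each entry: (own move, opponent's move)
    score_a = score_b = 0
    for _ in range(rounds):
        move_a, move_b = strategy_a(hist_a), strategy_b(hist_b)
        score_a += PAYOFF[(move_a, move_b)]
        score_b += PAYOFF[(move_b, move_a)]
        hist_a.append((move_a, move_b))
        hist_b.append((move_b, move_a))
    return score_a, score_b

if __name__ == '__main__':
    print(play(tit_for_tat, tit_for_tat))    # mutual cooperation: (600, 600)
    print(play(tit_for_tat, always_defect))  # exploited once, then punished: (199, 204)
```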

Page 7: Ethics for  self-improving machines

In nature, cooperation appears whenever the cognitive machinery will support it.

Vampire bats (Wilkinson)
Blue Jays (Stephens, McLinn, & Stevens)
Cotton-Top Tamarins (Hauser, et al.)

Page 8: Ethics for  self-improving machines

Economic Sentience

Defined as: “Awareness of the potential benefits of cooperation and trade with other intelligences.”

TIME DISCOUNTING is its measure.
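A quick way to see why time discounting is the right measure (a sketch with assumed payoff numbers, not taken from the slides): whether ongoing cooperation beats a one-time gain from defection depends entirely on how much the agent values future rounds.

```python
# Illustrative sketch: how time discounting decides whether cooperation pays.
# Assumed numbers: 5 for defecting once, 3 per round for mutual cooperation,
# 1 per round once trust is gone. An agent that values the future (discount
# factor w near 1) prefers the modest ongoing payoff; a heavy discounter
# grabs the windfall.

def discounted_stream(per_round, w, rounds=1000):
    """Present value of `per_round` received each round, discounted by w."""
    return sum(per_round * (w ** t) for t in range(rounds))

def compare(w, cooperate_payoff=3, defect_once=5, punished_payoff=1):
    """Cooperate every round vs. defect once and be punished thereafter."""
    cooperate = discounted_stream(cooperate_payoff, w)
    defect = defect_once + w * discounted_stream(punished_payoff, w, rounds=999)
    return cooperate, defect

for w in (0.3, 0.6, 0.9):
    c, d = compare(w)
    print(f"w={w}: cooperate={c:.1f}  defect={d:.1f}  -> "
          f"{'cooperate' if c > d else 'defect'}")
# w=0.3 -> defect wins; w=0.6 and w=0.9 -> cooperation wins
```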

Page 9: Ethics for  self-improving machines

Tragedy of the Commons

Page 10: Ethics for  self-improving machines

Acting ethically is an attractor in the state space of intelligent goal-driven systems (if they interact with other intelligent goal-driven systems on a long-term, ongoing basis).

Ethics *IS* the necessary basis for cooperation.
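A toy illustration of the attractor claim (assumed payoffs and a well-mixed population of just two strategies): with long-term repeated interaction, even a small minority of reciprocators grows to fixation, while with one-shot interaction the population slides the other way.

```python
# Replicator-style sketch (assumptions: standard PD payoffs, two strategies,
# well-mixed population). With many rounds per pairing, reciprocators are
# pulled toward fixation: cooperation acts as an attractor. With one-shot
# interaction (rounds=1) the attractor disappears.

def expected_payoffs(x, rounds):
    """Average payoff of TIT FOR TAT vs ALWAYS DEFECT, given TFT share x."""
    tft_vs_tft = 3 * rounds
    tft_vs_alld = 0 + (rounds - 1) * 1   # exploited once, then mutual defection
    alld_vs_tft = 5 + (rounds - 1) * 1
    alld_vs_alld = 1 * rounds
    f_tft = x * tft_vs_tft + (1 - x) * tft_vs_alld
    f_alld = x * alld_vs_tft + (1 - x) * alld_vs_alld
    return f_tft, f_alld

def evolve(x=0.05, rounds=50, generations=60):
    """Discrete replicator dynamics: a strategy grows with relative fitness."""
    for _ in range(generations):
        f_tft, f_alld = expected_payoffs(x, rounds)
        mean = x * f_tft + (1 - x) * f_alld
        x = x * f_tft / mean
    return x

print(evolve(rounds=50))   # -> approaches 1.0: reciprocators take over
print(evolve(rounds=1))    # -> approaches 0.0: one-shot play, defection wins
```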

Page 11: Ethics for  self-improving machines

We must find ethical design elements that are Evolutionarily Stable Strategies, so that we can start AIs out in the attractor it's taken us millions of years to begin to descend.
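For reference, the textbook ESS test (Maynard Smith): an incumbent strategy I resists invasion by a rare mutant J if E(I,I) > E(J,I), or if those payoffs tie and E(I,J) > E(J,J). A small check against a toy payoff table for a 50-round repeated game (assumed numbers, consistent with the sketches above):

```python
# Sketch of the standard ESS test, applied to a toy two-strategy payoff table
# for a 50-round repeated prisoner's dilemma (assumed numbers). A strategy is
# evolutionarily stable if no rare mutant can invade.

# E[(row strategy) against (column strategy)]
E = {
    ('TFT',  'TFT'):  150, ('TFT',  'ALLD'): 49,
    ('ALLD', 'TFT'):   54, ('ALLD', 'ALLD'): 50,
}

def is_ess(incumbent, mutants):
    """Return True if `incumbent` resists invasion by every listed mutant."""
    for j in mutants:
        if E[(incumbent, incumbent)] > E[(j, incumbent)]:
            continue                    # strictly better against itself
        if (E[(incumbent, incumbent)] == E[(j, incumbent)]
                and E[(incumbent, j)] > E[(j, j)]):
            continue                    # tie, but punishes the mutant
        return False
    return True

print(is_ess('TFT',  ['ALLD']))   # True: reciprocal altruism resists defectors
print(is_ess('ALLD', ['TFT']))    # also True; hence it matters where we
                                  # *start* the system, which is this slide's point
```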

Page 12: Ethics for  self-improving machines

Let's call such a design element an ESV:

Evolutionarily Stable (or Economically Sentient) Virtue.

Page 13: Ethics for  self-improving machines

Economically Unviable

Destruction

Slavery

Short-term profit at the expense of the long term

Avoiding each of these is an ESV.

Page 14: Ethics for  self-improving machines

Fair enforcement of contracts is an ESV that demonstrably promotes cooperation.

Page 15: Ethics for  self-improving machines

Open Source motivations are an ESV, as are other forms of guaranteeing trustworthiness.

Like auditing in current-day corporations, since money is their true emotion.

Page 16: Ethics for  self-improving machines

In particular, RECIPROCAL ALTRUISM is an ESV; exactly like its superset, ENLIGHTENED SELF-INTEREST (AKA ETHICS).

Page 17: Ethics for  self-improving machines

A general desire for all ethical agents to live (and prosper) as long as possible is also an ESV, because it promotes a community with long-term stability and accountability.

Page 18: Ethics for  self-improving machines

There is no good but knowledge, and no evil but ignorance.
— Socrates

Curiosity – a will to extend and improve one's world model – is an ESV.

Page 19: Ethics for  self-improving machines

An AI with ESVs who knows what that means has a guideline for designing Version 2.0, even when the particulars of the new environment don't match the concepts of the old literal goal structure.