Ethics for self-improving machines
J Storrs Hall, Mark Waser
Asimov's 3 Laws:
http://www.markzug.com/
1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.
2. A robot must obey orders given to it by human beings except where such orders would conflict with the First Law.
3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.
Asimov's robots didn't improve themselves.
But our AIs (we hope) will.
How do you design laws for something that will think in concepts you haven't heard of, and which you couldn't grasp if you had?
There is no chance that everybody will create their robots with any given set of laws anyhow!
Laws reflect goals (and thus values) which do NOT converge over humanity.
Axelrod's Evolution of Cooperation and decades of follow-on evolutionary game theory provide the theoretical underpinnings.
Be nice / don't defect
Retaliate
Forgive
“Selfish individuals, for their own selfish good, should be nice and forgiving.”
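These three properties can be made concrete with a small iterated prisoner's dilemma sketch (an illustration, not from the slides), using Axelrod's standard payoffs T=5, R=3, P=1, S=0. Tit-for-Tat is nice (opens with cooperation), retaliatory (defects once after a defection), and forgiving (returns to cooperation as soon as the opponent does):

```python
# Iterated prisoner's dilemma sketch with Axelrod's payoffs.
# PAYOFF[(my_move, their_move)] -> (my_points, their_points)
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(own_history, opp_history):
    # Be nice (start with C), retaliate once, then forgive.
    return "C" if not opp_history else opp_history[-1]

def always_defect(own_history, opp_history):
    return "D"

def play(strat_a, strat_b, rounds=200):
    """Play two strategies against each other; return total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strat_a(hist_a, hist_b)
        b = strat_b(hist_b, hist_a)
        pa, pb = PAYOFF[(a, b)]
        score_a, score_b = score_a + pa, score_b + pb
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b

# Mutual Tit-for-Tat sustains full cooperation; against Always-Defect
# it loses only the first round, then matches defection for defection.
print(play(tit_for_tat, tit_for_tat))   # (600, 600)
print(play(tit_for_tat, always_defect)) # (199, 204)
```

The punchline of Axelrod's tournaments is visible even here: Tit-for-Tat never beats any single opponent, yet its willingness to cooperate with cooperators makes it prosper across a population.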
In nature, cooperation appears whenever
the cognitive machinery will support it.
Vampire bats (Wilkinson)
Blue jays (Stephens, McLinn, & Stevens)
Cotton-top tamarins (Hauser, et al.)
Economic Sentience
Defined as: “Awareness of the potential benefits of cooperation and trade with other intelligences.”
TIME DISCOUNTING is its measure.
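One way to make this measure concrete (a standard folk-theorem result, not stated in the slides): in the iterated prisoner's dilemma with payoffs T > R > P > S, a grim-trigger cooperator resists the one-shot temptation to defect only when its discount factor d satisfies d >= (T - R)/(T - P). An agent that discounts the future too heavily cannot sustain cooperation, so a high valuation of future payoffs is exactly what economic sentience requires:

```python
# Sketch of the grim-trigger cooperation condition:
# cooperating forever is worth R/(1-d); defecting once then being
# punished is worth T + d*P/(1-d). Cooperation holds when
#   R/(1-d) >= T + d*P/(1-d)   =>   d >= (T-R)/(T-P)
def min_discount_for_cooperation(T, R, P):
    """Smallest discount factor at which grim-trigger cooperation is stable."""
    return (T - R) / (T - P)

# With Axelrod's payoffs T=5, R=3, P=1, any agent valuing the next
# round at 50% or more of the current one can afford to cooperate.
print(min_discount_for_cooperation(5, 3, 1))  # 0.5
```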
Tragedy of the Commons
Acting ethically is an attractor in the state space of intelligent goal-driven systems (if they interact with other intelligent goal-driven systems on a long-term ongoing basis)
Ethics *IS* the necessary basis for cooperation
We must find ethical design elements that are Evolutionarily Stable Strategies
so that we can start AIs out in the attractor it's taken us millions of years to begin to descend.
Let's call such a design element an ESV:
Evolutionarily Stable (or Economically Sentient) Virtue.
Economically Unviable:
Destruction
Slavery
Short-term profit at the expense of the long term
Avoiding each of these is an ESV.
Fair enforcement of contracts is an ESV that demonstrably promotes cooperation.
Open Source motivations are an ESV, like auditing in current-day corporations (since money is their true emotion). So are other forms of guaranteeing trustworthiness.
In particular, RECIPROCAL ALTRUISM is an ESV, exactly like its superset ENLIGHTENED SELF-INTEREST (AKA ETHICS).
A general desire for all ethical agents to live (and prosper) as long as possible is also an ESV, because it promotes a community with long-term stability and accountability.
There is no good but knowledge, and no evil but ignorance.
— Socrates
Curiosity – a will to extend and improve one's world model – is an ESV.
An AI with ESVs who knows what that means has a guideline for designing Version 2.0, even when the particulars of the new environment don't match the concepts of the old literal goal structure.