Harry Collins, Cardiff University
Testing Machines as Social Prostheses
www.eurostarconferences.com
@esconfs
#esconfs
Even cogs are complicated
I have discovered that there is a nice sociological
question even about whether the cogs mesh
It happened after my group commissioned a piece
of software
Whose problem?
Users
Developers
The firm demanded their money, threatening legal
action.
AGILE!!
SPRINTS!!
agile? sprints?
A new sociological problem
I could never have imagined something like
this could happen.
The question of when a program is ‘working’
would make a great PhD project.
And, to be interesting, there would be no
need to look further than who it is who
says the cogs are working
The debugger’s regress
The deeper problem
But there is a deeper problem: does the
machine do a good job?
This question is confounded by another:
Should a computer do what humans in the
same place might do (only better) or should
it do something different?
What is the proper relationship between
machines and people?
Popular culture
We are surrounded by scare stories of
computers coming to rule us and an easy
anthropomorphism in fiction.
The differences between humans and
machines seem subtle and complicated.
Our attention is drawn away from
fundamental but very simple aspects
of the relationship that, once pointed
out, can be seen in the familiar
devices we use every day
And show us that humanoids are
fantasy
Artificial intelligence
Humans are social
Foreseeable computers, science fiction
aside, are not.
Thus, for the foreseeable future, computers
will not be able to handle a social
phenomenon like natural language in a
human-like way
Turing Test and Imitation Game
COMPUTER
JUDGE
HUMAN
PARTICIPANT
Turing Test
MAN PRETENDS
TO BE WOMAN
JUDGE?
WOMAN
Imitation Game
Unsurprisingly, people do not agree over whether
computers can handle natural language. Consider
the Turing Test
When will a computer pass the Turing Test?
An interview with Eric Schmidt, Executive Chairman, Google
August 14, 2013
“Many people in AI believe that we’re close to [a computer passing the Turing
Test] within the next five years,” said Eric Schmidt, Executive Chairman,
Google, speaking at The Aspen Institute on July 16, 2013.
Artificially Intelligent Game Bots Pass the Turing Test on Turing’s
Centenary
Sept. 26, 2012 AUSTIN, Texas —
An artificially intelligent virtual gamer created by computer scientists at The
University of Texas at Austin has won the BotPrize by convincing a panel of
judges that it was more human-like than half the humans it competed
against.
KURZWEIL IS CONFIDENT MACHINES WILL PASS TURING TEST BY 2029
In 1972 experienced psychologists interviewed human paranoid
schizophrenics and ‘PARRY’, a computer designed to generate typical
paranoid text. 33 psychiatrists were shown transcripts of the conversations
but could do no better than guesswork in identifying human and machine
(48%). It became fashionable to claim that The Turing Test was too easy.
Arithmetic
But even arithmetic is embedded in the
social:
Consider the following arithmetical series
that appears to have a definitive
continuation that is nothing to do with the
social
2,4,6,8, …
But ‘reasonable’ continuations could be any number
"10 " (2,4,6,8,10,…)
" 2 " (2,4,6,8,2,4,6,8,…)
" 8 " (2,4,6,8,8,4,6,2,2,4,6,8,…)
" 4 " (2,4,6,8,4,6,8,10,6,8,10,12,…)
" 6 " (2,4,6,8,6,8,10,12,10,12,14,16,…)
" 1 " (2,4,6,8,1,3,5,7,-1,1,3,5,…)
" 3 " (2,4,6,8,3,5,7,9,4,6,8,10,…)
" 5 " (2,4,6,8,5,7,9,11,...)
Socialisation in the classroom
IQ Tests?
2010
One of the ways in which we
develop our ‘collective tacit
knowledge’
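The point about the 2, 4, 6, 8 series can be made concrete with a small sketch. It assumes one hypothetical family of rules (blocks of four even numbers, each block shifted by a fixed `step`); the names `block_rule` and `step` are illustrative and not from the talk. Every rule in the family reproduces 2, 4, 6, 8, yet each gives a different fifth term, so the first four terms alone cannot settle which continuation is ‘right’.

```python
def block_rule(step):
    """Build a rule: blocks of four even numbers, each block shifted by `step`.

    `step` is a hypothetical parameter chosen for illustration only.
    """
    def term(n):
        block, pos = divmod(n, 4)          # which block of four, and position in it
        return 2 * (pos + 1) + step * block
    return term


def first_terms(rule, count):
    """Return the first `count` terms produced by a rule."""
    return [rule(n) for n in range(count)]


# Each step value defines a different, equally mechanical rule.
fifth_terms = {}
for step in (0, 1, 2, 3, 4, 8):
    terms = first_terms(block_rule(step), 5)
    assert terms[:4] == [2, 4, 6, 8]       # every rule agrees on the given terms
    fifth_terms[step] = terms[4]           # ...and disagrees on the next one

print(fifth_terms)
```

With `step=8` the rule is the familiar ‘add two’ series; with `step=0` it simply repeats 2, 4, 6, 8. Nothing in the code prefers one over the other.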
So how does any machine such as a pocket calculator work?
How can it be that there are machines
without social understandings – without
any tacit knowledge – that do scientific
tasks that seem to depend on social
understandings?
There are two answers
1st answer: I ‘repair’ my socially deficient calculator
My height is 69 inches
There are 2.54 centimetres
to the inch
How tall am I in centimetres?
Right or wrong?
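The height example can be worked through as a rough sketch. The calculator only multiplies; deciding how many digits are sensible is left to the user. The rounding step below is an assumption about what the ‘repair’ looks like (a height quoted to the nearest inch hardly justifies hundredths of a centimetre); it is not a rule stated in the talk.

```python
INCHES = 69           # height given on the slide
CM_PER_INCH = 2.54    # conversion factor given on the slide

# What the calculator displays: the bare product, with every digit it has.
calculator_output = INCHES * CM_PER_INCH

# Assumed human 'repair': round to a whole centimetre, because the input
# was only given to the nearest inch.
repaired_answer = round(calculator_output)

print(f"calculator: {calculator_output:.2f} cm")
print(f"repaired:   about {repaired_answer} cm")
```

The machine’s 175.26 is arithmetically correct; whether it is the ‘right’ answer to “How tall am I?” depends on the judgment the user supplies.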
‘Repair’ makes
ELIZA and PARRY successful
The ability to approximate
is somewhere between a ubiquitous
expertise and a specialist expertise
2007
2nd answer: I undertake ‘mimeomorphic actions’
Sometimes humans want to
act as though they
were not social
creatures.
Sometimes we want to
do things in the manner
of asocial machines.
These things are called
Mimeomorphic actions
1998
Examples of mimeomorphic actions
Synchronised
Swimming
Marching
Saluting
Rapid repetition
I’m not a pheasant plucker
I’m a pheasant plucker’s son
And I’m only plucking pheasant
’til the pheasant pluckers come
Mimeomorphic and Polimorphic actions
Mimeomorphic actions are actions that can be reproduced merely by observing and repeating the externally visible behaviours associated with an action, even if that action is not understood.
A stranger or an
artificial stranger (a
machine) can mimic a
mimeomorphic action
With polimorphic actions there is no easy mapping between behaviour and action.
To reproduce a polimorphic action the social embedding of the action must be understood.
Polimorphic and Mimeomorphic actions
Polimorphic actions: actions that can be, and
often must be, ‘many-shaped’ and the shape of
which varies according to the society (polis).
Also the same behaviour can be different
actions
For example, greeting (as opposed to saluting)
[Slide graphic: “Hello Darling” repeated five times]
Greetings are not mimeomorphic
[Slide graphic: “Hello Darling” repeated thirteen times]
What computers can do
Computers are very good at mimicking
mimeomorphic actions. Mostly they are better
than us at these things and we employ
computers, and other machines, to do them for
us where we can.
Polimorphic actions, however, are beyond the
capacity of foreseeable computers. It is we who
have to supply the surrounding penumbra of the
polimorphic.
This is repair: e.g. approximating is repair, and
making the spell-check decisions is repair
To save misunderstanding
The argument applies equally to learning
machines, neural nets, etc.
They are just very complicated mimickers of
mimeomorphic actions, they are not
embedded in social life.
The nearest thing to socialised software is
programs that continually learn from text on the web
Hello
Darling
To know how to do these things properly
depends on collective tacit knowledge
Social prostheses
To put this another way, a computer, or other machine, is a
social prosthesis.
It is something that fills the place of a missing part in a
social setting.
But a prosthesis does not have to be identical to the original
part
Understanding computers
Testing computers is seeing how they fit into
life
This means understanding the boundary
between the places where they mimic
mimeomorphic actions (or exceed human
capacity for executing mimeomorphic
actions) and the polimorphic contribution
of the humans that surround them
The importance of the boundary
Understanding the interface might well mean
making a prosthesis that tries to do less
rather than more and leaves more of the
job to the humans
E.g. early spell checkers tried to do too much:
they tried to replace the word rather than
indicate a problem and offer a choice
Same with early medical expert systems:
advice systems are better
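The spell-checker contrast can be sketched in a few lines. This is an illustration under stated assumptions, not a description of any real spell checker: the word list `VOCAB` and the function names `auto_replace` and `flag_and_offer` are hypothetical, and the similarity matching uses Python's standard `difflib.get_close_matches`.

```python
import difflib

# Hypothetical word list, for illustration only.
VOCAB = ["pheasant", "plucker", "present", "pleasant"]


def auto_replace(word, vocab=VOCAB):
    """'Too ambitious' design: silently substitute the closest known word."""
    if word in vocab:
        return word
    best = difflib.get_close_matches(word, vocab, n=1, cutoff=0.0)
    return best[0] if best else word


def flag_and_offer(word, vocab=VOCAB):
    """'Advice' design: report the problem and leave the choice to the human."""
    if word in vocab:
        return None                        # nothing to report
    suggestions = difflib.get_close_matches(word, vocab, n=3, cutoff=0.6)
    return {"word": word, "suggestions": suggestions}


print(auto_replace("pheasent"))            # the machine decides
print(flag_and_offer("pheasent"))          # the human decides
```

The second function does less, but it places the boundary where the human can supply the polimorphic part: deciding which word was meant.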
It seems to me
that computer testing means understanding
the boundary between the mimeomorphic
and the polimorphic and educating users
and designers about how a good boundary
can be accomplished without being too
ambitious
To fulfil that role as well as possible, the
sociology and philosophy will also have to
be understood
Periodic Table of Expertises
2007
How ‘interactional expertise’ can
capture the tacit knowledge associated
with practices and skills even if one
cannot practise them oneself, so that in
designing software either oneself or
one’s ‘agent’ must possess it
1. UBIQUITOUS EXPERTISES
2. SPECIALIST EXPERTISES
   UBIQUITOUS TACIT KNOWLEDGE: Beer-mat Knowledge, Popular Understanding, Primary Source Knowledge
   SPECIALIST TACIT KNOWLEDGE: Interactional Expertise, Contributory Expertise
3. META-EXPERTISES
   EXTERNAL (Transmuted expertises): Ubiquitous Discrimination, Local Discrimination
   INTERNAL (Non-transmuted expertises): Technical Connoisseurship, Downward Discrimination, Referred Expertise
And which of
Relational tacit knowledge
Somatic tacit knowledge
Collective tacit knowledge
can be made explicit and
coded and which cannot
2010
Interactional Expertise and Imitation Games
COMPUTER
JUDGE
HUMAN
PARTICIPANT
MAN PRETENDS
TO BE WOMAN
JUDGE?
Should be a
woman
WOMAN
HARRY COLLINS
PRETENDS TO
BE GW
PHYSICIST
JUDGE
ALSO GW
PHYSICIST
GW PHYSICIST
The blind
Q2) Is a spherical resonant mass detector equally sensitive to radiation from all over the sky?
A2) Yes, unlike cylindrical bar detectors which are most sensitive to gravitational radiation coming from a direction perpendicular to the long axis.
B2) Yes it is.
Q3) State if after a burst of gravitational waves pass by, a bar antenna continues to ring and mirrors of an interferometer continue to oscillate from their mean positions? (only motion in the relevant frequency range is important).
A3) Bars will continue to ring, but the mirrors in the interferometer will not continue to oscillate.
B3) Bars continue to ring; the separation of interferometer mirrors, however, follows the pattern of the wave in real time.
Q5) A theorist tells you that she has come up with a theory in which a circular ring of particles are displaced by GW so that the circular shape remains the same but the size oscillates about a mean size. Would it be possible to measure this effect using a laser interferometer?
A5) Yes, but you should analyse the sum of the strains in the two arms, rather than the difference. You don't even need two arms to detect GWs, provided you can measure the round-trip light travel time along a single arm accurately enough to detect small changes in its length.
B5) It depends on the direction of the source. There will be no detectable signal if the source lies anywhere on the plane which passes through the center station and bisects the angle of the two arms. Otherwise there will be a signal, maximised when the source lies along one or other of the two arms.
Q6) Imagine the mirrors of an interferometer are equally but oppositely (electrically) charged. Could the effect of a radio-wave on the interferometer be the same as a gravitational wave?
A6) In principle you could detect the passage of an electromagnetic (EM) wave, but the effect is different than for a GW. Unlike EM waves, GWs produce quadrupolar deformations. A typical EM wave would change the distance in only one arm while a typical GW wave would change the distances (in opposite ways) in both, so the differential signal for the EM wave would be half that for a GW.
B6) Since gravitational waves change the shape of spacetime and radio waves do not, the effect on an interferometer of radio waves can only be to mimic the effects of a gravitational wave, not reproduce them. An EM wave could, however, produce noise which could be mistaken for a GW under the circumstances described.
JUDGE: So let me start with sport. Are you interested in tennis and do you ever watch it on the television?
RESPONDENT 1: I watch Wimbledon a little bit on the television and occasionally the Australian Open in January
RESPONDENT 2: I like tennis but only watch big tournaments like Wimbledon

JUDGE: So tell me what you think about the Hawk-Eye line judging system
RESPONDENT 1: Not being a tennis professional it is not for me to say if it should or should not be used. It does not really alter viewing
RESPONDENT 2: It adds an other element to the game which could make it more interesting

JUDGE: But I want to know whether you think that the umpire or the players could ever make a better judgment than Hawk-Eye
RESPONDENT 1: I assume it’s the same technology in cricket and in cricket, Hawk-Eye is between two and four mm out. If it is the same for tennis, then it is probably still more accurate than the human eye. If the players are happy with it and the umpires are happy with it then they should continue using Hawk-Eye
RESPONDENT 2: There is always a degree of uncertainty with both people and technology

JUDGE: How accurately would you say a human can judge the flight of a tennis-ball? I mean, would you say they could tell the difference between touch the line and 1mm out 2mm out 1 cm out, 2 cm out, or what, and what would it depend on?
RESPONDENT 1: I think often a tennis player is not in a position to judge accurately as they are not usually parallel with the line. I think that if you set up a test for a line judge with two balls one which landed on the line and one which landed 1mm away from the line, I don't think they could tell the difference. If you think how small 1mm is then it would be so hard for them to judge.
RESPONDENT 2: it would depend on the speed the ball was travelling and the position of the judge relative to the line and obviously the closer the ball is the line the harder it would be to make a judgement. So you would have to judge each call on an individual bases as there are a lot of factors.

4 PHASE 2 JUDGES
1) I think respondent 1 gives himself away when he discusses the human judgments on the flight of a tennis ball.
2) I cannot believe a sighted person saying that Hawk-eye does not alter the viewing.
3) The Hawk-Eye questions reveal some quite specific information that I don’t think was published in audio media. Also, the story wasn’t that important that I’d expect it to be picked up by the audio news services provided to the blind.
4) person 2 seems really unfamiliar with hawk-eye, given that they say they watch Wimbledon
Qualitative data
[Chart: net right guesses, net wrong guesses and don’t-know equivalents for the blind condition and the sighted condition, IDENTIFY vs CHANCE. Values shown: 2, 12, 49, 7; 0.86, 0.13. Blind p=0.0000]
Imitation Game tests with the blind
Quantitative data
Pass Rates 14% and 87%
IR = proportion net correct guesses (right-wrong)
Not-identified: 14%, 87%
Identify condition on right
             COLOR-BLIND  P’FECT PITCH  BLIND  SEXUALITY  RELIGION  GENDER (f m)  GENDER (old young)
Chance PR    95%          100%          87%    100%       100%      90%           100%
Identify PR  67%          27%           14%    56%        32%       84%           72%
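The slides give IR as the proportion of net correct guesses and label the pass rate as ‘not identified’. A minimal sketch of how the two could relate, under two assumptions that the slides do not state explicitly: the denominator of IR is the total number of judgments (don’t-knows included), and the pass rate is 1 − IR. The second assumption is at least consistent with the figures shown for the blind (0.86 and 14%, 0.13 and 87%). The counts in the last line are hypothetical.

```python
def identification_ratio(right, wrong, total):
    """Proportion of net correct guesses: (right - wrong) / total.

    Assumption: `total` counts every judgment, including don't-knows.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    return (right - wrong) / total


def pass_rate(ir):
    """Assumed relation: the share of pretenders not identified is 1 - IR."""
    return 1.0 - ir


# IR values as they appear on the slide for the blind.
slide_pass_rates = [round(pass_rate(ir) * 100) for ir in (0.86, 0.13)]
print(slide_pass_rates)

# Hypothetical counts, purely to exercise the formula.
example_ir = identification_ratio(right=15, wrong=3, total=20)
print(example_ir)
```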
New method for comparative social analysis
+ ethnicity
Proposed European
comparative project
+ South Africa
How we play the game now: Step 1
Judge
Pretender
Non-Pretender
If you are player A you start by playing the judge role and then you
switch between all three roles as convenient
You play with B as Pretender, C as Non-Pretender, D as Judge, E as Non-Pretender, F as Pretender, G as Judge
You communicate with a computer program which controls the
games and links all the right players together as they switch from
role to role. You don’t see the players in dashed boxes.
PARTICIPANT has target expertise
JUDGE has target expertise
PARTICIPANT pretends to have target expertise
How we play the game now
PARTICIPANT has target expertise
JUDGE has target expertise
PARTICIPANT pretends to have target expertise
X c24
c200 NEW PRETENDER ANSWERS
24 SETS OF NON-PRETENDER ANSWERS
24 sets of questions
c200 NEW DIALOGUES
DISCARD
c200 NEW JUDGMENTS
S1, S2, S3, S4
FILTER
Pharmaceutical Science
Science 4 October 2013: Vol. 342 no. 6154 pp. 60-65
Who's Afraid of Peer Review?
John Bohannon
A spoof paper concocted by Science revealed little or
no scrutiny at many open-access journals.
304 versions of spoof wonder drug paper submitted to
open-access journals. More than half of the journals
(157) accepted the paper, failing to react to its fatal and
‘obvious’ flaws.