
Advanced HCI
Lecture 3

Andrew Howes
School of Computer Science

University of Birmingham

Adaptive Interaction

• People choose what to do by finding strategies that exist within the space defined by three things:

• The environment. Bounds imposed by the environment including computers, tablets, smartphones etc.

• Human psychology. Bounds imposed by human psychology, including memory, vision, and motor systems.

• A utility function. People have goals, preferences, and tasks. They must weigh various trade-offs including, for example, speed and accuracy.

putting perception and motor control together

+

• Your task is to press the “Guardian” icon.

• Your eyes start focused on the crosshair.

• Make the required eye movements, pick your aim point, and press the button.

• Achieving this goal requires your brain to process the verbal instructions, choose an aim point, and control muscles in your face, eyes, hand, arm, and upper body. All of this activity needs to be coordinated so as to press a button.

Modeling

• We can use what we know about vision and motor control to predict and explain performance on this task.

example: targeting


outcome of their hand movement. The central difficulty for the movement planner in these tasks is that, with speeded movement, planned movement will differ from actual movement due to motor noise. As a result, the exact outcome of the hand movement is stochastic and the choice of a movement plan simply serves to assign probabilities to each movement outcome.

In our studies, subjects pointed rapidly at stimulus configurations consisting of a small target and nearby penalty region (Figure 8.3). Hand movements that ended within the green target circle yielded a small monetary reward; those ending in the red penalty circle could result in a loss. Endpoints in the target–penalty overlap region led to the awarding of both the reward and the penalty. A time limit was imposed, and movements that arrived after the time limit resulted in large penalties. Target size and the distance between target and penalty regions were small (< 2 cm), similar in size to the subject’s movement endpoint variability. The movement plan that maximizes expected gain under these conditions depends on the relative position of the target and penalty circle, on the loss assigned to the penalty region, and on the subject’s endpoint variability; this is explained next.

How should a subject perform this task? Clearly, the subject’s visuo-motor strategy should take into account motor uncertainty and the penalty structure imposed by the task. Our model of optimal performance is built on the following assumptions:

1. When the motor system selects a visuo-motor strategy, it in effect imposes a probability density on the space of possible movement trajectories that could occur once the motor strategy is executed. This probability density is likely affected by the goal of the movement, the planned duration, the possibility of visual feedback during the movement, previous training, and intrinsic uncertainty in the motor system (see, for example, Tassinari et al., 2006; Dean et al., 2007). We emphasize that the consequences for the subject are completely mediated through this probability density, and we can, for the most part, ignore the details of the actual mechanisms that produce and steer the action.

2. Whatever the penalty structure of the task, the penalty incurred by the subject depends only on the motion trajectory that actually occurs.

3. The subject acts so as to produce maximum expected gain (MEG) as computed from the magnitude of each possible reward and penalty and the probability of incurring it.

According to this model, the goal of movement planning is to select an optimal visuo-motor movement strategy (i.e. a movement plan) that specifies a desired movement trajectory. In this model, the optimal movement strategy is the one that maximizes expected gain. The model takes into account explicit gains associated with the possible outcomes of the movement, the mover’s own task-relevant variability, and costs associated with the time limits imposed on the mover.

For the conditions of this experiment, the scene is divided into four regions (Figure 8.4a): the reward-only region $R_1$ with gain $G_1$ (see footnote 1), the overlap region $R_2$ with gain $G_2$, the penalty-only region $R_3$ with gain $G_3$, and the background region $R_4$ with gain $G_4 = 0$. We define an optimal visuo-motor strategy $S$ as one that maximizes the subject’s expected gain

$$\Gamma(S) = \sum_{i=1}^{4} G_i \, P(R_i \mid S) + G_{\text{timeout}} \, P(\text{timeout} \mid S). \tag{8.1}$$

Here, $P(R_i \mid S)$ is the probability, given a particular choice of strategy $S$, of reaching region $R_i$ before the time limit ($t = \text{timeout}$) has expired,

[Figure 8.3 graphic. Timeline labels: Button press, Stimulus onset, 400–600 ms, max 700 ms. On-screen values: +100, −500, and “Too late!” −700.]

FIGURE 8.3 A movement task equivalent to decision making under risk. Subjects were required to touch a computer screen within a short period of time (e.g. 700 ms). Subjects won 100 points by hitting inside the green circle and lost 500 points by hitting inside the red circle. Subjects did not win or lose points by hitting the background, as long as the movement was completed before the time limit, but a large penalty (700 points) was imposed for movements that were too slow. Because movements were rapid, they were also variable: the finger did not always land where the subject intended. As a result, it was in the best interest of the subject to aim to the right of the center of the green circle, compromising between the loss of points due to occasionally missing the green target circle and the loss of points due to occasionally landing in the red penalty circle.

MOVEMENT PLANNING AS OPTIMIZATION

Footnote 1: Here we refer to outcomes as gains, denoted $G_i$, with losses coded as negative gains. The term “expected gain” that we use corresponds exactly to expected value in the psychological and economic literature.


from Trommershäuser, Maloney, and Landy (2008)


8. THE EXPECTED UTILITY OF MOVEMENT

FIGURE 8.4 Equivalence between movement under risk and decision making under risk. (a) There are four regions in which the endpoint can land in the task outlined in Figure 8.3: reward-only (region with expected gain +100), reward–penalty overlap (region with expected gain −400), penalty-only (region with expected gain −500) and background (region with expected gain 0). (b) Sample endpoints for a subject aiming at the center of the green target (aim point indicated by the white diamond). This subject had a motor uncertainty of 5.6 mm (standard deviation); target and penalty circles have radii of 9 mm. The combination of motor uncertainty and aim point specifies the probability of landing in each region. This movement strategy yields an expected gain of −111.39 points/trial due to the high frequency of touching inside the red penalty circle. (c) Sample endpoints for the same subject aiming to the right of the target center to avoid accidental hits inside the penalty circle. The expected gain of +20.68 points/trial corresponds to the optimal strategy maximizing expected gain. (d) Expected gain for this subject as a function of mean movement endpoint. The maximum of this function corresponds to the aim point illustrated in (c).

[Figure 8.4 graphic, panels (a) to (d): target (+100) and penalty (−500) circles with regions R1 to R4 marked in (a); sample endpoints labeled −111.39 pts/trial in (b) and 20.68 pts/trial in (c); axes x (mm), y (mm), and Expected gain, with subject label S2, in (d).]

$$P(R_i \mid S) = \int_{R_i^{\text{timeout}}} P(\tau \mid S) \, d\tau, \tag{8.2}$$

where $R_i^{\text{timeout}}$ is the set of trajectories $\tau$ that pass through $R_i$ at some time after the start of the execution of the visuo-motor strategy and before the timeout. The task involves a penalty for not responding before the time limit ($G_{\text{timeout}}$). The probability that a visuo-motor strategy $S$ leads to a timeout is $P(\text{timeout} \mid S)$.

In our experiments, subjects win or lose points by touching the reward and penalty regions on the plane of the display before the timeout. Penalties and rewards depend only on the position of the endpoint in this plane, so a strategy $S$ can be identified with the mean endpoint on the plane $(\bar{x}, \bar{y})$ that results from adopting that strategy.

In most of our experiments, subjects’ movement variance was the same in the vertical and horizontal directions, indistinguishable from a bivariate Gaussian distribution (see Figure 8.4b, 8.4c for simulated data for two aim points), and remained stable throughout the experiment (see, for example, Trommershäuser et al., 2003a, 2005; Wu et al., 2006). Thus, we assume that the movement endpoints $(x, y)$ are distributed according to a spatially isotropic Gaussian distribution with standard deviation $\sigma$,

$$p(x, y \mid \bar{x}, \bar{y}, \sigma^2) = \frac{1}{2\pi\sigma^2} \exp\!\left(-\frac{(x - \bar{x})^2 + (y - \bar{y})^2}{2\sigma^2}\right). \tag{8.3}$$

The probability of hitting region Ri is then

$$P(R_i \mid \bar{x}, \bar{y}, \sigma^2) = \int_{R_i} p(x, y \mid \bar{x}, \bar{y}, \sigma^2) \, dx \, dy. \tag{8.4}$$

In our experiments, the probability of a timeout is effectively constant over the limited range of relevant screen locations so, for a given endpoint variance $\sigma^2$, finding an optimal movement strategy corresponds to choosing a strategy with mean aim point $(\bar{x}, \bar{y})$ that maximizes

$$\Gamma(\bar{x}, \bar{y}) = \sum_{i=1}^{4} G_i \, P(R_i \mid \bar{x}, \bar{y}, \sigma^2). \tag{8.5}$$

The maximum of $\Gamma(\bar{x}, \bar{y})$ corresponds to the strategy maximizing expected gain, and depends on the position and magnitude of the penalty and on the distribution of the subject’s endpoints (Figure 8.4d). When the penalty is zero, the aim point maximizing expected gain (and hence the mean endpoint maximizing expected gain) is the center of the target region. When the penalty is non-zero, the aim point maximizing expected gain shifts away from the penalty region and, therefore, away from the center of the target. This optimal shift is larger for greater penalties, for penalty regions closer to the target, and for larger magnitudes of motor variability.
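The shift of the best aim point can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it approximates the sum in equation (8.5) by adding up the Gaussian density of equation (8.3) over a grid. The 5.6 mm standard deviation, 9 mm radii, and +100/−500 gains come from the Figure 8.4 caption; the penalty circle centred one radius to the left of the target, the grid step, the candidate aim points, and all function names are assumptions made for this example, so its numbers are not expected to reproduce the figure.

```python
import math

def expected_gain(aim_x, aim_y, sigma, circles, step=0.5, span=4.0):
    """Approximate sum_i G_i * P(R_i | aim, sigma^2) on a square grid.

    circles: list of (centre_x, centre_y, radius, gain). Gains add where
    circles overlap, so +100 and -500 give -400 in the overlap region.
    The timeout term is left out because it is treated as constant.
    """
    n = int(span * sigma / step)
    norm = 1.0 / (2.0 * math.pi * sigma ** 2)
    total = 0.0
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            dx, dy = i * step, j * step
            density = norm * math.exp(-(dx * dx + dy * dy) / (2.0 * sigma ** 2))
            x, y = aim_x + dx, aim_y + dy
            gain = sum(g for cx, cy, r, g in circles
                       if (x - cx) ** 2 + (y - cy) ** 2 <= r ** 2)
            total += gain * density * step * step
    return total

def best_aim_x(sigma, circles, candidates):
    """Return the candidate horizontal aim point with the highest expected gain."""
    return max(candidates, key=lambda ax: expected_gain(ax, 0.0, sigma, circles))

SIGMA_MM = 5.6                              # from the Figure 8.4 caption
TARGET = (0.0, 0.0, 9.0, 100.0)             # radius and gain from the caption
PENALTY = (-9.0, 0.0, 9.0, -500.0)          # position is an assumption
NO_PENALTY = (-9.0, 0.0, 9.0, 0.0)          # same circle with zero penalty
AIM_CANDIDATES = [k * 0.5 for k in range(-8, 33)]   # -4 mm to 16 mm

shifted_aim = best_aim_x(SIGMA_MM, [TARGET, PENALTY], AIM_CANDIDATES)
centred_aim = best_aim_x(SIGMA_MM, [TARGET, NO_PENALTY], AIM_CANDIDATES)
print(f"best aim with penalty:    x = {shifted_aim:.1f} mm")
print(f"best aim without penalty: x = {centred_aim:.1f} mm")
```

With the penalty circle on the left, the best aim point moves to the right of the target centre; with a zero penalty it stays at the centre, as the text describes.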

For all conditions, we compared subjects’ mean endpoints with those of a movement planner that maximized expected gain by taking into account its own task-relevant variability. Once we had measured the task-relevant variability for each subject and for each condition, our model yielded parameter-free predictions of behavior maximizing expected gain for all experimental conditions against which subject behavior could be compared.

The subjects in our experiments chose strategies maximizing expected gain (MEG), or nearly so (Figure 8.5). Efficiency was defined as the amount of money won relative to the amount of money expected for a subject who used the strategy maximizing expected gain. Subjects’ efficiencies were typically above 90% (Trommershäuser et al., 2003a, 2003b, 2005, 2006a; Dean et al., 2007). Subjects chose visuo-motor strategies that came close to maximizing gain in a wide variety of simple stimulus configurations, in good agreement with the predictions for the subject maximizing expected gain (Trommershäuser et al., 2003a, 2003b, 2005, 2006a, 2006b; Gepshtein et al., 2007; Stritzke and Trommershäuser, 2007).

The experiments just described focused on spatial movement uncertainty and its consequences for behavior. Time played a role, but only in the time limit imposed on completion of movements to the target area. More recent experiments focus on compensation for temporal uncertainty and, more generally, the allocation of available time. Hudson et al. (2008) carried out experiments analogous to Trommershäuser et al. (2003a, 2003b), but with subjects rewarded for making touching movements that arrived at a target within a specified time window. If the subject missed the target or arrived outside of the time window, no reward was given. In different experimental conditions the subject could also be penalized for arriving early or late, as summarized in Figure 8.6a. Each of the bars is a time line, and the reward window is colored green. Arriving at times colored in red incurred a penalty, and arriving at times colored neither red nor green incurred no penalty and also no reward. The four reward/penalty conditions in Figure 8.6a were blocked and subjects were informed about the reward/penalty structure; they saw a display similar to the time bars in Figure 8.6a. The challenge for the subject was to compensate for the subject’s own temporal movement uncertainty. Figure 8.6b illustrates the one-dimensional computation of expected gain as a function of the temporal aim point selected by the subject. The computation is analogous to that described by Trommershäuser et al. (2003a, 2003b), but is now computed in one temporal dimension rather than two spatial dimensions. One difference between the spatial experiments and the temporal experiment of Hudson and colleagues is that temporal movement uncertainty increases with duration of the movement. Hudson et al. found that subjects chose temporal aim points in good agreement with those predicted to maximize expected gain in each condition (Figure 8.6c). Moreover, each subject compensated for the increase in timing uncertainty with movements of longer duration.
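The one-dimensional version of the computation can be sketched in a few lines. This is an illustration under stated assumptions rather than the model of Hudson et al.: arrival times are taken to be Gaussian around the temporal aim point, the standard deviation is assumed to grow in proportion to the planned duration (the text only says that it increases), and the window boundaries, gains, and noise fraction are hypothetical values chosen for the example.

```python
from statistics import NormalDist

def expected_gain_time(aim_ms, windows, noise_fraction=0.05):
    """Expected gain for a planned arrival time, in one temporal dimension.

    windows: list of (start_ms, end_ms, gain). The arrival time is modelled
    as Gaussian with mean aim_ms and standard deviation noise_fraction * aim_ms
    (an assumed form of "uncertainty increases with duration").
    """
    arrival = NormalDist(mu=aim_ms, sigma=noise_fraction * aim_ms)
    return sum(gain * (arrival.cdf(end) - arrival.cdf(start))
               for start, end, gain in windows)

def best_aim_time(windows, candidates):
    """Return the candidate aim time with the highest expected gain."""
    return max(candidates, key=lambda t: expected_gain_time(t, windows))

REWARD_WINDOW = (650.0, 700.0, 100.0)       # hypothetical reward window
LATE_PENALTY = (700.0, 750.0, -500.0)       # hypothetical penalty for arriving late
TIME_CANDIDATES = range(500, 801)           # candidate aim times in ms

aim_reward_only = best_aim_time([REWARD_WINDOW], TIME_CANDIDATES)
aim_late_penalty = best_aim_time([REWARD_WINDOW, LATE_PENALTY], TIME_CANDIDATES)
print("best aim, reward only:      ", aim_reward_only, "ms")
print("best aim, with late penalty:", aim_late_penalty, "ms")
```

Adding a penalty for late arrival moves the best temporal aim point earlier, in the same way that a spatial penalty moves the spatial aim point away from the penalty circle.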

Dean et al. (2007) studied how well subjects traded off speed and accuracy in attempting to hit targets whose value rapidly diminished over time. Once the

[Figure 8.5 graphic: measured mean endpoint plotted against optimal endpoint (both axes 0 to 8), with separate symbols for subjects S1 to S5 and for left and right target positions; scale bar 2 mm; penalty −500, reward +100.]

FIGURE 8.5 Movement strategies during rapid movement under risk. Measured mean movement endpoint as a function of optimal endpoint maximizing expected gain (data reprinted from Trommershäuser et al., 2003b) in the task illustrated in Figure 8.3. Five subjects participated. There were six different conditions corresponding to green target circles located to the left (open symbols) or right (filled symbols) of the red penalty circle at each of three different target-penalty distances. The data points fall close to the diagonal identity line, indicating that subjects chose aim points close to those maximizing expected gain based on their individual movement variances.



key features

• environment: simple button arrangement, some with adjacent buttons, some without.

• bounds: targeting variance; size and location of buttons; cut-off duration.

• utility = hits - penalties

• These imply a set of strategies...

[Figure: two plots of targeting endpoints, each with X−offset (−5 to 5) on the horizontal axis and Density (0.0 to 0.5) on the vertical axis; one panel shows a low variance strategy and the other a high variance strategy.]

strategy choice

• the strategy space is at least two-dimensional

• how fast to perform (implying a level of noise)

• where to aim

• which strategy an individual chooses will depend on

• utility (e.g. how much they value time),

• bounds (their own motor noise), and

• environment (the noise in the environment).

probability density functions (p.d.f.)

• Are crucial to understanding human behavior.

• Consider the simple pdf for throwing two dice.

• We will see a lot more pdfs in the next two lectures.
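The two-dice example can be worked out by enumerating all 36 equally likely outcomes. The sketch below is only an illustration; the function name is made up for this example. Because the total of two dice is a discrete quantity, what it computes is strictly a probability mass function, the discrete counterpart of a pdf.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def two_dice_distribution(sides=6):
    """Probability of each total when two fair dice are thrown."""
    counts = Counter(a + b for a, b in product(range(1, sides + 1), repeat=2))
    outcomes = sides * sides
    return {total: Fraction(count, outcomes)
            for total, count in sorted(counts.items())}

dice_dist = two_dice_distribution()
for total, prob in dice_dist.items():
    # One '#' per outcome out of 36 gives a quick text histogram.
    print(f"{total:2d}  {str(prob):>5}  {'#' * int(prob * 36)}")
```

The histogram peaks at a total of 7 and falls away symmetrically on both sides, and the probabilities sum to 1.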

human decision making

Signal Detection Theory

• Nearly all human capabilities, including vision, take place in the presence of some uncertainty.

• As we have seen, uncertainty about vision increases with eccentricity (distance from the fovea).

• Uncertainty about motor targeting increases with speed of action and has consequences for aiming strategies.

• Here are some other examples:

http://www.cns.nyu.edu/~david/handouts/sdt/sdt.html

purchasing uncertainty

• You may be personally uncertain about the quality of a book.

• You might attempt to reduce your uncertainty by looking at the average customer rating.

• The average rating is itself a function of uncertainty.

• In this case the average is a function of a distribution of customer opinion.

Kidney tumor CT-scan

• CT-scan or “computed tomography” is a type of x-ray that produces images in slices.

• Imagine that we are trying to build a new app so that doctors can view CT-scans on a tablet or a smartphone.

• On the left is a scan of a kidney tumor. This one is quite pronounced; compare it with the right kidney.

• But the smartphone provides a much smaller image of the scan than other types of display.

• How much harder is the tumor to detect?

• David C. Dugdale, III, MD, Professor of Medicine, Division of General Medicine, Department of Medicine, University of Washington School of Medicine; Read more: http://www.umm.edu/imagepages/1168.htm#ixzz27rhUeKUo

monitor

tablet

phone

an empirical test

• we might get 100 CT-scans of kidneys with tumors and 100 without, and see how well doctors discriminate them when they are displayed on each display type.

Interpreting scans is hard

• Interpreting CT scans is hard. The task is so hard, there is always some uncertainty as to what is there and what is not there.

• Either there is a tumor (signal present) or there is not (signal absent).

• The doctor either thinks that they see a tumor (they respond “yes”) or does not (they respond “no”).

• These kinds of decisions can be understood with Signal Detection Theory.

Signal Detection Theory

• ... is a means of deciding whether a signal, e.g. the tumor, is present or absent given noisy input.

• There are four possible outcomes:

• hit (tumor present and doctor says “yes”)

• miss (tumor present and doctor says “no”)

• false alarm (tumor absent and doctor says “yes”)

• correct rejection (tumor absent and doctor says “no”)

• Graphically...

given the stimulus there are two components to generating the perceiver’s response...

Information acquisition and criterion

• There are two components to the decision making process: information acquisition and criterion.

• Information Acquisition. This is about acquiring information from the scan. Healthy organs have a characteristic shape and shade. But even so, some benign growths will look like tumors, some normal organs will look as though they may be cancerous, and some cancers will look like healthy organs.

• Criterion. This is the clinician’s strategy. A tumor looks darker, but how dark does an area on the image have to be before it is inferred that a tumor may be present? If the criterion/threshold is “just a little bit dark” then perhaps lots of growths will be incorrectly diagnosed as cancers. If the criterion is “very dark” then perhaps lots of cancers will be missed.

• You might say “be extra safe!” And you’d be right, but all medical decisions have costs as well as benefits. A “tumor” diagnosis might imply the need for invasive and risky surgery, even if only for further tests.

evidence for a tumor

criterion

environment, bounds, and utility

• Two doctors with access to exactly the same information (environment) and knowledge (bounds) may make different decisions because they use different criteria (utility).

• The criteria encode the utility.

noise

• There are two kinds of uncertainty both due to noise:

• External noise: e.g. part of the photographic process. A CT-scan produces an image with a certain resolution. fMRI scans have much lower resolution. We can design technologies that reduce external noise but some noise is inevitable.

http://physicsworld.com/cws/article/news/2010/mar/05/superconductors-could-simulate-the-brain

internal noise

• Internal noise: Neural responses are noisy. Imagine that we could stick an electrode into your brain and monitor the electrical signal generated by a particular CT-scan image.

• You might get a reading of 20 spikes per second. Later, you might get the same clinician to look at the same scan again and measure the same neuron. This time you might measure a firing rate of 40 spikes per second.

neuron in a mouse brain

internal response

• This firing rate, actually the firing rate over a cluster of neurons, is the internal response to the image.

• We are interested in the level of internal response that is required for the clinician to decide that there is a tumor present.

• We can model this with probability density functions.

probability density function

• The horizontal axis is a measure of the internal response (e.g. neural firing rate).

• The vertical axis represents the probability of that firing rate for tumor absent and tumor present CT-scans.


Figure 1: Internal response probability of occurrence curves for noise-alone and for signal-plus-noise trials.

Just to be really concrete, we could mark the horizontal axis in units of firing rate (10, 20, 30, ..., etc. spikes per second). This would mean that on a noise-alone (no tumor) trial, it is most likely that the internal response would be 10 spikes per second. It is also rather likely that the internal response would be 5 or 15 spikes per second. But it is very unlikely that the internal response would be 25 spikes per second when no tumor is present. Because I want to remain noncommittal about what and where in the brain the internal response is, I did not label the horizontal axis in terms of firing rates. The internal response is in some unknown, but quantifiable, units.

The role of the criterion: Perhaps the simplest strategy that the doctor can adopt is to pick a criterion location along the internal response axis. Whenever the internal response is greater than this criterion they respond "yes". Whenever the internal response is less than this criterion they respond "no".

An example criterion is indicated by the vertical lines in Figure 2. The criterion line divides the graph into four sections that correspond to: hits, misses, false alarms, and correct rejections. On both hits and false alarms, the internal response is greater than the criterion, because the doctor is responding "yes". Hits correspond to signal-plus-noise trials when the internal response is greater than criterion, as indicated in the figure. False alarms correspond to noise-alone trials when the internal response is greater than criterion, as indicated in the figure.

Figure 2: Internal response probability of occurrence curves for noise-alone and signal-plus-noise trials. Since the curves overlap, the internal response for a noise-alone trial may exceed the internal response for a signal-plus-noise trial. Vertical lines correspond to the criterion response.

probability density function

• It is very likely that the internal response when the tumor is present will be 15 units.

• It is highly unlikely that the internal response will ever be 26 units and it is almost never 0.

• A response of 12.5 units is equally likely to be from either distribution.
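These statements can be checked with two Gaussian curves. The sketch below assumes a noise-alone (tumor absent) curve centred on 10 units and a tumor-present curve centred on 15 units, as in the figure; the shared standard deviation of 5 units is an assumption made for the example, since the slide gives no value for the spread.

```python
from statistics import NormalDist

# Means follow the figure; the standard deviation of 5 is an assumed value.
tumor_absent = NormalDist(mu=10.0, sigma=5.0)
tumor_present = NormalDist(mu=15.0, sigma=5.0)

for response in (0.0, 10.0, 12.5, 15.0, 26.0):
    print(f"response {response:5.1f}: "
          f"absent {tumor_absent.pdf(response):.4f}, "
          f"present {tumor_present.pdf(response):.4f}")
```

With equal spreads, the two curves cross exactly halfway between the two means, which is why a response of 12.5 units is equally likely under either distribution.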









the role of the criterion

• The internal response and its pdf are determined by internal and external noise.

• Signal Detection Theory models the clinician’s strategy as a criterion.

• Whenever the internal response is higher than the criterion then the clinician says “tumor present” and whenever the response is below the criterion then the clinician says “tumor absent”.
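The criterion rule and the four outcomes can be put together in a small simulation. This is a sketch under assumptions, not a model of real clinicians: internal responses are drawn from Gaussian curves with means of 10 (tumor absent) and 15 (tumor present) and an assumed spread of 5, and the 100 scans per class follow the empirical test suggested earlier. The function names are made up for the example.

```python
import random
from collections import Counter

def classify(tumor_present, internal_response, criterion):
    """Apply the criterion rule and name the resulting outcome."""
    says_yes = internal_response > criterion
    if tumor_present:
        return "hit" if says_yes else "miss"
    return "false alarm" if says_yes else "correct rejection"

def simulate(n_per_class, criterion, seed=0):
    """Count outcomes for n_per_class tumor scans and n_per_class healthy scans."""
    rng = random.Random(seed)
    outcomes = Counter()
    for _ in range(n_per_class):
        # Assumed internal-response distributions (means 15 and 10, spread 5).
        outcomes[classify(True, rng.gauss(15.0, 5.0), criterion)] += 1
        outcomes[classify(False, rng.gauss(10.0, 5.0), criterion)] += 1
    return outcomes

outcome_counts = simulate(n_per_class=100, criterion=12.5)
print(dict(outcome_counts))
print("hit rate:", outcome_counts["hit"] / 100)
print("false alarm rate:", outcome_counts["false alarm"] / 100)
```

Rerunning `simulate` with a lower or higher `criterion` shows the trade-off directly: both the hit count and the false alarm count rise or fall together.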











Suppose that the doctor chooses a low criterion (Figure 3, top), so that they respond "yes" to almost everything. Then they will never miss a tumor when it is present and they will therefore have a very high hit rate. On the other hand, saying "yes" to almost everything will greatly increase the number of false alarms (potentially leading to unnecessary surgeries). Thus, there is a clear cost to increasing the number of hits, and that cost is paid in terms of false alarms. If the doctor chooses a high criterion (Figure 3, bottom) then they respond "no" to almost everything. They will rarely make a false alarm, but they will also miss many real tumors.

Figure 3: Effect of shifting the criterion

Notice that there is no way that the doctor can set their criterion to achieve only hits and no false alarms. The message that you should be taking home from this is that it is inevitable that some mistakes will be made. Because of the noise it is simply a true, undeniable, fact that the internal responses on noise-alone trials may exceed the internal responses on signal-plus-noise trials, in some instances. Thus a doctor cannot always be right. They can adjust the kind of errors that they make by manipulating their criterion, the one part of this diagram that is under their control.

The Receiver Operating Characteristic

We can describe the full range of the doctor's options in a single curve, called an ROC curve, which stands for receiver-operating characteristic. The ROC curve captures, in a single graph, the various alternatives that are available to the doctor as they move their criterion to higher and lower levels.

ROC curves (Figure 4) are plotted with the false alarm rate on the horizontal axis and the hit rate on the vertical axis. The figure shows several different ROC curves, each corresponding to a different signal strength. Just pay attention to one of them (the curve labeled d'=1) for the time being. We already know that if the criterion is very high, then both the false alarm rate and the hit rate will be very low, putting you somewhere near the lower left corner of the ROC graph. If the criterion is very low, then both the hit rate and the false alarm rate will be very high, putting you somewhere near the upper right corner of the graph. For an intermediate choice of criterion, the hit rate and false alarm rate will take on intermediate values. The ROC curve characterizes the choices available to the doctor. They may set the criterion anywhere, but any choice that they make will land them with a hit and false alarm rate somewhere on the ROC curve. Notice also that for any reasonable choice of criterion, the hit rate is always larger than the false alarm rate, so the ROC curve is bowed upward.
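An ROC curve of this kind can be traced by sweeping the criterion and recording the two rates at each setting. The sketch below assumes equal-variance Gaussian curves in units where the spread is 1, so that the separation between the means equals d'; the function name and the range of criteria are choices made for the example.

```python
from statistics import NormalDist

def roc_points(d_prime, criteria):
    """Return (false alarm rate, hit rate) pairs, one per criterion.

    Assumes noise-alone responses ~ N(0, 1) and signal-plus-noise
    responses ~ N(d_prime, 1); a "yes" is any response above the criterion.
    """
    noise = NormalDist(0.0, 1.0)
    signal = NormalDist(d_prime, 1.0)
    return [(1.0 - noise.cdf(c), 1.0 - signal.cdf(c)) for c in criteria]

criteria_sweep = [c / 2 for c in range(-6, 9)]     # criteria from -3.0 to 4.0
roc_d1 = roc_points(1.0, criteria_sweep)
for criterion, (fa_rate, hit_rate) in zip(criteria_sweep, roc_d1):
    print(f"criterion {criterion:4.1f}: false alarms {fa_rate:.3f}, hits {hit_rate:.3f}")
```

Low criteria land near the upper right corner, high criteria near the lower left, and for d' = 1 the hit rate exceeds the false alarm rate at every criterion, which is the upward bow described above.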

the effect of shifting the criterion

• There is a clear cost to increasing the number of hits. More false alarms might mean more unnecessary stress and unnecessary surgery.

• On the other hand, reducing false alarms also reduces the number of hits.

• It is impossible for the doctor to set a criterion that achieves only hits. In other words, it is inevitable that some mistakes will be made. Diagnosis errors cannot be avoided.

• However, errors are systematic and, with signal detection theory, can be understood and their rate predicted.

• Also, their costs can be minimized if training and experience are used to set an optimal criterion.

• And, design of technology can decrease external noise, and increase discriminability, giving higher hit rates for any given level of false alarm.
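What an optimal criterion means can be made concrete by attaching a cost to each kind of error and searching for the criterion with the lowest expected cost. The sketch below is illustrative only: the Gaussian curves (means 10 and 15, spread 5), the prior probability of a tumor, and the cost values are all hypothetical, and correct decisions are assumed to cost nothing.

```python
from statistics import NormalDist

def expected_cost(criterion, p_tumor, cost_miss, cost_false_alarm, noise, signal):
    """Expected cost per scan for a given criterion (correct decisions cost 0)."""
    p_miss = signal.cdf(criterion)                 # tumor present, response below criterion
    p_false_alarm = 1.0 - noise.cdf(criterion)     # tumor absent, response above criterion
    return (p_tumor * cost_miss * p_miss
            + (1.0 - p_tumor) * cost_false_alarm * p_false_alarm)

def best_criterion(p_tumor, cost_miss, cost_false_alarm,
                   noise=NormalDist(10.0, 5.0), signal=NormalDist(15.0, 5.0)):
    """Search a grid of criteria (step 0.1) for the lowest expected cost."""
    candidates = [c / 10 for c in range(-100, 401)]
    return min(candidates,
               key=lambda c: expected_cost(c, p_tumor, cost_miss,
                                           cost_false_alarm, noise, signal))

balanced_criterion = best_criterion(p_tumor=0.5, cost_miss=1.0, cost_false_alarm=1.0)
cautious_criterion = best_criterion(p_tumor=0.5, cost_miss=10.0, cost_false_alarm=1.0)
print("equal costs:        criterion =", balanced_criterion)
print("misses cost 10x as much: criterion =", cautious_criterion)
```

When a miss is made more costly than a false alarm, the lowest-cost criterion moves lower, so the clinician says “tumor present” more readily and accepts more false alarms in exchange for fewer misses.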

discrimination d’

• one way to think of discrimination is as a measure of how easily hits and false alarms can be separated.

• d’ = separation / spread



Figure 4: Internal response probability of occurrence curves and ROC curves for different signal strengths. When the signal is stronger there is less overlap in the probability of occurrence curves, and the ROC curve becomes more bowed.

The role of information: Acquiring more information makes the decision easier. Running another test (e.g., MRI) can be used to acquire more information about the presence or absence of a tumor. Unfortunately, the radiologist does not have much control over how much information is available.

In a controlled perception experiment the experimenter has complete control over how much information is provided. Having this control allows for quite a different sort of outcome. If the experimenter chooses to present a stronger stimulus, then the subject's internal response strength will, on the average, be stronger. Pictorially, this will have the effect of shifting the probability of occurrence curve for signal-plus-noise trials to the right, a bit further away from the noise-alone probability of occurrence curve.

Figure 4 shows two sets of probability of occurrence curves. When the signal is stronger there is more separation between the two probability of occurrence curves. When this happens the subject's choices are not so difficult as before. They can pick a criterion to get nearly a perfect hit rate with almost no false alarms. ROC curves for stronger signals bow out further than ROC curves for weaker signals. Ultimately, if the signal is really strong (lots of information), then the ROC curve goes all the way up to the upper left corner (all hits and no false alarms).

Varying the noise: There is another aspect of the probability of occurrence curves that also determines detectability: the amount of noise. Less noise reduces the spread of the curves. For example, consider the two probability of occurrence curves in Figure 5. The separation between the peaks is the same but the second set of curves are much skinnier. Clearly, the signal is much more discriminable when there is less spread (less noise) in the probability of occurrence curves. So the subject would have an easier time setting their criterion in order to be right nearly all the time.

the effect of good design

• the effect of good design is to increase discrimination.

• bad design reduces discrimination.









the effect of more information

• the effect of more information can also be to increase discrimination.

• though there are limits.











Figure 5: Internal response probability of occurrence curves for two different noise levels. When the noise is greater, the curves are wider (more spread) and there is more overlap.

In reality, we have no control over the amount of internal noise. But it is important to realize that decreasing the noise has the same effect as increasing the signal strength. Both reduce the overlap between the probability of occurrence curves.

Discriminability index (d'): Thus, the discriminability of a signal depends both on the separation and the spread of the noise-alone and signal-plus-noise curves. Discriminability is made easier either by increasing the separation (stronger signal) or by decreasing the spread (less noise). In either case, there is less overlap between the probability of occurrence curves. To write down a complete description of how discriminable the signal is from no-signal, we want a formula that captures both the separation and the spread. The most widely used measure is called d-prime (d' ), and its formula is simply:
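The equivalence between less noise and a stronger signal can be checked by fixing the false alarm rate and comparing the hit rates that result. The sketch below assumes equal-variance Gaussian curves; the separations, spreads, and the 10% false alarm rate are hypothetical values, and the function name is made up for the example.

```python
from statistics import NormalDist

def hit_rate_at(false_alarm_rate, separation, spread):
    """Hit rate obtained when the criterion is set to give a chosen false alarm rate."""
    noise = NormalDist(0.0, spread)
    signal = NormalDist(separation, spread)
    criterion = noise.inv_cdf(1.0 - false_alarm_rate)
    return 1.0 - signal.cdf(criterion)

baseline_hits = hit_rate_at(0.10, separation=5.0, spread=5.0)
stronger_signal_hits = hit_rate_at(0.10, separation=10.0, spread=5.0)
less_noise_hits = hit_rate_at(0.10, separation=5.0, spread=2.5)
print(f"baseline:        {baseline_hits:.3f}")
print(f"stronger signal: {stronger_signal_hits:.3f}")
print(f"less noise:      {less_noise_hits:.3f}")
```

Doubling the separation and halving the spread give the same hit rate at the same false alarm rate, because in both cases the ratio of separation to spread is doubled.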

d' = separation / spread

This number, d', is an estimate of the strength of the signal. Its primary virtue, and the reason that it is sowidely used, is that its value does not depend upon the criterion the subject is adopting, but instead it is atrue measure of the internal response.

Estimating d': To recap... Increasing the stimulus strength separates the two (noise-alone versus signal-plus-noise) probability of occurrence curves. This has the effect of increasing the hit and correct rejectionrates. Shifting to a high criterion leads to fewer false alarms, fewer hits, and fewer surgical procedures.Shifting to a low criterion leads to more hits (lots of worthwhile surgeries), but many false alarms(unnecessary surgeries) as well. The discriminability index, d', is a measure of the strength of the internalresponse that is independent of the criterion.

But how do we measure d'? The trick is that we have to measure both the hit rate and the false alarm rate,then we can read-off d' from an ROC curve. Figure 4 shows a family of ROC curves. Each of these curvescorresponds to a different d-prime value; d'=0, d'=1, etc. As the signal strength increases, the internalresponse increases, the ROC curve bows out more, and d' increases.

So let's say that we do a detection experiment; we ask our doctor to detect tumors in 1000 CT scans. Some of these patients truly had tumors and some of them didn't. We only use patients who have already had surgery (biopsies) so we know which of them truly had tumors. We count up the number of hits and false alarms. And that drops us somewhere on this plot, on one of the ROC curves. Then we simply read off the d' value corresponding to that ROC curve. Notice that we need to know both the hit rate and the false alarm rate to get the discriminability index, d'.
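Instead of reading d' off a plot, it can be computed directly from the two rates. The sketch below uses the standard equal-variance Gaussian result d' = z(hit rate) − z(false alarm rate), where z is the inverse of the standard normal cumulative distribution. The equal-variance assumption is ours, and the counts are hypothetical, chosen only to add up to 1000 scans; they are not data from any study.

```python
from statistics import NormalDist


def d_prime_from_counts(hits, misses, false_alarms, correct_rejections):
    """Estimate d' from a 2x2 table of outcomes (equal-variance Gaussian model assumed)."""
    hit_rate = hits / (hits + misses)
    false_alarm_rate = false_alarms / (false_alarms + correct_rejections)
    # inv_cdf raises an error for rates of exactly 0 or 1, so this sketch
    # only handles tables in which every cell is non-zero.
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)


# Hypothetical outcome of 1000 scans: 500 with tumors, 500 without.
example = d_prime_from_counts(hits=420, misses=80,
                              false_alarms=100, correct_rejections=400)
print(f"d' = {example:.2f}")
```

The function needs both rates: the hit rate alone cannot distinguish a sensitive observer from one who simply says "yes" often.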

Medical Malpractice Example: A study of doctors' performance was performed in Boston. 10,000 cases

different sampling distributions of face models are included. We wished, first, to extract from this representation the correlations between hit and false alarm rates that one would obtain when these different distributions underlie the face models included in a standard face recognition study.

Exhaustive distribution simulations

Baseline statistics

In this first analysis, we did the simplest thing possible. We correlated the hit and false alarm rates for all of the face models in the matrix. This is an exhaustive sample of the faces. This sample yields a correlation of .589 between hit and false alarm rate. The pattern of covariance is illustrated in Figure 3 and is simply a representation of a complete set of ROC curves for the d’ and C ranges considered.

One dimensional variation of face sampling distribution

In this exercise, we simply divided up Figure 3 into the parts caused by variation in d’ and the parts caused by variation in C.
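The idea of separating variation in d’ from variation in C can be illustrated with a toy simulation. This is a sketch under our own assumptions, not the analysis from the paper: it uses the common equal-variance parameterisation in which hit rate = Φ(d’/2 − C) and false alarm rate = Φ(−d’/2 − C), and the d’ and C ranges are arbitrary, so it will not reproduce the .589 figure.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal cumulative distribution


def rates(d_prime, c):
    """Hit and false alarm rates for one model (assumed parameterisation)."""
    return PHI(d_prime / 2 - c), PHI(-d_prime / 2 - c)


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def hit_fa_correlation(models):
    """Correlation between hit and false alarm rates over (d', C) pairs."""
    hits, false_alarms = zip(*(rates(d, c) for d, c in models))
    return pearson(hits, false_alarms)


# Arbitrary illustrative ranges.
D_PRIMES = [0.5, 1.0, 1.5, 2.0, 2.5]
CRITERIA = [-1.0, -0.5, 0.0, 0.5, 1.0]

vary_c_only = hit_fa_correlation([(1.5, c) for c in CRITERIA])
vary_d_only = hit_fa_correlation([(d, 0.0) for d in D_PRIMES])
print(f"criterion varies, d' fixed: r = {vary_c_only:.2f}")
print(f"d' varies, criterion fixed: r = {vary_d_only:.2f}")
```

In this toy model, variation in C alone makes hits and false alarms rise and fall together, while variation in d’ alone pushes them in opposite directions.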


FIG. 2. Schematic representation of the signal detection models created, with d’ increasing by row, and criterion becoming stricter by column.

As criterion increases there are fewer false alarms and fewer hits. As d-prime increases more hits can be achieved with fewer false alarms.

From O’Toole, Bartlett, Abdi, 2000

the normal distribution

• these probability density functions have a typical shape: They are normal or Gaussian.

the receiver operating characteristic

• The ROC graph is a plot with the false alarm rate on the horizontal axis and the hit rate on the vertical axis.

• Each curve plotted on the ROC graph is for a different d’.

• Given a d’=1, say, the clinician can pick a criterion anywhere along the plotted curve.

• Good design increases the value of d’ until, with d’=4, it is possible to find a criterion (strategy) with the highest hit rate and lowest false alarm rate.
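The ROC curves described above can be generated from a single relation. As a sketch, assuming equal-variance Gaussian curves, the hit rate at a given false alarm rate is Φ(d’ + z(false alarm rate)), where Φ is the standard normal cumulative distribution and z is its inverse. The function name and the sampled false alarm rates are illustrative choices.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def hit_rate_on_roc(false_alarm_rate, d_prime):
    """Hit rate at a given false alarm rate on the ROC curve for this d'."""
    # Requires 0 < false_alarm_rate < 1; inv_cdf is undefined at the end points.
    z_false_alarm = STANDARD_NORMAL.inv_cdf(false_alarm_rate)
    return STANDARD_NORMAL.cdf(d_prime + z_false_alarm)


# One row per ROC curve; each column is a different choice of criterion.
for d_prime in (0, 1, 2, 4):
    points = ", ".join(
        f"({fa:.2f}, {hit_rate_on_roc(fa, d_prime):.2f})"
        for fa in (0.05, 0.25, 0.50, 0.75)
    )
    print(f"d' = {d_prime}: {points}")
```

With d’=0 the curve is the diagonal (hit rate equals false alarm rate); with d’=4 a criterion exists that gives a hit rate above 0.99 at a false alarm rate of 0.05.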

practical exercise

• Download the following paper and use it, and other papers, to help answer the following questions. Blog your answers and comment on the answers provided by others.

• Sorkin, R.D. & Woods, D.D. (1985). Systems with human monitors: a signal detection analysis. Human-Computer Interaction, 1, 49-75.

• How can Signal Detection Theory help interface designers of systems that aim to automate some aspect of human decision making?

• What is signal detection theory?

• What is automation?

• What does automation do to the human task?

• How should a pilot set their criterion for responding to a stall warning system? How should the stall warning system’s criterion be set?

• What strategies are available to a decision maker using Signal Detection Theory?

• What bounds are imposed by the neural information system?

• What form might a utility function take for using Signal Detection Theory?

references

• http://www.cns.nyu.edu/~david/handouts/sdt/sdt.html

• http://www.cs.bham.ac.uk/~howesa/Advanced-HCI-overheads3.key.pdf
