“Five variations on the theme: Software failure: avoiding the avoidable and living with the rest”

Variation 4: “Avoiding the avoidable”

Prof. Les Hatton
Computing Laboratory, University of Kent, Canterbury

Version 1.1: 18/Nov/2003
©Copyright, L. Hatton, 2003-



Slide 2

Introduction to software failure

In the absence of any technology for guaranteeing the absence of defects, engineers are always left with two obligations:
– When the system fails, it must have the least possible effect on the user relative to the benefit it provides.
– The nature of the failure must be sufficiently well diagnosed that its cause can be found and corrected quickly, so the failure does not recur.

These are both design issues.

Slide 3

Overview

• A case history in forensic analysis
• Some successful technologies
• The process v. product dilemma
• Why are we so bad at diagnosis?

Slide 4

Precedence in programming languages

A case history in forensic analysis and common mode failure

Slide 5

Precedence

Precedence in languages:
– Languages vary greatly in precedence complexity.
  • Ada83 has 4 levels.
  • Fortran 77 has 9 levels.
– The C-like languages (C, C++ and Java) are characterised by very complex precedence tables.
  • C has 15 levels.
  • Java and C++ have 17 levels.
– Scripting languages can be even worse:
  • Tcl has 12 levels, but Perl has 21 levels and PHP 23 levels.

Slide 6

Operator precedence levels

The 15 precedence levels of C, from highest to lowest:

Level  OPERATORS                                     Associativity
  1    ()  []  ->  .  ++  --                         Left to Right
  2    !  ~  ++  --  +  -  *  &  (type)  sizeof      Right to Left
  3    *  /  %                                       Left to Right
  4    +  -                                          Left to Right
  5    <<  >>                                        Left to Right
  6    <  <=  >  >=                                  Left to Right
  7    ==  !=                                        Left to Right
  8    &                                             Left to Right
  9    ^                                             Left to Right
 10    |                                             Left to Right
 11    &&                                            Left to Right
 12    ||                                            Left to Right
 13    ?:                                            Right to Left
 14    =  +=  -=  *=  /=  %=  &=  ^=  |=  <<=  >>=   Right to Left
 15    ,                                             Left to Right
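A minimal worked example (mine, not from the slides) of how the table drives a parse: + sits at a higher level than <<, so an unparenthesised mixture of the two shifts by more than the author probably intended.

    #include <stdio.h>

    int main(void)
    {
        /* + (level 4 above) binds more tightly than << (level 5), so the
           shift count is 2 + 1, not the sum of (x << 2) and 1. */
        unsigned x = 1;
        printf("%u\n", x << 2 + 1);     /* prints 8: parsed as x << (2 + 1) */
        printf("%u\n", (x << 2) + 1);   /* prints 5: the likely intention   */
        return 0;
    }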

Slide 7

Precedence

How we have responded:
– To attempt to ameliorate this, we introduce rules like:
  • “No expression shall rely on precedence”
  However, this rule is broken 4441 times in the ISO C validation suite (about one violation per 50 lines).
– All the C-like languages have inherited the same table and added to it.
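To see why such a blanket rule fires so often, here is an illustrative fragment (my example, not from the slides): even entirely idiomatic arithmetic "relies on precedence" and would be flagged.

    #include <stdio.h>

    int main(void)
    {
        int a = 2, b = 3, c = 4;

        /* Relies on precedence (* binds before +), yet is perfectly clear;
           a blanket "no reliance on precedence" rule still flags it.       */
        int y1 = a * b + c;

        /* The fully parenthesised form such a rule demands. */
        int y2 = (a * b) + c;

        printf("%d %d\n", y1, y2);      /* 10 10 */
        return 0;
    }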

Slide 8

Precedence

How we could respond forensically:
– Analyse failure modes such as the following (corrected versions are sketched after this list):
  • if ( flags & FLAG != 0 ) ...
  • while (c = getc(in) != EOF) ...
  • if ( (t=BTYPE(pt1->aty)==STRTY) || t == UNIONTY ) ...
  • if ( a < b < c ) ...
– Derive a failure rule
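Each of those failure modes mis-parses because of the precedence table on slide 6. What each line almost certainly intended is sketched below; the corrections are mine, not from the slides:

    /* flags & FLAG != 0  parses as  flags & (FLAG != 0), because !=
       binds more tightly than &.                                        */
    if ( (flags & FLAG) != 0 ) ...

    /* c = getc(in) != EOF  parses as  c = (getc(in) != EOF), so c
       receives 0 or 1 rather than the character read.                   */
    while ( (c = getc(in)) != EOF ) ...

    /* t = BTYPE(pt1->aty) == STRTY  assigns the result of the comparison
       to t, not the type code, so the later t == UNIONTY test cannot
       work as intended.                                                 */
    if ( (t = BTYPE(pt1->aty)) == STRTY || t == UNIONTY ) ...

    /* a < b < c  parses as  (a < b) < c, i.e. it compares 0 or 1 with c. */
    if ( a < b && b < c ) ...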

Slide 9

Precedence

How we could respond forensically. Replace …
  • “No expression shall rely on precedence”
… by …
  • “If a Boolean-valued sub-expression appears in a predicate next to a bit operator, an assignment operator or a relational operator, it is nearly always wrong.”

This rule is much more precise. Even in the ISO C validation suite, which is supposed to exercise everything, it fires only 11 times. In real code, it has a > 90% hit rate for real defects.
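A sketch of how the refined rule separates dangerous cases from harmless ones (the examples and the flagged / not flagged annotations are my illustration, not the output of Hatton's tooling):

    /* Flagged: the Boolean-valued sub-expression (FLAG != 0) appears
       next to the bit operator &.                                       */
    if ( flags & FLAG != 0 ) ...

    /* Flagged: the Boolean-valued sub-expression (getc(in) != EOF)
       appears next to the assignment operator =.                        */
    while ( c = getc(in) != EOF ) ...

    /* Not flagged: ordinary reliance on arithmetic precedence, with no
       Boolean sub-expression next to &, = or a relational operator.     */
    y = a * b + c;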

Slide 10

Overview

• A case history in forensic analysis
• Some successful technologies
• The process v. product dilemma
• Why are we so bad at diagnosis?

Slide 11

Inspections

Inspections find fault, not failure. Historically, they are categorised into:
– Design inspections (much harder to do because of poor standardisation of design issues)
– Code inspections

There is some evidence to suggest that code inspections find more problems, but design inspections find more serious problems.

Slide 12

Inspections

• Probably the best known are Fagan inspections, named after Michael Fagan of IBM. In essence a Fagan inspection contains:
  1. Planning
  2. Overview
  3. Individual preparation
  4. Program inspection
  5. Rework
  6. Re-inspection

A walk-through is less formalised, about half as effective, and contains only steps 4 and 5.

Slide 13

Inspection effectiveness – HP example

Code inspections are the most effective way of monitoring intrinsic quality. Consider the following rates of defects found per hour (Grady & Caswell, Hewlett-Packard):

[Chart: defects found per hour for live running, black-box testing, white-box testing and inspections; vertical scale 0 to 1.2.]

Slide 14

Inspections - applying control process feedback

• An inspection should measure the following:
– Lines per hour reviewed
– Defects per hour found
– Defects per hour missed (using hindsight)

It is known that inspection rates of more than 100 lines per hour, and around 400 lines per day, lead to a significant deterioration in effectiveness.
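As a throwaway sketch (the thresholds are the ones quoted above; the helper and its names are mine, purely illustrative), the control-process feedback amounts to little more than tracking these rates per inspection:

    #include <stdio.h>

    /* Hypothetical helper: warn when an inspection exceeds the rates
       quoted above (over 100 lines/hour or over 400 lines/day).        */
    static void check_inspection_rate(int lines_reviewed, double hours,
                                      int lines_today)
    {
        double rate = lines_reviewed / hours;
        printf("reviewed %.0f lines/hour\n", rate);
        if (rate > 100.0 || lines_today > 400)
            printf("warning: rate too high; effectiveness will deteriorate\n");
    }

    int main(void)
    {
        check_inspection_rate(300, 2.0, 300);   /* 150 lines/hour: warned */
        check_inspection_rate(150, 2.0, 150);   /*  75 lines/hour: OK     */
        return 0;
    }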

Slide 15

Inspections – lines per hour

These measurements are due to Roper (1999), made on OO systems, which have similar non-locality to embedded interrupt-driven systems.

[Chart: detection rate versus detection speed – normalised defects found (0 to 7) against inspection speed from 100 to 1200 LOC/hour.]

Slide 16

Inspections – lines per hour

These measurements are due to Humphrey (1995), made on 25 C++ projects.

[Chart: defects found – defect density found (0 to 80) against inspection speed in hundreds of LOC per hour (1 to 10).]

Slide 17

Inspections

• Code inspections inspect for:
– Style transgressions
– Standard transgressions
– Dangerous use of language
– Differences between requirements and behaviour

Of these, the first three can be automated using static deep-flow tools, greatly increasing the efficiency of inspections. Code inspectors should be given code from which the first three categories are absent.

Slide 18

Inspections

• Inspection yield =

      Total number of defects found in inspection
      -------------------------------------------
           Total number of defects ever found

What is a sensible value for this?
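A worked example with invented numbers (mine, purely illustrative): if inspections found 60 defects, and a further 20 were found later in unit testing and in the field, the yield is 60 / 80 = 75%.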

Slide 19

Defect discovery profile for effective inspections

% of all defects ever found and where found, SEMA (Gilb & Graham 1993)

[Chart: % of defects ever found (0 to 100) by life-cycle stage – code review, unit testing, field.]

Slide 20

Case histories of inspection yields

% of all defects found during Inspections

[Chart: % of all defects found during inspections (0 to 100) for three case histories – SEMA 1992, IBM Santa Teresa 1991 and IBM Rochester.]

Slide 21

Overview

• A case history in forensic analysis
• Some successful technologies
• The process v. product dilemma
• Why are we so bad at diagnosis?

Slide 22

Process v. Product Failure

• Process failure
– In the vast majority of cases: we either build the wrong thing, build the right thing too late, or fail to build anything at all
• Product failure
– Cessation
  • The product crashes
– Misleading results
  • The product gives misleading answers

Slide 23

To plan or not to plan?

“Planning is an unnatural process. It's much more fun to get on with it. The real benefit of not planning is that failure comes as a complete surprise and is not preceded by months of worry.”

Sir John Harvey-Jones

Slide 24

When the train of ambition pulls away from the platform of reality

Planning data from a grand ‘unified’ programming project (produced after the project seemed to be struggling).

Note that unify appears next to unintelligible in the OCD.

Unsuccessful project (abandoned)

[Chart: predicted days to completion (0 to 90) against the day on which the prediction was made (days 1 to 74).]

Slide 25

Ruthlessly controlling tasks

Project restarted with (far) less ambitious goals and tracked weekly, with results published on the staff notice board.

Successful project (about 10% overrun)

[Chart: predicted days to completion (0 to 160) against the day on which the prediction was made (days 1 to 176).]

Slide 26

Software Process

Points to consider:
– There is an inherent belief that a good process implies a good product
– So why is Linux so good?
  • Linux is categorically CMM level 1 (more shortly), so is the CMM wrong, or does Open Source development have important properties that we don't understand well yet?
  • Is the reliability of Linux incremental?

Slide 27

Process quality v. Product quality

[Diagram: the quality spectrum plotted against time – process quality (top-down: ISO 9001, CMM etc.) at one end and product quality (bottom-up: measurement) at the other.]

Slide 28

Software Process – the layman's guide to the CMM

A five level model developed on behalf of the US DoD at Carnegie-Mellon in the 80s and 90s:
– Level 1 (Initial – used to be called chaos)
– Level 2 (Repeatable)
– Level 3 (Defined)
– Level 4 (Managed)
– Level 5 (Optimised or Godlike)

There are around 50 groups at level 5 in the world, around half of them in India (who take software development a lot more seriously than we do). BUT … what about Linux?

Slide 29

The CMM levels

5 Optimised
4 Managed      (to reach 5: full statistical process control; metrics used for defect prevention)
3 Defined      (to reach 4: process database, process metrics collected and analysed; risks managed)
2 Repeatable   (to reach 3: process focus, software engineering process group; training programme throughout)
1 Chaotic      (to reach 2: project planning, tracking; software quality assurance, configuration management)

Slide 30

CMM statistics

• Who is where?
  – About 70% of all companies are believed to be at level 1
  – About 20% are believed to be at level 2
  – About 9% are believed to be at level 3
  – Around 50 organisations in the world are at level 5, of which about half are in India
  – Realistically, it takes around 18 months to move between levels
  – There is significant evidence that software development costs are reduced considerably as higher levels are achieved

Slide 31

The PC picture …

Mean Time Between Failures in hours of various operating systems

[Chart: MTBF in hours on a logarithmic scale (0.1 to 10,000) for Windows 95, Macintosh 7.5-8.1, NT 4.0, Linux and Sparc 4.1.3c.]

Slide 32

Similarities between Linux and Windows NT …

• Both operating systems are of comparable essential complexity
– Both are multi-tasking, multi-threaded, symmetric multi-processing systems
• There appears to be about a factor of 10 in non-essential complexity
– Windows NT is around 20 million lines
– Linux is around 2 million lines
• Both started around 1991

Slide 33

Differences between Linux and Windows

• Visibility
– Windows is proprietary
– Linux is open source and distributed under the GPL (GNU General Public License)

Slide 34

Differences between Linux and Windows

• Testing approach
– Windows is heavily dependent on dynamic testing with extensive beta-testing campaigns
  • Around 89,000 defects were reported to have been found in Windows 95 during beta test
  • Around 60,000 defects were reported to have been found in Windows 2000 during beta test
– Linux depends on a mixture of large-scale code inspection and dynamic testing

Slide 35

The Linux Picture …

Linux is also characterised by:
– Excellent distributed version control
– Excellent development/test communication
– High average levels of experience
– Unusually high levels of perfective maintenance

Slide 36

The PC Picture …

More notes on Windows:
– Availability, and how to confuse the picture …
– There are about 30 billion Windows crashes a year, i.e. every machine at least one a week (Dvorak, PC Magazine, 4-Aug-2003).
– Does anybody add up the cost of this? (Try this: say each crash costs around 15 minutes of work. At minimum wage, this is 30 billion pounds a year. It's probably much higher.)
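(A rough check of that arithmetic, using my own rounding: 30 billion crashes × 0.25 hours × roughly £4 per hour at the 2003 UK minimum wage ≈ £30 billion a year, consistent with the figure quoted.)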

Slide 37

Overview

• A case history in forensic analysis
• Some successful technologies
• The process v. product dilemma
• Why are we so bad at diagnosis?

Slide 38

The prediction problem

[Diagram: fault and failure linked in both directions – prediction runs from fault to failure, diagnosis from failure back to fault.]

Prediction is in its infancy and diagnosis is becoming more difficult.

Slide 39

Diagnosis

One of the central ways of improving feedback is good failure diagnosis. However, several factors inhibit diagnosis:

• System complexity and coupling

• Engineer over-optimism, leading to poor diagnostics and hence to poor diagnosis

• Increasingly complex paradigms, as measurement suggests

Slide 40

Increasing coupling

Coupling is the degree of interdependence between otherwise separate systems

• In telecommunications systems, coupling can be very high

• In consumer appliances such as cars, many computer systems communicate with each other giving potentially high coupling

Slide 41

Diagnosis

Diagnosis difficulty as a function of diagnostic distance and diagnostic quality:

                            Diagnostic quality
                            Poor          Good
  Diagnostic    Close       Moderate      Easy
  distance      Distant     Difficult     Moderate

Slide 42

Moderate, (distant/good)

An example from real life: Airbus A320, AF319, 25/8/88 (Mellor (1994)):

• MAN PITCH TRIM ONLY, followed in quick succession by …
• Fault in right main landing gear
• Fault in electrical flight control system computer 2
• Fault in alternate ground spoilers 1-2-3-5
• Fault in left pitch control green hydraulic circuit
• Loss of attitude protection
• Fault in Air Data System 2
• Autopilot 2 shown as engaged when it was disengaged
• LAVATORY SMOKE

Slide 43

Moderate, (close/poor)

“Button push ignored”
• This appears on the Flight Management System of a McDonnell-Douglas MD-11 (Drury (1997)).

It is not clear what the programmer is trying to convey. “Paris is the capital of France” would have been equally useful.

• The pilot also noted: “The airplane [computer system] manuals were written as though by creatures from another planet”.

Slide 44

The great local bar disaster

Symptom: the author's local bar was unable to dispense beer.

The programmer's effort at a diagnostic:
  System over-stressed ...

Translation into English:
  The printer has run out of paper.
  (Try explaining this to a thirsty native.)
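As an illustration of the design point (my sketch, not from the slides, with invented names), the same failure reported with its cause, consequence and remedy leaves the thirsty native in no doubt:

    #include <stdio.h>

    /* Hypothetical reporting routine: say what failed, why it matters and
       what to do about it, instead of a vague "System over-stressed".    */
    static void report_printer_out_of_paper(const char *device)
    {
        fprintf(stderr,
                "till printer '%s' is out of paper.\n"
                "Sales cannot be logged, so the pumps are disabled.\n"
                "Action: reload the paper roll, then press RESUME.\n",
                device);
    }

    int main(void)
    {
        report_printer_out_of_paper("BAR-TILL-1");
        return 0;
    }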

Slide 45

What we would like to do

[Diagram: the same grid of diagnostic distance against diagnostic quality as before, annotated with the two levers for moving towards the easy (close/good) corner: education and design.]

Slide 46

Overall Summary

To conclude:
– On the negative side
  • We don't learn from our mistakes
  • System diagnosis is at a very poor stage of evolution
– On the positive side
  • Some technologies are extraordinarily effective