The Failure of the London Ambulance Service, Michael McDougall, CIS 573, November 16th, 1999


The Failure of the London Ambulance Service

Michael McDougall
CIS 573

November 16th, 1999

The Accident
On October 26th, 1992 the London Ambulance System failed.
    Phones rang for up to 10 minutes
    Ambulance response times were delayed
    Some calls were lost
On November 4th the system crashed completely.
Software was a major cause of the failures.

Outline
London Ambulance Service Computer Aided Despatch (CAD) system
    Background
    Planning the system
    Developing the system
    How it failed
ISO 12207 – Software Development Standard
LAS failure w.r.t. ISO standard

Background
The London Ambulance Service (LAS) is the largest ambulance service in the world.
    Covers 6.8 million residents – many more people during the daytime.
    Serves 5000 patients a day.
    Handles between 2000 and 2500 calls a day (more than 1 per minute).
    Employs 2700 full-time staff.
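
The "more than 1 per minute" figure follows directly from the daily call volume. A quick check, assuming calls are spread evenly over 24 hours (a simplification; real load peaks during the day):

```python
MINUTES_PER_DAY = 24 * 60  # 1440 minutes


def calls_per_minute(calls_per_day: int) -> float:
    """Average call rate, assuming calls are spread evenly over the day."""
    return calls_per_day / MINUTES_PER_DAY


low = calls_per_minute(2000)   # lower bound quoted on the slide
high = calls_per_minute(2500)  # upper bound quoted on the slide
print(f"{low:.2f} to {high:.2f} calls per minute")
```

Both ends of the range come out above one call per minute, so the slide's claim holds even on a quiet day.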

Background
In 1990 the LAS was not meeting the U.K. standards for ambulance response times.
Other parts of the U.K. National Health Service had undergone reforms throughout the 1980s, but the LAS had not changed much since 1980.
Relations between staff and management were poor.

Despatch system
The despatch system was responsible for:
    Taking emergency calls
    Deciding which ambulance to send
    Sending information to ambulances
    Managing allocation of ambulances

Despatch system
[Diagram: Take Call -> (paper) -> Collection Point -> (paper) -> Regional Allocators (RA) -> (paper) -> Despatcher -> (voice) -> Ambulance]

Despatch system
The U.K. national standard required that this process take less than 3 minutes.
The LAS system in 1990 had a number of inefficiencies which made it impossible to meet the standard.

Inefficiencies
Take Call: Finding the location of an accident was often difficult and time consuming.

Inefficiencies
Paper: Moving pieces of paper took unnecessary time.

Inefficiencies
Collection Point: Identifying duplicate calls relied on human memory and was therefore slow and error prone.

Inefficiencies
Regional Allocator: Allocating ambulances was done by hand and relied on the allocator's memory.
Despatcher to Ambulance (voice): Voice communication was slow.

Improving the system
The LAS was under pressure from its superiors, MPs, the public and the media to improve performance.
LAS management decided that a Computer Aided Despatch system was the fastest way to improve service.

The Plan
LAS wanted to radically change the despatch system.
In Autumn 1990 they began to write the system requirements for the new system.

CAD system goals
Take Call
    Problem: Finding the location of an accident was often difficult and time consuming.
    Goal: Software connected to public telephones will locate incidents automatically.

CAD system goals
Paper
    Problem: Moving pieces of paper took unnecessary time.
    Goal: Information will move through a network between workstations.

CAD system goals
Collection Point
    Problem: Identifying duplicate calls relied on human memory and was therefore slow and error prone.
    Goal: AI will try to identify duplicate calls.

CAD system goals
Regional Allocator
    Problem: Allocating ambulances was done by hand and relied on the allocator's memory.
    Goal: Allocation of nearest ambulance will be done by computer in most cases.
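
To make the allocation goal concrete, here is a minimal sketch of nearest-available-vehicle selection. It is not the LAS algorithm: the `Ambulance` record, the kilometre grid coordinates and the straight-line distance are all assumptions made for illustration. Note that the result is only as good as the recorded positions and availability flags.

```python
from dataclasses import dataclass
from math import hypot
from typing import Optional, Sequence, Tuple


@dataclass
class Ambulance:
    """Hypothetical vehicle record; field names are illustrative only."""
    ident: str
    x_km: float
    y_km: float
    available: bool


def allocate_nearest(
    incident_xy: Tuple[float, float], fleet: Sequence[Ambulance]
) -> Optional[Ambulance]:
    """Return the closest available ambulance, or None if none is free.

    Uses straight-line distance as a stand-in for real travel time.
    """
    candidates = [a for a in fleet if a.available]
    if not candidates:
        return None
    ix, iy = incident_xy
    return min(candidates, key=lambda a: hypot(a.x_km - ix, a.y_km - iy))


fleet = [
    Ambulance("A1", 0.0, 0.0, True),
    Ambulance("A2", 1.0, 1.0, False),  # closest to the incident, but busy
    Ambulance("A3", 3.0, 4.0, True),
]
chosen = allocate_nearest((1.0, 1.0), fleet)
print(chosen.ident if chosen else "no vehicle available")
```

If the stored status of A2 were wrong (recorded as free while actually busy), the function would pick it anyway, which is why a scheme like this depends on accurate vehicle data.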

CAD system goals
Despatcher to Ambulance (voice)
    Problem: Voice communication was slow.
    Goal: Digital communication to and from ambulances.

LAS ambitions
The new system was intended to mobilize an ambulance in less than 1 minute.
The system would be the most ambitious of its time.
A much more modest system had been planned for the LAS, but this was abandoned when it failed load-testing.
No independent audit of the system requirements was carried out.

CAD requirements
LAS wanted a one-phase delivery
LAS decided that the system should cost £1,500,000
LAS decided that the system would take 6 months to implement (though a project of this scale would usually take 18 months)
These requirements were not based on any analysis of the design. They appear to be arbitrary.

Asking for tenders
In early 1991, LAS publicized the requirements and asked for bids
Many potential suppliers expressed doubts that the project could be finished on time within the required budget
LAS replied that the timetable was not negotiable

Bids
Many potential suppliers submitted bids for the project
Most of the bids required more time and/or money
The bids were evaluated by LAS staff who had no experience with information technology

Selecting a contractor
Only one bid was under £1,500,000 and promised implementation in 6 months. This bid was selected.
The winning bid was from Systems Options Ltd (SO), a small software house with no experience in safety-critical software.
SO had never managed a large project

The Contract
LAS signed a contract with SO in September 1991. The system was supposed to go on-line on January 8th, 1992.
The contract did not specify who would act as project manager or who would be responsible for quality assurance.
No acceptance criteria were defined

Developing the system
Suppliers failed to meet deadlines
SO initially handled the project management, but this shifted to LAS as the project proceeded
No independent QA or audit was performed; LAS intended to save money by leaving QA to the suppliers

Problem tracking
There was a formal procedure for reporting, analyzing and fixing bugs, but this was often skipped so that the software could be changed quickly to satisfy users

Training problems
Users were trained long before the system was on-line. The training was often out of date or forgotten by the time the system was available
Users were only trained for their part of the system

Partial deployment
The complete system was not ready by January 8th; the system was deployed in pieces
Bugs encountered
    System needed perfect vehicle information
    Every 53rd vehicle was unavailable
    Workstations froze often (Windows 3.0)
    Vehicle allocation could not be overridden
    The wrong vehicle was sometimes sent

Expected to fail
Interacting with the system was often awkward and frustrating
The LAS staff had little confidence in the system

No testing
No testing of the full system was ever done
Nobody ever tested whether the radio system could handle the traffic
Management did not know what resources were required to maintain service; the CAD system was supposed to give this information

Failure 1
On October 26th the LAS management decided to switch to the full CAD system. This decision was made even though
    the system was never tested
    there were outstanding bugs which were considered ‘severe’

Failure 1
Initially the system worked; there were some errors but the staff were able to correct them
As the load increased, the system responded more slowly and the ambulance location data became less and less reliable

Feedback problems
[Diagram: feedback loops linking bad data, bad allocation, longer waits for ambulance, more calls, crew frustration and fewer available vehicles]

Design errors
Some of the design decisions made it harder to recover from errors
    Allocators could only get info on ambulances by reserving an ambulance
    Control room layout made it hard for operators to communicate
    System could not handle operators overriding computer decisions

Consequences
At the height of the accident, emergency calls were ringing for 10 minutes before being answered
Some calls were lost because the list of calls was too big for the terminals
80% of ambulances took more than 15 minutes to respond. (Average was 67%.)

Consequences cont.
The media reported that patients died because of the failure. A coroner later concluded that this was false.

Failure 2
After the first failure LAS went back to the semi-automated system in use before October 26th.
On November 4th the system froze
The cause was a server that had run out of memory

Memory leak
The server software had been changed 3 weeks before. This change introduced a small memory leak. The server had been steadily losing free memory ever since
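
A back-of-the-envelope sketch shows why a small leak can stay hidden for weeks. The figures below (free memory, bytes leaked per request, requests per day) are hypothetical and chosen only for illustration; the slides do not give the real values.

```python
def requests_until_exhaustion(free_bytes: int, leak_per_request: int) -> int:
    """Number of requests a server can handle before leaked memory
    consumes all of the initially free memory."""
    if leak_per_request <= 0:
        raise ValueError("leak_per_request must be positive")
    return free_bytes // leak_per_request


# Hypothetical figures, NOT taken from the LAS system:
FREE_BYTES = 8 * 1024 * 1024   # 8 MiB free after start-up
LEAK_PER_REQUEST = 128         # bytes never released per request
REQUESTS_PER_DAY = 2500        # assumed: one server request per call

requests = requests_until_exhaustion(FREE_BYTES, LEAK_PER_REQUEST)
days = requests / REQUESTS_PER_DAY
print(f"{requests} requests, roughly {days:.1f} days until memory runs out")
```

With numbers of this order the server runs normally for weeks, so short test runs would not reveal the leak; only sustained operation or explicit memory monitoring would.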

Backup system
There was a backup server, but it was only designed to work in the full CAD system

Consequences
At the time of the 2nd failure the load was light enough that the staff recovered all the information lost in the crash.
No calls were missed.
LAS went back to the original paper system

Next class
ISO 12207 - Software life cycle processes
Would standards have prevented the LAS failure?

Are standards worth it?