Software Failures Ron Gilmore, CMC Edmonton April 2006


Page 1: Software Failures Ron Gilmore, CMC Edmonton April 2006

Software Failures

Ron Gilmore, CMC

Edmonton

April 2006

Page 2: Software Failures Ron Gilmore, CMC Edmonton April 2006

Software Failures

• Santayana

• The software sector

• Observations

• Case Study: Therac 25

• Lessons

• Engineering Comparisons

• Challenges

Page 3: Software Failures Ron Gilmore, CMC Edmonton April 2006

Santayana (1863 - 1952)

• Philosopher, essayist, poet, novelist

• The Life of Reason (1905)

• “Those who cannot remember the past are condemned to repeat it”

• Lots of other great quotes

• Egypt, March 2006

Page 4: Software Failures Ron Gilmore, CMC Edmonton April 2006

Software Sector

• Young – less than a century

• Amateurs

• Change, churn and failures

• Compare to roads, houses, bridges

• Professions evolving

• Standards evolving

• Best practices evolving

• Societal awareness evolving

Page 5: Software Failures Ron Gilmore, CMC Edmonton April 2006

Case Study: Therac 25

• Radiation therapy machines

• Atomic Energy of Canada Limited (AECL)

• 1985 to 1987

• Six known “incidents”

• Massive radiation overdoses to patients

• On the order of tens of thousands of rads

• At least five deaths!

Page 6: Software Failures Ron Gilmore, CMC Edmonton April 2006

Therac 25 Root Causes

• Institutional causes:

  – No independent code review

  – Software not included in reliability design

  – Documentation “lean” on error codes

  – AECL did not initially believe complaints

Page 7: Software Failures Ron Gilmore, CMC Edmonton April 2006

Therac 25 Root Causes

• Design Issues:

  – No preventative hardware interlocks

  – AECL re-used software from older models which had hardware interlocks

  – No way for software to verify sensors were working

  – Arithmetic overflow - safety checks bypassed

  – Software written in assembly language
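
The "arithmetic overflow" item can be illustrated with a small sketch. Published accounts of the accidents describe a one-byte flag that was incremented on every setup pass, with a nonzero value meaning "position check required"; every 256th increment wrapped the flag to zero and the check was skipped. The Python below only simulates that behaviour. The original code was assembly language, and all names here (`unsafe_passes`, `safe_passes`, `BYTE_MASK`) are hypothetical, chosen for illustration.

```python
BYTE_MASK = 0xFF  # simulate a one-byte (8-bit) variable


def unsafe_passes(total_passes):
    """Return the pass numbers on which the safety check would be skipped.

    Hypothetical reconstruction: the flag is incremented each pass, and
    "nonzero" is treated as "check required".
    """
    flag = 0
    unchecked = []
    for pass_no in range(1, total_passes + 1):
        flag = (flag + 1) & BYTE_MASK  # wraps to 0 on every 256th increment
        check_required = flag != 0
        if not check_required:
            unchecked.append(pass_no)  # the check is silently bypassed
    return unchecked


def safe_passes(total_passes):
    """Same loop, but the flag is assigned a fixed value, so it cannot wrap."""
    unchecked = []
    for pass_no in range(1, total_passes + 1):
        check_required = True  # assignment instead of increment
        if not check_required:
            unchecked.append(pass_no)
    return unchecked


if __name__ == "__main__":
    print(unsafe_passes(600))  # [256, 512]
    print(safe_passes(600))    # []
```

The point of the sketch is that the failure is rare and periodic, which makes it hard to reproduce in testing; a hardware interlock would have caught it regardless of the software state.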

Page 8: Software Failures Ron Gilmore, CMC Edmonton April 2006

Therac 25 Lessons?

• Professions?

• Standards?

• Best practices?

• Societal awareness?

Page 9: Software Failures Ron Gilmore, CMC Edmonton April 2006

Engineering Comparisons

• More mature sector

• Certification, legislation, compliance

• Curriculum: Tacoma Narrows Bridge

• Still: London Pedestrian bridge

• Still: Confusion re mandate, coverage

• Still: budget & schedule - oilsands

Page 10: Software Failures Ron Gilmore, CMC Edmonton April 2006

Challenges

• Education – technical, business

• Sensitivity – bad software can kill!

• Lots more examples:

  – Chinook helicopter

  – Missile detection systems

Page 11: Software Failures Ron Gilmore, CMC Edmonton April 2006

Constructive Notions

• Awareness efforts

• Consequences

• Core competencies

• Systems classifications:

  – A = Life threatening

  – B = Business threatening

  – C = Other
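
One way the A/B/C classification might be captured in code is sketched below. The class labels come from the slide; the `SystemClass` enum, the `classify` helper, and the rule that the most severe applicable class wins are assumptions made for illustration, not part of the talk.

```python
from enum import Enum


class SystemClass(Enum):
    """Classification from the slide; the enum itself is a hypothetical sketch."""
    A = "Life threatening"
    B = "Business threatening"
    C = "Other"


def classify(life_threatening: bool, business_threatening: bool) -> SystemClass:
    """Assign the most severe applicable class (assumed precedence: A, then B, then C)."""
    if life_threatening:
        return SystemClass.A
    if business_threatening:
        return SystemClass.B
    return SystemClass.C


if __name__ == "__main__":
    # A radiation therapy controller would fall in class A under this scheme.
    print(classify(life_threatening=True, business_threatening=True).name)
```

Tagging each system with a class up front gives a simple hook for deciding how much review, testing, and independent verification it should receive.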