[American Institute of Aeronautics and Astronautics Infotech@Aerospace - Arlington, Virginia ()] Infotech@Aerospace - Sufficient Evidence: Building Certifiably Dependable Systems

Abstract

A study of the National Academies entitled “Sufficient Evidence: Building Certifiably Dependable Systems” is currently underway. Its purpose is to assess the state of dependable software development, and in particular the role that certification plays, and might play in the future, in making software more dependable. This paper outlines a few of the pertinent themes. Its content is based on comments that have been provided to the study committee either in open session or as communications from non-members of the committee, and in some cases it represents the views of this paper’s authors. It should therefore not be construed as a statement by the committee, and it does not necessarily represent the consensus of the committee, or even the views of individual committee members. A full report will be available when the study has been completed (in 2006).

Some Issues in Software Dependability

Is software dependability a problem? The software development community – in both research and industry – has for a long time been concerned with issues of dependability. A skeptic might ask whether the dependability of software is actually a problem that merits attention, let alone investment. Dependability, especially of desktop applications, has improved dramatically in the last few decades, at the same time as the functionality offered by software has grown at a far greater rate. Software, although it causes frequent irritations, has been implicated in surprisingly few fatal accidents, and those accidents that have happened are widely known. Moreover, the number of deaths due to software failures in the US seems to be orders of magnitude smaller than those due to other kinds of failure. Medical errors, for example, account for about 44,000 deaths per annum, and workplace injuries alone for about 8,000.

There are two reasons that such skepticism might be unwise. First, the deployment of software is increasing at a rapid rate, and software is becoming pervasive in our civic infrastructure. Failures of software are inevitable, and their effects are likely to be more far-reaching and damaging than the failures we have seen to date. Second, software has great potential to improve quality of life and reduce accidents by mitigating the failures of human operators; perhaps a significant proportion of medical errors could be avoided with appropriate software measures overseeing medical procedures.

What kinds of failures are of concern? Two medical software failures, one well known and one less well known, are typical of the classes of failure that should concern us – especially in their combination. The first is the failure of radiotherapy software at a public hospital in Panama City in 2001, which is implicated in the deaths of several patients [1]. The software, made by an American company and passed by the FDA, computed appropriate dose settings to mitigate the effect of blocks placed in the path of the beam to shield healthy tissue. The therapist would draw the outlines of the blocks with a mouse, and the system would then compute the required dose. It turned out that, if the outlines were drawn in a certain way, an incorrect dose was computed, sometimes close to double the correct dose.

The second is a failure of a pharmacy database in a large tertiary care hospital in the Chicago area, reported by Richard Cook [2]. In short, the system for backing up the database was faulty, but the flaw had not been noticed. A separate, unrelated bug caused a software engineer to restore the database to an older version using a backup tape. This restoration inadvertently corrupted the database, so that the ‘fill list’ of medications that it produced was incorrect. The hospital thus found itself in a situation in which medication was not available for most of its patients. Fortunately, pharmacy staff were able to reconstruct the database from paper records.

Sufficient Evidence: Building Certifiably Dependable Systems

Daniel Jackson, Massachusetts Institute of Technology, Cambridge, MA
Lynette Millett, The National Academies, Washington, DC

Infotech@Aerospace, 26-29 September 2005, Arlington, Virginia

AIAA 2005-6915

Copyright © 2005 by Daniel Jackson. Published by the American Institute of Aeronautics and Astronautics, Inc., with permission.

The first failure shows the damage that software can do when connected, directly or indirectly, to physical devices. It was not the first failure of a radiotherapy machine; the failings of the Therac-25 over a decade earlier are well known to students of software failures. As software is insinuated more and more into our physical infrastructure, the risk of such accidents rises. The second failure suggests that we should be concerned about the collapse of centralized systems that might affect large numbers of people. The integration of systems, and a tendency to increase the coupling between them, increases vulnerability to single points of failure and amplifies their effects. As Cook puts it, the automation that software can provide will in general reduce the chance of a failure, but at the expense of greatly increasing its severity. Our greatest concern should be directed at those systems that combine the two features of these failures: systems that control physical devices, and that can have widespread effect. Control systems for energy distribution and chemical plants, and perhaps also air traffic control systems, seem to pose such risks.

Do existing certification schemes work? The committee has benefited from the expertise of specialists in many different domains (including medical devices, process control, enterprise software, electronic voting, avionics, military, and aerospace), and it has heard testimony on many forms of certification. There seems to be no consensus on whether certification per se is efficacious. In some cases, especially in security certification (such as Common Criteria), experts seem by and large to be negative, and to believe that the burden of certification on software developers is considerable and not justified by the benefits to consumers. In other cases, most notably avionics and flight control, there is more optimism, and experts seem to believe that certification schemes (such as DO-178B) have had a positive impact. There is, however, consensus on one fact: that there is little credible concrete evidence one way or another on the efficacy of certification. It might therefore be the case that, even when it works well, the benefits of certification are due to ‘collateral’ impact, in which the demands of meeting any standard simply result in greater attention to detail and a greater degree of reflection on the qualities of an evolving product.

References

[1] International Atomic Energy Agency. Investigation of an Accidental Exposure of Radiotherapy Patients in Panama. Report of a Team of Experts, 26 May–1 June 2001.

[2] Richard Cook and Michael O’Connor. Thinking About Accidents and Systems. In: K. Thompson, H. Manasse, eds. Improving Medication Safety. ASHP, Washington, DC, to appear.
