Rule Induction and Statistics
Author(s): Anna Hart
Source: The Journal of the Operational Research Society, Vol. 38, No. 5 (May, 1987), pp. 470-471
Published by: Palgrave Macmillan Journals on behalf of the Operational Research Society
Stable URL: http://www.jstor.org/stable/2582743
Accessed: 28/06/2014 09:27
Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at http://www.jstor.org/page/info/about/policies/terms.jsp.
JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact support@jstor.org.
Palgrave Macmillan Journals and Operational Research Society are collaborating with JSTOR to digitize, preserve and extend access to The Journal of the Operational Research Society.
http://www.jstor.org
This content downloaded from 91.223.28.130 on Sat, 28 Jun 2014 09:27:29 AM
All use subject to JSTOR Terms and Conditions
Journal of the Operational Research Society Vol. 38, No. 5
RISKS IN THE EVALUATION OF ACCEPTABLE RISK
The methodology advocated by Whittaker,¹ whilst intuitively attractive, displays a
lack of practical robustness which may prohibit its universal acceptance. Worse
still, it may be based on false assumptions.
The data presented in Figure 1 of Whittaker's paper, for life expectation of
Canadian males, and summarized in his Figure 2 are not truly representative of the
longer-term changes in life expectancy over the last one hundred or so years, and the
increase may in fact be purely an artifact of the rapidly changing social and welfare
conditions during the early part of the century rather than a product of the
technology he seeks to justify.
A close scrutiny of the English life table (series DH1, No. 16, Table 22)
showing the data for England and Wales reveals that, between 1838 and 1900, life
expectancy at ages 45 and 65 was generally falling for both males and females, whilst
the corresponding figures for the younger ages increased slightly, with an important
contribution to mortality clearly remaining in the first year of life.
During the period 1900 to 1950 there was a general increase in the life
expectations over all ages, particularly at the lower age points, but as the graph in
Whittaker's article shows, this general upward trend is now levelling off.
It could be argued that having eliminated the prime sources of infant mortality,
any extrapolation into the future is likely to be a dynamic balance between the
better standards of geriatric and health care on the one hand and the risks from
choice of lifestyle and exploitation of technology on the other (e.g. transport,
occupational hazards, smoking, lack of exercise, etc.).
It is possible, therefore, that within a few years the graph may become
horizontal at best, or develop a negative slope at worst. Neither case would make
its employment in the way advocated very useful, and in any case, the gradient of
such a curve would be extremely sensitive to sampling and other short-term
variations, which would make its robustness questionable.
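The sensitivity of the gradient can be illustrated with a minimal sketch. The series below is invented purely for illustration (it is not taken from the life tables or from Whittaker's figures), and the helper names `ols_slope` and `shift_last` are hypothetical. It fits an ordinary least-squares slope to a nearly level series and shows that moving only the final observation by a small amount is enough to change the sign of the fitted gradient.

```python
from typing import List

def ols_slope(xs: List[float], ys: List[float]) -> float:
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def shift_last(ys: List[float], delta: float) -> List[float]:
    """Copy of ys with only the final observation moved by delta."""
    return ys[:-1] + [ys[-1] + delta]

# Invented, nearly level series in arbitrary units (an assumption, not real data).
XS = [0.0, 1.0, 2.0, 3.0, 4.0]
BASE = [70.0, 70.1, 70.0, 70.1, 70.0]

if __name__ == "__main__":
    for delta in (-0.2, 0.0, 0.2):
        slope = ols_slope(XS, shift_last(BASE, delta))
        print(f"last point shifted by {delta:+.1f}: slope = {slope:+.4f}")
```

With a curve this flat, a short-term fluctuation in a single point decides whether the gradient is read as rising or falling, which is the robustness concern raised above.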
Finally, to be attractive to the individual in society, any methodology on risk
evaluation should take into account quality of life as well as longevity.
Paddock Wood, Kent DAVID G. SMITH
Reference
1. JOHN D. WHITTAKER (1986) Evaluation of acceptable risk, J. Opl Res. Soc. 37, 541-547.
RULE INDUCTION AND STATISTICS
Recent papers have examined the power of the ID3 algorithm.¹,² I welcome this work
because it portrays such algorithms as methods of data exploration whose results need
investigation and interpretation. If the training set of examples input to the
algorithm is complete and correct (i.e. describing all possible types of problems
accurately), then the induced rules, which work for the training set, will obviously
work in general. In practice, training sets are not complete and seldom absolutely
correct. Knowledge about data exploration and statistical method is therefore
necessary. Results can be sensitive to changes in the input, and the quality of the
output depends heavily on the quality of the input. It is beneficial to know about
Letters and Viewpoints
sampling and the dangers of extrapolation of results. Results are suggestive, not certain.
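As a minimal sketch of how sensitive induced rules can be to the input, the example below computes the information gain that ID3 uses to choose the attribute at the root of a tree. The eight training examples, the attribute names `a` and `b`, and the class labels are all invented for illustration. Changing the class of a single example is enough to change which attribute is selected first, and hence the whole shape of the induced tree.

```python
import math
from collections import Counter
from typing import Dict, List

Row = Dict[str, str]

def entropy(labels: List[str]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(rows: List[Row], attribute: str, target: str = "cls") -> float:
    """Reduction in class entropy obtained by splitting rows on attribute."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

def best_attribute(rows: List[Row], attributes: List[str], target: str = "cls") -> str:
    """Attribute with the highest information gain (the root split ID3 would pick)."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))

# Invented training set (an assumption for illustration only).
TRAINING: List[Row] = [
    {"a": "0", "b": "0", "cls": "N"},
    {"a": "0", "b": "0", "cls": "N"},
    {"a": "0", "b": "1", "cls": "N"},
    {"a": "0", "b": "1", "cls": "P"},
    {"a": "1", "b": "0", "cls": "N"},
    {"a": "1", "b": "1", "cls": "P"},
    {"a": "1", "b": "1", "cls": "P"},
    {"a": "1", "b": "1", "cls": "P"},
]

# The same set with the class of one example (index 4) changed from N to P.
PERTURBED: List[Row] = [dict(r) for r in TRAINING]
PERTURBED[4]["cls"] = "P"

if __name__ == "__main__":
    for name, rows in (("original", TRAINING), ("perturbed", PERTURBED)):
        gains = {a: round(information_gain(rows, a), 3) for a in ("a", "b")}
        print(name, gains, "root split:", best_attribute(rows, ["a", "b"]))
```

In this invented set the root split moves from `b` to `a` after one label is altered, so a single incorrect example in a small training set can produce a different set of rules.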
Previous work by statisticians³ has stressed the importance of evaluating results and of pruning induced trees. Pruning involves growing a decision tree to its full length and then mathematically evaluating the benefit of extra rules against the cost of producing them. Superfluous rules are then pruned back. Results are possible rather than proven for general cases, and there is no evidence of causality or explanation. The method uses no knowledge about the problem domain, and so the induced results should always be examined by an expert who can explain, justify or contradict them. They should also be tested on other examples.
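The trade-off between the benefit of extra rules and their cost can be sketched as follows. This is a simplified illustration in the spirit of cost-complexity pruning, not a reproduction of any published procedure: the `Node` class, the penalty `alpha` charged per leaf, and the small example tree are assumptions made for the sketch. Each subtree is kept only if its training errors plus `alpha` times its number of leaves is lower than the cost of replacing it with a single leaf.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class Node:
    # Class label -> number of training examples reaching this node.
    counts: Dict[str, int]
    children: List["Node"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.children

def leaf_errors(node: Node) -> int:
    """Training examples misclassified if this node predicts its majority class."""
    return sum(node.counts.values()) - max(node.counts.values())

def count_leaves(node: Node) -> int:
    if node.is_leaf():
        return 1
    return sum(count_leaves(child) for child in node.children)

def prune(node: Node, alpha: float) -> Tuple[Node, float]:
    """Return (pruned copy, cost), where cost = errors + alpha * number of leaves."""
    cost_as_leaf = leaf_errors(node) + alpha
    if node.is_leaf():
        return Node(dict(node.counts)), cost_as_leaf
    pruned_children = []
    cost_as_subtree = 0.0
    for child in node.children:
        pruned_child, child_cost = prune(child, alpha)
        pruned_children.append(pruned_child)
        cost_as_subtree += child_cost
    # Collapse the subtree when its extra leaves do not pay for themselves.
    if cost_as_leaf <= cost_as_subtree:
        return Node(dict(node.counts)), cost_as_leaf
    return Node(dict(node.counts), pruned_children), cost_as_subtree

def example_tree() -> Node:
    """Invented fully grown tree: one split removes a single training error."""
    left = Node({"yes": 9, "no": 1},
                [Node({"yes": 9, "no": 0}), Node({"yes": 0, "no": 1})])
    right = Node({"yes": 1, "no": 9})
    return Node({"yes": 10, "no": 10}, [left, right])

if __name__ == "__main__":
    for alpha in (0.5, 2.0, 10.0):
        pruned, cost = prune(example_tree(), alpha)
        print(f"alpha={alpha}: leaves={count_leaves(pruned)}, cost={cost}")
```

A small `alpha` keeps every rule, a moderate value removes the split that corrects only one example, and a large value collapses the tree to a single leaf. The choice of `alpha` is left open here; in practice it would be set by testing on examples held out from the training set.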
Induction can highlight problems and questions as much as it can suggest rules, and accepting the output without careful investigation is dangerous. For expert systems it gives a starting point, rather than an answer, in knowledge acquisition. For certain problems, other statistical techniques will give more efficient and more meaningful results. I have commented on this elsewhere.⁴,⁵
School of Computing, ANNA HART
Lancashire Polytechnic
References
1. J. MINGERS (1986) Expert systems - experiments with rule induction. J. Opl Res.
Soc. 37, 1031-1037.
2. J. MINGERS (1987) Expert systems - rule induction with statistical data. J. Opl
Res. Soc. 38, 39-47.
3. L. BREIMAN, J.H. FRIEDMAN, R.A. OLSHEN and C.J. STONE (1984) Classification and
Regression Trees. Wadsworth International, California.
4. A. HART (1985) Machine induction: practical issues and advice. Proceedings of the
Second International Conference on Expert Systems. Learned Information, London.
5. A. HART (1986) Knowledge Acquisition for Expert Systems. Kogan Page, London.