Author: John Neter
Source: Journal of the American Statistical Association, Vol. 81, No. 393 (Mar., 1986), pp. 1-8
Published by: American Statistical Association
Stable URL: http://www.jstor.org/stable/2287957

Boundaries of Statistics-Sharp or Fuzzy?





JOHN NETER*

1. INTRODUCTION

In preparing this presidential address, I reviewed some of the addresses given by recent predecessors. I came to a profound conclusion. Presidents of the American Statistical Association (ASA) are worriers. This could be for several reasons. Members of ASA may have a propensity to elect worriers. Alternatively, the office of president may make the occupant a worrier, particularly the requirement of giving a presidential address that presumably should contain new wisdom and insights. And of course, there could be some major problems facing the statistics profession that deserve the continuing concern of the presidents.

George Box (1979) worried about, among other things, data analysts' neglecting other equally important jobs of statisticians, such as designing investigations and building models. H. O. Hartley (1980) worried about the proper balance between mathematical and applied statistics and the relations between statisticians and subject matter specialists. Margaret Martin (1981) worried about how to improve the production of statistical data for public use. Ralph Bradley (1982) was concerned about the future of statistics as a discipline. Dick Anderson (1984) worried about where the statistics profession should be going, and Richard Savage (1985) was concerned about difficult measurement problems in the social sciences that are being neglected.

I too am a worrier. I shall speak this evening about two of my worries. The first concerns the possibility of statisticians' excluding themselves from important statistical problems in other disciplines. The second worry pertains to difficulties that are encountered when statisticians work jointly with specialists in other disciplines on problems with significant statistical aspects. I am not, of course, the first person to have these worries. In fact, some of the earlier presidential addresses have dealt with these concerns. Still, they are important worries that deserve repeated discussion. I will discuss the two areas of concern in a context that many of you may not be familiar with-

* John Neter is Professor of Management Sciences and Statistics, Department of Management Sciences, University of Georgia, Athens, GA 30602. This article was presented as the Presidential Address at the 1985 Annual Meetings of the American Statistical Association in Las Vegas.

namely, the uses of statistical sampling techniques in accounting and auditing. This is an area of application in which relatively few statisticians have worked, and I shall give a brief historical review to provide some perspective.

The theme of this annual meeting, Statistics and Public Policy, is closely related to my concerns. At a number of sessions at this meeting, we have heard about important issues of public policy to which statistics can make significant contributions if we are willing to enlarge our domain of interest. An example is the very complex problem of quantitatively measuring political repression. Enlarging our domain of interest is not enough, however, for statisticians to play an effective role in the area of public policy. We statisticians must also learn how to interact better with specialists from other disciplines with whom we are working on a given problem, as well as with the decision makers. Otherwise we may have difficulty in gaining acceptance of our technical expertise in the problem formulation and analysis stages and, most important, may not gain acceptance of the results of the analysis leading to implementation. Gaining acceptance of a statistician's technical expertise and of the study's results for implementation requires interpersonal skills that have often been neglected in the training of statisticians.

In enlarging the domain of our interest and improving our interpersonal skills to enable statisticians to work more effectively with others and gain acceptance of the study results, we are assisted by a growing recognition of the importance and usefulness of statistical methodology. Science 84 undertook recently, as part of celebrating its fifth anniversary, to identify 20 of the most important discoveries in science, technology, and medicine of the 20th century. The ground rules required that "the discovery had to have been made in the 20th century . . . , and it must have had a significant impact on the way we live or the way we think about ourselves and our world" (Hammond 1984). Among the 20 discoveries identified by Science 84 are Einstein's theory of relativity, the discovery

© 1986 American Statistical Association
Journal of the American Statistical Association
March 1986, Vol. 81, No. 393, Presidential Address


of plastics, the discovery of the structure of DNA, and Karl Pearson's development of the chi-squared goodness-of-fit test. This statistical test was selected to represent the contributions that statistics has made, in the words of Ian Hacking (1984), "by changing the ways we reason, experiment, and form our opinions about (the world)" (p. 70).

2. STATISTICAL APPLICATIONS IN ACCOUNTING

The development of statistical inroads into the fields of accounting and auditing is instructive about difficulties in gaining acceptance of statistical ideas, about problems in trying to adapt standard statistical methodology to special situations, and about efforts by nonstatisticians to develop needed new statistical methods. I shall first describe briefly two areas of accounting in which statistical sampling methodology has been applied and then turn to the area of auditing.

Adoption of statistical sampling methods was relatively easy in connection with studies made by railroads for use in proceedings concerning the merger of railroads before the Interstate Commerce Commission. I first became involved in some of these studies in the 1960s, when railroads were closely regulated and sought conditions from the Interstate Commerce Commission to protect them from serious negative consequences of mergers by other railroads. The methodology that had been commonly employed was to choose a segment of traffic, such as all shipments during a two-week period, and examine all of the shipments in this segment to see how much of the traffic would have been diverted by the proposed merger. Judgments on the potential diversion of traffic were made by traffic experts. The representativeness of the segment of traffic selected from all of the traffic during the study period, frequently the calendar year, also needed to be defended by the traffic experts.

It was natural for statisticians to propose that a probability sample of the traffic for the study period be selected instead of a judgment sample. In this way, one could obtain with known precision an estimate of the amount of traffic diversion that would be found by expert judgment if the traffic experts could examine all traffic in the frame with the same care as that used in the sample study. The design of the sample survey on which I worked used standard survey sampling methods, since no special problems arose in this application. When this statistical study was completed and hearings on it were held before the Interstate Commerce Commission, the lawyers on the other side questioned me at great length about statistical concepts in general and the statistical design and evaluation of this particular study.

As the use of statistical sampling in these traffic studies became more frequent, the cross-examination of the statistical expert became much less intense, and the burden of defense began to shift to those who still used judgment sampling. In 1971, the Bureau of Economics of the Interstate Commerce Commission issued a staff report containing guidelines for the presentation of the results of sample studies based on probability sampling (Bureau of Economics 1971). The use of judgment sampling was not even considered in these guidelines. Thus the adoption of statistical sampling methods occurred relatively easily in this area of application.

Another use of sampling methodology in accounting that received relatively easy acceptance is found in studies of the liability that companies selling trading stamps incur for sold stamps that will be redeemed. When a trading stamp company sells stamps to a retailer (e.g., a supermarket chain), it incurs a liability for redeeming the stamps that will be turned in by consumers. The amount of this liability affects both the balance sheet and the income statement of the trading stamp company. A difficulty in ascertaining the liability arises because not all of the trading stamps will eventually be turned in. Some stamps will be lost; others will be forgotten. Further, some trading stamps may not be redeemed for long periods of time. The methods for estimating the liability that were used in the 1960s, when I first encountered this problem, tended to be nonstatistical. Working with two accountants, we developed a probabilistic model that utilizes information on the age of trading stamps at the time of redemption to estimate the liability for redeemed stamps (Davidson, Neter, and Petran 1967). Information about the length of time stamps are outstanding can be obtained from stamps that are redeemed because the identification number of the stamp can be used to determine the time of issue. Since billions of stamps may be redeemed each year, it is clearly not possible to study the age of each stamp redeemed. The use of interpenetrating samples permitted an evaluation of the sampling error in the estimate of the liability that would have been obtained with the model if all trading stamps during the study period had been examined.
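The interpenetrating-samples idea can be sketched briefly. The following Python fragment is an illustration only (the stamp ages are simulated, not data from the Davidson, Neter, and Petran study): it splits a sample into independent subsamples and uses the spread of the subsample estimates to gauge the sampling error of the overall estimate.

```python
import random
import statistics

def interpenetrating_se(values, k=10, seed=2):
    """Split a sample into k interpenetrating subsamples and use the
    spread of the subsample means to estimate the standard error of
    the overall mean."""
    rng = random.Random(seed)
    items = list(values)
    rng.shuffle(items)
    subs = [items[i::k] for i in range(k)]        # k equal-sized subsamples
    sub_means = [statistics.mean(s) for s in subs]
    overall = statistics.mean(sub_means)
    se = statistics.stdev(sub_means) / k ** 0.5
    return overall, se

# Simulated ages (months outstanding at redemption) for 5,000 sampled
# stamps; the exponential shape is purely illustrative.
rng = random.Random(1)
sample_ages = [rng.expovariate(1 / 18) for _ in range(5000)]
est, se = interpenetrating_se(sample_ages)
```

Because the subsamples are drawn by the same procedure, the variability among their estimates reflects the sampling error of the design itself, without requiring an explicit variance formula for the liability model.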

By 1972 the Internal Revenue Service issued guidelines on acceptable methods of estimating the liability for trading stamps to be eventually redeemed (Internal Revenue Service 1972). One of the acceptable methods is based on probability sampling, with the minimum sample size specified. Other methods are also permitted by the guidelines, but probability sampling results were quickly accepted.

3. STATISTICAL SAMPLING IN AUDITING

The acceptance of statistical sampling was much more difficult to achieve in auditing than in the two areas of accounting just discussed. Auditing is a specialized area within accounting. Internal auditors are concerned with reviewing the operations of an organization to see whether the system of internal control is operating effectively. A system of internal control consists of a variety of managerial policies, such as a requirement that checks over a certain amount be signed by two persons. External, independent auditors (Certified Public Accountants) are engaged by boards of directors of corporations and other organizations to review the financial statements prepared by the organization to determine whether the statements fairly present the financial position, results of operations, and changes in financial position in conformity with generally accepted accounting principles. To make these reviews, independent and internal auditors use a variety of audit procedures, many of which frequently employ samples.

Samples of transactions, such as payments of bills received, are used to make inferences about the effectiveness of the operation of the system of internal control. This type of sampling generally is concerned with qualitative characteristics or attributes, such as whether an internal control operated improperly for a transaction. The parameter of interest is usually the process proportion of transactions for which the internal control operates improperly.


Sampling of accounts, such as accounts receivable and inventory, is a second type of sampling used in auditing. Sampling of account balances is usually concerned with quantitative characteristics. The auditor ascertains, after examining a line item such as an individual inventory item or an individual account receivable, the correct balance for this item. This correct balance is often the quantitative characteristic of interest; it is frequently called the audit amount, in contrast to the amount recorded by the company, called the book amount. Of course, if the book amount is correct, the audit amount equals the book amount. Alternatively, the characteristic of interest may be the error amount (book amount minus the audit amount) in a line item. The objective of sampling account balances for independent auditors is to make an inference about the accuracy of the book balance, that is, whether the total book amount is reasonably close to the total audit amount that would have been determined if all line items in the population had been audited with the same care as had been employed in the study of the sample of line items. Alternatively, the objective may be viewed as making an inference regarding whether the total error amount is reasonably small.

The earliest suggestions for auditors to use statistical sampling came in the 1930s (e.g., Carman 1933). Other papers on this subject followed in the 1940s, including a monograph by Vance (1950) in which he suggested the use of Wald's sequential sampling methods for audits. At about the same time, I was a graduate student at Columbia University and met William D. Cranstoun, a partner in an auditing firm, who suggested that I explore the use of statistical sampling in auditing. The result was a paper in the Journal of Accountancy (Neter 1949), in which I explained the advantages of using statistical sampling methods. These early papers tended to focus on sampling applications in which the characteristics of interest are attributes, primarily sampling of transactions.

Gradually attention shifted to statistical sampling of accounts (e.g., Vance and Neter 1956). I shall focus attention on this area because the sampling of transactions for purposes of making inferences about an attribute presented relatively few difficulties. In contrast, serious problems were encountered in applying statistical sampling methods to sampling account balances. Consider the estimation of the population total audit amount. We shall denote the audit amount for the ith line item in the population by X_i and the total audit amount by X:

X = \sum_{i=1}^{N} X_i,   (1)

where N is the number of line items in the population. If a simple random sample of n line items is selected and the audit amount for each item in the sample is established, an unbiased estimator of X is obtained by expanding the sample total audit amount by the reciprocal of the sampling fraction:

\hat{X} = \frac{N}{n} \sum_{i=1}^{n} x_i = N\bar{x}.   (2)

Here x_i denotes the audit amount of the ith sample item and \bar{x} denotes the sample mean audit amount per line item. Estimator (2) is called a mean-per-unit estimator in the accounting literature.
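A minimal sketch of estimator (2) in Python (the amounts and population size are hypothetical):

```python
def mean_per_unit_estimate(sample_audit_amounts, N):
    """Mean-per-unit estimator (2): expand the sample mean audit amount
    x_bar by the number of line items N in the population."""
    n = len(sample_audit_amounts)
    x_bar = sum(sample_audit_amounts) / n
    return N * x_bar

# Hypothetical audited sample of four line items drawn from a
# population of N = 1,000 line items.
x_hat = mean_per_unit_estimate([120.0, 80.0, 100.0, 140.0], N=1000)
# x_bar = 110, so the estimated total audit amount is 110,000.
```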

A difficulty with this estimator is that it tends to be very imprecise because of the large variability in many accounting populations. Even though stratification by book amount makes this estimator much more precise, it still suffers from several limitations. Comparison of the estimated total audit amount with the total book amount may indicate the presence of a substantial total error amount even though no errors were found in the sample. In addition, the use of large-sample confidence limits, such as the one-sided lower limit for the population total audit amount,

\hat{X} - z(1 - \alpha)\,s(\hat{X}),   (3)

where z(1 - \alpha) is the (1 - \alpha)100 percentile of the standard normal distribution and s(\hat{X}) is the estimated standard error, may not be appropriate for sample sizes commonly used in auditing.

Since the auditor knows the book amounts of the line items, this information can be incorporated into the estimation procedure. Let Y_i denote the book amount for the ith line item and Y the population total book amount. Further, the population total error amount shall be denoted by E, which is defined as follows:

E = Y - X.   (4)

One possible estimator that incorporates the supplementary information about book amounts is the difference estimator. This estimator for simple random sampling of line items when the population total error amount is to be estimated is

\hat{E} = \frac{N}{n} \sum_{i=1}^{n} d_i = N\bar{d},   (5)

where

d_i = y_i - x_i   (6)

is the difference between the book and audit amounts for the ith sample item and \bar{d} is the sample mean difference per line item.

Another possibility is the ratio estimator:

\hat{E}_R = Y(\bar{d}/\bar{y}),   (7)

where \bar{y} is the sample mean book amount per line item.

These estimators can also be used with stratified random

sampling, though the gains in precision are not large when accounting populations with small- or moderate-sized errors are sampled (e.g., Neter and Loebbecke 1975, p. 95).
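The difference estimator (5) and ratio estimator (7) can be sketched as follows (the sample amounts, N, and Y are hypothetical):

```python
def difference_estimate(book, audit, N):
    """Difference estimator (5): expand the sample mean error
    d_i = y_i - x_i by the population count N."""
    n = len(book)
    d_bar = sum(y - x for y, x in zip(book, audit)) / n
    return N * d_bar

def ratio_estimate(book, audit, Y):
    """Ratio estimator (7): scale the population total book amount Y
    by the sample ratio of mean error to mean book amount."""
    n = len(book)
    d_bar = sum(y - x for y, x in zip(book, audit)) / n
    y_bar = sum(book) / n
    return Y * d_bar / y_bar

# Hypothetical sample of four line items (book amount, audit amount),
# with N = 1,000 line items and total book amount Y = $100,000.
book = [100.0, 90.0, 110.0, 100.0]
audit = [100.0, 80.0, 110.0, 100.0]
E_diff = difference_estimate(book, audit, N=1000)   # 1000 * 2.5 = 2500
E_ratio = ratio_estimate(book, audit, Y=100_000.0)  # 100000 * 0.025 = 2500
```

The two estimates coincide here only because the sample mean book amount happens to equal the population mean book amount; in general they differ.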

Still another way in which the supplementary information about book amounts can be used is by means of sampling with probability proportionate to book amount. An unbiased estimator of the population total error amount with probability-proportional-to-size (pps) sampling is

\hat{E} = \frac{1}{n} \sum_{i=1}^{n} \frac{d_i}{y_i/Y} = \frac{Y}{n} \sum_{i=1}^{n} \frac{d_i}{y_i}.   (8)
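A sketch of estimator (8), assuming with-replacement draws proportional to book amount (the sample pairs and Y are hypothetical):

```python
def pps_error_estimate(pairs, Y):
    """Unbiased estimator (8) under with-replacement sampling
    proportional to book amount: (Y/n) times the sum of d_i / y_i
    over the sampled line items."""
    n = len(pairs)
    return (Y / n) * sum(d / y for y, d in pairs)

# Hypothetical pps draws as (book amount y_i, error amount d_i) pairs
# from a population with total book amount Y = $50,000.
draws = [(200.0, 0.0), (500.0, 50.0), (100.0, 0.0), (250.0, 0.0)]
E_hat = pps_error_estimate(draws, Y=50_000.0)
# sum(d_i / y_i) = 0.1, so E_hat = (50000 / 4) * 0.1 = 1250.
```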

Estimators such as the difference, ratio, and regression estimators with simple or stratified random sampling and the unbiased estimator with pps sampling have several important limitations. When, as frequently occurs in auditing, the sample contains no errors, that is, when d_i = 0, the estimated standard


error for these estimators is zero, as may be seen from the estimated variance formula for the difference estimator:

s^2(\hat{E}) = \frac{N(N - n)}{n(n - 1)} \sum_{i=1}^{n} (d_i - \bar{d})^2.   (9)

An estimated standard error of zero suggests perfect precision, a conclusion that is not warranted. In addition, the large-sample confidence limits frequently are not appropriate for sample sizes used in auditing, even when errors tend to be present in the sample.

Table 1 demonstrates the frequent inapplicability of upper confidence bounds for the population total error amount calculated on the basis of large-sample theory for sample size 100, a sample size that often would be considered large in auditing. Coverage results (i.e., the proportion of times the upper bound equals or exceeds the population total error amount) are presented in this table for a simulation study in which 600 replications of samples of size 100 were selected (Neter and Loebbecke 1975, pp. 6-9). The estimators considered are the stratified mean-per-unit estimator (Table 1a), the stratified ratio estimator (Table 1b), and the pps unbiased estimator (Table 1c). The study populations are actual accounting populations into which errors were seeded with different error rates, based on empirical information about error amounts observed in audits. (Not all error rates were studied for each accounting population for the pps unbiased estimator.) The upper bounds were calculated for a nominal 93.3% confidence level by means of large-sample normal distribution theory. It is clear from Table 1b and c that the stratified ratio and pps unbiased estimators frequently have coverages far below the nominal 93.3% level and from Table 1a that the stratified mean-per-unit estimator also at times has coverages below the nominal level.
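The undercoverage phenomenon is easy to reproduce in miniature. The following simulation is not based on the Neter and Loebbecke study populations; the population here is artificial, with a 1% rate of complete overstatements, and a nominal 93.3% normal-theory upper bound (z = 1.5) is applied with the difference estimator:

```python
import random

def difference_upper_bound(book, audit, n, z, rng):
    """One-sided normal-theory upper bound (point estimate plus z
    standard errors) for the total error amount, using the difference
    estimator with a simple random sample of n line items."""
    N = len(book)
    idx = rng.sample(range(N), n)
    d = [book[i] - audit[i] for i in idx]
    d_bar = sum(d) / n
    s2 = sum((di - d_bar) ** 2 for di in d) / (n - 1)
    se = (N * (N - n) * s2 / n) ** 0.5
    return N * d_bar + z * se

# Artificial population: 1,000 line items booked at $100 each, 1% of
# them fully overstated (audit amount 0), so the true total error is
# E = $1,000.
book = [100.0] * 1000
audit = [0.0 if i < 10 else 100.0 for i in range(1000)]
E_true = sum(b - a for b, a in zip(book, audit))

rng = random.Random(42)
reps = 600
hits = sum(
    difference_upper_bound(book, audit, 100, 1.5, rng) >= E_true
    for _ in range(reps)
)
coverage = hits / reps  # falls well below the nominal 93.3%
```

Roughly a third of the samples contain no errored items at all; for those samples the estimated standard error is zero, the bound collapses to zero, and the true error amount is missed, which is exactly the failure mode described above.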

A search for bounds not based on large-sample theory began in the 1960s. A bound was developed by Stringer (1963), the original logic for which involved stratification of the line items

Table 1. Coverages for Nominal 93.3% Upper Bounds on Population Total Error Amount for Three Estimators Based on Sample Size 100 (in %)

                                Population Error Percentage
Population                 .5       1       5      10     30*

a. Stratified Mean-Per-Unit Estimator
1                        94.7    94.8    94.7    96.0    95.0
2                        93.5    93.7    94.2    93.8    95.0
3                        93.5    93.0    93.2    93.8    93.8
4                        89.5    89.5    87.5    85.7    90.5

b. Stratified Ratio Estimator
1                        19.7    63.0    96.7    98.8    99.0
2                        26.2    40.5    99.2    98.3    97.3
3                         8.7    14.8    48.0    64.5    85.5
4                        29.5    39.7    70.8    84.3    91.2

c. PPS Unbiased Estimator
1                        20.2    60.3      --    98.7    98.2
2                          --      --    98.7    98.7      --
3                         5.2      --    31.5    44.2    76.8
4                        30.7      --    69.7    83.0    89.7

*For population 2, this error percentage is 70.
Source: Neter and Loebbecke (1975, pp. 87, 99, 122).

by book amount and sampling each stratum in proportion to the amount of dollars contained in the stratum. Anderson and Teitlebaum (1973) recognized that the logic of the Stringer bound could be simplified by considering a sample selection scheme that involves a random selection of dollar units and then prorating any errors found to the individual dollars in the line item containing the error. This method of sampling is frequently called dollar-unit sampling or monetary-unit sampling in accounting circles. When monetary units are selected at random and with replacement and any errors found are prorated to the individual monetary units, the process can be viewed equivalently as sampling with probability proportional to book amount with replacement.
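Dollar-unit selection itself is straightforward to sketch (the book amounts below are hypothetical): each line item occupies a segment of the cumulative dollar total, and a monetary unit drawn at random identifies the line item containing it.

```python
import bisect
import random

def dollar_unit_sample(book_amounts, n, seed=7):
    """Draw n monetary units at random with replacement and map each
    dollar back to the line item containing it.  This is equivalent to
    sampling line items with probability proportional to book amount,
    with replacement."""
    rng = random.Random(seed)
    cum = []
    total = 0.0
    for y in book_amounts:
        total += y
        cum.append(total)          # item i covers dollars (cum[i-1], cum[i]]
    picks = []
    for _ in range(n):
        u = rng.uniform(0.0, total)
        picks.append(bisect.bisect_left(cum, u))
    return picks

# Hypothetical book amounts for five line items; any error found in a
# sampled item would then be prorated over that item's dollars.
book = [1200.0, 300.0, 4500.0, 50.0, 950.0]
sampled_items = dollar_unit_sample(book, n=4)
```

Large line items are hit more often, which is the point: a dollar of overstatement is more likely to hide in a large recorded balance.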

The Stringer bound is applicable when the errors in line items in accounting populations are overstatements. It can also be used for understatement errors when the auditor is able to specify a bound on the extent of understatements. An example of an overstatement error is when the book value of a particular inventory item on hand is $1,000, whereas the audit discloses that only one-half of the items are on hand and that the correct value is $500. The amount of overstatement error as a proportion of the book amount is called a taint. A taint thus is the prorated error amount assigned to each monetary unit. Since overstatement taints generally cannot exceed 1.0, the range for overstatement taints is usually taken to be from 0 to 1.0. A conservative estimate of the total overstatement error can therefore be obtained by treating each observed taint as if it were 1.0. But then the problem is simply one of obtaining an upper bound for a population proportion (the proportion of monetary units in the population that contain an error) and multiplying this bound by the total number of monetary units in the population to obtain an upper bound for the population total overstatement error amount.

The Stringer bound begins with this conservative upper bound and heuristically reduces it to take account of the fact that the observed taints may not all be at the maximum value of 1.0. Though no proof exists as yet that the Stringer bound is a confidence bound, simulation studies (e.g., Reneau 1978; Leitch, Neter, Plante, and Sinha 1982) show that the coverages associated with the Stringer bound always exceed the nominal confidence level and often are very close to 100%. Still, the Stringer bound has a serious weakness. Its high coverage is associated with bound values that often far exceed the population total error amount. This lack of tightness of the Stringer bound leads to substantial risks of incorrect conclusions when it is used for testing purposes.
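In its commonly stated form, the bound takes exact binomial upper limits p_U(j) for j = 0, 1, . . . , k errors, starts from the zero-error limit, and adds the successive increments p_U(j) - p_U(j-1) weighted by the observed taints sorted in decreasing order. A Python sketch with hypothetical inputs (the proportion bounds are computed by bisection on the binomial distribution):

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

def upper_p(k, n, alpha):
    """Exact upper confidence bound for a proportion when k errors are
    seen in n monetary units: the p at which P(X <= k; n, p) = alpha,
    found by bisection (the CDF is decreasing in p)."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if binom_cdf(k, n, mid) > alpha:
            lo = mid
        else:
            hi = mid
    return hi

def stringer_bound(taints, n, Y, alpha=0.067):
    """Stringer upper bound on the total overstatement error amount:
    the zero-error bound plus increments weighted by the observed
    taints taken in decreasing order, scaled by the total book
    amount Y."""
    bound = upper_p(0, n, alpha)
    for j, t in enumerate(sorted(taints, reverse=True), start=1):
        bound += (upper_p(j, n, alpha) - upper_p(j - 1, n, alpha)) * t
    return Y * bound

# Hypothetical audit: 100 monetary units drawn from a population with
# total book amount Y = $500,000; two overstatement taints observed.
B = stringer_bound([0.5, 0.2], n=100, Y=500_000.0, alpha=0.067)
```

With all taints set to 1.0 the formula collapses back to the conservative proportion bound, which is the starting point described above; smaller observed taints shrink the increments but never the zero-error base, which is one intuition for the bound's conservatism.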

Monetary-unit sampling may be viewed as sampling of a mixture of populations. The mixture consists of a probability mass associated with no error and a continuous distribution associated with error taints. There may also be another probability mass, for taints t = 1.0, that is, for 100% overstatement errors. A similar type of mixture of populations has been noted by J. Bradley (1982) in behavioral science applications. The reason a standard mixtures approach (e.g., Aitchison 1955) is not applicable in auditing situations is that the probability mass for no error is often very large. Hence frequently all items in the sample are correct, thus providing no information about error taints.

Monetary-unit sampling may also be viewed in a discretized


form as multinomial sampling. The multinomial classes represent the different possible prorated outcomes rounded to the desired degree. For example, if the error taints are rounded to the nearest cent, there would be 101 classes in the multinomial distribution, including the case of no error (0, 1, . . . , 99, 100). Fienberg, Leitch, and I used this perspective to develop an upper bound for the population total overstatement error (Fienberg, Neter, and Leitch 1977). A difficulty with this approach is the computational complexity of finding the joint confidence region and maximizing over it.

Another approach to overcoming the difficulties associated with samples that frequently do not provide any information about non-zero error amounts is to use prior information by means of a Bayesian formulation. Felix and Grimlund (1977) suggested a Bayesian approach based on a random sample of line items that assumes a normal distribution of error amounts. Cox and Snell (1979) proposed a Bayesian bound based on monetary-unit sampling. Their model assumes that the prior distribution for π, the population error rate per monetary unit, is a gamma distribution, and that the prior distribution for μ, the population mean taint for dollar units in error, is an inverse gamma distribution. They further assume that π and μ are independent, the observed number of errors in the sample is a Poisson variable, and the observed error taints are observations from an exponential distribution. It then follows that the posterior distribution of the population total error amount is a simple multiple of the F distribution. Hence a Bayesian 1 - α bound for the population total error amount can be easily constructed by obtaining an appropriate percentile of the F distribution and multiplying it by the multiple. Godfrey and I (Neter and Godfrey 1985) investigated the behavior of the Cox and Snell bound in repeated sampling from a given population, in the same spirit as some of the frequency calculations suggested by Rubin (1984). We found that conservative values for the prior parameters exist for which the Cox and Snell bound has coverages near or above the Bayesian probability level for a wide range of different populations.

Although there had been relatively little research activity in the area of statistical sampling in auditing for many years, a large amount of research in this area is now being undertaken. Recent developments include a proposal to construct an upper bound for the population total error amount by modeling the sampling distribution of the mean error as a three-parameter gamma distribution and using the moments from the sample, together with a variety of heuristics, to obtain estimates of the parameters of the gamma distribution (Dworin and Grimlund 1984). Another recent proposal is to combine the multinomial sampling model with the Dirichlet prior distribution to obtain Bayesian bounds for the population total overstatement error amount (Tsui, Matsumura, and Tsui 1985).

Much of the current research on the use of statistical sampling methods in auditing (as well as on uses of other statistical methods, such as the use by auditors of regression methods for analytic review) is being conducted by accountants, not statisticians. Accountants today are receiving better training in mathematics and statistics than a generation ago, and many are eager to use this training. This strand in the history of the development of statistical sampling methods for auditing, namely, the active participation of persons who would not be considered

statisticians by many professional statisticians, is also found in other disciplines.

A consequence of accountants' carrying on much of the research on statistical sampling methods for auditing is that many of the procedures developed are heuristic in nature, with relatively limited follow-up of the theoretical considerations by statisticians.

In addition to the accelerated research in statistical sampling methodology for auditing, there have also developed a much wider acceptance and increased usage of statistical sampling methods by auditors. In 1962 came the first official pronouncement from a committee of the American Institute of Certified Public Accountants (AICPA) in a report stating that statistical sampling was permitted under generally accepted auditing standards (AICPA Statistical Sampling Subcommittee 1983, p. 6). By 1981 AICPA's Auditing Standards Board issued Statement on Auditing Standards No. 39, which provides explicitly for the use of statistical sampling in auditing, in addition to nonstatistical or judgmental sampling. A number of the large auditing firms now routinely use statistical sampling in many audit applications.

Collaborative efforts between statisticians and accountants have had mixed results. On an individual level, there have been some very fruitful outcomes in research and practice as a result of collaboration by individuals. In addition, AICPA engaged a statistician to prepare a monograph on statistical sampling in auditing in which both standard statistical sampling procedures and some procedures specially developed for sampling applications in auditing are presented (Roberts 1978).

At an institutional level, efforts at collaboration have been less successful. ASA set up the Committee on Statistical Sampling in Auditing in 1960. The first chair of the committee was Fred Stephan, a former president of ASA. This committee undertook a number of joint activities with the AICPA Statistical Sampling Committee, including joint meetings of the two committees and joint sponsorship at ASA Annual Meetings of sessions devoted to statistical sampling in auditing. Indeed, the ASA committee for several years had as chair an accountant active in both AICPA and ASA. Still, no formal collaborative efforts of the two groups resulted, despite the existence of serious problems in applying statistical sampling methods to auditing, for which expertise in both auditing and statistics is required. Indeed, the ASA Committee on Statistical Sampling in Auditing was recently abolished, largely because of the difficulties in developing joint collaborative efforts.

Stronger organized collaboration between statisticians and auditors might have affected auditors' views about nonstatistical sampling. A recent guide to statistical and nonstatistical sampling in auditing (AICPA Statistical Sampling Subcommittee 1983) includes in the section on nonstatistical sampling an illustrative table of sample sizes calculated on the basis of statistical sampling specifications. The guide notes that "Neither Statement on Auditing Standards No. 39 nor this guide requires the auditor to compare the sample size for a nonstatistical sampling application with a corresponding sample size calculated using statistical theory." Still the guide continues that "At times, however, an auditor might find familiarity with sample sizes based on statistical theory helpful when applying professional judgment and experience in considering the effect of various planning considerations on (judgmental) sample size" (AICPA Statistical Sampling Subcommittee 1983, pp. 56-57). The guide then proceeds to give an example of the use of this table for planning the sample size of a nonstatistical sample. A serious danger is that auditors might consider judgmental samples to be equivalent to statistical samples if they use the sample sizes indicated by the table.
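Sample-size tables of the kind the guide describes rest on standard attribute-sampling theory. As a minimal sketch (illustrative only; this is not the AICPA table, and the function name is hypothetical), the zero-expected-deviation sample size follows from requiring that a clean sample of size n give the desired confidence that the true deviation rate does not exceed the tolerable rate:

```python
import math

def attribute_sample_size(confidence, tolerable_rate):
    """Smallest n such that observing zero deviations in a sample of
    size n gives `confidence` assurance that the true deviation rate
    is at most `tolerable_rate`: find n with (1 - p)^n <= 1 - C."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - tolerable_rate))

# 95% confidence, 5% tolerable deviation rate
print(attribute_sample_size(0.95, 0.05))  # 59
```

A judgmental sample that happens to match such an n carries none of the statistical guarantee behind the formula, which is precisely the danger of treating the two as equivalent.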

4. SOME LESSONS FOR STATISTICIANS

The intent of this historical recitation is not to place any blame for the failure of statisticians and accountants to collaborate more successfully. Rather, I see in this experience some useful lessons for statisticians. There are many other disciplines besides accounting in which expanded uses of statistical methods would be highly desirable and in which special problems in applying statistical methodology are encountered that could lead to useful new methodological developments in statistics. We need to learn how to collaborate more effectively with experts in other subject matter areas. We also need to take greater interest in, and be more sympathetic to, the struggle of specialists in other disciplines to expand their use of statistical methods and to assist in the development of needed new statistical methods. If this greater collaboration with experts of other disciplines makes the boundaries of statistics less well defined, I would conclude that fuzzy boundaries of statistics are preferable to well-defined ones that will isolate us from significant new statistical developments in other subject matter fields.

Two other examples of where, in my view, fuzzy boundaries of statistics are desirable are the evaluation and control of measurement errors in statistical data and the improvement of quality in manufacturing, service, and other processes. Statistical data are subject to a large variety of measurement errors. Some of these are associated with interviewers or other collectors of the data, others are related to respondents, and still others are connected with measurement instruments and the processing of the data. Some statisticians view the evaluation and control of measurement errors as outside the boundaries of statistics, perhaps because they do not attach much importance to the collection of statistical data. This view, which divorces the analysis of data from their collection, is short-sighted in my opinion, because the measurement errors in the data may be much more important than the sampling errors. Under these circumstances, statisticians who fail to consider measurement errors and concentrate entirely on the analysis of sampling errors are grappling with side issues instead of the main problem.

An illustration of the magnitudes of measurement errors that can be present in statistical data is found in a study that Waksberg and I undertook some years ago (Neter and Waksberg 1964). A new sample survey on residential alterations and repairs expenditures was being developed, and data were to be obtained by personal interviews. There were a number of concerns about the data collection procedures, one of which was that recall of expenditures could involve telescoping-that is, expenditures made some months ago are recalled as having been made more recently and hence reported in an incorrect period. An experimental study was undertaken that included two methods of recall for purposes of studying possible telescoping. The first type of recall was unbounded recall, the usual type of recall wherein respondents are asked for expenditures since a given date, without any control over possible telescoping. The second type of recall was bounded recall: the respondent was told of expenditures reported in the previous interview and asked about additional expenditures made during the most recent period. By using the information about expenditures reported in the previous interview, it is possible with bounded recall to exclude the reporting of previously reported expenditures as expenditures for the most recent period.

Based on a comparison of unbounded one-month recall with bounded one-month recall for identical periods of time from panels constituting probability samples from the population under study, it was found that the telescoping of large alterations and repairs expenditures amounted to more than 50% (Neter and Waksberg 1964, p. 27). When measurement errors of this magnitude are present in statistical data, emphasis on refinements in the statistical analysis of the data, rather than on control and evaluation of measurement errors, is misplaced. To argue that the control and evaluation of measurement errors is the responsibility of subject matter specialists alone is to ignore the important contributions that statisticians can make and the vital interest that statisticians should have in the data to be analyzed.
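The telescoping comparison described above can be expressed as the relative excess of the unbounded-recall estimate over the bounded-recall estimate for the same period. A minimal sketch with hypothetical totals (not the study's actual figures; the function name is an assumption for illustration):

```python
def telescoping_rate(unbounded_total, bounded_total):
    """Relative overstatement attributable to telescoping: how much
    the unbounded-recall estimate exceeds the bounded-recall estimate
    for the same reporting period, as a fraction of the bounded total."""
    return (unbounded_total - bounded_total) / bounded_total

# hypothetical expenditure totals for matched one-month periods
rate = telescoping_rate(unbounded_total=1560.0, bounded_total=1000.0)
print(f"{rate:.0%}")  # 56%
```

A rate above 0.5, as in this hypothetical case, corresponds to the "more than 50%" telescoping the study reported for large expenditures.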

The improvement of quality is another example of the desirability of fuzzy boundaries for statistics. During World War II and shortly thereafter, great interest in statistical quality control developed, based on the ideas of Shewhart and stimulated by a number of statisticians, including Deming. In time, interest in statistical quality control waned. More recently, there has been a strong revival of interest in quality assurance and quality improvement as a result of competitive pressures. One of the differences between now and 40 years ago is the realization that statistical methods by themselves are not magic tools that provide instant solutions. As Deming (1982), among others, has stressed, improvements in quality and productivity require a total effort, including management reorientation and changes in organizational operations. Statistical methodology plays a key role, but not an isolated one. Joiner (1985) noted that statisticians can play a key role because many of them are accustomed to thinking about processes rather than isolated events and are trained in gathering and analyzing data about processes to help improve them. But Joiner also stressed the importance of interpersonal skills, team-building skills, and knowledge of organizational behavior in achieving improvements in quality and productivity.
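The Shewhart ideas mentioned above center on the control chart, which monitors a process by charting subgroup means against 3-sigma limits. A minimal x-bar chart sketch, assuming a known process standard deviation (an illustrative assumption; the function name and data are hypothetical, not from the address):

```python
import statistics

def xbar_limits(subgroup_means, sigma, n):
    """Shewhart x-bar chart: center line at the grand mean of the
    subgroup means, with control limits at +/- 3*sigma/sqrt(n),
    where n is the size of each subgroup sample."""
    center = statistics.mean(subgroup_means)
    half_width = 3 * sigma / n ** 0.5
    return center - half_width, center, center + half_width

# five hypothetical subgroup means, each from a sample of size 4
lcl, cl, ucl = xbar_limits([10.1, 9.9, 10.0, 10.2, 9.8], sigma=0.3, n=4)
```

Points falling outside (lcl, ucl) signal special-cause variation worth investigating; the chart itself, as the passage stresses, is only one ingredient in a broader organizational effort.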

Well-defined, narrow boundaries of statistics will tend to keep statisticians isolated and unappreciative of the team role that they must play to achieve lasting gains in quality and productivity improvements. This danger of viewing statistics too narrowly is not confined to the area of quality and productivity improvement. Narrow boundaries of statistics also make it difficult for statisticians to participate effectively in fostering the utilization of current statistical methodologies and the development of needed new ones in other subject matter areas in which expanded uses of statistics could be very helpful.

The problem of too narrow boundaries for statistics also has important implications for ASA. Too narrow and well-defined boundaries for statistics will isolate our organization from significant developments and applications of statistical methodologies that are being carried on by persons on the boundaries of statistics and other disciplines. As one example, let me mention the area of simulation in management science. Important and difficult statistical problems arise in the design and analysis of management science simulation studies (e.g., Fishman 1985), yet much of this research is carried out by persons who do not consider their primary affiliation to be with statistics and who tend to publish in management science journals.

The trend for many major new statistical developments to occur at the interfaces of statistics and other disciplines-such as computer graphics, decision support systems, expert systems, and artificial intelligence-is very clear. In my view, this trend presents a number of major challenges to us as statisticians:

1. We need to undertake more extensive joint work with experts from other disciplines on problems with significant statistical elements.

2. We need to recognize the statistical activities of persons who are not primarily statisticians and to draw them into some of our activities and provide helpful services to them.

3. We need to improve our interpersonal skills to gain acceptance of the statistical point of view and implementation of conclusions based on appropriate statistical methodologies and analyses.

Recognition of the importance of interpersonal skills and attitudes by statisticians is not yet widespread. Boen and Zahn's (1982) monograph on statistical consulting makes a cogent case for the consideration of interpersonal aspects in effective statistical consulting. Broadly viewed, statisticians do consulting whenever they work with other subject matter area specialists, whether in universities, industry, or government. Indeed, the term "statistical consulting" may often carry the wrong connotation; a statistician serving as a "team member" may be a more appropriate term for many situations involving a joint effort. As Deming said in connection with statisticians in industry, "Statisticians and other technical people can reach maximal usefulness in industry, including service industries, only by adopting as their main purpose help to management" (Deming 1985, p. 9).

Statisticians in industry and government probably are more aware of the need for interpersonal skills by statisticians than many academicians, which could explain why statistical consulting skills frequently are not emphasized in academic statistics programs. (An interesting description of a statistical consulting curriculum was recently given by McCulloch, Boroto, Meeter, Polland, and Zahn 1985.)

5. CONCLUDING COMMENTS

Having now developed for you my concerns about the possibility of statisticians excluding themselves from important statistical problems in other disciplines and about the difficulties that are encountered when statisticians work jointly with specialists from other disciplines, I would like to conclude with a personal hope and a personal note. My personal hope is that as the American Statistical Association later this year charts broad directions for the future, it will recognize the growing trend of many new, important statistical developments occurring at the interfaces of statistics and other disciplines, and that it will view this trend as a challenge to work cooperatively with other disciplines. Further, it is my hope that the growing importance of statisticians' working with specialists from other disciplines will be reflected in academic statistics programs in the years ahead.

And now I would like to close on a personal note. Preparing this presidential address has provided me with an opportunity to reflect on my career and my professional association with other statisticians and persons from other disciplines. I have found membership in ASA a most rewarding experience. It has given me the opportunity to meet fine colleagues and to make friendships that have lasted many years. It has also enabled me to serve the profession in a variety of capacities. For all of this, I am most thankful. And I should add that I am also thankful for living in a country that has provided opportunities for a young immigrant refugee to develop and grow.

[Received October 1985.]

REFERENCES

AICPA Statistical Sampling Subcommittee (1983), Audit Sampling, New York: American Institute of Certified Public Accountants.

Aitchison, J. (1955), "On the Distribution of a Positive Random Variable Having a Discrete Probability Mass at the Origin," Journal of the American Statistical Association, 50, 901-908.

Anderson, R. L. (1984), "Goals: Where Are We and Where Should We Be Going?" Journal of the American Statistical Association, 79, 253-258.

Anderson, R., and Teitlebaum, A. D. (1973), "Dollar-Unit Sampling," CA Magazine, 102 (4), 30-39.

Boen, J., and Zahn, D. A. (1982), The Human Side of Statistical Consulting, Belmont, CA: Lifetime Learning Publications.

Box, G. E. P. (1979), "Some Problems of Statistics and Everyday Life," Journal of the American Statistical Association, 74, 1-4.

Bradley, J. V. (1982), "The Insidious L-Shaped Distribution," Bulletin of the Psychonomic Society, 20, 85-88.

Bradley, R. A. (1982), "The Future of Statistics as a Discipline," Journal of the American Statistical Association, 77, 1-10.

Bureau of Economics (1971), Guidelines for the Presentation of the Results of Sample Studies, Statement 71-1, Washington, DC: Interstate Commerce Commission.

Carman, L. A. (1933), "The Efficacy of Tests," American Accountant, 18, 360-366.

Cox, D. R., and Snell, E. J. (1979), "On Sampling and the Estimation of Rare Errors," Biometrika, 66, 125-132.

Davidson, H. J., Neter, J., and Petran, A. S. (1967), "Estimating the Liability for Unredeemed Stamps," Journal of Accounting Research, 5, 186-207.

Deming, W. E. (1982), Quality, Productivity, and Competitive Position, Cambridge, MA: MIT Center for Advanced Engineering Study.

(1985), "Transformation of Western Style of Management," Interfaces, 15 (3), 6-11.

Dworin, L., and Grimlund, R. A. (1984), "Dollar Unit Sampling for Accounts Receivable and Inventory," Accounting Review, 59, 218-241.

Felix, W. L., and Grimlund, R. A. (1977), "A Sampling Model for Audit Tests of Composite Accounts," Journal of Accounting Research, 15, 23-41.

Fienberg, S., Neter, J., and Leitch, R. A. (1977), "Estimating the Total Overstatement Error in Accounting Populations," Journal of the American Statistical Association, 72, 295-302.

Fishman, G. S. (1985), "Estimating Network Characteristics in Stochastic Activity Networks," Management Science, 31, 579-593.

Hacking, I. (1984), "Trial By Number," Science 84, 5, 69-70.

Hammond, A. (1984), "The Choosing of the 20," Science 84, 5, 9.

Hartley, H. O. (1980), "Statistics as a Science and as a Profession," Journal of the American Statistical Association, 75, 1-7.

Internal Revenue Service (1972), "Revenue Procedure 72-36," Internal Revenue Bulletin (No. 1972-29, July 17, 1972).

Joiner, B. L. (1985), "The Key Role of Statisticians in the Transformation of North American Industry," The American Statistician, 39, 224-227.

Leitch, R. A., Neter, J., Plante, R., and Sinha, P. (1982), "Modified Multinomial Bounds for Larger Numbers of Errors in Audits," Accounting Review, 57, 384-400.

Martin, M. E. (1981), "Statistical Practice in Bureaucracies," Journal of the American Statistical Association, 76, 1-8.


McCulloch, C. E., Boroto, D. R., Meeter, D., Polland, R., and Zahn, D. A. (1985), "An Expanded Approach to Educating Statistical Consultants," The American Statistician, 39, 159-167.

Neter, John (1949), "An Investigation of the Usefulness of Statistical Sampling Methods in Auditing," Journal of Accountancy, 87, 390-398.

Neter, J., and Godfrey, J. (1985), "Robust Bayesian Bounds for Monetary-Unit Sampling in Auditing," Journal of the Royal Statistical Society, Ser. C, 34, 157-168.

Neter, J., and Loebbecke, J. K. (1975), Behavior of Major Statistical Estimators in Sampling Accounting Populations-An Empirical Study, New York: American Institute of Certified Public Accountants.

Neter, J., and Waksberg, J. (1964), "A Study of Response Errors in Expenditures Data From Household Surveys," Journal of the American Statistical Association, 59, 18-55.

Reneau, J. H. (1978), "CAV Bounds in Dollar Unit Sampling: Some Simulation Results," Accounting Review, 53, 669-680.

Roberts, D. M. (1978), Statistical Auditing, New York: American Institute of Certified Public Accountants.

Rubin, D. B. (1984), "Bayesianly Justifiable and Relevant Frequency Calculations for the Applied Statistician," Annals of Statistics, 12, 1151-1172.

Savage, I. R. (1985), "Hard-Soft Problems," Journal of the American Sta- tistical Association, 80, 1-7.

Stringer, K. W. (1963), "Practical Aspects of Statistical Sampling in Auditing," in Proceedings of the Business and Economic Statistics Section, American Statistical Association, pp. 405-411.

Tsui, K. W., Matsumura, E. M., and Tsui, K. L. (1985), "Multinomial Dirichlet Bounds for Dollar-Unit Sampling in Auditing," Accounting Review, 60, 76-96.

Vance, L. L. (1950), Scientific Method for Auditing, Berkeley: University of California Press.

Vance, L. L., and Neter, J. (1956), Statistical Sampling for Auditors and Accountants, New York: John Wiley.
