The History of Keysteps of Computational Statistics

Wilfried Grossmann, University of Vienna, Austria
Michael G. Schimek, Medical University of Graz, Austria
Peter Paul Sint, Austrian Academy of Sciences, Vienna

30 Years



1974: Department of Statistics and Informatics, University of Vienna

Peter Paul, a „senior“ Assistant Professor


Department of Statistics and Informatics, University of Vienna

1974 A few years after

Wilfried, a „junior“ Assistant Professor


1974 A few years after

Michael, a first year student

University of Vienna

Gerhart Bruckmann


Outline of Presentation

The Beginning of COMPSTAT
• Early statistical computing
• The institutional environment
• The first symposium and the Compstat Society

Developments in Computational Statistics (CS)
• CS and statistical theory
• CS and algorithms
• CS and computer science
• CS and application

The COMPSTAT Symposia


The Beginning of COMPSTAT


Early Computational Statistics

• The Beginnings in Vienna
– Institute of Statistics
• Part of the Law Faculty - S. Sagoroff - Leipzig/Sofia/USA/Berlin//Vienna - Energy Balances
• First computer: first generation machine
– Paid for by Rockefeller Foundation 1960
– Arrival of the ‚Electronic Brain‘ (1st generation)
» Never again similar enthusiasm
• Institute of Advanced Studies - Ford Institute
– Statistical machines - card counting - >2nd generation
• Replaced by IBM /360-44 - 3rd gen. SSP / SPSS
– Computing Center


Statistics-Computational

One year Biostatistics department Oxford University
Still: Not strongly integrated in international statistical community - Main contacts ISI: Central Statistical Office, Sagoroff
1973 ISI-session in Vienna - emphasis on applications - computational methods rare
Bring statisticians with our interests to Vienna
Encouragement by publisher Arnulf Liebing /Physica/
What is specific to our department?
Concept of Computational Statistics - Johannes Gordesch (Math) - Peter Paul Sint (Physics)


First COMPSTAT Call

COMPSTAT 1974

- Gerhart Bruckmann - Local fame as analyst of voting results during election nights
- Leopold Schmetterer (successor of Sagoroff) - Internationally known Mathematical Statistician

(Franz Ferschl, incoming professor of statistics, new editor of Metrika - added as an editor by the publisher)


S. Sagoroff and M. Tantilov


First COMPSTAT Editors


Preface of the first Proceedings


Logic of the Logo


J. Gordesch at Compstat76 Berlin


Coming of Age

• International from the start
• Compstat Society since Berlin
• Leiden NL 1978 Integration into IASC
• Edinburgh GB 1980 - Toulouse F 1982
• Eastern Europe needed Politics ISI-IASC
• Local Projects redirected: Prague 1984
• Rome I 1986 - Copenhagen DK 1988
• Dubrovnik YU 1990 - Neuchâtel CH 1992


Prague 1984


Developments in Computational Statistics


Computational Statistics

• What is Computational Statistics?
  – A question raised many times inside the community at the end of the 1980s and the beginning of the 1990s


Computational Statistics

• Working definition (A. Westlake)

Computational Statistics is related to the advance of statistical theory and methods through the use of computational methods. This includes both the use of computation to explore the impact of theories and methods, and development of algorithms to make these ideas available to users.


Computational Statistics

[Diagram labels: Computational Statistics; Statistical Theory; Algorithms; Applications; Computer Science; Numerical Analysis; Statistical Software; Modelling; Seminumerical Algorithms]


Computational Statistics and Statistical Theory

• The statistical journey in the 20th century

• The Theory Era

• The Methodology Era


Computational Statistics and Statistical Theory

• The statistical journey in the 20th century
  – B. Efron: Statistics in the 20th century is a journey between three poles:
    • Applications
    • Mathematics
    • Computation


Computational Statistics and Statistical Theory

• The Theory Era (Pearson, Neyman, Fisher, Wald)
  – From models for solving practical problems towards a mathematical decision theoretic framework
  – Based on optimality principles
  – Application is based on computations feasible for paper and pencil or mechanical computing devices


Computational Statistics and Statistical Theory

• Modelling Era (1)
  – Tukey’s paper about the future of data analysis (1962) as a turning point from mathematics towards computation
    • Confirmatory versus exploratory analysis
    • Dynamics of data analysis
    • “Robustness”
    • Importance of Graphics


Computational Statistics and Statistical Theory

• Modelling Era (2)
  – Important developments in the modelling era
    • Nonparametric and Robust Methods
    • Kaplan-Meier and Proportional Hazards
    • Logistic Regression and GLM
    • Jackknife and Bootstrap
    • EM and MCMC
    • Empirical Bayes and James-Stein Estimation


Computational Statistics and Statistical Theory

• Modelling Era (3)
  – The modelling era is characterized by a strong interplay between statistical theory and computational statistics
  – The computer as a workbench for statistical experiments (going back to J. von Neumann and S. Ulam)
    • Passive usage: Studying feasibility of statistical theory by simulation
    • Active usage: Obtain results which cannot be computed by conventional numerical algorithms
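
The "active usage" of the computer can be illustrated with a minimal sketch, assuming a nonparametric bootstrap as the example (the slides list the bootstrap among the developments of the modelling era but show no code). The function name `bootstrap_se`, its parameters and the data values are hypothetical and chosen only for illustration.

```python
import random
import statistics


def bootstrap_se(sample, statistic, n_resamples=2000, seed=0):
    """Estimate the standard error of `statistic` by resampling with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    replicates = []
    for _ in range(n_resamples):
        # Draw n observations with replacement from the original sample.
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        replicates.append(statistic(resample))
    # The spread of the replicates approximates the sampling variability.
    return statistics.stdev(replicates)


# Hypothetical data, invented for illustration only.
data = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 6.1, 3.3, 4.9]

# No simple closed-form standard error exists for the sample median,
# so it is obtained by simulation.
se_median = bootstrap_se(data, statistics.median)

# For the mean a formula exists, which allows a plausibility check.
se_mean = bootstrap_se(data, statistics.mean)
```

The standard error of the median is the kind of result that has no convenient closed form, whereas the bootstrap value for the mean can be compared with the textbook formula s/√n.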


Computational Statistics and Statistical Theory

• COMPSTAT was probably not always at the frontier of these developments, but the programs and the proceedings reflect quite well the dynamics of the subject in the Modelling Era


Computational Statistics and Algorithms

• Numerical Algorithms
  – Matrix Computation, Optimization
• Random Numbers / Monte Carlo
• Semi-numerical Algorithms
  – Sorting, Searching, Combinatorial Methods, Graph Theoretic Algorithms,…
• Graphical Algorithms
• Symbolic Computation (?)
• Mathematical vs. Statistical Modelling


Computational Statistics and Algorithms

• Statistics and Numerical Algorithms (1)
  – Fast Fourier Transform (Tukey)
  – Recursive Algorithms and Filtering (Kalman Filter)

(Neither topic seems to be a core topic in computational statistics)


Computational Statistics and Algorithms

• Statistics and Numerical Algorithms (2)
  – Adaptation of optimization techniques (e.g. scoring methods)
  – Behaviour of optimization methods in statistical context (numerical convergence vs. stochastic convergence concepts)

Implicit Consideration at COMPSTAT
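
As a hedged illustration of a scoring method, the sketch below applies Fisher scoring to the location parameter of a Cauchy distribution with unit scale. This example is an assumption of this note, not taken from the slides; the names `cauchy_score` and `fisher_scoring_cauchy` and the data are hypothetical. Fisher scoring replaces the observed second derivative used by Newton's method with the expected Fisher information, which for this model is n/2.

```python
import statistics


def cauchy_score(theta, xs):
    """Derivative of the Cauchy (unit scale) log-likelihood with respect to theta."""
    return sum(2.0 * (x - theta) / (1.0 + (x - theta) ** 2) for x in xs)


def fisher_scoring_cauchy(xs, tol=1e-10, max_iter=200):
    """Fisher scoring for the Cauchy location parameter, started at the median."""
    theta = statistics.median(xs)
    info = len(xs) / 2.0  # expected Fisher information: 1/2 per observation
    for _ in range(max_iter):
        step = cauchy_score(theta, xs) / info
        theta += step
        if abs(step) < tol:  # numerical convergence of the iteration
            break
    return theta


# Hypothetical data with one outlying value, invented for illustration only.
xs = [-2.0, -1.0, 0.0, 1.0, 12.0]
theta_hat = fisher_scoring_cauchy(xs)
```

The stopping rule checks numerical convergence of the iteration only. How close `theta_hat` is to the true location is a separate, stochastic question that depends on the sample.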


Computational Statistics and Algorithms

• Statistics and Random Numbers / Monte Carlo
  – Generation of random numbers was (and is) probably more a topic of mathematics (number theory) and computer science
    • In the beginning of COMPSTAT there was also some connection to simulation
  – Genuine application of Monte Carlo Methods in connection with new developments of statistical theory (e.g. MCMC)


Computational Statistics and Algorithms

• Statistics and semi-numerical algorithms
  – Applications in context of nonparametric statistics and analysis of tabular data
    • Feasibility of conditional inference for logistic models
  – New developments on the borderline between statistics and computer science
    • Data Mining as a new statistical modelling paradigm

COMPSTAT was open towards these developments and integrated them into the program


Computational Statistics and Algorithms

• Statistics and Graphical Algorithms
  – Development rather complementary to the developments of computer science
  – Important issues (L. Wilkinson):
    • Graphics are not only a tool for displaying results but rather a tool for perceiving relationships
    • Dynamic graphics as important tool for data analysis
    • Graphics are a means of model formalization reflecting quantitative and qualitative traits of its variables

Represented quite well at COMPSTAT


Computational Statistics and Algorithms

• Mathematical vs. Statistical Modelling
  – Emphasis on different methods (e.g. Differential Equations)
  – Different modelling environments (J. Nelder)
    • Data structures in statistics
    • Exploratory nature of statistical analysis (statistical analysis cycle)
    • Competence of users


Computational Statistics and Computer Science

• Developments in Statistical Software

• Development of Statistical Languages

• Developments in Statistical Database Management


Computational Statistics and Computer Science

• Developments in Statistical Software (1)
  – From numerical subroutines towards statistical packages
  – Main goals:
    • Taking into account the peculiarities of statistical data analysis
    • Usage of current hardware developments


Computational Statistics and Computer Science

• Developments in Statistical Software (2)
  – COMPSTAT was from the beginning onwards an important forum for the development of statistical software
    • The proceedings in the beginning of the eighties show numerous software developments for specific statistical models
    • There was always some tension in connection with presentation of commercial software developments and the scientific character of the conference


Computational Statistics and Computer Science

• Development of Statistical Languages (1)
  – GLIM was probably the first genuine statistical modelling language
    • Present at COMPSTAT from the very beginning


Computational Statistics and Computer Science

• Development of Statistical Languages (2)
  – The S language set up a new paradigm for computing which is of interest also outside statistical applications
    • Contribution in Computer Science honoured by the ACM Software System Award to J. Chambers

Although it started as early as 1976, it took a long time to enter the COMPSTAT community


Computational Statistics and Computer Science

• Development of Statistical Languages (3)
  – R gained popularity rather quickly inside COMPSTAT due to its free availability and the effective organisation of CRAN
  – Omegahat: An umbrella for open source projects in computational statistics covering not only statistical computation but also other important aspects in distributed computing


Computational Statistics and Computer Science

• Development of Statistical Languages (4)
  – XLISP-Stat as proof of concept (in particular for animated graphics)
  – XploRe as Java based production system


Computational Statistics and Computer Science

• Statistical Data Base Management
  – Main challenge is appropriate usage of the developments in database technology in statistical context
    • Combination of statistical data structures and statistical processing activities with conceptual data models
    • Representation of tabular data
    • Metadata as a tool to capture the complexity of statistical data

A small but active group inside the COMPSTAT community from the very beginning


Computational Statistics and Applications

• Challenges for Computational Statistics

Rather independent from application area
  – Data
    • Data capture
    • Data structures
    • Data size
  – Analysis Process
    • Analysis strategies
    • The role of the statistician in the computer age


Computational Statistics and Applications

• Data challenges (1)
  – Contributions towards data challenges occur occasionally at COMPSTAT
• Current problems
  – Data capture
    • Data capture tools are rather a side branch of computational statistics and more connected to official statistics
    • Data streams are a new challenge that has so far attracted little attention in the computational statistics community


Computational Statistics and Applications

• Data challenges (2)
  – Data structures
    • New problems (e.g. in connection with data mining) raise questions with respect to the applicability of the basic statistical analysis paradigm (population, sample, measurement process)
  – Data size
    • Handling huge datasets

At the moment, none of these challenges seems to be a core topic of computational statistics


Computational Statistics and Applications

• Analysis process
  – Analysis strategies
    • The question of formalization of analysis strategies was a hot topic at the COMPSTAT conferences at the end of the 1980s, but there was limited success
  – The role of statisticians in the computer age
    • Is progress in computational statistics an enabler for statisticians, or does it lead to a de-skilling of the statistical profession?


The COMPSTAT Symposia


A full set of COMPSTAT proceedings (one statistical outlier removed)

Do you see the CSDA volumes in the background?

Here they are!


The COMPSTAT Symposia I

Symposium  Year  Organizers                      # Submissions  # Papers I/C  # Participants
Vienna     1974  Sint                            50  100
Berlin     1976  Gordesch, Naeve                 58  180
Leiden     1978  Corsten, Hermans                68  310
Edinburgh  1980  Barritt, Wishart                250  4/82  750
Toulouse   1982  Caussinus, Ettinger, Tomassone  250  15/60  500


The COMPSTAT Symposia II

Symposium   Year  Organizers               # Submissions  # Papers I/C  # Participants
Prague      1984  Havranek, Sidak, Novak   300            7/65          ???
Rome        1986  De Antoni, Lauro, Rizzi  300            14/60         900
Copenhagen  1988  Edwards, Raun            300            9/51          800
Dubrovnik   1990  Momirovic                115            6/43          180
Neuchâtel   1992  Dodge, Whittaker         115            11/115        200


COMPSTAT 1994 Vienna and Satellite Meeting on Smoothing Semmering (World Cultural Heritage)

Randy Eubank

Andrew Westlake, Allmut Hörmann, Wolfgang Härdle


On the track from Vienna to Semmering in the Austrian Alps (historical train)

The organizer


Satellite Meeting on Smoothing

We finally arrived at the mountain spa Semmering

Antoine de Falguerolles and the organizer at the opening


The COMPSTAT Symposia III

Symposium                      Year  Organizers                  # Submissions  # Papers I/C  # Participants
Vienna, Semmering (Satellite)  1994  Dutter, Grossmann, Schimek  200, 30        11/60, 7/26   380, 50
Barcelona                      1996  Prat                        250            13/56         300
Bristol                        1998  Payne, Green                180            12/58         370
Utrecht                        2000  Van der Heijden, Bethlehem  250            15/60         220
Berlin                         2002  Härdle                      220            9/90          260


The COMPSTAT proceedings from the Vienna and Semmering meetings

Model of Vienna University

Kastalia Fountain