
Probability and Statistics


ISBN 978-80-904948-6-2

Monolingual English Version


The Principles of Probability and Statistics (data mining approach)

Monolingual English Version

Základy pravděpodobnosti a statistiky (data miningový přístup)

Jednojazyková anglická verze

CURRICULUM 2013. First edition.

No part of the present publication may be reproduced or distributed in any way or in any form without the express permission of the author and of the Publishing House Curriculum.

The publisher and the author will appreciate any comments concerning the work. They may be forwarded to the addresses of the publisher and the author presented below.

The grant project was supported by: MARKET PROMOTION INSTITUTE

The Company Corporation – 1313 N.Market Street – Wilmington, DE 19801-1151,

U.S.A.

The publisher: Publishing House CURRICULUM

Cholupická 39, CZ-142 00 Praha 4, Czech Republic

e-mail: [email protected]

The author: Assoc. Prof. RNDr. Přemysl Záškodný, CSc., Emy Destinové 17,

CZ-370 01 České Budějovice, Czech Republic

e-mail: [email protected]

Affiliation of the author:

The University of South Bohemia, České Budějovice, Czech Republic

The University of Finance and Administration, Praha, Czech Republic

The reviewers:

RNDr. Ivan Havlíček, CSc.

Assoc. Prof. Ing. Vladislav Pavlát, CSc.

Mgr. Petr Procházka

Assoc. Prof. PaedDr. Jana Škrabánková, CSc.

Online presentation: http://sites.google.com/site/csrggroup/textbook2/

ISBN 978-80-904948-6-2


CONTENTS

Introduction

Part 1. The Main Methods of Descriptive Statistics, Statistical Probability
1.1. Formulation of Statistical Investigation
1.2. Creation of Scale
1.3. Measurement
1.4. Elementary Statistical Processing
1.4.1. Table
1.4.2. Empirical Distribution of Frequencies
1.4.3. Empirical Parameters
1.4.4. Illustration of Calculation of Empirical Parameters

Part 2. The Main Methods of Mathematical Statistics, Probability Distribution
2.1. Assignment of Theoretical Distribution to Empirical Distribution
2.1.1. Interval Division of Frequencies
2.1.2. Theoretical Distribution
2.1.3. Description of Selected Theoretical Distributions
2.1.4. Apparatus of Non-parametric Testing
2.1.5. Illustration of Non-parametric Testing
2.2. Comparison of Empirical and Theoretical Parameters – Estimations of Theoretical Parameters, Testing Parametric Hypotheses
2.2.1. Basics of Estimation Theory
2.2.2. Illustration of Confidence Intervals Construction
2.2.3. Basics of Parametric Hypotheses Testing
2.2.4. Illustration of Parametric Testing
2.3. Measurement of Statistical Dependences – Some Fundaments of Regression and Correlation Analysis
2.3.1. Delimitation of Problem
2.3.2. Simple Linear and Quadratic Regression Analysis
2.3.3. Simple Linear and Quadratic Correlation Analysis
2.3.4. Illustration of Dependence Measurement

Part 3. Applications
3.1. Description of Statistical and Probability Base of Financial Options
3.1.1. Introduction
3.1.2. Financial Options
3.1.3. Statistical and Probability Base of Black-Scholes Model
3.1.4. Statistical and Probability Base of Binomial and Trinomial Model
3.1.5. Statistical and Probability Data Mining Tools – Normal, Binomial and Trinomial Distribution
3.1.6. Conclusion
3.2. Description of Statistical and Probability Base of Greeks
3.2.1. Introduction
3.2.2. Greeks
3.2.3. Value Function
3.2.4. Segmentation and Definitions of Greeks
3.2.5. Indications of Greeks
3.2.6. Formulas for Greeks
3.2.7. Needful Statistical and Probability Relations for Deduction of Greeks Formulas
3.2.8. Conclusion, References
3.3. Data Mining Tools in Statistics Education
3.3.1. Introduction
3.3.2. Data Mining
3.3.3. Data Preprocessing in Statistics Education
3.3.4. Data Processing in Statistics Education
3.3.5. Complex and Partial Tool of DMSTE – CP-DMSTE, ASM-DMSTE
3.3.6. Conclusion, References
3.3.7. Supplement of Chapter 3.3. – The Principles of Data Mining Approach
3.3.7.1. Quotations from Sources
3.3.7.2. Brief Summary
3.3.7.3. Data Mining Cycle, References

Part 4. Statistical Tables
CV of Author
Bibliography of Author
Global References


THE PRINCIPLES

OF PROBABILITY AND STATISTICS

(DATA MINING APPROACH)

Introduction

The subject of probability and statistics is the application of descriptive statistics, mathematical statistics and probability theory to the investigation of collective random phenomena. To describe these applications it is necessary first to be concerned with descriptive and mathematical statistics and with probability theory. Since the extent of this presentation of probability and statistics is to a certain degree limited (due to the orientation of the study text towards concrete branches of study), it will be effective to become acquainted above all with the main statistical methods, to illustrate them continuously by the assigned example, by a survey of acquired concepts and by check questions, to touch marginally on some concepts of probability theory, and finally to approach the applications. A study structured in this way, although accessible for the attendance and combined forms of study, cannot be confused with a continuous and coherent study of statistics and probability theory as separate scientific disciplines.

The structure of the presentation will be introduced by an analytical-synthetic model of the structure of statistics as a whole. This model can be used for the immediate classification of a statistical method and for the immediate location of previous and follow-up methods. The model also has a significant cognitive dimension – it shows which operations of analysis, abstraction and synthesis are to be carried out to complete the adoption of the relevant statistical method. The model presented in figure Fig.1 contains four partial analytical-synthetic structures. The model in figure Fig.1, the legend to figure Fig.1 and the description of the component structural parts are presented only in English.

The following short part of the text, presented only in English, represents the data mining approach to the study of the principles of statistics and of several needful concepts of probability. The data mining approach enables one to work with the integral concepts and knowledge pieces in their system shape (see the analytical-synthetic model). The data mining approach is explained in more detail in Part 3 "Applications". The immediate structural orientation, showing which part of statistics and its probability applications is currently being acquired in the course of the study, is not useless. It is always good to know whether the selective statistical set (SSS) is "only" determined (the first partial structure from element a-1 up to element e-1), whether the empirical picture of the set SSS is already created (the second partial structure from element a-2 up to element e-2), whether the probability picture of the set SSS is already explored (the third partial structure from element a-3 up to element e-3), or whether the process of creation of the associative picture of the set SSS has already been entered (the fourth partial structure from element a-4 up to element e-4). In addition, the study of texts in English is a needful assumption for the study of foreign literature.


Fig.1 Analytical-synthetic model of statistics and needful probability concepts, formed by four partial models a-1 → e-1, a-2 → e-2, a-3 → e-3, a-4 → e-4 (the boxes of the flowchart, regrouped here by partial model):

First partial model (a-1 to e-1): Collective random phenomenon and reason of its investigation (a-1) – Statistical unit – Statistical sign – Variants (values) of statistical sign – Choice of statistical units – Selective statistical set (SSS) as a part of basic statistical set, Goals of statistical examination (e-1 = a-2)

Second partial model (a-2 to e-2): Creating of scale – Measurement – Statistical probability – Frequencies tables (Empirical distribution) – Graphical expression – Empirical parameters – Empirical picture of selective statistical set, Necessity of probability investigation (e-2 = a-3)

Third partial model (a-3 to e-3): Probability distributions – Choice of acceptable theoretical distribution – Quantification of theoretical parameters – Comparison of theoretical and empirical parameters – Testing non-parametric hypotheses – Point & interval estimation (e.g. confidence interval) – Testing parametric hypotheses – Empirical & probability picture of selective statistical set, Necessity of association investigation (e-3 = a-4)

Fourth partial model (a-4 to e-4): Statistical dependence (causal, non-causal) – Regression analysis – Correlation analysis – Empirical & probability & association picture of selective statistical set, Interpretation and conclusions as the statistical & probability dimension of the investigated collective random phenomenon (e-4)

Applied probability and statistics (e.g. financial options and their mathematical and statistical elaboration by means of Greeks calculation and option hedging models)


LEGEND to the whole figure Fig.1

One-Sample Analysis, Two / Multiple-Sample Analysis

LEGEND to the partial models of figure Fig.1

a-1 → e-1: Formulation of statistical examination

a-2 → e-2: Relative & Cumulative Frequencies (Empirical distribution)

Plotting functions: e.g. Plot Frequency Polygon (Graphical expression)

Average-Means (Arithmetic Mean), Variance-Standard (Determinative) Deviation, Obliqueness (Skewness), Pointedness (Kurtosis) – (Empirical parameters)

a-3 → e-3: Theoretical Distribution (partial survey in alphabetical order): Bernoulli, Beta, Binomial, Chi-square, Discrete Uniform, Erlang, Exponential, F, Gamma, Geometric, Lognormal, Negative binomial, Normal, Poisson, Student's, Triangular, Trinomial, Uniform, Weibull

Testing Non-parametric Hypotheses (Hypothesis test for H0 – accept or reject H0): e.g. computed Wilcoxon's test, Kolmogorov-Smirnov test, Chi-square test, e.g. at alpha = 0.05

Point & Interval Estimation: e.g. confidence interval for Mean, confidence interval for Standard Deviation

Testing Parametric Hypotheses (Hypothesis test for H0 – accept or reject H0): e.g. computed u-statistic, t-statistic, F-statistic, Chi-square statistic, Cochran's test, Bartlett's test, Hartley's test, e.g. at alpha = 0.05

a-4 → e-4: Statistical dependence: e.g. confidence interval for difference in Means (Equal variances, Unequal variances), e.g. confidence interval for Ratio of Variances

Regression analysis: simple – multiple, linear – non-linear

Correlation analysis: e.g. Rank correlation coefficient, Pearson's correlation coefficient


Description of the four partial analytical-synthetic structures

The example of the applicability of analytical-synthetic modelling presented via Fig. 1 is introduced by means of a description of statistics as a whole. In the framework of this description it is possible to indicate four partial analytical-synthetic structures of the statistical dimension of the investigated problem.

Now these four partial analytical-synthetic structures will be presented. Within this presentation let us compare the general model of the analytical-synthetic structure of an investigated problem (from the investigated phenomenon to the result of solution given by intellectual reconstruction) with figure Fig. 1 "Analytical synthetic model of statistics formed by four partial models".

First structure a-1 → e-1 (see Fig. 1)

From the investigated phenomenon (marked a-1)

"Collective random phenomenon and reason of its investigation"

to the result of intellectual reconstruction (marked e-1)

"Selective statistical set as a part of basic statistical set"

Second structure a-2 → e-2 (see Fig. 1)

From the investigated phenomenon (marked a-2)

"Selective statistical set as a part of basic statistical set"

to the result of intellectual reconstruction (marked e-2)

"Empirical picture of selective statistical set"

Third structure a-3 → e-3 (see Fig. 1)

From the investigated phenomenon (marked a-3)

"Empirical picture of selective statistical set"

to the result of intellectual reconstruction (marked e-3)

"Probability picture of selective statistical set"

Fourth structure a-4 → e-4 (see Fig. 1)

From the investigated phenomenon (marked a-4)

"Probability picture of selective statistical set"

to the result of intellectual reconstruction (marked e-4)

"Association picture of selective statistical set"

Applied statistics a-5 (see Fig. 1)


The structure of the explanation will reflect the model represented by figure Fig.1. Therefore, the interpretation of the individual paragraphs can be described by means of the structural elements a-1 up to a-5 and e-1 up to e-4. For persons interested in a deeper understanding, the explanation will be completed both by the chapter explaining some basic concepts of probability theory and by the survey of basic statistical tables.

The structure of explanation will be as follows:

Part 1. The main methods of descriptive statistics, Statistical probability

1.1. Formulation of statistical investigation

(from element a-1 to element e-1)

1.2. Creation of scale

(from element a-2 to element e-2)

1.3. Measurement, Probability

(from element a-2 to element e-2)

1.4. Elementary statistical processing

(from element a-2 to element e-2)

Part 2. The main methods of mathematical statistics, Probability distribution

2.1. Assignment of theoretical distribution to empirical distribution – testing non-parametric

hypotheses, Probability – theoretical distributions

(from element a-3 to element e-3)

2.2. Comparison of empirical and theoretical parameters – estimations of theoretical

parameters, testing parametric hypotheses

(from element a-3 to element e-3)

2.3. Measurement of statistical dependences – some fundaments of regression and

correlation analysis

(from element a-4 to element e-4)

Part 3. Applications (element a-5)

3.1. Description of statistical and probability base of financial options

3.2. Description of statistical and probability base of Greeks

3.3. Data Mining Tools in Statistics Education

Part 4. Statistical tables


Part 1. The Main Methods of Descriptive Statistics, Statistical Probability

1.1. Formulation of Statistical Investigation

Goals:

- Collective random phenomenon and reason of its investigation

- Selective statistical set as a part of basic statistical set

Acquired concepts and knowledge pieces:

Collective random phenomenon, statistical unit, statistical sign – statistical character, values

of statistical sign, basic statistical set – basic statistical file – population, selective statistical

set – sample statistical file

Check questions:

- What is the subject of investigation of statistics and probability theory

- What is the collective random phenomenon

- How is the statistical unit delimited

- How are statistical sign and its values delimited

- What is the difference between basic and selective statistical set

- Why is the process of random selection important


The explanation will be illustrated by means of the assigned example.

Assigned example:

4000 enterprises have undergone tests of "export ability". For preliminary information it was necessary to determine the average "export ability" on a scale from 1 to 5 (1 – maximum export ability, 5 – minimum export ability). That is why 50 tests were randomly selected; their results are presented in table Tab.1. Elaborate the collective random phenomenon (export ability of an enterprise) gradually and complexly.

xi    ni    ni/n    Σ ni/n    xi·ni    xi²·ni    xi³·ni    xi⁴·ni
1      9    0.18     0.18        9         9         9         9
2     15    0.30     0.48       30        60       120       240
3     20    0.40     0.88       60       180       540      1620
4      4    0.08     0.96       16        64       256      1024
5      2    0.04     1.00       10        50       250      1250
Σ     50    1.00               125       363      1175      4143

Table Tab.1: The results of the elaboration of the 50 tests

The formulation of a statistical investigation is based on the delimitation of the following concepts:

- collective random phenomenon CRP

- statistical unit SU

- statistical sign SS

- values of statistical sign VSS

- basic statistical set and its extent BSS

- random selection RS

- selective statistical set and its extent SSS

The collective random phenomenon CRP (e.g. the export ability of an enterprise) is the realization of activities or processes whose result cannot be predicted with certainty and which are taking place in an extensive set of elements (e.g. enterprises). These elements have a certain group of identical properties (e.g. an identical type of economic parameter – the enterprise character) and another group of different properties (e.g. the different values of the export ability of the global economic state of an enterprise). Mathematical statistics and probability theory deal with the qualitative and quantitative analysis of the patterns of collective random phenomena.

The statistical unit SU is delimited by the identical properties of the investigated set elements (e.g. the enterprises and their character).

The statistical sign SS is given by some of the different properties of the investigated set elements (e.g. by the export ability of an enterprise).

The values of statistical sign VSS are a way of describing the investigated statistical sign (e.g. the description of the export ability of mining industry enterprises by the percentage of the mined ore transported for processing within a fortnight from extraction).

The basic statistical set BSS (population) is given by all the statistical units; its extent is equal to the number of all the statistical units (e.g. the extent of the investigated BSS is equal to the total number of 4000 enterprises in the assigned example). It is usually not within the practical possibilities of statisticians to investigate the statistical sign SS in all the statistical units SU, and it is required to limit the number of statistical units SU.

The random selection RS limits the number of investigated statistical units SU in such a way that the results obtained can be transferred to the entire BSS. Various ways of random selection exist (drawing lots, generating a table of random numbers, deliberate selection). It is necessary to verify whether the selection obtained can be considered random.

The selective statistical set SSS is given by those statistical units which have been selected from the basic statistical set by the process of random selection. The extent of the SSS is equal to the number of selected statistical units (e.g. the extent of the SSS in the assigned example is equal to the number of 50 selected enterprises). The selective statistical set SSS is one-dimensional if only one statistical sign is investigated, and multidimensional if more statistical signs are investigated.

The formulation of the statistical investigation is implemented in the assigned example by the delimitation of the selective statistical set of 50 enterprises. In the context of this delimitation all the follow-up concepts must be exactly characterized – the investigated collective random phenomenon CRP, the definition of the statistical unit SU, the determination of the investigated statistical sign SS, the characterization of the statistical sign values VSS, the exact delimitation of the basic statistical set BSS and, finally, the ensuring of the procedure of random selection RS.


1.2. Creation of Scale

Goals:

- Creation of scale – scaling

- Choice of scale type

Acquired concepts and knowledge pieces:

Scale, classification of scales, parameters of the selected type of scale

Check questions:

- What is the creation of scale

- According to which facts is it possible to distinguish the types of scales

- What are the basic types of scales

- What is the difference between the quantitative metric scale and absolute metric scale

The scale creation is the suitable expression of statistical sign values by means of scale elements. The point is that the statistical sign values can be divided into reasonable groups, the scale elements. The system of scale elements creates the scale. The number k of scale elements can be calculated, for example, by the Sturges rule k = 1 + 3.3·log10 n, where n is the extent of the selective statistical set SSS.
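For illustration, the rule can be evaluated by the following short Python sketch (the function name sturges is only illustrative):

```python
import math

def sturges(n: int) -> int:
    # Sturges rule k = 1 + 3.3*log10(n) for the number k of scale elements
    return round(1 + 3.3 * math.log10(n))

print(sturges(50))  # for the assigned example (n = 50) the rule suggests k = 7
```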

According to the nature of the statistical sign it is possible to distinguish, e.g., four types of scales: qualitative (nominal), ordinal, quantitative metric and absolute metric. The classification of scales can also be used to classify statistical signs. In some cases the statistical sign values immediately identify the scale and scaling is not necessary.

The nominal scale is the classification into categories (the scale elements are the individual categories). For every two statistical units of the selective statistical set it is possible to decide whether or not they are, in terms of the investigated statistical sign, identical or different (such as gender or employment, if the statistical units are individual persons).


The ordinal scale enables one not only to decide on the identity or the diversity of the statistical units, but also to establish their order (e.g. the achieved degree of scholastic education). The scale elements are the individual ranks. This scale does not enable one to determine the distance between two neighbouring statistical units arranged according to it.

The quantitative metric scale already enables one to establish the distance between two neighbouring statistical units – from this perspective it is needful to define the unit of scale (e.g. the percentage evaluation of export ability or of another parameter of the global economic condition, the temperature in degrees Celsius). The scale elements are the individual points of the scale expressed by numerical values. The quantitative metric scale expresses the values of the statistical sign without the possibility of factually interpreting the beginning (zero point) of the scale – the choice of the scale beginning is a matter of free choice.

The absolute metric scale is a quantitative metric scale for which, in addition, the beginning of the scale can be interpreted factually – the scale zero responds to the real zero value of the investigated statistical sign (e.g. the temperature in degrees Kelvin, the number of errors in testing, the length of school attendance). The scale elements are the individual points of the scale expressed by numerical values, together with the absolute zero of the scale. Only the absolute metric scale enables one to calculate ratios; the proportion of any two points of the scale does not depend on the choice of the scale unit.

In the assigned example the statistical sign values "degree of export ability" are given by the degrees 1, 2, …, 5. Evidently, a way of expressing export ability had to be produced (e.g. degree 1 – 100%-80% of mined ore exported by an enterprise of the mining industry, degree 2 – 80%-60% of mined ore exported, …, degree 5 – 20%-0% of mined ore exported) – so the degrees 1, 2, …, 5 can be identified with a scale, which is a typical quantitative metric scale. The scale elements are the points of the scale expressed by the numerical values x1 = 1, x2 = 2, …, x5 = 5. This scale should reflect "the identical distance (e.g. 20%)" of export ability between any two neighbouring scale elements.
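A small Python sketch of such a scaling (the helper export_degree and the exact treatment of the 20% boundaries are illustrative assumptions, not taken from the text):

```python
def export_degree(exported_percent: float) -> int:
    # degree 1: 100%-80%, degree 2: 80%-60%, ..., degree 5: 20%-0% of mined ore exported
    for degree, lower_bound in enumerate((80, 60, 40, 20), start=1):
        if exported_percent >= lower_bound:
            return degree
    return 5

print(export_degree(85))  # -> 1 (maximum export ability)
print(export_degree(10))  # -> 5 (minimum export ability)
```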


1.3. Measurement

Goals:

- Process of measurement

- Expression of measurement results

Acquired concepts and knowledge pieces:

Measurement, absolute frequency, relative frequency, cumulative frequencies

Check questions:

- What is the measurement within statistical elaboration of collective random

phenomenon

- What does the selection of measurement method depend on

- What conditions must the measurement method fulfil

- What are the results of measurement

- What is the statistical definition of probability

- How is the absolute and relative frequency defined

- How are the cumulative frequencies defined

The measurement is the process by which one of the k scale elements x1, x2, …, xk is assigned to each statistical unit SU of the selective statistical set SSS (with the extent of n statistical units). The measurement results are the findings that the scale element xi (i = 1, 2, …, k) was measured ni times. The summation of all the values ni (i = 1, 2, …, k), the so-called absolute frequencies, must be equal to the extent n of the selective statistical set SSS.


The potential results of measurement (i = 1, 2, …, k) can be evaluated by the size of the probability with which they appear in the course of measurement. The statistical definition of probability is based on n independently carried out measurements (the number of measurements n corresponds to the extent of the selective statistical set SSS) and on the discovered absolute frequencies ni of the potential measurement results. The statistical probability p(xi) of the result xi is then given by the so-called relative frequency ni / n. The summation of all the relative frequencies must be equal to 1.

The cumulative frequencies can also be classified as results of the measurement. The cumulative frequency Σ (ni / n) is the probability that the measurement result will be less than or equal to the result xi. Evidently the cumulative frequencies can be detected only within quantitative metric or absolute metric scales. The cumulative frequencies are, for example, of great significance in the construction of financial or economic balance sheets.
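A short Python sketch, using the data of table Tab.1, shows how the relative and cumulative frequencies arise from the absolute frequencies:

```python
xs = [1, 2, 3, 4, 5]        # scale elements x_i
ns = [9, 15, 20, 4, 2]      # absolute frequencies n_i
n = sum(ns)                 # extent of the SSS, n = 50

relative = [ni / n for ni in ns]   # statistical probabilities p(x_i) = n_i / n
cumulative = []
running_total = 0.0
for r in relative:                 # cumulative frequencies: running sums of n_i / n
    running_total += r
    cumulative.append(running_total)

print(relative)    # [0.18, 0.3, 0.4, 0.08, 0.04]
print(cumulative)  # [0.18, 0.48, 0.88, 0.96, 1.0]
```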

Within the assigned example it is possible to discover through table Tab.1 that the work was carried out with a scale created by 5 elements x1 = 1, x2 = 2, …, x5 = 5 (see the first column of the table), whose absolute frequencies were n1 = 9, n2 = 15, n3 = 20, n4 = 4, n5 = 2 (see the second column of the table). The relative frequencies ni / n are then presented in the third column of the table, the cumulative frequencies in the fourth column. Of the selective statistical set of fifty enterprises (n = 50), 9 enterprises had the maximum export ability (the probability of this degree is 0.18), 15 enterprises had the degree one lower than the highest (probability 0.30), 20 enterprises had the middle export ability (probability 0.40), 4 enterprises had a degree of export ability lower than the middle degree (probability 0.08) and 2 enterprises had the lowest degree of export ability (probability 0.04).

Within the assigned example the cumulative frequency, e.g. of the result x3 = 3, is given by the probability 0.88. This probability, that the degree 1, 2 or 3 will be determined within the investigation of the export ability degree, can be determined by the summation of probabilities p(1) + p(2) + p(3) = 0.18 + 0.30 + 0.40 = 0.88. So the probability of detection of at most the middle degree is significantly high.

In the case of a quantitative metric scale or an absolute metric scale the measurement can be considered a projection of the set of statistical units (e.g. within the selective statistical set) into the set of real numbers.


The measurement methods depend on the expert field in which the investigated selective statistical set SSS was defined. They will be different, e.g., in the investigation of a collective random phenomenon in sociology (various questionnaire forms of measurement) and in the investigation of a collective random phenomenon in economy (various ways of measuring export ability before and after the application of economic optimization of an enterprise). The measurement method shall comply with the conditions of validity (whether what is to be measured is indeed measured), reliability (reproducibility of measurements) and objectivity (whether various evaluators will measure the statistical unit in the same way).

The measurement results of the investigated selective statistical set SSS are given by the information on the statistical sign values, i.e. by the information on the absolute frequencies and the relative frequencies of the individual scale elements and by the information on the cumulative frequencies.


1.4. Elementary Statistical Processing

Goals:

- Goals of investigation of descriptive statistics

- Empirical picture of selective statistical set

Acquired concepts and knowledge pieces:

Frequencies tables

Empirical distribution

Graphical expression

Plotting function – Graphical expression of empirical distribution

Frequency polygon

Empirical parameters

General moments, e.g. average-means (arithmetic mean)

Central moments, e.g. variance-standard deviation (determinative deviation)

Standardized moments, e.g. obliqueness (skewness), pointedness (kurtosis)


Check questions:

- What are the main goals of the elementary statistical processing

- How can the measurement results be arranged in a suitable way

- How can the measurement results be graphically expressed in a suitable way

- How can the parameters of the measurement results be expressed in a suitable way

- What is the empirical distribution of frequencies

- How can the empirical distribution of a one-dimensional statistical set be expressed graphically

- What is the frequency polygon

- What is the significance of the graphical expression of the empirical distribution

- How can the empirical parameters be divided according to the described feature of the investigated statistical set

- How can the empirical parameters be divided according to the way of calculation

- How are the general, central and standardized moments defined

- What is the most important parameter of location, variability, skewness and kurtosis, and what is the statistical interpretation of these parameters

- How is the "excess" quantity defined and what is its significance


It is necessary to arrange the measurement results, to express them graphically and to express them by suitable empirical parameters. These assignments can be fulfilled using the elementary statistical processing. The empirical picture of the investigated selective statistical set SSS is the result of the elementary statistical processing. The elementary statistical processing also completes the group of major statistical methods that can be called descriptive statistics. The partial assignments "arrangement", "graphical expression" and "expression by parameters" are represented in three basic results of the elementary statistical processing – the "table", the "empirical distributions (preferably in the shape of a polygon)" and the "empirical parameters".

1.4.1. Table

The table represents a form of arrangement of the measurement results. The description of the table can be followed in table Tab.1 of the assigned illustrative example.

The table contains eight columns. The first four columns are necessary partly for the display of the measurement results (fulfilment of the task "arrangement") and partly for the representation of the empirical distributions (fulfilment of the task "graphical expression"). The remaining four columns have a helping significance and can be used for an easy and quick calculation of the empirical parameters (fulfilment of the task "expression by parameters").

The first four columns contain:

1. column marked xi – scale elements

2. column marked ni – absolute frequencies of scale elements

3. column marked ni / n – relative frequencies of scale elements

4. column marked Σ (ni / n) – cumulative frequencies

The following four columns contain the products needed for the calculation of empirical

parameters:

5. column contains the products xi·ni

6. column contains the products xi²·ni

7. column contains the products xi³·ni

8. column contains the products xi⁴·ni


The table is closed by the summations of the data in the individual columns. In the first four columns these summations have a checking significance; in the other four columns they are needed for the calculation of the empirical parameters.

1.4.2. Empirical Distributions of Frequencies

The empirical distributions of frequencies can be divided into two basic types. The first type assigns the corresponding absolute frequencies ni or relative frequencies ni / n to the scale elements xi. The second type assigns the corresponding cumulative frequencies Σ(ni / n) to the scale elements xi.

The graphical expression of the empirical distribution of a one-dimensional statistical set is connected with the use of the coordinate system in the plane. In this coordinate system the scale elements xi are always applied to the horizontal axis, the corresponding frequencies to the vertical axis. The graphical expression of these functional dependences is given by the set of points whose first coordinate is always the scale element xi and whose second coordinate is the corresponding frequency. By connecting neighbouring points of this set with line segments it is possible to obtain the broken line which is called a "polygon". It is possible to distinguish the "polygon of absolute frequencies", the "polygon of relative frequencies" and the "polygon of cumulative frequencies".
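A minimal plotting sketch in Python, assuming the matplotlib library is available, constructs the absolute frequencies polygon of the assigned example:

```python
import matplotlib.pyplot as plt  # assumed to be available

xs = [1, 2, 3, 4, 5]    # scale elements x_i on the horizontal axis
ns = [9, 15, 20, 4, 2]  # absolute frequencies n_i on the vertical axis

# the polygon: points (x_i, n_i) connected by line segments
plt.plot(xs, ns, marker="o")
plt.xlabel("scale elements x_i")
plt.ylabel("absolute frequencies n_i")
plt.title("Absolute frequencies polygon")
plt.show()
```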

In addition to the graphical expression of empirical distributions by a polygon, a range of auxiliary graphical representations is used. Their "advantage" is a deviation from the mathematically exact apparatus and a certain quick orientation. Their shortcoming is the impossibility of continuing with the deeper apparatus of mathematical statistics, above all from the point of view of the investigation of dependencies in multi-dimensional statistical sets. The bar charts, the bar graphs, the pie charts, etcetera, belong to these auxiliary graphical representations. Generally, it is possible to recommend resorting solely to the exact graphical expression.

The significance of the graphical expression of the empirical distribution is substantial. The graphical expression enables the immediate investigation of which theoretical distribution (in terms of probability theory) is close to the empirical distribution obtained as a result of descriptive statistics. The next significance consists in the immediate evaluation of the parameters of location, variability, skewness and kurtosis of the empirical distribution and, by this way, also of the investigated statistical set.

Within the assigned example it is possible to practise, e.g., the construction of the polygons of the absolute and the cumulative frequencies. In figure Fig.2 the absolute frequencies polygon is represented, in figure Fig.3 the cumulative frequencies polygon.

Fig.2 Absolute frequencies polygon Fig.3 Cumulative frequencies polygon

1.4.3. Empirical Parameters

The empirical parameters briefly and simply express the nature of the investigated statistical set. The empirical parameters are mostly related to a selective statistical set; that is why they often bear the name "selective parameters". As selective parameters they themselves have a statistical-probability character and for this reason they behave as a special group of "statistical signs". This view will not be developed in the following explanation, but it is necessary to draw attention to it, especially from the point of view of a deeper study of statistics and probability theory.

The empirical parameters can be classified according to the feature of the investigated

statistical set (investigated statistical sign):

parameters of location

parameters of variability

parameters of obliqueness (skewness)

parameters of pointedness (kurtosis)



The second classification is the classification of the empirical parameters according to the way of their calculation:

- moment parameters (they work as a function of all values of statistical sign)

- quantile parameters (they represent only certain values of statistical sign)

The quantile parameters are closely related to the moment parameters but they are constructed in a different way. The empirical quantile is always a certain value of the statistical sign (which is expressed by a quantitative metric or absolute metric scale). That value divides the number of smaller and greater values of the statistical sign in a certain ratio. E.g., the quantile dividing the values of the statistical sign into two identical parts (i.e. the fifty-percent quantile) is called the "median". The quantile parameters will not be investigated in more detail.

The moment parameters are divided into general moments, central moments and standardized moments. The location moment (arithmetic mean) can be accurately characterized using the general moment of the 1st order, the variability moment (empirical variance) can be accurately characterized using the central moment of the 2nd order, and the obliqueness (skewness) and pointedness (kurtosis) can be accurately characterized using the standardized moments of the 3rd and 4th order.

As the standardized moments can be calculated using the central moments and the central moments using the general moments, the following procedure will be selected in the next explanation (within this procedure the investigated statistical sign will be marked by the letter x; the marks of the statistical sign values xi, of the absolute frequencies ni and of the selective statistical set extent n do not change):

- Presentation of the common relations for general and central moments

- Expression of the needful central moments using general moments

- Expression of the needful standardized moments using central moments

a) The common relations for general and central moments

General moment of the r-th order: Or(x) = (1/n)·Σ ni·(xi)^r

General moment of the 1st order: O1(x) = x̄ (arithmetic mean)

Central moment of the r-th order: Cr(x) = (1/n)·Σ ni·(xi – x̄)^r

Central moment of the 2nd order: C2(x) = Sx² (empirical variance)

Determinative (standard) deviation: Sx = √C2(x)

b) The expression of the needful central moments using general moments

C2(x) = O2(x) – O1(x)²

C3(x) = O3(x) – 3·O2(x)·O1(x) + 2·O1(x)³

C4(x) = O4(x) – 4·O3(x)·O1(x) + 6·O2(x)·O1(x)² – 3·O1(x)⁴

c) The expression of the needful standardized moments using central moments

N3(x) = C3(x) / (C2(x)·√C2(x))

N4(x) = C4(x) / C2(x)²

The procedure for the calculation of the general, central and standardized moments was realized in the steps a), b) and c). Since all the needful moment parameters can be determined using this procedure, it is now possible to describe the parameters of location, variability, obliqueness (skewness) and pointedness (kurtosis).

The location parameter is determined by the general moment of the 1st order O1(x) and bears the name "arithmetic mean". The position of the empirical distribution of frequencies is its location on the horizontal axis of the coordinate system.

The variability parameter is determined by the central moment of the 2nd order C2(x) and bears the name "empirical variance" (the square root of the variance then bears the name "standard deviation"). The determinative (standard) deviation shows what information value is given to the arithmetic mean. If the determinative (standard) deviation is large, the information value of the arithmetic mean is small, and vice versa.


The obliqueness parameter (skewness) is dominantly determined using the standardized moment of the 3rd order N3(x), which then bears the name "coefficient of skewness". If the skewness coefficient is positive, then the scale elements lying to the left of the arithmetic mean have greater frequencies (a positively skewed distribution of frequencies – a greater concentration of the lower scale elements, of the smaller values of the statistical sign), and vice versa.

The pointedness parameter (kurtosis) is dominantly determined using the standardized moment of the 4th order N4(x), which then bears the name "coefficient of kurtosis". A greater value of the kurtosis coefficient corresponds to a more pointed distribution of frequencies for a given variance. The quantity "excess", defined by the relation Ex = N4(x) – 3, is used as well. The excess compares the kurtosis of the empirical distribution with the kurtosis of the known standardized normal distribution. If the excess is positive, the empirical distribution is more pointed than this distribution.

1.4.4. Illustration of Calculation of Empirical Parameters

In the assigned example the calculation of the empirical parameters of location, variability, skewness and kurtosis will now be carried out. First the general moments of the 1st to 4th order will be calculated using the 5th up to the 8th column of table Tab.1:

O1(x) = 2.50

O2(x) = 7.26

O3(x) = 23.50

O4(x) = 82.86

The next part of the procedure will consist in the calculation of the central moments of the 2nd up to the 4th order:

C2(x) = 1.031 (standard deviation Sx = 1.015)

C3(x) = 0.300

C4(x) = 2.922

The final part of the procedure of the empirical parameters calculation will be aimed at the determination of the standardized moments of the 3rd and 4th order and of the excess:

N3(x) = C3(x) / (C2(x)·√C2(x)) = 0.28

N4(x) = C4(x) / C2(x)² = 2.75

Ex = N4(x) – 3 = – 0.25
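The whole chain a), b), c) can be reproduced by a short Python sketch over the data of table Tab.1. Note that with the strict 1/n definition the sketch gives C2(x) = 1.01; the value 1.031 above corresponds to the correction factor n/(n – 1) that is often applied to the selective variance:

```python
xs = [1, 2, 3, 4, 5]
ns = [9, 15, 20, 4, 2]
n = sum(ns)

def O(r):
    # general moment of the r-th order, O_r = (1/n) * sum(n_i * x_i**r)
    return sum(ni * xi**r for xi, ni in zip(xs, ns)) / n

O1, O2, O3, O4 = O(1), O(2), O(3), O(4)    # 2.50, 7.26, 23.50, 82.86
C2 = O2 - O1**2                             # 1.01 (times n/(n-1) it is 1.031)
C3 = O3 - 3*O2*O1 + 2*O1**3                 # 0.30
C4 = O4 - 4*O3*O1 + 6*O2*O1**2 - 3*O1**4    # 2.9225
N3 = C3 / (C2 * C2**0.5)                    # coefficient of skewness
N4 = C4 / C2**2                             # coefficient of kurtosis
Ex = N4 - 3                                 # excess
print(O1, C2, C3, C4, round(N3, 2), round(N4, 2), round(Ex, 2))
```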

The location parameter (arithmetic mean) O1(x) points to the placement of the empirical distribution of frequencies on the horizontal axis – the arithmetic mean of export ability is 2.5 (a value lower than the middle degree of export ability).

The determinative (standard) deviation, expressed by the square root of C2(x), gives an indication of the information value of the arithmetic mean. This indication can be quantified in the following way – roughly 70% of the enterprises are situated in the range from export ability degree 1.5 to export ability degree 3.5 (the applicability of this information depends on whether the empirical distribution can be substituted by the theoretical normal distribution).

The positive skewness coefficient N3(x) points to the greater concentration of the lower scale elements, of the lower degrees of export ability. Figure Fig.2 confirms this finding – the slight asymmetry to the left of the arithmetic mean.

The relatively high value of the kurtosis coefficient and also the value of the excess indicate comparability with the kurtosis of the standardized normal distribution. This additionally supports the conclusion that the arithmetic mean has a good information value.


Part 2. The Main Methods of Mathematical Statistics, Probability Distribution

2.1. Assignment of Theoretical Distribution to Empirical Distribution

Goals:

Probability investigation of selective statistical set: Choice of acceptable theoretical distribution

Probability picture of selective statistical set: Testing non-parametric hypotheses

Acquired concepts and knowledge pieces:

Theoretical distribution, partial survey in alphabetical order:

Bernoulli, Beta, Binomial, Chi-square, Discrete Uniform, Erlang, Exponential, F, Gamma,

Geometric, Lognormal, Negative binomial, Normal, Poisson, Student´s, Triangular, Uniform,

Weibull

Testing nonparametric hypotheses

Test of zero hypothesis H0

Accepting or rejecting of the zero hypothesis H0

Level of statistical significance α, e.g. at α = 0.05


Check questions:

Why is it advantageous to substitute an empirical distribution by a theoretical distribution

Describe the division of statistical sign values extent into suitable number of intervals

What is the interval division of frequencies, what is the condition for creation of frequency

interval division in the case of testing non-parametric hypotheses

What is the random attempt and random variable

How are the random variables divided

How do the values of discrete and continuous random variable differ

How is the theoretical distribution (the distribution of random variable) defined

How are the theoretical distributions divided

What is the form of discrete theoretical distribution description

What is the form of continuous theoretical distribution description

What is the difference between probability function and probability density

What is the significance of binomial distribution

What is the significance of normal distribution

What is the formulation of central limit theorem

Present the form of distribution function of binomial and normal distribution

Present the form of probability function (probability density) of binomial distribution (normal

distribution)

How many of the theoretical parameters do binomial and normal distribution depend on,

describe the theoretical parameters

What is standardized normal distribution

What are the common relations for mean value and variance for discrete and continuous

theoretical distribution

What is the relation between empirical and theoretical parameters

What does the law of large numbers express

What is the apparatus of non-parametric testing

What do the zero and alternative hypothesis suppose in the case of non-parametric testing

What is the essence of testing non-parametric hypotheses

What are the theoretical distributions used for testing non-parametric hypotheses

What is the relation of theoretical distribution and statistical criterion

What is the relation of experimental value and critical theoretical value of statistical criterion

What is the critical domain of statistical criterion

Describe the testing technique of chi-square

What is the level of statistical significance

What is the error of the first type (Type I error)


The assignment of a theoretical distribution to an empirical distribution is the content of the statistical method which bears the name "testing non-parametric hypotheses". Within this statistical method it will be needful to deal with the interval division of frequencies, the concept of "theoretical distribution", the apparatus of non-parametric testing and the assigned example. The significance of testing non-parametric hypotheses consists above all in the fact that it is always more advantageous to substitute an empirical distribution by a theoretical distribution – a simple mathematical apparatus is connected with the theoretical distribution, and such an apparatus enables one to detect information inaccessible in another way.

2.1.1. Interval Division of Frequencies

In some cases (e.g. for the needs of non-parametric testing) it is useful to divide the extent of the statistical sign values or the extent of the metric scale elements into a certain number of intervals. The corresponding values of the statistical sign or the corresponding elements of the metric scale are then included in each of the created intervals. Usually it is recommended to construct 5 – 20 intervals of the same length; empirical rules (based on the extent n of the selective statistical set SSS) also exist for a rough delimitation of the interval number k (e.g. the Sturges rule k = 1 + 3.3·log10 n). It is needful to dedicate relevant attention also to the determination of the interval boundaries.

Within the assigned example it will be determined whether the empirical distribution in figure Fig.2 can be substituted by the normal distribution. This intention leads to the determination of the number of intervals and the interval boundaries as presented in table Tab.2.

xi    interval       ni    ni/n    Σ ni/n    ni·xi    ni·xi²    ni·xi³    ni·xi⁴
1     (-∞; 1.5]       9    0.18     0.18        9         9         9         9
2     (1.5; 2.5]     15    0.30     0.48       30        60       120       240
3     (2.5; 3.5]     20    0.40     0.88       60       180       540      1620
4     (3.5; 4.5]      4    0.08     0.96       16        64       256      1024
5     (4.5; ∞)        2    0.04     1.00       10        50       250      1250
Σ                    50    1.00               125       363      1175      4143

Table Tab.2: Interval division of frequencies
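A tiny Python sketch of the interval assignment of table Tab.2 (the helper interval_index is illustrative):

```python
bounds = [1.5, 2.5, 3.5, 4.5]   # interval boundaries of table Tab.2

def interval_index(x: float) -> int:
    # index 1..5 of the interval (-inf; 1.5], (1.5; 2.5], ..., (4.5; inf) containing x
    return sum(x > b for b in bounds) + 1

print([interval_index(x) for x in (1.0, 2.0, 3.0, 4.0, 5.0)])  # [1, 2, 3, 4, 5]
```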


2.1.2. Theoretical Distribution

The concept of "theoretical distribution" is one of the fundamental concepts of probability theory. The collective random phenomenon CRP, which is the subject of both statistics and probability theory, is investigated in probability theory by means of the concepts "random attempt" and "random variable". The random attempt is a realization of activities or processes whose result is not possible to anticipate with certainty. The random variable RV is then a variable whose value is definitely determined by the result of the random attempt.

"The value of random variable VRV" is a concept with a strong theoretical dimension. A certain analogy of this concept, whose origin can be discovered in probability theory, is the concept "the value of statistical sign VSS", whose origin can be discovered in descriptive statistics. The concept "value of statistical sign VSS" thus has, on the contrary, a strong empirical dimension.

The random variables RV can be divided into discrete ones (the values of a discrete random variable "do not follow" one another and will be marked xi) and continuous ones (the values of a continuous random variable will be marked x and these values continuously "follow" one another – it is not possible to find the nearest neighbouring value). To the values of a random variable it is possible to assign the probabilities with which they occur in the course of the random attempt. These probabilities can be defined in the classical way (the number of random attempt results favourable to the given value divided by the number of all random attempt results) or, e.g., according to Kolmogorov (by the application of measure theory).

The rule that assigns a probability to every value of the random variable or to every interval of values is called the law of random variable distribution, or shortly the random variable distribution, or also the "theoretical distribution". From the point of view of the cooperation between probability theory and statistics the concept "theoretical distribution" is adequate to the statistical concept "empirical distribution of frequencies". According to the essence of the random variable RV the theoretical distributions can be divided into discrete and continuous ones.

The distribution function F is an important form of the theoretical distribution description. In the case of a discrete random variable the distribution function F gives the probability that the random variable RV obtains values smaller than or equal to a just chosen value xi, and this cumulative probability is expressed by a summation of partial probabilities. In the case of a continuous random variable the distribution function F gives the probability that the random variable RV obtains values smaller than or equal to a just selected value x, but this cumulative probability is expressed, instead of by a summation, by an integral whose lower limit is –∞ and whose upper limit corresponds to the selected value x. From the point of view of the cooperation between probability theory and statistics the concept "distribution function" is adequate to the statistical concept "empirical distribution of cumulative frequencies".

a) Binomial distribution – an example of a discrete theoretical distribution

The characteristic of the collective random phenomenon

n independent random attempts are carried out; the probability of the monitored random phenomenon is the same in all the random attempts and is equal to p. The probability is sought that this phenomenon occurs 0, 1, …, n times. According to this definition the values x0, x1, …, xn of the relevant random variable are given by the numbers 0, 1, …, n.

Theoretical distribution, distribution function

In the discrete case the theoretical distribution is called the probability function. For the described random phenomenon the probability function is a rule which assigns the probabilities Pi for i = 0, 1, …, n to the values xi of the random variable. The form of the probability function is

Pi = C(n, i) · p^i · (1 – p)^(n–i),

where C(n, i) = n! / (i!·(n – i)!) is the binomial coefficient. The relevant form of the distribution function (cumulative probability) F(xj) = Fj is given by the summation

Fj = Σ Pi,

where the adding index i obtains the values from 0 to j.

The binomial distribution depends on two theoretical parameters – p, n.

The significance of the binomial distribution

A typical example of independent random attempts is a random selection of elements from a set where the selected element is returned back, the so-called selection with return. It can be shown that, in the case where the extent of the selective set is small in comparison with the extent of the basic set, the difference between the selection with return and the selection without return is insignificant. The binomial distribution can therefore serve as a suitable criterion of whether the selective statistical set was created on the basis of random selection.

b) Normal distribution – an example of a continuous theoretical distribution

The characteristic of the collective random phenomenon

A continuous random variable whose values are x ∈ (–∞, ∞) can have a normal distribution. The graph of the function which assigns the probabilities to these values of the random variable is given by the well-known Gauss curve in the shape of a "bell". What is sought is the probability assigned to a unit interval of the continuous random variable values, in the sense that this interval will contain the value x.

Theoretical distribution, distribution function

In the continuous case the theoretical distribution is called the probability density (the random variable values continuously "follow" one another; it is needful to assign the probabilities to unit intervals of values, because the nearest neighbouring value to a value x is not possible to find). The form of the probability density is

φ(x) = [1 / (σ·√(2π))] · e^(–(x – μ)² / (2σ²)).

The relevant form of the distribution function (cumulative probability) F(t) is given by the integral

F(t) = ∫ φ(x) dx,

where the lower integral limit is –∞ and the upper limit is the value t.

The normal distribution depends on two theoretical parameters – μ, σ. This dependence is usually recorded N(μ, σ). The theoretical parameter μ is a theoretical analogy of the general moment of the 1st order O1(x) and so it is a theoretical analogy of the empirical arithmetic mean x̄. The theoretical parameter σ is a theoretical analogy of the square root of the central moment of the 2nd order C2(x) and so it is a theoretical analogy of the empirical standard (determinative) deviation Sx.

The normal distribution can be normalized to the values of the theoretical parameters μ = 0, σ = 1 by means of the standardized random variable

u = (x – μ) / σ.

This dependence is usually recorded N(0, 1), and the so-called "standardized normal distribution" (see figure Fig.4) is then marked by this record. The probability density of the standardized normal distribution will be marked φ(u) due to the introduced variable u; the distribution function is often called the Laplace function and marked by the record F(u). Very detailed statistical tables are elaborated for the values of the Laplace function. The graphical representation of the probability density of the standardized normal distribution is in figure Fig.4.

Fig.4 Graphical representation of the probability density φ(u) of the standardized normal distribution (the values u are applied on the horizontal axis, the values of the probability density φ(u) on the vertical axis)

The significance of the normal distribution

The significance of the normal distribution is described by the central limit theorem. Its essence is the statement that a random variable created as the summation of a large number of mutually independent random variables has, under very general conditions, approximately the normal distribution. The exact formulation is presented by the Ljapunov theorem, a component of which is the condition enabling one to work with a normal distribution for a sufficiently big extent of the selective set. The special forms of that theorem – the Lindeberg-Lévy theorem and the Moivre-Laplace theorem (which shows that for a sufficiently big number of independent attempts the binomial distribution converges to the normal distribution) – are useful, too.
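The Moivre-Laplace statement can be checked numerically; a sketch comparing the binomial probabilities of Bi(n, p) with the density of N(np, √(np(1 – p))) (the chosen n = 50, p = 0.3 are illustrative):

```python
from math import comb, exp, pi, sqrt

n, p = 50, 0.3
mu, sigma = n * p, sqrt(n * p * (1 - p))   # parameters of the approximating N(mu, sigma)

for i in range(10, 21):
    P_i = comb(n, i) * p**i * (1 - p)**(n - i)                         # binomial probability
    phi = exp(-(i - mu)**2 / (2 * sigma**2)) / (sigma * sqrt(2 * pi))  # normal density at i
    print(i, round(P_i, 4), round(phi, 4))   # the two columns nearly coincide
```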


c) Parameters of theoretical distributions

For the discrete theoretical distributions the Pj will mark the distribution function and

the xi the values of random variable RV. For the continuous theoretical distributions the

x will mark the probability density and the x the values of continuous random variable.

The theoretical general, central and standardized moments Oj, Cj and Nj are important

parameters of all the theoretical distributions. The theoretical general, central and

standardized moments Oj, Cj and Nj can be expressed through the formulas:

$$O_j = \int_a^b x^j \rho(x)\,\mathrm{d}x, \qquad O_j = \sum_{i=1}^{n} x_i^j P_i,$$

$$C_j = \int_a^b (x - O_1)^j \rho(x)\,\mathrm{d}x, \qquad C_j = \sum_{i=1}^{n} (x_i - O_1)^j P_i,$$

$$N_j = \int_a^b \left(\frac{x - O_1}{\sqrt{C_2}}\right)^{j} \rho(x)\,\mathrm{d}x, \qquad N_j = \sum_{i=1}^{n} \left(\frac{x_i - O_1}{\sqrt{C_2}}\right)^{j} P_i.$$

Often the names and marks “mean value (expected value) E” and “dispersion (variance) D” are used, too. The expected value E is a location parameter which measures the level of the random variable RV. The dispersion D is a variability parameter which measures the “diffusion” of the random variable values. The expected value E is equal to the theoretical general moment of 1st order O1, the dispersion D to the theoretical central moment of 2nd order C2.

The theoretical general moment of 1st order O1 is the location parameter, the theoretical central moment of 2nd order C2 is the variability parameter, the theoretical standardized moment of 3rd order N3 is the skewness parameter and the theoretical standardized moment of 4th order N4 is the kurtosis parameter.
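The discrete forms of these moments are straightforward to evaluate. A minimal sketch in Python (the function name and the fair-die example are illustrative, not taken from the text):

```python
import numpy as np

def discrete_moments(x, P):
    """Theoretical moments of a discrete distribution with values x_i
    and probability function P_i."""
    O1 = np.sum(x * P)                              # expected value E
    C2 = np.sum((x - O1) ** 2 * P)                  # dispersion (variance) D
    N3 = np.sum(((x - O1) / np.sqrt(C2)) ** 3 * P)  # skewness parameter
    N4 = np.sum(((x - O1) / np.sqrt(C2)) ** 4 * P)  # kurtosis parameter
    return O1, C2, N3, N4

# Example: a fair die, x_i = 1..6 with P_i = 1/6
x, P = np.arange(1, 7), np.full(6, 1 / 6)
print(discrete_moments(x, P))  # (3.5, 2.9167, 0.0, 1.7314)
```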

The relation between empirical and theoretical parameters is described by the law of large numbers. Subject to compliance with certain conditions, it can be expected that the empirical distribution and the related empirical parameters will approximate the theoretical distribution and the theoretical parameters associated with it – and the more so, the greater the extent of the selective statistical set (the larger the number of realized random attempts). The approach of the empirical parameters to the theoretical parameters has not the character of mathematical convergence but of probability convergence.


2.1.3. Description of Selected Probability (Theoretical) Distributions

a) Discrete theoretical distribution – Alternative distribution

The alternative distribution is the discrete theoretical distribution A(p) with one theoretical parameter p of a zero-one random variable RV (the random variable has the values xi = i = 0, 1).

The probability and distribution functions Pi and Fi, as analogies of the empirical relative and cumulative frequency, and the theoretical moments Oj, Cj have for the alternative distribution the forms

$$P_i = p^i (1-p)^{1-i}, \ i = 0,1, \qquad F_i = \sum_{j \le i} P_j,$$

theoretical moments $O_1$, $C_2$, $C_3$, $C_4$:

$$O_1 = E = p, \qquad C_2 = D = p(1-p), \qquad C_3 = p(1-p)(1-2p),$$
$$C_4 = p(1-p)(1-3p+3p^2).$$

b) Discrete theoretical distribution – Binomial distribution

The binomial distribution is the discrete theoretical distribution Bi(n, p) with two theoretical parameters n, p of a random variable RV (the random variable has the values xi = i = 0, 1, …, n).

The probability and distribution functions Pi and Fi, as analogies of the empirical relative and cumulative frequency, and the theoretical moments Oj, Cj have for the binomial distribution the forms

$$P_i = \binom{n}{i}\, p^i (1-p)^{n-i}, \ i = 0,1,\dots,n, \qquad F_i = \sum_{j \le i} P_j,$$

theoretical moments $O_1$, $C_2$, $C_3$, $C_4$:

$$O_1 = E = np, \qquad C_2 = D = np(1-p), \qquad C_3 = np(1-p)(1-2p),$$
$$C_4 = 3n^2p^2(1-p)^2 + np(1-p)(1-6p+6p^2).$$
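The stated forms of O1 and C2 can be confirmed against a library implementation. A minimal sketch (Python with SciPy; the values of n and p are chosen only for illustration):

```python
from scipy.stats import binom

n, p = 10, 0.3
E, D = binom.stats(n, p, moments="mv")
print(E, n * p)            # both 3.0  (O1 = np)
print(D, n * p * (1 - p))  # both 2.1  (C2 = np(1-p))
```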

c) Discrete theoretical distribution – Poisson distribution

The Poisson distribution is the discrete theoretical distribution Po(λ) with one theoretical parameter λ of a random variable RV (the random variable has the values xi = i = 0, 1, …).


The probability and distribution functions Pi and Fi, as analogies of the empirical relative and cumulative frequency, and the theoretical moments Oj, Cj have for the Poisson distribution the forms

$$P_i = \frac{\lambda^i}{i!}\, e^{-\lambda}, \ i = 0,1,\dots, \qquad F_i = \sum_{j \le i} P_j,$$

theoretical moments $O_1$, $C_2$, $C_3$, $C_4$:

$$O_1 = E = \lambda, \qquad C_2 = D = \lambda, \qquad C_3 = \lambda, \qquad C_4 = \lambda + 3\lambda^2.$$

The binomial distribution Bi(n, p) may be approximated by the Poisson distribution Po(λ), λ = np, for n > 30 and p → 0 (p ≤ 0.1 is sufficient).
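The quality of this approximation is easy to inspect. A minimal sketch (Python with SciPy; the parameters satisfy the stated conditions and are otherwise arbitrary):

```python
from scipy.stats import binom, poisson

n, p = 100, 0.02  # n > 30 and p <= 0.1
lam = n * p
for i in range(6):
    # the two probability functions agree to roughly two decimal places
    print(i, binom.pmf(i, n, p), poisson.pmf(i, lam))
```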

d) Discrete theoretical distribution – Geometric distribution

The geometric distribution is the discrete theoretical distribution Ge(p) with one theoretical parameter p of a random variable RV (the random variable has the values xi = i = 0, 1, …).

The probabilities Pi decrease geometrically with increasing values i. Independent attempts are carried out and the probability of the observed phenomenon occurring (i.e. the probability of success) is the same for all the attempts and equal to p. The probability of the first success occurring only in attempt i + 1 is given by the probability function Pi.

The probability and distribution functions Pi and Fi, as analogies of the empirical relative and cumulative frequency, and the theoretical moments Oj, Cj have for the geometric distribution Ge(p) the forms

$$P_i = p(1-p)^i, \ i = 0,1,2,\dots, \qquad F_i = \sum_{j \le i} P_j,$$

theoretical moments $O_1$, $C_2$:

$$O_1 = E = \frac{1-p}{p}, \qquad C_2 = D = \frac{1-p}{p^2}.$$

e) Discrete theoretical distribution – Hypergeometric distribution

The hypergeometric distribution is the discrete theoretical distribution HGe(N, M, n) with three theoretical parameters N, M, n of a random variable RV (the random variable has the values xi = i = max(0, M – N + n), …, min(M, n)).


The hypergeometric distribution, unlike the previous discrete distributions, is based on dependent repeated random attempts (e.g. one works with N elements, M of which carry the observed sign, and n elements are selected from these N elements without replacement).

The probability function Pi, as an analogy of the empirical relative frequency, and the theoretical moments Oj, Cj have for the hypergeometric distribution HGe(N, M, n) the forms

$$P_i = \frac{\dbinom{M}{i}\dbinom{N-M}{n-i}}{\dbinom{N}{n}}, \ i = \max(0, M-N+n),\dots,\min(M,n),$$

theoretical moments $O_1$, $C_2$:

$$O_1 = E = n\,\frac{M}{N}, \qquad C_2 = D = n\,\frac{M}{N}\left(1-\frac{M}{N}\right)\frac{N-n}{N-1}.$$

The forms of the theoretical parameters O1, C2 for N sufficiently large compared with n correspond to the forms of the theoretical parameters O1, C2 of the binomial distribution Bi(n, p) with the probability

$$p = \frac{M}{N}.$$

The hypergeometric distribution HGe(N, M, n) may be approximated by the binomial distribution Bi(n, p) for

$$\frac{n}{N} \le 0.05, \qquad p = \frac{M}{N}.$$

The hypergeometric distribution HGe(N, M, n) may be approximated by the Poisson distribution Po(λ) for small fractions n/N, M/N and for n large:

$$\frac{n}{N} \le 0.05, \qquad \frac{M}{N} \le 0.1, \qquad n \ge 31, \qquad \lambda = n\,\frac{M}{N}.$$
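Both approximations can be inspected numerically. A minimal sketch of the binomial case (Python with SciPy; the parameters are illustrative and satisfy n/N ≤ 0.05):

```python
from scipy.stats import hypergeom, binom

N, M, n = 2000, 200, 50  # n/N = 0.025 <= 0.05
p = M / N
for i in (0, 3, 5, 8):
    # note: SciPy's argument order is (k, total N, successes M, draws n)
    print(i, hypergeom.pmf(i, N, M, n), binom.pmf(i, n, p))
```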


f) Discrete theoretical distribution – Multinomial distribution

The s-multiple multinomial distribution is the discrete theoretical distribution s-Multi(n, p1, …, ps−1) with s theoretical parameters n, p1, …, ps−1 (the random variables RV1, …, RVs have values marked i1, …, is = 0, 1, …, n).

The distribution s-Multi(n, p1, …, ps−1) is connected with incompatible random phenomena A1, …, As which can occur in n independent attempts with the probabilities p1, …, ps (the sum of the probabilities is equal to 1, so the s-multiple multinomial distribution has only s−1 independent probabilities). The numbers of occurrences of the random phenomenon Ai in n attempts have the binomial distributions Bi(n, pi).

The probability function, as an analogy of the empirical relative frequency, has for the multinomial distribution s-Multi(n, p1, …, ps−1) the form

$$P_{i_1,\dots,i_{s-1}} = \frac{n!}{i_1!\cdots i_{s-1}!\,\bigl(n-\sum_{j=1}^{s-1} i_j\bigr)!}\; p_1^{i_1}\cdots p_{s-1}^{i_{s-1}} \Bigl(1-\sum_{j=1}^{s-1} p_j\Bigr)^{n-\sum_{j=1}^{s-1} i_j}.$$

The individual binomial distributions Bi(n, pi) have the theoretical parameters

$$O_1^{(i)} = E_i = np_i, \qquad C_2^{(i)} = D_i = np_i(1-p_i).$$

The distribution of one random variable (s = 2) is the binomial distribution Bi(n, pi). The distribution of two random variables (s = 3) is the trinomial distribution Tr(n, pi, pj). The probability function Pij for the trinomial distribution Tr(n, pi, pj) has the form

$$P_{ij} = \frac{n!}{i!\,j!\,(n-i-j)!}\; p_i^{\,i}\, p_j^{\,j}\, (1-p_i-p_j)^{n-i-j}.$$

The multinomial distribution for n → ∞, pi → 0 (i = 1, …, s) may be approximated, for λi = npi (λi being finite numbers), by the multi-dimensional Poisson distribution Po(λi).
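The relation between the multinomial distribution and its binomial marginals can be checked by simulation. A minimal sketch for the trinomial case (Python with NumPy; the parameters are chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 20, [0.5, 0.3, 0.2]
draws = rng.multinomial(n, p, size=100_000)
# Each marginal count follows the binomial distribution Bi(n, p_i):
print(draws.mean(axis=0))  # close to n*p_i         = [10, 6, 4]
print(draws.var(axis=0))   # close to n*p_i*(1-p_i) = [5, 4.2, 3.2]
```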

g) Continuous theoretical distribution – Normal and standardized normal distribution

The normal distribution is the continuous theoretical distribution N(μ, σ) of a random variable RV (the random variable acquires the values $x \in (-\infty; \infty)$). The normal distribution has two theoretical parameters μ, σ. The standardized normal distribution is the continuous theoretical distribution N(0,1) of a random variable U (the random variable acquires the values $u \in (-\infty; \infty)$). For the standardized normal distribution the parameters μ, σ are standardized to the values 0, 1 by the substitution of the random variable RV by the new random variable U:

$$u = \frac{x - E(x)}{\sqrt{D(x)}} = \frac{x-\mu}{\sigma}, \qquad E(u) = 0, \quad D(u) = 1.$$

The probability densities ρ(x), ρ(u) (corresponding with the relative frequency), the distribution functions F(x), F(u) (corresponding with the cumulative frequency) and the standardizing conditions (corresponding with the empirical standardizing condition) have the forms

$$\rho(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \qquad \rho(u) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{u^2}{2}},$$

$$F(t) = \int_{-\infty}^{t} \rho(x)\,\mathrm{d}x, \qquad F(t) = \int_{-\infty}^{t} \rho(u)\,\mathrm{d}u,$$

$$\int_{-\infty}^{\infty} \rho(x)\,\mathrm{d}x = 1, \qquad \int_{-\infty}^{\infty} \rho(u)\,\mathrm{d}u = 1.$$

The theoretical parameters O1, C2 can be calculated in the form

$$O_1 = E(x) = \int_{-\infty}^{\infty} x\,\rho(x)\,\mathrm{d}x = \mu, \qquad O_1 = E(u) = \int_{-\infty}^{\infty} u\,\rho(u)\,\mathrm{d}u = 0,$$

$$C_2 = D(x) = \int_{-\infty}^{\infty} (x-O_1)^2 \rho(x)\,\mathrm{d}x = \sigma^2, \qquad C_2 = D(u) = \int_{-\infty}^{\infty} u^2 \rho(u)\,\mathrm{d}u = 1.$$
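These integrals can be verified by numerical integration. A minimal sketch (Python with SciPy; μ and σ are chosen to match the running example):

```python
import numpy as np
from scipy.integrate import quad

mu, sigma = 2.5, 1.0
rho = lambda x: np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

print(quad(rho, -np.inf, np.inf)[0])                               # standardizing condition: 1
print(quad(lambda x: x * rho(x), -np.inf, np.inf)[0])              # O1 = mu = 2.5
print(quad(lambda x: (x - mu) ** 2 * rho(x), -np.inf, np.inf)[0])  # C2 = sigma^2 = 1
```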

h) Continuous theoretical distribution – Lognormal distribution

The lognormal distribution is the continuous theoretical distribution LN(μ, σ) of a random variable RV which is an increasing function of a random variable Y in the form x = e^y (the random variable Y has the normal distribution N(μ, σ)). The lognormal distribution has two theoretical parameters μ, σ.

The probability density ρ(x) (corresponding with the relative frequency) has the form

$$\rho(x) = \frac{1}{x\sigma\sqrt{2\pi}} \exp\left(-\frac{(\ln x - \mu)^2}{2\sigma^2}\right), \qquad 0 < x < \infty.$$


The theoretical parameters Ok, O1, C2 can be calculated in the form

$$O_k = E(x^k) = \int_0^{\infty} x^k \rho(x)\,\mathrm{d}x = \exp\left(k\mu + \frac{k^2\sigma^2}{2}\right),$$

$$O_1 = \exp\left(\mu + \frac{\sigma^2}{2}\right), \qquad O_2 = \exp\left(2\mu + 2\sigma^2\right),$$

$$C_2 = D(x) = O_2 - O_1^2 = \exp\left(2\mu + \sigma^2\right)\left(e^{\sigma^2} - 1\right).$$
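The forms of O1 and C2 can be confirmed against a library implementation. A minimal sketch (Python with SciPy; μ and σ are illustrative – note that SciPy parametrizes LN(μ, σ) via the shape s = σ and scale = exp(μ)):

```python
import numpy as np
from scipy.stats import lognorm

mu, sigma = 0.2, 0.5
E, D = lognorm.stats(sigma, scale=np.exp(mu), moments="mv")
print(E, np.exp(mu + sigma ** 2 / 2))                             # O1
print(D, np.exp(2 * mu + sigma ** 2) * (np.exp(sigma ** 2) - 1))  # C2
```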

2.1.4. Apparatus of Non-parametric Testing

The use of the apparatus of the null hypothesis H0 and the alternative hypothesis Ha is the foundation of testing non-parametric (but also parametric) hypotheses.

In the case of non-parametric hypotheses the null hypothesis supposes that the empirical distribution can be substituted by the intended theoretical distribution (regarding the substitution by the normal distribution this would be a test of normality). The alternative hypothesis then supposes that this presumption isn't correct. A comparison between theoretical and empirical absolute frequencies is the essence of testing non-parametric hypotheses. The empirical absolute frequencies are calculated by means of the elementary statistical processing in relation to the empirical distribution. The theoretical absolute frequencies are then calculated through the probability function or probability density in relation to the intended theoretical distribution.

The parametric hypotheses relate to a comparison of empirical and theoretical parameters, and the null and alternative hypotheses play a similar role there.

For the verification of non-parametric and parametric hypotheses a special group of theoretical distributions was developed – these distributions are not intended to replace the empirical distributions but work as statistical criteria. The normal distribution is the only exception – in its standardized shape it may play the role of a statistical criterion, in its non-standardized shape it may substitute the empirical distributions.

The standardized normal distribution (u-test), Student's distribution (t-test), Pearson's χ² distribution (χ²-test, chi-square) and the Fisher–Snedecor distribution (F-test) belong among the most frequent statistical criteria. Detailed statistical tables are elaborated for all the presented statistical criteria.


For the verification of the hypotheses H0 and Ha a suitable statistical criterion must be selected. The χ²-test is used most frequently for the verification of a non-parametric hypothesis. The creation of an interval division of frequencies is a condition for its application, and each partial interval must be connected with an absolute frequency equal to at least 5. If this condition isn't fulfilled, the neighbouring partial intervals must be merged.

After the selection of the statistical criterion (e.g. the χ²-test) it is necessary to determine the experimental value of this criterion (e.g. χ²exp) and the critical theoretical value (e.g. χ²theor). The so-called critical domain W of the relevant statistical criterion is recorded by means of the critical theoretical value.

If the experimental value of the selected criterion is an element of the critical domain W, it is necessary to accept the alternative hypothesis Ha – i.e. the empirical distribution cannot be substituted by the intended theoretical distribution. In the contrary case (the experimental value is not an element of the critical domain W) the null hypothesis H0 can be accepted – i.e. the empirical distribution can be substituted by the intended theoretical distribution.

The determination of the significance level α is an essential element of testing non-parametric and parametric hypotheses. The significance level states the probability of an erroneous rejection of the tested hypothesis (i.e. the probability of an error of type I). The most frequent significance levels are the values α = 0.05 and α = 0.01. E.g., the significance level 0.05 enables, for a positive test of normality (i.e. the hypothesis H0 on the possibility to substitute the empirical distribution by the normal distribution is accepted and the hypothesis Ha is refused), the following conclusion – if the selective statistical set SSS were selected 100 times from the basic statistical set BSS, in 95 cases it would be shown that the empirical distribution can be substituted by the normal distribution.

The proper procedure of non-parametric testing can be exercised by means of the

solution of the assigned example.


2.1.5. Illustration of Non-parametric Testing

Within the assigned example it is now possible to monitor the procedure for the verification of the null hypothesis H0 that the empirical distribution in figure Fig.2 can be substituted by a normal distribution (see Fig.4).

In the course of testing the χ²-test will be applied; in the course of its application the letter k will refer to the number of intervals of the frequency interval division, the letter r to the number of theoretical parameters of the normal distribution (i.e. r = 2). The formula ν = k − r − 1 expresses the number of degrees of freedom which, together with a selected level of significance, enables one to determine the critical theoretical value χ²theor = χ²k−r−1(α) using statistical tables. The significance level selected is α = 0.05.

The letter F marks the Laplace function depending on the standardized random variable ui (ui is the standardized value reflecting the upper limit xi of the relevant interval of the frequency interval division). The probabilities pi (expressed by the integral calculus) are given by the differences of the Laplace function values, the products n·pi then express the theoretical absolute frequencies, and the values ni denote the empirical absolute frequencies (see tables Tab.1 and Tab.2).

The calculation of the standardized values ui uses the relation

$$u_i = \frac{x_i - O_1}{S_x}$$

(general moment of 1st order O1 = 2.5, standard deviation Sx = 1; the upper limits xi are x1 = 1.5, x2 = 2.5, x3 = 3.5, x4 = 4.5, x5 = ∞) and leads to the values

u1 = −1, u2 = 0, u3 = 1, u4 = 2, u5 = ∞.


The calculation of the probabilities pi using the integral calculus and the Laplace function values F(u):

$$p_1 = \int_{-\infty}^{1.5} \rho(x)\,\mathrm{d}x = \int_{-\infty}^{-1} \rho(u)\,\mathrm{d}u = F(-1),$$
$$p_2 = \int_{1.5}^{2.5} \rho(x)\,\mathrm{d}x = \int_{-1}^{0} \rho(u)\,\mathrm{d}u = F(0) - F(-1),$$
$$p_3 = \int_{2.5}^{3.5} \rho(x)\,\mathrm{d}x = \int_{0}^{1} \rho(u)\,\mathrm{d}u = F(1) - F(0),$$
$$p_4 = \int_{3.5}^{4.5} \rho(x)\,\mathrm{d}x = \int_{1}^{2} \rho(u)\,\mathrm{d}u = F(2) - F(1),$$
$$p_5 = \int_{4.5}^{\infty} \rho(x)\,\mathrm{d}x = \int_{2}^{\infty} \rho(u)\,\mathrm{d}u = 1 - F(2).$$

The application of the χ²-test in the form

$$\chi^2_{\mathrm{exp}} = \sum_{i=1}^{k} \frac{(n_i - np_i)^2}{np_i}, \qquad p_i = F(u_i) - F(u_{i-1}),$$

already enables one to realize the necessary partial calculations (see table Tab.3).

xi   Interval      ni   ui   F(ui)    pi       npi
1    (−∞; 1.5⟩     9    −1   0.1625   0.1625   8.125
2    (1.5; 2.5⟩    15   0    0.5000   0.3375   16.875
3    (2.5; 3.5⟩    20   1    0.8175   0.3175   15.875
4    (3.5; 4.5⟩    4    2    0.9754   0.1579   7.895
5    (4.5; ∞)      2    ∞    1.0000   0.0246   1.230

Table Tab.3: The calculations of ui, F(ui), pi and n·pi

The table Tab.4 reacts to the requirement that at least 5 measurement results must be in each interval in the course of the normality test. Neighbouring intervals are merged to reach 5 or more measurement results. At the same time the additional calculations, enabling one to establish the experimental value of the statistical criterion, are carried out in this table.


xi      ni   npi    (ni − npi)²/npi
1       9    8.1    0.100
2       15   16.9   0.214
3       20   15.9   1.057
4 + 5   6    9.1    1.056
                    Σ = 2.427 = χ²exp

Table Tab.4: The adjustment of the number of intervals, the calculation of χ²exp

In the final part of the non-parametric testing it was necessary to determine the critical theoretical value χ²theor = χ²k−r−1(α) = χ²4−2−1(0.05) = χ²1(0.05) = 3.84, using the calculated number of degrees of freedom ν = k − r − 1 = 4 − 2 − 1 = 1 and the statistical tables with the significance level α = 0.05. By means of the critical theoretical value it was already possible to record the right-sided critical domain W = ⟨3.84; ∞).

For the experimental value of the statistical criterion χ²exp = 2.427 (i.e. χ²exp ∉ W) it is possible to state the conclusive verdict of the non-parametric hypothesis test:

The experimental value χ²exp doesn't belong to the critical domain, the null hypothesis H0 can be accepted and the empirical distribution (empirical polygon) can be substituted by the theoretical normal distribution at the significance level α = 0.05. This conclusion is of considerable importance – in the course of deducing additional information it is possible to use not only the simple mathematical apparatus connected with the normal distribution, but in the course of parametric hypotheses testing it is also possible to apply the testing techniques which are bound to the normal distribution.
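The whole procedure can be reproduced programmatically. A minimal sketch (Python with NumPy/SciPy) that follows the steps of the illustration; because exact values of the Laplace function are used instead of the rounded tabulated ones, χ²exp comes out somewhat smaller than 2.427, but the conclusion (accepting H0) is unchanged:

```python
import numpy as np
from scipy.stats import norm, chi2

O1, Sx, n = 2.5, 1.0, 50
upper = np.array([1.5, 2.5, 3.5, 4.5, np.inf])  # upper limits of the intervals
ni = np.array([9, 15, 20, 4, 2])                # empirical absolute frequencies

u = (upper - O1) / Sx                  # standardized upper limits
p = np.diff(norm.cdf(u), prepend=0.0)  # p_i = F(u_i) - F(u_{i-1})
npi = n * p                            # theoretical absolute frequencies

# merge the last two intervals so that every theoretical frequency is >= 5
ni_m = np.array([ni[0], ni[1], ni[2], ni[3] + ni[4]])
npi_m = np.array([npi[0], npi[1], npi[2], npi[3] + npi[4]])

chi2_exp = np.sum((ni_m - npi_m) ** 2 / npi_m)
chi2_theor = chi2.ppf(0.95, df=len(ni_m) - 2 - 1)  # k - r - 1 = 1 degree of freedom
print(chi2_exp, chi2_theor)  # H0 is accepted when chi2_exp < chi2_theor = 3.84
```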


2.2. Comparison of Empirical and Theoretical Parameters – Estimations of

Theoretical Parameters, Testing Parametric Hypotheses

Goals:

- Probability investigation of the selective statistical set: Quantification of theoretical parameters, Comparison between theoretical and empirical parameters

- Probability picture of the selective statistical set: Point & interval estimation – e.g. confidence interval, Testing parametric hypotheses

Acquired concepts and knowledge pieces:

Point estimation

Interval estimation

Confidence interval

Confidence interval for mean value

Confidence interval for standard deviation

Testing parametric hypotheses

Computed u-statistic

Computed t-statistic

Computed F-statistic

Computed chi-square statistic


Check questions:

Why do the estimations of theoretical parameters come before the comparison of theoretical and empirical parameters?

What conditions must a good point estimation fulfil?

What are the methods of point estimation?

What are the advantages of interval estimations?

Describe the way confidence intervals are constructed

Which statistical criteria are used for the construction of confidence intervals?

What is the apparatus of parametric testing?

What is the difference between one-selective and two-selective testing of parametric hypotheses?

What is the procedure for parametric testing?

Present a survey of the most common statistical criteria


Another of the main methods of statistics, “Comparison of empirical and theoretical parameters”, builds on “Assignment of theoretical distribution to empirical distribution”. The theoretical distribution is identified and assigned by non-parametric testing, but it still contains the unknown values of the theoretical parameters. Before a comparison between empirical and theoretical parameters can be implemented, the theoretical parameters must be estimated. Then it is possible to approach the comparison between empirical and theoretical parameters with the application of the parametric testing apparatus.

2.2.1. Basics of Estimation Theory

It is necessary to estimate the theoretical parameters (e.g. the mean value E = μ and the dispersion D = σ² for the normal distribution). There are two kinds of estimations of the theoretical parameters: point and interval ones.

Good point estimations should fulfil the conditions of consistency, unbiasedness, efficiency and sufficiency. These conditions are only recalled here; more detailed information can be obtained in the literature dealing with estimation theory. The point estimation can be carried out by the moment method or by the method of maximum likelihood. The moment method is based on the fact that the empirical parameters are considered the estimations of the corresponding theoretical parameters. The method of maximum likelihood is essentially more mathematically demanding. The disadvantage of point estimations consists above all in the ignorance of the exactness with which the estimation was done.

The interval estimations remove the problem of ignorance of the estimation exactness. They construct an interval providing a reasonable “guarantee” (sufficiently high probability) that the real value of the theoretical parameter is located inside the interval. This probability again relates to the selection of the significance level, and the constructed interval then bears the name “100(1–α)% confidence interval” (e.g., for α = 0.05 it will be the 95% confidence interval).

a) The construction of the confidence interval for the mean value μ of the normal distribution using the u-test (the condition of construction – the variance σ² is assigned in advance) is based on the statistical criterion

$$u = \frac{O_1 - \mu}{\sigma}\,\sqrt{n}.$$

The critical values are −u(α/2), u(α/2); the conditions for the construction of the confidence interval can be recorded in the form of the inequalities −u(α/2) < u < u(α/2). After solving the presented inequalities it is possible to obtain the confidence interval (the interval estimation of μ):

$$\left(O_1 - \frac{u(\alpha/2)\,\sigma}{\sqrt{n}};\; O_1 + \frac{u(\alpha/2)\,\sigma}{\sqrt{n}}\right).$$

b) The construction of the confidence interval for the mean value μ of the normal distribution using the t-test (the condition of construction – the variance σ² isn't assigned in advance) is based on the statistical criterion

$$t = \frac{O_1 - \mu}{S_x}\,\sqrt{n}.$$

The critical values are −t_{n−1}(α/2), t_{n−1}(α/2); the conditions for the construction of the confidence interval can be recorded in the form of the inequalities −t_{n−1}(α/2) < t < t_{n−1}(α/2). After solving the presented inequalities it is possible to obtain the confidence interval (the interval estimation of μ):

$$\left(O_1 - \frac{t_{n-1}(\alpha/2)\,S_x}{\sqrt{n}};\; O_1 + \frac{t_{n-1}(\alpha/2)\,S_x}{\sqrt{n}}\right).$$

c) The construction of the confidence interval for the variance σ² of the normal distribution using the χ²-test (the condition of construction – the empirical variance Sx² must be calculated) is based on the statistical criterion

$$\chi^2 = \frac{(n-1)\,S_x^2}{\sigma^2}.$$

The critical values are χ²_{n−1}(1−α/2), χ²_{n−1}(α/2); the conditions for the construction of the confidence interval can be recorded in the form of the inequalities χ²_{n−1}(1−α/2) < χ² < χ²_{n−1}(α/2). After solving the presented inequalities it is possible to obtain the confidence interval (the interval estimation of σ²):

$$\left(\frac{(n-1)\,S_x^2}{\chi^2_{n-1}(\alpha/2)};\; \frac{(n-1)\,S_x^2}{\chi^2_{n-1}(1-\alpha/2)}\right).$$

2.2.2. Illustration of Confidence Intervals Construction

a) Within the assigned example the construction of the confidence interval will be carried out for the mean value μ using the t-test.

The confidence interval is given by the form

$$\left(O_1 - \frac{t_{n-1}(\alpha/2)\,S_x}{\sqrt{n}};\; O_1 + \frac{t_{n-1}(\alpha/2)\,S_x}{\sqrt{n}}\right).$$

For the significance level α = 0.05, the extent n = 50 of the selective statistical set SSS, the standard deviation Sx = 1 (approximate value) and the arithmetic mean O1 = 2.5, the critical values are, according to the statistical tables, equal to ±t49(0.025) = ±1.96 (for a number of degrees of freedom n−1 > 33 it is possible to apply the statistical table for the u-test).

After substitution the 95% confidence interval is

$$\mu \in \langle 2.221;\; 2.779\rangle.$$

b) Within the assigned example the construction of the confidence interval will be carried out for the variance σ² using the χ²-test.

The confidence interval is given by the form

$$\left(\frac{(n-1)\,S_x^2}{\chi^2_{n-1}(\alpha/2)};\; \frac{(n-1)\,S_x^2}{\chi^2_{n-1}(1-\alpha/2)}\right).$$

For the significance level α = 0.05 and the extent n = 50 of the selective statistical set SSS, with the standard deviation Sx = 1 (approximate value), the critical values are, according to the statistical tables,

$$\chi^2_{49}(1-\alpha/2) = \chi^2_{49}(0.975) = 30.60, \qquad \chi^2_{49}(\alpha/2) = \chi^2_{49}(0.025) = 70.22.$$

After substitution the 95% confidence intervals are

$$\sigma^2 \in \langle 0.705;\; 1.617\rangle, \qquad \sigma \in \langle 0.839;\; 1.272\rangle.$$
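Both constructions can be reproduced programmatically. A minimal sketch (Python with SciPy); the exact quantiles differ slightly from the tabulated values used above, so the printed bounds are close to, but not identical with, the intervals of the text:

```python
import numpy as np
from scipy.stats import t, chi2

n, O1, Sx, alpha = 50, 2.5, 1.0, 0.05

# a) interval for the mean value (t-test); the text uses the u-test value 1.96
tq = t.ppf(1 - alpha / 2, df=n - 1)
half = tq * Sx / np.sqrt(n)
print(O1 - half, O1 + half)  # about (2.22, 2.78)

# b) interval for the variance (chi-square test); Sx^2 = 1.01 behind Sx ~ 1
Sx2 = 1.01
lo = (n - 1) * Sx2 / chi2.ppf(1 - alpha / 2, df=n - 1)
hi = (n - 1) * Sx2 / chi2.ppf(alpha / 2, df=n - 1)
print(lo, hi, np.sqrt(lo), np.sqrt(hi))  # close to (0.705; 1.617) and (0.839; 1.272)
```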


2.2.3. Basics of Parametric Hypotheses Testing

The parametric hypotheses testing is again based on the apparatus of the null hypothesis H0 and the alternative hypothesis Ha. This apparatus is accompanied by the usual apparatus of the critical domain W. Due to the central limit theorem it is a natural assumption that the normal distribution, as the most suitable theoretical distribution, may be assigned to the empirical distribution.

The parametric testing can be divided into one-selective testing of hypotheses about the mean value or the variance (the one-selective u-test and t-test are used for the mean value and the one-selective χ²-test for the variance) and into two-selective testing of hypotheses about the equality of mean values or of variances (the two-selective u-test and t-test are used for the equality of mean values and the two-selective F-test for the equality of variances).

In the case of one-selective testing the hypotheses H0 and Ha can be written in the form

H0: μ = μ0 or H0: σ = σ0,  Ha: μ ≠ μ0 or Ha: σ ≠ σ0.

The one-selective parametric testing is based on the comparison between an empirical parameter μ or σ (these symbols mark the results of the elementary statistical processing of the selective statistical set SSS, by means of which the relevant theoretical parameters μ, σ of the corresponding normal distribution were estimated) and some external theoretical data μ0, σ0, the origin of which can be various (study of literature, research reports, commercial indicators and the like). The common denominator of these external data is that they probably characterize a certain significant basic statistical set BSS. The one-selective parametric testing, from the point of view of mathematical statistics, then answers the question whether the investigated selective statistical set SSS could have been chosen from the described significant basic statistical set BSS. In the case of verification of the hypothesis H0 it is possible to look at the results of the investigation of the selective statistical set SSS in the context created by the basic statistical set BSS. In the case of acceptance of the hypothesis Ha it is not possible to rely on this context.

In the case of two-selective testing the hypotheses H0 and Ha can be written in the form

H0: μ1 = μ2 or H0: σ1 = σ2,  Ha: μ1 ≠ μ2 or Ha: σ1 ≠ σ2.


The two-selective parametric testing is based on the comparison between an empirical parameter μ1 or σ1 (these symbols mark the results of the elementary statistical processing of the selective statistical set SSS1, by means of which the relevant theoretical parameters μ1, σ1 of the corresponding normal distribution were estimated) and some external theoretical data μ0, σ0, the origin of which can usually be found in the investigation results of another selective statistical set SSS2. The two-selective parametric testing, from the point of view of mathematical statistics, then answers the question whether both selective statistical sets SSS1 and SSS2 have investigated an analogous problem and whether these sets can co-operate. In the case of confirmation of the hypothesis H0 it is possible to consider the selective sets SSS1 and SSS2 to be chosen from the same basic statistical set BSS, and usually the endeavour to identify the set BSS is worthwhile. In the case of acceptance of the hypothesis Ha it is necessary, from the point of view of mathematical statistics, to articulate doubts as to the compatibility of the sets SSS1 and SSS2.

The procedure for parametric testing is similar to the procedure for non-parametric testing. First, it is necessary to formulate the null and the alternative hypothesis and to select the significance level α. Then it is necessary to select a suitable statistical criterion (u-test, t-test, χ²-test, F-test), to find its critical value and to record the corresponding critical domain W. Finally it is necessary to calculate the empirical value of the statistical criterion and to determine whether or not it is an element of the critical domain W. If the empirical value is an element of the domain W, the alternative hypothesis Ha must be accepted, in the opposite case the null hypothesis H0.

Survey of some one-selective statistical criteria (n – the extent of the set SSS):

a) One-selective u-test (testing the hypothesis about the mean value with known variance σ²):

$$u_{\mathrm{exp}} = \frac{\mu - \mu_0}{\sigma}\,\sqrt{n}, \qquad W = (-\infty;\, -u(\alpha/2)\rangle \cup \langle u(\alpha/2);\, \infty).$$

b) One-selective t-test (testing the hypothesis about the mean value with unknown variance σ²):

$$t_{\mathrm{exp}} = \frac{\mu - \mu_0}{S_x}\,\sqrt{n}, \qquad W = (-\infty;\, -t_{n-1}(\alpha/2)\rangle \cup \langle t_{n-1}(\alpha/2);\, \infty).$$


c) One-selective χ²-test (testing the hypothesis about the variance with unknown parameters μ, σ²):

$$\chi^2_{\mathrm{exp}} = \frac{(n-1)\,S_x^2}{\sigma_0^2}, \qquad W = \langle 0;\, \chi^2_{n-1}(1-\alpha/2)\rangle \cup \langle \chi^2_{n-1}(\alpha/2);\, \infty).$$

Survey of some two-selective statistical criteria:

a) Two-selective u-test (testing the hypothesis about the equality of mean values with known variances σ1², σ2²; n1, n2 are the extents of the selective statistical sets SSS1, SSS2):

$$u_{\mathrm{exp}} = \frac{\mu_1 - \mu_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}, \qquad W = (-\infty;\, -u(\alpha/2)\rangle \cup \langle u(\alpha/2);\, \infty).$$

b) Two-selective t-test (testing the hypothesis about the equality of mean values with unknown variances σ1², σ2²; n1, n2 are the extents of the selective statistical sets SSS1, SSS2, and Sx1, Sx2 are the empirical standard deviations of the sets SSS1, SSS2):

$$t_{\mathrm{exp}} = \frac{\mu_1 - \mu_2}{\sqrt{(n_1-1)S_{x1}^2 + (n_2-1)S_{x2}^2}}\;\sqrt{\frac{n_1 n_2 (n_1+n_2-2)}{n_1+n_2}},$$

$$W = (-\infty;\, -t_{n_1+n_2-2}(\alpha/2)\rangle \cup \langle t_{n_1+n_2-2}(\alpha/2);\, \infty).$$

c) Two-selective F-test (testing the hypothesis about the equality of variances with unknown parameters μ1, μ2, σ1², σ2²; n1, n2 are the extents of the selective statistical sets SSS1, SSS2, and Sx1, Sx2 are the empirical standard deviations of the sets SSS1, SSS2):

$$F_{\mathrm{exp}} = \frac{S_{x1}^2}{S_{x2}^2}, \qquad W = \langle 0;\, F_{n_1-1,\,n_2-1}(1-\alpha/2)\rangle \cup \langle F_{n_1-1,\,n_2-1}(\alpha/2);\, \infty).$$


The remark: The larger of the squares of the standard deviations Sx1², Sx2² is usually put into the numerator of the statistical criterion

$$F_{\mathrm{exp}} = \frac{S_{x1}^2}{S_{x2}^2}.$$

From this point of view the right-sided critical domain W = ⟨F_{n1−1, n2−1}(α); ∞), with the value α instead of the value α/2, is usually used.

d) The paired t-test (the transformation of the two-selective t-test into a one-selective t-test on the basis of the null hypothesis H0: μ1 − μ2 = Δ, where most frequently Δ = 0).
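The one-selective criteria are easy to express as small functions. A minimal sketch (Python; the function names are illustrative) using values from the assigned example that follows:

```python
import numpy as np

def t_exp(mean, mu0, Sx, n):
    """One-selective t statistic: t = (mean - mu0) * sqrt(n) / Sx."""
    return (mean - mu0) * np.sqrt(n) / Sx

def chi2_exp(Sx2, sigma0_sq, n):
    """One-selective chi-square statistic: (n - 1) * Sx^2 / sigma0^2."""
    return (n - 1) * Sx2 / sigma0_sq

print(t_exp(2.5, 2.6, 1.005, 50))  # about -0.70 (the text reports |t| = 0.704)
print(chi2_exp(1.01, 1.0, 50))     # 49.49
```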

2.2.4. Illustration of Parametric Testing

a) Assigned example – testing hypotheses about the mean value

Determine whether the investigated selective statistical set SSS (μ = 2.5, n = 50) could be, at the significance level α = 0.05, selected from the basic statistical set BSS which is characterized by the mean value a1) μ0 = 2.6, a2) μ0 = 2.9.

The information about the variance is missing – the one-selective t-test must be used:

$$t_{\mathrm{exp}} = \frac{\mu - \mu_0}{S_x}\,\sqrt{n}, \qquad W = (-\infty;\, -t_{n-1}(\alpha/2)\rangle \cup \langle t_{n-1}(\alpha/2);\, \infty).$$

The formulation of the null and alternative hypothesis: H0: μ = μ0, Ha: μ ≠ μ0

The determination of the critical values and the critical domain:

t49(0.025) = u(0.025) = 1.96, W = (−∞; −1.96⟩ ∪ ⟨1.96; ∞)

The calculation of the experimental value of the statistical criterion for the case a1):

texp = 0.704, texp ∉ W

The result interpretation:

The experimental value texp doesn't belong to the critical domain; at the significance level α = 0.05 it is possible to accept the null hypothesis H0. The investigated selective statistical set could have been selected from the external set BSS. The difference μ − μ0 is statistically


unimportant at the significance level α = 0.05 (it can be noted that the value μ0 is an element of the 95% confidence interval in the case a1)).

The calculation of the experimental value of the statistical criterion for the case a2):

texp = 2.814, texp ∈ W

The result interpretation:

The experimental value texp is an element of the critical domain; at the significance level α = 0.05 the null hypothesis H0 must be refused. The investigated selective statistical set SSS couldn't have been selected from the external set BSS. The difference μ − μ0 is, at the significance level α = 0.05, statistically important (it can be noted that the value μ0 isn't an element of the 95% confidence interval in the case a2)).

b) Assigned example – testing hypothesis about the variance

Determine whether the investigated selective statistical set SSS (μ = 2.5, Sx = σ = 1.005, n = 50) could be, at the significance level α = 0.05, selected from the basic statistical set BSS which is characterized by the standard deviation b1) σ0 = 1, b2) σ0 = 0.5.

The one-selective χ²-test will be used:

$$\chi^2_{\mathrm{exp}} = \frac{(n-1)\,S_x^2}{\sigma_0^2}, \qquad W = \langle 0;\, \chi^2_{n-1}(1-\alpha/2)\rangle \cup \langle \chi^2_{n-1}(\alpha/2);\, \infty).$$

The formulation of the null and alternative hypothesis: H0: σ = σ0, Ha: σ ≠ σ0.

The determination of the critical values and the critical domain:

χ²49(0.975) = 30.60, χ²49(0.025) = 70.22, W = ⟨0; 30.60⟩ ∪ ⟨70.22; ∞).

The calculation of the experimental value of the statistical criterion for the case b1):

χ²exp = 49.49, χ²exp ∉ W


The result interpretation:

The experimental value χ²exp doesn't belong to the critical domain; at the significance level α = 0.05 it is possible to accept the null hypothesis H0. The investigated selective statistical set SSS could have been selected from the external set BSS. The quotient between σ and σ0 is statistically unimportant at the significance level α = 0.05 (it can be noted that the value σ0 is an element of the 95% confidence interval in the case b1)).

The calculation of the experimental value of the statistical criterion for the case b2):

χ²exp = 197.96, χ²exp ∈ W

The result interpretation:

The experimental value χ²exp belongs to the critical domain; at the significance level α = 0.05 it isn't possible to accept the null hypothesis H0. The investigated selective statistical set SSS couldn't have been selected from the external set BSS. The quotient between σ and σ0 is, at the significance level α = 0.05, statistically important (it can be noted that the value σ0 isn't an element of the 95% confidence interval in the case b2)).

c) Assigned example – testing hypotheses about the equality of mean values

An analogous observation of the export ability as within the assigned example (there the selective statistical set SSS1 of n1 = 50 enterprises was investigated with the result μ1 = 2.5) has led to the average export ability c1) μ2 = 2.6, c2) μ2 = 2.9 for n2 = 100 enterprises (the variances were comparable, but the information about the variance size is missing – the two-selective t-test must be used). Determine whether this selective statistical set SSS2 could be, at the significance level α = 0.05, selected from the same basic statistical set BSS as the set SSS1.

The two-selective t-test will be used:

$$t_{\mathrm{exp}} = \frac{\mu_1 - \mu_2}{\sqrt{(n_1-1)S_{x1}^2 + (n_2-1)S_{x2}^2}}\;\sqrt{\frac{n_1 n_2 (n_1+n_2-2)}{n_1+n_2}},$$

$$W = (-\infty;\, -t_{n_1+n_2-2}(\alpha/2)\rangle \cup \langle t_{n_1+n_2-2}(\alpha/2);\, \infty).$$


The formulation of the null and alternative hypothesis: H0: μ1 = μ2, Ha: μ1 ≠ μ2

The determination of the critical values and the critical domain:

t148(0.025) = 1.96, W = (−∞; −1.96⟩ ∪ ⟨1.96; ∞)

The calculation of the experimental value of the statistical criterion for the case c1):

texp = 0.574, texp ∉ W

The result interpretation:

The experimental value texp doesn't belong to the critical domain; it is possible to accept the null hypothesis H0 at the significance level α = 0.05. The investigated selective statistical set SSS1 and the additional selective set SSS2 could have been selected from one and the same external set BSS. The difference between μ1 and μ2 is statistically unimportant at the significance level α = 0.05.

The calculation of the experimental value of the statistical criterion for the case c2):

texp = 2.298, texp ∈ W

The result interpretation:

The experimental value texp belongs to the critical domain; at the significance level α = 0.05 it isn't possible to accept the null hypothesis H0. The investigated selective set SSS1 and the additional selective set SSS2 couldn't have been selected from one and the same external set BSS. The difference between μ1 and μ2 is statistically important at the significance level α = 0.05.

d) Assigned example – testing hypotheses about the equality of variances

An analogous observation of the export ability as within the assigned example (there the selective statistical set SSS1 of n1 = 50 enterprises was investigated with the result Sx1² = σ1² = 1.01) has led, for n2 = 100 enterprises, to results which enabled the calculation of the variance d1) Sx2² = σ2² = 1, d2) Sx2² = σ2² = 1.631. Determine whether this selective statistical set SSS2 could be, at the significance level α = 0.05, selected from the same basic statistical set BSS as the set SSS1.


The two-selective F-test (with the right-sided critical domain W) will be used:

$$F_{\mathrm{exp}} = \frac{S_{x1}^2}{S_{x2}^2}, \qquad W = \langle F_{n_1-1,\,n_2-1}(\alpha);\, \infty) \quad \text{for the case d1)},$$

$$F_{\mathrm{exp}} = \frac{S_{x2}^2}{S_{x1}^2}, \qquad W = \langle F_{n_2-1,\,n_1-1}(\alpha);\, \infty) \quad \text{for the case d2)}.$$

The formulation of the null and the right-sided alternative hypothesis:

H0: σ1 = σ2, i.e. Sx1 = Sx2,  Ha: σ1 > σ2, i.e. Sx1 > Sx2 (the case d1))
H0: σ2 = σ1, i.e. Sx2 = Sx1,  Ha: σ2 > σ1, i.e. Sx2 > Sx1 (the case d2))

The determination of the critical value and the right-sided critical domain:

F49,99(0.05) = 1.545, W = ⟨1.545; ∞)

The calculation of the experimental value of the statistical criterion for the case d1):

Fexp = 1.01, Fexp ∉ W

The result interpretation:

The experimental value Fexp doesn't belong to the critical domain; it is possible to accept the null hypothesis H0 at the significance level α = 0.05. The investigated selective statistical set SSS1 and the additional selective set SSS2 could have been selected from one and the same external set BSS. The difference between Sx1² = 1.01 and Sx2² = 1 is statistically unimportant at the significance level α = 0.05.

The calculation of the experimental value of the statistical criterion for the case d2):

Fexp = 1.615, Fexp ∈ W

The result interpretation:

The experimental value Fexp belongs to the critical domain; at the significance level α = 0.05 it is possible to refuse the null hypothesis H0. The investigated selective set SSS1 and the additional selective set SSS2 couldn't have been selected from one and the same external set BSS. The difference between Sx1² = 1.01 and Sx2² = 1.631 is statistically important at the significance level α = 0.05.
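The two-selective criteria of cases c) and d) can be reproduced the same way. A minimal sketch (Python; in case c) both empirical variances are assumed to be about 1.01, which the text only describes as “comparable”):

```python
import numpy as np

def t_two(m1, m2, s1_sq, s2_sq, n1, n2):
    """Two-selective (pooled) t statistic for the equality of mean values."""
    pooled = np.sqrt(((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2))
    return (m1 - m2) / pooled * np.sqrt(n1 * n2 / (n1 + n2))

def F_two(s_larger_sq, s_smaller_sq):
    """Two-selective F statistic with the larger variance in the numerator."""
    return s_larger_sq / s_smaller_sq

print(t_two(2.5, 2.6, 1.01, 1.01, 50, 100))  # about -0.57 (the text reports |t| = 0.574)
print(F_two(1.631, 1.01))                    # about 1.61 (case d2))
```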


2.3. Measurement of Statistical Dependences – Some Fundaments

of Regression and Correlation Analysis

Goals:

Association investigation: Statistical dependence – causal, non-causal

Association picture of selective statistical set: Regression analysis, Correlation analysis

Acquired concepts and knowledge pieces:

Simple and multiple selective statistical set

Statistical dependence

Simple and multiple regression dependence

Linear and nonlinear regression dependence

Regression analysis

Simple and multiple correlation

Correlation analysis

Pearson's correlation coefficient


Check questions:

What is the difference between a simple and a multiple statistical set?

What is statistical dependence?

What is the difference between simple and multiple regression and correlation analysis?

Wherein do the basic tasks of regression analysis lie?

Wherein do the basic tasks of correlation analysis lie?

What is the method of least squares?

What is the system of normal equations for simple linear and quadratic regression?

What is the difference between Pearson's correlation coefficient and the correlation index?

2.3.1. Delimitation of Problem

Hitherto the simple selective set SSS was investigated, where only one statistical sign was explored for the statistical units of this set. The measurement of statistical dependences is connected with a multiple selective set SSS, where several statistical signs are explored simultaneously for the statistical units.

The statistical dependence between the signs x, s is given by an instruction which assigns exactly one empirical distribution of frequencies of the statistical sign s (the values of the sign s have to show the character of a random variable) to the measured or entered values of the sign x (the values of the sign x, on the contrary, do not have to show the character of a random variable).


The simple (paired) regression dependence is then generally a one-sided dependence of the given random variable s on another variable x (not necessarily random) – the point is an investigation of a two-dimensional selective statistical set SSS. The multi-dimensional (multiple) regression dependence is the dependence of the given random variable s on a larger number of other variables x, y, z, … (not necessarily random) – the point is an investigation of a multiple set SSS.

The concept “correlation dependence” is narrower than “regression dependence”. The simple (paired) correlation can be understood as the mutual dependence of two random variables (two statistical signs x, s) in which a change of the values of one statistical sign (either x or s) is associated with a change of the arithmetic mean deduced from the exploration of the second statistical sign (either s or x). In continuity with the dependence of a larger number of random variables (statistical signs) it would be possible to define the multiple correlation analogously.

The definitions of regression and correlation dependence are different from the definitions of the functions of one or more variables, and so from functional dependences.

The part of mathematical statistics which deals with the study of regression and correlation dependences is called regression and correlation analysis.

The basic tasks of regression analysis consist in the detection of a suitable regression function for the expression of the observed dependence, in the point and interval estimation of the parameters and of the values of the theoretical regression function, and in the verification of the harmony of the regression function with the experimental data. According to the type of the appropriate theoretical regression function it is possible to speak also about the types of regression analysis – e.g. polynomial regression, exponential regression, logarithmic regression, hyperbolic regression and the like. The following explanation will be aimed at the seeking of suitable theoretical regression functions.

The basic tasks of correlation analysis consist in the measurement of the correlation tightness (strength, intensity). The problems of simple linear and non-linear correlation are usually investigated, provided that the changes of the random variables x, s (statistical signs x, s) are correctly expressed by a linear or non-linear regression function. Also the investigation of multiple correlation is based on the dependence description which is given by the regression function. The tasks of correlation analysis can then be transferred to the seeking of correlation coefficients as the basic measures of tightness of the given correlation type. In addition to the correlation coefficients associated with metric scales it is also essential to explore the coefficients of ordinal correlation – these are based on ordinal scales. The following explanation will be aimed only at the use of a simple relation for the linear correlation coefficient.

On the basis of the reduction of the number of investigated statistical signs to two, the problem of regression dependence measurement can be described in a simplified form. A two-dimensional selective statistical set SSS is connected with the exploration of two statistical signs SS-x and SS-s. The metric scale with the elements x1, x2, …, xn is associated with the sign x (the elements of the scale were measured and the results of these measurements are given by the absolute frequencies of the individual elements); the measurement results s1, s2, …, sn are then connected with the sign s (the absolute frequencies measured for the sign x are included in these results). In this way the measurement results are at disposal in the form of n ordered pairs [xi, si].

On the basis of the described simplification it is possible to use the method of least squares in measuring the dependence between the signs SS-x and SS-s (the condition is that the measurement errors of the sign SS-s, whose values show the character of a special random variable, have zero mean value and the same, although unknown, finite variance). Let the theoretical regression function be generally described within the simple regression by an equation y = f(x). The sum of squares can then be expressed by the relation S = Σ(si − yi)², where yi are the values of the function y = f(x) corresponding to the values x = xi. The method of least squares then consists in the seeking of the regression function y = f(x) by means of the minimum value of the sum S.

2.3.2. Simple Linear and Quadratic Regression Analysis

The way of seeking the regression function will be described by means of the graphical delimitation of the problem in the figure Fig.5 “Simple linear regression analysis”. This figure works with n = 5 ordered pairs [xi, si] which characterize the statistical dependence between the statistical signs SS-x and SS-s. The scale elements x1, x2, …, x5, connected with the statistical sign x, are deposited on the horizontal axis. The measurement results s1, s2, …, s5 of the sign s (the absolute frequencies measured for the sign x are already included in these results) are deposited on the vertical axis. The ordered pairs [xi, si] are the coordinates of five points A1[x1, s1], A2[x2, s2], A3[x3, s3], A4[x4, s4], A5[x5, s5]. These 5 points graphically express the dependence between the signs SS-x and SS-s. The goal of simple linear regression analysis is to express this statistical dependence by the straight line whose analytical expression is given by the usual form y = b0 + b1·x for a polynomial function of 1st order.

Fig.5 Simple linear regression analysis

The least squares method is aimed at seeking the minimum value of the expression S = Σ(si − yi)², in which the summation index i acquires the values i = 1, 2, …, 5. For yi the expression yi = b0 + b1·xi is installed, and the minimum of the function S is sought as a function of the two variables b0 and b1, i.e. S = g(b0, b1).


The conditions for seeking the minimum are given by taking the partial derivatives of the function S with respect to both variables and setting them to zero (those interested in the exact seeking of extremes of functions of more variables may consult Sylvester's criterion from the area of mathematical analysis).

The conditions for seeking the minimum of the function S can be recorded in the form

$$\frac{\partial S}{\partial b_0} = 0, \qquad \frac{\partial S}{\partial b_1} = 0.$$

The obtained system of equations is called the system of normal equations for simple linear regression and after carrying out the derivatives it acquires the known form

$$\sum s_i = n b_0 + b_1 \sum x_i,$$
$$\sum s_i x_i = b_0 \sum x_i + b_1 \sum x_i^2.$$

The summation index i generally acquires the values i = 1, 2, …, n. The values of the parameters b0, b1 can be obtained by solving the system of normal equations, and then it is possible to record the straight-line equation y = b0 + b1·x. The predictions of the values si corresponding to the relevant values xi for i > 5 can then be done according to the figure Fig.5 through the obtained regression function. The predictions of the time trends or also of the comparative trends would not be possible without the realization of linear regression analysis.

In an analogous way it is possible to explain the fundaments of simple quadratic regression. In this case the investigated statistical dependence would be expressed by a polynomial function of 2nd order, the graph of which is a parabola. The analytical expression y = f(x) of a parabola is given by the equation y = b0 + b1x + b2x², and the method of least squares leads again to seeking the minimum of the function S = Σ(si − yi)². This function S = h(b0, b1, b2) is a function of three variables; for the discovery of the minimum three partial derivatives are needed and their annulment leads to the system of normal equations

$$\frac{\partial S}{\partial b_0} = 0, \qquad \frac{\partial S}{\partial b_1} = 0, \qquad \frac{\partial S}{\partial b_2} = 0.$$

After carrying out the derivatives, the system of normal equations for simple quadratic regression acquires the form

$$\sum s_i = n b_0 + b_1 \sum x_i + b_2 \sum x_i^2,$$
$$\sum s_i x_i = b_0 \sum x_i + b_1 \sum x_i^2 + b_2 \sum x_i^3,$$
$$\sum s_i x_i^2 = b_0 \sum x_i^2 + b_1 \sum x_i^3 + b_2 \sum x_i^4.$$

The summation index i acquires the values i = 1, 2, …, 5 in the figure Fig.5, in the general case the values i = 1, 2, …, n (in the case of quadratic regression the group of points A1[x1, s1], …, A5[x5, s5] should naturally map the progress of a parabola instead of a straight line). The values of the parameters b0, b1, b2 can be obtained by solving the system of normal equations, and then it is possible to record the parabola equation y = b0 + b1·x + b2·x². The predictions of the values si corresponding to the relevant values xi for i > 5 can then be done according to the figure Fig.5 by means of the obtained regression function. The predictions of the time trends or also of the comparative trends would not be possible without the realization of quadratic regression analysis.

2.3.3. Simple Linear and Quadratic Correlation Analysis

For the delimitation of the problem it is again possible to use the graphical way indicated by means of the figure Fig.5. After the realization of simple linear regression analysis (the result is indicated by the drawn straight line in Fig.5) it is possible to approach the determination of the tightness of the statistical dependence between the statistical signs SS-x and SS-s of the investigated selective statistical set SSS.

The most used measure of the tightness of simple linear correlation is Pearson's correlation coefficient kxs. This coefficient is given by the relation

$$k_{xs} = \frac{S_{xs}}{S_x \cdot S_s},$$

and it acquires values from the interval $k_{xs} \in \langle -1, 1\rangle$ (this conclusion can be easily deduced from the so-called Schwarz inequality). The values close to 1 correspond with the case of positive correlation (the values of both statistical signs SS-x and SS-s increase or decrease at the same time; the figure Fig.5 is connected with this case). The values close to −1 describe the negative correlation (while the values of one statistical sign are increasing, the values of the second sign are decreasing). The values around 0 indicate that the signs don't correlate (no collective trends in the increases or decreases of the signs' values can be expressed). Pearson's correlation coefficient, as an empirical parameter, has the character of a random variable and can be used as a point estimation of the theoretical correlation coefficient.

In the relation for Pearson's correlation coefficient the mixed central moment of 2nd order C2(x,s) = Sxs also occurs, in addition to the usual standard deviations Sx and Ss (i.e. the square roots of the central moments C2(x) and C2(s)) connected with the investigation of the statistical signs SS-x and SS-s. The mixed central moment of 2nd order is defined by the relation (k is the number of scale elements for both statistical signs)

$$S_{xs} = \frac{1}{n}\sum_{i=1}^{k} n_i \left(x_i - O_1^{(x)}\right)\left(s_i - O_1^{(s)}\right),$$

where the summation index i commonly acquires the values i = 1, 2, …, k.
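The definition translates directly into code. A minimal sketch (Python with NumPy; the function name is illustrative) computing kxs from the scale values, the measurements and the absolute frequencies:

```python
import numpy as np

def pearson_weighted(x, s, freq):
    """Pearson correlation coefficient k_xs = S_xs / (S_x * S_s) for scale
    values x_i, measurements s_i and absolute frequencies n_i."""
    n = freq.sum()
    Ox = (freq * x).sum() / n                     # O1 of the sign x
    Os = (freq * s).sum() / n                     # O1 of the sign s
    Sxs = (freq * (x - Ox) * (s - Os)).sum() / n  # mixed central moment
    Sx = np.sqrt((freq * (x - Ox) ** 2).sum() / n)
    Ss = np.sqrt((freq * (s - Os) ** 2).sum() / n)
    return Sxs / (Sx * Ss)
```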

Apart from Pearson's correlation coefficient, other quantities are also used for the measurement of the tightness of simple linear correlation (e.g. the size of the smaller of the angles included by the associated regression straight lines, or the determination coefficient). The “index of correlation” is used for the measurement of simple quadratic correlation (the statistical dependence is expressed by a quadratic regression function). The relation for the correlation index can be used also for the investigation of other simple non-linear correlations – within this relation it is only necessary to install the used regression function instead of the quadratic regression function.

2.3.4. Illustration of Dependence Measurement

a) Simple linear regression

The observation of the economical state within the assigned example (the selective statistical set SSS with the extent n = 50 enterprises was investigated and the statistical sign SS-x “export ability” was explored for the enterprises) was connected with the observation of a second statistical sign SS-s on the basis of an analogous metric scale (the scale element 1 corresponds with the best value; the elementary statistical processing was realized). The determined values xi (the development degrees) and si (the evaluation of a suitable parameter of the economical state) are presented in the table. The goal is to estimate the type of regression dependence of both statistical data, to express it by a suitable regression function and to determine the tightness of the correlation by means of a suitable coefficient.

The sign SS-x: values xi    1     2     3     4     5
The sign SS-s: values si    1.8   2.2   3.8   4.2   4.6

The estimated type of regression dependence:

The simple linear regression expressed by the regression straight line y = b0 + b1·x

The system of normal equations for the linear regression:

$$\sum s_i = n b_0 + b_1 \sum x_i, \qquad \sum s_i x_i = b_0 \sum x_i + b_1 \sum x_i^2$$

The system of normal equations for the concrete case:

5b0 + 15b1 = 16.6
15b0 + 55b1 = 57.4

The discovery of the regression function:

y = 1.48 + 0.64·x

The investigation of trends:

After installing the value xi = 6 of the sign SS-x it is possible to calculate the corresponding value si = 5.32 of the sign SS-s (on the basis of the greater degree of development it is possible to calculate the increased value of the relevant parameter of the economical state)

The calculation of the correlation coefficient:

- The values given by the elementary statistical processing of both statistical signs are equal to Ss = 1.166, O1(s) = 3.02, Sx = 1.015, O1(x) = 2.5

- The calculation of the mixed central moment of 2nd order gives the value Sxs = 0.763


- The installment into the relation for Pearson's coefficient enables one to determine the correlation tightness kxs = Sxs/(Sx·Ss) = 0.645

- The interpretation of the result – a tight positive correlation

b) Simple quadratic regression

The observation of the economical state within the assigned example (the selective statistical set SSS with the extent n = 50 enterprises was investigated and the statistical sign SS-x “export ability” was explored for the enterprises) was connected with the observation of a second statistical sign SS-s. This sign was described by a percentage expression in association with an analogous metric scale. The determined values xi (the development degrees) and si (the percentage evaluation of a suitable parameter of the economical state) are presented in the table. The goal is to estimate the type of regression dependence of both statistical data and to express it by a suitable regression function.

The sign SS-x: values xi    1      2      3     4     5
The sign SS-s: values si    20 %   10 %   6 %   2 %   2 %

The estimated type of regression dependence:

The simple quadratic regression expressed by the regression parabola y = b0 + b1x + b2x²

The system of normal equations for the quadratic regression:

$$\sum s_i = n b_0 + b_1 \sum x_i + b_2 \sum x_i^2,$$
$$\sum s_i x_i = b_0 \sum x_i + b_1 \sum x_i^2 + b_2 \sum x_i^3,$$
$$\sum s_i x_i^2 = b_0 \sum x_i^2 + b_1 \sum x_i^3 + b_2 \sum x_i^4$$

The system of normal equations for the concrete case:

xi   xi²   xi³   xi⁴   si   si·xi   si·xi²
1    1     1     1     20   20      20
2    4     8     16    10   20      40
3    9     27    81    6    18      54
4    16    64    256   2    8       32
5    25    125   625   2    10      50
15   55    225   979   40   76      196


5b0 + 15b1 + 55b2 = 40

15b0 + 55b1 + 225b2 = 76

55b0 + 225b1 + 980b2 = 196

The discovery of the regression function:

- First, the adjustment of the relevant matrices (through the achievement of zero elements under the main diagonal) will be carried out:

$$\left(\begin{array}{ccc|c} 5 & 15 & 55 & 40\\ 15 & 55 & 225 & 76\\ 55 & 225 & 980 & 196 \end{array}\right) \to \left(\begin{array}{ccc|c} 5 & 15 & 55 & 40\\ 0 & 10 & 60 & -44\\ 0 & 60 & 375 & -244 \end{array}\right) \to \left(\begin{array}{ccc|c} 5 & 15 & 55 & 40\\ 0 & 10 & 60 & -44\\ 0 & 0 & 15 & 20 \end{array}\right)$$

- On the basis of the adjusted matrices it is possible to carry out the calculation of the coefficient values b0, b1, b2:

b2 = 1.33, b1 = −12.4, b0 = 30.54

- By the installment into general equation of parabola it is possible to obtain the analytical

expression of regression parabola y = 1.33x2

– 12.4x + 30.54 and after the adjustment to

obtain the form y = 1.33 (x – 4.7)2 + 1.21. From here the coordinates V [4.7; 1.21] of the top

of the parabola are evident

- Now the graph of regression parabola can be already constructed as a result of realized

simple quadratic regression analysis

The investigation of trends:
The corresponding value si ≈ 25.07 % of the sign SS-s can be calculated by substituting the value xi = 0.5 of the sign SS-x (from a very high degree of export ability, a high value of the relevant parameter of the economic state is obtained).

(Graph: the regression parabola plotted over the values xi = 1, 2, 3, 4, 5, with the si axis ranging from 0 to 30 %.)
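The quadratic fit can likewise be verified with a few lines of Python; the sketch below (our own illustration) assembles the same normal equations from the design matrix and reproduces the coefficients and the trend value:

    import numpy as np

    x = np.array([1, 2, 3, 4, 5], dtype=float)    # sign SS-x: export ability degrees
    s = np.array([20, 10, 6, 2, 2], dtype=float)  # sign SS-s: percentage values

    # Design matrix of the parabola y = b0 + b1*x + b2*x^2 (columns: 1, x, x^2)
    V = np.vander(x, 3, increasing=True)

    # Normal equations V^T V b = V^T s: V^T V holds the sums of x, x^2, x^3, x^4
    b0, b1, b2 = np.linalg.solve(V.T @ V, V.T @ s)
    print(f"y = {b2:.2f}x^2 {b1:+.2f}x {b0:+.2f}")  # y = 1.43x^2 -12.97x +31.20

    # Trend: the predicted value of s at x = 0.5
    print(round(b0 + b1 * 0.5 + b2 * 0.25, 2))      # 25.07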


Part 3. Applications

3.1. Description of Statistical and Probability Base of Financial Options

3.1.1. Introduction

The imperative of data mining and the need for cooperation between humans and today's computers are emphasized by D.A. Keim (Keim, 2002):

"The progress made in hardware technology allows today's computer systems to store very large amounts of data. Researchers from the University of Berkeley estimate that every year 1 Exabyte (= 1 Million Terabyte) of data are generated, of which a large portion is available in digital form. This means that in the next three years more data will be generated than in all of human history before."

"If the data is presented textually, the amount of data which can be displayed is in the range of one hundred data items, but this is like a drop in the ocean when dealing with data sets containing millions of data items."

"For data mining to be effective, it is important to include the human in the data exploration process and combine the flexibility, creativity, and general knowledge of the human with the enormous storage capacity and the computational power of today's computers."

Financial derivatives are derivative contracts in which the underlying securities are financial instruments such as stocks, bonds or an interest rate. An important constituent of financial derivatives is formed by financial options, whose statistical and probability base is treated exactly below.

The Black-Scholes model observes the evolution of the option's key underlying variables in continuous time. The Binomial and Trinomial models (the simplest variants of the Multinomial model) observe the evolution of the option's key underlying variables in discrete time.

The statistical and probability base of financial options is connected, above all, with the Black-Scholes model and the Multinomial model. These statistical and probability applications will be described by means of the data mining approach.


3.1.2. Financial Options

(quoted according to www.economywatch.com)

Financial options are those derivative contracts in which the underlying assets are financial instruments such as stocks, bonds or an interest rate. The options on financial instruments provide a buyer with the right to either buy or sell the underlying financial instruments at a specified price on a specified future date. Although the buyer gets the right to buy or sell the underlying instruments, there is no obligation to exercise this option. However, the seller of the contract is under an obligation to buy or sell the underlying instruments if the option is exercised.

Two types of financial options exist, namely call options and put options. Under a call option, the buyer of the contract gets the right to buy the financial instrument at the specified price at a future date, whereas a put option gives the buyer the right to sell the same at the specified price at the specified future date. The price that is paid by the buyer to the seller for this level of flexibility is called the premium (the fair price). The prescribed future price is called the strike price.

The theoretical calculation of the premium is connected namely with both the Black-Scholes model (a continuous statistical model based on the normal distribution) and the Binomial or Trinomial model (discrete statistical models based on the binomial or trinomial distribution).

Financial options are either traded on an organized stock exchange or over-the-counter. The exchange-traded options are known as standardized options. The options exchange is responsible for this standardization, which is done by specifying the quantity of the underlying financial instrument, its price and the future date of expiration. The details of these specifications may vary from exchange to exchange; however, the broad outlines are similar.

Financial options are used either to hedge against risks, by buying contracts that will pay out if something with negative financial consequences happens, or to magnify profits while limiting the downside risk.

Financial options involve the risk of losing some or all of the contract price if the market moves against the expected trend, and counterparty risk, such as broker insolvency or contractors who do not fulfil their contractual obligations.


3.1.3. Statistical and Probability Base of Black-Scholes Model

(quoted according to "mars.wiwi.hu-berlin.de/ebooks/html/sfe/sfenode41.html" and "Záškodný, P., Pavlát, V., Budík, J. (2007). Financial Derivatives and Their Evaluation. Prague, Czech Republic: University of Finance and Administration")

The Black-Scholes model observes the evolution of the option's key underlying variables in continuous time. This is done by means of both the standard normal probability densities ρ(d1), ρ(d2) and the standard normal distribution functions N(d1), N(d2).

The variables d1, d2 are connected with the Spot price S, the Strike price X, the Risk-Free Rate r, the Annual Dividend Yield d, the Time to Maturity τ, and the Volatility σ.

The basic formulas for the Black-Scholes model (the Value Function – Fair Price for a call option is marked "C", the Value Function – Fair Price for a put option is marked "P"):

C = S·e^(−d·τ)·N(d1) − X·e^(−r·τ)·N(d2),   P = X·e^(−r·τ)·N(−d2) − S·e^(−d·τ)·N(−d1)

d1 = [ln(S/X) + (r − d + σ²/2)·τ] / (σ·√τ),   d2 = d1 − σ·√τ

ρ(d1) = (1/√(2π))·e^(−d1²/2),   ρ(d2) = (1/√(2π))·e^(−d2²/2)

N(di) = ∫_{−∞}^{di} ρ(x) dx   (i = 1, 2)
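The fair prices follow directly from these formulas. The Python sketch below (our own illustration; the input values are hypothetical) mirrors the notation above:

    from math import log, sqrt, exp, erf

    def N(x):
        """Standard normal distribution function N(x)."""
        return 0.5 * (1.0 + erf(x / sqrt(2.0)))

    def black_scholes(S, X, r, d, sigma, tau):
        """Fair prices C (call) and P (put) with continuous dividend yield d."""
        d1 = (log(S / X) + (r - d + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
        d2 = d1 - sigma * sqrt(tau)
        C = S * exp(-d * tau) * N(d1) - X * exp(-r * tau) * N(d2)
        P = X * exp(-r * tau) * N(-d2) - S * exp(-d * tau) * N(-d1)
        return C, P

    # Example: S = 100, X = 95, r = 5 %, d = 0, sigma = 20 %, tau = 0.5 year
    C, P = black_scholes(100.0, 95.0, 0.05, 0.0, 0.20, 0.5)
    print(f"C = {C:.4f}, P = {P:.4f}")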

3.1.4. Statistical and Probability Base of Binomial and Trinomial Model

(quoted according to "mars.wiwi.hu-berlin.de/ebooks/html/sfe/sfenode41.html" and "Záškodný, P., Pavlát, V., Budík, J. (2007). Financial Derivatives and Their Evaluation. Prague, Czech Republic: University of Finance and Administration")

The Binomial model observes the evolution of the option's key underlying variables in discrete time. This is done by means of a binomial tree for a number of time steps between the valuation and expiration dates (the number of time steps is marked "n"). Each node in the tree represents a possible price of the underlying at a given point in time.

At each step, it is assumed that the underlying instrument will move up or down by a specific factor (u or d) per step of the tree (where, by definition, u ≥ 1 and 0 < d ≤ 1). So, if S is the spot price, then in the next period the price will be either Sup = S·u or Sdown = S·d.

The number of up factors is marked "j", the number of down factors is "n − j".

X is the Strike price and S is the Spot price of the underlying security.


Under the risk-neutrality assumption, today's fair price of a derivative is equal to the expected value of its future payoff discounted at the risk-free rate. Therefore, the expected value is calculated using the option values from the two later nodes (Option up and Option down), weighted by their respective probabilities: the "probability" p of an up move in the underlying, and the "probability" (1 − p) of a down move. The expected value is then discounted by q, the per-step risk-free growth factor corresponding to the life of the option, where p = (q − d)/(u − d).

The basic formulas for the Binomial model (the Value Function – Fair Price for a call option is marked "C", the Value Function – Fair Price for a put option is marked "P"):

C = (1/q^n)·Σ (j = 0, …, n) [n!/(j!·(n−j)!)]·p^j·(1−p)^(n−j)·Cj,   Cj = max(0, Sj − X)

P = (1/q^n)·Σ (j = 0, …, n) [n!/(j!·(n−j)!)]·p^j·(1−p)^(n−j)·Pj,   Pj = max(0, X − Sj)

Sj = S·u^j·d^(n−j)

m! = 1·2·…·m

p = (q − d)/(u − d),   1 − p = (u − q)/(u − d)
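A direct transcription of these formulas into Python might look as follows (our own sketch; u, d, q and n below are hypothetical example values):

    from math import comb

    def binomial_price(S, X, u, d, q, n):
        """Fair call and put prices from the n-step binomial formulas above.
        q is the per-step risk-free growth factor (discounting by 1/q per step)."""
        p = (q - d) / (u - d)              # risk-neutral "probability" of an up move
        C = P = 0.0
        for j in range(n + 1):
            weight = comb(n, j) * p**j * (1 - p)**(n - j)
            Sj = S * u**j * d**(n - j)     # terminal price after j up, n-j down moves
            C += weight * max(0.0, Sj - X)
            P += weight * max(0.0, X - Sj)
        return C / q**n, P / q**n

    # Example with hypothetical parameters: 3 steps, u = 1.1, d = 0.9, q = 1.02
    print(binomial_price(S=100.0, X=95.0, u=1.1, d=0.9, q=1.02, n=3))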

The Trinomial model observes the evolution of the option's key underlying variables in discrete time. This is done by means of a trinomial tree for a number of time steps between the valuation and expiration dates (the number of time steps is marked "n"). Each node in the tree represents a possible price of the underlying at a given point in time.

The fair price can be determined numerically; the Binomial model after Cox-Ross-Rubinstein can be used. In this section, a less complex but numerically efficient approach based on trinomial trees is introduced. It is related to the classical numerical procedures for solving partial differential equations, which are also used to solve the Black-Scholes differential equation.

The Trinomial model follows the procedure of the Binomial model, except that the price at each time step can move in three instead of two directions.


At each step, it is assumed that the underlying instrument will move up or down by a specific factor (e.g. two up factors u1, u2 and one down factor d) per step of the tree (where, by definition, u1, u2 ≥ 1 and 0 < d ≤ 1). So, if S is the Spot price, then in the next period the price will be either Su1 = S·u1, Su2 = S·u2 or Sd = S·d. The probabilities with which the price moves from S to Su1, Su2, Sd are represented as p1, p2, p3 (p1 + p2 + p3 = 1).

The number of u1 factors is marked "j", the number of u2 factors is marked "i", and the number of d factors is "n − i − j".

The basic formulas for the Trinomial model (the Value Function – Fair Price for a call option is marked "C", the Value Function – Fair Price for a put option is marked "P"):

C = (1/q^n)·Σ (i = 0, …, n) Σ (j = 0, …, n−i) πij·Cij,   Cij = max(0, Sij − X)

P = (1/q^n)·Σ (i = 0, …, n) Σ (j = 0, …, n−i) πij·Pij,   Pij = max(0, X − Sij)

Sij = S·u1^j·u2^i·d^(n−i−j),   i + j ≤ n

πij = [n!/(i!·j!·(n−i−j)!)]·p1^j·p2^i·p3^(n−i−j),   Σ (i + j ≤ n) πij = 1
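Analogously to the Binomial model, the trinomial fair-price formulas can be transcribed directly; the following Python sketch (our own illustration with hypothetical parameters) sums the payoff over all (i, j) nodes with i + j ≤ n:

    from math import factorial

    def trinomial_price(S, X, u1, u2, d, p1, p2, q, n):
        """Fair call and put prices from the n-step trinomial formulas above."""
        p3 = 1.0 - p1 - p2                  # probability of a down move
        C = P = 0.0
        for i in range(n + 1):              # i up moves by factor u2
            for j in range(n - i + 1):      # j up moves by factor u1
                pi_ij = (factorial(n)
                         / (factorial(i) * factorial(j) * factorial(n - i - j))
                         * p1**j * p2**i * p3**(n - i - j))
                S_ij = S * u1**j * u2**i * d**(n - i - j)
                C += pi_ij * max(0.0, S_ij - X)
                P += pi_ij * max(0.0, X - S_ij)
        return C / q**n, P / q**n

    # Example with hypothetical parameters
    print(trinomial_price(S=100.0, X=95.0, u1=1.2, u2=1.05, d=0.9,
                          p1=0.3, p2=0.3, q=1.02, n=3))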

3.1.5. Statistical and Probability Data Mining Tools – Normal, Binomial and Trinomial

Distribution

a) Standard normal probability density ρ(x) and standard normal distribution function N(x)

ρ(x) = (1/√(2π))·e^(−x²/2)

N(x) = ∫_{−∞}^{x} ρ(t) dt


b) Binomial and Trinomial probability function

P(j) = [n!/(j!·(n−j)!)]·p^j·(1−p)^(n−j)

πij = [n!/(i!·j!·(n−i−j)!)]·p1^j·p2^i·p3^(n−i−j)

3.1.6. Conclusion

The statistical and probability base of financial options as a part of statistical data mining

tools is created by

- Normal distribution,

- Binomial distribution,

- Trinomial distribution.


3.2. Description of Statistical and Probability Base of Greeks

3.2.1. Introduction

In mathematical finance, the Greeks are the quantities representing the sensitivities of derivatives such as options to a change in the underlying parameters on which the value function of an instrument or portfolio of financial instruments depends. The name is used because the most common of these sensitivities are denoted by Greek letters.

The Greeks in the Black-Scholes model are relatively easy to calculate, a desirable property of financial models, and are very useful for derivatives traders, especially those who seek to hedge their portfolios against unfavourable changes in market conditions. For this reason, the Greeks that are particularly important for hedging, namely Delta, Gamma and Vega, are well defined for measuring changes in Price, Time and Volatility.

The statistical and probability base of financial options is also connected with the

Greeks. These statistical applications will be described by means of data mining approach.

3.2.2. Greeks

(quoted according to http://en.wikipedia.org/wiki/Greeks_(finance) )

The Greeks are the quantities describing the sensitivities of financial options to

a change in underlying parameters on which the fair price (the value function) of an

instrument or portfolio of financial instruments is dependent. Collectively these have also

been called the Risk Sensitivities, Risk Measures or Hedge Parameters.

The Greeks are vital tools in Risk Management. Each Greek measures the sensitivity

of the fair price (the value function) of a financial instrument or portfolio to a small change in

a given underlying parameter, so that component risks may be treated in isolation, and the

portfolio rebalanced accordingly to achieve a desired state (see for example Delta Hedging).

According to 3.2.1, the Greeks in the Black-Scholes model are relatively easy to calculate, a desirable property of financial models, and are very useful for derivatives traders, especially those who seek to hedge their portfolios against adverse changes in market conditions. For this reason, the Greeks that are particularly important for hedging, namely Delta, Gamma and Vega, are well defined for measuring changes in Price, Time and Volatility.


The most common of the Greeks are the first-order derivatives Delta, Dual Delta, Vega, Theta and Rho, as well as Gamma, a second-order derivative of the fair price (value function). Although the risk-free rate, to which Rho corresponds, is a primary input into the Black-Scholes model, the overall impact of changes in the risk-free rate on the fair price (the value function) of an option is generally insignificant; therefore higher-order derivatives involving the risk-free interest rate are not common.

The most used second-order derivatives among the Greeks are Gamma, Dual Gamma, Vomma, Vanna, Charm and DvegaDtime; the most used third-order derivatives are Speed, Zomma, Color and Ultima.

The Greeks in the Binomial model observe the evolution of the option's key underlying variables in discrete time. The most used of these Greeks are Delta and Gamma, which are well defined for Hedging Delta and Gamma.

The most common of the Greeks in the Black-Scholes and Binomial models are Delta, Vega, Theta and Gamma. The most used kinds of Option Hedging are Hedging Delta and Hedging Gamma. The remaining sensitivities (and the hedging connected with them) in this list are common enough that they have common names, but the list is by no means exhaustive.

3.2.3. Value Function

(quoted according to Záškodný, P., Havlíček, I., Budinský, P. (2010-2011), Partial Data Mining Tools in Statistics Education – in Greeks and Option Hedging. In: Tarábek, P., Záškodný, P. (2010-2011), Educational and Didactic Communication 2010, Bratislava, Slovak Republic: Didaktis, www.didaktis.sk)

According to 3.1.2, financial options are those derivative contracts in which the underlying assets are financial instruments such as stocks, bonds or an interest rate. The options on financial instruments provide a buyer with the right to either buy or sell the underlying financial instruments at a specified price on a specified future date. Although the buyer gets the right to buy or sell the underlying instruments, there is no obligation to exercise this option. However, the seller of the contract is under an obligation to buy or sell the underlying instruments if the option is exercised.

According to 3.1.2, two types of financial options exist, namely call options and put options. Under a call option, the buyer of the contract gets the right to buy the financial instrument at the specified price at a future date, whereas a put option gives the buyer the right to sell the same at the specified price at the specified future date. The price that is paid by the buyer to the seller for this level of flexibility is called the premium (the fair price, the value function). The prescribed future price is called the strike price.

The theoretical calculation of the premium is connected namely with both the Black-Scholes model (a continuous statistical model based on the normal distribution) and the Binomial or Trinomial model (discrete statistical models based on the binomial or trinomial distribution). In this explanation, priority will be given to the Black-Scholes model.

The Black-Scholes model traces the evolution of the option's key underlying variables in continuous time. This is done by means of both the standard normal probability densities ρ(d1), ρ(d2) and the standard normal distribution functions N(d1), N(d2).

The variables d1, d2 are connected with the Spot price S, the Strike price X, the Risk-Free Rate r, the Time to Maturity τ, the Volatility σ, and the Annual Dividend Yield d.

The Value Function V (as the Fair Price or the Premium) can be expressed as a function of five quantities, V = f(S, X, r, τ, σ).

The basic formulas for the Black-Scholes model (the Value Function V – Fair Price for a call option is marked "C", the Value Function – Fair Price for a put option is marked "P"):

C = S·e^(−d·τ)·N(d1) − X·e^(−r·τ)·N(d2),   P = X·e^(−r·τ)·N(−d2) − S·e^(−d·τ)·N(−d1)

d1 = [ln(S/X) + (r − d + σ²/2)·τ] / (σ·√τ),   d2 = d1 − σ·√τ

ρ(d1) = (1/√(2π))·e^(−d1²/2),   ρ(d2) = (1/√(2π))·e^(−d2²/2)

N(di) = ∫_{−∞}^{di} ρ(x) dx   (i = 1, 2)


3.2.4. Segmentation and Definitions of Greeks

a) Greeks of first order

The speeds of value function change:

Delta: Δ = ∂V/∂S
Dual Delta: ΔDual = ∂V/∂X
Vega: ∂V/∂σ
Theta: Θ = ∂V/∂τ
Rho: ∂V/∂r

b) Greeks of individual second order

The accelerations of value function change & the speeds of first-order Greeks change:

Gamma: Γ = ∂²V/∂S²
Dual Gamma: ΓDual = ∂²V/∂X²
Vomma: ∂²V/∂σ²
∂²V/∂τ² (out of use)
∂²V/∂r² (out of use)

c) Greeks of combined second order

The speeds of first-order Greeks change:

Vanna: ∂²V/∂S∂σ
Charm: ∂²V/∂S∂τ
DvegaDtime: ∂²V/∂σ∂τ


d) Greeks of third order

The speeds of second-order Greeks change:

Speed: ∂³V/∂S³
Zomma: ∂³V/∂S²∂σ
Color: ∂³V/∂S²∂τ
Ultima: ∂³V/∂σ³

3.2.5. Indications of Greeks

a) Greeks of First Order

Delta = ∂V/∂S (DvalueDspot)
Dual Delta = ∂V/∂X (DvalueDstrike)
Vega = ∂V/∂σ (DvalueDvol)
Theta = ∂V/∂τ (DvalueDtime)
Rho = ∂V/∂r (DvalueDrate)

b) Greeks of Second Order

Gamma = ∂²V/∂S² = ∂Δ/∂S (DdeltaDspot)
Dual Gamma = ∂²V/∂X² = ∂ΔDual/∂X (DdualdeltaDstrike)
Vomma = ∂²V/∂σ² = ∂Vega/∂σ (DvegaDvol)
Vanna = ∂²V/∂S∂σ = ∂Δ/∂σ = ∂Vega/∂S (DdeltaDvol, DvegaDspot)
Charm = ∂²V/∂S∂τ = ∂Δ/∂τ = ∂Θ/∂S (DdeltaDtime, DthetaDspot)
DvegaDtime = ∂²V/∂σ∂τ = ∂Vega/∂τ = ∂Θ/∂σ (DvegaDtime, DthetaDvol)


c) Greeks of Third Order

Speed = ∂³V/∂S³ = ∂Γ/∂S (DgammaDspot)
Zomma = ∂³V/∂S²∂σ = ∂Γ/∂σ (DgammaDvol)
Color = ∂³V/∂S²∂τ = ∂Γ/∂τ (DgammaDtime)
Ultima = ∂³V/∂σ³ = ∂Vomma/∂σ (DvommaDvol)

3.2.6. Formulas for Greeks (CO – Call Option, PO – Put Option)

a) Formulas for Delta Greek Δ
Δ(CO) = e^(−d·τ)·N(d1)
Δ(PO) = −e^(−d·τ)·N(−d1)

b) Formulas for Dual Delta Greek ΔDual
ΔDual(CO) = −e^(−r·τ)·N(d2)
ΔDual(PO) = e^(−r·τ)·N(−d2)

c) Formula for Vega Greek (common to CO and PO)
Vega(CO, PO) = S·e^(−d·τ)·ρ(d1)·√τ = X·e^(−r·τ)·ρ(d2)·√τ

d) Formulas for Theta Greek Θ (with Θ = ∂V/∂τ)
Θ(CO) = S·e^(−d·τ)·ρ(d1)·σ / (2·√τ) + r·X·e^(−r·τ)·N(d2) − d·S·e^(−d·τ)·N(d1)
Θ(PO) = S·e^(−d·τ)·ρ(d1)·σ / (2·√τ) − r·X·e^(−r·τ)·N(−d2) + d·S·e^(−d·τ)·N(−d1)

e) Formulas for Rho Greek
Rho(CO) = X·τ·e^(−r·τ)·N(d2)
Rho(PO) = −X·τ·e^(−r·τ)·N(−d2)

f) Formula for Gamma Greek Γ (common to CO and PO)
Γ(CO, PO) = e^(−d·τ)·ρ(d1) / (S·σ·√τ)

g) Formula for Dual Gamma Greek ΓDual (common to CO and PO)
ΓDual(CO, PO) = e^(−r·τ)·ρ(d2) / (X·σ·√τ)

i) Formula for Vomma Greek (common to CO and PO)
Vomma(CO, PO) = S·e^(−d·τ)·ρ(d1)·√τ·d1·d2 / σ = Vega·d1·d2 / σ

j) Formula for Vanna Greek (common to CO and PO)
Vanna(CO, PO) = −e^(−d·τ)·ρ(d1)·d2 / σ = (Vega / S)·(1 − d1/(σ·√τ))

k) Formulas for Charm Greek (with Charm = ∂Δ/∂τ)
Charm(CO) = −d·e^(−d·τ)·N(d1) + e^(−d·τ)·ρ(d1)·[2·(r − d)·τ − d2·σ·√τ] / (2·τ·σ·√τ)
Charm(PO) = d·e^(−d·τ)·N(−d1) + e^(−d·τ)·ρ(d1)·[2·(r − d)·τ − d2·σ·√τ] / (2·τ·σ·√τ)

l) Formula for DvegaDtime Greek (common to CO and PO, with DvegaDtime = ∂Vega/∂τ)
DvegaDtime(CO, PO) = S·e^(−d·τ)·ρ(d1)·√τ·[(1 + d1·d2)/(2·τ) − d − (r − d)·d1/(σ·√τ)]

m) Formula for Speed Greek (common to CO and PO)
Speed(CO, PO) = −(Γ/S)·(1 + d1/(σ·√τ)) = −e^(−d·τ)·ρ(d1)·(1 + d1/(σ·√τ)) / (S²·σ·√τ)

n) Formula for Zomma Greek (common to CO and PO)
Zomma(CO, PO) = Γ·(d1·d2 − 1)/σ = e^(−d·τ)·ρ(d1)·(d1·d2 − 1) / (S·σ²·√τ)

o) Formula for Color Greek (common to CO and PO, with Color = ∂Γ/∂τ)
Color(CO, PO) = −e^(−d·τ)·ρ(d1) / (2·S·τ·σ·√τ)·[2·d·τ + 1 + d1·(2·(r − d)·τ − d2·σ·√τ)/(σ·√τ)]

p) Formula for Ultima Greek (common to CO and PO)
Ultima(CO, PO) = −(Vega/σ²)·[d1·d2·(1 − d1·d2) + d1² + d2²]
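For completeness, a minimal Python sketch (our own illustration, assuming the notation above; it is not part of the quoted sources) evaluates the most common hedging Greeks, Delta, Gamma and Vega, from these closed forms:

    from math import log, sqrt, exp, erf, pi

    def N(x):
        """Standard normal distribution function."""
        return 0.5 * (1.0 + erf(x / sqrt(2.0)))

    def density(x):
        """Standard normal probability density rho(x)."""
        return exp(-0.5 * x * x) / sqrt(2.0 * pi)

    def main_greeks(S, X, r, d, sigma, tau):
        """Delta (call/put), Gamma and Vega from the closed forms above."""
        d1 = (log(S / X) + (r - d + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
        delta_call = exp(-d * tau) * N(d1)
        delta_put = -exp(-d * tau) * N(-d1)
        gamma = exp(-d * tau) * density(d1) / (S * sigma * sqrt(tau))
        vega = S * exp(-d * tau) * density(d1) * sqrt(tau)
        return delta_call, delta_put, gamma, vega

    print(main_greeks(S=100.0, X=95.0, r=0.05, d=0.0, sigma=0.20, tau=0.5))

A convenient cross-check is to compare these values against central finite differences of the fair price, e.g. Δ ≈ [C(S + h) − C(S − h)] / (2h) for a small step h.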


3.2.7. Needful Statistical and Probability Relations for Deduction of Greeks Formulas

a) Value Function

C = S·e^(−d·τ)·N(d1) − X·e^(−r·τ)·N(d2),   P = X·e^(−r·τ)·N(−d2) − S·e^(−d·τ)·N(−d1)

d1 = [ln(S/X) + (r − d + σ²/2)·τ] / (σ·√τ),   d2 = [ln(S/X) + (r − d − σ²/2)·τ] / (σ·√τ) = d1 − σ·√τ

b) Standard Normal Probability Densities

ρ(d1) = (1/√(2π))·e^(−d1²/2),   ρ(d2) = (1/√(2π))·e^(−d2²/2)

ρ(d2) = ρ(d1)·e^((d1² − d2²)/2) = ρ(d1)·(S/X)·e^((r − d)·τ), so that X·e^(−r·τ)·ρ(d2) = S·e^(−d·τ)·ρ(d1)

c) Standard Normal Distribution Functions

N(d1) = ∫_{−∞}^{d1} ρ(x) dx,   N(d2) = ∫_{−∞}^{d2} ρ(x) dx

N(−d1) = 1 − N(d1),   N(−d2) = 1 − N(d2)

∂N(d1)/∂d1 = ρ(d1),   ∂N(d2)/∂d2 = ρ(d2)
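The relation S·e^(−d·τ)·ρ(d1) = X·e^(−r·τ)·ρ(d2) is the workhorse of the Greek derivations above; a quick numerical sanity check in Python (our own sketch with hypothetical inputs) looks like this:

    from math import log, sqrt, exp, pi

    # Numerical check of the identity S*e^(-d*tau)*rho(d1) = X*e^(-r*tau)*rho(d2)
    # (the input values are hypothetical, chosen only for illustration)
    S, X, r, d, sigma, tau = 100.0, 95.0, 0.05, 0.02, 0.20, 0.5

    d1 = (log(S / X) + (r - d + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    d2 = d1 - sigma * sqrt(tau)
    rho = lambda x: exp(-0.5 * x * x) / sqrt(2.0 * pi)

    print(S * exp(-d * tau) * rho(d1))   # left-hand side
    print(X * exp(-r * tau) * rho(d2))   # right-hand side: agrees up to rounding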

3.2.8. Conclusion, References

The results of the explanation:

- Description of Value Function as Fair Price

- Description of Greeks of First Order

- Description of Greeks of Second Order

- Description of Greeks of Third Order

- Names and Indications of Greeks

- Survey of Formulas for Greeks Calculation

- Survey of Needful Relations for Greeks Calculation


References

- Keim, D.A. (2002). Information Visualization and Visual Data Mining. IEEE Transactions on Visualization and Computer Graphics, Vol. 7, No. 1, January-March 2002.

- Záškodný, P., Tarábek, P. (2010-2011). Data Mining Tools in Statistics Education. In: Tarábek, P., Záškodný, P. (2010-2011), Educational and Didactic Communication 2010. Bratislava, Slovak Republic: Didaktis, ISBN 978-80-89160-78-5, www.didaktis.sk.

- Záškodný, P., Havlíček, I., Budinský, P. (2010-2011). Partial Data Mining Tools in Statistics Education – in Greeks and Option Hedging. In: Tarábek, P., Záškodný, P. (2010-2011), Educational and Didactic Communication 2010. Bratislava, Slovak Republic: Didaktis, ISBN 978-80-89160-78-5, www.didaktis.sk.


3.3. Data Mining Tools in Statistics Education

3.3.1. Introduction

In the introduction of chapter 3.3, quotations showing the importance of educational data mining are presented. The quotations i) to vi) are selected according to C. Romero, S. Ventura (2006) (In: Tarábek, P., Záškodný, P. (2009), Educational and Didactic Communication 2009, Bratislava, Slovak Republic: Didaktis, www.didaktis.sk, ISBN 978-80-89160-69-3).

i) Currently there is an increasing interest in data mining and educational systems (well-known learning content management systems, adaptive and intelligent web-based educational systems), making educational data mining a new and growing research community

ii) After preprocessing the available data in each case, data mining techniques can be applied in educational systems – statistics and visualization, clustering, classification and outlier detection, association rule mining and pattern mining, text mining

iii) Data mining oriented towards students – to show recommendations to students and to let them use, interact with, participate in and communicate within educational systems

iv) Data mining oriented towards educators (and academically responsible administrators) – to show discovered knowledge and to let educators (administrators) design, plan, build and maintain educational systems

v) Data mining tools provide mining algorithms, filtering and visualization techniques. The examples

of Data Mining tool:

- Tool name: Mining tool, Authors: Zaïane and Luo (2001), Mining task: Association and patterns

- Tool name: Multistar, Authors: Silva and Vieiva (2002), Mining task: Association and classification

- Tool name: Synergo/ColAT, Authors: Avouris et al (2005), Mining task: Visualization

vi) Future research lines in educational data mining
- Mining tools that better facilitate the application of data mining by educators or non-expert users
- Standardization of data and methods (preprocessing, discovering, postprocessing)
- Integration with the e-learning system
- Specific data mining techniques

The main principle of chapter 3.3.: Data Mining in Statistics Education (DMSTE) as Problem Solving

The main goal of chapter 3.3.: Delimitation of Complex Tool and Partial Tool of DMSTE

The procedure of chapter 3.3.: - Data Preprocessing in Statistics Education

- Data Processing in Statistics Education

- Complex Tool of DMSTE – Curricular Process (CP-DMSTE)

- Partial Tool of DMSTE – Analytical Synthetic Modelling (ASM-DMSTE)

- Application of CP-DMSTE and ASM-DMSTE

- Supplement describing the principles of data mining approach


The results of chapter 3.3.:

1. Educational Communication of Statistics as Result of Data Preprocessing

2. Educational Communication of Statistics as Five Transformations T1-T5 of Knowledge

from Statistics to Mind of Educant

3. Curricular Process of Statistics as Result of Data Processing

4. Curricular Process of Statistics as Structuring, Algorithm Development and Formalization

of Educational Communication of Statistics

5. Curricular Process as Succession of Five Transformations T1-T5 of Curriculum Variant

Forms

6. Curriculum Variant Forms as Forms of Education Content Existence

7. Formalization of Curriculum Variant Form (Four Universal Structural Elements: Sense and Interpretation, Set of Objectives, Conceptual Knowledge System, Factor of Following Transformation)

8. Variant Forms of Curriculum – Conceptual Curriculum (Communicable Scientific System

of Statistics), Intended Curriculum (Educational System of Statistics), Projected

Curriculum (Instructional Project of Statistics and Its Textbook), Implemented

Curriculum-1 (Preparedness of Educator to Education), Implemented Curriculum-2

(Results of Education in Mind of Educant), Attained Curriculum (Applicable Results of

Education)

9. Curricular Process as CP-DMSTE (Structuring, Algorithm Development and

Formalization of Five Transformations Succession T1-T5)

10. Analytical Synthetic Modeling as ASM-DMSTE (Modeling Inputs and Outputs of

Transformations T1-T5)

11. Analytical Synthetic Models as Results of Problems Solving (Real or Mediated Problems)

12. Application of CP-DMSTE and ASM-DMSTE (Visualia of Conceptual Curriculum in

Area of Statistics with Concrete Basic Statistical Set, Need of Visualiae of All Curriculum

Variant Forms as Application of CP-DMSTE)

3.3.2. Data Mining (see also Supplement of chapter 3.3.)

Data Mining – an analytical synthetic way of extracting hidden and potentially useful information from large data files (continuum data-information-knowledge, knowledge discovery)

Data Mining Techniques – the system functions of the structure of formerly hidden relations and patterns (e.g. classification, association, clustering, prediction)

Data Mining Tool – a concrete procedure for reaching the intended system functions

Complex Tool – a resolution of a complex problem of the relevant science branch

Partial Tool – a resolution of a partial problem of the relevant science branch (e.g. analytical synthetic modeling, needful mathematical or statistical procedures)

Result of Data Mining – a result of data mining tool application

Representation of Data Mining Result – a description of what is expressed

Visualization of Data Mining Result – optical retrieval of the data mining result

Data Mining Cycle – Data Definition, Data Gathering, Data Preprocessing, Data Processing, Discovering Knowledge or Patterns, Representation and Visualization of Results

See P.Tarabek, P.Zaskodny, V.Pavlat, P.Prochazka, V.Novak, J.Skrabankova (2009-2010,

2009-2010abcde and quoted sources).

Quoted sources in 2009-2010abcde:

E.g. American Library Association, M.C.Borba, E.M.Villarreal, G.M.Bowen, W-M Roth, C.Brunk,

J.Kelly, R.Kohavi, Mineset, B.V.Carolan, G.Natriello, N.Delavari, M.R.Beikzadeh, S.Phon-

Amnuaisuk, U-D Ehlers, J.M.Pawlowski, U.M.Fayyad, G.Piatelsky-Shapiro, P.Smyth, J.Fox, D.Gabel,

J.K.Gilbert, O.de Jong, R.Justi, D.F.Treagust, J.H.Van Driel, M.Reiner, M.Nakhleh, W.Hämäläinen,

T.H.Laine, E.Sutinen, M.Hesse, A.H.Johnstone, M.J.Kearns, U.V.Vazivani, D.A.Keim, R.Kwan,


R.Fox, FT Chan, P.Tsang, Le Jun, J.Luan, J.Manak, National Research Council-NRC, R.Newburgh,

I.Nonaka, H.Takeuchi, C.J.Petroselli, E.F.Redish, D.Reisberg, C.Romero, S.Ventura, N.Rubenking,

R.E.Scherr, M.Sabella, D.A.Simovici, C.Djeraba, V.Spousta, L.Talavera, E.Gaudioso, E.R.Tufte,

J.Tuminaro, R.Vilalta, C.Giraud-Carrier, P.Brazdil, C.Soares, D.M.Wolpert.

3.3.3. Data Preprocessing in Statistics Education

Result of Data Preprocessing – Educational Communication of Statistics as

a succession of transformations of education content forms (taken over from physics education):

- The transformation T1 is transformation of scientific system of statistics to communicable

scientific system of statistics (the first form of education content existence),

- The transformation T2 is transformation of communicable scientific system of statistics to

educational system of statistics (the second form of education content existence),

- The transformation T3 is transformation of educational system of statistics to both instructional

project of statistics and preparedness of educator to education (the third and fourth forms of education

content existence),

- The transformation T4 is transformation of both instructional project of statistics and preparedness

of educator to results of education (the fifth form of education content existence),

- The transformation T5 is transformation of results of statistics education to applicable results of

statistics education (the sixth form of education content existence)

See J. Brockmeyerová (1982), P. Záškodný et al. (2004, 2007), P. Tarábek, P. Záškodný (2001, 2007-2008abc, 2008-2009, 2009-2010), P. Záškodný (2001, 2006, 2009).

3.3.4. Data Processing in Statistics Education

Result of Data Processing – Curricular Process of Statistics as a succession of transformations

of algorithmized and formalized education content forms (taken over from physics education):

i. The form of education content existence - “variant form of curriculum”

ii. The curriculum - “education content” (see Prucha, 2005)

iii. The variant forms of curriculum have got the universal structure (four structural elements -

sense and interpretation, set of objectives, conceptual knowledge system, factor of following

transformation)

iv. The variant forms of curriculum were selected on the basis of fusion of Anglo-American

curricular tradition and European didactic tradition

v. The curricular process is defined as the succession of transformations T1-T5 of curriculum

variant forms:

“conceptual curriculum” (output of T1, the first variant form of curriculum) - the communicable

scientific system

“intended curriculum” (output of T2, the second variant form of curriculum) - the educational

system of statistics


“projected curriculum” (output of T3, the third variant form of curriculum) - the instructional project

of statistics

“implemented curriculum-1” (output of T3, the fourth variant form of curriculum) - the preparedness

of educator to education

“implemented curriculum-2” (output of T4, the fifth variant form of curriculum) – the results of

education

“attained curriculum” (output of T5, the sixth variant form of curriculum) - applicable results of

education

See P.Prochazka, P.Zaskodny (2009-2010c).

Quoted sources in 2009-2010c:

E.g. A.V.Kelly, M.K.Smith, W.Doyle, M.Pasch, A.M.Sochor, V.V.Krajevskij, I.J.Lerner, J.McVittie,

K.Carter, G.M.Blenkin, L.Stenhouse, E.Newman, G.Ingram, F.Bobitt, R.W.Tyler, H.Taba,

C.Cornblet, S.Grundy, D.Lawton, P.Gordon, M.Certon, M.Gayle, G.J.Posner.

3.3.5. Complex and Partial Tool of DMSTE – CP-DMSTE, ASM-DMSTE

Complex tool of DMSTE is given by curricular process of statistics (CP-DMSTE). CP-

DMSTE delimits the correct education content via succession of transformations T1-T5.

Partial tool of DMSTE is given by analytical synthetic modeling (ASM-DMSTE).

ASM-DMSTE describes the mediated or real problem solving within the inputs and outputs of the individual transformations T1-T5. In this chapter, the description of ASM-DMSTE is realized by means of both the visualia Vis.1 and the Legend to Vis.1.

Legend to Vis.1

a (Identified Complex Problem) – Investigated area of reality, investigated phenomenon

Bk (Analysis) – Analytical segmentation of complex problem to partial problems

bk (Partial problems PP-k) – Result of analysis: essential attributes and features

of investigated phenomenon

Ck (Abstraction) – Delimitation of partial problems essences by abstraction with goal

to acquire the partial solutions

ck (Partial solutions PS-k) – Result of abstraction: partial concepts, partial pieces of

knowledge, various relations, etc.

Dk (Synthesis) – Synthetic finding dependences among results of abstraction

dk (Partial conclusions PC-k) – Result of synthesis: principle, law, dependence, continuity

Ek (Intellectual reconstruction) – Intellectual reconstruction of investigated phenomenon /

investigated area of reality

e (Total solution of complex problem “a”) – Result of intellectual reconstruction:

analytical synthetic structure of final knowledge (conceptual knowledge system)


Vis.1 General Analytical Synthetic Model of Problem Solving

(Diagram: the identified complex problem a is decomposed by Analysis Bk into partial problems b1, …, bk; Abstraction Ck yields partial solutions c1, …, ck; Synthesis Dk yields partial conclusions d1, …, dk; Intellectual reconstruction Ek forms the total solution e of the complex problem "a" by means of PC-1, PC-2, …, PC-k.)

Application of Partial Tool ASM-DMSTE

The application of ASM-DMSTE is the visualia Vis.2 from the area of statistics education. The visualia Vis.2 is an analytical synthetic model of statistics with a concrete basic statistical set. This visualia constitutes a part of the statistics conceptual curriculum as a part of the communicable scientific system of statistics (a part of the output of transformation T1).

The visualized result Vis.2 of data mining in statistics education constitutes a paramorphic model and a hypertextual representation; it represents external conceptual knowledge systems as an external representation of general social experience. The visualized result also represents a concrete type of data file – the representation of statistics with a concrete basic statistical set.



Vis.2: Analytical synthetic model of statistics formed by four partial models a1-e1, a2-e2, a3-e3, a4-e4
(a part of the conceptual curriculum of statistics – a part of the communicable scientific system of statistics – output of transformation T1)

(Diagram, rendered here as the succession of its four partial models:)

a-1: Collective random phenomenon and the reason for its investigation
- statistical unit, statistical sign; choice of statistical units, variants (values) of the statistical sign; creating of scale, measurement

e-1 = a-2: Selective statistical set (SSS) as a part of the basic statistical set, goals of the statistical examination
- frequencies tables (empirical distribution), graphical expression, empirical parameters

e-2 = a-3: Empirical picture of the selective statistical set, necessity of probability investigation
- choice of an acceptable theoretical distribution, testing of non-parametric hypotheses; quantification of theoretical parameters, comparison of theoretical and empirical parameters; point & interval estimation (e.g. confidence interval), testing of parametric hypotheses

e-3 = a-4: Empirical & probability picture of the selective statistical set, necessity of association investigation
- statistical dependence (causal, non-causal), regression analysis, correlation analysis

e-4: Empirical & probability & association picture of the selective statistical set; interpretation and conclusions as the statistical & probability dimension of the investigation of the collective random phenomenon
- applied statistics (e.g. financial options and their mathematical and statistical elaboration by means of Greeks calculation and option hedging models)


LEGEND to the whole visualia Vis.2

One-Sample Analysis, Two / Multiple-Sample Analysis

LEGEND to the partial models of visualia Vis.2

a-1 → e-1: Formulation of the statistical examination

a-2 → e-2: Relative & Cumulative Frequencies (Empirical distribution)
Plotting functions: e.g. Plot Frequency Polygon (Graphical expression)
Average-Means, Variance-Standard Deviation, Obliqueness (Skewness), Pointedness (Kurtosis) (Empirical parameters)

a-3 → e-3: Theoretical Distribution (partial survey in alphabetical order):
Bernoulli, Beta, Binomial, Chi-square, Discrete Uniform, Erlang, Exponential, F, Gamma, Geometric, Lognormal, Negative binomial, Normal, Poisson, Student's, Triangular, Trinomial, Uniform, Weibull
Testing of Non-parametric Hypotheses (Hypothesis test for H0 – accept or reject H0):
e.g. computed Wilcoxon's test, Kolmogorov-Smirnov test, Chi-square test, e.g. at alpha = 0.05
Point & Interval Estimation:
e.g. confidence interval for the Mean, confidence interval for the Standard Deviation
Testing of Parametric Hypotheses (Hypothesis test for H0 – accept or reject H0):
e.g. computed u-statistic, t-statistic, F-statistic, Chi-square statistic, Cochran's test, Bartlett's test, Hartley's test, e.g. at alpha = 0.05

a-4 → e-4: Statistical dependence:
e.g. confidence interval for the difference in Means (Equal variances, Unequal variances), e.g. confidence interval for the Ratio of Variances
Regression analysis: simple – multiple, linear – non-linear
Correlation analysis: e.g. Rank correlation coefficient, Pearson's correlation coefficient


3.3.6. Conclusion, References

Modeling as a partial tool of data mining – quotation according to J.K. Gilbert (2008) (In: Tarábek, P., Záškodný, P. (2009), Educational and Didactic Communication 2009, Bratislava, Slovak Republic: Didaktis, www.didaktis.sk, ISBN 978-80-89160-69-3):

"In a nightmare world, we would perceive the world around us being continuous and without structure. However, our survival as a species has been possible because we have evolved the ability to 'cut up' that world mentally into chunks about which we can think and hence give meaning to."

"This process of chunking, a part of all cognition, is modelling and the products of the mental actions that have taken place are models. Science, being concerned with the provision of explanations about the natural world, places an especial reliance on the generation and testing of models."

References

1. Used Publications

i. Brockmeyerová,J. (1982) Introduction into Theory and Methodology of Physics Education. Prague, Czech

Republic: SPN

ii. CSRG (2009). Curriculum Studies Research Group.

České Budějovice: University of South Bohemia, Czech Republic, http://sites.google.com/site/csrggroup/

iii. Gilbert,J.K. (2008) Visualization: An Emergent Field of Practice and Enquiry. In: Visualization: Theory and Practice

in Science (Models and Modeling in Science Education). New York: Springer Science + Business Media

iv. Keim,D.A. (2002) Information Visualization and Visual Data Mining. IEEE Transactions on Visualization

and Computer Graphics. Vol.7, No.1, January-March 2002

v. Průcha,J (2005) Moderní pedagogika (Modern Educational Science), Prague, Czech Republic: Portál

2. Used Papers, Monographs, and Books of Author (2001-2010)

i. Tarábek,P., Záškodný,P. (2001)

Structural Textbook and Its Creation.

Bratislava, Slovak Republic: Didaktis, ISBN 80-85456-76-1

ii. Záškodný,P. (2001)

Statistical Dimension of Scientific Research.

KONTAKT, 2, 5, 2001 ISSN 1212-4117

iii. Tarábek, P., Záškodný, P. (2007-2008a)
Educational and Didactic Communication 2007, Vol. 1 – Theory.
Bratislava, Slovak Republic: Didaktis, www.didaktis.sk, ISBN 978-80-89160-56-3

iv. Tarábek, P., Záškodný, P. (2007-2008b)
Educational and Didactic Communication 2007, Vol. 2 – Methods.
Bratislava, Slovak Republic: Didaktis, www.didaktis.sk, ISBN 978-80-89160-56-3

v. Tarábek, P., Záškodný, P. (2007-2008c)
Educational and Didactic Communication 2007, Vol. 3 – Applications.
Bratislava, Slovak Republic: Didaktis, www.didaktis.sk, ISBN 978-80-89160-56-3


vi. Tarábek,P., Záškodný,P. (2008-2009)

Educational and Didactic Communication 2008.

Bratislava, Slovak Republic: Didaktis, www.didaktis.sk, ISBN 978-80-89160-62-4

vii. Tarábek,P., Záškodný,P. (2009-2010)

Educational and Didactic Communication 2009.

Bratislava, Slovak Republic: Didaktis, www.didaktis.sk, ISBN 978-80-89160-69-3

viii. Záškodný, P. et al. (2004)
Základy zdravotnické statistiky (Fundamentals of Health Statistics).
České Budějovice, Czech Republic: University of South Bohemia, ISBN 80-7040-663-1

ix. Záškodný,P. (2006)

Survey of Principles of Theoretical Physics (with Application to Radiology) (in English). Lucerne, Switzerland, Ostrava, Czech Republic: Avenira, Algoritmus, ISBN 80-902491-9-1

x. Záškodný, P. et al. (2007)
Základy ekonomické statistiky (Fundamentals of Economic Statistics).
Prague, Czech Republic: Institute of Finance and Administration, ISBN 80-86754-00-6

xi. Záškodný, P. (2009)
Curricular Process of Physics (with Survey of Principles of Theoretical Physics) (in Czech). Lucerne, Switzerland, Ostrava, Czech Republic: Avenira, Algoritmus, ISBN 978-80-902491-0-3

xii. Záškodný,P. (2009-2010)

Data Mining Tools in Science Education (in: vii.)

xiii. Záškodný,P., Pavlát,V. (2009-2010a)

Data Mining – A Brief Recherche (in: vii.)

xiv. Záškodný,P., Novák,V. (2009-2010b)

Data Mining – A Brief Summary (in: vii.)

xv. Záškodný,P., Procházka,P. (2009-2010c)

Collective Scheme of Both Educational Communication and Curricular Process (in: vii.)

xvi. Záškodný,P. , Škrabánková,J.(2009-2010d)

Modelling and Visualization of Problem Solving (in: vii.)

xvii. Záškodný,P. (2009-2010e)

Representation of Results of Data Mining (in: vii.)


3.3.7. Supplement of Chapter 3.3. – The Principles of Data Mining Approach

3.3.7.1. Quotations from Sources

i) Definitions of Data Mining

J.Luan (2002)

Definition of Data Mining

a) Data Mining is the process of discovering meaningful new correlations, patterns, and trends by

sifting through large amounts of data stored in repositories and by using pattern recognition

technologies as well as statistical and mathematical techniques

b) The notion of Data Mining for higher education: Data Mining is a process of uncovering hidden trends and patterns that lend themselves to predictive modeling, using a combination of an explicit knowledge base, sophisticated analytical skills and academic domain knowledge

N.Rubenking (2001)

Definition of Data Mining

Data Mining is the process of automatically extracting useful information and relationships from immense quantities of data. In its purest form, Data Mining doesn't involve looking for specific information. Rather than starting from a question or a hypothesis, Data Mining simply finds patterns that are already present in the data.

R.Kohavi (2000)

Definition of Data Mining as Knowledge Discovery

Data Mining (or Knowledge Discovery) is the process of identifying new patterns and insights in data

Interpretation of Data Mining

As the volume of data collected and stored in databases grows, there is a growing need to provide data

summarization, identify important patterns and trends, and act upon findings

Le Jun (2008)

Definition of Data Mining as New Technology

Data Mining is the extraction of hidden predictive information from large databases. Data Mining is a powerful new technology with great potential to help a scientific area focus on the most important information in its data

N.Delavari, M.R.Beikzadeh, S.Phon-Amnuaisuk (2005)

Definition of Data Mining

Searched knowledge (meaningful knowledge, previously unknown and potentially useful information

discovered) is hidden among the raw educational data set and it is extractable through Data Mining

R.Kwan, R.Fox, FT Chan, P.Tsang (2008), Le Jun (2008)

Data, Information, Knowledge

Data, Information and Knowledge are different terms, which differ in meaning and value.
a) Data is a collection of facts and quantitative measures, which exists outside of any context from which conclusions can be drawn.
b) Information is data that people interpret and place in a meaningful context, highlighting patterns and causes of relationships in the data.


c) Knowledge is the understanding humans develop as a reaction to and use of information, either individually or as an organization.

Data-Information-Knowledge Continuum

a) Data, information and knowledge are separated but linked concepts which can form a data-

information-knowledge continuum.

b) Data becomes information when people place it in context through interpretation that seeks to highlight patterns.

c) Knowledge can be described as a belief that is justified through discussion, experience and perhaps

action. It can be shared with others by exchanging information in appropriate contexts.

ii) Data Mining and Problem Solving

L.Talavera, E.Gaudioso (2002)

Data Mining as Analysis Problem In this paper we propose to shape the analysis problem as a data mining.

J.Tuminaro, E.F.Redish (2005), E.F.Redish (2005)

Problem solving

Problem solving and the use of math in physics courses

Student Use of Math in the Context of Physics Problem Solving: A Cognitive Model

M.C.Borba, E.M.Villarreal (2005)

Problem solving Problem solving as context

Problem solving as skill

Problem solving as art

Process of modeling, process of problem solving The process of modeling or model building is a part of the process of problem solving

Steps of problem solving process (process of problem solving as entailing several steps):

The starting point is a real problematic situation

The first step is to create a real model, making simplifications, idealizations, establishing conditions

and assumptions, but respecting original situation

In the second step, the real model is mathematized, to get a mathematical model

The third step implies the selection of suitable mathematical methods and working within

mathematics in order to get some mathematical results

In the fourth step, these results are interpreted for and translated into the real situation

iii) Forms of Data Mining, Data Mining System, Goals of Data Mining, Scope of

Data Mining

R.Kohavi (2000)

Forms of Data Mining (Structured mining etc.)

Structured mining, Text mining, Information retrieval


W.Hämäläinen, T.H.Laine, E.Sutinen (2003)

Data Mining system, educational system

Data Mining system in educational system: the educational system should be served by Data Mining

system to monitor, intervene in, and counsel the teaching-studying-learning process

R.Kohavi (2000)

Goals of Data Mining Data Mining serves two goals:

-Insight: Identified patterns and trends are comprehensible

-Prediction: A model is built that predicts (scores) based on input data. Prediction as classification

(discrete variable) or as regression (continuous variable)

Scope of Data Mining The majority of research in DM has concentrated on building the best models for prediction.

A learning algorithm is given the training set and produces a model that can map new unseen data into

the prediction.

iv) Results of Data Mining, Applications of Data Minings, Interdisciplinarity of Data

Mining

R.Kohavi (2000), D.M.Wolpert (1994), M.J.Kearns, U.V.Vazivani (1994)

Some theoretical results in Data Mining

- No free lunch (if all concepts are equally likely, then learning is impossible)
- Consistency (non-parametric models can learn the target concept given enough data; parametric models such as linear regression are known to be of limited power) – enough data = consistency
- PAC learning (probably approximately correct learning) is a concept introduced to provide guarantees about learning
- Bias-Variance decomposition

U.M.Fayyad, G.Piatelsky-Shapiro, P.Smyth (1996)

Interdisciplinarity of Data Mining

Data Mining, sometimes referred to as Knowledge Discovery, is at the intersection of multiple research areas, including machine learning, statistics, pattern recognition, databases and visualization

J.Luan (2002)

Potential applications of Data Mining
“There are several ways to examine the potential applications of Data Mining:
a) One is to start with the functions of the algorithms to reason about what they can be utilized for
b) Another is to examine the attributes of a specific area where data are rich but mining activities are scarce
c) And another is to examine the different functions of a specific area to identify the needs that can translate themselves into a Data Mining project”
Notes: a) - See Curricular Process as Data Mining Algorithm
b) - See Curriculum: Theory and Practice as a scientific area in which mining activities are scarce
c) - Some of the most likely places where data miners (educational researchers who wear this hat) may initiate Data Mining projects are: Variant Forms of Curriculum


v) Data Mining techniques

N.Delavari, M.R.Beikzadeh, S.Phon-Amnuaisuk (2005)

Data Mining techniques “DM techniques can be used to extract unknown pattern from the set of data and discover useful

knowledge. It results in extracting greater value from the raw data set, and making use of strategic

resources efficiently and effectively.”

J.Luan (2001)

Data Mining techniques as Data Mining functions “Prediction, clustering, classification, association”

Le Jun (2008)

Data Mining techniques – application of Data Mining tools

“Application of DM tools: To solve the task of prediction, classification, explicit modeling and

clustering. The application can help understand learners´ learning behaviors”

C.Romero, S.Ventura (2006)

Data Mining techniques in educational systems

“After preprocessing the available data in each case, Data Mining techniques can be applied in

educational systems – statistics and visualization, clustering, classification and outlier detection,

association rule mining and pattern mining, text mining”

J.Luan (2002)

Clustering and prediction – the most striking aspects of Data Mining techniques

- “The clustering aspect of Data Mining offers comprehensive characteristics analysis of investigated

area”

- “The predicting function estimates the likelihood for a variety of outcomes”

B.V.Carolan, G.Natriello (2001)

Clustering

“Data-Mining Resources to identify structural attributes of educational research community-e.g.

clustering as collaboration of physicists and biologists”

D.A.Simovici, C.Djeraba (2008)

Clustering, Taxonomy of clustering

a) “Clustering is the process of grouping together objects that are similar. The groups formed by

clustering are referred to as clusters.”

b) “Clustering can be regarded as a special type of classification, where the clusters serve as

classes of objects”

c) “It is widely used data mining activity with multiple applications in a variety of scientific activities

from biology and astronomy to economics and sociology”

d) “Taxonomy of clustering (we follow here the taxonomy of clustering)

- Exclusive or nonexclusive: Clustering may or may not be exclusive. It is exclusive where an exclusive clustering technique yields clusters that are disjoint; it is nonexclusive where a nonexclusive technique produces overlapping clusters.
- Intrinsic or extrinsic: Clustering may be intrinsic or extrinsic. Intrinsic clustering is based only on dissimilarities between the objects to be clustered. In extrinsic clustering, information about which objects should be clustered together and which should not is provided by an external source.
- Hierarchical or partitional: Clustering may be hierarchical or partitional. In hierarchical clustering algorithms, a sequence of partitions is constructed. Partitional clustering creates a partition of the set of objects whose blocks are the clusters, such that objects in a cluster are more similar to each other than to objects that belong to different clusters”

vi) Data Mining tools

C.Brunk, J.Kelly, R.Kohavi (1997)

Data Mining tool

““Mineset” is a Data Mining tool that integrates Data Mining and visualization very tightly. Models built can be viewed and interacted with.”

C.Romero, S.Ventura (2006)

Data Mining tools

“Data Mining tools provide mining algorithms, filtering and visualization techniques. The examples

of Data Mining tool:

- Tool name: Mining tool, Authors: Zaïane and Luo (2001), Mining task: Association and patterns

- Tool name: Multistar, Authors: Silva and Vieiva (2002), Mining task: Association and classification

- Tool name: Synergo/ColAT, Authors: Avouris et al (2005), Mining task: Visualization”

D.A.Simovici, C.Djeraba (2008)

Mathematical tools for Data Mining

a) “This book was born from the experience of the authors as researchers and educators, which suggests that many students of Data Mining are handicapped in their research by the lack of formal, systematic education in its mathematics. The book is intended as a reference for the working data miner.”

b) “In our opinion, three areas of math are vital for DM:

- set theory, including partially ordered sets and combinatorics,

- linear algebra, with its many applications in principal component analysis and neural networks,

- and probability theory, which plays a foundational role in statistics, machine learning and DM”

vii) Modeling, Model

J.K.Gilbert, M.Reiner, M.Nakhleh (2008), J.K.Gilbert (2008), J.K.Gilbert, R.Justi ( 2002)

Definition of Modelling, Model
“We have evolved the ability to “cut up” that world mentally into chunks about which we can think and hence give meaning to. This process of chunking (Data Mining clustering), a part of all cognition, is modelling and the products of the mental actions that have taken place are models”

Significance of Modelling, Model

“Modelling as an element in scientific methodology and models at the outcome of modelling are both

important aspects of the conduct of science and hence of science education”

“Categorization of models
a) Historical models (Curriculum models) - learning specific consensus (the P-N junction model of the transistor). Curriculum models can be used to provide an acceptable explanation of a wide range of phenomena and specific facts; that is why they are a useful way of reducing, by chunking, the ever-growing factual load of the science curriculum
b) New qualitative models - developed by following the sequence of learning: to revise an established model, to construct a model de novo (to reconstruct an established model)
c) New quantitative models - developed by following the sequence of learning: a quantitative version of a usable qualitative model of a phenomenon
d) Progress in the scientific enquiry is indicated by the value of a particular combination of qualitative and quantitative models in making successful predictions about its properties”

C.M.Borba, E.M.Villarreal (2005)

Definition of modeling “Modeling can be understood as a pedagogical approach that emphasizes students´ choice of

a problem to be investigated in the classroom. Students, therefore, play an active role in curriculum

development instead of being just the recipients of tasks designed by others.”

“Problem solving - problem solving as context

- problem solving as skill

- problem solving as art”

Process of modeling, process of problem solving “The process of modeling or model building is a part of the process of problem solving.”

“Steps of problem solving process Process of problem solving as entailing several steps:

a) The starting point is a real problematic situation

b) The first step is to create a real model, making simplifications, idealizations, establishing

conditions and assumptions, but respecting original situation

c) In the second step, the real model is mathematized, to get a mathematical model

d) The third step implies the selection of suitable mathematical methods and working within

mathematics in order to get some mathematical results

e) In the fourth step, these results are interpreted for and translated into the real situation”

J.K.Gilbert, O.de Jong, R.Justi, D.F.Treagust, J.H.van Driel (2002)

“Model as a major learning and teaching tool Models are one of the main products of science, modelling is an element in scientific methodology,

(and) models are a major learning and teaching tool in science education”

“Model of Modeling Framework

1. Decide on purpose - Select source for model and Have experience - Produce mental model

2. Produce mental model - Express in mode(s) of representation

3. Express in mode(s) of representation - Conduct thought experiments

4a. Conduct thought experiments (pass) - Design and perform empirical tests

4b. Conduct thought experiments (fail) - Reject mental model (Modify mental model) and back to

Select source for model (negative result)

5a. Design and perform empirical tests (pass) - Fulfill purpose and Consider scope and limitations of

model and back to Decide on purpose (positive result)

5b. Design and perform empirical tests (fail) - Reject mental model (Modify mental model) and back

to Select source for model (negative result)”

R.Justi, J.K.Gilbert (2002)

“Role of chemistry textbooks in the teaching and learning of models and modelling This role may be discussed from two main angles:

- the way that chemical models are introduced in textbooks

(note: projected curriculum, a learning model)

- and the teaching models that they present

(note: Implemented curriculum-1, a teaching model)”

“Teaching model, Learning model, Analogies A teaching model is a representation produced with the specific aim of helping students to

understand some aspect of content. Assuming the abstract nature of chemical knowledge, they

(learning models) are used very frequently in chemical textbooks mainly in the form of overt

analogies, as drawings and as diagrams (specifically to “the atom”, “chemical bonding” and “chemical

equilibrium”)”

“Some future research directions a) How can teachers' pedagogical content knowledge about models and modelling be improved?

b) The role of models and modelling in the development of chemical knowledge?

c) How can it be made evident to teachers that the introduction of a model-based teaching and learning

approach can be a way to shift the emphasis in chemical education from the transmission of existing

knowledge to a more contemporary perspective in which students will really understand the

nature of chemistry and be able to deal critically with chemistry-related situations?”

viii) Representation (Creativity)

J.K.Gilbert, M.Reiner, M.Nakhleh (2008), J.K.Gilbert (2008)

“Levels of Representation

The “Representation in Science Education” is concerned with challenges that students face in

understanding the three “levels” at which models can be represented - “macro”, “sub-micro”,

“symbolic” - and the relationships between them.”

A.H.Johnstone (1993), D.Gabel (1999)

“Representations as distinct representational levels

a) The models produced by science are expressed in three distinct representational levels

b) The macroscopic level - this consists of what is seen in that which is studied

c) The sub-microscopic level - this consists of representations of those entities that are inferred to

underlie the macroscopic level, giving rise to the properties that it displays (molecules and ions are

used to explain the properties of pure solutions, of radiotherapy)

d) The symbolic level (this consists of any qualitative abstractions used to represent each item at the

sub-microscopic level - chemical equations, mathematical equations)”

J.K.Gilbert (2008), M.Hesse (1966), G.M.Bowen, W.-M.Roth (2005)

“The ontological categorization of representations

a) Two approaches to the ontological categorization of representations are put forward, one based on

the purpose which the representation is intended to serve, the other on the dimensionality -

1D,2D,3D - of the representation.

b) The purpose for which a Model is Produced

- All models are produced by the use of analogy. The target (which is the subject of the model) is

depicted by a partial comparison with a source. The classification is binary: The target and the source

are the same things (they are homomorphs - an aeroplane, a virus), or they are not (they are paramorphs

- paramorphs are used to model processes rather than objects)

c) The dimensionality of the Representation

The idea that modelling involves the progressive reduction of the experienced world to a set of

abstract signs can be set out in terms of dimensions as follows:

- Macro level - Perception of the world-as-experienced - 3D, 2D

- Sub-micro level - Gestures, concrete representations (structural representations) - 3D

- Photographs, virtual representations, diagrams, graphs, data arrays - 2D

- Symbolic level - Symbols and equations - 1D”

E.R.Tufte (1983), J.K.Gilbert (2008), D.Reisberg (1997)

“External and internal representations, Series of internal representations and creativity

a) Visualization is concerned with External Representation, the systematic and focused public

display of information in the form of pictures, diagrams, tables, and the like

b) Visualization is also concerned with Internal Representation, the mental production, storage and

use of an image that often (but not always) is the result of external representation

c) External and internal representations are linked in that their perception uses similar mental

processes

d) Visualization is thus concerned with the formation of an internal representation from an

external representation. An internal representation must be capable of mental use in the making of

predictions about the behaviour of a phenomenon under specific conditions

e) It is entirely possible that once a series of internal representations have been visualized, they

are amalgamated/recombined to form a novel internal representation that is capable of external

representation - this is creativity”

ix) Visualization

J.K.Gilbert, M.Reiner, M.Nakhleh (2008), J.K.Gilbert (2008)

Definition of Visualization

“The making of meaning for any such representation is “visualization”. Visualization is central to

the production of representations of these models (curriculum models, qualitative and quantitative

models and their combinations).”

J.K.Gilbert (2008)

Visualization and Internal Representation

“Visualization is also concerned with Internal Representation, the mental production, storage and

use of an image that often (but not always) is the result of external representation.”

R.Kohavi (2000)

“Essence of Visualization - Data Summarization

As the volume of data collected and stored in databases grows, there is a growing need to provide data

summarization (e.g. through visualization), identify important patterns and trends, and act upon

findings.”

C.Brunk, J.Kelly, R.Kohavi (1997)

“Serviceability of Visualization

One way to aid users in understanding the models is to visualize them.”

D.A.Keim (2002)

“Serviceability of Visualization

a) Information Visualization techniques may help to solve the problem

b) Data Mining will use Information Visualization technology for an improved data analysis”

Application of Visualization

“Application of Visualization is Visual Data Exploration”

“Benefits of Visual Data Exploration - University of Berkeley - every year 1 Exabyte of data (10^18 bytes, 1 Gigabyte = 10^9 bytes)

- Finding the valuable information hidden in them, however, is a difficult task

- The data presented textually - The range of some one hundred data items can be displayed

(a drop in the ocean)

- The basic idea of visual data exploration is to present the data in some visual form, allowing the

human to get insight into the data, draw conclusions, and directly interact with the data (to combine

the flexibility, creativity and general knowledge of the human with the enormous storage capacity and

the computational power of today´s computers)

- The visual data exploration process can be seen as a hypothesis generation process (coming up with

new hypotheses and the verification of the hypotheses can be done via visual data exploration)

- The main advantages of visual data exploration: Visual data exploration can easily deal with

inhomogeneous and noisy data, visual data exploration is intuitive and requires no understanding of

mathematical and statistical algorithms, visual data exploration techniques are indispensable in

conjunction with automatic exploration techniques

- Visual data exploration paradigm: overview first, zoom and filter, details-on-demand”
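The paradigm “overview first, zoom and filter, details-on-demand” can be illustrated with a minimal Python sketch; the synthetic data and the filtering threshold are arbitrary assumptions made only for this illustration:

    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(seed=1)
    x = rng.normal(0.0, 1.0, 5000)
    y = 0.5 * x + rng.normal(0.0, 1.0, 5000)

    # Overview first: display all data points at once
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
    ax1.scatter(x, y, s=2, alpha=0.3)
    ax1.set_title("Overview")

    # Zoom and filter: focus on an interesting subset (arbitrary threshold)
    mask = (x > 1.0) & (y > 1.0)
    ax2.scatter(x[mask], y[mask], s=8)
    ax2.set_title("Zoom and filter")

    # Details-on-demand: inspect individual records of the selected subset
    for xi, yi in list(zip(x[mask], y[mask]))[:3]:
        print(f"detail: x = {xi:.2f}, y = {yi:.2f}")

    plt.show()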

x) Metavisualization

N.R.C. (2006)

“Metavisualization - spatial thinking

The associated visualization which can be called “spatial thinking””

J.K.Gilbert, M.Reiner, M.Nakhleh (2008), J.K.Gilbert (2008)

“Metavisualization - learning from representations

It is of such importance in science and hence in science education that the acquisition of fluency in

visualization is highly desirable and may be called “metavisual capability” or “metavisualization”. A

fluent performance in visualization has been described as requiring metavisualization and involving

the ability to acquire, monitor, integrate, and extend learning from representations. Metavisualization

- learning from representations.”

“Criteria for Metavisualization Four criteria are suggested for the attainment of metavisual status. The person concerned must be able to:

a) demonstrate an understanding of the “convention of representation” for all the modes and sub-

modes of 3D,2D,1D representations (what they can and cannot represent)

b) demonstrate a capacity to translate a given model between the modes and sub-modes in which it can

be depicted

c) demonstrate the capacity to be able to construct a representation within any mode and sub-mode of

dimensionality for a given purpose

d) demonstrate the ability to solve novel problems using a model-based approach”

“Developing the Skills of Metavisualization

level 1 - representation as depiction

level 2 - early symbolic skills

level 3 - syntactic use of formal representations

level 4 - semantic use of formal representations

level 5 - reflective, rhetorical use of representations”

xi) Visual DM techniques

D.A.Keim (2002)

“Classification of Visual Data Mining Techniques (abstraction criterium)

- Techniques such as x-y plots, line plots, and histograms, but they are limited to relatively small and low-

dimensional data sets

- Novel information visualization techniques allowing visualization of multidimensional data without

inherent 2D or 3D semantics.”

D.A.Keim (2002)

“Classification of Visual DM Techniques based on three criteria a), b), c)

a) The data to be visualized (one- or two-dimensional data, multidimensional data, text and

hypertext, hierarchies and graphs, algorithms and software):

Dimensionality of data set = the number of variables of the data set.

Text and hypertext = in the age of the world wide web one important data type is text and hypertext

Hierarchies and graphs = data records often have some relationship to other pieces of information,

i.e. a graph consists of a set of objects, called nodes, and connections between these objects, called edges.

Algorithms and software = the goal of visualization is to support software development by helping to understand

algorithms, e.g. by showing the flow of information in a program, and to enhance the understanding of

written code, e.g. by representing the structure of thousands of source code lines as graphs

b) The visualization techniques (Standard 2D/3D displays, Geometrically-transformed displays,

Icon-based displays, Dense pixel displays, Stacked displays-treemaps, dimensional stacking)

Geometrically-transformed displays = these techniques aim at finding “interesting” transformations of

multidimensional data sets. The class of geometric display techniques includes also the well-known

Parallel Coordinate Technique (PCT). The PCT maps the k-dimensional space onto the two display

dimensions by using k equidistant axes which are parallel to one of the display axes (note: a minimal sketch of the PCT follows this quotation)

Icon-based displays = the idea is to map the attribute values of a multidimensional data item to the

features of an icon

c) The interaction (IT) and distortion (DT) techniques used (interactive projection, interactive

filtering, interactive zooming, interactive distortion, interactive linking and brushing)

Interaction techniques allow the data analyst to directly interact with visualizations and dynamically

change the visualizations according to exploration objectives

Distortion techniques help in the data exploration process by providing means for focusing on details

while preserving an overview of the data

Interactive filtering, Interactive zooming - in exploring large data sets it is important to interactively

partition the data into segments and focus on interesting subsets. This can be done by a direct selection

of the desired subset (BROWSING) or by a specification of properties of the desired subset

(QUERYING).”
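The sketch promised above: a minimal Python illustration of the Parallel Coordinate Technique, using the parallel_coordinates helper of the pandas library on synthetic four-dimensional data; the data and group labels are invented for this example:

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    from pandas.plotting import parallel_coordinates

    # Synthetic k-dimensional data (k = 4) with two labeled groups
    rng = np.random.default_rng(seed=2)
    a = pd.DataFrame(rng.normal(0.0, 1.0, (30, 4)), columns=list("ABCD"))
    a["group"] = "one"
    b = pd.DataFrame(rng.normal(2.0, 1.0, (30, 4)), columns=list("ABCD"))
    b["group"] = "two"
    data = pd.concat([a, b], ignore_index=True)

    # k equidistant parallel axes; each record becomes a polyline across them
    parallel_coordinates(data, class_column="group", colormap="coolwarm")
    plt.show()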

xii) Educational Data Mining

C.Romero, S.Ventura (2006)

Educational Data Mining

a) Currently there is an increasing interest in data mining and educational systems (well-known

learning content management systems, adaptive and intelligent web-based educational systems),

making educational data mining a new growing research community

b) After preprocessing the available data in each case, data mining techniques can be applied in

educational systems – statistics and visualization, clustering, classification and detection, association

rule mining and pattern mining, text mining (note: a minimal clustering sketch follows this list)

c) Data Mining oriented towards students – to show recommendations and to use, interact,

participate and communicate by students within educational systems

d) Data Mining oriented towards educators (and academic responsible-administrators) – to show

discovered knowledge and to design, plan, build and maintain by educators (administrators) within

educational systems

e) Data Mining tools provide mining algorithms, filtering and visualization techniques. Examples

of Data Mining tools:

- Tool name: Mining tool, Authors: Zaïane and Luo (2001), Mining task: Association and patterns

- Tool name: Multistar, Authors: Silva and Vieira (2002), Mining task: Association and classification

- Tool name: Synergo/ColAT, Authors: Avouris et al (2005), Mining task: Visualization

f) Future research lines in educational Data Mining

- Mining tools that facilitate the application of data mining by educators or non-expert users

- Standardization of data and methods (preprocessing, discovering, postprocessing)

- Integration with the e-learning system

- Specific data mining techniques
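As announced above, a minimal clustering sketch in Python with scikit-learn; the per-student features and the number of clusters are hypothetical assumptions, not taken from the quoted study:

    import numpy as np
    from sklearn.cluster import KMeans

    # Hypothetical per-student usage features from an e-learning system:
    # [logins per week, forum posts, average quiz score]
    students = np.array([
        [12, 8, 0.91], [10, 6, 0.84], [11, 7, 0.88],  # frequent, high-scoring
        [2, 0, 0.41],  [3, 1, 0.52],  [1, 0, 0.38],   # rarely active
        [6, 2, 0.65],  [7, 3, 0.70],  [5, 2, 0.61],   # in between
    ])

    # Cluster the usage profiles; k = 3 is an arbitrary choice here
    kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(students)

    for label, row in zip(kmeans.labels_, students):
        print(f"cluster {label}: logins={row[0]:.0f}, posts={row[1]:.0f}, "
              f"score={row[2]:.2f}")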

W.Hämäläinen, T.H.Laine, E.Sutinen (2003)

Data Mining system, educational system

“Data Mining system in educational system: the educational system should be served by Data Mining

system to monitor, intervene in, and counsel the teaching-studying-learning process”

R.E.Scherr, M.Sabella, E.F.Redish (2007)

Curriculum development “Conceptual knowledge is only one aspect of good knowledge structure: how and when knowledge is

activated and used are also important.”

Representation of knowledge structure “The nodes represent knowledge. The lines represent relations between different nodes.”
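A minimal sketch of such a knowledge structure in Python; nodes are pieces of knowledge, edges their relations, and the node names are hypothetical, chosen from this book's own topics:

    from collections import deque

    # Nodes represent knowledge; edges represent relations between nodes
    graph = {
        "probability": ["random variable", "distribution function"],
        "random variable": ["distribution function"],
        "distribution function": ["normal distribution"],
        "normal distribution": [],
    }

    # Breadth-first walk over the relations, starting from one node
    queue, seen = deque(["probability"]), {"probability"}
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            print(f"{node} -> {neighbour}")
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)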

R.Newburgh (2008)

“Linear and lateral (structural) thought process (in physics)

Why do we lose physics students?

a) There is a wide spectrum in thought process. Of the two major types one is linear (i.e. sequential)

and the other lateral (i.e. seeking horizontal connections).

b) Those who developed physics - from Galileo to Newton to Einstein to Heisenberg - were almost

exclusively linear thinkers. The paradigm for linear thought is Euclidean thinking, Euclidean logic

(many physicists chose physics for their career as a result of their exposure to geometry - a

consequence of this is that textbooks are usually written in a Euclidean format). The sense of

discovery is lost. Many students do not recognize that the Euclidean format is not a valid description

of how we do physics. Their way of approaching problems is different but just as valid. Too many

physics teachers refuse to recognize the limitations of this approach (thereby causing would-be

students who do not think in a Euclidean fashion to leave).

c) The format of our textbooks is Euclidean. Newton's laws, Hamilton-Jacobi theory, and

Maxwell's equations are often presented as quasi-axioms in advanced texts. The laboratories become

fixed exercises in which the student must confirm some principle already established. He knows the

answer before he does the experiment.

d) Now I yield to no one in my admiration for Euclid. He has been an inspiration to many of us. We

understand his genius but also see his limitations. Unfortunately there are many who do not follow

his way of thinking.

e) By presenting alternate approaches to students (specifically uses of lateral thinking), false starts

that must be corrected, and lessons that are discoveries not memorization, we can retain more

students in physics.

f) We should remember that lateral thinking is essential to the formation of analogies, an activity

that one cannot describe as Euclidean. Doing science without analogies seems to me an impossibility.”

J.K.Gilbert, O.de Jong, R.Justi, D.F.Treagust, J.H.van Driel (2002), J.H.van Driel (2002)

“Curriculum for Chemical Education

a) The central question concerns the design of curricula for chemical education (note: curricular

process) which make chemistry interesting and relevant for various groups of learners (professional

chemists, general educational purposes - it is useful for all citizens in the future)

b) In recent decades, curricula have been changed, on the one hand for general educational

purposes; this has led to context-based approaches to teaching chemistry; on the other hand, for

professional chemists specific chemistry courses have been developed in the context of vocational

training, aimed at developing the specific chemical competencies that are needed for various

professions.

c) Finally, chemistry is nowadays also presented in informal ways, for instance, in science centres and

through chemistry “shows”.”

U-D.Ehlers, J.M.Pawlowski (2006)

“Quality and Standardization in E-learning - Quality development: Methods and approaches

Methods, models, concepts and approaches for the development, management and assurance of quality

in e-learning are introduced

- E-learning standards

The main goal of e-learning standards is to provide solutions to enable and ensure interoperability and

stability of systems, components and objects.”

R.Kwan, R.Fox, F.T.Chan, P.Tsang (2008), Le Jun (2008)

Knowledge management, Data Mining

“We set up a few objectives and value propositions of the initiative, which was set up to improve teaching

and learning, to enhance the quality of the curriculum, and to extend learning support. We apply Data

Mining tools to discover behavioral characteristics. A few strategies for knowledge management in the

curriculum development in distance education will be discussed.”

Le Jun (2008), I.Nonaka, H.Takeuchi (1995), I.Nonaka, H.Takeuchi (2005)

Types of knowledge, Interaction of types

“Many knowledge management experts agree that there are two general types of knowledge:

a) Tacit knowledge is linked to personal perspective, intuition, emotion, belief, experience and value. It

is intangible, not easy to articulate, and difficult to share with others.

b) Explicit knowledge has a tangible dimension that can be more easily captured, codified and

communicated

Based on I.Nonaka and H.Takeuchi, these two types of knowledge can interact when the

“knowledge conversion” occurs:

- socialization: from tacit to tacit

- externalization: from tacit to explicit

- combination: from explicit to explicit

- internalization: from explicit to tacit”

Le Jun (2008), I.Nonaka, H.Takeuchi (2005)

“Research methods for knowledge management

a) Data Mining techniques

b) Web text mining is the discovery of knowledge from non-structured text (text representation,

feature extraction, text categorization, text clustering, text summarization, semantic analysis, and

information extraction; note: a minimal feature-extraction sketch follows this list)

c) Learning theory

Learning theories are classified into four paradigms: behavioral theory, cognitive theory,

constructive theory, social learning theory.

We emphasize: Learning is a continuous process that is indistinguishable from ongoing work practice

- by discovering the problems, recognizing their types, and by solving problems in routine work and

learning. Learners can continuously refine their cognitive, information, social and learning

competencies.

d) Knowledge management

Knowledge sharing and application of the SECI model (see I.Nonaka, H.Takeuchi)”
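The feature-extraction sketch announced under b): a minimal Python example using the TF-IDF representation from scikit-learn; the text snippets are invented, and TF-IDF is named here as one common text-representation technique, not as the method of the quoted authors:

    from sklearn.feature_extraction.text import TfidfVectorizer

    # Hypothetical snippets of non-structured text from web pages
    docs = [
        "data mining extracts patterns from large data files",
        "probability theory underlies statistics and machine learning",
        "text mining discovers knowledge in non-structured text",
    ]

    # Text representation / feature extraction: TF-IDF term weighting
    vectorizer = TfidfVectorizer()
    X = vectorizer.fit_transform(docs)

    # Each document is now a weighted term vector, ready for text
    # categorization or text clustering
    print(X.shape)  # (3, number_of_distinct_terms)
    print(vectorizer.get_feature_names_out()[:5])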

xiii) Metadata Mining Process

R.Vilalta, C.Giraud-Carrier, P.Brazdil, C.Soares (2004)

Meta-learning – Support Data Mining

“Current data mining tools are characterized by a plethora of algorithms but a lack of guidelines to

select the right method according to the nature of the problem under analysis. Producing such

guidelines is a primary goal of the field of meta-learning; the research objective is to understand the

interaction between the mechanism of learning and the concrete contexts in which that mechanism is

applicable. The field of meta-learning has seen continuous growth in the past years with interesting

new developments in the construction of practical model-selection assistants, task-adaptive learners,

and a solid conceptual framework. In this paper, we give an overview of different techniques

necessary to build meta-learning systems. We begin by describing an idealized meta-learning

architecture comprising a variety of relevant component techniques. We then look at how each

technique has been studied and implemented by previous research. In addition, we show how

metalearning has already been identified as an important component in real-world applications.“

J.Fox (2007)

Definition of Metadata Mining process

“Since metadata is just another type of data, applying data mining to metadata is technically

straightforward. XML - eXtensible Markup Language”
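A minimal Python sketch of the idea that metadata is just another type of data: XML metadata records are parsed and a simple frequency pattern is mined from them; the XML schema and attribute names are invented for this illustration:

    import xml.etree.ElementTree as ET
    from collections import Counter

    # Hypothetical XML metadata records describing stored documents
    xml_doc = """
    <records>
      <record format="pdf" subject="statistics"/>
      <record format="pdf" subject="data mining"/>
      <record format="html" subject="statistics"/>
      <record format="pdf" subject="statistics"/>
    </records>
    """

    root = ET.fromstring(xml_doc)

    # Metadata is just another type of data: mine a frequency pattern from it
    counts = Counter((r.get("format"), r.get("subject")) for r in root)
    for (fmt, subj), n in counts.most_common():
        print(f"{fmt} / {subj}: {n}")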

American Library Association (1999)

“Definition of Metadata

a) As for most people the difference between data and information is merely a philosophical

one of no relevance in practical use, other definitions are:

Metadata is information about data.

Metadata is information about information.

Metadata contains information about that data or other data

b) There are more sophisticated definitions, such as:

Metadata is structured, encoded data that describe characteristics of information-bearing

entities to aid in the identification, discovery, assessment, and management of the described

entities.”

3.3.7.2. Brief Summary

Data Mining – an analytical-synthetic way of extracting hidden and potentially useful information

from large data files (continuum data-information-knowledge, knowledge discovery)

Data Mining Techniques – system functions of the structure of formerly hidden relations and patterns

(e.g. classification, association, clustering, prediction)

Data Mining Tool – a concrete procedure how to reach the intended system functions

Complex Tool – a resolution of the complex problem of relevant science branch

Partial Tool – a resolution of the partial problem of relevant science branch

Result of Data Mining – a result of the data mining tool application

Representation of Data Mining Result – a description of what is expressed

Visualization of Data Mining Result – an optical retrieval of the data mining result

3.3.7.3. Data Mining Cycle, References

i) Quotations from Sources

U.M.Fayyad, G.Piatetsky-Shapiro, P.Smyth (1996)

“Cycle of Data mining

Data Mining can be viewed as a cycle that consists of several steps:

- Identify a problem where analyzing data can provide value

- Collect the data

- Preprocess the data to obtain a clean, mineable table

- Build a model that summarizes patterns of interest in a particular representational form

- Interpret/Evaluate the model

- Deploy the results by incorporating the model into another system for further action.”
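A minimal end-to-end sketch of this cycle in Python with scikit-learn; the data are synthetic and the decision tree is an arbitrary choice of representational form, used only to make the steps concrete:

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import accuracy_score

    # Identify a problem and collect the data: here, synthetic records
    # with two features and a binary label
    rng = np.random.default_rng(seed=3)
    X = rng.normal(0.0, 1.0, (200, 2))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)

    # Preprocess the data: here simply a clean train/test split
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Build a model that summarizes the patterns of interest
    model = DecisionTreeClassifier(max_depth=3).fit(X_train, y_train)

    # Interpret/evaluate the model
    print("accuracy:", accuracy_score(y_test, model.predict(X_test)))

    # Deploy the results: the fitted model can score new records elsewhere
    print("prediction for (1, 1):", model.predict([[1.0, 1.0]])[0])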

J.Luan (2002)

“Steps for Data Mining preparation (algorithm, building, visualization)

a) Investigate the possibility of overlaying Data Mining algorithms directly on a data warehouse

b) Select a solid querying tool to build Data Mining files. These files closely resemble

multidimensional cubes

c) Data Visualization and Validation. This means both examining frequency counts as well as

generating scatter plots, histograms, and other graphics, including clustering models

d) Mine your data”

Le Jun (2008)

“Main processes of Data Mining

- The main processes include data definition, data gathering, preprocessing, data processing and

discovering knowledge or patterns (Data Mining techniques can be implemented rapidly on existing

software and hardware)

- Application of Data Mining tools: To solve the task of prediction, classification, explicit modeling

and clustering. The application can help understand learners' learning behaviors.”

ii) Brief Summary of Data Mining Cycle

- Data Definition, Data Gathering

- Data Preprocessing, Data Processing

- Data Mining Techniques and Data Mining Tools,

- Discovering Knowledge or Patterns,

- Representation and Visualization of Data Mining Results,

- Application.

References

i. Tarábek,P., Záškodný,P. (2009-2010)

Educational and Didactic Communication 2009.

Bratislava, Slovak Republic: Didaktis, www.didaktis.sk, ISBN 978-80-89160-69-3

ii. Záškodný,P., Pavlát,V. (2009-2010a)

Data Mining – A Brief Recherche (in: i.)

iii. Záškodný,P., Novák,V. (2009-2010b)

Data Mining – A Brief Summary (in: i.)

Part 4. STATISTICAL TABLES

Table I.: Values of distribution function of standardized normal distribution

u F(u) u F(u) u F(u) u F(u)

0,00 0,500 00 0,35 0,636 83 0,70 0,758 04 1,05 0,853 14

0,01 0,503 99 0,36 0,640 58 0,71 0,761 15 1,06 0,855 43

0,02 0,507 98 0,37 0,644 31 0,72 0,764 24 1,07 0,857 69

0,03 0,511 97 0,38 0,648 03 0,73 0,767 30 1,08 0,859 93

0,04 0,515 95 0,39 0,651 73 0,74 0,770 35 1,09 0,862 14

0,05 0,519 94 0,40 0,655 42 0,75 0,773 37 1,10 0,864 33

0,06 0,523 92 0,41 0,659 10 0,76 0,776 37 1,11 0,866 50

0,07 0,527 90 0,42 0,662 76 0,77 0,779 35 1,12 0,868 64

0,08 0,531 88 0,43 0,666 40 0,78 0,782 30 1,13 0,870 76

0,09 0,535 86 0,44 0,670 03 0,79 0,785 24 1,14 0,872 86

0,10 0,539 83 0,45 0,673 64 0,80 0,788 14 1,15 0,874 93

0,11 0,543 80 0,46 0,677 24 0,81 0,791 03 1,16 0,876 98

0,12 0,547 76 0,47 0,680 82 0,82 0,793 89 1,17 0,879 00

0,13 0,551 72 0,48 0,684 39 0,83 0,796 73 1,18 0,881 00

0,14 0,555 67 0,49 0,687 93 0,84 0,799 55 1,19 0,882 98

0,15 0,559 62 0,50 0,691 46 0,85 0,802 34 1,20 0,884 93

0,16 0,563 56 0,51 0,694 97 0,86 0,805 11 1,21 0,886 86

0,17 0,567 49 0,52 0,698 47 0,87 0,807 85 1,22 0,888 77

0,18 0,571 42 0,53 0,701 94 0,88 0,810 57 1,23 0,890 65

0,19 0,575 35 0,54 0,705 40 0,89 0,813 27 1,24 0,892 51

0,20 0,579 26 0,55 0,708 84 0,90 0,815 94 1,25 0,894 35

0,21 0,583 17 0,56 0,712 26 0,91 0,818 59 1,26 0,896 17

0,22 0,587 06 0,57 0,715 66 0,92 0,821 21 1,27 0,897 96

0,23 0,590 95 0,58 0,719 04 0,93 0,823 81 1,28 0,899 73

0,24 0,594 83 0,59 0,722 40 0,94 0,826 39 1,29 0,901 47

0,25 0,598 71 0,60 0,725 75 0,95 0,828 94 1,30 0,903 20

0,26 0,602 57 0,61 0,729 07 0,96 0,831 47 1,31 0,904 90

0,27 0,606 42 0,62 0,732 37 0,97 0,833 98 1,32 0,906 58

0,28 0,610 26 0,63 0,735 65 0,98 0,836 46 1,33 0,908 24

0,29 0,614 09 0,64 0,738 91 0,99 0,838 91 1,34 0,909 88

0,30 0,617 91 0,65 0,742 15 1,00 0,841 34 1,35 0,911 49

0,31 0,621 72 0,66 0,745 37 1,01 0,843 75 1,36 0,913 09

0,32 0,625 52 0,67 0,748 57 1,02 0,846 14 1,37 0,914 66

0,33 0,629 30 0,68 0,751 75 1,03 0,848 50 1,38 0,916 21

0,34 0,633 07 0,69 0,754 90 1,04 0,850 83 1,39 0,917 74

1,40 0,919 24 1,85 0,967 84 2,30 0,989 28 3,00 0,998 65

1,41 0,920 73 1,86 0,968 56 2,31 0,989 56 3,02 0,998 74

1,42 0,922 20 1,87 0,969 26 2,32 0,989 83 3,04 0,998 82

1,43 0,923 64 1,88 0,969 95 2,33 0,990 10 3,06 0,998 89

1,44 0,925 07 1,89 0,970 62 2,34 0,990 36 3,08 0,998 97

1,45 0,926 47 1,90 0,971 28 2,35 0,990 61 3,10 0,999 03

1,46 0,927 86 1,91 0,971 93 2,36 0,990 86 3,12 0,999 10

1,47 0,929 22 1,92 0,972 57 2,37 0,991 11 3,14 0,999 16

1,48 0,930 56 1,93 0,973 20 2,38 0,991 34 3,16 0,999 21

1,49 0,931 89 1,94 0,973 81 2,39 0,991 58 3,18 0,999 26

1,50 0,933 19 1,95 0,974 41 2,40 0,991 80 3,20 0,999 31

1,51 0,934 48 1,96 0,975 00 2,41 0,992 02 3,22 0,999 36

1,52 0,935 74 1,97 0,975 58 2,42 0,992 24 3,24 0,999 40

1,53 0,936 99 1,98 0,976 15 2,43 0,992 45 3,26 0,999 44

1,54 0,938 22 1,99 0,976 70 2,44 0,992 66 3,28 0,999 48

1,55 0,939 43 2,00 0,977 25 2,45 0,992 86 3,30 0,999 52

1,56 0,940 62 2,01 0,977 78 2,46 0,993 05 3,32 0,999 55

1,57 0,941 79 2,02 0,978 31 2,47 0,993 24 3,34 0,999 58

1,58 0,942 95 2,03 0,978 82 2,48 0,993 43 3,36 0,999 61

1,59 0,944 08 2,04 0,979 32 2,49 0,993 61 3,38 0,999 64

1,60 0,945 20 2,05 0,979 82 2,50 0,993 79 3,40 0,999 66

1,61 0,946 30 2,06 0,980 30 2,52 0,994 13 3,42 0,999 69

1,62 0,947 38 2,07 0,980 77 2,54 0,994 46 3,44 0,999 71

1,63 0,948 45 2,08 0,981 24 2,56 0,994 77 3,46 0,999 73

1,64 0,949 50 2,09 0,981 69 2,58 0,995 06 3,48 0,999 75

1,65 0,950 53 2,10 0,982 14 2,60 0,995 34 3,50 0,999 77

1,66 0,951 54 2,11 0,982 57 2,62 0,995 60 3,55 0,999 81

1,67 0,952 54 2,12 0,983 00 2,64 0,995 85 3,60 0,999 84

1,68 0,953 52 2,13 0,983 41 2,66 0,996 09 3,65 0,999 87

1,69 0,954 49 2,14 0,983 82 2,68 0,996 32 3,70 0,999 89

1,70 0,955 43 2,15 0,984 22 2,70 0,996 53 3,75 0,999 91

1,71 0,956 37 2,16 0,984 61 2,72 0,996 74 3,80 0,999 93

1,72 0,957 28 2,17 0,985 00 2,74 0,996 93 3,85 0,999 94

1,73 0,958 18 2,18 0,985 37 2,76 0,997 11 3,90 0,999 95

1,74 0,959 07 2,19 0,985 74 2,78 0,997 28 3,95 0,999 96

1,75 0,959 94 2,20 0,986 10 2,80 0,997 44 4,00 0,999 97

1,76 0,960 80 2,21 0,986 45 2,82 0,997 60 4,05 0,999 97

1,77 0,961 64 2,22 0,986 79 2,84 0,997 74 4,10 0,999 98

1,78 0,962 46 2,23 0,987 13 2,86 0,997 88 4,15 0,999 98

1,79 0,963 27 2,24 0,987 45 2,88 0,998 01 4,20 0,999 99

1,80 0,964 07 2,25 0,987 78 2,90 0,998 13 4,25 0,999 99

1,81 0,964 85 2,26 0,988 09 2,92 0,998 25 4,30 0,999 99

1,82 0,965 62 2,27 0,988 40 2,94 0,998 36 4,35 0,999 99

1,83 0,966 38 2,28 0,988 70 2,96 0,998 46 4,40 0,999 99

1,84 0,967 12 2,29 0,988 99 2,98 0,998 56 4,45 1,000 00

Table II.: Critical values of u-test

α 0,20 0,10 0,05 0,025 0,01 0,005

u(α) 0,842 1,282 1,645 1,960 2,326 2,576

Table III.: Critical values of t-test

ν α

0,05 0,025 0,01 0,005

1 6,31 12,71 31,82 63,66

2 2,92 4,30 6,96 9,92

3 2,35 3,18 4,54 5,84

4 2,13 2,78 3,75 4,60

5 2,02 2,57 3,36 4,03

6 1,94 2,45 3,14 3,71

7 1,90 2,36 3,00 3,50

8 1,86 2,31 2,90 3,38

9 1,83 2,26 2,82 3,25

10 1,81 2,23 2,76 3,17

11 1,80 2,20 2,72 3,11

12 1,78 2,18 2,68 3,06

13 1,77 2,16 2,65 3,01

14 1,76 2,14 2,62 2,98

15 1,75 2,13 2,60 2,95

16 1,75 2,12 2,58 2,92

17 1,74 2,11 2,57 2,90

18 1,73 2,10 2,55 2,88

19 1,73 2,09 2,54 2,86

20 1,72 2,09 2,53 2,84

21 1,72 2,08 2,52 2,83

22 1,72 2,07 2,51 2,82

23 1,71 2,07 2,50 2,81

24 1,71 2,06 2,49 2,80

25 1,71 2,06 2,48 2,79

26 1,71 2,06 2,48 2,78

27 1,70 2,05 2,47 2,77

28 1,70 2,05 2,47 2,76

29 1,70 2,04 2,46 2,76

30 1,70 2,04 2,46 2,75

31 1,70 2,04 2,45 2,75

32 1,69 2,03 2,45 2,74

33 1,69 2,03 2,45 2,74

Table IV.: Critical values of χ2-test

ν α

0,995 0,975 0,05 0,025 0,01 0,005

1 0,00 0,00 3,84 5,02 6,63 7,88

2 0,01 0,05 5,99 7,38 9,21 10,60

3 0,07 0,22 7,81 9,35 11,34 12,84

4 0,21 0,48 9,49 11,14 13,28 14,86

5 0,41 0,83 11,07 12,83 15,09 16,75

6 0,68 1,24 12,59 14,45 16,81 18,55

7 0,99 1,69 14,07 16,01 18,48 20,28

8 1,34 2,18 15,51 17,53 20,09 21,95

9 1,73 2,70 16,92 19,02 21,67 23,59

10 2,16 3,25 18,31 20,48 23,21 25,19

11 2,60 3,82 19,68 21,92 24,72 26,76

12 3,07 4,40 21,03 23,34 26,22 28,30

13 3,57 5,01 22,36 24,74 27,69 29,82

14 4,07 5,63 23,68 26,12 29,14 31,32

15 4,60 6,26 25,00 27,49 30,58 32,80

16 5,14 6,91 26,30 28,85 32,00 34,27

17 5,70 7,56 27,59 30,19 33,41 35,72

18 6,26 8,23 28,87 31,53 34,81 37,16

19 6,84 8,91 30,14 32,85 36,19 38,58

20 7,43 9,59 31,41 34,17 37,57 40,00

21 8,03 10,28 32,67 35,46 38,93 41,40

22 8,64 10,98 33,92 36,76 40,29 42,80

23 9,26 11,69 35,17 38,08 41,64 44,18

24 9,89 12,40 36,42 39,36 42,98 45,56

25 10,52 13,12 37,65 40,65 44,31 46,93

30 13,79 16,79 43,77 46,98 50,89 53,67

35 17,19 20,57 49,80 53,20 57,34 60,27

40 20,71 24,43 55,76 59,34 63,69 66,77

45 24,31 28,37 61,66 65,41 69,96 73,17

50 27,99 32,36 67,50 71,42 76,15 79,49

60 35,53 40,48 79,08 83,30 88,38 91,95

70 43,28 48,76 90,53 95,02 100,43 104,21

80 51,17 57,15 101,88 106,63 112,33 116,32

90 59,20 65,65 113,15 118,14 124,12 128,30

100 67,33 74,22 124,34 129,56 135,81 140,17

Table V.: Critical values of F-test for α = 0,05

ν μ

1 2 3 4 5 6 7 8 9 10 20 40 60 120

1 161 200 216 225 230 234 237 239 241 242 248 251 252 253

2 18,5 19,0 19,2 19,2 19,3 19,3 19,4 19,4 19,4 19,4 19,4 19,5 19,5 19,5

3 10,1 9,55 9,28 9,12 9,01 8,94 8,89 8,85 8,81 8,79 8,66 8,59 8,57 8,55

4 7,71 6,94 6,59 6,39 6,26 6,16 6,09 6,04 6,00 5,96 5,80 5,72 5,69 5,66

5 6,61 5,79 5,41 5,19 5,05 4,95 4,88 4,82 4,77 4,74 4,56 4,46 4,43 4,40

6 5,99 5,14 4,76 4,53 4,39 4,28 4,21 4,15 4,10 4,06 3,87 3,77 3,74 3,70

7 5,59 4,74 4,35 4,12 3,97 3,87 3,79 3,73 3,68 3,64 3,44 3,34 3,30 3,27

8 5,32 4,46 4,07 3,84 3,69 3,58 3,50 3,44 3,39 3,35 3,15 3,04 3,01 2,97

9 5,12 4,26 3,86 3,63 3,48 3,37 3,29 3,23 3,18 3,14 2,94 2,83 2,79 2,75

10 4,96 4,10 3,71 3,48 3,33 3,22 3,14 3,07 3,02 2,98 2,77 2,66 2,62 2,58

11 4,84 3,98 3,59 3,36 3,20 3,09 3,01 2,95 2,90 2,85 2,65 2,53 2,49 2,45

12 4,75 3,89 3,49 3,26 3,11 3,00 2,91 2,85 2,80 2,75 2,54 2,43 2,38 2,34

13 4,67 3,81 3,41 3,18 3,03 2,92 2,83 2,77 2,71 2,67 2,46 2,34 2,30 2,25

14 4,60 3,74 3,34 3,11 2,96 2,85 2,76 2,70 2,65 2,60 2,39 2,27 2,22 2,18

15 4,54 3,68 3,29 3,06 2,90 2,79 2,71 2,64 2,59 2,54 2,33 2,20 2,16 2,11

16 4,49 3,63 3,24 3,01 2,85 2,74 2,66 2,59 2,54 2,49 2,28 2,15 2,11 2,06

17 4,45 3,59 3,20 2,96 2,81 2,70 2,61 2,55 2,49 2,45 2,23 2,10 2,06 2,01

18 4,41 3,55 3,16 2,93 2,77 2,66 2,58 2,51 2,46 2,41 2,19 2,06 2,02 1,97

19 4,38 3,52 3,13 2,90 2,74 2,63 2,54 2,48 2,42 2,38 2,16 2,03 1,98 1,93

20 4,35 3,49 3,10 2,87 2,71 2,60 2,51 2,45 2,39 2,35 2,12 1,99 1,95 1,90

21 4,32 3,47 3,07 2,84 2,68 2,57 2,49 2,42 2,37 2,32 2,10 1,96 1,92 1,87

22 4,30 3,44 3,05 2,82 2,66 2,55 2,46 2,40 2,34 2,30 2,07 1,94 1,89 1,84

23 4,28 3,42 3,03 2,80 2,64 2,53 2,44 2,37 2,32 2,27 2,05 1,91 1,86 1,81

24 4,26 3,40 3,01 2,78 2,62 2,51 2,42 2,36 2,30 2,25 2,03 1,89 1,84 1,79

25 4,24 3,39 2,99 2,76 2,60 2,49 2,40 2,34 2,28 2,24 2,01 1,87 1,82 1,77

26 4,23 3,37 2,98 2,74 2,59 2,47 2,39 2,32 2,27 2,22 1,99 1,85 1,80 1,75

27 4,21 3,35 2,96 2,73 2,57 2,46 2,37 2,31 2,25 2,20 1,97 1,84 1,79 1,73

28 4,20 3,34 2,95 2,71 2,56 2,45 2,36 2,29 2,24 2,19 1,96 1,82 1,77 1,71

29 4,18 3,33 2,93 2,70 2,55 2,43 2,35 2,28 2,22 2,18 1,94 1,81 1,75 1,70

30 4,17 3,32 2,92 2,69 2,53 2,42 2,33 2,27 2,21 2,16 1,93 1,79 1,74 1,68

40 4,08 3,23 2,84 2,61 2,45 2,34 2,25 2,18 2,12 2,08 1,84 1,69 1,64 1,58

60 4,00 3,15 2,76 2,53 2,37 2,25 2,17 2,10 2,04 1,99 1,75 1,59 1,53 1,47

120 3,92 3,07 2,68 2,45 2,29 2,17 2,09 2,02 1,96 1,91 1,66 1,50 1,43 1,35

Table VI.: Critical values of F-test for α = 0,01

ν μ

1 2 3 4 5 6 7 8 9 10 20 40 60 120

1 4050 5000 5400 5620 5760 5860 5930 5980 6020 6060 6210 6290 6310 6340

2 98,5 99,0 99,2 99,2 99,3 99,3 99,4 99,4 99,4 99,4 99,4 99,5 99,5 99,5

3 34,1 30,8 29,5 28,7 28,2 27,9 27,7 27,5 27,3 27,2 26,7 26,4 26,3 26,2

4 21,2 18 16,7 16 15,5 15,2 15 14,8 14,7 14,5 14 13,7 13,7 13,6

5 16,3 13,3 12,1 11,4 11 10,7 10,5 10,3 10,2 10,1 9,55 9,2 9,2 9,11

6 13,7 10,9 9,78 9,15 8,75 8,47 8,26 8,1 7,98 7,87 7,4 7,14 7,06 6,97

7 12,2 9,55 8,45 7,85 7,46 7,19 6,99 6,84 6,72 6,62 6,16 5,91 5,82 5,74

8 11,3 8,65 7,59 7,01 6,63 6,37 6,18 6,03 5,91 5,81 5,36 5,12 5,03 4,95

9 10,6 8,02 6,99 6,42 6,06 5,8 5,61 5,47 5,35 5,26 4,81 4,57 4,48 4,4

10 10 7,56 6,55 5,99 5,64 5,39 5,2 5,06 4,94 4,85 4,41 4,17 4,08 4

11 9,65 7,21 6,22 5,67 5,32 5,07 4,89 4,74 4,63 4,54 4,1 3,86 3,78 3,69

12 9,33 6,93 5,95 5,41 5,06 4,82 4,64 4,5 4,39 4,3 3,86 3,62 3,54 3,45

13 9,07 6,7 5,74 5,21 4,86 4,62 4,44 4,3 4,19 4,1 3,66 3,43 3,34 3,25

14 8,86 6,51 5,56 5,04 4,69 4,46 4,28 4,14 4,03 3,94 3,51 3,27 3,18 3,09

15 8,68 6,36 5,42 4,89 4,56 4,32 4,14 4,00 3,89 3,80 3,37 3,13 3,05 2,96

16 8,53 6,23 5,29 4,77 4,44 4,2 4,03 3,89 3,78 3,69 3,26 3,02 2,93 2,84

17 8,40 6,11 5,18 4,67 4,34 4,10 3,93 3,79 3,68 3,59 3,16 2,92 2,83 2,75

18 8,29 6,01 5,09 4,58 4,25 4,01 3,84 3,71 3,6 3,51 3,08 2,84 2,75 2,66

19 8,18 5,93 5,01 4,5 4,17 3,94 3,77 3,63 3,52 3,43 3 2,76 2,67 2,58

20 8,1 5,85 4,94 4,43 4,1 3,87 3,7 3,56 3,46 3,37 2,94 2,69 2,61 2,52

21 8,02 5,78 4,87 4,37 4,04 3,81 3,64 3,51 3,4 3,31 2,88 2,64 2,55 2,46

22 7,95 5,72 4,82 4,31 3,99 3,76 3,59 3,45 3,35 3,26 2,83 2,58 2,5 2,4

23 7,88 5,66 4,76 4,26 3,94 3,71 3,54 3,41 3,3 3,21 2,78 2,54 2,45 2,35

24 7,82 5,61 4,72 4,22 3,9 3,67 3,5 3,36 3,26 3,17 2,74 2,49 2,4 2,31

25 7,77 5,57 4,68 4,18 3,85 3,63 3,46 3,32 3,22 3,13 2,7 2,45 2,36 2,27

26 7,72 5,53 4,64 4,14 3,82 3,59 3,42 3,29 3,18 3,09 2,66 2,42 2,33 2,23

27 7,68 5,49 4,6 4,11 3,78 3,56 3,39 3,26 3,15 3,06 2,63 2,38 2,29 2,2

28 7,64 5,45 4,57 4,07 3,75 3,53 3,36 3,23 3,12 3,03 2,60 2,35 2,26 2,17

29 7,6 5,42 4,54 4,04 3,73 3,5 3,33 3,2 3,09 3 2,57 2,33 2,23 2,14

30 7,56 5,39 4,51 4,02 3,7 3,47 3,3 3,17 3,07 2,98 2,55 2,3 2,21 2,11

40 7,31 5,18 4,31 3,83 3,51 3,29 3,12 2,99 2,89 2,8 2,37 2,11 2,02 1,92

60 7,08 4,98 4,13 3,65 3,34 3,12 2,95 2,82 2,72 2,63 2,2 1,94 1,84 1,73

120 6,85 4,79 3,95 3,48 3,17 2,96 2,79 2,66 2,56 2,47 2,03 1,76 1,66 1,53
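The tabulated values of Tables I.-VI. can also be reproduced numerically. A minimal Python sketch, assuming the scipy library is available (note that the tables use a decimal comma, while Python prints a decimal point):

    from scipy.stats import norm, t, chi2, f

    # Table I: distribution function of the standardized normal distribution
    print(norm.cdf(1.96))             # 0.97500  (row u = 1,96)

    # Table II: critical values of the u-test, u(alpha) = F^(-1)(1 - alpha)
    print(norm.ppf(1 - 0.05))         # 1.645

    # Table III: critical values of the t-test with nu degrees of freedom
    print(t.ppf(1 - 0.05, df=10))     # 1.81

    # Table IV: critical values of the chi-square test
    print(chi2.ppf(1 - 0.05, df=10))  # 18.31

    # Tables V and VI: critical values of the F-test
    # (mu = numerator df in the columns, nu = denominator df in the rows)
    print(f.ppf(1 - 0.05, dfn=10, dfd=10))  # 2.98
    print(f.ppf(1 - 0.01, dfn=10, dfd=10))  # 4.85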

CV of Author

Assoc.Prof. RNDr. Přemysl Záškodný,CSc.

Assoc.Prof. RNDr. Přemysl Záškodný, CSc., graduated from the Faculty of Mathematics and

Physics of Charles University, obtained the CSc. in physics education, and is docent (assoc.

professor) of physics education. As a university teacher, he is affiliated with the University of South

Bohemia in České Budějovice and with the University of Finance and Administration in Prague.

He is active in scientific work in cooperation with the International Institute of

Informatics and Systemics in U.S.A., and the Curriculum Studies Research Group in

Slovakia. In his scientific work, aimed at science and statistics education, he deals with

structuring and modelling physics and statistics knowledge and systems of knowledge, and

also with data mining and the curricular process.

In addition to support from his faculty and university, the projects granted to the

author by the Avenira Foundation in Switzerland and the University of Finance and

Administration in the Czech Republic have brought a considerable contribution to the results

achieved.

The conception of the last books “Survey of Principles of Theoretical Physics”,

“Curricular Process in Physics”, “Fundaments of Statistics” (with co-authors), and “From

Financial Derivatives to Option Hedging” (with co-author) and last monographs “Educational

and Didactic Communication 2008, 2009, 2010, 2011” are based on the scientific work of the

author. Some of the further works published by the author are quoted in the bibliography.

Assoc.Prof. RNDr. Přemysl Záškodný, CSc. is active as general chair of international

e-conferences OEDM-SERM 2011 and OEDM-SERM 2012 (Optimization, Education and

Data Mining in Science, Engineering and Risk Management).

Bibliography of Author

i) The monographs

Tarabek,P., Zaskodny,P.: Analytical-Synthetic Modelling of Cognitive Structures (volume 1:

New structural methods and their application).

Educational Publisher Didaktis Ltd., Bratislava, London 2001

Tarabek,P., Zaskodny,P.: Analytical-Synthetic Modelling of Cognitive Structures (volume 2:

Didactic communication and educational sciences).

Educational Publisher Didaktis Ltd., Bratislava, New York 2002

Tarabek,P., Zaskodny,P.: Structure, Formation and Design of Textbook (volume 1:

Theoretical basis).

Educational Publisher Didaktis Ltd., Bratislava, London 2003

Tarabek,P., Zaskodny,P.: Structure, Formation and Design of Textbook (volume 2: Theory

and practice).

Educational Publisher Didaktis Ltd., Bratislava, London 2004

Tarabek,P., Zaskodny,P.: Modern Science and Textbook Creation (volume 1: Projection of

scientific systems).

Educational Publisher Didaktis Ltd., Bratislava, Frankfurt a.M. 2005

Tarabek,P., Zaskodny,P.: Modern Science and Textbook Creation (volume 2: Modern

tendencies in textbook creation).

Educational Publisher Didaktis Ltd., Bratislava, Frankfurt a.M. 2006

Tarabek,P., Zaskodny,P.: Educational and Didactic Communication 2007

Educational Publisher Didaktis Ltd., Bratislava, Frankfurt a.M. 2008

Tarabek,P., Zaskodny,P.: Educational and Didactic Communication 2008

Educational Publisher Didaktis Ltd., Bratislava, Frankfurt a.M. 2009

Tarabek,P., Zaskodny,P.: Educational and Didactic Communication 2009

Educational Publisher Didaktis Ltd., Bratislava, 2010

Tarabek,P., Zaskodny,P.: Educational and Didactic Communication 2010

Educational Publisher Didaktis Ltd., Bratislava, 2011

Tarabek,P., Zaskodny,P.: Educational and Didactic Communication 2011

Educational Publisher Didaktis Ltd., Bratislava, 2012

ii) The books

Pavlát,V., Záškodný,P. et al.: Capital Market, first edition, 2003

Záškodný,P.: Survey of Principles of Theoretical Physics (with Application to Radiology)

(in Czech). Didaktis, Bratislava, Slovak Republic 2005

Záškodný,P.: Survey of Principles of Theoretical Physics (with Application to Radiology) (in

English). Avenira, Switzerland, Algoritmus, Ostrava, Czech Republic 2006

Pavlát,V., Záškodný,P. et al.: Capital Market, second edition, 2006

Záškodný,P.: Curricular Process in Physics (in Czech). Avenira, Switzerland, Algoritmus,

Ostrava, Czech Republic 2009

Záškodný,P. et al.: Fundaments of Statistics (in Czech). Curriculum, Czech Republic 2011

Pavlát,V., Záškodný,P.: From Financial Derivatives to Option Hedging. Curriculum, Czech

Republic 2012

iii) The textbooks

Záškodný,P.: Theoretical Mechanics in Examples I (in Czech). PF, Ostrava, Czech

Republic 1984

Záškodný,P., Sklenák,L.: Theoretical Mechanics in Examples II (in Czech). PF, Ostrava,

Czech Republic 1986

Záškodný,P. et al.: Principles of Economical Statistics (in Czech). VSFS, Praha, Czech

Republic 2004

Budínský,P., Záškodný,P.: Financial and Investment Mathematics. VSFS, Prague 2004

Záškodný,P. et al.: Principles of Health Statistics (in Czech). JU, České Budějovice, Czech

Republic 2005

Kozlovská,D., Skalická,Z., Záškodný,P.: Introduction to Practicum from Radiological

Physics. JCU, České Budějovice, Czech Republic, 2007

Záškodný,P., Pavlát,V., Budík,J.: Financial Derivatives and Their Evaluation. Prague,

University of Finance and Administration, 2009

iv) The papers

Approximately 100 papers

Global References

Dalgaard,P. (2008). Introductory Statistics with R. Second Edition. New York, USA:

Springer. (In English)

ISBN-13: 978-038779-053-4

Field,A. (2009). Discovering Statistics Using SPSS. Third Edition. London, New Delhi,

Singapore: SAGE. (In English)

ISBN-13: 978-184787-907-3

Jorion,P. (2007). Financial Risk Manager. Handbook. Hoboken, New Jersey, USA:

Wiley&Sons. (In English)

ISBN 978-0-470-12630-1

Matloff,N. (2011). The Art of R Programming: A Tour of Statistical Software Design. USA: No

Starch Press. (In English)

ISBN-13: 978-159327-384-2

Pavlát,V., Záškodný,P. (2012). From Financial Derivatives to Option Hedging. Prague, Czech

Republic: Curriculum. (In Czech)

ISBN 978-80-904948-3-1

Tarábek,P., Záškodný,P. (2011). Data Mining Tools in Statistics Education. In:

Educational&Didactic Communication 2010. Bratislava, Slovakia: Didaktis. (In English)

ISBN 978-80-89160-78-5

Záškodný,P. et al (2007). Principles of Economical Statistics. Prague, Czech Republic:

Eupress. (Partly in English)

ISBN 80-86754-00-6