Analysis of Data – Basic Concepts

Page 1: Analysis of Data –  Basic Concepts

14. Analysis of Data – Basic Concepts

National Central University, Department of Information Management

范錚強

mailto: [email protected]

Updated 2013.05

Page 2: Analysis of Data –  Basic Concepts

Descriptive Statistics

Describe the characteristics of the sample. The main point to present:

How does your research sample differ from the population?

Page 3: Analysis of Data –  Basic Concepts

Exploratory Data Analysis

Exploratory

Confirmatory

Page 4: Analysis of Data –  Basic Concepts

Some exploratory data displays (Ch. 16)

Scatter-plot

Bar Chart, Pie chart

Frequency table

Histogram

Cross Tabulation
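
Two of the displays listed above, the frequency table and the cross tabulation, can be sketched with the Python standard library. The survey records, variable names, and helper functions below are hypothetical and only illustrate the idea.

```python
from collections import Counter

# Hypothetical survey records: (gender, preferred_brand)
records = [
    ("F", "A"), ("M", "B"), ("F", "A"), ("M", "A"),
    ("F", "B"), ("M", "B"), ("F", "A"), ("M", "B"),
]

def frequency_table(values):
    """Return {value: (count, percent)} for a list of categorical values."""
    counts = Counter(values)
    total = len(values)
    return {v: (c, 100.0 * c / total) for v, c in counts.items()}

def cross_tab(pairs):
    """Return a nested table[row][col] = count for (row, col) pairs."""
    table = {}
    for row, col in pairs:
        table.setdefault(row, Counter())[col] += 1
    return table

brand_freq = frequency_table([brand for _, brand in records])
gender_by_brand = cross_tab(records)

for brand, (count, pct) in sorted(brand_freq.items()):
    print(f"{brand}: {count} ({pct:.1f}%)")
for gender in sorted(gender_by_brand):
    print(gender, dict(gender_by_brand[gender]))
```

The frequency table summarizes one variable at a time; the cross tabulation counts the joint occurrences of two categorical variables.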

Page 5: Analysis of Data –  Basic Concepts

Statistical Procedures

Descriptive Statistics

Inferential Statistics

Page 6: Analysis of Data –  Basic Concepts

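Confirmatory Studies

Hypothesis Testing

Research Hypothesis

Null Hypothesis H0

Refutation: starting from the research hypothesis you want to verify, set up its opposite as a “straw man”, H0. (The original research hypothesis is the alternative hypothesis in statistics.)

Use statistics to refute H0, so that the alternative hypothesis gains support.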

Page 7: Analysis of Data –  Basic Concepts

Types of Hypotheses

Null

H0: μ = 50 mpg

H0: μ ≤ 50 mpg

H0: μ ≥ 50 mpg

Alternate

HA: μ ≠ 50 mpg

HA: μ > 50 mpg

HA: μ < 50 mpg
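
As a rough illustration of how the null and alternate hypotheses above map to two-tailed and one-tailed tests, here is a minimal sketch of a one-sample Z test. The sample figures (25 cars, mean 52.5 mpg, population standard deviation 5 mpg), the 0.05 significance level, and the function name `z_test` are assumptions made for the example, not values from the slides.

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample_mean, mu0, sigma, n, tail="two"):
    """One-sample Z test; returns (z, p_value).

    tail="two"   tests HA: mu != mu0
    tail="upper" tests HA: mu > mu0
    tail="lower" tests HA: mu < mu0
    """
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    cdf = NormalDist().cdf(z)
    if tail == "two":
        p = 2 * min(cdf, 1 - cdf)
    elif tail == "upper":
        p = 1 - cdf
    elif tail == "lower":
        p = cdf
    else:
        raise ValueError("tail must be 'two', 'upper', or 'lower'")
    return z, p

# Hypothetical numbers: 25 cars average 52.5 mpg; sigma is assumed known (5 mpg).
alpha = 0.05
z_two, p_two = z_test(52.5, 50, 5, 25, tail="two")
z_up, p_up = z_test(52.5, 50, 5, 25, tail="upper")
print(f"z = {z_two:.2f}, two-tailed p = {p_two:.4f}, reject H0: {p_two < alpha}")
print(f"z = {z_up:.2f}, one-tailed p = {p_up:.4f}, reject H0: {p_up < alpha}")
```

For the same data, the one-tailed p-value is half the two-tailed one when the sample mean falls on the side named by HA.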

Page 8: Analysis of Data –  Basic Concepts

Two-Tailed Test of Significance

Page 9: Analysis of Data –  Basic Concepts

One-Tailed Test of Significance

Page 10: Analysis of Data –  Basic Concepts

Decision Rule

Take no corrective action if the analysis shows that one cannot reject the null hypothesis.

Page 11: Analysis of Data –  Basic Concepts

Statistical Decisions

Page 12: Analysis of Data –  Basic Concepts

Tests of Significance

Nonparametric statistics: “weak” statistics

Parametric statistics: “strong” statistics

Page 13: Analysis of Data –  Basic Concepts

Assumptions for Using Parametric Tests

Independent observations

Normal distribution

Equal variances

Interval or ratio scales

Page 14: Analysis of Data –  Basic Concepts

Advantages of Nonparametric Tests

Easy to understand and use

Usable with nominal data

Appropriate for ordinal data

Appropriate for non-normal population distributions

Page 15: Analysis of Data –  Basic Concepts

How to Select a Test

How many samples are involved?

If two or more samples are involved, are the individual cases independent or related?

Is the measurement scale nominal, ordinal, interval, or ratio?

Page 16: Analysis of Data –  Basic Concepts

Recommended Statistical Techniques

Measurement scale: Nominal

One-sample case: Binomial; χ² one-sample test

Two-sample tests, related samples: McNemar

Two-sample tests, independent samples: Fisher exact test; χ² two-samples test

k-sample tests, related samples: Cochran Q

k-sample tests, independent samples: χ² for k samples

Measurement scale: Ordinal

One-sample case: Kolmogorov-Smirnov one-sample test; Runs test

Two-sample tests, related samples: Sign test; Wilcoxon matched-pairs test

Two-sample tests, independent samples: Median test; Mann-Whitney U; Kolmogorov-Smirnov; Wald-Wolfowitz

k-sample tests, related samples: Friedman two-way ANOVA

k-sample tests, independent samples: Median extension; Kruskal-Wallis one-way ANOVA

Measurement scale: Interval and Ratio

One-sample case: t-test; Z test

Two-sample tests, related samples: t-test for paired samples

Two-sample tests, independent samples: t-test; Z test

k-sample tests, related samples: Repeated-measures ANOVA

k-sample tests, independent samples: One-way ANOVA; n-way ANOVA
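
The three selection questions (how many samples, related or independent, which measurement scale) amount to a lookup into the table of recommended techniques. A minimal sketch of that lookup follows; the dictionary keys and the function name `select_tests` are illustrative choices, while the test names are transcribed from the table.

```python
# Lookup keyed by (measurement scale, sample design).
RECOMMENDED_TESTS = {
    ("nominal", "one_sample"): ["Binomial", "chi-square one-sample test"],
    ("nominal", "two_related"): ["McNemar"],
    ("nominal", "two_independent"): ["Fisher exact test", "chi-square two-samples test"],
    ("nominal", "k_related"): ["Cochran Q"],
    ("nominal", "k_independent"): ["chi-square for k samples"],
    ("ordinal", "one_sample"): ["Kolmogorov-Smirnov one-sample test", "Runs test"],
    ("ordinal", "two_related"): ["Sign test", "Wilcoxon matched-pairs test"],
    ("ordinal", "two_independent"): ["Median test", "Mann-Whitney U",
                                     "Kolmogorov-Smirnov", "Wald-Wolfowitz"],
    ("ordinal", "k_related"): ["Friedman two-way ANOVA"],
    ("ordinal", "k_independent"): ["Median extension", "Kruskal-Wallis one-way ANOVA"],
    ("interval_ratio", "one_sample"): ["t-test", "Z test"],
    ("interval_ratio", "two_related"): ["t-test for paired samples"],
    ("interval_ratio", "two_independent"): ["t-test", "Z test"],
    ("interval_ratio", "k_related"): ["Repeated-measures ANOVA"],
    ("interval_ratio", "k_independent"): ["One-way ANOVA", "n-way ANOVA"],
}

def select_tests(scale, n_samples, related=False):
    """Map the three selection questions onto one cell of the table."""
    scale = scale.lower()
    if scale in ("interval", "ratio"):
        scale = "interval_ratio"
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    if n_samples == 1:
        design = "one_sample"
    else:
        prefix = "two" if n_samples == 2 else "k"
        design = f"{prefix}_{'related' if related else 'independent'}"
    return RECOMMENDED_TESTS[(scale, design)]

print(select_tests("ordinal", 2, related=True))
print(select_tests("ratio", 3))
```

The lookup only narrows the candidates; choosing among the tests in a cell still depends on the assumptions each test makes.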

Page 17: Analysis of Data –  Basic Concepts

Measures of Association: Interval/Ratio

Pearson correlation coefficient: For continuous linearly related variables

Correlation ratio (eta): For nonlinear data or relating a main effect to a continuous dependent variable

Biserial: One continuous and one dichotomous variable with an underlying normal distribution

Partial correlation: Three variables; relating two with the third’s effect taken out

Multiple correlation: Three variables; relating one variable with two others

Bivariate linear regression: Predicting one variable from another’s scores

Page 18: Analysis of Data –  Basic Concepts

Pearson’s Product Moment Correlation r

Is there a relationship between X and Y?

What is the magnitude of the relationship?

What is the direction of the relationship?
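
The three questions above can be answered from a single number. The sketch below computes Pearson's r from its definition; the data (hours studied and exam scores) and the helper name `pearson_r` are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product moment correlation of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable has zero variance")
    return sxy / sqrt(sxx * syy)

# Hypothetical data: hours studied and exam scores.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70]

r = pearson_r(hours, scores)
direction = "positive" if r > 0 else "negative" if r < 0 else "none"
print(f"r = {r:.3f}, magnitude = {abs(r):.3f}, direction = {direction}")
```

The sign of r gives the direction, its absolute value gives the magnitude, and a value near zero suggests no linear relationship.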

Page 19: Analysis of Data –  Basic Concepts

Scatterplots of Relationships

Page 20: Analysis of Data –  Basic Concepts

Scatterplots

Page 21: Analysis of Data –  Basic Concepts

Interpretation of Correlations

X causes Y

Y causes X

X and Y are activated by one or more other variables

X and Y influence each other reciprocally

Page 22: Analysis of Data –  Basic Concepts

Artifact Correlations

Page 23: Analysis of Data –  Basic Concepts

Interpretation of Coefficients

A coefficient is not remarkable simply because it is statistically significant! It must be practically meaningful.

Page 24: Analysis of Data –  Basic Concepts

Coefficient of Determination: r²

Total proportion of variance in Y explained by X

Desired r²: 80% or more
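
To make "proportion of variance in Y explained by X" concrete, the sketch below fits a least-squares line and computes r² as 1 minus the ratio of residual to total sum of squares. The data and function names are hypothetical; the 0.80 threshold is the one stated on the slide.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - slope * mean_x, slope

def r_squared(xs, ys):
    """Share of the variance in ys explained by the fitted line."""
    intercept, slope = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1 - sse / sst

# Hypothetical data.
x = [1, 2, 3, 4, 5]
y = [52, 55, 61, 64, 70]

r2 = r_squared(x, y)
print(f"r squared = {r2:.3f}, meets the 80% target: {r2 >= 0.80}")
```

For a bivariate regression this value equals the square of Pearson's r between X and Y.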

Page 25: Analysis of Data –  Basic Concepts

Classifying Multivariate Techniques

Dependency

Interdependency

Page 26: Analysis of Data –  Basic Concepts

Multivariate Techniques

Page 27: Analysis of Data –  Basic Concepts

Multivariate Techniques

Page 28: Analysis of Data –  Basic Concepts

Multivariate Techniques

Page 29: Analysis of Data –  Basic Concepts

Right Questions. Trusted Insight.

When using sophisticated techniques you want to rely on the knowledge of the researcher.

Harris Interactive promises you can trust their experienced research professionals to draw the right conclusions from the collected data.

Page 30: Analysis of Data –  Basic Concepts

Dependency Techniques

Multiple Regression

Discriminant Analysis

MANOVA

Structural Equation Modeling (SEM)

Conjoint Analysis

Page 31: Analysis of Data –  Basic Concepts

Uses of Multiple Regression

Develop self-weighting estimating equation to predict values for a DV

Control for confounding variables

Test and explain causal theories

Page 32: Analysis of Data –  Basic Concepts

Generalized Regression Equation

Page 33: Analysis of Data –  Basic Concepts

Multiple Regression Example

Page 34: Analysis of Data –  Basic Concepts

Selection Methods

Forward

Backward

Stepwise

Page 35: Analysis of Data –  Basic Concepts

Evaluating and Dealing with Multicollinearity

Choose one of the variables and delete the other

Create a new variable that is a composite of the others

Collinearity Statistics

VIF: 1.000, 2.289, 2.289, 2.748, 3.025, 3.067
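
For reading the VIF values above: the variance inflation factor of a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on the other predictors. The sketch below covers only the two-predictor case, where that R² is the squared correlation between the two predictors. The data and function names are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)

def implied_r2(vif):
    """R squared among predictors implied by a reported VIF."""
    return 1.0 - 1.0 / vif

# Hypothetical predictor values.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
vif = vif_two_predictors(x1, x2)
print(f"VIF = {vif:.3f}")

# Reading a reported value back: VIF = 2.289 implies R squared of about 0.563.
print(f"implied R squared = {implied_r2(2.289):.3f}")
```

A VIF of 1.000 means the predictor is uncorrelated with the others; larger values indicate that more of its variance is shared with them.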

Page 36: Analysis of Data –  Basic Concepts

Discriminant Analysis

A.

Actual Group      Number of Cases    Predicted Success: 0    Predicted Success: 1
Unsuccessful 0    15                 13 (86.70%)             2 (13.30%)
Successful 1      15                 3 (20.00%)              12 (80.00%)

Note: Percent of “grouped” cases correctly classified: 83.33%

B.

Variable    Unstandardized    Standardized
X1          .36084            .65927
X2          2.61192           .57958
X3          .53028            .97505
Constant    12.89685
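
The percentages in the classification table can be recomputed from its four counts. The sketch below does so; the dictionary layout and the function name `hit_rates` are illustrative, while the counts are taken from the slide.

```python
# Rows: actual group; columns: predicted group (0 = unsuccessful, 1 = successful).
confusion = {
    0: {0: 13, 1: 2},
    1: {0: 3, 1: 12},
}

def hit_rates(matrix):
    """Return (overall share correct, {group: share correct within group})."""
    total = sum(sum(row.values()) for row in matrix.values())
    correct = sum(matrix[g][g] for g in matrix)
    per_group = {g: matrix[g][g] / sum(matrix[g].values()) for g in matrix}
    return correct / total, per_group

overall, per_group = hit_rates(confusion)
print(f"overall correctly classified: {overall:.2%}")
for group, rate in per_group.items():
    print(f"group {group}: {rate:.2%}")
```

The overall figure, 25 of 30 cases, matches the 83.33% reported in the note.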

Page 37: Analysis of Data –  Basic Concepts

MANOVA

Page 38: Analysis of Data –  Basic Concepts

MANOVA Output

Page 39: Analysis of Data –  Basic Concepts

Bartlett’s Test

Page 40: Analysis of Data –  Basic Concepts

MANOVA Homogeneity-of-Variance Tests

Page 41: Analysis of Data –  Basic Concepts

Multivariate Tests of Significance

Page 42: Analysis of Data –  Basic Concepts

Univariate Tests of Significance

Page 43: Analysis of Data –  Basic Concepts

Structural Equation Modeling (SEM)

Model Specification

Estimation

Evaluation of Fit

Respecification of the Model

Interpretation and Communication

Page 44: Analysis of Data –  Basic Concepts

Structural Equation Modeling (SEM)

Page 45: Analysis of Data –  Basic Concepts

Interdependency Techniques

Factor Analysis

Cluster Analysis

Multidimensional Scaling

Page 46: Analysis of Data –  Basic Concepts

Factor Analysis

Page 47: Analysis of Data –  Basic Concepts

Factor Matrices

A: Unrotated Factors. B: Rotated Factors.

Variable               A: I    A: II    h²      B: I    B: II
A                      0.70    -.40     0.65    0.79    0.15
B                      0.60    -.50     0.61    0.75    0.03
C                      0.60    -.35     0.48    0.68    0.10
D                      0.50    0.50     0.50    0.06    0.70
E                      0.60    0.50     0.61    0.13    0.77
F                      0.60    0.60     0.72    0.07    0.85
Eigenvalue             2.18    1.39
Percent of variance    36.3    23.2
Cumulative percent     36.3    59.5
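
The summary rows of the factor matrix follow from the unrotated loadings: a variable's communality (h²) is the sum of its squared loadings across factors, and a factor's eigenvalue is the sum of squared loadings down its column. The sketch below recomputes them from the loadings on the slide; the variable names in the code are illustrative.

```python
# Unrotated loadings (factor I, factor II) for variables A to F.
unrotated = {
    "A": (0.70, -0.40),
    "B": (0.60, -0.50),
    "C": (0.60, -0.35),
    "D": (0.50, 0.50),
    "E": (0.60, 0.50),
    "F": (0.60, 0.60),
}
n_factors = 2

# Communality: variance of each variable accounted for by the factors.
communality = {
    name: sum(loading ** 2 for loading in loadings)
    for name, loadings in unrotated.items()
}

# Eigenvalue: variance accounted for by each factor across all variables.
eigenvalues = [
    sum(loadings[j] ** 2 for loadings in unrotated.values())
    for j in range(n_factors)
]

# Each standardized variable contributes one unit of variance.
percent_of_variance = [100 * e / len(unrotated) for e in eigenvalues]

for name, h2 in communality.items():
    print(f"{name}: h2 = {h2:.2f}")
for j, (e, pct) in enumerate(zip(eigenvalues, percent_of_variance), start=1):
    print(f"factor {j}: eigenvalue = {e:.2f}, percent of variance = {pct:.1f}")
```

The recomputed values agree with the table: eigenvalues 2.18 and 1.39, accounting for 36.3% and 23.2% of the variance.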

Page 48: Analysis of Data –  Basic Concepts

Orthogonal Factor Rotations

Page 49: Analysis of Data –  Basic Concepts

Factor Matrix, Metro U MBA Study

Variable    Course                   Factor 1    Factor 2    Factor 3    Communality
V1          Financial Accounting     0.41        0.71        0.23        0.73
V2          Managerial Accounting    0.01        0.53        -.16        0.31
V3          Finance                  0.89        -.17        0.37        0.95
V4          Marketing                -.60        0.21        0.30        0.49
V5          Human Behavior           0.02        -.24        -.22        0.11
V6          Organization Design      -.43        -.09        -.36        0.32
V7          Production               -.11        -.58        -.03        0.35
V8          Probability              0.25        0.25        -.31        0.22
V9          Statistical Inference    -.43        0.43        0.50        0.62
V10         Quantitative Analysis    0.25        0.04        0.35        0.19

Eigenvalue                           1.83        1.52        0.95
Percent of variance                  18.30       15.20       9.50
Cumulative percent                   18.30       33.50       43.00

Page 50: Analysis of Data –  Basic Concepts

Varimax Rotated Factor Matrix

Variable    Course                   Factor 1    Factor 2    Factor 3
V1          Financial Accounting     0.84        0.16        -.06
V2          Managerial Accounting    0.53        -.10        0.14
V3          Finance                  -.01        0.90        -.37
V4          Marketing                -.11        -.24        0.65
V5          Human Behavior           -.13        -.14        -.27
V6          Organization Design      -.08        -.56        -.02
V7          Production               -.54        -.11        -.22
V8          Probability              0.41        -.02        -.24
V9          Statistical Inference    0.07        0.02        0.79
V10         Quantitative Analysis    -.02        0.42        0.09

Page 51: Analysis of Data –  Basic Concepts

Cluster Analysis

Select sample to cluster

Define variables

Compute similarities

Select mutually exclusive clusters

Compare and validate cluster
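
The middle steps above (define variables, compute similarities, select mutually exclusive clusters) can be sketched as a small agglomerative procedure. The slides do not say which similarity measure or linkage rule is used, so the Euclidean distance and single linkage below are assumptions, and the five cases are made-up data.

```python
from math import dist

def agglomerate(points, n_clusters):
    """Merge the two closest clusters (single linkage) until n_clusters remain.

    Returns a list of clusters, each a sorted list of case indices.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])
        del clusters[b]
    return [sorted(c) for c in clusters]

# Hypothetical cases, each described by two variables.
cases = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.1), (5.2, 4.9), (9.0, 1.0)]

for k in (3, 2):
    print(k, "clusters:", agglomerate(cases, k))
```

Running the procedure for several values of k gives a membership listing per solution, which is the kind of comparison used when validating a cluster solution.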

Page 52: Analysis of Data –  Basic Concepts

Cluster Analysis

Page 53: Analysis of Data –  Basic Concepts

Cluster Membership

Columns 5, 4, 3, and 2 give each film's cluster membership when that number of clusters is selected.

Film                        Country    Genre       Case    5    4    3    2
Cyrano de Bergerac          France     DramaCom    1       1    1    1    1
Il y a des Jours            France     DramaCom    4       1    1    1    1
Nikita                      France     DramaCom    5       1    1    1    1
Les Noces de Papier         Canada     DramaCom    6       1    1    1    1
Leningrad Cowboys . . .     Finland    Comedy      19      2    2    2    2
Storia de Ragazzi . . .     Italy      Comedy      13      2    2    2    2
Conte de Printemps          France     Comedy      2       2    2    2    2
Tatie Danielle              France     Comedy      3       2    2    2    2
Crimes and Misdem . . .     USA        DramaCom    7       3    3    3    2
Driving Miss Daisy          USA        DramaCom    9       3    3    3    2
La Voce della Luna          Italy      DramaCom    12      3    3    3    2
Che Hora E                  Italy      DramaCom    14      3    3    3    2
Attache-Moi                 Spain      DramaCom    15      3    3    3    2
White Hunter Black . . .    USA        PsyDrama    10      4    4    3    2
Music Box                   USA        PsyDrama    8       4    4    3    2
Dead Poets Society          USA        PsyDrama    11      4    4    3    2
La Fille aux All . . .      Finland    PsyDrama    18      4    4    3    2
Alexandrie, Encore . . .    Egypt      DramaCom    16      5    3    3    2
Dreams                      Japan      DramaCom    17      5    3    3    2

Page 54: Analysis of Data –  Basic Concepts

Dendrogram