14. Analysis of Data – Basic Concepts

National Central University (中央大學), Department of Information Management
C. K. Farn (范錚強)

mailto: ckfarn@mgt.ncu.edu.tw

Updated 2013.05

Descriptive Statistics

Describes the characteristics of the sample. The main question to answer:

How does your research sample differ from the population?
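One way to see how a sample differs from its population is to compute the same descriptive statistics for both. A minimal Python sketch, assuming made-up age data (the numbers and the names `population`, `sample`, and `describe` are illustrative, not from the slides):

```python
import statistics

# Hypothetical data: ages in a small population and in a sample drawn from it.
population = [22, 25, 29, 31, 34, 38, 41, 45, 52, 60]
sample = [25, 31, 34, 41, 52]

def describe(values):
    """Return the size, mean, median, and standard deviation of a list."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # n - 1 in the denominator
    }

pop_stats = describe(population)
sample_stats = describe(sample)

# Gap between the sample mean and the population mean: a first check on
# how well the sample represents the population.
mean_gap = sample_stats["mean"] - pop_stats["mean"]
print(pop_stats)
print(sample_stats)
print(f"mean gap: {mean_gap:.2f}")
```

In practice the population values usually come from census or registry figures rather than raw data, so only the comparison step would apply.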

Exploratory Data Analysis

Exploratory

Confirmatory

Some Exploratory Data Displays (Ch. 16)

Scatter-plot

Bar Chart, Pie chart

Frequency table

Histogram

Cross Tabulation
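Two of the displays listed above, the frequency table and the cross tabulation, can be sketched with Python's standard library. The survey records below are hypothetical and serve only to show the counting:

```python
from collections import Counter

# Hypothetical survey records: (gender, preferred_brand).
records = [
    ("F", "A"), ("M", "B"), ("F", "A"), ("M", "A"),
    ("F", "B"), ("M", "B"), ("F", "A"), ("M", "B"),
]

# Frequency table: how often each brand occurs.
brand_freq = Counter(brand for _, brand in records)

# Cross tabulation: how often each (gender, brand) combination occurs.
crosstab = Counter(records)

for brand, count in sorted(brand_freq.items()):
    print(f"{brand}: {count} ({count / len(records):.0%})")

print("     A  B")
for gender in ("F", "M"):
    row = [crosstab[(gender, brand)] for brand in ("A", "B")]
    print(f"{gender}    {row[0]}  {row[1]}")
```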

Statistical Procedures

Descriptive Statistics

Inferential Statistics

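Confirmatory Studies

Hypothesis Testing

Research Hypothesis

Null Hypothesis H0

Refutation: starting from the research hypothesis you want to verify, set up its opposite as a "straw man", H0. (The original research hypothesis is the alternative hypothesis in statistics.)

Use statistics to reject H0, so that the alternative hypothesis gains support.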

Types of Hypotheses

Null

H0: μ = 50 mpg

H0: μ ≤ 50 mpg

H0: μ ≥ 50 mpg

Alternate

HA: μ ≠ 50 mpg

HA: μ > 50 mpg

HA: μ < 50 mpg
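The hypothesis pairs above can be tested with a one-sample Z test. The sketch below is a minimal illustration, assuming the population standard deviation is known; the sample figures (25 cars, mean 52.5 mpg, sigma 5 mpg) and the function name `z_test` are hypothetical, and only the 50 mpg benchmark comes from the slide:

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample_mean, mu0, sigma, n):
    """One-sample Z test with a known population standard deviation.

    Returns the z statistic and the p-values for the two-tailed
    alternative (mu != mu0) and both one-tailed alternatives.
    """
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    upper = 1 - NormalDist().cdf(z)   # P(Z >= z), for HA: mu > mu0
    lower = NormalDist().cdf(z)       # P(Z <= z), for HA: mu < mu0
    two_tailed = 2 * min(upper, lower)
    return z, two_tailed, upper, lower

# Hypothetical sample: 25 cars averaging 52.5 mpg, sigma assumed to be 5 mpg.
z, p_two, p_upper, p_lower = z_test(sample_mean=52.5, mu0=50, sigma=5, n=25)

alpha = 0.05
print(f"z = {z:.2f}")
print("HA: mu != 50 ->", "reject H0" if p_two < alpha else "cannot reject H0")
print("HA: mu > 50  ->", "reject H0" if p_upper < alpha else "cannot reject H0")
print("HA: mu < 50  ->", "reject H0" if p_lower < alpha else "cannot reject H0")
```

The same z statistic gives different conclusions depending on which alternative is stated, which is why the direction of the hypothesis has to be fixed before looking at the data.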

Two-Tailed Test of Significance

One-Tailed Test of Significance

Decision Rule

Take no corrective action if the analysis shows that one cannot reject the null hypothesis.

Statistical Decisions

Tests of Significance

Nonparametric statistics: "weak" statistics

Parametric statistics: "strong" statistics

Assumptions for Using Parametric Tests

Independent observations

Normal distribution

Equal variances

Interval or ratio scales

Advantages of Nonparametric Tests

Easy to understand and use

Usable with nominal data

Appropriate for ordinal data

Appropriate for non-normal population distributions
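As an example of how simple a nonparametric test can be, the sign test for paired ordinal data needs only a count of positive and negative differences and the binomial distribution. A minimal sketch with hypothetical before/after ratings (the data and the name `sign_test_p_value` are illustrative):

```python
from math import comb

def sign_test_p_value(differences):
    """Two-sided exact sign test for paired differences (zeros are dropped)."""
    nonzero = [d for d in differences if d != 0]
    n = len(nonzero)
    positives = sum(1 for d in nonzero if d > 0)
    k = min(positives, n - positives)
    # P(X <= k) under Binomial(n, 0.5), doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical ratings from 10 respondents on an ordinal scale.
before = [3, 2, 4, 3, 2, 3, 1, 2, 3, 2]
after = [4, 3, 5, 4, 3, 4, 2, 3, 2, 3]
diffs = [a - b for a, b in zip(after, before)]

p = sign_test_p_value(diffs)
print(f"p = {p:.4f}")
```

No assumption about normality or equal variances is involved; only the direction of each change is used.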

How to Select a Test

How many samples are involved?

If two or more samples are involved, are the individual cases independent or related?

Is the measurement scale nominal, ordinal, interval, or ratio?

Recommended Statistical Techniques

Measurement scale: Nominal
  One-sample case: Binomial; χ² one-sample test
  Two-sample tests, related samples: McNemar
  Two-sample tests, independent samples: Fisher exact test; χ² two-samples test
  k-sample tests, related samples: Cochran Q
  k-sample tests, independent samples: χ² for k samples

Measurement scale: Ordinal
  One-sample case: Kolmogorov-Smirnov one-sample test; Runs test
  Two-sample tests, related samples: Sign test; Wilcoxon matched-pairs test
  Two-sample tests, independent samples: Median test; Mann-Whitney U; Kolmogorov-Smirnov; Wald-Wolfowitz
  k-sample tests, related samples: Friedman two-way ANOVA
  k-sample tests, independent samples: Median extension; Kruskal-Wallis one-way ANOVA

Measurement scale: Interval and Ratio
  One-sample case: t-test; Z test
  Two-sample tests, related samples: t-test for paired samples
  Two-sample tests, independent samples: t-test; Z test
  k-sample tests, related samples: Repeated-measures ANOVA
  k-sample tests, independent samples: One-way ANOVA; n-way ANOVA
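The three selection questions (how many samples, related or independent, which measurement scale) amount to a lookup in the table above. A minimal sketch of that lookup follows; the dictionary keys and the function name `select_tests` are illustrative choices, while the test names are taken from the table:

```python
# Recommended techniques keyed by (measurement scale, sample design).
RECOMMENDED_TESTS = {
    ("nominal", "one-sample"): ["Binomial", "Chi-square one-sample test"],
    ("nominal", "two-related"): ["McNemar"],
    ("nominal", "two-independent"): ["Fisher exact test", "Chi-square two-samples test"],
    ("nominal", "k-related"): ["Cochran Q"],
    ("nominal", "k-independent"): ["Chi-square for k samples"],
    ("ordinal", "one-sample"): ["Kolmogorov-Smirnov one-sample test", "Runs test"],
    ("ordinal", "two-related"): ["Sign test", "Wilcoxon matched-pairs test"],
    ("ordinal", "two-independent"): ["Median test", "Mann-Whitney U",
                                     "Kolmogorov-Smirnov", "Wald-Wolfowitz"],
    ("ordinal", "k-related"): ["Friedman two-way ANOVA"],
    ("ordinal", "k-independent"): ["Median extension", "Kruskal-Wallis one-way ANOVA"],
    ("interval/ratio", "one-sample"): ["t-test", "Z test"],
    ("interval/ratio", "two-related"): ["t-test for paired samples"],
    ("interval/ratio", "two-independent"): ["t-test", "Z test"],
    ("interval/ratio", "k-related"): ["Repeated-measures ANOVA"],
    ("interval/ratio", "k-independent"): ["One-way ANOVA", "n-way ANOVA"],
}

def select_tests(scale, n_samples, related=False):
    """Answer the three selection questions and return candidate tests."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    scale = "interval/ratio" if scale in ("interval", "ratio") else scale
    if n_samples == 1:
        design = "one-sample"
    elif n_samples == 2:
        design = "two-related" if related else "two-independent"
    else:
        design = "k-related" if related else "k-independent"
    return RECOMMENDED_TESTS[(scale, design)]

print(select_tests("ordinal", 2, related=True))
print(select_tests("ratio", 3))
```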

Measures of Association: Interval/Ratio

Pearson correlation coefficient: for continuous, linearly related variables

Correlation ratio (eta): for nonlinear data or relating a main effect to a continuous dependent variable

Biserial: one continuous and one dichotomous variable with an underlying normal distribution

Partial correlation: three variables; relating two with the third's effect taken out

Multiple correlation: three variables; relating one variable with two others

Bivariate linear regression: predicting one variable from another's scores
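"Relating two variables with the third's effect taken out" has a compact closed form, the first-order partial correlation. The sketch below uses the standard textbook formula; the three input correlations are hypothetical:

```python
from math import sqrt

def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of X and Y with Z held constant."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical pairwise correlations among three variables.
r_xy_z = partial_correlation(r_xy=0.60, r_xz=0.50, r_yz=0.40)
print(f"r_xy.z = {r_xy_z:.3f}")

# If X and Y correlate only through Z, the partial correlation drops to zero.
spurious = partial_correlation(r_xy=0.20, r_xz=0.50, r_yz=0.40)
print(f"r_xy.z = {spurious:.3f}")
```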

Pearson’s Product Moment Correlation r

Is there a relationship between X and Y?

What is the magnitude of the relationship?

What is the direction of the relationship?
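The three questions above are all answered by one number: the sign of r gives the direction and its absolute value the magnitude. A minimal sketch computing r from its definition, with hypothetical paired observations:

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
direction = "positive" if r > 0 else "negative" if r < 0 else "none"
print(f"r = {r:.3f}, direction: {direction}, magnitude: {abs(r):.3f}")
```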

Scatterplots of Relationships

Scatterplots

Interpretation of Correlations

X causes Y

Y causes X

X and Y are activated by one or more other variables

X and Y influence each other reciprocally

Artifact Correlations

Interpretation of Coefficients

A coefficient is not remarkable simply because it is statistically significant! It must be practically meaningful.
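A small numeric illustration of this warning: with a large enough sample, even a very weak correlation is statistically significant. The sketch below uses the usual statistic t = r·sqrt(n − 2)/sqrt(1 − r²) and, as a simplifying assumption, a normal approximation for the p-value (reasonable only for large n); the values of r and n are hypothetical:

```python
from math import sqrt
from statistics import NormalDist

def r_significance(r, n):
    """Approximate two-tailed p-value for H0: rho = 0 (large-n approximation)."""
    t = r * sqrt(n - 2) / sqrt(1 - r ** 2)
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, p

r, n = 0.03, 10_000
t, p = r_significance(r, n)
print(f"t = {t:.2f}, p = {p:.4f}")
print(f"variance explained: {r ** 2:.2%}")
```

Here the correlation passes a conventional significance threshold while explaining less than a tenth of a percent of the variance, which is the gap between statistical and practical significance.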

Coefficient of Determination: r²

Total proportion of variance in Y explained by X

Desired r²: 80% or more
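The phrase "proportion of variance in Y explained by X" can be made concrete by fitting a least-squares line and comparing residual variation with total variation. A minimal sketch with hypothetical data (the function names are illustrative):

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for predicting y from x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def r_squared(xs, ys):
    """Proportion of the variance in y explained by the fitted line."""
    intercept, slope = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_resid = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return 1 - ss_resid / ss_total

# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

intercept, slope = fit_line(x, y)
r2 = r_squared(x, y)
print(f"y = {intercept:.2f} + {slope:.2f} x, r^2 = {r2:.2f}")
print("meets the 80% target" if r2 >= 0.80 else "below the 80% target")
```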

Classifying Multivariate Techniques

Interdependency

Dependency

Multivariate Techniques

Right Questions. Trusted Insight.

When using sophisticated techniques you want to rely on the knowledge of the researcher.

Harris Interactive promises you can trust their experienced research professionals to draw the right conclusions from the collected data.

Dependency Techniques

Multiple Regression

Discriminant Analysis

MANOVA

Structural Equation Modeling (SEM)

Conjoint Analysis

Uses of Multiple Regression

Develop a self-weighting estimating equation to predict values for a DV

Control for confounding variables

Test and explain causal theories
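The first use, an estimating equation that predicts a DV from several predictors, can be sketched with ordinary least squares. The version below solves the normal equations (X'X)b = X'y in plain Python; the data are hypothetical and generated without noise from y = 1 + 2·x1 + 3·x2 so the recovered coefficients are easy to check. A statistics package would normally be used instead.

```python
def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def multiple_regression(rows, ys):
    """Ordinary least squares; returns [b0, b1, ..., bk] with b0 the intercept."""
    X = [[1.0, *row] for row in rows]  # prepend the intercept column
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(X, ys)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical predictors (x1, x2) and a DV built from y = 1 + 2*x1 + 3*x2.
predictors = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
dv = [1 + 2 * x1 + 3 * x2 for x1, x2 in predictors]

b0, b1, b2 = multiple_regression(predictors, dv)
print(f"DV = {b0:.2f} + {b1:.2f} x1 + {b2:.2f} x2")

# Predict the DV for a new case.
new_case = (6, 2)
prediction = b0 + b1 * new_case[0] + b2 * new_case[1]
print(f"predicted DV: {prediction:.2f}")
```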

Generalized Regression Equation

Multiple Regression Example

Selection Methods

Forward

Backward

Stepwise

Evaluating and Dealing with Multicollinearity

Choose one of the variables and delete the other

Create a new variable that is a composite of the others

Collinearity Statistics

VIF: 1.000, 2.289, 2.289, 2.748, 3.025, 3.067
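The variance inflation factor is conventionally defined as VIF = 1 / (1 − R²), where R² comes from regressing one predictor on all the other predictors. A minimal sketch of that formula and its inverse, applied to the VIF values listed above (the function names are illustrative):

```python
def vif(r_squared_j):
    """Variance inflation factor for predictor j.

    r_squared_j is the R^2 from regressing predictor j on the other predictors.
    """
    return 1.0 / (1.0 - r_squared_j)

def implied_r_squared(vif_value):
    """Invert the VIF formula to recover the R^2 among the predictors."""
    return 1.0 - 1.0 / vif_value

# VIF values reported in the collinearity statistics.
for value in (1.000, 2.289, 2.748, 3.025, 3.067):
    shared = implied_r_squared(value)
    print(f"VIF {value:.3f} -> R^2 with the other predictors = {shared:.3f}")
```

A VIF of 1.000 means the predictor shares no variance with the others; larger values mean more of its variance is already carried by the remaining predictors.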

Discriminant Analysis

A.

                                     Predicted Success
Actual Group       Number of Cases   0              1
Unsuccessful (0)   15                13 (86.70%)    2 (13.30%)
Successful (1)     15                3 (20.00%)     12 (80.00%)

Note: Percent of "grouped" cases correctly classified: 83.33%

B.

Variable   Unstandardized   Standardized
X1         .36084           .65927
X2         2.61192          .57958
X3         .53028           .97505
Constant   12.89685
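The 83.33% figure in the note follows directly from the classification table: correctly classified cases divided by all cases. A short sketch of that arithmetic, using the counts from panel A (the variable names are illustrative):

```python
# Classification table: outer key = actual group, inner key = predicted group
# (0 = unsuccessful, 1 = successful).
confusion = {
    0: {0: 13, 1: 2},
    1: {0: 3, 1: 12},
}

total = sum(sum(row.values()) for row in confusion.values())
correct = sum(confusion[group][group] for group in confusion)
hit_rate = correct / total

# Share of each actual group that was classified correctly.
per_group = {
    group: confusion[group][group] / sum(confusion[group].values())
    for group in confusion
}

print(f"overall hit rate: {hit_rate:.2%}")
for group, rate in per_group.items():
    print(f"group {group}: {rate:.2%}")
```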

MANOVA

MANOVA Output

Bartlett's Test

MANOVA Homogeneity-of-Variance Tests

Multivariate Tests of Significance

Univariate Tests of Significance

Structural Equation Modeling (SEM)

Model Specification

Estimation

Evaluation of Fit

Respecification of the Model

Interpretation and Communication

Interdependency Techniques

Factor Analysis

Cluster Analysis

Multidimensional Scaling

Factor Analysis

Factor Matrices

                      A. Unrotated Factors       B. Rotated Factors
Variable              I        II       h²       I        II
A                     0.70     -.40     0.65     0.79     0.15
B                     0.60     -.50     0.61     0.75     0.03
C                     0.60     -.35     0.48     0.68     0.10
D                     0.50     0.50     0.50     0.06     0.70
E                     0.60     0.50     0.61     0.13     0.77
F                     0.60     0.60     0.72     0.07     0.85
Eigenvalue            2.18     1.39
Percent of variance   36.3     23.2
Cumulative percent    36.3     59.5
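The summary rows of the factor matrix can be recomputed from the unrotated loadings: a communality (h²) is the sum of squared loadings across a variable's row, an eigenvalue is the sum of squared loadings down a factor's column, and the percent of variance is the eigenvalue divided by the number of variables. A short sketch using the panel A loadings (the variable names in the code are illustrative):

```python
# Unrotated loadings from panel A: variable -> (factor I, factor II).
loadings = {
    "A": (0.70, -0.40),
    "B": (0.60, -0.50),
    "C": (0.60, -0.35),
    "D": (0.50, 0.50),
    "E": (0.60, 0.50),
    "F": (0.60, 0.60),
}

# Communality h^2: sum of squared loadings across factors for one variable.
communality = {var: sum(l ** 2 for l in row) for var, row in loadings.items()}

# Eigenvalue: sum of squared loadings down one factor's column.
n_factors = 2
eigenvalues = [sum(row[f] ** 2 for row in loadings.values()) for f in range(n_factors)]

# Percent of variance: eigenvalue relative to the number of variables.
percent = [100 * ev / len(loadings) for ev in eigenvalues]

for var, h2 in communality.items():
    print(f"{var}: h2 = {h2:.2f}")
for f, (ev, pct) in enumerate(zip(eigenvalues, percent), start=1):
    print(f"factor {f}: eigenvalue = {ev:.2f}, percent of variance = {pct:.1f}")
```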

Orthogonal Factor Rotations

Factor Matrix, Metro U MBA Study

Variable              Course                  Factor 1   Factor 2   Factor 3   Communality
V1                    Financial Accounting    0.41       0.71       0.23       0.73
V2                    Managerial Accounting   0.01       0.53       -.16       0.31
V3                    Finance                 0.89       -.17       0.37       0.95
V4                    Marketing               -.60       0.21       0.30       0.49
V5                    Human Behavior          0.02       -.24       -.22       0.11
V6                    Organization Design     -.43       -.09       -.36       0.32
V7                    Production              -.11       -.58       -.03       0.35
V8                    Probability             0.25       0.25       -.31       0.22
V9                    Statistical Inference   -.43       0.43       0.50       0.62
V10                   Quantitative Analysis   0.25       0.04       0.35       0.19
Eigenvalue                                    1.83       1.52       0.95
Percent of variance                           18.30      15.20      9.50
Cumulative percent                            18.30      33.50      43.00

Varimax Rotated Factor Matrix

Variable   Course                  Factor 1   Factor 2   Factor 3
V1         Financial Accounting    0.84       0.16       -.06
V2         Managerial Accounting   0.53       -.10       0.14
V3         Finance                 -.01       0.90       -.37
V4         Marketing               -.11       -.24       0.65
V5         Human Behavior          -.13       -.14       -.27
V6         Organization Design     -.08       -.56       -.02
V7         Production              -.54       -.11       -.22
V8         Probability             0.41       -.02       -.24
V9         Statistical Inference   0.07       0.02       0.79
V10        Quantitative Analysis   -.02       0.42       0.09

Cluster Analysis

Select sample to cluster

Define variables

Compute similarities

Select mutually exclusive clusters

Compare and validate clusters
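The middle steps (compute similarities, then form mutually exclusive clusters) can be sketched with a small agglomerative procedure. The version below is one possible choice, not the method used in the slides: it uses Euclidean distance and single linkage, and the five two-variable cases are hypothetical.

```python
from math import dist

def single_linkage(points, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters.

    The distance between two clusters is the smallest Euclidean distance
    between any pair of their members (single linkage).
    Returns a list of clusters, each a sorted list of point indices.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return clusters

# Hypothetical cases measured on two variables.
cases = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.1), (5.2, 4.9), (9.0, 1.0)]

three = single_linkage(cases, 3)
two = single_linkage(cases, 2)
print("3 clusters:", three)
print("2 clusters:", two)
```

Asking for fewer clusters only merges existing ones, so the solutions are nested; comparing them across cluster counts is one way to approach the final validation step.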

Cluster Membership

                                                        Number of Clusters
Film                       Country   Genre      Case    5    4    3    2
Cyrano de Bergerac         France    DramaCom   1       1    1    1    1
Il y a des Jours           France    DramaCom   4       1    1    1    1
Nikita                     France    DramaCom   5       1    1    1    1
Les Noces de Papier        Canada    DramaCom   6       1    1    1    1
Leningrad Cowboys . . .    Finland   Comedy     19      2    2    2    2
Storia de Ragazzi . . .    Italy     Comedy     13      2    2    2    2
Conte de Printemps         France    Comedy     2       2    2    2    2
Tatie Danielle             France    Comedy     3       2    2    2    2
Crimes and Misdem . . .    USA       DramaCom   7       3    3    3    2
Driving Miss Daisy         USA       DramaCom   9       3    3    3    2
La Voce della Luna         Italy     DramaCom   12      3    3    3    2
Che Hora E                 Italy     DramaCom   14      3    3    3    2
Attache-Moi                Spain     DramaCom   15      3    3    3    2
White Hunter Black . . .   USA       PsyDrama   10      4    4    3    2
Music Box                  USA       PsyDrama   8       4    4    3    2
Dead Poets Society         USA       PsyDrama   11      4    4    3    2
La Fille aux All . . .     Finland   PsyDrama   18      4    4    3    2
Alexandrie, Encore . . .   Egypt     DramaCom   16      5    3    3    2
Dreams                     Japan     DramaCom   17      5    3    3    2

Dendrogram