Factor Analysis (FA) (1)

DESCRIPTION

MBA topic. Factor analysis is used in research to condense a set of variables.

  • 5/23/2018 Factor Analysis (FA) (1)

    1/61

    Factor Analysis (FA)

    Factor analysis is an interdependence technique whose primary

    purpose is to define the underlying structure among the

    variables in the analysis.

    The purpose of FA is to condense the information contained in

    a number of original variables into a smaller set of new

    composite dimensions or variates (factors) with a minimum loss

    of information.

    Factor analysis decision process

    Stage 1: Objectives of factor analysis

    Key issues:

    Specifying the unit of analysis
    - R factor analysis: analyzes the correlation matrix of the variables to summarize their characteristics.
    - Q factor analysis: analyzes the correlation matrix of the individual respondents, based on their characteristics. It condenses a large number of people into distinctly different groups.

    Achieving data summarization vs. data reduction
    - Data summarization: the definition of structure. The set of variables is viewed at various levels of generalization, ranging from the most detailed level to the more generalized level. The linear composite of variables is called a variate or factor.
    - Data reduction: creating an entirely new set of variables that completely replaces the original values with empirical values (factor scores).
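The difference between R-type and Q-type analysis comes down to which correlation matrix is factored. A minimal sketch, assuming a hypothetical data matrix `X` with respondents in rows and variables in columns (the data here are random and purely illustrative):

```python
import numpy as np

# Hypothetical data: 6 respondents (rows) measured on 4 variables (columns).
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))

# R-type: correlations among the variables (columns) -> 4 x 4 matrix.
R_type = np.corrcoef(X, rowvar=False)

# Q-type: correlations among the respondents (rows) -> 6 x 6 matrix.
Q_type = np.corrcoef(X, rowvar=True)

print(R_type.shape, Q_type.shape)
```

The same extraction step can then be run on either matrix; only the unit of analysis changes.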

    Variable selection

    The researcher should always consider the conceptual underpinnings of

    the variables and use judgment as to the appropriateness of the variables

    for factor analysis.

    Using factor analysis with other multivariate techniques

    Factor scores as representatives of variables will be used for further

    analysis.

    Stage 2: Designing a factor analysis

    It involves three basic decisions:

    - Correlations among variables or respondents (Q type vs. R type).
    - Variable selection and measurement issues: factor analysis is mostly performed on metric variables. For nonmetric variables, define dummy variables (0-1) and include them in the set of metric variables.
    - Sample size: the sample must have more observations than variables, and the minimum sample size should be 50 observations. At least 5, and preferably at least 10, observations per variable are desirable.

    Stage 3: Assumptions in factor analysis

    The assumptions are more conceptual than statistical.

    Conceptual issues: 1) appropriate selection of variables; 2) homogeneous sample.

    Statistical issues: ensuring the variables are sufficiently intercorrelated to produce representative factors.

    Measures of intercorrelation:
    - Visual inspection: if a substantial number of correlations in the correlation matrix are greater than .30, factor analysis is appropriate.
    - Partial correlations: if partial correlations are high, indicating no underlying factors, then factor analysis is inappropriate.
    - Bartlett test of sphericity: a test for the presence of correlations among the variables. A statistically significant Bartlett's test of sphericity (sig. < .05) indicates that sufficient correlations exist among the variables to proceed.
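Bartlett's test can be sketched in a few lines. The example below assumes the usual chi-square approximation, with statistic -(n - 1 - (2p + 5)/6) ln|R| on p(p - 1)/2 degrees of freedom; the function name `bartlett_sphericity` and the simulated data are illustrative, not part of the original slides.

```python
import numpy as np
from scipy import stats

def bartlett_sphericity(X):
    """Return (chi-square, df, p-value) for H0: the correlation matrix is an identity matrix."""
    n, p = X.shape
    R = np.corrcoef(X, rowvar=False)
    _, logdet = np.linalg.slogdet(R)           # log of the determinant of R
    chi2 = -(n - 1 - (2 * p + 5) / 6) * logdet
    df = p * (p - 1) // 2
    p_value = stats.chi2.sf(chi2, df)           # upper-tail probability
    return chi2, df, p_value

# Hypothetical data: 200 cases, 5 variables that share one common component.
rng = np.random.default_rng(1)
shared = rng.normal(size=(200, 1))
X = shared + 0.8 * rng.normal(size=(200, 5))

chi2, df, p_value = bartlett_sphericity(X)
print(df, p_value < 0.05)
```

A p-value below .05 rejects the hypothesis that the variables are uncorrelated, which is the condition for proceeding.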

    Measure of sampling adequacy (MSA): this index ranges from 0 to 1, reaching 1 when each variable is perfectly predicted without error by the other variables. The measure can be interpreted with the following guidelines.

    Kaiser-Meyer-Olkin Measure of Sampling Adequacy

    MSA value     Interpretation
    in the .90s   marvelous
    in the .80s   meritorious
    in the .70s   middling
    in the .60s   mediocre
    in the .50s   miserable
    below .50     unacceptable

    MSA values must exceed .50 for both the overall test and each individual variable.

    Variables with values less than .50 should be omitted from the factor analysis.
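The MSA compares the size of the observed correlations with the size of the partial correlations. A minimal sketch, assuming the standard Kaiser-Meyer-Olkin formula (sum of squared correlations divided by the sum of squared correlations plus squared partial correlations); the function name `kmo` and the example matrix are hypothetical.

```python
import numpy as np

def kmo(R):
    """Return (overall MSA, per-variable MSA) for a correlation matrix R."""
    inv = np.linalg.inv(R)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)     # partial correlations given all other variables
    np.fill_diagonal(partial, 0.0)
    r = R.copy()
    np.fill_diagonal(r, 0.0)            # keep off-diagonal correlations only
    r2, q2 = r ** 2, partial ** 2
    per_variable = r2.sum(axis=0) / (r2.sum(axis=0) + q2.sum(axis=0))
    overall = r2.sum() / (r2.sum() + q2.sum())
    return overall, per_variable

# Hypothetical example: four variables that all correlate .6 with each other.
R = np.full((4, 4), 0.6)
np.fill_diagonal(R, 1.0)

overall, per_variable = kmo(R)
print(round(overall, 3))
```

Per-variable values below .50 would flag candidates for removal, following the guideline above.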

    Stage 4: Deriving factors and assessing overall fit

    Apply factor analysis to identify the underlying structure of

    relationships.

    Two decisions are important: selecting the factor extraction method and deciding the number of factors to extract.

    Selecting the factor extraction method
    - Common factor analysis
    - Principal component analysis

    Concept of partitioning the variance of a variable
    - Common variance: variance in a variable that is shared with all other variables in the analysis. It is based on the variable's correlations with the other variables. A variable's communality estimates its common variance.
    - Specific variance (also known as unique variance): variance that cannot be explained by the correlations with the other variables but is associated uniquely with a single variable.
    - Error variance: variance due to unreliability in the data-gathering process, measurement error, or a random component in the measured phenomenon.

    Component factor analysis- AKA principal components analysis.

    Considers the total variance and derives factors that contain

    small proportions of unique variance and in some instances

    error variance.

    Common factor analysis- Considers only the common or shared

    variance, assuming that both the unique and error variance are

    not of interest in defining the structure of the variables.

    [Figure: variance carried into the factor matrix by diagonal value. Unity: total variance. Communality: common variance. Legend: variance extracted, variance excluded.]
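The two extraction methods differ in what sits on the diagonal of the correlation matrix being factored. A minimal sketch, assuming squared multiple correlations (SMC) as the initial communality estimate, which is one common choice and not necessarily the one used in the slides; the matrix `R` is hypothetical.

```python
import numpy as np

# Hypothetical correlation matrix for three variables.
R = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])

# Component analysis: unities stay on the diagonal, so total variance is analyzed.
R_component = R.copy()

# Common factor analysis: communality estimates replace the unities,
# so only shared variance is analyzed. SMC is used here as the estimate.
smc = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
R_common = R.copy()
np.fill_diagonal(R_common, smc)

print(np.trace(R_component), np.trace(R_common))
```

The trace of the reduced matrix is smaller than the number of variables because the unique and error variance have been left out.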

    Suitability of factor extraction method
    - Component factor analysis is appropriate when data reduction is the primary concern.
    - Common factor analysis is appropriate when the primary objective is to identify the latent dimensions or constructs represented in the original variables.

    Criteria for the number of factors to extract

    Latent root criterion
    - It applies to both extraction methods.
    - This criterion assumes that any individual factor should account for the variance of at least a single variable if it is to be retained for interpretation.
    - In component analysis each variable contributes a value of 1 to the total of the latent roots (eigenvalues).
    - So factors with eigenvalues greater than 1 are considered significant and are retained.

    Eigenvalue: the amount of variance accounted for by a factor. It is the column sum of squared loadings for that factor.
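The latent root criterion and the definition of an eigenvalue as a column sum of squared loadings can both be checked numerically. A minimal sketch for component analysis, using a hypothetical correlation matrix in which two pairs of variables correlate .7 within a pair and .1 across pairs:

```python
import numpy as np

# Hypothetical correlation matrix: variables 1-2 and 3-4 form two clusters.
R = np.array([[1.0, 0.7, 0.1, 0.1],
              [0.7, 1.0, 0.1, 0.1],
              [0.1, 0.1, 1.0, 0.7],
              [0.1, 0.1, 0.7, 1.0]])

eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1]               # largest eigenvalue first
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Component loadings: eigenvectors scaled by the square root of the eigenvalue.
loadings = eigenvectors * np.sqrt(eigenvalues)

# Column sums of squared loadings reproduce the eigenvalues.
column_ss = (loadings ** 2).sum(axis=0)

# Latent root criterion: keep components with eigenvalue greater than 1.
n_retained = int((eigenvalues > 1.0).sum())
print(np.round(eigenvalues, 2), n_retained)
```

The eigenvalues sum to the number of variables, since each variable contributes a value of 1 to the total.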

    Scree test criterion

    This test plots the latent roots against the number of factors in their order of extraction. The shape of the resulting curve is used to evaluate the cutoff point.

    The point at which the curve begins to straighten out is considered to indicate the maximum number of factors to extract.

    As a general rule, the scree test results in at least one

    and sometimes two or three more factors being

    considered for inclusion than does the latent root

    criterion.

    [Figure: scree plot of eigenvalues after factor extraction. X-axis: factor number (0 to 10). Y-axis: eigenvalues (0 to 5). The scree criterion cutoff is marked on the curve.]

    Stage 5: Interpreting the factors

    Three processes of factor interpretation

    Estimate the factor matrix
    - The initial unrotated factor matrix is computed.
    - It contains factor loadings for each variable on each factor.
    - Factor loadings are the correlations of each variable with each factor.
    - Higher loadings make the variable more representative of the factor.

    Factor rotation
    - A rotational method is employed to achieve simpler and theoretically more meaningful factor solutions.
    - The reference axes of the factors are turned about the origin until some other position has been reached.
    - There are two types of rotation: orthogonal factor rotation and oblique factor rotation.

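    Rotating Factors

    [Figure: variables 1 to 4 plotted against axes F1 and F2, before and after rotation]

    Loadings before rotation:

          Factor 1   Factor 2
    x1       0.5        0.5
    x2       0.8        0.8
    x3      -0.7        0.7
    x4      -0.5       -0.5

    Loadings after rotation:

          Factor 1   Factor 2
    x1       0          0.6
    x2       0          0.9
    x3      -0.9        0
    x4       0         -0.9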

    Orthogonal Rotation Oblique Rotation

    When to use Factor Analysis?

    Data Reduction

    Identification of underlying latent structures: clusters of correlated variables are termed factors.

    Example: factor analysis could potentially be used to identify the characteristics (out of a large number of characteristics) that make a person popular.

    Candidate characteristics: level of social skills, selfishness, how interesting a person is to others, the amount of time they spend talking about themselves (Talk 2) versus the other person (Talk 1), and their propensity to lie about themselves.

    The R-Matrix

    Meaningful clusters of large correlation coefficients between subsets of variables suggest that these variables are measuring aspects of the same underlying dimension.

    Factor 1:

    The better your social skills,

    the more interesting and

    talkative you tend to be.

    Factor 2:

    Selfish people are likely to lie

    and talk about themselves.

    What is a Factor?

    Factors can be viewed as classification axes along which the individual variables can be plotted.

    The greater the loading of variables on a factor, the more the factor explains relationships among those variables.

    Ideally, variables should be strongly related to (or load on) only one factor.

    Graphical Representation of a

    factor plot

    Note that each variable

    loads primarily on only

    one factor.

    Factor loadings tell us about the relative contribution that a variable makes to a factor.

    Mathematical Representation

    of a factor plot

    $Y_i = b_1 X_{1i} + b_2 X_{2i} + \dots + b_n X_{ni} + \varepsilon_i$

    $\text{Factor}_i = b_1 \text{Variable}_{1i} + b_2 \text{Variable}_{2i} + \dots + b_n \text{Variable}_{ni} + \varepsilon_i$

    The equation describing a linear model can be applied to the description of a factor.

    The $b$s in the equation represent the factor loadings observed in the factor plot.

    Note: there is no intercept in the equation, since the lines intersect at zero and hence the intercept is also zero.

    Mathematical Representation

    of a factor plot

    There are two factors underlying the popularity construct: general sociability and consideration.

    We can construct equations that describe each factor in terms of the variables that have been measured.

    $\text{Sociability}_i = b_1 \text{Talk 1}_i + b_2 \text{Social Skills}_i + b_3 \text{Interest}_i + b_4 \text{Talk 2}_i + b_5 \text{Selfish}_i + b_6 \text{Liar}_i + \varepsilon_i$

    $\text{Consideration}_i = b_1 \text{Talk 1}_i + b_2 \text{Social Skills}_i + b_3 \text{Interest}_i + b_4 \text{Talk 2}_i + b_5 \text{Selfish}_i + b_6 \text{Liar}_i + \varepsilon_i$

    Mathematical Representation

    of a factor plot

    $\text{Sociability}_i = 0.87\,\text{Talk 1}_i + 0.96\,\text{Social Skills}_i + 0.92\,\text{Interest}_i + 0.00\,\text{Talk 2}_i - 0.10\,\text{Selfish}_i + 0.09\,\text{Liar}_i + \varepsilon_i$

    $\text{Consideration}_i = 0.01\,\text{Talk 1}_i - 0.03\,\text{Social Skills}_i + 0.04\,\text{Interest}_i + 0.82\,\text{Talk 2}_i + 0.75\,\text{Selfish}_i + 0.70\,\text{Liar}_i + \varepsilon_i$

    The values of the $b$s in the two equations differ, depending on the relative importance of each variable to a particular factor.

    Ideally, variables should have very high b-values for one factor and very low b-values for all other factors.

    Replace the values of $b$ with the coordinate of each variable on the graph.

    Factor Loadings

    The b values represent the weights of a variable on a factor and are termed factor loadings.

    These values are stored in a factor pattern matrix (A). Columns display the factors (underlying constructs) and rows display how each variable loads onto each factor.

    Variable        Sociability   Consideration
    Talk 1              0.87           0.01
    Social Skills       0.96          -0.03
    Interest            0.92           0.04
    Talk 2              0.00           0.82
    Selfish            -0.10           0.75
    Liar                0.09           0.70
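As a small worked example, the loadings in the table above can be squared and summed. Row sums give each variable's communality and column sums give the variance accounted for by each factor. This sketch assumes the two factors are uncorrelated, so that squared loadings can be added; the variable names are taken from the table.

```python
import numpy as np

variables = ["Talk 1", "Social Skills", "Interest", "Talk 2", "Selfish", "Liar"]

# Factor pattern matrix A: columns are Sociability and Consideration.
A = np.array([[ 0.87,  0.01],
              [ 0.96, -0.03],
              [ 0.92,  0.04],
              [ 0.00,  0.82],
              [-0.10,  0.75],
              [ 0.09,  0.70]])

communalities = (A ** 2).sum(axis=1)   # shared variance per variable
ss_loadings = (A ** 2).sum(axis=0)     # variance accounted for per factor

for name, h2 in zip(variables, communalities):
    print(f"{name:14s} {h2:.3f}")
print(np.round(ss_loadings, 4))
```

Sociability accounts for about 2.54 units of variance and Consideration for about 1.73, out of 6 (one per standardized variable).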

    Factor Scores

    Once factors are derived, we can estimate each person's factor scores (based on their scores for each factor's constituent variables).

    Potential uses for factor scores:
    - Estimate a person's score on one or more factors.
    - Answer questions of scientific or practical interest (e.g., are females more sociable than males? using the factor scores for sociability).

    Methods of determining factor scores:
    - Weighted Average (simplest, but scale dependent)
    - Regression Method (easiest to understand; most typically used)
    - Bartlett Method (produces scores that are unbiased and correlate only with their own factor)
    - Anderson-Rubin Method (produces scores that are uncorrelated and standardized)
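The first two methods in the list above can be sketched as follows. This assumes orthogonal factors and the usual regression-method coefficients, obtained by solving R B = A for the score coefficient matrix B. The function names, the simulated data, and the use of principal component loadings as the example loading matrix are all illustrative assumptions.

```python
import numpy as np

def weighted_sum_scores(X, loadings):
    """Weight each raw variable by its loading; depends on the variables' scales."""
    return X @ loadings

def regression_factor_scores(X, loadings):
    """Regression method: standardize, then apply coefficients B = R^-1 A."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = np.corrcoef(X, rowvar=False)
    weights = np.linalg.solve(R, loadings)   # factor score coefficients
    return Z @ weights

# Hypothetical data: 100 people, 6 variables driven by 2 latent dimensions.
rng = np.random.default_rng(3)
latent = rng.normal(size=(100, 2))
mixing = np.array([[0.9, 0.8, 0.7, 0.1, 0.0, 0.1],
                   [0.1, 0.0, 0.1, 0.9, 0.8, 0.7]])
X = latent @ mixing + 0.5 * rng.normal(size=(100, 6))

# Example loadings: first two principal components of the correlation matrix.
R = np.corrcoef(X, rowvar=False)
vals, vecs = np.linalg.eigh(R)
top = np.argsort(vals)[::-1][:2]
loadings = vecs[:, top] * np.sqrt(vals[top])

scores = regression_factor_scores(X, loadings)
print(scores.shape)
```

With component loadings, the regression scores come out standardized (mean 0, standard deviation 1), which makes groups easy to compare on a factor.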

    Approaches to Factor Analysis

    Exploratory
    - Reduce a number of measurements to a smaller number of indices or factors (e.g., Principal Components Analysis or PCA).
    - Goal: identify factors based on the data and maximize the amount of variance explained.

    Confirmatory
    - Test hypothetical relationships between measures and more abstract constructs.
    - Goal: the researcher must hypothesize, in advance, the number of factors, whether or not these factors are correlated, and which items load onto and reflect particular factors. In contrast to EFA, where all loadings are free to vary, CFA allows for the explicit constraint of certain loadings to be zero.

    Communality

    Understanding variance in an R-matrix

    Total variance for a particular variable has two components:
    - Common variance: variance shared with other variables.
    - Unique variance: variance specific to that variable (including error or random variance).

    Communality
    - The proportion of common (or shared) variance present in a variable is known as the communality.
    - A variable that has no unique variance has a communality of 1; one that shares none of its variance with any other variable has a communality of 0.

    Factor Extraction: PCA vs. Factor Analysis

    Principal Component Analysis. A data reduction technique that represents a set of variables by a smaller number of variables called principal components.
    - The components are uncorrelated and therefore measure different, unrelated aspects or dimensions of the data.
    - Principal components are chosen such that the first one accounts for as much of the variation in the data as possible, the second one for as much of the remaining variance as possible, and so on.
    - Useful for combining many variables into a smaller number of subsets.

    Factor Analysis. Derives a mathematical model from which factors are estimated.
    - Factors are linear combinations that maximize the shared portion of the variance underlying latent constructs.
    - May be used to identify the structure underlying such variables and to estimate scores to measure the latent factors themselves.
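The two defining properties of principal components, that they are mutually uncorrelated and ordered by the variance they account for, can be verified on simulated data. A minimal sketch, assuming PCA on the correlation matrix; the data and the `mixing` matrix are hypothetical.

```python
import numpy as np

# Hypothetical data: 300 cases, 5 variables driven by 2 latent dimensions.
rng = np.random.default_rng(2)
latent = rng.normal(size=(300, 2))
mixing = np.array([[0.9, 0.8, 0.1, 0.0, 0.5],
                   [0.0, 0.2, 0.9, 0.8, 0.5]])
X = latent @ mixing + 0.5 * rng.normal(size=(300, 5))

# Standardize, then decompose the correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
R = np.corrcoef(X, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Component scores: projections of the standardized data onto the eigenvectors.
scores = Z @ eigenvectors
score_cov = np.cov(scores, rowvar=False)     # diagonal: components are uncorrelated
explained = eigenvalues / eigenvalues.sum()  # proportion of variance per component

print(np.round(explained, 3))
```

The covariance matrix of the scores is diagonal, with the eigenvalues on the diagonal in decreasing order.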

    Factor Extraction: Eigenvalues & Scree Plot

    Eigenvalues
    - Measure the amount of variation accounted for by each factor.
    - The number of principal components is less than or equal to the number of original variables. The first principal component accounts for as much of the variability in the data as possible. Each succeeding component has the highest variance possible under the constraint that it be orthogonal to (i.e., uncorrelated with) the preceding components.

    Scree Plots
    - Plot each eigenvalue (Y-axis) against the factor with which it is associated (X-axis).
    - By graphing the eigenvalues, the relative importance of each factor becomes apparent.

    Factor Retention Based on Scree Plots

    Factor Retention: Kaiser's Criterion

    Kaiser (1960) recommends retaining all factors with eigenvalues greater than 1.
    - Based on the idea that eigenvalues represent the amount of variance explained by a factor and that an eigenvalue of 1 represents a substantial amount of variation.
    - Kaiser's criterion tends to overestimate the number of factors to be retained.

    Doing Factor Analysis: An Example

    Students often become stressed about statistics and the use of computers and/or SPSS to analyze data.

    Suppose we develop a questionnaire (the SAQ) to measure this propensity (see sample items on the following slides; the data can be found in SAQ.sav).

    Does the questionnaire measure a single construct? Or is it possible that there are multiple aspects comprising students' anxiety toward SPSS?

    Doing Factor Analysis: Some

    Considerations

    Sample size is important! A sample of 300 or more will likely provide a stable factor solution, but this depends on the number of variables and factors identified.

    Factors that have four or more loadings greater than

    0.6 are likely to be reliable regardless of sample

    size.

    Correlations among the items should not be too low

    (less than .3) or too high (greater than .8), but the

    pattern is what is important.

    Factor Extraction

    Scree Plot for the SAQ Data

    Table of Communalities Before

    and After Extraction

    Component Matrix Before Rotation (loadings of each variable onto each factor)

    Note: Loadings less than

    0.4 have been omitted.

    Factor Rotation

    To aid interpretation it is possible to maximize the loading of a variable on one factor while minimizing its loading on all other factors.

    This is known as Factor Rotation.

    Two types:
    - Orthogonal (factors are uncorrelated)
    - Oblique (factors intercorrelate)

    Orthogonal Rotation Oblique Rotation

    Orthogonal Rotation (varimax)

    Rotated Component Matrix (a)

    Component
    1      2      3      4      Item
    .800                        I have little experience of computers
    .684                        SPSS always crashes when I try to use it
    .647                        I worry that I will cause irreparable damage because of my incompetence with computers
    .638                        All computers hate me
    .579                        Computers have minds of their own and deliberately go wrong whenever I use them
    .550                        Computers are useful only for playing games
    .459                        Computers are out to get me
           .677                 I can't sleep for thoughts of eigenvectors
           .661                 I wake up under my duvet thinking that I am trapped under a normal distribution
          -.567                 Standard deviations excite me
    .473   .523                 People try to tell you that SPSS makes statistics easier to understand but it doesn't
           .516                 I dream that Pearson is attacking me with correlation coefficients
           .514                 I weep openly at the mention of central tendency
           .496                 Statistics makes me cry
           .429                 I don't understand statistics
                  .833          I have never been good at mathematics
                  .747          I slip into a coma whenever I see an equation
                  .747          I did badly at mathematics at school
                         .648   My friends are better at statistics than me
                         .645   My friends are better at SPSS than I am
                         .586   If I'm good at statistics my friends will think I'm a nerd
                         .543   My friends will think I'm stupid for not being able to cope with SPSS
                         .427   Everybody looks at me when I use SPSS

    Extraction Method: Principal Component Analysis.
    Rotation Method: Varimax with Kaiser Normalization.
    a. Rotation converged in 9 iterations.

    Component labels: 1 = Fear of Computers, 2 = Fear of Statistics, 3 = Fear of Math, 4 = Peer Evaluation

    Note: Varimax rotation is the most commonly used rotation. Its goal is to minimize the complexity of the components by making the large loadings larger and the small loadings smaller within each component. Quartimax rotation makes large loadings larger and small loadings smaller within each variable. Equamax rotation is a compromise that attempts to simplify both components and variables. These are all orthogonal rotations, that is, the axes remain perpendicular, so the components are not correlated.
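A varimax rotation can be sketched with a short iterative routine. The version below is the widely used SVD-based iteration applied to raw loadings; it omits the Kaiser normalization reported in the SPSS output above, so it is an illustration of the idea and not a reproduction of that output. The function name `varimax`, its parameters, and the example loadings are hypothetical.

```python
import numpy as np

def varimax(loadings, gamma=1.0, max_iter=100, tol=1e-6):
    """Return (rotated loadings, rotation matrix) from an SVD-based varimax iteration."""
    p, k = loadings.shape
    T = np.eye(k)                  # rotation matrix, starts as no rotation
    d = 0.0
    for _ in range(max_iter):
        L = loadings @ T
        # Gradient-like target of the varimax criterion.
        B = L ** 3 - (gamma / p) * L @ np.diag((L ** 2).sum(axis=0))
        u, s, vt = np.linalg.svd(loadings.T @ B)
        T = u @ vt                 # nearest orthogonal matrix
        d_new = s.sum()
        if d != 0.0 and d_new / d < 1.0 + tol:
            break                  # criterion has stopped improving
        d = d_new
    return loadings @ T, T

# Hypothetical unrotated loadings for six variables on two components.
unrotated = np.array([[0.7,  0.5],
                      [0.8,  0.4],
                      [0.6,  0.6],
                      [0.5, -0.6],
                      [0.6, -0.5],
                      [0.4, -0.7]])

rotated, T = varimax(unrotated)
print(np.round(rotated, 2))
```

Because the rotation matrix is orthogonal, each variable's communality (row sum of squared loadings) is the same before and after rotation; only the distribution of variance across components changes.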

    Oblique Rotation: Pattern Matrix

    Pattern Matrix (a)

    Loadings as printed, in output order:
    .706, .591, -.511, .405, .400, .643, .621, .615, .507, .885, .713, .653, .650, .588, .585, .412 and .462 (printed on one row), .411, -.902, -.774, -.774

    Items, in output order:
    - I can't sleep for thoughts of eigenvectors
    - I wake up under my duvet thinking that I am trapped under a normal distribution
    - Standard deviations excite me
    - I dream that Pearson is attacking me with correlation coefficients
    - I weep openly at the mention of central tendency
    - Statistics makes me cry
    - I don't understand statistics
    - My friends are better at SPSS than I am
    - My friends are better at statistics than me
    - If I'm good at statistics my friends will think I'm a nerd
    - My friends will think I'm stupid for not being able to cope with SPSS
    - Everybody looks at me when I use SPSS
    - I have little experience of computers
    - SPSS always crashes when I try to use it
    - All computers hate me
    - I worry that I will cause irreparable damage because of my incompetence with computers
    - Computers have minds of their own and deliberately go wrong whenever I use them
    - Computers are useful only for playing games
    - People try to tell you that SPSS makes statistics easier to understand but it doesn't
    - Computers are out to get me
    - I have never been good at mathematics
    - I slip into a coma whenever I see an equation
    - I did badly at mathematics at school

    Components: 1 2 3 4
    Extraction Method: Principal Component Analysis.
    Rotation Method: Oblimin with Kaiser Normalization.
    a. Rotation converged in 29 iterations.

    Component labels: Fear of Statistics, Fear of Computers, Fear of Math, Peer Evaluation

    Reliability: a measure should consistently reflect the construct it is measuring.

    Test-Retest Method
    - What about practice effects/mood states?

    Alternate Form Method
    - Expensive and impractical

    Split-Half Method
    - Splits the questionnaire into two random halves, calculates scores and correlates them.

    Cronbach's Alpha
    - Splits the questionnaire (or sub-scales of a questionnaire) into all possible halves, calculates the scores, correlates them and averages the correlation for all splits.
    - Ranges from 0 (no reliability) to 1 (complete reliability)
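In practice Cronbach's alpha is usually computed from item and total-score variances instead of enumerating every split. A minimal sketch, assuming the standard variance form alpha = k/(k - 1) * (1 - sum of item variances / variance of the total score); the function name `cronbach_alpha` and the response data are hypothetical.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n respondents x k items) array of scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances / total_variance)

# Hypothetical responses of five people to a three-item subscale.
responses = np.array([[3, 4, 3],
                      [2, 2, 3],
                      [5, 4, 5],
                      [1, 2, 1],
                      [4, 5, 4]])

alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

Each subscale (for example the items loading on one component) would be passed to the function separately, which is how the per-subscale reliabilities in the following slides are organized.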

    Reliability: Fear of Computers Subscale

    Reliability: Fear of Statistics Subscale

    Reliability: Fear of Math Subscale

    Reliability: Peer Evaluation Subscale

    Reporting the Results

    A principal component analysis (PCA) was conducted on the 23 items with orthogonal rotation (varimax). Bartlett's test of sphericity, $\chi^2(253) = 19334.49$, $p < .001$, indicated that correlations between items were sufficiently large for PCA. An initial analysis was run to obtain eigenvalues for each component in the data. Four components had eigenvalues over Kaiser's criterion of 1 and in combination explained 50.32% of the variance. The scree plot was slightly ambiguous and showed inflexions that would justify retaining either 2 or 4 factors.

    Given the large sample size, and the convergence of the scree plot and Kaiser's criterion on four components, four components were retained in the final analysis. Component 1 represents a fear of computers, component 2 a fear of statistics, component 3 a fear of math, and component 4 peer evaluation concerns.

    The fear of computers, fear of statistics, and fear of math subscales of the SAQ all had high reliabilities, all Cronbach's $\alpha = .82$. However, the fear of negative peer evaluation subscale had a relatively low reliability, Cronbach's $\alpha = .57$.

    Step 1: Select Factor Analysis

    Step 2: Add all variables to be included

    Step 3: Get descriptive statistics & correlations

    Step 4: Ask for Scree Plot and set extraction options

    Step 5: Handle missing values and sort coefficients by size

    Step 6: Select rotation type and set rotation iterations

    Step 7: Save Factor Scores

    Communalities

    Variance Explained

    Scree Plot

    Rotated Component Matrix: Component 1

    Rotated Component Matrix: Component 2

    Component 1: Factor Score

    Component (Factor): Score Values

    Rename Components According to

    Interpretation
