Introduction to Factor Analysis [Compatibility Mode].pdf

  • 7/27/2019 Introduction to Factor Analysis [Compatibility Mode].pdf

    Introduction to Factor Analysis

    R.Venkatesakumar

    Department of Management Studies (SOM)

    Pondicherry University

    Factor Analysis 2

    Uniqueness of Factor Analysis

    • Factor analysis is distinctive in that it is an 'interdependence' technique.

    • It does not treat the variables entered in the analysis as dependent or independent; instead, it treats all the variables in the analysis as interdependent.

    Objective of Factor Analysis

    • The primary purpose of factor analysis is to define the underlying structure in the data matrix, or the grouping of variables.

    • Hence factor analysis can be very useful for culling a large number of variables down to a 'representative' subset that still possesses the characteristics of the original set of variables.

    Factor analysis - Research Design

    • Specific questions such as:

        • purpose/objective of the analysis

        • type of analysis

        • variables considered in the analysis

        • sample size requirements

        • assessing the characteristics of the sample

    Objective of the Analysis

    • Identification of structure through summarizing the data, or

    • Data reduction, from a larger set of variables to a manageable number of dimensions (identifying a representative set of variables)

    • The structure is identified by examining the correlations between the variables, or between the respondents.

    • In 'data reduction', factor analysis focuses on identifying a set of representative 'factors', fewer in number than the original variables.

    • Creation of an entirely new set of variables

    • Cases vs. variables
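
    The cases-versus-variables choice can be illustrated with a small sketch. The ratings below are made-up values, and the variable names are hypothetical; the only point is that the same data matrix yields a variables-by-variables correlation matrix or a cases-by-cases one, depending on which direction is correlated.

```python
import numpy as np

# Hypothetical data: 6 respondents (rows) rated 4 variables (columns).
data = np.array([
    [7, 6, 2, 3],
    [6, 7, 3, 2],
    [2, 3, 6, 7],
    [3, 2, 7, 6],
    [5, 5, 4, 4],
    [1, 2, 7, 7],
], dtype=float)

# Correlations between variables (columns): a 4 x 4 matrix.
corr_variables = np.corrcoef(data, rowvar=False)

# Correlations between respondents (rows): a 6 x 6 matrix.
corr_cases = np.corrcoef(data, rowvar=True)

print(corr_variables.shape)
print(corr_cases.shape)
```

    Grouping variables starts from the first matrix; grouping respondents starts from the second.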

    Issues related to variables

    • Variables

        • Normally metric variables, that is, either ratio scaled or interval scaled.

        • Sometimes non-metric variables, especially dummy variables, are also used.

        • Specifying the variables to be included in the factor analysis is a crucial task.

    Issues related to variables (continued)

    • Include 5 or 7 variables to measure the same feature.

    • The strength and purpose of factor analysis is to find patterns among the variables.

    • If the variables are conceptually well defined, the derived factors represent more meaningful concepts.

    • Remember that including irrelevant variables, or too many variables, will distort the results.

    Issues related to sample size

    • Sample size

        • The preferable sample size for factor analysis is 100 or larger.

        • Sometimes a ratio of 10:1 (i.e., 10 observations per variable) or 20:1 is used, which improves the predictive power.

        • Do not attempt factor analysis when the sample size is less than 50.
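
    These rules of thumb can be turned into a quick check. The function name and the wording of the returned messages are illustrative assumptions, not part of the slides; only the thresholds (50, 100 and 10 cases per variable) come from the list above.

```python
def sample_size_advice(n_cases: int, n_variables: int) -> str:
    """Classify a sample using the rules of thumb listed above."""
    if n_cases < 50:
        return "too small: do not attempt factor analysis"
    ratio = n_cases / n_variables
    if n_cases >= 100 and ratio >= 10:
        return "adequate: at least 100 cases and 10 per variable"
    if n_cases >= 100:
        return "acceptable size, but fewer than 10 cases per variable"
    return "marginal: between 50 and 99 cases"


print(sample_size_advice(200, 10))
```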

    Step 3: Basic assumptions about data

    • Partial correlation between the variables

        • If the partial correlations are low, the variables can be explained by the factors.

        • Otherwise no true factors exist, and factor analysis is inappropriate in that situation.

        • If the partial correlations (anti-image correlations) are high, it is an indication that the variables are not suited for factor analysis.
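
    As a rough sketch of how these partial correlations can be obtained, the snippet below derives them from the inverse of the correlation matrix. The matrix `R` is a made-up example and the function name is hypothetical. The anti-image correlation is the same quantity with the opposite sign, so its magnitude tells the same story.

```python
import numpy as np


def partial_correlations(corr: np.ndarray) -> np.ndarray:
    """Partial correlation of each pair, controlling for all other variables."""
    precision = np.linalg.inv(corr)
    scale = np.sqrt(np.outer(np.diag(precision), np.diag(precision)))
    partial = -precision / scale
    np.fill_diagonal(partial, 1.0)
    return partial


# Hypothetical correlation matrix for three variables.
R = np.array([
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
])
P = partial_correlations(R)
print(np.round(P, 3))
```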

    Measure of Sampling Adequacy (MSA)

    Multicollinearity is assessed using the MSA (measure of sampling adequacy).

    The MSA is measured by the Kaiser-Meyer-Olkin (KMO) statistic.

    As a measure of sampling adequacy, the KMO predicts whether the data are likely to factor well, based on correlations and partial correlations.

    KMO can be used to identify which variables to drop from the factor analysis because they lack multicollinearity.

    There is a KMO statistic for each individual variable, and an overall KMO statistic computed across all pairs of variables (it is not the simple sum of the individual values).

    • This is another measure, which tries to quantify the intercorrelation among the variables.

    • The coefficient ranges from 0 to 1, where 1 means that each variable is perfectly predicted by the other variables.

    • First we apply the MSA to the individual variables; any variable falling in the unacceptable range is eliminated, one at a time, until the overall KMO rises above .50 and each individual variable's KMO is above .50.

    • The variables that meet this criterion are then included in the overall Measure of Sampling Adequacy test.
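
    A minimal sketch of the KMO calculation follows, assuming the usual definition: squared correlations divided by squared correlations plus squared partial correlations, over off-diagonal pairs. The function name and the example matrix are hypothetical.

```python
import numpy as np


def kmo(corr: np.ndarray):
    """Return (per-variable KMO, overall KMO) for a correlation matrix."""
    precision = np.linalg.inv(corr)
    scale = np.sqrt(np.outer(np.diag(precision), np.diag(precision)))
    partial = -precision / scale
    # Only off-diagonal pairs enter the sums.
    off_diag = ~np.eye(corr.shape[0], dtype=bool)
    r2 = np.where(off_diag, corr ** 2, 0.0)
    p2 = np.where(off_diag, partial ** 2, 0.0)
    per_variable = r2.sum(axis=0) / (r2.sum(axis=0) + p2.sum(axis=0))
    overall = r2.sum() / (r2.sum() + p2.sum())
    return per_variable, overall


# Hypothetical correlation matrix: two pairs of strongly related variables.
R = np.array([
    [1.0, 0.7, 0.1, 0.1],
    [0.7, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.7],
    [0.1, 0.1, 0.7, 1.0],
])
per_variable, overall = kmo(R)
print(np.round(per_variable, 3), round(float(overall), 3))
```

    In the elimination procedure described above, one would drop the variable with the smallest individual value below .50 and recompute both statistics on the remaining variables.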

    Table showing ranges of MSA coefficients and their interpretation

    Range of the coefficient    Remark
    0.80 or above               Meritorious
    0.70 - 0.80                 Middling
    0.60 - 0.70                 Mediocre
    0.50 - 0.60                 Miserable
    Below 0.50                  Unacceptable

    A statistically significant Bartlett's test of sphericity (sig. < .05) indicates that sufficient correlations exist among the variables to proceed.

    Measure of Sampling Adequacy (MSA) values must exceed .50 for both the overall test and each individual variable. Variables with values less than .50 should be omitted from the factor analysis one at a time, with the smallest one being omitted each time.

    Three types of variance

    (i) Common variance, which is defined as the variance in a variable that is shared with all other variables in the procedure.

    (ii) Specific variance, which is the variance associated with a specific variable, and

    (iii) Error variance, which is due to measurement error or unreliable responses from the respondents.

    Extraction Method Determines the Types of Variance Carried into the Factor Matrix

    Diagonal value    Variance extracted                              Variance not used
    Unity (1)         Total variance (common, specific and error)     -
    Communality       Common variance                                 Specific and error variance

    Factor extraction: basic procedures

    • Two basic extraction methods are available for deriving factors:

        • (i) Common Factor Analysis (CFA)

        • (ii) Principal Component Analysis (PCA)

    Method of Extraction

    • Principal Component Analysis considers the total variance, not only the common variance, and extracts factors that explain the maximum amount of total variance in the variables.

    • Principal components factor analysis inserts 1's on the diagonal of the correlation matrix, thus considering all of the available variance.

    • Most appropriate when the concern is with deriving a minimum number of factors to explain a maximum portion of variance in the original variables, and the researcher knows the specific and error variances are small.

    Common Factor Analysis

    • Common Factor Analysis, on the other hand, focuses on explaining the maximum amount of variance that is shared by all the variables in the analysis.

    • Common factor analysis uses only the common variance and places communality estimates on the diagonal of the correlation matrix.

    • Most appropriate when there is a desire to reveal latent dimensions of the original variables and the researcher does not know about the nature of the specific and error variance.
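
    The difference between the two methods comes down to what sits on the diagonal of the correlation matrix, which the sketch below makes concrete. The slides do not say which communality estimate is used; squared multiple correlations (SMC) are assumed here as one common choice, and the matrix `R` is a made-up example.

```python
import numpy as np

# Hypothetical correlation matrix for three variables.
R = np.array([
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
])

# PCA: unities stay on the diagonal, so total variance is analysed.
pca_matrix = R.copy()

# Common factor analysis: the diagonal holds communality estimates.
# Assumption: squared multiple correlations (SMC) serve as the estimates.
smc = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
cfa_matrix = R.copy()
np.fill_diagonal(cfa_matrix, smc)

print(np.round(np.diag(pca_matrix), 3))
print(np.round(np.diag(cfa_matrix), 3))
```

    The trace of the second matrix is smaller than the number of variables, because specific and error variance have been left out.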

    Number of Factors to be extracted

    (i) Latent root criterion

    (ii) Percentage of variance criterion

    (iii) Scree test, and

    (iv) A priori criterion

    To understand these concepts, knowledge of 'factor loadings', 'Eigen values' and 'communalities' is required.

    Latent Root Criterion

    Any individual factor should explain the variance of at least a single variable if it is to be considered in the procedure.
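
    Because each standardized variable contributes a variance of 1, this criterion is commonly applied by keeping the factors whose Eigen value (latent root) exceeds 1. A minimal sketch, using a made-up correlation matrix:

```python
import numpy as np

# Hypothetical correlation matrix: two pairs of strongly related variables.
R = np.array([
    [1.0, 0.7, 0.1, 0.1],
    [0.7, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.7],
    [0.1, 0.1, 0.7, 1.0],
])

# Eigen values of the correlation matrix, largest first.
eigenvalues = np.linalg.eigvalsh(R)[::-1]

# Latent root criterion: keep factors with an Eigen value above 1.
n_factors = int(np.sum(eigenvalues > 1.0))

print(np.round(eigenvalues, 3), n_factors)
```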

    Percentage of Variance Criterion

    • Proceed to extract factors until the pre-specified percentage of variance is achieved.
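
    As a sketch of this rule, the snippet below accumulates the share of variance factor by factor and stops at the first point where the target is met. The Eigen values and the 80% target are hypothetical numbers chosen for illustration, and the target is assumed to be below 100%.

```python
import numpy as np

# Hypothetical Eigen values, sorted from largest to smallest.
eigenvalues = np.array([1.9, 1.5, 0.3, 0.3])
target = 0.80  # hypothetical pre-specified share of variance

# Cumulative share of total variance after each successive factor.
cumulative = np.cumsum(eigenvalues) / eigenvalues.sum()

# First position at which the cumulative share reaches the target.
n_factors = int(np.argmax(cumulative >= target)) + 1

print(np.round(cumulative, 3), n_factors)
```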

    Scree Test

    • The scree test tries to identify the number of factors that can be extracted before unique variance begins to dominate.

    Initial Communalities

    • It is the total amount of variance that a variable shares with all the other variables in the analysis and that is used in the analysis.

    • If we use Principal Component Analysis (PCA), the initial communality of each variable is one, indicating that the full variance of the variable is used in the analysis.

    Factor Loadings

    • A factor loading is the correlation between an original variable and a factor. The amount of the variable's variance explained by the factor is the square of that correlation (as in the case of the coefficient of determination).

    • The sum of the squared factor loadings for a variable indicates the proportion of its variance that has been extracted by all the factors. This is displayed as 'Final Communalities' in the results.

    Eigen Values

    • If we square the loadings on a factor and sum them across the variables, the result is known as the 'Eigen Value' for that factor. The sum of the initial communalities is the 'Sum of Eigen Values', which equals the number of variables used in the analysis, provided we use Principal Component Analysis.

    • The ratio of the Eigen value for a factor to the sum of Eigen values represents the proportion of variance explained by that factor.
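
    These relationships between loadings, communalities and Eigen values can be checked numerically. The sketch below assumes principal component extraction with every component kept, and uses a made-up correlation matrix; the variable names are illustrative.

```python
import numpy as np

# Hypothetical correlation matrix: two pairs of strongly related variables.
R = np.array([
    [1.0, 0.7, 0.1, 0.1],
    [0.7, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.7],
    [0.1, 0.1, 0.7, 1.0],
])

# Principal component loadings: eigenvectors scaled by sqrt(Eigen value).
eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]
loadings = eigenvectors * np.sqrt(eigenvalues)

# Row sums of squared loadings: communality of each variable.
communalities = (loadings ** 2).sum(axis=1)
# Column sums of squared loadings: Eigen value of each factor.
column_sums = (loadings ** 2).sum(axis=0)
# Share of total variance explained by each factor.
explained = eigenvalues / eigenvalues.sum()

print(np.round(communalities, 3))
print(np.round(column_sums, 3))
print(np.round(explained, 3))
```

    With all components kept, every communality equals 1 and the Eigen values sum to the number of variables, as stated above.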

    Obtaining the 'Unrotated' Solution

    • The factor matrix contains the loadings of each variable on the factors.

    • The first factor tries to extract the maximum variance in all the variables (i.e., it can be viewed as a summary of the best linear relationship that exists in the data).

    • As a result, many variables tend to load on the first factor, which makes things complicated for the researcher at the interpretation stage.

    Interpreting Factor Loadings

    Factor loading                         Remarks
    -0.30 to +0.30                         Minimal
    +0.40 to +0.50 or -0.40 to -0.50       More important loadings
    +0.50 to +1.00 or -0.50 to -1.00       Very significant loadings

    Loading    Sample size required
    0.30       350
    0.40       200
    0.45       150
    0.50       120
    0.55       100
    0.60       85
    0.65       70
    0.70       60
    0.75       50

    (Computation made with SOLO Power Analysis, BMDP Statistical Software Inc., 1993)

    However, sample size is critical in determining which loadings are significant.
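
    The loading and sample-size pairs listed above can be used as a simple lookup. The function below is an illustrative helper (its name is hypothetical) that returns the smallest tabulated loading a given sample can support.

```python
# Pairs copied from the values listed above: (loading, sample size required).
REQUIRED_SAMPLE_SIZE = [
    (0.30, 350), (0.40, 200), (0.45, 150), (0.50, 120), (0.55, 100),
    (0.60, 85), (0.65, 70), (0.70, 60), (0.75, 50),
]


def minimum_significant_loading(n_cases: int):
    """Smallest tabulated loading that is significant for this sample size."""
    for loading, required in REQUIRED_SAMPLE_SIZE:
        if n_cases >= required:
            return loading
    return None  # fewer than 50 cases: no tabulated loading qualifies


print(minimum_significant_loading(100))
```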

    Interpreting factor loadings

    • For each variable, identify the highest loading that is also significant given the sample size, and assign the variable to that factor.

    • If every variable has only one high, significant loading, on one particular factor, then the interpretation is very simple: each factor is profiled by the characteristics of the variables that load significantly on it.

    • If all or most of the variables have high, significant loadings on the same single factor, then the interpretation becomes very difficult.

    Communalities

    • The communality of a variable is the amount (percentage or fraction) of its variance that is explained by the retained factors.

    • It is the sum of the squared loadings of the variable across the factors that are retained in the study.

    • A low communality means the particular variable is not well captured by the factors.
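
    A short sketch of this calculation, assuming principal component extraction and a made-up correlation matrix from which only the first two factors are retained:

```python
import numpy as np

# Hypothetical correlation matrix: two pairs of strongly related variables.
R = np.array([
    [1.0, 0.7, 0.1, 0.1],
    [0.7, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.7],
    [0.1, 0.1, 0.7, 1.0],
])

eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Keep only the first two factors (hypothetical choice for this example).
n_retained = 2
retained = eigenvectors[:, :n_retained] * np.sqrt(eigenvalues[:n_retained])

# Communality: sum of squared loadings across the retained factors.
communalities = (retained ** 2).sum(axis=1)

print(np.round(communalities, 3))
```

    A variable whose value here is well below the others would be the one the retained factors capture least.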

    Rotation of factor matrix

    • Rotation is a process by which the reference axes (Factor 1 axis, Factor 2 axis, etc.) are turned about the origin until some other 'better position' is reached, with the objective of redistributing the variance from the earlier factors to the later ones.

    • As a result, some of the variables will have high loadings on only one factor and low loadings, which may not be significant, on the remaining factors.

    Types of Rotations

    • Rotation can be classified into 2 types:

        • (i) Orthogonal rotation

        • (ii) Oblique rotation

    • As the names suggest, in orthogonal rotation the angle between the reference axes is maintained at 90°, which is not so in the case of oblique rotation.
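
    An orthogonal rotation of two factors can be sketched as multiplication by a 2 x 2 rotation matrix. The loadings and the 40-degree angle below are made-up values; in practice the angle is chosen by a rotation criterion rather than by hand.

```python
import numpy as np

# Hypothetical unrotated loadings of 4 variables on 2 factors.
loadings = np.array([
    [0.70, 0.50],
    [0.65, 0.55],
    [0.60, -0.50],
    [0.55, -0.60],
])

theta = np.deg2rad(40)  # hypothetical rotation angle
rotation = np.array([
    [np.cos(theta), -np.sin(theta)],
    [np.sin(theta), np.cos(theta)],
])

# Turning both axes together keeps them at 90 degrees to each other.
rotated = loadings @ rotation

# Variance moves between factors, but each communality is unchanged.
communality_before = (loadings ** 2).sum(axis=1)
communality_after = (rotated ** 2).sum(axis=1)
variance_before = (loadings ** 2).sum(axis=0)
variance_after = (rotated ** 2).sum(axis=0)

print(np.round(rotated, 3))
print(np.round(variance_before, 3), np.round(variance_after, 3))
```

    After the rotation, each variable in this example loads mainly on one factor, and part of the first factor's variance has shifted to the second.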

    Orthogonal Factor Rotation

    [Figure: variables V1 to V5 plotted against Unrotated Factor I and Unrotated Factor II (axes from -1.0 to +1.0), with the Rotated Factor I and Rotated Factor II axes overlaid.]

    Oblique Factor Rotation

    [Figure: variables V1 to V5 plotted against Unrotated Factor I and Unrotated Factor II (axes from -1.0 to +1.0), with the Factor I and Factor II axes overlaid for both the orthogonal rotation and the oblique rotation.]

    Choosing Factor Rotation Methods

    Orthogonal rotation methods:

    o are the most widely used rotational methods.

    o are the preferred method when the research goal is data reduction to either a smaller number of variables or a set of uncorrelated measures for subsequent use in other multivariate techniques.

    Oblique rotation methods:

    o are best suited to the goal of obtaining several theoretically meaningful factors or constructs because, realistically, very few constructs in the real world are uncorrelated.
