MEASUREMENT OF VARIABLES: OPERATIONAL DEFINITION AND SCALES

BRM Unit 4


  • MEASUREMENT OF VARIABLES: OPERATIONAL DEFINITION AND SCALES

  • THE RESEARCH DESIGN: DETAILS OF THE STUDY

    Purpose of the study: exploration, description, hypothesis testing

    Type of investigation: establishing causal relationships, correlations, or group differences

    Extent of researcher interference: minimal (studying events as they normally occur), moderate (minimum amount of interference), or maximal (high degree of control and artificial settings)

    Study setting: contrived or noncontrived

    Measurement and measures: operational definition, items (measures), scaling, categorizing, coding

    Unit of analysis (population to be studied): individuals, dyads, groups, organizations, machines, etc.

    Sampling design: probability/nonprobability sampling, sample size (n)

    Time horizon: one-shot (cross-sectional) or multishot (longitudinal)

    Data-collection method: observation, interview, questionnaire, physical measurement, unobtrusive methods

    These design choices flow from the problem statement and feed into data analysis: (1) feel for the data, (2) goodness of data, (3) hypothesis testing.

  • OPERATIONAL DEFINITION

    Reducing abstract concepts to render them measurable in a tangible way is called operationalizing the concepts. It is done by looking at the behavioral dimensions, facets, or properties denoted by the concept.

  • EXAMPLE OF DIMENSIONS AND ELEMENTS

    The concept of thirst is abstract: we cannot see thirst. However, we would expect a thirsty person to drink plenty of fluid. If several people say they are thirsty, then we may determine the thirst level of each of these individuals by measuring the quantity of fluid that they drink to quench their thirst. We will thus be able to measure their level of thirst, even though the concept of thirst itself is abstract and nebulous (unclear).

    In this example, thirst is the concept, the drinking of plenty of fluid is the dimension, and the measurement of the quantity of fluid drunk to quench the thirst is the element.

  • The nature of measurement

    Measurement occurs when an established index verifies the height, weight, or other feature of a physical object.

    To measure is to discover the extent, dimension, quantity, or capacity of something, especially by comparison with a standard.

    Measurement in research consists of assigning numbers to empirical events, objects, properties, or activities in compliance with a set of rules.

    This definition implies that measurement is a three-part process:

  • The nature of measurement

    1. Selecting observable empirical events.
    2. Developing a set of mapping rules.
    3. Applying the mapping rule(s) to each observation of that event.

  • GENDER AND STYLING CHARACTERISTICS OF ATTENDEES

    Sample elements: attendees A, B, C, D, E

    Empirical observation: gender
    Mapping rule: assign M if male, F if female
    Symbols: (M, F)

    Empirical observation: desirability of auto styling
    Mapping rule: assign 5 if very desirable, 4 if desirable, 3 if neither, 2 if undesirable, 1 if very undesirable
    Symbols: (1 through 5)
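The three-part process can be sketched in code. The attendee data below are hypothetical, but the two mapping rules are the ones from the slide:

```python
# Sketch of the three-part measurement process:
# (1) observe empirical events, (2) define mapping rules,
# (3) apply the rules to each observation.

# Step 1: empirical observations for attendees A-E (hypothetical data).
observations = {
    "A": {"gender": "male", "styling": "very desirable"},
    "B": {"gender": "female", "styling": "desirable"},
    "C": {"gender": "male", "styling": "neither"},
    "D": {"gender": "female", "styling": "undesirable"},
    "E": {"gender": "male", "styling": "very undesirable"},
}

# Step 2: the mapping rules given in the slide.
gender_rule = {"male": "M", "female": "F"}
styling_rule = {
    "very desirable": 5, "desirable": 4, "neither": 3,
    "undesirable": 2, "very undesirable": 1,
}

# Step 3: apply the rules to every observation, yielding symbols.
symbols = {
    name: (gender_rule[obs["gender"]], styling_rule[obs["styling"]])
    for name, obs in observations.items()
}

print(symbols["A"])  # ('M', 5)
```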

  • Methods of scaling

    A scale is a tool or mechanism by which individuals are distinguished as to how they differ from one another on the variable of interest to our study.

    There are four basic types of scales: nominal, ordinal, interval, and ratio. The degree of sophistication to which the scales are fine-tuned increases progressively as we move from the nominal to the ratio scale.

  • Nominal Scale

    The lowest measurement level you can use, from a statistical point of view, is the nominal scale. As the name implies, it simply places data into categories, without any order or structure.

    Example of a nominal scale:
    Sex: 1. Male 2. Female
    Area: 1. Rural 2. Urban

    Numbers assigned to represent the categories cannot meaningfully be added, subtracted, or divided.

    A mean or a median cannot be calculated for nominal data. A mode and a chi-square statistical test can be used.
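A minimal sketch of the two statistics the slide allows for nominal data, using made-up area-of-residence codes; the chi-square here is a goodness-of-fit test against equal expected frequencies:

```python
from collections import Counter

# Hypothetical nominal data: area of residence (1 = rural, 2 = urban).
areas = [1, 2, 2, 1, 2, 2, 2, 1, 2, 2]

# The mode is the only legitimate central-tendency measure here.
counts = Counter(areas)
mode = counts.most_common(1)[0][0]
print("mode:", mode)  # mode: 2

# Chi-square goodness of fit against equal expected frequencies:
# chi2 = sum((observed - expected)^2 / expected)
expected = len(areas) / len(counts)        # 10 / 2 = 5 per category
chi2 = sum((obs - expected) ** 2 / expected for obs in counts.values())
print("chi-square:", chi2)  # chi-square: 1.6
```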

  • Ordinal Scale

    The ordinal scale includes the characteristics of the nominal scale plus an indication of order. Ordinal data require conformity to a logical postulate, which states: if a is greater than b and b is greater than c, then a is greater than c.

    The use of an ordinal scale implies a statement of greater than or less than (an equality statement is also acceptable) without stating how much greater or less.

    Example of an ordinal scale: rank your brand preference when purchasing a television.
    Onida : Samsung : LG : Sony : Sharpe :

    The median, the mode, and rank-order correlation tests can be used.
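The rank-order correlation mentioned above can be sketched with Spearman's no-ties formula; the two preference rankings below are hypothetical:

```python
# Spearman's rank-order correlation for two hypothetical preference
# rankings of the five TV brands (1 = most preferred, 5 = least).
# No-ties formula: rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1))

rater_1 = {"Onida": 3, "Samsung": 1, "LG": 2, "Sony": 4, "Sharpe": 5}
rater_2 = {"Onida": 4, "Samsung": 2, "LG": 1, "Sony": 3, "Sharpe": 5}

n = len(rater_1)
d_squared = sum((rater_1[b] - rater_2[b]) ** 2 for b in rater_1)
rho = 1 - 6 * d_squared / (n * (n ** 2 - 1))
print(rho)  # 0.8 -> the two rankings agree strongly
```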

  • INTERVAL SCALE

    An interval scale measures the distance between any two points on the scale. This helps us to compute the means and the standard deviations of the responses on the variables.

    In other words, the interval scale not only groups individuals, it also measures the magnitude of the differences in the preferences among them.

    It is a more powerful scale than the nominal and ordinal scales, and has for its measure of central tendency the arithmetic mean. Its measures of dispersion are the range, the standard deviation, and the variance.

  • EXAMPLE OF INTERVAL SCALE

    Indicate the extent to which you agree with the following statements as they relate to your job, by circling the appropriate number against each, using the scale given below.

    Strongly Disagree = 1, Disagree = 2, Neither Agree Nor Disagree = 3, Agree = 4, Strongly Agree = 5

    The following opportunities offered by the job are very important to me:

    a. Interacting with others.                 1 2 3 4 5
    b. Using a number of different skills.      1 2 3 4 5
    c. Completing a task from beginning to end. 1 2 3 4 5
    d. Serving others.                          1 2 3 4 5
    e. Working independently.                   1 2 3 4 5
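Interval scaling licenses the mean, standard deviation, and variance named earlier; a sketch with hypothetical 1-5 responses from eight respondents to item (a):

```python
import statistics

# Hypothetical interval-scale responses from eight respondents to
# item (a), "Interacting with others", on the 1-5 scale above.
responses = [4, 5, 3, 4, 2, 5, 4, 3]

mean = statistics.mean(responses)          # arithmetic mean
stdev = statistics.stdev(responses)        # sample standard deviation
variance = statistics.variance(responses)  # sample variance
print(mean, round(stdev, 3))
```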

  • RATIO SCALE

    The ratio scale overcomes the disadvantage of the arbitrary origin point of the interval scale, in that it has an absolute zero point, which is a meaningful measurement point. Thus the ratio scale not only measures the magnitude of the differences between points on the scale but also taps the proportions in the differences. It is the most powerful of the four scales because it has a unique zero origin (not an arbitrary origin) and subsumes all the properties of the other three scales.

    The measure of central tendency of the ratio scale could be either the arithmetic or the geometric mean, and the measure of dispersion could be the standard deviation, the variance, or the coefficient of variation.

  • EXAMPLE OF RATIO SCALE

    1. How many other organizations did you work for before joining this one? ____

    2. Indicate the number of children you have in each of the following categories:
    ____ below three years of age
    ____ between three and six years
    ____ over six years but under twelve years
    ____ twelve years and over

    3. How many retail outlets do you operate? ____

    The responses to these questions could range from 0 to any reasonable figure.
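The ratio scale's true zero licenses the geometric mean and the coefficient of variation alongside the arithmetic mean; a sketch with hypothetical outlet counts:

```python
import statistics

# Hypothetical ratio-scale data: number of retail outlets operated
# by six respondents (question 3 above).
outlets = [2, 4, 8, 4, 2, 4]

arith_mean = statistics.mean(outlets)
geo_mean = statistics.geometric_mean(outlets)
cv = statistics.pstdev(outlets) / arith_mean   # coefficient of variation
print(arith_mean, round(geo_mean, 3), cv)  # 4 3.564 0.5
```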

  • SCALE

    Assigning numbers or symbols to elicit the attitudinal responses of subjects toward an object, event, or person is called scaling.

  • TYPES OF SCALES

    There are two main categories of attitudinal scales.

    1. Rating scales: Rating scales have several response categories and are used to elicit responses with regard to the object, event, or person studied.

    2. Ranking scales: Ranking scales make comparisons between or among objects, events, or persons and elicit the preferred choices and rankings among them.

  • Rating Scales

    Simple attitude scale
    Likert scale
    Semantic differential scale
    Numerical/multiple rating list scale
    Stapel scale
    Constant-sum scale
    Graphic rating scale

  • Simple Attitude Scale

    The simple category scale (also called a dichotomous scale) offers two mutually exclusive response choices.

    This response strategy is particularly useful for demographic questions or where a dichotomous response is adequate.

  • Examples of Simple Attitude Scale

    Do you plan to purchase a laptop in the next 3 months?
    Yes / No

    Which newspaper do you read most often for financial news?
    The Economic Times / The Financial Express / Business Standard / Others

    Check any of the sources you consulted when designing your new home:
    Online / Magazines / Designer / Architect / Others

    The internet is superior to traditional libraries for comprehensive searches.
    Strongly Agree / Agree / Neutral / Disagree / Strongly Disagree

  • Likert Scale

    The Likert scale, developed by Rensis Likert (pronounced Lick-ert), is the most frequently used variation of the summated scale. Summated rating scales consist of statements that express either a favourable or an unfavourable attitude towards the object of interest.

    Typically, each scale item has 5 categories, with scale values ranging from -2 to +2 and 0 as the neutral response.

  • Examples of Likert Scale

    Each item is rated from Strongly Agree to Strongly Disagree:

    Quality of the food                             2  1  0  -1  -2
    Cleanliness of the hostel                       2  1  0  -1  -2
    Amenities provided by the management            5  4  3   2   1
    Training time intervals                         1  2  3   4   5
    Satisfaction with the present appraisal system  5  4  3   2   1
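Because the Likert scale is summated, one respondent's item scores are added into a single attitude score; a sketch with hypothetical responses on the -2..+2 coding:

```python
# One respondent's hypothetical Likert responses on the -2..+2 coding
# (favourable attitudes score positive, unfavourable negative).
item_scores = {
    "quality of the food": 2,
    "cleanliness of the hostel": 1,
    "amenities provided by the management": 0,
    "training time intervals": -1,
    "satisfaction with the appraisal system": 2,
}

# The summated score: range is -10 (most unfavourable) to +10.
total = sum(item_scores.values())
print("summated score:", total)  # summated score: 4
```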

  • Semantic Differential Scale

    This type of scale makes extensive use of words rather than numbers. Respondents describe their feelings about products or brands on scales with semantic labels. When bipolar adjectives are used at the end points of the scales, these are termed semantic differential scales.

  • Examples of Semantic Differential Scales

    Extremely / Quite / Slightly / Neither / Slightly / Quite / Extremely

    Good      _ _ _ _ _ _ _  Bad
    Important _ _ _ _ _ _ _  Unimportant
    High      _ _ _ _ _ _ _  Low
    Strong    _ _ _ _ _ _ _  Weak
    Active    _ _ _ _ _ _ _  Passive

  • Numerical/Multiple Rating List Scale

    The numerical/multiple rating list scale is a commonly used variation of the linear numeric scale, but there is an important distinction: with the linear numeric scale, the respondent has to pick a number from the scale and then write that number beside the item.

  • Examples of Numerical/Multiple Rating List Scale

    Rate each investment from 1 (Extremely safe) to 7 (Extremely unsafe):

    Savings account          1 2 3 4 5 6 7
    Loan saving account      1 2 3 4 5 6 7
    Certificate of deposit   1 2 3 4 5 6 7
    Corporate common stocks  1 2 3 4 5 6 7
    Precious metals          1 2 3 4 5 6 7

  • STAPEL SCALE

    This scale simultaneously measures both the direction and the intensity of the attitude toward the item under study. The characteristic of interest is placed at the center, with a numerical scale ranging, say, from +3 to -3 or +5 to -5, on either side of the item. This gives an idea of how close to or distant from the stimulus the individual's response is. Since this scale does not have an absolute zero point, it is an interval scale.

  • Example of Stapel Scale

  • FIXED OR CONSTANT SUM SCALE

    Respondents are asked to distribute a given number of points across various items. This is an ordinal scale.

  • Example of Fixed or Constant Sum Scale

  • GRAPHIC RATING SCALE

    A graphic rating scale helps respondents indicate their answer to a particular question by placing a mark at the appropriate point on a line. This is an ordinal scale. The faces scale, which shows faces ranging from smiling to sad, is also a graphic scale, used to obtain responses regarding people's feelings.

  • EXAMPLE GRAPHIC RATING SCALE

    On a scale of 1 to 10, how would you rate your supervisor?

    1 5 10

  • Ranking Scale

    In ranking scales, participants directly compare two or more objects and make choices among them.

    Ranking scales used in organizations:
    1. Paired comparison
    2. Forced choice
    3. Comparative scale

  • PAIRED COMPARISON

    This is used when, among a small number of objects, respondents are asked to choose between two objects at a time.

    As the number of objects to be compared increases, so does the number of paired comparisons. The number of paired choices for n objects is n(n - 1) / 2.
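The pair-count formula can be checked directly by enumerating the pairs; the four objects below are hypothetical:

```python
from itertools import combinations

# Four hypothetical objects to be compared two at a time.
brands = ["Onida", "Samsung", "LG", "Sony"]
n = len(brands)

# Enumerate every distinct pair and compare the count with the
# slide's formula n * (n - 1) / 2.
pairs = list(combinations(brands, 2))
print(len(pairs), "==", n * (n - 1) // 2)  # 6 == 6
print(pairs[0])  # ('Onida', 'Samsung')
```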

  • Examples of Paired Comparison

  • FORCED CHOICE

    This enables respondents to rank objects relative to one another among the alternatives provided. It is easier for respondents, particularly if the number of choices to be ranked is limited.

  • Example of Forced Ranking Scale

  • COMPARATIVE SCALE

    This provides a benchmark or a point of reference to assess attitudes toward the current object, event, or situation under study.

    EXAMPLE OF COMPARATIVE SCALE

    In a volatile financial environment, compared to stocks, how wise or useful is it to invest in Treasury bonds? Circle the appropriate response.

    More Useful     About the Same     Less Useful
       1        2         3         4        5

  • GOODNESS OF DATA

    It is important to make sure that the instrument we develop to measure a particular concept is indeed accurately measuring the variable, and that, in fact, we are actually measuring the concept we set out to measure. This ensures that in operationally defining perceptual and attitudinal variables, we have not overlooked some important dimensions and elements or included some irrelevant ones.

  • Item analysis

    Item analysis is carried out to see whether the items in the instrument belong there or not. Each item is examined for its ability to discriminate between those subjects whose total scores are high and those with low scores.

    Thereafter, tests for the reliability of the instrument are carried out and the validity of the measure is established.

    Reliability is a test of how consistently a measuring instrument measures whatever concept it is measuring.

    Validity is a test of how well an instrument that is developed measures the particular concept it is intended to measure. In other words, validity is concerned with whether we measure the right concept, and reliability with the stability and consistency of measurement.
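One common way to carry out the discrimination check described above (the slides do not prescribe a specific method) is to compare each item's mean between high-total and low-total subjects; the scores here are hypothetical:

```python
import statistics

# Item analysis sketch: does each item discriminate between subjects
# with high and low total scores? Each row is one subject's scores
# on three items (hypothetical data).
subjects = [
    [5, 4, 3],
    [4, 5, 2],
    [2, 1, 3],
    [1, 2, 2],
]

# Split subjects into high and low groups at the median total score.
totals = [sum(row) for row in subjects]
median_total = statistics.median(totals)
high = [row for row, t in zip(subjects, totals) if t > median_total]
low = [row for row, t in zip(subjects, totals) if t <= median_total]

# Discrimination index per item: mean(high group) - mean(low group).
disc = []
for i in range(3):
    d = (statistics.mean(r[i] for r in high)
         - statistics.mean(r[i] for r in low))
    disc.append(d)
    print(f"item {i + 1}: discrimination = {d:+.2f}")
# Item 3 scores 0.00: it fails to discriminate and is a candidate
# for removal from the instrument.
```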

  • TESTING GOODNESS OF MEASURES

    Goodness of data

    1. Reliability (accuracy in measurement)
       Stability: test-retest reliability, parallel-form reliability
       Consistency: interitem consistency reliability, split-half reliability

    2. Validity (we are measuring the right thing)
       Logical validity (content): face validity
       Criterion-related validity: predictive, concurrent
       Congruent validity (construct): convergent, discriminant

  • Types of validity tests

    1. Content validity
    2. Criterion-related validity
    3. Construct validity

  • CONTENT VALIDITY

    Content validity ensures that the measure includes an adequate and representative set of items that tap the concept. The more the scale items represent the domain or universe of the concept being measured, the greater the content validity.

    Face validity is considered by some as a basic and very minimum index of content validity. Face validity indicates that the items intended to measure a concept do, on the face of it, look like they measure the concept.

    It is the simplest and least scientific form of validity: it is demonstrated when a measure superficially appears to measure what it claims to measure.

    It is based on subjective judgment and difficult to quantify (e.g. intelligence and reasoning questions on an IQ test). One problem is that participants can use the face validity of items to change their answers.

  • CRITERION-RELATED VALIDITY

    Criterion-related validity is established when the measure differentiates individuals on a criterion it is expected to predict. This can be done by establishing concurrent validity or predictive validity.

    CONCURRENT VALIDITY: established when the scale discriminates individuals who are known to be different; that is, they should score differently on the instrument (e.g. a new IQ test correlates with an older IQ test).

    PREDICTIVE VALIDITY: indicates the ability of the measuring instrument to differentiate among individuals with reference to a future criterion (e.g. high scores on a need-for-achievement test predict competitive behavior in children in a ring-toss game).

  • CONSTRUCT VALIDITY

    Construct validity testifies to how well the results obtained from the use of the measure fit the theories around which the test is designed. This is assessed through convergent and discriminant validity.

    CONVERGENT VALIDITY: established when the scores obtained with two different instruments measuring the same concept are highly correlated (e.g. an experimenter's observations of aggressive behavior in children correlate with teachers' ratings of their behavior).

    DISCRIMINANT VALIDITY: established when, based on theory, two variables are predicted to be uncorrelated, and the scores obtained by measuring them are indeed empirically found to be so (e.g. aggressive behavior and general activity level in children).

  • Convergent validity, divergent validity and construct validity

    By demonstrating strong convergent validity for each of two different constructs and then showing divergent validity between the two constructs, you obtain strong construct validity for both constructs.

    Aggressive behavior: teachers' ratings and experimenter's observations give related scores, showing high convergent validity.

    Active behavior: teachers' ratings and experimenter's observations give related scores, showing high convergent validity.

    Between aggressive behavior and active behavior: unrelated scores, showing high divergent validity.

  • TYPES OF VALIDITY

    Content validity: Does the measure adequately measure the concept? Method: judgmental; panel evaluation with content validity ratio.

    Face validity: Do experts validate that the instrument measures what its name suggests it measures?

    Criterion-related validity: Does the measure differentiate in a manner that helps to predict a criterion variable? Method: correlation.

    Concurrent validity: Does the measure differentiate in a manner that helps to predict a criterion variable currently? Method: correlation.

    Predictive validity: Does the measure differentiate individuals in a manner that helps to predict a future criterion? Method: correlation.

    Construct validity: Does the instrument tap the concept as theorized? Method: judgmental; correlation of the proposed test with an established one; convergent-discriminant techniques; factor analysis; multitrait-multimethod analysis.

    Convergent validity: Do two instruments measuring the concept correlate highly?

    Discriminant validity: Does the measure have a low correlation with a variable that is supposed to be unrelated to this variable?

  • RELIABILITY

    The reliability of a measure indicates the extent to which it is without bias (error free) and hence ensures consistent measurement across time and across the various items in the instrument. In other words, the reliability of a measure is an indication of the stability and consistency with which the instrument measures the concept, and it helps to assess the goodness of a measure.

  • TEST-RETEST RELIABILITY

    The reliability coefficient obtained by repeating the same measure on a second occasion is called test-retest reliability. That is, when a questionnaire containing items that are supposed to measure a concept is administered to a set of respondents now, and again to the same respondents, say, several weeks to six months later, the correlation between the scores obtained at the two different times from the same set of respondents is called the test-retest reliability.
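Test-retest reliability is simply the correlation between the two administrations; a sketch with hypothetical scores:

```python
# Test-retest reliability sketch: Pearson correlation between the
# scores of the same respondents on two occasions (hypothetical data).
def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

time_1 = [12, 15, 9, 20, 14, 11]   # scores at the first administration
time_2 = [13, 14, 10, 19, 15, 10]  # same respondents, weeks later

# A coefficient near 1 indicates a stable measure over time.
print(round(pearson(time_1, time_2), 3))
```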

  • PARALLEL-FORM RELIABILITY

    When responses on two comparable sets of measures tapping the same construct are highly correlated, we have parallel-form reliability. Both forms have similar items and the same response format, the only changes being the wording and the order or sequence of the questions.

  • INTERNAL CONSISTENCY OF MEASURES

    The internal consistency of a measure is indicative of the homogeneity of the items in the measure that tap the construct. In other words, the items should hang together as a set and be capable of independently measuring the same concept, so that respondents attach the same overall meaning to each of the items.

    Consistency can be examined through:

    1. Inter-item consistency reliability
    2. Split-half reliability

  • INTER-ITEM CONSISTENCY RELIABILITY

    This is a test of the consistency of respondents' answers to all the items in a measure. To the degree that items are independent measures of the same concept, they will be correlated with one another.
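Inter-item consistency is commonly quantified with Cronbach's alpha, though the slides do not name a coefficient; a minimal sketch with hypothetical scores:

```python
import statistics

# Cronbach's alpha sketch for inter-item consistency. Rows are
# subjects, columns are items; the data are hypothetical.
scores = [
    [4, 5, 4],
    [3, 3, 4],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 3],
]

k = len(scores[0])  # number of items

# alpha = k / (k - 1) * (1 - sum(item variances) / variance of totals)
item_vars = [statistics.pvariance(col) for col in zip(*scores)]
total_var = statistics.pvariance([sum(row) for row in scores])
alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)

print(round(alpha, 3))  # values near 1 indicate items that hang together
```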

  • SPLIT-HALF RELIABILITY

    Split-half reliability reflects the correlation between two halves of an instrument. The estimates vary depending on how the items in the measure are split into two halves.
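A sketch of split-half reliability with hypothetical odd-half and even-half totals; the Spearman-Brown step-up at the end is standard practice, though it is not named in the slides:

```python
# Split-half reliability sketch: correlate the two halves of an
# instrument, then step the coefficient up to full-test length with
# the Spearman-Brown formula.
def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Hypothetical subject totals on odd-numbered vs even-numbered items.
odd_half = [9, 7, 12, 5, 10]
even_half = [8, 8, 11, 6, 9]

r_half = pearson(odd_half, even_half)
full_length = 2 * r_half / (1 + r_half)   # Spearman-Brown step-up
print(round(r_half, 3), round(full_length, 3))
```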

  • THANK YOU FOR YOUR ATTENTION