definition of assessing terms

  • 8/13/2019 definition of assessing terms

    1/11

    1. Population

    Population in research is generally a large collection of individuals or objects that is the

    main focus of a scientific query. It is for the benefit of the population that research is

    done. However, because populations are often very large, researchers usually cannot test every

    individual in the population; doing so is too expensive and time-consuming. This is the

    reason why researchers rely on sampling techniques. A population is also known as a well-defined collection

    of individuals or objects known to have similar characteristics. All individuals or objects

    within a certain population usually have a common, binding characteristic or trait.

    2. Sampling

    A sample is a subset of the population being studied. It represents the larger population

    and is used to draw inferences about that population. Sampling is a research technique widely

    used in the social sciences as a way to gather information about a population without

    having to measure the entire population. There are several different types and ways of

    choosing a sample from a population, from simple to complex.

    2.1 Non-probability Sampling Techniques

    Non-probability sampling is a sampling technique where the samples are gathered

    in a process that does not give all the individuals in the population equal chances

    of being selected.

    a) Reliance On Available Subjects.

    Relying on available subjects, such as stopping people on a street corner as

    they pass by, is one method of sampling, although it is extremely risky and

    comes with many cautions. This method, sometimes referred to as a

    convenience sample, does not allow the researcher to have any control over

    the representativeness of the sample. It is only justified if the researcher wants

    to study the characteristics of people passing by the street corner at a certain

    point in time or if other sampling methods are not possible. The researcher

    must also take care not to use results from a convenience sample to

    generalize to a wider population.

    b) Purposive or Judgmental Sample.

    A purposive, or judgmental, sample is one that is selected based on the

    knowledge of a population and the purpose of the study. For example, if a

    researcher is studying the nature of school spirit as exhibited at a school pep

    rally, he or she might interview people who did not appear to be caught up in

    the emotions of the crowd or students who did not attend the rally at all. In

    this case, the researcher is using a purposive sample because those being

    interviewed fit a specific purpose or description.

    c) Snowball Sample.

    A snowball sample is appropriate to use in research when the members of a

    population are difficult to locate, such as homeless individuals, migrant

    workers, or undocumented immigrants. A snowball sample is one in which the

    researcher collects data on the few members of the target population he or she

    can locate, then asks those individuals to provide information needed to locate

    other members of that population whom they know. For example, if a

    researcher wishes to interview undocumented immigrants from Mexico, he or

    she might interview a few undocumented individuals that he or she knows or

    can locate and would then rely on those subjects to help locate more

    undocumented individuals. This process continues until the researcher has all

    the interviews he or she needs or until all contacts have been exhausted.
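    The referral process described above can be sketched in code. This is only an illustration under stated assumptions: the function name snowball_sample, the referrals dictionary, and the people "A" to "F" are all hypothetical, and real fieldwork involves consent and judgment that no loop captures.

```python
from collections import deque

def snowball_sample(seeds, referrals, target_size):
    """Interview known members first, then the people they refer, in turn."""
    interviewed = []
    seen = set(seeds)      # everyone already located, to avoid repeats
    queue = deque(seeds)   # people located but not yet interviewed
    while queue and len(interviewed) < target_size:
        person = queue.popleft()
        interviewed.append(person)
        # Ask this person for other members of the population they know.
        for contact in referrals.get(person, []):
            if contact not in seen:
                seen.add(contact)
                queue.append(contact)
    return interviewed

# Hypothetical referral network: who can help locate whom.
referrals = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "E": ["F"]}

enough = snowball_sample(["A"], referrals, target_size=4)
exhausted = snowball_sample(["A"], referrals, target_size=10)
print(enough)     # ['A', 'B', 'C', 'D']
print(exhausted)  # ['A', 'B', 'C', 'D', 'E', 'F']
```

    The two calls mirror the two stopping rules in the text: the first stops when the researcher has enough interviews, the second when all contacts have been exhausted.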

    d) Quota Sample.

    A quota sample is one in which units are selected into a sample on the basis of

    pre-specified characteristics so that the total sample has the same distribution

    of characteristics assumed to exist in the population being studied. For

    example, if you are a researcher conducting a national quota sample, you might

    need to know what proportion of the population is male and what proportion

    is female as well as what proportions of each gender fall into different age

    categories, race or ethnic categories, educational categories, etc. The

    researcher would then collect a sample with the same proportions as the

    national population.
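    As a rough sketch of the arithmetic behind a quota sample, the code below converts population proportions into a target head count per category and then accepts respondents until each quota is full. The category shares, the sample size of 500, and the names quota_targets and fill_quotas are assumptions made up for this example, not figures from any real census.

```python
def quota_targets(population_shares, sample_size):
    """Turn population proportions into a head count per category."""
    return {cat: round(share * sample_size)
            for cat, share in population_shares.items()}

def fill_quotas(respondents, category_of, targets):
    """Accept respondents in arrival order until each quota is full."""
    remaining = dict(targets)
    sample = []
    for person in respondents:
        cat = category_of(person)
        if remaining.get(cat, 0) > 0:
            sample.append(person)
            remaining[cat] -= 1
    return sample

# Hypothetical population shares by (gender, age group); they sum to 1.
shares = {
    ("female", "18-34"): 0.16, ("female", "35-54"): 0.18, ("female", "55+"): 0.17,
    ("male", "18-34"): 0.15, ("male", "35-54"): 0.16, ("male", "55+"): 0.18,
}
targets = quota_targets(shares, sample_size=500)
print(targets[("female", "18-34")])  # 80
print(sum(targets.values()))         # 500

walk_ins = [("female", "18-34")] * 100   # hypothetical stream of respondents
print(len(fill_quotas(walk_ins, lambda p: p, targets)))  # 80
```

    Note that respondents are taken in whatever order they arrive, which is why a quota sample remains a non-probability technique even though its proportions match the population.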

    2.2 Probability Sampling Techniques

    Probability sampling is a sampling technique where the samples are gathered in a

    process that gives every individual in the population a known, non-zero chance of

    being selected (an equal chance, in the case of a simple random sample).

    a) Simple Random Sample.

    The simple random sample is the basic sampling method assumed in statistical

    methods and computations. To collect a simple random sample, each unit of

    the target population is assigned a number. A set of random numbers is then

    generated and the units having those numbers are included in the sample. For

    example, let's say you have a population of 1,000 people and you wish to

    choose a simple random sample of 50 people. First, each person is numbered

    1 through 1,000. Then, you generate a list of 50 random numbers (typically

    with a computer program) and those individuals assigned those numbers are

    the ones you include in the sample.
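    The example above (50 people out of 1,000) might look like this in Python. It is a minimal sketch: the function name and the seed are assumptions for illustration, and the seed is fixed only to make the run repeatable.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw sample_size distinct unit numbers from 1..population_size."""
    rng = random.Random(seed)
    # random.sample draws without replacement, so no person is picked twice.
    return rng.sample(range(1, population_size + 1), sample_size)

chosen = simple_random_sample(1000, 50, seed=42)
print(sorted(chosen))
```

    Drawing without replacement matters here: a plain list of 50 random numbers could contain repeats, which would leave the sample short of 50 distinct people.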

    b) Systematic Sample.

    In a systematic sample, the elements of the population are put into a list and

    then every kth element in the list is chosen (systematically) for inclusion in the

    sample. For example, if the population of study contained 2,000 students at a

    high school and the researcher wanted a sample of 100 students, the students

    would be put into list form and then every 20th student would be selected for

    inclusion in the sample. To ensure against any possible human bias in this

    method, the researcher should select the first individual at random. This is

    technically called a systematic sample with a random start.
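    A systematic sample with a random start can be sketched as follows, using the 2,000-student example above. The function name and seed are assumptions for illustration; the interval k is computed as population size divided by sample size, which gives 20 here.

```python
import random

def systematic_sample(units, sample_size, seed=None):
    """Pick every kth unit from the list, starting at a random position."""
    rng = random.Random(seed)
    k = len(units) // sample_size   # sampling interval (2000 // 100 = 20)
    start = rng.randrange(k)        # random start within the first interval
    return units[start::k][:sample_size]

students = list(range(1, 2001))     # students numbered 1 to 2,000
sample = systematic_sample(students, 100, seed=1)
print(sample[:3])
```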

    c) Stratified Sample.

    A stratified sample is a sampling technique in which the researcher divides

    the entire target population into different subgroups, or strata, and then

    randomly selects the final subjects proportionally from the different strata.

    This type of sampling is used when the researcher wants to highlight

    specific subgroups within the population. For example, to obtain a stratified

    sample of university students, the researcher would first organize the

    population by college class and then select appropriate numbers of freshmen,

    sophomores, juniors, and seniors. This ensures that the researcher has

    adequate numbers of subjects from each class in the final sample.
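    Proportional selection from strata can be sketched as below. The class sizes, the sample size of 200, and the function name are hypothetical. Each stratum's share of the sample is rounded to a whole number, so with less convenient figures the total could differ from the requested size by a unit or two.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, sample_size, seed=None):
    """Randomly select from each stratum in proportion to its size."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        # Proportional allocation, rounded to the nearest whole number.
        n = round(sample_size * len(members) / len(units))
        sample.extend(rng.sample(members, n))
    return sample

# Hypothetical university with 4,000 students across four classes.
class_sizes = {"freshman": 1200, "sophomore": 1000, "junior": 1000, "senior": 800}
students = [(cls, i) for cls, size in class_sizes.items() for i in range(size)]

sample = stratified_sample(students, lambda s: s[0], 200, seed=7)
counts = {cls: sum(1 for s in sample if s[0] == cls) for cls in class_sizes}
print(counts)  # {'freshman': 60, 'sophomore': 50, 'junior': 50, 'senior': 40}
```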

    d) Cluster Sample.

    Cluster sampling may be used when it is either impossible or impractical to

    compile an exhaustive list of the elements that make up the target population.

    Usually, however, the population elements are already grouped into

    subpopulations and lists of those subpopulations already exist or can be

    created. For example, let's say the target population in a study was church

    members in the United States. There is no list of all church members in the

    country. The researcher could, however, create a list of churches in the United

    States, choose a sample of churches, and then obtain lists of members from

    those churches.
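    The church example can be sketched as a one-stage cluster sample: pick clusters at random, then take the member lists of the chosen clusters. The church names and member identifiers below are invented for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose clusters, then collect every member of each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    members = [m for name in chosen for m in clusters[name]]
    return chosen, members

# Hypothetical list of churches and their membership lists.
churches = {
    "Church A": ["a1", "a2", "a3"],
    "Church B": ["b1", "b2"],
    "Church C": ["c1", "c2", "c3", "c4"],
    "Church D": ["d1"],
}
chosen, members = cluster_sample(churches, 2, seed=3)
print(chosen, members)
```

    Only the list of churches is needed up front; the member lists are obtained after the churches are chosen, which is the practical advantage the paragraph describes.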

    3. Instrument

    Instrument is the generic term that researchers use for a measurement device (survey,

    test, questionnaire, etc.). To help distinguish between instrument and instrumentation,

    consider that the instrument is the device and instrumentation is the course of action (the

    process of developing, testing, and using the device). Instruments fall into two broad

    categories, researcher-completed and subject-completed, distinguished by those

    instruments that researchers administer versus those that are completed by participants.

    Researchers choose which type of instrument, or instruments, to use based on the research

    question.

    3.1 Validity

    Validity is described as the degree to which a research study measures what it

    intends to measure. There are two main types of validity, internal and

    external. Internal validity refers to the validity of the measurement and test itself,

    whereas external validity refers to the ability to generalize the findings to the

    target population. Both are very important in analysing the appropriateness,

    meaningfulness and usefulness of a research study. However, here I will focus on

    the validity of the measurement technique (i.e. internal validity). There are four main

    types of validity used when assessing internal validity. Each type views validity

    from a different perspective and evaluates different relationships between

    measurements.

    a) Face validity

    This refers to whether a technique looks as if it should measure the variable it

    intends to measure. For example, a method where a participant is required to

    click a button as soon as a stimulus appears and this time is measured appears

    to have face validity for measuring reaction time. An example of analysing

    research for face validity by Hardesty and Bearden (2004) can be found here.

    b) Concurrent validity

    This compares the results from a new measurement technique to those of a

    more established technique that claims to measure the same variable to see if

    they are related. Often two measurements will behave in the same way, but

    are not necessarily measuring the same variable, therefore this kind of validity

    must be examined thoroughly. An example and some weaknesses associated

    with this type of validity can be found here (Shuttleworth, 2009).

    c) Predictive validity

    This is when the results obtained from measuring a construct can be

    accurately used to predict behaviour. There are obvious limitations to this as

    Links cited above:

    Hardesty and Bearden (2004): http://0-www.sciencedirect.com.unicat.bangor.ac.uk/science?_ob=MiamiImageURL&_cid=271680&_user=899436&_pii=S0148296301002958&_check=y&_origin=browse&_zone=rslt_list_item&_coverDate=2004-02-29&wchp=dGLbVlt-zSkWz&md5=0e78e0ee72be0f0f4c294a0cc3d00a7a/1-s2.0-S0148296301002958-main.pdf

    Shuttleworth (2009): http://www.experiment-resources.com/concurrent-validity.html
    Example: If you wanted to evaluate the reliability of a critical thinking

    assessment, you might create a large set of items that all pertain to critical

    thinking and then randomly split the questions up into two sets, which would

    represent the parallel forms.

    c) Inter-rater reliability

    Inter-rater reliability is a measure of reliability used to assess the degree to

    which different judges or raters agree in their assessment decisions. Inter-

    rater reliability is useful because human observers will not necessarily

    interpret answers the same way; raters may disagree as to how well certain

    responses or material demonstrate knowledge of the construct or skill being

    assessed.

    Example: Inter-rater reliability might be employed when different judges are

    evaluating the degree to which art portfolios meet certain standards. Inter-

    rater reliability is especially useful when judgments can be considered

    relatively subjective. Thus, the use of this type of reliability would probably

    be more likely when evaluating artwork as opposed to math problems.
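    One way to put a number on how far two judges agree is sketched below. Percent agreement is the simplest measure; Cohen's kappa, which corrects for agreement expected by chance, is a commonly used statistic that this text does not itself name, so treat its use here as an assumption. The portfolio ratings are invented.

```python
from collections import Counter

def percent_agreement(r1, r2):
    """Share of cases on which the two raters gave the same rating."""
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)

def cohens_kappa(r1, r2):
    """Agreement corrected for chance (undefined if chance agreement is 1)."""
    n = len(r1)
    p_observed = percent_agreement(r1, r2)
    c1, c2 = Counter(r1), Counter(r2)
    # Chance agreement: product of each rater's marginal proportions.
    p_expected = sum(c1[c] * c2[c] for c in c1.keys() | c2.keys()) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical ratings of eight art portfolios by two judges.
judge_a = ["meets", "meets", "below", "meets", "below", "meets", "below", "meets"]
judge_b = ["meets", "below", "below", "meets", "below", "meets", "meets", "meets"]

print(percent_agreement(judge_a, judge_b))       # 0.75
print(round(cohens_kappa(judge_a, judge_b), 3))  # 0.467
```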

    d) Internal consistency reliability

    Internal consistency reliability is a measure of reliability used to evaluate the

    degree to which different test items that probe the same construct produce

    similar results. Average inter-item correlation is a subtype of internal

    consistency reliability. It is obtained by taking all of the items on a test that

    probe the same construct (e.g., reading comprehension), determining the

    correlation coefficient for each pair of items, and finally taking the average of

    all of these correlation coefficients. This final step yields the average inter-

    item correlation. Split-half reliability is another subtype of internal

    consistency reliability. The process of obtaining split-half reliability is begun

    by splitting in half all items of a test that are intended to probe the same area

    of knowledge (e.g., World War II) in order to form two sets of

    items. The entire test is administered to a group of individuals, the total score

    for each set is computed, and finally the split-half reliability is obtained by

    determining the correlation between the two total set scores.
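    Both subtypes described above can be sketched with a hand-written Pearson correlation. The item scores are invented, and the odd/even split used for the two halves is an assumption, since the text says only that the items are split in half.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def average_inter_item_correlation(items):
    """Correlate every pair of items, then average the coefficients."""
    pairs = list(combinations(items, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)

def split_half_reliability(items):
    """Correlate total scores on two halves of the items (odd/even split)."""
    half_a, half_b = items[0::2], items[1::2]
    totals_a = [sum(scores) for scores in zip(*half_a)]  # one total per person
    totals_b = [sum(scores) for scores in zip(*half_b)]
    return pearson(totals_a, totals_b)

# Hypothetical scores: four items (rows) answered by five people (columns).
items = [
    [4, 3, 5, 2, 4],
    [5, 3, 4, 2, 4],
    [4, 2, 5, 1, 3],
    [5, 3, 5, 2, 3],
]
print(round(average_inter_item_correlation(items), 3))
print(round(split_half_reliability(items), 3))
```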

    4. Measuring Scale

    Statistical information, including numbers and sets of numbers, has specific qualities that

    are of interest to researchers. These qualities, including magnitude, equal intervals, and

    absolute zero, determine what scale of measurement is being used and therefore what

    statistical procedures are best. Magnitude refers to the ability to know if one score is

    greater than, equal to, or less than another score. Equal intervals means that the possible

    scores are each an equal distance from each other. And finally, absolute zero refers to a

    point where none of the scale exists or where a score of zero can be assigned.

    When we combine these three scale qualities, we can determine that there are four scales

    of measurement. The lowest level is the nominal scale, which represents only names and

    therefore has none of the three qualities. A list of students in alphabetical order, a list of

    favorite cartoon characters, or the names on an organizational chart would all be

    classified as nominal data. The second level, called ordinal data, has magnitude only, and

    can be looked at as any set of data that can be placed in order from highest to lowest but

    where there is no absolute zero and no equal intervals. Examples of this type of scale

    would include Likert Scales and the Thurstone Technique.

    The third type of scale is called an interval scale, and possesses both magnitude and equal

    intervals, but no absolute zero. Temperature in degrees Celsius or Fahrenheit is a classic

    example of an interval scale because we know that each degree is the same distance apart and

    we can easily tell if one temperature is greater than, equal to, or less than another. These

    scales, however, have no absolute zero: their zero points are arbitrary, so a reading of

    0 degrees does not mean that no temperature exists.

    Finally, the fourth and highest scale of measurement is called a ratio scale. A ratio scale

    contains all three qualities and is often the scale that statisticians prefer because the data

    can be more easily analyzed. Age, height, weight, and scores on a 100-point test would

    all be examples of ratio scales. If you are 20 years old, you not only know that you are

    older than someone who is 15 years old (magnitude) but you also know that you are five

    years older (equal intervals). With a ratio scale, we also have a point where none of the

    scale exists; when a person is born his or her age is zero.

    Scales of Measurement

    Scale Level   Scale of Measurement   Scale Qualities                             Example(s)
    4             Ratio                  Magnitude, Equal Intervals, Absolute Zero   Age, Height, Weight, Percentage
    3             Interval               Magnitude, Equal Intervals                  Temperature
    2             Ordinal                Magnitude                                   Likert Scale, Anything rank ordered
    1             Nominal                None                                        Names, Lists of words
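    A short worked example may help show what the absence of an absolute zero means in practice. It assumes temperatures in degrees Celsius (an interval scale) and converts them to kelvin, a scale whose zero is a true zero; the conversion constant 273.15 is standard, while the two readings are invented. Differences survive the change of scale, but ratios do not.

```python
def celsius_to_kelvin(c):
    """Convert degrees Celsius to kelvin (zero kelvin is a true zero)."""
    return c + 273.15

warm_c, cool_c = 20.0, 10.0

# Differences are meaningful on an interval scale: same gap on both scales.
diff_c = warm_c - cool_c
diff_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cool_c)
print(diff_c, round(diff_k, 2))     # 10.0 10.0

# Ratios are not: "twice as hot" in Celsius disappears in kelvin.
ratio_c = warm_c / cool_c
ratio_k = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)
print(ratio_c, round(ratio_k, 3))   # 2.0 1.035
```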

    5. Parametric and Non Parametric Data

    Several fundamental statistical concepts are helpful prerequisite knowledge for fully

    understanding the terms parametric and nonparametric. These statistical

    fundamentals include random variables, probability distributions, parameters, population,

    sample, sampling distributions and the Central Limit Theorem. I cannot explain these

    topics in a few paragraphs, as they would usually comprise two or three chapters in a

    statistics textbook. Thus, I will limit my explanation to a few helpful (I hope) links

    among terms.

    The field of statistics exists because it is usually impossible to collect data

    from all individuals of interest (population). Our only solution is to collect data from a

    subset (sample) of the individuals of interest, but our real desire is to know the truth

    about the population. Quantities such as means, standard deviations and proportions are

    all important values and are called parameters when we are talking about a population.

    Since we usually cannot get data from the whole population, we cannot know the values

    of the parameters for that population. We can, however, calculate estimates of these

    quantities for our sample. When they are calculated from sample data, these quantities are

    called statistics. A statistic estimates a parameter. Parametric statistical procedures rely

    on assumptions about the shape of the distribution (e.g., a normal distribution) in

    the underlying population and about the form or parameters (e.g., means and standard

    deviations) of the assumed distribution. Nonparametric statistical procedures rely on no

    or few assumptions about the shape or parameters of the population distribution from

    which the sample was drawn.
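    The relationship between a parameter and a statistic can be illustrated with a simulated population. Everything here is an assumption made for the sketch: the population of 10,000 heights, its mean of 170 and standard deviation of 10, the sample size of 100, and the seed. In real research the population mean is usually unknown, which is the whole point of estimating it.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical population: 10,000 heights in centimetres.
population = [rng.gauss(170, 10) for _ in range(10_000)]
parameter = statistics.mean(population)   # population mean (a parameter)

# In practice we only see a sample and compute the same quantity from it.
sample = rng.sample(population, 100)
statistic = statistics.mean(sample)       # sample mean (a statistic)

print(round(parameter, 2), round(statistic, 2))
```

    The two printed values will be close but not identical; the gap between them is sampling error.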

                             Parametric                  Non-parametric
    Assumed distribution     Normal                      Any
    Assumed variance         Homogeneous                 Any
    Typical data             Ratio or Interval           Ordinal or Nominal
    Data set relationships   Independent                 Any
    Usual central measure    Mean                        Median
    Benefits                 Can draw more conclusions   Simplicity; Less affected by outliers
    Choosing test            Choosing parametric test    Choosing a non-parametric test
    Correlation test         Pearson                     Spearman
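    The last row of the table can be illustrated by computing both correlation tests on the same data. This is a sketch with hand-written functions and invented data: Spearman's coefficient is calculated here as the Pearson correlation of the ranks, with tied values given the average of their ranks.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation: measures linear association between raw values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """Rank from 1 (smallest); tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical data: y always rises with x, but not in a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(round(pearson(x, y), 3))   # 0.981
print(round(spearman(x, y), 3))  # 1.0
```

    Because Spearman uses only the order of the values, it reports a perfect relationship here, while Pearson is pulled below 1 by the curve in the raw data.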

    6.