SAMPLING SURVEY TECHNIQUE

© aSup-2011


POPULATIONS and SAMPLES

THE POPULATION is the set of all the individuals of interest in a particular study

THE SAMPLE is a set of individuals selected from a population, usually intended to represent the population in a research study

The sample is selected from the population

The results from the sample are generalized to the population

Data collection techniques

Data collection
○ Census (population)
○ Sampling (sample): probability or non-probability

PARAMETER and STATISTIC

A parameter is a value, usually a numerical value, that describes a population. A parameter may be obtained from a single measurement, or it may be derived from a set of measurements from the population

A statistic is a value, usually a numerical value, that describes a sample. A statistic may be obtained from a single measurement, or it may be derived from a set of measurements from the sample

SAMPLING ERROR

Although samples are generally representative of their population, a sample is not expected to give a perfectly accurate picture of the whole population

There usually is some discrepancy between a sample statistic and the corresponding population parameter; this discrepancy is called sampling error
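A minimal sketch of the idea, assuming a made-up population of 1000 scores (the population, the seed, and the sample size of 40 are illustrative assumptions, not taken from the slides). It computes a parameter, a statistic, and the sampling error between them.

```python
import random
from statistics import mean

# Hypothetical population of 1000 scores (illustrative only).
rng = random.Random(2011)  # fixed seed so the sketch is reproducible
population = [rng.randint(20, 70) for _ in range(1000)]

# The parameter describes the whole population.
parameter = mean(population)

# The statistic describes one sample selected from that population.
sample = rng.sample(population, 40)
statistic = mean(sample)

# Sampling error: the discrepancy between the statistic and the parameter.
sampling_error = statistic - parameter
print(f"parameter={parameter:.2f} statistic={statistic:.2f} error={sampling_error:+.2f}")
```

Drawing a different sample (for example by changing the seed) would usually give a different statistic and therefore a different sampling error.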

TWO KINDS OF NUMERICAL DATA

Numerical data generally fall into two major categories:

1. Counted frequencies: enumeration data

2. Measured metric or scale values: measurement or metric data

Statistical procedures deal with both kinds of data

DATUM and DATA

The measurement or observation obtained for each individual is called a datum or, more commonly, a score or raw score

The complete set of scores or measurements is called the data set or simply the data

After data are obtained, statistical methods are used to organize and interpret the data

VARIABLE

A variable is a characteristic or condition that changes or has different values for different individuals

A constant is a characteristic or condition that does not vary but is the same for every individual

Example: in a research study comparing vocabulary skills for 12-year-old boys, vocabulary skill is a variable, while age and gender are constants

QUALITATIVE and QUANTITATIVE Categories

Qualitative: the classes of objects are different in kind. There is no reason for saying that one is greater or less, higher or lower, better or worse than another.

Quantitative: the groups can be ordered according to quantity or amount. It may be that the cases vary continuously along a continuum which we recognize.

DISCRETE and CONTINUOUS Variables

A discrete variable consists of separate, indivisible categories. No values can exist between two neighboring categories.

A continuous variable is divisible into an infinite number of fractional parts
○ It should be very rare to obtain identical measurements for two different individuals
○ Each measurement category is actually an interval that must be defined by boundaries called real limits

CONTINUOUS Variables

Most interval-scale measurements are taken to the nearest unit (foot, inch, cm, mm) depending upon the fineness of the measuring instrument and the accuracy we demand for the purposes at hand.

And so it is with most psychological and educational measurement. A score of 48 means from 47.5 to 48.5

We assume that a score is never a point on the scale, but occupies an interval from a half unit below to a half unit above the given number.

FREQUENCIES, PERCENTAGES, PROPORTIONS, and RATIOS

Frequency is defined as the number of objects or events in a category.

Percentage (P) is defined as the number of objects or events in a category divided by the total number of objects or events, multiplied by 100.

Proportion (p): whereas with percentages the base is 100, with proportions the base or total is 1.0

Ratio is a fraction. The ratio of a to b is the fraction a/b. A proportion is a special ratio, the ratio of a part to a total.
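The four quantities can be sketched on a small set of enumeration data. The handedness counts below are hypothetical and serve only to illustrate the definitions.

```python
from collections import Counter

# Hypothetical nominal data (not from the slides): handedness of 20 people.
observations = ["right"] * 15 + ["left"] * 4 + ["both"] * 1

frequency = Counter(observations)   # number of cases in each category
total = sum(frequency.values())     # total number of cases, here 20

# Proportions use a base of 1.0, percentages a base of 100.
proportion = {k: f / total for k, f in frequency.items()}
percentage = {k: 100 * f / total for k, f in frequency.items()}

# A ratio compares any two quantities, here right-handers to left-handers.
ratio_right_to_left = frequency["right"] / frequency["left"]

for category in frequency:
    print(category, frequency[category], proportion[category], percentage[category])
```

A proportion is the special case of a ratio in which the denominator is the total.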


MEASUREMENTS and SCALES (Stevens, 1946)

Nominal

Ordinal

Interval

Ratio

FREQUENCY DISTRIBUTION, GRAPH, and PERCENTILE

A class of 40 students has just received its Perceptual Speed test scores. Aside from the primary question about your grade, you'd like to know more about how you stand in the class

How does your score compare with others in the class? What was the range of performance?

What more can you learn by studying the scores?

Score of PERCEPTUAL SPEED Test

29 47 45 40 48 48 49 45

40 49 48 37 48 46 55 67

36 53 25 58 33 33 43 42

32 51 48 54 47 40 38 44

46 50 28 44 52 49 56 48

Taken from Guilford p.55

OVERVIEW

When a researcher finishes the data collection phase of an experiment, the results usually consist of pages of numbers

The immediate problem for the researcher is to organize the scores into some comprehensible form so that any trend in the data can be seen easily and communicated to others

This is the job of descriptive statistics: to simplify the organization and presentation of data

One of the most common procedures for organizing a set of data is to place the scores in a FREQUENCY DISTRIBUTION

GROUPED SCORES

After we obtain a set of measurements (data), a common next step is to put them in a systematic order by grouping them in classes

With numerical data, combining individual scores often makes it easier to display the data and to grasp their meaning. This is especially true when there is a wide range of values.

TWO GENERAL CUSTOMS IN THE SIZE OF CLASS INTERVAL

1. We should prefer not fewer than 10 and not more than 20 class intervals.
○ More commonly, the number of class intervals used is 10 to 15.
○ An advantage of a small number of class intervals is that we have fewer frequencies with which to deal
○ An advantage of a larger number of class intervals is higher accuracy of computation

TWO GENERAL CUSTOMS IN THE SIZE OF CLASS INTERVAL

2. In determining the choice of class interval, certain ranges of units (scores) are preferred. Those ranges are 2, 3, 5, 10, and 20. These five interval sizes will take care of almost all sets of data



HOW TO CONSTRUCT A GROUPED FREQUENCY DISTRIBUTION

Step 1 : find the lowest score and the highest score

Step 2 : find the range by subtracting the lowest score from the highest

Step 3 : divide the range by 10 and by 20 to determine the largest and the smallest acceptable interval widths. Choose a convenient width (i) within these limits



29 47 45 40 48 48 49 45

40 49 48 37 48 46 55 67

36 53 25 58 33 33 43 42

32 51 48 54 47 40 38 44

46 50 28 44 52 49 56 48

Range = 67 - 25 = 42, so 42 / 10 = 4.2 and 42 / 20 = 2.1

WHERE TO START CLASS INTERVAL

It is natural to start the intervals with their lowest scores at multiples of the size of the interval.

When the interval is 3, start with 24, 27, 30, 33, etc.; when the interval is 4, start with 24, 28, 32, 36, etc.


HOW TO CONSTRUCT A GROUPED FREQUENCY DISTRIBUTION

Step 4 : determine the score at which the lowest interval should begin. It should ordinarily be a multiple of the interval.

Step 5 : record the limits of all class intervals, placing the interval containing the highest score value at the top. Make the intervals continuous and of the same width

Step 6 : using the tally system, enter the raw scores in the appropriate class intervals

Step 7 : convert each tally to a frequency
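The seven steps can be sketched in code on the 40 Perceptual Speed scores. This is only an illustration: the width of 4 is one of the acceptable choices between 2.1 and 4.2, and the variable names are assumptions made for the example.

```python
scores = [29, 47, 45, 40, 48, 48, 49, 45,
          40, 49, 48, 37, 48, 46, 55, 67,
          36, 53, 25, 58, 33, 33, 43, 42,
          32, 51, 48, 54, 47, 40, 38, 44,
          46, 50, 28, 44, 52, 49, 56, 48]

lowest, highest = min(scores), max(scores)   # Step 1
score_range = highest - lowest               # Step 2
width = 4                                    # Step 3: chosen between range/20 and range/10
start = (lowest // width) * width            # Step 4: multiple of the width at or below the lowest score

# Step 5: continuous class intervals of equal width.
intervals = []
lower = start
while lower <= highest:
    intervals.append((lower, lower + width - 1))
    lower += width

# Steps 6 and 7: tally the scores in each interval and convert to a frequency.
table = [(lo, hi, sum(lo <= x <= hi for x in scores)) for lo, hi in intervals]

# Print with the interval containing the highest score at the top.
for lo, hi, f in reversed(table):
    print(f"{lo} - {hi}  {f}")
```

With a width of 4 the sketch produces 11 class intervals, from 24 - 27 up to 64 - 67.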

FREQUENCY DISTRIBUTION TABLE

X max = 67, X min = 25, RANGE = 42

Interval = 3                         Interval = 4
C.i (number of class intervals) = 15 C.i (number of class intervals) = 11

SCORE                                SCORE
66 - 68                              64 - 67
63 - 65                              60 - 63
60 - 62                              56 - 59
57 - 59                              52 - 55
54 - 56                              48 - 51
51 - 53                              44 - 47
48 - 50                              40 - 43
45 - 47                              36 - 39
42 - 44                              32 - 35
39 - 41                              28 - 31
36 - 38                              24 - 27
33 - 35
30 - 32
27 - 29
24 - 26

PERCEPTUAL SPEED

SCORE      f     Xc      Lower Exact Limit   Upper Exact Limit
64 - 67    1     65.5    63.5                67.5
60 - 63    0     61.5    59.5                63.5
56 - 59    2     57.5    55.5                59.5
52 - 55    4     53.5    51.5                55.5
48 - 51    11    49.5    47.5                51.5
44 - 47    8     45.5    43.5                47.5
40 - 43    5     41.5    39.5                43.5
36 - 39    3     37.5    35.5                39.5
32 - 35    3     33.5    31.5                35.5
28 - 31    2     29.5    27.5                31.5
24 - 27    1     25.5    23.5                27.5
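The Xc and exact-limit columns follow from the score limits by extending each interval half a unit in both directions. A small sketch, where the function name `exact_limits` and the `unit` parameter are assumptions made for illustration:

```python
def exact_limits(lo, hi, unit=1):
    """Return (lower exact limit, midpoint Xc, upper exact limit) of a class interval."""
    lower = lo - unit / 2          # half a unit below the lowest score in the interval
    upper = hi + unit / 2          # half a unit above the highest score in the interval
    midpoint = (lower + upper) / 2
    return lower, midpoint, upper

# A few class intervals from the Perceptual Speed table.
for lo, hi in [(24, 27), (48, 51), (64, 67)]:
    lower, xc, upper = exact_limits(lo, hi)
    print(f"{lo} - {hi}: Xc={xc} exact limits {lower} to {upper}")
```

The same function applied to a single score, such as `exact_limits(48, 48)`, gives the interval 47.5 to 48.5 that a score of 48 is assumed to occupy.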

WARNING!!

Although a grouped frequency distribution can make data easier to interpret, some information is lost.

In the table, we can see that more people scored in the interval 48 – 51 than in any other interval

However, unless we have all the original scores to look at, we would not know whether the 11 scores in this interval were all 48s, all 49s, all 50s, or all 51s, or were spread throughout the interval in some way

This problem is referred to as GROUPING ERROR

The wider the class interval width, the greater the potential for grouping error

STEM and LEAF DISPLAY

In 1977, J.W. Tukey presented a technique for organizing data that provides a simple alternative to a frequency distribution table or graph

This technique, called a stem and leaf display, requires that each score be separated into two parts.

The first digit (or digits) is called the stem, and the last digit (or digits) is called the leaf.

Data

83 82 63
62 93 78
71 68 33
76 52 97
85 42 46
32 57 59
56 73 74
74 81 76

Stem & Leaf Display

3 | 2 3
4 | 2 6
5 | 6 2 7 9
6 | 2 8 3
7 | 1 6 4 3 8 4 6
8 | 3 5 2 1
9 | 3 7
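A stem and leaf display for these 24 scores can be built with a few lines of code. This is a sketch only; unlike the slide, it sorts the leaves within each stem, which is a presentation choice made here and not part of the technique.

```python
from collections import defaultdict

data = [83, 82, 63, 62, 93, 78, 71, 68, 33, 76, 52, 97,
        85, 42, 46, 32, 57, 59, 56, 73, 74, 74, 81, 76]

display = defaultdict(list)
for score in data:
    stem, leaf = divmod(score, 10)   # tens digit is the stem, units digit is the leaf
    display[stem].append(leaf)

for stem in sorted(display):
    leaves = " ".join(str(leaf) for leaf in sorted(display[stem]))
    print(f"{stem} | {leaves}")
```

Because every leaf is kept, the original scores can be recovered from the display, which is not possible from a grouped frequency table.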

GROUPED FREQUENCY DISTRIBUTION HISTOGRAM AND A STEM AND LEAF DISPLAY

[Figure: the stem and leaf display turned on its side, shown with a histogram of the same scores; X axis from 30 to 90, frequency axis from 1 to 7]

MAKING GRAPH: POLYGON and HISTOGRAM

MAKING GRAPH: POLYGON


POLYGON

[Figure: frequency polygon of the PERCEPTUAL SPEED scores; X axis: class interval MIDPOINT, from 21.5 to 69.5; vertical axis: frequency f, from 0 to 12]

MAKING GRAPH: HISTOGRAM


HISTOGRAM

[Figure: histogram of the Perceptual Speed scores; X axis: class interval EXACT LIMIT, from 23.5 to 67.5; vertical axis: frequency f, from 0 to 12]

POLYGON and HISTOGRAM

[Figure: the frequency polygon drawn over the histogram on the same axes]

THE SHAPE OF A FREQUENCY DISTRIBUTION

Symmetrical: it is possible to draw a vertical line through the middle so that one side of the distribution is a mirror image of the other

Skewed: the scores tend to pile up toward one end of the scale and taper off gradually at the other end
○ Positive: the tail points toward the high end of the scale
○ Negative: the tail points toward the low end of the scale

LEARNING CHECK

Describe the shape of the distribution for the data in the following table

X    f
5    4
4    6
3    3
2    1
1    1

The distribution is negatively skewed

PERCENTILES and PERCENTILE RANKS

The percentile system is widely used in educational measurement to report the standing of an individual relative to the performance of a known group. It is based on the cumulative percentage distribution.

A percentile is a point on the measurement scale below which a specified percentage of the cases in the distribution falls

The rank or percentile rank of a particular score is defined as the percentage of individuals in the distribution with scores at or below the particular value

When a score is identified by its percentile rank, the score is called a percentile

Suppose, for example, that A has a score of X = 78 on an exam and we know that exactly 60% of the class had scores of 78 or lower

Then A's score X = 78 has a percentile rank of 60%, and A's score would be called the 60th percentile

Percentile Rank refers to a percentage

Percentile refers to a score
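Using the "at or below" definition, a percentile rank can be computed directly from raw scores. The sketch below applies it to the 40 Perceptual Speed scores; the function name `percentile_rank` is an assumption made for illustration, and the sketch ignores interpolation within exact limits.

```python
def percentile_rank(scores, x):
    """Percentage of scores in the distribution that are at or below x."""
    at_or_below = sum(1 for s in scores if s <= x)
    return 100 * at_or_below / len(scores)

scores = [29, 47, 45, 40, 48, 48, 49, 45,
          40, 49, 48, 37, 48, 46, 55, 67,
          36, 53, 25, 58, 33, 33, 43, 42,
          32, 51, 48, 54, 47, 40, 38, 44,
          46, 50, 28, 44, 52, 49, 56, 48]

rank_of_48 = percentile_rank(scores, 48)   # 28 of the 40 scores are 48 or lower
print(rank_of_48)
```

Under this definition a score of 48 has a percentile rank of 70%, so 48 would be called the 70th percentile of this class.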


CENTRAL TENDENCY

Mean, Median, and Mode

OVERVIEW

The general purpose of descriptive statistical methods is to organize and summarize a set of scores

Perhaps the most common method for summarizing and describing a distribution is to find a single value that defines the average score and can serve as a representative for the entire distribution

In statistics, the concept of an average or representative score is called central tendency

OVERVIEW

The purpose of central tendency is to provide a single summary figure that best describes the central location of an entire distribution of observations

It also helps simplify comparison of two or more groups tested under different conditions

Three measures are most commonly used in education and the behavioral sciences: mode, median, and arithmetic mean

The MODE

A common meaning of mode is 'fashionable', and it has much the same implication in statistics

In an ungrouped distribution, the mode is the score that occurs with the greatest frequency

In grouped data, it is taken as the midpoint of the class interval that contains the greatest number of scores

The symbol for the mode is Mo

The MEDIAN

The median of a distribution is the point along the scale of possible scores below which 50% of the scores fall; it is therefore another name for P50

Thus, the median is the value that divides the distribution into halves

Its symbol is Mdn

The ARITHMETIC MEAN

The arithmetic mean is the sum of all the scores in the distribution divided by the total number of scores

Many people call this measure the average, but we will avoid this term because it is sometimes used indiscriminately for any measure of central tendency

For brevity, the arithmetic mean is usually called the mean

The ARITHMETIC MEAN

Some symbolism is needed to express the mean mathematically. We will use the capital letter X as a collective term to specify a particular set of scores (be sure to use capital letters; lower-case letters are used in a different way)

We identify an individual score in the distribution by a subscript, such as X1 (the first score), X8 (the eighth score), and so forth

You remember that n stands for the number in a sample and N for the number in a population

Properties of the Mode

The mode is easy to obtain, but it is not very stable from sample to sample

Further, when quantitative data are grouped, the mode may be strongly affected by the width and location of the class intervals

There may be more than one mode for a particular set of scores. In a rectangular distribution the ultimate is reached: every score shares the honor! For these reasons, the mean or the median is often preferred with numerical data

However, the mode is the only measure that can be used for data that have the character of a nominal scale

Properties of the Mean

Unlike the other measures of central tendency, the mean is responsive to the exact position of each score in the distribution

Inspect the basic formula ΣX/n. Increasing or decreasing the value of any score changes ΣX and thus also changes the value of the mean

The mean may be thought of as the balance point of the distribution, to use a mechanical analogy. There is an algebraic way of stating that the mean is the balance point:

$$\sum (X - \bar{X}) = 0$$
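The balance-point property can be checked numerically. The five scores below are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
scores = [2, 4, 6, 9, 14]            # hypothetical scores, not from the slides

mean = sum(scores) / len(scores)     # basic formula: sum of X divided by n
deviations = [x - mean for x in scores]

total_deviation = sum(deviations)                    # deviations from the mean sum to zero
positive_sum = sum(d for d in deviations if d > 0)   # sum of the positive deviations
negative_sum = sum(d for d in deviations if d < 0)   # sum of the negative deviations

print(mean, deviations, total_deviation)
```

Here the negative deviations (-5, -3, -1) and the positive deviations (2, 7) both have a magnitude of 9, so they cancel exactly.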

Properties of the Mean

The sum of the negative deviations from the mean exactly equals the sum of the positive deviations

The mean is more sensitive to the presence (or absence) of scores at the extremes of the distribution than are the median or (ordinarily) the mode

When a measure of central tendency should reflect the total of the scores, the mean is the best choice because it is the only measure based on this quantity

The MEAN of Ungrouped Data

The mean (M), commonly known as the arithmetic average, is computed by adding all the scores in the distribution and dividing by the number of scores or cases

$$M = \frac{\sum X}{N}$$

The MEAN of Grouped Data

When data come to us grouped, or when they are too lengthy for comfortable addition without the aid of a calculating machine, or when we are going to group them for other purposes anyway, we find it more convenient to apply another formula for the mean:

$$M = \frac{\sum f \cdot X_c}{N}$$

X          Xc    f    f.Xc
20 - 24    22    1    22
15 - 19    17    4    68
10 - 14    12    7    84
5 - 9      7     5    35
0 - 4      2     3    6
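The grouped-data formula can be applied to the table above. This sketch recomputes each midpoint Xc from the score limits; the variable names are assumptions made for illustration.

```python
# (score limits, frequency) for each class interval in the table.
table = [((20, 24), 1), ((15, 19), 4), ((10, 14), 7), ((5, 9), 5), ((0, 4), 3)]

n = sum(f for _, f in table)                                  # N, the total number of cases
sum_f_xc = sum(f * (lo + hi) / 2 for (lo, hi), f in table)    # sum of f times the midpoint Xc
grouped_mean = sum_f_xc / n

print(n, sum_f_xc, grouped_mean)
```

For this table the sum of f.Xc is 215 and N is 20, which gives M = 10.75. Because every score is treated as if it fell at the midpoint of its interval, the result can differ slightly from the mean of the raw scores.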

The MEDIAN of Ungrouped Data

Method 1: When N is an odd number, list the scores in order (lowest to highest); the median is the middle score in the list

Method 2: When N is an even number, list the scores in order (lowest to highest), and then locate the median by finding the point halfway between the middle two scores
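Methods 1 and 2 can be written as one small function. The two data sets below are hypothetical examples.

```python
def median(scores):
    """Median of ungrouped scores using Method 1 (odd N) or Method 2 (even N)."""
    ordered = sorted(scores)          # list the scores from lowest to highest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # Method 1: the middle score
    return (ordered[mid - 1] + ordered[mid]) / 2   # Method 2: halfway between the middle two

odd_example = median([11, 3, 8, 5, 10])       # ordered: 3, 5, 8, 10, 11
even_example = median([3, 8, 4, 7, 3, 5])     # ordered: 3, 3, 4, 5, 7, 8
print(odd_example, even_example)
```

The first call returns the middle score 8, and the second returns 4.5, the point halfway between 4 and 5.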

The MEDIAN of Ungrouped Data

Method 3: When there are several scores with the same value in the middle of the distribution, for example 1, 2, 2, 3, 4, 4, 4, 4, 4, 5

There are 10 scores (an even number), so you normally would use method 2 and average the middle pair to determine the median

By this method, the median would be 4

[Figure: two frequency distribution histograms; X axis: scores from 0 to 5; vertical axis: frequency f, from 1 to 5]

The MEDIAN of Grouped Data

There are 10 scores (an even number), so you normally would use method 2 and average the middle pair to determine the median. By this method the median would be 4

In many ways, this is a perfectly legitimate value for the median. However, when you look closely at the distribution of scores, you probably get the clear impression that X = 4 is not in the middle

The problem comes from the tendency to interpret the score of 4 as meaning exactly 4.00 instead of meaning an interval from 3.5 to 4.5
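One way to act on this interval interpretation is to interpolate within the real limits of the middle score. The sketch below is an illustration under the assumption that the tied scores are spread evenly across their interval; the function name `interpolated_median` is hypothetical.

```python
def interpolated_median(scores, unit=1):
    """Median located by interpolation within the real limits of the middle score."""
    ordered = sorted(scores)
    n = len(ordered)
    middle = ordered[n // 2]                 # the score whose interval contains the median
    lower_limit = middle - unit / 2          # lower real limit of that interval
    below = sum(1 for s in ordered if s < middle)   # cases below the interval
    tied = ordered.count(middle)                    # cases inside the interval
    # Move into the interval far enough to leave half of the cases below the point.
    return lower_limit + unit * (n / 2 - below) / tied

precise_median = interpolated_median([1, 2, 2, 3, 4, 4, 4, 4, 4, 5])
print(precise_median)
```

Four scores lie below 3.5 and five more are needed to reach half of N = 10, so one of the five tied scores is required: the median is placed one fifth of the way into the interval, at about 3.7.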

THE MODE

The word MODE means the most common observation among a group of scores

In a frequency distribution, the mode is the score or category that has the greatest frequency

SELECTING A MEASURE OF CENTRAL TENDENCY

How do you decide which measure of central tendency to use? The answer depends on several factors

Note that the mean is usually the preferred measure of central tendency: because the mean uses every score in the distribution, it typically produces a good representative value

The goal of central tendency is to find the single value that best represents the entire distribution

Besides being a good representative, the mean has the added advantage of being closely related to variance and standard deviation, the most common measures of variability

This relationship makes the mean a valuable measure for purposes of inferential statistics

For these reasons, and others, the mean generally is considered to be the best of the three measures of central tendency

But there are specific situations in which it is impossible to compute a mean or in which the mean is not particularly representative

It is in these conditions that the mode and the median are used

WHEN TO USE THE MEDIAN

1. Extreme scores or skewed distribution

When a distribution has a few extreme scores, scores that are very different in value from most of the others, then the mean may not be a good representative of the majority of the distribution. The problem comes from the fact that one or two extreme values can have a large influence and cause the mean to be displaced

WHEN TO USE THE MEDIAN

2. Undetermined values

Occasionally, we will encounter a situation in which an individual has an unknown or undetermined score

Person    Time (min.)
1         8
2         11
3         12
4         13
5         17
6         Never finished

Notice that person 6 never completed the puzzle. After one hour, this person still showed no sign of solving the puzzle, so the experimenter stopped him or her

WHEN TO USE THE MEDIAN

2. Undetermined values

There are two important points to be noted:

○ The experimenter should not throw out this individual's score. The whole purpose of using a sample is to gain a picture of the population, and this individual tells us that part of the population cannot solve this puzzle

○ This person should not be given a score of X = 60 minutes. Even though the experimenter stopped the individual after 1 hour, the person did not finish the puzzle. The score that is recorded is the amount of time needed to finish. For this individual, we do not know how long this is
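The median can still be located for the puzzle data because only the order of the scores matters. In the sketch below, the undetermined time is represented by `math.inf`; that encoding is an assumption made for illustration and relies only on the unknown time being longer than every recorded time.

```python
import math
from statistics import median

# Times in minutes for persons 1 to 6; math.inf stands in for "never finished".
times = [8, 11, 12, 13, 17, math.inf]

median_time = median(times)             # halfway between the middle pair, 12 and 13
mean_time = sum(times) / len(times)     # not a usable value: the total is undetermined

print(median_time, mean_time)
```

The median is 12.5 minutes, while the mean cannot be computed because the sum of the scores is unknown.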

WHEN TO USE THE MEDIAN

3. Open-ended distribution

A distribution is said to be open-ended when there is no upper limit (or lower limit) for one of the categories

Number of children (X)    f
5 or more                 3
4                         2
3                         2
2                         3
1                         6
0                         4

Notice that it is impossible to compute a mean for these data because you cannot find ΣX

WHEN TO USE THE MEDIAN

4. Ordinal scale

When scores are measured on an ordinal scale, the median is always appropriate and is usually the preferred measure of central tendency

WHEN TO USE THE MODE

Nominal scales: because nominal scales do not measure quantity, it is impossible to compute a mean or a median for data from a nominal scale

Discrete variables: indivisible categories

Describing shape: the mode identifies the location of the peak(s). If you are told a set of exam scores has a mean of 72 and a mode of 80, you should have a better picture of the distribution than would be available from the mean alone

CENTRAL TENDENCY AND THE SHAPE OF THE DISTRIBUTION

Because the mean, the median, and the mode are all trying to measure the same thing (central tendency), it is reasonable to expect that these three values should be related

There are situations in which all three measures will have exactly the same value, and situations in which they will differ

The relationships among the mean, median, and mode are determined by the shape of the distribution

SYMMETRICAL DISTRIBUTION SHAPE

For a symmetrical distribution, the right-hand side will be a mirror image of the left-hand side

By definition, the mean and the median will be exactly at the center because exactly half of the area in the graph will be on either side of the center

Thus, for any symmetrical distribution, the mean and the median will be the same

SYMMETRICAL DISTRIBUTION SHAPE

If a symmetrical distribution has only one mode, it will also be exactly in the center of the distribution. All three measures of central tendency will have the same value

A bimodal distribution will have the mean and the median together in the center with the modes on each side

A rectangular distribution has no mode because all X values occur with the same frequency. Still, the mean and the median will be in the center and equivalent in value


MEASURES OF VARIABILITY

Knowing the central value of a set of measurements tells us much, but it does not by any means give us the total picture of the sample we have measured

Two groups of six-year-old children may have the same average IQ of 105. One group contains no individuals with IQs below 95 or above 115, while the other includes individuals with IQs ranging from 75 to 135

We recognize immediately that there is a decided difference between the two groups in variability or dispersion

[Figure: IQ distributions of the BLUE and RED groups on a scale from 75 to 135]

The BLUE group is decidedly more homogeneous than the RED group with respect to IQ

Purpose of Measures of Variability

To explain and to illustrate the methods of indicating degree of variability or dispersion by the use of single numbers

The three customary values to indicate variability are
○ The total range
○ The semi-interquartile range Q, and
○ The standard deviation S

The TOTAL RANGE

The total range is the easiest and most quickly ascertained value, but it is also the most unreliable

The range is given by the highest score minus the lowest score

The range of the BLUE group (from an IQ of 95 to one of 115) is 20 points. The range of the RED group, from 75 to 135, is 60 points

The RED group has three times the range of the BLUE group

The SEMI-INTERQUARTILE RANGE Q

The Q is one-half the range of the middle 50 percent of the cases

First we find by interpolation the range of the middle 50 percent, or interquartile range, then divide this range by 2

[Figure: a distribution divided at Q1, Q2, and Q3 into the Lowest Quarter, Low Middle Quarter, High Middle Quarter, and Highest Quarter, with the distances Q2 – Q1 and Q3 – Q2 marked]

Q3 – Q1 = 2Q

$$Q = \frac{Q_3 - Q_1}{2}$$
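A sketch of the computation on hypothetical scores, using the standard-library `statistics.quantiles` function (Python 3.8 or later) to find the quartiles. Its default interpolation rule is an assumption here and need not match the grouped-data interpolation used in the course, so quartile values can differ slightly between methods.

```python
from statistics import quantiles

scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]     # hypothetical scores, not from the slides

q1, q2, q3 = quantiles(scores, n=4)      # three cut points dividing the cases into quarters
interquartile_range = q3 - q1            # range of the middle 50 percent of the cases
Q = interquartile_range / 2              # semi-interquartile range

print(q1, q2, q3, Q)
```

For these scores the quartiles are 2.5, 5, and 7.5, so the interquartile range is 5 and Q is 2.5.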

The STANDARD DEVIATION S

Standard deviation is by far the most commonly used indicator of degree of dispersion and is the most dependable estimate of the variability in the population from which the sample came

The S is a kind of average of all deviations from the mean

$$S = \sqrt{\frac{\sum x^2}{n - 1}}$$

where x is the deviation of a score from the mean
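The formula can be followed step by step on a few hypothetical scores. The sketch also compares the result with `statistics.stdev`, which uses the same n - 1 denominator.

```python
import math
from statistics import stdev

scores = [2, 4, 6, 9, 14]                  # hypothetical scores, not from the slides

n = len(scores)
mean = sum(scores) / n
deviations = [x - mean for x in scores]    # x = X minus the mean
sum_sq = sum(d ** 2 for d in deviations)   # sum of the squared deviations

S = math.sqrt(sum_sq / (n - 1))            # standard deviation with n - 1 in the denominator

print(sum_sq, S, stdev(scores))
```

The squared deviations are 25, 9, 1, 4, and 49, which sum to 88; dividing by n - 1 = 4 gives 22, and S is the square root of 22, about 4.69.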

As a general concept, the standard deviation is often symbolized by SD, but much more often by simply S

In verbal terms, an S is the square root of the arithmetic mean of the squared deviations of measurements from their mean

Interpretation of a Standard Deviation

The usual and most accepted interpretation of an S is in terms of the percentage of cases included within the range from one S below the mean to one S above the mean

In a normal distribution the range from -1σ to +1σ contains 68.27 percent of the cases

If the mean = 29.6 and S = 10.45, we say about two-thirds of the cases lie between 19.15 and 40.05

One of the most common sources of variance in statistical data is individual differences, where each measurement comes from a different person

Interpretation of a Standard Deviation

Consider giving a test of n items to a group of persons

Before the first item is given to the group, as far as any information from this test is concerned, the individuals are all alike. There is no variance

Now administer the first item to the group. Some pass it and some fail. Some now have scores of 1, and some have scores of zero

There are two groups of individuals. There is this much variation, this much variance

Give a second item. Of those who passed the first, some will pass the second and some will fail it. Etc.

There are now three possible scores: 0, 1, and 2.

More variance has been introduced

Carry the illustration further, adding item by item

The differences between scores will keep increasing, and also, by computation, the variance and variability

Another rough check is to compare the S obtained with the total range of measurement

In very large samples (N = 500 or more) the S is about one-sixth of the total range

In other words, the total range is about six S

In smaller samples the ratio of range to S can be expected to be smaller (see Guilford & Fruchter p.71)

Rough check for a computed SD

○ The actual percentage of cases between -1 SD and +1 SD should be about 68 percent

○ In very large samples (N = 500 or more) the SD is about one-sixth of the total range

Ratios of the Total Range to the Standard Deviation in a Distribution for Different Values of N

N      Range/S    N      Range/S    N       Range/S
5      2.3        40     4.3        400     5.9
10     3.1        50     4.5        500     6.1
15     3.5        100    5.0        700     6.3
20     3.7        200    5.5        1000    6.5

z-Score: Location of Scores and Standardized Distribution

PREVIEW

In particular, we will convert each individual score into a new, standardized score, so that the standardized score provides a meaningful description of its exact location within the distribution

We will use the mean as a reference point to determine whether the individual is above or below average

The standard deviation will serve as a yardstick for measuring how much an individual differs from the group average

EXAMPLE

Suppose you received a score of X = 76 on a statistics exam. How did you do?

It should be clear that you need more information to predict your grade

Your score could be one of the best scores in the class, or it might be the lowest score in the distribution

X = 76, the best score or the lowest score?

To find the location of your score, you must have information about the other scores in the distribution

If the mean were μ = 70 you would be in a better position than if the mean were μ = 86

Obviously, your position relative to the rest of the class depends on the mean

X = 76 and μ = 70

However, the mean by itself is not sufficient to tell you the exact location of your score

At this point, you know that your score is six points above the mean

Six points may be a relatively big distance and you may have one of the highest scores in the class, or

Six points may be a relatively small distance and you are only slightly above the average

THE z-SCORE FORMULA

$$z = \frac{X - \mu}{\sigma}$$

z-Score and Location In a Distribution

One of the primary purposes of a z-score is to describe the exact location of a score within a distribution

The z-score accomplishes this goal by transforming each X value into a signed number (+ or -), so that:
○ The sign tells whether the score is located above (+) or below (-) the mean, and
○ The number tells the distance between the score and the mean in terms of the number of standard deviations

If every X value is transformed into a z-score, then the distribution of z-scores will have the following properties:

Shape: the shape of the z-score distribution will be the same as the original distribution of raw scores. Each individual has exactly the same relative position in the X distribution and the z-score distribution

The Mean: the z-score distribution will always have a mean of zero. A subject whose score equals the mean is transformed into z = 0

The Standard Deviation: the z-score distribution will always have a standard deviation of 1. A subject whose score is one S above the mean is transformed into z = +1
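These properties can be verified on a small hypothetical population. The six scores below are chosen so that μ = 3 and σ = 2; they are not taken from the slides.

```python
from statistics import mean, pstdev

population = [0, 6, 5, 2, 3, 2]       # hypothetical population of N = 6 scores

mu = mean(population)                  # population mean
sigma = pstdev(population)             # population standard deviation

# z = (X - mu) / sigma for every score.
z_scores = [(x - mu) / sigma for x in population]

print(z_scores, mean(z_scores), pstdev(z_scores))
```

The sign of each z-score shows whether the score is above or below the mean, its size gives the distance in standard deviations, and the transformed distribution has a mean of 0 and a standard deviation of 1.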

PROBABILITY and NORMAL DISTRIBUTION

In simpler terms, the normal distribution is symmetrical with a single mode in the middle. The frequency tapers off as you move farther from the middle in either direction

[Figure: normal curve marked with μ and σ]


THE DISTRIBUTION OF SAMPLE MEANS


OVERVIEW

Whenever a score is selected from a population, you should be able to compute a z-score

And, if the population is normal, you should be able to determine the probability value for obtaining any individual score

In a normal distribution, a z-score of +2.00 corresponds to an extreme score out in the tail of the distribution, and a score at least this large has a probability of only p = .0228
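The tail probability quoted for z = +2.00 can be reproduced without a printed table. This is a minimal sketch using `NormalDist` from Python's standard `statistics` module; the variable names are assumptions.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Probability of obtaining a z-score of +2.00 or larger (upper tail).
p_tail = 1.0 - standard_normal.cdf(2.00)

print(f"p = {p_tail:.4f}")
```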


THE DISTRIBUTION OF SAMPLE MEANS

Two separate samples probably will be different even though they are taken from the same population

The samples will have different individuals, different scores, different means, and so on

The distribution of sample means is the collection of sample means for all the possible random samples of a particular size (n) that can be obtained from a population


COMBINATION

Consider a population that consists of 5 scores: 3, 4, 5, 6, and 7

Population mean μ = ?

Construct the distribution of sample means for n = 1, n = 2, n = 3, n = 4, n = 5

$${}_{n}C_{r} = \frac{n!}{r!\,(n-r)!}$$
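One way to work this exercise is to enumerate every combination of the five scores. The sketch below assumes sampling without replacement, as the combination formula suggests, and tabulates the sample means for each sample size. The helper name `sample_mean_distribution` is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations
from math import comb

population = [3, 4, 5, 6, 7]


def sample_mean_distribution(pop, n):
    """Count how often each sample mean occurs over all combinations of size n."""
    return Counter(Fraction(sum(sample), n) for sample in combinations(pop, n))


for n in range(1, len(population) + 1):
    dist = sample_mean_distribution(population, n)
    n_samples = sum(dist.values())  # equals nCr for this sample size
    mean_of_means = sum(m * count for m, count in dist.items()) / n_samples
    table = {float(m): count for m, count in sorted(dist.items())}
    print(f"n = {n}: {n_samples} samples (nCr = {comb(len(population), n)}), "
          f"mean of sample means = {float(mean_of_means)}, distribution = {table}")
```

For every sample size, the mean of the sample means equals the population mean, while the spread of the sample means shrinks as n grows.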


SAMPLING DISTRIBUTION: a distribution of statistics obtained by selecting all the possible samples of a specific size from a population

CENTRAL LIMIT THEOREM: For any population with mean μ and standard deviation σ, the distribution of sample means for sample size n will have a mean of μ and a standard deviation of σ/√n, and will approach a normal distribution as n approaches infinity
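The mean and standard deviation claims of the theorem can be checked exactly on a tiny population. The sketch below reuses the five scores 3, 4, 5, 6, 7 and assumes independent draws, that is, sampling with replacement, which is the setting in which σ/√n holds exactly. It does not illustrate the approach to normality. The helper name is hypothetical.

```python
from itertools import product
from math import sqrt
from statistics import mean, pstdev

population = [3, 4, 5, 6, 7]
mu = mean(population)       # population mean
sigma = pstdev(population)  # population standard deviation


def sample_means_with_replacement(pop, n):
    """Means of all ordered samples of size n drawn with replacement."""
    return [sum(sample) / n for sample in product(pop, repeat=n)]


for n in (1, 2, 4):
    means = sample_means_with_replacement(population, n)
    print(f"n = {n}: mean of M = {mean(means):.4f}, "
          f"SD of M = {pstdev(means):.4f}, sigma/sqrt(n) = {sigma / sqrt(n):.4f}")
```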


The STANDARD ERROR OF MEAN

The value we will be working with is the standard deviation of the distribution of sample means, and it is called the standard error of the mean, σM

Remember the sampling error: there typically will be some error between the sample and the population

The σM measures exactly how much difference should be expected on average between a sample mean M and the population mean μ


The MAGNITUDE of THE σM

Determined by two factors:

○ The size of the sample, and

○ The standard deviation of the population from which the sample is selected
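Both factors appear in the formula σM = σ/√n given with the central limit theorem. The sketch below evaluates it for a few combinations; the σ and n values are hypothetical and the function name `standard_error` is an assumption.

```python
from math import sqrt


def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(n)."""
    if n < 1:
        raise ValueError("sample size n must be at least 1")
    return sigma / sqrt(n)


# Hypothetical population standard deviations and sample sizes.
for sigma in (10, 20):
    for n in (1, 4, 16, 100):
        print(f"sigma = {sigma}, n = {n}: standard error = {standard_error(sigma, n)}")
```

The standard error falls as the sample size increases and rises with the population standard deviation.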


PROBABILITY AND THE DISTRIBUTION OF SAMPLE MEANS

The primary use of the distribution of sample means is to find the probability associated with any specific sample

Because the distribution of sample means presents the entire set of all possible Ms, we can use proportions of this distribution to determine probabilities


EXAMPLE

The population of scores on the SAT forms a normal distribution with μ = 500 and σ = 100. If you take a random sample of n = 16 students, what is the probability that the sample mean will be greater than M = 540?

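$$\sigma_M = \frac{\sigma}{\sqrt{n}} = \frac{100}{\sqrt{16}} = 25$$

$$z = \frac{M - \mu}{\sigma_M} = \frac{540 - 500}{25} = 1.6$$

For z = 1.6, Area C (the tail beyond z): p = .0548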
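The same calculation can be sketched in a few lines of Python, using `NormalDist` from the standard library in place of a printed normal table. The variable names are assumptions.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 500, 100   # SAT population parameters from the example
n = 16                 # sample size
m = 540                # sample mean of interest

sigma_m = sigma / sqrt(n)           # standard error of the mean
z = (m - mu) / sigma_m              # z-score of the sample mean
p = 1.0 - NormalDist().cdf(z)       # upper-tail probability P(M > 540)

print(f"sigma_M = {sigma_m}, z = {z}, p = {p:.4f}")
```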


Types of Sampling

Random/probability sampling designs

For a random or probability sampling design, every element in the population must have an equal and independent chance of being selected for the sample.


There are two advantages of a random/probability sample:

1. Because the sample is representative of the total sampling population, conclusions drawn from such a sample can be generalized to the total sampling population.

2. Statistical tests based on probability theory can be applied only to data collected from random samples.


Methods of drawing a random sample

The fishbowl draw: if the total population is small, an easy procedure is to write each element on its own slip of paper, put the slips in a box, and draw them one by one without looking until the number of slips drawn equals the predetermined sample size
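The fishbowl procedure can be mimicked in code. The sketch below is an illustration only: the element labels, the sample size, the seed, and the function name `fishbowl_draw` are all hypothetical. Python's built-in `random.sample` would give an equivalent result in one call.

```python
import random


def fishbowl_draw(elements, sample_size, seed=None):
    """Draw slips one by one, without replacement, until the sample is full."""
    if sample_size > len(elements):
        raise ValueError("sample size cannot exceed the population size")
    rng = random.Random(seed)
    bowl = list(elements)      # one slip of paper per element
    rng.shuffle(bowl)          # mix the slips in the box
    drawn = []
    while len(drawn) < sample_size:
        drawn.append(bowl.pop())   # take one slip without looking
    return drawn


# Hypothetical population of 20 labelled elements.
population = [f"P{i:02d}" for i in range(1, 21)]
sample = fishbowl_draw(population, 5, seed=2011)
print(sample)
```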


Methods of drawing a random sample

Computer program

Random number table: most research methodology and statistics textbooks include a table of random numbers in their appendices. A sample can be selected using the table according to the prescribed procedure
