Agricultural and Biological Statistics


Page 1: Agricultural and Biological Statistics

Agricultural and Biological Statistics

Page 2: Agricultural and Biological Statistics

Summarizing Data (Chapter 2)

Page 3: Agricultural and Biological Statistics

Data are observations on one or more variables selected from a population (or a sample).

Summarizing Data

Page 4: Agricultural and Biological Statistics

Qualitative Variables - Variables that express attributes about a sample or population

Quantitative Variables - Variables that are the result of measurement or counting.

Summarizing Data

Page 5: Agricultural and Biological Statistics

Qualitative Variables give rise to nominal or ordinal data.

Quantitative Variables give rise to ratio or interval data.

Summarizing Data

Page 6: Agricultural and Biological Statistics

It is also important to categorize observations as to their source.

Primary Data - collected by means of experiment or survey.

Secondary Data - acquired from a source that did not collect the data, even though the source may have published them.

Summarizing Data

Page 7: Agricultural and Biological Statistics

Data collection allows us to make informed decisions about the problem at hand.

Manipulation and statistical analysis of the data allows for improved decision making.

Generally speaking, statistical data are collected in random order. Since there is no real order to the data, it is difficult to obtain any valuable information upon inspection.

Data: 3, 1, 7, 22, 9, 10, 4, 17, 19

Summarizing Data

Page 8: Agricultural and Biological Statistics

Definition of Array

Array- an array reorders the data from the smallest to the largest value.

Data: 3, 1, 7, 22, 9, 10, 4, 17, 19

Arrayed data: 1, 3, 4, 7, 9, 10, 17, 19, 22

Page 9: Agricultural and Biological Statistics

Measures of Dispersion

Definition: Range - a range is computed by subtracting the smallest from the largest observation. Range: 22-1=21

An array also indicates something about the distribution of the units between the two extremes and their tendency to cluster toward some central value.
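As a minimal sketch (Python is an assumption here; the slides do not name a language), the array and the range for the example data can be computed as follows. The variable names are illustrative only.

```python
# Example data from the slides, in the order collected.
data = [3, 1, 7, 22, 9, 10, 4, 17, 19]

# An array reorders the data from the smallest to the largest value.
arrayed = sorted(data)

# Range = largest observation minus smallest observation.
data_range = arrayed[-1] - arrayed[0]

print(arrayed)     # [1, 3, 4, 7, 9, 10, 17, 19, 22]
print(data_range)  # 21
```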

Page 10: Agricultural and Biological Statistics

Data can further be summarized in the form of a frequency distribution.

A number of classes are chosen (5 to 15 normally)

The distribution has the classes on the vertical axis and frequencies on the horizontal axis.

Measures of Dispersion

Page 11: Agricultural and Biological Statistics

Cotton Yield (classes): 215 to 235, 235 to 255, 255 to 275, ...

Number of farms: 4, 6, 13, 21, 15, 7, 5, 4

Classes do not have to be of equal width (for example, income classes).
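A frequency distribution can be sketched in a few lines of Python. The function name `frequency_distribution` and its parameters are hypothetical, and the sketch assumes equal-width classes in which the lower bound is included and the upper bound excluded (the slides do not say how a value on a shared boundary such as 235 is assigned). It reuses the nine-value data set from the earlier slide rather than the cotton yield data, whose raw observations are not given.

```python
def frequency_distribution(values, lower, width, n_classes):
    """Count observations in equal-width classes [low, high)."""
    counts = [0] * n_classes
    for v in values:
        index = int((v - lower) // width)
        if 0 <= index < n_classes:  # ignore values outside the classes
            counts[index] += 1
    classes = [(lower + i * width, lower + (i + 1) * width)
               for i in range(n_classes)]
    return list(zip(classes, counts))

data = [3, 1, 7, 22, 9, 10, 4, 17, 19]
table = frequency_distribution(data, lower=0, width=5, n_classes=5)

for (low, high), count in table:
    print(f"{low} to {high}: {count}")
```

With five classes of width 5, the class frequencies sum to the nine observations, which is a quick check that no value fell outside the chosen classes.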

Measures of Dispersion

Page 12: Agricultural and Biological Statistics

Histogram - a frequency distribution presented as a bar chart. Advantage: you can see its shape.

Frequency polygon - a line graph used to display data, with frequencies on the y-axis and class midpoints on the x-axis.

Measures of Dispersion

Page 13: Agricultural and Biological Statistics

Average - a number used to represent the central value of a data set or distribution.

1. Arithmetic Mean - most widely used.

$$\mu = \frac{\sum_{i=1}^{N} X_i}{N}$$

Averages

Page 14: Agricultural and Biological Statistics

Example (this is the population): 7, 3, 2, 8

Sum = 20; 20/4 = 5 = Arithmetic Mean

Page 15: Agricultural and Biological Statistics

Example cont: Now Take Some Samples

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$

S1 = 7, 3: 10/2 = 5

S2 = 7, 8: 15/2 = 7.5
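The population and sample means above can be reproduced with a short Python sketch. The helper name `arithmetic_mean` is illustrative, not something the slides define.

```python
def arithmetic_mean(values):
    """Sum of the observations divided by their count."""
    return sum(values) / len(values)

population = [7, 3, 2, 8]
sample_1 = [7, 3]
sample_2 = [7, 8]

print(arithmetic_mean(population))  # 5.0
print(arithmetic_mean(sample_1))    # 5.0
print(arithmetic_mean(sample_2))    # 7.5
```

The two samples give different means, which illustrates that a sample mean is only an estimate of the population mean.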

Page 16: Agricultural and Biological Statistics

1a. Weighted Mean.

$$\bar{x} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$$

Page 17: Agricultural and Biological Statistics

Weighted Mean Example

Crop        Hourly Wage, x   Number of Workers, w   wx
Cucumbers   4.50             950                    4,275
Melons      4.75             600                    2,850
Onions      5.25             1,020                  5,355
Total                        2,570                  12,480

$\bar{x}$ = 12,480 / 2,570 = 4.86. The other way (the unweighted mean of the three wages) it is 4.83.
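The wage example can be checked with a small Python sketch. The function name `weighted_mean` is an assumption made for illustration; the data are the wages and worker counts from the table.

```python
def weighted_mean(values, weights):
    """Sum of w*x divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

wages = [4.50, 4.75, 5.25]    # hourly wage, x
workers = [950, 600, 1020]    # number of workers, w

weighted = weighted_mean(wages, workers)
unweighted = sum(wages) / len(wages)

print(round(weighted, 2))    # 4.86
print(round(unweighted, 2))  # 4.83
```

The weighted mean is higher than the unweighted one because the highest-paid crop (onions) also has the most workers.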

Page 18: Agricultural and Biological Statistics

Two Properties of Arithmetic Mean

a. The sum of the deviations from the mean is zero.

Data: 3, 7, 4, 6, 0 ($\bar{x} = 4$)

3 - 4 = -1

7 - 4 = 3

4 - 4 = 0

6 - 4 = 2

0 - 4 = -4

Sum of deviations = 0

Page 19: Agricultural and Biological Statistics

b. The sum of squares of the deviations from the mean is a minimum.
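Both properties can be checked numerically. The sketch below uses the data set 3, 7, 4, 6, 0 from the previous slide; the alternative centers (3.0, 3.5, 4.5, 5.0) are arbitrary values chosen for illustration, so this is a spot check rather than a proof.

```python
data = [3, 7, 4, 6, 0]
mean = sum(data) / len(data)  # 4.0

def sum_of_squares(values, center):
    """Sum of squared deviations of the values from a chosen center."""
    return sum((x - center) ** 2 for x in values)

# Property a: deviations from the mean sum to zero.
sum_of_deviations = sum(x - mean for x in data)

# Property b: the sum of squares is smallest when the center is the mean.
ss_at_mean = sum_of_squares(data, mean)
ss_elsewhere = {c: sum_of_squares(data, c) for c in [3.0, 3.5, 4.5, 5.0]}

print(sum_of_deviations)  # 0.0
print(ss_at_mean)         # 30.0
print(ss_elsewhere)       # every value is larger than 30.0
```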

Page 20: Agricultural and Biological Statistics

2. Midrange (or center) - the arithmetic mean of the smallest and the largest items in the data set.

Unreliable as an estimate of the population mean, because it is based on two values that change significantly from sample to sample.

Page 21: Agricultural and Biological Statistics

Example

$$MR = \frac{X_1 + X_n}{2}$$

where $X_1$ is the smallest and $X_n$ is the largest observation.

$$MR = \frac{0 + 7}{2} = 3.5$$

Page 22: Agricultural and Biological Statistics

Median

3. Median - a place average. For ungrouped data, it is the value of the middle observation after the data are arrayed.

When there is an even number of observations, the middle two observations are averaged.

Better measure when extreme values are encountered.

Should not be used for small sample sizes.

Half of the observations are below it and half are above.

Page 23: Agricultural and Biological Statistics

Mode

4. Mode - the most common observation in the data set.

For ungrouped data we determine the mode by inspection.

Ungrouped data may not have a mode (all values appear once).

Several modes could occur as well.

Use the mode when we want to know what is in vogue.
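The midrange, median, and mode can be sketched in Python as follows. The helper names `midrange` and `modes` are hypothetical. Returning an empty list when every value appears once is a choice made here to mirror the "may not have a mode" remark, not a universal convention. The data set 3, 7, 4, 6, 0 comes from the earlier slides; the other two lists are made up for illustration.

```python
from collections import Counter
from statistics import median

def midrange(values):
    """Mean of the smallest and largest observations."""
    return (min(values) + max(values)) / 2

def modes(values):
    """Most common value(s); empty list if every value appears once."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []
    return sorted(v for v, c in counts.items() if c == highest)

data = [3, 7, 4, 6, 0]
print(midrange(data))           # 3.5
print(median(data))             # 4 (middle value of 0, 3, 4, 6, 7)
print(median([6, 8, 10, 12]))   # 9.0 (average of the middle two)
print(modes(data))              # [] (no mode)
print(modes([1, 2, 2, 3, 3]))   # [2, 3] (two modes)
```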

Page 24: Agricultural and Biological Statistics

An arithmetic mean might be meaningless when the numbers are only labels.

ABC show = 1, CBS show = 2, NBC show = 3

$\bar{X}$ = 2.3 is meaningless.

Page 25: Agricultural and Biological Statistics

Characteristics of Mean, Median, Mode

Use the three averages together to determine the relative symmetry of the distribution.

Perfect symmetry: all three values (averages) are identical.

If the distribution has a tail on the right, it is positively skewed. The arithmetic mean is the largest, the mode is the smallest, and the median lies in between, about 2/3 of the way from the mode toward the mean.

Page 26: Agricultural and Biological Statistics

The mean is the largest because it is affected by large values.

The median is sensitive to the position of the values.

The arithmetic mean is the only one that can be used in algebraic calculations, which makes it the most useful.

Downside: the mean is impossible to calculate with open-ended classes. This does not affect the other two averages.

Characteristics of Mean, Median, Mode

Page 27: Agricultural and Biological Statistics

Measures of dispersion

Range (R) = $X_n - X_1$

Can be used with the mean, median, and midrange.

The range indicates both how high and how low the numbers go and the spread of the data itself.

Based on the two extreme values of the data set.

Not the first choice for a measure of dispersion.

Page 28: Agricultural and Biological Statistics

Quartile Deviation

$$QD = \frac{Q_3 - Q_1}{2}$$

Used only with the median.

One half the distance between the first and the third quartiles.

Page 29: Agricultural and Biological Statistics

Data: 8, 12, 6, 14, 10

Arrayed data: 6, 8, 10, 12, 14

1st Quartile = 7, Middle (median) = 10, 3rd Quartile = 13

QD = (13 - 7)/2 = 3

Quartile Deviation
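A hedged Python sketch of the quartile deviation for the five-value example follows. It relies on `statistics.quantiles` with its default "exclusive" method, which happens to place the quartiles at 7 and 13 for this data set. Other quartile conventions exist and can give different values, so the choice of method is an assumption rather than something the slides specify.

```python
from statistics import quantiles

data = [8, 12, 6, 14, 10]

# quantiles() sorts the data internally; n=4 returns the three quartiles.
q1, q2, q3 = quantiles(data, n=4)

# Quartile deviation: half the distance between the first and third quartiles.
quartile_deviation = (q3 - q1) / 2

print(q1, q2, q3)          # 7.0 10.0 13.0
print(quartile_deviation)  # 3.0
```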

Page 30: Agricultural and Biological Statistics

QD is similar to the range but uses values in the middle half of the distribution rather than the endpoints.

A poor measure when there is wide dispersion in the tails of the distribution.

Quartile Deviation

Page 31: Agricultural and Biological Statistics

Standard Deviation

A measure of dispersion used with the arithmetic mean.

Its value is based on all the observations of the data set.

Page 32: Agricultural and Biological Statistics

SD for ungrouped data

$$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$$

$\sigma^2$ is the variance.

The SD is the most widely used measure of dispersion. The arithmetic mean is the most widely used average.

$s^2$ - the sample variance is an estimate of the population variance $\sigma^2$ computed from sample data.

Page 33: Agricultural and Biological Statistics

In repeated sampling, the sample variance computed with divisor n is biased and underestimates the population variance by the fixed factor

$$\frac{n-1}{n}$$

Thus a revision in the sample SD formula is needed; divide by n - 1 for a sample:

$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n-1}}$$

Standard Deviation

Page 34: Agricultural and Biological Statistics

Example for Calculating SD

Days Absent (x)   $(x - \bar{x})$   $(x - \bar{x})^2$
7                 -3                9
14                4                 16
8                 -2                4
5                 -5                25
15                5                 25
11                1                 1
Total: 60         0                 80

$\bar{x} = 10$

$$s = \sqrt{\frac{80}{5}} = \sqrt{16} = 4$$
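The days-absent calculation can be traced step by step in Python. This is a minimal sketch with illustrative variable names. The last line also shows the population formula (divisor N) for comparison, on the assumption that the same six values were treated as a whole population.

```python
import math

days_absent = [7, 14, 8, 5, 15, 11]
n = len(days_absent)

mean = sum(days_absent) / n                  # 60 / 6 = 10.0
deviations = [x - mean for x in days_absent]
squared = [d ** 2 for d in deviations]

# Sample variance and SD: divide the sum of squares by n - 1.
sample_variance = sum(squared) / (n - 1)     # 80 / 5 = 16.0
sample_sd = math.sqrt(sample_variance)

# Population SD for comparison: divide by N instead of n - 1.
population_sd = math.sqrt(sum(squared) / n)

print(sum(deviations), sum(squared))  # 0.0 80.0
print(sample_sd)                      # 4.0
print(round(population_sd, 3))        # 3.651
```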

Page 35: Agricultural and Biological Statistics

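Standard Deviation (Another Formula)

$$s = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n-1}}$$

This does not contain deviations from the mean.

$$s = \sqrt{\frac{680 - \frac{(60)^2}{6}}{6-1}} = \sqrt{\frac{680 - 600}{5}} = \sqrt{16} = 4$$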
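The shortcut formula can be sketched the same way. The function name `sample_sd_shortcut` is hypothetical. The sketch only needs the sum of the values and the sum of their squares, which is why no deviations from the mean appear.

```python
import math

def sample_sd_shortcut(values):
    """Sample SD from the sum of x and the sum of x squared."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))

days_absent = [7, 14, 8, 5, 15, 11]
print(sum(x * x for x in days_absent))  # 680
print(sample_sd_shortcut(days_absent))  # 4.0
```

For hand calculation this form saves work. In floating-point arithmetic it can lose precision when the values are large relative to their spread, so the deviation-based formula is often the safer choice in code.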

Page 36: Agricultural and Biological Statistics

Two properties of s and $\bar{x}$

1. If we add a constant to every element in the data set, the mean changes by that same value and the SD remains unchanged.

2. Multiplying each value of x by a constant multiplies the mean by that constant, the SD by the absolute value of the constant, and the variance by the square of the constant.
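Both properties can be spot-checked in Python with the standard `statistics` module. The constants (+3 and -2) are arbitrary choices for illustration, and the data are the days-absent values from the earlier example. A negative multiplier is used to show that the mean keeps the sign of the constant while the SD does not.

```python
from statistics import mean, stdev, variance

data = [7, 14, 8, 5, 15, 11]       # mean 10, sample SD 4

shifted = [x + 3 for x in data]    # add a constant
scaled = [-2 * x for x in data]    # multiply by a constant

print(mean(shifted), stdev(shifted))  # mean rises by 3, SD unchanged
print(mean(scaled), stdev(scaled))    # mean times -2, SD times |-2|
print(variance(scaled))               # variance times (-2) squared
```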

Page 37: Agricultural and Biological Statistics

Standardizing The Data

Subtract the mean from every element, then multiply every element in the data set by 1/s.

The result has mean 0 and SD 1.

This becomes Z; for the population:

$$z_i = \frac{x_i - \mu}{\sigma}$$
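Standardizing can be sketched as follows. The function name `standardize` is illustrative. The sketch uses the sample mean and sample SD (divisor n - 1), which is an assumption; with population values, $\mu$ and $\sigma$ would be used instead, as in the formula above.

```python
from statistics import mean, stdev

def standardize(values):
    """Subtract the mean from each value and divide by the sample SD."""
    m = mean(values)
    s = stdev(values)
    return [(x - m) / s for x in values]

days_absent = [7, 14, 8, 5, 15, 11]
z = standardize(days_absent)

print(z)                  # [-0.75, 1.0, -0.5, -1.25, 1.25, 0.25]
print(mean(z), stdev(z))  # mean 0 and SD 1
```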

Page 38: Agricultural and Biological Statistics

Coefficient of Variation

Uses the SD and mean to measure the variability of the data set.

Gives a relative measure of variability in the data set.

States how large the standard deviation is in comparison to the mean, in percentage terms.

$$CV \text{ (or } V) = \left(\frac{S}{\bar{X}}\right)(100)$$

CV = 100 would mean that S and $\bar{X}$ are equal. When CV is over 50, use caution in stating that the mean represents the population.
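As a closing sketch, the coefficient of variation for the days-absent data can be computed like this. The function name `coefficient_of_variation` is hypothetical, and the sample SD is assumed.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample SD expressed as a percentage of the mean."""
    return 100 * stdev(values) / mean(values)

days_absent = [7, 14, 8, 5, 15, 11]   # mean 10, sample SD 4
cv = coefficient_of_variation(days_absent)
print(cv)  # 40.0
```

A CV of 40 is below the threshold of 50 mentioned above, so by that rule of thumb the mean is a reasonable summary of this data set.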