Quantitative Methods 1 (Midterm Study Guide)

Quantitative Methods 1 (Midterm Study Guide): statistical reasoning in the behavioral sciences

Week 2

What is the definition of statistics? The science of classifying, organizing, and analyzing data

2 Main Types of Statistics- Descriptive Statistics & Inferential Statistics

STATISTIC – data about a sample
PARAMETER – data about the population

RANDOM SAMPLING-A process for ensuring all members of a population have the same chance to be included in the sample
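
As a rough illustration of random sampling, the sketch below draws 10 members from a made-up population of 100 ID numbers. The population, the sample size, and the seed are all assumptions for the example, not part of the course material.

```python
import random

# Hypothetical population: ID numbers 1 through 100.
population = list(range(1, 101))

random.seed(42)  # fixed seed only so the example is repeatable

# random.sample gives every member the same chance of selection
# and never picks the same member twice.
sample = random.sample(population, k=10)
print(sample)
```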

SCALES-All variables are classified by the type of scale they are measured on
Nominal-Mutually exclusive, exhaustive categories (Group / Labels). Categories have no numeric relationship
Ordinal-Relates to the order or rank of the data. No attempt to show the degree to which the ranks differ (1st place, 2nd place, etc.; richest, 2nd richest, 3rd richest, etc.)
Interval-Same as the Ordinal scale but includes relative distance or amount. Can be in positive or negative numbers (Temperature, Checking account balance, return on investment)
Ratio-Same as the Interval scale but includes an absolute zero (no negative numbers) (Height, age, time in a race, score on a test)

All variables are either numeric or non-numeric variables
Nominal variables are always non-numeric
Ordinal variables can be numeric or non-numeric
Interval and Ratio variables are always numeric

Week 3 Freq Distr (Ch.2-3) (Descr. Stat)

Frequency Distribution – shows the number of observations for the possible categories or score values in a set of data
What are the characteristics of that table?
GROUPED SCORES: In a frequency distribution, a range of values that are grouped together (class intervals)

i – interval width
f – frequency
n – number of cases in a sample
N – number of cases in a population
Cum f – cumulative frequency
Cumulative Frequency Distribution – shows how many cases lie below the upper real limit for each class interval
Cum % – cumulative percentage
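
A minimal sketch of a grouped frequency distribution, using the 20 test scores from the SPSS exercise in this guide. The interval width of 10 and the variable names are assumptions chosen for the example.

```python
from collections import Counter

# Test scores from the SPSS exercise (Column 2).
scores = [99, 98, 87, 87, 99, 88, 77, 77, 88, 99,
          66, 55, 88, 88, 45, 99, 78, 66, 99, 99]
i = 10  # interval width (assumed for this sketch)

# f: how many scores fall in each class interval, keyed by lower limit.
f = Counter((x // i) * i for x in scores)

n = len(scores)
cum_f = 0
table = []  # rows of (class interval, f, Cum f, Cum %)
for lower in sorted(f):
    cum_f += f[lower]
    table.append((f"{lower}-{lower + i - 1}", f[lower], cum_f, 100 * cum_f / n))

for row in table:
    print(row)
```

The last row should always show Cum f equal to n and Cum % equal to 100.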

Histogram – a graph that consists of a series of rectangles, the heights of which represent the frequency or relative frequency
Bar Diagram – used for qualitative data, a graph similar to a histogram, except a space appears between the rectangles
FREQUENCY POLYGON - a graph that consists of a series of connected dots above the midpoint of each possible class interval (height of dots corresponds to the frequency or relative frequency)
Pie Chart – used for qualitative data, area in any piece of the pie shows the relative frequency of the category
Cumulative Percentage Curve – a graph that consists of a series of connected dots above the upper real limits of each possible class interval (height of dots corresponds to cumulative percentage)
NORMAL CURVE (shows a normal distribution)
SKEW-A distribution that is asymmetrical rather than normal. The “tail” points in the direction of the skew

Kurtosis-“Peakedness” of a distribution
Flat distribution: platykurtic
Normal distribution: mesokurtic
More peaked: leptokurtic

Week 4 Central Tend (Ch.4)

What point on the range of scores (distribution) best represents the overall set of scores? MEAN
What one number gives us the best snapshot of all the numbers in a data set?

Central Tendency-Helpful in comparing groups or comparing a group to a known reference group. (Mean, Median, Mode)
Mode-the most frequently occurring score; the only measure of CT that can be used for the Nominal scale, and the only one that can be used for qualitative data.
Median-the value that divides the distribution in half (not sensitive to extreme scores).
Mean-the average: ∑X ÷ n (X bar for a sample)
The mean is the measure of central tendency that is most responsive to the exact position of each score in the distribution.

In a Perfectly Normal Distribution, the Mean, Median, and Mode are the Same (ex: i.q.)

Skewed Distributions:

Score Transformation: Changing every score in a distribution (data set) to one on a different scale (e.g., to compare IQ with income level)
Linear Transformation (most common type of score transformation): A score transformation that preserves the straight-line relationship between the original scores and their transformations
How/Why would you do that? For comparisons and conversions: find the s.d., then find the z-score.

Start SPSS
Column 1: 1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2
Column 2: 99,98,87,87,99,88,77,77,88,99,66,55,88,88,45,99,78,66,99,99
Go to Variable View, name Variable 1 Gender and Variable 2 Test Score
Assign Value of Variable 1 as 1=Female 2=Male
Use SPSS to Determine Mean, Mode, Median for Variable 2 (overall)
“Split File” to Determine Mean, Mode, Median for Variable 2
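
For checking the SPSS output by hand, here is a rough Python equivalent of the exercise above. It uses the same two columns; the helper name `summarize` and the dictionary layout are made up for this sketch, and the grouping by Gender stands in for what "Split File" does.

```python
from statistics import mean, median, mode

gender = [1] * 10 + [2] * 10  # Column 1: 1 = Female, 2 = Male
score = [99, 98, 87, 87, 99, 88, 77, 77, 88, 99,
         66, 55, 88, 88, 45, 99, 78, 66, 99, 99]  # Column 2: Test Score

def summarize(values):
    """Return the three measures of central tendency for a list of scores."""
    return {"mean": mean(values), "median": median(values), "mode": mode(values)}

# Overall statistics for Test Score.
overall = summarize(score)

# Same statistics computed separately for each gender group.
by_gender = {
    label: summarize([s for g, s in zip(gender, score) if g == code])
    for code, label in {1: "Female", 2: "Male"}.items()
}

print("Overall:", overall)
for label, stats in by_gender.items():
    print(label, stats)
```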

Week 5 Variability & Normal Curve (b/n Descr. & Infer. Statistics) Ch.5

Variability-A single number that expresses how “spread out” or “clustered together” the scores in a distribution are (RANGE is the most common measure of var.)

Q – Semi Interquartile Range (another common measure of variability)
½ the distance between the 1st and 3rd quartile points in a distribution

Quartile Points – the three scores in a distribution that divide the distribution into 4 parts. Use the median to divide the ordered data set into two halves. Include the median in both halves.

Quartile Deviation: (Q3 - Q1) ÷ 2
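
A small sketch of the quartile deviation using the "median goes into both halves" rule from the notes. The function name and the seven-score data set are hypothetical; with an even number of scores the median is not a data point, so the sketch simply splits the list in two.

```python
from statistics import median

def quartile_deviation(data):
    """Return (Q1, Q3, Q) where Q = (Q3 - Q1) / 2."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    if n % 2:
        # Odd n: the median is an actual score and is included in both halves.
        lower, upper = xs[:half + 1], xs[half:]
    else:
        # Even n: the median falls between two scores, so just split in two.
        lower, upper = xs[:half], xs[half:]
    q1, q3 = median(lower), median(upper)
    return q1, q3, (q3 - q1) / 2

q1, q3, q = quartile_deviation([1, 3, 5, 7, 9, 11, 13])  # hypothetical scores
print(q1, q3, q)
```

Other textbooks and software use slightly different quartile rules, so SPSS output may not match this exactly.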

Deviation Scores (Measure of Variability)
X – X bar (raw score minus the mean)
Tells us how far a score is from the mean (+ or -)

Variance (Measure of Variability)- Mean of the Squares of the Deviation Scores

3 - 7 = -4 ..... -4 X -4 = 16
4 - 7 = -3 ..... -3 X -3 = 9
5 - 7 = -2 ..... -2 X -2 = 4
8 - 7 = 1 ..... 1 X 1 = 1
10 - 7 = 3 ..... 3 X 3 = 9
12 - 7 = 5 ..... 5 X 5 = 25
SUM OF SQUARES = 64
64 / 6 = 10.67

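Standard Deviation: s, the MOST commonly used measure of variability (√ variance)
68-95-99 rule: in a normal distribution, about 68% of scores fall within 1 s.d. of the mean, about 95% within 2, and about 99.7% within 3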
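
The variance example can be reproduced step by step as a check. This sketch uses the six scores from the worked example (3, 4, 5, 8, 10, 12) and divides by n, as the notes do; the variable names are illustrative. Note that SPSS reports the sample version, which divides by n - 1, so its numbers will be somewhat larger.

```python
from math import sqrt

scores = [3, 4, 5, 8, 10, 12]  # scores from the worked example
n = len(scores)

mean = sum(scores) / n                            # 42 / 6 = 7.0
deviations = [x - mean for x in scores]           # X minus X bar
sum_of_squares = sum(d ** 2 for d in deviations)  # 64.0

variance = sum_of_squares / n  # mean of the squared deviations (divide by n)
sd = sqrt(variance)            # standard deviation = square root of variance

print(round(variance, 2), round(sd, 2))  # 10.67 3.27
```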

Z scores-A statement about how far away a score is from the mean in standard deviation units. (z = score minus the mean divided by Standard Deviation)

Calculating z scores from our earlier example
10 … z = -1.3
20 … z = -1.0
30 … z = -.77
40 … z = -.52
50 … z = -.26
60 … z = 0
Converting scores to z scores does NOT make the distribution a normal distribution. Whatever distribution you had before you calculated your z scores, you will have the same distribution in z score units
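
To see the z-score formula in action, the sketch below converts the six scores from the variance example (3, 4, 5, 8, 10, 12) rather than the scores listed above, whose raw data are not in these notes. It assumes the population standard deviation (divide by n). The z scores end up with a mean of 0 and a standard deviation of 1, but keep the same order and relative spacing as the raw scores.

```python
from statistics import mean, pstdev

scores = [3, 4, 5, 8, 10, 12]  # scores from the variance example
m = mean(scores)               # 7
s = pstdev(scores)             # population SD (divide by n), about 3.27

# z = (score minus the mean) divided by the standard deviation
z = [(x - m) / s for x in scores]
print([round(v, 2) for v in z])

# The transformed scores have mean 0 and SD 1; the shape is unchanged.
print(round(abs(mean(z)), 6), round(pstdev(z), 6))
```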

Standard Scores and the Normal Curve
“Flipping a Coin” Example

Ch.6
Characteristics of the Normal Curve- Symmetrical, Unimodal, Asymptotic, Continuous
Not all normal curves look the same

Week 6 (CORRELATION-more toward Inferential Stat.) Ch.7

What are we doing when we calculate a correlation? MEASURING THE DEGREE OF RELATIONSHIP BETWEEN TWO VARIABLES
Correlation-a change in one of the variables is associated with a change in the other (this does not mean one causes the other)

BIVARIATE CORRELATION-(2 variables) (Scatter Diagram)

Linear Relationship – A relationship that can best be represented by a straight line
SPSS (Pearson’s r) -- ANALYZE->CORR->BIVARIATE
PEARSON’S COEFFICIENT OF CORRELATION (Pearson’s r)-Shows strength & direction of correlation of two sets of data.
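
Outside SPSS, Pearson's r can be computed directly from deviation scores. The sketch below is one common way to write it; the function name `pearson_r` and the five data pairs are hypothetical, chosen only to keep the arithmetic small.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson's r from deviation scores: sum(dx*dy) / sqrt(sum(dx^2) * sum(dy^2))."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]  # hypothetical scores on variable X
y = [2, 4, 5, 4, 5]  # hypothetical scores on variable Y

r = pearson_r(x, y)
print(round(r, 3))  # sign gives the direction, size gives the strength
```

A value near +1 or -1 means a strong linear relationship; a value near 0 means a weak one.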

Week 7 (PREDICTION-even more toward Inferential Stat.) Ch.8

When 2 variables are correlated, prediction is estimating the value of one variable based on another
The stronger the correlation, the better you can predict

*Surry-Note: What’s important is knowing what the prediction number is.
Not too many papers include prediction; most use correlation.

Y-axis (Dependent Var.) var that’s measured
X-axis (Indep. Var.) var that’s manipulated

Regression Line-The straight line that serves as the “best fit” to explain the predicted value of Y based on X
Regression Equation-The statistical formula that gives you the regression line
Assumption of Linearity-The assumption that a straight line is the best way to describe the relationship between X and Y (usually is)

Error of Prediction-The predicted value of Y (Y’) for any given value of X will usually not be the exact value of Y (unless the correlation is -1.0 or +1.0)
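
A minimal sketch of a least-squares regression line and its errors of prediction. The data pairs and the names `fit_line`, `b0`, and `b1` are hypothetical; this is the standard least-squares formula, not necessarily how SPSS computes it internally.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line Y' = b0 + b1 * X."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    return intercept, slope

x = [1, 2, 3, 4, 5]  # hypothetical independent variable
y = [2, 4, 5, 4, 5]  # hypothetical dependent variable

b0, b1 = fit_line(x, y)
predicted = [b0 + b1 * a for a in x]            # Y' for each X
errors = [b - p for b, p in zip(y, predicted)]  # Y minus Y'

print(round(b0, 2), round(b1, 2))
print([round(e, 2) for e in errors])
```

The errors are not all zero because the correlation here is not perfect, but they do sum to zero around the best-fit line.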

Homoscedasticity –the variance of variable Y is the same at all values of X

*PREDICTION DOES NOT PROVE CAUSATION

SPSS (Pearson’s r) -- ANALYZE->CORR->BIVARIATE

SPSS (Regression) -- ANALYZE->REGRESSION->LINEAR (choose X-Indep & Y-Dep)

What Does Correlation Tell Us or Not Tell Us? (Pearson’s r)...THAT’S WHERE THE REGRESSION COEFFICIENT COMES IN...

Correlation Table:

Regression Table:

(B) .420 is the Regression Coefficient - the rate of change of the dependent variable as a function of changes in the independent variable; it is the slope of the regression line
(Beta) .412 is the Correlation (Pearson’s r)
Significance – .071 tells us whether the Correlation is Statistically Significant at the .05 level (here .071 > .05, so it is not)

PREDICTED SCORE ON DEPENDENT VARIABLE (Yhat) = Beta 0 + Beta1 x X1

Predicted score on dependent variable (test 2) = B 0 (constant) 53.246 PLUS Beta 1 (.420) TIMES Score on X (let’s say 88)
Predicted score on Y (test 2) = 53.246 + .420 x 88 = 90.206
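
The prediction step is simple enough to check with a few lines of Python. The function name `predict` is made up for this sketch; the constant (53.246), the coefficient (.420), and the score on X (88) are the values from the example above.

```python
def predict(b0, b1, x):
    """Predicted score on the dependent variable: Yhat = b0 + b1 * x."""
    return b0 + b1 * x

# Constant 53.246, regression coefficient .420, score on X of 88.
y_hat = predict(53.246, 0.420, 88)
print(round(y_hat, 3))  # 90.206
```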

*Note: the SPSS stats are wrong...use this as a guide only

What would a score of 75 on the Pretest Predict the Final Exam Score to Be?

PREDICTED SCORE ON DEPENDENT VARIABLE = Beta 0 + Beta1 x X1

Predicted score on DV (Final Exam) = 740 + (-5.091 x 75)

740 + (-381.82) = 358.18

A score of 75 on the pretest would predict a score of 358.18 on the final