Expected values, covariance, correlation and expected values Introduction to Bivariate Regression

Sept 17 Correlation Covariation DSI


  • Review: mean, mode, median, frequency, variance, standard deviation

  • Is the perception that the majority of Russians believe the same way you do related to how often you discuss politics with friends?

  • Is this a causal relationship? [Diagram linking "Majority of Russians believe the same" and "Discussions of politics with friends"]

  • When it comes to politics, how close do you think your opinions are to the opinions of the majority of Russians? Very close, rather close, not very close, or not close at all?

    freq vars = majrcl / stats = mean stddev var.

  • How often do you do the following: discuss political questions with friends, neighbors, or coworkers? Almost never, a few times a year, a few times a month, a few times a week, or practically every day?

    freq vars = discfrnd / stats = mean stddev var.

  • Review standard deviation and variance

    Variance: for each unit or observation, take its distance from the mean and square it; then add up the squared distances and divide by the number of units.

    Standard deviation: the square root of the variance. Because variance is in squared units, it is hard to interpret directly. The standard deviation can be understood in terms of the original measurement unit.

  • Calculating variance and standard deviations

    unit value   mean     distance   squared distance
    4            2.9375    1.06      1.13
    3            2.9375    0.06      0.00
    4            2.9375    1.06      1.13
    1            2.9375   -1.94      3.75
    1            2.9375   -1.94      3.75
    4            2.9375    1.06      1.13
    3            2.9375    0.06      0.00
    3            2.9375    0.06      0.00
    3            2.9375    0.06      0.00
    2            2.9375   -0.94      0.88
    3            2.9375    0.06      0.00
    3            2.9375    0.06      0.00
    4            2.9375    1.06      1.13
    5            2.9375    2.06      4.25
    1            2.9375   -1.94      3.75
    3            2.9375    0.06      0.00

    Variance = 1.31; standard deviation = 1.1439378261
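The spreadsheet calculation can be reproduced in a few lines. This is a minimal sketch in Python (the slides themselves use SPSS-style syntax, so the language and the variable names here are assumptions). It divides by the number of units, as the slide does, rather than by n - 1.

```python
import math

# The 16 unit values from the worked table.
values = [4, 3, 4, 1, 1, 4, 3, 3, 3, 2, 3, 3, 4, 5, 1, 3]

n = len(values)
mean = sum(values) / n

# Distance of each unit from the mean, and that distance squared.
distances = [v - mean for v in values]
squared_distances = [d ** 2 for d in distances]

# Variance: sum of squared distances divided by the number of units
# (population form, matching the table; not the n - 1 sample form).
variance = sum(squared_distances) / n
std_dev = math.sqrt(variance)

print(f"mean = {mean}")
print(f"variance = {variance:.2f}")
print(f"standard deviation = {std_dev:.4f}")
```

The table shows distances rounded to two decimals, so its "0.06" and "0.00" entries correspond to 0.0625 and about 0.0039 before rounding.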

  • Review: Units, mean, variance and standard deviation

    majrcl   discfrnd
    2.00     4.00
    2.00     3.00
    .        4.00
    .        1.00
    2.00     1.00
    3.00     4.00
    3.00     3.00
    3.00     3.00
    .        3.00
    2.00     2.00
    .        3.00
    3.00     3.00
    3.00     4.00
    3.00     5.00
    3.00     1.00
    3.00     3.00
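A rough Python analogue of the frequency-and-statistics request for majrcl, as a sketch only. It assumes the listing alternates majrcl and discfrnd values row by row and that "." marks a missing answer; the variable names are illustrative. Missing answers are dropped before any statistic is computed.

```python
from collections import Counter
from statistics import mean, pstdev, pvariance

# majrcl values from the 16 rows shown; None stands for a missing answer (".").
majrcl = [2, 2, None, None, 2, 3, 3, 3, None, 2, None, 3, 3, 3, 3, 3]

# Drop missing answers before computing any statistic.
valid = [v for v in majrcl if v is not None]
n_missing = len(majrcl) - len(valid)

# Frequency table and mode (the most common answer).
freq = Counter(valid)
mode_value, mode_count = freq.most_common(1)[0]

print("frequencies:", dict(sorted(freq.items())))
print("missing:", n_missing)
print("mode:", mode_value)
print("mean:", round(mean(valid), 4))
print("variance (divide by n):", round(pvariance(valid), 4))
print("standard deviation:", round(pstdev(valid), 4))
```

These figures describe only the 16 rows listed here, not the full survey.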

  • Expected value v. probability: If our population set of numbers is 1, 1, 3, 3, 17, then the expected value is 5, even though P(5) = 0. Suppose we know that E(X) = 5, with the equation y = 5 + 7x. What is E(Y)?
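One way to check the reasoning on this slide is to compute it directly. The sketch below assumes each of the five numbers is equally likely to be drawn, and uses the rule that E(a + bX) = a + b·E(X); the function and variable names are illustrative.

```python
from fractions import Fraction

# The population from the slide; each number is assumed equally likely.
population = [1, 1, 3, 3, 17]

def expected_value(values):
    """Sum of value * probability, where every draw has probability 1/n."""
    n = len(values)
    return sum(Fraction(v, n) for v in values)

e_x = expected_value(population)

# Probability of drawing a number equal to the expected value itself.
p_of_e_x = Fraction(population.count(e_x), len(population))

# Linearity of expectation: E(a + bX) = a + b * E(X).
a, b = 5, 7
e_y_shortcut = a + b * e_x
e_y_direct = expected_value([a + b * x for x in population])

print("E(X) =", e_x)
print("P(X = E(X)) =", p_of_e_x)
print("E(Y) via a + b*E(X) =", e_y_shortcut)
print("E(Y) computed directly =", e_y_direct)
```

Both routes to E(Y) agree, which is why knowing E(X) alone is enough to answer the question.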

  • Expected values: What is the expected value of majrcl? What is the range? Mode? Why are there 63 missing? What is the expected value of discfrnd? Why are the standard deviation and variance so high?

  • Crosstab

  • Causation: time ordering, covariation

  • Co-variation from variation?

    $\sum_i (x_i - \bar{x})^2 / n$: the average squared distance between the mean of x and each x value

    aka $\sum_i (x_i - \bar{x})(x_i - \bar{x}) / n$

  • Covariation?

    $\sum_i (x_i - \bar{x})(y_i - \bar{y}) / (n - 1)$
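The step from variation to covariation can be sketched in code: the same sum of deviation products gives the variance when a variable is paired with itself. The helper name `covariation` and its `denominator` argument are assumptions made for illustration, so that both the n and the n - 1 versions can be shown.

```python
# discfrnd values from the worked variance table.
x = [4, 3, 4, 1, 1, 4, 3, 3, 3, 2, 3, 3, 4, 5, 1, 3]

def covariation(xs, ys, denominator):
    """Sum of (x_i - xmean) * (y_i - ymean), divided by `denominator`."""
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    products = [(xi - x_mean) * (yi - y_mean) for xi, yi in zip(xs, ys)]
    return sum(products) / denominator

n = len(x)

# Pairing x with itself and dividing by n reproduces the variance in the table.
variance_n = covariation(x, x, n)

# The same sum divided by n - 1 is the sample version used for covariance.
variance_n_minus_1 = covariation(x, x, n - 1)

print(round(variance_n, 2))
print(round(variance_n_minus_1, 2))
```

Replacing the second `x` with a different variable of the same length turns the variance into a covariance.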

  • Covariation: covariance can take any value from negative infinity to positive infinity

  • Intuitive explanation: $\sum_i (x_i - \bar{x})(y_i - \bar{y}) / (n - 1)$. When x and y are high at the same time and x and y are low at the same time, then the covariance is positive. In each pair both values are on the same side of their means (both above or both below), so the products being added together are positive.

  • Plot showing positive covariance (axes: mean female literacy, mean urban %)

  • Intuitive explanation: $\sum_i (x_i - \bar{x})(y_i - \bar{y}) / (n - 1)$. When x is low when y is high and vice versa, then the covariance is negative. In each pair one value is above its mean while the other is below its mean, so the products being added together are negative.

  • Plot showing negative covariance (axes: mean infant mortality, mean calorie intake)

  • Intuitive explanation: $\sum_i (x_i - \bar{x})(y_i - \bar{y}) / (n - 1)$. When about half of the time x and y are high at the same time and low at the same time, and about the other half of the time x is low when y is high and vice versa, then the covariance is about 0.

    Large positive products are added to large negative products, so they roughly cancel out

  • Plot showing no covariance (axes: mean crop production, mean GDP)
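The three cases (positive, negative, and roughly zero covariance) can be illustrated with a tiny example. The numbers below are made up for illustration and are not the data behind the plots; the function name is likewise an assumption.

```python
def covariance(xs, ys):
    """Sample covariance: sum of deviation products divided by n - 1."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    return sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / (n - 1)

# Made-up illustration data, not taken from the plots in the slides.
x = [1, 2, 3, 4]
y_together = [2, 4, 6, 8]  # high when x is high, low when x is low
y_opposite = [8, 6, 4, 2]  # high when x is low, and vice versa
y_mixed = [1, 3, 3, 1]     # positive and negative products cancel

cov_positive = covariance(x, y_together)
cov_negative = covariance(x, y_opposite)
cov_zero = covariance(x, y_mixed)

print(cov_positive, cov_negative, cov_zero)
```

In the mixed case the four products are 1.5, -0.5, 0.5 and -1.5, which sum to zero.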

  • Covariance is a function of: the variance (standard deviation) of x, the variance (standard deviation) of y, and the relationship between x and y

  • How can you compare a covariance of 132 with one of 134,847? The 134,847 could reflect high variance of x, high variance of y, high variance of both variables, or a strong relationship between x and y.

    Not that helpful?
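To see why a raw covariance is hard to compare, consider the same measurements expressed in different units. The sketch below uses made-up heights and weights (not data from the slides): the relationship between the two variables is unchanged, yet the covariance grows by the product of the two scale factors.

```python
def covariance(xs, ys):
    """Sample covariance: sum of deviation products divided by n - 1."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    return sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / (n - 1)

# Made-up data: the same five people, measured in two sets of units.
height_m = [1.5, 1.6, 1.7, 1.8, 1.9]
weight_kg = [50, 58, 66, 71, 80]

height_cm = [h * 100 for h in height_m]
weight_g = [w * 1000 for w in weight_kg]

cov_m_kg = covariance(height_m, weight_kg)
cov_cm_g = covariance(height_cm, weight_g)

# Rescaling x by 100 and y by 1000 multiplies the covariance by 100 * 1000.
ratio = cov_cm_g / cov_m_kg

print(cov_m_kg, cov_cm_g, ratio)
```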

  • How can you change the covariance to a number that tells you only the magnitude of the relationship between x and y? Divide by the standard deviation of x times the standard deviation of y.

    Correlation: $r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y}) / (n - 1)}{s_x \, s_y}$, where $s_x$ and $s_y$ are the standard deviations of x and y

    Pearson r ranges from -1 to +1. Weak correlation = .1, moderate correlation = .4, strong correlation = .7
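As a sketch of the calculation, the correlation can be computed for the 16 majrcl/discfrnd rows listed earlier in the deck. This assumes the listing alternates majrcl and discfrnd values, treats "." as missing, and drops any row with a missing answer; the function names are illustrative. With only 12 complete rows, the result describes this excerpt and says nothing reliable about the full survey.

```python
import math

# (majrcl, discfrnd) pairs from the 16 rows shown; None marks a missing answer.
rows = [(2, 4), (2, 3), (None, 4), (None, 1), (2, 1), (3, 4), (3, 3), (3, 3),
        (None, 3), (2, 2), (None, 3), (3, 3), (3, 4), (3, 5), (3, 1), (3, 3)]

# Keep only rows where both answers are present.
complete = [(x, y) for x, y in rows if x is not None and y is not None]
xs = [x for x, _ in complete]
ys = [y for _, y in complete]

def covariance(a, b):
    """Sample covariance: sum of deviation products divided by n - 1."""
    n = len(a)
    a_mean = sum(a) / n
    b_mean = sum(b) / n
    return sum((p - a_mean) * (q - b_mean) for p, q in zip(a, b)) / (n - 1)

def std_dev(a):
    """Sample standard deviation, using the same n - 1 denominator."""
    return math.sqrt(covariance(a, a))

def correlation(a, b):
    """Pearson r: covariance divided by the product of standard deviations."""
    return covariance(a, b) / (std_dev(a) * std_dev(b))

r = correlation(xs, ys)

# Rescaling a variable changes the covariance but leaves r unchanged.
xs_rescaled = [10 * x for x in xs]
r_rescaled = correlation(xs_rescaled, ys)

print(round(covariance(xs, ys), 4), round(covariance(xs_rescaled, ys), 4))
print(round(r, 4), round(r_rescaled, 4))
```

Using the same n - 1 denominator for the covariance and for both standard deviations is what keeps r between -1 and +1.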