Using Real Data For Decision Analysis

  • 8/14/2019 Using Real Data For Decision Analysis


    Which class would you take?

    Next semester, two professors will be teaching Decision Theory:

    Dr. Uragonnafail vs. Dr. Pieceofcake

    Based on RateMyProfessor.com, their past grades over the last five years are posted:

    Dr. Uragonnafail: mean = 1750.82
    Dr. Pieceofcake: mean = 1770.16
    (out of 2000 points)

    Which professor would you sign up for?


    Finding the Distribution

    Dr. Uragonnafail (mean = 1750.82) vs. Dr. Pieceofcake (mean = 1770.16), out of 2000 points

    [Figure: Dr. Uragonnafail GPA distribution]


    Standardizing the axis

    [Figure: Dr. Pieceofcake GPA distribution. x-axis: values in thousands, 1.5 to 1.955; y-axis: values x10^-2, 0 to 6]

    [Figure: Dr. Uragonnafail GPA distribution. x-axis: values in thousands, 1.5 to 1.955; y-axis: values x10^-3, 0 to 7]

    Which professor would you sign up for now?

    Dr. Uragonnafail: mean = 1750.82, var = 63.8635
    Dr. Pieceofcake: mean = 1770.16, var = 7.139664
    (out of 2000 points)


    A matter of perspective

    Which professor would you want if you were a 4.0 student?

    What if you were a slacker?

    Remember: it is not just the central tendency that is important; the dispersion is also critical.
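To make the point about dispersion concrete, here is a minimal sketch that compares the chance of clearing a score threshold under each professor. It rests on several assumptions that are not in the slides: grades are modeled as normal, the thresholds (1850 for a top student, 1750 as a floor for a slacker) are hypothetical, and the slide's "var" figures are passed in as standard deviations, because that reading is the one that matches the y-axis scales of the two plots (x10^-3 and x10^-2). If the figures are true variances, the numbers change but the idea does not.

```python
from statistics import NormalDist

# Means come from the slides; the normal model and the reading of the
# spread figures as standard deviations are assumptions of this sketch.
professors = {
    "Dr. Uragonnafail": NormalDist(mu=1750.82, sigma=63.8635),
    "Dr. Pieceofcake": NormalDist(mu=1770.16, sigma=7.139664),
}

def prob_above(dist, threshold):
    """Probability that a score exceeds the threshold under the model."""
    return 1.0 - dist.cdf(threshold)

# Hypothetical thresholds, chosen only for illustration.
for goal, threshold in [("top student", 1850), ("slacker", 1750)]:
    for name, dist in professors.items():
        p = prob_above(dist, threshold)
        print(f"{goal}: {name}: P(score > {threshold}) = {p:.4f}")
```

Under these assumptions the wider distribution offers the only realistic chance of a very high score, while the narrower one almost guarantees a score above the floor, which is why the same two means can lead different students to different choices.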


    There are a variety of probability models that can be used to help make decisions:

    Binomial distribution
    Poisson distribution
    Normal distribution
    Exponential distribution
    Beta distribution, etc.

    It is always important to use the correct distribution to explain your data

    BUT

    More importantly, it is essential to always consider the context in which you are making the decision.


    Caveats in using real data to make decisions

    Fallacy of averages
    Assumptions of normality
    Errors in estimations
    Impact of outliers
    Residual values


    Fallacy of averages

    [Figure: frequency of developed drugs (0 to 0.45) against duration of drug development (0 to 20), with the overall mean / average marked]


    Fallacy of Averages

    [Figure: frequency of developed drugs (0 to 0.45) against duration of drug development (0 to 20), with two means marked: mean duration of the typical process and mean duration of complex processes]

    A simple case of the fallacy of averages is the overall height of a specific population. Men and women each have their own roughly normal height distribution with a different mean, so a single average for the combined population describes neither group well.
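A small sketch of the same effect for the drug development example: when two kinds of process are pooled, the overall mean can land where almost no project actually finishes. The durations below are hypothetical values made up for illustration, not data from the slides.

```python
from statistics import mean

# Hypothetical development durations; not data from the slides.
typical = [3, 4, 4, 5, 4, 3, 5, 4]
complex_ = [13, 14, 15, 14]

pooled = typical + complex_
pooled_mean = mean(pooled)

# Projects whose duration is within one unit of the pooled mean.
near_pooled = [d for d in pooled if abs(d - pooled_mean) <= 1]

print(f"pooled mean:  {pooled_mean:.2f}")
print(f"typical mean: {mean(typical):.2f}")
print(f"complex mean: {mean(complex_):.2f}")
print(f"projects within 1 of the pooled mean: {len(near_pooled)}")
```

Planning around the pooled mean here would be wrong for every project in the sample; the two group means are the figures that carry information.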


    Assumptions of Normality

    Distributions of real data are not always bell shaped.

    Requirements for a normal distribution:
    symmetrical
    68-95-99.7 rule

    Not all distributions are normal; few are perfectly normally distributed.

    Much of decision analysis relies on assumptions of the normal curve to make calculations easier. It is important to understand the limitations of the normal curve when basing your decisions on it.
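One quick way to test the normal assumption is to compare a sample against the 68-95-99.7 rule. The sketch below computes what a true normal curve predicts within 1, 2, and 3 standard deviations and what a sample actually shows. The skewed sample is hypothetical and the helper name `coverage_within` is introduced here for illustration only.

```python
from statistics import NormalDist

def coverage_within(data, k):
    """Fraction of observations within k sample standard deviations of the mean."""
    dist = NormalDist.from_samples(data)
    lo, hi = dist.mean - k * dist.stdev, dist.mean + k * dist.stdev
    return sum(lo <= x <= hi for x in data) / len(data)

# What a true normal curve predicts for k = 1, 2, 3.
standard = NormalDist()
expected = {k: standard.cdf(k) - standard.cdf(-k) for k in (1, 2, 3)}

# Hypothetical right-skewed sample: mostly small values plus a long tail.
skewed = [1] * 60 + [2] * 20 + [5] * 10 + [20] * 6 + [60] * 4

for k in (1, 2, 3):
    print(f"k={k}: normal predicts {expected[k]:.3f}, "
          f"sample has {coverage_within(skewed, k):.3f}")
```

A large gap between the two columns is a sign that calculations built on the normal curve may mislead for this data.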


    Errors in Estimations

    Some linear regressions force the line through the 0-0 point
    (meaning: x-intercept = 0, y-intercept = 0).

    When fitting regressions to the data, understand the range over which your estimation is reasonable.

    Would it make sense to force the regression line through the 0-0 point in this case?
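The effect of forcing the line through 0-0 can be sketched with two least-squares fits on the same data, one with a free intercept and one with the intercept fixed at zero. The data are hypothetical, observed only for x between 10 and 14, and the function names are illustrative.

```python
def fit_with_intercept(xs, ys):
    """Ordinary least squares: returns (intercept, slope)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def fit_through_origin(xs, ys):
    """Least squares with the intercept fixed at 0: returns the slope only."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def sse(xs, ys, intercept, slope):
    """Sum of squared errors of the line over the observed points."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, observed only for x between 10 and 14.
xs = [10, 11, 12, 13, 14]
ys = [70.5, 71.8, 74.1, 76.2, 77.9]

a, b = fit_with_intercept(xs, ys)
b0 = fit_through_origin(xs, ys)

print(f"free intercept: y = {a:.2f} + {b:.2f}x, SSE = {sse(xs, ys, a, b):.2f}")
print(f"through origin: y = {b0:.2f}x, SSE = {sse(xs, ys, 0.0, b0):.2f}")
```

Neither fit says anything reliable about x near 0, since no observations exist there; the forced fit additionally distorts the slope inside the observed range.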


    Impact of Outliers

    Outliers can bias the regression estimate to accommodate the extreme data point(s). What happens if you add Maudie Hopkins (19) [last Civil War widow] and William Cantrell (86), and Anna Nicole Smith (26) and J. Howard Marshall II (89)? Or, more recently, Demi Moore (42) and Ashton Kutcher (27)?
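A rough sketch of what those couples do to a fitted line: the three pairs named above use the ages given on the slide, while the baseline couples are hypothetical stand-ins for a typical dataset in which spouses are close in age.

```python
def ols_slope(points):
    """Least-squares slope of wife's age regressed on husband's age."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    return sxy / sxx

# Hypothetical baseline couples as (husband age, wife age); not from the slides.
baseline = [(25, 23), (30, 29), (34, 31), (40, 38),
            (45, 44), (52, 49), (58, 57), (63, 60)]

# Couples named on the slide, as (husband age, wife age).
outliers = [(86, 19), (89, 26), (27, 42)]

slope_before = ols_slope(baseline)
slope_after = ols_slope(baseline + outliers)

print(f"slope without the outliers: {slope_before:.3f}")
print(f"slope with the outliers:    {slope_after:.3f}")
```

With this made-up baseline, three extreme couples out of eleven are enough to flip the sign of the slope, so the fitted line no longer describes the typical couple at all.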


    Impact of Outliers

    Outliers can also give a false sense of correlation between two variables. In a correlation test, the strength of the relationship between the husband's age and the wife's age would be incorrectly accentuated, giving a misleading impression of how compact the dataset is.
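The false sense of correlation can be reproduced with a minimal, hypothetical example: a small cloud of couples whose ages are unrelated, plus one elderly couple far from the rest. All data points are invented for illustration.

```python
from math import sqrt

def pearson_r(points):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    syy = sum((y - y_bar) ** 2 for _, y in points)
    return sxy / sqrt(sxx * syy)

# Hypothetical (husband age, wife age) pairs with no linear relationship.
cloud = [(30, 30), (30, 40), (40, 30), (40, 40), (35, 35)]

# One hypothetical extreme couple, far from the rest on both axes.
extreme = (88, 85)

r_cloud = pearson_r(cloud)
r_with_outlier = pearson_r(cloud + [extreme])

print(f"r without the outlier: {r_cloud:.3f}")
print(f"r with the outlier:    {r_with_outlier:.3f}")
```

A single point turns no relationship into what looks like a strong one, which is why it helps to plot the data before trusting a correlation coefficient.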


    Residual value

    Two datasets can give identical regression estimates. The graph on the right has larger residuals, so there is greater error. In other words, x is not as strong a predictor of y.
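As a sketch of this situation, the two hypothetical datasets below are built around the same line, with a residual pattern chosen so that least squares recovers identical coefficients for both; only the size of the residuals differs. The datasets and names are illustrative and do not come from the slides.

```python
def fit(xs, ys):
    """Ordinary least squares: returns (intercept, slope)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def residual_sum_of_squares(xs, ys):
    """Sum of squared residuals around the fitted line."""
    intercept, slope = fit(xs, ys)
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

xs = [1, 2, 3, 4, 5]
line = [1 + 2 * x for x in xs]

# This pattern sums to zero and is uncorrelated with x, so adding any
# multiple of it leaves the fitted intercept and slope unchanged.
pattern = [1, -2, 0, 2, -1]

tight = [y + 0.1 * e for y, e in zip(line, pattern)]  # small residuals
loose = [y + 1.0 * e for y, e in zip(line, pattern)]  # large residuals

print("tight fit:", fit(xs, tight), "RSS:", residual_sum_of_squares(xs, tight))
print("loose fit:", fit(xs, loose), "RSS:", residual_sum_of_squares(xs, loose))
```

The coefficients alone cannot distinguish the two datasets; the residual sum of squares can.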


    Things to remember about real data

    It is messy!

    It is not always memoryless, i.e., the recent past really is a better indicator of future performance.

    It can have many outliers that are:
      important indications of new trends, or
      oddities that should be eliminated.

    There is a major difference between statistical significance and practical significance.

    Don't just look at the statistical results, look at the data itself!
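The gap between statistical and practical significance can be sketched with a simple two-sample z-test. Every number here is hypothetical: two groups whose mean scores differ by 0.1 points out of 2000, a common known standard deviation of 10, and one million students per group.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(mean_a, mean_b, sd, n):
    """z statistic and two-sided p-value for two groups of size n
    with a common, known standard deviation."""
    z = (mean_a - mean_b) / (sd * sqrt(2 / n))
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Hypothetical figures chosen only for illustration.
z, p = two_sample_z(1770.1, 1770.0, sd=10, n=1_000_000)

# The same difference expressed as a share of the 2000-point scale.
effect_share = 0.1 / 2000

print(f"z = {z:.2f}, p = {p:.2e}")
print(f"difference as a share of the scale: {effect_share:.5%}")
```

With a sample this large the difference is overwhelmingly significant in the statistical sense, yet a tenth of a point out of 2000 would not change anyone's choice of professor.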