RandomVariables ProbDistributions Complete


  • Random Variables and

    Probability Distributions

    Ravindra S. Gokhale

    IIM Indore

    1

  • Random Variables

    A variable that associates a number with the outcome of a

    random experiment is a random variable

    Denoted by an uppercase letter such as X, Y, etc.

    A random variable can take only numeric values.

    Toss of a coin is NOT a random variable.

    [It is an experiment that yields random results]

    However, number of heads from toss of a coin is a random

    variable

    2

  • Random Variables

    Examples:

    If two dice are thrown, the sum of the faces is a random variable, as in: X = 3, X = 11, etc.

    If two coins are tossed, then the number of heads is a random variable, as in: X = 0, X = 2, etc.

    In a speed (rpm) measurement: X = 457, X = 1209, etc.

    In a dimension measurement with the help of a caliper: X = 23.46, X = 48.97, etc.

    3

  • Random Variables (cont)

    A random variable with a finite or countably infinite range

    Examples:

    Number of scratches on a car surface

    Proportion of defective parts among 1000 tested

    Number of people arriving at a bank in a given time interval

    4

    Discrete Random Variables

  • Random Variables (cont)

    A random variable with an interval (either finite or infinite) of real

    numbers for its range

    Examples:

    Length dimension (like the length of a table)

    Time dimension (like time between failure for a machine)

    Temperature dimension (like temperature inside a room)

    5

    Continuous Random Variables

  • Random Variables (cont)

    Population in a particular state of India.

    Total weight of consignments handled by a courier company in a

    day.

    Time to complete an exam.

    Number of participants in an exit poll.

    Total number of goals scored in a football game.

    Life of a particular medicine.

    Height of the Ocean's tide at a given location.

    Amount of rain on a particular day.

    Number of train derailments in a year.

    6

    Discrete or Continuous?

  • Random Variables (cont)

    The manner in which random variables are expressed sometimes

    depends on the problem at hand

    Sometimes a random variable is discrete in nature, but it is treated as continuous

    This is because the range of values it can take is too large

    Example: Marks of a student in a 100 marks paper

    7

    Expression of Random Variables

  • Random Variables (cont)

    Sometimes a random variable is continuous in nature, but it is treated as discrete

    This is because the exact value (to the smallest level) is not required

    Example: Age of a person may be expressed as a discrete random variable forming different categories: 0-21, 21-35, 35-50, 50-65, 65+

    8

    Expression of Random Variables

  • Probability Distributions

    Probability distribution of a random variable X is a formula, table,

    or graph that gives all possible values of X and corresponding

    probabilities P(X = x) for all x's in the domain of X.

    Example: Probability distribution of the roll of a die:

    9

    x           1     2     3     4     5     6
    P(X = x)   1/6   1/6   1/6   1/6   1/6   1/6

  • Probability Distributions

    Standard probability models (probability distributions) are

    available in the literature and have been studied in detail.

    These models can mimic many real life scenarios very well and

    have mathematically tractable representation.

    10

  • Discrete Random Variables

    Examples:

    If two coins are tossed and we are interested in the number of heads obtained, then:

    P(X = 0) = 0.25 P(X = 1) = 0.50

    P(X = 2) = 0.25 P(X = 3) = 0

    P(X > 1) = 1 - [P(X = 0) + P(X = 1)] = 1 - (0.25 + 0.50) = 0.25

    In a lot that contains 10% defective pieces, if we are interested in the number of defective pieces in a sample of 5 then:

    P(X = 0) = 0.590 P(X = 1) = 0.328

    P(X = 2) = 0.073 P(X = 3) = 0.008

    P(X = 4) ≈ 0.0005 P(X = 5) ≈ 0.00001

    P(X

  • Discrete Random Variables (cont)

    Terminologies associated with discrete random variables

    Probability mass function (pmf) denoted by f(x)

    Cumulative distribution function (cdf) denoted by F(x)

    12

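    [Figure: Probability mass function of a fair die, f(xi) = P(X = xi); horizontal axis x = 1, ..., 6, vertical axis f(x) = 1/6 at each value]

    [Figure: Cumulative distribution function of a fair die, F(x) = P(X ≤ x); horizontal axis x = 1, ..., 6, vertical axis F(x) rising in steps of 1/6 from 0 to 1]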

  • Discrete Random Variables (cont)

    Mean of a discrete random variable

    Mean is the expected value of the random variable, denoted by $\mu$ or E(X)

    It is the measure of the center of the probability distribution

    Formula: $\mu = E(X) = \sum_x x \, f(x)$

    If we make an infinite number of draws from the distribution of a random variable and calculate the average of the data, then the average is the expected value (or mean) of the random variable.

    Note: The expected value should not be confused with the most likely value.

    13

    Mean and Variance of a Discrete Random Variable

  • Discrete Random Variables (cont)

    A simple example:

    You can insure jewellery worth Rs. 500,000 against theft for its total value by paying an annual premium of Rs. R. If the probability of theft in a given year is estimated to be 0.01, what premium should the insurance company charge if it wants an annual expected gain equal to Rs. 10,000?

    14

    Mean and Variance of a Discrete Random Variable
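One way to set up the insurance example is sketched below. It assumes the insurer's gain is R when there is no theft and R minus the insured value when there is a theft, so the expected gain is R - p × (insured value). The function name `required_premium` is illustrative only and not part of the original slides.

```python
def required_premium(insured_value, p_theft, target_gain):
    """Premium R for which the insurer's expected annual gain equals target_gain.

    Assumed model: gain = R with probability (1 - p_theft),
                   gain = R - insured_value with probability p_theft,
    so E(gain) = R - p_theft * insured_value.
    """
    return target_gain + p_theft * insured_value


premium = required_premium(insured_value=500_000, p_theft=0.01, target_gain=10_000)
print(premium)
```

Under these assumptions the expected payout is 0.01 × 500,000 = Rs. 5,000, so the premium has to cover that amount plus the target gain.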

  • Discrete Random Variables (cont)

    Variance of a discrete random variable

    Denoted by $\sigma^2$ or V(X)

    It is the measure of the dispersion or variability in the probability distribution

    Formula: $\sigma^2 = V(X) = \sum_x (x - \mu)^2 f(x) = \left[ \sum_x x^2 f(x) \right] - \mu^2$

    The standard deviation ($\sigma$) of X is the (positive) square root of the variance

    15

    Mean and Variance of a Discrete Random Variable
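As a small illustration of the mean and variance formulas, the sketch below applies them to the fair-die distribution shown earlier. The helper name `mean_and_variance` and the dictionary representation of the pmf are choices made for this example; exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def mean_and_variance(pmf):
    """Return (mean, variance) for a discrete pmf given as {value: probability}."""
    mean = sum(x * p for x, p in pmf.items())
    second_moment = sum(x * x * p for x, p in pmf.items())
    # Shortcut form of the variance: E(X^2) - mu^2
    return mean, second_moment - mean ** 2


# pmf of a fair die: each face has probability 1/6
die_pmf = {x: Fraction(1, 6) for x in range(1, 7)}
die_mean, die_var = mean_and_variance(die_pmf)
print(die_mean, die_var)  # 7/2 35/12
```

Note that the mean 7/2 is not a value the die can actually show, which is one reason the expected value should not be read as the most likely value.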

  • Bernoulli Trial

    A basic building block for all the discrete probability distributions

    A trial has only two possible outcomes

    Usually termed as: success and failure

    Examples:

    Did tossing of a coin lead to a head (success) or not?

    Did the student pass the exam (success) or not?

    Did India lose the match (success) or not?

    Was the part defective (success) or not?

    Probability of success is denoted by p

    16

  • Bernoulli Trial

    Mean of a Bernoulli Trial = p

    Variance of a Bernoulli Trial = p (1 - p)

    Derive
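A possible derivation is sketched below. It assumes the usual coding in which the random variable X equals 1 for a success and 0 for a failure; the slides do not state this coding explicitly.

```latex
\begin{align*}
P(X = 1) &= p, \qquad P(X = 0) = 1 - p \\
E(X)     &= 0 \cdot (1 - p) + 1 \cdot p = p \\
E(X^2)   &= 0^2 \cdot (1 - p) + 1^2 \cdot p = p \\
V(X)     &= E(X^2) - [E(X)]^2 = p - p^2 = p\,(1 - p)
\end{align*}
```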

    17

  • Binomial Random Variable

    A random experiment consists of n Bernoulli trials such that:

    The trials are independent

    Each trial results in only two possible outcomes: success and failure

    The probability of success in each trial (denoted as p) remains constant

    The random variable X that equals the number of trials that result

    in a success follows a Binomial Distribution

    18

  • Binomial Random Variable

    Exercise:

    A jar contains five rings: three red and two white. Two rings are randomly selected without replacement from the jar, and the number X of red rings obtained is recorded.

    Explain why X is or is not a binomial random variable.

    19

  • Binomial Random Variable

    A random variable X following a Binomial Distribution is denoted

    by:

    X ~ Binomial(n, p)

    n and p are the parameters of the binomial distribution

    n = 1, 2, 3, … and 0 < p < 1

    What is the meaning of Parameters of a Distribution?

    20

  • Binomial Random Variable

    The pmf of the binomial random variable X is given by:

    $f(x) = {}^{n}C_{x} \, p^{x} (1 - p)^{n - x}, \quad x = 0, 1, 2, \ldots, n$

    This is the probability of x successes in n trials

    Note: ${}^{n}C_{x} = n! / [x! \, (n - x)!]$

    Understanding the pmf of a Binomial Distribution:

    Probability of x successes is $p^{x}$.

    Probability of (n - x) failures is $(1 - p)^{n - x}$.

    x successes in n trials can happen in ${}^{n}C_{x}$ ways
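The pmf above can be evaluated directly. The sketch below uses the earlier example of a lot with 10% defective pieces and a sample of 5; it treats the number of defectives as Binomial(5, 0.10), which is an assumption about how that example was computed. The function name `binomial_pmf` is illustrative.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p): nCx * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


# Number of defectives in a sample of 5 when 10% of the lot is defective
probs = [binomial_pmf(x, n=5, p=0.10) for x in range(6)]
for x, prob in enumerate(probs):
    print(f"P(X = {x}) = {prob:.5f}")
```

The six probabilities sum to 1, which is a quick sanity check on any pmf.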

    21

  • Binomial Random Variable

    For a binomial random variable with parameters n and p:

    mean = $\mu$ = n p and variance = $\sigma^2$ = n p (1 - p)

    Derive

    22

  • Binomial Random Variable

    Effect of parameters on the shape of the distribution.

    23

  • Problems

    A batch of 500 machined parts contains 10 that are defective. Parts are selected successively without replacement, until a non-conforming part is obtained. The random variable is the number of parts selected.

    What is the range of the random variable?

    24

    Discrete Random Variables

  • Problems (cont)

    In a semiconductor manufacturing process, three wafers from a

    lot are tested. Each wafer is classified as pass or fail. Assume

    that the probability that the wafer passes the test is 0.8, and that

    the wafers are independent.

    Determine the probability mass function of the number of wafers from a lot that passes the test.

    Determine the cumulative distribution function for the random variable.

    Determine the mean and the variance of the random variable.

    25

    pmf, cdf, mean and variance of Discrete Random Variables

  • Problems (cont)

    Because not all airline passengers show up for their reserved seat,

    an airline adopts a policy of overbooking 5 seats for a flight that

    has a capacity of 120 seats. The probability that a passenger does

    not show up is 0.1 and all the passengers are assumed to behave

    independently.

    What is the probability that every passenger who shows up can take the flight?

    What is the probability that the flight departs with empty seat(s)?

    26

    Binomial Distribution

  • Problems (cont)

    A manufacturer has 100 customer orders to be satisfied. Each

    order requires one component part that is purchased from a

    supplier. However, 2% of the components are identified as

    defective and the components are assumed to be independent.

    If the manufacturer stocks 100 components what is the probability that the 100 orders can be filled up?

    If the manufacturer stocks 105 components what is the probability that the 100 orders can be filled up?

    27

    Binomial Distribution

  • Problems (cont)

    A multiple choice test contains 25 questions, each with 4 answers.

    Assume that a student just guesses each question.

    What is the probability that a student answers more than 20 questions correctly?

    What is the probability that the student answers less than 5 questions correctly?

    What will be the mean marks scored by a student?

    What is the variance of the marks scored?

    28

    Binomial Distribution

  • Poisson Distribution

    It is derived from a binomial distribution

    Limiting case of the binomial distribution

    Applied to systems with a large number of possible events, each of which is rare

    29

  • Poisson Distribution

    With reference to a binomial distribution, if n becomes very large and p becomes considerably small, such that the product of n and p (denoted by $\lambda$) remains some manageable constant, then

    $\lim_{n \to \infty} P(X = x) = \dfrac{e^{-\lambda} \lambda^{x}}{x!}$

    The assumption of independence is still required
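The limiting behaviour can be checked numerically. The sketch below holds the product n × p fixed at an arbitrarily chosen value of 2 and compares the binomial probability of 3 successes with the Poisson value as n grows; the specific numbers are illustrative and not from the slides.

```python
from math import comb, exp, factorial


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam)."""
    return exp(-lam) * lam ** x / factorial(x)


lam = 2.0  # assumed constant value of n * p
x = 3
for n in (10, 100, 10_000):
    # p shrinks as n grows so that n * p stays equal to lam
    print(n, binomial_pmf(x, n, lam / n), poisson_pmf(x, lam))
```

The binomial values move toward the Poisson value as n increases, which is the sense in which the Poisson distribution is a limiting case.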

    30

  • Poisson Distribution

    Example (to distinguish Poisson distribution from Binomial

    distribution):

    A page in a book can have two outcomes: with error and without error

    The probability that a page has an error, in a book published by a good publishing house, is very small (this is the p)

    But the number of pages in the book will be large, say 500 (this is n)

    In this case, the distribution of pages with error in a book may be modeled as a Poisson distribution

    Another Example:

    Number of bike accidents in a city in a week

    31

  • Poisson Distribution

    Typical application:

    Poisson distribution is appropriate for a random variable that

    counts the number of occurrences of an event of interest in a

    given time interval.

    Other application:

    Number of surface defects

    Number of errors

    32

  • Poisson Distribution (cont)

    A random variable X following a Poisson distribution is denoted by

    X ~ Poisson($\lambda$)

    $\lambda$ is the parameter of the Poisson distribution

    $\lambda$ can be considered as the product of n and p

    Note however that Poisson Distribution does not require the knowledge about n and p

    33

  • Poisson Distribution (cont)

    The pmf of the Poisson random variable X is given by:

    $f(x) = \dfrac{e^{-\lambda} \lambda^{x}}{x!}, \quad x = 0, 1, 2, \ldots; \quad \lambda > 0$

    Note: The range of X is integers from 0 to infinity (and not bounded by n, unlike the Binomial Distribution)

    For a Poisson random variable with parameter $\lambda$:

    mean = $\mu$ = $\lambda$ and variance = $\sigma^2$ = $\lambda$

    Derive

    34

  • Poisson Distribution (cont)

    Important Notes:

    In practical applications, $\lambda$ will correspond to the rate of

    something per some unit (example: number of people arriving

    at a bank per hour, number of defects per square meter of a

    surface)

    It is important to use consistent units in calculating

    probabilities, means, and variances involving Poisson random

    variables

    35

  • Poisson Distribution (cont)

    Effect of the value of parameter $\lambda$ on the shape of the

    Distribution

    36

  • Problems (cont)

    Suppose that the number of customers that enter a bank in an

    hour is a Poisson random variable, and suppose that P(X = 0) =

    5%.

    Determine the mean of X.

    Determine the variance of X.
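A sketch of one approach to this problem: for a Poisson random variable, P(X = 0) equals e raised to the power of minus the parameter, so the parameter can be recovered with a logarithm, and the mean and variance both equal it. The variable names are illustrative.

```python
from math import exp, log

p_zero = 0.05  # given: P(X = 0) = 5%

# For X ~ Poisson(lam), P(X = 0) = exp(-lam), hence lam = -ln(P(X = 0))
lam = -log(p_zero)

mean_x = lam       # mean of a Poisson random variable
variance_x = lam   # variance of a Poisson random variable
print(round(mean_x, 4), round(variance_x, 4))
```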

    37

    Poisson Distribution

  • Problems (cont)

    The number of surface flaws in plastic panels used in the interior

    of automobiles has a Poisson distribution with a mean of 0.05

    flaws per square foot of plastic panel. Assume that an automobile

    interior contains 10 square feet of plastic panel.

    What is the probability that there are no surface flaws in an auto's interior?

    If 10 cars are sold to a rental company, what is the probability that none of the 10 cars has any surface flaws?

    If 10 cars are sold to a rental company, what is the probability that at most one car has any surface flaws?

    38

    Poisson Distribution

  • Problems (cont)

    The number of failures of a testing instrument from contamination

    particles on the product is a Poisson random variable with a mean

    of 0.02 failures per hour.

    What is the probability that the instrument does not fail in an 8-hour shift?

    What is the probability that there is at least one failure in a 24-hour day?

    39

    Poisson Distribution

  • Problems (cont)

    The increased number of small commuter planes in major airports

    has heightened concern over air safety. An airport has recorded a

    monthly average of five near-misses on landings and take offs in

    the past 5 years.

    Find the probability that during a given month there are no near-

    misses on landings and take offs at the airport.

    Find the probability that during a given month there are five near-

    misses.

    Find the probability that there are at least five near-misses during a

    particular month.

    40

    Poisson Distribution

  • Problems (cont)

    In a food processing and packaging plant, there are, on an

    average, two packaging machine breakdowns per week.

    What is the probability that there are no machine breakdowns

    in a given week?

    Calculate the probability that there are no more than two machine breakdowns in two weeks.

    41

    Poisson Distribution

  • Case

    US Public Healthcare Service

    42

  • Continuous Random Variables

    Examples:

    When a machine breaks down, it is serviced. It runs for some time until it again breaks down. If we are interested in the time (in hours) between successive breakdowns, then:

    P(X < 10) = ? P(X > 250) = ?

    P(50 < X < 150) = ?

    A finance executive wants to predict the various financial ratios (say X, Y, etc.) of different organizations, based on past data. For a particular

    organization, he may be interested in:

    P(X > 0.75) = ? P(Y < 0.6) = ?

    P(0.35 < X < 0.50) = ?

    43

  • Terminologies associated with continuous random variables

    Probability density function (pdf) denoted by f(x)

    Cumulative distribution function (cdf) denoted by F(x)

    Continuous Random Variables (cont)

    44

    pdf

    Resembles a histogram

    Used to calculate an area that

    represents the probability that X

    takes the values between [a, b]

    P(a

  • Continuous Random Variables (cont)

    Mean of a continuous random variable

    Defined similarly to that of a discrete random variable

    Denoted by $\mu$ or E(X)

    Formula: $\mu = E(X) = \int_{-\infty}^{\infty} x \, f(x) \, dx$

    Variance of a continuous random variable

    Defined similarly to that of a discrete random variable

    Denoted by $\sigma^2$ or V(X)

    Formula: $\sigma^2 = V(X) = \int_{-\infty}^{\infty} (x - \mu)^2 f(x) \, dx = \left[ \int_{-\infty}^{\infty} x^2 f(x) \, dx \right] - \mu^2$

    The standard deviation ($\sigma$) of X is the square root of the variance

    45

    Mean and Variance of a Continuous Random Variable

  • Uniform Distribution

    The simplest type of continuous distribution to understand

    A random variable X following a Uniform Distribution is denoted

    by: X ~ Uniform(a, b)

    The pdf of a uniform distribution is:

    $f(x) = 1 / (b - a), \quad a \le x \le b$

  • Uniform Distribution

    For a uniform random variable over a

  • Problems

    The probability density function of the length of a metal rod is f(x) = 2 for 2.3 < x < 2.8 meters

    If the specifications of this process are 2.25 to 2.75 meters, what proportion of the rods fail to meet the specifications?

    Determine the mean and the variance of the length of the metal rod

    48

    pdf, mean and variance of Continuous Random Variables
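Because the density of the rod length is constant, the problem can be worked as a uniform distribution on (2.3, 2.8). The sketch below assumes the standard uniform results, mean (a + b) / 2 and variance (b - a)² / 12, and computes probabilities as width times height. All variable names are illustrative.

```python
a, b = 2.3, 2.8                    # range of the rod length in meters
density = 1 / (b - a)              # constant pdf height, equal to 2
spec_low, spec_high = 2.25, 2.75   # specification limits in meters

# Probability of meeting the specification = overlap width * density
overlap = min(b, spec_high) - max(a, spec_low)
p_in_spec = overlap * density
p_fail = 1 - p_in_spec

mean_length = (a + b) / 2          # mean of a uniform distribution
var_length = (b - a) ** 2 / 12     # variance of a uniform distribution
print(round(p_fail, 4), round(mean_length, 4), round(var_length, 5))
```

Only the upper specification limit matters here, since no rod can be shorter than 2.3 meters.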

  • Problems (cont)

    The net weight (in kilogram) of a packaged chemical powder

    follows a uniform distribution for 49.75 < x < 50.25 kilogram

    Determine the mean and the variance of the weight of the packages.

    Determine the probability of a randomly selected package being less than 50.1 kilogram.

    49

    Uniform Distribution

  • Problems (cont)

    The manager of a local soft-drink company believes that when a

    new beverage-dispensing machine is set to dispense 7 ounces, it

    in fact dispenses an amount X at random anywhere between 6.5

    and 7.5 ounces inclusive. Suppose X has a uniform probability

    distribution.

    Draw the graph of distribution function.

    Find the mean and the standard deviation

    50

    Uniform Distribution

  • Problems (cont)

    An officer of the highway patrol is assigned to assist motorists

    should they become involved in an accident or have a mechanical

    breakdown. He can use either of the two strategies:

    Locating his patrol point at the midpoint of the highway.

    Patrolling the entire stretch of the highway.

    Which strategy is better in terms of faster response?

    51

    Uniform Distribution

  • Exponential Distribution

    Preamble:

    For a Poisson distribution, the number of people arriving at a bank in one hour (i.e. arrival rate) is of interest

    In some cases the time between arrivals may be of interest

    This is exactly what is described by an exponential distribution

    If the arrival rate is a Poisson random variable then the corresponding time between arrivals is an exponential random

    variable

    52

  • Exponential Distribution

    A random variable X following an Exponential Distribution is

    denoted by:

    X ~ Exponential($\lambda$)

    $\lambda$ is the parameter of the exponential distribution

    Important: $\lambda$ is the mean of the corresponding Poisson process (example: arrival rate, and not the mean time between arrivals)

    The pdf of an exponential distribution is:

    $f(x) = \lambda e^{-\lambda x}$ for $0 \le x < \infty$

  • Exponential Distribution (cont)

    The cdf of an exponential distribution is:

    $F(x) = P(X \le x) = 1 - e^{-\lambda x}, \quad x \ge 0$

    For an exponential random variable with parameter $\lambda$:

    mean = $\mu$ = [1 / $\lambda$] and variance = $\sigma^2$ = [1 / ($\lambda^2$)]

    Derive

    54

  • Exponential Distribution (cont)

    Lack of memory property of the exponential distribution

    Mathematically: P(X < (t1 + t2) | X > t1) = P(X < t2)

    Implication (Example): Suppose the time between arrival of a city bus is exponentially distributed with a mean of 15 minutes.

    If you have already waited at the bus stop for 1 hour, then the

    probability that a bus will arrive in the next 10 minutes is equal

    to the probability that a bus would have arrived in the next 10

    minutes as soon as you come to the bus stop (that is, without

    the fact that you waited for one hour)

    Derive
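The lack of memory property can be checked numerically for the bus example. The sketch below assumes a rate of 1/15 per minute (mean of 15 minutes between arrivals) and compares the conditional probability after a 60-minute wait with the unconditional probability for a fresh arrival. The function name `exp_cdf` is illustrative.

```python
from math import exp


def exp_cdf(x, rate):
    """F(x) = P(X <= x) = 1 - exp(-rate * x) for x >= 0, and 0 otherwise."""
    return 1 - exp(-rate * x) if x >= 0 else 0.0


rate = 1 / 15            # buses per minute, from a mean of 15 minutes
waited, extra = 60, 10   # minutes already waited, additional minutes

# P(X < waited + extra | X > waited)
conditional = (exp_cdf(waited + extra, rate) - exp_cdf(waited, rate)) / (
    1 - exp_cdf(waited, rate)
)
# P(X < extra), with no prior waiting
unconditional = exp_cdf(extra, rate)
print(round(conditional, 6), round(unconditional, 6))
```

The two printed values agree, which is what P(X < (t1 + t2) | X > t1) = P(X < t2) asserts.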

    55

  • Problems

    A catalog company that receives the majority of its orders by telephone conducted a study to determine how long customers are willing to wait on hold before ordering a product. The length of time was found to be a random variable best approximated by an exponential distribution with mean equal to 2.8 minutes. What proportion of customers have to hold for more than 3 minutes before placing an order?

    56

    Exponential Distribution

  • Problems (cont)

    The time between arrivals of taxis at a busy intersection is

    exponentially distributed with a mean of 10 minutes.

    What is the probability that you wait longer than one hour for a taxi?

    Suppose you have already been waiting for one hour for a taxi, what is the probability that one arrives within the next 10 minutes?

    57

    Exponential Distribution

  • Problems (cont)

    The lifetime of a mechanical assembly in a vibration test is

    exponentially distributed with a mean of 400 hours.

    What is the probability that an assembly on test fails in less than 100 hours?

    What is the probability that an assembly operates for more than 500 hours before failure?

    If an assembly has been on test for 400 hours without a failure, what is the probability of a failure in the next 100 hours?

    58

    Exponential Distribution

  • Normal Distribution

    Preamble:

    The most widely used model for describing a random variable

    Outcomes of a large number of real life situations have a bell-shaped frequency distribution that can be modelled by a Normal Distribution.

    Central Limit Theorem is associated with the Normal Distribution

    The mean, median, and mode of a Normal Distribution are theoretically the same.

    The range of the variable extends from $-\infty$ to $+\infty$

    59

  • Normal Distribution

    A random variable X following a Normal Distribution is denoted

    by:

    X ~ Normal($\mu$, $\sigma^2$)

    $\mu$ and $\sigma^2$ are the parameters of the normal distribution

    60

  • Normal Distribution (cont)

    61

  • Normal Distribution (cont)

    A normal random variable described by Normal(0, 1), that is $\mu = 0$ and $\sigma^2 = 1$, is called a standard normal random variable

    Denoted by Z, that is, Z ~ Normal(0, 1)

    Any normal random variable X ~ Normal($\mu$, $\sigma^2$) can be mapped to the standard normal random variable:

    $Z = (X - \mu) / \sigma$

    How to read and use the standard normal distribution table?
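As an alternative to a printed table, the standard normal cdf can be computed from the error function, and any normal probability can be obtained by standardizing first. The sketch below is a minimal illustration; the function names and the example values (mean 100, standard deviation 10) are hypothetical and not from the slides.

```python
from math import erf, sqrt


def standard_normal_cdf(z):
    """P(Z <= z) for Z ~ Normal(0, 1), via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ Normal(mu, sigma^2), using Z = (X - mu) / sigma."""
    return standard_normal_cdf((x - mu) / sigma)


print(round(standard_normal_cdf(1.96), 4))   # 0.975
print(round(normal_cdf(110, 100, 10), 4))    # same as P(Z <= 1)
```

Note that `normal_cdf` takes the standard deviation, not the variance, as its third argument.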

    62

  • Problems

    A Normal random variable is denoted by X ~ Normal(1.2, $0.15^2$).

    Find the following probabilities:

    P(X < 1.10)

    P(X > 1.38)

    P(1.35 < X < 1.5)

    63

    Normal Distribution

  • Problems (cont)

    Find z0 such that:

    P(Z > z0) = 0.025

    P(Z < z0) = 0.925

    P(- z0 < Z < z0) = 0.8262

    64

    Normal Distribution

  • Problems (cont)

    The compressive strength of samples of cement can be modeled

    by a normal distribution with a mean of 6000 kilograms per

    square centimeter and a standard deviation of 100 kilograms per

    square centimeter.

    What is the probability that a sample's strength is less than 6250 kg/cm$^2$?

    What is the probability that a sample's strength is between 5800 and 5900 kg/cm$^2$?

    What strength is exceeded by 95% of the samples?

    65

    Normal Distribution

  • Problems (cont)

    The reaction time of a driver to visual stimulus is normally

    distributed with a mean of 0.4 seconds and a standard deviation

    of 0.05 seconds.

    What is the probability that a reaction requires more than 0.5 seconds?

    What is the probability that a reaction requires between 0.4 and 0.5 seconds?

    What is the reaction time that is exceeded 90% of the time?

    66

    Normal Distribution

  • In a certain city, the daily supply of electric power (in mega watt) can be

    treated as a random variable having a normal distribution with mean 300 MW

    and s.d. 50 MW.

    Since supply is not a constant, the local authorities have imposed a system

    of rationing to deal with the problem. It is known that to ensure proper

    rationing a minimum of 250 MW supply is required; otherwise load shedding

    is to be imposed.

    There is no need of rationing whenever supply exceeds 350 MW. On the other

    hand, maximum consumption of the city can never exceed 425 MW.

    Find the percentage of the days

    1. in which the city experiences load-shedding.

    2. in which proper power rationing is implemented.

    3. when there is an excess of power supply.

    Problems (cont)

    Normal Distribution

  • The average mileage before a major breakdown of a particular bike is 60,000 km, with a s.d. of 10,000 km. The manufacturer wishes to warranty these bikes, offering to make necessary service free of charge if the new bike has a breakdown before covering a certain number of km. Assume that the mileage before a major breakdown is normally distributed. For how many km should the manufacturer warranty the bikes, so that not more than 3% of the new bikes come for free service?

    Problems (cont)

    Normal Distribution
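Problems that ask for a value given a probability use the normal table in reverse. A sketch for the bike warranty problem is given below, using `NormalDist` from Python's standard `statistics` module; it assumes the warranty distance is the 3rd percentile of the mileage distribution.

```python
from statistics import NormalDist

# Mileage before a major breakdown: mean 60,000 km, s.d. 10,000 km
mileage = NormalDist(mu=60_000, sigma=10_000)

# Largest warranty distance w such that P(mileage < w) is at most 3%
warranty_km = mileage.inv_cdf(0.03)
print(round(warranty_km))
```

The result is a little over 41,000 km, roughly 1.88 standard deviations below the mean.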

  • A wholesale distributor of fertilizer products finds that the

    annual demand for one type of fertilizer is normally distributed

    with a mean of 120 tonnes and s.d. of 16 tonnes.

    If he orders only once a year, what quantity should be ordered

    to ensure that there is only a 5% chance of running short of

    stock?

    Problems (cont)

    Normal Distribution

  • Case

    Specialty Toys

    70

    Normal Distribution

  • Joint Distribution

    Let X1 denote the random variable describing the outcome from

    the roll of dice 1 and let X2 denote the random variable describing

    the outcome from the roll of dice 2 then the joint probability

    distribution of X1 and X2 is the following:

    71

    X1 \ X2     1      2      3      4      5      6
    1         1/36   1/36   1/36   1/36   1/36   1/36
    2         1/36   1/36   1/36   1/36   1/36   1/36
    3         1/36   1/36   1/36   1/36   1/36   1/36
    4         1/36   1/36   1/36   1/36   1/36   1/36
    5         1/36   1/36   1/36   1/36   1/36   1/36
    6         1/36   1/36   1/36   1/36   1/36   1/36

  • Joint Distribution

    P (X1 = 2, X2 = 3) = ?

    P (X1 = 3, X2 = 1) = ?

    72


  • Joint Distribution

    P (X1 + X2 = 3) = ?

    (X1 + X2) is another random variable having its own distribution.

    73

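The distribution of X1 + X2 can be built from the joint table by adding up the probabilities of all cells that give the same sum. The sketch below does this with exact fractions; the names `joint` and `sum_pmf` are illustrative.

```python
from collections import defaultdict
from fractions import Fraction

# Joint pmf of two fair dice: every (x1, x2) cell has probability 1/36
joint = {(x1, x2): Fraction(1, 36) for x1 in range(1, 7) for x2 in range(1, 7)}

# Distribution of the sum: accumulate cell probabilities by x1 + x2
sum_pmf = defaultdict(Fraction)
for (x1, x2), prob in joint.items():
    sum_pmf[x1 + x2] += prob

print(joint[(2, 3)])   # 1/36
print(sum_pmf[3])      # 1/18
```

A sum of 3 arises from the two cells (1, 2) and (2, 1), giving 2/36 = 1/18.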

  • Linear Combinations of Random Variables

    The linear combination of random variables leads to another

    random variable.

    Example: If length (say X1) and width (say X2) are random

    variables, then the perimeter (say Y) is another random variable

    and Y = 2(X1 + X2)

    Y is a linear combination of X1 and X2

    In general, given random variables $X_1, X_2, \ldots, X_n$ and constants $c_1, c_2, \ldots, c_n$, then:

    $Y = c_1 X_1 + c_2 X_2 + \cdots + c_n X_n$ is a linear combination of $X_1, X_2, \ldots, X_n$

    74

  • Linear Combination of Random Variables (cont)

    If Y is a linear combination of $X_1, X_2, \ldots, X_n$ AND if $X_1, X_2, \ldots, X_n$ are independent, then

    Mean of Y is:

    $E(Y) = c_1 E(X_1) + c_2 E(X_2) + \cdots + c_n E(X_n)$

    Variance of Y is:

    $V(Y) = c_1^2 V(X_1) + c_2^2 V(X_2) + \cdots + c_n^2 V(X_n)$

    75

  • Linear Combination of Random Variables (cont)

    If $X_1, X_2, \ldots, X_n$ are independent random variables, with each having a mean $\mu$ and a variance $\sigma^2$, and if $Y = (X_1 + X_2 + \cdots + X_n) / n$, then,

    Mean of Y is:

    $E(Y) = (1/n)\mu + (1/n)\mu + (1/n)\mu + \cdots$ [n times] $= \mu$

    Variance of Y is:

    $V(Y) = (1/n)^2 \sigma^2 + (1/n)^2 \sigma^2 + (1/n)^2 \sigma^2 + \cdots$ [n times] $= \sigma^2 / n$

    76

    Special Case 1

    Combination that represents the average of n independent

    random variables with identical means and variances
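Special Case 1 can be verified exactly for a small case. The sketch below enumerates every outcome of n independent fair dice (an example chosen here, with single-die mean 7/2 and variance 35/12) and computes the mean and variance of their average. The function name is illustrative.

```python
from fractions import Fraction
from itertools import product


def average_mean_and_variance(n):
    """Exact mean and variance of the average of n independent fair dice."""
    outcome_prob = Fraction(1, 6 ** n)   # each sequence of rolls is equally likely
    mean = Fraction(0)
    second_moment = Fraction(0)
    for rolls in product(range(1, 7), repeat=n):
        avg = Fraction(sum(rolls), n)
        mean += avg * outcome_prob
        second_moment += avg * avg * outcome_prob
    return mean, second_moment - mean ** 2


avg_mean, avg_var = average_mean_and_variance(3)
print(avg_mean, avg_var)  # 7/2 35/36
```

The mean is unchanged at 7/2, while the variance drops from 35/12 to (35/12) / 3 = 35/36, as the formula predicts.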

  • Linear Combination of Random Variables (cont)

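    If $X_1, X_2, \ldots, X_n$ are independent normal random variables, such that:

    $X_1 \sim \text{Normal}(\mu_1, \sigma_1^2), \; X_2 \sim \text{Normal}(\mu_2, \sigma_2^2), \; X_3 \sim \text{Normal}(\mu_3, \sigma_3^2), \; \ldots, \; X_n \sim \text{Normal}(\mu_n, \sigma_n^2)$,

    and

    if $Y = c_1 X_1 + c_2 X_2 + \cdots + c_n X_n$ then,

    Y is also a normal random variable

    Mean of Y is:

    $E(Y) = c_1 \mu_1 + c_2 \mu_2 + \cdots + c_n \mu_n$

    Variance of Y is:

    $V(Y) = c_1^2 \sigma_1^2 + c_2^2 \sigma_2^2 + \cdots + c_n^2 \sigma_n^2$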

    77

    Special Case 2

    Reproductive property of Normal Distribution

  • Problems

    The VP of Marketing at a breakfast cereal company wants to

    implement a promotion idea. Each cereal box will contain any one of a

    set of game pieces which a consumer can collect. The pieces will be

    placed in cereal boxes at random so that a box is equally likely to

    contain any one from the set. When the consumer has collected all

    the pieces from a set, the consumer can claim a prize. The number of

    pieces that should be in the set to maximize the promotion effect is

    not clear. The VP wants to base this decision on the expected value

    and variance of the number of boxes a consumer needs to buy to be

    able to claim the prize. Compute the expected value of the number of

    boxes a consumer has to buy to be able to claim the prize for a case:

    (a) two game pieces, and

    (b) three game pieces.

    78

    Linear combination of random variables

  • Problems (cont)

    Assume that the weights of individuals are independent and normally

    distributed with a mean of 160 pounds and a standard deviation of 30

    pounds. Suppose that 25 people squeeze into an elevator that is designed

    to hold 4300 pounds.

    What is the probability that the load (total weight) exceeds the design limit?

    What design limit is exceeded by 25 occupants with probability 0.001?

    79

    Linear combination of random variables
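The elevator problem is a direct use of the reproductive property: the total weight of 25 independent normal weights is itself normal. A sketch under that reading is given below, using `NormalDist` from the standard `statistics` module; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n_people = 25
mu, sigma = 160, 30   # individual weight in pounds

# Total weight ~ Normal(n * mu, n * sigma^2), so its s.d. is sqrt(n) * sigma
total = NormalDist(mu=n_people * mu, sigma=sqrt(n_people) * sigma)

p_exceed = 1 - total.cdf(4300)       # P(load > 4300 pounds)
limit_001 = total.inv_cdf(1 - 0.001)  # limit exceeded with probability 0.001
print(round(p_exceed, 4), round(limit_001, 1))
```

The design limit of 4300 pounds sits two standard deviations (2 × 150) above the mean total of 4000 pounds.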

  • Problems (cont)

    The width of a casing for a door is normally distributed with a mean of 24

    inches and a standard deviation of 0.125 inch. The width of a door is

    normally distributed with a mean of 23.875 inches and a standard

    deviation of 0.0625 inch. Assume independence.

    Determine the mean and standard deviation of the difference between the width of the casing and the width of the door.

    What is the probability that the width of the casing minus the width of the door exceeds 0.25 inch?

    What is the probability that the door does not fit in the casing?

    80

    Linear combination of random variables

  • Covariance and Correlation between two random variables

    Let X1 and X2 be two random variables

    The Covariance between X1 and X2 is given by:

    $\text{Cov}(X_1, X_2) = E[(X_1 - \mu_{X_1})(X_2 - \mu_{X_2})]$

    The Correlation between X1 and X2 is given by:

    $\rho_{X_1 X_2} = \text{Cov}(X_1, X_2) / (\sigma_{X_1} \sigma_{X_2})$

    81

  • Covariance and Correlation between two random variables

    If X1 and X2 are independent then Cov(X1, X2) = 0.

    The reverse need not be true

    That is, even if Cov(X1, X2) = 0, X1 and X2 may still not

    be independent.

    Cov(X1, X1) = V(X1)
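A small counterexample shows that zero covariance does not imply independence. In the hypothetical setup below (not from the slides), X1 takes the values -1, 0, 1 with equal probability and X2 is defined as the square of X1, so X2 is completely determined by X1.

```python
from fractions import Fraction

# Hypothetical joint pmf: X1 uniform on {-1, 0, 1}, X2 = X1 squared
joint = {(x, x * x): Fraction(1, 3) for x in (-1, 0, 1)}


def expect(g):
    """Expected value of g(X1, X2) under the joint pmf."""
    return sum(g(x1, x2) * p for (x1, x2), p in joint.items())


mu1 = expect(lambda x1, x2: x1)
mu2 = expect(lambda x1, x2: x2)
cov = expect(lambda x1, x2: (x1 - mu1) * (x2 - mu2))

# Independence would require P(X1 = 0, X2 = 0) = P(X1 = 0) * P(X2 = 0)
p_both = joint[(0, 0)]
p_x1_zero = sum(p for (x1, _), p in joint.items() if x1 == 0)
p_x2_zero = sum(p for (_, x2), p in joint.items() if x2 == 0)
print(cov, p_both, p_x1_zero * p_x2_zero)  # 0 1/3 1/9
```

The covariance is exactly 0, yet the joint probability 1/3 differs from the product 1/9, so the two variables are dependent.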

    82

  • Linear Combination of Dependent Random Variables

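    If Y is a linear combination of $X_1, X_2, \ldots, X_n$

    Mean of Y is:

    $E(Y) = c_1 E(X_1) + c_2 E(X_2) + \cdots + c_n E(X_n)$

    Variance of Y is:

    $V(Y) = c_1^2 V(X_1) + c_2^2 V(X_2) + \cdots + c_n^2 V(X_n)$

    $\quad + 2 c_1 c_2 \text{Cov}(X_1, X_2) + 2 c_1 c_3 \text{Cov}(X_1, X_3) + \cdots$

    $\quad + 2 c_{n-2} c_{n-1} \text{Cov}(X_{n-2}, X_{n-1}) + 2 c_{n-2} c_n \text{Cov}(X_{n-2}, X_n)$

    $\quad + 2 c_{n-1} c_n \text{Cov}(X_{n-1}, X_n)$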

    83

  • Linear Combination of Dependent Random Variables - Example

    Management of a chain of retail stores has the opportunity to lock

    in prices for electricity and natural gas, the two energy sources

    used in the stores. A typical store in this chain uses electricity for

    lighting and air conditioning. In the winter, natural gas supplies

    heat. Managers at a recent meeting settled on the following

    estimates of typical annual use of electricity and natural gas by

    the stores. They estimated the chances of varying levels of use

    based on their own experiences operating stores and their

    expectation for the coming long-term weather patterns. The cost

    of electricity is roughly $100 per thousand kilowatt-hours, and the

    cost of natural gas is about $12 per thousand cubic feet.

    84

    Source: Statistics for Business Decision Making and Analysis by Stine and Foster

  • Linear Combination of Dependent Random Variables Example cont

    The usage pattern for electricity is as follows:

    thousand kilowatt-hours    200    300    400    500
    chances                     5%    25%    40%    30%

    The usage pattern for natural gas is as follows:

    thousand cubic feet        600    800   1000   1200
    chances                     5%    25%    40%    30%

    85


  • Linear Combination of Dependent Random Variables Example cont

    Identify random variables for the amount of electricity that is used

    (X) and the amount of natural gas that is used (Y).

    What are the marginal probability distributions for these random

    variables?

    Define a third random variable T that combines these two random

    variables to determine the annual energy operating costs.

    We do not have the joint distribution for X and Y. Do you think

    that it is appropriate to model the two random variables X and Y

    as independent?

    The correlation between X and Y is believed to be 0.4. Using this

    value, find the mean and variance of T.
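One way to approach the last question is sketched below. It assumes the total cost is T = 100 X + 12 Y (from the stated prices per thousand units), computes the marginal means and variances from the usage tables, and converts the assumed correlation of 0.4 into a covariance. Function and variable names are illustrative.

```python
from math import sqrt


def mean_and_variance(values, probs):
    """Mean and variance of a discrete random variable from its marginal pmf."""
    mean = sum(v * p for v, p in zip(values, probs))
    var = sum(v * v * p for v, p in zip(values, probs)) - mean ** 2
    return mean, var


chances = [0.05, 0.25, 0.40, 0.30]
electricity = [200, 300, 400, 500]   # X, thousand kilowatt-hours
gas = [600, 800, 1000, 1200]         # Y, thousand cubic feet
price_x, price_y = 100, 12           # dollars per thousand units
rho = 0.4                            # believed correlation between X and Y

mean_x, var_x = mean_and_variance(electricity, chances)
mean_y, var_y = mean_and_variance(gas, chances)
cov_xy = rho * sqrt(var_x) * sqrt(var_y)   # Cov = rho * sd(X) * sd(Y)

# T = price_x * X + price_y * Y, a linear combination of dependent variables
mean_t = price_x * mean_x + price_y * mean_y
var_t = price_x ** 2 * var_x + price_y ** 2 * var_y + 2 * price_x * price_y * cov_xy
print(round(mean_t, 2), round(var_t, 2), round(sqrt(var_t), 2))
```

Setting `rho = 0` in the same code gives the variance under independence, which makes the contribution of the covariance term easy to see.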

    86