Traffic-Light Experiment Project
Design and Analysis IE 442
Naman Bindra, Nicolas Buitrago, Majed Takieddine, Mythri Addanki
Table Of Contents:
INTRODUCTION
BACKGROUND/OBJECTIVES
DATA COLLECTION
OBJECTIVES & PROBLEM DEFINITIONS
CHOICE OF FACTORS & LEVELS + RESPONSE VARIABLE
CHOICE OF EXPERIMENTAL DESIGN & DESIGN MATRIX
PERFORMING THE EXPERIMENT
DESIGN MATRIX AND FACTORIAL DESIGN
STATISTICAL ANALYSIS OF DATA AND MODEL ADEQUACY CHECKING
ANALYSIS OF THE FACTOR INTERACTIONS
PAIRED AND SAMPLE T-TEST ANALYSIS
CONCLUSION
REFERENCES
IE 442 Traffic Lights Experiment Report
Introduction:
Traffic lights are necessary for maintaining safety and order on the road, but the length of each
light cycle depends on a variety of factors. For this study, a cycle time is defined as the time
in seconds for a traffic light to go from red to green and back to red. The study examines how
cycle time varies with location and time of day. Using the population density and income
differences between the Chicago metropolitan area and its surrounding suburbs, we analyze
whether these differences are significantly associated with the cycle time of the lights. A
further possibility is that the distribution of income across city and suburban neighborhoods
influences traffic light times: more money per capita in an area may mean that a different level
of infrastructure planning goes into the programming of sensors and lights, producing a
significant difference in the overall wait time a person experiences.
Background/Objectives:
In this experiment, a random sample of traffic lights from chosen neighborhoods in both the
city of Chicago and the surrounding suburbs is compared for statistically significant
differences. The factors in our analysis are the location of each neighborhood, the time of day,
and the average income level of each neighborhood according to official city records. Each
factor has two levels: city versus suburb; morning (8 AM-12 PM) versus evening (4 PM-8 PM);
and lower versus higher income bracket. The main question explored in this experiment is
whether there is a statistically significant difference between the traffic cycle times of
Chicago city neighborhoods and those of the surrounding suburban neighborhoods, in particular
once certain correlated factors are blocked. Secondary questions are whether income plays a
role irrespective of the city-versus-suburb distinction, and whether time of day changes the
cycle time of an individual light and, if so, by how much.
Data Collection:
To start, the group made two lists of accessible neighborhoods with populations greater than
5,000: one of neighborhoods within the Chicago city limits, and another of suburbs within the
greater Chicagoland area. Two neighborhoods were then randomly chosen from each list. The four
selected neighborhoods were Glen Ellyn, Naperville, Little Italy, and Pilsen. The next step of
the data collection process was to list all regular two-way, four-entrance traffic lights in
each neighborhood. T-intersections and roundabouts were excluded from our analysis, as they
could confound the results and prevent us from reaching a reliable conclusion. From this list,
the group randomly selected two traffic lights in each neighborhood. For Pilsen the selected
lights were Ashland and 18th, and Racine and 18th. For Little Italy they were Halsted and 16th,
and Halsted and Maxwell. For Naperville they were Rt. 59 and 95th, and Rt. 59 and 87th. For Glen
Ellyn they were Shorewood and Bloomingdale, and Bloomingdale and Geneva. For each chosen traffic
light we made two trips to record data, one in the morning between 8 AM and 12 PM and another
in the evening between 4 PM and 8 PM. On each trip the group recorded 9 consecutive traffic
cycles, without interruption or cherry-picking of data. To ensure an ample number of
observations (specifically, more than 30 per set of characteristics), the group ran one
replicate, recording a further 18 cycle times at each traffic light. The replicate was run on a
separate day, which makes the day a nuisance factor; it is blocked in the factorial design. In
addition to the empirical data, city and government websites were used to look up the average
per capita income of all four selected neighborhoods for 2016.
Objective & Problem Definitions of the Experiment:
Objective: To assess the traffic light changing times by varying relevant factors and to see if there is a
difference between the mean traffic light times of the city versus the suburbs.
Problem Definitions:
Null Hypothesis: There is no difference between the mean traffic light cycle times of the city and the
suburbs.
Alternative Hypothesis: The mean traffic light cycle times of the city and the suburbs differ.
Exposure: Predetermined cycle time (Sensors-not assumed)
Outcome: Cycle time in Traffic Lights
Nuisance factor: Day of Replication
Interval Levels: Time of cycles
Analysis Factors: Time of day (morning vs evening), Location (City vs Suburbs), Income (high vs low)
Blocking: Day of Replication
Choice of Factors and Levels
The following factors and their levels were considered for the experiment:
Factor         Levels   High (+)     Low (--)
A  Location    2        City         Suburbs
B  Time        2        A.M.         P.M.
C  Income      2        Below 75k    Above 75k
Response variable:
The response variable was the traffic light cycle time. The experiment measured the time it takes a traffic
light to change from red back to red while varying the factors as per the design matrix. Each traffic light
was assigned its factor levels for time, location and income. On each day, 18 observations were recorded
per traffic light (9 in the morning and 9 in the evening), for a total of 36 observations per neighborhood
per day. With the replicate day, this gives 36 observations per traffic light and 72 per neighborhood.
Choice of experimental design and design matrix:
Choice of Design:
The team chose three factors for the experiment: location, time and income. With two levels each, this
gives a 2^3 factorial design.
Design Test Matrix
The design test matrix for the 2^3 design is shown below. Design Expert was used to generate the design
matrix, its test combinations and the run order.
3 Factors: A, B, C
Factor A = Location
Factor B = Time
Factor C = Income
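As a rough illustration of how such a design matrix is built, the sketch below enumerates the 8 coded factor combinations of a 2^3 design and repeats them in two blocks (one per day), giving 16 runs. It is not the Design Expert output: the function names are hypothetical, and randomization of the run order is not shown.

```python
from itertools import product

# Factor order follows the report: A = Location, B = Time, C = Income.
FACTORS = ("Location", "Time", "Income")


def full_factorial(n_factors):
    """Return every combination of coded levels (-1, +1) for n_factors factors."""
    return list(product((-1, 1), repeat=n_factors))


def replicated_design(n_factors, n_blocks):
    """Repeat the full factorial once per block (here, one block per day)."""
    return [
        (block,) + run
        for block in range(1, n_blocks + 1)
        for run in full_factorial(n_factors)
    ]


design = replicated_design(len(FACTORS), n_blocks=2)

# Print the coded design: one row per run.
print("Block  " + "  ".join(f"{name:>8}" for name in FACTORS))
for block, *levels in design:
    print(f"{block:>5}  " + "  ".join(f"{level:>+8d}" for level in levels))
```

Each factor column contains as many +1 as -1 entries, which is what makes the main effects and interactions separately estimable.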
Design Matrix Evaluation for Factorial Model
Performing The Experiment
● The experiment was conducted by all members of the group using a stopwatch. The stopwatch ran
continuously for the entire set of observations. In addition, video recordings corroborated the
stopwatch times.
● Two locations were chosen: the suburbs and the city. In each location two neighborhoods were picked,
and in each neighborhood the cycle times of two traffic lights were measured twice, once in the
morning and once in the evening. In total, eight traffic lights were measured.
● Each traffic light measurement was repeated nine times in the morning and nine times in the evening
to reduce variation; this also provided enough data to support the normality assumption of the tests
used in further analysis.
● The experiment was replicated once in addition to the initial observation run, each on a separate day.
Because conditions such as weather, events or other unforeseen variables can differ between days,
the day is a nuisance factor, and each day was assigned to its own block. This removes day-to-day bias
from the comparison of the factors.
● All observations at a given traffic light were made under the same settings, which also helped reduce
variation due to measurement error.
City:
Data Observation Cycle 1 (Day 1), cycle times in seconds
          Little Italy                        Pilsen
          Halsted & Maxwell Halsted & 16th    Racine & 18th     Ashland & 19th
Day  Run  Morning  Evening  Morning  Evening  Morning  Evening  Morning  Evening
1    1    64.93    64.95    90.08    90.03    65.01    64.77    37.11    36.47
1    2    65.01    64.63    89.88    89.85    64.91    65.07    36.87    36.21
1    3    65.12    64.95    90.12    89.86    64.77    64.88    36.21    35.89
1    4    64.87    65.21    89.85    89.73    65.07    64.52    36.46    36.45
1    5    64.97    65.06    90.15    89.85    64.81    64.82    36.48    37.18
1    6    65.05    64.89    90.13    89.9     64.72    64.53    36.81    35.84
1    7    65.22    64.2     90.11    89.78    64.88    64.68    36.21    36.47
1    8    65.1     64.85    89.98    90.01    65.06    64.59    36.29    36.54
1    9    64.95    65.07    89.87    89.96    64.7     64.73    36.58    36.21
Time      +        --       +        --       +        --       +        --
Location  +        +        +        +        +        +        +        +
Income    --       --       --       --       +        +        +        +
Average   65.02    64.87    90.02    89.89    64.88    64.73    36.56    36.36
Suburbs:
Data Observation Cycle 1 (Day 1), cycle times in seconds
          Glen Ellyn                          Naperville
          Shorewood & Bloom Bloom & Geneva    Rt.59 & 95th St   Rt.59 & 87th St
Day  Run  Morning  Evening  Morning  Evening  Morning  Evening  Morning  Evening
1    1    109.16   109.49   90.27    90.88    93.49    93.24    110.43   111.41
1    2    108.28   109.24   90.84    90.5     93.75    92.32    112.57   111.37
1    3    110.26   109.99   90.65    89.99    92.58    92.38    110.41   110.92
1    4    109.18   109.32   90.01    90.46    92.86    92.44    111.35   111.28
1    5    109.89   108.88   90.65    90.66    92.42    93.89    110.22   111.1
1    6    108.67   109.12   90.7     90.23    92.8     92.5     110.39   111.72
1    7    108.17   109.13   90.34    90.63    93.12    93.87    111.59   111.33
1    8    108.84   108.76   90.75    90.57    93.67    93.71    110.83   110.57
1    9    108.69   109.47   90.68    90.44    93.77    93.85    112.07   110.76
Time      +        --       +        --       +        --       +        --
Location  --       --       --       --       --       --       --       --
Income    +        +        +        +        --       --       --       --
Average   109.02   109.27   90.54    90.48    93.16    93.13    111.1    111.16
City:
Replicate 1 (Day 2), cycle times in seconds
          Little Italy                        Pilsen
          Halsted & Maxwell Halsted & 16th    Racine & 18th     Ashland & 19th
Day  Run  Morning  Evening  Morning  Evening  Morning  Evening  Morning  Evening
2    1    64.99    64.89    90.16    89.62    64.89    65.03    36.97    35.97
2    2    65.12    64.79    89.96    89.84    64.97    64.86    35.97    37.06
2    3    65.06    65.18    89.93    89.66    64.92    64.77    36.21    36.84
2    4    64.97    65.06    90.04    89.55    65.03    64.92    36.89    36.21
2    5    64.88    64.82    90.03    89.74    65.09    65.02    36.77    36.89
2    6    65.02    65.14    89.92    89.38    64.94    64.84    36.9     36.42
2    7    64.96    65.04    90.02    89.89    64.9     64.93    36.18    35.96
2    8    65.15    65.28    90.21    89.98    64.78    65.01    36.82    37.16
2    9    65.09    64.98    89.91    89.77    65.05    64.88    36.02    36.12
Time      +        --       +        --       +        --       +        --
Location  +        +        +        +        +        +        +        +
Income    --       --       --       --       +        +        +        +
Average   65.03    65.02    90.02    89.71    64.95    64.92    36.53    36.51
Suburbs:
Replicate 1 (Day 2), cycle times in seconds
          Glen Ellyn                          Naperville
          Shorewood & Bloom Bloom & Geneva    Rt.59 & 95th St   Rt.59 & 87th St
Day  Run  Morning  Evening  Morning  Evening  Morning  Evening  Morning  Evening
2    1    108.69   109.47   90.68    90.44    93.77    93.85    112.07   110.76
2    2    108.27   109.61   90.56    90.72    93.97    93.06    109.97   111.98
2    3    108.66   108.99   90.77    90.68    93.57    93.14    111.52   112.16
2    4    109.06   108.64   90.56    90.66    93.33    93.67    112.04   111.75
2    5    109.24   109.71   90.88    90.75    92.78    92.54    112.44   111.88
2    6    108.63   109.42   90.47    90.51    92.7     92.46    111.61   112.04
2    7    107.98   109.45   90.65    90.47    93.55    92.5     112.33   111.66
2    8    108.24   109.67   90.69    90.66    93.53    93.73    112.21   110.83
2    9    109.2    108.94   90.73    90.58    94.11    92.72    111.45   111.67
Time      +        --       +        --       +        --       +        --
Location  --       --       --       --       --       --       --       --
Income    +        +        +        +        --       --       --       --
Average   108.72   109.37   90.68    90.64    93.51    93.06    111.66   111.86
The table below gives the final response variable data. Each factor combination is the average of
the two traffic lights in the corresponding neighborhood.
For instance, Morning/City/Low-Income, coded as '+++', has two values, 64.88 and 36.56, which
belong to the Racine & 18th and Ashland & 19th lights measured in the morning, in the city, in
the low income neighborhood (Pilsen). Therefore the value is
$\frac{64.88 + 36.56}{2} = 50.72$,
which averages 18 data points per day (two lights with nine runs each), or 36 across both days.
The same process is applied to the other factor combinations, such as Evening/City/High-Income.
Factor combination          Coded (Time Location Income)   Replicate 1   Replicate 2
Morning/City/Low-Income     +  +  +                        50.72         50.74
Evening/City/High-Income    -- +  --                       77.38         77.37
Morning/City/High-Income    +  +  --                       77.52         77.52
Evening/City/Low-Income     -- +  +                        50.55         50.72
Morning/Subs/High-Income    +  -- --                       102.13        102.59
Evening/Subs/Low-Income     -- -- +                        99.88         100.00
Morning/Subs/Low-Income     +  -- +                        99.78         99.70
Evening/Subs/High-Income    -- -- --                       102.15        102.46
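The averaging step can be sketched as follows for Day 1. The per-light averages are copied from the "Average" rows of the Day 1 tables; the dictionary layout and names are assumptions made for illustration only. Because the inputs are already rounded to two decimals, the results can differ from the tabulated responses in the last digit.

```python
from statistics import mean

# Day 1 per-light average cycle times (seconds), keyed by (time, location, income).
# Each list holds the two traffic lights of one neighborhood.
LIGHT_AVERAGES_DAY1 = {
    ("Morning", "City", "Low"):  [64.88, 36.56],    # Pilsen
    ("Evening", "City", "Low"):  [64.73, 36.36],
    ("Morning", "City", "High"): [65.02, 90.02],    # Little Italy
    ("Evening", "City", "High"): [64.87, 89.89],
    ("Morning", "Subs", "Low"):  [109.02, 90.54],   # Glen Ellyn
    ("Evening", "Subs", "Low"):  [109.27, 90.48],
    ("Morning", "Subs", "High"): [93.16, 111.1],    # Naperville
    ("Evening", "Subs", "High"): [93.13, 111.16],
}

# The response for each factor combination is the mean of its two lights.
cell_response = {cell: mean(values) for cell, values in LIGHT_AVERAGES_DAY1.items()}

for (time, location, income), value in cell_response.items():
    print(f"{time}/{location}/{income}-Income: {value:.3f}")
```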
Design Matrix- Factorial Design
StdOrder  RunOrder  CenterPt  Blocks  Time  Location  Income  Response
10        1         1         1        1    -1        -1      102.13
14        2         1         1        1    -1         1      99.78
9         3         1         1       -1    -1        -1      102.15
13        4         1         1       -1    -1         1      99.88
1         5         1         2       -1    -1        -1      102.46
8         6         1         1        1     1         1      50.72
15        7         1         1       -1     1         1      50.55
4         8         1         1        1     1        -1      77.52
16        9         1         2        1     1         1      50.74
5         10        1         2       -1    -1         1      100.00
6         11        1         2        1    -1         1      99.7
2         12        1         2        1    -1        -1      102.59
3         13        1         1       -1     1        -1      77.38
12        14        1         2        1     1        -1      77.52
7         15        1         2       -1     1         1      50.72
11        16        1         2       -1     1        -1      77.37
Statistical Analysis of the Data and Model Adequacy Checking:
Statistical Analysis: The results of the replicated 2^3 full factorial design are shown in the table below.
Analysis of Variance
Source                      DF    Adj SS    Adj MS     F-Value   P-Value
Model                        8   6927.58    865.95    51503.45     0.000
  Blocks                     1      0.06      0.06        3.64     0.098
  Linear                     3   6338.30   2112.77   125659.65     0.000
    Time                     1      0.00      0.00        0.13     0.725
    Location                 1   5482.29   5482.29   326066.95     0.000
    Income                   1    856.00    856.00    50911.87     0.000
  2-Way Interactions         3    589.21    196.40    11681.45     0.000
    Time*Location            1      0.04      0.04        2.20     0.181
    Time*Income              1      0.02      0.02        1.38     0.278
    Location*Income          1    589.15    589.15    35040.77     0.000
  3-Way Interactions         1      0.01      0.01        0.62     0.455
    Time*Location*Income     1      0.01      0.01        0.62     0.455
Error                        7      0.12      0.02
Total                       15   6927.70
Model Summary
S R-sq R-sq(adj) R-sq(pred)
0.129666 100.00% 100.00% 99.99%
The Model F-value of 51503.45 implies the model is significant. The reported p-value of 0.000 means that
the chance of a "Model F-Value" this large occurring due to noise is below 0.05%.
The two factors with p-values below our alpha of 0.05, Location and Income, are significant, as is the
Location*Income interaction. The model shows that Time of day is not significant (p = 0.725): it has no
detectable effect on the response variable.
The block effect is small (p = 0.098) and not significant at alpha = 0.05, so the two days did not differ
strongly. Blocking by day was still a useful precaution: it keeps any day-to-day difference out of the
error term and so protects the power of the experiment.
The data analysis gives an R-squared value of 100.00%, which indicates that the regression model explains
essentially all of the variation in the response. The "Pred R-Squared" of 99.99% is in reasonable
agreement with the "Adj R-Squared" of 100.00%; the difference between the two is well under 1%.
Below are the regression model and the table of residuals. These are used for the normality check of the
residuals, which is part of the model adequacy check.
Regression Equation in Uncoded Units
Response = 82.5756 + 0.0119 Time - 18.5106 Location - 7.3144 Income
           + 0.0481 Time*Location - 0.0381 Time*Income - 6.0681 Location*Income
           + 0.0256 Time*Location*Income
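The coefficients and sums of squares reported above can be checked by hand with the contrast method for two-level designs: for each term, multiply every response by the product of the term's coded levels, sum the results to get the contrast, then divide by N for the coefficient and use contrast^2 / N for the sum of squares. The sketch below applies this to the 16 responses in the design matrix. It assumes the three coded columns are ordered Time, Location, Income (the order that reproduces the reported coefficients); the variable names are illustrative and this is not Minitab's implementation.

```python
from math import prod

# (block, time, location, income, response), in run order from the design matrix.
RUNS = [
    (1,  1, -1, -1, 102.13), (1,  1, -1,  1,  99.78),
    (1, -1, -1, -1, 102.15), (1, -1, -1,  1,  99.88),
    (2, -1, -1, -1, 102.46), (1,  1,  1,  1,  50.72),
    (1, -1,  1,  1,  50.55), (1,  1,  1, -1,  77.52),
    (2,  1,  1,  1,  50.74), (2, -1, -1,  1, 100.00),
    (2,  1, -1,  1,  99.70), (2,  1, -1, -1, 102.59),
    (1, -1,  1, -1,  77.38), (2,  1,  1, -1,  77.52),
    (2, -1,  1,  1,  50.72), (2, -1,  1, -1,  77.37),
]

# Each model term maps to the tuple positions of the factors it involves.
TERMS = {
    "Time": (1,),
    "Location": (2,),
    "Income": (3,),
    "Time*Location": (1, 2),
    "Time*Income": (1, 3),
    "Location*Income": (2, 3),
    "Time*Location*Income": (1, 2, 3),
}


def analyze(runs):
    """Return the intercept and, per term, the coefficient and sum of squares."""
    n = len(runs)
    intercept = sum(run[4] for run in runs) / n
    terms = {}
    for name, positions in TERMS.items():
        contrast = sum(prod(run[p] for p in positions) * run[4] for run in runs)
        terms[name] = {"coef": contrast / n, "ss": contrast ** 2 / n}
    return intercept, terms


intercept, terms = analyze(RUNS)
print(f"Intercept: {intercept:.4f}")
for name, result in terms.items():
    print(f"{name:<22} coef = {result['coef']:>9.4f}   SS = {result['ss']:>9.2f}")
```

Because the factors are coded as -1 and +1, each coefficient is half of the corresponding effect (the change in mean response from the low to the high level).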
Obs Response Fit Resid Std Resid
1 102.150 102.243 -0.093 -1.09
2 99.780 99.678 0.102 1.19
3 77.380 77.313 0.067 0.78
4 102.130 102.298 -0.168 -1.96
5 77.520 77.458 0.062 0.72
6 99.880 99.878 0.002 0.02
7 50.720 50.668 0.052 0.60
8 50.550 50.573 -0.023 -0.27
9 99.700 99.802 -0.102 -1.19
10 50.720 50.697 0.023 0.27
11 102.460 102.367 0.093 1.09
12 102.590 102.422 0.168 1.96
13 77.370 77.437 -0.067 -0.78
14 77.520 77.582 -0.062 -0.72
15 50.740 50.792 -0.052 -0.60
16 100.000 100.002 -0.002 -0.02
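The fitted values in this table include the block (day) effect, which is not written out in the regression equation. A minimal sketch of how they can be reproduced: with all factorial terms in the model, each fit is the mean of its factor combination plus the effect of its block, and S is the square root of the residual sum of squares over the 7 error degrees of freedom. The coded columns are again assumed to be ordered Time, Location, Income, and the names are illustrative.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean

# (block, time, location, income, response), in run order from the design matrix.
RUNS = [
    (1,  1, -1, -1, 102.13), (1,  1, -1,  1,  99.78),
    (1, -1, -1, -1, 102.15), (1, -1, -1,  1,  99.88),
    (2, -1, -1, -1, 102.46), (1,  1,  1,  1,  50.72),
    (1, -1,  1,  1,  50.55), (1,  1,  1, -1,  77.52),
    (2,  1,  1,  1,  50.74), (2, -1, -1,  1, 100.00),
    (2,  1, -1,  1,  99.70), (2,  1, -1, -1, 102.59),
    (1, -1,  1, -1,  77.38), (2,  1,  1, -1,  77.52),
    (2, -1,  1,  1,  50.72), (2, -1,  1, -1,  77.37),
]

grand_mean = mean(run[4] for run in RUNS)

# Block effect: how far each day's mean sits from the grand mean.
block_effect = {
    block: mean(run[4] for run in RUNS if run[0] == block) - grand_mean
    for block in (1, 2)
}

# Mean response of each factor combination over the two replicates.
cells = defaultdict(list)
for block, time, location, income, response in RUNS:
    cells[(time, location, income)].append(response)
cell_mean = {cell: mean(values) for cell, values in cells.items()}

fits = [cell_mean[(t, l, i)] + block_effect[b] for b, t, l, i, _ in RUNS]
residuals = [run[4] - fit for run, fit in zip(RUNS, fits)]

# Error degrees of freedom: 15 total - 1 block - 7 factorial terms = 7.
ERROR_DF = 7
s = sqrt(sum(e * e for e in residuals) / ERROR_DF)

for run, fit, resid in zip(RUNS, fits, residuals):
    print(f"{run[4]:>8.3f} {fit:>9.3f} {resid:>8.3f}")
print(f"S = {s:.6f}")
```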
The figures above show the normal probability plot of residuals, the histogram and the residuals versus
fitted values graph. The normal probability plot is reasonable and passes the fat pencil test, the
histogram is approximately normal, and the fitted values graph shows no pattern, with reasonably even scatter.
Analysis of the Factor Interactions:
The graph below shows the two-factor interaction found significant in the analysis of the model.
From the plot, the conclusions are:
1. As one moves towards the suburbs or a high income area, the average traffic light time
increases. Conversely, as one moves to the city and/or a low income area, the average traffic
light time decreases. The two factors interact: the income difference is much larger in the
city than in the suburbs.
2. Time of day does not affect the traffic light time, which is constant between the morning
and the evening. Its interactions with location and income are also not significant.
3. In conclusion, from the data obtained, the lowest traffic light times are found in the city and in
low income areas.
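The Location*Income interaction can be quantified from the responses in the design matrix by averaging over time of day and replicate, which gives the four points that an interaction plot would show. The sketch below does this; the coded columns are assumed to be ordered Time, Location, Income, with City and Low income coded as +1, and the names are illustrative.

```python
from collections import defaultdict
from statistics import mean

# (block, time, location, income, response), in run order from the design matrix.
RUNS = [
    (1,  1, -1, -1, 102.13), (1,  1, -1,  1,  99.78),
    (1, -1, -1, -1, 102.15), (1, -1, -1,  1,  99.88),
    (2, -1, -1, -1, 102.46), (1,  1,  1,  1,  50.72),
    (1, -1,  1,  1,  50.55), (1,  1,  1, -1,  77.52),
    (2,  1,  1,  1,  50.74), (2, -1, -1,  1, 100.00),
    (2,  1, -1,  1,  99.70), (2,  1, -1, -1, 102.59),
    (1, -1,  1, -1,  77.38), (2,  1,  1, -1,  77.52),
    (2, -1,  1,  1,  50.72), (2, -1,  1, -1,  77.37),
]

LOCATION = {1: "City", -1: "Suburbs"}
INCOME = {1: "Low", -1: "High"}

# Average the response over time of day and replicate for each Location/Income pair.
groups = defaultdict(list)
for _, _, location, income, response in RUNS:
    groups[(LOCATION[location], INCOME[income])].append(response)
interaction_means = {pair: mean(values) for pair, values in groups.items()}

# Income effect (low minus high) within each location.
income_effect = {
    place: interaction_means[(place, "Low")] - interaction_means[(place, "High")]
    for place in ("City", "Suburbs")
}

for pair, value in sorted(interaction_means.items()):
    print(f"{pair[0]:<8} {pair[1]:<5} {value:8.4f}")
for place, effect in income_effect.items():
    print(f"Income effect (low - high) in {place}: {effect:.4f}")
```

Non-parallel lines in an interaction plot correspond to these two income effects being unequal: here the low-versus-high income gap is roughly 27 seconds in the city but only about 2.5 seconds in the suburbs.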
Paired & 2-Sample T-test Analysis:
ANALYSIS:
It was important to test whether the mean cycle times of the morning and the evening differ at each
traffic light. Because the morning and evening data points came from the same traffic light, and the
sample size of 18 per time of day was too small to assume normality from sample size alone, a paired
t-test was run. After running the paired t-test on all the traffic lights, the p-values showed no
significant difference in means for 6 of the 8 traffic lights; only two had different means. The full
factorial analysis explained earlier in the report likewise showed that the time factor is not
significant. These results did not contradict the group's original expectation that time of day does
not affect the mean cycle time.

Treating time as not a factor, on the basis of the factorial analysis and the paired t-tests, the
morning and evening data were combined for each traffic light. The group wanted to test whether the
mean cycle time differs between traffic lights within the same neighborhood. Two-sample t-tests were
therefore conducted in Minitab between the traffic lights within each neighborhood. The p-values in
each test showed that all the traffic lights had different means. The sample size for each traffic
light was 36, which is adequate for the normality assumption of the test.

Next, a two-sample t-test was run to compare the mean cycle times between neighborhoods: the city
neighborhoods (Pilsen and Little Italy) and the suburban neighborhoods (Glen Ellyn and Naperville).
The mean times are different, as the p-values are below 0.05. With a sample size of 72 data points per
neighborhood, covering all traffic lights within it, the normality assumption is reasonable. Lastly, a
two-sample t-test was run to compare the means of the city and the suburbs. The sample size was 144
data points per location (city or suburbs), covering all traffic lights in its two neighborhoods. The
test showed that the city and the suburbs have different mean traffic light cycle times; that is, the
hypothesis of equal means is rejected.
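To make the two kinds of test concrete, the sketch below computes a paired t statistic (morning versus evening at Halsted & Maxwell) and a two-sample t statistic (Halsted & Maxwell versus Halsted & 16th, morning and evening combined) from the recorded cycle times. It is an illustration only, not the Minitab output: pairing morning and evening observations by day and run number is an assumption, the two-sample test uses the unequal-variance (Welch) form, and only the t statistics and degrees of freedom are computed, not p-values.

```python
from math import sqrt
from statistics import mean, stdev, variance

# Halsted & Maxwell cycle times (seconds): Day 1 runs 1-9, then Day 2 runs 1-9.
MAXWELL_AM = [64.93, 65.01, 65.12, 64.87, 64.97, 65.05, 65.22, 65.1, 64.95,
              64.99, 65.12, 65.06, 64.97, 64.88, 65.02, 64.96, 65.15, 65.09]
MAXWELL_PM = [64.95, 64.63, 64.95, 65.21, 65.06, 64.89, 64.2, 64.85, 65.07,
              64.89, 64.79, 65.18, 65.06, 64.82, 65.14, 65.04, 65.28, 64.98]
# Halsted & 16th cycle times, same ordering.
SIXTEENTH_AM = [90.08, 89.88, 90.12, 89.85, 90.15, 90.13, 90.11, 89.98, 89.87,
                90.16, 89.96, 89.93, 90.04, 90.03, 89.92, 90.02, 90.21, 89.91]
SIXTEENTH_PM = [90.03, 89.85, 89.86, 89.73, 89.85, 89.9, 89.78, 90.01, 89.96,
                89.62, 89.84, 89.66, 89.55, 89.74, 89.38, 89.89, 89.98, 89.77]


def paired_t(x, y):
    """Paired t-test: return (mean difference, t statistic, degrees of freedom)."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_diff = mean(diffs)
    std_error = stdev(diffs) / sqrt(n)
    return mean_diff, mean_diff / std_error, n - 1


def welch_t(x, y):
    """Two-sample t-test with unequal variances: return (t statistic, approx. df)."""
    vx, vy = variance(x) / len(x), variance(y) / len(y)
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


mean_diff, t_paired, df_paired = paired_t(MAXWELL_AM, MAXWELL_PM)
print(f"Paired: mean diff = {mean_diff:.4f} s, t = {t_paired:.3f}, df = {df_paired}")

t_two, df_two = welch_t(MAXWELL_AM + MAXWELL_PM, SIXTEENTH_AM + SIXTEENTH_PM)
print(f"Two-sample: t = {t_two:.1f}, df = {df_two:.1f}")
```

For 17 degrees of freedom the two-sided critical value at alpha = 0.05 is 2.110, so a paired |t| below that value means the morning and evening means at this light are not significantly different, while the very large two-sample |t| reflects the roughly 25-second gap between the two lights.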
Conclusions:
The statistical analysis of the data indicates that the following factors have a statistically significant
effect on traffic light cycle times, and that the effect remains significant when blocking is taken into
account:
1) Location Factor: Traffic lights in the city have lower average cycle times than traffic lights in
the suburbs.
2) Income Factor: Traffic lights in lower income areas have significantly lower cycle times than
those in higher income areas.
In addition, it was concluded that time of day has no significant effect on traffic light times.
Whether in the morning or the evening, all statistical analyses indicated that the cycle time of
an individual traffic light in the areas tested stays the same.
Upon further investigation, it appears that the suburbs have longer traffic light times because of
their wide multi-lane roads, unlike the city's small two-lane roads. Because a suburban area is large
with lots of space, its roads are wider and have more lanes, which results in more complex
intersections and, in turn, longer cycle times.
For the income factor, it appears that when an area has high income levels, a bigger budget is
available for road maintenance and traffic light installation. This results in actuated traffic
lights with sensors that can change cycles in response to unusual events. This in turn results in a
higher percentage of vehicles stopping, because green time is not held for upstream platoons.
References:
Montgomery, Douglas C. Design and Analysis of Experiments. Hoboken, NJ: John Wiley &
Sons, 2013. Print.
MINITAB. Computer software. N.p., n.d. Web.
"Average Daily Traffic Counts." City of Chicago :: Average Daily Traffic Counts. N.p., n.d.
Web. 20 Nov. 2016.
"Point2 Homes." Real Estate | Homes for Sale & Rent by Point2 Homes. N.p., n.d. Web.
20 Nov. 2016.