IE 442 Final Project

Traffic-Light Experiment Project

Design and Analysis IE 442

Naman Bindra Nicolas Buitrago Majed Takieddine Mythri Addanki

Table Of Contents:

INTRODUCTION
BACKGROUND/OBJECTIVES
DATA COLLECTION
OBJECTIVES & PROBLEM DEFINITIONS
CHOICE OF FACTORS & LEVELS + RESPONSE VARIABLE
CHOICE OF EXPERIMENTAL DESIGN & DESIGN MATRIX
PERFORMING THE EXPERIMENT
DESIGN MATRIX AND FACTORIAL DESIGN
STATISTICAL ANALYSIS OF DATA AND MODEL ADEQUACY CHECKING
ANALYSIS OF THE FACTOR INTERACTIONS
PAIRED AND 2-SAMPLE T-TEST ANALYSIS
CONCLUSION
REFERENCES

IE 442 Traffic Lights Experiment Report

Introduction:

Traffic lights are necessary for maintaining safety and order on the road; however, the length of each light's cycle depends on a variety of factors. In this report, a cycle time is defined as the time in seconds for a traffic light to go from red to green and back to red. This definition leads us to study the possible causes of location and time-of-day deviations in cycle time. We use the differences in population density and income between the Chicago metropolitan area and the surrounding suburbs to analyze whether these differences have any significant correlation with the cycle time of the lights. A further possibility is that the distribution of income across the city and suburbs influences traffic light times: more money per capita in an area may mean that a different amount of infrastructural planning goes into the programming of sensors and lights, causing a significant difference in the overall wait time a person experiences.

Background/Objectives:

In this experiment, a random sample of traffic lights from chosen neighborhoods in both the metropolitan city area of Chicago and the surrounding suburbs is compared for statistically significant differences. The factors in our analysis are the location of each neighborhood, the time of day, and the average income level of each neighborhood according to official city records. Each factor has two levels: city versus suburb for location, morning (8 AM to 12 PM) versus evening (4 PM to 8 PM) for time of day, and the lower versus the higher income bracket for income. The main question explored in this experiment is whether there is a statistically significant difference between the traffic cycle times of Chicago metropolitan neighborhoods and those of the surrounding suburban neighborhoods, in particular once certain correlated factors are blocked. Secondary questions are whether higher income plays a role irrespective of the suburb versus city distinction, and whether time of day changes the cycle times of individual lights and, if so, how large that variation is.

Data Collection:

To start, the group made two lists of accessible neighborhoods with populations greater than 5,000: one list of metropolitan neighborhoods within the boundaries of the Chicago area, and another of suburbs within the boundaries of the greater Chicagoland area. Once the two lists were complete, two neighborhoods were randomly chosen from each. The four neighborhoods selected were Glen Ellyn, Naperville, Little Italy, and Pilsen. The next step of the data collection process was to list all regular two-way, four-entrance traffic lights in each selected neighborhood. T-intersections and roundabouts were excluded from our analysis, as they could confound the results and prevent us from reaching a reliable conclusion. From this list, the group randomly selected two traffic lights from each neighborhood. For Pilsen, the randomly selected lights were at Ashland and 18th, and Racine and 18th. For Little Italy, they were at Halsted and 16th, and Halsted and Maxwell. For Naperville, they were at Rt. 59 and 95th, and Rt. 59 and 87th. For Glen Ellyn, they were at Shorewood and Bloomingdale, and Bloomingdale and Geneva.

For each chosen traffic light we made two trips to record data, one in the morning between 8 AM and 12 PM and another in the evening between 4 PM and 8 PM. On each trip the group recorded the times of 9 traffic cycles, without interruption or cherry-picking of data. To ensure an ample number of observations (specifically, more than 30 observations per set of characteristics), we performed one replicate at each location, adding 18 traffic cycle times per traffic light (9 in the morning and 9 in the evening). The replicate was run on a separate day, so the day is a nuisance factor that is blocked in the factorial design. In addition to our empirical data, city and government websites were used to look up the average per capita income of all four selected neighborhoods for 2016.
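The observation counts implied by this sampling plan can be tallied with a short sketch. The constant names are our own and are not part of the original procedure; the values come from the description above.

```python
# Observation counts implied by the sampling plan described above.
CYCLES_PER_TRIP = 9             # cycles timed on each trip
TIMES_OF_DAY = 2                # one morning trip and one evening trip
DAYS = 2                        # initial run plus one replicate day
LIGHTS_PER_NEIGHBORHOOD = 2
NEIGHBORHOODS_PER_LOCATION = 2
LOCATIONS = 2                   # city and suburbs

per_light_per_time = CYCLES_PER_TRIP * DAYS
per_light = per_light_per_time * TIMES_OF_DAY
per_neighborhood = per_light * LIGHTS_PER_NEIGHBORHOOD
per_location = per_neighborhood * NEIGHBORHOODS_PER_LOCATION
total = per_location * LOCATIONS

print(per_light_per_time, per_light, per_neighborhood, per_location, total)
# prints: 18 36 72 144 288
```

These are the sample sizes (18, 36, 72, and 144) that the t-test analysis later relies on.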

Objective & Problem Definitions of the Experiment:

Objective: To assess the traffic light changing times by varying relevant factors and to see if there is a

difference between the mean traffic light times of the city versus the suburbs.

Problem Definitions:

Null Hypothesis: There is no difference between the mean traffic light cycle times in the city and in the suburbs.

Alternative Hypothesis: The mean traffic light cycle times in the city and in the suburbs are different.

Exposure: Predetermined cycle time (Sensors-not assumed)

Outcome: Cycle time in Traffic Lights

Nuisance factor: Day of Replication

Interval Levels: Time of cycles

Analysis Factors: Time of day (morning vs evening), Location (City vs Suburbs), Income (high vs low)

Blocking: Day of Replication

Choice of Factors and Levels

The following factors and their levels were considered for the experiment:

No.  Factor  Name      Levels  High (+)    Low (--)
1    A       Location  2       City        Suburbs
2    B       Time      2       A.M.        P.M.
3    C       Income    2       Below 75k   Above 75k

Response variable:

The response variable was the traffic light cycle time. The experiment was designed to measure the time it takes a traffic light to go from red back to red while varying the factors according to the design matrix. Each traffic light was then assigned its factor levels for time, location, and income. The measurements were replicated twice, so 18 observations were recorded per traffic light for each time of day (36 per traffic light), for a total of 72 observations per neighborhood.

Choice of experimental design and design matrix:

Choice of Design:

The team chose three factors for the experiment: location, time, and income. With two levels per factor, this gives a 2^3 factorial design.

Design Test Matrix

The design test matrix for the 2^3 design is shown below. Design Expert was used to generate the design matrix, the test combinations, and the run order.

3 Factors: A, B, C

Factor A = Location

Factor B = Time

Factor C = Income

Design Matrix Evaluation for Factorial Model

Performing The Experiment

● The experiment was conducted with a stopwatch by all members of the group. The stopwatch ran continuously for the entire set of observations. In addition, video recordings corroborated the stopwatch times to ensure accuracy.

● Two locations were chosen: the suburbs and the city. In each location, two neighborhoods were picked, and in each neighborhood the cycle times of two traffic lights were measured twice, once in the morning and once in the evening. In total, eight traffic lights were measured.

● Each traffic light measurement was repeated nine times in the morning and nine times in the evening to reduce variation. The repetition also provides enough observations to support the normality assumptions of the tests used in the later analysis.

● The experiment was replicated once in addition to the initial observation run, each on a separate day. Because conditions such as weather, events, or other unforeseen variables can differ between days, the day of replication is a nuisance factor, so each day was assigned to its own block. This reduces bias and gives more accurate data in the overall experiment.

● All observations for a given traffic light were taken under the same settings, which helped reduce variation due to measurement error.

City:

Data Observation Cycle 1:

              Little Italy                        Pilsen
              Halsted & Maxwell Halsted & 16th    Racine & 18th     Ashland & 19th
Day  Run      Morning  Evening  Morning  Evening  Morning  Evening  Morning  Evening
1    1        64.93    64.95    90.08    90.03    65.01    64.77    37.11    36.47
1    2        65.01    64.63    89.88    89.85    64.91    65.07    36.87    36.21
1    3        65.12    64.95    90.12    89.86    64.77    64.88    36.21    35.89
1    4        64.87    65.21    89.85    89.73    65.07    64.52    36.46    36.45
1    5        64.97    65.06    90.15    89.85    64.81    64.82    36.48    37.18
1    6        65.05    64.89    90.13    89.90    64.72    64.53    36.81    35.84
1    7        65.22    64.20    90.11    89.78    64.88    64.68    36.21    36.47
1    8        65.10    64.85    89.98    90.01    65.06    64.59    36.29    36.54
1    9        64.95    65.07    89.87    89.96    64.70    64.73    36.58    36.21
Time          +        --       +        --       +        --       +        --
Location      +        +        +        +        +        +        +        +
Income        --       --       --       --       +        +        +        +
Average       65.02    64.87    90.02    89.89    64.88    64.73    36.56    36.36

Suburbs:

Data Observation Cycle 1:

              Glen Ellyn                          Naperville
              Shorewood & Bloom Bloom & Geneva    Rt. 59 & 95th St  Rt. 59 & 87th St
Day  Run      Morning  Evening  Morning  Evening  Morning  Evening  Morning  Evening
1    1        109.16   109.49   90.27    90.88    93.49    93.24    110.43   111.41
1    2        108.28   109.24   90.84    90.50    93.75    92.32    112.57   111.37
1    3        110.26   109.99   90.65    89.99    92.58    92.38    110.41   110.92
1    4        109.18   109.32   90.01    90.46    92.86    92.44    111.35   111.28
1    5        109.89   108.88   90.65    90.66    92.42    93.89    110.22   111.10
1    6        108.67   109.12   90.70    90.23    92.80    92.50    110.39   111.72
1    7        108.17   109.13   90.34    90.63    93.12    93.87    111.59   111.33
1    8        108.84   108.76   90.75    90.57    93.67    93.71    110.83   110.57
1    9        108.69   109.47   90.68    90.44    93.77    93.85    112.07   110.76
Time          +        --       +        --       +        --       +        --
Location      --       --       --       --       --       --       --       --
Income        +        +        +        +        --       --       --       --
Average       109.02   109.27   90.54    90.48    93.16    93.13    111.10   111.16

City:

Replicate 1:

              Little Italy                        Pilsen
              Halsted & Maxwell Halsted & 16th    Racine & 18th     Ashland & 19th
Day  Run      Morning  Evening  Morning  Evening  Morning  Evening  Morning  Evening
2    1        64.99    64.89    90.16    89.62    64.89    65.03    36.97    35.97
2    2        65.12    64.79    89.96    89.84    64.97    64.86    35.97    37.06
2    3        65.06    65.18    89.93    89.66    64.92    64.77    36.21    36.84
2    4        64.97    65.06    90.04    89.55    65.03    64.92    36.89    36.21
2    5        64.88    64.82    90.03    89.74    65.09    65.02    36.77    36.89
2    6        65.02    65.14    89.92    89.38    64.94    64.84    36.90    36.42
2    7        64.96    65.04    90.02    89.89    64.90    64.93    36.18    35.96
2    8        65.15    65.28    90.21    89.98    64.78    65.01    36.82    37.16
2    9        65.09    64.98    89.91    89.77    65.05    64.88    36.02    36.12
Time          +        --       +        --       +        --       +        --
Location      +        +        +        +        +        +        +        +
Income        --       --       --       --       +        +        +        +
Average       65.03    65.02    90.02    89.71    64.95    64.92    36.53    36.51

Suburbs:

Replicate 1:

              Glen Ellyn                          Naperville
              Shorewood & Bloom Bloom & Geneva    Rt. 59 & 95th St  Rt. 59 & 87th St
Day  Run      Morning  Evening  Morning  Evening  Morning  Evening  Morning  Evening
2    1        108.69   109.47   90.68    90.44    93.77    93.85    112.07   110.76
2    2        108.27   109.61   90.56    90.72    93.97    93.06    109.97   111.98
2    3        108.66   108.99   90.77    90.68    93.57    93.14    111.52   112.16
2    4        109.06   108.64   90.56    90.66    93.33    93.67    112.04   111.75
2    5        109.24   109.71   90.88    90.75    92.78    92.54    112.44   111.88
2    6        108.63   109.42   90.47    90.51    92.70    92.46    111.61   112.04
2    7        107.98   109.45   90.65    90.47    93.55    92.50    112.33   111.66
2    8        108.24   109.67   90.69    90.66    93.53    93.73    112.21   110.83
2    9        109.20   108.94   90.73    90.58    94.11    92.72    111.45   111.67
Time          +        --       +        --       +        --       +        --
Location      --       --       --       --       --       --       --       --
Income        +        +        +        +        --       --       --       --
Average       108.72   109.37   90.68    90.64    93.51    93.06    111.66   111.86

The table below gives the final response variable data. Each factor combination is the average of the two traffic lights in the corresponding neighborhood.

For instance, the Morning/City/Low-Income combination, coded '+++', has two values, 64.88 and 36.56, which belong to the Racine & 18th and Ashland & 19th lights. These were measured in the morning, in the city, and in the low income neighborhood (Pilsen). Therefore the value is (64.88 + 36.56) / 2 = 50.72, which includes 36 data points. The same process is applied to the other combinations, such as Evening/City/High-Income.

Factor combination         Time  Location  Income  Replicate 1  Replicate 2
Morning/City/Low-Income    +     +         +       50.72        50.74
Evening/City/High-Income   --    +         --      77.38        77.37
Morning/City/High-Income   +     +         --      77.52        77.52
Evening/City/Low-Income    --    +         +       50.55        50.72
Morning/Subs/High-Income   +     --        --      102.13       102.59
Evening/Subs/Low-Income    --    --        +       99.88        100.00
Morning/Subs/Low-Income    +     --        +       99.78        99.70
Evening/Subs/High-Income   --    --        --      102.15       102.46
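The averaging step can be sketched in a few lines. This is only an illustration using the day-1 light averages from the data tables; the dictionary layout and the function name `cell_response` are our own choices.

```python
# Day-1 light averages from the data tables, keyed by
# (time of day, location, income) -> averages of the two lights in that neighborhood.
light_averages = {
    ("Morning", "City", "Low"):  (64.88, 36.56),    # Pilsen
    ("Evening", "City", "Low"):  (64.73, 36.36),
    ("Morning", "City", "High"): (65.02, 90.02),    # Little Italy
    ("Evening", "City", "High"): (64.87, 89.89),
    ("Morning", "Subs", "Low"):  (109.02, 90.54),   # Glen Ellyn
    ("Evening", "Subs", "Low"):  (109.27, 90.48),
    ("Morning", "Subs", "High"): (93.16, 111.10),   # Naperville
    ("Evening", "Subs", "High"): (93.13, 111.16),
}


def cell_response(pair):
    """Average the two light averages that share one factor combination."""
    return sum(pair) / len(pair)


responses = {combo: cell_response(pair) for combo, pair in light_averages.items()}
for combo, value in responses.items():
    print("/".join(combo), f"{value:.3f}")
```

The computed values agree with the Replicate 1 column to within rounding of the second decimal.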

Design Matrix - Factorial Design

StdOrder  RunOrder  CenterPt  Blocks  Time  Location  Income  Response
10        1         1         1        1    -1        -1      102.13
14        2         1         1        1    -1         1      99.78
9         3         1         1       -1    -1        -1      102.15
13        4         1         1       -1    -1         1      99.88
1         5         1         2       -1    -1        -1      102.46
8         6         1         1        1     1         1      50.72
15        7         1         1       -1     1         1      50.55
4         8         1         1        1     1        -1      77.52
16        9         1         2        1     1         1      50.74
5         10        1         2       -1    -1         1      100.00
6         11        1         2        1    -1         1      99.70
2         12        1         2        1    -1        -1      102.59
3         13        1         1       -1     1        -1      77.38
12        14        1         2        1     1        -1      77.52
7         15        1         2       -1     1         1      50.72
11        16        1         2       -1     1        -1      77.37
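For readers without access to the design software, a blocked and replicated 2^3 design of this shape can be enumerated as follows. This is a minimal sketch: it does not reproduce the run order shown above, and the function name, the seed, and the choice to shuffle all 16 runs together are assumptions made for illustration.

```python
import itertools
import random

FACTORS = ("Time", "Location", "Income")
BLOCKS = (1, 2)  # one block per day of data collection


def full_factorial_runs(seed=0):
    """Return the 16 runs of a replicated 2^3 design in a random run order."""
    rng = random.Random(seed)  # arbitrary seed, used only for reproducibility
    runs = []
    for block in BLOCKS:
        # Every combination of -1/+1 levels appears exactly once in each block.
        for levels in itertools.product((-1, 1), repeat=len(FACTORS)):
            runs.append({"Block": block, **dict(zip(FACTORS, levels))})
    rng.shuffle(runs)
    return runs


runs = full_factorial_runs()
for run_order, run in enumerate(runs, start=1):
    print(run_order, run)
```

Because each block contains the complete set of eight factor combinations, the block effect can be estimated independently of the factor effects.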

Statistical Analysis of the Data and Model Adequacy Checking:

Statistical Analysis

The results of the 2^3 full factorial design are shown in the table below.

Analysis of Variance

Source                    DF   Adj SS    Adj MS    F-Value     P-Value
Model                      8   6927.58    865.95    51503.45   0.000
  Blocks                   1      0.06      0.06        3.64   0.098
  Linear                   3   6338.30   2112.77   125659.65   0.000
    Time                   1      0.00      0.00        0.13   0.725
    Location               1   5482.29   5482.29   326066.95   0.000
    Income                 1    856.00    856.00    50911.87   0.000
  2-Way Interactions       3    589.21    196.40    11681.45   0.000
    Time*Location          1      0.04      0.04        2.20   0.181
    Time*Income            1      0.02      0.02        1.38   0.278
    Location*Income        1    589.15    589.15    35040.77   0.000
  3-Way Interactions       1      0.01      0.01        0.62   0.455
    Time*Location*Income   1      0.01      0.01        0.62   0.455
Error                      7      0.12      0.02
Total                     15   6927.70

Model Summary

S          R-sq      R-sq(adj)   R-sq(pred)
0.129666   100.00%   100.00%     99.99%

The Model F-value of 51503.45 implies that the model is significant. The corresponding p-value is reported as 0.000, so there is less than a 0.05% chance that a "Model F-Value" this large could occur due to noise.

Two factors have p-values below our alpha of 0.05 and are therefore significant: Location and Income. Among the interactions, the Location*Income interaction is also significant. The model shows that Time of day is not significant (p = 0.725) and has no detectable effect on the response variable.

The block effect has a p-value of 0.098, which is not significant at alpha = 0.05 but does suggest some day-to-day difference between the replicates. Blocking on the day of replication removes this variation from the error term, which improves the power of the experiment.

The analysis gives a very high R-squared value of 100.00%, which indicates that the model explains nearly all of the variation in the response. The "Pred R-Squared" of 99.99% is in reasonable agreement with the "Adj R-Squared" of 100.00%; the difference between the two is well under 1%.
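As a rough cross-check of the ANOVA table, the sums of squares can be recomputed from the 16 responses in the design matrix using contrasts. The sketch below is not the statistical software's procedure; it assumes the coded -1/+1 levels and the two day blocks from the design matrix, and all names in it are our own.

```python
from itertools import combinations
from math import prod, sqrt

# (block, time, location, income, response) for the 16 runs, in run order.
RUNS = [
    (1,  1, -1, -1, 102.13), (1,  1, -1,  1,  99.78),
    (1, -1, -1, -1, 102.15), (1, -1, -1,  1,  99.88),
    (2, -1, -1, -1, 102.46), (1,  1,  1,  1,  50.72),
    (1, -1,  1,  1,  50.55), (1,  1,  1, -1,  77.52),
    (2,  1,  1,  1,  50.74), (2, -1, -1,  1, 100.00),
    (2,  1, -1,  1,  99.70), (2,  1, -1, -1, 102.59),
    (1, -1,  1, -1,  77.38), (2,  1,  1, -1,  77.52),
    (2, -1,  1,  1,  50.72), (2, -1,  1, -1,  77.37),
]
NAMES = ("Time", "Location", "Income")


def contrast_ss(signs, ys):
    """Sum of squares of a +/-1 contrast over N runs: (sum of sign*y)^2 / N."""
    return sum(s * y for s, y in zip(signs, ys)) ** 2 / len(ys)


ys = [run[4] for run in RUNS]
mean_y = sum(ys) / len(ys)
ss_total = sum((y - mean_y) ** 2 for y in ys)

# Block term first, then main effects, 2-way and 3-way interactions.
ss = {"Blocks": contrast_ss([1 if run[0] == 2 else -1 for run in RUNS], ys)}
for order in (1, 2, 3):
    for idx in combinations(range(3), order):
        term = "*".join(NAMES[i] for i in idx)
        signs = [prod(run[1 + i] for i in idx) for run in RUNS]
        ss[term] = contrast_ss(signs, ys)

ss_error = ss_total - sum(ss.values())
df_error = len(ys) - 1 - len(ss)      # 15 total df minus 8 model df
ms_error = ss_error / df_error
s = sqrt(ms_error)                    # the "S" in the model summary
# Every term has 1 degree of freedom, so its mean square equals its SS.
f_values = {term: value / ms_error for term, value in ss.items()}

for term in ss:
    print(f"{term:22s} SS={ss[term]:10.2f}  F={f_values[term]:12.2f}")
print(f"Error SS={ss_error:.2f}  S={s:.4f}  R-sq={1 - ss_error / ss_total:.4%}")
```

The contrast approach works here because the design is balanced, so every model term, including the block term, is orthogonal to the others.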

Below is the regression model and the table of residuals. This is useful for the normality check of our data

which contributes to the model adequacy check.

Regression Equation in Uncoded Units

Response = 82.5756 + 0.0119 Time - 18.5106 Location - 7.3144 Income
           + 0.0481 Time*Location - 0.0381 Time*Income - 6.0681 Location*Income
           + 0.0256 Time*Location*Income

Obs  Response  Fit      Resid   Std Resid
1    102.150   102.243  -0.093  -1.09
2    99.780    99.678    0.102   1.19
3    77.380    77.313    0.067   0.78
4    102.130   102.298  -0.168  -1.96
5    77.520    77.458    0.062   0.72
6    99.880    99.878    0.002   0.02
7    50.720    50.668    0.052   0.60
8    50.550    50.573   -0.023  -0.27
9    99.700    99.802   -0.102  -1.19
10   50.720    50.697    0.023   0.27
11   102.460   102.367   0.093   1.09
12   102.590   102.422   0.168   1.96
13   77.370    77.437   -0.067  -0.78
14   77.520    77.582   -0.062  -0.72
15   50.740    50.792   -0.052  -0.60
16   100.000   100.002  -0.002  -0.02

The figures above show the normal probability plot of the residuals, the histogram, and the residuals versus fitted values plot. The normal probability plot is reasonably linear and passes the fat pencil test, the histogram appears approximately normal, and the fitted values plot shows no pattern, with reasonably even scatter.
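The regression coefficients, fitted values, and residuals reported above can be reproduced from the design matrix with a short sketch. It assumes the coded -1/+1 levels and a block term coded -1 for day 1 and +1 for day 2; the block term is part of the fit but does not appear in the printed regression equation. The helper names are our own, and the runs are listed in run order rather than in the order of the residual table.

```python
from itertools import combinations
from math import prod, sqrt

# (block, time, location, income, response) for the 16 runs, in run order.
RUNS = [
    (1,  1, -1, -1, 102.13), (1,  1, -1,  1,  99.78),
    (1, -1, -1, -1, 102.15), (1, -1, -1,  1,  99.88),
    (2, -1, -1, -1, 102.46), (1,  1,  1,  1,  50.72),
    (1, -1,  1,  1,  50.55), (1,  1,  1, -1,  77.52),
    (2,  1,  1,  1,  50.74), (2, -1, -1,  1, 100.00),
    (2,  1, -1,  1,  99.70), (2,  1, -1, -1, 102.59),
    (1, -1,  1, -1,  77.38), (2,  1,  1, -1,  77.52),
    (2, -1,  1,  1,  50.72), (2, -1,  1, -1,  77.37),
]
NAMES = ("Time", "Location", "Income")


def model_columns(run):
    """Return the +/-1 value of every model term for one run."""
    block, *levels = run[:4]
    cols = {"Constant": 1, "Block": 1 if block == 2 else -1}
    for order in (1, 2, 3):
        for idx in combinations(range(3), order):
            cols["*".join(NAMES[i] for i in idx)] = prod(levels[i] for i in idx)
    return cols


columns = [model_columns(run) for run in RUNS]
ys = [run[4] for run in RUNS]
n = len(ys)

# The columns are mutually orthogonal, so each least-squares coefficient
# reduces to sum(sign * y) / n.
coef = {term: sum(c[term] * y for c, y in zip(columns, ys)) / n
        for term in columns[0]}

fits = [sum(coef[t] * c[t] for t in coef) for c in columns]
resid = [y - f for y, f in zip(ys, fits)]

p = len(coef)                                   # 9 estimated parameters
s = sqrt(sum(r * r for r in resid) / (n - p))   # residual standard deviation
leverage = p / n                                # identical for every run here
std_resid = [r / (s * sqrt(1 - leverage)) for r in resid]

for term, value in coef.items():
    print(f"{term:22s} {value:9.4f}")
print(f"run 1: fit={fits[0]:.3f} resid={resid[0]:.3f} std resid={std_resid[0]:.2f}")
```

The first run in run order (response 102.13) corresponds to observation 4 in the residual table.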

Analysis of the Factor Interactions:

The graph below shows the two-factor interaction found significant in the analysis of the model.

The conclusions drawn from the plot are:

1. Moving towards the suburbs or a high income area, the average traffic light time increases. The opposite is also true: moving to the city and/or a low income area, the average traffic light time decreases. There is also a slight interaction between location and time of day, but it is not statistically significant (p = 0.181).

2. Time of day does not affect the traffic light time, which stays constant between the morning and the evening. It also has no significant interaction with location or income.

3. In conclusion, from the data obtained, the lowest traffic light times are found in the city or in a low income area.
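The size of the Location*Income interaction can be illustrated by comparing the income effect within each location. The sketch below pools the morning and evening responses from the final response table, since time of day was not significant; the variable names are our own.

```python
from collections import defaultdict
from statistics import mean

# (location, income, response) from the final response table, both replicates.
RESPONSES = [
    ("City", "Low", 50.72), ("City", "Low", 50.74),
    ("City", "Low", 50.55), ("City", "Low", 50.72),
    ("City", "High", 77.38), ("City", "High", 77.37),
    ("City", "High", 77.52), ("City", "High", 77.52),
    ("Suburbs", "Low", 99.88), ("Suburbs", "Low", 100.00),
    ("Suburbs", "Low", 99.78), ("Suburbs", "Low", 99.70),
    ("Suburbs", "High", 102.13), ("Suburbs", "High", 102.59),
    ("Suburbs", "High", 102.15), ("Suburbs", "High", 102.46),
]

groups = defaultdict(list)
for location, income, y in RESPONSES:
    groups[(location, income)].append(y)
cell_means = {cell: mean(values) for cell, values in groups.items()}

# Income effect (high minus low) within each location.
income_effect = {
    loc: cell_means[(loc, "High")] - cell_means[(loc, "Low")]
    for loc in ("City", "Suburbs")
}

for cell, value in cell_means.items():
    print(cell, f"{value:.2f}")
print({loc: round(effect, 2) for loc, effect in income_effect.items()})
```

The income gap is roughly 27 seconds in the city but only about 2.5 seconds in the suburbs, which is what a significant Location*Income interaction looks like in the cell means.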

Paired & 2-Sample T-test Analysis:

ANALYSIS:

It was important to test whether the mean cycle times in the morning and the evening differ at each traffic light. Because the morning and evening data points came from the same traffic light, and the sample size of 18 was too small to assume normality, a paired t-test was run. After running the paired t-test on all the traffic lights, the resulting p-values led to the conclusion that 6 of the 8 traffic lights had the same morning and evening means, and only two had different means. The full factorial analysis explained earlier in the report also showed that the time factor is not significant. Together, these results did not disprove the group's original position that time of day does not affect the mean times.

Assuming, from the factorial analysis and the paired t-test, that time is not a factor, the morning and evening data were combined for each traffic light. The group then tested whether the mean traffic light time differed within the same neighborhood. Two-sample t-tests were therefore conducted in Minitab between the two traffic lights within each neighborhood. The p-values in each test indicated that all the traffic lights had different means. The sample size for each traffic light was 36, which is adequate to assume normality for the test.

Next, a two-sample t-test was run to compare the mean times between the neighborhoods within each location: the city neighborhoods (Pilsen and Little Italy) and the suburban neighborhoods (Glen Ellyn and Naperville). The mean times are different, as the p-values are below 0.05. With a sample size of 72 data points per neighborhood, consisting of all traffic lights within that neighborhood, normality can be assumed.

Lastly, a two-sample t-test was run to compare the means of the city and the suburbs. The sample size was 144 data points per location (city or suburbs), consisting of all traffic lights in the two neighborhoods. This led to the conclusion that the city and the suburbs do have different mean traffic light cycle times, so the hypothesis of equal means is rejected.
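To illustrate the paired t-test, the sketch below computes the test statistic for one traffic light (Halsted & Maxwell) from the data tables. It assumes that run i in the morning is paired with run i in the evening on the same day, which may differ from the pairing used in Minitab. Instead of a p-value it compares the statistic with the standard two-sided critical value of 2.110 for alpha = 0.05 and 17 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

# Halsted & Maxwell cycle times in seconds:
# nine runs on day 1 followed by nine runs on day 2.
morning = [64.93, 65.01, 65.12, 64.87, 64.97, 65.05, 65.22, 65.10, 64.95,
           64.99, 65.12, 65.06, 64.97, 64.88, 65.02, 64.96, 65.15, 65.09]
evening = [64.95, 64.63, 64.95, 65.21, 65.06, 64.89, 64.20, 64.85, 65.07,
           64.89, 64.79, 65.18, 65.06, 64.82, 65.14, 65.04, 65.28, 64.98]


def paired_t(a, b):
    """Return the paired t statistic and its degrees of freedom."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


T_CRIT_17 = 2.110  # two-sided critical value, alpha = 0.05, 17 df

t_stat, df = paired_t(morning, evening)
reject = abs(t_stat) > T_CRIT_17
print(f"t = {t_stat:.3f}, df = {df}, reject equal means: {reject}")
```

Under this pairing the statistic is about 1.16, well inside the critical value, so this light would not show a morning versus evening difference.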

Conclusions:

The statistical analysis of the data clearly indicates that the following factors have a statistically significant effect on traffic light cycle times, and that the effect remains significant when blocking is taken into account:

1) Location Factor: Statistical analysis of our data showed that traffic lights in the city have lower average cycle times than traffic lights in the suburbs.

2) Income Factor: Statistical analysis of the data showed that traffic lights in lower income areas tend to have significantly lower cycle times than those in higher income areas.

In addition, it was concluded that time of day has no significant effect on traffic light times. Whether it is morning or evening, all the statistical analysis implied that the cycle times will be the same for individual traffic lights in the areas tested.

Upon further investigation, it appears that the suburbs have higher traffic light times because of their wide, multi-lane roads, in contrast to the city's small two-lane roads. Because the suburbs are large and have plenty of space, roads are wider and have more lanes, which results in more complex intersections. This leads to longer cycle times.

For the income factor, it appears that when an area has high income levels, a bigger budget is available for road maintenance and traffic light installation. This results in actuated traffic lights with sensors that can change cycles in response to unusual events. This in turn results in a higher percentage of vehicles stopping, because green time is not held for upstream platoons.

References:

Montgomery, Douglas C. Design and Analysis of Experiments. Hoboken, NJ: John Wiley & Sons, 2013. Print.

MINITAB. Computer software. N.p., n.d. Web.

"Average Daily Traffic Counts." City of Chicago :: Average Daily Traffic Counts. N.p., n.d. Web. 20 Nov. 2016.

"Point2 Homes." Real Estate | Homes for Sale & Rent by Point2 Homes. N.p., n.d. Web. 20 Nov. 2016.
