SOFTWARE ARCHITECTURE FOR PLAN4SAFETY CHINA
Dr. Mohsen Jafari
Program Director, Information Management Group
Center for Advanced Infrastructure and Transportation
100 Brett Rd.
Piscataway, NJ 08854
848-445-2980, [email protected]
Tao Gang
Head of Information Technology Dept.
Anhui Keli Information Industry Co. Ltd
628 Huangshan Road,
Hefei, Anhui Province, China
Evan Bossett
Lead Application Developer, Transportation Safety Resource Center
Center for Advanced Infrastructure and Transportation
100 Brett Rd.
Piscataway, NJ 08854
848-445-2882, [email protected]
Bobby Jafari
EAMG LLC
20 Eton Court
Bedminster, NJ 07921
(908) 502 1067, [email protected]
ABSTRACT
This document describes the software architecture for the P4S China project, including the
requirements and its design philosophy. The goal is to give a clear summary of the software
structure for integration and future feature extensions. The software architecture of the P4S
China project follows the service-oriented architecture (SOA) design pattern, using web
services to couple different software components. The computation results are stored in
an Oracle database. For advanced computation, a Matlab runtime engine is incorporated within
the system to support scientific computations such as nonlinear regression. The
architecture is scalable, extensible, and flexible.
Keywords: transportation, safety, software, design, architecture
INTRODUCTION
The report is organized as follows: an overview of the Plan4Safety China project, followed by a
detailed design of the architecture. Several different initial designs are presented and
compared.
Anhui Province lies in the hinterland of the Yangtze Delta; its biggest prefecture, Hefei, has an
area of 7,048 km² (2,721 sq mi) and, as of the 2010 Census, a population of 5,702,466
inhabitants. Its built-up area (“metro”) is home to 3,352,076 inhabitants encompassing all
urban districts. There has been a rapid increase in the number of vehicles in Anhui Province
over the last decade—from 530,000 in 2002 to 2.7 million in 2011 (trucks and cars only). In
2012, the total number of all vehicles in Anhui Province exceeded 7.8 million—almost
double that of 2006.
Due to the increase in new vehicles on its roads, crashes in Anhui Province have increased
from 13,553 in 2006 to 18,075 in 2012, a 33 percent spike in just seven years. Despite the
sharp increase in the number of vehicles on its roads, Anhui Province has been
successful in decreasing the total number of traffic fatalities from 3,735 in
2006 to 2,690 in 2012. This indicates that safety is a top priority for local government
officials. The estimated congestion cost to China is over $3 billion per year. In Beijing alone,
drivers spend roughly 66 minutes sitting in traffic for each trip they make per
day.
These facts point to an urgent need to develop advanced preventive
measures that significantly reduce traffic accidents while increasing mobility.
This led Keli to conduct an exhaustive review of commercial off-the-shelf (COTS) traffic safety software currently on the
market. Rutgers’ CAIT in Piscataway, New Jersey, was identified as a standout in the field
of traffic safety through its transportation safety program, TSRC. TSRC combines
experience in engineering, training and outreach, and advanced statistical modeling and
software development to improve safety and reduce fatalities and injuries on New Jersey’s
roadways.
AN OVERVIEW OF PLAN4SAFETY CHINA
Prior to the start of the project, Keli Information had already deployed a black spot analysis
tool which provided statistical crash analysis and map display of crash locations. Developed
by CAIT, Plan4Safety is nationally recognized software in the United States with
functionality similar to Keli’s system, plus more advanced analysis tools. The
Rutgers team also has experience with the new models released in AASHTO’s Highway Safety Manual (HSM).
The manual details best practices in analyzing roadway safety which were the result of many
years of research by FHWA.
The Plan4Safety China project is a collaboration between CAIT and Keli to create the “next
generation” of traffic safety software that reaches beyond traditional approaches, which deal
only with historical crash data. The system will incorporate cutting-edge models, like those
found in the HSM. This requires the integration of additional data sets, such as roadway
network information, crime data, inspection data, and GIS.
In addition to the basic reporting and analysis tools within Plan4Safety China, this effort has
resulted in a number of unique tools. These tools benefit
safety professionals, including traffic engineers, planners, law enforcement, and
decision-makers at different agencies in the industry. They were developed with safety professionals’
needs in mind and allow users to home in on problem areas so that site-specific
improvements can be implemented. By making this powerful tool available to all public
agency professionals—and encouraging them to use it through support and training—Anhui
Province will significantly raise the autonomous capabilities of its safety professionals at all
levels.
Safety Performance Functions
One of these tools is the implementation of the safety performance function (SPF), a
methodology for estimating and predicting the average crash frequency of a
network, facility, or specific location, such as a roadway segment or intersection.
An SPF is a set of equations that estimates expected average crash frequency as a function of
traffic volume and roadway characteristics.
SPF is applied to a given time period, traffic volume, and geometric design characteristics of
a roadway and may include crash severity and collision types. The predictive method
provides a quantitative measure of expected average crash frequency under both existing
conditions and future conditions, which have not yet occurred. This allows proposed roadway
conditions to be quantitatively assessed along with other considerations such as community
needs, capacity, delay, cost, right-of-way, and environmental considerations.
The estimate is for a given time period of interest (in years) during which the geometric
design and traffic control features are unchanged and the annual average daily traffic
(AADT) counts are known or can be forecast. An SPF is a regression equation that estimates
the number of crashes occurring at a site as a function of AADT and, for roadway
segments, the length of the segment. SPFs are developed using Negative Binomial
regression models in which the number of crashes is the response variable and roadway segment or
intersection geometric design features, traffic control features, and other characteristics are the
predictive variables. The predictive variables used in the regression model for roadway
segments are length of segment, AADT, lane width, shoulder width, shoulder type, roadside
hazard rating, presence or absence of horizontal curve, curve characteristics (in case of
presence of a curve), centerline rumble strips, a short four-lane section, a two-way left-turn
lane, roadway segment lighting, and speed enforcement. For intersections, the predictive
variables are the following: number of intersection legs, type of traffic control, intersection
skew angle, number of approaches with left/right-turn lanes, and presence or absence of
intersection lighting.
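To make the functional form concrete, the sketch below evaluates a segment-level SPF in which expected crash frequency depends only on AADT and segment length. The log-linear form, the function name `spf_segment`, and the coefficients `b0` and `b1` are assumptions chosen for illustration; they are not the calibrated values used in Plan4Safety China.

```python
import math

def spf_segment(aadt, length_km, b0=-7.5, b1=0.8):
    """Expected crashes per year as exp(b0) * AADT**b1 * length.

    b0 and b1 are hypothetical coefficients; a real SPF would take them
    from a Negative Binomial regression fitted to local data.
    """
    if aadt <= 0 or length_km <= 0:
        raise ValueError("AADT and segment length must be positive")
    return math.exp(b0) * aadt ** b1 * length_km

predicted = spf_segment(aadt=12000, length_km=2.0)
print(f"{predicted:.2f} expected crashes per year")
```

In this form the estimate scales linearly with segment length, which matches the role of length as an exposure term.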
A general SPF has been proposed by the HSM to apply to networks in the United
States. However, the HSM notes that, whenever possible, local agencies should develop
their own crash prediction models. In the HSM version of SPF, only AADT and roadway
length are considered as inputs. Other roadway characteristics can indirectly affect the
estimated crashes via a set of measures referred to as the crash modification factors (CMF).
There are two major issues with the SPF model proposed by the HSM. First, it was developed in
the United States based on local data, which does not necessarily apply to other countries and
states. Second, CMF methods for different locations have not been released. CMF methods
rely on data from before and after adoption of traffic safety improvements. An alternative
method recommended by many researchers is an ad-hoc crash prediction model using local
data. In doing so, one can input all available roadway characteristics directly instead of using
and developing new CMF values.
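For comparison, the HSM-style approach mentioned above adjusts a base SPF estimate by multiplying it by the CMF of each roadway characteristic or treatment. The sketch below illustrates that multiplication; the function name and the CMF values are hypothetical placeholders, not published factors.

```python
from math import prod

def adjust_with_cmfs(base_crashes, cmfs):
    """Scale a base SPF estimate by the product of the supplied CMFs."""
    if base_crashes < 0:
        raise ValueError("base crash estimate must be non-negative")
    if any(cmf <= 0 for cmf in cmfs):
        raise ValueError("each CMF must be positive")
    return base_crashes * prod(cmfs)

# Hypothetical values: a CMF below 1 lowers the estimate, above 1 raises it.
adjusted = adjust_with_cmfs(4.0, [0.90, 1.10])
print(f"{adjusted:.2f} adjusted crashes per year")
```

An ad-hoc local model avoids this step by taking the roadway characteristics as direct regression inputs.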
China’s crash prediction model was developed based upon a Negative Binomial generalized
linear model and has been applied to a large amount of data collected from a wide range of
urban, suburban, and rural areas. This model is popular, well-structured, and can be
customized for a wide range of different networks. The model also accounts for the problem
of over-dispersion, which is quite common in crash data. The performance of the Chinese
crash prediction model has been evaluated using a large sample of Chinese crash and network
data. The model adequately predicts crashes for urban, suburban, and rural areas. The
results are used for ranking crash hotspots within a selected area.
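Over-dispersion means that the variance of crash counts exceeds their mean, which the Negative Binomial model captures as variance = mean + alpha * mean². The sketch below is a simplified illustration only: it estimates alpha by the method of moments from raw counts (ignoring covariates, unlike a fitted regression) and ranks sites by predicted crashes. The function names and sample data are hypothetical.

```python
from statistics import mean, variance

def overdispersion_alpha(counts):
    """Moment estimate of alpha in Var = mu + alpha * mu**2 (0 if not over-dispersed)."""
    mu = mean(counts)
    if mu <= 0:
        raise ValueError("mean crash count must be positive")
    return max(0.0, (variance(counts) - mu) / (mu * mu))

def rank_hotspots(predicted, top_n=3):
    """Return site identifiers ordered by predicted crash frequency, highest first."""
    return sorted(predicted, key=predicted.get, reverse=True)[:top_n]

# Hypothetical yearly crash counts at ten sites.
sample_counts = [0, 1, 0, 2, 9, 0, 1, 12, 0, 3]
print("alpha estimate:", round(overdispersion_alpha(sample_counts), 3))
print(rank_hotspots({"site_a": 1.2, "site_b": 5.0, "site_c": 3.1}, top_n=2))
```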
Diagnosis and Countermeasures
The diagnosis and countermeasure selection tool diagnoses problems of crash hotspots based
on site characteristics and outputs an array of countermeasure recommendations that can
improve safety in those locations.
Based on the crash type, a tree structured questionnaire is displayed to the user. The questions
are generally about traffic control devices, intersection/roadway geometry and traffic volume
data of the selected site. Depending on user responses, one path of the tree structure leads to a
problem diagnosis and appropriate solutions.
The output data in this methodology contain the potential problem, the suitable solution,
approximate implementation time and cost of the solutions, applicable areas, and the CMF
approved by FHWA.
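The tree-structured questionnaire can be pictured as a set of question nodes whose answers select the next node until a leaf with a diagnosis is reached. The following is a minimal sketch under that assumption; the `Node` class, the questions, and the leaf contents are hypothetical and do not reproduce the questionnaires used in the system.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class Node:
    """One node of the questionnaire tree; a leaf has no question."""
    question: Optional[str] = None
    branches: Dict[str, "Node"] = field(default_factory=dict)
    problem: Optional[str] = None
    solution: Optional[str] = None

def diagnose(root, answers):
    """Follow the user's answers from the root until a leaf is reached."""
    node = root
    while node.question is not None:
        answer = answers[node.question]   # KeyError if a question is unanswered
        node = node.branches[answer]      # KeyError if the answer has no branch
    return node

# Hypothetical two-question tree for rear-end crashes at a signalized intersection.
Q1 = "Is there another signal within a half mile?"
Q2 = "Are the nearby signals already coordinated?"
no_action = Node(problem="No signal spacing issue identified",
                 solution="Review other factors")
coordinate = Node(problem="Nearby signals are not coordinated",
                  solution="Signal coordination")
tree = Node(question=Q1, branches={
    "no": no_action,
    "yes": Node(question=Q2, branches={"yes": no_action, "no": coordinate}),
})

leaf = diagnose(tree, {Q1: "yes", Q2: "no"})
print(leaf.problem, "->", leaf.solution)
```

A fuller leaf record would also carry the implementation time, cost, applicable areas, and CMF listed above.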
Case Studies
This case study outlines the questionnaire-building procedure. The references used in the procedure
cover a wide range of applicable countermeasures for signalized and non-signalized
intersections and roadway segments. For each countermeasure, there is a description
outlining roadway geometry, traffic control devices, other conditions that cause crashes, and
the specific countermeasure that can mitigate them. A comprehensive review of these
descriptions, along with sound safety engineering judgment, is required to define the
questions and their order.
As an example, consider the countermeasure “signal coordination” which is indicated as a
proven countermeasure in National Cooperative Highway Research Program (NCHRP)
Report 500, Vol. 12. Below is the description of this countermeasure; key terms that the
safety engineer used to develop the questionnaire are in bold.
“The target of this strategy is crashes involving major-street left-turning and minor street
right-turning vehicles where adequate safe gaps in opposing traffic are not available. These
crash types are generally angle and rear-end crashes. Major road rear-end crashes associated
with speed changes can also be reduced by retiming signals to promote platooning.
“A key to
success is the appropriate spacing of the signals. Signals within a half mile of each other
should be coordinated, but signal systems that operate on different cycle lengths do not need
to be coordinated.
“Factors that should be considered include geographic boundaries, volume/capacity ratios,
and characteristics of traffic flow (random vs. platoon arrivals).
“Spacing of traffic signals is an important factor. As with all signals, coordinated signals too
close together can present problems related to drivers focusing on a downstream signal and
not noticing the signal they are approaching or proceeding through a green signal and not
being able to stop for a queue at an immediate downstream signal. Dispersion of platoons can
occur if signals are spaced too far apart, resulting in inefficient use of the signal coordination
and loss of any operational benefit. Operations on cross streets may be negatively impacted.”
Cost-Benefit Analysis
A third powerful methodology incorporated into Plan4Safety China is the cost-benefit
analysis. The economic analysis layer provides users with the tools to
economically evaluate countermeasures that were presented by the diagnosis and
countermeasure tool. Three methodologies are available: Net
Present Value (NPV), Benefit/Cost Ratio (BCR), and Cost-Effectiveness Index (CEI). The
user can select one (or more) of these methods for analysis. Based upon the results from this
module, the user can decide how best to invest funds to see the most return for the money.
The steps of each method are described below.
Net Present Value (NPV)
• Estimate the useful life of the countermeasure.
• Estimate the number of crashes per year susceptible to correction by project using
crash prediction algorithm.
• Apply the countermeasure(s) CMF to come up with estimated number of crashes
saved per year.
• Estimate economic benefit of crashes saved per year.
• Calculate total annual savings by subtracting any expected annual savings from low-
cost interim project(s).
• Calculate Present Worth Factor (PWF).
• Calculate the net present value of the project benefits.
Benefit/Cost Ratio (BCR)
• Estimate the useful life of the countermeasure.
• Estimate the number of crashes per year susceptible to correction by project using
crash prediction algorithm.
• Apply the countermeasure(s) CMF to come up with estimated number of crashes
saved per year.
• Estimate economic benefit of crashes saved per year.
• Calculate total annual savings by subtracting any expected annual savings from low-
cost interim project(s).
• Calculate Present Worth Factor (PWF).
• Calculate present value of the project benefits.
• Estimate the present value of implementation costs.
• Calculate the benefit/cost ratio.
Cost-Effectiveness Index (CEI)
• Estimate the number of crashes per year.
• Apply the countermeasure(s) CMF to come up with estimated number of crashes
saved per year.
• Calculate the total number of crashes reduced in the desired service life by multiplying
the number of crashes reduced per year by the service life.
• Calculate the cost effectiveness by dividing the construction cost (for the desired service life)
by the total number of crashes reduced in the desired service life.
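The steps of the three methods can be sketched as a handful of small functions. This is an illustration under stated assumptions: crashes saved are taken as predicted crashes times (1 − CMF), the Present Worth Factor is the standard uniform-series formula ((1 + i)^n − 1) / (i(1 + i)^n), and NPV is taken as present-value benefits minus present-value costs. All function names and input values are hypothetical.

```python
def crashes_saved_per_year(crashes_per_year, cmf):
    """Crashes avoided per year when a countermeasure with the given CMF is applied."""
    return crashes_per_year * (1.0 - cmf)

def present_worth_factor(rate, years):
    """Uniform-series present worth factor for discount rate `rate` over `years`."""
    if years <= 0:
        raise ValueError("useful life must be positive")
    if rate == 0:
        return float(years)
    growth = (1.0 + rate) ** years
    return (growth - 1.0) / (rate * growth)

def present_value_benefits(annual_savings, rate, years):
    return annual_savings * present_worth_factor(rate, years)

def net_present_value(pv_benefits, pv_costs):
    return pv_benefits - pv_costs

def benefit_cost_ratio(pv_benefits, pv_costs):
    return pv_benefits / pv_costs

def cost_effectiveness_index(cost, saved_per_year, service_life_years):
    """Cost per crash avoided over the service life."""
    return cost / (saved_per_year * service_life_years)

# Hypothetical inputs: 10 crashes/year, CMF 0.8, 50,000 benefit per crash,
# 5,000/year already saved by an interim project, 5% rate, 10-year life.
saved = crashes_saved_per_year(10.0, 0.8)
annual_savings = saved * 50_000.0 - 5_000.0
pv_b = present_value_benefits(annual_savings, 0.05, 10)
pv_c = 400_000.0
print("NPV:", round(net_present_value(pv_b, pv_c), 2))
print("BCR:", round(benefit_cost_ratio(pv_b, pv_c), 3))
print("CEI:", round(cost_effectiveness_index(pv_c, saved, 10), 2))
```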
Near Miss Model
Another cutting-edge traffic safety model that was developed is the “near miss” model. This
section assesses the relationship between crashes and near crashes as two types of traffic
events. When crash history is absent or unavailable, near miss events can be used as a
surrogate for risk assessment. This model focuses mainly on pedestrian safety and on the safety
of crossing and merging vehicles at intersections.
Pedestrian crashes in intersections present a high cost to society, making pedestrian safety a
focus area for transportation professionals. Training such models with near miss events
is beneficial because actual pedestrian crashes occur less frequently. In
addition, because modifications to the traffic signal schedule at any given intersection can affect
adjacent roads and traffic flow, prioritizing the feasible countermeasures in order to enhance
safety is of great importance. Decision-makers can decide which countermeasures to
implement based on priorities and effects of each countermeasure on adjacent roads and
intersections.
A recently proposed model for pedestrian safety at intersections, developed by Gharieh and
Gonzales for New Brunswick, New Jersey, is applied to data collected from a sample
intersection in Hefei City, China. According to statistical test results, pedestrian flows and
vehicle flows both affect pedestrian risk at the studied intersection;
however, pedestrian flow was reported as the only dominant factor in pedestrian safety at the
sample intersection. This shows that the results of the proposed pedestrian risk model depend
on individual site characteristics. This model can be expanded within Plan4Safety China.
Commercial Vehicles
An important component of roadway safety is addressing the specific challenges of
commercial vehicles (CMV). These vehicles are involved in a large number of crashes and
can present unique challenges to both safety professionals and enforcement officials. To
address this, the commercial vehicles model was developed. The model identifies
locations with high CMV crash counts, correlates crash and inspection data
with carrier information, and pinpoints locations that may need frequent monitoring.
To achieve this, the commercial safety measurement tool was implemented. This tool probes
through historical CMV inspection records, CMV carriers, and crash circumstances. By
establishing a relationship between CMV inspection records and CMV crashes, a diagnosis
and evaluation layer in the tool can perform a cause and effect analysis on CMV crashes.
This analysis can ultimately identify carriers that have more recorded crashes by cross-
referencing traffic data with inspection violations. These results can assist law enforcement
and engineers when implementing changes to inspection procedures and enforcing routine
examinations for specific carriers. This assessment tool integrates seven major features
that measure a carrier’s safety index by evaluating CMV crash contributing circumstances and
inspection violations recorded under that carrier.
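The cross-referencing step can be illustrated with a small aggregation that joins crash and inspection records by carrier. This sketch only counts crashes and violations per carrier; it does not reproduce the seven features or the safety index of the actual tool, and the record fields (`carrier_id`, `violations`) are hypothetical.

```python
from collections import Counter

def carrier_summary(crashes, inspections):
    """Return (carrier_id, crash_count, violation_count) rows, worst carriers first."""
    crash_counts = Counter(record["carrier_id"] for record in crashes)
    violation_counts = Counter()
    for record in inspections:
        violation_counts[record["carrier_id"]] += record["violations"]
    carriers = set(crash_counts) | set(violation_counts)
    rows = [(cid, crash_counts[cid], violation_counts[cid]) for cid in carriers]
    # Sort by crashes, then violations (both descending), then carrier id.
    return sorted(rows, key=lambda row: (-row[1], -row[2], row[0]))

# Hypothetical sample records.
crash_records = [{"carrier_id": "A"}, {"carrier_id": "A"}, {"carrier_id": "B"}]
inspection_records = [{"carrier_id": "A", "violations": 3},
                      {"carrier_id": "C", "violations": 5}]
for row in carrier_summary(crash_records, inspection_records):
    print(row)
```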
Violations and Crime
The latest model developed for Plan4Safety China is a tool that investigates traffic
violations and crime data and identifies hotspots based on this information. This section
describes a correlation between general negative behavior (involvement in antisocial
behaviors) and risky driving and criminal behavior, including theft, burglary, or drunk
driving. The initial model that was developed includes several functions:
• Type of driver that is most likely to commit certain violations
• Type of vehicle that is most likely to be involved in certain violations
• Type of violations that could potentially lead to certain crash types
• Weighted locations based upon the number of violations and types of violations
• Type of trip (local or long distance) which could lead to certain types of violations
• Weighted locations by number of violations and crime
As more data becomes available, it is envisioned these tools will be expanded upon and the
model can become more robust.
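As an illustration of weighting locations by the number and type of violations, the sketch below sums a per-type weight for every violation recorded at a location. The weights, violation types, and function name are hypothetical; the source does not specify the weighting scheme.

```python
from collections import defaultdict

def weight_locations(violations, type_weights, default_weight=1.0):
    """Score each location by summing the weight of every violation recorded there."""
    scores = defaultdict(float)
    for location, violation_type in violations:
        scores[location] += type_weights.get(violation_type, default_weight)
    # Highest score first; ties broken by location name.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical weights and records.
weights = {"speeding": 2.0, "drunk_driving": 5.0}
records = [("X", "speeding"), ("X", "drunk_driving"),
           ("Y", "speeding"), ("Y", "parking")]
print(weight_locations(records, weights))
```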
Figure 1 An example of CMV model flowchart
Figure 2 An example of CMV model flowchart, continued
PLAN4SAFETY CHINA’S ARCHITECTURE
System Level Nonfunctional Software Requirements
The Plan4Safety China system is a cloud-based business intelligence platform in which
domain-specific data models and advanced scientific computations work together toward
the goal of saving lives through sound analytics. Interactive technology
is becoming more and more reliant on data and history, so the Plan4Safety China
software architecture must be flexible, extensible, and reliable, and must follow a data-centric design
philosophy.
This article focuses on the nonfunctional requirements, including flexibility, extensibility,
and robustness.
Flexibility. To be able to reconfigure software components in the future, the core business
analysis component from CAIT shall communicate with other components via web service
interfaces. The CAIT component will be an independent executable function or Windows
system service, so that the Keli team can start or stop business analyses at will.
Extensibility. The “data physical storage” method is isolated from the data analysis logic,
following a Model-View-Controller design pattern. This ensures that analysis functions are not
compromised if Keli changes the physical data storage format. As a result, the Rutgers
component shall not access raw data in the database. Instead, the Keli team offers a stable
data abstraction interface for Rutgers’ component. Rutgers’ component reads and writes data
via this data abstraction interface.
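The isolation described here can be sketched as an abstract interface that the analysis code depends on, with the storage details hidden behind it. The class and method names below are hypothetical; the actual interface is a web service defined by the Keli team.

```python
from abc import ABC, abstractmethod

class DataAbstraction(ABC):
    """Hypothetical stand-in for the data abstraction interface."""

    @abstractmethod
    def read_crashes(self, region):
        """Return crash records for a region as a list of dicts."""

    @abstractmethod
    def write_result(self, name, value):
        """Store a named analysis result."""

class InMemoryStore(DataAbstraction):
    """One possible backend; swapping it does not affect the analysis code."""

    def __init__(self, crashes):
        self._crashes = list(crashes)
        self.results = {}

    def read_crashes(self, region):
        return [c for c in self._crashes if c["region"] == region]

    def write_result(self, name, value):
        self.results[name] = value

def count_crashes(store, region):
    """Analysis logic that touches data only through the interface."""
    total = len(store.read_crashes(region))
    store.write_result(f"crash_count:{region}", float(total))
    return total

store = InMemoryStore([{"region": "Hefei"}, {"region": "Hefei"}, {"region": "Wuhu"}])
print(count_crashes(store, "Hefei"), store.results)
```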
Robustness. A thoroughly tested defensive programming methodology ensures the high
quality of software components. To achieve this, all function calls through
the web interface return meaningful feedback. Inside the data analytics algorithm, the
software validates input parameter types before proceeding with mathematical computations,
so the calculation is only initiated if the input data makes sense to the system. This
helps prevent meaningless or erroneous results. For robustness, all
function calls to the CAIT component return in seconds. For complicated computations that
require longer, the return is asynchronous: the computations continue to run in the
background while the simpler web service calls return immediately.
While a comprehensive system architecture will not be implemented in this phase of the
project, the team will implement a simplified architecture. However, the simplified
architecture can be extended into this comprehensive architecture in the future.
The full-featured system architecture is shown in Figure 3. The component on top of this
figure is a web-based presentation layer provided by Keli. The data presentation and
abstraction layers are provided by Keli, and the business intelligence layer is provided by
CAIT. The engineering tools and runtime tools of the CAIT component are shown in light
blue and deep blue, respectively. The engineering tools are used by CAIT internally at
the development phase, not as a part of the deliverables. The runtime components are
executable files that CAIT delivers to Keli, including the Business Intelligence Engine, the
core logic, a client of the Keli data abstraction layer, and a function call web service. The
business logic is programmed in SQL, C#, and Matlab. The SQL and C#
code are directly embedded into the Business Intelligence Engine. The Matlab code is
compiled into executable files via the Matlab Builder NE toolbox. Following Matlab
conventions, the CAIT team implements the Matlab function calls using C# and opens web
services that allow external function calls via HTTP. The Keli data abstraction layer calls the
abstraction layer client. In order to isolate CAIT’s business logic from Keli’s, the two
components have independent data repositories in the database. CAIT logic reads and writes
data in a fixed schema jointly defined by both teams. Keli’s component is responsible for
preparing the retrieved data and translating the schema to other formats, if necessary. The
motivation for this design is to ensure that the Business Intelligence component remains usable in
the future, even if the Keli data schema changes.
Figure 3 Plan4Safety China comprehensive architecture
The presentation layer shall retrieve the analytical result from the web service provided by
the data abstraction layer. Following a Model-View-Controller design pattern, the presentation
(the view) is isolated from the content (the model). The CAIT Business Intelligence component
provides computation results, which are aggregated into the data model within Keli’s data
abstraction layer. In summary, future changes to the user interface (UI) design will not be
limited by the Business Intelligence component. The current data schema can be used for any
future presentation technologies.
Design Considerations
In the last century, computing power was scarce and data sizes were limited. Older
software architectures featured a single computation resource and mobile data packages. Now,
pervasive computing devices offer a tremendous amount of data. However, moving a large
amount of data around the computation infrastructure is inefficient and expensive—the
gravity of the data gradually bends the whole computation infrastructure toward a data-
centric computation framework, where computation components are independent from a
specific data structure and can be deployed next to the data source per requirement. The data
management, business analyses, and presentation modules are loosely coupled by web
services and can be easily replaced or relocated in order to keep up with fast-evolving IT
technologies. Server side technologies are organized with Service Oriented Architectures
(SOAs).
Plan4Safety China follows this design philosophy. The proposed long-term
architecture is shown in Figure 3. Due to cost constraints, the team implemented a
simplified architecture, which is shown in Figure 4.
The most important difference between Figure 3 and Figure 4 is that, in Figure 4, the input and output data
of the Business Intelligence component are read from and written directly to the database, without
using a web service. Note that the functions within the Business Intelligence
component are still triggered by the function call web service. During the project, the CAIT
team demonstrated the capability to retrieve data from Keli’s data abstraction web service.
The data was transferred from Keli’s server in China to the CAIT development PC in the
United States. However, the team simplified this approach for efficiency, so that the Business
Intelligence component accesses data within the same PC.
Architecture Design
Figure 4 Simplified Plan4Safety China at product phase
Figure 5 Development phase system architecture
The development phase system architecture is shown in Figure 5, where the development
environments of CAIT and Keli are specified in two blocks. The dashed box on top is the CAIT
development system setup, while the dashed box at the bottom of the figure is Keli’s system
setup. The CAIT system is located in the United States, while Keli’s environment is located in
China. We use the SVN version control system as the channel to transfer data and software
across the Pacific Ocean. For validation purposes, the setup of the two systems shall be
identical, except that the user accounts in the Oracle databases are different. Keli transmits
sample data to CAIT, which is stored in the database together with CAIT’s sample data. The
CAIT team then tests the Business Intelligence component and validates the outputs. For
testing purposes, CAIT provides a test client using .NET technology. The client simulates the
Java-based client on Keli’s side and calls functions within CAIT’s Business Intelligence
component, so that a tester can validate the outputs from the Business Intelligence component.
CONCLUSION
The release of Plan4Safety China is a proud moment. It represents a significant step forward
in analysis capabilities and software architecture. The new models developed for the system
are at the forefront of traffic engineering. These data-intensive models draw from numerous
sources, including crash data, road network data, crime data, and violation data. The quality
of the reporting output from Plan4Safety China is dependent upon these data links. The
database feeding the system must be powerful and responsive to support the number of users
we expect to serve.
Reports generated by Plan4Safety China are only as good as the data housed in the database.
A common understanding of the data fields in the traffic safety community is important.
Those collecting the data at the crash scene, like police officers, should be monitored and
periodically trained, as data completeness is critical to functionality. As an example, the road
and intersection network database should expand to incorporate all roads and intersections, as
well as capturing additional features.
Continued development of this software will enhance its capabilities for current users, as well
as bring new users into the system. This can be achieved by adding new software tools,
developing new traffic safety models, and integrating additional data sets into the system.