

Evaluation of Texas Condition Assessment Program and Recommendations for Improvement

by

Hui Wu, Ph.D. (corresponding author) Postdoctoral Research Fellow

Phone: (512)232-3019 Email: [email protected]

Zhanmin Zhang, Ph.D. Associate Professor

Phone: (512)471-4534 Email: [email protected]

and

Abdus Shakur Qazi, Graduate Research Assistant

Phone: (512)698-2595 E-mail: [email protected]

Department of Civil, Architectural & Environmental Engineering

The University of Texas at Austin

Paper Submitted for Presentation and Publication at the 92nd Annual Meeting of the Transportation Research Board

Text - 4582

Table 3×250 750

Figure 4×250 1000

Total - 6332

TRB 2013 Annual Meeting Paper revised from original submittal.


ABSTRACT

The Texas Department of Transportation (TxDOT) uses the Texas Condition Assessment Program (TxCAP) to measure and compare the overall road maintenance conditions among its 25 districts. TxCAP combines data from its three subsystems: the Pavement Management Information System (PMIS), which scores the condition of pavement; the Texas Maintenance Assessment Program (TxMAP), which evaluates roadside conditions; and the Texas Traffic Assessment Program (TxTAP), which assesses the condition of signs, work zones, railroad crossings, and other traffic elements. The scores for each of the subsystems are based on data of different sample sizes, accuracy, and levels of variation. Therefore, whether the current TxCAP system is an effective and consistent means of measuring TxDOT roadway inventory conditions needs to be evaluated. Statistical analyses were carried out to evaluate the system from two aspects: 1) to determine whether the mechanism employed in TxCAP is statistically effective in measuring the maintenance performance of the 25 districts, and 2) to determine whether the difference between the TxCAP scores for any two districts is statistically significant, given the limited sampling of the subsystems. A case study was carried out using sample data collected for the whole state from 2008 to 2010. The case study results show that the differences in scores between two districts are statistically significant for some pairs of districts and insignificant for others. It is therefore recommended that TxDOT either compare the 25 districts by groups/tiers or increase the sample size of the data being collected if it wants to compare the districts individually.


INTRODUCTION

Civil infrastructure, particularly roads and bridges, is intended to provide service by transporting both people and goods effectively, efficiently, and safely under all weather conditions. To achieve this objective, engineers in highway agencies need not only to maintain the infrastructure to ensure its continued service but also to evaluate the maintenance process and the needs of the roadway network. To ensure that maintenance activities are effective in terms of performance and cost, engineers and decision-makers need tools to conduct performance-based evaluations. Different agencies have developed various scores and indices to evaluate the performance of roadway network maintenance. Highway system performance depends not only on the individual performance of pavements and bridges but also on the combined interactive “function” of the pavement component, the traffic component (mainly traffic control devices), and the roadside component. Each of these components or subsystems functions differently and has different maintenance requirements. To evaluate the overall performance of the network, highway agencies need a system to comprehensively plan, measure, and manage the highway system. Such an assessment program/system must be able to organize infrastructure inventory, assess condition, set minimum acceptable condition levels, and establish condition targets. Such a system should serve as a valuable tool to facilitate: a) ranking of projects, b) evaluation of the maintenance needs of the highway network, c) distinguishing between the implications of different performance goals, d) performing cost-effectiveness studies using a performance-based budget selection process, and e) communicating to key customers, including the public, the impact of policy and budget decisions on program service and program delivery (1).

To address this need, the Texas Department of Transportation (TxDOT) developed the Texas Condition Assessment Program (TxCAP) to assess the state’s interstate and non-interstate highways in a more comprehensive manner. TxCAP combines data from three existing systems: the Pavement Management Information System (PMIS), the Texas Traffic Assessment Program (TxTAP), and the Texas Maintenance Assessment Program (TxMAP). As a result, TxCAP eliminates duplication among the three separate scoring systems and provides a simplified scoring system. However, the current TxCAP consists of three subsystems whose scores are calculated using data of different sample sizes, accuracy, and levels of variation. Whether TxCAP is an effective and consistent means of measuring TxDOT roadway maintenance conditions therefore raises concerns and needs to be evaluated (2).

In this study, an attempt is made to analyze the source data and evaluate the validity of the current TxCAP system. The rest of the discussion is organized as follows. Section 2 presents a literature review of the state of the art and the state of practice in performance-based condition assessment programs. Section 3 elaborates on the current practice by TxDOT and how the calculations are conducted. Section 4 describes the data used in the case study to evaluate the current TxCAP system. Section 5 presents the methodology and the results of the case study. Finally, Section 6 provides the conclusions and recommendations.

LITERATURE REVIEW

Several states have developed different practices and methodologies that integrate infrastructure inventory, condition assessment, minimum acceptable condition levels, and funding decisions. A review of the practices followed by the Florida Department of Transportation (FDOT), the Ohio Department of Transportation (ODOT), the Oregon Department of Transportation (ODOT), TxDOT, and the Washington State Department of Transportation (WSDOT) revealed that not all infrastructure assets were assessed; only major ones, such as pavements and bridges, were (1). However, a few agencies had programs, either in the operational phase or under development, to establish condition targets for other high-value assets such as drainage structures and traffic control devices. Table 1 summarizes the practices for overall road inventory condition assessment of the five state DOTs.

As can be seen from Table 1, all five DOTs have developed condition assessment procedures for pavements and bridges. Some DOTs also use the latest mapping technologies, such as Geographic Information Systems (GIS) and the Global Positioning System (GPS). The Florida DOT, Texas DOT, and Washington State DOT use automated condition assessment vehicles for pavement condition assessments. The Ohio DOT and Oregon DOT, on the other hand, employ a manual approach.

TABLE 1 Practices for Overall Road Inventory Condition Assessment by State DOTs

Measurement components: Pavements, Roadside Maintenance, Traffic, Bridges.

| AASHTO Member Department | Pavements | Roadside Maintenance | Traffic | Bridges | Comprehensive Road Inventory Scoring System |
|---|---|---|---|---|---|
| Florida DOT | √ | √ | √ | √ | √ |
| Ohio DOT | √ | * | | √ | |
| Oregon DOT | √ | | | √ | |
| Texas DOT | √ | √ | √ | √ | √ |
| Washington DOT | √ | √ | √ | √ | √ |

*Culvert inspection under development and inspection by some districts

More specifically, the Florida DOT maintains a Roadway Characteristics Inventory (RCI) and a Maintenance Rating Program (MRP), which is a visual and technical evaluation of actual highway maintenance conditions. The MRP is divided into five categories (elements): Roadway, Roadside, Traffic Services, Drainage, and Vegetation and Aesthetics. Each element is further divided into characteristics. Using a weighting process, characteristic and element scores are converted into the overall MRP Score. Scores for maintenance areas are rolled up into an overall Statewide MRP Score, which is then reported to the Executive Committee as Florida’s “Report Card.”

A similar methodology is followed by the Washington DOT, which currently uses the Maintenance Accountability Process (MAP) for comprehensively planning, measuring, and managing the state highway system. A subprogram of this system is the Maintenance on State System (M2) Program, which contains the maintenance functions needed to maintain the highway system. These functions are organized into groups, each with its associated MAP activities. The groups are Roadway Maintenance & Operations, Drainage Maintenance & Slope Repair, Roadside & Vegetation Management, Bridge & Urban Tunnel Maintenance & Operations, Snow & Ice Control Operations, Traffic Control Maintenance & Operations, and Rest Area Operations. MAP utilizes outcome-based performance measures with a rating scale of A (best) to F (worst) for reporting the level of service provided. The system also allows maintenance activities to be prioritized according to their impact on specific policy objectives.

THE TEXAS CONDITION ASSESSMENT PROGRAM

TxCAP combines information from PMIS, TxMAP, and TxTAP. As a result, TxCAP provides a more comprehensive assessment of the interstate and non-interstate highway system (2).


PMIS is an automated system for storing, retrieving, analyzing, and reporting pavement condition information such as distress, ride quality, deflection, and skid resistance data. It can be used to retrieve and analyze pavement information in order to compare maintenance and rehabilitation treatment alternatives, monitor current pavement condition, and estimate total pavement needs. The annual PMIS survey currently consists of three separate surveys: a visual evaluation survey, a ride quality survey, and a skid resistance measurement (3).

TxMAP evaluates the overall condition levels for the interstate and non-interstate highway systems. TxMAP requires a visual inspection of 23 elements of the highway system in three different components (pavement, roadsides, and traffic operations) for each 1-mile segment (in one direction). The entire evaluation procedure requires only two full-time employees, who perform the evaluations with assistance from district personnel. The TxMAP inspection covers 10 percent of the interstate highways and 5 percent of all other highways in the state highway system (3).

TxTAP evaluates traffic control devices across the state, such as signs, work zones, railroad crossings, and other traffic elements. TxTAP is a tool used by TxDOT to evaluate the uniformity, quality, and consistency of traffic control devices placed on the state highway system. Because evaluating every traffic control device statewide is not feasible with the available resources, TxTAP scores are based on a relatively small sample of all traffic control devices. TxDOT’s Traffic Operations Division conducts the annual evaluation of the various types of traffic control devices in each of TxDOT’s 25 field districts. Each district review consists of 20–30 randomly selected segments on the state highway system, 5–16 signalized intersections, 3–4 work zones, and 2–6 railroad crossings (3).

EXAMINATION OF THE STRUCTURE OF TxCAP SCORE

The overall inspection score, commonly known as the TxCAP score, is calculated as the weighted average of the three subsystem scores, i.e., the TxTAP score, the TxMAP score, and the PMIS score. Each of these scores is calculated as the average of the element scores. The elements are scored via field inspections conducted manually by experts and technicians and by automated vehicles. The elements in each of the subsystems are detailed in Figure 1 (3, 4).


FIGURE 1 Elements of the subsystems in the Texas Condition Assessment Program.

The development of TxCAP eliminates duplication among the three separate scoring systems and provides a simplified and concise scoring system in which potential double-counting of the effect of any component is prevented. In this sense, the structure of the TxCAP score is valid in terms of the components being considered.

On the other hand, the ratings and descriptions of the numerical grading system used in TxCAP are based on a five-point system. Each five-point rating is converted to a percentage by multiplying it by 20. The resulting score is then weighted to determine the overall score for each subsystem. Each subsystem’s overall score is then weighted according to the appropriate TxCAP value to obtain a total composite score for the entire roadway system (4). The TxCAP score is thus a weighted average of the scores of its three components. The TxCAP score (for each district) is calculated from its components’ scores using the following formula:

$$ \mathrm{TxCAP} = 0.5 \times \mathrm{PMIS} + 0.25 \times \mathrm{TxMAP} + 0.25 \times \mathrm{TxTAP} \qquad (1) $$
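The scoring steps described above (five-point rating, conversion to a percentage, weighted combination) can be illustrated with a minimal sketch. It assumes that the weight of 0.5 applies to PMIS and 0.25 to each of TxMAP and TxTAP, an assignment that is consistent with the yearly means reported in Table 2; the function names and the example ratings are hypothetical.

```python
# Assumed weights: 0.5 for PMIS and 0.25 each for TxMAP and TxTAP.
WEIGHTS = {"PMIS": 0.5, "TxMAP": 0.25, "TxTAP": 0.25}


def rating_to_percent(rating: float) -> float:
    """Convert a five-point rating to a percentage by multiplying by 20."""
    return rating * 20.0


def txcap_score(pmis: float, txmap: float, txtap: float) -> float:
    """Weighted average of the three subsystem scores (0-100 scale)."""
    return (
        WEIGHTS["PMIS"] * pmis
        + WEIGHTS["TxMAP"] * txmap
        + WEIGHTS["TxTAP"] * txtap
    )


# Hypothetical subsystem ratings on the five-point scale.
pmis = rating_to_percent(4.0)   # 80.0
txmap = rating_to_percent(3.5)  # 70.0
txtap = rating_to_percent(4.5)  # 90.0
print(txcap_score(pmis, txmap, txtap))
```

Because the weights sum to 1, the composite stays on the same 0 to 100 scale as the subsystem scores.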

In other words, a number of weights are applied at different levels of the TxCAP score structure, resulting in a scoring system that is directly affected by the choice of weights. For this reason, TxCAP scores cannot be used as an absolute measurement of roadway conditions, but they can be used as a relative measurement of the maintenance performance of the 25 TxDOT districts. Furthermore, even when the TxCAP score is used as a relative measurement of maintenance performance, it can be fine-tuned to more accurately reflect the local conditions of a district, especially considering that Texas covers a wide range of climatic conditions in terms of temperature and moisture. For example, vegetation management may not be as important to a district in west Texas as it is to a district in east Texas, because west Texas generally receives little rainfall; consequently, a smaller weight for vegetation management should be used for a district in west Texas than for a district in east Texas.

[Figure 1 elements. TxTAP: Roadside Signs, RR Xing, Signals, Shoulder, Delineator, Attenuator, Striping, Raised Pvt Marker. TxMAP: Vegetation Management, Litter, Sweeping, Trees Brush, Drainage, Encroachment, Guard Rails, Guard Rail End Treatments (GETS), Mail Boxes, General Public Rating. PMIS: Rutting, Cracking, Failure, Ride, Edge, Shoulder.]

DATA DESCRIPTION

In order to evaluate the effectiveness and consistency of the current TxCAP system, a case study was carried out using a dataset collected from all 25 districts over 3 years: 2008, 2009, and 2010. The dataset consisted of the scores of the elements under each of the three subsystems as well as the calculated PMIS, TxTAP, and TxMAP scores for each surveyed section within each district. The TxCAP score for each section was calculated using Equation 1. The respective district scores were obtained by averaging the scores of the sections surveyed in that year. The average of the scores for all districts gave the mean score for the state. The mean and standard deviation of all four scores (TxCAP, PMIS, TxMAP, and TxTAP) for the 3 years of data used in the case study are shown in Table 2. The data in Table 2 show that, of the three subsystems, TxMAP has the least variation and TxTAP the greatest variation in scores. This indicates that the condition of the roadside environment is more consistent, whereas the condition of the traffic component fluctuates widely. The analysis was carried out for each year individually and for the 3 years combined.
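The aggregation just described (section scores averaged into district scores, and district scores averaged into a state score) might be sketched as follows. The district codes are borrowed only as labels; the section scores are hypothetical and are not taken from the study's dataset.

```python
from statistics import mean, stdev

# Hypothetical section-level TxCAP scores keyed by district code.
section_scores = {
    "AUS": [78.0, 82.0, 80.0],
    "ELP": [70.0, 75.0, 74.0, 73.0],
}


def district_summary(scores):
    """Return {district: (mean, sample standard deviation, n)}."""
    return {d: (mean(v), stdev(v), len(v)) for d, v in scores.items()}


def state_mean(summary):
    """State score taken as the average of the district means."""
    return mean(m for m, _, _ in summary.values())


summary = district_summary(section_scores)
print(summary["AUS"], state_mean(summary))
```

The sample standard deviation and the section count per district are kept alongside the mean because the two-sample comparison later in the paper needs all three quantities.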

TABLE 2 Summary Descriptive Statistics of Data

| Year | PMIS Mean | PMIS S.D. | TxTAP Mean | TxTAP S.D. | TxMAP Mean | TxMAP S.D. | TxCAP Mean | TxCAP S.D. |
|---|---|---|---|---|---|---|---|---|
| 2008 | 77.80 | 11.111 | 79.48 | 12.243 | 81.24 | 9.004 | 79.13 | 8.87 |
| 2009 | 74.75 | 9.176 | 78.55 | 10.202 | 78.92 | 7.189 | 76.76 | 7.19 |
| 2010 | 76.52 | 8.874 | 80.10 | 10.170 | 79.93 | 6.844 | 78.26 | 6.91 |
| Combined | 76.34 | 9.838 | 79.37 | 10.919 | 80.02 | 7.763 | 77.86 | 7.98 |

EVALUATION OF TXCAP AND RECOMMENDATIONS FOR IMPROVEMENT

This section covers the data analysis methodology employed in this study. The statistical comparison of the scores is discussed first, followed by the determination of the minimum sample size for the subsystems. The results of the data analysis are also presented in this section.

Comparison of the Performance Scores

One of the objectives of this study was to determine whether the difference between the TxCAP scores for any two districts is statistically significant, and then to determine whether all the districts are performing at the same level. As discussed above, the TxCAP score is a mean value calculated from sampled scores of its three subsystems for each district. Therefore, the TxCAP score for each district can be considered a random variable. The null hypothesis of the comparison was constructed as follows,

$$ H_0: \mu_1 = \mu_2 \qquad (2) $$

and the alternative hypothesis as

$$ H_a: \mu_1 \neq \mu_2 \qquad (3) $$

where:
$\mu_1$ = the mean of the TxCAP score for district 1, and
$\mu_2$ = the mean of the TxCAP score for district 2.

The t-test is usually used to compare means for two samples from the same population as well as for samples from two different populations (5, 6). A pilot evaluation showed that the variances of the scores in the TxCAP subsystems differed among the districts. Therefore, the TxCAP score for each district was considered an independent random variable with a different variance. A two-tailed, heteroskedastic t-test was carried out to compare the mean values; its formulas are presented in Equations 4 through 6.

The two possible outcomes of the hypothesis test are either “reject $H_0$” or “fail to reject $H_0$.” If the t-test results in rejecting the null hypothesis, the scores are statistically different and a true difference exists between them. If the t-test fails to reject the null hypothesis, it can be concluded that the scores are not statistically different. In such a case, the scores of the two districts may be the same, or they may appear different only because of variability in measurement (measurement error). The same null and alternative hypotheses were used in the comparison of the PMIS, TxTAP, and TxMAP scores. The t-statistic, combined standard deviation, and degrees of freedom were calculated using Equations 4 through 6, respectively (5, 6):

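$$ t = \frac{\bar{X}_1 - \bar{X}_2}{s_{\bar{X}_1 - \bar{X}_2}} \qquad (4) $$

$$ s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \qquad (5) $$

$$ \mathrm{d.f.} = \frac{\left( \dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2} \right)^2}{\dfrac{\left( s_1^2 / n_1 \right)^2}{n_1 - 1} + \dfrac{\left( s_2^2 / n_2 \right)^2}{n_2 - 1}} \qquad (6) $$

where:
$\bar{X}_1$ = the mean TxCAP score of district 1,
$\bar{X}_2$ = the mean TxCAP score of district 2,
$s_1^2$ = unbiased estimator of the variance of sample data points for district 1,
$s_2^2$ = unbiased estimator of the variance of sample data points for district 2,
$n_1$ = number of sample data points for district 1, and
$n_2$ = number of sample data points for district 2.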
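As an illustration of the unequal-variance comparison, the following sketch computes the t-statistic, the combined standard deviation, and the degrees of freedom from summary statistics, using the standard Welch formulation. The input values are hypothetical. The p-value uses a normal approximation to the t distribution, which is an assumption made here to keep the example free of third-party libraries; it is reasonable only for large samples.

```python
import math
from statistics import NormalDist


def welch_t_test(mean1, var1, n1, mean2, var2, n2):
    """Two-sample comparison of means with unequal variances.

    Returns (t, combined standard deviation, degrees of freedom,
    approximate two-tailed p-value).
    """
    v1, v2 = var1 / n1, var2 / n2
    std_err = math.sqrt(v1 + v2)
    t = (mean1 - mean2) / std_err
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    # Assumption: normal approximation instead of the exact t distribution.
    p = 2.0 * (1.0 - NormalDist().cdf(abs(t)))
    return t, std_err, df, p


# Hypothetical district summaries: mean score, variance, number of sections.
t, se, df, p = welch_t_test(80.0, 64.0, 100, 77.0, 100.0, 100)
print(f"t={t:.3f}, se={se:.3f}, df={df:.1f}, p~{p:.4f}")
```

For an exact test, the critical value would be read from the t distribution with the computed degrees of freedom rather than from the normal distribution.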

Pairwise comparisons of the mean PMIS, TxTAP, TxMAP, and TxCAP scores of the 25 districts were then carried out. The t-statistics, calculated using Equation 4, were compared with the critical t-statistics at a 95 percent level of confidence. The outputs of the t-tests are symmetric matrices, one for each year of analysis and each of the four performance scores. Each output is a matrix of “Yes” and “No” results. “Yes” indicates that the difference between the two respective districts is statistically significant, i.e., it can be concluded that the two districts have different levels of performance. Similarly, a “No” response indicates that, statistically, the two respective districts have the same performance level. In addition, another set of matrices was created that presents the level of confidence of the scores. In other words, it shows the probability that the scores are different. The second set of matrices allows engineers to conduct hypothesis tests at many different levels of confidence without having to repeat the calculations each time. Figure 2 shows the t-test results of the TxCAP scores for the year 2010. It should be noted that more information can be derived from these outputs by grouping the “No” responses, as shown in Figure 2. Because each “No” response indicates that the scores of the two respective districts are not statistically different, a group of “No” responses indicates that the scores of all districts within that group are not statistically different. In other words, no true difference exists between them and all members of the group are performing at the same level. Figure 3 presents the same results in a different format, i.e., as a level of confidence table. Instead of the outcome of a two-tailed test at 95 percent, Figure 3 shows the level of confidence itself, that is, the probability that the scores are different when conducting a two-tailed test. As an example, the cells corresponding to an 80 percent level of confidence have been highlighted in Figure 3.
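The two kinds of output matrices can be sketched as below. The district labels and summary statistics are hypothetical. The level of confidence is taken here as one minus the two-tailed p-value, which is an assumed reading of how the matrix entries are defined, and it again relies on a normal approximation suited to large samples.

```python
import math
from statistics import NormalDist

# Hypothetical summaries: (mean score, standard deviation, sample size).
districts = {
    "A": (80.0, 7.0, 150),
    "B": (79.5, 8.0, 160),
    "C": (75.0, 6.5, 155),
}


def confidence_of_difference(s1, s2):
    """Approximate level of confidence that two district means differ."""
    (m1, sd1, n1), (m2, sd2, n2) = s1, s2
    t = (m1 - m2) / math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    # Assumption: normal approximation; confidence = 1 - two-tailed p-value.
    return 2.0 * NormalDist().cdf(abs(t)) - 1.0


def pairwise_matrices(summary, level=0.95):
    """Return (confidence matrix, Yes/No matrix) as nested dicts."""
    names = list(summary)
    conf = {
        a: {
            b: 0.0 if a == b else confidence_of_difference(summary[a], summary[b])
            for b in names
        }
        for a in names
    }
    yes_no = {
        a: {b: "Yes" if conf[a][b] >= level else "No" for b in names}
        for a in names
    }
    return conf, yes_no


conf, yes_no = pairwise_matrices(districts)
for name, row in yes_no.items():
    print(name, row)
```

Keeping the confidence matrix separate from the Yes/No matrix means a different threshold (for example 0.80) can be applied by re-thresholding the stored values without recomputing the tests.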

*Cells indicating “No” responses are highlighted.

FIGURE 2 t-test results for TxCAP scores for 2010.


*The highlighted cells correspond to a level of confidence of 80%.

FIGURE 3 Level of confidence analysis output for TxCAP score for 2010.

Sample Size

The changes in any of the performance scores between different years or between districts could be due to a number of reasons, including insufficient sample size. A generally recognized statistical rule is that the accuracy of the estimated mean value of a population increases as the number of samples taken from the population increases. The accuracy of the estimate of variability, or standard deviation from the mean, also increases with the sample size. In other words, the greater the number of samples collected, the higher the confidence that the mean will be identified with sufficient accuracy and that the variability will be better defined (7).

Many statistical methods are available for determining the sample size, such as the Bootstrap method, the Assume Normal-Pool Variance method, the Noether method, and the Risk-based method. Valuable experience and results can also be found in and adopted from related research. For instance, Zhang (7) conducted thorough research on determining the appropriate minimum testing frequencies for the construction and maintenance of highway infrastructure using the risk-based method, which is one of the most popular and effective methods. The risk is determined by the probability of making a hypothesis testing error, i.e., a Type I or Type II error, and by the tolerable error (7, 8). A number of factors affect the determination of a suitable sample size. In Zhang’s research, four factors were considered to determine the optimal sample sizes or appropriate testing frequencies: the variability of the quality characteristic being measured, the risks that a state DOT or a contractor is willing to take, the tolerable errors each party is willing to accept, and the cost of the testing to be performed. A risk-based statistical approach was used with the aim of conducting tests to achieve a certain standard of quality. In this study, the first three factors have been considered, but the issue of cost has not been addressed. Since both Type I and Type II errors should be controlled when determining the required sample size, the minimum sample size (considering a two-tailed test) was calculated using the following formula (5, 6, 7):


$$ n = \left( \frac{\left( z_{\alpha/2} + z_{\beta} \right) \sigma}{e} \right)^{2} \qquad (7) $$

where:
$n$ = sample size,
$z_{\alpha/2}$ = the $(1 - \alpha/2)$ percentile of the standard normal distribution, $\alpha$ being the Type I error,
$z_{\beta}$ = the $(1 - \beta)$ percentile of the standard normal distribution, $\beta$ being the Type II error,
$\sigma$ = standard deviation, and
$e$ = tolerable error.

Equation 7 shows the different factors that affect the determination of the minimum sample size. The variance used in calculating the minimum sample size was obtained from the sample data available for the 3 years. The sample sizes were calculated for different combinations of α, β, and e. Different sample sizes have correspondingly different risk levels (7). The risk level combines the α risk, the β risk, and the risk associated with e. The tolerable error is defined as the maximum difference at which the decision-maker is still willing to conclude that the two scores being compared are the same. Different values of tolerance have been mentioned in the literature and are determined, in most cases, from experience or expert judgment. This study also attempted to determine a suitable estimate of the tolerable error from the data available. The mean of the differences was used to obtain a reasonable estimate of the tolerable error (e). A histogram of the differences of each performance score was plotted, and the mean of the absolute differences was obtained for TxTAP, TxMAP, and PMIS as 3.61, 2.92, and 3.61, respectively. Based on these mean values, suitable values of the tolerable error were selected to calculate the required sample size. The values selected for the tolerable error are 0.5, 1.0, and 2.0. Four different values (0.01, 0.05, 0.1, and 0.2) were selected for both α and β for each of the three tolerable errors. Table 3 shows the current data collection practice and the recommended number of samples to be collected in order to conduct tests at a statistical significance level of α = 0.05 and β = 0.05 for the three different tolerable errors. It should be noted that different districts have different sample sizes, as detailed in Figure 4, which shows the data collection practice for all 25 districts over the 3 years. Based on Equation 7 and the standard deviation values in Table 2, TxTAP is expected to require the largest number of samples and TxMAP the smallest to ensure the same estimation accuracy at the same risk level. This expectation is confirmed by the results in Table 3. Thus, the greater the variability of a subsystem's scores, the more data points are required to carry out tests at the same risk level. For the purpose of establishing a valid TxCAP, data for all three subsystems are required for all pavement sections under consideration (being sampled). It is therefore recommended that the number of data samples collected match the largest minimum sample size (of the three subsystems) for a chosen risk level. Although PMIS data are collected for highways statewide, insufficient data are collected for TxMAP and TxTAP (9).
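The sample size calculation can be sketched as follows, using the usual risk-based form in which the sum of the two normal percentiles is scaled by the ratio of the standard deviation to the tolerable error and then squared. The function name and the `two_sided` switch are hypothetical. With the combined standard deviations from Table 2, the Table 3 entries appear to be reproduced when the (1 − α) percentile is used for the first term, so that convention is exposed as an option rather than assumed.

```python
import math
from statistics import NormalDist


def min_sample_size(alpha, beta, sigma, e, two_sided=True):
    """Unrounded minimum sample size for given risks and tolerable error.

    alpha, beta : Type I and Type II error probabilities
    sigma       : standard deviation of the score
    e           : tolerable error
    two_sided   : use the (1 - alpha/2) percentile if True,
                  otherwise the (1 - alpha) percentile (assumed option)
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1.0 - alpha / 2.0) if two_sided else z(1.0 - alpha)
    z_beta = z(1.0 - beta)
    return ((z_alpha + z_beta) * sigma / e) ** 2


# Hypothetical input: sigma = 10 score points, tolerable error of 2 points.
n = min_sample_size(0.05, 0.05, sigma=10.0, e=2.0)
print(math.ceil(n))
```

Because the tolerable error enters as a squared divisor, halving it quadruples the required sample size, which is the pattern visible across the three tolerable-error columns of Table 3.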

Two methods were considered to improve the data collection without incurring a significant increase in costs. One is to take data from ½-mile segments instead of the current 1-mile segments, i.e., to divide each current sample into two samples. However, this method does not increase the statistical significance of the data because the data become non-random, violating the key assumption of the sampling process. The statistical significance can be increased only if all sections (1-mile or ½-mile) are randomly selected. Another option is to aggregate the data for 2 years to create a larger sample. This process will increase the sample size, but care must be taken to ensure that the time periods being compared do not overlap.

TABLE 3 Data Collection Practice for the Three Performance Scores

Current data collection practice (sample size)

Year    Min    Max    Avg.
2008     99    260    154
2009     99    258    160
2010    103    257    157

Sample size for α = 0.05, β = 0.05

Subsystem    e = 2    e = 1    e = 0.5
PMIS           262     1047      4190
TxTAP          323     1290      5161
TxMAP          163      652      2609

FIGURE 4 Data collection practice and minimum sample size for given values of α, β, and e.

[Figure: sample size (axis from 0 to 350) by district for 2008, 2009, and 2010, with the minimum sample sizes marked for TxMAP, PMIS, and TxTAP. Districts shown: PAR, FTW, WFS, AMA, LBB, ODA, SJT, ABL, WAC, TYL, LFK, HOU, YKM, AUS, SAT, CRP, BRY, DAL, ATL, BMT, PHR, LRD, BWD, ELP, CHS.]


CONCLUSION

The objective of this study was to evaluate whether the current TxCAP system is an effective and consistent means of measuring the roadway maintenance performance of TxDOT districts, and to develop recommendations based on the analysis results. To achieve this objective, the source data of each TxCAP subsystem were thoroughly analyzed, and the validity of the current TxCAP system as a combination of three subsystems was studied. A careful examination of the structure of the TxCAP scores made it clear that the scores cannot be used as an absolute measurement of roadway conditions, but can be used as a relative measurement of the maintenance performance of the 25 TxDOT districts. Furthermore, it is recommended that, even when the TxCAP scores are used as a relative measurement of the maintenance performance of the TxDOT districts, they be enhanced to reflect the local conditions of the districts more accurately by adjusting the weights used at the various levels of the TxCAP structure.

The statistical analyses were carried out in two steps on a dataset for all 25 districts spanning a period of years. In the first step, a two-sample comparison was conducted to test the null hypothesis that the mean TxCAP scores of two districts are equal. A two-tailed, heteroskedastic t-test with a 95 percent level of confidence was carried out on the sample data to test the hypothesis. Each district was treated as a separate population with a different size and variance. The results, presented as matrices, showed which pairs of district scores are statistically different and which are not. The results can be arranged to form groups or tiers of statistically similar districts. Within each group, no statistically significant difference exists between the districts, i.e., the maintenance performance of those districts is statistically the same. In addition, a matrix of the level of confidence for TxCAP was created. This matrix gives the probability that the scores of any two districts are different. The test was also conducted on the subsystem scores.
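A minimal sketch of the two-sample comparison described above is given below, assuming the heteroskedastic test corresponds to Welch's t-test. The district scores are synthetic values generated only for illustration, and the function names are hypothetical. The sketch computes the t statistic and the Welch-Satterthwaite degrees of freedom exactly, but approximates the two-tailed p-value with the standard normal distribution, which is reasonable only when the degrees of freedom are large (as they would be with on the order of 100 or more samples per district). A statistics package with a t-distribution would normally be used for the p-value instead.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Per-sample variance of the mean; the two variances are NOT pooled.
    v_a = variance(sample_a) / n_a
    v_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(v_a + v_b)
    df = (v_a + v_b) ** 2 / (v_a ** 2 / (n_a - 1) + v_b ** 2 / (n_b - 1))
    return t, df


def two_tailed_p_normal_approx(t):
    """Approximate two-tailed p-value; valid only for large degrees of freedom."""
    return 2 * (1 - NormalDist().cdf(abs(t)))


# Synthetic scores for two hypothetical districts with unequal size and variance.
rng = random.Random(0)
district_a = [rng.gauss(80.0, 9.0) for _ in range(150)]
district_b = [rng.gauss(76.0, 12.0) for _ in range(120)]

t_stat, df = welch_t(district_a, district_b)
p_value = two_tailed_p_normal_approx(t_stat)

# Reject the null hypothesis of equal means at the 95 percent confidence level?
reject_null = p_value < 0.05
print(f"t = {t_stat:.3f}, df = {df:.1f}, p = {p_value:.4f}, reject = {reject_null}")
```

Repeating this comparison for every pair of districts would fill a matrix of pairwise results of the kind described in the text.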

In the second part of the study, a risk-based statistical procedure was selected for determining the minimum sample size (number of survey sites) for each of the subsystems. The sample size determination depends on the variability, the Type I error, the Type II error, and the tolerable error. The analysis results show that the three subsystems have different variances, and therefore each subsystem requires a different minimum sample size for a specific risk level. However, to establish a valid and effective TxCAP system, all three subsystem scores are required for all sampled sections. This analysis, therefore, recommends that if TxDOT wants to achieve a higher and uniform risk level for TxCAP, the amount of sample data collected for TxTAP and TxMAP should be increased, as the PMIS data are already being collected for the entire state highway network. As an alternative, TxDOT can maintain the current level of data collection effort but compare the districts by groups or tiers, where the differences in scores for the districts within the same group or tier are statistically insignificant.


REFERENCES

1. Chait, E. P., Brinckerhoff, P., and Cambridge Systematics, Incorporated. GASB 34—Methods for Condition Assessment and Preservation. National Cooperative Highway Research Program (NCHRP) Report 608, Transportation Research Board, Washington, D.C., 2008.

2. Sunset Advisory Commission. Sunset Advisory Commission—Final Report, Texas Department of Transportation, 2009.

3. TxDOT. Project Summary: Peer State Review of TxDOT Maintenance Practices. ftp://ftp.dot.state.tx.us/pub/txdot-info/rti/psr/0-6664.pdf. Accessed July 5, 2011.

4. PBS&J. Central Texas Turnpike Project Annual Inspection Report. http://www.txdot.gov/about_us/commission/2010_meetings/documents/minute_orders/may27/7.pdf. Accessed July 6, 2011.

5. Devore, J. L. Probability and Statistics for Engineering and the Sciences. Thomson, MA, 2004.

6. Walpole, R. E., Myers, R. H., Myers, S. L., and Ye, K. Probability & Statistics for Engineers and Scientists. Prentice Hall, N.J., 2011.

7. Zhang, Z., Fowler, D. W., Hudson, W. R., and Ceballos, R. Development of a Methodology to Determine the Appropriate Minimum Testing Frequency for the Construction and Maintenance of Highway Infrastructure. Report number 0-1781-1, Center for Transportation Research, University of Texas at Austin, October 2001.

8. AASHTO. Implementation Manual for Quality Assurance. American Association of State Highway and Transportation Officials, Washington, D.C., 1996.

9. Zhang, Z., and Machemehl, R. B. Pavement Related Databases in TxDOT. Report number 0-4186-3, Center for Transportation Research, University of Texas at Austin, and Texas Transportation Institute, The Texas A&M University System, August 2004.
