
2011 IEEE/AIAA 30th Digital Avionics Systems Conference (DASC), Seattle, WA, USA, October 16-20, 2011


978-1-61284-798-6/11/$26.00 ©2011 IEEE 3D2-1

EVALUATING SURFACE TRAJECTORY-BASED OPERATIONS CONCEPTS THROUGH A HUMAN-IN-THE-LOOP SIMULATION

Emily Stelzer, Raymond M. Stanley, and Kathryn Klein Shepley, The MITRE Corporation, McLean, VA

Abstract

Surface Trajectory-Based Operations (STBO) is a concept and research area for improving the safety and efficiency of surface operations and is envisioned as part of the Federal Aviation Administration's (FAA's) Next Generation Air Transportation System (NextGen). STBO provides a concept of operating the surface in the mid- and far-term that includes automated decision support tools to assist air traffic controllers and managers in the tower with responsibilities such as planning for airport configuration changes, assigning runways and taxi routes, sequencing aircraft for departure, and monitoring pilot compliance with clearances.

The MITRE Corporation’s Center for Advanced Aviation System Development (CAASD) has led an effort to evaluate STBO concept elements through human-in-the-loop (HITL) simulations using prototype automation developed by Mosaic ATM. This paper describes the results of a third STBO HITL simulation that assessed how taxi routing and surface conformance monitoring decision support tools can enhance the safety and efficiency of surface operations. This simulation included consideration of a local and ground controller, implementation of a configuration change, and multiple traffic loads. Results showed that automation had positive impacts on controllers’ ability to detect pilot deviations, efficiently move surface traffic, and effectively communicate with pilots. These results helped take a step towards validating the STBO concepts by showing a clear benefit of the STBO decision support tools. The results also point to specific areas where improvements can be made to the design of these decision support tools.

Introduction

An air traffic controller operating in the control tower of an airport has the challenging task of managing aircraft with high efficiency while still maintaining safety. This complex and demanding job could benefit from the automation offered by modern computing. Specifically, automation could be used to support controller tasks related to managing the pattern of movements on the airport surface and monitoring aircraft's conformance with clearance instructions. Providing automation assistance to controllers has the potential to allow the controller to direct traffic with greater safety and throughput [1].

Recognizing the need for and availability of automation tools, the Federal Aviation Administration (FAA) has incorporated the adoption of automation into its initiative for modernizing air transportation, called the Next Generation Air Transportation System (NextGen). Specifically, the Surface Trajectory-Based Operations (STBO) concept within the Terminal Flight Data Manager (TFDM) NextGen plan describes automation for surface operations implemented during the mid-term timeframe (Audenaerd, Burr, & Morgan, 2010; Morgan, 2010). Automation within the STBO concept is conceived as a suite of decision support tools within the TFDM system that can assist controllers with the following tasks: 1) departure runway assignment, 2) taxi routing and conformance monitoring, 3) airport configuration, and 4) scheduling and sequencing (Audenaerd et al., 2010). The STBO decision support tools are designed to complement (not replace) surface surveillance systems already in use today, such as AMASS and ASDE-X.

Such a change in how surface air traffic is managed requires thorough evaluation prior to implementation. An important stage is empirical evaluation within a laboratory setting, where extraneous variables can be controlled, specific situations examined, and general principles of operation tested free of any safety consequences. This paper describes empirical evaluations of STBO decision support tool concepts that have been conducted in a simulation environment at The MITRE Corporation's Center for Advanced Aviation System Development (CAASD), a Federally Funded Research and Development Center sponsored by the FAA.


The MITRE Corporation has conducted three human-in-the-loop simulations to evaluate STBO decision support tools over the past two years. In addition to assessing the performance benefits of the decision support tools, these evaluations consider the controllers' use and acceptance of the tools. It is important to note that the primary goal of the simulations was to assess the operational feasibility and benefits of the STBO concepts and decision support tools, rather than the prototype automation algorithms and interface usability. The data presented in this paper focus on the most recently completed STBO HITL.

The first two simulations in the research program focused on evaluating the taxi route assignment and taxi route conformance monitoring automation for ground controllers [2,3]. Both simulations had pilots deviating from their taxi routes, and the second simulation also included aircraft that failed to hold short of taxiway intersections and runway crossings. The results from these simulations showed a clear objective and subjective benefit of the decision support tools. Controllers' detection of pilot deviations nearly doubled with the use of the decision support tools in both simulations. Furthermore, the second simulation showed that automation also decreased the response time to pilot deviations [3]. Controllers likewise reported a decrease in workload in association with the taxi routing and conformance monitoring capabilities.

The third simulation, described in this paper, was designed to build upon and expand the knowledge gained in the first two simulations (see [4] for a full technical report). This simulation tested decision support tools for departure taxi routing and surface conformance monitoring. Unlike the previous simulations, it included analysis of the local controller's use of the decision support tools, as well as two levels of alerts that communicated severity. In addition, this simulation implemented a configuration change that impacted traffic in the airport movement area. The present simulation sought to address the following three research questions:

• Do the decision support tools enhance surface operations?

• Do the decision support tools reduce controller workload and enhance the controller's ability to handle nonconformance events?

• Do the decision support capabilities enhance operations during scheduled configuration changes?

This simulation involved current air traffic controllers managing traffic in a mid-fidelity simulator of an air traffic control tower environment with electronic flight strips. Data were collected from controllers operating in 16 scenarios spread across two days. These scenarios differed in the position of the controller (ground or local), the presence of automation, the presence of a planned configuration change, and traffic load. Measures collected during the simulation included objective throughput and safety markers, as well as subjective workload and user acceptance.

Methods

Participants

Seventeen current FAA controllers participated in this simulation. On average, the participants reported 19.1 years of experience as air traffic controllers. The participants were currently staffed at FAA facilities ranging from level 9 to 12 (M = facility level 10.6). Two participants were staffed at Dallas-Fort Worth International Airport (DFW), the airport represented in the simulation environment.

Decision Support Tools

The decision support tools described in the FAA's STBO concept were implemented within a prototype system developed by Mosaic ATM called the Surface Decision Support System (SDSS). SDSS serves as a prototype decision support tool for testing air traffic surface operation concepts under the STBO umbrella. SDSS provided automation for departure runway assignment, updates associated with a planned airport configuration change, taxi route generation, and surface conformance monitoring. Runway assignment (including updates in response to a configuration change) was provided by SDSS regardless of the automation condition. For the half of the scenarios in the automation condition, SDSS generated taxi routes and monitored conformance. For the other half of the scenarios, without automation, controllers had to manually generate the taxi routes and monitor aircraft conformance. SDSS also provided two interfaces, which were available to the controller regardless of the automation condition: an electronic flight strip display and a surface surveillance map. These interfaces are described in further detail later.

SDSS performed departure runway assignment (regardless of the automation condition) by following a set of rules based on the departure fixes in the aircraft's flight plan. SDSS also aided a configuration change (regardless of the automation condition) by re-assigning aircraft bound for a set of departure fixes to a new departure runway; this was done to reduce departure demand on the other runway. In the automation condition only, SDSS generated taxi routing recommendations, providing one of five standard taxi routes according to rules based on the aircraft's departure fix. In the condition without automation, controllers were expected to apply these same rules when assigning taxi routes. A companion to the taxi route generation function in the automation condition was a function that monitored aircraft's conformance with these taxi instructions.
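The fix-based assignment rules described above amount to a lookup from departure fix to runway and standard route. A minimal sketch follows; the fix names, runways, and routes here are hypothetical illustrations, not DFW's or SDSS's actual rule set.

```python
# Illustrative rule-based departure runway and taxi route assignment,
# keyed on the flight plan's departure fix. All fix/route values below
# are invented for illustration.

RUNWAY_BY_FIX = {
    "FIXA": "17R", "FIXB": "17R",
    "FIXC": "13L", "FIXD": "13L",  # e.g., fixes moved to 13L by a config change
}

ROUTE_BY_FIX = {
    "FIXA": ["A", "K", "17R"],
    "FIXB": ["A", "L", "17R"],
    "FIXC": ["B", "Q", "13L"],
    "FIXD": ["B", "R", "13L"],
}

def assign_departure(fix: str) -> tuple[str, list[str]]:
    """Return (runway, standard taxi route) for a departure fix."""
    return RUNWAY_BY_FIX[fix], ROUTE_BY_FIX[fix]

runway, route = assign_departure("FIXA")
```

A configuration change can then be modeled as an update to `RUNWAY_BY_FIX` for the affected fixes, with routes re-derived accordingly.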

Conformance monitoring was based on decomposing the taxiways and runways into sets of small polygons. In general, an aircraft was considered out of conformance when the polygon it currently occupied was not part of its clearance route. Controllers were alerted to an out-of-conformance aircraft, and each alert was categorized as either safety-critical or non-safety-critical. Non-safety-critical alerts were signaled by turning the flight strip of the offending aircraft yellow. Safety-critical alerts were signaled by turning the corresponding flight strip red, turning the aircraft icon on the surface surveillance map red, and delivering an auditory alarm. Non-safety-critical alerts included an aircraft entering a taxiway polygon that was not part of its route, or entering a taxiway polygon beyond the hold short point without clearance. One safety-critical alert was entering a taxiway polygon not included in the route that was adjacent to a runway edge. In addition to determining whether the aircraft was in a polygon within its clearance route, aircraft states within each polygon were assessed to detect other errors. A failure to take off (a non-safety-critical error) was flagged when an aircraft was in a runway polygon (within its assigned route) and did not move for 30 seconds. A takeoff roll without clearance (a safety-critical error) was flagged when an aircraft exceeded 30 knots within a runway polygon without being cleared for takeoff.
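The polygon-based conformance logic described above can be sketched as a small classifier over an aircraft's current polygon and state. This is a simplified illustration of the rules the text describes, not SDSS's actual implementation; polygon IDs and argument names are invented.

```python
# Sketch of polygon-based conformance alerting: off-route polygons raise
# alerts (severity depends on runway adjacency), and aircraft state within
# an on-route runway polygon is checked for takeoff errors.

SAFETY_CRITICAL = "safety-critical"
NON_SAFETY_CRITICAL = "non-safety-critical"

def classify_alert(polygon, route_polygons, runway_adjacent, *,
                   is_runway=False, stationary_s=0.0, speed_kt=0.0,
                   cleared_for_takeoff=False):
    """Return an alert level for one aircraft state, or None if conforming."""
    if polygon not in route_polygons:
        # Off-route: entering a polygon adjacent to a runway edge is
        # safety-critical; other off-route polygons are not.
        return SAFETY_CRITICAL if polygon in runway_adjacent else NON_SAFETY_CRITICAL
    if is_runway:
        if speed_kt > 30 and not cleared_for_takeoff:
            return SAFETY_CRITICAL        # takeoff roll without clearance
        if stationary_s >= 30:
            return NON_SAFETY_CRITICAL    # failure to take off
    return None                           # in conformance
```

In the simulation, a non-safety-critical result would turn the flight strip yellow, while a safety-critical result would turn the strip and map icon red and trigger the auditory alarm.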

Two displays worked in conjunction with SDSS's taxi route generation and alerting automation: an electronic flight strip display and a surface surveillance map. Electronic flight strips are part of the NextGen and TFDM concepts, replacing paper flight strips and physical bays in the tower. Users operate electronic flight strips by viewing strips on a computer display, organizing them into different "bays" depicted on the display, and passing them to other controllers through electronic transfer. The electronic flight strip display was a key element of the surface conformance alerting because it allowed SDSS to access data regarding where aircraft were supposed to be on the airport surface.

The electronic flight strip display was presented on a 22-inch touch screen monitor. The ground controller's display (see Figure 1) organized the flight strips into a set of vertical bays analogous to paper strip bays (e.g., arrivals, departure runways, waiting for taxi instructions, transfers, etc.). The local controller's display was similar (see Figure 2), organizing flight strips into a set of bays and subsections appropriate for the local controller's task, based on the airport configuration. There was a bay for each runway, and subsections for each status classification (e.g., line up and wait (LUAW), cleared for takeoff (CFTO), cleared to land (CTL), clear of the landing runway (COLR), etc.).

Image Source: MITRE Lab (Mosaic ATM)

Figure 1. Ground Controller Electronic Flight Strip Display


Image Source: MITRE Lab (Mosaic ATM)

Figure 2. Electronic Flight Strip Display for Local Controller

Each flight strip (see Figure 3) displayed, in its top row, the aircraft identification (ACID), beacon code, and estimated arrival or departure time. When automation was on, a recommended taxi route was automatically populated adjacent to the time (based on the aforementioned departure fix assignment rules) and could be edited at any time. When automation was off, no suggested taxi route was displayed on the flight strip and the taxi route had to be assigned manually by a controller. With automation off, controllers had the option to enter the taxi route into the electronic flight strip, but most elected to communicate the taxi route only over the radio system. Taxi routes could be assigned (or re-assigned) and displayed in the flight strip with a single button click, and edited using a small keyboard.

Image Source: MITRE Lab (Mosaic ATM)

Figure 3. Electronic Flight Strip

A surface surveillance map (see Figure 4) was presented adjacent to the electronic flight strip display on a 19-inch touch screen monitor. This display indicated the real-time location of all aircraft on the airport surface with a small triangle icon that pointed in the direction of the aircraft's heading. Data tags with critical flight information were also present. All departures were presented in green and all arrivals in white.

Image Source: MITRE Lab (Mosaic ATM)

Figure 4. Surface Surveillance Map

Simulation Environment

The simulation was conducted in the Integration Demonstration and Experimentation for Aeronautics (IDEA) laboratory within The MITRE Corporation's Center for Advanced Aviation System Development (CAASD). The simulation used a mid-fidelity representation of the tower environment offered by the IDEA lab's air traffic control tower (ATCT) simulator (see Figure 5). The ATCT simulator delivered an "out-the-window" tower environment with a series of large displays that filled 200 degrees of the visual field. Aircraft moved on the surface according to the controllers' instructions through the actions of two confederate simulation pilots. Aircraft not critical to the traffic situation moved on their own automated routes.


Image Source: MITRE

Figure 5. ATCT Simulator

The ATCT simulator was used to present a simulated view of DFW. This site was chosen because of the configuration of its runways, the number and complexity of taxiways, and the location and arrangement of ramp areas. In current operations, control of DFW is split between west and east ATCTs. The current simulation was designed to examine the control of traffic from the east tower with a southward flow.

The ATCT simulation was staffed by two participants, one in the ground controller position and one in the local controller position. The ground controller position was responsible for providing taxi instructions to departing aircraft pilots located in the ramp area for movement to the runway queue, where they were handed off to the local controller. Ground controllers were also responsible for giving taxi instructions to arriving aircraft pilots (passed from the local controller) for movement to their assigned gates. The local controller position was responsible for arrivals and departures on the runways. Their duty was to provide instructions for the departing aircraft pilots (passed from the ground controller) for movement onto the runway, as well as delivering clearance for takeoff, while maintaining a minimum separation of three nautical miles. Local controllers were also responsible for giving arrival aircraft clearance to land as well as taxi instructions to move across the active runways, where they were then passed to the ground controller.

Simulation Scenarios

Participants were presented with three practice scenarios and 16 test scenarios. Each scenario lasted 30 minutes, beginning at DFW at sunrise with clear visibility. The traffic flow commenced at a level representative of airport opening and increased throughout the scenario. An additional overall traffic load manipulation is described subsequently.

The following variables were manipulated across the sixteen scenarios: traffic load, automation, the occurrence of a configuration change, and controller position. Traffic load was manipulated because of its potential impact on the effectiveness of automation. In half of the scenarios, 45 departures and 43 arrivals occurred during each scenario; in the other half, there was a higher level of traffic, with 50 departures and 44 arrivals. These values were chosen based on analysis of ASDE-X data from a typical day at DFW, but were scaled back slightly because of participants' unfamiliarity with the airport environment and displays. In half of the scenarios, automation was on, supporting airport taxi routing and surface conformance monitoring; in the other half, the taxi routing and conformance monitoring automation was off, and the controller completed those duties manually. There was a planned configuration change in half of the scenarios, occurring 15 minutes after the start of the trial and resulting in some traffic being reassigned to runway 13L to reduce departure demand on runway 17R. A given participant served as a ground controller in half of the scenarios and as a local controller in the other half.

The four variables were fully crossed to create 16 experimental conditions, and each participant was presented with one experimental scenario in each condition. Trials were blocked by controller position, and the order of experimental conditions within those blocks was randomly selected from a fully counterbalanced design.
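The fully crossed 2×2×2×2 design can be enumerated mechanically: four two-level factors yield 16 conditions, which are then blocked by controller position. The factor labels below come from the text; the blocking and shuffling details are an illustrative sketch, not the study's actual counterbalancing procedure.

```python
# Enumerate the fully crossed within-subjects design and produce a
# position-blocked trial order. The exact counterbalancing used in the
# study is not reproduced here; this shows only the crossing and blocking.
from itertools import product
import random

FACTORS = {
    "position":      ["ground", "local"],
    "automation":    ["on", "off"],
    "config_change": ["none", "planned"],
    "traffic":       ["lower", "higher"],
}

# Cross every level of every factor: 2 * 2 * 2 * 2 = 16 conditions.
conditions = [dict(zip(FACTORS, combo)) for combo in product(*FACTORS.values())]

def blocked_order(conditions, seed=0):
    """Block trials by controller position, randomizing within each block."""
    rng = random.Random(seed)
    blocks = [[c for c in conditions if c["position"] == p]
              for p in FACTORS["position"]]
    for block in blocks:
        rng.shuffle(block)
    return [c for block in blocks for c in block]

order = blocked_order(conditions)
```

Each participant would then run one 30-minute scenario per condition, eight per day across the two test days.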

Simulation Procedure

The simulation took three full-day sessions for each participant to complete. The first day consisted of training in preparation for the test scenarios. First, participants completed informed consent procedures. Then the participants received three hours of classroom training on the surface automation displays, DFW procedures, and their assigned tasks and responsibilities. Next, a hands-on training session lasting about five hours was conducted at the tower simulator. This session comprised reviewing the automation displays and out-the-window view, and then having participants control traffic in six half-hour practice scenarios. Half of these practice scenarios were conducted as a ground controller and half as a local controller. The first four of the six practice scenarios were conducted with automation on, and the last two with automation off.

The test scenarios were administered on the second and third days of the simulation, each containing eight scenarios. The simulator recorded a large volume of objective data during the scenarios that included aircraft location and the conformance state of each aircraft, as well as the timing of pilot deviation detections and radio call button presses.

A subjective measure was also collected during the scenarios: participants judged their perceived workload using a real-time assessment technique called the Air Traffic Workload Input Technique (ATWIT). Every five minutes, ATWIT aurally prompted participants to enter a workload rating on a seven-point Likert scale using a workload assessment keypad (Stein, 1985).
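The ATWIT sampling scheme described above reduces to a fixed prompt schedule and a bounded rating scale. A minimal sketch, with timing and aggregation details that are illustrative rather than taken from the ATWIT specification:

```python
# Sketch of ATWIT-style workload sampling: an auditory prompt every five
# minutes of a 30-minute scenario, each answered on a 1-7 scale.

PROMPT_INTERVAL_S = 5 * 60
SCENARIO_LENGTH_S = 30 * 60

def prompt_times():
    """Seconds into the scenario at which workload prompts occur."""
    return list(range(PROMPT_INTERVAL_S, SCENARIO_LENGTH_S + 1, PROMPT_INTERVAL_S))

def mean_workload(ratings):
    """Average a scenario's ATWIT ratings, enforcing the 1-7 scale."""
    assert all(1 <= r <= 7 for r in ratings), "ATWIT uses a 1-7 scale"
    return sum(ratings) / len(ratings)
```

Per-prompt timestamps also allow ratings to be split into before and after a mid-scenario event, as the configuration-change analysis later in the paper does.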

At the end of each test scenario, participants completed electronic questionnaires assessing subjective measures of their experience, including mental workload (NASA Task Load Index), situation awareness (SART; Taylor, 1990), trust in the surface automation, and feedback on the surface automation. After all test scenarios were complete, participants were debriefed to reinforce the purpose and goals of the experiment and to gather any additional feedback about the surface automation tools and simulation environment.

Simulation Results

Nonconformance Event Detection

Improving airport safety is an important goal of the STBO decision support tools. One measure of safety improvement in this simulation was controllers' ability to detect nonconformance events quickly and accurately. Nonconformance event detection accuracy was computed by dividing the number of correctly detected nonconformance events by the number of nonconformance events in the scenario. Detection accuracy was analyzed with a four-way repeated-measures analysis of variance (ANOVA). The four variables tested were automation (yes or no), controller position (ground or local), airport configuration change (none or planned), and traffic load (lower or higher). The ANOVA showed a statistically significant interaction between automation and controller position (F(1,16) = 24.54, p < 0.001). This interaction can be seen in Figure 6: automation produced a statistically significant benefit for ground controllers (t(16) = 7.49, p < 0.001), but not for local controllers (p > 0.05). The lack of effect for local controllers is likely because performance was already so good without automation that adding automation could not produce an improvement, regardless of its potential effectiveness; performance both with and without automation was above 90 percent.

Figure 6. Effect of Automation and Controller Position on Nonconformance Detection Accuracy
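The detection-accuracy measure defined above is simply hits divided by the number of scripted nonconformance events per scenario. A minimal sketch (event identifiers are hypothetical):

```python
# Per-scenario nonconformance detection accuracy: the fraction of scripted
# nonconformance events that the controller correctly detected.

def detection_accuracy(events, detections):
    """events: scripted nonconformance events; detections: events the
    controller flagged. Returns the hit rate in [0, 1]."""
    hits = sum(1 for e in events if e in detections)
    return hits / len(events)

acc = detection_accuracy(["e1", "e2", "e3", "e4"], {"e1", "e3", "e4"})
```

These per-scenario proportions are the values that entered the repeated-measures ANOVA, one per participant per condition.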

The average time to correctly respond to a nonconformance event was also calculated and analyzed with a four-way ANOVA, again with automation, controller position, airport configuration change, and traffic load as the independent variables. Automation reduced the time it took for controllers to detect events by 2.84 s, a reduction that the ANOVA revealed to be statistically significant (F(1,12) = 4.58, p = 0.04). Furthermore, this effect of automation did not depend on any of the other variables (automation did not interact with any other variable).

A separate analysis was conducted to determine the effect of safety criticality on nonconformance detection accuracy, and how this effect depended on controller position and automation (see Figure 7). A three-way ANOVA (position, automation, and safety criticality) revealed a statistically significant three-way interaction (F(1,16) = 21.16, p < 0.001). Specifically, automation improved detection accuracy of nonconformance events, but only for non-safety-critical events and only for the ground controller (an automation by safety-criticality interaction for ground controllers only: F(1,16) = 68.87, p < 0.001). The lack of an automation effect for safety-critical events was likely because performance for those events was already above 90 percent without automation.

Figure 7. Effect of Automation and Safety Criticality on Nonconformance Detection Accuracy for Ground Controller Position

Surface Operations Throughput

Efficiency is another key goal of implementing the STBO decision support tools. Efficiency was measured in this simulation by examining airport throughput, defined as the average number of departures and arrivals for the pair of controllers over the course of the scenario. Throughput was compared across levels of automation, traffic, and configuration change using a three-way ANOVA. In general, automation produced a statistically significant improvement in throughput (automation main effect: F(1,8) = 216.08, p < 0.001).

The effect of automation, however, depended on whether a configuration change was present in the scenario (automation by configuration change interaction: F(1,8) = 182.71, p < 0.001). In particular, automation provided a benefit only when there was no configuration change (t(8) = 4.91, p = 0.001), that is, when the controllers were forced to manage departure traffic on a single runway resource (see Figure 8).

Figure 8. Effect of Automation and Configuration Change on Airport Throughput

Communication Timing

The count and duration of radio calls were considered as an additional metric of surface efficiency. A four-way ANOVA was conducted on both radio call measures, examining the effects of controller position, automation, traffic load, and the presence of a configuration change. The ANOVA on radio call count showed that automation had a statistically significant impact (automation main effect: F(1,16) = 4.34, p = 0.05): controllers made fewer communications in the automation condition (M = 100.6) than in the non-automation condition (M = 102.3). The ANOVA on average radio call duration showed no effect of automation (F(1,15) = 0.85, p > 0.05).

Workload Ratings

In addition to providing efficiency benefits, it is useful to consider how the decision support tools implemented in STBO affect controller workload. Workload was measured during each scenario by the ATWIT and at the end of each scenario by the NASA TLX.

A four-way ANOVA was conducted on the NASA TLX data, investigating the effects of controller position, automation, traffic load, and the presence of a configuration change. This ANOVA revealed that the effect of automation depended on the combination of configuration change and traffic load (automation by traffic by configuration change interaction: F(1,11) = 3.81, p > 0.05). This three-way interaction showed that automation decreased workload, but only when the traffic load was lower and there was no configuration change in the scenario (configuration change by automation interaction for the lower traffic load only: F(1,11) = 12.74, p = 0.004). Given the interaction of automation with configuration change in several cases, it is worth considering the effect of a configuration change on workload. The ANOVA showed a significant interaction between controller position and configuration change (F(1,11) = 13.42, p = 0.004), such that a configuration change increased workload for a ground controller (t(11) = 5.13, p < 0.001) but did not affect workload for a local controller.

An analogous four-way ANOVA was conducted on the ATWIT ratings, with controller position, automation, traffic load, and the presence of a configuration change as factors. This analysis showed no statistically significant effect of automation. The grand mean ATWIT rating was 2.82 out of 7, suggesting a consistently low workload overall.

The ATWIT ratings collected within a trial provided a unique opportunity to consider how workload changed within a scenario in response to a configuration change. To inspect this effect, a four-way repeated-measures ANOVA was conducted on ATWIT ratings for only those scenarios where a configuration change was present. This ANOVA encompassed four factors: whether the rating occurred before or after the configuration change, controller position, automation, and traffic load. The analysis showed an effect of automation that depended on whether the rating occurred before or after the configuration change (F(1,16) = 10.43, p = 0.005). In particular, workload ratings increased in response to a configuration change when automation was present (t16 = -5.71, p < 0.001) but did not increase when automation was absent (t16 = -1.49, p > 0.05; see Figure 9). The ATWIT results also showed that the configuration change increased workload more for the ground controller position than for the local controller position (F(1,16) = 2.76, p = 0.05).

Figure 9. Effect of Automation and Configuration Change on ATWIT Workload Rating

Situation Awareness Ratings

Situation awareness was assessed by SART ratings gathered at the end of each test scenario. The effect of automation and other variables was considered by submitting the SART ratings to a four-way ANOVA, with automation, position, traffic load, and configuration change as factors. This analysis showed an effect of automation that depended on controller position (F(1,11) = 4.85, p = 0.05): automation reduced ground controllers' situation awareness as measured by the SART (t11 = 2.17, p = 0.05), but did not affect local controllers' situation awareness (p > 0.05). Figure 10 depicts this relationship.

Figure 10. Effect of Automation and Controller Position on Situation Awareness Rating


Controller Acceptance and Trust

At the end of each session, participants were asked to give feedback on the design and timing of the alerts. One aspect of the alert design that was queried was the appropriateness of the criticality level. In general, most controllers (greater than 90 percent) thought that the levels chosen were appropriate for the taxi route deviation yellow alert and for all red alerts. There was less consensus on the failure to takeoff yellow alert: 24 percent of participants indicated that it should have been a red alert. The participants were divided on the appropriateness of a yellow alert for a failure to hold short at a taxiway intersection: 47 percent indicated that this alert should have been a red-level alert.

In general, controllers thought that the alerts' onset was too late, especially for safety-critical events. Specifically, 47 percent of participants reported that non-safety-critical alerts should have activated sooner, and 74 percent reported that alerts for safety-critical events should have activated sooner.

Participants were also questioned about their trust in the automation (only in scenarios when automation was present). For the taxi route generation decision support tool, participants reported a high level of trust (7.8 out of 10). Participants also reported a high level of trust in the conformance monitoring decision support tool (8.2 out of 10). ANOVAs run on these data showed that trust did not depend on position, traffic, or the presence of a configuration change. Interestingly, controllers rated their confidence in the joint performance of the controller and automation higher than their confidence in either component alone (self-confidence in the detection task or trust in the automation), for both taxi route generation and surface conformance monitoring.

Discussion and Conclusions

This evaluation was conducted to test the effectiveness of a specific set of decision support tools in aiding the controller with his or her need to balance safety with throughput. This simulation showed the benefit of decision support tools conceptualized in the FAA's STBO concept, as indicated by improved safety and airport throughput. Furthermore, this simulation showed that the controllers generally trusted and accepted the decision support tools. This simulation also pointed to areas where the decision support tools could be further refined and suggested ways that future simulations could better examine the use and benefit of decision support tools.

Safety

Some of the clearest benefits of automation were seen in the safety measures. Automation produced quicker detection of nonconformance events, and this effect did not depend on any other variable manipulated in the simulation. This shows the broad-sweeping benefit that automation provided across many situations. The reduction in detection time could provide valuable seconds for a controller to take corrective action. The detection time improvement occurred even though feedback on alert timing indicated that the onset of many alerts was too late. This suggests that, despite the potential for refinement, the effectiveness of the decision support tool is relatively robust to imperfections in timing.

Automation also improved nonconformance event detection for ground controllers. Local controllers were already performing at such a high level that automation could not make a measurable difference. This does not mean that the decision support tools tested here could not help local controllers: in situations where local controllers are more challenged, the tools are more likely to help.

Efficiency

In addition to workload reduction and safety improvements, another critical goal of the STBO decision support tools is to improve the controller's efficiency, in order to help manage demand on surface resources. This simulation showed that automation provided a throughput benefit when there was not a configuration change, but not when there was one. The lack of an automation effect in scenarios with a configuration change could be explained by the fact that the configuration change itself improved airport throughput once it was complete: local controllers could use two runways instead of one to move aircraft off the surface. In this less demanding situation, the assistance provided by the decision support tools may not have been needed as much.


Aside from airport throughput, communication timing provided a more granular (but still objective) look at airport efficiency. This metric showed another broad-sweeping benefit of automation: a reduction in radio calls that occurred across all the other manipulations present in the simulation. This finding again suggests that automation assisted the controller in some way in all situations.

Workload

Managing controller workload is likely to facilitate the safest possible traffic environment, as well as increase the efficiency of the system. In general, workload was low but not at the minimum possible values. The current simulation showed a reduction in workload as a result of automation, but only in specific situations. One measure of workload gathered at the end of each scenario, the NASA TLX, showed that automation decreased workload only when traffic load was lower and there was not a configuration change in the scenario. Assuming that a configuration change increases workload (it did for the ground controllers in this simulation), this suggests that automation reduced workload only when workload was already low. Perhaps controllers had limited cognitive resources available in these challenging situations that could not easily be allocated to using these novel systems. In addition, more intense situations may force controllers to fall back on their extensive experience working without automation. Another possibility is that the more demanding situations reduced the resources available to monitor the subjective impression of workload. This is consistent with the finding that there were still safety and efficiency benefits of automation in these high-demand situations, as discussed above.

Another measure of workload, the ATWIT gathered within the scenarios, also showed that the effect of automation depended on the configuration change. The onset of a configuration change increased workload when automation was present, but not when automation was absent. Perhaps, immediately after a dynamic event like a configuration change, the controllers were still unsure about how the automation was handling the change; more experience with the automation could remedy this concern. The ATWIT findings also showed a generally low workload, with a grand mean rating of 2.82 out of 7.

An important construct related to workload is situation awareness. This simulation showed that for local controllers, there was no change in situation awareness as a result of automation. For ground controllers, however, automation actually decreased situation awareness. Informal conversations with controllers after the simulation revealed that the ground controllers' interaction with the electronic flight strips required considerable "head-down" time that prevented them from looking out the window. Increased familiarity and improved design should reduce this effect of the electronic flight strips. It is also possible that use of the decision support tools allowed the controllers to be less aware of the situation because they trusted the automation to alert them to any problems. The high level of reported trust in the automation is consistent with this explanation.

Controller Acceptance of Decision Support Tools

In addition to assessing the impact of automation, this simulation provided an important opportunity to gain feedback on the design of the decision support tools. There are many properties of interface design; the queries for this simulation focused on high-level conceptual choices, such as alert level and timing, rather than lower-level implementation details such as button placement. For the failure to take off and failure to hold short at taxiways alerts, many controllers felt that the yellow alert should have been a red alert, even though a majority were still content with the yellow assignment. This demonstrates that controllers differ in their preferences for alert level assignment. One solution is to allow customizable assignments for certain alerts. Alternatively, adding parameters to the alert criteria could cause alerts to occur in situations where there is greater controller consensus on alert level.

In addition to concerns about the alert level, controllers also noted their preference for reducing the latency between the beginning of an event and the onset of the corresponding alert. The latency could be improved through algorithm refinement, but at some point (e.g., predicting events before they occur), algorithm design will become a significant engineering challenge. Furthermore, predictive alerting is more likely to produce false alerts, which could decrease trust in the automation. Careful consideration must therefore be given to finding the alerting threshold that reduces onset latency while preserving alert performance. Despite the preference for quicker onset, it is important to note that nonconformance events were still detected sooner with these alerts. Thus, there may be a point of diminishing returns in improving onset latency beyond a certain boundary.
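The tradeoff described above can be illustrated with a toy threshold sweep: lowering the alerting threshold fires earlier and catches more deviations, but raises the false-alert rate on nominal traffic. The deviation scores, thresholds, and function below are invented for illustration and do not reflect the prototype's actual algorithm.

```python
# Hypothetical per-aircraft "deviation scores": higher values mean the
# surveillance track looks less conformant. All values are invented.
deviating = [0.81, 0.74, 0.92, 0.67, 0.88]  # true nonconformance events
nominal = [0.12, 0.35, 0.48, 0.22, 0.62, 0.41, 0.18, 0.30]  # conforming traffic

def alert_rates(threshold):
    """Fraction of true events alerted on, and fraction of nominal
    aircraft falsely alerted on, at a given score threshold."""
    hit_rate = sum(s >= threshold for s in deviating) / len(deviating)
    false_alert_rate = sum(s >= threshold for s in nominal) / len(nominal)
    return hit_rate, false_alert_rate

# Lower thresholds fire earlier/more often: the hit rate rises, but so
# does the false-alert rate -- the tradeoff discussed in the text.
for thr in (0.8, 0.6, 0.4):
    hits, fa = alert_rates(thr)
    print(f"threshold {thr}: hit rate {hits:.2f}, false-alert rate {fa:.2f}")
```

In practice, the operating threshold would be tuned against recorded surface surveillance data rather than invented scores.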

Despite the desire for different alert levels and quicker onset of alarms, the controllers still showed a high degree of trust in the automation. Furthermore, they had more confidence in the human-automation system as a whole than in either component alone. This is promising given the limited experience that the controllers had with the technology; with more extensive training and experience, trust would likely increase further.

Conclusions and Future Work

There were far-reaching effects of automation shown in this simulation: regardless of several situational variables, automation reduced nonconformance detection time and the number of radio calls that controllers had to make to pilots. The effect of automation was more subtle for other measures, where it depended on controller position (nonconformance detection) or whether there was a configuration change (workload and throughput). The contrast between the broad-sweeping and isolated benefits of automation may be related to the specific measures linked to each effect. Detection time and radio call counts are relatively low-level effects that may simply not have been large enough to propagate all the way through the human-machine system to affect final output measures such as nonconformance detection rate and the number of aircraft moved through the airport. Refinement of the decision support tools, based on information gathered in this simulation, could create a larger benefit whose impact carries through to those output measures. Furthermore, when scaled up to many controllers at many airports, even the modest benefits observed here could compound into substantial system-wide gains.

Feedback on the alerts indicates that implementations of these decision support tools should allow user-defined alert levels or additional alert criteria parameters. The feedback also indicates that alert onset latency should be reduced. In addition to these refinements, there are some remaining issues that could be addressed by subsequent simulations in order to prepare for implementation of the STBO decision support tools. The data from the present simulation indicate that future simulations should consider the safety benefit of decision support tools for local controllers under conditions where surface conformance monitoring is more challenging. In addition, the present simulation showed that automation had limited benefit during a configuration change, at least for the throughput and workload measures. The basis for this finding is unclear, because a configuration change can decrease demands in some ways (for example, allowing greater throughput for local controllers) but increase demands in others (requiring a ground controller to re-route aircraft to new runways).

All but two of the controllers who participated in this study were unfamiliar with the airport where the simulation was set. Future simulations may benefit from recruiting more controllers who are already familiar with the simulated environment, so that airport familiarity cannot confound the assessment of automation benefits. Relatedly, simulations should test airports other than Dallas/Fort Worth International Airport (DFW) to ensure that conclusions drawn from simulations at DFW generalize to other airports.

The simulation environment offered a high degree of control over the data presented to the controller as well as over aircraft movements on the surface. Implementation of a suite of decision support tools at an actual airport will inevitably introduce discrepancies between ground truth and the data presented to the controller. Before such a situation is encountered operationally, introducing deliberate error into a simulation would provide insight into the potential pitfalls and further inform the design of the algorithms.

Finally, as the STBO decision support tools move towards implementation, testing will have to move out of the simulation environment and into field testing at an actual airport. This will be necessary to catch any intervening factors that may not have been manipulated in the simulations. Extension to field testing will also provide a more rigorous test of how the decision support tools will be accepted and used by controllers in an operational environment.

References

[1] Diffenderfer, P. A., & Morgan, C. M. (2010). Mid-Term Surface Trajectory-Based Operations (STBO) Concept of Use: Surface Conformance Monitoring, Draft, Version 2.0, MTR100269V7. The MITRE Corporation: McLean, VA.

[2] Stelzer, E. K. (2010). Human-in-the-Loop Concept Evaluation of Surface Conformance Monitoring Automation, MTR100188. The MITRE Corporation: McLean, VA.

[3] McGarry, K. A., & Kerns, K. (2010). Results of a Second (Controller) Human-in-the-Loop Simulation Study of Automated Capabilities Supporting Surface Trajectory-Based Operations, MTR100483. The MITRE Corporation: McLean, VA.

[4] Stelzer, E. K., & Stanley, R. M. (2011). Examination of Air Traffic Controller Use of Surface Trajectory-Based Operations Decision Support Tools: Third Human-in-the-Loop Evaluation, MTR110344. The MITRE Corporation: McLean, VA.

Acknowledgements

The authors would like to thank Suzanne Porter and Wesley Link for their leadership and guidance in conducting this research. The authors recognize the contributions of Mosaic ATM in designing and developing the surface automation prototype. The authors also recognize the contributions of Paul Diffenderfer for providing subject matter expertise during the design and execution of the simulation, Tom Niedermaier for providing training for all of the controller participants, Alain Oswald for his technical expertise and oversight of the tower simulator, and Juliana Goh and Tony Masalonis, who assisted with data collection. Finally, the authors recognize and appreciate the valuable feedback of the controllers who participated in the simulation.

Disclaimer

The contents of this material reflect the views of the author and/or the Director of the Center for Advanced Aviation System Development (CAASD), and do not necessarily reflect the views of the Federal Aviation Administration (FAA) or the Department of Transportation (DOT). Neither the FAA nor the DOT makes any warranty or guarantee, or promise, expressed or implied, concerning the content or accuracy of the views expressed herein.

Email Addresses

Author email addresses are as follows: Raymond Stanley, [email protected]; Emily Stelzer, [email protected]; Kathryn Klein Shepley, [email protected].

30th Digital Avionics Systems Conference (DASC), Seattle, WA, October 16-20, 2011