
Combining electrodermal responses and cardiovascular measures for probing adaptive automation during simulated flight


Contents lists available at ScienceDirect

Applied Ergonomics 40 (2009) 1026–1040

journal homepage: www.elsevier.com/locate/apergo

Andrea Haarmann, Wolfram Boucsein*, Florian Schaefer
Physiological Psychology, University of Wuppertal, Max-Horkheimer-Str. 20, D-42119 Wuppertal, Germany

Article info

Article history:
Received 28 March 2008
Accepted 30 April 2009

Keywords:
Adaptive automation
Skin conductance
Heart rate variability

The rights of all subjects were protected and informed consent was obtained.
* Corresponding author. Tel.: +49 202 4392338; fax: +49 202 4392926.
E-mail address: [email protected] (W. Boucsein).

0003-6870/$ – see front matter © 2009 Elsevier Ltd. All rights reserved.
doi:10.1016/j.apergo.2009.04.011

Abstract

Adaptive automation increases the operator’s workload in case of hypovigilance and takes over more responsibility if workload becomes too high. Two consecutive studies were conducted to construct a biocybernetic adaptive system for a professional flight simulator, based on autonomic measures. Workload was varied through different stages of turbulences. In a first study with 18 participants, electrodermal responses of experimental subjects oscillated very close to the individual set point, demonstrating that workload level was adjusted as a result of adaptive control, which was not the case in yoked control subjects without adaptive automation. Combining electrodermal responses with heart rate variability in a second study with 48 participants further enhanced the adaptive power which was seen in even smaller set point deviations for the experimental compared to the yoked control group. We conclude that the level of arousal can be adjusted to avoid hypovigilance by combining autonomic measures in a closed loop.

© 2009 Elsevier Ltd. All rights reserved.

1. Introduction

Vigilance decrement is a common phenomenon during the operation of highly automated systems as used in long-haul flights with commercial airplanes (Wright and McGown, 2001). On a traditional flight deck, a certain level of the pilot’s attention is a prerequisite for successfully operating a complex airplane. With increasing automation of flight management systems, the pilot’s role will be reduced to one of a mere observer during long flight periods, which can unknowingly exert detrimental effects on the pilot’s vigilance. In case of an unexpected change in situational demands or a system failure requiring immediate actions, the pilot may not be able to respond in a timely and adequate manner, since she/he may have lost situational or mode awareness (Endsley, 1996). Adaptive automation might be used as a precautionary measure to prevent pilots from dangerous vigilance decrements while operating a highly automated cockpit.

The rationale of adaptive automation is keeping the operator in an optimal state of vigilance and avoiding excessive workload by adjusting the system’s mode or level of automation dynamically (Byrne and Parasuraman, 1996; Scerbo et al., 2000). Beyond traditional forms of automation, the degree of which is changed by the operator, a system with adaptive power initiates such changes at its own discretion. In case of detecting an operator’s hypovigilance, the system may switch to a lower level of automation and assign more responsibilities to the operator to increase her/his vigilance. In turn, if the operator’s workload becomes too high, the system may automatically take over more responsibility for the task in question by increasing its level of automation. To obtain continuous information on the operator’s vigilance state, the system will need some adequate real-time input from her/him. It is critical for such an adaptive system that changes in operator vigilance and workload can be continuously recorded. Since both hypovigilance and work overload normally result in performance decrement, performance changes would be the natural measure of choice here. However, since most parts of the task are taken over by the system with increasing degrees of automation, availability of continuous measures of operator performance from the primary task is limited (Morrison and Gluckman, 1994; Scerbo et al., 2001). Introducing secondary tasks is not a proper solution, since they unnecessarily increase the operator’s workload and may cause motivational problems. Subjective reports of vigilance decrement are obtrusive and may interfere with the task, and their retrospective nature may prevent them from reflecting the operator’s actual state properly (Hart and Wickens, 1990). Therefore, psychophysiological measures which are both unobtrusively and continuously available in real time came into focus in adaptive automation (Morrison and Gluckman, 1994; Pope et al., 1995). Vigilance decrement is accompanied by a decline in psychophysiological arousal which can


be monitored by various measures of central and autonomic nervous system activity reflecting functional state (Boucsein and Backs, 2000; Offenloch and Zahner, 1990; Streitberg et al., 1987). Psychophysiological measures to be used in adaptive automation are required to be continuously recorded and on-line evaluated. They are supposed not to interfere with the task or impair the operator’s well-being. A variety of psychophysiological measures have already been used for adaptive automation or in related fields, such as personalized computer game controlling. They comprise measures derived from spontaneous electroencephalographic activity (EEG), electrocardiography (ECG), blood pressure and electrodermal activity (EDA). First attempts to establish systems with adaptive automation related to piloting made use of EEG-derived indices, in particular several studies conducted in the NASA Langley Research Center (e.g., Pope et al., 1995). The task applied in these studies for constructing a biocybernetic closed-loop system was the Multi-Attribute Task Battery (MATB; Comstock and Arnegard, 1992). This personal computer based suite of tasks draws, to some extent, on skills that are needed by pilots, such as compensatory tracking, system monitoring, communication and resource management. While performing the MATB, the operator’s EEG data were recorded and on-line evaluated for acquiring relative power in the alpha, beta and theta frequency bands. A “task engagement index” formed by the combined EEG power indices beta/(alpha + theta) was found to best represent the operator’s state of vigilance needed for adaptive automation. In case of low task engagement indication, the system allocated more tasks to the operator. If the task engagement index increased, the system reallocated subtasks to the MATB computer to relieve the operator (see also Prinzel et al., 2001, 2003). In clinical settings, Schwilden et al. (1989) probed a closed loop for the adaptive dosage of anesthetics based on EEG frequency bands (see also Bibian et al., 2003). In a similar way, electromyographic activity (EMG) was implemented by Simanski et al. (2003) for the adaptive titration of muscle relaxants for neuromuscular blockade during anesthesia. Another similar approach used EEG engagement indices for treatment of attention deficit/hyperactivity disorder (ADHD; Cox et al., 1998; Lubar, 1991; Lubar et al., 1995).
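The task engagement index described above can be illustrated with a minimal Python sketch. It only restates the ratio beta/(alpha + theta); the function names and the comparison value `reference` are hypothetical and are not taken from the cited NASA Langley studies.

```python
def task_engagement_index(alpha: float, beta: float, theta: float) -> float:
    """Return beta / (alpha + theta) from relative EEG band powers."""
    slow_power = alpha + theta
    if slow_power <= 0:
        raise ValueError("alpha + theta must be positive")
    return beta / slow_power


def allocate_subtasks(index: float, reference: float) -> str:
    """Hypothetical allocation rule: low engagement gives tasks to the operator.

    'reference' is an assumed comparison value; the text does not state
    how the cited studies defined low versus increased engagement.
    """
    return "operator" if index < reference else "computer"


if __name__ == "__main__":
    idx = task_engagement_index(alpha=0.30, beta=0.40, theta=0.20)
    print(round(idx, 2), allocate_subtasks(idx, reference=1.0))
```

The sketch keeps the index computation separate from the allocation decision, since only the former is specified in the text.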

In the cardiovascular realm, heart rate variability (HRV) was first discussed as a candidate psychophysiological measure for adaptive automation by Byrne and Parasuraman (1996). Prinzel et al. (2003) developed a biocybernetic closed loop based on individually calculated HRV thresholds for task allocation. If the upper threshold was exceeded, indicating an increase in HRV, subtasks of the MATB were allocated to the operator for manual control. If the lower threshold was undercut, indicating a decrease in HRV, subtasks were automated for relieving the operator. As another cardiovascular measure, blood pressure has been used by De Waard and Mulder (2006) for introducing an interface for ambulance dispatching (activating emergency rides, scheduling ordered transports and preserving coverage of the area the dispatch worker is in charge of) which can be considered as a fine example for reality-related interfaces.
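The two-threshold HRV rule summarized above may be sketched as follows. This is an illustration only: the function name, the mode labels, and the choice to keep the current mode while HRV stays between the thresholds are assumptions that the text does not specify.

```python
def hrv_task_allocation(hrv: float, lower: float, upper: float, current: str) -> str:
    """Two-threshold allocation rule as summarized in the text.

    Above the upper threshold, subtasks go to the operator ("manual").
    Below the lower threshold, subtasks are automated ("automated").
    Between the thresholds the current mode is kept (an assumption).
    """
    if lower > upper:
        raise ValueError("lower threshold must not exceed upper threshold")
    if hrv > upper:
        return "manual"
    if hrv < lower:
        return "automated"
    return current
```

For example, with hypothetical thresholds of 20 and 40, a reading of 45 would return "manual" and a reading of 15 would return "automated".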

Not much use has been made so far of EDA recording in adaptive automation itself. Several Japanese studies probed EDA as a feedback measure to avoid drowsiness and maintain a certain level of arousal (Isshiki and Yamamoto, 1994; Nishimura and Nagumo, 1985; Yamamoto and Isshiki, 1992). Additionally, EDA was applied in a flight simulator context with so-called “mind-operated devices” at Wright Patterson Air Force Base (Junker et al., 1995). More recently, the usability of real-time autonomic measures has been discussed for controlling the interaction between a player and a computer game on a cognitive/motivational/emotional level (Fairclough, 2007). In this context, EDA has been probed as well for adaptive purposes in personalizing computer games (Kuikkaniemi and Kosunen, 2007; Laufer and Nemeth, 2007; Metz and Hoffman, 1997; Parente and Parente, 2006).

To date, the majority of studies probing adaptive automation related to aviation psychology have been performed with the use of EEG-derived measures. The above mentioned “task engagement index”, based on EEG power band changes during continuous performance of the MATB, turned out to be most suitable for both up- and down-regulation of an adaptive closed-loop system (e.g., Scerbo et al., 2000; Bailey et al., 2004; Prinzel et al., 2007). However, the suitability of the MATB for simulating flight tasks is limited, since it only partly resembles real piloting, although some of its elements contain task elements from the cockpit. Therefore, we decided to use a professional flight simulator in our own studies.

It is also debatable whether the use of EEG would be the best choice for adaptive automation in the cockpit. Although there are studies with pilots from which EEG recordings are taken during flight (Wright and McGown, 2001), using EEG-derived measures during long-haul flight operations may not be suitable for several reasons. First, electrical cockpit noise can be a considerable source of EEG artifacts. Second, EEG electrodes are too conspicuous to be worn without commercial pilots feeling uneasy. Therefore, our current research aimed at providing an adaptive system based on electrodermal and cardiovascular measures.

Electrodermal measures have the advantage of not only being valid arousal indicators but also pure measures of the sympathetic branch of the autonomic nervous system (ANS) (Boucsein, 1992; Critchley, 2002; Wallin, 1981). Additionally, Wilson (2002) reported a significant correlation of r = 0.83 between skin conductance and heart rate (HR) during 22 flight segments. Within EDA measures, the frequency of electrodermal responses (EDR freq.) turned out to differentiate between flight segments, while the sum of EDR amplitudes showed only minimal changes.

In cardiovascular measures, it is not so straightforward to determine the influence of the different ANS branches, since sympathetic and parasympathetic activities are confounded (Levy, 1997; Obrist, 1981). Parasympathetic responses usually occur during passive coping and under resting conditions, while sympathetic responses can be seen during active coping and higher workload conditions (Backs, 1998; Berntson et al., 1997; Lenneman and Backs, 2000). Mulder et al. (2003) described two patterns of phasic (sympathetic) and tonic (parasympathetic) changes in cardiovascular activity. Backs (1998) addressed the confoundedness of these two branches by introducing a method based on principal component analysis (PCA) for estimating sympathetic and parasympathetic contributions to cardiovascular activity, thus improving the diagnosticity of heart rate for mental workload during mental arithmetic tasks and tracking tasks.

In a pilot study performed in our laboratory (Boucsein et al., 2005), we recorded EDA and HR from student subjects during four flight missions in the professional instrument flight rule (IFR) simulator used in the present studies, varying the strength of turbulence to check the usability of ANS measures for a prospective closed loop to be applied in adaptive automation. Increasing strength of turbulence resulted in an increment of EDR freq. which can be interpreted as an indicator of increased workload (Boucsein and Backs, 2000). In turn, progression of flight missions was associated with habituation shown by a decreased EDR freq. and reduced sum of EDR amplitudes.

Based on the results of Boucsein et al. (2005), the aim of the present study 1 was to construct and probe a closed-loop adaptive system in a preliminary study, implementing EDR freq. as an adequate indicator of general arousal and as a control variable for adjusting the strength of turbulence onset during a flight task in a yoked control design (cf. Fairclough et al., 2005; Mikulka et al., 1999, and Prinzel et al., 2003). As our flight simulator did not provide different levels of automation, we decided to vary the strength of turbulence, simulating autopilot mode by removing turbulence. The less turbulence was exerted, the higher was the level of “automation”. We chose the subjects’ deviations from the individually predefined set point (comparison of normative and actual value) as a criterion of adaptive quality. For each participant and each psychophysiological measure, a set point assessment was attained by calculating arithmetic means of psychophysiological baseline recordings.

In study 2, we refined and extended the closed-loop adaptive system (1) by the use of additional psychophysiological measures, namely HR and HRV, and (2) by a finer grading of turbulence intensity for adaptive automation. Additionally, we obtained subjective workload and arousal data before and after physiological data acquisition. For both studies, we expected that experimental subjects would show smaller deviations from the set point in autonomic arousal measures compared to yoked control subjects. We also expected fewer turbulence switches under combined measures compared to a single measure as an indicator of a more efficient adaptive system.

2. Study 1

2.1. Methods of study 1

2.1.1. Participants

Eighteen right-handed students (11 female, 7 male) aged 20–34 years (M = 26.39 years, SD = 4.5 years) participated in the study as part of a psychology course requirement. All subjects had normal or corrected to normal visual abilities and had neither cardiovascular nor neurological disorders. Five more subjects had to be excluded from the sample due to technical irregularities or electrodermal nonresponding. Informed consent was obtained from all participants. Pairs were matched according to the yoked control design described in the following section.

2.1.2. Task and design

After the explanation of the flight simulator’s basic instruments and functions (standard-T displays: indicators of airspeed, attitude, altitude and direction), each subject underwent an extensive flight training, followed by an additional practice of four basic IFR flight missions: (1) Takeoff to a northerly direction and climb to 2000 ft. (2) After being automatically lifted to 10,000 ft, the task consisted of flying northbound, controlling for altitude, speed (120 kts) and direction. (3) Thereafter, subjects had to perform a 90-degree turn to an easterly direction and to keep the new course, controlling for altitude and speed. (4) While continuing to fly straight and level eastward, subjects faced turbulences (steps 0 and 2 out of six possible turbulence intensities). Varying turbulence intensities influenced the frequency as well as the amount of mainly pitch but also yaw and roll control. The higher the turbulence the more difficult it was to keep altitude, course and speed. The training session took about 1 h.

For the experimental part, after a new start up and takeoff, subjects were moved up to 10,000 ft and to a direction of 90 degrees via the external keyboard of the flight simulator computer (“move mode”). After having flown straight and level for 2 min without turbulences, the participant was informed about receiving mild turbulences and flew under turbulence step 2 until performing the task within the usual limits for IFR pilots (5 degrees horizontal and 50 ft vertical deviation). Afterwards, the subjects continued flying straight and level eastbound, and two baseline recordings were performed (60 s without turbulences as resting period, 60 s with turbulence step 2). Based on the arithmetic mean of the two baseline recordings, the control computer calculated each subject’s individual set point for EDR freq. as her/his mean level of arousal, followed by thirty 60-s flight sections.
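The set point calculation can be illustrated with a minimal Python sketch, assuming that EDR freq. is available as a count per 60-s baseline recording. The function name and the example counts are hypothetical.

```python
from statistics import mean


def edr_set_point(rest_count: int, turbulence_count: int) -> float:
    """Set point as the arithmetic mean of the two 60-s baseline EDR counts.

    rest_count: EDR freq. during the baseline without turbulences.
    turbulence_count: EDR freq. during the baseline with turbulence step 2.
    """
    return mean([rest_count, turbulence_count])


if __name__ == "__main__":
    # Hypothetical baseline counts, not data from the study.
    print(edr_set_point(8, 9))
```

Because the set point is the mean of two whole-number counts, it can only take values in steps of 0.5.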

The participants were assigned to one of two conditions according to the yoked control design: (1) In the experimental (“adaptive”) condition, subjects flew thirty flight sections continuously, keeping altitude and course while facing turbulences. EDR freq. was calculated every 60 s and used to trigger the turbulence strength for the subsequent 60 s, dependent on the set point of the individual subject as follows: turbulences were turned on by the control computer in case of low arousal (EDR freq. ≤ set point) and turned off in case of high arousal (EDR freq. > set point), based on the update of the electrodermal data that was performed every 60 s, without the subjects’ knowledge, as they flew continuously. (2) The other participants belonged to the yoked control condition, in which each yoked control subject received the same sequence of turbulences during thirty 60-s flight sections as the corresponding experimental subject, regardless of her/his own EDR freq. and set point and hence without adaptive automation. Thus, the yoked control subject may suffer states of underarousal as well as overarousal. The participants were blind with respect to the experimental conditions. For avoiding artifacts due to hand movements, participants rested their hands in molds beside the yoke.
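The switching rule and the yoked control assignment described above can be sketched in Python. The sketch assumes one EDR count per 60-s section and a simple on/off turbulence state; all function names are hypothetical, and the actual control software is not described at this level of detail.

```python
from typing import List, Sequence


def turbulence_on(edr_freq: float, set_point: float) -> bool:
    """Turbulence for the next 60 s: on at low arousal, off at high arousal."""
    return edr_freq <= set_point


def adaptive_sequence(edr_freqs: Sequence[float], set_point: float) -> List[bool]:
    """Turbulence setting triggered by each section's EDR freq.

    Element i is the setting applied to the section that follows section i.
    """
    return [turbulence_on(freq, set_point) for freq in edr_freqs]


def yoked_sequence(partner_sequence: Sequence[bool]) -> List[bool]:
    """A yoked control subject receives the partner's sequence unchanged,
    regardless of her/his own EDR freq. and set point."""
    return list(partner_sequence)


if __name__ == "__main__":
    # Hypothetical counts for four sections with a set point of 9.
    experimental = adaptive_sequence([8, 9, 10, 12], set_point=9)
    print(experimental, yoked_sequence(experimental))
```

The rule is a two-state controller: it compares only the latest 60-s count with the set point and has no hysteresis, so the turbulence state can change after every section.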

2.1.3. Apparatus

The study was conducted in an air conditioned and soundproof chamber (ca. 9 m², 23 °C, 50% relative humidity). A professional IFR flight simulator software of a four-seated aircraft which runs on a personal computer (LAS 5.0, made by Fahsig, Germany) was extended by the manufacturer to allow for varying the strength of turbulence via a serial port. The cockpit instruments were displayed on a 17-inch monitor placed 0.5 m in front of the subject. The LAS simulator provided controls for ailerons, elevator and throttle, together with an electrical trim. A second computer controlled adaptive automation: (1) It calculated the individual set point (reference-input variable), (2) triggered the automatic onset and offset of turbulences for experimental manipulation on the LAS computer according to the subjects’ individual set point, (3) started and ceased the recording of physiological data on a third computer, and (4) received the on-line calculated EDR freq. from the recording computer for adaptive regulation of the subjects’ arousal. Fig. 1 gives an overview of the information flow in the closed-loop adaptive system between the subject and the various instruments.

2.1.4. Recording and data analysis

Physiological data recording (EDA, ECG and respiration) was performed on a Nihon Kohden Neurofax EEG-8310 G polygraph which was connected to a personal computer with a customized software package (PSYCHOLAB, © Jorn Grabke, 1997).

EDA was recorded thenar and hypothenar with constant voltage (0.5 V) from the nondominant hand with two Ag/AgCl electrodes (0.8 cm diameter), filled with isotonic electrode paste (MedSafe TD-246, Med Associates, Inc.) according to the recommendations of Fowles et al. (1981), with a sampling rate of 20 Hz, a sensitivity of 0.001 µS, and a 0.3 Hz low pass filter. Frequency and sum of EDR amplitudes were used as tonic EDA measures, calculated on-line for 60-s epochs by customized software (EDR_PARA and EDR_SLCT, © Florian Schaefer, 2003) on the data recording computer. EDR_PARA identified and parameterized EDRs for the defined epoch, and EDR_SLCT read the parameterized EDR data, selecting all responses that emerged within the defined time slot, and that exceeded the defined amplitude criterion of 0.01 µS. For all selected EDRs, the sum of amplitudes was calculated (in case of overlapping EDRs, an extrapolated recovery line was applied; cf. Boucsein, 1992, p. 134).
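The selection step can be illustrated with a short Python sketch. It assumes that the parameterized EDRs are available as (onset time in s, amplitude) pairs, with amplitudes in the recording's conductance unit. It is not the EDR_SLCT program, and it does not model the extrapolated recovery line used for overlapping responses.

```python
from typing import Iterable, Tuple


def edr_epoch_measures(
    responses: Iterable[Tuple[float, float]],
    epoch_start: float,
    epoch_length: float = 60.0,
    min_amplitude: float = 0.01,
) -> Tuple[int, float]:
    """Return (EDR freq., sum of EDR amplitudes) for one epoch.

    A response is selected if its onset lies within the epoch and its
    amplitude exceeds the amplitude criterion.
    """
    epoch_end = epoch_start + epoch_length
    selected = [
        amplitude
        for onset, amplitude in responses
        if epoch_start <= onset < epoch_end and amplitude > min_amplitude
    ]
    return len(selected), sum(selected)


if __name__ == "__main__":
    # Hypothetical responses: (onset in s, amplitude).
    edrs = [(5.0, 0.20), (20.0, 0.005), (59.0, 0.05), (61.0, 0.30)]
    print(edr_epoch_measures(edrs, epoch_start=0.0))
```

In this hypothetical example the second response fails the amplitude criterion and the fourth falls outside the epoch, so two responses are counted.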


Fig. 1. Closed-loop adaptive system based on physiological measures.


ECG was recorded from the Einthoven II-lead (above the right wrist vs. above the left ankle, ground electrode on the left forearm) by means of two Ag/AgCl electrodes, filled with Hellige conductive electrode cream, at a sampling rate of 200 Hz. The ECG signal was analyzed by customized software (EKG_IBI, IBI_SCAN, IBI_PARA, © Florian Schaefer, 2003). EKG_IBI detected interbeat intervals (IBIs) during each 60-s epoch, which were corrected for outliers by IBI_SCAN. Based on the corrected data, IBI_PARA calculated mean heart rate (HR in beats per min; cf. Hugdahl, 1995) and heart rate variability (HRV) as root mean square of successive differences (RMSSD in beats per min; cf. Van Dellen et al., 1985). Additionally, a respiration belt containing a piezo element was fastened to the subject’s thorax (sampling rate of 10 Hz).
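A minimal Python sketch of the two cardiovascular parameters follows. Since RMSSD is reported in beats per min, the sketch assumes that successive differences are taken on beat-to-beat heart rate rather than on the IBIs themselves; this is an assumption, as is the use of 60000 divided by the mean IBI for mean HR. It is not the IBI_PARA program, and outlier correction is omitted.

```python
from math import sqrt
from statistics import mean
from typing import Sequence


def mean_heart_rate(ibis_ms: Sequence[float]) -> float:
    """Mean HR in beats per min from IBIs in ms (assumed: 60000 / mean IBI)."""
    return 60000.0 / mean(ibis_ms)


def rmssd_bpm(ibis_ms: Sequence[float]) -> float:
    """Root mean square of successive differences of beat-to-beat HR.

    Each IBI is first converted to an instantaneous rate in beats per min,
    so that the result carries the unit given in the text (an assumption).
    """
    rates = [60000.0 / ibi for ibi in ibis_ms]
    diffs = [later - earlier for earlier, later in zip(rates, rates[1:])]
    if not diffs:
        raise ValueError("at least two IBIs are required")
    return sqrt(mean([d * d for d in diffs]))


if __name__ == "__main__":
    # Hypothetical IBIs in ms.
    ibis = [1000.0, 750.0, 1000.0]
    print(mean_heart_rate(ibis), rmssd_bpm(ibis))
```

If RMSSD were instead computed directly on the IBIs, as is common elsewhere, the conversion step would be dropped and the result would be in ms.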

2.1.5. Statistical analysis

For each pair of experimental and yoked control subjects, the differences between nominal (set point) and actual EDR freq. during the thirty 60-s flight sections were calculated as absolute deviation values and compared with paired t-tests. The same procedure was applied for the sum of EDR amplitudes. Using difference values gives the advantage that interindividual differences with respect to baseline values can be alleviated. Furthermore, the correlations between EDR freq. (or the sum of EDR amplitudes, respectively) throughout the thirty flight sections and the resulting onset/offset of turbulences were calculated for each pair of subjects, using Spearman’s rho (ρ) coefficient. For group data, repeated measurement ANOVAs were conducted with 30 flight sections as within-subjects factor and the experimental conditions (adaptive vs. yoked control) as between-subjects factor, using Greenhouse–Geisser adjusted p values. For the psychophysiological data not used for the adaptive algorithm (HR, HRV and respiration rate), difference values were calculated according to the same procedure applied for EDR freq., but are not presented here in detail as EDR was central to the algorithm applied. The probability of error was set to α = 0.05 (two-tailed).

2.2. Results of study 1

Paired t-tests partly revealed significant and highly significant mean differences for EDR freq. between experimental and yoked control subjects (see Table 1a).

This held for three yoked pairs with lower EDR freq. set point deviations for experimental subjects compared to yoked control subjects (pairs 108–208, 111–211, and 113–213). As can be inferred from Fig. 2, EDR freq. of the experimental subject 108 oscillated very close to the individual set point of 9, indicating an optimal arousal level as a result of adaptive control. In case of the yoked control subject 208, EDR freq. diverged from the set point of 8.5 as a consequence of forcing the sequence of turbulences of his experimental counterpart upon him without adaptive control.

For pair 114–214, the yoked control subject 214 displayed smaller EDR freq. set point deviations than the experimental subject. For seven yoked pairs, the sum of EDR amplitudes yielded highly significant mean differences (see Table 1a), although that parameter was not integrated in the closed-loop adaptive system. In four of these pairs, experimental subjects displayed smaller set point deviations in sum of EDR amplitudes than yoked control subjects (pairs 104–204, 106–206, 108–208, and 114–214). Three pairs showed the opposite pattern, with higher set point deviations in the sum of EDR amplitudes for experimental subjects compared to yoked control subjects (pairs 109–209, 110–210, and 111–211). These results were supplemented by highly significant negative correlations between EDR freq. and EDR-dependent turbulences for eight out of nine experimental subjects (see Table 1b), i.e., the strength of turbulence was reduced in case of high EDR freq. and vice versa as a result of the algorithm applied. With respect to the sum of EDR amplitudes, only two experimental subjects (104 and 114) showed highly significant negative correlations. Yoked control subjects did not show any significant correlations between EDR freq. and/or the sum of EDR amplitudes and EDR-dependent turbulences.

On the group level, a significant effect of flight section was seen for EDR freq. difference values (F(29, 464) = 2.24, p = 0.023), with peak differences in flight segments 1, 14, 22, 24, and 27 (Fig. 3). The interaction between flight section and experimental conditions did not reach significance. However, on a descriptive level, more yoked control subjects showed higher set point deviation values than experimental subjects in the above mentioned flight segments.

There was no overall significant effect of experimental conditions.

For the sum of EDR amplitudes difference values, no overall effects of flight sections, experimental conditions or interaction between both factors reached significance. The same held for HR, HRV and respiration rate difference values.

2.3. Discussion of study 1

Study 1 probed the viability of adjusting autonomic arousal in a closed-loop adaptive system during a professional flight simulation task. In the experimental group, adaptive adjustment by means of EDR freq. was performed according to the subjects’ individual set point taken from two baseline recordings. In the yoked control group, subjects received the sequence of turbulences of their experimental counterparts without adaptive control, i.e., regardless of their own actual EDR freq. and their individual set point. We expected that experimental subjects would show smaller set point deviation values in EDR freq. compared to yoked control subjects.

Results based on individual data indicated that at least part ofthe experimental subjects were closer to their individual set pointof arousal compared to the yoked control subjects as a consequenceof adaptive control. Moreover, correlations between EDR freq. andEDR-dependent turbulences were partly significant for experi-mental subjects, indicating that skin conductance variation wasusable to trigger changes in turbulence setting. Contrarily, the sumof EDR amplitudes yielded significant negative correlations only fortwo experimental subjects. The sum of EDR amplitudes was notinvolved in the algorithm, therefore it does not invalidate ourhypothesis that this parameter did not yield significant correlationswith EDR-dependent turbulences. In addition, it can be inferredfrom the literature that correlations between EDR freq. and the sumof EDR amplitudes are rather low (Boucsein, 1992). Moreover, both


Table 1
a) Mean deviation values (actual value minus set point value) and t-values for paired comparisons over 30 flight sections (1 min each) for electrodermal measures (study 1). b) Correlations between EDA measures and resulting turbulences (Spearman’s rho, ρ).

Subject  EDR freq.                             EDR sum of amplitudes (µS)             Correlations
         Set point  Deviation M (SD)  t(29)    Set point  Deviation M (SD)  t(29)     ρ EDR freq.  ρ EDR (µS)
104      6          2.10 (1.77)       1.87     1.61       1.28 (0.41)       17.81**   −0.74**      −0.52**
204      8.5        2.70 (1.47)                12.9       9.34 (2.69)                 −0.01        −0.07
106      14.5       3.03 (1.81)       1.06     2.7        0.96 (0.66)       3.22**    a            a
206      6.5        2.80 (2.35)                1.85       2.43 (2.97)                 a            a
108      9          2.20 (1.88)       2.39*    7.64       3.23 (1.99)       3.46**    −0.88**      0.26
208      8.5        3.53 (3.17)                2.97       7.59 (7.56)                 −0.23        0.27
109      7.5        1.97 (3.15)       0.78     13.77      8.01 (5.77)       5.94**    −0.87**      0.27
209      10         1.87 (2.00)                3.94       1.93 (0.60)                 −0.19        −0.29
110      11         2.53 (2.27)       1.25     4.98       3.92 (3.15)       6.85**    −0.86**      −0.04
210      5.5        2.17 (1.49)                0.07       0.12 (0.11)                 0.09         0.07
111      14.5       3.57 (1.89)       4.24**   37.68      21.45 (6.08)      9.02**    −0.44*       0.20
211      5          5.43 (2.10)                13.92      9.93 (3.51)                 −0.24        −0.08
112      11         2.77 (2.10)       1.90     22.63      5.72 (4.17)       1.18      −0.87**      −0.03
212      7          2.07 (1.55)                11.92      6.33 (3.28)                 −0.15        0.23
113      11         2.13 (1.53)       3.33**   32.45      8.47 (4.80)       1.09      −0.60**      −0.10
213      7.5        3.50 (1.58)                29.47      7.74 (5.87)                 0.20         −0.18
114      8.5        2.47 (1.65)       2.60*    7.81       6.58 (0.37)       6.83**    −0.86**      −0.53**
214      8.5        1.60 (1.13)                35.7       15.15 (7.07)                0.16         0.21

Note: Subjects 1xx are experimental subjects, subjects 2xx their yoked controls; t-values refer to the paired comparison within each yoked pair. a was not calculated as the strength of turbulence did not change. *p < 0.05, **p < 0.01.

A. Haarmann et al. / Applied Ergonomics 40 (2009) 1026–1040

measures may address different aspects of arousal: EDR freq. is more related to general arousal, while the sum of EDR amplitudes indicates negatively tuned emotional arousal (Boucsein and Backs, 2009). It is rather unlikely that negative emotional arousal was induced in our experiment. This might additionally explain the lower number of correlations for the sum of EDR amplitudes compared to EDR freq. For yoked control subjects, none of the correlations reached significance, presumably due to the lack of coherence between physiological measures and turbulences caused by the absence of adaptive adjustment.

Fig. 2. Comparison of EDR freq. for an experimental subject (108) and a yoked control subject (208) for 30 flight sections.

There were no significant differences in EDR freq. set point deviations between experimental and yoked control subjects on a group level. Differences could be detected on a descriptive level only for certain flight sections, with experimental subjects showing smaller EDR freq. set point deviations than their yoked counterparts (see Fig. 3). One possible reason for the lack of significant group results could have been that the sequence of turbulence was not identical for all subjects, since the order of turbulence switches depended on the individual deviations of EDR freq. Moreover, using only two steps of turbulence intensity – i.e., with (step 2) and


Fig. 3. Absolute set point deviation values (group means and standard errors) of EDR freq. for adaptive (Exp.) and yoked control (YC) condition.


without (step 0) turbulence – might have contributed to problems in adjusting the strength of turbulence to the actual arousal level in some experimental subjects. As can be inferred from Fig. 2, using a two-point control by merely turning turbulences on and off might have resulted in a sudden excess amount of arousal for experimental subjects, which may have induced higher oscillations around the set point in some flight sections (so-called excessive increase of regulating energy). This may have hampered adaptive adjustment of turbulences, as reflected in the small number of experimental subjects showing significant correlations between EDR freq. and EDR-dependent turbulences. Therefore, a finer grading of turbulence intensity was introduced in our second study.

Furthermore, since it is rather unlikely that a single physiological system will have both sensitivity and diagnosticity to cover all aspects of arousal and vigilance decrement in such a complex system, recordings from several physiological systems may be needed to gain a full picture of the different arousal and attentional systems (Allanson and Fairclough, 2004; Boucsein and Backs, 2000). Hence, a subsequent study was performed with more refined turbulence switching and the inclusion of additional ANS measures from the cardiovascular system.

3. Study 2

3.1. Aim of study 2

Our second study largely followed the design of study 1, with the following changes:

(1) We used different combinations of autonomic measures (EDR freq., HR and HRV) for probing their quality of regulation. We discussed but dismissed the following additional measures: (a) The sum of EDR amplitudes was not chosen for adaptive automation since it is more related to the emotionally negatively tuned EDR and less to general arousal (Boucsein and Backs, 2009, Table 35.1). (b) Respiration rate holds the problem that it is strongly confounded with respiration depth (cf. Scerbo et al., 2001). Wientjes (1992) observed increases of respiration depth depending on task difficulty, even when respiration rate remained unchanged. Veltman and Gaillard (1996) reported that some subjects showed higher respiratory activity under resting conditions, but others under arousing conditions. Moreover, with respect to usability in the field, respiration is prone to speech artifacts (Manzey, 1998; Wilson and Eggemeier, 1991), becoming a problem during communication with air traffic controllers and other members of the flight crew. Therefore, respiration rate was excluded from all algorithms applied in the present study.

(2) Since some subjects had shown a greater amount of EDA under resting than under workload conditions in study 1, we extended our baseline recordings to obtain a more accurate calculation of the subjects’ individual set point by using four instead of two baseline recordings: two periods without turbulences and two periods with the maximal stage of turbulences.

(3) As an additional measure for increasing the reliability of psychophysiological data parameterization, we extended the recording periods from 1 to 2 min per flight section. We expected that this would help maintain effective and stable adjustment of arousal within our adaptive system.

(4) As an extension of study 1, steps 0, 1, 3 and 5 out of six possible turbulence stages were chosen based on a previously performed unpublished study with subjective evaluations of turbulence intensities (Haarmann, 2007). Increasing the number of turbulence steps was necessary to obtain a more fine-grained manipulated variable and thus avoid the problem of a two-point control that results from merely turning turbulences on and off (see discussion of study 1).

(5) Gender was considered for pair matching, as gender differences were reported for EDA depending on the type of task to deal with (Boucsein, 1992). This is substantiated by the results of an unpublished study (Haarmann, 2007), with male participants showing significantly higher EDR freq. than female subjects during different stages of a simulated flight task that comprised start up, takeoff, climbing, turns, adjusting of instruments and instrument scanning epochs. Controlling for gender was also important with respect to cardiovascular activity, since Antelmi et al. (2004) reported an influence of gender on different measures of HRV, with higher values in females compared to males.

3.2. Methods of study 2

3.2.1. Participants

Forty-eight right-handed students (24 female, 24 male) aged 20–39 years (M = 26.42 years, SD = 5.34) took part in the second study. None of them had been involved in study 1. Pairs of subjects were matched according to a yoked control design as already used in study 1, but in addition separately for males and females.

3.2.2. Task and design

Before starting the experiment, subjects were submitted to an extensive flight simulator training (about 1 h duration), during which the flight simulator’s basic instruments and functions were explained, followed by additional practice of basic IFR flight missions, as already described in study 1.

For the experimental part, all subjects flew three blocks in a counterbalanced order according to a Latin square, comprising ten 2-min flight sections each. Each block was preceded by a 2-min


resting period, followed by four 2-min baseline recordings for a more reliable individual set point calculation than in study 1 (two periods without turbulence and two with maximum turbulence, with the aim of capturing the whole range between low and high workload conditions). For each block and each physiological measure, the arithmetic mean of the four baseline periods formed the individual set point. In all three blocks, subjects tried to keep altitude (10,000 ft), direction (60 degrees) and speed (120 kts) while facing turbulences. Each block applied different algorithms based on different combinations of psychophysiological measures: In block 1, adaptive automation was based on EDR freq. alone, i.e., turbulence intensity was modified if the threshold of EDR freq. was exceeded or undercut. In block 2, EDR freq. was coupled with HR, i.e., if both EDR freq. and HR together exceeded or fell below the set point, turbulence changes were initiated. In block 3, EDR freq. and HRV were combined, i.e., adaptive processes were only initiated if an EDR freq. increase and an HRV decrease appeared together as a consequence of task demands, or EDR freq. diminished and HRV increased, indicating loss of arousal or vigilance. In case of EDR freq. and HRV changes in the same direction, no such adaptation process was initiated. This was treated as a contradiction of both arousal indicators during psychophysiological recording, possibly due to artifacts like body movements or deep breathing. This aspect will be discussed in Section 3.4.

During the process of adaptive automation, deviations from the individual set point were calculated every 2 min without the subjects’ knowledge, as they flew continuously. Adaptive automation was performed for the 24 experimental subjects as follows: If the actual 2-min average of EDR freq. and HR was below and HRV above the individual set point (depending on the algorithm applied), turbulence intensity was increased, and vice versa. The other 24 participants served as yoked control group, i.e., each yoked control subject received the same sequence of turbulences as the corresponding experimental subject, regardless of her/his own set point and hence without adaptive automation.
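The decision rules of the three blocks can be illustrated with a minimal sketch. It is not the software used in the study: the function and dictionary key names are hypothetical, the comparison is made directly against the set point because no tolerance band is reported here, and moving exactly one position along the ladder of turbulence steps (0, 1, 3, 5) per adjustment is an assumption.

```python
from statistics import mean

TURBULENCE_STEPS = [0, 1, 3, 5]  # turbulence stages used in study 2


def set_point(baseline_values):
    """Individual set point: arithmetic mean of the baseline recordings."""
    return mean(baseline_values)


def arousal_sign(actual, target, higher_means_less_arousal=False):
    """+1 if the measure suggests arousal above the set point, -1 if below,
    0 if it sits exactly on the set point. For HRV the direction is inverted."""
    if actual == target:
        return 0
    sign = 1 if actual > target else -1
    return -sign if higher_means_less_arousal else sign


def turbulence_change(algorithm, actual, targets):
    """Return +1 (raise turbulence), -1 (lower turbulence) or 0 (hold).

    Combined algorithms act only if both measures agree (AND coupling);
    contradictory measures leave the turbulence unchanged.
    """
    signs = [arousal_sign(actual["edr"], targets["edr"])]
    if algorithm == "EDR freq. + HR":
        signs.append(arousal_sign(actual["hr"], targets["hr"]))
    elif algorithm == "EDR freq. + HRV":
        signs.append(arousal_sign(actual["hrv"], targets["hrv"],
                                  higher_means_less_arousal=True))
    elif algorithm != "EDR freq.":
        raise ValueError(f"unknown algorithm: {algorithm}")
    if all(s == -1 for s in signs):
        return 1   # arousal below set point: increase turbulence
    if all(s == 1 for s in signs):
        return -1  # arousal above set point: decrease turbulence
    return 0


def next_step_index(index, change):
    """Assumed stepping rule: move one position on the ladder, clamped."""
    return max(0, min(len(TURBULENCE_STEPS) - 1, index + change))
```

In this sketch a yoked control subject would simply be presented with the sequence of `TURBULENCE_STEPS` values produced for the experimental counterpart, without `turbulence_change` ever being evaluated on her/his own data.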

3.2.3. Apparatus, recording and data analysis

Technical equipment, psychophysiological data recording and parameterization were the same as described in study 1, except for the on-line parameterization every 2 min.

3.2.4. Subjective data

For detecting subjective general arousal, a visual analog arousal scale (Bartenwerfer, 1969) was applied. For the assessment of subjective effort, participants filled in a mental effort scale (Eilers et al., 1986). The two scales were presented after each block (three measurement points). In addition, at the beginning of the experimental session and after each block (four measurement points), a 20-item subjective workload assessment was presented covering five factors: tiredness, positive affect, stress, negative affect and monotony (Richter et al., 2002).

3.2.5. Statistical analysis of physiological data

In a first step, the absolute set point deviations (nominal minus actual values) were calculated for the psychophysiological measures involved in the different blocks of adaptive automation algorithms (EDR freq., HR and HRV). Afterwards, repeated measurement ANOVAs were conducted separately for each block, with experimental conditions (adaptive vs. yoked control), gender (male vs. female) and sequence of blocks (three sequences according to a Latin square) as between-subjects factors, and the ten flight sections as within-subjects factor, using Greenhouse–Geisser corrected p values.

In a second analysis, the three blocks were directly compared by repeated measurement ANOVAs for each physiological measure, using collapsed values of flight segments 6–10 only. This additional a posteriori evaluation was chosen because of marked differences between conditions (experimental vs. yoked control) within those flight sections.

In a third analysis, the frequency of turbulence switches was calculated, using a repeated measurement ANOVA with the same between-subjects factors mentioned above and frequency of turbulence switches as within-subjects factor. This was only performed with the 24 experimental subjects because their physiological responses actually triggered the variation of turbulences, which was not the case in the yoked control group.

To perform post hoc tests for significant group results, additional paired t-tests for each yoked pair were calculated for the algorithm in question. Furthermore, the correlations between actual physiological responses involved in the respective algorithm and the resulting onset/offset of turbulences were calculated for each pair of subjects, using Spearman’s rho (ρ) coefficient (as already performed in study 1). For all statistical analyses, the probability of error was set to α = 0.05 (two-tailed).
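The two per-pair statistics can be sketched with the Python standard library only. This is an illustration, not the analysis software actually used: the function names are hypothetical, and pairing the two subjects' set point deviations section by section is our reading of the paired comparison. Returning `None` when one series is constant mirrors the table note that no correlation was calculated when the strength of turbulence did not change.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(x, y):
    """t statistic and degrees of freedom for two paired series, e.g. the
    set point deviations of the two subjects of a yoked pair per section."""
    if len(x) != len(y):
        raise ValueError("paired series must have equal length")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the ranks.
    Returns None if either series is constant (rho is undefined)."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)
```

With ten flight sections per block, `paired_t` yields 9 degrees of freedom, matching the t(9) values reported for study 2.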

3.2.6. Statistical analysis of subjective data

Effects on visual analog arousal scale, effort scale and subjective workload ratings were statistically tested by repeated measurement ANOVA with block as within-subjects factor (beginning, after block 1, after block 2, after block 3) and experimental conditions (adaptive vs. yoked control condition), gender (male, female) and sequence of blocks (3 blocks) as between-subjects factors, using Greenhouse–Geisser adjusted p values. The probability of error was set to α = 0.05 (two-tailed).

3.3. Results of study 2

3.3.1. Physiological data

3.3.1.1. Algorithm applying EDR freq. For EDR freq., repeated measurement ANOVA did not yield significant effects of experimental conditions or significant interactions with flight segments. The same held for HR and HRV.

3.3.1.2. Algorithm applying EDR freq. and HR. For EDR freq., HR and HRV, no significant effects or interactions emerged.

3.3.1.3. Algorithm applying EDR freq. and HRV. Repeated measurement ANOVAs and post hoc t-tests revealed that set point deviations of EDR freq. were significantly smaller for the experimental group compared to the yoked control group for the combination of EDR freq. and HRV, especially during the second half of the block (flight segments 6–10), as supported by a significant interaction between experimental conditions and flight segment (F(9, 324) = 2.65, p = 0.014). In flight section 10, experimental subjects showed highly significantly lower absolute EDR freq. set point deviations of M = 2.92 (SD = 1.83) compared to their yoked control counterparts with M = 6.15 (SD = 3.87; t(46) = 3.67, p = 0.001; see Fig. 4). HRV data did not yield significant differences between the two groups (see Fig. 5).

Additionally, for HR (which was not included in the algorithm), a significant effect of experimental conditions was found (F(1, 36) = 4.84, p = 0.034), with smaller HR set point deviation values in experimental subjects compared to yoked control subjects. Furthermore, a significant interaction of experimental conditions and gender (F(1, 36) = 5.88, p = 0.021) revealed that male yoked control subjects showed higher set point deviations in HR than their experimental counterparts (post hoc Tukey’s test: p = 0.012, df = 36; see Fig. 6). For HRV, no significant effects emerged.

On an individual level, separate paired t-tests for each yoked pair partly revealed significant and highly significant differences
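As a plausibility check, the group comparison for flight section 10 can be approximately recomputed from the reported means and standard deviations. The sketch below assumes a pooled-variance t-test for two independent groups of 24 subjects each, which is consistent with the reported 46 degrees of freedom but is not stated explicitly; the function name is hypothetical.

```python
from math import sqrt


def t_from_summary(m1, sd1, n1, m2, sd2, n2):
    """Pooled-variance t statistic for two independent groups,
    computed from means, standard deviations and group sizes."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m2 - m1) / standard_error, df


# Flight section 10: experimental vs. yoked control (EDR freq. deviations)
t_value, df = t_from_summary(2.92, 1.83, 24, 6.15, 3.87, 24)
print(f"t({df}) = {t_value:.2f}")
```

Under these assumptions the statistic comes out at roughly 3.7, close to the reported t(46) = 3.67; the small gap is of the size one would expect from rounding of the published means and standard deviations.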


Fig. 4. Absolute set point deviation values (group means and standard errors) of EDR freq. for adaptive (Exp.) and yoked control (YC) condition by combining EDR freq. and HRV.


between experimental and yoked control subjects. This held for eleven yoked pairs (subjects 101–201, 105–205, 107–207, 108–208, 109–209, 110–210, 111–211, 114–214, 116–216, 117–217, and 120–220; see Table 2a), where set point deviations for EDR freq. and/or HRV were lower in experimental subjects compared to yoked control subjects. Five pairs showed significant or highly significant differences in the opposite direction (subjects 104–204, 113–213, 121–221, 122–222, and 124–224), i.e., experimental subjects displayed higher set point deviations for EDR freq. and/or HRV than yoked control subjects (see Table 2a).

An example for adaptive adjustment is given in Fig. 7, with subject 107 (adaptive) staying closer to the set points for EDR freq. as well as HRV over 10 flight sections compared to subject 207 (its yoked control counterpart).

Correlations between EDR freq. actual values (HRV actual values, respectively) and resulting turbulence steps were significant or highly significant for two experimental subjects only, with subject 103 showing increased turbulences with decreased HRV, and subject 115 showing turbulence increment with decreasing EDR freq. For seven yoked control participants, mainly positive correlations emerged for EDR freq. (subjects 201, 203, 207, 209, 210, 213, 224 – see Table 2a), i.e., higher EDR freq. induced a turbulence increment.

Fig. 5. Absolute set point deviation values (group means and standard errors) of HRV (bpm) for adaptive (Exp.) and yoked control (YC) condition by combining EDR freq. and HRV.

3.3.1.4. Direct comparison of algorithms. Based on the results of Sections 3.3.1.1–3.3.1.3, a direct comparison of the three blocks using flight segments 6–10 showed a significant interaction between blocks and experimental conditions (F(2, 72) = 3.406, p = 0.042; t(46) = 2.198, p = 0.033), with significantly smaller set point deviations of EDR freq. in experimental subjects (M = 3.28, SD = 1.86) compared to yoked control subjects (M = 4.69, SD = 2.54) for the algorithm “EDR freq. + HRV” (see Fig. 8).

HR yielded a significant interaction between blocks, experimental conditions and gender (F(2, 72) = 4.971, p = 0.012; see Fig. 9). Male yoked control subjects showed higher set point deviation values of HR than their experimental counterparts in “EDR freq. + HRV” (t(22) = 2.492, p = 0.021). Female yoked control subjects displayed higher set point deviation values of HR in “EDR freq. + HR” than female experimental subjects (t(22) = 1.881, p = 0.084). A comparison of algorithms revealed that female experimental subjects showed the lowest rate of HR set point deviations of all experimental groups in “EDR freq. + HR”, especially in comparison to male experimental subjects (t(22) = 1.887, p = 0.081). Other effects did not reach significance.

For HRV, no significant effects were found.

3.3.2. Turbulence switches

An analysis of turbulence switches within the adaptive automation group (n = 24) revealed that the frequency of turbulence switches was significantly higher for the block “EDR freq.” (M = 6.38, SD = 3.17) compared to the blocks “EDR freq. + HR” (M = 2.92, SD = 1.72) and “EDR freq. + HRV” (M = 3.25, SD = 1.68) containing combined psychophysiological measures (F(2, 36) = 17.67, p < 0.001; see Fig. 10).
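How such a switch frequency can be obtained from a block's turbulence sequence is sketched below. The function name is hypothetical, and counting only changes between consecutive flight sections (not the transition from the baseline period into the first section) is an assumption.

```python
def count_switches(turbulence_sequence):
    """Number of flight sections whose turbulence step differs from the
    step of the preceding section."""
    pairs = zip(turbulence_sequence, turbulence_sequence[1:])
    return sum(1 for previous, current in pairs if previous != current)


# Hypothetical block of ten 2-min flight sections (steps 0, 1, 3, 5)
example_block = [0, 1, 1, 3, 5, 5, 3, 3, 3, 1]
print(count_switches(example_block))
```

For the hypothetical block above, five of the nine transitions involve a change of turbulence step.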

3.3.3. Subjective data

For the visual analog arousal scale, a highly significant effect of gender emerged (F(1, 36) = 9.802, p = 0.003). Female subjects showed higher arousal values than male subjects (see Fig. 11).

Analysis of mental effort scale ratings yielded a significant effect of gender as well (F(1, 36) = 6.057, p = 0.019), with females showing significantly higher scores than males. Moreover, the interaction between experimental conditions, gender and blocks reached marginal significance (F(2, 72) = 3.063, p = 0.054, see Fig. 12): Female experimental subjects showed higher effort values in “EDR freq.” and “EDR freq. + HR” than yoked control females. In “EDR freq. + HRV”, male experimental subjects displayed higher effort scores than their male yoked control counterparts. In all three algorithms, female experimental subjects showed the highest effort scores.

For the mental workload ratings, the factor “tiredness” revealed a highly significant effect of blocks (F(3, 108) = 27.983, p < 0.001): All subjects had lower scores at the beginning compared to the three blocks they worked on. There was no significant difference between experimental conditions.

“Positive affect” yielded a highly significant effect of blocks (F(3, 108) = 18.524, p < 0.001). Subjects showed a higher degree of positive affect at the beginning compared to the following blocks, again without significant differences between experimental conditions.

“Stress” showed a significant block effect (F(3, 108) = 4.165, p = 0.013), a significant interaction between gender and blocks (F(3, 108) = 2.980, p = 0.046) and a significant interaction between blocks, experimental conditions and gender (F(3, 108) = 3.284, p = 0.033, see Fig. 13). Female experimental subjects showed lower stress scores in all blocks compared to their male counterparts, especially in “EDR freq. + HRV”


Fig. 6. Absolute set point deviation values (group means and standard errors) of HR (bpm) for adaptive (Exp.) and yoked control (YC) male/female subjects by combining EDR freq. and HRV.


(t(22) = 1.792, p = 0.087). This was substantiated by a direct comparison of the three algorithms excluding baseline ratings, revealing a highly significant interaction of experimental conditions, gender, and blocks as well (F(2, 72) = 5.81, p = 0.007).

For “negative affect”, a highly significant block effect was observed (F(3, 108) = 39.651, p < 0.001), with higher scores of negative affect at the beginning compared to the following blocks.

Finally, “monotony” showed a highly significant block effect (F(3, 108) = 35.544, p < 0.001) with lower monotony scores at the beginning, and a significant gender effect (F(1, 36) = 6.045, p = 0.019), whereby male participants reported increased monotony compared to females (see Fig. 14). This was confirmed by a direct comparison of algorithms without baseline ratings, showing a significant effect of gender (F(1, 36) = 5.884, p = 0.02) as well.

3.4. Discussion of study 2

As an extension of study 1, adaptive adjustment of arousal was performed in our second study by means of (1) EDR freq. alone, (2) a combination of EDR freq. and HR, or (3) a combination of EDR freq. and HRV. To improve determination of the individual set points, they were obtained as the average of four instead of two baseline recordings. As in the first study, subjects in the yoked control group flew their experimental counterparts’ sequence of turbulences without any adaptive control, i.e., regardless of the relation of their actual arousal to their individual set point. Additionally, subjective arousal and workload data were collected at the beginning of the experimental session and after each block.

Our results partly support the usability of autonomic measures for adaptive automation already demonstrated in study 1. Comparing three different algorithms revealed a significant group difference between experimental and yoked control condition if EDR freq. was combined with HRV in the process of adaptive control. Experimental subjects showed smaller EDR freq. set point deviations than yoked control subjects with this combination, especially during flight segments 6–10 (see Figs. 4, 7, and 8). Additionally, this combination showed smaller HR set point deviations for experimental subjects compared to yoked control subjects, especially in the male group, although HR was not involved in the algorithm “EDR freq. + HRV”. This may reflect a broader impact of this algorithm on psychophysiological measures not directly involved in the closed loop.

On an individual level, paired t-tests only partly mirrored the results already found on a group level. Some of the yoked pairs even showed results in the opposite direction, i.e., experimental subjects displayed higher set point deviations in EDR freq. and/or HRV compared to their yoked control counterparts.

In contrast to study 1, correlations between physiological measures – EDR freq. and HRV – and resulting adjustment of turbulences did not reach significance in most cases. Several correlations were particularly against our expectations: Subject 103 revealed a negative correlation between HRV and turbulences, i.e., lower HRV resulted in higher turbulences. Furthermore, subject 115 showed a positive correlation between EDR freq. and resulting turbulences, which was expected to be negative. Finally, yoked control subjects displayed mainly positive correlations between EDR freq. and EDR-dependent turbulences, although no such correlation should occur without adaptive adjustment. A reason for these unexpected results might be found in the coupling of the two measures – EDR freq. and HRV – by an AND operator. Turbulence changes were only initiated if both measures deviated from their predefined set points, i.e., if EDR freq. fell below (or exceeded, respectively) its set point, but HRV did not exceed (or did not fall below, respectively) its set point at the same time, no turbulence changes took place. This might have resulted in difficulties for the adaptive system to adhere to the physiological state of the subject.

For the remaining algorithms, no significant differences could be detected between the experimental conditions. Probably, the algorithm “EDR freq.” led to higher oscillations even in the experimental group due to the lack of restrictive influence exerted by an additional measure, such as HR or HRV. With “EDR freq. + HR” we expected a differentiation of experimental conditions comparable to “EDR freq. + HRV”. After inspection of the data sets for “EDR freq. + HR”, we observed a higher variance in mean set point deviation values for yoked control subjects compared to experimental subjects, which might explain the lack of significant differences in set point deviations. This may reflect the way yoked control subjects dealt with the sequence of turbulences forced upon them without consideration of their actual physiological responses. Presumably, several yoked control subjects were able to adapt to the given sequence of turbulences, whereas others were not.

A direct comparison of the three algorithms for flight segments 6–10 supported the efficacy of the algorithm “EDR freq. + HRV”. Results showed a significant interaction of blocks and experimental conditions for EDR freq. Moreover, “EDR freq. + HRV” yielded a significant interaction of gender, blocks and experimental conditions, with male experimental subjects displaying smaller EDR freq. set point deviations than their yoked control counterparts. In block “EDR freq. + HR”, a significant interaction of gender, blocks and experimental conditions emerged for HR set point deviations, with experimental female subjects showing smaller HR


Table 2
a) Mean deviation values (actual value minus set point value) and t-values for paired comparisons over 10 flight sections (2 min each) for EDR freq. and HRV (based on algorithm “EDR freq. + HRV”, study 2). b) Correlations between EDR freq./HRV and resulting turbulences (Spearman’s rho, ρ).

Subject  EDR freq.                             HRV (bpm)                              Correlations
         Set point  Deviation M (SD)  t(9)     Set point  Deviation M (SD)  t(9)      ρ EDR freq.  ρ HRV
101      15.00      3.60 (2.84)       4.73**   3.50       0.58 (0.74)       1.56      −0.14        0.07
201      21.00      8.30 (3.59)                4.01       0.92 (0.68)                 0.69*        −0.37
102      16.75      4.95 (2.10)       1.66     2.29       0.60 (0.72)       1.82      0.32         −0.37
202      12.50      3.20 (3.50)                1.64       0.26 (0.23)                 0.36         0.02
103      10.25      2.90 (2.24)       1.50     1.74       0.22 (0.17)       1.11      −0.25        −0.78**
203      18.00      2.00 (1.05)                2.11       0.16 (0.25)                 −0.06        0.73*
104      14.50      4.50 (3.02)       2.21     6.54       1.16 (0.74)       4.86**    0.47         0.36
204      9.50       2.30 (2.30)                2.00       0.17 (0.09)                 0.26         −0.52
105      16.25      1.45 (1.32)       4.74**   4.80       0.32 (0.21)       2.30*     −0.46        0.65*
205      27.00      6.00 (2.62)                3.86       0.54 (0.30)                 0.58         −0.27
106      17.50      3.70 (3.39)       0.98     2.43       0.50 (0.44)       0.97      −0.36        −0.49
206      6.00       4.30 (2.54)                3.78       0.64 (0.95)                 −0.01        −0.61
107      17.00      2.80 (1.14)       4.47**   2.70       0.24 (0.17)       2.84*     −0.79**      −0.22
207      12.25      6.55 (3.38)                5.37       0.89 (0.76)                 0.77**       −0.58
108      22.75      3.70 (2.58)       2.61*    3.08       0.43 (0.19)       1.57      −0.35        0.45
208      17.25      6.15 (3.07)                3.57       0.31 (0.31)                 0.11         0.17
109      19.00      1.80 (1.93)       2.13     2.64       0.22 (0.16)       2.75*     0.43         0.12
209      12.50      3.70 (2.44)                3.28       0.39 (0.22)                 0.74*        −0.62
110      18.25      3.80 (2.18)       2.41*    3.17       0.22 (0.20)       4.67**    −0.49        0.42
210      15.25      6.15 (3.55)                5.48       1.80 (1.10)                 −0.65*       0.09
111      21.75      1.70 (1.38)       2.49*    4.50       0.81 (1.18)       1.99      −0.23        −0.26
211      19.75      3.55 (2.61)                1.97       0.27 (0.13)                 0.11         −0.58
112      27.50      1.20 (0.82)       2.01     3.31       0.26 (0.21)       2.13      −0.14        −0.06
212      19.75      2.00 (1.59)                3.67       0.63 (0.77)                 −0.24        −0.42
113      19.00      3.90 (1.97)       2.06     5.59       1.27 (1.39)       2.28*     −0.08        −0.19
213      21.00      2.40 (1.58)                2.47       0.42 (0.22)                 0.74*        −0.58
114      18.25      3.15 (2.20)       6.78**   3.86       0.48 (0.20)       2.75*     a            a
214      24.50      8.50 (2.49)                5.99       1.43 (1.23)                 a            a
115      21.00      3.00 (2.71)       1.20     2.47       0.41 (0.65)       1.07      −0.66*       0.06
215      16.75      2.15 (2.88)                1.74       0.30 (0.10)                 0.06         −0.04
116      18.00      2.70 (1.89)       1.62     2.34       0.15 (0.12)       6.37**    a            a
216      18.75      3.95 (2.14)                3.58       0.51 (0.20)                 a            a
117      22.00      1.90 (0.74)       3.87**   2.60       0.26 (0.29)       3.91**    −0.36        −0.13
217      21.25      4.50 (2.57)                4.06       0.94 (0.77)                 0.13         −0.44
118      24.25      2.50 (2.06)       1.07     8.42       0.95 (0.64)       1.04      0.00         0.17
218      16.25      3.15 (3.13)                3.92       0.80 (0.79)                 0.09         0.09
119      28.25      4.65 (2.55)       1.80     4.36       0.44 (0.36)       1.11      −0.17        0.38
219      18.75      2.85 (3.24)                4.31       0.35 (0.25)                 0.19         −0.56
120      16.25      2.40 (2.11)       2.51*    2.78       0.30 (0.22)       0.87      −0.12        0.52
220      20.50      5.10 (4.09)                2.86       0.33 (0.25)                 −0.24        −0.41
121      33.50      9.60 (4.09)       4.57**   5.26       0.96 (0.63)       4.15**    −0.05        0.13
221      18.50      2.80 (1.64)                1.81       0.24 (0.16)                 0.59         −0.62
122      26.25      5.55 (4.91)       2.22     6.79       1.92 (1.15)       4.52**    −0.14        0.24
222      22.00      2.60 (2.27)                1.87       0.52 (0.66)                 −0.46        0.56
123      21.00      2.60 (2.12)       2.09     3.83       0.68 (0.74)       1.17      −0.07        0.07
223      18.75      4.10 (2.08)                2.70       0.54 (0.42)                 −0.45        0.32
124      25.75      3.25 (1.70)       0.91     4.07       0.83 (0.49)       4.74**    0.00         0.49
224      26.25      3.00 (2.42)                0.96       0.15 (0.10)                 0.70*        0.57

Note: Subjects 1xx are experimental subjects, subjects 2xx their yoked controls; t-values refer to the paired comparison within each yoked pair. a was not calculated as the strength of turbulence did not change. *p < 0.05, **p < 0.01.


set point deviations than male experimental subjects. Presumably, “EDR freq. + HRV” was more adequate for male participants, while “EDR freq. + HR” provided better adaptive adjustment in females. Moreover, as can be inferred from Fig. 8, yoked control subjects displayed higher EDR freq. set point deviations in the block “EDR freq. + HRV” compared to the other two blocks. This may imply that yoked control participants had particular difficulties in adapting to the sequence of turbulences forced upon them in this block. As


Fig. 7. Comparison of EDR freq. for an experimental subject (107) and a yoked control subject (207) for 10 flight sections in block “EDR freq. + HRV”.


a consequence, significantly higher EDR freq. set point deviations emerged for yoked control subjects compared to experimental subjects.

The efficacy of the algorithm “EDR freq. + HRV” was also backed up by a significantly lower number of turbulence switches compared to an adaptive regulation based on EDR freq. alone. The frequency of turbulence switches can be regarded as an indicator for the quality of adaptive automation, since frequent switches of the manipulated variable (here: turbulence intensity) may cause instabilities in a closed loop, resulting in an unwanted amount of system oscillations. According to results reported by Scallen et al. (1995), short cycles of automation (i.e., higher frequencies of mode

Fig. 8. Comparison of the three blocks for EDR freq. set point deviation values based onflight segments 6–10 (group means and standard errors).

switches per time unit) in adaptive function allocation enhanceperformance, but at the same time increase subjective workload.Moreover, Hadley et al. (1999) showed that subjects had moredifficulties in switching back from automated to manual operationin case of short cycles of switches compared to long cycles witha lower frequency of switches. Hence, the high frequency ofswitches under EDR freq. alone as seen here might have contrib-uted to instabilities within the closed loop, resulting in higher setpoint deviations compared to the combinations of EDR freq. withHRV or HR.
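The switching frequency used above as a quality indicator can be made concrete with a short sketch. It assumes that the turbulence level is logged once per epoch as an integer; the function names and the example sequence are hypothetical and are not taken from the study.

```python
from typing import Sequence


def count_switches(levels: Sequence[int]) -> int:
    """Count how often the manipulated variable (turbulence level)
    changes between two consecutive epochs."""
    return sum(1 for prev, cur in zip(levels, levels[1:]) if cur != prev)


def switch_rate(levels: Sequence[int]) -> float:
    """Switches per epoch transition: 0.0 means the level never changed,
    1.0 means it changed after every epoch."""
    if len(levels) < 2:
        return 0.0
    return count_switches(levels) / (len(levels) - 1)


# Hypothetical log of turbulence levels over six epochs.
example_levels = [1, 1, 2, 2, 3, 2]
n_switches = count_switches(example_levels)
rate = switch_rate(example_levels)
```

A lower rate at comparable set point deviations would correspond to the more stable closed loop described in the text.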

Fig. 9. Interaction of blocks, experimental conditions and gender for HR difference values (group means and standard errors).

Fig. 10. Frequency of turbulence switches within the experimental group (n = 24) for the different blocks (group means and standard errors).

Fig. 12. Interaction of blocks, experimental conditions and gender for mental effort scores (group means and standard errors).

As an additional benefit, coupling of EDR freq. and HRV within a single algorithm for adaptive automation may also contribute to reducing artifacts during on-line data assessment. Both measures exert mutual control, as simply falling below or exceeding a defined set point value in one measure would not suffice to initiate switching of turbulence. Only in case of a clear sympathetic response pattern (EDR freq. upwards AND HRV downwards) or sympathetic withdrawal (EDR freq. downwards AND HRV upwards) was a system adjustment by varying the strength of turbulence initiated. For example, deep breathing may trigger an HRV increase, leading to an underestimation of arousal (cf. Veltman and Gaillard, 1996). As a consequence, without a combined algorithm, arousal would be further increased by erroneously imposing additional workload. On the other hand, deep breathing elicits an increase of EDA, leading to an overestimation of arousal. An algorithm solely based on EDA would have erroneously reduced the amount of workload in case of deep breathing. A combination of the two physiological measures with AND operators, as performed in our combined algorithm, offers a way to reduce the possible impact of artifacts during adaptive automation.
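The AND rule described above can be sketched as follows. This is a minimal illustration and not the implementation used in the study: the function name, the sign convention (deviation = current value minus individual set point) and the one-step output are assumptions. The direction of adjustment (less turbulence under a sympathetic response pattern, more under sympathetic withdrawal) follows the general principle of adaptive automation stated in the abstract.

```python
def turbulence_adjustment(edr_dev: float, hrv_dev: float) -> int:
    """Return -1 (reduce turbulence), +1 (increase turbulence) or 0 (no change).

    edr_dev: EDR freq. minus its set point (hypothetical sign convention).
    hrv_dev: HRV minus its set point (hypothetical sign convention).
    """
    if edr_dev > 0 and hrv_dev < 0:
        # Sympathetic response pattern: arousal above set point.
        return -1
    if edr_dev < 0 and hrv_dev > 0:
        # Sympathetic withdrawal: arousal below set point.
        return +1
    # Measures disagree or sit on the set point: no switch is initiated.
    return 0


# Deep breathing raises both EDR freq. and HRV, so neither branch fires.
deep_breathing = turbulence_adjustment(edr_dev=1.5, hrv_dev=0.3)
```

In this sketch, a single-measure rule would have switched on the deep-breathing epoch, whereas the combined rule leaves the turbulence level unchanged.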

Fig. 11. Interaction of blocks, experimental conditions and gender for the visual analog arousal scale scores (group means and standard errors).

No substantial effects for the experimental conditions or algorithms emerged in the visual analog arousal scale, effort scale and subjective workload ratings. Each of the five factors of the subjective workload ratings displayed a highly significant effect of blocks, with differences mainly based on changes from baseline ratings to block ratings. This may be due to the fact that subjective evaluations took place only at the beginning and at the end of each block, whereas the critical changes might have appeared during the blocks. However, obtaining subjective recordings during blocks would have interrupted the closed-loop process. Only the factor “stress” of subjective workload ratings indicated that experimental female subjects reported lower stress ratings in all blocks compared to experimental male subjects, especially in the combination of “EDR freq. + HRV”, thus further supporting the efficacy of that algorithm for adaptive automation. Removing baseline ratings from the analysis substantiated this finding. “Monotony” yielded higher rating scores in male subjects, accompanied by higher stress scores. Therefore, monotony can be interpreted as an aversive state of boredom which might be perceived as stressful (e.g., instrument scanning in a cockpit for longer time epochs). O’Hanlon (1981) already considered monotony and boredom to be stress-inducing due to the requirement to maintain an adequate state of arousal in arousal-alleviating settings over a longer period of time. According to informal post-experimental interviews with our participants, nearly all male subjects reported having extensive experience with computer games and simulations. Therefore, the increased boredom in male participants may be attributable to their higher degree of experience with computer games and simulations compared to female subjects. Female subjects were more involved in the flight task, probably due to a lack of experience in computer games and simulations, as displayed in higher arousal scores on the visual analog scale and in higher mental effort scores, but a decrease in stress scores. A sample consisting of professional male and female pilots with a comparable technical background might have yielded no substantial differences in subjective ratings with respect to gender.

Studies of Prinzel et al. using the NASA Task Load Index (NASA-TLX; Hart and Staveland, 1988) for subjective workload assessment reported lower total workload scores in subjects with adaptive automation compared to yoked control subjects (e.g., Prinzel et al., 2003). At first glance, this might appear contradictory to our results showing higher mental effort scores; however, higher effort in the experimental group as reflected in the mental effort scale can also be interpreted as a result of being more involved in the flight task compared to yoked control subjects, which is supplemented by lower stress ratings as mentioned above. In this context, it would be interesting to look at NASA-TLX subscale results (mental demand, physical demand, temporal demand, performance, effort, and frustration), which are not reported in the studies by Prinzel et al., where only total workload scores were provided.

Fig. 13. Interaction of blocks, experimental conditions and gender for stress scores (group means and standard errors).

Fig. 14. Interaction of blocks, experimental conditions and gender for monotony scores for adaptive (Exp.) and yoked control (YC) condition (group means and standard errors).

Another problem can be seen in the way yoked pairs were formed with respect to baseline values. It was not always possible to yoke adequate pairs with comparable baseline values. A more fine-grained matching would have required substantially more subjects for initial testing.

4. General discussion

An important goal of our present research was to demonstrate the usability of autonomic measures for adaptive automation in a closed-loop system, since these measures may have several advantages over EEG-derived measures when used in a cockpit environment, as discussed in our introduction. Besides having attained this general goal to a certain extent, our second study showed that the combination of measures from different physiological systems for adaptive automation, i.e., EDR freq. and HRV, is clearly superior to using the electrodermal system alone. Each of these measures has its own sensitivity (Boucsein and Backs, 2009, Table 35.2), which may add to the advantage of such a combination over a single measure.

Several problems that come with constructing an adaptive automation system based on psychophysiological measures were addressed and partly resolved between our first and second study. As De Waard and Mulder (2006) pointed out, the stability of a closed-loop system is influenced by the epoch length (delay) and by the sensitivity of responding to changes (gain vs. loss). By combining several psychophysiological measures as in our second study, the modulation of the gain factor in a closed loop may be improved, which is reflected in switching behavior. In fact, the quality of a closed-loop system is determined by the frequency of mode switches. In this context, Freeman et al. (1999) as well as Hadley et al. (1997) and Freeman et al. (2000) compared different epoch lengths (2 s vs. 20 s and 4 s vs. 40 s, respectively). They found that shorter time windows triggered more mode switches and resulted in better tracking performance, which can be explained by higher system sensitivity. However, better performance might result in higher subjective workload appraisal, as reported by Scallen et al. (1995) when using short time slots. To avoid frequent mode switches resulting in higher subjective workload, we increased the epoch length from 60 s in our first study to 2 min in the second study.

Moreover, a fine grading of automation is important for avoiding a two-point control by merely turning the automation mode on and off. Since automation was simulated in our studies by changing the strength of turbulence, we opted for four steps out of six possible turbulence degrees in our second study to improve the fine grading of “automation” compared to the first study, which might have contributed to the better success in the second study.
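The difference between a graded and a two-point control can be illustrated with a small sketch. It assumes that the controller moves the turbulence level by at most one step per epoch and that the four levels are indexed 0 to 3; which four of the six turbulence degrees were actually used is not stated here, so the indexing, the function name and the example values are purely hypothetical.

```python
def next_level(current: int, adjustment: int, n_levels: int = 4) -> int:
    """Move the turbulence level by `adjustment` (-1, 0 or +1) and clamp
    the result to the available range [0, n_levels - 1]."""
    return max(0, min(n_levels - 1, current + adjustment))


# Hypothetical sequence of per-epoch adjustments, starting at level 1.
level = 1
trajectory = [level]
for adjustment in (+1, +1, +1, 0, -1):
    level = next_level(level, adjustment)
    trajectory.append(level)
```

With `n_levels=2` the same function degenerates into the on/off control that the graded design is meant to avoid.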

Using difference-to-baseline values as in our studies instead of calculating the slope between two subsequent epochs is also supported by results reported by Freeman et al. (1999), who compared a slope criterion (changes on an epoch-to-epoch basis) to an absolute criterion (changes in comparison to baseline level). The latter criterion showed better and more stable performance under negative feedback, which is typically applied in a closed loop. Moreover, using baseline recordings for calculating difference values holds the advantage that interindividual differences in initial values are taken into account.
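The two criteria can be contrasted in a few lines. The sketch below assumes one value of a physiological measure per epoch and a single baseline value per participant; the function names and the numbers are illustrative only.

```python
from typing import List, Sequence


def absolute_criterion(epochs: Sequence[float], baseline: float) -> List[float]:
    """Difference-to-baseline value for every epoch."""
    return [value - baseline for value in epochs]


def slope_criterion(epochs: Sequence[float]) -> List[float]:
    """Change from each epoch to the next (one value fewer than epochs)."""
    return [cur - prev for prev, cur in zip(epochs, epochs[1:])]


# Hypothetical epoch values for one participant with a baseline of 5.0.
epoch_values = [5.0, 6.0, 6.0, 4.0]
abs_diffs = absolute_criterion(epoch_values, baseline=5.0)
slopes = slope_criterion(epoch_values)
```

In the third epoch of this example the value is still above baseline while the epoch-to-epoch slope is zero, so a slope criterion would register no change although the deviation from baseline persists.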

One possible disadvantage of the LAS flight simulator used in our research is that performance data, such as course deviations on a time scale, are not available, although they could be important for comparison with real-time psychophysiological data. In an unpublished study (Haarmann, 2007), we acquired response times (RT) for 4 shorter and longer curve maneuvers (10 degrees vs. 30 degrees roll angle) during simulated flight, controlling for gender and time of day (morning vs. afternoon). We expected RT to increase with time on flight task as a consequence of arousal decrement. When comparing 10 degrees maneuvers and 30 degrees maneuvers separately, no significant differences or interactions emerged. In that context, RT turned out not to be a valid indicator of vigilance decrement, and therefore we did not use it in the two studies reported here.

Finally, our second study offered a possible solution to the hitherto unsolved problem of on-line artifact control by combining EDR freq. and HRV as an indirect artifact control. The existing literature has not offered a solution for this problem yet. In brain–computer interaction, Halder et al. (2007) have recently introduced an on-line artifact removal tool for estimating the artifact-free EEG signal. In our present study 2, adaptive adjustment was only initiated if EDR freq. increased and HRV decreased or vice versa, which would not be the case for artifacts that produce increases in both EDR freq. and HRV. As already discussed in Section 3.4, it is necessary to combine physiological measures with AND operators and to consider the direction of psychophysiological measures under differing arousal levels. Such a restriction, however, might create new problems if the resulting algorithm responds in a clumsy way to varying arousal levels, i.e., if it does not match the workload actually experienced by the operator, or if the operator has adapted very well to the task requirements (Parasuraman, 2003; Veltman and Jansen, 2004). If tasks are reallocated in such a situation, the operator may be confused, and thus her/his performance might be hampered. Hence, reallocation should only take place if the operator is no longer able to cope with the changing task demands.

Weighting of combined physiological measures involved in an algorithm may improve system adjustment with respect to the individual arousal level. Mathematical modelling based on psychophysiological data of different participants can be regarded as a viable way, e.g., by regression models (cf. Fairclough et al., 2005). However, there is still the question whether a universal algorithm will fit any operator or pilot. Human responses are not as predictable as those of machines. To account for individual differences, we chose an approach based on individual set point calculation for constructing a closed loop. Since arousal shifts automatically take place over longer time-on-task intervals, new set points were calculated for each block in study 2, considering arousal shifts from block to block. Determining an ideal set point criterion for a physiological measure can be difficult. For example, De Waard and Mulder (2006) found that the 10 mmHg criterion for blood pressure changes was scarcely met by their subjects. Therefore, we opted for calculating a mean arousal from several baseline epochs with low and high workload. This mean value was considered as the individual optimal arousal level.
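The individual set point calculation can be sketched as below. The sketch assumes that the set point is the pooled mean over all baseline epochs (low and high workload together) and that it is recomputed from fresh baseline epochs before each block; the pooling, the function names and the numbers are assumptions made for illustration.

```python
from statistics import mean
from typing import Sequence


def set_point(low_workload_epochs: Sequence[float],
              high_workload_epochs: Sequence[float]) -> float:
    """Pooled mean over baseline epochs with low and high workload,
    taken as the individual optimal arousal level (assumed pooling)."""
    return mean([*low_workload_epochs, *high_workload_epochs])


def set_point_deviation(value: float, sp: float) -> float:
    """Signed deviation of an epoch value from the individual set point."""
    return value - sp


# Hypothetical baseline epochs of one measure for one participant.
sp_block = set_point([2.0, 4.0], [8.0, 10.0])
deviation = set_point_deviation(7.5, sp_block)
```

Calling `set_point` again with the baseline epochs of the next block would yield the block-specific set point that accounts for arousal shifts over time on task.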

In conclusion, our results can be considered as a first step towards the development of algorithms for future field applications with adaptive automation. We are well aware of the fact that further research is needed for implementing adaptive automation in real environments. However, we successfully used easy-to-measure autonomic parameters instead of EEG measures that would be much harder to record during real flight, especially in commercial pilots who do not wear helmets as military pilots do. Moreover, probing adaptive automation in a professional IFR flight simulator with an authentic cockpit comes very close to reality. In our opinion, realistic setups are a prerequisite for the implementation of adaptive automation in such a complex work environment as a cockpit. Although we applied a close-to-reality simulated flight task, it may still be a long way for this kind of adaptive hypovigilance/workload control to make it to the real cockpit. However, our results suggest that this goal might be attained.

In the future, biocybernetic closed-loop systems with adaptive automation may serve as a training protocol for pilots for avoiding hazardous states of awareness (Pope and Bogart, 1992; Prinzel et al., 2001). Showing a certain pattern of psychophysiological responses will be “rewarded” by allocating tasks to the pilot as proposed by the concept of Autogenic-Feedback Training (AFT; Kellar et al., 1993). AFT aims at improving the flight crew’s performance during periods of high physiological and psychological arousal by specific feedback. During an emergency scenario in a cockpit, the experimental group receiving AFT showed better flight crew coordination and communication compared to a control group without AFT. Hence, feedback of the physiological state might counteract stress-induced performance decrements. Moreover, today’s portable devices for ambulatory data acquisition and parameterization (Fahrenberg and Wientjes, 2000) may pave the way for field studies applying adaptive automation in an even more reality-related context.

References

Allanson, J., Fairclough, S.H., 2004. A research agenda for physiological computing. Interact. Comput. 16, 857–878.

Antelmi, I., De Paula, R.S., Shinzato, A.R., Peres, C.A., Mansur, A.J., Grupi, C.J., 2004. Influence of age, gender, body mass index, and functional capacity on heart rate variability in a cohort of subjects without heart disease. Am. J. Cardiol. 93, 381–385.

Backs, R.W., 1998. A comparison of factor analytic methods of obtaining cardiovascular autonomic components for the assessment of mental workload. Ergonomics 41, 733–745.

Bailey, N.R., Scerbo, M.W., Freeman, F.G., Mikulka, P.J., Scott, L.A., 2004. A Comparison of a Brain-based Adaptive System and a Manual Adaptable System for Invoking Automation. NASA technical report. NASA Langley Research Center, Hampton, Virginia.

Bartenwerfer, H., 1969. Einige praktische Konsequenzen aus der Aktivierungstheorie. Z. Exp. Angew. Psychol. 16, 195–222.

Berntson, G.G., Bigger, J.T., Eckberg, D.L., Grossman, P., Kaufmann, P.G., Malik, M., Nagaraja, H.N., Porges, S.W., Saul, J.P., Stone, P.H., van der Molen, M.W., 1997. Heart rate variability: origins, methods, and interpretive caveats. Psychophysiology 34, 623–648.

Bibian, S., Ries, C.R., Huzmezan, M., Dumont, G.A., 2003. Clinical anesthesia and control engineering: terminology, concepts and issues. In: European Control Conference. Cambridge, UK, September 1–4, 2003. Available via: http://www.ece.ubc.ca/whuzmezan/docs/ECC03_ADD.pdf.

Boucsein, W., 1992. Electrodermal Activity. Plenum Press, New York.

Boucsein, W., Backs, R.W., 2000. Engineering psychophysiology as a discipline: historical and theoretical aspects. In: Backs, R.W., Boucsein, W. (Eds.), Engineering Psychophysiology. Issues and Applications. Lawrence Erlbaum Associates, Mahwah, N.J., pp. 3–30.

Boucsein, W., Backs, R.W., 2009. The psychophysiology of emotion, arousal, and personality: methods and models. In: Duffy, V.G. (Ed.), Handbook of Digital Human Modeling. Taylor & Francis/CRC Press, Boca Raton, pp. 35-1–35-18.

Boucsein, W., Haarmann, A., Schaefer, F., 2005. The usability of cardiovascular and electrodermal measures for adaptive automation during a simulated IFR flight mission. Psychophysiology 42, S26.

Byrne, E.A., Parasuraman, R., 1996. Psychophysiology and adaptive automation. Biol. Psychol. 42, 249–268.

Comstock, J.R.J., Arnegard, R.J., 1992. The Multi-Attribute Test Battery for Human Operator Workload and Strategic Behavior Research (No. 104174). National Aeronautics and Space Administration.

Cox, D.J., Kovatchev, B.P., Morris, J.B., Phillips, C., Hill, R.J., Merkel, L., 1998. Electroencephalographic and psychometric differences between boys with and without attention-deficit/hyperactivity disorder (ADHD): a pilot study. Appl. Psychophysiol. Biofeedback 23, 179–188.

Critchley, H.D., 2002. Electrodermal responses: what happens in the brain. Neuroscientist 8, 132–142.

De Waard, D., Mulder, L.J.M., 2006. Can cardiovascular indices be used to adapt the ambulance dispatch worker’s task? In: Pikaar, R.N., Koningsveld, E.A.P., Settels, P.J.M. (Eds.), Proceedings IEA 2006 Congress: Meeting Diversity in Ergonomics. Elsevier, Amsterdam.

Eilers, K., Nachreiner, F., Hänecke, K., 1986. Entwicklung und Überprüfung einer Skala zur Erfassung subjektiv erlebter Anstrengung. (The development and testing of a scale for recording subjectively experienced effort). Z. Arbeitswiss. 40, 215–224.

Endsley, M.R., 1996. Automation and situation awareness. In: Parasuraman, R., Mouloua, M. (Eds.), Automation and Human Performance: Theory and Applications. Lawrence Erlbaum Associates, Hillsdale, N.J., pp. 163–181.

Fahrenberg, J., Wientjes, C.J.E., 2000. Recording methods in applied environments. In: Backs, R.W., Boucsein, W. (Eds.), Engineering Psychophysiology: Issues and Applications. Lawrence Erlbaum Associates, Mahwah, N.J., pp. 111–136.

Fairclough, S.H., 2007. Psychophysiological inference and physiological computer games. Workshop of the International Conference on Advances in Computer Entertainment Technology (ACE). BrainPlay ’07: Playing with your brain. Brain–computer interfaces and games. Online document: http://hmi.ewi.utwente.nl/brainplay07_files/brainplay07.pdf#page=11, pp. 19–24.

Fairclough, S., Venables, L., Tattersall, A.J., 2005. The use of autonomic measures for biocybernetic adaptation. Psychophysiology 42 (Suppl. 1), S25.

Fowles, D.C., Christie, M.J., Edelberg, R., Grings, W.W., Lykken, D.T., Venables, P.H., 1981. Publication recommendations for electrodermal measurements. Psychophysiology 18, 232–239.

Freeman, F.G., Mikulka, P.J., Prinzel, L.J., Scerbo, M.W., 1999. Evaluation of an adaptive automation system using three EEG indices with a visual tracking task. Biol. Psychol. 50, 61–76.

Freeman, F.G., Mikulka, P.J., Scerbo, M.W., Prinzel, L.J., Clouatre, K., 2000. Evaluation of a psychophysically controlled adaptive automation system, using performance on a tracking task. Appl. Psychophysiol. Biofeedback 25, 103–115.

Haarmann, A., 2007. Der Einsatz psychophysiologischer Variablen bei der adaptiven Automatisierung am Beispiel einer Flugsimulationsaufgabe (Application of psychophysiological measures for adaptive automation during simulated flight). Doctoral thesis. University of Wuppertal.

Hadley, J., Mikulka, P.J., Freeman, F.G., Scerbo, M.W., Prinzel III, L.J., 1997. Psychophysiological evaluation of a biocybernetic system for adaptive automation. In: Paper Presented at the 9th Symposium on Aviation Psychology, Columbus, Ohio.

Hadley, G.A., Prinzel III, L.J., Freeman, F.G., Mikulka, P.J., 1999. Behavioral, subjective, and psychophysiological correlates of various schedules of short-cycle automation. In: Scerbo, M.W., Mouloua, M. (Eds.), Automation Technology and Human Performance: Current Research and Trends. Lawrence Erlbaum Associates, Mahwah, N.J., pp. 139–143.

Halder, S., Bensch, M., Mellinger, J., Bogdan, M., Kübler, A., Birbaumer, N., Rosenstiel, W., 2007. Online artifact removal for brain–computer interfaces using support vector machines and blind source separation. Comput. Intell. Neurosci., 10 pp. (online document).

Hart, S.G., Staveland, L.E., 1988. Development of NASA-TLX (task load index): results of empirical and theoretical research. In: Hancock, P.A., Meshkati, N. (Eds.), Human Mental Workload. North-Holland, Amsterdam, pp. 139–183.

Hart, S.G., Wickens, C.D., 1990. Workload assessment and prediction. In: Booher, H.R. (Ed.), MANPRINT: an Approach to Systems Integration. Van Nostrand Reinhold, New York, pp. 257–296.

Hugdahl, K., 1995. Psychophysiology. The Mind–Body Perspective. Harvard University Press, Cambridge, pp. 167–196.

Isshiki, H., Yamamoto, Y., 1994. Instrument for monitoring arousal level using electrodermal activity. In: Instrumentation and Measurement Technology Conference, 10–12 May 1994. IMTC/94. Conference Proceedings. 10th Anniversary. Advanced Technologies in I. & M., 1994. IEEE, vol. 2, pp. 975–978.

Junker, A., Berg, C., Schneider, P., McMillan, G., November 1995. Evaluation of the Cyberlink Interface as an Alternative Human Operator Controller (AL/CF-TR-1995-0011). Technical Report. Wright-Patterson Air Force Base, Armstrong Laboratory, Ohio.

Kellar, M.A., Folen, R.A., Cowings, P.S., Toscano, W.B., Hisert, G.L., 1993. Autogenic-feedback training improves pilot performance during emergency flying conditions. NASA Tech. Mem. 104005. NASA-Ames Research Center, Moffett Field, CA.

Kuikkaniemi, K., Kosunen, I., 2007. Progressive system architecture for building emotionally adaptive games. Workshop of the International Conference on Advances in Computer Entertainment Technology (ACE). BrainPlay ’07: Playing with your brain. Brain–computer interfaces and games. Online document: http://hmi.ewi.utwente.nl/brainplay07_files/brainplay07.pdf#page=11, pp. 33–35.

Laufer, L., Nemeth, B., 2007. Skin Signals Betray a Gamer’s Moves. Online document: www.newscientist.com/blog/technology/2007/08/skin-signals-betray-gamers-moves.html.

Lenneman, J.K., Backs, R.W., 2000. The validity of factor analytically derived cardiac autonomic components for mental workload assessment. In: Backs, R.W., Boucsein, W. (Eds.), Engineering Psychophysiology: Issues and Applications. Lawrence Erlbaum Associates, Mahwah, N.J., pp. 161–175.

Levy, M.N., 1997. Neural control of cardiac function. Baillieres Clin. Neurol. 6, 227–244.

Lubar, J.F., 1991. Discourse on the development of EEG diagnostics and biofeedback for attention-deficit/hyperactivity disorders. Biofeedback Self Regul. 16, 201–225.

Lubar, J.F., Swartwood, M.O., Swartwood, J.N., O’Donnell, P.H., 1995. Evaluation of the effectiveness of EEG neurofeedback training for ADHD in a clinical setting as measured by changes in T.O.V.A. scores, behavioral ratings, and WISC-R performance. Biofeedback Self Regul. 20, 83–99.

Manzey, D., 1998. Psychophysiologie mentaler Beanspruchung. In: Rösler, F. (Ed.), Enzyklopädie der Psychologie. Themenbereich C: Theorie und Forschung. Serie I: Biologische Psychologie, Band 5: Ergebnisse und Anwendungen der Psychophysiologie. Hogrefe, Göttingen, pp. 799–864.

Metz, S., Hoffman, B., 1997. Mind operated devices. Cogn. Technol. 2, 69–72.

Mikulka, P., Hadley, G., Freeman, F., Scerbo, M., 1999. The effects of a biocybernetic system on vigilance decrement. In: Proceedings of the Human Factors and Ergonomics Society 43rd Annual Meeting, 1410.

Morrison, J.G., Gluckman, J.P., 1994. Definitions and prospective guidelines for the application of adaptive automation. In: Mouloua, M., Parasuraman, R. (Eds.), Human Performance in Automated Systems: Current Research and Trends. Lawrence Erlbaum Associates, Hillsdale, N.J., pp. 256–263.

Mulder, L.J.M., Van Roon, A., Veldman, H., Laumann, K., Burov, O., Quispel, L., Hoogeboom, P., 2003. How to use cardiovascular state changes in adaptive automation. In: Hockey, G.R.J., Gaillard, A.W.K., Burov, O. (Eds.), Operator Functional State. IOS Press, Amsterdam, pp. 260–272.

Nishimura, C., Nagumo, J.I., 1985. Feedback control of the level of arousal using skin potential level as an index. Ergonomics 28, 905–913.

Obrist, P.A., 1981. Cardiovascular Psychophysiology. Plenum Press, New York.

Offenloch, K., Zahner, G., 1990. Computer aided physiological assessment of the functional state of pilots during simulated flight. In: NATO Advisory Group for Aerospace Research and Development Conference Proceedings 490, pp. 9-1–9-9.

O’Hanlon, J.F., 1981. Boredom: practical consequences and a theory. Acta Psychol. 49, 53–82.

Parasuraman, R., 2003. Adaptive automation matched to human mental workload. In: Hockey, G.R.J., Gaillard, A.W.K., Burov, O. (Eds.), Operator Functional State. IOS Press, Amsterdam, pp. 177–193.

Parente, M.A., Parente, R., 2006. Mind-operated devices: mental control of a computer using biofeedback. Cyberpsychol. Behav. 9, 1–4.

Pope, A.T., Bogart, E.H., 1992. Identification of hazardous awareness states in monitoring environments. SAE Trans.: J. Aerosp. 101, 449–457.

Pope, A.T., Bogart, E.H., Bartolome, D.S., 1995. Biocybernetic system evaluates indices of operator engagement in automated task. Biol. Psychol. 40, 187–195.

Prinzel III, L.J., Parasuraman, R., Freeman, F.G., Scerbo, M.W., Mikulka, P.J., Pope, A.T., 2003. Three Experiments Examining the Use of Electroencephalogram, Event-related Potentials, and Heart-rate Variability for Real-time Human-centered Adaptive Automation Design. NASA/TP-2003-212442. NASA Langley Research Center, Hampton, Virginia, pp. 1–62.

Prinzel III, L.J., Pope, A.T., Freeman, F.G., 2001. Application of Physiological Self-regulation and Adaptive Task Allocation Techniques for Controlling Operator Hazardous States of Awareness. NASA Langley Research Center, Hampton, Virginia, pp. 1–17.

Prinzel III, L.J., Pope, A.T., Freeman, F.G., 2007. Physiological Self-regulation and Adaptive Automation. NASA technical report. NASA Langley Research Center, Hampton, Virginia.

Richter, P., Debitz, U., Schulze, F., 2002. Diagnostik von Arbeitsanforderungen und kumulativen Beanspruchungsfolgen am Beispiel eines Call Centers. (Diagnosis of job demands and cumulative strain consequences in call center jobs). Z. Arbeitswiss. 1–2, 67–75.

Scallen, S.F., Hancock, P.A., Duley, J.A., 1995. Pilot performance and preference for short cycles of automation in adaptive function allocation. Appl. Ergon. 26, 397–403.

Scerbo, M.W., Freeman, F.G., Mikulka, P.J., 2000. A biocybernetic system for adaptive automation. In: Backs, R.W., Boucsein, W. (Eds.), Engineering Psychophysiology. Issues and Applications. Lawrence Erlbaum Associates, Mahwah, N.J., pp. 241–253.

Scerbo, M.W., Freeman, F.G., Mikulka, P.J., Parasuraman, R., Di Nocera, F., Prinzel III, L.J., 2001. The Efficacy of Psychophysiological Measures for Implementing Adaptive Technology. NASA Langley Research Center, Hampton, Virginia, pp. 1–66.

Schwilden, H., Stoeckel, H., Schüttler, J., 1989. Closed-loop feedback control of Propofol anaesthesia by quantitative EEG analysis in humans. Br. J. Anaesth. 62, 290–296.

Simanski, O., Kahler, R., Pohl, B., Hofmockel, R., Friedrich, R., Lampe, B.P., 2003. Measurement and control of neuromuscular blockade and depth of anaesthesia. In: Proceedings of the 7th European Control Conference, Cambridge (UK), 2003.

Streitberg, B., Röhmel, J., Herrmann, W.M., Kubicki, S., 1987. COMSTAT rule for vigilance classification based on spontaneous EEG activity. Neuropsychobiology 17, 105–117.

Van Dellen, H.J., Aasman, J., Mulder, L.J.M., Mulder, G., 1985. Time domain versus frequency domain measures of heart-rate variability. In: Orlebeke, J.F., Mulder, G., van Doornen, L.J.P. (Eds.), Psychophysiology of Cardiovascular Control. Plenum Press, New York, pp. 353–374.

Veltman, H.J.A., Gaillard, A.W.K., 1996. Physiological indices of workload in a simulated flight task. Biol. Psychol. 42, 323–342.

Veltman, H.J.A., Jansen, C., 2004. The adaptive operator. In: Vincenzi, D.A., Mouloua, M., Hancock, P.A. (Eds.), Human performance, situation awareness and automation: Current research and trends, vol. II. Lawrence Erlbaum Associates, Mahwah, N.J., pp. 7–10.

Wallin, B.G., 1981. Sympathetic nerve activity underlying electrodermal and cardiovascular reactions in man. Psychophysiology 18, 470–476.

Wilson, G.F., 2002. An analysis of mental workload in pilots during flight using multiple psychophysiological measures. Int. J. Aviat. Psychol. 12, 3–18.

Wilson, G.F., Eggemeier, F.T., 1991. Psychophysiological assessment of workload in multi-task environments. In: Damos, D.L. (Ed.), Multiple-task Performance. Taylor & Francis, London, Washington, D.C., pp. 329–360.

Wientjes, C.J.E., 1992. Respiration in psychophysiology: methods and applications. Biol. Psychol. 34, 179–203.

Wright, N., McGown, A., 2001. Vigilance on the civil flight deck: incidence of sleepiness and sleep during long-haul flights and associated changes in physiological parameters. Ergonomics 44, 82–116.

Yamamoto, Y., Isshiki, H., 1992. Instrument for controlling drowsiness using galvanic skin reflex. Med. Biol. Eng. Comput. 30, 562–564.