11th AIAA SoCal Aerospace Systems & Technology (ASAT) Conference, 3 May 2014, Santa Ana, CA

Approved for Public Release. Distribution Unlimited. 412TW-PA-13416

DESIGN OF EXPERIMENTS USED TO INVESTIGATE AN F/A-18 E/F STRAFING ANOMALY

Maj Zachary A. McCarley
Experimental Test Pilot
Naval Air Weapons Center, CA
[email protected]

Lt Col Timothy R. Jorris, Ph.D.
Director, Hypersonic Combined Test Force
412th Test Wing, Edwards AFB, CA
[email protected]

ABSTRACT

The F/A-18 E/F in the US Navy’s (USN) inventory is currently configured with the M-61A2 Vulcan cannon. The M-61A2 is the aircraft's internal gun system and has a rotating cluster of six barrels that fires 20-mm electrically-primed ammunition at 6,000 or 4,000 rounds per minute. A variety of Armor-Piercing Incendiary (API), High Explosive Incendiary (HEI), and training rounds are available. It was recently reported that the F/A-18 strafing accuracy seemed less than anticipated in certain combat scenarios. The subject matter experts identified several possible contributors to the suspected targeting error. Four factors were identified; each had two levels or options available during a given strafing run. The factors were Ammunition, either M56/M55 or PGU-28/PGU-27; Software, either Legacy or Upgraded; Altitude Source, either radar or system computed; and Radar, either APG-73 or APG-79. The chief pilot enlisted the help of the statistics experts at Edwards AFB. Using Design of Experiments (DOE), they collectively developed a matrix of factors to identify the contributors to any errors. Incidentally, the number of flights determined with DOE was substantially lower than the original estimates. Wind was another strong potential influence on accuracy; however, since winds could not be controlled, they were recorded and input into the statistical analysis. The analysis found statistically significant contributors to strafing accuracy. These statistically significant results are compared against operationally significant error values. Flight test analysis results are presented, and the data collection lessons learned are addressed, since they made the final analysis possible by eliminating several sources of measurement error. The presentation also addresses the future actions being taken as a consequence of collecting such compelling and supportable evidence via statistically defensible flight test results.

INTRODUCTION

It was recently reported that the F/A-18 strafing accuracy seemed less than

anticipated in certain combat scenarios. The subject matter experts identified several

possible contributors to the suspected targeting error. Four factors were identified;

each had two levels or options available during a given strafing run. The factors were

Ammunition, either M56/M55 or PGU-28/PGU-27; Software, either Legacy or

Upgraded; Altitude Source, either radar or system computed; and Radar, either

APG-73 or APG-79. The chief pilot enlisted the help of the statistics experts at

Edwards AFB. Using Design of Experiments (DOE), they collectively developed a

matrix of factors to identify the contributors to any errors.

TEST ITEM DESCRIPTION

The item under test was the F/A-18 E/F with an M-61A2 Vulcan cannon, to be

evaluated during strafing runs.

Gun

The M-61A2 is an upgrade to the venerable M-61A1 that has flown in many

aircraft since before the Vietnam era, including the F/A-18 A-D. The upgrade,

designed for the F-22, consisted of slightly reducing the weight. The F/A-18 E/F in

the US Navy’s (USN) inventory is currently configured with the M-61A2 Vulcan

cannon. The M-61 is the aircraft's internal gun system and has a rotating cluster of

six barrels that fires 20-mm electrically-primed ammunition at 6,000 or 4,000 rounds

per minute. A 20-mm round consists of a brass cartridge case, an electric primer,

propellant powder, and the projectile. The primer is ignited by electrical power from

the aircraft armament system. The primer ignites the propellant powder that forms a

gas as it burns, forcing the projectile through the gun barrel. There are currently two

configurations of ammunition available, M50 series and Projectile Gun Unit (PGU),

the only significant difference being the projectile design. Until the late 1980s the

M-61 primarily used the M50 series of ammunition in various types, typically firing a

3.5 oz (100 gram) projectile at a muzzle velocity of about 3,380 feet per second

(1,030 m/s). A variety of Armor-Piercing Incendiary (API), High Explosive Incendiary

(HEI), and training rounds are available. Around 1988 a new round was introduced,

the PGU-28/B, which is now standard for US Navy and United States Air Force

(USAF) aircraft. The PGU-28/B is a "low-drag" round designed to reduce in-flight

drag and deceleration, and has a slightly increased muzzle velocity of 3,450 feet per

second (1,050 m/s). It is a semi-armor piercing high explosive incendiary (SAPHEI)

round, providing substantial improvements in range, accuracy, and power over the

preceding M56A3 HEI round. However, recent combat operations have shown

significant tactical utility of the M56 ammunition in engaging soft targets in

non-paved/dirt environments where PGU ammunition tends to bury prior to

detonation. The 20-mm target practice (TP) ball projectile is a hollow steel body that

does not contain filler. This projectile is used for target practice. The M55 TP

projectile shape and ballistic characteristics are similar to the M56. The PGU-27 is

an improved TP round with ballistic characteristics similar to the PGU-28.

Aircraft

The F/A-18E/F Super Hornet is a twin engine strike fighter aircraft currently

configured with an M-61A2 Vulcan Cannon and either the APG-73 (mechanically

scanned array) or APG-79 (electronically scanned array) radar. The F/A-18E is the

single-seat variant and the F/A-18F is the dual-seat variant. In both variants, the

cannon is positioned in the nose of the aircraft, between the pilot and the radar.

Older Super Hornets operate via Systems Configuration Set (SCS) software

programmed in assembly language, referred to as legacy software. Newer Super

Hornets are upgraded to advanced mission computers, and operate via SCS

software programmed in C++, referred to as Higher Order Language (HOL). Both

legacy and HOL software utilize the same algorithm for the predictive gunsight,

which projects the Continuously Computed Impact Point (CCIP) solution in the

Head-Up Display (HUD). Figure 1 shows the gun location for the F/A-18E/F and

the F-22.

Figure 1 F/A-18E/F and F-22 Gun Location

Acoustic Scoring

The Precision Impact Range Area (PIRA) of the Edwards AFB range complex is

configured with an Improved Remote Strafe Scoring System (IRSSS), as seen in

Figure 2. This acoustic scoring system for supersonic projectiles consists of two

microphone arrays in front of the strafing target, a computer system hardwired to the

microphone arrays, a scoring console, and a weather station. The individual

microphones sense the shock wave of each individual passing supersonic projectile.

The computer triangulates the position over the two arrays, leading to the trajectory

and the calculated impact on the target as depicted in Figure 3 and Figure 4. The

scoring console compiles all the individual impacts and graphically presents the

information for each attack.
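As a rough illustration of the final step described above, the sketch below extends a straight line through the two points where a projectile crosses the microphone arrays and intersects it with the target plane. The coordinate frame, the array distances, the sample crossing points, and the function name `impact_on_target_plane` are all assumptions made for illustration; the actual IRSSS algorithm is not described in this paper.

```python
from typing import Tuple

# (x cross-range, y height, z down-range); units are arbitrary but consistent.
Point = Tuple[float, float, float]


def impact_on_target_plane(p1: Point, p2: Point, z_target: float) -> Tuple[float, float]:
    """Extend the straight line through p1 and p2 to the plane z = z_target."""
    x1, y1, z1 = p1
    x2, y2, z2 = p2
    if z2 == z1:
        raise ValueError("crossing points must lie on different down-range planes")
    t = (z_target - z1) / (z2 - z1)
    return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))


# Hypothetical crossing points at arrays 20 and 10 units in front of the target.
first_array = (1.0, 4.0, -20.0)
second_array = (0.8, 3.0, -10.0)
impact = impact_on_target_plane(first_array, second_array, z_target=0.0)
print(impact)  # approximately (0.6, 2.0)
```

A straight-line extrapolation ignores gravity drop and drag over the short distance between the arrays and the target, which is a simplification of this sketch rather than a statement about the scoring system.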

Figure 2 Improved Remote Strafe Scoring System on Edwards Range

Figure 3 Improved Remote Strafe Scoring System (IRSSS) Triangulation

Figure 4 Improved Remote Strafe Scoring System (IRSSS) Triangulation Hits Data

Previous F/A-18 strafe testing at China Lake treated each strafe run as a single data point, measuring the area of impact of all the bullets in an attack. The precise impact position of each bullet was not measured. Preparation for this test program revealed that the IRSSS capability was available a very short distance away, although in Air Force airspace. Collaboration between the Navy and Air Force opened up this much more precise tool for gathering hits data. Changing from each attack being a data point to each bullet being a data point effectively increased the number of data points ten-fold.

METHODOLOGY

The methodology followed a series of steps for implementing a Design of Experiments (DOE) approach (Montgomery, 2005; Hutto & Higdon, 2009). DOE was selected as a means of gaining the most statistical information from the fewest test points, which translates to the fewest sorties, while also gaining insight into interactions. First, the factors that could affect the observations were identified. Next, a flight test matrix of factors was designed, enforcing randomization where possible. The design was evaluated based on the computed statistical power, i.e. the ability to detect a difference. An assessment of aircraft configuration feasibility was then applied. External factors were identified and either controlled or measured. Lastly, sources of measurement error were investigated and mitigated to the maximum extent possible.

Factors

The first step was to identify the factors that could influence the observation,

which for this experiment was miss distance from the aimpoint. There is an

intentional distinction between the aimpoint and the target. The target was fixed as

the location of the electronic scoring system; however, the aimpoint was where the

aircraft computer predicted bullet impact. An assessment of target to aimpoint

distance would be an evaluation of pilot capability; however, this was a system test

so the measure of interest was how well the bullets struck the predicted target, i.e.

aimpoint. Through consultation with subject matter experts on the system, and with pilots

on strafing performance observations, four factors of interest were identified:

Ammunition (Ammo) – There were live and inert variants for both legacy

rounds and newer “low drag” rounds. The legacy rounds were M56 and M55

which were live and inert respectively. The “low drag” rounds were PGU-28

and PGU-27 which were live and inert respectively. In an attempt to be as

operationally representative as possible the live round variants were initially

pursued for testing. However, the live rounds posed multiple logistic issues

both for ground handling and range restrictions. Upon further ballistics

research it was concluded that the inert variants had nearly identical ballistics

and therefore would be a suitable, operationally representative choice of rounds;

thus, M55 and PGU-27 were the ammunition flown and tested.

Software – There were two software loads available (Daddis, 2007): the legacy code programmed in assembly language, and the subsequent HOL code programmed in C++.

BAAT source – Slant range was calculated via an algorithm referred to as Best Altitude Above Target (BAAT). BAAT in this test utilized either Radar (AGR) or Digital Terrain Elevation Data (DTED). An altitude error would create a miscalculation in the strafing guidance, producing a greater miss distance.

Radar – The two radars were the mechanically scanned array APG-73 and the Active Electronically Scanned Array (AESA) APG-79.

Table 1 General Factorial (3x2x2) Design, Single Replicate

Configuration   Ammo   BAAT   Radar/SW
1               A1     B1     RS1
2               A2     B1     RS1
3               A1     B2     RS1
4               A2     B2     RS1
5               A1     B1     RS2
6               A2     B1     RS2
7               A1     B2     RS2
8               A2     B2     RS2
9               A1     B1     RS3
10              A2     B1     RS3
11              A1     B2     RS3
12              A2     B2     RS3

Flight Test Matrix Design

The next step was to design a flight test matrix using Design of Experiments. Since each factor has two levels, this can be constructed as a 2^k full factorial design, where k = 4 since there are four factors. A fractional factorial design could be considered, e.g. 2^(k-1) using half the design; however, this would have decreased power and eliminated the ability to detect and de-conflict confounded two-way interactions. Also, before completing the design, it was determined that not every software load could be used with every radar. Therefore, one of the four radar/software combinations was eliminated, and the remaining three were treated as a single three-level Radar/SW factor. This left a general factorial design of 3x2x2. In discussion with the pilots, it was determined that about 8 passes could be achieved on any given sortie; therefore, with two configurations flown per sortie, each configuration could achieve at least 4 replicates. This was the final design: 12 configurations with 4 passes each. A single-replicate design is provided in Table 1.
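The design described above can be enumerated with a few lines of code. The following is a minimal sketch, not the tool used by the test team: it lists the 12 configurations in the same order as Table 1 and then groups the passes into sorties, assuming (as an illustration) that only BAAT is toggled in flight and that its order is shuffled pass by pass. The function names, dictionary keys, and random seed are hypothetical.

```python
import itertools
import random

AMMO = ["A1", "A2"]
BAAT = ["B1", "B2"]
RADAR_SW = ["RS1", "RS2", "RS3"]  # the three usable radar/software pairings
REPLICATES = 4


def single_replicate():
    """Return the 12 configurations in Table 1 order (Ammo varies fastest)."""
    return [
        {"config": i + 1, "ammo": a, "baat": b, "radar_sw": rs}
        for i, (rs, b, a) in enumerate(itertools.product(RADAR_SW, BAAT, AMMO))
    ]


def sortie_plan(seed=0):
    """One sortie per (radar_sw, ammo) pair, with BAAT order shuffled per pass."""
    rng = random.Random(seed)
    plan = []
    for rs, a in itertools.product(RADAR_SW, AMMO):
        passes = [b for b in BAAT for _ in range(REPLICATES)]
        rng.shuffle(passes)  # assumed per-pass randomization
        plan.append({"ammo": a, "radar_sw": rs, "baat_by_pass": passes})
    return plan


design = single_replicate()
plan = sortie_plan()
total_passes = sum(len(s["baat_by_pass"]) for s in plan)
print(len(design), "configurations,", total_passes, "passes")  # 12 configurations, 48 passes
```

Shuffling within each sortie is only one way to randomize; toggling BAAT between sets of passes would fit the same structure with a different shuffle step.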

The power of this design was compared to that of a simplistic one-factor-at-a-time (OFAT) design. The OFAT design is graphically presented in Figure 5.

Figure 5 One Factor At a Time (OFAT) Design

The red dots represent design points, whereas the smaller green dots represent replicates. As can be seen in Figure 5, there are 16 replicates at each node. This number was determined iteratively in order to achieve a power at or above that which could be achieved using DOE, i.e. the general factorial with 4 replicates as seen in Figure 7. The power for the OFAT design, as computed using Design Expert, is shown in Figure 6. The “1.25 Std. Dev.” column is the column of interest; it represents the power to detect a difference of 1.25 times the standard deviation, i.e. 25% greater than the noise in the observations. This is a strafing scenario, so finding an error at or within the anticipated dispersion of the bullets, or 1 standard deviation (“1 Std. Dev.”), was not of operational interest, hence the selection of the 1.25 ratio. Factors A and B are at two levels, so each can be represented as a single column coded at -1 or 1, e.g. Ammo 1 or Ammo 2. Factor C is the Radar/SW, which has 3 levels. Three levels must be coded with two columns, with combinations of 1/0, 0/1, and -1/-1 for RS1, RS2, and RS3 respectively. This is the reason for two coefficients being listed in Figure 6 but only one row of power value being given. For this case, the power of interest is 88.3%.
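To make the coding of the factors concrete, the sketch below builds one model row per configuration using the common sum-to-zero convention, in which the last level of a three-level factor is coded -1 in both columns. The convention, the dictionary names, and `model_row` are assumptions for illustration and are not taken from the Design Expert output in Figure 6.

```python
# Two-level factors need a single column coded -1 / +1.
TWO_LEVEL = {"A1": -1, "A2": 1, "B1": -1, "B2": 1}

# A three-level factor needs two columns (sum-to-zero coding, assumed here).
THREE_LEVEL = {"RS1": (1, 0), "RS2": (0, 1), "RS3": (-1, -1)}


def model_row(ammo, baat, radar_sw):
    """Return [intercept, A, B, C[1], C[2]] for one configuration."""
    c1, c2 = THREE_LEVEL[radar_sw]
    return [1, TWO_LEVEL[ammo], TWO_LEVEL[baat], c1, c2]


rows = [
    model_row(a, b, rs)
    for rs in ("RS1", "RS2", "RS3")
    for b in ("B1", "B2")
    for a in ("A1", "A2")
]

# In a balanced design every non-intercept column sums to zero.
column_sums = [sum(col) for col in zip(*rows)]
print(column_sums)  # [12, 0, 0, 0, 0]
```

The two columns C[1] and C[2] are why a single three-level factor contributes two coefficients to the model.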

Figure 6 Power for One Factor At a Time Design

This should be compared to the power of the DOE design in Figure 7, as seen in

Figure 8. The comparable power, as seen in the row for coefficient C[1], is 87%.

Figure 7 Full Factorial (3x2x2) with 4 Replicates

There are two compelling points which make the DOE, or full factorial, design superior to the OFAT design. Power is nearly equivalent, so it does not distinguish between the two designs. The first point is that, in comparing Figure 6 and Figure 8, the latter has 3 more power values listed. These additional power values reflect the fact that the DOE design can also identify significance in interaction terms, i.e. AB, AC, and BC, which are Ammo-BAAT, Ammo-Radar/SW, and BAAT-Radar/SW respectively. The multiple rows, or coefficients, per interaction are a consequence of the coding of the three levels for factor C.

Figure 8 Power for DOE Design with 4 Replicates

The other extremely important contrast between these two designs is the number of sorties required to acquire all of the design points. The number of test points is equal to the number of nodes times the number of replicates at each node. Therefore, the OFAT design has 16 replicates at 5 nodes, thus 80 test points. The DOE design has 4 replicates at 12 nodes, thus 48 test points. Due to the inability to change ammo or radar in flight, each of the 6 Ammo and Radar/SW combinations in the DOE design requires a dedicated sortie, but BAAT can still be changed between passes or sets of passes. This can be seen in Figure 7 as the bottom 3x2 plane at one BAAT level toggling to the top 3x2 plane at the other level. This toggling was randomized to prevent a steady increase in winds or temperature from affecting all of one setting of BAAT at the beginning of the mission versus the second setting of BAAT for the latter portion of the mission. Only 8 passes were assumed possible on any given mission. Thus, the OFAT design requires 10 flights, whereas the DOE design only requires 6 flights for the same power and more information, i.e. interaction effects. This is a 40% savings in sortie count from the original 10 flights; equivalently, it would have taken 67% more sorties had the OFAT design been chosen over the DOE design. The difference in percentage is a consequence of whether the 10 sorties or the 6 sorties is used in the denominator of the ratio.
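The sortie arithmetic above can be checked with a short calculation. This is only a worked example of the numbers quoted in the text (nodes, replicates, and 8 passes per sortie); the function name `sorties` is hypothetical.

```python
import math

PASSES_PER_SORTIE = 8


def sorties(nodes, replicates, passes_per_sortie=PASSES_PER_SORTIE):
    """Return (test points, sorties needed) for a design."""
    points = nodes * replicates
    return points, math.ceil(points / passes_per_sortie)


ofat_points, ofat_sorties = sorties(nodes=5, replicates=16)
doe_points, doe_sorties = sorties(nodes=12, replicates=4)

# Same difference of 4 sorties, expressed against two different denominators.
savings = (ofat_sorties - doe_sorties) / ofat_sorties
extra = (ofat_sorties - doe_sorties) / doe_sorties

print(ofat_points, ofat_sorties, doe_points, doe_sorties)  # 80 10 48 6
print(f"{savings:.0%} fewer sorties with DOE, {extra:.0%} more with OFAT")
```

The two percentages describe the same four-sortie difference; only the baseline changes.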

Influential Factors

It was also determined that other factors were present which could influence the results. The list of additional influential factors is:

Wind

Temperature

Pressure

Pilot

Aircraft

Cannon

Ammunition Lot

Slant Range

Dive Angle

Airspeed

It was acknowledged that some of these could be controlled via setup, while

others simply had to be measured on the test day for each test run. For example, dive

angle was performed in a repeatable manner to the maximum extent possible by

briefing and flying a prescribed dive profile, as seen in Figure 9.

Figure 9 Dive Angle Profile

Variables that could not be controlled via procedures were recorded. For

example, wind could not be controlled so it was recorded. If it had needed to be

eliminated, the sorties would have had to be wind placarded, meaning no flights

would have been attempted if the winds exceeded a certain limit. However, wind was

believed to be a contributing factor, so varying non-zero winds were advantageous

for statistical analysis. Clearly, if a factor does not change, its effect cannot be

evaluated.

Instrumentation and Measurements

There were three sources of measurements: the acoustic scoring system, the onboard data acquisition system (DAS), and head-up display (HUD) video. Each measurement source had challenges. The HUD video was used to determine the distance from target to aimpoint. The aimpoint was well defined as a symbol on the HUD; however, the target could not be well identified during earlier runs, as shown in Figure 10.

Figure 10 HUD Video of Initial Target

Figure 10 shows the difficulty in identifying the target. This was remedied by placing a parachute on the ground, aft of the target, such that the pilot's aim at the parachute would coincide with the target center of the acoustic scoring system. The parachute on the ground, and the corresponding HUD video with the parachute as the target, are shown in Figure 11.

Figure 11 Parachute on Ground Behind Acoustic Scoring System

The second measurement challenge was extracting the acoustic scoring

measurements from the graphical electronic deliverable, as shown in Figure 12.

Figure 12 Acoustic Scoring Deliverable

This is an excellent, and accurate, set of bullet hits; however, the position of the hits relative to the target (red box) was initially read manually, i.e. “eyeballed.” This resulted in excessive errors, as seen on the left side of Figure 13.

Figure 13 Reading Errors from Acoustic Graphic

The solution to this was to use a Matlab function called grabit, which was obtained from the MathWorks File Exchange. This function loads the graphic into a Matlab figure and allows the user to calibrate the mouse-to-figure distances by selecting extreme points in the x and y directions. Once calibrated, the data points could simply be clicked with the mouse, and a matrix of points in data units was saved to the workspace. The plot on the right side of Figure 13 shows the drastic reduction of error achieved by using grabit instead of the “eyeball” method.
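The calibration idea can be illustrated with a simple linear mapping per axis. The sketch below is written in Python and is not grabit's code; it only shows how two reference clicks per axis are enough to convert any clicked pixel into data units. All pixel positions, axis values, and the helper name `make_axis_map` are hypothetical.

```python
def make_axis_map(pixel_a, value_a, pixel_b, value_b):
    """Return a function that maps a pixel coordinate to data units on one axis."""
    if pixel_a == pixel_b:
        raise ValueError("calibration points must be at different pixels")
    scale = (value_b - value_a) / (pixel_b - pixel_a)
    return lambda pixel: value_a + (pixel - pixel_a) * scale


# Hypothetical calibration clicks on the extreme points of each axis.
to_x = make_axis_map(100, -10.0, 500, 10.0)  # pixel column 100 is x = -10
to_y = make_axis_map(400, 0.0, 50, 14.0)     # image rows increase downward

# Hypothetical clicks on two plotted hits, as (column, row) pixel pairs.
clicks = [(300, 225), (340, 200)]
hits = [(to_x(col), to_y(row)) for col, row in clicks]
print(hits)  # approximately [(0.0, 7.0), (2.0, 8.0)]
```

Because the mapping is fixed once the axes are calibrated, every click is read against the same scale.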

Figure 14 Matlab grabit Usage with Hits Data

The same data extraction advantage could be used on the HUD video, as seen in

Figure 15.

Figure 15 Matlab grabit Usage with Aimpoint Data

Understanding the DAS parameters was another challenge. Simply understanding which signals were available, and how they could be used to back out measurements such as winds, was formidable. Obtaining a sample set of DAS outputs well before the flight test would have been beneficial.

ANALYSIS AND RESULTS

The two observations of most interest were the cross track and long track errors in strafing hits; these were called Miss X and Miss Y, respectively. The wind was resolved into a cross track component and a long track component, each of which was input as a factor for analysis.
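Resolving the wind into components is a standard trigonometric step. The sketch below assumes the wind is reported as the direction it blows from and that the attack heading is known; the sign conventions (positive cross track for wind from the right, positive long track for a headwind), the units, and the function name `wind_components` are assumptions, since the paper does not state the conventions used.

```python
import math


def wind_components(wind_from_deg, wind_speed, attack_heading_deg):
    """Return (cross_track, long_track) wind relative to the attack heading.

    cross_track > 0: wind from the right of the attack heading (assumed sign).
    long_track  > 0: headwind (assumed sign).
    """
    relative = math.radians(wind_from_deg - attack_heading_deg)
    cross_track = wind_speed * math.sin(relative)
    long_track = wind_speed * math.cos(relative)
    return cross_track, long_track


# Hypothetical case: wind from 030 at 12 knots, attack heading 360.
cross, along = wind_components(30.0, 12.0, 360.0)
print(round(cross, 2), round(along, 2))  # approximately 6.0 and 10.39
```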

Analysis

In Figure 16 it can be seen that BAAT, Radar/SW, and both components of Wind are significant. A p-value of less than 0.05 indicates that the null hypothesis (a coefficient value of 0) can be rejected; the coefficient is therefore considered non-zero.

Figure 16 ANOVA for Miss X Observation

Figure 17 shows that Radar/SW, Wind, and Wind interaction are significant;

however, BAAT is not significant.

Figure 17 ANOVA for Miss Y Observation

A graph of the effects, as shown in Figure 18, is used to better interpret these results of significance. The x-axis represents the crosswind factor, the different colored lines represent the Radar/SW factor, and the y-axis represents the Miss X observation. Each line clearly shows a slope along the x-axis, and the magnitude of the change from left to right exceeds the magnitude of the error bars at the ends of each line; therefore, the magnitude of the effect (change in y-value from left to right) exceeds the noise and is thus significant, as indicated by the p-value. Of further interest is that the 0 crosswind point roughly coincides with a 0 Miss X distance. Furthermore, the change in magnitude exceeds the lines depicting “Operational Significance”. This implies that the effect is not only statistically significant, as described earlier, but also operationally significant; that is, the pilots actually care if they miss by that extreme an amount.

Figure 18 shows that the Radar/SW (C: RS) is also statistically significant. This can be seen by the difference between the lines exceeding the height of the error bars. These results are statistically significant; however, the change in Miss X is small and may not be of operational concern.

The observation that the lines are parallel in Figure 18 and Figure 19 shows that

there is no interaction between Wind and Radar/SW. If there had been a significant

interaction the lines would have crossed, or had varying slopes, rather than being

parallel.

Figure 18 Interaction Plot for Miss X Observation

Figure 19 shows the interaction plot with the Long Track wind factor on the x-axis, the Radar/SW as the different lines, and the Miss Y observation along the y-axis. The magnitude of the effect from left to right exceeds the height of the error bars, so this matches the ANOVA conclusion of Long Track Wind (factor F in Figure 17) being significant. In contrast to Miss X, even though Long Track Wind has a statistically significant effect according to the ANOVA, it does not have “Operational Significance”. Thus, there is a statistically significant difference, but it simply does not matter to the pilot during strafing runs.

Figure 19 Interaction Plot for Miss Y Observation

The next interaction plot, in Figure 20, shows a graphic representation of results that are not statistically significant. In this case, BAAT is the factor along the x-axis. There is not an appreciable change in Miss Y (along the y-axis) with a change in BAAT, as demonstrated by the fact that the change in Miss Y magnitude as BAAT varies from left to right does not exceed the height of the error bars at the end of the lines. Furthermore, the change in Miss Y does not exceed the “Operational Significance” lines; therefore, BAAT has no significant effect on Miss Y.
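The distinction drawn in this section between statistical and operational significance can be summarized as two separate checks. The sketch below is an illustration only: the 0.05 threshold comes from the text, while the operational limit, the effect sizes, the p-values, and the function name `classify_effect` are hypothetical and are not results from this test.

```python
def classify_effect(p_value, effect, operational_limit, alpha=0.05):
    """Label an effect by statistical and operational significance."""
    statistical = p_value < alpha
    operational = abs(effect) > operational_limit
    if statistical and operational:
        return "statistically and operationally significant"
    if statistical:
        return "statistically significant only"
    if operational:
        return "operationally large but not statistically supported"
    return "not significant"


# Hypothetical values: effect is the change in miss distance across the factor range.
examples = {
    "large effect, small p": classify_effect(0.001, 9.0, operational_limit=5.0),
    "small effect, small p": classify_effect(0.010, 1.5, operational_limit=5.0),
    "small effect, large p": classify_effect(0.400, 0.5, operational_limit=5.0),
}
for name, label in examples.items():
    print(name, "->", label)
```

Only effects that pass both checks would be expected to matter to the pilot.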

Figure 20 Interaction Plot for Miss Y Observation and BAAT as a Factor

Results

One rather unanticipated result was the lack of hits on one of the first runs. The

first culprit was believed to be the acoustic scoring system, but it was nearly

unfathomable that it would not account for nearly half the bullets. Figure 21 shows

that the issue was that the gun was tearing itself apart. It was rather fortuitous that

the test team analyzed the data between test points to ascertain that there was an

issue, which was subsequently brought to the attention of maintenance and

resolved.

Figure 21 Gun Fragments after Firing Damage

The ultimate result of this analysis was a statistically defensible confirmation that wind has an effect on miss distance. This led to a detailed analysis of the underlying software. In gaining a better understanding of the software, now particularly focused on the effects of winds, it was determined that winds were not being properly accounted for in the strafing solution. This discovery has led to recommended improvements, which are anticipated to receive high priority for implementation.

CONCLUSIONS

Several factors were both statistically significant and operationally significant. This led to more in-depth discussions with the contractor, which uncovered a misunderstanding of the system capability. The software in use was not compensating for winds; however, all of the required measurements and predictive algorithms were in place. These statistically defensible results led to a recommendation to improve the strafing algorithm to better account for surface winds. Furthermore, this validated the operational allegations of quantifiable strafing errors that had been contended for years. It was the application of a scientific approach with established statistical analysis that led to the affirmation of systemic errors which could be remedied for future fighter operations.

LESSONS LEARNED

Carefully design the flight test experiment. This led to the correct number of

flights and passes to ensure enough power in the statistical results to discern

a difference.

Perform data analysis between sorties. This uncovered the inability to identify

the target within the HUD video. This could have alternatively been confirmed,

with fewer assets in place, during a practice mission.

Understand the instrumentation. Whether winds were available within the DAS, and how best to extract the acoustic scoring hits, could have been determined well before flight test, thus avoiding any delays in analysis and, ultimately, in decisions for improvements.

Create a sample solution prior to flight test. This ensures all instrumentation is

in place. It can also identify whether additional measurements are required, e.g., a

secondary source of winds (reported versus aircraft computed).

Run a rehearsal flight, which doesn’t require all assets, prior to the fully

scheduled flight date. Train like you fight, or in this case, practice how you’re

going to test. For this test, rehearsals were conducted in the simulator with all

involved test pilots, but an actual rehearsal flight would have helped more.

Widen your field of regard when looking for available resources. USAF assets,

traditionally not tapped into during Navy flight test, included the acoustic

scoring system and the Edwards AFB statistics office. The quality of

information gathered during the test effort would have been significantly lower

without those resources.
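
The first lesson, sizing the test so that the results have enough power to discern a difference, can be explored before flight with a simple Monte Carlo simulation. The following is a minimal sketch under stated assumptions, not the method used for this test: miss distance is assumed to be normally distributed with a common standard deviation `sigma`, the two groups are the two levels of a single factor, and a normal (z) approximation is used in place of a t-test, which slightly overstates power for small samples. The function name and all parameter values are hypothetical.

```python
import random
from statistics import NormalDist, mean, variance

def estimate_power(n_per_level, effect, sigma, alpha=0.05, trials=2000, seed=1):
    """Monte Carlo estimate of the probability of detecting a mean shift.

    n_per_level: passes flown at each of the two factor levels (at least 2)
    effect:      assumed true difference in mean miss distance
    sigma:       assumed pass-to-pass standard deviation of miss distance
    """
    rng = random.Random(seed)
    # Two-sided critical value from the normal approximation.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    detections = 0
    for _ in range(trials):
        low = [rng.gauss(0.0, sigma) for _ in range(n_per_level)]
        high = [rng.gauss(effect, sigma) for _ in range(n_per_level)]
        # Standard error of the difference between the two sample means.
        std_err = ((variance(low) + variance(high)) / n_per_level) ** 0.5
        z = (mean(high) - mean(low)) / std_err
        if abs(z) > z_crit:
            detections += 1
    return detections / trials

# Compare candidate sample sizes for a one-sigma shift (hypothetical values).
for n in (4, 8, 16, 32):
    print(n, estimate_power(n, effect=1.0, sigma=1.0))
```

Running the loop for several candidate sample sizes shows how quickly the chance of detecting the assumed shift grows with the number of passes, which is the trade-off the experiment design has to settle before flights are scheduled.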

ACRONYMS, ABBREVIATIONS, SYMBOLS

BAAT Best Altitude Above Target

DOE Design of Experiments

DTED Digital Terrain Elevation Data

HOL Higher Order Language (C++)

RS Radar/Software

Std. Dev. Standard Deviation

USMC United States Marine Corps

USN United States Navy

FUTURE WORK

New requirements are in work for future Software Configuration Sets

to better account for winds in the strafing algorithm. Spiral development is the

model in use for F/A-18 capability improvements. Once changes have been made, a

similar approach will be executed to evaluate the effectiveness of those software

changes.

WORKS CITED

Gaddis. (2007, April 24). F/A-18 & EA-18G Program: Capable, Affordable & Joint

interoperable ... Today & Tomorrow. (PEOTACAIR) Retrieved from

http://www.dtic.mil/ndia/2007psa_apr/gaddis.pdf

Hutto, G. T., & Higdon, J. M. (2009). Survey of Design of Experiments (DOE)

Projects CY07-08. US Air Force T&E Days Conference. Albuquerque, NM.

Montgomery, D. C. (2005). Design and Analysis of Experiments. John Wiley & Sons,

Inc.

Remund, T., & Kitto, W. (n.d.). Monte Carlo Techniques for Estimating Power in

Aircraft T&E Tests. International Test and Evaluation Association–2011

Annual ITEA Technology Review--Technology for Rapid Acquisition and Test.

Annapolis, MD.

Roberts, D. (2012). Improved Remote Strafe Scoring System (IRSSS). Corona, CA:

NAVSEA Surface Warfare Division.

Waddell Jr., E. T., & Jorris, T. (2008). Filtering and Statistics of Flight Test

Instrumentation Data. Society of Flight Test Engineers 39th Annual

International Symposium. Fort Worth, TX.

Watson, R., & McCarley, Z. (2013). Report of Test Results: A/G Gun Precision &

Accuracy Testing of the F/A-18E/F with SCS 25X and SCS H8E. China Lake,

CA: Naval Air Warfare Center Weapons Division.

BIOGRAPHIES

Maj Zachary A. McCarley received a

B.S. in Aerospace Engineering from

Georgia Tech, an M.S. in Aeronautical

Science from Embry-Riddle, and an M.S.

in Flight Test Engineering from the USAF

Test Pilot School, as an Experimental

Test Pilot. One of his most notable

achievements was remembering his

extremely astute Statistics and Advanced

Data Analysis Instructor (see right). This

singular achievement is what ultimately

launched this monumental statistical

analysis paper. Without the tenacity of

Maj McCarley to relentlessly solve the

plaguing strafing error conundrum, this

detriment to attack effectiveness would

have persisted unabated for potentially

years. He is a true credit to that other

sister organization, admirably referred to

as the Marines. Semper Fi!

Lt Col Tim Jorris received a B.S. and

M.S. in Aerospace Engineering from the

University of California, Los Angeles

(UCLA); and his Ph.D. in Astronautical

Engineering from the Air Force Institute of

Technology (AFIT). He worked at the

Munitions Lab at Eglin, is a US Air Force

Test Pilot School (USAF TPS) graduate,

and worked on the F-15I, B-1, B-52,

Global Hawk, X-40A, X-37, X-33, X-51,

and Dream Chaser as a flight test

engineer at Edwards AFB, CA. He has

served as an Instructor Flight Test

Engineer at USAF TPS and an Adjunct

Professor for AFIT. He is currently the

Director of the Hypersonic Combined

Test Force at the 412th Test Wing,

Edwards AFB, CA.