
CAN UNCLASSIFIED

Calculation software for ballistic limit and Vproof tests from ballistic tests results (BLC)

Daniel Bourget Manon Bolduc DRDC – Valcartier Research Centre

The body of this CAN UNCLASSIFIED document does not contain the required security banners according to DND security standards. However, it must be treated as CAN UNCLASSIFIED and protected appropriately based on the terms and conditions specified on the covering page.

CAN UNCLASSIFIED

July 2020

DRDC-RDDC-2020-R056

Scientific Report

Defence Research and Development Canada

CAN UNCLASSIFIED

Template in use: EO Publishing App for SR-RD-EC Eng 2018-12-19_v1 (new disclaimer).dotm © Her Majesty the Queen in Right of Canada (Department of National Defence), 2020

© Sa Majesté la Reine en droit du Canada (Ministère de la Défense nationale), 2020

CAN UNCLASSIFIED

IMPORTANT INFORMATIVE STATEMENTS

This document was reviewed for Controlled Goods by Defence Research and Development Canada (DRDC) using the Schedule to the Defence Production Act.

Disclaimer: This publication was prepared by Defence Research and Development Canada an agency of the Department of National Defence. The information contained in this publication has been derived and determined through best practice and adherence to the highest standards of responsible conduct of scientific research. This information is intended for the use of the Department of National Defence, the Canadian Armed Forces (“Canada”) and Public Safety partners and, as permitted, may be shared with academia, industry, Canada’s allies, and the public (“Third Parties”). Any use by, or any reliance on or decisions made based on this publication by Third Parties, are done at their own risk and responsibility. Canada does not assume any liability for any damages or losses which may arise from any use of, or reliance on, the publication.

Endorsement statement: This publication has been peer-reviewed and published by the Editorial Office of Defence Research and Development Canada, an agency of the Department of National Defence of Canada. Inquiries can be sent to: [email protected].


Abstract

This document provides scientific information on the mathematical and statistical methods used by the

BLC (Ballistic Limit Calculator) Beta Version 2 program to generate ballistic limit and Vproof data and

their related statistics based on experimental ballistic data.

The BLC software enables the analysis of ballistic data using 5 different statistical models (namely, the

Probit, Logit, Gompit, Scobit and Weibull models) that, at least for the Probit and Logit models, follow

the data analysis procedures of NATO STANAG 2920 and NIJ 0101.06 standards. In addition to the

model parameters, BLC calculates standard errors and confidence limits on the model parameters, on the

ballistic limit, on the standard deviation and on the Vproof. Furthermore, a series of statistical tests

enables the comparison of the different models and the assessment of their significance and goodness of fit.

This document is also the user manual of the BLC Beta version 2 program. As such, this document

provides details on the nature and meaning of input and output information and presents its different

features.

Significance to defence and security

This document describes the use and scientific foundation of the Ballistic Limit Calculator (BLC)

software. The BLC provides straightforward statistical analysis of raw ballistic data to determine the

protection level provided by armour against specific projectiles.


Résumé

Ce document fournit des informations scientifiques sur les méthodes mathématiques et statistiques

utilisées par le programme BLC (Ballistic Limit Calculator) Beta Version 2 pour générer des données de

limite balistique et Vproof et leurs statistiques associées basées sur des données balistiques

expérimentales.

Le logiciel BLC permet l'analyse de données balistiques à l'aide de 5 modèles statistiques différents (à

savoir, les modèles Probit, Logit, Gompit, Scobit et Weibull) qui, au moins pour les modèles Probit et

Logit, suivent les procédures d'analyse des données des normes OTAN STANAG 2920 et NIJ 0101.06. En plus des paramètres du modèle, BLC calcule les erreurs

standards et les limites de confiance sur les paramètres du modèle, sur la limite balistique, sur l'écart type

et sur le Vproof. En outre, une série de tests statistiques permet la comparaison des différents modèles et

permet d'évaluer leur signification et leur qualité d'ajustement.

Ce document est également le manuel d'utilisation du programme BLC Beta version 2. À ce titre, ce

document fournit des détails sur la nature et la signification des informations d'entrée et de sortie et

présente ses différentes caractéristiques.

Importance pour la défense et la sécurité

Ce document décrit l'utilisation et les fondements scientifiques du logiciel Ballistic Limit

Calculator (BLC). Le BLC fournit une analyse statistique simple des données balistiques brutes

pour déterminer le niveau de protection fourni par l'armure contre des projectiles spécifiques.


Table of contents

Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . i

Significance to defence and security . . . . . . . . . . . . . . . . . . . . . . . . . i

Résumé . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii

Importance pour la défense et la sécurité . . . . . . . . . . . . . . . . . . . . . . . ii

Table of contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii

List of figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v

List of tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vi

1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.1 General . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.2 Ballistic tests for which this software can be used . . . . . . . . . . . . . . . . 1

2 User manual . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

2.1 How to use the BLC . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

2.2 Execution warnings . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

2.3 Execution errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

3 Ballistic limit calculation procedures . . . . . . . . . . . . . . . . . . . . . . . 8

3.1 General . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

3.2 Link Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

3.2.1 Probit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

3.2.2 Logit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

3.2.3 Gompit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

3.2.4 Scobit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

3.2.5 Weibull . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

3.2.6 Summary of link functions . . . . . . . . . . . . . . . . . . . . . 10

3.3 Fitting Process Calculations . . . . . . . . . . . . . . . . . . . . . . . 11

3.3.1 General . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

3.3.2 Maximum Likelihood Estimation (ML) . . . . . . . . . . . . . . . . 12

3.4 Confidence Interval and Standard Error Calculation . . . . . . . . . . . . . . 13

3.4.1 General . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

3.4.2 Standard Error Estimation . . . . . . . . . . . . . . . . . . . . . 13

3.4.3 Confidence interval curves . . . . . . . . . . . . . . . . . . . . . 14

3.4.3.1 Normal error distribution confidence interval . . . . . . . . . . . 14

3.4.3.2 Binomial error distribution confidence interval . . . . . . . . . . 15

4 Diagnostic tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

4.1 General . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

4.2 Information on the data sample . . . . . . . . . . . . . . . . . . . . . . 17

4.2.1 Sample Statistics section of the “Results 1” sheet . . . . . . . . . . . . . 17

4.2.2 Box Plot sheet. . . . . . . . . . . . . . . . . . . . . . . . . . 18


4.3 Tests on validity of the data . . . . . . . . . . . . . . . . . . . . . . . 19

4.3.1 Criteria for validity of dependent variable value independency . . . . . . . . 20

4.3.1.1 MONOBIT Test . . . . . . . . . . . . . . . . . . . . . 20

4.3.1.2 RUNS Test . . . . . . . . . . . . . . . . . . . . . . . 21

4.3.1.3 Cumulative Sum Test . . . . . . . . . . . . . . . . . . . 21

4.4 The Goodness-of-Fit and Significance of the model . . . . . . . . . . . . . . 22

4.4.1 Significance of the parameters and independent variables . . . . . . . . . . 22

4.4.1.1 Likelihood Ratio Test . . . . . . . . . . . . . . . . . . . 22

4.4.1.2 Wald Test . . . . . . . . . . . . . . . . . . . . . . . . 23

4.4.2 Goodness-of-Fit Tests . . . . . . . . . . . . . . . . . . . . . . . 24

4.4.2.1 Anderson-Darling Test . . . . . . . . . . . . . . . . . . . 24

4.4.2.2 Sensitivity, Specificity and area under the ROC curve (AUC) . . . . . 26

4.4.2.3 The Stukel Test . . . . . . . . . . . . . . . . . . . . . . 28

4.4.2.4 Pseudo-R2 . . . . . . . . . . . . . . . . . . . . . . . . 28

4.4.2.5 Information Criterion (IC): General considerations. . . . . . . . . 29

4.4.2.6 The small sample size corrected AIC . . . . . . . . . . . . . . 31

4.4.2.7 Small sample size correction using the Bootstrap method (EIC) . . . . 31

5 Results description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

5.1 Summary data (“Results 1” sheet) . . . . . . . . . . . . . . . . . . . . . 33

5.1.1 Experimental data sample statistics . . . . . . . . . . . . . . . . . . 33

5.1.2 Independency of dependent variable . . . . . . . . . . . . . . . . . 33

5.1.3 Parameter values and statistics. . . . . . . . . . . . . . . . . . . . 34

5.1.4 Significance of the model . . . . . . . . . . . . . . . . . . . . . 34

5.1.5 Goodness-of-fit of the model . . . . . . . . . . . . . . . . . . . . 35

5.1.6 Data analysis based on STANAG 2920 . . . . . . . . . . . . . . . . 36

5.1.7 Confidence Interval at V50 based on Normal or Binomial error distribution . . . 37

5.1.8 Vproof statistics . . . . . . . . . . . . . . . . . . . . . . . . . 38

5.2 Experimental data points and confidence limit curves (“Results 2” sheet) . . . . . . 39

5.3 Other sheets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

5.3.1 Box Plot sheet. . . . . . . . . . . . . . . . . . . . . . . . . . 39

5.3.2 Proofing Statistics sheet . . . . . . . . . . . . . . . . . . . . . . 40

5.3.3 Graphical output sheets . . . . . . . . . . . . . . . . . . . . . . 41

5.3.4 ROC sheet . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

6 Data analysis example . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

6.1 General . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

6.2 Determination of the correct model . . . . . . . . . . . . . . . . . . . . . 42

6.2.1 Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

6.2.2 Correct model determination: An example . . . . . . . . . . . . . . . 44

6.3 Ballistic limit and Vproof analysis . . . . . . . . . . . . . . . . . . . . . 46

7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48

References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49


List of figures

Figure 1: The “Main” sheet. ................................................................................................................. 3

Figure 2: The “Navigation Menu”. ...................................................................................................... 4

Figure 3: Screen during code execution. .............................................................................................. 5

Figure 4: Density probability (left) and cumulative probability (right) distribution curves for the

Probit, Logit, Gompit, Scobit and Weibull distributions. For the Scobit and the Weibull

distributions, curves for 2 different shape parameter values are shown

(δ = 0.5 and δ = 2.0). ........................................................................................................... 10

Figure 5: Box plot of a symmetric sample distribution (left side, calculated skewness = -0.19)

and a positively-skewed sample distribution (right side, calculated skewness = +1.51). ... 19

Figure 6: Example of an ROC curve. ................................................................................................. 26

Figure 7: Illustration of V50 statistics. In this example, STANAG 2920 maximal and minimal

V50 values are located within the normal error distribution bracket. This is not necessarily

always the case. .................................................................................................................. 37

Figure 8: Schematic illustrating Vproof statistics with an example for 10% probability of

perforation. The fitted cumulative function is presented in red, the lower confidence

limit is presented in purple and the upper confidence limit is presented in green. ............ 39

Figure 9: Example of perforation probability fitted curve for normal error (left) and binomial

error (right) distribution for 95% confidence level. ............................................................ 41

Figure 10: Decision process to determine best model. ......................................................................... 44

Figure 11: Curve fit statistics data for the Logit, Probit and Scobit models. ....................................... 46



List of tables

Table 1: Summary of link function characteristics and ballistic limit formula. . . . . . . . 11

Table 2: List of assumptions for OLS versus GLM fitting process. . . . . . . . . . . . . 19

Table 3: Power level of the AD test for normality for symmetric and asymmetric distributions

with varying skewness (Sk), kurtosis (Ku), Type I error level (α) and number of data

points in the distribution (n). Data from [28]. . . . . . . . . . . . . . . . . . 25

Table 4: Classification table for a predetermined threshold value. . . . . . . . . . . . . 27

Table 5: ROC AUC qualitative level of discrimination . . . . . . . . . . . . . . . . 35

Table 6: Example of the number of perforation allowed (Allowed Nb of failure) and total

number of shots to fire (Nb of shot required) to test 10% probability of perforation

(V10%) with a confidence level of 95% (CL = 95%). . . . . . . . . . . . . . . 40

Table 7: Variables of interest for Ballistic Limit and Vproof data analysis in “Result 1”

Worksheet. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46



1 Introduction

1.1 General

Armour is an integral part of the Personal Protective Ensemble (PPE) provided to soldiers and Law Enforcement personnel. Because they are life-saving devices, their performance is strictly controlled by standards and test procedures. Different national and international bodies manage these standards (e.g.,

NATO [1][2], National Institute of Justice, NIJ [3]). Because armour materials and capabilities are evolving, as is our understanding of their interaction with the human wearing them, standards are continually evolving. One recent evolution concerns the number of ballistic impacts used to assess

ballistic limit performance of armours and the use of ballistic data fitting techniques to model the

probability of perforation. In addition, requirements to specify confidence intervals for safe protection

impact velocity value (Vproof) have appeared in different NATO standards. Due to the extensive

mathematical process related to these new ways of assessing armour performance and new requirements,

a Microsoft Excel ballistic data analysis package called BLC (Ballistic Limit Calculator) was created.

This document provides scientific information on the mathematical and statistical methods used by the

BLC Beta version 2 program to generate ballistic limit and Vproof data and their related statistics based

on experimental ballistic data. This document is also the user manual of the BLC Beta version 2 program.

Therefore, this document addresses the needs of technicians, engineers and scientists on the capability,

procedures, scientific background, usage and caveats of BLC with the goal of providing enlightened

assessment of armour performance.

1.2 Ballistic tests for which this software can be used

The MS Excel “BLC Beta version 2.xls” file can be used to calculate the ballistic limit value and the

Vproof statistics based on experimental ballistic data. These two tests are the most common ballistic

control tests.

Ballistic limit (V50) tests: Ballistic limit is the velocity at which the probability of perforation of an

armour is 50%. That velocity is calculated, using a statistical procedure, for a specific armour and a

specific threat. The ballistic limit is often used during armour production for controlling the

performance variability between batches (combination of material and manufacturing variabilities)

and maintaining the quality of the armour material. The ballistic limit is calculated based on a series

of shots fired at an armour following a pre-defined impact pattern for a range of impact velocity

above and below the V50 value for which perforation (1) or non-perforation (0) are recorded. The

sequence of impact velocity follows an up and down procedure described in STANAG 2920 ([1],

[2]) and NIJ 0101.06 [3]. The ballistic limit can also be considered as the upper limit of protection

where the probability of perforation is 50% in comparison to the Vproof test where the probability

of perforation is lower.

Vproof tests: It is also called a ballistic resistance test. The Vproof is a PASS/FAIL test for which a certain number of shots is fired at a fixed velocity and for which a certain number of penetrations through the armour is allowed. These two values are determined by the probability of perforation (e.g., 5%) at which the test needs to be done and by the confidence level (e.g., 95%)


that the probability of perforation is real. In BLC, Vproof statistics are calculated based on the

perforation/no perforation test results of ballistic limit tests.
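For illustration only (this is not the BLC routine, and the helper name below is hypothetical), the relationship between the Vproof probability, the confidence level, the number of shots to fire and the number of perforations allowed can be sketched with the cumulative binomial distribution, assuming the usual attribute-sampling rule:

```python
# Sketch, assuming a standard attribute-sampling rule: smallest number of
# shots n such that an armour with true perforation probability p would pass
# the test (at most k perforations observed) with probability <= 1 - CL.
from scipy.stats import binom

def shots_required(p, cl, k):
    """Hypothetical helper: smallest n with P(X <= k | n, p) <= 1 - cl."""
    n = k + 1
    while binom.cdf(k, n, p) > 1.0 - cl:
        n += 1
    return n

# Example: testing a 10% perforation probability with a 95% confidence level,
# allowing 0, 1, 2 or 3 perforations.
for k in range(4):
    print(k, "perforations allowed:", shots_required(p=0.10, cl=0.95, k=k), "shots")
```

The "Proofing Statistics" sheet described in Section 5.3.2 and Table 6 report this kind of information for the selected Vproof probability and confidence level.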


2 User manual

2.1 How to use the BLC

To use BLC Beta version 2.xls, do the following steps:

1. Load the BLC Beta version 2.xls file into Excel by double clicking on the file icon.

2. Once loaded, the program displays the “Main” page where inputs can be entered and a floating

“Navigation Menu” on top of it. The “Navigation Menu” cannot be closed, but it can be minimized

and placed anywhere on the screen for further recall.

3. Following is a description of the “Main” sheet and “Navigation Menu” features

Figure 1: The “Main” sheet.

a. The “Main” sheet (Figure 1):

i. In the “Main” sheet, enter your experimental ballistic data in the order in which they

were fired (see Section 4.3.1):

a) Column A: Striking velocities (VS).

b) Column B: Response. For a complete perforation of the target, the value entered

can be either 1, “Y”, “y” or “CP”. For a partial perforation, the value entered can

be either 0, “N”, “n” or “PP”.

c) Column C: Whether or not you want the data point to be used in the ballistic limit and

Vproof calculation. If you want to use the data point, enter a 1. If you do not

want to use the data point, enter 0. This feature was added to allow flexibility for

removing or adding data points.


d) It is possible to enter up to 500 data points. Copy/paste can be used to insert data

in the different columns

ii. Select the Error distribution shape. For this version of the software, it can be either

Normal or Binomial.

iii. Select the confidence level for error distribution (value between 0.001 and 0.999) for

the V50 and Vproof confidence interval calculation.

iv. Select the Vproof probability value (value between 0.001 and 0.999). Vproof

probability is the perforation probability for which Vproof statistics are calculated by

the program.

v. Select the statistical test confidence level (value between 0.001 and 0.999). This value

will be used by the program as the threshold probability for the statistical inference

tests.

Figure 2: The “Navigation Menu”.

b. Navigation in the software

i. It is possible to navigate in the program by using the Excel tabs located at the bottom

of the screen.

ii. It is also possible to navigate in the program by using The “Navigation Menu”

(Figure 2). It is divided as follows:

a) Navigation (Red area): It is used to navigate from any page in the Excel file

to the 3 most frequently used pages, that is the “Main” sheet, the “Results 1”

sheet and the “Results 2” sheet by using the respective buttons.


b) File Handling (Green area): It is used to handle the Excel file by either saving it,

(“Save” button), saving it under another name (“Save File As” button) or closing

the file (“Close File” button)

c) Select Pages to Copy (Blue area): It is used to select one or more result sheets of

the Excel file and copy them into another Excel Workbook for further use. The

name of the copied sheets in the new Workbook will be kept as they are but a

number will be added to the sheet’s name to distinguish between different

analyses.

i) The different sheets to be copied can be selected individually by using the check boxes next to their names

ii) All sheets can be selected or unselected using the “Select All Pages” or

the “Unselect All Pages” buttons

iii) Sheets can be either copied to a new workbook by using the “Copy

Selected Pages to New Workbook” button or to an already existing

workbook by using the “Copy Selected Pages to Existing Workbook”

button.

d) Calculate Ballistic Limit (Yellow area): It is used to start a new calculation from

anywhere in the Excel file by clicking on the “Calculate Ballistic Limit” button. When

this is done, the “Main” sheet will become active and the code execution begins.

Figure 3: Screen during code execution.

4. Code execution:

a. Click on the large grey “Calculate Ballistic Limit” button on the “Main” sheet, or

alternatively, click on the “Calculate Ballistic Limit” button situated on the “Navigation

Menu”


b. The program will ask to select the range of cells containing the experimental data that were

entered on the “Main” sheet. Data must be selected in columns A, B and C.

c. After you have selected the range, click OK

d. If there are issues with the data, the program will display warning messages. See Section 2.2

and 2.3 on execution warnings and execution errors.

e. During execution, the program stays on a blank blue page and displays a progress bar.

(Figure 3). Be patient. A typical analysis of 12 to 25 data points takes about 50 to 60 seconds; an analysis of 100 data points takes about 80 seconds.

f. At the end of the execution, the progress bar displays a “Done” button. Pressing enter or

clicking on the “Done” button will close the progress bar and the “Navigation Menu” will

reappear.

2.2 Execution warnings

The program may display warnings in the circumstances described below. For some warnings (Termination

Errors), the user will have to click the “Ok” button, check data on the “Main” sheet and click again on the

“Calculate Ballistic Limit” button to start the calculation again.

For some others (Warning to the User), the User can either click on the “Cancel” button in which case

he/she will check data on the “Main” sheet and click again on the “Calculate Ballistic Limit” button to

start the calculation again, or the User can click on the “Retry” button in which case the calculation

continues using the same data.

1. Termination Errors:

a. If more than three columns were selected for the VS/Response/Calculation data.

b. If fewer than two rows were selected for the VS/Response/Calculation data.

c. If a value of VS in the VS/Response list is less than or equal to 0.

d. If a value of Response in the VS/Response list is different from 0, “n”, “N” and “PP” or

different from 1, “y”, “Y” and “CP”.

e. If the selected values of Response contain only 0 or only 1 values.

f. If the Confidence Level for Error Distribution, the Vproof probability value or the

Statistical Tests Confidence Level values entered are not between 0 and 1 (0 and 1 excluded).

2. Warning to the User

a. If the number of partial perforations is more than 0 and less than 3

b. If the number of complete perforations is more than 0 and less than 3


c. If the highest VS value is not a complete perforation

d. If the lowest VS value is not a partial perforation

2.3 Execution errors

Although it has been extensively tested, the BLC software can still present errors during runtime. Usually,

when such an error occurs, a message is displayed on the screen starting with “Error in” followed by the

name of a function or a subroutine. An “Ok” button also appears on the message. When clicked, the “Ok” button will resume the code execution.

In case such errors occur, please contact the author of this document. Please also provide the error

message along with all the data that was used in the “Main” sheet.

The author’s contact information is:

By e-mail at [email protected], or [email protected]

By phone at (001) 418-844-4000 ext 4228.


3 Ballistic limit calculation procedures

3.1 General

The V50 is the convergence value, from the data produced by the up and down technique, around the velocity at which 50% of the shots penetrate and 50% are stopped by the armour system. The procedure to evaluate the V50 is well described in the STANAG 2920 (Edition 3) [1] and [2] and the NIJ 0101.06

[3]. The up and down technique is used to minimize the number of shots while finding the convergence

point. For each impact velocity, the status of the shot (either penetration or no penetration) is recorded.

That type of dependent variable is said to be binary and consequently, different type of “link functions”

can be used to model the probability of armour perforation.

In the current version of BLC, 5 different link functions are used: the Probit distribution, the Logit

distribution, the Gompit distribution, the Scobit distribution and the Weibull distribution. Each of these

link functions will be detailed in the next section. As these link functions are non-linear, the Maximum

Likelihood estimation (ML) technique has to be used to provide the estimates of the parameters. Both NIJ

and STANAG suggest using maximum likelihood and have described the procedure to develop the

model. Both of these processes are described in the next sections.

3.2 Link Functions

3.2.1 Probit

The Probit link function is also called the cumulative normal distribution (Φ) with mean µ and standard deviation σ.

The cumulative normal distribution at an impact velocity VS can be written as:

PP(VS) = Φ((VS − µ)/σ) (1)

The same can be written using different parameters like α = -µ/σ and β = 1/σ. Replacing µ and σ by α and

β results in the following equation:

PP(VS) = Φ(α + βVS) (2)

In this case, α is the location parameter and β is the scale parameter. It is Equation (2) that is fitted as

described below by BLC. Note that the Probit link function is symmetric around α + βVS = 0, which

corresponds to the location where VS = V50. This means that using the Probit link function, the rate of

change of the probability of penetration versus striking velocity is the same above and below V50.


3.2.2 Logit

The Logit link function is also called the cumulative Logistic distribution. It is frequently used to model

dichotomic data with continuous independent variables because of its simplicity. It has two parameters: α

is the location parameter and β is the scale parameter. The cumulative Logistic distribution at an impact

velocity VS can be written as:

PP(VS) = 1 / (1 + exp(−(α + βVS))) (3)

Like the Probit link function, the Logit link function is symmetric at VS = V50. Also, the Probit and Logit

link functions have very close shapes, the only exception being at the tails of the distribution where the

Logit link function is heavier.

3.2.3 Gompit

The Gompit link function is also called the cumulative complementary log log (CLogLog) distribution. It

is part of the extreme value for minima family of distribution. It has two parameters: α is the location

parameter and β is the scale parameter. The Gompit distribution at an impact velocity VS can be written

as:

PP(VS) = 1 − exp(−exp(α + βVS)) (4)

The Gompit link function is not symmetrical and has a fixed negative skewness of -1.1396 (skewed to the

left). This means that the response has an S-shaped curve, that approaches 0 fairly slowly but approaches

1 quite sharply, when β > 0.

3.2.4 Scobit

The Scobit link function [4] is also called the Burr-10 distribution or a Type I skewed-logit function. It

has three parameters: α is the location parameter, β is the scale parameter and δ is the shape parameter. The

Scobit distribution at an impact velocity VS can be written as:

PP(VS) = [1 + exp(−(α + βVS))]^(−δ) (5)

It is shown in [5] that the Scobit distribution is actually a special case of the Logit-Type distribution.

Furthermore, the Logit distribution is a special case of the Scobit distribution: for δ = 1, the Scobit and

Logit distributions are equivalent. The Scobit link function is not symmetrical, except when δ = 1.

Consequently, data that are skewed can be more realistically modeled using this link function. Contrary to

the Gompit distribution, it does not have a limitation on the amount of skewness it can model.

3.2.5 Weibull

The Weibull link function has three parameters: α is the location parameter, β is the scale parameter and δ

is the shape parameter. The Weibull distribution at an impact velocity VS can be written as:


(6)

The Weibull link function is not symmetrical. Consequently, data that are skewed can be more

realistically modeled using this link function. Contrary to the Gompit distribution, it does not have

a limitation on the amount of skewness it can model.

3.2.6 Summary of link functions

The typical curves for the different link functions presented above are shown in Figure 4. It can be seen

that the Probit and Logit distributions have very close shapes. Although very close to the previous

distributions, the Gompit distribution is slightly skewed, but the level of skewness is fixed. Finally, the Scobit and Weibull distributions present a variety of shapes.

As stated by Mauchant in [6] relative to the analysis of ballistic data using the Probit, Logit and Gompit

link functions: “This work shows that the choice of link function, between the logit, the probit and the

complementary log-log link functions, is not the most important issue in V50 ballistic limit performance

estimation, since the different GLMs (Generalized Linear Model) examined all gave similar results.” It is

believed that this is the case because the Probit, Logit and Gompit link functions have a constant shape

parameter, which is not the case for the Weibull and the Scobit link functions.

Figure 4: Density probability (left) and cumulative probability (right) distribution curves for the Probit,

Logit, Gompit, Scobit and Weibull distributions. For the Scobit and the Weibull distributions, curves for

2 different shape parameter values are shown (δ = 0.5 and δ = 2.0).

Table 1 presents a summary of the information related to the different link functions presented above. It

includes the formulas to calculate the velocity for a specific probability of perforation and the formulas to

calculate the ballistic limit value for each link function. For completeness, it also includes the probability

of perforation functions which are similar to Equations (2) to (6). For all link functions, the standard error

σ is calculated as being equal to 1/β.


Table 1: Summary of link function characteristics and ballistic limit formula.

Columns: Link function name; Probability of perforation at velocity VS; Probability scaling; Velocity (VS) at probability of perforation P; Ballistic limit value estimation.

Probit: Equation (2); Φ-1(P); VS = (Φ-1(P) − α)/β; V50 = −α/β

Logit: Equation (3); ln(P/(1−P)); VS = (ln(P/(1−P)) − α)/β; V50 = −α/β

Gompit: Equation (4); ln(−ln(1−P)); VS = (ln(−ln(1−P)) − α)/β; V50 = (−0.3665 − α)/β

Scobit: Equation (5); −ln(P^(−1/δ) − 1); VS = (−ln(P^(−1/δ) − 1) − α)/β; V50 = (−ln(2^(1/δ) − 1) − α)/β

Weibull: Equation (6)
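As a compact illustration of the link functions above (a sketch following the parameterization of Equations (2) to (5); the Weibull form of Equation (6) is omitted here), the cumulative probability of perforation can be evaluated as follows:

```python
# Sketch of the cumulative link functions of Section 3.2, using the location
# (alpha), scale (beta) and, where applicable, shape (delta) parameters of
# Equations (2) to (5).
import numpy as np
from scipy.stats import norm

def probit(vs, alpha, beta):
    return norm.cdf(alpha + beta * vs)                        # Equation (2)

def logit(vs, alpha, beta):
    return 1.0 / (1.0 + np.exp(-(alpha + beta * vs)))         # Equation (3)

def gompit(vs, alpha, beta):
    return 1.0 - np.exp(-np.exp(alpha + beta * vs))           # Equation (4)

def scobit(vs, alpha, beta, delta):
    return (1.0 + np.exp(-(alpha + beta * vs))) ** (-delta)   # Equation (5)

def v50_symmetric(alpha, beta):
    # Ballistic limit for the Probit and Logit links (Table 1)
    return -alpha / beta
```

For δ = 1 the Scobit function reduces to the Logit function, as noted in Section 3.2.4.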

3.3 Fitting Process Calculations

3.3.1 General

The program takes the VS versus Response data and fits the above-defined link functions to the data. The fitting process consists of finding the estimates of the parameters

(α, β, δ) that maximise the likelihood that the fitted curve corresponds to the experimental data

(Maximum Likelihood Estimation, ML). The search for the optimal values of α, β and δ is done using

the Downhill Simplex algorithm. The Simplex method published by Nelder and Mead [7] is a

single-objective optimization approach for searching the space of n-dimensional real vectors. It only uses

the values of the objective functions without any derivative information and therefore, it falls into the

general class of direct search methods. The absence of derivative information in this method makes it

suitable for problems with non-smooth and discontinuous functions, which occur frequently in

statistics and experimental mathematics.


The Simplex method is simplex-based. A simplex S in Rn is defined as the convex hull of n+1 vertices x0,

x1, … xn ϵ Rn. For example, a simplex in R1 is a line, in R2 is a triangle, and in R3 is a tetrahedron. For the

case of the cumulative normal equation (Probit), the dimension n of the simplex is 2 (α and β). The

method begins with a working simplex S and the corresponding set of function values at the vertices ƒ(xj)

for j = 0, 1, …, n. The function ƒ is the maximum likelihood equation. The method then performs a

sequence of transformations of the working simplex aimed at decreasing the function values at its

vertices. At each step, the transformation is determined by computing one or more test points, together

with their function values, and by comparison of these function values with those at the vertices.

3.3.2 Maximum Likelihood Estimation (ML)

The likelihood equation (L) for M Bernoulli trials (0 or 1 response) with success probability PP(VS) can

be written as follows:

L(α, β, δ | X) = Π (i = 1 to M) PP(VS,i)^Respi · [1 − PP(VS,i)]^(1 − Respi) (7)

Where X are the experimental data points composed of Respi (the perforation status, either 0 or 1) and the

corresponding impact velocity VS,i. Note that PP(VS) in Equation (7) is any one of the link functions

presented above (Equations (2) to (6)) with parameters α, β and δ. STANAG 2920 [1] does not suggest any specific optimisation method for maximising the likelihood equation. It was decided that it would be more efficient and easier to use the Simplex method.

For ease of calculation and better precision, the program utilises a different version of the Likelihood

equation called the Log-Likelihood equation (LL). The Log-Likelihood equation is defined as:

LL = Σ (i = 1 to M) [ Respi · ln(PP(VS,i)) + (1 − Respi) · ln(1 − PP(VS,i)) ] (8)
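As a minimal sketch of the fitting process (illustrative only; the data, starting values and the use of scipy rather than BLC's own Simplex routine are assumptions), the Logit link of Equation (3) can be fitted by maximizing the Log-Likelihood of Equation (8) with the Nelder-Mead downhill simplex method:

```python
# Sketch: maximum likelihood fit of the Logit link (Equation (3)) using the
# Nelder-Mead downhill simplex search on the Log-Likelihood of Equation (8).
import numpy as np
from scipy.optimize import minimize

def neg_log_likelihood(params, vs, resp):
    alpha, beta = params
    p = 1.0 / (1.0 + np.exp(-(alpha + beta * vs)))   # Equation (3)
    p = np.clip(p, 1e-12, 1.0 - 1e-12)               # guard against log(0)
    return -np.sum(resp * np.log(p) + (1 - resp) * np.log(1.0 - p))

# Illustrative data: striking velocities and perforation status (1 = CP, 0 = PP)
vs = np.array([430.0, 445.0, 452.0, 460.0, 468.0, 475.0, 481.0, 490.0])
resp = np.array([0, 0, 1, 0, 1, 0, 1, 1])

fit = minimize(neg_log_likelihood, x0=[-40.0, 0.09], args=(vs, resp),
               method="Nelder-Mead")
alpha_hat, beta_hat = fit.x
print("V50 estimate (-alpha/beta):", -alpha_hat / beta_hat)
```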

References [8] and [9] provide an overview of the ML estimator properties:

a. The ML estimator is consistent. As the sample size grows large, the probability that the ML

estimator differs from the true parameter by more than an arbitrarily small amount tends toward 0.

b. The ML estimator is asymptotically efficient, which means that the variance of the ML estimator

is the smallest possible among consistent estimators.

c. The ML estimator is asymptotically normally distributed, which justifies various statistical tests.

To summarize, the desirable properties of consistency, normality and efficiency of the ML estimator are

asymptotic, i.e. these properties have been proven to hold as the sample size M approaches infinity. The

small sample behavior of ML estimators is largely unknown. The standard advice is that with small

samples, smaller p-values should be accepted as evidence against the null hypothesis.

The ML techniques should be equivalent to the Least Square (LS) method if the error distribution

between the regressed curve and the experimental data points is normal. For binary dependent data, the


error distribution is not normal and therefore both techniques are not equivalent. Furthermore, in ballistic

applications, the data set is typically small or moderate in size (less than 500 data points) which induces a

bias in the estimations, resulting in models that are not equivalent

Some simulation studies have shown that for small data size where there are only a few failures, the ML

method is better than the LS method [10]. Also, some authors have shown that for binary dependent data,

the error distribution might not follow the normal distribution [11]. Therefore, the ML method is used by

this program. It should be considered that the model that results in the lowest standard error on the V50,

SD and fit values should be more accurate.

3.4 Confidence Interval and Standard Error Calculation

3.4.1 General

There are three different methods used to calculate the confidence limits in BLC. The first is explained in detail in STANAG 2920 ([1], [2]) and won't be detailed further herein. The results of that process for the ballistic limit and the standard deviation values are shown in the “Results 1” sheet for each link function on the 5 lines labeled “STANAG 2920 method”.

The second and third methods are explained in the subsection on confidence interval curves below, and the results of those processes for each link function are shown in the “Results 1” sheet on the 3 lines labeled either “Normal CI curves” (the second method) or “Binomial CI curves” (the third method). The first line presents the Confidence Interval length for 50% probability of perforation whereas the second and third lines present the maximal and minimal ballistic limit values for the selected Confidence Level and assumed error distribution. Other results of the second or third process are presented in the “Results 2” sheet under the headings “CL Upper” and “CL Lower” for each link function and presented graphically on the graphics sheets named Probit, Logit, Scobit, Gompit and Weibull.

For the second method, which assumes a Normal distribution of the error, it is first necessary to evaluate

the standard error of the different fitted parameters. The process to estimate the standard error is therefore

explained first.

3.4.2 Standard Error Estimation

Standard errors (SE) are calculated based on the estimated variance (Var) and covariance of the fitted

parameters. The variance/covariance matrix of an equation with N parameters (βi, i = 1 to N) is given by Equation (9). Notice that the variance/covariance matrix is symmetric and that all the values of the diagonal are positive while the others can be either positive or negative. Once the best fit parameters for each link function are defined using the ML technique, a numerical double derivative of the LL function

is calculated for each parameter. The double derivatives are calculated using a 5-point central stencil with a truncation error O(Δβi⁴) for the unidirectional derivatives and a 4-point central stencil with a truncation error O(Δβi², Δβj²), i ≠ j, for the cross derivatives.

To minimize errors in the double derivative calculation, the double derivatives are calculated by a

convergence algorithm that optimises the balance between the round-off error of the computer arithmetic

and the truncation error of the double derivative. Briefly, the convergence algorithm works as follows:


1. For each step of the convergence, smaller values of Δβi are used to calculate the second derivative.

2. The calculated double derivative of the current step is compared to the double derivative calculated in

the previous step.

3. If the difference between the two is smaller than a fixed small value, then the last step derivative is

the answer; otherwise steps 1 and 2 are repeated.

Var-Covar = [ −∂²LL / (∂βi ∂βj) ]⁻¹, i, j = 1 to N (9)

The software calculates the SE for the α, β and δ parameters using the procedure presented above. For V50, σ

and PP(Vi) values, the SE is calculated based on the arithmetic of error propagation described in [12],

Section 1.4 as well as [13], [14] and [15] specifically for the Weibull distribution.
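A minimal sketch of this standard error calculation is given below (it uses a plain fixed-step central difference rather than BLC's 5-point stencils and convergence algorithm): the variance/covariance matrix of Equation (9) is the inverse of the negative Hessian of the Log-Likelihood at the fitted parameters.

```python
# Sketch: variance/covariance matrix of the fitted parameters (Equation (9))
# as the inverse of the negative Hessian of the Log-Likelihood, computed here
# with simple fixed-step central differences.
import numpy as np

def numerical_hessian(loglik, params, h=1e-4):
    """Second derivatives of the scalar function loglik at params."""
    params = np.asarray(params, dtype=float)
    n = len(params)
    hess = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            xpp = params.copy(); xpp[i] += h; xpp[j] += h
            xpm = params.copy(); xpm[i] += h; xpm[j] -= h
            xmp = params.copy(); xmp[i] -= h; xmp[j] += h
            xmm = params.copy(); xmm[i] -= h; xmm[j] -= h
            hess[i, j] = (loglik(xpp) - loglik(xpm) - loglik(xmp) + loglik(xmm)) / (4.0 * h * h)
    return hess

def variance_covariance(loglik, params_hat):
    """Equation (9): inverse of the negative Hessian at the ML estimate."""
    return np.linalg.inv(-numerical_hessian(loglik, params_hat))

# Standard errors are the square roots of the diagonal of variance_covariance(...)
```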

3.4.3 Confidence interval curves

3.4.3.1 Normal error distribution confidence interval

Once the variance and covariance of each parameter are calculated, the normally distributed confidence interval is calculated for the fitted cumulative probability function. For a general cumulative probability function PP(g), with a link function g(Z) where Zi = α̂ + β̂·Vi, the upper and lower confidence interval value CIi at data point Vi is:

(10)

Where:

CIi: Confidence interval for data point i

Vi: Striking velocity of data point i

M: Number of data points

N: Number of parameters in the fitted equation (Equations (2) to (6))

α̂: Estimate of parameter α

β̂: Estimate of parameter β

δ̂: Estimate of parameter δ

Zi = α̂ + β̂·Vi

α: Confidence level

T((1-α)/2, M-N): Student's T distribution for probability (1-α)/2 and M-N degrees of freedom

Var(): Variance of an estimate (diagonal elements of Equation (9))

Covar(): Covariance of estimates (non-diagonal elements of Equation (9))

Var(Zi) = Var(α̂) + Vi² · Var(β̂) + 2·Vi · Covar(α̂, β̂), from [12]

Var(exp(Zi)) = exp(Zi)² · Var(Zi)

Var(PP(Vi)): Variance of the fitted probability at Vi, obtained by propagating Var(Zi) through the link function

The variance equations expressed in the last 3 points of the enumeration above are the equations from [12], [13], [14] and [15] used to propagate the variance within the cumulative probability link functions based on the variance of the fitted parameters.
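The following sketch illustrates one way to build such a band (an assumed form: the parameter variance is propagated to Zi and a Student's-t interval on Zi is mapped through the Logit link; the exact expression of Equation (10) used by BLC may differ):

```python
# Sketch (assumed form, not necessarily identical to Equation (10)): propagate
# Var/Covar of (alpha, beta) to Z_i = alpha + beta*V_i, build a Student's-t
# interval on Z_i, and map it through the Logit link (Equation (3)).
import numpy as np
from scipy.stats import t

def normal_ci_curves(vs, alpha_hat, beta_hat, vcov, conf_level=0.95):
    vs = np.asarray(vs, dtype=float)
    z = alpha_hat + beta_hat * vs
    # Var(Z_i) = Var(alpha) + V_i^2 * Var(beta) + 2 * V_i * Covar(alpha, beta), from [12]
    var_z = vcov[0, 0] + vs**2 * vcov[1, 1] + 2.0 * vs * vcov[0, 1]
    m, n_par = len(vs), 2
    t_val = t.ppf(1.0 - (1.0 - conf_level) / 2.0, df=m - n_par)
    z_low, z_high = z - t_val * np.sqrt(var_z), z + t_val * np.sqrt(var_z)
    link = lambda u: 1.0 / (1.0 + np.exp(-u))
    return link(z_low), link(z_high)      # lower and upper confidence curves
```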

3.4.3.2 Binomial error distribution confidence interval

The binomial confidence interval curves are calculated for each point based on the binomial cumulative

probability function assuming that the number of trials corresponds to the number of experimental data

points and that the probability of success corresponds to the fitted cumulative probability of perforation as

given by Equations (2) to (6). Therefore, for each data point, the following is evaluated successively to

find the upper or lower confidence curve:

(11)

Where:

CumBinom: Cumulative Binomial function.

CIi: Confidence interval for data point i.

Xi: Total number of successes (perforations) within a certain number of independent trials N. In this

case, it is the probability given by the fitted cumulative probability of perforation (equations (2) to

(6)) at velocity Vi multiplied by the number of trials N.

N: Number of independent trials.


p: Probability of success for each independent trial.

For the upper CI, the value of p is varied from 0.999 to 0 and Equation (11) is evaluated until the condition for the upper CI is true. For the lower CI, the value of p is varied from 0 to 0.999 and Equation (11) is evaluated until the condition for the lower CI is true.


4 Diagnostic tools

4.1 General

With the fitted parameters, the ballistic limit and standard deviation values and their standard error, BLC

also executes a series of tests to assess:

Information on the data sample

Tests to assess the validity of the data for use with the different tests below

The goodness of fit of the link functions to the data points

The significance of the fitted parameters

The following sections will describe the different diagnostic tools used in BLC.

4.2 Information on the data sample

Information on the data sample can be found under the heading “SAMPLE STATISTICS” of the

“Results 1” sheet and on the Box Plot sheet.

4.2.1 Sample Statistics section of the “Results 1” sheet

The data included in that section are:

Sample median: The velocity value that splits the ordered data sample in two halves. Within the

sample, 50% of the data points are higher and 50% of the data points are lower than that velocity.

Sample mean: The average of all striking velocities.

Sample standard deviation: The standard deviation of the sample velocities.

Spread: Defined as the difference between the highest impact velocity and the lowest impact

velocity of the data sample.

LC: Velocity of the lowest complete perforation within the sample

HP: Velocity of the highest partial perforation within the sample

ZMR: Zone of mixed results defined as the difference between HP and LC of the data sample. If the

value is negative, then there is no ZMR (ZMR = 0)

Skewness: Measure of the symmetry of the sample. It is defined as the Fisher-Pearson coefficient of skewness, adjusted for sample size [16]. That value can


be positive or negative. For a positive skewness or right-skewed sample, the longest tail of the

sample density function is at the right of the distribution and the mean of the sample is at the right of

the median (higher than the median) of the sample. Inversely, for a negative skewness or

left-skewed sample, the longest tail of the sample density function is at the left of the distribution

and the mean of the sample is at the left of the median (lower than the median) of the sample. A

sample skewness of 0 or near 0 indicates symmetry. Based on [17], the skewness value can be

interpreted as:

If skewness is between −½ and +½, then the distribution is approximately symmetric

If skewness is less than -1 or higher than +1, then the distribution is highly skewed

If the skewness is between −½ and -1 or between +½ and +1 then the distribution is

moderately skewed.

Kurtosis: Measure of the heaviness of the sample distribution tails. It is defined as the excess kurtosis, as discussed in [16]. That value can be

positive or negative. A value of 0 indicates that the tails of the sample distribution are identical to

the tail of a standard normal distribution. Positive kurtosis indicates that the tails of the sample

distribution are heavier than the tails of a normal density distribution. Inversely, negative kurtosis indicates that the tails of the sample distribution are lighter than the tails of a standard

normal distribution [16].

Number of data points: Indicates the number of data points used in the calculation.
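These quantities can be reproduced as follows (a sketch; the bias-corrected skewness and excess kurtosis follow the definitions cited from [16], and the variable names are illustrative):

```python
# Sketch: the "Sample Statistics" quantities of Section 4.2.1.
import numpy as np
from scipy.stats import skew, kurtosis

def sample_statistics(vs, resp):
    vs, resp = np.asarray(vs, float), np.asarray(resp, int)
    cp, pp = vs[resp == 1], vs[resp == 0]       # complete / partial perforations
    lc, hp = cp.min(), pp.max()                 # lowest CP and highest PP velocities
    return {
        "median": np.median(vs),
        "mean": vs.mean(),
        "std": vs.std(ddof=1),
        "spread": vs.max() - vs.min(),
        "LC": lc,
        "HP": hp,
        "ZMR": max(hp - lc, 0.0),               # zone of mixed results (0 if negative)
        "skewness": skew(vs, bias=False),       # adjusted Fisher-Pearson coefficient
        "kurtosis": kurtosis(vs, bias=False),   # excess kurtosis
        "n": len(vs),
    }
```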

4.2.2 Box Plot sheet

A summary of the data is displayed graphically on the “Box Plot” sheet. That chart presents the

following information:

Maximum (“Max” label) and minimum (“Min” label) impact velocities of the sample

The median, first quartile (“Q1” label) and third quartile (“Q3” label) values of the sample

The interquartile range (“IQR” label). It is defined as the difference between Q3 and Q1

The sample mean value (“Mean” label), marked with a blue “X” on the chart

Vertical bars outside the Q3-Q1 box that describe the 1.5 IQR limit above and below Q3 and Q1

respectively. Data points outside those bars are sometimes designated as outliers by some authors

Data points that are outside the Q1-Q3 range

The Box plot chart can be used to assess the sample characteristics. If the distance Median-Q3 and Median-Max is about the same as the distance Median-Q1 and Median-Min, and the Mean value is about the same as the Median value, then the sample distribution is almost symmetric (see Figure 5, left side). If that is not the case, the data are skewed either negatively or positively (see Figure 5, right side).


Figure 5: Box plot of a symmetric sample distribution (left side, calculated skewness = -0.19) and a

positively-skewed sample distribution (right side, calculated skewness = +1.51).

4.3 Tests on validity of the data

A series of tests is executed within BLC to assess whether the data points are valid for the analysis. The

software will still do the analysis even if the data are not suited for the analysis.

In the open literature, many textbooks list the assumptions under which a curve fitting analysis can be done. There are many assumptions made for the classical case of Ordinary Least Squares (OLS) fitting which can

be relaxed for the case of Generalized Linear Models (GLM) that is used by the BLC software. The

assumptions for the OLS are briefly described in Table 2, based on [18], [19] along with the equivalent

assumptions for the GLM.

Table 2: List of assumptions for OLS versus GLM fitting process.

Assumption 1. OLS: Estimated via least squares. GLM for binary data: Estimated via Maximum Likelihood.

Assumption 2. OLS: Dependent variable is continuous and can take on any value. GLM: Dependent variable can only take on 2 values, typically 0 and 1.

Assumption 3. OLS: Independent variables are continuous. GLM: Independent variables are continuous.

Assumption 4. OLS: Independent variables are linearly related to the dependent variable. GLM: Independent variables are linearly related to some type of log odds of the event occurring. Log odds, in turn, are nonlinearly related to P(Y = 1) as seen in the Probability Scaling column of Table 1.

Assumption 5. OLS: Dependent variable values are statistically independent of each other. GLM: Dependent variable values are statistically independent of each other.

Assumption 6. OLS: Normally distributed errors of mean 0 and variance 1 are assumed. GLM: Error distribution is assumed to be from the exponential family.

Assumption 7. OLS: Homoscedasticity: variances of the error terms are constant and do not depend on the independent variable value. GLM: Heteroscedasticity: variances of the error terms depend on the independent variable value.

The relaxed assumptions concern homoscedasticity (Assumption 7), the error distribution (Assumption 6) and the linearity of the relationship between the independent and the dependent variables (Assumption 4). Some assumptions are kept but modified, like the obligation to use Maximum Likelihood Estimation to estimate the fit parameters (Assumption 1) and the nature of the dependent variable, which has to be dichotomic (Assumption 2). Finally, 2 assumptions are kept as they are, Assumptions 3 and 5. Assumption 3 (independent variables are continuous) is made because the link functions are built to take any values of the independent variables. The only assumption that needs to be validated is Assumption 5 (dependent variable values are statistically independent of each other).

4.3.1 Criteria for validity of dependent variable value independency

The criteria discussed in this section are included in BLC to verify that the dependent variables are

statistically independent from each other (Assumption 5 above). Since we are dealing with binary

dependent data that can take the value of either 1 or 0, random tests for binary data are used. There are

3 tests taken from [20] included in BLC:

MONOBIT test

RUNS Test

Cumulative Sum test

It is important to notice that all those tests assume that the experimental data are entered in the “Main” sheet in the order in which they were fired. Indeed, the tests proposed herein were built

to verify if some ordering patterns are present in the data. If the data are entered in order of impact

velocity for example, then the tests discussed herein will fail. The next sub-section explains each of these

tests.

4.3.1.1 MONOBIT Test

Reference [20] recommends that the Monobit test be run first, since it supplies the most basic evidence

for the existence of non-randomness in a sequence, specifically non-uniformity. If this test fails, the

likelihood of other tests failing is high. In more detail, the purpose of this test is to determine whether the

number of ones and zeros in the sample are approximately the same as would be expected for a truly

random sample. The test assesses the closeness of the fraction of ones to ½, that is, the number of ones

and zeroes in a sequence should be about the same.

The procedure consists of transforming the values of the dependent variable (Yj = 0 or 1) to -1 and +1 values respectively. Then the transformed values are added for the M data points (SM). For a large number of data points (M ≥ 100), a test statistic (Sobs) is computed: Sobs = |SM| / √M [20]. The p-value for the test statistic is then calculated: p-value = erfc(Sobs / √2), where erfc is the complementary error function. For a small number of data points [21], the statistic is calculated and its p-value is calculated from the cumulative binomial distribution of the statistic for M data points and the probability of occurrence of ½.

If the computed p-value is < α, then conclude that the sequence is non-random. Otherwise, conclude that

the sequence is random.
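A minimal sketch of the large-sample MONOBIT computation (following the formulation in [20]; the small-sample binomial variant of [21] is not shown):

```python
# Sketch: MONOBIT randomness test for the perforation sequence (large M case).
import numpy as np
from scipy.special import erfc

def monobit_pvalue(resp):
    """resp: sequence of 0/1 responses in firing order."""
    y = 2 * np.asarray(resp, dtype=float) - 1.0   # map 0 -> -1, 1 -> +1
    m = len(y)
    s_obs = abs(y.sum()) / np.sqrt(m)             # test statistic S_obs
    return erfc(s_obs / np.sqrt(2.0))             # p-value; < alpha => non-random
```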

4.3.1.2 RUNS Test

A “run” is an uninterrupted sequence of identical values (0 or 1). A run of length k consists of exactly k

identical values and is bounded before and after with the opposite value. The purpose of the runs test is to

determine whether the number of runs of 1 and 0 of various lengths is as expected for a random sequence.

In particular, this test determines whether the oscillation between 0 and 1 runs is too fast or too slow. As a

pre-requisite, the MONOBIT test has to be calculated as its test statistic (Sobs) is used.

Following [20], the calculation consists of first computing π, the pre-test proportion of ones in the sample sequence: π = (Σ Yj) / M. If the inequality |π − ½| ≥ 2/√M is true, then the RUNS test should be stopped and its p-value should be set to null, i.e., the RUNS test concludes that the sequence is non-random.

Otherwise, a test statistic (VM(obs)) is computed: VM(obs) = M − Σ r(j) (summed over j = 1 to M−1), where r(j) = 1 if Yj = Yj+1, and r(j) = 0 otherwise. This test statistic is the actual number of runs in the sequence. For very large sequences (above 200 data points), the p-value for the test statistic is calculated as described in [20]. For a smaller number of data points, the exact probability of occurrence of the number of runs (VM(obs)) is calculated as explained in [22] and [23].

If the computed p-value is < α, then conclude that the sequence is non-random. Otherwise, conclude that

the sequence is random.

4.3.1.3 Cumulative Sum Test

The purpose of the test is to determine whether the cumulative sum of the partial sequences occurring in

the tested sample is too large or too small relative to the expected behavior of that cumulative sum for

a random sample. For a random sequence, the excursions of the random walk should be near zero. This test

can be run forward (from the first to the last data point) or backward (from the last to the first data point)

of the sample.

The procedure consists of transforming the values of the dependent variable (Yj = 0 or 1) to -1 and +1 values respectively (Tj). For the forward test, compute partial sums Si of successively larger subsequences starting with T1: S1 = T1, S2 = S1 + T2, …, SM-1 = SM-2 + TM-1, SM = SM-1 + TM. For the backward test, compute partial sums Si of successively larger subsequences starting with TM: S1 = TM, S2 = S1 + TM-1, …, SM-1 = SM-2 + T2, SM = SM-1 + T1.

Compute the test statistic z = max |Si|, the largest of the absolute values of the partial sums (Si). For M ≥ 100, the p-value is calculated using the expression given in [20], where Φ is the standard normal cumulative probability function. No statistic for a number of test samples less than 100 was found.

If the computed p-value is < α, then conclude that the sequence is non-random. Otherwise, conclude that

the sequence is random.
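A sketch of the cumulative-sum statistic described above (the statistic only; the large-sample p-value expression of [20] is not reproduced here):

```python
# Sketch: cumulative-sum (CUSUM) statistic z = max|S_i| for the response
# sequence, in forward or backward mode.
import numpy as np

def cusum_statistic(resp, forward=True):
    y = 2 * np.asarray(resp, dtype=float) - 1.0   # map 0 -> -1, 1 -> +1
    if not forward:
        y = y[::-1]                               # backward test starts from the last shot
    partial_sums = np.cumsum(y)                   # S_1, S_2, ..., S_M
    return np.abs(partial_sums).max()             # z = largest |S_i|
```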

4.4 The Goodness-of-Fit and Significance of the model

4.4.1 Significance of the parameters and independent variables

As discussed in [12], after estimating the parameters for our models, we need to assess the significance of

the parameters and independent variables selected. This usually involves calculating a statistic for which a

p-value is determined. That p-value is compared to the type-I error level (α) that can be accepted.

The significance of the parameters and of the independent variables is checked to ensure that the selected model tells us more about the dependent variable than an overdefined (more parameters) or underdefined (fewer parameters) model would. The interest in BLC is to verify: is the probability of perforation correctly modeled by the impact velocity using the calculated model?

4.4.1.1 Likelihood Ratio Test

To do that verification, the model with the independent variable (α and β) is compared to the model without impact velocity (α only). In GLM, the comparison of the two models is based on the Log-Likelihood function (Equation (8)). By definition, the deviance (D) of a model is:

D = −2 ln( likelihood of the fitted model / likelihood of the saturated model ) = −2 [LL(fitted) − LL(saturated)] (12)

The deviance of a model should be seen as the equivalent of the residual sum of square in OLS. It is

therefore a measure of the error between the sample data points and the fitted model curve. For the

DRDC-RDDC-2020-R056 23

purpose of comparing the models with and without the independent variable, the values of the deviance are compared using the following equation:

G = D(model without the variable) − D(model with the variable) = −2 [LL(without the variable) − LL(with the variable)] (13)

The test statistic G is called the Likelihood Ratio. Under the hypothesis that β = 0, the G statistic asymptotically follows a χ2 distribution with 1 degree of freedom (because there is 1 parameter difference between the two values of LL). When the sample size is small, the χ2 distribution has to be corrected. Many different corrections exist, and reference [24] examined 5 of them in terms of Type I and Type II errors. The authors recommend the Bartlett correction (Bc) for small samples. Therefore, the p-value of this test is [25]:

(14)

(15)

If the computed p-value is < α, then conclude that β ≠ 0. Otherwise, conclude that β = 0.

The corrected Likelihood Ratio is a powerful test that enables the comparison of two models, even when the sample size is small. Unfortunately, this test is valid only for nested models, i.e., models that are exactly similar except for one parameter. A good example of nested models is the Logit and Scobit models: both models are similar except for the parameter δ, which is equal to 1 for the Logit model and is free for the Scobit model. For non-nested models (like the Probit model compared to the Logit and/or Gompit models, for example), the information criterion has to be used for comparison (Section 4.4.2.5).

4.4.1.2 Wald Test

The Wald test is obtained by comparing the Maximum Likelihood estimate of a parameter to its estimated standard error. The Wald Test statistic is given by the following equation:

W = β̂ / SE(β̂) (16)

Under the hypothesis that β = 0, the W statistic asymptotically follows the Standard Normal Distribution. Reference [12] reports that the Wald statistic was examined and it was found that it behaves in an aberrant manner, often failing to reject the null hypothesis (β = 0) when the coefficient was significant. This means that the Wald test should be considered as a conservative test, that is: if the Wald test rejects the null hypothesis, then the tested parameter is most likely significant.
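For illustration, the sketch below computes the Wald statistic and the (uncorrected) Likelihood Ratio statistic for a Logit fit using the statsmodels package; BLC's own fitting code and the Bartlett small-sample correction of Equations (14) and (15) are not reproduced, and the velocities and responses shown are hypothetical.

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import norm, chi2

# Hypothetical data: impact velocities (m/s) and responses (1 = complete perforation)
v = np.array([480., 490., 495., 500., 505., 510., 515., 520., 525., 530., 540., 550.])
y = np.array([0,    0,    0,    1,    0,    1,    0,    1,    1,    1,    1,    1])

fit = sm.Logit(y, sm.add_constant(v)).fit(disp=0)    # Logit link, ML estimation

# Wald test: W = beta_hat / SE(beta_hat), compared to the standard normal distribution
w = fit.params[1] / fit.bse[1]
p_wald = 2 * norm.sf(abs(w))

# Likelihood ratio test: G = -2 [LL(alpha only) - LL(alpha, beta)], chi-square with 1 df
g = -2 * (fit.llnull - fit.llf)
p_lr = chi2.sf(g, df=1)                              # uncorrected (no Bartlett correction here)

print(f"Wald p-value = {p_wald:.4f}, LR p-value = {p_lr:.4f}")
```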


4.4.2 Goodness-of-Fit Tests

Goodness-of-Fit tests are used to assess if the values predicted by the model are representative of the experimental values in the absolute sense.

4.4.2.1 Anderson-Darling Test

The Anderson-Darling Test enables the comparison between an assumed cumulative distribution function (in this case, the selected link function with its fitted parameters) and the empirical distribution function (EDF) that is based on the experimental data points. The procedure of the Anderson-Darling (AD) test is described in [27]. It was designed to verify that the independent variable (x) has a continuous cumulative distribution F(x,θ), where θ is a vector of one or more parameters that corresponds to the link function (Equations (2) to (6)). F(x,θ) is compared to the EDF (Fn(x)) using the AD statistic (Equation (18)), with Fn(x) defined as:

Fn(x) = (number of observations ≤ x) / n (17)

The AD statistic is calculated as follows:

A² = −n − (1/n) Σ_{i=1..n} (2i − 1) [ln F(x(i),θ) + ln(1 − F(x(n+1−i),θ))] (18)

where x(1) ≤ x(2) ≤ … ≤ x(n) are the ordered values of the independent variable.

The advantage of using the AD test in BLC relative to the Shapiro-Wilk (SW) test is that the SW test can

only be used to detect the goodness-of-fit for the Probit distribution, while the AD test can be used to

detect the goodness-of-fit for the Probit, Logit, Weibull and exponential distributions. Another advantage

of the AD test is that its weight function is such that it usually makes a more powerful test statistic by

emphasizing the tail differences between the empirical distribution function and the assumed cumulative

distribution function.
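A minimal Python sketch of the computational form of the A² statistic is given below; the fitted V50 and σ values and the velocities are hypothetical, and the conversion of A² to a p-value (which depends on the assumed distribution and on whether parameters were estimated) is left to the procedure of [27].

```python
import numpy as np
from scipy.stats import norm

def anderson_darling_A2(x, cdf):
    """A^2 statistic comparing sample x to an assumed CDF (computational form of Eq. (18))."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    f = np.clip(cdf(x), 1e-12, 1 - 1e-12)            # fitted CDF values, clipped to avoid log(0)
    i = np.arange(1, n + 1)
    return -n - np.mean((2 * i - 1) * (np.log(f) + np.log(1 - f[::-1])))

# Example: velocities tested against a Probit (normal) model fitted elsewhere (values assumed)
v50, sigma = 515.0, 10.0
a2 = anderson_darling_A2([498, 505, 511, 514, 517, 521, 528, 533],
                         lambda v: norm.cdf(v, loc=v50, scale=sigma))
print(f"A^2 = {a2:.3f}")   # the p-value / critical value then comes from the tables in [27]
```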

Reference [28] compared the power of the SW test, Kolmogorov-Smirnov (KS) test, Lilliefors (LF) test and

AD test. Ten thousand (10,000) samples of various sample sizes (n = 10, 15, 20, 25, 30, …, 2000) were generated from a series of alternative symmetric (7) and asymmetric (7) distributions with a variety of

skewness and kurtosis values. The power of each test was then obtained by testing the generated samples

against the test of normality for the SW, KS, LF and AD tests with the respective critical values. Results

show that Shapiro-Wilk test is the most powerful normality test, followed very closely by Anderson-Darling

test and then the Lilliefors test and Kolmogorov-Smirnov test. However, the power of all four tests is still

low for small sample size (30 and below). A summary of the expected power of the AD test for normality is

presented in Table 3. Also, reference [29] calculated the power of the AD test for other assumed cumulative distribution functions (Exponential, Weibull, Logit and log-normal) using the KS, AD and Cramér–von Mises (CVM) tests and small sample sizes (below 200).

The general conclusions of [28] and [29] are similar: a) the Anderson-Darling test results in excellent power levels compared to the other tests, or is marginally close to the best test; b) sample sizes above 50 are required to reach power levels above 80%; and c) increasing the allowable Type I error level results in a higher power level for the same sample size.

DRDC-RDDC-2020-R056 25

It is therefore advisable to accept higher Type I error (15 to 20%) in order to reach acceptable Type II

error rates for the typical types of sample size used in ballistic studies.

Table 3: Power level (%) of the AD test for normality for symmetric and asymmetric distributions with varying skewness (Sk), kurtosis (Ku), Type I error level (α) and number of data points in the distribution (n). Data from [28].

                                   n = 10   n = 20   n = 30   n = 50   n = 100
Symmetric, Sk = 0.0, Ku = 1.80
  α = 0.05                           8.5     17.1     30.2     58.2      95.2
  α = 0.10                          16.5     29.2     44.7     73.1      98.2
Symmetric, Sk = 0.0, Ku = 5.00
  α = 0.05                           8.6     11.8     14.3     17.9      27.8
  α = 0.10                          14.7     18.3     21.6     26.3      37.7
Asymmetric, Sk = 1.00, Ku = 4.50
  α = 0.05                          12.9     24.7     37.7     59.1      89.3
  α = 0.10                          20.8     34.6     48.5     69.8      94.0
Asymmetric, Sk = 1.41, Ku = 6.00
  α = 0.05                          22.0     46.2     66.2     88.9      99.7
  α = 0.10                          31.9     58.4     76.2     93.9     100.0


4.4.2.2 Sensitivity, Specificity and area under the ROC curve (AUC)

The use of Sensitivity, Specificity and AUC tries to answer the following questions: How good a job does the model do of predicting outcomes? Or, said another way: What percentage of the observations does the model correctly predict? Can the model discriminate between positive and negative outcomes?

These questions can be answered by measuring how good the model is at identifying true positive and true negative outcomes while minimizing the number of false positives and false negatives. All of these are calculated by comparing the model outcome to the sample measured outcome.

But first, here are some definitions:

1. Sensitivity (or true positive rate) is the proportion of positives (complete perforations) correctly identified by the model. This value has to be as high as possible.

2. Specificity (or true negative rate) is the proportion of negatives (partial perforations) correctly identified by the model. This value has to be as high as possible. On the contrary, the value of 1−Specificity is the false positive rate, and it has to be as small as possible.

3. ROC: Receiver Operating Characteristic. It is a plot of the model sensitivity versus the model false positive rate (1−Specificity), as shown in Figure 6.

4. The AUC is a measure of how well a parameter (in our case, velocity) can distinguish between two diagnostic groups (complete perforation/partial perforation). AUC is the area under the ROC curve.

[Figure: ROC curves (Sensitivity, i.e., True Positive Rate, versus 1 − Specificity, i.e., False Positive Rate) for the Logit, Probit, Gompit, Scobit and Weibull models.]

Figure 6: Example of an ROC curve.

The ROC is calculated based on the classification table presented at Table 4. As the threshold probability

value is changed (between 0 and 1), the number of true and false complete as well as the number of true

and false partial changes. This results in a variation in sensitivity and specificity, resulting in the ROC

curve of Figure 6. A model with high discrimination ability will have high sensitivity and specificity

simultaneously, leading to an ROC curve which goes close to the top left corner of the plot. A model with


no discrimination ability will have an ROC curve which is the 45 degree dashed diagonal line shown in

Figure 6.

Table 4: Classification table for a predetermined threshold value.

                                               Condition predicted by the model
Real, observed, experimental true condition    Predicted complete (+)              Predicted partial (−)                Total
Complete penetration (+)                       True Complete – TC                  False Complete – FC                  NC
                                               (true positive)                     (false negative) (Type I error)
Partial penetration (−)                        False Partial – FP                  True Partial – TP                    NP
                                               (false positive) (Type II error)    (true negative)
Total                                          N+                                  N−                                   N

Threshold: Probability above which the predicted condition equals 1.
N = Number of data points.
N+ = N(P ≥ Threshold): number of data points above the threshold.
N− = N(P < Threshold): number of data points below the threshold.
NP: Number of Partial penetrations.
NC: Number of Complete penetrations.
Sensitivity = TC/NC: True positive rate.
Specificity = TP/NP: True negative rate.

Theoretically, the AUC is the probability that a randomly selected complete perforation shot has a higher

score or value (higher impact velocity) on the test than a randomly selected partial perforation shot. This

assumes that the complete perforations have (on average) a higher score than the partial perforations. The

AUC is at minimum equal to 0.50 (shown by the dashed line in Figure 6) and is at a maximum of 1.0. The

higher the AUC between those two limits, the better the model is in predicting the outcome. When the

AUC is 0.5, the overall diagnostic accuracy of the model is equivalent to chance.

It is also possible to build an inference [31], comparing the calculated value of the AUC to the minimal value, with the null hypothesis H0: AUC = 0.5 and the alternative H1: AUC ≠ 0.5. The variance estimate necessary to calculate the Z statistic for the hypothesis test is calculated based on the method described in detail in [32]. The method is based on the calculation similarity between the AUC obtained using the trapezoidal rule and the non-parametric Mann-Whitney generalized statistic, for which the standard error is known. The test statistic is described as:

Z = (AUC − 0.5) / SE(AUC) (19)

p-value = 2 [1 − Φ(|Z|)] (20)

If the computed p-value is < α, then conclude that the estimated AUC is different from 0.50. Otherwise, conclude that AUC = 0.50.
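The sketch below computes the AUC through the Mann-Whitney relation and a z test of H0: AUC = 0.5; the standard-error expression used is the commonly cited Hanley-McNeil form, which is assumed here to correspond to the method of [32]. The velocities and responses are hypothetical.

```python
import numpy as np
from scipy.stats import norm

def auc_ztest(v, y):
    """AUC via the Mann-Whitney relation and a z test of H0: AUC = 0.5 (Hanley-McNeil SE)."""
    v, y = np.asarray(v, float), np.asarray(y, int)
    pos, neg = v[y == 1], v[y == 0]                  # complete / partial perforation velocities
    n_c, n_p = len(pos), len(neg)
    # P(randomly chosen complete has a higher velocity than a randomly chosen partial); ties count 1/2
    auc = (np.sum(pos[:, None] > neg[None, :])
           + 0.5 * np.sum(pos[:, None] == neg[None, :])) / (n_c * n_p)
    q1, q2 = auc / (2 - auc), 2 * auc**2 / (1 + auc)
    var = (auc * (1 - auc) + (n_c - 1) * (q1 - auc**2) + (n_p - 1) * (q2 - auc**2)) / (n_c * n_p)
    z = (auc - 0.5) / np.sqrt(var)
    return auc, 2 * norm.sf(abs(z))                  # two-sided p-value

auc, p = auc_ztest([498, 505, 511, 514, 517, 521, 528, 533], [0, 0, 1, 0, 1, 1, 1, 1])
print(f"AUC = {auc:.3f}, p-value = {p:.4f}")
```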


4.4.2.3 The Stukel Test

The Stukel Test [12], [30] works only for the Logit link function. It consists of adding two parameters to the Logit equation to make it more general (Generalized Logistic Model). Again, those additional parameters give weight to the tails of the Logit distribution. The four-parameter equation is then fitted using the ML estimation technique and its resulting likelihood is compared to the likelihood of the standard Logit (Equation (4)) using the likelihood ratio test described in Section 4.4.1.1. If the ratio is not significant, then the two additional parameters are not significant, which means that the sample data at the tails of the distribution are well described by the Logit model.

The test can be briefly described as:

1. Fit the data sample using the Logit link function (Equation (3)).

2. Using the fitted parameters, create two new variables (za and zb) from the fitted linear predictor ĝi = α̂ + β̂Vi: za,i = 0.5 ĝi² if ĝi ≥ 0 and 0 otherwise; zb,i = −0.5 ĝi² if ĝi < 0 and 0 otherwise.

3. Fit the data sample using the Generalized Logistic Model, i.e., the Logit model with za and zb added as covariates with their own coefficients.

4. Calculate the Log Likelihood Ratio (G) statistic.

5. Find the p-value of the G statistic using Equations (14) and (15).

Reference [30] compared the power of the Stukel test relative to 8 other tests for the Logit link function

using Monte Carlo simulations of 15 different symmetric and asymmetric distributions with 100 and

500 data points. The authors recommend the use of the Stukel Test to detect departure from the Logistic distribution. Similarly to the Anderson-Darling Test above, the authors also add that “In all cases one must keep in mind the lack of power with small sample sizes to detect subtle deviations from the logistic model”. With this in mind, the same conclusion as for the Anderson-Darling Test holds here: it is therefore advisable to accept a higher Type I error (15 to 20%) in order to reach acceptable Type II error rates for the typical sample sizes used in ballistic studies.
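A sketch of the Stukel procedure listed above, using statsmodels for the Logit fits and an uncorrected 2-degree-of-freedom likelihood ratio test; the construction of za and zb follows the usual description of the test and is an assumption with respect to the exact BLC implementation. The data are hypothetical.

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import chi2

def stukel_test(v, y):
    """Stukel test sketch: add za, zb built from the fitted logit and run an LR test (2 df)."""
    v, y = np.asarray(v, float), np.asarray(y, int)
    X0 = sm.add_constant(v)
    base = sm.Logit(y, X0).fit(disp=0)
    g = X0 @ base.params                              # fitted linear predictor alpha + beta*v
    za = 0.5 * g**2 * (g >= 0)                        # gives weight to the upper tail
    zb = -0.5 * g**2 * (g < 0)                        # gives weight to the lower tail
    ext = sm.Logit(y, np.column_stack([X0, za, zb])).fit(disp=0)
    g_stat = -2 * (base.llf - ext.llf)                # uncorrected likelihood ratio
    return chi2.sf(g_stat, df=2)

v = [480, 490, 495, 500, 505, 510, 515, 520, 525, 530, 540, 550]
y = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1]
print(f"Stukel p-value = {stukel_test(v, y):.4f}")
```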

4.4.2.4 Pseudo-R2

The use of R2, the coefficient of determination, is well established in classical regression analysis (OLS). By definition, it is the proportion of the total variance of the observed dependent variable that is 'explained' by the regression model. It is therefore useful as a measure of success in predicting the dependent variable from the independent variables. Unfortunately, that definition is invalid for GLM because the total Sum of Squares equation (Sum of Squares Regression + Sum of Squares Error = Sum of Squares Total) no longer holds.


The extension of R2 to generalized linear models (GLMs) and other more general models is not

straightforward. Different perspectives led to several generalizations to the coefficient of determination.

In BLC, 2 pseudo-R2 are presented: Nagelkerke and Tjur.

The Nagelkerke pseudo-R2 should be seen as a measure of the improvement from the null model to the fitted model. It is based on the following equations:

R2CS = 1 − (L0/L1)^(2/M),  R2max = 1 − L0^(2/M),  R2N = R2CS / R2max

Actually, R2N is the ratio of the Cox and Snell pseudo-R2 (R2CS) to the maximum possible value of R2CS, which is R2max. This value lies between 0 and 1. The R2CS value contains the L0/L1 ratio, which is the ratio of the likelihood of the “null” model (containing the intercept parameter α only) to that of the full model (containing the α, β and δ parameters). M is the number of data points. Hence, the Nagelkerke pseudo-R2 measures the improvement from the null model to the fitted model.

The Tjur pseudo-R2 should be seen as a coefficient of discrimination: it measures the ability of the model to discriminate between successes and failures. It is calculated as the difference in the average of the fitted event probabilities between the group of observations with observed events (π̄1) and the group with observed nonevents (π̄0): R2Tjur = π̄1 − π̄0.

To conclude this section, reference [12] provides an interesting insight into the use and the typical values of R2 in logistic regression and GLM:

...low R2 values in logistic regression are the norm and this presents a problem when reporting their values to an audience accustomed to seeing linear regression values. ... Thus [arguing by reference to running examples in the text] we do not recommend routine publishing of R2 values with results from fitted logistic models. However, they may be helpful in the model building stage as a statistic to evaluate competing models.
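The sketch below computes the Nagelkerke and Tjur pseudo-R2 from a fitted Logit model; it assumes the null and fitted log-likelihoods and the fitted probabilities are available (here from statsmodels), and the data are hypothetical.

```python
import numpy as np
import statsmodels.api as sm

v = np.array([480., 490., 495., 500., 505., 510., 515., 520., 525., 530., 540., 550.])
y = np.array([0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1])
fit = sm.Logit(y, sm.add_constant(v)).fit(disp=0)

m = len(y)
# Nagelkerke: Cox & Snell R2 rescaled by its maximum attainable value
r2_cs = 1.0 - np.exp(2.0 * (fit.llnull - fit.llf) / m)
r2_max = 1.0 - np.exp(2.0 * fit.llnull / m)
r2_nagelkerke = r2_cs / r2_max

# Tjur: difference of mean fitted probabilities between observed events and non-events
p_hat = fit.predict()
r2_tjur = p_hat[y == 1].mean() - p_hat[y == 0].mean()

print(f"Nagelkerke R2 = {r2_nagelkerke:.3f}, Tjur R2 = {r2_tjur:.3f}")
```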

4.4.2.5 Information Criterion (IC): General considerations

As pointed out above, the Log-Likelihood Ratio cannot be used to compare non-nested models (like the Probit model compared to the Logit and/or Gompit models, for example). For non-nested models, the information criterion has to be used [26]. It is based on information theory, using the Kullback-Leibler information

equation that measures the 'information' lost when approximating reality using a function. The

information criterion accounts for how well the model fits the data and also accounts for the complexity

of the model. Model complexity is its ability to fit any data set and can be approximated as the number of

parameters in the model [26]. It is well known that some models can be so complex that they can fit any

data set [26]: A model that seems to fit all data sets because it is overly complex may include so many


parameters that it is of little use to explain the outcome score that is of interest, which goes against the

purpose of developing models, that is to explain a particular facet of a phenomenon.

The Kullback-Leibler information ([34], [35]) or “distance” between f(x), which is a function that represents the “full truth” or “the real model”, and an approximating model gj(x) that represents one of a series of M possible models (j = 1, 2, …, M), is defined as the first part of Equation (21). In this case, x represents a series of independent variables. The approximating model g(x) has p parameters (θ) that are estimated from the finite data set x using, for example, the maximum likelihood method.

I(f(x), g(x)) = ∫ f(x) ln[ f(x) / g(x|θ) ] dx = Ef[ln f(x)] − Ef[ln g(x|θ)] (21)

The quantity I(f(x), g(x)) measures the 'information' lost when g(x) is used to approximate f(x) (the truth). If f(x) = g(x), then I(f(x), g(x)) = 0, since there is no information lost when the model reflects the truth perfectly. In reality, there is always some information that will be lost when a model is used to approximate reality and thus I(f(x), g(x)) > 0. In practice, I(f(x), g(x)) cannot be used in that form for model selection because it requires knowledge of the truth (f(x)) and of the parameters (θ) in g(x) = g(x|θ). But the relative Kullback-Leibler information can be estimated from the data based on the maximized log likelihood function.

The Kullback-Leibler information equation can also be written as the second part of Equation (21), where Ef is the expectation with respect to the distribution f(x) ([34], [35]). Noting that Ef[ln f(x)] is an unknown constant, the second part of Equation (21) can be rewritten as I(f(x), g(x)) − Ef[ln f(x)] = −Ef[ln g(x|θ)], hence the above mention of the relative Kullback-Leibler information. Takeuchi [36] found the general asymptotic relationship between the target criterion, I(f(x), g(x)) − constant, and Ef[ln(g(x|θ̂))], which is the expected empirical maximized Log-Likelihood. In this case, θ̂ is the MLE of model g(x) with p parameters, based on the data set x. Takeuchi's result is:

Ef[ln(g(x|θ̂))] ≈ ln(g(x|θ̂)) − tr(J(θ0) I(θ0)⁻¹) (22)

The Log-Likelihood, Equation (8), is an estimator of Ef[ln(g(x|θ̂))] and can therefore be used in the evaluation of the IC. Matrices J and I are of size p x p and are built from the first and second partial derivatives of ln(g(x|θ)) with respect to θ. The matrices are evaluated at θ = θ0, which is the value of θ that minimizes I(f(x), g(x)) for the model g(x) over all possible values of θ (i.e., the ML estimator of θ). The term tr(J(θ0) I(θ0)⁻¹) reduces to p when the number of data points used to estimate the Kullback-Leibler information becomes large.

To summarize, the Information Criterion (IC) can be written as the AIC (Akaike Information Criterion [37]), noting that there is only a factor of 2 between the two criteria (that factor appeared for historical reasons). AIC can asymptotically be written as:

AIC = −2 LL + 2p (23)

Remember that the AIC (IC) accounts for the fitted model goodness of fit and for the complexity of the model. For comparison between models, the AIC value has to be as small as possible. As the Log-Likelihood


(LL) is always negative, -LL is a positive number. The complexity correction (2p) is added to it such that

for two models with the same Log-Likelihood value, the less complex model, i.e., the one with fewer parameters, will have a lower AIC. Equation (23) is true asymptotically, i.e., when the number of data points is large. The rule of thumb from [26] is that for cases where sample sizes are small (n < 100) or

where the number of free parameters is large (p > 5), the AIC value is biased and therefore it has to be

corrected. This is the subject of the next section.

4.4.2.6 The small sample size corrected AIC

Reference [38] presents equations to determine the exact small number corrected bias for the Probit, Logit

and Gompit link functions. The term tr(J(θ0) I(θ0)⁻¹) presented in Equation (22) is evaluated

analytically by the author and presented in the form of equations that can be easily programmed.

Validation of the corrections computed by BLC was done using R code provided by the authors of [38]. Results of the validation for the Probit and Logit link functions were positive, but it was found that for the Gompit link function, the values predicted by BLC and by the R code, although similar, were much higher than expected. The source of error could not be found.

4.4.2.7 Small sample size correction using the Bootstrap method (EIC)

For cases where the analytical evaluation of the term tr(J(θ0) I(θ0)⁻¹) of Equation (22) is not available or is too complex, there exists a well-known method called the Bootstrap (or resampling) that can be used to estimate the small sample size correction. In BLC, the Bootstrap method is used for the Gompit, Scobit and Weibull link functions. The Bootstrap method to estimate the small sample size correction is explained in detail in [35]. More exactly, it is the variance reduction method described in Section 8.3 of [35] that is used to estimate the EIC (Extended Information Criterion). The method can be summarized as follows (a code sketch is given after the list):

1. Fit the M experimental data points (X) using one of the link functions (g) and the MLE technique. This provides a set of parameters (θ̂). Determine the Log-Likelihood value for that fit (ln(g(X|θ̂))).

2. Randomly resample M data points from the experimental data (X) with replacement. This means that each experimental data point has an equal chance of being drawn at each draw of the resampling. This also means that the resampled data set (X*) may have one or more of the experimental data points appearing more than once. Take a total of N resampled data sets.

3. Fit each of the resampled data sets (X*) using the same link function (g) as in Step 1 and determine a set of parameters (θ̂*) for each. Also find the Log-Likelihood value for each of the resampled data sets (ln(g(X*|θ̂*))).

4. For each of the sets of parameters (θ̂*), evaluate the Log-Likelihood of the experimental data (ln(g(X|θ̂*))).

5. Evaluate the Log-Likelihood of each resampled data set (X*) using the set of parameters of the experimental data points (θ̂). This is ln(g(X*|θ̂)).

6. Find the average bias (b̄) using the following equation:

b̄ = (1/N) Σ_{i=1..N} { [ln(g(X*i|θ̂*i)) − ln(g(X*i|θ̂))] + [ln(g(X|θ̂)) − ln(g(X|θ̂*i))] } (24)

7. Repeat Steps 2 to 6 a total of Q times.

8. Determine the average of the average biases and the variance of the bias.
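A compact Python sketch of the resampling loop above, using a Logit fit from statsmodels in place of BLC's own Simplex/ML fitting; the bias expression mirrors Equation (24) as reconstructed here, non-converged resamples are simply skipped, and the data are hypothetical.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)

def loglik(params, X, y):
    """Bernoulli log-likelihood of a logit model with linear predictor X @ params."""
    p = 1.0 / (1.0 + np.exp(-(X @ params)))
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

def eic_bias(v, y, n_boot=100):
    v, y = np.asarray(v, float), np.asarray(y, int)
    X = sm.add_constant(v)
    theta = sm.Logit(y, X).fit(disp=0).params            # step 1: fit the experimental data
    ll_x_theta = loglik(theta, X, y)
    biases = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y), size=len(y))        # step 2: resample with replacement
        Xs, ys = X[idx], y[idx]
        try:
            theta_s = sm.Logit(ys, Xs).fit(disp=0).params  # step 3: refit the resample
        except Exception:
            continue                                      # non-converged resamples are skipped
        biases.append((loglik(theta_s, Xs, ys) - loglik(theta, Xs, ys))   # steps 3 and 5
                      + (ll_x_theta - loglik(theta_s, X, y)))             # steps 1 and 4
    biases = np.array(biases)
    return biases.mean(), biases.var(ddof=1)

b_mean, b_var = eic_bias([480, 490, 495, 500, 505, 510, 515, 520, 525, 530, 540, 550],
                         [0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1])
print(f"estimated bias = {b_mean:.3f}, variance = {b_var:.3f}")
```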

The average and the variance of the bias are displayed on the “Results 1” page for the N x Q sets of

resampled data. BLC does the calculation with N = 100 and Q = 10 for a total of 1000 random samples.

Step 4. of the above procedure is quite time consuming. It was therefore decided to limit the number of

random resampling to 1000. It sometimes occurs that the fitting process of the resampled data set does not

converge despite every effort to correctly evaluate the starting values of the Simplex optimization

procedure. When this occurs the results are ignored and it is accounted for in the bias average and

variance evaluation.

For very small samples (less than 15 experimental data points), the equations developed in [38] for small sample correction of the IC for the Probit and Logit link functions can result in very large values of bias (on the order of the IC value). Interestingly, calculation of the bias using the EIC process results in much smaller values, but with standard deviations of the order of the EIC value. Therefore, when the analytically developed small sample correction results in aberrant values, the EIC calculated values are displayed along with their standard deviation.


5 Results description

The principal results are provided in the “Results 1” and “Results 2” sheets. The “Results 1” sheet

contains summary data while the “Results 2” sheet displays experimental data points along with their

confidence limit values based on either a Normal or a Binomial distribution of the error.

Other summary data are provided in graphical form in the “Box Plot” sheet. In addition, raw data points

along with their fitted curves and the confidence limits curves are presented graphically in the “Probit”,

“Logit”, “Scobit”, “Gompit” and “Weibull” sheets.

5.1 Summary data (“Results 1” sheet)

5.1.1 Experimental data sample statistics

This part concerns descriptive statistics of the experimental data sample.

Sample median (m/s): The sample data median value.

Sample mean (m/s): The sample arithmetic mean.

Sample σ (m/s): The sample standard deviation.

Spread (m/s): The sample spread (difference between the maximum velocity value and the

minimum velocity value of the sample).

ZMR (m/s): The Zone of Mixed Results (ZMR) of the sample. It is the difference between the

highest maximum partial perforation velocity (HP) and the lowest minimum complete perforation

(LC) velocity.

LC (m/s): Lowest Complete (LC) perforation of the sample.

HP (m/s): Highest Partial (HP) perforation of the sample.

Skewness: It defines the level of skewness of the data sample. By convention, a right-skewed sample has a positive value and corresponds to a distribution with the longest tail to the right. A left-skewed sample has a negative value, with the longest tail to the left of the distribution.

Kurtosis: It defines the level of kurtosis. Positive kurtosis means that the tails of the data sample

distribution are heavier than standard normal and negative kurtosis means that the tails of the data

sample distribution are lighter than standard normal.

5.1.2 Independence of the dependent variable

This part of the “Results 1” sheet concerns randomness tests done on the experimental data sample to ensure the mutual independence of the dependent variables. Independence of the dependent variable is one of the important assumptions of the GLM process, as described in Section 4.3. A message displayed in green means a positive answer, i.e., the test sample can be assumed to be random, while a message displayed in red means a negative answer, i.e., the test sample cannot be assumed to be random.


MONOBIT Test p-value for sample randomness: Provides the result of the Monobit Test (H0:

data are randomly distributed) with the p-value related to the test statistic. The Monobit test

procedure is described in Section 4.3.1.1.

RUN Test p-value for sample randomness: Provides the result of the Run Test (H0: data are

randomly distributed) with the p-value related to the test statistic. The Run test procedure is

described in Section 4.3.1.2.

Cumulative Sum Test p-value for sample randomness: Provides the result of the Cumulative

Sum Test (H0: data are randomly distributed) with the p-value related to the test statistic. The

Cumulative Sum test procedure is described in Section 4.3.1.3.

5.1.3 Parameter values and statistics

In this part, the fitted parameter values and statistics, based on the calculations described above are

provided for each different link function.

V50 ± SE (m/s): The ballistic limit calculated from the fitted values of α and β (following the equations specified in the last column of Table 1, Section 3.2.6) and the calculated standard error of the parameters as described in Section 3.4.2 are displayed.

σ ± SE (m/s): The standard deviation calculated from the fitted values of α and β, following the process described in Section 3.2.6, and the calculated standard error as described in Section 3.4.2.

α ± SE (m/s): The value of parameter α as determined by the ML fitting process along with its standard error estimation as described in Section 3.4.2.

β ± SE (m/s): The value of parameter β as determined by the ML fitting process along with its standard error estimation as described in Section 3.4.2.

δ ± SE (m/s): The value of parameter δ as determined by the ML fitting process along with its standard error estimation as described in Section 3.4.2.

Standard Error of Estimate at V50 (m/s): The standard error of the fitted equation: SEE = √[ Σ_{i=1..M} (Pri − P̂(Vi))² / (M − p) ], where p is the number of parameters in the fitted equation, P̂(Vi) is the probability of perforation evaluated using the fitted function at velocity Vi, and Pri is the response of the material (either 0 or 1) obtained experimentally.

5.1.4 Significance of the model

The significance of the model is assessed through the evaluation of the significance of the different fitted

parameters of the model. The message displayed in green means that it is a positive answer, i.e., the

parameter is significant, while the message displayed in red means that it is a negative answer, i.e., the

parameter is not significant. The confidence level that is used to compare with the calculated p-value is

specified on the “Main” sheet (Statistical Test confidence level). The calculated p-value of the test is also

provided to give insight on how close it is to the confidence level.

Wald Test p-value on α: Result of the test of significance of the α parameter using the Wald Test as

described in Section 4.4.1.2.

Wald Test p-value on β: Result of the test of significance of the β parameter using the Wald Test as

described in Section 4.4.1.2.


Wald Test p-value on δ: Result of the test of significance of the δ parameter using the Wald Test as

described in Section 4.4.1.2.

Log-Likelihood Ratio test p-value: Result of the test of significance of the β and δ parameters

using the Likelihood Ratio Test as described in Section 4.4.1.1.

5.1.5 Goodness-of-fit of the model

Goodness of fit is assessed using the Anderson-Darling Test and the Stukel Test. The message displayed

in green means that it is a positive answer, i.e., the fit between the experimental data points and the fitted

curve is significant, while the message displayed in red means that it is a negative answer, i.e., the fit

between the experimental data points and the fitted curve is not significant. The confidence level that is

used to compare with the calculated p-value is specified on the “Main” sheet.

This section of the “Results 1” sheet also presents other measures of a model's goodness-of-fit, such as the Receiver Operating Characteristic (ROC) Area Under the Curve (AUC) value and its associated values (sensitivity, specificity, threshold probability), the Deviance and Pearson X2, different measures of pseudo-R2, the Log-Likelihood value, the Least Square error value and the small-sample corrected information criterion (IC) value.

Anderson-Darling Goodness-of-Fit test p-value: P-value and test diagnostic using the Anderson-

Darling Test procedure to assess the goodness-of-fit of the data points to the fitted link functions as

described in Section 4.4.2.1.

Stukel test for fit of distribution tails p-value: P-value and test diagnostic using the Stukel Test

procedure to assess the goodness-of-fit of the data points to the fitted logit link functions as

described in Section 4.4.2.3.

ROC AUC: Receiver Operating Characteristic (ROC) Area Under the Curve (AUC) value (Section 4.4.2.2). It represents the probability that a randomly selected complete perforation experimental data point has a higher value (higher impact velocity) than a randomly selected partial perforation experimental data point. It has a minimum equal to 0.5 (equivalent to chance) and a maximum of 1.0. The higher the AUC between those two limits, the better the model is at predicting the outcome. There are scales that qualify the quality of the ROC AUC; one of these qualitative scales is shown in [12] and is reproduced in Table 5.

Table 5: ROC AUC qualitative level of discrimination.

ROC AUC value Level of discrimination

0.50 ≤ ROC AUC < 0.70 Suggest no discrimination

0.70 ≤ ROC AUC < 0.80 Acceptable discrimination

0.80 ≤ ROC AUC < 0.90 Excellent discrimination

0.90 ≤ ROC AUC Outstanding discrimination

Model Sensitivity: Optimal proportion of complete perforation correctly identified by the model at

the optimal threshold probability value (Section 4.4.2.2).


Model Specificity: Optimal proportion of partial perforation correctly identified by the model at the

optimal threshold probability value (Section 4.4.2.2).

Threshold probability: Optimal value of probability at which the model sensitivity and specificity

are defined. Above that threshold, perforations are considered as complete, below that threshold,

perforations are considered as partial.

Deviance: Deviance is defined by Equation (12). It is the equivalent of the sum of squared residuals in OLS and therefore provides an estimate of the model departure from the experimental data points based on the likelihood rather than on the residuals. The smaller the Deviance, the better the model fits.

Pearson X2: The Pearson X2 statistic, built from the Pearson residuals, is a measure of the difference between the observed and fitted values. Again, the smaller the Pearson X2 value, the better the model fits.

Nagelkerke pseudo-R2: The Nagelkerke pseudo-R2 should be seen as a measure of the

improvement from null model to the fitted model. Its value lies between 0 and 1, but as stated in

Section 4.4.2.4, small values are often seen in GLM which doesn’t mean the model is a bad fit for

the data.

Tjur pseudo-R2: The Tjur pseudo-R2 should be seen as a coefficient of discrimination: It measures

the ability of the model to discriminate between successes and failures. Its value lies between 0 and

1, but as stated in Section 4.4.2.4, small values are often seen in GLM which doesn’t mean the

model is a bad fit for the data.

Log-Likelihood: The Log-Likelihood value for each link function as calculated using Equation (8).

The maximum value represents the best model fit.

Least square: The least square value for each link function as calculated by summing the square of

the residuals between the data points and the estimated probability. The minimum value represents

the best fit model.

Small sample corrected information criterion bias: This value is the small sample corrected bias estimate of the information criterion (IC) based on the two methods presented in Sections 4.4.2.6 and 4.4.2.7 (see also Section 4.4.2.5). The actual correction on the IC is 2 times the bias value. Theoretically, for large sample sizes, the bias correction on the IC is 2p, where p is the number of parameters in the model.

Small sample corrected IC ± SD: This is the Information Criterion value that can be compared between the models to assess which one best fits the experimental data. It provides a measure of how well the model fits the data versus the complexity of the model. The lower the IC value, the better the model. The standard deviation (SD) of the IC is also provided for the models that rely on the Bootstrap method to estimate the small sample bias correction (see Section 4.4.2.7).

5.1.6 Data analysis based on STANAG 2920

An illustration of the different variables presented in this paragraph is presented in Figure 7.

V50 (m/s) CI length from STANAG 2920 method: The confidence interval (CI) length of the

ballistic limit for the confidence level specified on the “Main” sheet. The confidence level is

calculated using the procedure described in STANAG 2920 [1].


V50 max (m/s) from STANAG 2920 method: The maximum expected value of the ballistic limit

with a confidence level as specified in the “Main” sheet. The value is evaluated based on the

procedure described in STANAG 2920 [1].

V50 min (m/s) from STANAG 2920 method: The minimum expected value of the ballistic limit

with a confidence level as specified in the “Main” sheet. The value is evaluated based on the

procedure described in STANAG 2920 [1].

Sigma max (m/s) from STANAG 2920 method: The maximum value of the standard deviation

(SD) to be expected at the confidence level specified in the “Main” sheet. It is calculated using the

procedure described in STANAG 2920 [1] for each link function.

Sigma min (m/s) from STANAG 2920 method: The minimum value to be expected of the SD at

the confidence level specified in the “Main” sheet. It is calculated using the procedure described in

STANAG 2920 [1] for each link function.

[Figure: perforation probability versus impact velocity (m/s), showing the experimental points, the fitted Logit curve, the upper and lower confidence interval curves for a normal error distribution, the V50, the maximal and minimal V50 values from the normal error distribution and from STANAG 2920, and the corresponding CI lengths.]

Figure 7: Illustration of V50 statistics. In this example, the STANAG 2920 maximal and minimal V50 values are located within the normal error distribution bracket. It is not necessarily always the case.

5.1.7 Confidence Interval at V50 based on Normal or Binomial error distribution

Depending on the confidence interval (CI) error distribution selected on the “Main” sheet (either Normal or Binomial), two different types of confidence intervals are presented:


a. Normal: Based on the standard error of the parameters (Section 3.4.2), the CI for a Normal distribution of the error around the mean (fitted) curve is calculated based on the method described in Section 3.4.3.1 at the confidence level specified on the “Main” sheet.

b. Binomial: The CI for a Binomial distribution of the error around the mean (fitted) curve is

calculated based on the method described in Section 3.4.3.2 at the confidence level specified on

the “Main” sheet.

An illustration of the different variables presented below is presented in Figure 7.

V50 (m/s) CI length from Normal (or Binomial) CI curves: It is the confidence interval length of

the ballistic limit for the confidence level specified on the “Main” sheet.

V50 max (m/s) from Normal (or Binomial) CI curves: It is the maximum expected value of the

ballistic limit for the confidence level specified in the “Main” sheet.

V50 min (m/s) from Normal (or Binomial) CI curves: It is the minimum expected value of the

ballistic limit with a confidence level as specified in the “Main” sheet.

Notice that even though the confidence level is the same for the 3 types of CI calculation methods (STANAG 2920 of Section 5.1.6, Normal, and Binomial), the calculated values are different.

5.1.8 Vproof statistics

The Vproof statistics are calculated for the confidence level value and the Vproof probability value

specified in the “Main” sheet. Figure 8 presents the Vproof statistics in a graphical form. Results in the

graphical form can be found in the “Probit”, “Logit”, “Gompit”, “Scobit” and “Weibull” sheets (section

5.3.3). The Vproof statistics presented on the “Results 1” page are:

Vproof (m/s): The proof velocity (Vproof) determined at the probability of perforation specified as input in the “Main” sheet. That velocity is calculated from Equations (2) to (6) using the fitted values of α, β and δ specified above. It is represented by the big black dot on Figure 8 for 10% probability of perforation in that example.

Max prob of perforation (%): This is the maximum probability of perforation to be expected at

Vproof based on the confidence level specified on the “Main” sheet. It is represented by the value

on the upper confidence limit curve at the Vproof velocity on Figure 8. Again this value is different

depending if the Normal or the Binomial error distribution is used.

Min prob of perforation (%): This is the minimum probability of perforation to be expected at

Vproof based on the confidence level specified on the “Main” sheet. It is represented by the value

on the lower confidence limit curve at the Vproof velocity on Figure 8. Again this value is different

depending if the Normal or the Binomial error distribution is used.

Vproof max value (m/s): This is the maximum Vproof value to be expected at the probability of perforation specified on the “Main” sheet. That value is calculated based on the confidence level specified. It is represented by the value on the lower confidence limit curve at the 10% probability of perforation on Figure 8. Again, this value is different depending on whether the Normal or the Binomial error distribution is used.


Vproof min value (m/s): This is the minimum Vproof value to be expected at the probability of

perforation specified on the “Main” sheet. That value is calculated based on the confidence level

specified. Again this value is different depending if the Normal or the Binomial error distribution is

used.
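For the two-parameter links, Vproof can be obtained by inverting the fitted cumulative function at the requested probability of perforation. The short sketch below assumes the linear predictor form α + βV used for the Probit and Logit links in Table 1 (the Scobit, Gompit and Weibull inversions differ) and hypothetical fitted parameter values.

```python
from math import log
from scipy.stats import norm

def vproof_probit(alpha, beta, p):
    """Invert P = Phi(alpha + beta*V) at probability p (Probit link, assuming eta = alpha + beta*V)."""
    return (norm.ppf(p) - alpha) / beta

def vproof_logit(alpha, beta, p):
    """Invert P = 1/(1 + exp(-(alpha + beta*V))) at probability p (Logit link)."""
    return (log(p / (1 - p)) - alpha) / beta

# Hypothetical fitted Probit parameters; V50 corresponds to p = 0.5, Vproof here to p = 0.10
alpha, beta = -46.93, 0.0911
print(f"V50    = {vproof_probit(alpha, beta, 0.50):.1f} m/s")
print(f"Vproof = {vproof_probit(alpha, beta, 0.10):.1f} m/s")
```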

[Figure: cumulative probability of perforation curve with its upper and lower confidence limits, annotated with the Vproof at 10% probability of perforation, the maximum and minimum probabilities of perforation at Vproof, and the Vproof max value.]

Figure 8: Schematic illustrating Vproof statistics with an example for 10% probability of perforation.

The fitted cumulative function is presented in red, the lower confidence limit is presented in purple and

the upper confidence limit is presented in green.

5.2 Experimental data points and confidence limit curves (“Results 2” sheet)

On the “Results 2” Sheet, the experimental data points and confidence limit curves can be found. It consists of a series of six tables. The first table contains the ordered experimental data. The second, third, fourth, fifth and sixth tables contain the fitted probability values and the upper and lower confidence limit curves for each of the link functions (Logit, Probit, Scobit, Gompit and Weibull). The confidence limit curves are calculated based on the selected confidence interval error distribution and the confidence level for the error distribution (both specified on the “Main” sheet), and they are compatible with the values described in Section 5.1.7 (Confidence Interval at V50 based on Normal or Binomial error distribution).


5.3 Other sheets

5.3.1 Box Plot sheet

Details of the data presented on the Box Plot sheet are presented in section 4.2.2.

5.3.2 Proofing Statistics sheet

The “Proofing Statistic” sheet enables the calculation of the minimum number of shots required and the number of complete perforations allowed to test that the actual probability of perforation of an armour does not exceed a certain level (say 10%) with a certain confidence level (say 95%).

For example, based on Table 6, accepting a batch for 10% probability of perforation when a maximum of 2 perforations is allowed will require 61 shots. However, to minimize the number of shots, if no perforation has occurred after 29 shots, the test is terminated and the batch is accepted. If one perforation occurs, the test continues up to 46 shots. If no other perforation occurs, the test is terminated and the batch is accepted. If a second perforation occurs, the test continues up to 61 shots. If no other perforation occurs, the batch is accepted. If a third perforation occurs, the batch is rejected.

To compute proofing statistics in BLC:

1. Click on the button “Click here to enter data”.

a. A floating menu will appear. Enter the requested information

i. “Proof probability”, between 0 and 1.

ii. “Confidence Level”, between 0 and 1.

b. Afterward, click the “Calculate” button.

c. To cancel the calculation, click the “Exit” button at any time.

d. The software will refresh the table with values calculated based on the user inputs.

Table 6: Example of the number of perforations allowed (Allowed Nb of failures) and total number of shots to fire (Nb of shots required) to test 10% probability of perforation (V10%) with a confidence level of 95% (CL = 95%).

Allowed Nb of failures    Nb of shots required
0                         29
1                         46
2                         61
3                         76
4                         89
5                         103
6                         116
7                         129
8                         142
9                         154
10                        167
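The values in Table 6 are consistent with the standard binomial acceptance-sampling relation: the number of shots required is the smallest n for which the probability of observing at most k perforations, when the true probability of perforation equals the proof level, does not exceed 1 − CL. The Python sketch below, which is an assumption about how the table is generated rather than the BLC code itself, reproduces Table 6 for p = 0.10 and CL = 95%.

```python
from scipy.stats import binom

def shots_required(max_failures, proof_prob=0.10, confidence=0.95):
    """Smallest n such that P(X <= max_failures | n, proof_prob) <= 1 - confidence."""
    n = max_failures + 1
    while binom.cdf(max_failures, n, proof_prob) > 1.0 - confidence:
        n += 1
    return n

for k in range(0, 11):
    print(k, shots_required(k))   # yields 29, 46, 61, 76, ... as in Table 6
```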

5.3.3 Graphical output sheets

The 5 graphical output sheets are “Probit”, “Logit”, “Gompit”, “Scobit” and “Weibull” sheets. These

5 sheets present the graphical representation of the experimental data, the fitted curve using the ML fitting

process and the confidence limit curves based on the error distribution shape and the confidence level for

the error distribution values provided in the “Main” sheet.

[Figure: two panels of perforation probability versus impact velocity (m/s) showing the experimental points, the fitted curve and the upper and lower confidence limit curves.]

Figure 9: Example of perforation probability fitted curve for normal error (left) and binomial error

(right) distribution for 95% confidence level.

As an example, Figure 9 presents the probability of perforation for the Logit link function with the normal error distribution shape and the binomial error distribution shape. Experimental data points are presented in blue on the 0.0 and 1.0 probability of perforation axes, the fitted data points corresponding to the experimental data points are presented as red squares, the fitted curve is the continuous line in red and, finally, the upper and lower confidence limit curves are represented by continuous green and purple curves respectively. The process to calculate the confidence limits is explained in Section 3.4.3.

5.3.4 ROC sheet

Receiver Operating Characteristic curves for each of the link functions are presented on the ROC sheet. Details of what an ROC curve is and how to interpret it can be found in Section 4.4.2.2.


6 Data analysis example

6.1 General

This section provides a description of the proposed process for the analysis of ballistic data. It is divided

into two parts: the first concerns the selection of the best model and how to determine which models are

valid for further analysis, while the second details where the data for ballistic data analysis can be found

in BLC. Both processes are illustrated by an example.

6.2 Determination of the correct model

This section deals with the question: How to determine which model is correct? Keeping in mind George Box's quotation, "The most that can be expected from any model is that it can supply a useful approximation to reality: All models are wrong; some models are useful", this question then becomes: which model is closer to reality? To answer that question, guidelines on how to analyse the data calculated by BLC are presented. Then, an example of typical output is presented and the guidelines mentioned above are applied.

6.2.1 Guidelines

Various outputs on the “Results 1” sheet can help diagnose which model better represents the data:

1. The first thing is to verify if the calculated parameters (α, β and, where applicable, δ) are significant. To verify that point, the Wald test and Likelihood Ratio results can be analysed.

a. Each of the 2 or 3 parameters have to be significant using the Wald Test (that is, the message

“Reject that (α or β or δ) is null” appears in green). Remember that the Wald test should be

considered as a conservative test, that is: if the Wald test rejects the null hypothesis, then the

tested parameter is most likely significant, because as mentioned in [12], it often fails to reject the

null hypothesis even though the parameter is significant.

b. Similarly, the Likelihood Ratio test compares the full model to the constant-only (α parameter only) model. Again, if the message “Reject that α, β and δ is null” appears in green, then this means that β (for 2-parameter link functions) or β and δ (for 3-parameter link functions) are significant.

2. The other thing to verify is the goodness-of-fit of the model relative to the data. This can be verified

using the Anderson-Darling (AD) test, and the ROC.

a. The ROC AUC has to be as close as possible to 1.0. Use of the BLC until now has very often shown that the value of the ROC AUC is the same for all link functions. The model should at least have an ROC AUC value above 0.70, which represents an acceptable discrimination (Table 5).


b. Because they represent the proportion of complete and partial perforations correctly identified by the model, the sensitivity and specificity of the model have to be as close as possible to 1.

c. The use of the AD test to assess if the experimental data sample is a good fit to the ML fitted link

function can be done keeping in mind, as mentioned in Section 4.4.2.1, that:

i. Sample sizes above 50 are required to reach power levels above 80%.

ii. For lower sample size, the increase of the allowable Type I error level results in higher

power level for the same sample size. As a result, it is advisable to accept higher Type I

error (15 to 20%) in order to reach acceptable Type II error rates for the typical types of

sample size used in ballistic studies. Therefore, despite the diagnostic provided by BLC, the

actual p-value should be verified and should be above 0.15 to 0.20.

3. The Information Criterion values. The lower the IC value, the better the model. Of all the criteria presented in this section, this is the only one that can be directly compared between the different models. Different aspects have to be considered when comparing the IC values between models:

a. Use of the BLC until now has shown that the bias correction for sample sizes as small as 15 can result in a large bias correction, which can throw off otherwise promising models.

b. The variance on the IC value for the models which require the Bootstrap method can be quite large. Therefore, when the models are compared to each other, the difference between the IC values can be insignificant.

c. Reduction of the variance on the bias estimate can be achieved by increasing the number of experimental data points.

d. A more accurate evaluation of the bias and its variance can be obtained by increasing the total number of Bootstrap samples from 1000 to, say, 1,000,000, but this can only be done at the cost of very long execution times.

The selection process to determine the best model can be described in graphical form (Figure 10). A

model that has insignificant parameters is useless, hence the first step is verifying the parameter

significance. The second step is to select the highest ROC AUC value, indicating that the selected model has the highest predictive capability. Finally, the IC has to be verified; it combines the goodness of fit (Log likelihood value) and the model complexity, and it has to be as low as possible.

In some situations, only one specific model may need to be used (for example, because it is necessary to follow the STANAG 2920 or the NIJ 0101.06 standard). In this case, only the validity of the

model has to be verified and therefore only the parameter significance and the ROC AUC (higher than

0.70) needs to be checked (Figure 10).


[Figure: decision flow from the candidate models (Logit, Probit, Scobit, Gompit, Weibull) to the best model through three gates: parameter significance, i.e., not null (Wald test and Likelihood Ratio test, essential; Anderson-Darling at a higher rejection level, desirable); ROC (highest AUC, at least 0.70, essential; highest sensitivity and specificity, desirable); and Information Criteria (lowest value, essential).]

Figure 10: Decision process to determine best model.

6.2.2 Correct model determination: An example

An example of BLC results using the ballistic data presented in [1] is presented in this section. An excerpt of the “Result 1” page is presented in Figure 11 for the Probit, Logit and Scobit link functions. The Gompit and Weibull link functions were removed from this analysis to simplify the process. Figure 11 is divided into sections for further discussion. Note that for Sections B and C, a message displayed in green means that the test result supports the use of the model, while a message displayed in red means that the test result does not support the use of the model. In the discussion below, the conclusion about each analysis point is written in italics to simplify the reading.

Section A: Shows the calculated ballistic limit value and its standard deviation. It also shows the

values of α, β and δ as defined above. For each variable, the standard error is presented. Note that

the ballistic limit values are very close for the different link functions. The other variables are not

directly comparable as they are used differently within the different link functions as shown in

Table 1.

Section B: Shows the result of the Wald test and the Likelihood ratio test. The Wald test enables the verification of the probability that each parameter of each link function is null. The results provided by BLC specify whether the parameter should be considered as not null (Reject that it is null) or as null (Cannot reject that it is null). For further information, the probability that the parameter is null is written in parentheses. Depending on the “Statistical Test Confidence Level” value specified in the “Main” Worksheet, the diagnostic will be to reject or not to reject that a specific parameter is null. In a similar way, the Likelihood ratio test has the same purpose and is presented similarly by BLC. The only difference is that, contrary to the Wald test, the Likelihood ratio test assesses whether all parameters are null rather than only one. The result displayed in Figure 11 shows that the parameters of each link function are all significant (cannot be considered as null).


Section C: Shows the result of the Anderson-Darling test and the Stukel test. The Anderson-Darling

test compares the experimental distribution of partial-complete penetrations to the fitted curve and

assesses if they correspond. It therefore assesses if the experimental data corresponds to the best

fitted link function. The results provided by BLC specify whether we can or cannot reject that the experimental data correspond to a specific link function. Again, the probability value is specified in parentheses. The Stukel test was designed for the Logit link function and verifies if its additional parameters are significant (not null). This is done by enhancing the capability of the Logit function at its tails and therefore provides information on the tails of the fitted Logit function. The result displayed in Figure 11 shows that the experimental data correspond to the fitted Logit and Probit functions. It also shows that the experimental data do not correspond well to the fitted Scobit function. Finally, the Stukel test shows that the experimental data are poorly modeled by the tails of the Logit function.

Section D: Shows the model’s AUC, sensitivity and specificity. The closer the AUC is to 1.0, the better the model is. Each model’s sensitivity (proportion of complete penetrations correctly predicted by the model) and specificity (proportion of partial penetrations correctly predicted by the model) also needs to be as close to 1.0 as possible. The result displayed in Figure 11 shows that the 3 models are equivalent in terms of AUC, sensitivity and specificity. With an AUC of 0.8586, the models show excellent discrimination (Table 5).

Section E: Shows the Tjur and Nagelkerke pseudo-R2 values for the different models. The value has

to be as close as possible to 1.0. Nagelkerke pseudo-R2 measures the improvement from null model

to the fitted model. The Tjur pseudo-R2 should be seen as a coefficient of discrimination and

therefore measures the ability of the model to discriminate between successes and failures. The

result displayed in Figure 11 shows that the Scobit model has the highest values of pseudo-R2.

Section F: Shows the Log Likelihood value and the Least-square value for the different models.

Remember that the model which has the highest likelihood (highest possible value of likelihood is

1.0) corresponds to the model that has the Log likelihood value closest to 0 (log(1) = 0). The result

displayed in Figure 11 shows that the model that has the Log likelihood value closest to 0 is the

Scobit model whereas the model that has the lowest least-square value is the Probit model.

Section G: Shows the Information Criterion (IC) for each model. The IC accounts for how well the model fits the data and also for the complexity of the model; it is the summation of the Log likelihood value and the number of parameters (Equation (23)). This is the only criterion that enables comparison between the different models. The model with the lowest IC value is the best model. The result displayed in Figure 11 shows that the Probit is the best of the 3 models because it fits the data well and because of its simplicity. The Scobit model has a lower Log likelihood value, and therefore fits the data better, but it is more complex than the Probit model.
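BLC's Equation (23) defines its own small-sample corrected criterion. To illustrate the general idea of trading goodness of fit against complexity, the sketch below uses the widely known small-sample corrected AIC (AICc); this is one common form, not necessarily the one implemented in BLC, and all numbers are hypothetical.

```python
# Hedged illustration of an information criterion: the small-sample corrected AIC (AICc).
# This is one common form; BLC's Equation (23) may differ in its exact constants.
def aicc(loglik, n_params, n_obs):
    aic = -2.0 * loglik + 2.0 * n_params
    small_sample_correction = 2.0 * n_params * (n_params + 1) / (n_obs - n_params - 1)
    return aic + small_sample_correction

# Illustrative comparison on 20 hypothetical shots: a 2-parameter model versus a
# 3-parameter model with a slightly better log-likelihood.
print(aicc(-9.32, 2, 20))   # simpler model
print(aicc(-9.12, 3, 20))   # better fit, but penalized for the extra parameter
```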


[Figure 11 image: the BLC “CURVE FIT STATISTICS for Cl = 95%” worksheet output, sections A to G, for the fitted link functions.]

Figure 11: Curve fit statistics data for the Logit, Probit and Scobit models.

To conclude, for this example, the best model for the data is the Probit model because:

- the Probit model has significant parameters;

- the Probit model predicts complete/partial penetration (ROC AUC value) as well as the other models, and its AUC is above 0.7;

- the Probit model has the lowest IC value, i.e., it has the best combination of a low Log likelihood value and low complexity.

Note that in the Figure 11 example, the Logit and Gompit models, like the Probit model, are both equally valid for further analysis as their parameters are significant and their ROC AUC values are above 0.70.

6.3 Ballistic limit and Vproof analysis

Once the model of interest has been validated, a typical analysis consists of extracting data from the different Worksheets. Most of the data can be found in the “Result 1” Worksheet. Table 7 summarises the most important variables and where they are located; a minimal spreadsheet-reading sketch is given after the table.

Table 7: Variables of interest for Ballistic Limit and Vproof data analysis in “Result 1” Worksheet.

General

Line number  Data of interest
21  Ballistic limit with its standard error (using STANAG 2920 and NIJ 0101.06 calculation method – Maximum Likelihood for Probit and Logit link function)
22  Standard deviation of the distribution (using STANAG 2920 and NIJ 0101.06 calculation method – Maximum Likelihood for Probit and Logit link function)




6 Zone of Mixed Results (ZMR)

7 Lowest complete penetration (LC)

8 Highest partial penetration (HP)

5 Spread of the data

Confidence interval limits based on STANAG 2920 method (Figure 7)1

Line number Data of interest

49 Confidence interval length of the V50

50 Maximum V50 value at the upper end of the confidence interval

51 Minimum V50 value at the lower end of the confidence interval

52 Maximum standard deviation value at the upper end of the confidence interval

53 Minimum standard deviation value at the lower end of the confidence interval

Confidence interval limits for Normal or Binomial distribution of error (Figure 7)1,2

Line number Data of interest

54 Confidence interval length of V50 for Normal or Binomial distribution error

55 Maximum V50 value at the upper end of the confidence interval

56 Minimum V50 value at the lower end of the confidence interval

Vproof statistics (Figure 8)1, 2 & 3

Line number Data of interest

61 Vproof value for the set probability of perforation

62 Maximum probability of perforation at the upper end of the confidence interval

63 Minimum probability of perforation at the lower end of the confidence interval

64 Maximum Vproof at the upper end of the confidence interval

65 Minimum Vproof at the lower end of the confidence interval

1 Confidence level used to determine the confidence interval is set in the “Main” Worksheet, Confidence Level for Error Distribution cell.
2 Confidence level distribution shape – Normal or Binomial – used to determine the confidence interval is set in the “Main” Worksheet, Error Distribution Shape cell.
3 Vproof probability value is set in the “Main” Worksheet, Vproof probability value cell.
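As a convenience, the rows listed in Table 7 can be read programmatically. The sketch below uses openpyxl; the workbook file name, the value column and the selection of rows are illustrative assumptions, and only the row numbers and the worksheet name come from Table 7.

```python
# Minimal sketch: pull selected Table 7 rows out of the BLC workbook with openpyxl.
# The file name and the value column are assumptions made for illustration only.
from openpyxl import load_workbook

wb = load_workbook("BLC_results.xlsx", data_only=True)   # hypothetical file name
ws = wb["Result 1"]                                       # Worksheet named in Table 7

ROWS_OF_INTEREST = {
    "Ballistic limit (V50) with its standard error": 21,
    "Standard deviation of the distribution": 22,
    "Zone of Mixed Results (ZMR)": 6,
    "Vproof value for the set probability of perforation": 61,
}
VALUE_COLUMN = "B"   # assumption: the values sit in column B beside the row labels

for label, row in ROWS_OF_INTEREST.items():
    cell_value = ws[f"{VALUE_COLUMN}{row}"].value
    print(f"{label}: {cell_value}")
```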


7 Conclusion

This document provides scientific information on the mathematical and statistical methods used by the

BLC (Ballistic Limit Calculator) Beta version 2 program to generate ballistic limit and Vproof data and

their related statistics based on experimental ballistic data.

The BLC software enables the analysis of ballistic data using 5 different statistical models (namely, the

Probit, Logit, Gompit, Scobit and Weibull models) that, at least for the Probit and Logit models, follow

the data analysis procedures of the STANAG 2920 and NIJ 0101.06 standards. In addition, different statistical measures and statistical tests are presented along with their limitations and details on their meaning. This enables the BLC user to have a clear understanding of the significance and the goodness of fit of the different models. Finally, a typical data analysis process is presented through an example.

This document is also the user manual of the BLC Beta version 2 program. As such, this document

provides details on the nature and meaning of input and output information and presents its different

features.


References

[1] AEP 2920 – Procedures for the Evaluation and Classification of Personal Armour, Bullet and

Fragmentation Threats, Edition A, Version 1, 22 June 2015, NATO Standardization Office, Brussels.

[2] STANAG 2920 – Classification of Personal Armour (EDITION 3), 22 June 2015, NATO

Standardization Office, Brussels.

[3] NIJ Standard-0101.06 – Ballistic Resistance of Body Armour, July 2008, U.S. Department of Justice,

Office of Justice Programs, National Institute of Justice.

[4] Nagler, J., Scobit: An Alternative Estimator to Logit and Probit, American Journal of Political Science, Vol. 38, No. 1, February 1994, pp 230–255.

[5] Brathwaite, T. and Walker, J.L., Asymmetric, closed-form, finite-parameter models of multinomial

choice, Journal of Choice Modelling (2017), http://dx.doi.org/10.1016/j.jocm.2018.01.002

[6] Mauchant, D., Rice, K.D., Riley, M.A., Leber, D., Samarov, D., and Forster, A.L., Analysis of Three

Different Regression Models to Estimate the Ballistic Performance of New and Environmentally

Conditioned Body Armor, U.S. Department of Commerce, National Institute of Standards and

Technology, NISTIR 7760, February 2011.

[7] Nelder, J.A. and Mead, R.A., A simplex method for function minimization. Computer Journal, Vol. 7,

308–313, 1965.

[8] Williams, R., Maximum Likelihood Estimation, University of Notre-Dame,

http://www3.nd.edu/~rwilliam/, Last revised January 14, 2016, Accessed 13 July 2016.

[9] Watkins, J., Topic 15 – Maximum Likelihood Estimation, University of Arizona,

http://math.arizona.edu/~jwatkins/o-mle.pdf, November 1 and 3 2011. Accessed 13 July 2016.

[10] Genschel, U. and Meeker, W.Q., A Comparison of Maximum Likelihood and Median-Rank

Regression for Weibull Estimation. Quality Engineering, 22 (4), pp. 236–255, 2010.

[11] Nemes, S., Jonasson, J.M., Genell, A. and Steineck, G., Bias in odds ratios by logistic regression modelling and sample size, BMC Medical Research Methodology, Vol. 9, No. 56, 2009.

[12] Hosmer, D.W. and Lemeshow, S., Applied Logistic Regression, Second Edition, Wiley Series on

Probability and Statistics, 2000.

[13] Ku, H.H., Notes on the Use of Propagation of Error Formulas, Journal of Research of the National

Bureau of Standards - C. Engineering and Instrumentation, Vol. 70C, No.4, October–December

1966.

[14] Lorenz, V., Yang, L., Grosse Perdekamp, M., Hertzog, D., Clegg, R., Error Analysis,

https://courses.physics.illinois.edu/phys403/sp2016/lectures/ErrorAnalysis_Lorenz2016.pdf .

PHYS403 Course lecture, University of Illinois at Urbana, College of Engineering, Spring 2016.


[15] Heo, J.H., Salas, J.D, Kim, K.D., Estimation of Confidence Intervals of Quantiles for the Weibull

Distribution, Stochastic Environmental Research and Risk Assessment, Vol. 15, 2001, pp 284–309.

[16] NIST/SEMATECH, e-Handbook of Statistical Methods: Measures of Skewness and Kurtosis,

https://www.itl.nist.gov/div898/handbook/eda/section3/eda35b.htm.

[17] Bulmer, M.G., Principles of Statistics, The M.I.T. Press, 1965, 214 pages.

[18] Williams, R., Logistic Regression, Part III: Hypothesis Testing, Comparisons to OLS, University of

Notre Dame, http://www3.nd.edu/~rwilliam/ , Last revised February 22, 2015.

[19] DeCook, R., 22s:152 Applied Linear Regression: Ch. 14 (sec. 1) and Ch. 15 (sec. 1 & 4), Logistic

Regression, Department of Statistics and Actuarial Science, University of Iowa,

http://homepage.stat.uiowa.edu/~rdecook/stat3200/notes/LogReg_part2_4pp.pdf , STAT3200

course notes, July 2011.

[20] Rukhin, A., Soto, J., Nechvatal, J., Smid, M., Barker, E., Leigh,S., Levenson, M., Vangel, M.,

Banks, D., Heckert, A., Dray,J. and Vo, S., A Statistical Test Suite for Random and Pseudorandom

Number Generators for Cryptographic Applications, US Department of Commerce, National

Institute of Standards and Technology, NIST Special Publication 800-22, Revision 1a, 2010.

[21] Rukhin, A.L., Statistical Testing of Randomness: New and Old Procedures, appeared as Chapter 3 in Randomness Through Computation, Zenil, H. ed., World Scientific, pp 33–51, 2011.

[22] Pennsylvania State University, Department of Statistics Online Programs, Stat 414/415, Lesson 49,

The Run Test, https://online.stat.psu.edu/stat414/node/329/, 2018.

[23] NCSS Statistical Software, Chapter 256, Analysis of Runs, https://ncss-wpengine.netdna-ssl.com/wp-content/themes/ncss/pdf/Procedures/NCSS/Analysis_of_Runs.pdf.

[24] Das, U., Dhar, S.S. and Pradhan, V., Corrected likelihood-ratio tests in logistic regression using

small-sample data, Communications in Statistics—Theory and Methods, 2017.

[25] Tanizaki, H., Power Comparison of Empirical Likelihood Ratio Tests: Small Sample Properties through Monte Carlo Studies, Kobe University Economic Review, Vol. 50, 2004.

[26] Cousineau, D. and Allen, T.A., Likelihood and its use in Parameter Estimation and Model

Comparison, Mesure et évaluation en éducation, Vol 37, No. 3, pp 63–98, 2015.

[27] Stephens, M.A., The Anderson-Darling Statistic, Prepared under Grant DAAG29-77-G-0O31 For

the U.S. Army Research Office, Technical Report No. 39, October 31, 1979.

[28] Razali, N.M. and Wah, Y.B., Power comparisons of Shapiro-Wilk, Kolmogorov-Smirnov, Lilliefors

and Anderson-Darling tests, Journal of Statistical Modeling and Analytics Vol.2 No.1, 21–33, 2011.


[29] Bispo, R., Marques, T.A. and Pestana, D., Statistical power of goodness-of-fit test based on

empirical distribution function for Type-I right censored data, Journal of Statistical Computation

and Simulation, 2011, pp 1–9.

[30] Hosmer, D.W., Hosmer, T., Le Cessie, S. and Lemeshow, S., A Comparison of Goodness-of-Fit Tests for the Logistic Regression Model, Statistics in Medicine, Vol. 16, 1997, pp 965–980.

[31] Mandrekar, J.N., Receiver Operating Characteristic Curve in Diagnostic Test Assessment,

Biostatistics for Clinicians, Journal of Thoracic Oncology, Vol 5, No 9, September 2010.

[32] Hanley, J.A. and Hajian, K.O., Sampling variability of nonparametric estimates of the areas under

receiver operating characteristic curves - An update, Statistics in Radiology, Acad Radiol, Vol. 4,

No. 1, January 1997, pp 49-58.

[33] Wonnacott, T.H. and Wonnacott, R.J., Introductory statistics for business and economics, Second

edition, Wiley and Sons Inc., 1997.

[34] Anderson, D.R and Burnham, K.P., Understanding information criteria for selection among capture-

recapture or ring recovery models, Bird Study, 46 (suppl.), 1999, pp S14-21.

[35] Konishi, S. and Kitagawa, G., Information Criteria and Statistical Modeling, Springer Series in Statistics, 2008, ISBN: 978-0-387-71886-6.

[36] Takeuchi, K., Distribution of informational statistics and a criterion of model fitting, Suri Kagaku

(Mathematic Sciences), 153, 1976, pp.12–18 (in Japanese)

[37] Akaike, H., A new look at the statistical model identification. IEEE Transactions on Automatic

Control, Vol. 19, No. 6, pp 716–723, 1974.

[38] Imori, S., Yanagihara, H., and Wakaki, H., General Formula of Bias-Corrected AIC in Generalized Linear Models, Scandinavian Journal of Statistics, Vol. 41, No. 2, pp 535–555, 2014.

DOCUMENT CONTROL DATA *Security markings for the title, authors, abstract and keywords must be entered when the document is sensitive

1. ORIGINATOR (Name and address of the organization preparing the document. A DRDC Centre sponsoring a contractor's report, or tasking agency, is entered in Section 8.)

DRDC – Valcartier Research Centre Defence Research and Development Canada 2459 route de la Bravoure Québec (Québec) G3J 1X5 Canada

2a. SECURITY MARKING (Overall security marking of the document including special supplemental markings if applicable.)

CAN UNCLASSIFIED

2b. CONTROLLED GOODS

NON-CONTROLLED GOODS DMC A

3. TITLE (The document title and sub-title as indicated on the title page.)

Calculation software for ballistic limit and Vproof tests from ballistic tests results (BLC)

4. AUTHORS (Last name, followed by initials – ranks, titles, etc., not to be used)

Bourget, D.; Bolduc, M.

5. DATE OF PUBLICATION (Month and year of publication of document.)

July 2020

6a. NO. OF PAGES

(Total pages, including Annexes, excluding DCD, covering and verso pages.)

60

6b. NO. OF REFS

(Total references cited.)

38

7. DOCUMENT CATEGORY (e.g.,, Scientific Report, Contract Report, Scientific Letter.)

Scientific Report

8. SPONSORING CENTRE (The name and address of the department project office or laboratory sponsoring the research and development.)

DRDC – Valcartier Research Centre Defence Research and Development Canada 2459 route de la Bravoure Québec (Québec) G3J 1X5 Canada

9a. PROJECT OR GRANT NO. (If appropriate, the applicable research and development project or grant number under which the document was written. Please specify whether project or grant.)

02ab - Soldier System Effectiveness (SoSE)

9b. CONTRACT NO. (If appropriate, the applicable number under which the document was written.)

10a. DRDC PUBLICATION NUMBER (The official document number by which the document is identified by the originating activity. This number must be unique to this document.)

DRDC-RDDC-2020-R056

10b. OTHER DOCUMENT NO(s). (Any other numbers which may be assigned this document either by the originator or by the sponsor.)

11a. FUTURE DISTRIBUTION WITHIN CANADA (Approval for further dissemination of the document. Security classification must also be considered.)

Public release

11b. FUTURE DISTRIBUTION OUTSIDE CANADA (Approval for further dissemination of the document. Security classification must also be considered.)

12. KEYWORDS, DESCRIPTORS or IDENTIFIERS (Use semi-colon as a delimiter.)

Ballistic Limit; Body armour; Applied Statistics; Personal protective equipment; Personal ballistic protection; Vehicle ballistic protection; Test method

13. ABSTRACT (When available in the document, the French version of the abstract must be included here.)

This document provides scientific information on the mathematical and statistical methods used by the BLC (Ballistic Limit Calculator) Beta Version 2 program to generate ballistic limit and Vproof data and their related statistics based on experimental ballistic data.

The BLC software enables the analysis of ballistic data using 5 different statistical models (namely, the Probit, Logit, Gompit, Scobit and Weibull models) that, at least for the Probit and Logit models, follow the data analysis procedures of the NATO STANAG 2920 and NIJ 0101.06 standards. In addition to the model parameters, BLC calculates standard errors and confidence limits on the model parameters, on the ballistic limit, on the standard deviation and on the Vproof. Furthermore, a series of statistical tests enables the comparison of the different models and the assessment of their significance and goodness of fit.

This document is also the user manual of the BLC Beta version 2 program. As such, this document provides details on the nature and meaning of input and output information and presents its different features.

Ce document fournit des informations scientifiques sur les méthodes mathématiques et statistiques utilisées par le programme BLC (Ballistic Limit Calculator) Beta Version 2 pour générer des données de limite balistique et Vproof et leurs statistiques associées basées sur des données balistiques expérimentales.

Le logiciel BLC permet l'analyse de données balistiques à l'aide de 5 modèles statistiques différents (à savoir, les modèles Probit, Logit, Gompit, Scobit et Weibull) qui, au moins pour les modèles Probit et Logit, suivent les procédures d'analyse des données des normes OTAN STANAG 2920 et NIJ 0101.06. En plus des paramètres du modèle, BLC calcule les erreurs standards et les limites de confiance sur les paramètres du modèle, sur la limite balistique, sur l'écart type et sur le Vproof. En outre, une série de tests statistiques permet la comparaison des différents modèles et permet d'évaluer leur signification et leur qualité d'ajustement.

Ce document est également le manuel d'utilisation du programme BLC Beta version 2. À ce titre, ce document fournit des détails sur la nature et la signification des informations d'entrée et de sortie et présente ses différentes caractéristiques.