Martin J. Ippel & Ryan Glaze, CogniMetrics Inc, San Antonio, TX. Identification of Technical Aptitude Based on Criterion Measures of the U.S. Navy Apprentice Technical Training Program (O.N.R. Contracts Nr. N00014-10-M-0087 & N00014-10-C-0505). Paper presented at the 53rd Annual Conference of the International Military Testing Association, Bali (Indonesia), October 31 – November 4, 2011.

IMTA 2011 Technical Aptitude Pres I

Page 1: IMTA 2011 Technical Aptitude Pres I

Martin J. Ippel &

Ryan Glaze

CogniMetrics Inc, San Antonio, TX

Identification of Technical Aptitude Based on Criterion Measures of the

U.S. Navy Apprentice Technical Training Program (O.N.R. Contracts Nr. N00014-10-M-0087 & N00014-10-C-0505)

Paper presented at the 53rd Annual Conference of the International Military Testing Association, Bali (Indonesia).

October 31 – November 4, 2011

Page 2: IMTA 2011 Technical Aptitude Pres I

Outline:

1. Terminology: aptitude, ability and skill

2. Research Questions: (1) Can technical skill be measured independently from technical knowledge, and if so, (2) do we need different predictor variables to predict success in technical training on these aspects?

3. Did we indeed extract general aptitude variance from a collection of training performance measures? Evidence in favor:

i. Study 1: Technical Aptitude: a 2-dimensional concept

ii. Study 2: Observed Score Variance Decomposed

iii. Study 3: Translate SE model parameters into (2PN) IRT parameters

4. Summary of results

Page 3: IMTA 2011 Technical Aptitude Pres I

Terminology:

Aptitude: entails that someone can learn to perform a task (or a class of tasks) under reasonable time constraints

Ability: entails that someone can perform a task (or class of tasks); example: using DC equipment

Skill: a very specific ability; example: charging a car battery

Page 4: IMTA 2011 Technical Aptitude Pres I

Spearman-Holzinger Bi-Factor Model of the ASVAB (Ippel-Watson, 2008)

Page 5: IMTA 2011 Technical Aptitude Pres I

Spearman-Holzinger Bi-Factor Model of the ASVAB (Ippel-Watson, 2008)

Page 6: IMTA 2011 Technical Aptitude Pres I

Percentages of Variance Explained per ASVAB Factor, estimated under two conditions: (1) regular single-version administration (confounded condition); (2) multitrait-multimethod design, both versions administered (unconfounded condition) (Ippel, 2011).

Page 7: IMTA 2011 Technical Aptitude Pres I

Conclusions:

1. The Armed Forces form a highly technical and partly high-tech work environment, but have no adequate tools for selection and placement of future personnel in this highly technical work environment.

2. The existing TK tests measure mainly the ASVAB general factor (crystallized intelligence).

Page 8: IMTA 2011 Technical Aptitude Pres I

2. Opportunity / Challenge

Apprentice Technical Training Program

We have access to training data of the Navy’s Apprentice Technical Training (ATT) program, which provides basic electricity and electronics training to 21 Navy ratings. The ATT program is modular: it consists of 49 modules, which in different combinations prepare recruits for a particular job rating.

Page 9: IMTA 2011 Technical Aptitude Pres I

2. Opportunity / Challenge

[Diagram: ATT modules for a particular job rating, made up of common modules and rating-specific modules]

Examples of Navy ratings trained in the ATT Program:

• (Aviation) Electronics Technician
• Electrician’s Mate
• Interior Communications Technician
• Communications Technician
• Gas Turbine Systems Technician
• Sonar Technician
• Fire Control Technician
• Missile Technician

Page 10: IMTA 2011 Technical Aptitude Pres I

2. Opportunity / Challenge

[Diagram: a module runs from Lesson 1 and Lesson 2 through Lesson k, followed by the ATT module tests: an S-test and a K-test]

Each module consists of 8 – 10 lessons.

Modules can be general to all job ratings or specific for a subgroup of job ratings.

Page 11: IMTA 2011 Technical Aptitude Pres I

2. Opportunity / Challenge

Lots of test data available, but (as is often the case with training data) many test-score distributions have undesirable properties. For example,

Page 12: IMTA 2011 Technical Aptitude Pres I

3. Research Questions

Previous Study:

• Make something out of these criterion scores: Watson & Ippel (2008) developed a logistic model to replace the dichotomized K-scores and S-scores with the probability of passing the Minimum Competence Level (MCL).

Present Project:

• Can technical skill be measured independently from technical knowledge, and if so,
• Do we need different predictor variables to predict success in technical training on these aspects?

Page 13: IMTA 2011 Technical Aptitude Pres I

3. Research Questions

[Diagram: STUDY I fits an SE model with dichotomous test scores; STUDY II augments it into the CT-CM model; STUDY III uses the (2PN) IRT model. The SE model and the (2PN) IRT model are linked as equivalent models (Takane & DeLeeuw, 1989)]

Page 14: IMTA 2011 Technical Aptitude Pres I

STUDY I

Test of an independent cluster model for Technical Aptitude

Page 15: IMTA 2011 Technical Aptitude Pres I

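SE model with independent cluster structure (part model)

[Path diagram: latent factor TS with indicators S1, S2, S3 and unique terms u11, u12, u13; latent factor TK with indicators K1, K2, K3 and unique terms u21, u22, u23]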

Page 16: IMTA 2011 Technical Aptitude Pres I

                                    Model 1             Model 2
Modules / tests                     Estimate  S.E.      Estimate  S.E.

Loadings on Tech. Skill
Introduction to Electricity (S1)    0.443     0.096     0.464     0.104
Multi-meter Measurements (S2)       0.432     0.147     0.438     0.148
Basic DC Circuits (S3)              0.529     0.082     0.554     0.090
Introduction to AC (S4)             No test available   No test available
AC Test Equipment (S5)              0.678     0.083     0.737     0.091
Transformers (S6)                   0.592     0.074     0.633     0.080
Introduction to DC (S7)             0.525     0.098     0.607     0.108
Digital Logic Functions (S8)        0.44      0.134     0.456     0.143

Loadings on Tech. Knowledge
Introduction to Electricity (K1)    0.774     0.122     0.922     0.135
Multi-meter Measurements (K2)       0.121*    0.08      0.119*    0.086
Basic DC Circuits (K3)              0.092*    0.096     0.087*    0.100
Introduction to AC (K4)             0.485     0.075     0.541     0.085
AC Test Equipment (K5)              0.548     0.070     0.610     0.084
Transformers (K6)                   0.203     0.100     0.220     0.108
Introduction to DC (K7)             0.392     0.079     0.417     0.088
Digital Logic Functions (K8)        0.417     0.083     0.460     0.091

*) p > 0.05

Table 1: Estimated Factor Loadings on Dimensions of a Technical Learning Aptitude under two different models (Navy Rating: AE, N = 500)
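
The asterisks in Table 1 can be checked from the estimates and standard errors alone. The sketch below assumes the flag comes from a two-sided Wald z-test (estimate divided by S.E., normal approximation); the slides do not say which test was used, and the function name `wald_p_value` is only illustrative.

```python
import math


def wald_p_value(estimate: float, std_error: float) -> float:
    """Two-sided p-value for H0: loading = 0 under a normal approximation."""
    z = estimate / std_error
    return math.erfc(abs(z) / math.sqrt(2.0))


# (estimate, S.E.) pairs for model 1, copied from Table 1
model1_loadings = {
    "K2": (0.121, 0.08),
    "K3": (0.092, 0.096),
    "K6": (0.203, 0.100),
}

for test, (estimate, std_error) in model1_loadings.items():
    p = wald_p_value(estimate, std_error)
    marker = "*" if p > 0.05 else ""  # same flag convention as the table note
    print(f"{test}: z = {estimate / std_error:.2f}, p = {p:.3f} {marker}")
```

Under this assumption K2 and K3 come out non-significant and K6 significant, which agrees with the flags in the table.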

Page 17: IMTA 2011 Technical Aptitude Pres I

Table 2: Meta-analytic estimates of means of CFA factor loadings of A.T.T. post-tests over 16 samples of Navy Ratings (N = 500 for each sample)

                                                      Sampling            95% CI
Modules / tests                     Mean λ   SD λ     error     SE λ     LL       UL

Introduction to Electricity (S1)    0.312    0.142    0.04%     0.008    0.297    0.328
Multi-meter Measurements (S2)       0.198    0.127    0.06%     0.009    0.180    0.215
Basic DC Circuits (S3)              0.465    0.084    0.02%     0.006    0.453    0.477
Introduction to AC (S4)             No Test Available
AC Test Equipment (S5)              0.406    0.164    0.03%     0.007    0.393    0.419
Transformers (S6)                   0.298    0.137    0.04%     0.008    0.282    0.313
Introduction to DC (S7)             0.450    0.071    0.02%     0.006    0.438    0.462
Digital Logic Functions (S8)        0.325    0.121    0.04%     0.008    0.31     0.339
Introduction to Electricity (K1)    0.540    0.095    0.01%     0.005    0.530    0.550
Multi-meter Measurements (K2)       0.456    0.18     0.02%     0.006    0.444    0.468
Basic DC Circuits (K3)              0.572    0.143    0.01%     0.005    0.563    0.582
Introduction to AC (K4)             0.628    0.077    0.01%     0.004    0.62     0.637
AC Test Equipment (K5)              0.577    0.071    0.01%     0.005    0.567    0.586
Transformers (K6)                   0.525    0.118    0.02%     0.005    0.515    0.536
Introduction to DC (K7)             0.470    0.167    0.02%     0.006    0.458    0.482
Digital Logic Functions (K8)        0.630    0.132    0.01%     0.004    0.622    0.638
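
The interval limits in Table 2 are consistent with a normal-theory interval, mean ± 1.96 × SE. The sketch below assumes that construction (the slides do not state it) and compares it with three reported rows; small differences are expected because the table shows SE rounded to three decimals.

```python
Z_95 = 1.96  # two-sided 95% quantile of the standard normal distribution


def ci95(mean: float, se: float) -> tuple[float, float]:
    """Normal-theory 95% confidence interval around a mean loading."""
    return mean - Z_95 * se, mean + Z_95 * se


# test: (mean lambda, SE, reported LL, reported UL), copied from Table 2
rows = {
    "S1": (0.312, 0.008, 0.297, 0.328),
    "S3": (0.465, 0.006, 0.453, 0.477),
    "K1": (0.540, 0.005, 0.530, 0.550),
}

for test, (mean, se, reported_ll, reported_ul) in rows.items():
    ll, ul = ci95(mean, se)
    print(f"{test}: computed [{ll:.3f}, {ul:.3f}], "
          f"reported [{reported_ll:.3f}, {reported_ul:.3f}]")
```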

Page 18: IMTA 2011 Technical Aptitude Pres I

                                                      Sampling            95% CI
Modules / tests                     Mean τ   SD τ     error     SE τ     LL       UL

Introduction to Electricity (S1)    -0.309   0.569    0.04%     0.008    -0.324   -0.294
Multi-meter Measurements (S2)       -1.397   0.338    0.01%     -0.004   -1.388   -1.406
Basic DC Circuits (S3)              -0.300   0.281    0.04%     0.008    -0.316   -0.285
Introduction to AC (S4)             No Test Available
AC Test Equipment (S5)              -0.043   0.684    0.13%     0.011    -0.064   -0.022
Transformers (S6)                   -0.810   0.493    0.00%     0.002    -0.814   -0.806
Introduction to DC (S7)             -0.689   0.42     0.01%     0.003    -0.696   -0.682
Digital Logic Functions (S8)        -1.066   0.447    0.00%     -0.001   -1.064   -1.067
Introduction to Electricity (K1)    -1.291   0.349    0.00%     -0.003   -1.284   -1.297
Multi-meter Measurements (K2)       -1.457   0.623    0.01%     -0.005   -1.447   -1.467
Basic DC Circuits (K3)              -0.752   0.391    0.00%     0.003    -0.758   -0.747
Introduction to AC (K4)             -0.743   0.473    0.00%     0.003    -0.748   -0.737
AC Test Equipment (K5)              0.006    0.34     0.17%     0.011    -0.015   0.028
Transformers (K6)                   -1.290   0.337    0.00%     -0.003   -1.283   -1.296
Introduction to DC (K7)             -1.306   0.495    0.00%     -0.003   -1.299   -1.313
Digital Logic Functions (K8)        -1.496   0.467    0.01%     -0.006   -1.485   -1.507

Table 3: Meta-analytic estimates of means of CFA difficulty parameters for each A.T.T. post-test across 16 samples of Navy Ratings (N = 500 for each sample)

Page 19: IMTA 2011 Technical Aptitude Pres I

STUDY II

Decomposition of Observed Score Variance (augmenting the SE model)

Page 20: IMTA 2011 Technical Aptitude Pres I

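SE model with independent cluster structure (part model)

[Path diagram: latent factor TS with indicators S1, S2, S3 and unique terms u11, u12, u13; latent factor TK with indicators K1, K2, K3 and unique terms u21, u22, u23]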

Page 21: IMTA 2011 Technical Aptitude Pres I

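Correlated Traits – Correlated Modules model (part model)

[Path diagram: trait factors TS (indicators S1, S2, S3; unique terms u11, u12, u13) and TK (indicators K1, K2, K3; unique terms u21, u22, u23), with added module factors Intro AC, Basic DC and Transformers]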

Page 22: IMTA 2011 Technical Aptitude Pres I

Correlated Traits – Correlated Modules model (part model)

[Path diagram, repeated: trait factors TS and TK with indicators S1 to S3 and K1 to K3, plus module factors Intro AC, Basic DC and Transformers]

Page 23: IMTA 2011 Technical Aptitude Pres I

                                    Knowledge Domain    Knowledge Type
Tests                               Estimate  S.E.      Estimate  S.E.      H2       Unique

Introduction to Electricity (S1)    0.269     0.179     0.403     0.104     0.235    0.765
Introduction to Electricity (K1)    0.602     0.329     0.793     0.163     0.991    0.009
Multi-meter Measurements (S2)       -0.215    0.257     0.463     0.151     0.261    0.739
Multi-meter Measurements (K2)       0.514     0.171     0.048     0.117     0.267    0.734
Basic DC Circuits (S3)              0.303     0.172     0.494     0.094     0.336    0.664
Basic DC Circuits (K3)              0.261     0.176     0.046     0.114     0.070    0.930
Introduction AC (S4)                no test available
Introduction AC (K4)                                    0.492     0.073
AC Test Equipment (S5)              -0.070    0.202     0.686     0.081     0.475    0.525
AC Test Equipment (K5)              0.136     0.171     0.531     0.070     0.300    0.700
Transformers (S6)                   0.291     0.197     0.558     0.088     0.396    0.604
Transformers (K6)                   0.147     0.146     0.178     0.104     0.053    0.947
Introduction to DC (S7)             -0.696    0.353     0.713     0.175     0.993    0.007
Introduction to DC (K7)             0.043     0.151     0.388     0.080     0.152    0.848
Digital Logic Functions (S8)        0.139     0.287     0.424     0.128     0.199    0.801
Digital Logic Functions (K8)        -0.023    0.147     0.426     0.084     0.182    0.818

Knowledge Type: Techn. Skill for S-tests, Techn. Knowledge for K-tests

Bold Face: significant at 0.05 or lower

R(S, K) = 0.881

ΦM: 0.70 for all non-diagonal entries
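
The H2 and Unique columns can be reproduced from the two loadings in each row. The sketch below assumes that the trait (Knowledge Type) factors are orthogonal to the module (Knowledge Domain) factors, so that the communality is the sum of the two squared loadings and the unique part is its complement; the helper name `decompose` is hypothetical.

```python
def decompose(module_loading: float, trait_loading: float) -> tuple[float, float]:
    """Return (communality h2, unique variance) for one standardized test score.

    Assumes the module factor and the trait factor are uncorrelated.
    """
    h2 = module_loading ** 2 + trait_loading ** 2
    return h2, 1.0 - h2


# test: (module loading, trait loading, reported H2, reported Unique)
rows = {
    "S1": (0.269, 0.403, 0.235, 0.765),
    "K1": (0.602, 0.793, 0.991, 0.009),
    "S7": (-0.696, 0.713, 0.993, 0.007),
}

for test, (module_loading, trait_loading, reported_h2, reported_unique) in rows.items():
    h2, unique = decompose(module_loading, trait_loading)
    print(f"{test}: h2 = {h2:.3f} (reported {reported_h2}), "
          f"unique = {unique:.3f} (reported {reported_unique})")
```

For the three rows shown, the computed values agree with the table to three decimals, which supports reading the two loading columns as separate variance components of each test.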

Page 24: IMTA 2011 Technical Aptitude Pres I

Mean Size of Variance Components in Criterion Tests for Common ATT Modules

Page 25: IMTA 2011 Technical Aptitude Pres I

VARIANCE ABSORPTION

Significantly larger trait loadings in Study I as a percentage of the total number of tests when the T-variance component was larger (T>M), equal (T=M), or smaller (T<M) than the module component

[Bar chart: x-axis: relation between T- and M-variance components (T<M, T=M, T>M); y-axis: percentage of significant differences (0% to 100%); series: S-tests, K-tests]

Page 26: IMTA 2011 Technical Aptitude Pres I

STUDY III

Translating SE model parameters into (2PN) IRT parameters
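
A common way to carry out such a translation is the textbook mapping between a factor model for a dichotomized, standardized latent response and the two-parameter normal ogive model. The sketch below is that textbook mapping under stated assumptions (standardized loading λ, threshold τ, residual variance 1 − λ²); it is not claimed to be the exact procedure or scaling behind Table 4, and the input values are purely illustrative.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def sem_to_2pn(loading: float, threshold: float) -> tuple[float, float]:
    """Map (loading, threshold) to normal-ogive (discrimination a, location b)."""
    if not 0.0 < abs(loading) < 1.0:
        raise ValueError("loading must be non-zero and smaller than 1 in absolute value")
    a = loading / math.sqrt(1.0 - loading ** 2)
    b = threshold / loading
    return a, b


def p_pass_factor_form(theta: float, loading: float, threshold: float) -> float:
    """P(pass | theta) written with the factor-model parameters."""
    return normal_cdf((loading * theta - threshold) / math.sqrt(1.0 - loading ** 2))


def p_pass_irt_form(theta: float, a: float, b: float) -> float:
    """P(pass | theta) written with the 2PN IRT parameters."""
    return normal_cdf(a * (theta - b))


# Illustrative values only, not taken from the tables
loading, threshold = 0.6, -0.3
a, b = sem_to_2pn(loading, threshold)
print(f"a = {a:.3f}, b = {b:.3f}")
for theta in (-2.0, 0.0, 1.5):
    print(theta,
          round(p_pass_factor_form(theta, loading, threshold), 4),
          round(p_pass_irt_form(theta, a, b), 4))
```

Both forms give the same passing probability at every θ, which is the sense in which the two models are equivalent.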

Page 27: IMTA 2011 Technical Aptitude Pres I

Average Probability of Failure on K-tests and S-tests for trainees with a mean score on underlying latent traits at the first trial

Page 28: IMTA 2011 Technical Aptitude Pres I

IRT Parameters

Modules/Tests bj1 bj2 aj
Introduction to Electricity (S1) 0.301 -0.12
Multi-meter Measurements (S2) 0.296 -1.399
Basic DC Circuits (S3) 0.552 -0.231
Introduction to AC (S4)
AC Test Equipment (S5) 0.489 0.135
Transformers (S6) 0.312 -1.068
Introduction to DC (S7) 0.559 -0.575
Digital Logic Functions (S8) 0.559 -1.107
Introduction to Electricity (K1) 0.588 -1.334
Multi-meter Measurements (K2) 0.584 -1.916
Basic DC Circuits (K3) 1.068 -0.857
Introduction to AC (K4) 1.144 -1.295
AC Test Equipment (K5) 0.735 -0.124
Transformers (K6) 0.699 -1.607
Introduction to DC (K7) 0.559 -1.565
Digital Logic Functions (K8) 1.144 -2.471

Table 4: IRT parameters based on the meta-analytic factor loading and threshold estimates

Page 29: IMTA 2011 Technical Aptitude Pres I

Is the (2PN) IRT model adequate for the A.T.T. criterion score data?

a. Validating Assumptions
• Independent cluster structure
• Postulated dimensions represent the complete latent space of the construct

b. Properties of ICCs
• Probability of a passing score should be monotonically increasing with the person parameter

c. Model Predictions vs Observed Data

d. Other Psychometric Information
• Test Information Functions
• Distribution of Observed Scores, True Scores and Person Parameter Values

Page 32: IMTA 2011 Technical Aptitude Pres I

Item Characteristic Curves (ICCs) for dichotomized criterion measures K4 and K5 based on meta-analytically derived estimates (N = 10,000)

Page 33: IMTA 2011 Technical Aptitude Pres I
Page 35: IMTA 2011 Technical Aptitude Pres I

[Histogram: x-axis: standardized residuals for the TS test (-0.7 to 0.8); y-axis: frequency (0 to 50)]

Distribution of Standardized Residuals for the TS test based on seven dichotomized common A.T.T. criterion scores in a Stratified Sample (N = 1,000) with meta-analytically derived estimates of 2PN IRT item parameters

Page 36: IMTA 2011 Technical Aptitude Pres I

[Histogram: x-axis: standardized residuals for the TK test (-0.7 to 0.8); y-axis: frequency (0 to 60)]

Distribution of Standardized Residuals for the TK test based on eight dichotomized common A.T.T. criterion scores in a Stratified Sample (N = 1,000) with meta-analytically derived estimates of 2PN IRT item parameters

Page 38: IMTA 2011 Technical Aptitude Pres I

Test Information Function for Technical Skill Learning (Navy rating AE, N = 500)

Test Information Function for Technical Concepts Learning (Navy rating AE, N = 500)
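
For readers who want to see how a test information function is built, the sketch below uses the standard item information of the two-parameter normal ogive model, I(θ) = [a·φ(z)]² / [Φ(z)·(1 − Φ(z))] with z = a(θ − b), summed over items. The (a, b) pairs are hypothetical and are not the parameters behind the curves on this slide.

```python
import math


def normal_pdf(x: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def item_information(theta: float, a: float, b: float) -> float:
    """Fisher information of one dichotomous 2PN item at ability theta."""
    z = a * (theta - b)
    p = normal_cdf(z)    # probability of passing
    q = normal_cdf(-z)   # probability of failing, computed from the other tail
    return (a * normal_pdf(z)) ** 2 / (p * q)


def total_information(theta: float, items: list[tuple[float, float]]) -> float:
    """Test information: the sum of the item information functions."""
    return sum(item_information(theta, a, b) for a, b in items)


# Hypothetical (discrimination a, location b) pairs, for illustration only
items = [(0.5, -1.0), (0.8, -0.5), (1.1, 0.0)]

for theta in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"theta = {theta:+.1f}: information = {total_information(theta, items):.3f}")
```

Each item is most informative at θ = b, where its information equals 2a²/π, so a test made of easy items measures most precisely at the lower end of the latent dimension.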

Page 39: IMTA 2011 Technical Aptitude Pres I

Score distributions for the skill test consisting of seven ATT common modules criterion scores. Person parameters were transformed to the same metric as the observed and true score estimates (Data: Stratified Sample, N = 10,000)

Page 40: IMTA 2011 Technical Aptitude Pres I

Score distributions for the knowledge test consisting of eight ATT common modules criterion scores. Person parameters were transformed to the same metric as the observed and true score estimates (Data: Stratified Sample, N = 10,000)

Page 41: IMTA 2011 Technical Aptitude Pres I

Conclusions:

1. The A.T.T. criterion performance data could be modeled as a two-dimensional independent cluster structure.

2. The latent variables underlying these clusters could be shown to be orthogonal to module-specific training effects.

3. The person parameter (theta) was found to be normally distributed (most clearly for the technical skill parameter).

Page 42: IMTA 2011 Technical Aptitude Pres I

Why is this important?

The advantage of deriving relatively “pure” measures directly from training data is obvious:

o The logical distance between construct and behavior is extremely small and the relevance of the theoretical construct for the criterion is indisputable

o It provides a best possible criterion to evaluate existing tests (see: Glaze & Ippel, 2011)

o It provides a best possible criterion for the development of new tests

Page 43: IMTA 2011 Technical Aptitude Pres I

THANK YOU!

Page 44: IMTA 2011 Technical Aptitude Pres I

[Bar chart: explained variance (0.00 to 0.70) for the post-tests of the common ATT modules S1 to S8 and K1 to K8 (Navy Rating AE); series: CFA, M-aptitude, M-module]

Page 45: IMTA 2011 Technical Aptitude Pres I

3. Research Questions

[Figure: item characteristic curve; x-axis: latent dimension, with the test location bi marked; y-axis: probability of success (0.000 to 1.000)]

P(Xij = 1 | θj) = f(θj – bi)

In IRT models the probability of success is a function of the distance between the location of the test (b) and the examinee (θ) on the latent dimension.
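
The relation on this slide can be made concrete by choosing a link function f. The sketch below assumes f is the standard normal CDF, in line with the (2PN) normal ogive model used in Study III, and adds an optional discrimination parameter `a` that the slide's formula does not show; the location value is hypothetical.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def icc(theta: float, b: float, a: float = 1.0) -> float:
    """Probability of success for an examinee at theta on a test located at b."""
    return normal_cdf(a * (theta - b))


b = -0.5  # hypothetical test location on the latent dimension
thetas = [x / 4.0 for x in range(-12, 13)]  # grid from -3.0 to 3.0
probs = [icc(theta, b) for theta in thetas]

# The curve should rise monotonically with the person parameter
is_monotone = all(p1 < p2 for p1, p2 in zip(probs, probs[1:]))
print("monotonically increasing:", is_monotone)
print("P(success) when theta equals b:", icc(b, b))
```

An examinee located exactly at the test location has a success probability of one half; the further θ lies above b, the closer the probability gets to 1.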