IMTA 2011 Technical Aptitude Pres I

Martin J. Ippel &

Ryan Glaze

CogniMetrics Inc, San Antonio, TX

Identification of Technical Aptitude Based on Criterion Measures of the U.S. Navy Apprentice Technical Training Program (O.N.R. Contracts Nr. N00014-10-M-0087 & N00014-10-C-0505)

Paper presented at the 53rd Annual Conference of the International Military Testing Association, Bali (Indonesia).

October 31 – November 4, 2011

Outline:

1. Terminology: aptitude, ability and skill

2. Research Questions: (1) Can technical skill be measured independently from technical knowledge, and if so, (2) do we need different predictor variables to predict success in technical training on these aspects?

3. Did we indeed extract general aptitude variance from a collection of training performance measures? Evidence in favor:

i. Study 1: Technical Aptitude: a 2-dimensional concept

ii. Study 2: Observed Score Variance Decomposed

iii. Study 3: Translate SE model parameters into (2PN) IRT parameters

4. Summary of results

Terminology:

Aptitude: entails that someone can learn to perform a task (or a class of tasks) under reasonable time constraints.

Ability: entails that someone can perform a task (or class of tasks); example: using DC equipment.

Skill: a very specific ability; example: charging a car battery.

Spearman-Holzinger Bi-Factor Model of the ASVAB (Ippel-Watson, 2008)


Percentages of Variance Explained per ASVAB Factor, estimated under two conditions: (1) regular single-version administration (confounded condition); (2) multitrait-multimethod design, both versions administered (unconfounded condition) (Ippel, 2011).

Conclusions:

1. The Armed Forces form a highly technical and partly high-tech work environment, but have no adequate tools for selection and placement of future personnel in this highly technical work environment.

2. The existing TK tests measure mainly the ASVAB general factor (crystallized intelligence).

2. Opportunity / Challenge

Apprentice Technical Training Program

We have access to training data of the Navy’s Apprentice Technical Training (ATT) program, which provides basic electricity and electronics training to 21 Navy ratings. The ATT program is modular: it consists of 49 modules, which in different combinations prepare recruits for a particular job rating.

[Diagram: ATT modules. The modules for a particular job rating combine common modules with rating-specific modules.]

Examples of Navy ratings trained in the ATT Program:

• (Aviation) Electronics Technician

• Electrician’s Mate

• Interior Communications Technician

• Communications Technician

• Gas Turbine Systems Technician

• Sonar Technician

• Fire Control Technician

• Missile Technician


[Diagram: an ATT module with Lesson 1, Lesson 2, ..., Lesson k, followed by the ATT module tests (S-test and K-test)]

Each module consists of 8 – 10 lessons. Modules can be general to all job ratings or specific to a subgroup of job ratings.



Lots of test data available, but (as is often the case with training data) many test-score distributions have undesirable properties. For example,

3. Research Questions

Previous Study:

• Make something out of these criterion scores: Watson & Ippel (2008) developed a logistic model to replace the dichotomized K-scores and S-scores with the probability of passing the Minimum Competence Level (MCL).

Present Project:

• Can technical skill be measured independently from technical knowledge, and if so,

• Do we need different predictor variables to predict success in technical training on these aspects?


[Diagram: overview of the three studies. STUDY I: SE model with dichotomous test scores. STUDY II: CT-CM model (augmented model). STUDY III: (2PN) IRT model. The diagram marks "equivalent models" with the citation (Takane & DeLeeuw, 1989).]

STUDY I

Test of an independent cluster model for Technical Aptitude

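[Path diagram: SE model with independent cluster structure (part model). Latent factor TS with indicators S1, S2, S3 (unique terms u11, u12, u13); latent factor TK with indicators K1, K2, K3 (unique terms u21, u22, u23).]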

                                                     model 1            model 2
modules / tests                   Factor             Estimate  S.E.     Estimate  S.E.
Introduction to Electricity (S1)  Tech. Skill        0.443     0.096    0.464     0.104
Multi-meter Measurements (S2)     Tech. Skill        0.432     0.147    0.438     0.148
Basic DC Circuits (S3)            Tech. Skill        0.529     0.082    0.554     0.090
Introduction to AC (S4)           Tech. Skill        No test available  No test available
AC Test Equipment (S5)            Tech. Skill        0.678     0.083    0.737     0.091
Transformers (S6)                 Tech. Skill        0.592     0.074    0.633     0.080
Introduction to DC (S7)           Tech. Skill        0.525     0.098    0.607     0.108
Digital Logic Functions (S8)      Tech. Skill        0.44      0.134    0.456     0.143
Introduction to Electricity (K1)  Tech. Knowledge    0.774     0.122    0.922     0.135
Multi-meter Measurements (K2)     Tech. Knowledge    0.121*    0.08     0.119*    0.086
Basic DC Circuits (K3)            Tech. Knowledge    0.092*    0.096    0.087*    0.100
Introduction to AC (K4)           Tech. Knowledge    0.485     0.075    0.541     0.085
AC Test Equipment (K5)            Tech. Knowledge    0.548     0.070    0.610     0.084
Transformers (K6)                 Tech. Knowledge    0.203     0.100    0.220     0.108
Introduction to DC (K7)           Tech. Knowledge    0.392     0.079    0.417     0.088
Digital Logic Functions (K8)      Tech. Knowledge    0.417     0.083    0.460     0.091

*) p > 0.05

Table 1: Estimated Factor Loadings on Dimensions of a Technical Learning Aptitude under two different models (Navy Rating: AE, N = 500)

Table 2: Meta-analytic estimates of means of CFA factor loadings of A.T.T. post-tests over 16 samples of Navy Ratings (N = 500 for each sample)

modules / tests                   Mean λ   SD λ    sampling error   SE λ    95% CI LL   95% CI UL
Introduction to Electricity (S1)  0.312    0.142   0.04%            0.008   0.297       0.328
Multi-meter Measurements (S2)     0.198    0.127   0.06%            0.009   0.180       0.215
Basic DC Circuits (S3)            0.465    0.084   0.02%            0.006   0.453       0.477
Introduction to AC (S4)           No Test Available
AC Test Equipment (S5)            0.406    0.164   0.03%            0.007   0.393       0.419
Transformers (S6)                 0.298    0.137   0.04%            0.008   0.282       0.313
Introduction to DC (S7)           0.450    0.071   0.02%            0.006   0.438       0.462
Digital Logic Functions (S8)      0.325    0.121   0.04%            0.008   0.31        0.339
Introduction to Electricity (K1)  0.540    0.095   0.01%            0.005   0.530       0.550
Multi-meter Measurements (K2)     0.456    0.18    0.02%            0.006   0.444       0.468
Basic DC Circuits (K3)            0.572    0.143   0.01%            0.005   0.563       0.582
Introduction to AC (K4)           0.628    0.077   0.01%            0.004   0.62        0.637
AC Test Equipment (K5)            0.577    0.071   0.01%            0.005   0.567       0.586
Transformers (K6)                 0.525    0.118   0.02%            0.005   0.515       0.536
Introduction to DC (K7)           0.470    0.167   0.02%            0.006   0.458       0.482
Digital Logic Functions (K8)      0.630    0.132   0.01%            0.004   0.622       0.638
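The interval limits in Table 2 are consistent with a normal-theory interval, mean ± 1.96 × SE. The sketch below checks this on three rows copied from the table. The function name `normal_ci` and the tolerance are illustrative choices; the slides do not say how SE λ or the sampling-error percentage were computed, so only the last step (from mean and SE to LL and UL) is shown.

```python
# Rows copied from Table 2: (test, mean loading, SE, reported LL, reported UL).
ROWS = [
    ("S3", 0.465, 0.006, 0.453, 0.477),
    ("K1", 0.540, 0.005, 0.530, 0.550),
    ("K8", 0.630, 0.004, 0.622, 0.638),
]

Z_95 = 1.96  # two-sided 95% quantile of the standard normal distribution


def normal_ci(mean, se, z=Z_95):
    """Return (lower, upper) limits of a normal-theory confidence interval."""
    half_width = z * se
    return mean - half_width, mean + half_width


for test, mean, se, ll, ul in ROWS:
    lower, upper = normal_ci(mean, se)
    print(f"{test}: computed [{lower:.3f}, {upper:.3f}]  reported [{ll:.3f}, {ul:.3f}]")
```

Small discrepancies in the third decimal are to be expected for other rows, because the table reports rounded means and standard errors.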

modules / tests                   Mean τ   SD τ    sampling error   SE τ     95% CI LL   95% CI UL
Introduction to Electricity (S1)  -0.309   0.569   0.04%            0.008    -0.324      -0.294
Multi-meter Measurements (S2)     -1.397   0.338   0.01%            -0.004   -1.388      -1.406
Basic DC Circuits (S3)            -0.300   0.281   0.04%            0.008    -0.316      -0.285
Introduction to AC (S4)           No Test Available
AC Test Equipment (S5)            -0.043   0.684   0.13%            0.011    -0.064      -0.022
Transformers (S6)                 -0.810   0.493   0.00%            0.002    -0.814      -0.806
Introduction to DC (S7)           -0.689   0.42    0.01%            0.003    -0.696      -0.682
Digital Logic Functions (S8)      -1.066   0.447   0.00%            -0.001   -1.064      -1.067
Introduction to Electricity (K1)  -1.291   0.349   0.00%            -0.003   -1.284      -1.297
Multi-meter Measurements (K2)     -1.457   0.623   0.01%            -0.005   -1.447      -1.467
Basic DC Circuits (K3)            -0.752   0.391   0.00%            0.003    -0.758      -0.747
Introduction to AC (K4)           -0.743   0.473   0.00%            0.003    -0.748      -0.737
AC Test Equipment (K5)            0.006    0.34    0.17%            0.011    -0.015      0.028
Transformers (K6)                 -1.290   0.337   0.00%            -0.003   -1.283      -1.296
Introduction to DC (K7)           -1.306   0.495   0.00%            -0.003   -1.299      -1.313
Digital Logic Functions (K8)      -1.496   0.467   0.01%            -0.006   -1.485      -1.507

Table 3: Meta-analytic estimates of means of CFA difficulty parameters for each A.T.T. post-test across 16 samples of Navy Ratings (N = 500 for each sample)

STUDY II

Decomposition of Observed Score Variance (augmenting the SE model)

[Path diagram: SE model with independent cluster structure (part model). Latent factor TS with indicators S1, S2, S3 (unique terms u11, u12, u13); latent factor TK with indicators K1, K2, K3 (unique terms u21, u22, u23).]

[Path diagram: Correlated Traits – Correlated Modules model (part model). The trait factors TS and TK with the same indicators and unique terms, augmented with module factors Intro AC, Basic DC and Transformers.]

                                  Knowledge Domain    Knowledge Type
                                                      (Techn. Knowledge / Techn. Skill)
Tests                             Estimate  S.E.      Estimate  S.E.      H2       Unique
Introduction to Electricity (S1)  0.269     0.179     0.403     0.104     0.235    0.765
Introduction to Electricity (K1)  0.602     0.329     0.793     0.163     0.991    0.009
Multi-meter Measurements (S2)     -0.215    0.257     0.463     0.151     0.261    0.739
Multi-meter Measurements (K2)     0.514     0.171     0.048     0.117     0.267    0.734
Basic DC Circuits (S3)            0.303     0.172     0.494     0.094     0.336    0.664
Basic DC Circuits (K3)            0.261     0.176     0.046     0.114     0.070    0.930
Introduction AC (S4)              no test available
Introduction AC (K4)                                  0.492     0.073
AC Test Equipment (S5)            -0.070    0.202     0.686     0.081     0.475    0.525
AC Test Equipment (K5)            0.136     0.171     0.531     0.070     0.300    0.700
Transformers (S6)                 0.291     0.197     0.558     0.088     0.396    0.604
Transformers (K6)                 0.147     0.146     0.178     0.104     0.053    0.947
Introduction to DC (S7)           -0.696    0.353     0.713     0.175     0.993    0.007
Introduction to DC (K7)           0.043     0.151     0.388     0.080     0.152    0.848
Digital Logic Functions (S8)      0.139     0.287     0.424     0.128     0.199    0.801
Digital Logic Functions (K8)      -0.023    0.147     0.426     0.084     0.182    0.818

Bold Face: significant at 0.05 or lower

R(S, K) = 0.881

ΦM: 0.70 for all non-diagonal entries
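The H2 and Unique columns can be checked from the two loadings in each row. The sketch below assumes that the two factors a test loads on are uncorrelated, so that the communality is the sum of squared loadings and the uniqueness is one minus the communality. The function names are illustrative, and the rows are copied from the table above.

```python
# Rows copied from the table above: (test, loading 1, loading 2, reported H2, reported Unique).
ROWS = [
    ("S1", 0.269, 0.403, 0.235, 0.765),
    ("K1", 0.602, 0.793, 0.991, 0.009),
    ("S2", -0.215, 0.463, 0.261, 0.739),
    ("S7", -0.696, 0.713, 0.993, 0.007),
]


def communality(loadings):
    """Sum of squared loadings; valid when the factors involved are uncorrelated."""
    return sum(value * value for value in loadings)


def uniqueness(loadings):
    """Share of standardized variance not explained by the factors."""
    return 1.0 - communality(loadings)


for test, first, second, h2, unique in ROWS:
    pair = (first, second)
    print(f"{test}: H2 = {communality(pair):.3f} (reported {h2:.3f}), "
          f"unique = {uniqueness(pair):.3f} (reported {unique:.3f})")
```

If the two factors were correlated, a cross-product term (twice the product of the loadings times the factor correlation) would have to be added to the communality.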

Mean Size of Variance Components in Criterion Tests for Common ATT Modules

VARIANCE ABSORPTION

Significantly larger trait loadings in Study I, as a percentage of the total number of tests, when the T-variance component was larger than (T>M), equal to (T=M), or smaller than (T<M) the module component

[Bar chart: percentage of significant differences (y-axis, 0% to 100%) by relation between T- and M-variance components (x-axis: T<M, T=M, T>M); series: S-tests, K-tests]

STUDY III

Translating SE model parameters into (2PN) IRT parameters

Average Probability of Failure on K-tests and S-tests for trainees with a mean score on underlying latent traits at the first trial

IRT Parameters

Modules/Tests                     bj1      bj2      aj
Introduction to Electricity (S1)  0.301             -0.12
Multi-meter Measurements (S2)     0.296             -1.399
Basic DC Circuits (S3)            0.552             -0.231
Introduction to AC (S4)
AC Test Equipment (S5)            0.489             0.135
Transformers (S6)                 0.312             -1.068
Introduction to DC (S7)           0.559             -0.575
Digital Logic Functions (S8)      0.559             -1.107
Introduction to Electricity (K1)           0.588    -1.334
Multi-meter Measurements (K2)              0.584    -1.916
Basic DC Circuits (K3)                     1.068    -0.857
Introduction to AC (K4)                    1.144    -1.295
AC Test Equipment (K5)                     0.735    -0.124
Transformers (K6)                          0.699    -1.607
Introduction to DC (K7)                    0.559    -1.565
Digital Logic Functions (K8)               1.144    -2.471

Table 4: IRT parameters based on the meta-analytic factor loading and threshold estimates
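For orientation, the textbook correspondence between a one-factor model for a dichotomized variable and the two-parameter normal-ogive model can be sketched as follows. It assumes a standardized latent response y* = λθ + e with unit variance and a pass when y* exceeds the threshold τ, which gives P(pass | θ) = Φ(a(θ − b)) with a = λ / sqrt(1 − λ²) and b = τ / λ. The slides do not state the exact formulas or parameterization behind Table 4, so this is not claimed to reproduce its values; the function names and the input values are hypothetical.

```python
import math


def fa_to_irt(loading, threshold):
    """Convert a standardized loading and threshold to normal-ogive (a, b).

    Assumes a unit-variance latent response and a single factor per test.
    """
    if not 0.0 < abs(loading) < 1.0:
        raise ValueError("loading must lie strictly between 0 and 1 in absolute value")
    a = loading / math.sqrt(1.0 - loading ** 2)  # discrimination
    b = threshold / loading                      # location on the latent dimension
    return a, b


def irt_to_fa(a, b):
    """Inverse conversion: normal-ogive (a, b) back to loading and threshold."""
    loading = a / math.sqrt(1.0 + a ** 2)
    threshold = b * loading
    return loading, threshold


# Hypothetical values chosen for illustration only (not taken from the tables).
a, b = fa_to_irt(loading=0.6, threshold=-0.5)
print(f"a = {a:.3f}, b = {b:.3f}")
print("round trip:", irt_to_fa(a, b))
```

Under this parameterization, a higher loading translates into a steeper curve, and the threshold is rescaled by the loading to give the location of the test on the latent dimension.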

Is the (2PN) IRT model adequate for the A.T.T. criterion score data?

a. Validating Assumptions

• Independent cluster structure

• Postulated dimensions represent the complete latent space of the construct

b. Properties of ICCs

Probability of passing score should be monotonically increasing with person parameter

c. Model Predictions vs Observed Data

d. Other Psychometric Information

• Test Information Functions

• Distribution of Observed Scores, True Scores and Person Parameter Values




Item Characteristic Curves (ICCs) for dichotomized criterion measures K4 and K5 based on meta-analytically derived estimates (N = 10,000)




[Histogram: frequency (y-axis, 0 to 50) of standardized residuals for the TS test (x-axis, -0.7 to 0.8)]

Distribution of Standardized Residuals for the TS test based on seven dichotomized common A.T.T. criterion scores in a Stratified Sample (N = 1,000) with meta-analytically derived estimates of 2PN IRT item parameters

[Histogram: frequency (y-axis, 0 to 60) of standardized residuals for the TK test (x-axis, -0.7 to 0.8)]

Distribution of Standardized Residuals for the TK test based on eight dichotomized common A.T.T. criterion scores in a Stratified Sample (N = 1,000) with meta-analytically derived estimates of 2PN IRT item parameters




Test Information Function for Technical Skill Learning (Navy Rating AE, N = 500)

Test Information Function for Technical Concepts Learning (Navy Rating AE, N = 500)

Score distributions for the skill test consisting of seven ATT common modules criterion scores. Person parameters were transformed to the same metric as the observed and true score estimates (Data: Stratified Sample, N = 10,000)

Score distributions for the knowledge test consisting of eight ATT common modules criterion scores. Person parameters were transformed to the same metric as the observed and true score estimates (Data: Stratified Sample, N = 10,000)

Conclusions:

1. The A.T.T. criterion performance data could be modeled as a two-dimensional independent cluster structure.

2. The latent variables underlying these clusters could be shown to be orthogonal to module-specific training effects.

3. The person parameter (theta) was found to be normally distributed (most clearly for the technical skill parameter).

Why is this important?

The advantage of deriving relatively “pure” measures directly from training data is obvious:

o The logical distance between construct and behavior is extremely small and the relevance of the theoretical construct for the criterion is indisputable

o It provides a best possible criterion to evaluate existing tests (see: Glaze & Ippel, 2011)

o It provides a best possible criterion for the development of new tests

THANK YOU!

[Bar chart: explained variance (y-axis, 0.00 to 0.70) for post-tests S1 to S8 and K1 to K8 of the common ATT modules (Navy Rating AE); series: CFA, M-aptitude, M-module]

[Figure: item characteristic curve; y-axis: probability of success (0.000 to 1.000); x-axis: latent dimension, with the test location bi marked]

P(Xij = 1 | θj) = f(θj – bi)

In IRT models the probability of success is a function of the distance between the location of the test (b) and the examinee (θ) on the latent dimension.
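A minimal sketch of such a curve for the two-parameter normal-ogive case, assuming f is the standard normal distribution function applied to the distance θ − b scaled by a discrimination parameter a. The function name `icc_2pn` and the parameter values are illustrative only.

```python
import math


def icc_2pn(theta, a, b):
    """Probability of success under a two-parameter normal-ogive model.

    theta: examinee location, b: test location, a: discrimination (a > 0).
    """
    z = a * (theta - b)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical test with location b = -1.0 and discrimination a = 0.8.
A, B = 0.8, -1.0
grid = [x / 2.0 for x in range(-6, 7)]  # theta from -3.0 to 3.0 in steps of 0.5
curve = [icc_2pn(theta, A, B) for theta in grid]

for theta, p in zip(grid, curve):
    print(f"theta = {theta:+.1f}  P(success) = {p:.3f}")
```

An examinee located exactly at the test location (θ = b) has a success probability of 0.5, and for a > 0 the probability increases monotonically with θ, which is the property of the ICCs that the adequacy check refers to.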

