Assessment Centre Procedures: Reducing Cognitive Load During the Observation Phase

Nanja J. Kolk & Juliette M. Olman

Department of Work and Organizational Psychology, Vrije Universiteit, Amsterdam, the Netherlands


The AT&T observation procedure

• Observe behavioural information and take notes of these observations at the same time

• Afterwards, classify the written remarks into behavioural dimensions

• Give a quantitative rating per dimension

• Evaluate ratings with co-assessors

Observing & Note-taking

• Behaviours are stored in memory more easily when the material is coded both verbally and visually

• Note-taking forces the assessor to process information while performing the observation and recording tasks simultaneously

Dual Task Processing

• Requires more cognitive resources than assessors have available

• More observational and rating errors, e.g. failing to notice key behaviours or assigning behaviours to the wrong dimensions

• These errors may lower inter-rater reliability, accuracy and construct validity

Experienced vs. Inexperienced Assessors

• Practice with a given task reduces the cognitive resources that the task requires

• Experienced assessors have been able to practice the dual task extensively

• Inexperienced assessors lack the necessary practice to be able to perform two tasks concurrently

Study Aim

• Comparing two methods of observation:
– the traditional AT&T method
– the “observe only” method

• Comparing two types of assessors:
– inexperienced assessors
– experienced assessors

Research Question

• What is more cognitively demanding?
– Note taking during an exercise?
– Memorizing the observed behaviours?

Procedure

• 31 experienced and 90 inexperienced assessors rated 3 videotaped candidates in 2 AC exercises.

• Both groups received assessor training

• The control group was asked to take notes during the exercise

• The experimental group was not required to take notes

Simulated Assessment Center

• Dimensions: Sensitivity, Co-operation, Judgement, Tenacity

• Interview simulations:
– (1) win over the reluctance of a self-conscious and busy subordinate to give a lecture to a large audience the day after tomorrow
– (2) persuade a subordinate to put in overtime on a Friday afternoon, which the subordinate is initially unwilling to do
– The role of the subordinates was played by six professional actors.

Evaluation Criteria

• Accuracy of the ratings
– ‘intended true score’ by 3 expert raters
– elevation, differential elevation, stereotype accuracy, and differential accuracy

• Inter-rater reliability
– intraclass correlation coefficients

• Halo
– correlations between dimensions within exercises

Accuracy

• Elevation: accuracy of the average rating over all applicants and dimensions (i.e. leniency and severity)

• Differential elevation: accuracy in discriminating among applicants, averaging over dimensions (i.e. OAR)

• Stereotype accuracy: accuracy in discriminating among dimensions, averaging over applicants (i.e. ‘general’ score of a particular dimension)

• Differential accuracy: accuracy in detecting applicant differences in patterns of performance (i.e. degree to which the assessors rate the performance of the three applicants accurately on each dimension)
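
The four components above are commonly computed with Cronbach's (1955) decomposition of the distance between an assessor's ratings and the true scores. The slides do not give formulas, so the sketch below assumes that standard decomposition; the function name and the example matrices are hypothetical and are not study data. Each component is a distance, so smaller values mean more accurate ratings.

```python
from math import sqrt


def _mean(values):
    return sum(values) / len(values)


def _decompose(matrix):
    """Split an applicants x dimensions matrix into grand mean,
    applicant (row) effects, dimension (column) effects and residuals."""
    n, k = len(matrix), len(matrix[0])
    grand = _mean([v for row in matrix for v in row])
    row_eff = [_mean(row) - grand for row in matrix]
    col_eff = [_mean([matrix[i][j] for i in range(n)]) - grand for j in range(k)]
    resid = [[matrix[i][j] - grand - row_eff[i] - col_eff[j] for j in range(k)]
             for i in range(n)]
    return grand, row_eff, col_eff, resid


def accuracy_components(ratings, true_scores):
    """Assumed Cronbach-style accuracy components for one assessor.

    ratings, true_scores: lists of rows (applicants) by columns (dimensions).
    Returns a dict of distances; 0 means perfect agreement on that component.
    """
    gx, rx, cx, ex = _decompose(ratings)
    gt, rt, ct, et = _decompose(true_scores)
    resid_diffs = [(a - b) ** 2
                   for row_x, row_t in zip(ex, et)
                   for a, b in zip(row_x, row_t)]
    return {
        "elevation": abs(gx - gt),
        "differential_elevation": sqrt(_mean([(a - b) ** 2 for a, b in zip(rx, rt)])),
        "stereotype_accuracy": sqrt(_mean([(a - b) ** 2 for a, b in zip(cx, ct)])),
        "differential_accuracy": sqrt(_mean(resid_diffs)),
    }


# Hypothetical example: 2 applicants rated on 2 dimensions.
true_scores = [[1, 2], [3, 4]]
ratings = [[2, 3], [3, 4]]  # the assessor is too lenient for applicant 1 only
components = accuracy_components(ratings, true_scores)
```

In this example the assessor's overall level is off and the applicants are not separated correctly, while the ordering of dimensions and the applicant-by-dimension pattern are reproduced exactly, so only elevation and differential elevation are non-zero.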

Results Halo

• Fisher’s Z tests revealed no significant differences in mean inter-correlations between dimensions within exercises between observation methods and type of assessor.
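
A comparison of two correlations with Fisher's Z can be sketched as follows. This is a minimal illustration of the standard test for two independent correlations; the slides do not describe exactly how the mean inter-correlations were tested, and the function name, correlations and sample sizes below are hypothetical.

```python
from math import atanh, erfc, sqrt


def fisher_z_test(r1, n1, r2, n2):
    """Two-sided test of H0: rho1 == rho2 for two independent samples.

    r1, r2: observed correlations; n1, n2: sample sizes (each must exceed 3).
    Returns the z statistic and the two-sided p-value.
    """
    z1, z2 = atanh(r1), atanh(r2)              # Fisher r-to-z transform
    se = sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))  # standard error of z1 - z2
    z = (z1 - z2) / se
    p = erfc(abs(z) / sqrt(2.0))                # two-sided normal p-value
    return z, p


# Hypothetical halo correlations in two groups of 60 assessors each.
z, p = fisher_z_test(0.6, 60, 0.3, 60)
```

A non-significant result, as reported on this slide, would correspond to a p-value above the chosen alpha level.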

Discussion Halo

• The design diverges from the regular applicant x dimension designs used to study halo: it examines the ratings of multiple assessors for only a few applicants.

• The assessor x dimension matrix reveals information concerning assessor properties, rather than properties of the applicants.

Results Reliability

• The ‘observe only’ method yields somewhat higher inter-rater reliability (.93) than the traditional method (.85), yet this difference is not significant.

• None of the ICCs differed significantly between observation methods or types of assessors.

• The limited number of applicants (n = 3) may have prevented the differences from reaching significance.
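
An intraclass correlation can be computed from a targets x raters table as sketched below. The slides do not say which ICC form was used, so this example assumes the two-way random-effects forms ICC(2,1) and ICC(2,k) of Shrout and Fleiss (1979); the function name is hypothetical, and the data are the textbook example from that paper, not study data.

```python
def icc_two_way_random(table):
    """Return (ICC(2,1), ICC(2,k)) for a table of n targets (rows)
    rated by the same k raters (columns). Assumed form, see lead-in."""
    n, k = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (n * k)
    row_means = [sum(row) / k for row in table]
    col_means = [sum(table[i][j] for i in range(n)) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between targets
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((v - grand) ** 2 for row in table for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    single = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n)
    average = (ms_rows - ms_error) / (ms_rows + (ms_cols - ms_error) / n)
    return single, average


# Shrout & Fleiss (1979) example: 6 targets rated by 4 judges.
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc_single, icc_average = icc_two_way_random(example)
```

With only a few targets, as in this study (n = 3 applicants), ICC estimates are imprecise, which is consistent with the caution on this slide.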

Results Accuracy MANOVA

• Elevation:
– Significant difference assessor type & method
– Significant interaction assessor type X method

• Differential elevation:
– No significant difference assessor type & method
– Significant interaction assessor type X method

Results Accuracy MANOVA (cont.)

• Stereotype accuracy:
– Significant difference assessor type & method
– Significant interaction assessor type X method

• Differential accuracy:
– Significant difference assessor type
– No significant difference method
– No significant interaction assessor type X method

Results Elevation

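• Significant difference assessor type: F(1,116) = 22.15, p < .05
– inexperienced assessors were more accurate (.35) than experienced assessors (.64); lower values indicate higher accuracy

• Significant difference method: F(1,116) = 4.90, p < .05
– the traditional method yielded more accurate ratings (.43) than the ‘observe only’ method (.56)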

Results Elevation (cont.)

• Significant interaction assessor type X method: F(1,116) = 14.65, p < .05
– inexperienced assessors were more accurate (.30) than experienced assessors (.82) with the ‘observe only’ method
– inexperienced assessors were more accurate with the ‘observe only’ method (.30) than with the traditional method (.40)
– experienced assessors were more accurate with the traditional method (.46) than with the ‘observe only’ method (.82)

Results Differential Elevation

• No significant difference assessor type

• No significant difference observation method

• Significant interaction assessor type X method: F(1,116) = 4.29, p < .05
– inexperienced assessors were more accurate (.69) than experienced assessors (.90) with the ‘observe only’ method
– inexperienced assessors were more accurate with the ‘observe only’ method (.69) than with the traditional method (.74)
– experienced assessors were more accurate with the traditional method (.72) than with the ‘observe only’ method (.90)

Results Stereotype Accuracy

• Significant difference assessor type: F(1,116) = 12.61, p < .05
– inexperienced assessors were more accurate (.42) than experienced assessors (.61)

• Significant difference method: F(1,116) = 6.68, p < .05
– the traditional method yielded more accurate ratings (.45) than the ‘observe only’ method (.58)

Results Stereotype Accuracy (cont.)

• Significant interaction assessor type X method: F(1,116) = 15.05, p < .05
– inexperienced assessors were more accurate (.39) than experienced assessors (.77) with the ‘observe only’ method
– inexperienced assessors were more accurate with the ‘observe only’ method (.39) than with the traditional method (.46)
– experienced assessors were more accurate with the traditional method (.44) than with the ‘observe only’ method (.77)

Results Differential Accuracy

• Significant difference between inexperienced and experienced assessors: F(1,116) = 16.10, p < .05
– experienced assessors were more accurate (.69) than inexperienced assessors (.95)

• No significant difference between observation methods

• No significant interaction effects.

Summary Accuracy Results

• Higher elevation, differential elevation and stereotype accuracy for inexperienced assessors when they observe an AC exercise without taking notes

• Higher elevation, differential elevation and stereotype accuracy for experienced assessors observing exercises through the traditional observation method

• Higher differential accuracy for experienced assessors than inexperienced assessors, regardless of observation method.

Discussion Accuracy

• Because the AC’s main interest is an applicant’s performance per dimension, differential accuracy is the most informative and theoretically relevant outcome (e.g. Lievens, in press; Schleicher & Day, 1998)

• Differential accuracy is relevant only when different decisions are being made depending on the pattern of performance (Murphy & Cleveland, 1995, p. 287).

Discussion Accuracy (cont.)

• Applicants must score high on all dimensions to be selected, because all dimensions are considered relevant.

• Thus, there is no diversified pattern of AC performance in reference to the selection decision.

• Differential elevation and differential accuracy are both relevant to this area of research, whereas elevation and stereotype accuracy might be of minor importance.

Study Limitations

• Small sample of candidates

• Small sample of experienced assessors

• No a priori performance script

• No a priori performance profile

Study Implications

• Writing a behavioral report during an AC exercise may be too cognitively demanding for assessors with little rating experience

• A behavioral report does not seem to have added value (Hennessey et al., 1998)

• Inexperienced assessors may be better off focusing all their attention on the role-play

• Experienced assessors should be given the opportunity to write down observations