Advanced Adverse Impact Analysis What’s New in Adverse Impact: Practical Significance, Statistical Significance, and Statistical Techniques April 12, 2011

Advanced Adverse Impact Analysis What’s New in Adverse ...c.ymcdn.com/sites/ · Practical Significance, Statistical Significance, and Statistical Techniques ... • Compensation


Advanced Adverse Impact Analysis

What’s New in Adverse Impact: Practical Significance, Statistical Significance, and Statistical Techniques

April 12, 2011


Visit BCGi Online

• If you enjoy this webinar,

– Don’t forget to check out our other training opportunities through the BCGi website.

• BCGi Standard Membership (free)

– Online community

– Monthly webinars on EEO compliance topics

– EEO Insight Journal (e-copy)

• BCGi Platinum (paid) Membership ($199/year)

– Fully interactive online community

– Includes validation/compensation analysis books

– EEO Tools including those needed to conduct AI analyses

– EEO Insight Journal (e-copy and hardcopy)


HRCI Credit

• BCGi is an HRCI Preferred Provider

• CE Credits are available for attending this webinar

• Only those who remain with us for at least 80% of the webinar will be eligible to receive the HRCI training completion form for CE submission


About Our Sponsor: BCG

• Assisted hundreds of clients with cases involving Equal Employment Opportunity (EEO) / Affirmative Action (AA) (both plaintiff and defense)

• Compensation Analyses / Test Development and Validation

• Published: Adverse Impact and Test Validation, 2nd Ed., as a practical guide for HR professionals (3rd edition out in weeks!)

• Editor & Publisher: EEO Insight, an industry e-Journal

• Creator and publisher of a variety of productivity Software/Web Tools:

– OPAC® (Administrative Skills Testing)

– CritiCall® (9-1-1 Dispatcher Testing)

– AutoAAP™ (Affirmative Action Software and Services)

– C4™ (Contact Center Employee Testing)

– Encounter™ (Video Situational Judgment Test)

– AutoGOJA® (Automated Guidelines Oriented Job Analysis®)

– COMPare: Compensation Analysis in Excel


Contact Information

Dan A. Biddle, Ph.D., CEO

(800) 999-0438 x 113

[email protected]


Copyright

Adverse Impact Overview

How Can Testing Practices be Challenged?

Title VII Disparate Impact Discrimination Flowchart

[Flowchart: TEST. Adverse Impact? If NO: END. If YES: Is the PPT Valid? If NO: Plaintiff Prevails. If YES: Alternative Employment Practice? If NO: Defendant Prevails. If YES: Plaintiff Prevails.]


Adverse Impact: Theories and Methods Abound… What Really Works?

• Statistical Significance:

– 5%

– 0.05

– 1 chance in 20

– 2.0 Standard Deviations (actually 1.96)
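The four figures on this slide describe the same threshold. A minimal sketch, assuming a two-tailed test on a standard normal distribution (the variable names are illustrative, not from the deck):

```python
from statistics import NormalDist

alpha = 0.05  # two-tailed significance level from the slide

# Critical value in standard deviation units: half of alpha sits in each tail
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# The same level expressed as "1 chance in N"
chances = 1 / alpha

print(f"alpha = {alpha:.0%}, 1 chance in {chances:.0f}, critical value = {z_crit:.2f} SDs")
```

This is why the slide says "2.0 Standard Deviations (actually 1.96)": the rounded figure is a convenience, and the exact two-tailed cutoff is slightly lower.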


Two Types of Adverse Impact

SELECTION RATE COMPARISON

• 2 X 2 Table Comparison

• Evaluates hires, promotions, terminations

• “Hypergeometric”

• Inputs: Men Pass, Women Pass, Men Fail, Women Fail

AVAILABILITY COMPARISON

• Utilization Analysis

• Single Group Test

• “Binomial”

• Inputs: Availability %, # Women, # Total
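The availability comparison can be illustrated with a one-tailed exact binomial calculation. This is only a sketch: the 30% availability figure and the workforce counts are hypothetical values chosen for illustration, and the function name is not from the deck.

```python
from math import comb

def binomial_lower_tail(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p): chance of seeing k or fewer group members."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

# Hypothetical inputs: Availability %, # Total, # Women
availability, n_total, n_women = 0.30, 100, 20

expected_women = availability * n_total
p_one_tail = binomial_lower_tail(n_women, n_total, availability)

print(f"expected = {expected_women:.0f}, observed = {n_women}, one-tail p = {p_one_tail:.4f}")
```

The selection rate comparison uses the hypergeometric distribution instead, because both groups' outcomes are observed in a 2 X 2 table rather than compared against an outside benchmark.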


When Does Adverse Impact Result in “Disparate Impact Discrimination”?

SELECTION RATE COMPARISON

• 2 X 2 Table Comparison

• Evaluates hires, promotions, terminations

• “Hypergeometric”

• Statistically Significant Result + No Job Relatedness / Validity = Disparate Impact Discrimination

AVAILABILITY COMPARISON

• Utilization Analysis

• Single Group Test

• “Binomial”

• See p. 58955 of Int. App Regs

• Statistically Significant Result + 6 “Possible Ingredients” = “Adverse Inference” or Evidence for Disparate Treatment Cases


Adverse Impact—Availability Comparison

• Statistical Significance + Six Possible Ingredients for “Adverse Inference” or Disparate Treatment

– #1 Failure to keep applicant records (sometimes referred to as an “adverse inference”—see 4D of the Guidelines)

– #2 Failure to run/keep adverse impact analyses on the selection or promotional processes (also an “adverse inference”—see 4D of the Guidelines)

– #3 Discriminatory recruiting practice (e.g., Hazelwood School District v. United States)

– #4 Discriminatory reputation “chilled” or “discouraged” certain group members from applying

– #5 Promoting employees through “appointment only” process (rather than conducting promotional processes)

– #6 Invalid “Basic Qualifications”


Adverse Impact: Availability Comparison

Statistical Significance + Six Possible Ingredients for “Adverse Inference” or Disparate Treatment

• Unless one or more of the 6 ingredients exists, statistically significant underutilization should not be directly equated with discrimination

• Several other factors can sometimes explain underutilization:

– Job interest

– Occupational qualifications

– Labor trends

– Traditional roles (e.g., engineering vs. clerical)

• Unless one of the “6 ingredients” exists, a specific practice, procedure, or test that caused the adverse impact will need to be identified (using statistical significance tests). The only exception is if the agency’s practices cannot be “separated for analysis purposes” (see 1991 Civil Rights Act)


Adverse Impact: Selection Rate Comparison (1991 CRA) & UGESP

Amends Section 703 of the 1964 Civil Rights Act (Title VII)

(k)(1)(A). An unlawful employment practice based on disparate impact is established under this title only if:

• A(i) a complaining party demonstrates that a respondent uses a particular employment practice that causes a disparate impact on the basis of race, color, religion, sex, or national origin, and the respondent fails to demonstrate that the challenged practice is job-related for the position in question and consistent with business necessity; OR,

• A(ii) the complaining party makes the demonstration described in subparagraph (C) with respect to an alternative employment practice, and the respondent refuses to adopt such alternative employment practice.


What Tools Should be used for Classic AI Analyses?

• For decades, EEO professionals have relied on “Chi-Square” type analyses for the 2x2 table question:

• Sometimes various corrections have been used (Yates, Cochran).

• Sometimes the Fisher Exact Test (FET) has been used

         Men   Women   Totals
Pass       8       2       10
Fail       2       6        8
Totals    10       8       18
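As a rough illustration of the “Chi-Square” type analysis, the sketch below runs the uncorrected Pearson chi-square and the Yates-corrected version on the 2 X 2 table from this slide (Men 8 pass / 2 fail, Women 2 pass / 6 fail). The function name and layout are assumptions for illustration, not the presenter's tool.

```python
from math import erfc, sqrt

def chi_square_2x2(a, b, c, d, yates=False):
    """Pearson chi-square for the table [[a, b], [c, d]]; returns (chi2, two-tailed p)."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2)  # Yates continuity correction
    chi2 = n * diff**2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = erfc(sqrt(chi2 / 2))  # upper tail of a chi-square with 1 degree of freedom
    return chi2, p

# Rows: Pass, Fail; columns: Men, Women
chi2, p = chi_square_2x2(8, 2, 2, 6)
chi2_yates, p_yates = chi_square_2x2(8, 2, 2, 6, yates=True)
z = sqrt(chi2)  # the equivalent "standard deviations" figure for a 2 X 2 table

print(f"uncorrected: chi2 = {chi2:.3f}, z = {z:.2f}, p = {p:.4f}")
print(f"Yates:       chi2 = {chi2_yates:.3f}, p = {p_yates:.4f}")
```

On this small table the uncorrected test falls below .05 while the Yates-corrected test does not, which is the kind of disagreement between tools that the following slides discuss.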


What Tools Should be used for Classic AI Analyses?

• When running “classic” (2 X 2 table) hires/proms/terms adverse impact, there are different tools available for the job:

– Chi-Square

– Z-test

– Fisher Exact Test

– Mid-p Fisher Exact Test

• Adverse impact is serious and no one wants to calculate liability statistics inaccurately.

Samples >200 w/ 50+ in each group

Samples <200 w/ <50 in each group


Why?

• The uncorrected FET:

– Is very conservative in estimating AI in small/mid samples

– Most (if not all) of the time when analyzing hires/proms/terms, the assumptions behind the FET’s use are not addressed

• Unconditional exact tests, mid-p corrected FETs, or Z-test/Chi-Squares should be used instead

• This issue has been reviewed and discussed in 80 stats articles and books

• See Biddle & Morris, 2011 and Biddle, 2011 for some lengthy research on this topic… but here’s a quick summary of one of the limitations (cont.)


Probability Theory Applied to 2X2 Tables

[Figure: Demonstration of "discreteness" in the FET probability distribution. Y-axis: probability (0 to 0.2). The FET has 4 "stopping places" below .05; Chi-Square theory has more, following the asymptotic "best estimate" line used by the Chi-Square. Values shown: FET: 0.0536; Mid-p: 0.0392; Uncond.: 0.0338.]
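The discreteness point can be checked by listing every p-value the FET is able to produce for a fixed set of margins. The sketch below assumes the margins of the 2 X 2 table shown earlier in the deck (row totals 10 and 8, column totals 10 and 8) and the common "sum the tables no more likely than the observed one" two-tailed rule; the function name is illustrative.

```python
from math import comb

def fet_two_tailed_pvalues(row1, row2, col1):
    """Two-tailed FET p-value for every table possible under fixed margins."""
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Unnormalized hypergeometric weight of each possible top-left cell count k
    weights = {k: comb(row1, k) * comb(row2, col1 - k) for k in range(lo, hi + 1)}
    total = sum(weights.values())
    # For each table, add up every table that is no more likely than it
    return {k: sum(w for w in weights.values() if w <= weights[k]) / total
            for k in weights}

pvals = fet_two_tailed_pvalues(row1=10, row2=8, col1=10)
attainable = sorted(set(pvals.values()))
below_05 = [p for p in attainable if p <= 0.05]

for p in attainable:
    print(f"{p:.4f}")
```

With these margins only four attainable p-values fall at or below .05, and the next value up is 0.0536, so the test can never land exactly on the .05 line.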


Probability Theory Applied to 2X2 Tables

[Figures: Actual Significance Level v. Desired (.05) Significance Level, shown for the Mid-P and for the FET (uncorrected).]


Important Questions for HR Professionals…

• What is the significance level used for testing whether a test is valid?

• What is the significance level used for testing Adverse Impact?

• Answers:

– Validity: .05

– Adverse Impact: .05

• What statistical tests are useful for answering these statistical questions?

– Validity: Pearson correlation is common

– Adverse Impact: Fisher Exact Test (under a variety of methods), Chi-Square, Z-test, etc.


Accuracy of Tests for Answering the “just .05 or less” Question

[Figure: Comparison Between FET and Mid-P Based on Sample Size (Based on Monte Carlo Simulations from Cited Articles). X-axis: Sample Size (0-20, 21-50, 50-75, 76-100, 101-125, 126-200). Y-axis: SD Value (1.50 to 2.60). Series: FET SD Required for Significance and Mid-p SD Required for Significance, each with a polynomial trend line. Annotations: average "overage" of Fisher Exact Test (amount higher than 1.96 to find an actual 1.96 finding); average "overage" of mid-p.]


How Accurately do the Tests Answer the .05 Question?

Actual FET/Mid-P Significance Levels (Compared to Desired .05 Level)

Sample Size                        Typical Alpha Range   % Below Desired .05 Level   Actual SD Required for Significance
0-20                               0.015                 70%                         2.43
21-50                              0.025                 50%                         2.24
50-75                              0.026                 47%                         2.22
76-100                             0.032                 36%                         2.15
101-125                            0.035                 30%                         2.11
126-200                            0.043                 13%                         2.02
Typical Estimate for n<50 (FET)    0.029                 41%                         2.19
Typical Estimate for n<50 (mid-P)  0.046                 8%                          1.99
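The columns of this table appear to be tied together by simple arithmetic. The sketch below assumes that the "Actual SD Required" column is the two-tailed normal critical value for the actual alpha, and that "% Below" is the shortfall relative to .05; both readings are inferences from the numbers, not something the slide states.

```python
from statistics import NormalDist

# (sample-size band, typical actual alpha, SD value reported on the slide)
rows = [
    ("0-20", 0.015, 2.43), ("21-50", 0.025, 2.24), ("50-75", 0.026, 2.22),
    ("76-100", 0.032, 2.15), ("101-125", 0.035, 2.11), ("126-200", 0.043, 2.02),
    ("n<50 (FET)", 0.029, 2.19), ("n<50 (mid-P)", 0.046, 1.99),
]

def sd_required(actual_alpha):
    """Two-tailed normal critical value matching the test's actual alpha."""
    return NormalDist().inv_cdf(1 - actual_alpha / 2)

def pct_below_desired(actual_alpha, desired=0.05):
    """How far the actual alpha falls short of the desired level, in percent."""
    return (desired - actual_alpha) / desired * 100

for band, alpha, slide_sd in rows:
    print(f"{band:>13}: alpha = {alpha:.3f}, "
          f"{pct_below_desired(alpha):.0f}% below .05, "
          f"SD required = {sd_required(alpha):.2f} (slide: {slide_sd})")
```

Under these assumptions the computed values agree with the slide to within rounding.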


Power Analysis for an Employer Using a 10% Hiring Rate

[Figure: Power Curve for Detecting Adverse Impact on a 1.0(d) Test Used with a 10% Overall Passing Rate. The chart answers the question: What percent of the time will the test find adverse impact when it exists? X-axis: Sample Size (Equal for Each Group), 5 to 80. Y-axis: 0% to 100%. Series: FET, FET (mid-P), Chi-Square. Annotation: Gap Showing Increased Likelihood of Missing AI When it Exists.]
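A power comparison of this kind can be approximated without simulation by enumerating every possible outcome. The sketch below rests on several assumptions that the slide does not spell out: "1.0(d)" is read as two normal score distributions one standard deviation apart, a single cutoff is set so that the combined pass rate is 10%, groups are equal in size, and significance is judged two-tailed at .05. All function names are illustrative.

```python
from math import comb
from statistics import NormalDist

def group_pass_rates(d=1.0, overall=0.10):
    """Pass rates of two equal-sized normal groups d SDs apart at one shared cutoff."""
    nd = NormalDist()
    lo, hi = -6.0, 6.0
    for _ in range(100):  # bisection on the cutoff score
        cut = (lo + hi) / 2
        rate_hi = 1 - nd.cdf(cut)      # higher-scoring group
        rate_lo = 1 - nd.cdf(cut + d)  # group centered d SDs lower
        if (rate_hi + rate_lo) / 2 > overall:
            lo = cut  # too many pass, so raise the cutoff
        else:
            hi = cut
    return rate_hi, rate_lo

def two_tailed_p(a, b, n, midp):
    """FET (or mid-p) p-value when a of n pass in one group and b of n in the other."""
    passes = a + b
    lo, hi = max(0, passes - n), min(n, passes)
    w = {k: comb(n, k) * comb(n, passes - k) for k in range(lo, hi + 1)}
    total = sum(w.values())
    p = sum(v for v in w.values() if v <= w[a]) / total
    return p - 0.5 * w[a] / total if midp else p

def power(n, midp, alpha=0.05):
    """Probability of a significant result, summed over every possible outcome."""
    rate_hi, rate_lo = group_pass_rates()

    def binom(k, rate):
        return comb(n, k) * rate**k * (1 - rate) ** (n - k)

    return sum(binom(a, rate_hi) * binom(b, rate_lo)
               for a in range(n + 1) for b in range(n + 1)
               if two_tailed_p(a, b, n, midp) <= alpha)

for n in (10, 20, 40):
    print(f"n = {n} per group: FET power = {power(n, False):.1%}, "
          f"mid-p power = {power(n, True):.1%}")
```

Because the mid-p value is never larger than the FET p-value for the same table, the mid-p version can only reject at least as often, which is the gap the chart annotates.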


How Do You Compute the Mid-P?

• It’s rather simple…Many Stat Packages will provide the mid-p

• If you already have an AI tool or stat program, just

– Compute the 2-tail FET

– Subtract ½ of the probability of the first (observed) table from that value

– The “hypergeomdist” function can be used

– If you want to avoid the hassle, just calculate mid-p values for FETs that are “on the cusp” of significance, such as 1.80 SDs (corresponding to p-values of about .07)

• Can easily be done for Mantel-Haenszel style analyses

• If the exact unconditional test is preferred: http://www.stat.ncsu.edu/exact/
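The steps above can be sketched in a few lines. This example applies them to the 2 X 2 table shown earlier in the deck (Men 8 pass / 2 fail, Women 2 pass / 6 fail); the function name is illustrative, and the two-tailed rule used here (summing all tables no more likely than the observed one) is one common convention rather than the only one.

```python
from math import comb

def fet_and_midp(a, b, c, d):
    """Two-tailed Fisher Exact p-value and its mid-p correction for [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Hypergeometric weight of each possible value of the top-left cell
    weights = {k: comb(row1, k) * comb(row2, col1 - k) for k in range(lo, hi + 1)}
    total = sum(weights.values())
    p_observed = weights[a] / total  # probability of the observed ("first") table
    p_fet = sum(w for w in weights.values() if w <= weights[a]) / total
    p_midp = p_fet - 0.5 * p_observed  # subtract half of the observed table
    return p_fet, p_midp

p_fet, p_midp = fet_and_midp(8, 2, 2, 6)
print(f"FET = {p_fet:.4f}, mid-p = {p_midp:.4f}")
```

For this table the FET comes out just above .05 and the mid-p just below it, matching the FET: 0.0536 and Mid-p: 0.0392 figures shown on the discreteness slide.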


Practical Significance

• Descriptive Ratios/Percents:

– Selection Rate Difference (SRD)

– Impact Ratio Analysis (IRA)

• Shortfalls and Hypotheticals

– Rule of One

– Parity Shortfall

– Statistical Significance Shortfall (Number Needed to Remove Statistical Significance)

– 80% Rule violations

• Statistical:

– Odds Ratio

– Phi Coefficient


Practical Significance: The “Selection Rate Difference” (SRD)

• The Selection Rate Difference (SRD) looks at the success rate difference between two groups as a “simple effect size”

• For example, men pass rate = 80%; women pass rate = 85% is a 5% SRD

– 2% SRD might be “small”

– 5% SRD might be “medium”

– 10% SRD might be “large”

• Considering the SRD is useful, especially when samples are large and the 2 X 2 test results are highly significant
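A minimal sketch of the SRD calculation, using the slide's 80% and 85% example. Treating the "small / medium / large" figures as hard cutoffs is an assumption made here for illustration; the slide presents them only as rough benchmarks.

```python
def selection_rate_difference(rate_a, rate_b):
    """Absolute difference between two pass rates, rounded to absorb float noise."""
    return round(abs(rate_a - rate_b), 6)

def describe_srd(srd):
    """Label an SRD using the slide's tentative benchmarks as cutoffs (an assumption)."""
    if srd >= 0.10:
        return "large"
    if srd >= 0.05:
        return "medium"
    if srd >= 0.02:
        return "small"
    return "below the 'small' benchmark"

srd = selection_rate_difference(0.80, 0.85)
label = describe_srd(srd)
print(f"SRD = {srd:.0%} ({label})")
```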


Practical Significance: The “Impact Ratio Analysis” (IRA)

• The Impact Ratio Analysis (IRA) provides a single metric relevant to one group’s success rate compared to another

• For example, an 80% male passing rate and 85% female passing rate is 80% / 85%, or 94.1%

• IRAs provide a comparison between one group’s passing rate relative to another group’s, as well as the overall success rate of both groups
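The impact ratio is a single division, and the 80% Rule listed among the practical significance measures is a threshold on that ratio. A small sketch using the slide's example (names are illustrative):

```python
def impact_ratio(focal_rate, reference_rate):
    """Focal group's pass rate divided by the reference group's pass rate."""
    return focal_rate / reference_rate

def violates_80_percent_rule(ratio):
    """True when the focal group's rate is below 80% of the reference group's rate."""
    return ratio < 0.80

ratio = impact_ratio(0.80, 0.85)  # men 80%, women 85%
flagged = violates_80_percent_rule(ratio)
print(f"impact ratio = {ratio:.1%}, 80% rule violation: {flagged}")
```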


Practical Significance: Shortfalls and Hypotheticals

• “Rule of One” from the Uniform Guidelines

• # Needed to Parity:

– How many additional disadvantaged group members are necessary to equal the success rate of the advantaged group?

– Used after discrimination has been established to determine “make whole relief”

• # Needed to Remove Statistical Significance

– How many “flip-flops” are necessary to remove the statistical significance finding?

– Some legal precedence

• # Needed to Remove 80% Violation

– How many “flip-flops” are necessary to remove the 80% rule violation?

– Two cases
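The three shortfall counts can be sketched as follows. Two assumptions are made here that the slide does not define: a "flip-flop" is taken to mean moving one success from the advantaged group to the disadvantaged group, and significance is judged with an uncorrected chi-square at .05. The counts (40 of 100 women passing, 60 of 100 men passing) are borrowed from the odds ratio table later in the deck.

```python
from math import erfc, sqrt

def chi_square_p(pass_a, fail_a, pass_b, fail_b):
    """Two-tailed p-value of the uncorrected 2 X 2 chi-square test."""
    n = pass_a + fail_a + pass_b + fail_b
    num = n * (pass_a * fail_b - fail_a * pass_b) ** 2
    den = (pass_a + fail_a) * (pass_b + fail_b) * (pass_a + pass_b) * (fail_a + fail_b)
    return erfc(sqrt(num / den / 2))

def parity_shortfall(pass_dis, n_dis, pass_adv, n_adv):
    """Extra disadvantaged-group successes needed to match the advantaged group's rate."""
    return pass_adv * n_dis / n_adv - pass_dis

def is_significant(pass_dis, n_dis, pass_adv, n_adv):
    return chi_square_p(pass_dis, n_dis - pass_dis, pass_adv, n_adv - pass_adv) <= 0.05

def violates_80(pass_dis, n_dis, pass_adv, n_adv):
    return (pass_dis / n_dis) / (pass_adv / n_adv) < 0.80

def flip_flops_needed(pass_dis, n_dis, pass_adv, n_adv, still_flagged):
    """Count one-for-one swaps until the chosen flag (significance or 80% rule) clears."""
    flips = 0
    while flips < pass_adv and still_flagged(pass_dis + flips, n_dis, pass_adv - flips, n_adv):
        flips += 1
    return flips

women, men = (40, 100), (60, 100)
to_parity = parity_shortfall(*women, *men)
to_nonsignificance = flip_flops_needed(*women, *men, is_significant)
to_clear_80 = flip_flops_needed(*women, *men, violates_80)
print(to_parity, to_nonsignificance, to_clear_80)
```

The gap between the parity figure and the much smaller flip-flop counts is the reason the deck treats these hypotheticals as practical significance measures rather than as estimates of harm.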


Practical Significance: A Few Legal Examples and Updates…

Dixon v. Margolis (1991), Citing 7th Circuit Case Against PS Concepts

• The defendants' expert analyzed each category of tests and each test period separately. Out of the 12 different situations, 3 situations had adverse impact on blacks…

• In 5 of the 9 situations where he found no adverse impact, however, the expert had to adjust the figures to find no adverse impact.

• For all 12 situations, he added 2 to the number of blacks promoted and subtracted 2 from the number of whites promoted.

• Compare Washington v. Electrical Joint Apprenticeship & Training Committee of Northern Indiana (7th Cir., 1988) (“it is an unacceptable statistical procedure to turn a large sample into a small one by arbitrarily excluding observations”).

• The court’s judgment objected to “playing hypotheticals” with the actual observed numbers relevant in the case.


Practical Significance: A Few Legal Examples …

• This Court has never established ‘practical significance’ as an independent requirement for a disparate impact case, and we decline to do so here.

• The EEOC Guidelines themselves do not set out “practical” significance as an independent requirement, and we find that in a case in which the statistical significance of some set of results is clear, there is no need to probe for additional “practical” significance.

• Statistical significance is relevant because it allows a fact-finder to be confident that the relationship between some rule or policy and some set of disparate impact results was not the product of chance.

• This goes to the plaintiff’s burden of introducing statistical evidence that is “sufficiently substantial” to raise “an inference of causation.”

• There is no additional requirement that the disparate impact caused be above some threshold level of practical significance. Accordingly, the District Court erred in ruling “in the alternative” that the absence of practical significance was fatal to Plaintiffs’ case.

[Stagi v. National Railroad Passenger Co., No. 09-3512 (3d Cir., 2010)]


Practical Significance: Statistical Methods (Odds Ratio)

• Odds Ratio (OR)

• The OR is the ratio of the odds of an event occurring in one group to the odds of it occurring in another group.

• An OR of 1 indicates that the event under study is equally likely to occur in both groups. An odds ratio greater than 1 indicates that the condition or event is more likely to occur in the first group.

         Pass   Fail   Totals
Women      40     60      100
Men        60     40      100

Passing Odds of Women: 67%

Passing Odds of Men: 150%

Odds Ratio: 2.25
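The figures on this slide follow directly from the pass and fail counts. A short sketch (variable names are illustrative):

```python
def odds(passed, failed):
    """Odds of passing: successes divided by failures."""
    return passed / failed

odds_women = odds(40, 60)  # about 0.67, shown on the slide as 67%
odds_men = odds(60, 40)    # 1.5, shown on the slide as 150%

# Men are the "first group" here, so a value above 1 means men are more likely to pass
odds_ratio = odds_men / odds_women

print(f"women = {odds_women:.0%}, men = {odds_men:.0%}, odds ratio = {odds_ratio:.2f}")
```

Note that the odds ratio (2.25) is larger than the ratio of the pass rates themselves (60% / 40% = 1.5), so the two measures should not be read interchangeably.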


Practical Significance: Statistical Methods (Phi Coefficient)

• Phi Coefficient

• Computed as the square root of the Chi-square statistic divided by the sample size.

• Measure of association for the Chi-square test

• Reveals the extent of the relationship between two variables in a 2 X 2 table.
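A sketch of the phi calculation, applied to the Women 40/60, Men 60/40 table from the odds ratio slide. The second helper assumes the uncorrected 2 X 2 case, where the chi-square statistic equals the squared SD (Z) value, so phi can also be written as the SD value divided by the square root of N. Function names are illustrative.

```python
from math import sqrt

def chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square statistic for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def phi_from_table(a, b, c, d):
    """Phi coefficient: square root of chi-square divided by the sample size."""
    return sqrt(chi_square_2x2(a, b, c, d) / (a + b + c + d))

def phi_from_sd(sd, n):
    """Phi implied by a given SD (Z) value at sample size n, since chi2 = sd ** 2."""
    return sd / sqrt(n)

phi_table = phi_from_table(40, 60, 60, 40)
grid = {(sd, n): round(phi_from_sd(sd, n), 3)
        for sd in (1.96, 2.33, 2.58, 3.09, 3.72)
        for n in (30, 100, 1000)}

print(f"phi for the 40/60 vs 60/40 table: {phi_table:.2f}")
for (sd, n), phi in grid.items():
    print(f"SD = {sd}, N = {n}: phi = {phi}")
```

The grid shows why a fixed SD value corresponds to a shrinking phi as N grows: a just-significant result in a very large sample reflects a weak association.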


Practical Significance: Statistical Methods (Phi Coefficient)

[Figure: Phi Coefficient for Various SD and Sample Size Configurations. X-axis: sample size (N = 30, 50, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000). Y-axis: Phi Coefficient (0.00 to 0.80). Series: SD values 1.96, 2.33, 2.58, 3.09, 3.72.]


Practical Significance: Recommendations

• For reasons explained in the previous slides, we recommend that our clients:

– Rely primarily on statistical significance for determining adverse impact

– Consider the 80% test and shortfall calculations (i.e., the number needed to eliminate a statistically significant finding) as two methods for sensibly evaluating practical significance

– However, validation efforts should be driven whenever a practice, procedure, or test has only statistical significance

• The BIG issue, ultimately, is how many people are impacted by the 2 X 2 table?


Taking Practical Significance Arguments to the “Other Side of the Table”

PS Concepts Applied to the Second Burden (Validity)

[Figure: Criterion-Related Study scatterplot. X-axis: Test Score (0 to 100). Y-axis: Performance Measure (0 to 80).]

• Validity Coefficient to Start: .35 (Significant)

• Validity Coefficient After Hypothetically Changing Two Data Points: r=.26, p=.061 (Not Significant)


Logistic Regression

• Logistic Regression (LR) is a statistical method used for evaluating why adverse impact may be occurring in a hiring or promotional process.

• Classic adverse impact analysis can identify only whether the observed hiring or promotion rates between two groups are significantly different… however, LR is much more powerful:

– LR can determine if job-relevant qualification factors (e.g., experience or education) of the individuals included in the analysis explain the difference in hiring or promotion (as opposed to gender or race being the reason).


Logistic Regression

[Diagram: Hiring Decisions modeled from Education Level, Relevant Experience, Reference Check Score, and GENDER.]


Logistic Regression

• LR needs to be applied to job qualification factors that were ACTUALLY USED OR CONSIDERED in the selection process

• LR should be mapped onto actual positions, hiring data, and hiring decisions, NOT theoretical ones

• LR is useful for weighing the practical importance of job qualification factors in the hiring or promotion process

• LR can also be useful for determining “shortfall” calculations

– For example, how many women would have been hired “but for” the discrimination?

– What is the total shortfall for women, given what the model can explain?
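For reference, a generic form of the model these slides describe can be written as below. The four predictors are the ones named in the deck's diagram; the coefficient symbols and the coding of each variable are illustrative assumptions, not the presenter's specification.

```latex
\log\frac{P(\text{hired})}{1 - P(\text{hired})}
  = \beta_0
  + \beta_1\,\text{Gender}
  + \beta_2\,\text{EducationLevel}
  + \beta_3\,\text{RelevantExperience}
  + \beta_4\,\text{ReferenceCheckScore}
```

In this form, $e^{\beta_1}$ is the odds ratio for gender with the qualification factors held constant. If $\beta_1$ is no longer statistically significant once the job-related factors are included, those factors account for the observed difference in hiring rates; if it remains significant, they do not.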
