Transcript
Page 1:

Causal Logic Models: Incorporating Change & Action Models; Fidelity-Adaptation Relationship; Stakeholder Engagement & Partnership Strategies

PADM 522—Summer 2012

Lecture 3

Professor Mario Rivera

Page 2:

[Diagram: Intervention and Other Factors both feeding into Outcome]

A causal logic model clarifies the program’s theory, or change and action modeling of the way that interventions produce outcomes, by isolating program effects from other factors or influences.

Multiple methods may be used to establish the relative importance of various causative influences. These include experimental, quasi-experimental, and cross-case analyses, and they range from quantitative to mixed-methods to purely qualitative approaches. Most evaluations use mixed-methods designs.

Causal logic models—essential definitions, methods

Page 3:

Components of a causal logic model (in red) pertain to program theory; they augment regular logic modeling. Left to right on the graphic one would find, in some order:

Inputs (fiscal and human resources invested; key programmatic initiatives)

Assumptions, underlying conditions, premises (may specify ones under program control and outside program control, as in USAID’s Logical Framework or LogFrame)

Causative (if-then) linkages among program functions, indicating change and action models or program theory

Program activities, services

Immediate or short-term outcomes (process measures)

Intermediate or medium-term outcomes (outcome measures)

Long-term results, long-term outcomes, or program impact (impact measures)

Page 4:

The Causal Logic Model Framework: Incorporating “If-then” Causative Linkages Among Program Components

[Diagram, reconstructed as a left-to-right if-then chain:]

Inputs: Personnel resources; Funding/other resources; Curriculum

Assumptions/Conditions: If condition A exists; If need B exists; If condition C exists

Activities/Services: Do AA; Provide BB; Provide training about CC

Immediate Outcomes: Something happens; Population gets BB; # Trained

Intermediate Outcomes: A later result; A later result

Long-Term Outcomes/Results: Condition ABC improves
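To make the if-then chain concrete, here is a minimal Python sketch (ours, not from the lecture) that stores the diagram’s stages in order and prints each causative linkage. The stage names and entries are taken from the reconstructed diagram above; the structure itself is only illustrative.

```python
# Minimal sketch (ours): the slide's if-then chain as ordered stages,
# each mapped to its example entries from the diagram above.
from collections import OrderedDict

logic_model = OrderedDict([
    ("Inputs", ["Personnel resources", "Funding/other resources", "Curriculum"]),
    ("Assumptions/Conditions", ["Condition A exists", "Need B exists", "Condition C exists"]),
    ("Activities/Services", ["Do AA", "Provide BB", "Provide training about CC"]),
    ("Immediate Outcomes", ["Something happens", "Population gets BB", "# Trained"]),
    ("Intermediate Outcomes", ["A later result", "A later result"]),
    ("Long-Term Outcomes/Results", ["Condition ABC improves"]),
])

# Walk the chain, printing each causative (if-then) linkage.
stages = list(logic_model)
for earlier, later in zip(stages, stages[1:]):
    print(f"If {earlier} hold, then {later} follow:")
    for item in logic_model[later]:
        print(f"  - {item}")
```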

Page 5:

A Causal Model Worksheet (one format)

Column headers: Inputs | Assumption or Underlying Condition | Activities | Immediate Outcomes | Intermediate Outcomes | Long-Term Outcomes/Results/Impact

Page 6:

Making program theory or a program’s change model explicit: an example from Chen

In a hypothetical spouse abuse treatment program that relies on group counseling, with a target group of “abusers convicted by a court,” Chen (page 18) proposes that the change model may work as follows: “[T]he designers decide that group counseling should be provided weekly for 10 weeks because they believe that 10 counseling sessions is a sufficient ‘dose’ for most people” who are similarly situated. Here the tacit theory or change model is bound up with the expectation that counseling is a sufficient intervention to elicit behavioral change in adjudicated abusers. So-called “zero-tolerance” automatic incarceration programs instead build on the premise that incarceration is required as a deterrent and as a prompt for behavioral change.

Page 7:

Action and change models and partnerships

Chen divides program theory into two component parts: an action model and a change model. An action model should incorporate both the program’s ecological context and dimensions of inter-agency collaboration. Using the just-cited example of a domestic violence program, Chen argues that the program would fail if it lacked a working relationship with the courts, police, and community social agency partners and advocacy groups (p. 26). It is therefore important to align models as well as strategies when working in concert with other agencies, although that can be very difficult.

Partnered programs may have different change models at work, or they may operate on different concepts of a single model set. What if one partner agency in a domestic violence collaborative operates on one set of assumptions (e.g., a model based on zero-tolerance, and deterrence through incarceration) while another does so based on a rehabilitation & counseling model?

Such programs create complex effects chains, as the efforts of various partners have impact in different places, at different times.

Page 8:

Chen’s Stakeholder Engagement and Partnership Strategies

Chen provides another dimension of partnership in his evaluation framework, namely that of evaluator-stakeholder partnership, particularly in the development and assessment of partnered programs. This essentially occurs when program principals and stakeholders bring evaluators into the program coalition and program development effort as key partners. What are the pros and cons of this kind of evaluator involvement in program development? At what junctures of evaluator involvement are dilemmas likely to present themselves? Might it be possible for an evaluator to become involved in this way early on in a program but then detach himself or herself for the purposes of outcome evaluation? If not, why not? Can stakeholders empower evaluators? How?

Page 9:

“Integrative Validity”—from Chen, Huey T., 2010. “The bottom-up approach to integrative validity: A new perspective for program evaluation,” Evaluation and Program Planning, Elsevier, vol. 33(3), pages 205-214, August.

“Evaluators and researchers have . . . increasingly recognized that in an evaluation, the over-emphasis on internal validity reduces that evaluation’s usefulness and contributes to the gulf between academic and practical communities regarding interventions” (p. 205).

Chen proposes an alternative integrative validity model for program evaluation, premised on viability and “bottom-up” incorporation of stakeholders’ views and concerns. The integrative validity model and the bottom-up approach enable evaluators to meet scientific and practical requirements, advance external validity, and gain a new perspective on methods. For integrative validity to obtain, stakeholders must be centrally involved, consistent with Chen’s emphasis on addressing both scientific and stakeholder validity.

Page 10:

Key Concepts in Impact Assessment

Linking interventions to outcomes:

Establishing impact essentially amounts to establishing causality.

Most causal relationships in social science and behavioral science are expressed as probabilities.

Conditions limiting assessments of causality: external conditions and causes; internal conditions (such as biased selection); other social programs with similar targets.

Page 11:

Key Concepts in Impact Assessment

“Perfect” versus “good enough” impact assessments:

The intervention and target may not allow a perfect design. Time and resource constraints apply. Importance often determines rigor. Review design options to determine the most appropriate; mixed methods are most often used. Quasi-experiments and cross-case or cross-site designs, and “natural experiments,” are typically the closest one can come to true experimentation. These may provide as much or more rigor than attempts at randomized experiments on a clinical model.

Page 12:

Key Concepts in Impact Assessment

Gross versus net outcomes. Net outcomes and the counterfactual: net outcomes equal observed outcomes of the program minus projected outcomes without the program.

Gross Outcome = Effects of Intervention (net effect) + Effects of other processes (extraneous confounding factors) + Design Effects
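Restated symbolically (the notation below is ours, added for clarity; it is not from the slides):

```latex
% Decomposition of gross outcome (notation ours, for illustration):
%   Y_gross : observed gross outcome
%   tau     : effect of the intervention (the net effect)
%   C       : effects of other processes (extraneous confounding factors)
%   D       : design effects
\[
  Y_{\text{gross}} = \tau + C + D
  \quad\Longrightarrow\quad
  \tau = Y_{\text{gross}} - C - D
\]
```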

Page 13:

Program impacts as comparative net outcomes

If, for example, one finds in an anti-smoking program that only 2 percent of targeted youth have quit or not taken up smoking by virtue of the program, the program appears ineffective. However, if in comparable populations not exposed to it there was a 1.5 percent increase in smoking behaviors, the program seems more effective. Arguably, it was able to stem some of the naturally occurring increase in tobacco use (first or continued use). The critical distinction is the difference between outcomes and impacts. In evaluation, an outcome is the value of any variable measured after an intervention. An impact is the difference between the outcome observed and what would have occurred without the intervention; i.e., an impact is the difference in outcomes attributable to the program. Impacts also must entail lasting changes in a targeted condition.
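One reading of the slide’s figures makes the arithmetic explicit (the sign convention is ours):

```latex
% Sign convention (ours): negative = less smoking.
% Observed outcome with the program:  -2.0 percentage points
% Counterfactual change (comparison): +1.5 percentage points
\[
  \text{Impact} = (-2.0) - (+1.5) = -3.5 \ \text{percentage points}
\]
```

That is, the program is credited not only with the observed 2-point reduction but also with averting the 1.5-point increase seen in comparable, unexposed populations.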

Page 14:

Key terminology re: attribution/causation

Independent variables – direct policy/program interventions

Dependent variables – outcomes

Intervention variables – a special class of independent variables that refer to policy/programming factors as discrete variables; these are endogenous (internal) factors

Exogenous factors – external to the program; contextual

Counterfactual – the state of affairs that would have occurred without the program

Gross impact – observed change in outcome or outcomes

Net impact – portion of gross impact attributable to the program intervention; program intervention effects minus the counterfactual

Confounding variables – other factors making for impact felt or measured within the program
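A minimal Python sketch (invented numbers, echoing the anti-smoking example on Page 13) of how gross and net impact separate once a comparison group stands in for the counterfactual, in a simple difference-in-differences calculation:

```python
# Minimal sketch (invented data): gross vs. net impact, using a
# comparison group as the stand-in for the counterfactual.

treatment_before, treatment_after = 0.400, 0.380    # e.g., smoking rates
comparison_before, comparison_after = 0.400, 0.415

gross_impact = treatment_after - treatment_before             # -0.020
counterfactual_change = comparison_after - comparison_before  # +0.015
net_impact = gross_impact - counterfactual_change             # -0.035

print(f"Gross impact: {gross_impact:+.3f}")
print(f"Net impact:   {net_impact:+.3f}")
```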

Page 15:

Confounding Factors—exogenous (external) & endogenous (internal)

Exogenous confounding factors—other programs and messages, socioeconomic context.

Endogenous effects of uncontrolled selection: preexisting differences between treatment and control groups; self-selection; program location and access; deselection processes (attrition bias).

Endogenous change: secular drift; interfering events internal to the program; maturational trends.

Page 16:

Design Effects

Choice of outcome measures. A critical measurement problem in evaluations is that of selecting the best measures for assessing outcomes: conceptualization, reliability, feasibility, proxy and indirect measures.

Missing information. Missing information is generally not randomly distributed, and it often must be compensated for by alternative survey items, unobtrusive measures, or estimates.

Page 17:

Design Strategies Compensating for Experimental Controls

Full- versus partial-coverage programs. Full coverage means the absence of a control group. This is the norm for social programs, since it is unfeasible to deny the intervention or treatment to a control group of participants.

The evaluator must then use reflexive controls, for instance cross-case and cross-site comparisons internal to the program. “Reflexive controls” means program-specific approximations of experimental controls.

Page 18:

Realities of Randomized Experimental Design: Afterschool Science Program Example

One would need to recruit all interested and eligible middle school students to create a large enough subject pool, when it is hard enough to recruit adequately sized cohorts.

One would need to ask parents and students for permission to randomly assign to one of two conditions, then divide subjects into the two conditions. But what would the two conditions be? Denial of program benefits is unfeasible, and it would alienate everyone—parents, students, teachers. Try two curricula? Expensive, and it raises the question of what is really being evaluated.

One could focus outcome evaluation efforts on randomly assigned subjects, while including all subjects in process evaluation.

However, it is not clear that one would learn any more than otherwise from all this effort. Quasi-experiments and cross-case design would likely offer equal rigor.

Page 19:

Example: One experimental-design evaluation examined whether a home-based mentoring intervention forestalled a 2nd birth for at least 2 years after an adolescent’s 1st birth

Does participation in the program reduce the likelihood of an early 2nd birth? Randomized controlled trial involving first-time African-American adolescent mothers (n=181) younger than age 18.

Intervention based on social cognitive theory, focused on interpersonal negotiation skills, adolescent development, and parenting. Delivered bi-weekly until the infant’s first birthday. Mentors were African-American, college-educated single mothers.

Control group received usual care—no differences in baseline contraceptive use or other measures of ‘risk.’

Follow-up at 6, 13, and 24 months after recruitment at first delivery. Response rate 82% at 24 months.

Intervention mothers were less likely than control mothers to have a second infant; two or more intervention visits more than tripled the odds of avoiding a 2nd birth within 2 years of the 1st.

Black et al. (2006). Delaying second births among adolescent mothers: A randomized, controlled trial of a home-based mentoring program. Pediatrics, 118, e1087-1099.
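The “tripled the odds” finding can be unpacked with the standard odds-ratio formula; the proportions below are invented for illustration and are not from the study.

```latex
% Odds ratio for avoiding a 2nd birth, treatment vs. control;
% p_t and p_c are the proportions avoiding a 2nd birth in each group:
\[
  \mathrm{OR} = \frac{p_t/(1-p_t)}{p_c/(1-p_c)}
\]
% "More than tripled the odds" means OR > 3. With invented
% proportions p_t = 0.85 and p_c = 0.62:
\[
  \mathrm{OR} = \frac{0.85/0.15}{0.62/0.38} \approx \frac{5.67}{1.63} \approx 3.5
\]
```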

Page 20:

Incorporate Process Evaluation Measures in Outcome Analysis

Process evaluation assesses qualitative and quantitative measures of program implementation, e.g.:

Attendance data. Participant feedback. Program-delivery adherence to implementation guidelines.

These measures facilitate replication and make possible a greater understanding of outcome evaluation findings, as well as program improvement.

They also help avoid a typical evaluation error: concluding that a program is not effective when in fact the program was not implemented as intended. Program stakeholders may point out that discrepancy if they are consulted about process, thereby “empowering” the outcome evaluation.

Source: USDHHS. (2002). Science-based prevention programs and principles, 2002. Rockville, MD: Author.

Page 21:

Example: Children’s Hospital Boston study to increase parenting skills and improve attitudes about parenting among parenting teens through a structured psycho-educational group model

All parenting teens (n=91) were offered a 12-week group parenting curriculum. A comparison group (n=54) declined the curriculum but agreed to participate in the evaluation.

Pre-test and post-test measures included the Adult-Adolescent Parenting Inventory (AAPI) and the Maternal Self-Report Inventory (MSRI). Analysis controlled for mother’s age, baby’s age, and demographics.

Evaluation results: Program participants, and especially those who attended more sessions, improved in their mothering role, perception of childbearing, developmental expectations of the child, and empathy for the baby, and they saw a reduced frequency of problems in child and family events.

Couldn’t comparable results have been attained without going to the trouble of experimental design?

Source: Woods et al. (2003). The parenting project for teen mothers: The impact of a nurturing curriculum … Ambul Pediatr, 3, 240-245.

Page 22:

Afterschool Science Program Causal Logic Model: Inputs, Mediating and Moderating Factors, Outcomes, and Impacts

[Diagram, reconstructed as lists:]

Inputs: Curriculum design; Science Camp; Hands-on program; Coaching & scientist visits.

Process evaluation covers program delivery: Tested program content; Skilled program delivery; Stimulating lab activities.

Mediators: Best-practices-based curricular content both builds on and strengthens in-school science.

Moderators: Poverty; family linguistic and education barriers; historic gender- and ethnicity-based constraints on educational and professional aspirations.

Outcome evaluation covers the outcome chain:

Short-term outcomes: Increased student role-identification as a scientist and personal interest in learning science; increased self-efficacy in science.

Medium-term outcomes: Increased student desire and capacity to engage in science; increased involvement in science.

Long-term outcomes, or impacts: Improved ability to succeed academically; greater school retention, with more high school grads going to college; more opt for science courses and major in science; more consider a science career.

Page 23:

Science Camp example

Randomized experimental design unfeasible, undesirable. What is the comparison group? It is not possible to identify close control groups; non-participants in the same middle schools are not really closely comparable (self-selection, demographics). Non-participants in other schools or in other local afterschool programs are not comparable either.

Use other afterschool science programs for middle-school students nationally as the comparison group, especially those targeting or largely incorporating girls and students from historically underrepresented minorities. A targeted literature review with over 80 citations was the basis of comparison. Most studies find negligible gains in science knowledge and academic performance, while a few do find modest gains in interest in and self-efficacy in science.

Page 24:

Literature review as analytical synthesis

The extensive literature review developed for the 2010 evaluation set the backdrop for the outcome findings in the 2011 evaluation. The subject became the program itself, and its significant positive outcomes, against the baseline of limited-gain or ambiguous impact findings in dozens of other national and international evaluations. Findings for the 2010 and 2011 evaluations were considered together, in finding that the Science Camp consistently produced major gains in knowledge, self-efficacy, and motivation toward as well as identification with science. This is a more comprehensive standpoint than localized comparisons. The literature review itself became part of the evaluation methodology.

Page 25:

Science Camp Outcome Measures

The Science Camp evaluation found significant gains in science content knowledge, aspiration, and self-efficacy. Repeated-measures paired t-tests were used to gauge gains in knowledge for each subject-matter module. T-tests do not require (or allow for) randomization, but they do set up a comparison between observed results and the results to be expected from chance variation.

The formula for the t-test is a ratio: the numerator is the difference between the two means or averages; the denominator is a measure of the variability or dispersion of the scores.

A Science Attitude Survey developed as a synthesis of proven tests (in the 2011 Report) showed major motivation gains. Unpaired t-tests were used for this assessment.
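In symbols, the ratio described above takes these standard forms (paired for the pre/post module tests, unpaired for the attitude survey):

```latex
% Paired t-test on per-student gains d_i = post_i - pre_i
% (d-bar = mean gain, s_d = SD of the gains, n = number of pairs):
\[
  t_{\text{paired}} = \frac{\bar{d}}{s_d/\sqrt{n}}
\]
% Unpaired (two-sample) t-test comparing two group means:
\[
  t_{\text{unpaired}} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}
\]
```

A minimal Python sketch (invented scores, using SciPy) of how each test would be run; the data are illustrative only, not from the evaluation:

```python
# Minimal sketch (invented scores): paired t-test for module knowledge
# gains, unpaired t-test for an attitude-survey group comparison.
from scipy import stats

pre  = [52, 60, 45, 70, 58, 63, 49, 55]   # hypothetical pre-test scores
post = [61, 68, 50, 78, 66, 70, 57, 60]   # hypothetical post-test scores
t_paired, p_paired = stats.ttest_rel(post, pre)
print(f"paired t = {t_paired:.2f}, p = {p_paired:.4f}")

group_a = [3.1, 3.4, 2.9, 3.8, 3.5]       # hypothetical attitude scores
group_b = [2.6, 2.8, 3.0, 2.7, 2.9]
t_unpaired, p_unpaired = stats.ttest_ind(group_a, group_b)
print(f"unpaired t = {t_unpaired:.2f}, p = {p_unpaired:.4f}")
```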

Page 26:

Another Example: Strategic Prevention Framework State Incentive Grant (SPF SIG) New Mexico Community Causal Logic Model: Reducing alcohol-related youth traffic fatalities

[Diagram, reconstructed as lists:]

Causal Factors: Low or discount PRICING of alcohol; easy RETAIL ACCESS to alcohol for youth; easy SOCIAL ACCESS to alcohol; SOCIAL NORMS accepting and/or encouraging youth drinking; PROMOTION of alcohol use (advertising, movies, music, etc.); low ENFORCEMENT of alcohol laws; low PERCEIVED RISK of alcohol use.

Strategies (examples): Media advocacy to increase community concern about underage drinking; restrictions on alcohol advertising in youth markets; social event monitoring and enforcement; bans on alcohol price promotions and happy hours; enforcement of underage retail sales laws.

Substance Use: Underage DRINKING AND DRIVING; young adult DRINKING AND DRIVING; underage BINGE DRINKING; young adult BINGE DRINKING.

Substance-Related Consequences: High rate of alcohol-related crash mortality among 15 to 24 year olds.

Page 27:

Chen: Program Implementation and Fidelity

Assessment of program fidelity is a part of impact evaluation. “Fidelity” = congruence between program outcomes and design:

Consistency with goals articulated in funding proposals, position papers, or other reports and program sources

Consistency with key stakeholder intent (e.g., the intent of a foundation, legislature, or other funding or authorizing sources)

Congruence in program design, implementation, and evaluation

Important dimensions of fidelity:

Coverage of target populations as planned and promised

Preservation of the causal mechanism underlying the program (e.g., childhood inoculations as a crucial initiative in improving children’s health outcomes)

Preserving the defining features of the program when scaling up in size and/or scope

The fidelity-adaptation relationship is important; maintaining fidelity requires creative adaptation to changing and unexpected circumstances (not rigid or formulaic conformance to the original plan)

Page 28:

Further definition of program fidelity from Chen

Fidelity means that the implemented model is substantially or essentially the same as the intended model. Fidelity means that normative theory (what should be accomplished), causative theory (anticipated causal processes), and implicit and explicit conceptions of these, are mutually consistent:

Normative theory (prescriptive model/theory): The “what” and “how” are and remain congruent. Relationships among program activities, outputs, outcomes, and moderators remain relatively constant.

Causative theory (causal theory, change model or theory): The “why” of the program does not essentially change. Mediating factors or moderators, the factors making for conversion from action to outcome (from a systems perspective), remain reasonably constant.

Page 29:

Chen: articulating and testing program theory

Chen addresses the role of stakeholders in regard to program theory—recall Chen’s contrast between scientific validity and stakeholder validity. The evaluator can ascertain program theory by reviewing existing program documents and materials, interviewing stakeholders, and creating evaluation workgroups with them (a participatory and consultative mode of interaction). S/he may also facilitate discussions, on topics ranging from strategy to logic models to program theory.

Discussion of program theory entails forward reasoning and backward reasoning in some combination—either (1) projecting from program premises or (2) reasoning back from actual or desired program outcomes. The terms “feedback” and “feed-forward” are also used. An action model may be articulated in draft form by the evaluator as a consequence of facilitated discussion, then distributed to stakeholders and program principals for further consideration and refinement.

Evaluation design will involve incorporation of needs assessments and articulated program theory, with a plan to test single or multiple stages of the program. For instance, one might have yearly formative evaluations followed by a comprehensive, summative evaluation in the final program year.

Page 30:

Chen: Causal analysis and systems analysis

Inevitably, some evaluation must be carried out at systems levels. It is important to consider that systems dynamics are inherently complex:

They are governed by feedback, changeable, non-linear, and history-dependent; adaptive and evolving.

Systems are characterized by trade-offs and shifting dynamics.

They are characterized by complex causality—coordination complexity, sequencing complexity, causal complexity due to multiple actors and influences, and the like.

There is too much focus in evaluation on a single intervention as the unit of analysis; understanding connectivity between programs is important.

Many complex interventions require programming (and therefore also evaluation) at multiple levels, e.g., at the community, neighborhood, school, and individual level; multilevel alignment is required across interventions.

Page 31:

Relationship between a program’s strategic framework and evaluation indicators and measures

Every program should have a strategic framework comprised of a series of cascading goals, objectives, and activities. Every program evaluation should have a series of corresponding indicators and performance measures:

Goals – Impact indicators & measures

Objectives – Outcome indicators & measures

Activities – Process indicators & measures

Page 32:

Evaluating partnered, multi-causal programs

Program evaluation in collaborative network/partnership contexts:

Does it matter to the functioning and success of a program that it involves different sectors, organizations, stakeholders, and standards?

What level and breadth of consultation are needed to achieve program aims?

How do we determine if partnerships have been strengthened or new linkages formed as a result of a particular program?

How can we evaluate the development of partnered efforts and partnership capacity along with program outcomes and program capacity?

To what extent have program managers and evaluators consulted with each other and with key constituencies in establishing goals and designing programs? In after-school programs, working partnerships between teachers and after-school personnel, and between these and parents, are essential.

Page 33:

Chen, pp. 240-241: Action Model for HIV/AIDS education

[Diagram, reconstructed:]

Implementation: intervention → determinants → program outcomes.

Mediating variables (usually positive): e.g., help from supportive networks—support groups, family and friends, reinforcing messages, social and institutional cultural supports.

Moderating variables (usually less than positive): e.g., lack of partner support, social and economic variables such as poverty, education, prejudice.

Impacts on individual subject(s) of the intervention, with “impacts” defined as the aggregate of comparative net outcomes.

The Action Model, along with the Change Model, constitutes Program Theory.

Page 34:

Implementation fidelity and change modeling

Models of systems change versus models of inducement of behavioral or social change. The stage-like nature of change management. The multi-level quality of directed change:

Change can be conceptualized at the individual, group, programmatic, organizational, and social-system levels; these are interlocking levels of action.

Change is not a discrete event but a continuum, a seamless process in which decisions and actions, and actions and their effects, affect one another continually and are difficult to separate while they are occurring.

Change can be anticipated and managed on the basis of program design and the testing of implementation. Evaluation is in effect a test of change and action models.

Page 35:

Other Elements of Fidelity Assessment

The quality and efficacy of implementation is a critical element of program fidelity and fidelity evaluation. Fidelity-based evaluation is a form of merit evaluation. Context is important—does it make a difference that the program is being implemented in New Mexico or New York?

Considerations for conceptualizing fidelity:

The multilevel nature of many interventions. The level and intensity of measurement increases with the need for more probing evaluation.

What is the program’s capacity for monitoring fidelity? What is the burden of monitoring fidelity?

Key elements of fidelity—e.g., alignment of program outcomes with desired outcomes—may focus or streamline fidelity-focused evaluation.

Adaptive alignment with essential program goals (desired outcomes) is more important than slavish conformance to stated goals as such.

