Methods for Estimating the Decision Rules in Dynamic Treatment Regimes S.A. Murphy Univ. of Michigan IBC/ASC: July, 2004

Dynamic Treatment Regimes

Dynamic Treatment Regimes are individually tailored treatments, with treatment type and dosage changing in response to ongoing subject information. They mimic clinical practice.

• Brooner et al. (2002) Treatment of Opioid Addiction

• Breslin et al. (1999) Treatment of Alcohol Addiction

• Prochaska et al. (2001) Treatment of Tobacco Addiction

• Rush et al. (2003) Treatment of Depression

EXAMPLE: Treatment of alcohol dependency. Primary outcome is a summary of heavy drinking scores over time.

Treatment of Alcohol Dependency

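[Tree diagram. Columns: Initial Txt, Intermediate Outcome, Secondary Txt. An additional label on the diagram reads: Intensive Outpatient Program.]

• Initial Txt: Med A

---Responder: Monitor + counseling, or Monitor

---Nonresponder: Med B, or EM + Med B + Psychosocial

• Initial Txt: Med A + Psychosocial

---Responder: Monitor + counseling, or Monitor

---Nonresponder: Med B, or EM + Med B + Psychosocial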

Sequential Multiple Assignments

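[Tree diagram. Columns: Initial Txt, Intermediate Outcome, Secondary Txt. R marks a randomization.]

• R: Initial Txt is Med A or Med A + Psychosocial

• Initial Txt: Med A

---Responder, R: Monitor + counseling, or Monitor

---Nonresponder, R: Med B, or EM + Med B + Psychosocial

• Initial Txt: Med A + Psychosocial

---Responder, R: Monitor + counseling, or Monitor

---Nonresponder, R: Med B, or EM + Med B + Psychosocial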
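As a rough illustration of the sequential randomizations in this design, the sketch below draws one path per subject: a first randomization between the two initial treatments, then a second randomization between the secondary treatments available for that subject's intermediate outcome. The treatment names come from the slide; the function name `assign`, the equal allocation probabilities and the `response_prob` value are assumptions made only for this example.

```python
import random

# Treatment options as listed on the slide.
INITIAL = ["Med A", "Med A + Psychosocial"]
SECONDARY = {
    "Responder": ["Monitor + counseling", "Monitor"],
    "Nonresponder": ["Med B", "EM + Med B + Psychosocial"],
}

def assign(rng, response_prob=0.5):
    """Draw one subject's path. response_prob is a made-up placeholder,
    not an estimate from any trial."""
    initial = rng.choice(INITIAL)                # first randomization (R)
    outcome = "Responder" if rng.random() < response_prob else "Nonresponder"
    secondary = rng.choice(SECONDARY[outcome])   # second randomization (R)
    return initial, outcome, secondary

rng = random.Random(0)
paths = [assign(rng) for _ in range(8)]
for path in paths:
    print(" -> ".join(path))
```

Each printed line is one subject's sequence of initial treatment, intermediate outcome and secondary treatment.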

Examples of sequential multiple assignment randomized trials:

• CATIE (2001) Treatment of Psychosis in Alzheimer’s Patients

• CATIE (2001) Treatment of Psychosis in Schizophrenia

• STAR*D (2003) Treatment of Depression

• Thall et al. (2000) Treatment of Prostate Cancer

k Decisions

Observations made prior to jth decision

Action at jth decision

Primary Outcome:

for a known function f

A dynamic treatment regime is a vector of decision rules, one per decision

If the regime is implemented then
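To make "a vector of decision rules, one per decision" concrete, here is a minimal sketch for two decisions using the treatment names from the alcohol dependency example. The particular rules chosen, the `history` dictionary and the helper `run_regime` are assumptions for illustration, not a regime proposed in the talk.

```python
def d1(history):
    """First decision rule: pick the initial treatment (nothing observed yet)."""
    return "Med A"

def d2(history):
    """Second decision rule: uses the intermediate outcome recorded so far."""
    if history["outcome"] == "Responder":
        return "Monitor"
    return "EM + Med B + Psychosocial"

# The regime is simply the vector of rules, one per decision.
regime = (d1, d2)

def run_regime(regime, outcome_after_first):
    """Apply each rule in turn to the information available at that decision."""
    history = {}
    history["initial"] = regime[0](history)
    history["outcome"] = outcome_after_first
    history["secondary"] = regime[1](history)
    return history

print(run_regime(regime, "Responder"))
print(run_regime(regime, "Nonresponder"))
```

Implementing the regime means every action is the output of the corresponding rule applied to what has been observed up to that decision.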

Methods for Estimating Decision Rules

Three Methods for Estimating Decision Rules

• Q-Learning (Watkins, 1989)

---regression

• A-Learning (Murphy, Robins, 2003)

---regression on a mean zero space.

• Weighting (Murphy, van der Laan & Robins, 2002)

---weighted mean

One decision only!

Data:

is randomized with probability

Goal

Choose to maximize:
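The formulas on the one-decision slides did not survive extraction. One common way to write the setup is sketched below; the symbols X (observation), A (action), Y (outcome), p (randomization probability), d (decision rule) and V (mean outcome under a rule) are assumed notation and may differ from the slides.

```latex
\[
\text{Data: } (X_i, A_i, Y_i),\quad i = 1, \dots, n,
\qquad
P(A = a \mid X = x) = p(a \mid x)
\]
\[
V(d) = E\bigl[\, E[\, Y \mid X,\ A = d(X) \,] \,\bigr],
\qquad
d^{*} \in \arg\max_{d} V(d)
\]
```

Under this notation the goal is to choose the rule d whose mean outcome V(d) is largest.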

Q-Learning

Minimize
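The objective on this slide was lost in extraction. A minimal sketch of one-decision Q-learning as a regression, assuming a squared-error criterion and a working model that is linear in a scalar covariate within each treatment arm, is given below. The simulated data, the model form and all function names are assumptions for illustration.

```python
import random

def ols_line(xs, ys):
    """Least-squares intercept and slope for y ~ b0 + b1 * x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return my - b1 * mx, b1

def simulate(n, seed=0):
    """Made-up randomized data: treating (a = 1) helps only when x < 0.25."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        a = 1 if rng.random() < 0.5 else 0
        y = 1.0 + x + a * (0.5 - 2.0 * x) + rng.gauss(0.0, 0.5)
        data.append((x, a, y))
    return data

def fit_q(data):
    """Fit Q(x, a) = b0[a] + b1[a] * x by least squares within each arm."""
    q = {}
    for arm in (0, 1):
        xs = [x for x, a, _ in data if a == arm]
        ys = [y for x, a, y in data if a == arm]
        q[arm] = ols_line(xs, ys)
    return q

def q_rule(q, x):
    """Estimated decision rule: the action with the larger fitted Q-value."""
    value = {a: q[a][0] + q[a][1] * x for a in (0, 1)}
    return max(value, key=value.get)

data = simulate(2000)
q = fit_q(data)
print("action at x = -0.5:", q_rule(q, -0.5))
print("action at x =  0.8:", q_rule(q, 0.8))
```

Fitting a separate line in each arm is the same as a single regression with a full treatment-by-covariate interaction; the estimated rule is then read off by maximizing the fitted regression over actions.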

A-Learning

Minimize
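The objective on this slide was also lost. One simple form of "regression on a mean zero space", assumed here rather than taken from the slides, regresses the outcome on features multiplied by the centered action A - p, where p is the known randomization probability. Only the treatment contrast is modeled, here as psi0 + psi1 * x; the data generator and function names are hypothetical.

```python
import random

P_TREAT = 0.5  # known randomization probability (assumed constant in x)

def simulate(n, seed=1):
    """Made-up randomized data with true contrast 0.5 - 2.0 * x."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        a = 1 if rng.random() < P_TREAT else 0
        y = 1.0 + x + a * (0.5 - 2.0 * x) + rng.gauss(0.0, 0.5)
        data.append((x, a, y))
    return data

def fit_contrast(data, p=P_TREAT):
    """Least squares of y on the centered features (a - p) * (1, x).

    The features have mean zero given x, so the main effect of x
    does not need to be modeled to estimate the contrast.
    """
    s11 = s12 = s22 = t1 = t2 = 0.0
    for x, a, y in data:
        z1 = a - p
        z2 = (a - p) * x
        s11 += z1 * z1
        s12 += z1 * z2
        s22 += z2 * z2
        t1 += z1 * y
        t2 += z2 * y
    # Solve the 2x2 normal equations by Cramer's rule.
    det = s11 * s22 - s12 * s12
    psi0 = (t1 * s22 - s12 * t2) / det
    psi1 = (s11 * t2 - s12 * t1) / det
    return psi0, psi1

def a_rule(psi, x):
    """Treat when the estimated contrast is positive."""
    return 1 if psi[0] + psi[1] * x > 0 else 0

data = simulate(4000)
psi0, psi1 = fit_contrast(data)
print("estimated contrast: %.2f + %.2f * x" % (psi0, psi1))
print("actions at x = -0.5 and 0.8:", a_rule((psi0, psi1), -0.5), a_rule((psi0, psi1), 0.8))
```

In contrast to the regression in Q-learning, nothing here depends on getting the part of the mean outcome that does not involve the action right.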

Weighting
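The formula for this slide is missing from the extracted text. A minimal sketch of a weighted-mean estimator, assuming the usual inverse-probability form, follows: the mean outcome under a candidate rule is estimated by averaging outcomes of subjects whose randomized action agrees with the rule, each weighted by one over the probability of that action. The data generator, the class of threshold rules and all names are assumptions.

```python
import random

P_TREAT = 0.5  # known randomization probability (assumed constant in x)

def simulate(n, seed=2):
    """Made-up randomized data: treating (a = 1) helps only when x < 0.25."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        a = 1 if rng.random() < P_TREAT else 0
        y = 1.0 + x + a * (0.5 - 2.0 * x) + rng.gauss(0.0, 0.5)
        data.append((x, a, y))
    return data

def propensity(a, x):
    """Probability that action a was assigned given x."""
    return P_TREAT if a == 1 else 1.0 - P_TREAT

def ipw_value(rule, data):
    """Weighted mean: (1/n) * sum of 1{a == rule(x)} * y / p(a | x)."""
    total = 0.0
    for x, a, y in data:
        if a == rule(x):
            total += y / propensity(a, x)
    return total / len(data)

def threshold_rule(c):
    """Hypothetical rule class: treat when x is below the cutoff c."""
    return lambda x: 1 if x < c else 0

data = simulate(5000)
never = ipw_value(lambda x: 0, data)
always = ipw_value(lambda x: 1, data)
grid = [i / 10 for i in range(-10, 11)]
best_c = max(grid, key=lambda c: ipw_value(threshold_rule(c), data))
best = ipw_value(threshold_rule(best_c), data)
print("never treat: %.2f, always treat: %.2f" % (never, always))
print("best cutoff on grid: %.1f with estimated value %.2f" % (best_c, best))
```

The search is over rules directly, so no model for the outcome is needed; the price is that only subjects whose action matches the rule contribute to each estimate.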

Discussion

Discussion

• Consistency of Parameterization

---problems for Q-Learning

• Model Space

---bias

---variance

Q-Learning

Minimize

Minimize

Discussion

• Consistency of Parameterization

---problems for Q-Learning

• Model Space

---bias

---variance

Points to keep in mind

• The sequential multiple assignment randomized trial is a trial for developing powerful dynamic treatment regimes; it is not a confirmatory trial.

• Focus on MSE, recognizing that due to the high dimensionality of X, the model parameterization is likely incorrect.

Goal

Given a restricted set of functional forms for the

decision rules, say , find
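The symbols on this slide were lost in extraction. One way to write a goal of this kind is sketched below; the class name D, the parameter beta, the parameter set B and the value V are assumed notation, not recovered from the slide.

```latex
\[
\mathcal{D} = \{\, d(\cdot\,; \beta) : \beta \in B \,\},
\qquad
V(d) = E[\, Y \text{ when rule } d \text{ is followed} \,]
\]
\[
\text{find}\quad d^{\mathcal{D}} \in \arg\max_{d \in \mathcal{D}} V(d)
\]
```

The maximization is over the restricted class only, so the target is the best rule of the chosen functional form rather than the best rule overall.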

Discussion

• Mismatch in Goals

---problems for Q-Learning & A-Learning

Suppose our sample is infinite. Then in general

neither

nor

is close to
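The expressions on this slide are not recoverable, but the kind of mismatch being discussed can be illustrated numerically. In the sketch below a fine grid of covariate values stands in for an infinite sample, the treatment contrast is a made-up step function, and a least-squares line is fitted to it, which is what a linear contrast model targets under equal randomization. The rule implied by the fitted line is then compared with the best rule of the form "treat when x > t". Everything here is a hypothetical construction.

```python
def contrast(x):
    """Hypothetical treatment effect: E[Y | x, a = 1] - E[Y | x, a = 0]."""
    return -1.0 if x < 0.8 else 10.0

N = 1000
xs = [(i + 0.5) / N for i in range(N)]  # grid standing in for X ~ Uniform(0, 1)
cs = [contrast(x) for x in xs]

def ols_line(xs, ys):
    """Least-squares intercept and slope for y ~ b0 + b1 * x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return my - b1 * mx, b1

def gain(rule):
    """Mean benefit over never treating, where rule(x) is True for 'treat'."""
    return sum(c for x, c in zip(xs, cs) if rule(x)) / N

# Rule implied by the (misspecified) linear fit to the contrast.
b0, b1 = ols_line(xs, cs)
fitted_gain = gain(lambda x: b0 + b1 * x > 0)

# Best rule within the class "treat when x > t", found by direct search.
thresholds = [i / 100 for i in range(101)]
best_t = max(thresholds, key=lambda t: gain(lambda x: x > t))
best_gain = gain(lambda x: x > best_t)

print("fitted line crosses zero at x = %.3f" % (-b0 / b1))
print("gain of rule from linear fit: %.3f" % fitted_gain)
print("best threshold %.2f with gain %.3f" % (best_t, best_gain))
```

Both rules are thresholds on x, yet the one obtained by fitting the model well in the least-squares sense is not the one with the largest mean outcome, and more data would not close that gap.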

Open Problems

• How might we “guide” Q-Learning or A-Learning so as to more closely achieve our goal?

• Dealing with high dimensional X

---feature extraction

---feature selection

This seminar can be found at:

http://www.stat.lsa.umich.edu/~samurphy/seminars/ibc_asc_0704.ppt

My email address: [email protected]