Upload
hortense-nash
View
220
Download
1
Tags:
Embed Size (px)
Citation preview
Age Stratified Risk Prediction of Invasive versus In-situ Breast
Cancer: A Logistic Regression Model
Mehmet Ayvaci1,2
Oguzhan Alagoz1,Jagpreet Chhatwal3, Mary Lindstrom4,Houssam Nassif5,Elizabeth S
Burnside2
1Industrial and Systems Engineering, UW-Madison2Radiology, UW-Madison
3Merck Research Labaratories4Biostatistics
5Computer Science, UW-Madison
Breast Anatomy
Breast profile:– A ducts– B lobules– C dilated section of duct
to hold milk– D nipple– E fat– F pectoralis major
muscle– G chest wall/rib cage
• Enlargement:– A normal duct cells– B basement membrane– C lumen (center of duct)
www.breastcancer.org
Age Stratified Risk Prediction of Invasive vs In-situ Breast Cancer: A Logistic Regression Model
Progression of Breast Cancer
Normal duct
Typical ductal hyperplasia
Atypical ductal hyperplasia Ductal
carcinoma in situ (DCIS)
Invasive ductal carcinoma
• Primary modality of screening or diagnosis: Mammography
Age Stratification forInvasive vs In-situ Breast
Cancer
• Primary modality of detecting type of breast cancer: Biopsy
Age Stratification forInvasive vs In-situ Breast
Cancer
• Incidence of DCIS has increased since adoption of mammography
• DCIS has favorable prognosis: will often not cause mortality for years
• PPV of biopsy 20%
• Invasive vs. DCIS distinction important because:
Age Stratification forInvasive vs In-situ Breast
Cancer
• Develop a risk prediction model for prospective differentiation of DCIS versus invasive breast cancer
• Measure and compare model performance for different age groups
Purpose and Methods
)(11
1
N
iiiX
e
LOGISTIC REGRESSION
ROC Curves
Purpose and Methods Contd.
Markov Decision Processes
Risk Assessment Tools
Clinical Implications
Clinical Implications
ROC & PR Curves, Statistical Testing
Structure of Data UsedNMDNational
Mammography Database Format
Free TextStructured
Demographic Factors
Mammographic Descriptors
Radiologists’ Overall Assessment of the Mammogram with Some Repeat to the Structured Part
BIRADS descriptors
Turned into Structured format using Natural Language Processing
• Information retrieval from free text given a standardized lexicon
• Parse sentences to detect BIRADS descriptors using Natural Language Processing in PERL
• Test on a set of 100 which is manually populated– 97.7% Precision– 95.5% Recall
Methods: Processing Free Text
Data in Detail
Structured
DEMOGRAPHIC Age Family History Personal History Prior Surgery Palpable Lump Interpreting Radiologist Breast Density Indication for Exam if Diagnostic
MAMMOGRAPHIC BI-RADS Assessment Principle Abnormal Finding
o Architectural Distortiono Calcifications o Asymmetry (one view) o Focal asymmetry (two views) o Developing asymmetry o Mass o Single Dilated Ducto Both Calcifications and Something Else
Associated Findings
Skin Retraction, Nipple Retraction, Skin Thickening, Trabecular Thickening, Skin Lesion, Axillary Adenopathy, Architectural Distortion
Calcification Distribution Clustered, Linear, Regional, Scattered, Segmental, Diffuse
Calcification Morphology
Pleomorphic , Dystrophic, Lucent, Punctate, Vascular, Eggshell, Dermal, Fine-Linear, Round, Popcorn, Milk of Calcium, Rod Like, Suture, Amorphous,
Mass Margins Circumscribed, Obscured, Defined, Microlobulated,
SpiculatedMass Shape Irregular, Lobular, Oval, RoundMass Density Fat Containing, Low Density, Equal Density, High Density
Mass Size <10mm, >10mm and <20mm, >20mm and<50mm,
>50mm
Special Cases Intra-mammary Lymph Node, Tubular Density,
Assymetric Breast Tissue, Focal Asymmetric DensityBiopsy Insitu, Invasive
Free Text
==Features
Extracted Using NLP
• 1475 Diagnostic Mammograms 1378 Patients– 1298 patients with single mammogram– 81 patients with 2 mammograms– 5 patients with three mammograms
• 1063 cases invasive vs. 412 DCIS
• Age range 27 to 97 with – Mean 59.7 and standard deviation 13.4
Summary of Data
• Regress with a dichotomous outcome, where the patient is known to have malignant condition, i.e.– Invasive or– DCIS
• Stratified data into 3 groups– Overall Model LR 1475 records– Age Less Than 50 LRyoung 374 records– Age Greater Than 65 LROld 533 records
• Used stepwise regression to find the appropriate models. Possibility of interactions were investigated
Methods: Performing Logistic Regression
P(Invasive|Demographic Factors, Mammographic Descriptors)
Methods: Validation Technique
n fold cross-validation Leave-one-out
Test fold
Training fold
1 32 n…
…
Data set
Fold
Merge tested folds for performance analysis
1 32 n…
Fold
Sensitivity vs. 1-Specificity at all thresholds– Sensitivity: True Positive
Rate– Specificity: True Negative
Rate– Thresholds: Probability
above which call “Invasive” – AUC: Area Under the Curve
Methods: Measuring Performance
Patient with Disease
Patients without disease
Test + a b
Test - c d
Sensitivity=a/(a+c)
Specificity = d/(b+d)
Results: LR
• Overall model significant at p-value<0.01
• Not enough power to justify inclusion of interaction terms (Over-fitting)
• Acceptable ROC• Decreasing trend in Error
rates
Row Labels Subject Count Actual Invasive Count
AgeLT50 374 264Age5064 568 398
AgeGTE65 533 401Overall 1475 1063
Results: LRyoung vs. LRold
• Difference in AUC = 0.07
• Significant at p-value = 0.045
Results: LRyoung vs. LRold
• Improvement is in False Negatives
Results: LRyoung vs. LRold
• Mammography is not perfect and performs better in older women.
• There is a need for discriminating between invasive and DCIS to better manage the breast disease in the context of age and other comorbidities
• An age based risk prediction model for assessing performance difference in discriminating invasive vs. DCIS is necessary
• Such a model would enable physicians to make more informed decisions
• Demonstration of performance difference and varying risk factors in different age cohorts justifies
In Summary
Future Work
Markov Decision Processes
Risk Assessment Tools
Clinical Implications
Clinical Implications
ROC & PR Curves, Statistical Testing
Get in Literature
Using POMDPs to Determine the Optimal Mammography Screening Schedule From the Patient's Perspective Presenting Author:
Turgay Ayer,University of Wisconsin
Co-Author: Oguzhan Alagoz,Assistant Professor, University of Wisconsin-Madison
THANK YOU!
Questions?