LEARNING PROPORTIONS IN A SEMI-SUPERVISED SETTING: A CASE STUDY IN PRECISION MEDICINE

Predrag Radivojac

DEPARTMENT OF COMPUTER SCIENCE AND INFORMATICS
INDIANA UNIVERSITY, BLOOMINGTON

November 28, 2016



At some prediction threshold [one] Third of the Alternative Splicing Isoforms predicted to produce Functional Proteins….


WHAT IS THE FRACTION OF ENZYMES IN A GENOME?

Charles Dann, Chemistry; Yuzhen Ye, Computer Science; Tuli Mukhopadhyay, Biology

EXAMPLE FROM PSYCHOLOGY

Greene MR. Estimations of object frequency are frequently overestimated. Cognition (2016) 149:6-10.

AND SO WE GO…

Hi Pedja,
You pose interesting questions. I'd expect yeast to have the highest enzyme fraction as it does not need to have conserved genes for multicellular development, cognition, etc. (though many of these processes require signaling pathways with enzymes). So here are my estimates for enzyme fraction, based entirely on intuition.

Yeast ~45%; E. coli ~35%; Mouse ~25%; Human ~25%; Arabidopsis ~40% (no idea here)

I imagine I may hit low on all of these...

CD3

SUPERVISED LEARNING PROBLEM

SEMI-SUPERVISED LEARNING PROBLEM

UNSUPERVISED LEARNING PROBLEM

POSITIVE-UNLABELED LEARNING PROBLEM (PU)

IDENTIFIABILITY

ANTICIPATED LIKELIHOOD FUNCTION

ENZYMES: EXPERIMENTAL PROTOCOL

>sp|P04637|P53_HUMAN Cellular tumor antigen p53
MEEPQSDPSVEPPLSQETFSDLWKLLPENNVLSPLPSQAMDDLMLSPDDIEQWFTEDPGP
DEAPRMPEAAPPVAPAPAAPTPAAPAPAPSWPLSSSVPSQKTYQGSYGFRLGFLHSGTAK
SVTCTYSPALNKMFCQLAKTCPVQLWVDSTPPPGTRVRAMAIYKQSQHMTEVVRRCPHHE
RCSDSDGLAPPQHLIRVEGNLRVEYLDDRNTFRHSVVVPYEPPEVGSDCTTIHYNYMCNS
SCMGGMNRRPILTIITLEDSSGNLLGRNSFEVRVCACPGRDRRTEEENLRKKGEPHHELP
PGSTKRALPNNTSSSPQPKKKPLDGEYFTLQIRGRERFEMFRELNEALELKDAQAGKEPG
GSRAHSSHLKSKKGQSTSRHKKLMFKTEGPDSD

Develop an SVM predictor

- Linear kernel
- AUC ≈ 75%
- Platt's correction
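The slide names Platt's correction without detail. As a rough illustration only, the sketch below fits a sigmoid that maps raw classifier scores to probabilities, which is the core idea of Platt scaling. The scores, labels, learning rate and step count are made-up assumptions, and plain gradient descent stands in for the regularized targets and second-order optimizer of Platt's original procedure; nothing here is taken from the talk's actual pipeline.

```python
import math


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_platt(scores, labels, lr=0.1, steps=5000):
    """Fit p(y=1 | s) = sigmoid(a * s + b) by gradient descent on log loss.

    Simplified stand-in for Platt scaling (hypothetical hyperparameters).
    """
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = 0.0
        grad_b = 0.0
        for s, y in zip(scores, labels):
            p = sigmoid(a * s + b)
            grad_a += (p - y) * s  # d(log loss)/da
            grad_b += (p - y)      # d(log loss)/db
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def platt_probability(score, a, b):
    """Calibrated probability for one raw score."""
    return sigmoid(a * score + b)


# Toy, non-separable SVM-style scores (illustrative values only).
scores = [-2.0, -1.5, -1.0, -0.5, 0.2, -0.3, 0.5, 1.0, 1.5, 2.0]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
a, b = fit_platt(scores, labels)
print(platt_probability(2.0, a, b), platt_probability(-2.0, a, b))
```

In practice the sigmoid would be fitted on held-out scores rather than on the data used to train the SVM.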

[Figure: binary feature vector over 4-residue subsequences of the protein: AAAA = 0, AAAC = 0, ..., MEEP = 1, ..., VVVP = 1, ..., YYYY = 0]
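The feature vector on this slide appears to mark which 4-residue subsequences occur in a protein. A minimal sketch of that encoding follows, assuming binary presence features over the 20 standard amino acids in alphabetical order; the function names and the ordering of the feature space are assumptions made for illustration, not details confirmed by the slides.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, assumed ordering


def kmer_index(kmer, alphabet=AMINO_ACIDS):
    """Position of a k-mer in the lexicographically ordered feature space."""
    idx = 0
    for ch in kmer:
        idx = idx * len(alphabet) + alphabet.index(ch)
    return idx


def kmer_presence(sequence, k=4, alphabet=AMINO_ACIDS):
    """Set of k-mers present in the sequence (sparse binary features)."""
    present = set()
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        # Skip windows containing non-standard residue codes.
        if all(ch in alphabet for ch in kmer):
            present.add(kmer)
    return present


def feature_value(present, kmer):
    """1 if the k-mer occurs in the sequence, else 0."""
    return 1 if kmer in present else 0


# First 60 residues of the p53 sequence shown on the slide.
fragment = "MEEPQSDPSVEPPLSQETFSDLWKLLPENNVLSPLPSQAMDDLMLSPDDIEQWFTEDPGP"
present = kmer_presence(fragment)
for kmer in ("AAAA", "AAAC", "MEEP", "YYYY"):
    print(kmer, kmer_index(kmer), feature_value(present, kmer))
```

Storing only the k-mers that occur keeps the representation sparse, since the full space has 20^4 = 160,000 dimensions.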

RESULTS: ENZYMES

DISEASE MUTATIONS IN HUMAN EXOME

PRECISION MEDICINE

“So tonight, I’m launching a new Precision Medicine Initiative to bring us closer to curing diseases like cancer and diabetes, and to give all of us access to the personalized information we need to keep ourselves and our families healthier. We can do this.” – President Barack Obama.

01/20/2015

Precision Medicine

the science and practice of matching the best diagnostic, therapeutic and prevention strategies to promote health that are tailored to an individual’s genetic, biological, behavioral and psychosocial characteristics


www.nih.gov/AllofUs-Research-Program

nih.gov

GENOME SEQUENCING

http://bgiamericas.com

The Atlantic

HUMAN GENOME AND ITS IMPACT ON PHENOTYPE
WHAT IS THE MOLECULAR BASIS OF IT?

GENOME    PHENOME

… AGCATACCGA …

… TTTACCGAGC …

… AGCATAGCGA …

BASE CHANGES RESULTING IN DIFFERENT PROTEINS

>40 million known unique sites of variation!

[Figure labels: Gene; Exon; DNA; Another gene…; variant types: regulatory, non-synonymous, intronic, genomic, synonymous]

G38D in 1tag

Yue et al. J Mol Biol, 353:459 (2005).

Adapted from: http://snp.ims.u-tokyo.ac.jp/samplesMethods.html#SNP
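The difference between synonymous and non-synonymous base changes named on this slide can be sketched with the standard genetic code. The codon GGT and the two substitutions below are hypothetical examples chosen to give a Gly to Asp change like the G38D label; they are not taken from the actual gene behind the 1tag structure.

```python
# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_base_change(codon, position, new_base):
    """Classify a single-base substitution inside one codon.

    Returns (label, reference amino acid, altered amino acid);
    '*' denotes a stop codon.
    """
    mutated = codon[:position] + new_base + codon[position + 1:]
    ref_aa = CODON_TABLE[codon]
    alt_aa = CODON_TABLE[mutated]
    label = "synonymous" if ref_aa == alt_aa else "non-synonymous"
    return label, ref_aa, alt_aa


# Hypothetical glycine codon: second-base change alters the protein,
# third-base change does not.
print(classify_base_change("GGT", 1, "A"))
print(classify_base_change("GGT", 2, "C"))
```

Regulatory, intronic and other genomic variants fall outside coding codons, so a classifier like this applies only to changes within an exon's reading frame.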

CURRENT TOOLS PREDICT EFFECTS OF VARIANTS

When applied to 43 nsSNPs of 18 drug related genes from the Thai SNP resequencing project there were strong correlations

Slide from Sean Mooney’s group.

GROWTH OF DATA

HapMap Phase I

1000 Genomes project Phase I


TUMOR BOARD

http://www.med.umich.edu/cancer/images/urologic-oncology-tumor-board.jpg

Person: 68 year old woman
Cancer type: colon cancer, metastatic
Mutations:
KRAS, C27F
BRCA1, H57R
TP53, T98*

Treatment options:
- clinical trial at MD Anderson
- continue with chemotherapy
- treat with new drug for breast cancer

MOLECULAR CONSEQUENCES ON P53

R175H: Metal-binding

V143A: Stability

K120R: Acetylation

R273H: DNA-binding

G245S: Protein-binding

p53 – tumor suppressor protein

PDB structures: 2ybg, 2j1w, 1ycs and 1tup

MUTPRED 2.0

Vikas Pejaver, Indiana University

>sp|P04637|P53_HUMAN
MEEPQSDPSVEPPLSQETFSDLWKLLPENNVLSPLPSQAMDDLMLSPDDIEQWFTEDPGP
DEAPRMPEAAPRVAPAPAAPTPAAPAPAPSWPLSSSVPSQKTYQGSYGFRLGFLHSGTAK
SVTCTYSPALNKMFCQLAKTCPVQLWVDSTPPPGTRVRAMAIYKQSQHMTEVVRRCPHHE
RCSDSDGLAPPQHLIRVEGNLRVEYLDDRNTFRHSVVVPYEPPEVGSDCTTIHYNYMCNS
...

[Figure labels: Conservation; Funct. Prop.; Sequence; Struct. Prop.; Physicochemical; Substitution matrices; ...; Neural networks]

Neural network ensemble:

•  Z-score normalized and PCA’d
•  30 feed-forward networks
•  bootstrap aggregating, balanced training
•  trained using resilient propagation
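Two of the steps listed above, Z-score normalization and balanced bootstrap sampling for a bagged ensemble, can be sketched as follows. This is only an illustration under stated assumptions: the helper names are hypothetical, "balanced" is interpreted as drawing equal numbers of positives and negatives with replacement, and the PCA step, the networks themselves and resilient propagation are left out. How MutPred2 actually implements these steps is not given on the slide.

```python
import random
import statistics


def zscore_columns(rows):
    """Z-score normalize each feature column (constant columns map to 0)."""
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    # Guard against zero standard deviation for constant columns.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return [
        [(x - m) / s for x, m, s in zip(row, means, stds)]
        for row in rows
    ]


def balanced_bootstrap(labels, rng):
    """Indices of one bootstrap sample with equal class counts.

    Assumed scheme: sample with replacement from each class, using the
    size of the smaller class for both.
    """
    positives = [i for i, y in enumerate(labels) if y == 1]
    negatives = [i for i, y in enumerate(labels) if y == 0]
    n = min(len(positives), len(negatives))
    sample = [rng.choice(positives) for _ in range(n)]
    sample += [rng.choice(negatives) for _ in range(n)]
    return sample


def bagged_score(models, x):
    """Average the outputs of the ensemble members (callables)."""
    return sum(model(x) for model in models) / len(models)


# One balanced training sample per ensemble member (30 on the slide).
rng = random.Random(0)
toy_labels = [1, 1, 0, 0, 0, 0, 0, 0]
samples = [balanced_bootstrap(toy_labels, rng) for _ in range(30)]
print(len(samples), len(samples[0]))
```

Each of the 30 index lists would train one feed-forward network, and `bagged_score` would average their outputs at prediction time.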

GAIN OF CATALYTIC ACTIVITY CAUSES DISEASE

Catalytic residues of PCSK9 (2qtw)

•  a member of the proteinase K sub-family of subtilases that reduces the number of LDL receptors in liver through a posttranscriptional mechanism.

•  D374Y leads to a 10-fold increase in catalytic activity that causes hypercholesterolemia.

Lagace et al. J Clin Invest, 116:2995 (2006). Xin et al. Bioinformatics 26:1975-1982 (2010).