CS639: Data Management for Data Science
Lecture 16: Intro to ML and Decision Trees
Theodoros Rekatsinas (lecture by Ankur Goswami; many slides from David Sontag)
Today’s Lecture
1. Intro to Machine Learning
2. Types of Machine Learning
3. Decision Trees
1. Intro to Machine Learning
What is Machine Learning?
• “Learning is any process by which a system improves performance from experience” – Herbert Simon
• Definition by Tom Mitchell (1998): Machine Learning is the study of algorithms that
  • improve their performance P
  • at some task T
  • with experience E
  A well-defined learning task is given by <P, T, E>.
Experience: a data-driven task, hence statistics and probability
Example: use height and weight to predict gender
When do we use machine learning?
ML is used when:
• Human expertise does not exist (navigating on Mars)
• Humans can’t explain their expertise (speech recognition)
• Models must be customized (personalized medicine)
• Models are based on huge amounts of data (genomics)
A task that requires machine learning
What makes a hand drawing a 2?
Modern machine learning: Autonomous cars
Modern machine learning: Scene Labeling
Modern machine learning: Speech Recognition
2. Types of Machine Learning
Types of Learning
• Supervised (inductive) learning
  • Given: training data + desired outputs (labels)
• Unsupervised learning
  • Given: training data (without desired outputs)
• Semi-supervised learning
  • Given: training data + a few desired outputs
• Reinforcement learning
  • Rewards from a sequence of actions
Supervised Learning: Regression
• Given: training data x with desired outputs y
• Learn a function f(x) to predict y given x
• y is real-valued == regression
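As a minimal sketch of learning f(x) for a real-valued y, the snippet below fits a straight line by ordinary least squares. Least squares is one common choice and is an assumption here, not something the slides prescribe; the names `fit_line` and `predict` and the toy data are hypothetical.

```python
def fit_line(xs, ys):
    """Fit y ~ a*x + b by ordinary least squares (closed form for one input)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx            # slope
    b = mean_y - a * mean_x  # intercept
    return a, b


def predict(model, x):
    """Evaluate the learned function f(x) = a*x + b."""
    a, b = model
    return a * x + b


# Hypothetical training pairs (x, y) that lie exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
model = fit_line(xs, ys)
```

Calling `predict(model, 4.0)` then returns a real number, which is what makes this a regression task.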
Supervised Learning: Classification
• Given: training data x with desired outputs y
• Learn a function f(x) to predict y given x
• y is categorical == classification
Supervised Learning
• The value x can be multi-dimensional.
• Each dimension corresponds to an attribute.
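To illustrate a categorical y with a multi-dimensional x, here is a small sketch in which each example is a tuple of attribute values. It uses a 1-nearest-neighbour rule purely for brevity; that technique is an assumption and is not part of this lecture, and the function name, labels, and data points are hypothetical.

```python
def nearest_neighbor_classify(train, x):
    """Return the label of the training point closest to x (squared Euclidean distance)."""
    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    _, label = min(train, key=lambda pair: sq_dist(pair[0], x))
    return label


# Hypothetical training set: each x has two attributes, each y is a category.
train = [
    ((1.0, 1.0), "A"),
    ((1.5, 2.0), "A"),
    ((5.0, 5.0), "B"),
    ((6.0, 5.5), "B"),
]
```

Here `nearest_neighbor_classify(train, (1.2, 1.1))` returns a category rather than a number, which is the distinction between classification and regression.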
Types of Learning
• Supervised (inductive) learning
  • Given: training data + desired outputs (labels)
• Unsupervised learning
  • Given: training data (without desired outputs)
• Semi-supervised learning
  • Given: training data + a few desired outputs
• Reinforcement learning
  • Rewards from a sequence of actions
We will cover later in the class
3. Decision Trees
A learning problem: predict fuel efficiency
Hypotheses: decision trees f: X → Y
Informally: a hypothesis is a certain function that we believe (or hope) is similar to the true function, the target function that we want to model.
What functions can Decision Trees represent?
Space of possible decision trees
• How will we choose the best one?
• Let's first look at how to split nodes, then consider how to find the best tree
What is the simplest tree?
• Always predict mpg = bad
  • We just take the majority class
• Is this a good tree?
  • We need to evaluate its performance
• Performance: we are correct on 22 examples and incorrect on 18 examples
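The majority-class baseline above can be sketched in a few lines. The counts (22 correct, 18 incorrect) come from the slide; the label name "good" for the minority class and the function name `majority_class` are assumptions made for illustration.

```python
from collections import Counter


def majority_class(labels):
    """Return the most common label and the fraction of examples it classifies correctly."""
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)


# 40 examples: 22 labelled "bad" (from the slide), 18 assumed to be labelled "good".
labels = ["bad"] * 22 + ["good"] * 18
prediction, accuracy = majority_class(labels)
print(prediction, accuracy)  # bad 0.55
```

An accuracy of 22/40 = 0.55 is the number any deeper tree has to beat.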
A decision stump
Recursive step
Second level of tree
Are all decision trees equal?
• Many trees can represent the same concept
• But not all trees will have the same size!
  • e.g., φ = (A ∧ B) ∨ (¬A ∧ C), i.e. ((A and B) or (not A and C))
• Which tree do we prefer?
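To make the size difference concrete, the sketch below encodes two trees for φ = (A ∧ B) ∨ (¬A ∧ C), one that tests A at the root and one that tests B at the root, and checks both against the formula on all eight assignments. The nested-tuple tree encoding and all names are assumptions chosen for illustration.

```python
from itertools import product

# A tree is either a bool leaf or a tuple (attribute, subtree_if_true, subtree_if_false).
tree_split_on_a_first = ("A", ("B", True, False), ("C", True, False))
tree_split_on_b_first = (
    "B",
    ("A", True, ("C", True, False)),
    ("A", False, ("C", True, False)),
)


def evaluate(tree, assignment):
    """Follow the branches selected by the assignment until a leaf is reached."""
    while not isinstance(tree, bool):
        attribute, if_true, if_false = tree
        tree = if_true if assignment[attribute] else if_false
    return tree


def count_internal_nodes(tree):
    """Count the attribute tests (non-leaf nodes) in the tree."""
    if isinstance(tree, bool):
        return 0
    _, if_true, if_false = tree
    return 1 + count_internal_nodes(if_true) + count_internal_nodes(if_false)


def phi(a, b, c):
    return (a and b) or ((not a) and c)


# All 8 truth assignments to A, B, C.
assignments = [dict(zip("ABC", values)) for values in product([False, True], repeat=3)]
```

Both trees agree with `phi` on every assignment, yet the first needs 3 attribute tests and the second needs 5, so the choice of root attribute matters.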
Learning decision trees is hard
• Learning the simplest (smallest) decision tree is an NP-complete problem [Hyafil & Rivest ’76]
• Resort to a greedy heuristic:
  • Start from an empty decision tree
  • Split on the next best attribute (feature)
  • Recurse
Splitting: choosing a good attribute
Measuring uncertainty
• A split is good if we are more certain about the classification after the split
  • Deterministic is good (all true or all false)
  • Uniform distribution is bad
  • What about distributions in between?
Entropy
High, Low Entropy
Entropy Example
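A minimal sketch of entropy as an uncertainty measure, assuming the standard Shannon definition H(Y) = -Σ P(Y = y) log2 P(Y = y), measured in bits and estimated from label counts. The function name `entropy` and the label strings are illustrative.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of the empirical distribution of the labels."""
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())


deterministic = ["true"] * 4           # all examples share one label
uniform = ["true", "true", "false", "false"]  # labels split evenly
skewed = ["bad"] * 22 + ["good"] * 18  # the 22-versus-18 split seen earlier
```

Under this definition the deterministic set has entropy 0, the uniform two-class set has entropy 1 bit, and the 22-versus-18 set falls just below 1 bit, matching the intuition that it is almost as uncertain as a coin flip.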
Conditional Entropy
Information gain
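The following sketch assumes the usual definitions: conditional entropy H(Y | X) is the entropy of Y within each value of X, weighted by how often that value occurs, and information gain is IG(X) = H(Y) - H(Y | X). Function names and the toy attribute values are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy, in bits, of the empirical distribution of the labels."""
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())


def conditional_entropy(values, labels):
    """H(Y | X): per-group label entropy, weighted by the fraction of examples in each group."""
    groups = defaultdict(list)
    for v, y in zip(values, labels):
        groups[v].append(y)
    n = len(labels)
    return sum(len(group) / n * entropy(group) for group in groups.values())


def information_gain(values, labels):
    """IG(X) = H(Y) - H(Y | X): reduction in label uncertainty after splitting on X."""
    return entropy(labels) - conditional_entropy(values, labels)


labels = ["good", "good", "bad", "bad"]
informative = ["low", "low", "high", "high"]    # separates the labels perfectly
uninformative = ["low", "high", "low", "high"]  # each group is still half good, half bad
```

The informative attribute removes all uncertainty (gain of 1 bit), while the uninformative one leaves every group as mixed as before (gain of 0).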
Learning decision trees
A decision stump
Base Cases: An Idea
• Base Case One: If all records in the current data subset have the same output, then do not recurse
• Base Case Two: If all records have exactly the same set of input attributes, then do not recurse
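The greedy procedure with these two base cases can be sketched as follows, for categorical attributes only. This is an illustration under stated assumptions rather than the lecture's own implementation: the tree encoding (a label for a leaf, a tuple `(attribute, {value: subtree})` for a split), the function names, and the small car-like dataset are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(rows, labels, attribute):
    groups = defaultdict(list)
    for row, y in zip(rows, labels):
        groups[row[attribute]].append(y)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def build_tree(rows, labels, attributes):
    # Base Case One: every record has the same output, so return a leaf.
    if len(set(labels)) == 1:
        return labels[0]
    # Base Case Two: no attribute varies, i.e. all records have identical inputs,
    # so return a leaf that predicts the majority output.
    candidates = [a for a in attributes if len({row[a] for row in rows}) > 1]
    if not candidates:
        return Counter(labels).most_common(1)[0][0]
    # Greedy step: split on the attribute with the highest information gain.
    best = max(candidates, key=lambda a: information_gain(rows, labels, a))
    partitions = defaultdict(lambda: ([], []))
    for row, y in zip(rows, labels):
        sub_rows, sub_labels = partitions[row[best]]
        sub_rows.append(row)
        sub_labels.append(y)
    # Recurse on each subset of records.
    children = {v: build_tree(r, l, attributes) for v, (r, l) in partitions.items()}
    return (best, children)


def predict(tree, row):
    # Raises KeyError for an attribute value never seen in training.
    while isinstance(tree, tuple):
        attribute, children = tree
        tree = children[row[attribute]]
    return tree


# Hypothetical records, loosely modelled on the fuel-efficiency problem.
rows = [
    {"cylinders": "4", "maker": "asia"},
    {"cylinders": "4", "maker": "america"},
    {"cylinders": "8", "maker": "america"},
    {"cylinders": "8", "maker": "asia"},
    {"cylinders": "6", "maker": "america"},
    {"cylinders": "6", "maker": "asia"},
]
labels = ["good", "good", "bad", "bad", "bad", "good"]
tree = build_tree(rows, labels, ["cylinders", "maker"])
```

On this toy data the root split is `cylinders`, and only the mixed `"6"` branch needs a second split on `maker`; the other branches stop immediately under Base Case One.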
The problem with Base Case 3
If we omit Base Case 3
Summary: Building Decision Trees
From categorical to real-valued attributes
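One common way to handle a real-valued attribute, assumed here for illustration, is a binary test of the form x < t, with t chosen among the midpoints between consecutive distinct values so as to maximise information gain. The function name `best_threshold` and the horsepower-like numbers are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, gain) maximising information gain for the test x < threshold."""
    base = entropy(labels)
    n = len(labels)
    distinct = sorted(set(values))
    best = (None, 0.0)
    # Candidate thresholds: midpoints between consecutive distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x < t]
        right = [y for x, y in zip(values, labels) if x >= t]
        remainder = len(left) / n * entropy(left) + len(right) / n * entropy(right)
        gain = base - remainder
        if gain > best[1]:
            best = (t, gain)
    return best


# Hypothetical real-valued attribute with class labels.
horsepower = [70, 85, 90, 130, 150, 200]
labels = ["good", "good", "good", "bad", "bad", "bad"]
threshold, gain = best_threshold(horsepower, labels)
```

Once a threshold is chosen, the test x < t behaves like a two-valued categorical attribute, so the rest of the tree-building procedure is unchanged.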
What you need to know about decision trees
• Decision trees are one of the most popular ML tools
  • Easy to understand, implement, and use
  • Computationally cheap (to solve heuristically)
• Information gain is used to select attributes
• Presented for classification, but can be used for regression and density estimation too
• Decision trees will overfit!!!
  • We will see the definition of overfitting and related concepts later in class.