MLDM Monday -- Optimization Series Talk


DESCRIPTION

Taiwan R User Group/MLDM Monday


  • 1. R for finding the non-dominated rules in multi-objective optimization. Bo-Han Wu, Jan 27, 2014. Taiwan R User Group/MLDM Monday

  • 2. Google. Wu Bo-Han, rippleblue2002@gmail.com
  • 3. Outline: Introduction; Classification rule; Accuracy; Comprehensibility; Interestingness; Multi-objective optimization; Non-dominated rules; SPEA2; Case study
  • 4. Data growing
  • 5. Introduction: In the age of the data explosion, the amount of data stored in databases is growing rapidly. These data often contain hidden knowledge that can be used to improve decision-making in any kind of company.
  • 6. Classification rule: Classification rule mining is a common data mining technique. Rules generalized from historical data can be used to classify unknown samples or predict future outcomes.
  • 7. Classification rule: IF ... AND ... THEN ... Example: IF Sex = Male AND Location = Taipei THEN Product = beer
  • 8. Classification rule: Traditional mining techniques focus mostly on accuracy and usually generate so many rules that it is hard to pick out the meaningful ones. To select meaningful rules, accuracy, comprehensibility and interestingness are used as three objectives.
  • 9. Accuracy: $A(R) = \frac{\mathrm{sup}(A \,\&\, C)}{\mathrm{sup}(A)}$, where sup(A & C) is the support for the rule R and sup(A) is the support for the antecedent of rule R.
  • 10. Comprehensibility: $C(R) = 1 - \frac{N_c(R)}{M_c}$, where $N_c(R)$ is the number of conditions in the rule and $M_c$ is the maximum number of conditions that a rule can have.
  • 11. Interestingness: $I(R) = \frac{\mathrm{sup}(A \,\&\, C)}{\mathrm{sup}(A)} \times \frac{\mathrm{sup}(A \,\&\, C)}{\mathrm{sup}(C)} \times \left(1 - \frac{\mathrm{sup}(A \,\&\, C)}{|D|}\right)$, where $|D|$ is the number of records in the data set. sup(A & C)/sup(A) gives the probability of generating the rule depending on the antecedent part, sup(A & C)/sup(C) gives the probability of generating the rule depending on the consequent part, and sup(A & C)/|D| gives the probability of generating the rule depending on the whole data set.
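The three objectives can be sketched in a few lines. This is a Python illustration, not the speaker's R code, and it assumes the formulas read as A(R) = sup(A & C)/sup(A), C(R) = 1 - Nc(R)/Mc and I(R) = [sup(A & C)/sup(A)] x [sup(A & C)/sup(C)] x [1 - sup(A & C)/|D|]. The function names and the support counts below are hypothetical; the counts were picked only because they give the A, C and I values quoted for the example rule on slide 33, whose real counts are not stated in the slides.

```python
def accuracy(sup_ac, sup_a):
    """A(R): share of records matching the antecedent that also match the consequent."""
    return sup_ac / sup_a if sup_a else 0.0


def comprehensibility(n_conditions, max_conditions):
    """C(R): shorter rules score higher; 1 - Nc(R) / Mc."""
    return 1 - n_conditions / max_conditions


def interestingness(sup_ac, sup_a, sup_c, n_records):
    """I(R): product of the antecedent, consequent and whole-data-set factors."""
    if sup_a == 0 or sup_c == 0 or n_records == 0:
        return 0.0
    return (sup_ac / sup_a) * (sup_ac / sup_c) * (1 - sup_ac / n_records)


# Hypothetical counts (assumption, not from the slides):
# sup(A & C) = 1, sup(A) = 3, sup(C) = 4, |D| = 25, Nc = 2, Mc = 16.
a = accuracy(1, 3)
c = comprehensibility(2, 16)
i = interestingness(1, 3, 4, 25)
print(f"A={a:.6f} C={c:.6f} I={i:.6f}")
```

All three functions return values to be maximized, so a rule can be represented downstream as a plain tuple `(A, C, I)`.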
  • 12. Multi-objective optimization: low price and high performance. [Figure: performance (40% to 90%) plotted against price (10k to 100k), with a non-dominated solution marked.]
  • 13. Multi-objective optimization: low price and high performance. [Figure: the same plot with five candidate solutions, numbered 1 to 5.]
  • 14. Multi-objective optimization: low price and high performance. [Figure: the same plot with the non-dominated solution set highlighted.]
  • 15. Multi-objective optimization: Traditional methods handle multi-objective problems by converting them into a single-objective problem, but this approach cannot guarantee optimal solutions across all objectives.
  • 16. SPEA2: SPEA2 is built on the concept of "survival of the fittest" from natural evolution. It aims to improve the quality of the individuals in the solution space with each generation, using selection, crossover and mutation to retain the best individuals and discard the worst ones.
  • 17. SPEA2
  • 18. SPEA2: initial population and empty archive. [Figure legend: individual.]
  • 19. SPEA2
  • 20. Non-dominated
  • 21. Non-dominated solution
  • 22. Non-dominated solution set EF
  • 23. SPEA2. [Figure legend: individual; non-dominated individual.]
  • 24. SPEA2
  • 25. SPEA2. [Figure legend: individual; non-dominated individual.]
  • 26. SPEA2: truncation operator. [Figure legend: individual; non-dominated individual.]
  • 27. SPEA2
  • 28. SPEA2
  • 29. SPEA2. [Figure with items numbered 1 to 4.]
  • 30. SPEA2
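The idea of a non-dominated solution set from the price/performance slides can be sketched as a small Pareto filter. This is a Python illustration rather than the speaker's R code; the function names and the five candidate points are hypothetical, since the slides do not give coordinates. Every objective is treated as "larger is better", so price is negated before comparison.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and strictly
    better on at least one (all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated(solutions):
    """Return the names of solutions that no other solution dominates."""
    return [
        name
        for name, objectives in solutions.items()
        if not any(dominates(other, objectives) for other in solutions.values())
    ]


# Hypothetical candidates: name -> (price in thousands, performance in %).
candidates = {
    "s1": (10, 40),
    "s2": (40, 60),
    "s3": (60, 55),
    "s4": (70, 85),
    "s5": (100, 90),
}

# Negate price so that a lower price becomes a larger objective value.
objectives = {name: (-price, perf) for name, (price, perf) in candidates.items()}
front = non_dominated(objectives)
print(front)
```

Here `s3` is dropped because `s2` is both cheaper and faster; the other four each trade price against performance and stay in the set. The same filter works unchanged on three-objective tuples such as (accuracy, comprehensibility, interestingness).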
  • 31. SPEA2
       Recombination = 10101101011001100100010010111
                     = 01100110010111001011101101101
       Mutation = 01100101011001100100010010111
                = 10010101011001100100010010111
  • 32. SPEA2. [Figure with items numbered 1 to 4.]
  • 33. Non-dominated rules, three objectives. IF Sex = Male AND Location = Taipei THEN Product = beer: A = 0.333333 (accuracy), C = 0.875000 (comprehensibility), I = 0.080000 (interestingness). Non-dominated rules
  • 34. Case study: transaction data of an insurance broker company. Date: 2005-2006. Attributes: Gender, Occupation, Payment frequency, Sales methods, Payment methods, Location, Data source, Company, Product. Attribute value index () ()
  • 35. Case study: data cleaning; data transaction (example: Male -> 01, Female -> 10); training data and test data; SPEA2 (accuracy, comprehensibility, interestingness); data transaction (example: 01 -> Male, 10 -> Female)
  • 36. Case study, SPEA2: RuleMing.r, Objective Functions.r, SPEA2 Functions.r, Truncation.r, Crossover.r, Mutation.r
  • 37. Case study, non-dominated rules:
       Sales methods= AND Data source= AND Company= THEN Product=
       Payment methods= AND Data source= AND Company= THEN Product=
       Payment frequency= AND Data source= Company=
  • 38. Case study, non-dominated rules:
       Sales methods= AND Data source= AND Company= THEN Product=
  • 39. Thank you for listening.
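Returning to the recombination and mutation bit strings on slide 31: they are consistent with a one-point crossover at position 5 followed by flipping the first four bits, although the slides do not say which operators the talk's Crossover.r and Mutation.r actually implement. The Python sketch below shows that reading; the function names, the cut point and the flipped positions are assumptions made for illustration.

```python
def one_point_crossover(parent_a, parent_b, cut):
    """Swap the tails of two equal-length bit strings at index `cut`."""
    child_a = parent_a[:cut] + parent_b[cut:]
    child_b = parent_b[:cut] + parent_a[cut:]
    return child_a, child_b


def flip_bits(chromosome, positions):
    """Return a copy of the bit string with the given positions inverted."""
    bits = list(chromosome)
    for i in positions:
        bits[i] = "1" if bits[i] == "0" else "0"
    return "".join(bits)


# The two recombination strings shown on slide 31.
parent_a = "10101101011001100100010010111"
parent_b = "01100110010111001011101101101"

# Assumed cut point: 5. The second child equals the slide's mutation input.
child_a, child_b = one_point_crossover(parent_a, parent_b, 5)

# Assumed mutation: invert the first four bits of that child.
mutated = flip_bits(child_b, range(4))
print(child_b)
print(mutated)
```

In a full evolutionary loop the cut point and the flipped positions would normally be drawn at random; they are fixed here so that the result can be compared with the slide.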
