Distinguish Wild Mushrooms with Decision Tree
Shiqin Yan
Objective
Use an existing mushroom database to build a decision tree that helps determine whether a mushroom is poisonous.
DataSet
Existing records drawn from The Audubon Society Field Guide to North American Mushrooms (1981), G. H. Lincoff (Pres.), New York: Alfred A. Knopf
Number of Instances: 8124 (each classified as either edible or poisonous)
Number of Attributes: 22
Training: 5416, Tuning: 1354, Testing: 1354
Missing attribute values: 2480 (denoted by “?”), all for attribute 11
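The split sizes above can be reproduced with a short Python sketch. The slides do not say how the records were assigned to each set, so the random shuffle, the fixed seed, and the function name `split_indices` are assumptions made only for illustration.

```python
import random


def split_indices(n=8124, n_train=5416, n_tune=1354, seed=0):
    """Shuffle record indices and cut them into training, tuning and testing sets.

    The shuffle and the seed are assumptions; the slides only give the set sizes.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # seeded so the split is repeatable
    train = idx[:n_train]
    tune = idx[n_train:n_train + n_tune]
    test = idx[n_train + n_tune:]     # the remaining 1354 records
    return train, tune, test


train, tune, test = split_indices()
```

The tuning set is kept apart from the testing set so that choices made while building the tree do not leak into the final accuracy estimate.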
Mushroom Features
1. cap-shape: bell=b, conical=c, convex=x, flat=f, knobbed=k, sunken=s
2. cap-surface: fibrous=f, grooves=g, scaly=y, smooth=s
3. cap-color: brown=n, buff=b, cinnamon=c, gray=g, green=r, pink=p, purple=u, red=e, white=w, yellow=y
4. bruise?: bruises=t, no=f
5. odor: almond=a, anise=l, creosote=c, fishy=y, foul=f …
Approach
Use mutual information to choose the features used to split the tree.
Mutual information I(Y;X), where Y is the label and X is a feature
Choose the feature X that maximizes I(Y;X)
Most informative features extracted from the decision tree: odor, spore-print-color, habitat, population
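The selection rule above can be sketched in Python. This is a minimal illustration, not the author's implementation: it assumes the standard definition I(Y;X) = H(Y) - H(Y|X) with entropy measured in bits, and the function names and the toy records are hypothetical (they are not rows from the dataset).

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy H(Y), in bits, of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def mutual_information(feature, labels):
    """I(Y;X) = H(Y) - H(Y|X) for one categorical feature X."""
    n = len(labels)
    groups = {}
    for x, y in zip(feature, labels):
        groups.setdefault(x, []).append(y)  # labels grouped by feature value
    h_cond = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - h_cond


def best_feature(rows, labels, names):
    """Score every feature column and return the one that maximizes I(Y;X)."""
    scores = {
        name: mutual_information([row[i] for row in rows], labels)
        for i, name in enumerate(names)
    }
    return max(scores, key=scores.get), scores


# Toy records made up for illustration: (odor, cap-shape) with labels e/p.
rows = [("a", "x"), ("a", "f"), ("f", "x"), ("f", "f")]
labels = ["e", "e", "p", "p"]
name, scores = best_feature(rows, labels, ["odor", "cap-shape"])
```

In the toy records the odor value determines the label exactly, while cap-shape is independent of it, so odor is the feature chosen for the split.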
Prior Research
by Wlodzislaw Duch, Department of Computer Methods, Nicholas Copernicus University
Future
Add cross-validation to improve accuracy
Prune the tree to avoid overfitting