Distinguish Wild Mushrooms with Decision Tree

Distinguish Wild Mushrooms with Decision Tree. Shiqin Yan. Objective: use an existing mushroom database to build a decision tree that helps determine whether a mushroom is poisonous.


Presentation Transcript


  1. Distinguish Wild Mushrooms with Decision Tree Shiqin Yan

  2. Objective • Use an existing mushroom database to build a decision tree that helps determine whether a mushroom is poisonous.

  3. Dataset • Existing records drawn from The Audubon Society Field Guide to North American Mushrooms (1981), G. H. Lincoff (Pres.), New York: Alfred A. Knopf • Number of instances: 8124 (each classified as either edible or poisonous) • Number of attributes: 22 • Training: 5416, Tuning: 1354, Testing: 1354 • Missing attribute values: 2480 (denoted by “?”), all for attribute 11
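The three-way split above can be sketched as follows. The slides do not say how the rows were assigned to each part, so the random shuffle, the fixed seed, and the helper name `split_indices` are assumptions made only for illustration.

```python
import random

def split_indices(n_total, n_train, n_tune, seed=0):
    """Shuffle row indices and cut them into train / tune / test parts.

    Assumption: a seeded random shuffle; the original split method is not stated.
    """
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)
    train = indices[:n_train]
    tune = indices[n_train:n_train + n_tune]
    test = indices[n_train + n_tune:]  # whatever remains after train and tune
    return train, tune, test

# Sizes taken from the slide: 8124 = 5416 + 1354 + 1354
train_idx, tune_idx, test_idx = split_indices(8124, 5416, 1354)
print(len(train_idx), len(tune_idx), len(test_idx))
```

The tuning part is kept separate from the test part so that choices such as tree depth or pruning can be made without touching the data used for the final accuracy figure.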

  4. Mushroom Features • 1. cap-shape: bell=b, conical=c, convex=x, flat=f, knobbed=k, sunken=s • 2. cap-surface: fibrous=f, grooves=g, scaly=y, smooth=s • 3. cap-color: brown=n, buff=b, cinnamon=c, gray=g, green=r, pink=p, purple=u, red=e, white=w, yellow=y • 4. bruises?: bruises=t, no=f • 5. odor: almond=a, anise=l, creosote=c, fishy=y, foul=f • …

  5. Approach • Use mutual information to choose the feature on which to split each node of the tree. • Mutual information I(Y;X), where Y is the label and X is a feature. • At each node, choose the feature X that maximizes I(Y;X).
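A minimal sketch of this selection step, using the standard definition I(Y;X) = H(Y) - H(Y|X), where H is Shannon entropy. The function names and the four toy rows are hypothetical and are not taken from the presentation or from the mushroom data; they only show the mechanics.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy H(Y) in bits of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def mutual_information(feature_values, labels):
    """I(Y;X) = H(Y) - H(Y|X) for one categorical feature X."""
    n = len(labels)
    groups = {}
    for x, y in zip(feature_values, labels):
        groups.setdefault(x, []).append(y)
    # H(Y|X): entropy of the labels within each feature value, weighted by frequency
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

def best_feature(rows, labels, feature_names):
    """Return the feature name with the largest mutual information with the label."""
    return max(
        feature_names,
        key=lambda name: mutual_information([r[name] for r in rows], labels),
    )

# Hypothetical toy rows (not real records): odor separates the classes, cap-shape does not.
rows = [
    {"odor": "a", "cap-shape": "x"},
    {"odor": "a", "cap-shape": "b"},
    {"odor": "f", "cap-shape": "x"},
    {"odor": "f", "cap-shape": "b"},
]
labels = ["e", "e", "p", "p"]
chosen = best_feature(rows, labels, ["odor", "cap-shape"])
print(chosen)
```

A tree builder would apply `best_feature` recursively: split the rows on the chosen feature, then repeat on each subset until the labels in a subset are all the same or no features remain.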

  6. Most informative features extracted from the decision tree: • odor • spore-print-color • habitat • population

  7. Prior Research by Wlodzislaw Duch, Department of Computer Methods, Nicholas Copernicus University

  8. Future Work • Add cross-validation to improve accuracy • Prune the tree to avoid overfitting
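One way the cross-validation step could be set up is sketched below. The choice of 5 folds and the helper name `k_fold_indices` are assumptions; the presentation does not specify either.

```python
def k_fold_indices(n_rows, k):
    """Yield (train, validation) index lists for k contiguous, near-equal folds.

    Rows are assumed to have been shuffled beforehand.
    """
    # Spread the remainder over the first folds so sizes differ by at most one
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_rows))
        yield train, validation
        start += size

# Assumption: 5 folds over the full 8124 records
folds = list(k_fold_indices(8124, 5))
print([len(validation) for _, validation in folds])
```

Each fold would train a tree on `train` and score it on `validation`; averaging the k scores gives a steadier estimate for comparing settings such as how aggressively to prune.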
