Knowledge Mining and Soil Mapping using Maximum Likelihood Classifier with Gaussian Mixture Models

ECE539 final project. Instructor: Yu Hen Hu. Fall 2005. Jian Liu, 12/13/2005.


Presentation Transcript


  1. Knowledge Mining and Soil Mapping using Maximum Likelihood Classifier with Gaussian Mixture Models • ECE539 final project • Instructor: Yu Hen Hu • Fall 2005 • Jian Liu, 12/13/2005

  2. Overview • This study deals with mining knowledge from soil survey maps and mapping soil with the mined soil-landscape knowledge.

  3. Soil-landscape models • Soil is a product of the interaction of its surrounding environments • "Soil-landscape model" (Hudson, 1992) • Soil can be predicted given the environments

  4. Environmental variables • Environmental factors affecting soil formation: • Bedrock geology • Elevation (DEM) • Slope gradient: 1st derivative along the steepest slope • Profile curvature: 2nd derivative along the steepest slope • Planform curvature: 2nd derivative perpendicular to the steepest slope (along the contour line)
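The terrain variables on this slide can be derived from the DEM by finite differences. The following is a minimal sketch, not the project's actual preprocessing: the function name `terrain_derivatives`, the `cell_size` parameter, and the use of plain directional second derivatives (without the normalization and sign conventions that GIS packages apply to curvature) are all assumptions made for illustration.

```python
import numpy as np

def terrain_derivatives(dem, cell_size=1.0):
    """Slope gradient and directional second derivatives of a gridded DEM.

    Hypothetical helper: rows are treated as y, columns as x.
    """
    # np.gradient returns derivatives along axis 0 (y) first, then axis 1 (x).
    q, p = np.gradient(dem, cell_size)       # q = dz/dy, p = dz/dx
    s, r = np.gradient(p, cell_size)         # s = d2z/dxdy, r = d2z/dx2
    t, _ = np.gradient(q, cell_size)         # t = d2z/dy2
    g2 = p ** 2 + q ** 2                     # squared gradient magnitude
    slope = np.sqrt(g2)                      # 1st derivative along steepest slope
    with np.errstate(divide="ignore", invalid="ignore"):
        # 2nd derivative along the steepest-slope direction
        profile = np.where(g2 > 0, (p * p * r + 2 * p * q * s + q * q * t) / g2, 0.0)
        # 2nd derivative across the slope, i.e. along the contour direction
        planform = np.where(g2 > 0, (q * q * r - 2 * p * q * s + p * p * t) / g2, 0.0)
    return slope, profile, planform

# Synthetic surface z = x^2: it bends only in the x (downslope) direction.
x = np.arange(1.0, 10.0)
dem = np.tile(x ** 2, (9, 1))
slope, profile, planform = terrain_derivatives(dem)
center = (4, 4)  # x = 5 at this cell; edge cells are less accurate
```

On the synthetic surface the interior cells give a slope of 2x, a profile value of 2, and a planform value of 0. Cells on the grid border use one-sided differences and should be treated with caution.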

  5. Previous Approaches & Problems • Fuzzy system (Zhu 2001): elicits knowledge from a soil scientist and represents it with arbitrary curves; assumes independence of each environmental variable • ANN (Zhu 2000; Behrens 2005; Scull 2005): black-box knowledge representation; the high-dimensional matrix is hard to comprehend • Decision trees (Bui 1999; Qi et al. 2003): the extracted knowledge is crisp (typical case), with no information about gradation

  6. Proposal – Knowledge Representation • GMM representation is more suitable because: • A probabilistic representation captures the physical gradation of the phenomenon well • The interactions between environmental variables are taken into account by the multivariate Gaussian distribution • A mixture model has good potential to capture the real distribution, since physically a soil type may have multiple instances
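For reference, the class-conditional density that a GMM representation implies can be written as below. The notation is an assumption introduced here, not taken from the slides: x is the vector of environmental variables at a location, c a soil class, K the number of mixtures, and w, μ, Σ the mixture weights, means, and full covariance matrices.

```latex
p(\mathbf{x} \mid c) = \sum_{k=1}^{K} w_{ck}\,
  \mathcal{N}\!\left(\mathbf{x} \mid \boldsymbol{\mu}_{ck}, \boldsymbol{\Sigma}_{ck}\right),
\qquad \sum_{k=1}^{K} w_{ck} = 1, \quad w_{ck} \ge 0
```

The off-diagonal entries of each Σ carry the interactions between environmental variables, and each of the K components can account for one instance of the soil type on the landscape.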

  7. Proposal – Maximum Likelihood Classifier • Maximum likelihood: if P(A|Class1) = 0.8 and P(A|Class2) = 0.5, then A is classified into Class1 • Naturally evaluates the composite effect that the environmental variables have on the probability of soil formation
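The decision rule in this slide amounts to picking the class with the largest likelihood. A minimal sketch using the slide's two numbers follows; the function name `ml_classify` is hypothetical.

```python
def ml_classify(likelihoods):
    """Return the class label whose likelihood P(A | class) is largest."""
    return max(likelihoods, key=likelihoods.get)

# Likelihood values from the slide.
likelihoods = {"Class1": 0.8, "Class2": 0.5}
predicted = ml_classify(likelihoods)
print(predicted)
```

In the soil-mapping setting the likelihoods would come from each soil class's GMM evaluated at a location's environmental variables.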

  8. Algorithm • Training procedure: [diagram not preserved in transcript] • Testing procedure: [diagram not preserved in transcript]
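The transcript does not preserve the details of the training and testing procedures, so the following is only a sketch of how an ML classifier with one GMM per soil class could be trained and applied. It assumes a plain EM fit with full covariance matrices; the function names, the random initialization, the `reg` regularization term, and the synthetic two-class data are all hypothetical and not taken from the project.

```python
import numpy as np

def log_gauss(X, mean, cov):
    """Log density of a multivariate Gaussian at each row of X."""
    d = X.shape[1]
    L = np.linalg.cholesky(cov)
    sol = np.linalg.solve(L, (X - mean).T)        # shape (d, n)
    maha = np.sum(sol ** 2, axis=0)               # squared Mahalanobis distance
    log_det = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * (d * np.log(2 * np.pi) + log_det + maha)

def gmm_loglik(X, weights, means, covs):
    """Per-sample GMM log-likelihood and per-component log terms."""
    comp = np.stack([np.log(w) + log_gauss(X, m, c)
                     for w, m, c in zip(weights, means, covs)], axis=1)
    top = comp.max(axis=1, keepdims=True)         # log-sum-exp for stability
    total = (top + np.log(np.exp(comp - top).sum(axis=1, keepdims=True))).ravel()
    return total, comp

def fit_gmm(X, k, n_iter=50, reg=1e-6, seed=0):
    """Fit a k-component, full-covariance GMM to X with plain EM."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    means = X[rng.choice(n, size=k, replace=False)]
    covs = np.array([np.cov(X, rowvar=False) + reg * np.eye(d)] * k)
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        total, comp = gmm_loglik(X, weights, means, covs)
        resp = np.exp(comp - total[:, None])      # E-step: responsibilities
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / n                          # M-step: weights, means, covs
        means = (resp.T @ X) / nk[:, None]
        for j in range(k):
            diff = X - means[j]
            covs[j] = (resp[:, j, None] * diff).T @ diff / nk[j] + reg * np.eye(d)
    return weights, means, covs

def train(X, y, k):
    """Training: fit one GMM per class label."""
    return {label: fit_gmm(X[y == label], k) for label in np.unique(y)}

def classify(models, X):
    """Testing: assign each row of X to the class with the highest likelihood."""
    labels = list(models)
    scores = np.column_stack([gmm_loglik(X, *models[c])[0] for c in labels])
    return np.array(labels)[scores.argmax(axis=1)]

# Synthetic data: each class occurs at two separate locations in feature space.
rng = np.random.default_rng(42)
def blob(center):
    return rng.normal(center, 0.5, size=(100, 2))
X = np.vstack([blob((0, 0)), blob((6, 0)), blob((3, 3)), blob((3, -3))])
y = np.array([0] * 200 + [1] * 200)
models = train(X, y, k=2)
accuracy = np.mean(classify(models, X) == y)
```

Comparing log-likelihoods rather than raw likelihoods gives the same argmax and avoids numerical underflow when the feature vectors are far from a component's mean.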

  9. Case Study • [Figure: training set and testing set, each made up of geology, elevation, slope gradient, profile curvature, planform curvature, and soil map layers]

  10. Evaluation of the GMM representation • The GMM representations capture the gradation of soil on the landscape well and agree with expert knowledge • e.g. Council at footslope • e.g. Elbaville at backslope

  11. Training accuracy & testing accuracy • Overall, 80% classification accuracy against testing data • Increasing the number of mixtures leads to higher classification accuracy, at the expense of exponentially increasing storage and computational load • [Figure axis: classification accuracy (%)]

  12. Classification Accuracy vs. # of Mixtures

  13. Mapping accuracy based on field data • 64 of 83 field sample points are correctly classified (77%), higher than traditional manual soil survey (usually 60%) • [Figure: classification result using 8 mixtures; the dark blue areas are not mapped]

  14. More comments • Standardization of feature dimensions is very effective: it improves mapping accuracy from 55% to 80% • Preprocessing techniques such as data cleaning, which decision trees require, are not critical to ML, because the ML classifier is not very sensitive to training errors as long as there are not too many of them
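Standardization of feature dimensions is commonly done as a per-feature z-score. The sketch below assumes that form (the slides do not say which scaling was used); the function name `standardize` and the sample values for elevation and curvature are made up for illustration.

```python
import numpy as np

def standardize(train, test):
    """Scale each feature to zero mean and unit variance using training statistics."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    std = np.where(std == 0, 1.0, std)   # leave constant features unscaled
    return (train - mean) / std, (test - mean) / std

# Hypothetical rows of [elevation in m, profile curvature]: very different scales.
train = np.array([[200.0, 0.001],
                  [300.0, 0.003],
                  [400.0, 0.002]])
test = np.array([[300.0, 0.002]])
train_z, test_z = standardize(train, test)
```

Without scaling, a feature with a large numeric range such as elevation dominates the covariance estimates. The testing data is scaled with the training set's mean and standard deviation so that both sets live in the same feature space.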

  15. Conclusion • GMM is suitable to represent soil-landscape knowledge • ML classifier with GMMs is promising for soil knowledge mining and soil mapping

  16. Future improvement? • Reduce the storage and computational load so that a larger number of mixtures can be used to improve classification accuracy • Replace the full covariance matrix with a diagonal matrix (after de-correlating the features)?
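To make the second idea concrete, the sketch below counts free GMM parameters for full and diagonal covariance matrices and de-correlates features with a PCA rotation. It is an illustration under stated assumptions, not part of the project: PCA is one possible de-correlation method, the function names are hypothetical, and the choice of 5 feature dimensions with 8 mixtures is only an example.

```python
import numpy as np

def gmm_param_count(d, k, diagonal):
    """Free parameters of a k-component GMM over d features."""
    cov_params = d if diagonal else d * (d + 1) // 2
    return k * (d + cov_params) + (k - 1)   # means + covariances + weights

def decorrelate(X):
    """Rotate X onto the eigenvectors of its covariance matrix (PCA rotation)."""
    centered = X - X.mean(axis=0)
    _, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))
    return centered @ eigvecs

full = gmm_param_count(d=5, k=8, diagonal=False)
diag = gmm_param_count(d=5, k=8, diagonal=True)

# Correlated synthetic features become uncorrelated after the rotation.
rng = np.random.default_rng(0)
base = rng.normal(size=(500, 3))
X = base @ np.array([[1.0, 0.8, 0.2], [0.0, 1.0, 0.5], [0.0, 0.0, 1.0]])
rotated_cov = np.cov(decorrelate(X), rowvar=False)
off_diagonal = rotated_cov - np.diag(np.diag(rotated_cov))
```

A global rotation removes the correlation of the pooled data only. Individual mixture components can still be correlated in the rotated space, so a diagonal covariance remains an approximation whose effect on accuracy would need to be checked.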
