
PKNN/LSVM Approach for Gene Expression Analysis (P-trees Technology)

This presentation introduces P-tree technology and then describes our PKNN/LSVM approach to gene expression analysis. It covers Podium KNN, weight optimization, improving accuracy with LSVM, and a performance study.


Presentation Transcript


  1. PKNN/LSVM Approach for Microarray Gene Expression Analysis (P-trees technology is patented by NDSU)

  2. OUTLINE • Introduction • P-tree Technology • Our Approach • PODIUM KNN • Weight Optimization • Improving Accuracy by LSVM • Performance Study • Conclusion

  3. PODIUM KNN • Dissimilarity measurement: $F(X,Y)=\sum_i w_i\, d(x_i,y_i)$, where $d(x_i,y_i)=|x_i-y_i|$ (Manhattan distance) • Stage1. Finding neighbors • Stage2. Podium votes
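
The two stages on this slide can be sketched in Python. This is a minimal illustration, not the authors' P-tree implementation: the slide does not define the podium function, so a Gaussian vote weight is assumed here, and the names `weighted_manhattan`, `podium_knn`, `k`, and `sigma` are hypothetical.

```python
import math

def weighted_manhattan(x, y, w):
    """F(X, Y) = sum_i w_i * |x_i - y_i| (weighted Manhattan distance)."""
    return sum(wi * abs(xi - yi) for wi, xi, yi in zip(w, x, y))

def podium_knn(train, labels, query, w, k=3, sigma=1.0):
    """Stage 1: find the k nearest neighbors. Stage 2: distance-weighted vote.

    ASSUMPTION: the Gaussian vote weight below stands in for the podium
    function, which the slide does not specify.
    """
    # Stage 1: rank training samples by weighted distance to the query.
    ranked = sorted(
        (weighted_manhattan(x, query, w), lab) for x, lab in zip(train, labels)
    )
    # Stage 2: closer neighbors cast larger votes.
    votes = {}
    for dist, lab in ranked[:k]:
        votes[lab] = votes.get(lab, 0.0) + math.exp(-(dist ** 2) / (2 * sigma ** 2))
    return max(votes, key=votes.get)

# Toy data: two samples per class, two features.
train = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
labels = ["A", "A", "B", "B"]
weights = [1.0, 0.5]
prediction = podium_knn(train, labels, [0.5, 0.5], weights, k=3)
```

With these toy values the two class "A" samples are far closer to the query than the class "B" samples, so their votes dominate.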

  4. Optimizing Weights • The genetic algorithm, as introduced by Goldberg (1989), is a randomized search and optimization technique capable of searching for optimal solutions. • Step1. Partition the weight space • Step2. Evaluation/Selection: 10-fold cross-validation [Figure: bit-string chromosomes (1010 1110 … 1010) passed to evaluation (eval)]

  5. Optimizing Weights (cont.) • Step3. Reproduction • Step4. Mutation • Step5. Go back to Step2 until a stop condition is reached. [Figure: bit-string chromosomes (1010 1110 … 1010) passing through reproduction (rep) and mutation (mut)]
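
The loop of evaluation, reproduction, and mutation can be sketched as follows. This is a hedged illustration under several assumptions that are not in the slides: each weight is encoded in 4 bits (suggested by the 4-bit groups in the figure), selection keeps the better half with elitism, reproduction is one-point crossover, and the stop condition is a fixed generation budget. For brevity the toy fitness uses leave-one-out 1-NN accuracy rather than the 10-fold cross-validation named on the slide. All function and parameter names are hypothetical.

```python
import random

BITS = 4  # ASSUMPTION: bits per weight, suggested by the 4-bit groups in the figure

def decode(chrom, n_weights):
    """Map a bit string to n_weights values on a grid in [0, 1]."""
    return [
        int(chrom[i * BITS:(i + 1) * BITS], 2) / (2 ** BITS - 1)
        for i in range(n_weights)
    ]

def evolve(fitness, n_weights, pop_size=20, generations=30, p_mut=0.02, seed=0):
    rng = random.Random(seed)
    length = n_weights * BITS
    pop = ["".join(rng.choice("01") for _ in range(length)) for _ in range(pop_size)]
    for _ in range(generations):
        # Step 2: evaluation/selection, keeping the better half as parents.
        ranked = sorted(pop, key=lambda c: fitness(decode(c, n_weights)), reverse=True)
        parents = ranked[: pop_size // 2]
        children = [ranked[0]]  # elitism: the best chromosome survives unchanged
        while len(children) < pop_size:
            # Step 3: reproduction by one-point crossover.
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, length)
            child = a[:cut] + b[cut:]
            # Step 4: mutation flips each bit with probability p_mut.
            child = "".join(
                ("1" if bit == "0" else "0") if rng.random() < p_mut else bit
                for bit in child
            )
            children.append(child)
        pop = children  # Step 5: repeat until the generation budget is spent
    best = max(pop, key=lambda c: fitness(decode(c, n_weights)))
    return decode(best, n_weights)

def loo_accuracy(w, data, labels):
    """Leave-one-out 1-NN accuracy under weighted Manhattan distance."""
    hits = 0
    for i, x in enumerate(data):
        others = [j for j in range(len(data)) if j != i]
        nearest = min(
            others,
            key=lambda j: sum(wk * abs(a - b) for wk, a, b in zip(w, x, data[j])),
        )
        hits += labels[nearest] == labels[i]
    return hits / len(data)

# Toy data: feature 0 separates the classes, feature 1 is misleading noise.
data = [[0.0, 9.0], [0.2, 1.0], [0.1, 5.0], [1.0, 8.8], [1.1, 1.2], [0.9, 5.1]]
labels = [0, 0, 0, 1, 1, 1]
best = evolve(lambda w: loo_accuracy(w, data, labels), n_weights=2)
```

On this toy data, weighting only feature 0 gives perfect leave-one-out accuracy and weighting only feature 1 gives none, so the search is rewarded for shrinking the second weight relative to the first.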

  6. Optimal Knn/LSVM • Why LSVM: A lesson from KddCup02 [Figure: Class 1 and Class 2, with the optimal boundary and optimal margin marked]

  7. Optimal Knn/LSVM (cont.) • EIN-ring membership • C: component • R: radius • Support vector pair • Boundary Sentry • Boundary hyperplane • Step1. Finding support vector pairs • Step2. Fitting the boundary hyperplane [Figure: scatter plot of + and - points, with * and # markers]
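
A rough sketch of the two steps, under loudly stated assumptions: the slide's EIN-ring search is not reproduced, so Step 1 is replaced by a brute-force search for the closest pair of opposite-class points, and Step 2 fits the perpendicular bisector of that single pair. The actual method may use several pairs and a different fit. All names are hypothetical.

```python
def closest_opposite_pair(pos, neg):
    """Step 1 (brute-force stand-in for the EIN-ring search):
    return the nearest pair of points from opposite classes."""
    return min(
        ((p, n) for p in pos for n in neg),
        key=lambda pn: sum((a - b) ** 2 for a, b in zip(*pn)),
    )

def fit_hyperplane(p, n):
    """Step 2 (assumed fit): perpendicular bisector of the pair.

    Returns (normal, offset) such that normal . x + offset = 0 on the boundary.
    """
    normal = [a - b for a, b in zip(p, n)]
    midpoint = [(a + b) / 2 for a, b in zip(p, n)]
    offset = -sum(w * m for w, m in zip(normal, midpoint))
    return normal, offset

def classify(x, normal, offset):
    """Label a point by the side of the hyperplane it falls on."""
    score = sum(w * xi for w, xi in zip(normal, x)) + offset
    return "+" if score >= 0 else "-"

# Toy data: three "+" points and three "-" points in two dimensions.
pos = [[2.0, 2.0], [3.0, 3.0], [4.0, 2.0]]
neg = [[0.0, 0.0], [-1.0, 1.0], [1.0, -1.0]]
p, n = closest_opposite_pair(pos, neg)
normal, offset = fit_hyperplane(p, n)
```

Here the closest opposite-class pair is (2, 2) and (0, 0), so the boundary is the line through (1, 1) with normal (2, 2).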

  8. Optimal Knn/LSVM (cont.) • Robust for data sets with noise [Figure: Class 1 and Class 2, with the optimal boundary and optimal margin marked]

  9. Implementation • Models Structure Design [Diagram labels: data; DCI Model; GA Model; Basic P-trees; w1,w2,…,wd; Cuboids Model; gw1,…,(wi,wj),…,gwk; Sorting w.t. avg(gw); HOBBit/EINring; EINring Formulation; Execution using PDM; PDM Model]

  10. Data Sets of Bioinformatics • DS1. Leukemia data, size 6817x72 (http://llmpp.nih.gov/lymphoma/) • DS2. Colon cancer data (Alon 1999), size 2000x62 • DS3. NCI60, size 1376x60 • DS4. Yeast sporulation data set (Chu et al. 1998), time series data (http://cmgm.stanford.edu/pbrown/sporulation/)

  11. Performance Study • Accuracy Comparison

  12. Performance Study (cont.) • Influence of noise

  13. Performance Study (cont.) • Influence of GA parameters

  14. Conclusion
