Clustering and Research Works

Dr. Bernard Chen, Ph.D., University of Central Arkansas


Presentation Transcript


  1. Clustering and Research Works • Dr. Bernard Chen, Ph.D. • University of Central Arkansas

  2. Outline • Clustering • Data Science • Future Works

  3. Clustering Algorithms • We use two clustering algorithms in our approach: • K-means Clustering • Fuzzy C-means Clustering

  4. K-means Clustering

  5. K-means Clustering

  6. K-means Clustering

  7. K-means Clustering

  8. K-means Clustering
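
The figures for the K-means slides did not survive in this transcript. As a stand-in, here is a minimal sketch of standard K-means (Lloyd's algorithm) in plain Python. It is not necessarily the variant used in the presentation; the function name `kmeans`, the sample points, and the choice to seed the centroids with the first `k` points are assumptions made for illustration.

```python
import math

def kmeans(points, k, iterations=100):
    """Lloyd's algorithm on a list of equal-length numeric tuples."""
    # Assumption: seed centroids with the first k points so runs are repeatable.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])  # empty cluster: keep old centroid
        # Stop once neither the assignments nor the centroids change.
        if new_labels == labels and new_centroids == centroids:
            break
        labels, centroids = new_labels, new_centroids
    return centroids, labels

# Hypothetical 2-D sample data with two visible groups.
sample = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
centroids, labels = kmeans(sample, k=2)
```

Each point ends up in exactly one cluster, which is the main contrast with the fuzzy variant covered in the following slides.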

  9. Fuzzy C-means Clustering

  10. Fuzzy C-means Clustering

  11. Fuzzy C-means Clustering

  12. Fuzzy C-means Clustering

  13. Fuzzy C-means Clustering

  14. Fuzzy C-means Clustering

  15. Fuzzy C-means Clustering
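
The Fuzzy C-means slides are also image-only in this transcript. The sketch below follows the textbook formulation, in which every point receives a degree of membership in every cluster, and may differ from the presenter's version. The names `fuzzy_c_means` and `_memberships`, the fuzzifier default `m=2.0`, and the starting centers are assumptions.

```python
import math

def _memberships(points, centers, exponent, eps):
    """u[i][j] = 1 / sum_k (d_ij / d_ik) ** exponent, one row per point."""
    rows = []
    for p in points:
        # Clamp distances so a point sitting on a center does not divide by zero.
        dists = [max(math.dist(p, c), eps) for c in centers]
        rows.append([1.0 / sum((dj / dk) ** exponent for dk in dists) for dj in dists])
    return rows

def fuzzy_c_means(points, centers, m=2.0, iterations=50, eps=1e-9):
    """Return (centers, memberships) after a fixed number of update passes."""
    centers = [tuple(c) for c in centers]
    exponent = 2.0 / (m - 1.0)
    dims = len(points[0])
    for _ in range(iterations):
        u = _memberships(points, centers, exponent, eps)
        new_centers = []
        for j in range(len(centers)):
            # Each center is the mean of all points weighted by membership ** m.
            weights = [row[j] ** m for row in u]
            total = sum(weights)
            new_centers.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dims)
            ))
        centers = new_centers
    # Recompute memberships so they match the returned centers.
    return centers, _memberships(points, centers, exponent, eps)

# Hypothetical sample data and starting centers.
sample = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
centers, memberships = fuzzy_c_means(sample, centers=[(0.0, 0.0), (10.0, 10.0)])
```

Each row of `memberships` sums to 1, so a point near a cluster boundary can be split between clusters instead of being forced into one.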

  16. Real-World Example

  17. Outline • Clustering • Data Science • Future Works

  18. Data Science (source: Wikipedia) • Data science is the study of the generalizable extraction of knowledge from data. • It incorporates varying elements and builds on techniques and theories from many fields.

  19. Outline • Clustering • Data Science • Future Works

  20. Data Science (source: Wikipedia) • A practitioner of data science is called a data scientist. • Data scientists solve complex data problems by employing deep expertise in some scientific discipline. • Data scientists are generally expected to be able to work with various elements.

  21. Data Science (source: Wikipedia) • Good data scientists are able to apply their skills to achieve a broad spectrum of end results. These skills include the ability to: • find and interpret rich data sources, • manage large amounts of data despite hardware, software and bandwidth constraints, • merge data sources together, • ensure consistency of datasets, • create visualizations to aid in understanding data, • build mathematical models using the data, • present and communicate the data insights and findings to specialists and scientists in their team and, if required, to a non-expert audience.

  22. Outline • Clustering • Data Science • Future Works

  23. Data Science in Wine • Once viewed as a luxury good, wine is now enjoyed by an increasingly wide range of consumers. • Wine certification is generally assessed by physicochemical and sensory tests.

  24. Sensory Tests • Example: Chateau Latour 2010 • http://www.wine.com/V6/Chateau-Latour-2010/wine/110508/detail.aspx

  25. Sensory Tests • Among the available expert reviews, we use the one from Wine Spectator: • "Unbelievably pure, with distilled cassis and plum fruit that cuts a very precise path, while embers of anise, violet and black cherry confiture form a gorgeous backdrop. A bedrock of graphite structure should help this outlive other 2010s. Powerful, sleek and incredibly long. Not perfect, but very close. Best from 2020 through 2050." (99 points, Wine Spectator)

  26. Sensory Tests • Wine Spectator has the following advantages: • Its wording is precise • It is well known • It is famous for its Top 100 wines of the year selection • Its database is well maintained

  27. Research Topic 1 • Clustering of the Top 100 wines from the past 10 years (1,000 wines) • Challenges: • Extracting attributes from 1,000 wines • Choosing a clustering algorithm • Analyzing the results
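
One plausible way to approach the attribute-extraction challenge is to match each review against a fixed vocabulary of tasting terms and emit a 0/1 vector per wine. The slides do not list the attributes actually used, so the `ATTRIBUTES` vocabulary and the function name `extract_attributes` below are hypothetical.

```python
import re

# Hypothetical attribute vocabulary; the real attribute list is not given in the slides.
ATTRIBUTES = ["cassis", "plum", "anise", "violet", "black cherry", "graphite", "oak"]

def extract_attributes(review, vocabulary=ATTRIBUTES):
    """Return a 0/1 vector marking which vocabulary terms appear in the review."""
    text = review.lower()
    return [
        1 if re.search(r"\b" + re.escape(term) + r"\b", text) else 0
        for term in vocabulary
    ]

review = (
    "Unbelievably pure, with distilled cassis and plum fruit that cuts a very "
    "precise path, while embers of anise, violet and black cherry form a "
    "gorgeous backdrop. A bedrock of graphite structure should help this "
    "outlive other 2010s."
)
vector = extract_attributes(review)
```

Vectors of this kind can be fed directly to a clustering routine, since each wine becomes a point in a space with one dimension per attribute.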

  28. Research Topic 2 • Multi-class (4 classes) classification on 1,000 wines, made up of 250 wines in each of 4 categories (95+, 90-94, 85-89, below 85) • Challenges: • Choosing a classification algorithm • Handling 4 classes • Improving accuracy
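
To make the four-category setup concrete, the sketch below maps a numeric score to a class label and computes plain accuracy for a list of predictions. It assumes that a score of exactly 85 belongs to the 85-89 class; the function names `score_class` and `accuracy` are illustrative, not from the slides.

```python
def score_class(score):
    """Map a numeric review score to one of four class labels."""
    # Assumption: a score of exactly 85 falls in the 85-89 class.
    if score >= 95:
        return "95+"
    if score >= 90:
        return "90-94"
    if score >= 85:
        return "85-89"
    return "below 85"

def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    matches = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return matches / len(true_labels)

# Hypothetical scores and predictions from some classifier.
truth = [score_class(s) for s in (99, 92, 87, 80)]
predicted = ["95+", "90-94", "90-94", "below 85"]
result = accuracy(truth, predicted)
```

With 250 wines per class the dataset is balanced, so plain accuracy is a reasonable first metric; a per-class breakdown would show which score bands are confused with each other.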

  29. Research Topic 3 • Association rules on a region-specific dataset (such as Napa) for attribute correlation and quality prediction • Challenges: • Choosing an association rules algorithm • Analyzing the results • Improving accuracy
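
As a rough illustration of what an association rule looks like on wine attributes, the sketch below computes support and confidence for single-item rules (A implies B) over a handful of made-up wines. It is a brute-force pass over item pairs, not Apriori or any specific algorithm from the presentation; `pair_rules`, the thresholds, and the sample data are all hypothetical.

```python
from itertools import combinations

def pair_rules(transactions, min_support=0.5, min_confidence=0.8):
    """Return (lhs, rhs, support, confidence) for single-item rules lhs -> rhs."""
    n = len(transactions)

    def support(items):
        # Fraction of transactions that contain every item in `items`.
        return sum(1 for t in transactions if items <= t) / n

    all_items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(all_items, 2):
        pair_support = support({a, b})
        if pair_support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            # Confidence: how often rhs appears among wines that have lhs.
            confidence = pair_support / support({lhs})
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules

# Made-up wines: each set holds extracted attributes plus a quality tag.
wines = [
    {"cassis", "plum", "90+"},
    {"cassis", "plum", "90+"},
    {"cassis", "oak"},
    {"plum", "90+"},
]
rules = pair_rules(wines)
```

A rule whose right-hand side is a quality tag, such as plum implying 90+, is the kind that could be used for quality prediction.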

  30. Research Topic 4 • Region prediction (such as France vs. Italy), open to either association rules or classification algorithms • Challenges: • More free-form (better suited to experienced researchers) • Focus not only on accuracy but also on explaining what distinguishes the regions

  31. Research Topic 5 • Clustering + classification for higher-accuracy prediction • Challenges: • Two types of algorithms • More complex to understand and code
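
The slides do not say how clustering and classification are meant to be combined. One simple possibility, sketched here purely as an assumption, is to assign each wine to its nearest cluster centroid and then predict the majority class seen in that cluster. The function names, centroids, and sample data are hypothetical.

```python
import math
from collections import Counter

def nearest(point, centroids):
    """Index of the centroid closest to the point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda j: math.dist(point, centroids[j]))

def fit_cluster_labels(points, labels, centroids):
    """Learn the majority class label of the training points in each cluster."""
    votes = {}
    for p, y in zip(points, labels):
        votes.setdefault(nearest(p, centroids), Counter())[y] += 1
    return {j: counts.most_common(1)[0][0] for j, counts in votes.items()}

def predict(point, centroids, cluster_labels):
    """Predict the majority label of the cluster the point falls into."""
    return cluster_labels.get(nearest(point, centroids))

# Hypothetical centroids (for example, the output of a clustering step) and training data.
centroids = [(1.0, 1.0), (8.5, 8.8)]
train_points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
train_labels = ["85-89", "85-89", "95+", "95+", "90-94", "95+"]
cluster_labels = fit_cluster_labels(train_points, train_labels, centroids)
guess = predict((9.0, 9.0), centroids, cluster_labels)
```

A fuller version would train a separate classifier inside each cluster instead of taking a majority vote, which is where the extra complexity noted on the slide comes in.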

  32. Research Topic 6 • Multi-label research: since multiple reviews are available for each wine, how can that information be used for data science research? • Challenges: • Very flexible!
