
Presentation Transcript


  1. Data Mining & Knowledge Discovery Mining association rules procedure to support on-line recommendation by customers and products fragmentation S. Wesley Changchien, Tzu-Chuen Lu Expert Systems with Applications 20(2001) 325-335 Team members: M964020025 郭李哲, M964020027 鄭淵太, M964020044 鐘佶修

  2. Background and motivation • Most EC businesses strive to survive and to become leaders at the frontier of the new wave. • The key success factors include learning customers’ purchasing behavior, developing marketing strategies that create new consumer markets, and discovering latent loyal customers.

  3. Data mining task

  4. Self-Organizing Map (SOM)

  5. Rough set theory (RST) • Data analysis with RST relies on two basic concepts, called the lower and the upper approximations of a set.

  6. Rough set theory (RST) • Formal definitions of the lower and upper approximations of a set (see the sketch below)
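The formulas on this slide did not survive the transcript, so the block below restates the standard (Pawlak) definitions of the two approximations as a reference; the notation U, R, and [x]_R is assumed here rather than taken from the paper.

```latex
% Standard rough-set approximations of a set X (notation assumed):
%   U      : the universe of records
%   R      : the indiscernibility (equivalence) relation on U
%   [x]_R  : the equivalence class of record x under R
\underline{R}X = \{\, x \in U \mid [x]_R \subseteq X \,\}              % lower approximation
\overline{R}X  = \{\, x \in U \mid [x]_R \cap X \neq \emptyset \,\}    % upper approximation
```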

  7. Data mining procedure

  8. Step 1 - selection and sampling • 1. Creating a fact table • 2. Selecting dimensions • Pick the dimensions of interest • 3. Selecting attributes • Select attributes according to their importance • 4. Filtering data • Restrict the range of attribute values
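As a minimal sketch of this selection-and-sampling step, the pandas snippet below builds a small hypothetical fact table, selects attributes, and filters a value range; the column names, values, and thresholds are illustrative assumptions, not data from the paper.

```python
import pandas as pd

# Hypothetical fact table joining the customer dimension with transactions
# (columns and values are assumptions for illustration only).
fact = pd.DataFrame({
    "member_id": [1, 2, 3, 4, 5, 6],
    "education": ["N", "H", "N", "H", "H", "H"],
    "job":       ["N", "H", "H", "N", "H", "H"],
    "gender":    ["F", "M", "F", "M", "F", "M"],
    "salary":    [28000, 52000, 31000, 45000, 60000, 38000],
})

# 2-3. Select the dimensions and attributes of interest.
selected = fact[["member_id", "education", "job", "salary"]]

# 4. Filter the data by restricting the range of an attribute's values.
filtered = selected[selected["salary"] >= 30000]
print(filtered)
```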

  9. Step 2 - transformation and normalization • 1. Attributes with numeric values • 2. Attributes with non-numeric values • Encode the data with a designed coding scheme • e.g., values of the Job attribute are treated as characters
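A small sketch of how this step might look in code, continuing the hypothetical table above: numeric attributes are min-max normalized and non-numeric attributes are mapped to designed codes; the coding scheme shown is an assumption, not the paper's exact design.

```python
import pandas as pd

members = pd.DataFrame({
    "education": ["N", "H", "N", "H"],          # non-numeric attribute
    "job":       ["N", "H", "H", "N"],          # non-numeric attribute (character codes)
    "salary":    [28000, 52000, 31000, 45000],  # numeric attribute
})

# 1. Numeric attributes: min-max normalize into [0, 1].
s = members["salary"]
members["salary_norm"] = (s - s.min()) / (s.max() - s.min())

# 2. Non-numeric attributes: replace each character code with a designed numeric code.
code = {"H": 1.0, "N": 0.0}   # assumed coding for High / Normal
members["education_code"] = members["education"].map(code)
members["job_code"] = members["job"].map(code)

print(members)
```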

  10. Step 3 – data mining of association rules • A neural network is used for clustering and rough set theory for rule extraction; together they find association rules, explain the characteristics of each cluster, and relate attributes across different clusters.

  11. Clustering module • Kohonen proposed the SOM in the early 1980s. • It reveals the natural relationships among the input attributes. • We can group an enterprise’s customers, products, and suppliers into clusters. • For instance, input nodes: education and job from the member table • Output nodes: nine clusters (see the SOM sketch below).
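The block below is a minimal, self-contained SOM sketch in plain NumPy for a 3×3 output grid (nine clusters) over two normalized inputs such as the education and job codes; it only illustrates the technique and is not the authors' implementation.

```python
import numpy as np

def train_som(data, grid=(3, 3), epochs=200, lr0=0.5, sigma0=1.0, seed=0):
    """Train a tiny SOM; data has shape (n_samples, n_features) with values in [0, 1]."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    weights = rng.random((rows, cols, data.shape[1]))
    # Grid coordinates of each output node, used by the neighborhood function.
    coords = np.array([[i, j] for i in range(rows) for j in range(cols)]).reshape(rows, cols, 2)
    for t in range(epochs):
        lr = lr0 * (1 - t / epochs)               # decaying learning rate
        sigma = sigma0 * (1 - t / epochs) + 1e-3  # decaying neighborhood radius
        for x in data:
            # Best-matching unit: the output node whose weight vector is closest to x.
            dists = np.linalg.norm(weights - x, axis=2)
            bmu = np.unravel_index(np.argmin(dists), dists.shape)
            # Gaussian neighborhood around the BMU on the output grid.
            grid_d2 = np.sum((coords - np.array(bmu)) ** 2, axis=2)
            h = np.exp(-grid_d2 / (2 * sigma ** 2))[..., None]
            weights += lr * h * (x - weights)
    return weights

def assign_cluster(weights, x):
    """Return the (row, col) of the winning node, i.e. the cluster of sample x."""
    dists = np.linalg.norm(weights - x, axis=2)
    return np.unravel_index(np.argmin(dists), dists.shape)

# Example: (education_code, job_code) for a few members (assumed values).
members = np.array([[1.0, 1.0], [0.0, 1.0], [1.0, 0.0], [0.0, 0.0], [1.0, 1.0]])
w = train_som(members)
print([assign_cluster(w, m) for m in members])
```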

  12. Rule extraction module • Rough set theory is used to extract association rules from the records of each homogeneous cluster and to relate attributes across different clusters.

  13. Characterization of each cluster • Use Rough Set Theory to explain the characteristics a cluster possesses • e.g., customers of a certain class have a university education or above, a monthly salary above 35,000, … • Generate result equivalence classes, Xk • Generate cause equivalence classes, Aij • Generate lower approximation rules • Generate upper approximation rules • Generate combinatorial rules • Explain the cluster’s characteristics • Repeat (return to Step 3)

  14. Step 1. Generate result equivalence classes, Xk • Generate a result equivalence class for each cluster

  15. Step 2. Generate cause equivalence classes, Aij • Generate cause equivalence classes for the attributes (see the sketch below)
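The sketch below shows one way Steps 1 and 2 could be coded: records are grouped by the decision attribute (the cluster label GID) to obtain the result classes Xk, and by each condition-attribute value to obtain the cause classes Aij. The member table is hypothetical; its values are chosen only so that the class for cluster A matches the X1 = {Member2, Member5, Member6} used in the following slides.

```python
from collections import defaultdict

# Hypothetical member table: (name, education, job, cluster GID).
members = [
    ("Member1", "N", "N", "B"),
    ("Member2", "H", "H", "A"),
    ("Member3", "N", "H", "B"),
    ("Member4", "N", "N", "B"),
    ("Member5", "H", "H", "A"),
    ("Member6", "H", "H", "A"),
]

# Step 1: result equivalence classes Xk -- one class per cluster (decision attribute GID).
result_classes = defaultdict(set)
for name, edu, job, gid in members:
    result_classes[gid].add(name)

# Step 2: cause equivalence classes Aij -- one class per (condition attribute, value) pair.
cause_classes = defaultdict(set)
for name, edu, job, gid in members:
    cause_classes[("Education", edu)].add(name)
    cause_classes[("Job", job)].add(name)

print(sorted(result_classes["A"]))          # ['Member2', 'Member5', 'Member6']
print(sorted(cause_classes[("Job", "H")]))  # ['Member2', 'Member3', 'Member5', 'Member6']
```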

  16. Step 3. Generate lower approximation rules • A lower approximation rule requires Aij ⊆ X1 and therefore has Confidence = 1 • X1 = { Member2, Member5, Member6 } • Intersections Aij ∩ X1 of the cause equivalence classes with X1: { Member2 }, { Member5, Member6 }, {Φ}, {Φ}, { Member2, Member5, Member6 }, {Φ} • Rule 1: If Education = H then GID = A • Confidence = 1

  17. Step 4. Generate upper approximation rules • An upper approximation rule requires Aij ∩ X1 ≠ Φ, with Confidence = |Aij ∩ X1| / |Aij| • X1 = { Member2, Member5, Member6 } • Intersections Aij ∩ X1: { Member2 }, { Member5, Member6 }, {Φ}, {Φ}, { Member2, Member5, Member6 }, {Φ}

  18. Step 4. Generate upper approximation rules • Confidence threshold = 0.75 • Rule 2: If Education = N then GID = A • Confidence = 0.33 • Reject Rule 2 (0.33 < 0.75) • Rule 3: If Job = H then GID = A • Confidence = 0.75 • Accept Rule 3 (0.75 ≥ 0.75)
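Since the slide formulas were lost in the transcript, the block below restates how the rule confidences in this example can be read; the notation is assumed rather than quoted from the paper.

```latex
% Confidence of a candidate rule  A_{ij} => X_k  (notation assumed):
\mathrm{Conf}\bigl(A_{ij} \Rightarrow X_k\bigr) = \frac{\lvert A_{ij} \cap X_k \rvert}{\lvert A_{ij} \rvert}
% Lower approximation rule (Step 3): A_{ij} \subseteq X_k, so Conf = 1  (Rule 1).
% Upper approximation rule (Step 4): A_{ij} \cap X_k \neq \emptyset; keep the rule only when
% Conf reaches the threshold: Rule 2 has 1/3 \approx 0.33 < 0.75 (rejected),
% Rule 3 has 3/4 = 0.75 \geq 0.75 (accepted).
```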

  19. Step 5. Generate combinatorial rules • Combine rules to produce association rules that consider multiple attributes; the confidence is computed over the combined condition (see the sketch after the next slide) • X1 = { Member2, Member5, Member6 }

  20. Step 5. Generate combinatorial rules • Rule 4: If Education = N and Job = H then GID = A • Confidence = • Rule 5: If Education = H and Job = H then GID = A • Confidence =
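Continuing the same hypothetical member table as in the Step 1-2 sketch, the snippet below computes the confidence of single-condition and combined-condition (combinatorial) rules by intersecting the relevant cause classes; this intersection-based formula is my reading of the procedure, and the toy data reproduces Rules 1 and 3 but not every number in the slides.

```python
# Equivalence classes taken from the Step 1-2 sketch (hypothetical data).
result_classes = {"A": {"Member2", "Member5", "Member6"}}
cause_classes = {
    ("Education", "H"): {"Member2", "Member5", "Member6"},
    ("Education", "N"): {"Member1", "Member3", "Member4"},
    ("Job", "H"): {"Member2", "Member3", "Member5", "Member6"},
    ("Job", "N"): {"Member1", "Member4"},
}

def confidence(conditions, gid):
    """Confidence of 'IF all conditions THEN GID = gid'; conditions are (attribute, value) pairs."""
    covered = set.intersection(*(cause_classes[c] for c in conditions))
    if not covered:
        return 0.0
    return len(covered & result_classes[gid]) / len(covered)

print(confidence([("Education", "H")], "A"))                # cf. Rule 1 -> 1.0
print(confidence([("Job", "H")], "A"))                      # cf. Rule 3 -> 0.75
print(confidence([("Education", "H"), ("Job", "H")], "A"))  # a combinatorial rule
```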

  21. Step 6. Explain the cluster’s characteristics • Summarize the rules and explain the cluster’s characteristics • Members belonging to Cluster 1 (Cluster A): • 100% have Education = High • 75% have Job = High • 25% have Education = Normal and Job = High • 50% have Education = High and Job = High

  22. Step 7. Repeat • Return to Step 3 and process the next equivalence class Xk; repeat in this way until all equivalence classes have been handled.

  23. Association of different clusters • Use Rough Set Theory to analyze the relationships between different clusters • e.g., members of class A prefer products of class b; members of class C prefer products of class d, …

  24. Association of different clusters • R1: If Product = 3 Then Receiver = 2, Confidence = 1 • R2: If Product = 6 Then Receiver = 2, Confidence = 1 • R3: If Buyer = 1 Then Receiver = 2, Confidence = 0.5 • R4: If Buyer = 2 Then Receiver = 2, Confidence = 0.75 • R5: If Product = 7 Then Receiver = 2, Confidence = 0.5
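As an illustration of how rules like R1-R5 could be derived, the sketch below computes confidences over a transaction table recording the buyer's customer cluster, the product cluster, and the receiver's customer cluster; the transactions are hypothetical and merely chosen so that they reproduce the confidences of R1-R5.

```python
# Hypothetical transactions: (buyer_cluster, product_cluster, receiver_cluster).
transactions = [
    (1, 3, 2), (1, 7, 5), (2, 6, 2), (2, 3, 2),
    (2, 7, 2), (2, 5, 1), (1, 6, 2), (1, 5, 4),
]

def rule_confidence(condition_index, condition_value, target_receiver):
    """Confidence of 'IF <field at condition_index> = value THEN Receiver = target'."""
    matching = [t for t in transactions if t[condition_index] == condition_value]
    if not matching:
        return 0.0
    return sum(t[2] == target_receiver for t in matching) / len(matching)

print(rule_confidence(0, 2, 2))  # If Buyer = 2 Then Receiver = 2   -> 0.75 (cf. R4)
print(rule_confidence(1, 3, 2))  # If Product = 3 Then Receiver = 2 -> 1.0  (cf. R1)
```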

  25. System implementation • Transaction records of a retail store were used • The Product table has 1120 records • The Customer table has 35 records • 2000 transaction records were retained as the mining data • Dimensions and attributes were selected: • Customer clustering: education, job, gender • Product clustering: sales price, import price, sale price for VIP customers

  26. SOM network Interface

  27. Relationship analysis Interface

  28. Use Rules for Recommendation • A customer wants to buy a product as a gift for a friend but does not know what would be suitable. • If the customer belongs to cluster 7 and the friend belongs to cluster 1, the system can recommend products of cluster 9 to the customer.
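A minimal sketch of how the extracted rules could drive this recommendation scenario; the rule table and lookup function below are illustrative assumptions, not the system's actual interface.

```python
# Hypothetical rule base: (buyer_cluster, receiver_cluster) -> recommended product clusters.
rules = {
    (7, 1): [9],     # the slide's example: buyer in cluster 7, friend in cluster 1
    (2, 2): [3, 6],  # an additional made-up entry
}

def recommend(buyer_cluster, receiver_cluster):
    """Return the product clusters recommended for this buyer/receiver pair, if a rule applies."""
    return rules.get((buyer_cluster, receiver_cluster), [])

print(recommend(7, 1))  # -> [9]
```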

  29. Conclusion • This paper uses SOM and rough set theory for clustering and rule extraction. The rule extraction module describes the relationships between different clusters; analysts can further select other attributes, such as zodiac sign, psychological test results, or blood type, to analyze the relationships between clusters. • This study uses Rough Set Theory to find association rules in the data. The rules go in two directions: describing the characteristics of a cluster and relating different clusters. In the implementation, however, only the relationships between clusters are presented; the characterization of clusters and how it should be applied are not addressed.
