
Link Prediction and Sentiment Analysis on Amazon products



Presentation Transcript


  1. Link Prediction and Sentiment Analysis on Amazon products Presenters: Vishal Mishra, Ashray Bhandare, Subhrajit Majumder

  2. Agenda • Background • Analysis • Dataset • Problem formulation • Methodology • Simulation • Preprocessing • Link Prediction • Sentiment Analysis • Experimental results • Conclusions and future work EECS6980:006 Social Network Analysis

  3. Objective and Motivation Objectives • To perform link prediction on the given dataset in order to gain predictive knowledge about the products. • To perform sentiment analysis on product reviews to assess users' emotions or sentiments toward the products, with a view to the future market. Motivation • To understand how companies analyze their products to increase sales and thereby yearly profits.

  4. Agenda • Background • Analysis • Dataset • Problem formulation • Methodology • Simulation • Preprocessing • Link Prediction • Sentiment Analysis • Experimental results • Conclusions and future work

  5. Background Overview of Project • Humans inevitably develop a sense of the relationships between the movies or TV shows they watch, some of which are based on their interests. • The system we develop is capable of recommending which movies and TV shows will go well together (and which will not). • It predicts the sales of products based on past purchases and user feedback. • Our approach is not just based on selling products at any given point in time to increase the company's sales, but rather on capturing the largest dataset possible and developing a scalable method for uncovering human notions.

  6. Background Why did we choose this project? • Link prediction systems are ubiquitous in applications ranging from e-commerce to social media, video, and online news platforms. • Such systems help users navigate a huge selection of items and offer unprecedented opportunities to meet a variety of special needs and user tastes. Making sense of a large number of products and driving users to new and previously unknown items is key to enhancing user experience and satisfaction. • Relationships between the movies and TV shows of interest and human behavior (sentiment analysis) help in finding acceptable alternatives.

  7. Background Technologies used • Link Prediction: Link prediction is an important task for analyzing social networks, and it also has applications in other domains such as information retrieval, bioinformatics, and e-commerce. • Common Neighbors • Adamic-Adar • PageRank • Jaccard's coefficient • Delta
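For reference, the neighborhood-based scores named on this slide are usually defined as follows, where Γ(x) denotes the set of neighbors of node x. These are the standard textbook forms, not necessarily the exact variants used in the project; PageRank and "Delta" are left out because the slides give no parameters or definition for them.

```latex
\begin{align*}
\mathrm{CN}(x, y) &= |\Gamma(x) \cap \Gamma(y)| \\
\mathrm{Jaccard}(x, y) &= \frac{|\Gamma(x) \cap \Gamma(y)|}{|\Gamma(x) \cup \Gamma(y)|} \\
\mathrm{AA}(x, y) &= \sum_{z \in \Gamma(x) \cap \Gamma(y)} \frac{1}{\log |\Gamma(z)|}
\end{align*}
```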

  8. Background Technologies used • Sentiment Analysis: Sentiment analysis refers to the use of natural language processing, text analysis, and computational linguistics to identify and extract subjective information from source materials. • Linguistic Inquiry and Word Count: The LIWC tool is used to obtain the tone of users toward their purchased products.

  9. Agenda • Background • Analysis • Dataset • Problem formulation • Methodology • Simulation • Preprocessing • Link Prediction • Sentiment Analysis • Experimental results • Conclusions and future work

  10. Analysis Amazon Dataset • The dataset obtained from Prof. Julian McAuley is 18 GB in size.

  11. Analysis • The movies and TV shows subset of the data is 3.66 GB in size.

  12. Analysis • The slide shows a sample record for one purchase, with the different parameters related to that purchase.

  13. Analysis Problem Formulation • A and B are items; 1, 2, 3, 4, and 5 are the customers who have bought these items. • Items A and B have customers 2, 3, and 4 in common. • Based on this, we can predict that customer 1 may buy item B and customer 5 may buy item A.
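The toy example on this slide can be written out with plain Python sets. This is only an illustrative sketch: the assignment of customer 1 to item A and customer 5 to item B is inferred from the predictions stated on the slide, and the variable names are made up for the example.

```python
# Customers who bought each item, as implied by the slide:
# 2, 3 and 4 bought both; 1 bought only A; 5 bought only B.
purchases = {
    "A": {1, 2, 3, 4},
    "B": {2, 3, 4, 5},
}

# Customers the two items have in common.
common = purchases["A"] & purchases["B"]

# Customers who bought one item but not the other are the
# candidates for a predicted purchase of the missing item.
candidates_for_b = purchases["A"] - purchases["B"]
candidates_for_a = purchases["B"] - purchases["A"]

print(sorted(common), sorted(candidates_for_b), sorted(candidates_for_a))
```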

  14. Analysis Methodology • After extracting the required parameters from the Amazon dataset, we created a bipartite graph between items and people. • Bipartite graphs were created for 280 items for the months of January and February. • These graphs were then projected onto items to get adjacency matrices for January and February. • These adjacency matrices show which items have people in common.
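The projection step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the presenters' implementation: the input format (customer, item pairs), the function name `project_onto_items`, and the choice to store the number of shared customers as the edge weight (rather than a 0/1 flag) are all assumptions.

```python
from itertools import combinations


def project_onto_items(purchases):
    """Project a customer-item bipartite graph onto items.

    purchases: iterable of (customer, item) pairs.
    Returns (items, adj), where adj[i][j] is the number of customers
    that items[i] and items[j] have in common (0 on the diagonal).
    """
    buyers = {}
    for customer, item in purchases:
        buyers.setdefault(item, set()).add(customer)

    items = sorted(buyers)
    index = {item: i for i, item in enumerate(items)}
    n = len(items)
    adj = [[0] * n for _ in range(n)]

    for a, b in combinations(items, 2):
        shared = len(buyers[a] & buyers[b])
        adj[index[a]][index[b]] = shared
        adj[index[b]][index[a]] = shared
    return items, adj
```

Calling the function once on the January purchases and once on the February purchases would give the two item-by-item matrices the slide refers to; thresholding the weights at 1 would give a binary adjacency matrix instead.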

  15. Analysis Spy Graph of Adjacency Matrix projected on items for January

  16. Analysis Spy Graph of Adjacency Matrix projected on items for February

  17. Analysis Link Prediction of adjacency projected on items • The table on this slide reports the area under the precision-recall curve (PRAUC) for the different link prediction algorithms. • Based on these results, we chose the common neighbors link predictor.
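The slides do not say how PRAUC was computed. One common way to estimate the area under a precision-recall curve is average precision, sketched below in Python; treat it as a stand-in for whatever tool was actually used. The function name and the input format (one score and one 0/1 label per candidate item pair) are assumptions, and tied scores are not given any special handling.

```python
def average_precision(scores, labels):
    """Estimate the area under the precision-recall curve.

    scores: predicted link scores, one per candidate pair.
    labels: 1 if the link is actually present in the target month, else 0.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_positives = sum(labels)
    if total_positives == 0:
        return 0.0

    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            # Precision at the rank where this true link is retrieved.
            precision_sum += hits / rank
    return precision_sum / total_positives
```

A predictor that ranks every true link above every non-link scores 1.0; comparing this number across predictors is the kind of comparison the slide's table summarizes.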

  18. Analysis Predicted Spy Graph of Adjacency Matrix projected on items for February

  19. Analysis Sentiment Analysis of Customer review text • The Linguistic Inquiry and Word Count (LIWC) program includes the main text analysis module along with a group of built-in dictionaries. • LIWC reads written or transcribed verbal texts which have been stored in a digital, computer-readable form (such as text files). • The text analysis module then compares each word in the text against a user-defined dictionary. • The dictionary identifies which words are associated with which psychologically-relevant categories.
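The dictionary lookup described above can be illustrated with a toy Python sketch. It is not LIWC and does not use the LIWC dictionaries: the four-word dictionary, the category labels `posemo` and `negemo`, the tokenizer, and the choice to report each category as a percentage of all words are assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical toy dictionary: word -> set of categories.
TOY_DICTIONARY = {
    "great": {"posemo"},
    "love": {"posemo"},
    "boring": {"negemo"},
    "terrible": {"negemo"},
}


def category_percentages(text, dictionary=TOY_DICTIONARY):
    """Return each category's share of the words in text, in percent."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return {}

    counts = Counter()
    for word in words:
        # A word may belong to several categories, or to none.
        for category in dictionary.get(word, ()):
            counts[category] += 1
    return {category: 100.0 * n / len(words) for category, n in counts.items()}
```

For the five-word review "Great movie, I love it", two words fall in the positive category, so the sketch reports 40 percent for `posemo` and nothing for `negemo`.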

  20. Agenda • Background • Analysis • Dataset • Problem formulation • Methodology • Simulation • Preprocessing • Link Prediction • Sentiment Analysis • Experimental results • Conclusions and future work

  21. Simulation Preprocessing • The Amazon dataset has 9 parameters. • The data was extracted using a regular expression function that took the following patterns as inputs:
      pat1 = '"reviewerID": "(\w*)"';
      pat2 = '"asin": "(\w*)"';
      pat3 = '"reviewerName": "(.*?)"';
      pat4 = '"helpful": \[(.*?)\]';
      pat5 = '"reviewText": "(.*?)"';
      pat6 = '"overall": (\d*\.\d*)';
      pat7 = '"summary": "(.*?)"';
      pat8 = '"unixReviewTime": (\d*)';
      pat9 = '"reviewTime": "(.*?)"';
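As a rough illustration of how such patterns pull fields out of one review record, here is a Python sketch using four of the nine patterns from the slide. The sample line is invented for the example (it is not taken from the dataset), and the function name `extract_fields` is hypothetical.

```python
import re

# Four of the slide's patterns, keyed by the field they capture.
PATTERNS = {
    "reviewerID": r'"reviewerID": "(\w*)"',
    "asin": r'"asin": "(\w*)"',
    "overall": r'"overall": (\d*\.\d*)',
    "unixReviewTime": r'"unixReviewTime": (\d*)',
}


def extract_fields(line):
    """Apply each pattern to one record; missing fields come back as None."""
    record = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, line)
        record[name] = match.group(1) if match else None
    return record


# Made-up record in the same shape the patterns expect.
sample = ('{"reviewerID": "A1EXAMPLE", "asin": "B000EXAMPLE", '
          '"overall": 5.0, "unixReviewTime": 1388534400}')
print(extract_fields(sample))
```

All captured values are strings, so the rating and the timestamp would still need converting with `float` and `int`. If every line of the file is valid JSON, parsing it with the standard `json` module would likely be more robust than regular expressions, for example when review text contains escaped quotes.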

  22. Simulation • Out of the 17 years of data (1997-2014), only the data for January and February of 2014 was considered. • Our aim was to use the January data to predict the increase or decrease in item sales in the February data. • Since we projected the bipartite matrices onto items, we obtained adjacency matrices of size 280 × 280.
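Selecting the two months can be done from the `unixReviewTime` field. The Python sketch below assumes each record is a dict with an integer `unixReviewTime` and that timestamps are interpreted in UTC; neither detail is stated in the slides, and the function names are hypothetical.

```python
from datetime import datetime, timezone


def month_of(unix_time):
    """Return (year, month) for a Unix timestamp, interpreted as UTC."""
    moment = datetime.fromtimestamp(unix_time, tz=timezone.utc)
    return moment.year, moment.month


def split_jan_feb_2014(records):
    """Split records into January 2014 and February 2014 lists."""
    january, february = [], []
    for record in records:
        year_month = month_of(record["unixReviewTime"])
        if year_month == (2014, 1):
            january.append(record)
        elif year_month == (2014, 2):
            february.append(record)
    return january, february
```

Records from any other month are simply dropped, which matches the slide's statement that only these two months were considered.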

  23. Simulation Link Prediction • As the PRAUC of the common neighbors predictor was the highest, we chose the results given by this predictor. • The common neighbors score is given by |Γ(x)∩Γ(y)|, where x and y are the nodes under consideration and Γ(x) is the set of neighbors of x.
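The common neighbors score |Γ(x)∩Γ(y)| can be computed directly from an adjacency matrix. The Python sketch below is illustrative only: it takes the matrix as a list of lists, treats any nonzero entry as an edge, and ignores self-loops, all of which are assumptions rather than details from the slides.

```python
def common_neighbor_scores(adj):
    """Return a matrix whose (x, y) entry is |Γ(x) ∩ Γ(y)|.

    adj: square adjacency matrix as a list of lists; any nonzero
    entry counts as an edge, and diagonal entries are ignored.
    """
    n = len(adj)
    neighbors = [
        {j for j in range(n) if adj[i][j] and j != i} for i in range(n)
    ]
    scores = [[0] * n for _ in range(n)]
    for x in range(n):
        for y in range(x + 1, n):
            shared = len(neighbors[x] & neighbors[y])
            scores[x][y] = shared
            scores[y][x] = shared
    return scores
```

Pairs that are not yet linked but have a high score are the ones such a predictor would propose as new links for the following month.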

  24. Simulation Sentiment Analysis • Sentiment analysis was carried out on the text reviews written by customers in January and February. • The text was passed through the LIWC analyzer, and the reviews were analyzed on parameters such as word count, tone, authenticity, and clout. Tone was used as the primary attribute. • LIWC2015 includes both positive emotion and negative emotion dimensions; the Tone variable combines the two dimensions into a single summary variable (Cohn, Mehl, & Pennebaker, 2004). • The algorithm is built so that the higher the number, the more positive the tone. Numbers below 50 suggest a more negative emotional tone.

  25. Simulation Results • For demonstration purposes we have chosen two items, one showing an increase and the other a decrease in percentage of sales from January to February. • Item 217 shows an increase in sales percentage and item 127 shows a decrease in sales percentage.

  26. Simulation • The tone scores of the different purchases were averaged in order to get a clear idea of the change in the sales of the items.
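The averaging step can be sketched as follows in Python. The tone scores here are invented numbers standing in for LIWC output, the function names are hypothetical, and treating a score of exactly 50 as positive-leaning is an assumption (the slides only say that values below 50 suggest a more negative tone).

```python
from statistics import mean


def average_tone(tone_scores):
    """Average the per-review tone scores for one item in one month."""
    return mean(tone_scores)


def describe_tone(avg, threshold=50):
    """Label an average tone using the below-50 rule of thumb."""
    return "negative-leaning" if avg < threshold else "positive-leaning"


# Invented scores for one item; real values would come from LIWC.
january_tone = average_tone([60.0, 80.0])
february_tone = average_tone([30.0, 50.0])
print(january_tone, describe_tone(january_tone))
print(february_tone, describe_tone(february_tone))
```

Comparing the two monthly averages for an item is one simple way to relate a shift in review tone to the change in that item's sales.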

  27. Agenda • Background • Analysis • Dataset • Problem formulation • Methodology • Simulation • Preprocessing • Link Prediction • Sentiment Analysis • Experimental results • Conclusions and future work

  28. Conclusion • We comprehensively evaluate the Amazon dataset in terms of link prediction and sentiment analysis. • We adapt several link prediction algorithms to this dataset to both predict links and infer human notions. • Our evaluation with a large-scale Amazon dataset demonstrates performance improvement for each of these generalized algorithms on both link prediction and sentiment analysis.

  29. Future Work • The common neighbors link predictor only ever adds newly predicted links; it does not deal with link deletion. We have to look for a model that works together with the link predictor and deletes links where necessary. • The algorithms used for performing sentiment analysis do not give good results for sentences containing conjunctions, so there is a need for a much better algorithm. • This project can be further extended to create a recommender system.

  30. References • Image-based recommendations on styles and substitutes. J. McAuley, C. Targett, J. Shi, A. van den Hengel. SIGIR, 2015. • Inferring networks of substitutable and complementary products. J. McAuley, R. Pandey, J. Leskovec. Knowledge Discovery and Data Mining, 2015. • Jointly Predicting Links and Inferring Attributes using a Social-Attribute Network (SAN). Neil Zhenqiang Gong, Ameet Talwalkar, Lester Mackey, Ling Huang. • Evolution of Social-Attribute Networks: Measurements, Modeling, and Implications using Google+. Prateek Mittal, Wenchang Xu, Ling Huang, Emil Stefanov, Vyas Sekar, Neil Zhenqiang Gong. • Linguistic markers of psychological change surrounding September 11, 2001. M. A. Cohn, M. R. Mehl, J. W. Pennebaker. 2004.

  31. Thank You!
