
Assortative Mixing in the Amazon Book Reviewer Network

Assortative Mixing in the Amazon.com Book Reviewer Network. CSE 5810 Complex Networks, Ben Collingsworth, April 21, 2010. Department of Computer Sciences, Florida Institute of Technology. Email: bcolling@fit.edu. Proposal: demonstrate assortativity in the Amazon.com book reviewer network.


Presentation Transcript


  1. Assortative Mixing in the Amazon.com Book Reviewer Network
     CSE 5810 Complex Networks
     Ben Collingsworth, April 21, 2010
     Department of Computer Sciences, Florida Institute of Technology
     Email: bcolling@fit.edu

  2. Proposal
     Demonstrate assortativity in the Amazon.com book reviewer network: are people balanced in the materials they read, or do they tend to stay within a range of their bias and inclination?

  3. Related Work
     • Identifying the role that individual animals play in their social network [5]:
       • Examines assortativity in a community of 62 dolphins.
       • Vertices represent dolphins.
       • An edge exists between two dolphins if the association between the pair is higher than expected by chance.

  4. Related Work
     • Complex network study of Brazilian soccer players [11]:
       • Bipartite network of soccer club and soccer player vertices, taken from the clubs and players that participated in the Brazilian soccer championship from 1971 to 2002.
       • An edge is created between a club and a player if the player has been employed by the club.
       • The network is found to be assortative, with a value of 0.12.
       • Assortativity rises over time, from 0.02 in 1975 to 0.12 in 2002.
       • The rise is attributed to a growing segregationist pattern, in which preferential transfers of players between teams occur.

  5. Related Work
     • Statistical Analysis of Network Data: Methods and Models [4]:
       • Analysis of assortativity in the Internet2 backbone (Abilene).
       • Vertices consist of network components (aggregation points, connectors, exchanges, and participants).
       • An edge exists between two vertices if they are physically connected.
       • Assortativity is based on the node type attribute.
       • High negative assortativity is found, with a value of -0.3162.
       • Negative assortativity is expected from a hierarchical network.

  6. The Book Reviewer Network
     • Books are represented by vertices.
     • An edge exists between two books if they are reviewed by the same reviewer.
     • Edges are undirected.
     • Multiple edges between two vertices are recorded in the edge weight.
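The network definition above can be illustrated with a minimal Python sketch. It assumes the review data is available as (reviewer, book) pairs; that input format and the name `build_book_network` are hypothetical and do not come from the slides, which used Java and MySQL for the actual construction.

```python
from collections import defaultdict
from itertools import combinations


def build_book_network(reviews):
    """Build an undirected, weighted book-to-book network.

    `reviews` is an iterable of (reviewer_id, book_id) pairs (assumed format).
    Returns {(book_a, book_b): weight} with book_a < book_b, where the weight
    counts how many reviewers reviewed both books.
    """
    books_by_reviewer = defaultdict(set)
    for reviewer, book in reviews:
        books_by_reviewer[reviewer].add(book)

    weights = defaultdict(int)
    for books in books_by_reviewer.values():
        # Each pair of books sharing this reviewer gains one unit of weight.
        for a, b in combinations(sorted(books), 2):
            weights[(a, b)] += 1
    return dict(weights)


# Toy data: reviewer r1 reviewed A and B; reviewer r2 reviewed A, B and C.
sample = [("r1", "A"), ("r1", "B"), ("r2", "A"), ("r2", "B"), ("r2", "C")]
edges = build_book_network(sample)
```

Storing each pair in sorted order keeps the edges undirected, and the integer weight stands in for the multiple parallel edges mentioned on the slide.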

  7. Tools Used to Create Network
     • Java 2 Standard Edition (J2SE) Java development and runtime environment
     • Eclipse Integrated Development Environment (IDE)
     • MySQL Relational Database Management System
     • BioLayout Express 3D
     • Network Workbench

  8. Network Data Collection
     • A Web crawler was used to collect data:
       • The crawler was run in two phases to collect data in two disparate categories.
       • The two categories for the book collection were "George W. Bush" and "Barack Obama".
       • Five starting books were used to begin each phase of the data collection.
       • The first phase was started using five books describing George W. Bush. These books describe Bush's background, beliefs, and accomplishments in a positive manner.
       • In the second phase, five books describing Barack Obama were used. Similarly, these books describe Obama positively.

  9. Data Collection Algorithm
     while number of books is less than maximum
       for each book in book URL list
         extract book information
         if the book already exists in the database and the category of the book is different in the database,
           change the category to "Common"
         otherwise,
           store book information in database
         extract reviewer URLs for book
         for each reviewer
           extract review information and store in database
           extract URLs of other books read by reviewer
           add book URLs to next level book URL list
         end for
       end for
       assign next level book URL list to book URL list
     end while
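The level-by-level collection can be sketched in Python as follows. This is an illustration under stated assumptions, not the author's Java crawler: the page-fetching steps are replaced by two caller-supplied lookup functions (`reviewers_of` and `books_reviewed_by`, both hypothetical), the database is replaced by an in-memory dict and set, and a `visited` set is added so the loop terminates on small inputs. The limits of 12 reviews per book and 10 books per reviewer are the ones stated on the final slide.

```python
def collect_phase(seed_urls, category, books, reviews,
                  reviewers_of, books_reviewed_by,
                  max_books, max_reviews=12, max_next=10):
    """Run one collection phase (for example "Bush" or "Obama").

    books:   dict mapping book URL -> category, shared across phases
    reviews: set of (reviewer, book URL) pairs, shared across phases
    reviewers_of / books_reviewed_by: hypothetical stand-ins for the crawler's
    page extraction; each returns a list of URLs or identifiers.
    """
    visited = set()  # assumption: expand each book once per phase
    level = list(seed_urls)
    while level and len(books) < max_books:
        next_level = []
        for url in level:
            if url in visited:
                continue
            visited.add(url)
            if url in books and books[url] != category:
                books[url] = "Common"   # seen in both phases
            else:
                books[url] = category
            for reviewer in reviewers_of(url)[:max_reviews]:
                reviews.add((reviewer, url))
                next_level.extend(books_reviewed_by(reviewer)[:max_next])
        level = next_level
    return books, reviews


# Toy stand-ins for the extracted pages.
reviewers = {"bush1": ["r1"], "obama1": ["r2"], "shared": []}
recent = {"r1": ["shared"], "r2": ["shared"]}
look_reviewers = lambda url: reviewers.get(url, [])
look_books = lambda reviewer: recent.get(reviewer, [])

books, reviews = {}, set()
collect_phase(["bush1"], "Bush", books, reviews, look_reviewers, look_books, 100)
collect_phase(["obama1"], "Obama", books, reviews, look_reviewers, look_books, 100)
```

As in the pseudocode, the maximum is checked only between levels, so a level that has been started is always completed.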

  10. Book Reviewer Networks Nodes and Edges

  11. Level 3 Network Visualization

  12. Level 1 Network Visualization

  13. Book Reviewer Network Level 3 - Degree Probability Distribution

  14. Book Reviewer Network Level 3 - LOG10 Degree Probability

  15. Book Reviewer Network Property Comparison

  16. Book Reviewer Network Assortativity Assortativity calculated using the Newman equation:
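The equation itself is not reproduced in this transcript. As a hedged illustration, the sketch below assumes the categorical form of the coefficient from Newman's "Mixing patterns in networks" [9], r = (Σ_i e_ii − Σ_i a_i b_i) / (1 − Σ_i a_i b_i), where e_ij is the fraction of edge ends joining category i to category j, a_i = Σ_j e_ij and b_j = Σ_i e_ij. Whether the presentation used this form or the degree-based form is not stated, and the function name and input format are hypothetical.

```python
def attribute_assortativity(edges, category):
    """Newman's assortativity coefficient for a categorical node attribute.

    edges:    iterable of (u, v, weight) for an undirected network
    category: dict mapping node -> label (e.g. "Bush", "Obama", "Common")
    Raises ZeroDivisionError if every edge end falls in a single category.
    """
    labels = sorted(set(category.values()))
    index = {label: i for i, label in enumerate(labels)}
    n = len(labels)
    e = [[0.0] * n for _ in range(n)]
    total = 0.0
    for u, v, w in edges:
        i, j = index[category[u]], index[category[v]]
        # Count each undirected edge in both directions so e is symmetric.
        e[i][j] += w
        e[j][i] += w
        total += 2 * w
    e = [[x / total for x in row] for row in e]

    trace = sum(e[i][i] for i in range(n))
    a = [sum(row) for row in e]                               # row sums
    b = [sum(e[i][j] for i in range(n)) for j in range(n)]    # column sums
    ab = sum(x * y for x, y in zip(a, b))
    return (trace - ab) / (1 - ab)


# Toy network: two "Bush" books, two "Obama" books, one cross-category edge.
toy_category = {"b1": "Bush", "b2": "Bush", "o1": "Obama", "o2": "Obama"}
toy_edges = [("b1", "b2", 3), ("o1", "o2", 3), ("b2", "o1", 2)]
r = attribute_assortativity(toy_edges, toy_category)
```

In the toy network three quarters of the edge weight stays within a category, which gives r = 0.5; a value of 0 would indicate no preference and negative values indicate disassortative mixing.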

  17. Conclusions
     • The book reviewer networks were demonstrated to be assortative.
     • The book reviewer networks remained assortative even as books and reviewers became further removed from the original "seed" books through additional iterations of the collection algorithm.
     • The assortativity shown in the book reviewer networks reveals a heterogeneity in the types of books people read.
     • People tend to read a range of books that match their personal biases and inclinations.

  18. Future Work
     • Further exploration into the degradation of the book reviewer network as levels deepen.
     • Analysis of the network with an increased number of "seed" books.
     • Investigation into the validity of the assortativity calculation for unbalanced networks.

  19. Bibliography
     1. B. Collingsworth and R. Menezes. Identification of social tension in organizational networks. In Complex Networks, pages 209-223. Springer Berlin / Heidelberg, May 2009.
     2. R. Dawkins. The God Delusion. Houghton Mifflin Harcourt, 2006.
     3. H. Ebel, L.-I. Mielsch, and S. Bornholdt. Scale-free topology of e-mail networks. Phys. Rev. E, 66(3):035103(1-4), Sep 2002.
     4. E. D. Kolaczyk. Statistical Analysis of Network Data: Methods and Models (Springer Series in Statistics). Springer, 1st edition, 2009.
     5. D. Lusseau and M. E. J. Newman. Identifying the role that individual animals play in their social network. Proc. R. Soc. London B, 271:S477, 2004.
     6. J. Matlis. Internet2. Computerworld, August 2006. Available for download at http://www.computerworld.com/s/article/9002735/Internet2.
     7. M. McPherson, L. Smith-Lovin, and J. M. Cook. Birds of a feather: Homophily in social networks. Annual Review of Sociology, 27(1):415-444, 2001.
     8. M. E. J. Newman. The structure of scientific collaboration networks. Proc. Natl. Acad. Sci. U.S.A., 98(2):404-409, January 2001.
     9. M. E. J. Newman. Mixing patterns in networks. Phys. Rev. E, 67(2):026126, Feb 2003.
     10. M. E. J. Newman. The structure and function of complex networks. SIAM Review, 45:167-256, 2003.
     11. R. N. Onody and P. A. de Castro. Complex network study of Brazilian soccer players. Phys. Rev. E Stat. Nonlin. Soft Matter Phys., 70(3 Pt 2):037103, 2004.

  20. "Bush" connected to "Obama"
     • Reviews of books are sorted by "Most Helpful Customer Reviews".
     • The next level book URL list, i.e. other books read by the reviewer, is sorted by "Most Recently Reviewed".
     • Not every review of a book in the book URL list is saved; the limit is 12 reviews per book.
     • Not every book reviewed by a reviewer is used in the next level book URL list; the limit is 10 books.
     • A reviewer is saved in the list of reviews of a book in the "Bush" phase:
       • In the "Obama" phase, a book reviewed by the same reviewer is seen that was not seen in the "Bush" phase.
       • Hence, books from the "Bush" phase that were recorded from the reviewer will have edges to books in the "Obama" phase that were reviewed by the same reviewer.
     • "Bush" to "Obama" connections are limited to a few hundred edges in the level 3 network.
     • These edges are included in the assortativity calculation.
