Project topics – Private data management Nov. 2011
Topic 1: Survey on the Status of Privacy Specifications in Online Social Networks
• Study at least 8 online social networks (OSNs), including Facebook, LinkedIn, Google+ and Flickr, and report on how each of them handles privacy specifications.
• The expected output of this study is:
  • A characterization of what a user is allowed to specify: "what piece of information" (e.g., photo, wall post, status update) is visible to "what type of users" (e.g., friends, friends-of-friends, lists), and what the default setting is. As an example of the expected output, consider Table 1 in [1], but in more detail (*)
  • A ranking of the 8 OSNs with regard to how private they are, using one or more appropriate metrics, for example based on ideas from [2]
• Read [3] for some nice ideas on how to improve the current situation.
• [1] Barbara Carminati, Elena Ferrari, Andrea Perego: Enforcing access control in Web-based social networks. ACM Trans. Inf. Syst. Secur. 13(1): (2009)
• [2] Kun Liu, Evimaria Terzi: A Framework for Computing the Privacy Scores of Users in Online Social Networks. TKDD 5(1): 6 (2010)
• [3] Krishna P. Gummadi, Alan Mislove, and Balachander Krishnamurthy. Addressing the Privacy Management Crisis in Online Social Networks. In The IAB Workshop on Internet Privacy, December 2010. (Position Paper)
Topic 1: Survey on the Status of Privacy Specifications in Online Social Networks
(*) [Figure: Table 1 of Carminati et al., Enforcing access control in Web-based social networks]
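For the ranking task of Topic 1, one possible starting point is to turn each OSN's default settings into a single number. The sketch below is only an illustration and is not the method of any of the cited papers; it borrows just the general idea of weighting how visible an item is by how sensitive it is. The audience levels, the sensitivity weights, and the two networks "OSN-A" and "OSN-B" are all hypothetical placeholders that you would replace with the values collected in your survey.

```python
# Hypothetical exposure of each audience type, from 0.0 (nobody) to 1.0 (everyone).
AUDIENCE_EXPOSURE = {
    "only_me": 0.0,
    "friends": 0.25,
    "friends_of_friends": 0.5,
    "network": 0.75,
    "everyone": 1.0,
}

# Hypothetical sensitivity weight per piece of information.
SENSITIVITY = {"photo": 3, "wall_post": 2, "status_update": 1}


def exposure_score(default_settings, sensitivity=SENSITIVITY):
    """Sensitivity-weighted average exposure of the default settings.

    default_settings maps each item to the audience that sees it by default.
    The result lies in [0, 1]; lower means more private defaults.
    """
    weighted = sum(
        sensitivity[item] * AUDIENCE_EXPOSURE[audience]
        for item, audience in default_settings.items()
    )
    total_weight = sum(sensitivity[item] for item in default_settings)
    return weighted / total_weight


def rank_by_privacy(osn_defaults):
    """Return OSN names ordered from most private to least private defaults."""
    return sorted(osn_defaults, key=lambda name: exposure_score(osn_defaults[name]))


# Made-up survey results for two fictional networks.
SURVEY = {
    "OSN-A": {"photo": "friends", "wall_post": "friends", "status_update": "everyone"},
    "OSN-B": {"photo": "everyone", "wall_post": "friends_of_friends", "status_update": "everyone"},
}

if __name__ == "__main__":
    for name in rank_by_privacy(SURVEY):
        print(name, round(exposure_score(SURVEY[name]), 3))
```

A score like this only captures default settings; how fine-grained the available controls are would need a separate metric.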
Topic 2: Experimental Evaluation of the Privacy of a Real OSN
• Choose 2 real data sets from OSNs (or 2 different subsets of the same data set).
• Build the corresponding social network graphs. Check the web page for links to available datasets.
• Evaluate the resulting graphs in terms of
  • (1) k-degree anonymity (Liu and Terzi, cited below), and
  • (2) an additional k-anonymity-based criterion of your choice.
• Kun Liu, Evimaria Terzi: Towards identity anonymization on graphs.
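As a starting point for criterion (1) of Topic 2, the check can be sketched in a few lines. A graph is k-degree anonymous when every node shares its degree with at least k-1 other nodes, so the largest such k is simply the size of the smallest group of nodes with equal degree. The sketch assumes an undirected graph given as a list of edge pairs; the function names are illustrative, and isolated nodes are not covered because they do not appear in an edge list.

```python
from collections import Counter


def degree_sequence(edges):
    """Return {node: degree} for an undirected simple graph given as edge pairs."""
    neighbors = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        neighbors.setdefault(u, set()).add(v)  # sets collapse duplicate edges
        neighbors.setdefault(v, set()).add(u)
    return {node: len(adjacent) for node, adjacent in neighbors.items()}


def degree_anonymity_level(edges):
    """Largest k for which the graph is k-degree anonymous (0 for an empty graph)."""
    degrees = degree_sequence(edges)
    if not degrees:
        return 0
    nodes_per_degree = Counter(degrees.values())
    return min(nodes_per_degree.values())


def is_k_degree_anonymous(edges, k):
    """True if every degree value is shared by at least k nodes."""
    return degree_anonymity_level(edges) >= k


# Toy graphs: a 4-cycle (all degrees equal) and a 3-node path.
CYCLE = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
PATH = [("a", "b"), ("b", "c")]

if __name__ == "__main__":
    print(degree_anonymity_level(CYCLE))  # every node has degree 2
    print(degree_anonymity_level(PATH))   # the middle node has a unique degree
```

For real datasets you would read the edge list from a file; the counting logic stays the same.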
Local recoding with hierarchies
• How do we anonymize a table with categorical attributes in the quasi-identifier (QI) set,
  • with local recoding, and
  • with hierarchies playing a role in the process?
• Implement and test the KACA algorithm
• Jiuyong Li, Raymond Chi-Wing Wong, Ada Wai-Chee Fu, Jian Pei. Anonymization by Local Recoding in Data with Attribute Hierarchical Taxonomies. IEEE Trans. Knowl. Data Eng. 20(9): 1181-1194 (2008)
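To make the role of hierarchies in local recoding concrete, the sketch below generalizes one group of rows to the lowest common ancestor of their values, attribute by attribute. It is not the KACA algorithm; it only shows the basic step that a hierarchy-based local recoding method needs, and deciding which rows to group is left out. The hierarchies (child-to-parent dictionaries) and the sample rows are made up for illustration.

```python
def path_to_root(value, parent):
    """List value, its parent, its grandparent, ... up to the hierarchy root."""
    path = [value]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def lowest_common_ancestor(a, b, parent):
    """Most specific value in the hierarchy that generalizes both a and b."""
    ancestors_of_a = set(path_to_root(a, parent))
    for candidate in path_to_root(b, parent):
        if candidate in ancestors_of_a:
            return candidate
    raise ValueError(f"{a!r} and {b!r} share no ancestor")


def generalize_group(rows, hierarchies):
    """Recode every row of a group to the group's common generalization.

    rows is a list of QI tuples; hierarchies holds one child->parent dict
    per QI attribute, in the same order as the tuple positions.
    """
    generalized = []
    for position, parent in enumerate(hierarchies):
        common = rows[0][position]
        for row in rows[1:]:
            common = lowest_common_ancestor(common, row[position], parent)
        generalized.append(common)
    return [tuple(generalized)] * len(rows)


# Hypothetical hierarchies for two categorical QI attributes.
JOB_PARENT = {"programmer": "technical", "engineer": "technical",
              "teacher": "education", "technical": "any_job", "education": "any_job"}
CITY_PARENT = {"Athens": "Greece", "Ioannina": "Greece", "Rome": "Italy",
               "Greece": "Europe", "Italy": "Europe"}

GROUP = [("programmer", "Athens"), ("engineer", "Ioannina")]

if __name__ == "__main__":
    print(generalize_group(GROUP, [JOB_PARENT, CITY_PARENT]))
```

Because only the rows of one group are recoded, other rows with the same original values can keep more detail, which is what distinguishes local from global recoding.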
Local recoding with hierarchies (2)
• Another approach on the topic:
  • “Cut-off” a single ancestor value per detailed value
• Implement and test the proposed algorithm
• Junqiang Liu, Ke Wang. On Optimal Anonymization for L(+)-Diversity. Proceedings of the 26th IEEE International Conference on Data Engineering, March 1-6, 2010, Long Beach, California, USA
Toolkits
• Do something with existing toolkits (Cornell, UT Dallas)
  • Port Cornell’s toolkit to MySQL / a generic DB?
  • Port the UT Dallas toolkit to Java?
  • Convert the UoI code to a toolkit?