Privacy-Preserving Public Auditing for Data Storage Security in Cloud Computing
Presentation Transcript

  1. Privacy-Preserving Public Auditing for Data Storage Security in Cloud Computing • Cong Wang¹, Qian Wang¹, Kui Ren¹ and Wenjing Lou² (¹ Illinois Institute of Technology, ² Worcester Polytechnic Institute) • Proceedings of IEEE INFOCOM 2010 • Computer Systems Lab Group Meeting • Presented by: Zakhia Abichar, February 25, 2010

  2. Cloud Computing [Figure: several users, their data stored in the cloud network, and an external audit party] • With cloud computing, users can store their data remotely in the cloud and use high-quality applications on demand • These run on a shared pool of configurable computing resources • Data outsourcing: users are relieved of the burden of data storage and maintenance • When users put large amounts of data in the cloud, protecting data integrity is challenging • Enabling public auditing for cloud data storage security is therefore important • Users can ask an external audit party to check the integrity of their outsourced data

  3. Third Party Auditor (TPA) [Figure: several users, their data stored in the cloud network, and an external audit party] • The external audit party is called the TPA • The TPA helps the user audit the data • To introduce a TPA securely: • 1) The TPA should audit the data in the cloud without asking for a copy • 2) The TPA should not create new vulnerabilities for user data privacy • This paper presents a privacy-preserving public auditing system for cloud data storage

  4. Outline • Introduction • System and threat model • Proposed scheme • Security analysis & performance evaluation

  5. Introduction • Cloud computing gives flexibility to users • Users pay only for what they use • Users don't need to set up large computers themselves • But the operation is managed by the Cloud Service Provider (CSP) • Users give their data to the CSP, so the CSP has control over the data • The user needs to make sure the data stored in the cloud is correct • Data integrity faces internal threats (e.g., an employee at the CSP) and external threats (hackers) • The CSP might behave unfaithfully • For financial reasons, the CSP might delete data that is rarely accessed • The CSP might hide data loss to protect its reputation

  6. Introduction • How can the correctness of outsourced data be verified efficiently? • Having the user simply download the data is not practical • A TPA can do the check and provide an audit report • The TPA should not read the data content • Legal regulations, such as the US Health Insurance Portability and Accountability Act (HIPAA), also call for this • This paper presents a privacy-preserving third-party auditing protocol • First work in the literature to do this

  7. System and Threat Model • U: the cloud user, who has a large amount of data files to store in the cloud • CS: the cloud server, which is managed by the CSP and has significant data storage and computing power (CS and CSP are treated as the same in this paper) • TPA: the third party auditor, who has expertise and capabilities that U and the CSP don't have. The TPA is trusted to assess the CSP's storage security upon request from U

  8. A note on auditing • What is auditing? • Reference: http://searchcio.techtarget.com/searchCIO/downloads/AuditTheDataOrElse.pdf

  9. A Public Auditing Scheme • This is a framework from previous related work, adapted to suit the goals of this paper • Consists of four algorithms (KeyGen, SigGen, GenProof, VerifyProof) • KeyGen: key generation algorithm, run by the user to set up the scheme • SigGen: used by the user to generate verification metadata, which may consist of MACs, signatures or other information used for auditing • GenProof: run by the cloud server to generate a proof of data storage correctness • VerifyProof: run by the TPA to audit the proof from the cloud server

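  10. [Diagram: protocol flow between user, CSP and TPA] • Setup • The user runs KeyGen to produce public & secret parameters • The user runs SigGen on file F to produce verification metadata • Audit • The TPA issues an audit message or a challenge to the CSP • The CSP runs GenProof on file F and returns a response message • The TPA runs VerifyProof on the response using the verification metadata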

  11. Basic Scheme 1 [Figure: the cloud user splits the file into Block 1 … Block n, uses a key to compute code 1 … code n, and shares the key with the TPA] • The file is divided into blocks, each protected by a Message Authentication Code (MAC) • Setup • User computes the MAC of every file block • Transfers the file blocks & codes to the cloud • Shares the key with the TPA • Audit • TPA demands a random sample of blocks and their codes from the CSP • TPA uses the key to verify the correctness of the file blocks • Drawbacks • The audit requires retrieving the user's data, so it is not privacy-preserving • Communication and computation complexity are linear in the sample size

  12. Basic Scheme 2 [Figure: the user holds Key 1 … Key s; the cloud stores the blocks; the TPA stores the codes computed under each key] • Setup • User picks s keys and computes the MACs of the blocks under each key • User shares the keys and MACs with the TPA • Audit • TPA reveals one of the s keys to the CSP and requests fresh MACs for the blocks • TPA compares the response with the MACs it holds • Improvement over Scheme 1: the TPA doesn't see the data, which preserves privacy • Drawbacks • Each key can be used only once • The TPA has to keep state, remembering which keys have been used • Schemes 1 & 2 are suited only to static data (data that doesn't change at the cloud)
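
A rough sketch of Basic Scheme 2 is given below. It assumes one HMAC-SHA256 value over all blocks per key; the class and function names (`Cloud`, `Auditor`, `user_setup`, `file_mac`) are hypothetical and chosen only for this illustration.

```python
import hashlib
import hmac
import secrets


def file_mac(key: bytes, blocks: list[bytes]) -> bytes:
    """MAC over all blocks, with index and length framing (an assumption)."""
    mac = hmac.new(key, digestmod=hashlib.sha256)
    for i, block in enumerate(blocks):
        mac.update(i.to_bytes(8, "big") + len(block).to_bytes(8, "big") + block)
    return mac.digest()


def user_setup(blocks: list[bytes], s: int) -> tuple[list[bytes], list[bytes]]:
    """User side: pick s keys and precompute one MAC per key for the TPA."""
    keys = [secrets.token_bytes(32) for _ in range(s)]
    macs = [file_mac(k, blocks) for k in keys]
    return keys, macs


class Cloud:
    def __init__(self, blocks: list[bytes]):
        self.blocks = list(blocks)

    def respond(self, key: bytes) -> bytes:
        # The CSP must hold the data to compute a MAC under a fresh key.
        return file_mac(key, self.blocks)


class Auditor:
    """TPA: never sees the blocks, but must track which keys are used up."""

    def __init__(self, keys: list[bytes], macs: list[bytes]):
        self._unused = list(zip(keys, macs))

    def audit(self, cloud: Cloud) -> bool:
        if not self._unused:
            raise RuntimeError("all s keys are used; the user must supply more")
        key, expected = self._unused.pop()  # each key is revealed only once
        return hmac.compare_digest(cloud.respond(key), expected)
```

The `_unused` list is the state the slide refers to: after s audits the auditor cannot run another check without new keys from the user.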

  13. Privacy-Preserving Public Auditing Scheme • Proposed scheme • Uses a homomorphic authenticator • Also uses a random mask achieved by a Pseudo Random Function (PRF) • Homomorphic authenticator [Figure: Block 1, Block 2, … Block k each have their own verification metadata, which combine into one aggregate verification metadata] • A linear combination of data blocks can be verified by looking only at the aggregated authenticator
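
The aggregation property can be illustrated with a toy linear authenticator over a prime field. This is a simplified stand-in, not the paper's construction: it is verifiable only with the secret values `alpha` and `key` (so it is not publicly auditable), it omits the random mask, and all names and parameters here are assumptions made for the sketch.

```python
import hashlib
import hmac

P = 2**61 - 1  # a prime modulus, chosen only for this toy example


def prf(key: bytes, index: int) -> int:
    """Pseudo-random field element for a block index (HMAC-based assumption)."""
    digest = hmac.new(key, index.to_bytes(8, "big"), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % P


def sig_gen(alpha: int, key: bytes, blocks: list[int]) -> list[int]:
    """Per-block authenticator: sigma_i = alpha * m_i + prf(i) mod P."""
    return [(alpha * m + prf(key, i)) % P for i, m in enumerate(blocks)]


def gen_proof(blocks: list[int], tags: list[int],
              challenge: list[tuple[int, int]]) -> tuple[int, int]:
    """Server: combine challenged blocks and tags with the coefficients v_i."""
    mu = sum(v * blocks[i] for i, v in challenge) % P
    sigma = sum(v * tags[i] for i, v in challenge) % P
    return mu, sigma


def verify_proof(alpha: int, key: bytes, challenge: list[tuple[int, int]],
                 mu: int, sigma: int) -> bool:
    """Verifier: check the aggregate without touching individual blocks."""
    expected = (alpha * mu + sum(v * prf(key, i) for i, v in challenge)) % P
    return sigma == expected
```

The response is just two field elements (`mu`, `sigma`) no matter how many blocks are challenged, which is the point of aggregating the verification metadata.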

  14. Privacy-Preserving Public Auditing Scheme • In addition to the aggregate authenticator, the TPA receives a linear combination of file blocks, Σ vi·mi, where the vi are random numbers and the mi are file blocks • If the TPA sees many linear combinations of the same blocks, it might be able to infer the file blocks • Thus, we also use a random mask r provided by the Pseudo Random Function (PRF) • The PRF masks the data • The mask has the property of not affecting the verification metadata [Figure: Block 1 and Block 1 with the PRF mask yield equal verification metadata]

  15. [Diagram: full protocol between user, CSP and TPA] • Setup (KeyGen and SigGen, run by the user) • 1- User generates public and secret parameters: public key (pk) & secret key (sk) • 2- A code is generated for each file block: Block 1, Block 2, … Block n get σ1, σ2, … σn • 3- The file blocks and their codes are transmitted to the cloud • Audit • TPA sends a challenge message to the CSP • It contains the positions of the blocks that will be checked in this audit • GenProof (run by the CSP) • CSP aggregates the authenticators of the blocks selected in the challenge • CSP also makes a linear combination of the selected blocks and applies a mask, using a separate PRF key for each audit • CSP sends the aggregate authenticator & masked combination of blocks to the TPA • VerifyProof (run by the TPA) • TPA derives an aggregate authenticator from the masked linear combination of the requested blocks and compares it to the one received from the CSP

  16. Properties • The amount of data sent from the CSP to the TPA is independent of the data size • It is only a linear combination with a mask, plus the aggregate authenticator • Previous work has shown that if the server is missing 1% of the data, sampling 300 or 460 blocks detects this with a probability larger than 95% or 99%, respectively
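
The 300 and 460 figures can be checked with a short calculation. The sketch below assumes each sampled block is independently corrupted with probability equal to the missing fraction, which is an approximation of sampling without replacement; the function names are illustrative.

```python
import math


def detection_probability(corrupt_fraction: float, sample_size: int) -> float:
    """Chance that at least one sampled block is among the corrupted ones."""
    return 1 - (1 - corrupt_fraction) ** sample_size


def min_samples(corrupt_fraction: float, target: float) -> int:
    """Smallest sample size whose detection probability reaches the target."""
    return math.ceil(math.log(1 - target) / math.log(1 - corrupt_fraction))


if __name__ == "__main__":
    for c in (300, 460):
        print(c, detection_probability(0.01, c))
```

Under this approximation the required sample size depends only on the corrupted fraction and the target confidence, not on the total number of blocks in the file.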

  17. More Possible Extensions • Batch auditing • There are K users with K files on the same cloud • They have the same TPA • The TPA can then combine their queries and save computation time • The comparison function that compares the aggregate authenticators has a property that allows checking multiple messages in one equation • Instead of 2K operations, K+1 suffice • Data dynamics • The data on the cloud may change according to the applications • This is supported by using the Merkle Hash Tree (MHT) data structure • With MHT, data changes in a controlled way; new data is added in specific places • There is more overhead involved; the user sends the tree root to the TPA • This scheme is not evaluated in the paper
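
For readers unfamiliar with the Merkle Hash Tree, the sketch below shows the general idea of committing to a set of blocks with a single root and checking one block against it. It is a generic MHT illustration, not the dynamic-data scheme from the paper; SHA-256, the rule of duplicating the last node on odd levels, and all function names are assumptions.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(blocks: list[bytes]) -> list[list[bytes]]:
    """Return the tree levels from leaf hashes up to the single root."""
    level = [h(b) for b in blocks]
    levels = []
    while len(level) > 1:
        if len(level) % 2:  # odd count: duplicate the last node (assumption)
            level = level + [level[-1]]
        levels.append(level)
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    levels.append(level)  # the root level, one element
    return levels


def root(levels: list[list[bytes]]) -> bytes:
    return levels[-1][0]


def proof(levels: list[list[bytes]], index: int) -> list[tuple[bytes, bool]]:
    """Sibling hashes from leaf to root, each flagged if it sits on the left."""
    path = []
    for level in levels[:-1]:
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))
        index //= 2
    return path


def verify(block: bytes, path: list[tuple[bytes, bool]], expected_root: bytes) -> bool:
    """Recompute the root from one block and its sibling path."""
    node = h(block)
    for sibling, is_left in path:
        node = h(sibling + node) if is_left else h(node + sibling)
    return node == expected_root
```

A verifier who holds only the root can check any single block with a path whose length is logarithmic in the number of blocks, which is why sending the tree root to the TPA is enough.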

  18. Performance • The scheme in reference [11] does not have the privacy-preserving property • With it, the TPA can read the information

  19. Batch Auditing • The number of auditing tasks is increased from 1 to 200 in multiples of 8 • Auditing time per task = total auditing time / number of tasks

  20. Performance with Invalid Responses • In batch auditing, true means that all of the messages are correct • False means at least one is wrong • Divide the batch in half and repeat for the left and right parts • Binary search • [Figure: example with responses 1 to 10] The whole batch 1-10 is wrong; splitting narrows the wrong parts to 1,2,3 and 9,10; splitting again isolates responses 3 and 10
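
The divide-in-half search for invalid responses can be sketched as below. The `batch_ok` callback is a hypothetical stand-in for the batch verification equation: it returns True only if every response in the group it is given is valid.

```python
from typing import Callable, Sequence


def find_invalid(responses: Sequence[int],
                 batch_ok: Callable[[Sequence[int]], bool]) -> list[int]:
    """Recursively split a failing batch until the bad responses are isolated."""
    if batch_ok(responses):
        return []  # the whole group verifies, nothing to find here
    if len(responses) == 1:
        return list(responses)  # a failing group of one is the culprit
    mid = len(responses) // 2
    return (find_invalid(responses[:mid], batch_ok)
            + find_invalid(responses[mid:], batch_ok))


if __name__ == "__main__":
    wrong = {3, 10}  # illustrative: responses 3 and 10 are invalid
    calls = 0

    def batch_ok(group: Sequence[int]) -> bool:
        global calls
        calls += 1
        return not any(r in wrong for r in group)

    print(find_invalid(list(range(1, 11)), batch_ok), "batch checks:", calls)
```

Each invalid response forces the search down another branch of the recursion, so the number of batch checks grows with the number of errors.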

  21. The more errors there are, the more time it takes to find them