iBigTable : Practical Data Integrity for BigTable in Public Cloud CODASPY 2013

iBigTable: Practical Data Integrity for BigTable in Public Cloud CODASPY 2013 Wei Wei, Ting Yu, Rui Xue

iBigTable – Overview Deploying BigTable in a public cloud is an economical solution. However, one may not always trust the public cloud provider. BigTable – Scalable Storage System • Stores large data sets of petabytes or more • Business transactions, software logs, social network messages • Benefits from processing large data sets • Identify business opportunities, find software bugs, mine social relationships • Widely used at Google, Facebook, Twitter • However, small companies and researchers usually lack the capability to deploy BigTable • Large cluster required • Technical difficulties • High maintenance cost

iBigTable – Overview Our Focus • Provide integrity assurance for BigTable in a public cloud Basic Idea • Build a Merkle Hash Tree based authenticated data structure • Decentralize integrity verification across multiple nodes

Agenda Introduction System Model System Design Experimental Evaluation Related Work Conclusion

Merkle Hash Tree (MHT) sroot = S(hroot), hroot = H(h12|h34), h12 = H(h1|h2), h34 = H(h3|h4), h1 = H(d1), h2 = H(d2), h3 = H(d3), h4 = H(d4) Verification Object (VO) • Data returned along with the result and used to authenticate the result Example • To authenticate data d1, the VO for d1 is {h2, h34}
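A minimal sketch of the four-leaf tree and the VO check for d1 may help make the example concrete. It assumes SHA-256 for H and byte concatenation for "|", neither of which the slide specifies, and it omits the signature step S(hroot); the data values are made up for illustration.

```python
import hashlib


def H(data: bytes) -> bytes:
    """Hash function H; SHA-256 is an assumption, the slide does not name one."""
    return hashlib.sha256(data).digest()


# Four data items d1..d4 (made-up values for illustration).
d1, d2, d3, d4 = b"d1", b"d2", b"d3", b"d4"

# Server side: build the tree bottom-up, following the slide's definitions.
h1, h2, h3, h4 = H(d1), H(d2), H(d3), H(d4)
h12 = H(h1 + h2)
h34 = H(h3 + h4)
hroot = H(h12 + h34)


def verify_d1(data: bytes, vo: tuple, trusted_root: bytes) -> bool:
    """Recompute the root from d1 and its VO (h2, h34) and compare it
    with the root hash the verifier already trusts."""
    vo_h2, vo_h34 = vo
    candidate = H(H(H(data) + vo_h2) + vo_h34)
    return candidate == trusted_root


ok = verify_d1(d1, (h2, h34), hroot)
tampered = verify_d1(b"d1-modified", (h2, h34), hroot)
```

The verifier needs only the data item, two hashes, and a trusted root; it never sees d2, d3, or d4.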

BigTable – Data Model A table is a sparse, distributed, persistent multidimensional sorted map (OSDI 2006). Data Model • Table schema only defines its column families • Each family consists of any number of columns • Each column consists of any number of versions • Columns only exist when inserted, NULLs are free • Columns within a family are sorted and stored together • Table contains a set of rows sorted based on row key • Row: a set of column families • Column Family: a set of columns • Cell: arbitrary string (uninterpreted string)

BigTable – Data Organization Master • Responsible for load balancing and assigning tablets Tablet • Root tablet • Metadata tablet • User tablet Tablet Server • Each tablet is stored in only one tablet server • A tablet server can store multiple tablets

BigTable – Data Operations Queries • Single row query by specifying the row key • Range query by specifying start and end row keys • Projection query to retrieve a specific column or column family Changes • Data insert, update, and delete • Tablet split and merge
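The three query types can be illustrated on a toy in-memory table. This is only a sketch: the table contents and function names are hypothetical, versions are omitted, and the range is treated as half-open (start included, end excluded), which is an assumption rather than something the slide states.

```python
from bisect import bisect_left

# Hypothetical table: row key -> column family -> column -> value.
# Versions (timestamps) are omitted to keep the sketch short.
table = {
    "row10": {"info": {"name": "alice", "email": "a@example.com"},
              "stats": {"hits": "3"}},
    "row20": {"info": {"name": "bob"}, "stats": {"hits": "5"}},
    "row30": {"info": {"name": "carol"}},  # sparse: no "stats" family
}


def single_row(table, row_key):
    """Single row query: look up one row by its key."""
    return table.get(row_key)


def range_query(table, start_key, end_key):
    """Range query over sorted row keys; [start_key, end_key) is assumed."""
    keys = sorted(table)
    lo = bisect_left(keys, start_key)
    hi = bisect_left(keys, end_key)
    return [(k, table[k]) for k in keys[lo:hi]]


def project(row, family, column=None):
    """Projection: a whole column family, or one column within it."""
    cols = row.get(family, {})
    if column is None:
        return dict(cols)
    return {column: cols[column]} if column in cols else {}
```

For example, `range_query(table, "row10", "row30")` returns the rows for `row10` and `row20`, and `project(table["row30"], "stats")` returns an empty mapping because that row never stored the family.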

System Model The actual design and deployment of authentication schemes are significantly different. Similar to Database Outsourcing • Host data at an untrusted party and support data retrieval • Principal ideas of integrity verification Different from Database Outsourcing • Data distributed among a large number of nodes • How to handle authenticated data structures during tablet merging or splitting • Impractical to store authenticated structures in a single node • Not scalable to adopt a centralized integrity verification scheme at a single point • Simple data model and query interfaces • Design much simpler and more efficient authenticated structures and protocols to verify data integrity

System Model Assumptions • The public cloud is not trusted, and BigTable is deployed in the public cloud, including the master and tablet servers • The data owner has a public/private key pair, and the public key is known to all • The data owner is the only party who can update data • Public communications go through a secure channel Attacks from the Public Cloud • Return incorrect data by tampering with some data • Return incomplete results by discarding some data • Report that data doesn’t exist or return old data

System Model cont’d Goal • Deploy BigTable over Public Cloud with Practical Integrity Assurance Design Goals • Security (Integrity) • Correctness, completeness, freshness • Practicability • Simplicity, flexibility, efficiency

System Design Basic Idea • Embed an MHT-based Authenticated Data Structure in each tablet

Distributed Merkle Hash Tree [Figure: Data Owner (root hash), Root Tablet, Meta Tablet, User Tablets] Pros • Authenticated data distributed across nodes • Only maintain one hash for all data Cons • Require update propagation • Concurrent updates could cause issues • Hard to synchronize hash tree updates • Complicates protocols between tablet servers

Our Design [Figure: Data Owner (root hash), Root Tablet, Meta Tablet, User Tablets]

Our Design [Figure: Data Owner with a separate root hash for each of the Root Tablet, Meta Tablet, and User Tablets]

System Design Data integrity is guaranteed by the correctness of the root hash of the MHT in each tablet. Basic Idea • Embed an MHT-based Authenticated Data Structure in each tablet • Store the root hash of each MHT in a trusted party (e.g., data owner) • Decentralize the integrity verification across multiple tablet servers

Decentralized Integrity Verification • Client and tablet server serving ROOT tablet: 1.1 client sends meta key (root, meta, table name, start row key); 1.2 server generates VO; 1.3 server returns meta row (meta tablet location, start and end keys) and VO; 1.4 client verifies • Client and tablet server serving META tablet: 2.1 client sends meta key (meta, table name, start row key); 2.2 server generates VO; 2.3 server returns meta row (user tablet location, start and end keys) and VO; 2.4 client verifies • Client and tablet server serving USER tablet: 3.1 client sends start and end row keys; 3.2 server generates VO; 3.3 server returns rows within the start and end row keys and VO; 3.4 client verifies
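The three verified hops can be sketched with one small Merkle tree per tablet. Everything here is a simplification under stated assumptions: a binary hash tree stands in for the Merkle B+ tree, SHA-256 stands in for H, rows are looked up by position rather than by key, the row contents and the `TabletServer` class are hypothetical, and only correctness of each returned row is checked (not completeness or freshness).

```python
import hashlib


def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(rows):
    """All levels of a binary Merkle tree; len(rows) must be a power of two."""
    levels = [[H(r) for r in rows]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([H(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def make_vo(levels, index):
    """VO: sibling hashes from leaf to root, each flagged if it sits on the left."""
    vo = []
    for level in levels[:-1]:
        sibling = index ^ 1
        vo.append((level[sibling], sibling < index))
        index //= 2
    return vo


def verify(row, vo, trusted_root):
    h = H(row)
    for sibling, sibling_is_left in vo:
        h = H(sibling + h) if sibling_is_left else H(h + sibling)
    return h == trusted_root


class TabletServer:
    """Hypothetical stand-in for a server holding one tablet."""

    def __init__(self, rows):
        self.rows = rows
        self.levels = build_levels(rows)

    def root_hash(self):
        return self.levels[-1][0]

    def lookup(self, index):
        return self.rows[index], make_vo(self.levels, index)


servers = {
    "ROOT": TabletServer([b"meta-tablet-A@ts2", b"meta-tablet-B@ts3"]),
    "META": TabletServer([b"user-tablet-1@ts4", b"user-tablet-2@ts5"]),
    "USER": TabletServer([b"row10=v10", b"row20=v20", b"row30=v30", b"row40=v40"]),
}
# Root hashes kept by the trusted party, one per tablet.
trusted = {name: srv.root_hash() for name, srv in servers.items()}


def verified_lookup(hops):
    """Follow ROOT -> META -> USER, verifying each reply before moving on."""
    results = []
    for name, index in hops:
        row, vo = servers[name].lookup(index)
        if not verify(row, vo, trusted[name]):
            raise ValueError(f"verification failed at {name} tablet")
        results.append(row)
    return results


result = verified_lookup([("ROOT", 0), ("META", 1), ("USER", 2)])
```

Each hop is checked against that tablet's own root hash, so no tablet server needs to coordinate with another to produce its VO.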

iBigTable – Authenticated Data Structure Signature Aggregation Compared with Merkle Hash Tree • Both can guarantee correctness and completeness • Signature aggregation incurs significant computation cost on the client side and large storage cost on the server side • Not clear how signature aggregation can address freshness MHT-based Authenticated Data Structure • SL-MBT: a single-level Merkle B+ tree • Builds a Merkle B+ tree based on all key-value pairs in a tablet • Each leaf is a hash of a key-value pair • ML-MBT: a multi-level Merkle B+ tree • Builds multiple Merkle B+ trees in three different levels • TL-MBT: a two-level Merkle B+ tree (adopted)

iBigTable – TL-MBT Index Level • Only one tree: the index tree • Each leaf points to a data tree Data Level • Row Tree: generates hashes for all rows; each leaf is a hash of a row • Column Family Tree: generates hashes for a column family of all rows; each leaf is a hash of a column family of a row • Column Tree: generates hashes for a column of all rows; each leaf is a hash of a column of a row
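As a rough illustration of the two levels, the sketch below keeps an index that maps a data-tree name to that tree's leaf hashes and picks a data tree for a given projection. It is not the authors' implementation: the JSON leaf encoding, the function names, the sample rows, and the fallback to a coarser tree when a specific one has not been built are all assumptions, and the Merkle B+ trees over the leaves are left out.

```python
import hashlib
import json


def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf(row_key, content) -> bytes:
    # Deterministic encoding of (row key, content); JSON is an assumption.
    return H(json.dumps([row_key, content], sort_keys=True).encode())


# Hypothetical rows: row key -> column family -> column -> value.
rows = {
    "row1": {"info": {"name": "alice", "email": "a@example.com"},
             "stats": {"hits": "3"}},
    "row2": {"info": {"name": "bob"}, "stats": {"hits": "7"}},
}


def build_index(rows, families=(), columns=()):
    """Index level: one entry per data tree, holding that tree's leaf hashes."""
    keys = sorted(rows)
    index = {("row",): [leaf(k, rows[k]) for k in keys]}
    for fam in families:
        index[("family", fam)] = [leaf(k, rows[k].get(fam, {})) for k in keys]
    for fam, col in columns:
        index[("column", fam, col)] = [
            leaf(k, rows[k].get(fam, {}).get(col)) for k in keys
        ]
    return index


def choose_data_tree(index, family=None, column=None):
    """Pick the most specific data tree that exists for the query.
    Falling back to a coarser tree is an assumption of this sketch."""
    if family and column and ("column", family, column) in index:
        return ("column", family, column)
    if family and ("family", family) in index:
        return ("family", family)
    return ("row",)


index = build_index(rows, families=["info"], columns=[("stats", "hits")])
```

With this index, a projection on `stats:hits` is served by its own column tree, while a projection on `info:name` falls back to the `info` column family tree because no column tree was built for it.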

iBigTable – TL-MBT Verification Object Generation • Find the data tree(s) based on the specific query • Use the data tree(s) to generate VO based on the query range Pros • Performance is comparable to ML-MBT for row-based query • Much more efficient than SL-MBT and ML-MBT for projection query • Flexible authenticated data structure Cons • Update cost may increase by 3 times • Large storage cost if column trees are created

iBigTable – Data Access Range query within a tablet • Locate the metadata tablet, the user tablet, and then the data, each through the tablet server that serves it Range query across tablets • Break a large range into small sub-ranges • Based on the end key of each tablet • Each sub-range falls within a single tablet • Execute the sub-range queries
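Breaking a range at tablet end keys can be sketched in a few lines. The function name and the half-open convention (a tablet covers keys from the previous end key up to, but not including, its own end key) are assumptions made for the example.

```python
from bisect import bisect_right


def split_range(start, end, tablet_end_keys):
    """Split [start, end) into sub-ranges that each fall in one tablet.

    tablet_end_keys must be sorted; tablet i is assumed to cover keys
    k with tablet_end_keys[i-1] <= k < tablet_end_keys[i].
    Returns a list of (tablet_index, sub_start, sub_end).
    """
    subranges = []
    i = bisect_right(tablet_end_keys, start)  # tablet holding `start`
    lo = start
    while lo < end and i < len(tablet_end_keys):
        hi = min(end, tablet_end_keys[i])
        subranges.append((i, lo, hi))
        lo = hi
        i += 1
    return subranges
```

For example, with tablet end keys `[30, 60, 100]`, the range `[20, 70)` becomes three sub-range queries, one per tablet, each of which can be verified against that tablet's own root hash.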

iBigTable – Single Row Update Data Owner and tablet server serving USER tablet: 3.1 data owner sends new row; 3.2 tablet server generates VO; 3.3 tablet server returns partial tree VO; 3.4 data owner verifies and updates tablet root hash Partial Tree Verification Object (VO) • Data included • Only keys and hashes of data for two boundaries • Hashes of nodes for computing the root hash • Keys in related inner nodes • Used for direct update within the range of partial tree
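The key point of the update protocol is that the data owner can derive the new root hash from the VO alone. The sketch below shows this for the simpler case of overwriting an existing row in a binary hash tree; it is a stand-in for the partial Merkle B+ tree on the slide, not a reproduction of it, and SHA-256, the row encoding, and the helper names are assumptions.

```python
import hashlib


def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_root(rows):
    """Root hash of a binary Merkle tree; len(rows) must be a power of two."""
    level = [H(r) for r in rows]
    while len(level) > 1:
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def sibling_path(rows, index):
    """Server side: sibling hashes from the leaf at `index` up to the root."""
    level = [H(r) for r in rows]
    path = []
    while len(level) > 1:
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return path


def root_from(row, path):
    """Owner side: recompute a root hash from one row and its sibling path."""
    h = H(row)
    for sibling, sibling_is_left in path:
        h = H(sibling + h) if sibling_is_left else H(h + sibling)
    return h


rows = [b"10:a", b"20:b", b"30:c", b"40:d"]
trusted_root = build_root(rows)  # root hash stored by the data owner

# Tablet server: return the current row and the VO for the position to update.
index = 2
old_row, vo = rows[index], sibling_path(rows, index)

# Data owner: verify the VO against the trusted root, then derive the new
# root locally, without trusting any hash computed by the server afterwards.
vo_is_valid = root_from(old_row, vo) == trusted_root
new_row = b"30:c2"
new_root = root_from(new_row, vo)

# Tablet server applies the update; an honest server ends up at the same root.
rows[index] = new_row
server_root = build_root(rows)
```

Because the sibling hashes do not change when a single leaf is overwritten, the same VO serves both to check the old state and to compute the new root.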

iBigTable – Single Row Update cont’d [Figure: Initial MB+ row tree of a tablet in a tablet server; root keys 30 and 60, inner-node keys 10, 50, 70, 80, leaf keys 0 to 90 in steps of 10.]

iBigTable – Single Row Update cont’d [Figure: left, a row with new key 45 is inserted into the partial tree VO; right, the partial tree VO after 45 is inserted.]

iBigTable – Efficient Batch Update Single row update is inefficient • One verification per single row Range query is efficient • One verification for multiple rows How can we do batch update like range query? Data Owner and tablet server serving USER tablet: 3.1 data owner requests partial tree VO for a range; 3.2 tablet server generates VO; 3.3 tablet server returns partial tree VO; 3.4 data owner verifies and updates tablet root hash; 3.4 … 3.n data owner sends new rows

iBigTable – Tablet Changes Tablet split • Tablet grows too large • Load balancing • Better management Tablet merge • Only a small amount of data in a tablet • Improve query efficiency How to guarantee data integrity? • Make sure the root hash of each tablet is correctly updated

iBigTable – Tablet Split [Figure (a): An MBT of a tablet in a tablet server (root keys 30 and 60, leaf keys 0 to 90); the tablet is split at key 45.]

iBigTable – Tablet Split cont’d [Figure (b): Partial tree returned to the data owner, showing the left boundary node, the right boundary node, and the two boundary keys.]

iBigTable – Tablet Split cont’d [Figure (c): The data owner splits it into two partial trees, a left partial tree and a right partial tree.]

iBigTable – Tablet Split cont’d [Figure (d): Data owner adjusts the left partial tree and computes the new root hash for the new tablet.]

iBigTable – Tablet Split cont’d [Figure (e): Data owner adjusts the right partial tree and computes the new root hash for the new tablet.]

iBigTable – Tablet Merge [Figure: a left partial tree and a right partial tree are merged into one merged tree.] Data owner merges two partial trees sent from tablet servers into one for the new merged tablet

iBigTable – Experimental Evaluation System Implementation • Implementation based on HBase • Extend some interfaces to specify integrity options • Add new interfaces to support efficient batch updates Experiment Setup • 5 hosts in Virtual Computing Lab (VCL) • Intel(R) Xeon(TM) CPU 3.00GHz • Red Hat Enterprise 5.1, Hadoop-0.20.2, and HBase-0.90.4 • Client network with 30Mbps download and 4Mbps upload

iBigTable – Baseline Ex 1. Time to receive data from server Ex 2. VO size vs # of rows Observations • Transmitting data smaller than 4k takes almost the same time. • The time doubles from 4k to 8k, and so on up to around 64k. • After 64k, the time increases dramatically. • The VO size increases as the range increases, but the VO size per row actually decreases.

iBigTable – Write Ex 3. Write performance. Ex 4. The breakdown of write cost Observations • The performance overhead ranges from 10% to 50%. • iBigTable with Efficient Batch Update only causes a performance overhead of about 1.5%. • Communication cost is high, but computation cost is small, about 2~5%.

iBigTable – Read Ex 5. Read performance Ex 6. The breakdown of read cost Observations • The read performance overhead is small, ranging from 1% to 8%. • The total computation cost of both client and servers is about 1%. • Most of the performance degradation is caused by communication.

iBigTable – TL-MBT Ex 7. TL-MBT update performance. Ex 8. Projection query with TL-MBT Observations • As the number of trees that need to be updated increases, the performance decreases dramatically. • For different data sizes, the performance varies widely across the different cases.

iBigTable – Related Work Research related to BigTable • Performance evaluation [Carstoiu et al., NISS 2010] • High performance OLAP analysis [You et al., IMSCCS 2008] • BigTable in a hybrid cloud [Ko et al., HotCloud 2011] • Integrity layer for cloud storage [Kevin et al., CCS 2009] Outsourcing Database • Different authenticated data structures [DASFAA 2006] • Probabilistic approaches [Xie et al., VLDB 2007] • Approaches to address complex queries [Yang et al., SIGMOD 2009] • Partitioned MHT (P-MHT) [Zhou et al., MS-CIS 2010]

iBigTable – Conclusion Contributions • Explore the practicability of different authenticated data structures • Focus on Merkle Hash Tree based authenticated data structures • Design a set of efficient mechanisms to handle authenticated data structure changes • Efficient data batch update • Handle tablet split and merge • Implement a prototype of iBigTable based on HBase, an open source implementation of BigTable • Conduct experimental evaluation of performance overhead

Thank you Questions?
