The Hadoop RDBMS
Replace Oracle with Hadoop
John Leach, CTO and Co-Founder
Who We Are
The Hadoop RDBMS
• Standard ANSI SQL
• Horizontal Scale-Out
• Real-Time Updates
• ACID Transactions
• Powers OLAP and OLTP
• Seamless BI Integration
Splice Machine Proprietary and Confidential
Serialization and Write Pipelining
• Serialization Goals
  • Disk Usage Parity with Data Supplied
  • Predicate evaluation uses sorted byte[] comparisons
  • Memory and CPU efficient (fast)
  • Lazy Serialization and Deserialization
• Write Pipelining Goals
  • Non-blocking Writes
  • Transactional Awareness
  • Small Network Footprint
  • Handle Failure, Location, and Retry Semantics
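The goal of evaluating predicates with sorted byte[] comparisons depends on an encoding whose byte order matches the value order. The sketch below illustrates that property for 32-bit signed integers using a common sign-bit-flip technique. It is an assumption for illustration, not Splice Machine's actual format (the slides describe scalars as 1-9 bytes), and the function names are hypothetical.

```python
import struct

def encode_int32_sortable(n: int) -> bytes:
    """Encode a signed 32-bit int so that byte order equals numeric order.

    Adding 2**31 flips the sign bit, so negative values sort before
    positive ones when the big-endian bytes are compared as unsigned.
    """
    return struct.pack(">I", n + 2**31)

def decode_int32_sortable(data: bytes) -> int:
    """Inverse of encode_int32_sortable."""
    return struct.unpack(">I", data)[0] - 2**31

def less_than(encoded_field: bytes, bound: int) -> bool:
    """Evaluate `field < bound` without deserializing the stored field."""
    return encoded_field < encode_int32_sortable(bound)
```

With an encoding like this, a scan can filter rows by comparing raw bytes and only deserialize the fields it actually returns, which is in the spirit of the lazy deserialization goal.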
Single Column Encoding
• All columns encoded in a single cell
  • Separated by a 0x00 byte
• Nulls are encoded either as an "explicit null" or as an absent field
• Cell value prefixed by an index containing
  • which fields are present in the cell
  • whether each field is
    • Scalar (1-9 bytes)
    • Float (4 bytes)
    • Double (8 bytes)
    • Other (1-N bytes)
Example Insert
• Table Schema: (a int, b string)
• Insert row (1,'bob'):
  • All columns packed together
    • 1 0x00 'bob'
  • Index prepended
    • {1(s),2(o)} 0x00 1 0x00 'bob'
Example Insert w/ nulls
• Row (1,null)
• Nulls left absent
  • 1
• Index prepended (field b is not present)
  • {1(s)} 0x00 1
Example: Update
• Row already present: {1(s),2(o)}
• set a = 2
• Pack entry
  • 2
• Prepend index (field b is not present)
  • {1(s)} 0x00 2
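The three packing examples (insert, insert with a null, update) can be sketched as a single function. This is a minimal illustration, assuming a readable text index such as `{1(s),2(o)}` and ASCII text for the field values; the real index and value encodings are binary, and the names `pack`, `type_tag`, and `encode_value` are hypothetical.

```python
SEP = b"\x00"  # field separator, as on the slides

def type_tag(value) -> str:
    """Hypothetical one-letter tags mirroring the slides' (s) and (o) notation."""
    if isinstance(value, bool):
        return "o"
    if isinstance(value, int):
        return "s"  # scalar
    if isinstance(value, float):
        return "d"  # double
    return "o"      # other

def encode_value(value) -> bytes:
    # Illustration only: text stands in for the real binary field encodings.
    data = str(value).encode("utf-8")
    if SEP in data:
        raise ValueError("field bytes must not contain the 0x00 separator")
    return data

def pack(fields: dict) -> bytes:
    """Pack {1-based column position: value} into one cell.

    None (or a missing position) means the field is left absent,
    so it appears neither in the index nor in the packed values.
    """
    present = sorted((pos, v) for pos, v in fields.items() if v is not None)
    index = "{" + ",".join(f"{pos}({type_tag(v)})" for pos, v in present) + "}"
    parts = [index.encode("ascii")] + [encode_value(v) for _, v in present]
    return SEP.join(parts)
```

Under these assumptions, `pack({1: 1, 2: "bob"})` yields `b"{1(s),2(o)}\x001\x00bob"`, and an update that sets only column a, `pack({1: 2})`, yields `b"{1(s)}\x002"`.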
Decoding
• Indexes are cached
  • Most data looks like its predecessor
• Values are read in reverse timestamp order
  • Updates before inserts
• Seek through bytes for fields of interest
• Once a field is populated, ignore all other values for that field
Example Decoding
• Start with (NULL,NULL)
• 2 KeyValues present:
  • {1(s)} 0x00 2
  • {1(s),2(o)} 0x00 1 0x00 'bob'
• Read first KeyValue, fill field 1
  • Row: (2,NULL)
• Read second KeyValue, skip field 1 (already filled), fill field 2
  • Row: (2,'bob')
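The decoding walk-through can be sketched as a newest-first merge in which the first value seen for a field wins. The cell format here is an assumed text stand-in for the binary one (a readable index such as `{1(s)}` followed by 0x00-separated ASCII values), and `parse_cell` and `decode_row` are hypothetical names.

```python
SEP = b"\x00"

def parse_cell(cell: bytes) -> dict:
    """Split one cell into {column position: (type tag, raw bytes)}."""
    index, *values = cell.split(SEP)
    entries = [e for e in index.decode("ascii").strip("{}").split(",") if e]
    fields = {}
    for entry, raw in zip(entries, values):
        pos, tag = entry.rstrip(")").split("(")  # "1(s)" -> ("1", "s")
        fields[int(pos)] = (tag, raw)
    return fields

def decode_row(cells_newest_first, num_columns: int) -> list:
    """Merge KeyValues in reverse timestamp order into one row."""
    row = [None] * num_columns
    filled = set()
    for cell in cells_newest_first:
        for pos, (tag, raw) in parse_cell(cell).items():
            if pos in filled:
                continue  # a newer value already populated this field
            row[pos - 1] = int(raw) if tag == "s" else raw.decode("utf-8")
            filled.add(pos)
        if len(filled) == num_columns:
            break  # every field is populated; older cells cannot matter
    return row
```

Tracking filled positions in a set, rather than testing for `None`, leaves room for an explicit null to count as a populated field.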
Index Decoding
• Index encoded differently depending on number of columns present and type
  • Uncompressed: 1 bit for present, 2 bits for type
  • Compressed: Run-length encoded (field 1-3, scalar, 5-8 double…)
  • Sparse: Delta encoded (index,type) pairs
  • Sparse compressed: Run-length encoded (index,type) pairs
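One possible reading of the uncompressed layout is a presence bit per column, followed by two type bits when the column is present. The sketch below assumes that layout and an arbitrary assignment of type codes; neither is confirmed by the slides, and the bits are kept in a plain list rather than packed into bytes.

```python
# Hypothetical 2-bit type codes; the actual assignment is not given in the slides.
TYPE_CODES = {"scalar": 0b00, "float": 0b01, "double": 0b10, "other": 0b11}
TYPE_NAMES = {code: name for name, code in TYPE_CODES.items()}

def encode_index(columns) -> list:
    """columns holds a type name for each present field, None for an absent one."""
    bits = []
    for col in columns:
        if col is None:
            bits.append(0)                        # absent: presence bit only
        else:
            code = TYPE_CODES[col]
            bits += [1, code >> 1, code & 1]      # present: 1 + two type bits
    return bits

def decode_index(bits, num_columns: int) -> list:
    """Inverse of encode_index."""
    columns, i = [], 0
    for _ in range(num_columns):
        if bits[i] == 0:
            columns.append(None)
            i += 1
        else:
            columns.append(TYPE_NAMES[(bits[i + 1] << 1) | bits[i + 2]])
            i += 3
    return columns
```

For the schema (a int, b string), a full row would be described by `encode_index(["scalar", "other"])` and a row with b absent by `encode_index(["scalar", None])`.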
Write Pipeline
• Asynchronous but guaranteed delivery
• Operate in Bulk
  • Row or Size bounded
• Highly Configurable
• Utilizes Cached Region Locations
• Server component modeled after Java's NIO
  • Attach Handlers for different RDBMS features
  • Handle retries, failure, and SQL semantics
    • Wrong Region, Region Too Busy, Primary Key Violation, Unique Constraint Violation
Write Pipeline Base Element
• Rows are encoded into custom KVPairs
  • All rows for a family and column are grouped together
  • <byte[],byte[]>
• Exploded into Put only to write to HBase
• Timestamps added on server side
• Supports Snappy compression
Write Pipeline Client
• Tree Based Buffer
  • Table -> Region -> N Buffers
  • Rows are buffered on client side in memory
  • N is configurable
• When buffer fills
  • Asynchronously write batch to Region
• Handles HBase "difficulties" gracefully
  • Wrong Region
    • Re-bucket
  • Too Busy
    • Add delay and possibly back-off
  • etc.
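The client-side behavior can be sketched as a per-region buffer that flushes when full, re-buckets on a wrong-region error, and backs off when the region is too busy. This is a simplified, synchronous sketch with one buffer per region (the slides describe asynchronous writes and N buffers per region). The class, the exception types, and the `send` and `refresh` callables are all hypothetical and not part of HBase or Splice Machine.

```python
import bisect
import time

class WrongRegion(Exception):
    """Hypothetical error: the cached region location is stale."""

class RegionTooBusy(Exception):
    """Hypothetical error: the server rejected the batch under load."""

class BufferingWriter:
    def __init__(self, region_starts, send, refresh, max_rows=2,
                 max_retries=3, base_delay=0.01, sleep=time.sleep):
        self.region_starts = sorted(region_starts)  # cached region start keys
        self.send = send          # callable(region_start, rows)
        self.refresh = refresh    # callable() -> fresh region start keys
        self.max_rows, self.max_retries = max_rows, max_retries
        self.base_delay, self.sleep = base_delay, sleep
        self.buffers = {}         # region start key -> buffered rows

    def region_for(self, key: bytes) -> bytes:
        """Pick the region whose start key is the greatest one <= key."""
        i = bisect.bisect_right(self.region_starts, key) - 1
        return self.region_starts[max(i, 0)]

    def write(self, key: bytes, value: bytes) -> None:
        region = self.region_for(key)
        buf = self.buffers.setdefault(region, [])
        buf.append((key, value))
        if len(buf) >= self.max_rows:  # row-bounded flush
            self.flush(region)

    def flush(self, region: bytes) -> None:
        rows = self.buffers.pop(region, [])
        if not rows:
            return
        delay = self.base_delay
        for _ in range(self.max_retries):
            try:
                self.send(region, rows)
                return
            except RegionTooBusy:
                self.sleep(delay)  # add delay, then back off
                delay *= 2
            except WrongRegion:
                # Refresh cached locations and re-bucket every row.
                self.region_starts = sorted(self.refresh())
                for key, value in rows:
                    self.write(key, value)
                return
        raise RuntimeError("batch could not be delivered after retries")

    def flush_all(self) -> None:
        for region in list(self.buffers):
            self.flush(region)
```

Injecting `sleep` keeps the back-off testable, and a real pipeline would hand the batch to a background executor instead of sending it inline.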
Write Pipeline Server Side
• Coprocessor based
• Limited number of concurrent writes to a server
  • Excess write requests are rejected
  • Prevents IPC thread starvation
• SQL Based Handlers for parallel writes
  • Indexes, Primary Key Constraints, Unique Constraints
• Writes occur in a single WALEdit on each region
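Rejecting excess writes instead of queueing them can be sketched with a non-blocking semaphore. This is an illustration of the idea only: the real server side runs as an HBase coprocessor in Java, and `WriteGate` and `RegionTooBusy` are hypothetical names.

```python
import threading

class RegionTooBusy(Exception):
    """Hypothetical rejection signal; the client is expected to back off and retry."""

class WriteGate:
    def __init__(self, max_concurrent: int):
        self._permits = threading.Semaphore(max_concurrent)

    def submit(self, write_fn):
        """Run write_fn if a permit is free; otherwise reject immediately.

        Acquiring without blocking means a request thread never waits
        on the gate, so excess load cannot tie up handler threads.
        """
        if not self._permits.acquire(blocking=False):
            raise RegionTooBusy("too many concurrent writes")
        try:
            return write_fn()
        finally:
            self._permits.release()
```

Failing fast pushes the waiting onto the client's retry and back-off logic, which is what keeps the server's request threads free for other work.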
Interests
• Other items we have done or are interested in…
  • Burstable Tries Implementation of Memstore
  • Pluggable Cost Based Genetic Algorithm for Assignment Manager
  • Columnar Representations and in-memory processing
  • Concurrent Bloom Filter (i.e. Thread Safe BitSet)
• We are hiring
  • Just Completed $15M Series B Raise
  • careers@splicemachine.com