310 likes | 402 Vues
The ScaleDB Storage Engine. Enabling high performance and scalability, using a Multi-Table Index, and a Shared-Disk Clustering Architecture. Moshe Shadmon moshe@scaledb.com. Agenda. Overview ScaleDB’s Clustering Architecture Shared-Disk vs. Shared-Nothing
E N D
The ScaleDB Storage Engine Enabling high performance and scalability, using a Multi-Table Index, and a Shared-Disk Clustering Architecture Moshe Shadmonmoshe@scaledb.com
Agenda • Overview • ScaleDB’s Clustering Architecture • Shared-Disk vs. Shared-Nothing • MySQL and a Shared-Disk Storage Engine • ScaleDB Installation • Demo • ScaleDB’s Indexing Technology • Multi-Table Index • Enabling Multi-Table Index in MySQL • Demo • Summary • ScaleDB Status & Product Availability
Overview • Plug-in Storage Engine for MySQL • Main Features: • Shared-Disk Architecture • Innovative Multi-Table Indexing • Transactional • Row-Level Locking • ACID Compliant • Atomicity: All tasks of a transaction performed or none of them are. • Consistency: The database is in a consistent state before and after the transaction. • Isolation: Data is not available in an intermediate state during a transaction • Durability: When a transaction completes, the transaction’s data will persist • Disk-Based Storage Engine
Shared-Disk vs. Shared-Nothing • Manageability • Adaptability • Availability/Fault-Tolerance • Scalability • Performance • Total Cost of Ownership (TCO)
Shared-Nothing: DatabaseInstance 1 Table A Table B Table C Vertical Partitioning DatabaseInstance 1 Table A DatabaseInstance 2 Table B DatabaseInstance 3 Table C
Shared Nothing: Partitioning Your Data…How • Predict usage patterns, application evolution, data growth patterns…all are moving targets • Avoid data skew: bottlenecks caused by frequently accessed data on just a few nodes • Avoid data shipping between nodes • Avoid delays from distributed 2-phase commit • Searches outside the partition column require participation by all nodes • Scaling becomes an exercise in fire fighting
Shared-Nothing:Horizontal Partitioning Logical View salary name age Partitioned by Salary salary name age salary name age salary name age Physical View Horizontal Partitioning – Salary % 3
Shared-Nothing:Horizontal Partitioning Pitfalls • Selections with equality predicates referencing the partitioning attribute are directed to a single node: • Retrieve Emp where salary = 60K • SELECT FROM Emp WHERE salary=60K • Equality predicates referencing a non-partitioning attribute and range predicates are directed to all nodes: • Retrieve Emp where age = 20 • Retrieve Emp where salary < 20K • SELECT FROM Emp WHERE salary<20K
Table A Table C Table B Shared-Disk:No Partitioning, Full Access to Data DatabaseInstance 1 Table A Table B Table C DB ClusterNode 1 DB ClusterNode 2 High-Speed Interconnect Shared DiskSubsystem DB ClusterNode 3
Scalability & Availability Shared Nothing Slave A Node A Node B Slave B Node C Slave C
Scalability & Availability Shared Disk MySQL Servers with ScaleDB Engine Node A Node B Node C Node D Node E Data
Shared-Disk:Summarizing Shared-Disk Benefits • Grow by simply adding nodes to the cluster • Servers can be added and removed dynamically according to your needs • No interruption to your application • High-Availability with dynamic failover • Existing nodes automatically take over • Significantly reduced maintenance costs • Can be built on low-cost commodity hardware • No data partitioning • No need for slaves • Low Total Cost of Ownership (TCO)
Shared-Disk:Making it work with MySQL Node 1 Node 2 Server Instance A Server Instance B ScaleDB Engine Instance A ScaleDB Engine Instance B Cluster Interconnect ClusterManager Comm. Layer ClusterManager Comm. Layer Buffer Manager Buffer Manager Shared Disk Sub-system
Shared-Disk: Insert New Row Node 1 Node 2 Server Instance A Server Instance B ScaleDB Engine Instance A ScaleDB Engine Instance B Cluster Interconnect ClusterManager Comm. Layer ClusterManager Comm. Layer Buffer Manager Buffer Manager Shared Disk Sub-system
Shared-Disk: Select Node 1 Node 2 Server Instance A Server Instance B ScaleDB Engine Instance A ScaleDB Engine Instance B Cluster Interconnect ClusterManager Comm. Layer ClusterManager Comm. Layer Buffer Manager Buffer Manager Shared Disk Sub-system
Shared-Disk: Create Table Node 1 Node 2 Table A Meta-Data Table A Meta-Data Server Instance A Server Instance B ScaleDB Engine Instance A ScaleDB Engine Instance B Cluster Interconnect ClusterManager Comm. Layer ClusterManager Comm. Layer Buffer Manager Buffer Manager Shared Disk Sub-system
ScaleDB Installation • Define cluster = true in ScaleDB Config file: • ScaleDB.cnf is at the same directory as my.cnf: • Cluster params: • cluster = true • nodes_in_cluster = 2 • node_id = 1 • this_machine_port = 100 • next_machine_ip_address = 192.168.0.101 • next_machine_port = 100 • log_directory = /share/logs/
Demo - Sysbench • ScaleDB cluster – one node – show throughput • ScaleDB cluster – 2nd node – show throughput
ScaleDB: Multi-Table Indexing #2 #1 #1 #2 #3 #3 #4 #4 #5 #5 B-tree: Only indexes the data in tables ScaleDB: Indexes the data and relationships Index #1 Index #2 Index #3 Index #4 Index #5 ScaleDB Index • Advantages: • Faster • Smaller • Referential integrity
Example • Scenario: Select information that is spread across 3 tables: Colleges, Students and Enrollment • Relationships: Students are enrolled in courses within departments of colleges SELECT c1.CollName, s.StudName, c2.CourseName , e.Grade FROM College AS c1 JOIN Student AS s JOIN Enrollment AS e JOIN Course AS c2 ON ( c1.CollNo = s.CollNo AND s.CollNo = e.CollNo AND s.StudentNo = e.StudentNo AND e.CollNo = c2.CollNo AND e.DeptNo = c2.DeptNo AND e.CourseNum = c2.CourseNum ) WHERE c1.CollNo = X AND s.StudentNo = Y ;
Option #1: Conventional Joins College Table Get College information Students Table Get Student information Enrollment Table Search enrollment by College & Student
Mapping Foreign Keys to Data Views • Create College Table College • Create Department Table • Foreign key – College • Create Course Table • Foreign Key – Department Department Students • Create Students Table • Foreign key – College Course Enrollment • Create Enrollment Table • Foreign key - Students • The Parent-Child tables are Created in MySQL Such that MySQL is able to operate over the new tables • The data of the Parent-Child tables is assembled on the fly from the source tables
Mapping Foreign Keys to Data Views College Department College Department Course College Students Enrollment College Students Meta-Data Tables: 1. College 2. College-Dept 3. College-Dept-Course 4. College-Students 5. College-Students-Enrollment 6. Department 7. Students 8. Course 9. Enrollment ScaleDB Physical files: 1. College 2. Department 3. Student 4. Course 5. Enrollment
Enabling the MySQL optimizer to use a Multi-Table Index SELECT c1.CollName, s.StudName, c2.CourseName , e.Grade FROM College AS c1 JOIN Student AS s JOIN Enrollment AS e JOIN Course AS c2 ON ( c1.CollNo = s.CollNo AND s.CollNo = e.CollNo AND s.StudentNo = e.StudentNo AND e.CollNo = c2.CollNo AND e.DeptNo = c2.DeptNo AND e.CourseNum = c2.CourseNum ) WHERE c1.CollNo = X AND s.StudentNo = Y ; CREATE TABLE sdb_view_college_course_student ( L1_CollNo INT NOT NULL, L1_CollName CHAR(32) NOT NULL, L1_CollBudget INT NOT NULL, L1_CollDescription CHAR(60) NOT NULL, … Table College Columns L2_StudNo INT NOT NULL, L2_StudName CHAR(48) NOT NULL, … Table Student Columns L3_CourseNum CHAR(9) NOT NULL, L3_Grade CHAR(2) NOT NULL, … Table Enrollment Columns PRIMARY KEY ( L1_CollNo, L2_StudtNo, L3_CourseNum)) ENGINE = SCALEDB; Select L1_CollName, L2_StudName, L3_CourseName, L3_Grade FROM sdb_view_college_course_student WHERE l1_CollNo = X AND l2_StudentNo = Y ;
Option #3: Multi-Table Index ScaleDB Multi-Table Index College Enrollment Departments Students Students Enrollment Colleges Courses Enrollment 001 Agriculture $1,234,567 Nice place to visit 002 Arts $5,432,567 Sports not so good 003 Business $9,999,666 Cool logo 004 Education $3,234,567 Ugh Worcester 005 Engineering $8,238,568 Serious work 006 Law $7,237,767 Jumpy students 007 Liberal Arts $9,898,777 Pretty campus 008 Medicine $5,987,004 In Texas 56-8033 008 Mike Hogan Caucasian 56-8045 008 Moshe Smith Caucasian 56-8044 008 Sally Shadmon Native American 56-8055 008 Billy Fleegle African American 56-8037 008 Saul Goode African American 56-8122 008 Tim Collins Polynesian 56-8233 008 Sam Gee Asian 56-8334 008 Rod Paulino Asian 008 4455 56-8037 B+ 008 4455 56-8033 C 008 4455 56-8045 B+ 008 4456 56-8044 A- 008 4456 56-8122 B- 008 4454 56-8233 C 008 4455 56-8334 F 008 4454 56-8055 D Col_ID# Col_Name Col_Budget Col_Description Student_ID# College_ID# Student_Name Student_Desc Coll_ID# Coll_Name Coll_Budget Coll_Description College_ID# Dept_ID# Student_ID# Grade
The Multi-Table Index • Multi-Table Index appears to MySQL as a data table • ScaleDB does not maintain data file associated with the Multi-Table Index • For a query using virtual table, ScaleDB assembles the rows on the fly using the Multi-Table Index • ScaleDB indexes are different than B-tree indexes • ScaleDB indexes provide the same functionality as B-tree, plus… • They maintain referential integrity with minimal overhead • They allow you to search for the data and relationships • They are much smaller in size
Demo • Query with join • Query with Multi-Table Index • 2nd node virtual table
Summary • ScaleDB Cluster • Multiple ScaleDB instances share the same physical data. • Connecting to the cluster is similar to connecting to a single node. • For the application, the cluster appears as a single node. • Transparent application failover • Transparent Scalability • ScaleDB Indexes • Provide the B-tree functionality • High performance • Map relationships • Maintain referential integrity • Smaller footprint • Independent of the key size
ScaleDB Status and Product Availability • Started Beta Process • We are looking for beta companies • Product launch is scheduled for June timeframe • Please talk to us if you are developer interested in working with ScaleDB • moshe@scaledb.com