Open Source Databases
E N D
Presentation Transcript
Open Source Databases Zeger W. Hendrikse
Content • Purpose • Survey of available open source databases and tools • Strong emphasis on HSqlDb and MySQL • Feature comparison • Content • SQL, JDBC basics (optional) • Open Source databases • Introduction (Java & non-Java) • Access and administration tools • Demo
SQL • DDL (Data Definition Language) • create, alter, drop • DML (Data Manipulation Language) • CRUD (select, insert, update, delete) • DCL (Data Control Language) • connect, grant, revoke
JDBC: drivers • Type 1: JDBC to ODBC bridge • Usually slow, meant as transient solution • Early drivers designed to leverage existing ODBC drivers • Type 2: Java wrappers around native DB interfaces • No longer very common • Using JNI • Type 3: Pure Java drivers • Use middleware component to translate JDBC-calls to a DB-generic communication protocol • Type 4: Native Java drivers • Driver designed specifically to interact with the DB • The most common
JDBC: connections • DriverManager • Introduced in JDBC 1.0 API • Connection retrieved with hard-coded URL • DataSource • Introduced in the JDBC 2.0 Optional Package API • Preferred method • Underlying DataSource transparent to the application • ConnectionPoolDataSource • XADataSource
JDBC connections: DriverManager public class DbConnector { private static String driver = "oracle.jdbc.driver.OracleDriver"; private static String url = "jdbc:oracle:thin:@AMISNT15:1521:LUDO"; private static Connection connection = null; private static void createConnection { try { if (connection == null || connection.isClosed()) { Class.forName(driver).newInstance(); connection = DriverManager.getConnection(url,”adf6”, “adf6”); } catch (Exception e) {// omitted here} } }
JDBC connections: DataSource A DataSource • = a factory to create database connections • ‘centralizes’ database connection-related data • implementation of javax.sql.DataSource • three implementations available: • Standard implementation • Connection pooling implementation • Distributed transaction implementation
JDBC: JNDI & pooling DataSources & JNDI • DataSources are designed for use with JNDI • Configured within JNDI provider & bound to JNDI name:Context context = new InitialContext();DataSource ds = (DataSource) context.lookup(“jdbc/TestDataSource”);Connection con = ds.getConnection(“sa”, “”);Program only “knows” about username and password! • Usage in Servlets: • Get connections in doGet() en doPost() • Get DataSource in init() provides connections from pool
JDBC: pooling • A pooled DataSource stores connections in a pool • Pool ~ cache, BUT pool is prepopulated with a # of con’s • If all connections are used:1. May fail2. May add more connections to pool • Purpose: increase in performance
JDBC: 1.0 2.0 • Scrollable result sets • Batch updates (multiple insert/update/delete in 1 request) • Advanced data types (BLOB, CLOB) • Rowsets, for JavaBeans (rowset = set of rows = bean) • JNDI for obtaining connection • Connection pooling • Distributed transaction support • Two-phase commit used by the JTS API. • Additional minor additions/improvements…
JDBC: 2.0 3.0 • Savepoint support • Reuse of prepared statements by connection pools • Connection pool configuration • Retrieval of parameter metadata • Retrieval of auto-generated keys • Ability to have multiple open ResultSet objects • Passing parameters to CallableStatement objects by name • Holdable cursor support • Boolean data type • Making internal updates to data in BLOB en CLOB objects
JDBC: 2.0 3.0 (cont.) • Retrieving and updating the object referenced by a Ref object • Updating of columns containing BLOB, CLOB, ARRAY and REF types • DATALINK/URL data type • Transform groups and type mapping • Relationship between the JDBC SPI (Service Provider Interface) and the Connector architecture • DatabaseMetadata APIs
Enterprise demands on databases • A JDBC Compliant driver must support SQL-92 at the entry level entry level SQL-92 is a must • Transaction support • may also be handled in business tier • JDBC driver that is JDBC >= 2.0 compliant • Includes support for DataSources • necessary for pooling critical for scaling & performance • Preferred JDBC >= 3.0 compliant • Triggers, views and stored procedures • BLOB, CLOB • Support, user-base
Platform independent Small so may be embedded in application e.g. as part of a .war Small but can handle large databases Some versions allow memory-based tables Easy to set-up, configure, maintain. HSqlDb, Derby (Apache), Axion, McKoi and … Xindice. Better scalability Better support for very large applications Firebird, GNU SQL, MySQL, PostGreSQL, Max DB (= SAP DB), Ingres, Berkely DB Java vs. non-Java databases Java Java non-Java
Java databases: Axion • Product of Tigris.org (known from GEF) • Embedded Java DB with file- or memory-based tables • Currently no server mode • Nov. 2003: “The Apache DB Project has accepted Axion's proposal to become a part of the Apache Software Foundation.” • Features: http://axion.tigris.org/features.html
Java databases: HSqlDb • Continuation of Thomas Mueller’s Hypersonic DB • Yet another example of successful OS project • Basically Axion++ • Same functionality + many useful advanced features • JDBC >= 2.0, full transaction support • Additional DDL commands: “alter table” & “create view” • Additional DML commands: outer join, “group by” , “grant/revoke” • Standard DB with JBoss AS • License: BSD-based • Default: in-memory table regenerated from SQL script • =transaction log of SQL statements (executed at startup)
Java databases: HSqlDb (cont.) Features: • In-memory and disk-based tables • Transaction support • JDBC 2.0 (partly 3.0) • Five modes: • embedded (in-memory mode) • standalone • servlet • server • HTTP server
Java databases: HSqlDb (cont.) • If URL is “jdbc:hsqldb:c:/path/databaseName” creation of database “databaseName” in c:/path • If URL is “jdbc:hsqldb:.” database not saved to disk! • Behaviour of in-memory tables rebuilt with logs can be changed on a per-table basis: • CREATE [ MEMORY | CACHED | TEMP | TEXT ] TABLE name • Memory persistent in-memory table • Cache disk based table with cache • Text table saved as CSV-based file • Temp non-persistent in-memory table
Java databases: HSqlDb (cont.) • Drawback: no keyword for modification of this kind of storage type limits script portability • HSqlDb comes with nice manager application, which may be used with other JDBC-supported databases • Tool available from the JBoss management page! • Personal remark [ZWH] • Sometimes it seems confusing to me (with respect to the storage types) if the JBoss EJB container has added data as expected: there seems to be a mismatch what the tool shows you and what is actually in the (in-memory) database.
Java databases: HSqlDb (cont.) • Demo Hypersonic SQL!
Java databases: Derby • Formerly IBM’s Cloudscape • At incubation at ASF: http://incubator.apache.org/derby/ • Features: • Easy to use • Small footprint • Standards based • SQL syntax, transaction management, concurrency, triggers, and online backups. • Secure • Secure data management appropriate to environment the engine is executing in • Both as embedded database engine and network server
Java databases: McKoi • License: GPL and maintained by individual developer • Home @ http://mckoi.com/database/ • Features: • Transaction support • both embedded and client/server mode • Full JDBC 3.0 support • Comparison: • More “traditional” than HSqlDb and more features than Axion • Stores data files on disk and caches in memory like HSqlDb • License (including JDBC driver): GPL • Results in incompatibility with Apache license/products!!!
Java databases: Xindice • Native XML DB, stores compressed XML documents • Donated to ASF by former xmlDB team • XPath is query language and APIs support DOM & SAX • No DTD or Schema support for documents in DB yet! • Why not relational? • O/R impedance mismatch (would need Castor XML or similar)! • Why not object DB? • Object DB is “application bound”
Evaluation Java databases • McKoi and HSqlDb richer feature set compared to Axion • Derby is in ‘incubation’ at ASF • McKoi’s GPL license is limiting • HSqlDb is the Java database of choice • JBoss database • Largest Java-database community • Xindice is a native XML database • comes with pro’s and con’s of native XML DB • Well suited for document-style oriented data • No need to map from relational DB to XML • Less flexible queries
Non-Java databases: GNU SQL • Seems to be dead (latest news from 28-09-1998)
Non-Java databases: Firebird • Code base of Borland’s Interbase, OSS since 2000 • Forked off Interbase and Firebird • Well documented
Non-Java databases: MySQL • Most common open source database (“from NASA to Yahoo”) • Back-up by Swedish company MySQL AB • Version 3.2x still widely used, 4.1.x now available and stable • Stored procedures and triggers expected for 5.0 • JDBC driver: • Used to be MM.MySQL Connector/J, available form MySQL site
Non-Java databases: MySQL (cont.) • MySQL++ API: choose your engine! • ISAM + MyISAM, no transaction nor foreign key support • ISAM: • fast for reads >> updates, but not fault-tolerant nor transaction support • MyISAM • =ISAM++ (table locking), but requires more maintenance • HEAP • In-memory, fast but volatile • InnoDB & Berkely BDB, transaction + foreign key support, but slower! • Implement your own… • Suggestion for rainy Sunday afternoon: implement native XML engine!
Non-Java databases: MySQL (cont.) • Example: CREATE TABLE tblMyISAM ( id INT NOT NULL AUTO_INCREMENT, PRIMARY KEY (id), value_a TINYINT ) TYPE=MyISAM CREATE TABLE tblISAM ( id INT NOT NULL AUTO_INCREMENT, PRIMARY KEY (id), value_a TINYINT ) TYPE=ISAM CREATE TABLE tblHeap ( id INT NOT NULL AUTO_INCREMENT, PRIMARY KEY (id), value_a TINYINT ) TYPE=Heap • You can also use the ALTER TABLE command: ALTER TABLE tblMyISAM CHANGE TYPE=InnoDB
Non-Java databases: PostGreSQL • Evolved from database research at UC Berkeley • Actively being developed/supported • NO server version for windows platform available
Non-Java databases: MaxDB • Code base from SAP (ERP software) • Back-up by Swedish company MySQL AB (!!!) • Can be configured to support Oracle (v.7) SQL and DB2 (v.2) • Fear: skeptical of “going open source motivation” • Remains to be seen how this (huge) project is picked up by OS community
Non-Java databases: Ingres • Recently went Open Source, • License (CA-TOSL) to be OSI approved (CPL derived) • http://opensource.ca.com/projects/ingres • Strengthened relations with JBoss organization • Ingres will be the default DB delivered with JBoss • CA will coordinate development efforts • Like MySQL AB for MySQL and MaxDB databases • “Industrial strength” database • Questions: • Remains to be seen if an “OS community” will emerge • What will the prices be for support
Non-Java databases: Berkely DB • Embedded database for servers, networking hardware, handhelds, … • Supporting company: Sleepycat • Three products • Berkely DB, embedded, accessible via API (Perl, Python, …) • Berkeley DB XML, native XML database • Berkeley DB JE, for pure Java solutions • License free for open source projects
Evaluation non-Java databases • Requirement analysis (examples of valid considerations) • Server has to run on windows No PostgreSQL • Requirement for triggers and stored procedures No MySQL • Porting app. from DB2 or Oracle MaxDB • is able to understand different SQL dialects • Support + user base MySQL • Licensing issues • MySQL AB/Sleepycat only free for OS projects • GPL might be restrictive • Availability of supporting/admin tools • There may be many requirements-satisfying candidates
Evaluation non-Java databases (cont.) • Quick through the corner (“snel door de bocht”): • PostgreSQL • implementing most of the ANSI standard • transactions, triggers, views, subselects and user-defined types. • Stored procedures (many languages), sophisticated locking. • Client libraries for all of your favorite programming languages. • MySQL • high-traffic applications where speed is more important than data integrity. • lacks support for transactions, views, stored procedures, and subselects. • Oracle • Not free • Most popular and powerful relational database on the market • Applications, functions, and stored procedures (PL/SQL language)
Evaluation non-Java databases (cont.) • Generally: less features than commercial alternatives, but covers 80% of applications’ requirements • Finally: we haven’t exhaustively examined all open source databases here!
Tools: MySQL Control Centre • GUI for MySQL database • GPL, for Windows and Linux platforms • Download http://www.mysql.com/products/mysqlcc/
Tools: MySQL admin tools • MySQL Administrator • by MySQL AB (free) • http://www.mysql.com/products/administrator/ • MySQL manager • by EMS (commercial) • http://ems-hitech.com/mymanager/ • NOT SHOWN!
Tools: Squirrel • Relational database access tool • Purely Java and JDBC-based (license: LPGL) • http://squirrel-sql.sourceforge.net/ • Various plug-ins available • Adds Oracle specific functionality • Adds MySQL specific functionality • … • Support to validate SQL against ISO SQL standard • Look-and-feel • Write scripts to file
Tools: Squirrel (cont.) • Demo!
Tools: HSQL db-manager from jmx-console • From JBoss management console: • HSQL db-manager • Demo!
Tools: Eclipse plug-ins • IBM’s WSAD • JfaceDb from http://www.pratocity.com/index.jsp (commercial!!)
Integration with application servers • Integration with Tomcat • JDBC Drivers in $CATALINA_HOME/common/lib • Configure JNDI • Use Tomcat Admin tool/page • Adapt the server.xml • Configure • Class name of JDBC driver (e.g. jdbc/DefaultDS) • Connection URL (e.g. com.mysql.jdbc.driver or org.hsqldb.jdbcDriver) • Username & passwd (tomcat + welkom123) • Integration with JBoss • Demo!