
Falcon from the Beginning


Presentation Transcript


  1. Falcon from the Beginning • Jim Starkey • jstarkey@mysql.com

  2. Why Falcon? Because the World is Changing! • Hardware is evolving rapidly • Customers need ACID transactions • Atomic – the books should balance • Consistent – the alternative is chaos • Isolated – preserve the programmer’s sanity (sic) • Durable – who wants to lose data?

  3. Where Hardware is going • CPUs breed like rabbits – more sockets, more cores per socket, more threads per core • Memory is bigger, faster, and cheaper • Disks are bigger and cheaper but not much faster • (Boxes are cheaper and more plentiful, but that’s a different story)

  4. Where Applications are going • Batch – dead! • Timesharing – dead! • Departmental computing – dead! • Client server – fading fast • Application servers for most of us • Web services for the really big guys

  5. The Database challenge • Traditional challenge: • Exhaust CPU, memory, and disk simultaneously • Today’s challenge: • Exhaust CPU and memory and avoid the disk

  6. Falcon tradeoffs • Use memory (page cache) to avoid disk reads • Use memory (record cache) to avoid page cache manipulation • Use CPU to find the fastest path to a record • Use CPU to minimize record size • Synchronize most data structures with user-mode read/write locks • Synchronize high-contention data structures with interlocked instructions

  7. The Falcon architecture • Incomplete in-memory database with disk backfill • Multi-version concurrency control in memory • Updates in memory until commit • Group commits to a single serial log write • Post-commit multi-threaded pipeline to move updates to disk

  8. Incomplete in-memory database • Selected records cached in memory • Separate cache for disk pages • A record cache hit is 15% of the cost of a page cache hit • Record cache is more memory efficient than page cache
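
The lookup order implied by this slide can be sketched as follows. This is a minimal illustration, not Falcon's code: the class `TwoLevelStore`, its dictionaries, and the `stats` counters are all hypothetical names, and "disk" is just an in-memory dict standing in for page reads.

```python
class TwoLevelStore:
    """Hypothetical record cache in front of a page cache in front of 'disk'."""

    def __init__(self, disk_pages):
        # disk_pages maps page number -> {record id: value}; a stand-in for real disk pages.
        self.disk_pages = disk_pages
        self.page_cache = {}    # page number -> page contents
        self.record_cache = {}  # (page number, record id) -> value
        self.stats = {"record_hits": 0, "page_hits": 0, "disk_reads": 0}

    def fetch(self, page_no, record_id):
        key = (page_no, record_id)
        # Cheapest path: the record itself is already cached.
        if key in self.record_cache:
            self.stats["record_hits"] += 1
            return self.record_cache[key]
        # Next: the page is cached, so the record must be located inside it.
        if page_no in self.page_cache:
            self.stats["page_hits"] += 1
        else:
            # Slowest path: backfill the page cache from disk.
            self.stats["disk_reads"] += 1
            self.page_cache[page_no] = self.disk_pages[page_no]
        value = self.page_cache[page_no][record_id]
        self.record_cache[key] = value
        return value


store = TwoLevelStore({1: {10: "alpha", 11: "beta"}})
store.fetch(1, 10)  # disk read, fills both caches
store.fetch(1, 11)  # page cache hit
store.fetch(1, 10)  # record cache hit
```

After the three calls, each counter in `store.stats` has been incremented once. A real engine would also need eviction for both caches, which this sketch omits.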

  9. Record Encoding - Cache Efficiency • Records encoded by value, not declaration • String “abc” occupies the same space in varchar(3) or varchar(4096) • The number 7 is stored the same whether declared as smallint, mediumint, int, bigint, decimal, or numeric
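
A minimal sketch of encoding by value rather than by declaration. The byte layout here (a one-byte length prefix followed by the shortest two's-complement integer or the UTF-8 bytes) is an assumption made for illustration and is not Falcon's actual record format; `encode_int`, `encode_str`, and `encode_field` are hypothetical names.

```python
def encode_int(value: int) -> bytes:
    """Length byte plus the shortest big-endian two's-complement form of value."""
    magnitude = value if value >= 0 else ~value
    length = (magnitude.bit_length() + 8) // 8  # one extra bit for the sign
    return bytes([length]) + value.to_bytes(length, "big", signed=True)


def encode_str(text: str) -> bytes:
    """Encoded length followed by the UTF-8 bytes of text."""
    data = text.encode("utf-8")
    return encode_int(len(data)) + data


def encode_field(value, declared_type: str) -> bytes:
    # declared_type is deliberately ignored: only the value decides the size.
    if isinstance(value, int):
        return encode_int(value)
    return encode_str(value)


short_decl = encode_field("abc", "varchar(3)")
long_decl = encode_field("abc", "varchar(4096)")
seven_small = encode_field(7, "smallint")
seven_big = encode_field(7, "bigint")
```

Under this layout `short_decl` and `long_decl` are byte-for-byte identical, as are `seven_small` and `seven_big`, which is the property the slide describes.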

  10. Multi-Version Concurrency Control • Update operations create new record versions • New version is tagged with transaction id, points to old version • System tracks which transactions should see which versions • Readers don’t block writers • Everyone sees a consistent view of the data
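
The version chain described above can be sketched like this. It is a simplified model under stated assumptions: `RecordVersion`, `VersionedRecord`, and the `visible_txns` set (the ids of transactions whose writes a given reader is allowed to see) are hypothetical, and real visibility rules are more involved than a set lookup.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class RecordVersion:
    value: str
    txn_id: int                               # transaction that created this version
    prior: Optional["RecordVersion"] = None   # pointer to the older version


class VersionedRecord:
    def __init__(self):
        self.head: Optional[RecordVersion] = None

    def update(self, value: str, txn_id: int) -> None:
        # An update never overwrites: it pushes a new version onto the chain.
        self.head = RecordVersion(value, txn_id, self.head)

    def read(self, visible_txns: Set[int]) -> Optional[str]:
        # Walk from newest to oldest until a version this reader may see.
        version = self.head
        while version is not None and version.txn_id not in visible_txns:
            version = version.prior
        return version.value if version is not None else None


record = VersionedRecord()
record.update("balance=100", txn_id=1)
record.update("balance=80", txn_id=2)   # txn 2 has not committed yet

old_reader = record.read({1})      # sees only txn 1's version
new_reader = record.read({1, 2})   # sees txn 2's version
```

The reader that cannot see transaction 2 still gets a consistent answer from the older version without waiting on the writer, which is the "readers don't block writers" property.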

  11. Updates Are in Memory Until Commit • Updates held in memory pending commit (well, usually) • Index changes held in memory pending commit (same caveat) • Verb rollback is dirt cheap • Transaction rollback is dirt cheap

  12. At Commit… • Pending record updates flushed to serial log • Pending index updates flushed to serial log • Commit record written to serial log • Serial log flushed to the oxide • And the transaction is committed!
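
The commit sequence on slides 11 and 12 can be sketched as below. The `Transaction` class, the JSON-lines log format, and the `demo` helper are assumptions for illustration only; the point is the ordering (record updates, index updates, commit record, then a flush to stable storage) and that rollback before commit only discards memory.

```python
import json
import os
import tempfile


class Transaction:
    """Hypothetical transaction that buffers its changes in memory until commit."""

    def __init__(self, txn_id):
        self.txn_id = txn_id
        self.record_updates = []  # pending (record_id, value) pairs
        self.index_updates = []   # pending (index_name, key, record_id) triples

    def update_record(self, record_id, value):
        self.record_updates.append((record_id, value))

    def update_index(self, index_name, key, record_id):
        self.index_updates.append((index_name, key, record_id))

    def rollback(self):
        # Nothing has reached the log yet, so rollback just drops the buffers.
        self.record_updates.clear()
        self.index_updates.clear()

    def commit(self, log_file):
        for record_id, value in self.record_updates:
            self._append(log_file, {"type": "record", "txn": self.txn_id,
                                    "id": record_id, "value": value})
        for index_name, key, record_id in self.index_updates:
            self._append(log_file, {"type": "index", "txn": self.txn_id,
                                    "index": index_name, "key": key, "id": record_id})
        self._append(log_file, {"type": "commit", "txn": self.txn_id})
        # The transaction counts as committed only once the log is on disk.
        log_file.flush()
        os.fsync(log_file.fileno())
        self.record_updates.clear()
        self.index_updates.clear()

    @staticmethod
    def _append(log_file, entry):
        log_file.write(json.dumps(entry) + "\n")


def demo():
    """Commit one transaction to a temporary log and return the entry types in order."""
    with tempfile.TemporaryFile("w+") as log:
        txn = Transaction(txn_id=1)
        txn.update_record(10, "alpha")
        txn.update_index("by_value", "alpha", 10)
        txn.commit(log)
        log.seek(0)
        return [json.loads(line)["type"] for line in log]
```

Calling `demo()` returns the entry types in the order they were logged. Group commit, where several transactions share one log write, is not modeled here.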

  13. Alas, Memory isn’t infinite, so • A large transaction chills its uncommitted data (flushes it to the log early) • Chilled records can be thawed (fetched from the log) • Scavenger garbage collects unloved records periodically • When things get really bad, entire record chains are flushed to the backlog • (Note: This is hard and we aren’t done.)
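
A rough sketch of chilling and thawing, assuming a byte-stream log and a simple memory budget. `ChillableRecords`, `memory_limit`, and the policy of chilling every in-memory record at once are hypothetical simplifications; the slide does not say how Falcon chooses what to chill.

```python
import io


class ChillableRecords:
    """Hypothetical store of uncommitted record data that spills to a log."""

    def __init__(self, log, memory_limit):
        self.log = log                  # seekable binary stream standing in for the serial log
        self.memory_limit = memory_limit
        self.in_memory = {}             # record id -> bytes
        self.chilled = {}               # record id -> (log offset, length)

    def put(self, record_id, data):
        self.chilled.pop(record_id, None)
        self.in_memory[record_id] = data
        if sum(len(v) for v in self.in_memory.values()) > self.memory_limit:
            self.chill()

    def chill(self):
        # Write each record to the end of the log and keep only its location.
        for record_id, data in list(self.in_memory.items()):
            offset = self.log.seek(0, io.SEEK_END)
            self.log.write(data)
            self.chilled[record_id] = (offset, len(data))
            del self.in_memory[record_id]

    def get(self, record_id):
        if record_id in self.in_memory:
            return self.in_memory[record_id]
        # Thaw: fetch the bytes back from the log and keep them in memory again.
        offset, length = self.chilled.pop(record_id)
        self.log.seek(offset)
        data = self.log.read(length)
        self.in_memory[record_id] = data
        return data


log = io.BytesIO()
records = ChillableRecords(log, memory_limit=8)
records.put(1, b"aaaa")
records.put(2, b"bbbbbb")   # 10 bytes held, over the limit, so both records are chilled
thawed = records.get(1)     # read back from the log
```

After the second `put`, no record data is held in memory; `records.get(1)` thaws record 1 while record 2 stays chilled.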

  14. Falcon Weaknesses • Transactions are ACID but not serializable • Latency advantage disappears at saturation • Very large transactions degrade performance • Optimized for Web, not batch

  15. Falcon Strengths • Runs like a memory database when data fits in cache • Scales like a disk-based database when data doesn’t fit in cache • Lowest possible latency for Web applications • Absorbs huge spiky loads

  16. Performance Measurement • Generally benchmark against InnoDB (transactional engines) • We use the DBT2 benchmark: • High contention • Write intensive – 40% of records touched are updated • Measures only performance at saturation • DBT2 (we believe) is InnoDB’s best spot and Falcon’s worst

  17. Benchmarking Results • 16- and 8-CPU systems: Falcon exceeds InnoDB performance • 4-CPU systems: Falcon exceeds InnoDB performance for moderate to large numbers of threads • 2-CPU systems: Rough parity, advantage to InnoDB • 1-CPU systems: InnoDB wins • Caveat: Results subject to change! Both systems are moving targets!!!

  18. When should you use what? • If you don’t need ACID, MyISAM is probably fastest • For uniprocessors and small-memory systems, InnoDB is a good choice • For large-transaction batch work, InnoDB may be the best match • For multi-core systems and large numbers of threads, Falcon is probably best • For the Web, Falcon is hard to beat.

  19. Questions?
