Understanding Parallel/Scalable Databases: Architecture and Query Processing

This article explores the concept of parallel and scalable databases, detailing their hardware and software architectures that include multiple processors, disk drives, and large memory banks. It highlights how these databases differ from traditional systems, emphasizing features like parallel query processing, information partitioning, and pipelining capabilities. The benefits of enhanced processing speed and efficiency are weighed against the higher costs and maintenance challenges. Ultimately, the article guides organizations in assessing the suitability of parallel databases for their needs.

Presentation Transcript


  1. Parallel Databases • Michael French, Spencer Steele, Jill Rochelle • "When Parallel Lines Meet" by Ken Rudin (BYTE, May 98)

  2. What are Parallel/Scalable Databases? • Hardware Architecture: multiple processors, multiple disk drives, large memory banks • Software Architecture: capable of processing parallel queries, data shipping capabilities

  3. What makes Parallel Databases different from previous technologies?

  4. Previous Technology • Hardware: single processor, small disk capacity, less memory • Software: sequential queries, no partitioning of queries

  5. Parallel Query • A query that partitions information across multiple processors and can also pipeline information between them
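
As a rough illustration of the definition above, the following Python sketch splits a list of rows into chunks, lets each worker count matching rows in its own chunk, and then combines the partial counts. The function names (`split_into_chunks`, `count_matches`, `parallel_count`) and the counting query are hypothetical and are not taken from the presentation. Threads are used only to keep the example short; in CPython they do not provide true CPU parallelism, so this shows the shape of a parallel query rather than its performance.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence


def split_into_chunks(rows: Sequence[int], n_workers: int) -> List[Sequence[int]]:
    """Divide rows into at most n_workers contiguous chunks of near-equal size."""
    size = max(1, -(-len(rows) // n_workers))  # ceiling division, never zero
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def count_matches(chunk: Sequence[int], threshold: int) -> int:
    """Work done by one worker: count values above the threshold in its chunk."""
    return sum(1 for value in chunk if value > threshold)


def parallel_count(rows: Sequence[int], threshold: int, n_workers: int = 4) -> int:
    """Partition the rows, process each chunk on its own worker, merge the results."""
    chunks = split_into_chunks(rows, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_counts = pool.map(lambda chunk: count_matches(chunk, threshold), chunks)
        return sum(partial_counts)
```

Calling `parallel_count(list(range(100)), 49)` splits the 100 values into four chunks of 25 and adds up the four partial counts.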

  6. Information Partitioning • Divide the information into smaller tasks • Can have multiple meanings: • Distribution of info to multiple CPUs • Division of hard drive space to contain certain parts of the data

  7. Information Partitioning 2
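
The two meanings of partitioning mentioned on the previous slide can be sketched in Python as follows. This is a minimal illustration under stated assumptions: rows are `(key, value)` tuples with integer keys, `hash_partition` stands in for distributing rows to CPUs, and `range_partition` stands in for dividing storage by key range. Both function names and the modulo-based placement are assumptions made for the example, not the scheme of any particular database product.

```python
from bisect import bisect_right
from typing import Dict, List, Tuple

Row = Tuple[int, str]  # (integer key, payload); an assumed row layout


def hash_partition(rows: List[Row], n_partitions: int) -> Dict[int, List[Row]]:
    """Assign each row to a partition by key modulo the partition count.

    Rows with the same key always land in the same partition.
    """
    partitions: Dict[int, List[Row]] = {p: [] for p in range(n_partitions)}
    for row in rows:
        partitions[row[0] % n_partitions].append(row)
    return partitions


def range_partition(rows: List[Row], boundaries: List[int]) -> Dict[int, List[Row]]:
    """Assign each row to a key range defined by sorted boundaries.

    With boundaries [2, 4], keys below 2 go to partition 0,
    keys 2 and 3 to partition 1, and keys 4 and above to partition 2.
    """
    partitions: Dict[int, List[Row]] = {p: [] for p in range(len(boundaries) + 1)}
    for row in rows:
        partitions[bisect_right(boundaries, row[0])].append(row)
    return partitions
```

Hash-style placement tends to spread rows evenly, while range placement keeps neighbouring keys together, which can help range scans; which one fits depends on the workload.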

  8. Information Pipelining • Allows separate processors to work on separate stages of a query (scan, join, sort) • The concept is akin to an assembly line • Allows multiple queries to run at the same time

  9. Information Pipelining 2
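
The scan, join and sort stages named on the pipelining slide can be chained in Python with generators, so that each scanned row flows into the join as soon as it is produced. This is only a single-process sketch of the dataflow: a real parallel database would run the stages on separate processors. The table contents, the `(key, value)` row layout and all function names are hypothetical. Note that the sort stage has to see every joined row before it can emit anything, so only scan and join are truly streamed here.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

Row = Tuple[int, str]            # (integer key, payload); an assumed row layout
JoinedRow = Tuple[int, str, str]


def scan(table: Iterable[Row], min_key: int) -> Iterator[Row]:
    """Scan stage: stream the rows whose key is at least min_key."""
    for row in table:
        if row[0] >= min_key:
            yield row


def join(left: Iterable[Row], right_index: Dict[int, str]) -> Iterator[JoinedRow]:
    """Join stage: probe a prebuilt index as each scanned row arrives."""
    for key, value in left:
        if key in right_index:
            yield (key, value, right_index[key])


def sort_stage(rows: Iterable[JoinedRow]) -> List[JoinedRow]:
    """Sort stage: consume all joined rows, then return them ordered by key."""
    return sorted(rows, key=lambda row: row[0])


def run_query(left: Iterable[Row], right: Iterable[Row], min_key: int) -> List[JoinedRow]:
    """Chain the three stages; rows move from scan to join one at a time."""
    right_index = dict(right)  # build side of the join
    return sort_stage(join(scan(left, min_key), right_index))
```

For example, `run_query([(3, "c"), (1, "a"), (2, "b"), (5, "e")], [(1, "x"), (2, "y"), (3, "z")], 2)` scans out key 1, drops key 5 in the join, and returns the two remaining rows in key order.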

  10. Sequential Query Example • Two tables with 20 million rows each, queried on a uniprocessor machine • Scan, join and sort take 12 minutes • Add partitioning: the query takes 3 minutes • Add pipelining: 12 queries can be run in 12 minutes
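
The figures on this slide can be turned into ratios with a few lines of Python. The helper names `speedup` and `throughput` are hypothetical, and the only inputs are the numbers quoted on the slide; the slide does not state how many processors or pipeline stages produced them, so nothing beyond the ratios should be read into the result.

```python
def speedup(minutes_before: float, minutes_after: float) -> float:
    """How many times faster a single query runs after a change."""
    return minutes_before / minutes_after


def throughput(queries: int, minutes: float) -> float:
    """Completed queries per minute over a period."""
    return queries / minutes


# Partitioning: one query drops from 12 minutes to 3 minutes.
partitioning_speedup = speedup(12, 3)

# Sequential baseline: one query every 12 minutes.
baseline_throughput = throughput(1, 12)

# With pipelining added: 12 queries finish in 12 minutes.
pipelined_throughput = throughput(12, 12)

# Ratio of pipelined to baseline throughput.
throughput_gain = pipelined_throughput / baseline_throughput
```

Partitioning shortens the time for one query, while pipelining mainly raises the number of queries finished per unit of time, which is why the slide reports the two improvements in different units.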

  11. Parallel Kinds • Share-Everything • Hardware • Software • Share-Disk • Hardware • Software • Share-Nothing • Hardware • Software

  12. Conclusion • Pros: allows you to process more information; provides faster processing of queries • Cons: expensive hardware and software; much higher maintenance • Is a parallel database right for your organization?