1 / 9

Hadoop & HDFS Architecture - Ravi Nambori Cisco Evagelist

HDFS Architecture: An HDFS cluster consists of a single NameNode, a master server that manages the file system namespace and regulates access to files by clients. Find info on benefits & flaws in this system by IT Expert Ravi Namboori.

Télécharger la présentation

Hadoop & HDFS Architecture - Ravi Nambori Cisco Evagelist

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Hadoop & HDFS Architecture Presentation By Ravi Namboori Visit us: http://ravinamboori.net

  2. What is HDFS? Hadoop comes with a distributed filesystem called HDFS, which stands for Hadoop distributed filesystem. http://ravinamboori.net

  3. What actually a distributed filesystem? • When a dataset outgrows the storage capacity of a single physical machine, it becomes necessary to partition it across a number of separate machines. • Filesystems that manage the storage across a network of machines are called distributed filesystems. http://ravinamboori.net

  4. Design of HDFS: • Very large files • Streaming data access • Commodity hardware http://ravinamboori.net

  5. Where HDFS Is Not A Good Fit • Low-latency data access • Lots of small files • Multiple writers, arbitrary file modifications http://ravinamboori.net

  6. HDFS Architecture An HDFS cluster consists of a single NameNode, a master server that manages the file system namespace and regulates access to files by clients. It also consist of secondary namenode. It updates the namespace image with datalog. http://ravinamboori.net

  7. Why Is a Block in HDFS So Large? • HDFS blocks are large compared to disk blocks, and the reason is to minimize the cost of seeks. • By making a block large enough, the time to transfer the data from the disk can be made to be significantly larger than the time to seek to the start of the block. • Thus the time to transfer a large file made of multiple blocks operates at the disk transfer rate.

  8. Advantage of HDFS? “Moving Computation is Cheaper than Moving Data" A computation requested by an application is much more efficient if it is executed near the data it operates on. This is especially true when the size of the data set is huge. HDFS provides interfaces for applications to move themselves closer to where the data is located. http://ravinamboori.net

  9. Thank You Presentation By Ravi Namboori Visit us: http://ravinamboori.in

More Related