What is HDFS | Hadoop Distributed File System | Edureka

Presentation Transcript

  1. Hadoop: HDFS & MapReduce Copyright © 2017, edureka and/or its affiliates. All rights reserved.

  2. Hadoop: HDFS and MapReduce. Hadoop is a framework that allows us to store and process large data sets in a parallel and distributed fashion.

  3. What is DFS?

  4. What is DFS? [Figure: a local file system and a distributed file system side by side, with disks labeled 4 TB and 1 TB]

  5. Why DFS?

  6. Why DFS?

  7. Why DFS?

  8. What is HDFS?

  9. What is HDFS? HDFS is a distributed file system that allows you to store large data sets across a cluster. Who distributes the data across the cluster? Who is responsible for managing the data? How is data accessed? [Figure: a cluster of four 1 TB machines]

  10. HDFS Architecture

  11. HDFS Architecture
      NameNode (master node)
      ▪ Master daemon
      ▪ Maintains and manages the DataNodes
      ▪ Records metadata
      ▪ Receives heartbeats and block reports from all the DataNodes
      DataNodes (slave nodes)
      ▪ Slave daemons
      ▪ Store the actual data
      ▪ Serve read and write requests from the clients
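
The division of labor on this slide can be illustrated with a small in-memory model. This is only a sketch: `ToyNameNode`, its method names, and the 30-second timeout are hypothetical names and values chosen for illustration, not Hadoop's API or defaults. It shows the NameNode holding metadata only (which DataNode reported which block, and when each DataNode last sent a heartbeat), while the data itself stays on the DataNodes.

```python
import time


class ToyNameNode:
    """Toy model of NameNode bookkeeping (hypothetical, not the Hadoop API)."""

    def __init__(self, heartbeat_timeout_s=30.0):
        # Assumed timeout for illustration only.
        self.heartbeat_timeout_s = heartbeat_timeout_s
        self.last_heartbeat = {}  # DataNode id -> time of last heartbeat
        self.block_map = {}       # block id -> set of DataNode ids holding it

    def receive_heartbeat(self, datanode_id, now=None):
        """Record that a DataNode is alive at time `now`."""
        self.last_heartbeat[datanode_id] = time.monotonic() if now is None else now

    def receive_block_report(self, datanode_id, block_ids):
        """Replace what we know about the blocks stored on one DataNode."""
        for holders in self.block_map.values():
            holders.discard(datanode_id)
        for block_id in block_ids:
            self.block_map.setdefault(block_id, set()).add(datanode_id)

    def live_datanodes(self, now=None):
        """DataNodes whose last heartbeat is within the timeout."""
        now = time.monotonic() if now is None else now
        return sorted(
            dn for dn, seen in self.last_heartbeat.items()
            if now - seen <= self.heartbeat_timeout_s
        )

    def locations(self, block_id):
        """DataNodes that reported holding the given block."""
        return sorted(self.block_map.get(block_id, set()))


if __name__ == "__main__":
    nn = ToyNameNode()
    nn.receive_heartbeat("dn1", now=0.0)
    nn.receive_heartbeat("dn2", now=20.0)
    nn.receive_block_report("dn1", ["blk_1", "blk_2"])
    nn.receive_block_report("dn2", ["blk_2"])
    print(nn.locations("blk_2"))        # ['dn1', 'dn2']
    print(nn.live_datanodes(now=40.0))  # ['dn2'] (dn1 has been silent for 40 s)
```

A client in this model would ask the NameNode where a block lives and then read it from one of the listed DataNodes, which matches the slide's point that DataNodes serve the read and write requests.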

  12. How Are Files Stored in HDFS?

  13. HDFS Data Blocks
      ➢ Each file is stored on HDFS as blocks
      ➢ The default size of each block is 128 MB in Apache Hadoop 2.x (64 MB in Apache Hadoop 1.x)
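
The block arithmetic can be worked through in a few lines. The function name `split_into_blocks` and the 380 MB example file are hypothetical; the 128 MB and 64 MB defaults come from the slide. The sketch assumes the last block occupies only the bytes left over rather than a full block.

```python
BLOCK_SIZE_MB = 128  # default block size in Apache Hadoop 2.x, per the slide


def split_into_blocks(file_size_mb, block_size_mb=BLOCK_SIZE_MB):
    """Return the sizes (in MB) of the blocks a file of this size would occupy."""
    if file_size_mb < 0 or block_size_mb <= 0:
        raise ValueError("sizes must be non-negative, block size positive")
    full_blocks, remainder = divmod(file_size_mb, block_size_mb)
    sizes = [block_size_mb] * full_blocks
    if remainder:
        # The final block holds only the leftover data.
        sizes.append(remainder)
    return sizes


if __name__ == "__main__":
    print(split_into_blocks(380))      # [128, 128, 124]
    print(split_into_blocks(380, 64))  # [64, 64, 64, 64, 64, 60]
```

So a 380 MB file needs three blocks with the Hadoop 2.x default and six blocks with the Hadoop 1.x default.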

  14. What if a DataNode Containing Data Crashes?

  15. DataNode Failure. Scenario: one of the DataNodes containing the data blocks has crashed.

  16. Solution: Replication Factor

  17. Replication Factor. Solution: each data block is replicated (three times by default), and the replicas are distributed across different DataNodes.
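
Why replication covers the crash scenario can be seen in a toy simulation. The round-robin placement below is an assumption made purely for illustration and is not HDFS's actual placement policy; `place_replicas`, `surviving_replicas`, and the node and block names are hypothetical. The only property it shares with the slide is that each block's replicas land on different DataNodes.

```python
def place_replicas(block_ids, datanodes, replication_factor=3):
    """Assign each block to `replication_factor` distinct DataNodes (toy round-robin)."""
    if replication_factor > len(datanodes):
        raise ValueError("need at least as many DataNodes as replicas")
    placement = {}
    for i, block_id in enumerate(block_ids):
        # Consecutive nodes starting at offset i, so replicas never share a node.
        placement[block_id] = [
            datanodes[(i + k) % len(datanodes)] for k in range(replication_factor)
        ]
    return placement


def surviving_replicas(placement, failed_node):
    """Replica locations that remain after one DataNode crashes."""
    return {
        block_id: [node for node in nodes if node != failed_node]
        for block_id, nodes in placement.items()
    }


if __name__ == "__main__":
    nodes = ["dn1", "dn2", "dn3", "dn4"]
    placement = place_replicas(["blk_1", "blk_2"], nodes)
    print(placement)
    # {'blk_1': ['dn1', 'dn2', 'dn3'], 'blk_2': ['dn2', 'dn3', 'dn4']}
    print(surviving_replicas(placement, "dn2"))
    # {'blk_1': ['dn1', 'dn3'], 'blk_2': ['dn3', 'dn4']}
```

With three replicas on distinct nodes, losing any single DataNode still leaves at least two copies of every block in this model.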

  18. Demo