Kelly Technologies offers online Hadoop training led by experienced professionals. Our trainers have worked with Hadoop and related technologies for years at multinational companies. We are aware of industry needs and offer Hadoop training in a practical way.
www.kellytechno.com
Introduction to Hadoop Powered By
Hadoop Training
• Candidates looking for a successful career path should consider Hadoop training, which can help a great deal with career development.
What is Hadoop?
• In information technology, Hadoop is used to store very large amounts of data and to run processing tasks over it. It is a free, open source, Java-based software framework that runs applications on clusters of commodity hardware. Tasks are handled concurrently, and huge amounts of data are processed in a distributed way using simple programming models. Hadoop is an Apache project, developed under the Apache Software Foundation.
Why Choose Hadoop?
• Hadoop is designed to scale from a single server to thousands of machines, each offering local computation and storage.
Challenges in Big Data Storage and Analysis
• Data is slow to process and traditional systems do not scale
• Hard drive capacity limits what can be processed
• Risk of unreliable machines
• Concurrent execution of tasks
• Reliability
• Security and ease of use
What is Hadoop Common?
• Hadoop Common provides access to the file systems that Hadoop supports. The package contains the necessary JAR files and the scripts used to start Hadoop. It also provides documentation, source code, and a contribution section that includes projects from the Hadoop community.
Architecture of Hadoop
• Hadoop is designed and built on two independent frameworks:
• Hadoop Distributed File System (HDFS)
• Hadoop MapReduce
What is the Hadoop Distributed File System?
• HDFS is a distributed file system based on the design of the Google File System (GFS). It divides files into large blocks of up to 64 MB (the default in early Hadoop releases; Hadoop 2 and later default to 128 MB) and distributes them across the data servers. It has DataNodes, which store the blocks, and a NameNode, which manages the file system metadata.
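The block splitting described above can be sketched in plain Java. This is only a toy model under stated assumptions, not the HDFS API: the class name BlockSplitSketch, the method splitIntoBlocks, and the 200 MB file size are all made up for illustration, and the 64 MB block size is simply the figure quoted on this slide.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of cutting a file into fixed-size blocks. Not the HDFS API.
class BlockSplitSketch {
    // 64 MB, the block size quoted on the slide (assumed for this sketch).
    static final long BLOCK_SIZE = 64L * 1024 * 1024;

    // Returns the length in bytes of each block for a file of the given size.
    static List<Long> splitIntoBlocks(long fileSize) {
        List<Long> blocks = new ArrayList<>();
        long remaining = fileSize;
        while (remaining > 0) {
            // Every block is full-size except possibly the last one.
            long length = Math.min(BLOCK_SIZE, remaining);
            blocks.add(length);
            remaining -= length;
        }
        return blocks;
    }

    public static void main(String[] args) {
        long fileSize = 200L * 1024 * 1024; // hypothetical 200 MB file
        List<Long> blocks = splitIntoBlocks(fileSize);
        long lastBlockMb = blocks.get(blocks.size() - 1) / (1024 * 1024);
        System.out.println(blocks.size() + " blocks, last block " + lastBlockMb + " MB");
    }
}
```

In this sketch a 200 MB file becomes three full 64 MB blocks plus one 8 MB block; in a real cluster each of those blocks would then be placed on DataNodes.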
What is MapReduce?
• MapReduce is a framework for high performance distributed data processing. It uses a divide and aggregate programming paradigm: the map step processes pieces of the input in parallel, and the reduce step combines the intermediate results.
• Hadoop has a master/slave architecture for both storage and processing.
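The divide and aggregate idea can be illustrated with a word count written in plain Java. This is a single-process sketch, not the Hadoop MapReduce API: the class WordCountSketch and its methods map, reduce, and run are hypothetical names, and a real job would run the map calls on many machines.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Single-process illustration of map and reduce. Not the Hadoop API.
class WordCountSketch {
    // Map step: emit a (word, 1) pair for every word in one input line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // Reduce step: group the pairs by word and sum their counts.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return counts;
    }

    // Divide: map each line independently. Aggregate: reduce all emitted pairs.
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> all = new ArrayList<>();
        for (String line : lines) {
            all.addAll(map(line));
        }
        return reduce(all);
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("big data", "big cluster")));
    }
}
```

Because each line is mapped independently of the others, the map calls could be spread across machines, which is the property that lets this style of program scale.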
Hadoop Master and Slave Architecture
Components of HDFS
• HDFS has three components: the NameNode, the DataNodes, and the Secondary NameNode.
• The NameNode is the master of the system. It maintains the directories and files, and it manages the blocks stored on the DataNodes.
• DataNodes are the slaves, deployed on each machine. They serve read and write requests from clients.
• The Secondary NameNode performs periodic checkpoints. If the NameNode fails, it can be restarted from the latest checkpoint.
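The division of labour between the NameNode and the DataNodes can be sketched as two lookup tables. This is a toy model only, not how HDFS is implemented: the class NameNodeSketch, its methods addBlock and locate, and the block and node names are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of the metadata a NameNode keeps. Not the HDFS implementation.
class NameNodeSketch {
    // File path -> ordered list of block ids that make up the file.
    private final Map<String, List<String>> fileToBlocks = new HashMap<>();
    // Block id -> names of the DataNodes that hold a copy of the block.
    private final Map<String, List<String>> blockToDataNodes = new HashMap<>();

    // Records that a block of the given file is stored on the given DataNodes.
    void addBlock(String path, String blockId, List<String> dataNodes) {
        fileToBlocks.computeIfAbsent(path, k -> new ArrayList<>()).add(blockId);
        blockToDataNodes.put(blockId, new ArrayList<>(dataNodes));
    }

    // Answers where each block of a file lives; the data itself is then
    // read from the DataNodes, not from the NameNode.
    List<String> locate(String path) {
        List<String> result = new ArrayList<>();
        List<String> blocks = fileToBlocks.getOrDefault(path, Collections.<String>emptyList());
        for (String blockId : blocks) {
            result.add(blockId + "@" + blockToDataNodes.get(blockId));
        }
        return result;
    }

    public static void main(String[] args) {
        NameNodeSketch nameNode = new NameNodeSketch();
        nameNode.addBlock("/logs/a.txt", "blk_1", List.of("dn1", "dn2"));
        nameNode.addBlock("/logs/a.txt", "blk_2", List.of("dn2", "dn3"));
        System.out.println(nameNode.locate("/logs/a.txt"));
    }
}
```

The point of the sketch is that the master holds only metadata, so losing it without a checkpoint would lose the mapping from files to blocks even though the blocks themselves still sit on the DataNodes.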
Components of MapReduce
• MapReduce has two components: the JobTracker and the TaskTrackers.
• The JobTracker is the master of the system. It manages resources efficiently and tries to schedule each map task as close as possible to the data being processed.
• TaskTrackers are the slaves, deployed on each machine. They run the map and reduce tasks, following instructions from the JobTracker.
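The idea of scheduling a map task close to its data can be sketched as a simple preference rule. This is a hypothetical illustration, not the JobTracker's actual algorithm: the class LocalitySchedulerSketch, the method assign, and the node names are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy data-locality scheduler. Not the real JobTracker logic.
class LocalitySchedulerSketch {
    // Picks a TaskTracker for one map task and uses up one of its free slots.
    // blockHosts: nodes that store the task's input block.
    // freeSlots:  free map slots per node (modified by this method).
    static String assign(List<String> blockHosts, Map<String, Integer> freeSlots) {
        // First choice: a node that already holds the data.
        for (String host : blockHosts) {
            if (freeSlots.getOrDefault(host, 0) > 0) {
                freeSlots.merge(host, -1, Integer::sum);
                return host;
            }
        }
        // Fallback: any node with a free slot; the data must travel over the network.
        for (Map.Entry<String, Integer> entry : freeSlots.entrySet()) {
            if (entry.getValue() > 0) {
                entry.setValue(entry.getValue() - 1);
                return entry.getKey();
            }
        }
        return null; // no capacity anywhere right now
    }

    public static void main(String[] args) {
        Map<String, Integer> freeSlots = new LinkedHashMap<>();
        freeSlots.put("node1", 0);
        freeSlots.put("node2", 1);
        freeSlots.put("node3", 1);
        List<String> blockHosts = Arrays.asList("node1", "node2");
        System.out.println(assign(blockHosts, freeSlots)); // data-local choice
        System.out.println(assign(blockHosts, freeSlots)); // falls back to another node
    }
}
```

Here the first task goes to node2, which holds the block and has a free slot, and the second falls back to node3 because neither node holding the block has capacity left.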
Environment of Hadoop
• Hadoop can run in three modes.
• Standalone: everything runs in a single JVM, which helps when debugging Hadoop applications.
• Pseudo-distributed: each daemon runs as a separate process in its own JVM, but all processes run on a single machine.
• Fully distributed: Hadoop runs across a cluster, with parallel processing, workflow management, fault tolerance, and data consistency.
Features of Hadoop
• Completely open source and written in Java
• Stores structured as well as semi-structured data
• High storage capacity
• Detects failures and recovers from them automatically
• Cost effective
Uses of Hadoop
• Search
• Data warehousing
• Video and image analysis
Thank You