The Hadoop Cluster Administration training course is designed to provide the knowledge and skills needed to become a successful Hadoop Architect. It starts with the fundamental concepts of Apache Hadoop and the Hadoop Cluster, and then covers how to deploy, configure, manage, monitor, and secure a Hadoop Cluster. The course also covers HBase administration. There are many challenging, practical and focused hands-on exercises for learners. By the end of this training, you will be prepared to understand and solve real-world problems that you may come across while working on a Hadoop Cluster.
The Hadoop 2.0 Developer training at ISEL Global will teach you the technical aspects of Apache Hadoop and give you a deeper understanding of its power. Our experienced trainers will guide you through developing applications and analysing Big Data, and you will come to grips with the key concepts required to create robust big data processing applications. Successful candidates earn the Hadoop Professional credential and will be capable of handling and analysing terabyte-scale data using MapReduce.
- What is Big Data
- Dimensions of Big Data
- Big Data in Advertising
- Big Data in Banking
- Big Data in Telecom
- Big Data in eCommerce
- Big Data in Healthcare
- Big Data in Defense
- Processing options of Big Data
- Hadoop as an option
- What is Hadoop
- How Hadoop 1.0 Works
- How Hadoop 2.0 Works
- HDFS
- MapReduce
- What is YARN
- How YARN Works
- Advantages of YARN
- How Hadoop has an edge
- Sqoop
- Oozie
- Pig
- Hive
- Flume
- Running HDFS commands
- Running your MapReduce program on Hadoop 1.0
- Running your MapReduce program on Hadoop 2.0 (a word-count sketch follows below)
- Running Sqoop Import and Sqoop Export
- Creating Hive tables directly from Sqoop
- Creating Hive tables
- Querying Hive tables
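To give a feel for the "running your MapReduce program" exercises above, here is a minimal word-count job written against the Hadoop 2.x `org.apache.hadoop.mapreduce` API. It is only a sketch, not course material: the class names and the input and output paths passed on the command line are placeholders you would replace with your own HDFS directories.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: emits (word, 1) for every token in the input line
  public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      for (String token : value.toString().split("\\s+")) {
        if (!token.isEmpty()) {
          word.set(token);
          context.write(word, ONE);
        }
      }
    }
  }

  // Reducer: sums the counts for each word
  public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable v : values) {
        sum += v.get();
      }
      context.write(key, new IntWritable(sum));
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenMapper.class);
    job.setCombinerClass(SumReducer.class);   // the combiner reuses the reducer logic
    job.setReducerClass(SumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));    // e.g. an HDFS input directory
    FileOutputFormat.setOutputPath(job, new Path(args[1]));  // must not already exist
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Once packaged into a jar, a job like this is typically launched with `hadoop jar`, passing an HDFS input directory and a not-yet-existing output directory as arguments.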
- MapReduce Code Walkthrough
- ToolRunner
- MRUnit
- Distributed Cache
- Combiner
- Partitioner
- Setup and Cleanup methods
- Using Java API to access HDFS
- Map-side joins
- Reduce-side joins
- Input Types in MapReduce
- Output Types in MapReduce
- Custom Input Data types
- Custom Output Data types
- Multiple-reducer MR program
- Zero-reducer (map-only) MR program
- MRUnit hands-on
- Distributed Cache hands-on
- Partitioner hands-on
- Combiner hands-on
- Accessing files using the HDFS API hands-on
- Map-side joins hands-on (a sketch follows below)
- Reduce-side joins hands-on
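Several of the topics above (the Distributed Cache, the setup method, map-side joins and map-only jobs) come together in the following sketch. It joins a large input file against a small lookup file that is shipped to every task via the distributed cache. The file paths, the comma-separated record layouts and the "#countries" symlink are all assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MapSideJoin {

  // Joins each input record against a small lookup table that is loaded
  // once per task, in setup(), from the distributed cache.
  public static class JoinMapper extends Mapper<LongWritable, Text, Text, Text> {
    private final Map<String, String> countryByCode = new HashMap<>();

    @Override
    protected void setup(Context context) throws IOException {
      // "countries" is the local symlink created for the cached file (see the driver below)
      try (BufferedReader reader = new BufferedReader(new FileReader("countries"))) {
        String line;
        while ((line = reader.readLine()) != null) {
          String[] parts = line.split(",", 2);          // assumed layout: code,countryName
          countryByCode.put(parts[0], parts[1]);
        }
      }
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      String[] fields = value.toString().split(",");    // assumed layout: customerId,countryCode,...
      String country = countryByCode.getOrDefault(fields[1], "UNKNOWN");
      context.write(new Text(fields[0]), new Text(country));
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "map-side join");
    job.setJarByClass(MapSideJoin.class);
    job.setMapperClass(JoinMapper.class);
    job.setNumReduceTasks(0);                           // map-only job: no shuffle, no reducer
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);
    // Ship the small lookup file to every task; "#countries" names the local symlink
    job.addCacheFile(new URI("/user/hadoop/lookup/countries.txt#countries"));
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Because the join happens entirely in the mapper, the job runs with zero reducers and avoids a shuffle, which is exactly why map-side joins are preferred when one side of the join is small enough to fit in memory.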
- Searching
- Sorting
- Filtering
- Inverted Index
- TF-IDF
- Word Co-occurrence
- Distributed Grep
- Bloom Filters
- Average Calculation
- Standard Deviation
- Map-side joins
- Reduce-side joins
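As a concrete example of one of the algorithms listed above, here is a sketch of the inverted index pattern: the mapper tags every term with the name of the file it came from, and the reducer collapses those file names into a posting list. The plain-text input and the simple tokenisation are simplifying assumptions.

```java
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class InvertedIndex {

  // Mapper: emits (term, documentName) for every term in the line
  public static class IndexMapper extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      String doc = ((FileSplit) context.getInputSplit()).getPath().getName();
      for (String term : value.toString().toLowerCase().split("\\W+")) {
        if (!term.isEmpty()) {
          context.write(new Text(term), new Text(doc));
        }
      }
    }
  }

  // Reducer: collapses the document names for each term into one posting list
  public static class IndexReducer extends Reducer<Text, Text, Text, Text> {
    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
        throws IOException, InterruptedException {
      Set<String> docs = new HashSet<>();
      for (Text v : values) {
        docs.add(v.toString());
      }
      context.write(key, new Text(String.join(",", docs)));
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "inverted index");
    job.setJarByClass(InvertedIndex.class);
    job.setMapperClass(IndexMapper.class);
    job.setReducerClass(IndexReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```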
- What is Pig
- How Pig Works
- Simple processing using Pig
- Advanced Processing Using Pig
- Pig hands-on
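Pig Latin scripts can also be embedded in Java through the PigServer class, which is handy for trying scripts out locally. The sketch below is a hypothetical example (the log file name, the delimiter and the schema are assumptions) that counts requests per IP address.

```java
import org.apache.pig.ExecType;
import org.apache.pig.PigServer;

public class PigEmbedded {
  public static void main(String[] args) throws Exception {
    // ExecType.LOCAL runs against the local filesystem; ExecType.MAPREDUCE submits to the cluster
    PigServer pig = new PigServer(ExecType.LOCAL);

    // Hypothetical web-log analysis: count requests per IP address
    pig.registerQuery("logs = LOAD 'access_log.txt' USING PigStorage(' ') "
        + "AS (ip:chararray, url:chararray);");
    pig.registerQuery("by_ip = GROUP logs BY ip;");
    pig.registerQuery("hits = FOREACH by_ip GENERATE group AS ip, COUNT(logs) AS requests;");

    pig.store("hits", "hits_per_ip");   // writes the result relation to 'hits_per_ip'
  }
}
```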
- What is Hive
- How Hive Works
- Simple processing using Hive
- Advanced processing using Hive
- Hive hands-on
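Hive is usually driven from its shell during hands-on sessions, but it can equally be queried from Java over JDBC against HiveServer2. Below is a minimal sketch, assuming HiveServer2 is listening on localhost:10000 and using a simple "orders" table purely for illustration.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveQuery {
  public static void main(String[] args) throws Exception {
    // Explicitly load the HiveServer2 JDBC driver (auto-loaded if the jar is on the classpath)
    Class.forName("org.apache.hive.jdbc.HiveDriver");

    // Host, port, database and credentials are deployment-specific assumptions
    try (Connection conn = DriverManager.getConnection(
             "jdbc:hive2://localhost:10000/default", "hive", "");
         Statement stmt = conn.createStatement()) {

      // DDL and queries are ordinary HiveQL strings
      stmt.execute("CREATE TABLE IF NOT EXISTS orders (id INT, amount DOUBLE) "
          + "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','");

      try (ResultSet rs = stmt.executeQuery("SELECT COUNT(*), AVG(amount) FROM orders")) {
        while (rs.next()) {
          System.out.println(rs.getLong(1) + " rows, average amount " + rs.getDouble(2));
        }
      }
    }
  }
}
```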
- What is Oozie
- How Oozie Works
- Oozie hands-on
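A workflow designed in the Oozie module can be submitted programmatically through Oozie's Java client as well as through the Oozie CLI. The sketch below assumes an Oozie server on localhost:11000 and a workflow application already deployed to a hypothetical HDFS path.

```java
import java.util.Properties;
import org.apache.oozie.client.OozieClient;
import org.apache.oozie.client.WorkflowJob;

public class SubmitWorkflow {
  public static void main(String[] args) throws Exception {
    // Oozie server URL and the HDFS path of the deployed workflow app are assumptions
    OozieClient oozie = new OozieClient("http://localhost:11000/oozie");

    Properties conf = oozie.createConfiguration();
    conf.setProperty(OozieClient.APP_PATH, "hdfs://namenode:8020/user/hadoop/apps/my-workflow");
    conf.setProperty("nameNode", "hdfs://namenode:8020");
    conf.setProperty("jobTracker", "resourcemanager:8032"); // on YARN this is the ResourceManager address

    String jobId = oozie.run(conf);                 // submit and start the workflow
    System.out.println("Submitted workflow " + jobId);

    WorkflowJob job = oozie.getJobInfo(jobId);      // poll the current status
    System.out.println("Status: " + job.getStatus());
  }
}
```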
- What is Impala
- How Impala Works
- Where Impala is better than Hive
- Impala’s shortcomings
- Impala hands-on
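Since Impala speaks the same HiveServer2 wire protocol as Hive, the JDBC pattern shown earlier also works against Impala by pointing the connection at Impala's default port, which makes the Hive-versus-Impala comparison in this module easy to try for yourself. The host, port, authentication setting and table name below are assumptions.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ImpalaQuery {
  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hive.jdbc.HiveDriver");

    // 21050 is Impala's HiveServer2-compatible port; auth=noSasl suits an unsecured cluster
    try (Connection conn = DriverManager.getConnection("jdbc:hive2://localhost:21050/;auth=noSasl");
         Statement stmt = conn.createStatement();
         ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM orders")) {
      while (rs.next()) {
        System.out.println("orders: " + rs.getLong(1));
      }
    }
  }
}
```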
From the course:
- Understand Big Data and the various types of data stored in Hadoop
- Understand the fundamentals of MapReduce, Hadoop Distributed File System (HDFS), YARN, and how to write MapReduce code
- Learn best practices and considerations for Hadoop development, debugging techniques and implementation of workflows and common algorithms
- Learn how to leverage Hadoop frameworks like Apache Pig™, Apache Hive™, Sqoop, Flume, Oozie and other projects from the Apache Hadoop ecosystem
- Understand optimal hardware configurations and network considerations for building out, maintaining and monitoring your Hadoop cluster
- Learn advanced Hadoop API topics required for real-world data analysis
- Understand the path to ROI with Hadoop
From the workshop:
- 3 days of comprehensive training
- Learn the principles and philosophy behind the Apache Hadoop methodology
- Dummy projects to work on and gain practical knowledge
- Earn a 24 PDU certificate
- Downloadable e-book
- Industry-based case studies
- High quality training from an experienced trainer
- Course completion certificate after successfully passing the examination
This course is best suited to systems administrators, Windows administrators, Linux administrators, infrastructure engineers, DB administrators, Big Data architects, mainframe professionals and IT managers who are interested in learning Hadoop administration.
Other professionals who can take up this course include:
- Architects and developers who design, develop and maintain Hadoop-based solutions
- Data Analysts, BI Analysts, BI Developers, SAS Developers and related profiles who analyse Big Data in a Hadoop environment
- Consultants who are actively involved in a Hadoop project
- Experienced Java software engineers who need to understand and develop Java MapReduce applications for Hadoop 2.0