  • What is Hadoop?
  • Why Hadoop and its use cases.
  • Different components of the Hadoop ecosystem.
  • What Hadoop is good for and what it is not good for.

HDFS (Hadoop Distributed File System)

  • Significance of HDFS in Hadoop
  • Features of HDFS
  • The five daemons of Hadoop
  • NameNode and its functionality
  • DataNode and its functionality
  • Secondary NameNode and its functionality
  • JobTracker and its functionality
  • TaskTracker and its functionality
  • Data Storage in HDFS
  • Introduction to blocks
  • Data replication
  • Accessing HDFS
  • CLI (Command Line Interface) and admin commands
  • How to store various types of data in HDFS using CLI commands
  • Java-based approach
  • Safe mode concepts in HDFS


MapReduce

  • MapReduce Architecture
  • MapReduce Programming Model
  • Different phases of the MapReduce algorithm
  1. Mapper phase
  2. Sort & Shuffle phase
  3. Reducer phase
  • Different data types in MapReduce
  • How to write a basic MapReduce program
  1. The Driver Code
  2. The Mapper
  3. The Reducer
  • Creating Input and Output Formats in MapReduce Jobs
  1. Text Input Format
  2. Key Value Input Format
  3. Sequence File Input Format
  • Important features of a MapReduce job
  • Data locality in MapReduce
  • Combiner (mini reducer)
  • Partitioner

Apache PIG

  • Introduction to Apache Pig
  • MapReduce vs Apache Pig
  • SQL vs Apache Pig
  • Different data types in Pig
  • Modes of Execution in Pig
  • Local Mode
  • MapReduce or Distributed Mode
  • Execution Mechanism
  • Grunt Shell
  • Script
  • Embedded
  • Transformations in Pig
  • How to write a simple Pig script
  • How to store Pig output data in Sqoop & HDFS
  • UDFs in Pig


Apache Hive

  • Hive Introduction
  • Hive architecture
  • Hive Meta Store
  • Hive Integration with Hadoop
  • Hive Tables
  • Managed Tables
  • External Tables
  • Hive Query Language (HiveQL)
  • How to load data into Hive tables
  • Altering Tables in Hive
  • Partitions in Hive
  • CTAS in Hive
  • Joins in Hive
  • SQL vs HiveQL
  • Hive Transform
  • UDFs in Hive


Apache Sqoop

  • Introduction to Sqoop
  • MySQL client and server installation
  • How to connect to a relational database using Sqoop
  • Different Sqoop Commands
  • Different flavors of imports
  • Sqoop Eval Functions
  • Export


Apache HBase

  • HBase introduction
  • HBase use cases
  • HBase basics
  • Column families
  • Scans
  • HBase Architecture
  • HMaster
  • ZooKeeper
  • Region Servers
  • Regions
  • How to create tables in HBase
  • Introduction to Oozie
  • Introduction to ZooKeeper


Apache Flume

  • What is Flume?
  • How does Flume work?
  • Flume Architecture
  • Flume Agents
  • Flume Examples


Apache Oozie

  • What is Oozie?
  • How does Oozie work?
  • Oozie workflow


Apache Cassandra

  • What is Cassandra?
  • How does Cassandra work?

How to set up Hadoop clusters in the Apache distribution (4 days)

How to set up Hadoop clusters in the Cloudera distribution (4 days)

Prerequisites for the Course

OOP concepts (polymorphism, inheritance, encapsulation, etc.)

Java basics such as interfaces, classes, and abstract classes

File I/O

Basic Linux commands

MySQL concepts


  1. A real-time project explanation will be provided at the end of the course.
  2. Mock interviews will be conducted on a one-to-one basis after the course.
  3. Interview questions will be discussed.