Hadoop Training At Vsbtech

June 25, 2015

HADOOP AT VSBTECH

HADOOP BASICS

·Problems with traditional large-scale systems

·Data storage literature survey

·Data processing literature survey

·Network constraints

·Requirements for a new approach

Hadoop: Basic Concepts

·What is Hadoop?

·The Hadoop Distributed File System

·How Hadoop MapReduce works

·Anatomy of a Hadoop Cluster

·Master Daemons

·NameNode

·JobTracker

·Secondary NameNode

·Slave Daemons

·DataNode

·TaskTracker

HDFS (Hadoop Distributed File System)

·Blocks and Splits

·Input Splits

·HDFS Blocks

·Data Replication

·Hadoop rack awareness

·Data high availability

·Cluster architecture and block placement

CASE STUDIES

Programming Practices & Performance Tuning

·Developing MapReduce programs in:

·Local Mode

·Running without HDFS

·Pseudo-distributed Mode

·Running all daemons in a single node

·Fully distributed mode

·Running daemons on dedicated nodes

·Installing an Apache Hadoop single-node cluster

·NameNode in safe mode

Writing a MapReduce Program

·Examining a Sample MapReduce Program

·With several examples

·Basic API Concepts

·The Driver Code

·The Mapper

·The Reducer

·Hadoop's Streaming API

Performing several Hadoop jobs

·The configure and close Methods

·Sequence Files

·Record Reader

·Record Writer

·Role of Reporter

·Output Collector

·Counters

·Directly Accessing HDFS

·ToolRunner

·Using The Distributed Cache

·Killing a job

Several MapReduce jobs (in detail)

·Most effective search using MapReduce

·Generating recommendations using MapReduce

·Processing log files using MapReduce

·Image counters in MapReduce

·MRUnit testing

·Identity Mapper

·Identity Reducer

·Exploring well known problems using MapReduce applications

Debugging MapReduce Programs

·Testing with MRUnit

·Logging

·Other Debugging Strategies

Advanced MapReduce Programming

·The Secondary Sort

·Customized Input Formats and Output Formats

·Joins in MapReduce

·Compression

Monitoring and debugging on a Production Cluster

·Skipping Bad Records

·Running in local mode

Tuning for Performance in MapReduce

·Reducing network traffic with combiner

·Partitioners

·Reducing the amount of input data

·Speculative execution

·Other Performance Aspects

CASE STUDIES

CDH4 Enhancements

·NameNode high availability

·NameNode federation

·Fencing

·MapReduce version 2 (MRv2)

HIVE

·Hive concepts

·Hive architecture

·Install and configure Hive on a cluster

·Different types of tables in Hive

·Hive library functions

·Buckets

·Partitions

·Joins in Hive

·Inner joins

·Outer Joins

·Hive UDF

·Hive SerDe

·Processing JSON in Hive

·Compression in Hive

PIG

·Pig basics

·Install and configure Pig on a cluster

·Pig library functions

·Pig vs. Hive

·Write sample Pig Latin scripts

·Modes of running Pig

·Running in Grunt shell

·Designing Pig Scripts

·Using PiggyBank

·Running as Java program

·Pig UDFs

·Pig Macros

·Debugging Pig

IMPALA

·Differences between Impala, Hive, and Pig

·How Impala gives good performance

·Exclusive features of Impala

·Impala Challenges

·Use cases of Impala

SQOOP

·Install and configure Sqoop on a cluster

·Connecting to RDBMS

·Installing MySQL

·Import data from Oracle/MySQL to Hive

·Export data to Oracle/MySQL

·Internal mechanism of import/export

FLUME

·Architecture

·Ingesting streaming tweets

·HDFS as a sink

NOSQL

HBase

·HBase concepts

·HBase architecture

·Region server architecture

·File storage architecture

·HBase basics

·Column access

·Scans

·HBase use cases

·Install and configure HBase on a multi node cluster

·Create a database; develop and run sample applications

OOZIE

·Oozie architecture

·XML file specifications

·Installing and configuring Apache Oozie

·Specifying a workflow

·Action nodes

·Control nodes

·Oozie job coordinator

Hadoop Challenges

·Hadoop disaster recovery

·Hadoop suitable cases

ELASTICSEARCH

·Get and Put API

·Java approach

·ElasticSearch with Kibana

SPARK

·Basics of in memory computation

·RDD in Spark

·Installation

·Spark with Scala example

·Spark Java API

·Spark MLlib

STORM BASICS

KAFKA BASICS


vmanubolu
