Big Data Hadoop Online

What is Big Data
Hadoop Overview
Introduction to HDFS
HDFS Architecture
MapReduce v1
MapReduce v2/YARN
HBase
Hive
Pig
Flume
Sqoop

This course brings together several key information technologies used in manipulating, storing, and analyzing big data. We look at the basic tools for statistical analysis, including R, and at key methods used in machine learning. We review MapReduce techniques for parallel processing, along with Hadoop, an open source framework that allows us to implement MapReduce cheaply and efficiently on Internet-scale problems. We touch on related tools that provide SQL-like access to unstructured data: Pig and Hive. We analyze so-called NoSQL storage solutions, exemplified by HBase, for their critical features: speed of reads and writes, data consistency, and ability to scale to extreme volumes. We examine memory-resident databases and streaming technologies, which allow analysis of data in real time. We work with the public cloud as an effectively unlimited resource for big data analytics. Students gain the ability to design highly scalable systems that can accept, store, and analyze large volumes of unstructured data in batch mode, in real time, or both.

Fees

Self-Paced Training

$499