Thursday, August 16, 2018

MapReduce – A Programming Model On Shared Nothing Architecture

MapReduce Framework: MapReduce enables programmers to process huge amounts of data in parallel across distributed processors. It handles details such as parallelization, fault tolerance, data distribution...

Apache YARN

Introduction to Apache YARN: Apache YARN is the prerequisite for Enterprise Hadoop, providing resource management and a central platform to deliver consistent operations, security, and...

Apache Pig – A Detailed Overview

Introduction: Apache Pig is a scripting platform for processing and analyzing large data sets. Pig was designed to perform long series of data operations, making...

Apache Hive – The SQL-like Interface

Apache Hive Overview: Apache Hive is a data warehouse infrastructure built on top of Apache Hadoop that provides data summarization, ad-hoc query, and analysis of...

Data Ingestion Technologies – Big Data

Introduction: Many data sources, in many different formats, need to be ingested into HDFS. Just as there are many vendors that your...
