Hadoop Distributed File System (HDFS) is a Java-based file system that provides scalable and reliable data storage, and it was designed to span large clusters of commodity servers.
HDFS has demonstrated production scalability of up to 200 PB of storage and a single cluster of 4500 servers, supporting close to a billion files and blocks (can we call it Big Data?). When that quantity and quality of enterprise data is available in HDFS, and YARN enables multiple data access applications to process it, Hadoop users can confidently answer questions that eluded previous data platforms.
HDFS is a scalable, fault-tolerant, distributed storage system that works closely with a wide variety of concurrent data access applications, coordinated by YARN. HDFS will “just work” under a variety of physical and systemic circumstances. By distributing storage and computation across many servers, the combined storage resource can grow linearly with demand while remaining economical at every amount of storage.
Key HDFS Concepts
- Write Once, Read Many times (WORM)
- Divide files into big blocks and distribute across the cluster
- Store multiple replicas of each block for reliability
- Programs can ask “where do the pieces of my file live?” in order to find replicas for each block and try to gain data locality
HDFS Looks Like a File System
Common tools such as a web-based file system browser and basic CLI represent the basic expectations of a file system.
However, the differences between HDFS and other distributed file systems are significant:
- HDFS is highly fault-tolerant
- It is designed to be deployed on low-cost hardware
- The HDFS provides high throughput access to application data and is suitable for applications that have large data sets
- It allows streaming access to file system data
HDFS Acts like a File System
hdfs dfs -command [args]
The concerted effort has been made to move from “hadoop fs” to “hdfs dfs”.
Hadoop Documentation states that “hdfs dfs” is a synonym for “hadoop fs”.
HDFS Components and Interactions
The HDFS NameNode and DataNodes form the basis of the HDFS architecture and interact to distribute and replicate data across the cluster.
The NameNode is the master service of HDFS. It determines and maintains how the chunks of data are distributed across the DataNodes. A Namenode represents a single namespace. Data never reside on a NameNode.
DataNodes store the chunks of data, and are responsible for replicating the chunks across other DataNodes. The Default block size in HDP 128MB. The default replication factor is 3.
NameNode and DataNodes are components of the HDFS service. The NameNode is an HDFS master component while a DataNode is an HDFS worker component. They are implemented as daemons running inside a Java virtual machine.
The NameNode maintains critical HDFS information. To enhance HDFS performance, it maintains and serves this information from memory. The memory-based information includes:
- Namespace information
- Metadata information
- Journaling information
- Block map information
Because the NameNode maintains all file system state information in memory, it is critical to ensure that the NameNode has sufficient memory.
Resolving Missing or Corrupted Blocks
DataNodes periodically send a Heartbeat message to the NameNode. If a DataNode loses connectivity with the namenode, the NameNode detects this condition by the absence of the Heartbeat. The NameNode then marks silent DataNode as dead and will not forward any new IO requests to it. Data registered to a dead DataNode will no longer be available to HDFS.
It is the NameNode’s job to track which blocks need to be replicated and initiate replication. A need for re-replication may occur for several reasons:
- A DataNode may become unavailable
- A replica could become corrupted
- A hard disk on a DataNode may fail
- The Replication factor of a file may be increased
NameNode and DataNodes Interaction
HDFS is a master-slave architecture.
An HDFS cluster has a single NameNode that manages the namespace and regulates access to files by clients. There are a number of DataNodes, usually one per node in the cluster, which manage storage.
HDFS allows user data to be stored in files. DataNodes are responsible for serving read and write requests from the file system’s clients. The DataNodes also perform block creation, deletion, and replication upon instruction from the NameNode.
Replication and Block Placement
HDFS is designed to assume that disk, system, and network failures will occur. As a result, HDFS is also designed to automatically and transparently handle disk failures. It does this by automatically replicating data across different DataNodes.
HDFS stores a file as a sequence of blocks; all blocks in a file except the last block are the same size. Block size and replication factor are configurable per file and an application can specify the number of replicas of a file.
NameNode High Availability
In Hadoop, prior to version 2.0, the NameNode was SPOF (Single Point Of Failure). The entire cluster would become unavailable if the NameNode failed or became unreachable. Even maintenance events such as software or hardware upgrades on the NameNode machine would result in periods of cluster downtime.
The HDFS NameNode High Availability (HA) feature eliminates the NameNode as a single point of failure. It enables a cluster to run redundant NameNodes in an Active/Standby configuration.
NameNode HA enables fast failover to the Standby NameNode in response to a failure, or a graceful administrator-initiated failover for planned maintenance.
There are two ways of configuring NameNode HA. Using the Ambari Web UI is the easiest way. Manually editing the configuration files and starting or restarting the necessary daemons is also possible. However, manual configuration of NameNode HA is not compatible with
Ambari administration. Any manual edits to the hdfs-site.xml file would be overwritten by information in the Ambari database when the HDFS service is restarted.
HDFS Multi-Tenant Controls
- Classic POSIX permissioning (ex: -rwxr-xr–)
- Extended Access Control Lists (ACL) for richer scenarios
- Centralized authorization policies and audit available via Ranger plug-in
- Easy to understand data size quotas
- An additional option for controlling the number of files
- HDFS breaks files into blocks and replicates them for reliability and processing data locality
- The primary components are the master NameNode service and the worker DataNode service
- The NameNode is a memory-based service
- The NameNode automatically takes care of recovery missing and corrupted blocks
- Clients interact with the NameNode to get a list, for each block, of DataNodes to write data to.