Yet Another Resource Negotiator (YARN) - Big Data (Part 3)
Yet Another Resource Negotiator (YARN)
Yet Another Resource Negotiator (YARN) is the resource manager of Hadoop, created by splitting the processing engine from the resource-management function of MapReduce. It is responsible for allocating system resources to the various applications running in a Hadoop cluster, providing Hadoop's high-availability features, scheduling tasks, and implementing security controls.
YARN Architecture
The most important components of YARN are the Resource Manager, the Node Manager, and the Application Master.
Resource Manager
- It is the master of the YARN architecture. It runs services such as the Resource Scheduler and the Application Manager.
- It arbitrates the available cluster resources among competing applications.
- It ensures maximum cluster utilization.
- The Resource Scheduler is responsible for allocating resources to the applications.
- The Application Manager maintains a list of submitted, running, and completed applications; a client can query this list, as sketched below.
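
A minimal sketch of querying the Resource Manager's application bookkeeping with Hadoop's YarnClient API. The class name ListApplications is ours for illustration; the rest is the standard org.apache.hadoop.yarn client API.

```java
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class ListApplications {
    public static void main(String[] args) throws Exception {
        Configuration conf = new YarnConfiguration();
        YarnClient yarnClient = YarnClient.createYarnClient();
        yarnClient.init(conf);
        yarnClient.start();

        // The Application Manager's view: submitted, running, and
        // completed applications known to the Resource Manager.
        List<ApplicationReport> apps = yarnClient.getApplications();
        for (ApplicationReport app : apps) {
            System.out.printf("%s  %s  %s%n",
                app.getApplicationId(), app.getName(),
                app.getYarnApplicationState());
        }
        yarnClient.stop();
    }
}
```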
Node Manager
- It is the slave of the YARN architecture. After starting, it announces itself to the Resource Manager and offers its resources to the cluster.
- Each Node Manager takes instructions from the Resource Manager, reports back to it, and handles containers on a single node.
- It manages the container lifecycle (the sketch after this list shows an Application Master driving that lifecycle).
- It manages node health, log management, and node- and container-level resource usage.
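
A minimal sketch of how an Application Master drives a container's lifecycle through a Node Manager using Hadoop's NMClient API. The container and launch-context parameters are assumed to come from an earlier Resource Manager allocation and a prepared CLC (see the CLC sketch later in this article).

```java
import org.apache.hadoop.yarn.api.records.Container;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.client.api.NMClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class ContainerLifecycle {
    // `container` is assumed to have been allocated by the Resource Manager
    // and `ctx` prepared by the Application Master beforehand.
    static void runAndStop(Container container, ContainerLaunchContext ctx)
            throws Exception {
        NMClient nmClient = NMClient.createNMClient();
        nmClient.init(new YarnConfiguration());
        nmClient.start();

        // Ask the Node Manager on the container's host to launch the process.
        nmClient.startContainer(container, ctx);

        // The Node Manager tracks the container's state and resource usage.
        System.out.println(nmClient.getContainerStatus(
            container.getId(), container.getNodeId()));

        // Tear the container down once the task is finished.
        nmClient.stopContainer(container.getId(), container.getNodeId());
        nmClient.stop();
    }
}
```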
Application Master
The Application Master in YARN is a framework-specific library that negotiates resources from the Resource Manager and works with the Node Managers to execute and monitor containers and their resource consumption.
- It manages the application lifecycle.
- It manages the execution flow.
- It reports the application's status to the Resource Manager (a register/heartbeat/deregister sketch follows this list).
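
A minimal Application Master skeleton using Hadoop's AMRMClient API: register with the Resource Manager, request a container, heartbeat via allocate(), and deregister. The memory/vcore numbers and the empty host/tracking-URL strings are placeholder values; a real AM would loop on allocate() until its containers are assigned.

```java
import org.apache.hadoop.yarn.api.records.FinalApplicationStatus;
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class AppMasterSkeleton {
    public static void main(String[] args) throws Exception {
        AMRMClient<ContainerRequest> rmClient = AMRMClient.createAMRMClient();
        rmClient.init(new YarnConfiguration());
        rmClient.start();

        // Register with the Resource Manager (host, port, and tracking
        // URL are placeholders here).
        rmClient.registerApplicationMaster("", 0, "");

        // Negotiate one container: 1024 MB of memory, 1 virtual core.
        Resource capability = Resource.newInstance(1024, 1);
        rmClient.addContainerRequest(
            new ContainerRequest(capability, null, null,
                                 Priority.newInstance(0)));

        // Each allocate() call doubles as the heartbeat that the
        // AMLivelinessMonitor watches for.
        rmClient.allocate(0.1f);

        // Report the final status back to the Resource Manager.
        rmClient.unregisterApplicationMaster(
            FinalApplicationStatus.SUCCEEDED, "done", "");
        rmClient.stop();
    }
}
```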
How Resource Manager Operates
- The Resource Manager and the clients communicate through an interface called the ClientService.
- Administrative requests are served by a separate interface called the AdminService.
- The Resource Manager continuously receives node heartbeats from the Node Managers to track new or decommissioned nodes.
- The NMLivelinessMonitor and NodesListManager keep an updated view of which nodes are healthy so that the Scheduler and the ResourceTrackerService can allocate work appropriately (a client can query this node status, as sketched below).
- Application Masters on all nodes are managed by the ApplicationMasterService.
- A list of Application Masters and their last heartbeat times is kept by the AMLivelinessMonitor to let the Resource Manager know which applications on the cluster are healthy.
- An Application Master that does not send a heartbeat within a configured interval is marked as dead and is rescheduled to run in a new container.
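
A minimal sketch of inspecting the node status that the Resource Manager maintains from those heartbeats, using YarnClient.getNodeReports(). The class name NodeHealth is ours for illustration.

```java
import java.util.List;

import org.apache.hadoop.yarn.api.records.NodeReport;
import org.apache.hadoop.yarn.api.records.NodeState;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class NodeHealth {
    public static void main(String[] args) throws Exception {
        YarnClient yarnClient = YarnClient.createYarnClient();
        yarnClient.init(new YarnConfiguration());
        yarnClient.start();

        // Nodes that have heartbeated recently and are considered healthy.
        List<NodeReport> nodes = yarnClient.getNodeReports(NodeState.RUNNING);
        for (NodeReport node : nodes) {
            System.out.printf("%s  health: %s%n",
                node.getNodeId(), node.getHealthReport());
        }
        yarnClient.stop();
    }
}
```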
Resource Manager – High Availability Mode
Before the Hadoop 2.4 release, the Resource Manager was the single point of failure in a YARN cluster.
The High Availability (HA) feature introduces an Active/Standby Resource Manager pair to remove this single point of failure; the sketch below shows the core settings that enable it.
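
A minimal sketch of the yarn-site.xml settings that enable Resource Manager HA in Hadoop 2.x, expressed programmatically for illustration. The hostnames, cluster id, and ZooKeeper quorum are placeholder values; newer Hadoop releases may prefer different property names for the ZooKeeper address.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class RmHaConfig {
    public static Configuration build() {
        Configuration conf = new YarnConfiguration();
        conf.setBoolean("yarn.resourcemanager.ha.enabled", true);
        conf.set("yarn.resourcemanager.cluster-id", "cluster1");
        // The Active/Standby Resource Manager pair.
        conf.set("yarn.resourcemanager.ha.rm-ids", "rm1,rm2");
        conf.set("yarn.resourcemanager.hostname.rm1", "master1.example.com");
        conf.set("yarn.resourcemanager.hostname.rm2", "master2.example.com");
        // ZooKeeper quorum used for leader election and failover.
        conf.set("yarn.resourcemanager.zk-address",
                 "zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181");
        return conf;
    }
}
```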
Node Manager: Launching a Container
The Application Master must provide a container launch context (CLC) to launch a container. The CLC includes the following information (a minimal construction sketch follows the list):
- Security tokens
- Environment Variables
- Dependencies (local resources such as shared objects or data files needed before launch)
- The command necessary to create the process that the application wants to launch
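
A minimal sketch of building a CLC with Hadoop's Records factory. The class com.example.MyTask, the heap size, and the CLASSPATH entry are placeholder values; on a secure cluster, security tokens would also be attached with setTokens().

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.yarn.api.ApplicationConstants;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.api.records.LocalResource;
import org.apache.hadoop.yarn.util.Records;

public class ClcExample {
    public static ContainerLaunchContext build() {
        ContainerLaunchContext clc =
            Records.newRecord(ContainerLaunchContext.class);

        // Environment variables visible to the launched process.
        Map<String, String> env = new HashMap<>();
        env.put("CLASSPATH", "./*");
        clc.setEnvironment(env);

        // Dependencies: local resources (jars, shared objects, data files)
        // the Node Manager localizes before launch; empty in this sketch.
        Map<String, LocalResource> localResources = new HashMap<>();
        clc.setLocalResources(localResources);

        // The command that creates the process the application wants to
        // launch, with stdout/stderr redirected into the container log dir.
        clc.setCommands(Arrays.asList(
            "$JAVA_HOME/bin/java -Xmx512m com.example.MyTask"
                + " 1>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stdout"
                + " 2>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stderr"));
        return clc;
    }
}
```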
Running an Application Through YARN
1. The client submits an application to the Resource Manager:
- The user submits the application to the Resource Manager, typically with the hadoop jar command (an end-to-end submission sketch follows this walkthrough).
- The Resource Manager maintains a list of the applications on the cluster and of the resources available on each Node Manager, and determines which application receives a portion of the cluster resources next.
2. The Resource Manager allocates a container:
- After the Resource Manager accepts the new application submission, the scheduler selects a container in which to start the Application Master.
- The Application Master is then launched and is responsible for the entire lifecycle of the application.
3. The Application Master contacts the related Node Manager:
- Once a container is allocated, the Application Master asks the Node Manager managing the host on which the container was allocated to use these resources to launch an application-specific task.
4. The Node Manager launches the container:
- The Node Manager launches the container process and monitors only the resource usage in the containers, not the tasks themselves.
- The Application Master keeps negotiating containers to launch all of the tasks needed to complete the application.
5. The container executes the Application Master:
- After the application completes, the Application Master shuts itself down and releases its container.
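
A minimal end-to-end submission sketch (steps 1 and 2 above) using Hadoop's YarnClient API. The application name and resource sizes are placeholders, and ClcExample.build() is the hypothetical CLC helper sketched earlier; steps 3 through 5 are then driven by the Application Master itself, as in the AMRMClient and NMClient sketches above.

```java
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.client.api.YarnClientApplication;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class SubmitApp {
    public static void main(String[] args) throws Exception {
        YarnClient yarnClient = YarnClient.createYarnClient();
        yarnClient.init(new YarnConfiguration());
        yarnClient.start();

        // Step 1: ask the Resource Manager for a new application id.
        YarnClientApplication app = yarnClient.createApplication();
        ApplicationSubmissionContext appContext =
            app.getApplicationSubmissionContext();
        appContext.setApplicationName("demo-app");

        // Step 2: describe the container the Application Master runs in;
        // ClcExample.build() is the CLC helper sketched earlier.
        ContainerLaunchContext amClc = ClcExample.build();
        appContext.setAMContainerSpec(amClc);
        appContext.setResource(Resource.newInstance(1024, 1));

        // Hand the application over to the Resource Manager's scheduler.
        ApplicationId appId = yarnClient.submitApplication(appContext);
        System.out.println("Submitted " + appId);
        yarnClient.stop();
    }
}
```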