CN110958311A - YARN-based shared cluster elastic scaling system and method - Google Patents

YARN-based shared cluster elastic scaling system and method

Info

Publication number: CN110958311A
Application number: CN201911179701.5A
Authority: CN (China)
Original language: Chinese (zh)
Prior art keywords: node, cluster, manager, elastic, resource
Inventors: 曹东刚, 马俊明, 邵嘉伦
Original and current assignee: Peking University
Application filed by: Peking University
Priority and filing date: 2019-11-27
Publication date: 2020-04-03
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)

Classifications

    • H04L67/1031 — Controlling of the operation of servers by a load balancer, e.g. adding or removing servers that serve requests
    • H04L41/0893 — Assignment of logical groups to network elements (configuration management of networks or network elements)
    • H04L43/0817 — Monitoring or testing based on specific metrics, e.g. QoS, by checking availability and functioning
    • H04L43/16 — Threshold monitoring
    • H04L67/10 — Protocols in which an application is distributed across nodes in the network
    • H04L67/1029 — Accessing one among a plurality of replicated servers using data related to the state of servers by a load balancer

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Environmental & Geological Engineering (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a YARN-based shared cluster elastic scaling system and method. The system comprises: fixed nodes; elastic nodes, which join or leave the cluster as the cluster load changes; an application manager; a resource manager running on a fixed node; and node managers running on the fixed or elastic nodes. The resource manager schedules resources, monitors the resource utilization of the cluster, and interacts with the public cloud platform according to that utilization to add elastic nodes to the cluster or release elastic nodes from it. The application manager communicates with the node managers according to the resource manager's allocation, so that the fixed or elastic node where a node manager is located starts job tasks, and it manages and monitors those tasks; the node manager makes its fixed or elastic node execute tasks. The invention allows the scale of a shared cluster to grow and shrink elastically as the load changes.

Description

YARN-based shared cluster elastic scaling system and method
Technical Field
The invention relates to the technical field of computer clusters, and in particular to a YARN-based shared cluster elastic scaling system and method.
Background
Cloud computing is an information-technology service model that lets users consume hardware resources such as computing, network, and storage on demand. A public cloud is cloud infrastructure operated and maintained by a third-party enterprise to provide services to individual and enterprise users. Representative public cloud providers currently include Alibaba Cloud, Amazon AWS, and Microsoft Azure. Public cloud operators provide programmable application programming interfaces (APIs) that enable consumers to use public cloud resources more efficiently.
YARN is software that manages large computer clusters. YARN supports different application frameworks running simultaneously on shared cluster hardware; for example, a YARN cluster can run Hadoop MapReduce and Spark jobs at the same time. YARN uses a two-level scheduling mechanism: each job has a centralized management program, the Application Master (AM), and each AM applies for resources from YARN's Resource Manager (RM). Having obtained the allocated resources, the AM further distributes them to the different tasks within the job. The AM communicates with a Node Manager (NM) to run tasks on each computer node.
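For orientation only (this sketch is not part of the patent; it uses the stock Hadoop YARN client API, and the container size, launch command, and empty registration parameters are illustrative), an AM's interaction with the RM and NM looks roughly like this:

```java
import java.util.Collections;

import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
import org.apache.hadoop.yarn.api.records.Container;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
import org.apache.hadoop.yarn.client.api.NMClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class MinimalAppMaster {
    public static void main(String[] args) throws Exception {
        YarnConfiguration conf = new YarnConfiguration();

        // Register this Application Master with the Resource Manager.
        AMRMClient<ContainerRequest> rm = AMRMClient.createAMRMClient();
        rm.init(conf);
        rm.start();
        rm.registerApplicationMaster("", 0, "");

        // Ask the RM for one container (1 GiB, 1 vcore); the RM's scheduler
        // decides which node it lands on.
        rm.addContainerRequest(new ContainerRequest(
                Resource.newInstance(1024, 1), null, null, Priority.newInstance(0)));

        // Heartbeat the RM until the container is allocated, then ask the NM
        // on the chosen node to launch a task inside it.
        NMClient nm = NMClient.createNMClient();
        nm.init(conf);
        nm.start();
        boolean launched = false;
        while (!launched) {
            AllocateResponse resp = rm.allocate(0.0f);
            for (Container c : resp.getAllocatedContainers()) {
                ContainerLaunchContext ctx = ContainerLaunchContext.newInstance(
                        null, null, Collections.singletonList("sleep 60"),
                        null, null, null);
                nm.startContainer(c, ctx);
                launched = true;
            }
            Thread.sleep(1000);
        }
    }
}
```

The system described in this patent extends exactly this protocol: its RM additionally monitors utilization and drives node creation, node deletion, and container migration.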
YARN is commonly used in large data centers to manage computer clusters. With the development of cloud computing, more and more enterprises choose public cloud services and deploy their data centers on the public cloud. However, YARN, as cluster-management software, cannot effectively support job migration, and therefore cannot fully exploit the elasticity that cloud computing offers.
Disclosure of Invention
The invention aims to provide a YARN-based shared cluster elastic scaling system and method for cloud environments.
In order to achieve the purpose, the invention provides the following scheme:
the invention provides a YARN-based shared cluster elastic expansion system in a first aspect, which comprises: a fixed node, a resilient node, an application manager, a resource manager running on the fixed node, and a node manager running on the fixed node or the resilient node.
The fixed nodes are permanent, unchanging nodes of the shared cluster; the elastic nodes are nodes that join or leave the cluster as the cluster load changes. The system is deployed on a virtual-machine cluster in a public cloud environment and is an extension of native YARN: fixed nodes exist throughout the entire lifetime of the cluster, while elastic nodes are dynamically added to or released from the cluster as the load changes.
The resource manager schedules resources, monitors the resource utilization of the cluster, and interacts with the public cloud platform according to that utilization to add elastic nodes to the cluster or release elastic nodes from it. The application manager communicates with the node managers according to the resource manager's allocation, so that the fixed or elastic node where a node manager is located starts job tasks, and it manages and monitors those tasks. The node manager makes its fixed or elastic node run the tasks.
Optionally, the resource manager is the core module of the entire system. It runs on a fixed node and may include the following four modules:
the resource monitoring module, which periodically monitors the resource utilization of the cluster;
the resource scheduling module, which receives task allocation requests submitted by the application manager and performs resource scheduling;
the elastic management module, which decides whether to expand or shrink the cluster according to the resource utilization provided by the resource monitoring module;
and the cloud platform interaction module, which interacts with the public cloud platform according to that decision, creating new virtual machines as elastic nodes or releasing the virtual machines corresponding to elastic nodes in the cluster, thereby dynamically adjusting the scale of the whole cluster.
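As a structural illustration only (the module names mirror the patent's description, but every type and signature below is an assumption rather than a published API), the four modules could be expressed as:

```java
import java.util.List;

/** Periodically samples resource utilization (0.0 - 1.0). */
interface ResourceMonitor {
    double clusterUtilization();
    double nodeUtilization(String nodeId);
}

/** Receives task allocation requests submitted by application managers. */
interface ResourceScheduler {
    void schedule(String applicationId, int containers, int memoryMb, int vcores);
}

/** Decides expansion or contraction from the monitored utilization. */
interface ElasticManager {
    enum Decision { EXPAND, SHRINK, HOLD }
    Decision decide(double clusterUtilization);
}

/** Talks to the public cloud platform to create or release elastic-node VMs. */
interface CloudPlatformClient {
    List<String> createVms(int count);  // returns the new VMs' IP addresses
    void releaseVm(String nodeId);
}
```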
Optionally, the node manager includes a container manager module, which monitors the state of its node, manages the tasks on the node in the form of containers, and migrates tasks during cluster scaling.
The second aspect of the invention provides a YARN-based shared cluster elastic scaling method, applied to the YARN-based shared cluster elastic scaling system provided by the invention. The method includes:
the resource manager creates an application manager to manage a job submitted by a user;
the application manager applies to the resource manager for the resources the job needs;
the resource manager allocates resources on a number of nodes to the application manager according to the resource-usage information of each node;
after obtaining the resource allocation, the application manager communicates with the node managers of the corresponding nodes and starts the job's subtasks on those nodes;
the resource manager monitors the resource utilization of the cluster in real time;
the resource manager judges whether the resource utilization of the cluster is greater than a first threshold;
if the cluster's resource utilization exceeds the first threshold for several consecutive monitoring periods, the resource manager interacts with the public cloud platform and adds elastic nodes to the cluster;
the resource manager judges whether the resource utilization of an elastic node in the cluster is less than a second threshold;
and if the elastic node's resource utilization stays below the second threshold for several consecutive monitoring periods, the resource manager deletes the elastic node from the cluster.
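A minimal sketch of this two-threshold rule with consecutive-period hysteresis (the 80%/20% thresholds and the three-period window are assumed example values; the patent leaves the monitoring period and thresholds user-configurable):

```java
/** Scale-out/scale-in decision with consecutive-period hysteresis. A sketch:
 *  the thresholds and window below are illustrative, not from the patent. */
class ScalingDecision {
    static final double FIRST_THRESHOLD = 0.80;   // cluster-wide, scale out
    static final double SECOND_THRESHOLD = 0.20;  // per elastic node, scale in
    static final int CONSECUTIVE_PERIODS = 3;

    private int highPeriods = 0;
    private int lowPeriods = 0;

    /** Called once per monitoring period with fresh utilization samples. */
    String onSample(double clusterUtil, double elasticNodeUtil) {
        highPeriods = clusterUtil > FIRST_THRESHOLD ? highPeriods + 1 : 0;
        lowPeriods  = elasticNodeUtil < SECOND_THRESHOLD ? lowPeriods + 1 : 0;
        if (highPeriods >= CONSECUTIVE_PERIODS) { highPeriods = 0; return "EXPAND"; }
        if (lowPeriods  >= CONSECUTIVE_PERIODS) { lowPeriods = 0;  return "SHRINK"; }
        return "HOLD";
    }
}
```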
Optionally, before adding an elastic node to the cluster, the method further includes:
the resource manager determining, from the cluster's resource utilization, the number of elastic nodes that need to join the cluster.
Optionally, the resource manager interacting with the public cloud platform and adding elastic nodes to the cluster specifically includes:
calling the public cloud API to interact with the public cloud platform, creating a set number of virtual machines, and determining the IP addresses of the newly added virtual machines;
and the resource manager remotely logging in to each newly added virtual machine by its IP address, starting a node manager on it, and adding it to the cluster.
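As an illustration of this step (everything here is an assumption: `CloudPlatformClient` is the hypothetical interface sketched earlier, and the SSH user and the standard Hadoop 3 launch command `yarn --daemon start nodemanager` stand in for whatever the deployment actually uses):

```java
import java.util.List;

class ClusterExpander {
    private final CloudPlatformClient cloud;  // hypothetical, sketched earlier

    ClusterExpander(CloudPlatformClient cloud) { this.cloud = cloud; }

    /** Create `count` VMs, then remotely start a Node Manager on each one so
     *  it registers with the RM and joins the cluster as an elastic node. */
    void expand(int count) throws Exception {
        List<String> ips = cloud.createVms(count);
        for (String ip : ips) {
            // Remote login plus NM startup; the user and command are illustrative.
            new ProcessBuilder("ssh", "yarn@" + ip,
                    "yarn --daemon start nodemanager")
                    .inheritIO().start().waitFor();
        }
    }
}
```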
Optionally, before the resource manager deletes an elastic node from the cluster, the method further includes:
judging whether an application manager is running on the elastic node;
and if not, deleting the elastic node from the cluster.
Optionally, before the resource manager deletes an elastic node from the cluster, the method further includes:
migrating the containers running on the elastic node to be deleted.
Optionally, migrating the containers running on the elastic node to be deleted specifically includes:
the resource manager determines, for each container, a target node identification number and the identification number of the application manager managing the container, and transmits the target node identification number to that application manager and to the node manager of the elastic node to which the container belongs;
the node manager migrates the container to the target node and, once migration is complete, reports the completed migration to the resource manager;
the resource manager notifies the node manager on the target node to restore the migrated container;
after restoring the container, that node manager notifies the resource manager that the container has been fully restored;
the resource manager calls the public cloud API to interact with the public cloud platform and deletes the virtual machine corresponding to the elastic node;
the resource manager informs the application manager that the container has been migrated to the target node;
and the application manager establishes an RPC association with the node manager on the target node.
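The handshake above can be read as the following ordered message trace (a sketch only; the message names are assumptions, and the patent specifies RPC heartbeat return values as the carrier):

```java
/** The migration handshake as an ordered message trace (names are assumed). */
enum MigrationMessage {
    RM_TO_AM_MIGRATE_NOTICE,   // RM -> AM: your container X moves to node T
    RM_TO_NM_MIGRATE_ORDER,    // RM -> source NM: migrate container X to T
    NM_TO_RM_MIGRATION_DONE,   // source NM -> RM: container X has migrated
    RM_TO_NM_RESTORE_ORDER,    // RM -> target NM: restore container X
    NM_TO_RM_RESTORE_DONE,     // target NM -> RM: container X restored
    RM_TO_CLOUD_DELETE_VM,     // RM -> cloud API: delete the source node's VM
    RM_TO_AM_MIGRATED,         // RM -> AM: container X now lives on T
    AM_TO_NM_NEW_RPC           // AM establishes RPC with the target NM
}
```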
According to the specific embodiments provided by the invention, the invention discloses the following technical effects. The YARN-based shared cluster elastic scaling system for cloud environments comprises fixed nodes, which are permanent in the shared cluster, and elastic nodes, which join or leave the cluster as the shared cluster's load changes. The resource manager monitors the cluster's resource utilization; when it exceeds a first threshold, the system interacts with the public cloud platform and adds elastic nodes to the cluster; when an elastic node's resource utilization falls below a second threshold, the system interacts with the public cloud platform, deletes the elastic node from the cluster, and migrates the containers on it. The scale of the shared cluster can thus stretch and shrink elastically with the load, realizing the elasticity characteristic of cloud computing.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic structural diagram of the YARN-based shared cluster elastic scaling system in an embodiment of the present invention;
FIG. 2 is a flowchart of the YARN-based shared cluster elastic scaling method in an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention aims to provide a YARN-based shared cluster elastic scaling system and method for cloud environments.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
As shown in FIG. 1, the YARN-based shared cluster elastic scaling system provided by the invention comprises:
the fixed nodes, which are permanent, unchanging nodes of the shared cluster;
the elastic nodes, which join or leave the cluster as the cluster load changes;
the resource manager, which runs on a fixed node and schedules resources, monitors the resource utilization of the cluster, and interacts with the public cloud platform according to that utilization to add elastic nodes to the cluster or release elastic nodes from it;
the application manager, which communicates with the node managers according to the resource manager's allocation, so that the fixed or elastic node where a node manager is located starts job tasks, and which manages and monitors those tasks;
and the node managers, which run on the fixed and elastic nodes and make those nodes run the tasks.
The resource manager comprises a resource monitoring module, a resource scheduling module, an elastic management module, and a cloud platform interaction module. The resource monitoring module periodically monitors the resource utilization of the cluster; the resource scheduling module receives task allocation requests submitted by the application manager and performs resource scheduling; the elastic management module decides whether to expand or shrink the cluster according to the resource utilization provided by the resource monitoring module; and the cloud platform interaction module interacts with the public cloud platform according to that decision, creating new virtual machines as elastic nodes or releasing the virtual machines corresponding to elastic nodes in the cluster.
The node manager is the per-node manager of the system; one node manager runs on every fixed and elastic node. The node manager contains a core submodule, the container manager module, which monitors the state of the node, manages the tasks on the node in the form of containers, and migrates tasks during cluster scaling.
The application manager is the per-job manager of the system: after each job is submitted, a corresponding application manager process is created to manage it. Each job consists of a number of subtasks, which typically execute in containers on the various nodes and are managed and monitored by the application manager. Each application manager can be viewed as an application framework, and developers use the APIs provided by the system to build applications that meet specific requirements. The system provides developers with an interface for perceiving task migration, so that users can build application managers with fault tolerance for task migration; migrating tasks during scaling then does not crash the job.
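A hedged sketch of what such a migration-awareness interface might look like (entirely an assumption; the patent does not publish this API):

```java
/** Hypothetical callbacks an AM implements to tolerate container migration. */
interface MigrationAware {
    /** Called before the container is checkpointed on its source node. */
    void onMigrationStart(String containerId, String sourceNode, String targetNode);
    /** Called after the container is restored and running on the target node. */
    void onMigrationComplete(String containerId, String targetNode);
}
```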
The resource manager, node managers, and application managers communicate with one another via RPC. The application manager requests resources for running tasks from the resource manager through RPC heartbeats, and the resource manager reports the allocated resources and the tasks' running states through the RPC heartbeat return values. The application manager sends the node manager requests related to running tasks, such as starting and releasing them, through RPC heartbeats, and the node manager reports the states of the tasks running on its node through the RPC heartbeat return values. The node manager reports each node's resource usage and the states of the tasks running on it to the resource manager through RPC heartbeats, and the resource manager orders operations such as task migration and task restoration through the RPC heartbeat return values.
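To make the three channels concrete, here is a sketch of what each heartbeat could carry (plain Java records; every field and type name is an assumption, not part of the patent or of YARN's wire protocol):

```java
import java.util.List;

// AM -> RM heartbeat and its return value.
record AmHeartbeat(String appId, List<String> resourceRequests) {}
record AmHeartbeatReply(List<String> allocatedContainers, List<String> taskStates) {}

// AM -> NM heartbeat: start/release requests; the reply carries task states.
record NmRequest(String containerId, String action) {}  // "start" | "release"
record NmRequestReply(List<String> taskStates) {}

// NM -> RM heartbeat: node utilization plus task states; the reply carries
// orders such as "migrate container X to node T" or "restore container X".
record NodeHeartbeat(String nodeId, double utilization, List<String> taskStates) {}
record NodeHeartbeatReply(List<String> migrationOrders) {}
```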
Based on the above system, the invention further provides a YARN-based shared cluster elastic scaling method, as shown in FIG. 2:
a user submits a job (e.g., a Hadoop MapReduce job) to the RM (resource manager) through a client, and the RM first creates an AM (application manager) for the job to manage the job. The AM will make a resource application required by the operation to the RM, and the RM allocates resources on a plurality of nodes to the AM through the node resource use information reported by the NM. After the AM obtains the allocated resources, it communicates with NM (node manager) of the corresponding node to start the subtask within the operation. The system is initially started on a cluster consisting of a number of virtual machines, which are all fixed virtual machines, and the life cycle of the virtual machines is maintained until the whole system is terminated. The RM process runs on one machine of the cluster and the NM runs on each machine of the cluster.
The RM periodically monitors the cluster load. When the RM observes that overall cluster resource utilization is too high for several consecutive periods, it decides to expand the cluster. The RM calculates how many nodes must be added to bring cluster resource utilization back into a reasonable range, then calls the public cloud platform API to create the specified number of virtual machines, remotely logs in to the newly created machines, starts the NM service on them, and adds them to the cluster. These newly added machines are the elastic virtual machines.
When the RM observes that the resource utilization of some elastic node stays too low for several consecutive periods, it decides to delete elastic virtual machines whose resource utilization is below a certain value. Before deleting a virtual machine, the system migrates the tasks (in the form of containers) on that node, computing the destination node for each container migration. If the RM finds the migration cost too high, it kills those tasks directly so that they are rescheduled to run again. During migration, the RM tells the AM through an RPC heartbeat return value that some of its containers must move to target nodes, and tells the NM through an RPC heartbeat return value to migrate the containers running on its node to those targets. The NM receives the heartbeat return value, migrates the containers accordingly, and reports through heartbeats that a migration is in progress or has completed. When the RM learns from an NM heartbeat that a container has finished migrating, it tells the target node's NM through an RPC heartbeat return value to restore the container. Once the RM learns from NM heartbeats that all containers on a node have finished migrating, it calls the public cloud platform API to delete that node. After the RM learns from the target node's NM that a migrated container has been restored, it informs the AM, which can then establish a new RPC connection with the target node's NM.
The method comprises the following specific steps:
step 1:
and the resource scheduling module of the RM receives the job resource request submitted by the AM and performs resource scheduling. The resource scheduling module uses a fixed priority scheduling strategy to attempt to schedule the task on the fixed node, and if the fixed node resources are insufficient, the resource scheduling module can attempt to schedule the task on the elastic node.
And the RM periodically monitors the cluster load through the resource monitoring module at the same time, and if the cluster resource utilization rate is monitored to be higher than a certain threshold value in a plurality of continuous periods, the RM makes a decision to expand the cluster scale, outputs the resource utilization rate, the node number and the threshold value of the current cluster, and executes the step 2. If the resource utilization rate of a certain elastic node in the cluster is monitored to be lower than a certain threshold value for a plurality of continuous periods, the RM makes a decision to delete the elastic node which runs without the AM and has the resource utilization rate lower than a certain value to shrink the cluster. And outputting the node identification number and executing the step 5. The monitoring period and the threshold value are set by the user of the system.
Step 2:
and the flexible management module of the RM takes the resource utilization rate of the current cluster, the node number and the threshold value output in the step 1 as input, and calculates the number of the nodes needing to be increased so that the cluster resource utilization rate can be lower than the threshold value. Step 3 is performed with the increased number of nodes calculated as output.
Step 3:
The cloud platform interaction module of the RM takes the number of nodes to add output in Step 2 as input, calls the public cloud API to interact with the public cloud platform, and creates that number of virtual machines. It outputs the IP addresses of the newly added virtual machines and executes Step 4.
Step 4:
The elastic management module of the RM takes the IP addresses of the newly added virtual machines output in Step 3 as input, remotely logs in to each newly created virtual machine, starts the NM service, and adds the machine to the cluster. Return to Step 1.
Step 5:
The elastic management module of the RM, taking the node identification number output in Step 1 as input, queries the containers running on that node and calculates the target node identification number to which each container should migrate. It outputs each container's AM identification number, the NM identification number of the node to which the container belongs, and the target node identification number.
Step 6:
Based on each container's AM identification number and target node identification number output in Step 5, the elastic management module of the RM tells the AM, through the RM-AM RPC heartbeat return value, that the containers belonging to it must be migrated to the target nodes.
Step 7:
Based on the NM identification number of the node to which each container belongs and the target node identification number output in Step 5, the elastic management module of the RM tells the NM of that node, through the RM-NM RPC heartbeat return value, to migrate the container to the target node.
Step 8:
The NM receives the RPC heartbeat information sent in Step 7 and begins migrating the container to the target node through its container manager module. During the migration, the NM tells the RM through the RM-NM RPC heartbeat that the container is being migrated; when the container manager module completes the migration, the NM tells the RM through the RM-NM RPC heartbeat that the migration is complete.
Step 9:
The elastic management module of the RM receives the RPC heartbeat information sent by the NM in Step 8, learns that the container has completed its migration, and tells the NM on the target node, through the RM-NM RPC heartbeat, to restore the migrated container.
Step 10:
The elastic management module of the RM receives the RPC heartbeat information sent by the NM in Step 8 and learns that all containers on that NM have completed their migration. It then notifies the cloud platform interaction module, which calls the public cloud API to interact with the public cloud platform and deletes the virtual machine on which that NM runs.
Step 11:
The NM on the target node receives the RPC heartbeat message sent by the RM in Step 9, and its container manager module begins restoring the containers migrated to that node. When a container's restoration is complete, the NM tells the RM through the RM-NM RPC heartbeat that the container has been restored.
Step 12:
The RM receives the RPC heartbeat message from the NM in Step 11 and learns that the container on that NM has been restored. The RM tells the AM, through the RM-AM RPC heartbeat, that its container has been successfully migrated to the target node.
Step 13:
The AM receives the RPC heartbeat message sent by the RM in Step 12 and learns that its container has been migrated to the target node. The AM then establishes a new RPC connection with the NM on the target node. Return to Step 1.
The existing shared-cluster management system YARN cannot scale nodes down through job migration. When cluster load is low and nodes should be deleted to save cost, YARN can only wait for all tasks on a node to finish, or kill all tasks running on the node and reschedule them, before releasing the node. Both schemes are inefficient: the former may keep nodes running for long periods at low resource utilization, and the latter discards computation that long-running tasks have already performed. Existing YARN lacks a mechanism that scales the system elastically through task migration while letting running tasks continue to execute correctly.
Building on YARN, the invention gives the new system the ability, in a public cloud environment, to scale the cluster elastically through task migration. The system can create and delete virtual machines in real time by interacting with the public cloud platform according to current cluster resource utilization, dynamically adjusting the size of the virtual-machine cluster.
At the same time, the system provides developers with an interface for perceiving task migration, so that users can build AMs with fault tolerance for task migration; the system then does not crash jobs when it migrates tasks during scaling. With the system provided by the invention, a shared cluster running on a public cloud can use public cloud resources more flexibly, on demand.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the system disclosed by the embodiment, the description is relatively simple because the system corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the method part for description.
The principles and embodiments of the present invention have been described herein using specific examples, which are provided only to help understand the method and the core concept of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, the specific embodiments and the application range may be changed. In view of the above, the present disclosure should not be construed as limiting the invention.

Claims (9)

1. A YARN-based shared cluster elastic scaling system, characterized by comprising:
the fixed nodes, which are permanent, unchanging nodes of the shared cluster;
the elastic nodes, which join or leave the cluster as the cluster load changes;
the resource manager, which runs on a fixed node and schedules resources, monitors the resource utilization of the cluster, and interacts with the public cloud platform according to that utilization to add elastic nodes to the cluster or release elastic nodes from it;
the application manager, which communicates with the node managers according to the resource manager's allocation, so that the fixed or elastic node where a node manager is located starts job tasks, and which manages and monitors those tasks;
and the node managers, which run on the fixed and elastic nodes and make those nodes run the tasks.
2. The YARN-based shared cluster elastic scaling system of claim 1, wherein the resource manager comprises:
the resource monitoring module, which periodically monitors the resource utilization of the cluster;
the resource scheduling module, which receives task allocation requests submitted by the application manager and performs resource scheduling;
the elastic management module, which decides whether to expand or shrink the cluster according to the resource utilization provided by the resource monitoring module;
and the cloud platform interaction module, which interacts with the public cloud platform according to that decision, creating new virtual machines as elastic nodes or releasing the virtual machines corresponding to elastic nodes in the cluster.
3. The YARN-based shared cluster elastic scaling system of claim 1, wherein the node manager comprises a container manager module that monitors the state of its node, manages the tasks on the node in the form of containers, and migrates tasks during cluster scaling.
4. A YARN-based shared cluster elastic scaling method, applied to the YARN-based shared cluster elastic scaling system of any one of claims 1-3, the method comprising:
the resource manager creates an application manager to manage a job submitted by a user;
the application manager applies to the resource manager for the resources the job needs;
the resource manager allocates resources on a number of nodes to the application manager according to the resource-usage information of each node;
after obtaining the resource allocation, the application manager communicates with the node managers of the corresponding nodes and starts the job's subtasks on those nodes;
the resource manager monitors the resource utilization of the cluster in real time;
the resource manager judges whether the resource utilization of the cluster is greater than a first threshold;
if the cluster's resource utilization exceeds the first threshold for several consecutive monitoring periods, the resource manager interacts with the public cloud platform and adds elastic nodes to the cluster;
the resource manager judges whether the resource utilization of an elastic node in the cluster is less than a second threshold;
and if the elastic node's resource utilization stays below the second threshold for several consecutive monitoring periods, the resource manager deletes the elastic node from the cluster.
5. The YARN-based shared cluster elastic scaling method of claim 4, further comprising, before adding an elastic node to the cluster:
the resource manager determining, from the cluster's resource utilization, the number of elastic nodes that need to join the cluster.
6. The YARN-based shared cluster elastic scaling method of claim 4 or 5, wherein the resource manager interacting with the public cloud platform and adding elastic nodes to the cluster specifically comprises:
calling the public cloud API to interact with the public cloud platform, creating a set number of virtual machines, and determining the IP addresses of the newly added virtual machines;
and the resource manager remotely logging in to each newly added virtual machine by its IP address, starting a node manager on it, and adding it to the cluster.
7. The YARN-based shared cluster elastic scaling method of claim 4, further comprising, before the resource manager deletes an elastic node from the cluster:
judging whether an application manager is running on the elastic node;
and if not, deleting the elastic node from the cluster.
8. The YARN-based shared cluster elastic scaling method of claim 4, further comprising, before the resource manager deletes an elastic node from the cluster:
migrating the containers running on the elastic node to be deleted.
9. The YARN-based shared cluster elastic scaling method of claim 8, wherein migrating the containers running on the elastic node to be deleted specifically comprises:
the resource manager determines, for each container, a target node identification number and the identification number of the application manager managing the container, and transmits the target node identification number to that application manager and to the node manager of the elastic node to which the container belongs;
the node manager migrates the container to the target node and, once migration is complete, reports the completed migration to the resource manager;
the resource manager notifies the node manager on the target node to restore the migrated container;
after restoring the container, that node manager notifies the resource manager that the container has been fully restored;
the resource manager calls the public cloud API to interact with the public cloud platform and deletes the virtual machine corresponding to the elastic node;
the resource manager informs the application manager that the container has been migrated to the target node;
and the application manager establishes an RPC association with the node manager on the target node.
CN201911179701.5A 2019-11-27 2019-11-27 YARN-based shared cluster elastic scaling system and method — Pending — CN110958311A

Priority Applications (1)

Application Number: CN201911179701.5A · Priority Date: 2019-11-27 · Filing Date: 2019-11-27 · Title: YARN-based shared cluster elastic scaling system and method

Applications Claiming Priority (1)

Application Number: CN201911179701.5A · Priority Date: 2019-11-27 · Filing Date: 2019-11-27 · Title: YARN-based shared cluster elastic scaling system and method

Publications (1)

Publication Number: CN110958311A · Publication Date: 2020-04-03

Family

ID=69978568

Family Applications (1)

Application Number: CN201911179701.5A · Title: YARN-based shared cluster elastic scaling system and method · Priority Date: 2019-11-27 · Filing Date: 2019-11-27 · Status: Pending

Country Status (1)

Country Link
CN (1) CN110958311A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111865714A (en) * 2020-06-24 2020-10-30 上海上实龙创智能科技股份有限公司 Cluster management method based on multi-cloud environment
CN112291326A (en) * 2020-10-23 2021-01-29 深圳市欢太科技有限公司 Load balancing method, load balancing device, storage medium and electronic equipment
CN113037856A (en) * 2021-03-23 2021-06-25 苏州云霄电子科技有限公司 Public cloud-based computing system, method, computer device, and storage medium
CN115086335A (en) * 2022-07-27 2022-09-20 北京思和科创软件有限公司 Container cloud node dynamic adding method and device, electronic equipment and storage medium
CN115934299A (en) * 2023-02-22 2023-04-07 智者四海(北京)技术有限公司 Migration system and method for YARN operation
CN117234721A (en) * 2023-09-18 2023-12-15 安徽继远软件有限公司 Cloud native system automatic operation research system and application based on operation layout self-adaption technology

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106201661A (en) * 2016-07-20 2016-12-07 北京百度网讯科技有限公司 Method and apparatus for elastic telescopic cluster virtual machine
CN107977252A (en) * 2016-10-21 2018-05-01 中兴通讯股份有限公司 A kind of capacity reduction method, device and the cloud platform of cloud platform business
CN108156212A (en) * 2017-06-29 2018-06-12 广东网金控股股份有限公司 A kind of elastic telescopic method and system perceived based on user
CN109343965A (en) * 2018-10-31 2019-02-15 北京金山云网络技术有限公司 Resource adjusting method, device, cloud platform and server

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106201661A (en) * 2016-07-20 2016-12-07 北京百度网讯科技有限公司 Method and apparatus for elastic telescopic cluster virtual machine
CN107977252A (en) * 2016-10-21 2018-05-01 中兴通讯股份有限公司 A kind of capacity reduction method, device and the cloud platform of cloud platform business
CN108156212A (en) * 2017-06-29 2018-06-12 广东网金控股股份有限公司 A kind of elastic telescopic method and system perceived based on user
CN109343965A (en) * 2018-10-31 2019-02-15 北京金山云网络技术有限公司 Resource adjusting method, device, cloud platform and server

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
邹姣姣, "Docker swarm集群增加节点和删除节点" [Adding and removing nodes in a Docker Swarm cluster], https://www.cnblogs.com/zoujiaojiao/p/10886262.html *
陈良章, "任务感知YARN资源调度器的研究与实现" [Research and Implementation of a Task-Aware YARN Resource Scheduler], China Master's Theses Full-text Database, Information Science and Technology *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111865714A (en) * 2020-06-24 2020-10-30 上海上实龙创智能科技股份有限公司 Cluster management method based on multi-cloud environment
CN111865714B (en) * 2020-06-24 2022-08-02 上海上实龙创智能科技股份有限公司 Cluster management method based on multi-cloud environment
CN112291326A (en) * 2020-10-23 2021-01-29 深圳市欢太科技有限公司 Load balancing method, load balancing device, storage medium and electronic equipment
CN112291326B (en) * 2020-10-23 2023-04-18 深圳市欢太科技有限公司 Load balancing method, load balancing device, storage medium and electronic equipment
CN113037856A (en) * 2021-03-23 2021-06-25 苏州云霄电子科技有限公司 Public cloud-based computing system, method, computer device, and storage medium
CN113037856B (en) * 2021-03-23 2022-07-08 苏州云霄电子科技有限公司 Public cloud-based computing system, method, computer device and storage medium
CN115086335A (en) * 2022-07-27 2022-09-20 北京思和科创软件有限公司 Container cloud node dynamic adding method and device, electronic equipment and storage medium
CN115934299A (en) * 2023-02-22 2023-04-07 智者四海(北京)技术有限公司 Migration system and method for YARN operation
CN117234721A (en) * 2023-09-18 2023-12-15 安徽继远软件有限公司 Cloud native system automatic operation research system and application based on operation layout self-adaption technology

Similar Documents

Publication Publication Date Title
CN110958311A (en) YARN-based shared cluster elastic scaling system and method
US10764125B2 (en) Method and device for training model in distributed system
EP3522013B1 (en) Method and system for migration of containers in a container orchestration platform between compute nodes
US8219997B2 (en) Execution the job that is divided into job tasks based on the estimated completion time
Wang et al. A three-phases scheduling in a hierarchical cloud computing network
CN109117252B (en) Method and system for task processing based on container and container cluster management system
KR101474872B1 (en) Method for elastic virtual cluster management for efficient construction of virtual clusters on cloud, apparatus for elastic virtual cluster management and cloud system using the same
CN106790092B (en) Remote procedure call server control system and method
CN112437129B (en) Cluster management method and cluster management device
CN109117244B (en) Method for implementing virtual machine resource application queuing mechanism
CN112905297A (en) Container cluster resource scheduling method and device
Guo et al. Energy-efficient fault-tolerant scheduling algorithm for real-time tasks in cloud-based 5G networks
CN109960579B (en) Method and device for adjusting service container
Duran-Limon et al. Using Lightweight virtual machines to run high performance computing applications: the case of the weather research and forecasting model
CN113515356B (en) Lightweight distributed resource management and task scheduler and method
Megharaj et al. Two level hierarchical model of load balancing in cloud
Bozyigit History-driven dynamic load balancing for recurring applications on networks of workstations
Htet et al. An implementation of job running backup function in user-PC computing system
Min et al. Issues on supporting public cloud virtual machine provisioning and orchestration
Valencia et al. Combining vm preemption schemes to improve vertical memory elasticity scheduling in clouds
Sun et al. Towards a scalable paas for service oriented software
Ismail Dynamic resource allocation mechanisms for grid computing environment
CN114745377B (en) Edge cloud cluster service system and implementation method
CN112395079B (en) Operation core job migration method under heterogeneous many-core architecture
Cogorno et al. Fault tolerance in Hadoop MapReduce implementation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20200403)