CN115811549B - Cloud-edge resource management and scheduling method and system supporting hybrid heterogeneous runtimes - Google Patents

Cloud-edge resource management and scheduling method and system supporting hybrid heterogeneous runtimes

Info

Publication number
CN115811549B
CN115811549B (application CN202310080136.7A)
Authority
CN
China
Prior art keywords: node, nodes, computing, runtime, working
Prior art date
Legal status: Active
Application number
CN202310080136.7A
Other languages
Chinese (zh)
Other versions
CN115811549A
Inventor
柳泉波
许骏
陈浩
Current Assignee
South China Normal University
Original Assignee
South China Normal University
Priority date
Filing date
Publication date
Application filed by South China Normal University
Priority claimed from application CN202310080136.7A
Publication of CN115811549A
Application granted
Publication of CN115811549B

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a cloud-edge resource management and scheduling method and system supporting hybrid heterogeneous runtimes. Computing nodes are divided into a plurality of computing areas, and each computing area uses a plurality of management nodes to manage its computing-node resource pool, enhancing the fault tolerance of resource management and scheduling in each area. A leader node is elected from the management nodes to centrally manage the states of all working nodes in the computing area, manage computing tasks, and distribute computing tasks. On the working nodes, runtime drivers for completing the computing tasks are installed and run on demand according to the computing tasks, and interact with the management nodes. Compared with the prior art, the runtime driver provides a standard interface for interacting with the runtime, so that mixed deployment of computing tasks running in different heterogeneous runtimes can be supported, extensibility is strong, and stable and efficient cloud-edge resource management and scheduling across heterogeneous runtimes are realized.

Description

Cloud-edge resource management and scheduling method and system supporting hybrid heterogeneous runtimes
Technical Field
The invention relates to the technical field of cloud computing and edge computing, and in particular to a cloud-edge resource management and scheduling method and system supporting hybrid heterogeneous runtimes.
Background
The integrated supply of cloud computing and edge computing resources is one of the main trends of digital development. A runtime refers to all the code libraries, frameworks, platforms, and so on that a computer program needs while it runs. The runtime provides the operational abstraction of single-node computing resources and is the basis of cloud-edge integrated resource management. Runtimes come in many types, including physical processes, lightweight virtual machines, containers, programming-language virtual machines, and WASM runtimes, which differ from one another in security, efficiency, and consistency.
In the course of digital transformation across industries, different application scenarios place very different requirements on runtimes, and applications are built on different development technology stacks. A cloud-edge resource management system is therefore required to run multiple heterogeneous runtimes on both the cloud side and the edge simultaneously.
Existing cloud-edge resource management systems are usually designed and implemented around a single runtime type. Adding support for a new runtime requires changing a large amount of code, which is time-consuming and labor-intensive, so extensibility is poor; moreover, the multi-runtime support that does get implemented often differs greatly across runtimes, leaving the system unstable and prone to failure.
Thus, the prior art is in need of improvement and advancement.
Disclosure of Invention
The main purpose of the invention is to provide a cloud-edge resource management and scheduling method and system supporting hybrid heterogeneous runtimes, aiming to solve the problems that existing cloud-edge resource management systems are poorly extensible, unstable, and prone to failure.
To achieve the above object, a first aspect of the present invention provides a cloud-edge resource management and scheduling system supporting hybrid heterogeneous runtimes, including:
one or more computing areas, each provided with a plurality of management nodes and a plurality of working nodes;
each management node is provided with a management component and a scheduler: the management component receives computing tasks submitted by users and acquires the states of the working nodes, and the scheduler generates an allocation plan for a computing task according to a scheduling policy;
all management nodes are configured as one leader node and several follower nodes; the leader node manages the states of all working nodes in the computing area, manages computing tasks, and distributes computing tasks; a synchronization mechanism between the leader node and the follower nodes synchronizes the states of all working nodes in the computing area;
each working node is provided with a work agent; the working node runs a runtime driver for completing the computing task, monitors the running state of the runtime, and sends execution progress and execution results to the management node; the runtime driver provides a standardized interface through which the work agent interacts with the runtime.
Optionally, all the working nodes of a computing area are configured into a plurality of groups, each group including several working nodes.
Optionally, the leader node is further provided with a task queue and a plan queue: the task queue manages the computing tasks submitted by users, and the plan queue manages the allocation plans corresponding to the computing tasks.
Optionally, a bidirectional channel between the runtime driver and the work agent supports both request-response communication and data-streaming communication.
Optionally, the runtime driver is implemented based on a remote procedure call (RPC) mechanism over the local network, and the runtime driver runs as an independent process on the working node.
Optionally, the schedulers include: a scheduler that directs all working nodes to run a system service, a scheduler that directs all working nodes to run a system batch-processing task, a scheduler that selects the best working node by condition to run a specified service, and a scheduler that selects a suitable working node by condition to run a specified batch-processing task.
A second aspect of the present invention provides a cloud-edge resource management and scheduling method supporting hybrid heterogeneous runtimes, the method including:
dividing the cloud-edge integrated computing resource pool into a plurality of computing areas, so that the computing nodes in the same computing area are connected by a low-latency, high-speed network;
selecting a plurality of management nodes from the computing nodes in each computing area and setting the remaining computing nodes as working nodes;
dividing all management nodes into one leader node and several follower nodes according to a consensus protocol, the leader node managing the states of all working nodes in the computing area and synchronizing those states to the follower nodes;
acquiring a computing task submitted by a user and sending it to the task queue of the leader node;
running a scheduler on the leader node and on each follower node;
taking computing tasks out of the task queue of the leader node in order, sending each to a scheduler to generate an allocation plan according to the scheduling policy, and sending the allocation plan to the plan queue of the leader node;
sending the computing tasks in the task queue to the working nodes determined by the allocation plans corresponding to those tasks;
installing and running, on the working node and as required by the computing task, a runtime driver for completing the computing task, monitoring the running state, and sending execution progress and execution results to the management node; the runtime driver provides a standardized interface for interacting with the runtime and runs in a completely independent process.
Optionally, the working nodes send heartbeat signals to the management nodes so that the state of each working node can be obtained;
state-change messages of all working nodes are sent to the leader node to obtain the states of all working nodes in the computing area;
the states of all working nodes in the computing area are synchronized between the leader node and the follower nodes.
Optionally, sending heartbeat signals from the working nodes to the management nodes includes:
dividing the working nodes into a plurality of groups, with all working nodes within a group sending heartbeat signals to the management node at the same time.
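The grouped heartbeat scheme above can be sketched in a few lines. This is an illustrative sketch only: the round-robin assignment and the staggered time slots are assumptions of this sketch, since the description only requires that the nodes within one group heartbeat together.

```python
def make_heartbeat_groups(worker_ids, group_count):
    """Assign working nodes round-robin into heartbeat groups; all
    nodes in one group send heartbeats to the management node together."""
    groups = [[] for _ in range(group_count)]
    for i, wid in enumerate(worker_ids):
        groups[i % group_count].append(wid)
    return groups

def group_offsets(group_count, interval_s=10.0):
    """Stagger the groups evenly across one heartbeat interval so the
    management node is not hit by every working node at once (an
    assumed refinement, not stated in the description)."""
    return [g * interval_s / group_count for g in range(group_count)]

print(make_heartbeat_groups([f"w{i}" for i in range(7)], 3))
# → [['w0', 'w3', 'w6'], ['w1', 'w4'], ['w2', 'w5']]
print(group_offsets(4, interval_s=8.0))  # → [0.0, 2.0, 4.0, 6.0]
```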
Optionally, the working node is provided with a work agent, and the method further includes:
establishing a bidirectional channel between the runtime driver and the work agent;
the work agent calling standardized interface functions provided by the runtime driver to interact with the runtime.
In summary, the invention divides the computing nodes into a plurality of computing areas, and each computing area uses a plurality of management nodes to manage its computing-node resource pool, enhancing the fault tolerance of resource management and scheduling. A leader node elected from the management nodes centrally manages the states of all working nodes in the area, manages computing tasks, and distributes computing tasks; the working nodes install and run, on demand, the runtimes and runtime drivers needed to complete the computing tasks and interact with the management nodes. Compared with the prior art, the runtime driver provides a standard interface for interacting with the runtime, mixed deployment of computing tasks running in different heterogeneous runtimes can be supported, extensibility is strong, and stable, efficient cloud-edge resource management and scheduling across heterogeneous runtimes are realized.
Drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; for those skilled in the art, other drawings can be obtained from them without inventive effort.
Fig. 1 is a schematic diagram of a cloud-edge integrated resource pool according to an embodiment of the present invention;
fig. 2 is a schematic diagram of the architecture of a cloud-edge resource management and scheduling system supporting hybrid heterogeneous runtimes according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of task management flow of the embodiment of FIG. 2;
FIG. 4 is a schematic diagram of starting a runtime driver on a working node in the embodiment of FIG. 2;
FIG. 5 is a schematic diagram of the RPC interaction between the work agent and the runtime driver on a working node in the embodiment of FIG. 2;
FIG. 6 is a diagram illustrating the runtime operations for managing computing-task instances in the embodiment of FIG. 2;
fig. 7 is a flowchart illustrating a cloud-edge resource management scheduling method for supporting hybrid heterogeneous runtime according to an embodiment of the present invention.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the specification of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items and includes such combinations.
As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when", "once", "in response to a determination", or "in response to a detection". Similarly, the phrase "if it is determined" or "if a [described condition or event] is detected" may be interpreted, depending on the context, to mean "upon determining", "in response to determining", "upon detecting [the described condition or event]", or "in response to detecting [the described condition or event]".
The technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the drawings of the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, but the present invention may be practiced in other ways than those specifically described and will be readily apparent to those of ordinary skill in the art without departing from the spirit of the present invention, and therefore the present invention is not limited to the specific embodiments disclosed below.
The integrated supply of cloud computing and edge computing resources is one of the main trends of digital development. The current mainstream runtimes include the following types: (1) physical processes implemented by a single-node operating system; (2) lightweight virtual machines based on hardware virtualization technology, such as AWS Firecracker, Google gVisor, and Kata Containers; (3) containers based on operating-system virtualization technology; (4) programming-language virtual machines, including the Java Virtual Machine (JVM) and Microsoft's Common Language Runtime (CLR); (5) the programming-language-independent WebAssembly (WASM) runtime.
Because various runtimes differ in security, efficiency, and consistency, existing cloud-edge resource management systems are usually designed and implemented around a single runtime type. Adding support for a new runtime requires changing a large amount of code, which is time-consuming and labor-intensive, and extensibility is poor; the multi-runtime support that does get implemented often differs greatly across runtimes, and the stability and consistency of the resource management system suffer. With the digital transformation of every industry, a cloud-edge resource management system is required to run multiple heterogeneous runtimes on the cloud side and the edge simultaneously.
Aiming at these technical problems, the invention provides a cloud-edge resource management and scheduling method and system supporting hybrid heterogeneous runtimes that adopt standardized runtime drivers: developers implement the specified standard interface functions in any common programming language, and new runtime support can then be added to the cloud-edge resource management system, with good consistency and strong extensibility. The runtime drivers and the built-in components of the cloud-edge resource management system run in different processes, so a crash of a runtime-driver process does not affect the normal operation of the cloud-edge resource management system, and system stability is good.
Examples
The embodiment of the invention provides a cloud-edge resource management and scheduling system supporting hybrid heterogeneous runtimes, applied to a cloud-edge integrated computing resource pool containing various types of computing nodes, such as nodes in a self-built data center, a private cloud, a public cloud, or on the edge side; a computing node can be a physical machine or a virtual machine. Fig. 1 exemplarily illustrates the cloud-edge unified computing resource pool of this embodiment. The scheduling system comprises one or more computing areas, and each computing area comprises a plurality of computing nodes. The rule for dividing computing nodes into computing areas can be determined by the network connections among them: for nodes of a data center, the network delay between computing nodes should not exceed 10 milliseconds; for edge nodes, the latency requirement on the network connection can be relaxed appropriately. A computing area may also contain computing nodes from multiple data centers connected to each other through a low-latency, high-speed network, as long as the data centers are geographically close. The computing areas are loosely coupled: resource management and computing-task management in each computing area are completely independent.
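The latency-based division rule just described can be sketched as follows. This is a minimal illustrative sketch, not part of the patent: the greedy grouping strategy, the node names, and the latency table are assumptions; only the 10 ms data-center threshold comes from the description.

```python
# Hypothetical sketch of latency-based computing-area division; the greedy
# strategy and all names are illustrative assumptions.

def partition_regions(nodes, latency_ms, threshold_ms=10.0):
    """Greedily group nodes so that every pair inside one computing
    area has a measured latency at or below the threshold."""
    regions = []
    for node in nodes:
        for region in regions:
            if all(latency_ms[(node, peer)] <= threshold_ms for peer in region):
                region.append(node)
                break
        else:
            regions.append([node])  # start a new computing area
    return regions

# Two nearby data-center nodes plus one distant edge node.
lat = {}
for a, b, ms in [("dc1", "dc2", 2.0), ("dc1", "edge1", 40.0), ("dc2", "edge1", 38.0)]:
    lat[(a, b)] = lat[(b, a)] = ms

print(partition_regions(["dc1", "dc2", "edge1"], lat))  # → [['dc1', 'dc2'], ['edge1']]
```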
The computing nodes within each computing area are further classified into management nodes and working nodes based on their performance; for example, 3 or 5 computing nodes are selected as management nodes, and the remaining computing nodes are working nodes. High-performance nodes should be chosen as management nodes: the CPU of each management node should have at least 2 or 4 cores, preferably 8 or 16; the memory should be at least 8 GB or 16 GB, preferably 32 GB or 64 GB; the disk capacity should exceed 100 GB with guaranteed read/write performance, and a solid-state disk is recommended. Because communication between the management nodes and the working nodes is frequent and latency-sensitive, the management nodes are preferably connected through a high-speed network, with network delay between management nodes not exceeding 10 milliseconds. The configuration of the working nodes is not restricted; it can be determined according to the resource requirements of the runtimes and the computing load.
Each management node runs a management component and a scheduler. The management component can receive computing tasks submitted by users and acquire the states of the working nodes connected to the management node (including whether a working node is alive, the resource state on the node, and so on); the scheduler can generate an allocation plan for a computing task according to the scheduling policy.
To improve the fault tolerance of the system, as shown in fig. 2, all management nodes belonging to the same computing area jointly form a consensus group in this embodiment. The group is responsible for managing the computing resources formed by all working nodes in the area and for scheduling and allocating suitable working-node resources for the computing tasks submitted by users. The Raft consensus protocol is used within the group to elect one leader node and several follower nodes and to perform state replication.
The leader node centrally manages the states of all working nodes in the computing area, manages computing tasks, and distributes computing tasks; that is, the state-change messages of all working nodes and all computing-task requests are processed uniformly by the leader node. State-change messages of working nodes received by follower nodes must be forwarded to the leader node for unified processing, so that the leader obtains the states of all working nodes in the whole computing area. A synchronization mechanism between the leader node and the follower nodes replicates the states of all working nodes in the computing area, as stored on the leader, to the followers. Note that both the leader node and the follower nodes can read the states of the working nodes connected to them.
The leader node maintains a task queue for managing computing tasks and a plan queue for managing allocation plans. All computing-task requests enter the task queue of the leader node. The leader node then takes computing tasks out of the task queue in order and distributes them to the schedulers running on the leader node or the follower nodes; the schedulers of all management nodes execute concurrently, generate allocation plans for the computing tasks according to the scheduling policy, and send them to the plan queue of the leader node. The leader node takes allocation plans out of the plan queue on a first-in-first-out basis and checks whether each plan is still executable. If it is, the work agents of the corresponding working nodes are notified to execute the computing task; if it is no longer feasible, the allocation plan is returned to the original scheduler, which modifies it or re-plans entirely.
Preferably, when an allocation plan generated by a scheduler is sent into the plan queue of the leader node, plans corresponding to higher-priority computing tasks are placed toward the front of the queue; among plans of the same priority, those that entered the plan queue earlier are placed in front.
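The queue ordering just described, highest priority first and first-in-first-out among equal priorities, can be sketched with a priority heap. This is an illustrative sketch, not the patent's implementation; the class and method names are assumptions.

```python
import heapq
import itertools

class PlanQueue:
    """Sketch of the leader node's plan queue: allocation plans for
    higher-priority tasks come out first; plans of equal priority
    come out in the order they were enqueued (FIFO)."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker preserving FIFO order

    def push(self, plan, priority):
        # heapq is a min-heap, so negate priority to pop the highest first.
        heapq.heappush(self._heap, (-priority, next(self._seq), plan))

    def pop(self):
        return heapq.heappop(self._heap)[2]

q = PlanQueue()
q.push("plan-low", priority=1)
q.push("plan-a", priority=5)
q.push("plan-b", priority=5)
print(q.pop(), q.pop(), q.pop())  # → plan-a plan-b plan-low
```

The monotonically increasing sequence number breaks priority ties, which is what keeps same-priority plans in arrival order.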
By selecting a plurality of management nodes to form a consensus group in each computing area, the management and scheduling system of this embodiment can tolerate the accidental failure of 1 or 2 management nodes, improving fault tolerance and achieving high availability. For example, when a computing area contains 5 management nodes, the system can tolerate the unexpected failure of 2 of them. It is preferable to set 3 or 5 management nodes in the same computing area: by the characteristics of the Raft consensus protocol, the total number of nodes in a consensus group should be odd, and if the total number of nodes is too large, the communication cost of reaching consensus is high and system performance suffers.
Computing-task demands are typically generated in two cases: (1) a user submits a request to create, modify, or delete a computing task through the graphical portal, the command-line tool, or an API call; (2) when a working node fails completely, the cloud-edge resource management system must migrate the computing tasks running on that node to other suitable working nodes for execution.
This embodiment provides an interactive component, a graphical portal through which users access the cloud-edge resource management system to complete work such as resource-pool management and the submission and execution of computing tasks. Optionally, users may use the command-line tool to complete tasks in batches, or call the Application Programming Interface (API) of the management component to write task-submission programs suited to their individual needs.
Each computing task comprises a scheduling configuration and a run configuration: the scheduling configuration specifies the scheduler type, scheduling constraints, and so on, and the run configuration specifies the runtime and the configuration of the task to run.
According to their functions, the schedulers run on the leader node and the follower nodes fall into four main types: a scheduler that directs all working nodes to run a system service, a scheduler that directs all working nodes to run a system batch-processing task, a scheduler that selects the best working node by condition to run a specified service, and a scheduler that selects a suitable working node by condition to run a specified batch-processing task. The first two types simply notify all working nodes to execute the task; the last two execute in two steps, screening and ranking. First, the scheduler for a specified service screens a candidate set of working nodes satisfying the constraints from all healthy working nodes (the scheduler for a specified batch-processing task screens candidates from a subset of healthy working nodes, for speed); then the fitness of each candidate working node is computed in turn, and the working node with the highest fitness is selected to execute the task. A bin-packing strategy is adopted during task scheduling: the fitness of task j on node i is computed with a best-fit (best fit) V3 algorithm as score(i, j) = 10^(1 - memory required by task j / total memory of node i) + 10^(1 - CPU required by task j / total CPU of node i). With the best-fit V3 algorithm, tasks are packed onto the working nodes with the least available resources, significantly improving the throughput of the computing resource pool.
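The best-fit V3 fitness formula above can be written out directly. The sketch below reproduces the formula exactly as stated in the text; the function names, the example task, and the candidate capacities are illustrative assumptions.

```python
def best_fit_score(task_cpu, task_mem, node_cpu_total, node_mem_total):
    """Fitness of task j on node i per the formula in the text:
    score(i, j) = 10^(1 - mem_j/mem_i) + 10^(1 - cpu_j/cpu_i)."""
    return (10 ** (1 - task_mem / node_mem_total)
            + 10 ** (1 - task_cpu / node_cpu_total))

# Rank candidate nodes for a task needing 1 CPU core and 1 GB of memory.
candidates = {"node-a": (2, 2), "node-b": (8, 8)}  # (total CPU, total GB)
scores = {name: best_fit_score(1, 1, cpu, mem)
          for name, (cpu, mem) in candidates.items()}
chosen = max(scores, key=scores.get)  # the text selects the highest fitness
print(scores, chosen)
```

The selection of the highest-scoring node follows the "highest fitness" rule stated in the text; the formula itself is transcribed term by term from the description.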
As shown in fig. 3, the specific scheduling and assignment process is: the leader node takes one computing task out of the task queue at a time and assigns it to a scheduler of the required type, which generates a task allocation plan assigning the computing task to the corresponding working node. The working node receives the computing task distributed by a management node of its computing area, installs and runs the required runtime and runtime driver on demand, executes the task, monitors the running state, and reports execution progress and execution results to the management node.
Each working node installs and runs a work agent and the runtime drivers needed to complete computing tasks. The work agent reports available resources and attributes to the management component of a management node, and receives and executes the computing tasks issued by that management component. The runtime driver is a standardized interoperation interface between the work agent and the runtime: interaction between the work agent and the runtime goes through this standardized interface. The runtime is the component that actually executes the computing task, usually common open-source or commercial software such as the Docker container engine or an OpenJDK virtual machine.
As shown in fig. 4, after receiving a computing task assigned by a management node, the working node first checks whether the specified runtime driver has already been started. If not, it checks whether the runtime driver is installed on the working node; the locations checked include the current user's home directory, the current working directory, and a specified system directory. If the runtime driver is installed, its process is started; otherwise the work agent reports to the leader node that the task failed to execute. Once the runtime driver is running, the work agent hands the runtime configuration contained in the computing-task definition to the runtime driver, which submits it to the runtime for execution. The runtime driver monitors the running state of the task and feeds it back to the work agent, which periodically submits it to the leader node.
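The lookup-and-start check just described can be sketched as follows. The search order (home directory, working directory, system directory) follows the text; the system-directory path, the function names, and the use of a set to stand in for running driver processes are all assumptions of this sketch.

```python
import os

SYSTEM_DIR = "/usr/local/lib/runtime-drivers"  # assumed path, not from the patent

def find_runtime_driver(name):
    """Search the locations named in the description, in order."""
    for directory in (os.path.expanduser("~"), os.getcwd(), SYSTEM_DIR):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None  # not installed anywhere

def ensure_driver(name, running):
    """Reuse an already-started driver, start an installed one,
    or report an execution error back to the leader node."""
    if name in running:
        return "already-running"
    if find_runtime_driver(name) is None:
        return "report-error"
    running.add(name)  # stand-in for spawning the driver process
    return "started"

print(ensure_driver("docker-driver", running={"docker-driver"}))  # → already-running
```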
As shown in fig. 5, the runtime driver is implemented on a Remote Procedure Call (RPC) mechanism, and the runtime driver and the work agent are two completely independent processes; an unexpected crash of the runtime-driver process does not affect the normal operation of the work agent. Specifically, the process by which the work agent starts a runtime driver is as follows: when the runtime driver is installed, it informs the work agent of the configuration schema it uses, defined with JSON Schema; after verifying that the runtime driver's configuration is legal, the work agent starts the driver; the runtime driver then listens for connections on a local network address and prints that address on its standard output (stdout). The service listens on a Unix domain socket by preference, or alternatively on a TCP localhost address; crucially, no remote network interface is opened, to ensure network security. The work agent establishes a bidirectional RPC connection with the runtime driver, over which it can call the standardized interface functions provided by the driver and receive the returned results.
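The handshake just described, where the driver listens only on a local address and announces it on stdout before the agent connects, can be exercised with a bare Unix domain socket. This sketch omits the actual RPC layer, message framing, and JSON-Schema validation; the socket path and the mock request/reply are illustrative assumptions.

```python
import os
import socket
import tempfile
import threading

# The "driver" listens on a Unix domain socket (never a remote interface)
# and announces the address; the "agent" connects and exchanges one
# request/response pair.
addr = os.path.join(tempfile.mkdtemp(), "driver.sock")
server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
server.bind(addr)
server.listen(1)
print(addr)  # the real driver prints its listen address on stdout

def driver_loop():
    conn, _ = server.accept()
    with conn:
        request = conn.recv(1024)
        conn.sendall(b"capabilities:" + request)  # mock reply

t = threading.Thread(target=driver_loop, daemon=True)
t.start()

agent = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)  # work-agent side
agent.connect(addr)
agent.sendall(b"query")
reply = agent.recv(1024)
agent.close()
t.join()
print(reply.decode())  # → capabilities:query
```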
Further, to support data-streaming interface functions, the work agent and the runtime driver either establish an auxiliary TCP connection or create multiple RPC connections in parallel. All of these connections are multiplexed, which avoids exhausting operating system resources through an excessive number of connections.
Once the work agent and the runtime driver have established the bidirectional RPC connection, the work agent can directly call the standard interface functions for interacting with the runtime. First, the work agent queries the runtime's capabilities and the computing task configuration schema it supports. The runtime capabilities mainly include: signal handling, command execution, file system isolation, network initialization, storage volume mounting, and the like. The work agent then checks, against the received configuration schema, whether the runtime configuration in the user-submitted computing task request is legal. Once the configuration is verified, the computing task is submitted to the runtime for actual execution. As shown in FIG. 6, the runtime driver implements the standard operations for managing computing task instances, including starting, pausing, resuming, and destroying them. Other common operations include actively checking the detailed status of a running instance, periodically returning statistics for a running instance, sending operating system signals (e.g., SIGHUP and SIGKILL) to a running instance, processing events triggered by a running instance, and the like.
After the runtime driver establishes the bidirectional channel with the work agent, the work agent can call standardized interface functions to interact with the runtime, supporting both request-response communication and data-streaming communication. The standard interface specifically covers querying the runtime's capabilities and the computing task configuration schema it supports, and managing computing task instances, such as starting, pausing, resuming, and destroying them.
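The standard interface described above might look like the following Go sketch. The method and type names are illustrative assumptions, not taken from the patent; the in-memory implementation exists only to show the instance lifecycle (start, pause, resume, destroy) and the capabilities query.

```go
package main

import (
	"fmt"
	"os"
)

// RuntimeCapabilities mirrors the capability set queried by the work
// agent: signal handling, command execution, file system isolation,
// network initialization, and storage volume mounting.
type RuntimeCapabilities struct {
	Signals, Exec, FSIsolation, NetworkInit, Volumes bool
}

// RuntimeDriver sketches the standardized interface functions the work
// agent calls over the bidirectional channel.
type RuntimeDriver interface {
	Capabilities() (RuntimeCapabilities, string) // capabilities plus the task configuration schema (JSON Schema)
	Start(taskID string, config []byte) error
	Pause(taskID string) error
	Resume(taskID string) error
	Destroy(taskID string) error
	Inspect(taskID string) (string, error)     // detailed status of a running instance
	Signal(taskID string, sig os.Signal) error // e.g. SIGHUP, SIGKILL
}

// memDriver is a trivial in-memory implementation used only to
// illustrate the instance lifecycle.
type memDriver struct{ state map[string]string }

func newMemDriver() *memDriver { return &memDriver{state: map[string]string{}} }

func (d *memDriver) Capabilities() (RuntimeCapabilities, string) {
	return RuntimeCapabilities{Signals: true, Exec: true}, `{"type":"object"}`
}
func (d *memDriver) Start(id string, _ []byte) error { d.state[id] = "running"; return nil }
func (d *memDriver) Pause(id string) error           { d.state[id] = "paused"; return nil }
func (d *memDriver) Resume(id string) error          { d.state[id] = "running"; return nil }
func (d *memDriver) Destroy(id string) error         { delete(d.state, id); return nil }
func (d *memDriver) Inspect(id string) (string, error) {
	if s, ok := d.state[id]; ok {
		return s, nil
	}
	return "", fmt.Errorf("no such instance %q", id)
}
func (d *memDriver) Signal(id string, _ os.Signal) error {
	_, err := d.Inspect(id)
	return err
}
```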
In this embodiment, the components on the management and worker nodes (such as the management component and the work agent) are developed mainly in the Go programming language; each component corresponds to a single binary file with no external dependencies and occupies little memory.
Typically, a management node can process 50 heartbeat signals per second, so the heartbeat period of each work agent equals the total number of worker nodes divided by 50 heartbeats/second. For example, in a computing area with 1000 worker nodes, each node sends a heartbeat signal every 20 seconds.
To avoid a large number of worker nodes sending heartbeat signals to the management node at the same moment, which would effectively form a flood attack, in one embodiment all worker nodes of each computing area are divided into groups of 50. The worker nodes within a group share the same heartbeat start time; that is, each group sends its heartbeat signals to the management node simultaneously to indicate that the group is working normally, and the heartbeat start times of adjacent groups are staggered by 1 second. Alternatively, when the number of worker nodes is small, the heartbeat period may be set to at least 10 seconds: the default heartbeat period equals the number of nodes in the computing area divided by 50, and if this value falls below 10 seconds it is set to 10 seconds. If the management node does not receive a heartbeat signal from a worker node within one heartbeat period, the node is marked as a suspected-failure node; if no heartbeat is received within a specified interval, such as 24 hours, the node is classified as failed, removed from the resource pool, and the computing tasks running on it are migrated to other worker nodes.
In this embodiment, multiple management nodes manage a cloud-edge integrated computing resource pool, enhancing the fault tolerance of cloud-edge resource management. A leader node elected from among the management nodes uniformly manages the states of all worker nodes in the computing area and manages and distributes computing tasks; the worker nodes install and run, on demand according to each computing task, the runtime drivers needed to complete it, and interact with the management nodes. Because the runtime driver provides a standard interface for interacting with the runtime, computing tasks running in different runtimes can be deployed together across heterogeneous runtimes, achieving safe, efficient, and consistent cloud-edge resource management over heterogeneous runtimes.
The implementation process of the cloud-edge resource management system of the embodiment is as follows:
First, three computing nodes are selected as management nodes, and a management component is installed on each. A management node is chosen at random, assumed here to be node 1, and its management component is started; then, before starting the management components on nodes 2 and 3, the network address of an existing peer, i.e., the IP address of management node 1, is configured on each. After the management components of nodes 2 and 3 start, the three management nodes together form a consensus group.
The Raft consensus protocol is then used to implement leader election and state replication within the consensus group. Assuming management node 2 is elected as the leader node, management nodes 1 and 3 are follower nodes. API requests sent to follower nodes, and messages from work agents received by them, are forwarded to the leader node, which processes them uniformly and synchronizes the latest state of the whole computing area back to the followers.
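The forward-to-leader and replicate-back pattern can be illustrated with a toy sketch. Note this is not an implementation of Raft itself (no terms, logs, or elections, all of which Raft requires); it only shows that followers forward everything to the leader, which applies the change and synchronizes the region state to the followers.

```go
package main

// Request stands in for an API request or a message from a work agent.
type Request struct{ Payload string }

// ManagementNode is a toy model of one member of the consensus group.
type ManagementNode struct {
	ID, LeaderID string
	Peers        map[string]*ManagementNode // all group members, keyed by ID
	State        []string                   // replicated computing-area state (simplified)
}

// Handle forwards the request to the leader if this node is a
// follower; the leader applies it uniformly and replicates the latest
// state back to the followers.
func (m *ManagementNode) Handle(req Request) {
	if m.ID != m.LeaderID {
		m.Peers[m.LeaderID].Handle(req) // follower: forward to the leader
		return
	}
	m.State = append(m.State, req.Payload) // leader: apply the change
	for id, p := range m.Peers {           // leader: synchronize followers
		if id != m.ID {
			p.State = append([]string(nil), m.State...)
		}
	}
}
```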
A work agent is then installed on each worker node. Before starting the work agent, the network address of any management node in the computing area must be specified. On first start, the work agent reports the resource situation of the current worker node, covering hardware, operating system, and user-defined resources, to the management component of the specified management node; if that node is not the leader, the information is forwarded to the leader node. While running, the work agent periodically sends heartbeat signals to the management component, receives computing tasks distributed by the leader node, and installs and runs the runtime driver and runtime as each computing task requires. The work agent hands the runtime configuration contained in the computing task definition to the runtime driver, which submits the task to the runtime for execution and monitors its running state, feeding it back to the work agent, which in turn reports it to the leader node periodically.
Examples
An embodiment of the invention provides a cloud-edge resource management scheduling method supporting hybrid heterogeneous runtimes. Specifically, as shown in fig. 7, the method includes the following steps:
Step S100: dividing the cloud-edge integrated computing resource pool into a plurality of computing areas, so that the computing nodes in the same computing area are connected by a low-latency, high-speed network;
specifically, the computing nodes which are low-delay high-speed networks are clustered and divided into the same computing area, so that the cloud-edge integrated computing resource pool is divided into a plurality of computing areas.
Step S200: selecting a plurality of management nodes from the computing nodes in each computing area and setting the rest computing nodes as working nodes;
specifically, the number of management nodes is preferably 3 to 5. Because the management node has large communication traffic and needs to perform scheduling matching, the computing node with better hardware performance in the computing nodes is selected as the management node. And the rest computing nodes are used as working nodes.
Step S300: dividing all management nodes into a leader node and a plurality of follower nodes according to a consensus protocol, wherein the leader node is used for managing the states of all worker nodes in the computing area and synchronizing them to the follower nodes;
specifically, all management nodes form a consensus group, a Raft consensus protocol is adopted to realize leader node selection in all the consensus groups and state replication in the consensus group, that is, the management nodes are divided into: a leader node and follower nodes.
The leader node manages the states of all worker nodes in the computing area and manages and distributes computing tasks; that is, state change messages and computing task requests from all worker nodes are processed uniformly by the leader node. State change messages received by follower nodes must be forwarded to the leader node for unified processing, so that the leader holds the states of all worker nodes in the whole computing area. A synchronization mechanism between the leader node and the follower nodes replicates the worker node states stored on the leader to the followers.
Step S400: acquiring a computing task submitted by a user and sending the computing task to a task queue of a leader node;
specifically, a task queue is arranged in a memory of the leader node and used for processing a computing task submitted by a user. Although the user may also submit the computing task to the follower node, the follower node receives the computing task request and then forwards the computing task request to the leader node for unified processing.
Step S500: running a scheduler on the leader node and the follower node;
step S600: sequentially taking out the calculation tasks from the task queue of the leader node and sending the calculation tasks to any scheduler to generate an allocation plan according to a scheduling strategy, and sending the allocation plan to the plan queue of the leader node;
specifically, a scheduler is run on both the leader node and the follower node, through which the matching of the computing task requirements to the resources of the worker node is achieved. Although both the leader node and the follower node can perform distribution planning on the computation tasks, after the follower node generates the distribution planning on the computation tasks, the results need to be transmitted to the leader node, a planning queue runs on the leader node, the planning queue manages the distribution planning corresponding to each computation task in a centralized manner, and then the leader node completes the scheduling and distribution of the computation tasks in a unified manner.
Step S700: sending each computing task in the task queue to the worker node determined by the allocation plan corresponding to that task;
specifically, when the computation task is scheduled and allocated, the computation task is taken out from the task queue, the allocation plan corresponding to the computation task is found in the planning queue, then specific working nodes are determined according to the allocation plan, and the computation task is sent to the working nodes.
Step S800: installing and operating a runtime driver for completing the computation task on the working node according to the computation task, monitoring the operation state, and sending an execution progress and an execution result to the management node; the runtime driver is used for providing a standardized interface to interact with the runtime, and the runtime driver runs in a completely independent process;
specifically, after receiving a computing task, a work node installs and runs a runtime driver and a runtime driver for completing the computing task as needed according to the computing task, where the runtime driver is a standardized interoperation interface between a work agent and a runtime, and the runtime driver implements interaction between the work agent and the runtime through the standardized interoperation interface, and is a component that actually executes the computing task, and is generally common open source or commercialized software, such as a Docker container engine or an OpenJDK virtual machine.
After the runtime driver submits the computing task to the runtime for execution, it monitors the task's running state and feeds it back to the management node.
As described above, this embodiment installs and runs on the worker nodes, on demand according to each computing task, the runtime and runtime driver needed to complete it, and interacts with the management nodes. Because the runtime driver provides a standard interface for interacting with the runtime, computing tasks running in different runtimes can be deployed together across heterogeneous runtimes, achieving safe, efficient, and consistent cloud-edge resource management over heterogeneous runtimes.
Optionally, the method further includes: sending heartbeat signals from the worker nodes to the management node to obtain the states of the worker nodes;
sending state change messages of all worker nodes to the leader node to obtain the states of all worker nodes in the computing area;
and synchronizing the system state between the leader node and the follower nodes.
Optionally, the worker nodes are divided into a plurality of groups, and all the worker nodes in a group send heartbeat signals to the management node at the same time.
optionally, the working node is provided with a working agent, and further includes: establishing a bidirectional channel between the runtime driver and the work agent; the work agent invokes a standardized interface function provided by the runtime driver to interact with the runtime.
Specifically, for the function of each step of the cloud-edge resource management scheduling method supporting hybrid heterogeneous runtimes, reference may be made to the corresponding description of the cloud-edge resource management scheduling system supporting hybrid heterogeneous runtimes, which is not repeated here.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned functions may be distributed as different functional units and modules according to needs, that is, the internal structure of the apparatus may be divided into different functional units or modules to implement all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only used for distinguishing one functional unit from another, and are not used for limiting the protection scope of the present invention. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the above embodiments, the description of each embodiment has its own emphasis, and reference may be made to the related description of other embodiments for parts that are not described or recited in any embodiment.
Those of ordinary skill in the art would appreciate that the elements and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other ways. For example, the above-described embodiments of the apparatus/terminal device are merely illustrative, and for example, the division of the above modules or units is only one type of logical function division, and the actual implementation may be implemented by another division manner, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed.
The integrated modules/units described above, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer readable storage medium. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium and can implement the steps of the embodiments of the method when the computer program is executed by a processor. Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer readable medium may include: any entity or device capable of carrying the above-mentioned computer program code, recording medium, usb disk, removable hard disk, magnetic disk, optical disk, computer Memory, read-Only Memory (ROM), random Access Memory (RAM), electrical carrier wave signal, telecommunication signal, software distribution medium, etc. It should be noted that the contents of the computer-readable storage medium can be increased or decreased as required by the legislation and patent practice in the jurisdiction.
The above-mentioned embodiments are only used to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not depart from the spirit and scope of the embodiments of the present invention and should be construed as being included therein.

Claims (10)

1. The cloud edge resource management scheduling system supporting hybrid heterogeneous operation is characterized by comprising the following components:
each computing area is provided with a plurality of management nodes and a plurality of working nodes;
the management node is provided with a management component and a scheduler, the management component is used for receiving a calculation task submitted by a user and acquiring the state of the working node, and the scheduler is used for generating an allocation plan corresponding to the calculation task according to a scheduling strategy;
all the management nodes are configured to be a leader node and a plurality of follower nodes, and the leader node is used for managing the states of all the working nodes in the computing area, managing computing tasks and distributing the computing tasks; a synchronization mechanism used for synchronizing the states of all working nodes in the calculation region is also arranged between the leader node and the follower node;
the working node is provided with a working agent, the working node is used for running a runtime driver used for completing the computing task, monitoring the running state of the runtime, and sending an execution progress and an execution result to the management node, and the runtime driver is used for providing a standardized interface to realize the interaction between the working agent and the runtime.
2. The cloud-edge resource management scheduling system in support of hybrid heterogeneous runtime of claim 1 wherein all worker nodes of the computing region are configured into a number of groups, each group containing a number of the worker nodes.
3. The cloud-edge resource management scheduling system supporting hybrid heterogeneous runtime of claim 1, wherein a task queue and a planning queue are further provided on the leader node, the task queue is used for managing computing tasks submitted by users, and the planning queue is used for managing allocation plans corresponding to the computing tasks.
4. The cloud-edge resource management scheduling system in support of hybrid heterogeneous runtime of claim 1 wherein a bidirectional channel is provided between the runtime driver and the work agent to support request-response communication and data streaming communication.
5. The cloud-edge resource management scheduling system in support of hybrid heterogeneous runtimes of claim 4, wherein the runtime driver is implemented based on a local network remote procedure call mechanism, the runtime driver being an independent process on the worker node.
6. The cloud-edge resource management scheduling system in support of hybrid heterogeneous runtime of claim 1, wherein the scheduler comprises: the system comprises a scheduler for appointing all working nodes to run the system service, a scheduler for appointing all working nodes to run the system batch processing task, a scheduler for selecting the best working node according to the condition to run the appointed service and a scheduler for selecting the proper working node according to the condition to run the appointed batch processing task.
7. The cloud edge resource management scheduling method for supporting mixed heterogeneous operation is characterized by comprising the following steps:
dividing the cloud edge integrated computing resource pool into a plurality of computing areas so that computing nodes in the same computing area are connected by a low-latency, high-speed network;
selecting a plurality of management nodes from the computing nodes in each computing area and setting the rest computing nodes as working nodes;
dividing all the management nodes into a leader node and a plurality of follower nodes according to a consensus protocol, wherein the leader node is used for managing the states of all the working nodes in the calculation area and synchronizing them to the follower nodes;
acquiring a computing task submitted by a user and sending the computing task to a task queue of the leader node;
running a scheduler on the leader node and the follower node;
sequentially taking out calculation tasks from the task queue of the leader node, sending the calculation tasks to any scheduler to generate an allocation plan according to a scheduling strategy, and sending the allocation plan to the plan queue of the leader node;
sending the calculation tasks in the task queue to the work nodes determined according to the distribution plans corresponding to the calculation tasks;
installing and running a runtime driver for completing the computation task on the working node according to the computation task, monitoring a running state, and sending an execution progress and an execution result to the management node; the runtime driver is to provide a standardized interface to interact with the runtime, the runtime driver running in a completely independent process.
8. The method for cloud-edge resource management scheduling in support of hybrid heterogeneous runtime of claim 7, further comprising:
sending a heartbeat signal to the management node on the working node to acquire the state of the working node;
sending state change messages of all working nodes to the leader node to obtain the states of all working nodes in the calculation area;
synchronizing states of all working nodes within the computing region between the leader node and the follower node.
9. The method for cloud edge resource management scheduling in support of hybrid heterogeneous runtime of claim 8, wherein the sending a heartbeat signal to the management node on the worker node to obtain the state of the worker node comprises:
and dividing the working nodes into a plurality of groups, and simultaneously sending heartbeat signals to the management node by all the working nodes in the groups.
10. The cloud edge resource management scheduling method supporting hybrid heterogeneous runtime of claim 7, wherein a work agent is provided on the work node, further comprising:
establishing a bidirectional channel between the runtime driver and the work agent;
and the work agent calls a standardized interface function provided by the runtime driver to interact with the runtime.
CN202310080136.7A 2023-02-08 2023-02-08 Cloud edge resource management scheduling method and system supporting hybrid heterogeneous operation Active CN115811549B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310080136.7A CN115811549B (en) 2023-02-08 2023-02-08 Cloud edge resource management scheduling method and system supporting hybrid heterogeneous operation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310080136.7A CN115811549B (en) 2023-02-08 2023-02-08 Cloud edge resource management scheduling method and system supporting hybrid heterogeneous operation

Publications (2)

Publication Number Publication Date
CN115811549A CN115811549A (en) 2023-03-17
CN115811549B true CN115811549B (en) 2023-04-14

Family

ID=85487703

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310080136.7A Active CN115811549B (en) 2023-02-08 2023-02-08 Cloud edge resource management scheduling method and system supporting hybrid heterogeneous operation

Country Status (1)

Country Link
CN (1) CN115811549B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103051710A (en) * 2012-12-20 2013-04-17 中国科学院深圳先进技术研究院 Virtual cloud platform management system and method
CN111258744A (en) * 2018-11-30 2020-06-09 中兴通讯股份有限公司 Task processing method based on heterogeneous computation and software and hardware framework system
WO2022087415A1 (en) * 2020-10-22 2022-04-28 Arizona Board Of Regents On Behalf Of Arizona State University Runtime task scheduling using imitation learning for heterogeneous many-core systems

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200249998A1 (en) * 2019-02-01 2020-08-06 Alibaba Group Holding Limited Scheduling computation graph heterogeneous computer system

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103051710A (en) * 2012-12-20 2013-04-17 中国科学院深圳先进技术研究院 Virtual cloud platform management system and method
CN111258744A (en) * 2018-11-30 2020-06-09 中兴通讯股份有限公司 Task processing method based on heterogeneous computation and software and hardware framework system
WO2022087415A1 (en) * 2020-10-22 2022-04-28 Arizona Board Of Regents On Behalf Of Arizona State University Runtime task scheduling using imitation learning for heterogeneous many-core systems

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Shi Weisong; Sun Hui; Cao Jie; Zhang Quan; Liu Wei. Edge Computing: A New Computing Model in the Era of the Internet of Everything. Journal of Computer Research and Development. 2017, (05), pp. 4-21. *

Also Published As

Publication number Publication date
CN115811549A (en) 2023-03-17

Similar Documents

Publication Publication Date Title
CN107291547B (en) Task scheduling processing method, device and system
CN100568182C (en) The method and system of distribution work in the data handling system of subregion logically
CN110324417B (en) Cloud service execution body dynamic reconstruction method based on mimicry defense
US6625638B1 (en) Management of a logical partition that supports different types of processors
CN110888743B (en) GPU resource using method, device and storage medium
US9208029B2 (en) Computer system to switch logical group of virtual computers
JP5412585B2 (en) Server apparatus, resource management method and program
CN111338774B (en) Distributed timing task scheduling system and computing device
CA3168286A1 (en) Data flow processing method and system
US8185905B2 (en) Resource allocation in computing systems according to permissible flexibilities in the recommended resource requirements
JP2001331333A (en) Computer system and method for controlling computer system
US20070255835A1 (en) Resource reservation for massively parallel processing systems
JPH03194647A (en) Fault notifying method
KR20160087706A (en) Apparatus and method for resource allocation of a distributed data processing system considering virtualization platform
US20210406127A1 (en) Method to orchestrate a container-based application on a terminal device
CN113382077B (en) Micro-service scheduling method, micro-service scheduling device, computer equipment and storage medium
CN109992373B (en) Resource scheduling method, information management method and device and task deployment system
CN115964176B (en) Cloud computing cluster scheduling method, electronic equipment and storage medium
CN115811549B (en) Cloud edge resource management scheduling method and system supporting hybrid heterogeneous operation
CN112698929A (en) Information acquisition method and device
US5881227A (en) Use of daemons in a partitioned massively parallel processing system environment
JP2003271404A (en) Multiprocessor system
CN111158872B (en) Method and device for submitting and guarding spark task
CN114745377A (en) Edge cloud cluster service system and implementation method
CN109634721B (en) Method and related device for starting communication between virtual machine and host

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant