CN113312165B - Task processing method and device - Google Patents

Info

Publication number
CN113312165B
Authority
CN
China
Prior art keywords
task
container
request
unit
uniform resource
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110857003.7A
Other languages
Chinese (zh)
Other versions
CN113312165A (en)
Inventor
陈文灿
周明伟
钱浩东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Dahua Technology Co Ltd
Original Assignee
Zhejiang Dahua Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Dahua Technology Co Ltd
Priority to CN202110857003.7A
Publication of CN113312165A
Application granted
Publication of CN113312165B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals

Abstract

Embodiments of the invention provide a task processing method and device applicable to a resource management system running in a container cluster management system. A resource management unit (RM) and a node agent unit (NM) of the resource management system each run in the container cluster management system as an application instance (POD). The method comprises the following steps: the RM receives a task processing request and notifies the NM to generate a task coordination request, which is sent to a task scheduling unit (AM) through a uniform resource interface of the container cluster management system; the AM generates a task starting request according to the task coordination request and sends it to the NM; the NM generates a task container creation request according to the task starting request and sends it to the uniform resource interface, the task container creation request comprising description information of the task and description information of the task container; and the uniform resource interface creates a task container according to the task container creation request, the task container being used by a task process to execute the task processing request. The method addresses the problem of resource contention between the node agent and task processing.

Description

Task processing method and device
Technical Field
The present application relates to the field of network technologies, and in particular, to a task processing method and apparatus.
Background
Apache Hadoop YARN (Yet Another Resource Negotiator) is a new Hadoop resource manager. It is a general-purpose resource management system that provides unified resource management and scheduling for upper-layer applications, and it benefits a cluster in terms of utilization, unified resource management, and data sharing. Kubernetes is an open-source system for managing containerized applications across multiple hosts in a cloud platform. It aims to make deploying containerized applications simple and efficient, and it provides mechanisms for application deployment, scheduling, updating, and maintenance.
In the prior art, a YARN system is generally run on a Kubernetes system as follows: an image file of a Hadoop system is obtained; the image file is loaded into the Kubernetes system; and the image file is run in a Docker container of the Kubernetes system to start the Hadoop service on Kubernetes. Although this approach can use idle Docker resources of the Kubernetes system, it has a drawback: the YARN task processes and the NodeManager (node agent unit NM) are in the same container and share that container's CPU and memory resources. When different tasks run concurrently, their processes contend with each other for CPU and memory; alternatively, if the container is configured with enough resources to avoid this, the NodeManager container holds a large allocation and resources are wasted.
Therefore, a task processing method and device are needed to solve the problem of resource contention between the node agent and task processing.
Disclosure of Invention
The embodiment of the invention provides a task processing method and a task processing device, which are used for solving the problem of resource competition between node agents and task processing.
In a first aspect, an embodiment of the present invention provides a task processing method, which is applicable to a resource management system running in a container cluster management system; a resource management unit RM and a node agent unit NM in said resource management system run in said container cluster management system in the manner of an application instance POD; the method comprises the following steps:
the resource management unit RM receives a task processing request sent by a client, informs the node agent unit NM of generating a task coordination request, and sends the task coordination request to a task scheduling unit AM through a uniform resource interface of the container cluster management system, wherein the task scheduling unit AM is created by the uniform resource interface according to the task coordination request, and the task scheduling unit AM runs in the container cluster management system in an application instance POD manner;
the task scheduling unit AM generates a task starting request according to the task coordination request and sends the task starting request to the node agent unit NM;
the node agent unit NM generates a task container creating request according to the task starting request, and sends the task container creating request to the uniform resource interface, wherein the task container creating request comprises task description information and task container description information;
and the uniform resource interface creates a task container according to the task container creating request, wherein the task container is used for a task process to execute the task processing request.
In the above method, the node agent unit NM is deployed only in the resource management server and not in the task scheduling server or the task processing servers. Instead, the resource management system interfaces with the uniform resource interface, through which it creates task containers, starts task processes, and manages and monitors tasks. The node agent unit NM in the resource management server generates the description information of the task and of the task container from the relevant information in the task processing request, and builds the task container creation request from that description information. The uniform resource interface receives the task container creation request and creates the task container in a server of the resource management system to complete task processing. A task container can therefore be created in a server that does not contain the node agent unit NM. Separating the node agent unit NM from the task process prevents resource contention between the node agent unit NM container and the task container, eliminates the prior-art contention among task processes inside the node agent unit NM container, and improves resource utilization and the stability of system operation. In addition, because resources are coordinated by the task scheduling unit AM, resources can be allocated flexibly through the task scheduling unit AM and the resource management unit RM.
Optionally, the sending the task coordination request to a task scheduling unit AM through a uniform resource interface of the container cluster management system includes: the uniform resource interface receives the task coordination request, creates a task scheduling unit AM in a task scheduling server according to the task coordination request, sends the task coordination request to the task scheduling unit AM, and creates a task container in the task scheduling unit AM according to the task coordination request, wherein the task coordination request comprises description information of a task and description information of the task container, and the task scheduling server is determined by the resource management unit RM according to the task processing request.
In the above method, the task scheduling unit AM may perform resource coordination on the processing of the task according to the task coordination request. Therefore, reasonable allocation of resources can be guaranteed, and the resource utilization rate and the task processing efficiency are improved.
Optionally, before the task scheduling unit AM generates a task starting request according to the task coordination request, the method further includes:
the task scheduling unit AM registers with the resource management unit RM, and the registration is used for establishing a heartbeat connection between the task scheduling unit AM and the resource management unit RM;
and the task scheduling unit AM acquires task job resources from the resource management unit RM, and generates a task starting request according to the task job resources, wherein the task job resources comprise at least one server ID.
In the above method, the task scheduling unit AM registers in the resource management unit RM, so that the task scheduling unit AM and the resource management unit RM are in heartbeat connection. Therefore, the resource management server can monitor the task processing state in the task scheduling server at any time, can perform actions such as restarting after the task is stopped, and improves the stability of task processing.
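The registration and resource acquisition described above can be pictured with a minimal Python sketch. The class name `ResourceManager`, its methods, the timeout value, and the server names are hypothetical illustrations, not part of the patent text or of the YARN API; the heartbeat is modeled simply as a timestamp held in memory.

```python
import time


class ResourceManager:
    """Toy stand-in for the RM: tracks AM heartbeats and hands out server IDs."""

    def __init__(self, free_servers, heartbeat_timeout=10.0):
        self.free_servers = list(free_servers)
        self.heartbeat_timeout = heartbeat_timeout  # assumed value, in seconds
        self.last_seen = {}

    def register(self, am_id, now=None):
        # Registration establishes the heartbeat relationship.
        self.last_seen[am_id] = time.monotonic() if now is None else now

    def heartbeat(self, am_id, now=None):
        if am_id not in self.last_seen:
            raise KeyError(f"{am_id} is not registered")
        self.last_seen[am_id] = time.monotonic() if now is None else now

    def is_alive(self, am_id, now=None):
        now = time.monotonic() if now is None else now
        seen = self.last_seen.get(am_id)
        return seen is not None and now - seen <= self.heartbeat_timeout

    def allocate(self, am_id, count):
        # Task job resources are granted only to a registered AM.
        if am_id not in self.last_seen:
            raise PermissionError("AM must register before requesting resources")
        granted = self.free_servers[:count]
        self.free_servers = self.free_servers[count:]
        return granted


rm = ResourceManager(["server-a", "server-b", "server-c"])
rm.register("am-1", now=0.0)
granted = rm.allocate("am-1", 2)
# The AM builds its task starting request from the granted server IDs.
start_task_request = {"task_id": "task-0001", "server_ids": granted}
print(start_task_request)
```

Passing `now` explicitly keeps the sketch deterministic; a real deployment would rely on the clock and on periodic heartbeat messages.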
Optionally, the generating, by the node agent unit NM, of a task container creation request according to the task starting request includes: the node agent unit NM generates the task container creation request according to at least one server ID in the task starting request, and sends the task container creation request to the uniform resource interface. The creating, by the uniform resource interface, of the task container according to the task container creation request includes: the uniform resource interface receives the task container creation request and creates a task container in the server corresponding to the at least one server ID, wherein the at least one server ID is any one or more of a resource management server ID and a task processing server ID.
In the above method, the task scheduling unit AM schedules resources according to the task job resources, that is, the at least one server ID, allocated by the resource management unit RM, which improves resource utilization. The at least one server ID identifies a server with sufficient resources, selected by the resource management unit RM from among the resource management server and the task processing servers, which improves task processing efficiency.
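As an illustration of how server IDs in a task starting request might be turned into task container creation requests, the following Python sketch builds one request per server ID. The one-container-per-server mapping, the `StartTaskRequest` type, and all field names are assumptions made for the example; the text only states that the container is created in the server corresponding to the at least one server ID.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class StartTaskRequest:
    """Hypothetical shape of the AM's task starting request."""
    task_id: str
    server_ids: List[str]  # task job resources granted by the RM
    cpu: str
    memory: str


def build_create_requests(req: StartTaskRequest) -> List[Dict]:
    """Build one task container creation request per granted server ID."""
    requests = []
    for index, server_id in enumerate(req.server_ids):
        requests.append({
            "task": {"id": req.task_id},
            "container": {
                "name": f"{req.task_id}-{index}",
                "server_id": server_id,  # where the container should be placed
                "resources": {"cpu": req.cpu, "memory": req.memory},
            },
        })
    return requests


start = StartTaskRequest("task-0001", ["server-a", "server-b"], "1", "2Gi")
create_requests = build_create_requests(start)
for r in create_requests:
    print(r["container"]["name"], "->", r["container"]["server_id"])
```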
Optionally, after the request for creating the task container is sent to the uniform resource interface, the method further includes: the node agent unit NM generates a task monitoring request and sends the task monitoring request to the uniform resource interface; and the uniform resource interface receives the task monitoring request and monitors the task container according to the task monitoring request.
In the above method, the node agent unit NM monitors the task container through the uniform resource interface. Therefore, the task process states in the task scheduling server and the task processing server can be obtained in time.
Optionally, after the uniform resource interface creates the task container according to the task container creation request, the method further includes:
the uniform resource interface acquires a task processing result and sends the task processing result to the node agent unit NM; the node agent unit NM obtains the task processing result and sends the task processing result to the resource management unit RM; and the resource management unit RM sends the task processing result to the client.
In the above method, the uniform resource interface detects that task processing is complete, obtains the task processing result, and sends it to the task scheduling server. The heartbeat connection between the task scheduling server and the resource management server allows the resource management server to obtain the task processing result promptly and send it to the client, which ensures that the client receives the task processing result in a timely manner.
Optionally, after the resource management unit RM sends the task processing result to the client, the method further includes: the uniform resource interface determines the task process is finished and informs the node agent unit NM of the task process finishing state; the node agent unit NM generates a task container deleting request according to the task process ending state and sends the task container deleting request to the uniform resource interface; and the uniform resource interface deletes the task container.
According to the method, after the task process is finished, the task container is deleted, so that resources can be released, and the utilization rate of the resources is improved.
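The result relay and cleanup sequence can be sketched in Python as follows. All three classes are in-memory stand-ins with hypothetical names and methods; they only model the ordering described above (result forwarded to the RM, then the container deleted once the end state is reported), not any real Kubernetes or YARN interface.

```python
class UniformResourceInterface:
    """In-memory stand-in for the cluster's uniform resource interface."""

    def __init__(self):
        self.containers = {}

    def create(self, name):
        self.containers[name] = "Running"

    def delete(self, name):
        self.containers.pop(name, None)


class ResourceManager:
    """Stand-in RM that records the results it would send to the client."""

    def __init__(self):
        self.delivered = []

    def on_result(self, task_id, result):
        self.delivered.append((task_id, result))


class NodeAgent:
    """Stand-in NM that relays results and requests container deletion."""

    def __init__(self, api, rm):
        self.api = api
        self.rm = rm

    def on_task_result(self, task_id, result):
        # Relay the result upstream; the RM forwards it to the client.
        self.rm.on_result(task_id, result)

    def on_task_ended(self, task_id):
        # Issue the delete request only after the end state is reported.
        self.api.delete(task_id)


api = UniformResourceInterface()
rm = ResourceManager()
nm = NodeAgent(api, rm)

api.create("task-0001")
nm.on_task_result("task-0001", "ok")
nm.on_task_ended("task-0001")
print(rm.delivered, api.containers)
```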
Optionally, the creating a task container according to the task container creating request by the uniform resource interface includes:
the uniform resource interface creates a container resource preparation container, a task container and a log container in a server of the resource management system according to the task container creating request, wherein the container resource preparation container is used for acquiring a container dependent file; the task container comprises a task process, and the task process is used for processing a task; and the log container is used for monitoring the task process and converging the log of the task process into a distributed file system after the task process is finished.
In the above method, the container resource preparation container obtains from the environment variables the list of resources on which the task depends, downloads each resource file, and stores the files under the specified path mounted in the container. This ensures that the task container and the log container run reliably: the task is processed by the task process, and the task logs are aggregated by the log container.
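One way to picture the three containers is as a single Kubernetes Pod manifest, sketched below as a Python dictionary. Mapping the container resource preparation container to an `initContainers` entry and the log container to a sidecar is an assumption of this example, as are the image name, the `TASK_RESOURCES` variable, the volume name, and the `/work` path; the text does not specify how the containers are arranged.

```python
import json


def build_task_pod(task_id, image, resource_files):
    """Return a Pod manifest with a preparation, a task, and a log container."""
    shared = {"name": "task-workdir", "mountPath": "/work"}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": task_id},
        "spec": {
            "restartPolicy": "Never",
            "volumes": [{"name": "task-workdir", "emptyDir": {}}],
            # Assumed mapping: the preparation step runs first as an init container
            # and reads the dependency list from an environment variable.
            "initContainers": [{
                "name": "resource-prepare",
                "image": image,
                "env": [{"name": "TASK_RESOURCES",
                         "value": json.dumps(resource_files)}],
                "volumeMounts": [shared],
            }],
            "containers": [
                # Runs the task process.
                {"name": "task", "image": image, "volumeMounts": [shared]},
                # Watches the task process and aggregates its logs afterwards.
                {"name": "log-collector", "image": image, "volumeMounts": [shared]},
            ],
        },
    }


pod = build_task_pod("task-0001", "example/task:latest", ["job.jar", "job.xml"])
print(json.dumps(pod, indent=2))
```

The shared `emptyDir` volume is what lets files downloaded by the preparation container be visible to the task and log containers in this sketch.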
In a second aspect, an embodiment of the present invention provides a task processing apparatus, which is suitable for a resource management system running in a container cluster management system; a resource management unit RM, a task scheduling unit AM and a node agent unit NM in the resource management system run in the container cluster management system in the manner of an application instance POD; the device includes:
a transceiver module, configured to receive a task processing request sent by a client, notify the node agent unit NM to generate a task coordination request, and send the task coordination request to a task scheduling unit AM through a uniform resource interface of the container cluster management system, where the task scheduling unit AM is created by the uniform resource interface according to the task coordination request, and the task scheduling unit AM runs in the container cluster management system in the manner of an application instance POD;
a processing module, configured to generate a task starting request according to the task coordination request, and send the task starting request to the node agent unit NM through the transceiver module;
the processing module is further configured to generate a task container creation request according to the task starting request, and send the task container creation request to the uniform resource interface through the transceiver module, where the task container creation request includes description information of a task and description information of a task container;
the processing module is further configured to create a task container according to the task container creation request, where the task container is used for a task process to execute the task processing request.
In a third aspect, an embodiment of the present application further provides a computing device, including: a memory for storing a program; a processor for calling the program stored in said memory and executing the method as described in the various possible designs of the first aspect according to the obtained program.
In a fourth aspect, embodiments of the present application further provide a computer-readable non-transitory storage medium including a computer-readable program which, when read and executed by a computer, causes the computer to perform the method as described in the various possible designs of the first aspect.
These and other implementations of the present application will be more readily understood from the following description of the embodiments.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a schematic diagram of a system architecture for task processing according to an embodiment of the present invention;
Fig. 2 is a system architecture diagram of task processing according to an embodiment of the present invention;
Fig. 3 is a schematic flowchart of a task processing method according to an embodiment of the present invention;
Fig. 4 is a schematic flowchart of a task processing method according to an embodiment of the present invention;
Fig. 5 is a schematic diagram of a task processing device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail with reference to the accompanying drawings, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 is a schematic diagram of a system architecture for task processing according to an embodiment of the present invention, in which a resource management system runs in a container cluster management system, and uses idle resources of the container cluster management system to improve resource utilization; a resource management unit RM and a node agent unit NM in a resource management system run in the container cluster management system in the manner of an application instance POD; after a client task processing request is sent to the resource management unit RM, the resource management unit RM determines a task scheduling server according to the task processing request; the resource management unit RM informs the node agent unit NM to generate a task coordination request, the task coordination request is sent to the task scheduling server through a uniform resource interface of the container cluster management system, a task scheduling unit AM is established in the task scheduling server, the uniform resource interface sends the task coordination request to the task scheduling unit AM, and the uniform resource interface of the container cluster management system establishes a task container in the task scheduling unit AM according to the task coordination request. The task scheduling unit AM registers with the resource management unit RM, establishes a heartbeat connection between the task scheduling unit AM and the resource management unit RM, and acquires task job resources from the resource management unit RM. 
The task scheduling unit AM generates a task starting request according to the task job resources and sends it to the node agent unit NM. The node agent unit NM generates, according to the task starting request, a task container creation request containing the description information of the task and of the task container, and sends it to the uniform resource interface. The node agent unit NM also generates a task monitoring request so that the uniform resource interface monitors the task container accordingly. The uniform resource interface creates a task container in the corresponding server (any one or more of the resource management server and task processing servers 1 to n) according to the task job resources in the task container creation request, so that a task process is started in the task container, executes the task processing request, and produces a task processing result. The uniform resource interface monitors the task process for its result and sends the task processing result to the node agent unit NM; the node agent unit NM forwards the result to the resource management unit RM, which sends it to the client. The uniform resource interface then determines that the task process has ended and deletes the task container. After determining that the task is finished, the task scheduling unit AM sends a deregistration request to the resource management unit RM and removes its own application instance POD.
In this way, idle resources of the container cluster management system can be used effectively, and contention for resources between the node agent unit NM and the task process is prevented; coordinated task job resources are obtained through the resource management unit RM, which ensures that resources are allocated reasonably; and after the task is finished, the task container is deleted and its resources are released, which further improves resource utilization.
Based on the system architecture in fig. 1, a system architecture diagram of task processing according to an embodiment of the present invention is shown in fig. 2;
the resource management unit RM and the node agent unit NM may be in one server, which may be referred to as the resource management server; the task scheduling unit AM is in a server that may be referred to as the task scheduling server; task processing servers 1 to n allow task processing resources to be used flexibly and prevent tasks from going unprocessed because of resource shortage. In addition, the container cluster management system may be a Kubernetes system, and the resource management system may be the YARN system of a Hadoop distributed system. A Hadoop cluster is deployed on top of Kubernetes, where JournalNodes (for sharing data between NameNodes), NameNodes (for managing the file system), ResourceManagers (for resource management), JobHistory (for collecting task run information, excluding task logs), DataNodes (for storing data), and NodeManagers (node agents) each run as independent pods. In this way, idle Kubernetes resources are used and resource utilization is improved. As can be seen from Fig. 1 and Fig. 2, the node agent unit NM is modified to interface with the uniform resource interface of Kubernetes, which builds a 'bridge' between the task container and the node agent unit NM and thereby 'separates' the node agent unit NM from the task process. It should be noted that the system architectures in Fig. 1 and Fig. 2 are only examples; for instance, the resource management unit RM and the node agent unit NM may also be in different servers, and the above example does not limit the technical implementation.
Based on this, the embodiment of the present application provides a flow of a task processing method, which is suitable for a resource management system running in a container cluster management system; a resource management unit RM and a node agent unit NM in said resource management system run in said container cluster management system in the manner of an application instance POD; as shown in fig. 3, includes:
step 301, the resource management unit RM receives a task processing request sent by a client, notifies the node proxy unit NM to generate a task coordination request, and sends the task coordination request to a task scheduling unit AM through a uniform resource interface of the container cluster management system, where the task scheduling unit AM is created by the uniform resource interface according to the task coordination request, and the task scheduling unit AM operates in the container cluster management system in an application instance POD manner;
Here, after the node agent unit NM receives the request to start the task scheduling unit AM sent by the resource management unit RM, it extracts from that request the task start context (including the environment variables of the task, the list of resource dependency files required to start the task, and the start parameters), the resource size for running the task, and the task ID. From the extracted information it constructs the POD description of the task (including the description information of the task and of the task container), generates the task coordination request from that description information, and sends the task coordination request to the uniform resource interface (API server) of Kubernetes, so that the uniform resource interface creates a task scheduling unit AM (ApplicationMaster) in the task scheduling server and creates, in the task scheduling unit AM, a pod that runs the task, that is, a task container.
The description information of the task and of the task container may be generated as follows: the task start context (including the environment variables, the resource dependency file list, and the start parameters for starting the task scheduling unit AM) and the token information carried in the task coordination request are serialized and encoded to produce the environment variables of the pod; the task ID is used as the pod name; the resources (CPU and memory) assigned to the task in the task coordination request are used as the pod's resource limits; a YAML description of the pod is constructed; and the uniform resource interface of Kubernetes is then requested to create and start the task pod.
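The construction of the pod description can be sketched in Python as below. The sketch assumes JSON serialization followed by base64 encoding, and a single environment variable named `TASK_LAUNCH_CONTEXT`; neither the encoding nor the variable name is given in the text. The manifest is printed as JSON rather than YAML only because the Python standard library has no YAML writer.

```python
import base64
import json


def encode_launch_context(context, token):
    """Serialize the start context and token to JSON, then base64-encode them."""
    payload = json.dumps({"context": context, "token": token}, sort_keys=True)
    return base64.b64encode(payload.encode("utf-8")).decode("ascii")


def build_pod_description(task_id, context, token, cpu, memory, image):
    """Build a Pod manifest: task ID as name, assigned resources as limits."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": task_id},
        "spec": {
            "restartPolicy": "Never",
            "containers": [{
                "name": "task",
                "image": image,  # hypothetical image name supplied by the caller
                "env": [{
                    "name": "TASK_LAUNCH_CONTEXT",  # assumed variable name
                    "value": encode_launch_context(context, token),
                }],
                "resources": {"limits": {"cpu": cpu, "memory": memory}},
            }],
        },
    }


context = {
    "env": {"JAVA_HOME": "/usr/lib/jvm/default"},
    "resource_files": ["job.jar"],
    "start_params": ["--verbose"],
}
pod = build_pod_description(
    "task-0001", context, "example-token", "1", "2Gi", "example/task:latest"
)
print(json.dumps(pod, indent=2))
```

Inside the container, the task process would reverse the two steps (base64-decode, then parse the JSON) to recover its start context.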
In addition, the resource management unit RM can include a task processing information list, and update the task processing state in real time according to the task processing state reported by the node agent unit NM, so that a worker can visually acquire the processing state of the task.
Step 302, the task scheduling unit AM generates a start task request according to the task coordination request, and sends the start task request to the node agent unit NM;
Step 303, the node agent unit NM generates a task container creation request according to the task starting request, and sends the task container creation request to the uniform resource interface, where the task container creation request includes description information of a task and description information of a task container;
Step 304, the uniform resource interface creates a task container according to the task container creation request, and the task container is used for a task process to execute the task processing request.
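Steps 301 to 304 amount to a fixed sequence of messages between the units. The Python sketch below records that sequence as a trace of (sender, receiver, message) tuples; the `Message` type and the wording of each entry are illustrative only and carry no behavior beyond the ordering stated in the steps.

```python
from typing import List, NamedTuple


class Message(NamedTuple):
    sender: str
    receiver: str
    content: str


def task_flow() -> List[Message]:
    """Return the message sequence of steps 301 to 304 in order."""
    return [
        # Step 301: request intake, coordination request, and AM creation.
        Message("client", "RM", "task processing request"),
        Message("RM", "NM", "notify: generate task coordination request"),
        Message("NM", "uniform resource interface", "task coordination request"),
        Message("uniform resource interface", "AM",
                "create AM pod and forward task coordination request"),
        # Step 302: the AM asks the NM to start the task.
        Message("AM", "NM", "task starting request"),
        # Step 303: the NM asks the interface for a task container.
        Message("NM", "uniform resource interface", "task container creation request"),
        # Step 304: the interface creates the container that runs the task process.
        Message("uniform resource interface", "task container",
                "create container and run task process"),
    ]


trace = task_flow()
for number, message in enumerate(trace, start=1):
    print(f"{number}. {message.sender} -> {message.receiver}: {message.content}")
```

Note that the NM never hosts the task process in this sequence; every container creation goes through the uniform resource interface.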
In the above method, the node agent unit NM is deployed only in the resource management server and not in the task scheduling server or the task processing servers. Instead, the resource management system interfaces with the uniform resource interface, through which it creates task containers, starts task processes, and manages and monitors tasks. The node agent unit NM in the resource management server generates the description information of the task and of the task container from the relevant information in the task processing request, and builds the task container creation request from that description information. The uniform resource interface receives the task container creation request and creates the task container in a server of the resource management system to complete task processing. A task container can therefore be created in a server that does not contain the node agent unit NM. Separating the node agent unit NM from the task process prevents resource contention between the node agent unit NM container and the task container, eliminates the prior-art contention among task processes inside the node agent unit NM container, and improves resource utilization and the stability of system operation. In addition, because resources are coordinated by the task scheduling unit AM, resources can be allocated flexibly through the task scheduling unit AM and the resource management unit RM.
The embodiment of the present application provides a task processing method, where the task coordination request is sent to a task scheduling unit AM through a uniform resource interface of the container cluster management system, including:
the uniform resource interface receives the task coordination request, creates a task scheduling unit AM in a task scheduling server according to the task coordination request, sends the task coordination request to the task scheduling unit AM, and creates a task container in the task scheduling unit AM according to the task coordination request, wherein the task coordination request comprises description information of a task and description information of the task container, and the task scheduling server is determined by the resource management unit RM according to the task processing request. That is, after receiving the task coordination request, the uniform resource interface extracts the description information of the task and the description information of the task container to create a task scheduling unit AM in the task scheduling server, and creates the task container in the task scheduling unit AM.
The embodiment of the present application provides a method for acquiring task job resources, where before the task scheduling unit AM generates a start task request according to the task coordination request, the method further includes: the task scheduling unit AM registers with the resource management unit RM, where the registration is used to establish a heartbeat connection between the task scheduling unit AM and the resource management unit RM; and the task scheduling unit AM acquires task job resources from the resource management unit RM and generates the start task request according to the task job resources, where the task job resources include at least one server ID. That is, before the task scheduling unit AM obtains the task job resources from the resource management unit RM, it first establishes a connection with the resource management unit RM. After the heartbeat connection is established, the task scheduling unit AM can obtain the task processing state that the resource management unit RM collects from the node agent unit NM, which makes it easier for the task scheduling unit AM to manage the task, for example to decide whether a task needs to be restarted if its processing is interrupted. In addition, because the task job resources, namely the at least one server ID, are acquired from the resource management unit RM, the acquired task job resources follow the optimal resource allocation scheme that the resource management unit RM determines according to the resource usage state in the resource management system, which improves resource utilization.
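The register, heartbeat, acquire sequence described above can be sketched as follows. This is a toy model under stated assumptions: `ResourceManagerStub`, its method names, the timeout value and the first-come allocation policy are hypothetical and merely stand in for whatever allocation scheme the resource management unit RM actually applies.

```python
from typing import Dict, List


class ResourceManagerStub:
    """Toy stand-in for the resource management unit RM (hypothetical)."""

    def __init__(self, free_server_ids: List[str], timeout_s: float = 30.0) -> None:
        self.free_server_ids = list(free_server_ids)
        self.timeout_s = timeout_s
        self.last_heartbeat: Dict[str, float] = {}

    def register(self, am_id: str, now: float) -> None:
        # Registration establishes the heartbeat connection.
        self.last_heartbeat[am_id] = now

    def heartbeat(self, am_id: str, now: float) -> None:
        if am_id not in self.last_heartbeat:
            raise KeyError(f"AM {am_id} is not registered")
        self.last_heartbeat[am_id] = now

    def is_alive(self, am_id: str, now: float) -> bool:
        last = self.last_heartbeat.get(am_id)
        return last is not None and now - last <= self.timeout_s

    def acquire(self, am_id: str, count: int, now: float) -> List[str]:
        # Only an AM with a live heartbeat connection may obtain resources.
        if not self.is_alive(am_id, now):
            raise PermissionError("register and keep the heartbeat alive first")
        # Assumed policy: hand out the first free servers in list order.
        granted = self.free_server_ids[:count]
        self.free_server_ids = self.free_server_ids[count:]
        return granted


def build_start_task_request(task_id: str, server_ids: List[str]) -> dict:
    """The start task request carries at least one server ID."""
    if not server_ids:
        raise ValueError("task job resources must include at least one server ID")
    return {"task_id": task_id, "server_ids": list(server_ids)}
```

In this sketch the time is passed in explicitly so that the heartbeat timeout can be exercised without waiting.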
An embodiment of the present application provides a method for creating a task container, where the node agent unit NM generates a task container creation request according to the task starting request, and includes:
the node agent unit NM generates the task container creation request according to at least one server ID in the start task request, and sends the task container creation request to the uniform resource interface;
the uniform resource interface creates the task container according to the task container creating request, and the method comprises the following steps:
and the uniform resource interface receives the task container creation request and creates a task container in the server corresponding to the at least one server ID, where the at least one server ID is any one or more of a resource management server ID and a task processing server ID. That is, instead of placing the task process in the container where the node agent unit NM is located, as in the prior art, the node agent unit NM of YARN is modified so that the different tasks run as Kubernetes pods. Compared with the prior art, in which the task process is placed inside the node agent unit NM, this improves resource utilization, eliminates resource preemption between the node agent unit NM and the task process as well as among task processes, and allows the resources for task processing to be allocated flexibly.
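One way to picture "a task container in the server corresponding to the server ID" is a Kubernetes Pod manifest pinned to a node. The sketch below builds such manifests as Python dictionaries. It assumes that a server ID can be used directly as a Kubernetes node name (`spec.nodeName`), which the application does not state; the function name, label key and pod naming scheme are likewise hypothetical.

```python
from typing import Dict, List


def task_pod_manifests(task_id: str, image: str, command: List[str],
                       server_ids: List[str]) -> List[Dict]:
    """Build one Pod manifest per server ID in the start task request."""
    manifests = []
    for index, server_id in enumerate(server_ids):
        manifests.append({
            "apiVersion": "v1",
            "kind": "Pod",
            "metadata": {
                # Assumed naming scheme; Pod names must be lowercase DNS labels.
                "name": f"{task_id}-{index}",
                "labels": {"task-id": task_id},
            },
            "spec": {
                # Assumption: the server ID equals the Kubernetes node name.
                "nodeName": server_id,
                # A finished task process should not be restarted automatically.
                "restartPolicy": "Never",
                "containers": [
                    {"name": "task", "image": image, "command": list(command)},
                ],
            },
        })
    return manifests
```

Because each task runs in its own pod, its resource limits are independent of the pod that hosts the node agent unit NM, which is the separation the paragraph above describes.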
The embodiment of the application provides a task container monitoring method, where after the task container creation request is sent to the uniform resource interface, the method further includes: the node agent unit NM generates a task monitoring request and sends the task monitoring request to the uniform resource interface; and the uniform resource interface receives the task monitoring request and monitors the task container according to the task monitoring request. That is, the node agent unit NM monitors the task running state through the uniform resource interface and reports it to the resource management unit RM, so that the task scheduling unit AM can decide on specific actions such as starting and ending the task.
The embodiment of the present application provides a method for processing a task result, where after the uniform resource interface creates a task container according to the task container creation request, the method further includes: the uniform resource interface acquires a task processing result and sends the task processing result to the node agent unit NM; the node agent unit NM obtains the task processing result and sends the task processing result to the resource management unit RM; and the resource management unit RM sends the task processing result to the client. Here, the uniform resource interface can monitor in real time the running state of the task process in each task container and, when the task running state changes, notify the node agent unit NM of the changed state. For example, when the task processing is completed and a task processing result is obtained, the node agent unit NM sends the task processing result to the resource management unit RM, which returns it to the client.
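As a rough illustration of monitoring through the uniform resource interface, the following sketch translates Kubernetes Pod phases into task running states and reports only the states that changed. The five phase names are the standard Kubernetes Pod phases; the task state names, the mapping itself and the function `changed_task_states` are assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Kubernetes Pod phases mapped to hypothetical task running states.
PHASE_TO_TASK_STATE = {
    "Pending": "WAITING",
    "Running": "RUNNING",
    "Succeeded": "FINISHED",
    "Failed": "FAILED",
    "Unknown": "UNKNOWN",
}


def changed_task_states(previous: Dict[str, str],
                        observed_phases: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (container_id, state) pairs whose state differs from `previous`.

    `previous` is updated in place so that the next call only reports
    new changes, which is what the NM would forward to the RM.
    """
    changes = []
    for container_id, phase in sorted(observed_phases.items()):
        state = PHASE_TO_TASK_STATE.get(phase, "UNKNOWN")
        if previous.get(container_id) != state:
            previous[container_id] = state
            changes.append((container_id, state))
    return changes
```

Reporting only changes keeps the traffic between the node agent unit NM and the resource management unit RM proportional to the number of state transitions rather than to the polling frequency.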
The embodiment of the present application provides a method for deleting a task container, where after the resource management unit RM sends the task processing result to the client, the method further includes: the uniform resource interface determines that the task process has ended and notifies the node agent unit NM of the task process end state; the node agent unit NM generates a task container deletion request according to the task process end state and sends the task container deletion request to the uniform resource interface; and the uniform resource interface deletes the task container. That is, the node agent unit NM obtains the task process end state through the uniform resource interface and, after the task processing result has been obtained, deletes the task container and releases the resources it occupies, thereby improving resource utilization. In addition, when the task scheduling unit AM determines that its own task process has finished, or that all task processes have finished, it generates a logout request, sends the logout request to the resource management unit RM to disconnect the heartbeat connection, and exits its own process.
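The condition under which the node agent unit NM issues a deletion request can be sketched as a small piece of bookkeeping: a task container is deleted only once both its processing result and its end state are known. The class `NodeAgentCleanup`, its method names and the shape of the returned request are hypothetical.

```python
from typing import Dict, Optional, Set


class NodeAgentCleanup:
    """Tracks when a task container may be deleted (illustrative only)."""

    def __init__(self) -> None:
        self._has_result: Set[str] = set()
        self._end_states: Dict[str, str] = {}

    def on_task_result(self, container_id: str) -> Optional[dict]:
        # Called when the task processing result has been obtained.
        self._has_result.add(container_id)
        return self._maybe_delete(container_id)

    def on_process_end(self, container_id: str, end_state: str) -> Optional[dict]:
        # Called when the uniform resource interface reports the end state.
        self._end_states[container_id] = end_state
        return self._maybe_delete(container_id)

    def _maybe_delete(self, container_id: str) -> Optional[dict]:
        if container_id in self._has_result and container_id in self._end_states:
            self._has_result.discard(container_id)
            end_state = self._end_states.pop(container_id)
            return {
                "action": "delete_task_container",
                "container_id": container_id,
                "end_state": end_state,
            }
        return None
```

Waiting for both events avoids releasing a container whose result has not yet been collected, whichever notification arrives first.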
The embodiment of the application provides a method for creating a task container, wherein a uniform resource interface creates the task container according to a task container creating request, and the method comprises the following steps:
the uniform resource interface creates a container resource preparation container, a task container and a log container in a server of the resource management system according to the task container creation request, where the container resource preparation container is used to acquire the files on which the container depends; the task container includes a task process, and the task process is used to process a task; and the log container is used to monitor the task process and to aggregate the logs of the task process into a distributed file system after the task process ends. That is, when a task container is created, a container resource preparation container and a log container are created as well. After the containers are created, the container resource preparation container (initContainer) is started first. Once started, the initContainer obtains the list of resources on which the task depends from the environment variables, downloads each resource file in turn from the distributed file system (HDFS), and saves the resource files to the designated path mounted in the container. After the initContainer finishes downloading the resources, its process exits automatically and the corresponding container ends. Kubernetes then starts the task container (container) that runs the task according to the description of the POD. The task container also obtains its starting context from the environment variables, creates a shell startup script for the task scheduling unit AM according to the task's environment variables, the list of locally downloaded resource files and the startup parameters, and executes the script to create the corresponding process. The last container created is the log container (sidecar), which is responsible for aggregating the logs.
After the process in the task container is started, the log container monitors the process running in the task container; after the task process exits (whether it exits normally or abnormally), the log container aggregates the logs of the task under the specified path and uploads them to HDFS.
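The three-container layout described above maps naturally onto a single Kubernetes Pod with one init container and two regular containers that share volumes. The sketch below builds such a manifest as a Python dictionary. The field names (`initContainers`, `containers`, `volumes`, `emptyDir`, `volumeMounts`, `env`) are standard Pod fields, while the image names, mount paths, environment variable names and the function `build_task_pod` are assumptions chosen for illustration.

```python
from typing import Dict, List


def build_task_pod(task_id: str, resource_files: List[str]) -> Dict:
    """Build a Pod with a resource preparation, a task and a log container."""
    resources_mount = {"name": "resources", "mountPath": "/task/resources"}
    logs_mount = {"name": "logs", "mountPath": "/task/logs"}
    env = [
        {"name": "TASK_ID", "value": task_id},
        # Hypothetical variable carrying the list of dependent resource files.
        {"name": "TASK_RESOURCE_LIST", "value": ",".join(resource_files)},
    ]
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": f"task-{task_id}"},
        "spec": {
            "restartPolicy": "Never",
            # Scratch volumes shared by the containers of this Pod.
            "volumes": [
                {"name": "resources", "emptyDir": {}},
                {"name": "logs", "emptyDir": {}},
            ],
            # Runs to completion first: downloads the dependent files.
            "initContainers": [{
                "name": "resource-prepare",
                "image": "example/resource-prepare:latest",
                "env": env,
                "volumeMounts": [resources_mount],
            }],
            "containers": [
                {   # Runs the task process.
                    "name": "task",
                    "image": "example/task-runner:latest",
                    "env": env,
                    "volumeMounts": [resources_mount, logs_mount],
                },
                {   # Sidecar that collects logs once the task process exits.
                    "name": "log-aggregator",
                    "image": "example/log-aggregator:latest",
                    "env": env[:1],
                    "volumeMounts": [logs_mount],
                },
            ],
        },
    }
```

In Kubernetes, init containers run to completion before the regular containers of a Pod are started, which matches the ordering in which the resource preparation container finishes before the task container runs.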
Based on the above method flow, an embodiment of the present application provides a flow of a task processing method, as shown in fig. 4, including:
step 401, the client sends a task processing request to the resource management unit RM.
Step 402, after receiving the task processing request sent by the client, the resource management unit RM sends a request for starting a task scheduling unit AM to the node agent unit NM, where the request for starting the task scheduling unit AM includes a task starting context, and the task starting context includes: environment variables, a list of resource dependency files required for starting the task, startup parameters, identification or address information of the task scheduling server, the size of the resources for running the task, a task ID, and the like.
Step 403, the node agent unit NM constructs the description information of the task and the description information of the task container according to the task start context, and generates a task coordination request according to the description information of the task and the description information of the task container.
Step 404, the node agent unit NM sends the task coordination request to the uniform resource interface; and sends a task monitoring request to the uniform resource interface.
Step 405, after receiving the task coordination request, the uniform resource interface creates a task scheduling unit AM in the task scheduling server, and creates a container resource preparation container, a task container and a log container in the task scheduling unit AM; and the uniform resource interface receives the task monitoring request and monitors the task container.
Step 406, after the container resource preparation container, the task container and the log container in the task scheduling unit AM are created, the task scheduling unit AM registers with the resource management unit RM and establishes a heartbeat connection.
Step 407, the task scheduling unit AM sends a resource acquisition request to the resource management unit RM.
Step 408, after receiving the resource acquisition request, the resource management unit RM determines the task job resources according to information such as resource usage and resource status, and returns the task job resources to the task scheduling unit AM.
Step 409, the task scheduling unit AM generates a start task request according to the task job resource, and sends the start task request to the node agent unit NM.
Step 410, the node agent unit NM generates a task container creation request according to the task job resource in the start task request, and sends the task container creation request to the uniform resource interface.
Step 411, the node agent unit NM generates a task monitoring request, and sends the task monitoring request to the uniform resource interface, so that the uniform resource interface monitors each task container corresponding to the task container creating request.
Step 412, the uniform resource interface receives the request for creating the task container, and creates the task container in the corresponding resource management server and/or task processing server.
Step 413, the node agent unit NM detects through the uniform resource interface that the task process in the task container has ended; the log container aggregates the logs of the task process into the distributed file system; and the node agent unit NM obtains the task processing result and sends it to the resource management unit RM.
Step 414, the node agent unit NM receives the task process end state notification sent by the uniform resource interface, generates a task container deletion request according to the task process end state, sends the task container deletion request to the uniform resource interface, and deletes the task container through the uniform resource interface.
Step 415, the resource management unit RM sends the task processing result to the client.
Step 416, the task scheduling unit AM logs out from the resource management unit RM and disconnects the heartbeat connection.
It should be noted that the order of the steps in the above task processing flow is not fixed; for example, step 412 may be executed before step 411, and the order shown does not limit the specific implementation of the scheme.
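The remark about step order can be made concrete with a small dependency check. The sketch below encodes, for steps 409 to 413 only, which earlier steps each step needs, and tests whether a proposed execution order respects them. The dependency table is one reading of the flow made for this example and is not given by the application; the names `STEP_DEPENDENCIES` and `is_valid_order` are hypothetical.

```python
from typing import Dict, Iterable, Set

# Assumed prerequisites for a subset of the flow in fig. 4.
STEP_DEPENDENCIES: Dict[int, Set[int]] = {
    409: set(),        # AM sends the start task request
    410: {409},        # NM sends the task container creation request
    411: {410},        # NM sends the task monitoring request
    412: {410},        # uniform resource interface creates the task containers
    413: {411, 412},   # NM observes the end of the task process
}


def is_valid_order(order: Iterable[int],
                   deps: Dict[int, Set[int]] = STEP_DEPENDENCIES) -> bool:
    """Return True if every step runs after all of its prerequisites."""
    done: Set[int] = set()
    for step in order:
        if not deps.get(step, set()) <= done:
            return False
        done.add(step)
    return True
```

Under this reading, steps 411 and 412 both depend only on step 410, so either may run first, whereas creating the containers before the creation request has been sent would be rejected.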
Based on the same conception, the embodiment of the invention provides a task processing device which is suitable for a resource management system running in a container cluster management system; a resource management unit RM and a node agent unit NM in said resource management system run in said container cluster management system in the manner of an application instance POD; fig. 5 is a schematic diagram of a task processing device according to an embodiment of the present application, as shown in fig. 5, including:
a transceiver module 501, configured to receive a task processing request sent by a client, notify the node agent unit NM to generate a task coordination request, and send the task coordination request to a task scheduling unit AM through a uniform resource interface of the container cluster management system, where the task scheduling unit AM is created by the uniform resource interface according to the task coordination request, and the task scheduling unit AM runs in the container cluster management system in an application instance POD manner;
a processing module 502, configured to generate a task starting request according to the task coordination request, and send the task starting request to the node agent unit NM through the transceiver module 501;
the processing module 502 is further configured to generate a task container creation request according to the task starting request, and send the task container creation request to the uniform resource interface through the transceiver module 501, where the task container creation request includes description information of a task and description information of a task container;
the processing module 502 is further configured to create a task container according to the task container creating request, where the task container is used for a task process to execute the task processing request.
Optionally, the transceiver module 501 is specifically configured to: receive the task coordination request, create the task scheduling unit AM in a task scheduling server according to the task coordination request, send the task coordination request to the task scheduling unit AM, and create, through the processing module 502, a task container in the task scheduling unit AM according to the task coordination request, where the task coordination request includes description information of a task and description information of a task container, and the task scheduling server is determined by the resource management unit RM according to the task processing request.
Optionally, the processing module 502 is further configured to: register with the resource management unit RM, where the registration is used to establish a heartbeat connection between the task scheduling unit AM and the resource management unit RM; and acquire task job resources from the resource management unit RM and generate a start task request according to the task job resources, where the task job resources include at least one server ID.
Optionally, the processing module 502 is specifically configured to: generate the task container creation request according to at least one server ID in the start task request, and send the task container creation request to the uniform resource interface through the transceiver module 501; the processing module 502 is further specifically configured to: receive the task container creation request through the transceiver module 501, and create a task container in the server corresponding to the at least one server ID, where the at least one server ID is any one or more of a resource management server ID and a task processing server ID.
Optionally, the transceiver module 501 is further configured to: generate a task monitoring request through the processing module 502 and send the task monitoring request to the uniform resource interface; and receive the task monitoring request and monitor the task container through the processing module 502 according to the task monitoring request.
Optionally, the processing module 502 is further configured to: acquire a task processing result and send the task processing result to the node agent unit NM through the transceiver module 501; acquire the task processing result and send the task processing result to the resource management unit RM; and send the task processing result to the client through the transceiver module 501.
Optionally, the transceiver module 501 is further configured to: determine that the task process has ended, and notify the node agent unit NM of the task process end state; generate, through the processing module 502, a task container deletion request according to the task process end state, and send the task container deletion request to the uniform resource interface; and delete the task container through the processing module 502.
Optionally, the processing module 502 is specifically configured to: create a container resource preparation container, a task container and a log container in a server of the resource management system according to the task container creation request, where the container resource preparation container is used to acquire the files on which the container depends; the task container includes a task process, and the task process is used to process a task; and the log container is used to monitor the task process and to aggregate the logs of the task process into a distributed file system after the task process ends.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (11)

1. A task processing method is characterized by being suitable for a resource management system running in a container cluster management system; a resource management unit RM and a node agent unit NM in the resource management system operate in the container cluster management system in an application instance POD manner, the container cluster management system is a Kubernetes system, and the resource management system is a YARN system; the method comprises the following steps:
the resource management unit RM receives a task processing request sent by a client, informs the node agent unit NM of generating a task coordination request, and sends the task coordination request to a task scheduling unit AM through a uniform resource interface of the container cluster management system, wherein the task scheduling unit AM is created by the uniform resource interface according to the task coordination request, and the task scheduling unit AM runs in the container cluster management system in an application instance POD manner;
the task scheduling unit AM generates a task starting request according to the task coordination request and sends the task starting request to the node agent unit NM;
the node agent unit NM generates a task container creating request according to the task starting request, and sends the task container creating request to the uniform resource interface, wherein the task container creating request comprises task description information and task container description information;
and the uniform resource interface creates a task container according to the task container creating request, wherein the task container is used for a task process to execute the task processing request.
2. The method according to claim 1, wherein sending the task coordination request to a task scheduling unit AM through a uniform resource interface of the container cluster management system comprises:
the uniform resource interface receives the task coordination request, creates a task scheduling unit AM in a task scheduling server according to the task coordination request, sends the task coordination request to the task scheduling unit AM, and creates a task container in the task scheduling unit AM according to the task coordination request, wherein the task coordination request comprises description information of a task and description information of the task container, and the task scheduling server is determined by the resource management unit RM according to the task processing request.
3. The method according to claim 1, wherein before the task scheduling unit AM generating a start task request according to the task coordination request, it further comprises:
the task scheduling unit AM registers in the resource management unit RM, and the registration is used for establishing heartbeat connection between the task scheduling unit AM and the resource management unit RM;
and the task scheduling unit AM acquires task operation resources from the resource management unit RM, and generates a task starting request according to the task operation resources, wherein the task operation resources comprise at least one server ID.
4. A method according to claim 1, wherein said node agent unit NM generates a create task container request according to said start task request, comprising:
the node agent unit NM generates the task container creation request according to at least one server ID in the start task request, and sends the task container creation request to the uniform resource interface;
the uniform resource interface creates the task container according to the task container creating request, and the method comprises the following steps:
and the uniform resource interface receives the task container creating request, creates a task container in a server corresponding to the at least one server ID, wherein the at least one server ID is any one or more of a resource management server ID and a task processing server ID.
5. The method of claim 1, wherein after sending a create task container request to the uniform resource interface, further comprising:
the node agent unit NM generates a task monitoring request and sends the task monitoring request to the uniform resource interface;
and the uniform resource interface receives the task monitoring request and monitors the task container according to the task monitoring request.
6. The method of claim 1, wherein after the uniform resource interface creates a task container according to the task container creating request, further comprising:
the uniform resource interface acquires a task processing result and sends the task processing result to the node agent unit NM;
the node agent unit NM obtains the task processing result and sends the task processing result to the resource management unit RM;
and the resource management unit RM sends the task processing result to the client.
7. The method according to claim 6, wherein after the resource management unit RM sends the task processing result to the client, further comprising:
the uniform resource interface determines the task process is finished and informs the node agent unit NM of the task process finishing state;
the node agent unit NM generates a task container deleting request according to the task process ending state and sends the task container deleting request to the uniform resource interface;
and the uniform resource interface deletes the task container.
8. The method of any of claims 1-6, wherein the uniform resource interface creating a task container according to the create task container request comprises:
the uniform resource interface creates a container resource preparation container, a task container and a log container in a server of the resource management system according to the task container creating request, wherein the container resource preparation container is used for acquiring a container dependent file; the task container comprises a task process, and the task process is used for processing a task; and the log container is used for monitoring the task process and converging the log of the task process into a distributed file system after the task process is finished.
9. A task processing apparatus adapted to a resource management system operating in a container cluster management system; a resource management unit RM and a node agent unit NM in the resource management system operate in the container cluster management system in an application instance POD manner, the container cluster management system is a Kubernetes system, and the resource management system is a YARN system; the device comprises:
a transceiver module, configured to receive a task processing request sent by a client, notify the node agent unit NM to generate a task coordination request, and send the task coordination request to a task scheduling unit AM through a uniform resource interface of the container cluster management system, where the task scheduling unit AM is created by the uniform resource interface according to the task coordination request, and the task scheduling unit AM operates in the container cluster management system in an application instance POD manner;
a processing module, configured to generate a task starting request according to the task coordination request, and send the task starting request to the node agent unit NM through the transceiver module;
the processing module is further configured to generate a task container creation request according to the task starting request, and send the task container creation request to the uniform resource interface through the transceiver module, where the task container creation request includes description information of a task and description information of a task container;
the processing module is further configured to create a task container according to the task container creation request, where the task container is used for a task process to execute the task processing request.
10. A computer-readable storage medium, characterized in that it stores a program which, when run on a computer, causes the computer to carry out the method of any one of claims 1 to 8.
11. A computer device, comprising:
a memory for storing a computer program;
a processor for calling a computer program stored in said memory to execute the method of any of claims 1 to 8 in accordance with the obtained program.
CN202110857003.7A 2021-07-28 2021-07-28 Task processing method and device Active CN113312165B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110857003.7A CN113312165B (en) 2021-07-28 2021-07-28 Task processing method and device

Publications (2)

Publication Number Publication Date
CN113312165A CN113312165A (en) 2021-08-27
CN113312165B true CN113312165B (en) 2021-11-16

Family

ID=77381872

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110857003.7A Active CN113312165B (en) 2021-07-28 2021-07-28 Task processing method and device

Country Status (1)

Country Link
CN (1) CN113312165B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114116133A (en) * 2021-11-30 2022-03-01 北京字节跳动网络技术有限公司 Container recycling method, device, equipment and storage medium
CN114936898B (en) * 2022-05-16 2023-04-18 广州高专资讯科技有限公司 Management system, method, equipment and storage medium based on spot supply
CN115145695B (en) * 2022-08-30 2022-12-06 浙江大华技术股份有限公司 Resource scheduling method and device, computer equipment and storage medium
CN115599517A (en) * 2022-10-13 2023-01-13 阿里巴巴(中国)有限公司(Cn) Processing method and device of timing task and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160188594A1 (en) * 2014-12-31 2016-06-30 Cloudera, Inc. Resource management in a distributed computing environment
CN106953910A (en) * 2017-03-17 2017-07-14 郑州云海信息技术有限公司 Hadoop compute-storage separation method
US20180074855A1 (en) * 2016-09-14 2018-03-15 Cloudera, Inc. Utilization-aware resource scheduling in a distributed computing cluster
CN109117259A (en) * 2018-07-25 2019-01-01 北京京东尚科信息技术有限公司 Method for scheduling task, platform, device and computer readable storage medium
CN109739640A (en) * 2018-12-13 2019-05-10 北京计算机技术及应用研究所 Container resource management system based on the Shenwei (Sunway) architecture
US20190370146A1 (en) * 2018-06-05 2019-12-05 Shivnath Babu System and method for data application performance management
CN112104504A (en) * 2020-09-17 2020-12-18 汇智点亮科技(北京)有限公司 Transaction management framework for large-scale resource access, design method and cloud platform
CN113094150A (en) * 2021-04-02 2021-07-09 上海中通吉网络技术有限公司 Method and equipment for dynamically generating Spark port access in Jupyter container

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111327681A (en) * 2020-01-21 2020-06-23 北京工业大学 Cloud computing data platform construction method based on Kubernetes

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
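Research on a Containerized Big Data Cloud Platform and Its Storage and Scheduling Technology; Huang Kaixuan; China Excellent Doctoral and Master's Theses Full-text Database (Master's); 20210115; full text *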

Also Published As

Publication number Publication date
CN113312165A (en) 2021-08-27

Similar Documents

Publication Publication Date Title
CN113312165B (en) Task processing method and device
US11316800B2 (en) Method and system for managing applications
CN112632566B (en) Vulnerability scanning method and device, storage medium and electronic equipment
CN110752947A (en) K8s cluster deployment method and device, and deployment platform
CN107688502B (en) Inter-process communication method and device
CN109194538A (en) Test method, device, server and storage medium based on distributed coordination
EP3249871A1 (en) Method and device for updating network service descriptor
CN113031874B (en) Cache processing method, device, equipment and storage medium based on Kubernetes cluster
CN111930525A (en) GPU resource use method, electronic device and computer readable medium
CN115617497A (en) Thread processing method, scheduling component, monitoring component, server and storage medium
Seelam et al. Polyglot application auto scaling service for platform as a service cloud
CN111163140A (en) Method, apparatus and computer readable storage medium for resource acquisition and allocation
CN113986534A (en) Task scheduling method and device, computer equipment and computer readable storage medium
CN113391878A (en) Remote access method, device, system and storage medium
CN111522630B (en) Method and system for executing planned tasks based on batch dispatching center
CN114189439A (en) Automatic capacity expansion method and device
CN113220480B (en) Distributed data task cross-cloud scheduling system and method
CN111464368B (en) Device and method for quickly realizing signaling tracking in network management system
CN115268909A (en) Method, system and terminal for establishing and running construction task at web front end
CN115113975A (en) Cluster management method and device, electronic equipment and storage medium
CN114579298A (en) Resource management method, resource manager, and computer-readable storage medium
CN111427634A (en) Atomic service scheduling method and device
CN115686813A (en) Resource scheduling method and device, electronic equipment and storage medium
CN108418725B (en) Method and equipment for network monitoring and computer readable storage medium
CN115525390A (en) Container processing method and server

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant