CN111897622B - High-throughput computing method and system based on container technology - Google Patents

High-throughput computing method and system based on container technology

Info

Publication number
CN111897622B
CN111897622B (Application CN202010523599.2A)
Authority
CN
China
Prior art keywords
container
workflow
job
grid
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010523599.2A
Other languages
Chinese (zh)
Other versions
CN111897622A (en)
Inventor
黄荷
徐蕴琪
金钟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Computer Network Information Center of CAS
Original Assignee
Computer Network Information Center of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Computer Network Information Center of CAS filed Critical Computer Network Information Center of CAS
Priority to CN202010523599.2A priority Critical patent/CN111897622B/en
Publication of CN111897622A publication Critical patent/CN111897622A/en
Application granted granted Critical
Publication of CN111897622B publication Critical patent/CN111897622B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a high-throughput computing method and system based on container technology, relating to the field of high-throughput computing. A workflow description file defines workflow jobs; each workflow job consists of one or more subtask jobs, and the dependencies between subtask jobs are defined by a directed graph. Each subtask job is built as a job container and connected to a resource pool, where the resource pool comprises computing and storage resources drawn mainly from local physical resources, grid resources, and virtualized resources. The subtasks are then scheduled, dispatched, run, monitored, and managed according to their dependencies. Using containers and related technologies, the invention realizes a high-throughput computing platform that interfaces with local physical resources, grid resources, and virtualized cloud resources, shields computations from environmental factors, meets resource requirements in different scenarios, improves research efficiency and flexibility, and supports workflow-shaped computing tasks.

Description

High-throughput computing method and system based on container technology
Technical Field
The invention relates to the field of high-throughput computing, and in particular to a high-throughput computing method and system based on container technology.
Background
With the development of cloud computing and virtualization technologies, containers package an application and its dependent environment in a standardized way through kernel-level lightweight virtualization, providing an isolated running environment for applications and services; they are fast, efficient, and easy to migrate. Compared with a traditional virtual machine, a container runs directly on the host operating system, so its additional demand on system resources is far lower than that of a virtual machine. Container orchestration frameworks and related technologies provide comprehensive support for orchestrating and managing many containers, so containers are widely applied in business scenarios such as continuous integration and continuous deployment, automated testing, and microservices.
High-throughput computing covers applications such as high-throughput materials computing, high-throughput materials integration computing, and materials-genome computing. Typically, high-throughput computing jobs are executed from the command line on a computing cluster where the relevant applications are installed. Running computing tasks in this mode still presents several challenges. First, computations place high demands on the environment: installing a new third-party computing application on an older operating system often leads to incompatibility with the operating system or system software, and the time cost and risk of upgrading them are high, which increases the cost of compatibility work and debugging. Second, reproducibility of results is hard to guarantee: reproducing a data result generally requires a complete reproducibility mechanism for the whole system stack, and relying on source-code consistency alone cannot guarantee that the exact environment that produced a specific result can be reproduced, which reduces the usability of high-throughput computing. Third, computing resources come in a single form and can hardly meet the needs of different computing scenarios. A high-throughput computing task typically includes multiple relatively independent subtask steps, and the subtasks together with their interrelations form a workflow, so high-throughput computing tasks can often be described as workflows. Support for workflow-shaped tasks is therefore an important requirement for high-throughput computing systems.
Disclosure of Invention
The invention aims to solve the problems of difficult environment compatibility, difficult result reproduction, and the single form of computing resources in high-throughput computing scenarios, and provides a high-throughput computing method and system based on container technology.
To this end, the invention provides the following technical solutions:
a high throughput computing method based on container technology, comprising the steps of:
defining workflow jobs through a workflow description file, wherein each workflow job consists of one or more subtask jobs, the subtask jobs are executed serially or synchronously and parallelly according to the sequence, and the dependency relationship between the subtask jobs is defined through a directed graph;
constructing the subtask operation into an operation container in a container mode, and connecting a resource pool, wherein the resource pool comprises computing resources and storage resources which are mainly composed of local physical resources, grid resources and virtualized resources;
and scheduling, distributing, running, monitoring and managing each subtask according to the dependency relationship.
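The directed-graph dependency model above can be made concrete with a short sketch. The snippet below is illustrative only (the task names and the dictionary-based description format are hypothetical, not the patent's actual file format): it groups subtask jobs into batches, where jobs within a batch have all dependencies satisfied and may run in parallel, and batches run serially.

```python
def execution_batches(subtasks, deps):
    """Group subtask jobs into batches: each batch contains jobs whose
    dependencies are all satisfied, so they may run in parallel; the
    batches themselves execute serially in order."""
    indegree = {t: 0 for t in subtasks}
    children = {t: [] for t in subtasks}
    for task, parents in deps.items():
        for p in parents:
            indegree[task] += 1
            children[p].append(task)
    batches = []
    ready = [t for t in subtasks if indegree[t] == 0]
    while ready:
        batches.append(sorted(ready))
        nxt = []
        for t in ready:
            for c in children[t]:
                indegree[c] -= 1
                if indegree[c] == 0:
                    nxt.append(c)
        ready = nxt
    if sum(len(b) for b in batches) != len(subtasks):
        raise ValueError("dependency graph contains a cycle")
    return batches

# Hypothetical 4-step high-throughput workflow: a preparation step,
# two independent computations, then an aggregation step.
workflow = ["prepare", "calc_a", "calc_b", "aggregate"]
dependencies = {"calc_a": ["prepare"], "calc_b": ["prepare"],
                "aggregate": ["calc_a", "calc_b"]}
print(execution_batches(workflow, dependencies))
# → [['prepare'], ['calc_a', 'calc_b'], ['aggregate']]
```

In a real deployment the batching would be driven by the workflow engine rather than computed up front, but the ordering constraint is the same.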
A high-throughput computing system based on container technology comprises a physical layer, a scheduling and running layer, a workflow engine layer, and an application layer, wherein:
the physical layer sits at the bottom and provides a unified resource pool comprising computing and storage resources drawn mainly from local physical resources, grid resources, and virtualized resources;
the scheduling and running layer sits above the physical layer and builds subtask jobs as job containers connected to the physical layer's resource pool; the definition of a workflow job is obtained from a workflow job description file, each workflow job consists of one or more subtask jobs, the subtask jobs are executed serially in order or in parallel, the dependencies between subtask jobs are defined by a directed graph, and each subtask is scheduled, dispatched, and run according to those dependencies;
the workflow engine layer sits above the scheduling and running layer and parses, dispatches, monitors, and manages the subtasks of workflow jobs;
the application layer sits above the workflow engine layer, encapsulates functions from the workflow engine layer and the scheduling and running layer, and provides users with a visual interface and a unified entry point to the system.
Further, the virtualized resources include container-instance services provided by public cloud vendors.
Further, the scheduling and running layer comprises the following two modules:
(1) a container scheduling module: monitors the configuration information of newly created jobs and then dispatches the job containers to the corresponding working modules according to a job scheduling policy;
(2) a container working module: runs the job containers on different resources.
Further, the job scheduling policy considers resource availability, resource load, and the job's affinity for particular resources.
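A minimal sketch of such a scheduling policy follows; it is an assumption-laden illustration, not the patent's actual algorithm, and the resource/job record fields (`available`, `load`, `prefers`) are names invented here. It skips unavailable resources, prefers the resource kind the job declares an affinity for, and breaks ties by lowest load.

```python
def choose_resource(job, resources):
    """Pick a resource for a job: filter by availability, prefer the
    job's declared affinity, then prefer the lowest load."""
    candidates = [r for r in resources if r["available"]]
    if not candidates:
        raise RuntimeError("no available resource for job %s" % job["name"])

    def score(r):
        # Affinity outranks load; lower load scores higher via negation.
        affinity_bonus = 1 if r["kind"] == job.get("prefers") else 0
        return (affinity_bonus, -r["load"])

    return max(candidates, key=score)["name"]

resources = [
    {"name": "local-0", "kind": "local", "available": True,  "load": 0.7},
    {"name": "grid-0",  "kind": "grid",  "available": True,  "load": 0.2},
    {"name": "cloud-0", "kind": "cloud", "available": False, "load": 0.0},
]
job = {"name": "relax-step", "prefers": "grid"}
print(choose_resource(job, resources))  # → grid-0
```

A production scheduler (e.g. a Kubernetes plug-in scheduler, as mentioned later in the description) would add many more predicates, but the filter-then-rank shape is the same.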
Further, the job containers include local job containers, grid job presentation containers, and on-cloud job containers. The local and on-cloud job containers interface with local physical resources and virtualized resources, respectively, through a unified interface provided by the container scheduling module; a grid job presentation container interfaces with the grid middleware and API through the same unified interface, runs the job in the grid environment, and acquires the job's state information in real time.
Further, a process inside the grid job presentation container acquires the state of the remote grid job in real time and presents the actual job state as the container's outward state. The grid job presentation container contains a toolkit and script code for interfacing with the grid resource environment; when the container starts, the script logs into the grid under a specified user identity and submits the job and uploads job files through the grid environment API. After the job is successfully created in the grid environment, the grid job presentation container continuously polls, through a running guardian process, to check the job's status in the remote environment and updates its own state accordingly.
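The guardian process of a grid job presentation container can be sketched as a polling loop that mirrors remote state. This is a simplified illustration under stated assumptions: the remote state names (`QUEUED`, `RUNNING`, ...) and the mapping to container states are hypothetical, and the real container would call a grid environment API rather than a supplied callable.

```python
import time

# Hypothetical mapping from remote grid-job states to the state the
# presentation container exposes outward.
STATE_MAP = {"QUEUED": "Pending", "RUNNING": "Running",
             "DONE": "Succeeded", "FAILED": "Failed"}
TERMINAL = {"Succeeded", "Failed"}

def present_grid_job(poll_remote_state, interval=0.0):
    """Guardian loop: poll the remote grid job, present its state as the
    container's outward state, and stop at a terminal state."""
    history = []
    while True:
        state = STATE_MAP[poll_remote_state()]
        if not history or history[-1] != state:
            history.append(state)  # the container's outward state changes
        if state in TERMINAL:
            return history
        time.sleep(interval)

# Simulated remote job: queued twice, then running, then done.
remote = iter(["QUEUED", "QUEUED", "RUNNING", "DONE"])
print(present_grid_job(lambda: next(remote)))
# → ['Pending', 'Running', 'Succeeded']
```

The key property, as in the description, is that the container never runs the computation itself; it only "presents" the remote job's state.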
Further, the container is a Docker container.
Further, the containers are built and run via the Kubernetes open-source platform.
Further, the workflow engine layer processes and controls workflow jobs using the workflow tool Argo.
Further, the workflow engine layer includes the following seven components:
(1) CLI: a command-line tool used to add, delete, modify, and query workflow jobs;
(2) a workflow controller: controls the execution of the workflow so that the scheduling and running layer creates containers and executes computations in flow order according to job states; the workflow controller interprets and splits the workflow description file into multiple subtask job configurations with an execution order; each subtask job configuration contains base image information, the execution command, resource requirements, and inputs and outputs, can be recognized by the scheduling and running layer, and results in a container with the corresponding computing content;
(3) a workflow queue: stores one or more pending workflow jobs; the workflow job at the head of the queue is processed first by the controller;
(4) a subtask queue: stores one or more subtask jobs to be scheduled; the job at the head of the queue is sent first by the controller to the scheduling and running layer;
(5) a log and monitoring module: collects logs, monitors the execution state of job containers, and reports to the controller;
(6) a garbage collection module: marks, archives, or deletes job containers and related files whose jobs have finished, failed, or been terminated;
(7) a client API: a programming interface exposing control and operations externally.
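The controller's splitting step in component (2) can be sketched as follows. The description format (a dictionary with a `steps` list) and the step names are assumptions made for illustration; the patent's actual description file format is not specified here.

```python
def split_workflow(description):
    """Split a workflow description into per-subtask job configurations
    that the scheduling and running layer can turn into containers."""
    configs = []
    for step in description["steps"]:
        configs.append({
            "name":      step["name"],
            "image":     step["image"],               # base image info
            "command":   step["command"],             # execution command
            "resources": step.get("resources", {"cpu": 1}),
            "inputs":    step.get("inputs", []),
            "outputs":   step.get("outputs", []),
            "depends":   step.get("depends", []),     # execution order
        })
    return configs

# Hypothetical two-step workflow: generate structures, then compute.
description = {
    "name": "ht-screening",
    "steps": [
        {"name": "gen", "image": "tools:latest", "command": ["gen.sh"],
         "outputs": ["structs/"]},
        {"name": "calc", "image": "vasp-like:1.0", "command": ["run.sh"],
         "inputs": ["structs/"], "depends": ["gen"],
         "resources": {"cpu": 8}},
    ],
}
configs = split_workflow(description)
print([c["name"] for c in configs])  # → ['gen', 'calc']
print(configs[1]["depends"])         # → ['gen']
```

Each resulting configuration carries exactly the fields the text lists (image, command, resources, inputs/outputs), so the scheduling layer can create a container from it without consulting the original workflow file.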
Furthermore, a workflow job is defined by a workflow description file; the application layer defines each subtask step of the workflow and their interrelations according to the specific application, and the flow supports parameter passing, conditional judgment, and recursive invocation.
Furthermore, the workflow engine layer sends monitoring requests to the scheduling and running layer, controls flow execution according to the job state when a job's execution state changes, and receives error log information submitted by the scheduling and running layer.
Further, for the log and monitoring module, log collection obtains the running metrics of workflow jobs from the scheduling and running layer, provides centralized management of log information, and offers viewing, saving, and deletion of the log information of all workflow jobs; monitoring covers the execution state of job containers, so that the controller can promptly rerun abnormal jobs.
Further, the application layer includes the following three parts:
(1) Web back end: provides easy-to-understand and easy-to-use APIs for the front-end interface, and uses the client API of the workflow engine layer to interface with and operate it;
(2) Web interface: provides the system's user interface, offering user login and logout, submission, termination, and deletion of a user's workflows, viewing workflow progress, and the workflow list, and displays and explains results visually;
(3) an application database: used to store user data and application-specific data.
Furthermore, the Web back end is written in Python with the Flask framework; for data interaction it connects through the Python PyMODM library, reading data from the database or storing changed data into it. The Web interface adopts the Vue framework and uses the Vue-auth authentication library for user authentication and authorization.
Further, the application database uses the MongoDB database.
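To keep this sketch of the Web back end self-contained, it uses a plain WSGI callable from the standard library rather than Flask; the route path `/api/workflows` and the in-memory `WORKFLOWS` store are invented for illustration and are not part of the patent. A Flask version would expose the same route with `@app.route`.

```python
import json

def app(environ, start_response):
    """Minimal WSGI stand-in for the Flask-based Web back end: routes a
    front-end data request and returns JSON."""
    path = environ["PATH_INFO"]
    if environ["REQUEST_METHOD"] == "GET" and path == "/api/workflows":
        body = json.dumps({"workflows": list(WORKFLOWS)}).encode()
        start_response("200 OK", [("Content-Type", "application/json")])
        return [body]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]

# Hypothetical application data the back end would read from MongoDB.
WORKFLOWS = {"wf-001": {"status": "Running"}}

# Exercise the WSGI callable directly, without starting a server.
captured = {}
def start_response(status, headers):
    captured["status"] = status

resp = b"".join(app({"REQUEST_METHOD": "GET",
                     "PATH_INFO": "/api/workflows"}, start_response))
print(captured["status"], resp.decode())
# → 200 OK {"workflows": ["wf-001"]}
```

The front end would call this endpoint via Ajax, matching the front/back-end separation described below.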
Furthermore, a remote image repository uniformly stores the application images packaged by users, facilitating centralized management and distribution.
Further, the system adopts a layered design based on the browser/server model.
The high-throughput computing system based on container technology provided by the embodiments of the invention can interface with three resource forms (local physical resources, grid resources, and virtualized resources) and, using containers and related technologies, effectively run and monitor high-throughput workflow jobs through the workflow engine, shielding computations from environmental factors and improving the usability and flexibility of the computing system.
Drawings
The accompanying drawings, which provide a further understanding of the invention and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention; they do not limit the invention.
FIG. 1 is a design architecture diagram of a high throughput computing system based on container technology in accordance with an embodiment of the present invention;
FIG. 2 is a schematic flow chart of a method for computing high throughput workflow according to an embodiment of the present invention.
Detailed Description
To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the embodiments and the accompanying drawings.
A first aspect of the invention is to define the overall flow of a computing task in a high-throughput job in the form of a workflow.
A workflow job is made up of one or more subtask jobs.
The subtask jobs can be executed serially in sequential order or in parallel, and their dependencies can be defined by a directed graph.
The system obtains the definition of a workflow job from the workflow job description file, schedules each subtask step according to the dependencies, and uniformly manages the running state of the job and its subtasks.
A second aspect of the invention is to build and run the subtasks of a workflow job on the basis of containers.
The system creates a container image encapsulating the running environment of each subtask and schedules jobs in units of containers, enabling flexible deployment of the diverse environments required by different subtasks and solving the problem that computing results are hard to reproduce.
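The reproducibility argument above rests on pinning the whole run-time stack in the image, not just the source code. As a hedged illustration (the base image, package names, and version numbers are examples chosen here, not from the patent), a per-subtask Dockerfile could be generated like this:

```python
def subtask_image(base, packages, entry):
    """Emit a Dockerfile that pins the subtask's full run-time
    environment, so rebuilding the image reproduces the computation."""
    lines = ["FROM %s" % base]
    if packages:
        # Pin exact versions: reproducibility depends on the whole
        # stack, not just source-code consistency.
        lines.append("RUN pip install --no-cache-dir " +
                     " ".join("%s==%s" % pv
                              for pv in sorted(packages.items())))
    lines.append('ENTRYPOINT ["%s"]' % entry)
    return "\n".join(lines)

print(subtask_image("python:3.9-slim",
                    {"numpy": "1.21.6", "ase": "3.22.1"},
                    "run_step.py"))
```

Scheduling in units of such images is what lets two runs of the same subtask, possibly months apart, see an identical environment.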
A third aspect of the invention is to interface with three different forms of physical resources (local physical resources, grid resources, and virtualized cloud resources) through polymorphic presentation containers. The system provides one class of presentation container for each physical resource; downward, these interface with the different forms of computing resources through polymorphic interfaces, while upward they expose a uniform management and operation interface supporting unified scheduling and management, thereby hiding the difficulty caused by the diversity of the underlying resources.
This embodiment provides a high-throughput computing system based on container technology. Referring to FIG. 1, the design architecture diagram, the architecture is based on the browser/server model and adopts a layered design. The system specifically comprises:
a physical layer, a scheduling and running layer, a workflow engine layer, and an application layer. Through these four layers, the system forms a high-throughput computing system that interfaces with three resource forms (local physical resources, grid resources, and virtualized resources) and can submit, run, and manage high-throughput workflow computing jobs.
The physical layer sits at the bottom of the system and provides computing and storage resources for the computing system. Specifically, it can include three resource forms (local physical resources, grid resources, and virtualized resources), forming a flexibly selectable resource pool from which the scheduling layer builds and runs high-throughput computing jobs. Optionally, the virtualized resources may be Container Instance services provided by public cloud vendors, whose advantage is that a container can be run simply by specifying an image, without managing the underlying servers and paying only for the resources actually consumed while the container runs.
The scheduling and running layer sits above the physical layer and is used to schedule and run high-throughput computing jobs, dispatching and uniformly managing them through a scheduling policy. Jobs are built and run as containers under the management of a container orchestration framework. Optionally, the container may be a Docker container; Docker is a mainstream open-source container technology implemented in Go that provides an efficient, agile, and lightweight container solution. Optionally, the orchestration technology may be Kubernetes, an open-source platform for automated deployment, scaling, and operation of container clusters that provides a complete open-source solution for container orchestration and management. Its initial design was service-centric; since version 1.2, Kubernetes has supported the Job type, i.e., batch-processing tasks. Because its scheduling framework is pluggable, users can customize scheduling policies to their own needs, which greatly improves extensibility for different task-scheduling requirements. This layer contains two modules:
(1) a container scheduling module: when this module detects the configuration information of a newly created job through its interface, it binds the job to the appropriate working module according to scheduling policies such as resource availability, resource load, and the job's affinity for particular resources, and notifies the container working module to take over the subsequent work;
(2) a container working module: actually runs the job container on different resources. In general, before the formal container is started, some necessary preparation such as data initialization is performed; the container image is then pulled and the job container started, until the container run finishes, is terminated, or exits with a failure. Illustratively, job containers are divided into three types: local job containers, grid job presentation containers, and on-cloud job containers. A grid job presentation container connects to the grid middleware and API through the unified interface of the container scheduling module, runs the job in the grid environment and acquires its state information in real time; through containerized packaging, its outward behavior is essentially consistent with the other two kinds of job container. The grid job presentation container does not actually run the computing task: the process inside it acquires the state of the remote grid job in real time and "presents" the actual job state as the container's outward state. The container contains a toolkit and script code for interfacing with the grid resource environment; when the container starts, the script logs into the grid under a specified user identity and submits the job and uploads job files through the grid environment API. After the job is successfully created in the grid environment, the monitoring process running in the container continuously polls to check the job's state in the remote environment and updates its own state accordingly until it exits.
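The container working module's life cycle (prepare, pull image, run, report a final state) can be sketched as below. The hook functions are stubs standing in for real data staging, registry, and container-runtime calls; their names and the state strings are assumptions for illustration.

```python
def run_job_container(job, prepare, pull_image, run):
    """Life cycle of the container working module: initialize data,
    pull the image, run the container, and report the final state."""
    try:
        prepare(job)                 # e.g. stage input data
        pull_image(job["image"])     # fetch the container image
        exit_code = run(job)         # run until exit or termination
        return "Succeeded" if exit_code == 0 else "Failed"
    except Exception:
        return "Failed"

# Stub hooks that just record what would happen.
log = []
state = run_job_container(
    {"name": "calc", "image": "vasp-like:1.0"},
    prepare=lambda j: log.append("staged %s" % j["name"]),
    pull_image=lambda img: log.append("pulled %s" % img),
    run=lambda j: 0,
)
print(state, log)
# → Succeeded ['staged calc', 'pulled vasp-like:1.0']
```

Local, grid-presentation, and on-cloud containers would plug different implementations of these hooks into the same life cycle, which is exactly what makes their outward behavior consistent.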
The workflow engine layer is used to process and control workflow jobs. Optionally, the workflow technology may be Argo; Argo is a workflow tool implemented on top of Kubernetes Custom Resource Definitions (CRDs), realizing workflow control and task running on Kubernetes' scheduling capability. This layer includes the following six components:
(1) CLI: this component, by encapsulating functions from the scheduling and running layer, completes addition, deletion, modification, and query of workflows through a command-line tool;
(2) a workflow controller: controls the workflow. It interprets and splits the workflow description file into multiple job configurations with an execution order, so that the scheduling and running layer can create containers and execute jobs according to those configurations; it sends monitoring requests to the scheduling and running layer, controls flow execution according to the job state when a job's execution state changes, and receives error log information submitted by the scheduling and running layer. The workflow description file can define workflow jobs in various flow forms such as DAGs; the flow supports parameter passing, conditional judgment, and recursive invocation. A job configuration generally contains base image information, the execution command, resource requirements, and inputs and outputs; the scheduling and running layer can recognize the configuration and create a container with the corresponding computing content;
(3) a workflow queue: stores one or more pending workflow descriptions; workflows submitted by users are appended in time order, and the description at the head of the queue is processed first by the controller;
(4) a job queue: stores one or more job configurations to be scheduled; jobs split out of a workflow are appended to the job queue according to the flow order and the completion state of preceding jobs, and the job at the head of the queue is sent first by the controller to the scheduling and running layer;
(5) a log and monitoring module: collects logs and monitors the execution state of job containers, reporting the information to the controller. Log collection obtains the running metrics of workflow jobs from the scheduling and running layer and provides centralized management of log information, with viewing, saving, and deletion of the log information of all workflow jobs. Monitoring mainly covers the execution state of job containers, so that the controller can promptly rerun abnormal jobs;
(6) a garbage collection module: marks, archives, or deletes job containers and related files whose jobs have finished, failed, or been terminated.
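The log and monitoring component can be sketched as a small collector that keeps per-job logs and flags abnormal containers for the controller to rerun. The class and method names here are invented for illustration; the actual module would pull metrics from the scheduling and running layer rather than receive them directly.

```python
class Monitor:
    """Collect per-job logs and flag abnormal job containers so the
    controller can rerun them."""

    def __init__(self):
        self.logs = {}

    def collect(self, job, line):
        # Centralized log management: all workflow jobs' logs in one place.
        self.logs.setdefault(job, []).append(line)

    def abnormal(self, states):
        # Any job whose container ended in error is a rerun candidate.
        return [job for job, s in states.items() if s == "Failed"]

mon = Monitor()
mon.collect("calc", "step 1 done")
mon.collect("calc", "step 2 done")
print(mon.abnormal({"gen": "Succeeded", "calc": "Failed"}))  # → ['calc']
print(len(mon.logs["calc"]))                                 # → 2
```

Centralizing logs this way is what enables the viewing, saving, and deletion functions the description attributes to this module.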
The application layer encapsulates functions from the workflow engine layer and the scheduling and running layer and provides an easy-to-use interface on which users can submit, view, terminate, and delete workflows. This layer includes the following three parts:
(1) Web back end: provides easy-to-understand API interfaces, handles data requests sent from the front end, uses the client library of the workflow engine layer to interface with and operate it, and further filters data according to specific needs. Optionally, the back-end language may be Python with the Flask framework; when interaction with the database is required, the Python PyMODM library is used for data interaction, reading data from the database or storing changed data into it.
(2) Web interface: provides users with a friendly interface to the system, offering user login and logout, submission, termination, and deletion of a user's workflows, viewing workflow progress, the workflow list, and so on. It is responsible for the front-end interface and interaction logic; the front and back ends are separated, and pages are rendered from the API provided by the back end. Optionally, the front-end framework may be Vue, the Vue-auth authentication library may be used for user authentication and authorization, and the front end may interact with the back end by calling its API via Ajax.
(3) An application database: stores user data and the data of high-throughput computing workflows, providing data support for the Web application. Optionally, the database may be MongoDB, a database based on distributed file storage that aims to provide a scalable, high-performance data-storage solution for Web applications.
The system also includes a remote image repository for uniformly storing packaged application images, facilitating centralized management and distribution.
FIG. 2 is a schematic flow chart of the high-throughput workflow computing method, in which 201 to 206 are the corresponding steps.
The high-throughput computing system based on the container technology provided by the embodiment enables a user to conveniently submit a high-throughput computing workflow, meets the requirements of different computations, and provides computing support for related research works.
The above description covers only specific embodiments of the present invention and is not intended to limit it. Any omission, modification, substitution, or improvement made within the spirit and principles of the invention is intended to be included within its scope.

Claims (8)

1. A high-throughput computing method based on container technology is characterized by comprising the following steps:
defining workflow jobs through a workflow description file, wherein each workflow job consists of one or more subtask jobs, the subtask jobs are executed serially or in parallel according to the defined order, and the dependency relationships between the subtask jobs are defined through a directed graph;
constructing each subtask job as a job container and connecting it to a resource pool, wherein the resource pool comprises computing and storage resources drawn mainly from local physical resources, grid resources, and virtualized resources; the job containers comprise local job containers, grid job representation containers, and on-cloud job containers, wherein the local and on-cloud job containers interface with local physical resources and virtualized resources respectively through a unified interface provided by a container scheduling module, and the grid job representation container interfaces with grid middleware and its API through the same unified interface, running jobs in the grid environment and obtaining their state information in real time; a process inside the grid job representation container acquires the state of the remote grid job in real time and presents that actual job state as the container's external state; the grid job representation container comprises a toolkit and script code for interfacing with the grid resource environment; when the container starts, the script code logs into the grid under a specified user identity, submits the job, and uploads job files through the grid environment API; after the job is successfully created in the grid environment, the grid job representation container continuously polls the job's state in the remote environment through a running monitoring process and updates the job state accordingly;
and scheduling, dispatching, running, monitoring, and managing each subtask job according to the dependency relationships.
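The directed-graph dependency model above implies a standard topological ordering: subtask jobs with no unmet dependencies may run in parallel, and successive groups run serially. The following is a minimal sketch of such grouping (not the patent's implementation; the `deps` mapping format is an illustrative assumption):

```python
from collections import deque

def topological_batches(deps):
    """Group subtask jobs into batches: jobs within a batch have no unmet
    dependencies and may run in parallel; batches execute serially.
    `deps` maps each job name to the set of jobs it depends on."""
    indegree = {job: len(d) for job, d in deps.items()}
    dependents = {job: [] for job in deps}
    for job, d in deps.items():
        for parent in d:
            dependents[parent].append(job)
    ready = deque(job for job, n in indegree.items() if n == 0)
    batches = []
    while ready:
        batch = sorted(ready)   # one batch of mutually independent jobs
        ready.clear()
        batches.append(batch)
        for job in batch:
            for child in dependents[job]:
                indegree[child] -= 1
                if indegree[child] == 0:
                    ready.append(child)
    if sum(len(b) for b in batches) != len(deps):
        raise ValueError("dependency graph contains a cycle")
    return batches
```

For example, a workflow where two simulations depend on a preparation step and a collection step depends on both yields three serial batches, with the two simulations running in parallel in the middle batch.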
2. A high-throughput computing system based on container technology, characterized by comprising a physical layer, a scheduling and running layer, a workflow engine layer, and an application layer, wherein:
the physical layer is the bottom layer and provides a unified resource pool comprising computing and storage resources drawn mainly from local physical resources, grid resources, and virtualized resources;
the scheduling and running layer sits above the physical layer and constructs subtask jobs as job containers connected to the physical layer's resource pool; workflow job definitions are obtained from a workflow job description file, wherein each workflow job consists of one or more subtask jobs, the subtask jobs are executed serially or in parallel according to the defined order, the dependency relationships between subtask jobs are defined through a directed graph, and each subtask is scheduled, dispatched, and run according to those dependencies; the job containers comprise local job containers, grid job representation containers, and on-cloud job containers, wherein the local and on-cloud job containers interface with local physical resources and virtualized resources respectively through a unified interface provided by a container scheduling module, and the grid job representation container interfaces with grid middleware and its API through the same unified interface, running jobs in the grid environment and obtaining their state information in real time; a process inside the grid job representation container acquires the state of the remote grid job in real time and presents that actual job state as the container's external state; the grid job representation container comprises a toolkit and script code for interfacing with the grid resource environment; when the container starts, the script code logs into the grid under a specified user identity, submits the job, and uploads job files through the grid environment API; after the job is successfully created in the grid environment, the grid job representation container continuously polls the job's state in the remote environment through a running monitoring process and updates the job state accordingly;
the workflow engine layer sits above the scheduling and running layer and parses, dispatches, monitors, and manages the subtask jobs of workflow jobs;
the application layer sits above the workflow engine layer, encapsulates the functions of the workflow engine layer and the scheduling and running layer, and provides users with a visual interface and a unified entry point to the system.
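The grid job representation container described above acts as a local proxy whose externally visible state mirrors the remote grid job: log in, submit, then poll until a terminal state. A minimal sketch of that lifecycle follows; the `grid_api` object and its `login`/`submit`/`status` methods are illustrative assumptions standing in for whatever grid middleware API the deployment uses:

```python
import time

class GridJobProxy:
    """Presents a remote grid job's actual state as the container's own
    external state, per the representation-container pattern."""
    TERMINAL = {"COMPLETED", "FAILED", "CANCELLED"}

    def __init__(self, grid_api, user, job_spec):
        self.grid_api = grid_api
        self.grid_api.login(user)                     # log in under the specified user identity
        self.job_id = self.grid_api.submit(job_spec)  # submit job / upload job files
        self.state = "SUBMITTED"

    def poll(self, interval=5.0, sleep_fn=time.sleep):
        """Continuously poll the remote job and update the mirrored state
        until the job reaches a terminal state."""
        while self.state not in self.TERMINAL:
            self.state = self.grid_api.status(self.job_id)
            if self.state in self.TERMINAL:
                break
            sleep_fn(interval)
        return self.state
```

In a real container, the monitoring process would run this loop for the container's lifetime, so that external schedulers see the grid job's state simply by inspecting the container.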
3. The system of claim 2, wherein the scheduling and running layer comprises the following two modules:
(1) a container scheduling module: upon detecting newly created job configuration information, it assigns the job container to the corresponding worker module according to a job scheduling policy, the policy considering resource availability, resource load, and the job's affinity for particular resources;
(2) a container worker module: for running job containers on the different resources.
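The three policy criteria in claim 3 (availability, load, affinity) can be combined as a simple score-and-pick scheduler. The sketch below is one plausible reading, not the patent's algorithm; the resource and job field names are assumptions for illustration:

```python
def score_resource(job, resource):
    """Score a candidate resource for a job container: availability is a
    hard filter; lower load and a matching affinity raise the score."""
    if not resource["available"]:
        return None                              # unavailable resources are filtered out
    score = 1.0 - resource["load"]               # prefer lightly loaded resources
    if job.get("affinity") == resource["kind"]:  # job's stated preference (local/grid/cloud)
        score += 1.0
    return score

def schedule(job, resources):
    """Pick the best-scoring resource for the job, or None if nothing is available."""
    scored = [(score_resource(job, r), r) for r in resources]
    scored = [(s, r) for s, r in scored if s is not None]
    return max(scored, key=lambda sr: sr[0])[1] if scored else None
```

Under this weighting, a job that declares a grid affinity is routed to a grid resource even when a local resource is less loaded, while a job with no affinity simply lands on the least-loaded available resource.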
4. The system of claim 2, wherein the containers are Docker containers constructed and operated via the Kubernetes open-source platform.
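With Kubernetes as the container platform, each subtask job configuration (image, command, resource requirements) maps naturally onto a one-shot Pod. A minimal manifest builder is sketched below; the label and naming conventions are assumptions, and a real scheduler would submit the resulting manifest through the Kubernetes API or a client library rather than just build the dict:

```python
def job_pod_manifest(name, image, command, cpu="1", memory="1Gi"):
    """Build a minimal Kubernetes Pod manifest for a one-shot job container."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"app": "htc-job"}},
        "spec": {
            "restartPolicy": "Never",  # batch semantics: run once, do not restart
            "containers": [{
                "name": name,
                "image": image,
                "command": command,
                "resources": {"requests": {"cpu": cpu, "memory": memory}},
            }],
        },
    }
```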
5. The system of claim 2, wherein the workflow engine layer processes and controls workflow jobs using the workflow tool Argo, the workflow engine layer comprising:
(1) a CLI: a command-line tool for creating, deleting, modifying, and viewing workflow jobs;
(2) a workflow controller: controls the execution of workflow processes so that the scheduling and running layer creates containers and executes computations in process order according to job state; by interpreting and splitting the workflow description file, the controller produces a set of subtask job configurations with an execution order; each subtask job configuration comprises base image information, an execution command, resource requirements, and inputs and outputs, can be recognized by the scheduling and running layer, and yields a container carrying the corresponding computational content;
(3) a workflow queue: stores one or more pending workflow jobs; the workflow job at the head of the queue is processed first by the controller;
(4) a subtask queue: stores one or more subtask jobs awaiting scheduling; the job at the head of the queue is sent first by the controller to the scheduling and running layer;
(5) a log and monitoring module: collects logs, monitors the execution state of job containers, and reports to the controller; log collection obtains the runtime metrics of workflow jobs from the scheduling and running layer, provides centralized management of log information, and supports viewing, storing, and deleting the logs of all workflow jobs; monitoring covers the execution state of job containers so that the controller can promptly re-run abnormal jobs;
(6) a garbage collection module: marks, archives, or deletes job containers and related files whose jobs have completed, failed, or terminated;
(7) a client API: a programming interface exposing control and operation functions externally.
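The controller's splitting step in item (2) above can be sketched as a function that turns a parsed workflow description into an ordered list of subtask job configurations, each carrying the image, command, resource requirements, and I/O the scheduling and running layer needs. The description schema below is an illustrative assumption, not Argo's actual CRD format:

```python
def split_workflow(description):
    """Split a parsed workflow description into per-subtask job configs,
    ordered so every job appears after the jobs it depends on."""
    tasks = {t["name"]: t for t in description["tasks"]}
    done, ordered = set(), []
    while len(ordered) < len(tasks):
        progress = False
        for name, t in tasks.items():
            if name in done:
                continue
            if set(t.get("depends", [])) <= done:  # all dependencies satisfied
                ordered.append({
                    "name": name,
                    "image": t["image"],              # base image information
                    "command": t["command"],          # execution command
                    "resources": t.get("resources", {}),
                    "inputs": t.get("inputs", []),
                    "outputs": t.get("outputs", []),
                })
                done.add(name)
                progress = True
        if not progress:
            raise ValueError("cyclic or unsatisfiable dependencies")
    return ordered
```

The resulting configs would then be pushed onto the subtask queue of item (4) for the scheduling and running layer to consume in order.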
6. The system of claim 2, wherein the application layer comprises:
(1) a Web backend: provides an easy-to-understand, easy-to-use API for the front-end interface and interfaces with the workflow engine layer through that layer's client API;
(2) a Web interface: provides the system's user interface, including user login and logout, submission, termination, and deletion of a user's workflows, viewing workflow progress and workflow lists, and presenting and explaining results visually;
(3) an application database: stores user data and application-specific data.
7. The system of claim 6, wherein the Web backend is implemented in Python using the Flask framework and performs data interaction through the Python PyMODM library, reading data from the database and storing modified data back into it; the Web interface uses the Vue framework and implements user authentication and authorization with the Vue-auth library; the application database is MongoDB.
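The Web backend's core job, per claims 6 and 7, is to validate a user's workflow submission, persist it, and hand it to the workflow engine's client API. The sketch below shows that handler logic framework-agnostically (the patent specifies Flask and PyMODM; here `store` and `engine` are assumed stand-ins for the MongoDB collection and the engine's client API, so the logic stays testable without either service):

```python
import json

def submit_workflow_handler(request_body, store, engine):
    """Validate a workflow-submission request, persist it, and dispatch it
    to the workflow engine via its client API."""
    try:
        payload = json.loads(request_body)
    except json.JSONDecodeError:
        return 400, {"error": "request body is not valid JSON"}
    if "name" not in payload or "tasks" not in payload:
        return 400, {"error": "missing 'name' or 'tasks'"}
    workflow_id = store.insert(payload)   # persist the user's submission
    engine.submit(workflow_id, payload)   # dispatch through the client API
    return 201, {"id": workflow_id, "status": "submitted"}
```

In a Flask deployment, a route would wrap this handler, translating the returned status code and dict into an HTTP response.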
8. The system of claim 2, further comprising a remote image repository for uniformly storing packaged application images for centralized management and distribution.
CN202010523599.2A 2020-06-10 2020-06-10 High-throughput computing method and system based on container technology Active CN111897622B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010523599.2A CN111897622B (en) 2020-06-10 2020-06-10 High-throughput computing method and system based on container technology


Publications (2)

Publication Number Publication Date
CN111897622A CN111897622A (en) 2020-11-06
CN111897622B true CN111897622B (en) 2022-09-30

Family

ID=73206653

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010523599.2A Active CN111897622B (en) 2020-06-10 2020-06-10 High-throughput computing method and system based on container technology

Country Status (1)

Country Link
CN (1) CN111897622B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112882810B (en) * 2021-02-25 2023-02-07 国家超级计算天津中心 High-throughput task processing method based on supercomputer
CN112698924A (en) * 2021-03-23 2021-04-23 杭州太美星程医药科技有限公司 Clinical test electronic data acquisition system and operation method thereof
CN113110923B (en) * 2021-03-25 2023-10-20 南京飞灵智能科技有限公司 Use method and device of workflow engine based on k8s
CN113225269B (en) * 2021-04-16 2022-11-22 鹏城实验室 Container-based workflow scheduling method, device and system and storage medium
CN113326123B (en) * 2021-04-30 2024-03-26 杭州绳武科技有限公司 Biological information analysis and calculation system and method based on container technology
CN113190328A (en) * 2021-05-22 2021-07-30 北京理工大学 System identification-oriented containerized cloud workflow processing system and method
CN113535326B (en) * 2021-07-09 2024-04-12 粤港澳大湾区精准医学研究院(广州) Calculation flow scheduling system based on high-throughput sequencing data
CN114064083A (en) * 2021-11-22 2022-02-18 江苏安超云软件有限公司 Method for deploying cloud native application through self-defined template in configuration center and application
WO2023102869A1 (en) * 2021-12-10 2023-06-15 上海智药科技有限公司 Task management system, method and apparatus, device, and storage medium
CN114327834A (en) * 2021-12-31 2022-04-12 中国第一汽车股份有限公司 Multi-concurrent data processing method and device
CN115147031B (en) * 2022-09-07 2022-12-06 深圳华锐分布式技术股份有限公司 Clearing workflow execution method, device, equipment and medium
CN117112184B (en) * 2023-10-23 2024-02-02 深圳市魔数智擎人工智能有限公司 Task scheduling service method and system based on container technology

Citations (3)

Publication number Priority date Publication date Assignee Title
CN109376017A (en) * 2019-01-07 2019-02-22 人和未来生物科技(长沙)有限公司 Cloud computing platform task processing method, system and its application method based on container
CN110389823A (en) * 2018-04-19 2019-10-29 广东石油化工学院 It is a kind of based on virtualization container technique cloud computing environment under workflow task dispatching method
CN111045791A (en) * 2019-12-16 2020-04-21 武汉智领云科技有限公司 Big data containerization central scheduling system and method

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US11055135B2 (en) * 2017-06-02 2021-07-06 Seven Bridges Genomics, Inc. Systems and methods for scheduling jobs from computational workflows


Non-Patent Citations (3)

Title
Skyport: container-based execution environment management for multi-cloud scientific workflows;W Gerlach等;《2014 5th International Workshop on Data-Intensive Computing in the Clouds》;20141121;第25-31页 *
A containerized workflow framework supporting elastic scaling in cloud environments; Liu Biao et al.; Computer Engineering; 20190331; Vol. 45, No. 3; pp. 7-13 *


Similar Documents

Publication Publication Date Title
CN111897622B (en) High-throughput computing method and system based on container technology
US11675620B2 (en) Methods and apparatus to automate deployments of software defined data centers based on automation plan and user-provided parameter values
US20200133651A1 (en) Release automation service in software development tools
US20100262558A1 (en) Incorporating Development Tools In System For Deploying Computer Based Process On Shared Infrastructure
US11301262B2 (en) Policy enabled application-release-management subsystem
US9513874B2 (en) Enterprise computing platform with support for editing documents via logical views
US20220197249A1 (en) Dynamic Cloud Deployment of Robotic Process Automation (RPA) Robots
US20220391225A1 (en) Web-based robotic process automation designer systems and automations for virtual machines, sessions, and containers
CN112256406B (en) Operation flow platformization scheduling method
US11650810B1 (en) Annotation based automated containerization
KR102446568B1 (en) Robotic Process Automation Running in Session 2 Automation of Process Running in Session 1 via Robot
Zhao et al. Realizing fast, scalable and reliable scientific computations in grid environments
JP2023070148A (en) Systems and methods for dynamically binding robotic process automation (RPA) robots to resources
US10452371B2 (en) Automating enablement state inputs to workflows in z/OSMF
WO2022109932A1 (en) Multi-task submission system based on slurm computing platform
Shaffer et al. Lightweight function monitors for fine-grained management in large scale Python applications
KR20220007496A (en) A robot running in a second session of a process running in the first session Automation through a robot
EP4124946A1 (en) Optimized software delivery to airgapped robotic process automation (rpa) hosts
US11762676B2 (en) Optimized software delivery to airgapped robotic process automation (RPA) hosts
US11971705B2 (en) Autoscaling strategies for robotic process automation
JP2023159886A (en) System, apparatus and method for deploying robotic process automation across multiple operating systems
CN115964030A (en) Application development system and application development method
CN117806654A (en) Tekton-based custom cloud native DevOps pipeline system and method
Doninger et al. SAS® Grid 101: How It Can Modernize Your Existing SAS® Environment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant