CN113886023A - Batch dispatching system and method based on distributed architecture and containerization - Google Patents

Batch dispatching system and method based on distributed architecture and containerization Download PDF

Info

Publication number
CN113886023A
CN113886023A (application CN202111230495.3A)
Authority
CN
China
Prior art keywords
batch
task
container
execution
state
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111230495.3A
Other languages
Chinese (zh)
Inventor
刘志
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Suning Bank Co Ltd
Original Assignee
Jiangsu Suning Bank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Suning Bank Co Ltd filed Critical Jiangsu Suning Bank Co Ltd
Priority to CN202111230495.3A priority Critical patent/CN113886023A/en
Publication of CN113886023A publication Critical patent/CN113886023A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • G06F9/5066Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • G06F9/5077Logical partitioning of resources; Management or configuration of virtualized resources
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083Techniques for rebalancing the load in a distributed system
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • G06F2009/45562Creating, deleting, cloning virtual machine instances
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • G06F2009/4557Distribution of virtual machine instances; Migration and load balancing

Abstract

The invention provides a batch scheduling system and method based on a distributed architecture and containerization. The system comprises a container management platform, a scheduling platform, an image pool and a resource pool; the image pool contains a number of batch images generated by packaging the service subsystems, and the resource pool contains a number of batch containers created from those batch images. The container management platform generates the corresponding batch container according to a task execution instruction and destroys batch containers according to the batch task execution state. The scheduling platform configures the batch task information of the service systems, establishes the mapping between batch tasks, batch images and execution systems, sends task execution instructions to the container management platform, tracks the batch task execution state and feeds it back to the container management platform. The invention enables scheduling control of enterprise-level batch tasks and centralized use of resources, and addresses the problems of concentrated batch windows and excessive server resource consumption during batch execution.

Description

Batch dispatching system and method based on distributed architecture and containerization
Technical Field
The invention relates to the technical field of data processing, and in particular to a containerized batch scheduling system and method based on a distributed architecture.
Background
With the rapid development of financial business, banking systems face continuous pressure from internet finance. In this environment, bank systems and business development must advance rapidly: the traditional architecture of the banking system has to be upgraded, and the bank's IT architecture and application software architecture restructured, to build a new-generation business system that runs efficiently, provides sustained long-term service and can be operated professionally. Banking system construction therefore needs an architectural breakthrough, adopting a system architecture that eases horizontal resource expansion and offers high performance, high cohesion and low coupling, in order to solve the problems facing core banking systems in the new era.
The high performance and stability demanded by fast-growing banking business are achieved through a completely new application and IT architecture. Batch processing in the business systems must take less time and run more efficiently. Cross-system batch requirements include seamless hand-over, synchronous execution and interleaved execution, and business-system batches have evolved from the traditional once-a-day and end-of-day runs to high-frequency, multi-batch execution during the day. Architecturally, batch execution has shifted from centralized to distributed: with the adoption of distributed architectures, systems have moved from single high-specification machines to parallel computing on many low-specification machines, and batch processing has moved from traditional centralized runs to splitting data into shards, executing the shards in parallel and merging the results. This improves batch efficiency, turns the formerly time-consuming single-task critical path into parallel processing of critical-path shards, and greatly shortens batch processing time.
In practice, however, combining batch scheduling with containers raises many problems: for example, the scheduling system cannot monitor the task state inside a container, and cannot flexibly control the container's lifetime, rescheduling and so on according to the task it carries.
Disclosure of Invention
In view of the above problems, the present invention provides a batch scheduling system and method based on a distributed architecture and containerization, which establishes an effective link between the scheduling platform, the container management platform and the container execution instances, and solves the problem that the scheduling state becomes uncontrollable once a container execution instance has been pulled up.
In order to solve the above technical problems, the invention adopts the following technical scheme. A batch scheduling system based on a distributed architecture and containerization comprises a container management platform, a scheduling platform, an image pool and a resource pool; the image pool contains a number of batch images generated by packaging the service subsystems, and the resource pool contains a number of batch containers created from those batch images. The container management platform generates the corresponding batch container according to a task execution instruction and destroys batch containers according to the batch task execution state. The scheduling platform configures the batch task information of the service systems, establishes the mapping between batch tasks, batch images and execution systems, sends task execution instructions to the container management platform, tracks the batch task execution state and feeds it back to the container management platform.
Preferably, the scheduling platform comprises an instance management center, an allocation controller, an instruction generator, a state collector and a lifecycle manager. The instance management center generates task instances and sets the task instance state, the task timeout and the task parallelism. The allocation controller checks the load of the resource pool, sets the task execution template information and allocates batch tasks to task instances. The instruction generator generates task execution instructions, each comprising batch image pull information and batch container information. The state collector receives the task execution states reported by the batch sub-applications, which are pulled up by the batch containers. The lifecycle manager receives and checks the execution state of the batch tasks under each batch image and sends the check result to the instance management center.
As a preferred scheme, establishing the mapping between batch tasks, batch images and execution systems comprises: binding each deployable batch image ID to the execution system ID and the batch task ID on the scheduling platform.
The invention also provides a batch scheduling method based on a distributed architecture and containerization, comprising the following steps. S1: load the batch task configuration of the service systems into the scheduling platform and establish the mapping between batch tasks, batch images and execution systems. S2: start a batch task through the scheduling platform and, according to the configuration, call the container management platform and send it a task execution instruction. S3: the container management platform pulls the batch image from the image pool according to the task execution instruction and creates the corresponding batch container. S4: after the batch container starts, pull up the batch sub-application inside it; the batch sub-application executes the batch task and reports the batch task execution state back to the scheduling platform in real time. S5: the scheduling platform issues a container destruction command according to the batch task execution state. S6: the container management platform destroys the batch container according to the destruction command.
Preferably, the scheduling platform comprises an instance management center, an allocation controller, an instruction generator, a state collector and a lifecycle manager, and step S2 comprises: S201, start the batch task through the instance management center, generate a task instance, and check the task instance state, the task timeout and the task parallelism; S202, check the load of the resource pool through the allocation controller and allocate the batch task to the task instance; S203, generate a task execution instruction through the instruction generator, the instruction comprising batch image pull information and batch container information. If a free batch container already exists for the task instance, bind the task instance to that free container and skip step S3; otherwise execute step S3.
Preferably, in step S4, reporting the batch task execution state to the scheduling platform in real time comprises: after the task finishes, sending the task execution state to the state collector, which forwards it to the lifecycle manager; the lifecycle manager then checks whether further batch tasks still need to run under the batch image and whether the batch task needs to continue.
Preferably, the method further comprises: when the task execution state is "executing" and the execution time exceeds a first preset time, sending an alarm notification; if it exceeds a second preset time, unbinding the batch task from its task instance, marking the batch task as timed out, and setting the corresponding batch container retention state to false.
Preferably, the method further comprises: when a batch container becomes unavailable because its batch sub-application failed to start, scanning the batch containers of the task instances with a timed task, checking each container's state-update time and state information, and sending a destruction command for any container in an abnormal state to the container management platform.
In summary, the beneficial effects of the invention include: the scheduling system achieves scheduling control of enterprise-level batch tasks and centralized use of resources, and solves the problems that batch windows are concentrated across the various service systems, server resource usage is excessive during batch execution, batches interfere with online services, and server resources cannot be used effectively after a batch finishes.
Drawings
The disclosure of the present invention is illustrated with reference to the accompanying drawings. It is to be understood that the drawings are designed solely for the purposes of illustration and not as a definition of the limits of the invention. In the drawings, like reference numerals are used to refer to like parts. Wherein:
FIG. 1 is a schematic structural diagram of a batch dispatching system based on a distributed architecture and containerization according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a distributed architecture and containerized batch scheduling method according to an embodiment of the present invention;
FIG. 3 is a flow chart illustrating another embodiment of a distributed architecture and containerized batch scheduling method;
FIG. 4 is a flowchart illustrating a process of starting a batch task through a scheduling platform according to an embodiment of the present invention.
Detailed Description
It will be readily understood that, based on the technical solution of the present invention, a person skilled in the art can propose various alternative structures and implementations without departing from its spirit. The following detailed description and the accompanying drawings are therefore merely illustrative of the technical solution and should not be construed as describing the whole invention or as limiting its scope.
An embodiment of the present invention is shown in fig. 1. A batch scheduling system based on a distributed architecture and containerization comprises a container management platform, a scheduling platform, an image pool and a resource pool. The image pool contains a number of batch images generated by packaging the business subsystems, such as a deposit batch image, an accounting batch image and a card-system batch image; the resource pool contains a number of batch containers created from those images, such as a deposit batch docker, an accounting batch docker and a card batch docker.
In this embodiment, the container management platform generates the corresponding batch container according to the task execution instruction and destroys batch containers according to the batch task execution state.
The scheduling platform configures the batch task information of the service systems, establishes the mapping between batch tasks, batch images and execution systems, sends task execution instructions to the container management platform, tracks the batch task execution state and feeds it back to the container management platform.
Establishing the mapping between batch tasks, batch images and execution systems comprises binding each deployable batch image ID to the execution system ID and the batch task ID on the scheduling platform. For example: batch image ID S_0001 corresponds to batch task ID Task_dataClean_mirror and execution system ID 0001.
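The three-way binding described above can be sketched as a small registry keyed by task ID. This is a hypothetical data model for illustration only; the patent specifies just the three IDs and their linkage, and the class and method names here are assumptions.

```python
# Sketch of the image/task/system binding (hypothetical data model).
from dataclasses import dataclass


@dataclass(frozen=True)
class BatchMapping:
    image_id: str   # deployable batch image, e.g. "S_0001"
    task_id: str    # batch task ID on the scheduling platform
    system_id: str  # execution system that runs the task


class MappingRegistry:
    """Binds a deployable batch image ID to a task ID and system ID."""

    def __init__(self):
        self._by_task = {}

    def bind(self, image_id, task_id, system_id):
        self._by_task[task_id] = BatchMapping(image_id, task_id, system_id)

    def lookup(self, task_id):
        # Returns the mapping for a task, or None if it was never bound.
        return self._by_task.get(task_id)


registry = MappingRegistry()
registry.bind("S_0001", "Task_dataClean_mirror", "0001")
m = registry.lookup("Task_dataClean_mirror")
```

At dispatch time the scheduling platform would resolve a task ID through such a registry to decide which image to pull and which execution system receives the instruction.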
It should be understood that the execution system described above is the system that executes a batch task; it comprises a number of execution machines, i.e. servers. A batch sub-application is a program that is allowed to run inside a container on an execution machine.
Specifically, the scheduling platform comprises an instance management center, an allocation controller, an instruction generator, a state collector and a lifecycle manager.
The instance management center generates task instances and sets the task instance state, the task timeout and the task parallelism.
The allocation controller checks the load of the resource pool, sets the task execution template information and allocates batch tasks to task instances.
The instruction generator generates task execution instructions, each comprising batch image pull information and batch container information.
The state collector receives the task execution states reported by the batch sub-applications, which are pulled up by the batch containers.
The lifecycle manager receives and checks the execution state of the batch tasks under each batch image and sends the check result to the instance management center.
Furthermore, the scheduling platform also comprises a configuration center, a timed-task factory, a monitoring and alarm center, and a task definition center.
The configuration center manages system configuration information, the starting and stopping of platform functions, and so on.
The timed-task factory automatically generates and triggers timed tasks and performs distributed uniqueness checks on them.
The monitoring and alarm center monitors batch task execution, raises alarms for exceptions and for overlong execution times, and performs year-on-year and period-on-period comparisons and other multi-dimensional monitoring analysis.
The task definition center configures task execution information, such as the execution time, the execution time limit, concurrency, attribute relations, bound execution data, the data sharding mechanism, task dependencies, the task exception handling mode and alarm rules.
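A task definition covering the fields just listed might look like the record below. All field names and values are illustrative assumptions; the patent names the categories of information but not a concrete schema.

```python
# Illustrative task-definition record for the task definition center
# (hypothetical field names; values are examples only).
task_definition = {
    "task_id": "Task_dataClean_mirror",
    "cron": "0 30 1 * * ?",            # scheduled execution time
    "max_exec_seconds": 3600,          # execution time limit
    "parallelism": 1,                  # 1 = parallel execution, 0 = serial
    "shard_count": 4,                  # data sharding mechanism
    "depends_on": ["Task_eod_close"],  # task dependency relation
    "on_failure": "retry",             # task exception handling mode
    "alarm_rule": {"notify": "ops", "after_seconds": 1800},
}


def validate_definition(d):
    """Check that the minimum set of scheduling fields is present."""
    required = {"task_id", "cron", "max_exec_seconds", "parallelism"}
    return required.issubset(d)
```

Such a record would be what the instance management center consults when it creates a task instance in step S201 below.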
Referring to fig. 2 and 3, the present invention further provides a batch scheduling method based on a distributed architecture and containerization, comprising the following steps:
S1, loading the batch task configuration of the service systems into the scheduling platform and establishing the mapping between batch tasks, batch images and execution systems.
Establishing the mapping comprises binding each deployable batch image ID to the execution system ID and the batch task ID on the scheduling platform. For example: batch image ID S_0001 corresponds to batch task ID Task_dataClean_mirror and execution system ID 0001.
It should be understood that, before step S1, the method further comprises packaging the service subsystems into the image pool and deploying the container management platform environment, with the Docker open-source container engine installed and Kubernetes (K8s) used as the container cluster management system.
S2, starting the batch task through the scheduling platform and, according to the configuration information, calling the container management platform and sending it a task execution instruction.
The scheduling platform comprises an instance management center, an allocation controller, an instruction generator, a state collector and a lifecycle manager. Referring to fig. 4, step S2 comprises:
S201, starting the batch task through the instance management center, generating a task instance, and checking the task instance state, the task timeout and the task parallelism.
This specifically comprises: generating a task instance, setting its ID instance_id, and querying the database table to check whether an execution instance of the task's system already exists in the execution instance pool; checking the number of currently running instances of the image required by the batch task and, if it does not exceed the maximum parallel count, setting the task instance state to running=1, otherwise sending alarm information to notify the responsible person; and setting the task timeout max_time and the parallel flag parallelism, where parallelism=1 means parallel execution and parallelism=0 means non-parallel execution. If the task is a high-frequency task, whether a running instance already exists is also checked.
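Step S201 can be sketched as follows. This is a minimal illustration under stated assumptions: the state codes (running=1 after creation, parallelism 0/1) follow the text, while the function name, the maximum parallel count and the use of a UUID for instance_id are assumptions.

```python
# Sketch of S201: create a task instance, enforce the maximum parallel
# image count, and set the timeout and parallel flags.
import uuid

MAX_PARALLEL = 5  # assumed limit; the patent leaves the value to configuration


def start_task(task_id, running_count, parallel, max_time=3600):
    """Return a task-instance dict, or None if the parallel limit is hit
    (in which case the caller alarms the responsible person)."""
    if running_count >= MAX_PARALLEL:
        return None
    return {
        "instance_id": str(uuid.uuid4()),  # instance_id
        "task_id": task_id,
        "running": 1,                      # task instance state set to 1
        "max_time": max_time,              # task timeout, seconds
        "parallelism": 1 if parallel else 0,
    }


inst = start_task("Task_dataClean_mirror", running_count=2, parallel=True)
```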
S202, checking the load of the resource pool through the allocation controller, and allocating the batch tasks to the task instances.
The load of the resource pool is checked, and the target information of the task execution template resource is set, including the task-mapped image ID and the IP address and port of the target execution server; this information is used to generate the execution instruction and pull the image. After allocation, the task load count of the physical server is increased by 1 in its load state. When the batch task is a high-frequency task and a running instance exists, the resource binding information is set and the high-frequency task is bound to the running instance; this prevents high-frequency batch tasks from repeatedly pulling up batch containers during execution and improves execution efficiency.
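A minimal sketch of S202, assuming a least-loaded placement policy (the patent says only that the load is checked, so the `min`-by-load choice, the server records and the field names are illustrative assumptions):

```python
# Sketch of S202: pick an execution server from the resource pool, fill in
# the execution-template target info, and increment the server's load.
def allocate(task, servers, image_id):
    target = min(servers, key=lambda s: s["load"])  # assumed: least loaded
    target["load"] += 1  # "task load amount +1" after allocation
    task["template"] = {
        "image_id": image_id,                            # task-mapped image ID
        "endpoint": f'{target["ip"]}:{target["port"]}',  # target server
    }
    return task


servers = [
    {"ip": "10.0.0.1", "port": 8080, "load": 3},
    {"ip": "10.0.0.2", "port": 8080, "load": 1},
]
task = allocate({"task_id": "Task_dataClean_mirror"}, servers, "S_0001")
```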
S203, generating a task execution instruction through an instruction generator, wherein the task execution instruction comprises batch mirror image pulling information (mirrorInfo) and batch container information (dockerInfo).
If no batch container exists for the task instance, the container management platform is called and the task instance state is set to new (useType=0); the task instance state describes the instance used to execute the task, and here it indicates that a new container instance must be allocated. If the task executes in parallel, the task instance state is set to parallel execution (useType=2). If an execution instance already exists and parallelism is 0 (non-parallel), step S3 is skipped: the free container is bound to the task instance, the task execution state is set to resource allocation finished (running=2), and the command is issued to the designated task instance.
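The branching in S203 can be sketched as below. The codes useType=0 (new) and useType=2 (parallel) and running=2 (resource allocation finished) come from the text; useType=1 for the reuse branch is an assumption, since the text does not give a code for that case, and the instruction field names mirrorInfo/dockerInfo follow the text.

```python
# Sketch of S203: build the task execution instruction and decide whether
# a new container is needed or a free one can be reused.
def build_instruction(task, free_instance=None):
    instruction = {
        "mirrorInfo": {"image_id": task["image_id"]},        # image pull info
        "dockerInfo": {"name": f'batch-{task["task_id"]}'},  # container info
    }
    if task.get("parallelism") == 1:
        task["useType"] = 2       # parallel execution: new container each time
    elif free_instance is not None:
        task["useType"] = 1       # assumed code: reuse a free container
        task["running"] = 2       # resource allocation finished, skip S3
        task["bound_instance"] = free_instance
    else:
        task["useType"] = 0       # new container instance required
    return instruction


t = {"task_id": "T1", "image_id": "S_0001", "parallelism": 0}
ins = build_instruction(t, free_instance="docker-42")
```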
S3, the container management platform pulls the batch image from the image pool according to the task execution instruction and creates the corresponding batch container. For example, in fig. 1 the deposit batch docker is created from the deposit batch image. After the batch container is created, its container instance ID is returned and bound to the batch task ID, and the task execution state is set to resource allocation finished (running=2).
S4, after the batch container starts, the batch sub-application inside it is pulled up; the batch sub-application executes the batch task and reports the batch task execution state to the scheduling platform in real time.
The batch task enters the execution machine, the batch sub-application is pulled up, and the execution state is reported once the sub-application has started successfully; the scheduling platform then sets the task execution state to batch executing (running=3) and keeps it updated continuously.
In step S4, reporting the batch task execution state to the scheduling platform in real time comprises: after the task finishes, sending the task execution state to the state collector and setting the task execution state to complete (running=4); the state collector then forwards the state to the lifecycle manager, which checks whether further batch tasks still need to run under the batch image and whether the batch task needs to continue.
If the task instance needs to be retained, its state is set to retained, otherwise not retained (needRetain=true or false), and this is sent to the instance management center. The instance management center processes the needRetain state, unbinds the batch task from the task instance, and marks whether the task instance can be reused; if it cannot, the batch container's state in the resource pool is marked as to-be-destroyed (dockerStatus=3), and a destruction command is sent to the allocation controller with instance_id set to null and action set to "destroy".
If the batch task execution result is failure (running=5), the task execution state is set to failed and a failure alarm is sent. Whether the task's failure configuration supports continuing is then checked: if so, the task is retained; if not, needRetain is set to false, the batch task is unbound from the batch container, the container's state in the resource pool is marked as to-be-destroyed (dockerStatus=3), and the container instance is destroyed.
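The completion and failure paths just described can be sketched in one decision function. The state codes (running 4=complete, 5=failed; dockerStatus 3=to-be-destroyed) follow the text; the function signature and the way "more tasks pending under the image" is passed in are assumptions.

```python
# Sketch of the lifecycle decision after a batch task reports its state:
# retain the instance for further tasks, or mark its container for destruction.
def on_task_finished(task, container, more_tasks_pending, retry_on_failure):
    if task["running"] == 5 and retry_on_failure:
        return "retain"                # failed, but configured to continue
    need_retain = task["running"] == 4 and more_tasks_pending
    task["instance_id"] = None         # unbind the task from its instance
    if not need_retain:
        container["dockerStatus"] = 3  # mark container to-be-destroyed
        return "destroy"
    return "retain"


c = {"dockerStatus": 1}
result = on_task_finished(
    {"running": 4, "instance_id": "i1"}, c,
    more_tasks_pending=False, retry_on_failure=False,
)
```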
S5, the scheduling platform issues a container destruction command according to the batch task execution state; the allocation controller sends the command to the instruction generator, which forwards it to the container management platform.
S6, the container management platform destroys the batch container according to the destruction command and persists the execution result to the OSS storage center.
In this embodiment, the scheduling method further comprises: when the task execution state is "executing" and the execution time exceeds a first preset time, sending an alarm notification; if it exceeds a second preset time, unbinding the batch task from its task instance, marking the batch task as timed out, and setting the corresponding batch container retention state to false.
Specifically, if the execution time exceeds 12 hours, a warning that the batch container is about to be destroyed is issued; if the task execution time exceeds twice the predefined limit and is over 24 hours, the binding between the task and its execution instance is released, the task is marked as timed out, the container retention state is set to false, and a command is sent to the instance management center.
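The two-stage timeout handling can be sketched as follows, using the example thresholds from the text (warn after 12 hours, force-unbind after 24 hours); the function name and state field names are assumptions.

```python
# Sketch of the two-stage timeout check: warn first, then unbind and
# mark the task as timed out.
WARN_SECONDS = 12 * 3600  # first preset time (example value from the text)
KILL_SECONDS = 24 * 3600  # second preset time (example value from the text)


def check_timeout(task, elapsed_seconds):
    if elapsed_seconds <= WARN_SECONDS:
        return "ok"
    if elapsed_seconds <= KILL_SECONDS:
        return "warn"                 # alarm: container will be destroyed
    task["instance_id"] = None        # unbind the task from its instance
    task["state"] = "timeout"         # mark the batch task as timed out
    task["retain_container"] = False  # container retention state set to false
    return "killed"
```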
In this embodiment of the present invention, the scheduling method further includes: when the batch container is not available due to the fact that the batch sub-application is pulled up unsuccessfully, the batch container of the task instance is scanned through the timing task, the state updating time and the state information of the batch container are checked, and a destroying command of the batch container in the abnormal state is sent to the container management platform. And actively calling the state information query of the batch container aiming at the batch container which does not return to the self state for a long time, updating the state information of the batch container, starting retry query accumulated along delay intervals if the query is failed, executing maxretry which is 10 times of query, wherein interval of each query is timeout + maxretry which is 60(s), and sending alarm information if the query still has no result, generating a container destruction command and sending a container management platform.
When the task instance ID is empty, the allocation controller sends a request with action = destroy, generates a container destruction command according to the container instance ID, and sends it to the container management platform, which executes the container destruction. After the destruction is finished, an execution state is returned, and the container's state in the resource pool is marked as destroyed (dockerStatus = 4).
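A minimal sketch of this destroy path, assuming a dictionary-based resource pool: the field names `task_instance_id`, `containerInstanceId`, and `dockerStatus` follow the text, while the function names are illustrative assumptions.

```python
# Hypothetical sketch: an unbound container (empty task instance ID) gets an
# action=destroy request keyed by container instance ID; once the platform
# confirms, the resource-pool entry is marked destroyed (dockerStatus = 4).
DOCKER_STATUS_DESTROYED = 4

def build_destroy_command(container):
    """Return the destroy request for an unbound container, or None if bound."""
    if container.get("task_instance_id"):   # still bound to a task instance
        return None
    return {"action": "destroy",
            "containerInstanceId": container["container_id"]}

def on_destroy_complete(resource_pool, container_id):
    """Mark the container destroyed in the resource pool after confirmation."""
    resource_pool[container_id]["dockerStatus"] = DOCKER_STATUS_DESTROYED
```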
In summary, the beneficial effects of the invention include: the scheduling system enables enterprise-level scheduling control of batch tasks and centralized use of resources, and solves the problems that batch windows are concentrated across service systems, server resource usage is excessive during batch execution, batches affect online services, and server resources cannot be effectively utilized after batch execution finishes.
The technical scope of the present invention is not limited to the above description, and those skilled in the art can make various changes and modifications to the above-described embodiments without departing from the technical spirit of the present invention, and such changes and modifications should fall within the protective scope of the present invention.

Claims (8)

1. A batch scheduling system based on a distributed architecture and containerization, characterized by comprising a container management platform, a scheduling platform, an image pool and a resource pool, wherein the image pool comprises a plurality of batch images generated by packaging a service subsystem, and the resource pool comprises a plurality of batch containers created based on the batch images;
the container management platform is used for generating a corresponding batch container according to the task execution instruction and carrying out batch container destruction operation according to the batch task execution state;
the scheduling platform is used for configuring batch task information of the service system, establishing a mapping relationship among the batch tasks, the batch images and the execution systems, sending task execution instructions to the container management platform, tracking the batch task execution state, and feeding the batch task execution state back to the container management platform.
2. The distributed architecture and containerized batch scheduling system of claim 1, wherein the scheduling platform includes an instance management center, an allocation controller, an instruction generator, a state collector, and a lifecycle manager;
the instance management center is used for generating a task instance, and setting a task instance state, a task timeout time and a task parallel state;
the distribution controller is used for checking the load of the resource pool, setting task execution template information and distributing batch tasks to task instances;
the instruction generator is used for generating a task execution instruction, and the task execution instruction comprises batch mirror image pull information and batch container information.
The state collector is used for receiving the task execution states reported by batch sub-applications, and the batch sub-applications are pulled up by the batch containers;
the life cycle manager is used for receiving and checking the execution state of the batch tasks under the batch mirroring, and sending the checking result to the instance management center.
3. The distributed architecture and containerized batch scheduling system of claim 1, wherein said establishing a mapping relationship among the batch tasks, the batch images and the execution systems comprises: binding the batch image ID available for deployment with the execution system ID and the batch task ID of the scheduling platform.
4. A batch scheduling method based on a distributed architecture and containerization is characterized by comprising the following steps:
S1, loading the configuration information of the service system batch tasks into a scheduling platform, and establishing the mapping relationship among the batch tasks, the batch images and the execution systems;
S2, starting the batch task through the scheduling platform, and sending a task execution instruction to the container management platform according to the configuration information;
S3, the container management platform pulls batch images in the image pool according to the task execution instruction to generate corresponding batch containers;
S4, after the batch container is started, pulling up the batch sub-application corresponding to the batch container, the batch sub-application executing the batch task and feeding back the batch task execution state to the scheduling platform in real time;
S5, the scheduling platform initiates a container destruction command according to the batch task execution state;
and S6, the container management platform destroys the batch container according to the container destruction command.
5. The distributed architecture and containerized batch scheduling method of claim 4, wherein the scheduling platform comprises an instance management center, an allocation controller, an instruction generator, a state collector, and a lifecycle manager, and said step S2 comprises:
S201, starting the batch task through the instance management center, generating a task instance, and checking the task instance state, the task timeout time and the task parallel state;
S202, checking the load of the resource pool through the allocation controller, and allocating the batch task to a task instance;
S203, generating a task execution instruction through the instruction generator, wherein the task execution instruction comprises batch image pull information and batch container information; if a free batch container corresponding to the task instance exists, binding the task instance with the free batch container and skipping step S3; otherwise, executing step S3.
6. The distributed architecture and containerized batch scheduling method of claim 4, wherein in step S4, the feeding back the batch task execution state to the scheduling platform in real time comprises: after the task execution is finished, sending the task execution state to the state collector, which then forwards it to the life cycle manager; the life cycle manager checks whether there are batch tasks still to be executed under the batch image and whether the batch task needs to continue executing.
7. The distributed architecture and containerized batch scheduling method of claim 4, further comprising: when the task execution state is "executing" and the execution time exceeds a first preset time, sending an alarm notification; and if the execution time exceeds a second preset time, unbinding the batch task from the task instance, marking the batch task as timed out, and setting the retention state of the corresponding batch container to false.
8. The distributed architecture and containerized batch scheduling method of claim 4, further comprising: when a batch container becomes unavailable because its batch sub-application failed to start, scanning the batch containers of the task instances with a timed task, checking the state-update time and state information of each batch container, and sending a destruction command for any batch container in an abnormal state to the container management platform.
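As an illustration only (code plays no role in the claims), the end-to-end flow of steps S1-S6 and the idle-container reuse of claim 5 can be sketched as follows; every function name here is an assumption for illustration, not part of the claimed system.

```python
# Illustrative sketch of the S1-S6 scheduling flow described in claim 4,
# with claim 5's idle-container reuse. All names are hypothetical.
def run_batch_task(scheduler, container_platform, task_config):
    task = scheduler.load_config(task_config)            # S1: load config, map task -> image/system
    instruction = scheduler.start_task(task)             # S2: generate task execution instruction
    container = scheduler.find_idle_container(task)      # claim 5: reuse a free container if any
    if container is None:
        container = container_platform.create_container( # S3: pull image, create container
            instruction["image_id"], instruction["container_spec"])
    result = container.run_sub_application(task)         # S4: sub-application executes the task
    scheduler.collect_state(task, result)                # S4: real-time state feedback
    destroy_cmd = scheduler.build_destroy_command(task)  # S5: scheduler issues destroy command
    container_platform.destroy(destroy_cmd)              # S6: platform destroys the container
    return result
```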
CN202111230495.3A 2021-10-22 2021-10-22 Batch dispatching system and method based on distributed architecture and containerization Pending CN113886023A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111230495.3A CN113886023A (en) 2021-10-22 2021-10-22 Batch dispatching system and method based on distributed architecture and containerization

Publications (1)

Publication Number Publication Date
CN113886023A true CN113886023A (en) 2022-01-04

Family

ID=79004141

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111230495.3A Pending CN113886023A (en) 2021-10-22 2021-10-22 Batch dispatching system and method based on distributed architecture and containerization

Country Status (1)

Country Link
CN (1) CN113886023A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117170823A (en) * 2023-11-03 2023-12-05 腾讯科技(深圳)有限公司 Method and device for executing operation in batch container and electronic equipment
CN117170823B (en) * 2023-11-03 2024-02-27 腾讯科技(深圳)有限公司 Method and device for executing operation in batch container and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination