CN110928716B - Scheduling task exception handling method and device

Info

Publication number: CN110928716B
Application number: CN201911016897.6A
Authority: CN
Inventors: 来彬彬; 王根村; 胡仁超; 张彬; 史珠峰
Original assignee: Jiangsu Suning Logistics Co ltd
Current assignee: Jiangsu Suning Logistics Co ltd
Priority date: 2019-10-24
Filing date: 2019-10-24
Publication date: 2022-09-06
Anticipated expiration: 2039-10-24
Also published as: CN110928716A

Abstract

The invention discloses a scheduling task exception handling method and device, relating to the technical field of business support, which can improve the efficiency of handling abnormal scheduling tasks while reducing the workload of maintenance personnel. The method comprises the following steps: constructing a business exception table template and a system exception table template in advance according to the types of abnormal scheduling tasks; monitoring abnormal scheduling data, importing business exception data into the business exception table template to generate a business exception task, and/or importing system exception data into the system exception table template to generate a system exception task; and calling a corresponding business exception handling scheme from a business scheme library based on the business exception task to perform an exception handling operation, and/or calling a corresponding system exception handling scheme from a system scheme library based on the system exception task to perform an exception handling operation. The device applies the method provided by this scheme.

Description

Scheduling task exception handling method and device

Technical Field

The invention relates to the technical field of business support, in particular to a scheduling task exception handling method and device.

Background

With the rapid development of logistics information technology, a large number of scheduling tasks are generated at every moment, and a scheduling system can hardly avoid abnormal situations when processing them. These abnormal situations are generally divided into business exceptions and system exceptions, and most existing scheduling systems handle them by combining fault monitoring with manual processing. The scheduling system monitors key processing steps; when a scheduling task is interrupted by an abnormal condition, the system sends an alarm to notify a maintenance person. The maintenance person analyzes the alarm information, inspects the fault point in the scheduling system to confirm the root cause (that is, analyzes the fault), resolves the fault according to the root cause, and then restarts the related scheduling tasks and subsequent operations.

Disclosure of Invention

The invention aims to provide a scheduling task exception handling method and device, which can improve the processing efficiency of an exception scheduling task and reduce the workload of maintenance personnel.

In order to achieve the above object, one aspect of the present invention provides a scheduling task exception handling method, including:

constructing a business exception table template and a system exception table template in advance according to the types of abnormal scheduling tasks;

monitoring abnormal scheduling data, importing business exception data into the business exception table template to generate a business exception task, and/or importing system exception data into the system exception table template to generate a system exception task;

and calling a corresponding business exception handling scheme from the business scheme library based on the business exception task to perform an exception handling operation, and/or calling a corresponding system exception handling scheme from the system scheme library based on the system exception task to perform an exception handling operation.

Preferably, after the step of calling the corresponding business exception handling scheme from the business scheme library based on the business exception task to perform the exception handling operation, and/or calling the corresponding system exception handling scheme from the system scheme library based on the system exception task to perform the exception handling operation, the method further includes:

judging whether the exception of the business exception task has been eliminated after the exception handling operation; when it has not been eliminated and the retry count reaches a threshold, issuing an early warning for the business exception task, requesting manual assistance to eliminate the exception, and storing the manually assisted business exception handling scheme;

judging whether the exception of the system exception task has been eliminated after the exception handling operation; when it has not been eliminated and the retry count reaches a threshold, issuing an early warning for the system exception task, requesting manual assistance to eliminate the exception, and storing the manually assisted system exception handling scheme;

and updating the exception type of the business exception task and the corresponding business exception handling scheme into the business scheme library, and updating the exception type of the system exception task and the corresponding system exception handling scheme into the system scheme library.

Illustratively, the method for monitoring the abnormal scheduling data is as follows:

monitoring the abnormal scheduling data through a distributed timed task.

Optionally, the business exception table template at least includes primary key information, merchant code information, exception type information, processing state information, retry count information, threshold information, and early warning contact information; the system exception table template includes primary key information, merchant code information, exception type information, processing time update information, processing state information, retry count information, threshold information, retry strategy information, and early warning contact information.

Preferably, the method for calling the corresponding business exception handling scheme from the business scheme library based on the business exception task to perform the exception handling operation includes:

calling the matched business exception handling scheme from the business scheme library based on the exception type of the business exception task to trigger the exception handling operation.

Preferably, the method for calling the corresponding system exception handling scheme from the system scheme library to perform the exception handling operation based on the system exception task includes:

acquiring system exception tasks in real time and caching them in sequence into an MQ queue buffer pool;

fetching the system exception tasks to be processed from the MQ queue buffer pool in sequence, and calling the matched system exception handling scheme from the system scheme library based on the exception type of the system exception task to trigger the exception handling operation;

and when the exception handling succeeds, deleting the system exception task from the MQ queue buffer pool, and when the exception handling fails, selecting a same-frequency retry strategy or a frequency modulation retry strategy to re-trigger the exception handling operation on the system exception task.

Preferably, the retry interval of the same-frequency retry strategy is always T1; for the frequency modulation retry strategy, the current retry interval is T2 and the next retry interval is T3, with T3 > T2.

Preferably, the method further comprises the following steps:

summarizing the business exception tasks and the corresponding business exception handling schemes, summarizing the system exception tasks and the corresponding system exception handling schemes, and adopting a clustering algorithm to analyze big data to form a knowledge base to assist in manually formulating the business exception handling schemes and the system exception handling schemes.

Compared with the prior art, the scheduling task exception handling method provided by the invention has the following beneficial effects:

the scheduling task exception handling method comprises the steps of constructing a business exception table template and a system exception table template in advance, then importing business exception data into the business exception table template to generate a business exception task and importing system exception data into the system exception table template to generate a system exception task by monitoring exception scheduling data, and realizing effective classification of exception scheduling tasks.

Therefore, the method and the device reduce the dependence of the abnormal scheduling task on manual processing, realize the automatic processing of the abnormal scheduling task, and reduce the workload of maintenance personnel while improving the processing efficiency of the abnormal scheduling task.

Another aspect of the present invention provides a scheduling task exception handling apparatus, to which the scheduling task exception handling method mentioned in the above technical solution is applied, and the apparatus includes:

the table template unit is used for constructing a business exception table template and a system exception table template in advance according to the types of abnormal scheduling tasks;

the exception monitoring unit is used for monitoring abnormal scheduling data, importing business exception data into the business exception table template to generate a business exception task, and/or importing system exception data into the system exception table template to generate a system exception task;

and the exception handling unit is used for calling a corresponding business exception handling scheme from the business scheme library based on the business exception task to perform exception handling operation, and/or calling a corresponding system exception handling scheme from the system scheme library based on the system exception task to perform exception handling operation.

Compared with the prior art, the beneficial effects of the scheduling task exception handling device provided by the invention are the same as the beneficial effects of the scheduling task exception handling method provided by the technical scheme, and are not repeated herein.

A third aspect of the present invention provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, performs the steps of the above-mentioned method for handling a scheduled task exception.

Compared with the prior art, the beneficial effects of the computer-readable storage medium provided by the invention are the same as the beneficial effects of the scheduling task exception handling method provided by the technical scheme, and are not repeated herein.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention and not to limit the invention. In the drawings:

FIG. 1 is a flowchart illustrating a method for handling exception of scheduling task according to an embodiment;

FIG. 2 is a diagram illustrating an exemplary business exception table template according to a first embodiment;

FIG. 3 is a diagram illustrating an example system exception table template according to an embodiment.

Detailed Description

In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be obtained by a person skilled in the art without inventive step based on the embodiments of the present invention, are within the scope of protection of the present invention.

Example one

Referring to fig. 1, the present embodiment provides a method for handling exception of a scheduling task, including:

constructing a business exception table template and a system exception table template in advance according to the types of abnormal scheduling tasks; monitoring abnormal scheduling data, importing business exception data into the business exception table template to generate a business exception task, and/or importing system exception data into the system exception table template to generate a system exception task; and calling a corresponding business exception handling scheme from the business scheme library based on the business exception task to perform an exception handling operation, and/or calling a corresponding system exception handling scheme from the system scheme library based on the system exception task to perform an exception handling operation.

In the scheduling task exception handling method provided by this embodiment, a service exception table template and a system exception table template are pre-constructed, then, exception scheduling data is monitored, the service exception data is imported into the service exception table template to generate a service exception task, and the system exception data is imported into the system exception table template to generate a system exception task, so as to implement effective classification of exception scheduling tasks.

Therefore, the dependence of the abnormal scheduling task on manual processing is reduced, the automatic processing of the abnormal scheduling task is realized, and the workload of maintenance personnel is reduced while the processing efficiency of the abnormal scheduling task is improved.
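The classification and dispatch flow described above can be illustrated with a minimal sketch. It is not the patented implementation: the `ExceptionTask` class, the `make_task` and `handle` helpers, the `source` field, and the two example scheme libraries are all hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class ExceptionTask:
    category: str        # "business" or "system"
    exception_type: str  # key used to look up a handling scheme
    payload: dict


# Hypothetical scheme libraries: exception type -> handler returning True on success.
BUSINESS_SCHEMES: Dict[str, Callable[[dict], bool]] = {
    "STOCK_NOT_REPLENISHED": lambda payload: payload.get("stock", 0) > 0,
}
SYSTEM_SCHEMES: Dict[str, Callable[[dict], bool]] = {
    "NETWORK_JITTER": lambda payload: payload.get("service_up", False),
}


def make_task(record: dict) -> ExceptionTask:
    """Import one monitored record into the matching template (assumed 'source' field)."""
    category = "system" if record["source"] == "system" else "business"
    return ExceptionTask(category, record["exception_type"], record.get("payload", {}))


def handle(task: ExceptionTask) -> Optional[bool]:
    """Call the matching scheme; None means no scheme is registered for this type."""
    library = SYSTEM_SCHEMES if task.category == "system" else BUSINESS_SCHEMES
    scheme = library.get(task.exception_type)
    return None if scheme is None else scheme(task.payload)
```

Keeping the two libraries separate mirrors the split between the business scheme library and the system scheme library in the text; a task with no registered scheme returns `None` so that the caller can escalate it.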

As will be readily appreciated, exception tasks typically fall into two broad categories: business exception tasks and system exception tasks. A business exception task refers to a business scenario that can only be completed with manual support while the system is running; for example, in the order placement stage a merchant fails to replenish stock in time, fails to maintain the information of commodities on sale, or has problems with directly ordering the commodities, so that the expected flow cannot be completed. A system exception task refers to an unexpected exception generated while the system is running, where faults such as network jitter or server downtime make a service unavailable for a period of time. Such an exception is generally handled by reprocessing under a certain strategy so that the task can continue to execute: for example, retrying once every 5 minutes at a fixed frequency, or retrying at a changing frequency (5 minutes after the first failure, 30 minutes after the second failure, and 60 minutes after the third failure), so that the exception flow is handled automatically once the service recovers.

In the above embodiment, after the step of calling the corresponding business exception handling scheme from the business scheme library based on the business exception task to perform the exception handling operation, and/or calling the corresponding system exception handling scheme from the system scheme library based on the system exception task to perform the exception handling operation, the method further includes:

judging whether the exception of the business exception task has been eliminated after the exception handling operation; when it has not been eliminated and the retry count reaches a threshold, issuing an early warning for the business exception task, requesting manual assistance to eliminate the exception, and storing the manually assisted business exception handling scheme; judging whether the exception of the system exception task has been eliminated after the exception handling operation; when it has not been eliminated and the retry count reaches a threshold, issuing an early warning for the system exception task, requesting manual assistance to eliminate the exception, and storing the manually assisted system exception handling scheme; and updating the exception type of the business exception task and the corresponding business exception handling scheme into the business scheme library, and updating the exception type of the system exception task and the corresponding system exception handling scheme into the system scheme library.

In a specific implementation, to cover the case where the automatic handling scheme cannot resolve business exception tasks and system exception tasks in time, this embodiment sets a threshold on the number of retries. While the retry count of a business exception task or system exception task is below the threshold, the retry operation simply follows the configured strategy. Once the retry count reaches the threshold, the existing automatic handling scheme is considered unable to handle the current exception task effectively, in terms of both handling efficiency and the scheme itself. At this point an exception early warning is issued to request timely manual assistance in eliminating the exception, and the business exception handling scheme and/or system exception handling scheme provided through manual assistance is saved. Finally, these schemes are updated into the business scheme library and/or the system scheme library, so that an exception task of the same type encountered later can be handled automatically by calling the stored scheme directly.
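The threshold and escalation logic can be sketched as follows. This is only an illustration under assumptions: `process_with_threshold`, the `alert` callback, and the `manual_fix` callback (which stands in for a maintenance person supplying a new scheme) are hypothetical names, and the scheme library is modeled as a plain dictionary.

```python
from typing import Callable, Dict


def process_with_threshold(
    exception_type: str,
    scheme_library: Dict[str, Callable[[], bool]],
    threshold: int,
    alert: Callable[[str], None],
    manual_fix: Callable[[str], Callable[[], bool]],
) -> str:
    """Try the automatic scheme up to `threshold` times, then escalate."""
    scheme = scheme_library.get(exception_type)
    attempts = 0
    while scheme is not None and attempts < threshold:
        if scheme():
            return "resolved_automatically"
        attempts += 1
    # Threshold reached (or no scheme known): warn and ask for manual help.
    alert(f"{exception_type}: not resolved after {attempts} attempts")
    manual_scheme = manual_fix(exception_type)
    # Save the manually supplied scheme so the same type is automatic next time.
    scheme_library[exception_type] = manual_scheme
    return "resolved_manually"
```

After one escalation, a later task of the same exception type finds the saved scheme in the library and no longer triggers an alert, which is the feedback loop the paragraph describes.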

Optionally, in the above implementation, the same type of business exception task may correspond to multiple business exception handling schemes, and the same type of system exception task may correspond to multiple system exception handling schemes. During automatic handling, one triggering operation can select only one business exception handling scheme or system exception handling scheme for the corresponding exception task. To quickly match the best scheme for the current business exception task or system exception task, a polling selection approach may be adopted for retries, that is, a different scheme is selected on each retry. This greatly increases the probability of matching the best scheme and thereby improves the efficiency of handling exception tasks. Of course, the same scheme may also be selected on every retry; this embodiment does not limit how the scheme for a retry is selected, and a person skilled in the art may set it freely according to actual needs.
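The polling selection of schemes across retries might look like the sketch below. The function name `retry_with_polling` and the dictionary of candidate schemes are assumptions made for illustration; the source does not specify how candidates are ordered.

```python
from itertools import cycle
from typing import Callable, Dict, Optional


def retry_with_polling(
    schemes: Dict[str, Callable[[], bool]], max_retries: int
) -> Optional[str]:
    """Try a different candidate scheme on each retry, cycling through them in order.

    Returns the name of the first scheme that succeeds, or None if the
    retry budget is used up.
    """
    if not schemes:
        return None
    rotation = cycle(schemes.items())
    for _ in range(max_retries):
        name, scheme = next(rotation)
        if scheme():
            return name
    return None
```

Fixing the selection to a single scheme, which the text also allows, corresponds to passing a dictionary with one entry.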

Preferably, the method for monitoring the abnormal scheduling data in the above embodiment includes: monitoring the abnormal scheduling data through a distributed timed task.

In a specific implementation, distributed application servers are used to monitor the abnormal scheduling data. The design supports mainstream data persistence modes; performance is ensured by the container's object lookup and method invocation mechanisms; data pressure is relieved by splitting databases and tables; scalability is ensured by deploying the application servers in a distributed manner; and processing performance is ensured by using NIO non-blocking asynchronous processing in business processing.

Referring to fig. 2, the service exception table template at least includes primary key information, merchant code information, exception type information, processing status information, retry number information, threshold information, and early warning contact information; referring to fig. 3, the system exception table template includes primary key information, merchant code information, exception type information, processing time update information, processing status information, retry number information, threshold information, retry policy information, and early warning contact information.
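The two table templates can be pictured as record types. The sketch below maps each listed field to an attribute; the attribute names, types, and default values are assumptions, since the source lists only the kinds of information each template holds.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class BusinessExceptionRecord:
    primary_key: int
    merchant_code: str
    exception_type: str
    processing_state: str = "PENDING"  # assumed initial state
    retry_count: int = 0
    threshold: int = 3                 # assumed default retry threshold
    alert_contact: str = ""


@dataclass
class SystemExceptionRecord(BusinessExceptionRecord):
    # The system template adds the processing time update and the retry strategy.
    processing_time_updated: Optional[datetime] = None
    retry_policy: str = "same_frequency"  # or "frequency_modulation"
```

Modeling the system template as an extension of the business template reflects that its field list is a superset in the text; separate unrelated types would work equally well.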

The method for calling the corresponding business exception handling scheme from the business scheme library to perform exception handling operation based on the business exception task comprises the following steps: and calling the matched business exception handling scheme from the business scheme library based on the exception type of the business exception task to trigger exception handling operation.

The method for calling the corresponding system exception handling scheme from the system scheme library based on the system exception task to perform the exception handling operation comprises the following steps: acquiring system exception tasks in real time and caching them in sequence into an MQ queue buffer pool, for example using the Reactor pattern; fetching the system exception tasks to be processed from the MQ queue buffer pool in sequence, and calling the matched system exception handling scheme from the system scheme library based on the exception type of the system exception task to trigger the exception handling operation. Specifically, distributed application servers monitor the system exception tasks, find the corresponding instance name and the system exception handling scheme of that object from the container configuration, deserialize the recorded parameters, and pass them into the called method so that it is triggered and executed again; the serializer supports JDON/Hessian/K/Kryo. When the exception handling succeeds, the system exception task is deleted from the MQ queue buffer pool; when the exception handling fails, a same-frequency retry strategy or a frequency modulation retry strategy is selected to re-trigger the exception handling operation on the system exception task.
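A simplified, single-process sketch of this queue-driven loop follows. It makes several substitutions that should be read as assumptions: an in-memory `deque` stands in for the MQ queue buffer pool, the standard-library `json` module stands in for the serializer, and the task dictionary keys (`id`, `exception_type`, `serialized_params`, `retry_count`) are hypothetical. Retry delays are omitted here.

```python
import json
from collections import deque
from typing import Callable, Deque, Dict, List


def drain_queue(
    queue: Deque[dict],
    scheme_library: Dict[str, Callable[..., bool]],
    max_attempts: int = 10,
) -> List[str]:
    """Process queued system exception tasks in order; return ids of resolved tasks."""
    resolved: List[str] = []
    attempts = 0
    while queue and attempts < max_attempts:
        attempts += 1
        task = queue.popleft()  # take tasks in arrival order
        scheme = scheme_library[task["exception_type"]]
        params = json.loads(task["serialized_params"])  # deserialize recorded parameters
        if scheme(**params):
            resolved.append(task["id"])  # success: the task stays removed from the queue
        else:
            task["retry_count"] += 1     # failure: put it back for another attempt
            queue.append(task)
    return resolved
```

The `max_attempts` bound only keeps the sketch from looping forever on a task that never succeeds; in the described method that role is played by the retry threshold and early warning.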

Preferably, the retry interval of the same-frequency retry strategy is always T1; for the frequency modulation retry strategy, the current retry interval is T2 and the next retry interval is T3, with T3 > T2. For example, T1 is 3 minutes, and the frequency modulation retry strategy uses 5 retry intervals of 5 minutes, 30 minutes, 60 minutes, 120 minutes, and 240 minutes, respectively. With a flexible retry strategy and an effective early warning mechanism, manual intervention can handle exception tasks efficiently.
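The two retry strategies can be expressed as a small lookup, using the example values from the paragraph above (T1 = 3 minutes; 5, 30, 60, 120, 240 minutes). The function name, the policy strings, and the choice to return `None` once the schedule is exhausted are assumptions for illustration.

```python
from typing import Optional

SAME_FREQUENCY_MINUTES = 3                   # T1 from the example
FM_SCHEDULE_MINUTES = (5, 30, 60, 120, 240)  # strictly increasing, so T3 > T2


def next_retry_delay(policy: str, failures_so_far: int) -> Optional[int]:
    """Minutes to wait before the next retry, or None when the schedule is used up."""
    if policy == "same_frequency":
        return SAME_FREQUENCY_MINUTES
    if policy == "frequency_modulation":
        if failures_so_far < 1 or failures_so_far > len(FM_SCHEDULE_MINUTES):
            return None
        return FM_SCHEDULE_MINUTES[failures_so_far - 1]
    raise ValueError(f"unknown retry policy: {policy}")
```

For instance, `next_retry_delay("frequency_modulation", 2)` yields 30, the wait after the second failure.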

The above embodiment further includes: summarizing the business exception tasks and the corresponding business exception handling schemes, summarizing the system exception tasks and the corresponding system exception handling schemes, and adopting a clustering algorithm to analyze big data to form a knowledge base to assist in manually formulating the business exception handling schemes and the system exception handling schemes.

For ease of understanding, a logistics cloud platform is taken as an example. The platform receives and processes 2KW documents every day, and processing each document calls the related supporting services 50 times on average. Even at a service reliability of 99.999%, the number of exception tasks generated every day is huge; if all of them required manual intervention, the workload would be large and the timeliness of exception handling could not be guaranteed. By applying this embodiment, exceptions can be effectively classified and given early warning. For example, logistics documents are required not to stall in the system for more than 1 minute, while integrated warehousing-and-distribution business documents are required not to stall for more than 30 minutes. Exception tasks are tracked and classified, the knowledge base for exception handling is improved, and corresponding schemes are formulated to reduce operation and maintenance work. Through big data analysis, common problems and the corresponding system exception handling schemes can be classified in a timely manner and graded by priority, so that exception handling becomes more targeted. In addition, the system is designed with an entry point, so that exception tasks generated during operation and maintenance can be handled automatically.
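To give a sense of scale for this example, the daily number of failed service calls can be estimated as documents per day times calls per document times the failure rate. The sketch below assumes that "2KW" means 20 million (2 x 10 million); that reading is not confirmed by the source. Under that assumption, 20,000,000 x 50 x (1 - 0.99999) comes to 10,000 failed calls per day.

```python
from fractions import Fraction


def expected_failed_calls(
    documents_per_day: int, calls_per_document: int, reliability: Fraction
) -> Fraction:
    """Expected number of failed service calls per day."""
    return documents_per_day * calls_per_document * (1 - reliability)


# Assumption: "2KW" is read as 20 million documents per day.
daily_failures = expected_failed_calls(20_000_000, 50, Fraction(99999, 100000))
```

`Fraction` is used instead of a float so that the 99.999% figure is represented exactly.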

In a specific implementation:

step one, writing the entry point in the project: recording the currently processed instance name, the system exception handling scheme, the serialized parameters, and the exception type;

step two, periodically loading the system exception tasks to be processed into the MQ queue: the pending system exception tasks are loaded into the queue, and processing efficiency is ensured by splitting databases and tables and by application clusters;

step three, processing the exceptions synchronously in a distributed manner according to exception type: the task is continued from the point of the exception; if it succeeds, the exception task is deleted, and if it fails, the next trigger time is calculated according to the exception handling strategy;

step four, based on the system exception tasks, judging through real-time statistics over Flink streaming data whether an early warning needs to be triggered to request manual assistance in eliminating the exception;

and step five, analyzing the exception tasks with big data clustering algorithms such as k-means, k-modes, k-prototypes, or SOM (self-organizing map) to form rules that assist operation and maintenance and to modify the exception retry strategies.
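Step one, the entry point that records what is needed to replay a failed call, can be sketched as a decorator. This is an illustrative analogy, not the patented mechanism: `record_on_failure`, `EXCEPTION_LOG` (a stand-in for the system exception table), and the example `push_order` function are hypothetical, and `json` stands in for the serializer.

```python
import functools
import json
from typing import Callable, List

EXCEPTION_LOG: List[dict] = []  # hypothetical stand-in for the system exception table


def record_on_failure(instance_name: str, scheme_name: str) -> Callable:
    """Wrap a call so that a failure records the data needed to retry it later."""
    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(**params):
            try:
                return func(**params)
            except Exception as exc:
                EXCEPTION_LOG.append({
                    "instance_name": instance_name,
                    "scheme": scheme_name,
                    "serialized_params": json.dumps(params),
                    "exception_type": type(exc).__name__,
                })
                return None
        return wrapper
    return decorator


@record_on_failure("orderService", "retry_push_order")
def push_order(order_id: int) -> str:
    # A negative id simulates an unavailable downstream service.
    if order_id < 0:
        raise ConnectionError("downstream unavailable")
    return "pushed"
```

A later retry worker could read a logged entry, deserialize `serialized_params`, and call the named scheme again, which corresponds to steps two and three.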

Example two

The embodiment provides a scheduling task exception handling device, which includes:

the exception monitoring unit is used for monitoring abnormal scheduling data, importing business exception data into the business exception table template to generate a business exception task, and/or importing system exception data into the system exception table template to generate a system exception task;

and the exception handling unit is used for calling the corresponding business exception handling scheme from the business scheme library to perform exception handling operation based on the business exception task, and/or calling the corresponding system exception handling scheme from the system scheme library to perform exception handling operation based on the system exception task.

Compared with the prior art, the beneficial effects of the scheduling task exception handling device provided by the embodiment are the same as those of the scheduling task exception handling method provided by the embodiment, and are not described herein again.

Example three

The present embodiment provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program performs the steps of the above scheduling task exception handling method.

Compared with the prior art, the beneficial effects of the computer-readable storage medium provided by this embodiment are the same as those of the scheduling task exception handling method provided by the above technical solution, and are not described herein again.

It will be understood by those skilled in the art that all or part of the steps of the method of the invention may be implemented by a program instructing the related hardware. The program may be stored in a computer-readable storage medium, and when executed, the program performs the steps of the method of the embodiment. The storage medium may be: ROM/RAM, magnetic disks, optical disks, memory cards, and the like.

The above description covers only specific embodiments of the present invention, but the scope of the present invention is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall fall within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.

Claims

1. A scheduling task exception handling method is characterized by comprising the following steps:

calling a corresponding business exception handling scheme from the business scheme library based on the business exception task to perform an exception handling operation, and/or calling a corresponding system exception handling scheme from the system scheme library based on the system exception task to perform an exception handling operation;

the method for calling the corresponding system exception handling scheme from the system scheme library based on the system exception task to perform exception handling operation comprises the following steps:

and when the exception processing is successful, deleting the system exception task from the MQ queue buffer pool, and when the exception processing is failed, selecting a same-frequency retry strategy or a frequency modulation retry strategy to re-trigger the exception processing operation on the system exception task.

2. The method according to claim 1, wherein after the step of calling the corresponding business exception handling scheme from the business scheme library based on the business exception task to perform the exception handling operation, and/or calling the corresponding system exception handling scheme from the system scheme library based on the system exception task to perform the exception handling operation, the method further comprises:

3. The method of claim 1, wherein the method of monitoring the anomalous scheduling data comprises:

monitoring the abnormal scheduling data through a distributed timed task.

4. The method of claim 1, wherein the business exception table template at least includes primary key information, merchant code information, exception type information, processing state information, retry count information, threshold information, and early warning contact information; the system exception table template includes primary key information, merchant code information, exception type information, processing time update information, processing state information, retry count information, threshold information, retry strategy information, and early warning contact information.

5. The method according to claim 4, wherein the method for calling the corresponding service exception handling scheme from the service scheme library based on the service exception task to perform the exception handling operation comprises:

6. The method of claim 4, wherein the retry interval of the same-frequency retry strategy is always T1; for the frequency modulation retry strategy, the current retry interval is T2 and the next retry interval is T3, with T3 > T2.

7. The method of claim 2, further comprising:

8. A scheduled task exception handling apparatus, comprising:

the exception handling unit is used for calling a corresponding business exception handling scheme from the business scheme library based on the business exception task to perform exception handling operation, and/or calling a corresponding system exception handling scheme from the system scheme library based on the system exception task to perform exception handling operation;

sequentially calling system abnormal tasks to be processed from the MQ queue buffer pool, calling a matched system abnormal processing scheme from a system scheme library based on the abnormal type of the system abnormal tasks to trigger abnormal processing operation;

and when the exception handling is successful, deleting the system exception task from the MQ queue buffer pool, and when the exception handling is failed, selecting a same-frequency retry strategy or a frequency modulation retry strategy to re-trigger the exception handling operation on the system exception task.

9. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method according to any one of the claims 1 to 7.