CN113407429A - Task processing method and device - Google Patents

Info

Publication number
CN113407429A
Authority
CN
China
Prior art keywords
task
processing
data
execution
batch processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110700169.8A
Other languages
Chinese (zh)
Other versions
CN113407429B (en)
Inventor
陈兵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
Original Assignee
China Construction Bank Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Bank Corp
Priority to CN202110700169.8A
Publication of CN113407429A
Application granted
Publication of CN113407429B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466 Performance evaluation by tracing or monitoring
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466 Performance evaluation by tracing or monitoring
    • G06F11/3476 Data logging
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/362 Software debugging
    • G06F11/3636 Software debugging by tracing the execution of the program
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/542 Event management; Broadcasting; Multicasting; Notifications

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a task processing method and a task processing device, and relates to the technical field of big data. One embodiment of the method comprises: determining a batch processing task to be executed, acquiring fragmentation parameters from the attributes of the batch processing task, and judging whether the fragmentation parameters are set to preset values; if the judgment result is yes, determining a plurality of devices for processing the plurality of data, respectively distributing different data to corresponding devices for processing, summarizing the processing result, and obtaining an execution result; if the judgment result is negative, determining one device which processes the plurality of data together, and distributing the plurality of data together to the one device for processing to obtain an execution result. The implementation mode can realize the rapid processing of various batch processing requirements of an organization (such as a bank system), and simplifies the development and operation and maintenance processing of batch processing.

Description

Task processing method and device
Technical Field
The invention relates to the technical field of big data, in particular to a task processing method and a task processing device.
Background
With the development of technology and the continuous evolution of banking business, the volume of banking business data that must be processed in batches keeps growing; the data is of many types and the processing flows are complex. How to process data in batches quickly, stably, and flexibly, while smoothly transitioning existing batch processing, is a problem that currently needs to be addressed.
At present, various technical solutions, such as Kettle and DataStage, provide batch data processing. They are mainly used for processing data, but each has shortcomings, such as a complex development process, poor maintainability, and inconvenient configuration.
Taking DataStage as an example, a newly developed batch job is invoked by adding a job to DataStage and then adding a new schedule to Control-M. Control-M supports scheduling configured on a daily basis, but its support for batch jobs that must run multiple times a day is weak. Data is processed with a conventional ETL approach, and support for transactions is weak; if a workflow includes multiple steps, it is difficult to trace the specific cause when an error occurs during job execution. In addition, the processing chains of existing batch jobs are complex, so modifying them for optimization is difficult.
Disclosure of Invention
In view of this, embodiments of the present invention provide a task processing method and apparatus, which can at least solve the problems of the prior art in batch task processing, such as a complex development process, poor maintainability, and inconvenient configuration.
To achieve the above object, according to an aspect of an embodiment of the present invention, there is provided a task processing method including:
determining a batch processing task to be executed, acquiring fragmentation parameters from the attributes of the batch processing task, and judging whether the fragmentation parameters are set to preset values; wherein one batch processing task comprises a plurality of data to be processed;
if the judgment result is yes, determining a plurality of devices for processing the plurality of data, respectively distributing different data to corresponding devices for processing, summarizing the processing result, and obtaining an execution result; or
If the judgment result is negative, determining one device which processes the plurality of data together, and distributing the plurality of data together to the one device for processing to obtain an execution result.
Optionally, the determining a plurality of devices for processing the plurality of data includes:
and calculating the data volume distributed to each device by adopting a Hash algorithm according to the current processor utilization rate and memory utilization rate of each device and the data volume to be processed in the batch processing task.
Optionally, the determining a device that processes the plurality of data together includes:
and screening out one device with the smallest current load from a plurality of devices, and using the one device as a target device for processing the plurality of data together.
Optionally, the determining a device that processes the plurality of data together includes:
determining one or more devices which process the batch processing tasks historically, counting the historical execution duration of the batch processing tasks and the utilization rate of a processor when each batch processing task is processed, and further calculating the weight value of each device;
and screening out the device with the largest weight value from the one or more devices, and using the device as a target device for processing the plurality of data together.
Optionally, the allocating different data to corresponding devices respectively for processing, or allocating the multiple data to the device together for processing includes:
acquiring a serial parameter from the attributes of the batch processing task, and judging whether the serial parameter is set to a first preset value; wherein the serial parameter relates to multiple execution instances of the batch processing task, and one instance is generated each time the batch processing task is executed;
if the judgment result is yes, processing the data when the task execution period of the batch processing task is reached; or
if the judgment result is negative, acquiring the execution state of the previous execution instance of the batch processing task, and then processing the data if that execution state is execution completed.
Optionally, the processing the data if the execution state is execution completed further includes:
if the execution state is still not completed when the execution period of the batch processing task is reached, re-determining another device for processing the plurality of data together.
Optionally, the method further includes: and pulling a plurality of batch processing tasks to be executed from the database/operation path, and arranging the batch processing tasks according to the pulling sequence to generate the task queue to be executed.
Optionally, the method further includes: and when the data to be processed is a file, checking whether the file exists in the operation path or not based on the checking period, and if not, executing file waiting operation.
Optionally, the attribute further includes a task number, a name, an executable time period, and a task parameter; the task parameters comprise specific parameters required by task execution.
Optionally, the method further includes: and judging whether the current time period is in the executable time period or not, and if not, executing waiting operation.
Optionally, the method further includes: and in the process of executing the batch processing task, monitoring the execution state by using a monitoring mechanism, and if the execution state is abnormal, recording abnormal data, abnormal reasons and operating equipment to generate a task execution abnormal log.
Optionally, after the generating of the task execution exception log, the method further includes:
responding to the opening of the task execution exception log, and positioning an exception execution step according to the exception data and an exception reason; wherein a batch job comprises a plurality of execution steps.
Optionally, the method further includes: and if the abnormality occurs, sending a notification message or popping up a reminding message.
To achieve the above object, according to another aspect of an embodiment of the present invention, there is provided a task processing apparatus including:
a judging module, configured to determine a batch processing task to be executed, acquire fragmentation parameters from the attributes of the batch processing task, and judge whether the fragmentation parameters are set to preset values; wherein one batch processing task comprises a plurality of data to be processed;
a fragmentation module, configured to, if the judgment result is yes, determine a plurality of devices for processing the plurality of data, respectively distribute different data to corresponding devices for processing, and summarize the processing results to obtain an execution result; or
a non-fragmentation module, configured to, if the judgment result is negative, determine one device which processes the plurality of data together, and distribute the plurality of data together to the one device for processing to obtain an execution result.
Optionally, the fragmentation module is configured to: calculate the data volume distributed to each device by adopting a hash algorithm according to the current processor utilization rate and memory utilization rate of each device and the data volume to be processed in the batch processing task.
Optionally, the non-fragmentation module is configured to: and screening out one device with the smallest current load from a plurality of devices, and using the one device as a target device for processing the plurality of data together.
Optionally, the non-fragmentation module is configured to:
determining one or more devices which process the batch processing tasks historically, counting the historical execution duration of the batch processing tasks and the utilization rate of a processor when each batch processing task is processed, and further calculating the weight value of each device;
and screening out the device with the largest weight value from the one or more devices, and using the device as a target device for processing the plurality of data together.
Optionally, the system further comprises a serial module, configured to:
acquiring a serial parameter from the attributes of the batch processing task, and judging whether the serial parameter is set to a first preset value; wherein the serial parameter relates to multiple execution instances of the batch processing task, and one instance is generated each time the batch processing task is executed;
if the judgment result is yes, processing the data when the task execution period of the batch processing task is reached; or
if the judgment result is negative, acquiring the execution state of the previous execution instance of the batch processing task, and then processing the data if that execution state is execution completed.
Optionally, the serial module is further configured to:
if the execution state is still not completed when the execution period of the batch processing task is reached, re-determine another device for processing the plurality of data together.
Optionally, the system further includes a task pulling module, configured to:
and pulling a plurality of batch processing tasks to be executed from the database/operation path, and arranging the batch processing tasks according to the pulling sequence to generate the task queue to be executed.
Optionally, the task pulling module is further configured to: and when the data to be processed is a file, checking whether the file exists in the operation path or not based on the checking period, and if not, executing file waiting operation.
Optionally, the attribute further includes a task number, a name, an executable time period, and a task parameter; the task parameters comprise specific parameters required by task execution.
Optionally, the system further includes an execution module, configured to: and judging whether the current time period is in the executable time period or not, and if not, executing waiting operation.
Optionally, the system further includes an exception monitoring module, configured to: and in the process of executing the batch processing task, monitoring the execution state by using a monitoring mechanism, and if the execution state is abnormal, recording abnormal data, abnormal reasons and operating equipment to generate a task execution abnormal log.
Optionally, the system further includes an exception handling module, configured to: responding to the opening of the task execution exception log, and positioning an exception execution step according to the exception data and an exception reason; wherein a batch job comprises a plurality of execution steps.
Optionally, the method further includes: and if the abnormality occurs, sending a notification message or popping up a reminding message.
To achieve the above object, according to still another aspect of embodiments of the present invention, there is provided a task processing electronic device.
The electronic device of the embodiment of the invention comprises: one or more processors; the storage device is used for storing one or more programs, and when the one or more programs are executed by the one or more processors, the one or more processors realize any one of the task processing methods.
To achieve the above object, according to still another aspect of embodiments of the present invention, there is provided a computer-readable medium on which a computer program is stored, the program implementing any one of the above task processing methods when executed by a processor.
According to the scheme provided by the invention, one embodiment of the invention has the following advantages or beneficial effects: in order to enable the batch processing process to be faster, more flexible and controllable, a private cloud framework and a developed system general processing flow are combined, batch processing requirements can be rapidly developed and configured, the development efficiency and maintainability of the complex batch processing flow are improved, and monitoring and notification of tasks and operation execution states are supported.
Further effects of the above-mentioned non-conventional alternatives will be described below in connection with the embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic flow chart of a task processing method according to an embodiment of the present invention;
FIG. 2 is a flow diagram illustrating an alternative task processing method according to an embodiment of the invention;
FIG. 3 is a flow diagram illustrating an alternative task processing method according to an embodiment of the invention;
FIG. 4 is a flowchart illustrating an alternative task processing method according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the main modules of a task processing device according to an embodiment of the present invention;
FIG. 6 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;
FIG. 7 is a schematic block diagram of a computer system suitable for use with a mobile device or server implementing an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
The words involved in the present solution are explained as follows:
Batch processing: data processing in an IT system falls into two categories, online transactions and batch processing. An online transaction is an information processing mode that is initiated through various terminal devices over multiple channels and requires a timely response; its processing time is short and its timeliness is high. Batch processing is the processing, at a specific time, of the data accumulated by the system over a certain period, according to business requirements and the technical scheme; its processing time is relatively long and its processing efficiency is high, and it can meet requirements such as end-of-day processing, statistical settlement, and report analysis.
Batch Task (Task): a job chain consisting of one or more batch jobs.
Batch Job (Job): a single batch job, which is one step in a Task.
Batch Step (Step): one step in a single batch Job.
ETL: an abbreviation of Extract-Transform-Load, i.e., a process of data extraction, transformation, and loading.
A detailed comparison between the prior-art tools Kettle and DataStage and the present scheme follows:
1. Kettle is an open-source ETL tool for processing, converting, and migrating various kinds of data. Development in Kettle is client-based, and configuration, development, and management become hard to handle when there are a large number of jobs. Its transaction support is weak, so it is insufficient for complex job flows that require transaction control. The present scheme provides three-level Task-Job-Step flow control, can be flexibly configured, essentially offers the complete transaction control of Spring Batch, and can meet the transaction processing requirements of complex flows.
2. DataStage provides the functionality, flexibility, and scalability needed for data integration, and supports various complex data transformations and processes. However, the DataStage tool is heavyweight and places high demands on hardware, its own scheduling function is weak, and a separate scheduling and monitoring program generally has to be developed when there are a large number of jobs. Its transaction support is weak, so it is insufficient for complex job flows that require transaction control, and problems are hard to locate when a job fails. The present scheme integrates job scheduling and batch processing, and can run either on a single machine or in cluster mode. It supports flexible scheduling configuration and complete transaction processing, and provides built-in batch tools that make it convenient to customize job flows. When a processing flow fails, the cause can be located quickly, which simplifies system operation and maintenance.
Referring to fig. 1, a main flowchart of a task processing method according to an embodiment of the present invention is shown, including the following steps:
S101: determining a batch processing task to be executed, acquiring fragmentation parameters from the attributes of the batch processing task, and judging whether the fragmentation parameters are set to preset values; wherein one batch processing task comprises a plurality of data to be processed;
S102: if the judgment result is yes, determining a plurality of devices for processing the plurality of data, respectively distributing different data to corresponding devices for processing, summarizing the processing result, and obtaining an execution result;
S103: if the judgment result is negative, determining one device which processes the plurality of data together, and distributing the plurality of data together to the one device for processing to obtain an execution result.
In the above embodiment, for step S101, a task scheduling period, that is, the task execution interval, is set in advance, for example once every night at 22:00, once a month, once every 5 minutes, or once every minute. According to the configured task scheduling period, a plurality of batch processing tasks to be executed are pulled from the database or the operation path and arranged in pull order to generate a to-be-executed task queue containing those tasks. The operation path may be a specific location on disk and is typically used for storing files. When the data to be processed is a file, the operation path is checked once per check period for the file; if the file does not exist, the task waits, and if it exists, processing can proceed.
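The queueing and file-waiting behavior described for step S101 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function names, the `max_checks` limit, and the temporary directory standing in for the operation path are all assumptions introduced here.

```python
from collections import deque
from pathlib import Path
import tempfile
import time


def build_task_queue(pulled_tasks):
    """Arrange pulled tasks in pull order to form the to-be-executed queue."""
    return deque(pulled_tasks)


def wait_for_file(path, check_period_s, max_checks):
    """Check once per check period whether `path` exists.

    Returns True as soon as the file is found, False after `max_checks`
    unsuccessful checks. `max_checks` is an assumption; the source does not
    state a limit on waiting.
    """
    for _ in range(max_checks):
        if Path(path).exists():
            return True
        time.sleep(check_period_s)  # file-waiting operation
    return False


# Hypothetical task names, queued in the order they were pulled.
queue = build_task_queue(["task_A", "task_B", "task_C"])
first = queue.popleft()

# A temporary directory stands in for the operation path.
with tempfile.TemporaryDirectory() as operation_path:
    data_file = Path(operation_path) / "input.dat"
    missing = wait_for_file(data_file, check_period_s=0.01, max_checks=2)
    data_file.write_text("record-1\n")
    present = wait_for_file(data_file, check_period_s=0.01, max_checks=2)
```

A real scheduler would presumably re-enter the wait on the next scheduling period rather than give up after a fixed number of checks.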
A batch Task (Task) is composed of one or more batch jobs (Job), each containing one or more batch steps (Step). Each batch Step (Step) comprises specific operations and data, i.e. specific workflow logic processes, whereby one batch task contains multiple data to be processed.
The pre-configured attributes of the batch processing task to be executed are then obtained. They include the task number, name, task execution state, execution cycle, serial parameter, fragmentation parameter, executable time period, task parameters (the specific parameters required for task execution), and the like. Before the task is executed, it is first determined whether the current time falls within the executable time period; if not, a waiting operation is performed.
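As a rough illustration of the attribute set and the executable-time-period check, the sketch below models the attributes as a data class. The field names, the example values, and the handling of a window that crosses midnight are assumptions, not details given in the source.

```python
from dataclasses import dataclass, field
from datetime import time


@dataclass
class BatchTaskAttributes:
    # Field names are hypothetical; they mirror the attributes listed above.
    task_number: str
    name: str
    execution_state: str
    execution_cycle: str
    serial: int
    fragmentation: int
    window_start: time
    window_end: time
    task_params: dict = field(default_factory=dict)


def in_executable_period(attrs, now):
    """Return True if `now` lies inside the task's executable time period.

    Windows that cross midnight (for example 22:00 to 02:00) are treated as
    wrapping around; this handling is an assumption.
    """
    if attrs.window_start <= attrs.window_end:
        return attrs.window_start <= now <= attrs.window_end
    return now >= attrs.window_start or now <= attrs.window_end


attrs = BatchTaskAttributes(
    task_number="T001",
    name="daily-settlement",
    execution_state="PENDING",
    execution_cycle="daily",
    serial=0,
    fragmentation=1,
    window_start=time(22, 0),
    window_end=time(2, 0),
)
run_now = in_executable_period(attrs, time(23, 30))   # inside the window
must_wait = not in_executable_period(attrs, time(12, 0))  # outside: wait
```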
If the value of the fragmentation parameter is 1, the to-be-processed data can be distributed to a plurality of hosts to be executed when the task is executed; if the value is 0, the data to be processed needs to be distributed to a single host to be executed when the task is executed.
If the value of the serial parameter is 0, the serial parameter indicates that the instance executed each time by the same task does not depend on the state executed by the previous instance; if the value is 1, it indicates that each instance executing the same task must wait for the previous instance to complete (whether success or failure) before executing; if the value is 2, it indicates that each instance executing the same task must wait for the previous instance to execute successfully.
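The three serial parameter values can be read as a small gating rule on the previous instance's state. The sketch below assumes hypothetical state names (`"RUNNING"`, `"SUCCESS"`, `"FAILED"`) and assumes that a task with no previous instance may always start; neither is specified in the source.

```python
def may_start(serial, previous_state):
    """Decide whether a new instance of a task may start.

    serial: 0, 1, or 2, as described above.
    previous_state: None if there is no previous instance, otherwise one of
    the hypothetical state names "RUNNING", "SUCCESS", or "FAILED".
    """
    if serial == 0 or previous_state is None:
        return True  # no dependence on the previous instance
    if serial == 1:
        # Wait until the previous instance has completed, success or failure.
        return previous_state in ("SUCCESS", "FAILED")
    if serial == 2:
        # Wait until the previous instance has completed successfully.
        return previous_state == "SUCCESS"
    raise ValueError(f"unknown serial parameter: {serial}")


decisions = {
    (s, st): may_start(s, st)
    for s in (0, 1, 2)
    for st in ("RUNNING", "SUCCESS", "FAILED")
}
```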
If the current time is within the task's executable time period, it is then determined whether the fragmentation flag in the attributes is set to yes. If so, the data in the batch processing task supports fragmented processing, and different data can be allocated to different hosts (also called devices) for processing; if not, all data to be processed for the batch processing task is allocated to a single host. Suppose a batch processing cluster has 3 hosts: in fragmentation mode, a task is executed on all 3 hosts simultaneously, each host processing roughly one third of the data, which mainly suits cases with a large data volume that require relatively fast processing; in non-fragmentation mode, all data is allocated to one host for execution.
In the method provided in the foregoing embodiment, a fragmentation flag is set in a task attribute to determine whether to process multiple data in a task in a fragmentation mode. The implementation mode can realize the rapid processing of various batch processing requirements of an organization (such as a bank system), and simplifies the development and operation and maintenance processing of batch processing.
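The branch on the fragmentation parameter in steps S101 to S103 might be sketched as below. The round-robin split, the `process` callback, and the sequential loop standing in for dispatch to real hosts are all assumptions made for illustration; the source does not prescribe how data is partitioned in this embodiment.

```python
def split_round_robin(items, n):
    """Split `items` into `n` shards by taking every n-th element."""
    return [items[i::n] for i in range(n)]


def execute_task(data, fragmentation, hosts, process):
    """Run a batch task in fragmentation or non-fragmentation mode.

    process(host, shard) is a hypothetical callback returning a list of
    results; here it is called in a loop rather than on remote hosts.
    """
    if fragmentation == 1:
        shards = split_round_robin(data, len(hosts))
        partials = [process(h, s) for h, s in zip(hosts, shards)]
        # Summarize the per-host results into one execution result.
        return [r for partial in partials for r in partial]
    # Non-fragmentation mode: all data goes to a single host.
    return process(hosts[0], data)


def double_on(host, shard):
    """Toy processing step: tag each doubled value with the host name."""
    return [(host, x * 2) for x in shard]


hosts = ["host1", "host2", "host3"]
data = [1, 2, 3, 4, 5, 6]
sharded = execute_task(data, 1, hosts, double_on)
single = execute_task(data, 0, hosts, double_on)
```

In this toy run each of the three hosts handles one third of the six records in fragmentation mode, while `single` shows every record tagged with the same host.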
Referring to fig. 2, a schematic flow chart of an alternative task processing method according to an embodiment of the present invention is shown, including the following steps:
S201: determining a batch processing task to be executed, acquiring fragmentation parameters from the attributes of the batch processing task, and judging whether the fragmentation parameters are set to preset values; wherein one batch processing task comprises a plurality of data to be processed;
S202: if the judgment result is yes, calculating, by using a hash algorithm, the amount of data allocated to each device according to the current processor utilization and memory utilization of each device and the amount of data to be processed in the batch processing task, and summarizing the processing results to obtain an overall processing result;
S203: if the judgment result is negative, screening out the one device with the smallest current load from the multiple devices, and taking that device as the target device for processing the multiple data together to obtain an execution result;
S204: if the judgment result is negative, determining one or more devices that have historically processed the batch processing task, counting the historical execution durations of the batch processing task and the processor utilization during each execution, and calculating a weight value for each device;
S205: screening out the device with the largest weight value from the one or more devices, and taking that device as the target device for processing the plurality of data together to obtain an execution result.
In the above embodiment, for step S201, refer to the description of step S101 shown in fig. 1, and no further description is provided herein.
In step S202, assume that the batch processing task to be executed contains 20 pieces of data to be processed and that there are 10 hosts in total. If the task supports fragmented execution, a hash algorithm is used: the share allocated to each host is calculated from that host's current CPU utilization and memory utilization, and the share is multiplied by the amount of data to be processed to obtain the amount of data allocated to each host. The algorithm ensures that all data is allocated and that it is divided roughly evenly among the 10 hosts.
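The source names a hash algorithm but gives no formula for the per-host share. The sketch below is therefore only one plausible reading: it assumes the share is proportional to each host's spare CPU plus spare memory, and it hands out rounding leftovers by largest fractional part so that every piece of data is allocated.

```python
def allocate(total, hosts):
    """Compute how many pieces of data each host receives.

    hosts maps a host name to (cpu_utilization, memory_utilization), each in
    the range 0..1. The weighting by spare capacity is an assumption.
    """
    weights = {h: (1 - cpu) + (1 - mem) for h, (cpu, mem) in hosts.items()}
    weight_sum = sum(weights.values())
    if weight_sum == 0:
        raise ValueError("no host has spare capacity")
    # Share of each host multiplied by the amount of data to be processed.
    exact = {h: total * w / weight_sum for h, w in weights.items()}
    quota = {h: int(x) for h, x in exact.items()}
    # Hand out what rounding left over, largest fractional part first,
    # so that the quotas always add up to `total`.
    leftover = total - sum(quota.values())
    by_fraction = sorted(exact, key=lambda h: exact[h] - quota[h], reverse=True)
    for h in by_fraction[:leftover]:
        quota[h] += 1
    return quota


# 20 pieces of data over 10 equally loaded hosts, as in the example above.
even_hosts = {f"host{i}": (0.5, 0.5) for i in range(1, 11)}
even = allocate(20, even_hosts)

# Two hosts with different loads: the idle host receives the larger share.
uneven = allocate(10, {"idle": (0.2, 0.2), "busy": (0.8, 0.8)})
```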
For step S203, if the task does not support fragmented execution, the host with the smallest current load can be selected from the 10 hosts to process the 20 pieces of data together. The same approach can be used when many tasks are waiting to be processed. For example, with task A on host 2 and task B on host 3, each new assignment first determines which hosts are handling the earlier tasks and then selects the host with the smallest current load; a single host may sometimes process several tasks.
Dynamic programming algorithms are typically used to solve problems with some optimal nature. In such a problem, there may be many possible solutions, each corresponding to a value, for which it is desirable to find a solution with an optimal value. The basic idea is to decompose the problem to be solved into a plurality of sub-problems, solve the sub-problems first, and then obtain the solution of the original problem from the solutions of the sub-problems.
Suppose there are 3 hosts and task A can be allocated to any of them; there are then 3 feasible solutions. Using the dynamic programming algorithm, the task is allocated to the host with the smallest load, which is the optimal solution. The optimal solution is searched for each time a new task execution instance is added. If 10 tasks are to be executed, proceeding in this way ultimately yields the most balanced task allocation across the 3 hosts.
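The per-task step described here, picking the least-loaded host each time a new task instance arrives, can be sketched with a min-heap. Although the text refers to dynamic programming, the sketch only implements that per-task selection; treating every task as one unit of load is an assumption.

```python
import heapq
from collections import Counter


def assign_tasks(tasks, hosts):
    """Assign each task to the host with the smallest current load.

    Every task counts as one unit of load (an assumption). Ties are broken
    by host name because heap entries are (load, host) tuples.
    """
    heap = [(0, h) for h in hosts]
    heapq.heapify(heap)
    assignment = {}
    for task in tasks:
        load, host = heapq.heappop(heap)  # least-loaded host right now
        assignment[task] = host
        heapq.heappush(heap, (load + 1, host))
    return assignment


tasks = [f"task{i}" for i in range(1, 11)]
assignment = assign_tasks(tasks, ["host1", "host2", "host3"])
per_host = Counter(assignment.values())
```

With 10 tasks and 3 hosts the loads end up as balanced as integers allow: one host receives four tasks and the other two receive three each.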
For steps S204 to S205, as an alternative for tasks that do not support fragmented execution, the historical execution durations of the task, the devices that have historically processed it, and the CPU utilization of those devices while processing it can be collected to calculate a weight value for each device; the device with the largest weight value is then selected to process the task.
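The source does not say how the weight value is derived from execution duration and CPU utilization. As one hedged possibility, the sketch below assumes that shorter average durations and lower average CPU utilization should yield a larger weight; the formula and the sample history are purely illustrative.

```python
from statistics import mean


def device_weight(history):
    """Compute a weight from a device's history for one task.

    history is a list of (duration_seconds, cpu_utilization) pairs with
    positive durations. The formula is an assumption: faster and less
    CPU-hungry past runs give a larger weight.
    """
    avg_duration = mean(d for d, _ in history)
    avg_cpu = mean(c for _, c in history)
    return 1.0 / (avg_duration * (1.0 + avg_cpu))


def pick_device(histories):
    """Return the device with the largest weight among those with history."""
    weights = {dev: device_weight(h) for dev, h in histories.items() if h}
    return max(weights, key=weights.get)


# Hypothetical history of one task on two devices.
histories = {
    "host1": [(120, 0.7), (100, 0.8)],
    "host2": [(60, 0.4), (80, 0.5)],
}
target = pick_device(histories)
```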
The method provided by the embodiment adopts different modes to determine the equipment for processing the data for the situations of task fragmentation execution and non-fragmentation execution, thereby realizing the high efficiency of data processing.
Referring to fig. 3, a schematic flow chart of an alternative task processing method according to an embodiment of the present invention is shown, including the following steps:
S301: acquiring a serial parameter from the attributes of the batch processing task, and judging whether the serial parameter is set to a first preset value; wherein the serial parameter corresponds to multiple execution instances of the batch processing task, and one instance is generated each time the batch processing task is executed;
S302: if the judgment result is yes, processing the data when the task execution period of the batch processing task is reached;
S303: if the judgment result is no, acquiring the execution state of the previous execution instance of the batch processing task, and processing the data only when that execution state is execution completed;
S304: if the execution state is still not completed when the execution period of the batch processing task is reached, re-determining another device to process the plurality of data together.
In the above embodiment, for steps S301 to S304, if the result of the judgment on the serial flag in the batch processing task attributes is yes, the batch processing task does not support parallel execution; otherwise, parallel execution is supported. Parallel means that multiple instances of a task may exist at the same time; non-parallel means that a task has only one instance at any time.
Each execution of a task generates an instance, and each execution depends on the resulting state of the previous instance. For example, task A is set to execute every 5 minutes in serial mode, so it is triggered at 19:55, 20:00, and 20:05, generating 3 instances. If the instance started at 19:55 has not finished by 20:00, another device to process task A is re-determined at 20:00. Similarly, if the 20:00 instance has not finished by 20:05, another device to process task A is re-determined at 20:05; and if the 20:00 instance has still not finished at 20:10, another device to process task A is re-determined at 20:10.
As another example, task A is preset to execute once a day at 22:00, but at 22:00 on the 2nd, the execution of task A started on the 1st has not yet finished. In the serial processing mode, the execution for the 2nd must wait until the execution for the 1st has finished, so it continues to wait even after 22:00 on the 2nd has arrived. In the parallel processing mode, the execution for the 2nd starts when 22:00 on the 2nd arrives, regardless of whether or when the execution for the 1st finishes.
In the method provided by the above embodiment, the device processes the data in a task differently in the serial mode and the parallel mode. In the serial mode in particular, the execution state of the previous instance must be considered, and if that state does not meet the requirement, the processing device is re-determined so that the task is processed in time.
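The decision made in steps S301 to S304 when an execution period arrives can be sketched in Python as below. The sketch follows the step text literally. The constant `FIRST_PRESET_VALUE`, the `State` enum, the treatment of a missing previous instance as completed, and the choice of the first other device as the replacement are all assumptions made for illustration.

```python
from enum import Enum


class State(Enum):
    RUNNING = "running"
    COMPLETED = "completed"


FIRST_PRESET_VALUE = 1  # hypothetical value of the serial parameter checked in S301


def on_period_reached(serial_param, previous_state, current_device, devices):
    """Decide what to do when the task execution period arrives.

    Returns an (action, device) pair.
    """
    if serial_param == FIRST_PRESET_VALUE:
        # S302: process the data as soon as the period is reached.
        return ("process", current_device)
    if previous_state is None or previous_state is State.COMPLETED:
        # S303: the previous instance has finished (or there is none, an
        # assumption), so the data can be processed.
        return ("process", current_device)
    # S304: the previous instance is still unfinished; pick another device.
    # Taking the first other device is a placeholder for a real selection rule.
    others = [d for d in devices if d != current_device]
    return ("reassign", others[0] if others else current_device)
```

For example, with the serial parameter not set to the first preset value and the 19:55 instance still running at 20:00, the call returns a "reassign" action naming a different device.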
Referring to fig. 4, a schematic flow chart of another alternative task processing method according to the embodiment of the present invention is shown, which includes the following steps:
S401: determining a batch processing task to be executed, acquiring a fragmentation parameter from the attributes of the batch processing task, and judging whether the fragmentation parameter is set to a preset value; wherein one batch processing task comprises a plurality of data to be processed;
S402: if the judgment result is yes, determining a plurality of devices for processing the plurality of data, distributing different data to the corresponding devices for processing, and summarizing the processing results to obtain an execution result;
S403: if the judgment result is no, determining one device for processing the plurality of data together, and distributing the plurality of data together to the one device for processing to obtain an execution result;
S404: monitoring the execution state while the batch processing task is being executed, and if the execution state is abnormal, recording the abnormal data, the operating device, and the abnormal reason to generate a task execution exception log;
S405: in response to the task execution exception log being opened, locating the abnormal execution step according to the abnormal data and the abnormal reason; wherein a batch processing task comprises a plurality of execution steps.
In the above embodiment, for steps S401 to S403, reference may be made to the description of fig. 1 to fig. 3, which is not repeated here.
For steps S404 and S405, while the task is executing, an event listener monitors the execution state of the task in each execution step, so that a notification message is sent when an error occurs and the task configuration can be optimized. Developers can later open the task execution exception log and use the abnormal data, abnormal reasons, and operating device recorded in it to locate the abnormal execution step, and thereby pinpoint and analyze the specific problem.
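The logging and locating behavior of steps S404 and S405 can be sketched in plain Python as follows. This does not use the Spring Batch listener API; it only illustrates the idea of recording the abnormal data, the reason, and the operating device per step. The step functions, record fields, and device name are hypothetical.

```python
def run_steps(steps, records, device, log):
    """Run each step over the records, logging failures instead of stopping.

    steps: list of (step_name, function) pairs.
    log: list that collects one dict per abnormal record.
    """
    for step_name, func in steps:
        for record in records:
            try:
                func(record)
            except Exception as exc:
                log.append({
                    "step": step_name,
                    "data": record,
                    "device": device,
                    "reason": f"{type(exc).__name__}: {exc}",
                })


def locate_abnormal_steps(log):
    """Return the names of the steps that produced at least one log entry."""
    return sorted({entry["step"] for entry in log})


def parse_amount(record):
    float(record["amount"])  # raises ValueError on malformed input


def check_account(record):
    if not record.get("account"):
        raise KeyError("account")


records = [
    {"account": "A1", "amount": "10.5"},
    {"account": "A2", "amount": "abc"},
]
log = []
run_steps([("parse", parse_amount), ("check", check_account)], records, "host2", log)
```

Here the malformed amount produces a single log entry for the "parse" step, which is the step a developer would inspect first.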
In addition, the scheme can encapsulate the job's task parameters according to the job steps in the task configuration. Besides the task configuration attributes described earlier, the job task parameters include parameters that can be determined only when the task starts to execute, such as the current service date, the task queue number, the path of the data file used by the task, the waiting-detection interval for the data file, the rollback file path, and the commit size for batch writes to the database. Because these values are known only at start-up, encapsulation is required.
The task parameters are encapsulated in the first stage. When the task is executed in the second stage, the task parameters used by each batch processing step may differ, and each step also has its own parameters. The parameters a step requires must therefore be taken out of the task parameters and assembled together with the step's own parameters.
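The two stages described above (encapsulating the task parameters at start-up, then assembling the parameters of each step) can be sketched in Python as follows. All key names, values, and function names are hypothetical, and letting a step's own value win on a name clash is an assumption.

```python
def encapsulate(config, runtime):
    """Stage 1: merge static task configuration with values known only at start-up."""
    params = dict(config)
    params.update(runtime)
    return params


def assemble_step_params(task_params, required_keys, step_params):
    """Stage 2: pick the keys a step needs, then add the step's own parameters.

    On a name clash the step's own value wins (an assumption).
    """
    picked = {key: task_params[key] for key in required_keys}
    picked.update(step_params)
    return picked


task_params = encapsulate(
    {"task_no": "T001", "name": "daily-settlement"},
    {
        "service_date": "2021-06-23",
        "queue_no": 3,
        "data_file": "/data/in/settle.csv",
        "commit_size": 500,
    },
)
load_params = assemble_step_params(
    task_params, ["data_file", "commit_size"], {"skip_header": True}
)
```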
The method provided by this embodiment uses an enhanced event listener to record erroneous data and the cause of the error in a log file during task execution, making it easy to locate the specific problem quickly from the cause and the data, and improving the operation monitoring and maintenance of batch processing tasks.
Compared with the prior art, the method provided by the embodiment of the invention has at least the following beneficial effects:
1. The state of each stage of the job life cycle is easy to track by means of the multi-faceted event listening mechanism provided by Spring Batch. The event-based exception handling mechanism can accurately locate where an exception occurred when task execution is abnormal, so that developers can handle the exception in time.
2. The method is based on the open-source batch processing framework Spring Batch and the Quartz scheduling framework, is combined with a private cloud system, provides purpose-built functions such as flow configuration, operation monitoring, and event notification, and is developed in Java. Developers therefore benefit from a quick start, a simple development flow, and convenient operation and maintenance, and the scheme is technically innovative.
3. The batch processing system is convenient to implement and configure, simple to operate and maintain, highly usable, and easy to integrate quickly. It addresses the problems that existing batch processing frameworks are too heavyweight, require specialized technical personnel for development, and struggle to cope with large volumes of requirements. It enables rapid development of batch processing jobs and improves the development efficiency and maintainability of complex batch processing programs.
Referring to fig. 5, a schematic diagram of main modules of a task processing device 500 according to an embodiment of the present invention is shown, including:
a judging module 501, configured to determine a batch processing task to be executed, acquire a fragmentation parameter from the attributes of the batch processing task, and judge whether the fragmentation parameter is set to a preset value; wherein one batch processing task comprises a plurality of data to be processed;
a fragmentation module 502, configured to, if the judgment result is yes, determine a plurality of devices for processing the plurality of data, distribute different data to the corresponding devices for processing, and summarize the processing results to obtain an execution result; or
a non-fragmentation module 503, configured to, if the judgment result is no, determine one device for processing the plurality of data together, and distribute the plurality of data together to the one device for processing to obtain an execution result.
In the apparatus of the present invention, the fragmentation module 502 is configured to:
calculate the data volume distributed to each device by using a hash algorithm, according to the current processor utilization and memory utilization of each device and the data volume to be processed in the batch processing task.
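One possible reading of this calculation is sketched below in Python. The original states only that a hash algorithm is applied to these inputs, so everything else is an assumption: the spare-capacity formula, the largest-remainder rounding, and the use of an MD5 digest to place individual records are illustrative choices, and the names `quotas` and `device_for` are hypothetical.

```python
import hashlib


def quotas(devices, total):
    """Split `total` items across devices in proportion to spare capacity.

    devices: dict name -> (cpu_utilization, memory_utilization), each in 0..1.
    """
    spare = {d: (1 - cpu) + (1 - mem) for d, (cpu, mem) in devices.items()}
    whole = sum(spare.values())
    if whole == 0:
        # Every device is saturated: fall back to an even split.
        spare = {d: 1.0 for d in devices}
        whole = float(len(devices))
    shares = {d: total * s / whole for d, s in spare.items()}
    result = {d: int(x) for d, x in shares.items()}
    # Hand the leftover items to the largest fractional remainders.
    leftover = total - sum(result.values())
    by_remainder = sorted(shares, key=lambda d: shares[d] - result[d], reverse=True)
    for d in by_remainder[:leftover]:
        result[d] += 1
    return result


def device_for(key, quota):
    """Map a record key to a device with a stable hash over quota-sized slot ranges."""
    total = sum(quota.values())
    slot = int(hashlib.md5(key.encode()).hexdigest(), 16) % total
    for device, size in quota.items():
        if slot < size:
            return device
        slot -= size


devices = {"host1": (0.2, 0.3), "host2": (0.6, 0.5), "host3": (0.9, 0.8)}
q = quotas(devices, 20)
```

Note that `device_for` spreads records in proportion to the quotas on average; it does not guarantee that each device receives exactly its quota.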
In the apparatus of the present invention, the non-fragmentation module 503 is configured to:
select the device with the smallest current load from a plurality of devices, and use it as the target device for processing the plurality of data together.
In the apparatus of the present invention, the non-fragmentation module 503 is configured to:
determine one or more devices that have historically processed the batch processing task, collect the historical execution duration of the batch processing task and the processor utilization each time the batch processing task was processed, and calculate a weight value for each device accordingly;
select the device with the largest weight value from the one or more devices, and use it as the target device for processing the plurality of data together.
The apparatus of the present invention further comprises a serial module configured to:
acquire a serial parameter from the attributes of the batch processing task, and judge whether the serial parameter is set to a first preset value; wherein the serial parameter corresponds to multiple execution instances of the batch processing task, and one instance is generated each time the batch processing task is executed;
if the judgment result is yes, process the data when the task execution period of the batch processing task is reached; or
if the judgment result is no, acquire the execution state of the previous execution instance of the batch processing task, and process the data only when that execution state is execution completed.
In the apparatus of the present invention, the serial module is further configured to:
if the execution state is still not completed when the execution period of the batch processing task is reached, re-determine another device to process the plurality of data together.
The apparatus of the present invention further comprises a task pulling module configured to:
pull a plurality of batch processing tasks to be executed from the database or the operation path, and arrange the batch processing tasks in the order in which they were pulled to generate a task queue to be executed.
In the apparatus of the present invention, the task pulling module is further configured to: when the data to be processed is a file, check, at each checking period, whether the file exists in the operation path, and if not, execute a file waiting operation.
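The periodic file check and file waiting operation can be sketched in Python as follows. The function name `wait_for_file`, the `max_checks` bound, and the injectable `sleep` parameter are assumptions added for illustration; the original does not state an upper limit on waiting.

```python
import os
import time


def wait_for_file(path, check_period, max_checks, sleep=time.sleep):
    """Check for `path` once per check period; return True once it exists.

    max_checks bounds the wait (an assumption; the source sets no limit).
    sleep is injectable so that the waiting behavior can be tested quickly.
    """
    for attempt in range(max_checks):
        if os.path.exists(path):
            return True
        if attempt < max_checks - 1:
            sleep(check_period)  # the file waiting operation
    return False
```

A caller might invoke `wait_for_file(data_file, 60, 30)` to wait up to roughly half an hour for an input file before giving up.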
In the apparatus of the present invention, the attributes further comprise a task number, a name, an executable time period, and task parameters; the task parameters comprise the specific parameters required for task execution.
The apparatus of the present invention further comprises an execution module configured to:
judge whether the current time is within the executable time period, and if not, execute a waiting operation.
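The executable time period check can be sketched in Python as below. Treating the period as a half-open interval and supporting periods that cross midnight are assumptions, as is the function name `in_executable_period`.

```python
from datetime import time


def in_executable_period(now, start, end):
    """Return True if `now` falls inside [start, end).

    Periods where start is later than end are treated as crossing midnight.
    """
    if start <= end:
        return start <= now < end
    return now >= start or now < end
```

If the function returns False, the caller performs the waiting operation and checks again later.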
The apparatus of the present invention further comprises an exception monitoring module configured to:
monitor the execution state by using a listening mechanism while the batch processing task is being executed and, if the execution state is abnormal, record the abnormal data, the abnormal reason, and the operating device to generate a task execution exception log.
The apparatus of the present invention further comprises an exception handling module configured to:
in response to the task execution exception log being opened, locate the abnormal execution step according to the abnormal data and the abnormal reason; wherein a batch processing task comprises a plurality of execution steps.
The apparatus of the present invention is further configured to: if an exception occurs, send a notification message or pop up a reminder message.
In addition, the detailed implementation of the apparatus in the embodiment of the present invention has been described in detail in the above method, and is not repeated here.
Fig. 6 shows an exemplary system architecture 600 in which embodiments of the invention may be applied, including terminal devices 601, 602, 603, a network 604 and a server 605 (by way of example only).
The terminal devices 601, 602, 603 may be various electronic devices having display screens and supporting web browsing, and installed with various communication client applications, and users may interact with the server 605 through the network 604 using the terminal devices 601, 602, 603 to receive or transmit messages and the like.
The network 604 serves as a medium for providing communication links between the terminal devices 601, 602, 603 and the server 605. The network 604 may include various types of connections, such as wired or wireless communication links, or fiber optic cables.
The server 605 may be a server providing various services; it is configured to execute the batch processing tasks to be executed, and to perform a fragmentation operation or a non-fragmentation operation according to the task fragmentation attribute.
It should be noted that the method provided by the embodiment of the present invention is generally executed by the server 605, and accordingly, the apparatus is generally disposed in the server 605.
It should be understood that the number of terminal devices, networks, and servers in fig. 6 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 7, shown is a block diagram of a computer system 700 suitable for use with a terminal device implementing an embodiment of the present invention. The terminal device shown in fig. 7 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 7, the computer system 700 includes a Central Processing Unit (CPU)701, which can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)702 or a program loaded from a storage section 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data necessary for the operation of the system 700 are also stored. The CPU 701, the ROM 702, and the RAM 703 are connected to each other via a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
The following components are connected to the I/O interface 705: an input portion 706 including a keyboard, a mouse, and the like; an output section 707 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage section 708 including a hard disk and the like; and a communication section 709 including a network interface card such as a LAN card, a modem, or the like. The communication section 709 performs communication processing via a network such as the internet. A drive 710 is also connected to the I/O interface 705 as needed. A removable medium 711 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 710 as necessary, so that a computer program read out therefrom is mounted into the storage section 708 as necessary.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 709, and/or installed from the removable medium 711. The computer program performs the above-described functions defined in the system of the present invention when executed by the Central Processing Unit (CPU) 701.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present invention may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor comprising a judging module, a fragmentation module, and a non-fragmentation module. The names of these modules do not, in some cases, limit the modules themselves; for example, a sharded module may also be described as a "non-sharded execution module".
As another aspect, the present invention also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist separately without being incorporated into the apparatus. The computer-readable medium carries one or more programs which, when executed by a device, cause the device to perform:
determining a batch processing task to be executed, acquiring a fragmentation parameter from the attributes of the batch processing task, and judging whether the fragmentation parameter is set to a preset value; wherein one batch processing task comprises a plurality of data to be processed;
if the judgment result is yes, determining a plurality of devices for processing the plurality of data, distributing different data to the corresponding devices for processing, and summarizing the processing results to obtain an execution result; or
if the judgment result is no, determining one device for processing the plurality of data together, and distributing the plurality of data together to the one device for processing to obtain an execution result.
According to the technical scheme of the embodiment of the invention, a fragmentation flag is set in the task attributes to determine whether the plurality of data in the task are processed in a fragmented manner. This implementation enables the various batch processing requirements of an organization (such as a banking system) to be handled quickly, and simplifies the development, operation, and maintenance of batch processing.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (16)

1. A task processing method, comprising:
determining a batch processing task to be executed, acquiring a fragmentation parameter from the attributes of the batch processing task, and judging whether the fragmentation parameter is set to a preset value; wherein one batch processing task comprises a plurality of data to be processed;
if the judgment result is yes, determining a plurality of devices for processing the plurality of data, distributing different data to the corresponding devices for processing, and summarizing the processing results to obtain an execution result; or
if the judgment result is no, determining one device for processing the plurality of data together, and distributing the plurality of data together to the one device for processing to obtain an execution result.
2. The method of claim 1, wherein determining a plurality of devices to process the plurality of data comprises:
calculating the data volume distributed to each device by using a hash algorithm, according to the current processor utilization and memory utilization of each device and the data volume to be processed in the batch processing task.
3. The method of claim 1, wherein determining a device that processes the plurality of data together comprises:
selecting the device with the smallest current load from a plurality of devices, and using it as the target device for processing the plurality of data together.
4. The method of claim 1, wherein determining a device that processes the plurality of data together comprises:
determining one or more devices that have historically processed the batch processing task, collecting the historical execution duration of the batch processing task and the processor utilization each time the batch processing task was processed, and calculating a weight value for each device accordingly;
selecting the device with the largest weight value from the one or more devices, and using it as the target device for processing the plurality of data together.
5. The method according to claim 1, wherein the distributing different data to the corresponding devices for processing, or the distributing the plurality of data together to the one device for processing, comprises:
acquiring a serial parameter from the attributes of the batch processing task, and judging whether the serial parameter is set to a first preset value; wherein the serial parameter corresponds to multiple execution instances of the batch processing task, and one instance is generated each time the batch processing task is executed;
if the judgment result is yes, processing the data when the task execution period of the batch processing task is reached; or
if the judgment result is no, acquiring the execution state of the previous execution instance of the batch processing task, and processing the data only when that execution state is execution completed.
6. The method of claim 5, wherein the processing the data only when that execution state is execution completed further comprises:
if the execution state is still not completed when the execution period of the batch processing task is reached, re-determining another device to process the plurality of data together.
7. The method of claim 1, further comprising:
pulling a plurality of batch processing tasks to be executed from the database or the operation path, and arranging the batch processing tasks in the order in which they were pulled to generate a task queue to be executed.
8. The method of claim 7, further comprising: when the data to be processed is a file, checking, at each checking period, whether the file exists in the operation path, and if not, executing a file waiting operation.
9. The method of any of claims 1-8, wherein the attributes further include a task number, a name, an executable time period, and task parameters; the task parameters comprise the specific parameters required for task execution.
10. The method of claim 9, further comprising:
judging whether the current time is within the executable time period, and if not, executing a waiting operation.
11. The method of claim 1, further comprising:
monitoring the execution state by using a listening mechanism while the batch processing task is being executed, and if the execution state is abnormal, recording the abnormal data, the abnormal reason, and the operating device to generate a task execution exception log.
12. The method of claim 11, after the generating a task execution exception log, further comprising:
in response to the task execution exception log being opened, locating the abnormal execution step according to the abnormal data and the abnormal reason; wherein a batch processing task comprises a plurality of execution steps.
13. The method of claim 11, further comprising: if an exception occurs, sending a notification message or popping up a reminder message.
14. A task processing apparatus, comprising:
a judging module, configured to determine a batch processing task to be executed, acquire a fragmentation parameter from the attributes of the batch processing task, and judge whether the fragmentation parameter is set to a preset value; wherein one batch processing task comprises a plurality of data to be processed;
a fragmentation module, configured to, if the judgment result is yes, determine a plurality of devices for processing the plurality of data, distribute different data to the corresponding devices for processing, and summarize the processing results to obtain an execution result; or
a non-fragmentation module, configured to, if the judgment result is no, determine one device for processing the plurality of data together, and distribute the plurality of data together to the one device for processing to obtain an execution result.
15. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-13.
16. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-13.
CN202110700169.8A 2021-06-23 2021-06-23 Task processing method and device Active CN113407429B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110700169.8A CN113407429B (en) 2021-06-23 2021-06-23 Task processing method and device

Publications (2)

Publication Number Publication Date
CN113407429A true CN113407429A (en) 2021-09-17
CN113407429B CN113407429B (en) 2024-07-19

Family

ID=77682691

Country Status (1)

Country Link
CN (1) CN113407429B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113495784A (en) * 2021-07-27 2021-10-12 中国银行股份有限公司 Data batch processing method and device

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010176303A (en) * 2009-01-28 2010-08-12 Nippon Yunishisu Kk Batch processing system, information terminal apparatus for use in the same, and method for recovering batch processing
CN102929585A (en) * 2012-09-25 2013-02-13 上海证券交易所 Batch processing method and system supporting multi-master distributed data processing
CN109144731A (en) * 2018-08-31 2019-01-04 中国平安人寿保险股份有限公司 Data processing method, device, computer equipment and storage medium
CN109376004A (en) * 2018-08-20 2019-02-22 中国平安人寿保险股份有限公司 Data batch processing method, device, electronic equipment and medium based on PC cluster
CN110008018A (en) * 2019-01-17 2019-07-12 阿里巴巴集团控股有限公司 A kind of batch tasks processing method, device and equipment
CN110113387A (en) * 2019-04-17 2019-08-09 深圳前海微众银行股份有限公司 A kind of processing method based on distributed batch processing system, apparatus and system
CN110308980A (en) * 2019-06-27 2019-10-08 深圳前海微众银行股份有限公司 Batch processing method, device, equipment and the storage medium of data
CN111061762A (en) * 2019-11-08 2020-04-24 京东数字科技控股有限公司 Distributed task processing method, related device, system and storage medium
CN111400012A (en) * 2020-03-20 2020-07-10 中国建设银行股份有限公司 Data parallel processing method, device, equipment and storage medium
CN111679920A (en) * 2020-06-08 2020-09-18 中国银行股份有限公司 Method and device for processing batch equity data
CN111767126A (en) * 2020-06-15 2020-10-13 中国建设银行股份有限公司 System and method for distributed batch processing
CN112148711A (en) * 2020-09-21 2020-12-29 建信金融科技有限责任公司 Processing method and device for batch processing tasks

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010176303A (en) * 2009-01-28 2010-08-12 Nippon Yunishisu Kk Batch processing system, information terminal apparatus for use in the same, and method for recovering batch processing
CN102929585A (en) * 2012-09-25 2013-02-13 上海证券交易所 Batch processing method and system supporting multi-master distributed data processing
CN109376004A (en) * 2018-08-20 2019-02-22 中国平安人寿保险股份有限公司 Data batch processing method, device, electronic equipment and medium based on PC cluster
CN109144731A (en) * 2018-08-31 2019-01-04 中国平安人寿保险股份有限公司 Data processing method, device, computer equipment and storage medium
CN110008018A (en) * 2019-01-17 2019-07-12 阿里巴巴集团控股有限公司 Batch task processing method, device and equipment
CN110113387A (en) * 2019-04-17 2019-08-09 深圳前海微众银行股份有限公司 Processing method, apparatus and system based on distributed batch processing system
CN110308980A (en) * 2019-06-27 2019-10-08 深圳前海微众银行股份有限公司 Data batch processing method, device, equipment and storage medium
CN111061762A (en) * 2019-11-08 2020-04-24 京东数字科技控股有限公司 Distributed task processing method, related device, system and storage medium
CN111400012A (en) * 2020-03-20 2020-07-10 中国建设银行股份有限公司 Data parallel processing method, device, equipment and storage medium
CN111679920A (en) * 2020-06-08 2020-09-18 中国银行股份有限公司 Method and device for processing batch equity data
CN111767126A (en) * 2020-06-15 2020-10-13 中国建设银行股份有限公司 System and method for distributed batch processing
CN112148711A (en) * 2020-09-21 2020-12-29 建信金融科技有限责任公司 Processing method and device for batch processing tasks

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113495784A (en) * 2021-07-27 2021-10-12 中国银行股份有限公司 Data batch processing method and device
CN113495784B (en) * 2021-07-27 2024-03-19 中国银行股份有限公司 Method and device for data batch processing

Also Published As

Publication number Publication date
CN113407429B (en) 2024-07-19

Similar Documents

Publication Publication Date Title
US10601952B2 (en) Problem solving in a message queuing system in a computer network
CN108733476B (en) Method and device for executing multiple tasks
CN108595316B (en) Lifecycle management method, manager, device, and medium for distributed application
US10803411B1 (en) Enterprise platform deployment
US9497096B2 (en) Dynamic control over tracing of messages received by a message broker
US20210019179A1 (en) K-tier architecture scheduling
EP3901773A1 (en) Dynamically allocated cloud worker management system and method therefor
CN112817720A (en) Visual workflow scheduling method and device and electronic equipment
US12072986B2 (en) Intelligent vulnerability lifecycle management system
CN114721807A (en) Batch business task execution method, device, equipment, medium and program product
CN113127057B (en) Method and device for parallel execution of multiple tasks
CN113407429B (en) Task processing method and device
CN112445860A (en) Method and device for processing distributed transaction
CN114237853A (en) Task execution method, device, equipment, medium and program product applied to heterogeneous system
US11803421B2 (en) Monitoring health status of a large cloud computing system
US20220261277A1 (en) Container scheduler with multiple queues for special workloads
CN109213743B (en) Data query method and device
CN113656239A (en) Monitoring method and device for middleware and computer program product
CN111767126A (en) System and method for distributed batch processing
CN111212112A (en) Information processing method and device
CN114816477A (en) Server upgrading method, device, equipment, medium and program product
CN110807058A (en) Method and system for exporting data
CN110247802B (en) Resource configuration method and device for cloud service single-machine environment
CN112463616A (en) Chaos testing method and device for Kubernetes container platform
CN112269672B (en) File downloading exception handling method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant