WO2020228177A1

WO2020228177A1 - Batch data processing method and apparatus, computer device and storage medium

Info

Publication number: WO2020228177A1
Application number: PCT/CN2019/102672
Authority: WO
Inventors: 朱鹏程; 王培�
Original assignee: 平安科技（深圳）有限公司
Priority date: 2019-05-16
Filing date: 2019-08-27
Publication date: 2020-11-19
Also published as: CN110297711B; CN110297711A

Abstract

The present application discloses a batch data processing method and apparatus, a computer device and a storage medium. The method comprises: selecting a target batch task from a non-real-time task queue, and creating a data processing queue on the basis of the target batch task; determining target idle information from a target idle time queue, and acquiring the quantity of data to be processed; acquiring a target quantity on the basis of the quantity of data to be processed, the target idle duration and an estimated number of threads; selecting, from the data processing queue, the data to be processed that corresponds to the target quantity, and determining said data to be segmentation processing data; when the original load of the system is less than a busy load threshold, performing data processing on the segmentation processing data by using a target processing thread during the target idle duration, and acquiring a data processing result; and updating the task status of each piece of segmentation processing data in the data processing queue on the basis of the data processing result. The method can allocate system resources rationally and safeguards the efficiency of batch data processing.

Description

Batch data processing method, device, computer equipment and storage medium

This application is based on the Chinese invention application filed on May 16, 2019 with the application number 201910405149.0 and titled "Batch data processing method, device, computer equipment and storage medium", and claims its priority.

Technical field

This application relates to the field of big data technology, and in particular to a batch data processing method, device, computer equipment and storage medium.

Background

With the development of big data technology, many fields now use it to process their data. However, as business grows and data accumulates over time, the amount of data in a database can reach hundreds of millions of records. If the data in the database is processed directly in batches, the system resources consumed are too large and data processing efficiency suffers. For example, when data in a database is processed in batches, the large volume of data requires more system resources and a longer processing time. If the batch processing is allocated more system resources, it occupies the system resources of real-time tasks and slows the response of real-time tasks that require an immediate response, so users wait longer; if it is allocated fewer system resources, the progress of non-real-time tasks is affected, resulting in a backlog of data corresponding to non-real-time tasks.

Summary of the invention

The embodiments of the present application provide a batch data processing method, device, computer equipment, and storage medium to solve the problem of unreasonable system resources allocated during data batch processing.

A batch data processing method, including:

Selecting a target batch task from a non-real-time task queue, and creating a data processing queue based on the target batch task, the data processing queue including the data to be processed and the corresponding task status;

Determining target idle information from a target idle time queue, the target idle information including a start time, a target idle duration, and an estimated number of threads;

Obtaining the quantity of data to be processed, obtaining a target quantity based on the quantity of data to be processed, the target idle duration and the estimated number of threads, selecting the data to be processed corresponding to the target quantity from the data processing queue, and determining it as segmentation processing data;

Obtaining the original load of the system when the current time of the system is the start time;

If the original load of the system is less than a busy load threshold, determining that the system is in an idle state, obtaining the target processing thread corresponding to the estimated number of threads, and using the target processing thread to perform data processing on the segmentation processing data within the target idle duration to obtain a data processing result;

Updating, based on the data processing result, the task status of each piece of segmentation processing data in the data processing queue.

A batch data processing device, including:

The data processing queue creation module is used to select target batch tasks from the non-real-time task queue, and create a data processing queue based on the target batch tasks. The data processing queue includes the data to be processed and the corresponding task status;

The target idle information determination module is used to determine the target idle information from the target idle time queue, and the target idle information includes the start time, the target idle time and the estimated number of threads;

The segmentation processing data determination module is used to obtain the quantity of data to be processed, obtain a target quantity based on the quantity of data to be processed, the target idle duration and the estimated number of threads, select the data to be processed corresponding to the target quantity from the data processing queue, and determine it as segmentation processing data;

The system original load acquisition module is used to obtain the original load of the system when the current time of the system is the start time;

The data processing result acquisition module is used to determine that the system is in an idle state if the original load of the system is less than the busy load threshold, obtain the target processing thread corresponding to the estimated number of threads, and use the target processing thread to perform data processing on the segmentation processing data within the target idle duration to obtain a data processing result;

The task status update module is used to update the task status of each piece of segmentation processing data in the data processing queue based on the data processing result.

A computer device includes a memory, a processor, and computer-readable instructions stored in the memory and capable of running on the processor, and the processor implements the following steps when the processor executes the computer-readable instructions:

Selecting a target batch task from a non-real-time task queue, and creating a data processing queue based on the target batch task, the data processing queue including the data to be processed and the corresponding task status;

Determining target idle information from a target idle time queue, the target idle information including a start time, a target idle duration, and an estimated number of threads;

Obtaining the quantity of data to be processed, obtaining a target quantity based on the quantity of data to be processed, the target idle duration and the estimated number of threads, selecting the data to be processed corresponding to the target quantity from the data processing queue, and determining it as segmentation processing data;

Obtaining the original load of the system when the current time of the system is the start time;

If the original load of the system is less than a busy load threshold, determining that the system is in an idle state, obtaining the target processing thread corresponding to the estimated number of threads, and using the target processing thread to perform data processing on the segmentation processing data within the target idle duration to obtain a data processing result;

Updating, based on the data processing result, the task status of each piece of segmentation processing data in the data processing queue.

One or more readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the batch data processing method described above.

The details of one or more embodiments of the present application are set forth in the following drawings and description. Other features and advantages of this application will become apparent from the description, drawings and claims.

Description of the drawings

In order to explain the technical solutions of the embodiments of the present application more clearly, the following will briefly introduce the drawings that need to be used in the description of the embodiments of the present application. Obviously, the drawings in the following description are only some embodiments of the present application. For those of ordinary skill in the art, other drawings can be obtained based on these drawings without creative labor.

FIG. 1 is a schematic diagram of an application environment of a batch data processing method in an embodiment of the present application;

FIG. 2 is a flowchart of a batch data processing method in an embodiment of the present application;

FIG. 3 is another flowchart of a batch data processing method in an embodiment of the present application;

FIG. 4 is another flowchart of a batch data processing method in an embodiment of the present application;

FIG. 5 is another flowchart of a batch data processing method in an embodiment of the present application;

FIG. 6 is another flowchart of a batch data processing method in an embodiment of the present application;

FIG. 7 is a schematic diagram of a batch data processing device in an embodiment of the present application;

FIG. 8 is a schematic diagram of a computer device in an embodiment of the present application.

Detailed description

The technical solutions in the embodiments of the present application will be described clearly and completely in conjunction with the accompanying drawings in the embodiments of the present application. Obviously, the described embodiments are part of the embodiments of the present application, rather than all of them. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative work fall within the protection scope of this application.

The batch data processing method provided in the embodiments of the present application can be applied to the application environment shown in FIG. 1. Specifically, the batch data processing method is applied to a batch data processing system that includes a client and a server, as shown in FIG. 1. The client and the server communicate through a network. The system divides the batch data accurately and uses system idle time to complete batch data processing, which ensures the progress and efficiency of batch processing without affecting the response speed of real-time tasks. The client, also called the user side, is the program that corresponds to the server and provides local services to the user. The client can be installed on, but is not limited to, personal computers, laptops, smart phones, tablet computers, and portable wearable devices. The server can be implemented as an independent server or as a server cluster composed of multiple servers.

In an embodiment, as shown in FIG. 2, a method for processing batch data is provided. The method is applied to the server in FIG. 1 as an example for description, and includes the following steps:

S201: Select a target batch task from the non-real-time task queue, and create a data processing queue based on the target batch task. The data processing queue includes the data to be processed and the corresponding task status.

The non-real-time task queue is a queue that stores non-real-time tasks; that is, it holds at least one non-real-time task. Non-real-time tasks are the counterpart of real-time tasks. Real-time tasks are strong real-time tasks that require an immediate response, such as user login, user query, and other operations performed by the user. Non-real-time tasks are tasks that tolerate delay and can be processed asynchronously, including tasks that process batch data, such as uploading system-level logs to the server, generating statistical reports, and data analysis.

The target batch task is the non-real-time task currently to be processed, selected in a preset order from the at least one non-real-time task in the non-real-time task queue. The target batch task can be any non-real-time task that requires batch data processing. The order can be determined by the first-in-first-out principle of the queue, or by the priority of the tasks combined with the first-in-first-out principle.

The data processing queue is a list created for a target batch task and used to record the task status of each piece of data to be processed in that task. The data to be processed is the smallest unit of data that needs to be processed in the target batch task, for example a single log or a single report. The task status of each piece of data to be processed reflects how far the system has progressed in processing it. The task status is one of: the unprocessed status, the processing status, the processing success status, and the processing failure status.

Specifically, the server selects the non-real-time task that needs to be processed first from the non-real-time task queue, according to the order of the queue, and determines it as the target batch task. Normally the system holds many non-real-time tasks; it sorts them according to preset sorting rules so that the data that needs to be processed first comes first, which determines the processing order of the data in the non-real-time tasks. The server then creates a data processing queue based on the data in the target batch task. The data processing queue lists each piece of data to be processed together with its task status. Understandably, when the data processing queue is created, the task status of every piece of data to be processed is the unprocessed status. As the system processes the data, the task status is updated along with the processing.
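As an illustration of step S201, the following minimal Python sketch shows one way a data processing queue could be represented. The names TaskStatus, QueueEntry and create_data_processing_queue, and the sample item names, are hypothetical and do not come from the application; only the four task statuses and the initial unprocessed status are taken from the description above.

```python
from dataclasses import dataclass
from enum import Enum


class TaskStatus(Enum):
    """The four task statuses named in the description."""
    UNPROCESSED = "unprocessed"
    PROCESSING = "processing"
    SUCCEEDED = "processing success"
    FAILED = "processing failure"


@dataclass
class QueueEntry:
    data: str                                    # one unit of data, e.g. a log or a report
    status: TaskStatus = TaskStatus.UNPROCESSED  # every entry starts unprocessed


def create_data_processing_queue(batch_task_items):
    """Build the data processing queue for one target batch task (hypothetical helper)."""
    return [QueueEntry(data=item) for item in batch_task_items]


queue = create_data_processing_queue(["log-001", "log-002", "report-001"])
for entry in queue:
    print(entry.data, entry.status.value)
```

A plain list keeps the entries in the order in which they are to be processed; a real system would more likely keep this state in a database table.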

S202: Determine target idle information from the target idle time queue, where the target idle information includes a start time, an idle duration, and an estimated number of threads.

The target idle time queue is a queue of the system's predicted daily idle times, derived from the times at which the system processed historical processing data. Historical processing data is the information in the historical record about calls to system resources to process historical real-time tasks. A historical real-time task is a real-time task that occurred before the current time of the system; like any real-time task, it is a strong real-time task that requires an immediate response, such as user login, user query, and other operations performed by the user. When the server has threads left over in addition to the threads processing real-time tasks, the system is considered to be in idle time. The target idle time queue contains at least one piece of original idle information, which is the information corresponding to an idle period in a day whose idle duration is greater than a preset duration threshold. Each piece of original idle information has a start time, an original idle duration and an estimated number of threads. The original idle duration is the difference between the end time and the start time of the corresponding original idle information. For example, if the server is idle from 7:00 to 7:15 every morning, the start time of that original idle information is 7:00 and the original idle duration is 15 minutes. The estimated number of threads is the predicted number of threads that can handle non-real-time tasks during the idle time. Generally speaking, the estimated number of threads N_P is the total number of CPU threads N minus the number of real-time threads N1 predicted to be processing real-time tasks during the idle time and the number of reserved emergency threads N2, that is, N_P = N - N1 - N2. The target idle information is the original idle information, selected from the target idle time queue, whose start time is after, and closest to, the current time of the system.

Specifically, the server analyzes the processing times of the system's historical real-time tasks using a big-data modeling method and predicts the server's idle time queue for each day. Based on the current time of the system, the server selects, from the pre-predicted target idle time queue, the original idle information that starts after the current time and is closest to it, and uses it as the target idle information. This quickly identifies the idle period in which the data to be processed can be handled and improves the efficiency of determining the target idle information. Batch processing is then performed in the system idle time corresponding to the target idle information, which speeds up the system's progress on batch data, ensures that the data to be processed is handled in a subsequent idle period, and achieves the goal of using idle time for batch processing. For example, suppose the predicted idle time queue holds three pieces of original idle information with start times of 6:00, 7:00, and 8:00. If the current time of the system is 7:20, the target idle information is the original idle information with the start time of 8:00.
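The selection in step S202 and the thread estimate N_P = N - N1 - N2 can be sketched in Python as follows. This is only an illustrative sketch: the names IdleInfo, estimate_threads and select_target_idle_info are hypothetical, the thread counts 16, 8 and 2 are made-up sample values, flooring the estimate at zero is an assumption, and the sketch does not wrap around to the next day when no idle period remains today.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class IdleInfo:
    start: time             # start time of the predicted idle period
    idle_minutes: int       # original idle duration
    estimated_threads: int  # N_P


def estimate_threads(total, realtime, reserved):
    """N_P = N - N1 - N2; flooring at zero is an assumption of this sketch."""
    return max(total - realtime - reserved, 0)


def select_target_idle_info(idle_queue, now):
    """Return the idle period that starts soonest after `now`, or None if there is none."""
    upcoming = [info for info in idle_queue if info.start > now]
    return min(upcoming, key=lambda info: info.start) if upcoming else None


threads = estimate_threads(total=16, realtime=8, reserved=2)
idle_queue = [
    IdleInfo(time(6, 0), 15, threads),
    IdleInfo(time(7, 0), 15, threads),
    IdleInfo(time(8, 0), 15, threads),
]
target = select_target_idle_info(idle_queue, now=time(7, 20))
print(target.start)
```

With the three start times from the example and a current time of 7:20, the sketch selects the 8:00 entry, matching the description.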

S203: Obtain the quantity of data to be processed, obtain the target number based on the quantity of data to be processed, the target idle duration and the estimated number of threads, select the data to be processed corresponding to the target number from the data processing queue, and determine it as segmentation processing data.

The target number is the number of pieces of data to be processed that the threads corresponding to the estimated number of threads can process within the target idle duration of the target idle information. Segmentation processing data is the data to be processed that the system needs to process within that target idle duration. The segmentation processing data is determined according to the system's actual data processing situation; the target number of pieces processed each time is determined by the target idle duration and the estimated number of threads.

Specifically, the server segments the data to be processed in the data processing queue by time or by type, according to a preset data segmentation rule, to obtain the number of pieces of batch data that can be processed in the target idle duration of the target idle information (that is, the target number). It then selects the data to be processed corresponding to the target number from the data processing queue and determines it as segmentation processing data. This objectively allocates the amount of batch data that can be processed in the idle time corresponding to the target idle information, so that real-time tasks are processed normally and the target number of pieces of segmentation processing data can still be processed in the idle time. Segmenting by time means selecting the target number of pieces of data to be processed in the order in which they need to be processed; segmenting by type means selecting the target number of pieces according to the type of the data to be processed (for example analysis type, log type, or report type).
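The application does not give a formula for the target number. The Python sketch below therefore assumes, purely for illustration, that each piece of data takes a known average processing time (the hypothetical parameter seconds_per_item), so that the capacity of one idle period is threads x idle seconds / seconds per item. The function names and sample numbers are likewise hypothetical.

```python
def target_number(pending_count, idle_seconds, estimated_threads, seconds_per_item):
    """Pieces of data that fit into one idle period (assumed capacity model)."""
    if seconds_per_item <= 0:
        raise ValueError("seconds_per_item must be positive")
    capacity = int(estimated_threads * idle_seconds // seconds_per_item)
    return min(pending_count, capacity)


def select_segment(pending_items, count):
    """Segment by time: take the first `count` items in processing order."""
    return pending_items[:count]


pending = [f"log-{i:05d}" for i in range(10_000)]
count = target_number(len(pending), idle_seconds=15 * 60,
                      estimated_threads=6, seconds_per_item=2.0)
segment = select_segment(pending, count)
print(count, segment[0], segment[-1])
```

Capping the result at the number of pending items covers the case where the remaining data is smaller than the capacity of the idle period.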

S204: When the current time of the system is the start time, obtain the original load of the system.

The original load of the system is the amount of system resources (real-time threads) occupied by the system when processing real-time tasks at the current time of the system. For example, at the start time of 8:00, if the system has N1 real-time threads processing real-time tasks and N2 reserved emergency threads, the original load of the system is N1 + N2. The original load is obtained when the current time of the system is the start time in order to determine whether the system can process the segmentation processing data corresponding to non-real-time tasks in addition to its real-time tasks.

S205: If the original load of the system is less than the busy load threshold, determine that the system is in an idle state, obtain the target processing thread corresponding to the estimated number of threads, and use the target processing thread to perform data processing on the segmentation processing data within the target idle duration to obtain a data processing result.

The busy load threshold is a preset load value used to evaluate whether the system is in a busy state. The system is in either an idle state or a busy state. Assume that the maximum server load is M and the busy load threshold is set to M*50%. If the load occupied by the processing of real-time tasks is below M*50%, the system is considered idle. Conversely, if the load occupied by the processing of real-time tasks is greater than or equal to M*50%, the system is considered busy. The target processing thread is a thread that processes the segmentation processing data and is dedicated to processing non-real-time tasks. The data processing result is the result of the system processing the segmentation processing data.
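The idle check can be written as a short Python sketch that combines the original load N1 + N2 with the busy load threshold M*50%. The function name is_system_idle and the sample thread counts are hypothetical; measuring load in threads follows the N1 + N2 example in the description.

```python
def is_system_idle(realtime_threads, reserved_threads, max_load, busy_ratio=0.5):
    """Idle when the original load N1 + N2 is strictly below M * busy_ratio."""
    original_load = realtime_threads + reserved_threads
    busy_threshold = max_load * busy_ratio
    return original_load < busy_threshold


print(is_system_idle(realtime_threads=5, reserved_threads=2, max_load=16))
print(is_system_idle(realtime_threads=6, reserved_threads=2, max_load=16))
```

The comparison is strict, so a load exactly equal to the threshold counts as busy, in line with the "greater than or equal to" wording above.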

Specifically, when the current time of the system is the start time of the target idle information, the current load of the system is detected. If the current load is less than the busy load threshold, the system is considered to be in an idle state, and the target processing threads corresponding to the pre-allocated estimated number of threads are called to process the segmentation processing data within the target idle duration of the target idle information. Processing the segmentation processing data while the system is idle allocates system resources reasonably: it prevents batch data of non-real-time tasks from occupying so many system resources that real-time tasks take longer and respond more slowly, and it prevents system resources from being wasted because the data to be processed of non-real-time tasks is not handled in time while the system is idle. Understandably, pre-allocating the target processing threads corresponding to the estimated number of threads to process the target number of pieces of segmentation processing data reduces the communication overhead between threads and thus the loss of system performance. That is, because the target processing threads are dedicated to non-real-time tasks, they do not need to switch between processing real-time tasks and processing data of non-real-time tasks; this reduces inter-thread communication overhead, keeps the loss of system performance small, and makes rational use of system resources.
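One possible way to realize dedicated target processing threads is a fixed-size thread pool, sketched below in Python with the standard concurrent.futures module. The function process_segment, the handler callback and the demo data are hypothetical. Skipping items that have not started before the idle period ends is an assumption of this sketch, since the application does not say how an overrun is handled.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def process_segment(segment, handler, estimated_threads, idle_seconds):
    """Process `segment` on a dedicated pool of `estimated_threads` workers.

    Returns {item: True | False | None}: success, failure, or not started
    because the idle period was already over (assumed behaviour).
    """
    deadline = time.monotonic() + idle_seconds

    def run(item):
        if time.monotonic() >= deadline:
            return None              # leave the item for a later idle period
        try:
            handler(item)
            return True
        except Exception:
            return False

    # The pool is used only for this segment, so its threads never
    # alternate between real-time and non-real-time work.
    with ThreadPoolExecutor(max_workers=estimated_threads) as pool:
        outcomes = list(pool.map(run, segment))
    return dict(zip(segment, outcomes))


def demo_handler(item):
    if item == "bad":
        raise ValueError("cannot parse item")


results = process_segment(["a", "bad", "c"], demo_handler,
                          estimated_threads=2, idle_seconds=5.0)
print(results)
```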

S206: Based on the data processing result, update the task status of each piece of segmentation processing data in the data processing queue.

Specifically, the data processing result of each piece of segmentation processing data may be a success or a failure, so after the target processing thread has processed the segmentation processing data, the system needs to update the task status of the processed data. If processing failed, the task status is updated to the processing failure status, and the system separates out the corresponding segmentation processing data according to that status, so that data whose task status is the processing failure status can subsequently be reviewed or checked to determine the reason for the failure. If processing succeeded, the task status is updated to the processing success status, and the system removes the corresponding segmentation processing data from the data processing queue; no further processing is performed on it.

Further, the data processing result of each piece of data to be processed is synchronized to the system, which records it and formulates the next execution rule, ensuring that every piece of data to be processed is eventually processed successfully. In this embodiment, the system also presets a failure count threshold, which limits the number of times each piece of data to be processed can be reprocessed. If the number of processing failures of any piece of data to be processed reaches the failure count threshold, a reminder mechanism is triggered, and the data to be processed is sent to the audit terminal together with the log information recorded during processing, so that the auditor at the audit terminal can handle it accordingly. For example, the failure count threshold can be set to three: if three consecutive attempts to process a piece of data all fail, the system automatically sends a reminder email to the audit terminal so that the auditor can inspect the data, which realizes monitoring of the processing of the data to be processed.

Understandably, when the data processing queue is created based on the target batch task, the task status of all data to be processed in the queue is initialized to the unprocessed status. When the target number of pieces of data to be processed is selected from the data processing queue and determined as segmentation processing data, the task status of those pieces is updated to the processing status. The segmentation processing data is then processed during the idle time to obtain the data processing result, which is either a success or a failure. The task status of the segmentation processing data is then updated, based on the data processing result, to the processing success status or the processing failure status, so that the task status of the data to be processed in the data processing queue is updated in real time.
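The status update of step S206 and the failure count threshold can be sketched in Python as follows. The queue is simplified here to a dictionary from item to status string, and update_task_status and the notify callback (standing in for the reminder sent to the audit terminal) are hypothetical; the threshold of three is the example value from the description.

```python
FAILURE_THRESHOLD = 3  # example value from the description


def update_task_status(queue, failures, results, notify):
    """Apply `results` ({item: True/False}) to `queue` ({item: status})."""
    for item, succeeded in results.items():
        if succeeded:
            queue.pop(item, None)      # success: remove from the queue
            failures.pop(item, None)   # and forget earlier failures
        else:
            queue[item] = "processing failure"
            failures[item] = failures.get(item, 0) + 1
            if failures[item] >= FAILURE_THRESHOLD:
                notify(item)           # e.g. send a reminder to the audit terminal


queue = {"a": "processing", "b": "processing"}
failures, alerts = {}, []
update_task_status(queue, failures, {"a": True, "b": False}, alerts.append)
update_task_status(queue, failures, {"b": False}, alerts.append)
update_task_status(queue, failures, {"b": False}, alerts.append)
print(queue, alerts)
```

After three failed attempts on item "b", the sketch has removed "a" from the queue, left "b" in the processing failure status, and raised one alert for "b".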

S207: If the original load of the system is not less than the busy load threshold, determine that the system is in a busy state, and repeat the step of determining target idle information from the target idle time queue.

Specifically, when the current time of the system is the start time of the target idle information, the current load of the system is detected. If the current load is not less than the busy load threshold, the system is determined to be in a busy state. To prevent the processing of non-real-time tasks from occupying system resources and slowing the processing of real-time tasks, new target idle information must be determined; that is, the step of determining target idle information from the target idle time queue is repeated, and the next piece of original idle information closest to the current time of the system is determined as the target idle information. This allocates system resources reasonably and ensures the processing speed of real-time tasks.
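The fallback in step S207 can be illustrated with a small Python simulation. Start times are expressed as minutes since midnight, and the callback load_at, which stands in for measuring the original load when a start time arrives, is hypothetical; a real system would wait for each start time instead of iterating over them.

```python
def next_usable_window(idle_starts, now, load_at, busy_threshold):
    """Return the first idle start after `now` whose load is below the threshold."""
    for start in sorted(s for s in idle_starts if s > now):
        if load_at(start) < busy_threshold:
            return start        # system is idle: process in this period
        # system is busy: fall through to the next predicted idle period
    return None


measured = {480: 9, 540: 5}     # hypothetical loads at 8:00 and 9:00
chosen = next_usable_window([360, 420, 480, 540], now=440,
                            load_at=measured.get, busy_threshold=8)
print(chosen)
```

In this run the 8:00 period is skipped because its load of 9 is not below the threshold of 8, and the 9:00 period (540) is chosen.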

In the batch data processing method provided in this embodiment, a target batch task is selected from the non-real-time task queue and a data processing queue is created based on it, which reasonably arranges the processing order of the data in non-real-time tasks. Determining the target idle information from the target idle time queue improves the efficiency of determining target idle information and speeds up the processing of batch data. Determining the segmentation processing data corresponding to the target number according to the target idle duration and the estimated number of threads objectively allocates the amount of data that can be processed in the upcoming idle time; this reduces the amount of data processed at one time and, provided that real-time tasks are processed normally, allows the segmentation processing data to be processed in the idle time, so that the data processed in each idle period is allocated reasonably according to system resources. When the current time of the system is the start time and the original load of the system is less than the busy load threshold, the target processing thread corresponding to the estimated number of threads is obtained and used to process the segmentation processing data within the target idle duration; this allocates system resources reasonably, reduces inter-thread communication overhead, keeps the loss of system performance small, and yields the data processing result. Updating the task status of each piece of segmentation processing data in the data processing queue based on the data processing result helps ensure that all data is processed successfully. If the original load of the system is not less than the busy load threshold, the system is determined to be in a busy state and the target idle information is re-determined from the target idle time queue, which ensures that the data to be processed is still processed successfully.

In one embodiment, as shown in FIG. 3, before step S201, that is, before selecting a target batch task from a non-real-time task queue and creating a data processing queue based on the target batch task, the batch data processing method further includes:

S301: Obtain a task processing request, where the task processing request includes a task to be processed and a task identifier corresponding to the task to be processed.

The task processing request is a request to process the unprocessed tasks in the system. In this embodiment, the tasks to be processed include real-time tasks and non-real-time tasks. The task identifier is an identifier assigned in advance to each task to be processed; it is either a real-time identifier or a non-real-time identifier, both of which are designated by the system administrator. Taking the simplest numeric identifiers as an example, 1 represents the real-time identifier and 2-9 represent non-real-time identifiers. The real-time identifier indicates that the task to be processed is a real-time task; that is, if the task identifier in a task processing request is a real-time identifier, the corresponding task to be processed is a real-time task. The non-real-time identifier indicates that the task to be processed is a non-real-time task; that is, if the task identifier in a task processing request is a non-real-time identifier, the corresponding task to be processed is a non-real-time task. Understandably, real-time and non-real-time identifiers are complementary: tasks with a real-time identifier are real-time tasks that require immediate processing and response, while tasks with a non-real-time identifier are non-real-time tasks that can be processed when the system is idle. This allows the processing time of tasks to be arranged reasonably and ensures the processing speed of real-time tasks.

Specifically, when the system obtains the task processing request, it queries the task to be processed in the database, and each task to be processed corresponds to a task ID, and the task to be processed is processed correspondingly according to the carried task ID. For example, the task to be processed may be login, password retrieval, log processing, and report processing. The system will mark login and password retrieval and other tasks that require real-time processing with real-time identification. The system will process log and report processing, etc. Tasks that can be processed with a delay are marked as non-real-time identifiers.

S302: If the task identifier is a real-time identifier, execute the task to be processed.

Specifically, when the server finds that the task identifier is a real-time identifier, the task to be processed is a real-time task and must be processed immediately. The server therefore calls system resources to process the task, ensuring that tasks carrying a real-time identifier receive an immediate response and improving the processing efficiency of real-time tasks.

S303: If the task identifier is a non-real-time identifier, obtain the task type in the task processing request, determine the task priority of the task to be processed based on the task type, and store the task to be processed in the non-real-time task queue according to the order of task priority.

Here, the task type is the type of data processing that the task to be processed requires. For example, the task types may be log processing, report processing, and analysis processing. The task priority is a parameter that determines the order in which tasks are processed when the system handles multiple tasks to be processed. For example, where 2-9 represent non-real-time identifiers, the smaller the number within 2-9, the higher the priority and the earlier the task is processed. Generally, the task priority of a task to be processed is determined by its processing type, and tasks are stored in the non-real-time task queue in order of task priority, with higher-priority tasks placed first.

Specifically, when the task in the task processing request is identified as a non-real-time task, the server determines the task priority from the task type of the task to be processed and stores higher-priority tasks nearer the front of the non-real-time task queue. Storing tasks this way fixes the processing order of non-real-time tasks and ensures that higher-priority tasks are processed first. Further, within the non-real-time task queue, the processing order between different task types is determined by the task priority of each task type, and tasks of the same task type are ordered on the first-in-first-out principle of the queue.

In the batch data processing method provided in this embodiment, the server obtains the task to be processed and its task identifier so that the processing time of pending tasks can be arranged reasonably and the processing speed of real-time tasks is protected. When the task identifier is a real-time identifier, the task is executed immediately, which guarantees the processing speed and response time of real-time tasks. When the task identifier is a non-real-time identifier, the task type in the task processing request is obtained to determine the task priority, and the task is stored in the non-real-time task queue in order of task priority. This arranges the processing order of non-real-time tasks reasonably and ensures that higher-priority tasks are processed first.
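The routing in S301 to S303 can be sketched as follows. This is a minimal illustration rather than the claimed implementation: the class `TaskRouter`, the table `TYPE_PRIORITY`, and the task names are hypothetical, and a binary heap with an arrival counter is just one way to obtain "higher priority first, first-in-first-out within the same priority".

```python
import heapq
import itertools

REAL_TIME_ID = 1  # assumption: 1 marks real-time tasks, 2-9 mark non-real-time tasks

# Hypothetical mapping from task type to priority (smaller number = higher priority).
TYPE_PRIORITY = {"log": 2, "report": 3, "analysis": 4}


class TaskRouter:
    def __init__(self):
        self._queue = []                    # heap of (priority, arrival, task)
        self._arrival = itertools.count()   # tie-breaker that preserves FIFO order
        self.executed = []                  # real-time tasks handled immediately

    def submit(self, task, task_id, task_type=None):
        if task_id == REAL_TIME_ID:
            # S302: stand-in for executing the real-time task right away.
            self.executed.append(task)
        else:
            # S303: priority comes from the task type; store in priority order.
            priority = TYPE_PRIORITY[task_type]
            heapq.heappush(self._queue, (priority, next(self._arrival), task))

    def pop_next(self):
        """Return the highest-priority, earliest-submitted non-real-time task."""
        return heapq.heappop(self._queue)[2]


router = TaskRouter()
router.submit("login", 1)
router.submit("report-A", 3, "report")
router.submit("log-A", 2, "log")
router.submit("log-B", 2, "log")
order = [router.pop_next() for _ in range(3)]
print(order)  # ['log-A', 'log-B', 'report-A']
```

The arrival counter matters: without it, two tasks of equal priority would be compared by their payload rather than by submission order.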

In an embodiment, as shown in FIG. 4, before step S202, that is, before determining target idle information from the target idle time queue, the batch data processing method further includes:

S401: Obtain historical processing data, which includes historical processing time, historical processing quantity, and historical thread number.

Here, historical processing data is the information in the historical record about calls to system resources made to process historical real-time tasks. A historical real-time task is a real-time task that occurred before the current system time. The historical processing time is the interval between the start time and the end time of the server's processing of a historical real-time task in the historical record. The historical thread number is the number of threads called when processing historical real-time tasks in the historical record. Specifically, the server obtains a large amount of historical processing data and analyzes it to find the regularities it contains and to determine the day-to-day state of system resources, so that time and resources can later be allocated reasonably between real-time and non-real-time tasks.

S402: Perform big data modeling on the historical processing time, historical processing quantity, and historical thread number based on machine learning algorithms to obtain an original idle time queue. The original idle time queue includes at least one piece of original idle information, and each piece of original idle information includes a start time, an original idle time, and an estimated number of threads.

Here, the original idle time queue is a queue that predicts the system's daily idle periods from the times at which the system processed historical data. The original idle information is the information for each idle period in a day whose idle duration is greater than a preset duration threshold.

In this embodiment, the historical processing time, historical processing quantity, and historical thread number record how the system processes data periodically. Machine learning algorithms are used to perform big data modeling on them to determine the system's historical thread usage and idle state at any time of day. From this, the start time and idle duration of a given idle interval and the number of threads needed for real-time tasks in that interval are obtained, and the estimated number of threads available for non-real-time tasks during the idle time is calculated. Using machine learning algorithms makes the resulting original idle time queue objective and accurate. For example, the system divides the historical processing data into weekly cycles and uses machine learning algorithms to analyze the historical idle information for each Monday, Tuesday, ..., Sunday, so that a regular original idle time queue is obtained quickly. The machine learning algorithms include, but are not limited to, logistic regression and LSTM neural networks.

S403: If the original idle time is greater than the first duration threshold, store the original idle information in the target idle time queue.

Here, the first duration threshold is a preset threshold used to evaluate whether an idle period of the system is long enough to be treated as idle time. Setting the first duration threshold excludes original idle times that are too short, and ensures that each original idle time stored in the target idle time queue is long enough to process a larger amount of the to-be-processed data of non-real-time tasks. This helps reduce the overhead of communication between threads later on and supports the reasonable allocation of system resources. In this embodiment, when the original idle time is greater than the first duration threshold, the corresponding original idle information is stored in the target idle time queue; that is, the pieces of original idle information whose original idle time exceeds the first duration threshold are collected to form the target idle time queue.

In the batch data processing method provided in this embodiment, big data modeling is performed on the historical processing time, historical processing quantity, and historical thread number based on a machine learning algorithm, which makes the resulting original idle time queue objective. When the original idle time is greater than the first duration threshold, the original idle information is determined to be target idle information. This ensures that each original idle time stored in the target idle time queue is long enough to process a larger amount of the to-be-processed data of non-real-time tasks, helps reduce the overhead of communication between threads later on, and supports the reasonable allocation of system resources.
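The filtering in S403 can be sketched as below. The record type `IdleInfo`, its field names, and the 30-second threshold are assumptions made for illustration; the machine-learning step of S402 that produces the original idle time queue is not modelled and is replaced by hand-written sample values.

```python
from dataclasses import dataclass


@dataclass
class IdleInfo:
    start_time: str          # start of the predicted idle window, e.g. "02:00"
    idle_seconds: float      # original idle time, in seconds
    estimated_threads: int   # threads assumed to be free for batch work


def build_target_idle_queue(original_queue, first_threshold_seconds):
    """Keep only windows strictly longer than the first duration threshold."""
    return [info for info in original_queue
            if info.idle_seconds > first_threshold_seconds]


# Sample values standing in for the output of S402.
original = [
    IdleInfo("02:00", 1800, 8),
    IdleInfo("12:30", 20, 2),    # too short, dropped
    IdleInfo("23:00", 600, 4),
]
target = build_target_idle_queue(original, first_threshold_seconds=30)
print([info.start_time for info in target])  # ['02:00', '23:00']
```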

In one embodiment, as shown in FIG. 5, in step S203, acquiring the target number based on the number of data to be processed, the target idle time, and the estimated number of threads, selecting the data to be processed corresponding to the target number from the data processing queue, and determining it as segmentation processing data includes:

S501: Apply the estimated time calculation formula to the number of data to be processed and the estimated number of threads to obtain the estimated processing time corresponding to the data processing queue.

Here, the estimated time calculation formula is a formula used to calculate the time the system needs to process the data to be processed in the data processing queue. The estimated processing time is the time the system needs to process all the data to be processed in the data processing queue. Specifically, the estimated time calculation formula is

$$T_1 = \frac{S}{N_p \cdot x}$$

where $T_1$ is the estimated processing time for the data processing queue, $S$ is the number of data to be processed, $N_p$ is the estimated number of threads available during idle time, and $x$ is the amount of data each thread processes per unit time. With the estimated time calculation formula, the estimated time for the system to process all the data to be processed in the data processing queue can be calculated quickly.

S502: Apply the target quantity calculation formula to the estimated processing time and the target idle time to obtain the target quantity.

Here, the target quantity calculation formula is a formula used to calculate the quantity of data the system can process within the target idle time. The target quantity calculation formula is

$$X = \frac{S \cdot T_2}{T_1}$$

where $X$ is the target quantity, $S$ is the number of data to be processed, $T_1$ is the estimated processing time for the data processing queue, and $T_2$ is the target idle time. With the target quantity calculation formula, the quantity of data the system can process in the target idle time can be obtained quickly. For example, suppose a data processing queue holds 1000 pieces of data and the estimated processing time for these 1000 pieces is $T_1$. From the historical operating conditions of the server, a machine learning algorithm predicts that the server will be idle from a certain future time point (the start time) for a duration $T_2$, during which it can process batch data without affecting real-time task processing. The target quantity calculation formula then gives $X = 1000 \cdot T_2 / T_1$ pieces of data, and processing is scheduled to begin at the start time.
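The two calculations in S501 and S502 can be checked numerically with a short sketch. The function names and sample values are hypothetical. Rounding down to a whole number of items and capping the result at S (for the case where the idle time exceeds the estimated processing time) are assumptions added here, not requirements stated in the text.

```python
def estimated_processing_time(s, n_p, x):
    """T1 = S / (N_p * x): time to process the whole data processing queue."""
    return s / (n_p * x)


def target_quantity(s, t1, t2):
    """X = S * T2 / T1, rounded down and capped at S (both are assumptions)."""
    return min(s, int(s * t2 / t1))


S = 1000    # pieces of data in the data processing queue
N_P = 4     # estimated number of threads available while idle
X_RATE = 5  # pieces of data each thread handles per second (sample value)
T2 = 30     # target idle time, in seconds

t1 = estimated_processing_time(S, N_P, X_RATE)
x_target = target_quantity(S, t1, T2)
print(t1, x_target)  # 50.0 600
```

With these numbers the whole queue would take 50 seconds, so a 30-second idle window covers 600 of the 1000 pieces.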

S503: According to a preset filtering rule, select the data to be processed corresponding to the target quantity in the data processing queue, and determine it as segmentation processing data.

Here, the preset filtering rule is a rule set in advance for selecting the data to be processed. It usually selects the data to be processed in order of task priority from high to low, so that data is processed in sequence and data with a higher task priority is processed first. Understandably, where task priorities are equal, the first-in-first-out principle of the queue determines the selection order, and the corresponding segmentation processing data is obtained accordingly.

Specifically, the server selects the target quantity of data to be processed from the data processing queue according to the preset filtering rule (that is, in order of task priority from high to low), determines the selected data to be the segmentation processing data, and feeds it back to the system for processing.

In the batch data processing method provided in this embodiment, the estimated time calculation formula is applied to the number of data to be processed and the estimated number of threads, and the target quantity calculation formula is then applied to the estimated processing time and the target idle time. Together, the two formulas yield the target quantity quickly and objectively, which allows the data to be segmented accurately in later steps. The data to be processed corresponding to the target quantity is then selected from the data processing queue according to the preset filtering rule and determined as the segmentation processing data, so that the data to be processed is distributed reasonably.

Further, the preset filtering rule is set in advance to determine the task priority of the batch data. When batch data is processed, the number of target processing threads can be controlled: the more target processing threads there are, the more system resources are used, the greater the processing capacity, and the faster the to-be-processed data of non-real-time tasks is handled. In special situations, such as a period of heavy real-time task traffic, batch processing can be suspended, or the resources allocated to batch processing of lower-priority data can be reduced, so that system resources are allocated reasonably to the segmentation processing data that needs to be processed.
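Step S503 can be sketched as follows, assuming each queue entry is a `(priority, item)` pair held in arrival order and that a smaller number means a higher priority. The function name and the sample items are hypothetical. Python's `sorted` is stable, which is what preserves first-in-first-out order among entries of equal priority.

```python
def select_segment(data_queue, target_quantity):
    """Pick `target_quantity` items: highest priority first, FIFO within a priority."""
    # sorted() is stable, so entries with equal priority keep their arrival order.
    pending = sorted(data_queue, key=lambda entry: entry[0])
    return [item for _, item in pending[:target_quantity]]


# Entries appear in arrival order; priority 2 outranks priority 3.
queue = [(3, "report-1"), (2, "log-1"), (3, "report-2"), (2, "log-2")]
segment = select_segment(queue, target_quantity=3)
print(segment)  # ['log-1', 'log-2', 'report-1']
```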

In one embodiment, as shown in FIG. 6, after step S206, that is, after updating the task status of each piece of segmentation processing data in the data processing queue based on the data processing result, the batch data processing method further includes:

S601: If the task status of every piece of segmentation processing data in the data processing queue has been updated to the completed state, obtain the remaining duration based on the current system time, the start time, and the target idle time.

Specifically, the server monitors the task status of the segmentation processing data in the data processing queue in real time. When all the segmentation processing data has been processed, the server obtains the current system time. If the current system time falls within the idle period that begins at the start time, the remaining duration is determined from the current system time and the idle time; that is, the remaining duration is the difference between the end of the idle period and the current system time. When the remaining duration is long enough, batch processing continues so that the remaining idle time is fully used. For example, if the start time is 8:00, the idle time is 30 minutes, and the current system time is 8:20, the remaining duration is 10 minutes.

S602: If the remaining duration is greater than the second duration threshold, set the target idle duration to the remaining duration, and determine the corresponding processable data amount based on the updated target idle duration and the estimated number of threads.

The second duration threshold is a preset threshold for judging whether the remaining duration is long enough. It may be the same as or different from the first duration threshold, and may be set to 30 s or another value. In step S602, the server can determine the processable data amount with the processable data amount formula $K = N_p \cdot x \cdot T_3$, where $K$ is the processable data amount corresponding to the remaining duration, $N_p$ is the estimated number of threads available during idle time, $x$ is the amount of data each thread processes per unit time, and $T_3$ is the updated target idle duration (that is, the remaining duration). When the remaining duration is greater than the second duration threshold, the server sets the target idle duration to the remaining duration and continues processing the pending data for that duration. This further batch processing does not affect real-time tasks and speeds up the processing of the data to be processed.

Specifically, the server first determines whether the remaining duration is greater than the second duration threshold. If it is, the remaining duration of the target idle information is long enough for the system to continue processing batch tasks without becoming too busy. The processable data amount is therefore determined from the remaining duration, so that the corresponding amount of data to be processed can be handled within the remaining duration, improving the processing efficiency of batch data.
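Steps S601 and S602 can be sketched with the 8:00 / 30-minute / 8:20 example from the text. The function names, the calendar date, the thread count, and the per-thread rate are hypothetical sample values. Treating a result of 0 as "no further batch work" when the threshold is not exceeded is an assumption made for this sketch.

```python
from datetime import datetime, timedelta


def remaining_seconds(start, idle, now):
    """Time left in the idle window; 0 if the window has already ended."""
    deadline = start + idle
    return max(0.0, (deadline - now).total_seconds())


def processable_volume(n_p, x, t3):
    """K = N_p * x * T3, rounded down to whole pieces of data."""
    return int(n_p * x * t3)


SECOND_THRESHOLD = 30  # seconds; the text suggests 30 s as one possible value

start = datetime(2019, 5, 16, 8, 0)
idle = timedelta(minutes=30)
now = datetime(2019, 5, 16, 8, 20)

t3 = remaining_seconds(start, idle, now)          # 600.0 seconds = 10 minutes
k = processable_volume(4, 5, t3) if t3 > SECOND_THRESHOLD else 0
print(t3, k)  # 600.0 12000
```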

S603: Select the data to be processed corresponding to the processable data amount from the data processing queue, update it as the segmentation processing data, obtain the target processing threads corresponding to the estimated number of threads, and repeat the step of using the target processing threads to process the segmentation processing data within the target idle time to obtain the data processing result.

Specifically, the server determines the processable data amount from the updated target idle time and the estimated number of threads, selects the corresponding data to be processed, and updates it as the segmentation processing data. The target processing threads then process the segmentation processing data to obtain the data processing result. This improves the processing efficiency of batch data and speeds up the processing of the data to be processed in the target batch task.

In the batch data processing method provided in this embodiment, after the segmentation processing data has been processed, the server obtains the remaining duration. When the remaining duration is greater than the second duration threshold, the target idle time is set to the remaining duration, and the processable data amount is determined from the updated target idle time and the estimated number of threads, which improves the processing efficiency of batch data. The data to be processed corresponding to the processable data amount is then selected from the data processing queue and updated as the segmentation processing data, and the target processing threads corresponding to the estimated number of threads are obtained. In this way batch data is processed faster without affecting the processing of real-time tasks.

In an embodiment, after step S205, that is, after the target processing threads are used to process the segmentation processing data within the target idle time, the batch data processing method further includes: monitoring the current system load in real time during data processing; and, if the current system load is greater than the burst load threshold, releasing the target processing threads, stopping the processing of the segmentation processing data, and updating the task status of the segmentation processing data to the stopped state.

Here, the burst load threshold is a preset threshold used to evaluate whether the system has received a large burst of load. Generally speaking, the burst load threshold is greater than the average busy load.

Specifically, when the server's current load is greater than the burst load threshold, the system has received a large number of task processing requests carrying real-time identifiers and its current load is too heavy. To ensure that these requests are processed in time, processing of the target batch task must be suspended and the target processing threads it occupies released; that is, the system actively reduces the number of target processing threads used for batch processing so that the data processing of real-time tasks takes precedence. In this embodiment, when the current system load is greater than the burst load threshold, the target processing threads are released, processing of the segmentation processing data is stopped, and its task status is updated to the stopped state. Understandably, in the next idle period, the server first processes the segmentation processing data in the data processing queue whose task status is stopped, which maintains the efficiency of batch processing.
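The burst-load check can be sketched as below. To keep the example deterministic, a sequential loop stands in for the pool of target processing threads, and the load is read from a canned sequence rather than from the operating system. The function name, the status strings, and the threshold value are all hypothetical.

```python
def process_segment(segment, read_load, burst_threshold, handle):
    """Process items until the load exceeds the burst threshold, then stop."""
    status = {item: "pending" for item in segment}
    for item in segment:
        if read_load() > burst_threshold:
            # Burst detected: give up the remaining work and mark it stopped,
            # so that it can be picked up first in the next idle window.
            for rest in segment:
                if status[rest] == "pending":
                    status[rest] = "stopped"
            break
        handle(item)
        status[item] = "completed"
    return status


loads = iter([0.2, 0.3, 0.95])   # sampled load before each item (sample values)
done = []
result = process_segment(["a", "b", "c", "d"], lambda: next(loads), 0.9, done.append)
print(result)
# {'a': 'completed', 'b': 'completed', 'c': 'stopped', 'd': 'stopped'}
```

In a threaded version the same check would typically be made by a monitor that signals the workers to exit; the status bookkeeping stays the same.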

It should be understood that the size of the sequence number of each step in the foregoing embodiment does not mean the order of execution. The execution sequence of each process should be determined by its function and internal logic, and should not constitute any limitation to the implementation process of the embodiment of the present application.

In one embodiment, a batch data processing device is provided, and the batch data processing device corresponds one-to-one to the batch data processing method in the foregoing embodiments. As shown in FIG. 7, the batch data processing device includes a data processing queue creation module 701, a target idle information determination module 702, a segmentation processing data determination module 703, a system original load acquisition module 704, a data processing result acquisition module 705, and a task status update module 706. The functional modules are described in detail as follows:

The data processing queue creation module 701 is used to select a target batch task from a non-real-time task queue, and create a data processing queue based on the target batch task. The data processing queue includes the data to be processed and the corresponding task status.

The target idle information determining module 702 is configured to determine target idle information from the target idle time queue, and the target idle information includes a start time, a target idle time length, and an estimated number of threads.

The segmentation processing data determination module 703 is used to obtain the number of data to be processed, obtain the target number based on the number of data to be processed, the target idle time, and the estimated number of threads, select the data to be processed corresponding to the target number from the data processing queue, and determine it as segmentation processing data.

The system original load obtaining module 704 is configured to obtain the original system load when the current time of the system is the start time.

The data processing result acquisition module 705 is used to determine that the system is in an idle state if the original system load is less than the busy load threshold, acquire the target processing threads corresponding to the estimated number of threads, and use the target processing threads to process the segmentation processing data within the target idle time to obtain the data processing result.

The task status update module 706 is used to update the task status of each piece of segmentation processing data in the data processing queue based on the data processing result.

Preferably, after the system original load acquisition module 704, the batch data processing device further includes: a system busy module.

The system busy module is used to determine that the system is in a busy state if the original system load is not less than the busy load threshold, and to repeat the step of determining target idle information from the target idle time queue.
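The interplay between the idle check and the system busy module can be sketched as a loop over the target idle time queue. The dictionary layout, the function name, and the load figures are hypothetical, and the load lookup is a stand-in for measuring the original system load at each start time.

```python
def choose_idle_window(target_idle_queue, read_load, busy_threshold):
    """Return the first window whose load at its start time is below the threshold."""
    for info in target_idle_queue:
        if read_load(info["start_time"]) < busy_threshold:
            return info   # system is idle: batch processing may start here
        # System is busy: fall through and try the next target idle information.
    return None


windows = [
    {"start_time": "12:00", "idle_seconds": 600, "estimated_threads": 4},
    {"start_time": "02:00", "idle_seconds": 1800, "estimated_threads": 8},
]
measured = {"12:00": 0.9, "02:00": 0.2}   # sample load readings
chosen = choose_idle_window(windows, measured.get, busy_threshold=0.7)
print(chosen["start_time"])  # 02:00
```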

Preferably, before the data processing queue creation module 701, the batch data processing device further includes: a task processing request acquisition module, a real-time identification module, and a non-real-time identification module.

The task processing request acquiring module is used to acquire the task processing request, and the task processing request includes the task to be processed and the task identifier corresponding to the task to be processed.

The real-time identification module is used to execute the task to be processed if the task identification is a real-time identification.

The non-real-time identification module is used to obtain the task type in the task processing request if the task identification is a non-real-time identification, determine the task priority of the task to be processed based on the task type, and store the task to be processed in the non-real-time task queue in order of task priority.

Preferably, before the module 702 for determining the target idle information, the batch data processing device further includes: a historical processing data acquisition module, an original idle time queue acquisition module, and an original idle information storage module.

The historical processing data acquisition module is used to acquire historical processing data. The historical processing data includes historical processing time, historical processing quantity, and historical thread number.

The original idle time queue acquisition module is used to perform big data modeling of historical processing time, historical processing quantity, and historical thread number based on machine learning algorithms to obtain the original idle time queue. The original idle time queue includes at least one piece of original idle information. The original idle information includes the start time, the original idle time, and the estimated number of threads.

The original idle information storage module is configured to store the original idle information in the target idle time queue if the original idle time is greater than the first duration threshold.

Preferably, the segmentation processing data determination module 703 includes: an estimated processing time acquisition unit, a target quantity acquisition unit, and a preset screening rule unit.

The estimated processing time obtaining unit is used to apply the estimated time calculation formula to the number of data to be processed and the estimated number of threads to obtain the estimated processing time corresponding to the data processing queue.

The target quantity obtaining unit is used to apply the target quantity calculation formula to the estimated processing time and the target idle time to obtain the target quantity.

The preset screening rule unit is used to select the to-be-processed data corresponding to the target quantity in the data processing queue according to the preset screening rule, and determine it as segmentation processing data.

Preferably, after the task status update module 706, the batch data processing device further includes: a remaining time acquisition module, a processable data amount determination module, and a target processing thread acquisition module.

The remaining time obtaining module is used to obtain the remaining time based on the current system time, the start time, and the target idle time if the task status of every piece of segmentation processing data in the data processing queue has been updated to the completed state.

The processable data amount determination module is configured to set the target idle time to the remaining time if the remaining time is greater than the second duration threshold, and determine the corresponding processable data amount based on the updated target idle time and the estimated number of threads.

The target processing thread acquisition module is used to select the data to be processed corresponding to the processable data amount from the data processing queue, update it as the segmentation processing data, obtain the target processing threads corresponding to the estimated number of threads, and repeat the step of using the target processing threads to process the segmentation processing data within the target idle time to obtain the data processing result.

Preferably, after the data processing result obtaining module 705, the batch data processing device further includes: a real-time monitoring module.

The real-time monitoring module is used to monitor the current system load in real time during data processing and, if the current system load is greater than the burst load threshold, to release the target processing threads, stop the processing of the segmentation processing data, and update the task status of the segmentation processing data to the stopped state.

For the specific limitation of the batch data processing device, please refer to the above limitation of the batch data processing method, which will not be repeated here. Each module in the above-mentioned batch data processing device can be implemented in whole or in part by software, hardware, and a combination thereof. The foregoing modules may be embedded in the form of hardware or independent of the processor in the computer device, or may be stored in the memory of the computer device in the form of software, so that the processor can call and execute the operations corresponding to the foregoing modules.

In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in FIG. 8. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a readable storage medium and an internal memory. The readable storage medium stores an operating system, computer-readable instructions, and a database. The internal memory provides an environment for running the operating system and the computer-readable instructions in the readable storage medium. The database of the computer device is used to store the data used or generated in executing the above batch data processing method, such as target batch tasks. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer-readable instructions, when executed by the processor, implement a batch data processing method. The readable storage medium provided in this embodiment includes a non-volatile readable storage medium and a volatile readable storage medium.

In one embodiment, a computer device is provided, including a memory, a processor, and computer-readable instructions stored in the memory and capable of running on the processor. The processor executes the computer-readable instructions to implement batches in the foregoing embodiments. The data processing method, such as S201-S207 shown in FIG. 2, or shown in FIG. 3 to FIG. 6, is not repeated here to avoid repetition. Alternatively, the processor implements the functions of the modules/units in this embodiment of the batch data processing device when the processor executes computer-readable instructions, for example, the data processing queue creation module 701, the target idle information determination module 702, and the segmentation shown in FIG. The functions of the processed data determining module 703, the system original load acquiring module 704, the data processing result acquiring module 705, and the task status updating module 706 are not repeated here to avoid repetition.

In an embodiment, one or more readable storage media storing computer-readable instructions are provided. When the computer-readable instructions are executed by one or more processors, the one or more processors execute the batch data processing method of the foregoing embodiments, for example S201-S207 shown in FIG. 2, or the steps shown in FIG. 3 to FIG. 6; details are not repeated here. Alternatively, when the computer-readable instructions are executed by one or more processors, the one or more processors implement the functions of the modules/units of the batch data processing device embodiment, for example the functions of the data processing queue creation module 701, the target idle information determination module 702, the segmentation processing data determination module 703, the system original load acquisition module 704, the data processing result acquisition module 705, and the task status update module 706 shown in FIG. 7; details are not repeated here. The readable storage medium provided in this embodiment includes a non-volatile readable storage medium and a volatile readable storage medium.

A person of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be implemented by computer-readable instructions instructing the relevant hardware. The computer-readable instructions can be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the above method embodiments. Any reference to memory, storage, database or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Those skilled in the art can clearly understand that, for convenience and conciseness of description, the division into the functional units and modules described above is only an example. In practical applications, the above functions can be allocated to different functional units and modules as required; that is, the internal structure of the device can be divided into different functional units or modules to complete all or part of the functions described above.

The above embodiments are only used to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments can still be modified, or some of their technical features can be equivalently replaced. Such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this application, and should be included within the scope of protection of this application.

Claims

A batch data processing method, characterized in that it comprises:

Selecting a target batch task from a non-real-time task queue, and creating a data processing queue based on the target batch task, the data processing queue including the data to be processed and the corresponding task status;

Determine target idle information from the target idle time queue, where the target idle information includes a start time, a target idle time length, and an estimated number of threads;

Obtain the number of data to be processed corresponding to the data to be processed, obtain a target number based on the number of data to be processed, the target idle time and the estimated number of threads, select the data to be processed corresponding to the target number from the data processing queue, and determine it as segmentation processing data;

When the current time of the system is the start time, obtain the original load of the system;

If the original load of the system is less than the busy load threshold, the system is determined to be in an idle state, the target processing thread corresponding to the estimated number of threads is obtained, and the target processing thread is used to perform data processing on the segmentation processing data within the target idle time to obtain a data processing result;

Based on the data processing result, the task status of each segmentation processing data in the data processing queue is updated.
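For illustration only, the load check, threaded processing, and status update recited above might be sketched as follows. This is a minimal sketch under stated assumptions: the names `Item`, `run_idle_window`, `read_load`, and `handler`, and the status strings, are hypothetical and do not come from the application. Waiting for the start time and enforcing the end of the idle window are omitted for brevity.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Item:
    payload: int
    status: str = "pending"  # hypothetical status names: pending / done / failed


def run_idle_window(
    queue: List[Item],
    target_count: int,
    threads: int,
    read_load: Callable[[], float],
    busy_threshold: float,
    handler: Callable[[int], bool],
) -> bool:
    """Process up to target_count pending items if the system is idle.

    Returns False without touching the queue when the load is not below
    the busy threshold, so the caller can pick another idle window.
    """
    if read_load() >= busy_threshold:
        return False  # system is busy

    # Segmentation processing data: the first target_count pending items
    # (queue order is an assumed selection rule).
    batch = [item for item in queue if item.status == "pending"][:target_count]

    # One worker per estimated thread; map() keeps results in input order.
    with ThreadPoolExecutor(max_workers=threads) as pool:
        outcomes = list(pool.map(handler, [item.payload for item in batch]))

    # Update the task status of each processed item from its result.
    for item, ok in zip(batch, outcomes):
        item.status = "done" if ok else "failed"
    return True
```

In this sketch a `False` return corresponds to the busy case, in which the caller would go back to the target idle time queue and choose another window.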
The batch data processing method according to claim 1, wherein after the obtaining the original load of the system when the current time of the system is the start time, the batch data processing method further comprises:

If the original load of the system is not less than the busy load threshold, it is determined that the system is in a busy state, and the determination of target idle information from the target idle time queue is repeated.
The batch data processing method according to claim 1, wherein before the selecting a target batch task from a non-real-time task queue and creating a data processing queue based on the target batch task, the batch data processing method further comprises:

Acquiring a task processing request, where the task processing request includes a task to be processed and a task identifier corresponding to the task to be processed;

If the task identifier is a real-time identifier, execute the task to be processed;

If the task identifier is a non-real-time identifier, the task type in the task processing request is acquired, the task priority of the task to be processed is determined based on the task type, and the task to be processed is stored in the non-real-time task queue in order of task priority.
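As a non-limiting illustration of this routing step, a priority heap could hold the non-real-time tasks. The identifier strings, the `TYPE_PRIORITY` mapping, and the `TaskRouter` class are hypothetical names chosen for the sketch; the application does not specify how task types map to priorities.

```python
import heapq
import itertools
from typing import Any, List, Optional, Tuple

# Hypothetical mapping from task type to priority (lower value runs first).
TYPE_PRIORITY = {"billing": 0, "report": 1, "archive": 2}


class TaskRouter:
    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, Any]] = []
        self._seq = itertools.count()  # tie-breaker keeps FIFO order per priority
        self.executed: List[Any] = []  # stand-in for "execute immediately"

    def submit(self, task: Any, identifier: str, task_type: Optional[str] = None) -> None:
        if identifier == "realtime":
            self.executed.append(task)  # real-time tasks bypass the queue
            return
        # Unknown types get the lowest priority (an assumption of this sketch).
        priority = TYPE_PRIORITY.get(task_type, max(TYPE_PRIORITY.values()) + 1)
        heapq.heappush(self._heap, (priority, next(self._seq), task))

    def next_batch_task(self) -> Optional[Any]:
        """Pop the highest-priority non-real-time task, or None if empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]
```

The sequence counter is a design choice of the sketch: it prevents the heap from comparing task objects directly and preserves submission order among tasks of equal priority.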
The batch data processing method according to claim 1, wherein before the determining target idle information from the target idle time queue, the batch data processing method further comprises:

Acquiring historical processing data, the historical processing data including historical processing time, historical processing quantity, and historical thread number;

Big data modeling is performed on the historical processing time, the historical processing quantity, and the historical thread number based on a machine learning algorithm, and an original idle time queue is obtained. The original idle time queue includes at least one piece of original idle information. The original idle information includes the start time, the original idle time and the estimated number of threads;

If the original idle time is greater than the first time threshold, the original idle information is stored on the target idle time queue.
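The final filtering step can be illustrated with a short sketch. The machine-learning model that produces the original idle time queue is not reproduced here; the sketch assumes its output is already available as a list, and the names `IdleInfo` and `build_target_idle_queue` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class IdleInfo:
    start_time: str        # e.g. "02:00"; the format is an assumption
    idle_minutes: float    # original idle time
    threads: int           # estimated number of threads


def build_target_idle_queue(original: List[IdleInfo], first_threshold_minutes: float) -> List[IdleInfo]:
    """Keep only idle windows strictly longer than the first time threshold."""
    return [info for info in original if info.idle_minutes > first_threshold_minutes]
```

For example, with a threshold of 10 minutes, a 5-minute window would be discarded while a 45-minute window would enter the target idle time queue.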
The batch data processing method according to claim 1, wherein the obtaining a target number based on the number of data to be processed, the target idle time and the estimated number of threads, selecting the data to be processed corresponding to the target number from the data processing queue, and determining it as segmentation processing data comprises:

Calculating the number of data to be processed and the number of estimated threads by using an estimated time calculation formula to obtain the estimated processing time corresponding to the data processing queue;

Calculate the estimated processing time and the target idle time using a target quantity calculation formula to obtain the target quantity;

According to a preset filtering rule, the data to be processed corresponding to the target quantity is selected from the data processing queue, and determined as the segmentation processing data.
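The two calculation formulas are not reproduced in this section, so the following sketch uses assumed forms purely for illustration: the estimated processing time is taken as the item count times an assumed average per-item time divided by the thread count, and the target quantity scales the item count by the ratio of idle time to estimated time. The parameter `seconds_per_item` and the first-in-queue selection rule are assumptions, not the application's formulas.

```python
import math
from typing import List, TypeVar

T = TypeVar("T")


def estimated_processing_seconds(pending_count: int, threads: int, seconds_per_item: float) -> float:
    """Assumed formula: total work divided evenly across the threads."""
    return pending_count * seconds_per_item / threads


def target_quantity(pending_count: int, estimated_seconds: float, idle_seconds: float) -> int:
    """Assumed formula: process everything if it fits, else a proportional share."""
    if estimated_seconds <= idle_seconds:
        return pending_count
    return math.floor(pending_count * idle_seconds / estimated_seconds)


def select_segment(queue: List[T], quantity: int) -> List[T]:
    """Assumed preset filtering rule: take items in queue order."""
    return queue[:quantity]
```

Under these assumptions, 1000 pending items at 0.5 seconds each on 4 threads give an estimated 125 seconds; a 50-second idle window then yields a target quantity of 400 items.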
The batch data processing method according to claim 1, wherein after the task status of each segmentation processing data in the data processing queue is updated based on the data processing result, the batch data processing method further comprises:

If the task status of each segmentation processing data in the data processing queue is updated to a completed processing status, obtaining the remaining time based on the current system time, the start time, and the target idle time;

If the remaining duration is greater than the second duration threshold, update the remaining duration to a target idle duration, and determine the corresponding processable data amount based on the updated target idle duration and the estimated number of threads;

Select, from the data processing queue, the data to be processed corresponding to the processable data amount, update it as the segmentation processing data, obtain the target processing thread corresponding to the estimated number of threads, and repeat the step of using the target processing thread to perform data processing on the segmentation processing data within the target idle time to obtain the data processing result.
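The remaining-time check can be sketched as below, reading the remaining time as the start time plus the target idle time minus the current time. The processable data amount is computed with an assumed average per-item time (`seconds_per_item`), which is a hypothetical parameter; all function names are illustrative.

```python
import math
from typing import Optional


def remaining_idle_seconds(now: float, start_time: float, idle_seconds: float) -> float:
    """Time left in the idle window, with all values in seconds."""
    return start_time + idle_seconds - now


def next_segment_size(
    now: float,
    start_time: float,
    idle_seconds: float,
    second_threshold: float,
    threads: int,
    seconds_per_item: float,
) -> Optional[int]:
    """Return how many more items fit in the window, or None if too little time is left."""
    remaining = remaining_idle_seconds(now, start_time, idle_seconds)
    if remaining <= second_threshold:
        return None  # not worth starting another segment
    # The remaining time becomes the new target idle time.
    return math.floor(remaining * threads / seconds_per_item)
```

For instance, with a 600-second window that started at t = 1000, a current time of 1300 leaves 300 seconds; with 4 threads and 0.5 seconds per item, about 2400 further items could be scheduled.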
The batch data processing method according to claim 1, wherein after the target processing thread is used to perform data processing on the segmentation processing data within the target idle time, the batch data processing method further comprises:

Monitor the current load of the system in real time during data processing; if the current load of the system is greater than the burst load threshold, release the target processing thread, stop the data processing of the segmentation processing data, and update the task status of the segmentation processing data to a stopped status.
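One way to picture the burst-load guard is a cooperative check: each worker samples the load before starting an item and, once the burst threshold is exceeded, sets a shared flag so that the remaining items are marked as stopped. This is a simplified sketch, not the application's monitoring mechanism; `process_with_burst_guard`, `read_load`, and the status strings are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Hashable, Iterable, Tuple


def process_with_burst_guard(
    items: Iterable[Hashable],
    handler: Callable[[Hashable], None],
    read_load: Callable[[], float],
    burst_threshold: float,
    threads: int,
) -> Dict[Hashable, str]:
    """Process items until the load exceeds burst_threshold, then stop the rest."""
    stop = threading.Event()

    def work(item: Hashable) -> Tuple[Hashable, str]:
        # Once any worker has seen a burst, skip all remaining items.
        if stop.is_set() or read_load() > burst_threshold:
            stop.set()
            return item, "stopped"
        handler(item)
        return item, "done"

    status: Dict[Hashable, str] = {}
    # Leaving the with-block releases the worker threads.
    with ThreadPoolExecutor(max_workers=threads) as pool:
        for item, state in pool.map(work, items):
            status[item] = state
    return status
```

Checking the load between items rather than interrupting an item midway is a design choice of the sketch: it keeps each item's processing atomic at the cost of a slightly delayed reaction to a burst.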
A batch data processing device, characterized in that it comprises:

The data processing queue creation module is used to select target batch tasks from the non-real-time task queue, and create a data processing queue based on the target batch tasks. The data processing queue includes the data to be processed and the corresponding task status;

The target idle information determination module is used to determine the target idle information from the target idle time queue, and the target idle information includes the start time, the target idle time and the estimated number of threads;

The segmentation processing data determination module is used to obtain the number of data to be processed corresponding to the data to be processed, obtain the target number based on the number of data to be processed, the target idle time and the estimated number of threads, select the data to be processed corresponding to the target number from the data processing queue, and determine it as segmentation processing data;

The system original load acquisition module is used to obtain the system original load when the current time of the system is the start time;

The data processing result acquisition module is used to determine that the system is in an idle state if the original load of the system is less than the busy load threshold, obtain the target processing thread corresponding to the estimated number of threads, and use the target processing thread to perform data processing on the segmentation processing data within the target idle time to obtain a data processing result;

The task status update module is used to update the task status of each segmentation processing data in the data processing queue based on the data processing result.
A computer device comprising a memory, a processor, and computer-readable instructions stored in the memory and capable of running on the processor, wherein the processor implements the following steps when executing the computer-readable instructions:

Selecting a target batch task from a non-real-time task queue, and creating a data processing queue based on the target batch task, the data processing queue including the data to be processed and the corresponding task status;

Determine target idle information from the target idle time queue, where the target idle information includes a start time, a target idle time length, and an estimated number of threads;

Obtain the number of data to be processed corresponding to the data to be processed, obtain a target number based on the number of data to be processed, the target idle time and the estimated number of threads, select the data to be processed corresponding to the target number from the data processing queue, and determine it as segmentation processing data;

When the current time of the system is the start time, obtain the original load of the system;

If the original load of the system is less than the busy load threshold, the system is determined to be in an idle state, the target processing thread corresponding to the estimated number of threads is obtained, and the target processing thread is used to perform data processing on the segmentation processing data within the target idle time to obtain a data processing result;

Based on the data processing result, the task status of each segmentation processing data in the data processing queue is updated.
The computer device according to claim 9, wherein after the obtaining the original load of the system when the current time of the system is the start time, the processor further implements the following step when executing the computer-readable instructions:

If the original load of the system is not less than the busy load threshold, it is determined that the system is in a busy state, and the determination of target idle information from the target idle time queue is repeated.
The computer device according to claim 9, wherein before the target batch task is selected from the non-real-time task queue and the data processing queue is created based on the target batch task, the processor further implements the following steps when executing the computer-readable instructions:

Acquiring a task processing request, where the task processing request includes a task to be processed and a task identifier corresponding to the task to be processed;

If the task identifier is a real-time identifier, execute the task to be processed;

If the task identifier is a non-real-time identifier, the task type in the task processing request is acquired, the task priority of the task to be processed is determined based on the task type, and the task to be processed is stored in the non-real-time task queue in order of task priority.
The computer device according to claim 9, wherein before the determining target idle information from the target idle time queue, the processor further implements the following steps when executing the computer-readable instructions:

Acquiring historical processing data, the historical processing data including historical processing time, historical processing quantity, and historical thread number;

Big data modeling is performed on the historical processing time, the historical processing quantity, and the historical thread number based on a machine learning algorithm, and an original idle time queue is obtained. The original idle time queue includes at least one piece of original idle information. The original idle information includes the start time, the original idle time and the estimated number of threads;

If the original idle time is greater than the first time threshold, the original idle information is stored on the target idle time queue.
The computer device according to claim 9, wherein the obtaining a target number based on the number of data to be processed, the target idle time and the estimated number of threads, selecting the data to be processed corresponding to the target number from the data processing queue, and determining it as segmentation processing data comprises:

Calculating the number of data to be processed and the number of estimated threads by using an estimated time calculation formula to obtain the estimated processing time corresponding to the data processing queue;

Calculate the estimated processing time and the target idle time using a target quantity calculation formula to obtain the target quantity;

According to a preset filtering rule, the data to be processed corresponding to the target quantity is selected from the data processing queue, and determined as the segmentation processing data.
The computer device according to claim 9, wherein after the task status of each segmentation processing data in the data processing queue is updated based on the data processing result, the processor further implements the following steps when executing the computer-readable instructions:

If the task status of each segmentation processing data in the data processing queue is updated to a completed processing status, obtaining the remaining time based on the current system time, the start time, and the target idle time;

If the remaining duration is greater than the second duration threshold, update the remaining duration to a target idle duration, and determine the corresponding processable data amount based on the updated target idle duration and the estimated number of threads;

Select, from the data processing queue, the data to be processed corresponding to the processable data amount, update it as the segmentation processing data, obtain the target processing thread corresponding to the estimated number of threads, and repeat the step of using the target processing thread to perform data processing on the segmentation processing data within the target idle time to obtain the data processing result.
One or more readable storage media storing computer-readable instructions, wherein, when the computer-readable instructions are executed by one or more processors, the one or more processors are caused to execute the following steps:

Selecting a target batch task from a non-real-time task queue, and creating a data processing queue based on the target batch task, the data processing queue including the data to be processed and the corresponding task status;

Determine target idle information from the target idle time queue, where the target idle information includes a start time, a target idle time length, and an estimated number of threads;

Obtain the number of data to be processed corresponding to the data to be processed, obtain a target number based on the number of data to be processed, the target idle time and the estimated number of threads, select the data to be processed corresponding to the target number from the data processing queue, and determine it as segmentation processing data;

When the current time of the system is the start time, obtain the original load of the system;

If the original load of the system is less than the busy load threshold, the system is determined to be in an idle state, the target processing thread corresponding to the estimated number of threads is obtained, and the target processing thread is used to perform data processing on the segmentation processing data within the target idle time to obtain a data processing result;

Based on the data processing result, the task status of each segmentation processing data in the data processing queue is updated.
The readable storage medium according to claim 15, wherein after the obtaining the original load of the system when the current time of the system is the start time, the computer-readable instructions, when executed by one or more processors, cause the one or more processors to further execute the following step:

If the original load of the system is not less than the busy load threshold, it is determined that the system is in a busy state, and the determination of target idle information from the target idle time queue is repeated.
The readable storage medium according to claim 15, wherein before the target batch task is selected from the non-real-time task queue and the data processing queue is created based on the target batch task, the computer-readable instructions, when executed by one or more processors, cause the one or more processors to further execute the following steps:

Acquiring a task processing request, where the task processing request includes a task to be processed and a task identifier corresponding to the task to be processed;

If the task identifier is a real-time identifier, execute the task to be processed;

If the task identifier is a non-real-time identifier, the task type in the task processing request is acquired, the task priority of the task to be processed is determined based on the task type, and the task to be processed is stored in the non-real-time task queue in order of task priority.
The readable storage medium according to claim 15, wherein before the target idle information is determined from the target idle time queue, the computer-readable instructions, when executed by one or more processors, cause the one or more processors to further execute the following steps:

Acquiring historical processing data, the historical processing data including historical processing time, historical processing quantity, and historical thread number;

Big data modeling is performed on the historical processing time, the historical processing quantity, and the historical thread number based on a machine learning algorithm, and an original idle time queue is obtained. The original idle time queue includes at least one piece of original idle information. The original idle information includes the start time, the original idle time and the estimated number of threads;

If the original idle time is greater than the first time threshold, the original idle information is stored on the target idle time queue.
The readable storage medium according to claim 15, wherein the obtaining a target number based on the number of data to be processed, the target idle time and the estimated number of threads, selecting the data to be processed corresponding to the target number from the data processing queue, and determining it as segmentation processing data comprises:

Calculating the number of data to be processed and the number of estimated threads by using an estimated time calculation formula to obtain the estimated processing time corresponding to the data processing queue;

Calculate the estimated processing time and the target idle time using a target quantity calculation formula to obtain the target quantity;

According to a preset filtering rule, the data to be processed corresponding to the target quantity is selected from the data processing queue and determined as the segmentation processing data.
The readable storage medium according to claim 15, wherein after the task status of each segmentation processing data in the data processing queue is updated based on the data processing result, the computer-readable instructions, when executed by one or more processors, cause the one or more processors to further execute the following steps:

If the task status of each segmentation processing data in the data processing queue is updated to a completed processing status, obtaining the remaining time based on the current system time, the start time, and the target idle time;

If the remaining duration is greater than the second duration threshold, update the remaining duration to a target idle duration, and determine the corresponding processable data amount based on the updated target idle duration and the estimated number of threads;

Select, from the data processing queue, the data to be processed corresponding to the processable data amount, update it as the segmentation processing data, obtain the target processing thread corresponding to the estimated number of threads, and repeat the step of using the target processing thread to perform data processing on the segmentation processing data within the target idle time to obtain the data processing result.