CN109597685B - Task allocation method, device and server - Google Patents

Publication number
CN109597685B
Authority
CN
China
Prior art keywords
processing
data
task
time period
nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811157251.5A
Other languages
Chinese (zh)
Other versions
CN109597685A (en
Inventor
吴轲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Advanced New Technologies Co Ltd
Advantageous New Technologies Co Ltd
Original Assignee
Advanced New Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Advanced New Technologies Co Ltd
Priority to CN201811157251.5A
Publication of CN109597685A
Application granted
Publication of CN109597685B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083 Techniques for rebalancing the load in a distributed system
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)
  • Computer And Data Communications (AREA)

Abstract

The specification provides a task allocation method, a task allocation device, and a server. The task allocation method comprises: acquiring operation data, task processing data, and usage data of dependent resources of a plurality of processing nodes in a first time period; determining the task bearing capacity of a target processing node according to these data; and allocating, to the target processing node, data processing tasks for a second time period that match the task bearing capacity. In the embodiments of the specification, the task bearing capacity of a processing node is determined at a fine granularity from the usage data of the resources on which the processing node depends, the operation data of the processing node, and the task processing data in the first time period, and the data processing tasks for the second time period are then allocated to the processing node according to that capacity. Tasks can thus be allocated accurately and reasonably in light of the actual condition of the processing node and of the system resources on which it depends, which improves resource utilization and keeps data processing efficient and stable.

Description

Task allocation method, device and server
Technical Field
The present disclosure belongs to the technical field of the internet, and in particular, relates to a task allocation method, a task allocation device, and a task allocation server.
Background
In a distributed data processing system, there are typically a plurality of processing nodes. The processing nodes are respectively responsible for processing data processing tasks distributed by the processing system, so that batch processing can be carried out on the data processing tasks to be processed.
Most existing task allocation methods stress-test the processing capacity of the processing nodes in a system in advance, and determine from the test results the processing speed of a processing node (for example, the number of data processing tasks the node completes per second) and a discount ratio. A fixed task allocation amount is then calculated from the processing speed and the discount ratio, and the data processing tasks to be processed are allocated to the processing nodes at intervals according to that fixed amount.
With such a method, data processing tasks can only be allocated to the processing nodes according to a fixed task allocation amount determined by a static test. During actual data processing, however, the resource usage of the whole system, the running environment, and the condition of the processing nodes themselves change dynamically (for example, the resources currently available to the system decrease, or some processing nodes temporarily cannot work normally), and these changes affect the capacity of a processing node to handle specific data processing tasks. Allocating tasks according to a fixed amount determined by a static test is therefore often inaccurate and unreasonable and prone to error, so that system resources cannot be used effectively and the system may even run unstably. A more accurate and reasonable task allocation method is thus needed, one that allocates data processing tasks to processing nodes accurately, so that resource utilization is improved and data processing is efficient and stable.
Disclosure of Invention
The specification aims to provide a task allocation method, a task allocation device and a task allocation server, so that tasks are accurately and reasonably allocated to processing nodes according to specific conditions of a system and the processing nodes, and the utilization rate of resources is improved, so that data processing is efficient and stable.
The task allocation method, the task allocation device and the server provided by the specification are realized in the following way:
A task allocation method, comprising: acquiring operation data, task processing data, and usage data of dependent resources of a plurality of processing nodes in a first time period; determining the task bearing capacity of a target processing node according to the operation data, the task processing data, and the usage data of the dependent resources of the plurality of processing nodes in the first time period; and allocating, to the target processing node, data processing tasks for a second time period that match the task bearing capacity.
A task allocation device, comprising: an acquisition module configured to acquire operation data, task processing data, and usage data of dependent resources of a plurality of processing nodes in a first time period; a determining module configured to determine the task bearing capacity of a target processing node according to the operation data, the task processing data, and the usage data of the dependent resources of the plurality of processing nodes in the first time period; and an allocation module configured to allocate, to the target processing node, data processing tasks for a second time period that match the task bearing capacity.
A server, comprising a processor and a memory for storing processor-executable instructions, wherein the processor, when executing the instructions, implements: acquiring operation data, task processing data, and usage data of dependent resources of a plurality of processing nodes in a first time period; determining the task bearing capacity of a target processing node according to the operation data, the task processing data, and the usage data of the dependent resources of the plurality of processing nodes in the first time period; and allocating, to the target processing node, data processing tasks for a second time period that match the task bearing capacity.
A computer-readable storage medium having stored thereon computer instructions that, when executed, implement: acquiring operation data, task processing data, and usage data of dependent resources of a plurality of processing nodes in a first time period; determining the task bearing capacity of a target processing node according to the operation data, the task processing data, and the usage data of the dependent resources of the plurality of processing nodes in the first time period; and allocating, to the target processing node, data processing tasks for a second time period that match the task bearing capacity.
According to the task allocation method, device, and server provided by the specification, the task bearing capacity of a processing node is determined at a fine granularity from the usage data of the resources on which the processing nodes depend, the operation data of the processing nodes, and the task processing data in the first time period, and the data processing tasks for the second time period are then allocated to the processing node according to that capacity. The overall resource usage of the system during data processing, the specific condition of the processing nodes, and other factors are thereby taken into account together, so that suitable data processing tasks are allocated to the processing nodes accurately and reasonably, resource utilization is improved, and the processing nodes can process data efficiently and stably.
Drawings
In order to more clearly illustrate the embodiments of the present description or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some of the embodiments described in the present description, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of one embodiment of the structural composition of a system to which the task allocation method provided by the embodiments of the present specification is applied;
FIG. 2 is a schematic diagram of one embodiment of distributing data processing tasks of a processing node using the task distribution method provided by the embodiments of the present disclosure in one example scenario;
FIG. 3 is a schematic diagram of one embodiment of distributing data processing tasks of a processing node using the task distribution method provided by the embodiments of the present disclosure in one example scenario;
FIG. 4 is a schematic diagram of one embodiment of a flow of a task allocation method provided by embodiments of the present disclosure;
FIG. 5 is a schematic diagram of one embodiment of a structure of a server provided by embodiments of the present description;
FIG. 6 is a schematic diagram of one embodiment of a structure of a task allocation device provided by embodiments of the present specification.
Detailed Description
In order to make the technical solutions in the present specification better understood by those skilled in the art, the technical solutions in the embodiments of the present specification will be clearly and completely described below with reference to the drawings in the embodiments of the present specification, and it is obvious that the described embodiments are only some embodiments of the present specification, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are intended to be within the scope of the present disclosure.
Conventional task allocation methods usually do not consider how the specific conditions of the system and the processing nodes (such as the usage of system resources, the running condition of the processing nodes, and the backlog of unprocessed tasks at the processing nodes) affect the processing capacity of the processing nodes when determining the task amount to allocate. Instead, they determine the processing speed by static testing: the processing capacity of the processing nodes is stress-tested in advance, the processing speed is determined from the test results, a fixed task allocation amount is derived from it, and tasks are allocated to the processing nodes according to that fixed amount.
However, the actual processing capacity of a processing node may be affected by the system environment in which the node is located, the system resources it uses, and the condition of the node itself. For example, jitter in a downstream processing system on which a processing node depends may reduce the node's processing capacity. Allocating data processing tasks according to a fixed task allocation amount determined by static testing is therefore often inaccurate, so that system resources cannot be used effectively and the system may even become unstable.
In view of this, the specification proposes acquiring and using data that reflect the usage of system resources (i.e., the resources on which a plurality of processing nodes depend) and the specific condition of the processing nodes (including their running condition and their task processing in the previous time period), determining from these data a task bearing capacity suited to the current state of the system resources and the processing nodes, and then allocating data processing tasks to the processing nodes according to that capacity. Tasks can thus be allocated accurately and reasonably according to the specific conditions of the system and the processing nodes, resource utilization is improved, and the processing nodes can process data efficiently and stably.
The embodiments of the specification provide a task allocation method that can be applied to a system architecture comprising a scheduling server, a sensing module, and a processor cluster.
Reference may be made to FIG. 1. The processor cluster may be a set comprising a plurality of processors, where each processor in the cluster can be understood as a processing node that processes the data processing tasks allocated to it. The sensing module may be deployed in the system architecture and configured to collect, at regular intervals (for example, every 30 seconds), the usage data of the resources on which the processors in the processor cluster depended in the previous time period (for example, the CPU and database of the whole system), together with the operation data and task processing data of each individual processor in the cluster, and to send these data to the scheduling server. The scheduling server may process and analyze the received data for the previous time period to determine the current state of the overall system resources and the running condition of each individual processor. On that basis, while balancing resource utilization against running stability, it calculates the task bearing capacity of a processor for the next time period (that is, the maximum amount of data processing tasks the processor can process in the next time period), and then allocates the data processing tasks for the next time period to the processor according to that capacity.
In this embodiment, the server may be an electronic device with data operation, storage, and network interaction functions; it may also be software that runs on such an electronic device and supports data processing, storage, and network interaction. The number of servers is not particularly limited in the present embodiment. The server may be one server, several servers, or a server cluster formed by several servers.
In this embodiment, the processor may be an integrated circuit module capable of interpreting computer instructions and performing specific data processing. For example, the processor may be a Central Processing Unit (CPU) or the like. A plurality of processors are combined to form the processor cluster used for batch data processing.
In this embodiment, the sensing module may be an acquisition device deployed in the system architecture and configured to monitor and collect the usage of the system resources on which a specified processor depends, as well as related data such as the running condition and task processing condition of the processor itself. The specification does not limit the specific type of the sensing module.
In one scenario example, the task allocation method provided in the embodiments of the present disclosure may be applied to allocate, to each processor in the processor cluster, a data processing task to be processed in a next time period.
In this scenario example, it is considered that the specific processing capability of each processor in the processor cluster is affected by multiple factors such as the use condition of system resources on which the processor depends, the running condition of the processor itself, and the task processing condition of the last time period. Therefore, in the implementation, the usage data of the resources relied by the processing node in the previous time period (i.e. the first time period), the operation data of each processor in the processor cluster in the previous time period, and the task processing data of the processor in the previous time period can be monitored and acquired through a sensing module preset in the system.
The resources relied by the processor may be system resources except the processor itself, which are needed to be used by the processor in the data processing process. For example, it may be a CPU of a system that the processor calls when performing data processing, or a database of an accessed system, or an I/O interface of a system used, or the like. The specific type of resources on which the processors rely is not limited in this specification. The above-mentioned usage data of the resources relied on by the processing node in the previous period can be understood as, in particular, parameter data for characterizing the overall usage of the system resources except the processor itself in the previous period. Specifically, the usage data of the resources on which the processor depends may include: the utilization rate of the system CPU in the last time period, the access amount of the system database in the last time period, the occupancy rate of the system memory in the last time period and the like. Of course, it should be noted that the above-listed usage data of the resources relied on by the processor is only a schematic illustration, and other suitable parameter data may be selected to be used as the usage data of the resources relied on by the processor according to a specific scenario when the method is implemented. The present specification is not limited to this.
The above-mentioned operation data of the processor in the previous period can be understood as, in particular, parameter data for characterizing the operation of the processor in the previous period. Specifically, the operation data of the processor in the previous period may include: the resource data of the processor in the last time period (for example, the processing resource amount of the processor itself occupied when the data processing is performed in the last time period of each processor), the operation state of the processor in the last time period (for example, normal operation or abnormal operation), and the like. Of course, it should be noted that the above-listed operation data of the processor is only a schematic illustration, and other suitable parameter data may be selected as the operation data of the processor according to a specific scenario when the present invention is implemented. The present specification is not limited to this.
The task processing data of the processor in the previous period may be understood as parameter data for characterizing the processing situation of the processor data processing task in the previous period. Specifically, the task processing data of the processor in the previous period may include: the allocation number of data processing tasks of the processor in the last period, the number of data processing tasks which the processor has not completed in the last period, the number of data processing tasks which the processor has failed in processing in the last period, the processing time length of the processor processing the data processing tasks in the last period, and the like. It should be noted that the task processing data of the above-mentioned processor is merely illustrative, and other suitable parameter data may be selected as the task processing data of the above-mentioned processor according to a specific scenario when the present invention is implemented. The present specification is not limited to this.
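As a non-limiting illustration, the three kinds of data described above (usage data of dependent resources, operation data, and task processing data) could be represented by record types such as the following. This is only a sketch: the class names, field names, and units are assumptions made for illustration and do not appear in the specification, and the `completed` property assumes that failed tasks are counted separately from unfinished ones.

```python
from dataclasses import dataclass
from enum import Enum


class RunState(Enum):
    """Operation state of a processor in a time period (hypothetical names)."""
    NORMAL = "normal"
    ABNORMAL = "abnormal"


@dataclass(frozen=True)
class ResourceUsage:
    """Usage data of the resources the processors depend on, for one period."""
    cpu_utilization: float   # system CPU utilization, fraction in [0, 1]
    db_access_count: int     # accesses to the system database in the period
    memory_occupancy: float  # system memory occupancy, fraction in [0, 1]


@dataclass(frozen=True)
class OperationData:
    """Operation data of one processor for one period."""
    processor_id: str
    run_state: RunState
    own_resource_used: float  # share of the processor's own resources used


@dataclass(frozen=True)
class TaskProcessingData:
    """Task processing data of one processor for one period."""
    processor_id: str
    allocated: int             # tasks allocated in the period
    unfinished: int            # tasks not completed in the period
    failed: int                # tasks whose processing failed
    processing_seconds: float  # time spent processing tasks

    @property
    def completed(self) -> int:
        # Assumption: failed and unfinished tasks are disjoint counts.
        return max(0, self.allocated - self.unfinished - self.failed)
```

A sensing module could emit one `ResourceUsage` record per time period plus one `OperationData` and one `TaskProcessingData` record per processor.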
In particular, the sensing module may be pre-deployed in the system and capable of performing data interaction with each processor in the system and other resource modules in the system other than the above-mentioned processor. Specifically, the sensing module can acquire log records of each processor in the processor cluster and other resource modules except the processor of the system through data interaction; the log record can be further analyzed to obtain specific usage data of the resources on which the processing node depends in the previous time period, operation data of the processor in the previous time period, task processing data of the processor in the previous time period, and the like.
After obtaining the data, the sensing module can send them to the scheduling server in a wired or wireless manner. After receiving the data, the scheduling server can determine through data analysis the task processing condition of the processors in the previous time period, the running condition of the processors, and the usage of the system resources other than the processors. Combining this information, and while taking the overall running stability of the system into account, it can determine the maximum amount of data processing tasks that each processor can complete in the next time period, that is, the task bearing capacity of the processor. It then allocates to the processors in the processor cluster data processing tasks that match the task bearing capacity, as the tasks to be processed in the next time period, so that the processor cluster uses system resources effectively and its overall data processing efficiency is improved.
Specifically, referring to fig. 2, the scheduling server may determine, according to the operation data of the processor in the previous time period and the task processing data of the processor in the previous time period, a processor that meets the processing requirement of the data processing task.
The processor meeting the processing requirements of the data processing task is specifically understood to be a processor capable of operating normally and also capable of receiving and processing a new data processing task (i.e., a data processing task in a next period).
In implementation, for example, the scheduling server may first select, according to the operation data of the processors in the previous time period, the processors in the processor cluster whose operation state is normal, as first processors. Then, according to the task processing data of the processors in the previous time period, it may select from the first processors those whose number of uncompleted data processing tasks is less than or equal to a remaining-task threshold, as second processors, that is, the processors meeting the processing requirements of the data processing tasks. It should be noted that this way of determining the processors that meet the processing requirements is merely illustrative and should not be construed as unduly limiting the present specification.
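The two-stage screening described above can be sketched as follows. This is a minimal illustration only; the `ProcessorStatus` type, its field names, and the function name are hypothetical and are not defined in the specification.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ProcessorStatus:
    """Per-processor summary for the previous time period (hypothetical)."""
    processor_id: str
    running_normally: bool  # from the operation data
    unfinished_tasks: int   # from the task processing data


def eligible_processors(
    statuses: List[ProcessorStatus], remaining_task_threshold: int
) -> List[ProcessorStatus]:
    """Return the processors that can accept new tasks in the next period."""
    # Stage 1: keep processors whose operation state is normal.
    first = [s for s in statuses if s.running_normally]
    # Stage 2: keep those whose backlog does not exceed the threshold.
    return [s for s in first if s.unfinished_tasks <= remaining_task_threshold]
```

For example, with a remaining-task threshold of 2, a normally running processor with 9 unfinished tasks would be excluded, as would any processor in an abnormal state regardless of its backlog.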
Further, the scheduling server may determine the number of data processing tasks to be allocated in the next time period (i.e. the second time period) according to the usage data of the resources on which the processor depends in the previous time period and the task processing data of the processor in the previous time period.
Specifically, from the usage data of the resources on which the processors depend and the task processing data of the processors in the previous time period, the scheduling server may determine the total amount of tasks allocated to the processors in the previous time period, the amount of data processing tasks actually completed, and the usage of the dependent resources (the system resources other than the processors themselves) while those tasks were being completed. From this information and the specific duration of the next time period, it may then calculate the number of data processing tasks that the processors can complete in the next time period without compromising the running stability of the system, that is, the number of data processing tasks to be allocated.
For example, the total amount of tasks allocated to the processor cluster in the previous time period may be calculated from the allocation amounts of the individual processors in that period. The access amount of the system database (one of the resources on which the processors depend) in the previous time period is then compared with a database access threshold determined from the overall stability of the system, to judge whether the total amount of tasks allocated in the previous time period was appropriate, that is, whether the system can run stably with that total. The total may then be adjusted according to the comparison result, and the adjusted total used as the number of data processing tasks to be allocated in the next time period.
For example, if the comparison result of the access amount of the system database and the threshold value of the access amount of the database in the previous time period is that the access amount of the system database in the previous time period is greater than the threshold value of the access amount of the database, the amount of tasks to be allocated can be properly reduced on the basis of the total amount of tasks allocated by the processor cluster in the previous time period, so as to ensure the running stability of the whole system; if the comparison result of the access amount of the system database in the previous time period and the threshold value of the access amount of the database is that the access amount of the system database in the previous time period is far smaller than the threshold value of the access amount of the database, the amount of tasks to be allocated can be increased appropriately on the basis of the total amount of tasks allocated by the processor cluster in the previous time period, so that the utilization rate of system resources is further improved, and the overall data processing efficiency of the processor cluster is improved.
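The adjustment described above can be sketched as follows. The specification only states that the amount is "properly reduced" or "increased appropriately"; the 10% step and the 50% cut-off used here to interpret "far smaller than the threshold" are assumptions chosen purely for illustration, as are the function and parameter names.

```python
def next_period_task_total(
    prev_total: int,
    db_access: int,
    db_access_threshold: int,
    step_percent: int = 10,       # assumed adjustment step
    far_below_percent: int = 50,  # assumed meaning of "far smaller"
) -> int:
    """Adjust the previous period's task total for the next period."""
    if db_access > db_access_threshold:
        # Database overloaded: reduce the amount to protect stability.
        return prev_total * (100 - step_percent) // 100
    if db_access * 100 < db_access_threshold * far_below_percent:
        # Database far below its threshold: increase the amount
        # to raise resource utilization.
        return prev_total * (100 + step_percent) // 100
    # Otherwise keep the previous total unchanged.
    return prev_total
```

Integer percentages are used instead of floating-point factors so that the result is always a whole number of tasks.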
After determining the processors that meet the processing requirements of the data processing tasks and the number of data processing tasks to be allocated in the next time period, the scheduling server can calculate from these two values the average task amount per processor and use it as the task bearing capacity of each processor in the processor cluster (that is, the task bearing capacity of the target processor). Specifically, the number of data processing tasks to be allocated in the next time period may be divided by the number of processors meeting the processing requirements, and the resulting average used as the task bearing capacity of the processors.
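The averaging step can be illustrated with the short sketch below. Rounding down to a whole number of tasks and returning zero when no processor is eligible are assumptions of this sketch; the specification does not state how a fractional average or an empty set of processors is handled.

```python
def average_task_capacity(tasks_to_allocate: int, eligible_count: int) -> int:
    """Task bearing capacity per processor as the average task amount."""
    if eligible_count <= 0:
        # Assumption: with no eligible processor, nothing can be allocated.
        return 0
    # Assumption: round down so that no processor exceeds the average.
    return tasks_to_allocate // eligible_count
```

For instance, 90 tasks to be allocated across 4 eligible processors would give each processor a task bearing capacity of 22 under these assumptions.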
The task bearing capacity of a processor can be understood as the maximum amount of data processing tasks that the processor can complete in the next time period while the processor cluster as a whole keeps running stably. It is determined by taking into account the overall usage of the resources on which the processor depended in the previous time period, the running condition of the processor, and its task processing condition.
After determining the task bearing capacity of the processors in the processor cluster, the scheduling server may fetch from the storage medium as many data processing tasks as are to be allocated in the next time period and distribute them evenly among the processors in the cluster according to the task bearing capacity, as the data processing tasks to be processed in the next time period. The data processing tasks allocated to each processor in the cluster thus match that processor's task bearing capacity.
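As a non-limiting sketch, the even distribution could be carried out by handing each processor a consecutive slice of the fetched tasks. The function name and the use of string identifiers are assumptions for illustration; tasks beyond the combined capacity are simply left unassigned here, which is also an assumption.

```python
from typing import Dict, List, Sequence


def distribute_evenly(
    tasks: Sequence[str], processor_ids: List[str], capacity: int
) -> Dict[str, List[str]]:
    """Give each processor at most `capacity` tasks, in order."""
    assignments: Dict[str, List[str]] = {}
    for index, processor_id in enumerate(processor_ids):
        start = index * capacity
        # Slicing past the end of `tasks` yields a shorter or empty list.
        assignments[processor_id] = list(tasks[start:start + capacity])
    return assignments
```

With 7 tasks, 3 processors, and a capacity of 2, each processor receives 2 tasks and the seventh task stays unassigned in this sketch.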
After each processor in the processor cluster receives its allocated data processing tasks, it processes them during the next time period, so that the data processing tasks to be processed are handled in batches. While processing the tasks, a processor can also interact with the sensing module and feed its operation data and task processing data back in a timely manner, so that the sensing module can monitor and collect these data together with the usage data of the resources on which the processor depends.
In another scenario example, after the scheduling server determines the processors that meet the processing requirements of the data processing tasks according to the operation data and the task processing data of the processors in the previous time period, it may first count the number of such processors in the system, in order to allocate data processing tasks more finely according to the specific situation of each processor. The scheduling server then detects whether the number of processors meeting the processing requirements of the data processing tasks is smaller than a preset processor number threshold; this threshold may be determined according to the overall performance of the system resources. When the number is smaller than the processor number threshold, the task processing data of the target processor and the usage data of its dependent resources in the previous time period can be extracted from the operation data, task processing data, and dependent-resource usage data of the plurality of processors in the previous time period. The task bearing capacity of the target processor is then determined specifically for that processor from the extracted data, instead of taking the average task quantity of the processors in the processor cluster as the task bearing capacity of the target processor. In this way, data processing tasks are allocated more reasonably and accurately according to the individual condition of each processor, the processor and the other system resources it depends on are better utilized, and data processing efficiency is further improved.
Specifically, for example, refer to fig. 3. When the number of processors meeting the processing requirements of the data processing tasks is smaller than the processor number threshold, the scheduling server determines a task bearing capacity for each such processor individually, according to the usage data of the resources the processors depend on in the previous time period, the task processing data of the processors in the previous time period, and the specific condition of each processor. For example, the maximum number of data processing tasks that processor No. 1 can complete in the next time period is 5, namely the task bearing capacity of processor No. 1 is 5; the maximum number that processor No. 2 can complete in the next time period is 3, namely the task bearing capacity of processor No. 2 is 3; the task bearing capacity of processor No. N is 7; and so on. Different processors can then be allocated different numbers of tasks: for example, 5 data processing tasks are allocated to processor No. 1, 3 data processing tasks to processor No. 2, and 7 data processing tasks to processor No. N.
As can be seen from the above scenario examples, the task allocation method provided in the present disclosure acquires and uses the usage data of the resources the processor depends on, the operation data of the processor, and the task processing data in the first time period to determine the specific task bearing capacity of the processor finely and reasonably, and then allocates the data processing tasks of the second time period to the processor according to that task bearing capacity. Tasks can thus be allocated accurately in light of the specific condition of the processor and of the other system resources it depends on, which improves resource utilization and enables the processor to perform data processing efficiently and stably.
Referring to fig. 4, an embodiment of the present disclosure provides a task allocation method. In a specific implementation, the method may include the following:
S41: acquiring operation data, task processing data, and usage data of dependent resources of a plurality of processing nodes in a first time period.
In this embodiment, the first time period may be understood as a specified time period in history. Specifically, the first time period may be the time period closest to the current time (or to the next time period for which tasks are to be allocated), for example the previous time period; it may also be a historical time period selected to meet a preset requirement according to the specific situation, for example a historical time period in which the data processing tasks were similar to those currently to be processed. Of course, the first time periods listed above are only intended to better explain the embodiments of this specification and should not unduly limit it.
In this embodiment, the above system may be understood as a distributed cluster system including a plurality of processing nodes, which are deployed in the system in a distributed manner. The processing nodes are independent of one another and are each responsible for processing the data processing tasks allocated to them. Specifically, when the system receives a large number of data processing tasks to be processed, the scheduling server can distribute them to the processing nodes in the system step by step, at intervals and in a certain allocation amount each time. After receiving the allocated data processing tasks, each processing node processes them, so that the system processes the data processing tasks in batches, which improves data processing efficiency and resource utilization.
In the present embodiment, the processing node may specifically be an electronic device having a function of performing a certain data operation, storing, or the like, and the processing node may specifically be a processor, a server, or the like, for example. The processing node may be, for example, an application program or the like corresponding to some data processing task. The specification is not limited to a particular form of processing node.
In this embodiment, the usage data of the resources on which the processing node depends in the first time period may be understood as parameter data characterizing the overall usage, in the first time period, of the system resources other than the processing node itself. Specifically, the usage data of the resources on which the processing node depends may include: the utilization rate of the system CPU in the first time period, the access amount of the system database in the first time period, the occupancy rate of the system memory in the first time period, and the like. Of course, the usage data listed above is only a schematic illustration, and other suitable parameter data may be selected as the usage data of the resources on which the processing nodes depend according to the specific situation. The present specification is not limited to this.
In this embodiment, the operation data of the processing node in the first period may be specifically understood as parameter data for characterizing the operation condition of the processing node in the first period. Specifically, the operation data of the processing node in the first period may include: the resource data of the processing nodes in the first time period (for example, the resource amount of the processing nodes occupied by each processing node when the processing nodes perform data processing in the first time period), the operation state of the processing nodes in the first time period (for example, normal operation or abnormal operation) and the like. Of course, it should be noted that the above-listed operation data of the processing node is only a schematic illustration, and other suitable parameter data may be selected as the operation data of the processing node according to a specific scenario when the processing node is implemented. The present specification is not limited to this.
In this embodiment, the task processing data of the processing node in the first period may be specifically understood as parameter data for characterizing a processing condition of the processing node data processing task in the first period. Specifically, the task processing data of the processing node in the first period may include: the allocation number of the data processing tasks of the processing node in the first time period, the number of the data processing tasks which are not completed by the processing node in the first time period, the number of the data processing tasks which are failed to be processed by the processing node in the first time period, the processing time length of the data processing tasks processed by the processing node in the first time period and the like. It should be noted that the task processing data of the processing nodes listed above is merely illustrative, and other suitable parameter data may be selected as the task processing data of the processing nodes according to a specific scenario when the processing nodes are implemented. The present specification is not limited to this.
In this embodiment, the obtaining the usage data of the resources on which the processing node depends in the first period, the operation data of the processing node in the first period, and the task processing data of the processing node in the first period may include: and acquiring data such as the use data of the resources relied by the processing nodes in the first time period, the operation data of the processing nodes in the first time period, the task processing data of the processing nodes in the first time period and the like through a perception module preset in the system.
In this embodiment, by analyzing the usage data of the resources on which the processing node depends, the operation data of the processing node, and the task processing data of the processing node in the first time period, it is possible to determine how the data processing tasks allocated by the system in the first time period were processed, together with the corresponding condition of the processing nodes and of the system resources they depend on. The task allocation scheme of each processing node can then be determined finely with reference to this condition information of the first time period, while taking both operating stability and resource utilization into account. Of course, the usage data, operation data, and task processing data listed above are only examples; in a specific implementation, other historical data may also be acquired according to the specific application scenario and requirements. The present specification is not limited to this.
S43: determining the task bearing capacity of a target processing node according to the operation data, the task processing data, and the usage data of the dependent resources of the plurality of processing nodes in the first time period.
In this embodiment, the target processing node may be understood as any one of the plurality of processing nodes. The task bearing capacity may be understood as the maximum number of data processing tasks that the processing node can complete in the second time period (i.e., the next time period relative to the current one) while keeping stable operation. It is determined by jointly considering the individual condition of the processing node in the first time period (including its operating condition and task processing condition) and the condition of the resources it depends on, based on the usage data of the dependent resources, the operation data of the processing node, and the task processing data of the processing node in the first time period.
In this embodiment, to determine the task bearing capacity of the target processing node according to the operation data, task processing data, and dependent-resource usage data of the plurality of processing nodes in the first time period, the scheduling server may first derive condition data from these three kinds of data: how the allocated data processing tasks were processed and completed in the first time period, the condition of each individual processing node, and the condition of the overall system resources on which the processing nodes depend. Then, according to the condition data and a certain strategy, and taking into account both the operating stability of the system and the processing nodes and the resource utilization, it may determine the number of tasks the processing node can complete in the second time period, namely the task bearing capacity of the target processing node.
S45: allocating, to the target processing node, data processing tasks of a second time period that match the task bearing capacity.
In this embodiment, the second time period may be understood as a time period distinct from the first time period. For example, it may be the next time period, or a specified time period in the future.
In this embodiment, to allocate the data processing tasks of the second time period that match the task bearing capacity to the target processing node, the scheduling server may, after acquiring the data processing tasks to be processed in the second time period, allocate them to the processing nodes in the system according to the task bearing capacity of each processing node. The data processing tasks allocated to each processing node match the task bearing capacity of that processing node. For example, if the task bearing capacity of processing node No. 1 is 3, the number of data processing tasks of the second time period allocated to processing node No. 1 may be 3 or fewer. Data processing tasks allocated in this way fit the condition of the individual processing nodes and of the overall system resources they depend on, so the system resources and processing nodes are used effectively, data is processed stably and efficiently, and data processing efficiency is improved.
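As a non-limiting illustration only, the allocation described for S45 might be sketched as follows. The function name `allocate_tasks`, the dictionary layout, and the choice to return unallocated tasks as leftovers are assumptions made for this sketch and are not taken from this specification.

```python
from collections import deque


def allocate_tasks(pending_tasks, task_load_by_node):
    """Give each node at most its task bearing capacity (hypothetical helper).

    pending_tasks: tasks to be processed in the second time period.
    task_load_by_node: mapping of node id -> task bearing capacity.
    Returns (allocation, leftover); leftover holds tasks that did not fit.
    """
    queue = deque(pending_tasks)
    allocation = {}
    for node_id, load in task_load_by_node.items():
        # Never hand out more tasks than the node's capacity or than remain.
        count = max(0, min(load, len(queue)))
        allocation[node_id] = [queue.popleft() for _ in range(count)]
    return allocation, list(queue)


if __name__ == "__main__":
    tasks = [f"task-{i}" for i in range(10)]
    allocation, leftover = allocate_tasks(tasks, {"node-1": 3, "node-2": 5})
    print(allocation, leftover)
```

In this sketch a node may receive fewer tasks than its task bearing capacity when few tasks remain, which is consistent with the "3 or fewer" example above.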
Therefore, according to the task allocation method provided by the specification, the specific task bearing capacity of the processing node is determined finely by acquiring and utilizing the use data of the resources relied by the processing node, the operation data of the processing node and the task processing data in the first time period, and then the data processing task in the second time period is allocated to the processing node according to the task bearing capacity, so that the tasks can be allocated to the processing node reasonably and accurately according to the individual processing nodes and the specific condition of the whole system resources relied by the processing node, the utilization rate of the resources is improved, and a plurality of processing nodes can perform data processing efficiently and stably.
In one embodiment, the usage data of the resources on which the processing node depends in the first period of time may specifically include at least one of the following: the utilization rate of the system CPU in the first time period, the access amount of the system database in the first time period, the occupancy rate of the system memory in the first time period and the like. Of course, the above-mentioned usage data of the resources relied on by the processing nodes is only a schematic illustration, and other suitable parameter data may be selected to be used as the usage data of the resources relied on by the processing nodes according to specific situations. The present specification is not limited to this.
In one embodiment, the operation data of the processing node in the first period of time includes at least one of: the resource data of the processing node in the first time period, the running state of the processing node in the first time period, and the like. Of course, it should be noted that the above-listed operation data of the processing node is only a schematic illustration, and other suitable parameter data may be selected as the operation data of the processing node according to a specific scenario when the processing node is implemented. The present specification is not limited to this.
In one embodiment, the task processing data of the processing node in the first time period includes at least one of: the allocation number of the data processing tasks of the processing node in the first time period, the number of the data processing tasks which are not completed by the processing node in the first time period, the number of the data processing tasks which are failed to be processed by the processing node in the first time period, the processing time length of the data processing tasks processed by the processing node in the first time period and the like. It should be noted that the task processing data of the processing nodes listed above is merely illustrative, and other suitable parameter data may be selected as the task processing data of the processing nodes according to a specific scenario when the processing nodes are implemented. The present specification is not limited to this.
In an embodiment, the acquiring the usage data of the resources relied by the processing node in the first period, the operation data of the processing node in the first period, and the task processing data of the processing node in the first period may include the following when in implementation: and acquiring operation data, task processing data and use data of dependent resources of the processing nodes in a first time period when the plurality of processing nodes process the distributed data processing tasks in the first time period through a perception module preset in the system.
In this embodiment, the sensing module may be understood as a device or apparatus that is preconfigured in the system and is capable of exchanging data with each processing node in the system and with the other resource modules in the system, and of collecting the relevant condition data. Specifically, through data interaction, the sensing module can acquire the log records of each processing node in the cluster and of the other resource modules of the system; the log records can then be analyzed to obtain condition data such as the usage data of the resources on which the processing nodes depend, the operation data of the processing nodes, and the task processing data of the processing nodes in the first time period.
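As a non-limiting illustration only, the analysis of log records by the sensing module might be sketched as follows. The log line format `node_id,event`, the event names, and the function name `summarize_task_logs` are all hypothetical; this specification does not define a log format.

```python
from collections import defaultdict

# Hypothetical log format (an assumption of this sketch): "node_id,event",
# where event is one of "assigned", "completed", or "failed".


def summarize_task_logs(log_lines):
    """Aggregate per-node task processing data from log records."""
    summary = defaultdict(lambda: {"assigned": 0, "completed": 0, "failed": 0})
    for line in log_lines:
        node_id, event = line.strip().split(",")
        if event in summary[node_id]:
            summary[node_id][event] += 1
    for counts in summary.values():
        # Tasks that were assigned but neither completed nor failed.
        counts["unfinished"] = (
            counts["assigned"] - counts["completed"] - counts["failed"]
        )
    return dict(summary)


if __name__ == "__main__":
    logs = ["n1,assigned", "n1,assigned", "n1,completed", "n2,assigned", "n2,failed"]
    print(summarize_task_logs(logs))
```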
In one embodiment, determining the task load of the target processing node according to the operation data, the task processing data and the usage data of the dependent resources of the plurality of processing nodes in the first period may include the following when in implementation:
S1: determining a plurality of processing nodes meeting the processing requirements of the data processing task according to the operation data and the task processing data of the plurality of processing nodes in the first time period;
S2: determining the number of data processing tasks to be allocated in a second time period according to the usage data of the resources on which the plurality of processing nodes depend in the first time period and the task processing data;
S3: calculating the average task quantity of the processing nodes according to the plurality of processing nodes meeting the processing requirements of the data processing tasks and the number of data processing tasks to be allocated in the second time period, and taking the average task quantity of the processing nodes as the task bearing capacity of the target processing node.
In this embodiment, a processing node meeting the processing requirements of the data processing task may be understood as a processing node that, judged from the processing condition of historical data processing tasks, the operating condition of the individual processing node, and the condition of the resources on which it depends, is capable of operating normally and accepting new data processing tasks in the second time period.
In this embodiment, to determine the plurality of processing nodes meeting the processing requirements of the data processing task according to the operation data and the task processing data of the plurality of processing nodes in the first time period, the scheduling server may first determine, from the operation data and the task processing data of the processing nodes in the first time period, the operating state of each processing node in the first time period (for example, normal operation or abnormal operation) and the processing condition of its data processing tasks (for example, the number of unfinished data processing tasks or the number of data processing tasks whose processing failed). According to this condition data, the processing nodes whose operating state is normal are screened out; from these, the processing nodes whose number of unfinished data processing tasks is less than or equal to a remaining-task threshold (or whose number of failed data processing tasks is less than or equal to a failure threshold) are further screened out as the processing nodes meeting the processing requirements of the data processing tasks.
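As a non-limiting illustration only, the two-stage screening described above might be sketched as follows. The class `NodeStatus`, its field names, and the function name are hypothetical. This sketch applies both the remaining-task threshold and the failure threshold, which is one possible reading; the text above also allows using either threshold alone.

```python
from dataclasses import dataclass


@dataclass
class NodeStatus:
    """Condition data of one processing node in the first time period (hypothetical)."""
    node_id: str
    running_normally: bool
    unfinished_tasks: int
    failed_tasks: int


def select_eligible_nodes(statuses, max_unfinished, max_failed):
    """Return ids of nodes meeting the processing requirements."""
    # Stage 1: keep only nodes whose operating state is normal.
    normal = [s for s in statuses if s.running_normally]
    # Stage 2: keep nodes at or under both thresholds (an assumption of this sketch).
    return [
        s.node_id
        for s in normal
        if s.unfinished_tasks <= max_unfinished and s.failed_tasks <= max_failed
    ]


if __name__ == "__main__":
    statuses = [
        NodeStatus("n1", True, 1, 0),
        NodeStatus("n2", False, 0, 0),
        NodeStatus("n3", True, 9, 0),
    ]
    print(select_eligible_nodes(statuses, max_unfinished=2, max_failed=1))
```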
In this embodiment, to determine the number of data processing tasks to be allocated in the second time period according to the dependent-resource usage data and the task processing data of the plurality of processing nodes in the first time period, the scheduling server may proceed as follows. First, from the usage data of the resources on which the processing nodes depend and the task processing data of the processing nodes in the first time period, it determines the total number of data processing tasks allocated to the processing nodes of the system in the first time period and the number actually completed, and it determines a usage threshold for the system resources based on the overall performance of the system (for example, the remaining available resources of the system). Then, according to the total number of data processing tasks allocated and the number actually completed in the first time period, the specific usage of the system resources, the usage threshold of the system resources, and the specific duration of the second time period, it determines the maximum number of data processing tasks that can be processed while ensuring stable operation of the system, and takes this maximum as the number of data processing tasks to be allocated in the second time period.
In this embodiment, the average task quantity of the processing nodes is calculated from the plurality of processing nodes meeting the processing requirements of the data processing tasks and the number of data processing tasks to be allocated in the second time period, and is used as the task bearing capacity of the target processing node. In a specific implementation, the number of processing nodes in the system may be large, and the operating conditions and task processing conditions of individual processing nodes may differ; calculating a task bearing capacity separately for each processing node would occupy a large amount of computing resources and reduce the efficiency of the determination. Therefore, the scheduling server may divide the number of data processing tasks to be allocated in the second time period by the number of processing nodes meeting the processing requirements of the data processing tasks, and use the quotient as the average task quantity of the processing nodes. This average is then used uniformly as the task bearing capacity of the target processing node, namely the maximum number of tasks each processing node can complete in the second time period. In this way, the determined task bearing capacity is reasonable and accurate for most processing nodes in the cluster, while the resources and time consumed in determining it are effectively reduced and processing efficiency is improved.
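As a non-limiting illustration only, the averaging step might be sketched as follows. The function name is hypothetical, and rounding the quotient down to a whole number of tasks, as well as returning 0 when no node is eligible, are assumptions of this sketch; this specification only states that the quotient is used.

```python
def average_task_load(tasks_to_allocate, eligible_node_count):
    """Uniform task bearing capacity for every eligible node (hypothetical helper)."""
    if eligible_node_count <= 0:
        # No node can accept tasks, so nothing is allocated (assumption).
        return 0
    # Round down so the total handed out never exceeds tasks_to_allocate (assumption).
    return tasks_to_allocate // eligible_node_count


if __name__ == "__main__":
    # 100 tasks over 8 eligible nodes gives each node a capacity of 12.
    print(average_task_load(100, 8))
```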
In one embodiment, when the number of processing nodes in the system is relatively small or the computing power of the scheduling server is relatively high, a separate task bearing capacity may also be calculated for each processing node according to its specific situation, in order to improve the accuracy and reliability of the determined task bearing capacity of the target processing node.
In this embodiment, the determining the task load of the target processing node according to the operation data, the task processing data, and the usage data of the dependent resources of the plurality of processing nodes in the first period may further include:
S1: determining a plurality of processing nodes meeting the processing requirements of the data processing task according to the operation data and the task processing data of the plurality of processing nodes in the first time period;
S2: counting the number of the plurality of processing nodes meeting the processing requirements of the data processing task;
S3: detecting whether the number of the plurality of processing nodes meeting the processing requirements of the data processing task is smaller than a preset node number threshold;
S4: when the number of processing nodes meeting the processing requirements of the data processing tasks is smaller than the preset node number threshold, acquiring the task processing data of the target processing node and the usage data of its dependent resources in the first time period from the operation data, task processing data, and dependent-resource usage data of the plurality of processing nodes in the first time period;
S5: determining the task bearing capacity of the target processing node according to the task processing data of the target processing node and the usage data of the dependent resources in the first time period.
In this embodiment, the specific value of the preset threshold number of nodes may be flexibly determined according to the data processing capability of the scheduling server in the system, the number of processing nodes in the system, and the accuracy requirement.
In this embodiment, when the number of processing nodes meeting the processing requirement of the data processing task is smaller than a preset threshold of the number of nodes, the operation data of the processing nodes meeting the processing requirement of the data processing task (i.e., the target processing node), the task processing data, and the usage data of the dependent resources may be sequentially extracted from the acquired operation data of the plurality of processing nodes, the task processing data, and the usage data of the dependent resources in the first period; and determining the task bearing capacity of the corresponding target processing node one by one according to the operation data of the target processing node, the task processing data and the use data of the dependent resources. The task load amounts of the different processing nodes determined in this way may be different according to different conditions of the processing nodes, and are not necessarily uniform values. And then the scheduling server can allocate different numbers of data processing tasks to different processing nodes as data processing tasks to be processed in the second time period according to the task bearing capacity of each processing node. Therefore, the processing capacity of each processing node can be better exerted, the accuracy and the rationality of task allocation are further improved, and the data processing based on the allocation is more efficient and stable.
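As a non-limiting illustration only, the choice between the per-node determination and the uniform average might be sketched as follows. The function name, the parameter names, and the callback `estimate_individual_load` are hypothetical; how an individual task bearing capacity is computed is left open by this specification, so the callback merely stands in for that step.

```python
def determine_task_loads(eligible_nodes, tasks_to_allocate,
                         node_count_threshold, estimate_individual_load):
    """Return a mapping of node id -> task bearing capacity."""
    if not eligible_nodes:
        return {}
    if len(eligible_nodes) < node_count_threshold:
        # Few eligible nodes: determine a capacity for each node individually.
        return {node: estimate_individual_load(node) for node in eligible_nodes}
    # Many eligible nodes: use the uniform average (rounded down, an assumption).
    average = tasks_to_allocate // len(eligible_nodes)
    return {node: average for node in eligible_nodes}


if __name__ == "__main__":
    # Per-node capacities mirroring the fig. 3 example (5 and 3).
    individual = {"node-1": 5, "node-2": 3}
    print(determine_task_loads(["node-1", "node-2"], 20, 10, individual.get))
    print(determine_task_loads(["node-1", "node-2"], 20, 2, individual.get))
```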
In one embodiment, in order to make the task load determination efficient and accurate, the determining, according to the operation data of the multiple processing nodes, the task processing data and the usage data of the dependent resources in the first period, the task load of the target processing node may further include the following when implemented: and determining the task bearing capacity of the target processing node according to the operation data of the processing nodes, the task processing data and the use data of the dependent resources in the first time period through a preset task bearing capacity prediction model.
In this embodiment, the prediction model of the preset task load may specifically be a model that is built by learning and training historical data (including historical usage data of resources on which the processing node depends, operation data of the processing node, and historical task processing data), and is capable of predicting the amount of data processing tasks that can be processed by the processing node in a certain future time period.
In this embodiment, in a specific implementation, the usage data of the resources on which the processing nodes depend, the operation data of the processing nodes, and the task processing data of the processing nodes in the first time period may be used as model input data and input into the trained prediction model of the preset task bearing capacity, so as to obtain the result data output by the model, namely the task bearing capacity of each processing node in the second time period.
In one embodiment, the predictive model of the preset task load may be specifically established in the following manner: acquiring operation data, task processing data and use data of dependent resources of a plurality of processing nodes in a preset time period as sample data; training by using the sample data, and establishing a predictive model of the preset task bearing capacity.
In this embodiment, the above-mentioned preset time period may refer to one or more time periods in history. For example, it may be a plurality of historical time periods in which the data processing tasks handled by the system were similar to the data processing tasks to be processed in the second time period.
In this embodiment, in a specific implementation, a neural network model may be trained on the sample data to establish the preset task bearing capacity prediction model.
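As an illustration only, the training step described above can be sketched as follows. The sketch swaps the neural network for a single linear unit trained by batch gradient descent, which is the simplest stand-in and not the model of the specification. The class name Sample, the feature layout (CPU usage, memory usage, failure rate), the learning rate and the sample values are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Sample:
    # Hypothetical feature layout: [cpu_usage, memory_usage, failure_rate], each in [0, 1].
    features: Sequence[float]
    # Number of tasks the node actually handled in the period that followed.
    capacity: float


def predict(weights: Sequence[float], bias: float, features: Sequence[float]) -> float:
    """Linear prediction of the task bearing capacity for one node."""
    return bias + sum(w * x for w, x in zip(weights, features))


def mean_squared_error(weights: Sequence[float], bias: float, samples: List[Sample]) -> float:
    """Average squared prediction error over the sample data."""
    return sum((predict(weights, bias, s.features) - s.capacity) ** 2 for s in samples) / len(samples)


def train(samples: List[Sample], epochs: int = 2000, learning_rate: float = 0.05) -> Tuple[List[float], float]:
    """Fit weights and bias by batch gradient descent on the mean squared error."""
    n = len(samples)
    n_features = len(samples[0].features)
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        errors = [predict(weights, bias, s.features) - s.capacity for s in samples]
        grad_w = [2.0 / n * sum(e * s.features[j] for e, s in zip(errors, samples))
                  for j in range(n_features)]
        grad_b = 2.0 / n * sum(errors)
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias


# Made-up historical samples, used purely to exercise the code.
history = [
    Sample([0.2, 0.3, 0.00], 80.0),
    Sample([0.9, 0.8, 0.10], 20.0),
    Sample([0.5, 0.5, 0.05], 50.0),
    Sample([0.1, 0.2, 0.00], 90.0),
]
trained_weights, trained_bias = train(history)
```

Once trained, predict(trained_weights, trained_bias, features) plays the role of feeding first-period data into the model to obtain a capacity estimate for the second time period.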
In this embodiment, the above-listed ways of determining the task bearing capacity of the target processing node according to the operation data, the task processing data and the usage data of the dependent resources of the plurality of processing nodes in the first time period are merely illustrative. In a specific implementation, other suitable ways of determining the task bearing capacity of the target processing node may be chosen according to the specific application scenario and implementation requirements. The present specification is not limited in this respect.
In one embodiment, in a specific implementation, after the scheduling server has allocated the data processing tasks of the second time period to each processing node, the perception module may further acquire, for each processing node in the system, the task processing data, the operation data of the processing node and the usage data of the resources on which the processing node depends while the data processing tasks are being processed in the second time period. From this data, the operation and task processing conditions of each processing node in the second time period, and the usage of the system resources on which the processing node depends, can be determined, and the data processing tasks of the second time period allocated to the processing node can be adjusted accordingly. For example, suppose the data shows that processing node Y is running poorly in the second time period and still has many unprocessed data processing tasks, while processing node X has finished its own data processing tasks for the second time period and is idle. Some of the data processing tasks previously allocated to processing node Y for the second time period may then be reallocated to processing node X. This effectively reduces the processing pressure on processing node Y, makes effective use of the idle processing node X, further improves resource utilization, avoids processing failures caused by excessive processing pressure on processing node Y, and makes data processing more efficient and stable.
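The adjustment in the node Y / node X example can be sketched as follows. This is a minimal illustration under stated assumptions: the function name rebalance, the max_pending threshold used to decide that a node is overloaded, and the rule that a node with no pending tasks counts as idle are all hypothetical and are not taken from the specification.

```python
from typing import Dict, List


def rebalance(pending: Dict[str, List[str]], max_pending: int) -> Dict[str, List[str]]:
    """Move surplus unprocessed tasks from overloaded nodes to idle nodes.

    pending maps a node name to its unprocessed tasks for the current period.
    A node is treated as overloaded when it holds more than max_pending tasks
    and as idle when it holds none (both rules are assumptions).
    """
    if max_pending < 1:
        raise ValueError("max_pending must be at least 1")
    result = {node: list(tasks) for node, tasks in pending.items()}
    idle = [node for node, tasks in result.items() if not tasks]
    for node, tasks in result.items():
        while len(tasks) > max_pending and idle:
            target = idle[0]
            result[target].append(tasks.pop())
            # Stop feeding an idle node once it has reached the threshold itself.
            if len(result[target]) >= max_pending:
                idle.pop(0)
    return result


# Node Y is overloaded, node X has finished its tasks and is idle.
adjusted = rebalance({"Y": ["t1", "t2", "t3", "t4", "t5"], "X": []}, max_pending=3)
```

The input mapping is copied rather than modified, so the caller can compare the allocation before and after the adjustment.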
As can be seen from the above, the task allocation method provided in the present specification acquires the usage data of the resources on which the processing nodes depend, the operation data of the processing nodes and the task processing data in the first time period, uses this data to determine the specific task bearing capacity of each processing node at a fine granularity, and then allocates the data processing tasks of the second time period to the processing nodes according to that task bearing capacity. Tasks can therefore be allocated to each processing node reasonably and accurately, according to the specific conditions of the processing node and of the resources on which it depends, which improves resource utilization and allows the cluster to process data efficiently and stably. In addition, the task bearing capacity used for task allocation is determined by a pre-trained preset task bearing capacity prediction model from the usage data of the resources on which the processing nodes depend, the operation data of the processing nodes and the task processing data in the first time period, which improves the accuracy and rationality of the determined task bearing capacity and makes task allocation more accurate and reasonable.
The embodiments of the present specification also provide a server comprising a processor and a memory for storing instructions executable by the processor, wherein the processor, in a specific implementation, may perform the following steps according to the instructions: acquiring operation data, task processing data and usage data of dependent resources of a plurality of processing nodes in a first time period; determining the task bearing capacity of the target processing node according to the operation data, the task processing data and the usage data of the dependent resources of the plurality of processing nodes in the first time period; and allocating to the target processing node the data processing tasks of a second time period that match the task bearing capacity.
In order to execute the above instructions more accurately, referring to fig. 5, the present specification further provides another specific server, which includes a network communication port 501, a processor 502 and a memory 503. These structures are connected by internal cables so that they can exchange data with one another.
The network communication port 501 may be specifically configured to obtain operation data, task processing data, and usage data of the dependent resources of the plurality of processing nodes in the first period.
The processor 502 may be specifically configured to determine a task load capacity of a target processing node according to operation data, task processing data, and usage data of a dependent resource of the plurality of processing nodes in the first period; and distributing the data processing tasks matched with the task bearing capacity in a second time period to the target processing nodes.
The memory 503 may be used to store the various data acquired via the network communication port 501 and the corresponding instruction programs.
In this embodiment, the network communication port 501 may be a virtual port that is bound to different communication protocols so that different data can be sent or received. For example, the network communication port may be port 80, which is responsible for web data communication, port 21, which is responsible for FTP data communication, or port 25, which is responsible for mail data communication. The network communication port may also be a physical communication interface or a communication chip. For example, it may be a wireless mobile network communication chip, such as a GSM or CDMA chip; it may also be a Wi-Fi chip; it may also be a Bluetooth chip.
In this embodiment, the processor 502 may be implemented in any suitable manner. For example, the processor may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, or the form of logic gates, switches, an application-specific integrated circuit (ASIC), a programmable logic controller, an embedded microcontroller, and so on. The present specification is not limited in this respect.
In this embodiment, the memory 503 may exist at several levels. In a digital system, anything capable of holding binary data may serve as a memory; in an integrated circuit, a circuit that has a storage function but no physical form of its own is also called a memory, such as a RAM or a FIFO; in a system, a storage device that has a physical form is also called a memory, such as a memory module or a TF card.
The embodiments of the present specification also provide a computer storage medium based on the task allocation method described above, where the computer storage medium stores computer program instructions that when executed implement: acquiring operation data, task processing data and use data of dependent resources of a plurality of processing nodes in a first time period; determining the task bearing capacity of the target processing node according to the operation data of the processing nodes, the task processing data and the use data of the dependent resources in the first time period; and distributing the data processing tasks matched with the task bearing capacity in a second time period to the target processing nodes.
In the present embodiment, the storage medium includes, but is not limited to, a random access Memory (Random Access Memory, RAM), a Read-Only Memory (ROM), a Cache (Cache), a Hard Disk (HDD), or a Memory Card (Memory Card). The memory may be used to store computer program instructions. The network communication unit may be an interface for performing network connection communication, which is set in accordance with a standard prescribed by a communication protocol.
In this embodiment, the functions and effects of the program instructions stored in the computer storage medium can be understood by reference to the other embodiments and are not repeated here.
Referring to fig. 6, on a software level, the embodiment of the present disclosure further provides a task allocation device, where the task allocation device may specifically include the following structural modules:
the acquiring module 601 may be specifically configured to acquire operation data, task processing data, and usage data of a dependent resource of a plurality of processing nodes in a first period of time;
the determining module 602 may be specifically configured to determine a task load capacity of the target processing node according to the operation data, task processing data, and usage data of the dependent resources of the plurality of processing nodes in the first period;
the allocation module 603 may be specifically configured to allocate, to the target processing node, the data processing tasks of a second time period that match the task bearing capacity.
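The three modules listed above can be pictured as three methods of one object. The sketch below is only one possible software arrangement: the names TaskAllocationDevice and NodeMetrics, the collect callback standing in for the source of first-period data, and the capacity rule inside determine are assumptions made for illustration and do not come from the specification.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class NodeMetrics:
    running: bool          # operation data of the node
    unfinished_tasks: int  # task processing data of the node
    cpu_usage: float       # usage data of a dependent resource, 0.0 to 1.0


class TaskAllocationDevice:
    def __init__(self, collect: Callable[[], Dict[str, NodeMetrics]], max_tasks: int = 10):
        self._collect = collect      # hypothetical source of first-period data
        self._max_tasks = max_tasks  # hypothetical per-node ceiling

    def acquire(self) -> Dict[str, NodeMetrics]:
        """Role of the acquiring module 601: fetch first-period data for all nodes."""
        return self._collect()

    def determine(self, metrics: Dict[str, NodeMetrics], target: str) -> int:
        """Role of the determining module 602 (the rule used here is an assumption)."""
        m = metrics[target]
        if not m.running:
            return 0
        headroom = int(self._max_tasks * (1.0 - m.cpu_usage))
        return max(0, headroom - m.unfinished_tasks)

    def allocate(self, target: str, capacity: int, queue: List[str]) -> Tuple[Dict[str, List[str]], List[str]]:
        """Role of the allocation module 603: hand the target node at most `capacity` tasks."""
        return {target: queue[:capacity]}, queue[capacity:]


device = TaskAllocationDevice(lambda: {"node-a": NodeMetrics(True, 1, 0.5)})
first_period = device.acquire()
capacity = device.determine(first_period, "node-a")
assigned, remaining = device.allocate("node-a", capacity, ["t1", "t2", "t3", "t4", "t5", "t6"])
```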
In one embodiment, the usage data of the resources on which the processing node depends in the first period of time may specifically include at least one of the following: the utilization rate of the system CPU in the first time period, the access amount of the system database in the first time period, the occupancy rate of the system memory in the first time period and the like.
In one embodiment, the operation data of the processing node in the first period may specifically include at least one of the following: the resource data of the processing node in the first time period, the running state of the processing node in the first time period, and the like.
In one embodiment, the task processing data of the processing node in the first period of time may specifically include at least one of the following: the allocation number of the data processing tasks of the processing node in the first time period, the number of the data processing tasks which are not completed by the processing node in the first time period, the number of the data processing tasks which are failed to be processed by the processing node in the first time period, the processing time length of the data processing tasks processed by the processing node in the first time period and the like.
In one embodiment, the acquiring module 601 may specifically use a perception module preset in the system to acquire the operation data of the processing nodes, the task processing data and the usage data of the dependent resources in the first time period while the plurality of processing nodes process the data processing tasks allocated to them for the first time period.
In one embodiment, the determining module 602 may specifically include the following structural units:
the first determining unit may be specifically configured to determine a plurality of processing nodes that meet processing requirements of a data processing task according to operation data and task processing data of the plurality of processing nodes in the first period;
the second determining unit may be specifically configured to determine, according to usage data of resources on which the plurality of processing nodes depend in the first period of time and task processing data, the number of data processing tasks to be allocated in the second period of time;
the calculating unit may be specifically configured to calculate, according to the plurality of processing nodes meeting the processing requirements of the data processing tasks and the number of data processing tasks to be allocated in the second time period, an average task amount of the processing nodes, and use the average task amount as the task load capacity of the target processing node.
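The three units above can be sketched as three small functions. The eligibility test (the node is running and has few failed tasks) and the way the number of tasks for the second time period is derived (new tasks plus backlog, halved when the system CPU is above a limit) are hypothetical rules chosen for illustration; the specification does not fix them in this passage. Only the final averaging step follows the text directly.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class NodeStats:
    running: bool          # operation data
    failed_tasks: int      # task processing data
    unfinished_tasks: int  # task processing data


def eligible_nodes(stats: Dict[str, NodeStats], max_failures: int) -> List[str]:
    """First determining unit: nodes that meet the processing requirements (assumed rule)."""
    return [name for name, s in stats.items()
            if s.running and s.failed_tasks <= max_failures]


def tasks_to_allocate(stats: Dict[str, NodeStats], incoming_tasks: int,
                      system_cpu_usage: float, cpu_limit: float = 0.8) -> int:
    """Second determining unit: task count for the second period (assumed rule)."""
    backlog = sum(s.unfinished_tasks for s in stats.values())
    budget = incoming_tasks + backlog
    if system_cpu_usage > cpu_limit:
        budget //= 2  # throttle when the dependent resource is under pressure
    return budget


def average_task_amount(nodes: List[str], task_count: int) -> int:
    """Calculating unit: average task amount per eligible node."""
    if not nodes:
        raise ValueError("no processing node meets the processing requirements")
    return task_count // len(nodes)


stats = {
    "a": NodeStats(True, 0, 2),
    "b": NodeStats(True, 5, 1),
    "c": NodeStats(True, 1, 3),
}
nodes = eligible_nodes(stats, max_failures=2)
total = tasks_to_allocate(stats, incoming_tasks=20, system_cpu_usage=0.5)
per_node = average_task_amount(nodes, total)
```

The value per_node is then used as the task bearing capacity of the target processing node.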
In one embodiment, the determining module 602 may specifically include the following structural units:
the first determining unit may be specifically configured to determine a plurality of processing nodes that meet processing requirements of a data processing task according to operation data and task processing data of the plurality of processing nodes in the first period;
the statistics unit may be specifically configured to count the number of the plurality of processing nodes that meet the processing requirements of the data processing task;
the detection unit may be specifically configured to detect whether the number of the plurality of processing nodes that meet the processing requirements of the data processing task is less than a preset node number threshold;
the third determining unit may be specifically configured to obtain, when the number of processing nodes meeting the processing requirement of the data processing task is smaller than a preset threshold of the number of nodes, task processing data of the target processing node and usage data of the dependent resource in the first period from the operation data, task processing data and usage data of the dependent resource of the plurality of processing nodes in the first period;
the fourth determining unit may be specifically configured to determine a task load capacity of the target processing node according to the task processing data of the target processing node and usage data of the dependent resource in the first period.
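The branch performed by the detection unit, the third determining unit and the fourth determining unit can be sketched as follows. The per-node rule (CPU headroom minus unfinished tasks), the max_tasks ceiling and the fallback_average parameter used when enough nodes are available are assumptions for illustration; the text above only describes the case in which the node count is below the threshold.

```python
from typing import Dict, List


def determine_capacity(target: str,
                       eligible: List[str],
                       node_threshold: int,
                       unfinished: Dict[str, int],
                       cpu_usage: Dict[str, float],
                       max_tasks: int = 10,
                       fallback_average: int = 0) -> int:
    """Return the task bearing capacity of `target`.

    When fewer than node_threshold nodes meet the processing requirements,
    the capacity is derived from the target node's own first-period data.
    Otherwise a precomputed average is returned (an assumed fallback).
    """
    if len(eligible) < node_threshold:
        # Third determining unit: pick out the target node's own data.
        node_cpu = cpu_usage[target]
        node_backlog = unfinished[target]
        # Fourth determining unit: hypothetical rule based on that data.
        headroom = int(max_tasks * (1.0 - node_cpu))
        return max(0, headroom - node_backlog)
    return fallback_average


few_nodes = determine_capacity("a", ["a", "b"], node_threshold=3,
                               unfinished={"a": 2, "b": 0},
                               cpu_usage={"a": 0.25, "b": 0.5})
enough_nodes = determine_capacity("a", ["a", "b"], node_threshold=2,
                                  unfinished={"a": 2, "b": 0},
                                  cpu_usage={"a": 0.25, "b": 0.5},
                                  fallback_average=4)
```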
It should be noted that the units, devices or modules described in the above embodiments may be implemented by a computer chip or a physical entity, or by a product having a certain function. For convenience of description, the above devices are described as being divided into modules by function. Of course, when the present specification is implemented, the functions of the modules may be implemented in one or more pieces of software and/or hardware, and a module that implements a given function may be implemented by a combination of several sub-modules or sub-units. The apparatus embodiments described above are merely illustrative. For example, the division into units is merely a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the couplings, direct couplings or communication connections shown or discussed may be indirect couplings or communication connections via interfaces, devices or units, and may be electrical, mechanical or of another form.
As can be seen from the above, in the task allocation device provided in the embodiments of the present specification, the acquiring module acquires the usage data of the resources on which the processing nodes depend, the operation data of the processing nodes and the task processing data in the first time period; the determining module determines the specific task bearing capacity of a processing node from this data; and the allocation module then allocates the data processing tasks of the second time period to the processing node according to that task bearing capacity. Tasks can therefore be allocated to the processing nodes reasonably and accurately, according to the specific conditions of the system and of the processing nodes, which improves resource utilization and allows data to be processed efficiently and stably.
Although the present specification provides method operation steps as described in the embodiments or flowcharts, more or fewer operation steps may be included on the basis of conventional or non-inventive means. The order of steps recited in the embodiments is merely one of many possible execution orders and is not the only one. When an actual apparatus or client product executes the methods, they may be performed sequentially or in parallel as shown in the embodiments or figures (e.g., in a parallel-processor or multi-threaded processing environment, or even in a distributed data processing environment). The terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, the presence of additional identical or equivalent elements in a process, method, article, or apparatus that comprises a described element is not excluded. The terms first, second, etc. are used as names and do not denote any particular order.
Those skilled in the art will also appreciate that, in addition to implementing the controller in a pure computer readable program code, it is well possible to implement the same functionality by logically programming the method steps such that the controller is in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers, etc. Such a controller can be regarded as a hardware component, and means for implementing various functions included therein can also be regarded as a structure within the hardware component. Or even means for achieving the various functions may be regarded as either software modules implementing the methods or structures within hardware components.
The present specification may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, classes, etc. that perform particular tasks or implement particular abstract data types. The present specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
From the above description of embodiments, it will be apparent to those skilled in the art that the present description may be implemented in software plus a necessary general purpose hardware platform. Based on this understanding, the technical solution of the present specification may be embodied in essence or a part contributing to the prior art in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions to cause a computer device (which may be a personal computer, a mobile terminal, a server, or a network device, etc.) to execute the method described in the embodiments or some parts of the embodiments of the present specification.
The embodiments in this specification are described in a progressive manner. Identical or similar parts of the embodiments may be understood by reference to one another, and each embodiment focuses on its differences from the other embodiments. The present specification is operational with numerous general purpose or special purpose computer system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable electronic devices, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
Although the present specification has been described by way of example, it will be appreciated by those skilled in the art that there are many variations and modifications to the specification without departing from the spirit of the specification, and it is intended that the appended claims encompass such variations and modifications as do not depart from the spirit of the specification.

Claims (16)

1. A task allocation method, comprising:
acquiring operation data, task processing data and use data of dependent resources of a plurality of processing nodes in a first time period;
determining the task bearing capacity of the target processing node according to the operation data, the task processing data and the usage data of the dependent resources of the plurality of processing nodes in the first time period, comprising: determining a plurality of processing nodes meeting the processing requirements of the data processing task according to the operation data and the task processing data of the plurality of processing nodes in the first time period; counting the number of the plurality of processing nodes meeting the processing requirements of the data processing task; detecting whether the number of the plurality of processing nodes meeting the processing requirements of the data processing task is smaller than a preset node number threshold; in the case that the number of processing nodes meeting the processing requirements of the data processing task is smaller than the preset node number threshold, acquiring the task processing data of the target processing node and the usage data of the dependent resources in the first time period from the operation data, the task processing data and the usage data of the dependent resources of the plurality of processing nodes in the first time period; and determining the task bearing capacity of the target processing node according to the task processing data of the target processing node and the usage data of the dependent resources in the first time period; and
allocating to the target processing node the data processing tasks of a second time period that match the task bearing capacity.
2. The method of claim 1, wherein the usage data of the resources relied upon by the processing node in the first time period comprises at least one of: the utilization rate of the system CPU in the first time period, the access amount of the system database in the first time period and the occupancy rate of the system memory in the first time period.
3. The method of claim 1, wherein the operation data of the processing node in the first time period comprises at least one of: resource data of the processing node in the first time period and the running state of the processing node in the first time period.
4. The method of claim 1, wherein the task processing data of the processing node in the first time period comprises at least one of: the number of data processing tasks allocated to the processing node in the first time period, the number of data processing tasks not completed by the processing node in the first time period, the number of data processing tasks that the processing node failed to process in the first time period, and the processing duration of the data processing tasks processed by the processing node in the first time period.
5. The method of claim 1, wherein acquiring operation data, task processing data and usage data of dependent resources of a plurality of processing nodes in a first time period comprises:
acquiring, through a perception module preset in the system, the operation data of the processing nodes, the task processing data and the usage data of the dependent resources in the first time period while the plurality of processing nodes process the allocated data processing tasks in the first time period.
6. The method of claim 1, wherein determining the task bearing capacity of the target processing node according to the operation data, the task processing data and the usage data of the dependent resources of the plurality of processing nodes in the first time period comprises:
determining a plurality of processing nodes meeting the processing requirements of the data processing task according to the operation data and the task processing data of the plurality of processing nodes in the first time period;
determining the number of data processing tasks to be allocated in a second time period according to the use data of the resources relied by the plurality of processing nodes in the first time period and the task processing data;
and calculating the average task quantity of the processing nodes according to the processing nodes meeting the processing requirements of the data processing tasks and the quantity of the data processing tasks to be distributed in the second time period, and taking the average task quantity of the processing nodes as the task bearing capacity of the target processing nodes.
7. The method of claim 1, wherein determining the task bearing capacity of the target processing node according to the operation data, the task processing data and the usage data of the dependent resources of the plurality of processing nodes in the first time period comprises:
determining the task bearing capacity of the target processing node according to the operation data, the task processing data and the usage data of the dependent resources of the plurality of processing nodes in the first time period by means of a preset task bearing capacity prediction model.
8. The method of claim 7, wherein the predictive model of the preset task load is established as follows:
acquiring operation data, task processing data and use data of dependent resources of a plurality of processing nodes in a preset time period as sample data;
training by using the sample data, and establishing a predictive model of the preset task bearing capacity.
9. A task allocation device comprising:
the acquisition module is used for acquiring operation data, task processing data and use data of dependent resources of the plurality of processing nodes in the first time period;
the determining module is used for determining the task bearing capacity of the target processing node according to the operation data, the task processing data and the use data of the dependent resources of the plurality of processing nodes in the first time period; wherein the determining module comprises: the first determining unit is used for determining a plurality of processing nodes which meet the processing requirements of the data processing task according to the operation data and the task processing data of the plurality of processing nodes in the first time period; the statistics unit is used for counting the number of the plurality of processing nodes which meet the processing requirements of the data processing task; the detection unit is used for detecting whether the number of the plurality of processing nodes meeting the processing requirements of the data processing task is smaller than a preset node number threshold value; a third determining unit, configured to obtain, when the number of processing nodes meeting the processing requirement of the data processing task is less than a preset threshold of the number of nodes, task processing data of a target processing node and usage data of a dependent resource in the first period from operation data, task processing data and usage data of the dependent resource of a plurality of processing nodes in the first period; a fourth determining unit, configured to determine a task load capacity of the target processing node according to task processing data of the target processing node and usage data of the dependent resource in the first period;
and the allocation module is used for allocating to the target processing node the data processing tasks of a second time period that match the task bearing capacity.
10. The apparatus of claim 9, wherein the usage data of the resources relied upon by the processing node in the first time period comprises at least one of: the utilization rate of the system CPU in the first time period, the access amount of the system database in the first time period and the occupancy rate of the system memory in the first time period.
11. The apparatus of claim 9, wherein the operation data of the processing node in the first time period comprises at least one of: resource data of the processing node in the first time period and the running state of the processing node in the first time period.
12. The apparatus of claim 9, wherein the task processing data of the processing node in the first time period comprises at least one of: the number of data processing tasks allocated to the processing node in the first time period, the number of data processing tasks not completed by the processing node in the first time period, the number of data processing tasks that the processing node failed to process in the first time period, and the processing duration of the data processing tasks processed by the processing node in the first time period.
13. The apparatus according to claim 9, wherein the acquisition module acquires, in particular by means of a perception module preset in the system, operation data of the processing nodes, task processing data and usage data of the dependent resources in a first period of time when the plurality of processing nodes process the allocated data processing tasks in the first period of time.
14. The apparatus of claim 9, the determination module comprising:
the first determining unit is used for determining a plurality of processing nodes which meet the processing requirements of the data processing task according to the operation data and the task processing data of the plurality of processing nodes in the first time period;
the second determining unit is used for determining the number of data processing tasks to be distributed in a second time period according to the use data of the resources relied by the plurality of processing nodes and the task processing data in the first time period;
and the calculating unit is used for calculating the average task quantity of the processing nodes according to the processing nodes meeting the processing requirements of the data processing tasks and the quantity of the data processing tasks to be distributed in the second time period, and taking the average task quantity of the processing nodes as the task bearing capacity of the target processing nodes.
15. A server comprising a processor and a memory for storing processor-executable instructions, which when executed by the processor implement the steps of the method of any one of claims 1 to 8.
16. A computer readable storage medium having stored thereon computer instructions which when executed implement the steps of the method of any of claims 1 to 8.
CN201811157251.5A 2018-09-30 2018-09-30 Task allocation method, device and server Active CN109597685B (en)

Publications (2)

Publication Number    Publication Date
CN109597685A (en)     2019-04-09
CN109597685B (en)     2023-06-09

Family

ID=65957269

Country Status (1)

Country Link
CN (1) CN109597685B (en)


Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9183016B2 (en) * 2013-02-27 2015-11-10 Vmware, Inc. Adaptive task scheduling of Hadoop in a virtualized environment
US9485197B2 (en) * 2014-01-15 2016-11-01 Cisco Technology, Inc. Task scheduling using virtual clusters
CN103986766B (en) * 2014-05-19 2017-07-07 中国工商银行股份有限公司 Adaptive load balancing job task dispatching method and device
CN104581227A (en) * 2014-12-31 2015-04-29 银江股份有限公司 Stream media load balancing method based on task scheduling
CN108563500A (en) * 2018-05-08 2018-09-21 深圳市零度智控科技有限公司 Method for scheduling task, cloud platform based on cloud platform and computer storage media

Also Published As

Publication number Publication date
CN109597685A (en) 2019-04-09

Similar Documents

Publication Publication Date Title
CN109597685B (en) Task allocation method, device and server
CN112162865B (en) Scheduling method and device of server and server
US9658910B2 (en) Systems and methods for spatially displaced correlation for detecting value ranges of transient correlation in machine data of enterprise systems
CN108776934B (en) Distributed data calculation method and device, computer equipment and readable storage medium
CN106452818B (en) Resource scheduling method and system
US10678596B2 (en) User behavior-based dynamic resource capacity adjustment
CN107562512B (en) Method, device and system for migrating virtual machine
US20100318827A1 (en) Energy use profiling for workload transfer
CN110474852B (en) Bandwidth scheduling method and device
US11496413B2 (en) Allocating cloud computing resources in a cloud computing environment based on user predictability
US9588813B1 (en) Determining cost of service call
CN108256706B (en) Task allocation method and device
CN110795203A (en) Resource scheduling method, device and system and computing equipment
CN110990138A (en) Resource scheduling method, device, server and storage medium
US10305974B2 (en) Ranking system
CN109739627B (en) Task scheduling method, electronic device and medium
CN108809760A (en) The control method and device in sampling period in sampled-data system
CN115269108A (en) Data processing method, device and equipment
CN112689007A (en) Resource allocation method, device, computer equipment and storage medium
CN112162891A (en) Performance test method in server cluster and related equipment
CN114490078A (en) Dynamic capacity reduction and expansion method, device and equipment for micro-service
CN111897706A (en) Server performance prediction method, device, computer system and medium
CN107370783B (en) Scheduling method and device for cloud computing cluster resources
JP2005128866A (en) Computer unit and method for controlling computer unit
EP3032417A1 (en) Cloud orchestration and placement using historical data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20201010

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant after: Advanced innovation technology Co.,Ltd.

Address before: Greater Cayman, British Cayman Islands

Applicant before: Alibaba Group Holding Ltd.

Effective date of registration: 20201010

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant after: Innovative advanced technology Co.,Ltd.

Address before: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant before: Advanced innovation technology Co.,Ltd.

TA01 Transfer of patent application right
GR01 Patent grant