CN114077481A - Task scheduling method, device, equipment and storage medium - Google Patents


Info

Publication number
CN114077481A
Authority
CN
China
Prior art keywords
value, parameter, optimized, task, scheduling
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010808404.9A
Other languages
Chinese (zh)
Inventor
孔华递
黄黎滨
Current Assignee
Zhejiang Uniview Technologies Co Ltd
Original Assignee
Zhejiang Uniview Technologies Co Ltd
Application filed by Zhejiang Uniview Technologies Co Ltd
Priority to CN202010808404.9A
Publication of CN114077481A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The embodiment of the invention discloses a task scheduling method, apparatus, device, and storage medium. The task scheduling method comprises: acquiring the running value of a monitoring parameter of a scheduling task during execution, wherein the scheduling task is executed using the allocation value of the monitoring parameter, and the allocation value of the monitoring parameter is determined from the preset allocation value of the matched parameter to be optimized; determining an optimized value of the parameter to be optimized based on a comparison of the running value and the allocation value of the monitoring parameter; and determining the allocation value of the monitoring parameter for the next execution of the scheduling task from the optimized value of the parameter to be optimized. By monitoring the actual running values of scheduling-task parameters, the embodiment optimizes the configuration parameters, improves the reasonableness and accuracy of the configuration settings, optimizes the execution of scheduling tasks, and improves task scheduling efficiency and cluster resource utilization.

Description

Task scheduling method, device, equipment and storage medium
Technical Field
Embodiments of the invention relate to the field of computer technology, and in particular to a task scheduling method, apparatus, device, and storage medium.
Background
With the arrival of the big-data era, a variety of task scheduling systems have emerged, improving the execution efficiency of the scheduling tasks users submit.
However, current task scheduling systems simply execute a scheduling task with its default parameter configuration, record logs, and check whether the task completed. Their functionality is limited, and they lack any optimization of scheduling-task execution.
Disclosure of Invention
Embodiments of the present invention provide a task scheduling method, apparatus, device, and storage medium that optimize configuration parameters by monitoring the actual running values of scheduling-task parameters, thereby improving task scheduling efficiency.
In a first aspect, an embodiment of the present invention provides a task scheduling method, including:
acquiring the running value of a monitoring parameter of a scheduling task during execution, wherein the scheduling task is executed using the allocation value of the monitoring parameter, and the allocation value of the monitoring parameter is determined from the preset allocation value of the matched parameter to be optimized;
determining an optimized value of the parameter to be optimized based on a comparison of the running value and the allocation value of the monitoring parameter; and
determining the allocation value of the monitoring parameter for the next execution of the scheduling task from the optimized value of the parameter to be optimized.
In a second aspect, an embodiment of the present invention further provides a task scheduling apparatus, including:
a running value determining module, configured to acquire the running value of a monitoring parameter of a scheduling task during execution, wherein the scheduling task is executed using the allocation value of the monitoring parameter, and the allocation value of the monitoring parameter is determined from the preset allocation value of the matched parameter to be optimized;
an optimized value determining module, configured to determine an optimized value of the parameter to be optimized based on a comparison of the running value and the allocation value of the monitoring parameter; and
an allocation value determining module, configured to determine the allocation value of the monitoring parameter for the next execution of the scheduling task from the optimized value of the parameter to be optimized.
In a third aspect, an embodiment of the present invention further provides an apparatus, including:
one or more processors;
a storage device storing one or more programs,
which, when executed by the one or more processors, cause the one or more processors to implement the task scheduling method of any embodiment of the invention.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the task scheduling method according to any embodiment of the present invention.
Embodiments of the invention monitor the parameters of a scheduling task as it executes, compare them with preset parameter allocation values, optimize the task's parameters according to the comparison, and use the optimized parameters to determine the parameter allocation values for the task's next execution. Optimizing the configuration parameters from the actual running values of the scheduling-task parameters improves the reasonableness and accuracy of the configuration settings, optimizes scheduling-task execution, and improves task scheduling efficiency and cluster resource utilization.
Drawings
FIG. 1 is a flowchart of a task scheduling method according to a first embodiment of the present invention;
FIG. 2 is a flowchart of a task scheduling method according to a second embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a task scheduling apparatus in the third embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a device in the fourth embodiment of the present invention.
Detailed Description
The present invention is described in further detail below with reference to the accompanying drawings and embodiments. It is to be understood that the specific embodiments described here merely illustrate the invention and do not limit it. Note also that, for convenience of description, the drawings show only the structures relevant to the invention, not all of them.
Embodiment One
FIG. 1 is a flowchart of a task scheduling method in the first embodiment of the present invention, applicable to the case where a task scheduling system executes a scheduling task. The method may be performed by a task scheduling apparatus, which can be implemented in software and/or hardware and configured in a device with communication and computing capabilities, such as a back-end server. As shown in FIG. 1, the method includes:
step 101, acquiring a running value of a monitoring parameter of a scheduling task in an execution process; and the dispatching task is executed by adopting the distribution value of the monitoring parameter, and the distribution value of the monitoring parameter is determined according to the matched preset distribution value of the parameter to be optimized.
The scheduling task refers to a task being processed by the task scheduling system, and the task is submitted to the task scheduling system by a user. The monitoring parameter refers to a parameter which can monitor the task processing condition when the task scheduling system processes the scheduling task, and the parameter can reflect the resource consumption condition when the scheduling task is processed. The running value of the monitoring parameter refers to the resource consumption condition when the scheduling task is actually executed; the allocation values of the monitoring parameters are used to characterize the resource consumption situation estimated and allocated in advance for the scheduling task execution. The parameter to be optimized refers to a parameter of the scheduling task which can be directly adjusted, and the adjustment of the resource consumed by the execution of the scheduling task can be realized through the adjustment of the parameter to be optimized. The allocation value of the parameter to be optimized is used for representing the resource consumption condition estimated and allocated in advance for the scheduling task execution, and the allocation value of the monitoring parameter cannot be directly set because the parameter to be optimized can be directly set, and needs to be determined through the allocation value of the parameter to be optimized.
In one possible embodiment, the monitoring parameter includes at least one of: CPU usage, disk usage, and memory usage;
the parameter to be optimized matched with CPU usage is the thread count;
the parameter to be optimized matched with disk usage is the number of task splits;
the parameter to be optimized matched with memory usage is the RAM value.
CPU (Central Processing Unit) usage is the CPU resource a scheduling task occupies while executing; it characterizes, on one hand, how the cluster hosting the task scheduling system is executing the task. Disk usage is the disk resource the task occupies while executing and characterizes the task's execution in the cluster from another angle. Memory usage is the memory resource the task occupies while executing.
The thread count is the number of CPU cores allocated to the scheduling task; controlling it controls the task's CPU usage during execution. The number of task splits is the number of subtasks the scheduling task is divided into; controlling it balances the load and thus controls disk usage. The RAM (Random Access Memory) value is the size of the memory space allocated to the scheduling task; allocating different RAM values controls the task's memory usage.
Specifically, the allocation value of the parameter to be optimized is preset from information about the task before the scheduling task runs; on the task's first execution it may be a default configuration or an estimate. The allocation value of the monitoring parameter is then determined from the allocation value of the parameter to be optimized together with the configuration of the cluster hosting the task scheduling system. The task scheduling system executes the scheduling task under that allocation value and acquires the actual running value of the monitoring parameter during execution. Optionally, once acquired, the running value is stored in an analysis database for later analysis.
For example, before a scheduling task executes, its thread count, number of task splits, and RAM value are initially configured; the corresponding allocation values of CPU usage, disk usage, and memory usage are derived from those initial values; the task is executed under them; and the task's index parameters, such as CPU usage, disk usage, and memory usage, are collected. The running value of each monitoring parameter may be sampled periodically while the scheduling task executes.
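The periodic collection described above can be sketched as follows (an illustrative Python sketch; the patent prescribes no implementation, and the class, field, and probe names are all hypothetical):

```python
from dataclasses import dataclass, field

@dataclass
class RunRecord:
    """One periodic sample of the monitoring parameters (hypothetical schema)."""
    cpu_usage: float   # fraction of the allocated CPU actually used
    disk_usage: float  # fraction of the allocated disk actually used
    mem_usage: float   # fraction of the allocated memory actually used

@dataclass
class TaskMonitor:
    """Collects running values of a scheduling task during execution and
    aggregates them into the running value stored in the analysis database."""
    samples: list = field(default_factory=list)

    def collect(self, probe):
        """probe() returns the task's current (cpu, disk, mem) usage."""
        cpu, disk, mem = probe()
        self.samples.append(RunRecord(cpu, disk, mem))

    def running_values(self):
        """Average the periodic samples into one running value per parameter."""
        n = len(self.samples)
        return (sum(s.cpu_usage for s in self.samples) / n,
                sum(s.disk_usage for s in self.samples) / n,
                sum(s.mem_usage for s in self.samples) / n)
```

In a real system `probe` would query the cluster's metrics interface; averaging is one plausible aggregation, not one the patent specifies.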
In an optional embodiment, before the scheduling task executes, its service parameters are checked against the scheduling tasks currently executing. If there is a conflict, execution is postponed; if not, allocation values of the parameters to be optimized are set for the task and it is executed under them, so that the currently executing tasks are not affected.
Step 102: determine the optimized value of the parameter to be optimized based on a comparison of the running value and the allocation value of the monitoring parameter.
The running value of the monitoring parameter reflects how the scheduling task actually ran; the allocation value reflects the expected running conditions. If the task were always run under the original allocation value even when the running value diverges from it, the resource utilization of the cluster hosting the scheduling system would drop.
The comparison of the running value with the allocation value shows what the task's execution under that allocation actually costs the cluster, and the parameter to be optimized is reconfigured accordingly; for example, if resource consumption is too high, the allocation value of the monitoring parameter is reduced.
In one possible embodiment, determining the optimized value of the parameter to be optimized based on the comparison comprises:
if the ratio of the running value to the allocation value of the monitoring parameter is below a preset threshold, determining the optimized value of the parameter to be optimized from the ratio of the running value to the allocation value of the parameter to be optimized, where the running value of the parameter to be optimized is determined from the running value of the matched monitoring parameter;
if the ratio of the running value to the allocation value of the monitoring parameter is greater than or equal to the preset threshold, setting the optimized value of the parameter to be optimized to the product of its allocation value and a preset factor, the preset factor being greater than or equal to 1.
The preset threshold expresses how far the comparison of the running and allocation values may go before cluster resource consumption is considered at its limit; it can be set to suit the situation, for example to 1 or a number close to 1.
A ratio below the preset threshold means that the resources actually consumed when the scheduling task runs under the current allocation fall short of what was allocated, so the optimized value of the parameter to be optimized is derived from the ratio of its running value to its allocation value. The optimized value is then smaller than the allocation value, shrinking the share of resources allocated to the task.
A ratio greater than or equal to the preset threshold means that the resources the task consumes have hit a bottleneck: the allocated resources do not meet the actual demand, and more must be allocated. The optimized value is therefore the product of the allocation value of the parameter to be optimized and the preset factor. Once determined, the optimized value is stored in the analysis database.
For example, when the monitoring parameter is CPU usage, the matched parameter to be optimized is the thread count. If the running value of CPU usage during actual execution is below the allocation value, the optimized thread count is (actual running value / default allocation value) × thread count, rounded up; if the running value equals the allocation value, the thread count is unchanged; if the running value exceeds the allocation value, the optimized thread count is the currently allocated thread count × 1.5.
When the monitoring parameter is disk usage, the matched parameter to be optimized is the number of task splits. If the running value of disk usage during actual execution is below the allocation value, the optimized number of splits is the currently allocated number minus one, or unchanged if the task is not split; if the running value equals the allocation value, the number of splits is unchanged; if the running value exceeds the allocation value, the optimized number of splits is the currently allocated number plus one.
When the monitoring parameter is memory usage, the matched parameter to be optimized is the RAM value. If the running value of memory usage during actual execution is below the allocation value, the optimized RAM value is (actual running value / default allocation value) × total cluster RAM, rounded up; if the running value equals the allocation value, the optimized RAM value is the currently allocated RAM value; if the running value exceeds the allocation value, the optimized RAM value is the currently allocated RAM value × 1.5.
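The three per-parameter rules above can be written as small functions (a sketch assuming usage values expressed as fractions, with the stated ratios, the 1.5 growth factor, and upward rounding; the function names are hypothetical):

```python
import math

def optimize_threads(run_cpu, alloc_cpu, threads):
    """Thread count for the next run, per the CPU-usage rules."""
    if run_cpu < alloc_cpu:                 # over-provisioned: shrink by the ratio
        return math.ceil(run_cpu / alloc_cpu * threads)
    if run_cpu == alloc_cpu:                # matches the allocation: keep
        return threads
    return math.ceil(threads * 1.5)         # bottleneck: grow by the 1.5 factor

def optimize_splits(run_disk, alloc_disk, splits):
    """Number of task splits for the next run, per the disk-usage rules."""
    if run_disk < alloc_disk:
        return max(splits - 1, 1)           # unchanged if the task is not split
    if run_disk == alloc_disk:
        return splits
    return splits + 1

def optimize_ram(run_mem, alloc_mem, ram, total_cluster_ram):
    """RAM allocation for the next run, per the memory-usage rules."""
    if run_mem < alloc_mem:
        return math.ceil(run_mem / alloc_mem * total_cluster_ram)
    if run_mem == alloc_mem:
        return ram
    return math.ceil(ram * 1.5)
```

Each monitoring parameter is handled independently, matching the note below that at least two collected parameters are compared and optimized separately.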
If at least two monitoring parameters are collected, each is compared separately and the corresponding optimization measure is applied.
Step 103: determine the allocation value of the monitoring parameter for the next execution of the scheduling task from the optimized value of the parameter to be optimized.
Since the optimized value of the parameter to be optimized is derived from the task's actual running behavior, it becomes the allocation value of the parameter to be optimized for the task's next execution, and a new allocation value of the monitoring parameter is derived from it. The new allocation value is stored in the analysis database.
The next time the same scheduling task runs, the task scheduling system reads the optimized allocation value of the monitoring parameter from the previous execution record and runs the task under it. The task therefore keeps refining its allocation values over repeated executions, improving the cluster's resource utilization.
In one possible embodiment, the method further comprises:
presetting a threshold for the parameter to be optimized;
and correspondingly, after the optimized value of the parameter to be optimized is determined from the comparison of the running value and the allocation value of the monitoring parameter:
if the optimized value of the parameter to be optimized is greater than or equal to its threshold, determining the allocation value of the monitoring parameter for the next execution of the scheduling task from the threshold instead.
Because the cluster hosting the task scheduling system may execute several scheduling tasks at once, there is a hard limit on the resources any one task may be allocated. The limit is the preset threshold of the parameter to be optimized; for example, the thread count may be capped at the number of cluster servers × cores per server / 2, the RAM at the number of cluster servers × memory per server / 2, and the number of task splits at the number of cluster machines. If the optimized value reaches or exceeds the threshold, the threshold becomes the allocation value for the next execution, preventing a scheduling task from being allocated ever more resources at the expense of other tasks.
For example, the method may further acquire the cluster's load-balancing value while the scheduling task executes; once that value reaches a load threshold, the number of task splits is no longer increased.
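The caps and the load-balancing stop described above might be applied as follows (illustrative; the limit formulas come from the text, while the argument names and the load comparison are assumptions):

```python
def cap_allocation(threads, ram, splits, current_splits,
                   n_servers, cores_per_server, mem_per_server,
                   load, load_threshold):
    """Clamp optimized values to the cluster-wide limits before the next run."""
    max_threads = n_servers * cores_per_server // 2   # half the cluster's cores
    max_ram = n_servers * mem_per_server // 2         # half the cluster's memory
    max_splits = n_servers                            # at most one split per machine
    if load >= load_threshold:
        # cluster already at its load threshold: stop growing the split count
        splits = min(splits, current_splits)
    return (min(threads, max_threads),
            min(ram, max_ram),
            min(splits, max_splits))
```

For instance, on a 4-server cluster with 16 cores and 128 units of memory per server, an optimized request for 40 threads would be clamped to 32.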
Embodiments of the invention monitor the parameters of a scheduling task as it executes, compare them with the preset parameter allocation values, optimize the task's parameters according to the comparison, and use the optimized parameters to determine the allocation values for the next execution. Optimizing the configuration parameters from the actual running values improves the reasonableness and accuracy of the configuration settings and dynamically optimizes scheduling-task execution, improving task scheduling efficiency and cluster resource utilization.
Embodiment Two
FIG. 2 is a flowchart of a task scheduling method in the second embodiment of the present invention, which further optimizes the first embodiment. As shown in FIG. 2, the method includes:
step 201, acquiring at least two running time of the scheduling task.
Because the same scheduling task records the running condition of each time in the analysis database during the execution, including the running time-consuming condition, the task scheduling system can perform secondary analysis on the scheduling task after being executed for multiple times so as to ensure the accuracy of determining the distribution value of the scheduling task.
Illustratively, the running condition of the scheduling task is periodically obtained, and if the running condition is at least two times, the running time consumption records of at least two times are obtained. Or when the running times of the scheduling task exceed the threshold value, acquiring a plurality of running time consumption records of the scheduling task.
Step 202: take the allocation values of the parameters to be optimized from the execution with the shortest elapsed time as the task's target allocation values, and mark the scheduling task whose target allocation values have been determined as a target scheduling task.
Using elapsed time as the criterion, the allocation values from the task's fastest run become the target allocation values of its parameters to be optimized.
For example, building on the first embodiment, each scheduling task accumulates several run records in the analysis database. The task scheduling system analyzes these records a second time, selects the execution record with the shortest elapsed time, reads the RAM value, thread count, and number of task splits from it, and stores them in the task scheduling system as the task's optimal configuration, i.e., its target allocation values. The task is then marked as a target scheduling task and placed in the target scheduling task queue.
Analyzing the task's execution records by elapsed time evaluates the task as a whole, yielding the allocation that gives the best overall resource utilization, balancing the execution globally rather than being driven by any single parameter.
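The secondary analysis of Step 202 reduces to selecting the fastest recorded run (a minimal sketch; the record schema is hypothetical):

```python
def select_target_allocation(run_records):
    """Return the allocation of the fastest recorded run as the task's
    target allocation values (threads, splits, RAM)."""
    best = min(run_records, key=lambda r: r["elapsed"])
    return {"threads": best["threads"],
            "splits": best["splits"],
            "ram": best["ram"]}
```

Choosing by total elapsed time, rather than by any one parameter, is what gives the global balance the text describes.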
In one possible embodiment, the method further comprises:
acquiring the service parameters of a calibration scheduling task, comparing them with the service parameters of the target scheduling tasks, and taking a target scheduling task whose similarity exceeds a preset threshold as the reference scheduling task of the calibration scheduling task;
and determining the allocation values of the parameters to be optimized of the calibration scheduling task from the target allocation values of the reference scheduling task.
The calibration scheduling task refers to a task submitted by a user who is not in the target scheduling task queue, for example, a scheduling task submitted by the user latest, and the new scheduling task is not executed, so that an allocation value in the first execution is not determined. The service parameters are used to represent attribute information and execution information of the scheduling task, including, for example, an execution object and an execution operation of the scheduling task. The target scheduling task is a scheduling task which determines a target distribution value through secondary analysis.
Specifically, after a scheduling task newly submitted by a user is obtained, the scheduling task is used as a calibration scheduling task; or taking the scheduling task which is not in the target scheduling task queue in the task scheduling system as the calibration scheduling task.
Comparing the service parameters of the calibration scheduling task with the target scheduling tasks in the target scheduling task queue, determining whether the target scheduling tasks with similarity higher than a preset threshold exist in the target scheduling task queue, if so, putting the calibration scheduling tasks into an analysis queue, taking the target scheduling tasks as reference scheduling tasks of the calibration scheduling tasks, and determining the target distribution values of the parameters to be optimized of the reference scheduling tasks as the distribution values of the parameters to be optimized of the calibration scheduling tasks; if not, the calibration scheduling task is not processed, for example, a system default configuration value is directly adopted as an allocation value of the calibration scheduling task.
Because the reference scheduling task is obtained from the target scheduling task queue, whose target allocation values were obtained through secondary analysis, determining the allocation value of a calibration scheduling task from a highly similar reference scheduling task improves the utilization of cluster resources when the calibration scheduling task is executed with that allocation value.
Illustratively, the data of scheduling tasks that have undergone secondary analysis is matched in the analysis database, and the target allocation value of a scheduling task with high similarity is selected as the operating standard value (similarity is currently compared based on information such as the operation and data source in the service parameters, with the similarity threshold set to 70% by default). The resource consumption of the calibration scheduling task is predicted in combination with the target scheduling task whose similarity exceeds 70%, a resource-consumption prediction for the calibration scheduling task is obtained, and the calibration scheduling task is placed in the analysis queue. If the similarity is below 70%, the task is placed directly in the non-analysis queue.
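The matching step above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the field names, the overlap-based similarity measure, and all concrete values are assumptions; only the default 70% threshold comes from the text.

```python
def service_similarity(params_a, params_b):
    """Fraction of service-parameter fields (e.g. operation, data source)
    on which two scheduling tasks agree."""
    keys = set(params_a) | set(params_b)
    if not keys:
        return 0.0
    matches = sum(1 for k in keys if params_a.get(k) == params_b.get(k))
    return matches / len(keys)

def find_reference_task(calib_params, target_tasks, threshold=0.7):
    """Return the most similar target scheduling task above the threshold,
    or None if the calibration task should go to the non-analysis queue."""
    best, best_sim = None, threshold
    for task in target_tasks:
        sim = service_similarity(calib_params, task["service_params"])
        if sim >= best_sim:
            best, best_sim = task, sim
    return best

# Hypothetical calibration task and target scheduling task queue
calib = {"operation": "insert", "data_source": "table1", "engine": "spark"}
targets = [
    {"service_params": {"operation": "insert", "data_source": "table1",
                        "engine": "spark"},
     "target_allocation": {"threads": 8, "splits": 4, "ram_mb": 2048}},
    {"service_params": {"operation": "delete", "data_source": "table2",
                        "engine": "hive"},
     "target_allocation": {"threads": 2, "splits": 1, "ram_mb": 512}},
]
ref = find_reference_task(calib, targets)
# ref is the first target task; its target allocation becomes the
# calibration task's allocation value
```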
In a possible embodiment, before determining the assignment value of the parameter to be optimized of the calibration scheduling task according to the target assignment value of the parameter to be optimized of the reference scheduling task, the method further includes:
acquiring service parameters of at least two calibration scheduling tasks;
performing service logic judgment according to the service parameters of the calibration scheduling tasks to obtain a judgment result of the conflict between the calibration scheduling tasks;
and determining the execution sequence of the calibration scheduling tasks according to the judgment result of the conflict.
When a user submits multiple scheduling tasks, their service parameters are traversed to determine, from the service parameters of each scheduling task, whether any scheduling tasks have an execution conflict. If so, the execution order is determined according to the conflict judgment; if not, the execution order is determined according to the submission order or the execution requirements in the service parameters.
Illustratively, after a user submits three scheduling tasks, the three tasks are determined to be calibration scheduling tasks, with contents of: deleting Table 1, inserting data into Table 1, and querying data in Table 1. An execution conflict is determined from the service parameters of the three tasks; therefore, according to the conflict judgment result, inserting data into Table 1 and querying data in Table 1 are ordered to execute before deleting Table 1.
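The conflict-ordering example above can be sketched as follows. The rule that reads and inserts on a table run before a delete of that table is taken from the example; the priority encoding, function name, and task records are illustrative assumptions.

```python
# Lower priority value runs earlier; destructive operations run last.
CONFLICT_PRIORITY = {"query": 0, "insert": 0, "delete": 1}

def order_calibration_tasks(tasks):
    """Order tasks on the same object so non-destructive operations
    (query/insert) precede destructive ones (delete); ties keep the
    submission order, because Python's sort is stable."""
    return sorted(tasks, key=lambda t: CONFLICT_PRIORITY[t["operation"]])

submitted = [
    {"name": "A", "operation": "delete", "object": "table1"},
    {"name": "B", "operation": "insert", "object": "table1"},
    {"name": "C", "operation": "query",  "object": "table1"},
]
plan = [t["name"] for t in order_calibration_tasks(submitted)]
# plan == ["B", "C", "A"]: the insert and query run before the delete
```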
In one possible embodiment, there are at least two calibration scheduling tasks of the reference scheduling task;
correspondingly, after determining the allocation value of the parameter to be optimized of the calibration scheduling task according to the target allocation value of the parameter to be optimized of the reference scheduling task, the method further includes:
determining the distribution value of the matched monitoring parameter according to the distribution value of the parameter to be optimized of the calibration scheduling task;
and determining the execution sequence of the calibration scheduling tasks according to the distribution values of the monitoring parameters of the calibration scheduling tasks.
The allocation value of the matched monitoring parameter is determined according to the allocation value of the parameter to be optimized of the calibration scheduling task, the resource consumption of the calibration scheduling task is then predicted, and the execution order of the calibration scheduling tasks is determined from that resource consumption; for example, a calibration scheduling task with excessive resource consumption is moved later in the execution order, or at least two calibration scheduling tasks with balanced resource consumption are set to execute simultaneously.
Illustratively, on the basis of the above example, calibration scheduling tasks with similar scheduling tasks are placed in the analysis queue, and resource-consumption calculation is performed directly on the tasks in the analysis queue, so that the maximum number of scheduling tasks can be executed simultaneously. For example, after the allocation values of the matched monitoring parameters are determined from the allocation values of the parameters to be optimized of the calibration scheduling tasks, task A in the analysis queue is predicted to consume 20% memory utilization, 60% disk utilization, and 20% CPU utilization, while task B consumes 60% memory, 20% disk, and 30% CPU. According to this resource consumption, task A and task B can be executed simultaneously; subsequent tasks are placed in the analysis queue to await execution, and the tasks in the non-analysis queue are run.
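The concurrency decision above can be sketched as a greedy packing of predicted utilizations, so that no resource dimension exceeds 100% in any batch. The first-fit grouping strategy and the record layout are assumptions; the utilization figures for tasks A and B come from the example.

```python
def group_for_parallel(tasks, limit=1.0):
    """Greedily pack tasks into batches whose summed predicted memory,
    disk, and CPU utilizations each stay within `limit`."""
    batches = []
    for task in tasks:
        for batch in batches:
            fits = all(
                sum(t[res] for t in batch) + task[res] <= limit
                for res in ("mem", "disk", "cpu")
            )
            if fits:
                batch.append(task)
                break
        else:  # no existing batch has room; start a new one
            batches.append([task])
    return batches

queue = [
    {"name": "A", "mem": 0.20, "disk": 0.60, "cpu": 0.20},  # figures from the text
    {"name": "B", "mem": 0.60, "disk": 0.20, "cpu": 0.30},  # figures from the text
    {"name": "C", "mem": 0.50, "disk": 0.50, "cpu": 0.80},  # hypothetical third task
]
batches = group_for_parallel(queue)
# A and B share a batch (mem 0.8, disk 0.8, cpu 0.5); C waits for the next batch
```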
On the basis of executing the same scheduling task multiple times, the embodiment of the invention collects multi-dimensional data about the scheduling task, analyzes the collected data by comparing running time consumption, and determines the target allocation value of a single scheduling task's parameters, so as to obtain an optimal configuration strategy. Calibration scheduling tasks without a determined target allocation value are analyzed to find similar tasks among the target scheduling tasks, which ensures the accuracy of the parameter allocation values determined for the calibration scheduling tasks; the calibration scheduling tasks are then reasonably combined according to the determined allocation values, improving the utilization of cluster resources.
Example Three
Fig. 3 is a schematic structural diagram of a task scheduling device in a third embodiment of the present invention; in this embodiment, the scheduling task is executed by a task scheduling system. As shown in fig. 3, the apparatus includes:
a running value determining module 310, configured to obtain a running value of a monitoring parameter of a scheduling task in an execution process; the scheduling task is executed by adopting the distribution value of the monitoring parameter, and the distribution value of the monitoring parameter is determined according to the matched preset distribution value of the parameter to be optimized;
an optimized value determining module 320, configured to determine an optimized value of the parameter to be optimized based on a comparison result of the running value of the monitoring parameter and the assigned value of the monitoring parameter;
an allocation value determining module 330, configured to determine, according to the optimized value of the parameter to be optimized, an allocation value of a monitoring parameter to be executed next time by the scheduling task.
The embodiment of the invention monitors the parameters while a scheduling task executes, compares them with the preset parameter allocation values, optimizes the scheduling task's parameters according to the comparison result, and determines the parameter allocation values for the next execution of the scheduling task from the optimized parameters. Optimizing the configuration parameters by monitoring the actual operating values of the scheduling task's parameters improves the reasonableness and accuracy of the configuration parameter settings, optimizes the execution of the scheduling task, and improves task scheduling efficiency and the utilization of cluster resources.
Optionally, the optimized value determining module 320 is specifically configured to:
if the ratio of the operation value of the monitoring parameter to the distribution value of the monitoring parameter is smaller than a preset threshold value, determining the optimized value of the parameter to be optimized according to the ratio of the operation value of the parameter to be optimized to the distribution value of the parameter to be optimized; the running value of the parameter to be optimized is determined according to the running value of the matched monitoring parameter;
and if the ratio of the running value of the monitoring parameter to the distribution value of the monitoring parameter is greater than or equal to a preset threshold value, determining the optimized value of the parameter to be optimized as the product of the distribution value of the parameter to be optimized and a preset value, wherein the preset value is greater than or equal to 1.
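The two branches above can be sketched as follows. This is a minimal sketch, assuming the "ratio of the running value to the allocation value" in the first branch is applied as a direct multiplier on the allocation; the threshold of 0.8, the scale-up factor of 1.2, and the function name are all hypothetical values for illustration (the text only requires the multiplier in the second branch to be at least 1).

```python
def optimize_parameter(run_mon, alloc_mon, run_opt, alloc_opt,
                       threshold=0.8, scale_up=1.2):
    """Return the optimized value of a parameter to be optimized (e.g. thread
    count) from its matched monitoring parameter (e.g. CPU utilization).

    run_mon / alloc_mon  : running vs. allocated value of the monitoring parameter
    run_opt / alloc_opt  : running vs. allocated value of the parameter to be optimized
    """
    if run_mon / alloc_mon < threshold:
        # Under-utilized: shrink the allocation according to the ratio of the
        # running value to the allocated value of the parameter to be optimized.
        return alloc_opt * (run_opt / alloc_opt)
    # Saturated: grow the allocation by a preset factor >= 1.
    return alloc_opt * scale_up

# CPU ran at 40% of an 80% allocation -> under-utilized, shrink toward usage
shrunk = optimize_parameter(run_mon=0.4, alloc_mon=0.8, run_opt=6, alloc_opt=12)
# CPU ran at 78% of an 80% allocation -> ratio 0.975 >= threshold, scale up
grown = optimize_parameter(run_mon=0.78, alloc_mon=0.8, run_opt=12, alloc_opt=12)
```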
Optionally, the monitoring parameter includes at least one of: CPU utilization rate, disk utilization rate and memory utilization rate;
the parameter to be optimized matched with the CPU utilization rate is the thread number;
the parameters to be optimized matched with the utilization rate of the disk are the task segmentation quantity;
and the parameter to be optimized matched with the memory utilization rate is a RAM value.
Optionally, the apparatus further includes a parameter threshold setting module to be optimized, specifically configured to:
presetting a threshold value of a parameter to be optimized;
correspondingly, the device further comprises a threshold value judging module, which is specifically configured to:
and if the optimized value of the parameter to be optimized is larger than or equal to the threshold value of the parameter to be optimized, determining the distribution value of the monitoring parameter executed next time by the scheduling task according to the threshold value of the parameter to be optimized.
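The threshold check above amounts to clipping the optimized value at the preset parameter threshold; the function name below is illustrative, not from the patent.

```python
def cap_optimized_value(optimized, param_threshold):
    """If the optimized value reaches or exceeds the preset threshold of the
    parameter to be optimized, the threshold itself is used for the next run."""
    return min(optimized, param_threshold)

capped = cap_optimized_value(64, 32)   # clipped to the threshold
kept = cap_optimized_value(16, 32)     # below the threshold, kept as-is
```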
Optionally, the apparatus further includes a secondary analysis module, including:
the operation time acquiring unit is used for acquiring at least two operation times of the scheduling task;
and the target distribution value determining unit is used for determining the distribution value of the parameter to be optimized when the scheduling task corresponding to the shortest running time is executed, taking the distribution value as the target distribution value of the parameter to be optimized of the scheduling task, and determining the scheduling task of which the parameter to be optimized has the target distribution value as the target scheduling task.
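The secondary analysis performed by these two units can be sketched as follows: among the recorded runs of the same scheduling task, the allocation used by the run with the shortest elapsed time becomes the target allocation value. The record layout and field names are assumptions for illustration.

```python
def target_allocation(runs):
    """runs: list of {"elapsed_s": float, "allocation": {...}} records for
    the same scheduling task; return the allocation of the fastest run,
    which becomes the task's target allocation value."""
    return min(runs, key=lambda r: r["elapsed_s"])["allocation"]

# Hypothetical run history for one scheduling task (at least two runs)
history = [
    {"elapsed_s": 95.0, "allocation": {"threads": 4, "splits": 2, "ram_mb": 1024}},
    {"elapsed_s": 62.5, "allocation": {"threads": 8, "splits": 4, "ram_mb": 2048}},
]
best = target_allocation(history)
# best is the allocation of the 62.5 s run
```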
Optionally, the apparatus further includes a task comparison module, including:
the service parameter comparison unit is used for acquiring service parameters of the calibration scheduling task, comparing the service parameters with the service parameters of the target scheduling task, and determining the target scheduling task with similarity greater than a preset threshold as a reference scheduling task of the calibration scheduling task;
and the comparison result determining unit is used for determining the distribution value of the parameter to be optimized of the calibration scheduling task according to the target distribution value of the parameter to be optimized of the reference scheduling task.
Optionally, the task comparison module further includes a task conflict determination unit, specifically configured to:
acquiring service parameters of at least two calibration scheduling tasks;
performing service logic judgment according to the service parameters of the calibration scheduling tasks to obtain a judgment result of the conflict between the calibration scheduling tasks;
and determining the execution sequence of the calibration scheduling task according to the conflict judgment result.
Optionally, there are at least two calibration scheduling tasks of the reference scheduling task;
correspondingly, the task comparison module further comprises a task execution sequence determining unit, which is specifically configured to:
determining the distribution value of the matched monitoring parameter according to the distribution value of the parameter to be optimized of the calibration scheduling task;
and determining the execution sequence of the calibration scheduling task according to the distribution value of the monitoring parameters of the calibration scheduling task.
The task scheduling device provided by the embodiment of the invention can execute the task scheduling method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects for executing the task scheduling method.
Example Four
Fig. 4 is a schematic structural diagram of an apparatus according to a fourth embodiment of the present invention. Fig. 4 illustrates a block diagram of an exemplary device 12 suitable for use in implementing embodiments of the present invention. The device 12 shown in fig. 4 is only an example and should not bring any limitation to the function and scope of use of the embodiments of the present invention.
As shown in FIG. 4, device 12 is in the form of a general purpose computing device. The components of device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory device 28, and a bus 18 that couples various system components including the system memory device 28 and the processing unit 16.
Bus 18 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.
Device 12 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by device 12 and includes both volatile and nonvolatile media, removable and non-removable media.
The system storage device 28 may include computer system readable media in the form of volatile storage, such as random access memory (RAM) 30 and/or cache memory 32. Device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 4, and commonly referred to as a "hard drive"). Although not shown in FIG. 4, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to bus 18 by one or more data media interfaces. Storage device 28 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.
A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in storage 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each of which examples or some combination thereof may comprise an implementation of a network environment. Program modules 42 generally carry out the functions and/or methodologies of the described embodiments of the invention.
Device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), with one or more devices that enable a user to interact with device 12, and/or with any devices (e.g., network card, modem, etc.) that enable device 12 to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface 22. Also, the device 12 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet) via the network adapter 20. As shown in FIG. 4, the network adapter 20 communicates with the other modules of the device 12 via the bus 18. It should be appreciated that although not shown in FIG. 4, other hardware and/or software modules may be used in conjunction with device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
The processing unit 16 executes various functional applications and data processing by running programs stored in the system storage device 28, for example, to implement a task scheduling method provided by the embodiment of the present invention, including:
acquiring a running value of a monitoring parameter of a scheduling task in the execution process; the scheduling task is executed by adopting the distribution value of the monitoring parameter, and the distribution value of the monitoring parameter is determined according to the matched preset distribution value of the parameter to be optimized;
determining an optimized value of the parameter to be optimized based on a comparison result of the operation value of the monitoring parameter and the distribution value of the monitoring parameter;
and determining the distribution value of the monitoring parameter executed next time by the scheduling task according to the optimized value of the parameter to be optimized.
Example Five
The fifth embodiment of the present invention further provides a computer-readable storage medium having a computer program stored thereon; when executed by a processor, the computer program implements the task scheduling method provided by the embodiments of the present invention, the method including:
acquiring a running value of a monitoring parameter of a scheduling task in the execution process; the scheduling task is executed by adopting the distribution value of the monitoring parameter, and the distribution value of the monitoring parameter is determined according to the matched preset distribution value of the parameter to be optimized;
determining an optimized value of the parameter to be optimized based on a comparison result of the operation value of the monitoring parameter and the distribution value of the monitoring parameter;
and determining the distribution value of the monitoring parameter executed next time by the scheduling task according to the optimized value of the parameter to be optimized.
Computer storage media for embodiments of the invention may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++, or the like, as well as conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter case, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

1. A method for task scheduling, comprising:
acquiring a running value of a monitoring parameter of a scheduling task in the execution process; the scheduling task is executed by adopting the distribution value of the monitoring parameter, and the distribution value of the monitoring parameter is determined according to the matched preset distribution value of the parameter to be optimized;
determining an optimized value of the parameter to be optimized based on a comparison result of the operation value of the monitoring parameter and the distribution value of the monitoring parameter;
and determining the distribution value of the monitoring parameter executed next time by the scheduling task according to the optimized value of the parameter to be optimized.
2. The method of claim 1, wherein determining an optimized value for a parameter to be optimized based on a comparison of the run value of the monitored parameter and the assigned value of the monitored parameter comprises:
if the ratio of the operation value of the monitoring parameter to the distribution value of the monitoring parameter is smaller than a preset threshold value, determining the optimized value of the parameter to be optimized according to the ratio of the operation value of the parameter to be optimized to the distribution value of the parameter to be optimized; the running value of the parameter to be optimized is determined according to the running value of the matched monitoring parameter;
and if the ratio of the running value of the monitoring parameter to the distribution value of the monitoring parameter is greater than or equal to a preset threshold value, determining the optimized value of the parameter to be optimized as the product of the distribution value of the parameter to be optimized and a preset value, wherein the preset value is greater than or equal to 1.
3. The method of claim 1, wherein the monitoring parameters include at least one of: CPU utilization rate, disk utilization rate and memory utilization rate;
the parameter to be optimized matched with the CPU utilization rate is the thread number;
the parameters to be optimized matched with the utilization rate of the disk are the task segmentation quantity;
and the parameter to be optimized matched with the memory utilization rate is a RAM value.
4. The method of claim 1, further comprising:
acquiring at least two running time consumptions of the scheduling task;
and determining the distribution value of the parameter to be optimized when the scheduling task corresponding to the shortest running time is executed, taking the distribution value as the target distribution value of the parameter to be optimized of the scheduling task, and determining the scheduling task with the target distribution value of the parameter to be optimized as the target scheduling task.
5. The method of claim 4, further comprising:
acquiring service parameters of a calibration scheduling task, comparing the service parameters with the service parameters of the target scheduling task, and determining the target scheduling task with similarity greater than a preset threshold as a reference scheduling task of the calibration scheduling task;
and determining the distribution value of the parameter to be optimized of the calibration scheduling task according to the target distribution value of the parameter to be optimized of the reference scheduling task.
6. The method according to claim 5, before determining the assignment value of the parameter to be optimized for the calibration scheduling task according to the target assignment value of the parameter to be optimized for the reference scheduling task, further comprising:
acquiring service parameters of at least two calibration scheduling tasks;
performing service logic judgment according to the service parameters of the calibration scheduling tasks to obtain a judgment result of the conflict between the calibration scheduling tasks;
and determining the execution sequence of the calibration scheduling task according to the conflict judgment result.
7. The method of claim 5, wherein there are at least two calibration scheduling tasks of the reference scheduling task;
correspondingly, after determining the allocation value of the parameter to be optimized of the calibration scheduling task according to the target allocation value of the parameter to be optimized of the reference scheduling task, the method further includes:
determining the distribution value of the matched monitoring parameter according to the distribution value of the parameter to be optimized of the calibration scheduling task;
and determining the execution sequence of the calibration scheduling task according to the distribution value of the monitoring parameters of the calibration scheduling task.
8. A task scheduling apparatus, comprising:
the operation value determining module is used for acquiring the operation value of the monitoring parameter of the scheduling task in the executing process; the scheduling task is executed by adopting the distribution value of the monitoring parameter, and the distribution value of the monitoring parameter is determined according to the matched preset distribution value of the parameter to be optimized;
the optimized value determining module is used for determining the optimized value of the parameter to be optimized based on the comparison result of the running value of the monitoring parameter and the distribution value of the monitoring parameter;
and the distribution value determining module is used for determining the distribution value of the monitoring parameter executed next time by the scheduling task according to the optimized value of the parameter to be optimized.
9. An apparatus, comprising:
one or more processors;
a storage device for storing one or more programs,
when executed by the one or more processors, cause the one or more processors to implement a task scheduling method as recited in any of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out a method for task scheduling according to any one of claims 1 to 7.
CN202010808404.9A 2020-08-12 2020-08-12 Task scheduling method, device, equipment and storage medium Pending CN114077481A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010808404.9A CN114077481A (en) 2020-08-12 2020-08-12 Task scheduling method, device, equipment and storage medium


Publications (1)

Publication Number Publication Date
CN114077481A true CN114077481A (en) 2022-02-22

Family

ID=80280296

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010808404.9A Pending CN114077481A (en) 2020-08-12 2020-08-12 Task scheduling method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114077481A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115454640A (en) * 2022-09-21 2022-12-09 苏州启恒融智信息科技有限公司 Task processing system and self-adaptive task scheduling method
CN115454640B (en) * 2022-09-21 2024-01-19 苏州启恒融智信息科技有限公司 Task processing system and self-adaptive task scheduling method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination