CN117093374A - Resource scheduling method, device, equipment and storage medium

Info

Publication number: CN117093374A
Application number: CN202311113890.2A
Authority: CN
Inventors: 牛少杰
Original assignee: Zhengzhou Yunhai Information Technology Co Ltd
Current assignee: Zhengzhou Yunhai Information Technology Co Ltd
Priority date: 2023-08-31
Filing date: 2023-08-31
Publication date: 2023-11-21

Abstract

The application discloses a resource scheduling method, apparatus, device and storage medium, and relates to the technical field of cloud computing. The method is applied to a cloud service platform and comprises the following steps: determining a target host in the cloud service platform that is currently judged to be in a high-load state, and identifying the high-load scene corresponding to that state; if the high-load scene is a first high-load scene caused by service operation occupation, adjusting the execution priority of all execution tasks so that the system resources in the target host are scheduled preferentially to the services running on the target host; and if the high-load scene is a second high-load scene caused by occupation of task execution, triggering a preset task scheduling algorithm and using it to adjust the scheduling authority of each execution task, so that each execution task is scheduled according to that authority. This technical scheme can ensure stable operation of the services on the host and reduces the interference of tasks with service resources.

Description

Resource scheduling method, device, equipment and storage medium

Technical Field

The present application relates to the field of cloud computing technologies, and in particular, to a method, an apparatus, a device, and a storage medium for scheduling resources.

Background

Whether public or private, cloud platform technology is widely favored by operation and maintenance personnel in large enterprises. Virtualization technology exploits hardware capacity to the greatest extent, and a cloud service platform offers better scalability, stronger disaster recovery and backup capability, and a more convenient management mode.

Because hardware resources are managed centrally on the cloud service platform, when the platform operates under high load, tasks on the platform become abnormal for lack of hardware resources, which triggers a chain reaction and may ultimately cause extreme problems such as loss of service data. At present, a high-load state is mostly handled in a single way, by disabling the high-load service virtual machine or the abnormal service. This approach generally relies on the experience of operation and maintenance personnel to keep the platform running stably, so maintenance consumes a great deal of their time; and if the cloud service platform is unattended, a host may go down, so long-term stable operation of the cloud service platform cannot be guaranteed.

Therefore, how to provide a solution to the above technical problem is a problem that a person skilled in the art needs to solve at present.

Disclosure of Invention

Accordingly, the present application aims to provide a resource scheduling method, apparatus, device and storage medium that can automatically handle resource scheduling in a high-load state, are better suited to an unattended cloud service platform, prolong the running time of the platform and make it run more stably, reduce the time cost of operating and maintaining the platform, and help operation and maintenance personnel make better use of its services. The specific scheme is as follows:

in a first aspect, the application discloses a resource scheduling method, which is applied to a cloud service platform and comprises the following steps:

determining a target host which is currently judged to be in a high-load state in the cloud service platform, and identifying a corresponding high-load scene when the target host is in the high-load state;

if the high load scene is a first high load scene caused by service operation occupation, adjusting the execution priority of all execution tasks to schedule the system resources in the target host to the operation service in the target host preferentially;

and if the high load scene is a second high load scene caused by occupation of task execution, triggering a preset task scheduling algorithm, and adjusting the scheduling authority of each execution task by using the preset task scheduling algorithm so as to perform corresponding resource scheduling on each execution task through the scheduling authority.

Optionally, the determining the target host determined to be in the high-load state in the cloud service platform currently includes:

determining that a host in any one or a combination of a first state, a second state, a third state, a fourth state and a fifth state in the cloud service platform is a target host in a high-load state;

the first state is a state that the central processing unit continuously keeps 100% utilization rate in a first preset time period; the second state is a state that the memory utilization rate exceeds a first preset threshold value; the third state is a network-free state; the fourth state is a state that the network bandwidth occupancy rate exceeds a second preset threshold value; the fifth state is a state in which any disk in the host cannot be read and/or any disk continuously maintains 100% of disk occupancy rate in a second preset period of time.

Optionally, the resource scheduling method further includes:

establishing an objective function, determining an upper load limit of a host in the cloud service platform based on the objective function, and judging the first state, the second state, the third state, the fourth state and the fifth state according to the upper load limit;

Wherein the objective function is

$Pi(Task_i)$ is the priority corresponding to the current task; $De(Task_i)$ is the time required by the current task; $Lo_p$ is the load rate of the current host identified by the cloud service platform, with $Lo_p = \max(K_1, K_2, K_3, K_4)$, where $K_1$ is the central processing unit utilization rate, $K_2$ is the memory utilization rate, $K_3$ is the network bandwidth occupancy rate and $K_4$ is the disk occupancy rate; $Re(Task_i)$ is the total amount of resources occupied by the current task; $Time_p$ is the calculated remaining time until server downtime when the cloud service platform does not participate in resource scheduling; $F_1$ is the total time required by all tasks, $F_2$ is the time required to complete all currently executing tasks, $F_3$ is the proportion of memory resources occupied by the execution tasks, $F_4$ is the proportion of central processing unit resources occupied by the execution tasks, $F_5$ is the proportion of bandwidth resources occupied by the execution tasks, and $F_6$ is the proportion of disk resources occupied by the execution tasks; $Te_p$ is the duration, predicted from historical data, for which the cloud service platform is maintained after scheduling by the preset task scheduling algorithm; $T_i$ is the duration for which the cloud service platform was maintained after scheduling by the preset task scheduling algorithm in the historical data, with $T_i = (1 + M) \times F_1$, where $M$ is a preset strategy value; $n$ is the number of statistics taken from the historical data; and $Te(VM_i)$ is the optimal maintenance duration of the host in an ideal state.
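As an illustration only, the two relations that the definitions above state in closed form, the host load rate as the maximum of the four utilization figures and the historical duration as (1 + M) times the total task time, can be sketched in Python. The function names are hypothetical, and the way the historical durations are combined into a prediction (an arithmetic mean over the n records) is an assumption made for this sketch, not a formula taken from the text.

```python
def load_rate(cpu, memory, bandwidth, disk):
    """Load rate of a host: the largest of the four utilization figures (K1..K4)."""
    return max(cpu, memory, bandwidth, disk)


def maintained_duration(total_task_time, strategy_value):
    """Historical maintained duration T_i = (1 + M) * F1."""
    return (1 + strategy_value) * total_task_time


def predicted_duration(history):
    """Predicted maintained duration from n historical durations.

    ASSUMPTION: an arithmetic mean is used here; the source does not
    reproduce the formula that combines the historical values.
    """
    if not history:
        raise ValueError("at least one historical duration is required")
    return sum(history) / len(history)


if __name__ == "__main__":
    print(load_rate(0.40, 0.97, 0.20, 0.60))
    print(maintained_duration(total_task_time=100, strategy_value=0.5))
    print(predicted_duration([100.0, 200.0]))
```

Taking the maximum means a single saturated resource is enough to mark the host as heavily loaded, which matches the "any one or a combination" wording used for the five states.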

Optionally, the identifying the high load scenario corresponding to the target host in the high load state includes:

acquiring a first system resource proportion occupied by each operation service in the target host;

acquiring a second system resource proportion occupied by all execution tasks in the target host;

when the first system resource proportion is larger than the second system resource proportion and the first system resource proportion is larger than a third preset threshold value, generating alarm information and marking the current scene as the first high-load scene caused by service operation occupation;

and when the second system resource proportion is larger than the first system resource proportion and the second system resource proportion is larger than a fourth preset threshold value, generating the alarm information and identifying the current scene as the second high-load scene caused by occupation of task execution.

Optionally, if the high load scenario is a first high load scenario caused by service running occupation, adjusting execution priorities of all execution tasks to schedule system resources in the target host to running services in the target host preferentially, including:

And if the high load scene is a first high load scene caused by service operation occupation, adjusting the execution priority of all execution tasks to be lower than the execution priority of the operation service in the target host, and prolonging the execution time of the execution tasks so as to schedule the system resource in the target host to the operation service preferentially.

Optionally, if the high load scenario is a second high load scenario caused by occupation of task execution, triggering a preset task scheduling algorithm, and adjusting scheduling rights of the execution tasks by using the preset task scheduling algorithm, so as to perform corresponding resource scheduling on the execution tasks through the scheduling rights, including:

if the high load scene is a second high load scene caused by occupation of task execution, triggering a preset task scheduling algorithm, and determining the predicted task ending time length of each executed task and the system maintenance time length after scheduling by the preset task scheduling algorithm by using the preset task scheduling algorithm;

and performing an upgrade, downgrade or pause operation on each execution task according to the predicted task ending time length and the system maintenance time length, so as to perform corresponding resource scheduling on each execution task according to the result of the operation.

Optionally, if the high load scenario is a second high load scenario caused by occupation of task execution, triggering a preset task scheduling algorithm, and adjusting scheduling rights of the execution tasks by using the preset task scheduling algorithm, so as to perform corresponding resource scheduling on the execution tasks by using the scheduling rights, and then further including:

monitoring the current state of the target host to judge whether a management and control data recording event corresponding to the target host exists when the high load state is eliminated;

when the management and control data recording event is monitored, the management and control data when each execution task is subjected to corresponding resource scheduling is recorded by using the preset task scheduling algorithm, and the management and control data is stored in a preset database.

Optionally, the resource scheduling method further includes:

providing an operable interface for a host in the cloud service platform in advance;

counting target running services in the first high-load scene and target execution tasks in the second high-load scene;

and executing interrupt operation on the target operation service and the target execution task through the operable interface.

In a second aspect, the present application discloses a resource scheduling device, which is applied to a cloud service platform, and includes:

the high load identification module is used for determining a target host which is judged to be in a high load state in the cloud service platform at present and identifying a corresponding high load scene when the target host is in the high load state;

the first scheduling module is used for adjusting the execution priority of all execution tasks to schedule the system resources in the target host to the operation service in the target host preferentially if the high-load scene is a first high-load scene caused by service operation occupation;

and the second scheduling module is used for triggering a preset task scheduling algorithm and adjusting the scheduling authority of each execution task by using the preset task scheduling algorithm if the high load scene is a second high load scene caused by occupation of task execution so as to perform corresponding resource scheduling on each execution task through the scheduling authority.

In a third aspect, the present application discloses an electronic device comprising a processor and a memory; wherein the memory is for storing a computer program that is loaded and executed by the processor to implement the resource scheduling method as described above.

In a fourth aspect, the present application discloses a computer-readable storage medium for storing a computer program; wherein the computer program, when executed by a processor, implements a resource scheduling method as described above.

The application provides a resource scheduling method, which is applied to a cloud service platform and comprises the following steps: determining a target host which is currently judged to be in a high-load state in the cloud service platform, and identifying a corresponding high-load scene when the target host is in the high-load state; if the high load scene is a first high load scene caused by service operation occupation, adjusting the execution priority of all execution tasks to schedule the system resources in the target host to the operation service in the target host preferentially; and if the high load scene is a second high load scene caused by occupation of task execution, triggering a preset task scheduling algorithm, and adjusting the scheduling authority of each execution task by using the preset task scheduling algorithm so as to perform corresponding resource scheduling on each execution task through the scheduling authority.

The beneficial technical effects of the application are as follows. Through two preset modes, a target host in a high-load state can apply the resource scheduling strategy that matches the high-load scene it is in, so that customer requirements can be met individually. If the current high-load scene is the first high-load scene caused by service operation occupation, the running services are given the maximum resource support and the interference of execution tasks with them is reduced; that is, the execution priority of all execution tasks is adjusted so that the system resources in the target host are scheduled preferentially to the services running on the target host. If the current high-load scene is the second high-load scene caused by occupation of task execution, task scheduling is taken over by an intelligent algorithm designed specifically for task scheduling in this scene. Because the preset task scheduling algorithm runs on the cloud service platform, the platform provides the computing support for its operation and scheduling once it is started, which ensures that services are not stopped or dropped. Further, by adjusting the scheduling authority of each execution task, the preset task scheduling algorithm prolongs the running time of the host and can even keep the system running stably under high load, so that host services do not go down or the countdown to downtime is extended, giving operation and maintenance personnel sufficient assistance and enough time to handle the high-load problem.

In summary, the method is better suited to handling high-load problems automatically when the cloud service platform is unattended, reduces the time cost of operating and maintaining the platform, helps operation and maintenance personnel make better use of cloud platform services, and improves the stability of the cloud platform under special conditions.

In addition, the resource scheduling device, the equipment and the storage medium provided by the application correspond to the resource scheduling method and have the same effects.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present application, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flowchart of a resource scheduling method disclosed in the present application;

FIG. 2 is a flowchart of resource scheduling in a high-load state of a cloud service platform disclosed in the present application;

FIG. 3 is a flowchart of a specific resource scheduling method disclosed in the present application;

FIG. 4 is a schematic structural diagram of a resource scheduling device disclosed in the present application;

FIG. 5 is a block diagram of an electronic device disclosed in the present application.

Detailed Description

The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.

Currently, a cloud service platform mostly handles a high-load state in a single way, by disabling the high-load service virtual machine or the abnormal service, and this approach generally relies on the experience of operation and maintenance personnel to keep the platform running stably. Maintenance therefore consumes a great deal of their time, and when the cloud service platform is unattended, host services may go down, so that abnormal tasks in the platform trigger a chain reaction that may ultimately cause extreme problems such as loss of service data. Long-term stable operation of the cloud service platform therefore cannot be guaranteed.

Therefore, the application provides a resource scheduling scheme which can automatically process the resource scheduling problem under the high load state, prolong the operation time of the cloud service platform under the unattended condition and operate more stably, reduce the time cost of the operation and maintenance cloud service platform and assist the operation and maintenance personnel to better use the service of the cloud service platform.

Referring to FIG. 1, which shows a flowchart of a resource scheduling method applied to a cloud service platform, the method includes:

Step S11: determining a target host in the cloud service platform that is currently judged to be in a high-load state, and identifying the high-load scene corresponding to that state.

In the embodiment of the application, when the host in the cloud service platform meets the high load condition, the host is judged to be in a high load state, and the host in the high load state is taken as a target host. The host meeting the high load condition comprises a host which is in any one or a combination of a plurality of states of a first state, a second state, a third state, a fourth state and a fifth state in the cloud service platform.

In a first specific embodiment, the first state is a state in which the central processing unit (CPU) continuously maintains 100% utilization for a first preset time period. The first preset time period may be set as needed and may, for example, default to 15 minutes. For example, if the CPU remains at 100% utilization for a full 15 minutes, the host may be determined to be in a high-load state.

In a second specific embodiment, the second state is a state in which the memory utilization rate exceeds a first preset threshold. For example, if the first preset threshold is set to 95%, the host may be determined to be in a high-load state when the memory utilization rate reaches 95% or above.

In a third specific embodiment, the third state is a network-free state. If the network card is disconnected during the operation of the host, the host enters a network-free state at this time, and the host is determined to be in a high-load state.

In a fourth specific embodiment, the fourth state is a state in which the network bandwidth occupancy rate exceeds a second preset threshold. Likewise, if the second preset threshold is set to 95%, the host may be determined to be in a high-load state when the network bandwidth occupancy rate reaches 95% or above.

In a fifth specific embodiment, the fifth state is a state in which any disk in the host cannot be read and/or any disk continuously maintains 100% disk occupancy for a second preset time period. The second preset time period may also be set as needed and may, for example, default to 15 minutes. For example, if a disk in the host is unreadable, or the occupancy of a disk in the host remains at 100% for 15 minutes, the host may be determined to be in a high-load state.

It may be understood that the load information of the host covers multiple resources. It is not limited to the resources corresponding to the five states above and may also include load information of other resources, for example available memory, network input/output (I/O) and storage I/O usage, which is not specifically limited in the embodiments of the present application.
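The five-state check described above can be sketched as follows. This is a minimal illustration, not the platform's implementation: the `HostSample` fields and function names are hypothetical, and the 95% thresholds and 15-minute periods are simply the example defaults mentioned in the embodiments.

```python
from dataclasses import dataclass

# Example defaults from the embodiments; in practice these are configurable.
MEMORY_THRESHOLD = 0.95     # first preset threshold
BANDWIDTH_THRESHOLD = 0.95  # second preset threshold


@dataclass
class HostSample:
    """Hypothetical snapshot of one host's monitored metrics."""
    cpu_full_minutes: float   # minutes the CPU has stayed at 100% utilization
    memory_usage: float       # memory utilization, 0.0 to 1.0
    network_up: bool          # False when the host has no network
    bandwidth_usage: float    # network bandwidth occupancy, 0.0 to 1.0
    disks_readable: bool      # False when any disk cannot be read
    disk_full_minutes: float  # minutes any disk has stayed at 100% occupancy


def high_load_states(sample, cpu_period=15, disk_period=15):
    """Return the numbers (1 to 5) of the high-load states the host is in."""
    states = []
    if sample.cpu_full_minutes >= cpu_period:
        states.append(1)
    # ">=" follows the "95% or above" example given in the text.
    if sample.memory_usage >= MEMORY_THRESHOLD:
        states.append(2)
    if not sample.network_up:
        states.append(3)
    if sample.bandwidth_usage >= BANDWIDTH_THRESHOLD:
        states.append(4)
    if not sample.disks_readable or sample.disk_full_minutes >= disk_period:
        states.append(5)
    return states


def is_target_host(sample):
    """A host in any one of the five states is treated as a target host."""
    return bool(high_load_states(sample))


if __name__ == "__main__":
    busy = HostSample(20, 0.50, True, 0.10, True, 0)
    print(high_load_states(busy), is_target_host(busy))
```

Returning the list of matched states, rather than a single boolean, keeps the information needed for the alarm message that the platform later generates.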

In the embodiment of the application, after the target host in the high-load state is determined, the cause of its high-load state needs to be further identified; that is, the high-load scene corresponding to the target host's high-load state is identified. The method presets resource scheduling strategies for two modes, corresponding to two different high-load scenes, so that users' requirements can be met individually in either mode and resources are scheduled better.

Specifically, a first system resource proportion occupied by the running services in the target host and a second system resource proportion occupied by all execution tasks in the target host are obtained respectively. The high-load scene corresponding to the target host's current high-load state is then judged according to the relationship between the two proportions.
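The comparison of the two proportions can be sketched as below, using the rule stated in this application: the larger proportion decides the scene, provided it also exceeds its own preset threshold (the third threshold for services, the fourth for tasks). The enum, the function name and the 0.5 default thresholds are assumptions for illustration; the source does not give threshold values.

```python
from enum import Enum


class Scene(Enum):
    SERVICE_OCCUPATION = "first high-load scene"   # caused by running services
    TASK_OCCUPATION = "second high-load scene"     # caused by execution tasks
    UNDETERMINED = "undetermined"                  # neither condition holds


def identify_scene(service_share, task_share,
                   third_threshold=0.5, fourth_threshold=0.5):
    """Classify the high-load scene from the two resource proportions.

    service_share: first system resource proportion (running services)
    task_share:    second system resource proportion (all execution tasks)
    The default thresholds are placeholders, not values from the source.
    """
    if service_share > task_share and service_share > third_threshold:
        return Scene.SERVICE_OCCUPATION
    if task_share > service_share and task_share > fourth_threshold:
        return Scene.TASK_OCCUPATION
    return Scene.UNDETERMINED


if __name__ == "__main__":
    print(identify_scene(0.70, 0.25))
    print(identify_scene(0.20, 0.75))
```

The `UNDETERMINED` case is an addition of this sketch: the text does not say what happens when neither proportion passes its threshold, so the example makes that gap explicit instead of guessing.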

Step S12: if the high-load scene is a first high-load scene caused by service operation occupation, adjusting the execution priority of all execution tasks so as to schedule the system resources in the target host preferentially to the running services in the target host.

In a specific embodiment, when the host is in a high load state, if the first system resource proportion is greater than the second system resource proportion and the first system resource proportion is greater than a third preset threshold, alarm information is generated and the current scene is identified as the first high load scene caused by service operation occupation.

In the embodiment of the application, the first high-load scene corresponds to a high-load state caused by overlarge service operation occupation proportion. When the target host is in a first high-load scene, the cloud service platform generates alarm information to inform operation and maintenance personnel to pay attention to the high-load state of the host, meanwhile, the current scene is identified as the first high-load scene caused by service operation occupation, and the reason of the high load in the current scene is explained to be caused by service operation occupation.

In the embodiment of the application, if the high load scene is the first high load scene caused by service operation occupation, in order to ensure stable operation of the service on the host, the interference and influence of the task on the service resource need to be reduced. Therefore, the execution priority of all execution tasks is adjusted to schedule the system resources in the target host to the running service in the target host preferentially.

Specifically, the execution priority of all execution tasks is adjusted to be lower than that of the running services in the target host, and the execution time of the execution tasks is prolonged, so that the system resources in the target host are scheduled preferentially to the running services. Once the execution priority of all execution tasks is lower than that of the running services, the predicted completion time of those tasks is extended; under this conservative strategy, the influence of the execution tasks on the running services is reduced, hardware resources are offered first to the services running on the host, and the running services receive the maximum resource support.
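A minimal sketch of this conservative strategy follows. It assumes a simple in-memory model in which a larger priority number is scheduled first, and it models the "prolonged execution time" as a fixed stretch factor on each task's expected duration; the `Workload` type, the `stretch` parameter and the function name are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    kind: str                # "service" or "task"
    priority: int            # ASSUMPTION: larger number is scheduled first
    expected_minutes: float  # predicted time to completion (tasks only)


def favor_services(workloads, stretch=1.5):
    """Push every task below the lowest-priority running service.

    The stretch factor stands in for the extended predicted completion
    time of demoted tasks; its value is illustrative only.
    """
    service_priorities = [w.priority for w in workloads if w.kind == "service"]
    if not service_priorities:
        return workloads  # nothing to protect
    ceiling = min(service_priorities) - 1
    for w in workloads:
        if w.kind == "task":
            w.priority = min(w.priority, ceiling)
            w.expected_minutes *= stretch
    return workloads


if __name__ == "__main__":
    result = favor_services([
        Workload("database", "service", 5, 0.0),
        Workload("backup", "task", 8, 10.0),
    ])
    for w in result:
        print(w)
```

Services are left untouched on purpose: in the first high-load scene only the execution tasks are adjusted.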

Step S13: if the high-load scene is a second high-load scene caused by occupation of task execution, triggering a preset task scheduling algorithm, and adjusting the scheduling authority of each execution task by using the preset task scheduling algorithm, so as to perform corresponding resource scheduling on each execution task through the scheduling authority.

In another specific embodiment, when the host is in a high load state, if the second system resource proportion is greater than the first system resource proportion and the second system resource proportion is greater than a fourth preset threshold, the alarm information is generated and the current scene is identified as the second high load scene due to occupation of task execution.

In the embodiment of the application, the second high-load scene corresponds to a high-load state caused by overlarge occupation ratio of task execution. When the target host is in the second high-load scene, the cloud service platform also generates alarm information to inform operation and maintenance personnel of paying attention to the high-load state of the host, and meanwhile, the current scene is identified as the second high-load scene caused by occupation of task execution, and the reason for high load in the current scene is explained as the occupation of task execution.

In the embodiment of the application, if the high-load scene is the second high-load scene caused by occupation of task execution, a preset task scheduling algorithm is triggered and task-level scheduling is performed by this intelligent algorithm to ensure that the target host gets through the second high-load scene. After the preset task scheduling algorithm is started, it adjusts the scheduling authority of each execution task, so that each execution task is scheduled according to that authority. It can be appreciated that the task state records of the tasks in the processing state in the target host may be traversed, and each execution task may be determined according to its task execution ratio. For example, a task whose task execution ratio is greater than zero may be determined to be an execution task in the processing state; alternatively, a threshold index may be set, and a task whose task execution ratio is greater than the threshold index may be determined to be an execution task in the processing state. This is not particularly limited.
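The traversal of task state records can be pictured with a short sketch. The record layout (a dict with `name` and `execution_ratio` keys) and the function name are assumptions made for the example; only the selection rule, a task execution ratio greater than zero or greater than a configured threshold index, comes from the text.

```python
def select_execution_tasks(task_records, threshold_index=0.0):
    """Return the names of tasks whose execution ratio exceeds the threshold.

    With the default threshold of zero, any task that has made progress
    counts as an execution task in the processing state.
    """
    return [
        record["name"]
        for record in task_records
        if record["execution_ratio"] > threshold_index
    ]


if __name__ == "__main__":
    records = [
        {"name": "snapshot", "execution_ratio": 0.40},
        {"name": "migration", "execution_ratio": 0.05},
        {"name": "queued-backup", "execution_ratio": 0.0},
    ]
    print(select_execution_tasks(records))
    print(select_execution_tasks(records, threshold_index=0.10))
```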

Specifically, the preset task scheduling algorithm is used to determine the predicted task ending time length of each execution task and the system maintenance time length after scheduling by the algorithm; an upgrade, downgrade or pause operation is then performed on each execution task according to the predicted task ending time length and the system maintenance time length, so that each execution task is scheduled according to the result of the operation.

In the embodiment of the application, once the preset task scheduling algorithm takes over task scheduling, the predicted task ending time length of each execution task is calculated from that task's condition, and the system maintenance time length after algorithm scheduling is also obtained. The task execution order is adjusted according to these two values, so that the search for an optimal task scheduling order becomes a constrained task scheduling process driven by the service requirements put forward by customers. It should be noted that the scheduling authority of the preset task scheduling algorithm is limited to calculating the execution priority of each task and suspending task execution; that is, the only operations it may apply to an execution task are upgrade, downgrade and pause. This prevents tasks from being interrupted while they are managed by the cloud service platform, which would affect the normal operation of business services. In addition, because the operation and scheduling of the preset task scheduling algorithm are carried out by the cloud service platform, services can be ensured not to stop or drop.
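The restriction to three operations can be made concrete with a small sketch. The decision rule used here is purely an assumption for illustration, since the text does not state how the two durations are compared: a task expected to finish within the system maintenance time length is upgraded, one that overshoots it moderately is downgraded, and one that overshoots it by more than a hypothetical `pause_factor` is paused. What the sketch does take from the text is that no interrupt or cancel action exists.

```python
from enum import Enum


class Action(Enum):
    UPGRADE = "upgrade"
    DOWNGRADE = "downgrade"
    PAUSE = "pause"
    # Deliberately no CANCEL/INTERRUPT: the algorithm's authority is
    # limited to these three operations.


def choose_action(predicted_end, maintenance, pause_factor=2.0):
    """Pick an operation for one task (illustrative rule, not from the source).

    predicted_end: predicted task ending time length
    maintenance:   system maintenance time length after scheduling
    """
    if predicted_end <= maintenance:
        return Action.UPGRADE      # can finish in time, so let it finish
    if predicted_end <= pause_factor * maintenance:
        return Action.DOWNGRADE    # keep running with lower priority
    return Action.PAUSE            # suspend until load drops


def plan(predicted_ends, maintenance):
    """Map each task name to its chosen operation."""
    return {
        name: choose_action(end, maintenance)
        for name, end in predicted_ends.items()
    }


if __name__ == "__main__":
    for name, action in plan({"a": 10, "b": 50, "c": 200}, 30).items():
        print(name, action.value)
```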

It should be noted that, in the embodiment of the present application, the data generated when resources are scheduled by the preset task scheduling algorithm are also recorded. First, the current state of the target host is monitored to judge whether a management and control data recording event corresponding to the target host exists when the high-load state is eliminated. It will be appreciated that elimination of the high-load state may mean that the host has exited the high-load state, or that the server has gone down because of the high-load state; in the latter case the server is down, but the time before it went down was extended, which gave operation and maintenance personnel time to address the problem. Further, when the management and control data recording event is detected, the preset task scheduling algorithm records the management and control data produced while each execution task was scheduled, and stores the data in a preset database. It can be understood that the management and control data may include the adjustment scheme used to adjust the scheduling authority of the execution tasks, the task management and control duration, the management and control task records, and the service resource occupation duration. In addition, the management and control data may also include host hardware state and resource state data from before the preset task scheduling algorithm was triggered, the changes in hardware state and resource state after it was triggered, and the reason management and control finally ended.
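As a rough illustration of storing management and control data in a preset database, the sketch below writes one JSON record per recording event into SQLite using Python's standard library. The choice of SQLite, the table name `control_records` and the record keys are all assumptions for the example; the source only says that the data are stored in a preset database.

```python
import json
import sqlite3


def _ensure_table(conn):
    # Hypothetical schema: one JSON payload per recording event.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS control_records (host TEXT, payload TEXT)"
    )


def save_control_record(conn, host, record):
    """Store the management and control data for one host as JSON."""
    _ensure_table(conn)
    conn.execute(
        "INSERT INTO control_records (host, payload) VALUES (?, ?)",
        (host, json.dumps(record)),
    )
    conn.commit()


def load_control_records(conn, host):
    """Return all stored records for a host, oldest first."""
    _ensure_table(conn)
    rows = conn.execute(
        "SELECT payload FROM control_records WHERE host = ? ORDER BY rowid",
        (host,),
    ).fetchall()
    return [json.loads(payload) for (payload,) in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    save_control_record(conn, "host-1", {
        "adjustments": [["backup", "pause"], ["report", "downgrade"]],
        "control_minutes": 42,
        "end_reason": "high-load state exited",
    })
    print(load_control_records(conn, "host-1"))
```

Storing the payload as JSON keeps the sketch flexible about which of the listed fields (adjustment scheme, durations, hardware state before and after, end reason) a given deployment chooses to record.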

Fig. 2 is a flow chart of resource scheduling in a high-load state of a cloud service platform. In the figure, "platform" refers to a "cloud service platform". First, a high load state determination is performed on a host, and a target host in a high load state is identified. When the host in the platform meets the high load condition, the host is judged to be in a high load state, and the platform controls resources to keep the host in a stable state. Secondly, identifying the cause of the high load state of the host: when a high load state occurs due to running of a service in a system, a conservative strategy is used, so that the influence of an executing task in a host on the service is reduced, and the maximum resource support for running of the service is given; when a high load state is caused by the execution of a plurality of tasks in the system, the platform performs task scheduling by using a preset task scheduling algorithm. When the host is in a high-load state, alarm information of the platform is generated, operation and maintenance personnel are informed of paying attention to the high-load state of the host, and meanwhile whether the reason of the high load is caused by service occupation or task occupation is identified. The operation and maintenance personnel can operate according to the high load reason, including exiting the control of the preset task scheduling algorithm for manual taking over or continuing to enable the preset task scheduling algorithm to perform task control. Finally, when the high load state is eliminated, the preset task scheduling algorithm also records management and control data.

The following describes in detail how to determine, according to the upper load limit of each host in the cloud service platform, whether the current host is a target host in a high-load state. It can be understood that a management node is provided in the cloud service platform to manage the hosts in the platform, and that the management node needs to know the upper load limit of each host it manages so that it can begin balancing the host's resources once the host enters a high-load state. An objective function is therefore established, the upper load limit of the hosts in the cloud service platform is determined based on the objective function, and the first state, the second state, the third state, the fourth state and the fifth state are judged according to the upper load limit. In this way, whether a host is currently in a high-load state can be determined from its upper load limit.

Specifically, the objective function is

Before resource scheduling begins, constraint values of the task environment need to be set; these constraint values can be obtained directly from the system. $\mathrm{Pi}(\mathrm{Task}_i)$ is the priority of the current task and reflects its importance; its value range is $[0, 1]$, and the degree of importance is determined by the resources the task operates on. $\mathrm{De}(\mathrm{Task}_i)$ is the time required by the current task; its value range is $[0, 60000]$, in milliseconds. $\mathrm{lo}_p$ is the load rate of the current host as identified by the cloud service platform; it is a balanced load rate calculated from various indicators of the host environment, and its range is $[0, 1]$. It is calculated as $\mathrm{lo}_p = \max(K_1, K_2, K_3, K_4)$, where $K_1$ is the CPU utilization, $K_2$ is the memory utilization, $K_3$ is the network bandwidth occupancy, and $K_4$ is the disk occupancy; $K_1$, $K_2$, $K_3$ and $K_4$ each take values in $[0, 1]$.
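The load-rate rule, the maximum of the four utilization indicators, is simple enough to sketch directly. The function name `host_load_rate` and the range check are assumptions added for illustration; how the platform actually samples the four indicators is not described here.

```python
def host_load_rate(cpu, memory, bandwidth, disk):
    """Return lo_p = max(K1, K2, K3, K4) for one host.

    Each argument is a utilization or occupancy ratio in [0, 1]:
    K1 = CPU, K2 = memory, K3 = network bandwidth, K4 = disk.
    """
    indicators = (cpu, memory, bandwidth, disk)
    for value in indicators:
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"indicator out of range [0, 1]: {value}")
    return max(indicators)
```

Taking the maximum rather than an average means that a single saturated resource, for example memory at 0.91 while CPU sits at 0.42, is enough to drive the host's load rate up.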

$\mathrm{Re}(\mathrm{Task}_i)$ is also a constraint value of the task environment; it represents the total amount of resources occupied by the current task, and its range is $[0, 1]$. $\mathrm{time}_p$ is the calculated time remaining until the server goes down if the cloud service platform does not take part in resource scheduling, that is, a prediction of how long it will be before the server crashes. This value determines the ideal time remaining when the algorithm takes over task scheduling. The value of $\mathrm{time}_p$ is determined by multidimensional data on the current host state, and its calculation formula is expressed in terms of the following quantities: $F_1$ is the total time required by all tasks, $F_2$ is the time required to complete all currently executing tasks, $F_3$ is the proportion of memory resources occupied by the execution tasks, $F_4$ is the proportion of CPU resources occupied by the execution tasks, $F_5$ is the proportion of network bandwidth resources occupied by the execution tasks, and $F_6$ is the proportion of disk resources occupied by the execution tasks.

$\mathrm{te}_p$ represents the predicted duration for which the cloud service platform is maintained after scheduling by the preset task scheduling algorithm, as estimated by the platform algorithm in combination with historical data. Its calculation formula is expressed in terms of the following quantities: $T_i$ is the duration for which the cloud service platform was maintained after scheduling by the preset task scheduling algorithm in the historical data, and $n$ is the number of statistics taken from the historical data. The value of $T_i$ is determined by a platform setting strategy: $T_i = (1 + M) \times F_1$, where $M$ is a preset strategy value with three selectable values, namely 0.3, 0.6 and 0.9. In addition, $\mathrm{Te}(\mathrm{VM}_i)$ is also a constraint value of the task environment and is the optimal maintenance duration of the host in an ideal state.
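As a small worked example of the one relation stated explicitly here, $T_i = (1 + M) \times F_1$, the sketch below computes a maintained duration for a given strategy value. The function name and the validation of `m` against the three listed values are assumptions for illustration; how the $T_i$ values are combined into $\mathrm{te}_p$ is not shown, because that formula is not available in this text.

```python
# The three selectable strategy values named in the description.
ALLOWED_M = (0.3, 0.6, 0.9)


def maintained_duration(total_task_time_ms, m):
    """Return T_i = (1 + M) * F_1.

    total_task_time_ms is F_1, the total time required by all tasks;
    m is the preset strategy value M.
    """
    if m not in ALLOWED_M:
        raise ValueError(f"M must be one of {ALLOWED_M}, got {m}")
    if total_task_time_ms < 0:
        raise ValueError("F_1 must be non-negative")
    return (1 + m) * total_task_time_ms
```

With $F_1$ = 10000 ms, the three strategies give roughly 13000 ms, 16000 ms and 19000 ms, so a larger $M$ corresponds to a longer maintained duration.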

The above sets out the calculation logic of the main dynamic parameters $\mathrm{lo}_p$, $\mathrm{time}_p$ and $\mathrm{te}_p$. In the embodiment of the application, the upper load limit of the hosts in the cloud service platform is determined by using the objective function, so that resource scheduling begins once a host enters a high-load state and the host is kept stable. The host service either does not go down or the countdown to downtime is prolonged, which gives operation and maintenance personnel more opportunity to take remedial action.

Referring to fig. 3, fig. 3 is a flowchart of a specific resource scheduling method disclosed in the embodiment of the present application. The method may provide an operable interface through which operation and maintenance personnel decide whether manual intervention is needed to handle a high-load problem, and further includes:

Step S21: providing, in advance, an operable interface for a host in the cloud service platform.

Step S22: counting target running services in the first high-load scene and target execution tasks in the second high-load scene.

Step S23: executing an interrupt operation on the target running service and the target execution task through the operable interface.

In the embodiment of the application, an operable interface is provided for operation and maintenance personnel, offering the option of manual operation: through the interface, personnel decide whether manual intervention is needed to handle the high-load problem. When alarm information is generated to remind operation and maintenance personnel to observe the host, this indicates that the current host is in a high-load state. It will be appreciated that the current host may be in the first high-load scene or in the second high-load scene. If the current host is in the first high-load scene, the target running services in that scene are counted. Normally, when a high-load state arises because service pressure has increased, the host exits the high-load state as service pressure decreases, so whether to execute an interrupt operation is decided manually through the operable interface. If the current host is in the second high-load scene, the target execution tasks in that scene are counted. Since the scheduling authority of the preset task scheduling algorithm is limited to priority adjustment and pausing, the algorithm itself performs no destructive interrupt operation on tasks; whether to execute an interrupt operation is therefore likewise decided manually through the operable interface.
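The division of labor described above, where the platform counts the candidates and a person decides on any interrupt, might be sketched as follows. The names `Scene`, `collect_targets` and `manual_interrupt`, and the use of a callback to stand in for the operator's choice in the interface, are all assumptions for illustration.

```python
from enum import Enum


class Scene(Enum):
    SERVICE = 1  # first high-load scene: caused by running services
    TASK = 2     # second high-load scene: caused by executing tasks


def collect_targets(scene, running_services, executing_tasks):
    """Step S22: gather the interrupt candidates for the identified scene."""
    if scene is Scene.SERVICE:
        return list(running_services)
    return list(executing_tasks)


def manual_interrupt(targets, operator_confirms):
    """Step S23: interrupt only what the operator explicitly confirms.

    operator_confirms is a callable standing in for the operable interface;
    it receives a target name and returns True to interrupt it.
    """
    return [target for target in targets if operator_confirms(target)]
```

If the operator confirms nothing, the returned list is empty and the automatic handling simply continues, which matches the unattended case.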

Therefore, by providing the operable interface, the embodiment of the application lets operation and maintenance personnel decide whether to intervene manually in the resource scheduling process of the cloud service platform. The method is thus suitable not only for handling high-load problems automatically when the cloud platform is unattended, but also offers the option of managing task execution manually. This can reduce maintenance costs for operation and maintenance personnel, assists them, and gives them sufficient time to handle the high-load problem. The stability of the cloud platform under special conditions is improved, and operation and maintenance personnel are helped to make better use of cloud platform services.

Correspondingly, the embodiment of the application also discloses a resource scheduling device which is applied to the cloud service platform, and referring to fig. 4, the device comprises:

the high load identification module 11 is configured to determine a target host that is currently determined to be in a high load state in the cloud service platform, and identify a corresponding high load scenario when the target host is in the high load state;

the first scheduling module 12 is configured to adjust execution priorities of all execution tasks to schedule system resources in the target host to an operation service in the target host preferentially, if the high-load scenario is a first high-load scenario caused by service operation occupation;

the second scheduling module 13 is configured to trigger a preset task scheduling algorithm and adjust the scheduling authority of each execution task by using the preset task scheduling algorithm if the high load scenario is a second high load scenario caused by occupation of task execution, so as to perform corresponding resource scheduling on each execution task by using the scheduling authority.

The more specific working process of each module may refer to the corresponding content disclosed in the foregoing embodiment, and will not be described herein.

Therefore, through the above scheme of the embodiment, the application to the cloud service platform includes: determining a target host which is currently judged to be in a high-load state in the cloud service platform, and identifying a corresponding high-load scene when the target host is in the high-load state; if the high load scene is a first high load scene caused by service operation occupation, adjusting the execution priority of all execution tasks to schedule the system resources in the target host to the operation service in the target host preferentially; and if the high load scene is a second high load scene caused by occupation of task execution, triggering a preset task scheduling algorithm, and adjusting the scheduling authority of each execution task by using the preset task scheduling algorithm so as to perform corresponding resource scheduling on each execution task through the scheduling authority.

In a specific embodiment, the high load identification module 11 includes:

the target host determining unit is used for determining that a host in any one or a combination of a first state, a second state, a third state, a fourth state and a fifth state in the cloud service platform is a target host in a high-load state;

In a specific embodiment, the resource scheduling device further includes:

the state identification unit is used for establishing an objective function, determining the upper load limit of a host in the cloud service platform based on the objective function, and judging the first state, the second state, the third state, the fourth state and the fifth state according to the upper load limit;

Wherein the objective function is

$\mathrm{Pi}(\mathrm{Task}_i)$ is the priority corresponding to the current task, $\mathrm{De}(\mathrm{Task}_i)$ is the time required by the current task, and $\mathrm{lo}_p$ is the load rate of the current host identified by the cloud service platform; $\mathrm{lo}_p = \max(K_1, K_2, K_3, K_4)$, where $K_1$ is the CPU utilization, $K_2$ is the memory utilization, $K_3$ is the network bandwidth occupancy, and $K_4$ is the disk occupancy; $\mathrm{Re}(\mathrm{Task}_i)$ is the total amount of resources occupied by the current task, and $\mathrm{time}_p$ is the calculated time remaining until server downtime when the cloud service platform does not take part in resource scheduling; $F_1$ is the total time required by all tasks, $F_2$ is the time required to complete all currently executing tasks, $F_3$ is the proportion of memory resources occupied by the execution tasks, $F_4$ is the proportion of CPU resources occupied by the execution tasks, $F_5$ is the proportion of network bandwidth resources occupied by the execution tasks, and $F_6$ is the proportion of disk resources occupied by the execution tasks; $\mathrm{te}_p$ is the predicted duration for which the cloud service platform is maintained after scheduling by the preset task scheduling algorithm, estimated in combination with historical data; $T_i$ is the duration for which the cloud service platform was maintained after scheduling by the preset task scheduling algorithm in the historical data, $T_i = (1 + M) \times F_1$, and $M$ is a preset strategy value; $n$ is the number of statistics taken from the historical data; $\mathrm{Te}(\mathrm{VM}_i)$ is the optimal maintenance duration of the host in an ideal state.

In a specific embodiment, the high load identification module 11 includes:

a first system resource proportion obtaining unit, configured to obtain a first system resource proportion occupied by each operation service in the target host;

a second system resource proportion obtaining unit, configured to obtain a second system resource proportion occupied by all execution tasks in the target host;

the first high-load scene identification unit is used for generating alarm information and identifying the current scene as the first high-load scene caused by service operation occupation when the first system resource proportion is larger than the second system resource proportion and the first system resource proportion is larger than a third preset threshold value;

and the second high-load scene identification unit is used for generating the alarm information and identifying the current scene as the second high-load scene caused by occupation of task execution when the second system resource proportion is larger than the first system resource proportion and the second system resource proportion is larger than a fourth preset threshold value.
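The two identification conditions above translate directly into a pair of comparisons. In the sketch below, the function name, the string return values and the example thresholds are assumptions; the application only states that a third and a fourth preset threshold exist, not what their values are.

```python
def identify_scene(service_ratio, task_ratio, third_threshold, fourth_threshold):
    """Classify the high-load scene from two system resource proportions.

    service_ratio: first system resource proportion (running services).
    task_ratio:    second system resource proportion (all execution tasks).
    Returns "first", "second", or None when neither condition holds.
    """
    if service_ratio > task_ratio and service_ratio > third_threshold:
        return "first"   # high load caused by service operation occupation
    if task_ratio > service_ratio and task_ratio > fourth_threshold:
        return "second"  # high load caused by task execution occupation
    return None
```

For example, with both thresholds set to an illustrative 0.5, a host whose services occupy 0.7 of resources and whose tasks occupy 0.2 is classified as the first scene, and alarm information would be generated at that point.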

In a specific embodiment, the first scheduling module 12 includes:

and the first scheduling execution unit is used for adjusting the execution priority of all execution tasks to be lower than the execution priority of the running service in the target host if the high-load scene is a first high-load scene caused by service running occupation, and prolonging the execution time of the execution tasks so as to schedule the system resources in the target host to the running service preferentially.
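One way to picture the first scheduling strategy is as a cap applied to every task priority so that none exceeds the lowest service priority. The sketch below is an assumption about how such a cap could be computed; the `margin` parameter and the dictionary representation are illustrative, and the accompanying extension of task execution time is not modeled.

```python
def demote_tasks(task_priorities, service_priorities, margin=0.05):
    """Cap every task priority below the lowest running-service priority.

    Both arguments map a name to a priority in [0, 1]. The cap is the
    lowest service priority minus margin, floored at 0.0; tasks already
    under the cap keep their priority.
    """
    if not service_priorities:
        return dict(task_priorities)  # nothing to protect, leave unchanged
    cap = max(0.0, min(service_priorities.values()) - margin)
    return {name: min(priority, cap) for name, priority in task_priorities.items()}
```

With services at 0.7 and 0.8, a task at 0.9 is lowered to about 0.65 while a task at 0.2 is left alone, so system resources go to the running services first.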

In a specific embodiment, the second scheduling module 13 includes:

the second scheduling execution unit is used for triggering a preset task scheduling algorithm if the high load scene is a second high load scene caused by occupation of task execution, and determining, by using the preset task scheduling algorithm, the predicted task end duration of each execution task and the system maintenance duration after scheduling by the preset task scheduling algorithm; and performing an upgrading, downgrading or pausing operation on each execution task according to the predicted task end duration and the system maintenance duration, so as to perform corresponding resource scheduling on each execution task according to the operation result.

In a specific embodiment, the resource scheduling device further includes:

the current state monitoring unit is used for monitoring the current state of the target host after the preset task scheduling algorithm has been triggered in the second high load scene caused by occupation of task execution, the scheduling authority of each execution task has been adjusted by using the preset task scheduling algorithm, and corresponding resource scheduling has been performed on each execution task through the scheduling authority, so as to judge whether a management and control data recording event corresponding to elimination of the high load state of the target host currently exists;

and the management and control data recording unit is used for recording management and control data when each execution task performs corresponding resource scheduling by using the preset task scheduling algorithm when the management and control data recording event is monitored, and storing the management and control data into a preset database.

In a specific embodiment, the resource scheduling device further includes:

an operable interface providing unit for providing an operable interface to a host in the cloud service platform in advance;

the target operation service statistics unit is used for counting the target operation service in the first high-load scene;

The target execution task statistics unit is used for counting target execution tasks in the second high-load scene;

and the operable interface operation unit is used for executing interrupt operation on the target running service and the target execution task through the operable interface.

Further, the embodiment of the present application further discloses an electronic device, and fig. 5 is a block diagram of an electronic device 20 according to an exemplary embodiment, where the content of the figure is not to be considered as any limitation on the scope of use of the present application.

Fig. 5 is a schematic structural diagram of an electronic device 20 according to an embodiment of the present application. The electronic device 20 may specifically include: at least one processor 21, at least one memory 22, a power supply 23, a communication interface 24, an input output interface 25, and a communication bus 26. Wherein the memory 22 is configured to store a computer program that is loaded and executed by the processor 21 to implement the relevant steps in the resource scheduling method disclosed in any of the foregoing embodiments. In addition, the electronic device 20 in the present embodiment may be a computer.

In this embodiment, the power supply 23 is configured to provide an operating voltage for each hardware device on the electronic device 20; the communication interface 24 can create a data transmission channel between the electronic device 20 and an external device, and the communication protocol to be followed is any communication protocol applicable to the technical solution of the present application, which is not specifically limited herein; the input/output interface 25 is used for acquiring external input data or outputting external output data, and the specific interface type thereof may be selected according to the specific application requirement, which is not limited herein.

The memory 22 may be a carrier for storing resources, such as a read-only memory, a random access memory, a magnetic disk, or an optical disk, and the resources stored thereon may include an operating system 221, a computer program 222, data 223, and the like, and the data 223 may include various data. The storage means may be a temporary storage or a permanent storage.

The operating system 221 is used for managing and controlling the hardware devices on the electronic device 20 and the computer program 222, and may be Windows Server, NetWare, Unix, Linux, etc. In addition to the computer program that performs the resource scheduling method executed by the electronic device 20 as disclosed in any of the previous embodiments, the computer program 222 may further include computer programs that perform other specific tasks.

Further, embodiments of the present application also disclose a computer-readable storage medium for storing a computer program. The computer-readable storage medium includes random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, magnetic disk, optical disk, or any other form of storage medium known in the art. The computer program, when executed by a processor, implements the aforementioned resource scheduling method. For the specific steps of the method, reference may be made to the corresponding contents disclosed in the foregoing embodiments, and no further description is given here.

In this specification, the embodiments are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be understood by reference to one another. Since the device disclosed in the embodiment corresponds to the method disclosed in the embodiment, its description is relatively brief; for relevant details, refer to the description of the method.

The steps of a resource scheduling method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.

Finally, it is further noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.

The above detailed description of the resource scheduling method, device, apparatus and storage medium provided by the present invention applies specific examples to illustrate the principles and embodiments of the present invention, and the above examples are only used to help understand the method and core ideas of the present invention; meanwhile, as those skilled in the art will have variations in the specific embodiments and application scope in accordance with the ideas of the present invention, the present description should not be construed as limiting the present invention in view of the above.

Claims

1. The resource scheduling method is characterized by being applied to a cloud service platform and comprising the following steps of:

2. The method for scheduling resources according to claim 1, wherein the determining a target host currently determined to be in a high load state in the cloud service platform includes:

3. The resource scheduling method of claim 2, further comprising:

Wherein the objective function is

$\mathrm{Pi}(\mathrm{Task}_i)$ is the priority corresponding to the current task, $\mathrm{De}(\mathrm{Task}_i)$ is the time required by the current task, and $\mathrm{lo}_p$ is the load rate of the current host identified by the cloud service platform; $\mathrm{lo}_p = \max(K_1, K_2, K_3, K_4)$, where $K_1$ is the CPU utilization, $K_2$ is the memory utilization, $K_3$ is the network bandwidth occupancy, and $K_4$ is the disk occupancy; $\mathrm{Re}(\mathrm{Task}_i)$ is the total amount of resources occupied by the current task, and $\mathrm{time}_p$ is the calculated time remaining until server downtime when the cloud service platform does not take part in resource scheduling; $F_1$ is the total time required by all tasks, $F_2$ is the time required to complete all currently executing tasks, $F_3$ is the proportion of memory resources occupied by the execution tasks, $F_4$ is the proportion of CPU resources occupied by the execution tasks, $F_5$ is the proportion of network bandwidth resources occupied by the execution tasks, and $F_6$ is the proportion of disk resources occupied by the execution tasks; $\mathrm{te}_p$ is the predicted duration for which the cloud service platform is maintained after scheduling by the preset task scheduling algorithm, estimated in combination with historical data; $T_i$ is the duration for which the cloud service platform was maintained after scheduling by the preset task scheduling algorithm in the historical data, $T_i = (1 + M) \times F_1$, and $M$ is a preset strategy value; $n$ is the number of statistics taken from the historical data; $\mathrm{Te}(\mathrm{VM}_i)$ is the optimal maintenance duration of the host in an ideal state.

4. The method for scheduling resources according to claim 1, wherein the identifying the corresponding high load scenario when the target host is in the high load state includes:

5. The method for scheduling resources according to claim 1, wherein if the high load scenario is a first high load scenario due to service running occupation, adjusting execution priorities of all execution tasks to schedule system resources in the target host to running services in the target host preferentially, comprises:

6. The method according to claim 1, wherein if the high load scenario is a second high load scenario caused by occupation of task execution, triggering a preset task scheduling algorithm and adjusting scheduling rights of each execution task by using the preset task scheduling algorithm, so as to perform corresponding resource scheduling on each execution task by using the scheduling rights, including:

7. The method for scheduling resources according to claim 1, wherein if the high load scenario is a second high load scenario caused by occupation of task execution, triggering a preset task scheduling algorithm and adjusting scheduling rights of each execution task by using the preset task scheduling algorithm, so as to perform corresponding resource scheduling on each execution task by using the scheduling rights, further comprising:

8. The resource scheduling method according to any one of claims 1 to 7, characterized by further comprising:

9. The resource scheduling device is characterized by being applied to a cloud service platform and comprising the following components:

10. An electronic device comprising a processor and a memory; wherein the memory is for storing a computer program to be loaded and executed by the processor to implement the resource scheduling method of any one of claims 1 to 8.

11. A computer-readable storage medium storing a computer program; wherein the computer program when executed by a processor implements the resource scheduling method of any one of claims 1 to 8.