CN115080211A - Task scheduling method and system of virtualization platform system and related components

Info

Publication number: CN115080211A
Application number: CN202210764561.3A
Authority: CN
Inventors: 潘景基
Original assignee: Inspur Jinan data Technology Co ltd
Current assignee: Inspur Jinan data Technology Co ltd
Priority date: 2022-06-30
Filing date: 2022-06-30
Publication date: 2022-09-20

Abstract

The application discloses a task scheduling method and system for a virtualization platform system, and related components. It relates to the field of virtualization platform systems and is used for scheduling the tasks of virtual machines. The method comprises the following steps: monitoring the network connection state of each task node; judging whether the network speed of each task node is lower than a preset speed for a duration exceeding the minimum fault time and, if so, judging that the task node has a network fault; determining, according to the number and times of the network faults of each task node, whether a first task node meeting the task migration condition exists among all task nodes; and, when the first task node exists, migrating all tasks on the first task node to other task nodes. By monitoring and analyzing the network connection state, all tasks on a node are migrated to other task nodes once the node meets the migration condition, which eliminates latent network communication risks for the virtual machines and ensures their normal communication and operation.

Description

Task scheduling method and system of virtualization platform system and related components

Technical Field

The present invention relates to the field of virtualization platform systems, and in particular, to a method and a system for task scheduling of a virtualization platform system, and related components.

Background

Currently, the high availability (HA) policy for virtual machines in the prior art is mainly used to transfer the virtual machines on a physical node to another host when the physical node itself fails, so as to ensure the continuity and high availability of the tasks on the virtual machines. However, this policy does not consider network communication failures. If the network communication of a physical node has a problem, then even though the virtual machines on the node are still running, their communication function is impaired and their tasks cannot run according to the normal communication flow.

Therefore, how to provide a solution to the above technical problems is a problem to be solved by those skilled in the art.

Disclosure of Invention

In view of the above, the present invention provides a method, a system, and related components for task scheduling of a virtualization platform system, which consider the effect of a network communication state and perform task transfer on a virtual machine according to the network communication state. The specific scheme is as follows:

a task scheduling method of a virtualization platform system comprises the following steps:

monitoring the network connection state of each task node; the network connection status comprises a network speed and a corresponding timestamp;

judging whether the network speed of each task node is lower than a preset speed and the duration time exceeds the minimum fault time, and if so, judging that the task node has a network fault;

determining whether a first task node meeting task migration conditions exists in all the task nodes according to the times and time of the network faults of each task node;

when the first task node exists, migrating all tasks on the first task node to other task nodes;

for any task node, the task migration condition includes:

the duration of the network failure at this time exceeds the maximum duration,

or, the network fault meets the network oscillation judgment condition for a plurality of times in history.

Preferably, the network oscillation determination condition includes:

in a preset time period up to the current moment, the number of the network faults exceeds a first preset value.

Preferably, the network oscillation determination condition includes:

in a preset time period till the current moment, the accumulated oscillation frequency of the network faults for a plurality of times exceeds a second preset value historically;

correspondingly, the process of determining the accumulated oscillation frequency includes:

adding 1 to the current accumulated oscillation frequency when, within the preset time period up to the current moment, the interval from the occurrence time of the (i-1)-th network fault to the occurrence time of the i-th network fault is less than the interval from the occurrence time of the (i-2)-th network fault to the occurrence time of the (i-1)-th network fault; i is a positive integer not less than 3 and not more than the current number of network faults.

Preferably, the determining, according to the number of times and the time of the network failure of each task node, whether a first task node meeting a task migration condition exists in all the task nodes further includes:

and in the preset time period until the current moment, if the frequency of the network fault of the current task node is not more than 2 times, judging that the current task node does not meet the network oscillation judgment condition.

Preferably, the network oscillation determination condition includes:

in a preset time period until the current moment, the continuous oscillation frequency of the network fault exceeds a third preset value for a plurality of times in history;

correspondingly, the process of determining the continuous oscillation frequency comprises the following steps:

adding 1 to the current continuous oscillation frequency when, within the preset time period up to the current moment, the interval from the occurrence time of the (i-1)-th network fault to the occurrence time of the i-th network fault is less than the interval from the occurrence time of the (i-2)-th network fault to the occurrence time of the (i-1)-th network fault;

clearing the current continuous oscillation frequency when the interval from the occurrence time of the (i-1)-th network fault to the occurrence time of the i-th network fault is not less than the interval from the occurrence time of the (i-2)-th network fault to the occurrence time of the (i-1)-th network fault;

i is a positive integer not less than 3 and not more than the number of current network failures.

Preferably, the process of migrating all tasks on the first task node to other task nodes when the first task node exists includes:

when the first task node exists, migrating all tasks on the first task node to other task nodes according to a task migration instruction;

the task migration instruction is specifically a direct migration instruction or a migration instruction after shutdown.

Preferably, the task node is a physical server, and the network connection state of the physical server includes all the network connection states of a virtual machine, a virtual network card, and a virtual switch that are arranged on the physical server.

Correspondingly, the present application also discloses a task scheduling system of a virtualization platform system, comprising:

the state monitoring module is used for monitoring the network connection state of each task node; the network connection status comprises a network speed and a corresponding timestamp;

the fault judging module is used for judging whether the network speed of each task node is lower than a preset speed and the duration time exceeds the minimum fault time, and if so, judging that the task node has a network fault;

the node determining module is used for determining whether a first task node meeting task migration conditions exists in all the task nodes according to the number of times and time of the network faults of each task node;

the task migration module is used for migrating all tasks on the first task node to other task nodes when the first task node exists;

for any task node, the task migration condition includes:

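the duration of the network failure at this time exceeds the maximum duration,

or, the network fault meets the network oscillation judgment condition for a plurality of times in history.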

Correspondingly, the application also discloses an electronic device, including:

a memory for storing a computer program;

a processor for implementing the steps of the method for task scheduling of a virtualized platform system as in any of the above when executing said computer program.

Accordingly, the present application also discloses a readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements the steps of the method for task scheduling of a virtualization platform system as described in any one of the above.

The application discloses a task scheduling method of a virtualization platform system, which comprises the following steps: monitoring the network connection state of each task node; the network connection status comprises a network speed and a corresponding timestamp; judging whether the network speed of each task node is lower than a preset speed and the duration time exceeds the minimum fault time, and if so, judging that the task node has a network fault; determining whether a first task node meeting task migration conditions exists in all the task nodes according to the times and time of the network faults of each task node; when the first task node exists, migrating all tasks on the first task node to other task nodes; for any task node, the task migration condition includes: the duration of the network fault at this time exceeds the maximum duration, or the network fault meets the network oscillation judgment condition for a plurality of times in history. By monitoring and analyzing the network connection state, when the task node meets the migration condition, all tasks on the node are migrated to other task nodes, so that the network communication hidden danger of the virtual machine is eliminated, and the normal communication and operation of the virtual machine are ensured.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.

FIG. 1 is a flowchart illustrating steps of a method for scheduling tasks of a virtualization platform system according to an embodiment of the present invention;

FIG. 2 is a block diagram of a task scheduling system of a virtualization platform system according to an embodiment of the present invention;

FIG. 3 is a structural distribution diagram of an electronic device according to an embodiment of the invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The problem of network communication faults is not considered in the existing high-reliability strategy, and if the network communication of a physical node has problems, even if the virtual machine on the node still keeps the running state, on the premise that the communication function of the virtual machine is damaged, the task running cannot be realized according to a normal communication flow.

By monitoring and analyzing the network connection state, when the task node meets the migration condition, all tasks on the node are migrated to other task nodes, so that the network communication hidden danger of the virtual machine is eliminated, and the normal communication and operation of the virtual machine are ensured.

The embodiment of the invention discloses a task scheduling method of a virtualization platform system, which is shown in figure 1 and comprises the following steps:

S1: monitoring the network connection state of each task node; the network connection status includes a network speed and a corresponding timestamp;

It can be understood that, in step S1, the network connection state is monitored by continuously sampling each task node to obtain the network speed and the discrete timestamp of each sampling time.

It can be understood that the task node is specifically a physical server, and the network connection state of the physical server includes the network connection states of all the virtual machines, virtual network cards and virtual switches arranged on the physical server. Assume that there are four physical servers A, B, C and D, where A serves as the management node of the virtualization platform system and B, C and D serve as task nodes. Different task nodes host different numbers of virtual machines, and each virtual machine has a different number of virtual network cards and virtual switches. Depending on the characteristics of the virtual machines, multiple physical servers may correspond to the same virtual switch, and different virtual machines may use shared storage. Further, the connection state of each task node may include the network connection state of the virtual machines, and/or the virtual network cards, and/or the virtual switches, and in the subsequent judgment the network fault determination may be performed separately on the network connection state of each virtual machine, and/or virtual network card, and/or virtual switch on a single task node.

S2: judging whether the network speed of each task node is lower than a preset speed and the duration time exceeds the minimum fault time, if so, judging that the task node has a network fault;

It can be understood that a network fault usually means that the network connection between the task node and the node network is disconnected, in which case the monitored network speed should be 0. In practice, however, normal network communication may already be impossible even though the network speed has not dropped to 0. In this embodiment, the preset speed may therefore be set to a value greater than 0, so that the current network connection state of each task node is reflected more quickly and sensitively.

Because the network connection state may be momentarily unstable or the sampled data may be erroneous, a time criterion is also required to avoid misjudgment: the task node is judged to have a network fault only if the network speed stays below the preset speed for longer than the minimum fault time.
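The judgment in step S2 can be illustrated with a minimal Python sketch. It is not taken from the application: the `Sample` type, the function name `detect_faults`, and the choice to end a fault at the first sample whose speed has recovered are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Sample:
    timestamp: float  # sampling time in seconds (assumed unit)
    speed: float      # measured network speed (assumed unit, e.g. Mbit/s)


def detect_faults(samples: List[Sample],
                  preset_speed: float,
                  min_fault_time: float) -> List[Tuple[float, float]]:
    """Return (start, end) intervals judged to be network faults.

    A run of consecutive samples below preset_speed counts as a fault
    only if it lasts longer than min_fault_time.
    """
    faults: List[Tuple[float, float]] = []
    run_start: Optional[float] = None
    last_low: Optional[float] = None
    for s in sorted(samples, key=lambda x: x.timestamp):
        if s.speed < preset_speed:
            if run_start is None:
                run_start = s.timestamp  # a low-speed run begins here
            last_low = s.timestamp
        else:
            # Speed has recovered: keep the run only if it was long enough.
            if run_start is not None and s.timestamp - run_start > min_fault_time:
                faults.append((run_start, s.timestamp))
            run_start = None
    # A low-speed run that is still open at the end of the data.
    if run_start is not None and last_low - run_start > min_fault_time:
        faults.append((run_start, last_low))
    return faults
```

With one sample per second, a preset speed of 1.0 and a minimum fault time of 3 seconds, a four-second drop is reported as a fault while a one-second dip is ignored.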

S3: determining whether a first task node meeting task migration conditions exists in all task nodes according to the number and time of network faults of each task node;

for any task node, the task migration condition includes:

the duration of the current network fault exceeds the maximum duration,

or, the multiple historical network faults meet the network oscillation judgment condition.

It can be understood that the maximum duration may be set according to the urgency of the task requirement, for example 30 s, 1 min, 2 min, 5 min or 10 min, in decreasing order of urgency. The specific value may be set according to the actual requirement and is not limited herein.

It will be appreciated that the network oscillation decision condition is generally related to the frequency of network failures and may be specifically set according to the task requirements.
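The two-branch task migration condition of step S3 can be sketched as follows. This is an illustrative outline only: the function name, the parameter names and the idea of passing the oscillation test in as a callable are assumptions, not part of the application.

```python
from typing import Callable, Optional, Sequence


def meets_migration_condition(
    ongoing_fault_start: Optional[float],
    now: float,
    max_duration: float,
    fault_times: Sequence[float],
    oscillation_check: Callable[[Sequence[float], float], bool],
) -> bool:
    """Evaluate the task migration condition for one task node.

    ongoing_fault_start: start time of the fault in progress, or None.
    fault_times: occurrence times of the node's historical faults.
    oscillation_check: any network oscillation judgment condition,
        called as oscillation_check(fault_times, now).
    """
    # Branch 1: the current fault has lasted longer than the maximum duration.
    if ongoing_fault_start is not None and now - ongoing_fault_start > max_duration:
        return True
    # Branch 2: the historical faults satisfy the oscillation condition.
    return oscillation_check(fault_times, now)
```

Keeping the oscillation test as a parameter reflects that the application leaves the concrete oscillation condition open to configuration.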

Further, the task migration condition in step S3 may be judged in either of two ways. In the first, the network faults of each single object (virtual machine, virtual network card or virtual switch) on a task node are judged separately, and the task node is the first task node as soon as any object meets the task migration condition. In the second, all network faults on the same task node are integrated on one timeline as the network faults of the task node, and the task migration condition of step S3 is then judged on that timeline. It should be noted that the integration only places the network faults of all objects on one timeline for analysis; network faults whose time spans overlap are not merged. For example, whether the duration of a network fault exceeds the maximum duration is judged for each single network fault, not for a composite fault formed by combining two overlapping network faults. Likewise, whether the multiple historical network faults meet the network oscillation judgment condition is judged from the times of all network faults of all objects after they have been placed on the same timeline.
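A minimal sketch of the timeline integration described above, in which overlapping faults from different objects are placed on one timeline but kept as separate entries. The `Fault` type, the object names and both function names are hypothetical.

```python
from typing import Dict, List, NamedTuple, Tuple


class Fault(NamedTuple):
    start: float
    end: float
    source: str  # object that reported the fault, e.g. "vm-1" (hypothetical name)


def integrate_timeline(per_object: Dict[str, List[Tuple[float, float]]]) -> List[Fault]:
    """Place the faults of every object on one timeline, sorted by start time.

    Faults whose time spans overlap are deliberately NOT merged.
    """
    timeline = [Fault(start, end, name)
                for name, faults in per_object.items()
                for start, end in faults]
    return sorted(timeline, key=lambda f: f.start)


def any_fault_exceeds(timeline: List[Fault], max_duration: float) -> bool:
    """Check the maximum-duration condition on each single fault."""
    return any(f.end - f.start > max_duration for f in timeline)
```

For example, a virtual machine fault from 0 s to 20 s and a virtual network card fault from 15 s to 40 s stay two separate entries of 20 s and 25 s, so neither exceeds a 30 s maximum duration even though together they span 40 s.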

S4: when the first task node exists, all tasks on the first task node are migrated to other task nodes.

It can be understood that, in step S4, the tasks migrated from the first task node are specifically the task data or service data corresponding to the virtual machines, and/or the virtual network cards, and/or the virtual switches. The tasks are transferred to task nodes other than the first task node so that they are not affected by the communication problem and the data of normal services remains continuous.

Further, the process in step S4 of migrating all tasks on the first task node to other task nodes when the first task node exists includes:

when a first task node exists, migrating all tasks on the first task node to other task nodes according to a task migration instruction;

It can be understood that the task migration instruction is either a live migration instruction or a cold migration instruction. The live migration instruction is the direct migration instruction: when it is executed, step S4 directly migrates all tasks on the currently running first task node to other task nodes; live migration is supported for virtual machines on the first task node that use shared storage. The cold migration instruction is the migration-after-shutdown instruction: when it is executed, step S4 first automatically and safely shuts down the virtual machines of the first task node and then cold-migrates all tasks to other task nodes; a virtual machine that can be shut down automatically and safely generally refers to one with VMtools installed.
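One possible way to choose between the two instructions for a given virtual machine is sketched below. The application does not state a preference order, so trying live migration first is an assumption, as are the `VirtualMachine` fields and the function name.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class MigrationMode(Enum):
    LIVE = "live"  # direct migration of a running virtual machine
    COLD = "cold"  # migration after a safe automatic shutdown


@dataclass
class VirtualMachine:
    name: str
    uses_shared_storage: bool  # prerequisite for live migration in the text
    has_vmtools: bool          # allows automatic safe shutdown in the text


def choose_migration_mode(vm: VirtualMachine) -> Optional[MigrationMode]:
    """Pick a migration instruction, or None if neither applies."""
    if vm.uses_shared_storage:
        return MigrationMode.LIVE  # assumed preference: avoid downtime if possible
    if vm.has_vmtools:
        return MigrationMode.COLD
    return None  # would need manual handling; not covered by the text
```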

The application discloses a task scheduling method of a virtualization platform system, which comprises the following steps: monitoring the network connection state of each task node; the network connection status includes a network speed and a corresponding timestamp; judging whether the network speed of each task node is lower than a preset speed and the duration time exceeds the minimum fault time, if so, judging that the task node has a network fault; determining whether a first task node meeting task migration conditions exists in all the task nodes according to the times and time of the network faults of each task node; when the first task node exists, migrating all tasks on the first task node to other task nodes; for any task node, the task migration condition includes: the duration of the network fault at this time exceeds the maximum duration, or the network fault meets the network oscillation judgment condition for a plurality of times in history. By monitoring and analyzing the network connection state, when the task node meets the migration condition, all tasks on the node are migrated to other task nodes, so that the network communication hidden danger of the virtual machine is eliminated, and the normal communication and operation of the virtual machine are ensured.

The embodiment of the invention discloses a specific task scheduling method for a virtualization platform system, and compared with the previous embodiment, the embodiment further explains and optimizes the technical scheme. Specifically, the method comprises the following steps:

Regarding whether the network faults meet the task migration condition in step S3: when the network oscillation judgment condition is evaluated for the network faults of a task node, a sliding time window is generally used as the data window. The end boundary of the window is the current moment and its length is the preset time period, which can be set as required, for example 2 hours, 30 minutes or 15 minutes. As the current moment advances, the window moves with it, and the network faults that fall inside the window, that is, the network faults within the preset time period up to the current moment, serve as the basis for judgment. Several alternative judgment conditions are available, and any one of them may be selected.

The first determination condition is that, within the preset time period up to the current moment, the number of network faults exceeds a first preset value.
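The first determination condition amounts to counting the faults inside a time window that ends at the current moment. A minimal sketch, with hypothetical function and parameter names and with both window boundaries assumed to be inclusive:

```python
from typing import List, Sequence


def faults_in_window(fault_times: Sequence[float], now: float, window: float) -> List[float]:
    """Fault occurrence times inside the window [now - window, now]."""
    return [t for t in fault_times if now - window <= t <= now]


def exceeds_fault_count(fault_times: Sequence[float], now: float,
                        window: float, first_preset: int) -> bool:
    """True when the number of faults in the window exceeds first_preset."""
    return len(faults_in_window(fault_times, now, window)) > first_preset
```

With faults at 0, 100, 500, 800, 850 and 890 seconds, a current time of 900 seconds and a 600-second window, four faults fall inside the window, so the condition holds for a first preset value of 3 but not for 4.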

The second determination condition is that, within the preset time period up to the current moment, the accumulated oscillation frequency of the multiple historical network faults exceeds a second preset value. Correspondingly, the process of determining the accumulated oscillation frequency includes: adding 1 to the current accumulated oscillation frequency when, within the preset time period up to the current moment, the interval from the occurrence time of the (i-1)-th network fault to the occurrence time of the i-th network fault is less than the interval from the occurrence time of the (i-2)-th network fault to the occurrence time of the (i-1)-th network fault; i is a positive integer not less than 3 and not greater than the current number of network faults.

It can be understood that, under this determination condition, the current accumulated oscillation frequency is increased by one each time the interval between successive network faults shortens.

Further, the process of determining whether a first task node meeting task migration conditions exists in all the task nodes according to the number of times and time of the network failure of each task node, further includes:

within the preset time period up to the current moment, if the number of network faults of the current task node does not exceed 2, judging that the current task node does not meet the network oscillation judgment condition.

It can be understood that if no more than 2 network faults fall within the time window, there are too few faults to judge network oscillation and the current accumulated oscillation frequency is 0, so it can be directly determined that the current task node does not meet the network oscillation judgment condition.
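The accumulated oscillation frequency, including the shortcut for two or fewer faults, can be sketched as follows. The function name and the inclusive window boundaries are assumptions; the comparison follows the rule stated above, with zero-based list indices replacing the one-based index i.

```python
from typing import Sequence


def accumulated_oscillation_count(fault_times: Sequence[float],
                                  now: float, window: float) -> int:
    """Count how often the interval between successive faults shortens."""
    recent = sorted(t for t in fault_times if now - window <= t <= now)
    if len(recent) <= 2:
        return 0  # too few faults to compare two intervals
    count = 0
    for i in range(2, len(recent)):
        gap = recent[i] - recent[i - 1]           # (i-1)-th to i-th fault
        prev_gap = recent[i - 1] - recent[i - 2]  # (i-2)-th to (i-1)-th fault
        if gap < prev_gap:
            count += 1  # interval shortened; otherwise leave the count unchanged
    return count
```

For faults at 0, 100, 150, 300, 320 and 330 seconds the intervals are 100, 50, 150, 20 and 10 seconds. The interval shortens three times (50 < 100, 20 < 150, 10 < 20), so the accumulated oscillation frequency is 3.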

The third determination condition is that, within the preset time period up to the current moment, the continuous oscillation frequency of the multiple historical network faults exceeds a third preset value. Correspondingly, the process of determining the continuous oscillation frequency includes: adding 1 to the current continuous oscillation frequency when, within the preset time period up to the current moment, the interval from the occurrence time of the (i-1)-th network fault to the occurrence time of the i-th network fault is less than the interval from the occurrence time of the (i-2)-th network fault to the occurrence time of the (i-1)-th network fault; and clearing the current continuous oscillation frequency when the former interval is not less than the latter; where i is a positive integer not less than 3 and not greater than the current number of network faults.

It can be understood that the second and third determination conditions both add 1 whenever the interval between successive network faults shortens. Under the second determination condition, if an interval is not shorter than the preceding one, the accumulated oscillation frequency is simply left unchanged and counting resumes at the next interval that shortens. Under the third determination condition, as soon as an interval fails to shorten, the continuous oscillation frequency is cleared and counting starts again, so the third determination condition mainly reflects the most recent run of continuous oscillation.
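The continuous oscillation frequency differs from the accumulated one only in the reset. A minimal sketch, with a hypothetical function name and inclusive window boundaries assumed:

```python
from typing import Sequence


def continuous_oscillation_count(fault_times: Sequence[float],
                                 now: float, window: float) -> int:
    """Length of the latest run of successively shortening fault intervals."""
    recent = sorted(t for t in fault_times if now - window <= t <= now)
    count = 0
    for i in range(2, len(recent)):
        gap = recent[i] - recent[i - 1]
        prev_gap = recent[i - 1] - recent[i - 2]
        if gap < prev_gap:
            count += 1  # interval shortened again
        else:
            count = 0   # interval did not shorten: clear and start over
    return count
```

For faults at 0, 100, 150, 300, 320 and 330 seconds the intervals are 100, 50, 150, 20 and 10 seconds. The count goes 1, then is cleared by the 150-second interval, then goes 1 and 2, so the result is 2.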

In addition, the first determination condition may be combined with the second or the third determination condition: the network oscillation judgment condition is then satisfied when both the first and second determination conditions are satisfied, or when both the first and third determination conditions are satisfied.

It is understood that the first preset value, the second preset value, and the third preset value used for the determination herein can be set according to the length of the preset time period, the reliability requirement of the platform system, and the like, and are not limited herein.

Correspondingly, the present application also discloses a task scheduling system of a virtualization platform system, as shown in fig. 2, including:

the state monitoring module 1 is used for monitoring the network connection state of each task node; the network connection status includes a network speed and a corresponding timestamp;

the fault determination module 2 is used for determining whether the network speed of each task node is lower than a preset speed and the duration time exceeds the minimum fault time, and if so, determining that the network fault occurs in the task node;

the node determining module 3 is configured to determine whether a first task node meeting a task migration condition exists in all the task nodes according to the number of times and time that each task node has the network fault;

the task migration module 4 is configured to, when the first task node exists, migrate all tasks on the first task node to other task nodes;

for any task node, the task migration condition includes:

the duration of the network failure at this time exceeds the maximum duration,

or, the network faults meet the network oscillation judgment condition for a plurality of times in history.

In some specific embodiments, the network oscillation determining condition includes:

In some specific embodiments, the determining, according to the number of times and the time of the network failure of each task node, whether a first task node that meets a task migration condition exists in all the task nodes further includes:

In some specific embodiments, the migrating all tasks on the first task node to other task nodes when the first task node exists includes:

In some specific embodiments, the task node is specifically a physical server, and the network connection state of the physical server includes all the network connection states of a virtual machine, a virtual network card, and a virtual switch that are disposed on the physical server.

Accordingly, the present application also discloses an electronic device, as shown in fig. 3, including a processor 11 and a memory 12; wherein the processor 11 implements the following steps when executing the computer program stored in the memory 12:

for any task node, the task migration condition includes:

the duration of the network failure at this time exceeds the maximum duration,

and in a preset time period till the current moment, the frequency of the network fault exceeds a first preset value.

In some specific embodiments, when the processor 11 executes the computer subprogram stored in the memory 12, the following steps may be specifically implemented:

Further, the electronic device in this embodiment may further include:

the input interface 13 is configured to obtain a computer program imported from the outside, store the obtained computer program in the memory 12, and also be configured to obtain various instructions and parameters transmitted by an external terminal device, and transmit the instructions and parameters to the processor 11, so that the processor 11 performs corresponding processing by using the instructions and parameters. In this embodiment, the input interface 13 may specifically include, but is not limited to, a USB interface, a serial interface, a voice input interface, a fingerprint input interface, a hard disk reading interface, and the like.

And an output interface 14, configured to output various data generated by the processor 11 to a terminal device connected thereto, so that other terminal devices connected to the output interface 14 can acquire various data generated by the processor 11. In this embodiment, the output interface 14 may specifically include, but is not limited to, a USB interface, a serial interface, and the like.

A communication unit 15 for establishing a telecommunication connection between the electronic device and an external server so that the electronic device can mount the image file to the external server. In this embodiment, the communication unit 15 may specifically include, but is not limited to, a remote communication unit based on a wireless communication technology or a wired communication technology.

And the keyboard 16 is used for acquiring various parameter data or instructions entered by a user through keystrokes in real time.

And the display 17 is used for displaying relevant information of the task scheduling process in real time so that a user can know the current task scheduling situation in time.

The mouse 18 may be used to assist the user in entering data and to simplify the user's operation.

Further, embodiments of the present application also disclose a computer-readable storage medium, where the computer-readable storage medium includes Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, removable hard disk, CD-ROM, or any other form of storage medium known in the art. The computer-readable storage medium stores a computer program which, when executed by a processor, performs the following steps:

determining whether a first task node meeting task migration conditions exists in all the task nodes or not according to the times and time of the network faults of each task node;

for any task node, the task migration condition includes:

the duration of the network failure at this time exceeds the maximum duration,

in a preset time period until the current moment, the accumulated oscillation frequency of the network fault exceeds a second preset value for a plurality of times;

Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

The task scheduling method, system and related components of the virtualization platform system provided by the present invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the present invention, and the description of the above embodiments is only intended to help in understanding the method and core idea of the present invention. Meanwhile, for a person skilled in the art, the specific embodiments and the application scope may vary according to the idea of the present invention. In summary, the content of this specification should not be construed as limiting the present invention.

Claims

1. A task scheduling method of a virtualization platform system is characterized by comprising the following steps:

for any task node, the task migration condition includes:

the duration of the network failure at this time exceeds the maximum duration,

2. The task scheduling method according to claim 1, wherein the network oscillation determining condition comprises:

3. The task scheduling method according to claim 1, wherein the network oscillation determining condition comprises:

4. The task scheduling method according to claim 3, wherein the process of determining whether a first task node satisfying a task migration condition exists in all the task nodes according to the number of times and the time of the network failure occurring in each of the task nodes further comprises:

5. The task scheduling method according to claim 1, wherein the network oscillation determining condition comprises:

6. The task scheduling method according to any one of claims 1 to 5, wherein the process of migrating all tasks on the first task node to other task nodes when the first task node exists comprises:

7. The task scheduling method according to claim 6, wherein the task node is a physical server, and the network connection state of the physical server includes all the network connection states of a virtual machine, a virtual network card, and a virtual switch that are provided on the physical server.

8. A task scheduling system for a virtualized platform system, comprising:

for any task node, the task migration condition includes:

the duration of the network failure at this time exceeds the maximum duration,

9. An electronic device, comprising:

a memory for storing a computer program;

a processor for implementing the steps of the method for task scheduling of a virtualized platform system according to any of the claims 1 to 7 when executing said computer program.

10. A readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method for task scheduling of a virtualization platform system according to any one of claims 1 to 7.