CN106354531B

CN106354531B - Physical node upgrading method and device

Info

Publication number: CN106354531B
Application number: CN201610734647.6A
Authority: CN
Inventors: 郭家发; 李健
Original assignee: Huawei Technologies Co Ltd
Current assignee: Shenzhen Huawei Cloud Computing Technology Co ltd
Priority date: 2016-08-25
Filing date: 2016-08-25
Publication date: 2020-03-27
Anticipated expiration: 2036-08-25
Also published as: CN106354531A

Abstract

The invention discloses a method and a device for upgrading physical nodes. The method comprises: determining an ith batch of physical nodes from the physical nodes that have not been upgraded; upgrading the ith batch of physical nodes; after the upgrade is finished, detecting whether any physical nodes that have not been upgraded remain; and, if so, setting i = i + 1 and again determining the ith batch of physical nodes from the physical nodes that have not been upgraded. This solves the prior-art problem that, when all physical hosts are restarted and upgraded at the same time, the whole distributed system cannot provide services externally, so that a long upgrade means a long service interruption. Because the physical nodes are upgraded in batches, the physical nodes outside the current batch can still provide services externally while each batch is upgraded, and the services are not interrupted.

Description

Physical node upgrading method and device

Technical Field

The embodiment of the invention relates to the field of computers, in particular to a physical node upgrading method and device.

Background

The OpenStack distributed computing system is an open-source cloud computing management platform project. An OpenStack distributed computing system includes a plurality of physical hosts, each having a number of service instances running thereon. Two service instances of the same kind may be deployed in a master-standby manner on different physical hosts, or a plurality of service instances may be deployed in a load balancing manner.

When the operating system needs to be upgraded because of a version update, security vulnerabilities, driver updates and the like, the physical host has to be restarted to complete the upgrade. While the physical host is being restarted and upgraded, the virtual machines on it cannot provide services externally. A common upgrade method in the prior art is to restart and upgrade all physical hosts simultaneously.

Obviously, when all the physical hosts are restarted and upgraded at the same time, the whole distributed computing system cannot provide services externally. Moreover, because the service interruption lasts as long as the upgrade, a long upgrade results in a long service interruption.

Disclosure of Invention

In order to solve the prior-art problem that services are interrupted for the whole duration of the upgrade when all physical hosts are restarted and upgraded at the same time, embodiments of the present invention provide a physical node upgrading method and apparatus. The technical solutions are as follows:

The physical node upgrading method and device provided by the embodiments of the invention are used in a distributed computing system. The distributed computing system comprises a management terminal and physical nodes, wherein the management terminal is used for managing the operation of each physical node, each physical node is used for running at least one service instance, and each service instance is used for providing one or more services.

In a first aspect, a method for upgrading a physical node is provided, where the method for upgrading a physical node includes:

determining an ith batch of physical nodes from the un-upgraded physical nodes, wherein the service instances running in the physical nodes other than the ith batch of physical nodes comprise service instances of all kinds, and i is an integer;

upgrading the ith batch of physical nodes;

after the upgrade is finished, detecting whether the physical nodes which are not upgraded still exist;

and if physical nodes that have not been upgraded still exist, setting i = i + 1 and re-executing the step of determining the ith batch of physical nodes from the physical nodes that have not been upgraded.

According to the solution shown in the embodiment of the invention, the ith batch of physical nodes is determined from the physical nodes that have not been upgraded; the ith batch of physical nodes is upgraded; after the upgrade is finished, it is detected whether any physical nodes that have not been upgraded remain; and, if so, i is set to i + 1 and the ith batch of physical nodes is again determined from the physical nodes that have not been upgraded. This solves the prior-art problem that, when all physical hosts are restarted and upgraded at the same time, the whole distributed system cannot provide services externally, so that a long upgrade means a long service interruption. Because the physical nodes are upgraded in batches, the physical nodes outside the current batch can still provide services externally while each batch is upgraded, and the services are not interrupted.

In a first possible implementation manner of the first aspect, determining the ith batch of physical nodes from the un-upgraded physical nodes includes: determining a first physical node from the physical nodes that have not been upgraded; adding the first physical node to the ith batch of physical nodes; detecting, one by one in a predetermined order, whether each of the remaining un-upgraded physical nodes other than the first physical node meets an anti-affinity condition, wherein the anti-affinity condition comprises: the set formed by the kth kind of service instances running on the current physical node together with the kth kind of service instances running in the ith batch of physical nodes is not the complete set of the kth kind of service instances, k being an integer; and, if the anti-affinity condition is met, adding that remaining physical node to the ith batch of physical nodes.

According to the solution disclosed by the embodiment of the invention, all physical nodes are upgraded in batches by dynamically detecting whether each physical node meets the anti-affinity condition, so that the external service is not interrupted while the physical nodes are upgraded: when each batch of physical nodes is upgraded, the physical nodes outside the current batch can still provide services externally.

With reference to the first aspect or the first possible implementation manner of the first aspect, in a second possible implementation manner, the determining a first physical node from the physical nodes that are not upgraded includes: calculating the number of instances of the standby service instance running in each physical node which is not upgraded; and determining the physical node with the maximum number of the instances of the standby service instance as the first physical node.

With reference to the first aspect, the first possible implementation manner of the first aspect, or the second possible implementation manner of the first aspect, in a third possible implementation manner, the method further includes: and performing active-standby switching on the main service instance running in the ith batch of physical nodes and the standby service instances running in the other physical nodes.

In the solution shown in the embodiment of the present invention, the physical node running the most standby service instances is determined as the first physical node, so that the number of service instances that need active-standby switching is minimized. This reduces the number of service interruptions caused by the active-standby switching process and thereby keeps the service interruption time as short as possible.

With reference to the first aspect, the first possible implementation manner of the first aspect, the second possible implementation manner of the first aspect, or the third possible implementation manner of the first aspect, in a fourth possible implementation manner, the method further includes: for each service instance, detecting whether the service instance in an available state exists in other physical nodes except the ith batch of physical nodes; and if each service instance has a service instance in an available state, executing the step of upgrading the ith batch of physical nodes.

According to the solution of the embodiment of the invention, the availability of each kind of service instance is checked before the ith batch of physical nodes is upgraded. This avoids the situation in which a kind of service becomes unavailable during the upgrade because one of its service instances is in a fault state, and thus prevents service interruption while the physical nodes are upgraded.

In a second aspect, an embodiment of the present invention provides a physical node upgrade apparatus, where the physical node upgrade apparatus includes at least one unit, and the at least one unit is configured to implement the physical node upgrade method provided in the first aspect or any one of the possible designs of the first aspect.

In a third aspect, an embodiment of the present invention provides a terminal, where the terminal includes a processor and a memory; the memory is configured to store one or more instructions, and the instructions are configured to be executed by the processor to implement the physical node upgrade method provided in the first aspect or any one of the possible designs of the first aspect.

In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, where an executable program for implementing the physical node upgrade method provided in the first aspect or any one of the possible designs of the first aspect is stored in the computer-readable storage medium.

The technical effects obtained by the second aspect to the fourth aspect of the embodiments of the present invention are similar to the technical effects obtained by the corresponding technical means in the first aspect, and are not described herein again.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.

Drawings

FIG. 1 illustrates an architecture diagram of a distributed computing system provided by one embodiment of the present invention;

FIG. 2 illustrates an architecture diagram of a distributed computing system provided by another embodiment of the present invention;

FIG. 3 is a flowchart of a physical node upgrade method provided in one embodiment of the invention;

FIG. 4 is a flowchart of a physical node upgrade method provided in another embodiment of the invention;

FIG. 5 is a schematic diagram of a physical node upgrade method provided in another embodiment of the invention;

FIGS. 6A and 6B are schematic diagrams of a physical node upgrade method provided in another embodiment of the invention;

FIGS. 7A and 7B are schematic diagrams of a physical node upgrade method provided in a further embodiment of the invention;

FIG. 8 is a block diagram illustrating a physical node upgrade apparatus according to an embodiment of the present invention;

FIG. 9 is a block diagram illustrating a physical node upgrade apparatus according to another embodiment of the present invention;

FIG. 10 is a schematic structural diagram of a management terminal or a physical node according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail with reference to the accompanying drawings.

Fig. 1 shows a schematic structural diagram of a distributed computing system according to an embodiment of the present invention. The distributed computing system includes: a management terminal 120 and a physical node 140.

The management terminal 120 is a device such as a laptop computer, a desktop computer, a cell phone, a tablet computer, and the like. The management terminal 120 is used for managing the operation of each physical node 140. Illustratively, when the physical nodes 140 need an operating system upgrade, the management terminal 120 controls each physical node 140 to perform the upgrade.

The management terminal 120 is connected to the physical node 140 through the network 160. The network 160 may be a wired network or a fiber optic network.

The physical node 140 is a server that exists as a hardware entity, for example a single-board server. In a distributed computing system, there are typically multiple physical nodes 140. In FIG. 1, only three physical nodes 140 are shown for illustration.

At least one service instance 142 runs on each physical node 140, each service instance 142 being used to provide one or more services. Optionally, the service instance 142 is any one of a thread, a process, a container, and a virtual machine (VMware).

Illustratively, when the distributed computing system is an OpenStack distributed computing system, the physical node 140 is a single-board server and the service instance 142 is a virtual machine.

Generally speaking, service instances 142 of the same type are deployed on two physical nodes 140 in a primary-standby manner, for example: the primary service instance 142 is deployed on one physical node 140 and the standby service instance 142 is deployed on another physical node 140. And/or, service instances 142 of the same type are deployed on different physical nodes 140 in a load-balanced manner, for example 3 service instances 142 of the same type are deployed on three different physical nodes 140, respectively.

Fig. 2 is a schematic structural diagram of a distributed computing system according to another embodiment of the present invention. The distributed computing system includes: a plurality of physical nodes connected by a network 160. Taking 5 physical nodes as an example, the physical nodes include: physical node 1, physical node 2, physical node 3, physical node 4, and physical node 5.

Each physical node is a server that exists as a hardware entity, for example a single-board server.

At least one service instance is run on each physical node, each service instance being for providing one or more services. Optionally, the service instance is any one of a thread, a process, a container, and a virtual machine (VMware). In general:

service instances of the same type are deployed on two physical nodes in a primary-standby manner. In FIG. 2, A denotes a primary service instance and S denotes a standby service instance: service instance 1-A is the primary service instance of the first type of service instance, and service instance 1-S is the standby service instance of the first type; service instance 3-A is the primary service instance of the third type, and service instance 3-S is the standby service instance of the third type; service instance 4-A is the primary service instance of the fourth type, and service instance 4-S is the standby service instance of the fourth type; service instance 6-A is the primary service instance of the sixth type, and service instance 6-S is the standby service instance of the sixth type.

And/or, service instances of the same type are deployed on different physical nodes in a load-balanced manner. In FIG. 2, the second type of service instance includes 5 service instances: service instance 2-1, service instance 2-2, service instance 2-3, service instance 2-4 and service instance 2-5, with one service instance of the second type deployed on each physical node. The fifth type of service instance includes 4 service instances: service instance 5-1, service instance 5-2, service instance 5-3 and service instance 5-4, where service instance 5-1 is deployed on physical node 2, service instance 5-2 on physical node 3, service instance 5-3 on physical node 4, and service instance 5-4 on physical node 5.

Unlike FIG. 1, the distributed computing system of FIG. 2 does not require a separate management terminal. Illustratively, when the physical nodes need an operating system upgrade, service instance 1-A, which is responsible for the upgrade service, controls each physical node to perform the upgrade.

The management terminal and the service instance responsible for the upgrade service are collectively referred to herein as the management end. That is, the management end may be a physical entity or a logical entity.

Referring to fig. 3, a flowchart of a method for upgrading a physical node according to an embodiment of the present invention is shown. The embodiment is exemplified by applying the physical node upgrading method to a management side. The physical node upgrading method comprises the following steps:

Step 301, determining the ith batch of physical nodes from the physical nodes which are not upgraded.

Optionally, all kinds of service instances are included in service instances running in other physical nodes except the ith batch of physical nodes, i is an integer and the initial value of i is 0 or 1.

Assume that the distributed computing system comprises N physical nodes and that the ith batch contains k physical nodes; the other physical nodes, that is, those outside the ith batch, then number N-k. These other physical nodes may be physical nodes that have not been upgraded or physical nodes that have already been upgraded.

In either case, the other physical nodes together contain service instances of all kinds. That is, if M kinds of service instances run in the distributed computing system, all M kinds run in the other physical nodes, and for each kind there is at least one service instance in an available state.

Step 302, upgrading the ith batch of physical nodes.

Step 303, after the upgrade is completed, detecting whether any physical nodes that have not been upgraded remain.

Step 304, if physical nodes that have not been upgraded remain, setting i = i + 1 and re-executing the step of determining the ith batch of physical nodes from the physical nodes that have not been upgraded.

And if the physical node which is not upgraded does not exist, determining that the upgrade is finished.
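The loop formed by steps 301 to 304 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `upgrade_in_batches` and the two callbacks `pick_batch` and `upgrade` are hypothetical placeholders for batch selection and for the actual restart-and-upgrade action.

```python
from typing import Callable, Iterable, List, Set, Tuple


def upgrade_in_batches(
    nodes: Iterable[int],
    pick_batch: Callable[[Set[int]], List[int]],  # hypothetical: step 301
    upgrade: Callable[[List[int]], None],         # hypothetical: step 302
    first_index: int = 1,
) -> List[Tuple[int, List[int]]]:
    """Upgrade all nodes batch by batch and return [(i, batch), ...]."""
    pending: Set[int] = set(nodes)
    history: List[Tuple[int, List[int]]] = []
    i = first_index
    while pending:  # step 303: are there still un-upgraded nodes?
        batch = pick_batch(set(pending))  # step 301: choose the ith batch
        if not batch or not set(batch) <= pending:
            # Guard against a selector that would loop forever or pick
            # nodes that are not pending (an assumption of this sketch).
            raise RuntimeError("pick_batch must return pending nodes")
        upgrade(batch)  # step 302: upgrade the ith batch
        pending -= set(batch)
        history.append((i, list(batch)))
        i += 1  # step 304: next batch index, used only if nodes remain
    return history


def one_at_a_time(pending: Set[int]) -> List[int]:
    """Trivial selector used only for this demonstration."""
    return [min(pending)]


upgraded_log: List[List[int]] = []
history = upgrade_in_batches([1, 2, 3], one_at_a_time, upgraded_log.append)
print(history)  # [(1, [1]), (2, [2]), (3, [3])]
```

The selector in the demonstration ignores service instances entirely; in the method described here, batch selection must additionally guarantee that the nodes outside the batch still run every kind of service instance.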

In summary, in the physical node upgrade method provided in this embodiment, the ith batch of physical nodes is determined from the physical nodes that have not been upgraded; the ith batch of physical nodes is upgraded; after the upgrade is finished, it is detected whether any physical nodes that have not been upgraded remain; and, if so, i is set to i + 1 and the ith batch of physical nodes is again determined from the physical nodes that have not been upgraded. This solves the prior-art problem that, when all physical hosts are restarted and upgraded at the same time, the whole distributed system cannot provide services externally, so that a long upgrade means a long service interruption. Because the physical nodes are upgraded in batches, the physical nodes outside the current batch can still provide services externally while each batch is upgraded, and the services are not interrupted.

Referring to fig. 4, a flowchart of a method for upgrading a physical node according to another embodiment of the present invention is shown. The embodiment is exemplified by applying the physical node upgrading method to a management side. The physical node upgrading method comprises the following steps:

Step 401, the management side calculates the number of instances of the standby service instance running in each physical node that is not upgraded.

The distributed computing system comprises a plurality of physical nodes, each physical node runs at least one service instance, and a plurality of service instances of the same type are deployed in different physical nodes according to a main-standby relationship or a load balancing relationship.

During the operating system upgrade, the physical nodes are upgraded batch by batch, one batch after another. To determine the ith batch of physical nodes to be upgraded, the first physical node of the batch is determined first. Optionally, the initial value of i is 1.

Taking the upgrade of the physical nodes in FIG. 2 as an example, the management side calculates the number of standby service instances running on each of the 5 physical nodes. The result is: 1 standby service instance runs on physical node 1, 1 on physical node 2, 2 on physical node 3, and 0 on each of physical node 4 and physical node 5.

Therefore, the management side calculates that the number of the instances of the standby service instance running on the physical node 3 is the largest.

Step 402, the management end determines the physical node with the maximum number of instances of the standby service instance as a first physical node.

Illustratively, the management side determines the physical node 3 as the first physical node.
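Steps 401 and 402 amount to a count followed by an argmax, which the following sketch illustrates. The dictionary `FIG2_LAYOUT` is reconstructed from the textual description of FIG. 2 and is therefore an assumption, as are the names `standby_counts` and `node_with_most_standby`. The text does not say how ties are broken; the sketch arbitrarily prefers the lowest node number.

```python
from typing import Dict, Iterable, List

# Node number -> service instances, as described for FIG. 2
# ("k-A" = primary, "k-S" = standby, "k-n" = load-balanced member).
FIG2_LAYOUT: Dict[int, List[str]] = {
    1: ["1-A", "2-1", "3-A", "4-S"],
    2: ["1-S", "2-2", "4-A", "5-1"],
    3: ["2-3", "3-S", "5-2", "6-S"],
    4: ["2-4", "5-3", "6-A"],
    5: ["2-5", "5-4"],
}


def standby_counts(layout: Dict[int, List[str]],
                   pending: Iterable[int]) -> Dict[int, int]:
    """Step 401: number of standby instances on each un-upgraded node."""
    return {
        node: sum(1 for inst in layout[node] if inst.endswith("-S"))
        for node in pending
    }


def node_with_most_standby(layout: Dict[int, List[str]],
                           pending: Iterable[int]) -> int:
    """Step 402: node with the most standby instances.

    Ties go to the lowest node number (an assumption of this sketch).
    """
    counts = standby_counts(layout, pending)
    return max(sorted(counts), key=lambda node: counts[node])


counts = standby_counts(FIG2_LAYOUT, FIG2_LAYOUT.keys())
print(counts)  # {1: 1, 2: 1, 3: 2, 4: 0, 5: 0}
print(node_with_most_standby(FIG2_LAYOUT, FIG2_LAYOUT.keys()))  # 3
```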

Step 403, the management end adds the first physical node to the ith batch of physical nodes.

Illustratively, the management side adds physical node 3 to the first batch of physical nodes. The first batch of physical nodes is the first group of physical nodes to be upgraded.

Step 404, the management end detects, one by one in a predetermined order, whether each of the remaining un-upgraded physical nodes not yet in the ith batch of physical nodes meets the anti-affinity condition.

Optionally, the anti-affinity condition comprises: the set formed by the kth kind of service instances running on the current physical node together with the kth kind of service instances running in the ith batch of physical nodes is not the complete set of the kth kind of service instances, k being an integer.

In other words, the anti-affinity condition means: the primary service instance and the standby service instance of the same kind of service cannot both be upgraded in the same batch of physical nodes, and the service instances of the same kind that are in a load balancing relationship cannot all be upgraded in the same batch of physical nodes.
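The anti-affinity condition can be expressed as a set comparison, sketched below under stated assumptions: the layout is reconstructed from the description of FIG. 2, instance names are assumed to be unique and of the form "kind-suffix", and the names `kind_of` and `meets_anti_affinity` are hypothetical.

```python
from typing import Dict, Iterable, List, Set

# Node number -> service instances, reconstructed from the FIG. 2 text.
FIG2_LAYOUT: Dict[int, List[str]] = {
    1: ["1-A", "2-1", "3-A", "4-S"],
    2: ["1-S", "2-2", "4-A", "5-1"],
    3: ["2-3", "3-S", "5-2", "6-S"],
    4: ["2-4", "5-3", "6-A"],
    5: ["2-5", "5-4"],
}


def kind_of(instance: str) -> str:
    """'6-A' -> '6': the kind is the part before the dash."""
    return instance.split("-")[0]


def meets_anti_affinity(candidate: int, batch: Iterable[int],
                        layout: Dict[int, List[str]]) -> bool:
    """True if adding `candidate` to `batch` leaves, for every kind that
    runs on the candidate, at least one instance outside the batch."""
    # Complete set of instances of each kind across the whole system.
    complete: Dict[str, Set[str]] = {}
    for instances in layout.values():
        for inst in instances:
            complete.setdefault(kind_of(inst), set()).add(inst)

    # Instances that would be taken down together.
    together = set(layout[candidate])
    for node in batch:
        together |= set(layout[node])

    for kind in {kind_of(inst) for inst in layout[candidate]}:
        same_kind = {inst for inst in together if kind_of(inst) == kind}
        if same_kind == complete[kind]:
            return False  # the whole kind would be offline at once
    return True


print(meets_anti_affinity(5, [3], FIG2_LAYOUT))     # True
print(meets_anti_affinity(4, [3, 5], FIG2_LAYOUT))  # False (6-A with 6-S)
```

Only the kinds that run on the candidate are examined, which mirrors the wording of the condition; kinds already in the batch were checked when their nodes were added.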

Optionally, the management end checks the un-upgraded physical nodes against the anti-affinity condition one by one in ascending order of physical node number; or, the management end checks them one by one in descending order of physical node number.

Taking fig. 2 as an example, the management side sequentially detects whether the physical node 5, the physical node 4, the physical node 2, and the physical node 1 meet the condition of the anti-affinity.

If the anti-affinity condition is met, go to step 405; if the anti-affinity condition is not met, step 406 is entered.

Step 405, if the anti-affinity condition is met, the management end adds that remaining physical node to the ith batch of physical nodes.

Step 406, if the anti-affinity condition is not met, the management end keeps that remaining physical node among the physical nodes that have not been upgraded.

Taking the distributed computing system shown in FIG. 2 as an example, with reference to FIG. 5:

When the management side checks physical node 5 against the anti-affinity condition, the first batch of physical nodes 01 comprises physical node 3. Service instance 2-5 in physical node 5 and service instance 2-3 in physical node 3 are only part of the second class of service instances, and service instance 5-4 in physical node 5 and service instance 5-2 in physical node 3 are only part of the fifth class of service instances. Physical node 5 therefore meets the anti-affinity condition with respect to physical node 3, and the management side adds physical node 5 to the first batch of physical nodes 01.

When the management end checks physical node 4 against the anti-affinity condition, the first batch of physical nodes 01 comprises physical node 3 and physical node 5. Service instance 6-A in physical node 4 and service instance 6-S in physical node 3 together make up all service instances of the sixth class, so physical node 4 conflicts with physical node 3 and does not meet the anti-affinity condition. The management end therefore keeps physical node 4 as an un-upgraded physical node.

When the management side checks physical node 2 against the anti-affinity condition, the first batch of physical nodes 01 comprises physical node 3 and physical node 5. Service instance 2-2 in physical node 2, service instance 2-3 in physical node 3 and service instance 2-5 in physical node 5 are only part of the second class of service instances, and service instance 5-1 in physical node 2, service instance 5-2 in physical node 3 and service instance 5-4 in physical node 5 are only part of the fifth class of service instances. Physical node 2 therefore meets the anti-affinity condition with respect to (physical node 3 + physical node 5), and the management side adds physical node 2 to the first batch of physical nodes 01.

When the management side checks physical node 1 against the anti-affinity condition, the first batch of physical nodes 01 comprises physical node 3, physical node 5 and physical node 2. Service instance 1-A in physical node 1 and service instance 1-S in physical node 2 together make up all service instances of the first class; service instance 3-A in physical node 1 and service instance 3-S in physical node 3 together make up all service instances of the third class; and service instance 4-S in physical node 1 and service instance 4-A in physical node 2 together make up all service instances of the fourth class. Physical node 1 therefore does not meet the anti-affinity condition with respect to (physical node 3 + physical node 5 + physical node 2), and the management end keeps physical node 1 as an un-upgraded physical node.

At this point, the management end determines a first batch of physical nodes 01 which need to be upgraded: physical node 2, physical node 3 and physical node 5.
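The walk-through above can be reproduced with a short greedy scan. The sketch below is an illustration under assumptions: the layout is reconstructed from the description of FIG. 2, the scan order is descending node number as in the example, and the condition is checked by counting instances per kind (equivalent to the set comparison when instance names are unique). The names `build_batch` and `kind_counts` are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set

# Node number -> service instances, reconstructed from the FIG. 2 text.
FIG2_LAYOUT: Dict[int, List[str]] = {
    1: ["1-A", "2-1", "3-A", "4-S"],
    2: ["1-S", "2-2", "4-A", "5-1"],
    3: ["2-3", "3-S", "5-2", "6-S"],
    4: ["2-4", "5-3", "6-A"],
    5: ["2-5", "5-4"],
}


def kind_counts(instances: List[str]) -> Counter:
    """Count instances per kind, e.g. ['2-3', '6-S'] -> {'2': 1, '6': 1}."""
    return Counter(inst.split("-")[0] for inst in instances)


def build_batch(pending: Set[int], layout: Dict[int, List[str]],
                first: int) -> List[int]:
    """Steps 403 to 406: start from `first`, then scan the remaining
    pending nodes in descending order and add each node that keeps at
    least one instance of each of its kinds outside the batch."""
    total = kind_counts([i for insts in layout.values() for i in insts])
    batch = [first]
    in_batch = kind_counts(layout[first])
    for node in sorted(pending - {first}, reverse=True):
        here = kind_counts(layout[node])
        # Anti-affinity: for each kind on this node, batch plus node must
        # hold strictly fewer instances than exist in the whole system.
        if all(in_batch[k] + n < total[k] for k, n in here.items()):
            batch.append(node)
            in_batch += here
    return batch


first_batch = build_batch({1, 2, 3, 4, 5}, FIG2_LAYOUT, first=3)
print(first_batch)  # [3, 5, 2]
```

Node 4 is rejected because the sixth class would be entirely offline, and node 1 is rejected because of the first class, matching the result of physical node 2, physical node 3 and physical node 5.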

Step 407, the management side performs active/standby switching between the main service instance running in the ith batch of physical nodes and the standby service instances running in other physical nodes.

The management end performs active-standby switching between the primary service instances running in the first batch of physical nodes and the standby service instances running in the other physical nodes, so that the standby service instances running in the other physical nodes become primary service instances. While the first batch of physical nodes is being upgraded, these primary service instances in the other physical nodes provide services externally.

Illustratively, the management end performs active-standby switching between primary service instance 4-A in physical node 2, which belongs to the first batch of physical nodes, and standby service instance 4-S in physical node 1: service instance 4-A in physical node 2 becomes standby service instance 4-S, and service instance 4-S in physical node 1 becomes primary service instance 4-A. Thereafter, while the first batch of physical nodes 01 is upgraded, physical node 1 provides the services of the fourth class of service instances.
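Step 407 can be pictured as a role swap on the layout. The following sketch only models the bookkeeping; how a real system promotes a standby instance is outside its scope. The layout is reconstructed from the description of FIG. 2, and the name `switch_primaries_out_of_batch` is hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

# Node number -> service instances, reconstructed from the FIG. 2 text.
FIG2_LAYOUT: Dict[int, List[str]] = {
    1: ["1-A", "2-1", "3-A", "4-S"],
    2: ["1-S", "2-2", "4-A", "5-1"],
    3: ["2-3", "3-S", "5-2", "6-S"],
    4: ["2-4", "5-3", "6-A"],
    5: ["2-5", "5-4"],
}


def switch_primaries_out_of_batch(
    layout: Dict[int, List[str]], batch: Iterable[int]
) -> Tuple[Dict[int, List[str]], List[str]]:
    """For every primary 'k-A' in the batch whose standby 'k-S' runs
    outside the batch, swap the two roles.

    Returns a new layout and the list of switched kinds; the input
    layout is left unchanged.
    """
    new_layout = {node: list(insts) for node, insts in layout.items()}
    batch_set = set(batch)
    outside = sorted(set(new_layout) - batch_set)
    switched: List[str] = []
    for node in sorted(batch_set):
        for idx, inst in enumerate(new_layout[node]):
            kind, role = inst.split("-")
            if role != "A":
                continue  # only primaries need to move out of the batch
            standby = f"{kind}-S"
            for other in outside:
                if standby in new_layout[other]:
                    pos = new_layout[other].index(standby)
                    new_layout[other][pos] = f"{kind}-A"  # promote
                    new_layout[node][idx] = standby       # demote
                    switched.append(kind)
                    break
    return new_layout, switched


after, switched = switch_primaries_out_of_batch(FIG2_LAYOUT, [2, 3, 5])
print(switched)  # ['4']
print(after[1])  # ['1-A', '2-1', '3-A', '4-A']
print(after[2])  # ['1-S', '2-2', '4-S', '5-1']
```

Only one switch is needed for the first batch, which is the effect the selection of physical node 3 as the first physical node is meant to achieve.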

In step 408, for each service instance, the management side detects whether there is a service instance in an available state in other physical nodes except the ith batch of physical nodes.

Since some service instances may be in a fault state, the management side needs to check the availability of the whole system before formally upgrading the ith batch of physical nodes.

The management side needs to confirm whether the service instance in the available state exists in other physical nodes for each service instance.

Illustratively, the physical nodes other than the first batch of physical nodes include physical node 1 and physical node 4. Service instances 1-A, 2-1, 3-A and 4-A run in physical node 1, and service instances 2-4, 5-3 and 6-A run in physical node 4. Therefore, each of the six types of service instances has a service instance in an available state in physical node 1 or physical node 4.

If there is a service instance in an available state for each service instance, step 409 is entered, otherwise the upgrade is suspended.
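The availability check of step 408 can be sketched as a set difference. The layout below is the FIG. 2 layout after the 4-A/4-S switch described above; the `failed` parameter, which lists (node, instance) pairs in a fault state, and the function name `kinds_without_available_instance` are hypothetical. The sketch treats every non-failed instance outside the batch as available, which is an assumption.

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

# FIG. 2 layout after switching 4-A (node 2) with 4-S (node 1).
LAYOUT_AFTER_SWITCH: Dict[int, List[str]] = {
    1: ["1-A", "2-1", "3-A", "4-A"],
    2: ["1-S", "2-2", "4-S", "5-1"],
    3: ["2-3", "3-S", "5-2", "6-S"],
    4: ["2-4", "5-3", "6-A"],
    5: ["2-5", "5-4"],
}


def kinds_without_available_instance(
    layout: Dict[int, List[str]],
    batch: Iterable[int],
    failed: FrozenSet[Tuple[int, str]] = frozenset(),
) -> Set[str]:
    """Return the kinds that would have no available instance outside
    the batch; an empty set means the batch may be upgraded."""
    batch_set = set(batch)
    all_kinds = {
        inst.split("-")[0] for insts in layout.values() for inst in insts
    }
    available = {
        inst.split("-")[0]
        for node, insts in layout.items()
        if node not in batch_set
        for inst in insts
        if (node, inst) not in failed
    }
    return all_kinds - available


missing = kinds_without_available_instance(LAYOUT_AFTER_SWITCH, [2, 3, 5])
print(missing)  # set()

# If instance 6-A on node 4 were in a fault state, the upgrade of the
# first batch would have to be suspended.
missing_with_fault = kinds_without_available_instance(
    LAYOUT_AFTER_SWITCH, [2, 3, 5], failed=frozenset({(4, "6-A")})
)
print(missing_with_fault)  # {'6'}
```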

Step 409, if each service instance has a service instance in an available state, the management end upgrades the ith batch of physical nodes.

Step 410, after the upgrade is completed, the management end detects whether there are any physical nodes that have not been upgraded.

Illustratively, after the first batch of physical nodes are upgraded, the management end detects whether there are any physical nodes that are not upgraded.

If there are physical nodes that have not been upgraded, step 411 is entered, otherwise, the upgrade is ended.

In step 411, if physical nodes that have not been upgraded remain, the management end sets i = i + 1 and re-executes the step of determining the ith batch of physical nodes from the physical nodes that have not been upgraded.

The management side detects that physical node 1 and physical node 4 have not been upgraded yet, sets i = 1 + 1 = 2, and re-executes steps 401 to 410.

The management end first calculates the number of standby service instances running in physical node 1 and physical node 4. Since no standby service instances now exist in physical node 1 or physical node 4, the management end instead calculates the total number of service instances running in each of them, and determines the physical node with the largest number of service instances as the first physical node of the second batch of physical nodes.

Illustratively, the calculation result of the management end is as follows: there are 4 service instances running on physical node 1 and 3 service instances running on physical node 4. Therefore, the management side calculates that the number of instances of the service instance running on the physical node 1 is the largest, and determines the physical node 1 as the first physical node.

Illustratively, the management side adds physical node 1 to the second batch of physical nodes 02. The second batch of physical nodes is the second group of physical nodes to be upgraded.

The management end then detects whether physical node 4 meets the anti-affinity condition. At this point, the second batch of physical nodes 02 comprises physical node 1. Service instance 2-4 in physical node 4 and service instance 2-1 in physical node 1 are only part of the second class of service instances, and service instance 5-3 in physical node 4 is only part of the fifth class of service instances. Physical node 4 therefore meets the anti-affinity condition with respect to physical node 1, and the management side adds physical node 4 to the second batch of physical nodes 02.

The management end determines physical node 1 and physical node 4 as the second batch of physical nodes 02 to be upgraded.

The management end performs active-standby switching between the main service instances running in the second batch of physical nodes 02 and the standby service instances running in the other physical nodes, so that the standby service instances running in the other physical nodes become main service instances. While the second batch of physical nodes is being upgraded, the main service instances in the other physical nodes can then provide services to the outside.

Schematically, the management end performs active-standby switching between main service instance 1-A in physical node 1 of the second batch of physical nodes 02 and service instance 1-S in physical node 2, so that main service instance 1-A in physical node 1 becomes standby service instance 1-S; performs active-standby switching between main service instance 3-A in physical node 1 and service instance 3-S in physical node 3, so that main service instance 3-A in physical node 1 becomes standby service instance 3-S; performs active-standby switching between main service instance 4-A in physical node 1 and service instance 4-S in physical node 2, so that main service instance 4-A in physical node 1 becomes standby service instance 4-S; and performs active-standby switching between main service instance 6-A in physical node 4 and service instance 6-S in physical node 3, so that main service instance 6-A in physical node 4 becomes standby service instance 6-S. Thereafter, while the second batch of physical nodes 02 is being upgraded, physical node 2 provides the services of the first and fourth classes of service instances, and physical node 3 provides the services of the third and sixth classes of service instances.
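The switching step just described can be illustrated with a small sketch. It models only the instances deployed in master-standby mode (load-balanced instances need no switching), and the names `switch_over` and `Roles`, the dictionary layout, and the rule of picking the lowest-numbered node that holds the standby instance are assumptions for illustration, not details taken from the embodiment.

```python
from typing import Dict, List, Set, Tuple

Roles = Dict[int, str]  # service class -> "A" (main) or "S" (standby)


def switch_over(cluster: Dict[int, Roles], batch: Set[int]) -> List[Tuple[int, int, int]]:
    """Swap every main instance inside `batch` with its standby outside it.

    Returns (service class, old main node, new main node) for each switch.
    """
    moves = []
    for node in sorted(batch):
        for svc, role in sorted(cluster[node].items()):
            if role != "A":
                continue
            peer = next(
                (p for p in sorted(cluster)
                 if p not in batch and cluster[p].get(svc) == "S"),
                None,
            )
            if peer is None:
                raise RuntimeError(f"no standby outside the batch for class {svc}")
            cluster[node][svc], cluster[peer][svc] = "S", "A"
            moves.append((svc, node, peer))
    return moves


# Master-standby instances before the second batch (nodes 1 and 4) is upgraded.
cluster = {
    1: {1: "A", 3: "A", 4: "A"},
    2: {1: "S", 4: "S"},
    3: {3: "S", 6: "S"},
    4: {6: "A"},
}
print(switch_over(cluster, {1, 4}))
# [(1, 1, 2), (3, 1, 3), (4, 1, 2), (6, 4, 3)]
```

After the call, physical node 2 holds the main instances of the first and fourth classes and physical node 3 holds those of the third and sixth classes, which is the state the example describes.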

Illustratively, the physical nodes other than the second batch of physical nodes 02 are physical node 2, physical node 3, and physical node 5. Physical node 2 runs service instances 1-A, 2-2, 4-A and 5-1; physical node 3 runs service instances 3-A, 2-3, 5-2 and 6-A; and physical node 5 runs service instances 2-5 and 5-4. Therefore, each of the six classes of service instances has at least one service instance in an available state among physical node 2, physical node 3, and physical node 5.
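The availability detection performed before a batch is upgraded can be sketched as a set computation over the nodes outside the batch. The function name `unavailable_classes` and the `(node, class)` encoding of faulty instances are hypothetical and chosen only for this sketch.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple


def unavailable_classes(
    remaining: Dict[int, Set[int]],
    all_classes: Iterable[int],
    faulty: FrozenSet[Tuple[int, int]] = frozenset(),
) -> Set[int]:
    """Return the service classes with no available instance outside the batch.

    `remaining` maps each node outside the current batch to the service
    classes it runs; `faulty` lists (node, class) pairs known to be in a
    fault state.
    """
    available = {
        svc
        for node, classes in remaining.items()
        for svc in classes
        if (node, svc) not in faulty
    }
    return set(all_classes) - available


# Nodes outside the second batch in the example: physical nodes 2, 3 and 5.
remaining = {2: {1, 2, 4, 5}, 3: {3, 2, 5, 6}, 5: {2, 5}}
print(unavailable_classes(remaining, range(1, 7)))  # set()
# If the class-6 instance on node 3 were faulty, class 6 would be uncovered.
print(unavailable_classes(remaining, range(1, 7), frozenset({(3, 6)})))  # {6}
```

An empty result means the batch can be upgraded without interrupting any class of service; a non-empty result identifies the classes that would become unavailable.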

The management end upgrades the second batch of physical nodes 02, and thus all the physical nodes in the embodiment are upgraded.

It should be noted that the embodiment of the present invention does not limit the manner in which the management end determines the first physical node from the un-upgraded physical nodes. For example, the management end determines the physical node running the largest number of service instances as the first physical node; for another example, the management end determines the physical node running the largest number of main service instances as the first physical node. These determination manners can also be used in combination.

According to the physical node upgrading method provided by this embodiment, all physical nodes are upgraded in batches by dynamically detecting whether each physical node meets the anti-affinity condition, so that external services are not interrupted during the upgrade. While the physical nodes of each batch are being upgraded, the physical nodes outside the current batch can still provide services to the outside.

In the physical node upgrading method provided in this embodiment, the physical node running the most standby service instances is determined as the first physical node, so that the fewest service instances need active-standby switching in step 407. This reduces the number of service interruptions caused by the switching process and keeps the service interruption time as short as possible.
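The reasoning can be made concrete with a small count: every main service instance inside a batch needs one switch, so a node dominated by standby instances is cheaper to upgrade first. The node names and role lists below are hypothetical data, and `switchovers_needed` is an illustrative helper, not part of the described method.

```python
from typing import Dict, List


def switchovers_needed(batch: List[str], roles: Dict[str, List[str]]) -> int:
    """Count the main ("A") instances in the batch; each needs one switch."""
    return sum(role == "A" for node in batch for role in roles[node])


# Hypothetical nodes: node-x runs mostly standby instances, node-y only mains.
roles = {"node-x": ["S", "S", "A"], "node-y": ["A", "A", "A"]}
print(switchovers_needed(["node-x"], roles))  # 1
print(switchovers_needed(["node-y"], roles))  # 3
```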

In the physical node upgrading method provided in this embodiment, the availability of each class of service instances is checked before the ith batch of physical nodes is upgraded. This avoids the situation in which a class of service instances becomes unavailable during the upgrade because one of its service instances is in a fault state, and thus prevents service interruption during the upgrade of the physical nodes.

Optionally, the physical nodes run service instances deployed in master-standby mode and/or service instances deployed in load balancing mode.

Fig. 6A and Fig. 6B are schematic structural diagrams illustrating a distributed computing system according to still another embodiment of the present invention. Taking 4 physical nodes as an example, the physical nodes include: physical node 1, physical node 2, physical node 3, and physical node 4. This embodiment is described by taking as an example the case in which only service instances deployed in master-standby mode run on the physical nodes.

At least one service instance runs on each physical node, and each service instance provides one or more services. Optionally, the service instance is any one of a thread, a process, a container, and a virtual machine (VM).

Service instances of the same class are deployed on two physical nodes in master-standby mode. In Fig. 6, service instance 1-A is the main service instance of the first class of service instances, and service instance 1-S is the standby service instance of the first class; service instance 2-A is the main service instance of the second class, and service instance 2-S is the standby service instance of the second class; service instance 3-A is the main service instance of the third class, and service instance 3-S is the standby service instance of the third class; service instance 4-A is the main service instance of the fourth class, and service instance 4-S is the standby service instance of the fourth class; service instance 5-A is the main service instance of the fifth class, and service instance 5-S is the standby service instance of the fifth class.

Service instances 1-A and 4-S are deployed on physical node 1; service instances 1-S, 2-A, 3-A and 5-A are deployed on physical node 2; service instances 2-S, 5-S and 4-A are deployed on physical node 3; and service instance 3-S is deployed on physical node 4.

The management terminal calculates the number of the instances of the standby service instances running in the 4 physical nodes, and the calculation result is as follows: the physical node 1 runs 1 standby service instance, the physical node 2 runs 1 standby service instance, the physical node 3 runs 2 standby service instances, and the physical node 4 runs 1 standby service instance.

The management side determines the physical node 3 as the first physical node.

The management end adds physical node 3 to the first batch of physical nodes 01. The first batch of physical nodes 01 is the first batch of physical nodes that needs to be upgraded.

Optionally, the management end detects whether the physical nodes that are not upgraded meet the anti-affinity condition one by one according to the ascending sequence of the numbers of the physical nodes, that is, the management end detects whether the physical nodes 1, 2, and 4 meet the anti-affinity condition in sequence.

When the management end checks the anti-affinity condition of physical node 1, the first batch of physical nodes 01 includes physical node 3. Service instance 4-S in physical node 1 and service instance 4-A in physical node 3 together are all the service instances of the fourth class, so physical node 1 conflicts with physical node 3 and does not meet the anti-affinity condition with respect to physical node 3. The management end keeps physical node 1 as an un-upgraded physical node.

When the management end checks the anti-affinity condition of physical node 2, the first batch of physical nodes 01 includes physical node 3. Service instance 2-A in physical node 2 and service instance 2-S in physical node 3 together are all the service instances of the second class, and service instance 5-A in physical node 2 and service instance 5-S in physical node 3 together are all the service instances of the fifth class. Physical node 2 therefore conflicts with physical node 3 and does not meet the anti-affinity condition with respect to physical node 3, and the management end keeps physical node 2 as an un-upgraded physical node.

When the management end checks the anti-affinity condition of physical node 4, the first batch of physical nodes 01 includes physical node 3. Service instance 3-S runs in physical node 4 and no service instance of the third class runs in physical node 3, so the third-class service instances in physical node 4 and physical node 3 are only part of the service instances of the third class. Physical node 4 therefore meets the anti-affinity condition with respect to physical node 3, and the management end adds physical node 4 to the first batch of physical nodes 01.

At this point, the management end determines a first batch of physical nodes 01 which need to be upgraded: physical node 3 and physical node 4.
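The batch formation walked through above can be sketched as follows. Only the service class of each instance matters for the anti-affinity condition, so each node is modeled as the set of classes it runs. The names `meets_anti_affinity`, `build_batch` and `placement` are assumptions for illustration; the per-node classes are read from the deployment described for Fig. 6.

```python
from typing import Dict, List, Set


def meets_anti_affinity(candidate: int, batch: List[int],
                        placement: Dict[int, Set[int]]) -> bool:
    """True if adding `candidate` leaves every class it runs with at least
    one instance on a node outside the batch."""
    joined = set(batch) | {candidate}
    for svc in placement[candidate]:
        hosts = {n for n, classes in placement.items() if svc in classes}
        if hosts <= joined:  # the batch would hold the complete set of this class
            return False
    return True


def build_batch(first: int, candidates: List[int],
                placement: Dict[int, Set[int]]) -> List[int]:
    """Start from `first` and test the candidates one by one, in order."""
    batch = [first]
    for node in candidates:
        if meets_anti_affinity(node, batch, placement):
            batch.append(node)
    return batch


# Classes run by each physical node in the Fig. 6 example.
placement = {1: {1, 4}, 2: {1, 2, 3, 5}, 3: {2, 4, 5}, 4: {3}}
print(build_batch(3, [1, 2, 4], placement))  # [3, 4]
```

Node 1 is rejected because classes 4 would be fully inside the batch, node 2 because of classes 2 and 5, and node 4 is accepted because the other third-class instance stays on node 2.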

The management end carries out active-standby switching on a main service instance 4-A in a physical node 3 in a first batch of physical nodes 01 and a service instance 4-S in a physical node 1, and switches the main service instance 4-A in the physical node 3 to a standby service instance 4-S. Thereafter, when the first batch of physical nodes 01 are upgraded, the physical nodes 1 provide services of the fourth type of service instances.

The physical nodes other than the first batch of physical nodes 01 are physical node 1 and physical node 2. Physical node 1 runs service instances 1-S and 4-A, and physical node 2 runs service instances 1-A, 2-A, 3-A and 5-A. Therefore, each of the five classes of service instances has a service instance in an available state among physical node 1 and physical node 2.

And the management terminal upgrades the first batch of physical nodes 01.

After the upgrade is finished, the management end detects whether the physical nodes which are not upgraded exist. The management terminal detects that the physical node 1 and the physical node 2 are not upgraded, the management terminal calculates the number of instances of the standby service instances running in the physical node 1 and the physical node 2, and the calculation result is as follows: physical node 1 has 1 standby service instance running thereon, and physical node 2 has 0 standby service instance running thereon.

The management end determines the physical node 1 as a first physical node.

The management end adds physical node 1 to the second batch of physical nodes 02. The second batch of physical nodes 02 is the second batch of physical nodes that needs to be upgraded.

The management end then detects whether physical node 2 meets the anti-affinity condition.

When the management end checks the anti-affinity condition of physical node 2, the second batch of physical nodes 02 includes physical node 1. Service instance 1-A in physical node 2 and service instance 1-S in physical node 1 together are all the service instances of the first class, so physical node 2 conflicts with physical node 1 and does not meet the anti-affinity condition with respect to physical node 1. The management end keeps physical node 2 as an un-upgraded physical node.

At this point, the management end has determined the second batch of physical nodes 02 that needs to be upgraded: physical node 1.

And the management end carries out active-standby switching on the main service instance 4-A in the physical node 1 in the second batch of physical nodes 02 and the service instance 4-S in the physical node 3, and switches the main service instance 4-A in the physical node 1 to the standby service instance 4-S. Thereafter, when the second batch of physical nodes 02 is upgraded, the physical nodes 3 provide services of the fourth class of service instances.

The physical nodes other than the second batch of physical nodes 02 are physical node 2, physical node 3, and physical node 4. Physical node 2 runs service instances 1-A, 2-A, 3-A and 5-A; physical node 3 runs service instances 2-S, 5-S and 4-A; and physical node 4 runs service instance 3-S. Therefore, each of the five classes of service instances has a service instance in an available state among physical node 2, physical node 3, and physical node 4.

And the management terminal upgrades the second batch of physical nodes 02.

After the upgrade is finished, the management end detects whether any physical node has not been upgraded. The management end detects that only physical node 2 remains un-upgraded. Because physical node 2 is the only un-upgraded node, it is added to the third batch of physical nodes 03 for upgrading, and no anti-affinity check is needed at this time. Before the third batch of physical nodes 03 is upgraded, the management end performs active-standby switching between main service instance 1-A in physical node 2 of the third batch of physical nodes 03 and service instance 1-S in physical node 1, so that main service instance 1-A in physical node 2 becomes standby service instance 1-S; performs active-standby switching between main service instance 2-A in physical node 2 and service instance 2-S in physical node 3, so that main service instance 2-A in physical node 2 becomes standby service instance 2-S; performs active-standby switching between main service instance 3-A in physical node 2 and service instance 3-S in physical node 4, so that main service instance 3-A in physical node 2 becomes standby service instance 3-S; and performs active-standby switching between main service instance 5-A in physical node 2 and service instance 5-S in physical node 3, so that main service instance 5-A in physical node 2 becomes standby service instance 5-S. Thereafter, while the third batch of physical nodes 03 is being upgraded, physical node 1 provides the services of the first class of service instances, physical node 3 provides the services of the second and fifth classes of service instances, and physical node 4 provides the services of the third class of service instances.

The physical nodes other than the third batch of physical nodes 03 are physical node 1, physical node 3, and physical node 4. Physical node 1 runs service instances 1-A and 4-S; physical node 3 runs service instances 2-A, 5-A and 4-A; and physical node 4 runs service instance 3-A. Therefore, each of the five classes of service instances has a service instance in an available state among physical node 1, physical node 3, and physical node 4.

And the management terminal upgrades the third batch of physical nodes 03. Thus, all the physical nodes in the embodiment are upgraded.

Fig. 7A and Fig. 7B are schematic structural diagrams illustrating a distributed computing system according to still another embodiment of the present invention. Taking 4 physical nodes as an example, the physical nodes include: physical node 1, physical node 2, physical node 3, and physical node 4. This embodiment is described by taking as an example the case in which only service instances deployed in load balancing mode run on the physical nodes.

Service instances of the same class are deployed in load balancing mode on different physical nodes. In Fig. 7, the first class of service instances includes 2 service instances: service instance 1-1 and service instance 1-2; the second class of service instances includes 2 service instances: service instance 2-1 and service instance 2-2; the third class of service instances includes 3 service instances: service instance 3-1, service instance 3-2, and service instance 3-3; the fourth class of service instances includes 5 service instances: service instance 4-1, service instance 4-2, service instance 4-3, service instance 4-4, and service instance 4-5.

Service instances 1-1, 2-1, 3-1, 4-1 and 4-5 are deployed on physical node 1; service instances 1-2, 2-2, 3-2 and 4-4 are deployed on physical node 2; service instances 3-3 and 4-3 are deployed on physical node 3; and service instance 4-2 is deployed on physical node 4.

Optionally, the management side calculates the number of instances of the service instance running in the 4 physical nodes, and the calculation result is: the physical node 1 runs 5 service instances, the physical node 2 runs 4 service instances, the physical node 3 runs 2 service instances, and the physical node 4 runs 1 service instance.

Therefore, the management side calculates that the number of instances of the service instance running on the physical node 1 is the largest.

The management end determines the physical node 1 as a first physical node.

The management end adds physical node 1 to the first batch of physical nodes 01. The first batch of physical nodes 01 is the first batch of physical nodes that needs to be upgraded.

Optionally, the management end detects whether the physical nodes that are not upgraded meet the anti-affinity condition one by one according to the ascending order of the numbers of the physical nodes, that is, the management end detects whether the physical nodes 2, 3, and 4 meet the anti-affinity condition in sequence.

When the management end checks the anti-affinity condition of physical node 2, the first batch of physical nodes 01 includes physical node 1. Service instance 1-2 in physical node 2 and service instance 1-1 in physical node 1 together are all the service instances of the first class, and service instance 2-2 in physical node 2 and service instance 2-1 in physical node 1 together are all the service instances of the second class. Physical node 2 therefore conflicts with physical node 1 and does not meet the anti-affinity condition with respect to physical node 1, and the management end keeps physical node 2 as an un-upgraded physical node.

When the management end checks the anti-affinity condition of physical node 3, the first batch of physical nodes 01 includes physical node 1. Service instance 3-3 in physical node 3 and service instance 3-1 in physical node 1 are only part of the service instances of the third class, and service instance 4-3 in physical node 3 together with service instances 4-1 and 4-5 in physical node 1 are only part of the service instances of the fourth class. Physical node 3 therefore meets the anti-affinity condition with respect to physical node 1, and the management end adds physical node 3 to the first batch of physical nodes 01.

When the management end checks the anti-affinity condition of physical node 4, the first batch of physical nodes 01 includes physical node 1 and physical node 3. Service instance 4-2 in physical node 4, service instances 4-1 and 4-5 in physical node 1, and service instance 4-3 in physical node 3 are only part of the service instances of the fourth class. Physical node 4 therefore meets the anti-affinity condition with respect to the first batch of physical nodes 01, and the management end adds physical node 4 to the first batch of physical nodes 01.

At this point, the management end determines a first batch of physical nodes 01 which need to be upgraded: physical node 1, physical node 3 and physical node 4.

In this embodiment, only the service instance deployed in the load balancing manner is deployed on the physical node, so that the main/standby switching is not required in this embodiment.

Before upgrading, the management end needs to perform availability detection on the physical nodes other than the first batch of physical nodes 01, namely physical node 2.

Illustratively, in the physical node 2, service instances 1-2, 2-2, 3-2 and 4-4 are running, so that there are service instances in an available state in the physical node 2 for four different types of service instances.

And the management terminal upgrades the first batch of physical nodes 01.

After the first batch of physical nodes 01 has been upgraded, the management end detects that only physical node 2 remains un-upgraded. Because physical node 2 is the only un-upgraded node, it is added to the second batch of physical nodes 02 for upgrading, and no anti-affinity check is needed at this time.

Before upgrading, the management side needs to perform availability detection on other physical nodes except the second batch of physical nodes 02, including the physical node 1, the physical node 3 and the physical node 4.

Illustratively, physical node 1 runs service instances 1-1, 2-1, 3-1, 4-1 and 4-5, physical node 3 runs service instances 3-3 and 4-3, and physical node 4 runs service instance 4-2, so each of the four classes of service instances has a service instance in an available state among physical node 1, physical node 3, and physical node 4.

And the management terminal upgrades the second batch of physical nodes 02. Thus, all the physical nodes in the embodiment are upgraded.
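The complete loop for this load-balanced example (select a first node, grow the batch by anti-affinity checks in ascending node order, upgrade, repeat while un-upgraded nodes remain) can be sketched as below. This is a sketch under stated assumptions: `upgrade_in_batches` and the `upgrade` callback are hypothetical names, the first node is chosen by total instance count with ties going to the lowest node number, and the availability detection step is omitted for brevity.

```python
from typing import Callable, Dict, List


def upgrade_in_batches(placement: Dict[int, List[int]],
                       upgrade: Callable[[List[int]], None]) -> List[List[int]]:
    """Upgrade all nodes batch by batch; `placement` maps node -> classes
    of the instances it runs (one list entry per instance)."""
    pending = sorted(placement)
    batches: List[List[int]] = []
    while pending:
        # First node of batch i: the pending node running the most instances.
        first = max(pending, key=lambda n: len(placement[n]))
        batch = [first]
        for node in pending:
            if node == first:
                continue
            joined = set(batch) | {node}
            # Conflict if some class run by `node` would be entirely in the batch.
            conflict = any(
                {n for n in placement if svc in placement[n]} <= joined
                for svc in set(placement[node])
            )
            if not conflict:
                batch.append(node)
        upgrade(batch)
        batches.append(batch)
        pending = [n for n in pending if n not in batch]  # then i = i + 1
    return batches


# Fig. 7 example: classes of the instances deployed on physical nodes 1 to 4.
placement = {1: [1, 2, 3, 4, 4], 2: [1, 2, 3, 4], 3: [3, 4], 4: [4]}
log: List[List[int]] = []
print(upgrade_in_batches(placement, log.append))  # [[1, 3, 4], [2]]
```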

Referring to Fig. 8, a block diagram of a physical node upgrading device according to an embodiment of the present invention is shown. The physical node upgrading device can be implemented as all or part of a server by a dedicated hardware circuit or by a combination of hardware and software. In this embodiment, the physical node upgrading device is applied to a virtualization system. The physical node upgrading device includes: a batch determining unit 801, a node upgrading unit 802, a node detection unit 803, and an upgrade loop unit 804.

A batch determining unit 801, configured to implement the function of step 301.

A node upgrading unit 802, configured to implement the function of step 302 described above.

A node detection unit 803, configured to implement the function of step 303.

And an upgrade loop unit 804, configured to implement the functions of step 304.

Referring to Fig. 9, a block diagram of a physical node upgrading device according to an embodiment of the present invention is shown. The physical node upgrading device can be implemented as all or part of a server by a dedicated hardware circuit or by a combination of hardware and software. In this embodiment, the physical node upgrading device is applied to a virtualization system. The physical node upgrading device includes: a batch determining unit 901, a main/standby switching unit 902, a node upgrading unit 903, a node detection unit 904, and an upgrade loop unit 905.

A batch determining unit 901, configured to implement the functions of step 401, step 402, step 403, step 404, step 405, and step 406.

The main/standby switching unit 902 is configured to implement the function in step 407.

And a node upgrading unit 903, configured to implement the function in step 409.

A node detection unit 904, configured to implement the functions of step 408 and step 410.

And an upgrade loop unit 905 configured to implement the functions of step 411.

The above description mainly introduces the scheme provided by the embodiment of the present invention from the perspective of the management terminal or the physical node. It is to be understood that, in order to implement the above functions, the management terminal or the physical node includes hardware structures and/or software modules for performing the respective functions. The illustrative modules and algorithm steps described in connection with the embodiments disclosed herein may be implemented by hardware or by a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application and the design constraints of the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present teachings.

Fig. 10 is a schematic structural diagram of a management terminal or a physical node according to an embodiment of the present invention. For example, the management terminal or the physical node is a server, and is configured to implement the physical node upgrade function of the foregoing method embodiments. The management terminal or physical node 1000 includes: a network interface 1010, a processor 1020, and a memory 1030. The network interface 1010 is used to support sending information to and receiving information from other external physical nodes. The processor 1020 is configured to perform various functions of the management terminal or physical node 1000, such as determining a first physical node and controlling the upgrade of a physical node. The processor 1020 is further configured to perform the steps in the embodiments shown in Fig. 3 and Fig. 4 or other steps of the technical solutions described in the present invention.

Further, the management terminal or physical node 1000 may also include a memory 1030, the memory 1030 configured to store one or more program instructions and data. The one or more program instructions are configured to be executed by the processor 1020 to implement the physical node upgrade method illustrated in fig. 3 or 4 described above.

The management terminal or physical node 1000 may further include a bus 1040. The memory 1030 and the network interface 1010 are coupled to the processor 1020 by a bus 1040.

It will be appreciated that fig. 10 shows only a simplified design of a management terminal or physical node 1000. In practical applications, the management terminal or the physical node 1000 may include any number of network interfaces, processors, memories, etc., and all devices that can implement the embodiments of the present invention are within the protection scope of the embodiments of the present invention.

The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.

It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.

The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims

1. A physical node upgrading method, for use in a distributed system comprising a plurality of physical nodes, wherein each physical node runs at least one service instance, and service instances of the same kind running in different physical nodes are in a primary-standby relationship or a load balancing relationship with each other, the method comprising the following steps:

determining a first physical node from the physical nodes which are not upgraded;

adding the first physical node to an ith batch of physical nodes;

detecting, one by one according to a preset sequence, whether the remaining physical nodes other than the first physical node among the physical nodes which are not upgraded meet an anti-affinity condition, wherein the anti-affinity condition comprises: the set formed by the kth kind of service instances run by the current physical node and the kth kind of service instances run in the ith batch of physical nodes is not the complete set of the kth kind of service instances, and k is an integer;

if the anti-affinity condition is met, adding the remaining physical node to the ith batch of physical nodes, wherein the service instances running in the physical nodes other than the ith batch of physical nodes comprise all kinds of service instances, each kind of service instance has at least one service instance in an available state, and i is an integer;

upgrading the ith batch of physical nodes;

after the upgrade is finished, detecting whether physical nodes which are not upgraded exist;

and if the physical nodes which are not upgraded exist, enabling i to be i +1, and re-executing the step of determining the first physical node from the physical nodes which are not upgraded.

2. The method of claim 1, wherein determining the first physical node from the physical nodes that are not upgraded comprises:

calculating the number of instances of the standby service instance running in each physical node which is not upgraded;

and determining the physical node with the maximum number of the instances of the standby service instance as the first physical node.

3. The method of claim 1, wherein before the upgrading the ith batch of physical nodes, further comprising:

and performing active-standby switching on the main service instance running in the ith batch of physical nodes and the standby service instances running in the other physical nodes.

4. The method of claim 1, wherein before the upgrading the ith batch of physical nodes, further comprising:

for each service instance, detecting whether the service instance in an available state exists in other physical nodes except the ith batch of physical nodes;

and if each service instance has a service instance in an available state, executing the step of upgrading the ith batch of physical nodes.

5. A physical node upgrade apparatus, for use in a distributed system including a plurality of physical nodes, where each physical node runs at least one service instance, and the same service instance is in a primary-standby relationship or a load balancing relationship with each other in different physical nodes, the apparatus comprising:

the batch determining unit is used for determining a first physical node from the physical nodes which are not upgraded;

adding the first physical node to an ith batch of physical nodes;

the node upgrading unit is used for upgrading the ith batch of physical nodes;

the node detection unit is used for detecting whether physical nodes which are not upgraded exist after the upgrade is finished;

and an upgrade circulating unit, configured to, if there are still physical nodes that have not been upgraded, make i equal to i +1, and re-execute the step of determining the first physical node from the physical nodes that have not been upgraded.

6. The apparatus of claim 5, wherein the batch determination unit is further configured to:

7. The apparatus of claim 5, further comprising:

and the main/standby switching unit is used for performing main/standby switching on the main service instances running in the ith batch of physical nodes and the standby service instances running in the other physical nodes.

8. The apparatus of claim 5, further comprising:

the node detection unit is further configured to detect, for each service instance, whether a service instance in the available state exists in the other physical nodes except the ith batch of physical nodes;

the node upgrading unit is further configured to execute the step of upgrading the ith batch of physical nodes when a service instance in the available state exists in each service instance.