CN115643168A - Node hyper-convergence upgrading method, device, equipment and storage medium

Info

Publication number: CN115643168A
Application number: CN202211286846.7A
Authority: CN
Inventors: Name withheld at the inventor's request
Original assignee: Anchao Cloud Software Co Ltd
Current assignee: Anchao Cloud Software Co Ltd
Priority date: 2022-10-20
Filing date: 2022-10-20
Publication date: 2023-01-24
Anticipated expiration: 2042-10-20
Also published as: CN115643168B

Abstract

The application relates to a node hyper-convergence upgrading method, apparatus, device and storage medium, and in particular to the technical field of cloud services. The method comprises the following steps: controlling each node device to copy each service in the control virtual manager deployed on that node device to a target container, and deploying the target container on each node device; deploying load balancers on the master node device and the non-master node device respectively; controlling the non-master node device to start its load balancer and stateless service; controlling the master node device to close the control virtual manager and to start the load balancer and the stateful service on the master node device; and controlling the master node device to start the stateless service on the master node device and deleting the control virtual manager. With this scheme, upgrade failure caused by deleting the control virtual manager is avoided, each service is started smoothly, and the service interruption time caused by service startup is shortened.

Description

Node hyper-convergence upgrading method, device, equipment and storage medium

Technical Field

The invention relates to the technical field of cloud services, and in particular to a node hyper-convergence upgrading method, apparatus, device and storage medium.

Background

A hyper-converged architecture is a new generation of scale-out, software-defined architecture. It is composed of general-purpose hardware units that integrate CPU, memory, storage, network and a virtualization software platform, and it has no fixed central node. Its core concepts include linear horizontal scaling, the convergence of computing capacity and storage capacity, and the use of high-speed flash memory as the server-side storage medium.

In an existing hyper-converged architecture, the hyper-converged management system runs in a control virtual manager. To upgrade the architecture automatically, a user logs in to the management system and clicks upgrade on a web page; after receiving the upgrade instruction, the cloud management service in the background invokes the cloud platform life cycle management service to upgrade the storage, the physical machine system and the control virtual manager.

However, in the above scheme the control virtual manager must be uninstalled and then reinstalled during the upgrade, which may make it impossible to log in to the cloud management system and cause the cloud management upgrade to fail.

Disclosure of Invention

The application provides a node hyper-convergence upgrading method, apparatus, device and storage medium which, while avoiding upgrade failure caused by deleting a control virtual manager, start each service smoothly and reduce the service interruption time caused by service startup.

In one aspect, a node hyper-convergence upgrading method is provided, applied to a control device in a cloud service system; the cloud service system further comprises node devices, the node devices comprising a master node device and a non-master node device, and the method comprises the following steps:

controlling each node device to copy each service in a control virtual manager deployed on each node device to a target container, and respectively deploying the target container on each node device; each service comprises a stateless service and a stateful service;

deploying load balancers on the master node device and the non-master node device respectively;

controlling the non-master node device to start the load balancer and the stateless service;

controlling the master node device to close the control virtual manager, and starting a load balancer and the stateful service in the master node device;

and controlling the master node device to start the stateless service on the master node device, and deleting the control virtual manager to complete the node hyper-convergence upgrade.

In another aspect, a node hyper-convergence upgrading device is provided, and the device is used for a control device in a cloud service system; the cloud service system further includes each node device including a master node device and a non-master node device, and the apparatus includes:

a container deployment module, configured to control each node device to copy each service in the control virtual manager deployed on each node device to a target container, and deploy the target container on each node device respectively; each service comprises a stateless service and a stateful service;

a load balancing deployment module, configured to deploy load balancers on the master node device and the non-master node device, respectively;

a non-master node starting module, configured to control the non-master node device to start the load balancer and the stateless service;

a master node starting module, configured to control the master node device to close the control virtual manager, and start a load balancer and a stateful service in the master node device;

and a manager deleting module, configured to control the master node device to start the stateless service on the master node device, and delete the control virtual manager to complete the node hyper-convergence upgrade.

In one possible implementation, the apparatus further includes an upgrade module;

the upgrading module is used for upgrading the distributed storage shared by each node device and the service and operating system on each node device.

In one possible implementation manner, the apparatus further includes a high-availability component control module, configured to stop the high-availability component on each node device; the high availability component is used for controlling storage of each node device and switching of the virtual manager.

In one possible implementation, the manager deletion module is further configured to:

controlling the master node device to start a stateless service on the master node device;

when detecting that the stateless service on the master node device has started, starting the high-availability component on each node device;

and deleting the control virtual manager on each node device to finish node hyper-convergence upgrading.

In one possible implementation, the high availability component control module is further configured to:

detecting the state of each node device;

and stopping the high-availability components of the node devices when the node devices are detected to be in the healthy state.

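In one possible implementation, the high availability component control module is further configured to:

when detecting that each node device is in a healthy state, performing data backup on target data in each node device;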

and stopping the high-availability components of each node device after the target data backup in each node device is detected to be finished.

In one possible implementation, the apparatus further includes:

an upgrade termination module, configured to shut down the stateful service and the load balancer in the master node device and start the control virtual manager when it is detected that the master node device triggers an upgrade failure condition;

the upgrade failure condition comprises at least one of:

the control virtual manager fails to close;

the load balancer fails to start;

the stateful service fails to start.
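The behaviour of the upgrade termination module can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the application: the `MasterNode` class, its fields, the `fail_at` parameter and the `switch_master` function are hypothetical names introduced only to simulate the three failure conditions listed above.

```python
from dataclasses import dataclass, field


class UpgradeStepError(Exception):
    """Raised when one of the three upgrade failure conditions is triggered."""


@dataclass
class MasterNode:
    # Hypothetical in-memory model of the master node device.
    cvm_running: bool = True
    lb_running: bool = False
    stateful_running: bool = False
    fail_at: str = ""  # name of the step that should fail (simulation only)
    log: list = field(default_factory=list)

    def _step(self, name: str) -> None:
        if self.fail_at == name:
            raise UpgradeStepError(name)
        self.log.append(name)

    def close_cvm(self) -> None:
        self._step("close_cvm")
        self.cvm_running = False

    def start_load_balancer(self) -> None:
        self._step("start_load_balancer")
        self.lb_running = True

    def start_stateful(self) -> None:
        self._step("start_stateful")
        self.stateful_running = True

    def rollback(self) -> None:
        # Shut down the stateful service and load balancer, restart the CVM.
        self.stateful_running = False
        self.lb_running = False
        self.cvm_running = True
        self.log.append("rollback")


def switch_master(node: MasterNode) -> bool:
    """Return True if the switch succeeded, False if it was rolled back."""
    try:
        node.close_cvm()
        node.start_load_balancer()
        node.start_stateful()
        return True
    except UpgradeStepError:
        node.rollback()
        return False
```

Whichever of the three steps fails, the sketch returns the master node to the state it had before the switch, with the control virtual manager running again.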

In one possible implementation, the target container includes at least a stateful container and a stateless container;

the container deployment module is further configured to control each node device, copy the stateful services deployed on each node device into a stateful container, and copy the stateless services deployed on each node device into a stateless container;

and respectively deploying the stateful container and the stateless container on each node device.

In yet another aspect, a computer device is provided, which includes a processor and a memory, where at least one instruction is stored in the memory, and the at least one instruction is loaded and executed by the processor to implement the node hyper-convergence upgrade method.

In yet another aspect, a computer-readable storage medium is provided, in which at least one instruction is stored, and the at least one instruction is loaded and executed by a processor to implement the above-mentioned node hyper-convergence upgrade method.

In yet another aspect, a computer program product or computer program is provided, the computer program product or computer program comprising computer instructions stored in a computer-readable storage medium. A processor of the computer device reads the computer instructions from the computer-readable storage medium and executes them, so that the computer device performs the node hyper-convergence upgrading method.

The technical scheme provided by the application can comprise the following beneficial effects:

To implement the node hyper-convergence upgrade, the control device first copies the stateless services and the stateful services in the control virtual manager to the target container and deploys the target container on each node device. The control device then deploys a load balancer on each node device and controls the non-master node device to start its load balancer and stateless services. Next, the control device controls the master node device to close the control virtual manager and to start the load balancer and the stateful services on the master node device. At this point, although the control virtual manager is closed, the load balancer forwards incoming service requests to the stateless services on the non-master node device, and a stateless service that needs a stateful service can access the stateful service on the master node device; the target container therefore takes the place of the control virtual manager. Finally, the control device controls the master node device to start the remaining stateless services and deletes the control virtual manager, completing the node hyper-convergence upgrade. Because the control virtual manager is replaced by the target container and the stateful and stateless services in the target container are started step by step, upgrade failure caused by deleting the control virtual manager is avoided, each service is started smoothly, and the service interruption time caused by service startup is reduced.

Drawings

In order to more clearly illustrate the detailed description of the present application or the technical solutions in the prior art, the drawings needed to be used in the detailed description of the present application or the prior art description will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present application, and other drawings can be obtained by those skilled in the art without creative efforts.

Fig. 1 is a schematic structural diagram illustrating a cloud service system according to an exemplary embodiment.

Fig. 2 is a flowchart illustrating a node hyper-convergence upgrading method according to an exemplary embodiment.

Fig. 3 is a flowchart illustrating a node hyper-convergence upgrading method according to an exemplary embodiment.

Fig. 4 shows a flow chart of a node hyper-convergence upgrade according to an embodiment of the present application.

FIG. 5 illustrates a pre-upgrade node logic diagram according to an embodiment of the present application.

Fig. 6 shows a container creation diagram according to an embodiment of the present application.

Fig. 7 shows a schematic diagram of service startup on a non-master node device according to an embodiment of the present application.

Fig. 8 is a schematic diagram illustrating a node device after a service according to an embodiment of the present application is completely started.

Fig. 9 is a schematic diagram illustrating a node device after a service according to an embodiment of the present application is completely started.

Fig. 10 is a schematic diagram illustrating a system architecture after upgrading according to an embodiment of the present application is completed.

Fig. 11 illustrates a node hyper-convergence upgrading apparatus according to an embodiment of the present application.

FIG. 12 is a schematic diagram of a computer device provided in accordance with an exemplary embodiment of the present application.

Detailed Description

The technical solutions of the present application will be described clearly and completely with reference to the accompanying drawings, and it should be understood that the described embodiments are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

It should be understood that "indication" mentioned in the embodiments of the present application may be a direct indication, an indirect indication, or an indication of an association relationship. For example, "A indicates B" may mean that A directly indicates B, e.g., B can be obtained from A; it may also mean that A indicates B indirectly, e.g., A indicates C and B can be obtained from C; it may also mean that there is an association between A and B.

In the description of the embodiments of the present application, the term "correspond" may indicate that there is a direct correspondence or an indirect correspondence between the two, may also indicate that there is an association between the two, and may also indicate and be indicated, configure and configured, and so on.

In the embodiment of the present application, "predefining" may be implemented by saving a corresponding code, table, or other manners that may be used to indicate related information in advance in a device (for example, including a terminal device and a network device), and the present application is not limited to a specific implementation manner thereof.

Fig. 1 is a schematic structural diagram illustrating a cloud service system according to an exemplary embodiment. The system includes various node devices including a master node device 110 and a non-master node device 120.

That is to say, in the embodiments of the present application, the node devices may be considered to include only the master node device 110 and the non-master node device 120, and the node hyper-convergence upgrading method described in these embodiments is used to perform a hyper-convergence upgrade on two nodes (that is, a master node device and a non-master node device).

Optionally, the master node device 110 and the non-master node device 120 form a hyper-converged architecture. The hyper-converged management system of the two node devices is deployed in a control virtual manager (hereinafter referred to as CVM), and the services running in the CVM manage the two node devices (i.e., physical machines) through a network.

Optionally, life cycle management (hereinafter referred to as LCM) is deployed on each of the master node device 110 and the non-master node device 120, and the LCM is responsible for upgrading the cluster, expanding it, and replacing its nodes.

Optionally, the master node device 110 and the non-master node device 120 have at least one shared storage. Data of the stateful services of the control virtual manager CVM is stored in the shared storage, and data of the stateless services is stored on the operating system disk of the physical machine. At any time, only one of the two node devices (i.e., the master node device) runs the control virtual manager CVM. Switching of the control virtual manager CVM and of the storage is controlled by the high-availability component HA-service on each node: when a fault is detected in the CVM of the master node device, the HA-service can start the control virtual manager CVM on the non-master node device, thereby switching the control virtual manager CVM.

Optionally, the cloud service system further includes a control device (not shown in Fig. 1). The control device may monitor the state of each node device during the hyper-convergence upgrade and, when it detects that a node device meets a preset condition, send a corresponding control instruction to that node device so that it executes a preset operation, thereby completing the hyper-convergence upgrade of the nodes.

The control device can be in communication connection with each node device through a wired or wireless network.

Optionally, the cloud server may be a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms.

Optionally, the master node device 110 and the non-master node device 120 may be connected through a communication network, which may be a wired network or a wireless network.

Optionally, the wireless or wired networks described above use standard communication techniques and/or protocols. The network is typically the Internet, but may be any other network, including, but not limited to, a local area network, a metropolitan area network, a wide area network, a mobile, wired or wireless network, a private network, a virtual private network, or any combination of these. In some embodiments, data exchanged over the network is represented using techniques and/or formats including Hypertext Markup Language (HTML), Extensible Markup Language (XML), and the like. All or some of the links may also be encrypted using conventional encryption techniques such as Secure Sockets Layer (SSL), Transport Layer Security (TLS), virtual private network (VPN), Internet Protocol Security (IPsec), and the like. In other embodiments, custom and/or dedicated data communication techniques may also be used in place of, or in addition to, the data communication techniques described above.

Fig. 2 is a flowchart illustrating a node hyper-convergence upgrading method according to an exemplary embodiment. The method is executed by a computer device, which may be the control device in a cloud service system, and the method comprises the following steps:

step 201, controlling each node device to copy each service in the control virtual manager deployed on each node device to a target container, and deploying the target container on each node device respectively.

Wherein each service comprises a stateless service and a stateful service;

In this embodiment of the present application, the control device may first send an instruction to each node device. Taking any node device as an example, after receiving the instruction from the control device, the node device may copy each service in the control virtual manager CVM deployed on it to a target container and deploy the target container on the node device.

Since the target container holds the services of the control virtual manager CVM, the target container can, to some extent, take over the function of the control virtual manager CVM.

It should be noted, however, that none of the services in the target container has been started at this point; that is, a target container whose services are not yet started is deployed on each node device (i.e., the master node device and the non-master node device).

In the embodiment of the present application, each service is further divided into a stateful service and a stateless service.

Stateful and stateless services are two different service architectures that differ in how they handle service state. The service state is the data required to serve a request and may be a variable or a data structure. A stateless service does not record service state, and different requests are unrelated to each other; a stateful service is the opposite. Whether a server program is a stateful or a stateless service is determined by whether two requests from the same initiator have a context relationship on the server side.

That is, for the stateless service, all data that can be processed by the server side comes from information carried by the request, the processing of a single request of the client side by the stateless service is independent of other requests, and information for processing one request is included in the request; for stateful services, the service stores data information that is context-dependent, and successive requests may be related.
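The distinction can be illustrated with a small sketch. The function `stateless_add` and the class `StatefulCounter` are hypothetical examples chosen only to show the difference; they do not correspond to any service named in the application.

```python
def stateless_add(request: dict) -> int:
    """Stateless: everything needed to answer comes from the request itself."""
    return request["a"] + request["b"]


class StatefulCounter:
    """Stateful: keeps per-client context, so consecutive requests are related."""

    def __init__(self) -> None:
        # Service state: running total per client (a plain in-memory dict here).
        self._totals = {}

    def handle(self, client: str, amount: int) -> int:
        self._totals[client] = self._totals.get(client, 0) + amount
        return self._totals[client]
```

Repeating the same stateless request always yields the same answer on any node, which is why such services can run on both node devices at once; the answer of the stateful service depends on earlier requests, so its data must live in one shared place.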

In the embodiment of the present application, data of stateful services is placed on the distributed storage, and data of stateless services is placed on the operating system disk of each node device.

Step 202, deploying load balancers on the primary node device and the non-primary node device respectively.

After the target containers are deployed on both the master node device and the non-master node device, the control device may further control the master node device and the non-master node device to deploy load balancers.

The load balancer is responsible for traffic distribution: it distributes the received service requests so that the amount of traffic processed by each node device is balanced and reasonable.

Similarly, the load balancers deployed at this point are not started; a not-yet-started load balancer is deployed on both the master node device and the non-master node device.

Step 203, controlling the non-master node device to start the load balancer and the stateless service.

When the control device detects that both the master node device and the non-master node device have been deployed with a not-yet-started target container and load balancer, the control device may start the load balancer and the stateless service on the non-master node device.

At this time, although the load balancer and the stateless service have been started on the non-master node device, the master node device is the node device that directly receives traffic and its load balancer has not been started, so the stateless service on the non-master node device does not yet process traffic. Starting the stateless service and the load balancer on the non-master node device in advance prevents too many services from being started at once in the subsequent steps, which would make the service interruption too long.

Step 204, controlling the master node device to shut down the control virtual manager, and starting a load balancer and a stateful service in the master node device.

After the control device detects that the load balancer and the stateless service on the non-master node device have been started, the control device may send an instruction to the master node device, controlling it to close the control virtual manager CVM and to start the load balancer and the stateful service on the master node device.

At this time, although the control virtual manager CVM is closed, when the master node device receives a service request it can distribute the traffic through the load balancer to the non-master node device. The non-master node device first examines the request: if the request needs a stateless service, the non-master node device accesses the stateless service directly; if the request needs a stateful service, it can access the stateful service on the master node device through a vip (Virtual IP Address) in the non-master node device.

Through the above process, even though the control virtual manager CVM is closed, the stateful service and the stateless service started on the master node device and the non-master node device respectively can take over the function of the control virtual manager CVM.

Step 205, controlling the master node device to start the stateless service on the master node device, and deleting the control virtual manager to complete the node hyper-convergence upgrade.

Finally, the control device controls the master node device to start the stateless service on the master node device and deletes the control virtual manager CVM. At this point, the stateless service and the stateful service are running on the master node device and the stateless service is running on the non-master node device, and the stateless and stateful services deployed in the target containers replace the control virtual manager CVM.

If a developer needs to upgrade some functions of the control virtual manager CVM, the services can be upgraded while they are being copied from the control virtual manager CVM to the target container; replacing the control virtual manager CVM with the target container then accomplishes the upgrade of the control virtual manager CVM.

Because the master node device has now also started the stateless service, when the master node device receives a service request, the request can be distributed evenly between the stateless service on the master node device and the stateless service on the non-master node device; when the request needs a stateful service, the stateful service on the master node device is accessed.
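The routing behaviour described for steps 204 and 205 can be sketched as follows. This is only an illustration under stated assumptions: the `NodeState` class and the `build_router` function are hypothetical, round-robin is assumed as the balancing policy (the application does not name one), and the virtual IP is modelled simply as "the node that owns it".

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class NodeState:
    # Hypothetical view of which services are running on a node device.
    name: str
    stateless_up: bool = False
    stateful_up: bool = False


def build_router(nodes, vip_owner):
    """Return a function mapping a request kind to the node that serves it."""
    pool = [n.name for n in nodes if n.stateless_up]
    if not pool or not vip_owner.stateful_up:
        raise RuntimeError("no backend available")
    round_robin = cycle(pool)  # assumed balancing policy

    def route(kind: str) -> str:
        if kind == "stateful":
            # Stateful requests reach the master's service through the VIP.
            return vip_owner.name
        # Stateless requests are spread over every node running the service.
        return next(round_robin)

    return route
```

After step 204 only the non-master node is in the stateless pool, so it receives every stateless request; after step 205 the master node joins the pool and stateless requests alternate between the two nodes, while stateful requests always go to the master node.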

In summary, to implement the node hyper-convergence upgrade, the control device first copies the stateless services and the stateful services in the control virtual manager CVM to the target container and deploys the target container on each node device. The control device then deploys a load balancer on each node device and controls the non-master node device to start its load balancer and stateless services. Next, the control device controls the master node device to close the control virtual manager CVM and to start the load balancer and the stateful services on the master node device. At this point, although the control virtual manager CVM is closed, the load balancer forwards incoming service requests to the stateless services on the non-master node device, and a stateless service that needs a stateful service can access the stateful service on the master node device; the target container therefore takes the place of the control virtual manager CVM. Finally, the control device controls the master node device to start the remaining stateless services and deletes the control virtual manager CVM, completing the node hyper-convergence upgrade. Because the control virtual manager CVM is replaced by the target container and the stateful and stateless services in the target container are started step by step, upgrade failure caused by deleting the control virtual manager CVM is avoided, each service is started smoothly, and the service interruption time caused by service startup is reduced.
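The ordering of steps 201 to 205 can be sketched as a small simulation. This is a minimal sketch under stated assumptions rather than the method as implemented: the `Node` class, its fields and the `upgrade` function are hypothetical names, and real deployment and start operations are replaced by in-memory state changes.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical in-memory model of one node device.
    name: str
    is_master: bool
    cvm_present: bool = True
    cvm_running: bool = False
    container_deployed: bool = False
    lb_deployed: bool = False
    running: set = field(default_factory=set)


def upgrade(nodes):
    """Apply steps 201 to 205 in order and return an ordered event log."""
    log = []
    master = next(n for n in nodes if n.is_master)
    others = [n for n in nodes if not n.is_master]

    # Step 201: copy the CVM services into a target container on every node.
    for n in nodes:
        n.container_deployed = True
        log.append(f"201 deploy container on {n.name}")

    # Step 202: deploy (but do not start) a load balancer on every node.
    for n in nodes:
        n.lb_deployed = True
        log.append(f"202 deploy load balancer on {n.name}")

    # Step 203: pre-start load balancer and stateless services off the master.
    for n in others:
        n.running |= {"load_balancer", "stateless"}
        log.append(f"203 start lb+stateless on {n.name}")

    # Step 204: close the CVM, then start load balancer and stateful services.
    master.cvm_running = False
    master.running |= {"load_balancer", "stateful"}
    log.append(f"204 switch {master.name} from CVM to container")

    # Step 205: start the remaining stateless services and delete the CVM.
    master.running.add("stateless")
    for n in nodes:
        n.cvm_present = False
    log.append("205 start stateless on master, delete CVM")
    return log
```

The only window in which the management services are unavailable is inside step 204, between closing the CVM and starting the stateful service; everything else is started either before or after that window, which is the point of the staged start.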

Fig. 3 is a flowchart illustrating a node hyper-convergence upgrading method according to an exemplary embodiment. The method is executed by a computer device, which may be the control device in a cloud service system, and the method comprises the following steps:

Step 301, stopping the high-availability components on the respective node devices.

The high-availability components are used for controlling the switching of the storage and of the control virtual manager CVM of each node device.

Before the node hyper-convergence upgrade is carried out, the control device needs to stop the high-availability component on each node device. The high-availability component controls switching of the control virtual manager CVM: when the control virtual manager CVM on the master node device fails, the high-availability component can start the control virtual manager CVM on the non-master node device, thereby switching the control virtual manager CVM.

During the node hyper-convergence upgrade, the control virtual manager CVM on the master node device has to be stopped. The high-availability component therefore needs to be stopped before the upgrade, so that it does not detect the stop as a service fault and switch the control virtual manager CVM.
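Why the high-availability component has to be stopped first can be seen from a small sketch of its failover check. The `HaNode` class and the `ha_check` function are hypothetical and only approximate the HA-service behaviour described above; the real component's detection logic is not given in the application.

```python
from dataclasses import dataclass


@dataclass
class HaNode:
    # Hypothetical model: whether the CVM runs on this node and is healthy.
    name: str
    cvm_running: bool = False
    cvm_healthy: bool = True


def ha_check(master: HaNode, standby: HaNode, ha_enabled: bool = True) -> bool:
    """Run one HA check; return True if the CVM was switched to the standby."""
    if not ha_enabled:
        return False  # HA stopped: a closed CVM is left alone
    if master.cvm_running and master.cvm_healthy:
        return False  # nothing to do
    # A stopped or faulty CVM looks like a failure: start it on the other node.
    master.cvm_running = False
    standby.cvm_running = True
    return True
```

With the component still enabled, deliberately closing the CVM in step 204 would be indistinguishable from a fault and would bring the CVM back up on the non-master node, which defeats the upgrade.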

In a possible implementation manner, the state of each node device is detected;

and when detecting that each node device is in a healthy state, stopping the high-availability component of each node device.

Before stopping the high-availability components and starting the node hyper-convergence upgrade, the control device also needs to check the health state of each node device. When every node device is healthy and meets the requirements of the upgrade, the high-availability component of each node device is stopped; if at least one node device is unhealthy, the node hyper-convergence upgrade is terminated.

In a possible implementation manner, when it is detected that each node device is in a healthy state, data backup is performed on target data in each node device;

and the high-availability components of each node device are stopped after it is detected that the backup of the target data in each node device is complete.

Further, to guard against errors during the upgrade, the control device may first back up target data (e.g., predetermined important data) in each node device; if the upgrade fails because of an uncontrollable factor, the environment can be rolled back using the backup data.
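The order of the checks in step 301 (health check, then backup, then stopping the high-availability components) can be sketched as follows. The `prepare_upgrade` function, the dictionary keys and the `backup` callback are hypothetical; aborting when a backup fails is an assumption of this sketch, since the application only states that the components are stopped after the backup completes.

```python
def prepare_upgrade(nodes, backup):
    """Gate the upgrade on health and backup before stopping HA.

    nodes:  list of dicts with keys 'name', 'healthy', 'ha_running'
    backup: callable taking a node dict and returning True on success
    """
    # 1. Every node device must be healthy, otherwise terminate the upgrade.
    if not all(n["healthy"] for n in nodes):
        return "aborted: unhealthy node"

    # 2. Back up the target data on every node device.
    for n in nodes:
        if not backup(n):
            return "aborted: backup failed"  # assumption, see lead-in

    # 3. Only now stop the high-availability component on every node device.
    for n in nodes:
        n["ha_running"] = False
    return "ready"
```

Stopping the high-availability components last means that a cluster which fails either check is left exactly as it was, with failover still active.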

Step 302, upgrading the distributed storage shared by the node devices, and the services and operating systems on the node devices.

Before the node hyper-convergence upgrade is carried out, the distributed storage shared by the nodes and the services and operating system on each node device may be upgraded first. The distributed storage is upgraded first because important data of the CVM is placed on the distributed storage, and the distributed storage in turn relies on the host operating system and the services on the host.

That is, before the node hyper-convergence upgrade, the storage service of the distributed storage shared by the nodes may be upgraded. The storage is upgraded because the CVM uses it. The distributed storage in the embodiment of the present application is a service running on 2 physical machines that form a distributed storage cluster, and it can provide functions such as block storage, object storage and file storage.

Further, in the process of upgrading the storage, the method can be realized by the following steps:

1. on the node without the vip, live migrating the running virtual machines to the opposite-end server;

2. shutting down the storage service of the node without the vip; because the storage keeps double copies, the node where the vip is located can continue to operate;

3. upgrading the storage service software package on the non-vip node;

4. starting the storage service on the non-vip node; because the vip node takes over the service during the upgrade and data continues to be written, waiting until the data on the 2 nodes is completely consistent before upgrading the vip node;

5. migrating the virtual machines to the non-vip node;

6. upgrading the storage service of the vip node;

7. starting the storage service of the vip node; the storage upgrade is finished after data balancing completes.
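As a non-limiting illustration, the ordering of the storage upgrade steps above can be simulated with the following Python sketch. The function name, the node names and the log strings are hypothetical. The sketch also assumes that the storage service of the vip node is stopped while it is upgraded, which the steps above do not state explicitly.

```python
from __future__ import annotations


def rolling_storage_upgrade(vip_node: str, peer_node: str) -> list[str]:
    """Simulate the non-vip-first storage upgrade and return the action log."""
    online = {vip_node: True, peer_node: True}    # storage service state
    version = {vip_node: "old", peer_node: "old"}
    log: list[str] = []

    def upgrade(node: str, other: str) -> None:
        log.append(f"migrate VMs from {node} to {other}")
        online[node] = False                       # stop storage on this node
        # Double copies: the other node must keep serving while this one is down.
        assert online[other], "no storage copy left online"
        version[node] = "new"
        online[node] = True                        # start storage again
        log.append(f"upgraded {node}; wait for data sync")

    upgrade(peer_node, vip_node)   # steps 1 to 4: non-vip node first
    upgrade(vip_node, peer_node)   # steps 5 to 7: then the vip node
    assert all(v == "new" for v in version.values())
    return log
```

The inner assertion expresses the property the double-copy storage is relied on for: at every moment at least one copy of the storage service remains online.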

Further, the operating system, firmware, drivers and BIOS of the physical machine on which each node is deployed may optionally be upgraded.

For example, in the embodiment of the present application, the upgrading process of the operating system, the firmware, the drivers, and the BIOS of the physical machine may be implemented by the following steps:

2. upgrading the operating system of the non-vip node; taking CentOS as an example, the upgrade is performed with the 'yum update -y' command;

3. upgrading the BIOS, RAID driver and the like on the non-vip node through the web interface of the IPMI (Intelligent Platform Management Interface) system;

4. restarting the non-vip node;

5. after the non-vip node has started, because the vip node takes over the service during the upgrade and data continues to be written, waiting until the data on the 2 nodes is completely consistent before upgrading the vip node;

6. migrating the virtual machines to the non-vip node;

7. upgrading the operating system and drivers of the vip node using the same method as above;

8. starting the vip node; the upgrade is finished after data balancing completes.

Step 303, controlling each node device to copy each service in the control virtual manager CVM deployed on each node device to a target container, and respectively deploying the target container on each node device.

The respective services include stateless services and stateful services.

The control device controls each node device to copy the stateful services deployed on that node device into a stateful container, and to copy the stateless services deployed on that node device into a stateless container.

In the embodiment of the application, the old version of the CVM runs multiple services, including both stateful and stateless services. The services in the CVM can therefore be split into two containers, one holding the stateless services and one holding the stateful services, so that during the CVM upgrade either container can be started on its own to start the stateful services or the stateless services independently.
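As a non-limiting illustration, splitting the CVM services into a stateful container and a stateless container can be sketched in Python as follows. The function name and the service names used in the usage note are hypothetical; the embodiment does not enumerate the services running in the CVM.

```python
from __future__ import annotations


def split_services(services: dict[str, bool]) -> dict[str, list[str]]:
    """Group services into two containers.

    `services` maps a service name to True if the service is stateful.
    """
    containers: dict[str, list[str]] = {"stateful": [], "stateless": []}
    for name, is_stateful in sorted(services.items()):
        key = "stateful" if is_stateful else "stateless"
        containers[key].append(name)
    return containers
```

For example, `split_services({"api": False, "database": True})` places the hypothetical `database` service in the stateful container and `api` in the stateless one, so that either container can later be started independently.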

Step 304, deploying load balancers on the primary node device and the non-primary node device, respectively.

Step 305, controlling the non-master node device to start the load balancer and the stateless service.

Step 306, controlling the main node device to close the control virtual manager CVM, and starting a load balancer and a stateful service in the main node device.

Steps 304 to 306 refer to steps 202 to 204 in the embodiment shown in fig. 2, and are not described herein again.

In a possible implementation manner, when it is detected that the master node device triggers an upgrade failure condition, the stateful service and the load balancer in the master node device are closed, and the control virtual manager CVM is started; the upgrade failure condition includes at least one of:

the control virtual manager CVM fails to close;

the load balancer fails to start;

the stateful service fails to start.

That is, in step 306, an error may occur while the master node device closes the control virtual manager CVM and starts the load balancer and the stateful service in the master node device. When the control device detects that the master node device triggers an upgrade failure condition, it shuts down the stateful service and the load balancer, immediately starts the old CVM, and finally informs the user of the upgrade failure through the cloud management platform.
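As a non-limiting illustration, the failure handling of step 306 can be sketched in Python as follows. The `MasterNode` class, its `failing_steps` test hook, the step names and the `notify` callback are all hypothetical; the callback merely stands in for the cloud management notification.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class MasterNode:
    cvm_running: bool = True
    lb_running: bool = False
    stateful_running: bool = False
    failing_steps: set[str] = field(default_factory=set)  # test hook (hypothetical)

    def run(self, step: str) -> None:
        if step in self.failing_steps:
            raise RuntimeError(f"{step} failed")
        if step == "stop_cvm":
            self.cvm_running = False
        elif step == "start_lb":
            self.lb_running = True
        elif step == "start_stateful":
            self.stateful_running = True


def switch_master(node: MasterNode, notify=print) -> bool:
    """Run step 306; on any failure, roll back to the old CVM."""
    try:
        for step in ("stop_cvm", "start_lb", "start_stateful"):
            node.run(step)
        return True
    except RuntimeError as err:
        # Rollback: close the stateful service and load balancer, restart the old CVM.
        node.stateful_running = False
        node.lb_running = False
        node.cvm_running = True
        notify(f"upgrade failed: {err}")
        return False
```

Each of the three failure conditions listed above maps to one step raising an error, and all three lead to the same rollback path.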

Step 307, controlling the master node device to start the stateless service on the master node device.

Step 308, when detecting that the stateless service on the master node device is started, starting the high-availability components on the respective node devices.

After the stateless service of the master node device is started, the master node device can process requests for both the stateless service and the stateful service. Similarly, the non-master node device can process requests for the stateless service, and when the stateful service is needed, it can request the stateful service on the master node device through the vip address. Under normal conditions, therefore, neither the master node device nor the non-master node device experiences a service failure, and the high-availability component can be started at this point.

Step 309, delete the control virtual manager CVM on each node device to complete node hyper-convergence upgrade.

After the above steps are completed, the control device can control each node device to delete the control virtual manager CVM deployed on it, thereby completing the node hyper-convergence upgrade process.

Referring to fig. 4, a block diagram of a flow of node hyper-convergence upgrade according to an embodiment of the present application is shown. As shown in fig. 4, the node hyper-convergence upgrade logic according to the embodiment of the present application may be as follows:

1. Before upgrading, a health inspection and an important-data backup are carried out on the environment. The health inspection aims to find problems in the environment in time, so that any problem can be solved before upgrading and secondary failures during upgrading are prevented. The important data is backed up so that, if an uncontrollable factor causes the upgrade to fail, the environment can be rolled back using the backed-up data.

Referring to fig. 5, a logic diagram of a node before upgrade according to an embodiment of the present application is shown. As shown in fig. 5, a CVM, a high-availability service (HA-service), and a distributed storage service are deployed on each of the 2 physical machines, where physical machine 1 serves as the master node device and physical machine 2 serves as the non-master node device. The CVM is started only on the master node device, the master/slave mode of the CVM is controlled through the HA-service, and the distributed storage services on the 2 nodes form the distributed storage resource pool.

2. The HA-service, which controls switching of the storage and the CVM, is stopped. Because the upgrade process actively operates on other services, stopping the HA-service prevents it from detecting a service fault and performing a failover.

3. The distributed storage service and the operating system on the physical machines are upgraded. This is because important data of the CVM is placed on the distributed storage, and the distributed storage depends on the host operating system and the services on the host. They are therefore upgraded first, before the CVM is upgraded.

4. The services on the old CVM are split from a single unit into multiple containers. The split containers are created on the 2 physical machines (i.e., the master node device and the non-master node device), and one load balancer is deployed on each of the 2 physical machines.

Refer to fig. 6, which illustrates a container creation diagram according to an embodiment of the present application. As shown in fig. 6, the services in the blue boxes are newly deployed on the 2 nodes; LB (LoadBalance) denotes the load balancer, which is responsible for traffic distribution. The stateless services and the stateful services are split out of the services in the CVM. At this point the services are deployed but not started; they are started step by step below.

5. The load balancer and the stateless service are started on the non-master node. Starting part of the services first allows the subsequent switchover to be fast and imperceptible.

Please refer to fig. 7, which illustrates a schematic diagram of service initiation of a non-master node device according to an embodiment of the present application. As shown in fig. 7, on the physical machine 1 (i.e., the master node device), the LB, the stateless service, and the stateful service are not started, while on the physical machine 2 (i.e., the non-master node device), the LB and the stateless service are started, and the stateful service is not started.

6. The old CVM is stopped on the master node, and the stateful service and the load balancer of that node are started. If any part of this step fails, the stateful service and the load balancer are shut down, the old CVM is started immediately, and finally the user is informed of the upgrade failure through the cloud management platform and prompted to seek professional support.

Please refer to fig. 8, which shows a schematic diagram of service initiation of a master node device according to an embodiment of the present application. As shown in fig. 8, the master node device (i.e., physical machine 1) stops the CVM and starts the stateful service and the load balancer (LB) service. At this time, requests reach the LB of the master node through the vip address and are distributed by the LB; the LB automatically checks the back-end services and distributes the requests to physical machine 2. If a stateless service needs to access the stateful service, it accesses the stateful service of physical machine 1 via the vip.

7. The stateless service is started on the master node, and the HA-service is started on the 2 nodes.

Please refer to fig. 9, which illustrates a schematic diagram of the node devices after the services are completely started according to an embodiment of the present application. As shown in fig. 9, in the master node device (i.e., physical machine 1), the load balancer (LB) service, the stateless service, the stateful service, and the HA-service are all in the started state, and the CVM is in the stopped state; in the non-master node device (i.e., physical machine 2), the LB service, the stateless service and the HA-service are all in the started state, while the stateful service and the CVM are not started.

8. The environment is cleaned up, and the old CVM and the CVM images are deleted.

Referring to fig. 10, a schematic diagram of a system architecture after completing upgrade according to an embodiment of the present application is shown. As shown in fig. 10, after the environment is cleaned up, the load balancer (LB) service, the stateless service, the stateful service, the HA-service and the distributed storage are deployed on physical machine 1; the LB service, the stateless service, the stateful service, the HA-service and the distributed storage are also deployed on physical machine 2, but, unlike on physical machine 1, the stateful service on physical machine 2 is not started.
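As a non-limiting illustration of the traffic distribution described for fig. 8, the back-end selection performed by the load balancer can be sketched in Python as follows. The function name and the node names `pm1` and `pm2` are hypothetical. For simplicity the sketch returns the first node (by name) on which the requested service is running, whereas a real load balancer would also spread requests across all running back ends.

```python
from __future__ import annotations


def pick_backend(kind: str, running: dict[str, set[str]]) -> str | None:
    """Return a node on which the `kind` service is running, or None."""
    for node in sorted(running):
        # The back-end check: only nodes with the service started are eligible.
        if kind in running[node]:
            return node
    return None
```

With the fig. 8 state, `{"pm1": {"lb", "stateful"}, "pm2": {"lb", "stateless"}}`, a stateless request is sent to `pm2` and a stateful request to `pm1`, which corresponds to the behaviour described for that figure.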

In summary, to implement the node hyper-convergence upgrade, the control device may first copy the stateless services and the stateful services in the CVM to the target container and deploy the target container on each node device. The control device then deploys a load balancer on each node device, and controls the non-master node device to start the load balancer and the stateless services. Next, the control device controls the master node device to close the control virtual manager CVM and to start the load balancer and the stateful services in the master node device. At this point, although the control virtual manager CVM is closed, the load balancer can forward a received request to the stateless services in the non-master node device, and if a stateless service needs to access a stateful service, it can access the stateful service in the master node device; the target container thus takes the place of the closed control virtual manager CVM. Finally, the control device controls the master node device to start the remaining stateless services and deletes the control virtual manager CVM, completing the node hyper-convergence upgrade. In this scheme, the control virtual manager CVM is replaced by the target container, and the stateful and stateless services in the target container are started step by step, so that each service is started smoothly while avoiding the upgrade failure that deleting the control virtual manager CVM could otherwise cause, and the service interruption time caused by service start-up is reduced.
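As a non-limiting illustration, the step-by-step start-up order summarized above (steps 5 to 7 of the fig. 4 flow) can be replayed with the following Python sketch. The node names `pm1` (master) and `pm2` (non-master), the service labels and the `run_steps` function are hypothetical; distributed storage and the final clean-up are omitted.

```python
from __future__ import annotations

# Each tuple is (node, action, service), listed in the order of steps 5 to 7.
STEPS = [
    ("pm2", "start", "lb"),          # step 5: non-master starts the load balancer
    ("pm2", "start", "stateless"),   # step 5: and the stateless service
    ("pm1", "stop", "cvm"),          # step 6: master closes the old CVM
    ("pm1", "start", "stateful"),    # step 6: starts the stateful service
    ("pm1", "start", "lb"),          # step 6: and its load balancer
    ("pm1", "start", "stateless"),   # step 7: master starts the stateless service
    ("pm1", "start", "ha"),          # step 7: HA-service is started on both nodes
    ("pm2", "start", "ha"),
]


def run_steps(steps=STEPS) -> dict[str, set[str]]:
    """Apply the steps in order and return the running services per node."""
    # Starting point: CVM runs only on the master, HA-service already stopped.
    running: dict[str, set[str]] = {"pm1": {"cvm"}, "pm2": set()}
    for node, action, service in steps:
        if action == "start":
            running[node].add(service)
        else:
            running[node].discard(service)
    return running
```

Replaying only the first five steps yields the intermediate state of fig. 8, and replaying all of them yields the state of fig. 9, in which the stateful service runs only on the master node.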

Please refer to fig. 11, which illustrates a node hyper-convergence upgrade apparatus according to an embodiment of the present application. The device is used for a control device in a cloud service system; the cloud service system further includes each node device including a master node device and a non-master node device, and the apparatus includes:

a container deployment module 1101, configured to control each node device to copy each service in the control virtual manager deployed on each node device to a target container, and deploy the target container on each node device respectively; each service comprises a stateless service and a stateful service;

a load balancing deployment module 1102, configured to deploy load balancers on the master node device and the non-master node device respectively;

a non-master node starting module 1103, configured to control the non-master node device to start the load balancer and the stateless service;

a master node starting module 1104, configured to control the master node device to close the control virtual manager, and start a load balancer and a stateful service in the master node device;

and a manager deleting module 1105, configured to control the master node device to start a stateless service on the master node device, and delete the control virtual manager, so as to complete node hyper-convergence upgrade.

In one possible implementation, the high-availability component control module is further configured to:

detecting the state of each node device;

and stopping the high-availability components of each node device after the target data backup in each node device is detected to be completed.

In one possible implementation, the apparatus further includes:

the upgrade failure condition comprises at least one of:

the control virtual manager fails to close;

the load balancer fails to start;

the stateful service fails to start.

In summary, to implement the node hyper-convergence upgrade, the control device may first copy the stateless services and the stateful services in the control virtual manager to the target container and deploy the target container on each node device. The control device then deploys a load balancer on each node device, and controls the non-master node device to start the load balancer and the stateless services. Next, the control device controls the master node device to close the control virtual manager and to start the load balancer and the stateful services in the master node device. At this point, although the control virtual manager is closed, the load balancer can forward a received request to the stateless services in the non-master node device, and if a stateless service needs to access a stateful service, it can access the stateful service in the master node device; the target container thus takes the place of the closed control virtual manager. Finally, the control device controls the master node device to start the remaining stateless services and deletes the control virtual manager, completing the node hyper-convergence upgrade. In this scheme, the control virtual manager is replaced by the target container, and the stateful and stateless services in the target container are started step by step, so that each service is started smoothly while avoiding the upgrade failure that deleting the control virtual manager could otherwise cause, and the service interruption time caused by service start-up is reduced.

Refer to fig. 12, which is a schematic diagram of a computer device according to an exemplary embodiment of the present application. The computer device includes a memory and a processor; the memory stores a computer program which, when executed by the processor, implements the method described above.

The processor may be a Central Processing Unit (CPU). The processor may also be another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or a combination thereof.

The memory, which is a non-transitory computer readable storage medium, may be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules corresponding to the methods of the embodiments of the present invention. The processor executes various functional applications and data processing of the processor by executing non-transitory software programs, instructions and modules stored in the memory, that is, the method in the above method embodiment is realized.

The memory may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created by the processor, and the like. Further, the memory may include high speed random access memory, and may also include non-transitory memory, such as at least one disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes memory located remotely from the processor, and such remote memory may be coupled to the processor via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

In an exemplary embodiment, a computer readable storage medium is also provided for storing at least one computer program, which is loaded and executed by a processor to implement all or part of the steps of the above method. For example, the computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a Compact Disc Read-Only Memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, and the like.

In an exemplary embodiment, a computer program product or a computer program is also provided, which comprises computer instructions, which are stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform all or part of the steps of the method described in any of the embodiments of fig. 2 or fig. 3.

Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.

It will be understood that the present application is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims

1. A node hyper-convergence upgrading method is characterized in that the method is applied to control equipment in a cloud service system; the cloud service system further comprises each node device, each node device comprises a master node device and a non-master node device, and the method comprises the following steps:

respectively deploying load balancers on the main node equipment and the non-main node equipment;

2. The method of claim 1, prior to said controlling said respective node devices to split respective services in a control virtual manager deployed on said respective node devices into at least two containers, further comprising:

upgrading the distributed storage shared by the node devices, and the service and the operating system on each node device.

3. The method of claim 2, wherein before the upgrading the distributed storage shared by the node devices, further comprising:

stopping highly available components on the respective node devices; the high availability component is used for controlling storage of each node device and switching of the virtual manager.

4. The method of claim 3, wherein controlling the master node device to initiate stateless services on the master node device and remove the control virtual manager to complete a node hyper-convergence upgrade comprises:

when detecting that the stateless service on the master node device is started, starting the high-availability component on each node device;

5. The method of claim 3, wherein said stopping highly available components on said respective node devices comprises:

detecting the state of each node device;

6. The method of claim 5, wherein said ceasing highly available components of said respective node device upon detecting said respective node device is in a healthy state comprises:

7. The method of any of claims 1 to 6, further comprising:

when it is detected that the master node device triggers an upgrade failure condition, the stateful service and the load balancer in the master node device are closed, and the control virtual manager is started;

the upgrade failure condition comprises at least one of:

the control virtual manager fails to close;

the load balancer fails to start;

the stateful service fails to start.

8. The method of any one of claims 1 to 6, wherein the target containers comprise at least a stateful container and a stateless container;

the controlling the node devices to copy each service in the control virtual manager deployed on each node device to a target container, and respectively deploy the target container on each node device, includes:

controlling each node device, copying the stateful services deployed on each node device into a stateful container, and copying the stateless services deployed on each node device into a stateless container;

9. A node hyper-convergence upgrading device is characterized in that the device is used for a control device in a cloud service system; the cloud service system further includes each node device including a master node device and a non-master node device, and the apparatus includes:

10. A computer device comprising a processor and a memory, the memory having stored therein at least one instruction that is loaded and executed by the processor to implement the node hyper-convergence upgrade method of any of claims 1 to 8.

11. A computer-readable storage medium having stored therein at least one instruction which is loaded and executed by a processor to implement the node hyper-convergence upgrade method of any one of claims 1 to 8.