CN117608494A

CN117608494A - Storage method and system of cloud computing cluster

Info

Publication number: CN117608494A
Application number: CN202311664735.XA
Authority: CN
Inventors: 王冬冬; 于英涛; 刘红敏
Original assignee: Taiji Computer Corp Ltd
Current assignee: Taiji Computer Corp Ltd
Priority date: 2023-12-06
Filing date: 2023-12-06
Publication date: 2024-02-27

Abstract

The application provides a storage method and a storage system of a cloud computing cluster. The method comprises the following steps: establishing a unique corresponding main virtual machine on each server in the cloud computing cluster; for each main virtual machine, determining, according to the working condition of the main virtual machine, whether to back up its system disk and its data disk, and determining a backup mode for each disk separately; selecting, for each main virtual machine that needs backup, a target server on which a backup virtual machine is established, and backing up the main virtual machine to the target server according to the determined backup mode; and, when a fault server exists in the cloud computing cluster, controlling the backup virtual machine corresponding to the main virtual machine on the fault server to take over the service of the main virtual machine, and recovering the main virtual machine from the data of the corresponding backup virtual machine after the fault server resumes operation. The method can avoid cluster-wide faults, improve the reliability of the data stored in the cluster, and improve the resource utilization of data storage.

Description

Storage method and system of cloud computing cluster

Technical Field

The present application relates to the field of cloud computing technologies, and in particular to a storage method and a storage system of a cloud computing cluster.

Background

With the development of cloud computing technology, cloud platforms and the products derived from them, such as cloud desktops, are becoming increasingly common in various fields. In cloud computing, storage is a key link that affects the core functions, the running performance and the reliability of a system.

In the related art, a cloud computing cluster formed by a plurality of servers generally adopts a distributed file system, most commonly the open-source distributed file system Ceph. Through the distributed file system, each hard disk of each server serves as a storage node, the storage nodes together form one large file system, and this file system is accessed in client/server (C/S) mode. As a result, functions such as fast virtual machine migration can be realized while the storage location remains unchanged.

However, in practical applications a distributed file system is still a file system, and it can become damaged and unavailable. For example, if a single write enters an abnormal state and cannot be restored to a normal state, the write operations of the entire system may be blocked, which creates the risk that the entire cluster becomes unavailable or even that all cluster data is lost. In addition, when a cloud cluster stores data in the related art, every file or block device is sliced and the slices are scattered across different storage nodes (hard disks) of different servers, so a single disk may hold slices of all of the data. To ensure data reliability, a three-replica mode is generally adopted, that is, three copies of the same slice are stored on disks of different servers. This reduces the risk of losing system data and avoids the loss of all data when two disks on different servers are damaged at the same time. The problem with this storage mode is that any data, whatever it is, must be written in three copies across servers, and a write operation completes only after all three copies have been written. This limits the storage space utilization to at most 1/3 and leads to high consumption of computing and storage performance, high write latency, and the like.

Therefore, how to implement a cloud computing storage mode that avoids the risk of whole-cluster failure while offering lower operation latency and higher resource utilization is a problem that currently needs to be solved.

Disclosure of Invention

The present application aims to solve, at least to some extent, one of the technical problems in the related art.

Therefore, a first objective of the present application is to provide a method for storing a cloud computing cluster, which can avoid the risk of cluster failure and even cluster data loss caused by unavailable storage in cloud computing applications such as a cloud platform and a cloud desktop, and simultaneously improve storage efficiency and reduce occupation of computing resources and storage space.

A second object of the present application is to provide a storage system of a cloud computing cluster.

A third object of the present application is to provide a non-transitory computer-readable storage medium.

To achieve the above object, an embodiment of a first aspect of the present application provides a storage method of a cloud computing cluster, including the following steps:

establishing a unique corresponding main virtual machine on each server in the cloud computing cluster, wherein the main virtual machine corresponding to each server is not influenced by other servers;

for each main virtual machine, determining whether to back up a system disk and a data disk of the main virtual machine according to the working condition of the main virtual machine, and determining a backup mode of each disk respectively;

selecting a target server for establishing a backup virtual machine for each main virtual machine needing backup, and backing up the main virtual machine to the target server according to the determined backup mode;

and under the condition that a fault server exists in the cloud computing cluster, controlling a backup virtual machine corresponding to a main virtual machine in the fault server to take over a service task of the main virtual machine, and recovering the main virtual machine according to the data of the corresponding backup virtual machine after the fault server is recovered to operate.

Optionally, in an embodiment of the present application, after controlling the backup virtual machine corresponding to the main virtual machine in the fault server to take over the service task of the main virtual machine, the method further includes: selecting a server for rebuilding from the cloud computing cluster, and rebuilding the main virtual machine or the backup virtual machine of the fault server on the server for rebuilding.

Optionally, in an embodiment of the present application, the backup modes include real-time backup and asynchronous backup. Backing up the main virtual machine to the target server according to the asynchronous backup mode includes the following steps: setting the time point, the time interval and the number of retained restore points of the asynchronous backup, and executing an asynchronous backup task each time the time point and the time interval are reached; creating a backup virtual machine on the target server, and, if a data disk backup task exists, creating a backup data disk on the target server; creating a backup file on the target server for each disk to be backed up, and, at each backup, exporting the changed part of each disk to be backed up to the corresponding backup file through a bypass technique; merging the updated backup file and the existing backup files into a complete backup file, and overwriting the corresponding disk of the backup virtual machine with the complete backup file; mounting the overwritten backup data disk to the backup virtual machine if it is not yet mounted; and deleting redundant backup files according to the restore points.

Optionally, in an embodiment of the present application, exporting the changed part of each disk to be backed up to the corresponding backup file through the bypass technique includes: during the first backup, copying the complete content of each disk to be backed up in the main virtual machine into the corresponding backup file; adding a file change identifier to each disk to be backed up in the main virtual machine; and, at each subsequent backup, determining from the file change identifier the data updated since the last backup finished, and extracting the updated data into the corresponding backup file through the bypass technique.
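As a purely illustrative, non-limiting sketch of the change-identifier idea described above, the following Python fragment models a disk as a mapping from block index to content and records which blocks were written since the last export. The names TrackedDisk and export_backup, and the block-level granularity, are assumptions made for illustration and are not taken from the present application; the bypass extraction itself (reading without disturbing the running virtual machine) is not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class TrackedDisk:
    """Hypothetical in-memory disk that records which blocks have changed."""
    blocks: dict[int, bytes] = field(default_factory=dict)
    # Stands in for the "file change identifier" added to each disk.
    changed: set[int] = field(default_factory=set)

    def write(self, index: int, data: bytes) -> None:
        self.blocks[index] = data
        self.changed.add(index)  # mark the block as modified since the last backup


def export_backup(disk: TrackedDisk, first: bool) -> dict[int, bytes]:
    """Return the full content on the first backup, otherwise only changed blocks."""
    if first:
        exported = dict(disk.blocks)
    else:
        exported = {i: disk.blocks[i] for i in sorted(disk.changed)}
    disk.changed.clear()  # the next backup starts from a clean marker set
    return exported
```

In this sketch a first call with first=True yields the complete disk content, and every later call yields only the blocks written in between, which is the incremental behaviour the paragraph describes.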

Optionally, in an embodiment of the present application, backing up the main virtual machine to the target server according to the real-time backup mode includes: executing a real-time backup task according to a real-time backup instruction; creating a backup virtual machine on the target server, and overwriting the standby system disk of the backup virtual machine with the system disk of the main virtual machine; if a data disk backup task exists, creating a backup data disk on the target server and mounting it to the backup virtual machine; connecting the corresponding disk files of the main virtual machine and the backup virtual machine, and, whenever a data modification event occurs, transmitting the changed part of each disk to be backed up in the main virtual machine to the corresponding disk file of the backup virtual machine in real time through a bypass technique; and disconnecting the main virtual machine from the backup virtual machine according to a received stop-backup instruction.
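The event-driven character of the real-time backup mode can be sketched as follows. This is a minimal in-memory model under stated assumptions: the class MirroredDisk and its methods connect, disconnect and write are hypothetical names, a disk is reduced to a dictionary of blocks, and the network transfer between servers is replaced by a direct dictionary update.

```python
from typing import Optional


class MirroredDisk:
    """Hypothetical main-VM disk that forwards each write to a connected backup disk."""

    def __init__(self, initial: dict[int, bytes]) -> None:
        self.blocks = dict(initial)
        self._sink: Optional[dict[int, bytes]] = None

    def connect(self, backup_blocks: dict[int, bytes]) -> None:
        # Initial full overwrite of the backup disk, then start forwarding changes.
        backup_blocks.clear()
        backup_blocks.update(self.blocks)
        self._sink = backup_blocks

    def disconnect(self) -> None:
        # Corresponds to handling the stop-backup instruction.
        self._sink = None

    def write(self, index: int, data: bytes) -> None:
        self.blocks[index] = data
        if self._sink is not None:
            # The data modification event triggers transfer of the changed part only.
            self._sink[index] = data
```

After connect, every write to the main disk is reflected in the backup disk; after disconnect, later writes stay local.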

Optionally, in an embodiment of the present application, recovering the main virtual machine according to the data of the corresponding backup virtual machine includes: extracting the data in the system disk and/or the data disk of the backup virtual machine, and transmitting the extracted data by compressed transmission to the corresponding disk of the main virtual machine to overwrite it; and starting the main virtual machine so that it resumes the service task, and stopping the backup virtual machine.

Optionally, in an embodiment of the present application, rebuilding the main virtual machine or the backup virtual machine of the fault server on the server for rebuilding includes: determining the type of the virtual machine to be rebuilt, rebuilding that virtual machine on the server for rebuilding, and synchronizing the data of the normal virtual machine corresponding to the virtual machine to be rebuilt into the rebuilt virtual machine through a bypass technique; and re-establishing the correspondence between the main virtual machine and the standby virtual machine, and re-executing the backup task according to the backup mode that was preset before the fault server failed.

To achieve the above object, a second aspect of the present application further provides a storage system of a cloud computing cluster, including:

the establishing module is used for establishing a unique corresponding main virtual machine on each server in the cloud computing cluster, wherein the main virtual machine corresponding to each server is not affected by other servers;

the determining module is used for determining whether the system disk and the data disk of the main virtual machine are backed up or not according to the working condition of the main virtual machine and determining the backup mode of each disk respectively for each main virtual machine;

the backup module is used for selecting a target server for establishing a backup virtual machine for each main virtual machine needing backup, and backing up the main virtual machine to the target server according to the determined backup mode;

and the control module is used for controlling the backup virtual machine corresponding to the main virtual machine in the fault server to take over the service task of the main virtual machine under the condition that the fault server exists in the cloud computing cluster, and recovering the main virtual machine according to the data of the corresponding backup virtual machine after the fault server is recovered to operate.

In order to implement the above embodiment, a third aspect of the present application further proposes a non-transitory computer-readable storage medium having stored thereon a computer program that, when executed by a processor, implements the storage method of the cloud computing cluster in the above embodiment.

The technical scheme provided by the embodiment of the application at least brings the following beneficial effects: the present application is based on

Therefore, the risk that the whole cloud computing cluster becomes unavailable because the distributed file system is unavailable, and even the severe risk that all data is lost, is fundamentally avoided. The number of copies that a distributed file system would require is reduced, and whether to back up, and whether to use real-time backup or asynchronous backup, can be set as required, so data storage performance is better and resource utilization is higher, and the servers in the cluster need not be identically configured merely to keep a poorly performing server from affecting overall performance. Storage space utilization is higher, computing consumption is smaller and operation latency is lower, the occupation of computing resources and storage space is reduced, system recovery and rebuilding are faster, the reliability of the cluster is guaranteed through a simpler implementation, and the cost is greatly reduced. Moreover, the method retains the high storage utilization of a single server storing its own virtual machines, while avoiding the risk of cluster faults and cluster data loss in a cloud computing cluster formed by a plurality of servers. The reliability, flexibility and storage efficiency of data storage in the cloud computing cluster are thereby improved, and the running performance of the cloud computing cluster is guaranteed.

Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.

Drawings

The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings, in which:

FIG. 1 is a flowchart of a storage method of a cloud computing cluster according to an embodiment of the present application;

FIG. 2 is a flowchart of a method for recovering a primary virtual machine according to an embodiment of the present application;

FIG. 3 is a flowchart of a method for rebuilding a virtual machine according to an embodiment of the present application;

FIG. 4 is a flowchart of an asynchronous backup method of a virtual machine according to an embodiment of the present application;

FIG. 5 is a flowchart of a method for heterogeneous real-time backup of a virtual machine according to an embodiment of the present application;

FIG. 6 is a schematic structural diagram of a storage system of a cloud computing cluster according to an embodiment of the present application.

Detailed Description

Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative and intended to explain the present invention and should not be construed as limiting the invention.

The following describes in detail a storage method and a system of a cloud computing cluster according to an embodiment of the present application with reference to the accompanying drawings.

FIG. 1 is a flowchart of a storage method of a cloud computing cluster according to an embodiment of the present application. As shown in FIG. 1, the method includes the following steps:

Step S101, a unique corresponding main virtual machine is established on each server in the cloud computing cluster, wherein the main virtual machine corresponding to each server is not affected by other servers.

Specifically, a locally stored virtual machine, namely a main virtual machine, is established on each server in the cloud computing cluster, and each main virtual machine provides services on the server where it is located. The main virtual machine can be created on each server in the cluster either according to a user's creation instruction or automatically, and after creation its virtual disks are allocated on the same server. This ensures that the virtual machine on each server works independently, unaffected by the other servers in the cluster, and facilitates the subsequent backup of the virtual machine.

In one embodiment of the present application, the virtual disks of the created main virtual machine include one system disk and zero, one or more data disks, and the relationship between the data disks and the system disk is determined when the virtual disks are created. The created main virtual machine can run independently on the server where it is located, and the hard disks of the virtual machine may be stored in forms including files, raw partitions, network storage, and the like.
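The disk layout described in this embodiment (one system disk, zero or more data disks, each stored as a file, a raw partition or network storage) can be captured in a small data model. The following Python sketch is an assumption-laden illustration only; the names StorageForm, VirtualDisk and MainVM and the size_gib field are hypothetical and do not appear in the present application.

```python
from dataclasses import dataclass, field
from enum import Enum


class StorageForm(Enum):
    """Storage forms named in the embodiment."""
    FILE = "file"
    RAW_PARTITION = "raw_partition"
    NETWORK = "network"


@dataclass
class VirtualDisk:
    name: str
    size_gib: int  # illustrative attribute, not specified by the source
    form: StorageForm = StorageForm.FILE


@dataclass
class MainVM:
    server: str                      # the single server this VM runs on
    system_disk: VirtualDisk         # exactly one system disk
    data_disks: list[VirtualDisk] = field(default_factory=list)  # zero or more

    def disks(self) -> list[VirtualDisk]:
        """All disks of the VM, system disk first."""
        return [self.system_disk, *self.data_disks]
```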

Step S102, for each main virtual machine, determining whether to back up the system disk and the data disk of the main virtual machine according to the working condition of the main virtual machine, and determining the backup mode of each disk respectively.

Specifically, the backup mode of each main virtual machine is set. In the present application, the backup modes of the main virtual machines on different servers may differ. When the backup mode of a main virtual machine is determined, whether to back up its system disk and whether to back up its data disk can be decided independently, and the backup modes of the system disk and the data disk can be set independently. That is, the user can flexibly choose whether to back up a main virtual machine at all and whether to back up its system disk or its data disk, and the disks of the same virtual machine may use different backup modes.

In one embodiment of the present application, the backup modes include real-time backup and asynchronous backup, and the backup mode of each main virtual machine is determined according to actual requirements such as its working condition. For example, a main virtual machine that stores a database may use the real-time backup mode. An ordinary office cloud desktop virtual machine does not need real-time backup; to save resources, it may use the asynchronous backup mode, with the data disk backed up incrementally every night and the system disk backed up every weekend.
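The per-disk settings in the two examples above might be expressed as configuration objects along the following lines. This is a hedged sketch: Mode, DiskPolicy and VMPolicy are hypothetical names, the schedule field is a free-form label rather than a real scheduler syntax, the restore point counts (4 and 7) are invented for illustration, and applying real-time backup to both disks of the database virtual machine is an assumption, since the text does not distinguish its disks.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Mode(Enum):
    NONE = "none"            # the disk is not backed up
    REAL_TIME = "real_time"
    ASYNC = "async"


@dataclass(frozen=True)
class DiskPolicy:
    mode: Mode
    schedule: Optional[str] = None  # free-form label, meaningful only for ASYNC
    restore_points: int = 0         # retained restore points, meaningful only for ASYNC


@dataclass(frozen=True)
class VMPolicy:
    system_disk: DiskPolicy
    data_disk: DiskPolicy

    def needs_backup_vm(self) -> bool:
        """A backup VM is needed as soon as any disk is backed up."""
        return any(p.mode is not Mode.NONE for p in (self.system_disk, self.data_disk))


# Database VM: real-time backup (assumed here for both disks).
DATABASE_VM = VMPolicy(DiskPolicy(Mode.REAL_TIME), DiskPolicy(Mode.REAL_TIME))

# Office cloud desktop: nightly incremental data disk, weekly system disk.
OFFICE_DESKTOP_VM = VMPolicy(
    system_disk=DiskPolicy(Mode.ASYNC, schedule="weekly, weekend", restore_points=4),
    data_disk=DiskPolicy(Mode.ASYNC, schedule="nightly, incremental", restore_points=7),
)
```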

Step S103, selecting a target server for establishing a backup virtual machine for each main virtual machine needing backup, and backing up the main virtual machine to the target server according to the determined backup mode.

Specifically, a backup operation of the primary virtual machine is executed, and for the primary virtual machine which needs to be backed up and is determined in the previous step, the primary virtual machine is backed up to the selected target server for establishing the backup virtual machine according to the set backup mode.

When selecting the target server, the user may designate the server for establishing backup for each main virtual machine, or the system may automatically select an appropriate server according to parameters and characteristics of each server in the cloud computing cluster.

In one embodiment of the present application, when a backup operation is performed on any main virtual machine, a standby virtual machine is first established on the target server corresponding to the main virtual machine (in the present application, the terms standby virtual machine and backup virtual machine are used interchangeably) and is completely overwritten with the content of the main virtual machine; backup then proceeds according to the backup modes of the system disk and the data disk. The data to be backed up is extracted in a bypass manner, so the normal operation of the main virtual machine is not affected and incremental backup is achieved. For asynchronous backup, backup parameters such as the backup interval and the number of retained restore points are preset, data is backed up to the target server at the scheduled times, and old backups beyond the retained restore points are merged. For real-time backup, the backup operation is triggered by a data modification event. The specific implementation steps of asynchronous backup and real-time backup are described in the following embodiments.
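One possible reading of "old backups beyond the retained restore points are merged" is sketched below in Python. It is an illustration under assumptions, not the method of the present application: each backup is modelled as a dictionary of changed blocks, the chain is ordered oldest first, and the function names merge, add_backup and full_image are hypothetical.

```python
def merge(older: dict[int, bytes], newer: dict[int, bytes]) -> dict[int, bytes]:
    """Combine two backups; blocks present in the newer backup win."""
    return {**older, **newer}


def add_backup(chain: list[dict[int, bytes]], increment: dict[int, bytes],
               keep: int) -> list[dict[int, bytes]]:
    """Append an increment, then fold the oldest entries together until
    at most `keep` restore points remain."""
    if keep < 1:
        raise ValueError("at least one restore point must be kept")
    chain = chain + [increment]
    while len(chain) > keep:
        chain = [merge(chain[0], chain[1])] + chain[2:]
    return chain


def full_image(chain: list[dict[int, bytes]]) -> dict[int, bytes]:
    """Replay the chain, oldest first, to obtain the complete latest disk content."""
    image: dict[int, bytes] = {}
    for backup in chain:
        image = merge(image, backup)
    return image
```

Folding the two oldest entries keeps the latest content recoverable through full_image while bounding the number of stored backup files, at the cost of no longer being able to restore to the folded point.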

It can be understood that there is no fixed correspondence between physical servers and backup virtual machines in the present application; backup virtual machines are allocated to physical servers according to certain rules. First, the backup virtual machine of a main virtual machine to be backed up is created on a server other than the one hosting that main virtual machine. Second, the user may designate the server on which the backup virtual machine is located; if no server is designated, backup virtual machines are distributed evenly across the other servers according to the resource characteristics of each server, such as CPU, memory and storage space.
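The allocation rules above can be sketched as a small selection function. The present application does not specify how CPU, memory and storage space are weighed, so the ordering used here (most free disk first, then memory, then CPU) is an arbitrary stand-in, and the names Server and pick_target are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Server:
    name: str
    free_cpu: int       # free virtual CPUs
    free_mem_gib: int
    free_disk_gib: int


def pick_target(servers: list[Server], main_server: str, need_cpu: int,
                need_mem_gib: int, need_disk_gib: int,
                preferred: Optional[str] = None) -> Server:
    """Choose the server that will host the backup VM of a main VM."""
    # Rule 1: never place the backup on the main VM's own server,
    # and only consider servers with enough free resources.
    candidates = [s for s in servers
                  if s.name != main_server
                  and s.free_cpu >= need_cpu
                  and s.free_mem_gib >= need_mem_gib
                  and s.free_disk_gib >= need_disk_gib]
    # Rule 2: a user-designated server takes precedence when it qualifies.
    if preferred is not None:
        for s in candidates:
            if s.name == preferred:
                return s
        raise ValueError(f"designated server {preferred!r} cannot host the backup")
    if not candidates:
        raise ValueError("no server can host the backup virtual machine")
    # Assumed balancing rule: prefer the server with the most free resources.
    return max(candidates, key=lambda s: (s.free_disk_gib, s.free_mem_gib, s.free_cpu))
```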

Step S104, under the condition that a fault server exists in the cloud computing cluster, controlling a backup virtual machine corresponding to the main virtual machine in the fault server to take over the service task of the main virtual machine, and recovering the main virtual machine according to the data of the corresponding backup virtual machine after the fault server is recovered to operate.

Specifically, after one or more servers in the cloud computing cluster fail, the standby virtual machine corresponding to the failed server takes over service, and the current work task is executed. And after the fault server resumes operation, executing the recovery operation of the main virtual machine, and synchronizing the data of the standby virtual machine to the original virtual machine. As a possible implementation manner, when the recovery is performed, all virtual machines on the failed server may be automatically recovered, or one, multiple or all virtual machines may be specified to be recovered by a user according to requirements.

In one embodiment of the present application, after the failure server resumes operation, the primary virtual machine is restored according to the data of the corresponding backup virtual machine, including the following steps: extracting data in a system disk and/or a data disk of the backup virtual machine, and transmitting the extracted data to a disk corresponding to the main virtual machine for coverage in a compression transmission mode; and starting the main virtual machine to carry out service tasks again, and stopping running the backup virtual machine.

Specifically, so that the implementation of virtual machine recovery in this embodiment can be understood more clearly, a recovery method proposed in this embodiment is described below. FIG. 2 is a flowchart of a method for recovering a main virtual machine according to an embodiment of the present application. As shown in FIG. 2, the method includes the following steps:

Step S201, when it is detected that the server where any main virtual machine is located cannot provide service, automatically or manually starting the corresponding standby virtual machine.

Step S202, after the failure server is recovered to normal, executing the virtual machine recovery command.

Step S203, completely overwriting the virtual disk file corresponding to the main virtual machine with the system disk and/or the data disk of the standby virtual machine.

Specifically, when the virtual machine is recovered, either the system disk or the data disk of the main virtual machine, or all of its disks, can be recovered according to current actual needs. After the disk to be recovered is determined, the disk to be recovered in the main virtual machine is overwritten with the data of the corresponding disk in the standby virtual machine.

When the data is recovered, the data is transmitted in a compression transmission mode, so that the data transmission time and the network bandwidth occupation can be reduced.

Step S204, starting and switching to the operation of the main virtual machine, and stopping the operation of the standby virtual machine.

Therefore, when the servers in the cloud computing cluster fail, the risk of cluster failure and even cluster data loss caused by unavailability of certain servers can be avoided through operation of the standby virtual machine and data recovery after the failure.
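Steps S201 to S204 can be sketched as follows. This is a simplified in-memory model under stated assumptions: RecoverableVM, fail_over and recover are hypothetical names, a disk is a byte string, and the compressed transmission is modelled with the standard zlib module, which the present application does not name.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class RecoverableVM:
    disks: dict[str, bytes] = field(default_factory=dict)  # disk name -> raw content
    running: bool = False


def fail_over(standby: RecoverableVM) -> None:
    """S201: start the standby VM once the main VM's server stops serving."""
    standby.running = True


def recover(main: RecoverableVM, standby: RecoverableVM,
            disk_names: list[str]) -> int:
    """S203 and S204: overwrite the chosen main disks from the standby, then switch back.

    Returns the number of bytes that crossed the simulated network.
    """
    sent = 0
    for name in disk_names:
        payload = zlib.compress(standby.disks[name])  # compressed transmission
        sent += len(payload)
        main.disks[name] = zlib.decompress(payload)   # full overwrite on the main side
    main.running = True       # S204: the main VM resumes the service
    standby.running = False   # S204: the standby VM stops running
    return sent
```

Passing only some disk names to recover corresponds to recovering only the system disk or only the data disk, as permitted in step S203.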

Further, in one embodiment of the present application, after controlling the backup virtual machine corresponding to the primary virtual machine in the failure server to take over the service task of the primary virtual machine, the method further includes: and selecting a server for rebuilding in the cloud computing cluster, and rebuilding a main virtual machine or a backup virtual machine of the fault server on the server for rebuilding.

Specifically, this embodiment may also rebuild virtual machines after a server fails. The rebuilding operation can be executed when a server in the cluster cannot be recovered after a failure, or when, for safety reasons, the main virtual machine and the backup virtual machine are to be rebuilt immediately after the server goes down.

The rebuilding may be performed on all or part of the virtual machines on the failed server; for example, both the main virtual machine and the backup virtual machine on the failed server may be rebuilt, or only the main virtual machine may be rebuilt. When the rebuilding operation is initiated, the host for each rebuilt virtual machine is automatically selected according to the resource occupation and the main/standby situation of the other servers, the rebuilding operation is executed, and afterwards the relationship between the main and standby virtual machines is re-established and the backup operation is restarted.

In specific implementation, as a possible implementation manner, the type of the virtual machine to be rebuilt may be determined first, the virtual machine to be rebuilt is rebuilt on a server for rebuilding, and data in a normal virtual machine corresponding to the virtual machine to be rebuilt is synchronized into the rebuilt virtual machine through a bypass technology. And then, reestablishing the corresponding relation between the main virtual machine and the standby virtual machine, and re-executing the backup task according to the backup mode preset before the fault of the fault server.

So that the implementation of virtual machine rebuilding in this embodiment can be understood more clearly, a rebuilding method proposed in this embodiment is described below. FIG. 3 is a flowchart of a method for rebuilding a virtual machine according to an embodiment of the present application. As shown in FIG. 3, the method includes the following steps:

Step S301, a virtual machine rebuild command is issued.

Specifically, when a server is not available, the system issues a command to rebuild the virtual machine thereon.

Step S302, a server where the reconstructed virtual machine is located is automatically selected.

Specifically, each virtual machine (including the backup virtual machine) included in the reconstruction command is automatically allocated with a suitable server for reconstruction according to parameters such as CPU, memory, disk occupation and the like of each surviving server.

Step S303, judging whether the current rebuilding is the main virtual machine or the standby virtual machine, if the current rebuilding is the main virtual machine, executing step S304, and if the current rebuilding is the standby virtual machine, executing step S308.

Step S304, rebuilding the main virtual machine (including its data disks) on the present server according to the corresponding configuration of the standby virtual machine on another server.

Step S305, synchronizing the virtual disks of the standby virtual machine to the rebuilt main virtual machine through the bypass technique.

Specifically, if the main virtual machine is to be rebuilt, it is rebuilt on the selected server for rebuilding according to the configuration of the still-surviving standby virtual machine on another server. The disk files of the system disk and the data disks of the surviving standby virtual machine are then completely synchronized to the rebuilt main virtual machine through the bypass technique.

Step S306, updating the mapping relation between the rebuilt main virtual machine and the standby virtual machine.

Step S307, starting the real-time backup or asynchronous backup of the rebuilt main virtual machine according to the backup setting of the original main virtual machine.

Step S308, rebuilding the standby virtual machine on the server according to the corresponding configuration of the main virtual machine on the other servers.

Step S309, synchronizing the virtual disks of the main virtual machine to the rebuilt standby virtual machine through the bypass technique.

Specifically, if the standby virtual machine is to be rebuilt, it is rebuilt on the selected server for rebuilding according to the configuration of the still-surviving main virtual machine on another server. The disk files of the system disk and the data disks of the surviving main virtual machine are then completely synchronized to the rebuilt standby virtual machine through the bypass technique.

Step S310, updating the mapping relation between the main virtual machine and the rebuilt standby virtual machine.

Step S311, restarting the real-time backup or asynchronous backup of the main virtual machine.

Specifically, after the main virtual machine or the backup virtual machine has been rebuilt and data synchronization is complete, that is, when steps S306 and S307 or steps S310 and S311 are executed, the relationship between the main virtual machine and the backup virtual machine is re-established, for example by updating the database, and the backup tasks are reset or restarted according to the real-time or asynchronous backup tasks recorded in the database, which completes the rebuilding.

In step S311, according to a preset backup mode, the real-time backup or the asynchronous backup of the primary virtual machine is restarted, which can be used to correct the backup error caused by the unavailability of the primary virtual machine.

Therefore, by rebuilding the virtual machines, the method of the present application further eliminates the impact of the server fault and further ensures the normal operation of each server in the cloud computing cluster.

In summary, the storage method of the cloud computing cluster according to the embodiment of the present application fundamentally avoids the risk that the whole cloud computing cluster becomes unavailable because the distributed file system is unavailable, and even the severe risk that all data is lost. The number of copies that a distributed file system would require is reduced; whether to back up at all, and whether to use real-time backup or asynchronous backup, can be set as required; the data storage performance is better and the resource utilization rate is higher; and the servers in the cluster need not be identically configured in order to keep poorly performing servers from affecting the overall performance. The method has a higher storage space utilization rate, lower computing overhead and lower operation latency, reduces the occupation of computing resources and storage space, ensures the reliability of the cluster through a simpler implementation, greatly reduces the cost, and allows the system to be restored and rebuilt faster. Moreover, the method retains the high storage utilization of storing virtual machines on a single server while avoiding the risks of cluster failure and cluster data loss in a cloud computing cluster formed by a plurality of servers. Therefore, the method improves the reliability, flexibility and storage efficiency of data storage in the cloud computing cluster, and helps to ensure the running performance of the cloud computing cluster.

Based on the above embodiments, in order to more specifically and clearly describe the specific implementation flow of real-time backup and asynchronous backup in the storage method of the cloud computing cluster of the present application, the following details will describe two specific embodiments of the backup process.

In one embodiment of the present application, backing up a primary virtual machine to the target server according to the asynchronous backup manner includes the following steps: setting a time point, a time interval and retained restore points for the asynchronous backup, and executing an asynchronous backup task when the time point and the time interval are reached; creating a backup virtual machine on the target server, and if a data disk backup task exists, creating a backup data disk on the target server; creating a backup file for each disk to be backed up on the target server, and exporting the changed part of each disk to be backed up to the corresponding backup file through a bypass technology at each backup; merging the updated backup file and the existing backup files into a complete backup file, and overwriting the corresponding disk of the backup virtual machine with the complete backup file; if the overwritten backup data disk is not mounted, mounting it to the backup virtual machine; and deleting redundant backup files according to the restore points.

Specifically, in order to more clearly understand the implementation procedure of asynchronous backup in this embodiment, an asynchronous backup method set forth in this embodiment is described below.

Fig. 4 is a flowchart of an asynchronous backup method of a virtual machine according to an embodiment of the present application. As shown in fig. 4, the method includes the following steps:

Step S401, setting a time point, a time interval and retained restore points for the asynchronous backup.

In step S402, the system automatically invokes the backup task according to the set backup time point and time interval.

Specifically, the storage system for the cloud computing cluster can automatically call the backup task at the appointed time according to the backup time point and the time interval set by the user, wherein the time point comprises the set time point of the first backup and the expected backup time point of each subsequent time, and the time for executing the asynchronous backup can be accurately determined by combining the backup time point and the time interval.
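How a first-backup time point and an interval together fix each later run time can be shown with a short sketch. This is only an illustration under the assumption that runs are evenly spaced from the first time point; the function name `next_backup_time` is hypothetical.

```python
from datetime import datetime, timedelta


def next_backup_time(first_run: datetime, interval: timedelta, now: datetime) -> datetime:
    """Return the next scheduled asynchronous backup at or after `now`."""
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    if now <= first_run:
        return first_run
    elapsed = now - first_run
    # Ceiling division on timedeltas: number of whole intervals needed to reach `now`.
    intervals_needed = -(-elapsed // interval)
    return first_run + intervals_needed * interval
```

For example, with a first backup at 02:00 and a six-hour interval, a call made at 09:00 on the same day yields 14:00.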

Step S403, judging whether a standby virtual machine exists, if not, executing step S404, and if so, executing step S407.

Step S404, determining whether the user designates a backup server, if not, executing step S405, and if so, executing step S406.

In step S405, an appropriate server is automatically selected to create a standby virtual machine.

Step S406, a standby virtual machine is created on the server specified by the user.

Specifically, when the backup is started, whether the backup virtual machine is created or not is judged. If there is no standby virtual machine, the standby virtual machine is created preferentially. When the standby virtual machine is created, a proper server is selected to create the standby virtual machine according to the specification of a user or the automatic allocation of a system (according to the disk, CPU and memory occupation of each server). The configuration of the CPU, the memory, the disk size, the operating system and the like of the standby virtual machine is consistent with the backed-up virtual machine.
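The automatic allocation mentioned above can be sketched as follows. The application only states that the choice depends on the disk, CPU and memory occupation of each server; the concrete rule used here (skip the primary's own host, require enough free disk, then take the lowest average utilization) is an assumption, and `Server` and `select_target_server` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    cpu_used: float       # fraction of CPU in use, 0.0 to 1.0
    mem_used: float       # fraction of memory in use, 0.0 to 1.0
    disk_free_gb: float
    disk_total_gb: float


def select_target_server(servers, primary_host, required_disk_gb, user_choice=None):
    """Pick the server that will host the standby VM."""
    if user_choice is not None:          # a user-specified server takes priority
        return user_choice
    candidates = [
        s for s in servers
        if s.name != primary_host and s.disk_free_gb >= required_disk_gb
    ]
    if not candidates:
        raise RuntimeError("no server has enough free disk space")

    def load(s):
        # Assumed scoring rule: unweighted mean of CPU, memory and disk utilization.
        disk_used = 1.0 - s.disk_free_gb / s.disk_total_gb
        return (s.cpu_used + s.mem_used + disk_used) / 3.0

    return min(candidates, key=load).name
```

Excluding the primary's host keeps the standby on a different physical server, so that a single server fault cannot take both down.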

Step S407, determining whether the data disk needs to be backed up, if so, executing step S408, and if not, executing step S410.

Step S408, determining whether the data disk to be backed up has a corresponding spare data disk, if not, executing step S409, and if so, executing step S410.

In step S409, a spare data disk is created on the corresponding server (the server on which the spare virtual machine is located).

Specifically, if the backup task has the task of backing up the data disk, whether the data disk to be backed up has the backup data disk is judged first. If there is no spare data disk, a spare data disk is created on the server where the spare virtual machine is located. The size of the spare data disk, the file system type, etc. are configured to be consistent with the backed up data disk.
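The branching in steps S403 to S409 can be condensed into a small sketch. It assumes the backup state is kept in a dictionary and only records which resources would be created; `prepare_standby` and its keys are hypothetical names introduced for illustration.

```python
def prepare_standby(state, backup_data_disk, pick_server):
    """Ensure a standby VM and, if needed, a spare data disk exist; return the actions taken."""
    actions = []
    if state.get("standby_vm") is None:                      # S403: no standby VM yet
        # S404 to S406: prefer the user-specified server, else choose automatically
        server = state.get("user_server") or pick_server()
        state["standby_vm"] = server
        actions.append(f"create standby VM on {server}")
    if backup_data_disk and not state.get("spare_data_disk"):  # S407, S408
        state["spare_data_disk"] = True                       # S409: same host as the standby VM
        actions.append(f"create spare data disk on {state['standby_vm']}")
    return actions
```

Calling the function a second time with the same state returns an empty list, which mirrors the flowchart going straight to step S410 once everything exists.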

In step S410, a backup file is created at a corresponding location on the corresponding server.

In step S411, the changed portions of the virtual disk (including the system disk and the data disk) are exported to the backup file according to the bypass technique.

As one possible implementation manner, exporting the changed part of each disk to be backed up to the corresponding backup file through a bypass technology includes: during the first backup, copying the complete content of each disk to be backed up in the main virtual machine into the corresponding backup file; adding a file change identifier to each disk to be backed up in the main virtual machine; and at each subsequent backup, determining the data updated since the last backup was finished based on the file change identifier, and extracting the updated data into the corresponding backup file through a bypass technology.

Specifically, a blank backup file of the system disk and the data disk is created on a server where the standby virtual machine is located. And then exporting the changed parts of the virtual machine system disk and the data disk which need to be backed up to corresponding backup files according to the bypass technology. If the backup is the first backup, the contents of the system disk and the data disk of the virtual machine are completely extracted into corresponding backup files, and an identifier for recording file modification is added on the system disk and the data disk of the virtual machine.

Further, each subsequent backup identifies which files or file blocks have been changed since the last backup according to the file change identifiers in the system disk and the data disk, and extracts the changed portions into the backup files through a bypass technology.

It can be understood that by adopting the method, the virtual machine can fully or partially extract the contents of the system disk and the data disk into the backup file under the condition of not shutting down, thereby greatly increasing the applicable scenes of the virtual machine backup, and greatly saving the storage space occupied by the backup due to the adoption of the incremental backup.
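The first-full-then-incremental behaviour described above can be sketched with a toy disk model. The application does not specify how the file change identifier is stored; here it is assumed to be a set of dirty block indices, and `TrackedDisk` is a hypothetical name.

```python
class TrackedDisk:
    """Toy virtual disk whose 'file change identifier' is a set of dirty block indices."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.dirty = set()
        self.tracking = False   # switched on by the first (full) backup

    def write(self, index, data):
        """Guest write: the VM keeps running and the change is only recorded."""
        self.blocks[index] = data
        if self.tracking:
            self.dirty.add(index)

    def export_changes(self):
        """Return {block index: data} for the backup file of this run."""
        if not self.tracking:
            # First backup: export everything and start recording changes.
            self.tracking = True
            return dict(enumerate(self.blocks))
        changed = {i: self.blocks[i] for i in sorted(self.dirty)}
        self.dirty.clear()
        return changed
```

Only the blocks written since the previous export appear in later backup files, which is where the storage saving of the incremental approach comes from.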

In step S412, the incremental backup files are combined to become a complete backup file.

In step S413, the virtual disk (including the system disk and the data disk) of the standby virtual machine is covered with the full backup file.

Step S414, determining whether the spare data disk is mounted on the spare virtual machine, if yes, executing step S416, and if not, executing step S415.

In step S415, the spare data disk is mounted on the spare virtual machine.

Specifically, the backup file of the current run is merged with the existing backup files to form a complete virtual disk file, and this file is used to overwrite the original system disk and data disk files of the standby virtual machine. After the merging is completed, if the spare data disk is not mounted on the spare virtual machine, the spare data disk is mounted on the spare virtual machine.

In the asynchronous backup process, it is possible that only one of the system disk and the data disk was backed up when the backup virtual machine was first created, and that the other disk was added to the backup later as requirements changed, so the spare data disk may not yet be mounted. Therefore, in order to improve the applicability of the backup method, the present embodiment also confirms whether the spare data disk is mounted.

Step S416, deleting redundant backup files according to the set restore points.

Finally, redundant backup files are merged and deleted according to the set restore points so as to save storage space. Here, a restore point represents a saved state of the files.

It should be noted that, asynchronous backup is to transfer newly changed data as a backup file to a physical server where a backup virtual machine is located, and then merge the data. In the merging process, old backup files are merged first, and after merging, the restore point needs to be updated, and meanwhile, the original old backup files have no value, so redundant old backup files should be deleted.
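The merge in steps S412 and S413 and the clean-up in step S416 can be sketched as follows. The sketch assumes each backup file is a mapping from block index to data and that the restore-point setting is simply the number of files to keep; both assumptions, and the function names, are illustrative only.

```python
def merge_backups(backup_files):
    """Fold backup files (oldest first) into one complete image; newer blocks win."""
    full = {}
    for backup in backup_files:
        full.update(backup)
    return full


def prune_restore_points(backup_files, keep):
    """Collapse the oldest files into a single base so that at most `keep` files remain."""
    if keep < 1:
        raise ValueError("at least one restore point must be kept")
    if len(backup_files) <= keep:
        return list(backup_files)
    cut = len(backup_files) - keep + 1
    base = merge_backups(backup_files[:cut])   # the old files merged here become redundant
    return [base] + list(backup_files[cut:])
```

Merging the pruned list gives the same complete image as merging the original list, so discarding the old files loses no current data.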

In another embodiment of the present application, backing up a primary virtual machine to a target server according to a real-time backup manner includes the following steps: executing a real-time backup task according to the real-time backup instruction; creating a backup virtual machine on a target server, and covering a standby system disk of the backup virtual machine through a system disk of the main virtual machine; if the backup data disk task exists, creating a backup data disk on the target server, and mounting the backup data disk to the backup virtual machine; when a data modification event exists, the change part of each disk needing to be backed up in the main virtual machine is transmitted to the corresponding disk file in the backup virtual machine in real time through a bypass technology; and disconnecting the connection between the main virtual machine and the backup virtual machine according to the received backup stopping instruction.

Specifically, in order to more clearly understand the implementation process of real-time backup in this embodiment, a real-time backup method according to this embodiment is described below.

Fig. 5 is a flowchart of a method for heterogeneous real-time backup of a virtual machine according to an embodiment of the present application, as shown in fig. 5, where the method includes the following steps:

Step S501, performing real-time backup after receiving the real-time backup request.

Specifically, after receiving the real-time backup command, the storage system for the cloud computing cluster starts a real-time backup process.

Step S502, judging whether a standby virtual machine exists or not, if not, executing step S503, and if so, executing step S507.

Step S503, determining whether the user designates a backup server, if not, executing step S504, and if so, executing step S505.

In step S504, an appropriate server is automatically selected to create a standby virtual machine.

In step S505, a standby virtual machine is created on the server specified by the user.

Step S506, the system disk of the standby virtual machine is completely covered by the system disk of the original virtual machine.

Specifically, when the backup is started, whether the backup virtual machine is created or not is judged. If there is no standby virtual machine, the standby virtual machine is created preferentially. When the standby virtual machine is created, a proper server is selected to create the standby virtual machine according to the specification of a user or the automatic allocation of a system (according to the disk, CPU and memory occupation of each server). The configuration of CPU, memory, disk size, operating system and the like of the standby virtual machine is consistent with the backed-up virtual machine, and the system disk of the standby virtual machine is completely covered by the system disk of the original virtual machine to ensure the consistency of the two.

Step S507, determining whether the data disk needs to be backed up, if so, executing step S508, and if not, executing step S513.

Step S508, determining whether the data disk to be backed up has a corresponding spare data disk, if not, executing step S509, and if so, executing step S510.

Step S509, creating a spare data disk on the corresponding server (the server where the spare virtual machine is located).

Step S510, completely covering the data disk of the standby virtual machine with the data disk of the original virtual machine.

Step S511, determining whether the spare data disk is mounted on the spare virtual machine, if not, executing step S512, and if so, executing step S513.

In step S512, the spare data disk is mounted on the spare virtual machine.

Specifically, if the real-time backup task requires the data disk to be backed up, whether the spare data disk has been created is judged. If there is no spare data disk, a spare data disk is created on the server where the spare virtual machine is located. The size of the spare data disk, the type of the file system and the like are configured to be consistent with the backed-up data disk, and the data disk of the spare virtual machine is completely covered by the data disk of the original virtual machine to ensure that the two are consistent. The spare data disk is then mounted to the spare virtual machine.

It should be noted that virtual machines requiring real-time backup are generally important, and when the main virtual machine cannot work normally, the standby virtual machine must be brought up immediately to take over the service of the main virtual machine. Therefore, in the embodiment of the application, the system disk and the data disk are created together, and the data disk is mounted directly after it is created. Whether the system disk needs to be backed up in real time is then determined according to actual needs.

In step S513, virtual disks (including a system disk and a data disk) corresponding to the primary virtual machine and the standby virtual machine are connected according to the bypass technology.

Specifically, an identifier for recording file modification is added on the system disk and the data disk of the main virtual machine, and monitoring of the data modification signals of the virtual machine begins. When a file is changed, the virtual machine sends a data modification signal, and the changed part is transmitted to the corresponding virtual disk file of the standby virtual machine through the file change identifier and the bypass technology.

Step S514, it is determined whether a signal for stopping backup is received, if yes, the backup is ended, and if no, step S515 is executed.

Step S515, determining whether a signal for modifying data is received, if yes, executing step S516, and if no, returning to execute step S514.

In step S516, the corresponding data modification portion is transferred to the virtual disk (including the system disk and the data disk) of the standby virtual machine through the bypass technology.

Specifically, the main virtual machine will normally operate and transmit the changed part to the virtual disk file corresponding to the standby virtual machine in real time, so as to realize real-time backup. And finally, after the system sends a backup stopping instruction, the main virtual machine is disconnected from the standby virtual machine, and the real-time backup is stopped.

Therefore, the data backup method gives the standby virtual machine a system disk and a data disk with the same content as those of the main virtual machine, so that the standby virtual machine can take over at any time when the main virtual machine becomes unavailable, which improves the availability of the system. Moreover, the main virtual machine does not need to be shut down during backup and can continue to operate while being backed up, which greatly improves the flexibility of backup and its applicability in various scenarios.
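The loop formed by steps S514 to S516 can be sketched as an event consumer. The sketch assumes that signals arrive on a queue as tuples and that each standby disk is a mapping from block index to data; the signal format and the name `run_realtime_backup` are hypothetical.

```python
import queue


def run_realtime_backup(signals, standby_disks):
    """Apply data modification signals to the standby disks until a stop signal arrives.

    Each signal is either ("stop",) or ("modify", disk_name, block_index, data).
    Returns the number of changes transferred.
    """
    transferred = 0
    while True:
        signal = signals.get()          # blocks until the next signal is available
        if signal[0] == "stop":         # S514: a stop instruction ends the backup
            break
        if signal[0] == "modify":       # S515: a data modification signal was received
            _, disk_name, index, data = signal
            standby_disks[disk_name][index] = data   # S516: forward only the changed part
            transferred += 1
    return transferred
```

Changes queued after the stop signal are never applied, which corresponds to disconnecting the main virtual machine from the standby virtual machine.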

In order to implement the foregoing embodiments, the present application further provides a storage system of a cloud computing cluster, and fig. 6 is a schematic structural diagram of the storage system of the cloud computing cluster according to the embodiment of the present application, as shown in fig. 6, where the system includes an establishing module 100, a determining module 200, a backup module 300, and a control module 400.

The establishing module 100 is configured to establish a unique corresponding primary virtual machine on each server in the cloud computing cluster, where the primary virtual machine corresponding to each server is not affected by other servers.

The determining module 200 is configured to determine, for each primary virtual machine, whether to back up a system disk and a data disk of the primary virtual machine according to the working condition of the primary virtual machine, and to determine a backup manner of each disk respectively.

The backup module 300 is configured to select, for each primary virtual machine to be backed up, a target server for establishing a backup virtual machine, and to back up the primary virtual machine to the target server according to the determined backup mode.

The control module 400 is configured to control, in the case that a failure server exists in the cloud computing cluster, a backup virtual machine corresponding to the primary virtual machine in the failure server to take over a service task of the primary virtual machine, and restore the primary virtual machine according to data of the corresponding backup virtual machine after the failure server resumes operation.

Optionally, in an embodiment of the present application, the system further comprises a reconstruction module configured to: select a server for rebuilding in the cloud computing cluster, and rebuild a main virtual machine or a backup virtual machine of the fault server on the server for rebuilding.

Optionally, in one embodiment of the present application, the backup module 300 is specifically configured to: set a time point, a time interval and retained restore points for the asynchronous backup, and execute an asynchronous backup task when the time point and the time interval are reached; create a backup virtual machine on a target server, and if a data disk backup task exists, create a backup data disk on the target server; create a backup file for each disk to be backed up on the target server, and export the changed part of each disk to be backed up to the corresponding backup file through a bypass technology at each backup; merge the updated backup file and the existing backup files into a complete backup file, and overwrite the corresponding disk of the backup virtual machine with the complete backup file; if the overwritten backup data disk is not mounted, mount it to the backup virtual machine; and delete redundant backup files according to the restore points.

Optionally, in one embodiment of the present application, the backup module 300 is specifically configured to: during the first backup, copy the complete content of each disk to be backed up in the main virtual machine into the corresponding backup file; add a file change identifier to each disk to be backed up in the main virtual machine; and at each subsequent backup, determine the data updated since the last backup was finished based on the file change identifier, and extract the updated data into the corresponding backup file through a bypass technology.

Optionally, in one embodiment of the present application, the backup module 300 is further configured to: executing a real-time backup task according to the real-time backup instruction; creating a backup virtual machine on a target server, and covering a standby system disk of the backup virtual machine through a system disk of the main virtual machine; if the backup data disk task exists, creating a backup data disk on the target server, and mounting the backup data disk to the backup virtual machine; when a data modification event exists, the change part of each disk needing to be backed up in the main virtual machine is transmitted to the corresponding disk file in the backup virtual machine in real time through a bypass technology; and disconnecting the connection between the main virtual machine and the backup virtual machine according to the received backup stopping instruction.

Optionally, in one embodiment of the present application, the control module 400 is specifically configured to: extracting data in a system disk and/or a data disk of the backup virtual machine, and transmitting the extracted data to a disk corresponding to the main virtual machine for coverage in a compression transmission mode; and starting the main virtual machine to carry out service tasks again, and stopping running the backup virtual machine.

Optionally, in an embodiment of the present application, the reconstruction module is specifically configured to: determining the type of a virtual machine to be rebuilt, rebuilding the virtual machine to be rebuilt on a server for rebuilding, and synchronizing data in a normal virtual machine corresponding to the virtual machine to be rebuilt into the rebuilt virtual machine through a bypass technology; and reestablishing the corresponding relation between the main virtual machine and the standby virtual machine, and re-executing the backup task according to the backup mode preset before the fault of the fault server.

It should be noted that the foregoing explanation of the embodiment of the storage method of the cloud computing cluster is also applicable to the system of this embodiment, and will not be repeated here.
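The recovery path of the control module 400 (extract the backup disks, send them in compressed form, overwrite the main virtual machine's disks, then switch back) can be sketched as follows. The application does not name a compression scheme; zlib is used here purely as an assumption, disk images are modelled as byte strings, and the function names are hypothetical.

```python
import zlib


def transmit_compressed(disk_image: bytes) -> bytes:
    """Compress a disk image; the return value stands in for the bytes sent over the network."""
    return zlib.compress(disk_image)


def restore_primary(standby_disks, primary_disks):
    """Overwrite the primary VM's disks with the standby VM's data, then switch back."""
    for name, image in standby_disks.items():
        payload = transmit_compressed(image)            # standby side: compress and send
        primary_disks[name] = zlib.decompress(payload)  # primary side: decompress and overwrite
    # The primary resumes the service tasks and the standby stops running.
    return {"primary": "running", "standby": "stopped"}
```

Compressing before transfer trades some CPU time for less network traffic, which matters most when whole disk images are copied back after a fault.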

In summary, the storage method of the cloud computing cluster according to the embodiment of the present application fundamentally avoids the risk that the whole cloud computing cluster is unavailable due to the unavailability of the distributed file system, and even the huge risk that the data is completely lost. The number of copies required to be set by the distributed file system is reduced, whether backup, real-time backup or asynchronous backup is required to be set or not can be set according to the requirements, the data storage performance is better, the resource utilization rate is higher, and the servers in the cluster are not required to be configured identically so as to avoid the influence on the overall performance by the servers with poor performance.

In order to implement the above embodiments, the present application further proposes a non-transitory computer readable storage medium, on which a computer program is stored, which when executed by a processor implements a storage method of a cloud computing cluster according to any of the above embodiments.

In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, the different embodiments or examples described in this specification and the features of the different embodiments or examples may be combined by those skilled in the art without contradiction.

Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature. In the description of the present application, the meaning of "plurality" is at least two, such as two, three, etc., unless explicitly defined otherwise.

Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and additional implementations are included within the scope of the preferred embodiment of the present application in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the embodiments of the present application.

Logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). In addition, the computer readable medium may even be paper or other suitable medium on which the program is printed, as the program may be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner, if necessary, and then stored in a computer memory.

It is to be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented using any one or a combination of the following techniques, which are well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application specific integrated circuits having suitable combinational logic gates, programmable gate arrays (PGAs), field programmable gate arrays (FPGAs), and the like.

Those of ordinary skill in the art will appreciate that all or a portion of the steps carried out in the method of the above-described embodiments may be implemented by a program to instruct related hardware, where the program may be stored in a computer readable storage medium, and where the program, when executed, includes one or a combination of the steps of the method embodiments.

In addition, each functional unit in each embodiment of the present application may be integrated in one processing module, or each unit may exist alone physically, or two or more units may be integrated in one module. The integrated modules may be implemented in hardware or in software functional modules. The integrated modules may also be stored in a computer readable storage medium if implemented in the form of software functional modules and sold or used as a stand-alone product.

The above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, or the like. Although embodiments of the present application have been shown and described above, it will be understood that the above embodiments are illustrative and not to be construed as limiting the application, and that variations, modifications, alternatives, and variations may be made to the above embodiments by one of ordinary skill in the art within the scope of the application.

Claims

1. The storage method of the cloud computing cluster is characterized by comprising the following steps of:

2. The storage method of a cloud computing cluster according to claim 1, further comprising, after controlling the backup virtual machine corresponding to the primary virtual machine in the fault server to take over the service task of the primary virtual machine:

and selecting a server for rebuilding from the cloud computing cluster, and rebuilding a main virtual machine or a backup virtual machine of the fault server on the server for rebuilding.

3. The method for storing a cloud computing cluster according to claim 1, wherein the backup mode includes real-time backup and asynchronous backup, and backing up the main virtual machine to the target server according to the asynchronous backup mode comprises the following steps:

setting a time point, a time interval and retained restore points for the asynchronous backup, and executing an asynchronous backup task when the time point and the time interval are reached;

creating a backup virtual machine on the target server, and if a backup data disk task exists, creating a backup data disk on the target server;

creating a backup file of each disk to be backed up on the target server, and exporting a change part of each disk to be backed up to a corresponding backup file through a bypass technology when each backup is carried out;

combining the updated backup file and the existing backup file into a complete backup file, and covering a disk corresponding to the backup virtual machine with the complete backup file;

under the condition that the covered backup data disk is not mounted, mounting the covered backup data disk to the backup virtual machine;

and deleting redundant backup files according to the reduction points.
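
By way of illustration only, the merging and pruning steps of claim 3 might be sketched as follows. The claims do not specify a backup file format or retention policy, so everything here is an assumption: an incremental backup file is modeled as a mapping from block index to block contents, `BLOCK` is a hypothetical block size, and `merge_backups` and `prune_restore_points` are hypothetical helper names.

```python
from typing import Dict, List

BLOCK = 4  # hypothetical block size in bytes, kept tiny for illustration


def merge_backups(full: bytes, increments: List[Dict[int, bytes]]) -> bytes:
    """Apply incremental files (block index -> new block) in order onto a full image."""
    image = bytearray(full)
    for inc in increments:  # oldest increment first, so later ones win
        for idx, data in sorted(inc.items()):
            start = idx * BLOCK
            end = start + len(data)
            if end > len(image):
                # The disk grew since the full image was taken: pad with zeros.
                image.extend(b"\x00" * (end - len(image)))
            image[start:end] = data
    return bytes(image)


def prune_restore_points(points: List[str], keep: int) -> List[str]:
    """Return the restore points to delete, keeping the newest `keep`.

    `points` is assumed to be ordered from oldest to newest.
    """
    if keep <= 0:
        raise ValueError("at least one restore point must be kept")
    return points[:-keep] if len(points) > keep else []
```

In this sketch the merged image is what would overwrite the corresponding disk of the backup virtual machine, and the list returned by `prune_restore_points` names the backup files that are redundant under the configured number of restore points.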

4. The storage method of a cloud computing cluster according to claim 3, wherein exporting the changed part of each disk to be backed up to the corresponding backup file through the bypass technology comprises:

during the initial backup, writing the complete content of each disk to be backed up in the main virtual machine into the corresponding backup file;

adding a file change identifier to each disk to be backed up in the main virtual machine;

and each time a backup is performed, determining, based on the file change identifier, the data updated since the last backup finished, and extracting the updated data into the corresponding backup file through the bypass technology.
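
As a non-limiting sketch of claim 4, the file change identifier can be thought of as a record of which blocks were modified since the last backup. The claims do not say how the identifier is implemented; the in-memory `TrackedDisk` class below, its set of dirty block indices, and its method names are all assumptions made for illustration.

```python
class TrackedDisk:
    """Toy in-memory disk that records which blocks changed since the last backup."""

    def __init__(self, size_blocks: int, block_size: int = 4):
        self.block_size = block_size
        self.blocks = [bytes(block_size)] * size_blocks
        # Stand-in for the "file change identifier": indices of modified blocks.
        self.dirty = set()

    def write(self, index: int, data: bytes) -> None:
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block")
        self.blocks[index] = data
        self.dirty.add(index)

    def export_full(self) -> bytes:
        """Initial backup: export the complete content and reset the markers."""
        self.dirty.clear()
        return b"".join(self.blocks)

    def export_changes(self) -> dict:
        """Later backups: export only blocks changed since the last backup."""
        changes = {i: self.blocks[i] for i in sorted(self.dirty)}
        self.dirty.clear()
        return changes
```

For example, after `export_full()` a single `write(2, b"cccc")` causes the next `export_changes()` call to return only block 2, and a second call returns an empty mapping because nothing changed in between.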

5. The storage method of a cloud computing cluster according to claim 3, wherein backing up the main virtual machine to the target server in the real-time backup mode comprises:

executing a real-time backup task according to a real-time backup instruction;

creating a backup virtual machine on the target server, and overwriting the backup system disk of the backup virtual machine with the system disk of the main virtual machine;

if a backup data disk task exists, creating a backup data disk on the target server, and mounting the backup data disk to the backup virtual machine;

connecting the disk files corresponding to the main virtual machine and the backup virtual machine, and, when a data modification event occurs, transmitting the changed part of each disk to be backed up in the main virtual machine to the corresponding disk file in the backup virtual machine in real time through a bypass technology;

and disconnecting the main virtual machine from the backup virtual machine according to a received backup stopping instruction.
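
The connect, forward, and disconnect sequence of claim 5 might be illustrated as below. This is only a sketch under stated assumptions: disks are modeled as in-memory byte arrays of equal size, and the `MirrorLink` class with its `on_write` and `stop` methods is hypothetical rather than part of the claimed system.

```python
class MirrorLink:
    """Toy link that forwards each write on the main disk to the backup disk."""

    def __init__(self, main_disk: bytearray, backup_disk: bytearray):
        if len(main_disk) != len(backup_disk):
            raise ValueError("disks must have the same size")
        self.main_disk = main_disk
        self.backup_disk = backup_disk
        self.connected = True  # the disk files are connected on creation

    def on_write(self, offset: int, data: bytes) -> None:
        """Data modification event: apply locally, forward only the changed bytes."""
        end = offset + len(data)
        if offset < 0 or end > len(self.main_disk):
            raise ValueError("write outside the disk")
        self.main_disk[offset:end] = data
        if self.connected:
            self.backup_disk[offset:end] = data

    def stop(self) -> None:
        """Backup stopping instruction: disconnect the two disks."""
        self.connected = False
```

A write issued while the link is connected appears on both disks; after `stop()` only the main disk changes, which mirrors the final step of the claim.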

6. The storage method of a cloud computing cluster according to claim 1, wherein recovering the main virtual machine according to the data of the corresponding backup virtual machine comprises:

extracting the data in the system disk and/or the data disk of the backup virtual machine, and transmitting the extracted data in compressed form to the corresponding disk of the main virtual machine to overwrite that disk;

and starting the main virtual machine to resume the service task, and stopping the backup virtual machine.
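
The compressed transmission in claim 6 could look like the following sketch. The claim does not name a compression algorithm, so the use of Python's standard `zlib` module is an assumption, as are the dictionary model of the disks and the hypothetical function name `restore_main_vm`.

```python
import zlib
from typing import Dict


def restore_main_vm(backup_disks: Dict[str, bytes], main_disks: Dict[str, bytes]) -> int:
    """Overwrite each main-VM disk with the matching backup-VM disk.

    Returns the total number of compressed bytes "sent", to show the saving.
    """
    sent = 0
    for name, data in backup_disks.items():
        payload = zlib.compress(data, 6)             # sender side: compress
        sent += len(payload)
        main_disks[name] = zlib.decompress(payload)  # receiver side: overwrite
    return sent
```

Once every disk has been overwritten, the main virtual machine would be started and the backup virtual machine stopped, as the final step of the claim states.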

7. The storage method of a cloud computing cluster according to claim 2, wherein rebuilding the main virtual machine or the backup virtual machine of the fault server on the server for rebuilding comprises:

determining the type of the virtual machine to be rebuilt, rebuilding the virtual machine to be rebuilt on the server for rebuilding, and synchronizing the data in the normal virtual machine corresponding to the virtual machine to be rebuilt into the rebuilt virtual machine through a bypass technology;

and re-establishing the correspondence between the main virtual machine and the backup virtual machine, and re-executing the backup task according to the backup mode that was set before the fault server failed.

8. A storage system of a cloud computing cluster, comprising the following modules:

9. The storage system of a cloud computing cluster of claim 8, further comprising a reconstruction module, in particular for:

10. A non-transitory computer-readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the storage method of a cloud computing cluster according to any one of claims 1-7.