CN107544868B - Data recovery method and device - Google Patents
- Publication number
- CN107544868B
- Authority
- CN
- China
- Prior art keywords
- file system
- shared file
- storage space
- data
- space corresponding
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The application provides a data recovery method and device. With this method, virtual machines do not need to be shut down while a shared file system is being repaired, which prevents the service interruption that would otherwise result from shutting down the virtual machines associated with the shared file system for data repair. The method and device can also migrate a virtual machine online from one shared file system to another shared file system in a short time.
Description
Technical Field
The present application relates to network communication technologies, and in particular, to a data recovery method and apparatus.
Background
In a virtualization environment, when a virtual machine is created, its disk image files are deployed in advance on a shared file system. Once the virtual machine is running its service, it accesses the shared file system to read and write those disk image files.
However, when the shared file system fails, part or all of the data on the storage space (such as a disk) corresponding to the shared file system, including the disk image files of virtual machines, also becomes faulty (for example, incomplete or inconsistent). In this state, the shared file system only supports read operations.
To repair the data on the storage space corresponding to a failed shared file system in time, a commonly used repair method is as follows: shut down all virtual machines associated with the shared file system (specifically, the virtual machines whose disk image files are deployed on the shared file system), unmount the shared file system, repair the faulty data on the storage space corresponding to the shared file system, remount the shared file system once the repair is complete (remounting is analogous to restarting software), and then start all virtual machines associated with the shared file system.
However, this repair method requires all virtual machines associated with the shared file system to be shut down before the shared file system is remounted. Shutting them down affects the services running on those virtual machines and ultimately causes large-scale service interruption.
Disclosure of Invention
The application provides a data recovery method and device that avoid the problems caused by shutting down the virtual machines associated with a shared file system in order to repair data.
The technical solution provided by the application includes:
a method of data recovery, the method comprising:
when the first shared file system only supports read operation, repairing fault data on a storage space corresponding to the first shared file system;
after fault data recovery is completed, establishing a second shared file system based on data on a storage space corresponding to a first shared file system, wherein the second shared file system and the first shared file system correspond to the storage space;
and verifying whether the second shared file system supports read and write operations, and migrating each virtual machine associated with the first shared file system to the second shared file system on line when the second shared file system supports read and write operations.
A data recovery apparatus, the apparatus comprising:
the fault repairing unit is used for repairing fault data on a storage space corresponding to the first shared file system when the first shared file system only supports read operation;
the second shared file system processing unit is used for establishing a second shared file system based on data on a storage space corresponding to a first shared file system after fault data recovery is completed, and the second shared file system and the first shared file system correspond to the storage space;
a verifying unit for verifying whether the second shared file system supports read and write operations,
and the migration unit is used for migrating each virtual machine associated with the first shared file system to the second shared file system on line when the verification unit verifies that the second shared file system supports read and write operations.
According to this technical solution, when the first shared file system only supports read operations, a second shared file system is newly established on the storage space of the first shared file system as part of repairing the first shared file system, and all virtual machines associated with the first shared file system are migrated online to the second shared file system without being shut down. This avoids the problems caused by shutting down the virtual machines associated with a shared file system for data repair.
Furthermore, because the second shared file system and the first shared file system are established on the same storage space, migrating the virtual machines associated with the first shared file system online to the second shared file system does not occupy a large amount of additional storage space and does not require copying the virtual machines' disk data. The virtual machines can therefore be migrated online from the first shared file system to the second shared file system in a short time.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
FIG. 1 is a flow chart of a method provided herein;
FIG. 2 is a flowchart of an embodiment of verifying whether a second shared file system supports read and write operations, as provided herein;
FIG. 3 is a flow diagram of an embodiment of virtual machine migration provided herein;
FIG. 4 is a flow diagram of an embodiment provided herein for the case where the second shared file system does not support read or write operations;
FIG. 5 is a flow diagram of an embodiment of data repair provided herein;
FIG. 6 is a diagram illustrating the structure of the apparatus provided herein.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in detail below with reference to the accompanying drawings and specific embodiments.
Referring to FIG. 1, FIG. 1 is a flow chart of a method provided by the present application. As an example, the method shown in FIG. 1 may be applied to a management device. In a specific implementation, the management device may be a host, a server, or another device; the present application places no specific limitation on this.
As shown in fig. 1, the process may include the following steps:
Step 101, when the first shared file system only supports read operations, repair the fault data on the storage space corresponding to the first shared file system.
In this application, the name "first shared file system" is used only for convenience of description and does not refer to any particular shared file system.
Here, the storage space corresponding to the first shared file system may be the storage space occupied by the first shared file system, or it may be a storage space that the first shared file system does not occupy but that is designated as corresponding to it. The storage space may be a physical disk or another storage medium; the present application places no specific limitation on this.
Here, a first shared file system that only supports read operations can be regarded as having failed. When the first shared file system fails, part or all of the data on its corresponding storage space (including the disk image files of virtual machines) also becomes faulty (for example, incomplete or inconsistent), and this affects the running services. To prevent services from being affected, as described in step 101, the present application repairs the fault data on the storage space corresponding to the first shared file system while the first shared file system remains mounted. How the fault data is repaired is described later and is not detailed here.
Step 102, after the fault data repair is completed, establish a second shared file system based on the data on the storage space corresponding to the first shared file system, where the second shared file system and the first shared file system correspond to the same storage space.
In this application, the name "second shared file system" is used only to distinguish it from the first shared file system described above and does not refer to any particular shared file system.
After step 102, the two shared file systems coexist on the same storage space.
Step 103, verify whether the second shared file system supports read and write operations, and when it does, migrate each virtual machine associated with the first shared file system online to the second shared file system.
As described above, the second shared file system corresponds to the same storage space as the first shared file system; in other words, both file systems are established on the same storage space. In step 103, migrating each virtual machine associated with the first shared file system online to the second shared file system is therefore a migration within one storage space. It does not occupy a large amount of additional storage space and does not require copying the virtual machines' disk data, so the virtual machines can be migrated online from the first shared file system to the second shared file system in a short time.
Thus, the flow shown in fig. 1 is completed.
As can be seen from the flow shown in FIG. 1, when the first shared file system only supports read operations, the present application first repairs the fault data on the storage space corresponding to the first shared file system, then newly creates a second shared file system on that storage space, and migrates the virtual machines associated with the first shared file system online to the second shared file system without shutting them down. This avoids the problems caused by shutting down the virtual machines associated with a shared file system for data repair.
Furthermore, because the second shared file system and the first shared file system are established on the same storage space, the online migration does not occupy a large amount of additional storage space and does not require copying the virtual machines' disk data, so the virtual machines can be migrated online from the first shared file system to the second shared file system in a short time.
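The flow of FIG. 1 can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions: `SharedFileSystem`, `recover`, and the `repair` and `verify` callables are hypothetical names that do not appear in the application, and the in-memory objects stand in for real file systems and virtual machines.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class SharedFileSystem:
    """Hypothetical stand-in for a shared file system on a storage space."""
    name: str
    storage_space: str          # identifier of the underlying storage space
    read_only: bool = False     # True when only read operations are supported
    vms: List[str] = field(default_factory=list)


def recover(first: SharedFileSystem,
            repair: Callable[[str], None],
            verify: Callable[[SharedFileSystem], bool]) -> SharedFileSystem:
    """Run steps 101 to 103 and return the file system the VMs end up on."""
    if not first.read_only:
        return first  # the first file system has not failed; nothing to do

    # Step 101: repair the fault data while the first file system stays mounted.
    repair(first.storage_space)

    # Step 102: create the second file system on the same storage space.
    second = SharedFileSystem(name=first.name + "-2",
                              storage_space=first.storage_space)

    # Step 103: migrate online only if both reads and writes work.
    if verify(second):
        second.vms, first.vms = first.vms, []
        return second
    return first
```

Because both objects point at the same `storage_space`, the migration in this model only moves the list of virtual machine names, which mirrors the point that no disk data needs to be copied.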
In this application, whether the second shared file system supports read and write operations may be verified by the flow illustrated in FIG. 2, which includes steps 201 and 202:
Step 201, execute a read operation on data in the storage space corresponding to the second shared file system, and execute a write operation to the storage space corresponding to the second shared file system, where the write operation includes writing newly created data.
Step 202, if both the read operation and the write operation are executed successfully, determine that the second shared file system supports read and write operations. Otherwise, if the read operation fails, determine that the second shared file system does not support read operations, and if the write operation fails, determine that it does not support write operations.
Here, a determination that the second shared file system does not support read operations, or that it does not support write operations, means in either case that the second shared file system does not support both read and write operations.
Steps 201 to 202 thus implement the verification of whether the second shared file system supports read and write operations.
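The verification described above can be illustrated with a small probe against a mounted directory. This is a sketch under assumptions, not the application's implementation: the function names and the `.rw_probe_` file prefix are hypothetical, and the read check simply reads one byte of the first regular file it finds.

```python
import os
import tempfile
from typing import Tuple


def verify_read_write(mount_point: str) -> Tuple[bool, bool]:
    """Return (read_ok, write_ok) for the file system mounted at mount_point."""
    # Read check: list the directory and read one byte of the first regular file.
    read_ok = True
    try:
        with os.scandir(mount_point) as entries:
            for entry in entries:
                if entry.is_file():
                    with open(entry.path, "rb") as f:
                        f.read(1)
                    break
    except OSError:
        read_ok = False

    # Write check: write newly created data, flush it to storage, then clean up.
    write_ok = True
    try:
        fd, probe = tempfile.mkstemp(dir=mount_point, prefix=".rw_probe_")
        try:
            os.write(fd, b"probe")
            os.fsync(fd)
        finally:
            os.close(fd)
            os.remove(probe)
    except OSError:
        write_ok = False

    return read_ok, write_ok


def supports_read_and_write(mount_point: str) -> bool:
    """The second file system is usable only if both checks succeed."""
    read_ok, write_ok = verify_read_write(mount_point)
    return read_ok and write_ok
```

Reporting the two results separately matches step 202, which distinguishes a failed read from a failed write, while `supports_read_and_write` reflects that either failure rules out migration.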
In step 103, as one embodiment, the host of each virtual machine remains unchanged when the virtual machines associated with the first shared file system are migrated online to the second shared file system. Because a virtual machine is virtualized on its host, migrating it online to the second shared file system while keeping the same host amounts to modifying configuration within the host. It does not affect any other device connected to the host, the host does not need to notify any other connected device, and the networking topology does not change.
For this case, in which the virtual machines are migrated online to the second shared file system while their hosts remain unchanged, a preferred implementation additionally migrates each virtual machine associated with the first shared file system from the second shared file system back to the first shared file system.
In order to implement that each virtual machine associated with the first shared file system is migrated from the second shared file system back to the first shared file system, the method provided by the present application may further include steps 301 to 303 in the flow illustrated in fig. 3:
Step 301, unmount the first shared file system.
Step 302, when the first shared file system is mounted again, migrate each virtual machine online from the second shared file system to the first shared file system.
Step 303, unmount the second shared file system.
Through steps 301 to 303, it is realized that each virtual machine associated with the first shared file system is migrated from the second shared file system back to the first shared file system.
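The order of operations when moving the virtual machines back can be sketched as follows. The model is deliberately simple and rests on assumptions: `MountedFs` and `migrate_back` are hypothetical names, mounting is reduced to a boolean flag, and the guard that the first file system hosts no virtual machines before it is unmounted is an added safety check rather than something stated in the application.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MountedFs:
    """Hypothetical in-memory model of a mounted shared file system."""
    name: str
    mounted: bool = True
    vms: List[str] = field(default_factory=list)


def migrate_back(first: MountedFs, second: MountedFs) -> None:
    """Return the virtual machines to the first file system (steps 301 to 303)."""
    if first.vms:
        raise ValueError("first file system must not host VMs before unmounting")

    # Step 301: unmount the first shared file system; no VM uses it any more.
    first.mounted = False

    # Step 302: once it is mounted again, migrate the VMs back online.
    first.mounted = True
    first.vms.extend(second.vms)
    second.vms.clear()

    # Step 303: the second shared file system is no longer needed.
    second.mounted = False
```

The point of the ordering is that each file system is unmounted only while no virtual machine depends on it, so no virtual machine has to be shut down at any stage.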
As an embodiment, before the fault data on the storage space corresponding to the first shared file system is repaired, the method further includes backing up that fault data. The fault data is preferably backed up in the form of data blocks.
The fault data is backed up so that, when a repair attempt does not succeed, the fault data on the storage space corresponding to the first shared file system can be repaired again, allowing repeated attempts until the repair finally succeeds.
In one embodiment, the backup may be stored on the storage space corresponding to the first shared file system, where it occupies a relatively small amount of space.
Based on this backup of the fault data on the storage space corresponding to the first shared file system, when it is verified that the second shared file system does not support read or write operations, the method provided by this application further includes steps 401 and 402 in the flow shown in FIG. 4:
Step 401, unmount the second shared file system, and restore the corresponding data on the storage space based on the backup of the fault data.
That is, the storage space is restored to the fault data that was originally present on the storage space corresponding to the first shared file system.
Step 402, repair the fault data on the storage space corresponding to the first shared file system, and after the repair is completed, return to the operation of establishing a second shared file system based on the data on the storage space corresponding to the first shared file system.
How the fault data on the storage space corresponding to the first shared file system is repaired in step 402 is described in detail below and is not repeated here.
Steps 401 to 402 thus allow repair and verification to be repeated, so that the correctness of the data repair is verified as part of the repair process.
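The repair, verify, and roll back cycle can be sketched as a loop. This is a minimal illustration under assumptions: all callables are hypothetical placeholders supplied by the caller, and `max_attempts` is an added bound that the application does not specify (it describes repeating until the repair succeeds).

```python
from typing import Callable, TypeVar

Backup = TypeVar("Backup")


def repair_with_retry(backup: Callable[[], Backup],
                      restore: Callable[[Backup], None],
                      repair: Callable[[], None],
                      create_and_verify: Callable[[], bool],
                      unmount_second: Callable[[], None],
                      max_attempts: int = 3) -> bool:
    """Repair, verify, and roll back on failure; True once verification passes."""
    saved = backup()                # back up the fault data before any repair
    for _ in range(max_attempts):   # max_attempts is an assumed safety bound
        repair()                    # step 101 on the first pass, step 402 afterwards
        if create_and_verify():     # create the second file system and verify it
            return True
        unmount_second()            # step 401: unmount the second file system
        restore(saved)              # step 401: restore the original fault data
    return False
```

Restoring from the backup before each new attempt means every repair starts from the same original fault data rather than from the result of a failed repair.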
In this application, repairing the fault data on the storage space corresponding to the first shared file system includes steps 501 to 502 in the flow shown in FIG. 5:
Step 501, divide the fault data on the storage space corresponding to the first shared file system into data blocks.
Step 502, repair the fault data on the storage space corresponding to the first shared file system in order from the smallest fault data block to the largest fault data block.
Steps 501 to 502 thus implement repairing the fault data on the storage space corresponding to the first shared file system while the first shared file system is mounted. In one embodiment, the fsck tool may be used for this repair. fsck is a common tool for checking and repairing file systems on Linux, so the details of how it repairs the fault data are not described here.
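The repair order can be illustrated with a short sketch. It assumes that "from the minimum fault data block to the maximum" refers to block size, which the text does not state explicitly; `FaultBlock`, `repair_smallest_first`, and the `repair_block` callable are hypothetical names, and the actual repair of a block (for example by a tool such as fsck) is left to the caller.

```python
from typing import Callable, Iterable, List, NamedTuple


class FaultBlock(NamedTuple):
    offset: int   # start of the block on the storage space, in bytes
    length: int   # size of the block, in bytes


def repair_smallest_first(blocks: Iterable[FaultBlock],
                          repair_block: Callable[[FaultBlock], None]) -> List[FaultBlock]:
    """Repair fault data blocks in ascending size order and return that order."""
    # Assumption: "minimum" and "maximum" are interpreted as block length.
    ordered = sorted(blocks, key=lambda b: b.length)
    for block in ordered:
        repair_block(block)
    return ordered
```

Returning the order in which blocks were repaired makes the sketch easy to check and gives the caller a record of the sequence.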
In step 103, as another embodiment, each virtual machine associated with the first shared file system is migrated online to the second shared file system across hosts; that is, the host of each virtual machine changes. Unlike the case in which the host of each virtual machine remains unchanged, a change of host changes the networking topology. To prevent service interruption of the virtual machine, the original host of each virtual machine must therefore promptly notify every other connected device that the virtual machine has been migrated away, and the new host of each virtual machine must promptly notify every other connected device that the virtual machine has newly migrated in.
The method provided by the present application is described above. The apparatus provided by the present application is described below.
Referring to FIG. 6, FIG. 6 is a diagram illustrating the structure of the apparatus provided by the present application. As an example, the apparatus shown in FIG. 6 may be applied to a management device. In a specific implementation, the management device may be a host, a server, or another device; the present application places no specific limitation on this.
As shown in fig. 6, the apparatus includes:
the fault repairing unit is used for repairing fault data on a storage space corresponding to the first shared file system when the first shared file system only supports read operation;
the second shared file system processing unit is used for establishing a second shared file system based on data on a storage space corresponding to a first shared file system after fault data recovery is completed, and the second shared file system and the first shared file system correspond to the storage space;
a verifying unit for verifying whether the second shared file system supports read and write operations,
and the migration unit is used for migrating each virtual machine associated with the first shared file system to the second shared file system on line when the verification unit verifies that the second shared file system supports read and write operations.
Preferably, as shown in fig. 6, the apparatus further comprises: a first shared file system processing unit;
the first shared file system processing unit is used for unloading the first shared file system and migrating the virtual machines from the second shared file system to the first shared file system on line when the first shared file system is mounted again;
the second shared file system processing unit is further used for unloading the second shared file system after the virtual machines are online migrated from the second shared file system to the first shared file system.
Preferably, the verifying unit verifies whether the second shared file system supports read and write operations, including:
executing a read operation on data in a storage space corresponding to the second shared file system, and executing a write operation to the storage space corresponding to the second shared file system, where the executing the write operation includes: writing the newly created data;
and if the read operation and the write operation are successfully executed, determining that the second shared file system supports the read operation and the write operation, otherwise, when the read operation is not successfully executed, determining that the second shared file system does not support the read operation, and when the write operation is not successfully executed, determining that the second shared file system does not support the write operation.
Preferably, as shown in fig. 6, the apparatus further comprises: a backup processing unit;
the backup processing unit is used for backing up the fault data on the storage space corresponding to the first shared file system before the fault repairing unit repairs the fault data on the storage space corresponding to the first shared file system;
the second shared file system processing unit is further used for unloading the second shared file system when the verification unit verifies that the second shared file system does not support read or write operation;
the backup processing unit further restores the corresponding data on the storage space based on the backup of the fault data, and triggers the fault restoration unit to restore the fault data on the storage space corresponding to the first shared file system.
Preferably, the repairing the failure data on the storage space corresponding to the first shared file system by the failure repairing unit includes:
dividing data blocks of fault data on a storage space corresponding to a first shared file system;
and repairing the fault data on the storage space corresponding to the first shared file system according to the sequence from the minimum fault data block to the maximum fault data block.
Thus, the description of the device structure shown in fig. 6 is completed.
The above description is only exemplary of the present application and should not be taken as limiting the present application, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the scope of protection of the present application.
Claims (10)
1. A method for data recovery, the method comprising:
when the first shared file system only supports read operation, repairing fault data on a storage space corresponding to the first shared file system;
after fault data recovery is completed, establishing a second shared file system based on data on a storage space corresponding to a first shared file system, wherein the second shared file system and the first shared file system correspond to the storage space;
and verifying whether the second shared file system supports read and write operations, and migrating each virtual machine associated with the first shared file system to the second shared file system on line when the second shared file system supports read and write operations.
2. The method of claim 1, further comprising:
unloading the first shared file system;
when the first shared file system is mounted again, migrating each virtual machine from the second shared file system to the first shared file system on line;
the second shared file system is unloaded.
3. The method of claim 1, wherein verifying whether the second shared file system supports read and write operations comprises:
executing a read operation on data in a storage space corresponding to the second shared file system, and executing a write operation to the storage space corresponding to the second shared file system, where the executing the write operation includes: writing the newly created data;
and if the read operation and the write operation are successfully executed, determining that the second shared file system supports the read operation and the write operation, otherwise, when the read operation is not successfully executed, determining that the second shared file system does not support the read operation, and when the write operation is not successfully executed, determining that the second shared file system does not support the write operation.
4. The method of claim 1, further comprising, before repairing the failed data on the storage space corresponding to the first shared file system: backing up fault data on a storage space corresponding to the first shared file system;
when it is verified that the second shared file system does not support read or write operations, the method further comprises:
unloading the second shared file system, and recovering corresponding data on the storage space based on backup of fault data;
and repairing the fault data on the storage space corresponding to the first shared file system, and returning to execute the operation of establishing a second shared file system based on the data on the storage space corresponding to the first shared file system after completing the data repair.
5. The method of claim 1 or 4, wherein repairing the failed data on the storage space corresponding to the first shared file system comprises:
dividing data blocks of fault data on a storage space corresponding to a first shared file system;
and repairing the fault data on the storage space corresponding to the first shared file system according to the sequence from the minimum fault data block to the maximum fault data block.
6. A data recovery apparatus, characterized in that the apparatus comprises:
the fault repairing unit is used for repairing fault data on a storage space corresponding to the first shared file system when the first shared file system only supports read operation;
the second shared file system processing unit is used for establishing a second shared file system based on data on a storage space corresponding to a first shared file system after fault data recovery is completed, and the second shared file system and the first shared file system correspond to the storage space;
a verifying unit for verifying whether the second shared file system supports read and write operations,
and the migration unit is used for migrating each virtual machine associated with the first shared file system to the second shared file system on line when the verification unit verifies that the second shared file system supports read and write operations.
7. The apparatus of claim 6, further comprising:
the first shared file system processing unit is used for unloading the first shared file system and migrating the virtual machines from the second shared file system to the first shared file system on line when the first shared file system is mounted again;
the second shared file system processing unit is further used for unloading the second shared file system after the virtual machines are online migrated from the second shared file system to the first shared file system.
8. The apparatus of claim 6, wherein the verifying unit verifies whether the second shared file system supports read and write operations comprises:
executing a read operation on data in a storage space corresponding to the second shared file system, and executing a write operation to the storage space corresponding to the second shared file system, where the executing the write operation includes: writing the newly created data;
and if the read operation and the write operation are successfully executed, determining that the second shared file system supports the read operation and the write operation, otherwise, when the read operation is not successfully executed, determining that the second shared file system does not support the read operation, and when the write operation is not successfully executed, determining that the second shared file system does not support the write operation.
9. The apparatus of claim 6, further comprising: a backup processing unit;
the backup processing unit is used for backing up the fault data on the storage space corresponding to the first shared file system before the fault repairing unit repairs the fault data on the storage space corresponding to the first shared file system;
the second shared file system processing unit is further used for unloading the second shared file system when the verification unit verifies that the second shared file system does not support read or write operation;
the backup processing unit further restores the corresponding data on the storage space based on the backup of the fault data, and triggers the fault restoration unit to restore the fault data on the storage space corresponding to the first shared file system.
10. The apparatus according to claim 6 or 9, wherein the repairing the failed data on the storage space corresponding to the first shared file system by the failure repairing unit comprises:
dividing data blocks of fault data on a storage space corresponding to a first shared file system;
and repairing the fault data on the storage space corresponding to the first shared file system according to the sequence from the minimum fault data block to the maximum fault data block.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710331059.2A CN107544868B (en) | 2017-05-11 | 2017-05-11 | Data recovery method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107544868A (en) | 2018-01-05 |
CN107544868B (en) | 2020-06-09 |
Family
ID=60966270
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710331059.2A Active CN107544868B (en) | 2017-05-11 | 2017-05-11 | Data recovery method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107544868B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109086166A (en) * | 2018-07-09 | 2018-12-25 | 郑州云海信息技术有限公司 | A kind of backup of virtual machine and restoration methods and device |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103176845A (en) * | 2011-12-22 | 2013-06-26 | 中国移动通信集团公司 | Method, system and device for virtual machine arrangement |
CN103605562A (en) * | 2013-12-10 | 2014-02-26 | 浪潮电子信息产业股份有限公司 | Method for migrating kernel-based virtual machine (KVM) between physical hosts |
CN103761168A (en) * | 2014-01-26 | 2014-04-30 | 上海爱数软件有限公司 | Method for mounting backup virtual machine based on nfs volume |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6862609B2 (en) * | 2001-03-07 | 2005-03-01 | Canopy Group, Inc. | Redundant storage for multiple processors in a ring network |
US10776209B2 (en) * | 2014-11-10 | 2020-09-15 | Commvault Systems, Inc. | Cross-platform virtual machine backup and replication |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||