CN103064765B - Data reconstruction method, device and cluster storage system - Google Patents

Publication number
CN103064765B
Authority
CN
China
Prior art keywords
storage
file
memory
virtual volume
metadata
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201210587371.5A
Other languages
Chinese (zh)
Other versions
CN103064765A (en)
Inventor
李立
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN201210587371.5A
Publication of CN103064765A
Application granted
Publication of CN103064765B
Legal status: Active
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a data recovery method, an apparatus and a cluster storage system, and relates to the field of storage technology. The method comprises: when a storage device in the cluster storage system fails, querying the reverse mapping table of the storage node where the failed storage device is located to obtain information about the files stored on the failed storage device; querying the metadata virtual volume whose primary node is the storage node where the failed storage device is located to obtain the data slices of those files that are not stored on the failed storage device; and recovering, from the obtained data slices, the data slices of the files that were stored on the failed storage device. The method, apparatus and system of the embodiments of the invention keep a relatively fine granularity, achieve block-level data recovery and data migration, and control load balancing effectively, while greatly reducing the amount of data scanned and delivering high recovery performance.

Description

Data reconstruction method, device and cluster storage system
Technical field
The present invention relates to the field of storage technology, and in particular to a data recovery (reconstruction) method, an apparatus and a cluster storage system.
Background
With the rapid development of computing and network technology, the capacity required for data storage keeps growing, and so does its cost. To meet the capacity demand, storage systems have evolved from combining multiple storage devices into a standalone storage server to combining multiple standalone storage servers into cluster storage.
A large-scale cluster storage system usually connects a large number of storage nodes (storage servers) through coaxial cable, network cable, optical fiber and the like. A storage node usually has local storage devices such as disks and solid-state drives (Solid State Device/Solid State Drive, SSD) for storing data, and the cluster storage system provides storage services to hosts through standard or proprietary interfaces and protocols. To achieve higher reliability and performance and lower cost while increasing capacity, a cluster storage system usually processes data with Redundant Arrays of Inexpensive Disks (RAID) technology or erasure code (Erasure Code, EC) technology before writing it to the storage devices. That is, file data is split into slices according to the RAID/EC algorithm, and each slice is stored on a different storage device on a different storage server, which ensures that data is not lost when some storage nodes or storage devices fail. When a storage node or storage device suffers an unrecoverable fault, the storage system starts a "data recovery" process, which uses the remaining storage nodes and storage devices to recover the lost part of the data and store it again.
To achieve data recovery, storage systems usually organize data as follows:
The storage system is organized by file. Each file has an attribute indicating its data protection mode. When a file is stored, it is split into slices according to its data protection mode, and the slices are stored on the storage devices of different storage nodes; the location of each data slice and the attributes of the file are recorded in the file's metadata. When a storage device fails, such a storage system usually initiates a scan of the file system. By scanning the file system metadata it finds the data addresses that have become invalid, then, according to the metadata records, recomputes the lost data blocks with the RAID or EC algorithm and rewrites them to a new valid location, updating the data block records in the metadata at the same time.
However, determining which files need data recovery requires scanning all file metadata. This severely degrades the performance of the whole system, and if the scan takes too long, another device may fail in the meantime, which increases the risk of data loss.
Summary of the invention
In view of this, the problem to be solved by the invention is to provide a data recovery method, an apparatus and a cluster storage system that keep a relatively fine granularity, achieve block-level data recovery and data migration, and control load balancing effectively, while keeping the scanning overhead of data recovery small.
To solve the above problem, in a first aspect, an embodiment of the invention provides a data recovery method. The method is applied in a cluster storage system comprising at least two storage nodes, and the method comprises:
when a storage device in the cluster storage system fails, querying the reverse mapping table kept in the storage node where the failed storage device is located, and obtaining from the reverse mapping table information about the files stored on the failed storage device, wherein each file is divided into at least two data slices;
querying the metadata virtual volume whose primary node is the storage node where the failed storage device is located, and obtaining the data slices of the file that are not stored on the failed storage device, the metadata virtual volume being a virtual volume for storing file metadata;
recovering, from the obtained data slices, the data slices of the file that were stored on the failed storage device.
With reference to the first aspect, in a first implementation, the reverse mapping table is stored in each storage node of the cluster storage system and records the correspondence between files and the storage devices corresponding to a specific metadata virtual volume, the specific metadata virtual volume being the metadata virtual volume whose primary node is the storage node where the reverse mapping table is located.
With reference to the first implementation of the first aspect, in a second implementation, the reverse mapping table further comprises multiple reverse mapping sub-tables, stored respectively on the storage devices of the corresponding storage node; each reverse mapping sub-table records part of the content of the reverse mapping table stored on its storage node.
With reference to the second implementation of the first aspect, in a third implementation, the method further comprises:
recovering the reverse mapping sub-table of the failed storage device according to the metadata virtual volume whose primary node is the storage node where the failed storage device is located.
In a second aspect, an embodiment of the invention provides a data recovery apparatus. The apparatus is applied in a cluster storage system comprising at least two storage nodes, and the apparatus comprises:
a reverse mapping table query unit, configured to, when a storage device in the cluster storage system fails, query the reverse mapping table of the storage node where the failed storage device is located and obtain information about the files stored on the failed storage device, wherein each file is divided into at least two data slices;
a metadata virtual volume query unit, configured to query the metadata virtual volume whose primary node is the storage node where the failed storage device is located and obtain the data slices of the file that are not stored on the failed storage device, the metadata virtual volume being a virtual volume for storing file metadata;
a data recovery unit, configured to recover, from the obtained data slices, the data slices of the file that were stored on the failed storage device.
With reference to the second aspect, in a first implementation, the reverse mapping table is stored in each storage node and records the correspondence between files and the storage devices corresponding to a specific metadata virtual volume, the specific metadata virtual volume being the metadata virtual volume whose primary node is the storage node where the reverse mapping table is located.
With reference to the first implementation of the second aspect, in a second implementation, the reverse mapping table further comprises multiple reverse mapping sub-tables, stored respectively on the storage devices of the corresponding storage node; each reverse mapping sub-table records part of the content of the reverse mapping table stored on its storage node.
With reference to the second implementation of the second aspect, in a third implementation, the apparatus further comprises:
a maintenance unit, configured to recover the reverse mapping sub-table of the failed storage device according to the metadata virtual volume whose primary node is the storage node where the failed storage device is located.
With reference to the second aspect or any of the above implementations of the second aspect, in a fourth implementation, each metadata virtual volume corresponds to at least one storage device, and the storage node of one of those storage devices is the primary node of the metadata virtual volume; the metadata of each file has a predetermined number of mirrors on the metadata virtual volume, and each mirror is mapped to a different storage device; each storage node in the cluster storage system is the primary node of some metadata virtual volume.
In a third aspect, an embodiment of the invention provides a cluster storage system. The cluster storage system comprises at least two storage nodes and further comprises the data recovery apparatus of the second aspect or of any of the above implementations of the second aspect.
When performing data recovery, the method, apparatus and system of the embodiments of the invention only need to query the reverse mapping table to obtain the files that need recovery; by analyzing a small amount of file metadata, they can quickly locate the data slices that need recovery, so recovery performance is high. Moreover, because each storage node has its own reverse mapping table, the storage nodes can recover concurrently without affecting one another, which further improves recovery performance.
Further features and aspects of the invention will become clear from the following detailed description of exemplary embodiments with reference to the accompanying drawings.
Brief description of the drawings
The accompanying drawings, which are included in and form part of the specification, illustrate exemplary embodiments, features and aspects of the invention together with the specification, and serve to explain the principles of the invention.
Fig. 1 is a schematic diagram of the system architecture of a cluster storage system to which the data recovery method provided by an embodiment of the invention is applied;
Fig. 2 is a flowchart of the data recovery method of an embodiment of the invention;
Fig. 3 is a schematic diagram of the principle of the data recovery method of an embodiment of the invention;
Fig. 4 is a schematic diagram of the reverse mapping table in the data recovery method of an embodiment of the invention;
Fig. 5 is a schematic diagram of the metadata virtual volume in the data recovery method of an embodiment of the invention;
Fig. 6 is a schematic diagram of the principle of updating the reverse mapping table according to the metadata virtual volume in the data recovery method of an embodiment of the invention;
Fig. 7 is another flowchart of the data recovery method of an embodiment of the invention;
Fig. 8 is a schematic diagram of the reverse mapping table structure in the data recovery method of an embodiment of the invention;
Fig. 9 is a structural block diagram of a data recovery apparatus of an embodiment of the invention;
Fig. 10 is a structural block diagram of another data recovery apparatus of an embodiment of the invention;
Fig. 11 is a structural block diagram of yet another data recovery apparatus of an embodiment of the invention.
Detailed description
Various exemplary embodiments, features and aspects of the invention are described in detail below with reference to the drawings. Identical reference numerals in the drawings denote elements with identical or similar functions. Although various aspects of the embodiments are shown in the drawings, the drawings are not necessarily drawn to scale unless otherwise indicated.
The word "exemplary" here means "serving as an example, embodiment or illustration". Any embodiment described here as "exemplary" is not necessarily to be construed as preferred over or superior to other embodiments.
In addition, numerous specific details are given in the embodiments below to better explain the invention. Those skilled in the art will understand that the invention can be practiced without these details. In other instances, well-known methods, means, elements and circuits are not described in detail, so as to highlight the gist of the invention.
For a better understanding of the invention, the data protection mode of a file is described first: the data protection mode indicates which technology (RAID or EC) is used to process the file's data before it is stored.
For example, RAID0 distributes a file's data across multiple storage devices to expand capacity, but damage to any one device causes partial data loss. RAID01 keeps two copies of every block while distributing data across multiple devices, which ensures that the failure of any single device causes no data loss, but space utilization is only 50%. RAID5 splits data into slices and generates one parity slice for every N data slices, storing the N+1 slices on multiple devices; this ensures that no data is lost when any one device fails, with a space utilization of N/(N+1). RAID6 is similar to RAID5: every N data slices produce 2 parity slices, so it tolerates the failure of two devices, with a space utilization of N/(N+2). An EC algorithm is a more flexible data protection algorithm that produces (N+M) slices from N data slices and guarantees that data is not lost after up to M slices are damaged, with a space utilization of N/(N+M).
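The space utilization figures above all follow the same N/(N+M) rule, which a short Python sketch can make concrete. The function name and the choice of N = 4 are illustrative assumptions, not part of the patent text.

```python
from fractions import Fraction


def space_utilization(n_data: int, m_redundant: int) -> Fraction:
    """Usable fraction of raw capacity for an N+M layout: N / (N + M)."""
    if n_data <= 0 or m_redundant < 0:
        raise ValueError("need N > 0 and M >= 0")
    return Fraction(n_data, n_data + m_redundant)


# Hypothetical layouts, all with N = 4 data slices except the mirror case.
raid5 = space_utilization(4, 1)    # one parity slice per 4 data slices
raid6 = space_utilization(4, 2)    # two parity slices per 4 data slices
ec_4_3 = space_utilization(4, 3)   # EC layout that survives 3 lost slices
mirror = space_utilization(1, 1)   # two copies of every block, as in RAID01
```

Using exact fractions rather than floats keeps the comparison between layouts free of rounding noise.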
In addition, the data recovery method provided by the embodiments of the invention can be implemented on a cluster storage system. Fig. 1 is a schematic diagram of the system architecture of a cluster storage system to which the method is applied. As shown in Fig. 1, the storage system comprises three servers (each server can be understood as a storage node): server 1, server 2 and server 3. Note that the embodiment uses three servers only as an example and is not limited to three servers.
A server may be any storage server known in the art, with an operating system and other application programs installed. Each server also contains three storage devices; again, the embodiment uses three storage devices only as an example and is not limited to three. Each server may further contain a processor, memory, I/O, buffers, a network adapter and so on.
A storage device may be any storage device known in the art, such as a disk, a RAID, a disk cluster (Just a Bunch Of Disks, JBOD), one or more interconnected disk drives of a direct access storage device (Direct Access Storage Device, DASD), or a tape storage device such as a tape library or one or more storage units; it may also be a solid-state drive (Solid State Drive, SSD) and the like.
Note that most storage systems do not present the physical disks of the storage devices directly to the operating system; instead, the storage space provided by each physical disk is mapped to a logical region, i.e. a virtual volume, for use by users.
For example, as shown in Fig. 1, virtual volume 1 is mapped from physical disks in server 1, server 2 and server 3. Specifically, virtual volume 1 comprises 3 components: the first component comes from a physical disk in server 1, the second from a physical disk in server 2, and the third from a physical disk in server 3. Because the first component is the primary component of virtual volume 1, server 1 is the primary node of virtual volume 1.
As another example, virtual volume 2 is also mapped from physical disks in server 1, server 2 and server 3. Specifically, virtual volume 2 comprises 3 components: the first component comes from a physical disk in server 1, the second from a physical disk in server 2, and the third from a physical disk in server 3. Because the second component is the primary component of virtual volume 2, server 2 is the primary node of virtual volume 2.
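The relationship between a virtual volume, its components and its primary node can be modelled as a small data structure. The sketch below is only an illustration: the class names, the `disk` index and the `primary_index` field are assumptions introduced here, not structures named in the patent.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Component:
    server: int  # storage node that owns the physical disk
    disk: int    # disk index inside that server (hypothetical numbering)


@dataclass(frozen=True)
class VirtualVolume:
    name: str
    components: Tuple[Component, ...]  # ordered: first, second, third component
    primary_index: int                 # which component is the primary one

    def primary_node(self) -> int:
        """The server hosting the primary component is the primary node."""
        return self.components[self.primary_index].server


parts = (Component(1, 0), Component(2, 0), Component(3, 0))
volume1 = VirtualVolume("virtual volume 1", parts, primary_index=0)
volume2 = VirtualVolume("virtual volume 2", parts, primary_index=1)
```

Both volumes span the same three servers; only the choice of primary component differs, which is what spreads primary-node duties across the cluster.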
Referring to Fig. 2, which is a flowchart of the data recovery method of an embodiment of the invention, the method comprises:
S1. When a storage device in the cluster storage system fails, query the reverse mapping table of the storage node where the failed storage device is located to obtain information about the files stored on the failed storage device. The reverse mapping table contains the information about the files stored on the failed storage device, and each file is divided into at least two data slices;
S2. Query the metadata virtual volume whose primary node is the storage node where the failed storage device is located to obtain the data slices of the file that are not stored on the failed storage device. The metadata virtual volume is a virtual volume for storing file metadata. A file's metadata records the file's data protection mode (when the file is stored, its data is split into slices according to that mode) and the storage location of each of its data slices. The storage locations of all data slices of the file can therefore be looked up from the file's metadata, and the data slices that are not stored on the failed storage device can then be fetched;
S3. Recover, from the obtained data slices, the data slices of each file that were stored on the failed storage device.
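Steps S1 to S3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the reverse mapping table is modelled as a dict from device name to file IDs, the metadata as a dict holding a `slice_locations` list, and the protection mode is assumed to be single XOR parity (RAID5-like 3+1). All names, device labels and the `read_slice` callback are hypothetical.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length slices."""
    return bytes(x ^ y for x, y in zip(a, b))


def recover_device(failed, reverse_map, metadata, read_slice):
    """Return {(file_id, slice_index): bytes} for slices lost on `failed`."""
    recovered = {}
    # S1: the reverse mapping table names the affected files directly.
    for file_id in reverse_map.get(failed, []):
        # S2: the file metadata gives the device holding each slice.
        locations = metadata[file_id]["slice_locations"]
        survivors = [read_slice(file_id, i)
                     for i, dev in enumerate(locations) if dev != failed]
        # S3: with single XOR parity, the lost slice is the XOR of the rest.
        for i, dev in enumerate(locations):
            if dev == failed:
                recovered[(file_id, i)] = reduce(xor_bytes, survivors)
    return recovered


# Hypothetical file 1112 stored as 3 data slices plus 1 parity slice.
data = [b"ab", b"cd", b"ef"]
parity = reduce(xor_bytes, data)
stored = dict(enumerate(data + [parity]))
metadata = {1112: {"protection": "3+1",
                   "slice_locations": ["Disk0-0", "Disk1-1", "Disk2-3", "Disk3-2"]}}
reverse_map = {dev: [1112] for dev in metadata[1112]["slice_locations"]}

result = recover_device("Disk2-3", reverse_map, metadata,
                        lambda file_id, i: stored[i])
```

Only the files listed under the failed device are touched; no other file metadata is read. A real EC layout with M > 1 would replace the XOR step with the corresponding decoder.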
In the prior art, determining which files need data recovery requires scanning all file metadata, which incurs a huge overhead. In the method of the embodiment of the invention, when a storage device fails, the information about the files stored on the failed storage device is obtained by querying the reverse mapping table kept in the storage node where the failed storage device is located, without scanning all file metadata. Once that information has been found, the data slices of those files that are not stored on the failed storage device are obtained by querying the metadata virtual volume whose primary node is the storage node where the failed storage device is located, and the data that needs recovery can be reconstructed from these data slices. The embodiment of the invention therefore only needs to query the reverse mapping table to obtain the files that need recovery, and by analyzing a small amount of file metadata it can quickly locate the data blocks that need recovery, which improves recovery performance.
Referring to Fig. 3, by scanning the metadata in the metadata virtual volume whose primary node is this storage node, the data slices of a file that are not stored on the failed storage device can be obtained, and data recovery is finally performed according to the file's data protection mode. Suppose the cluster storage system has n storage nodes; the metadata of all files is then distributed across these n storage nodes, so compared with the traditional scan of all file metadata, the amount of metadata to scan is only about 1/n of the total.
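The 1/n estimate can be checked with a small count. The sketch assumes, purely for illustration, that file metadata is spread evenly over the nodes by file ID modulo n; the patent does not specify the placement rule, and the function names are hypothetical.

```python
def primary_node_of(file_id: int, n_nodes: int) -> int:
    """Assumed placement rule for this sketch: spread metadata evenly by ID."""
    return file_id % n_nodes


def records_scanned(file_ids, n_nodes: int, failed_node: int) -> int:
    """Metadata records held by the node that owns the failed device."""
    return sum(1 for f in file_ids
               if primary_node_of(f, n_nodes) == failed_node)


total_files = 100_000
n_nodes = 8
scanned = records_scanned(range(total_files), n_nodes, failed_node=3)
fraction = scanned / total_files
```

With 8 nodes the scan covers one eighth of the records that a full file system scan would visit; an uneven placement rule would shift the figure for individual nodes.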
Specifically, the reverse mapping table is stored in each storage node of the cluster storage system and records the correspondence between the storage devices corresponding to a specific metadata virtual volume and the files stored on those storage devices, where the specific metadata virtual volume is the metadata virtual volume whose primary node is the storage node where the reverse mapping table is located. In the method of the embodiment of the invention, the entries of the reverse mapping table can be multiple (storage device, file) pairs, in which each entry records the one-to-one correspondence between a storage device and a file stored on it; they can be (storage device, file list) groups, in which each entry records the correspondence between a storage device and all the files stored on it; or they can be (file, storage device list) groups, in which each entry records the correspondence between a file and all the storage devices on which its data slices are stored. The reverse mapping table can be as shown in Fig. 4, where Node denotes a storage node and Disk denotes a disk on a storage node, i.e. a storage device.
The metadata virtual volume is used to store file metadata. In the method of the embodiment of the invention, file data is split into slices according to a RAID or EC data protection mode, each data slice is distributed by a load balancing algorithm onto a different storage device of a different storage server, the address of each data slice is recorded in the file's metadata, and the metadata is stored in the metadata virtual volume. This keeps a relatively fine granularity, so that the method of the embodiment of the invention can achieve block-level data recovery and data migration and control load balancing effectively.
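The three entry layouts described above carry the same information and can be converted into one another. The following sketch shows the conversion from the pair form to the two grouped forms; the device labels and file IDs are hypothetical examples, and the function names are not from the patent.

```python
from collections import defaultdict

# Form 1: (storage device, file) pairs, one entry per stored slice location.
pairs = [("Disk0-0", 1112), ("Disk1-2", 1112), ("Disk0-0", 2040)]


def group_by_device(entries):
    """Form 2: storage device -> list of all files stored on it."""
    table = defaultdict(list)
    for device, file_id in entries:
        table[device].append(file_id)
    return dict(table)


def group_by_file(entries):
    """Form 3: file -> list of all storage devices holding its slices."""
    table = defaultdict(list)
    for device, file_id in entries:
        table[file_id].append(device)
    return dict(table)


by_device = group_by_device(pairs)
by_file = group_by_file(pairs)
```

The device-keyed form answers "which files are affected by this failure" in one lookup, while the file-keyed form answers "where are the slices of this file"; which one to keep is a trade-off between recovery lookups and file access lookups.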
Moreover, each metadata virtual volume corresponds to at least one storage device (that is, the metadata virtual volume maps the metadata it stores onto at least one storage device), and the storage node of one of the storage devices corresponding to each metadata virtual volume is the primary node of that metadata virtual volume. The primary node is responsible for operations on the metadata virtual volume, such as reads and writes, recovery and migration.
Note that a metadata virtual volume does not have a fixed protection mode attribute; metadata can be stored with multi-way mirroring, such as 3-way or 5-way mirroring. The metadata of each file has a predetermined number of mirrors on the metadata virtual volume, and each mirror is mapped to a different storage device. This improves the safety of data storage and ensures that a reference for recovering data still exists when a storage device fails. For example, if the data protection mode of a file is N+M EC or N+M RAID, the predetermined number of mirrors for that file is M+1. As shown in Fig. 5, metadata virtual volume-1 corresponds to 5 storage device members, and the storage server of member 0 is the primary node of this metadata virtual volume. Metadata virtual volume-1 stores the metadata of files A, B and C: file A has three mirrors, mapped to member 0, member 1 and member 2; file B has five mirrors, mapped to all five members; file C has four mirrors, mapped to the four members other than member 4.
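The M+1 mirror rule and the Fig. 5 layout can be reproduced with a few lines of Python. The sketch assumes that mirrors are placed on the lowest-numbered members first, which matches the Fig. 5 example but is not stated as a rule in the text; the function names and the M values chosen for files A, B and C are illustrative.

```python
def mirror_count(m_redundant: int) -> int:
    """Metadata mirrors for a file protected as N+M: M + 1."""
    return m_redundant + 1


def place_mirrors(members, count):
    """Map each mirror to a different member (assumed: lowest index first)."""
    if count > len(members):
        raise ValueError("not enough members for the requested mirror count")
    return list(members[:count])


members = [0, 1, 2, 3, 4]  # the 5 members of metadata virtual volume-1

file_a = place_mirrors(members, mirror_count(2))  # file protected as N+2
file_b = place_mirrors(members, mirror_count(4))  # file protected as N+4
file_c = place_mirrors(members, mirror_count(3))  # file protected as N+3
```

A plausible reading of the M+1 rule is that the metadata should survive at least as many device failures as the file data itself: after M failures at least one mirror remains.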
Note that each storage node in the cluster storage system is the primary node of some metadata virtual volume. After a file's metadata is written to a metadata virtual volume, the primary node of that metadata virtual volume is responsible for adding the file to the reverse mapping table stored on that storage node, together with the storage devices that hold the data slices recorded in the file's metadata. As shown in Fig. 6, the file's metadata in the metadata virtual volume records the file ID (1112), the file's data protection mode (3+1), the size of the file data and the locations of the data slices. In the reverse mapping table, taking the first entry as an example, Disk0-0 denotes disk 0 (a storage device) of storage node 0; taking the third entry as an example, Disk-2-3 denotes disk 3 of storage node 2, which stores a data slice of file 1112; the other entries follow the same pattern.
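How a primary node might register a newly written file in its reverse mapping table can be sketched as follows. The metadata fields mirror those listed for Fig. 6 (file ID, protection mode, size, slice locations), but the dict layout, the size value and all device labels except Disk0-0 are assumptions made for the example.

```python
def register_file(reverse_map: dict, file_metadata: dict) -> None:
    """Add one device -> file entry for every slice location of the file."""
    file_id = file_metadata["file_id"]
    for device in file_metadata["slice_locations"]:
        files = reverse_map.setdefault(device, [])
        if file_id not in files:  # keep the operation idempotent
            files.append(file_id)


# Hypothetical metadata record for the file of Fig. 6.
meta_1112 = {
    "file_id": 1112,
    "protection": "3+1",
    "size": 3 * 4096,  # assumed size, not given in the text
    "slice_locations": ["Disk0-0", "Disk1-1", "Disk2-3", "Disk3-2"],
}

reverse_map = {}
register_file(reverse_map, meta_1112)
register_file(reverse_map, meta_1112)  # repeating the call changes nothing
```

Making the registration idempotent means a retried metadata write cannot leave duplicate entries behind.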
In summary, the method of the embodiment of the invention keeps a relatively fine granularity: because a file is divided into multiple data slices for storage, data recovery and data migration can be performed at block (slice) level, and load balancing is controlled effectively. At the same time the amount of data scanned is greatly reduced: only the reverse mapping table needs to be queried to obtain the files that need recovery, and by analyzing a small amount of file metadata the data blocks that need recovery can be located quickly. Because each storage node has its own reverse mapping table, kept by the primary node of the metadata virtual volume, the storage nodes can recover concurrently without affecting one another, which further improves recovery performance.
As shown in Figure 7, the method for the embodiment of the present invention, the basis of each step shown in Fig. 2 also comprises:
S4. be that file is stored in data slice in described failure storage and distributes target storage according to the principle of load balancing;
S5. the described file data slice be stored in described failure storage is stored into described target storage;
S6. upgrade corresponding metadata virtual volume and back mapping table: the metadata of updating file, delete mapping relations old in back mapping table, add new mapping relations.
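Steps S4 to S6 can be sketched as one relocation routine. The load balancing rule used here (pick the device holding the fewest slices that does not already hold a slice of the same file) is an assumption for illustration; the patent only requires that the target be chosen according to load balancing. All names, device labels and load figures are hypothetical.

```python
def pick_target(load: dict, excluded: set) -> str:
    """S4: least-loaded device that holds no slice of this file."""
    candidates = [d for d in load if d not in excluded]
    if not candidates:
        raise RuntimeError("no eligible storage device")
    return min(candidates, key=lambda d: (load[d], d))


def relocate_slice(file_id, slice_index, payload,
                   metadata, reverse_map, load, write_slice):
    locations = metadata[file_id]["slice_locations"]
    failed = locations[slice_index]
    target = pick_target(load, excluded=set(locations))
    write_slice(target, file_id, slice_index, payload)   # S5: store the slice
    locations[slice_index] = target                      # S6: update metadata
    reverse_map[failed].remove(file_id)                  # S6: drop old mapping
    reverse_map.setdefault(target, []).append(file_id)   # S6: add new mapping
    load[target] += 1
    return target


metadata = {1112: {"slice_locations": ["Disk0-0", "Disk1-1", "Disk2-3", "Disk3-2"]}}
reverse_map = {dev: [1112] for dev in metadata[1112]["slice_locations"]}
load = {"Disk0-0": 5, "Disk1-1": 2, "Disk2-3": 4,
        "Disk3-2": 3, "Disk4-0": 1, "Disk5-1": 1}
written = {}


def write_slice(device, file_id, slice_index, payload):
    written[(device, file_id, slice_index)] = payload


target = relocate_slice(1112, 2, b"ef", metadata, reverse_map, load, write_slice)
```

Excluding every device that already holds a slice of the file keeps the recovered slice on a distinct device, so the file retains its original fault tolerance after relocation.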
In addition, when a file is accessed, all storage devices corresponding to the target file are found by traversing the reverse mapping table stored on the primary node of the metadata virtual volume corresponding to that file. To improve access performance, each reverse mapping table can be divided into multiple reverse mapping sub-tables: the complete reverse mapping table is stored on the storage node, and the sub-tables are stored on the individual storage devices of that storage node, as shown in Fig. 8, each recording part of the content of the reverse mapping table stored on its storage node. When a file is accessed, all storage devices corresponding to the target file can then be obtained by traversing only a reverse mapping sub-table, which takes less time than traversing the whole reverse mapping table and thus improves access performance. If a storage device failure causes the loss of the reverse mapping sub-table on that device, then while data recovery proceeds according to the complete reverse mapping table, the reverse mapping sub-table of the failed storage device can be recovered according to the metadata virtual volume whose primary node is the storage node where the failed storage device is located.
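Rebuilding a lost sub-table from the metadata virtual volume can be sketched as a filter over the file metadata. The sketch assumes that the sub-table of a device lists the IDs of the files that have a slice on that device; the text only says a sub-table holds part of the node's table, so this layout, like the names and sample records, is an assumption.

```python
def rebuild_subtable(lost_device: str, metadata_volume: dict) -> list:
    """Collect the files whose metadata places a slice on `lost_device`."""
    return sorted(file_id
                  for file_id, meta in metadata_volume.items()
                  if lost_device in meta["slice_locations"])


# Hypothetical contents of the metadata virtual volume of one primary node.
metadata_volume = {
    1112: {"slice_locations": ["Disk0-0", "Disk1-0", "Disk2-0"]},
    2040: {"slice_locations": ["Disk0-1", "Disk1-0", "Disk2-1"]},
    3001: {"slice_locations": ["Disk0-0", "Disk1-1", "Disk2-1"]},
}

subtable_disk0_0 = rebuild_subtable("Disk0-0", metadata_volume)
subtable_disk1_0 = rebuild_subtable("Disk1-0", metadata_volume)
```

Because the metadata already records every slice location, the sub-table is derived data and can be regenerated after the slices themselves have been moved to their new devices.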
As shown in Fig. 9, the data recovery apparatus 900 of the embodiment of the invention comprises:
a reverse mapping table query unit 910, configured to, when a storage device in the cluster storage system fails, query the reverse mapping table of the storage node where the failed storage device is located and obtain information about the files stored on the failed storage device, wherein each file is divided into at least two data slices;
a metadata virtual volume query unit 920, configured to query the metadata virtual volume whose primary node is the storage node where the failed storage device is located and obtain the data slices of the file that are not stored on the failed storage device, the metadata virtual volume being a virtual volume for storing file metadata;
a data recovery unit 930, configured to recover, from the obtained data slices, the data slices of each file that were stored on the failed storage device.
The apparatus of the embodiment of the invention keeps a relatively fine granularity and, when performing data recovery, only needs to query the reverse mapping table to obtain the files that need recovery; by analyzing a small amount of file metadata it can quickly locate the data blocks that need recovery, so recovery performance is high. In addition, because each storage node has its own reverse mapping table, kept by the primary node of the metadata virtual volume, the storage nodes can recover concurrently without affecting one another, which further improves recovery performance.
As shown in Fig. 10, in addition to the parts shown in Fig. 9, the apparatus 1000 of the embodiment of the invention may further comprise:
a reallocation unit 1040, configured to allocate a target storage device, according to the principle of load balancing, for the data slices of the file that were stored on the failed storage device;
a storage unit 1050, configured to store the data slices of the file that were stored on the failed storage device onto the target storage device;
a maintenance unit 1060, configured to update the corresponding metadata virtual volume and reverse mapping table.
As the structural representation of another Data Recapture Unit 1100 that Figure 11 provides for the embodiment of the present invention, the specific embodiment of the invention does not limit the specific implementation of Data Recapture Unit.As shown in figure 11, this Data Recapture Unit 1100 can comprise:
Processor (processor) 1110, communication interface (CommunicationsInterface) 1120, storer (memory) 1130 and communication bus 1140.Wherein:
Processor 1110, communication interface 1120 and storer 1130 complete mutual communication by communication bus 1240.
The communications interface 1120 is configured to communicate with network elements such as clients.
The processor 1110 is configured to execute a program 1132, and may specifically perform the relevant steps in the method embodiments shown in Figures 2 to 8.
Specifically, the program 1132 may include program code, and the program code includes computer operation instructions.
The processor 1110 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention.
The memory 1130 is configured to store the program 1132. The memory 1130 may include high-speed RAM and may also include non-volatile memory, for example at least one disk memory. The program 1132 may specifically include:
Reverse mapping table query unit, configured to, when a storage device in the cluster storage system fails, query the reverse mapping table of the storage node where the failed storage device is located, and obtain information about the files stored in the failed storage device;
Metadata virtual volume query unit, configured to query the metadata virtual volume whose master node is the storage node where the failed storage device is located, and obtain the data slices of the file that are not stored in the failed storage device;
Data recovery unit, configured to recover, from the obtained data slices, the data slices of each file that were stored in the failed storage device.
For the specific implementation of each unit in the program 1132, refer to the corresponding units in the embodiments shown in Figures 9 and 10; details are not repeated here. A person skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the devices and modules described above may be understood with reference to the corresponding processes in the foregoing method embodiments.
In addition, the present invention further provides a cluster storage system as shown in Figure 1. The storage system includes at least two storage servers (three are shown in Figure 1, but the number is not limited to three) and further includes the data recovery device shown in Figures 9 to 11.
A person of ordinary skill in the art will recognize that the units and method steps of the examples described in the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or in software depends on the particular application and the design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each particular application, but such implementations should not be considered to go beyond the scope of the present invention.
If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above embodiments are only intended to illustrate the present invention and do not limit it. A person of ordinary skill in the relevant technical field may make various changes and modifications without departing from the spirit and scope of the present invention. All equivalent technical solutions therefore also fall within the scope of the present invention, and the scope of patent protection of the present invention shall be defined by the claims.

Claims (8)

1. A data recovery method, characterized in that the method is applied in a cluster storage system, the cluster storage system comprises at least two storage nodes, and the method comprises:
when a storage device in the cluster storage system fails, querying the reverse mapping table kept in the storage node where the failed storage device is located, and obtaining, from the reverse mapping table, information about the files stored in the failed storage device, wherein each file is divided into at least two data slices;
querying the metadata virtual volume whose master node is the storage node where the failed storage device is located, and obtaining the data slices of the file that are not stored in the failed storage device, wherein the metadata virtual volume is a virtual volume used to store file metadata;
recovering, from the obtained data slices, the data slices of the file that were stored in the failed storage device;
wherein the reverse mapping table is stored in each storage node in the cluster storage system and is used to record the correspondence between files and the storage devices corresponding to a specific metadata virtual volume, the specific metadata virtual volume being the metadata virtual volume whose master node is the storage node where the reverse mapping table is located.
2. The method according to claim 1, characterized in that the reverse mapping table further comprises a plurality of reverse mapping sub-tables, which are respectively stored on the storage devices of the corresponding storage node, each reverse mapping sub-table recording part of the content of the reverse mapping table stored in the storage node where it is located.
3. The method according to claim 2, characterized by further comprising:
recovering the reverse mapping sub-table in the failed storage device according to the metadata virtual volume whose master node is the storage node where the failed storage device is located.
4. A data recovery device, characterized in that the device is applied in a cluster storage system, the cluster storage system comprises at least two storage nodes, and the device comprises:
a reverse mapping table query unit, configured to, when a storage device in the cluster storage system fails, query the reverse mapping table of the storage node where the failed storage device is located, and obtain information about the files stored in the failed storage device, wherein each file is divided into at least two data slices;
a metadata virtual volume query unit, configured to query the metadata virtual volume whose master node is the storage node where the failed storage device is located, and obtain the data slices of the file that are not stored in the failed storage device, wherein the metadata virtual volume is a virtual volume used to store file metadata;
a data recovery unit, configured to recover, from the obtained data slices, the data slices of the file that were stored in the failed storage device;
wherein the reverse mapping table is stored in each storage node and is used to record the correspondence between files and the storage devices corresponding to a specific metadata virtual volume, the specific metadata virtual volume being the metadata virtual volume whose master node is the storage node where the reverse mapping table is located.
5. The device according to claim 4, characterized in that the reverse mapping table further comprises a plurality of reverse mapping sub-tables, which are respectively stored on the storage devices of the corresponding storage node, each reverse mapping sub-table recording part of the content of the reverse mapping table stored in the storage node where it is located.
6. The device according to claim 5, characterized by further comprising:
a maintenance unit, configured to recover the reverse mapping sub-table in the failed storage device according to the metadata virtual volume whose master node is the storage node where the failed storage device is located.
7. The device according to any one of claims 4 to 6, characterized in that each metadata virtual volume corresponds to at least one storage device, and the storage node where one of those storage devices is located is the master node of the metadata virtual volume; the metadata of each file has a predetermined number of mirrors on the metadata virtual volume, and each mirror is mapped to a different storage device; and each storage node in the cluster storage system is the master node of a metadata virtual volume.
8. A cluster storage system, the cluster storage system comprising at least two storage nodes, characterized in that the cluster storage system further comprises the data recovery device according to any one of claims 4 to 7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210587371.5A CN103064765B (en) 2012-12-28 2012-12-28 Data reconstruction method, device and cluster storage system

Publications (2)

Publication Number Publication Date
CN103064765A CN103064765A (en) 2013-04-24
CN103064765B CN103064765B (en) 2015-12-02

Family

ID=48107398

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210587371.5A Active CN103064765B (en) 2012-12-28 2012-12-28 Data reconstruction method, device and cluster storage system

Country Status (1)

Country Link
CN (1) CN103064765B (en)

Families Citing this family (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103324553B (en) * 2013-06-21 2016-08-24 华为技术有限公司 Data reconstruction method, system and device
US10055352B2 (en) * 2014-03-11 2018-08-21 Amazon Technologies, Inc. Page cache write logging at block-based storage
CN105335250B (en) * 2014-07-28 2018-09-28 浙江大华技术股份有限公司 A kind of data reconstruction method and device based on distributed file system
US10177994B2 (en) * 2014-08-13 2019-01-08 Microsoft Technology Licensing, Llc Fault tolerant federation of computing clusters
US11290524B2 (en) 2014-08-13 2022-03-29 Microsoft Technology Licensing, Llc Scalable fault resilient communications within distributed clusters
CN104484130A (en) * 2014-12-04 2015-04-01 北京同有飞骥科技股份有限公司 Construction method of horizontal expansion storage system
CN104898542A (en) * 2015-04-29 2015-09-09 河南职业技术学院 Positioning device and programmable logic controller (PLC)
CN106686095A (en) * 2016-12-30 2017-05-17 郑州云海信息技术有限公司 Data storage method and device based on erasure code technology
CN107391391B (en) * 2017-07-19 2019-05-14 深圳大普微电子科技有限公司 Method, system and the solid state hard disk of data copy are realized in the FTL of solid state hard disk
CN107301133B (en) * 2017-07-20 2021-01-12 苏州浪潮智能科技有限公司 Method and device for constructing lost FTL table
WO2019080015A1 (en) * 2017-10-25 2019-05-02 华为技术有限公司 Data reading and writing method and device, and storage server
CN110502184B (en) * 2018-05-17 2021-01-05 杭州海康威视系统技术有限公司 Data storage method, data reading method, device and system
CN109086010B (en) * 2018-08-29 2021-12-17 郑州云海信息技术有限公司 Method for improving metadata reliability on full flash memory array
CN109117317A (en) * 2018-11-01 2019-01-01 郑州云海信息技术有限公司 A kind of clustering fault restoration methods and relevant apparatus
CN113316770B (en) * 2019-01-25 2023-08-22 华为技术有限公司 Data restoration method and device
CN111625390B (en) * 2020-05-28 2024-03-26 深圳市晶讯技术股份有限公司 Embedded equipment fault recovery method and device, embedded equipment and storage medium
CN111698330B (en) * 2020-06-12 2022-06-21 北京金山云网络技术有限公司 Data recovery method and device of storage cluster and server
CN112799882A (en) * 2021-02-08 2021-05-14 上海交通大学 File perception recovery method and device based on graph algorithm
CN113821176B (en) * 2021-09-29 2023-07-21 重庆紫光华山智安科技有限公司 Data migration processing method, device and storage medium
CN114936188A (en) * 2022-05-30 2022-08-23 重庆紫光华山智安科技有限公司 Data processing method and device, electronic equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6112286A (en) * 1997-09-19 2000-08-29 Silicon Graphics, Inc. Reverse mapping page frame data structures to page table entries
CN1542624A (en) * 2003-04-29 2004-11-03 大唐移动通信设备有限公司 Method for quickening logic block mapping speed in Flash file system
CN101840364A (en) * 2010-01-29 2010-09-22 成都市华为赛门铁克科技有限公司 Method for recovering data and storage device thereof
CN101986276A (en) * 2010-10-21 2011-03-16 成都市华为赛门铁克科技有限公司 Methods and systems for storing and recovering files and server
CN102024016A (en) * 2010-11-04 2011-04-20 天津曙光计算机产业有限公司 Rapid data restoration method for distributed file system (DFS)
CN102622185A (en) * 2011-01-27 2012-08-01 北京东方广视科技股份有限公司 Method for storing document in plurality of storage units and storage allocation method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant