CN106293980A - Data recovery method and system for distributed storage cluster

Info

Publication number: CN106293980A
Application number: CN201610595794.XA
Authority: CN
Inventors: 吴兴义
Original assignee: LeTV Holding Beijing Co Ltd; LeTV Cloud Computing Co Ltd
Current assignee: LeCloud Computing Co Ltd; LeTV Holding Beijing Co Ltd; LeTV Cloud Computing Co Ltd
Priority date: 2016-07-26
Filing date: 2016-07-26
Publication date: 2017-01-04

Abstract

The invention provides a data recovery method for a distributed storage cluster, comprising: determining the primary copy of data on a master disk of the distributed storage cluster according to a received data read request, and returning the primary copy to the initiator of the data read request; determining a verification hit result for a slave copy based on a verification probability and, when the result is yes, sending a verification request to the slave disk holding the slave copy and receiving a first check value sent by the slave disk; and determining whether the first check value is identical to a second check value of the primary copy and, when the two differ, repairing the slave copy using the primary copy. The invention also provides a corresponding system. Embodiments of the invention can automatically repair, based on the primary copy, a slave copy whose stored content is inconsistent with it. Because a verification probability decides whether a data consistency check is performed on a slave copy, the system load of the distributed storage cluster is reduced, fewer resources are wasted, and system performance improves.

Description

Data recovery method and system for distributed storage cluster

Technical field

The present invention relates to the technical field of distributed storage, and in particular to a data recovery method and system for a distributed storage cluster.

Background art

A distributed storage system splits data according to certain rules and scatters it across many independent commodity storage servers. A traditional network storage system keeps all data on a centralized storage server, which becomes the bottleneck of system performance and the focal point of reliability and security concerns, and cannot meet the needs of mass-storage applications. A distributed storage system instead uses a scalable architecture: multiple storage servers share the storage load, and a location server locates the stored information. This improves the reliability, availability, and access efficiency of the system and makes it easy to scale. A storage cluster of thousands or tens of thousands of servers can store data with ample redundancy, which significantly improves data safety.

In a distributed storage system, three copies are generally used to keep data safe. In the three-copy mode, one master disk is responsible for receiving requests and forwarding the data to two other (slave) disks. After the data has been successfully written to the two slave disks, the master disk writes the data itself and, on success, responds to the user.
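The three-copy write path described above can be sketched as follows. This is only a minimal in-memory illustration: the `Disk` class and the `replicated_write` function are hypothetical names introduced here, and a real disk could fail or silently store wrong data at the point marked in the comment.

```python
class Disk:
    """Hypothetical in-memory stand-in for one disk in the cluster."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}

    def write(self, key, data):
        # A real disk could fail here, or store data that differs from `data`.
        self.blocks[key] = data
        return True


def replicated_write(master, slaves, key, data):
    """Write to both slave disks first, then to the master, then acknowledge."""
    for slave in slaves:
        if not slave.write(key, data):
            return False  # do not acknowledge if a slave write failed
    return master.write(key, data)


master = Disk("master")
slaves = [Disk("slave-1"), Disk("slave-2")]
acknowledged = replicated_write(master, slaves, "block-1", b"payload")
```

Note that the acknowledgement only reflects that each write call reported success; nothing in this path confirms that the stored bytes are correct.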

In the course of making the present invention, the inventor found that the prior art has at least the following problem: in the three-copy mode, there is no mechanism to confirm that the disks have written the data completely and correctly. The most likely problem is that the data written to the two slave disks does not match the actual data. If the disk holding the only correct copy then suffers an unrecoverable failure, the data cannot be recovered. This is unacceptable in scenarios with strict data consistency requirements. Moreover, the longer such an inconsistency persists, the greater the threat to data safety: disks have a limited service life, and as more disks fail and are replaced, the probability of data loss keeps growing. Repairing inconsistent content of the same data stored on different disks is therefore a problem the industry urgently needs to solve.

Summary of the invention

Embodiments of the present invention provide a data recovery method and system for a distributed storage cluster, to solve at least one of the prior-art problems set out above.

One aspect of the embodiments of the present invention provides a data recovery method for a distributed storage cluster, including:

determining the primary copy of data on a master disk of the distributed storage cluster according to a received data read request, and returning the primary copy to the initiator of the data read request;

determining a verification hit result for a slave copy based on a verification probability and, when the result is yes, sending a verification request to the slave disk holding the slave copy and receiving a first check value sent by the slave disk;

determining whether the first check value is identical to a second check value of the primary copy and, when the first check value and the second check value differ, repairing the slave copy using the primary copy.

Another aspect of the embodiments of the present invention provides a data repair system for a distributed storage cluster, the system including:

a request-response unit, configured to determine the primary copy of data on a master disk of the distributed storage cluster according to a received data read request, and to return the primary copy to the initiator of the data read request;

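a copy repair unit, configured to: determine a verification hit result for a slave copy based on a verification probability and, when the result is yes, send a verification request to the slave disk holding the slave copy and receive a first check value sent by the slave disk; and determine whether the first check value is identical to a second check value of the primary copy and, when the two differ, repair the slave copy using the primary copy.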

With the data recovery method and system for a distributed storage cluster provided by the embodiments of the present invention, after the primary copy on the master disk has been returned to the initiator of the data read request, the verification probability determines whether the slave copy of the data is verified. When the verification hit result is yes, a verification request is sent to the slave disk, and the check value of the slave copy returned by the slave disk in response is received. Finally, the check value of the slave copy is compared with the check value of the primary copy; a difference shows that the contents of the primary copy and the slave copy are inconsistent, and the primary copy on the master disk is then used to automatically repair the inconsistent slave copy. The method and system can thus automatically repair, based on the primary copy, any slave copy whose stored content is inconsistent with it. Because inconsistency between a slave copy and the primary copy does not occur in every block of data, using a verification probability to decide whether a given slave copy is checked for consistency reduces the system load of the distributed storage cluster, wastes fewer resources, and improves system performance. The whole verification and repair process is carried out automatically by machine, without manual operation, which greatly reduces errors caused by human factors.

Brief description of the drawings

To describe the technical solutions of the embodiments of the present invention more clearly, the accompanying drawings used in the description of the embodiments are briefly introduced below. The drawings described below show some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a flowchart of a data recovery method for a distributed storage cluster according to an embodiment of the present invention;

Fig. 2 is a schematic structural diagram of a data repair system for a distributed storage cluster according to an embodiment of the present invention;

Fig. 3 is a schematic structural diagram of a device that implements the data recovery method for a distributed storage cluster according to an embodiment of the present invention.

Detailed description of the invention

To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the scope of protection of the present invention.

It should be noted that, where they do not conflict, the embodiments of the present invention and the features in the embodiments may be combined with one another.

The present invention can be used in numerous general-purpose or special-purpose computing system environments or configurations, for example: personal computers, server computers, handheld or portable devices, laptop devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, and distributed computing environments that include any of the above systems or devices.

The present invention can be described in the general context of computer-executable instructions, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. The present invention can also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.

Finally, it should also be noted that relational terms such as "first" and "second" are used herein only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include" and "comprise" cover not only the listed elements but also other elements not expressly listed, as well as elements inherent to the process, method, article, or device. Unless further limited, an element defined by the phrase "including ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.

Fig. 1 is a flowchart of a data recovery method for a distributed storage cluster according to an embodiment of the present invention. As shown in Fig. 1, the method includes:

S11: determining the primary copy of data on a master disk of the distributed storage cluster according to a received data read request, and returning the primary copy to the initiator of the data read request;

S12: determining a verification hit result for a slave copy based on a verification probability and, when the result is yes, sending a verification request to the slave disk holding the slave copy and receiving a first check value sent by the slave disk;

S13: determining whether the first check value is identical to a second check value of the primary copy and, when the first check value and the second check value differ, repairing the slave copy using the primary copy.

In this embodiment, after the primary copy on the master disk has been returned to the initiator of the data read request, the verification probability determines whether the slave copy of the data is verified. When the verification hit result is yes, a verification request is sent to the slave disk, and the check value of the slave copy returned by the slave disk in response is received. Finally, the check value of the slave copy is compared with the check value of the primary copy; a difference shows that the contents of the primary copy and the slave copy are inconsistent, and the primary copy on the master disk is then used to automatically repair the inconsistent slave copy. The method can thus automatically repair, based on the primary copy, any slave copy whose stored content is inconsistent with it. Because inconsistency between a slave copy and the primary copy does not occur in every block of data, using a verification probability to decide whether a given slave copy is checked for consistency reduces the system load of the distributed storage cluster, wastes fewer resources, and improves system performance. The whole verification and repair process is carried out automatically by machine, without manual operation, which greatly reduces errors caused by human factors.
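Steps S11 to S13 can be sketched roughly as below. This is an illustrative sketch under several assumptions, not the patented implementation: disks are modeled as plain dictionaries, the names `checksum` and `read_with_repair` are hypothetical, MD5 is picked as the check value, and the function returns the primary copy at the end, whereas a real system would answer the requester first and verify afterwards.

```python
import hashlib
import random


def checksum(data):
    """Check value of a copy (MD5 chosen here for illustration)."""
    return hashlib.md5(data).hexdigest()


def read_with_repair(master, slaves, key, check_probability, rng=random):
    """master and slaves are dicts mapping keys to bytes (an assumption)."""
    primary = master[key]  # S11: the primary copy that answers the read request
    repaired = []
    for index, slave in enumerate(slaves):
        # S12: the verification is "hit" with the configured probability.
        if rng.random() < check_probability:
            first_check_value = checksum(slave[key])   # computed by the slave disk
            second_check_value = checksum(primary)
            # S13: on a mismatch, repair the slave copy from the primary copy.
            if first_check_value != second_check_value:
                slave[key] = primary
                repaired.append(index)
    return primary, repaired


master = {"k": b"good"}
slaves = [{"k": b"good"}, {"k": b"bad!"}]
data, repaired = read_with_repair(master, slaves, "k", check_probability=1.0)
```

With `check_probability=1.0` every slave copy is checked, so the second slave copy in this example is overwritten; with `0.0` no check is ever performed.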

As a further optimization of the embodiment shown in Fig. 1, the verification probability includes a first verification probability for the disk idle state and a second verification probability for the disk working state, where the first verification probability is greater than the second verification probability.

During busy periods, when the disks are mostly in the working state, users issue many read and write requests at high frequency, and verifying slave copies too often would add to the system's burden. A smaller verification probability, for example 5%, can therefore be set for busy periods: it still meets the need to verify slave-copy consistency without putting much additional pressure on the system. During idle periods, users issue fewer requests. A larger verification probability can then be set, and internal system personnel can issue data read and write requests so that most of the data is checked for consistency and inconsistent slave copies are repaired automatically, keeping the data safe. In particular, the verification probability for idle periods can be set to 100%, so that all data on the disk is traversed, inconsistent copies are found, and slave copies are repaired while the data is being read.
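The choice between the two verification probabilities might look like the sketch below. The values 5% and 100% come from the examples in the text; the function names and the `disk_is_busy` flag are assumptions, since the text does not say how the busy or idle state is detected.

```python
import random

IDLE_VERIFICATION_PROBABILITY = 1.0   # first verification probability (idle state)
BUSY_VERIFICATION_PROBABILITY = 0.05  # second verification probability (working state)


def verification_probability(disk_is_busy):
    """Pick the probability for the current disk state (flag is hypothetical)."""
    if disk_is_busy:
        return BUSY_VERIFICATION_PROBABILITY
    return IDLE_VERIFICATION_PROBABILITY


def verification_hit(disk_is_busy, rng=random):
    """Return True when this read should also trigger a slave-copy check."""
    return rng.random() < verification_probability(disk_is_busy)


# Rough illustration: over many busy-time reads, about 5% trigger a check.
rng = random.Random(42)
busy_hits = sum(verification_hit(True, rng) for _ in range(10_000))
```

Under these settings roughly one read in twenty pays the cost of an extra check during busy periods, while every read is checked during idle periods.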

As a further description of the embodiment shown in Fig. 1, step S13 (determining whether the first check value is identical to the second check value of the primary copy and, when the two differ, repairing the slave copy using the primary copy) includes:

S131: sending the primary copy to the slave disk to replace the slave copy.

In this embodiment, after the primary copy has been read successfully and used to answer the data read request, it is sent to the slave disk to overwrite or replace the slave copy whose check value differs from that of the primary copy. The slave copy is thus repaired as soon as the inconsistency is found. This avoids the situation where, while the slave copy is waiting to be repaired, the master disk fails and the primary copy is lost, leaving the data inaccessible to users, and so reduces the loss caused by missing data during that period.

In the above embodiments, the check value is determined from the data content corresponding to the data read request, and the check value includes at least one of an MD5 check value or a CRC32 check value.
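Both kinds of check value can be computed with the Python standard library, as in the short sketch below. The helper names `md5_check_value` and `crc32_check_value` are hypothetical; `hashlib.md5` and `zlib.crc32` are the standard-library calls.

```python
import hashlib
import zlib


def md5_check_value(data: bytes) -> str:
    """128-bit MD5 digest of the data content, as a hex string."""
    return hashlib.md5(data).hexdigest()


def crc32_check_value(data: bytes) -> int:
    """32-bit CRC of the data content, as an unsigned integer."""
    return zlib.crc32(data) & 0xFFFFFFFF


primary_copy = b"123456789"
slave_copy = b"123456780"  # differs in the last byte
copies_match = md5_check_value(primary_copy) == md5_check_value(slave_copy)
```

CRC32 is cheaper to compute and transmit, while MD5 produces a longer value and so is less likely to give equal check values for different contents; which one suits a given cluster is a design choice the text leaves open.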

It should be noted that, in the three-copy mode, when the first check values of the two slave copies are identical to each other but differ from the second check value of the primary copy, the system administrator of the distributed storage cluster can decide whether the primary copy prevails, as in the above method embodiments, or the copy held by the majority prevails.
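One way to expose this decision as an administrator setting is sketched below. The `choose_reference` function and the policy names `"primary"` and `"majority"` are hypothetical, and falling back to the primary copy when no strict majority exists is an assumption made for this sketch, not something the text specifies.

```python
from collections import Counter


def choose_reference(primary_value, slave_values, policy="primary"):
    """Return the check value that is treated as authoritative."""
    if policy == "primary":
        return primary_value
    if policy == "majority":
        counts = Counter([primary_value, *slave_values])
        value, count = counts.most_common(1)[0]
        total = len(slave_values) + 1
        # Assumption: without a strict majority, fall back to the primary copy.
        return value if count > total / 2 else primary_value
    raise ValueError(f"unknown policy: {policy}")


by_primary = choose_reference("aaa", ["bbb", "bbb"])
by_majority = choose_reference("aaa", ["bbb", "bbb"], policy="majority")
```

In the three-copy case above, the `"primary"` policy keeps the primary copy's value, while the `"majority"` policy sides with the two agreeing slave copies.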

It should be noted that, for brevity, each of the foregoing method embodiments is described as a series of combined actions. Those skilled in the art should understand, however, that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in another order or simultaneously. Those skilled in the art should also understand that the embodiments described in this specification are preferred embodiments, and that the actions and modules involved are not necessarily required by the present invention.

The descriptions of the above embodiments each have their own emphasis; for parts not described in detail in one embodiment, refer to the related descriptions of the other embodiments.

Fig. 2 is a schematic structural diagram of a data repair system for a distributed storage cluster according to an embodiment of the present invention. The data recovery method for a distributed storage cluster described in the embodiments of the present invention can be implemented on the data repair system of this embodiment. As shown in Fig. 2, the system includes a request-response unit 21 and a copy repair unit 22.

The request-response unit 21 is configured to determine the primary copy of data on a master disk of the distributed storage cluster according to a received data read request, and to return the primary copy to the initiator of the data read request;

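The copy repair unit 22 is configured to: determine a verification hit result for a slave copy based on a verification probability and, when the result is yes, send a verification request to the slave disk holding the slave copy and receive a first check value sent by the slave disk; and determine whether the first check value is identical to a second check value of the primary copy and, when the two differ, repair the slave copy using the primary copy.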

In this embodiment, after the request-response unit 21 has returned the primary copy on the master disk to the initiator of the data read request, the copy repair unit 22 uses the verification probability to determine whether the slave copy of the data is verified. When the verification hit result is yes, a verification request is sent to the slave disk, and the check value of the slave copy returned by the slave disk in response is received. Finally, the check value of the slave copy is compared with the check value of the primary copy; a difference shows that the contents of the primary copy and the slave copy are inconsistent, and the primary copy on the master disk is then used to automatically repair the inconsistent slave copy. The system can thus automatically repair, based on the primary copy, any slave copy whose stored content is inconsistent with it. Because inconsistency between a slave copy and the primary copy does not occur in every block of data, using a verification probability to decide whether a given slave copy is checked for consistency reduces the system load of the distributed storage cluster, wastes fewer resources, and improves system performance. The whole verification and repair process is carried out automatically by machine, without manual operation, which greatly reduces errors caused by human factors.
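The division of work between the two units could be organized as in the following sketch. The class names, the dictionary-based disks, and the use of MD5 are all assumptions made for illustration; the point is only that the read is answered by one unit before the other unit performs the optional check.

```python
import hashlib
import random


class RequestResponseUnit:
    """Hypothetical counterpart of unit 21: serves reads from the master disk."""

    def __init__(self, master):
        self.master = master  # dict of key -> bytes standing in for the master disk

    def handle_read(self, key):
        return self.master[key]


class CopyRepairUnit:
    """Hypothetical counterpart of unit 22: probabilistic check and repair."""

    def __init__(self, slaves, check_probability, rng=random):
        self.slaves = slaves
        self.check_probability = check_probability
        self.rng = rng

    def check_and_repair(self, key, primary):
        repaired = 0
        for slave in self.slaves:
            if self.rng.random() >= self.check_probability:
                continue  # verification not hit for this slave copy
            if hashlib.md5(slave[key]).digest() != hashlib.md5(primary).digest():
                slave[key] = primary  # overwrite the inconsistent slave copy
                repaired += 1
        return repaired


master = {"video-1": b"frame-data"}
slaves = [{"video-1": b"frame-data"}, {"video-1": b"corrupted!"}]
unit_21 = RequestResponseUnit(master)
unit_22 = CopyRepairUnit(slaves, check_probability=1.0)

response = unit_21.handle_read("video-1")                        # answered first
repaired_count = unit_22.check_and_repair("video-1", response)   # verified afterwards
```

Keeping the repair logic in its own unit means the read path does not have to wait for the check to finish before responding.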

As a further optimization of the embodiment shown in Fig. 2, the verification probability includes a first verification probability for the disk idle state and a second verification probability for the disk working state, where the first verification probability is greater than the second verification probability.

During busy periods, when the disks are mostly in the working state, users issue many read and write requests at high frequency, and verifying slave copies too often would add to the system's burden. A smaller verification probability, for example 5%, can therefore be set for busy periods: it still meets the need to verify slave-copy consistency without putting much additional pressure on the system. During idle periods, users issue fewer requests. A larger verification probability can then be set, and internal system personnel can issue data read and write requests so that most of the data is checked for consistency and inconsistent slave copies are repaired automatically, keeping the data safe. In particular, the verification probability for idle periods can be set to 100%, so that all data on the disk is traversed, inconsistent copies are found, and copies are repaired while the data is being read.

As a further description of the embodiment shown in Fig. 2, the copy repair unit 22 is configured to send the primary copy to the slave disk to replace the slave copy.

In this embodiment, after the primary copy has been read successfully and used to answer the data read request, it is sent to the slave disk to overwrite or replace the slave copy whose first check value differs from the second check value of the primary copy. The slave copy is thus repaired as soon as the inconsistency is found. This avoids the situation where, while the slave copy is waiting to be repaired, the master disk fails and the primary copy is lost, leaving the data inaccessible to users, and so reduces the loss caused by missing data during that period.

In the embodiments of the present invention, the relevant functional modules can be implemented by a hardware processor.

The present invention provides a non-transitory computer-readable storage medium storing one or more programs that include execution instructions. The execution instructions can be executed by an electronic device with a control interface to perform the relevant steps of the above method embodiments.

Fig. 3 is a schematic structural diagram of a device 300 that implements the data recovery method for a distributed storage cluster according to an embodiment of the present invention. The specific embodiments of the present invention do not limit the specific implementation of the device 300. As shown in Fig. 3, the device may include:

a processor 310, a communications interface 320, a memory 330, and a communication bus 340, where:

The processor 310, the communications interface 320, and the memory 330 communicate with one another through the communication bus 340.

The communications interface 320 is used to communicate with network elements such as clients.

The processor 310 is used to execute the program 332 in the memory 330, and can specifically perform the relevant steps of the above method embodiments.

Specifically, the program 332 may include program code, and the program code includes computer operation instructions.

The processor 310 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention.

The memory 330 is used to store the program 332. The memory 330 may include high-speed RAM and may also include non-volatile memory, for example at least one disk memory. The program 332 can specifically cause the device 300 to perform the operations of steps S11 to S13 described in the above method embodiments.

For the specific implementation of each step in the program 332, refer to the corresponding descriptions of the steps and units in the above embodiments; they are not repeated here. Those skilled in the art will clearly understand that, for convenience and brevity, the specific working processes of the devices and modules described above can be found in the corresponding process descriptions of the foregoing method embodiments.

The embodiments described above are merely illustrative. Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed across multiple network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. A person of ordinary skill in the art can understand and implement this without creative effort.

From the description of the above embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general-purpose hardware platform, or of course by hardware. Based on this understanding, the essence of the above technical solutions, or the part that contributes over the prior art, can be embodied as a software product. The computer software product can be stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments or in parts of the embodiments.

Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk memory and optical memory) that contain computer-usable program code.

The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor produce an apparatus for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or another programmable data processing device to work in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture that includes an instruction apparatus. The instruction apparatus implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams. These computer program instructions may also be loaded onto a computer or another programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing. The instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in those embodiments can still be modified, or some of their technical features can be replaced with equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. A data recovery method for a distributed storage cluster, including:

determining the primary copy of data on a master disk of the distributed storage cluster according to a received data read request, and returning the primary copy to the initiator of the data read request;

determining a verification hit result for a slave copy based on a verification probability and, when the result is yes, sending a verification request to the slave disk holding the slave copy and receiving a first check value sent by the slave disk;

determining whether the first check value is identical to a second check value of the primary copy and, when the first check value and the second check value differ, repairing the slave copy using the primary copy.

2. The method according to claim 1, wherein the verification probability includes a first verification probability for the disk idle state and a second verification probability for the disk working state, and the first verification probability is greater than the second verification probability.

3. The method according to claim 2, wherein determining whether the first check value is identical to the second check value of the primary copy and, when the first check value and the second check value differ, repairing the slave copy using the primary copy includes:

sending the primary copy to the slave disk to replace the slave copy.

4. The method according to any one of claims 1 to 3, wherein the check value is determined from the data content corresponding to the data read request, and the check value includes at least one of an MD5 check value or a CRC32 check value.

5. A data repair system for a distributed storage cluster, including:

a request-response unit, configured to determine the primary copy of data on a master disk of the distributed storage cluster according to a received data read request, and to return the primary copy to the initiator of the data read request;

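a copy repair unit, configured to: determine a verification hit result for a slave copy based on a verification probability and, when the result is yes, send a verification request to the slave disk holding the slave copy and receive a first check value sent by the slave disk; and determine whether the first check value is identical to a second check value of the primary copy and, when the two differ, repair the slave copy using the primary copy.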

6. The system according to claim 5, wherein the verification probability includes a first verification probability for the disk idle state and a second verification probability for the disk working state, and the first verification probability is greater than the second verification probability.

7. The system according to claim 6, wherein the copy repair unit is configured to send the primary copy to the slave disk to replace the slave copy.

8. The system according to any one of claims 5 to 7, wherein the check value is determined from the data content corresponding to the data read request, and the check value includes at least one of an MD5 check value or a CRC32 check value.