Method and storage architecture for rapidly verifying the consistency of multiple replicas in distributed storage
Technical Field
The present invention relates to the technical field of information storage, and more particularly to a method and a storage architecture for rapidly verifying the consistency of multiple replicas in distributed storage.
Background Art
With the arrival of the information age, the global volume of data is growing explosively. Improving the reliability of storage systems and ensuring data availability have become a research focus for enterprises. The vast majority of existing distributed storage systems rely on multi-replica technology to improve reliability, availability, performance and scalability. However, a distributed storage system communicates entirely over the network, and network instability easily leaves the back-end data inconsistent; moreover, such a system generally comprises a large number of server hosts and disks, so the probability of hardware failure is also comparatively high.
If the consistency of the replicas cannot be checked quickly, the data integrity and high availability of the distributed storage system are greatly reduced. Existing consistency verification methods mainly compute a hash value of each file and compare the hash values of the multiple replica files to judge whether the file data are consistent.
For massive files, however, computing the hash values consumes a large amount of CPU and storage host bandwidth and seriously degrades system performance. Furthermore, the inconsistent portions of a file are usually few, yet computing the file hash value requires reading the content of the entire file, which wastes a great deal of resources.
Therefore, the prior art still needs to be improved and developed.
Summary of the Invention
The technical problem to be solved by the present invention is, in view of the above defects of the prior art, to provide a method and a storage architecture for rapidly verifying the consistency of multiple replicas in distributed storage, with the aim of increasing the speed of consistency detection, reducing the consumption of storage host bandwidth, and accelerating data verification.
The technical solution adopted by the present invention to solve this technical problem is as follows:
A method for rapidly verifying the consistency of multiple replicas in distributed storage, the distributed storage using a control host/storage host processing architecture, the method comprising the steps of:
A. dividing a stored file evenly into a number of data segments in advance, each data segment being provided with its own corresponding first hash value and with a flag bit indicating whether the corresponding first hash value has expired;
B. upon receiving a write request, determining the corresponding flag bits from the offset and length of the write request, and setting those flag bits to expired;
C. selecting the expired flag bits, updating the first hash values corresponding to those flag bits, and then computing a second hash value of the entire file from the first hash values of the data segments.
In the described method for rapidly verifying the consistency of multiple replicas in distributed storage, the first hash values and the flag bits are stored in additional, newly created files.
In the described method for rapidly verifying the consistency of multiple replicas in distributed storage, upon initialization the first hash values and the flag bits are all set to 0, and the first hash value corresponding to a data segment that has not been written is set to 0.
In the described method for rapidly verifying the consistency of multiple replicas in distributed storage, step A specifically comprises:
A1. dividing the stored file into a number of data segments in advance, each data segment being 4 MB in size, and performing initialization;
A2. providing each data segment with its own corresponding first hash value and with a flag bit indicating whether the corresponding first hash value has expired.
In the described method for rapidly verifying the consistency of multiple replicas in distributed storage, step B specifically comprises:
B1. upon receiving a write request, determining the corresponding flag bit from the offset and length of the write request;
B2. setting the flag bit from 0 to 1, indicating that the flag bit has expired.
In the described method for rapidly verifying the consistency of multiple replicas in distributed storage, step C specifically comprises:
C1. selecting the expired flag bits and computing new first hash values for the data segments whose flag bits have expired;
C2. judging whether any flag bit was set to expired while the new first hash values were being computed; if so, performing step C1 again; if not, writing the new first hash values to the storage host;
C3. computing the second hash value of the entire file from the first hash values of the data segments.
In the described method for rapidly verifying the consistency of multiple replicas in distributed storage, step C2 is specifically: initializing the flag bit to 0 in the memory of the control host, and judging whether any flag bit was set to 1 while the new first hash value was being computed; if so, performing step C1 again; if not, writing the new first hash value to the storage host.
A storage architecture, wherein the storage architecture uses a control host/storage host processing architecture;
a virtual disk is built in the control host, and the control host is used to manage the life cycle of the virtual disk and to perform the receiving, caching and forwarding of data;
the storage host is composed of a plurality of storage media and is used to store redundant data;
a computer program is stored in the storage architecture, and the computer program, when executed by the control host, implements the steps of any of the above methods for rapidly verifying the consistency of multiple replicas in distributed storage.
Beneficial effects of the present invention: the present invention provides a method and a storage architecture for rapidly verifying the consistency of multiple replicas in distributed storage. A large file is divided into a number of data segments, a hash value is computed for each data segment, and the hash value of the entire file is then computed from the hash values of the data segments. With this method it is only necessary to record which data segments have been modified and then update the hash values of those data segments, so the data of the entire file need not be read when verifying consistency. This greatly increases the speed of consistency verification and reduces the consumption of storage host bandwidth. In addition, because hash values are computed per data segment, the computation is easier to run concurrently when the system is idle, which greatly accelerates data verification.
Brief Description of the Drawings
Fig. 1 is a flow chart of a preferred embodiment of the method for rapidly verifying the consistency of multiple replicas in distributed storage according to the present invention.
Fig. 2 is a schematic block diagram of a preferred embodiment of the storage architecture according to the present invention.
Fig. 3 is a schematic diagram of the per-segment first hash values in a preferred embodiment of the method for rapidly verifying the consistency of multiple replicas in distributed storage according to the present invention.
Fig. 4 is a flow chart of updating expired first hash values in a preferred embodiment of the method for rapidly verifying the consistency of multiple replicas in distributed storage according to the present invention.
Detailed Description of the Embodiments
To make the objects, technical solutions and advantages of the present invention clearer and more definite, the present invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are intended only to explain the present invention and not to limit it.
An embodiment of the present invention provides a method for rapidly verifying the consistency of multiple replicas in distributed storage. Referring to Figs. 1 to 4, the method uses a control host/storage host processing architecture.
The method specifically comprises the following steps:
S100. The stored file is divided evenly into a number of data segments in advance; each data segment is provided with its own corresponding first hash value and with a flag bit indicating whether the corresponding first hash value has expired.
S101. The stored file is divided into a number of data segments in advance, each data segment being 4 MB in size, and initialization is performed.
S102. Each data segment is provided with its own corresponding first hash value and with a flag bit indicating whether the corresponding first hash value has expired.
In this embodiment of the present invention, assume that the large file is 100 GB in size. With a data segment size of 4 MB, the file is divided into 25600 data segments in total. If the crc32 hash algorithm is used, each data segment needs 4 B to store its first hash value, so the file holding all first hash values needs 100 KB. Each data segment also needs a 1-bit flag to indicate whether its first hash value has expired, so the file holding all flag bits needs 3200 B. The storage overhead of the first hash values and flag bits is therefore (100 KB + 3200 B) / 100 GB ≈ 0.0001%.
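By way of non-limiting illustration, the figures above can be reproduced with the short Python sketch below. It assumes binary units (1 K = 1024 B), which the description does not state explicitly, and all variable names are illustrative only.

```python
# Worked arithmetic for the 100G example.
# Assumption: binary units (1 K = 1024 B); names are illustrative.
KIB, MIB, GIB = 1024, 1024**2, 1024**3

FILE_SIZE = 100 * GIB      # size of the large file
SEGMENT_SIZE = 4 * MIB     # size of one data segment
HASH_BYTES = 4             # a crc32 value occupies 32 bits

segments = FILE_SIZE // SEGMENT_SIZE         # number of data segments
hash_file_bytes = segments * HASH_BYTES      # file holding the first hash values
flag_file_bytes = segments // 8              # one flag bit per data segment
overhead = (hash_file_bytes + flag_file_bytes) / FILE_SIZE

print(segments, hash_file_bytes // KIB, flag_file_bytes)
print(f"overhead: {overhead:.6%}")
```

Under these assumptions the flag bits fit comfortably in memory, which is what allows the expiry check on the write path to avoid any extra read from the storage host.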
The flag bits of the first hash values need to be loaded into memory to speed up the judgment. As shown above, the flag bits required for a 100 GB file occupy less than 4 KB of memory.
S200. When a write request is received, the corresponding flag bit is determined from the offset and length of the write request, and the flag bit is set to expired.
S201. When a write request is received, the corresponding flag bit is determined from the offset and length of the write request.
S202. The flag bit is set from 0 to 1, indicating that the flag bit has expired.
In this embodiment of the present invention, upon initialization the first hash values and the flag bits are all set to 0, and the first hash value corresponding to a data segment that has not been written is set to 0.
When a write request arrives, the flag bits affected by the current write request are determined from its offset and length. If a flag bit is 0, it must be set to 1, indicating that the first hash value of the corresponding data segment has expired and must be updated the next time. If a flag bit has been modified, the flag bits must be written to the storage host, so that the latest flag bits are not lost through abnormal conditions such as a power failure.
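By way of non-limiting illustration, the mapping from a write request to its flag bits might be sketched in Python as follows. The function names `touched_segments` and `mark_expired`, and the choice of a `bytearray` as the in-memory flag bitmap, are assumptions made for this example and are not prescribed by the description.

```python
SEGMENT_SIZE = 4 * 1024 * 1024  # 4M data segments, as in the embodiment


def touched_segments(offset: int, length: int) -> range:
    """Return the indices of the data segments covered by a write."""
    if length <= 0:
        return range(0)
    first = offset // SEGMENT_SIZE
    last = (offset + length - 1) // SEGMENT_SIZE
    return range(first, last + 1)


def mark_expired(flags: bytearray, offset: int, length: int) -> bool:
    """Set the flag bit of every touched segment to 1.

    Returns True when at least one bit changed from 0 to 1, which is the
    case in which the flag bits must be persisted to the storage host.
    """
    changed = False
    for seg in touched_segments(offset, length):
        byte, bit = divmod(seg, 8)
        mask = 1 << bit
        if not flags[byte] & mask:
            flags[byte] |= mask
            changed = True
    return changed


# Usage: a 2-byte write straddling the boundary of segments 0 and 1.
flags = bytearray(3200)  # 25600 flag bits for a 100G file
needs_persist = mark_expired(flags, SEGMENT_SIZE - 1, 2)
```

Returning whether any bit changed lets the caller skip the write to the storage host when every affected flag bit was already 1.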
S300. The expired flag bits are selected, the first hash values corresponding to those flag bits are updated, and the second hash value of the entire file is then computed from the first hash values of the data segments.
S301. The expired flag bits are selected, and new first hash values are computed for the data segments whose flag bits have expired.
S302. It is judged whether any flag bit was set to expired while the new first hash values were being computed; if so, step S301 is performed again; if not, the new first hash values are written to the storage host.
S303. The second hash value of the entire file is computed from the first hash values of the data segments.
Step S302 is specifically as follows:
The flag bit is initialized to 0 in the memory of the control host, and it is judged whether any flag bit was set to 1 while the new first hash value was being computed; if so, step S301 is performed again; if not, the new first hash value is written to the storage host.
In this embodiment of the present invention, when the first hash values of the data segments need to be computed, it is first determined which data segments have their expired flag bit set, and the first hash values of those data segments are then updated. For data segments whose flag bit is not set, the first hash value is guaranteed to be up to date and need not be updated. The hash value of the entire file is then computed from the first hash values of all data segments.
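By way of non-limiting illustration, the update of expired first hash values followed by the computation of the whole-file hash might be sketched in Python as follows. The description does not specify how the second hash value is derived from the first hash values; applying crc32 to the concatenated 4-byte first hash values is an assumption of this example, as are the function name `refresh_and_hash` and the use of plain lists for the hashes and flags.

```python
import zlib


def refresh_and_hash(data: bytes, seg_size: int,
                     hashes: list[int], flags: list[int]) -> int:
    """Recompute only the expired first hashes, then hash the hash list."""
    for i, expired in enumerate(flags):
        if expired:
            # Only segments marked as expired are read and hashed again.
            segment = data[i * seg_size:(i + 1) * seg_size]
            hashes[i] = zlib.crc32(segment)
            flags[i] = 0
    # Assumption: the second hash is crc32 over the packed first hashes.
    packed = b"".join(h.to_bytes(4, "big") for h in hashes)
    return zlib.crc32(packed)


# Usage with tiny 4-byte segments so the example stays readable.
data = bytearray(16)
hashes, flags = [0] * 4, [1] * 4
before = refresh_and_hash(bytes(data), 4, hashes, flags)

data[5] = 0xFF   # a write lands in segment 1 ...
flags[1] = 1     # ... so only its flag bit is marked as expired
after = refresh_and_hash(bytes(data), 4, hashes, flags)
```

In this sketch the second call reads only segment 1, which is the saving the method aims for: the cost of a verification pass scales with the amount of modified data rather than with the file size.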
Before a new first hash value is written to the storage host, it must first be determined whether a write request modified the data segment while its first hash value was being computed.
Specifically, the flag bit is first set to 0 in memory without updating the storage host, and the first hash value is then computed. Afterwards, it is checked whether the flag bit in memory has been set again. If it has, a write request modified the data segment while the new first hash value was being computed, so that first hash value is still expired and need not be written to the storage host.
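By way of non-limiting illustration, the check described above might be sketched in Python as follows. The function name `refresh_segment`, the `read_segment` callback, and the `max_attempts` bound are assumptions of this example. The sketch is single-threaded; a real control host would additionally need to make the final check and the write of the hash atomic with respect to concurrent write requests.

```python
import zlib
from typing import Callable


def refresh_segment(seg: int,
                    read_segment: Callable[[int], bytes],
                    mem_flags: list[int],
                    stored_hashes: list[int],
                    max_attempts: int = 8) -> bool:
    """Recompute one first hash, discarding results made stale by a write.

    mem_flags is the in-memory copy of the flag bits; stored_hashes stands
    in for the first hash values kept on the storage host.
    """
    for _ in range(max_attempts):
        mem_flags[seg] = 0                 # clear in memory only
        new_hash = zlib.crc32(read_segment(seg))
        if mem_flags[seg] == 0:            # no write arrived meanwhile
            stored_hashes[seg] = new_hash  # safe to write the new hash
            return True
        # A write set the flag again: the hash is stale, so compute again.
    return False                           # flag is still 1; retry later


# Usage: simulate a write that arrives during the first hash computation.
segment_data = {0: b"old"}
reads = []
mem_flags, stored_hashes = [1], [0]


def racy_read(seg: int) -> bytes:
    payload = segment_data[seg]
    if not reads:                          # first read only
        segment_data[seg] = b"new"         # concurrent write changes data
        mem_flags[seg] = 1                 # and marks the segment expired
    reads.append(seg)
    return payload


ok = refresh_segment(0, racy_read, mem_flags, stored_hashes)
```

Clearing the flag bit before hashing, rather than after, is what makes a concurrent write visible: any write during the computation sets the bit back to 1 and the stale result is discarded.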
Further, the first hash values and the flag bits are stored in additional, newly created files.
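By way of non-limiting illustration, the contents of these additional files might be laid out as in the Python sketch below. The description does not define an on-disk format; the little-endian packing of 4-byte hash values, the one-bit-per-segment flag layout, and the function names are all assumptions of this example.

```python
import struct


def new_metadata(segment_count: int) -> tuple[bytes, bytes]:
    """Return zero-initialized contents for the hash file and the flag file."""
    hash_blob = bytes(4 * segment_count)          # one 4-byte hash per segment
    flag_blob = bytes((segment_count + 7) // 8)   # one flag bit per segment
    return hash_blob, flag_blob


def pack_hashes(hashes: list[int]) -> bytes:
    """Serialize first hash values as unsigned 32-bit little-endian integers."""
    return struct.pack(f"<{len(hashes)}I", *hashes)


def unpack_hashes(blob: bytes) -> list[int]:
    """Inverse of pack_hashes."""
    return list(struct.unpack(f"<{len(blob) // 4}I", blob))


# Usage: metadata for a 100G file split into 4M segments.
hash_blob, flag_blob = new_metadata(25600)
```

Zero-filled contents match the initialization described above, in which all first hash values and flag bits start at 0.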
In addition, on the basis of the method for rapidly verifying the consistency of multiple replicas in distributed storage described above, the present invention further provides a storage architecture, which uses a control host/storage host processing architecture.
A virtual disk is built in the control host, and the control host is used to manage the life cycle of the virtual disk and to perform the receiving, caching and forwarding of data.
The storage host is composed of a plurality of storage media and is used to store redundant data. It is where data are finally stored in the distributed storage system; it abstracts storage resources into multiple storage components, each of which is composed of a chain of large sparse files.
A computer program is stored in the storage architecture, and the computer program, when executed by the control host, implements the steps of any of the above methods for rapidly verifying the consistency of multiple replicas in distributed storage.
In summary, the present invention discloses a method and a storage architecture for rapidly verifying the consistency of multiple replicas in distributed storage, using a control host/storage host processing architecture, comprising: dividing a stored file evenly into a number of data segments in advance, each data segment being provided with its own corresponding first hash value and with a flag bit indicating whether the corresponding first hash value has expired; upon receiving a write request, determining the corresponding flag bit from the offset and length of the write request and setting the flag bit to expired; and selecting the expired flag bits, updating the first hash values corresponding to those flag bits, and then computing a second hash value of the entire file from the first hash values of the data segments. A large file is thus divided into a number of data segments, a hash value is computed for each data segment, and the hash value of the entire file is then computed from the hash values of the data segments. With this method it is only necessary to record which data segments have been modified and then update the hash values of those data segments, so the data of the entire file need not be read when verifying consistency. This greatly increases the speed of consistency verification and reduces the consumption of storage host bandwidth. In addition, because hash values are computed per data segment, the computation is easier to run concurrently when the system is idle, which greatly accelerates data verification.
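By way of non-limiting illustration, the concurrent per-segment computation mentioned above might be sketched in Python with a thread pool. The function name `hash_segments_concurrently` and the default of four workers are assumptions of this example; the description does not prescribe a particular concurrency mechanism.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def hash_segments_concurrently(segments: list[bytes],
                               workers: int = 4) -> list[int]:
    """Compute the crc32 of every segment using a pool of worker threads.

    Executor.map returns results in input order, so the i-th result is
    the first hash value of the i-th data segment.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(zlib.crc32, segments))


# Usage: hash four small stand-in segments.
segments = [bytes([i]) * 1024 for i in range(4)]
first_hashes = hash_segments_concurrently(segments)
```

Because each data segment is hashed independently of the others, no coordination between workers is needed beyond collecting the results in segment order.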
It should be understood that the application of the present invention is not limited to the above examples. Those of ordinary skill in the art may make improvements or modifications in light of the above description, and all such improvements and modifications shall fall within the scope of protection of the appended claims of the present invention.