CN112052114A

CN112052114A - Data storage and recovery method, coder-decoder and coder-decoder system

Info

Publication number: CN112052114A
Application number: CN202010880335.2A
Authority: CN
Inventors: 吕川; 张晓星; 张炜
Original assignee: Jiangsu Super Flow Technology Co ltd
Current assignee: Jiangsu Super Flow Technology Co ltd
Priority date: 2020-08-27
Filing date: 2020-08-27
Publication date: 2020-12-08
Anticipated expiration: 2040-08-27
Also published as: CN112052114B

Abstract

The invention discloses a data storage and recovery method, a coder-decoder and a coder-decoder system, wherein the data storage method comprises the following steps: cutting an original file to be stored into a plurality of file segments; determining a generating matrix according to the number of the preset data blocks and the number of the preset check blocks; partitioning the generator matrix into a plurality of low-dimensional sub-matrices; based on an erasure code coding algorithm, the low-dimensional sub-matrix is adopted to code each file segment in batches; and performing distributed storage on each encoded file segment. The data storage and recovery method, the coder-decoder and the coding-decoding system can improve the data processing speed.

Description

Data storage and recovery method, coder-decoder and coder-decoder system

Technical Field

The present invention relates to the field of data storage, and more particularly, to a data storage and recovery method, a codec, and a codec system.

Background

Erasure coding (EC), an emerging data storage scheme, has become one of the mainstream schemes for large-scale data storage in recent years. Under an EC mechanism, data is cut into slices and stored on the hard disks of different nodes with a certain redundancy ratio, so that no data is lost when some devices fail while storage space is still saved; this addresses both storage reliability and space utilization. The following is a brief description of the Reed-Solomon (RS) erasure coding mechanism, which is the mainstream class of EC codes.

The erasure codes have two core parameters m and n, which represent the number of check blocks and the number of original data blocks, respectively. The erasure coding process is shown in fig. 1. In the encoding process of fig. 1, C represents a check block and D represents an original data block. B represents a generator matrix.
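As a hedged illustration of the encoding in fig. 1 (the figure itself is not reproduced here), the relation can be written as below. This sketch assumes a systematic generator matrix whose top n rows form the identity and whose bottom m rows form a parity part; the symbols $I_n$, $P$ and $\mathbb{F}$ (the underlying field) are assumptions introduced for illustration and do not appear in the original text.

```latex
B \cdot D =
\begin{bmatrix} I_n \\ P \end{bmatrix}
\begin{bmatrix} d_1 \\ \vdots \\ d_n \end{bmatrix}
=
\begin{bmatrix} d_1 \\ \vdots \\ d_n \\ c_1 \\ \vdots \\ c_m \end{bmatrix}
=
\begin{bmatrix} D \\ C \end{bmatrix},
\qquad
B \in \mathbb{F}^{(n+m) \times n},\quad P \in \mathbb{F}^{m \times n}
```

Under this assumption the first n output rows reproduce the original data blocks D unchanged, and the last m rows are the check blocks C.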

If part of the data is lost, as shown in fig. 2, where the crosses indicate the lost part, 3 data blocks are lost. The rows corresponding to the lost data blocks are removed from the generator matrix B used for encoding, so that B becomes a new n × n matrix B'. Meanwhile, the vector formed by C and D is reduced from n + m rows to n rows. This yields the correspondence between the new matrix B' and the vector of remaining data blocks, survivors, as shown in fig. 3. After some data blocks are lost, the goal of decoding is to find the original data vector D. In this case, only the inverse matrix $(B')^{-1}$ of B' needs to be obtained; the data vector D is then given by the formula $D = (B')^{-1} \cdot \text{survivors}$, which achieves the purpose of recovering the original data.
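The recovery relation D = (B')^-1 · survivors can be checked with a small worked example. The following is a minimal sketch, not the patented implementation: it assumes arithmetic over the prime field GF(257) for readability (practical RS codes typically use GF(2^8)), a systematic generator matrix with Cauchy-style check rows, and hypothetical helper names `generator`, `mat_vec` and `invert`.

```python
P = 257  # assumed prime modulus, chosen only to keep the arithmetic simple


def generator(n, m):
    """(n+m) x n matrix: identity rows on top, Cauchy-style check rows below."""
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    checks = [[pow(n + i + j, P - 2, P) for j in range(n)] for i in range(m)]
    return identity + checks


def mat_vec(M, v):
    """Matrix-vector product modulo P."""
    return [sum(a * b for a, b in zip(row, v)) % P for row in M]


def invert(M):
    """Gauss-Jordan inversion of a square matrix modulo P."""
    n = len(M)
    A = [list(row) + [int(i == j) for j in range(n)] for i, row in enumerate(M)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if A[r][col] % P != 0)
        A[col], A[pivot] = A[pivot], A[col]
        inv = pow(A[col][col], P - 2, P)  # Fermat inverse of the pivot
        A[col] = [x * inv % P for x in A[col]]
        for r in range(n):
            if r != col and A[r][col]:
                f = A[r][col]
                A[r] = [(x - f * y) % P for x, y in zip(A[r], A[col])]
    return [row[n:] for row in A]


n, m = 3, 2
B = generator(n, m)
D = [10, 20, 30]                      # original data vector
stored = mat_vec(B, D)                # n data blocks followed by m check blocks
alive = [1, 3, 4]                     # blocks 0 and 2 are assumed lost
B_new = [B[i] for i in alive]         # B': rows of the lost blocks removed
survivors = [stored[i] for i in alive]
recovered = mat_vec(invert(B_new), survivors)
```

Any n surviving rows would serve equally well here, because every square sub-matrix of the check part is invertible under the stated assumptions.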

The inventors found that the EC mechanism uses matrix multiplication in both the encoding and decoding processes. Therefore, although EC codes can save a great deal of space, they generate extra computation in the storage and recovery processes. This is especially true for systems that process large amounts of data: when a storage node failure triggers data recovery, a large number of files must be decoded, which consumes a large amount of CPU resources, may affect the overall performance of the system, and slows down normal computing tasks. In addition, during data recovery the computing node needs to read data blocks and check blocks from other nodes, which may produce a peak in network resource usage within a short time.

The information disclosed in this background section is only for enhancement of understanding of the general background of the invention and should not be taken as an acknowledgement or any form of suggestion that this information forms the prior art already known to a person skilled in the art.

Disclosure of Invention

The invention aims to provide a data storage and recovery method, a coder-decoder and a coder-decoder system, which can improve the data processing speed.

In order to achieve the above object, the present invention provides a data storage method, which includes: cutting an original file to be stored into a plurality of file segments; determining a generating matrix according to the number of the preset data blocks and the number of the preset check blocks; partitioning the generator matrix into a plurality of low-dimensional sub-matrices; based on an erasure code coding algorithm, the low-dimensional sub-matrix is adopted to code each file segment in batches; and performing distributed storage on each encoded file segment.

In one embodiment of the present invention, cutting the original file to be stored into a plurality of file segments comprises: cutting at a fixed length, and arranging the cut file segments either in their original order or in a random, shuffled order, according to an instruction from the control unit.

In an embodiment of the present invention, a matrix operation module in a hardware form, which is provided in an FPGA chip or a GPU chip, is used to perform batch coding.

Based on the same inventive concept, the invention also provides a data recovery method, which comprises the following steps: scanning a file to be restored; according to the scan result, removing the matrix rows corresponding to lost data blocks from the generator matrix used for erasure coding to obtain a new matrix; solving the inverse matrix of the new matrix; partitioning the inverse matrix into a plurality of low-dimensional sub-matrices; and batch-decoding each file segment using the low-dimensional sub-matrices based on an erasure code decoding algorithm.

In an embodiment of the present invention, a matrix operation module in a hardware form, which is provided in an FPGA chip or a GPU chip, is used to perform batch decoding.

Based on the same inventive concept, the invention also provides an encoder which is an independent plug-in and is matched with the FPGA chip or the GPU chip for executing the data storage method, wherein a matrix operation module in a hardware form is arranged in the FPGA chip or the GPU chip, so that the encoder can carry out batch encoding.

Based on the same inventive concept, the invention also provides a decoder which is an independent plug-in and is matched with the FPGA chip or the GPU chip for executing the data recovery method, wherein the FPGA chip or the GPU chip is internally provided with a matrix operation module in a hardware form, so that the decoder can perform batch decoding.

Based on the same inventive concept, the invention also provides a codec, wherein the codec is an independent plug-in that is matched with the FPGA chip or the GPU chip and is used for executing the encoding operation and the decoding operation. The FPGA chip or the GPU chip is provided with a matrix operation module in hardware form, which batch-encodes each file segment using the low-dimensional sub-matrices based on an erasure coding algorithm.

In the process of executing the decoding operation, the codec scans a file to be recovered and, according to the scan result, removes the matrix rows corresponding to lost data blocks from the generator matrix used for erasure coding to obtain a new matrix; the FPGA chip or the GPU chip batch-decodes each file segment using the low-dimensional sub-matrices based on an erasure code decoding algorithm.

Based on the same inventive concept, the invention also provides a coding and decoding system based on the cloud platform, which comprises: the system comprises a first virtual machine cluster, a second virtual machine cluster and a third virtual machine cluster. The first virtual machine cluster is used for interacting with the cloud platform through the application programming interface and executing encoding operation and decoding operation. The second virtual machine cluster is coupled with the first virtual machine cluster and used for verifying and storing the data link information file in the encoding and decoding process, and the first virtual machine cluster is also used for sending a storage or fetching instruction to the second virtual machine cluster. The third virtual machine cluster is coupled with the first virtual machine cluster and used for storing the encoded data block file in a distributed mode, and the first virtual machine cluster is further used for sending an instruction for storing or fetching to the third virtual machine cluster. The first virtual machine cluster is provided with the coder and the decoder, and also provided with the FPGA chip or the GPU chip.

In an embodiment of the present invention, the encoding and decoding system further includes: and the fourth virtual machine cluster is coupled with the first virtual machine cluster, the second virtual machine cluster and the third virtual machine cluster, and is used for monitoring and managing the working condition of each virtual machine cluster and receiving a request sent in a Web form for expanding, deleting or updating the virtual machine nodes in the virtual machine cluster.

Compared with the prior art, according to the data storage and recovery method, the coder-decoder and the coding-decoding system, the matrix is divided into a plurality of low-dimensional sub-matrices in the coding-decoding process, and the sub-matrices are adopted for batch coding or batch decoding, so that the operation efficiency can be improved. Preferably, a hardware matrix operation module arranged in an FPGA chip or a GPU chip is adopted to perform batch matrix operation, so that the operation efficiency is greatly improved, the resource consumption of a CPU is reduced, and the system performance is improved. Preferably, in the encoding process, when the file is segmented and sliced, the switching between the continuous arrangement and the random arrangement of the file segments can be realized through the control of the control unit, the occupation of network resources can be further reduced and the reading speed can be increased under the continuous arrangement, and the data security can be ensured to a greater extent under the random arrangement.

Drawings

FIG. 1 is an erasure coding algorithm according to the prior art;

FIG. 2 is a diagram of the transformation of the generator matrix into a new matrix after data blocks are lost, according to the prior art;

FIG. 3 is a correspondence of a new matrix to a residual data block vector according to the prior art;

FIG. 4 is a schematic diagram of a data storage method according to an embodiment of the invention;

FIG. 5 is a schematic diagram of a data recovery method according to an embodiment of the invention;

FIG. 6 is a schematic composition diagram of a cloud platform-based codec system according to an embodiment of the present invention.

Detailed Description

The following detailed description of the present invention is provided in conjunction with the accompanying drawings, but it should be understood that the scope of the present invention is not limited to the specific embodiments.

Throughout the specification and claims, unless explicitly stated otherwise, the word "comprise", or variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated element or component but not the exclusion of any other element or component.

In order to overcome the problems in the prior art, the present invention first provides a data storage method. As shown in fig. 4, the data storage method of an embodiment includes steps S10 to S14.

File cutting is performed in step S10: the original file to be stored is cut into a plurality of file segments. Preferably, cutting the original file to be stored into a plurality of file segments comprises: cutting at a fixed length, and arranging the cut file segments either in their original order or in a random, shuffled order, according to an instruction from the control unit. In this embodiment, the control unit allows flexible switching between random arrangement and sequential arrangement of the cut file segments: sequential arrangement can further reduce the occupation of network resources and increase the reading speed, while random arrangement can ensure data security to a greater extent.
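The fixed-length cutting of step S10, with a switch between sequential and random arrangement, might be sketched as follows. This is only an illustrative sketch: the function names `cut_file` and `reassemble`, the `shuffle` flag standing in for the control unit's instruction, and the choice to keep each segment's original index alongside it are all assumptions not stated in the original text. Padding of a short final segment is omitted.

```python
import random


def cut_file(data: bytes, segment_len: int, shuffle: bool = False, seed=None):
    """Cut data into fixed-length segments, returned as (original_index, segment).

    shuffle=False keeps the original order; shuffle=True arranges the
    segments in a random order (hypothetical stand-in for the control unit).
    """
    segments = [
        (index, data[offset:offset + segment_len])
        for index, offset in enumerate(range(0, len(data), segment_len))
    ]
    if shuffle:
        random.Random(seed).shuffle(segments)
    return segments


def reassemble(segments):
    """Restore the original byte string from (index, segment) pairs."""
    return b"".join(seg for _, seg in sorted(segments, key=lambda s: s[0]))


data = bytes(range(10))
ordered = cut_file(data, 4)                         # sequential arrangement
shuffled = cut_file(data, 4, shuffle=True, seed=7)  # random arrangement
```

Keeping the original index with each segment is one simple way to make the random arrangement reversible; the original text does not say how the ordering information is recorded.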

The generator matrix is determined in step S11: a generator matrix is determined according to the preset number of data blocks and the preset number of check blocks. The sub-matrices of the generator matrix are invertible, as is the case for a Cauchy matrix, for example.
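One way to obtain a generator matrix with the invertibility property mentioned for step S11 is sketched below. It assumes a systematic layout (identity rows followed by Cauchy-style check rows) and arithmetic over the prime field GF(257); both are illustrative assumptions, and the names `generator_matrix` and `is_invertible` are hypothetical. The check at the end confirms that every choice of n rows gives an invertible n × n matrix.

```python
from itertools import combinations

P = 257  # assumed prime modulus; practical RS codes typically use GF(2^8)


def generator_matrix(n, m):
    """(n+m) x n generator: identity on top, Cauchy entries 1/(x_i + y_j) below."""
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    # x_i = n + i and y_j = j, so x_i + y_j is never 0 modulo P for small n, m
    cauchy = [[pow(n + i + j, P - 2, P) for j in range(n)] for i in range(m)]
    return identity + cauchy


def is_invertible(M):
    """Forward elimination modulo P; False if some column has no pivot."""
    A = [list(row) for row in M]
    size = len(A)
    for col in range(size):
        pivot = next((r for r in range(col, size) if A[r][col]), None)
        if pivot is None:
            return False
        A[col], A[pivot] = A[pivot], A[col]
        inv = pow(A[col][col], P - 2, P)
        for r in range(col + 1, size):
            f = A[r][col] * inv % P
            A[r] = [(x - f * y) % P for x, y in zip(A[r], A[col])]
    return True


n, m = 4, 2
B = generator_matrix(n, m)
all_recoverable = all(
    is_invertible([B[r] for r in rows])
    for rows in combinations(range(n + m), n)
)
```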

Matrix division is performed in step S12: partitioning the generator matrix into a plurality of low-dimensional sub-matrices.

In step S13, batch encoding is performed: based on an erasure coding algorithm, each file segment is batch-encoded using the low-dimensional sub-matrices. Preferably, a matrix operation module in hardware form provided in an FPGA chip or a GPU chip is used for batch encoding, which can greatly improve operation efficiency, reduce CPU resource consumption, and improve system performance. The dimension of the sub-matrices depends on the type of FPGA chip or GPU chip. Taking the NVIDIA GPU architecture that is mainstream on the market as an example, its proprietary Tensor Core matrix operation module implements in hardware the multiply-add operations on 4 × 4 FP16/FP32 matrices commonly used in deep learning. The matrix B can therefore be partitioned into a set of 4-dimensional sub-matrices, and the Tensor Cores can be used to perform batched, modular operations on the 4 × 4 matrices. In FP16, FP stands for floating point and 16 indicates that the data length is 16 bits (half precision).
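The partitioning into 4 × 4 sub-matrices can be illustrated with ordinary block matrix multiplication. The sketch below is an assumption about how the tile products might be combined, since the original text does not spell this out; it uses plain Python integers on the CPU rather than FP16/FP32 values on a Tensor Core, and the names `tiled_matmul`, `tiles` and `pad` are hypothetical. Each call to `matmul` on a pair of 4 × 4 tiles stands in for one hardware multiply-add operation.

```python
TILE = 4  # tile size taken from the 4 x 4 example in the text


def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def pad(M, rows, cols):
    """Zero-pad M to rows x cols so that it splits evenly into tiles."""
    padded = [list(r) + [0] * (cols - len(r)) for r in M]
    return padded + [[0] * cols for _ in range(rows - len(M))]


def tiles(M):
    """Split a padded matrix into {(block_row, block_col): 4x4 tile}."""
    return {
        (i // TILE, j // TILE): [row[j:j + TILE] for row in M[i:i + TILE]]
        for i in range(0, len(M), TILE)
        for j in range(0, len(M[0]), TILE)
    }


def tiled_matmul(A, B):
    """Multiply A by B using only 4x4 tile products, then accumulate them."""
    r, k, c = len(A), len(B), len(B[0])
    up = lambda x: -(-x // TILE) * TILE  # round up to a multiple of TILE
    At, Bt = tiles(pad(A, up(r), up(k))), tiles(pad(B, up(k), up(c)))
    out = [[0] * up(c) for _ in range(up(r))]
    for (bi, bk), a_tile in At.items():
        for bj in range(up(c) // TILE):
            prod = matmul(a_tile, Bt[(bk, bj)])  # one 4x4 multiply
            for i in range(TILE):
                for j in range(TILE):
                    out[bi * TILE + i][bj * TILE + j] += prod[i][j]
    return [row[:c] for row in out[:r]]


# Hypothetical 6 x 5 coefficient matrix applied to 3 columns of data at once
G = [[(i * 7 + j * 3) % 11 for j in range(5)] for i in range(6)]
X = [[(i + 2 * j) % 5 for j in range(3)] for i in range(5)]
encoded = tiled_matmul(G, X)
```

Because the tile products are independent of one another, they are natural candidates for batched execution, which is the property the step relies on.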

File storage is performed in step S14: and performing distributed storage on each encoded file segment.

Based on the same inventive concept, the invention also provides a data recovery method. As shown in fig. 5, a data recovery method according to an embodiment includes: step S20 to step S24.

The file to be restored is scanned in step S20.

In step S21, according to the scan result, the rows corresponding to the lost data blocks are removed from the generator matrix to obtain a new matrix.

In step S22, the new matrix is inverted.

The inverse matrix is divided into a plurality of low-dimensional sub-matrices in step S23.

Batch decoding is performed in step S24: and decoding each file segment in batch by adopting the low-dimensional submatrix based on an erasure code decoding algorithm. Preferably, a matrix operation module in a hardware form arranged in an FPGA chip or a GPU chip is used for batch decoding.
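Steps S20 and S21 above can be sketched as a scan followed by row selection. This is a minimal illustration under stated assumptions: lost blocks are represented as `None`, the first n surviving blocks are used, and the function names `select_survivors` and `reduced_matrix` are hypothetical. The inversion and tiled decoding of steps S22 to S24 are not repeated here.

```python
def select_survivors(blocks, n):
    """Scan the stored blocks and return the indices of n surviving ones.

    blocks has length n + m; a lost block is represented by None
    (an assumption made for this sketch).
    """
    alive = [i for i, block in enumerate(blocks) if block is not None]
    if len(alive) < n:
        raise ValueError("more than m blocks lost; the data cannot be recovered")
    return alive[:n]


def reduced_matrix(B, rows):
    """Keep only the generator-matrix rows of the surviving blocks (B')."""
    return [B[i] for i in rows]


# Placeholder 6 x 4 matrix whose entries simply encode their row and column
B = [[i * 10 + j for j in range(4)] for i in range(6)]
blocks = [b"a", None, b"c", b"d", None, b"f"]   # blocks 1 and 4 are lost
rows = select_survivors(blocks, n=4)
B_new = reduced_matrix(B, rows)
```

Since B' depends only on which blocks are lost, its inverse can be computed once and reused for every file segment that shares the same loss pattern.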

Based on the same inventive concept, the invention also provides an encoder. In one embodiment, the encoder is a stand-alone plug-in that cooperates with an FPGA chip or a GPU chip to perform the data storage method of the embodiment shown in fig. 4. The encoder is used as an independent plug-in, can be flexibly distributed to specified files in a system for use, and avoids increasing the complexity of a system architecture.

Based on the same inventive concept, the invention also provides a decoder. In one embodiment, the decoder is a stand-alone plug-in that cooperates with an FPGA chip or a GPU chip to perform the data recovery method of the embodiment shown in fig. 5. The decoder is used as an independent plug-in, and can be flexibly distributed to specified files in a system for use, so that the complexity of the system architecture is prevented from being increased.

Based on the same inventive concept, the invention also provides a codec. In one embodiment, the codec is a stand-alone plug-in that cooperates with the FPGA chip or the GPU chip to perform the encoding and decoding operations. The FPGA chip or the GPU chip is provided with a matrix operation module in hardware form, which batch-encodes each file segment using the low-dimensional sub-matrices based on an erasure coding algorithm. Optionally, in an embodiment, the codec is further configured to perform distributed storage on the encoded file segments.

In the process of executing the decoding operation, the codec scans a file to be recovered, removes the rows of the generator matrix corresponding to lost data blocks according to the scan result to obtain a new matrix, solves the inverse matrix of the new matrix, and partitions the inverse matrix into a plurality of low-dimensional sub-matrices; the FPGA chip or the GPU chip then batch-decodes each file segment using the low-dimensional sub-matrices based on an erasure code decoding algorithm. As an independent plug-in, the codec can be flexibly assigned to specified files in a system, which avoids increasing the complexity of the system architecture.

Based on the same inventive concept, the present invention further provides a cloud platform based coding and decoding system, as shown in fig. 6, in an embodiment, the coding and decoding system includes: a first virtual machine cluster 10, a second virtual machine cluster 11, and a third virtual machine cluster 12.

The first virtual machine cluster 10 is used to interact with the cloud platform through the application programming interface 10 a. The first virtual machine cluster 10 is provided with a codec 10b, an FPGA chip 10c and/or a GPU chip 10d, and is configured to perform encoding operation and decoding operation. When batch encoding or batch decoding is carried out, batch matrix operation is carried out through a matrix operation module in the FPGA chip 10c or the GPU chip 10d, and operation efficiency is greatly improved. The number of matrix operation modules for batch matrix calculation is the same as the number of sub-matrices. One FPGA chip 10c or GPU chip 10d may include one or more matrix operation modules. If one FPGA chip 10c or GPU chip 10d includes one matrix operation module, a plurality of FPGA chips 10c or GPU chips 10d are required to complete batch matrix operations.
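The relationship between the number of sub-matrices, matrix operation modules and chips described above reduces to simple counting. The sketch below assumes 4 × 4 tiles with zero padding at the edges and one module per sub-matrix, as stated in the text; the function names and the example dimensions are hypothetical.

```python
import math


def count_submatrices(rows, cols, tile=4):
    """Number of tile x tile sub-matrices needed to cover a rows x cols matrix."""
    return math.ceil(rows / tile) * math.ceil(cols / tile)


def chips_needed(num_submatrices, modules_per_chip):
    """Chips required when each sub-matrix gets its own matrix operation module."""
    return math.ceil(num_submatrices / modules_per_chip)


# Illustrative case: n = 8 data blocks, m = 4 check blocks, so B is 12 x 8
submatrices = count_submatrices(12, 8)
single_module_chips = chips_needed(submatrices, modules_per_chip=1)
multi_module_chips = chips_needed(submatrices, modules_per_chip=4)
```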

Optionally, the first virtual machine cluster 10 is also used for monitoring files that need distributed storage.

The second virtual machine cluster 11 is coupled to the first virtual machine cluster 10, and is configured to verify and store a data link information file in an encoding and decoding process, and the first virtual machine cluster 10 is further configured to send an instruction for storing or retrieving to the second virtual machine cluster 11.

The third virtual machine cluster 12 is coupled to the first virtual machine cluster 10, and is configured to store the encoded data block file in a distributed manner, and the first virtual machine cluster 10 is further configured to send an instruction for storing or retrieving to the third virtual machine cluster 12.

Optionally, the encoding and decoding system of an embodiment further includes: and a fourth virtual machine cluster 13, coupled to the first virtual machine cluster 10, the second virtual machine cluster 11, and the third virtual machine cluster 12, for monitoring and managing the working conditions of each virtual machine cluster, and receiving a request sent in a Web form to expand, delete, or update the virtual machine nodes in the virtual machine cluster. The encoding and decoding system according to the embodiment adopts a reasonable software framework, can be better compatible with the current mainstream cloud computing and storage framework, and meets the requirements of easy expansion and integration.

In summary, according to the data storage and recovery method, the codec and the codec system of the present embodiment, the matrix is divided into a plurality of low-dimensional sub-matrices in the encoding and decoding process, and the sub-matrices are used for batch encoding or batch decoding, so that the operation efficiency can be improved. Preferably, a hardware matrix operation module provided in an FPGA chip or a GPU chip performs the batch matrix operations instead of handing them to a CPU. This greatly improves the operation efficiency: on average, performance can reach 15 to 30 times that of CPU processing, and energy efficiency can reach 30 to 80 times that of CPU processing, so that the resource consumption of the CPU is greatly reduced and the system performance is improved. Because the encoding and decoding capacity is greatly improved, the scheme can support a higher EC cutting number and therefore achieves a higher storage resource utilization rate than the original EC code. Preferably, in the encoding process, when the file is cut into segments, switching between sequential arrangement and random arrangement of the file segments can be realized under the control of the control unit: sequential arrangement can further reduce the occupation of network resources and increase the reading speed, while random arrangement can ensure data security to a greater extent.
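The link between a higher EC cutting number and storage utilization can be made concrete with a short calculation. The sketch assumes that "EC cutting number" refers to the number of data blocks n while the number of check blocks m is held fixed; the symbol $\eta$ and the numeric values are illustrative and are not taken from the original text.

```latex
\eta = \frac{n}{n+m}, \qquad
\eta\big|_{n=4,\,m=2} = \frac{4}{6} \approx 66.7\%, \qquad
\eta\big|_{n=10,\,m=2} = \frac{10}{12} \approx 83.3\%
```

Under this reading, a larger n raises the fraction of stored bytes that carry original data, at the cost of larger matrices to multiply and invert.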

As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

The foregoing descriptions of specific exemplary embodiments of the present invention have been presented for purposes of illustration and description. It is not intended to limit the invention to the precise form disclosed, and obviously many modifications and variations are possible in light of the above teaching. The exemplary embodiments were chosen and described in order to explain certain principles of the invention and its practical application to enable one skilled in the art to make and use various exemplary embodiments of the invention and various alternatives and modifications as are suited to the particular use contemplated. It is intended that the scope of the invention be defined by the claims and their equivalents.

Claims

1. A method of storing data, comprising:

cutting an original file to be stored into a plurality of file segments;

determining a generating matrix according to the number of the preset data blocks and the number of the preset check blocks;

partitioning the generator matrix into a plurality of low-dimensional sub-matrices;

based on an erasure code coding algorithm, the low-dimensional sub-matrix is adopted to code each file segment in batches;

and performing distributed storage on each encoded file segment.

2. The data storage method of claim 1, wherein cutting the original file to be stored into a plurality of file segments comprises:

cutting according to a fixed length, and arranging the cut file segments according to the original sequence or randomly according to the disordered sequence according to the instruction of the control unit.

3. The data storage method according to claim 1, wherein the matrix operation module in the form of hardware provided in an FPGA chip or a GPU chip is used for batch coding.

4. A method for data recovery, comprising:

scanning a file to be restored;

according to the scanning condition, in a generated matrix for erasure code coding, removing matrix rows corresponding to lost data blocks to obtain a new matrix;

solving the inverse matrix of the new matrix;

partitioning the inverse matrix into a plurality of low-dimensional sub-matrices;

and decoding each file segment in batch by adopting the low-dimensional submatrix based on an erasure code decoding algorithm.

5. The data recovery method according to claim 4, wherein the matrix operation module in the form of hardware provided in the FPGA chip or the GPU chip is used for batch decoding.

6. An encoder, characterized in that the encoder is an independent plug-in, which is cooperated with an FPGA chip or a GPU chip for executing the data storage method of claim 1, wherein a matrix operation module in a hardware form is provided in the FPGA chip or the GPU chip, so that the encoder can perform batch encoding.

7. A decoder, characterized in that the decoder is an independent plug-in which is matched with an FPGA chip or a GPU chip for executing the data recovery method according to claim 4, wherein a matrix operation module in a hardware form is arranged in the FPGA chip or the GPU chip, so that the decoder can perform batch decoding.

8. Codec, characterized in that the codec is a stand-alone plug-in that cooperates with an FPGA chip or a GPU chip for performing encoding operations and decoding operations,

wherein, in the process of executing the encoding operation, the codec is used for cutting an original file to be stored into a plurality of file segments, determining a generator matrix according to the preset number of data blocks and the preset number of check blocks, and partitioning the generator matrix into a plurality of low-dimensional sub-matrices, and a matrix operation module in hardware form is provided in the FPGA chip or the GPU chip and is used for batch-encoding each file segment using the low-dimensional sub-matrices based on an erasure coding algorithm; and

9. A coding and decoding system based on a cloud platform is characterized by comprising:

the first virtual machine cluster is used for interacting with the cloud platform through an application programming interface and executing encoding operation and decoding operation;

the second virtual machine cluster is coupled with the first virtual machine cluster and used for verifying and storing the data link information file in the encoding and decoding process, and the first virtual machine cluster is also used for sending a storage or fetching instruction to the second virtual machine cluster;

a third virtual machine cluster coupled to the first virtual machine cluster and configured to store the encoded data block file in a distributed manner, wherein the first virtual machine cluster is further configured to send an instruction for storing or retrieving to the third virtual machine cluster;

the codec of claim 8 is arranged in the first virtual machine cluster, and the FPGA chip or the GPU chip of claim 8 is further arranged in the first virtual machine cluster.

10. The cloud platform-based codec system of claim 9, further comprising:

and the fourth virtual machine cluster is coupled with the first virtual machine cluster, the second virtual machine cluster and the third virtual machine cluster, and is used for monitoring and managing the working condition of each virtual machine cluster and receiving a request sent in a Web form for expanding, deleting or updating the virtual machine nodes in the virtual machine clusters.