CN112799872B

CN112799872B - Erasure code encoding method and device based on a key-value pair storage system

Info

Publication number: CN112799872B
Application number: CN202110191784.0A
Authority: CN
Inventors: 李颉; 吴晨涛; 过敏意; 薛广涛; 张弛
Original assignee: Shanghai Jiao Tong University
Current assignee: Shanghai Jiao Tong University
Priority date: 2021-02-19
Filing date: 2021-02-19
Publication date: 2022-08-12
Anticipated expiration: 2041-02-19
Also published as: CN112799872A

Abstract

The application discloses an erasure code encoding method and device based on a key-value pair storage system. The method comprises the following steps: acquiring an association relationship between keys in a key-value pair storage system; acquiring, from the key-value pair storage system according to the association relationship between the keys, at least two corresponding target data having strong association or strong temporal locality; dividing the at least two target data into the same coding group for encoding to obtain corresponding data blocks and parity blocks; and writing the obtained data blocks and parity blocks into corresponding storage nodes using load balancing. By implementing this application, traditional erasure coding technology can be designed in combination with a novel key-value pair storage system while taking the relationships between data in the key-value pair storage system into account, thereby reducing the number and duration of data accesses and improving data recovery efficiency.

Description

Erasure code encoding method and device based on a key-value pair storage system

Technical Field

The present application relates to the technical field of cloud storage, and in particular to an erasure code encoding method and device based on a key-value pair storage system.

Background

With the emergence of new storage technologies and hardware, and with the application performance gains brought by the underlying design of key-value pair storage systems, more and more server clusters are adopting key-value pair storage systems to store data. To ensure data reliability and availability, erasure coding is commonly used when storing the data.

However, traditional erasure coding technology mainly targets traditional fixed-length block storage devices. Faced with the variable-length block storage of novel key-value pair storage systems, it can neither adapt well to such systems nor provide them with technical support. Moreover, associations always exist between different blocks in a key-value pair storage system, which traditional erasure coding technology can hardly reflect or exploit.

Therefore, an erasure coding scheme suitable for key-value pair storage systems is urgently needed.

Summary of the Invention

To overcome the above deficiencies of the prior art, the purpose of the present application is to provide an erasure code encoding method and device based on a key-value pair storage system. The application designs traditional erasure coding technology in combination with a novel key-value pair storage system and takes the relationships between data in the key-value pair storage system into account, which can reduce the number and duration of data accesses and improve data recovery efficiency.

To achieve the above and other purposes, the present application proposes an erasure code encoding method based on a key-value pair storage system, comprising the following steps:

acquiring an association relationship between keys in the key-value pair storage system, the association relationship indicating that the data corresponding to the keys have strong association or strong temporal locality, and the key-value pair storage system storing data in the form of key-value pairs;

acquiring, from the key-value pair storage system according to the association relationship between the keys, at least two corresponding target data having strong association or strong temporal locality;

dividing the at least two target data into the same coding group for encoding to obtain corresponding data blocks and parity blocks;

writing the obtained data blocks and parity blocks into corresponding storage nodes using load balancing.

Optionally, the association relationship includes at least one of the following: a parent-child containment relationship, a relationship with strong access correlation, or a relationship with a strong access order.

Optionally, the strong temporal locality between the data is obtained by analyzing data access characteristics, and the strong association between the data is obtained by analyzing attribute information of the data.

Optionally, the key-value pair storage system stores keys, the values corresponding to the keys, the correspondences between keys and values, and the association relationships between keys.

Optionally, the data blocks store associated data of the at least two target data, and the method further includes:

receiving a data recovery request, the data recovery request being used to request reading of data blocks or parity blocks in a target node so that lost data can be recovered based on the data blocks or parity blocks, the lost data being part of the data in any one of the target data;

in response to the data recovery request, performing a recovery calculation on the data blocks or parity blocks read from the target node to obtain the lost data, while preparing, according to the associated data, for access to the next target data that has strong association or strong temporal locality with any one of the target data.

To achieve the above and other purposes, the present application further provides an erasure code encoding device based on a key-value pair storage system, comprising:

an acquiring unit, configured to acquire an association relationship between keys in the key-value pair storage system, the association relationship indicating that the data corresponding to the keys have strong association or strong temporal locality, and the key-value pair storage system storing data in the form of key-value pairs;

the acquiring unit being further configured to acquire, from the key-value pair storage system according to the association relationship between the keys, at least two corresponding target data having strong association or strong temporal locality;

an encoding unit, configured to divide the at least two target data into the same coding group for encoding to obtain corresponding data blocks and parity blocks;

a writing unit, configured to write the obtained data blocks and parity blocks into corresponding storage nodes using load balancing.

Optionally, the data blocks store associated data of the at least two target data, and the device further includes a receiving unit and a recovery unit.

The receiving unit is configured to receive a data recovery request, the data recovery request being used to request reading of data blocks or parity blocks in a target node so that lost data can be recovered based on the data blocks or parity blocks, the lost data being part of the data in any one of the target data.

The recovery unit is configured to, in response to the data recovery request, perform a recovery calculation on the data blocks or parity blocks read from the target node to obtain the lost data, while preparing, according to the associated data, for access to the next target data that has strong association or strong temporal locality with any one of the target data.

As can be seen from the above, the present application provides an erasure code encoding method and device based on a key-value pair storage system, which achieve the following beneficial effects. The present application designs traditional erasure coding technology in combination with a novel key-value pair storage system and brings the relationships between data in the key-value pair storage system into the design, so that a key-value pair storage system can also use erasure coding for data recovery. Strongly correlated data can thus be recovered together more quickly during data recovery, which improves data recovery efficiency compared with the prior art and ensures better quality of service for users. In addition, because the present application places data having strong association or strong temporal locality into the same coding group for encoding, the number of node accesses and the overall access time can also be reduced during data access.

Description of the Drawings

FIG. 1 is a schematic flowchart of an erasure code encoding method based on a key-value pair storage system provided by an embodiment of the present application.

FIG. 2 is a schematic structural diagram of an erasure code encoding device based on a key-value pair storage system provided by an embodiment of the present application.

FIG. 3 is a schematic structural diagram of another erasure code encoding device based on a key-value pair storage system provided by an embodiment of the present application.

Detailed Description

The embodiments of the present application are described below through specific examples with reference to the accompanying drawings, and those skilled in the art can readily understand other advantages and effects of the present application from the contents disclosed in this specification. The present application may also be implemented or applied through other specific examples, and the details in this specification may be modified and changed based on different viewpoints and applications without departing from the spirit of the present application.

Please refer to FIG. 1, which is a schematic flowchart of an erasure code encoding method based on a key-value pair storage system provided by an embodiment of the present application. The method shown in FIG. 1 includes the following steps.

S101. Acquire an association relationship between keys in the key-value pair storage system, the association relationship indicating that the data corresponding to the keys have strong association or strong temporal locality, and the key-value pair storage system storing data in the form of key-value pairs.

The key-value pair storage system of the present application stores all data in the form of key-value pairs. The system holds many kinds of data content, which may include, but is not limited to, keys, the values corresponding to the keys, the correspondences between keys and values, and the association relationships between keys.

In an optional embodiment, when data is stored, the system may analyze the access characteristics of the data to determine at least one piece of associated data having strong temporal locality with that data, and may analyze the attribute information of the data to obtain at least one piece of associated data having strong association with that data. The data is then stored in the form of key-value pairs, an association relationship is created between the key of the data and the key of each piece of associated data, and these are stored in the key-value pair storage system, so that the corresponding data can later be retrieved from the key-value pair storage system for subsequent processing.

The association relationships between keys in the present application include, but are not limited to, any one or a combination of the following: a parent-child containment relationship, a relationship with strong access correlation, or a relationship with a strong access order.
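The stored content described for S101 (keys, values, and key-to-key association relationships) can be pictured with a small in-memory sketch. This is only an illustration under stated assumptions: the class `AssociatedKVStore`, the enum `Relation`, and the example keys are hypothetical names, and the application does not specify how associations are represented.

```python
from collections import defaultdict
from enum import Enum


class Relation(Enum):
    """The three relationship types named in the text (names are hypothetical)."""
    PARENT_CHILD = "parent-child containment"
    CO_ACCESS = "strong access correlation"
    SEQUENTIAL = "strong access order"


class AssociatedKVStore:
    """Toy in-memory key-value store that also records key-to-key associations."""

    def __init__(self):
        self.values = {}                    # key -> value
        self.relations = defaultdict(dict)  # key -> {related key: Relation}

    def put(self, key, value):
        self.values[key] = value

    def associate(self, key_a, key_b, relation):
        # Simplification: record the association in both directions so that
        # either key can be used to look up the other.
        self.relations[key_a][key_b] = relation
        self.relations[key_b][key_a] = relation

    def related_keys(self, key):
        return sorted(self.relations.get(key, {}))


store = AssociatedKVStore()
store.put("user:1", b"alice")
store.put("user:1:cart", b"book,pen")
store.associate("user:1", "user:1:cart", Relation.PARENT_CHILD)
```

How the associations are discovered (access-pattern analysis or attribute analysis) is outside this sketch; here they are simply supplied by the caller.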

S102. Acquire, from the key-value pair storage system according to the association relationship between the keys, at least two corresponding target data having strong association or strong temporal locality.

An erasure coding system splits data when storing it. Specifically, according to the association relationship between keys (for example, a parent-child containment relationship, or a strong access correlation or strong sequential access relationship between the data corresponding to two keys), the present application acquires at least two target data that correspond to the association relationship and have strong association or strong temporal locality.

S103. Divide the at least two target data into the same coding group for encoding to obtain corresponding data blocks and parity blocks.

The present application may divide at least two target data that have strong association or strong temporal locality with each other into the same coding group and then perform the encoding operation to obtain the corresponding data blocks and parity blocks. In other words, when performing erasure code encoding, the present application may, according to the relationships between data, place at least two closely related target data into the same coding group for encoding.
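Steps S102 and S103 can be sketched as follows. The application does not name a particular erasure code, so this example assumes the simplest one, a single XOR parity block per coding group, and it assumes that keys linked by any association end up in one group (computed with a small union-find). The function names `group_related` and `encode_group` are hypothetical.

```python
def group_related(keys, pairs):
    """Put keys connected by an association into the same coding group."""
    parent = {k: k for k in keys}

    def find(k):
        while parent[k] != k:
            parent[k] = parent[parent[k]]  # path halving
            k = parent[k]
        return k

    for a, b in pairs:
        parent[find(a)] = find(b)

    groups = {}
    for k in keys:
        groups.setdefault(find(k), []).append(k)
    return sorted(sorted(g) for g in groups.values())


def encode_group(values):
    """Pad variable-length values to a common size and compute one XOR parity block.

    A real system would also record each value's original length so the
    padding can be stripped after decoding; that bookkeeping is omitted here.
    """
    size = max(len(v) for v in values)
    data_blocks = [v.ljust(size, b"\x00") for v in values]
    parity = bytearray(size)
    for block in data_blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return data_blocks, bytes(parity)


groups = group_related(["A", "B", "C", "D"], [("A", "B")])
blocks, parity = encode_group([b"hello", b"hi"])
```

A production scheme would more likely use a Reed-Solomon style code with several parity blocks; the grouping step is independent of that choice.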

S104. Write the obtained data blocks and parity blocks into corresponding storage nodes using load balancing.

Finally, the present application may use a load balancing technique to write the encoded data blocks and parity blocks into different corresponding storage nodes (for example, disks). Optionally, any data block obtained after encoding includes associated data for the at least two target data, the associated data indicating that strong association or strong temporal locality exists between these two target data.
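One possible reading of the load-balanced write in S104 is sketched below. The application does not specify a balancing policy, so this example assumes a greedy one: the blocks of a coding group go to distinct nodes, chosen as the nodes currently storing the fewest bytes. The function `place_blocks` and the node and block names are hypothetical.

```python
import heapq


def place_blocks(blocks, node_loads):
    """Assign each block of one coding group to a distinct, lightly loaded node.

    blocks:     list of (block_id, size_in_bytes)
    node_loads: dict mapping node name -> bytes currently stored (updated in place)
    """
    if len(blocks) > len(node_loads):
        raise ValueError("need at least one node per block in a coding group")

    # Pick as many least-loaded nodes as there are blocks (ties broken by name).
    chosen = heapq.nsmallest(
        len(blocks), node_loads, key=lambda n: (node_loads[n], n)
    )

    placement = {}
    # Largest block goes to the least-loaded node.
    for (block_id, size), node in zip(sorted(blocks, key=lambda b: -b[1]), chosen):
        placement[block_id] = node
        node_loads[node] += size
    return placement


loads = {"n1": 100, "n2": 10, "n3": 50, "n4": 0}
placement = place_blocks([("d0", 8), ("d1", 8), ("p0", 8)], loads)
```

Placing every block of a group on a different node is what allows a single node failure to be repaired from the remaining blocks.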

It should be noted that data access in an erasure coding system requests data from different storage nodes. After the present application divides associated data into the same coding group for encoding and storage, the number of accesses to storage nodes during data access can be reduced, which shortens data access time. Take associated data A and data B as an example: when data A is accessed, there is a high probability that data B will be accessed immediately afterwards. Traditional erasure coding schemes do not take this into account, whereas the scheme of the present application places data A and data B into the same erasure coding group for encoding and storage. Thus, when data A is accessed, the data blocks related to data B can be obtained at the same time in preparation for accessing data B, so that data B, which may be needed within a short period, can be obtained more readily and data access efficiency is improved.
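The data A / data B example can be made concrete with a toy count of group fetches. The model is an assumption made for illustration only: each access fetches the whole coding group of the requested key, and only the most recently fetched group stays in memory. The function `count_group_fetches` and both layouts are hypothetical.

```python
def count_group_fetches(access_sequence, group_of):
    """Count coding-group fetches when only the last fetched group is kept."""
    fetches = 0
    cached_group = None
    for key in access_sequence:
        group = group_of[key]
        if group != cached_group:
            fetches += 1
            cached_group = group
    return fetches


accesses = ["A", "B", "C", "D"]  # B tends to follow A, D tends to follow C

related_layout = {"A": 0, "B": 0, "C": 1, "D": 1}    # associated keys share a group
unrelated_layout = {"A": 0, "B": 1, "C": 0, "D": 1}  # grouping ignores associations

related_fetches = count_group_fetches(accesses, related_layout)
unrelated_fetches = count_group_fetches(accesses, unrelated_layout)
```

Under this model the association-aware layout needs 2 fetches for the four accesses and the other layout needs 4; real savings depend on the workload and on how much fetched data the system can hold.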

In an optional embodiment, because the erasure coding scheme designed in the present application takes the relationships between data into account, strongly correlated data can be recovered together more quickly during data recovery. Compared with traditional random grouping, this gives higher data recovery efficiency and provides users with better quality of service. In a specific implementation, the present application may receive a data recovery request, which is used to request recovery of lost data in a damaged node. Specifically, the request may be used to read the corresponding data blocks or parity blocks from the corresponding target nodes, and the data blocks and parity blocks that are read are then used in a calculation to recover the corresponding lost data. In response to the data recovery request, the corresponding data blocks and parity blocks are read from the corresponding target nodes, and a recovery calculation is performed on them to obtain the lost data. At the same time, according to the associated data in the data blocks that were read, preparations are made for access to the next target data that has strong association or strong temporal locality with the current data.
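The recovery flow above can be sketched under the same single-XOR-parity assumption (the application does not fix the code used). The surviving data blocks and the parity block are XORed to rebuild the lost block, and the other keys in the group are returned as candidates to prepare for the next access. The names `xor_blocks` and `recover`, and the layout of `group`, are hypothetical.

```python
def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def recover(group, lost_key):
    """Rebuild the block of lost_key and list the associated keys to prepare.

    group: {"keys": [...], "data": {surviving key: block}, "parity": block}
    """
    survivors = [group["data"][k] for k in group["keys"] if k != lost_key]
    recovered = xor_blocks(survivors + [group["parity"]])
    # The blocks just read already hold the associated data, so these keys
    # can be served without another round of node reads.
    prepared = [k for k in group["keys"] if k != lost_key]
    return recovered, prepared


block_a, block_b = b"AAAA", b"BBBB"
group = {
    "keys": ["A", "B"],
    "data": {"B": block_b},                   # the node holding A has failed
    "parity": xor_blocks([block_a, block_b]),
}
recovered, prepared = recover(group, "A")
```

With one parity block only a single lost block per group can be rebuilt; tolerating more failures requires a code with more parity blocks.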

By implementing the present application, traditional erasure coding technology is designed in combination with a novel key-value pair storage system, and the relationships between data in the key-value pair storage system are brought into the design, so that a key-value pair storage system can also use erasure coding for data recovery. Strongly correlated data can thus be recovered together more quickly during data recovery, which improves data recovery efficiency compared with the prior art and ensures better quality of service for users. In addition, because the present application places data having strong association or strong temporal locality into the same coding group for encoding, the number of node accesses and the overall access time can also be reduced during data access.

Please refer to FIG. 2, which is a schematic structural diagram of an erasure code encoding device based on a key-value pair storage system provided by an embodiment of the present application. The device shown in FIG. 2 includes an acquiring unit 201, an encoding unit 202, and a writing unit 203, wherein:

the acquiring unit 201 is configured to acquire an association relationship between keys in the key-value pair storage system, the association relationship indicating that the data corresponding to the keys have strong association or strong temporal locality, and the key-value pair storage system storing data in the form of key-value pairs;

the acquiring unit 201 is further configured to acquire, from the key-value pair storage system according to the association relationship between the keys, at least two corresponding target data having strong association or strong temporal locality;

the encoding unit 202 is configured to divide the at least two target data into the same coding group for encoding to obtain corresponding data blocks and parity blocks;

the writing unit 203 is configured to write the obtained data blocks and parity blocks into corresponding storage nodes using load balancing.

Optionally, the data blocks store associated data of the at least two target data, and the device further includes a receiving unit 204 and a recovery unit 205.

The receiving unit 204 is configured to receive a data recovery request, the data recovery request being used to request reading of data blocks or parity blocks in a target node so that lost data can be recovered based on the data blocks or parity blocks, the lost data being part of the data in any one of the target data.

The recovery unit 205 is configured to, in response to the data recovery request, perform a recovery calculation on the data blocks or parity blocks read from the target node to obtain the lost data, while preparing, according to the associated data, for access to the next target data that has strong association or strong temporal locality with any one of the target data.

By implementing the present application, traditional erasure coding technology is designed in combination with a novel key-value pair storage system, and the relationships between data in the key-value pair storage system are brought into the design, so that a key-value pair storage system can also use erasure coding for data recovery. Strongly correlated data can thus be recovered together more quickly during data recovery, which improves data recovery efficiency compared with the prior art and ensures better quality of service for users. In addition, because the present application places data having strong association or strong temporal locality into the same coding group for encoding, the number of node accesses and the overall access time can also be reduced during data access.

The above embodiments merely illustrate the principles and effects of the present application and are not intended to limit it. Any person skilled in the art may modify and change the above embodiments without departing from the spirit and scope of the present application. Therefore, the scope of protection of the present application shall be as set out in the claims.

Claims

1. An erasure code encoding method based on a key-value pair storage system, characterized by comprising:

acquiring an association relationship between keys in the key-value pair storage system, the association relationship indicating that the data corresponding to the keys have strong association or strong temporal locality, and the key-value pair storage system storing data in the form of key-value pairs;

acquiring, from the key-value pair storage system according to the association relationship between the keys, at least two corresponding target data having strong association or strong temporal locality;

dividing the at least two target data into the same coding group for encoding to obtain corresponding data blocks and parity blocks;

writing the obtained data blocks and parity blocks into corresponding storage nodes using load balancing;

wherein the association relationship includes at least one of the following: a parent-child containment relationship, a relationship with strong access correlation, or a relationship with a strong access order.

2. The erasure code encoding method based on a key-value pair storage system according to claim 1, characterized in that the strong temporal locality between the data is obtained by analyzing data access characteristics, and the strong association between the data is obtained by analyzing attribute information of the data.

3. The erasure code encoding method based on a key-value pair storage system according to claim 1, characterized in that the key-value pair storage system stores keys, the values corresponding to the keys, the correspondences between keys and values, and the association relationships between keys.

4. The erasure code encoding method based on a key-value pair storage system according to any one of claims 1-3, characterized in that the data blocks store associated data of the at least two target data, and the method further comprises:

receiving a data recovery request, the data recovery request being used to request reading of data blocks or parity blocks in a target node so that lost data can be recovered based on the data blocks or parity blocks, the lost data being part of the data in any one of the target data;

in response to the data recovery request, performing a recovery calculation on the data blocks or parity blocks read from the target node to obtain the lost data, while preparing, according to the associated data, for access to the next target data that has strong association or strong temporal locality with any one of the target data.

5. An erasure code encoding device based on a key-value pair storage system, characterized by comprising:

an acquiring unit, configured to acquire an association relationship between keys in the key-value pair storage system, the association relationship indicating that the data corresponding to the keys have strong association or strong temporal locality, and the key-value pair storage system storing data in the form of key-value pairs;

the acquiring unit being further configured to acquire, from the key-value pair storage system according to the association relationship between the keys, at least two corresponding target data having strong association or strong temporal locality;

an encoding unit, configured to divide the at least two target data into the same coding group for encoding to obtain corresponding data blocks and parity blocks;

a writing unit, configured to write the obtained data blocks and parity blocks into corresponding storage nodes using load balancing;

6. The erasure code encoding device based on a key-value pair storage system according to claim 5, wherein the strong temporal locality between the data is obtained by analyzing data access characteristics, and the strong association between the data is obtained by analyzing attribute information of the data.

7. The erasure code encoding device based on a key-value pair storage system according to claim 5, wherein the key-value pair storage system stores keys, the values corresponding to the keys, the correspondences between keys and values, and the association relationships between keys.

8. The erasure code encoding device based on a key-value pair storage system according to any one of claims 5-7, wherein the data blocks store associated data of the at least two target data, and the device further comprises a receiving unit and a recovery unit;

the receiving unit is configured to receive a data recovery request, the data recovery request being used to request reading of data blocks or parity blocks in a target node so that lost data can be recovered based on the data blocks or parity blocks, the lost data being part of the data in any one of the target data;

the recovery unit is configured to, in response to the data recovery request, perform a recovery calculation on the data blocks or parity blocks read from the target node to obtain the lost data, while preparing, according to the associated data, for access to the next target data that has strong association or strong temporal locality with any one of the target data.