WO2020238653A1 - Encoding method in distributed system environment, decoding method in distributed system environment, and corresponding apparatuses - Google Patents

Encoding method in distributed system environment, decoding method in distributed system environment, and corresponding apparatuses

Info

Publication number
WO2020238653A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
original data
encoding
partition
distributed system
Prior art date
Application number
PCT/CN2020/090515
Other languages
French (fr)
Chinese (zh)
Inventor
董元元
赵亚飞
Original Assignee
阿里巴巴集团控股有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司
Publication of WO2020238653A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0706 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
    • G06F11/0709 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a distributed system consisting of a plurality of standalone computer nodes, e.g. clusters, client-server systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1008 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices
    • G06F11/1012 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices using codes or arrangements adapted for a specific type of error
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1471 Saving, restoring, recovering or retrying involving logging of persistent data for recovery

Definitions

  • The present invention relates to the technical field of fault-tolerant coding, and in particular to an encoding method, a decoding method and corresponding apparatuses in a distributed system environment.
  • An erasure code takes n pieces of original data and adds m pieces of coded data (which store the error-correction code); the original data can then be restored from any n of the n+m pieces. Such an erasure code is also called a maximum distance separable (MDS) code.
  • The traditional code most widely used in the industry is the RS (Reed-Solomon) code. By adjusting the parameters k and m it can tolerate any number of disk failures, but RS coding has an obvious drawback, which lies mainly in its data recovery performance.
  • With RS coding, restoring any single failed data disk requires reading data from k data disks. This amplification of the recovery cost consumes additional network bandwidth, IO resources and recovery time. In a distributed system environment, reading data across regions consumes even more network bandwidth, and the response time increases further.
  • The present application provides an encoding method, a decoding method and corresponding apparatuses in a distributed system environment, which optimize the recovery cost of the distributed system environment without affecting the system's fault tolerance or storage cost.
  • The present invention provides an encoding method in a distributed system environment, including:
  • the global check data and local check data corresponding to the original data are stored in the data partition in which the original data is stored.
  • Preferably, the original data is stored evenly across the data partitions of the distributed system; each data partition stores the same number of pieces of local check data corresponding to the original data, and each data partition stores the same number of pieces of global check data corresponding to the original data.
  • The first encoding is RS encoding; the second encoding is MSR encoding.
  • The present invention provides a decoding method in a distributed system environment, including:
  • when the number of damaged pieces of original data is greater than a preset second threshold, the local check data and global check data stored in the current partition, together with the global check data stored in other data partitions of the distributed system, are used to recover the original data by the joint scheme of the first encoding and the second encoding to obtain the complete original data, wherein the second threshold is greater than the first threshold.
  • The first encoding is RS encoding; the second encoding is MSR encoding.
  • The present invention provides an encoding device in a distributed system environment, including:
  • the encoding module is configured to perform the first encoding on the original data to obtain the global check data, and to perform the second encoding on the original data to obtain the local check data;
  • the partition module is configured to store the original data in one or more data partitions of the distributed system;
  • the distribution module is configured to store the global check data and the local check data corresponding to the original data in the data partition in which the original data is stored.
  • Preferably, the partition module stores the original data evenly across the data partitions of the distributed system according to the data blocks of the original data; each data partition stores the same number of pieces of local check data corresponding to the original data, and each data partition stores the same number of pieces of global check data corresponding to the original data.
  • The present invention provides a decoding device in a distributed system environment, including:
  • the communication module is configured to read the original data stored in the current partition;
  • the recovery module is configured to, when the number of damaged pieces of original data is less than or equal to a preset first threshold, use the local check data stored in the current partition to recover the original data according to the second encoding method to obtain the complete original data;
  • the recovery module is further configured to, when the number of damaged pieces of original data is greater than the preset first threshold, use the global check data stored in the current partition and the global check data stored in other data partitions of the distributed system to recover the original data according to the first encoding method to obtain the complete original data.
  • Preferably, the recovery module is further configured to, when the number of damaged pieces of original data is greater than a preset second threshold, use the local check data and global check data stored in the current partition, together with the global check data stored in other data partitions of the distributed system, to recover the original data by the joint scheme of the first encoding and the second encoding to obtain the complete original data, wherein the second threshold is greater than the first threshold.
  • This application uses the idea of local check data to combine RS coding and MSR coding into a new coding form.
  • The new coding form preserves the data reliability and storage cost of the entire distributed system while also providing the low-cost single-disk recovery of MSR coding.
  • Compared with the traditional RS coding scheme in a distributed system environment, system performance is greatly improved with the new code; the scheme is also highly scalable and, after its parameters are adjusted, can be applied to system environments with various partition layouts.
  • FIG. 1 is a flowchart of an encoding method in a distributed system environment according to an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of a distributed system storing original data and check data by partition according to an embodiment of the present invention.
  • FIG. 3 is a flowchart of a decoding method in a distributed system environment according to an embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of an encoding device in a distributed system environment according to an embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of a decoding method in a distributed system environment according to an embodiment of the present invention.
  • FIG. 6 is a flowchart of a decoding process in a distributed system environment according to an embodiment of the present invention.
  • The distributed system includes multiple data partitions, and each data partition includes one or more data disks for storing data.
  • Each data disk may include one or more processors (CPUs), input/output interfaces, network interfaces and memory.
  • The memory may include non-permanent memory in a computer-readable medium, random access memory (RAM) and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
  • The memory may include one or more modules.
  • Computer-readable media include permanent and non-permanent, removable and non-removable storage media, and information storage can be realized by any method or technology.
  • The information can be computer-readable instructions, data structures, program modules, or other data.
  • Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, CD-ROM, digital versatile disc (DVD) or other optical storage, magnetic cassettes, disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
  • An embodiment of the present invention provides an encoding method in a distributed system environment, including:
  • Taking n pieces of original data as an example, the first encoding is performed on the n pieces of original data to obtain m pieces of global check data, and the second encoding is performed on the n pieces of original data to obtain k pieces of local check data; the n pieces of original data are saved in the data partitions of the distributed system.
  • The global check data and local check data corresponding to the original data are also stored in the partition where the original data is located.
  • Preferably, the original data is stored evenly across the data partitions of the distributed system; each data partition stores the same number of pieces of local check data corresponding to the original data, and each data partition stores the same number of pieces of global check data corresponding to the original data.
  • Taking n pieces of original data as an example, the n pieces are stored evenly across the data partitions of the distributed system; the number of pieces of local check data stored in each data partition is the same, and the number of pieces of global check data stored in each data partition is the same.
  • The n pieces of original data are stored evenly across the data partitions of the distributed system for load balancing.
  • For example, 21 pieces of original data are distributed equally among 3 data partitions, and each partition stores 7 pieces of original data together with the corresponding global check data and local check data.
  • In practice the distribution is made as even as possible so as to approach load balancing. For example, with two data partitions and 21 pieces of original data, one data partition can store 10 pieces of original data and the corresponding global check data and local check data, and the other data partition stores 11 pieces of original data and the corresponding global check data and local check data.
  • The embodiment of the present invention uses 3AZ (an Available Zone is an available partition in the system). Each partition holds 6 pieces of original data and 2 pieces of local check data (denoted P in the figure); there are 12 pieces of global check data, 4 of which are stored in each partition (denoted G in the figure). The data count of the entire distributed system is 18+6+12, that is, 18 pieces of original data, 6 pieces of local check data and 12 pieces of global check data.
  • The first encoding is RS encoding; the second encoding is MSR encoding.
  • MSR (minimum storage regenerating) coding is a kind of regenerating code, which can significantly reduce the cost of data recovery by changing the traditional coding method.
  • The local check data is generated per data partition using MSR encoding and stored in that data partition; the global check data is generated according to the RS encoding scheme and distributed to each partition.
  • The new code preserves the data reliability and storage cost of the entire system while also providing the low-cost single-disk recovery of MSR coding. Compared with the RS coding scheme in the 3AZ environment, system performance is greatly improved with the new code; in addition, the embodiment of the present invention is highly scalable and can be applied to various partitioned system environments.
  • An embodiment of the present invention provides a decoding method in a distributed system environment, including:
  • when the number of damaged pieces of original data is greater than a preset second threshold, the local check data and global check data stored in the current partition, together with the global check data stored in other data partitions of the distributed system, are used to recover the original data by the joint scheme of the first encoding and the second encoding to obtain the complete original data, wherein the second threshold is greater than the first threshold.
  • The first encoding is RS encoding; the second encoding is MSR encoding.
  • Data recovery is handled case by case. If the amount of lost data is small, for example only one or two pieces of data are in error, recovery needs only the local check data and the MSR recovery scheme, which brings the benefit of MSR coding's lower recovery cost. If a large-scale data error occurs, for example all data in a data partition fails, the global check data must be used for recovery; in this case the RS encoding scheme is used, so the recovery cost is the same as that of RS encoding. In some extreme cases, joint recovery using both the global check data and the local check data may be necessary in order to maintain the data reliability of the system.
  • An embodiment of the present invention provides an encoding device in a distributed system environment, including:
  • the encoding module 100 is configured to perform the first encoding on the original data to obtain the global check data, and to perform the second encoding on the original data to obtain the local check data;
  • the partition module 200 is configured to store the original data in one or more data partitions of the distributed system;
  • the distribution module 300 is configured to store the global check data and the local check data corresponding to the original data in the data partition in which the original data is stored.
  • The encoding module 100 performs the first encoding on n pieces of original data to obtain m pieces of global check data, and performs the second encoding on the n pieces of original data to obtain k pieces of local check data.
  • The n pieces of original data are stored in the data partitions of the distributed system, and the distribution module 300 stores the global check data and the local check data corresponding to the original data in the partition where the original data is located.
  • The partition module 200 stores the original data evenly across the data partitions of the distributed system according to the data blocks of the original data; each data partition stores the same number of pieces of local check data corresponding to the original data, and each data partition stores the same number of pieces of global check data corresponding to the original data.
  • An embodiment of the present invention provides a decoding device in a distributed system environment, including:
  • the communication module 400 is configured to read the original data stored in the current partition;
  • the recovery module 500 is configured to, when the number of damaged pieces of original data is less than or equal to a preset first threshold, use the local check data stored in the current partition to recover the original data according to the second encoding method to obtain the complete original data;
  • the recovery module 500 is further configured to, when the number of damaged pieces of original data is greater than the preset first threshold, use the global check data stored in the current partition and the global check data stored in other data partitions of the distributed system to recover the original data according to the first encoding method to obtain the complete original data.
  • The recovery module 500 is further configured to, when the number of damaged pieces of original data is greater than a preset second threshold, use the local check data and global check data stored in the current partition, together with the global check data stored in other data partitions of the distributed system, to recover the original data by the joint scheme of the first encoding and the second encoding to obtain the complete original data, wherein the second threshold is greater than the first threshold.
  • The encoding method is as follows: local check data is generated per partition using MSR encoding and stored in the corresponding partition; global check data is generated according to the RS coding scheme and distributed evenly among the partitions. In the example shown in FIG. 2, each partition holds 2 pieces of local check data. There are 12 pieces of global check data, 4 of which are stored in each partition. The overall coding parameter is 18+6+12, with 18 pieces of original data stored, 6 per partition.
  • Data recovery is handled case by case. If the amount of lost data is small, for example only one or two pieces of data are in error, recovery needs only the local check data and the MSR recovery scheme, which brings the benefit of MSR coding's lower recovery cost. If there is a large-scale data error, such as a partition failure, the global check data must be used for recovery; because the RS encoding scheme is used, the recovery cost is the same as that of RS encoding. In some extreme cases, joint recovery using both the global check data and the local check data may be required.
  • The embodiments of the present invention integrate the advantages of RS coding and MSR coding, so that the new code is better suited to systems in a 3AZ environment in terms of network bandwidth and response time.
  • The local check data is encoded using the MSR encoding scheme, which, compared with encoding by a direct exclusive-OR operation, has a clear advantage in the network bandwidth consumed during recovery.
  • For example, the partition code is configured as 4+2, with 2 pieces of local check data and 4 pieces of global check data.
  • In that case the LRC (Local Reconstruction Code, a newer type of code that uses local check data) coding scheme needs to read 4 times the amount of data being recovered.
  • The MSR encoding only needs to read 2.5 times that amount.
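
The 4x and 2.5x figures quoted for the 4+2 partition code are consistent with the standard repair-bandwidth expression for MSR regenerating codes, in which d helper nodes each send 1/(d-k+1) of a block. The sketch below is an illustration under assumptions that the source does not state: k = 4 data pieces and m = 2 local check pieces per partition, all d = k+m-1 = 5 surviving pieces acting as helpers, and hypothetical function names.

```python
from fractions import Fraction

def xor_or_rs_repair_reads(k):
    """Blocks read to rebuild one lost block when k full blocks must be fetched."""
    return Fraction(k)

def msr_repair_reads(k, m, helpers=None):
    """Block-equivalents read to rebuild one lost block of an MSR code.

    Assumption: d helpers each send 1/(d - k + 1) of a block, so the
    total traffic is d / (d - k + 1) blocks. By default d = k + m - 1.
    """
    d = k + m - 1 if helpers is None else helpers
    if not k <= d <= k + m - 1:
        raise ValueError("helpers must satisfy k <= d <= k + m - 1")
    return Fraction(d, d - k + 1)

# 4+2 partition code from the example (k and m are assumed meanings).
lrc_reads = xor_or_rs_repair_reads(4)
msr_reads = msr_repair_reads(4, 2)
print(float(lrc_reads), float(msr_reads))
```

With only d = k helpers the expression falls back to k blocks, which is why the saving depends on contacting more than k surviving pieces.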

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Error Detection And Correction (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

An encoding method in a distributed system environment, a decoding method in a distributed system environment, and corresponding apparatuses, relating to the technical field of encoding fault tolerance. The encoding method comprises: performing first encoding on original data to obtain global verification data, and performing second encoding on the original data to obtain local verification data (S101); storing the original data in one or more data partitions of a distributed system (S102); and storing, in the data partition where the original data is stored, the global verification data and the local verification data corresponding to the original data (S103). The recovery cost of a distributed system environment is optimized, without affecting the system fault tolerance capability and storage cost.

Description

Encoding method, decoding method and corresponding apparatuses in a distributed system environment
This application claims priority to Chinese patent application No. 201910439739.5, filed on May 24, 2019 and entitled "Encoding method, decoding method and corresponding apparatuses in a distributed system environment", the entire contents of which are incorporated herein by reference.
Technical field
The present invention relates to the technical field of fault-tolerant coding, and in particular to an encoding method, a decoding method and corresponding apparatuses in a distributed system environment.
Background art
At present, the storage scale of distributed systems keeps growing, and device failures in distributed systems are a problem that cannot be ignored. Storage cost and data reliability are therefore both factors that must be considered when designing a distributed system. Erasure codes can minimize the storage overhead of the system while ensuring the same level of data reliability.
An erasure code takes n pieces of original data and adds m pieces of coded data (which store the error-correction code); the original data can then be restored from any n of the n+m pieces. Such an erasure code is also called a maximum distance separable (MDS) code. The traditional code most widely used in the industry is the RS (Reed-Solomon) code. By adjusting the parameters k and m it can tolerate any number of disk failures, but RS coding has an obvious drawback, which lies mainly in its data recovery performance. With RS coding, restoring any single failed data disk requires reading data from k data disks. This amplification of the recovery cost consumes additional network bandwidth, IO resources and recovery time. In a distributed system environment, reading data across regions consumes even more network bandwidth, and the response time increases further.
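
To make the "any n of n+m pieces" property and the recovery amplification concrete, here is a minimal sketch using the simplest such code, a single XOR parity block (m = 1). It is not the RS or MSR code of this application; the function names are hypothetical and chosen only for illustration.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def encode(data_blocks):
    """Append one parity block to n data blocks (n + 1 pieces in total)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]

def recover(pieces):
    """Rebuild the n data blocks when exactly one of the n + 1 pieces is None."""
    missing = pieces.index(None)
    survivors = [p for p in pieces if p is not None]
    restored = list(pieces)
    # Repairing one piece reads all n surviving pieces: this is the
    # recovery amplification described in the background.
    restored[missing] = xor_blocks(survivors)
    return restored[:-1]  # drop the parity block

data = [b"abcd", b"efgh", b"ijkl"]
stored = encode(data)
damaged = stored[:1] + [None] + stored[2:]
assert recover(damaged) == data
```

Losing any one of the four stored pieces, including the parity block itself, still leaves enough information to return the three data blocks.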
Summary of the invention
The present application provides an encoding method, a decoding method and corresponding apparatuses in a distributed system environment, which optimize the recovery cost of the distributed system environment without affecting the system's fault tolerance or storage cost.
The technical solutions adopted are as follows:
In a first aspect, the present invention provides an encoding method in a distributed system environment, including:
performing a first encoding on the original data to obtain global check data, and performing a second encoding on the original data to obtain local check data;
saving the original data in one or more data partitions of the distributed system;
storing the global check data and local check data corresponding to the original data in the data partition in which the original data is stored.
Preferably, according to the data blocks of the original data, the original data is stored evenly across the data partitions of the distributed system; each data partition stores the same number of pieces of local check data corresponding to the original data, and each data partition stores the same number of pieces of global check data corresponding to the original data.
Preferably, the first encoding is RS encoding and the second encoding is MSR encoding.
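
The placement rule of the first aspect can be sketched as a small planning function. This is only an illustration: `plan_layout` and its parameters are hypothetical names, the counts 18, 12 and 2 are taken from the 18+6+12 example across three partitions described in the embodiment, and the actual RS and MSR encoding steps are left out.

```python
def plan_layout(n_data, n_global, local_per_partition, n_partitions):
    """Return per-partition counts of data, local check and global check pieces."""
    def split_evenly(total):
        base, extra = divmod(total, n_partitions)
        # The first `extra` partitions take one additional piece.
        return [base + (1 if i < extra else 0) for i in range(n_partitions)]

    data = split_evenly(n_data)
    global_check = split_evenly(n_global)
    return [
        {"data": d, "local": local_per_partition, "global": g}
        for d, g in zip(data, global_check)
    ]

# 18 original pieces, 12 global check pieces, 2 local check pieces per partition.
layout = plan_layout(n_data=18, n_global=12, local_per_partition=2, n_partitions=3)
for zone, counts in enumerate(layout):
    print(zone, counts)
```

When the piece count does not divide evenly, for instance 21 pieces over two partitions, the split becomes 11 and 10, which matches the "as even as possible" behaviour described in the embodiment.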
In a second aspect, the present invention provides a decoding method in a distributed system environment, including:
reading the original data stored in the current partition;
when the number of damaged pieces of original data is less than or equal to a preset first threshold, using the local check data stored in the current partition to recover the original data according to the second encoding method to obtain the complete original data;
when the number of damaged pieces of original data is greater than the preset first threshold, using the global check data stored in the current partition and the global check data stored in other data partitions of the distributed system to recover the original data according to the first encoding method to obtain the complete original data.
Preferably, when the number of damaged pieces of original data is greater than a preset second threshold, the local check data and global check data stored in the current partition, together with the global check data stored in other data partitions of the distributed system, are used to recover the original data by the joint scheme of the first encoding and the second encoding to obtain the complete original data, wherein the second threshold is greater than the first threshold.
Preferably, the first encoding is RS encoding and the second encoding is MSR encoding.
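
The threshold logic of the second aspect can be sketched as a dispatch function. The source does not give threshold values, so the values used below are placeholders, and `choose_recovery` and its return labels are hypothetical names; the decoding itself is not shown.

```python
def choose_recovery(damaged, first_threshold, second_threshold):
    """Pick a recovery path from the number of damaged pieces in a partition."""
    if second_threshold <= first_threshold:
        raise ValueError("second_threshold must be greater than first_threshold")
    if damaged == 0:
        return "none"    # nothing to repair
    if damaged <= first_threshold:
        return "local"   # second encoding (MSR) with local check data only
    if damaged > second_threshold:
        return "joint"   # local and global check data used together
    return "global"      # first encoding (RS) with global check data

# Placeholder thresholds, chosen only for the demonstration.
for count in (1, 2, 3, 7):
    print(count, choose_recovery(count, first_threshold=2, second_threshold=6))
```

Checking the second threshold before falling through to the global path keeps the three ranges disjoint, since any count above the second threshold is also above the first.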
In a third aspect, the present invention provides an encoding device in a distributed system environment, including:
an encoding module, configured to perform the first encoding on the original data to obtain the global check data, and to perform the second encoding on the original data to obtain the local check data;
a partition module, configured to store the original data in one or more data partitions of the distributed system;
a distribution module, configured to store the global check data and the local check data corresponding to the original data in the data partition in which the original data is stored.
Preferably, the partition module stores the original data evenly across the data partitions of the distributed system according to the data blocks of the original data; each data partition stores the same number of pieces of local check data corresponding to the original data, and each data partition stores the same number of pieces of global check data corresponding to the original data.
In a fourth aspect, the present invention provides a decoding device in a distributed system environment, including:
a communication module, configured to read the original data stored in the current partition;
a recovery module, configured to, when the number of damaged pieces of original data is less than or equal to a preset first threshold, use the local check data stored in the current partition to recover the original data according to the second encoding method to obtain the complete original data;
the recovery module being further configured to, when the number of damaged pieces of original data is greater than the preset first threshold, use the global check data stored in the current partition and the global check data stored in other data partitions of the distributed system to recover the original data according to the first encoding method to obtain the complete original data.
Preferably, the recovery module is further configured to, when the number of damaged pieces of original data is greater than a preset second threshold, use the local check data and global check data stored in the current partition, together with the global check data stored in other data partitions of the distributed system, to recover the original data by the joint scheme of the first encoding and the second encoding to obtain the complete original data, wherein the second threshold is greater than the first threshold.
Compared with the prior art, the present application has the following beneficial effects:
The present application uses the idea of local checks to combine RS (Reed-Solomon) coding and MSR (minimum-storage regenerating) coding into a new coding scheme. The new scheme preserves the data reliability and storage cost of the whole distributed system while also offering the low single-disk recovery cost of MSR coding. In a distributed system environment, system performance improves considerably compared with a conventional RS coding scheme. The scheme is also highly scalable: after its parameters are adjusted, it can be applied to system environments with various partition configurations.
Brief Description of the Drawings
Fig. 1 is a flowchart of an encoding method in a distributed system environment according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of how the partitions of a distributed system store original data and check data according to an embodiment of the present invention;
Fig. 3 is a flowchart of a decoding method in a distributed system environment according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of an encoding device in a distributed system environment according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a decoding device in a distributed system environment according to an embodiment of the present invention;
Fig. 6 is a flowchart of a decoding process in a distributed system environment according to an embodiment of the present invention.
Detailed Description
The technical solutions of the present application are described in more detail below with reference to the drawings and embodiments.
It should be noted that, provided they do not conflict, the embodiments of the present application and the individual features of those embodiments may be combined with one another, and all such combinations fall within the scope of protection of the present application. In addition, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in a different order.
In one configuration, the distributed system includes multiple data partitions, and each data partition includes one or more data disks for storing data. Each data disk may include one or more processors (CPUs), an input/output interface, a network interface, and memory.
The memory may include volatile memory in a computer-readable medium, random access memory (RAM), and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium. The memory may include one or more modules.
Computer-readable media include permanent and non-permanent, removable and non-removable storage media, and may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic disk storage or other magnetic storage devices, and any other non-transmission medium that can be used to store information accessible to a computing device.
Embodiment 1
As shown in Fig. 1, this embodiment of the present invention provides an encoding method in a distributed system environment, comprising:
S101: performing a first encoding on the original data to obtain global check data, and performing a second encoding on the original data to obtain local check data;
S102: storing the original data in one or more data partitions of the distributed system;
S103: storing the global check data and the local check data corresponding to the original data in the data partitions in which the original data is stored.
In this embodiment, taking n blocks of original data as an example, the first encoding is performed on the n blocks of original data to obtain m blocks of global check data, and the second encoding is performed on the n blocks of original data to obtain k blocks of local check data. The n blocks of original data are stored in the data partitions of the distributed system, and the global check data and local check data corresponding to the original data are likewise stored in the partitions that hold the original data.
Preferably, the original data is distributed evenly across the data partitions of the distributed system according to the data blocks of the original data; each data partition stores the same number of local check data blocks corresponding to the original data, and each data partition stores the same number of global check data blocks corresponding to the original data.
In this embodiment, taking n blocks of original data as an example, the n blocks are distributed evenly across the data partitions of the distributed system; every data partition stores the same number of local check data blocks, and every data partition stores the same number of global check data blocks.
In this embodiment, the n blocks of original data are distributed evenly across the data partitions for the sake of load balancing. For example, 21 blocks of original data are divided equally among 3 data partitions, so that each partition stores 7 blocks of original data together with the corresponding global check data and local check data. Where an exact split is not possible in practice, the distribution is made as balanced as possible. For example, when 21 blocks of original data are written to two data partitions, one partition may store 10 blocks of original data with the corresponding global and local check data, and the other may store 11 blocks of original data with the corresponding global and local check data.
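As a non-limiting illustration, the balanced distribution described above may be sketched as follows. The function name split_evenly and the choice to give any leftover blocks to the last partitions are assumptions made for this sketch; the embodiment only requires that the partition sizes be as balanced as possible.

```python
def split_evenly(blocks, num_partitions):
    """Assign blocks to partitions so that partition sizes differ by at most one.

    Hypothetical helper: the embodiment does not specify which partition
    receives the extra block when the split is not exact.
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    base, extra = divmod(len(blocks), num_partitions)
    partitions = []
    start = 0
    for index in range(num_partitions):
        # The last `extra` partitions each take one additional block.
        size = base + (1 if index >= num_partitions - extra else 0)
        partitions.append(blocks[start:start + size])
        start += size
    return partitions


blocks = [f"block-{i}" for i in range(21)]
three_way = split_evenly(blocks, 3)   # 7 + 7 + 7
two_way = split_evenly(blocks, 2)     # 10 + 11
```

With 21 blocks, three partitions receive 7 blocks each, and two partitions receive 10 and 11 blocks, matching the two cases given in the text.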
As shown in Fig. 2, this embodiment uses three availability zones (3AZ; an AZ, or Available Zone, is an available partition in the system). Each partition stores 6 blocks of original data and 2 blocks of local check data (labeled P in the figure). There are 12 blocks of global check data in total, 4 of which are stored in each partition (labeled G in the figure). The data in the whole distributed system is therefore organized as 18+6+12: 18 blocks of original data, 6 blocks of local check data, and 12 blocks of global check data.
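The 18+6+12 organization can be tallied with a short sketch. The class and field names below are hypothetical, and the storage overhead ratio is a quantity derived here for illustration; it is not stated in the embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartitionLayout:
    """Per-partition block counts for a layout such as the one in Fig. 2."""
    partitions: int
    data_per_partition: int
    local_check_per_partition: int
    global_check_per_partition: int

    @property
    def data_blocks(self) -> int:
        return self.partitions * self.data_per_partition

    @property
    def local_check_blocks(self) -> int:
        return self.partitions * self.local_check_per_partition

    @property
    def global_check_blocks(self) -> int:
        return self.partitions * self.global_check_per_partition

    @property
    def total_blocks(self) -> int:
        return self.data_blocks + self.local_check_blocks + self.global_check_blocks

    @property
    def storage_overhead(self) -> float:
        # Stored blocks per block of original data (derived, not from the source).
        return self.total_blocks / self.data_blocks


# The 3AZ example: 6 data, 2 local check (P), 4 global check (G) per partition.
layout_3az = PartitionLayout(partitions=3, data_per_partition=6,
                             local_check_per_partition=2,
                             global_check_per_partition=4)
summary = (layout_3az.data_blocks, layout_3az.local_check_blocks,
           layout_3az.global_check_blocks)
```

Under these counts the layout stores 36 blocks for every 18 blocks of original data.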
Preferably, the first encoding is RS encoding and the second encoding is MSR encoding.
MSR coding is a type of regenerating code; by departing from the conventional encoding approach, it can significantly reduce the cost of data recovery.
In this embodiment, local check data is generated per data partition using MSR encoding and is stored in the data partition to which it belongs, while global check data is generated using RS encoding and is distributed among the partitions. Weighing the strengths and weaknesses of RS coding and MSR coding, the embodiment uses the idea of local checks to combine the two into a new code. The new code preserves the data reliability and storage cost of the whole system while also offering the low single-disk recovery cost of MSR coding; in a 3AZ environment, system performance improves considerably compared with an RS coding scheme. In addition, the embodiment is highly scalable: after its parameters are adjusted, it can be applied to system environments with various partition configurations.
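To make the role of the global check data concrete, the following is a minimal sketch of Reed-Solomon-style global check blocks over GF(2^8), together with the recovery of a single lost block from one check block. It is an illustration under stated assumptions, not the encoder of the embodiment: the field polynomial 0x11D, the Cauchy coefficient matrix, and all function names are choices made here, and the MSR construction used for the local check data is not reproduced.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8), reducing by 0x11D (an assumed polynomial)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_inv(a):
    """Multiplicative inverse as a**254, since a**255 == 1 for nonzero a."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in GF(2^8)")
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def cauchy_row(check_index, n_data, n_check):
    """Coefficients 1 / (x_i + y_j) with x_i = check_index, y_j = n_check + j."""
    if n_data + n_check > 256:
        raise ValueError("too many blocks for GF(2^8)")
    return [gf_inv(check_index ^ (n_check + j)) for j in range(n_data)]


def global_check(data_blocks, n_check):
    """Compute n_check check blocks from equal-length data blocks."""
    size = len(data_blocks[0])
    checks = []
    for i in range(n_check):
        row = cauchy_row(i, len(data_blocks), n_check)
        check = bytearray(size)
        for coeff, block in zip(row, data_blocks):
            for pos, byte in enumerate(block):
                check[pos] ^= gf_mul(coeff, byte)
        checks.append(bytes(check))
    return checks


def recover_one(data_blocks, lost, check, check_index, n_check):
    """Rebuild data_blocks[lost] (may be None) from one check block."""
    row = cauchy_row(check_index, len(data_blocks), n_check)
    acc = bytearray(check)
    for j, block in enumerate(data_blocks):
        if j == lost:
            continue
        for pos, byte in enumerate(block):
            acc[pos] ^= gf_mul(row[j], byte)
    inverse = gf_inv(row[lost])
    return bytes(gf_mul(inverse, byte) for byte in acc)


data = [bytes([i, i + 1, 255 - i]) for i in range(6)]
checks = global_check(data, n_check=4)
damaged = list(data)
damaged[2] = None                      # simulate one lost block
rebuilt = recover_one(damaged, 2, checks[1], check_index=1, n_check=4)
```

Recovering several lost blocks at once would require solving a small linear system over the same field, which is omitted here for brevity.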
Embodiment 2
As shown in Fig. 3, this embodiment of the present invention provides a decoding method in a distributed system environment, comprising:
S201: reading the original data stored in the current partition;
S202: when the number of damaged blocks of the original data is less than or equal to a preset first threshold, recovering the original data according to the second encoding using the local check data stored in the current partition, to obtain the complete original data;
S203: when the number of damaged blocks of the original data is greater than the preset first threshold, recovering the original data according to the first encoding using the global check data stored in the current partition and the global check data stored in the other data partitions of the distributed system, to obtain the complete original data.
In this embodiment, when the number of damaged blocks of the original data is greater than a preset second threshold, the original data is recovered by jointly applying the first encoding and the second encoding, using the local check data and global check data stored in the current partition and the global check data stored in the other data partitions of the distributed system, to obtain the complete original data, wherein the second threshold is greater than the first threshold.
In this embodiment, the first encoding is RS encoding and the second encoding is MSR encoding.
In this embodiment, data decoding and recovery are handled case by case. If only a small amount of data is lost, for example only one or two blocks are in error, recovery needs only the local check data and the MSR recovery scheme, so the reduced recovery cost of MSR coding applies. If a large-scale data error occurs, for example all data in one data partition becomes unavailable, the global check data must be used for recovery; the RS coding scheme is used in this case, so the recovery cost is the same as that of RS coding. In some extreme cases, the global check data and local check data may need to be used together for joint recovery. In this way the data reliability of the system is preserved.
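The case analysis above may be sketched as a simple selection function. The names Strategy and choose_strategy and the threshold values used in the example are hypothetical; the embodiment requires only that the second threshold be greater than the first and does not give concrete values.

```python
from enum import Enum


class Strategy(Enum):
    LOCAL_MSR = "local check data, second encoding (MSR)"
    GLOBAL_RS = "global check data, first encoding (RS)"
    JOINT = "local and global check data, joint recovery"


def choose_strategy(damaged, first_threshold, second_threshold):
    """Pick a recovery path from the number of damaged original-data blocks."""
    if second_threshold <= first_threshold:
        raise ValueError("second threshold must be greater than first threshold")
    if damaged < 1:
        raise ValueError("nothing to recover")
    if damaged <= first_threshold:
        return Strategy.LOCAL_MSR      # S202: few damaged blocks
    if damaged <= second_threshold:
        return Strategy.GLOBAL_RS      # S203: above the first threshold
    return Strategy.JOINT              # above the second threshold


# Assumed thresholds for illustration only: 2 and 6.
choices = [choose_strategy(d, first_threshold=2, second_threshold=6)
           for d in (1, 2, 3, 6, 7)]
```

Keeping the decision separate from the decoders makes it straightforward to tune the two thresholds per deployment without touching the recovery code.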
Embodiment 3
As shown in Fig. 4, this embodiment of the present invention provides an encoding device in a distributed system environment, comprising:
an encoding module 100, configured to perform a first encoding on the original data to obtain global check data, and to perform a second encoding on the original data to obtain local check data;
a partition module 200, configured to store the original data in one or more data partitions of the distributed system; and
a distribution module 300, configured to store the global check data and the local check data corresponding to the original data in the data partitions in which the original data is stored.
In this embodiment, the encoding module 100 performs the first encoding on n blocks of original data to obtain m blocks of global check data, and performs the second encoding on the n blocks of original data to obtain k blocks of local check data; the partition module 200 stores the n blocks of original data in the data partitions of the distributed system; and the distribution module 300 stores the global check data and local check data corresponding to the original data in the partitions that hold the original data.
In this embodiment, the partition module 200 distributes the original data evenly across the data partitions of the distributed system according to the data blocks of the original data; each data partition stores the same number of local check data blocks corresponding to the original data, and each data partition stores the same number of global check data blocks corresponding to the original data.
Embodiment 4
As shown in Fig. 5, this embodiment of the present invention provides a decoding device in a distributed system environment, comprising:
a communication module 400, configured to read the original data stored in the current partition; and
a recovery module 500, configured to, when the number of damaged blocks of the original data is less than or equal to a preset first threshold, recover the original data according to the second encoding using the local check data stored in the current partition, to obtain the complete original data;
the recovery module 500 being further configured to, when the number of damaged blocks of the original data is greater than the preset first threshold, recover the original data according to the first encoding using the global check data stored in the current partition and the global check data stored in the other data partitions of the distributed system, to obtain the complete original data.
In this embodiment, the recovery module 500 is further configured to, when the number of damaged blocks of the original data is greater than a preset second threshold, recover the original data by jointly applying the first encoding and the second encoding, using the local check data and global check data stored in the current partition and the global check data stored in the other data partitions of the distributed system, to obtain the complete original data, wherein the second threshold is greater than the first threshold.
Embodiment 5
As shown in Fig. 2, encoding proceeds as follows. Local check data is generated per partition using MSR encoding and is stored in the partition to which it belongs. Global check data is generated using RS encoding and is distributed evenly among the partitions. In the example of Fig. 2, each partition holds 2 blocks of local check data. There are 12 blocks of global check data in total, 4 stored in each partition. The overall coding parameters are 18+6+12, where the 18 blocks of original data are stored 6 per partition.
As shown in Fig. 6, data recovery is handled case by case. If only a small amount of data is lost, for example only one or two blocks are in error, recovery needs only the local check data and the MSR recovery scheme, so the reduced recovery cost of MSR coding applies. If a large-scale data error occurs, for example an entire partition fails, the global check data must be used for recovery; because the RS coding scheme is used, the recovery cost is the same as that of RS coding. In some extreme cases, the global check data and the local check data may need to be used together for joint recovery.
This embodiment combines the advantages of RS coding and MSR coding, making the new code better suited to systems in a 3AZ environment in terms of network bandwidth and response time.
In this embodiment, the local check data is encoded with the MSR scheme, which has a clear advantage in network bandwidth consumption during recovery over encoding with a plain exclusive-OR operation. For example, with a per-partition configuration of 4+2, with 2 blocks of local check data and 4 blocks of global check data, an LRC (Local Reconstruction Code, a newer type of code that uses local checks) scheme needs to read 4 times the data volume of the lost block, whereas MSR coding needs to read only 2.5 times that volume.
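The 4x and 2.5x figures agree with the standard repair-traffic expressions for these code families, sketched below. The assumptions, none of which are stated in the embodiment, are that a failed block in a 4+2 group (k = 4 data blocks, n = 6 blocks in total) is repaired with d = n - 1 = 5 helper blocks, that each helper of an MSR code sends 1/(d - k + 1) of a block, and that an XOR-style local check reads k whole blocks. The function names are illustrative.

```python
from fractions import Fraction


def xor_repair_reads(k):
    """Blocks read to rebuild one block when k whole blocks must be fetched."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return Fraction(k)


def msr_repair_reads(n, k, helpers=None):
    """Block-equivalents read to rebuild one block of an (n, k) MSR code.

    Each of the d helpers contributes 1 / (d - k + 1) of a block, so the
    total traffic is d / (d - k + 1) blocks. By default d = n - 1.
    """
    d = n - 1 if helpers is None else helpers
    if not k <= d <= n - 1:
        raise ValueError("helpers must satisfy k <= d <= n - 1")
    return Fraction(d, d - k + 1)


lrc_cost = xor_repair_reads(4)         # 4 blocks
msr_cost = msr_repair_reads(n=6, k=4)  # 5/2 blocks
saving = 1 - msr_cost / lrc_cost       # fraction of traffic avoided
```

Under these assumptions the MSR repair of a single block moves 2.5 block-equivalents instead of 4, a reduction of three eighths in repair traffic.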
Although embodiments of the present invention are disclosed above, they are described only to facilitate understanding of the technical solutions of the present invention and are not intended to limit it. Any person skilled in the art to which the present invention pertains may make modifications and changes to the form and details of implementation without departing from the core technical solution disclosed herein; the scope of protection of the present invention, however, remains subject to the scope defined by the appended claims.

Claims (10)

  1. An encoding method in a distributed system environment, characterized by comprising:
    performing a first encoding on original data to obtain global check data, and performing a second encoding on the original data to obtain local check data;
    storing the original data in one or more data partitions of a distributed system; and
    storing the global check data and the local check data corresponding to the original data in the data partitions in which the original data is stored.
  2. The method according to claim 1, characterized in that the original data is distributed evenly across the data partitions of the distributed system according to the data blocks of the original data; each data partition stores the same number of local check data blocks corresponding to the original data; and each data partition stores the same number of global check data blocks corresponding to the original data.
  3. The method according to claim 1 or 2, characterized in that the first encoding is RS encoding and the second encoding is MSR encoding.
  4. A decoding method in a distributed system environment, characterized by comprising:
    reading original data stored in a current partition;
    when the number of damaged blocks of the original data is less than or equal to a preset first threshold, recovering the original data according to a second encoding using local check data stored in the current partition, to obtain complete original data; and
    when the number of damaged blocks of the original data is greater than the preset first threshold, recovering the original data according to a first encoding using global check data stored in the current partition and global check data stored in other data partitions of the distributed system, to obtain complete original data.
  5. The method according to claim 4, characterized in that, when the number of damaged blocks of the original data is greater than a preset second threshold, the original data is recovered by jointly applying the first encoding and the second encoding, using the local check data and global check data stored in the current partition and the global check data stored in the other data partitions of the distributed system, to obtain complete original data, wherein the second threshold is greater than the first threshold.
  6. The method according to claim 4 or 5, characterized in that the first encoding is RS encoding and the second encoding is MSR encoding.
  7. An encoding device in a distributed system environment, characterized by comprising:
    an encoding module, configured to perform a first encoding on original data to obtain global check data, and to perform a second encoding on the original data to obtain local check data;
    a partition module, configured to store the original data in one or more data partitions of a distributed system; and
    a distribution module, configured to store the global check data and the local check data corresponding to the original data in the data partitions in which the original data is stored.
  8. The device according to claim 7, characterized in that the partition module distributes the original data evenly across the data partitions of the distributed system according to the data blocks of the original data; each data partition stores the same number of local check data blocks corresponding to the original data; and each data partition stores the same number of global check data blocks corresponding to the original data.
  9. A decoding device in a distributed system environment, characterized by comprising:
    a communication module, configured to read original data stored in a current partition; and
    a recovery module, configured to, when the number of damaged blocks of the original data is less than or equal to a preset first threshold, recover the original data according to a second encoding using local check data stored in the current partition, to obtain complete original data;
    the recovery module being further configured to, when the number of damaged blocks of the original data is greater than the preset first threshold, recover the original data according to a first encoding using global check data stored in the current partition and global check data stored in other data partitions of the distributed system, to obtain complete original data.
  10. The device according to claim 9, characterized in that the recovery module is further configured to, when the number of damaged blocks of the original data is greater than a preset second threshold, recover the original data by jointly applying the first encoding and the second encoding, using the local check data and global check data stored in the current partition and the global check data stored in the other data partitions of the distributed system, to obtain complete original data, wherein the second threshold is greater than the first threshold.
PCT/CN2020/090515 2019-05-24 2020-05-15 Encoding method in distributed system environment, decoding method in distributed system environment, and corresponding apparatuses WO2020238653A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910439739.5 2019-05-24
CN201910439739.5A CN111984443A (en) 2019-05-24 2019-05-24 Encoding method, decoding method and corresponding devices in distributed system environment

Publications (1)

Publication Number Publication Date
WO2020238653A1 true WO2020238653A1 (en) 2020-12-03

Family

ID=73436260

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/090515 WO2020238653A1 (en) 2019-05-24 2020-05-15 Encoding method in distributed system environment, decoding method in distributed system environment, and corresponding apparatuses

Country Status (2)

Country Link
CN (1) CN111984443A (en)
WO (1) WO2020238653A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113687975B (en) * 2021-07-14 2023-08-29 重庆大学 Data processing method, device, equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106527993A (en) * 2016-11-09 2017-03-22 北京搜狐新媒体信息技术有限公司 Mass file storage method and device for distributed type system
CN107844272A (en) * 2017-10-31 2018-03-27 成都信息工程大学 A kind of cross-packet coding and decoding method for improving error correcting capability
US20180365102A1 (en) * 2017-06-16 2018-12-20 Alibaba Group Holding Limited Method and system for iterative data recovery and error correction in a distributed system
CN109491835A (en) * 2018-10-25 2019-03-19 哈尔滨工程大学 A kind of data fault tolerance method based on Dynamic Packet code

Also Published As

Publication number Publication date
CN111984443A (en) 2020-11-24

Similar Documents

Publication Publication Date Title
CN103944981B (en) Cloud storage system and implement method based on erasure code technological improvement
CN109814807B (en) Data storage method and device
CN109491835B (en) Data fault-tolerant method based on dynamic block code
WO2020047707A1 (en) Data coding, decoding and repairing method for distributed storage system
CN103838860A (en) File storing system based on dynamic transcript strategy and storage method of file storing system
CN105956128B (en) A kind of adaptive coding storage fault-tolerance approach based on simple regeneration code
CN106951340B (en) A kind of RS correcting and eleting codes data layout method and system preferential based on locality
CN105530294A (en) Mass data distributed storage method
WO2020035086A3 (en) Data security of shared blockchain data storage based on error correction code
CN106445726A (en) Data repairing method for distributed erasure code storage system
CN110427156A (en) A kind of parallel reading method of the MBR based on fragment
CN109194444A (en) A kind of balanced binary tree restorative procedure based on network topology
WO2015180038A1 (en) Partial replica code construction method and device, and data recovery method therefor
WO2020238653A1 (en) Encoding method in distributed system environment, decoding method in distributed system environment, and corresponding apparatuses
CN107153661A (en) A kind of storage, read method and its device of the data based on HDFS systems
CN106027638A (en) Hadoop data distribution method based on hybrid coding
CN114237971A (en) Erasure code coding layout method and system based on distributed storage system
CN116501553B (en) Data recovery method, device, system, electronic equipment and storage medium
US11347418B2 (en) Method, device and computer program product for data processing
US11841762B2 (en) Data processing
CN104572987B (en) A kind of method and system that simple regeneration code storage efficiency is improved by compressing
WO2020238736A1 (en) Method for generating decoding matrix, decoding method and corresponding device
CN106911793B (en) I/O optimized distributed storage data repair method
CN112000278B (en) Self-adaptive local reconstruction code design method for thermal data storage and cloud storage system
CN112860476A (en) Approximate erasure code coding method and device based on video layered storage

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20814992

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20814992

Country of ref document: EP

Kind code of ref document: A1