WO2021151298A1 - Data redundancy processing method, apparatus, device, and storage medium - Google Patents

Data redundancy processing method, apparatus, device, and storage medium

Info

Publication number
WO2021151298A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
processing
master node
node
erasure code
Prior art date
Application number
PCT/CN2020/118909
Other languages
English (en)
French (fr)
Inventor
雷林凯
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021151298A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G06F11/1469 Backup restoration techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614 Improving the reliability of storage systems
    • G06F3/0619 Improving the reliability of storage systems in relation to data integrity, e.g. data losses, bit errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/064 Management of blocks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Definitions

  • This application relates to the technical field of data processing in artificial intelligence, and in particular to a data redundancy processing method, device, equipment, and storage medium.
  • Cloud storage is a mode of online storage, that is, storing data on multiple virtual servers usually hosted by a third party instead of dedicated servers.
  • Hosting companies operate large-scale data centers, and those who need hosted data storage can purchase or lease storage space from them to meet their data storage needs.
  • Data center operators prepare virtualized storage resources at the back end and provide them in the form of storage resource pools. Customers can use such a storage resource pool to store files or objects on their own.
  • The storage strategy identifier of the data to be stored is obtained, and the data is stored using the storage method corresponding to that identifier. The storage strategy identifier indicates the use of at least one of the following storage methods: replica redundancy processing, or erasure encoding and decoding processing. That is, data redundancy is provided by multiple copies and erasure codes, which is a common data processing approach.
  • The inventor realized that, when data redundancy processing is performed with multiple copies and erasure codes, recovering data after data loss or failure is difficult to carry out.
  • The embodiment of the present application discloses a data redundancy processing method, the method including:
  • acquiring IO data and mapping the IO data into a set, where the set includes a master node and an erasure code node corresponding to the master node, and the IO data is mapped and stored to the master node and to the erasure code node corresponding to the master node;
  • receiving, by the set, an IO data processing request and performing IO data processing in the set, where the IO data processing includes write IO processing and read IO processing;
  • The embodiment of the application also discloses a data redundancy processing device, the device including:
  • an acquisition module, configured to acquire IO data and map the IO data into a set, where the set includes a master node and an erasure code node corresponding to the master node, and the IO data is mapped and stored to the master node and to the erasure code node corresponding to the master node;
  • a receiving module, configured to receive an IO data processing request in the set and perform IO data processing in the set, where the IO data processing includes write IO processing and read IO processing;
  • a detection module, configured to detect whether an IO data storage failure exists in the master node or in the erasure code node corresponding to the master node in the set;
  • a completion module, configured to, if an IO data storage failure exists, complete the IO data redundancy processing after the failure has been eliminated;
  • the embodiment of the application also discloses a computer device, including a processor, a memory, and a computer program stored on the memory and capable of running on the processor.
  • acquiring IO data and mapping the IO data into a set, where the set includes a master node and an erasure code node corresponding to the master node, and the IO data is mapped and stored to the master node and to the erasure code node corresponding to the master node;
  • receiving, by the set, an IO data processing request and performing IO data processing in the set, where the IO data processing includes write IO processing and read IO processing;
  • the embodiment of the application also discloses a computer-readable storage medium on which a computer program is stored, and when the program is executed by a processor, the following steps are implemented:
  • acquiring IO data and mapping the IO data into a set, where the set includes a master node and an erasure code node corresponding to the master node, and the IO data is mapped and stored to the master node and to the erasure code node corresponding to the master node;
  • receiving, by the set, an IO data processing request and performing IO data processing in the set, where the IO data processing includes write IO processing and read IO processing;
  • FIG. 1 is a flowchart of the steps of Embodiment 1 of a data redundancy processing method according to the present application;
  • FIG. 2 is a flowchart of the steps of Embodiment 2 of a data redundancy processing method according to the present application;
  • FIG. 3 is a flowchart of the steps of Embodiment 3 of a data redundancy processing method according to the present application;
  • FIG. 4 is a flowchart of the steps of Embodiment 4 of a data redundancy processing method according to the present application;
  • FIG. 5 is a flowchart of the steps of Embodiment 5 of a data redundancy processing method according to the present application;
  • FIG. 6 is a flowchart of the steps of Embodiment 6 of a data redundancy processing method according to the present application;
  • FIG. 7 is a flowchart of the steps of Embodiment 7 of a data redundancy processing method according to the present application;
  • FIG. 8 is a flowchart of the steps of Embodiment 8 of a data redundancy processing method according to the present application;
  • FIG. 9 is a structural block diagram of an embodiment of a data redundancy processing device according to the present application.
  • Embodiment 1 of the data redundancy processing method of the present application may specifically include the following steps:
  • Step S10: Acquire IO data and map the IO data into a set. The set includes a master node and an erasure code node corresponding to the master node, and the IO data is mapped and stored to the master node and to the erasure code node corresponding to the master node, respectively;
  • Step S20: The set receives an IO data processing request and performs IO data processing in the set, where the IO data processing includes write IO processing and read IO processing;
  • Step S30: Detect whether the master node in the set or the erasure code node corresponding to the master node has an IO data storage failure;
  • Step S40: If there is an IO data storage failure, eliminate the failure and then complete the IO data redundancy processing.
  • The IO data is the data that the current cloud storage system transmits through IO, where I stands for input and O stands for output; IO is divided into two parts, the IO device and the IO interface.
  • The cloud storage system caches the data transmitted through the IO device or IO interface, then maps the different IO data into the set and, according to the master node in the set and the erasure code nodes corresponding to the master node, maps the IO data to the master node and to the erasure code nodes corresponding to the master node.
  • Here, mapping refers to a relationship between the elements of two sets of elements; the master node and the erasure code node are contained in one set.
  • An item of IO data in the master node is set as element A; the erasure code node corresponding to that master node also holds the unique IO data corresponding to this item, which is set as element B. Because of the relationship between the master node and the erasure code node, the correspondence from element A to element B is a mapping from the master node to the erasure code node. IO data processing is then performed according to the master node in the set and the erasure code node corresponding to the master node.
  • The IO data processing includes write IO processing and read IO processing. The system then checks whether the master node in the set or the erasure code node corresponding to the master node has an IO data storage failure, judging the failed node by whether the write IO processing and the read IO processing have completed. If there is a failure, it determines whether the failed node is the master node or the erasure code node corresponding to the master node and performs failure recovery; if there is no failure, the IO data redundancy processing is complete.
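  • As a rough illustration of the set described in Embodiment 1, the following Python sketch models one master node with its erasure code nodes and a fault check corresponding to step S30. The class names `Node` and `RedundancySet`, the function `map_to_set`, and the key-based mapping policy are all assumptions made for illustration; the source does not specify how IO data is assigned to a set.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A storage node; `store` maps an IO key to the bytes held on this node."""
    name: str
    failed: bool = False
    store: dict = field(default_factory=dict)


@dataclass
class RedundancySet:
    """One set: a master node plus its corresponding erasure code nodes."""
    master: Node
    ec_nodes: list

    def faulty_nodes(self):
        # Step S30: report which nodes in the set currently have a storage fault.
        return [n.name for n in [self.master, *self.ec_nodes] if n.failed]


def map_to_set(key: str, sets: list) -> RedundancySet:
    # Step S10 under an assumed policy: choose a set deterministically from
    # the key. The source does not describe the actual mapping function.
    return sets[sum(key.encode()) % len(sets)]
```

  • Calling `faulty_nodes()` on the chosen set returns an empty list when no node is marked failed, which corresponds to the "no failure" branch in which redundancy processing is complete.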
  • Embodiment 2 of the data redundancy processing method of the present application.
  • The write IO processing may specifically include the following steps:
  • Step S2011: Obtain the original data in the master node, where the original data is the initial data already stored in the master node before the IO data is acquired;
  • Step S2012: Divide the original data in the master node into K pieces of primary data, compute M pieces of check data, and thereby generate K+M pieces of data; write the K+M pieces of data to the erasure code node corresponding to the master node;
  • Step S2013: When the IO data processing request is received, simultaneously write the acquired IO data into the original data of the master node and into the K+M pieces of data of the erasure code node;
  • Step S2014: When the erasure code node has finished writing, send a write-completed response message to the master node, and complete the write IO processing according to the response information.
  • The original data in the master node is acquired, where the original data is the initial data already stored in the master node before the IO data is acquired. The original data in the master node is divided into K pieces of primary data, M pieces of parity data are computed, and the K+M pieces of data are then written to the erasure code node corresponding to the master node. When the master node and the target erasure code node corresponding to the master node receive the write IO processing, the IO data to be written is written into the original data in the master node and, at the same time, into the K+M pieces of data in the erasure code node corresponding to the master node. When the erasure code node corresponding to the master node has finished writing, it sends a write-completed response message to the master node. Once the master node has received this response message and has itself finished writing the IO data, the write is complete.
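  • Step S2012 can be sketched as follows. This is a minimal illustration that assumes M = 1 and uses a single XOR parity piece as the check data; the source does not name the erasure code actually used, and the function name `split_with_parity` and the zero-padding of the tail are hypothetical choices.

```python
def split_with_parity(original: bytes, k: int) -> list:
    """Split `original` into k equal data pieces plus one XOR parity piece.

    Assumption: M = 1. A production erasure code (for example Reed-Solomon)
    would be needed to produce M > 1 independent check pieces.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    piece_len = -(-len(original) // k)  # ceiling division
    padded = original.ljust(piece_len * k, b"\x00")  # zero-pad the tail
    pieces = [padded[i * piece_len:(i + 1) * piece_len] for i in range(k)]
    parity = bytearray(piece_len)
    for piece in pieces:
        for i, byte in enumerate(piece):
            parity[i] ^= byte  # parity byte is the XOR of the k data bytes
    return pieces + [bytes(parity)]
```

  • With this layout, XOR-ing all K+1 pieces position by position yields zero, so any single missing piece equals the XOR of the remaining K pieces.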
  • Step S20141: Perform scenario identification on the response information;
  • Step S20142: When the response information is identified as a write IO processing success scenario, the write IO processing succeeds;
  • Step S20143: When the response information is identified as a write IO processing failure scenario, the write IO processing fails.
  • The response information is sent via the master node, and scenario identification is performed on it. The scenarios are the write IO processing success scenario and the write IO processing failure scenario, and the identified scenario determines whether the write IO processing succeeded or failed. When the write IO processing succeeds, it is only necessary to check whether the master node in the set or its erasure code node has an IO data storage failure; if not, the data redundancy processing can be completed. When the write IO processing fails, the specific failed node among the master node in the set and its erasure code node is detected and recovered, and the data redundancy processing is completed after recovery.
  • Referring to FIG. 4, a flowchart of the steps of Embodiment 4 of the data redundancy processing method of the present application is shown.
  • When the response information is identified as a write IO processing success scenario, the write IO processing succeeds, which may specifically include the following steps:
  • Step S201421: If the IO data in the master node is written successfully, and the number of successfully written pieces among the K+M pieces of data of the erasure code node corresponding to the master node is ≥ K, the write IO processing succeeds; or,
  • Step S201422: If the writing of the IO data in the master node fails, but all K+M pieces of data in the erasure code node corresponding to the master node are written successfully, the write IO processing succeeds.
  • That is, if the IO data on the master node is written successfully and at least K of the K+M pieces of data on the erasure code node are written successfully, the write IO processing succeeds; or, if the IO data write on the master node fails but all K+M pieces of data on the erasure code node are written successfully, the write IO processing succeeds. Whether the write IO processing succeeded is thus judged from the IO data in the master node and in the erasure code node corresponding to the master node.
  • Referring to FIG. 5, a flowchart of the steps of Embodiment 5 of the data redundancy processing method of the present application is shown.
  • When the response information is identified as a write IO processing failure scenario, the write IO processing fails, which may specifically include the following steps:
  • Step S201431: If the writing of the IO data in the master node fails, and the writing of the K+M pieces of data in the erasure code node corresponding to the master node also fails, the write IO processing fails; or,
  • Step S201432: If the IO data in the master node is written successfully, but fewer than K of the K+M pieces of data of the erasure code node corresponding to the master node are written successfully, the write IO processing fails.
  • That is, if the writing of the IO data on the master node fails and the writing of the K+M pieces of data on the erasure code node corresponding to the master node also fails, the write IO processing fails; if the IO data on the master node is written successfully but fewer than K of the K+M pieces of data on the erasure code node are written successfully, the write IO processing fails. Whether the write IO processing failed is thus judged from the IO data in the master node and in the erasure code node corresponding to the master node.
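  • The success and failure conditions of Embodiments 4 and 5 can be collected into one predicate, sketched below in Python. The function name `write_io_succeeded` and its parameters are hypothetical. One point is an assumption rather than something the text states outright: when the master write fails, any result short of all K+M pieces being written is treated here as a failure.

```python
def write_io_succeeded(master_ok: bool, ec_written: int, k: int, m: int) -> bool:
    """Classify a write from the master result and the erasure code write count.

    master_ok  -- whether the IO data was written on the master node
    ec_written -- how many of the K+M pieces were written on the erasure code node
    """
    if not 0 <= ec_written <= k + m:
        raise ValueError("ec_written must be between 0 and k + m")
    if master_ok:
        # Step S201421 / S201432: master succeeded, so at least K pieces are needed.
        return ec_written >= k
    # Step S201422: master failed, so every one of the K+M pieces must be written.
    # Assumption: anything less counts as the failure case of step S201431.
    return ec_written == k + m
```

  • For example, with K = 4 and M = 2, a successful master write with 4 pieces written counts as success, while a failed master write with 5 of 6 pieces written counts as failure under the stated assumption.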
  • Embodiment 6 of the data redundancy processing method of the present application.
  • The read IO processing may specifically include the following steps:
  • Step S2021: Read the IO data to be read from the master node, where the IO data to be read is the original data in the master node;
  • Step S2022: After the client has received the original data, the read IO processing request is complete.
  • The IO data is read from the master node: the master node directly reads the entire block of original data stored locally, so the IO data to be read is the entire block of original data stored in the master node.
  • The entire block of original data is returned to the front-end client. Once the front-end client has received it, the read IO processing is complete.
  • Embodiment 7 of the data redundancy processing method of the present application.
  • Detecting whether the master node in the set or the erasure code node corresponding to the master node has the IO data storage failure may specifically include the following steps:
  • Step S3011: Detect the master node in the set and the erasure code node corresponding to the master node, and identify the location of the fault;
  • Step S3012: When the master node fails, a new temporary master node is generated, and write IO processing and read IO processing are performed;
  • Step S3013: When the erasure code node corresponding to the master node fails, and the number of failed pieces of erasure code node data is ≤ M, the failed erasure code node does not need to write IO data in the write IO processing, and read IO processing continues;
  • Step S3014: When the erasure code node corresponding to the master node fails, and the number of failed pieces of erasure code node data is greater than M, the write IO processing is interrupted;
  • Step S3015: When the master node and the erasure code node corresponding to the master node fail at the same time, and the number of failed pieces of erasure code node data is ≤ M, a new temporary master node is selected, the erasure code node does not need to write IO data in the write IO processing, and the original data in the master node is read out in the read IO processing.
  • The judgment is made by detecting whether a storage node is faulty, that is, by detecting whether the master node and the erasure code node corresponding to the master node store the IO data.
  • When the master node fails, the storage system selects a new temporary master node to perform write IO processing and read IO processing.
  • When the erasure code node corresponding to the master node is faulty and the number of failed pieces of erasure code node data is less than or equal to M: for write IO processing, the failed erasure code node does not need to write IO data; for read IO processing, the original data is read from the master node and returned to the client, and the read IO processing is complete once the client has received it.
  • When the number of failed pieces of erasure code node data is greater than M, the storage system's read and write service is interrupted, that is, the write IO request is interrupted, because the number of failed erasure code nodes has exceeded the maximum value M that the erasure code can tolerate.
  • When the master node and the erasure code node fail at the same time and the number of failed pieces is ≤ M, the storage system first selects a new temporary master node; the erasure code node does not need to write IO data during the write IO processing, and only read IO processing is performed.
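  • The fault cases of steps S3012 to S3015 amount to a small decision table, sketched below. The names `FaultPlan` and `plan_for_fault` are hypothetical. The combination "master failed and more than M erasure code failures" is not covered by the source; the sketch assumes the service is interrupted in that case as well, since the erasure code tolerance M is exceeded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FaultPlan:
    elect_temporary_master: bool  # S3012 / S3015: pick a new temporary master
    skip_failed_ec_writes: bool   # S3013 / S3015: failed EC nodes skip the write
    interrupt_service: bool       # S3014: read and write service is interrupted


def plan_for_fault(master_failed: bool, failed_ec: int, m: int) -> FaultPlan:
    """Choose a handling plan from the fault location and the failure count."""
    if failed_ec > m:
        # More failures than the erasure code can tolerate (S3014).
        # Assumption: this also applies when the master has failed too.
        return FaultPlan(False, False, True)
    return FaultPlan(
        elect_temporary_master=master_failed,
        skip_failed_ec_writes=failed_ec > 0,
        interrupt_service=False,
    )
```

  • With M = 2, a lone master failure yields a plan that only elects a temporary master, while three failed erasure code pieces yield a plan that interrupts the service.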
  • Embodiment 8 of the data redundancy processing method of the present application.
  • Failed-node recovery includes master node failure recovery and erasure code node failure recovery. If there is an IO data storage failure, completing the IO data redundancy processing after the failure is eliminated may specifically include the following steps:
  • Step S3021: When the master node fails, perform master node failure recovery by reading IO data from the erasure code node corresponding to the master node;
  • Step S3022: When the erasure code node fails, and the current erasure code node is the one with the missing data, read IO data from the other erasure code nodes to perform erasure code node failure recovery.
  • Failed-node recovery includes master node failure recovery and erasure code node failure recovery. For master node failure recovery, the lost IO data must be restored by reading the data or check data on the erasure code node. When the erasure code node fails, the current erasure code node is the erasure code node with the missing object.
  • The data or check data is read from the other erasure code nodes and restored according to the erasure code recovery strategy. Under this strategy, the erasure code creates a mathematical function to describe a set of numbers so that their accuracy can be checked and, if one of the numbers is lost, it can be recovered. Polynomial interpolation or oversampling is the key technique used in erasure coding, and the data can be recovered with it. In this way, when the master node does not hold the complete block of IO data, or when any of the K+M pieces of data on the erasure code node is missing, the data is restored asynchronously in the background.
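  • The polynomial interpolation idea mentioned above can be illustrated with a small Python sketch (Python 3.8 or later, for the modular inverse via `pow`). It is not the patent's implementation: the prime modulus `P`, the share layout, and the function names `lagrange_eval`, `encode`, and `recover` are assumptions chosen to keep the example short. K data values define a polynomial, M extra evaluations act as check data, and any K surviving shares rebuild the data.

```python
P = 257  # small prime modulus, chosen only for illustration


def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial of degree < len(points) through points (mod P)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse of den.
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data, m):
    """Return k data shares at x = 0..k-1 followed by m check shares at x = k..k+m-1."""
    k = len(data)
    points = list(enumerate(data))
    return points + [(x, lagrange_eval(points, x)) for x in range(k, k + m)]


def recover(surviving, k):
    """Rebuild the k data values from any k surviving (x, y) shares."""
    if len(surviving) < k:
        raise ValueError("need at least k surviving shares")
    basis = surviving[:k]
    return [lagrange_eval(basis, x) for x in range(k)]
```

  • For instance, encoding three values with M = 2 gives five shares; after dropping any two of them, `recover` applied to the remaining three returns the original three values. Check values in this sketch may reach 256, so a real byte-oriented code would work in a field such as GF(256) instead.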
  • the foregoing data redundancy processing method can also be applied to a blockchain node to implement data blockchain storage to improve data security.
  • The blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms.
  • A blockchain, essentially a decentralized database, is a series of data blocks linked by cryptographic methods. Each data block contains a batch of network transaction information, which is used to verify the validity of its information (anti-counterfeiting) and to generate the next block.
  • the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.
  • Referring to FIG. 9, a structural block diagram of an embodiment of a data redundancy processing device of the present application is shown, which may specifically include the following modules:
  • the obtaining module 1001, configured to obtain IO data and map the IO data into a set, where the set includes a master node and an erasure code node corresponding to the master node, and the IO data is mapped and stored to the master node and to the erasure code node corresponding to the master node;
  • the receiving module 1002, configured to receive the IO data processing request in the set and perform IO data processing in the set, where the IO data processing includes write IO processing and read IO processing;
  • the detection module 1003, configured to detect whether an IO data storage failure exists in the master node or in the erasure code node corresponding to the master node in the set;
  • the completion module 1004, configured to, if there is an IO data storage failure, complete the IO data redundancy processing after the IO data storage failure is eliminated.
  • The receiving module 1002 is configured for the set to receive an IO data processing request, where the IO data processing includes write IO processing and read IO processing. For the write IO processing, the module includes:
  • a generating unit, configured to divide the original data in the master node into K pieces of primary data and M pieces of check data, generating K+M pieces of data, and to write the K+M pieces of data to the erasure code node corresponding to the master node;
  • a writing unit, configured to simultaneously write the acquired IO data into the original data of the master node and into the K+M pieces of data of the erasure code node when the IO data processing request is received;
  • a response information unit, configured to send write-completed response information to the master node when the erasure code node has finished writing, and to complete the write IO processing according to the response information.
  • The response information unit, configured to send write-completed response information to the master node when the erasure code node has finished writing and to complete the write IO processing according to the response information, includes:
  • an identification unit, configured to perform scenario identification on the response information;
  • a write IO processing success scenario unit, configured so that the write IO processing succeeds when the response information is identified as a write IO processing success scenario;
  • a write IO processing failure scenario unit, configured so that the write IO processing fails when the response information is identified as a write IO processing failure scenario.
  • The write IO processing success scenario unit, configured so that the write IO processing succeeds when the response information is identified as a write IO processing success scenario, includes:
  • a first completion unit, configured so that the write IO processing succeeds if the IO data in the master node is written successfully and the number of successfully written pieces among the K+M pieces of data of the erasure code node corresponding to the master node is ≥ K; or,
  • a second completion unit, configured so that the write IO processing succeeds if the writing of the IO data in the master node fails but all K+M pieces of data in the erasure code node corresponding to the master node are written successfully.
  • The write IO processing failure scenario unit, configured so that the write IO processing fails when the response information is identified as a write IO processing failure scenario, includes:
  • a first failure unit, configured so that the write IO processing fails if the writing of the IO data in the master node fails and the writing of the K+M pieces of data in the erasure code node corresponding to the master node also fails; or,
  • a second failure unit, configured so that the write IO processing fails if the IO data in the master node is written successfully but fewer than K of the K+M pieces of data of the erasure code node corresponding to the master node are written successfully.
  • The receiving module 1002 is configured for the set to receive an IO data processing request, where the IO data processing includes write IO processing and read IO processing. For the read IO processing, the module includes:
  • a reading unit, configured to read the IO data to be read from the master node, where the IO data to be read is the original data in the master node;
  • a read IO processing unit, configured to complete the read IO processing after the client has received the original data.
  • The detection module 1003, configured to detect whether the master node or the erasure code node in the set has the IO data storage failure, includes:
  • a fault unit, configured to detect the master node and the erasure code node corresponding to the master node in the set, and to identify the location of the fault;
  • a temporary master node unit, configured to generate a new temporary master node when the master node fails, and to perform write IO processing and read IO processing;
  • a no-write unit, configured so that, when the erasure code node corresponding to the master node fails and the number of failed pieces of erasure code node data is ≤ M, the failed erasure code node does not need to write IO data in the write IO processing, and read IO processing continues;
  • an interrupt unit, configured to interrupt the write IO processing when the erasure code node corresponding to the master node fails and the number of failed pieces of erasure code node data is greater than M;
  • a readout unit, configured so that, when the master node and the erasure code node corresponding to the master node fail at the same time and the number of failed pieces of erasure code node data is ≤ M, a new temporary master node is selected, the erasure code node does not need to write IO data in the write IO processing, and the original data in the master node is read out in the read IO processing.
  • The completion module 1004, configured to complete the IO data redundancy processing after an IO data storage failure is eliminated, where failed-node recovery includes master node failure recovery and erasure code node failure recovery, includes:
  • a recovery unit, configured to detect the faulty node and recover it if there is a fault, where faulty-node recovery includes master node failure recovery and erasure code node failure recovery;
  • a master node recovery unit, configured to perform master node failure recovery by reading IO data from the erasure code node corresponding to the master node when the master node fails;
  • an erasure code node recovery unit, configured so that, when the erasure code node fails and the current erasure code node is the erasure code node with the missing object, IO data is read from the other erasure code nodes to perform erasure code node failure recovery.
  • For the device embodiment, the description is relatively brief; for related parts, refer to the description of the method embodiment.
  • An embodiment of the present application also provides a computer device, including a processor, a memory, and a computer program stored on the memory and capable of running on the processor.
  • The computer device includes a processor, a memory, and a network interface connected by a system bus, where the memory includes non-volatile and/or volatile memory.
  • Acquire IO data and map the IO data into a set, where the set includes a master node and an erasure code node corresponding to the master node, and the IO data is mapped and stored to the master node and to the erasure code node corresponding to the master node;
  • the set receives an IO data processing request and performs IO data processing in the set, where the IO data processing includes write IO processing and read IO processing;
  • The embodiments of the present application also provide a computer-readable storage medium, which may include non-volatile and/or volatile memory and on which a computer program is stored; when the program is executed by a processor, a hybrid data redundancy method as in the embodiment of the present application is executed.
  • The embodiments of the present application include the following advantages: IO data is acquired and mapped into a set, and the set receives the corresponding IO data processing requests, which include requests to write IO data and to read IO data; write IO processing and read IO processing are performed according to the corresponding request; after a request is processed, response information is generated and scenario identification is performed on it, where the response scenarios include a write IO success scenario and a write IO failure scenario, from which the outcome of the write request is determined; read IO processing is judged likewise; the master node in the set and the erasure code node corresponding to the master node are checked for IO data storage faults, and the result determines whether write IO processing and read IO processing are complete: if there is no fault, they are complete, that is, the data redundancy processing is complete; if there is a fault, the faulty node is identified and recovered, and the data redundancy processing is completed once the data has been recovered.
  • The embodiments of the present application may be provided as methods, devices, or computer program products. Therefore, the embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the embodiments of the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
  • These computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal equipment to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device, and the instruction device implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing terminal equipment, so that a series of operation steps is executed on the computer or other programmable terminal equipment to produce computer-implemented processing, and the instructions executed on the computer or other programmable terminal equipment thereby provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Quality & Reliability (AREA)
  • Computer Security & Cryptography (AREA)
  • Retry When Errors Occur (AREA)
  • Techniques For Improving Reliability Of Storages (AREA)

Abstract

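A data redundancy processing method, apparatus, and storage medium. The method includes: acquiring IO data and mapping the IO data into a set, where the set includes a master node and an erasure code node corresponding to the master node, and the IO data is mapped and stored to the master node and to the erasure code node corresponding to the master node (S10); the set receiving an IO data processing request and performing IO data processing within the set, where the IO data processing includes write IO processing and read IO processing (S20); detecting whether the master node and the corresponding erasure code node in the set have an IO data storage fault (S30); and, if an IO data storage fault exists, completing the IO data redundancy processing after the fault has been eliminated (S40). The method checks whether IO data can be read and written during redundancy processing and recovers the IO data when it is lost or a fault occurs, which improves the integrity of the overall IO data redundancy processing.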

Description

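Data redundancy processing method, apparatus, device, and storage medium
This application claims priority to Chinese patent application No. CN202010456949.8, filed with the China Patent Office on May 26, 2020 and entitled "Data redundancy processing method, apparatus, and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of data processing technology in artificial intelligence, and in particular to a data redundancy processing method, apparatus, device, and storage medium.
Background
Cloud storage is a model of online storage in which data is kept on multiple virtual servers, usually hosted by a third party, rather than on dedicated servers. Hosting companies operate large data centers, and those who need their data hosted meet their storage needs by buying or leasing storage space from them. According to customer demand, the data center operator provisions virtualized storage resources at the back end and offers them as a storage resource pool, which customers can then use on their own to store files or objects.
A cloud storage system is judged on three metrics: high reliability, low storage overhead, and high read/write performance. Multi-replica storage and erasure coding are two strategies widely used in storage systems. Both guarantee high reliability but sit at opposite ends of the trade-off: multi-replica storage has high storage overhead but better performance, while erasure coding has low storage overhead but poorer performance.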
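As a rough illustration of the storage-overhead side of this trade-off, the sketch below compares the raw bytes stored per byte of user data. The function names are hypothetical. The last function rests on an assumption inferred from the scheme described later (one full copy on the master node plus K+M fragments on erasure code nodes); the application itself states no overhead figure.

```python
def replication_overhead(replicas: int) -> float:
    """Raw bytes stored per user byte when keeping `replicas` full copies."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return float(replicas)


def erasure_overhead(k: int, m: int) -> float:
    """Raw bytes stored per user byte with K data and M parity fragments."""
    if k < 1 or m < 0:
        raise ValueError("need k >= 1 and m >= 0")
    return (k + m) / k


def hybrid_overhead(k: int, m: int) -> float:
    """Assumption: one full copy on the master node plus K+M fragments."""
    return 1.0 + erasure_overhead(k, m)


print(replication_overhead(3))  # 3.0
print(erasure_overhead(4, 2))   # 1.5
print(hybrid_overhead(4, 2))    # 2.5
```

With K=4 and M=2, pure erasure coding tolerates two lost fragments at 1.5x overhead, whereas three replicas tolerate two lost copies at 3x.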
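At present, a storage policy identifier is obtained for the data to be stored, and the data is stored in the manner corresponding to that identifier. The storage policy identifier indicates at least one of the following storage methods: replica redundancy processing and erasure encoding/decoding processing. In other words, data redundancy is provided by multiple replicas and by erasure codes, which is a fairly common way of processing data.
Technical Problem
The inventor realized that when data redundancy is provided by multiple replicas and erasure codes, data may be lost or faults may occur, and that once data is lost or a fault occurs, recovering the data is difficult and not easy to carry out.
Technical Solution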
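An embodiment of this application discloses a data redundancy processing method, the method including:
acquiring IO data and mapping the IO data into a set, where the set includes a master node and an erasure code node corresponding to the master node, and the IO data is mapped and stored to the master node and to the erasure code node corresponding to the master node;
the set receiving an IO data processing request and performing IO data processing within the set, where the IO data processing includes write IO processing and read IO processing;
detecting whether the master node and the erasure code node corresponding to the master node in the set have an IO data storage fault;
if an IO data storage fault exists, completing the IO data redundancy processing after the IO data storage fault has been eliminated.
An embodiment of this application also discloses a data redundancy processing apparatus, the apparatus including:
an acquiring module, configured to acquire IO data and map the IO data into a set, where the set includes a master node and an erasure code node corresponding to the master node, and the IO data is mapped and stored to the master node and to the erasure code node corresponding to the master node;
a receiving module, configured for the set to receive an IO data processing request and perform IO data processing within the set, where the IO data processing includes write IO processing and read IO processing;
a detecting module, configured to detect whether the master node and the erasure code node corresponding to the master node in the set have an IO data storage fault;
a completing module, configured to, if an IO data storage fault exists, complete the IO data redundancy processing after the IO data storage fault has been eliminated.
An embodiment of this application also discloses a computer device, including a processor, a memory, and a computer program stored in the memory and executable on the processor, where the computer program, when executed by the processor, implements the following steps:
acquiring IO data and mapping the IO data into a set, where the set includes a master node and an erasure code node corresponding to the master node, and the IO data is mapped and stored to the master node and to the erasure code node corresponding to the master node;
the set receiving an IO data processing request and performing IO data processing within the set, where the IO data processing includes write IO processing and read IO processing;
detecting whether the master node and the erasure code node in the set have an IO data storage fault;
if an IO data storage fault exists, completing the IO data redundancy processing after the IO data storage fault has been eliminated.
An embodiment of this application also discloses a computer-readable storage medium on which a computer program is stored, where the program, when executed by a processor, implements the following steps:
acquiring IO data and mapping the IO data into a set, where the set includes a master node and an erasure code node corresponding to the master node, and the IO data is mapped and stored to the master node and to the erasure code node corresponding to the master node;
the set receiving an IO data processing request and performing IO data processing within the set, where the IO data processing includes write IO processing and read IO processing;
detecting whether the master node and the erasure code node in the set have an IO data storage fault;
if an IO data storage fault exists, completing the IO data redundancy processing after the IO data storage fault has been eliminated.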
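Brief Description of the Drawings
FIG. 1 is a flowchart of the steps of Embodiment 1 of a data redundancy processing method of this application;
FIG. 2 is a flowchart of the steps of Embodiment 2 of a data redundancy processing method of this application;
FIG. 3 is a flowchart of the steps of Embodiment 3 of a data redundancy processing method of this application;
FIG. 4 is a flowchart of the steps of Embodiment 4 of a data redundancy processing method of this application;
FIG. 5 is a flowchart of the steps of Embodiment 5 of a data redundancy processing method of this application;
FIG. 6 is a flowchart of the steps of Embodiment 6 of a data redundancy processing method of this application;
FIG. 7 is a flowchart of the steps of Embodiment 7 of a data redundancy processing method of this application;
FIG. 8 is a flowchart of the steps of Embodiment 8 of a data redundancy processing method of this application;
FIG. 9 is a structural block diagram of an embodiment of a data redundancy processing apparatus of this application.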
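Embodiments of the Invention
To make the above objectives, features, and advantages of this application clearer and easier to understand, this application is described in further detail below with reference to the drawings and specific embodiments.
Referring to FIG. 1, a flowchart of the steps of Embodiment 1 of a data redundancy processing method of this application is shown, which may specifically include the following steps:
Step S10: acquire IO data and map the IO data into a set, where the set includes a master node and an erasure code node corresponding to the master node, and the IO data is mapped and stored to the master node and to the erasure code node corresponding to the master node;
Step S20: the set receives an IO data processing request and performs IO data processing within the set, where the IO data processing includes write IO processing and read IO processing;
Step S30: detect whether the master node and the erasure code node corresponding to the master node in the set have an IO data storage fault;
Step S40: if an IO data storage fault exists, complete the IO data redundancy processing after the IO data storage fault has been eliminated.
First, different IO data is acquired. IO stands for input (I) and output (O) and comprises IO devices and IO interfaces; the IO data here is the cached data that the current cloud storage system transfers through an IO device or an IO interface. The different IO data is then mapped into a set: based on the master node in the set and the erasure code node corresponding to it, the IO data is mapped and stored to the master node and to the corresponding erasure code node. A mapping is a correspondence between the elements of two sets. Here, a set contains a master node and an erasure code node; if an item of IO data mapped into the master node is called element A, the corresponding erasure code node holds the unique item of IO data that corresponds to it, called element B, and because the master node corresponds to the erasure code node, A → B is a mapping from the master node to the erasure code node. IO data processing, which includes write IO processing and read IO processing, is then performed on the master node and its corresponding erasure code node in the set. Next, the master node and the corresponding erasure code node are checked for IO data storage faults, and the faulty node is identified from whether write IO processing and read IO processing completed. If a fault exists, it is determined whether the faulty node is the master node or the corresponding erasure code node, and fault recovery is performed on it; if no fault exists, the IO data redundancy processing is complete.
In this embodiment, different IO data is acquired and mapped into the same set; IO data processing is performed first, and it is determined whether the corresponding write IO processing and read IO processing completed; the IO data in the set is then checked for faults. If there is a fault, it is recovered; if there is none, the data redundancy processing is complete. The method can thus check whether IO data read and write requests can be completed during redundancy processing, and it recovers the faulty node when IO data is lost or an IO data fault occurs, which improves the integrity of the overall IO data redundancy processing.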
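Referring to FIG. 2, a flowchart of the steps of Embodiment 2 of a data redundancy processing method of this application is shown. The method of performing the write IO processing may specifically include the following steps:
Step S2011: acquire the original data in the master node, where the original data is the initial data already stored in the master node before the IO data is acquired;
Step S2012: divide the original data in the master node into K data fragments and compute M parity fragments, generating K+M fragments; write the K+M fragments to the erasure code node corresponding to the master node;
Step S2013: when the IO data processing request is received, write the acquired IO data simultaneously into the original data of the master node and into the K+M fragments of the erasure code node;
Step S2014: when the erasure code node finishes writing, send write-complete response information to the master node, and complete the write IO processing according to the response information.
In this embodiment, the original data in the master node is acquired, where the original data is the initial data already stored in the master node before the IO data is acquired. The original data in the master node is divided into K data fragments, M parity fragments are computed, and the K+M fragments are written to the erasure code node corresponding to the master node. When the master node and its corresponding target erasure code node receive a write IO request, the IO data to be written is written into the original data in the master node and, at the same time, into the K+M fragments in the corresponding erasure code node. When the erasure code node finishes writing, it sends write-complete response information to the master node. Once the master node has received this response and has itself finished writing the IO data, it reports to the front-end client that the write IO is complete, which marks the end of this write IO processing.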
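The write path above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the application's implementation: the `Node` class and the function names are hypothetical, nodes are in-memory objects, and a single XOR parity fragment (M = 1) stands in for the M parity fragments, since the application does not specify the code.

```python
from functools import reduce


def split_into_shards(data: bytes, k: int) -> list:
    """Split data into k equal-size shards, zero-padding the tail."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    return [padded[i * size:(i + 1) * size] for i in range(k)]


def xor_parity(shards: list) -> bytes:
    """Byte-wise XOR of equal-length shards (stand-in for M = 1 parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*shards))


class Node:
    """Hypothetical in-memory storage node."""

    def __init__(self, healthy: bool = True):
        self.healthy = healthy
        self.stored = None

    def write(self, payload: bytes) -> bool:
        """Store the payload and acknowledge; a failed node returns False."""
        if not self.healthy:
            return False
        self.stored = payload
        return True


def write_io(data: bytes, master: Node, ec_nodes: list, k: int):
    """Write the whole block to the master and one fragment per EC node.

    Returns (master_ok, number_of_ec_acks); the master would answer the
    client once it has both its own result and the EC acknowledgements.
    """
    shards = split_into_shards(data, k)
    fragments = shards + [xor_parity(shards)]  # K + M fragments with M = 1
    if len(ec_nodes) != len(fragments):
        raise ValueError("need exactly K + M erasure code nodes")
    master_ok = master.write(data)
    ec_acks = [node.write(frag) for node, frag in zip(ec_nodes, fragments)]
    return master_ok, sum(ec_acks)


master = Node()
ec_nodes = [Node() for _ in range(5)]  # K = 4, M = 1
print(write_io(b"hello world!", master, ec_nodes, k=4))  # (True, 5)
```

The returned pair (master result, count of acknowledged fragments) is exactly the information the later embodiments use to decide whether the write succeeded.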
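Referring to FIG. 3, a flowchart of the steps of Embodiment 3 of a data redundancy processing method of this application is shown. Sending write-complete response information to the master node when the erasure code node corresponding to the master node finishes writing, and completing the write IO processing according to the response information, may specifically include the following steps:
Step S20141: perform scenario identification on the response information;
Step S20142: when the response information is identified as a write IO success scenario, the write IO processing has succeeded;
Step S20143: when the response information is identified as a write IO failure scenario, the write IO processing has failed.
In this embodiment, the master node sends response information, and scenario identification is performed on it; the response scenarios include a write IO success scenario and a write IO failure scenario, and which of the two applies determines whether the write IO processing succeeded or failed. After a successful write, it is only necessary to check whether the master node and its erasure code node in the set have an IO data storage fault; if not, the data redundancy processing is complete. After a failed write, the specific faulty node among the master node and its erasure code node in the set is identified and recovered, and the data redundancy processing is completed after recovery.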
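Referring to FIG. 4, a flowchart of the steps of Embodiment 4 of a data redundancy processing method of this application is shown. Determining that the write IO processing has succeeded when the response information is identified as a write IO success scenario may specifically include the following steps:
Step S201421: if the IO data is written successfully in the master node, and at least K of the K+M fragments are written successfully on the erasure code node corresponding to the master node, the write IO processing has succeeded; or,
Step S201422: if the IO data fails to be written in the master node, but all K+M fragments are written successfully on the erasure code node corresponding to the master node, the write IO processing has succeeded.
In this embodiment, if the IO data is written successfully on the master node and the number of fragments written successfully on the erasure code node is greater than or equal to K, the write IO processing has succeeded; alternatively, if the IO data fails to be written on the master node but all K+M fragments are written successfully on the erasure code node, the write IO processing has also succeeded. Whether the write succeeded is thus determined from the IO data in the master node and in its corresponding erasure code node.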
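Referring to FIG. 5, a flowchart of the steps of Embodiment 5 of a data redundancy processing method of this application is shown. Determining that the write IO processing has failed when the response information is identified as a write IO failure scenario may specifically include the following steps:
Step S201431: if the IO data fails to be written in the master node, and all K+M fragments fail to be written on the erasure code node corresponding to the master node, the write IO processing has failed; or,
Step S201432: if the IO data is written successfully in the master node, but fewer than K of the K+M fragments are written successfully on the erasure code node corresponding to the master node, the write IO processing has failed.
In this embodiment, if the IO data fails to be written on the master node and all K+M fragments fail to be written on the corresponding erasure code node, the write IO processing has failed; if the IO data is written successfully on the master node but fewer than K of the K+M fragments are written successfully on the corresponding erasure code node, the write IO processing has also failed. Whether the write failed is thus determined from the IO data in the master node and in its corresponding erasure code node.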
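The success and failure scenarios of Embodiments 4 and 5 can be collected into one decision function. The sketch below is illustrative only; the names `WriteOutcome` and `classify_write` are hypothetical. Note that the text does not say what happens when the master write fails and only some (neither all nor none) of the K+M fragments are written, so the sketch reports that case separately instead of guessing.

```python
from enum import Enum


class WriteOutcome(Enum):
    SUCCESS = "success"
    FAILURE = "failure"
    UNSPECIFIED = "unspecified"  # case not covered by the described scenarios


def classify_write(master_ok: bool, ec_ok_count: int, k: int, m: int) -> WriteOutcome:
    """Classify a write from the master result and the EC fragment acks."""
    if not 0 <= ec_ok_count <= k + m:
        raise ValueError("ec_ok_count must be between 0 and k + m")
    if master_ok:
        # Master succeeded: at least K fragments must also have been written.
        return WriteOutcome.SUCCESS if ec_ok_count >= k else WriteOutcome.FAILURE
    if ec_ok_count == k + m:
        # Master failed, but every fragment was written.
        return WriteOutcome.SUCCESS
    if ec_ok_count == 0:
        # Master failed and every fragment failed.
        return WriteOutcome.FAILURE
    return WriteOutcome.UNSPECIFIED


print(classify_write(True, 4, k=4, m=2))   # WriteOutcome.SUCCESS
print(classify_write(True, 3, k=4, m=2))   # WriteOutcome.FAILURE
print(classify_write(False, 6, k=4, m=2))  # WriteOutcome.SUCCESS
```

A real system would have to pick a policy for the unspecified case; treating it as a failure would be the conservative choice, but that is an assumption beyond the text.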
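Referring to FIG. 6, a flowchart of the steps of Embodiment 6 of a data redundancy processing method of this application is shown. The method of performing the read IO processing may specifically include the following steps:
Step S2021: read the IO data to be read from the master node, where the IO data to be read is the original data in the master node;
Step S2022: the read IO request processing is complete once the client has received the original data.
In this embodiment, the IO data is read from the master node: the IO data to be read is the whole block of original data stored locally on the master node, so the master node reads out that block directly and returns it to the front-end client. Once the front-end client has received it, this read IO processing is complete.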
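Referring to FIG. 7, a flowchart of the steps of Embodiment 7 of a data redundancy processing method of this application is shown. Detecting whether the master node and the erasure code node corresponding to the master node in the set have an IO data storage fault may specifically include the following steps:
Step S3011: check the master node and the erasure code node corresponding to the master node in the set, and identify the location of the fault;
Step S3012: when the master node fails, create a new temporary master node and carry out write IO processing and read IO processing;
Step S3013: when the erasure code node corresponding to the master node fails and the number of failed erasure code fragments is ≤ M, the erasure code node does not need to write IO data during write IO processing, nor to take part in read IO processing;
Step S3014: when the erasure code node corresponding to the master node fails and the number of failed erasure code fragments is > M, interrupt the write IO processing;
Step S3015: when the master node and the erasure code node corresponding to the master node fail at the same time and the number of failed erasure code fragments is ≤ M, select a new temporary master node; the erasure code node does not need to write IO data during write IO processing, and the original data in the master node is read out during read IO processing.
In this embodiment, storage nodes are checked for faults by checking whether IO data is stored in the master node and in its corresponding erasure code node.
If the master node has failed, the storage system elects a new temporary master node and carries out write IO processing and read IO processing.
If the erasure code node corresponding to the master node has failed and the number of failed erasure code fragments is ≤ M, then for write IO processing the failed erasure code node does not need to write IO data; for read IO processing, the original data is read from the master node and returned to the client, and the read IO processing is complete once the client has received it.
If the number of failed erasure code fragments is > M, the storage system's read/write service is interrupted, that is, write IO requests are interrupted, because the number of failed erasure code nodes already exceeds M, the maximum that the erasure code can tolerate.
If the master node and the erasure code node fail at the same time and the number of failed erasure code fragments is ≤ M, the storage system first elects a new temporary master node; the erasure code node does not need to write IO data during write IO processing, and only read IO processing is completed.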
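The fault-handling rules of steps S3012 to S3015 amount to a small decision table, sketched below. `FaultPlan` and `plan_for_faults` are hypothetical names. One case is an assumption: the text does not address a master failure combined with more than M failed fragments, and the sketch treats it as a service interruption on the grounds that M is described as the maximum the erasure code can tolerate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FaultPlan:
    elect_temporary_master: bool    # a new temporary master node is chosen
    skip_writes_on_failed_ec: bool  # failed EC nodes are left out of writes
    service_interrupted: bool       # write IO (read/write service) is interrupted


def plan_for_faults(master_failed: bool, failed_ec: int, m: int) -> FaultPlan:
    """Map the detected fault location to the handling described above."""
    if failed_ec < 0 or m < 0:
        raise ValueError("failed_ec and m must be non-negative")
    if failed_ec > m:
        # More failures than the code tolerates (assumed to apply whether
        # or not the master node has also failed).
        return FaultPlan(False, False, True)
    return FaultPlan(
        elect_temporary_master=master_failed,
        skip_writes_on_failed_ec=failed_ec > 0,
        service_interrupted=False,
    )


print(plan_for_faults(master_failed=True, failed_ec=0, m=2))
print(plan_for_faults(master_failed=False, failed_ec=3, m=2))
```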
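Referring to FIG. 8, a flowchart of the steps of Embodiment 8 of a data redundancy processing method of this application is shown. Faulty-node recovery includes master node failure recovery and erasure code node failure recovery. Completing the IO data redundancy processing after the IO data storage fault has been eliminated, if such a fault exists, may specifically include the following steps:
Step S3021: when the master node fails, perform the master node failure recovery by reading IO data from the erasure code node corresponding to the master node;
Step S3022: when an erasure code node fails and that node is the erasure code node missing the object, read IO data from the other erasure code nodes to perform the erasure code node failure recovery.
In this embodiment, the system checks for faulty nodes and recovers them; faulty-node recovery includes master node failure recovery and erasure code node failure recovery. When the master node is recovered, its lost IO data must be read back from the erasure code nodes, by reading the data fragments or parity fragments stored there. When an erasure code node fails and that node is the one missing the object, data or parity fragments are read from the other erasure code nodes and the missing fragment is rebuilt according to the erasure code recovery strategy. Under that strategy, the erasure code creates a mathematical function that describes a set of numbers, so that their accuracy can be checked and, if one of the numbers is lost, it can be recovered. Polynomial interpolation and oversampling are the key techniques used by erasure codes, and the data is recovered with them. As a result, whenever the master node does not hold the complete block of IO data, or some of the K+M fragments on the erasure code nodes are missing, the data is recovered asynchronously in the background.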
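To make the polynomial-interpolation idea concrete, here is a toy erasure code over the prime field GF(257). It is a sketch under stated assumptions, not the code used by the application: each fragment is a single integer symbol, the K data symbols are treated as the values of a polynomial of degree below K at x = 1..K, and the M parity symbols are its values at x = K+1..K+M. Any K surviving symbols then determine the polynomial, so the missing ones can be recomputed by Lagrange interpolation. All names are hypothetical; `pow(den, -1, P)` requires Python 3.8 or later.

```python
P = 257  # prime modulus; requires K + M < 257 and data symbols in 0..255


def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial of degree < len(points)
    passing through the given (xi, yi) points, working modulo P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data, m):
    """Return the K data symbols followed by M parity symbols."""
    k = len(data)
    points = [(i + 1, d) for i, d in enumerate(data)]
    parity = [lagrange_eval(points, k + 1 + j) for j in range(m)]
    return list(data) + parity


def recover(shares, k, n):
    """Rebuild all n symbols from a dict {index: value} with >= k entries."""
    if len(shares) < k:
        raise ValueError("fewer than K fragments survive; data is unrecoverable")
    points = [(i + 1, v) for i, v in sorted(shares.items())][:k]
    return [shares[i] if i in shares else lagrange_eval(points, i + 1)
            for i in range(n)]


data = [10, 200, 33, 7]          # K = 4 data symbols
coded = encode(data, m=2)        # K + M = 6 symbols
survivors = {i: v for i, v in enumerate(coded) if i not in (1, 4)}
print(recover(survivors, k=4, n=6) == coded)  # True
```

Production systems typically use Reed-Solomon codes over GF(2^8) so that every symbol fits in one byte; the prime field is used here only because its arithmetic is easy to follow.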
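In one embodiment, the above data redundancy processing method can also be applied to blockchain nodes to store data on a blockchain and thereby improve data security. The blockchain referred to in this application is a new application model of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks linked by cryptographic methods, where each block contains a batch of network transaction information used to verify the validity of that information (anti-counterfeiting) and to generate the next block. A blockchain may include an underlying blockchain platform, a platform product service layer, an application service layer, and so on.
It should be noted that, for brevity, the method embodiments are described as a series of combined actions; however, those skilled in the art will appreciate that the embodiments of this application are not limited by the described order of actions, since according to the embodiments some steps may be performed in other orders or simultaneously. Those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and the actions involved are not necessarily required by the embodiments of this application.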
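Referring to FIG. 9, a structural block diagram of an embodiment of a data redundancy processing apparatus of this application is shown, which may specifically include the following modules:
an acquiring module 1001, configured to acquire IO data and map the IO data into a set, where the set includes a master node and an erasure code node corresponding to the master node, and the IO data is mapped and stored to the master node and to the erasure code node corresponding to the master node;
a receiving module 1002, configured for the set to receive an IO data processing request and perform IO data processing within the set, where the IO data processing includes write IO processing and read IO processing;
a detecting module 1003, configured to detect whether the master node and the erasure code node corresponding to the master node in the set have an IO data storage fault;
a completing module 1004, configured to, if an IO data storage fault exists, complete the IO data redundancy processing after the IO data storage fault has been eliminated.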
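In a preferred embodiment, the receiving module 1002 is configured for the set to receive an IO data processing request, where the IO data processing request includes write IO processing and read IO processing, and for the write IO processing it includes:
an acquiring unit, configured to acquire the original data in the master node, where the original data is the initial data already stored in the master node before the IO data is acquired;
a generating unit, configured to divide the original data in the master node into K data fragments and compute M parity fragments, generating K+M fragments, and to write the K+M fragments to the erasure code node corresponding to the master node;
a writing unit, configured to, when the IO data processing request is received, write the acquired IO data simultaneously into the original data of the master node and into the K+M fragments of the erasure code node;
a response information unit, configured to, when the erasure code node finishes writing, send write-complete response information to the master node, the write IO processing being completed according to the response information.
In a preferred embodiment, the response information unit, configured to send write-complete response information to the master node when the erasure code node finishes writing and to complete the write IO processing according to the response information, includes:
an identifying unit, configured to perform scenario identification on the response information;
a write IO success scenario unit, configured to determine that the write IO processing has succeeded when the response information is identified as a write IO success scenario;
a write IO failure scenario unit, configured to determine that the write IO processing has failed when the response information is identified as a write IO failure scenario.
In a preferred embodiment, the write IO success scenario unit includes:
a first completing unit, configured to determine that the write IO processing has succeeded if the IO data is written successfully in the master node and at least K of the K+M fragments are written successfully on the erasure code node corresponding to the master node; or,
a second completing unit, configured to determine that the write IO processing has succeeded if the IO data fails to be written in the master node but all K+M fragments are written successfully on the erasure code node corresponding to the master node.
In a preferred embodiment, the write IO failure scenario unit includes:
a first failure unit, configured to determine that the write IO processing has failed if the IO data fails to be written in the master node and all K+M fragments fail to be written on the erasure code node corresponding to the master node; or,
a second failure unit, configured to determine that the write IO processing has failed if the IO data is written successfully in the master node but fewer than K of the K+M fragments are written successfully on the erasure code node corresponding to the master node.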
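In a preferred embodiment, the receiving module 1002 is configured for the set to receive an IO data processing request, where the IO data processing request includes write IO processing and read IO processing, and for the read IO processing it includes:
a reading unit, configured to read the IO data to be read from the master node, where the IO data to be read is the original data in the master node;
a read IO completing unit, configured to complete the read IO processing once the client has received the original data.
In a preferred embodiment, the detecting module 1003, configured to detect whether the master node and the erasure code node in the set have an IO data storage fault, includes:
a fault unit, configured to check the master node and the erasure code node corresponding to the master node in the set and identify the location of the fault;
a temporary master node unit, configured to, when the master node fails, create a new temporary master node and carry out write IO processing and read IO processing;
a no-write-and-read unit, configured to, when the erasure code node corresponding to the master node fails and the number of failed erasure code fragments is ≤ M, have the erasure code node skip writing IO data during write IO processing and skip read IO processing;
an interrupting unit, configured to interrupt the write IO processing when the erasure code node corresponding to the master node fails and the number of failed erasure code fragments is > M;
a reading unit, configured to, when the master node and the erasure code node corresponding to the master node fail at the same time and the number of failed erasure code fragments is ≤ M, select a new temporary master node, where the erasure code node does not need to write IO data during write IO processing and the original data in the master node is read out during read IO processing.
In a preferred embodiment, the completing module 1004, where faulty-node recovery includes master node failure recovery and erasure code node failure recovery, and which is configured to complete the IO data redundancy processing after the IO data storage fault has been eliminated if such a fault exists, includes:
a recovery unit, configured to detect the faulty node when a fault exists and to recover it, where faulty-node recovery includes master node failure recovery and erasure code node failure recovery;
a master node recovery unit, configured to, when the master node fails, perform the master node failure recovery by reading IO data from the erasure code node corresponding to the master node;
an erasure code node recovery unit, configured to, when an erasure code node fails and that node is the erasure code node missing the object, read IO data from the other erasure code nodes to perform the erasure code node failure recovery.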
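Because the apparatus embodiment is basically similar to the method embodiment, its description is relatively brief; for related parts, refer to the corresponding description of the method embodiment.
An embodiment of this application also provides a computer device, including a processor, a memory, and a computer program stored in the memory and executable on the processor. The computer device includes a processor, a memory, and a network interface connected by a system bus, where the memory includes non-volatile and/or volatile memory. The computer program, when executed by the processor, implements the following steps:
acquiring IO data and mapping the IO data into a set, where the set includes a master node and an erasure code node corresponding to the master node, and the IO data is mapped and stored to the master node and to the erasure code node corresponding to the master node;
the set receiving an IO data processing request and performing IO data processing within the set, where the IO data processing includes write IO processing and read IO processing;
detecting whether the master node and the erasure code node in the set have an IO data storage fault;
if an IO data storage fault exists, completing the IO data redundancy processing after the IO data storage fault has been eliminated.
An embodiment of this application also provides a computer-readable storage medium, which may include non-volatile and/or volatile memory and on which a computer program is stored; when the program is executed by a processor, a hybrid data redundancy method as in the embodiments of this application is performed.
The embodiments of this application include the following advantages. IO data is acquired and mapped into a set, and the set receives the corresponding IO data processing requests, which include requests to write IO data and to read IO data; write IO processing and read IO processing are performed according to the corresponding request. After a request is processed, response information is generated and scenario identification is performed on it, where the response scenarios include a write IO success scenario and a write IO failure scenario, from which the outcome of the write request is determined; read IO processing is judged likewise. The master node in the set and the erasure code node corresponding to the master node are checked for IO data storage faults, and the result determines whether write IO processing and read IO processing are complete: if there is no fault, they are complete, that is, the data redundancy processing is complete; if there is a fault, the faulty node is identified and recovered, and the data redundancy processing is completed once the data has been recovered. By checking whether IO data read and write requests can be completed during redundancy processing, and by recovering the IO data when it is lost or a fault occurs, this application improves the integrity of the overall IO data redundancy processing.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the others, and the same or similar parts of the embodiments can be cross-referenced.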
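Those skilled in the art will understand that the embodiments of this application may be provided as a method, an apparatus, or a computer program product. Therefore, the embodiments of this application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the embodiments of this application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
The embodiments of this application are described with reference to flowcharts and/or block diagrams of the method, terminal device (system), and computer program product according to the embodiments. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks, can be implemented by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing terminal equipment to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing terminal equipment produce a device for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal equipment to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device, and the instruction device implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions can also be loaded onto a computer or other programmable data processing terminal equipment, so that a series of operation steps is executed on the computer or other programmable terminal equipment to produce computer-implemented processing, and the instructions executed on the computer or other programmable terminal equipment thereby provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
Although preferred embodiments of this application have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn the basic inventive concept. The appended claims are therefore intended to be interpreted as covering the preferred embodiments and all changes and modifications that fall within the scope of the embodiments of this application.
Finally, it should also be noted that, in this document, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", and any other variants are intended to cover non-exclusive inclusion, so that a process, method, article, or terminal device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or terminal device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or terminal device that includes the element.
The data redundancy processing method, apparatus, and storage medium provided by this application have been described in detail above. Specific examples are used herein to explain the principles and implementations of this application, and the description of the above embodiments is intended only to help in understanding the method of this application and its core idea. At the same time, those of ordinary skill in the art may, following the idea of this application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting this application.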

Claims (20)

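  1. A data redundancy processing method, comprising:
    acquiring IO data and mapping the IO data into a set, wherein the set includes a master node and an erasure code node corresponding to the master node, and the IO data is mapped and stored to the master node and to the erasure code node corresponding to the master node;
    receiving, by the set, an IO data processing request and performing IO data processing within the set, wherein the IO data processing includes write IO processing and read IO processing;
    detecting whether the master node and the erasure code node in the set have an IO data storage fault;
    if an IO data storage fault exists, completing the IO data redundancy processing after the IO data storage fault has been eliminated.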
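  2. The method according to claim 1, wherein the method of performing the write IO processing comprises:
    acquiring the original data in the master node, wherein the original data is the initial data already stored in the master node before the IO data is acquired;
    dividing the original data in the master node into K data fragments and M parity fragments to generate K+M fragments, and writing the K+M fragments to the erasure code node;
    when the IO data processing request is received, writing the acquired IO data simultaneously into the original data of the master node and into the K+M fragments of the erasure code node;
    when the erasure code node finishes writing, sending write-complete response information to the master node, and completing the write IO processing according to the response information.
  3. The method according to claim 2, wherein sending write-complete response information to the master node when the erasure code node finishes writing, and completing the write IO processing according to the response information, comprises:
    performing scenario identification on the response information;
    when the response information is identified as a write IO success scenario, determining that the write IO processing has succeeded;
    when the response information is identified as a write IO failure scenario, determining that the write IO processing has failed.
  4. The method according to claim 3, wherein determining that the write IO processing has succeeded when the response information is identified as a write IO success scenario comprises:
    if the IO data is written successfully in the master node, and at least K of the K+M fragments are written successfully on the erasure code node corresponding to the master node, the write IO processing has succeeded; or,
    if the IO data fails to be written in the master node, but all K+M fragments are written successfully on the erasure code node corresponding to the master node, the write IO processing has succeeded.
  5. The method according to claim 3, wherein determining that the write IO processing has failed when the response information is identified as a write IO failure scenario comprises:
    if the IO data fails to be written in the master node, and all K+M fragments fail to be written on the erasure code node corresponding to the master node, the write IO processing has failed; or,
    if the IO data is written successfully in the master node, but fewer than K of the K+M fragments are written successfully on the erasure code node corresponding to the master node, the write IO processing has failed.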
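  6. The method according to claim 1, wherein the method of performing the read IO processing comprises:
    reading the IO data to be read from the master node, wherein the IO data to be read is the original data in the master node;
    completing the read IO processing once the client has received the original data.
  7. The method according to claim 1, wherein detecting whether the master node and the erasure code node in the set have an IO data storage fault comprises:
    checking the master node and the erasure code node corresponding to the master node in the set, and identifying the location of the fault;
    when the master node fails, creating a new temporary master node and carrying out write IO processing and read IO processing;
    when the erasure code node corresponding to the master node fails and the number of failed erasure code fragments is ≤ M, the erasure code node does not need to write IO data during write IO processing, nor to take part in read IO processing;
    when the erasure code node corresponding to the master node fails and the number of failed erasure code fragments is > M, interrupting the write IO processing;
    when the master node and the erasure code node corresponding to the master node fail at the same time and the number of failed erasure code fragments is ≤ M, selecting a new temporary master node, wherein the erasure code node does not need to write IO data during write IO processing, and the original data in the master node is read out during read IO processing.
  8. The method according to claim 1, wherein faulty-node recovery includes master node failure recovery and erasure code node failure recovery, and completing the IO data redundancy processing after the IO data storage fault has been eliminated, if such a fault exists, comprises:
    when the master node fails, performing the master node failure recovery by reading IO data from the erasure code node corresponding to the master node;
    when an erasure code node fails and that node is the erasure code node missing the object, reading IO data from the other erasure code nodes to perform the erasure code node failure recovery.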
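  9. A data redundancy processing apparatus, comprising:
    an acquiring module, configured to acquire IO data and map the IO data into a set, wherein the set includes a master node and an erasure code node corresponding to the master node, and the IO data is mapped and stored to the master node and to the erasure code node corresponding to the master node;
    a receiving module, configured for the set to receive an IO data processing request and perform IO data processing within the set, wherein the IO data processing includes write IO processing and read IO processing;
    a detecting module, configured to detect whether the master node and the erasure code node corresponding to the master node in the set have an IO data storage fault;
    a completing module, configured to, if an IO data storage fault exists, complete the IO data redundancy processing after the IO data storage fault has been eliminated.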
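  10. A computer device, comprising a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the following steps:
    acquiring IO data and mapping the IO data into a set, wherein the set includes a master node and an erasure code node corresponding to the master node, and the IO data is mapped and stored to the master node and to the erasure code node corresponding to the master node;
    receiving, by the set, an IO data processing request and performing IO data processing within the set, wherein the IO data processing includes write IO processing and read IO processing;
    detecting whether the master node and the erasure code node in the set have an IO data storage fault;
    if an IO data storage fault exists, completing the IO data redundancy processing after the IO data storage fault has been eliminated.
  11. The computer device according to claim 10, wherein the method of performing the write IO processing comprises:
    acquiring the original data in the master node, wherein the original data is the initial data already stored in the master node before the IO data is acquired;
    dividing the original data in the master node into K data fragments and M parity fragments to generate K+M fragments, and writing the K+M fragments to the erasure code node;
    when the IO data processing request is received, writing the acquired IO data simultaneously into the original data of the master node and into the K+M fragments of the erasure code node;
    when the erasure code node finishes writing, sending write-complete response information to the master node, and completing the write IO processing according to the response information.
  12. The computer device according to claim 11, wherein sending write-complete response information to the master node when the erasure code node finishes writing, and completing the write IO processing according to the response information, comprises:
    performing scenario identification on the response information;
    when the response information is identified as a write IO success scenario, determining that the write IO processing has succeeded;
    when the response information is identified as a write IO failure scenario, determining that the write IO processing has failed.
  13. The computer device according to claim 12, wherein determining that the write IO processing has succeeded when the response information is identified as a write IO success scenario comprises:
    if the IO data is written successfully in the master node, and at least K of the K+M fragments are written successfully on the erasure code node corresponding to the master node, the write IO processing has succeeded; or,
    if the IO data fails to be written in the master node, but all K+M fragments are written successfully on the erasure code node corresponding to the master node, the write IO processing has succeeded.
  14. The computer device according to claim 12, wherein determining that the write IO processing has failed when the response information is identified as a write IO failure scenario comprises:
    if the IO data fails to be written in the master node, and all K+M fragments fail to be written on the erasure code node corresponding to the master node, the write IO processing has failed; or,
    if the IO data is written successfully in the master node, but fewer than K of the K+M fragments are written successfully on the erasure code node corresponding to the master node, the write IO processing has failed.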
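  15. The computer device according to claim 10, wherein the method of performing the read IO processing comprises:
    reading the IO data to be read from the master node, wherein the IO data to be read is the original data in the master node;
    completing the read IO processing once the client has received the original data.
  16. The computer device according to claim 10, wherein detecting whether the master node and the erasure code node in the set have an IO data storage fault comprises:
    checking the master node and the erasure code node corresponding to the master node in the set, and identifying the location of the fault;
    when the master node fails, creating a new temporary master node and carrying out write IO processing and read IO processing;
    when the erasure code node corresponding to the master node fails and the number of failed erasure code fragments is ≤ M, the erasure code node does not need to write IO data during write IO processing, nor to take part in read IO processing;
    when the erasure code node corresponding to the master node fails and the number of failed erasure code fragments is > M, interrupting the write IO processing;
    when the master node and the erasure code node corresponding to the master node fail at the same time and the number of failed erasure code fragments is ≤ M, selecting a new temporary master node, wherein the erasure code node does not need to write IO data during write IO processing, and the original data in the master node is read out during read IO processing.
  17. The computer device according to claim 10, wherein faulty-node recovery includes master node failure recovery and erasure code node failure recovery, and completing the IO data redundancy processing after the IO data storage fault has been eliminated, if such a fault exists, comprises:
    when the master node fails, performing the master node failure recovery by reading IO data from the erasure code node corresponding to the master node;
    when an erasure code node fails and that node is the erasure code node missing the object, reading IO data from the other erasure code nodes to perform the erasure code node failure recovery.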
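  18. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the following steps:
    acquiring IO data and mapping the IO data into a set, wherein the set includes a master node and an erasure code node corresponding to the master node, and the IO data is mapped and stored to the master node and to the erasure code node corresponding to the master node;
    receiving, by the set, an IO data processing request and performing IO data processing within the set, wherein the IO data processing includes write IO processing and read IO processing;
    detecting whether the master node and the erasure code node in the set have an IO data storage fault;
    if an IO data storage fault exists, completing the IO data redundancy processing after the IO data storage fault has been eliminated.
  19. The computer-readable storage medium according to claim 18, wherein the method of performing the write IO processing comprises:
    acquiring the original data in the master node, wherein the original data is the initial data already stored in the master node before the IO data is acquired;
    dividing the original data in the master node into K data fragments and M parity fragments to generate K+M fragments, and writing the K+M fragments to the erasure code node;
    when the IO data processing request is received, writing the acquired IO data simultaneously into the original data of the master node and into the K+M fragments of the erasure code node;
    when the erasure code node finishes writing, sending write-complete response information to the master node, and completing the write IO processing according to the response information.
  20. The computer-readable storage medium according to claim 19, wherein sending write-complete response information to the master node when the erasure code node finishes writing, and completing the write IO processing according to the response information, comprises:
    performing scenario identification on the response information;
    when the response information is identified as a write IO success scenario, determining that the write IO processing has succeeded;
    when the response information is identified as a write IO failure scenario, determining that the write IO processing has failed.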
PCT/CN2020/118909 2020-05-26 2020-09-29 Data redundancy processing method, apparatus, device, and storage medium WO2021151298A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010456949.8 2020-05-26
CN202010456949.8A CN111625400B (zh) 2020-05-26 2020-05-26 Data redundancy processing method, apparatus, and storage medium

Publications (1)

Publication Number Publication Date
WO2021151298A1 true WO2021151298A1 (zh) 2021-08-05

Family

ID=72271156

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/118909 WO2021151298A1 (zh) 2020-05-26 2020-09-29 Data redundancy processing method, apparatus, device, and storage medium

Country Status (2)

Country Link
CN (1) CN111625400B (zh)
WO (1) WO2021151298A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
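CN111625400B (zh) * 2020-05-26 2024-01-16 Ping An Technology (Shenzhen) Co., Ltd. Data redundancy processing method, apparatus, and storage medium
CN112597654A (zh) * 2020-12-24 2021-04-02 National University of Defense Technology MBSE-based method for verifying, optimizing, and evaluating top-level system design schemes
CN113360890A (zh) * 2021-06-10 2021-09-07 Chongqing Creation Vocational College Computer-based security authentication method and system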

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
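CN105095013A (zh) * 2015-06-04 2015-11-25 Huawei Technologies Co., Ltd. Data storage method, recovery method, related apparatus, and system
US20170206135A1 (en) * 2015-12-31 2017-07-20 Huawei Technologies Co., Ltd. Data Reconstruction Method in Distributed Storage System, Apparatus, and System
CN109889440A (zh) * 2019-02-20 2019-06-14 Harbin Engineering University Maximum-spanning-tree-based method for selecting reconstruction paths for failed erasure code nodes
CN110212923A (zh) * 2019-05-08 2019-09-06 Xi'an Jiaotong University Simulated-annealing-based data repair method for distributed erasure code storage systems
CN111625400A (zh) * 2020-05-26 2020-09-04 Ping An Technology (Shenzhen) Co., Ltd. Data redundancy processing method, apparatus, and storage medium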

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
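CN107977167B (zh) * 2017-12-01 2020-08-18 Xi'an Jiaotong University Degraded read optimization method for erasure-code-based distributed storage systems
CN110018783B (zh) * 2018-01-09 2022-12-20 Alibaba Group Holding Ltd. Data storage method, apparatus, and system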

Also Published As

Publication number Publication date
CN111625400B (zh) 2024-01-16
CN111625400A (zh) 2020-09-04

Similar Documents

Publication Publication Date Title
EP3934165A1 (en) Consensus method of consortium blockchain, and consortium blockchain system
WO2021151298A1 (zh) Data redundancy processing method, apparatus, device, and storage medium
US11614867B2 (en) Distributed storage system-based data processing method and storage device
CN106201338B (zh) Data storage method and apparatus
CN108681565B (zh) Blockchain data parallel processing method, apparatus, device, and storage medium
EP3014451B1 (en) Locally generated simple erasure codes
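CN106776130B (zh) Log recovery method, storage apparatus, and storage node
TW201909613A (zh) Method, apparatus, and electronic device for processing consensus requests in a blockchain consensus network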
EP3779760B1 (en) Blockchain-based data processing method and apparatus, and electronic device
CN106802892B (zh) Method and device for verifying master/backup data consistency
CN106611135A (zh) Method for verifying and recovering the integrity of stored data
US7849355B2 (en) Distributed object sharing system and method thereof
US9489254B1 (en) Verification of erasure encoded fragments
US20190215152A1 (en) End-to-end checksum in a multi-tenant encryption storage system
CN110121694B (zh) Log management method, server, and database system
US10691353B1 (en) Checking of data difference for writes performed via a bus interface to a dual-server storage controller
US20190073284A1 (en) Validation of data written via two different bus interfaces to a dual server based storage controller
CN110753080A (zh) Block transmission method, apparatus, device, and readable storage medium
US9552254B1 (en) Verification of erasure encoded fragments
CN104407806B (zh) Method and apparatus for modifying hard disk information of a redundant array of independent disks group
US11281532B1 (en) Synchronously storing data in a dispersed storage network
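WO2020238736A1 (zh) Method for generating a decoding matrix, decoding method, and corresponding apparatus
CN112463444A (zh) Data inconsistency repair method and related apparatus
CN106844088B (zh) Data sending method and apparatus for a RAID storage system
KR20170120889A (ko) Block recovery method and apparatus in a database system, and computer program stored on a computer-readable medium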

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application
    Ref document number: 20916497; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase
    Ref country code: DE
122 EP: PCT application non-entry in European phase
    Ref document number: 20916497; Country of ref document: EP; Kind code of ref document: A1