WO2017028494A1 - Data recovery method, data storage method, and corresponding apparatus and system - Google Patents

Data recovery method, data storage method, and corresponding apparatus and system

Info

Publication number
WO2017028494A1
WO2017028494A1 PCT/CN2016/071339 CN2016071339W WO2017028494A1 WO 2017028494 A1 WO2017028494 A1 WO 2017028494A1 CN 2016071339 W CN2016071339 W CN 2016071339W WO 2017028494 A1 WO2017028494 A1 WO 2017028494A1
Authority
WO
WIPO (PCT)
Prior art keywords
byte
encoding
file
data processing
block
Prior art date
Application number
PCT/CN2016/071339
Other languages
English (en)
French (fr)
Inventor
庄仕岳
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to EP16836360.4A (patent EP3327571B1, en)
Publication of WO2017028494A1 (zh)
Priority to US15/893,201 (patent US10810091B2, en)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G06F11/1469 Backup restoration techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1076 Parity data used in redundant arrays of independent storages, e.g. in RAID systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems

Definitions

  • the present invention relates to the field of data storage technologies, and in particular, to a data recovery method, a storage method, a corresponding device and a system.
  • a multi-copy scheme may be adopted, that is, the data on a disk is copied to multiple replica disks, and when any one of the disks fails, the data is read from any surviving replica and written to a new disk.
  • alternatively, a Reed-Solomon (RS) code such as RS(10, 4) is used to encode the data of 10 disks, and the generated encoding result is stored.
  • with RS coding, the network bandwidth overhead during recovery is increased by 10 times, and this network bandwidth overhead is a disadvantage of the RS technology.
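  • as a rough illustration of the trade-off described above, the following Python sketch compares the storage overhead and the repair reads of a multi-copy scheme and of RS(10, 4). The copy count of 3 and the function names are assumptions made for illustration only; the RS figures rely on the standard property that a lost block is rebuilt from k surviving blocks.

```python
def replication_cost(copies: int) -> dict:
    """Cost of a multi-copy scheme, in units of one data disk."""
    return {
        "storage_per_data_disk": float(copies),  # every disk is stored `copies` times
        "disks_read_to_repair_one": 1,           # any surviving replica is enough
    }


def rs_cost(k: int, r: int) -> dict:
    """Cost of an RS(k, r) scheme, in units of one data disk."""
    return {
        "storage_per_data_disk": (k + r) / k,    # k data disks plus r check disks
        "disks_read_to_repair_one": k,           # k surviving disks must be read
    }


three_copy = replication_cost(3)  # the copy count 3 is an assumption
rs_10_4 = rs_cost(10, 4)

print("3 copies :", three_copy)
print("RS(10, 4):", rs_10_4)
```

  • under these assumptions, RS(10, 4) stores far less data than three copies but reads ten disks to repair one, which is the overhead the embodiments aim to reduce.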
  • the embodiment of the present invention provides a data recovery method, which can reduce network overhead during data recovery under the premise of low storage overhead.
  • the embodiment of the invention also provides a corresponding data storage method, corresponding device and system.
  • a first aspect of the present invention provides a data recovery method. The method is applied to a distributed storage system that includes a named node, a plurality of first storage nodes, and a plurality of second storage nodes. The plurality of first storage nodes are configured to store different file blocks of a stored file in a distributed manner, and the plurality of second storage nodes are configured to store, in a distributed manner, the check code blocks obtained by encoding the different file blocks.
  • each of the first storage nodes includes a data processing device, and each data processing device is in communication connection with the named node. The method comprises:
  • the data processing device does not find the target file block according to the identifier of the target file, and determines that the target file block is lost;
  • the recovery dependent data block includes a dependent file block and a dependent check code block required to recover the target file block; a part of the check codes in the dependent check code block is obtained by encoding some of the file blocks of the target file, and the remaining check codes in the dependent check code block, that is, the check codes other than that part, are obtained by encoding every file block of the target file; the target file is the file to which the target file block belongs;
  • the data processing apparatus recovers the target file block according to the dependent file block and the dependent check code block.
  • the data processing apparatus recovers the target file block according to the dependent file block and the dependent check code block, including:
  • the data processing apparatus recovers a first byte in the target file block according to a partial byte encoding function, the dependent file block, and the dependent check code block, wherein the partial byte encoding function is a function that encodes some of the file blocks in the target file to obtain an encoding result;
  • the data processing apparatus recovers a second byte in the target file block according to a full byte encoding function, the dependent file block, and the dependent check code block, wherein the full byte encoding function is a function that encodes every file block in the target file to obtain an encoding result.
  • the data processing apparatus recovering the first byte in the target file block according to the partial byte encoding function, the dependent file block, and the dependent check code block includes:
  • the data processing device acquires, from the dependent file block corresponding to a first encoding parameter, a dependent byte required to recover the first byte, and acquires, from the dependent check code block corresponding to a first encoding result, a check code required to recover the first byte, wherein the first encoding parameter is an encoding parameter in the partial byte encoding function, and the first encoding result is the result obtained by using the partial byte encoding function to encode the dependent byte indicated by the first encoding parameter together with the first byte;
  • the data processing apparatus decodes the check code required to recover the first byte according to the dependent byte required to recover the first byte, to obtain the first byte.
  • the data processing apparatus recovering the second byte in the target file block according to the full byte encoding function, the dependent file block, and the dependent check code block includes:
  • the data processing device acquires, from the dependent file block corresponding to a second encoding parameter, a dependent byte required to recover the second byte, and acquires, from the dependent check code block corresponding to a second encoding result, a check code required to recover the second byte, wherein the second encoding parameter is an encoding parameter in the full byte encoding function, and the second encoding result is the result obtained by using the full byte encoding function to encode the dependent byte indicated by the second encoding parameter together with the second byte;
  • the data processing apparatus decodes the check code required to recover the second byte according to the dependent byte required to recover the second byte, to obtain the second byte.
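  • the two recovery paths above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the arithmetic field GF(2^8) with polynomial 0x11D, the coefficient values, and all function names are assumptions, since the text only requires a linear encoding function whose coefficients come from an invertible coding matrix.

```python
def gf_mul(x: int, y: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x^2 + 1 (assumed field)."""
    result = 0
    while y:
        if y & 1:
            result ^= x
        x <<= 1
        if x & 0x100:
            x ^= 0x11D
        y >>= 1
    return result


def gf_inv(x: int) -> int:
    """Multiplicative inverse computed as x^254; x must be nonzero."""
    if x == 0:
        raise ZeroDivisionError("0 has no inverse in GF(2^8)")
    result = 1
    for _ in range(254):
        result = gf_mul(result, x)
    return result


def encode(coeffs: list[int], data: list[int]) -> int:
    """Check byte = sum of coeffs[i] * data[i]; addition in GF(2^8) is XOR."""
    acc = 0
    for c, d in zip(coeffs, data):
        acc ^= gf_mul(c, d)
    return acc


def recover(coeffs: list[int], data: list[int], lost: int, check: int) -> int:
    """Solve the encoding equation for data[lost] from the other bytes and the check byte."""
    acc = check
    for i, (c, d) in enumerate(zip(coeffs, data)):
        if i != lost:
            acc ^= gf_mul(c, d)  # subtract (XOR) every known term
    return gf_mul(acc, gf_inv(coeffs[lost]))


# First byte: partial byte encoding function over 4 file blocks (hypothetical coefficients).
g_coeffs = [1, 2, 3, 4]
a = [0x41, 0x42, 0x43, 0x44]
c_partial = encode(g_coeffs, a)
first_byte = recover(g_coeffs, [0] + a[1:], 0, c_partial)   # needs 3 dependent bytes + 1 check byte

# Second byte: full byte encoding function over all 10 file blocks (hypothetical coefficients).
f_coeffs = list(range(1, 11))
b = [0x10 + i for i in range(10)]
c_full = encode(f_coeffs, b)
second_byte = recover(f_coeffs, [0] + b[1:], 0, c_full)     # needs 9 dependent bytes + 1 check byte

print(hex(first_byte), hex(second_byte))
```

  • in this sketch the first byte is rebuilt from three dependent bytes and one check code, while the second byte needs nine dependent bytes and one check code, which is where the reduction in network overhead comes from.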
  • a second aspect of the present invention provides a data storage method. The method is applied to a distributed storage system that includes a named node, a plurality of first storage nodes, and a plurality of second storage nodes. The plurality of first storage nodes are configured to store different file blocks of a file to be stored in a distributed manner, and the plurality of second storage nodes are configured to store, in a distributed manner, the check code blocks obtained by encoding the different file blocks. Each of the second storage nodes includes a data processing device, and each data processing device is in communication connection with the named node. The method comprises:
  • the data processing apparatus receives identifiers of a plurality of target storage nodes and an identifier of a target file sent by the named node, where the plurality of target storage nodes are the first storage nodes that have stored the different file blocks of the target file;
  • the data processing apparatus encodes some of the file blocks in the target file according to the identifiers of the target storage nodes and a partial byte encoding function to obtain a first check code, where the partial byte encoding function is a function that encodes some of the file blocks in the target file to obtain an encoding result;
  • the data processing apparatus encodes every file block in the target file according to the identifiers of the target storage nodes and a full byte encoding function to obtain a second check code, where the full byte encoding function is a function that encodes every file block in the target file to obtain an encoding result;
  • the data processing device stores the first check code and the second check code in the storage space of the second storage node to which the data processing device belongs.
  • the data processing apparatus encoding some of the file blocks in the target file according to the identifiers of the target storage nodes and the partial byte encoding function to obtain the first check code includes:
  • the data processing apparatus encodes the byte indicated by the first encoding parameter according to the partial byte encoding function to obtain the first check code.
  • the data processing apparatus encoding every file block in the target file according to the identifiers of the target storage nodes and the full byte encoding function to obtain the second check code includes:
  • the data processing apparatus acquires, from the target storage node corresponding to a second encoding parameter, the byte indicated by the second encoding parameter, where the second encoding parameter is each encoding parameter in the full byte encoding function;
  • the data processing apparatus encodes the byte indicated by the second encoding parameter according to the full byte encoding function to obtain the second check code.
  • before the data processing apparatus encodes some of the file blocks in the target file according to the identifiers of the target storage nodes and the partial byte encoding function to obtain the first check code, the method further includes:
  • the data processing apparatus determines, according to the number of target storage nodes and the number of check nodes specified by the named node, the number of first parameters in the partial byte encoding function and the number of identical first parameters contained in the partial byte encoding functions of two immediately adjacent check nodes, where the partial byte encoding functions of two immediately adjacent check nodes have the largest number of overlapping first parameters.
  • a third aspect of the present invention provides a data processing apparatus applied to a distributed storage system. The distributed storage system includes a named node, a plurality of first storage nodes, and a plurality of second storage nodes. The plurality of first storage nodes are configured to store different file blocks of a stored file in a distributed manner, and the plurality of second storage nodes are configured to store, in a distributed manner, the check code blocks obtained by encoding the different file blocks. Each of the first storage nodes includes a data processing device, and each data processing device is in communication connection with the named node. The data processing device comprises:
  • a receiving module, configured to receive a file block acquisition request sent by user equipment, where the file block acquisition request carries an identifier of the target file;
  • a determining module, configured to determine, when the target file block is not found according to the identifier of the target file received by the receiving module, that the target file block is lost;
  • an obtaining module, configured to: after the determining module determines that the target file block is lost, acquire from the named node the identifier of the target storage node where the recovery dependent data block is located, and acquire the recovery dependent data block according to the identifier of the target storage node and the identifier of the target file, where the recovery dependent data block includes a dependent file block and a dependent check code block required to recover the target file block; a part of the check codes in the dependent check code block is obtained by encoding some of the file blocks of the target file, and the remaining check codes in the dependent check code block, that is, the check codes other than that part, are obtained by encoding every file block of the target file; the target file is the file to which the target file block belongs;
  • a recovery module, configured to recover the target file block according to the dependent file block and the dependent check code block acquired by the obtaining module.
  • the recovery module includes:
  • a first recovery unit, configured to recover a first byte in the target file block according to a partial byte encoding function, the dependent file block, and the dependent check code block, where the partial byte encoding function is a function that encodes some of the file blocks in the target file to obtain an encoding result;
  • a second recovery unit, configured to recover a second byte in the target file block according to a full byte encoding function, the dependent file block, and the dependent check code block, where the full byte encoding function is a function that encodes every file block in the target file to obtain an encoding result.
  • the first recovery unit is specifically configured to: acquire, from the dependent file block corresponding to a first encoding parameter, a dependent byte required to recover the first byte, and acquire, from the dependent check code block corresponding to a first encoding result, a check code required to recover the first byte, where the first encoding parameter is an encoding parameter in the partial byte encoding function, and the first encoding result is the result obtained by using the partial byte encoding function to encode the dependent byte indicated by the first encoding parameter together with the first byte; and decode the check code required to recover the first byte according to the dependent byte required to recover the first byte, to obtain the first byte.
  • the second recovery unit is specifically configured to: acquire, from the dependent file block corresponding to a second encoding parameter, a dependent byte required to recover the second byte, and acquire, from the dependent check code block corresponding to a second encoding result, a check code required to recover the second byte, where the second encoding parameter is an encoding parameter in the full byte encoding function, and the second encoding result is the result obtained by using the full byte encoding function to encode the dependent byte indicated by the second encoding parameter together with the second byte; and decode the check code required to recover the second byte according to the dependent byte required to recover the second byte, to obtain the second byte.
  • a fourth aspect of the present invention provides a data processing apparatus applied to a distributed storage system. The distributed storage system includes a named node, a plurality of first storage nodes, and a plurality of second storage nodes. The plurality of first storage nodes are configured to store different file blocks of a file to be stored in a distributed manner, and the plurality of second storage nodes are configured to store, in a distributed manner, the check code blocks obtained by encoding the different file blocks. Each of the second storage nodes includes a data processing device, and each data processing device is in communication connection with the named node. The data processing device comprises:
  • a receiving module, configured to receive identifiers of a plurality of target storage nodes and an identifier of a target file sent by the named node, where the plurality of target storage nodes are the first storage nodes that have stored the different file blocks of the target file;
  • a first encoding module, configured to encode some of the file blocks in the target file according to the identifiers of the target storage nodes received by the receiving module and a partial byte encoding function, to obtain a first check code, where the partial byte encoding function is a function that encodes some of the file blocks in the target file to obtain an encoding result;
  • a second encoding module, configured to encode every file block in the target file according to the identifiers of the target storage nodes received by the receiving module and a full byte encoding function, to obtain a second check code, where the full byte encoding function is a function that encodes every file block in the target file to obtain an encoding result;
  • a storage scheduling module, configured to store the first check code obtained by the first encoding module and the second check code obtained by the second encoding module in the storage space of the second storage node to which the data processing device belongs.
  • the first encoding module is specifically configured to: acquire, from the target storage node corresponding to a first encoding parameter, the byte indicated by the first encoding parameter, where the first encoding parameter is each encoding parameter in the partial byte encoding function; and encode the byte indicated by the first encoding parameter according to the partial byte encoding function to obtain the first check code.
  • the second encoding module is specifically configured to: acquire, from the target storage node corresponding to a second encoding parameter, the byte indicated by the second encoding parameter, where the second encoding parameter is each encoding parameter in the full byte encoding function; and encode the byte indicated by the second encoding parameter according to the full byte encoding function to obtain the second check code.
  • the data processing device further includes:
  • a determining module, configured to determine, according to the number of target storage nodes received by the receiving module and the number of check nodes specified by the named node, the number of first parameters in the partial byte encoding function and the number of identical first parameters contained in the partial byte encoding functions of two immediately adjacent check nodes, where the partial byte encoding functions of two immediately adjacent check nodes have the largest number of overlapping first parameters.
  • a fifth aspect of the present invention provides a distributed storage system, including a named node, a plurality of first storage nodes, and a plurality of second storage nodes. The plurality of first storage nodes are configured to store different file blocks of a stored file in a distributed manner, and the plurality of second storage nodes are configured to store, in a distributed manner, the check code blocks obtained by encoding the different file blocks. Each first storage node includes a first data processing device, each second storage node includes a second data processing device, and each first data processing device and each second data processing device is in communication connection with the named node;
  • the first data processing device is the data processing device according to the third aspect or any implementation of the third aspect;
  • the second data processing device is the data processing device according to the fourth aspect or any implementation of the fourth aspect.
  • the data recovery method provided by the embodiment of the present invention is applied to a distributed storage system that includes a named node, a plurality of first storage nodes, and a plurality of second storage nodes. The plurality of first storage nodes are configured to store different file blocks of a stored file in a distributed manner, and the plurality of second storage nodes are configured to store, in a distributed manner, the check code blocks obtained by encoding the different file blocks. Each of the first storage nodes includes a data processing device, and each data processing device is in communication connection with the named node. The method comprises: the data processing device receives a file block acquisition request sent by user equipment, where the file block acquisition request carries an identifier of the target file; the data processing device determines, according to the identifier of the target file, that the target file block is lost; and the data processing device acquires from the named node the identifier of the storage node where the recovery dependent data block is located.
  • the recovery dependent data block includes a dependent file block and a dependent check code block required to recover the target file block; a part of the check codes in the dependent check code block is obtained by encoding some of the file blocks of the target file, and the remaining check codes in the dependent check code block, that is, the check codes other than that part, are obtained by encoding every file block of the target file; the target file is the file to which the target file block belongs; and the data processing apparatus recovers the target file block according to the dependent file block and the dependent check code block.
  • because the check code block is obtained by combining partial byte encoding and full byte encoding, the storage overhead remains low, and when data is recovered, part of the target file block depends only on some of the dependent file blocks, which reduces the network overhead of data recovery compared with the prior art.
  • FIG. 1 is a schematic diagram of an embodiment of a distributed storage system according to an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of another embodiment of a distributed storage system according to an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of an embodiment of a method for data storage in an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of an example of a scenario in an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of an embodiment of a method for data recovery in an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of another embodiment of a method for data recovery in an embodiment of the present invention.
  • FIG. 7 is a schematic diagram of another embodiment of a method for data storage in an embodiment of the present invention.
  • FIG. 8 is a schematic diagram of an embodiment of a data processing apparatus according to an embodiment of the present invention.
  • FIG. 9 is a schematic diagram of an embodiment of a data processing apparatus according to an embodiment of the present invention.
  • FIG. 10 is a schematic diagram of an embodiment of a data processing apparatus according to an embodiment of the present invention.
  • FIG. 11 is a schematic diagram of an embodiment of a data processing apparatus according to an embodiment of the present invention.
  • FIG. 12 is a schematic diagram of an embodiment of a data processing apparatus according to an embodiment of the present invention.
  • FIG. 13 is a schematic diagram of an embodiment of a data processing apparatus according to an embodiment of the present invention.
  • FIG. 14 is a schematic diagram of an embodiment of a data processing apparatus according to an embodiment of the present invention.
  • the embodiment of the invention provides a data recovery method, which can reduce network overhead during data recovery under the premise of low storage overhead.
  • the embodiment of the invention also provides a corresponding data storage method, corresponding device and system. The details are described below separately.
  • FIG. 1 is a schematic diagram of an embodiment of a distributed storage system according to an embodiment of the present invention.
  • the distributed storage system includes a named node (NameNode) and a plurality of storage nodes (Nodes), each of which is in communication connection with the named node.
  • the named node and the storage nodes can communicate through a switch.
  • FIG. 2 is a schematic diagram of another embodiment of a distributed storage system according to an embodiment of the present invention.
  • FIG. 2 shows that three storage nodes form a rack and communicate with each other through a switch; the racks communicate with each other through a higher-bandwidth switch; the NameNode manages the metadata of the entire cluster and connects directly to the upper-layer switch.
  • the NameNode, storage node, switch, and rack form a distributed storage cluster.
  • the metadata refers to the correspondence between each file block of a file and its storage path.
  • a file can be distributedly stored on multiple storage nodes. For example, on five storage nodes, the file has five file blocks, and the data content of each file block is different.
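  • the correspondence between file blocks and storage paths that the NameNode maintains can be pictured with a small lookup table. The following Python sketch is only illustrative: the class names, the file identifier, and the path layout are hypothetical and do not come from the source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BlockLocation:
    node_id: str  # identifier of the storage node holding the block
    path: str     # storage path of the block on that node (hypothetical layout)


class Metadata:
    """Toy stand-in for the named node's file-block-to-storage-path table."""

    def __init__(self) -> None:
        self._table: dict[tuple[str, int], BlockLocation] = {}

    def register(self, file_id: str, block_index: int, location: BlockLocation) -> None:
        self._table[(file_id, block_index)] = location

    def locate(self, file_id: str, block_index: int) -> Optional[BlockLocation]:
        # Returns None when no such block is known, e.g. for an unknown file.
        return self._table.get((file_id, block_index))


meta = Metadata()
for i in range(5):  # five file blocks on five storage nodes, as in the example above
    meta.register("file-1", i, BlockLocation(f"N{i + 1}", f"/data/file-1/block-{i}"))

print(meta.locate("file-1", 2))
```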
  • the use of the user equipment for the distributed storage system includes two aspects of data storage and data readout, and the user accesses the distributed storage system through the network when storing or reading data.
  • the storage node provided by the embodiment of the present invention may be an independent physical host, or may be a virtual machine located on one or more physical hosts.
  • FIG. 3 is a schematic diagram of an embodiment of a method for data storage according to an embodiment of the present invention.
  • the distributed storage system of the embodiment of the present invention includes a named node, a plurality of first storage nodes, and a plurality of second storage nodes. The plurality of first storage nodes are configured to store different file blocks of the file to be stored in a distributed manner, and the plurality of second storage nodes are configured to store, in a distributed manner, the check code blocks obtained by encoding the different file blocks. Each of the second storage nodes includes a data processing device, and each data processing device is in communication connection with the named node, as shown in FIG.
  • there are ten first storage nodes, whose identifiers are N1, N2, ..., N10, and four second storage nodes, whose identifiers are N11 to N14. Of course, the example in FIG. 3 is only an example; in practice, a distributed storage system has many first storage nodes and second storage nodes, and each storage node has its own identifier.
  • the named node maintains the storage space of each first storage node and each second storage node.
  • when the named node allocates first storage nodes to the target file, it also maintains the identifier of the target file.
  • the user equipment receives the file storage response sent by the named node; the file storage response carries the identifiers of the ten first storage nodes N1 to N10, and the user equipment splits the target file into ten file blocks and stores them on the ten first storage nodes respectively.
  • alternatively, the user equipment may not split the target file in advance but store it in a round-robin manner.
  • the file blocks on the ten first storage nodes may be the same size or different sizes, which is not limited here.
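  • the splitting step can be sketched as follows. This is a minimal Python illustration under assumptions: the function name is hypothetical, the blocks are contiguous slices, and the last blocks may be shorter or empty, which is consistent with the statement that block sizes may differ.

```python
def split_into_blocks(data: bytes, node_ids: list[str]) -> dict[str, bytes]:
    """Split `data` into one contiguous file block per first storage node."""
    k = len(node_ids)
    size = -(-len(data) // k)  # ceiling division: bytes per block
    return {
        node_id: data[i * size:(i + 1) * size]
        for i, node_id in enumerate(node_ids)
    }


first_nodes = [f"N{i}" for i in range(1, 11)]  # N1 .. N10
target_file = bytes(range(25))                 # stand-in for the target file content
file_blocks = split_into_blocks(target_file, first_nodes)

for node_id in first_nodes:
    print(node_id, len(file_blocks[node_id]))
```

  • concatenating the blocks in node order reproduces the original file, so the split loses no data.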
  • the identifiers of the four second storage nodes are N11 to N14; the named node then sends the identifiers of the ten first storage nodes N1 to N10 and the identifier of the target file to the data processing devices in N11 to N14.
  • the identifiers of the ten first storage nodes are the identifiers of the target storage nodes.
  • the data processing apparatus uses two encoding functions for encoding each file block in the target file, the first is a partial byte encoding function, and the second is a full byte encoding function.
  • the partial byte encoding function is a function that encodes some of the file blocks in the target file to obtain an encoding result.
  • the full byte encoding function is a function that encodes every file block in the target file to obtain an encoding result.
  • the data processing apparatus encodes a part of the file blocks in the target file according to the identifier of the target storage node and a partial byte encoding function to obtain a first check code.
  • the data processing apparatus encodes each file block in the target file according to the identifier of the target storage node and a full byte encoding function to obtain a second check code.
  • the data processing device stores the first check code and the second check code in a storage space of a second storage node to which the data processing device belongs.
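  • the encoding performed by one second storage node can be sketched as follows. This is a hedged illustration only: the GF(2^8) arithmetic, the coefficient values, the rule that even byte offsets use the partial function and odd offsets use the full function, and all function names are assumptions rather than details stated in the source.

```python
def gf_mul(x: int, y: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x^2 + 1 (assumed field)."""
    result = 0
    while y:
        if y & 1:
            result ^= x
        x <<= 1
        if x & 0x100:
            x ^= 0x11D
        y >>= 1
    return result


def weighted_sum(coeffs: list[int], data_bytes: list[int]) -> int:
    """Linear combination of bytes; addition in GF(2^8) is XOR."""
    acc = 0
    for c, d in zip(coeffs, data_bytes):
        acc ^= gf_mul(c, d)
    return acc


def encode_check_block(blocks, partial_idx, partial_coeffs, full_coeffs) -> bytes:
    """Build one check code block from k equal-length file blocks.

    Even offsets hold first check codes (partial function over `partial_idx`),
    odd offsets hold second check codes (full function over every block).
    """
    length = len(blocks[0])
    out = bytearray(length)
    for n in range(length):
        if n % 2 == 0:
            out[n] = weighted_sum(partial_coeffs, [blocks[i][n] for i in partial_idx])
        else:
            out[n] = weighted_sum(full_coeffs, [blk[n] for blk in blocks])
    return bytes(out)


# Ten two-byte file blocks standing in for the blocks fetched from N1 .. N10.
blocks = [bytes([i + 1, i + 11]) for i in range(10)]
check_n11 = encode_check_block(blocks, [0, 1, 2, 3], [1, 2, 3, 4], list(range(1, 11)))
print(check_n11.hex())
```

  • the resulting block is what the data processing device would write to the storage space of its own second storage node.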
  • the first storage nodes are used to store the target file;
  • the second storage nodes are used to store the user-defined check code blocks for the target file.
  • a1, a2, ..., a10 are the nth bytes in the different file blocks of the target file on the ten first storage nodes N1, N2, ..., N10, respectively;
  • b1, b2, ..., b10 are the (n+1)th bytes in the different file blocks of the target file on the ten first storage nodes N1, N2, ..., N10, respectively;
  • c11, c21, c31, c41 are the nth bytes on the second storage nodes N11, N12, N13, N14, respectively, whose values are calculated by the partial byte encoding functions g1, g2, g3, g4;
  • c12, c22, c32, c42 are the (n+1)th bytes on the second storage nodes N11, N12, N13, N14, respectively, whose values are calculated by the full byte encoding functions f1, f2, f3, f4.
  • $g_1(a_1, a_2, a_3, a_4) = B_{11} a_1 + B_{12} a_2 + B_{13} a_3 + B_{14} a_4$;
  • $g_2(a_3, a_4, a_5, a_6) = B_{23} a_3 + B_{24} a_4 + B_{25} a_5 + B_{26} a_6$;
  • $f_1(b_1, b_2, \ldots, b_9, b_{10}) = B_{11} b_1 + B_{12} b_2 + \cdots + B_{19} b_9 + B_{1,10} b_{10}$;
  • the present invention uses partial byte encoding for group a, with only a portion of the bytes as input parameters for its encoding function, and full byte encoding for group b, with all bytes as input parameters for its encoding function.
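  • the effect of mixing the two kinds of encoding on recovery traffic can be estimated with simple arithmetic. The Python sketch below is a rough count under stated assumptions: one lost byte is solved from a single encoding function, half of the bytes in a block belong to group a and half to group b, and the function name is hypothetical.

```python
def bytes_fetched_per_lost_byte(params_in_function: int) -> int:
    """Bytes read to solve one encoding equation for a single lost byte.

    The other (params - 1) data bytes are needed, plus the one check byte.
    """
    return (params_in_function - 1) + 1


partial = bytes_fetched_per_lost_byte(4)   # group a: g uses 4 of the 10 file blocks
full = bytes_fetched_per_lost_byte(10)     # group b: f uses all 10 file blocks
mixed_average = (partial + full) / 2       # assumes half the bytes are in each group

print(partial, full, mixed_average)
```

  • under these assumptions, a lost byte costs 7 fetched bytes on average, compared with 10 if every byte used a full byte encoding function.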
  • the values in the coding matrix are not limited, as long as an inverse matrix exists; the coefficients required by the encoding functions g1 to g5 and f1 to f4 are taken from this matrix.
  • the byte parameters of two immediately adjacent partial byte encoding functions overlap by two bytes.
  • in this way, each byte is guaranteed to be encoded into multiple check codes, so that even if multiple file blocks of the target file are lost, the lost bytes can still be recovered normally.
  • two immediately adjacent partial byte encoding functions are the function pairs with the largest number of overlapping parameters, namely g1 and g2, g2 and g3, and g3 and g4.
  • the number of byte parameters in a specific partial byte encoding function and the number of bytes repeated between two adjacent partial byte encoding functions may be determined based on the number of first storage nodes and the number of second storage nodes of the target file, ensuring that each byte is encoded into at least two check code blocks.
  • the ten bytes of a 1, a 2, ... , a 9, a 10 may have five partial byte encoding functions in the case where two bytes are overlapped by two adjacent functions.
  • g 1 (a 1 , a 2 , a 3 , a 4 ), g 2 (a 3 , a 4 , a 5 , a 6 ), (a 5 , a 6 , a 7 , a 8 ), g 4 (a 7 , a 8 , a 9 , a 10 ), (a 9 , a 10 , a 1 , a 2 ), guaranteeing that each of a 1 , a 2 , ..., a 9 , a 10 is It has been compiled into two check code blocks. In fact, there can be three overlapping bytes, so that each byte of a 1 , a 2 ,..., a 9 , a 10 is encoded into 3 check code blocks, which is more reliable, but the storage overhead. Will increase.
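  • The grouping above can be reproduced with a short sketch. The function name `parameter_groups` and the wrap-around rule (the last group wraps back to a1 and a2) are illustrative assumptions drawn from the five-function example, not a procedure stated in the source.

```python
from collections import Counter


def parameter_groups(k: int, width: int, overlap: int):
    """Return 1-based byte indices used by each partial byte encoding function.

    k: number of first storage nodes, width: parameters per function,
    overlap: parameters shared by two immediately adjacent functions.
    """
    step = width - overlap
    if step <= 0 or k % step != 0:
        raise ValueError("width/overlap do not tile k bytes evenly")
    return [
        [(start + j) % k + 1 for j in range(width)]
        for start in range(0, k, step)
    ]


def coverage(groups):
    """Count how many partial functions each byte index appears in."""
    return Counter(index for group in groups for index in group)


groups = parameter_groups(k=10, width=4, overlap=2)
counts = coverage(groups)
```

  • For ten bytes, four parameters per function, and a two-byte overlap, `groups` lists the five parameter sets of the example and `counts` shows every byte appearing in exactly two of them.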
  • the following example explains why, in the partial byte encoding functions of the above example of the present invention, 4 bytes are selected as the bytes to be encoded and the overlap length is two bytes.
  • nodeNum is the number of storage nodes that need to participate in partial byte encoding
  • len is the length of the overlapping portion of the node
  • the number of second storage nodes is represented by r
  • the number of first storage nodes is represented by k
  • nodeNum and Len are the number of storage nodes that need to participate in partial byte encoding
  • the check code is stored in the storage space of the second storage node where the data processing device is located.
  • the data processing devices in N11, N12, N13, and N14 each encode according to their own partial byte encoding function and full byte encoding function, and can encode in parallel, which improves encoding efficiency.
  • the distributed storage system shown in FIG. 5 includes a named node, a plurality of first storage nodes, and a plurality of second storage nodes. The first storage nodes are configured to store, in a distributed manner, different file blocks of a stored file; the second storage nodes are configured to store, in a distributed manner, the check code blocks obtained by encoding the different file blocks. Each first storage node includes a data processing device, and each data processing device is communicatively connected to the named node.
  • the target file is stored on the ten first storage nodes N1, N2, ..., N10, and the check codes of the target file are stored on the four second storage nodes N11, N12, N13, and N14.
  • when the user wants to use the target file, the user equipment sends a target file acquisition request to the named node; the target file acquisition request carries the identifier of the target file.
  • the named node looks up, according to the identifier of the target file, the association established when the target file was stored between the identifier of the target file and the identifiers of the first storage nodes, and determines that the target file is stored on the ten first storage nodes N1, N2, ..., N10. The named node then returns the identifiers of the ten first storage nodes N1, N2, ..., N10 to the user equipment.
  • the user equipment sends a file block acquisition request to each of the ten first storage nodes N1, N2, ..., N10 according to their identifiers; the file block acquisition request carries the identifier of the target file.
  • if the file blocks of the target file stored on N1, N2, ..., N10 are not lost, the corresponding file blocks are returned to the user equipment. In the scenario example of the present invention, however, the file block on the first storage node N1 is lost, which means that a1, b1, and the other related bytes stored on N1 are lost, so the lost file block must be recovered before it is returned to the user equipment.
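  • The lookup and retrieval flow above can be sketched with in-memory dictionaries. The class `NamedNode`, the function `request_blocks`, and the dictionary layout are hypothetical stand-ins for the network messages of the embodiment; the sketch only shows how a missing block on N1 would be detected.

```python
from typing import Dict, List, Tuple


class NamedNode:
    """Keeps the association between a file identifier and its storage nodes."""

    def __init__(self) -> None:
        self._locations: Dict[str, List[str]] = {}

    def register(self, file_id: str, node_ids: List[str]) -> None:
        # Established when the target file is stored.
        self._locations[file_id] = list(node_ids)

    def locate(self, file_id: str) -> List[str]:
        return list(self._locations[file_id])


def request_blocks(
    named_node: NamedNode,
    nodes: Dict[str, Dict[str, bytes]],
    file_id: str,
) -> Tuple[Dict[str, bytes], List[str]]:
    """Ask every first storage node for its block; report nodes with no block."""
    found: Dict[str, bytes] = {}
    missing: List[str] = []
    for node_id in named_node.locate(file_id):
        block = nodes[node_id].get(file_id)
        if block is None:
            missing.append(node_id)  # this block must be recovered first
        else:
            found[node_id] = block
    return found, missing


node_ids = [f"N{i}" for i in range(1, 11)]
named_node = NamedNode()
named_node.register("target", node_ids)
nodes = {nid: {"target": nid.encode()} for nid in node_ids}
del nodes["N1"]["target"]  # simulate the lost file block on N1
found, missing = request_blocks(named_node, nodes, "target")
```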
  • the process of data recovery in the embodiment of the present invention is described below by taking the recovery of a1 and b1 as an example.
  • the four values a2, a3, a4, and c11 are obtained and decoded to get a1.
  • recovering b1 can also use any of the other full byte encoding functions, but the principle is the same.
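  • A minimal sketch of this decoding step follows. It solves c11 = B11*a1 + B12*a2 + B13*a3 + B14*a4 for a1, assuming GF(2^8) arithmetic (polynomial 0x11D), which the source does not specify; the coefficients and byte values are placeholders, and `recover_byte` is a hypothetical helper name.

```python
def gf_mul(x: int, y: int) -> int:
    """Multiply two bytes in GF(2^8) (assumed field, polynomial 0x11D)."""
    result = 0
    while y:
        if y & 1:
            result ^= x
        x <<= 1
        if x & 0x100:
            x ^= 0x11D
        y >>= 1
    return result


def gf_inv(x: int) -> int:
    """Multiplicative inverse in GF(2^8) by exhaustive search."""
    if x == 0:
        raise ZeroDivisionError("0 has no inverse")
    return next(v for v in range(1, 256) if gf_mul(x, v) == 1)


def recover_byte(coeffs, values, lost_index, check):
    """Solve check = sum(coeffs[i] * values[i]) for values[lost_index]."""
    acc = check
    for i, (c, v) in enumerate(zip(coeffs, values)):
        if i != lost_index:
            acc ^= gf_mul(c, v)  # subtraction equals XOR in GF(2^8)
    return gf_mul(acc, gf_inv(coeffs[lost_index]))


B1 = [3, 7, 11, 19]            # hypothetical B11 ... B14
a = [0x41, 0x42, 0x43, 0x44]   # a1 ... a4 before the loss
c11 = 0
for coeff, value in zip(B1, a):
    c11 ^= gf_mul(coeff, value)

# a1 is lost; only a2, a3, a4 and c11 are fetched.
a1 = recover_byte(B1, [None, a[1], a[2], a[3]], 0, c11)
```

  • Recovering b1 with a full byte function would use the same `recover_byte` call with ten coefficients and the nine surviving bytes b2 to b10.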
  • the data storage method provided by the embodiment of the present invention adopts a method of mixing partial byte coding and full byte coding, improves byte reliability, reduces storage space, and improves coding efficiency.
  • during data recovery, bytes encoded by the partial byte encoding function can be decoded without acquiring every byte, which reduces the network overhead of data recovery.
  • an embodiment of a data recovery method provided by an embodiment of the present invention includes:
  • the data processing device receives a file block acquisition request sent by the user equipment, where the file block acquisition request carries an identifier of the target file. The data processing device is applied to a distributed storage system that includes a named node, a plurality of first storage nodes, and a plurality of second storage nodes; the first storage nodes are configured to store, in a distributed manner, different file blocks of a stored file, and the second storage nodes are configured to store, in a distributed manner, the check code blocks obtained by encoding the different file blocks. Each first storage node includes a data processing device, and each data processing device is communicatively connected to the named node.
  • when the data processing apparatus does not find the target file block according to the identifier of the target file, it determines that the target file block is lost.
  • the data processing device acquires, from the named node, an identifier of the target storage node where the recovery dependent data block is located, and obtains the recovery dependent data block according to the identifier of the target storage node and the identifier of the target file.
  • the recovery dependent data block includes a dependent file block and a dependent check code block required to recover the target file block, and a part of the check code in the check code block is obtained by encoding a partial file block of the target file.
  • the remaining part of the check code in the check code block is obtained by encoding each file block of the target file, and the remaining part check code is a check code other than the part of the check code.
  • the target file is a file to which the target file block belongs.
  • the data processing apparatus recovers the target file block according to the dependent file block and the dependent check code block.
  • compared with the prior art, the check code block in the data recovery method provided by the embodiment of the present invention is obtained by combining partial byte encoding and full byte encoding, which reduces the storage overhead; in addition, recovering part of the target file block only needs to rely on part of the dependent file blocks, which reduces the network overhead during data recovery.
  • the data processing apparatus recovering the target file block according to the dependent file block and the dependent check code block may include:
  • the data processing apparatus recovers a first byte in the target file block according to a partial byte encoding function, the dependent file block, and the dependent check code block, where the partial byte encoding function is a function that encodes a partial file block in the target file to obtain an encoding result;
  • the data processing apparatus recovers a second byte in the target file block according to a full byte encoding function, the dependent file block, and the dependent check code block, where the full byte encoding function is a function that encodes each file block in the target file to obtain an encoding result.
  • in the second optional embodiment of the data recovery method provided by the embodiment of the present invention, the data processing apparatus recovering the first byte in the target file block according to the partial byte encoding function, the dependent file block, and the dependent check code block may include:
  • the data processing device acquires, from the dependent file block corresponding to the first encoding parameter, the dependent bytes required to recover the first byte, and acquires, from the dependent check code block corresponding to the first encoding result, the check code required to recover the first byte, where the first encoding parameter is an encoding parameter in the partial byte encoding function, and the first encoding result is the result of using the partial byte encoding function to encode the dependent bytes indicated by the first encoding parameter together with the first byte;
  • the data processing apparatus decodes the check code required to recover the first byte, using the dependent bytes required to recover the first byte, to obtain the first byte.
  • on the basis of the foregoing first or second optional embodiment corresponding to FIG., the data processing apparatus recovering the second byte in the target file block according to the full byte encoding function, the dependent file block, and the dependent check code block may include:
  • the data processing device acquires, from the dependent file block corresponding to the second encoding parameter, the dependent bytes required to recover the second byte, and acquires, from the dependent check code block corresponding to the second encoding result, the check code required to recover the second byte, where the second encoding parameter is an encoding parameter in the full byte encoding function, and the second encoding result is the result of using the full byte encoding function to encode the dependent bytes indicated by the second encoding parameter together with the second byte;
  • the data processing apparatus decodes the check code required to recover the second byte, using the dependent bytes required to recover the second byte, to obtain the second byte.
  • the first encoding parameter refers to a parameter in a partial byte encoding function, for example, a 1 , a 2 , a 3 in the g 1 function in the scene example of FIG. 3 .
  • the second encoding parameter refers to a parameter in the full byte encoding function, for example b1, b2, ..., b9, b10 in the f function.
  • the first byte can be understood by referring to the nth byte in group a.
  • the second byte can be understood by referring to the (n+1)th byte in group b.
  • the first byte is a byte encoded with the partial byte encoding function, and the second byte is a byte encoded with the full byte encoding function.
  • the dependent file block and the dependent check code block are the surviving file blocks and check code blocks required to recover the lost file block. For example, restoring the file block of the target file in the first storage node N1 depends on the relevant file blocks of the target file in N2, N3, and N4 and the check code block of the target file in N11.
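  • The dependency sets can be made concrete with a small sketch. It assumes the five overlapping parameter groups of the earlier example and a hypothetical helper `recovery_dependencies`; it only lists which first storage nodes must be read, and each case additionally needs one check byte from a second storage node (N11 in the example above).

```python
def recovery_dependencies(lost: int, groups, k: int):
    """List the surviving first storage nodes needed to recover node `lost`.

    groups: 1-based parameter indices of each partial byte encoding function.
    Returns the nodes needed for a partially encoded byte and for a fully
    encoded byte.
    """
    partial_group = next((g for g in groups if lost in g), None)
    if partial_group is None:
        raise ValueError(f"node {lost} is not covered by any partial function")
    return {
        "partial": [n for n in partial_group if n != lost],
        "full": [n for n in range(1, k + 1) if n != lost],
    }


# Parameter groups of g1 ... g5 from the ten-node example.
groups = [[1, 2, 3, 4], [3, 4, 5, 6], [5, 6, 7, 8], [7, 8, 9, 10], [9, 10, 1, 2]]
deps = recovery_dependencies(lost=1, groups=groups, k=10)
```

  • With N1 lost, a byte encoded by g1 needs data from only three nodes (N2, N3, N4), while a byte encoded by a full byte function needs data from the nine surviving nodes N2 to N10. That gap is the network saving the description refers to.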
  • an embodiment of a method for data storage provided by an embodiment of the present invention includes:
  • the data processing device receives the identifiers of a plurality of target storage nodes and the identifier of the target file sent by the named node, where the plurality of target storage nodes are first storage nodes that have stored different file blocks of the target file.
  • the data processing device is applied to a distributed storage system that includes a named node, a plurality of first storage nodes, and a plurality of second storage nodes; the first storage nodes are configured to store, in a distributed manner, different file blocks of a stored file, and the second storage nodes are configured to store, in a distributed manner, the check code blocks obtained by encoding the different file blocks. Each second storage node includes a data processing device, and each data processing device is communicatively connected to the named node.
  • the data processing apparatus encodes a partial file block in the target file according to the identifier of the target storage node and a partial byte encoding function, to obtain a first check code, where the partial byte encoding function is a function that encodes a partial file block in the target file to obtain an encoding result.
  • the data processing apparatus encodes each file block in the target file according to the identifier of the target storage node and a full-byte encoding function, to obtain a second check code, where the full-byte encoding function is A function that encodes each file block in the object file to obtain an encoded result.
  • the data processing apparatus stores the first check code and the second check code in a storage space of a second storage node to which the data processing device belongs.
  • the data storage method provided by the embodiment of the present invention adopts a method of mixing partial byte coding and full byte coding to improve byte reliability, reduce storage space, and improve coding efficiency.
  • during data recovery, bytes encoded by the partial byte encoding function can be decoded without acquiring every byte, which reduces the network overhead of data recovery.
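  • The storage steps can be sketched as one second storage node producing its check code block from the k file blocks. Two assumptions are made for illustration only: bytes at even offsets play the role of the nth byte (partial byte encoding) and bytes at odd offsets the role of the (n+1)th byte (full byte encoding), and arithmetic is in GF(2^8) with polynomial 0x11D. The name `encode_check_block` and the coefficient row are hypothetical.

```python
def gf_mul(x: int, y: int) -> int:
    """Multiply two bytes in GF(2^8) (assumed field, polynomial 0x11D)."""
    result = 0
    while y:
        if y & 1:
            result ^= x
        x <<= 1
        if x & 0x100:
            x ^= 0x11D
        y >>= 1
    return result


def combine(coeffs, values) -> int:
    """Linear combination with XOR as addition."""
    acc = 0
    for c, v in zip(coeffs, values):
        acc ^= gf_mul(c, v)
    return acc


def encode_check_block(blocks, row, partial_idx) -> bytes:
    """Build one check code block from k equal-length file blocks.

    row: k coefficients for this second storage node.
    partial_idx: 0-based indices of the blocks its partial function reads.
    """
    length = len(blocks[0])
    if any(len(blk) != length for blk in blocks):
        raise ValueError("file blocks must have equal length")
    out = bytearray(length)
    for pos in range(length):
        column = [blk[pos] for blk in blocks]
        if pos % 2 == 0:  # assumed: even offsets use partial byte encoding
            out[pos] = combine([row[i] for i in partial_idx],
                               [column[i] for i in partial_idx])
        else:             # assumed: odd offsets use full byte encoding
            out[pos] = combine(row, column)
    return bytes(out)


blocks = [bytes([i, i + 1]) for i in range(1, 11)]  # ten 2-byte file blocks
check_block = encode_check_block(blocks, row=[1] * 10, partial_idx=[0, 1, 2, 3])
```

  • Because each second storage node uses only its own coefficient row and parameter group, the four nodes can run this computation independently and in parallel.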
  • the data processing apparatus encoding a partial file block in the target file according to the identifier of the target storage node and the partial byte encoding function to obtain the first check code may include:
  • the data processing apparatus acquires, from the target storage node corresponding to the first encoding parameter, the byte indicated by the first encoding parameter, where the first encoding parameter is each encoding parameter in the partial byte encoding function;
  • the data processing apparatus encodes the byte indicated by the first encoding parameter according to the partial byte encoding function to obtain the first check code.
  • on the basis of the foregoing embodiment or the first optional embodiment corresponding to FIG., the data processing apparatus encoding each file block in the target file according to the identifier of the target storage node and the full byte encoding function to obtain the second check code may include:
  • the data processing apparatus acquires, from the target storage node corresponding to the second encoding parameter, the byte indicated by the second encoding parameter, where the second encoding parameter is each encoding parameter in the full byte encoding function;
  • the data processing apparatus encodes the byte indicated by the second encoding parameter according to the full byte encoding function to obtain the second check code.
  • on the basis of the foregoing embodiment or the first optional embodiment corresponding to FIG., before the data processing apparatus encodes a partial file block in the target file according to the identifier of the target storage node and the partial byte encoding function to obtain the first check code, the method may further include:
  • the data processing device determines, according to the number of target storage nodes and the number of check nodes specified by the named node, the number of first parameters in the partial byte encoding function and the number of identical first parameters included in the partial byte encoding functions of two immediately adjacent check nodes, where the partial byte encoding functions of two immediately adjacent check nodes have the largest number of overlapping first parameters.
  • the first encoding parameter refers to a parameter in a partial byte encoding function, for example, a 1 , a 2 , a 3 in the g 1 function in the scene example of FIG. 3 .
  • the second encoding parameter refers to a parameter in the full byte encoding function, for example: b 1 , b 2 , ..., b 9 , b 10 in the f function
  • the first check code can be understood by referring to c11.
  • the second check code can be understood by referring to c21.
  • the first check code is generally a check code obtained by encoding with the partial byte encoding function.
  • the second check code is generally a check code obtained by encoding with the full byte encoding function.
  • in an embodiment, the data processing apparatus 30 is applied to a distributed storage system that includes a named node, a plurality of first storage nodes, and a plurality of second storage nodes; the first storage nodes are configured to store, in a distributed manner, different file blocks of a stored file, and the second storage nodes are configured to store, in a distributed manner, the check code blocks obtained by encoding the different file blocks.
  • each of the first storage nodes includes the data processing device, each data processing device is communicatively connected to the named node, and the data processing device 30 includes:
  • the receiving module 301 is configured to receive a file block obtaining request sent by the user equipment, where the file block obtaining request carries an identifier of the target file;
  • a determining module 302, configured to determine that the target file block is lost when the target file block is not found according to the identifier of the target file received by the receiving module 301;
  • the obtaining module 303 is configured to: after the determining module 302 determines that the target file block is lost, acquire, from the named node, an identifier of the target storage node where the recovery dependent data block is located, and acquire the recovery dependent data block according to the identifier of the target storage node and the identifier of the target file, where the recovery dependent data block includes the dependent file block and the dependent check code block required to recover the target file block, a part of the check codes in the check code block is obtained by encoding a partial file block of the target file, the remaining check codes in the check code block are obtained by encoding each file block of the target file, the remaining check codes are the check codes other than that part, and the target file is the file to which the target file block belongs;
  • the recovery module 304 is configured to restore the target file block according to the dependent file block and the dependent check code block acquired by the obtaining module 303.
  • compared with the prior art, during data recovery the data processing apparatus provided by the embodiment of the present invention does not need to acquire every byte to decode the bytes encoded by the partial byte encoding function, which reduces the network overhead of data recovery.
  • the recovery module 304 includes:
  • the first restoring unit 3041 is configured to recover the first byte in the target file block according to the partial byte encoding function, the dependent file block, and the dependent check code block, where the partial byte encoding function is a function that encodes a partial file block in the target file to obtain an encoding result;
  • a second restoring unit 3042, configured to recover a second byte in the target file block according to the full byte encoding function, the dependent file block, and the dependent check code block, where the full byte encoding function is a function that encodes each file block in the target file to obtain an encoding result.
  • the first restoring unit 3041 is configured to: acquire, from the dependent file block corresponding to the first encoding parameter, the dependent bytes required to recover the first byte, and acquire, from the dependent check code block corresponding to the first encoding result, the check code required to recover the first byte, where the first encoding parameter is an encoding parameter in the partial byte encoding function, and the first encoding result is the result of using the partial byte encoding function to encode the dependent bytes indicated by the first encoding parameter together with the first byte; and decode the check code required to recover the first byte, using the dependent bytes required to recover the first byte, to obtain the first byte.
  • the second restoring unit 3042 is configured to: acquire, from the dependent file block corresponding to the second encoding parameter, the dependent bytes required to recover the second byte, and acquire, from the dependent check code block corresponding to the second encoding result, the check code required to recover the second byte, where the second encoding parameter is an encoding parameter in the full byte encoding function, and the second encoding result is the result of using the full byte encoding function to encode the dependent bytes indicated by the second encoding parameter together with the second byte; and decode the check code required to recover the second byte, using the dependent bytes required to recover the second byte, to obtain the second byte.
  • the first encoding parameter refers to a parameter in a partial byte encoding function, for example a1, a2, a3, a4 in the g1 function in the scenario example of FIG. 3.
  • the second encoding parameter refers to a parameter in the full byte encoding function, for example b1, b2, ..., b9, b10 in the f function.
  • the first byte can be understood by referring to the nth byte in group a.
  • the second byte can be understood by referring to the (n+1)th byte in group b.
  • the first byte is a byte encoded with the partial byte encoding function.
  • the second byte is a byte encoded with the full byte encoding function.
  • the dependent file block and the dependent check code block are the surviving file blocks and check code blocks required to recover the lost file block. For example, restoring the file block of the target file in the first storage node N1 depends on the relevant file blocks of the target file in N2, N3, and N4 and the check code block of the target file in N11.
  • FIG. 8 or FIG. 9 can be understood by referring to the related description in the parts of FIG. 1 to FIG. 6 , and no further description is made herein.
  • in an embodiment, the data processing apparatus 40 is applied to a distributed storage system that includes a named node, a plurality of first storage nodes, and a plurality of second storage nodes; the first storage nodes are configured to store, in a distributed manner, different file blocks of a stored file, and the second storage nodes are configured to store, in a distributed manner, the check code blocks obtained by encoding the different file blocks.
  • each of the second storage nodes includes a data processing device, each data processing device is communicatively connected to the named node, and the data processing device 40 includes:
  • the receiving module 401 is configured to receive identifiers of a plurality of target storage nodes and an identifier of the target file sent by the named node, where the plurality of target storage nodes are first storage nodes that have stored different file blocks of the target file;
  • the first encoding module 402 is configured to encode a partial file block in the target file according to the identifier of the target storage node and a partial byte encoding function received by the receiving module 401, to obtain a first check code.
  • the partial byte encoding function is a function for encoding a partial file block in the target file to obtain a coding result;
  • a second encoding module 403 configured to encode each file block in the target file according to the identifier of the target storage node and the full byte encoding function received by the receiving module 401, to obtain a second check code
  • the full byte encoding function is a function that encodes each file block in the object file to obtain an encoding result
  • a storage scheduling module 404 configured to store the first check code encoded by the first encoding module 402 and the second check code encoded by the second encoding module 403 into the data processing device The storage space of the second storage node to which it belongs.
  • the data processing apparatus mixes partial byte encoding and full byte encoding, which improves byte reliability, reduces storage space, and improves encoding efficiency.
  • during data recovery, bytes encoded by the partial byte encoding function can be decoded without acquiring every byte, which reduces the network overhead of data recovery.
  • the first encoding module 402 is configured to acquire a byte indicated by the first encoding parameter from a target storage node corresponding to the first encoding parameter, where the first encoding parameter is the partial byte encoding function Each of the encoding parameters; encoding the byte indicated by the first encoding parameter according to the partial byte encoding function to obtain a first verification code.
  • the second encoding module is configured to acquire a byte indicated by the second encoding parameter from a target storage node corresponding to the second encoding parameter, where the second encoding parameter is in the full-byte encoding function Each encoding parameter; encoding a byte indicated by the second encoding parameter according to the full byte encoding function to obtain a second parity code.
  • the data processing device 40 further includes:
  • a determining module 405, configured to determine, according to the number of target storage nodes received by the receiving module 401 and the number of check nodes specified by the named node, the number of first parameters in the partial byte encoding function and the number of identical first parameters included in the partial byte encoding functions of two immediately adjacent check nodes, where the partial byte encoding functions of two immediately adjacent check nodes have the largest number of overlapping first parameters.
  • the first encoding parameter refers to a parameter in a partial byte encoding function, for example, a 1 , a 2 in the g 1 function in the scene example of FIG. 3 .
  • the second encoding parameter refers to a parameter in the full byte encoding function, for example: b 1 , b 2 , ..., b 9 , b 10 in the f function
  • the first check code can be understood by referring to c11.
  • the second check code can be understood by referring to c21.
  • the first check code is generally a check code obtained by encoding with the partial byte encoding function.
  • the second check code is generally a check code obtained by encoding with the full byte encoding function.
  • for the embodiment or any optional embodiment of FIG. 10 and FIG. 11, reference may be made to the related descriptions of FIG. 1 to FIG. 5 and FIG., and details are not repeated here.
  • the receiving module and the obtaining module may be implemented by an input/output (I/O) device (such as a network card), and the determining module, the recovery module, the first encoding module, the second encoding module, and the storage scheduling module may be implemented by a processor executing a program or instructions in a memory (in other words, by a processor working together with special instructions in a memory coupled to the processor).
  • alternatively, the receiving module and the obtaining module may be implemented by an input/output (I/O) device (such as a network card), and the determining module, the recovery module, the first encoding module, the second encoding module, and the storage scheduling module may be implemented by a dedicated circuit.
  • alternatively, the receiving module and the obtaining module may be implemented by an input/output (I/O) device (such as a network card), and the determining module, the recovery module, the first encoding module, the second encoding module, and the storage scheduling module may be implemented by a field-programmable gate array (FPGA). For the specific implementation, refer to the prior art; details are not described here.
  • the present invention includes, but is not limited to, the foregoing implementation manners, and it should be understood that any solution implemented according to the idea of the present invention falls within the scope protected by the embodiments of the present invention.
  • a hardware structure of a data processing apparatus may include three parts: a transceiver device, a software device, and a hardware device.
  • the transceiver device is a hardware circuit used to send and receive packets.
  • hardware devices can also be called "hardware processing modules" or simply "hardware". Hardware devices mainly include hardware circuits that implement certain specific functions based on dedicated hardware such as FPGAs and ASICs (together with supporting devices such as memory). They often process much faster than general-purpose processors, but once their functions are customized they are difficult to change, so they are inflexible to implement and are usually used for fixed functions. It should be noted that, in practical applications, a hardware device may also include an MCU (microprocessor, such as a single-chip microcomputer) or a processor such as a CPU; the main function of these processors, however, is not to process large amounts of data but to perform some control. In this application scenario, the system built from these devices is regarded as a hardware device.
  • software devices mainly include general-purpose processors (such as CPUs) and their supporting devices (such as memory, hard disks, and other storage devices). The processor can be given the corresponding processing functions through programming. A software implementation can be flexibly configured according to service requirements, but is often slower than a hardware device. After the software finishes processing, the processed data can be sent through the transceiver device by way of the hardware device, or sent to the transceiver device through an interface connected to the transceiver device.
  • the transceiver device is configured to receive an identifier of the target file, an identifier of the target storage node, and the like.
  • the following describes in detail the technical solution in which the receiving module and the obtaining module are implemented by an input/output (I/O) device (such as a network card), and the determining module, the recovery module, the first encoding module, the second encoding module, and the storage scheduling module are implemented by a processor executing a program or instructions in a memory:
  • FIG. 13 is a schematic structural diagram of a data processing apparatus 30 according to an embodiment of the present invention.
  • the data processing device 30 is applied to a distributed storage system that includes a named node, a plurality of first storage nodes, and a plurality of second storage nodes; the first storage nodes are configured to store, in a distributed manner, different file blocks of a stored file, and the second storage nodes are configured to store, in a distributed manner, the check code blocks obtained by encoding the different file blocks. Each first storage node includes a data processing device, and each data processing device is communicatively connected to the named node.
  • the data processing apparatus 30 includes a processor 310, a memory 350, and an input/output I/O device 330.
  • the memory 350 can include read only memory and random access memory, and provides operational instructions and data to the processor 310.
  • a portion of memory 350 may also include non-volatile random access memory (NVRAM).
  • the memory 350 stores elements, executable modules or data structures, or a subset thereof, or their extended set:
  • the operation instruction can be stored in the operating system
  • the identifier of the target storage node where the recovery dependent data block is located is acquired from the named node through the I/O device 330, and the recovery dependent data block is acquired according to the identifier of the target storage node and the identifier of the target file. The recovery dependent data block includes the dependent file block and the dependent check code block required to recover the target file block. A part of the check codes in the check code block is obtained by encoding a partial file block of the target file, and the remaining check codes in the check code block are obtained by encoding each file block of the target file, the remaining check codes being the check codes other than that part. The target file is the file to which the target file block belongs;
  • Compared with the prior art, which cannot keep both the storage overhead and the network overhead of data recovery low, the data processing apparatus provided in this embodiment obtains the check code block by combining partial byte encoding with full-byte encoding, which reduces the storage overhead. During data recovery, part of the target file block can be rebuilt from only some of the dependent file blocks, which reduces the network overhead of data recovery.
  • The processor 310 controls the operation of the data processing apparatus 30. The processor 310 may also be referred to as a CPU (Central Processing Unit).
  • Memory 350 can include read only memory and random access memory and provides instructions and data to processor 310.
  • a portion of memory 350 may also include non-volatile random access memory (NVRAM).
  • the various components of the data processing device 30 are coupled together by a bus system 320, which may include, in addition to the data bus, a power bus, a control bus, a status signal bus, and the like. However, for clarity of description, various buses are labeled as bus system 320 in the figure.
  • The methods disclosed in the foregoing embodiments of the present invention may be applied to, or implemented by, the processor 310. The processor 310 may be an integrated circuit chip with signal processing capability. During implementation, each step of the foregoing methods may be completed by an integrated logic circuit of hardware in the processor 310 or by instructions in the form of software.
  • The processor 310 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. It may implement or perform the methods, steps, and logical block diagrams disclosed in the embodiments of the present invention.
  • The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
  • The steps of the methods disclosed in the embodiments of the present invention may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor.
  • the software module can be located in a conventional storage medium such as random access memory, flash memory, read only memory, programmable read only memory or electrically erasable programmable memory, registers, and the like.
  • the storage medium is located in the memory 350, and the processor 310 reads the information in the memory 350 and performs the steps of the above method in combination with its hardware.
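  • Optionally, the processor 310 is specifically configured to: recover the first byte in the target file block according to the partial byte encoding function, the dependent file block, and the dependent check code block, where the partial byte encoding function is a function that encodes some of the file blocks of the target file to obtain an encoding result; and recover the second byte in the target file block according to the full-byte encoding function, the dependent file block, and the dependent check code block, where the full-byte encoding function is a function that encodes every file block of the target file to obtain an encoding result.
  • Optionally, the processor 310 is specifically configured to: obtain, from the dependent file block corresponding to the first encoding parameter, the dependent bytes required to recover the first byte; obtain, from the dependent check code block corresponding to the first encoding result, the check code required to recover the first byte; and decode that check code according to those dependent bytes to obtain the first byte. The first encoding parameter is an encoding parameter in the partial byte encoding function, and the first encoding result is the result of encoding, with the partial byte encoding function, the dependent bytes indicated by the first encoding parameter together with the first byte.
  • Optionally, the processor 310 is specifically configured to: obtain, from the dependent file block corresponding to the second encoding parameter, the dependent bytes required to recover the second byte; obtain, from the dependent check code block corresponding to the second encoding result, the check code required to recover the second byte; and decode that check code according to those dependent bytes to obtain the second byte. The second encoding parameter is an encoding parameter in the full-byte encoding function, and the second encoding result is the result of encoding, with the full-byte encoding function, the dependent bytes indicated by the second encoding parameter together with the second byte.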
  • In the embodiment corresponding to FIG. 13 or its optional embodiments, the first encoding parameter refers to a parameter in the partial byte encoding function, for example a1, a2, a3, a4 in the g1 function of the scenario example of FIG. 3. The second encoding parameter refers to a parameter in the full-byte encoding function, for example b1, b2, ..., b9, b10 in the f function.
  • The first byte can be understood with reference to the n-th byte in group a, and the second byte with reference to the (n+1)-th byte in group b. In general, the first byte is a byte encoded with the partial byte encoding function, and the second byte is a byte encoded with the full-byte encoding function.
  • The dependent file block and the dependent check code block are the surviving file blocks and check code blocks that recovery of the lost file block depends on. For example, recovering the file block of the target file in the first storage node N1 depends on the related file blocks of the target file in N2, N3, and N4 and on the check code block of the target file in N11.
  • For the embodiment corresponding to FIG. 13 or any of its optional embodiments, reference may be made to the related descriptions of FIG. 1 to FIG. 5, FIG. 6, FIG. 8, and FIG. 9; details are not repeated here.
  • FIG. 14 is a schematic structural diagram of a data processing apparatus 40 according to an embodiment of the present invention.
  • The data processing apparatus 40 is applied to a distributed storage system that includes a named node, a plurality of first storage nodes, and a plurality of second storage nodes. The first storage nodes store different file blocks of a file to be stored in a distributed manner, and the second storage nodes store, in a distributed manner, the check code blocks obtained by encoding those file blocks. Each second storage node includes a data processing apparatus, and each data processing apparatus is communicatively connected to the named node.
  • The data processing apparatus 40 includes a processor 410, a memory 450, and an input/output (I/O) device 430. The memory 450 may include read-only memory and random access memory, and provides operation instructions and data to the processor 410.
  • a portion of the memory 450 may also include non-volatile random access memory (NVRAM).
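  • In some implementations, the memory 450 stores the following elements: executable modules or data structures, or a subset or an extended set thereof.
  • In this embodiment of the present invention, the operations of the data storage method are performed by invoking the operation instructions stored in the memory 450 (the operation instructions may be stored in the operating system).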
  • The data storage method provided in this embodiment of the present invention mixes partial byte encoding with full-byte encoding, which improves byte reliability while reducing storage space and improving encoding efficiency.
  • During data recovery, a byte encoded with the partial byte encoding function can be decoded without fetching every byte, which reduces the network overhead of data recovery.
  • The processor 410 controls the operation of the data processing apparatus 40. The processor 410 may also be referred to as a CPU (Central Processing Unit).
  • Memory 450 can include read only memory and random access memory and provides instructions and data to processor 410. A portion of the memory 450 may also include non-volatile random access memory (NVRAM).
  • the various components of the data processing device 40 are coupled together by a bus system 420 in a particular application.
  • the bus system 420 can include, in addition to the data bus, a power bus, a control bus, a status signal bus, and the like. However, for clarity of description, various buses are labeled as bus system 420 in the figure.
  • The methods disclosed in the foregoing embodiments of the present invention may be applied to, or implemented by, the processor 410. The processor 410 may be an integrated circuit chip with signal processing capability. During implementation, each step of the foregoing methods may be completed by an integrated logic circuit of hardware in the processor 410 or by instructions in the form of software.
  • The processor 410 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. It may implement or perform the methods, steps, and logical block diagrams disclosed in the embodiments of the present invention.
  • The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
  • The steps of the methods disclosed in the embodiments of the present invention may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor.
  • the software module can be located in a conventional storage medium such as random access memory, flash memory, read only memory, programmable read only memory or electrically erasable programmable memory, registers, and the like.
  • the storage medium is located in the memory 450, and the processor 410 reads the information in the memory 450 and completes the steps of the above method in combination with its hardware.
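  • Optionally, the processor 410 is specifically configured to: obtain, from the target storage nodes corresponding to the first encoding parameters, the bytes indicated by the first encoding parameters, where the first encoding parameters are the encoding parameters in the partial byte encoding function; and encode those bytes according to the partial byte encoding function to obtain the first check code.
  • Optionally, the processor 410 is specifically configured to: obtain, from the target storage nodes corresponding to the second encoding parameters, the bytes indicated by the second encoding parameters, where the second encoding parameters are the encoding parameters in the full-byte encoding function; and encode those bytes according to the full-byte encoding function to obtain the second check code.
  • Optionally, the processor 410 is specifically configured to: determine, according to the number of target storage nodes and the number of check nodes specified by the named node, the number of first parameters in the partial byte encoding function and the number of identical first parameters shared by the partial byte encoding functions of two adjacent check nodes, where two adjacent check nodes are those whose partial byte encoding functions have the largest number of overlapping first parameters.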
  • In the embodiment corresponding to FIG. 14 or its optional embodiments, the first encoding parameter refers to a parameter in the partial byte encoding function, for example a1, a2, a3, a4 in the g1 function of the scenario example of FIG. 3. The second encoding parameter refers to a parameter in the full-byte encoding function, for example b1, b2, ..., b9, b10 in the f function.
  • The first check code can be understood with reference to c11, and the second check code with reference to c12. In general, the first check code is a check code obtained with the partial byte encoding function, and the second check code is a check code obtained with the full-byte encoding function.
  • For the embodiment corresponding to FIG. 14 or any of its optional embodiments, reference may be made to the related descriptions of FIG. 1 to FIG. 5, FIG. 7, FIG. 10, and FIG. 11; details are not repeated here.
  • A distributed storage system includes a named node, a plurality of first storage nodes, and a plurality of second storage nodes. The first storage nodes store different file blocks of a stored file in a distributed manner, and the second storage nodes store, in a distributed manner, the check code blocks obtained by encoding those file blocks. Each first storage node includes a first data processing apparatus, each second storage node includes a second data processing apparatus, and each first and each second data processing apparatus is communicatively connected to the named node;
  • the first data processing device can be understood by referring to the description in the portion of FIG. 3.
  • the second data processing device can be understood by referring to the description in FIG.
  • Compared with the prior art, which cannot keep both the storage overhead and the network overhead of data recovery low, the check code block here is obtained by combining partial byte encoding with full-byte encoding, which reduces the storage overhead. During data recovery, part of the target file block can be rebuilt from only some of the dependent file blocks, which reduces the network overhead of data recovery.
  • the program may be stored in a computer readable storage medium, and the storage medium may include: ROM, RAM, disk or CD.

Abstract

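A data recovery method is applied to a distributed storage system that includes a named node, a plurality of first storage nodes, and a plurality of second storage nodes. Each first storage node includes a data processing apparatus, and each data processing apparatus is communicatively connected to the named node. The method includes: receiving, by the data processing apparatus, a file block acquisition request sent by a user equipment, where the request carries an identifier of a target file (101); determining, according to the identifier of the target file, that a target file block is lost (102); acquiring, from the named node, the identifier of the target storage node where a recovery dependent data block is located, and acquiring the recovery dependent data block according to the identifier of the target storage node and the identifier of the target file (103); and recovering the target file block (104). The method reduces storage overhead, and during data recovery part of the target file block can be obtained from only some of the dependent file blocks, which reduces the network overhead of data recovery.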

Description

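Data recovery method, data storage method, and corresponding apparatus and system
This application claims priority to Chinese Patent Application No. 201510504685.8, filed with the Chinese Patent Office on August 17, 2015 and entitled "Data recovery method, data storage method, and corresponding apparatus and system", which is incorporated herein by reference in its entirety.
Technical Field
The present invention relates to the field of data storage technologies, and in particular to a data recovery method, a data storage method, and a corresponding apparatus and system.
Background
In a large-capacity distributed storage system, a multi-replica scheme can be used to improve the reliability of data storage: the data on a disk is copied to several replica disks, and when any one disk fails, the data is read from any surviving disk and written to a new disk to complete the recovery. This technique is simple to implement and has the shortest recovery time, but its storage overhead is very high.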
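To address the high storage overhead of the multi-replica scheme, erasure coding based on Reed-Solomon (RS) codes was introduced. For example, RS(10,4) encodes the data of 10 disks and stores the encoding result on 4 redundant disks, so the storage overhead is (10+4)/10 = 1.4 times, markedly lower than that of the multi-replica scheme. However, when one disk fails, data must be read from 10 disks and decoded before the lost data can be recovered, whereas the multi-replica scheme completes recovery by reading from only 1 disk. The network bandwidth overhead of RS recovery is therefore 10 times that of the multi-replica scheme, and this high network bandwidth overhead is the drawback of the RS technique.
In short, distributed storage schemes in the prior art have either high storage overhead or high network overhead during data recovery.
Summary
To solve the prior-art problem of high network overhead during data recovery in a distributed storage system, embodiments of the present invention provide a data recovery method that reduces the network overhead of data recovery while keeping storage overhead low. Embodiments of the present invention further provide a corresponding data storage method, apparatus, and system.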
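The arithmetic behind this comparison can be checked with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the three-copy replica count is an assumption made for the example, since the text does not state how many replicas the multi-replica scheme keeps.

```python
def storage_overhead(k: int, r: int) -> float:
    """Stored bytes per byte of user data for k data disks plus r redundant disks."""
    return (k + r) / k


def rs_repair_reads(k: int) -> int:
    """Disks that must be read to rebuild one lost disk under RS(k, r)."""
    return k


def replica_repair_reads() -> int:
    """Disks that must be read to rebuild one lost disk under replication."""
    return 1


rs_overhead = storage_overhead(10, 4)   # the RS(10,4) example from the text
replica_overhead = 3.0                  # assumption: three full copies
read_ratio = rs_repair_reads(10) / replica_repair_reads()

print(f"RS(10,4) storage overhead: {rs_overhead:.1f}x")
print(f"assumed 3-replica storage overhead: {replica_overhead:.1f}x")
print(f"RS repair reads relative to replication: {read_ratio:.0f}x")
```

The two numbers pull in opposite directions, which is the trade-off the rest of the description sets out to soften.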
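A first aspect of the present invention provides a data recovery method. The method is applied to a distributed storage system that includes a named node, a plurality of first storage nodes, and a plurality of second storage nodes, where the first storage nodes store different file blocks of a stored file in a distributed manner, the second storage nodes store, in a distributed manner, the check code blocks obtained by encoding the different file blocks, each first storage node includes a data processing apparatus, and each data processing apparatus is communicatively connected to the named node. The method includes:
receiving, by the data processing apparatus, a file block acquisition request sent by a user equipment, where the file block acquisition request carries an identifier of a target file;
determining, by the data processing apparatus, that a target file block is lost when the target file is not found according to the identifier of the target file;
acquiring, by the data processing apparatus from the named node, the identifier of the target storage node where a recovery dependent data block is located, and acquiring the recovery dependent data block according to the identifier of the target storage node and the identifier of the target file, where the recovery dependent data block includes the dependent file block and the dependent check code block required to recover the target file block, part of the check codes in the dependent check code block are obtained by encoding some of the file blocks of the target file, the remaining check codes in the dependent check code block are obtained by encoding every file block of the target file, the remaining check codes are the check codes other than said part of the check codes, and the target file is the file to which the target file block belongs; and
recovering, by the data processing apparatus, the target file block according to the dependent file block and the dependent check code block.
With reference to the first aspect, in a first possible implementation, the recovering of the target file block according to the dependent file block and the dependent check code block includes:
recovering, by the data processing apparatus, a first byte in the target file block according to a partial byte encoding function, the dependent file block, and the dependent check code block, where the partial byte encoding function is a function that encodes some of the file blocks of the target file to obtain an encoding result; and
recovering, by the data processing apparatus, a second byte in the target file block according to a full-byte encoding function, the dependent file block, and the dependent check code block, where the full-byte encoding function is a function that encodes every file block of the target file to obtain an encoding result.
With reference to the first possible implementation of the first aspect, in a second possible implementation, the recovering of the first byte includes:
obtaining, by the data processing apparatus, the dependent bytes required to recover the first byte from the dependent file block corresponding to a first encoding parameter, and obtaining the check code required to recover the first byte from the dependent check code block corresponding to a first encoding result, where the first encoding parameter is an encoding parameter in the partial byte encoding function, and the first encoding result is the result of encoding, with the partial byte encoding function, the dependent bytes indicated by the first encoding parameter together with the first byte; and
decoding, by the data processing apparatus, the check code required to recover the first byte according to the dependent bytes required to recover the first byte, to obtain the first byte.
With reference to the first or second possible implementation of the first aspect, in a third possible implementation, the recovering of the second byte includes:
obtaining, by the data processing apparatus, the dependent bytes required to recover the second byte from the dependent file block corresponding to a second encoding parameter, and obtaining the check code required to recover the second byte from the dependent check code block corresponding to a second encoding result, where the second encoding parameter is an encoding parameter in the full-byte encoding function, and the second encoding result is the result of encoding, with the full-byte encoding function, the dependent bytes indicated by the second encoding parameter together with the second byte; and
decoding, by the data processing apparatus, the check code required to recover the second byte according to the dependent bytes required to recover the second byte, to obtain the second byte.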
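A second aspect of the present invention provides a data storage method. The method is applied to a distributed storage system that includes a named node, a plurality of first storage nodes, and a plurality of second storage nodes, where the first storage nodes store different file blocks of a file to be stored in a distributed manner, the second storage nodes store, in a distributed manner, the check code blocks obtained by encoding the different file blocks, each second storage node includes a data processing apparatus, and each data processing apparatus is communicatively connected to the named node. The method includes:
receiving, by the data processing apparatus, the identifiers of a plurality of target storage nodes and the identifier of a target file sent by the named node, where the target storage nodes are the first storage nodes that already store the different file blocks of the target file;
encoding, by the data processing apparatus, some of the file blocks of the target file according to the identifiers of the target storage nodes and a partial byte encoding function to obtain a first check code, where the partial byte encoding function is a function that encodes some of the file blocks of the target file to obtain an encoding result;
encoding, by the data processing apparatus, every file block of the target file according to the identifiers of the target storage nodes and a full-byte encoding function to obtain a second check code, where the full-byte encoding function is a function that encodes every file block of the target file to obtain an encoding result; and
storing, by the data processing apparatus, the first check code and the second check code in the storage space of the second storage node to which the data processing apparatus belongs.
With reference to the second aspect, in a first possible implementation, the encoding of some of the file blocks to obtain the first check code includes:
obtaining, by the data processing apparatus, the bytes indicated by first encoding parameters from the target storage nodes corresponding to the first encoding parameters, where the first encoding parameters are the encoding parameters in the partial byte encoding function; and
encoding, by the data processing apparatus, the bytes indicated by the first encoding parameters according to the partial byte encoding function to obtain the first check code.
With reference to the second aspect or its first possible implementation, in a second possible implementation, the encoding of every file block to obtain the second check code includes:
obtaining, by the data processing apparatus, the bytes indicated by second encoding parameters from the target storage nodes corresponding to the second encoding parameters, where the second encoding parameters are the encoding parameters in the full-byte encoding function; and
encoding, by the data processing apparatus, the bytes indicated by the second encoding parameters according to the full-byte encoding function to obtain the second check code.
With reference to the second aspect or its first possible implementation, in a third possible implementation, before the encoding of some of the file blocks to obtain the first check code, the method further includes:
determining, by the data processing apparatus according to the number of target storage nodes and the number of check nodes specified by the named node, the number of first parameters in the partial byte encoding function and the number of identical first parameters shared by the partial byte encoding functions of two adjacent check nodes, where two adjacent check nodes are those whose partial byte encoding functions have the largest number of overlapping first parameters.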
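A third aspect of the present invention provides a data processing apparatus applied to a distributed storage system that includes a named node, a plurality of first storage nodes, and a plurality of second storage nodes, where the first storage nodes store different file blocks of a stored file in a distributed manner, the second storage nodes store, in a distributed manner, the check code blocks obtained by encoding the different file blocks, each first storage node includes the data processing apparatus, and each data processing apparatus is communicatively connected to the named node. The data processing apparatus includes:
a receiving module, configured to receive a file block acquisition request sent by a user equipment, where the file block acquisition request carries an identifier of a target file;
a determining module, configured to determine that a target file block is lost when the target file is not found according to the identifier of the target file received by the receiving module;
an acquiring module, configured to: after the determining module determines that the target file block is lost, acquire from the named node the identifier of the target storage node where a recovery dependent data block is located, and acquire the recovery dependent data block according to the identifier of the target storage node and the identifier of the target file, where the recovery dependent data block includes the dependent file block and the dependent check code block required to recover the target file block, part of the check codes in the dependent check code block are obtained by encoding some of the file blocks of the target file, the remaining check codes in the dependent check code block are obtained by encoding every file block of the target file, the remaining check codes are the check codes other than said part of the check codes, and the target file is the file to which the target file block belongs; and
a recovery module, configured to recover the target file block according to the dependent file block and the dependent check code block acquired by the acquiring module.
With reference to the third aspect, in a first possible implementation, the recovery module includes:
a first recovery unit, configured to recover a first byte in the target file block according to a partial byte encoding function, the dependent file block, and the dependent check code block, where the partial byte encoding function is a function that encodes some of the file blocks of the target file to obtain an encoding result; and
a second recovery unit, configured to recover a second byte in the target file block according to a full-byte encoding function, the dependent file block, and the dependent check code block, where the full-byte encoding function is a function that encodes every file block of the target file to obtain an encoding result.
With reference to the first possible implementation of the third aspect, in a second possible implementation,
the first recovery unit is specifically configured to: obtain the dependent bytes required to recover the first byte from the dependent file block corresponding to a first encoding parameter, and obtain the check code required to recover the first byte from the dependent check code block corresponding to a first encoding result, where the first encoding parameter is an encoding parameter in the partial byte encoding function, and the first encoding result is the result of encoding, with the partial byte encoding function, the dependent bytes indicated by the first encoding parameter together with the first byte; and decode the check code required to recover the first byte according to those dependent bytes, to obtain the first byte.
With reference to the first or second possible implementation of the third aspect, in a third possible implementation,
the second recovery unit is specifically configured to: obtain the dependent bytes required to recover the second byte from the dependent file block corresponding to a second encoding parameter, and obtain the check code required to recover the second byte from the dependent check code block corresponding to a second encoding result, where the second encoding parameter is an encoding parameter in the full-byte encoding function, and the second encoding result is the result of encoding, with the full-byte encoding function, the dependent bytes indicated by the second encoding parameter together with the second byte; and decode the check code required to recover the second byte according to those dependent bytes, to obtain the second byte.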
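A fourth aspect of the present invention provides a data processing apparatus applied to a distributed storage system that includes a named node, a plurality of first storage nodes, and a plurality of second storage nodes, where the first storage nodes store different file blocks of a file to be stored in a distributed manner, the second storage nodes store, in a distributed manner, the check code blocks obtained by encoding the different file blocks, each second storage node includes a data processing apparatus, and each data processing apparatus is communicatively connected to the named node. The data processing apparatus includes:
a receiving module, configured to receive the identifiers of a plurality of target storage nodes and the identifier of a target file sent by the named node, where the target storage nodes are the first storage nodes that already store the different file blocks of the target file;
a first encoding module, configured to encode some of the file blocks of the target file according to the identifiers of the target storage nodes received by the receiving module and a partial byte encoding function to obtain a first check code, where the partial byte encoding function is a function that encodes some of the file blocks of the target file to obtain an encoding result;
a second encoding module, configured to encode every file block of the target file according to the identifiers of the target storage nodes received by the receiving module and a full-byte encoding function to obtain a second check code, where the full-byte encoding function is a function that encodes every file block of the target file to obtain an encoding result; and
a storage scheduling module, configured to store the first check code obtained by the first encoding module and the second check code obtained by the second encoding module in the storage space of the second storage node to which the data processing apparatus belongs.
With reference to the fourth aspect, in a first possible implementation,
the first encoding module is specifically configured to: obtain the bytes indicated by first encoding parameters from the target storage nodes corresponding to the first encoding parameters, where the first encoding parameters are the encoding parameters in the partial byte encoding function; and encode the bytes indicated by the first encoding parameters according to the partial byte encoding function to obtain the first check code.
With reference to the fourth aspect or its first possible implementation, in a second possible implementation,
the second encoding module is specifically configured to: obtain the bytes indicated by second encoding parameters from the target storage nodes corresponding to the second encoding parameters, where the second encoding parameters are the encoding parameters in the full-byte encoding function; and encode the bytes indicated by the second encoding parameters according to the full-byte encoding function to obtain the second check code.
With reference to the fourth aspect or its first possible implementation, in a third possible implementation,
the data processing apparatus further includes:
a determining module, configured to determine, according to the number of target storage nodes received by the receiving module and the number of check nodes specified by the named node, the number of first parameters in the partial byte encoding function and the number of identical first parameters shared by the partial byte encoding functions of two adjacent check nodes, where two adjacent check nodes are those whose partial byte encoding functions have the largest number of overlapping first parameters.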
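A fifth aspect of the present invention provides a distributed storage system that includes a named node, a plurality of first storage nodes, and a plurality of second storage nodes, where the first storage nodes store different file blocks of a stored file in a distributed manner, the second storage nodes store, in a distributed manner, the check code blocks obtained by encoding the different file blocks, each first storage node includes a first data processing apparatus, each second storage node includes a second data processing apparatus, and each first and each second data processing apparatus is communicatively connected to the named node;
the first data processing apparatus is the data processing apparatus according to the third aspect or any implementation of the third aspect; and
the second data processing apparatus is the data processing apparatus according to the fourth aspect or any implementation of the fourth aspect.
The data recovery method provided in the embodiments of the present invention is applied to a distributed storage system that includes a named node, a plurality of first storage nodes, and a plurality of second storage nodes, where the first storage nodes store different file blocks of a stored file in a distributed manner, the second storage nodes store, in a distributed manner, the check code blocks obtained by encoding the different file blocks, each first storage node includes a data processing apparatus, and each data processing apparatus is communicatively connected to the named node. The method includes: the data processing apparatus receives a file block acquisition request sent by a user equipment, where the request carries an identifier of a target file; the data processing apparatus determines that a target file block is lost when the target file is not found according to the identifier of the target file; the data processing apparatus acquires from the named node the identifier of the storage node where a recovery dependent data block is located, where the recovery dependent data block includes the dependent file block and the dependent check code block required to recover the target file block, part of the check codes in the dependent check code block are obtained by encoding some of the file blocks of the target file, the remaining check codes are obtained by encoding every file block of the target file, and the target file is the file to which the target file block belongs; and the data processing apparatus recovers the target file block according to the dependent file block and the dependent check code block. Compared with the prior art, which cannot keep both the storage overhead and the network overhead of data recovery low, the check code block in this method is obtained by combining partial byte encoding with full-byte encoding, which reduces the storage overhead; during data recovery, part of the target file block can be obtained from only some of the dependent file blocks, which reduces the network overhead of data recovery.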
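Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings required for describing the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present invention, and a person skilled in the art may derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an embodiment of a distributed storage system according to the embodiments of the present invention;
FIG. 2 is a schematic diagram of another embodiment of a distributed storage system according to the embodiments of the present invention;
FIG. 3 is a schematic diagram of an embodiment of a data storage method according to the embodiments of the present invention;
FIG. 4 is a schematic diagram of a scenario example according to the embodiments of the present invention;
FIG. 5 is a schematic diagram of an embodiment of a data recovery method according to the embodiments of the present invention;
FIG. 6 is a schematic diagram of another embodiment of a data recovery method according to the embodiments of the present invention;
FIG. 7 is a schematic diagram of another embodiment of a data storage method according to the embodiments of the present invention;
FIG. 8 is a schematic diagram of an embodiment of a data processing apparatus according to the embodiments of the present invention;
FIG. 9 is a schematic diagram of an embodiment of a data processing apparatus according to the embodiments of the present invention;
FIG. 10 is a schematic diagram of an embodiment of a data processing apparatus according to the embodiments of the present invention;
FIG. 11 is a schematic diagram of an embodiment of a data processing apparatus according to the embodiments of the present invention;
FIG. 12 is a schematic diagram of an embodiment of a data processing apparatus according to the embodiments of the present invention;
FIG. 13 is a schematic diagram of an embodiment of a data processing apparatus according to the embodiments of the present invention;
FIG. 14 is a schematic diagram of an embodiment of a data processing apparatus according to the embodiments of the present invention.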
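Detailed Description
Embodiments of the present invention provide a data recovery method that reduces the network overhead of data recovery while keeping storage overhead low. Embodiments of the present invention further provide a corresponding data storage method, apparatus, and system. Each is described in detail below.
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Evidently, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by a person skilled in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
FIG. 1 is a schematic diagram of an embodiment of a distributed storage system according to the embodiments of the present invention.
As shown in FIG. 1, the distributed storage system includes a named node (NameNode) and a plurality of storage nodes (Node), and each storage node is communicatively connected to the named node. In practice, the named node and the storage nodes may be connected through switches.
FIG. 2 is a schematic diagram of another embodiment of a distributed storage system according to the embodiments of the present invention.
As shown in FIG. 2, the storage nodes are divided into several racks. Storage nodes in the same rack may be connected through a 1G switch; FIG. 2 shows three storage nodes forming a rack and connected through a switch. Racks are connected to each other through higher-bandwidth switches. The NameNode manages the metadata of the entire cluster and is connected directly to the upper-layer switch. The NameNode, the storage nodes, the switches, and the racks form a distributed storage cluster. In the embodiments of the present invention, metadata refers to the correspondence between the file blocks of a file and their storage paths. A file may be stored across multiple storage nodes; for example, if it is stored on 5 storage nodes, the file has 5 file blocks, and the data content of each file block is different. A user equipment uses the distributed storage system in two ways, storing data and reading data, and accesses the system over a network in both cases.
It should be noted that a storage node in the embodiments of the present invention may be an independent physical host, or a virtual machine located on one or more physical hosts.
The data storage process and the data recovery process of the embodiments of the present invention are described below from the two perspectives of storing data and reading data.
It should be noted in advance that "a plurality of" in the embodiments of the present invention means two or more.
The data storage process is described first with reference to FIG. 3:
FIG. 3 is a schematic diagram of an embodiment of a data storage method according to the embodiments of the present invention.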
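The distributed storage system of this embodiment includes a named node, a plurality of first storage nodes, and a plurality of second storage nodes, where the first storage nodes store different file blocks of a file to be stored in a distributed manner, the second storage nodes store, in a distributed manner, the check code blocks obtained by encoding the different file blocks, each second storage node includes a data processing apparatus, and each data processing apparatus is communicatively connected to the named node. As shown in FIG. 3, there are ten first storage nodes, identified as N1, N2, ..., N10, and four second storage nodes, identified as N11 to N14. FIG. 3 is only an example; an actual distributed storage system has many first and second storage nodes, each with its own identifier. When a user equipment wants to store a target file in the distributed storage system, it first sends a storage request to the named node. The named node may allocate first storage nodes to the target file according to parameters such as the size of the target file and the storage space of the first storage nodes. The named node maintains the storage space size of every first and second storage node and, after allocating first storage nodes to the target file, also maintains the correspondence between the identifier of the target file and the identifiers of the allocated first storage nodes. For example, after the ten first storage nodes N1 to N10 are allocated to the target file, the named node maintains the correspondence between the identifier of the target file and N1 to N10. The user equipment receives a file storage response from the named node that carries the identifiers of the ten first storage nodes N1 to N10, splits the target file into ten file blocks, and stores them on the ten first storage nodes respectively. Alternatively, the user equipment may store the target file in a round-robin manner without splitting it. The file blocks on the ten first storage nodes may or may not be of the same size; this is not limited in the embodiments of the present invention. After the target file is stored on the ten first storage nodes, to ensure data reliability the named node allocates second storage nodes to the target file for storing the check code blocks of its ten file blocks. Here the named node allocates four second storage nodes, identified as N11 to N14, to the target file, and then sends the identifiers of the ten first storage nodes N1 to N10 and the identifier of the target file to the data processing apparatuses in N11 to N14. The identifiers of these ten first storage nodes are the identifiers of the target storage nodes. After receiving the identifiers of N1 to N10 and the identifier of the target file, a data processing apparatus knows that it must encode the 10 file blocks of the target file on N1 to N10.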
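The bookkeeping in this paragraph can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the `NameNode` class, its `allocate` method, the `split_into_blocks` helper, and the `"file-1"` identifier are all hypothetical names, and equal-size splitting is just one of the options the text allows.

```python
import math


class NameNode:
    """Hypothetical metadata holder: file identifier -> storage node identifiers."""

    def __init__(self):
        self.data_nodes = {}    # file id -> first storage nodes (file blocks)
        self.check_nodes = {}   # file id -> second storage nodes (check code blocks)

    def allocate(self, file_id, data_nodes, check_nodes):
        """Record which nodes hold the file blocks and which hold the check code blocks."""
        self.data_nodes[file_id] = list(data_nodes)
        self.check_nodes[file_id] = list(check_nodes)
        return self.data_nodes[file_id]


def split_into_blocks(data, k):
    """Split data into k consecutive blocks; the last block may be shorter."""
    size = math.ceil(len(data) / k)
    return [data[i * size:(i + 1) * size] for i in range(k)]


name_node = NameNode()
targets = name_node.allocate(
    "file-1",
    [f"N{i}" for i in range(1, 11)],    # N1..N10 hold file blocks
    [f"N{i}" for i in range(11, 15)],   # N11..N14 hold check code blocks
)
blocks = split_into_blocks(bytes(range(95)), len(targets))
placement = dict(zip(targets, blocks))  # node identifier -> file block
```

The named node only tracks identifiers here; the file blocks themselves travel from the user equipment to the first storage nodes.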
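In the embodiments of the present invention, the data processing apparatus encodes the file blocks of the target file with two kinds of encoding functions: a partial byte encoding function and a full-byte encoding function. The partial byte encoding function is a function that encodes some of the file blocks of the target file to obtain an encoding result. The full-byte encoding function is a function that encodes every file block of the target file to obtain an encoding result.
The data processing apparatus encodes some of the file blocks of the target file according to the identifiers of the target storage nodes and the partial byte encoding function to obtain a first check code.
The data processing apparatus encodes every file block of the target file according to the identifiers of the target storage nodes and the full-byte encoding function to obtain a second check code.
The data processing apparatus stores the first check code and the second check code in the storage space of the second storage node to which the data processing apparatus belongs.
The partial byte encoding and full-byte encoding processes are illustrated below with an example:
In this application scenario, ten first storage nodes store the target file, and four second storage nodes are allocated to the target file for storing check code blocks. a1, a2, …, a10 are the n-th bytes of the different file blocks of the target file on the ten first storage nodes N1, N2, …, N10, and b1, b2, …, b10 are the (n+1)-th bytes of those file blocks. c11, c21, c31, c41 are the n-th bytes on the second storage nodes N11, N12, N13, N14, and their values are computed by the partial byte encoding functions g1, g2, g3, g4. c12, c22, c32, c42 are the (n+1)-th bytes on the second storage nodes N11, N12, N13, N14, and their values are computed by the full-byte encoding functions f1, f2, f3, and f4+g5.
The encoding principle applied by the data processing apparatus in this scenario can be understood with reference to Table 1:
Table 1: Encoding principle of the application scenario example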
Figure PCTCN2016071339-appb-000001
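where
g1(a1,a2,a3,a4) = B11*a1 + B12*a2 + B13*a3 + B14*a4
g2(a3,a4,a5,a6) = B23*a3 + B24*a4 + B25*a5 + B26*a6
f1(b1,b2,…,b9,b10) = B11*b1 + B12*b2 + … + B19*b9 + B1,10*b10
f2(b1,b2,…,b9,b10) = B21*b1 + B22*b2 + … + B29*b9 + B2,10*b10
Only some of the functional relations are listed here; the others can be written by analogy with the relations above and are not listed one by one.
Moreover, the above only illustrates the encoding principle for the n-th and (n+1)-th bytes of each file block. In practice, the other bytes of each file block can also be encoded with these two kinds of encoding functions; they are not listed one by one here.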
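The present invention applies partial byte encoding to group a, so only some of the bytes are input parameters of its encoding function, and applies full-byte encoding to group b, so all of the bytes are input parameters of its encoding function.
The encoding matrix used in this embodiment of the present invention is: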
Figure PCTCN2016071339-appb-000002
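The values of the encoding matrix are not limited in the embodiments of the present invention, provided that an inverse matrix exists. The coefficients required by the encoding functions g1 to g5 and f1 to f4 are taken from this matrix.
As the partial byte encoding functions in the example show, the byte parameters of two adjacent partial byte encoding functions overlap by two bytes. This ensures that every byte is encoded into multiple check codes, so that lost bytes can still be recovered even when several file blocks of the target file are lost. Two adjacent partial byte encoding functions are pairs such as g1 and g2, g2 and g3, or g3 and g4, that is, the functions whose parameters overlap the most.
An overlap of two bytes is specific to this example. The number of byte parameters in a partial byte encoding function and the number of bytes shared by two adjacent partial byte encoding functions can be determined from the number of first storage nodes and the number of second storage nodes of the target file, provided that every byte is encoded into at least two check code blocks.
As shown in FIG. 4, for the ten bytes a1, a2, …, a9, a10, letting adjacent functions overlap by two bytes yields five partial byte encoding functions.
They are g1(a1,a2,a3,a4), g2(a3,a4,a5,a6), g3(a5,a6,a7,a8), g4(a7,a8,a9,a10), and g5(a9,a10,a1,a2), which ensures that each of the bytes a1, a2, …, a9, a10 is encoded into two check code blocks. An overlap of three bytes is also possible; each of a1, a2, …, a9, a10 is then encoded into 3 check code blocks, which gives higher reliability but increases the storage overhead.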
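As a hedged sketch of how g1, g2, f1, and f2 might be evaluated, the Python below takes "*" to mean multiplication and "+" to mean XOR in GF(2^8). That field, its reduction polynomial 0x11D, the Cauchy-style coefficient matrix `B`, and the sample byte values are all assumptions made for the example; the text only requires an encoding matrix whose inverse exists and does not fix the arithmetic.

```python
def gf_mul(x, y):
    """Multiply two bytes in GF(2^8), assumed reduction polynomial 0x11D."""
    product = 0
    while y:
        if y & 1:
            product ^= x
        x <<= 1
        if x & 0x100:
            x ^= 0x11D
        y >>= 1
    return product


def gf_inv(x):
    """Multiplicative inverse of a non-zero byte: x^254 in GF(2^8)."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, x)
    return result


K, R = 10, 4  # ten first storage nodes, four second storage nodes
# Hypothetical coefficients B[i][j]; the row/column labels never collide, so no entry is 1/0.
B = [[gf_inv(i ^ (R + j)) for j in range(K)] for i in range(R)]


def encode(row, indices, data):
    """XOR-sum of B[row][j] * data[j] over the chosen 0-based byte indices."""
    acc = 0
    for j in indices:
        acc ^= gf_mul(B[row][j], data[j])
    return acc


a = [0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88, 0x99, 0xAA]  # n-th bytes on N1..N10
b = [0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09, 0x0A]  # (n+1)-th bytes on N1..N10

G1, G2 = [0, 1, 2, 3], [2, 3, 4, 5]  # inputs of g1 (a1..a4) and g2 (a3..a6)
c11 = encode(0, G1, a)               # g1: partial byte encoding, stored on N11
c21 = encode(1, G2, a)               # g2: partial byte encoding, stored on N12
c12 = encode(0, range(K), b)         # f1: full-byte encoding, stored on N11
c22 = encode(1, range(K), b)         # f2: full-byte encoding, stored on N12
```

In this sketch c11 depends only on a1..a4, so changing a10 leaves it untouched, whereas any change in group b alters c12.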
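The cyclic, overlapping grouping described above can be generated mechanically. The sketch below is illustrative only: the function name `partial_groups` is hypothetical, and it assumes that k is a multiple of the step (window width minus overlap), which holds for the 10-byte, width-4, overlap-2 example.

```python
from collections import Counter


def partial_groups(k, node_num, overlap):
    """Cyclic windows of width node_num over bytes 1..k; neighbours share `overlap` bytes."""
    step = node_num - overlap
    return [[(start + i) % k + 1 for i in range(node_num)]
            for start in range(0, k, step)]


groups = partial_groups(10, 4, 2)
coverage = Counter(index for group in groups for index in group)

for number, group in enumerate(groups, start=1):
    print(f"g{number} takes", ", ".join(f"a{index}" for index in group))
print("times each byte is encoded:", sorted(set(coverage.values())))
```

Counting how often each index appears confirms the property the text relies on: every one of the ten bytes lands in exactly two partial byte encoding functions.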
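The following calculation explains why, in the example above, the partial byte encoding function takes 4 bytes as inputs and the overlap length is two bytes.
Let nodeNum be the number of storage nodes that take part in one partial byte encoding, len the length of the overlap between adjacent functions, r the number of second storage nodes, and k the number of first storage nodes. nodeNum and len are determined by the following formulas.
r = (k - nodeNum)/(nodeNum - len) + 1;
bandwidth overhead reduction ratio = (k - nodeNum)/(2k);
In this embodiment, several value assignments were tried. nodeNum = 4 and len = 2 is the best parameter combination: increasing nodeNum increases the network bandwidth overhead, and increasing len increases the storage overhead. The final choice is therefore nodeNum = 4 and len = 2, and these two values determine every partial byte encoding function.
After a data processing apparatus has encoded the corresponding bytes of the different file blocks of the target file, it stores the check codes in the storage space of the second storage node where the data processing apparatus is located.
In the example above, the data processing apparatuses in N11, N12, N13, and N14 each encode with their own partial byte encoding function and full-byte encoding function. They can encode in parallel, which improves encoding efficiency.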
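The two formulas can be evaluated directly to see which parameter pairs fit a given number of second storage nodes. The sketch below is an illustration under stated assumptions: the function names and the exhaustive search are not from the text, which only says that several assignments were tried.

```python
from fractions import Fraction


def check_node_count(k, node_num, overlap):
    """r = (k - nodeNum)/(nodeNum - len) + 1, kept exact with Fraction."""
    return Fraction(k - node_num, node_num - overlap) + 1


def bandwidth_reduction_ratio(k, node_num):
    """ratio = (k - nodeNum)/(2k)."""
    return Fraction(k - node_num, 2 * k)


k, r = 10, 4
# Every (nodeNum, len) pair with 0 <= len < nodeNum < k that yields exactly r check nodes.
candidates = [(node_num, overlap)
              for node_num in range(2, k)
              for overlap in range(node_num)
              if check_node_count(k, node_num, overlap) == r]

for node_num, overlap in candidates:
    ratio = float(bandwidth_reduction_ratio(k, node_num))
    print(f"nodeNum={node_num}, len={overlap}, ratio={ratio:.2f}")
```

With k = 10 and r = 4, the pair nodeNum = 4, len = 2 satisfies the first formula and gives a ratio of 0.3, which matches the choice made in the text.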
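The data recovery process of the embodiments of the present invention is described below with reference to FIG. 5:
The distributed storage system shown in FIG. 5 includes a named node, a plurality of first storage nodes, and a plurality of second storage nodes, where the first storage nodes store different file blocks of a stored file in a distributed manner, the second storage nodes store, in a distributed manner, the check code blocks obtained by encoding the different file blocks, each first storage node includes a data processing apparatus, and each data processing apparatus is communicatively connected to the named node.
In the example of FIG. 5, which continues the scenario of FIG. 3, the target file is stored on the ten first storage nodes N1, N2, …, N10, and the check codes of the target file are stored on the four second storage nodes N11, N12, N13, and N14. Each first storage node has a data processing apparatus. In fact, both the first and the second storage nodes have data processing apparatuses; the ones on the second storage nodes are simply not needed during the decoding that data recovery involves.
When a user wants to use the target file, the user equipment sends a target file acquisition request carrying the identifier of the target file to the named node. Based on that identifier and the association, established when the target file was stored, between the identifier of the target file and the identifiers of the first storage nodes, the named node determines that the target file is stored on the ten first storage nodes N1, N2, …, N10, and returns their identifiers to the user equipment.
According to the identifiers of the ten first storage nodes N1, N2, …, N10, the user equipment sends a file block acquisition request carrying the identifier of the target file to each of them.
If the file blocks of the target file stored on N1, N2, …, N10 are not lost, each node returns its file block to the user equipment. In this scenario example, however, the file block on the first storage node N1 is lost, which means that a1, b1, and the other related bytes stored on N1 are lost. The lost file block must be recovered before it can be returned to the user equipment. The data recovery process is described below using the recovery of a1 and b1 as an example.
From the scenario example of FIG. 3, a1 is encoded by the partial byte encoding function into the check code c11 = g1(a1,a2,a3,a4). Recovering a1 therefore depends on the four values a2, a3, a4, and c11, which can be obtained from N2, N3, N4, and N11 respectively; decoding by the inverse of the encoding process then yields a1. Similarly, b1 can be recovered by obtaining b2, …, b9, b10, and c12 and applying the inverse of c12 = f1(b1,b2,…,b9,b10). b1 can also be recovered with one of the other full-byte encoding functions; the principle is the same.
As can be seen from the above, the data storage method provided in the embodiments of the present invention mixes partial byte encoding with full-byte encoding, which improves byte reliability while reducing storage space and improving encoding efficiency. In addition, during data recovery, a byte encoded with a partial byte encoding function can be decoded without fetching every byte, which reduces the network overhead of data recovery.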
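A minimal sketch of the inverse step follows, assuming, as an illustration only, that the encoding arithmetic is over GF(2^8) with reduction polynomial 0x11D and that the first coefficient row is the hypothetical `coeff` list below. The text does not fix the field or the coefficient values; `recover` and the sample bytes are likewise invented for the example.

```python
def gf_mul(x, y):
    """Multiply two bytes in GF(2^8), assumed reduction polynomial 0x11D."""
    product = 0
    while y:
        if y & 1:
            product ^= x
        x <<= 1
        if x & 0x100:
            x ^= 0x11D
        y >>= 1
    return product


def gf_inv(x):
    """Multiplicative inverse of a non-zero byte: x^254 in GF(2^8)."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, x)
    return result


coeff = [gf_inv(4 + j) for j in range(10)]  # hypothetical stand-ins for B11..B1,10


def encode(indices, data):
    """Check code = XOR-sum of coeff[j] * data[j] over the given 0-based indices."""
    acc = 0
    for j in indices:
        acc ^= gf_mul(coeff[j], data[j])
    return acc


def recover(indices, lost, check, survivors):
    """Cancel the surviving terms from the check code, then divide by the lost byte's coefficient."""
    acc = check
    for j in indices:
        if j != lost:
            acc ^= gf_mul(coeff[j], survivors[j])
    return gf_mul(gf_inv(coeff[lost]), acc)


a = [0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88, 0x99, 0xAA]
b = [0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09, 0x0A]
G1, ALL = [0, 1, 2, 3], list(range(10))
c11, c12 = encode(G1, a), encode(ALL, b)  # g1(a1..a4) and f1(b1..b10)

# N1 is lost: a1 needs a2, a3, a4 from N2..N4 plus c11 from N11.
a_survivors = {j: a[j] for j in G1 if j != 0}
a1 = recover(G1, 0, c11, a_survivors)
# b1 needs b2..b10 from N2..N10 plus c12 from N11.
b_survivors = {j: b[j] for j in ALL if j != 0}
b1 = recover(ALL, 0, c12, b_survivors)

reads_for_a1 = len(a_survivors) + 1  # values fetched over the network
reads_for_b1 = len(b_survivors) + 1
```

Under these assumptions, recovering a1 fetches 4 values while recovering b1 fetches 10, which is where the network saving for partially encoded bytes comes from.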
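Referring to FIG. 6, an embodiment of the data recovery method provided in the embodiments of the present invention includes:
101. A data processing apparatus receives a file block acquisition request sent by a user equipment, where the request carries an identifier of a target file. The data processing apparatus is applied to a distributed storage system that includes a named node, a plurality of first storage nodes, and a plurality of second storage nodes, where the first storage nodes store different file blocks of a stored file in a distributed manner, the second storage nodes store, in a distributed manner, the check code blocks obtained by encoding the different file blocks, each first storage node includes a data processing apparatus, and each data processing apparatus is communicatively connected to the named node.
102. The data processing apparatus determines that a target file block is lost when the target file is not found according to the identifier of the target file.
103. The data processing apparatus acquires, from the named node, the identifier of the target storage node where a recovery dependent data block is located, and acquires the recovery dependent data block according to the identifier of the target storage node and the identifier of the target file. The recovery dependent data block includes the dependent file block and the dependent check code block required to recover the target file block. Part of the check codes in the dependent check code block are obtained by encoding some of the file blocks of the target file, and the remaining check codes, that is, those other than said part, are obtained by encoding every file block of the target file. The target file is the file to which the target file block belongs.
104. The data processing apparatus recovers the target file block according to the dependent file block and the dependent check code block.
Compared with the prior art, which cannot keep both the storage overhead and the network overhead of data recovery low, the check code block in the data recovery method provided in the embodiments of the present invention is obtained by combining partial byte encoding with full-byte encoding, which reduces the storage overhead. During data recovery, part of the target file block can be obtained from only some of the dependent file blocks, which reduces the network overhead of data recovery.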
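Optionally, based on the embodiment corresponding to FIG. 6, in a first optional embodiment of the data recovery method, the recovering of the target file block according to the dependent file block and the dependent check code block may include:
recovering, by the data processing apparatus, a first byte in the target file block according to a partial byte encoding function, the dependent file block, and the dependent check code block, where the partial byte encoding function is a function that encodes some of the file blocks of the target file to obtain an encoding result; and
recovering, by the data processing apparatus, a second byte in the target file block according to a full-byte encoding function, the dependent file block, and the dependent check code block, where the full-byte encoding function is a function that encodes every file block of the target file to obtain an encoding result.
Optionally, based on the first optional embodiment corresponding to FIG. 6, in a second optional embodiment of the data recovery method, the recovering of the first byte in the target file block according to the partial byte encoding function, the dependent file block, and the dependent check code block may include:
obtaining, by the data processing apparatus, the dependent bytes required to recover the first byte from the dependent file block corresponding to a first encoding parameter, and obtaining the check code required to recover the first byte from the dependent check code block corresponding to a first encoding result, where the first encoding parameter is an encoding parameter in the partial byte encoding function, and the first encoding result is the result of encoding, with the partial byte encoding function, the dependent bytes indicated by the first encoding parameter together with the first byte; and
decoding, by the data processing apparatus, the check code required to recover the first byte according to the dependent bytes required to recover the first byte, to obtain the first byte.
Optionally, based on the first or second optional embodiment corresponding to FIG. 6, in a third optional embodiment of the data recovery method, the recovering of the second byte in the target file block according to the full-byte encoding function, the dependent file block, and the dependent check code block may include:
obtaining, by the data processing apparatus, the dependent bytes required to recover the second byte from the dependent file block corresponding to a second encoding parameter, and obtaining the check code required to recover the second byte from the dependent check code block corresponding to a second encoding result, where the second encoding parameter is an encoding parameter in the full-byte encoding function, and the second encoding result is the result of encoding, with the full-byte encoding function, the dependent bytes indicated by the second encoding parameter together with the second byte; and
decoding, by the data processing apparatus, the check code required to recover the second byte according to the dependent bytes required to recover the second byte, to obtain the second byte.
In the embodiment corresponding to FIG. 6 or its optional embodiments, the first encoding parameter refers to a parameter in the partial byte encoding function, for example a1, a2, a3, a4 in the g1 function of the scenario example of FIG. 3, and the second encoding parameter refers to a parameter in the full-byte encoding function, for example b1, b2, …, b9, b10 in the f function. The first byte can be understood with reference to the n-th byte in group a, and the second byte with reference to the (n+1)-th byte in group b; in general, the first byte is a byte encoded with the partial byte encoding function, and the second byte is a byte encoded with the full-byte encoding function. The dependent file block and the dependent check code block are the surviving file blocks and check code blocks that recovery of the lost file block depends on. For example, recovering the file block of the target file in the first storage node N1 depends on the related file blocks of the target file in N2, N3, and N4 and on the check code block of the target file in N11.
The embodiment corresponding to FIG. 6 or any of its optional embodiments can be understood with reference to the related descriptions of FIG. 1 to FIG. 5; details are not repeated here.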
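Referring to FIG. 7, an embodiment of the data storage method provided in the embodiments of the present invention includes:
201. A data processing apparatus receives the identifiers of a plurality of target storage nodes and the identifier of a target file sent by the named node, where the target storage nodes are the first storage nodes that already store the different file blocks of the target file. The data processing apparatus is applied to a distributed storage system that includes a named node, a plurality of first storage nodes, and a plurality of second storage nodes, where the first storage nodes store different file blocks of a file to be stored in a distributed manner, the second storage nodes store, in a distributed manner, the check code blocks obtained by encoding the different file blocks, each second storage node includes a data processing apparatus, and each data processing apparatus is communicatively connected to the named node.
202. The data processing apparatus encodes some of the file blocks of the target file according to the identifiers of the target storage nodes and a partial byte encoding function to obtain a first check code, where the partial byte encoding function is a function that encodes some of the file blocks of the target file to obtain an encoding result.
203. The data processing apparatus encodes every file block of the target file according to the identifiers of the target storage nodes and a full-byte encoding function to obtain a second check code, where the full-byte encoding function is a function that encodes every file block of the target file to obtain an encoding result.
204. The data processing apparatus stores the first check code and the second check code in the storage space of the second storage node to which the data processing apparatus belongs.
Compared with the prior art, the data storage method provided in the embodiments of the present invention mixes partial byte encoding with full-byte encoding, which improves byte reliability while reducing storage space and improving encoding efficiency. In addition, during data recovery, a byte encoded with a partial byte encoding function can be decoded without fetching every byte, which reduces the network overhead of data recovery.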
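Optionally, based on the embodiment corresponding to FIG. 7, in a first optional embodiment of the data storage method, the encoding of some of the file blocks of the target file according to the identifiers of the target storage nodes and the partial byte encoding function to obtain the first check code may include:
obtaining, by the data processing apparatus, the bytes indicated by first encoding parameters from the target storage nodes corresponding to the first encoding parameters, where the first encoding parameters are the encoding parameters in the partial byte encoding function; and
encoding, by the data processing apparatus, the bytes indicated by the first encoding parameters according to the partial byte encoding function to obtain the first check code.
Optionally, based on the embodiment corresponding to FIG. 7 or its first optional embodiment, in a second optional embodiment of the data storage method, the encoding of every file block of the target file according to the identifiers of the target storage nodes and the full-byte encoding function to obtain the second check code may include:
obtaining, by the data processing apparatus, the bytes indicated by second encoding parameters from the target storage nodes corresponding to the second encoding parameters, where the second encoding parameters are the encoding parameters in the full-byte encoding function; and
encoding, by the data processing apparatus, the bytes indicated by the second encoding parameters according to the full-byte encoding function to obtain the second check code.
Optionally, based on the embodiment corresponding to FIG. 7 or its first optional embodiment, in a third optional embodiment of the data storage method, before the encoding of some of the file blocks of the target file to obtain the first check code, the method may further include:
determining, by the data processing apparatus according to the number of target storage nodes and the number of check nodes specified by the named node, the number of first parameters in the partial byte encoding function and the number of identical first parameters shared by the partial byte encoding functions of two adjacent check nodes, where two adjacent check nodes are those whose partial byte encoding functions have the largest number of overlapping first parameters.
In the embodiment corresponding to FIG. 7 or its optional embodiments, the first encoding parameter refers to a parameter in the partial byte encoding function, for example a1, a2, a3, a4 in the g1 function of the scenario example of FIG. 3, and the second encoding parameter refers to a parameter in the full-byte encoding function, for example b1, b2, …, b9, b10 in the f function. The first check code can be understood with reference to c11, and the second check code with reference to c12; in general, the first check code is a check code obtained with the partial byte encoding function, and the second check code is a check code obtained with the full-byte encoding function.
The embodiment corresponding to FIG. 7 or any of its optional embodiments can be understood with reference to the related descriptions of FIG. 1 to FIG. 5; details are not repeated here.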
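Referring to FIG. 8, in an embodiment of the data processing apparatus 30 provided in the embodiments of the present invention, the data processing apparatus 30 is applied to a distributed storage system that includes a named node, a plurality of first storage nodes, and a plurality of second storage nodes, where the first storage nodes store different file blocks of a stored file in a distributed manner, the second storage nodes store, in a distributed manner, the check code blocks obtained by encoding the different file blocks, each first storage node includes the data processing apparatus, and each data processing apparatus is communicatively connected to the named node. The data processing apparatus 30 includes:
a receiving module 301, configured to receive a file block acquisition request sent by a user equipment, where the file block acquisition request carries an identifier of a target file;
a determining module 302, configured to determine that a target file block is lost when the target file is not found according to the identifier of the target file received by the receiving module 301;
an acquiring module 303, configured to: after the determining module 302 determines that the target file block is lost, acquire from the named node the identifier of the target storage node where a recovery dependent data block is located, and acquire the recovery dependent data block according to the identifier of the target storage node and the identifier of the target file, where the recovery dependent data block includes the dependent file block and the dependent check code block required to recover the target file block, part of the check codes in the dependent check code block are obtained by encoding some of the file blocks of the target file, the remaining check codes, that is, those other than said part, are obtained by encoding every file block of the target file, and the target file is the file to which the target file block belongs; and
a recovery module 304, configured to recover the target file block according to the dependent file block and the dependent check code block acquired by the acquiring module 303.
Compared with the prior art, with the data processing apparatus provided in the embodiments of the present invention, a byte encoded with a partial byte encoding function can be decoded during data recovery without fetching every byte, which reduces the network overhead of data recovery.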
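Optionally, based on the embodiment corresponding to FIG. 8 and referring to FIG. 9, in a first optional embodiment of the data processing apparatus 30, the recovery module 304 includes:
a first recovery unit 3041, configured to recover a first byte in the target file block according to a partial byte encoding function, the dependent file block, and the dependent check code block, where the partial byte encoding function is a function that encodes some of the file blocks of the target file to obtain an encoding result; and
a second recovery unit 3042, configured to recover a second byte in the target file block according to a full-byte encoding function, the dependent file block, and the dependent check code block, where the full-byte encoding function is a function that encodes every file block of the target file to obtain an encoding result.
Optionally, based on the embodiment corresponding to FIG. 9, in a second optional embodiment of the data processing apparatus 30,
the first recovery unit 3041 is specifically configured to: obtain the dependent bytes required to recover the first byte from the dependent file block corresponding to a first encoding parameter, and obtain the check code required to recover the first byte from the dependent check code block corresponding to a first encoding result, where the first encoding parameter is an encoding parameter in the partial byte encoding function, and the first encoding result is the result of encoding, with the partial byte encoding function, the dependent bytes indicated by the first encoding parameter together with the first byte; and decode the check code required to recover the first byte according to those dependent bytes, to obtain the first byte.
Optionally, based on the embodiment corresponding to FIG. 9, in a third optional embodiment of the data processing apparatus 30,
the second recovery unit 3042 is specifically configured to: obtain the dependent bytes required to recover the second byte from the dependent file block corresponding to a second encoding parameter, and obtain the check code required to recover the second byte from the dependent check code block corresponding to a second encoding result, where the second encoding parameter is an encoding parameter in the full-byte encoding function, and the second encoding result is the result of encoding, with the full-byte encoding function, the dependent bytes indicated by the second encoding parameter together with the second byte; and decode the check code required to recover the second byte according to those dependent bytes, to obtain the second byte.
In the embodiments corresponding to FIG. 8 or FIG. 9 or their optional embodiments, the first encoding parameter refers to a parameter in the partial byte encoding function, for example a1, a2, a3, a4 in the g1 function of the scenario example of FIG. 3, and the second encoding parameter refers to a parameter in the full-byte encoding function, for example b1, b2, …, b9, b10 in the f function. The first byte can be understood with reference to the n-th byte in group a, and the second byte with reference to the (n+1)-th byte in group b; in general, the first byte is a byte encoded with the partial byte encoding function, and the second byte is a byte encoded with the full-byte encoding function. The dependent file block and the dependent check code block are the surviving file blocks and check code blocks that recovery of the lost file block depends on. For example, recovering the file block of the target file in the first storage node N1 depends on the related file blocks of the target file in N2, N3, and N4 and on the check code block of the target file in N11.
The embodiments corresponding to FIG. 8 or FIG. 9 or any of their optional embodiments can be understood with reference to the related descriptions of FIG. 1 to FIG. 6; details are not repeated here.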
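Referring to FIG. 10, in an embodiment of the data processing apparatus 40 provided in the embodiments of the present invention, the data processing apparatus 40 is applied to a distributed storage system that includes a named node, a plurality of first storage nodes, and a plurality of second storage nodes, where the first storage nodes store different file blocks of a file to be stored in a distributed manner, the second storage nodes store, in a distributed manner, the check code blocks obtained by encoding the different file blocks, each second storage node includes a data processing apparatus, and each data processing apparatus is communicatively connected to the named node. The data processing apparatus includes:
a receiving module 401, configured to receive the identifiers of a plurality of target storage nodes and the identifier of a target file sent by the named node, where the target storage nodes are the first storage nodes that already store the different file blocks of the target file;
a first encoding module 402, configured to encode some of the file blocks of the target file according to the identifiers of the target storage nodes received by the receiving module 401 and a partial byte encoding function to obtain a first check code, where the partial byte encoding function is a function that encodes some of the file blocks of the target file to obtain an encoding result;
a second encoding module 403, configured to encode every file block of the target file according to the identifiers of the target storage nodes received by the receiving module 401 and a full-byte encoding function to obtain a second check code, where the full-byte encoding function is a function that encodes every file block of the target file to obtain an encoding result; and
a storage scheduling module 404, configured to store the first check code obtained by the first encoding module 402 and the second check code obtained by the second encoding module 403 in the storage space of the second storage node to which the data processing apparatus belongs.
Compared with the prior art, the data processing apparatus provided in the embodiments of the present invention mixes partial byte encoding with full-byte encoding, which improves byte reliability while reducing storage space and improving encoding efficiency. In addition, during data recovery, a byte encoded with a partial byte encoding function can be decoded without fetching every byte, which reduces the network overhead of data recovery.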
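Optionally, based on the embodiment corresponding to FIG. 10, in a first optional embodiment of the data processing apparatus 40,
the first encoding module 402 is specifically configured to: obtain the bytes indicated by first encoding parameters from the target storage nodes corresponding to the first encoding parameters, where the first encoding parameters are the encoding parameters in the partial byte encoding function; and encode the bytes indicated by the first encoding parameters according to the partial byte encoding function to obtain the first check code.
Optionally, based on the embodiment corresponding to FIG. 10 or its first optional embodiment, in a second optional embodiment of the data processing apparatus 40,
the second encoding module 403 is specifically configured to: obtain the bytes indicated by second encoding parameters from the target storage nodes corresponding to the second encoding parameters, where the second encoding parameters are the encoding parameters in the full-byte encoding function; and encode the bytes indicated by the second encoding parameters according to the full-byte encoding function to obtain the second check code.
Optionally, based on the embodiment corresponding to FIG. 10 or its first optional embodiment and referring to FIG. 11, in a third optional embodiment of the data processing apparatus 40, the data processing apparatus 40 further includes:
a determining module 405, configured to determine, according to the number of target storage nodes received by the receiving module 401 and the number of check nodes specified by the named node, the number of first parameters in the partial byte encoding function and the number of identical first parameters shared by the partial byte encoding functions of two adjacent check nodes, where two adjacent check nodes are those whose partial byte encoding functions have the largest number of overlapping first parameters.
In the embodiments corresponding to FIG. 10 and FIG. 11 or their optional embodiments, the first encoding parameter refers to a parameter in the partial byte encoding function, for example a1, a2, a3, a4 in the g1 function of the scenario example of FIG. 3, and the second encoding parameter refers to a parameter in the full-byte encoding function, for example b1, b2, …, b9, b10 in the f function. The first check code can be understood with reference to c11, and the second check code with reference to c12; in general, the first check code is a check code obtained with the partial byte encoding function, and the second check code is a check code obtained with the full-byte encoding function.
The embodiments corresponding to FIG. 10 and FIG. 11 or any of their optional embodiments can be understood with reference to the related descriptions of FIG. 1 to FIG. 5 and FIG. 7; details are not repeated here.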
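In the foregoing embodiments of the data processing apparatus, it should be understood that, in one implementation, the receiving module and the acquiring module may be implemented by an input/output (I/O) device (for example, a network interface card), and the determining module, recovery module, first encoding module, second encoding module, and storage scheduling module may be implemented by a processor executing programs or instructions in a memory (in other words, by the processor cooperating with special instructions in a memory coupled to the processor). In another implementation, the receiving module and the acquiring module may be implemented by an I/O device (for example, a network interface card), and the determining module, recovery module, first encoding module, second encoding module, and storage scheduling module may each be implemented by a dedicated circuit; for the specific implementation, refer to the prior art. In yet another implementation, the receiving module and the acquiring module may be implemented by an I/O device (for example, a network interface card), and the determining module, recovery module, first encoding module, second encoding module, and storage scheduling module may be implemented by a field-programmable gate array (FPGA); for the specific implementation, refer to the prior art. The present invention includes but is not limited to the foregoing implementations, and any solution implemented according to the idea of the present invention falls within the protection scope of the embodiments of the present invention.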
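This embodiment provides a hardware structure of a data processing apparatus. Referring to FIG. 12, the hardware structure may include three parts:
a transceiver component, a software component, and a hardware component;
the transceiver component is a hardware circuit for sending and receiving packets;
the hardware component, which may also be called a "hardware processing module" or simply "hardware", mainly consists of hardware circuits that implement specific functions based on dedicated circuits such as FPGAs and ASICs (possibly together with supporting components such as memory). It is usually much faster than a general-purpose processor, but once a function is customized it is hard to change, so it is inflexible and is typically used for fixed functions. It should be noted that, in practice, the hardware component may also include processors such as an MCU (microcontroller, for example a single-chip microcomputer) or a CPU; the main role of these processors is not to process large volumes of data but to perform control, and in such a scenario the system built from these components is the hardware component.
The software component (or simply "software") mainly consists of a general-purpose processor (for example, a CPU) and its supporting components (storage devices such as memory and hard disks). The processor can be programmed to provide the required processing functions. A software implementation can be configured flexibly according to service needs, but is usually slower than the hardware component. After the software finishes processing, the processed data may be sent through the transceiver component by way of the hardware component, or sent to the transceiver component through an interface connected to it.
In this embodiment, the transceiver component is used to receive the identifier of the target file, the identifiers of the target storage nodes, and so on.
The other functions of the hardware component and the software component have been discussed in detail in the foregoing embodiments and are not repeated here.
The following describes in detail, with reference to the accompanying drawings, the technical solution in which the receiving module and the acquiring module are implemented by an input/output (I/O) device (for example, a network interface card), and the determining module, recovery module, first encoding module, second encoding module, and storage scheduling module are implemented by a processor executing programs or instructions in a memory: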
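FIG. 13 is a schematic structural diagram of a data processing apparatus 30 according to an embodiment of the present invention. The data processing apparatus 30 is applied to a distributed storage system that includes a name node, a plurality of first storage nodes, and a plurality of second storage nodes. The first storage nodes store, in a distributed manner, different file blocks of a stored file; the second storage nodes store, in a distributed manner, the check code blocks obtained by encoding those file blocks. Each first storage node includes a data processing apparatus, and each data processing apparatus is communicatively connected to the name node. The data processing apparatus 30 includes a processor 310, a memory 350, and an input/output (I/O) device 330. The memory 350 may include a read-only memory and a random access memory, and provides operation instructions and data to the processor 310. A part of the memory 350 may further include a non-volatile random access memory (NVRAM).
In some implementations, the memory 350 stores the following elements: executable modules or data structures, or a subset thereof, or an extended set thereof:
When the data processing apparatus 30 is the source device:
In this embodiment of the present invention, by invoking the operation instructions stored in the memory 350 (the operation instructions may be stored in the operating system), the following operations are performed:
receiving, through the I/O device 330, a file block obtaining request sent by user equipment, where the file block obtaining request carries the identifier of the target file;
obtaining, through the I/O device 330 from the name node, the identifiers of the target storage nodes on which the recovery dependency data blocks reside, and obtaining the recovery dependency data blocks according to the identifiers of the target storage nodes and the identifier of the target file, where the recovery dependency data blocks include the dependency file blocks and the dependency check code blocks required to recover the target file block, one part of the check codes in the dependency check code blocks is obtained by encoding some of the file blocks of the target file, the remaining check codes in the dependency check code blocks are obtained by encoding every file block of the target file, the remaining check codes are the check codes other than the one part of the check codes, and the target file is the file to which the target file block belongs;
recovering the target file block according to the dependency file blocks and the dependency check code blocks.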
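The recovery steps above can be sketched as a small orchestration routine. Everything concrete here is an assumption made for illustration: XOR stands in for the real decoder, the name node is modelled as a plain dictionary (`NAME_NODE_INDEX`) that lists, per byte offset, the nodes holding the dependency data, and the node names, file identifier, and byte values are invented.

```python
from functools import reduce

def xor_all(values):
    """XOR a sequence of byte values; a stand-in for the real decoder."""
    return reduce(lambda acc, v: acc ^ v, values, 0)

# Hypothetical surviving data: node id -> {file id -> block}.
# Offset 0 is partial-coded over N1..N4, offset 1 is full-coded over N1..N6.
# N11 is a second storage node holding the check code block.
CLUSTER = {
    "N2": {"file-1": bytes([0x22, 0x0B])},
    "N3": {"file-1": bytes([0x33, 0x0C])},
    "N4": {"file-1": bytes([0x44, 0x0D])},
    "N5": {"file-1": bytes([0x55, 0x0E])},
    "N6": {"file-1": bytes([0x66, 0x0F])},
    "N11": {"file-1": bytes([0x44, 0x01])},
}

# Hypothetical name node answer: for a lost block, the target storage
# nodes that hold the recovery dependency data, per byte offset.
NAME_NODE_INDEX = {
    ("file-1", "N1"): {
        0: ["N2", "N3", "N4", "N11"],              # partial-byte position
        1: ["N2", "N3", "N4", "N5", "N6", "N11"],  # full-byte position
    },
}

def recover_block(index, cluster, file_id, lost_node, block_len):
    """Rebuild the lost block byte by byte from its dependency blocks."""
    plan = index[(file_id, lost_node)]
    recovered = bytearray(block_len)
    for offset in range(block_len):
        recovered[offset] = xor_all(
            cluster[node][file_id][offset] for node in plan[offset]
        )
    return bytes(recovered)

recovered = recover_block(NAME_NODE_INDEX, CLUSTER, "file-1", "N1", 2)
```

In this sketch the partial-coded byte is rebuilt from four remote reads while the full-coded byte needs six, which mirrors the difference in network cost between the two kinds of check code.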
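Whereas the prior art cannot keep both the storage overhead and the network overhead of data recovery low, in the data processing apparatus provided in this embodiment of the present invention the check code blocks are obtained by combining partial-byte encoding with full-byte encoding, which reduces storage overhead; during data recovery, part of the target file block can be obtained from only some of the dependency file blocks, which reduces the network overhead of data recovery.
The processor 310 controls the operation of the data processing apparatus 30; the processor 310 may also be referred to as a CPU (Central Processing Unit). The memory 350 may include a read-only memory and a random access memory, and provides instructions and data to the processor 310. A part of the memory 350 may further include a non-volatile random access memory (NVRAM). In a specific application, the components of the data processing apparatus 30 are coupled together through a bus system 320, where the bus system 320 may include, in addition to a data bus, a power bus, a control bus, a status signal bus, and the like. For clarity of description, however, the various buses are all labeled as the bus system 320 in the figure.
The methods disclosed in the foregoing embodiments of the present invention may be applied to, or implemented by, the processor 310. The processor 310 may be an integrated circuit chip with signal processing capability. In implementation, the steps of the foregoing methods may be completed by integrated logic circuits of hardware in the processor 310 or by instructions in the form of software. The processor 310 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and can implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of the present invention. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the methods disclosed with reference to the embodiments of the present invention may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may reside in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 350; the processor 310 reads the information in the memory 350 and completes the steps of the foregoing methods in combination with its hardware.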
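Optionally, the processor 310 is specifically configured to:
recover a first byte in the target file block according to a partial-byte encoding function, the dependency file blocks, and the dependency check code blocks, where the partial-byte encoding function is a function that obtains an encoding result by encoding some of the file blocks of the target file;
recover a second byte in the target file block according to a full-byte encoding function, the dependency file blocks, and the dependency check code blocks, where the full-byte encoding function is a function that obtains an encoding result by encoding every file block of the target file.
Optionally, the processor 310 is specifically configured to:
obtain, from the dependency file blocks corresponding to the first encoding parameters, the dependency bytes required to recover the first byte, and obtain, from the dependency check code block corresponding to a first encoding result, the check code required to recover the first byte, where the first encoding parameters are encoding parameters of the partial-byte encoding function, and the first encoding result is the result of encoding, with the partial-byte encoding function, the dependency bytes indicated by the first encoding parameters together with the first byte;
decode the check code required to recover the first byte according to the dependency bytes required to recover the first byte, to obtain the first byte.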
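As a worked illustration of this decoding step, suppose (purely as an assumption, since the concrete coding function is not fixed here) that the partial-byte encoding function g1 is a bytewise XOR. With a1 as the lost first byte, a2, a3, a4 as the dependency bytes, and c11 as the check code, the decoding follows directly from the encoding:

```latex
\begin{aligned}
c_{11} &= g_1(a_1, a_2, a_3, a_4) = a_1 \oplus a_2 \oplus a_3 \oplus a_4, \\
a_1 &= c_{11} \oplus a_2 \oplus a_3 \oplus a_4 .
\end{aligned}
```

The second line holds because XOR-ing a value with itself gives zero, so a2, a3, and a4 cancel out of c11 and only a1 remains. A coding function other than XOR would need its own inverse, but the inputs to the decoder (the dependency bytes and one check code) are the same.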
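Optionally, the processor 310 is specifically configured to:
obtain, from the dependency file blocks corresponding to the second encoding parameters, the dependency bytes required to recover the second byte, and obtain, from the dependency check code block corresponding to a second encoding result, the check code required to recover the second byte, where the second encoding parameters are encoding parameters of the full-byte encoding function, and the second encoding result is the result of encoding, with the full-byte encoding function, the dependency bytes indicated by the second encoding parameters together with the second byte;
decode the check code required to recover the second byte according to the dependency bytes required to recover the second byte, to obtain the second byte.
In the embodiment corresponding to FIG. 13 or its optional embodiments, the first encoding parameters are the parameters of the partial-byte encoding function, for example a1, a2, a3, a4 of the function g1 in the scenario example of FIG. 3; the second encoding parameters are the parameters of the full-byte encoding function, for example b1, b2, …, b9, b10 of the function f. The first byte can be understood with reference to the n-th byte in group a, and the second byte with reference to the (n+1)-th byte in group b. In general, the first byte is a byte to which the partial-byte encoding function is applied, and the second byte is a byte to which the full-byte encoding function is applied. The dependency file blocks and dependency check code blocks are the surviving file blocks and check code blocks needed to recover a lost file block. For example, recovering the file block of the target file on first storage node N1 depends on the related file blocks of the target file on N2, N3, and N4 and on the check code block of the target file on N11.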
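The effect on recovery traffic can be made concrete with a little arithmetic. The sketch below assumes a single lost byte per coding function, so that rebuilding it needs every other parameter byte plus one check code byte; the function names and the 50/50 split between group-a and group-b bytes are hypothetical and chosen only for illustration.

```python
def bytes_to_fetch(num_params):
    """Bytes read over the network to rebuild one lost byte.

    Assumes one lost byte per coding function: the other
    (num_params - 1) parameter bytes plus one check code byte.
    """
    return (num_params - 1) + 1

def average_cost(partial_params, full_params, partial_fraction):
    """Average bytes fetched per lost byte when partial_fraction of the
    block's bytes are protected by the partial-byte function."""
    return (partial_fraction * bytes_to_fetch(partial_params)
            + (1 - partial_fraction) * bytes_to_fetch(full_params))

cost_a = bytes_to_fetch(4)         # g1(a1..a4): fetch a2, a3, a4 and c11
cost_b = bytes_to_fetch(10)        # f(b1..b10): fetch b2..b10 and c21
mixed = average_cost(4, 10, 0.5)   # hypothetical 50/50 split of group a and group b
```

Under these assumptions a group-a byte costs 4 remote bytes, matching the N2, N3, N4, and N11 example, while a group-b byte costs 10, so the mixed scheme averages 7 bytes per lost byte instead of 10.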
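For the embodiment corresponding to FIG. 13, or any optional embodiment thereof, refer to the descriptions of FIG. 1 to FIG. 5, FIG. 6, FIG. 8, and FIG. 9; details are not repeated here.
FIG. 14 is a schematic structural diagram of a data processing apparatus 40 according to an embodiment of the present invention. The data processing apparatus 40 is applied to a distributed storage system that includes a name node, a plurality of first storage nodes, and a plurality of second storage nodes. The first storage nodes store, in a distributed manner, different file blocks of a file to be stored; the second storage nodes store, in a distributed manner, the check code blocks obtained by encoding those file blocks. Each second storage node includes a data processing apparatus, and each data processing apparatus is communicatively connected to the name node. The data processing apparatus 40 includes a processor 410, a memory 450, and an input/output (I/O) device 430. The memory 450 may include a read-only memory and a random access memory, and provides operation instructions and data to the processor 410. A part of the memory 450 may further include a non-volatile random access memory (NVRAM).
In some implementations, the memory 450 stores the following elements: executable modules or data structures, or a subset thereof, or an extended set thereof:
When the data processing apparatus 40 is the source device:
In this embodiment of the present invention, by invoking the operation instructions stored in the memory 450 (the operation instructions may be stored in the operating system), the following operations are performed:
receiving, through the I/O device 430, the identifiers of a plurality of target storage nodes and the identifier of a target file sent by the name node, where the plurality of target storage nodes are the first storage nodes that already store the different file blocks of the target file;
encoding some of the file blocks of the target file according to the identifiers of the target storage nodes and a partial-byte encoding function to obtain a first check code, where the partial-byte encoding function is a function that obtains an encoding result by encoding some of the file blocks of the target file;
encoding every file block of the target file according to the identifiers of the target storage nodes and a full-byte encoding function to obtain a second check code, where the full-byte encoding function is a function that obtains an encoding result by encoding every file block of the target file;
storing the first check code and the second check code in the storage space of the second storage node to which the data processing apparatus belongs.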
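The storage-side steps above can be sketched as building one check code block that mixes both kinds of check code. This is an illustration only: XOR is assumed as the coding function, and the names `build_check_block`, `partial_nodes`, `partial_offsets`, and `local_storage`, together with the sample blocks, are hypothetical.

```python
from functools import reduce

def xor_all(values):
    """XOR a sequence of byte values; a stand-in for the real encoder."""
    return reduce(lambda acc, v: acc ^ v, values, 0)

def build_check_block(blocks, partial_nodes, partial_offsets):
    """Build one check code block from the file blocks of the target nodes.

    blocks:          node id -> file block (all blocks have equal length)
    partial_nodes:   nodes covered by the partial-byte function
    partial_offsets: byte offsets protected by the partial-byte function;
                     every other offset uses the full-byte function
    """
    length = len(next(iter(blocks.values())))
    check = bytearray(length)
    for offset in range(length):
        if offset in partial_offsets:
            sources = partial_nodes   # first check code: some file blocks
        else:
            sources = list(blocks)    # second check code: every file block
        check[offset] = xor_all(blocks[node][offset] for node in sources)
    return bytes(check)

# Hypothetical file blocks fetched from the target storage nodes.
blocks = {
    "N1": bytes([0x11, 0x0A]),
    "N2": bytes([0x22, 0x0B]),
    "N3": bytes([0x33, 0x0C]),
    "N4": bytes([0x44, 0x0D]),
    "N5": bytes([0x55, 0x0E]),
    "N6": bytes([0x66, 0x0F]),
}

# Storage space of the second storage node, modelled as a dictionary.
local_storage = {}
local_storage["file-1"] = build_check_block(
    blocks, ["N1", "N2", "N3", "N4"], {0}
)
```

Keeping both check codes in one block on the same second storage node means a later recovery can pick, per byte position, whichever check code applies without a further lookup.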
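Compared with the prior art, the data processing apparatus provided in this embodiment of the present invention mixes partial-byte encoding with full-byte encoding, which improves byte reliability while reducing storage space and improving encoding efficiency. In addition, during data recovery, bytes encoded with the partial-byte encoding function can be decoded without fetching every byte, which reduces the network overhead of data recovery.
The processor 410 controls the operation of the data processing apparatus 40; the processor 410 may also be referred to as a CPU (Central Processing Unit). The memory 450 may include a read-only memory and a random access memory, and provides instructions and data to the processor 410. A part of the memory 450 may further include a non-volatile random access memory (NVRAM). In a specific application, the components of the data processing apparatus 40 are coupled together through a bus system 420, where the bus system 420 may include, in addition to a data bus, a power bus, a control bus, a status signal bus, and the like. For clarity of description, however, the various buses are all labeled as the bus system 420 in the figure.
The methods disclosed in the foregoing embodiments of the present invention may be applied to, or implemented by, the processor 410. The processor 410 may be an integrated circuit chip with signal processing capability. In implementation, the steps of the foregoing methods may be completed by integrated logic circuits of hardware in the processor 410 or by instructions in the form of software. The processor 410 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and can implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of the present invention. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the methods disclosed with reference to the embodiments of the present invention may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may reside in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 450; the processor 410 reads the information in the memory 450 and completes the steps of the foregoing methods in combination with its hardware.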
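Optionally, the processor 410 is specifically configured to:
obtain, from the target storage node corresponding to a first encoding parameter, the byte indicated by that first encoding parameter, where the first encoding parameter is each encoding parameter of the partial-byte encoding function;
encode the bytes indicated by the first encoding parameters according to the partial-byte encoding function to obtain the first check code.
Optionally, the processor 410 is specifically configured to:
obtain, from the target storage node corresponding to a second encoding parameter, the byte indicated by that second encoding parameter, where the second encoding parameter is each encoding parameter of the full-byte encoding function;
encode the bytes indicated by the second encoding parameters according to the full-byte encoding function to obtain the second check code.
Optionally, the processor 410 is specifically configured to:
determine, according to the number of target storage nodes and the number of check nodes specified by the name node, the number of first parameters in the partial-byte encoding function and the number of identical first parameters contained in the partial-byte encoding functions of two adjacent check nodes, where the partial-byte encoding functions of two adjacent check nodes have the largest number of overlapping first parameters.
In the embodiment corresponding to FIG. 14 or its optional embodiments, the first encoding parameters are the parameters of the partial-byte encoding function, for example a1, a2, a3, a4 of the function g1 in the scenario example of FIG. 3; the second encoding parameters are the parameters of the full-byte encoding function, for example b1, b2, …, b9, b10 of the function f. The first check code can be understood with reference to c11, and the second check code with reference to c21. In general, the first check code is a check code obtained by encoding with the partial-byte encoding function, and the second check code is a check code obtained by encoding with the full-byte encoding function.
For the embodiment corresponding to FIG. 14, or any optional embodiment thereof, refer to the descriptions of FIG. 1 to FIG. 5, FIG. 7, FIG. 10, and FIG. 11; details are not repeated here.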
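The distributed storage system provided in the embodiments of the present invention includes a name node, a plurality of first storage nodes, and a plurality of second storage nodes. The first storage nodes store, in a distributed manner, different file blocks of a stored file; the second storage nodes store, in a distributed manner, the check code blocks obtained by encoding those file blocks. Each first storage node includes a first data processing apparatus, each second storage node includes a second data processing apparatus, and each first data processing apparatus and each second data processing apparatus is communicatively connected to the name node;
the first data processing apparatus can be understood with reference to the description of FIG. 3, and the second data processing apparatus with reference to the description of FIG. 5; details are not repeated here.
Whereas the prior art cannot keep both the storage overhead and the network overhead of data recovery low, in the distributed storage system provided in the embodiments of the present invention the check code blocks are obtained by combining partial-byte encoding with full-byte encoding, which reduces storage overhead; during data recovery, part of the target file block can be obtained from only some of the dependency file blocks, which reduces the network overhead of data recovery.
A person of ordinary skill in the art will understand that all or some of the steps of the methods in the foregoing embodiments may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium may include a ROM, a RAM, a magnetic disk, an optical disc, or the like.
The data storage method, the data recovery method, and the corresponding apparatus and system provided in the embodiments of the present invention have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present invention, and the description of the foregoing embodiments is intended only to help in understanding the method of the present invention and its core idea. In addition, a person of ordinary skill in the art may, according to the idea of the present invention, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as a limitation on the present invention.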

Claims (17)

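  1. A data recovery method, wherein the method is applied to a distributed storage system, the distributed storage system comprises a name node, a plurality of first storage nodes, and a plurality of second storage nodes, the plurality of first storage nodes are configured to store, in a distributed manner, different file blocks of a stored file, the plurality of second storage nodes are configured to store, in a distributed manner, check code blocks obtained by encoding the different file blocks, each first storage node comprises a data processing apparatus, each data processing apparatus is communicatively connected to the name node, and the method comprises:
    receiving, by the data processing apparatus, a file block obtaining request sent by user equipment, wherein the file block obtaining request carries an identifier of the target file;
    determining, by the data processing apparatus, that the target file block is lost when the target file is not found according to the identifier of the target file;
    obtaining, by the data processing apparatus from the name node, identifiers of target storage nodes on which recovery dependency data blocks reside, and obtaining the recovery dependency data blocks according to the identifiers of the target storage nodes and the identifier of the target file, wherein the recovery dependency data blocks comprise dependency file blocks and dependency check code blocks required to recover the target file block, one part of the check codes in the dependency check code blocks is obtained by encoding some of the file blocks of the target file, the remaining check codes in the dependency check code blocks are obtained by encoding every file block of the target file, the remaining check codes are the check codes other than the one part of the check codes, and the target file is the file to which the target file block belongs; and
    recovering, by the data processing apparatus, the target file block according to the dependency file blocks and the dependency check code blocks.
  2. The method according to claim 1, wherein the recovering, by the data processing apparatus, the target file block according to the dependency file blocks and the dependency check code blocks comprises:
    recovering, by the data processing apparatus, a first byte in the target file block according to a partial-byte encoding function, the dependency file blocks, and the dependency check code blocks, wherein the partial-byte encoding function is a function that obtains an encoding result by encoding some of the file blocks of the target file; and
    recovering, by the data processing apparatus, a second byte in the target file block according to a full-byte encoding function, the dependency file blocks, and the dependency check code blocks, wherein the full-byte encoding function is a function that obtains an encoding result by encoding every file block of the target file.
  3. The method according to claim 2, wherein the recovering, by the data processing apparatus, a first byte in the target file block according to a partial-byte encoding function, the dependency file blocks, and the dependency check code blocks comprises:
    obtaining, by the data processing apparatus from the dependency file blocks corresponding to first encoding parameters, the dependency bytes required to recover the first byte, and obtaining, from the dependency check code block corresponding to a first encoding result, the check code required to recover the first byte, wherein the first encoding parameters are encoding parameters of the partial-byte encoding function, and the first encoding result is a result obtained by encoding, with the partial-byte encoding function, the dependency bytes indicated by the first encoding parameters and the first byte; and
    decoding, by the data processing apparatus, the check code required to recover the first byte according to the dependency bytes required to recover the first byte, to obtain the first byte.
  4. The method according to claim 2 or 3, wherein the recovering, by the data processing apparatus, a second byte in the target file block according to a full-byte encoding function, the dependency file blocks, and the dependency check code blocks comprises:
    obtaining, by the data processing apparatus from the dependency file blocks corresponding to second encoding parameters, the dependency bytes required to recover the second byte, and obtaining, from the dependency check code block corresponding to a second encoding result, the check code required to recover the second byte, wherein the second encoding parameters are encoding parameters of the full-byte encoding function, and the second encoding result is a result obtained by encoding, with the full-byte encoding function, the dependency bytes indicated by the second encoding parameters and the second byte; and
    decoding, by the data processing apparatus, the check code required to recover the second byte according to the dependency bytes required to recover the second byte, to obtain the second byte.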
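  5. A data storage method, wherein the method is applied to a distributed storage system, the distributed storage system comprises a name node, a plurality of first storage nodes, and a plurality of second storage nodes, the plurality of first storage nodes are configured to store, in a distributed manner, different file blocks of a file to be stored, the plurality of second storage nodes are configured to store, in a distributed manner, check code blocks obtained by encoding the different file blocks, each second storage node comprises a data processing apparatus, each data processing apparatus is communicatively connected to the name node, and the method comprises:
    receiving, by the data processing apparatus, identifiers of a plurality of target storage nodes and an identifier of a target file that are sent by the name node, wherein the plurality of target storage nodes are first storage nodes that already store different file blocks of the target file;
    encoding, by the data processing apparatus, some of the file blocks of the target file according to the identifiers of the target storage nodes and a partial-byte encoding function to obtain a first check code, wherein the partial-byte encoding function is a function that obtains an encoding result by encoding some of the file blocks of the target file;
    encoding, by the data processing apparatus, every file block of the target file according to the identifiers of the target storage nodes and a full-byte encoding function to obtain a second check code, wherein the full-byte encoding function is a function that obtains an encoding result by encoding every file block of the target file; and
    storing, by the data processing apparatus, the first check code and the second check code in the storage space of the second storage node to which the data processing apparatus belongs.
  6. The method according to claim 5, wherein the encoding, by the data processing apparatus, some of the file blocks of the target file according to the identifiers of the target storage nodes and a partial-byte encoding function to obtain a first check code comprises:
    obtaining, by the data processing apparatus from the target storage node corresponding to a first encoding parameter, the byte indicated by the first encoding parameter, wherein the first encoding parameter is each encoding parameter of the partial-byte encoding function; and
    encoding, by the data processing apparatus, the bytes indicated by the first encoding parameters according to the partial-byte encoding function to obtain the first check code.
  7. The method according to claim 5 or 6, wherein the encoding, by the data processing apparatus, every file block of the target file according to the identifiers of the target storage nodes and a full-byte encoding function to obtain a second check code comprises:
    obtaining, by the data processing apparatus from the target storage node corresponding to a second encoding parameter, the byte indicated by the second encoding parameter, wherein the second encoding parameter is each encoding parameter of the full-byte encoding function; and
    encoding, by the data processing apparatus, the bytes indicated by the second encoding parameters according to the full-byte encoding function to obtain the second check code.
  8. The method according to claim 5 or 6, wherein before the encoding, by the data processing apparatus, some of the file blocks of the target file according to the identifiers of the target storage nodes and a partial-byte encoding function to obtain a first check code, the method further comprises:
    determining, by the data processing apparatus according to the number of target storage nodes and the number of check nodes specified by the name node, the number of first parameters in the partial-byte encoding function and the number of identical first parameters contained in the partial-byte encoding functions of two adjacent check nodes, wherein the partial-byte encoding functions of the two adjacent check nodes have the largest number of overlapping first parameters.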
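  9. A data processing apparatus, applied to a distributed storage system, wherein the distributed storage system comprises a name node, a plurality of first storage nodes, and a plurality of second storage nodes, the plurality of first storage nodes are configured to store, in a distributed manner, different file blocks of a stored file, the plurality of second storage nodes are configured to store, in a distributed manner, check code blocks obtained by encoding the different file blocks, each first storage node comprises the data processing apparatus, each data processing apparatus is communicatively connected to the name node, and the data processing apparatus comprises:
    a receiving module, configured to receive a file block obtaining request sent by user equipment, wherein the file block obtaining request carries an identifier of the target file;
    a determining module, configured to determine that the target file block is lost when the target file is not found according to the identifier of the target file received by the receiving module;
    an obtaining module, configured to: after the determining module determines that the target file is lost, obtain, from the name node, identifiers of target storage nodes on which recovery dependency data blocks reside, and obtain the recovery dependency data blocks according to the identifiers of the target storage nodes and the identifier of the target file, wherein the recovery dependency data blocks comprise dependency file blocks and dependency check code blocks required to recover the target file block, one part of the check codes in the dependency check code blocks is obtained by encoding some of the file blocks of the target file, the remaining check codes in the dependency check code blocks are obtained by encoding every file block of the target file, the remaining check codes are the check codes other than the one part of the check codes, and the target file is the file to which the target file block belongs; and
    a recovery module, configured to recover the target file block according to the dependency file blocks and the dependency check code blocks obtained by the obtaining module.
  10. The data processing apparatus according to claim 9, wherein the recovery module comprises:
    a first recovery unit, configured to recover a first byte in the target file block according to a partial-byte encoding function, the dependency file blocks, and the dependency check code blocks, wherein the partial-byte encoding function is a function that obtains an encoding result by encoding some of the file blocks of the target file; and
    a second recovery unit, configured to recover a second byte in the target file block according to a full-byte encoding function, the dependency file blocks, and the dependency check code blocks, wherein the full-byte encoding function is a function that obtains an encoding result by encoding every file block of the target file.
  11. The data processing apparatus according to claim 10, wherein
    the first recovery unit is specifically configured to: obtain, from the dependency file blocks corresponding to first encoding parameters, the dependency bytes required to recover the first byte, and obtain, from the dependency check code block corresponding to a first encoding result, the check code required to recover the first byte, wherein the first encoding parameters are encoding parameters of the partial-byte encoding function, and the first encoding result is a result obtained by encoding, with the partial-byte encoding function, the dependency bytes indicated by the first encoding parameters and the first byte; and decode the check code required to recover the first byte according to the dependency bytes required to recover the first byte, to obtain the first byte.
  12. The data processing apparatus according to claim 10 or 11, wherein
    the second recovery unit is specifically configured to: obtain, from the dependency file blocks corresponding to second encoding parameters, the dependency bytes required to recover the second byte, and obtain, from the dependency check code block corresponding to a second encoding result, the check code required to recover the second byte, wherein the second encoding parameters are encoding parameters of the full-byte encoding function, and the second encoding result is a result obtained by encoding, with the full-byte encoding function, the dependency bytes indicated by the second encoding parameters and the second byte; and decode the check code required to recover the second byte according to the dependency bytes required to recover the second byte, to obtain the second byte.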
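  13. A data processing apparatus, applied to a distributed storage system, wherein the distributed storage system comprises a name node, a plurality of first storage nodes, and a plurality of second storage nodes, the plurality of first storage nodes are configured to store, in a distributed manner, different file blocks of a file to be stored, the plurality of second storage nodes are configured to store, in a distributed manner, check code blocks obtained by encoding the different file blocks, each second storage node comprises a data processing apparatus, each data processing apparatus is communicatively connected to the name node, and the data processing apparatus comprises:
    a receiving module, configured to receive identifiers of a plurality of target storage nodes and an identifier of a target file that are sent by the name node, wherein the plurality of target storage nodes are first storage nodes that already store different file blocks of the target file;
    a first encoding module, configured to encode some of the file blocks of the target file according to the identifiers of the target storage nodes received by the receiving module and a partial-byte encoding function to obtain a first check code, wherein the partial-byte encoding function is a function that obtains an encoding result by encoding some of the file blocks of the target file;
    a second encoding module, configured to encode every file block of the target file according to the identifiers of the target storage nodes received by the receiving module and a full-byte encoding function to obtain a second check code, wherein the full-byte encoding function is a function that obtains an encoding result by encoding every file block of the target file; and
    a storage scheduling module, configured to store the first check code produced by the first encoding module and the second check code produced by the second encoding module in the storage space of the second storage node to which the data processing apparatus belongs.
  14. The data processing apparatus according to claim 13, wherein
    the first encoding module is specifically configured to: obtain, from the target storage node corresponding to a first encoding parameter, the byte indicated by the first encoding parameter, wherein the first encoding parameter is each encoding parameter of the partial-byte encoding function; and encode the bytes indicated by the first encoding parameters according to the partial-byte encoding function to obtain the first check code.
  15. The data processing apparatus according to claim 13 or 14, wherein
    the second encoding module is specifically configured to: obtain, from the target storage node corresponding to a second encoding parameter, the byte indicated by the second encoding parameter, wherein the second encoding parameter is each encoding parameter of the full-byte encoding function; and encode the bytes indicated by the second encoding parameters according to the full-byte encoding function to obtain the second check code.
  16. The data processing apparatus according to claim 13 or 14, wherein the data processing apparatus further comprises:
    a determining module, configured to determine, according to the number of target storage nodes received by the receiving module and the number of check nodes specified by the name node, the number of first parameters in the partial-byte encoding function and the number of identical first parameters contained in the partial-byte encoding functions of two adjacent check nodes, wherein the partial-byte encoding functions of the two adjacent check nodes have the largest number of overlapping first parameters.
  17. A distributed storage system, comprising a name node, a plurality of first storage nodes, and a plurality of second storage nodes, wherein the plurality of first storage nodes are configured to store, in a distributed manner, different file blocks of a stored file, the plurality of second storage nodes are configured to store, in a distributed manner, check code blocks obtained by encoding the different file blocks, each first storage node comprises a first data processing apparatus, each second storage node comprises a second data processing apparatus, and each first data processing apparatus and each second data processing apparatus is communicatively connected to the name node;
    the first data processing apparatus is the data processing apparatus according to any one of claims 9 to 12; and
    the second data processing apparatus is the data processing apparatus according to any one of claims 13 to 16.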
PCT/CN2016/071339 2015-08-17 2016-01-19 Data recovery method, data storage method, and corresponding apparatus and system WO2017028494A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP16836360.4A EP3327571B1 (en) 2015-08-17 2016-01-19 Data storage method and corresponding apparatus
US15/893,201 US10810091B2 (en) 2015-08-17 2018-02-09 Data recovery method, data storage method, and corresponding apparatus and system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510504685.8A CN106469100B (zh) 2015-08-17 2015-08-17 Data recovery method, data storage method, and corresponding apparatus and system
CN201510504685.8 2015-08-17

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US15/893,201 Continuation US10810091B2 (en) 2015-08-17 2018-02-09 Data recovery method, data storage method, and corresponding apparatus and system

Publications (1)

Publication Number Publication Date
WO2017028494A1 true WO2017028494A1 (zh) 2017-02-23

Family

ID=58050726

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/071339 WO2017028494A1 (zh) 2015-08-17 2016-01-19 Data recovery method, data storage method, and corresponding apparatus and system

Country Status (4)

Country Link
US (1) US10810091B2 (zh)
EP (1) EP3327571B1 (zh)
CN (1) CN106469100B (zh)
WO (1) WO2017028494A1 (zh)


Also Published As

Publication number Publication date
CN106469100B (zh) 2019-04-05
EP3327571A1 (en) 2018-05-30
US10810091B2 (en) 2020-10-20
CN106469100A (zh) 2017-03-01
EP3327571A4 (en) 2018-05-30
EP3327571B1 (en) 2021-03-10
US20180165164A1 (en) 2018-06-14

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16836360

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 2016836360

Country of ref document: EP