CN111857603B - Data processing method and related device - Google Patents


Info

Publication number
CN111857603B
CN111857603B (application CN202010760588.6A)
Authority
CN
China
Prior art keywords
target storage
storage
node
version number
storage unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010760588.6A
Other languages
Chinese (zh)
Other versions
CN111857603A (en)
Inventor
张伟益
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Unisinsight Technology Co Ltd
Original Assignee
Chongqing Unisinsight Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing Unisinsight Technology Co Ltd filed Critical Chongqing Unisinsight Technology Co Ltd
Priority to CN202010760588.6A
Publication of CN111857603A
Application granted
Publication of CN111857603B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0608 Saving storage space on storage systems
    • G06F3/0614 Improving the reliability of storage systems
    • G06F3/0619 Improving the reliability of storage systems in relation to data integrity, e.g. data losses, bit errors
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of distributed storage and provides a data processing method and a related device. The method includes: receiving a data writing request sent by a client; if the first target storage block is the same as the storage block written by the latest data writing request, and the first target storage unit is adjacent to the second target storage unit written by that request, merging the first target storage unit into the first target storage segment corresponding to the second target storage unit; incrementing the version number of the first target storage unit; and, if the incremented version number of the first target storage unit is greater than the version number of the first target storage segment, updating the version number of the first target storage segment with the incremented value and sending the updated segment version number to the metadata node for storage. The invention can greatly reduce the storage space occupied by version numbers and improve the utilization efficiency of the metadata node's storage space.

Description

Data processing method and related device
Technical Field
The invention relates to the technical field of distributed storage, in particular to a data processing method and a related device.
Background
In an existing distributed storage system, a storage node generally includes a plurality of storage blocks, and each storage block includes a plurality of storage units. Data to be stored is first encoded with an erasure code to obtain a plurality of data blocks and at least one check block; the data blocks and check blocks are then sent to different storage nodes, each of which stores the received block into a storage unit of a local storage block. To facilitate management of the storage units, the prior art uses a metadata node to store a version number for each storage unit; whenever the data in a storage unit changes, its version number is incremented. When a storage node contains a large number of storage units, the storage space occupied by these per-unit version numbers grows accordingly, which reduces the utilization efficiency of the metadata node's storage space.
Disclosure of Invention
The invention aims to provide a data processing method and a related device that merge adjacent storage units into a storage segment during data writing and record one version number per storage segment, thereby greatly reducing the storage space occupied by version numbers and improving the utilization efficiency of the metadata node's storage space.
In order to achieve the above purpose, the embodiment of the present invention adopts the following technical solutions:
In a first aspect, the present invention provides a data processing method applied to a storage node in a distributed storage system, where the storage node includes a plurality of storage blocks, each storage block includes a plurality of storage units, the distributed storage system further includes a client and a metadata node, and the storage node is communicatively connected to both the client and the metadata node. The method includes: receiving a data writing request sent by the client, where the data writing request carries an address to be written that identifies the first target storage block to which the data should be written and the first target storage unit within that block; if the first target storage block is the same as the storage block written by the latest data writing request, and the first target storage unit is adjacent to the second target storage unit written by that request, merging the first target storage unit into the first target storage segment corresponding to the second target storage unit; incrementing the version number of the first target storage unit; and, if the incremented version number of the first target storage unit is greater than the version number of the first target storage segment, updating the version number of the first target storage segment with the incremented value and sending the updated segment version number to the metadata node for storage.
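The write path of the first aspect can be sketched as follows. This is a minimal illustration, not the patent's implementation; the names (`Segment`, `StorageBlock`, `write`) and the use of a trailing-segment list are assumptions:

```python
from dataclasses import dataclass

@dataclass
class Segment:
    start_unit: int   # identifier of the segment's first storage unit
    length: int       # number of storage units merged into the segment
    version: int      # version number recorded for the whole segment

class StorageBlock:
    def __init__(self):
        self.segments = []       # storage segments formed so far in this block
        self.unit_versions = {}  # per-unit version numbers kept on the node

    def write(self, unit, last_unit=None):
        """Handle one write to `unit`; `last_unit` is the unit written by the
        most recent request in this block (None if there was none)."""
        # Step S120: increment the version number of the written unit.
        self.unit_versions[unit] = self.unit_versions.get(unit, 0) + 1
        if last_unit is not None and unit == last_unit + 1 and self.segments:
            seg = self.segments[-1]     # step S110: merge into current segment
            seg.length += 1
        else:
            seg = Segment(start_unit=unit, length=1, version=0)
            self.segments.append(seg)   # non-adjacent write opens a new segment
        # Only an incremented unit version larger than the segment version
        # updates the segment version (the value reported to the metadata node).
        if self.unit_versions[unit] > seg.version:
            seg.version = self.unit_versions[unit]
        return seg
```

Two sequential writes to units 10 and 11 thus produce one segment covering both units, with a single version number stored at the metadata node instead of two.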
In a second aspect, the present invention provides a data processing method applied to a metadata node in a distributed storage system, where the metadata node is communicatively connected to storage nodes. The method includes: when it is detected that the version numbers of a second target storage segment are inconsistent across the plurality of storage nodes holding the same group of erasure-coded data, determining the storage node with the smallest version number as the node to be recovered, where the second target storage segment includes a plurality of storage units adjacent in position and the metadata node stores its version number; and sending recovery information to the node to be recovered, so that the node to be recovered reads the data of the second target storage segment from the other storage nodes according to the recovery information and recovers its local data from the read data. The recovery information includes the information of the storage nodes other than the node to be recovered, the version number of the second target storage segment, the starting position of the second target storage segment, and the length of the second target storage segment.
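The consistency check of the second aspect can be sketched as a single comparison over the per-node segment versions. The function name and the shape of the returned recovery info are assumptions; the patent additionally includes the segment's start position and length, which are elided here:

```python
def find_node_to_recover(segment_versions):
    """segment_versions: {node_id: version of the second target segment on
    that node}. Returns (node_to_recover, recovery_info) when the versions
    disagree, or None when all nodes are consistent."""
    versions = set(segment_versions.values())
    if len(versions) <= 1:
        return None                       # all replicas of the segment agree
    # The node with the smallest version number is the one that missed writes.
    node = min(segment_versions, key=segment_versions.get)
    peers = [n for n in segment_versions if n != node]
    return node, {"peers": peers, "version": max(versions)}
```

The node to be recovered would then read the segment's data from the listed peers and rebuild its local copy.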
In a third aspect, the present invention provides a data processing apparatus applied to a storage node in a distributed storage system, where the storage node includes a plurality of storage blocks, each storage block includes a plurality of storage units, the distributed storage system further includes a client and a metadata node, and the storage node is communicatively connected to both the client and the metadata node. The apparatus includes: a receiving module, configured to receive a data writing request sent by the client, where the data writing request carries an address to be written that identifies the first target storage block to which the data should be written and the first target storage unit within that block; and a processing module, configured to merge the first target storage unit into the first target storage segment corresponding to a second target storage unit if the first target storage block is the same as the storage block written by the latest data writing request and the first target storage unit is adjacent to the second target storage unit written by that request; to increment the version number of the first target storage unit; and, if the incremented version number of the first target storage unit is greater than the version number of the first target storage segment, to update the version number of the first target storage segment with the incremented value and send the updated segment version number to the metadata node for storage.
In a fourth aspect, the present invention provides a data processing apparatus applied to a metadata node in a distributed storage system, where the metadata node is communicatively connected to storage nodes. The apparatus includes: a detection module, configured to determine the storage node with the smallest version number as the node to be recovered when it is detected that the version numbers of a second target storage segment are inconsistent across the plurality of storage nodes holding the same group of erasure-coded data, where the second target storage segment includes a plurality of storage units adjacent in position and the metadata node stores its version number; and a recovery module, configured to send recovery information to the node to be recovered, so that the node to be recovered reads the data of the second target storage segment from the other storage nodes according to the recovery information and recovers its local data from the read data, where the recovery information includes the information of the storage nodes other than the node to be recovered, the version number of the second target storage segment, the starting position of the second target storage segment, and the length of the second target storage segment.
In a fifth aspect, the present invention provides a distributed storage system, where the distributed storage system includes a storage node, a client and a metadata node, the storage node is in communication connection with both the client and the metadata node, and the client is in communication connection with the metadata node, and the distributed storage system is configured to implement the data processing method applied to the storage node, or implement the data processing method applied to the metadata node.
In a sixth aspect, the present invention provides a computer apparatus comprising: one or more processors; a memory for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the above-described data processing method applied to the storage node or the above-described data processing method applied to the metadata node.
In a seventh aspect, the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the above-described data processing method applied to a storage node, or implements the above-described data processing method applied to a metadata node.
Compared with the prior art, the method and the device merge adjacent storage units into one storage segment during data writing and record one version number for each storage segment, thereby greatly reducing the storage space occupied by version numbers and improving the utilization efficiency of the metadata node's storage space.
Drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present invention and therefore should not be considered limiting of the scope; those skilled in the art can obtain other related drawings from these drawings without inventive effort.
Fig. 1 is an architecture diagram of a distributed storage system provided by an embodiment of the present invention.
Fig. 2 shows a block schematic diagram of a computer device provided by an embodiment of the present invention.
Fig. 3 is a flowchart illustrating a data processing method according to an embodiment of the present invention.
FIG. 4 is a schematic diagram illustrating storage unit position adjacency provided by an embodiment of the present invention.
Fig. 5 is a flowchart illustrating another data processing method according to an embodiment of the present invention.
Fig. 6 is a flowchart illustrating another data processing method according to an embodiment of the present invention.
Fig. 7 is a flowchart illustrating another data processing method according to an embodiment of the present invention.
FIG. 8 shows a block schematic diagram of a data processing apparatus for use with a storage node according to an embodiment of the present invention.
FIG. 9 shows a block schematic diagram of a data processing apparatus applied to a metadata node according to an embodiment of the present invention.
Icon: 10-a computer device; 11-a processor; 12-a memory; 13-a bus; 14-a communication interface; 20-a storage node; 30-a metadata node; 40-a client; 100-data processing means applied to the storage nodes; 110-a receiving module; 120-a processing module; 200-data processing means applied to the metadata node; 210-a detection module; 220-a recovery module; 230-read module.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined or explained in subsequent figures.
In the description of the present invention, it should be noted that if the terms "upper", "lower", "inside", "outside", etc. indicate an orientation or a positional relationship based on that shown in the drawings or that the product of the present invention is used as it is, this is only for convenience of description and simplification of the description, and it does not indicate or imply that the device or the element referred to must have a specific orientation, be constructed in a specific orientation, and be operated, and thus should not be construed as limiting the present invention.
Furthermore, the appearances of the terms "first," "second," and the like, if any, are used solely to distinguish one from another and are not to be construed as indicating or implying relative importance.
It should be noted that the features of the embodiments of the present invention may be combined with each other without conflict.
Referring to fig. 1, fig. 1 shows an architecture diagram of a distributed storage system according to an embodiment of the present invention, in fig. 1, the distributed storage system includes a plurality of storage nodes 20, a metadata node 30, and a client 40, the storage nodes 20 are communicatively connected to the metadata node 30 and the client 40, and the client 40 is communicatively connected to the storage nodes 20 and the metadata node 30.
The storage node 20 is configured to store data that a user needs to store, where the storage node 20 includes a plurality of storage blocks, each storage block includes a plurality of storage units, and each storage unit corresponds to one version number, and when data is written into any storage unit, the version number of the storage unit is incremented.
The metadata node 30 is used to store metadata for managing the storage nodes. For example, when the client 40 wants to send data to a storage node, it must first apply to the metadata node 30 for a storage space and permission to write to it; the metadata node 30 returns to the client 40 a list of the storage nodes 20 to which the data should be written and the storage units of the storage blocks within those nodes. The version numbers of the storage units are stored at the metadata node 30.
The client 40 receives data that a user needs to store, segments the received data, and performs erasure coding on each segment to obtain a group of encoded blocks; the number of blocks in the group is determined by a preset erasure coding rule. For example, with a (5,3) erasure coding rule, each piece of original data is divided into 5 data blocks and 3 check blocks. The client 40 stores these 8 blocks on different storage nodes 20, and when any 3 blocks are in error (whether data blocks or check blocks), the original 5 data blocks can be restored through the corresponding reconstruction algorithm, thereby realizing redundant storage of the data and improving its reliability.
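How one slice fans out across storage nodes can be illustrated with a single XOR parity block. This is a simplified stand-in, not the patent's (5,3) code (which tolerates three lost blocks and would use a real erasure code such as Reed-Solomon); the helper names are assumptions:

```python
def encode_slice(data, k):
    """Split `data` into k equal data blocks plus one XOR parity block."""
    assert len(data) % k == 0, "slice must divide evenly into k blocks"
    size = len(data) // k
    blocks = [data[i * size:(i + 1) * size] for i in range(k)]
    parity = bytearray(size)          # XOR of all data blocks, byte by byte
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return blocks, bytes(parity)

def recover_block(blocks, parity, lost):
    """Rebuild the lost data block by XOR-ing parity with the survivors."""
    rebuilt = bytearray(parity)
    for j, block in enumerate(blocks):
        if j == lost:
            continue
        for i, byte in enumerate(block):
            rebuilt[i] ^= byte
    return bytes(rebuilt)
```

Each of the k + 1 blocks would be placed on a different storage node, so the loss of any single block remains recoverable from the others.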
The storage node 20 may be a storage server for storing data, or a storage server group consisting of a plurality of storage servers.
The metadata node 30 may be an entity computer such as a host or a server, a host group consisting of a plurality of hosts, or a server group consisting of a plurality of servers, or a virtual host or a virtual server, or a virtual host group or a virtual server group, which can implement the same function as the entity computer. The metadata node 30 may be a separate piece of hardware or may be an application running on the storage node.
The client 40 may be an entity computer such as a host or a server, a host group composed of a plurality of hosts, or a server group composed of a plurality of servers, or a virtual host or a virtual server, or a virtual host group or a virtual server group, which can realize the same function as the entity computer. The client 40 may be a stand-alone piece of hardware or an application running on the storage node 20 or the metadata node 30.
In the prior art, the metadata node 30 records a version number for each storage unit. The larger the storage capacity of the distributed storage system, the larger the number of storage units, and the more of the metadata node 30's storage space is occupied by per-unit version numbers, which greatly affects the effective utilization of that storage space.
To mitigate this problem, a conventional approach is to reduce the version information of the storage units of a storage block by increasing the number of snapshots; however, increasing the number of snapshots lengthens the snapshot data chain, which in turn degrades data-reading performance.
The inventor carefully studied the characteristics of data storage in distributed storage systems applied to the field of video monitoring and found that video monitoring requires a large amount of data to be stored, that writes are more frequent than reads, and that data is usually written sequentially. That is, the data written by successive write requests is continuous, so the storage locations on a storage node 20 are also continuous: the storage units written by two consecutive write requests are units with adjacent positions.
The inventor provides a data processing method and related apparatus applied to the distributed storage system shown in fig. 1 based on the above findings, which will be described in detail below.
Referring to fig. 2, fig. 2 shows a computer device 10, which may be the storage node 20 or the metadata node 30 in fig. 1. The computer device 10 includes a processor 11, a memory 12, a bus 13, and a communication interface 14. The processor 11 and the memory 12 are connected by the bus 13. When the computer device 10 is a storage node 20, the processor 11 is connected to the metadata node 30 and the client 40 through different communication interfaces 14; when the computer device 10 is a metadata node 30, the processor 11 is communicatively connected to the storage node 20 and the client 40 through different communication interfaces 14.
The processor 11 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware or instructions in the form of software in the processor 11. The Processor 11 may be a general-purpose Processor, and includes a Central Processing Unit (CPU), a Network Processor (NP), and the like; but may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), an off-the-shelf programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components.
The memory 12 is used to store a program, such as the data processing apparatus applied to the storage node 20 or the data processing apparatus applied to the metadata node 30. The data processing apparatus includes at least one software functional module that can be stored in the memory 12 in the form of software or firmware. When the stored program is the data processing apparatus applied to the storage node 20, the processor 11, after receiving an execution instruction, executes the program to implement the data processing method applied to the storage node 20; when the stored program is the data processing apparatus applied to the metadata node 30, the processor 11, after receiving an execution instruction, executes the program to implement the data processing method applied to the metadata node 30.
The Memory 12 may include a high-speed Random Access Memory (RAM) and may also include a non-volatile Memory (non-volatile Memory), such as at least one disk Memory. Alternatively, the memory 12 may be a storage device built in the processor 11, or may be a storage device independent of the processor 11.
The bus 13 may be an ISA bus, a PCI bus, an EISA bus, or the like. The bus is represented in fig. 2 by only one double-headed arrow, but this does not mean that there is only one bus or only one type of bus.
On the basis of fig. 1 and fig. 2, an embodiment of the present invention provides a data processing method, which may be applied to the storage node 20 in fig. 1, please refer to fig. 3, and fig. 3 shows a flowchart of the data processing method provided by the embodiment of the present invention, where the method includes the following steps:
step S100, receiving a data writing request sent by a client, where the data writing request carries an address to be written, and the address to be written is used to represent a first target storage block to which data to be written should be written and a first target storage unit in the first target storage block.
In this embodiment, the storage node 20 includes a plurality of storage blocks, whose size may be set in advance as needed, for example 64MB, and each storage block includes a plurality of storage units. The first target storage block is the storage block, among the plurality of storage blocks, to which the data should be written, and the first target storage unit is the storage unit, among the storage units of the first target storage block, to which the data should be written. For example, suppose the storage node 20 includes 5000 storage blocks, each containing 100 storage units; if the address to be written is storage block 500, storage unit 50, then storage block 500 is the first target storage block and storage unit 50 is the first target storage unit.
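Under a hypothetical flat addressing scheme (the patent does not specify the address encoding), the block/unit pair in the example above could be resolved from a flat unit index like this:

```python
# Layout constants taken from the example in the text: 5000 blocks per node,
# 100 units per block. The flat-index encoding itself is an assumption.
UNITS_PER_BLOCK = 100
NUM_BLOCKS = 5000

def resolve(address):
    """Decompose a flat unit index across the node into (block, unit)."""
    block, unit = divmod(address, UNITS_PER_BLOCK)
    assert block < NUM_BLOCKS, "address beyond the node's capacity"
    return block, unit
```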
In this embodiment, after receiving a user's request to write data, the client 40 first applies to the metadata node 30 for a storage space and write permission for that space, then fragments the data to be written and performs erasure coding on each fragment to obtain a group of erasure-coded data corresponding to the fragment. The group includes multiple data blocks and at least one check block, and the data blocks and check blocks of the same group are distributed to different storage nodes 20. For example, the client 40 receives data a to be written, first slices a into a1, a2, and a3, and then performs erasure coding on each slice. Taking (3,2) erasure coding as an example, i.e., 3 data blocks and 2 check blocks: the erasure-coded data of a1 is a11, a12, a13, b11, b12, where a11, a12, a13 are data blocks and b11, b12 are check blocks; similarly, the erasure-coded data of a2 is a21, a22, a23, b21, b22, and the erasure-coded data of a3 is a31, a32, a33, b31, b32.
The client 40 applies to the metadata node 30 for b groups of storage space, where each group corresponds to one group of erasure-coded data. Taking a storage unit size of 4KB as an example, the size of one group of storage space is 5 × 4KB = 20KB; with b = 10, the client 40 applies for 20KB × 10 = 200KB of storage space. The metadata node 30 returns the information of the storage nodes corresponding to the b groups of storage space, including the storage nodes, the storage blocks within those nodes, and the storage units within those blocks. Taking storage nodes 1 to 5 as an example, the client 40 sends data of size k × 4KB (k <= b, where b is usually an integer multiple of k) to storage nodes 1 to 5 simultaneously each time. For storage node 1, the client 40 sends the first 4KB of data in a11; if the address to be written is storage block 500, storage unit 10, that data is written into storage units 10 and 11 of storage block 500. The storage of the remaining data is similar and is not repeated here.
It should be noted that, as a specific embodiment, when detecting that the currently applied-for storage space is about to be used up, the client 40 applies to the metadata node 30 for another b groups of storage space in advance, so as to write subsequent data.
In step S110, if the first target storage block is the same as the storage block requested to be written by the latest data writing request, and the first target storage unit is adjacent to the second target storage unit requested to be written by the latest data writing request, the first target storage unit is merged into the first target storage segment corresponding to the second target storage unit.
In this embodiment, the latest data writing request is the last data writing request temporally adjacent to the current one. In one embodiment, two storage units of the same storage block are adjacent in position if the ending address of one unit abuts the starting address of the other. In another embodiment, if the storage units are numbered sequentially by address in advance, two units with adjacent numbers are necessarily adjacent in position, so adjacency can be determined from the numbering. Referring to fig. 4, fig. 4 is a schematic diagram illustrating storage unit position adjacency according to an embodiment of the present invention. Fig. 4(a) shows adjacency by address: with a unit size of 4KB, the starting address of unit 1 is 0KB and its ending address is 4KB−1, while the starting address of unit 2 is 4KB, so units 1 and 2 are adjacent; similarly, units 2 and 3 are adjacent. Fig. 4(b) shows adjacency by numbering: unit 1 is adjacent to unit 2, and unit 2 is adjacent to unit 3.
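The two adjacency tests described above can be sketched as follows; the function names and the fixed 4KB unit size (taken from the example) are assumptions:

```python
UNIT_SIZE = 4 * 1024  # 4KB storage units, as in the example in the text

def adjacent_by_address(start_a, start_b):
    """True if the unit starting at start_a ends right where start_b begins."""
    return start_a + UNIT_SIZE == start_b

def adjacent_by_number(num_a, num_b):
    """True if sequentially numbered units have consecutive numbers."""
    return num_b - num_a == 1
```

Either test identifies the same pairs when units are numbered sequentially by address, which is why the numbering shortcut is valid.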
In this embodiment, a storage segment includes at least one storage unit. When data is written to any storage unit, if the storage unit written this time is not adjacent to the storage unit written last time, the storage unit written this time is taken as a new storage segment; if the storage unit written this time is adjacent to the storage unit written last time, it is merged into the storage segment corresponding to the storage unit written last time.
In the present embodiment, the first target storage segment is the storage segment, among the plurality of storage segments, that corresponds to the second target storage unit. Taking numbering by address as an example: suppose the number of the second target storage unit is 10 and it belongs to the 1# storage segment, which includes storage units 5 to 10. If the number of the first target storage unit is 11, the first target storage unit is adjacent to the second target storage unit and is merged into the 1# storage segment, which then includes storage units 5 to 11.
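A minimal sketch of the merge in step S110, representing a storage segment as a hypothetical (first_unit, unit_count) pair; the patent does not fix a concrete representation, so these names are illustrative:

```python
def merge_into_segment(segment, unit_no):
    # segment is (first_unit, unit_count); e.g. (5, 6) covers units 5..10
    first, count = segment
    if unit_no != first + count:   # must be the unit right after the segment's last
        raise ValueError("unit is not adjacent to the segment")
    return (first, count + 1)
```

With the example above, merging unit 11 into the 1# segment (5, 6) yields (5, 7), i.e. units 5 to 11.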
Step S120, increment the version number of the first target storage unit.
In this embodiment, each storage unit corresponds to a version number. When the data in a storage unit changes, the version number of that storage unit is incremented; if the storage unit belongs to a storage segment, the version number of that storage segment is updated accordingly, and what is ultimately stored is the version number of the storage segment. As a specific implementation manner, the version number of a storage segment may be represented as: (identification of storage unit, k, version number), where the identification of storage unit is the identification of the starting storage unit of the segment, and k is the number of storage units the segment includes. If a storage unit does not belong to any storage segment, the version number of that storage unit itself is stored.
Step S130, if the incremented version number of the first target storage unit is greater than the version number of the first target storage segment, updating the version number of the first target storage segment with the incremented version number of the first target storage unit and sending the updated version number of the first target storage segment to the metadata node for storage.
In this embodiment, if the version number of the first target storage segment is 5 and the incremented version number of the first target storage unit is 6, the version number of the first target storage segment is updated to 6.
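Under the (identification of storage unit, k, version number) representation above, the version bump of steps S120–S130 can be sketched as follows; the class and function names are illustrative, not from the patent:

```python
class SegmentVersion:
    # (identification of storage unit, k, version number) from the text:
    # start_unit identifies the segment's starting storage unit, k is the
    # number of storage units the segment includes
    def __init__(self, start_unit, k, version):
        self.start_unit = start_unit
        self.k = k
        self.version = version

def on_unit_written(segment, unit_version):
    # Step S120: increment the written unit's version number.
    # Step S130: if it now exceeds the segment's version, update the segment's
    # version (which would then be sent to the metadata node for storage).
    unit_version += 1
    if unit_version > segment.version:
        segment.version = unit_version
    return unit_version
```

For the example in the text, a segment at version 5 whose unit is incremented to 6 ends up at version 6.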
It should be noted that, as a specific implementation manner, when data is written into the first target storage unit, a snapshot is created for the first target storage block and the data to be written is written into the snapshot. The version number of the first target storage segment is recorded in the metadata node 30 as metadata of the snapshot, the location of the first target storage unit is saved at the tail of the first target storage block, and the storage node 20 periodically merges the data in the snapshot into the first target storage unit of the first target storage block.
The data processing method provided by the embodiment of the invention can combine the storage units adjacent to each other into one storage segment during data writing, and record one version number for each storage segment, thereby greatly reducing the storage space occupied by the version numbers and improving the utilization efficiency of the storage space of the metadata node.
In this embodiment, in order to still manage the first target storage unit when it is not adjacent to the second target storage unit, and to facilitate merging when the storage unit written by the next request is adjacent to the first target storage unit, an embodiment of the present invention further provides a method for updating the version number of the first target storage unit in this scenario. Please refer to fig. 5, which shows a flowchart of another data processing method provided in the embodiment of the present invention; the method further includes the following steps:
in step S140, if the first target storage unit and the second target storage unit are not adjacent, the first target storage unit is used as a new storage segment, and the version number of the first target storage unit is incremented.
And step S150, taking the version number of the incremented first target storage unit as the version number of a new storage section and sending the version number of the new storage section to the metadata node for storage.
In this embodiment, when the storage unit written by the next write request and the first target storage unit belong to the same storage block and are adjacent to the first target storage unit, the storage unit written by the next write request may be merged into the storage segment corresponding to the first target storage unit.
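The two write-path branches (merge when adjacent, step S110; open a new segment otherwise, step S140) can be sketched together as one decision; the (first_unit, unit_count) segment representation is a hypothetical one chosen for illustration:

```python
def place_write(last_block, last_unit, last_segment, block, unit):
    # last_segment is (first_unit, unit_count) for the segment of the last write.
    # Same block and consecutive unit -> merge into that segment (step S110);
    # otherwise the written unit opens a new storage segment (step S140).
    if last_block == block and unit == last_unit + 1:
        first, count = last_segment
        return (first, count + 1)
    return (unit, 1)
```

A later write adjacent to the new segment's unit can then be merged into it, which is exactly the preparation described above.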
In the data processing method provided by the embodiment of the present invention, when the first target storage unit is not adjacent to the second target storage unit, the first target storage unit can still be managed, and preparation is made in advance for merging when the storage unit to be written in the next request is adjacent to the first target storage unit.
Based on the same inventive concept as the above embodiment, an embodiment of the present invention further provides a data processing method applied to a metadata node in a distributed storage system, please refer to fig. 6, where fig. 6 shows a flowchart of another data processing method provided in an embodiment of the present invention, where the method includes the following steps:
step S200, when detecting that the version numbers of second target storage sections in a plurality of storage nodes corresponding to the same group of erasure code coded data are not consistent, determining the storage node with the minimum version number as a node to be recovered, wherein the second target storage section comprises a plurality of storage units with adjacent positions, and the metadata node stores the version number of the second target storage section.
In this embodiment, the metadata node 30 stores the version number of each storage segment and information of the storage segment, and when data in a storage unit in any storage segment in any storage node 20 changes, the version number of the corresponding storage segment is updated accordingly, and the storage node 20 synchronizes the updated version number of the storage segment to the metadata node 30.
In this embodiment, for the multiple groups of erasure code encoded data stored in the distributed storage system, the metadata node 30 may periodically detect whether the version numbers of the storage segments in the multiple storage nodes of each group are consistent. If they are consistent, the group of erasure code encoded data is stored normally; otherwise, the group is stored abnormally, but the abnormal data in the group may be recovered from the other, normal data in the group, and at this time the metadata node 30 starts a recovery process. As a specific implementation manner, when data storage is abnormal, the metadata node 30 may mark the storage units in the storage segment corresponding to the abnormal data as a to-be-recovered state; the metadata node 30 then creates an independent thread and starts the recovery process for the storage units marked as to-be-recovered, at which point their state is changed to the recovering state.
In this embodiment, within the same group of erasure code encoded data, the storage node 20 with the smallest version number is a node to be recovered where the group's data is stored abnormally, that is, the data on that storage node 20 needs to be recovered. For example, the same group of erasure code encoded data is distributed on storage node 1 to storage node 5 with corresponding version numbers 5, 4, 5, 5, and 4, respectively; storage node 2 and storage node 5 are then the nodes to be recovered.
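The selection of nodes to be recovered in step S200 can be sketched as follows, using the five-node example above:

```python
def nodes_to_recover(versions):
    # versions maps storage_node_id -> segment version number for one group
    # of erasure code encoded data (step S200)
    if len(set(versions.values())) <= 1:
        return []                      # all consistent: nothing to recover
    lowest = min(versions.values())    # the smallest version is stale
    return sorted(n for n, v in versions.items() if v == lowest)
```

For version numbers {1: 5, 2: 4, 3: 5, 4: 5, 5: 4}, the nodes to be recovered are 2 and 5, matching the example.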
In the present embodiment, the second target storage segments of the plurality of storage nodes 20 store the same group of erasure code encoded data.
Step S210, sending recovery information to the node to be recovered, so that the node to be recovered reads data in the second target storage segment from the storage nodes except the node to be recovered according to the recovery information, and recovers the data in the node to be recovered according to the read data, where the recovery information includes information of the storage nodes except the node to be recovered in the plurality of storage nodes, a version number of the second target storage segment, a starting position of the second target storage segment, and a length of the second target storage segment.
In this embodiment, the data on the node to be recovered may be recovered from the data on the storage nodes 20 other than the node to be recovered. The metadata node 30 sends the recovery information to the node to be recovered, which reads the data on the other storage nodes 20 according to the recovery information and then generates the data to be recovered on itself from the read data according to the encoding principle of erasure codes.
In this embodiment, the recovery information may be expressed as: (storage node id list; start position of the second target storage segment; length of the second target storage segment; version number of the second target storage segment). For example, the recovery information (1, 3, 5; 128KB; 100; 5) indicates that the data of 100 storage units with version number 5 needs to be read starting from the 128KB position of storage nodes 1, 3, and 5.
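The recovery-information tuple can be sketched as follows; the field names are illustrative, since the patent only specifies the tuple's contents:

```python
from collections import namedtuple

# Field names are illustrative; the patent gives only the tuple's contents:
# (storage node id list; start position; length; version number)
RecoveryInfo = namedtuple(
    "RecoveryInfo", ["source_nodes", "start_kb", "length_units", "version"])

# The (1, 3, 5; 128KB; 100; 5) example from the text
info = RecoveryInfo(source_nodes=[1, 3, 5], start_kb=128,
                    length_units=100, version=5)
```

The node to be recovered would read `length_units` storage units at version `version`, starting at `start_kb`, from each node in `source_nodes`.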
In the data processing method provided by the embodiment of the present invention, the metadata node 30 finds data with abnormal storage in time according to the version numbers of the storage segments of the plurality of storage nodes 20 in the same group of erasure code encoded data, and recovers the data with abnormal storage, thereby improving the reliability of the data of the distributed storage system.
In this embodiment, in order to quickly and accurately read out data stored in a distributed storage system, an embodiment of the present invention further provides a method for reading data, please refer to fig. 7, where fig. 7 shows a flowchart of another data processing method provided in the embodiment of the present invention, where the method includes the following steps:
step S300, receiving the address to be read sent by the client.
In this embodiment, before the client reads data from the storage node 20, the client first applies for a read permission from the metadata node 30 and determines a storage node corresponding to an address to be read and a location in the corresponding storage node.
In this embodiment, in order to ensure that the read data is correct, if the storage unit corresponding to the address to be read is being written, the data in that storage unit is not allowed to be read; it may be read only after the write to the storage unit has completed.
Step S310, determining a target storage node corresponding to the address to be read and a third target storage unit in the plurality of storage units of the target storage node according to the address mapping relationship.
In this embodiment, the address mapping relationship is used to represent the mapping relationship between the address of the storage space in the distributed storage system and the storage node and the storage unit corresponding to the storage node.
In this embodiment, the target storage node is a storage node corresponding to the address to be read, and the third target storage unit is a storage unit corresponding to the address to be read in a plurality of storage units in the storage node, for example, the address mapping relationship is as shown in table 1:
TABLE 1

Address | Storage node | Storage block | Data unit
100KB   | 1            | 1             | 100
100KB   | 2            | 1             | 110
100KB   | 3            | 1             | 120
100KB   | 4            | 1             | 110
100KB   | 5            | 1             | 120
The storage space of the address 100KB is distributed among the data unit 100 of the storage block 1 of the storage node 1, the data unit 110 of the storage block 1 of the storage node 2, the data unit 120 of the storage block 1 of the storage node 3, the data unit 110 of the storage block 1 of the storage node 4, and the data unit 120 of the storage block 1 of the storage node 5.
As a specific implementation manner, the address mapping relationship may include the correspondence between a preset number of key addresses and the storage nodes and target storage units; the mapping for an arbitrary address may then be obtained by calculation from the address mapping relationship, the size of the storage block, and the size of the storage unit.
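A sketch of computing an arbitrary address's location, assuming a round-robin striping layout in which consecutive storage units go to consecutive nodes; the actual layout is whatever the metadata node's address mapping defines and may differ:

```python
def locate(address, unit_size, block_size, n_nodes):
    # Hypothetical round-robin striping: consecutive units land on consecutive
    # nodes; within a node, units fill storage blocks in order.
    unit_index = address // unit_size            # global unit number
    node = unit_index % n_nodes                  # striped across nodes
    per_node = unit_index // n_nodes             # unit index within that node
    units_per_block = block_size // unit_size
    return node, per_node // units_per_block, per_node % units_per_block
```

With 4KB units, 1MB blocks, and 5 nodes, address 0 maps to node 0, block 0, unit 0, and the sixth unit wraps back to node 0.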
In step S320, the version number of the third target storage unit is obtained.
In this embodiment, as a specific implementation manner, the method for obtaining the version number may be:
and if the third target storage unit has a corresponding third target storage segment, taking the version number of the third target storage segment as the version number of the third target storage unit.
In this embodiment, if the third target storage unit does not have a corresponding storage segment, the version number of the third target storage unit may be directly obtained.
In step S330, if the number of the maximum version numbers in the version numbers of the third target storage unit is greater than the preset value, the target storage node with the maximum version number is used as the storage node to be read.
In this embodiment, the preset value is related to the erasure coding strategy. For example, if the erasure coding strategy adopted is (n, k), that is, n data blocks and k check blocks are obtained after encoding, then the preset value is n; in the same group of erasure code encoded data, the data of any k blocks can be calculated from the data of the remaining n blocks under the same erasure coding strategy.
In this embodiment, the data in the third target storage unit with the maximum version number is correct and latest data.
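The check in step S330 can be sketched as follows; since n maximum-version copies suffice to decode an (n, k) group, the condition on the preset value is read here as "at least n", which is an interpretive assumption:

```python
def nodes_to_read(versions, n):
    # versions maps node -> version of the third target storage unit; n is the
    # data-block count of the (n, k) erasure coding strategy. If at least n
    # copies carry the maximum version number, those nodes can serve the read.
    top = max(versions.values())
    holders = sorted(node for node, v in versions.items() if v == top)
    return holders if len(holders) >= n else None
```

If fewer than n copies hold the maximum version, the read cannot be served from consistent latest data and the function returns None.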
Step S340, feeding back the third target storage unit of the storage node to be read to the client, so that the client obtains the data to be read according to the data read from the third target storage unit of the storage node to be read.
In this embodiment, for the same group of erasure code encoded data, when the third target storage units of the storage nodes to be read hold all of the data blocks, the data to be read may be read from the data blocks directly. When they hold both data blocks and check blocks, the data blocks and check blocks need to be read, and the missing data blocks are calculated from them. For example, the target storage nodes are storage nodes 1 to 5, with data blocks stored on storage nodes 1 to 3 and check blocks on storage nodes 4 and 5, and the storage nodes to be read are storage nodes 1, 3, and 5. The data block of storage node 2 is then calculated from the data blocks on storage nodes 1 and 3 and the check block on storage node 5; finally, the client 40 splices the data blocks of storage nodes 1, 2, and 3 together and returns the result to the user.
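The patent does not specify the erasure code beyond its (n, k) shape; the following toy single-parity XOR scheme, a deliberate simplification, illustrates the principle of computing a missing data block from the remaining data blocks and one check block:

```python
def xor_blocks(a, b):
    # Bytewise XOR of two equal-length blocks
    return bytes(x ^ y for x, y in zip(a, b))

# Toy scheme with one check block: parity = d1 ^ d2 ^ d3, so any single
# missing data block can be rebuilt from the other two plus the parity.
d1, d2, d3 = b"aa", b"bb", b"cc"
parity = xor_blocks(xor_blocks(d1, d2), d3)

# Rebuild d2 as if its storage node were unavailable
rebuilt_d2 = xor_blocks(xor_blocks(d1, d3), parity)
```

A production (n, k) code with k > 1 check blocks would use Reed-Solomon-style encoding rather than plain XOR, but the read path is the same: fetch any n blocks, decode, splice, and return.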
According to the data processing method provided by the embodiment of the invention, data is read from the storage node to be read with the maximum version number larger than the preset value, and the data to be read can be quickly and accurately obtained according to the read data.
In order to perform the corresponding steps in the above-described embodiments and various possible implementations, an implementation of the data processing apparatus 100 applied to the storage node is given below. Referring to fig. 8, fig. 8 is a block diagram illustrating a data processing apparatus 100 applied to a storage node according to an embodiment of the present invention. It should be noted that the basic principle and the resulting technical effect of the data processing apparatus 100 applied to the storage node provided in the present embodiment are the same as those of the above embodiments; for the sake of brevity, reference may be made to the corresponding contents of the above embodiments for what is not mentioned in this embodiment.
The data processing apparatus 100 applied to the storage node includes a receiving module 110 and a processing module 120.
The receiving module 110 is configured to receive a data writing request sent by a client, where the data writing request carries an address to be written, and the address to be written is used to represent a first target storage block to which data to be written should be written and a first target storage unit in the first target storage block.
The processing module 120 is configured to, if the first target storage block is the same as the storage block that was requested to be written by the latest write data, and the first target storage unit is adjacent to the second target storage unit that was requested to be written by the latest write data, merge the first target storage unit into the first target storage segment corresponding to the second target storage unit; incrementing the version number of the first target storage unit; and if the version number of the first target storage unit after increment is larger than the version number of the first target storage section, updating the version number of the first target storage section by using the version number of the first target storage unit after increment and sending the updated version number of the first target storage section to the metadata node for storage.
As a specific embodiment, the processing module 120 is further configured to: if the first target storage unit and the second target storage unit are not adjacent, the first target storage unit is used as a new storage section, and the version number of the first target storage unit is increased progressively; and taking the version number of the first target storage unit after increment as the version number of the new storage section and sending the version number of the new storage section to the metadata node for storage.
In order to perform the corresponding steps in the above-described embodiments and various possible implementations, an implementation of the data processing apparatus 200 applied to the metadata node is given below. Referring to fig. 9, fig. 9 is a block diagram illustrating a data processing apparatus 200 applied to a metadata node according to an embodiment of the present invention. It should be noted that the basic principle and the resulting technical effect of the data processing apparatus 200 applied to the metadata node provided in the present embodiment are the same as those of the above embodiments; for the sake of brevity, reference may be made to the corresponding contents of the above embodiments for what is not mentioned in this embodiment.
The data processing apparatus 200 applied to the metadata node includes a detection module 210, a recovery module 220, and a reading module 230.
The detecting module 210 is configured to, when it is detected that version numbers of second target storage segments in multiple storage nodes corresponding to the same group of erasure coding data are not consistent, determine a storage node with a smallest version number as a node to be recovered, where the second target storage segment includes multiple storage units adjacent in position, and the metadata node stores the version number of the second target storage segment.
The recovery module 220 is configured to send recovery information to the node to be recovered, so that the node to be recovered reads data in the second target storage segment from the storage nodes except the node to be recovered according to the recovery information, and recovers the data in the node to be recovered according to the read data, where the recovery information includes information of the storage nodes except the node to be recovered in the plurality of storage nodes, a version number of the second target storage segment, a start position of the second target storage segment, and a length of the second target storage segment.
The reading module 230 is configured to receive an address to be read, which is sent by a client; determining a target storage node corresponding to the address to be read and a third target storage unit in the plurality of storage units of the target storage node according to the address mapping relation; acquiring the version number of a third target storage unit; if the number of the maximum version numbers in the version numbers of the third target storage units is larger than a preset value, the target storage node with the maximum version number is used as a storage node to be read; and feeding back the third target storage unit of the storage node to be read to the client so that the client obtains the data to be read according to the data read from the third target storage unit in the storage node to be read.
As a specific implementation manner, the reading module 230 is specifically configured to: and if the third target storage unit has a corresponding third target storage segment, taking the version number of the third target storage segment as the version number of the third target storage unit.
The embodiment of the present invention further provides a distributed storage system. The distributed storage system includes a storage node 20, a client 40, and a metadata node 30; the storage node 20 is in communication connection with both the client 40 and the metadata node 30, and the client 40 is in communication connection with the metadata node 30. The storage node 20, the client 40, and the metadata node 30 in the distributed storage system cooperate with each other to implement the data processing method applied to the storage node 20 or the data processing method applied to the metadata node 30. When the distributed storage system stores data, the client 40 and the metadata node 30 cooperate with the storage node 20 to implement any of the above steps S100 to S150; when the distributed storage system recovers data, the storage node 20 and the client 40 cooperate with the metadata node 30 to implement any of the above steps S200 to S210; and when the distributed storage system reads data, the storage node 20 and the client 40 cooperate with the metadata node 30 to implement any of the above steps S300 to S340.
The present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the above-described data processing method applied to a storage node, or implements the above-described data processing method applied to a metadata node.
In summary, an embodiment of the present invention provides a data processing method and a related apparatus, which are applied to a storage node in a distributed storage system, where the storage node includes a plurality of storage blocks, each storage block includes a plurality of storage units, the distributed storage system further includes a client and a metadata node, and the storage node is in communication connection with both the client and the metadata node. The method includes: receiving a data writing request sent by the client, wherein the data writing request carries an address to be written, and the address to be written is used for representing a first target storage block into which the data to be written should be written and a first target storage unit in the first target storage block; if the first target storage block is the same as the storage block written by the latest data writing request, and the first target storage unit is adjacent to the second target storage unit written by the latest data writing request, merging the first target storage unit into a first target storage segment corresponding to the second target storage unit; incrementing the version number of the first target storage unit; and if the incremented version number of the first target storage unit is larger than the version number of the first target storage segment, updating the version number of the first target storage segment with the incremented version number of the first target storage unit and sending the updated version number of the first target storage segment to the metadata node for storage.
Compared with the prior art, the embodiment of the invention can combine the storage units adjacent to each other into one storage segment and record one version number for each storage segment during data writing, thereby greatly reducing the storage space occupied by the version numbers and improving the utilization efficiency of the storage space of the metadata node.
The above description is only for the specific embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.

Claims (9)

1. A data processing method applied to a storage node in a distributed storage system, where the storage node includes a plurality of storage blocks, each storage block includes a plurality of storage units, the distributed storage system further includes a client and a metadata node, and the storage node is communicatively connected to both the client and the metadata node, and the method includes:
receiving a data writing request sent by the client, wherein the data writing request carries an address to be written, and the address to be written is used for representing a first target storage block into which the data to be written should be written and a first target storage unit in the first target storage block;
if the first target storage block is the same as the storage block which is requested to be written by the latest data writing request, and the first target storage unit is adjacent to the second target storage unit which is requested to be written by the latest data writing request, merging the first target storage unit into a first target storage segment corresponding to the second target storage unit;
incrementing a version number of the first target storage unit;
if the version number of the first target storage unit after increment is larger than the version number of the first target storage section, updating the version number of the first target storage section by using the version number of the first target storage unit after increment and sending the updated version number of the first target storage section to the metadata node for storage;
if the first target storage unit and the second target storage unit are not adjacent, taking the first target storage unit as a new storage section, and increasing the version number of the first target storage unit;
and taking the increased version number of the first target storage unit as the version number of the new storage section and sending the version number of the new storage section to the metadata node for storage.
2. A data processing method, applied to a metadata node in a distributed storage system, where the metadata node is in communication connection with a storage node, and the distributed storage system further includes a client, where the method includes:
when detecting that the version numbers of second target storage sections in a plurality of storage nodes corresponding to the same group of erasure code coded data are not consistent, determining the storage node with the minimum version number as a node to be recovered, wherein the second target storage section comprises a plurality of storage units adjacent in position, and the metadata node stores the version number of the second target storage section;
sending recovery information to the node to be recovered to enable the node to be recovered to read data in the second target storage segment from storage nodes except the node to be recovered according to the recovery information and recover the data in the node to be recovered according to the read data, where the recovery information includes information of the storage nodes except the node to be recovered among the plurality of storage nodes, a version number of the second target storage segment, a start position of the second target storage segment, and a length of the second target storage segment, where the second target storage segment is a first target storage segment when the client requests to write data, and the manner of writing data into the first target storage segment is:
receiving a data writing request sent by the client, wherein the data writing request carries a to-be-written address, and the to-be-written address is used for representing a first target storage block into which the data to be written should be written and a first target storage unit in the first target storage block;
if the first target storage block is the same as the storage block which is requested to be written by the latest data writing request, and the first target storage unit is adjacent to the second target storage unit which is requested to be written by the latest data writing request, merging the first target storage unit into a first target storage segment corresponding to the second target storage unit;
incrementing a version number of the first target storage unit;
if the incremented version number of the first target storage unit is greater than the version number of the first target storage section, updating the version number of the first target storage section by using the incremented version number of the first target storage unit and sending the updated version number of the first target storage section to the metadata node for storage.
3. The data processing method of claim 2, wherein the metadata node is further in communication connection with a client, the metadata node stores address mapping relationships between addresses and storage nodes and storage units of corresponding storage nodes in advance, and the method comprises:
receiving an address to be read sent by the client;
determining a target storage node corresponding to the address to be read and a third target storage unit in a plurality of storage units of the target storage node according to the address mapping relation;
acquiring the version number of the third target storage unit;
if the number of the maximum version numbers in the version numbers of the third target storage units is larger than a preset value, taking the target storage node with the maximum version number as a storage node to be read;
and feeding back the third target storage unit of the storage node to be read to the client so that the client obtains the data to be read according to the data read from the third target storage unit of the storage node to be read.
4. The data processing method of claim 3, wherein the step of obtaining the version number of the third target storage unit comprises:
and if the third target storage unit has a corresponding third target storage segment, taking the version number of the third target storage segment as the version number of the third target storage unit.
5. A data processing apparatus, applied to a storage node in a distributed storage system, where the storage node includes a plurality of storage blocks, each of the storage blocks includes a plurality of storage units, the distributed storage system further includes a client and a metadata node, and the storage node is communicatively connected to both the client and the metadata node, the apparatus includes:
a receiving module, configured to receive a data writing request sent by the client, wherein the data writing request carries an address to be written, and the address to be written characterizes a first target storage block into which data is to be written and a first target storage unit in the first target storage block;
a processing module configured to: if the first target storage block is the same as the storage block written by the latest data writing request, and the first target storage unit is adjacent to the second target storage unit written by the latest data writing request, merge the first target storage unit into the first target storage segment corresponding to the second target storage unit; increment a version number of the first target storage unit; and if the incremented version number of the first target storage unit is greater than the version number of the first target storage segment, update the version number of the first target storage segment with the incremented version number of the first target storage unit and send the updated version number of the first target storage segment to the metadata node for storage;
the processing module being further configured to: if the first target storage unit and the second target storage unit are not adjacent in position, take the first target storage unit as a new storage segment and increment the version number of the first target storage unit; and take the incremented version number of the first target storage unit as the version number of the new storage segment and send it to the metadata node for storage.
6. A data processing apparatus, for use in a metadata node in a distributed storage system, the metadata node being communicatively coupled to a storage node, the distributed storage system further comprising a client, the apparatus comprising:
a detection module, configured to determine the storage node with the smallest version number as a node to be recovered when detecting that the version numbers of second target storage segments in a plurality of storage nodes corresponding to the same group of erasure-code encoded data are inconsistent, wherein the second target storage segment includes a plurality of storage units adjacent in position, and the metadata node stores the version numbers of the second target storage segments;
a recovery module, configured to send recovery information to the node to be recovered, so that the node to be recovered reads data of the second target storage segment from the storage nodes other than the node to be recovered according to the recovery information and recovers the data in the node to be recovered from the read data, wherein the recovery information includes information of the storage nodes other than the node to be recovered among the plurality of storage nodes, the version number of the second target storage segment, the starting position of the second target storage segment, and the length of the second target storage segment, the second target storage segment being the first target storage segment when the client sends a data writing request, and data being written into the first target storage segment in the following manner:
receiving a data writing request sent by the client, wherein the data writing request carries an address to be written, and the address to be written characterizes a first target storage block into which data is to be written and a first target storage unit in the first target storage block;
if the first target storage block is the same as the storage block written by the latest data writing request, and the first target storage unit is adjacent in position to the second target storage unit written by the latest data writing request, merging the first target storage unit into the first target storage segment corresponding to the second target storage unit;
incrementing a version number of the first target storage unit;
if the incremented version number of the first target storage unit is greater than the version number of the first target storage segment, updating the version number of the first target storage segment with the incremented version number of the first target storage unit and sending the updated version number of the first target storage segment to the metadata node for storage.
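The metadata-node trigger in claim 6 — detect disagreeing segment versions across the erasure-code group and pick the lagging node as the one to recover — can be sketched as follows. The function and variable names are illustrative, not from the patent:

```python
def find_node_to_recover(segment_version_by_node):
    """segment_version_by_node maps each storage node in the same
    erasure-code group to its version of the second target storage segment.
    Returns the node with the smallest version if the versions disagree,
    or None if they are consistent and no recovery is needed."""
    if len(set(segment_version_by_node.values())) <= 1:
        return None  # all replicas agree; nothing to recover
    return min(segment_version_by_node, key=segment_version_by_node.get)
```

A node returned here would then be sent the recovery information (peer node list, segment version, starting position, and length) so it can re-read and rebuild the stale segment from its peers.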
7. A distributed storage system, comprising storage nodes, clients and metadata nodes, wherein the storage nodes are communicatively connected to both the clients and the metadata nodes, and the clients are communicatively connected to the metadata nodes, and wherein the distributed storage system is configured to implement the data processing method according to claim 1, or the data processing method according to any one of claims 2 to 4.
8. A computer device, characterized in that the computer device comprises:
one or more processors;
a memory for storing one or more programs that, when executed by the one or more processors, cause the one or more processors to implement the data processing method of claim 1 or the data processing method of any one of claims 2 to 4.
9. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the data processing method of claim 1 or the data processing method of any one of claims 2 to 4.
CN202010760588.6A 2020-07-31 2020-07-31 Data processing method and related device Active CN111857603B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010760588.6A CN111857603B (en) 2020-07-31 2020-07-31 Data processing method and related device


Publications (2)

Publication Number Publication Date
CN111857603A CN111857603A (en) 2020-10-30
CN111857603B (en) 2022-12-02

Family

ID=72953746

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010760588.6A Active CN111857603B (en) 2020-07-31 2020-07-31 Data processing method and related device

Country Status (1)

Country Link
CN (1) CN111857603B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114594914B (en) * 2022-03-17 2024-04-02 阿里巴巴(中国)有限公司 Control method and system for distributed storage system
CN114968668A (en) * 2022-06-17 2022-08-30 重庆紫光华山智安科技有限公司 Data processing method and device, data access terminal and storage medium

Citations (7)

Publication number Priority date Publication date Assignee Title
WO2011021643A1 (en) * 2009-08-21 2011-02-24 株式会社日立ソリューションズ Update data generation apparatus, information apparatus, and program
CN105493080A (en) * 2013-12-23 2016-04-13 华为技术有限公司 Method and apparatus for context aware based data de-duplication
CN106302607A (en) * 2015-06-05 2017-01-04 腾讯科技(深圳)有限公司 It is applied to block storage system and the method for cloud computing
CN110413694A (en) * 2019-08-01 2019-11-05 重庆紫光华山智安科技有限公司 Metadata management method and relevant apparatus
CN110572437A (en) * 2016-03-01 2019-12-13 张君 Data interaction method and system for client and server in online and offline states
CN110688065A (en) * 2019-09-05 2020-01-14 苏州浪潮智能科技有限公司 Storage space management method, system, electronic equipment and storage medium
CN111399765A (en) * 2019-12-31 2020-07-10 杭州海康威视系统技术有限公司 Data processing method and device, electronic equipment and readable storage medium

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
CA2434276A1 (en) * 2003-07-03 2005-01-03 Ibm Canada Limited - Ibm Canada Limitee Password management
US9824007B2 (en) * 2014-11-21 2017-11-21 Sandisk Technologies Llc Data integrity enhancement to protect against returning old versions of data
CN106484311B (en) * 2015-08-31 2019-07-19 华为数字技术(成都)有限公司 A kind of data processing method and device
CN106662983B (en) * 2015-12-31 2019-04-12 华为技术有限公司 The methods, devices and systems of data reconstruction in distributed memory system
CN110309100B (en) * 2018-03-22 2023-05-23 腾讯科技(深圳)有限公司 Snapshot object generation method and device
CN109062512B (en) * 2018-07-26 2022-02-18 郑州云海信息技术有限公司 Distributed storage cluster, data read-write method, system and related device

Similar Documents

Publication Publication Date Title
AU2017228544B2 (en) Nonvolatile media dirty region tracking
US11422703B2 (en) Data updating technology
US20160006461A1 (en) Method and device for implementation data redundancy
WO2002046930A2 (en) Data storage system and method employing a write-ahead hash log
CN111857603B (en) Data processing method and related device
US9329799B2 (en) Background checking for lost writes and data corruption
US20240143188A1 (en) Data processing method and apparatus, device, and readable storage medium
EP3522040B1 (en) Method and device for file storage
CN113885809B (en) Data management system and method
US10545825B2 (en) Fault-tolerant enterprise object storage system for small objects
CN109426587B (en) Data recovery method and device
CN116501264B (en) Data storage method, device, system, equipment and readable storage medium
CN113419897A (en) File processing method and device, electronic equipment and storage medium thereof
CN108121504B (en) Data deleting method and device
CN114491145B (en) Metadata design method based on stream storage
CN115878381A (en) Data recovery method and device based on SRM disc, storage medium and electronic device
CN111736778B (en) Data updating method, device and system and electronic equipment
CN113485872A (en) Fault processing method and device and distributed storage system
CN111190765B (en) Data backup method and system
CN113553215A (en) Erasure code data recovery optimization method and device based on environmental information
CN111949440A (en) Data recovery method and device
CN111625186B (en) Data processing method, device, electronic equipment and storage medium
CN103092727B (en) Data error-correcting method in flash storage medium and device
CN107301183B (en) File storage method and device
CN117667468A (en) Data restoration method and storage device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant