CN117075814A - Metadata management method, device, equipment and medium


Info

Publication number
CN117075814A
Authority
CN
China
Prior art keywords
disk
disks
target
preset
metadata
Prior art date
Legal status
Pending
Application number
CN202311107543.9A
Other languages
Chinese (zh)
Inventor
方浩
Current Assignee
Zhengzhou Yunhai Information Technology Co Ltd
Original Assignee
Zhengzhou Yunhai Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Zhengzhou Yunhai Information Technology Co Ltd
Priority to CN202311107543.9A
Publication of CN117075814A


Classifications

    • G06F 3/0604 - Interfaces specially adapted for storage systems: improving or facilitating administration, e.g. storage management
    • G06F 3/0643 - Interfaces specially adapted for storage systems, organizing/formatting/addressing of data: management of files
    • G06F 3/067 - Interfaces specially adapted for storage systems, particular infrastructure: distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • G06F 16/27 - Information retrieval; database and file system structures: replication, distribution or synchronisation of data between databases or within a distributed database system; distributed database system architectures therefor
    • Y02D 10/00 - Climate change mitigation in information and communication technologies: energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

The application discloses a metadata management method, apparatus, device, and medium, relating to the field of computer technology and applied to a distributed storage system whose nodes are host nodes constructed based on data processors. The method comprises the following steps: screening a target number of disks from the unscreened disks of each host node according to a preset disk screening rule, and constructing a disk group based on the target number of disks; determining a target disk for storing metadata from the disk group according to a preset redundancy strategy, and storing the metadata to the target disk according to the preset redundancy strategy; and jumping back to the screening step until all disks in each host node have been screened or no disk satisfying the preset disk screening rule exists among the unscreened disks of each host node. The application thereby provides a metadata management method for distributed storage.

Description

Metadata management method, device, equipment and medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a metadata management method, apparatus, device, and medium.
Background
A DPU (Data Processing Unit, i.e., a data processor) is a new generation of processor oriented to the data center. In current data centers, and particularly in the hyper-converged field, a CPU (Central Processing Unit) is typically used to handle infrastructure such as security, communication, storage, and virtualization, but the growth of CPU performance lags far behind the growth of data, so the DPU has gradually become a hotspot of data-center technical innovation. DPUs are currently implemented by adding hardware offload functions, such as security offload, storage offload, and OVS (Open vSwitch, an open virtual switch standard) offload, to a network device, and combining it with an ARM (Advanced RISC Machine) CPU and PCIe (Peripheral Component Interconnect Express, a high-speed serial computer expansion bus standard) devices.
Distributed storage is a key service in a hyper-converged system, providing object, file, and block storage services for it. However, a conventional distributed storage system places high demands on hardware, requiring a CPU with a high clock frequency, many CPU cores, and so on, and it is difficult to run an existing distributed storage system in the internal operating system of a DPU's physical hardware platform, so a dedicated distributed storage system needs to be designed for the DPU hardware platform. At present, it is difficult to design the system's metadata storage, especially in a scenario with scarce CPU resources.
In summary, how to perform metadata management for a distributed storage system in a DPU application scenario is a problem to be solved at present.
Disclosure of Invention
In view of the above, an object of the present application is to provide a metadata management method, apparatus, device, and medium, which can perform metadata management for a distributed storage system in a DPU application scenario. The specific scheme is as follows:
in a first aspect, the present application discloses a metadata management method, applied to a distributed storage system, where a node in the distributed storage system is a host node constructed based on a data processor, and the metadata management method includes:
screening a target number of disks from the unscreened disks of each host node according to a preset disk screening rule, and constructing a disk group based on the target number of disks;
determining a target disk for storing metadata from the disk group according to a preset redundancy strategy, and storing the metadata to the target disk according to the preset redundancy strategy;
and jumping back to the step of screening the target number of disks from the unscreened disks of each host node according to the preset disk screening rule, until all disks in each host node have been screened or no disk satisfying the preset disk screening rule exists among the unscreened disks of each host node.
Optionally, the determining, according to a preset redundancy policy, a target disk for storing metadata from the disk group includes:
if the preset redundancy strategy is a copy strategy, determining a first hot spare disk from the disk group, and taking the disks except the first hot spare disk in the disk group as target disks for storing metadata;
and if the preset redundancy strategy is an erasure code strategy, taking all the disks in the disk group as target disks for storing metadata.
Optionally, the storing the metadata to the target disk according to the preset redundancy policy includes:
if the preset redundancy strategy is a copy strategy, dividing the target disks into a plurality of pairs of mirrored disk arrays by using a disk mirroring method, and striping the mirrored disk arrays by using a striping technique, so as to store the metadata on the processed target disks; wherein the two disks in each mirrored disk array are located at different host nodes;
and if the preset redundancy strategy is an erasure code strategy, processing the target disk by using a distributed parity check method so as to store the metadata by using the processed target disk.
Optionally, the metadata management method further includes:
and if no disk satisfying the preset disk screening rule exists among the remaining unscreened disks in each host node, taking the remaining disks as second hot spare disks.
Optionally, the metadata management method further includes:
during operation of the distributed storage system, monitoring the running information of all disk groups by using a preset disk tool, and judging, based on the running information, whether a faulty disk group currently exists in the distributed storage system;
if a faulty disk group currently exists in the distributed storage system, determining the faulty disk in the faulty disk group;
judging whether an unused hot spare disk currently exists among the second hot spare disks;
if an unused hot spare disk currently exists among the second hot spare disks, taking the currently unused hot spare disks as third hot spare disks, and judging whether the number of the third hot spare disks is one;
if the number of the third hot spare disks is one, directly replacing the faulty disk with the third hot spare disk, and if there are multiple third hot spare disks, determining a target hot spare disk from among them based on a minimum proximity algorithm and replacing the faulty disk with the target hot spare disk;
if no unused hot spare disk currently exists among the second hot spare disks, determining the preset redundancy strategy currently in use;
if the preset redundancy strategy is a copy strategy, determining the first hot spare disk in the faulty disk group, and replacing the faulty disk with the first hot spare disk;
and if the preset redundancy strategy is an erasure code strategy, repairing the faulty disk by using the preset disk tool.
Optionally, before the target number of disks is screened from the unscreened disks of each host node according to the preset disk screening rule, the method further includes:
numbering the disks on all host nodes in the distributed storage system based on host information and disk slot position information in advance to obtain a disk two-dimensional array;
and determining a corresponding disk two-dimensional matrix based on the disk two-dimensional array.
Optionally, the screening a target number of disks from the non-screened disks of each host node according to a preset disk screening rule, and constructing a disk group based on the target number of disks, including:
marking, in the disk two-dimensional matrix, the element positions corresponding to the disks already screened in each host node, and determining a target element position from the unmarked element positions in the disk two-dimensional matrix;
taking the disk corresponding to the target element position as a central disk, and constructing a disk group based on the central disk and the disks corresponding to several element positions around the target element position; wherein the number of disks in the disk group satisfies the target number.
Optionally, the target number is five, and the disk group includes disks from at least three different host nodes.
In a second aspect, the present application discloses a metadata management apparatus, applied to a distributed storage system, where a node in the distributed storage system is a host node constructed based on a data processor, and the metadata management apparatus includes:
a disk group construction module, configured to screen a target number of disks from the unscreened disks of each host node according to a preset disk screening rule, and to construct a disk group based on the target number of disks;
a metadata storage module, configured to determine a target disk for storing metadata from the disk group according to a preset redundancy strategy, and to store the metadata to the target disk according to the preset redundancy strategy;
and a step skipping module, configured to jump back to the step of screening the target number of disks from the unscreened disks of each host node according to the preset disk screening rule, until all disks in each host node have been screened or no disk satisfying the preset disk screening rule exists among the unscreened disks in each host node.
In a third aspect, the present application discloses an electronic device, comprising:
a memory for storing a computer program;
a processor for executing the computer program to implement the steps of the previously disclosed metadata management method.
In a fourth aspect, the present application discloses a computer-readable storage medium for storing a computer program; wherein the computer program when executed by a processor implements the steps of the previously disclosed metadata management method.
It can be seen that the metadata management method of the present application is applied to a distributed storage system in which the nodes are host nodes constructed based on data processors, and specifically includes the steps of: screening a target number of disks from the unscreened disks of each host node according to a preset disk screening rule, and constructing a disk group based on the target number of disks; determining a target disk for storing metadata from the disk group according to a preset redundancy strategy, and storing the metadata to the target disk according to the preset redundancy strategy; and jumping back to the screening step until all disks in each host node have been screened or no disk satisfying the preset disk screening rule exists among the unscreened disks of each host node. Thus, for a distributed storage system in a DPU application scenario, the present application provides a screening and construction strategy for disk groups: a disk group is constructed from the target number of disks screened out of the unscreened disks of each host node according to the preset disk screening rule, and the target disks for storing metadata are then determined within the disk group according to the preset redundancy strategy, so that metadata storage management is carried out using those target disks. The above steps are repeated until all disks in each host node have been screened or no unscreened disk satisfies the preset disk screening rule, thereby completing the screening and construction of all disk groups. Through this scheme, metadata management of the DPU hosts and of the hard disks inside the DPUs is achieved in a DPU application scenario, providing a simplified data management and control scheme for the distributed storage system.
Drawings
In order to explain the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings required in the embodiments or the description of the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present application; a person skilled in the art can derive other drawings from them without inventive effort.
FIG. 1 is a flow chart of a metadata management method disclosed by the application;
FIG. 2 is a schematic diagram illustrating disk group selection in accordance with the present disclosure;
FIG. 3 is a schematic diagram illustrating a cluster disk group selection according to the present disclosure;
FIG. 4 is a flowchart of a specific metadata management method disclosed in the present application;
FIG. 5 is a schematic diagram of disk processing for a copy policy of the present disclosure;
FIG. 6 is a schematic diagram of a disk processing of an erasure coding strategy according to the present disclosure;
FIG. 7 is a flowchart illustrating a hot standby disk selection process when a disk fails according to the present disclosure;
FIG. 8 is a schematic diagram of selecting a hot standby disc based on a minimum proximity method according to the present disclosure;
FIG. 9 is a schematic diagram of a metadata management device according to the present disclosure;
Fig. 10 is a block diagram of an electronic device according to the present disclosure.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
On the physical hardware platform of a DPU, a conventional distributed storage system places high demands on hardware, requiring a CPU with a high clock frequency, many CPU cores, and so on, and is difficult to run in the DPU's internal operating system, so a dedicated distributed storage system needs to be designed for the DPU hardware platform. At present, it is difficult to design the system's metadata storage, especially in a scenario with scarce CPU resources. Therefore, embodiments of the present application disclose a metadata management method, apparatus, device, and medium, which can perform metadata management for a distributed storage system in a DPU application scenario.
Referring to fig. 1, an embodiment of the present application discloses a metadata management method, which is applied to a distributed storage system, where a node in the distributed storage system is a host node constructed based on a data processor, and the method includes:
step S11: and screening a target number of disks from the non-screened disks of each host node according to a preset disk screening rule, and constructing a disk group based on the target number of disks.
In this embodiment, the host nodes are constructed based on data processors (i.e., DPUs). According to the number of hosts and disks in the distributed storage system, a target number of disks is screened from the unscreened disks of each host node by using a preset disk screening rule, and a disk group is constructed from the screened disks; the target number is also specified by the preset disk screening rule.
It should be noted that, before the target number of disks is screened from the unscreened disks of each host node according to the preset disk screening rule, the method further includes: numbering the disks on all host nodes in the distributed storage system in advance based on host information and disk slot information to obtain a disk two-dimensional array; and determining a corresponding disk two-dimensional matrix based on the disk two-dimensional array. That is, the present application numbers the disks on all host nodes in advance based on host information and disk slot information, each number taking the form disk (host node number, slot number), so as to obtain a two-dimensional array of disks; for example, the disk in slot 1 on host node 1 is disk (1, 1), and the disk in slot 6 on host node 5 is disk (5, 6).
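Purely as an illustration, the numbering scheme can be sketched in a few lines of Python (the function name and the 6x6 topology are assumptions made for this example, not part of the claimed method):

```python
# Sketch: number each disk as (host node number, slot number) and arrange
# the numbers into a two-dimensional matrix with one row per host node.

def build_disk_matrix(hosts: int, slots: int) -> list[list[tuple[int, int]]]:
    """Row h-1 holds the disks of host node h; column s-1 holds slot s."""
    return [[(h, s) for s in range(1, slots + 1)] for h in range(1, hosts + 1)]

matrix = build_disk_matrix(hosts=6, slots=6)
print(matrix[0][0])  # (1, 1): the disk in slot 1 on host node 1
print(matrix[4][5])  # (5, 6): the disk in slot 6 on host node 5
```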
In a specific embodiment, the screening of a target number of disks from the unscreened disks of each host node according to a preset disk screening rule and the construction of a disk group based on the target number of disks include: marking, in the disk two-dimensional matrix, the element positions corresponding to the disks already screened in each host node, and determining a target element position from the unmarked element positions in the disk two-dimensional matrix; and taking the disk corresponding to the target element position as a central disk, and constructing a disk group based on the central disk and the disks corresponding to several element positions around the target element position, where the number of disks in the disk group satisfies the target number. That is, the element positions of the disks belonging to already-constructed disk groups are marked in the disk two-dimensional matrix, so the disks at unmarked element positions are the currently unscreened disks; for example, the cross-shaped area in fig. 2 is a constructed disk group, and the remaining areas correspond to the currently unscreened disks. A target element position is then determined from the unmarked element positions, the disk at the target element position is taken as the central disk, and a disk group is constructed from the central disk and the disks at several element positions around the target element position, with the number of disks in the disk group satisfying the target number.
It should be noted that, in this embodiment, the creation of a disk group must also take the preset redundancy strategy of the distributed storage system into account, so the target number is set to five, and a disk group must contain disks from at least three different host nodes. The preset disk screening rule may specifically be: taking one disk of a host node as the center, search for the corresponding disk numbers upward, downward, leftward, and rightward. For example, in the construction shown in fig. 2, with disk (3, 2) as the center, the disk in slot 2 of host node 4 is found upward, the disk in slot 2 of host node 2 is found downward, the disk in slot 1 of host node 3 is selected leftward, and the disk in slot 3 of host node 3 is selected rightward. It should be noted that, in the disk two-dimensional matrix, the left and right boundaries are regarded as adjacent, as are the upper and lower boundaries; the matrix wraps around. Thus, assuming 6 host nodes each providing 6 disks, the disk groups formed by the above method are as shown in fig. 3, where positions carrying the same letter in fig. 3 belong to the same disk group. As can be seen from fig. 3, all disks of host nodes 1, 3, and 5 are used, while two disks remain on each of host nodes 2, 4, and 6; since the remaining 6 disks still cover three different hosts and their total is no less than 5, one more disk group can be created from any 5 of those 6 disks.
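The cross-shaped selection with wraparound can be sketched as follows (an illustrative, non-authoritative reading of the rule; the direction conventions follow the fig. 2 example above):

```python
# Sketch: pick the cross-shaped group of five around a center disk,
# treating the disk matrix as a torus: the left/right boundaries are
# adjacent, and so are the upper/lower boundaries, as described above.

def cross_group(center: tuple[int, int], hosts: int, slots: int):
    h, s = center
    up = (h % hosts + 1, s)              # next host node upward, same slot
    down = ((h - 2) % hosts + 1, s)      # next host node downward, same slot
    left = (h, (s - 2) % slots + 1)      # same host node, slot to the left
    right = (h, s % slots + 1)           # same host node, slot to the right
    return [center, up, down, left, right]

# Centering on disk (3, 2) in the 6x6 layout reproduces the example:
# (4, 2) upward, (2, 2) downward, (3, 1) leftward, (3, 3) rightward.
print(cross_group((3, 2), hosts=6, slots=6))
```

Note that a cross of this shape automatically spans three host nodes (the center's host plus its upper and lower neighbors), which satisfies the at-least-three-hosts constraint whenever there are at least three host nodes.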
Step S12: determining a target disk for storing metadata from the disk group according to a preset redundancy strategy, and storing the metadata to the target disk according to the preset redundancy strategy.
In this embodiment, a target disk for storing metadata in a disk group is determined according to a preset redundancy policy, so that metadata is stored in the target disk according to the preset redundancy policy.
Step S13: jumping back to the step of screening the target number of disks from the unscreened disks of each host node according to the preset disk screening rule, until all disks in each host node have been screened or no disk satisfying the preset disk screening rule exists among the unscreened disks of each host node.
In this embodiment, the above steps are repeated until all disks in each host node have been screened, or until no disk satisfying the preset disk screening rule remains among the unscreened disks of each host node, thereby completing the screening and construction of all disk groups.
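For concreteness, the repeat-until-done loop of steps S11 to S13 might look like the following simplified sketch; it reuses cross_group from the earlier sketch and deliberately omits the "any 5 of the 6 leftover disks" case discussed for fig. 3, so it is an approximation of the rule rather than a full implementation:

```python
# Sketch: keep building cross-shaped groups from unmarked disks until no
# unmarked center yields a fully unmarked group; the leftover disks are
# candidates for second hot spare disks (see below).

def build_all_groups(hosts: int, slots: int):
    marked = set()
    groups = []
    while True:
        candidate = None
        for h in range(1, hosts + 1):
            for s in range(1, slots + 1):
                group = cross_group((h, s), hosts, slots)
                if all(disk not in marked for disk in group):
                    candidate = group
                    break
            if candidate:
                break
        if candidate is None:      # no disk satisfies the screening rule
            break
        groups.append(candidate)
        marked.update(candidate)
    leftovers = [(h, s) for h in range(1, hosts + 1)
                 for s in range(1, slots + 1) if (h, s) not in marked]
    return groups, leftovers
```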
It can be seen that the metadata management method of the present application is applied to a distributed storage system in which the nodes are host nodes constructed based on data processors, and specifically includes the steps of: screening a target number of disks from the unscreened disks of each host node according to a preset disk screening rule, and constructing a disk group based on the target number of disks; determining a target disk for storing metadata from the disk group according to a preset redundancy strategy, and storing the metadata to the target disk according to the preset redundancy strategy; and jumping back to the screening step until all disks in each host node have been screened or no disk satisfying the preset disk screening rule exists among the unscreened disks of each host node. Thus, for a distributed storage system in a DPU application scenario, the present application provides a screening and construction strategy for disk groups: a disk group is constructed from the target number of disks screened out of the unscreened disks of each host node according to the preset disk screening rule, and the target disks for storing metadata are then determined within the disk group according to the preset redundancy strategy, so that metadata storage management is carried out using those target disks. The above steps are repeated until all disks in each host node have been screened or no unscreened disk satisfies the preset disk screening rule, thereby completing the screening and construction of all disk groups. Through this scheme, metadata management of the DPU hosts and of the hard disks inside the DPUs is achieved in a DPU application scenario, providing a simplified data management and control scheme for the distributed storage system.
Referring to fig. 4, an embodiment of the present application discloses a specific metadata management method, and compared with the previous embodiment, the present embodiment further describes and optimizes a technical solution. The method specifically comprises the following steps:
step S21: and screening a target number of disks from the non-screened disks of each host node according to a preset disk screening rule, and constructing a disk group based on the target number of disks.
Step S22: if the preset redundancy strategy is a copy strategy, determining a first hot spare disk from the disk group, and taking the disks except the first hot spare disk in the disk group as target disks for storing metadata; and if the preset redundancy strategy is an erasure code strategy, taking all the disks in the disk group as target disks for storing metadata.
In a specific embodiment, if the preset redundancy strategy is a copy strategy, a first hot spare disk is determined from the disk group, and the disks in the disk group other than the first hot spare disk are taken as target disks for storing metadata. That is, under the copy strategy, not all disks in the disk group are used to store metadata; the disk group must also include the first hot spare disk. In a specific embodiment, the disk in the middle area of the disk group is generally used as the hot spare disk, such as disk (3, 2) in fig. 2. It should be noted that the hot spare disk stores no data; it is mainly used to replace or repair a disk when one fails.
In another embodiment, if the preset redundancy policy is an Erasure Coding (EC) policy, all disks in the disk group are used as target disks for storing metadata. That is, in the erasure coding strategy, all disks in a disk group are used to store metadata.
Step S23: if the preset redundancy strategy is a copy strategy, dividing the target disks into a plurality of pairs of mirrored disk arrays by using a disk mirroring method, and striping the mirrored disk arrays by using a striping technique, so as to store the metadata on the processed target disks; wherein the two disks in each mirrored disk array are located at different host nodes.
In this embodiment, if the preset redundancy strategy is a copy strategy, the target disks are divided into pairs of mirrored disk arrays by using a disk mirroring method (i.e., RAID 1, where RAID stands for Redundant Array of Independent Disks). It should be noted that the two disks of a mirrored disk array must be located at different host nodes. For example, as shown in fig. 5, the disks in the disk group are numbered 1 to 5: disk 2 serves as the hot spare disk, while disks 1, 3, 4, and 5 store metadata and form the mirrored disk arrays. Since a mirrored disk array must avoid selecting two disks of the same host, disk 1 and disk 4 may form one mirrored disk array and disk 3 and disk 5 another, that is, disks 1 and 4 back each other up, as do disks 3 and 5. The striping technique (i.e., RAID 0) is then used to stripe across the two pairs of mirrored disk arrays, achieving unified management of the hard-disk space.
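A minimal sketch of this layout under the stated constraints (the helper name is assumed, and the pairing shown is one valid cross-host choice consistent with the fig. 2 group, not the only one):

```python
# Sketch: under the copy strategy, keep the center disk as the first hot
# spare, form two cross-host mirrored (RAID 1) pairs from the remaining
# four disks, and stripe (RAID 0) across the two pairs.

def copy_policy_layout(group):
    center, up, down, left, right = group
    hot_spare = center                    # e.g. disk (3, 2) in fig. 2
    # left/right share the center's host while up/down sit on other
    # hosts, so pairing left-up and right-down keeps each mirror cross-host.
    mirrors = [(left, up), (right, down)]
    for a, b in mirrors:
        assert a[0] != b[0], "a mirrored pair must span two host nodes"
    return hot_spare, mirrors             # the two mirrors are then striped
```

On the fig. 2 group this reserves (3, 2) as the hot spare and mirrors (3, 1) with (4, 2) and (3, 3) with (2, 2), each pair spanning two host nodes.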
Step S24: if the preset redundancy strategy is an erasure code strategy, processing the target disks by using a distributed parity method, so as to store the metadata on the processed target disks.
In this embodiment, as shown in fig. 6, if the preset redundancy strategy is an erasure code strategy, the target disks are processed by using a distributed parity method (i.e., RAID 5), so that the metadata is stored on the processed target disks. It can be appreciated that RAID 5 employs a parity scheme and achieves the same effect as EC erasure coding.
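The parity idea can be demonstrated in a few lines (a toy XOR-parity sketch to show why one parity disk tolerates a single failure; it is not the actual RAID 5 rotation or the preset disk tool's implementation):

```python
# Sketch: one parity block is the XOR of the data blocks, so any single
# lost block can be rebuilt from the survivors -- the same single-failure
# tolerance that the erasure code strategy provides here.

def xor_blocks(blocks):
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

data = [b"meta", b"data", b"blk3", b"blk4"]  # four data disks in the group
parity = xor_blocks(data)                    # the fifth disk holds parity
rebuilt = xor_blocks([parity] + data[1:])    # rebuild a lost first block
assert rebuilt == data[0]
```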
Step S25: jumping back to the step of screening the target number of disks from the unscreened disks of each host node according to the preset disk screening rule, until all disks in each host node have been screened or no disk satisfying the preset disk screening rule exists among the unscreened disks of each host node.
In this embodiment, the method further includes: if no disk satisfying the preset disk screening rule exists among the remaining unscreened disks in each host node, taking the remaining disks as second hot spare disks. That is, if the remaining unscreened disks of each host node contain no disk satisfying the preset disk screening rule, for example because fewer than 5 disks remain, no further disk group can be constructed, and all remaining disks are therefore used as second hot spare disks.
For the more specific processing procedures of steps S21 and S25, reference may be made to the corresponding contents disclosed in the foregoing embodiments, which are not described in detail again here.
It can be seen that the preset redundancy policies in the embodiment of the present application include a copy policy and an erasure code policy. In the copy policy, the disks in the disk group are not all used for storing metadata, the disk group also needs to include a first hot standby disk, the target disk is further divided into a plurality of pairs of mirror disk arrays by using a mirror disk method, and the plurality of pairs of mirror disk arrays are striped by using a striping technology. In the erasure code strategy, all disks in a disk group are used to store metadata and the target disk needs to be processed using a distributed parity approach. In this way, the application provides a data redundancy implementation scheme in the distributed storage system, and in the selected disk group, a copy policy or an EC redundancy policy is implemented.
Further, in a specific embodiment, the present application provides specific steps of how to select a hot spare disk when the disk fails, as shown in fig. 7, specifically including:
step S31: and in the running process of the distributed storage system, monitoring running information of all the disk groups by using a preset disk tool, and judging whether a faulty disk group with faults exists in the distributed storage system currently based on the running information.
In this embodiment, in the operation process of the distributed storage system, the operation information of all disk groups is monitored in real time by using a preset disk tool, and whether a failed disk group with a failure exists in the current distributed storage system is determined based on the operation information.
Step S32: if a faulty disk group currently exists in the distributed storage system, determining the faulty disk in the faulty disk group, and judging whether an unused hot spare disk currently exists among the second hot spare disks.
In this embodiment, when a faulty disk group exists in the distributed storage system, the faulty disk in that group is determined; specifically, the number of the faulty disk in the disk two-dimensional matrix may be determined. It is then determined whether an unused hot spare disk exists among the current second hot spare disks.
Step S33: if an unused hot spare disk currently exists among the second hot spare disks, taking the currently unused hot spare disks as third hot spare disks, and judging whether the number of the third hot spare disks is one.
In this embodiment, when an unused hot spare disk exists among the second hot spare disks, the currently unused hot spare disks are taken as third hot spare disks, and it is judged whether the number of the third hot spare disks is one.
Step S34: if the number of the third hot spare disks is one, directly replacing the faulty disk with the third hot spare disk, and if there are multiple third hot spare disks, determining a target hot spare disk from among them based on a minimum proximity algorithm and replacing the faulty disk with the target hot spare disk.
In this embodiment, if the number of the third hot spare disks is one, the faulty disk is directly replaced with that third hot spare disk; if there are multiple third hot spare disks, the target hot spare disk is determined from among them based on a minimum proximity algorithm. As shown in fig. 8, assuming that disk (4, 2) in a disk group is damaged and the third hot spare disks are located at disks (6, 1), (6, 4), (4, 4), and (2, 5), the minimum proximity calculation shows that the number of steps from disk (4, 2) to disk (4, 4) is only 2, while the step counts to the other positions are 3 or more, so the nearest disk (4, 4) is selected to replace the faulty disk.
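Assuming the step count is a Manhattan distance with wraparound on both axes of the disk matrix (consistent with the adjacent boundaries described earlier), the minimum proximity choice can be sketched as:

```python
# Sketch: choose the hot spare the fewest steps away from the faulty
# disk, counting steps as toroidal Manhattan distance on the matrix.

def torus_steps(a, b, hosts=6, slots=6):
    dh, ds = abs(a[0] - b[0]), abs(a[1] - b[1])
    return min(dh, hosts - dh) + min(ds, slots - ds)

def pick_spare(faulty, spares, hosts=6, slots=6):
    return min(spares, key=lambda d: torus_steps(faulty, d, hosts, slots))

# The fig. 8 example: (4, 4) is 2 steps from the faulty disk (4, 2) and
# the other spares are 3 or more steps away, so (4, 4) is chosen.
print(pick_spare((4, 2), [(6, 1), (6, 4), (4, 4), (2, 5)]))  # -> (4, 4)
```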
Step S35: if no unused hot spare disk currently exists among the second hot spare disks, determining the preset redundancy strategy currently in use.
In this embodiment, if there is no unused hot spare disk among the second hot spare disks, the preset redundancy strategy currently in use is further determined.
Step S36: if the preset redundancy strategy is a copy strategy, determining the first hot spare disk in the faulty disk group, and replacing the faulty disk with the first hot spare disk.
In this embodiment, if the preset redundancy strategy is a copy strategy, the first hot spare disk in the faulty disk group is determined and the faulty disk is replaced with it, that is, the faulty disk is replaced with the first hot spare disk reserved in the central area of the disk group.
Step S37: if the preset redundancy strategy is an erasure code strategy, repairing the faulty disk by using the preset disk tool.
In this embodiment, if the preset redundancy strategy is an erasure code strategy, then because all disks in the disk group store metadata under that strategy and no hot spare disk exists, the faulty disk is repaired by using the preset disk tool; once idle disks become available in the system, they are used as hot spare disks for the disk group.
Thus, the embodiment of the present application provides a hot spare disk usage strategy within disk groups: when any disk in a disk group fails, an unused third hot spare disk from among the second hot spare disks is used to replace the faulty disk, and when there are multiple third hot spare disks, the nearest one is preferentially selected according to the minimum proximity algorithm. If the second hot spare disks have all been used, the data redundancy strategy currently in use is determined: under a copy strategy, the faulty disk is replaced with the first hot spare disk reserved in the central area of the disk group; under an erasure code strategy, the faulty disk is repaired with a preset disk tool.
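Condensing steps S31 to S37 into a single decision routine gives roughly the following sketch (repair_with_tool stands in for whatever preset disk tool the deployment provides and is hypothetical; pick_spare is the minimum proximity sketch above):

```python
# Sketch of the spare-selection flow once a faulty disk is identified.

def handle_faulty_disk(faulty, unused_second_spares, strategy,
                       first_hot_spare, repair_with_tool):
    if unused_second_spares:                  # third hot spares exist
        if len(unused_second_spares) == 1:    # exactly one: use it directly
            return unused_second_spares[0]
        return pick_spare(faulty, unused_second_spares)
    if strategy == "copy":                    # copy strategy: fall back to
        return first_hot_spare                # the group's first hot spare
    repair_with_tool(faulty)                  # erasure code strategy: repair
    return None                               # no replacement disk selected
```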
Referring to fig. 9, an embodiment of the present application discloses a metadata management apparatus, which is applied to a distributed storage system, where a node in the distributed storage system is a host node constructed based on a data processor, and the apparatus includes:
a disk group construction module 11, configured to screen a target number of disks from the unscreened disks of each host node according to a preset disk screening rule, and to construct a disk group based on the target number of disks;
a metadata storage module 12, configured to determine a target disk for storing metadata from the disk group according to a preset redundancy strategy, and to store the metadata to the target disk according to the preset redundancy strategy;
and a step skipping module 13, configured to jump back to the step of screening the target number of disks from the unscreened disks of each host node according to the preset disk screening rule, until all disks in each host node have been screened or no disk satisfying the preset disk screening rule exists among the unscreened disks in each host node.
It can be seen that the metadata management method of the present application is applied to a distributed storage system in which the nodes are host nodes constructed based on data processors, and specifically includes the steps of: screening a target number of disks from the unscreened disks of each host node according to a preset disk screening rule, and constructing a disk group based on the target number of disks; determining a target disk for storing metadata from the disk group according to a preset redundancy strategy, and storing the metadata to the target disk according to the preset redundancy strategy; and jumping back to the screening step until all disks in each host node have been screened or no disk satisfying the preset disk screening rule exists among the unscreened disks of each host node. Thus, for a distributed storage system in a DPU application scenario, the present application provides a screening and construction strategy for disk groups: a disk group is constructed from the target number of disks screened out of the unscreened disks of each host node according to the preset disk screening rule, and the target disks for storing metadata are then determined within the disk group according to the preset redundancy strategy, so that metadata storage management is carried out using those target disks. The above steps are repeated until all disks in each host node have been screened or no unscreened disk satisfies the preset disk screening rule, thereby completing the screening and construction of all disk groups. Through this scheme, metadata management of the DPU hosts and of the hard disks inside the DPUs is achieved in a DPU application scenario, providing a simplified data management and control scheme for the distributed storage system.
In some specific embodiments, the metadata storage module 12 may specifically include:
the first disk determining unit is used for determining a first hot spare disk from the disk group if the preset redundancy strategy is a copy strategy, and taking the disks except the first hot spare disk in the disk group as target disks for storing metadata;
and the second disk determining unit is used for taking all disks in the disk group as target disks for storing metadata if the preset redundancy strategy is an erasure coding strategy.
In some specific embodiments, the metadata storage module 12 may specifically include:
a first data storage unit, configured to, if the preset redundancy strategy is a copy strategy, divide the target disks into a plurality of pairs of mirrored disk arrays by using a disk mirroring method, and stripe the mirrored disk arrays by using a striping technique, so as to store the metadata on the processed target disks, wherein the two disks in each mirrored disk array are located at different host nodes;
and a second data storage unit, configured to, if the preset redundancy strategy is an erasure code strategy, process the target disks by using a distributed parity method so as to store the metadata on the processed target disks.
In some specific embodiments, the metadata management apparatus further comprises:
and a hot spare disk determining unit, configured to take the remaining unscreened disks in each host node as second hot spare disks if no disk satisfying the preset disk screening rule exists among the remaining disks.
In some specific embodiments, the metadata management apparatus further comprises:
a fault detection unit, configured to monitor, during operation of the distributed storage system, the running information of all disk groups by using a preset disk tool, and to judge, based on the running information, whether a faulty disk group currently exists in the distributed storage system;
a faulty disk determining unit, configured to determine the faulty disk in the faulty disk group if a faulty disk group currently exists in the distributed storage system;
a judging unit, configured to judge whether an unused hot spare disk currently exists among the second hot spare disks;
a hot spare disk determining unit, configured to, if an unused hot spare disk exists among the second hot spare disks, take the currently unused hot spare disks as third hot spare disks and judge whether the number of the third hot spare disks is one;
a first disk replacement unit, configured to directly replace the faulty disk with the third hot spare disk if the number of the third hot spare disks is one, and, if there are multiple third hot spare disks, to determine a target hot spare disk from among them based on the minimum proximity algorithm and replace the faulty disk with the target hot spare disk;
a redundancy strategy determining unit, configured to determine the preset redundancy strategy currently in use if no unused hot spare disk exists among the second hot spare disks;
a second disk replacement unit, configured to, if the preset redundancy strategy is a copy strategy, determine the first hot spare disk in the faulty disk group and replace the faulty disk with it;
and a disk repairing unit, configured to repair the faulty disk by using the preset disk tool if the preset redundancy strategy is an erasure code strategy.
Fig. 10 is a schematic structural diagram of an electronic device 20 according to an embodiment of the present application. Specifically, the electronic device includes: at least one processor 21, at least one memory 22, a power supply 23, a communication interface 24, an input/output interface 25, and a communication bus 26. The memory 22 is used to store a computer program, which is loaded and executed by the processor 21 to implement the relevant steps of the metadata management method performed by the electronic device disclosed in any of the foregoing embodiments.
In this embodiment, the power supply 23 is configured to provide an operating voltage for each hardware device on the electronic device 20; the communication interface 24 can create a data transmission channel between the electronic device 20 and an external device, and the communication protocol to be followed is any communication protocol applicable to the technical solution of the present application, which is not specifically limited herein; the input/output interface 25 is used for acquiring external input data or outputting external output data, and the specific interface type thereof may be selected according to the specific application requirement, which is not limited herein.
The processor 21 may include one or more processing cores, for example a 4-core or 8-core processor. The processor 21 may be implemented in at least one hardware form among DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array), and PLA (Programmable Logic Array). The processor 21 may also include a main processor and a coprocessor: the main processor, also called the CPU (Central Processing Unit), processes data in the awake state, while the coprocessor is a low-power processor that processes data in the standby state. In some embodiments, the processor 21 may integrate a GPU (Graphics Processing Unit) responsible for rendering and drawing the content that the display screen needs to display. In some embodiments, the processor 21 may also include an AI (Artificial Intelligence) processor for handling computing operations related to machine learning.
The memory 22 may be a carrier for storing resources, such as a read-only memory, a random access memory, a magnetic disk, or an optical disk, and the resources stored thereon include an operating system 221, a computer program 222, and data 223, and the storage may be temporary storage or permanent storage.
The operating system 221, which may be Windows, Unix, or Linux, is used to manage and control the hardware devices on the electronic device 20 and the computer program 222, enabling the processor 21 to operate on and process the mass data 223 in the memory 22. In addition to the computer program that performs the metadata management method executed by the electronic device 20 disclosed in any of the foregoing embodiments, the computer program 222 may further include computer programs for other specific tasks. The data 223 may include, besides data received by the electronic device from external devices, data collected through its own input/output interface 25, and so on.
Further, the embodiment of the application also discloses a computer readable storage medium, wherein the storage medium stores a computer program, and when the computer program is loaded and executed by a processor, the metadata management method disclosed in any embodiment is realized.
In this specification, each embodiment is described in a progressive manner, and each embodiment is mainly described in a different point from other embodiments, so that the same or similar parts between the embodiments are referred to each other. For the device disclosed in the embodiment, since it corresponds to the method disclosed in the embodiment, the description is relatively simple, and the relevant points refer to the description of the method section.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps are described above generally in terms of functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Those skilled in the art may implement the described functionality using different approaches for each particular application, but such implementation is not intended to be limiting.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in Random Access Memory (RAM), memory, Read-Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
Finally, it should further be noted that relational terms such as first and second are used herein solely to distinguish one entity or action from another, and do not necessarily require or imply any actual such relationship or order between these entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises that element.
The foregoing has described in detail the metadata management method, apparatus, device, and storage medium provided by the present invention. Specific examples have been used herein to illustrate the principles and embodiments of the present invention, and the above description of the embodiments is only intended to aid understanding of the method and its core idea. Meanwhile, since those skilled in the art may vary the specific embodiments and the scope of application in accordance with the idea of the present invention, the contents of this description should not be construed as limiting the present invention.

Claims (11)

1. A metadata management method, applied to a distributed storage system, wherein the nodes in the distributed storage system are host nodes constructed based on a data processor, the metadata management method comprising:
screening a target number of disks from the unscreened disks of each host node according to a preset disk screening rule, and constructing a disk group based on the target number of disks;
determining a target disk for storing metadata from the disk group according to a preset redundancy strategy, and storing the metadata to the target disk according to the preset redundancy strategy;
and jumping back to the step of screening the target number of disks from the unscreened disks of each host node according to the preset disk screening rule, until all disks in each host node have been screened or no disk satisfying the preset disk screening rule exists among the unscreened disks of each host node.
2. The method for metadata management according to claim 1, wherein the determining a target disk for storing metadata from the disk group according to a preset redundancy policy includes:
if the preset redundancy strategy is a copy strategy, determining a first hot spare disk from the disk group, and taking the disks in the disk group other than the first hot spare disk as target disks for storing metadata;
and if the preset redundancy strategy is an erasure code strategy, taking all the disks in the disk group as target disks for storing metadata.
3. The method of metadata management according to claim 2, wherein storing the metadata to the target disk according to the preset redundancy policy comprises:
if the preset redundancy strategy is a copy strategy, dividing the target disks into a plurality of pairs of mirrored disk arrays by using a disk mirroring method, and striping the mirrored disk arrays by using a striping technique, so as to store the metadata on the processed target disks; wherein the two disks in each mirrored disk array are located at different host nodes;
and if the preset redundancy strategy is an erasure code strategy, processing the target disk by using a distributed parity check method so as to store the metadata by using the processed target disk.
4. The metadata management method according to claim 2, further comprising:
and if no disk satisfying the preset disk screening rule exists among the remaining unscreened disks in each host node, taking the remaining disks as second hot spare disks.
5. The metadata management method according to claim 4, further comprising:
during operation of the distributed storage system, monitoring the running information of all disk groups by using a preset disk tool, and judging, based on the running information, whether a faulty disk group currently exists in the distributed storage system;
if a faulty disk group currently exists in the distributed storage system, determining the faulty disk in the faulty disk group;
judging whether an unused hot spare disk currently exists among the second hot spare disks;
if an unused hot spare disk currently exists among the second hot spare disks, taking the currently unused hot spare disks as third hot spare disks, and judging whether the number of the third hot spare disks is one;
if the number of the third hot spare disks is one, directly replacing the faulty disk with the third hot spare disk, and if there are multiple third hot spare disks, determining a target hot spare disk from among them based on a minimum proximity algorithm and replacing the faulty disk with the target hot spare disk;
if no unused hot spare disk currently exists among the second hot spare disks, determining the preset redundancy strategy currently in use;
if the preset redundancy strategy is a copy strategy, determining the first hot spare disk in the faulty disk group, and replacing the faulty disk with the first hot spare disk;
and if the preset redundancy strategy is an erasure code strategy, repairing the faulty disk by using the preset disk tool.
6. The metadata management method according to any one of claims 1 to 5, wherein, before the target number of disks is screened from the unscreened disks of each host node according to the preset disk screening rule, the method further comprises:
numbering the disks on all host nodes in the distributed storage system based on host information and disk slot position information in advance to obtain a disk two-dimensional array;
and determining a corresponding disk two-dimensional matrix based on the disk two-dimensional array.
7. The method according to claim 6, wherein the screening of a target number of disks from the unscreened disks of each host node according to a preset disk screening rule and the construction of a disk group based on the target number of disks comprise:
marking, in the disk two-dimensional matrix, the element positions corresponding to the disks already screened in each host node, and determining a target element position from the unmarked element positions in the disk two-dimensional matrix;
taking a disk corresponding to the target element position as a central disk, and constructing a disk group based on the central disk and disks corresponding to a plurality of element positions around the target element position; wherein the number of disks in the disk group satisfies a target number.
8. The method of claim 7, wherein the target number is five, and the disk group comprises disks from at least three different host nodes.
9. A metadata management apparatus, for use in a distributed storage system, wherein nodes in the distributed storage system are host nodes configured based on a data processor, comprising:
a disk group construction module, configured to screen a target number of disks from the unscreened disks of each host node according to a preset disk screening rule, and to construct a disk group based on the target number of disks;
a metadata storage module, configured to determine a target disk for storing metadata from the disk group according to a preset redundancy strategy, and to store the metadata to the target disk according to the preset redundancy strategy;
and a step skipping module, configured to jump back to the step of screening the target number of disks from the unscreened disks of each host node according to the preset disk screening rule, until all disks in each host node have been screened or no disk satisfying the preset disk screening rule exists among the unscreened disks in each host node.
10. An electronic device, comprising:
a memory for storing a computer program;
a processor for executing the computer program to implement the steps of the metadata management method according to any one of claims 1 to 8.
11. A computer-readable storage medium storing a computer program; wherein the computer program when executed by a processor implements the steps of the metadata management method according to any of claims 1 to 8.
CN202311107543.9A 2023-08-30 Metadata management method, device, equipment and medium - Pending

Priority Applications (1)

• CN202311107543.9A (priority date and filing date: 2023-08-30) - Metadata management method, device, equipment and medium

Publications (1)

• CN117075814A, published 2023-11-17

Family

• ID: 88713215 (CN)


Legal Events

• PB01: Publication
• SE01: Entry into force of request for substantive examination