CN107729185B

CN107729185B - Fault processing method and device

Info

Publication number: CN107729185B
Application number: CN201711015125.1A
Authority: CN
Inventors: 顾雷雷; 乔辉
Original assignee: Hangzhou H3C Technologies Co Ltd
Current assignee: Hangzhou H3C Technologies Co Ltd
Priority date: 2017-10-26
Filing date: 2017-10-26
Publication date: 2020-12-04
Anticipated expiration: 2037-10-26
Also published as: CN107729185A

Abstract

The invention provides a fault processing method and device. The method comprises: acquiring information on the available memory capacity of a storage node and its number of object storage devices (OSDs); and, when the ratio of the available memory capacity of a target storage node to its number of OSDs is smaller than a first preset threshold, refusing OSD addition operations for the target storage node and setting some of the OSDs of the target storage node to a first type non-working (Down) state, so that the ratio of the available memory capacity of the target storage node to the number of remaining OSDs is greater than or equal to the first preset threshold. The remaining OSDs are the available OSDs of the target storage node other than those set to the first type Down state. The method and device avoid the risk of OSD performance bottlenecks and data recovery failure after a memory failure on a storage node.

Description

Fault processing method and device

Technical Field

The present invention relates to the field of network communication technologies, and in particular, to a fault handling method and apparatus.

Background

Ceph is an open source distributed storage system that provides a software-defined, unified storage solution and offers large-scale expansion, high performance, and no single point of failure.

A typical Ceph cluster deployment creates an OSD (Object Storage Device) for each physical hard disk in the cluster node.

The failure domain of a Ceph cluster typically includes disks, nodes (i.e., servers), racks, power circuits, and the like. When any component in the failure domain fails and causes the OSDs deployed on it to fail, the Ceph cluster marks those OSDs as Down (non-working), performs an initialization operation, and reorganizes the affected data on the failed node.

However, practice shows that in existing Ceph cluster implementations, a partial memory failure on a storage node does not directly affect the state of its OSDs, so the Ceph cluster does not adjust the OSD state when part of the node's memory becomes unavailable. If a data recovery or rebalancing event then occurs in the Ceph cluster, the processing performance of the cluster is seriously degraded, and data recovery may even fail.

Disclosure of Invention

The invention provides a fault processing method and device to solve the problem in the prior art that a memory fault on a storage node can reduce the processing performance of a Ceph cluster and even cause data recovery in the cluster to fail.

According to a first aspect of the embodiments of the present invention, there is provided a fault handling method, including:

acquiring information on the available memory capacity of a storage node and its number of object storage devices (OSDs);

when it is determined that the ratio of the available memory capacity of a target storage node among the storage nodes to its number of OSDs is smaller than a first preset threshold, refusing OSD addition operations for the target storage node, and setting some of the OSDs of the target storage node to a first type non-working (Down) state, so that the ratio of the available memory capacity of the target storage node to the number of remaining OSDs is greater than or equal to the first preset threshold; wherein the remaining OSDs are the available OSDs of the target storage node other than those set to the first type Down state.

According to a second aspect of the embodiments of the present invention, there is provided a fault handling method, including:

acquiring information on the storage node's own available memory capacity and number of object storage devices (OSDs);

and reporting the information on the available memory capacity and the number of OSDs to a monitor.

According to a third aspect of embodiments of the present invention, there is provided a fault handling apparatus including:

an acquisition unit, configured to acquire information on the available memory capacity of a storage node and its number of object storage devices (OSDs);

a determining unit, configured to determine whether the ratio of the available memory capacity of a target storage node among the storage nodes to its number of OSDs is smaller than a first preset threshold;

a processing unit, configured to, when the determining unit determines that the ratio of the available memory capacity of the target storage node to its number of OSDs is smaller than the first preset threshold, refuse OSD addition operations for the target storage node and set some of the OSDs of the target storage node to a first type non-working (Down) state, so that the ratio of the available memory capacity of the target storage node to the number of remaining OSDs is greater than or equal to the first preset threshold; wherein the remaining OSDs are the available OSDs of the target storage node other than those set to the first type Down state.

According to a fourth aspect of the embodiments of the present invention, there is provided a fault handling apparatus including:

an acquisition unit, configured to acquire information on the apparatus's own available memory capacity and number of object storage devices (OSDs);

and a sending unit, configured to report the information on the available memory capacity and the number of OSDs to a monitor.

In the embodiments of the present invention, a first preset threshold is set to indicate that the memory resources of a storage node are insufficient. When the ratio of the available memory capacity of a target storage node to its number of OSDs is smaller than the first preset threshold, OSD addition operations for the target storage node are refused and some of its OSDs are set to the first type Down state, so that the ratio of the available memory capacity of the target storage node to the number of remaining OSDs is greater than or equal to the first preset threshold. This avoids the risk of OSD performance bottlenecks and data recovery failure after a memory failure on the storage node.

Drawings

Fig. 1 is a schematic flow chart of a fault handling method according to an embodiment of the present invention;

Fig. 2 is a schematic flow chart of a fault handling method according to an embodiment of the present invention;

Fig. 3 is a schematic structural diagram of a fault handling apparatus according to an embodiment of the present invention;

Fig. 4 is a schematic structural diagram of another fault handling apparatus provided in the embodiment of the present invention;

Fig. 5 is a schematic structural diagram of a fault handling apparatus according to an embodiment of the present invention;

Fig. 6 is a schematic diagram of a hardware structure in which a monitor and a storage node are located on the same physical host according to an embodiment of the present invention.

Detailed Description

In order to make the technical solutions in the embodiments of the present invention better understood and make the above objects, features and advantages of the embodiments of the present invention more comprehensible, the technical solutions in the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.

Fig. 1 is a schematic flow chart of a fault handling method according to an embodiment of the present invention. The method may be applied to a storage node of a Ceph cluster and, as shown in Fig. 1, may include the steps described below.

It should be noted that, unless otherwise specified, in the embodiments of the present invention memory capacity is expressed in units of G, and the number of OSDs of a storage node refers to the number of OSDs in the UP (working) state on that node. This is not repeated below.

Step 101: obtain information on the storage node's own available memory capacity and number of OSDs.

Step 102: report the information on the available memory capacity and the number of OSDs to a monitor.

In the embodiment of the invention, for the OSDs of each storage node in the Ceph cluster to work normally, each OSD on the storage node must be given enough memory; for the storage node as a whole, this means that the ratio of the node's available memory capacity to its number of OSDs must be large enough.

When a storage node is first brought into operation, sufficient memory resources are usually reserved for its OSDs. However, when part of the memory in the storage node fails, for example because a memory module is loose or a memory module interface is faulty, the ratio of the node's available memory capacity to its number of OSDs decreases, and the available memory may even become insufficient for the OSDs to operate normally.

Therefore, in the embodiment of the present invention, the storage node may report information on its available memory capacity and number of OSDs, such as the ratio of the available memory capacity to the number of OSDs, to the monitor according to a preset policy.

For example, the storage node may report the ratio of its available memory capacity to its number of OSDs to the monitor at regular intervals (i.e., periodically); alternatively, the storage node may report the ratio to the monitor only when the ratio satisfies a specified condition.
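
As a rough illustration of what such a report might carry, the sketch below models it as a small record. The class name `MemoryReport`, its field names, and the node identifier are hypothetical; the embodiment only requires that the available memory capacity, the number of OSDs, or their ratio reach the monitor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryReport:
    """Hypothetical payload a storage node could send to the monitor."""
    node_id: str           # illustrative identifier of the reporting node
    available_gb: float    # available memory capacity, in G
    up_osd_count: int      # number of OSDs in the UP state

    @property
    def ratio(self) -> float:
        """Available memory capacity divided by the number of UP OSDs."""
        if self.up_osd_count <= 0:
            raise ValueError("a report needs at least one UP OSD")
        return self.available_gb / self.up_osd_count


# Illustrative values only: 64 G of available memory shared by 32 OSDs.
report = MemoryReport(node_id="node-1", available_gb=64.0, up_osd_count=32)
print(report.ratio)
```

Carrying the two raw values rather than only the ratio lets the receiver recompute the ratio after it changes the number of UP OSDs; this is a design choice of the sketch, not something the text prescribes.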

In an embodiment of the present invention, reporting the information on the available memory capacity and the number of OSDs to the monitor may include:

sending a first type report to the monitor when the ratio of the storage node's own available memory capacity to its number of OSDs is smaller than a first preset threshold.

In this embodiment, a threshold (referred to herein as the first preset threshold) may be preset in each storage node. The threshold is used to determine whether the memory resources reserved for the OSDs meet the minimum requirement for the OSDs to operate normally. When the ratio of the storage node's available memory capacity to its number of OSDs is greater than or equal to the first preset threshold, the memory resources reserved for the OSDs meet that minimum requirement; otherwise, they do not. The first preset threshold may be set according to the actual scenario, for example according to the actual hard disk capacity of the storage node.

Optionally, the first preset threshold may be any value within a range of 0.5 to 1.5.
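
The threshold check described above can be illustrated with a short sketch. The function names and the sample figures (32 OSDs, 64 G falling to 24 G, a threshold of 1.0) are assumptions made for illustration; only the 0.5 to 1.5 range for the first preset threshold comes from the text.

```python
def memory_per_osd(available_gb: float, up_osd_count: int) -> float:
    """Ratio of available memory (in G) to the number of OSDs in the UP state."""
    if up_osd_count <= 0:
        raise ValueError("up_osd_count must be positive")
    return available_gb / up_osd_count


def below_first_threshold(available_gb: float, up_osd_count: int, h1: float) -> bool:
    """True when the node no longer reserves the minimum memory per OSD."""
    return memory_per_osd(available_gb, up_osd_count) < h1


H1 = 1.0  # assumed value inside the 0.5 to 1.5 range

# Hypothetical node: 32 OSDs, 64 G when healthy, 24 G after a memory fault.
print(below_first_threshold(64, 32, H1))  # healthy node
print(below_first_threshold(24, 32, H1))  # after the fault
```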

Accordingly, in this embodiment, the storage node may obtain the ratio of its available memory capacity to its number of OSDs in real time and determine whether the ratio is smaller than the first preset threshold. When the ratio is smaller than the first preset threshold, the storage node sends a report (referred to herein as a first type report) to the monitor to notify it that the ratio is smaller than the first preset threshold, that is, that the available memory capacity of the storage node is below the minimum requirement for normal operation of the OSDs, so as to trigger the monitor to take the corresponding action.

For example, the storage node may obtain the ratio of its available memory capacity to its number of OSDs in real time, and an OSD daemon may poll this ratio periodically (the period may be set according to the actual scenario, for example 300 seconds) to determine whether it is lower than the first preset threshold. When the OSD daemon determines that the ratio is lower than the first preset threshold, it sends a first type report to the monitor.

A storage node usually runs a plurality of OSD daemons (the number of OSD daemons equals the number of OSDs). To prevent several OSD daemons on the same storage node from repeatedly sending the first type report to the monitor, the OSD daemons may poll the ratio of the available memory capacity to the number of OSDs with different periods, or with the same period but starting at different times.

For example, assume that 32 OSDs are created on the storage node, so that 32 OSD daemons (OSD daemons 1 to 32) run on it, and that each OSD daemon polls the ratio of the node's available memory capacity to its number of OSDs with a period of 300 seconds. OSD daemon 2 may start polling 5 seconds after the polling start time of OSD daemon 1, OSD daemon 3 may start polling 5 seconds after that of OSD daemon 2, …, and OSD daemon 32 may start polling 5 seconds after that of OSD daemon 31.
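
The staggered schedule in this example can be written out as a small sketch. The function names are hypothetical, and times are plain second offsets relative to the first poll of OSD daemon 1; the 32 daemons, 5 second stagger, and 300 second period are the figures from the example.

```python
def polling_start_offsets(daemon_count: int, stagger_s: int) -> list[int]:
    """Offset in seconds of each daemon's first poll, relative to daemon 1."""
    return [i * stagger_s for i in range(daemon_count)]


def poll_times(offset_s: int, period_s: int, horizon_s: int) -> list[int]:
    """All poll instants of one daemon within [0, horizon_s)."""
    return list(range(offset_s, horizon_s, period_s))


offsets = polling_start_offsets(daemon_count=32, stagger_s=5)

# The last daemon starts 155 s after the first, so all 32 first polls
# fit inside one 300 s period and no two daemons poll at the same instant.
print(offsets[:3], offsets[-1])
print(poll_times(offsets[1], period_s=300, horizon_s=900))  # OSD daemon 2
```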

When any OSD daemon finds that the ratio of the storage node's available memory capacity to its number of OSDs is smaller than the first preset threshold, it may send a first type report to the monitor. On receiving the first type report, the monitor may respond with an acknowledgement message, such as an ACK message. After receiving the acknowledgement returned by the monitor, the OSD daemon may send a notification message to the other OSD daemons, which then remain silent for a preset duration after receiving it, that is, they do not poll the ratio of the available memory capacity to the number of OSDs during that time. The silence duration of each OSD daemon must be longer than its polling period; for example, when the polling period is 300 seconds, the silence duration may be 900 seconds or 1800 seconds. If an OSD daemon receives a notification message from another OSD daemon during its silence duration, it resets the silence timer to zero and restarts the timing. Otherwise, that is, if it receives no notification message from other OSD daemons within the silence duration, it may resume polling the ratio at its next polling start time.
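
A minimal sketch of the silence logic on the receiving side follows. The class `SilenceTimer` and its method names are hypothetical, and timestamps are passed in explicitly (in seconds) so that the behaviour is easy to follow; a real daemon would presumably read a clock instead.

```python
from typing import Optional


class SilenceTimer:
    """Tracks whether an OSD daemon is in the silent (non-polling) state."""

    def __init__(self, silence_s: float) -> None:
        self.silence_s = silence_s
        self.silent_until: Optional[float] = None  # no notification received yet

    def on_notification(self, now_s: float) -> None:
        """A peer daemon's report was acknowledged: (re)start the silence timing."""
        self.silent_until = now_s + self.silence_s

    def may_poll(self, now_s: float) -> bool:
        """Polling is allowed only once the silence duration has fully elapsed."""
        return self.silent_until is None or now_s >= self.silent_until


timer = SilenceTimer(silence_s=900)  # longer than the 300 s polling period
timer.on_notification(now_s=0)
timer.on_notification(now_s=600)     # a second notification restarts the timing
print(timer.may_poll(900), timer.may_poll(1500))
```

Because the second notification arrives at 600 s, the daemon stays silent until 1500 s rather than 900 s, which is the reset behaviour described above.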

In the embodiment of the invention, when the monitor receives the first type report from the storage node, it may, on the one hand, refuse OSD addition operations for the storage node. On the other hand, it may take some of the OSDs on the storage node out of service, that is, set them to the first type Down state (also referred to herein as the management Down state), so that the ratio of the node's available memory capacity to the number of remaining OSDs is greater than or equal to the first preset threshold. For the specific implementation, refer to the description of the method flow shown in Fig. 2; it is not repeated here.

The remaining OSDs are the available OSDs of the storage node (i.e., OSDs in the UP state) other than those set to the first type Down state. The first type Down state differs from the Down state of OSDs in an existing Ceph cluster as follows: in the former, the OSD has not failed, i.e., it is capable of working but has been set to a non-working state; in the latter, the OSD has failed, i.e., it is not capable of working and is therefore set to the Down state.

It can be seen that, in the method flow shown in Fig. 1, the storage node obtains the ratio of its available memory capacity to its number of OSDs and reports it to the monitor. When the monitor determines that this ratio is smaller than the first preset threshold, it sets some of the OSDs on the storage node to the first type Down state so that the ratio of the node's available memory capacity to the number of remaining OSDs is greater than or equal to the first preset threshold. This ensures that the storage node reserves enough available memory for its OSDs and avoids the risk of OSD performance bottlenecks and data recovery failure after a memory failure on the storage node.

Fig. 2 is a schematic flow chart of a fault handling method according to an embodiment of the present invention. The method may be applied to a monitor (Monitor) in a Ceph cluster and, as shown in Fig. 2, may include the steps described below.

It should be noted that, in the embodiment of the present invention, the monitor and the storage node may be deployed on different physical hosts or on the same physical host. A storage node may be understood to include at least a set of OSDs that provide storage, although it may also include a processor with control functionality.

Step 201: obtain information on the available memory capacity and the number of OSDs of the storage node.

In the embodiment of the present invention, to avoid the risk of OSD performance bottlenecks and data recovery failure after a memory failure on a storage node, the monitor may obtain information on the available memory capacity and the number of OSDs of the storage node, so that when it determines that the ratio of the two is too low, it can apply a corresponding strategy to raise the ratio. This information is obtained for the storage nodes of the whole cluster, but it is collected per physical host: the available memory capacity of a storage node (the total memory available to its set of OSDs) and its number of OSDs are parameters of the same physical host.

In an embodiment of the present invention, obtaining the information on the available memory capacity and the number of OSDs of the storage node may include:

receiving the information on the available memory capacity and the number of OSDs reported by the storage node.

In this embodiment, the specific implementation of reporting the information of the available memory capacity and the OSD number to the monitor by the storage node may refer to the related description in the method embodiment shown in fig. 1, and the embodiment of the present invention is not described herein again.

In another embodiment of the present invention, obtaining the information on the available memory capacity and the number of OSDs of the storage node may include:

probing the available memory capacity and the number of OSDs of the storage node.

In this embodiment, the monitor may actively probe the available memory capacity and the number of OSDs of the storage node; for example, it may do so at regular intervals (i.e., periodically).

It should be noted that, in this embodiment, any single monitor in the Ceph cluster may probe the ratio of available memory capacity to number of OSDs for all storage nodes in the cluster. Alternatively, the storage nodes in the Ceph cluster may be divided into a plurality of groups according to the number of monitors in the cluster, where the number of groups may equal the number of monitors and each monitor probes the ratio for the storage nodes of one group. Details are not described here.
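
The grouping option can be sketched as follows. The text only says that the number of groups may equal the number of monitors; the round-robin assignment, the function name, and the node and monitor names below are assumptions made for illustration.

```python
def group_nodes_by_monitor(nodes: list[str], monitors: list[str]) -> dict[str, list[str]]:
    """Split storage nodes into one group per monitor (round-robin, by assumption)."""
    if not monitors:
        raise ValueError("at least one monitor is required")
    groups: dict[str, list[str]] = {monitor: [] for monitor in monitors}
    for index, node in enumerate(nodes):
        # Node i is probed by monitor (i mod number_of_monitors).
        groups[monitors[index % len(monitors)]].append(node)
    return groups


# Hypothetical cluster: seven storage nodes and three monitors.
nodes = [f"node{i}" for i in range(1, 8)]
groups = group_nodes_by_monitor(nodes, ["mon-a", "mon-b", "mon-c"])
for monitor, members in groups.items():
    print(monitor, members)
```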

Step 202: when it is determined that the ratio of the available memory capacity of a target storage node among the storage nodes to its number of OSDs is smaller than a first preset threshold, refuse OSD addition operations for the target storage node, and set some of the OSDs of the target storage node to a first type non-working (Down) state, so that the ratio of the available memory capacity of the target storage node to the number of remaining OSDs is greater than or equal to the first preset threshold.

In the embodiment of the present invention, the target storage node does not refer to a fixed storage node, but may refer to any storage node in the Ceph cluster.

In one embodiment of the present invention, determining that the ratio of the available memory capacity of the target storage node to its number of OSDs is smaller than the first preset threshold may include:

receiving a first type report sent by the target storage node.

The specific implementation of sending the first type report to the monitor by the target storage node may refer to the related description in the method flow shown in fig. 1, and details of the embodiment of the present invention are not described herein again.

In this embodiment, when the monitor receives the first type report sent by the target storage node, it may be determined that the ratio of the available memory capacity of the target storage node to the number of OSDs is smaller than the first preset threshold.

In another embodiment of the present invention, determining that the ratio of the available memory capacity of the target storage node to its number of OSDs is smaller than the first preset threshold may include:

detecting that the ratio of the available memory capacity of the target storage node to its number of OSDs is smaller than the first preset threshold.

In this embodiment, the monitor may periodically detect a ratio of the available memory capacity of each storage node in the Ceph cluster to the number of OSDs, and determine whether the ratio of the available memory capacity of each storage node to the number of OSDs is smaller than a first preset threshold.

In the embodiment of the present invention, when the monitor determines that the ratio of the available memory capacity of the target storage node to its number of OSDs is smaller than the first preset threshold, on the one hand, the monitor needs to refuse OSD addition operations for the target storage node. That is, when the monitor receives a notification message requesting that an OSD be added to the target storage node, it prohibits the addition. This prevents an increase in the number of OSDs on the target storage node from further reducing the ratio of available memory capacity to number of OSDs, which would aggravate the risk of OSD performance bottlenecks and data recovery failure on the target storage node.

On the other hand, the monitor may take some of the OSDs of the target storage node out of service, that is, set them to the first type Down state, so that the ratio of the available memory capacity of the target storage node to the number of remaining OSDs is greater than or equal to the first preset threshold. This ensures that the memory resources of the target storage node meet the minimum requirement for normal operation of the OSDs in the UP state.

For example, assume that the first preset threshold is h1, the initial available memory capacity of the target storage node is M0, the initial number of OSDs is N0, and M0/N0 ≥ h1. If, at some moment, a partial memory fault on the target storage node changes the available memory capacity to M1, with M1/N0 < h1, the target storage node sends a first type report to the monitor. The monitor may then select N1 OSDs of the target storage node and set them to the first type Down state, such that M1/(N0 − N1) ≥ h1. Optionally, the number of OSDs set to the first type Down state may be the minimum value of N1 that satisfies M1/(N0 − N1) ≥ h1.
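
The optional minimum value of N1 can be computed directly from the inequality M1/(N0 − N1) ≥ h1. The sketch below does so by counting upward; the function name and the numeric values (N0 = 32, M1 = 24, h1 = 1.0) are illustrative assumptions. It also covers a case the text does not discuss, where even a single OSD cannot meet the threshold, by setting every OSD Down.

```python
def min_osds_to_set_down(m1_gb: float, n0: int, h1: float) -> int:
    """Smallest N1 such that m1_gb / (n0 - N1) >= h1.

    Returns n0 if the threshold cannot be met with even one OSD left UP
    (an assumption of this sketch; the text does not cover that case).
    """
    n1 = 0
    while n0 - n1 > 0 and m1_gb / (n0 - n1) < h1:
        n1 += 1
    return n1


# Hypothetical node: N0 = 32 OSDs, available memory drops to M1 = 24 G, h1 = 1.0.
n1 = min_osds_to_set_down(m1_gb=24, n0=32, h1=1.0)
print(n1, 24 / (32 - n1))
```

For M1 ≥ h1 the same value follows from the closed form N1 = max(0, N0 − floor(M1/h1)); the loop is used here only to mirror the inequality as written.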

Further, in the embodiment of the present invention, after the monitor has performed the above processing on the storage node, the memory failure of the storage node may be repaired. The available memory capacity of the storage node then increases, and the ratio of its available memory capacity to its number of OSDs increases accordingly. At this point, the monitor may restore some or all of the OSDs on the storage node that were set to the first type Down state to the UP state.

After the monitor restores some or all of the OSDs in the first type Down state to the UP state, the ratio of the storage node's available memory capacity to its number of OSDs could fall below the first preset threshold again, so that some OSDs on the storage node would have to be set to the first type Down state once more, producing an oscillation. To avoid this, another threshold (referred to herein as the second preset threshold) may be preset and used to determine whether OSDs in the first type Down state on the storage node are allowed to be restored to the UP state. The second preset threshold is greater than the first preset threshold.

Preferably, the second preset threshold is the ratio of available memory capacity to number of OSDs in an ideal state, i.e., the optimal or preferred ratio between the two.

Optionally, the second preset threshold may be any value within the range of 1.5 to 2.0; when the first preset threshold is 1.5, the second preset threshold must be greater than 1.5.

Accordingly, in an embodiment of the present invention, after some of the OSDs of the target storage node have been set to the first type Down state, the method may further include:

when it is determined that the ratio of the available memory capacity of the target storage node to its number of OSDs is greater than a second preset threshold, allowing OSD addition operations for the target storage node, and restoring some or all of the OSDs of the target storage node that were set to the first type Down state to the UP state, provided that the ratio of the available memory capacity to the number of OSDs after the restoration is greater than or equal to the second preset threshold.

In this embodiment, the monitor determining that the ratio of the available memory capacity of the target storage node to its number of OSDs is greater than the second preset threshold may include:

receiving a second type report sent by the target storage node, or detecting that the ratio of the available memory capacity of the target storage node to its number of OSDs is greater than the second preset threshold.

The second type report is sent when the target storage node determines that the ratio of its available memory capacity to its number of OSDs has changed from being smaller than the first preset threshold to being greater than the second preset threshold. It triggers the monitor to allow OSD addition operations for the target storage node and to restore some or all of the OSDs of the target storage node that were set to the first type Down state to the UP state, provided that the ratio of the available memory capacity to the number of OSDs after the restoration is greater than or equal to the second preset threshold.

The specific implementation of the target storage node sending the second type report to the monitor is similar to the implementation of the target storage node sending the first type report to the monitor, and the details of the embodiment of the present invention are not repeated herein.

In this embodiment, when the monitor determines that the ratio of the available memory capacity of the target storage node to the OSD number is greater than the second preset threshold, the monitor may determine that the target storage node reserves sufficient memory resources for the OSD, and at this time, the target storage node may be allowed to increase the OSD number.

Accordingly, in one aspect, the monitor may allow OSD addition operations for the target storage node; on the other hand, the monitor may restore some or all of the OSDs set to the first type Down state in the target storage node to the UP state.

When the monitor performs the foregoing recovery processing on the target storage node, it must ensure that the ratio of the available memory capacity of the target storage node to its number of OSDs after the recovery is greater than or equal to the second preset threshold.

For example, continuing the above example, assume that the second preset threshold is h2. If, at some moment, the available memory capacity of the target storage node recovers to M2 (M1 < M2 ≤ M0) and M2/(N0 − N1) > h2, the monitor may select N2 (0 < N2 < N1) OSDs from those set to the first type Down state on the target storage node and restore them to the UP state, where M2/(N0 − N1 + N2) ≥ h2.
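
The largest N2 permitted by M2/(N0 − N1 + N2) ≥ h2 can be found in the same way as N1. In the sketch below the function name and the numbers (N0 = 32, N1 = 8, M2 = 42, h2 = 1.5) are illustrative assumptions. The function caps N2 at the number of OSDs in the first type Down state, so it can return N1 itself, which matches the "some or all" wording used elsewhere in the description.

```python
def max_osds_to_restore(m2_gb: float, n_up: int, n_down: int, h2: float) -> int:
    """Largest N2 in 0..n_down such that m2_gb / (n_up + N2) >= h2."""
    n2 = 0
    # Restore one more OSD only if the ratio would still satisfy h2 afterwards.
    while n2 < n_down and m2_gb / (n_up + n2 + 1) >= h2:
        n2 += 1
    return n2


# Hypothetical node: N0 = 32, N1 = 8 (so 24 UP, 8 Down), memory recovers to M2 = 42 G.
n_up, n_down, h2 = 32 - 8, 8, 1.5
assert 42 / n_up > h2  # precondition from the text: M2/(N0 - N1) > h2
n2 = max_osds_to_restore(m2_gb=42, n_up=n_up, n_down=n_down, h2=h2)
print(n2, 42 / (n_up + n2))
```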

Further, in one embodiment of the present invention, when the monitor determines that the ratio of the available memory capacity of the target storage node to its number of OSDs is greater than or equal to the first preset threshold and smaller than the second preset threshold, it refuses OSD addition operations for the target storage node.

In this embodiment, the monitor determining that the ratio is greater than or equal to the first preset threshold and smaller than the second preset threshold may include receiving a third type report sent by the target storage node, or detecting that the ratio of the available memory capacity of the target storage node to its number of OSDs is greater than or equal to the first preset threshold and smaller than the second preset threshold.

The third type report is sent when the target storage node determines that the ratio of its available memory capacity to its number of OSDs is greater than or equal to the first preset threshold and smaller than the second preset threshold. It triggers the monitor to refuse OSD addition operations for the target storage node.

The specific implementation of the target storage node sending the third type report to the monitor is similar to the implementation of the target storage node sending the first type report to the monitor, and details of the embodiment of the present invention are not repeated herein.

It should be noted that, when the ratio of the available memory capacity of the target storage node to the number of OSDs changes from being less than the first preset threshold to being greater than or equal to the first preset threshold but less than the second preset threshold, the storage node can maintain normal operation of its OSDs, but the allocation between the OSDs and the available memory capacity is still not optimal. In this case, the monitor may simply continue to reject OSD addition operations for the target storage node without performing any other special processing.

Further, in one embodiment of the present invention, when the monitor determines that the ratio of the available memory capacity of the target storage node to the number of OSDs is less than the first preset threshold, or is greater than or equal to the first preset threshold and less than the second preset threshold, the monitor may generate a system health log that records that the target storage node has a memory fault. An administrator can then learn of the memory fault of the target storage node from the system health log and handle it as required.
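
The three ranges of the ratio and the monitor actions associated with them can be summarized in a short Python sketch. It is only an illustration under stated assumptions: the function `plan_actions` and the action names are hypothetical labels, not identifiers from Ceph or from this embodiment, and the case where the ratio is exactly equal to the second preset threshold is not covered by the text, so the sketch assumes no action is taken there.

```python
def plan_actions(available_memory, osd_count, h1, h2):
    """Return the monitor actions for the current memory-to-OSD ratio.

    h1 and h2 are the first and second preset thresholds (h1 < h2).
    The returned action names are illustrative labels only.
    """
    if osd_count <= 0:
        raise ValueError("osd_count must be positive")
    ratio = available_memory / osd_count
    if ratio < h1:
        # Memory is insufficient: log, block new OSDs, take some OSDs Down.
        return ("log_health", "reject_osd_add", "set_some_osds_down")
    if ratio < h2:
        # OSDs can run normally, but the allocation is not optimal.
        return ("log_health", "reject_osd_add")
    if ratio > h2:
        # Memory has recovered: allow new OSDs and restore Down OSDs.
        return ("allow_osd_add", "restore_down_osds")
    # ratio == h2: not specified in the text; assumed to need no action.
    return ()
```

For instance, with h1 = 1 and h2 = 1.5, a node with 44G of available memory and 32 OSDs (ratio 1.375) only triggers logging and rejection of OSD addition.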

It should be noted that, in the embodiment of the present invention, after the OSD state of a storage node changes, the Leader of the Ceph cluster needs to update the cluster Map (mapping). Therefore, if a monitor other than the Leader sets some of the OSDs in the storage node to the first type Down state, it needs to notify the Leader node, which then updates the cluster Map. To improve the efficiency of updating the cluster Map, the above processing operations may be performed by the Leader of the Ceph cluster itself.

To help those skilled in the art better understand the technical solutions provided by the embodiments of the present invention, they are described below with reference to a specific example.

In this embodiment, it is assumed that the first preset threshold (h1) is 1 and the second preset threshold (h2) is 1.5, that the initial available memory capacity of the target storage node is 64G (assumed to be provided by sixteen 4G memory modules), and that the number of OSDs is 32. The flow of the fault handling method in this embodiment is as follows:

1. when 5 memory modules in the target storage node fail, the ratio of the available memory capacity of the target storage node to the number of OSDs (denoted h3 below) is h3 = (64 - 4 × 5)/32 = 1.375, that is, h1 < h3 < h2; at this time, the target storage node can report to the monitor through the OSD daemon, that is, send a third type report to the monitor;

when the monitor receives the third type report sent by the target storage node, it determines that the target storage node is in a state of h1 < h3 < h2, and at this time, the monitor may perform the following processing:

a. generating a system health log and recording the memory fault of a target storage node;

b. rejecting OSD increasing operation aiming at the target storage node;

2. when another 4 memory modules of the target storage node fail (that is, 9 memory modules have failed in total), h3 of the target storage node is (64 - 4 × 9)/32 = 0.875, that is, h3 < h1; at this time, the target storage node may report to the monitor through the OSD daemon, that is, send a first type report to the monitor;

when the monitor receives the first type report sent by the target storage node, the monitor determines that the target storage node is in a state of h3 < h1, and at this time, the monitor may perform the following processing:

a. generating a system health log, and recording that a serious memory fault has occurred on the target storage node;

b. rejecting OSD increasing operation aiming at the target storage node;

c. performing work removal processing on some of the OSDs on the target storage node, implemented as follows:

i. calculating the number o1 of OSDs that need work removal processing so that h3 of the target storage node after processing is greater than or equal to h1, that is, 1 ≤ (64 - 4 × 9)/(32 - o1), so that the minimum value of o1 is 4;

ii. the monitor performs work removal processing on 4 randomly selected OSDs on the target storage node, that is, sets the selected 4 OSDs to the management Down state (namely the first type Down state), and the data on these 4 OSDs is moved to other normal OSDs of the Ceph cluster through a data recovery action;

wherein, after performing work removal processing on the 4 OSDs of the target storage node, the monitor may periodically detect the h3 value of the target storage node;

3. when the administrator notices the system health log, locates the target storage node according to it, and repairs the memory (assume that the available memory capacity after the repair is M), h3 of the target storage node increases; when the monitor detects that the target storage node is in a state of h3 > h2 (i.e., h3 > 1.5), the monitor may perform the following processing:

a. allowing an OSD addition operation for the target storage node;

b. restoring some or all of the OSDs in the management Down state on the target storage node to the UP state, implemented as follows:

i. calculating the number o2 of OSDs to be restored to the UP state so that h3 of the target storage node after processing is greater than or equal to h2, that is, 1.5 ≤ M/(28 + o2);

ii. the monitor restores o2 OSDs in the management Down state on the target storage node to the UP state and performs rebalancing processing.
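
The arithmetic of this example can be checked with a short Python sketch. It is illustrative only: the helper names are hypothetical, and because the text leaves the repaired capacity M unspecified, the sketch assumes M = 64 (all memory modules restored) and assumes that the monitor restores as many Down OSDs as the inequality permits.

```python
from fractions import Fraction
import math

H1, H2 = 1, Fraction(3, 2)   # first and second preset thresholds
MODULE_G, OSDS = 4, 32       # sixteen 4G memory modules, 32 OSDs
INITIAL_G = 16 * MODULE_G    # 64G initial available memory


def ratio(available_g, osds):
    """Exact ratio h3 of available memory capacity to number of OSDs."""
    return Fraction(available_g, osds)


def min_osds_to_set_down(available_g, osds, h1):
    """Smallest o1 with available_g / (osds - o1) >= h1."""
    return max(0, osds - math.floor(Fraction(available_g) / Fraction(h1)))


# Step 1: 5 modules fail, h1 < h3 < h2, so a third type report is sent.
h3_after_5 = ratio(INITIAL_G - MODULE_G * 5, OSDS)

# Step 2: 9 modules have failed, h3 < h1, so a first type report is sent.
h3_after_9 = ratio(INITIAL_G - MODULE_G * 9, OSDS)
o1 = min_osds_to_set_down(INITIAL_G - MODULE_G * 9, OSDS, H1)

# Step 3: memory repaired to M (assumed 64 here); 1.5 <= M / (28 + o2).
M = 64
o2 = max(0, min(o1, math.floor(Fraction(M) / H2) - (OSDS - o1)))

print(float(h3_after_5), float(h3_after_9), o1, o2)
```

Under the assumed M = 64, all 4 Down OSDs can be restored, since 64/32 = 2 is still above h2.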

Fig. 3 is a schematic diagram of different operations performed by the monitor according to a change in a ratio between an available memory capacity of the storage node and the number of OSDs.

As can be seen from the above description, in the technical solution provided by the embodiments of the present invention, a first preset threshold is set to indicate that the memory resources of a storage node are insufficient. When it is determined that the ratio of the available memory capacity of a target storage node among the storage nodes to the number of OSDs is less than the first preset threshold, OSD addition operations for the target storage node are rejected and some of the OSDs of the target storage node are set to the first type Down state, so that the ratio of the available memory capacity of the target storage node to the number of remaining OSDs is greater than or equal to the first preset threshold. This avoids the risk of OSD performance bottlenecks and data recovery failures that may arise after a memory fault on the storage node.

Referring to fig. 4, a schematic structural diagram of a fault handling apparatus according to an embodiment of the present invention is provided, where the apparatus may be applied to a monitor in the foregoing method embodiment, and as shown in fig. 4, the fault handling apparatus may include:

an obtaining unit 410, configured to obtain information about the available memory capacity of a storage node and its number of object storage devices (OSDs);

a determining unit 420, configured to determine whether the ratio of the available memory capacity of a target storage node among the storage nodes to the number of OSDs is less than a first preset threshold;

a processing unit 430, configured to, when the determining unit 420 determines that a ratio of an available memory capacity of a target storage node in the storage nodes to the number of OSDs is smaller than a first preset threshold, reject OSD addition operation for the target storage node, and set a part of OSDs of the target storage node to be in a first type non-working Down state, so that the ratio of the available memory capacity of the target storage node to the number of remaining OSDs is greater than or equal to the first preset threshold; wherein the remaining OSDs refer to available OSDs of the target storage node except for a portion of OSDs set to the first type Down state.

In an optional embodiment, the obtaining unit 410 is specifically configured to receive the information about the available memory capacity and the number of OSDs reported by the storage node, or to detect the information about the available memory capacity and the number of OSDs of the storage node.

In an optional embodiment, the processing unit 430 is further configured to, when the determining unit 420 determines that the ratio of the available memory capacity of the target storage node to the number of OSDs is greater than a second preset threshold, allow OSD addition operations for the target storage node, and restore some or all of the OSDs set to the first type Down state in the target storage node to the UP state, on the principle that the ratio of the available memory capacity to the number of OSDs after the recovery processing is greater than or equal to the second preset threshold; wherein the first preset threshold is less than the second preset threshold.

In an optional embodiment, the obtaining unit 410 is further configured to detect a ratio of an available memory capacity of the target storage node to the number of OSDs;

the determining unit 420 is specifically configured to determine that the ratio of the available memory capacity of the target storage node to the number of OSDs is greater than a second preset threshold when it is detected that the ratio of the available memory capacity of the target storage node to the number of OSDs is greater than the second preset threshold.

In an optional embodiment, the processing unit 430 is further configured to reject OSD addition operations for the target storage node when it is determined that the ratio of the available memory capacity of the target storage node to the number of OSDs is greater than or equal to the first preset threshold and less than the second preset threshold.

Referring to fig. 5, a schematic structural diagram of a fault handling apparatus according to an embodiment of the present invention is provided, where the apparatus may be applied to a storage node in the foregoing method embodiment, and as shown in fig. 5, the fault handling apparatus may include:

an obtaining unit 510, configured to obtain information about the available memory capacity of the present device and its number of object storage devices (OSDs);

a sending unit 520, configured to report the information about the available memory capacity and the number of OSDs to the monitor.
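
The division of labor between the obtaining unit and the sending unit on the storage node side might be sketched in Python as below. This is a hypothetical illustration, not the patented implementation: the class and field names are invented for the example, and the memory probe, OSD counter and transport are passed in as plain callables because the text does not specify how they are realized.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class MemoryReport:
    """Information reported to the monitor (field names are assumptions)."""
    node_id: str
    available_memory: float  # same unit as the thresholds, e.g. G
    osd_count: int


class StorageNodeReporter:
    """Pairs an obtaining step with a sending step, mirroring units 510/520."""

    def __init__(
        self,
        node_id: str,
        read_available_memory: Callable[[], float],
        count_osds: Callable[[], int],
        send: Callable[[MemoryReport], None],
    ) -> None:
        self.node_id = node_id
        self._read_available_memory = read_available_memory
        self._count_osds = count_osds
        self._send = send

    def obtain(self) -> MemoryReport:
        # Obtaining unit: collect available memory capacity and OSD count.
        return MemoryReport(
            self.node_id, self._read_available_memory(), self._count_osds()
        )

    def report(self) -> MemoryReport:
        # Sending unit: deliver the collected information to the monitor.
        report = self.obtain()
        self._send(report)
        return report
```

Injecting the callables keeps the sketch independent of any particular operating system interface or Ceph messaging API, which the text leaves open.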

Fig. 6 is a schematic diagram of a hardware structure in which a monitor and a storage node are located on the same physical host according to an example of the present disclosure. It should be understood that the monitor and the storage node may also be located on different physical hosts. This embodiment is described with reference to fig. 6. The physical host may include a processor 601, a machine-readable storage medium 602 storing machine-executable instructions, and an OSD 603 for storing metadata and/or copies of data objects.

The processor 601, the machine-readable storage medium 602 and the OSD 603 may communicate via a system bus 604. The processor 601 may perform the fault handling methods described above by reading and executing the machine-executable instructions in the machine-readable storage medium 602 that correspond to the fault handling logic.

The machine-readable storage medium 602 referred to herein may be any electronic, magnetic, optical, or other physical storage device that can contain or store information such as executable instructions and data. For example, the machine-readable storage medium may be: a RAM (Random Access Memory), a volatile memory, a non-volatile memory, a flash memory, a storage drive (e.g., a hard drive), a solid state drive, any type of storage disk (e.g., an optical disk, a DVD, etc.), or a similar storage medium, or a combination thereof. The OSD 603 may include, but is not limited to, a physical disk.

For the implementation of the functions and roles of each unit in the above apparatus, refer to the implementation of the corresponding steps in the above method; details are not repeated here.

For the device embodiments, since they substantially correspond to the method embodiments, reference may be made to the partial description of the method embodiments for relevant points. The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the scheme of the invention. One of ordinary skill in the art can understand and implement it without inventive effort.

Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.

It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.

Claims

1. A method of fault handling, the method comprising:

acquiring information on the available memory capacity of a storage node and its number of object storage devices (OSDs); wherein the initial available memory capacity is the memory capacity reserved for OSDs when the storage node is initialized to run;

when it is determined that the ratio of the available memory capacity of a target storage node among the storage nodes to the number of OSDs is less than a first preset threshold, rejecting OSD addition operations for the target storage node, and setting some of the OSDs of the target storage node to a first type non-working Down state, so that the ratio of the available memory capacity of the target storage node to the number of remaining OSDs is greater than or equal to the first preset threshold; wherein the remaining OSDs refer to the available OSDs of the target storage node other than the OSDs set to the first type Down state, and an OSD in the first type non-working Down state is an OSD that is capable of working but has been set to a non-working state.

2. The method of claim 1, wherein the obtaining information of the available memory capacity and the number of OSDs of the storage node comprises:

receiving the information on the available memory capacity and the number of OSDs reported by the storage node; or detecting the information on the available memory capacity and the number of OSDs of the storage node.

3. The method of claim 1, wherein, after setting some of the OSDs of the target storage node to the first type Down state, the method further comprises:

when it is determined that the ratio of the available memory capacity of the target storage node to the number of OSDs is greater than a second preset threshold, allowing OSD addition operations for the target storage node, and restoring some or all of the OSDs set to the first type Down state in the target storage node to the working UP state, on the principle that the ratio of the available memory capacity to the number of OSDs after the recovery processing is greater than or equal to the second preset threshold; wherein the first preset threshold is less than the second preset threshold.

4. The method of claim 3, wherein the determining that the ratio of the available memory capacity of the target storage node to the number of OSD is greater than a second preset threshold comprises:

detecting the ratio of the available memory capacity of the target storage node to the OSD number;

and when the ratio of the available memory capacity of the target storage node to the number of the OSD is detected to be larger than a second preset threshold, determining that the ratio of the available memory capacity of the target storage node to the number of the OSD is larger than the second preset threshold.

5. The method of claim 3, further comprising:

and when the ratio of the memory capacity of the target storage node to the number of the OSD is determined to be greater than or equal to the first preset threshold and smaller than the second preset threshold, rejecting the OSD increasing operation aiming at the target storage node.

6. A method of fault handling, the method comprising:

reporting information on the available memory capacity and the number of OSDs of the present storage node to a monitor, so that when the monitor determines that the ratio of the available memory capacity of a target storage node among the storage nodes to the number of OSDs is less than a first preset threshold, the monitor rejects OSD addition operations for the target storage node and sets some of the OSDs of the target storage node to a first type non-working Down state, so that the ratio of the available memory capacity of the target storage node to the number of remaining OSDs is greater than or equal to the first preset threshold; wherein the remaining OSDs refer to the available OSDs of the target storage node other than the OSDs set to the first type Down state; the initial available memory capacity is the memory capacity reserved for OSDs when the storage node is initialized to run; and an OSD in the first type non-working Down state is an OSD that is capable of working but has been set to a non-working state.

7. A fault handling apparatus, characterized in that the apparatus comprises:

an acquisition unit, configured to acquire information on the available memory capacity of a storage node and its number of object storage devices (OSDs); wherein the initial available memory capacity is the memory capacity reserved for OSDs when the storage node is initialized to run;

a processing unit, configured to, when a determining unit determines that the ratio of the available memory capacity of a target storage node among the storage nodes to the number of OSDs is less than a first preset threshold, reject OSD addition operations for the target storage node, and set some of the OSDs of the target storage node to a first type non-working Down state, so that the ratio of the available memory capacity of the target storage node to the number of remaining OSDs is greater than or equal to the first preset threshold; wherein the remaining OSDs refer to the available OSDs of the target storage node other than the OSDs set to the first type Down state, and an OSD in the first type non-working Down state is an OSD that is capable of working but has been set to a non-working state.

8. The apparatus of claim 7,

the acquisition unit is specifically configured to receive the information on the available memory capacity and the number of OSDs reported by the storage node; or to detect the information on the available memory capacity and the number of OSDs of the storage node.

9. The apparatus of claim 7,

the processing unit is further configured to allow OSD addition operation for the target storage node when the determining unit determines that the ratio of the available memory capacity to the OSD number of the target storage node is greater than a second preset threshold, and restore part or all of the OSDs in the part of the OSDs set in the first type Down state in the target storage node to the working UP state on the principle that the ratio of the recovered available memory capacity to the OSD number is greater than or equal to the second preset threshold; wherein the first preset threshold is smaller than the second preset threshold.

10. The apparatus of claim 9,

the acquisition unit is further configured to detect a ratio of an available memory capacity of the target storage node to the number of OSDs;

the determining unit is specifically configured to determine that the ratio of the available memory capacity of the target storage node to the number of OSDs is greater than a second preset threshold when it is detected that the ratio of the available memory capacity of the target storage node to the number of OSDs is greater than the second preset threshold.

11. The apparatus of claim 9,

the processing unit is further configured to reject OSD addition operation for the target storage node when it is determined that a ratio of the memory capacity of the target storage node to the number of OSDs is greater than or equal to the first preset threshold and smaller than the second preset threshold.

12. A fault handling apparatus, characterized in that the apparatus comprises:

a sending unit, configured to report information on the available memory capacity and the number of OSDs of the present device to a monitor, so that when the monitor determines that the ratio of the available memory capacity of a target storage node among the storage nodes to the number of OSDs is less than a first preset threshold, the monitor rejects OSD addition operations for the target storage node and sets some of the OSDs of the target storage node to a first type non-working Down state, so that the ratio of the available memory capacity of the target storage node to the number of remaining OSDs is greater than or equal to the first preset threshold; wherein the remaining OSDs refer to the available OSDs of the target storage node other than the OSDs set to the first type Down state; the initial available memory capacity is the memory capacity reserved for OSDs when the storage node is initialized to run; and an OSD in the first type non-working Down state is an OSD that is capable of working but has been set to a non-working state.