CN107729185B - Fault processing method and device - Google Patents


Info

Publication number
CN107729185B
Authority
CN
China
Prior art keywords
storage node
osd
memory capacity
target storage
osds
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711015125.1A
Other languages
Chinese (zh)
Other versions
CN107729185A (en)
Inventor
顾雷雷
乔辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou H3C Technologies Co Ltd
Original Assignee
Hangzhou H3C Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou H3C Technologies Co Ltd filed Critical Hangzhou H3C Technologies Co Ltd
Priority to CN201711015125.1A priority Critical patent/CN107729185B/en
Publication of CN107729185A publication Critical patent/CN107729185A/en
Application granted granted Critical
Publication of CN107729185B publication Critical patent/CN107729185B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1458Management of the backup or restore process
    • G06F11/1464Management of the backup or restore process for networked environments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/3003Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3034Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a storage system, e.g. DASD based or network based

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention provides a fault processing method and device. The method comprises: acquiring information on the available memory capacity of a storage node and the number of object storage devices (OSDs) on it; when the ratio of the available memory capacity of a target storage node to its number of OSDs is smaller than a first preset threshold, rejecting OSD addition operations for the target storage node and setting part of the OSDs of the target storage node to a first type Down (non-working) state, so that the ratio of the available memory capacity of the target storage node to the number of remaining OSDs is greater than or equal to the first preset threshold; wherein the remaining OSDs are the available OSDs of the target storage node other than those set to the first type Down state. The method and device avoid the risk of OSD performance bottlenecks and data recovery failure after a memory failure on a storage node.

Description

Fault processing method and device
Technical Field
The present invention relates to the field of network communication technologies, and in particular, to a fault handling method and apparatus.
Background
Ceph is an open source distributed storage system that provides a software-defined, unified storage solution and offers large-scale expansion, high performance and no single point of failure.
A typical Ceph cluster deployment creates an OSD (Object Storage Device) for each physical hard disk in the cluster node.
The failure domain of a Ceph cluster typically includes disks, nodes (i.e., servers), racks, power circuits, and the like. When any component in the failure domain fails, which causes the corresponding OSD deployed thereon to fail, the Ceph cluster marks the OSDs in a Down (non-working) state, performs an initialization operation, and reorganizes the affected data on the failed node.
However, practice shows that in existing Ceph cluster implementations, partial unavailability of a storage node's memory does not directly affect the state of its OSDs, so the Ceph cluster does not adjust the state of the OSDs when part of the memory becomes unavailable. If a data recovery or rebalancing event then occurs in the Ceph cluster, the processing performance of the cluster is seriously affected, and data recovery may even fail.
Disclosure of Invention
The invention provides a fault processing method and device, and aims to solve the problems that in the prior art, storage node memory faults can reduce the processing performance of a Ceph cluster and even cause data recovery failure of the Ceph cluster.
According to a first aspect of the embodiments of the present invention, there is provided a fault handling method, including:
acquiring information on the available memory capacity of storage nodes and their number of object storage devices (OSDs);
when it is determined that the ratio of the available memory capacity of a target storage node among the storage nodes to its number of OSDs is smaller than a first preset threshold, rejecting OSD addition operations for the target storage node, and setting part of the OSDs of the target storage node to a first type Down (non-working) state, so that the ratio of the available memory capacity of the target storage node to the number of remaining OSDs is greater than or equal to the first preset threshold; wherein the remaining OSDs are the available OSDs of the target storage node other than those set to the first type Down state.
According to a second aspect of the embodiments of the present invention, there is provided a fault handling method, including:
acquiring information on the available memory capacity and the number of object storage devices (OSDs);
and reporting the information on the available memory capacity and the number of OSDs to the monitor.
According to a third aspect of embodiments of the present invention, there is provided a fault handling apparatus including:
an acquisition unit, configured to acquire information on the available memory capacity of storage nodes and their number of object storage devices (OSDs);
a determining unit, configured to determine whether the ratio of the available memory capacity of a target storage node among the storage nodes to its number of OSDs is smaller than a first preset threshold;
a processing unit, configured to, when the determining unit determines that the ratio of the available memory capacity of the target storage node to its number of OSDs is smaller than the first preset threshold, reject OSD addition operations for the target storage node and set part of the OSDs of the target storage node to a first type Down (non-working) state, so that the ratio of the available memory capacity of the target storage node to the number of remaining OSDs is greater than or equal to the first preset threshold; wherein the remaining OSDs are the available OSDs of the target storage node other than those set to the first type Down state.
According to a fourth aspect of the embodiments of the present invention, there is provided a fault handling apparatus including:
an acquisition unit, configured to acquire information on the available memory capacity of the apparatus and its number of object storage devices (OSDs);
a sending unit, configured to report the information on the available memory capacity and the number of OSDs to the monitor.
In the embodiments of the present invention, a first preset threshold is set to indicate that the memory resources of a storage node are insufficient. When the ratio of the available memory capacity of a target storage node among the storage nodes to its number of OSDs is smaller than the first preset threshold, OSD addition operations for the target storage node are rejected and part of the OSDs of the target storage node are set to the first type Down state, so that the ratio of the available memory capacity of the target storage node to the number of remaining OSDs is greater than or equal to the first preset threshold. This avoids the risk of OSD performance bottlenecks and data recovery failure after a memory failure on a storage node.
Drawings
Fig. 1 is a schematic flow chart of a fault handling method according to an embodiment of the present invention;
Fig. 2 is a schematic flow chart of a fault handling method according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a fault handling apparatus according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of another fault handling apparatus provided in the embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a fault handling apparatus according to an embodiment of the present invention;
Fig. 6 is a schematic diagram of a hardware structure in which a monitor and a storage node are located on the same physical host according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions in the embodiments of the present invention better understood and make the above objects, features and advantages of the embodiments of the present invention more comprehensible, the technical solutions in the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
Referring to Fig. 1, a schematic flow chart of a fault handling method according to an embodiment of the present invention is provided. The fault handling method may be applied to a storage node of a Ceph cluster and, as shown in Fig. 1, may include the following steps.
It should be noted that, in the embodiments of the present invention, unless otherwise specified, memory capacity is expressed in GB, and the number of OSDs of a storage node refers to the number of OSDs in the UP (working) state on that storage node; this is not repeated below.
Step 101, acquiring information on the storage node's own available memory capacity and number of OSDs.
And step 102, reporting the information of the available memory capacity and the OSD number to a monitor.
In the embodiment of the invention, in order to ensure that the OSDs of each storage node in the Ceph cluster can work normally, it is required to ensure that each OSD in the storage node is configured with enough memory, and for the storage node as a whole, it is required to ensure that the ratio of the available memory capacity of the storage node to the number of OSDs is large enough.
When a storage node is initialized to operate, sufficient memory resources are usually reserved for the OSD, but when a part of the memory in the storage node fails, such as a memory bank is loose, a memory bank interface fails, and the like, the ratio of the available memory capacity of the storage node to the number of the OSD is reduced, and even the available memory is insufficient to meet the requirement of normal operation of the OSD.
Therefore, in the embodiment of the present invention, the storage node may report information on its available memory capacity and number of OSDs, such as the ratio of the available memory capacity to the number of OSDs, to the monitor according to a preset policy.
For example, the storage node may report the ratio of the available memory capacity of the storage node to the number of OSDs to the monitor at regular time (e.g., periodically); or, the storage node may report the ratio of the available memory capacity to the OSD number to the monitor when the ratio of the available memory capacity to the OSD number satisfies a specified condition.
In an embodiment of the present invention, the reporting of the information about the available memory capacity and the OSD number to the monitor may include:
when the ratio of the storage node's own available memory capacity to its number of OSDs is smaller than a first preset threshold, sending a first type report to the monitor.
In this embodiment, a threshold (referred to as a first preset threshold herein) may be preset in each storage node, where the threshold is used to determine whether the memory resource reserved for the OSD meets the minimum requirement that the OSD can normally operate, that is, when a ratio of an available memory capacity of the storage node to the number of the OSD is greater than or equal to the first preset threshold, it indicates that the memory resource reserved for the OSD by the storage node meets the minimum requirement that the OSD can normally operate, and otherwise, it indicates that the memory resource reserved for the OSD by the storage node does not meet the minimum requirement that the OSD can normally operate. The first preset threshold may be set according to an actual scene, for example, determined according to an actual hard disk capacity of the storage node.
Optionally, the first preset threshold may be any value within a range of 0.5 to 1.5.
Accordingly, in this embodiment, the storage node may obtain, in real time, the ratio of its available memory capacity to its number of OSDs and determine whether that ratio is smaller than the first preset threshold. When it determines that the ratio is smaller than the first preset threshold, the storage node sends a report (referred to herein as a first type report) to the monitor. The report notifies the monitor that the ratio of the available memory capacity of the storage node to the number of OSDs is smaller than the first preset threshold, that is, that the available memory capacity of the storage node is below the minimum requirement for normal operation of the OSDs, so as to trigger the monitor to take the corresponding action.
For example, an OSD daemon on the storage node may periodically poll the ratio of the available memory capacity to the number of OSDs (the period may be set according to the actual scenario, for example 300 seconds), determine whether the ratio is lower than the first preset threshold, and send a first type report to the monitor when it is.
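The polling check described above can be sketched as follows. This is only an illustration in Python, not the patented implementation: the function names, the report payload and the threshold value of 1.0 GB per OSD (picked from within the 0.5 to 1.5 range mentioned earlier) are assumptions.

```python
from typing import Optional

# Assumed value in GB per OSD; the text only says 0.5 to 1.5 is a possible range.
FIRST_PRESET_THRESHOLD = 1.0


def memory_per_osd(available_memory_gb: float, up_osd_count: int) -> float:
    """Ratio of available memory (GB) to the number of OSDs in the UP state."""
    if up_osd_count <= 0:
        raise ValueError("up_osd_count must be positive")
    return available_memory_gb / up_osd_count


def first_type_report(node_id: str,
                      available_memory_gb: float,
                      up_osd_count: int,
                      threshold: float = FIRST_PRESET_THRESHOLD) -> Optional[dict]:
    """Return a hypothetical report payload if the ratio is below the threshold.

    Returns None when the ratio is greater than or equal to the threshold,
    in which case no first type report is sent.
    """
    ratio = memory_per_osd(available_memory_gb, up_osd_count)
    if ratio < threshold:
        return {"type": "first", "node": node_id, "ratio": ratio}
    return None
```

A daemon would call `first_type_report` once per polling period and forward any non-`None` result to the monitor; how the report is transported is outside the scope of this sketch.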
A storage node usually runs multiple OSD daemons (the number of OSD daemons is the same as the number of OSDs). To prevent multiple OSD daemons in the same storage node from repeatedly sending the first type report to the monitor, the OSD daemons may poll the ratio of the available memory capacity to the number of OSDs with different periods, or with the same period but starting at different times.
For example, assume that 32 OSDs are created in the storage node, so that 32 OSD daemons (OSD daemons 1 to 32) are running in the storage node, and each OSD daemon polls the ratio of the available memory capacity of the storage node to the number of OSDs with a period of 300 seconds. OSD daemon 2 may start polling 5 seconds after the polling start time of OSD daemon 1, OSD daemon 3 may start polling 5 seconds after the polling start time of OSD daemon 2, …, and OSD daemon 32 may start polling 5 seconds after the polling start time of OSD daemon 31.
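The staggered start times in this example can be computed as below. The sketch is illustrative only; the function name and the zero-based offset convention are assumptions, while the 5 second stagger and the count of 32 daemons come from the example above.

```python
from typing import List


def polling_start_offsets(daemon_count: int, stagger_seconds: int = 5) -> List[int]:
    """Start offset (seconds) of each OSD daemon relative to daemon 1.

    Index 0 corresponds to OSD daemon 1, index 1 to OSD daemon 2, and so on.
    """
    if daemon_count < 0:
        raise ValueError("daemon_count must not be negative")
    return [index * stagger_seconds for index in range(daemon_count)]


offsets = polling_start_offsets(32)
# OSD daemon 32 starts 31 * 5 = 155 seconds after OSD daemon 1,
# which still falls inside one 300 second polling period.
```

With these values every daemon starts within the first polling period, so the node is checked roughly every 5 seconds for part of each period rather than 32 times at the same instant.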
When any OSD daemon finds that the ratio of the available memory capacity of the storage node to the number of OSD is smaller than a first preset threshold value, the OSD daemon can send a first type report to a monitor; the monitor, upon receiving the first type of report, may respond with an acknowledgement message, such as an ACK (acknowledgement) message; after receiving the confirmation message returned by the monitor, the OSD daemon may send a notification message to other OSD daemon, so that the other OSD daemon may be in a silent state within a certain time length after receiving the notification message, that is, polling of a ratio of an available memory capacity to the OSD number is not performed within a preset time length. The silent duration of each OSD daemon needs to be longer than a polling period of a ratio of available memory capacity to the number of OSDs performed by each OSD daemon, for example, when the polling period is 300 seconds, the silent duration may be 900 seconds or 1800 seconds. When the OSD daemon receives the notification message sent by other OSD daemon within the silent duration, the OSD daemon can zero the timing of the silent time and restart the timing of the silent time; otherwise, that is, when the OSD daemon does not receive the notification message sent by other OSD daemon within the silence duration, the OSD daemon may start polling the ratio of the available memory capacity to the OSD number when the next polling start time is reached.
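The silence behaviour of a single OSD daemon can be sketched as a small timer. This is a hedged illustration: the class name, method names and the use of explicit timestamps in seconds are assumptions, and message passing between daemons is left out.

```python
from typing import Optional


class SilenceTimer:
    """Tracks whether one OSD daemon is currently allowed to poll."""

    def __init__(self, silence_seconds: float, poll_period_seconds: float):
        # The text requires the silence duration to exceed the polling period.
        if silence_seconds <= poll_period_seconds:
            raise ValueError("silence duration must exceed the polling period")
        self.silence_seconds = silence_seconds
        self._silent_until: Optional[float] = None

    def on_notification(self, now: float) -> None:
        """A peer daemon reported successfully: start or restart the silence."""
        self._silent_until = now + self.silence_seconds

    def may_poll(self, now: float) -> bool:
        """True when no silence is active or the silence has expired."""
        return self._silent_until is None or now >= self._silent_until
```

For instance, with a 900 second silence and a 300 second period, a notification at t = 100 silences the daemon until t = 1000, and a further notification at t = 600 restarts the timer so that polling resumes only at t = 1500.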
In the embodiment of the invention, when the monitor receives the first type report from the storage node, on the one hand, it may reject OSD addition operations for the storage node; on the other hand, it may take part of the OSDs on the storage node out of service, that is, set part of the OSDs on the storage node to a first type Down state (also referred to herein as the management Down state), so that the ratio of the available memory capacity of the storage node to the number of remaining OSDs is greater than or equal to the first preset threshold. For the specific implementation, refer to the related description of the method flow shown in Fig. 2, which is not repeated here.
Wherein the remaining OSDs refer to available OSDs (i.e., OSDs in an UP state) of the OSDs of the storage node except for a portion of OSDs set to the first type Down state; the first type of Down differs from the Down of OSDs in the existing Ceph cluster in that: the former OSD is not malfunctioning, i.e., the OSD has the capability to function, but is set to a non-functional state; the latter OSD fails, i.e., the OSD has no capability to function and is set to the Down state.
It can be seen that, in the method flow shown in Fig. 1, the storage node obtains the ratio of its available memory capacity to its number of OSDs and reports it to the monitor, so that when the monitor determines that the ratio is smaller than the first preset threshold, it sets part of the OSDs on the storage node to the first type Down state, making the ratio of the available memory capacity of the storage node to the number of remaining OSDs greater than or equal to the first preset threshold. This ensures that the storage node reserves enough available memory capacity for its OSDs and avoids the risk of OSD performance bottlenecks and data recovery failure after a memory failure on the storage node.
Referring to fig. 2, a schematic flow chart of a fault handling method according to an embodiment of the present invention is shown, where the method may be applied to a Monitor (Monitor) in a Ceph cluster, and as shown in fig. 2, the fault handling method may include the following steps:
It should be noted that, in the embodiment of the present invention, the monitor and the storage node may be deployed on different physical hosts or on the same physical host. A storage node may be understood to include at least a set of OSDs that provide storage, and it may also include a processor with control functionality.
Step 201, obtaining the information of the available memory capacity and the number of OSDs of the storage node.
In the embodiment of the present invention, in order to avoid the risk of OSD performance bottlenecks and data recovery failure after a memory failure on a storage node, the monitor may obtain information on the available memory capacity and number of OSDs of each storage node, so that when it determines that the ratio of the available memory capacity to the number of OSDs of a storage node is too low, it can apply a corresponding strategy to raise that ratio. This information is obtained for the whole cluster, but it is collected per physical host; that is, the available memory capacity of a storage node (the total available memory capacity for its set of OSDs) and its number of OSDs are parameters of the same physical host.
In an embodiment of the present invention, the obtaining information of the available memory capacity and the OSD number of the storage node may include:
and receiving the information of the available memory capacity and the OSD number reported by the storage node.
In this embodiment, the specific implementation of reporting the information of the available memory capacity and the OSD number to the monitor by the storage node may refer to the related description in the method embodiment shown in fig. 1, and the embodiment of the present invention is not described herein again.
In another embodiment of the present invention, the obtaining information of the available memory capacity and the number of OSDs of the storage node may include:
probing the information on the available memory capacity and the number of OSDs of the storage node.
In this embodiment, the monitor may actively probe the information on the available memory capacity and the number of OSDs of the storage node; for example, the monitor may probe this information at regular intervals (e.g., periodically).
It should be noted that, in this embodiment, any monitor in the Ceph cluster may detect a ratio between the available memory capacity of all the storage nodes in the Ceph cluster and the number of OSDs, or all the storage nodes in the Ceph cluster may be divided into a plurality of groups according to the number of monitors in the Ceph cluster, where the number of the groups may be the same as the number of monitors, and one monitor detects a ratio between the available memory capacity of the storage nodes in one group and the number of OSDs, which is not described herein in detail.
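The grouping option mentioned above, in which each monitor probes one group of storage nodes, can be sketched as follows. The text does not say how nodes are divided among groups, so the round-robin assignment, the function name and the node and monitor identifiers here are all assumptions.

```python
from typing import Dict, List


def assign_nodes_to_monitors(nodes: List[str],
                             monitors: List[str]) -> Dict[str, List[str]]:
    """Split storage nodes into one group per monitor (round-robin, assumed)."""
    if not monitors:
        raise ValueError("at least one monitor is required")
    groups: Dict[str, List[str]] = {monitor: [] for monitor in monitors}
    for index, node in enumerate(nodes):
        # Node i goes to monitor i modulo the number of monitors.
        groups[monitors[index % len(monitors)]].append(node)
    return groups


groups = assign_nodes_to_monitors(
    ["node-0", "node-1", "node-2", "node-3", "node-4"],
    ["mon-a", "mon-b"],
)
```

Each monitor then probes only the nodes in its own group, so the number of groups equals the number of monitors, as the text suggests.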
Step 202, when it is determined that the ratio of the available memory capacity of the target storage node in the storage nodes to the number of OSDs is smaller than a first preset threshold, rejecting OSD addition operation for the target storage node, and setting a part of OSDs of the target storage node to be in a first type non-working Down state, so that the ratio of the available memory capacity of the target storage node to the number of the rest of OSDs is greater than or equal to the first preset threshold.
In the embodiment of the present invention, the target storage node does not refer to a fixed storage node, but may refer to any storage node in the Ceph cluster.
In one embodiment of the present invention, determining that a ratio between an available memory capacity of the target storage node and the number of OSDs is smaller than a first preset threshold may include:
a first type report sent by a target storage node is received.
The specific implementation of sending the first type report to the monitor by the target storage node may refer to the related description in the method flow shown in fig. 1, and details of the embodiment of the present invention are not described herein again.
In this embodiment, when the monitor receives the first type report sent by the target storage node, it may be determined that the ratio of the available memory capacity of the target storage node to the number of OSDs is smaller than the first preset threshold.
In another embodiment of the present invention, determining that a ratio of the available memory capacity of the target storage node to the number of OSDs is smaller than a first preset threshold may include:
detecting that the ratio of the available memory capacity of the target storage node to the number of OSDs is smaller than a first preset threshold.
In this embodiment, the monitor may periodically detect a ratio of the available memory capacity of each storage node in the Ceph cluster to the number of OSDs, and determine whether the ratio of the available memory capacity of each storage node to the number of OSDs is smaller than a first preset threshold.
In the embodiment of the present invention, when the monitor determines that the ratio of the available memory capacity of the target storage node to the number of OSDs is smaller than the first preset threshold, on the one hand, the monitor needs to reject OSD addition operations for the target storage node. That is, when the monitor receives a notification message indicating that an OSD is to be added to the target storage node, it prohibits the addition, so that an increase in the number of OSDs on the target storage node does not further reduce the ratio of the available memory capacity to the number of OSDs and aggravate the risk of OSD performance bottlenecks and data recovery failure on the target storage node.
On the other hand, the monitor may take part of the OSDs of the target storage node out of service, that is, set part of the OSDs of the target storage node to the first type Down state, so that the ratio of the available memory capacity of the target storage node to the number of remaining OSDs is greater than or equal to the first preset threshold. This ensures that the memory resources of the target storage node meet the minimum requirement for normal operation of the OSDs in the UP state.
For example, assume that the first preset threshold is h1, the initial available memory capacity of the target storage node is M0, the initial number of OSDs is N0, and M0/N0 ≥ h1. If at some moment part of the memory in the target storage node fails, so that the available memory capacity becomes M1 with M1/N0 < h1, the target storage node sends a first type report to the monitor. The monitor may then select N1 OSDs of the target storage node and set them to the first type Down state, so that M1/(N0-N1) ≥ h1. Optionally, the number of OSDs set to the first type Down state may be the minimum value of N1 satisfying M1/(N0-N1) ≥ h1.
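The minimum N1 in this example can be found by lowering the count of remaining OSDs until the ratio condition holds. The following Python sketch is illustrative only; the function and parameter names are assumptions, and the case in which no OSD can remain in the UP state is handled by returning N0, which the text does not address.

```python
def min_osds_to_set_down(m1_gb: float, n0: int, h1: float) -> int:
    """Smallest N1 such that M1 / (N0 - N1) >= h1.

    m1_gb: available memory capacity after the failure (M1, in GB)
    n0:    number of OSDs in the UP state before any are set Down (N0)
    h1:    first preset threshold (GB per OSD)
    """
    if n0 < 0 or h1 <= 0:
        raise ValueError("n0 must be non-negative and h1 must be positive")
    remaining = n0
    # Drop one OSD at a time until the per-OSD memory reaches the threshold.
    while remaining > 0 and m1_gb / remaining < h1:
        remaining -= 1
    # Assumption: if even one OSD cannot be sustained, all N0 are set Down.
    return n0 - remaining
```

For example, with N0 = 32, h1 = 1.0 and M1 = 24 GB, the sketch keeps 24 OSDs in the UP state and returns N1 = 8, since 24/24 = 1.0 meets the threshold while 24/25 does not.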
Further, in the embodiment of the present invention, after the monitor performs the above processing on the storage node, if the memory failure of the storage node is repaired, the available memory capacity of the storage node may increase, and accordingly, the ratio of the available memory capacity of the storage node to the number of OSDs may also increase, at this time, the monitor may restore part or all of the OSDs on the storage node that are set to the first type Down state to the UP state.
After the monitor restores part or all of the OSDs set to the first type Down state to the UP state, the ratio of the available memory capacity of the storage node to the number of OSDs could again fall below the first preset threshold, so that part of the OSDs on the storage node would have to be set to the first type Down state once more, producing an oscillation. To avoid this, another threshold (referred to herein as a second preset threshold) may be preset and used to determine whether the OSDs set to the first type Down state on the storage node are allowed to be restored to the UP state. The second preset threshold is greater than the first preset threshold.
Preferably, the second preset threshold is the ratio of the available memory capacity to the number of OSDs in an ideal state, that is, an optimal or preferred ratio between the two.
Optionally, the second preset threshold may be any value within a range of 1.5 to 2.0, where when the first preset threshold is 1.5, the second preset threshold needs to be greater than 1.5.
Accordingly, in an embodiment of the present invention, after the setting the part of the OSDs of the target storage node to the first type Down state, the method may further include:
when it is determined that the ratio of the available memory capacity of the target storage node to its number of OSDs is greater than a second preset threshold, allowing OSD addition operations for the target storage node, and restoring part or all of the OSDs that were set to the first type Down state in the target storage node to the UP state, provided that the ratio of the available memory capacity to the number of OSDs after the restoration is greater than or equal to the second preset threshold.
In this embodiment, the determining, by the monitor, that the ratio of the available memory capacity of the target storage node to the number of OSDs is greater than the second preset threshold may include:
and receiving a second type report reported by the target storage node, or detecting that the ratio of the available memory capacity of the target storage node to the OSD number is greater than a second preset threshold value.
The second type report is sent when the target storage node determines that the ratio of its available memory capacity to its number of OSDs has changed from being smaller than the first preset threshold to being greater than the second preset threshold. It triggers the monitor to allow OSD addition operations for the target storage node and to restore part or all of the OSDs that were set to the first type Down state in the target storage node to the UP state, provided that the ratio of the available memory capacity to the number of OSDs after the restoration is greater than or equal to the second preset threshold.
The specific implementation of the target storage node sending the second type report to the monitor is similar to the implementation of the target storage node sending the first type report to the monitor, and the details of the embodiment of the present invention are not repeated herein.
In this embodiment, when the monitor determines that the ratio of the available memory capacity of the target storage node to the OSD number is greater than the second preset threshold, the monitor may determine that the target storage node reserves sufficient memory resources for the OSD, and at this time, the target storage node may be allowed to increase the OSD number.
Accordingly, in one aspect, the monitor may allow OSD addition operations for the target storage node; on the other hand, the monitor may restore some or all of the OSDs set to the first type Down state in the target storage node to the UP state.
When the monitor performs the foregoing restoration on the target storage node, it must ensure that the ratio of the available memory capacity of the target storage node to its number of OSDs after the restoration is greater than or equal to the second preset threshold.
For example, continuing the example above and assuming that the second preset threshold is h2: if at some time the available memory capacity of the target storage node recovers to M2 (M1 < M2 ≤ M0) and M2/(N0 - N1) > h2, the monitor may select N2 (0 < N2 < N1) OSDs from the OSDs set to the first type Down state in the target storage node and restore them to the UP state, where M2/(N0 - N1 + N2) ≥ h2.
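The constraint in this example can be turned into a small calculation. The following is a minimal sketch, assuming the monitor restores as many OSDs as the inequality M2/(N0 - N1 + N2) ≥ h2 permits; the text only requires that the inequality hold, not that N2 be maximal. The function name `max_restorable_osds` and its parameter names are illustrative and do not come from the disclosure. The result is capped at the number of Down OSDs because the description also speaks of restoring "some or all" of them.

```python
def max_restorable_osds(m2: float, n_up: int, n_down: int, h2: float) -> int:
    """Return the largest N2 (0 <= N2 <= n_down) with m2 / (n_up + N2) >= h2.

    m2     -- recovered available memory capacity (M2 in the text)
    n_up   -- OSDs currently in the UP state, i.e. N0 - N1
    n_down -- OSDs currently in the first type Down state, i.e. N1
    h2     -- second preset threshold
    """
    if h2 <= 0 or n_up < 0 or n_down < 0:
        raise ValueError("h2 must be positive and OSD counts non-negative")
    # Try the largest candidate first; m2 / k >= h2 is equivalent to
    # m2 >= h2 * k for a positive OSD count k, which avoids a division.
    for n2 in range(n_down, 0, -1):
        if m2 >= h2 * (n_up + n2):
            return n2
    return 0  # restoring even one OSD would violate the threshold


# Illustrative values only: 28 OSDs UP, 4 OSDs Down, h2 = 1.5.
print(max_restorable_osds(48, 28, 4, 1.5))
print(max_restorable_osds(45, 28, 4, 1.5))
```

A return value of 0 means the memory has not recovered enough to bring any OSD back without dropping the ratio below h2.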
Further, in an embodiment of the present invention, when the monitor determines that the ratio of the available memory capacity of the target storage node to its number of OSDs is greater than or equal to the first preset threshold and smaller than the second preset threshold, the monitor rejects OSD addition operations for the target storage node.
In this embodiment, the monitor determining that the ratio of the available memory capacity of the target storage node to its number of OSDs is greater than or equal to the first preset threshold and smaller than the second preset threshold may include: receiving a third type report sent by the target storage node, or detecting that the ratio of the available memory capacity of the target storage node to its number of OSDs is greater than or equal to the first preset threshold and smaller than the second preset threshold.
The third type report is sent when the target storage node determines that the ratio of its available memory capacity to its number of OSDs is greater than or equal to the first preset threshold and smaller than the second preset threshold, and it triggers the monitor to reject OSD addition operations for the target storage node.
The way the target storage node sends the third type report to the monitor is similar to the way it sends the first type report, and is not described again here.
It should be noted that when the ratio of the available memory capacity of the target storage node to its number of OSDs changes from being smaller than the first preset threshold to being greater than or equal to the first preset threshold and smaller than the second preset threshold, the target storage node can keep its OSDs running normally, but the allocation between OSDs and available memory capacity is not yet optimal. In this case the monitor may simply keep rejecting OSD addition operations for the target storage node, without any other special processing.
Further, in an embodiment of the present invention, when the monitor determines that the ratio of the available memory capacity of the target storage node to its number of OSDs is smaller than the first preset threshold, or is greater than or equal to the first preset threshold and smaller than the second preset threshold, the monitor may generate a system health log that records that the target storage node has a memory fault. An administrator can then learn of the memory fault of the target storage node from the system health log and handle it as required.
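The three ratio bands described so far, and the monitor's response in each, can be summarised in a short sketch. The names `Zone`, `classify`, `ACTIONS` and the action strings are hypothetical labels chosen for illustration. One further assumption: the text does not say what happens when the ratio is exactly equal to the second preset threshold, so the sketch conservatively treats that case like the middle band, because OSD addition is only allowed when the ratio is strictly greater.

```python
from enum import Enum


class Zone(Enum):
    BELOW_H1 = 1   # ratio < h1: memory resources insufficient
    BETWEEN = 2    # h1 <= ratio <= h2 (equality with h2 is an assumption)
    ABOVE_H2 = 3   # ratio > h2: enough memory reserved for the OSDs


def classify(available_memory: float, osd_count: int, h1: float, h2: float) -> Zone:
    """Place the memory-to-OSD ratio of a storage node in one of three bands."""
    if osd_count <= 0:
        raise ValueError("osd_count must be positive")
    if not h1 < h2:
        raise ValueError("the first threshold must be smaller than the second")
    ratio = available_memory / osd_count
    if ratio < h1:
        return Zone.BELOW_H1
    if ratio > h2:
        return Zone.ABOVE_H2
    return Zone.BETWEEN


# Monitor responses per band, as described in the text (labels are illustrative).
ACTIONS = {
    Zone.BELOW_H1: ("write_health_log", "reject_osd_addition", "set_some_osds_down"),
    Zone.BETWEEN: ("write_health_log", "reject_osd_addition"),
    Zone.ABOVE_H2: ("allow_osd_addition", "restore_down_osds"),
}

zone = classify(available_memory=44, osd_count=32, h1=1.0, h2=1.5)
print(zone.name, ACTIONS[zone])
```

Keeping a gap between the two thresholds means that a node hovering around a single cut-off does not flip repeatedly between taking OSDs out of service and restoring them.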
It should be noted that, in the embodiments of the present invention, the Leader of the Ceph cluster needs to update the cluster Map (mapping) after the state of an OSD of a storage node changes. If a monitor other than the Leader sets part of the OSDs in a storage node to the first type Down state, it must notify the Leader, which then updates the cluster Map. To make cluster Map updates more efficient, the above processing operations may therefore be performed by the Leader of the Ceph cluster itself.
To help those skilled in the art better understand the technical solutions provided by the embodiments of the present invention, they are described below with reference to a specific example.
In this embodiment, it is assumed that the first preset threshold (h1) is 1 and the second preset threshold (h2) is 1.5, that the initial available memory capacity of the target storage node is 64 GB (assumed to be provided by sixteen 4 GB memory modules), and that the number of OSDs is 32. The flow of the fault handling method in this embodiment is as follows:
1. when 5 memory modules in the target storage node fail, the ratio of the available memory capacity of the target storage node to its number of OSDs (denoted h3 below) is h3 = (64 - 4 × 5)/32 = 1.375, that is, h1 < h3 < h2; at this time, the target storage node may report to the monitor through the OSD daemon, that is, send a third type report to the monitor;
when the monitor receives the third type report sent by the target storage node, it determines that the target storage node is in the state h1 < h3 < h2, and at this time the monitor may perform the following processing:
a. generating a system health log that records the memory fault of the target storage node;
b. rejecting OSD addition operations for the target storage node;
2. when another 4 memory modules of the target storage node fail (that is, 9 memory modules have failed in total), h3 of the target storage node is (64 - 4 × 9)/32 = 0.875, that is, h3 < h1; at this time, the target storage node may report to the monitor through the OSD daemon, that is, send a first type report to the monitor;
when the monitor receives the first type report sent by the target storage node, it determines that the target storage node is in the state h3 < h1, and at this time the monitor may perform the following processing:
a. generating a system health log that records that a serious memory fault has occurred on the target storage node;
b. rejecting OSD addition operations for the target storage node;
c. taking part of the OSDs on the target storage node out of service, which is implemented as follows:
i. calculating the number o1 of OSDs that need to be taken out of service so that h3 of the target storage node after processing is greater than or equal to h1, that is, 1 ≤ (64 - 4 × 9)/(32 - o1), so the minimum value of o1 is 4;
ii. the monitor takes 4 randomly selected OSDs on the target storage node out of service, that is, sets the 4 selected OSDs to the management Down state (namely the first type Down state), and the data on these 4 OSDs is moved to other normal OSDs of the Ceph cluster through a data recovery action;
wherein, after taking the 4 OSDs of the target storage node out of service, the monitor may periodically check the h3 value of the target storage node;
3. when the administrator sees the system health log, locates the target storage node according to it, and repairs the memory (assume that the available memory capacity after the repair is M), h3 of the target storage node increases; when the monitor detects that the target storage node is in the state h3 > h2 (that is, h3 > 1.5), the monitor may perform the following processing:
a. allowing OSD addition operations for the target storage node;
b. restoring some or all of the OSDs in the management Down state on the target storage node to the UP state, which is implemented as follows:
i. calculating the number o2 of OSDs to be restored to the UP state so that h3 of the target storage node after processing is greater than or equal to h2, that is, 1.5 ≤ M/(28 + o2);
ii. the monitor restores o2 of the OSDs in the management Down state on the target storage node to the UP state and performs rebalancing.
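The arithmetic of the three steps above can be replayed in a few lines. This is only an illustrative check of the numbers in the example. The value M = 64 used for step 3 is an assumption (full repair of all memory modules), since the text leaves M unspecified, and the sketch picks the largest admissible o2 although the text does not require that. All variable names are chosen for this sketch.

```python
H1, H2 = 1.0, 1.5                  # first and second preset thresholds
MODULE_GB, MODULES, OSDS = 4, 16, 32


def ratio(failed_modules: int, osds_up: int = OSDS) -> float:
    """Available memory capacity (GB) divided by the number of working OSDs."""
    return (MODULES - failed_modules) * MODULE_GB / osds_up


# Step 1: 5 modules failed -> h1 < h3 < h2 (third type report).
h3_step1 = ratio(5)

# Step 2: 9 modules failed -> h3 < h1 (first type report).
h3_step2 = ratio(9)

# Smallest o1 such that the ratio over the remaining OSDs is at least h1.
available = (MODULES - 9) * MODULE_GB
o1 = next(k for k in range(OSDS) if available / (OSDS - k) >= H1)

# Step 3: ASSUMPTION, all memory repaired so that M = 64 GB.
M = 64
# Largest o2 (at most o1) such that M / (28 + o2) >= h2.
o2 = max(k for k in range(o1 + 1) if M / (OSDS - o1 + k) >= H2)

print(h3_step1, h3_step2, o1, o2)
```

With a smaller assumed M, o2 shrinks accordingly; for instance, any M below 43.5 would leave no Down OSD that can be restored while keeping the ratio at or above 1.5.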
Fig. 3 is a schematic diagram of different operations performed by the monitor according to a change in a ratio between an available memory capacity of the storage node and the number of OSDs.
As can be seen from the above description, in the technical solution provided by the embodiments of the present invention, a first preset threshold is set to indicate that the memory resources of a storage node are insufficient. When it is determined that the ratio of the available memory capacity of a target storage node among the storage nodes to its number of OSDs is smaller than the first preset threshold, OSD addition operations for the target storage node are rejected and part of the OSDs of the target storage node are set to the first type Down state, so that the ratio of the available memory capacity of the target storage node to the number of remaining OSDs is greater than or equal to the first preset threshold. This reduces the risk of OSD performance bottlenecks and data recovery failures after a memory failure on the storage node.
Fig. 4 is a schematic structural diagram of a fault handling apparatus according to an embodiment of the present invention. The apparatus may be applied to the monitor in the foregoing method embodiments. As shown in Fig. 4, the fault handling apparatus may include:
an obtaining unit 410, configured to obtain information on the available memory capacity and the number of object storage devices (OSDs) of a storage node;
a determining unit 420, configured to determine whether the ratio of the available memory capacity of a target storage node among the storage nodes to its number of OSDs is smaller than a first preset threshold;
a processing unit 430, configured to, when the determining unit 420 determines that the ratio of the available memory capacity of the target storage node among the storage nodes to its number of OSDs is smaller than the first preset threshold, reject OSD addition operations for the target storage node and set part of the OSDs of the target storage node to a first type non-working (Down) state, so that the ratio of the available memory capacity of the target storage node to the number of remaining OSDs is greater than or equal to the first preset threshold; wherein the remaining OSDs are the available OSDs of the target storage node other than the part of the OSDs set to the first type Down state.
In an optional embodiment, the obtaining unit 410 is specifically configured to receive the information on the available memory capacity and the number of OSDs reported by the storage node, or to detect the available memory capacity and the number of OSDs of the storage node.
In an optional embodiment, the processing unit 430 is further configured to, when the determining unit 420 determines that the ratio of the available memory capacity to the number of OSDs of the target storage node is greater than a second preset threshold, allow an OSD addition operation for the target storage node, and restore, to an UP state, a part or all of the OSDs in a part of the OSDs set in the first type Down state in the target storage node on the basis that the ratio of the recovered available memory capacity to the number of OSDs is greater than or equal to the second preset threshold; wherein the first preset threshold is smaller than the second preset threshold.
In an optional embodiment, the obtaining unit 410 is further configured to detect a ratio of an available memory capacity of the target storage node to the number of OSDs;
the determining unit 420 is specifically configured to determine that the ratio of the available memory capacity of the target storage node to the number of OSDs is greater than a second preset threshold when it is detected that the ratio of the available memory capacity of the target storage node to the number of OSDs is greater than the second preset threshold.
In an optional embodiment, the processing unit 430 is further configured to reject OSD addition operations for the target storage node when it is determined that the ratio of the available memory capacity of the target storage node to its number of OSDs is greater than or equal to the first preset threshold and smaller than the second preset threshold.
Fig. 5 is a schematic structural diagram of a fault handling apparatus according to an embodiment of the present invention. The apparatus may be applied to the storage node in the foregoing method embodiments. As shown in Fig. 5, the fault handling apparatus may include:
an obtaining unit 510, configured to obtain information on the available memory capacity of the device itself and its number of object storage devices (OSDs);
a sending unit 520, configured to report the information on the available memory capacity and the number of OSDs to the monitor.
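On the storage node side, the two units amount to collecting two numbers and sending them to the monitor. The sketch below is a hypothetical illustration of that payload only. The `MemoryReport` type, the `build_report` helper, the field names and the JSON encoding are all assumptions made for this example, and the disclosure does not specify a message format. Summing the sizes of the healthy memory modules mirrors the worked example earlier in the description and is likewise an assumption about how available memory capacity is measured.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class MemoryReport:
    """Information a storage node reports to the monitor (hypothetical layout)."""
    node_id: str
    available_memory_gb: float
    osd_count: int

    def to_json(self) -> str:
        # Serialise with sorted keys so the payload is deterministic.
        return json.dumps(asdict(self), sort_keys=True)


def build_report(node_id: str, healthy_module_sizes_gb, osd_count: int) -> MemoryReport:
    """Obtaining unit: gather available memory capacity and the OSD count."""
    if osd_count < 0:
        raise ValueError("osd_count must be non-negative")
    return MemoryReport(node_id, float(sum(healthy_module_sizes_gb)), osd_count)


# Sending unit (stand-in): a real node would transmit this to the monitor.
report = build_report("node-1", [4] * 11, 32)   # 11 healthy 4 GB modules
print(report.to_json())
```

The monitor can derive the ratio from the two reported values, so the node does not need to know the thresholds in this variant; in the embodiments where the node chooses the report type itself, it would also need them.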
Fig. 6 is a schematic diagram of a hardware structure in which a monitor and a storage node are located on the same physical host according to an example of the present disclosure. It should be understood that the monitor and the storage node may also be located on different physical hosts. This embodiment is described with reference to Fig. 6. The physical host may include a processor 601, a machine-readable storage medium 602 storing machine-executable instructions, and an OSD 603 for storing metadata and/or copies of data objects.
The processor 601, the machine-readable storage medium 602 and the OSD 603 may communicate via a system bus 604. The processor 601 may perform the fault handling methods described above by reading and executing the machine-executable instructions in the machine-readable storage medium 602 that correspond to the fault handling logic.
The machine-readable storage medium 602 referred to herein may be any electronic, magnetic, optical, or other physical storage device that can contain or store information such as executable instructions and data. For example, the machine-readable storage medium may be: a RAM (Random Access Memory), a volatile memory, a non-volatile memory, a flash memory, a storage drive (e.g., a hard drive), a solid state drive, any type of storage disk (e.g., an optical disk or a DVD), a similar storage medium, or a combination thereof. The OSD 603 may include, but is not limited to, a physical disk.
For the implementation of the functions and actions of each unit in the above apparatus, refer to the implementation of the corresponding steps in the above method; the details are not repeated here.
Since the apparatus embodiments substantially correspond to the method embodiments, refer to the relevant parts of the description of the method embodiments. The apparatus embodiments described above are merely illustrative: the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the invention. One of ordinary skill in the art can understand and implement it without inventive effort.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.

Claims (12)

1. A method of fault handling, the method comprising:
acquiring information on the available memory capacity and the number of object storage devices (OSDs) of a storage node, wherein the initial available memory capacity is the memory capacity reserved for OSDs when the storage node is initialized for operation;
when it is determined that the ratio of the available memory capacity of a target storage node among the storage nodes to its number of OSDs is smaller than a first preset threshold, rejecting OSD addition operations for the target storage node, and setting part of the OSDs of the target storage node to a first type non-working (Down) state, so that the ratio of the available memory capacity of the target storage node to the number of remaining OSDs is greater than or equal to the first preset threshold; wherein the remaining OSDs are the available OSDs of the target storage node other than the part of the OSDs set to the first type Down state, and an OSD in the first type non-working Down state is an OSD that is capable of working but has been set to a non-working state.
2. The method of claim 1, wherein the acquiring information on the available memory capacity and the number of OSDs of the storage node comprises:
receiving the information on the available memory capacity and the number of OSDs reported by the storage node; or
detecting the available memory capacity and the number of OSDs of the storage node.
3. The method of claim 1, wherein after setting the part of the OSDs of the target storage node to the first type Down state, the method further comprises:
when it is determined that the ratio of the available memory capacity of the target storage node to its number of OSDs is greater than a second preset threshold, allowing OSD addition operations for the target storage node, and restoring some or all of the OSDs set to the first type Down state in the target storage node to the working (UP) state, subject to the condition that the ratio of the available memory capacity to the number of OSDs after the restoration is greater than or equal to the second preset threshold; wherein the first preset threshold is smaller than the second preset threshold.
4. The method of claim 3, wherein the determining that the ratio of the available memory capacity of the target storage node to the number of OSD is greater than a second preset threshold comprises:
detecting the ratio of the available memory capacity of the target storage node to the OSD number;
and when the ratio of the available memory capacity of the target storage node to the number of the OSD is detected to be larger than a second preset threshold, determining that the ratio of the available memory capacity of the target storage node to the number of the OSD is larger than the second preset threshold.
5. The method of claim 3, further comprising:
when it is determined that the ratio of the available memory capacity of the target storage node to its number of OSDs is greater than or equal to the first preset threshold and smaller than the second preset threshold, rejecting OSD addition operations for the target storage node.
6. A method of fault handling, the method comprising:
acquiring information on the available memory capacity of the device itself and its number of object storage devices (OSDs);
reporting the information on the available memory capacity and the number of OSDs of the device itself to a monitor, so that when the monitor determines that the ratio of the available memory capacity of a target storage node among the storage nodes to its number of OSDs is smaller than a first preset threshold, the monitor rejects OSD addition operations for the target storage node and sets part of the OSDs of the target storage node to a first type non-working (Down) state, so that the ratio of the available memory capacity of the target storage node to the number of remaining OSDs is greater than or equal to the first preset threshold; wherein the remaining OSDs are the available OSDs of the target storage node other than the part of the OSDs set to the first type Down state; the initial available memory capacity is the memory capacity reserved for OSDs when the storage node is initialized for operation; and an OSD in the first type non-working Down state is an OSD that is capable of working but has been set to a non-working state.
7. A fault handling apparatus, characterized in that the apparatus comprises:
an acquisition unit, configured to acquire information on the available memory capacity and the number of object storage devices (OSDs) of a storage node, wherein the initial available memory capacity is the memory capacity reserved for OSDs when the storage node is initialized for operation;
a determining unit, configured to determine whether the ratio of the available memory capacity of a target storage node among the storage nodes to its number of OSDs is smaller than a first preset threshold;
a processing unit, configured to, when the determining unit determines that the ratio of the available memory capacity of the target storage node among the storage nodes to its number of OSDs is smaller than the first preset threshold, reject OSD addition operations for the target storage node and set part of the OSDs of the target storage node to a first type non-working (Down) state, so that the ratio of the available memory capacity of the target storage node to the number of remaining OSDs is greater than or equal to the first preset threshold; wherein the remaining OSDs are the available OSDs of the target storage node other than the part of the OSDs set to the first type Down state, and an OSD in the first type non-working Down state is an OSD that is capable of working but has been set to a non-working state.
8. The apparatus of claim 7,
the acquisition unit is specifically configured to receive the information on the available memory capacity and the number of OSDs reported by the storage node, or to detect the available memory capacity and the number of OSDs of the storage node.
9. The apparatus of claim 7,
the processing unit is further configured to allow OSD addition operation for the target storage node when the determining unit determines that the ratio of the available memory capacity to the OSD number of the target storage node is greater than a second preset threshold, and restore part or all of the OSDs in the part of the OSDs set in the first type Down state in the target storage node to the working UP state on the principle that the ratio of the recovered available memory capacity to the OSD number is greater than or equal to the second preset threshold; wherein the first preset threshold is smaller than the second preset threshold.
10. The apparatus of claim 9,
the acquisition unit is further configured to detect a ratio of an available memory capacity of the target storage node to the number of OSDs;
the determining unit is specifically configured to determine that the ratio of the available memory capacity of the target storage node to the number of OSDs is greater than a second preset threshold when it is detected that the ratio of the available memory capacity of the target storage node to the number of OSDs is greater than the second preset threshold.
11. The apparatus of claim 9,
the processing unit is further configured to reject OSD addition operations for the target storage node when it is determined that the ratio of the available memory capacity of the target storage node to its number of OSDs is greater than or equal to the first preset threshold and smaller than the second preset threshold.
12. A fault handling apparatus, characterized in that the apparatus comprises:
an acquisition unit, configured to acquire information on the available memory capacity of the device itself and its number of object storage devices (OSDs);
a sending unit, configured to report the information on the available memory capacity and the number of OSDs of the device itself to a monitor, so that when the monitor determines that the ratio of the available memory capacity of a target storage node among the storage nodes to its number of OSDs is smaller than a first preset threshold, the monitor rejects OSD addition operations for the target storage node and sets part of the OSDs of the target storage node to a first type non-working (Down) state, so that the ratio of the available memory capacity of the target storage node to the number of remaining OSDs is greater than or equal to the first preset threshold; wherein the remaining OSDs are the available OSDs of the target storage node other than the part of the OSDs set to the first type Down state; the initial available memory capacity is the memory capacity reserved for OSDs when the storage node is initialized for operation; and an OSD in the first type non-working Down state is an OSD that is capable of working but has been set to a non-working state.
CN201711015125.1A 2017-10-26 2017-10-26 Fault processing method and device Active CN107729185B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711015125.1A CN107729185B (en) 2017-10-26 2017-10-26 Fault processing method and device

Publications (2)

Publication Number Publication Date
CN107729185A CN107729185A (en) 2018-02-23
CN107729185B true CN107729185B (en) 2020-12-04

Family

ID=61213886

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711015125.1A Active CN107729185B (en) 2017-10-26 2017-10-26 Fault processing method and device

Country Status (1)

Country Link
CN (1) CN107729185B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109101357A (en) * 2018-07-20 2018-12-28 广东浪潮大数据研究有限公司 A kind of detection method and device of OSD failure
CN109213637B (en) * 2018-11-09 2022-03-04 浪潮电子信息产业股份有限公司 Data recovery method, device and medium for cluster nodes of distributed file system
CN109669822B (en) * 2018-11-28 2023-06-06 平安科技(深圳)有限公司 Electronic device, method for creating backup storage pool, and computer-readable storage medium
CN109614276B (en) * 2018-11-28 2021-09-21 平安科技(深圳)有限公司 Fault processing method and device, distributed storage system and storage medium
CN109710456B (en) * 2018-12-10 2021-03-23 新华三技术有限公司 Data recovery method and device
CN115543862B (en) * 2022-09-27 2023-09-01 超聚变数字技术有限公司 Memory management method and related device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102693177A (en) * 2011-03-23 2012-09-26 中国移动通信集团公司 Fault diagnosing and processing methods of virtual machine as well as device and system thereof
CN105119737A (en) * 2015-07-16 2015-12-02 浪潮软件股份有限公司 Method for monitoring Ceph cluster through Zabbix
CN105930103A (en) * 2016-05-10 2016-09-07 南京大学 Distributed storage CEPH based erasure correction code overwriting method
CN106302717A (en) * 2016-08-12 2017-01-04 浪潮(北京)电子信息产业有限公司 The method for optimizing resources of a kind of CEPH system and device
CN107045469A (en) * 2017-03-28 2017-08-15 北京精强远科技有限公司 A kind of intelligent sound warning system and method

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9021296B1 (en) * 2013-10-18 2015-04-28 Hitachi Data Systems Engineering UK Limited Independent data integrity and redundancy recovery in a storage system
US9887008B2 (en) * 2014-03-10 2018-02-06 Futurewei Technologies, Inc. DDR4-SSD dual-port DIMM device

Also Published As

Publication number Publication date
CN107729185A (en) 2018-02-23

Similar Documents

Publication Publication Date Title
CN107729185B (en) Fault processing method and device
EP3620905B1 (en) Method and device for identifying osd sub-health, and data storage system
US10609159B2 (en) Providing higher workload resiliency in clustered systems based on health heuristics
CN104836819A (en) Dynamic load balancing method and system, and monitoring and dispatching device
US20120197822A1 (en) System and method for using cluster level quorum to prevent split brain scenario in a data grid cluster
CN110830283B (en) Fault detection method, device, equipment and system
CN107508694B (en) Node management method and node equipment in cluster
CN107872517B (en) Data processing method and device
CN108776579B (en) Distributed storage cluster capacity expansion method, device, equipment and storage medium
CN107453932B (en) Distributed storage system management method and device
CN109921942B (en) Cloud platform switching control method, device and system and electronic equipment
CN109582459A (en) The method and device that the trustship process of application is migrated
CN106331081B (en) Information synchronization method and device
CN109474470A (en) One kind is from monitoring method and device
CN114168071B (en) Distributed cluster capacity expansion method, distributed cluster capacity expansion device and medium
CN111342986A (en) Distributed node management method and device, distributed system and storage medium
CN111949384B (en) Task scheduling method, device, equipment and computer readable storage medium
CN105323271A (en) Cloud computing system, and processing method and apparatus thereof
CN109587218B (en) Cluster election method and device
CN110837428B (en) Storage device management method and device
CN115314361B (en) Server cluster management method and related components thereof
CN108964992B (en) Node fault detection method and device and computer readable storage medium
JP6269199B2 (en) Management server, failure recovery method, and computer program
CN112269693B (en) Node self-coordination method, device and computer readable storage medium
CN115756955A (en) Data backup and data recovery method and device and computer equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant