CN113544636A - Management method and device of sub-health nodes

Info

Publication number: CN113544636A
Application number: CN201980093598.4A
Authority: CN
Inventors: 王道辉; 宋驰; 湛云; 王同雷
Original assignee: Huawei Technologies Co Ltd
Current assignee: Huawei Cloud Computing Technologies Co Ltd
Priority date: 2019-03-12
Filing date: 2019-03-12
Publication date: 2021-10-22
Anticipated expiration: 2039-03-12
Also published as: CN113544636B; WO2020181478A1

Abstract

The application provides a method and a device for managing sub-health nodes. The method comprises the following steps: a computing node sends a first input/output (IO) request to a first storage node; the computing node determines, based on the first IO request, that the first storage node is a sub-health node; and the computing node sends an identification message to a control node, wherein the identification message is used for indicating that the first storage node is a sub-health node. In the embodiments of the application, the computing node determines that the storage node is a sub-health node based on the IO request and indicates this to the control node, so the process of identifying the sub-health node can be completed based on the IO request. In other words, sub-health nodes are identified at the granularity of IO requests, which reduces the time needed to identify them.

Description

Management method and device of sub-health nodes

Technical Field

The present application relates to the field of information technology, and more particularly, to a method and apparatus for managing sub-health nodes.

Background

A storage system (e.g., a distributed system) includes computing nodes, control nodes, storage nodes, and the like. A computing node sends an input/output (IO) request to a storage node to perform an IO operation on data in the storage node. The storage nodes provide storage space for data, and the control nodes monitor the running states of the computing nodes and the storage nodes. In a storage system, the latency of processing IO requests is one of the major concerns of the industry.

Currently, a storage node in a sub-health state may be deleted from a storage system through a sub-health management function, so as to control the latency with which the storage system processes IO requests. Specifically, a first computing node in the storage system sends IO requests to a plurality of storage nodes, and if a first storage node among them fails to process its IO request, the first computing node reports to the control node that the first storage node is suspected to be a sub-health node. The control node likewise receives reports of suspected sub-health nodes from the other computing nodes in the storage system, counts them, and applies a majority arbitration mechanism: when most of the computing nodes report that the first storage node is suspected to be a sub-health node, the control node determines that the first storage node is a sub-health node and deletes it from the storage system, so that subsequent IO requests are no longer sent to the first storage node.
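The majority arbitration described above can be illustrated with a minimal Python sketch. The class name `MajorityArbiter`, the node identifiers, and the strict-majority threshold are assumptions made for illustration; the text only says that "most" computing nodes must report the same storage node.

```python
from collections import defaultdict


class MajorityArbiter:
    """Hypothetical control-node helper: mark a node only after most computing nodes report it."""

    def __init__(self, compute_node_count: int) -> None:
        self.compute_node_count = compute_node_count
        # storage node id -> set of computing nodes that reported it as suspect
        self.reports = defaultdict(set)

    def report_suspect(self, reporter: str, storage_node: str) -> bool:
        """Record one report; return True once a strict majority (assumed) agrees."""
        self.reports[storage_node].add(reporter)
        return len(self.reports[storage_node]) * 2 > self.compute_node_count


arbiter = MajorityArbiter(compute_node_count=5)
# Three separate reports are needed before "sn1" is treated as a sub-health node.
decisions = [arbiter.report_suspect(f"cn{i}", "sn1") for i in range(1, 4)]
```

The sketch makes the drawback visible: no decision is reached until several computing nodes have each observed a failed IO request.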

With the sub-health management function described above, a storage node in the sub-health state (i.e., a sub-health node) can be deleted from the storage system, which controls the latency with which the storage system processes IO requests. However, the control node determines a sub-health node through a majority arbitration mechanism; that is, a plurality of IO requests are required before a storage node suspected to be a sub-health node is determined to be one and finally deleted from the storage system. As a result, the storage system still experiences a large delay in processing IO requests.

Disclosure of Invention

The application provides a management method and device of sub-health nodes, which are used for simplifying the process of marking the sub-health nodes and are beneficial to reducing the time delay of processing IO (input/output) requests in a storage system.

In a first aspect, a method for managing sub-health nodes is provided, including: the computing node sends a first input/output (IO) request to a first storage node; the computing node determines that the first storage node is a sub-healthy node based on the first IO request; and the computing node sends an identification message to a control node, wherein the identification message is used for indicating that the first storage node is a sub-health node.

Determining that the storage node is a sub-health node based on the IO request may include determining that the storage node is a sub-health node based on a processing result of the IO request. The processing result includes a failure to process the IO request or a timeout in processing the IO request.

Determining that the storage node is a sub-health node based on the IO request may include making the determination based on that IO request alone, or based on a plurality of IO requests, where the data processed by the plurality of IO requests belong to the same stripe or to the same copy.
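As a rough illustration of the first aspect, the following Python sketch maps the processing result of a single IO request to an identification message. The enum `IOResult`, the function names, and the message fields are hypothetical; the application does not define a message format.

```python
from enum import Enum
from typing import Optional


class IOResult(Enum):
    SUCCESS = "success"
    FAILURE = "failure"
    TIMEOUT = "timeout"


def is_sub_healthy(result: IOResult) -> bool:
    """A failed or timed-out IO request is treated as evidence of sub-health."""
    return result in (IOResult.FAILURE, IOResult.TIMEOUT)


def build_identification_message(storage_node: str, result: IOResult) -> Optional[dict]:
    """Return the message the computing node would send to the control node, if any."""
    if not is_sub_healthy(result):
        return None
    # Field names are assumptions made for this sketch.
    return {"type": "identification", "storage_node": storage_node, "reason": result.value}


message = build_identification_message("sn1", IOResult.TIMEOUT)
```

Unlike the majority arbitration of the background section, a single processing result is enough here to produce the message.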

In the embodiments of the application, the computing node determines that the storage node is a sub-health node based on the IO request and indicates this to the control node, so the process of identifying the sub-health node can be completed based on the IO request. In other words, sub-health nodes are identified at the granularity of IO requests, which reduces the time needed to identify them.

Further, after the method of the embodiment of the present application is applied to a write IO request, the compute node does not send a write request to the sub-health node any more, which is beneficial to reducing the time delay of write request processing caused by the existence of the sub-health node, and improving the efficiency of writing data by the compute node.

In a possible implementation manner, the computing node sends a second IO request to the first storage node; and when a response message that the first storage node refuses to process the second IO request is received, the computing node marks the first storage node as the sub-health node.

In the embodiment of the application, the first storage node refuses to process the second IO request, which indirectly indicates to the computing node that the first storage node is a sub-health node. This avoids the prior-art approach in which the control node pushes the view to the computing node to indicate the sub-health node, and helps reduce the load of the control node.

In one possible implementation, the method further includes: and the computing node writes the data of the first storage node into a substitute storage node of the first storage node.

In the embodiment of the application, when the first storage node is a sub-health node, the computing node writes the data of the first storage node into a substitute storage node of the first storage node, which helps ensure the reliability of data storage in the storage system. This avoids the prior-art problem in which, because the storage node is a sub-health node, the data to be written into it fails to be stored and is instead stored in a degraded mode.

In a possible implementation manner, the computing node sends a third IO request to a second storage node; the computing node receives a processing success response of the third IO request returned by the second storage node; and the computing node receives a processing failure response of the first IO request returned by the first storage node. The determining, by the computing node based on the first IO request, that the first storage node is a sub-health node includes: the computing node determines that the first storage node is the sub-health node according to the processing failure response of the first IO request and the processing success response of the third IO request, wherein the data processed by the first IO request and the third IO request belong to the same stripe, or the data processed by the first IO request and the third IO request belong to the same copy.

In this embodiment of the application, if the second storage node returns a processing success response for the third IO request to the computing node, the probability that the computing node itself has failed is low. If, at the same time, the first storage node returns a processing failure response for the first IO request, the first storage node is very likely a sub-health node. On this basis, the computing node may instruct the control node to mark the first storage node as a sub-health node, which helps improve the accuracy of marking sub-health nodes.
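The cross-check between IO requests of the same stripe or copy can be sketched as follows. The function name `confirm_sub_health` and the representation of responses as a dictionary of booleans are assumptions made for illustration.

```python
def confirm_sub_health(first_node: str, responses: dict) -> bool:
    """Decide whether first_node should be reported as a sub-health node.

    responses maps a storage node id to True (processing success) or
    False (processing failure) for IO requests whose data belong to the
    same stripe or the same copy.
    """
    # The first storage node must have returned a processing failure response.
    if responses.get(first_node) is not False:
        return False
    # At least one peer succeeded, so the computing node itself is probably healthy.
    return any(ok for node, ok in responses.items() if node != first_node)


suspect = confirm_sub_health("sn1", {"sn1": False, "sn2": True})
```

If every storage node fails, the sketch returns False, because the fault may then lie with the computing node rather than with the first storage node.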

In one possible implementation manner, the determining, by the computing node, that the first storage node is a sub-health node based on the first IO request includes: when the first storage node times out in processing the first IO request, the computing node determines that the first storage node is a sub-health node.

In the embodiment of the application, a timeout in the IO processing of the data is used as the condition for marking the sub-health node, so that the latency with which the computing node processes IO requests is controlled.

In a second aspect, a method for managing sub-health nodes is provided, including: the control node determines that the storage node is a sub-health node; and the control node sends indication information to the storage node, wherein the indication information is used for indicating the storage node to stop processing the input/output (IO) request.

In the embodiment of the application, the control node sends the indication information to the storage node to instruct it to stop processing IO requests sent by the computing node. By refusing to process IO requests, the storage node indirectly informs the computing node that it is a sub-health node, so that the storage node is deleted from the storage system. This avoids the prior-art approach of deleting the storage node from the storage system by having the control node push updated views to each computing node, and reduces the load of the control node.

In a possible implementation manner, the sending, by the control node, the indication information to the storage node includes: the control node sends a heartbeat response to the storage node, wherein the heartbeat response carries the indication information. The method further includes: if the control node fails to send the heartbeat response to the storage node, the control node determines that communication with the storage node is interrupted.

In the embodiment of the application, the storage node is deleted from the storage system by carrying the indication information in the heartbeat response, which helps reduce the latency with which the storage system processes IO requests. Further, this helps keep the delay of marking the sub-health node within the same order of magnitude as the heartbeat timeout.

In a possible implementation manner, the indication information indicates that the storage node stops processing the IO request by indicating that the storage node is marked as the sub-healthy node.

In one possible implementation manner, the determining, by the control node, that the storage node is a sub-healthy node includes: the control node receives an identification message sent by a computing node, wherein the identification message requests that the storage node is marked as a sub-health node; and the control node marks the storage node as the sub-health node according to the identification message.

In the embodiments of the application, the computing node determines that the storage node is a sub-health node based on the IO request and indicates this to the control node, so the process of identifying the sub-health node can be completed based on the IO request. In other words, sub-health nodes are identified at the granularity of IO requests, which reduces the time needed to identify them. Further, when the method of the embodiments of the present application is applied to write IO requests, the computing node no longer sends write requests to the sub-health node, which helps reduce the write latency caused by the sub-health node and improves the efficiency with which the computing node writes data.

In a third aspect, a method for managing sub-health nodes is provided, including: under the condition that a storage node is a sub-health node, the storage node receives indication information sent by a control node, wherein the indication information is used for indicating the storage node to stop processing an input/output (IO) request; the storage node stops processing IO requests.

In a possible implementation manner, the receiving, by the storage node, indication information sent by the control node includes: the storage node receives a heartbeat response sent by the control node, wherein the heartbeat response carries the indication information. The method further includes: if the storage node does not receive the heartbeat response, the storage node determines that communication with the control node is interrupted.

In a fourth aspect, a method for managing sub-health nodes is provided, including: if a storage node receives a heartbeat response sent by a control node, and the heartbeat response is used for indicating that the storage node is a sub-health node, the storage node determines that it is a sub-health node; and if the storage node fails to receive the heartbeat response, the storage node determines that it has been deleted from the storage system.

In the embodiment of the application, if the storage node receives the heartbeat response, the heartbeat response indicates that the storage node is a sub-health node; that is, the process of marking the sub-health node is combined with the heartbeat mechanism, so that the storage node is deleted from the storage system, which helps reduce the latency with which the storage system processes IO requests. Further, this helps keep the delay of marking the sub-health node within the same order of magnitude as the heartbeat timeout.

In one possible implementation, the heartbeat reply is further used to instruct the storage node to stop processing an input/output IO request, and the method further includes: and the storage node stops processing the IO request according to the heartbeat response.

In the embodiment of the application, the control node instructs the storage node through the heartbeat response to stop processing IO requests, so that the storage node refuses to process IO requests and thereby indirectly informs the computing node that it is a sub-health node, so that the storage node is deleted from the storage system. This avoids the prior-art approach of deleting the storage node from the storage system by having the control node push the updated view to each computing node, which reduces both the load of the control node and the bandwidth occupied by pushing views.

In a fifth aspect, a method for managing sub-health nodes is provided, including: a control node sends a heartbeat response to a storage node, wherein the heartbeat response is used for indicating that the storage node is a sub-health node; and if sending the heartbeat response fails, the control node determines that communication with the storage node is interrupted.

In a possible implementation, the heartbeat reply is further used to instruct the storage node to stop processing input/output IO requests.

In a sixth aspect, a computing node is provided, which has the function of implementing the computing node in the above method design. These functions may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above-described functions.

In a seventh aspect, a control node is provided, where the control node has a function of implementing the control node in the above method design. These functions may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above-described functions.

In an eighth aspect, a storage node is provided, where the storage node has a function of implementing the first storage node or the storage node in the above method design. These functions may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above-described functions.

In a ninth aspect, there is provided a control node comprising a processor and a memory, the memory being adapted to store a computer program, the processor being adapted to invoke and run the computer program from the memory such that the control node performs the methods of the above aspects.

In a tenth aspect, a computing node is provided, comprising a processor and a memory, the memory being configured to store a computer program, the processor being configured to invoke and run the computer program from the memory, such that the computing node performs the method of the above aspects.

In an eleventh aspect, there is provided a storage node comprising a processor and a memory, the memory being used to store a computer program, the processor being used to invoke and run the computer program from the memory, such that the storage node performs the method of the above aspects.

In a twelfth aspect, there is provided a computer program product comprising: computer program code which, when run on a computer, causes the computer to perform the method of the above-mentioned aspects.

It should be noted that, all or part of the computer program code may be stored in the first storage medium, where the first storage medium may be packaged together with the processor or may be packaged separately from the processor, and this is not specifically limited in this embodiment of the present application.

In a thirteenth aspect, a computer-readable medium is provided, which stores program code, which, when run on a computer, causes the computer to perform the method in the above-mentioned aspects.

In a fourteenth aspect, a chip system is provided, which comprises a processor, and is configured to implement the functions referred to in the above aspects, such as generating, receiving, sending, or processing data and/or information referred to in the above methods. In one possible design, the system-on-chip further includes a memory for storing program instructions and data necessary for the terminal device. The chip system may be formed by a chip, or may include a chip and other discrete devices.

The device in which the chip system is located may be any one of a computing node, a control node, a storage node, and a first storage node.

Drawings

Fig. 1 is a schematic diagram of a storage system to which an embodiment of the present application is applied.

Fig. 2 is a schematic flow chart of a management method of a sub-health node according to an embodiment of the present application.

Fig. 3 is a schematic flow chart of a management method of a sub-health node according to an embodiment of the present application.

Fig. 4 is a schematic flow chart of a management method of a sub-health node according to an embodiment of the present application.

Fig. 5 is a schematic diagram of a management apparatus of a sub-health node according to an embodiment of the present application.

Fig. 6 is a schematic diagram of a compute node according to another embodiment of the present application.

Fig. 7 is a schematic diagram of a management apparatus of a sub-health node according to an embodiment of the present application.

Fig. 8 is a schematic diagram of a management apparatus of a sub-health node according to an embodiment of the present application.

Fig. 9 is a schematic diagram of a management apparatus of a sub-health node according to an embodiment of the present application.

Fig. 10 is a schematic diagram of a control node according to another embodiment of the present application.

Fig. 11 is a schematic diagram of a management apparatus of a sub-health node according to an embodiment of the present application.

Fig. 12 is a schematic diagram of a management apparatus of a sub-health node according to an embodiment of the present application.

Fig. 13 is a schematic diagram of a storage node according to another embodiment of the present application.

Detailed Description

The technical solution in the present application will be described below with reference to the accompanying drawings.

Fig. 1 is a schematic diagram of a storage system to which an embodiment of the present application is applied. The storage system includes a control node 110, a compute node 120, and a storage node 130. It should be noted that the number of the computing nodes 120 and the storage nodes 130 may be one or more. For example, as shown in the figures

The control node 110 is configured to monitor the operating status of the compute node 120 and the storage node 130 in the storage system, for example, monitor whether the storage node 130 is in a sub-health state.

It should be noted that the control node may be implemented by relying on one or more physical devices, and the physical devices may be computers or servers.

The computing node 120 is configured to determine a storage node corresponding to the IO request, and send the IO request to the corresponding storage node.

For example, in an erasure coding (EC) scenario, storage space 1, storage space 2, and storage space 3 form a stripe, where storage space 1 is located on storage node 1, storage space 2 on storage node 2, and storage space 3 on storage node 3. The computing node divides data into data block 1 and data block 2 and obtains a coding block of data block 1 and data block 2 according to EC coding; data block 1, data block 2, and the coding block form a stripe. The computing node then sends an IO request to each storage node: specifically, the computing node 120 sends an IO request carrying data block 1 to storage node 1, an IO request carrying data block 2 to storage node 2, and an IO request carrying the coding block to storage node 3. These IO requests are write IO requests. Correspondingly, the computing node may also perform read operations on storage node 1, storage node 2, and storage node 3 by sending read IO requests for the stored data block 1, data block 2, and coding block.
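The stripe layout in the EC example can be sketched in Python as follows. The application does not say which erasure code is used; byte-wise XOR parity is assumed here purely for illustration, and the function names are hypothetical.

```python
def build_stripe(data: bytes) -> dict:
    """Split data into two data blocks plus one coding block (XOR parity, an assumption)."""
    half = (len(data) + 1) // 2
    block1 = data[:half]
    # Pad the second block with zero bytes so both data blocks have equal length.
    block2 = data[half:].ljust(half, b"\x00")
    coding = bytes(a ^ b for a, b in zip(block1, block2))
    # One write IO request per storage node, keyed by its target node.
    return {"storage node 1": block1, "storage node 2": block2, "storage node 3": coding}


def recover_block1(block2: bytes, coding: bytes) -> bytes:
    """Rebuild data block 1 from the surviving data block and the coding block."""
    return bytes(a ^ b for a, b in zip(block2, coding))


stripe = build_stripe(b"abcdefg")
rebuilt = recover_block1(stripe["storage node 2"], stripe["storage node 3"])
```

With this layout, the stripe remains readable when any one of the three storage nodes is unavailable, which is why a single slow node can be replaced or bypassed.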

For another example, in a multi-copy scenario, such as a 3-copy scenario, storage space 1, storage space 2, and storage space 3 constitute a stripe, where storage space 1 is located on storage node 1, storage space 2 on storage node 2, and storage space 3 on storage node 3. After receiving a write IO request, the computing node sends the data carried in the write IO request to storage node 1, storage node 2, and storage node 3 respectively. Correspondingly, the computing node may also perform read IO operations on storage node 1, storage node 2, and storage node 3.

It should be noted that the computing nodes in the storage system may be implemented by relying on one or more physical devices, and the physical devices may be computers or servers.

The storage node 130 is used for providing a storage space, wherein the data stored in the storage node 130 may be metadata of the data, or the metadata of the data and the data themselves.

It should be noted that the storage nodes in the storage system may be implemented by relying on one or more physical devices, and the physical devices may be memories or storage servers.

The embodiment of the application provides a management method of sub-health nodes, so that the time for determining the sub-health nodes is reduced, and further the time delay for processing IO (input/output) requests is reduced. This is described in more detail below in connection with the method shown in fig. 2.

Fig. 2 is a schematic flow chart of a management method of a sub-health node according to an embodiment of the present application, which may include steps 210 to 220.

210, the computing node sends an IO request to the storage node and determines, based on the IO request, that the storage node is a sub-health node.

For the process of sending the IO request to the storage node, refer to the foregoing description. The IO request may be a read IO request or a write IO request.

Optionally, determining that the storage node is a sub-health node based on the IO request includes: when the storage node times out in processing the IO request, the computing node determines that the storage node is a sub-health node. For example, if the storage node does not return feedback indicating a successful operation to the computing node within a preset time threshold, the computing node may determine that the storage node is a sub-health node. For another example, if the computing node determines that data packets were lost when the IO request was sent to the storage node, the computing node may determine that the storage node is a sub-health node. In the embodiments of the present application, a storage node in the sub-health state, that is, a sub-health node, refers to a node operating in an abnormal state, for example, one whose response time is too long, whose IO request processing times out, or whose resource occupation is too high.

It should be noted that the computing node may also determine that the storage node is a sub-healthy node through other conditions. The embodiments of the present application do not limit this. For example, the compute node determines that communication with the storage node is interrupted, and the compute node determines that the storage node is a sub-healthy node.

It should be noted that the IO request processing may time out for many reasons, which are not limited in the embodiments of the present application. For example, the first storage node may be overloaded, or the first storage node may have a software failure.
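The timeout check in step 210 can be sketched with Python's standard `concurrent.futures` module. The helper `send_io_with_timeout`, the callable `send_io`, and the threshold values are hypothetical; a real computing node would issue the IO request over the network rather than call a local function.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def send_io_with_timeout(send_io, timeout_s: float) -> str:
    """Run one IO request and classify it as 'success', 'failure' or 'timeout'."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(send_io)
    try:
        ok = future.result(timeout=timeout_s)
        return "success" if ok else "failure"
    except FutureTimeout:
        # No successful feedback within the preset time threshold.
        return "timeout"
    except Exception:
        # Any other error raised by the request counts as a processing failure.
        return "failure"
    finally:
        pool.shutdown(wait=False)


def slow_storage_node() -> bool:
    """Stand-in for an overloaded storage node that answers too late."""
    time.sleep(0.3)
    return True


outcome = send_io_with_timeout(slow_storage_node, timeout_s=0.05)
```

Either a "timeout" or a "failure" outcome would lead the computing node to send the identification message of step 220.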

220, the computing node sends an identification message to the control node, where the identification message is used to indicate that the storage node is a sub-health node.

Correspondingly, the control node receives the identification message sent by the computing node and marks the storage node as a sub-health node. Specifically, the control node marks the storage node as a sub-health node in the view. The control node pushes the updated view to all the computing nodes and the storage node in the storage system, so that the computing nodes can learn from the updated view that the storage node is a sub-health node and no longer send IO requests to it. The view indicates the relationship between storage space and storage nodes, for example, the correspondence between stripes and storage nodes. This correspondence may be direct, or it may be indirect, for example, a relationship between a stripe and a partition together with a relationship between the partition and a storage node. In another implementation, the updated view may further include information about a storage node that replaces the sub-health node, so that the computing node sends IO requests to the replacement storage node instead.
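A view with an indirect stripe-to-partition-to-node mapping might look like the following Python sketch. The class `View`, its field names, and the node identifiers are assumptions made for illustration and are not taken from the application.

```python
from dataclasses import dataclass, field


@dataclass
class View:
    """Hypothetical view: stripes map to partitions, partitions map to storage nodes."""
    stripe_to_partition: dict = field(default_factory=dict)
    partition_to_nodes: dict = field(default_factory=dict)
    sub_healthy: set = field(default_factory=set)
    substitutes: dict = field(default_factory=dict)

    def mark_sub_healthy(self, node: str, substitute: str = None) -> None:
        """Mark a node in the view and optionally record its replacement node."""
        self.sub_healthy.add(node)
        if substitute is not None:
            self.substitutes[node] = substitute

    def targets_for(self, stripe: str) -> list:
        """Return the storage nodes a computing node should send IO requests to."""
        targets = []
        for node in self.partition_to_nodes[self.stripe_to_partition[stripe]]:
            if node not in self.sub_healthy:
                targets.append(node)
            elif node in self.substitutes:
                targets.append(self.substitutes[node])
        return targets


view = View({"stripe-1": "p1"}, {"p1": ["sn1", "sn2", "sn3"]})
before = view.targets_for("stripe-1")
view.mark_sub_healthy("sn2", substitute="sn4")
after = view.targets_for("stripe-1")
```

Every change to such a view has to reach each computing node before it takes effect, which is the pushing cost discussed in the following paragraphs.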

In another implementation manner, the control node may also directly delete the storage node in the view after receiving the identification message sent by the computing node, and push the updated view to all the computing nodes in the storage system, so that the computing nodes in the storage system cannot find the storage node through the view any more, that is, the storage node is deleted from the storage system.

Regardless of the above method of marking sub-health nodes in a view or the above method of directly deleting sub-health nodes from a view, the view needs to be updated, and then the control node needs to push the updated view to all the computing nodes. However, the number of nodes involved in the view pushing process is large, which results in that the time for the control node to push the view is long, and the view is frequently updated, which greatly increases the load of the control node and occupies a large amount of bandwidth of the storage system. In order to reduce the load generated by the view pushed by the control node and save the bandwidth of the storage system, the application also provides a management method of the sub-health node, so that the sub-health node in the storage system stops processing the IO request, and the same purpose as that of pushing the view is achieved.

A schematic flow chart of a method for managing sub-health nodes according to an embodiment of the present application is described below with reference to fig. 3. It should be noted that the method shown in fig. 3 may also be applied to the storage system shown in fig. 1, and the same terms in the method shown in fig. 3 as those in the method shown in fig. 2 may represent the same meanings, and are not described herein again for brevity. The method shown in fig. 3 includes step 310 and step 320.

310, the control node determines that the storage node is a sub-health node.

It should be noted that, the way that the control node determines that the storage node is a sub-healthy node may use the method for identifying a sub-healthy node provided in the embodiment of the present application, that is, the method shown in fig. 2. That is, the method shown in fig. 2 may be used in combination with the method shown in fig. 3, and a specific method flow may be seen in fig. 4. The above manner for determining, by the control node, that the storage node is a sub-health node may also use a conventional method for determining a sub-health node, which is not limited in this embodiment of the present application.

320, the control node sends indication information to the storage node, wherein the indication information is used for instructing the storage node to stop processing IO requests. Accordingly, the storage node stops processing IO requests after receiving the indication information, and then deletes itself from the storage system.

In the embodiment of the application, the control node sends the indication information to the storage node to instruct it to stop processing IO requests sent by the computing node. When the computing node receives a response message indicating that the storage node has stopped processing IO requests, it determines that the storage node is a sub-health node. This reduces the sending of views, reduces the load of the control node, and saves bandwidth in the storage system.
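The interaction of steps 310 and 320 with the computing node can be sketched as follows. The classes, method names, and the response strings "ok" and "rejected" are hypothetical and only illustrate the idea that a rejection replaces a pushed view.

```python
class StorageNode:
    """Hypothetical storage node that rejects IO requests once instructed to stop."""

    def __init__(self, node_id: str) -> None:
        self.node_id = node_id
        self.accepting_io = True

    def on_indication(self) -> None:
        """Handle the indication information sent by the control node (step 320)."""
        self.accepting_io = False

    def handle_io(self, request: str) -> str:
        return "ok" if self.accepting_io else "rejected"


class ComputeNode:
    """Hypothetical computing node that learns about sub-health nodes from rejections."""

    def __init__(self) -> None:
        self.local_sub_healthy = set()

    def send_io(self, node: StorageNode, request: str) -> str:
        response = node.handle_io(request)
        if response == "rejected":
            # No view push is needed: the rejection itself marks the node locally.
            self.local_sub_healthy.add(node.node_id)
        return response


sn1 = StorageNode("sn1")
cn = ComputeNode()
first = cn.send_io(sn1, "write block 1")
sn1.on_indication()
second = cn.send_io(sn1, "write block 2")
```

In this sketch only computing nodes that actually address the sub-health node learn about it, so the control node sends one message instead of one view per computing node.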

Accordingly, the storage node may determine, according to the indication information, that it has been deleted from the storage system. The storage node may also determine that it is a sub-healthy node according to the alternative implementation described below, so that it is deleted from the storage system.
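Steps 310 and 320 can be illustrated with a minimal sketch. The class names, method names, and the "ok"/"refused" status strings below are hypothetical and are not taken from the embodiment; how the control node reaches the determination in step 310 is deliberately left to the caller.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class StorageNode:
    """Hypothetical storage node that honours a stop-IO indication."""
    node_id: str
    accepting_io: bool = True

    def on_indication(self, stop_io: bool) -> None:
        # Receiver side of step 320: stop processing IO requests when told to.
        if stop_io:
            self.accepting_io = False

    def handle_io(self, request: str) -> str:
        # Once the indication has been received, every IO request is refused.
        return "ok" if self.accepting_io else "refused"


@dataclass
class ControlNode:
    """Hypothetical control node that records sub-healthy storage nodes."""
    sub_healthy: Set[str] = field(default_factory=set)

    def mark_sub_healthy(self, node: StorageNode) -> None:
        # Step 310: record the determination (made elsewhere, for example
        # from an identification message sent by a computing node).
        self.sub_healthy.add(node.node_id)
        # Step 320: send the indication information to the storage node.
        node.on_indication(stop_io=True)
```

In this sketch the indication is delivered by a direct method call; in a real system it would travel over the network, for example inside a heartbeat response.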

As an alternative implementation of step 320, sub-health management of the storage node may be implemented based on a heartbeat mechanism between the control node and the storage node. In other words, the storage node is informed through the heartbeat mechanism that it is a sub-health node.

The above heartbeat mechanism can be simply understood as follows: a node that does not send a heartbeat within a preset time period, or that does not receive a heartbeat response, is regarded as a failed node and is deleted from the storage system. Therefore, under the heartbeat mechanism, even if the storage node does not receive the heartbeat response within the preset time and so cannot learn that it is a sub-health node, the storage node can still learn that it has been deleted from the storage system because no heartbeat response has been received for a long time. The storage node can then perform corresponding operations according to its own state, such as a self-check process or a recovery process, for example starting a recovery process to resume normal operation. Of course, after receiving the heartbeat response, the storage node may determine, according to the sub-health node indication information carried in the heartbeat response, that it has been determined to be a sub-health node. Accordingly, the storage node may perform operations such as a self-check process or a recovery process according to its own state.

In the embodiment of the application, sub-health state management of the storage node is implemented by carrying the sub-health node indication information in the heartbeat response, that is, based on the heartbeat mechanism. The sub-health node can therefore be deleted from the storage system regardless of whether it receives the sub-health node indication information, which helps reduce the latency with which the storage system processes IO requests. Furthermore, this helps keep the latency of the sub-health node processing flow within the same order of magnitude as the heartbeat timeout.

It should be noted that the sub-health node indication information may be carried in the heartbeat response, or may be sent in a separate message, which is not limited in this embodiment of the present application.

Optionally, the indication information and the sub-health node indication information may be the same information, or may be two pieces of information sent independently. If the indication information and the sub-health node indication information are the same information and are carried in the heartbeat response, the storage node can, after receiving the heartbeat response, determine that it is a sub-health node and stop processing IO requests.
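The two outcomes of the heartbeat mechanism, as seen by the storage node, can be sketched as a small state function. This is only an illustration: the `NodeState` enum, the `sub_healthy` key in the response, and the parameter names are assumptions, and the embodiment does not prescribe any message format or timeout value.

```python
from enum import Enum
from typing import Optional


class NodeState(Enum):
    HEALTHY = "healthy"
    SUB_HEALTHY = "sub_healthy"
    REMOVED = "removed"


def next_state(current: NodeState,
               response: Optional[dict],
               silence_s: float,
               timeout_s: float) -> NodeState:
    """Return the state a storage node infers from one heartbeat round.

    response  -- the heartbeat response, or None if none arrived
    silence_s -- seconds since the last heartbeat response was received
    timeout_s -- preset heartbeat timeout (hypothetical parameter)
    """
    if response is not None:
        # The response may carry the sub-health node indication information.
        if response.get("sub_healthy", False):
            return NodeState.SUB_HEALTHY
        return current
    if silence_s >= timeout_s:
        # No response for too long: the node concludes that it has been
        # deleted from the storage system.
        return NodeState.REMOVED
    return current
```

In either the `SUB_HEALTHY` or the `REMOVED` state, the node could then start a self-check or recovery process, as described above.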

Correspondingly, the method shown in the embodiment of the present application further includes: the computing node sends a first IO request to the storage node; when the storage node refuses to process the first IO request, the computing node marks the storage node as a sub-health node.

The first IO request may be a read request or a write request.

The storage node refusing to process the first IO request may include the storage node returning a response message indicating that it has stopped processing IO requests, so as to notify the computing node that the storage node is a sub-healthy node. For example, the storage node returns a request error to the computing node to inform the computing node that the storage node refuses to process the first IO request. When the first IO request is a write request, the storage node may return a write error to the computing node. When the first IO request is a read request, the storage node may return a read error to the computing node.

It should be noted that the computing node sending the read request may be the computing node that identified the sub-healthy node above, or may be another computing node in the storage system.
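The computing-node side of this exchange can be sketched as follows. The `ComputeNode` class, the `transport` callable, and the status strings `"read_error"` and `"write_error"` are hypothetical names chosen for illustration; the embodiment only states that a read error or write error is returned.

```python
from dataclasses import dataclass, field
from typing import Callable, Set


@dataclass
class ComputeNode:
    """Hypothetical computing node keeping a local set of sub-healthy nodes."""
    sub_healthy: Set[str] = field(default_factory=set)

    def send_io(self, node_id: str, op: str,
                transport: Callable[[str, str], str]) -> str:
        # The first IO request may be a read request or a write request.
        if op not in ("read", "write"):
            raise ValueError("op must be 'read' or 'write'")
        status = transport(node_id, op)
        # A refusal (read error or write error) tells the computing node
        # that the storage node is a sub-healthy node.
        if status in ("read_error", "write_error"):
            self.sub_healthy.add(node_id)
        return status
```

Because the marking happens on the request path, any computing node that sends a request to the refused storage node learns its state without waiting for a new view.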

As described above, after the storage node is deleted from the storage system, data may be stored in a degraded mode, which reduces the reliability of data storage. For example, in the EC scenario described above, storage node 1, storage node 2, and storage node 3 provide storage space for a stripe. If storage node 1 is a sub-healthy node and data block 1 of the stripe is stored on storage node 1, only data block 2 and the coding block remain available, which results in degraded storage.
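The degraded state in this example can be made concrete with a small sketch. It assumes, purely for illustration, that the coding block is a single XOR parity over the two data blocks; the embodiment does not say which erasure code is used, and the block contents are made up.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# One stripe: two data blocks plus one coding block (assumed XOR parity).
data_block_1 = b"ABCD"   # stored on storage node 1 (sub-healthy)
data_block_2 = b"WXYZ"   # stored on storage node 2
coding_block = xor_blocks(data_block_1, data_block_2)  # on storage node 3

# Degraded read: data block 1 is rebuilt from the two surviving blocks.
recovered = xor_blocks(data_block_2, coding_block)
```

The stripe is still readable, but with only two of three blocks available there is no redundancy left: losing either remaining block would lose data, which is why a backup storage node is useful.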

In order to avoid degraded storage of data due to the existence of sub-healthy nodes, a backup storage node can be configured for the storage node. After the storage node becomes a sub-healthy node, the backup storage node takes over from the storage node and provides storage space for the data.

That is, the method of the embodiment of the present application further includes: the computing node determines a backup storage node of the storage node; and the computing node writes, to the backup storage node, the data that was to be written to the storage node.

The backup storage node of the sub-health node may be one of a plurality of backup storage nodes in the storage system, and the plurality of backup storage nodes may correspond to the plurality of storage nodes. The plurality of backup storage nodes and the plurality of storage nodes may be in a one-to-one correspondence or a one-to-many relationship, which is not limited in the embodiment of the present application.
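Redirecting a write to the backup storage node can be sketched as a lookup. The function name and the `backup_of` mapping are hypothetical; the mapping stands in for whatever correspondence the storage system actually maintains.

```python
from typing import Dict, Set


def choose_write_target(node_id: str,
                        sub_healthy: Set[str],
                        backup_of: Dict[str, str]) -> str:
    """Return the node that should receive data destined for node_id."""
    if node_id not in sub_healthy:
        return node_id
    if node_id not in backup_of:
        raise LookupError("no backup storage node configured for " + node_id)
    # The backup storage node takes over while node_id is sub-healthy.
    return backup_of[node_id]
```

A one-to-many relationship is expressed here by letting several keys of `backup_of` map to the same backup node, for example `{"sn1": "bk1", "sn2": "bk1"}`.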

It should be noted that, after the storage node is identified as a sub-healthy node, the storage node migrates all stored data to the backup storage node. While the storage node is a sub-healthy node, the backup storage node may be regarded as the storage node and implements the function of the storage node. Accordingly, after the storage node resumes normal operation, incremental data in the backup storage node may be migrated back to the storage node.

It should be further noted that the process of migrating data from the storage node to the backup storage node may be executed a period of time after the storage node is identified as a sub-health node. If, after that period, the storage node is still a sub-health node and has not recovered, the storage node may be considered to be genuinely sub-healthy, and the data migration is performed at that point. This avoids the situation in which the storage node resumes normal operation during the data migration or just as the migration completes. Of course, the data migration may also be executed immediately after the storage node is identified as a sub-healthy node, which is not limited in this embodiment of the application.
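The deferred migration described above amounts to a grace-period check, sketched below. The function and parameter names are hypothetical, and the length of the waiting period is not specified by the embodiment.

```python
def should_migrate(marked_at_s: float,
                   now_s: float,
                   still_sub_healthy: bool,
                   grace_period_s: float) -> bool:
    """Decide whether to start migrating data to the backup storage node.

    marked_at_s    -- time at which the node was identified as sub-healthy
    now_s          -- current time, on the same clock as marked_at_s
    grace_period_s -- waiting period before migration (0 means immediate)
    """
    if not still_sub_healthy:
        # The node recovered in the meantime, so no migration is needed.
        return False
    return now_s - marked_at_s >= grace_period_s
```

Setting `grace_period_s` to 0 corresponds to the other option mentioned above, in which migration starts as soon as the node is identified as sub-healthy.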

It should be understood that the correspondence between the storage node and the backup storage node may be embodied by the view, or maintained by the control node separately, which is not limited in this embodiment of the application.

In order to ensure that the computing node accurately identifies a storage node as a sub-healthy node, the computing node sends IO requests to a plurality of storage nodes including that storage node. In addition to the IO request for that storage node meeting the condition for determining that the storage node is a sub-healthy node, at least one other storage node among the plurality of storage nodes needs to process its IO request successfully. This avoids misidentification of a sub-healthy storage node caused by the computing node itself. The data processed by the IO requests sent to the plurality of storage nodes belongs to the same stripe or the same copy. When the IO request for the storage node meets the condition for determining that the storage node is a sub-healthy node, the computing node receives an IO request processing failure response from the storage node, and at least one other storage node returns an IO request processing success response.
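The cross-check described above can be sketched as a predicate over the results of IO requests that belong to the same stripe or copy. The function name and the status strings `"success"`, `"failure"`, and `"timeout"` are assumptions made for illustration.

```python
from typing import Dict


def is_sub_healthy(target: str, results: Dict[str, str]) -> bool:
    """Decide whether `target` should be identified as sub-healthy.

    results maps each storage node of one stripe (or copy) to the outcome
    of the IO request sent to it: "success", "failure", or "timeout".
    """
    # The request to the target itself must have failed or timed out.
    if results.get(target) not in ("failure", "timeout"):
        return False
    # At least one other node must have succeeded; otherwise the fault may
    # lie with the computing node rather than with the target.
    return any(status == "success"
               for node, status in results.items()
               if node != target)
```

If every request fails, the predicate returns False, since a fault on the computing node or its network link is then a more likely explanation than one sub-healthy storage node.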

The method for managing sub-health nodes according to the embodiment of the present application is described in detail above with reference to fig. 1 to 4, and the apparatus according to the embodiment of the present application is described in detail below with reference to fig. 5 to 13. It should be noted that the apparatuses shown in fig. 5 to 13 may implement one or more steps of the above method, and are not described herein again for brevity.

Fig. 5 is a schematic diagram of a management apparatus for sub-health nodes according to an embodiment of the present application, and the apparatus 500 shown in fig. 5 may be any one of the computing nodes in fig. 1. The apparatus 500 includes a sending module 510 and a processing module 520.

A sending module 510, configured to send a first input/output IO request to a first storage node;

a processing module 520, configured to determine that the first storage node is a sub-healthy node based on the first IO request;

the sending module 510 is further configured to send an identification message to a control node, where the identification message is used to indicate that the first storage node is a sub-health node.

Optionally, as a possible implementation manner, the sending module is further configured to send a second IO request to the first storage node; when receiving a response message that the first storage node refuses to process the second IO request, the processing module is further configured to mark the first storage node as the sub-health node.

Optionally, as a possible implementation manner, the processing module is further configured to write data of the first storage node into a backup storage node of the first storage node.

Optionally, as a possible implementation manner, the apparatus further includes a receiving module. The sending module is further configured to send a third IO request to a second storage node; the receiving module is configured to receive a processing success response of the third IO request returned by the second storage node; the receiving module is further configured to receive a processing failure response of the first IO request returned by the first storage node; the processing module is further configured to determine that the first storage node is the sub-healthy node according to the processing failure response of the first IO request and the processing success response of the third IO request, where data processed by the first IO request and data processed by the third IO request belong to the same stripe, or data processed by the first IO request and data processed by the third IO request belong to the same copy.

Optionally, as a possible implementation manner, when the first storage node times out in processing the first IO request, the processing module is configured to determine that the first storage node is a sub-healthy node.

Optionally, as a possible implementation manner, the first IO request is a write request or a read request.
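The timeout-based determination made by the processing module can be sketched with a worker thread and a bounded wait. This is only one possible approach, not the mechanism of the embodiment; `io_call` and `timeout_s` are hypothetical names, and a real IO path would normally use its own asynchronous transport.

```python
import concurrent.futures
from typing import Callable


def io_timed_out(io_call: Callable[[], object], timeout_s: float) -> bool:
    """Run io_call in a worker thread; report whether it exceeded timeout_s."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(io_call)
    try:
        # Wait at most timeout_s seconds for the IO request to complete.
        future.result(timeout=timeout_s)
        return False
    except concurrent.futures.TimeoutError:
        # No completion within the limit: treat the node as sub-healthy.
        return True
    finally:
        # Do not block the caller on a request that is still in flight.
        pool.shutdown(wait=False)
```

An exception raised by `io_call` itself is re-raised by `future.result`, so request errors stay distinguishable from timeouts.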

In an alternative embodiment, the processing module 520 may be a processor 620, the sending module 510 may be an input/output interface 630, and the computing node may further include a memory 610, as shown in fig. 6.

Fig. 6 is a schematic diagram of a computing node according to another embodiment of the present application. The computing node 600 shown in fig. 6 may include: a memory 610, a processor 620, and an input/output interface 630. The memory 610, the processor 620, and the input/output interface 630 are connected through an internal connection path. The memory 610 is used for storing instructions, and the processor 620 is used for executing the instructions stored in the memory 610, so as to control the input/output interface 630 to receive input data and information and to output data such as operation results.

In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in the processor 620 or by instructions in the form of software. The method disclosed in the embodiments of the present application may be directly implemented by a hardware processor, or may be implemented by a combination of hardware and software modules in the processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or registers. The storage medium is located in the memory 610, and the processor 620 reads the information in the memory 610 and performs the steps of the above method in combination with its hardware. To avoid repetition, details are not described here.

Fig. 7 is a schematic diagram of a management apparatus for sub-health nodes according to an embodiment of the present application, and an apparatus 700 shown in fig. 7 may be a control node in fig. 1. The apparatus 700 comprises: a receiving module 710 and a processing module 720.

A receiving module 710, configured to receive an identification message sent by a computing node, where the identification message indicates that a storage node is directly marked as a sub-health node;

and the processing module 720 is configured to mark the storage node as the sub-healthy node.

Fig. 8 is a schematic diagram of a management apparatus for sub-health nodes according to an embodiment of the present application, and an apparatus 800 shown in fig. 8 may be a control node in fig. 1. The apparatus 800 comprises: a processing module 810 and a sending module 820.

A processing module 810, configured to determine that a storage node is a sub-healthy node;

a sending module 820, configured to send indication information to the storage node, where the indication information is used to indicate that the storage node stops processing the input/output IO request.

Optionally, as an embodiment, the sending module is configured to send a heartbeat response to the first storage node, where the heartbeat response carries the indication information.

Optionally, as an embodiment, the indication information indicates that the first storage node stops processing the IO request by indicating to mark the first storage node as the sub-healthy node.

Optionally, as an embodiment, the apparatus includes a receiving module, configured to receive an identification message sent by a computing node, where the identification message requests that the first storage node be marked as a sub-healthy node; and the processing module is used for marking the first storage node as the sub-health node according to the identification message.

Fig. 9 is a schematic diagram of a management apparatus for sub-health nodes according to an embodiment of the present application, and the apparatus 900 shown in fig. 9 may be a control node in fig. 1. The apparatus 900 comprises: a sending module 910 and a processing module 920.

A sending module 910, configured to send a heartbeat response to a storage node, where the heartbeat response is used to indicate that the storage node is a sub-health node;

the processing module 920 is configured to determine that communication with the storage node is interrupted if feedback for the heartbeat response fails.

Optionally, as an embodiment, the heartbeat reply is further used to instruct the storage node to stop processing the input/output IO request.

In an alternative embodiment, the apparatus 700 may be a control node 1000, the processing module 720 may be a processor 1020, the receiving module 710 may be an input/output interface 1030, and the control node 1000 may further include a memory 1010, as shown in fig. 10.

In an alternative embodiment, the apparatus 800 may be a control node 1000, the processing module 810 may be a processor 1020, the sending module 820 may be an input/output interface 1030, and the control node 1000 may further include a memory 1010, as specifically shown in fig. 10.

In an alternative embodiment, the apparatus 900 may be a control node 1000, the processing module 920 may be a processor 1020, the sending module 910 may be an input/output interface 1030, and the control node 1000 may further include a memory 1010, as specifically shown in fig. 10.

Fig. 10 is a schematic diagram of a control node according to another embodiment of the present application. The control node 1000 shown in fig. 10 may include: a memory 1010, a processor 1020, and an input/output interface 1030. The memory 1010, the processor 1020, and the input/output interface 1030 are connected through an internal connection path. The memory 1010 is used for storing instructions, and the processor 1020 is used for executing the instructions stored in the memory 1010, so as to control the input/output interface 1030 to receive input data and information and to output data such as operation results.

In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in the processor 1020 or by instructions in the form of software. The method disclosed in the embodiments of the present application may be directly implemented by a hardware processor, or may be implemented by a combination of hardware and software modules in the processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or registers. The storage medium is located in the memory 1010, and the processor 1020 reads the information in the memory 1010 and performs the steps of the above method in combination with its hardware. To avoid repetition, details are not described here.

Fig. 11 is a schematic diagram of a management apparatus for sub-health nodes according to an embodiment of the present application, and the apparatus 1100 shown in fig. 11 may be any storage node in fig. 1. The storage node 1100 comprises a receiving module 1110 and a processing module 1120.

When the first storage node is a sub-health node, the receiving module 1110 is configured to receive indication information sent by a control node, where the indication information is used to indicate that the first storage node stops processing an input/output (IO) request;

the processing module 1120 is configured to stop processing the IO request.

Optionally, as an embodiment, the receiving module is configured to receive a heartbeat response sent by the control node, where the heartbeat response carries the indication information.

Fig. 12 is a schematic diagram of a management apparatus for sub-health nodes according to an embodiment of the present application, and the apparatus 1200 shown in fig. 12 may be any storage node in fig. 1. The storage node 1200 comprises a processing module 1210 and a receiving module 1220.

If a heartbeat response sent by a control node is received through the receiving module 1220 and the heartbeat response is used to indicate that the storage node is a sub-healthy node, the processing module 1210 is configured to determine that the storage node is a sub-healthy node;

if the heartbeat response feedback fails, the processing module 1210 is configured to determine that the storage node is deleted from the storage system.

In an alternative embodiment, the processing module 1120 may be a processor 1320, the receiving module 1110 may be an input/output interface 1330, and the storage node 1300 may further include a memory 1310, as shown in fig. 13.

In an alternative embodiment, the processing module 1210 may be a processor 1320, the receiving module 1220 may be an input/output interface 1330, and the storage node 1300 may further include a memory 1310, as shown in fig. 13.

Fig. 13 is a schematic diagram of a storage node according to another embodiment of the present application. The storage node 1300 shown in fig. 13 may include: a memory 1310, a processor 1320, and an input/output interface 1330. The memory 1310, the processor 1320, and the input/output interface 1330 are connected through an internal connection path. The memory 1310 is used for storing instructions, and the processor 1320 is used for executing the instructions stored in the memory 1310, so as to control the input/output interface 1330 to receive input data and information and to output data such as operation results.

In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in the processor 1320 or by instructions in the form of software. The method disclosed in the embodiments of the present application may be directly implemented by a hardware processor, or may be implemented by a combination of hardware and software modules in the processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or registers. The storage medium is located in the memory 1310, and the processor 1320 reads the information in the memory 1310 and performs the steps of the above method in combination with its hardware. To avoid repetition, details are not described here.

It should be understood that in the embodiments of the present application, the processor may be a Central Processing Unit (CPU), and the processor may also be other general-purpose processors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, and the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.

It should also be understood that, in embodiments of the present application, the memory may include both read-only memory and random access memory, and may provide instructions and data to the processor. A portion of the memory may also include non-volatile random access memory. For example, the memory may also store information about the device type.

Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.

It can be clearly understood by those skilled in the art that, for convenience and brevity of description, for the specific working processes of the system, the apparatus, and the unit described above, reference may be made to the corresponding processes in the foregoing method embodiments, and details are not described here again.

In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative. The division into units is merely a logical division, and other divisions are possible in an actual implementation; for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be in an electrical, mechanical, or other form.

Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.

The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.

The above description covers only specific embodiments of the present application, but the protection scope of the present application is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive of within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims

A management method of sub-health nodes is characterized by comprising the following steps:

the computing node sends a first input/output (IO) request to a first storage node;

the computing node determines that the first storage node is a sub-healthy node based on the first IO request;

and the computing node sends an identification message to a control node, wherein the identification message is used for indicating that the first storage node is a sub-health node.
The method of claim 1, wherein the method further comprises:

the computing node sends a second IO request to the first storage node;

and when a response message that the first storage node refuses to process the second IO request is received, the computing node marks the first storage node as the sub-health node.
The method of claim 1 or 2, wherein the method further comprises:

and the computing node writes the data of the first storage node into a substitute storage node of the first storage node.
The method of any one of claims 1-3, further comprising:

the computing node sends a third IO request to the second storage node;

the computing node receives a processing success response of the third IO request returned by the second storage node;

the computing node receives a processing failure response of the first IO request returned by the first storage node;

the computing node determining, based on the first IO request, that the first storage node is a sub-healthy node, including:

the computing node determines that the first storage node is the sub-health node according to the processing failure response of the first IO request and the processing success response of the third IO request,

data processed by the first IO request and the third IO request belong to the same stripe, or data processed by the first IO request and the third IO request belong to the same copy.
The method of claim 1, wherein the computing node determining that the first storage node is a sub-healthy node based on the first IO request comprises:

when the first storage node times out in processing the first IO request, the computing node determines that the first storage node is a sub-healthy node.
The method of claim 1, wherein the first IO request is a write request or a read request.
An apparatus for managing sub-health nodes, comprising:

the sending module is used for sending a first input/output (IO) request to the first storage node;

a processing module, configured to determine that the first storage node is a sub-health node based on the first IO request;

the sending module is further configured to send an identification message to a control node, where the identification message is used to indicate that the first storage node is a sub-health node.
The apparatus of claim 7,

the sending module is further configured to send a second IO request to the first storage node;

when receiving a response message that the first storage node refuses to process the second IO request, the processing module is further configured to mark the first storage node as the sub-health node.
The apparatus of claim 7 or 8, wherein the processing module is further configured to write data of the first storage node to a replacement storage node of the first storage node.
The apparatus according to any one of claims 7-9, wherein the apparatus comprises a receiving module,

the sending module is configured to send a third IO request to the second storage node;

the receiving module is further configured to receive a processing success response of the third IO request returned by the second storage node;

the receiving module is further configured to receive a processing failure response of the first IO request returned by the first storage node;

the processing module is further configured to determine that the first storage node is the sub-healthy node according to a processing failure response of the first IO request and a processing success response of the third IO request,

data processed by the first IO request and the third IO request belong to the same stripe, or data processed by the first IO request and the third IO request belong to the same copy.
The apparatus of claim 7,

when the first storage node times out in processing the first IO request, the processing module is configured to determine that the first storage node is a sub-healthy node.
The apparatus of claim 7, wherein the first IO request is a write request or a read request.
A computing node comprising a processor and a memory, the memory for storing a computer program, the processor for invoking and executing the computer program from the memory such that the computing node performs the method of any of claims 1-6.