CN113544636B

CN113544636B - Sub-health node management method and device

Info

Publication number: CN113544636B
Application number: CN201980093598.4A
Authority: CN
Inventors: 王道辉; 宋驰; 湛云; 王同雷
Original assignee: Huawei Cloud Computing Technologies Co Ltd
Current assignee: Huawei Cloud Computing Technologies Co Ltd
Priority date: 2019-03-12
Filing date: 2019-03-12
Publication date: 2024-05-03
Anticipated expiration: 2039-03-12
Also published as: WO2020181478A1; CN113544636A

Abstract

The application provides a method and a device for managing sub-health nodes. The method comprises the following steps: a computing node sends a first input/output (IO) request to a first storage node; the computing node determines, based on the first IO request, that the first storage node is a sub-healthy node; and the computing node sends an identification message to a control node, the identification message being used to indicate that the first storage node is a sub-healthy node. In the embodiment of the application, the computing node determines, based on the IO request, that the storage node is a sub-health node and indicates this to the control node, so the flow of identifying the sub-health node can be completed based on the IO request. In other words, sub-health nodes are identified at the granularity of IO requests, which helps reduce the time needed to identify sub-healthy nodes.

Description

Sub-health node management method and device

Technical Field

The present application relates to the field of information technology, and more particularly, to a method and apparatus for managing sub-health nodes.

Background

A storage system (e.g., a distributed system) includes computing nodes, control nodes, storage nodes, and the like. A computing node sends an input/output (IO) request to a storage node to perform an IO operation on the data in the storage node. The storage node is used for providing storage space for data, and the control node is used for monitoring the operation states of the computing node and the storage node. In a storage system, latency in processing IO requests is one of the key issues of concern in the industry.

Currently, a storage node in a sub-health state can be deleted from a storage system through a sub-health management function so as to control the latency with which the storage system processes IO requests. Specifically, a first computing node in the storage system sends IO requests to a plurality of storage nodes, and if a first storage node among the plurality of storage nodes fails to process an IO request, the first computing node reports to a control node that the first storage node may be a sub-healthy node. Correspondingly, the control node also receives reports of suspected sub-healthy storage nodes from other computing nodes in the storage system, tallies the reports, and applies a majority arbitration mechanism: when most computing nodes report the first storage node as suspected sub-healthy, the control node determines that the first storage node is sub-healthy and deletes it from the storage system, so that subsequent IO requests are no longer sent to the first storage node.
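The majority arbitration described above can be sketched roughly as follows. This is only an illustration of the prior-art tally, not code from the application: the class name `MajorityArbiter`, the method `report_suspect`, and the choice of a strict majority (more than half of the computing nodes) as the meaning of "most" are all assumptions.

```python
from collections import defaultdict


class MajorityArbiter:
    """Hypothetical control-node tally of suspected sub-health reports."""

    def __init__(self, compute_nodes):
        self.compute_nodes = set(compute_nodes)
        # storage node -> set of computing nodes that reported it
        self.reports = defaultdict(set)

    def report_suspect(self, compute_node, storage_node):
        """Record one report; return True once a strict majority has reported."""
        if compute_node not in self.compute_nodes:
            raise ValueError(f"unknown compute node: {compute_node}")
        self.reports[storage_node].add(compute_node)
        # Assumption: "most" means more than half of all computing nodes.
        return len(self.reports[storage_node]) * 2 > len(self.compute_nodes)


arbiter = MajorityArbiter(["c1", "c2", "c3"])
first = arbiter.report_suspect("c1", "s1")   # one of three: not yet a majority
second = arbiter.report_suspect("c2", "s1")  # two of three: majority reached
```

The sketch makes the latency problem visible: the storage node is acted on only after several computing nodes have each observed a failed IO request.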

As can be seen from the sub-health management function described above, storage nodes in a sub-health state (i.e., sub-health nodes) may be deleted from the storage system to control the latency with which the storage system processes IO requests. However, the control node determines sub-health nodes through a majority arbitration mechanism; that is, a plurality of IO requests are required before a suspected storage node is determined to be a sub-health node and finally deleted from the storage system, so the latency of the storage system in processing IO requests remains large.

Disclosure of Invention

The application provides a sub-health node management method and device, which are used for simplifying the flow of marking sub-health nodes and being beneficial to reducing the delay of processing IO requests in a storage system.

In a first aspect, a method for managing sub-health nodes is provided, including: the computing node sends a first input/output IO request to a first storage node; the computing node determines the first storage node to be a sub-healthy node based on the first IO request; the computing node sends an identification message to a control node, the identification message being used to indicate that the first storage node is a sub-healthy node.

Determining that the storage node is a sub-healthy node based on the IO request may include determining that the storage node is a sub-healthy node based on a processing result of the IO request. The processing result includes a failure to process the IO request or a timeout in processing the IO request.
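As a minimal sketch of this rule, the processing result of a single IO request can be mapped directly to a sub-health decision. The enum `IOResult` and the function `indicates_sub_health` are hypothetical names introduced only for illustration.

```python
from enum import Enum


class IOResult(Enum):
    """Hypothetical processing results of one IO request."""
    SUCCESS = "success"
    FAILURE = "failure"
    TIMEOUT = "timeout"


def indicates_sub_health(result: IOResult) -> bool:
    """True when the processing result is a failure or a timeout."""
    return result in (IOResult.FAILURE, IOResult.TIMEOUT)
```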

The determining that the storage node is a sub-health node based on the IO request may include determining that the storage node is a sub-health node based on the IO request only, or may include determining that the storage node is a sub-health node based on a plurality of IO requests, where data processed by the plurality of IO requests belong to a same stripe, or data processed by the plurality of IO requests belong to a same copy.

In the embodiment of the application, the computing node determines, based on the IO request, that the storage node is a sub-health node and indicates this to the control node, so the flow of identifying the sub-health node can be completed based on the IO request. In other words, sub-health nodes are identified at the granularity of IO requests, which helps reduce the time needed to identify sub-healthy nodes.

Further, after the method of the embodiment of the application is applied to the write IO request, the computing node does not send the write request to the sub-health node any more, so that the delay of write request processing caused by the existence of the sub-health node is reduced, and the data writing efficiency of the computing node is improved.

In one possible implementation, the computing node sends a second IO request to the first storage node; and when receiving a response message that the first storage node refuses to process the second IO request, marking the first storage node as the sub-health node by the computing node.

In the embodiment of the application, the first storage node refuses to process the second IO request so as to indirectly indicate to the computing node that the first storage node is a sub-health node. This avoids the prior-art approach in which the control node indicates sub-health nodes by pushing views to the computing nodes, so that the load of the control node is reduced.
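The rejection-based signalling can be sketched as below, under the assumption that a refusal is visible to the computing node as a distinct reply value. `StorageNodeStub`, `ComputeNode`, and the reply strings `"ok"`, `"rejected"`, and `"skipped"` are hypothetical and do not come from the application.

```python
class StorageNodeStub:
    """Hypothetical storage node that can be told to refuse IO requests."""

    def __init__(self, name):
        self.name = name
        self.accepting_io = True

    def handle_io(self, request):
        return "ok" if self.accepting_io else "rejected"


class ComputeNode:
    """Hypothetical computing node that learns sub-health from rejections."""

    def __init__(self):
        self.sub_healthy = set()

    def send_io(self, node, request):
        if node.name in self.sub_healthy:
            return "skipped"  # no further IO is sent to a marked node
        reply = node.handle_io(request)
        if reply == "rejected":
            # The refusal itself tells the computing node to mark this node.
            self.sub_healthy.add(node.name)
        return reply
```

In this sketch no view is pushed: each computing node discovers the state of the storage node the next time it sends an IO request to it.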

In one possible implementation, the method further includes: the computing node writes data of the first storage node to a replacement storage node of the first storage node.

In the embodiment of the application, when the first storage node is a sub-health node, the computing node writes the data of the first storage node to the replacement storage node of the first storage node, which is beneficial to ensuring the reliability of data storage in the storage system. This avoids the prior-art problem in which data to be written to a sub-healthy storage node fails to be stored, resulting in degraded storage of the data.
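One way to picture the redirect is the sketch below. It assumes the computing node already knows which node replaces which (the application does not say here how that mapping is obtained); the function name and the dictionaries `replacement_of` and `stores` are illustrative only.

```python
def write_with_replacement(block, target, replacement_of, sub_healthy, stores):
    """Write block to target, or to its replacement when target is sub-healthy.

    Assumption: replacement_of maps a sub-healthy node to its replacement.
    A sub-healthy target with no replacement entry raises KeyError here.
    """
    node = replacement_of[target] if target in sub_healthy else target
    stores.setdefault(node, []).append(block)
    return node
```

For example, with `replacement_of = {"s1": "s4"}` and `sub_healthy = {"s1"}`, a block addressed to `"s1"` lands on `"s4"`, so the stripe or copy set stays complete instead of being stored in degraded form.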

In one possible implementation, the computing node sends a third IO request to the second storage node; the computing node receives a processing success response of the third IO request returned by the second storage node; the computing node receives a processing failure response of the first IO request returned by the first storage node; the computing node determining, based on the first IO request, that the first storage node is a sub-healthy node, comprising: and the computing node determines the first storage node as the sub-health node according to the processing failure response of the first IO request and the processing success response of the third IO request, wherein the data processed by the first IO request and the data processed by the third IO request belong to the same stripe, or the data processed by the first IO request and the data processed by the third IO request belong to the same copy.

In the embodiment of the application, if the second storage node returns a processing success response for the third IO request to the computing node, the probability that the computing node itself has failed can be judged to be small. If, at the same time, the first storage node returns a processing failure response for the first IO request to the computing node, the probability that the first storage node is a sub-health node can be judged to be large. On this basis, the computing node can instruct the control node to mark the first storage node as a sub-health node, which improves the accuracy of marking sub-health nodes.
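The cross-check can be sketched as follows: a failed IO request counts against a storage node only if another IO request on the same stripe (or copy set) succeeded, which suggests the computing node itself is working. The `IOResponse` record and `suspect_nodes` function are hypothetical names, and `stripe_id` stands in for either a stripe or a copy set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IOResponse:
    storage_node: str
    stripe_id: int  # assumption: also usable as a copy-set identifier
    ok: bool


def suspect_nodes(responses):
    """Return nodes whose IO failed while a peer IO on the same stripe succeeded."""
    by_stripe = {}
    for r in responses:
        by_stripe.setdefault(r.stripe_id, []).append(r)
    suspects = set()
    for group in by_stripe.values():
        # A success in the group makes a compute-node fault unlikely.
        if any(r.ok for r in group):
            suspects.update(r.storage_node for r in group if not r.ok)
    return suspects
```

In this sketch, a stripe on which every IO request failed yields no suspects, because the failure may lie with the computing node rather than with any one storage node.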

In one possible implementation, the computing node determining, based on the first IO request, that the first storage node is a sub-healthy node includes: when the first storage node times out in processing the first IO request, the computing node determines that the first storage node is a sub-health node.

In the embodiment of the application, a timeout in IO processing is used as the condition for marking the sub-health node, which is beneficial to controlling the delay of processing the IO request by the computing node.

In a second aspect, a method for managing sub-health nodes is provided, including: the control node determines that the storage node is a sub-health node; the control node sends indication information to the storage node, wherein the indication information is used for indicating the storage node to stop processing the input/output IO request.

In the embodiment of the application, the control node sends the indication information to the storage node to inform the storage node to stop processing the IO requests sent by the computing node. The storage node therefore refuses to process IO requests, which indirectly informs the computing node that the storage node is a sub-health node, so that the storage node is deleted from the storage system. This avoids the prior-art approach in which storage nodes are deleted from the storage system by the control node pushing updated views to each computing node, so that the load of the control node is reduced.

In one possible implementation, the control node sending indication information to a storage node includes: the control node sends a heartbeat response to the storage node, wherein the heartbeat response carries the indication information. The method further comprises: if the control node fails to send the heartbeat response to the storage node, the control node determines that communication with the storage node is interrupted.

In the embodiment of the application, the heartbeat response carries the indication information, so that the storage node is deleted from the storage system and the delay of the storage system in processing IO requests is reduced. Further, the latency of the marking process of the sub-health node is of the same order of magnitude as the heartbeat timeout.
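On the control-node side, piggybacking the indication on the heartbeat response might look like the sketch below. The message layout `HeartbeatResponse`, the `send` transport callable, and the use of `ConnectionError` to model a failed send are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatResponse:
    node: str
    sub_healthy: bool  # the indication information carried by the response


class ControlNode:
    """Hypothetical control node answering heartbeats from storage nodes."""

    def __init__(self):
        self.sub_healthy = set()
        self.unreachable = set()

    def answer_heartbeat(self, node, send):
        """Reply to a heartbeat; 'send' is a hypothetical transport callable."""
        response = HeartbeatResponse(node, node in self.sub_healthy)
        try:
            send(response)
        except ConnectionError:
            # Failed send: treat communication with the node as interrupted.
            self.unreachable.add(node)
            return False
        return True
```

Either outcome removes the storage node from service in this sketch: a delivered response tells it to stop processing IO requests, and a failed send is handled like an ordinary heartbeat failure.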

In one possible implementation, the indication information indicates that the storage node stops processing the IO request by indicating that the storage node is marked as the sub-healthy node.

In one possible implementation, the control node determines the storage node to be a sub-healthy node, including: the control node receives an identification message sent by a computing node, wherein the identification message requests to mark the storage node as a sub-health node; and the control node marks the storage node as the sub-health node according to the identification message.

In the embodiment of the application, the computing node determines, based on the IO request, that the storage node is a sub-health node and indicates this to the control node, so the flow of identifying the sub-health node can be completed based on the IO request. In other words, sub-health nodes are identified at the granularity of IO requests, which helps reduce the time needed to identify sub-healthy nodes. Further, after the method of the embodiment of the application is applied to the write IO request, the computing node does not send the write request to the sub-health node any more, so that the delay of write request processing caused by the existence of the sub-health node is reduced, and the data writing efficiency of the computing node is improved.

In a third aspect, a method for managing sub-health nodes is provided, including: in the case that the storage node is a sub-health node, the storage node receives indication information sent by the control node, wherein the indication information is used for indicating the storage node to stop processing input/output IO requests; the storage node stops processing IO requests.

In one possible implementation, the storage node receiving indication information sent by a control node includes: the storage node receives a heartbeat response sent by the control node, wherein the heartbeat response carries the indication information. The method further comprises: if the storage node fails to receive the heartbeat response, the storage node determines that communication with the control node is interrupted.

In a fourth aspect, a method for managing sub-health nodes is provided, including: if the storage node receives a heartbeat response sent by the control node, where the heartbeat response is used for indicating that the storage node is a sub-health node, the storage node determines that it is a sub-health node; and if the heartbeat response fails to be delivered, the storage node determines that it has been deleted from the storage system.

In the embodiment of the application, if the storage node receives the heartbeat response, the heartbeat response indicates that the storage node is a sub-health node. That is, the flow of marking the sub-health node is combined with the heartbeat mechanism, so that the storage node is deleted from the storage system and the delay of the storage system in processing IO requests is reduced. Further, the latency of the marking process of the sub-health node is of the same order of magnitude as the heartbeat timeout.
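Seen from the storage node, the two cases of the fourth aspect can be sketched as a small state decision. The state names and the functions `next_state` and `accepts_io` are hypothetical; the sketch also assumes that a response without the sub-health indication leaves the node healthy.

```python
from enum import Enum


class NodeState(Enum):
    HEALTHY = "healthy"
    SUB_HEALTHY = "sub_healthy"
    REMOVED = "removed"  # deleted from the storage system


def next_state(response_received: bool, flagged_sub_healthy: bool) -> NodeState:
    """Decide the storage node's own state after one heartbeat round."""
    if not response_received:
        # No heartbeat response arrived: the node concludes it was deleted.
        return NodeState.REMOVED
    return NodeState.SUB_HEALTHY if flagged_sub_healthy else NodeState.HEALTHY


def accepts_io(state: NodeState) -> bool:
    """Only a healthy node keeps processing IO requests in this sketch."""
    return state is NodeState.HEALTHY
```

Both non-healthy outcomes lead the node to stop serving IO requests, which is why the marking latency is bounded by roughly one heartbeat timeout.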

In one possible implementation, the heartbeat response is further used to instruct the storage node to stop processing input/output IO requests, and the method further includes: and the storage node stops processing IO requests according to the heartbeat response.

In the embodiment of the application, the control node instructs the storage node, through the heartbeat response, to stop processing IO requests. The storage node therefore refuses to process IO requests, which indirectly informs the computing node that the storage node is a sub-health node, so that the storage node is deleted from the storage system. This avoids the prior-art approach in which storage nodes are deleted from the storage system by the control node pushing updated views to each computing node, so that the load of the control node is reduced and the bandwidth occupied by pushing views is reduced.

In a fifth aspect, a method for managing sub-health nodes is provided, including: the controller sends a heartbeat response to the storage node, wherein the heartbeat response is used for indicating that the storage node is a sub-health node; and if sending the heartbeat response fails, the controller determines that communication with the storage node is interrupted.

In one possible implementation, the heartbeat response is further used to instruct the storage node to stop processing input/output IO requests.

In a sixth aspect, a computing node is provided, the computing node having functionality to implement the computing node in the method design described above. These functions may be implemented by hardware, or may be implemented by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the functions described above.

In a seventh aspect, a control node is provided, which has the functionality of implementing the control node in the method design described above. These functions may be implemented by hardware, or may be implemented by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the functions described above.

In an eighth aspect, a storage node is provided, which has the function of implementing the first storage node in the above method design or the above storage node. These functions may be implemented by hardware, or may be implemented by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the functions described above.

In a ninth aspect, there is provided a control node comprising a processor and a memory, the memory for storing a computer program, the processor being adapted to invoke and run the computer program from the memory, such that the control node performs the method of the above aspects.

In a tenth aspect, there is provided a computing node comprising a processor and a memory, the memory for storing a computer program, the processor being for invoking and running the computer program from the memory, causing the computing node to perform the method of the above aspects.

In an eleventh aspect, there is provided a storage node comprising a processor and a memory, the memory being for storing a computer program, the processor being for invoking and running the computer program from the memory, causing the storage node to perform the method of the above aspects.

In a twelfth aspect, there is provided a computer program product comprising: computer program code which, when run on a computer, causes the computer to perform the method of the above aspects.

It should be noted that, the above computer program code may be stored in whole or in part on a first storage medium, where the first storage medium may be packaged together with the processor or may be packaged separately from the processor, and embodiments of the present application are not limited in this regard.

In a thirteenth aspect, there is provided a computer readable medium storing program code which, when run on a computer, causes the computer to perform the method of the above aspects.

In a fourteenth aspect, a chip system is provided, which comprises a processor for the device in which the chip system is located to implement the functions involved in the above aspects, e.g. to generate, receive, transmit, or process data and/or information involved in the above methods. In one possible design, the chip system further includes a memory for storing program instructions and data necessary for the terminal device. The chip system can be composed of chips, and can also comprise chips and other discrete devices.

The device where the chip system is located may be any one node of a computing node, a control node, a storage node, and a first storage node.

Drawings

FIG. 1 is a schematic diagram of a storage system to which embodiments of the present application are applied.

Fig. 2 is a schematic flow chart of a method of managing sub-health nodes according to an embodiment of the application.

Fig. 3 is a schematic flow chart of a method of managing sub-health nodes according to an embodiment of the application.

Fig. 4 is a schematic flow chart of a method of managing sub-health nodes of an embodiment of the application.

Fig. 5 is a schematic diagram of a management apparatus for sub-health nodes according to an embodiment of the present application.

FIG. 6 is a schematic diagram of a computing node according to another embodiment of the application.

Fig. 7 is a schematic diagram of a management apparatus for sub-health nodes according to an embodiment of the present application.

Fig. 8 is a schematic diagram of a management apparatus for sub-health nodes according to an embodiment of the present application.

Fig. 9 is a schematic diagram of a management apparatus for sub-health nodes according to an embodiment of the present application.

Fig. 10 is a schematic diagram of a control node according to another embodiment of the present application.

Fig. 11 is a schematic diagram of a management apparatus for sub-health nodes according to an embodiment of the present application.

Fig. 12 is a schematic diagram of a management apparatus for sub-health nodes according to an embodiment of the present application.

Fig. 13 is a schematic diagram of a storage node according to another embodiment of the present application.

Detailed Description

The technical scheme of the application will be described below with reference to the accompanying drawings.

FIG. 1 is a schematic diagram of a storage system to which embodiments of the present application are applied. The storage system includes a control node 110, a compute node 120, and a storage node 130. It should be noted that the number of the computing nodes 120 and the storage nodes 130 may be one or more. For example, as shown in the drawings

A control node 110 for monitoring the operational status of the compute nodes 120 and the storage nodes 130 in the storage system, for example, monitoring whether the storage nodes 130 are in a sub-health status.

It should be noted that the control node may be implemented by one or more physical devices, and the physical devices may be a computer or a server.

The computing node 120 is configured to determine a storage node corresponding to the IO request, and send the IO request to the corresponding storage node.

For example, in an Erasure Coding (EC) scenario, storage space 1, storage space 2, and storage space 3 form a stripe, where storage space 1 is located on storage node 1, storage space 2 is located on storage node 2, and storage space 3 is located on storage node 3. The computing node divides the data into data block 1 and data block 2, obtains a coding block from data block 1 and data block 2 according to EC coding, and forms a stripe from data block 1, data block 2, and the coding block. The computing node then sends IO requests to the storage nodes: specifically, the computing node 120 sends an IO request carrying data block 1 to storage node 1, an IO request carrying data block 2 to storage node 2, and an IO request carrying the coding block to storage node 3. These IO requests are write IO requests. Correspondingly, the computing node can also perform read operations on storage node 1, storage node 2, and storage node 3 by sending read IO requests for reading the stored data block 1, data block 2, and coding block.
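The stripe layout in this example can be sketched as below. The application does not specify the EC code, so a single XOR parity block is used here purely as a stand-in for "EC coding"; the function names and the zero-padding of the second block are likewise assumptions.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode_stripe(data: bytes):
    """Split data into two data blocks and derive one coding block.

    Assumption: XOR parity stands in for the unspecified EC code, and the
    second block is zero-padded to the length of the first.
    """
    half = (len(data) + 1) // 2
    block1 = data[:half]
    block2 = data[half:].ljust(half, b"\0")
    return block1, block2, xor_blocks(block1, block2)


def place_stripe(stripe):
    """Map the three blocks of a stripe to storage nodes 1, 2, and 3."""
    return {f"storage node {i}": blk for i, blk in enumerate(stripe, start=1)}
```

With XOR parity, any one block of the stripe can be rebuilt from the other two, for example `xor_blocks(block2, coding)` gives back `block1`. This is what allows a write to proceed even when one of the three storage nodes is sub-healthy.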

For another example, in a multi-copy scenario, such as a 3-copy scenario, storage space 1, storage space 2, and storage space 3 form a stripe, where storage space 1 is located on storage node 1, storage space 2 is located on storage node 2, and storage space 3 is located on storage node 3. After receiving a write IO request, the computing node sends the data carried in the write IO request to storage node 1, storage node 2, and storage node 3 respectively. Correspondingly, the computing node can also perform read IO operations on storage node 1, storage node 2, and storage node 3.

It should be noted that, the computing nodes in the above storage system may be implemented by one or more physical devices, and the physical devices may be a computer or a server.

A storage node 130 for providing a storage space, wherein the data stored in the storage node 130 may be metadata of the data, or metadata of the data and the data itself.

It should be noted that, the storage nodes in the storage system may be implemented by one or more physical devices, and the physical devices may be a memory or a storage server.

The embodiment of the application provides a sub-health node management method, which is used for reducing the time for determining sub-health nodes and further reducing the time delay for processing IO requests. The method is described in detail below with reference to fig. 2.

Fig. 2 is a schematic flow chart of a method of managing sub-health nodes according to an embodiment of the present application, which may include steps 210 to 220.

210, The computing node sends an IO request to the storage node and determines, based on the IO request, that the storage node is a sub-healthy node.

The above procedure of sending an IO request to a storage node may refer to the foregoing description, and may be a read IO request or a write IO request.

Optionally, determining that the storage node is a sub-healthy node based on the IO request includes: when the storage node times out in processing the IO request, the computing node determines that the storage node is a sub-healthy node. For example, if the storage node does not return feedback information of successful operation to the computing node within a preset time threshold, the computing node may determine that the storage node is a sub-healthy node. For another example, if the computing node determines that data packet loss occurs when it sends an IO request to the storage node, the computing node may determine that the storage node is a sub-healthy node. In the embodiment of the invention, a storage node in the sub-health state, namely a sub-health node, refers to a node working in an abnormal state, for example, one whose response time is too long, whose IO request processing times out, or whose resource occupation is too high.
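The preset time threshold can be sketched with a deadline on a single IO call. This is only an illustration: `send_io_with_deadline`, the `storage_call` callable standing in for the IO request, and the return strings are hypothetical, and an exception raised by `storage_call` itself is not handled here.

```python
import concurrent.futures
import time


def send_io_with_deadline(storage_call, timeout_s):
    """Return 'ok' if storage_call finishes within timeout_s, else 'sub_healthy'.

    storage_call is a hypothetical zero-argument callable standing in for
    sending one IO request and waiting for the success feedback.
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(storage_call)
    try:
        future.result(timeout=timeout_s)
        return "ok"
    except concurrent.futures.TimeoutError:
        # No success feedback within the preset threshold.
        return "sub_healthy"
    finally:
        # Do not block on a slow call; its worker thread finishes on its own.
        pool.shutdown(wait=False)


fast_result = send_io_with_deadline(lambda: None, timeout_s=1.0)
slow_result = send_io_with_deadline(lambda: time.sleep(0.3), timeout_s=0.05)
```

A single slow request is enough to reach the decision, in contrast to waiting for reports from a majority of computing nodes.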

It should be noted that the computing node may also determine that the storage node is a sub-healthy node through other conditions. The embodiment of the present application is not limited thereto. For example, the computing node determines that communication with the storage node is interrupted, and the computing node determines that the storage node is a sub-healthy node.

It should be further noted that there are various reasons for the timeout of the IO request processing, and the embodiment of the present application is not limited thereto. For example, the timeout may be due to the first storage node being overloaded, or due to a software failure of the first storage node.

220, The computing node sends an identification message to the control node, the identification message being used to indicate that the storage node is a sub-healthy node.

Accordingly, the control node receives the identification message sent by the computing node and marks the storage node as a sub-health node. In particular, the control node marks the storage node as a sub-healthy node in the view. The control node pushes the updated view to all computing nodes in the storage system and the storage node, and the computing node of the storage system can acquire that the storage node is a sub-health node through the updated view, and the computing node does not send an IO request to the storage node any more. Wherein the view is used to indicate a relationship of the storage space to the storage node, e.g., a correspondence of the stripe to the storage node. The correspondence of the stripe to the storage node may be a direct correspondence of the stripe to the storage node. The correspondence of the stripe to the storage node may also be an indirect relationship between the stripe and the storage node, for example, a relationship between the stripe and the partition and a relationship between the partition and the storage node. In another implementation, storage node information that replaces the sub-healthy node may also be included in the updated view, thereby causing the computing node to send IO requests to the storage node that replaces the sub-healthy node.
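A view with the indirect stripe-to-partition-to-node mapping can be sketched as follows. The class `View`, its fields, and the `version` counter (used here only to signal that an updated view would need to be pushed) are hypothetical names, not structures defined by the application.

```python
class View:
    """Hypothetical view: stripe -> partition -> storage node."""

    def __init__(self, stripe_to_partition, partition_to_node):
        self.stripe_to_partition = dict(stripe_to_partition)
        self.partition_to_node = dict(partition_to_node)
        self.sub_healthy = set()
        self.replacement = {}  # sub-healthy node -> replacement node
        self.version = 0       # bumped on every update that must be pushed

    def mark_sub_healthy(self, node, replacement=None):
        """Mark a node in the view, optionally recording its replacement."""
        self.sub_healthy.add(node)
        if replacement is not None:
            self.replacement[node] = replacement
        self.version += 1

    def target_for(self, stripe):
        """Resolve where IO for a stripe should go under the current view."""
        node = self.partition_to_node[self.stripe_to_partition[stripe]]
        if node in self.sub_healthy:
            # None means the view offers no usable target for this stripe.
            return self.replacement.get(node)
        return node
```

Every call to `mark_sub_healthy` changes `version`, which in a real system would trigger a push of the updated view to all computing nodes.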

In another implementation manner, when the control node receives the identification message sent by the computing node, the control node may also delete the storage node directly in the view, and push the updated view to all the computing nodes in the storage system, so that the computing node in the storage system cannot find the storage node through the view any more, i.e. the storage node is deleted from the storage system.

Both the above method of marking sub-health nodes in the view and the above method of directly deleting sub-health nodes from the view require updating the view, after which the control node needs to push the updated view to all computing nodes. However, pushing the view involves a large number of nodes, so the control node takes a long time to push the view, and when the view is updated frequently, the load of the control node is greatly increased and a large amount of bandwidth of the storage system is occupied. In order to reduce the load generated by the control node pushing views and to save the bandwidth of the storage system, the application also provides a sub-health node management method in which the sub-health node in the storage system stops processing IO requests, achieving the same purpose as pushing the view.

A schematic flow chart of a method of managing sub-health nodes according to an embodiment of the application is described below in connection with fig. 3. It should be noted that, the method shown in fig. 3 may also be applied to the storage system shown in fig. 1, and the terms in the method shown in fig. 3 that are the same as those in the method shown in fig. 2 may refer to the same meanings, which are not repeated herein for brevity. The method shown in fig. 3 includes steps 310 and 320.

310, The control node determines that the storage node is a sub-healthy node.

It should be noted that the control node may determine that the storage node is a sub-health node by using the method for identifying a sub-health node provided by the embodiment of the present application, that is, the method shown in fig. 2. In other words, the method shown in fig. 2 may be used in combination with the method shown in fig. 3, and a specific method flow may be seen in fig. 4. The control node may also determine that the storage node is a sub-health node by using a conventional method for determining a sub-health node, which is not limited in the embodiment of the present application.

320, The control node sends indication information to the storage node, where the indication information is used to instruct the storage node to stop processing the IO request. Accordingly, the storage node stops processing the IO request after receiving the indication information, and then deletes itself from the storage system.

In the embodiment of the application, the control node sends the indication information to the storage node, the storage node is instructed to stop processing the IO request sent by the computing node, the computing node receives the response message of the storage node for stopping processing the IO request, and the storage node is determined to be a sub-health node, so that the sending of views is reduced, the load of the control node is reduced, and the bandwidth of a storage system is saved.

Accordingly, the storage node may determine, according to the above indication information, that it has been deleted from the storage system, or may determine, according to the alternative implementation described below, that it is a sub-health node, so as to delete itself from the storage system.

As an alternative implementation of step 320, sub-health status management of the storage node may be implemented based on the heartbeat mechanism between the control node and the storage node. In other words, the storage node is informed through the heartbeat mechanism that it is a sub-healthy node.

The above heartbeat mechanism can be understood simply as follows: a node that does not send a heartbeat within a preset period of time, or a node that does not receive a heartbeat response within that period, is regarded as a faulty node and deleted from the storage system. Therefore, if the storage node does not receive the heartbeat response within the preset time, the storage node cannot learn that it is a sub-health node, but it can still learn that it has been deleted from the storage system because it has not received a heartbeat response for a long time. The storage node can then perform operations such as a self-check flow or a recovery flow according to its own state, for example, starting a recovery flow to resume normal work. Of course, if the storage node receives the heartbeat response, the storage node may determine that it has been identified as a sub-health node according to the sub-health node indication information carried in the heartbeat response. Accordingly, the storage node can perform the corresponding self-check flow, recovery flow, or other operations according to its own state.

In the embodiment of the application, the sub-health node indication information is carried in the heartbeat response, that is, sub-health state management of the storage node is implemented based on the heartbeat mechanism. As a result, the sub-health node is deleted from the storage system regardless of whether it receives the sub-health node indication information, which reduces the delay of the storage system in processing IO requests. Further, the delay of the processing flow for the sub-health node is on the same order of magnitude as the heartbeat timeout.
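The two outcomes of the heartbeat mechanism, as seen by the storage node, can be sketched as follows. The `NodeState` enum, the function name, and the `sub_health` field of the response are assumptions made for illustration; the application does not define a message format.

```python
from enum import Enum


class NodeState(Enum):
    HEALTHY = "healthy"
    SUB_HEALTHY = "sub_healthy"
    REMOVED = "removed"


def state_after_heartbeat(current, response, waited_s, timeout_s):
    """Return the state a storage node infers after sending one heartbeat.

    current   -- state the node believed it was in before this heartbeat
    response  -- dict received from the control node, or None if nothing arrived
    waited_s  -- seconds the node has waited for the response
    timeout_s -- preset heartbeat timeout (hypothetical, chosen by the deployer)
    """
    if response is None:
        # No heartbeat response within the preset period: the node concludes
        # that it has been deleted from the storage system.
        return NodeState.REMOVED if waited_s >= timeout_s else current
    if response.get("sub_health", False):
        # Sub-health node indication information carried in the response.
        return NodeState.SUB_HEALTHY
    return current
```

Either non-healthy result can then trigger the self-check or recovery flow, so the node reacts within roughly one heartbeat timeout in both cases.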

It should be noted that, the indication information of the sub-health node may be transmitted in a heartbeat response, or may be transmitted through separate information, which is not limited in the embodiment of the present application.

Optionally, the indication information and the sub-health node indication information may be the same information or two pieces of information that are sent independently. If the indication information and the sub-health node indication information are the same information and are carried in the heartbeat response, the storage node can, after receiving the heartbeat response, determine that it is a sub-health node and stop processing IO requests.

Correspondingly, the method shown in the embodiment of the application further comprises the following steps: the computing node sends a first IO request to the storage node; when the storage node refuses to process the first IO request, the computing node marks the storage node as a sub-health node.

The first IO request may be a read request or a write request.

The storage node may refuse to process the first IO request by returning to the computing node a response message indicating that the storage node has stopped processing IO requests, thereby notifying the computing node that the storage node is a sub-healthy node. For example, the storage node returns a request error to the computing node to inform the computing node that the storage node refuses to process the first IO request. For example, when the first IO request is a write request, the storage node may return a write error to the computing node. For another example, when the first IO request is a read request, the storage node may return a read error to the computing node.

It should be noted that the computing node sending the read request may be the computing node that identified the storage node as a sub-healthy node. The computing node that sends the read request may also be another computing node in the storage system.

After the storage node is deleted from the storage system, data may be stored in a degraded manner, which reduces the reliability of data storage. For example, in the EC scenario described above, storage node 1, storage node 2, and storage node 3 provide storage space for a stripe. If storage node 1 is a sub-healthy node and data block 1 of the stripe is stored on storage node 1, only data block 2 and the encoded block remain available, resulting in degraded storage.
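The degraded state in this example can be illustrated with a small worked sketch. It assumes, purely for illustration, that the encoded block is a byte-wise XOR parity of the two data blocks; the application does not specify the erasure code, and the node identifiers `sn1` to `sn3` are hypothetical.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


# Stripe layout: data block 1 on storage node 1, data block 2 on storage
# node 2, encoded (parity) block on storage node 3.
block1 = b"DATA-ONE"
block2 = b"DATA-TWO"
parity = xor_blocks(block1, block2)
stripe = {"sn1": block1, "sn2": block2, "sn3": parity}

# Storage node 1 is sub-healthy and removed: only two blocks remain.
available = {node: blk for node, blk in stripe.items() if node != "sn1"}

# Degraded read: data block 1 is rebuilt from the surviving blocks.
rebuilt = xor_blocks(available["sn2"], available["sn3"])
```

The data is still readable, but with one block already missing this stripe can no longer tolerate the loss of another node, which is the reliability reduction referred to above.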

In order to avoid degraded storage of data due to the existence of sub-healthy nodes, a replacement storage node may be configured for the storage node, where the replacement storage node is configured to take over from the storage node and provide storage space for data after the storage node becomes a sub-healthy node.

That is, the method of the embodiment of the application further comprises the following steps: the computing node determines a replacement storage node of the storage node; the computing node writes the data to be written to the storage node to the replacement storage node instead.

The replacement storage node of the sub-health node may be one of a plurality of replacement storage nodes in the storage system, and the plurality of replacement storage nodes may correspond to the plurality of storage nodes. The plurality of replacement storage nodes and the plurality of storage nodes may be in a one-to-one correspondence or a one-to-many correspondence, which is not limited in the embodiment of the present application.
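The correspondence and the redirected write can be sketched as a simple lookup. The mapping tables, node identifiers, and the `write_target` function are hypothetical; the one-to-many case is read here as one replacement storage node serving several storage nodes, which is an assumption.

```python
# Hypothetical mapping: storage node id -> replacement storage node id.
# One-to-one: each storage node has its own replacement storage node.
one_to_one = {"sn1": "rep1", "sn2": "rep2", "sn3": "rep3"}
# One-to-many (assumed reading): one replacement node serves several nodes.
one_to_many = {"sn1": "rep1", "sn2": "rep1", "sn3": "rep1"}


def write_target(node_id, sub_healthy, replacements):
    """Return the node that should receive data destined for node_id."""
    if node_id in sub_healthy:
        # A KeyError here means no replacement node was configured.
        return replacements[node_id]
    return node_id


target_a = write_target("sn1", {"sn1"}, one_to_one)    # redirected
target_b = write_target("sn2", {"sn1"}, one_to_one)    # unchanged
target_c = write_target("sn2", {"sn2"}, one_to_many)   # shared replacement
```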

It should be noted that, after the storage node is identified as a sub-healthy node, the storage node migrates all the stored data to the above-mentioned replacement storage node, and during the period that the storage node is a sub-healthy node, the replacement storage node may be regarded as the storage node, so as to implement the function of the storage node. Accordingly, after the storage node resumes normal operation, incremental data in the replacement storage node may be migrated to the storage node.
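A minimal sketch of the incremental migration back to the recovered storage node follows. The `ReplacementNode` class and its dictionary-based store are assumptions for illustration; the application does not describe how incremental data is tracked.

```python
class ReplacementNode:
    """Hypothetical replacement node that records the writes it takes over."""

    def __init__(self, migrated):
        self.data = dict(migrated)   # data migrated from the sub-healthy node
        self.incremental = {}        # writes received while standing in

    def write(self, key, value):
        self.data[key] = value
        self.incremental[key] = value

    def migrate_back(self, recovered_store):
        """Copy only the incremental data to the recovered storage node."""
        recovered_store.update(self.incremental)
        moved = len(self.incremental)
        self.incremental.clear()
        return moved


original = {"a": 1, "b": 2}            # contents of the storage node
replacement = ReplacementNode(original)
replacement.write("b", 20)             # overwrite while the node is sub-healthy
replacement.write("c", 3)              # new key while the node is sub-healthy
moved = replacement.migrate_back(original)
```

Only the two entries written during the sub-healthy period are copied back; the unchanged entry `a` is not transferred again.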

It should be noted that the above process of migrating data from the storage node to the replacement storage node may be performed a period of time after the storage node is identified as a sub-healthy node. If the storage node is still a sub-healthy node and has not recovered after that period, the storage node is confirmed to be a sub-healthy node, and the data migration is performed at that time. This avoids the situation in which the storage node resumes normal operation just after the data has been migrated or while the data is being migrated. Of course, the data migration may also be performed directly after the storage node is identified as a sub-healthy node, which is not limited by the embodiment of the present application.
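The choice between deferred and immediate migration can be expressed as a small decision function. The function name and the `grace_s` waiting period are hypothetical; the application does not give a value for the period.

```python
def should_migrate(identified_at_s, now_s, still_sub_healthy, grace_s):
    """Decide whether to start migrating data to the replacement storage node.

    grace_s is a hypothetical waiting period in seconds; grace_s = 0
    corresponds to migrating directly after identification.
    """
    if not still_sub_healthy:
        return False  # the node recovered during the wait: nothing to migrate
    return now_s - identified_at_s >= grace_s
```

For example, with `grace_s = 30`, a node identified at time 100 is not migrated at time 110, and is migrated at time 130 only if it is still sub-healthy.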

It should be understood that the above-mentioned correspondence between the storage node and the alternative storage node may be represented by the above-mentioned view or maintained by the control node alone, which is not limited in the embodiment of the present application.

To ensure that the computing node accurately identifies the storage node as a sub-healthy node, the computing node needs to send IO requests to a plurality of storage nodes including the storage node, and at least one of the other storage nodes needs to process its IO request successfully, so that a fault on the computing node side does not cause the storage node to be misidentified as a sub-healthy node. The data processed by the IO requests sent to the plurality of storage nodes belong to the same stripe or to the same copy. That is, the storage node is determined to be a sub-healthy node when the following condition is satisfied: an IO request processing failure response is received from the storage node, and at least one other storage node returns an IO request processing success response.
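The condition above can be written as a short predicate. The function name and the shape of the `responses` argument are assumptions made for illustration.

```python
def is_sub_healthy(target, responses):
    """Apply the identification condition to one group of IO responses.

    responses maps storage node id -> True (processing success) or False
    (processing failure) for IO requests whose data belong to the same
    stripe or the same copy.
    """
    if responses.get(target) is not False:
        return False  # the target did not fail, or was not contacted
    # At least one other node in the same stripe or copy must have succeeded;
    # otherwise the fault may lie with the computing node itself.
    return any(ok for node, ok in responses.items() if node != target)
```

If every storage node in the group fails, the function returns `False`, which corresponds to not blaming a single storage node for what may be a computing-node or network problem.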

The method for managing sub-health nodes according to the embodiment of the present application is described in detail above with reference to fig. 1 to 4, and the apparatus according to the embodiment of the present application is described in detail below with reference to fig. 5 to 13. It should be noted that the apparatus shown in fig. 5 to 13 may implement one or more steps of the above method, and will not be described herein for brevity.

Fig. 5 is a schematic diagram of a management apparatus for sub-health nodes according to an embodiment of the present application, and the apparatus 500 shown in fig. 5 may be any computing node in fig. 1. The apparatus 500 includes a sending module 510 and a processing module 520.

A sending module 510, configured to send a first input/output IO request to a first storage node;

A processing module 520, configured to determine, based on the first IO request, that the first storage node is a sub-healthy node;

The sending module 510 is further configured to send an identification message to a control node, where the identification message is used to indicate that the first storage node is a sub-healthy node.

Optionally, as a possible implementation manner, the sending module is further configured to send a second IO request to the first storage node; and when receiving a response message that the first storage node refuses to process the second IO request, the processing module is further used for marking the first storage node as the sub-health node.

Optionally, as a possible implementation manner, the processing module is further configured to write data of the first storage node to a replacement storage node of the first storage node.

Optionally, as a possible implementation manner, the apparatus includes a receiving module, where the sending module is configured to send a third IO request to the second storage node; the receiving module is further configured to receive a processing success response of the third IO request returned by the second storage node; the receiving module is further configured to receive a processing failure response of the first IO request returned by the first storage node; the processing module is further configured to determine, according to the processing failure response of the first IO request and the processing success response of the third IO request, that the first storage node is the sub-health node, where the data processed by the first IO request and the data processed by the third IO request belong to the same stripe, or the data processed by the first IO request and the data processed by the third IO request belong to the same copy.

Optionally, as a possible implementation manner, when the processing of the first IO request by the first storage node times out, the processing module is configured to determine that the first storage node is a sub-healthy node.

Optionally, as a possible implementation manner, the first IO request is a write request or a read request.
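The timeout-based implementation mentioned above can be sketched with the Python standard library as follows. The function name `io_times_out` and the use of a worker thread are assumptions for illustration; the application does not describe how the timeout is measured.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def io_times_out(io_call, timeout_s):
    """Return True if io_call does not finish within timeout_s seconds.

    io_call is a hypothetical zero-argument callable that performs the
    first IO request (read or write) against the first storage node.
    """
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(io_call)
    try:
        future.result(timeout=timeout_s)
        return False
    except FutureTimeout:
        return True
    finally:
        # Do not block on a slow request; the worker finishes on its own.
        pool.shutdown(wait=False)


fast = io_times_out(lambda: "ok", timeout_s=1.0)
slow = io_times_out(lambda: time.sleep(0.3), timeout_s=0.05)
```

A `True` result would lead the processing module to treat the first storage node as a sub-healthy node. An exception raised by `io_call` itself is propagated to the caller rather than counted as a timeout.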

In an alternative embodiment, the processing module 520 may be a processor 620, the sending module 510 may be an input-output interface 630, and the computing node may further include a memory 610, as particularly shown in fig. 6.

Fig. 6 is a schematic diagram of a computing node according to another embodiment of the application. The computing node 600 shown in fig. 6 may include: memory 610, processor 620, and input/output interface 630. The memory 610, the processor 620 and the input/output interface 630 are connected through an internal connection path, the memory 610 is used for storing instructions, and the processor 620 is used for executing the instructions stored in the memory 610, so as to control the input/output interface 630 to receive input data and information, and output data such as operation results.

In implementation, the steps of the above method may be performed by integrated logic circuitry in hardware or instructions in software in the processor 620. The method disclosed in connection with the embodiments of the present application may be directly embodied as a hardware processor executing or may be executed by a combination of hardware and software modules in the processor. The software modules may be located in a random access memory, flash memory, read only memory, programmable read only memory, or electrically erasable programmable memory, registers, etc. as well known in the art. The storage medium is located in the memory 610, and the processor 620 reads the information in the memory 610 and, in combination with its hardware, performs the steps of the method described above. To avoid repetition, a detailed description is not provided herein.

Fig. 7 is a schematic diagram of a management apparatus for sub-health nodes according to an embodiment of the present application, and the apparatus 700 shown in fig. 7 may be the control node in fig. 1. The apparatus 700 comprises: a receiving module 710 and a processing module 720.

A receiving module 710, configured to receive an identification message sent by a computing node, where the identification message indicates that a storage node is directly marked as a sub-healthy node;

A processing module 720, configured to mark the storage node as the sub-health node.

Fig. 8 is a schematic diagram of a management apparatus for sub-health nodes according to an embodiment of the present application, and the apparatus 800 shown in fig. 8 may be the control node in fig. 1. The apparatus 800 includes: a processing module 810 and a transmitting module 820.

A processing module 810 configured to determine that the storage node is a sub-healthy node;

And a sending module 820, configured to send indication information to the storage node, where the indication information is used to instruct the storage node to stop processing the input/output IO request.

Optionally, as an embodiment, the sending module is configured to send a heartbeat response to the first storage node, where the heartbeat response carries the indication information.

Optionally, as an embodiment, the indication information indicates that the first storage node stops processing the IO request by indicating that the first storage node is marked as the sub-healthy node.

Optionally, as an embodiment, the apparatus includes a receiving module, configured to receive an identification message sent by a computing node, where the identification message requests that the first storage node be marked as a sub-healthy node; and the processing module is used for marking the first storage node as the sub-health node according to the identification message.

Fig. 9 is a schematic diagram of a management apparatus for sub-health nodes according to an embodiment of the present application, and the apparatus 900 shown in fig. 9 may be the control node in fig. 1. The apparatus 900 includes: a sending module 910 and a processing module 920.

A sending module 910, configured to send a heartbeat response to a storage node, where the heartbeat response is used to indicate that the storage node is a sub-healthy node;

And the processing module 920 is configured to determine that communication with the storage node is interrupted if the heartbeat response feedback fails.

Optionally, as an embodiment, the heartbeat response is further used to instruct the storage node to stop processing the input/output IO request.

In an alternative embodiment, the apparatus 700 may be the control node 1000, the processing module 720 may be the processor 1020, the receiving module 710 may be the input/output interface 1030, and the control node 1000 may further include the memory 1010, as particularly shown in fig. 10.

In an alternative embodiment, the apparatus 800 may be the control node 1000, the processing module 810 may be the processor 1020, the sending module 820 may be the input/output interface 1030, and the control node 1000 may further include the memory 1010, as particularly shown in fig. 10.

In an alternative embodiment, the apparatus 900 may be the control node 1000, the processing module 920 may be the processor 1020, the sending module 910 may be the input/output interface 1030, and the control node 1000 may further include the memory 1010, as particularly shown in fig. 10.

Fig. 10 is a schematic diagram of a control node according to another embodiment of the present application. The control node 1000 shown in fig. 10 may include: memory 1010, processor 1020, and input/output interface 1030. The memory 1010, the processor 1020, and the input/output interface 1030 are connected by an internal connection path, where the memory 1010 is used for storing instructions, and the processor 1020 is used for executing the instructions stored in the memory 1010 to control the input/output interface 1030 to receive input data and information, and output data such as operation results.

In implementation, the steps of the methods described above may be performed by integrated logic circuitry in hardware in processor 1020 or by instructions in software. The method disclosed in connection with the embodiments of the present application may be directly embodied as a hardware processor executing or may be executed by a combination of hardware and software modules in the processor. The software modules may be located in a random access memory, flash memory, read only memory, programmable read only memory, or electrically erasable programmable memory, registers, etc. as well known in the art. The storage medium is located in the memory 1010, and the processor 1020 reads information in the memory 1010 to perform the steps of the method described above in connection with its hardware. To avoid repetition, a detailed description is not provided herein.

Fig. 11 is a schematic diagram of a management apparatus for sub-health nodes according to an embodiment of the present application, and the apparatus 1100 shown in fig. 11 may be any storage node in fig. 1. Storage node 1100 includes a receiving module 1110 and a processing module 1120.

In the case that the first storage node is a sub-healthy node, a receiving module 1110 is configured to receive indication information sent by a control node, where the indication information is used to instruct the first storage node to stop processing an input/output IO request;

and a processing module 1120, configured to stop processing the IO request.

Optionally, as an embodiment, the receiving module is configured to receive a heartbeat response sent by the control node, where the heartbeat response carries the indication information.

Fig. 12 is a schematic diagram of a sub-health node management apparatus according to an embodiment of the present application, and the apparatus 1200 shown in fig. 12 may be any storage node in fig. 1. Storage node 1200 includes a processing module 1210 and a receiving module 1220.

If a heartbeat response sent by the control node is received by the receiving module 1220, and the heartbeat response is used to indicate that the storage node is a sub-healthy node, the processing module 1210 is configured to determine that the storage node is a sub-healthy node;

If the heartbeat response feedback fails, the processing module 1210 is configured to determine that the storage node is deleted from the storage system.

In an alternative embodiment, the processing module 1120 may be a processor 1320, the receiving module 1110 may be an input-output interface 1330, and the storage node 1300 may further include a memory 1310, as particularly shown in fig. 13.

In an alternative embodiment, the processing module 1210 may be a processor 1320, the receiving module 1220 may be an input/output interface 1330, and the storage node 1300 may further include a memory 1310, as particularly shown in fig. 13.

Fig. 13 is a schematic diagram of a storage node according to another embodiment of the present application. The storage node 1300 shown in fig. 13 may include: memory 1310, processor 1320, and input/output interface 1330. The memory 1310, the processor 1320 and the input/output interface 1330 are connected through an internal connection path, the memory 1310 is used for storing instructions, and the processor 1320 is used for executing the instructions stored in the memory 1310, so as to control the input/output interface 1330 to receive input data and information, and output data such as operation results.

In implementation, the steps of the methods described above may be performed by integrated logic circuitry in hardware in processor 1320 or by instructions in software. The method disclosed in connection with the embodiments of the present application may be directly embodied as a hardware processor executing or may be executed by a combination of hardware and software modules in the processor. The software modules may be located in a random access memory, flash memory, read only memory, programmable read only memory, or electrically erasable programmable memory, registers, etc. as well known in the art. The storage medium is located in the memory 1310, and the processor 1320 reads information in the memory 1310 and performs the steps of the method in combination with its hardware. To avoid repetition, a detailed description is not provided herein.

It should be appreciated that in embodiments of the present application, the processor may be a central processing unit (CPU), or may be another general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.

It should also be appreciated that in embodiments of the present application, the memory may include read only memory and random access memory, and provide instructions and data to the processor. A portion of the memory may also include nonvolatile random access memory. For example, the memory may also store information about the device type.

Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.

It will be clearly understood by those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described system, apparatus and unit may refer to corresponding procedures in the foregoing method embodiments, and the detailed description is omitted herein.

In the several embodiments provided by the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of elements is merely a logical functional division, and there may be additional divisions of actual implementation, e.g., multiple elements or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.

The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.

The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method of the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program code.

The foregoing is merely illustrative of the present application, and the present application is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims

1. A method of managing sub-health nodes, comprising:

the computing node sends a first input/output IO request to a first storage node;

the computing node determines that the first storage node is a sub-healthy node based on a first IO request;

The computing node sends an identification message to a control node, wherein the identification message is used for indicating the first storage node to be a sub-healthy node, so that the control node indicates the first storage node to send a response message for refusing processing to the computing node sending a subsequent IO request, and the subsequent IO request is an IO request sent to the first storage node after the computing node sends the first IO request;

The computing node sends a second IO request to the first storage node;

and when receiving a response message that the first storage node refuses to process the second IO request, marking the first storage node as the sub-health node by the computing node.

2. The method of claim 1, wherein the method further comprises:

the computing node writes data of the first storage node to a replacement storage node of the first storage node.

3. The method of claim 1 or 2, wherein the method further comprises:

the computing node sends a third IO request to the second storage node;

The computing node receives a processing success response of the third IO request returned by the second storage node;

The computing node receives a processing failure response of the first IO request returned by the first storage node;

The computing node determining, based on the first IO request, that the first storage node is a sub-healthy node, comprising:

The computing node determines the first storage node as the sub-health node according to the processing failure response of the first IO request and the processing success response of the third IO request,

The data processed by the first IO request and the third IO request belong to the same stripe, or the data processed by the first IO request and the third IO request belong to the same copy.

4. The method of claim 1, wherein the computing node determining, based on the first IO request, that the first storage node is a sub-healthy node comprises:

and when the processing of the first IO request by the first storage node times out, the computing node determines that the first storage node is a sub-health node.

5. The method of claim 1, wherein the first IO request is a write request or a read request.

6. A management apparatus for sub-health nodes, comprising:

the sending module is used for sending a first input/output IO request to the first storage node;

The processing module is used for determining that the first storage node is a sub-health node based on a first IO request;

the sending module is further configured to send an identification message to a control node, where the identification message is used to indicate that the first storage node is a sub-healthy node, so that the control node indicates the first storage node to send a response message for rejecting processing to a computing node that sends a subsequent IO request, where the subsequent IO request is an IO request sent to the first storage node after the computing node sends the first IO request;

the sending module is further configured to send a second IO request to the first storage node;

and when receiving a response message that the first storage node refuses to process the second IO request, the processing module is further used for marking the first storage node as the sub-health node.

7. The apparatus of claim 6, wherein the processing module is further to write data of the first storage node to an alternate storage node of the first storage node.

8. The apparatus of claim 6 or 7, wherein the apparatus comprises a receiving module,

The sending module is used for sending a third IO request to the second storage node;

The receiving module is used for receiving a processing success response of the third IO request returned by the second storage node;

the receiving module is further configured to receive a processing failure response of the first IO request returned by the first storage node;

the processing module is further configured to determine that the first storage node is the sub-healthy node according to the processing failure response of the first IO request and the processing success response of the third IO request,

9. The apparatus of claim 6, wherein,

and when the processing of the first IO request by the first storage node times out, the processing module is used for determining that the first storage node is a sub-health node.

10. The apparatus of claim 6, wherein the first IO request is a write request or a read request.

11. A computing node comprising a processor and a memory, the memory for storing a computer program, the processor for invoking and running the computer program from the memory, causing the computing node to perform the method of any of claims 1-5.