CN111949452A - Method and device for rapidly recovering IO (input/output) after a single-node fault of a storage system
- Publication number
- CN111949452A (application number CN202010987811.0A)
- Authority
- CN
- China
- Prior art keywords
- service
- service representative
- event
- cluster
- module
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3003—Monitoring arrangements specially adapted to the computing system or computing system component being monitored
- G06F11/3006—Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3051—Monitoring arrangements for monitoring the configuration of the computing system or of the computing system component, e.g. monitoring the presence of processing resources, peripherals, I/O links, software programs
Abstract
The invention discloses a method and a device for rapidly recovering IO (input/output) after a single-node failure of a storage system. The method adopts a service-representative pattern: an intermediate service representative connects the client and the server, reducing the communication complexity between them. The service exposes a uniform interface outward; the service representative queries and calls the service interface to execute the related services, and a client implements the corresponding service simply by calling the representative's simplified interface. At the same time, communication and remote-query calls from presentation-layer code into business-layer code are reduced, which shortens the communication time and the time differences caused by modules cross-triggering interruption and recovery, and ensures the shortest possible I/O interruption time.
Description
Technical Field
The invention relates to the technical field of storage, in particular to a method and a device for rapidly recovering IO (input/output) in a single-node fault of a storage system.
Background
Existing approaches to handling the IO interruption caused by a node-fault recovery event mainly use the observer pattern, with the storage-system cluster as the observed subject and submodules such as the forwarding layer, cache, volume, and RAID as observers. This pattern has the following problems: 1. the observed subject has many direct and indirect observers, so notifying all of them takes considerable time; 2. cyclic dependencies between the observers and the observed subject can trigger cyclic calls between them, which may crash the system; 3. the observer pattern gives an observer no mechanism to learn how the observed subject has changed, only that it has changed, so each observer must re-query the state; 4. notifying the observers itself takes much time, and the subject must also check that all observers have recovered before notifying the host to resume I/O.
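As a rough illustration of these drawbacks (not taken from the patent itself; all class and method names are invented for the sketch), the observer-style recovery path can be modeled as follows: the cluster must notify every submodule observer in turn, each observer learns only that something changed, and host I/O cannot resume until every observer reports recovery.

```python
# Minimal sketch of the observer-pattern recovery described above.
# Names (Cluster, Submodule, on_node_failure) are illustrative assumptions.

class Cluster:                      # the observed subject
    def __init__(self):
        self.observers = []         # forwarding layer, cache, volume, RAID, ...

    def attach(self, observer):
        self.observers.append(observer)

    def on_node_failure(self, node_id):
        # Every direct (and, transitively, indirect) observer is notified
        # one by one, so total notification time grows with observer count.
        for obs in self.observers:
            obs.update(node_id)
        # I/O may resume only once *all* observers report recovery,
        # adding a final polling delay before the host is notified.
        while not all(obs.recovered for obs in self.observers):
            pass

class Submodule:
    def __init__(self, name):
        self.name = name
        self.recovered = False

    def update(self, node_id):
        # The observer learns only *that* the subject changed, not *how*;
        # in a real system it would re-query state, adding communication time.
        self.recovered = True
```

The sketch makes the cost structure visible: one notification per observer plus a recovery check over all observers before I/O resumes, which is exactly the delay the service-representative design below is meant to eliminate.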
Disclosure of Invention
The invention mainly solves the technical problem of providing a method and a device for rapidly recovering IO (input/output) after a single-node fault of a storage system, which reduce the communication time and the time differences caused by modules cross-triggering interruption and recovery, and ensure an optimal interruption time.
In order to solve the above technical problem, the invention adopts the following technical scheme: a method for rapidly recovering IO (input/output) after a single-node failure of a storage system, comprising the following steps: first, a service representative queries and calls the service interface to execute a service, a storage service submodule calls the service representative's interface to implement the corresponding service, the service representative connects the storage service submodule with the cluster event management module, and the service representative uniformly exposes the interface outward; second, the service representative serializes the management of cluster events and controls the uniform issuing of events and the processing of their responses.
Further, the first step specifically includes the steps of:
s100, transmitting an event needing to be processed to a cache of a service representative, wherein the cache is created at a cluster host end and is used for carrying out data communication with the service representative end;
s101, setting a service representative attribute class and a state machine;
s102, controlling and starting a state machine service representative algorithm;
s103, waiting for the occurrence of the cluster single node joining leaving event, reading the received event into the service representative state machine from the cache of the service representative, and completing the update of the event issuing state and the issuing of the notification host I/O.
Further, the cache in step S100 is stored in the cluster configuration management.
Further, the state machine in step S101 is used to control issuing events to the sub-modules, and respond to the processing results of each module.
Further, the second step specifically includes the steps of:
s200, when the node leaves, the cluster generates Pend and sends the Pend to the service representative end, the survival node of the I/O stack service submodule keeps I/O and processes the transaction processing required by the leaving of the node at the service end;
s201, generating a Remove event after the cluster BOSS nodes switch and reconstruct the view, sending the Remove event to a service representative end, and judging whether the service processing of each service sub-module is finished by the service representative end;
s202, the service representative monitors all service submodules of the I/O to complete processing, uniformly issues a Remove event to each service configuration management module, and triggers the I/O to start interruption;
s203, after IO interruption, all the sub-modules perform preferred node switching together, update and synchronize metadata, and notify a service representative after completion;
s204, the service representative notifies the host computer after receiving the completion of the updating of the configuration of all the sub-modules, and immediately resumes the I/O;
s205, when the node is added, the cluster generates an Add/Unpend/UnpendDone event;
s205, the service representative receives the cluster event and puts the cluster event into a cache, and the Add triggers the state machine to inform each sub-module service layer to carry out operations such as node attribute updating and the like, and the surviving node keeps I/O;
s207, after the updating is completed, a service representative is notified, the triggering of a Unpend event of the cluster after the node is added is waited, the service representative triggers a sub-service module to perform the Discard operation, the surviving node does not operate, and the I/O is continuously kept;
s208, the cluster completes view reconstruction operation and sends the UnpendDone event to a service representative;
s209, the service representative notifies the host computer after receiving the completion of the updating of the configuration of all the sub-modules, and immediately resumes the I/O.
Further, the Add/Unpend/UnpendDone events in step S205 are the node-join, un-pend, and un-pend-completion events, respectively.
Further, in step S208, after each submodule has completed the Discard operation, the service representative uniformly triggers each submodule's configuration module to interrupt I/O, perform the preferred-node switch, and update and synchronize the metadata.
Further, after the second step, the communication and remote-query calls from the presentation-layer code into the business-layer code are reduced.
A device for rapidly recovering IO (input/output) after a single-node failure of a storage system comprises a service representative module, a storage service submodule, and a cluster event management module. The service representative module queries and calls the service interface to execute a service, and the storage service submodule calls the service representative module's interface to implement the corresponding service; the service representative module acts as an intermediate representative connecting the storage service submodule with the cluster event management module and uniformly exposes the interface outward; the service representative module serializes the management of cluster events and controls the uniform issuing of events and the processing of their responses.
The beneficial effects of the invention are as follows: an intermediate representative connects the storage service submodules with the cluster event management, reducing the communication complexity between them; the cluster service exposes a uniform interface outward, the service representative queries and calls the service interface to execute the related services, and a client implements the corresponding service simply by calling the representative's simplified interface. Communication and remote-query calls from presentation-layer code into business-layer code are reduced, which shortens the communication time and the time differences caused by modules cross-triggering interruption and recovery, and ensures an optimal interruption time.
Drawings
Fig. 1 is an architecture diagram of a method for fast IO recovery from a single node failure in a storage system according to a preferred embodiment of the present invention.
Fig. 2 is a structural diagram of an apparatus for rapidly recovering IO due to a single-node failure in a storage system according to the present invention.
Detailed Description
The following detailed description of preferred embodiments of the present invention, taken in conjunction with the accompanying drawings, will make the advantages and features of the invention easier for those skilled in the art to understand and will define the scope of the invention more clearly.
In a first aspect, referring to fig. 1, an embodiment of the present invention provides a method for rapidly recovering IO after a single-node failure in a storage system. First, a service-representative pattern is adopted: an intermediate representative connects the storage service submodules with the cluster event management module, reducing the communication complexity between them; the service representative exposes a uniform interface outward and queries and calls the service interface to execute the related services, so a service submodule implements the corresponding service simply by calling the representative's simplified interface. Then, the service representative serializes the management of cluster events and controls the uniform issuing of events and the processing of their responses, reducing the time differences caused by submodules cross-triggering interruption and recovery as events are issued and answered. Finally, the communication and remote-query calls from presentation-layer code into business-layer code are reduced, shortening the communication time and ensuring a minimal interruption time.
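The service-representative design above is essentially a mediator: submodules and the cluster event manager talk only to the representative, which exposes one simplified entry point. The following sketch illustrates that shape; the class and method names are assumptions for illustration, not the patent's actual API.

```python
# Hedged sketch of the "service representative" (mediator) design: events are
# issued uniformly to every registered submodule, and responses are collected
# in one place instead of cross-triggering between modules.

class ServiceRepresentative:
    """Mediator between the cluster event manager and storage submodules."""
    def __init__(self):
        self.submodules = {}

    def register(self, name, handler):
        # Each storage service submodule (cache, volume, RAID, ...) registers
        # one handler behind the representative's uniform interface.
        self.submodules[name] = handler

    def dispatch(self, event):
        # Uniform issuing: one event goes to all submodules; all responses
        # come back to the representative for unified processing.
        return {name: handler(event) for name, handler in self.submodules.items()}

class ClusterEventManager:
    def __init__(self, representative):
        self.rep = representative

    def node_event(self, event):
        # The cluster needs only the mediator's single entry point.
        return self.rep.dispatch(event)
```

With this wiring, adding a submodule changes nothing on the cluster side: it simply registers one more handler with the representative.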
The quick recovery I/O program implementation mainly comprises the following steps:
(1) creating a cache for data communication between a cluster host end and a service representative end, wherein the cache is stored in cluster configuration management;
(2) transmitting the event to be processed to a cache of a service representative;
(3) setting a service representative attribute class and a state machine, controlling and issuing events to the sub-modules, and responding to the processing results of the modules;
(4) controlling and starting a state machine service representative algorithm;
(5) waiting for a cluster single-node join/leave event to occur, reading the received event from the service representative's cache into the service representative state machine, and completing the update of the event-issuing state and the issuing of the host I/O notification.
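Steps (1) to (5) above can be sketched as a small cached-event state machine. This is an illustrative model only, assuming a simple FIFO cache and two states; the patent does not specify the cache structure or state names.

```python
# Illustrative sketch of steps (1)-(5): events are written to a cache shared
# between the cluster host side and the representative side, then the state
# machine drains the cache and records issue-state updates.

from collections import deque

class RepresentativeStateMachine:
    def __init__(self):
        self.cache = deque()        # step (1): host <-> representative cache
        self.state = "IDLE"
        self.log = []               # event-issuing state updates

    def push_event(self, event):
        # Step (2): an event to be processed is written into the cache.
        self.cache.append(event)

    def run_once(self):
        # Steps (4)-(5): drain one cached join/leave event through the
        # state machine, then return to IDLE once issuing is recorded.
        if not self.cache:
            return None
        event = self.cache.popleft()
        self.state = "ISSUING"
        self.log.append(("issued", event))
        self.state = "IDLE"         # issue state updated; host I/O notified
        return event
```

In a real implementation the cache would live in cluster configuration management and `run_once` would block on event arrival; the FIFO drain shown here only demonstrates the serialized ordering.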
The method for controlling and optimizing the IO fast recovery I/O program by adopting the service representative mode mainly comprises the following steps:
(1) when a node leaves, the cluster generates a Pend event and sends it to the service representative side; the surviving nodes of the I/O stack service submodules hold I/O and handle, on the server side, the transactions required by the node's departure;
(2) the cluster BOSS node generates a Remove event after switching and reconstructing the view, and sends the Remove event to a service representative terminal, and the service representative terminal judges whether the service processing of each service sub-module is completed;
(3) the service representative monitors that all service submodules of the I/O are processed, uniformly issues a Remove event to each service configuration management module, and triggers the I/O to start interruption;
(4) after the IO interruption, all submodules together perform operations such as the preferred-node switch and metadata update and synchronization, and notify the service representative upon completion;
(5) the service representative informs the host computer after receiving the completion of the updating of all the sub-module configurations, and immediately recovers I/O without returning to the cluster;
(6) when the node is added, the cluster generates an Add/Unpend/UnpendDone event;
(7) the service representative receives the cluster event and puts the cluster event into a cache, and the Add triggers a state machine to inform each submodule service layer to carry out node attribute updating and other operations, and the surviving node keeps I/O;
(8) after the update is completed, notifying a service representative, waiting for the triggering of a Unpend event of the cluster after the node is added, triggering a sub-service module by the service representative to perform the Discard operation, not operating the surviving node, and continuously keeping I/O;
(9) the cluster completes operations such as view reconstruction and sends an UnpendDone event to the service representative; at this point each submodule has finished the Discard operation, and the service representative uniformly triggers each submodule's configuration module to interrupt I/O and perform the preferred-node switch and metadata update and synchronization;
(10) and the service representative informs the host computer after receiving the completion of the updating of all the sub-module configurations, and immediately recovers the I/O.
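The ordering guarantee in the node-leave flow (steps (1) to (5) above) is the core of the speed-up: I/O is merely held during the Pend phase, interrupted only after every submodule has finished its server-side work, and resumed as soon as all configurations are updated, without a round trip back to the cluster. A condensed sketch of that ordering, with invented function and step names:

```python
# Hedged sketch of the node-leave flow: the returned timeline shows that the
# single I/O interruption is deferred until all leave transactions are done
# and ends as soon as all submodule configurations are updated.

def handle_node_leave(submodules, notify_host):
    timeline = []
    timeline.append("Pend: surviving nodes hold I/O")          # step (1)
    for m in submodules:                                       # steps (2)-(3)
        timeline.append(f"{m}: leave transaction done")
    timeline.append("Remove issued to all config modules; I/O interrupted")
    for m in submodules:                                       # step (4)
        timeline.append(f"{m}: preferred-node switch + metadata sync")
    # Step (5): host notified directly, no return trip to the cluster.
    notify_host("resume I/O")
    timeline.append("host notified: I/O resumed immediately")
    return timeline
```

Because the representative batches the Remove event after all submodules report ready, the interruption window covers only the switch-and-sync phase rather than the whole recovery.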
In a second aspect, based on the same inventive concept as the method for rapidly recovering IO from a single node failure in a storage system in the foregoing embodiment, an embodiment of the present specification further provides a device for rapidly recovering IO from a single node failure in a storage system, as shown in fig. 2, including a service representative module, a storage service sub-module, and a cluster event management module; the service representative module inquires and calls a service interface to execute a service, and the storage service sub-module calls the interface to the service representative module to realize a response service; the service representative module is used as a middle representative to connect the storage service submodule and the cluster event management module, and the service representative module uniformly exposes interfaces outwards; the service representative module manages cluster event serialization and controls the uniform issuing and response processing of events.
The above description is only an embodiment of the present invention and is not intended to limit the scope of the invention; all equivalent structural or process modifications made using the contents of the present specification and drawings, whether applied directly or indirectly in other related technical fields, are likewise included within the scope of the invention.
Claims (9)
1. A method for rapidly recovering IO (input/output) after a single-node failure of a storage system, characterized by comprising the following steps: first, a service representative queries and calls the service interface to execute a service, a storage service submodule calls the service representative's interface to implement the corresponding service, the service representative connects the storage service submodule with the cluster event management module, and the service representative uniformly exposes the interface outward; second, the service representative serializes the management of cluster events and controls the uniform issuing of events and the processing of their responses.
2. The method for rapidly recovering IO due to single-node failure in a storage system according to claim 1, wherein the first step specifically includes the following steps:
s100, transmitting an event needing to be processed to a cache of a service representative, wherein the cache is created at a cluster host end and is used for carrying out data communication with the service representative end;
s101, setting a service representative attribute class and a state machine;
s102, controlling and starting a state machine service representative algorithm;
s103, waiting for the occurrence of the cluster single node joining leaving event, reading the received event into the service representative state machine from the cache of the service representative, and completing the update of the event issuing state and the issuing of the notification host I/O.
3. The method for rapidly recovering IO due to single-node failure in a storage system according to claim 2, wherein the cache in step S100 is stored in cluster configuration management.
4. The method according to claim 2, wherein the state machine in step S101 is used to control issuing of events to the sub-modules and responding to the processing results of the modules.
5. The method for rapidly recovering IO due to single-node failure in a storage system according to claim 1, wherein the second step specifically includes the following steps:
s200, when the node leaves, the cluster generates Pend and sends the Pend to the service representative end, the survival node of the I/O stack service submodule keeps I/O and processes the transaction processing required by the leaving of the node at the service end;
s201, generating a Remove event after the cluster BOSS nodes switch and reconstruct the view, sending the Remove event to a service representative end, and judging whether the service processing of each service sub-module is finished by the service representative end;
s202, the service representative monitors all service submodules of the I/O to complete processing, uniformly issues a Remove event to each service configuration management module, and triggers the I/O to start interruption;
s203, after IO interruption, all the sub-modules perform preferred node switching together, update and synchronize metadata, and notify a service representative after completion;
s204, the service representative notifies the host computer after receiving the completion of the updating of the configuration of all the sub-modules, and immediately resumes the I/O;
s205, when the node is added, the cluster generates an Add/Unpend/UnpendDone event;
s205, the service representative receives the cluster event and puts the cluster event into a cache, and the Add triggers the state machine to inform each sub-module service layer to carry out operations such as node attribute updating and the like, and the surviving node keeps I/O;
s207, after the updating is completed, a service representative is notified, the triggering of a Unpend event of the cluster after the node is added is waited, the service representative triggers a sub-service module to perform the Discard operation, the surviving node does not operate, and the I/O is continuously kept;
s208, the cluster completes view reconstruction operation and sends the UnpendDone event to a service representative;
s209, the service representative notifies the host computer after receiving the completion of the updating of the configuration of all the sub-modules, and immediately resumes the I/O.
6. The method according to claim 5, wherein the Add/Unpend/UnpendDone events in step S205 are the node-join, un-pend, and un-pend-completion events, respectively.
7. The method according to claim 5, wherein in step S208, after each submodule has completed the Discard operation, the service representative uniformly triggers each submodule's configuration module to interrupt I/O, perform the preferred-node switch, and update and synchronize the metadata.
8. The method according to claim 1, wherein the second step is followed by reducing the communication and remote-query calls from the presentation-layer code into the business-layer code.
9. A device for rapidly recovering IO (input/output) after a single-node failure of a storage system, characterized by comprising a service representative module, a storage service submodule, and a cluster event management module; the service representative module queries and calls the service interface to execute a service, and the storage service submodule calls the service representative module's interface to implement the corresponding service; the service representative module acts as an intermediate representative connecting the storage service submodule with the cluster event management module and uniformly exposes the interface outward; the service representative module serializes the management of cluster events and controls the uniform issuing of events and the processing of their responses.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010987811.0A CN111949452B (en) | 2020-09-18 | 2020-09-18 | Method and device for rapidly recovering IO (input/output) in single-node fault of storage system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111949452A true CN111949452A (en) | 2020-11-17 |
CN111949452B CN111949452B (en) | 2022-09-20 |
Family
ID=73357483
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010987811.0A Active CN111949452B (en) | 2020-09-18 | 2020-09-18 | Method and device for rapidly recovering IO (input/output) in single-node fault of storage system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111949452B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102868754A (en) * | 2012-09-26 | 2013-01-09 | 北京联创信安科技有限公司 | High-availability method, node device and system for achieving cluster storage |
CN109032830A (en) * | 2018-07-25 | 2018-12-18 | 广东浪潮大数据研究有限公司 | A kind of fault recovery method of distributed memory system, system and associated component |
CN111158779A (en) * | 2019-12-24 | 2020-05-15 | 深圳云天励飞技术有限公司 | Data processing method and related equipment |
Also Published As
Publication number | Publication date |
---|---|
CN111949452B (en) | 2022-09-20 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |