CN111949452A - Method and device for rapidly recovering IO (input/output) after a single-node fault of a storage system
- Publication number
- CN111949452A (application number CN202010987811.0A)
- Authority
- CN
- China
- Prior art keywords
- service
- service representative
- event
- cluster
- module
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3003—Monitoring arrangements specially adapted to the computing system or computing system component being monitored
- G06F11/3006—Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3051—Monitoring arrangements for monitoring the configuration of the computing system or of the computing system component, e.g. monitoring the presence of processing resources, peripherals, I/O links, software programs
Abstract
The invention discloses a method and a device for rapidly recovering IO (input/output) after a single-node failure of a storage system. The method adopts a service-representative pattern: an intermediate service representative connects the client and the server, reducing the communication complexity between them. The service exposes a uniform interface outward; the service representative queries and calls the service interface to execute the related services, and a client implements the corresponding service simply by calling the representative's simplified interface. At the same time, communication and remote-query calls from presentation-layer code into business-layer code are reduced, which shortens the communication time and the time differences caused by modules cross-triggering interruption and recovery, and ensures the shortest possible I/O interruption time.
Description
Technical Field
The invention relates to the technical field of storage, in particular to a method and a device for rapidly recovering IO (input/output) in a single-node fault of a storage system.
Background
Existing approaches to handling the IO interruption caused by a node-fault recovery event mainly use the observer pattern, with the storage-system cluster as the observed subject and submodules such as the forwarding layer, cache, volume, and RAID as observers. This pattern has the following problems: 1. the observed subject has many direct and indirect observers, so notifying all of them takes considerable time; 2. cyclic dependencies between the observers and the observed subject can trigger cyclic calls between them, which may crash the system; 3. the observer pattern gives an observer no mechanism to learn how the observed subject has changed, only that it has changed, so each observer must re-query the state; 4. notifying the observers itself takes much time, and the subject must also check that all observers have recovered before notifying the host to resume I/O.
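As a rough illustration of these drawbacks (not taken from the patent itself; all class and method names are invented for the sketch), the observer-style recovery path can be modeled as follows: the cluster must notify every submodule observer in turn, each observer learns only that something changed, and host I/O cannot resume until every observer reports recovery.

```python
# Minimal sketch of the observer-pattern recovery described above.
# Names (Cluster, Submodule, on_node_failure) are illustrative assumptions.

class Cluster:                      # the observed subject
    def __init__(self):
        self.observers = []         # forwarding layer, cache, volume, RAID, ...

    def attach(self, observer):
        self.observers.append(observer)

    def on_node_failure(self, node_id):
        # Every direct (and, transitively, indirect) observer is notified
        # one by one, so total notification time grows with observer count.
        for obs in self.observers:
            obs.update(node_id)
        # I/O may resume only once *all* observers report recovery,
        # adding a final polling delay before the host is notified.
        while not all(obs.recovered for obs in self.observers):
            pass

class Submodule:
    def __init__(self, name):
        self.name = name
        self.recovered = False

    def update(self, node_id):
        # The observer learns only *that* the subject changed, not *how*;
        # in a real system it would re-query state, adding communication time.
        self.recovered = True
```

The sketch makes the cost structure visible: one notification per observer plus a recovery check over all observers before I/O resumes, which is exactly the delay the service-representative design below is meant to eliminate.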
Disclosure of Invention
The invention mainly solves the technical problem of providing a method and a device for rapidly recovering IO (input/output) after a single-node fault of a storage system, which reduce the communication time and the time differences caused by modules cross-triggering interruption and recovery, and ensure an optimal interruption time.
In order to solve the above technical problem, the invention adopts the following technical scheme: a method for rapidly recovering IO (input/output) after a single-node failure of a storage system, comprising the following steps: first, a service representative queries and calls the service interface to execute a service, a storage service submodule calls the service representative's interface to implement the corresponding service, the service representative connects the storage service submodule with the cluster event management module, and the service representative uniformly exposes the interface outward; second, the service representative serializes the management of cluster events and controls the uniform issuing of events and the processing of their responses.
Further, the first step specifically includes the steps of:
s100, transmitting an event needing to be processed to a cache of a service representative, wherein the cache is created at a cluster host end and is used for carrying out data communication with the service representative end;
s101, setting a service representative attribute class and a state machine;
s102, controlling and starting a state machine service representative algorithm;
s103, waiting for the occurrence of the cluster single node joining leaving event, reading the received event into the service representative state machine from the cache of the service representative, and completing the update of the event issuing state and the issuing of the notification host I/O.
Further, the cache in step S100 is stored in the cluster configuration management.
Further, the state machine in step S101 is used to control issuing events to the sub-modules, and respond to the processing results of each module.
Further, the second step specifically includes the steps of:
s200, when the node leaves, the cluster generates Pend and sends the Pend to the service representative end, the survival node of the I/O stack service submodule keeps I/O and processes the transaction processing required by the leaving of the node at the service end;
s201, generating a Remove event after the cluster BOSS nodes switch and reconstruct the view, sending the Remove event to a service representative end, and judging whether the service processing of each service sub-module is finished by the service representative end;
s202, the service representative monitors all service submodules of the I/O to complete processing, uniformly issues a Remove event to each service configuration management module, and triggers the I/O to start interruption;
s203, after IO interruption, all the sub-modules perform preferred node switching together, update and synchronize metadata, and notify a service representative after completion;
s204, the service representative notifies the host computer after receiving the completion of the updating of the configuration of all the sub-modules, and immediately resumes the I/O;
s205, when the node is added, the cluster generates an Add/Unpend/UnpendDone event;
s205, the service representative receives the cluster event and puts the cluster event into a cache, and the Add triggers the state machine to inform each sub-module service layer to carry out operations such as node attribute updating and the like, and the surviving node keeps I/O;
s207, after the updating is completed, a service representative is notified, the triggering of a Unpend event of the cluster after the node is added is waited, the service representative triggers a sub-service module to perform the Discard operation, the surviving node does not operate, and the I/O is continuously kept;
s208, the cluster completes view reconstruction operation and sends the UnpendDone event to a service representative;
s209, the service representative notifies the host computer after receiving the completion of the updating of the configuration of all the sub-modules, and immediately resumes the I/O.
Further, the Add/Unpend/UnpendDone events in step S205 are the node-join, un-pend, and un-pend-completion events, respectively.
Further, in step S208, after each submodule has completed the Discard operation, the service representative uniformly triggers each submodule's configuration module to interrupt I/O, perform the preferred-node switch, and update and synchronize the metadata.
Further, after the second step, the communication and remote-query calls from the presentation-layer code into the business-layer code are reduced.
A device for rapidly recovering IO (input/output) after a single-node failure of a storage system comprises a service representative module, a storage service submodule, and a cluster event management module. The service representative module queries and calls the service interface to execute a service, and the storage service submodule calls the service representative module's interface to implement the corresponding service; the service representative module acts as an intermediate representative connecting the storage service submodule with the cluster event management module and uniformly exposes the interface outward; the service representative module serializes the management of cluster events and controls the uniform issuing of events and the processing of their responses.
The beneficial effects of the invention are as follows: an intermediate representative connects the storage service submodules with the cluster event management, reducing the communication complexity between them; the cluster service exposes a uniform interface outward, the service representative queries and calls the service interface to execute the related services, and a client implements the corresponding service simply by calling the representative's simplified interface. Communication and remote-query calls from presentation-layer code into business-layer code are reduced, which shortens the communication time and the time differences caused by modules cross-triggering interruption and recovery, and ensures an optimal interruption time.
Drawings
Fig. 1 is an architecture diagram of a method for fast IO recovery from a single node failure in a storage system according to a preferred embodiment of the present invention.
Fig. 2 is a structural diagram of an apparatus for rapidly recovering IO due to a single-node failure in a storage system according to the present invention.
Detailed Description
The following detailed description of preferred embodiments of the present invention, taken in conjunction with the accompanying drawings, will make the advantages and features of the invention easier for those skilled in the art to understand and will define the scope of the invention more clearly.
In a first aspect, referring to fig. 1, an embodiment of the present invention provides a method for rapidly recovering IO after a single-node failure in a storage system. First, a service-representative pattern is adopted: an intermediate representative connects the storage service submodules with the cluster event management module, reducing the communication complexity between them; the service representative exposes a uniform interface outward and queries and calls the service interface to execute the related services, so a service submodule implements the corresponding service simply by calling the representative's simplified interface. Then, the service representative serializes the management of cluster events and controls the uniform issuing of events and the processing of their responses, reducing the time differences caused by submodules cross-triggering interruption and recovery as events are issued and answered. Finally, the communication and remote-query calls from presentation-layer code into business-layer code are reduced, shortening the communication time and ensuring a minimal interruption time.
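The service-representative design above is essentially a mediator: submodules and the cluster event manager talk only to the representative, which exposes one simplified entry point. The following sketch illustrates that shape; the class and method names are assumptions for illustration, not the patent's actual API.

```python
# Hedged sketch of the "service representative" (mediator) design: events are
# issued uniformly to every registered submodule, and responses are collected
# in one place instead of cross-triggering between modules.

class ServiceRepresentative:
    """Mediator between the cluster event manager and storage submodules."""
    def __init__(self):
        self.submodules = {}

    def register(self, name, handler):
        # Each storage service submodule (cache, volume, RAID, ...) registers
        # one handler behind the representative's uniform interface.
        self.submodules[name] = handler

    def dispatch(self, event):
        # Uniform issuing: one event goes to all submodules; all responses
        # come back to the representative for unified processing.
        return {name: handler(event) for name, handler in self.submodules.items()}

class ClusterEventManager:
    def __init__(self, representative):
        self.rep = representative

    def node_event(self, event):
        # The cluster needs only the mediator's single entry point.
        return self.rep.dispatch(event)
```

With this wiring, adding a submodule changes nothing on the cluster side: it simply registers one more handler with the representative.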
The quick recovery I/O program implementation mainly comprises the following steps:
(1) creating a cache for data communication between a cluster host end and a service representative end, wherein the cache is stored in cluster configuration management;
(2) transmitting the event to be processed to a cache of a service representative;
(3) setting a service representative attribute class and a state machine, controlling and issuing events to the sub-modules, and responding to the processing results of the modules;
(4) controlling and starting a state machine service representative algorithm;
(5) waiting for a cluster single-node join/leave event to occur, reading the received event from the service representative's cache into the service representative state machine, and completing the update of the event-issuing state and the issuing of the host I/O notification.
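Steps (1) to (5) above can be sketched as a small cached-event state machine. This is an illustrative model only, assuming a simple FIFO cache and two states; the patent does not specify the cache structure or state names.

```python
# Illustrative sketch of steps (1)-(5): events are written to a cache shared
# between the cluster host side and the representative side, then the state
# machine drains the cache and records issue-state updates.

from collections import deque

class RepresentativeStateMachine:
    def __init__(self):
        self.cache = deque()        # step (1): host <-> representative cache
        self.state = "IDLE"
        self.log = []               # event-issuing state updates

    def push_event(self, event):
        # Step (2): an event to be processed is written into the cache.
        self.cache.append(event)

    def run_once(self):
        # Steps (4)-(5): drain one cached join/leave event through the
        # state machine, then return to IDLE once issuing is recorded.
        if not self.cache:
            return None
        event = self.cache.popleft()
        self.state = "ISSUING"
        self.log.append(("issued", event))
        self.state = "IDLE"         # issue state updated; host I/O notified
        return event
```

In a real implementation the cache would live in cluster configuration management and `run_once` would block on event arrival; the FIFO drain shown here only demonstrates the serialized ordering.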
The method for controlling and optimizing the IO fast recovery I/O program by adopting the service representative mode mainly comprises the following steps:
(1) when a node leaves, the cluster generates a Pend event and sends it to the service representative side; the surviving nodes of the I/O stack service submodules hold I/O and handle, on the server side, the transactions required by the node's departure;
(2) the cluster BOSS node generates a Remove event after switching and reconstructing the view, and sends the Remove event to a service representative terminal, and the service representative terminal judges whether the service processing of each service sub-module is completed;
(3) the service representative monitors that all service submodules of the I/O are processed, uniformly issues a Remove event to each service configuration management module, and triggers the I/O to start interruption;
(4) after the IO interruption, all submodules together perform operations such as the preferred-node switch and metadata update and synchronization, and notify the service representative upon completion;
(5) the service representative informs the host computer after receiving the completion of the updating of all the sub-module configurations, and immediately recovers I/O without returning to the cluster;
(6) when the node is added, the cluster generates an Add/Unpend/UnpendDone event;
(7) the service representative receives the cluster event and puts the cluster event into a cache, and the Add triggers a state machine to inform each submodule service layer to carry out node attribute updating and other operations, and the surviving node keeps I/O;
(8) after the update is completed, notifying a service representative, waiting for the triggering of a Unpend event of the cluster after the node is added, triggering a sub-service module by the service representative to perform the Discard operation, not operating the surviving node, and continuously keeping I/O;
(9) the cluster completes operations such as view reconstruction and sends an UnpendDone event to the service representative; at this point each submodule has finished the Discard operation, and the service representative uniformly triggers each submodule's configuration module to interrupt I/O and perform the preferred-node switch and metadata update and synchronization;
(10) and the service representative informs the host computer after receiving the completion of the updating of all the sub-module configurations, and immediately recovers the I/O.
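The ordering guarantee in the node-leave flow (steps (1) to (5) above) is the core of the speed-up: I/O is merely held during the Pend phase, interrupted only after every submodule has finished its server-side work, and resumed as soon as all configurations are updated, without a round trip back to the cluster. A condensed sketch of that ordering, with invented function and step names:

```python
# Hedged sketch of the node-leave flow: the returned timeline shows that the
# single I/O interruption is deferred until all leave transactions are done
# and ends as soon as all submodule configurations are updated.

def handle_node_leave(submodules, notify_host):
    timeline = []
    timeline.append("Pend: surviving nodes hold I/O")          # step (1)
    for m in submodules:                                       # steps (2)-(3)
        timeline.append(f"{m}: leave transaction done")
    timeline.append("Remove issued to all config modules; I/O interrupted")
    for m in submodules:                                       # step (4)
        timeline.append(f"{m}: preferred-node switch + metadata sync")
    # Step (5): host notified directly, no return trip to the cluster.
    notify_host("resume I/O")
    timeline.append("host notified: I/O resumed immediately")
    return timeline
```

Because the representative batches the Remove event after all submodules report ready, the interruption window covers only the switch-and-sync phase rather than the whole recovery.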
In a second aspect, based on the same inventive concept as the method for rapidly recovering IO from a single node failure in a storage system in the foregoing embodiment, an embodiment of the present specification further provides a device for rapidly recovering IO from a single node failure in a storage system, as shown in fig. 2, including a service representative module, a storage service sub-module, and a cluster event management module; the service representative module inquires and calls a service interface to execute a service, and the storage service sub-module calls the interface to the service representative module to realize a response service; the service representative module is used as a middle representative to connect the storage service submodule and the cluster event management module, and the service representative module uniformly exposes interfaces outwards; the service representative module manages cluster event serialization and controls the uniform issuing and response processing of events.
The above description is only an embodiment of the present invention and is not intended to limit the scope of the invention; all equivalent structural or process modifications made using the contents of the present specification and drawings, whether applied directly or indirectly in other related technical fields, are likewise included within the scope of the invention.
Claims (9)
1. A method for rapidly recovering IO (input/output) after a single-node failure of a storage system, characterized by comprising the following steps: first, a service representative queries and calls the service interface to execute a service, a storage service submodule calls the service representative's interface to implement the corresponding service, the service representative connects the storage service submodule with the cluster event management module, and the service representative uniformly exposes the interface outward; second, the service representative serializes the management of cluster events and controls the uniform issuing of events and the processing of their responses.
2. The method for rapidly recovering IO due to single-node failure in a storage system according to claim 1, wherein the first step specifically includes the following steps:
s100, transmitting an event needing to be processed to a cache of a service representative, wherein the cache is created at a cluster host end and is used for carrying out data communication with the service representative end;
s101, setting a service representative attribute class and a state machine;
s102, controlling and starting a state machine service representative algorithm;
s103, waiting for the occurrence of the cluster single node joining leaving event, reading the received event into the service representative state machine from the cache of the service representative, and completing the update of the event issuing state and the issuing of the notification host I/O.
3. The method for rapidly recovering IO due to single-node failure in a storage system according to claim 2, wherein the cache in step S100 is stored in cluster configuration management.
4. The method according to claim 2, wherein the state machine in step S101 is used to control issuing of events to the sub-modules and responding to the processing results of the modules.
5. The method for rapidly recovering IO due to single-node failure in a storage system according to claim 1, wherein the second step specifically includes the following steps:
s200, when the node leaves, the cluster generates Pend and sends the Pend to the service representative end, the survival node of the I/O stack service submodule keeps I/O and processes the transaction processing required by the leaving of the node at the service end;
s201, generating a Remove event after the cluster BOSS nodes switch and reconstruct the view, sending the Remove event to a service representative end, and judging whether the service processing of each service sub-module is finished by the service representative end;
s202, the service representative monitors all service submodules of the I/O to complete processing, uniformly issues a Remove event to each service configuration management module, and triggers the I/O to start interruption;
s203, after IO interruption, all the sub-modules perform preferred node switching together, update and synchronize metadata, and notify a service representative after completion;
s204, the service representative notifies the host computer after receiving the completion of the updating of the configuration of all the sub-modules, and immediately resumes the I/O;
s205, when the node is added, the cluster generates an Add/Unpend/UnpendDone event;
s205, the service representative receives the cluster event and puts the cluster event into a cache, and the Add triggers the state machine to inform each sub-module service layer to carry out operations such as node attribute updating and the like, and the surviving node keeps I/O;
s207, after the updating is completed, a service representative is notified, the triggering of a Unpend event of the cluster after the node is added is waited, the service representative triggers a sub-service module to perform the Discard operation, the surviving node does not operate, and the I/O is continuously kept;
s208, the cluster completes view reconstruction operation and sends the UnpendDone event to a service representative;
s209, the service representative notifies the host computer after receiving the completion of the updating of the configuration of all the sub-modules, and immediately resumes the I/O.
6. The method according to claim 5, wherein the Add/Unpend/UnpendDone events in step S205 are the node-join, un-pend, and un-pend-completion events, respectively.
7. The method according to claim 5, wherein in step S208, after each submodule has completed the Discard operation, the service representative uniformly triggers each submodule's configuration module to interrupt I/O, perform the preferred-node switch, and update and synchronize the metadata.
8. The method according to claim 1, wherein the second step is followed by reducing the communication and remote-query calls from the presentation-layer code into the business-layer code.
9. A device for rapidly recovering IO (input/output) after a single-node failure of a storage system, characterized by comprising a service representative module, a storage service submodule, and a cluster event management module; the service representative module queries and calls the service interface to execute a service, and the storage service submodule calls the service representative module's interface to implement the corresponding service; the service representative module acts as an intermediate representative connecting the storage service submodule with the cluster event management module and uniformly exposes the interface outward; the service representative module serializes the management of cluster events and controls the uniform issuing of events and the processing of their responses.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010987811.0A CN111949452B (en) | 2020-09-18 | 2020-09-18 | Method and device for rapidly recovering IO (input/output) in single-node fault of storage system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111949452A true CN111949452A (en) | 2020-11-17 |
CN111949452B CN111949452B (en) | 2022-09-20 |
Family
ID=73357483
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010987811.0A Active CN111949452B (en) | 2020-09-18 | 2020-09-18 | Method and device for rapidly recovering IO (input/output) in single-node fault of storage system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111949452B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102868754A (en) * | 2012-09-26 | 2013-01-09 | 北京联创信安科技有限公司 | High-availability method, node device and system for achieving cluster storage |
CN109032830A (en) * | 2018-07-25 | 2018-12-18 | 广东浪潮大数据研究有限公司 | A kind of fault recovery method of distributed memory system, system and associated component |
CN111158779A (en) * | 2019-12-24 | 2020-05-15 | 深圳云天励飞技术有限公司 | Data processing method and related equipment |
Also Published As
Publication number | Publication date |
---|---|
CN111949452B (en) | 2022-09-20 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |