CN111949452A - Method and device for rapidly recovering IO (input/output) in single-node fault of storage system - Google Patents

Info

Publication number
CN111949452A
CN111949452A CN202010987811.0A CN202010987811A CN111949452A CN 111949452 A CN111949452 A CN 111949452A CN 202010987811 A CN202010987811 A CN 202010987811A CN 111949452 A CN111949452 A CN 111949452A
Authority
CN
China
Prior art keywords
service
service representative
event
cluster
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010987811.0A
Other languages
Chinese (zh)
Other versions
CN111949452B (en)
Inventor
贺坤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202010987811.0A
Publication of CN111949452A
Application granted
Publication of CN111949452B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3006 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3051 Monitoring arrangements for monitoring the configuration of the computing system or of the computing system component, e.g. monitoring the presence of processing resources, peripherals, I/O links, software programs

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Hardware Redundancy (AREA)

Abstract

The invention discloses a method and a device for rapidly recovering I/O after a single-node failure in a storage system. The design follows a service representative pattern: an intermediate service representative connects the client and the server, which reduces the complexity of communication between them. The service exposes a single unified interface; the service representative looks up and calls the service interfaces that carry out the relevant work, and a client obtains the service it needs by calling a simplified interface on the service representative. In addition, presentation-layer code performs less communication with, and fewer remote queries of, service-layer code. Together these measures reduce communication time and the time differences caused by cross-triggered interruption and recovery in the individual modules, keeping the I/O interruption time as short as possible.

Description

Method and device for rapidly recovering IO (input/output) in single-node fault of storage system
Technical Field
The invention relates to the technical field of storage, in particular to a method and a device for rapidly recovering IO (input/output) in a single-node fault of a storage system.
Background
Existing handling of I/O interruption during node fault recovery events mainly uses the observer pattern, with the storage system cluster as the observed subject and sub-modules such as the forwarding layer, cache, volume and RAID as observers. This approach has the following problems: 1. The subject has many direct and indirect observers, so notifying all of them takes a long time. 2. Cyclic dependencies between the observers and the subject can trigger cyclic calls between them, which may crash the system. 3. The observer pattern gives observers no mechanism for learning how the subject changed; they learn only that it changed. 4. Notifying the observers takes the subject considerable time, and the subject must check that every observer has recovered before it notifies the host to resume I/O.
Disclosure of Invention
The main technical problem addressed by the invention is to provide a method and a device for rapidly recovering I/O after a single-node failure in a storage system, which reduce communication time and the time differences caused by cross-triggered interruption and recovery in the individual modules, and keep the interruption time as short as possible.
To solve the above technical problem, the invention adopts the following technical scheme: a method for rapidly recovering I/O after a single-node failure in a storage system, comprising the following steps. First, a service representative looks up and calls service interfaces to execute services; the storage service sub-modules call the service representative's interface to obtain response services; the service representative connects the storage service sub-modules with the cluster event management module and exposes a unified interface. Second, the service representative manages cluster events in serialized form and controls the unified issuing of events and the handling of responses.
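The intermediary role described above can be illustrated with a minimal sketch. All names here (ServiceRepresentative, register, publish, make_module) are hypothetical and chosen only for illustration; the patent does not define a programming interface. The sketch assumes the cluster event manager calls only the representative, and the representative alone forwards events to the registered sub-modules.

```python
from typing import Callable, Dict, List

# Hypothetical names throughout; the patent does not specify an API.
Handler = Callable[[str], None]


class ServiceRepresentative:
    """Single point of contact between the cluster event manager and sub-modules."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}
        self.log: List[str] = []

    def register(self, module_name: str, handler: Handler) -> None:
        # Each storage sub-module registers once with the representative.
        self._handlers[module_name] = handler

    def publish(self, event: str) -> None:
        # Called by the cluster event manager, which never addresses
        # the sub-modules directly.
        for name, handler in self._handlers.items():
            self.log.append(f"{event}->{name}")
            handler(event)


def make_module(name: str, seen: List[str]) -> Handler:
    """Build a trivial sub-module handler that records what it handled."""
    def handle(event: str) -> None:
        seen.append(f"{name} handled {event}")
    return handle


seen: List[str] = []
rep = ServiceRepresentative()
for module_name in ("forwarding", "cache", "volume", "raid"):
    rep.register(module_name, make_module(module_name, seen))
rep.publish("Remove")
```

With this shape, adding a sub-module means one more register call; the event source does not change.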
Further, the first step specifically includes the following steps:
S100, transmitting each event to be processed to the cache of the service representative, wherein the cache is created at the cluster host end and is used for data communication with the service representative end;
S101, setting up the service representative attribute class and the state machine;
S102, starting and controlling the service representative's state machine algorithm;
S103, waiting for a single-node join or leave event in the cluster, reading the received event from the service representative's cache into the service representative state machine, and completing the update of the event issuing state and the issuing of the host I/O notification.
Further, the cache in step S100 is stored in the cluster configuration management.
Further, the state machine in step S101 is used to control issuing events to the sub-modules, and respond to the processing results of each module.
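Steps S100 to S103 can be sketched as a buffered event queue feeding a small state machine. This is an illustrative sketch under stated assumptions, not the patented implementation: the class name, the two states and the method names are all hypothetical, and the sketch assumes that an event counts as complete only after every sub-module has responded.

```python
from collections import deque
from enum import Enum, auto


class State(Enum):
    WAITING = auto()  # S103: waiting for a node join/leave event
    ISSUED = auto()   # an event has been issued; responses are outstanding


class RepresentativeStateMachine:
    """Hypothetical sketch of the cache plus state machine from S100 to S103."""

    def __init__(self, modules):
        self.modules = set(modules)
        self.cache = deque()        # S100: events are buffered here
        self.state = State.WAITING  # S101: state machine set up
        self.pending = set()        # sub-modules that have not responded yet
        self.current = None
        self.completed = []         # events acknowledged by every sub-module

    def enqueue(self, event):
        self.cache.append(event)

    def step(self):
        """S103: read the next cached event and issue it to all sub-modules."""
        if self.state is not State.WAITING or not self.cache:
            return None
        self.current = self.cache.popleft()
        self.pending = set(self.modules)
        self.state = State.ISSUED
        return self.current

    def respond(self, module):
        """Record one sub-module's response; finish when all have answered."""
        self.pending.discard(module)
        if self.state is State.ISSUED and not self.pending:
            self.completed.append(self.current)
            self.state = State.WAITING


sm = RepresentativeStateMachine(["cache", "volume", "raid"])
sm.enqueue("Pend")
sm.enqueue("Remove")
first = sm.step()    # issues "Pend"
blocked = sm.step()  # None: "Pend" is still awaiting responses
for name in ("cache", "volume", "raid"):
    sm.respond(name)
second = sm.step()   # issues "Remove"
```

Because a new event is issued only from the WAITING state, events reach the sub-modules strictly one at a time, which is one way to read the serialized event management the text describes.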
Further, the second step specifically includes the following steps:
S200, when a node leaves, the cluster generates a Pend event and sends it to the service representative end; on the surviving node, the I/O stack service sub-modules keep I/O running and carry out the transaction processing that the node's departure requires on the service side;
S201, after the cluster BOSS node has been switched and the view reconstructed, the cluster generates a Remove event and sends it to the service representative end, and the service representative end determines whether each service sub-module has finished its service processing;
S202, once the service representative observes that all I/O service sub-modules have finished processing, it issues the Remove event uniformly to each service configuration management module and triggers the start of the I/O interruption;
S203, after I/O is interrupted, all sub-modules together perform preferred node switching and update and synchronize metadata, and notify the service representative on completion;
S204, after being notified that all sub-modules have finished updating their configuration, the service representative notifies the host, and I/O resumes immediately;
S205, when a node joins, the cluster generates Add, Unpend and UnpendDone events;
S206, the service representative receives each cluster event and places it in the cache; the Add event triggers the state machine to notify each sub-module's service layer to perform operations such as updating node attributes, while the surviving node keeps I/O running;
S207, after the update is complete, the service representative is notified and waits for the cluster to trigger the Unpend event that follows the node join; the service representative then triggers the service sub-modules to perform the Discard operation, while the surviving node performs no operation and keeps I/O running;
S208, the cluster completes the view reconstruction and sends the UnpendDone event to the service representative;
S209, after being notified that all sub-modules have finished updating their configuration, the service representative notifies the host, and I/O resumes immediately.
Further, the Add/Unpend/UnpendDone events in step S205 are the node-join event, the unpend (release) event and the unpend-completion event, respectively.
Further, in step S208, once each sub-module has completed the Discard operation, the service representative uniformly triggers each sub-module's configuration module to interrupt I/O, perform preferred node switching, and update and synchronize metadata.
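The node-leave half of the sequence (from the Pend event to the host resuming I/O) can be traced with a short simulation. This is a sketch only: the function name, the trace strings and the slow_module parameter are hypothetical, and the behaviour of holding the Remove event while a sub-module is still busy is an assumption drawn from the check that the service representative performs on receiving Remove.

```python
from typing import List, Optional

# Sub-module names follow the examples given in the Background section.
MODULES = ["forwarding", "cache", "volume", "raid"]


def node_leave_flow(modules: List[str], slow_module: Optional[str] = None) -> List[str]:
    """Return a trace of the node-leave sequence; slow_module never finishes."""
    trace: List[str] = []
    # Pend arrives; surviving-node sub-modules do leave-related work with I/O running.
    trace.append("cluster->rep: Pend")
    finished = set()
    for m in modules:
        if m == slow_module:
            continue
        finished.add(m)
        trace.append(f"{m}: leave processing done, I/O still running")
    # Remove arrives after the view is rebuilt; the representative checks progress.
    trace.append("cluster->rep: Remove")
    if finished != set(modules):
        # Assumption: Remove is not issued until every sub-module has finished.
        trace.append("rep: holding Remove until all sub-modules finish")
        return trace
    # Remove is issued to all sub-modules at once, so I/O is interrupted once.
    trace.append("rep->all: Remove, I/O interrupted")
    acks = set()
    for m in modules:
        acks.add(m)
        trace.append(f"{m}: preferred node switched, metadata synced")
    # Only after every acknowledgement does the host get told to resume I/O.
    if acks == set(modules):
        trace.append("rep->host: resume I/O")
    return trace


trace = node_leave_flow(MODULES)
stalled = node_leave_flow(MODULES, slow_module="raid")
```

In the normal run the interruption starts and ends exactly once for all sub-modules; in the stalled run I/O is never interrupted, because the Remove event is never issued.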
Further, after the second step, the communication with and remote querying of service-layer code from presentation-layer code is reduced.
A device for rapidly recovering I/O after a single-node failure in a storage system comprises a service representative module, storage service sub-modules and a cluster event management module. The service representative module looks up and calls service interfaces to execute services, and the storage service sub-modules call the service representative module's interface to obtain response services. The service representative module acts as an intermediary connecting the storage service sub-modules with the cluster event management module and exposes a unified interface. The service representative module manages cluster events in serialized form and controls the unified issuing of events and the handling of responses.
The beneficial effects of the invention are as follows. An intermediary connects the storage service sub-modules with cluster event management, which reduces the complexity of communication between them. The cluster service exposes a unified interface; the service representative looks up and calls the service interfaces that carry out the relevant work, and a client obtains the service it needs simply by calling a simplified interface on the service representative. Presentation-layer code performs less communication with, and fewer remote queries of, service-layer code, so communication time and the time differences caused by cross-triggered interruption and recovery in the individual modules are reduced, and the interruption time is kept as short as possible.
Drawings
Fig. 1 is an architecture diagram of a method for fast IO recovery from a single node failure in a storage system according to a preferred embodiment of the present invention.
Fig. 2 is a structural diagram of an apparatus for rapidly recovering IO due to a single-node failure in a storage system according to the present invention.
Detailed Description
The preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings, so that those skilled in the art can more readily understand the advantages and features of the invention and the scope of protection is defined more clearly.
In a first aspect, referring to Fig. 1, an embodiment of the present invention provides a method for rapidly recovering I/O after a single-node failure in a storage system. First, a service representative pattern is adopted: an intermediary connects the storage service sub-modules with the cluster event management module, reducing the complexity of communication between them. The service representative exposes a unified interface and looks up and calls the service interfaces that carry out the relevant work, so a service sub-module obtains the response service it needs simply by calling a simplified interface on the service representative. Next, the service representative manages cluster events in serialized form and controls the unified issuing of events and the handling of responses, which reduces the time differences caused by sub-modules cross-triggering interruption and recovery while events are issued and answered. Finally, presentation-layer code performs less communication with, and fewer remote queries of, service-layer code, which cuts communication time and keeps the interruption time to a minimum.
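The effect of uniform issuing on the interruption window can be illustrated with a simplified timing model. The model and every number in it are assumptions made for illustration and do not come from the patent: each sub-module is taken to interrupt I/O at some start time and to need a fixed duration to switch and synchronize, and host I/O is taken to be unavailable from the earliest interruption until the latest recovery.

```python
def staggered_window(starts_ms, durations_ms):
    """Interruption window when sub-modules begin at different times."""
    begin = min(starts_ms)
    end = max(s + d for s, d in zip(starts_ms, durations_ms))
    return end - begin


def uniform_window(durations_ms):
    """Interruption window when all sub-modules begin together."""
    return max(durations_ms)


# Hypothetical per-module processing times and staggered start offsets (ms).
durations = [40, 25, 60, 30]
starts = [0, 15, 30, 45]

staggered = staggered_window(starts, durations)  # latest end is 30 + 60 = 90
uniform = uniform_window(durations)              # bounded by the slowest module
```

Under these assumed numbers the staggered case gives a 90 ms window and the uniform case 60 ms. In this model the uniform window equals the slowest sub-module's processing time, and any stagger in the start times can only add to it.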
The fast I/O recovery program is implemented mainly through the following steps:
(1) creating a cache for data communication between the cluster host end and the service representative end, the cache being stored in cluster configuration management;
(2) transmitting each event to be processed to the cache of the service representative;
(3) setting up the service representative attribute class and the state machine, which control the issuing of events to the sub-modules and respond to the processing results of each module;
(4) starting and controlling the service representative's state machine algorithm;
(5) waiting for a single-node join or leave event in the cluster, reading the received event from the service representative's cache into the service representative state machine, and completing the update of the event issuing state and the issuing of the host I/O notification.
Controlling and optimizing the fast I/O recovery program with the service representative pattern mainly comprises the following steps:
(1) when a node leaves, the cluster generates a Pend event and sends it to the service representative end; on the surviving node, the I/O stack service sub-modules keep I/O running and carry out the transaction processing that the node's departure requires on the service side;
(2) after the cluster BOSS node has been switched and the view reconstructed, the cluster generates a Remove event and sends it to the service representative end, and the service representative end determines whether each service sub-module has finished its service processing;
(3) once the service representative observes that all I/O service sub-modules have finished processing, it issues the Remove event uniformly to each service configuration management module and triggers the start of the I/O interruption;
(4) after I/O is interrupted, all sub-modules together perform operations such as preferred node switching and metadata update and synchronization, and notify the service representative on completion;
(5) after being notified that all sub-modules have finished updating their configuration, the service representative notifies the host, and I/O resumes immediately without going back through the cluster;
(6) when a node joins, the cluster generates Add, Unpend and UnpendDone events;
(7) the service representative receives each cluster event and places it in the cache; the Add event triggers the state machine to notify each sub-module's service layer to perform operations such as updating node attributes, while the surviving node keeps I/O running;
(8) after the update is complete, the service representative is notified and waits for the cluster to trigger the Unpend event that follows the node join; the service representative then triggers the service sub-modules to perform the Discard operation, while the surviving node performs no operation and keeps I/O running;
(9) the cluster completes operations such as view reconstruction and sends the UnpendDone event to the service representative; by this point each sub-module has completed the Discard operation, and the service representative uniformly triggers each sub-module's configuration module to interrupt I/O and perform preferred node switching and metadata update and synchronization;
(10) after being notified that all sub-modules have finished updating their configuration, the service representative notifies the host, and I/O resumes immediately.
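The node-join half of the sequence, steps (6) to (10), can be sketched as an ordered event handler that tracks whether I/O is running. The class name, the history list and the strict ordering check are hypothetical illustrations; the patent lists the Add, Unpend and UnpendDone events in this order but does not say how out-of-order events are handled.

```python
ORDER = ("Add", "Unpend", "UnpendDone")


class JoinCoordinator:
    """Hypothetical handler for the node-join event sequence."""

    def __init__(self):
        self.io_running = True
        self.history = []  # (label, whether I/O was running at that point)
        self._next = 0     # index into ORDER of the event expected next

    def handle(self, event):
        expected = ORDER[self._next]
        if event != expected:
            # Assumption: events outside the listed order are rejected.
            raise ValueError(f"expected {expected}, got {event}")
        self._next = (self._next + 1) % len(ORDER)
        if event == "Add":
            # Step (7): sub-modules update node attributes; I/O keeps running.
            self.history.append(("Add", self.io_running))
        elif event == "Unpend":
            # Step (8): Discard is performed; the surviving node keeps I/O running.
            self.history.append(("Unpend", self.io_running))
        else:
            # Step (9): uniform interruption, node switching, metadata sync.
            self.io_running = False
            self.history.append(("UnpendDone", self.io_running))
            # Step (10): all updates reported, so the host resumes I/O.
            self.io_running = True
            self.history.append(("Resume", self.io_running))


coordinator = JoinCoordinator()
for event in ORDER:
    coordinator.handle(event)
```

The history shows that I/O stays up through Add and Unpend and is interrupted only in the final step, which matches the ordering the steps above describe.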
In a second aspect, based on the same inventive concept as the method of the foregoing embodiment, an embodiment of this specification further provides a device for rapidly recovering I/O after a single-node failure in a storage system. As shown in Fig. 2, the device comprises a service representative module, storage service sub-modules and a cluster event management module. The service representative module looks up and calls service interfaces to execute services, and the storage service sub-modules call the service representative module's interface to obtain response services. The service representative module acts as an intermediary connecting the storage service sub-modules with the cluster event management module and exposes a unified interface. The service representative module manages cluster events in serialized form and controls the unified issuing of events and the handling of responses.
The above description is only one embodiment of the present invention and does not limit its scope. Any equivalent structure or equivalent process derived from this specification and the drawings, whether applied directly or indirectly in other related technical fields, falls within the scope of protection of the present invention.

Claims (9)

1. A method for rapidly recovering IO (input/output) in a single-node failure of a storage system is characterized by comprising the following steps: firstly, a service representative inquires and calls a service interface to execute a service, a storage service submodule calls the service representative interface to realize a response service, the service representative is connected with the storage service submodule and a cluster event management module, and the service representative uniformly exposes the interface outwards; and secondly, serializing the management cluster events by the service representatives, and controlling the uniform issuing and response processing of the events.
2. The method for rapidly recovering IO due to single-node failure in a storage system according to claim 1, wherein the first step specifically includes the following steps:
S100, transmitting an event needing to be processed to a cache of a service representative, wherein the cache is created at a cluster host end and is used for carrying out data communication with the service representative end;
S101, setting a service representative attribute class and a state machine;
S102, controlling and starting a state machine service representative algorithm;
S103, waiting for the occurrence of the cluster single node joining or leaving event, reading the received event into the service representative state machine from the cache of the service representative, and completing the update of the event issuing state and the issuing of the notification host I/O.
3. The method for rapidly recovering IO due to single-node failure in a storage system according to claim 2, wherein the cache in step S100 is stored in cluster configuration management.
4. The method according to claim 2, wherein the state machine in step S101 is used to control issuing of events to the sub-modules and responding to the processing results of the modules.
5. The method for rapidly recovering IO due to single-node failure in a storage system according to claim 1, wherein the second step specifically includes the following steps:
S200, when the node leaves, the cluster generates Pend and sends the Pend to the service representative end, the survival node of the I/O stack service submodule keeps I/O and processes the transaction processing required by the leaving of the node at the service end;
S201, generating a Remove event after the cluster BOSS nodes switch and reconstruct the view, sending the Remove event to a service representative end, and judging whether the service processing of each service sub-module is finished by the service representative end;
S202, the service representative monitors all service submodules of the I/O to complete processing, uniformly issues a Remove event to each service configuration management module, and triggers the I/O to start interruption;
S203, after IO interruption, all the sub-modules perform preferred node switching together, update and synchronize metadata, and notify a service representative after completion;
S204, the service representative notifies the host computer after receiving the completion of the updating of the configuration of all the sub-modules, and immediately resumes the I/O;
S205, when the node is added, the cluster generates an Add/Unpend/UnpendDone event;
S206, the service representative receives the cluster event and puts the cluster event into a cache, and the Add triggers the state machine to inform each sub-module service layer to carry out operations such as node attribute updating and the like, and the surviving node keeps I/O;
S207, after the updating is completed, a service representative is notified, the triggering of a Unpend event of the cluster after the node is added is waited, the service representative triggers a sub-service module to perform the Discard operation, the surviving node does not operate, and the I/O is continuously kept;
S208, the cluster completes view reconstruction operation and sends the UnpendDone event to a service representative;
S209, the service representative notifies the host computer after receiving the completion of the updating of the configuration of all the sub-modules, and immediately resumes the I/O.
6. The method according to claim 5, wherein the Add/Unpend/UnpendDone event in step S205 is a node join, cancel, or cancel completion event.
7. The method according to claim 5, wherein in step S208, each submodule has completed Discard operation, and the service representative uniformly triggers each submodule configuration module to interrupt I/O, perform preferred node switching, metadata update, and synchronization.
8. The method of claim 1, wherein the second step is followed by reducing communication or remote query function for service layer code in the presentation layer code.
9. A device for rapidly recovering IO (input/output) in single-node failure of a storage system is characterized by comprising a service representative module, a storage service submodule and a cluster event management module; the service representative module inquires and calls a service interface to execute a service, and the storage service sub-module calls the interface to the service representative module to realize a response service; the service representative module is used as a middle representative to connect the storage service submodule and the cluster event management module, and the service representative module uniformly exposes interfaces outwards; the service representative module manages cluster event serialization and controls the uniform issuing and response processing of events.
CN202010987811.0A 2020-09-18 2020-09-18 Method and device for rapidly recovering IO (input/output) in single-node fault of storage system Active CN111949452B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010987811.0A CN111949452B (en) 2020-09-18 2020-09-18 Method and device for rapidly recovering IO (input/output) in single-node fault of storage system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010987811.0A CN111949452B (en) 2020-09-18 2020-09-18 Method and device for rapidly recovering IO (input/output) in single-node fault of storage system

Publications (2)

Publication Number Publication Date
CN111949452A (en) 2020-11-17
CN111949452B (en) 2022-09-20

Family

ID=73357483

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010987811.0A Active CN111949452B (en) 2020-09-18 2020-09-18 Method and device for rapidly recovering IO (input/output) in single-node fault of storage system

Country Status (1)

Country Link
CN (1) CN111949452B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102868754A (en) * 2012-09-26 2013-01-09 北京联创信安科技有限公司 High-availability method, node device and system for achieving cluster storage
CN109032830A (en) * 2018-07-25 2018-12-18 广东浪潮大数据研究有限公司 A kind of fault recovery method of distributed memory system, system and associated component
CN111158779A (en) * 2019-12-24 2020-05-15 深圳云天励飞技术有限公司 Data processing method and related equipment

Also Published As

Publication number Publication date
CN111949452B (en) 2022-09-20

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant