CN109032830A - Fault recovery method and system for a distributed storage system, and related components
- Publication number: CN109032830A
- Application number: CN201810826771.4A
- Authority: CN (China)
- Prior art keywords: node, address, client, virtual, cluster
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0766—Error or fault reporting or storing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0793—Remedial or corrective actions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3003—Monitoring arrangements specially adapted to the computing system or computing system component being monitored
- G06F11/3006—Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Quality & Reliability (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
This application discloses a fault recovery method for a distributed storage system. The method includes: when node failure information is detected, using the master node to perform a cluster IP address reassignment operation on all normal nodes in the cluster, assigning virtual IP addresses to the normal nodes in one-to-one correspondence; querying the client information table in the cluster and, according to the query result, setting the normal node corresponding to each virtual IP address that has a client connected as a target node; and controlling each target node to send a TCP reconnection signal to its corresponding client, so as to restore the service connection. The method can quickly achieve fault recovery after a service node fails and improves the stability of the distributed storage system. This application also discloses a fault recovery system for a distributed storage system, a computer-readable storage medium and an electronic device, which have the same beneficial effects.
Description
Technical field
The present invention relates to the technical field of data storage, and in particular to a fault recovery method and system for a distributed storage system, a computer-readable storage medium, and an electronic device.
Background art
A distributed storage system stores data across multiple independent devices. A traditional network storage system keeps all data on a centralized storage server, which becomes both the performance bottleneck and the focal point for reliability and security, and cannot meet the needs of large-scale storage applications. A distributed network storage system uses a scalable architecture, shares the storage load across multiple storage servers, and uses a location server to locate stored information. This improves the reliability, availability and access efficiency of the system and also makes it easy to scale.
CTDB is a clustered TDB database that can be used by Samba or other applications to store data. CTDB has a virtual IP mechanism: after a node in the cluster fails, the service IP migrates (drifts) from that node to another node, so the service can recover automatically.
In the prior art, once the connection between a client and the cluster is broken, reconnecting takes a long time. The reason is that the retry timeout of a TCP connection follows an exponential backoff algorithm: if the peer IP cannot be reached, the client retries, but the interval between retries keeps growing, successively 1s, 3s, 6s, 12s, 24s, 48s, 64s, 64s. Consequently, if the client sends a reconnection signal at time A while the cluster virtual IP has not yet finished drifting, the client sends its next reconnection signal at time A+24s. Even if the cluster completes the IP drift at time A+5s, the client still has to wait 24s - 5s = 19s, so the overall service interruption is long.
Therefore, how to achieve fast fault recovery after a service node fails, and thereby improve the stability of the distributed storage system, is a technical problem that those skilled in the art currently need to solve.
Summary of the invention
The purpose of this application is to provide a fault recovery method and system for a distributed storage system, a computer-readable storage medium and an electronic device, which can quickly achieve fault recovery after a service node fails and improve the stability of the distributed storage system.
To solve the above technical problem, this application provides a fault recovery method for a distributed storage system, the fault recovery method comprising:
When node failure information is detected, using the master node to perform a cluster IP address reassignment operation on all normal nodes in the cluster, assigning virtual IP addresses to the normal nodes in one-to-one correspondence;
Querying the client information table in the cluster and, according to the query result, setting the normal node corresponding to each virtual IP address that has a client connected as a target node;
Controlling each target node to send a TCP reconnection signal to its corresponding client, so as to restore the service connection.
Optionally, before using the master node to perform the cluster IP address reassignment operation on all normal nodes in the cluster, the method further comprises:
Determining the failed node according to the node failure information, and judging whether the failed node is the master node;
If so, re-electing the master node from among all the normal nodes.
Optionally, after setting the normal node corresponding to each virtual IP address that has a client connected as a target node, the method further comprises:
Controlling all target nodes to send an ARP broadcast to all normal nodes in the cluster, so that all normal nodes update their ARP tables; wherein the ARP table stores the correspondence between virtual IP addresses and MAC addresses.
Optionally, the method further comprises:
When an information sending instruction is received, determining the target virtual IP address according to the information sending instruction;
Querying the ARP table for the MAC address corresponding to the target virtual IP address, and sending the information corresponding to the information sending instruction to that MAC address.
Optionally, setting the normal node corresponding to each virtual IP address that has a client connected as a target node comprises:
Querying, according to the client information table, whether each virtual IP address has a client connected;
If so, setting the normal node corresponding to that virtual IP address as a target node.
Optionally, the method further comprises:
Receiving, at a preset period, the client information sent by all nodes in the cluster, and updating the client information table according to the client information.
Optionally, the service node is a node running the CTDB service.
This application also provides a fault recovery system for a distributed storage system, the fault recovery system comprising:
An IP reassignment module, configured to, when node failure information is detected, use the master node to perform a cluster IP address reassignment operation on all normal nodes in the cluster, assigning virtual IP addresses to the normal nodes in one-to-one correspondence;
A target node determination module, configured to query the client information table in the cluster and, according to the query result, set the normal node corresponding to each virtual IP address that has a client connected as a target node;
A reconnection module, configured to control each target node to send a TCP reconnection signal to its corresponding client, so as to restore the service connection.
This application also provides a computer-readable storage medium on which a computer program is stored; when executed, the computer program implements the steps of the above fault recovery method for a distributed storage system.
This application also provides an electronic device comprising a memory and a processor, wherein a computer program is stored in the memory, and the processor implements the steps of the above fault recovery method for a distributed storage system when it calls the computer program in the memory.
The present invention provides a fault recovery method for a distributed storage system, comprising: when node failure information is detected, using the master node to perform a cluster IP address reassignment operation on all normal nodes in the cluster, assigning virtual IP addresses to the normal nodes in one-to-one correspondence; querying the client information table in the cluster and, according to the query result, setting the normal node corresponding to each virtual IP address that has a client connected as a target node; and controlling each target node to send a TCP reconnection signal to its corresponding client, so as to restore the service connection.
After a node failure occurs, this application assigns virtual IP addresses to all normal nodes. Because the client information table is stored in the cluster, this application can query whether each virtual IP address has a client connected and actively send a TCP reconnection signal to the clients connected to those virtual IP addresses. Because the operation that restores the service connection is performed actively by the target node once the virtual IP addresses have been assigned, there is no need to wait passively for the client's reconnection signal. This application can therefore quickly achieve fault recovery after a service node fails and improve the stability of the distributed storage system. This application also provides a fault recovery system for a distributed storage system, a computer-readable storage medium and an electronic device, which have the above beneficial effects and are not described again here.
Brief description of the drawings
In order to explain the embodiments of this application more clearly, the drawings needed for the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of this application; those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a fault recovery method for a distributed storage system provided by an embodiment of this application;
Fig. 2 is a flowchart of another fault recovery method for a distributed storage system provided by an embodiment of this application;
Fig. 3 is a schematic structural diagram of a fault recovery system for a distributed storage system provided by an embodiment of this application.
Detailed description of the embodiments
To make the purposes, technical solutions and advantages of the embodiments of this application clearer, the technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are some, but not all, of the embodiments of this application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in this application without creative effort fall within the protection scope of this application.
Referring to Fig. 1, Fig. 1 is a flowchart of a fault recovery method for a distributed storage system provided by an embodiment of this application.
The specific steps may include:
S101: when node failure information is detected, use the master node to perform a cluster IP address reassignment operation on all normal nodes in the cluster, assigning virtual IP addresses to the normal nodes in one-to-one correspondence;
This embodiment is implemented by default in a distributed storage system that contains multiple nodes. Detecting node failure information indicates that some node in the distributed storage system has failed. The failed node cannot maintain a normal service connection with clients, so to keep the service running the service connection between the client and the distributed storage system must be restored.
It should be noted that, in a distributed storage system, the virtual IP addresses of every node must be assigned by the master node. When the failed node is the master node, a new master node must first be determined in the cluster, and the new master node then assigns the virtual IP addresses. There are many ways to perform cluster IP reassignment; as a preferred implementation, the cluster IP reassignment function built into the CTDB service can be used to redistribute the virtual IPs. To illustrate this function: in a distributed storage system, every node runs the CTDB service, which provides the cluster's external virtual IPs, and a client can connect to any one virtual IP. Suppose the cluster currently has three nodes A, B and C, holding the virtual IPs A (192.168.0.11), B (192.168.0.12) and C (192.168.0.13). Client D is connected to 192.168.0.11, so 192.168.0.11 is the service IP. If node A fails, then after the CTDB service completes the IP drift the virtual IPs are distributed as follows: B (192.168.0.11, 192.168.0.12), C (192.168.0.13). The client can then establish a connection to the cluster through node B. It should be noted that when the CTDB service redistributes virtual IP addresses, an IP that has a client connected is not reassigned to another node; only the virtual IP addresses that belonged to the failed node are assigned to normal nodes.
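The three-node example above can be sketched in Python as follows. This is a minimal illustration only: the function name and the placement rule (give each orphaned virtual IP to the normal node holding the fewest, ties broken by node name) are assumptions made for this sketch and are not the algorithm CTDB actually uses.

```python
def reassign_vips(assignment, failed_node):
    """Move the failed node's virtual IPs to normal nodes; leave other IPs in place."""
    survivors = {n: list(ips) for n, ips in assignment.items() if n != failed_node}
    if not survivors:
        raise ValueError("no normal node left to take over")
    for vip in assignment.get(failed_node, []):
        # Hypothetical placement rule: fewest VIPs first, then node name.
        target = min(survivors, key=lambda n: (len(survivors[n]), n))
        survivors[target].append(vip)
    for ips in survivors.values():
        ips.sort()
    return survivors


before = {"A": ["192.168.0.11"], "B": ["192.168.0.12"], "C": ["192.168.0.13"]}
after = reassign_vips(before, "A")
print(after)
```

With this rule node B takes over 192.168.0.11, matching the distribution described in the text, while the addresses already held by B and C stay where they are.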
It can be understood that the virtual IP reassignment in this step is carried out only for the normal nodes in the cluster; failed nodes are not assigned virtual IP addresses. This embodiment by default divides all nodes in the cluster into two classes: failed nodes and normal nodes.
S102: query the client information table in the cluster and, according to the query result, set the normal node corresponding to each virtual IP address that has a client connected as a target node;
In this embodiment the client information table is stored in the cluster, so that after the virtual IP addresses have been reassigned the cluster can actively query which virtual IPs have clients connected. The client information table stores the correspondence between virtual IP addresses and clients, so it can be used to judge whether a given virtual IP address has established a service connection with a given client.
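The lookup described for S102 can be illustrated with the following Python sketch. The data shapes (a dictionary from virtual IP to owning node, and a dictionary from virtual IP to a set of client identifiers) and the function name are assumptions chosen for this illustration; the source does not specify how the table is laid out.

```python
def select_target_nodes(vip_owner, client_table):
    """Return {node: [(vip, client), ...]} for every VIP that has a client connected."""
    targets = {}
    for vip in sorted(client_table):
        clients = client_table[vip]
        node = vip_owner.get(vip)
        if not clients or node is None:
            continue  # no client on this VIP, or the VIP is not assigned to any node
        targets.setdefault(node, []).extend((vip, c) for c in sorted(clients))
    return targets


vip_owner = {"192.168.0.11": "B", "192.168.0.12": "B", "192.168.0.13": "C"}
client_table = {"192.168.0.11": {"D"}, "192.168.0.13": set()}
targets = select_target_nodes(vip_owner, client_table)
print(targets)
```

Only node B becomes a target node here, because 192.168.0.11 is the only virtual IP with a client recorded against it.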
The client information table in the cluster is built as follows: each node detects its local client connections in real time, broadcasts its client information to the other nodes of the cluster, and receives the client information sent by the other nodes. The ss command that ships with Linux can be used to retrieve the TCP (Transmission Control Protocol) connections on a specified port; each connection that matches is a client. Because the client information table is kept in memory, it is cleared when the system restarts and must then be repopulated. Entries in the client information table are added and deleted in only two ways: through local real-time detection, and through the information tables received from other nodes.
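As a hedged illustration of the local detection step, the Python sketch below extracts client entries from text shaped like the output of `ss -tn` (columns: state, receive queue, send queue, local address:port, peer address:port). The sample text, the service port 445 and the function name are assumptions for this sketch, and only IPv4 addresses are handled; the source states only that ss is used to retrieve TCP connections on a specified port.

```python
def parse_ss_established(ss_output, service_port):
    """Return {local_ip: {peer_ip, ...}} for established connections on service_port."""
    table = {}
    for line in ss_output.splitlines():
        fields = line.split()
        # Skip the header row and anything that is not an established connection.
        if len(fields) < 5 or fields[0] != "ESTAB":
            continue
        local_ip, local_port = fields[3].rsplit(":", 1)
        peer_ip, _peer_port = fields[4].rsplit(":", 1)
        if local_port == str(service_port):
            table.setdefault(local_ip, set()).add(peer_ip)
    return table


# Made-up sample text for illustration; not captured from a real system.
sample = """\
State  Recv-Q Send-Q Local Address:Port Peer Address:Port
ESTAB  0      0      192.168.0.11:445   192.168.0.100:51234
ESTAB  0      0      192.168.0.11:22    192.168.0.200:40000
LISTEN 0      128    0.0.0.0:445        0.0.0.0:*
"""
local_clients = parse_ss_established(sample, 445)
print(local_clients)
```

On a live node the text could be obtained by running the ss command through a subprocess; the resulting per-node table is what would be broadcast to the other nodes.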
When a virtual IP address assigned to a normal node has a client connected, the connection between that normal node and the client can be established, restoring the client's service connection. It should be noted that this application reassigns the virtual IP addresses across all normal nodes, and after the reassignment it cannot be known directly which node should connect to which client. This embodiment therefore sets the normal node corresponding to each virtual IP address that has a client connected as a target node; a target node is the node that connects to the client and carries out the related service.
S103: control each target node to send a TCP reconnection signal to its corresponding client, so as to restore the service connection.
This step builds on the target nodes determined in S102. Each target node is controlled to send a TCP (Transmission Control Protocol) reconnection signal to the client corresponding to its virtual IP address, so as to establish a normal service connection between the target node and the client, resume service operation, and bring the service interrupted by the node failure back to normal.
After a node failure occurs, this embodiment assigns virtual IP addresses to all normal nodes. Because the client information table is stored in the cluster, it is possible to query whether each virtual IP address has a client connected and to actively send a TCP reconnection signal to the clients connected to those virtual IP addresses. Because the operation that restores the service connection is performed actively by the target node once the virtual IP addresses have been assigned, there is no need to wait passively for the client's reconnection signal. This embodiment can therefore quickly achieve fault recovery after a service node fails and improve the stability of the distributed storage system.
Referring to Fig. 2, Fig. 2 is a flowchart of another fault recovery method for a distributed storage system provided by an embodiment of this application.
The specific steps may include:
S201: when node failure information is detected, determine the failed node according to the node failure information, and judge whether the failed node is the master node; if so, go to S202; if not, go to S203;
Because virtual IP address reassignment relies on the master node, if the failed node is the master node of the cluster, a new master node must be re-elected from among all the normal nodes.
S202: re-elect the master node from among all the normal nodes.
The service node is a node running the CTDB service, and the master node can be re-elected through the CTDB service.
S203: use the master node to perform a cluster IP address reassignment operation on all normal nodes in the cluster, assigning virtual IP addresses to the normal nodes in one-to-one correspondence;
S204: query, according to the client information table, whether each virtual IP address has a client connected; if so, go to S205; if not, end the flow.
Before this step, the client information sent by all nodes in the cluster can be received at a preset period, and the client information table updated according to that client information.
S205: set the normal node corresponding to the virtual IP address as a target node.
S206: control all target nodes to send an ARP broadcast to all normal nodes in the cluster, so that all normal nodes update their ARP tables; wherein the ARP table stores the correspondence between virtual IP addresses and MAC addresses.
An ARP (Address Resolution Protocol) broadcast packet notifies all nodes to update their ARP tables. The ARP table stores virtual IP addresses and MAC (Media Access Control) addresses, i.e., physical addresses. The broadcast tells all nodes that the MAC address corresponding to the service IP has changed, so any information that needs to be sent should be sent to the new MAC address.
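The effect of the ARP broadcast on each node's table can be illustrated with a small Python sketch. The dictionary representation, the function names and the MAC addresses are hypothetical; a real node's ARP table is maintained by the operating system's network stack rather than by application code.

```python
def handle_arp_broadcast(arp_table, vip, mac):
    """Record that `vip` is now reachable at `mac`, replacing any older entry."""
    arp_table[vip] = mac


def mac_for(arp_table, vip):
    """Look up the MAC address to send to; None if the VIP is unknown."""
    return arp_table.get(vip)


# Hypothetical MAC addresses: the entry first points at failed node A.
arp_table = {"192.168.0.11": "02:00:00:00:00:0a"}
# Node B, the new owner of the service IP, announces its own MAC address.
handle_arp_broadcast(arp_table, "192.168.0.11", "02:00:00:00:00:0b")
print(mac_for(arp_table, "192.168.0.11"))
```

After the update, any node that needs to send information to the service IP resolves it to node B's address instead of the failed node's.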
S207: control each target node to send a TCP reconnection signal to its corresponding client, so as to restore the service connection.
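The flow of S201 to S207 can be summarized in the following Python sketch, offered as an illustration under stated assumptions rather than as the implementation. The election rule (first normal node by name), the placement rule (fewest virtual IPs first), the data shapes and the `send_reconnect` callback are all hypothetical; the source leaves election and reassignment to the CTDB service and does not define the form of the TCP reconnection signal. The ARP broadcast of S206 is omitted for brevity.

```python
def recover(nodes, master, failed, vip_owner, client_table, send_reconnect):
    """Run the recovery flow once; return (master, new_vip_owner, notified)."""
    normal = sorted(n for n in nodes if n != failed)
    if not normal:
        raise RuntimeError("no normal node left in the cluster")
    # S201/S202: re-elect only when the failed node was the master.
    if master == failed:
        master = normal[0]  # hypothetical election rule
    # S203: move the failed node's VIPs; VIPs on normal nodes stay in place.
    new_owner = dict(vip_owner)
    load = {n: 0 for n in normal}
    for owner in vip_owner.values():
        if owner in load:
            load[owner] += 1
    for vip in sorted(v for v, o in vip_owner.items() if o == failed):
        target = min(normal, key=lambda n: (load[n], n))  # hypothetical placement rule
        new_owner[vip] = target
        load[target] += 1
    # S204/S205/S207: every VIP with a client makes its owner a target node,
    # which then signals each of those clients.
    notified = []
    for vip in sorted(client_table):
        node = new_owner.get(vip)
        if node is None or not client_table[vip]:
            continue
        for client in sorted(client_table[vip]):
            send_reconnect(node, vip, client)
            notified.append((node, vip, client))
    return master, new_owner, notified


sent = []
master, owners, notified = recover(
    nodes=["A", "B", "C"],
    master="A",
    failed="A",
    vip_owner={"192.168.0.11": "A", "192.168.0.12": "B", "192.168.0.13": "C"},
    client_table={"192.168.0.11": {"D"}, "192.168.0.13": set()},
    send_reconnect=lambda node, vip, client: sent.append((node, vip, client)),
)
print(master, owners, notified)
```

Passing the reconnection action in as a callback keeps the sketch independent of how the signal is actually delivered, which the source does not specify.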
Referring to Fig. 3, Fig. 3 is a schematic structural diagram of a fault recovery system for a distributed storage system provided by an embodiment of this application.
The system may include:
An IP reassignment module 100, configured to, when node failure information is detected, use the master node to perform a cluster IP address reassignment operation on all normal nodes in the cluster, assigning virtual IP addresses to the normal nodes in one-to-one correspondence;
A target node determination module 200, configured to query the client information table in the cluster and, according to the query result, set the normal node corresponding to each virtual IP address that has a client connected as a target node;
A reconnection module 300, configured to control each target node to send a TCP reconnection signal to its corresponding client, so as to restore the service connection.
Further, the fault recovery system also includes:
A node judgment module, configured to determine the failed node according to the node failure information and judge whether the failed node is the master node;
A master node election module, configured to re-elect the master node from among all the normal nodes when the failed node is the master node.
Further, the fault recovery system also includes:
An ARP broadcast module, configured to control all target nodes to send an ARP broadcast to all normal nodes in the cluster, so that all normal nodes update their ARP tables; wherein the ARP table stores the correspondence between virtual IP addresses and MAC addresses.
Further, the fault recovery system also includes:
An address determination module, configured to determine the target virtual IP address according to an information sending instruction when the information sending instruction is received;
An information sending module, configured to query the ARP table for the MAC address corresponding to the target virtual IP address, and send the information corresponding to the information sending instruction to that MAC address.
Further, the target node determination module 200 is specifically a module that queries, according to the client information table, whether each virtual IP address has a client connected and, if so, sets the normal node corresponding to that virtual IP address as a target node.
Further, the fault recovery system also includes:
A module configured to receive, at a preset period, the client information sent by all nodes in the cluster, and to update the client information table according to the client information.
Further, the service node is a node running the CTDB service.
Because the embodiments of the system correspond to the embodiments of the method, refer to the description of the method embodiments for the system embodiments; they are not repeated here.
This application also provides a computer-readable storage medium on which a computer program is stored; when executed, the computer program can implement the steps provided by the above embodiments. The storage medium may include a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc, or any other medium that can store program code.
This application also provides an electronic device, which may include a memory and a processor. A computer program is stored in the memory, and when the processor calls the computer program in the memory it can implement the steps provided by the above embodiments. The electronic device may of course also include components such as various network interfaces and a power supply.
The embodiments in this specification are described progressively: each embodiment focuses on its differences from the others, and the same or similar parts of the embodiments can be cross-referenced. Because the system disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for the relevant details, refer to the description of the method. It should be pointed out that those of ordinary skill in the art can make several improvements and modifications to this application without departing from its principle, and these improvements and modifications also fall within the protection scope of the claims of this application.
It should also be noted that, in this specification, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise" and any other variants are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to that process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article or device that includes that element.
Claims (10)
1. A fault recovery method for a distributed storage system, characterized by comprising:
When node failure information is detected, using the master node to perform a cluster IP address reassignment operation on all normal nodes in the cluster, assigning virtual IP addresses to the normal nodes in one-to-one correspondence;
Querying the client information table in the cluster and, according to the query result, setting the normal node corresponding to each virtual IP address that has a client connected as a target node;
Controlling each target node to send a TCP reconnection signal to its corresponding client, so as to restore the service connection.
2. The fault recovery method according to claim 1, characterized in that, before using the master node to perform the cluster IP address reassignment operation on all normal nodes in the cluster, the method further comprises:
Determining the failed node according to the node failure information, and judging whether the failed node is the master node;
If so, re-electing the master node from among all the normal nodes.
3. The fault recovery method according to claim 1, characterized in that, after setting the normal node corresponding to each virtual IP address that has a client connected as a target node, the method further comprises:
Controlling all target nodes to send an ARP broadcast to all normal nodes in the cluster, so that all normal nodes update their ARP tables; wherein the ARP table stores the correspondence between virtual IP addresses and MAC addresses.
4. The fault recovery method according to claim 3, characterized by further comprising:
When an information sending instruction is received, determining the target virtual IP address according to the information sending instruction;
Querying the ARP table for the MAC address corresponding to the target virtual IP address, and sending the information corresponding to the information sending instruction to that MAC address.
5. The fault recovery method according to claim 1, characterized in that setting the normal node corresponding to each virtual IP address that has a client connected as a target node comprises:
Querying, according to the client information table, whether each virtual IP address has a client connected;
If so, setting the normal node corresponding to that virtual IP address as a target node.
6. The fault recovery method according to claim 1, characterized by further comprising:
Receiving, at a preset period, the client information sent by all nodes in the cluster, and updating the client information table according to the client information.
7. The fault recovery method according to claim 1, characterized in that the service node is a node running the CTDB service.
8. A fault recovery system of a distributed storage system, characterized by comprising:
An IP reallocation module, configured to, when node failure information is detected, use the host node to perform a
cluster IP address reassignment operation on all normal nodes in the cluster, so as to assign a one-to-one corresponding virtual IP address to each normal node;
A destination node determining module, configured to query the client information table in the cluster and, according to the query result,
set the normal node corresponding to a virtual IP address connected to a client as a destination node;
A reconnection module, configured to control each destination node to send a TCP reconnection signal to its corresponding client, so as to restore
the service connection.
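How the three modules of claim 8 fit together can be sketched as follows. The claim does not state how virtual IP addresses are distributed, so this example assumes a simple round-robin assignment over the sorted normal nodes; with fewer normal nodes than virtual IP addresses, one node then holds several. The functions `reassign_virtual_ips` and `recover` are hypothetical, and the returned `(node, client)` pairs only represent the TCP reconnection signals rather than sending anything.

```python
from typing import Dict, List, Set, Tuple


def reassign_virtual_ips(virtual_ips: List[str], normal_nodes: List[str]) -> Dict[str, str]:
    """IP reallocation: map every virtual IP to a normal node (assumed round-robin)."""
    if not normal_nodes:
        raise ValueError("no normal node left in the cluster")
    ordered = sorted(normal_nodes)
    return {
        vip: ordered[i % len(ordered)] for i, vip in enumerate(sorted(virtual_ips))
    }


def recover(
    virtual_ips: List[str],
    nodes: List[str],
    failed: Set[str],
    client_table: Dict[str, Set[str]],
) -> List[Tuple[str, str]]:
    """Return (destination node, client) pairs that should receive a reconnection signal."""
    normal = [n for n in nodes if n not in failed]
    owners = reassign_virtual_ips(virtual_ips, normal)
    signals: List[Tuple[str, str]] = []
    for vip, node in owners.items():
        # Only virtual IPs with connected clients make their node a destination node.
        for client in sorted(client_table.get(vip, ())):
            signals.append((node, client))
    return signals


# Example: node-a fails; its clients are reconnected through the remaining nodes.
signals = recover(
    ["10.0.0.11", "10.0.0.12", "10.0.0.13"],
    ["node-a", "node-b", "node-c"],
    {"node-a"},
    {"10.0.0.11": {"c1"}, "10.0.0.13": {"c2", "c3"}},
)
```

Because the destination nodes initiate the reconnection, clients do not have to wait for their own TCP timeouts before service resumes, which appears to be the point of the reconnection module.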
9. An electronic device, characterized by comprising:
A memory, configured to store a computer program;
A processor, configured to implement, when executing the computer program, the steps of the fault recovery method
of a distributed storage system according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable
storage medium, and the computer program, when executed by a processor, implements the steps of the fault recovery method
of a distributed storage system according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810826771.4A CN109032830A (en) | 2018-07-25 | 2018-07-25 | A kind of fault recovery method of distributed memory system, system and associated component |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109032830A (en) | 2018-12-18 |
Family
ID=64645229
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810826771.4A Pending CN109032830A (en) | 2018-07-25 | 2018-07-25 | A kind of fault recovery method of distributed memory system, system and associated component |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109032830A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102932500A (en) * | 2012-11-07 | 2013-02-13 | 曙光信息产业股份有限公司 | Method and system for taking over fault interface node |
CN103475732A (en) * | 2013-09-25 | 2013-12-25 | 浪潮电子信息产业股份有限公司 | Distributed file system data volume deployment method based on virtual address pool |
CN104090992A (en) * | 2014-08-06 | 2014-10-08 | 浪潮电子信息产业股份有限公司 | Method for high-availability configuration between conversion nodes in cluster storage system |
US9342390B2 (en) * | 2013-01-31 | 2016-05-17 | International Business Machines Corporation | Cluster management in a shared nothing cluster |
US20170220418A1 (en) * | 2009-12-29 | 2017-08-03 | International Business Machines Corporation | Determining completion of migration in a dispersed storage network |
Non-Patent Citations (2)
Title |
---|
THANDA SHWE ET AL.: "A fault tolerant approach in cluster computing system", 《2008 5TH INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING/ELECTRONICS, COMPUTER, TELECOMMUNICATIONS AND INFORMATION TECHNOLOGY》 * |
李昌隆: "云存储系统中数据访问和存储接口的研究与实现", 《中国优秀硕士学位论文全文数据库 信息科技辑》 * |
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110750379A (en) * | 2019-10-28 | 2020-02-04 | 无锡华云数据技术服务有限公司 | ETCD cluster recovery method, system, equipment and computer medium |
CN110750379B (en) * | 2019-10-28 | 2023-10-31 | 无锡华云数据技术服务有限公司 | ETCD cluster recovery method, system, equipment and computer medium |
CN111258795A (en) * | 2019-11-29 | 2020-06-09 | 浪潮电子信息产业股份有限公司 | Samba cluster fault reconnection method, device, equipment and medium |
CN111258795B (en) * | 2019-11-29 | 2022-06-17 | 浪潮电子信息产业股份有限公司 | Samba cluster fault reconnection method, device, equipment and medium |
CN111314117A (en) * | 2020-01-20 | 2020-06-19 | 苏州浪潮智能科技有限公司 | Fault transfer method, device, equipment and readable storage medium |
CN113596068A (en) * | 2020-04-30 | 2021-11-02 | 北京金山云网络技术有限公司 | Method, device and server for establishing TCP connection |
CN111949452A (en) * | 2020-09-18 | 2020-11-17 | 苏州浪潮智能科技有限公司 | Method and device for rapidly recovering IO (input/output) in single-node fault of storage system |
CN111949452B (en) * | 2020-09-18 | 2022-09-20 | 苏州浪潮智能科技有限公司 | Method and device for rapidly recovering IO (input/output) in single-node fault of storage system |
CN112511317A (en) * | 2020-12-31 | 2021-03-16 | 河南信大网御科技有限公司 | Input distribution method, input agent and mimicry distributed storage system |
CN114116216A (en) * | 2021-11-24 | 2022-03-01 | 北京大道云行科技有限公司 | Method and device for realizing high availability of distributed block storage based on vip |
CN114285729B (en) * | 2021-11-29 | 2023-08-25 | 苏州浪潮智能科技有限公司 | Distributed cluster management node deployment method, device, equipment and storage medium |
CN114285729A (en) * | 2021-11-29 | 2022-04-05 | 苏州浪潮智能科技有限公司 | Distributed cluster management node deployment method, device, equipment and storage medium |
CN114553900A (en) * | 2022-02-18 | 2022-05-27 | 苏州浪潮智能科技有限公司 | Distributed block storage management system and method and electronic equipment |
CN114553900B (en) * | 2022-02-18 | 2023-08-04 | 苏州浪潮智能科技有限公司 | Distributed block storage management system, method and electronic equipment |
CN115437843A (en) * | 2022-08-25 | 2022-12-06 | 北京万里开源软件有限公司 | Database storage partition recovery method based on multi-level distributed consensus |
CN115866018B (en) * | 2023-02-28 | 2023-05-16 | 浪潮电子信息产业股份有限公司 | Service processing method, device, electronic equipment and computer readable storage medium |
CN115866018A (en) * | 2023-02-28 | 2023-03-28 | 浪潮电子信息产业股份有限公司 | Service processing method and device, electronic equipment and computer readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20181218 |