CN109032854B - Election request processing method and device, management node and storage medium - Google Patents
- Publication number
- CN109032854B (application CN201810770164.0A)
- Authority
- CN
- China
- Prior art keywords
- election
- management node
- management
- latest
- time
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
- G06F11/2053—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant
- G06F11/2089—Redundant storage control functionality
- G06F11/2092—Techniques of failing over between control units
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Quality & Reliability (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to the technical field of distributed storage, and provides a method, a device, a management node and a storage medium for processing election requests, wherein the method comprises the following steps: receiving an election request sent by an abnormally recovered management node, the election request carrying the election version number and the latest election time recorded by that node; when the election version number is larger than the local version number, judging whether to ignore the election request according to the current system time and the latest election time; when the difference between the latest election time and the current system time is less than or equal to a preset threshold value, ignoring the election request; and when the difference between the latest election time and the current system time is larger than the preset threshold value, accepting the election request to start election. By conditionally ignoring election requests, the invention shields and isolates problem management nodes and reduces the election frequency of the management cluster, thereby ensuring that the management cluster can normally provide services and further improving the reliability of the whole distributed storage system.
Description
Technical Field
The present invention relates to the field of distributed storage technologies, and in particular, to a method and an apparatus for processing election requests, a management node, and a storage medium.
Background
In a distributed storage system, management nodes are used for maintaining state information of the distributed storage system. Before a client reads or writes data stored in the distributed storage system, it must acquire the state information of the distributed storage system through a management node; only then can normal read and write operations be performed. In the prior art, a new round of election is triggered whenever the state of a management node in the management cluster changes or a management node is added to or removed from the management cluster. Because the distributed storage system cannot provide services to the outside during the whole election process, a single management node whose network state is unstable, and which therefore becomes abnormal repeatedly, causes the management cluster to hold elections frequently. In severe cases this interrupts normal service continuity and reduces the reliability of the whole distributed storage system.
Disclosure of Invention
Embodiments of the present invention provide a method and an apparatus for processing an election request, a management node, and a storage medium, where when a single management node is abnormal, the election request is conditionally ignored, so as to implement shielding, isolate problem management nodes, and reduce election frequency of a management cluster, thereby ensuring that the management cluster can normally provide services, and further improving reliability of the entire distributed storage system.
In order to achieve the above purpose, the embodiment of the present invention adopts the following technical solutions:
in a first aspect, an embodiment of the present invention provides an election request processing method, which is applied to a management node of a management cluster in a distributed storage system, where the management node stores a local version number, and the management cluster further includes a management node that performs abnormal recovery and communicates with the management node, where the method includes: receiving an election request sent by an abnormally recovered management node, wherein the election request comprises an election version number recorded by the abnormally recovered management node and the latest election time recorded by the abnormally recovered management node; when the election version number recorded by the abnormally recovered management node is larger than the local version number, judging whether to ignore the election request or not according to the current system time and the latest election time recorded by the abnormally recovered management node; when the difference between the latest election time recorded by the abnormally recovered management node and the current system time is less than or equal to a preset threshold value, ignoring the election request; and when the difference between the latest election time recorded by the abnormally recovered management node and the current system time is greater than a preset threshold value, accepting the election request to start election.
In a second aspect, an embodiment of the present invention further provides an election request processing apparatus, which is applied to a management node of a management cluster in a distributed storage system, where the management node stores a local version number, the management cluster further includes an abnormally recovered management node that communicates with the management node, and the apparatus includes a receiving module, a judging module, a first ignoring module, and a first election module. The receiving module is used for receiving an election request sent by the abnormally recovered management node, where the election request comprises an election version number recorded by the abnormally recovered management node and the latest election time recorded by the abnormally recovered management node; the judging module is used for judging whether to ignore the election request according to the current system time and the latest election time recorded by the abnormally recovered management node when the election version number recorded by the abnormally recovered management node is larger than the local version number; the first ignoring module is used for ignoring the election request when the difference between the latest election time recorded by the abnormally recovered management node and the current system time is less than or equal to a preset threshold value; the first election module is used for accepting the election request to start election when the difference between the latest election time recorded by the abnormally recovered management node and the current system time is greater than the preset threshold value.
In a third aspect, an embodiment of the present invention further provides a management node, where the management node includes: one or more processors; a memory for storing one or more programs that, when executed by the one or more processors, cause the one or more processors to implement the election request processing method described above.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the election request processing method described above.
In the method, the device, the management node and the storage medium for processing the election request provided by the embodiment of the invention, the abnormally recovered management node first sends an election request to the management node, where the election request comprises the election version number recorded by the abnormally recovered management node and the latest election time recorded by the abnormally recovered management node. The management node then receives the election request and, when the election version number recorded by the abnormally recovered management node is larger than the local version number, judges whether to ignore the election request according to the current system time and the latest election time recorded by the abnormally recovered management node: when the difference between the latest election time recorded by the abnormally recovered management node and the current system time is less than or equal to a preset threshold value, the election request is ignored; when the difference is greater than the preset threshold value, the election request is accepted to start election. Compared with the prior art, the embodiment of the invention conditionally ignores the election request, thereby shielding and isolating problem management nodes, reducing the election frequency of the management cluster, and reducing the influence of election on normal services as much as possible, which ensures that the management cluster can normally provide services and further improves the reliability of the whole distributed storage system.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present invention and therefore should not be considered as limiting its scope; those skilled in the art can obtain other related drawings from these drawings without inventive effort.
Fig. 1 is a schematic view illustrating an application scenario of an election request processing method according to an embodiment of the present invention.
Fig. 2 is a block diagram illustrating a management node according to an embodiment of the present invention.
Fig. 3 shows a first flowchart of an election request processing method according to an embodiment of the present invention.
Fig. 4 shows a second flowchart of an election request processing method according to an embodiment of the present invention.
Fig. 5 is a block diagram illustrating an election request processing device according to an embodiment of the present invention.
Reference numerals: 100 - management node; 101 - memory; 102 - communication interface; 103 - processor; 104 - bus; 200 - election request processing device; 201 - receiving module; 202 - judging module; 203 - first ignoring module; 204 - first election module; 205 - acquisition module; 206 - second ignoring module; 207 - second election module; 208 - update module.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present invention, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
Referring to fig. 1, fig. 1 is a schematic view illustrating an application scenario of an election request processing method according to an embodiment of the present invention. A distributed storage system includes a client, a storage cluster formed by a plurality of storage nodes, and a management cluster formed by a plurality of management nodes that communicate with one another; the client, the storage cluster, and the management cluster are in communication with each other. The storage nodes are responsible for storing user data, and the management nodes are used for maintaining the topology structure information and state information of the distributed storage system, such as that of the storage cluster and the management cluster. A user issues a data read-write request through the client; the client acquires state information of the storage nodes from the management cluster, calculates the storage position information of the data to be read or written from that state information, finds the corresponding storage nodes according to the storage position information, and reads or writes the data stored on them. When a storage node newly joins the storage cluster, or finds that it or another storage node is abnormal, it reports the state information to the management cluster; the management cluster records and updates its information accordingly and propagates the updated information to the storage cluster and the client.
In fig. 1, the management cluster includes 3 management nodes: management node 1, management node 2 and management node 3. The abnormally recovered management node may be any management node in the management cluster, and every management node other than the abnormally recovered one is a normal management node 100. When management node 1 recovers from an exception, it sends an election request to management node 2 and management node 3, which determine the policy for processing the election request according to the election request and their local version numbers.
Referring to fig. 2, fig. 2 is a block diagram illustrating a management node 100 according to an embodiment of the present invention. In the embodiment of the present invention, the management node 100 refers to a normal management node, i.e. any management node in the management cluster other than the abnormally recovered management node. The management node 100 may be, but is not limited to, a Personal Computer (PC), a server, and the like. The operating system of the management node 100 may be, but is not limited to, a Windows system, a Linux system, and the like. The management node 100 comprises a memory 101, a communication interface 102, a processor 103 and a bus 104; the memory 101, the communication interface 102 and the processor 103 are connected via the bus 104, and the processor 103 is adapted to execute executable modules, such as computer programs, stored in the memory 101.
The Memory 101 may include a high-speed Random Access Memory (RAM) and may further include a non-volatile Memory (non-volatile Memory), such as at least one disk Memory. The communication connection between the management node 100 and at least one other management node 100, and external storage devices, is implemented by at least one communication interface 102 (which may be wired or wireless).
The bus 104 may be an ISA bus, PCI bus, EISA bus, or the like. Only one bi-directional arrow is shown in fig. 2, but this does not indicate only one bus or one type of bus.
The memory 101 is used for storing a program, such as the election request processing device 200 shown in fig. 5. The election request processing device 200 includes at least one software functional module that may be stored in the memory 101 in the form of software or firmware, or solidified in the Operating System (OS) of the management node 100. After receiving an execution instruction, the processor 103 executes the program to implement the election request processing method disclosed in the embodiments of the present invention.
The processor 103 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in the processor 103 or by instructions in the form of software. The processor 103 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components.
First embodiment
Referring to fig. 3 and fig. 4, fig. 3 shows a first flowchart of an election request processing method according to an embodiment of the present invention, and fig. 4 shows a second flowchart of the election request processing method according to the embodiment of the present invention. The election request processing method is applied to a management node 100 of a management cluster in a distributed storage system, and comprises the following steps:
step S101, an election request sent by the abnormally recovered management node is received, wherein the election request comprises an election version number recorded by the abnormally recovered management node and the latest election time recorded by the abnormally recovered management node.
In the embodiment of the present invention, an exception of a management node means that the management node can no longer query, update, or maintain safe and valid state information or topology structure information for the client or the storage cluster. The exception may be, but is not limited to, a fault in a process running on the management node, or a transient network disconnection caused by a fault in the communication module of the management node.
In the embodiment of the present invention, a management node in the management cluster has one of two roles: main management node or standby management node. The data on the main management node and the standby management nodes are kept consistent through a specific algorithm. When a client needs to acquire state information for a read command, either the main management node or a standby management node can return the corresponding state information to the client. When the client needs to update state information for a write command, the update message is first sent to the main management node and then distributed to the standby management nodes by the main management node. The main management node and the standby management nodes are determined by the management cluster through an election mechanism.
It should be noted that every distributed storage system must maintain its topology information and state information, but the specific names may differ between systems. For example, in one embodiment, the distributed storage system may be a Ceph system (an open source distributed storage system), and the management cluster may be a monitor cluster. In another embodiment, the distributed storage system may be a fusion storage system (a type of distributed storage system), and the management cluster may be a metadata cluster. In different management clusters with similar implementation mechanisms, the names of a management node, a main management node, and a standby management node may also differ; in the embodiment of the present invention, a Ceph distributed storage system is taken as an example for description. In the Ceph distributed storage system, a management node is called a monitor node, the main management node is called the leader, and a standby management node is called a peon. Data consistency among all monitor nodes in a monitor cluster is guaranteed through the Paxos algorithm, and the role of each monitor node, leader or peon, is determined through an election mechanism. A priority is determined for each monitor node when the monitor cluster is deployed, and the monitor with the highest priority can become the leader through election. The version number is called the epoch in Ceph. It plays a very important part in the election process and has two roles:
(1) representing logical time: under normal conditions, the epoch values of the monitor nodes providing normal services to the outside are equal. When one of the monitor nodes becomes abnormal, its epoch value remains stored in its database, and after it recovers, its epoch value is smaller than that of the other, normal monitor nodes if they have held an election in the meantime, so the epoch value can be used to judge whether a given election request belongs to the latest election;
(2) indicating whether the monitor node is currently in the election state: when the epoch is an odd number, the monitor node is in the election state, and after the election is finished, the epoch value is increased to an even number and is synchronized to all normal monitor nodes.
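The odd/even convention in role (2) can be sketched as follows. This is a minimal illustration only; the function names `is_electing`, `begin_election` and `finish_election` are hypothetical and are not taken from Ceph or from the embodiment.

```python
def is_electing(epoch: int) -> bool:
    """An odd epoch marks a monitor node that is in the election state."""
    return epoch % 2 == 1


def begin_election(epoch: int) -> int:
    """Hypothetical helper: move a stable (even) epoch to the next odd value."""
    return epoch if is_electing(epoch) else epoch + 1


def finish_election(epoch: int) -> int:
    """Hypothetical helper: move an electing (odd) epoch to the next even value."""
    return epoch + 1 if is_electing(epoch) else epoch


# A stable cluster at epoch 2 holds one round of election and settles at 4.
during = begin_election(2)      # 3, odd: election in progress
after = finish_election(during)  # 4, even: election finished
```

Under this convention one complete round of election advances the epoch by 2, which matches the step from 2 to 4 used in the example of this embodiment.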
In this embodiment of the present invention, each monitor node stores a local version number. Normally, after each round of election is finished, the local version numbers of all normal monitors in a monitor cluster are the same. If one of the monitors becomes abnormal and, before it recovers, the other monitors in the monitor cluster complete a new round of election, the local version number of the abnormal monitor no longer matches the local version numbers of the other monitors. For example, there are 3 monitor nodes in the current monitor cluster: monitor node 1, monitor node 2 and monitor node 3, whose local version numbers are all 2. Monitor node 1 then becomes abnormal, and monitor node 2 and monitor node 3 perform a round of election. After the election is completed, the local version numbers of monitor node 2 and monitor node 3 both become 4, while the local version number of monitor node 1 after the abnormal recovery is still 2. When the abnormally recovered monitor node initiates an election request, it adds 1 to its local version number to obtain the election version number and sends the election version number to the other monitor nodes. For example, if the local version number of the abnormally recovered monitor node is 2, the election version number carried in its election request is 3.
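The numbers in this example can be traced with a short sketch. The names `election_version_number` and `local_versions` are illustrative assumptions, not identifiers from the embodiment.

```python
def election_version_number(local_version: int) -> int:
    """The recovered node adds 1 to its local version number before sending."""
    return local_version + 1


# Local version numbers after monitor nodes 2 and 3 held a round of election
# while monitor node 1 was abnormal (values from the example in the text).
local_versions = {"monitor1": 2, "monitor2": 4, "monitor3": 4}

# Monitor node 1 recovers and initiates an election request.
request_version = election_version_number(local_versions["monitor1"])  # 3

# Each receiving node compares the election version number with its own
# local version number; here the request is older than both receivers.
is_smaller = {
    name: request_version < version
    for name, version in local_versions.items()
    if name != "monitor1"
}
```

In this case the election version number 3 is smaller than the local version number 4 on both receivers, so the request would be handled by the branch that begins at step S105 rather than by step S102.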
Step S102, when the election version number recorded by the abnormally recovered management node is larger than the local version number, whether to ignore the election request is judged according to the current system time and the latest election time recorded by the abnormally recovered management node.
In this embodiment of the present invention, after receiving an election request sent by the abnormally recovered management node, the management node 100 obtains from the election request the election version number and the latest election time recorded by the abnormally recovered management node, and first compares the election version number with its local version number. When the election version number is greater than the local version number, this indicates that the abnormally recovered management node took part in the most recent round of election, that is, the other management nodes 100 in the management cluster did not hold a new election while that node was abnormal. The management node 100 then determines whether to ignore the election request according to the current system time and the latest election time recorded by the abnormally recovered management node. When the difference between the latest election time recorded by the abnormally recovered management node and the current system time is less than or equal to a preset threshold, the election request is ignored, that is, step S103 is executed; when the difference is greater than the preset threshold, the election request is accepted to start election, that is, step S104 is executed.
It should be noted that time synchronization is performed among the management nodes in the management cluster, that is, the system times of the management nodes are basically consistent. The current system times acquired by the management nodes 100 participating in the same round of election therefore differ very little, and the management nodes 100 reach the same result when determining, from the current system time and the latest election time, whether to ignore an election request.
Step S103, when the difference between the latest election time recorded by the abnormally recovered management node and the current system time is less than or equal to a preset threshold value, ignoring the election request.
In the embodiment of the present invention, the preset threshold may be set in advance according to the specific application scenario. For example, suppose the preset threshold is 20s, the latest election time is 10:00:00, and the current system time is 10:00:15. The difference between the latest election time and the current system time is then 15s, which is less than the preset threshold of 20s, so the current election request is ignored.
Step S104, when the difference between the latest election time recorded by the abnormally recovered management node and the current system time is greater than a preset threshold value, accepting the election request to start election.
In the embodiment of the present invention, the preset threshold may be set in advance according to the specific application scenario. For example, suppose the preset threshold is 20s, the latest election time is 10:00:00, and the current system time is 10:00:25. The difference between the latest election time and the current system time is then 25s, which is greater than the preset threshold of 20s, so the current election request is accepted.
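The time comparison of steps S103 and S104 can be sketched as below, using the 20s threshold and the two worked examples from the text. The function name `should_ignore`, the constant name, and the calendar date are assumptions for illustration; the difference is taken as current system time minus latest election time.

```python
from datetime import datetime, timedelta

# Example threshold from the text; in practice it is set per application scenario.
PRESET_THRESHOLD = timedelta(seconds=20)


def should_ignore(latest_election_time: datetime,
                  current_system_time: datetime,
                  threshold: timedelta = PRESET_THRESHOLD) -> bool:
    """Return True to ignore the election request (S103), False to accept it (S104)."""
    elapsed = current_system_time - latest_election_time
    return elapsed <= threshold


# The date is arbitrary; only the time of day comes from the examples.
latest = datetime(2018, 7, 13, 10, 0, 0)
ignored_at_15s = should_ignore(latest, datetime(2018, 7, 13, 10, 0, 15))  # True
ignored_at_25s = should_ignore(latest, datetime(2018, 7, 13, 10, 0, 25))  # False
```

A request arriving exactly 20s after the latest election is still ignored, because the comparison in step S103 is "less than or equal to".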
In the embodiment of the present invention, when, after step S101 is executed, the election version number recorded by the abnormally recovered management node is smaller than the local version number, steps S105 to S107 are executed.
Step S105, when the election version number recorded by the abnormally recovered management node is smaller than the local version number, acquiring the current system time, the number of the management nodes providing service currently and the total number of the management nodes.
In the embodiment of the present invention, the election version number is smaller than the local version number, which means that the management node that recovers from the exception does not participate in the last round of election, that is, before the abnormal management node recovers, the management node 100 in the management cluster except the abnormal management node has performed at least one round of election.
In this embodiment of the present invention, taking a Ceph distributed storage system as an example, the management nodes currently providing service are called the quorum, that is, the monitor nodes that support the leader node as leader, and the leader node is the monitor node with the highest priority in the quorum. For example, the monitor cluster contains 5 monitor nodes, of which monitor node 2, monitor node 3, monitor node 4 and monitor node 5 are in service, and monitor node 5 has the highest priority among them. Monitor node 5 initiates an election request, and monitor node 2, monitor node 3 and monitor node 4 all accept the election request and reply to monitor node 5 with an acknowledgement message. Monitor node 5 is then the leader, the quorum includes monitor node 2, monitor node 3, monitor node 4 and monitor node 5, and the number of management nodes currently providing service is 4.
After acquiring the current system time, the number of management nodes currently providing service, and the total number of management nodes, the management node 100 first determines whether the number of management nodes currently providing service is greater than half of the total number of management nodes and smaller than the total number of management nodes. If so, step S106 is performed; if not, step S107 is performed.
Step S106, when the number of management nodes currently providing service is greater than half of the total number of management nodes and less than the total number of management nodes, determining whether to ignore the election request according to the current system time and the latest election time.
In the embodiment of the present invention, when the number of management nodes currently providing service is greater than half of the total number of management nodes and less than the total number of management nodes, the management nodes currently providing service can keep the management cluster functioning normally even if no election is performed, so services can proceed normally.
Step S107, when the number of management nodes currently providing service is less than or equal to half of the total number of management nodes, or equal to the total number of management nodes, accepting the election request to start an election.
In the embodiment of the present invention, when the number of management nodes currently providing service is less than or equal to half of the total number of management nodes, the current management cluster cannot provide its functions normally and services cannot proceed normally. When an abnormally recovered node initiates an election request in this situation, the management node 100 should immediately accept the election request and perform the election in order to ensure the reliability of the entire management cluster, so that the management cluster recovers its functions as soon as possible and services recover as soon as possible as well. When the number of management nodes currently providing service is equal to the total number of management nodes, a new management node is being added to the management cluster; so that the new management node can join the management cluster as soon as possible, the election request should likewise be accepted immediately.
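The branch between step S106 and step S107 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `decide_by_quorum_size`, the `Decision` enum, and the representation of times and the preset threshold as seconds are all assumptions made for illustration. The time comparison follows the rule stated for the second ignoring module (ignore when the difference is less than or equal to the threshold).

```python
from enum import Enum


class Decision(Enum):
    ACCEPT = "accept"   # accept the election request and start an election
    IGNORE = "ignore"   # ignore the election request


def decide_by_quorum_size(serving, total, now, latest_election_time, threshold):
    """Hypothetical helper for steps S106/S107.

    serving: number of management nodes currently providing service
    total: total number of management nodes
    now, latest_election_time, threshold: seconds (assumed unit)
    """
    # S106 applies only when: total / 2 < serving < total.
    # Integer arithmetic avoids rounding problems for odd totals.
    if 2 * serving > total and serving < total:
        # The cluster works without an election, so a recent election
        # (difference within the threshold) lets the request be ignored.
        if now - latest_election_time <= threshold:
            return Decision.IGNORE
        return Decision.ACCEPT
    # S107: at most half of the nodes are serving (cluster not functional),
    # or all nodes are serving (a new node is joining): accept at once.
    return Decision.ACCEPT
```

For instance, with 4 of 5 nodes serving and an election that ended 5 seconds ago under an assumed 10-second threshold, the request is ignored; with only 2 of 5 nodes serving it is accepted regardless of the times.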
In the embodiment of the present invention, after the election request is accepted and the election is performed, the latest election time should be the end time of the current election. The locally recorded latest election time therefore needs to be updated according to the local system time, so that it can be used for the determination at the next election. This embodiment of the present invention therefore further includes step S108, which may be performed after step S104 or after step S107.
Step S108, updating the locally recorded latest election time according to the local system time after the election is finished.
In this embodiment of the present invention, taking a Ceph distributed storage system as an example, a management node 100 may be a leader or a peon. Regardless of whether it is a leader or a peon, as long as the management node participates in the election, it needs to update the locally recorded latest election time according to the local system time after the election ends. Because each management node 100 may detect the end of the election at a different moment, the local system time that each management node 100 obtains at that point may differ, so the latest election times recorded locally by the management nodes 100 may not be completely consistent. Over a period of time, however, this does not affect the reduction in election frequency achieved by the embodiment of the present invention. For example, there are 3 monitor nodes in a monitor cluster: monitor node 1, monitor node 2 and monitor node 3. The local system time obtained when monitor node 1 detects that the election has ended is 10:00:01, the local system time obtained when monitor node 2 detects that the election has ended is 10:00:02, and the local system time obtained when monitor node 3 detects that the election has ended is 10:00:01. The latest election time recorded by monitor node 1 is therefore 10:00:01, the latest election time recorded by monitor node 2 is 10:00:02, and the latest election time recorded by monitor node 3 is 10:00:01.
It should be noted that when the election request is ignored, the latest election time does not need to be updated. It should also be noted that, since the abnormally recovered management node also records a latest election time, an abnormally recovered management node that participates in the election likewise needs to update its locally recorded latest election time according to its local system time when the election ends.
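Step S108 and the three-monitor example can be sketched as follows. This is a minimal illustration under stated assumptions: the `ElectionRecord` class and its method names are hypothetical, and the recorded value is modelled as a time of day only to mirror the example above.

```python
from dataclasses import dataclass
from datetime import time
from typing import Optional


@dataclass
class ElectionRecord:
    """Hypothetical per-node record of the latest election time."""
    node_name: str
    latest_election_time: Optional[time] = None

    def on_election_finished(self, local_now: time) -> None:
        # S108: every participant (leader or peon) stores its own local
        # system time at the moment it detects the end of the election.
        self.latest_election_time = local_now

    def on_request_ignored(self) -> None:
        # An ignored election request leaves the record unchanged.
        pass


# Each monitor detects the end of the same election at its own local time.
detected = {
    "monitor1": time(10, 0, 1),
    "monitor2": time(10, 0, 2),
    "monitor3": time(10, 0, 1),
}
records = {name: ElectionRecord(name) for name in detected}
for name, local_now in detected.items():
    records[name].on_election_finished(local_now)
```

The records end up slightly different from node to node, which matches the observation that the locally recorded latest election times need not be completely consistent.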
In the embodiment of the present invention, under abnormal conditions such as an unstable network state of a single management node, whether an election can be initiated is judged according to the online time of the management node and the state of the management nodes currently able to provide service, and election requests are conditionally ignored. Compared with the prior art, this has the following beneficial effects:
First, by conditionally ignoring elections, abnormal management nodes can be shielded and isolated, the election frequency of the management cluster is reduced, the influence of elections on normal services is reduced as much as possible, and the reliability of the whole distributed storage system is further improved.
Second, overly frequent election requests are ignored only when the number of management nodes currently providing service in the management cluster is greater than half of the total number of management nodes, that is, when the management cluster can provide its functions normally, so ignoring election requests does not reduce the reliability of the management cluster as a whole.
Second embodiment
Referring to fig. 5, fig. 5 is a block diagram illustrating an election request processing device 200 according to an embodiment of the present invention. The election request processing device 200 is applied to the management node 100 and comprises a receiving module 201, a judging module 202, a first ignoring module 203, a first election module 204, an obtaining module 205, a second ignoring module 206, a second election module 207, and an updating module 208.
The receiving module 201 is configured to receive an election request sent by a management node that recovers abnormally, where the election request includes an election version number recorded by the management node that recovers abnormally and a latest election time recorded by the management node that recovers abnormally.
In this embodiment of the present invention, the receiving module 201 is configured to execute step S101.
The judging module 202 is configured to determine whether to ignore the election request according to the current system time and the latest election time recorded by the abnormally recovered management node when the election version number recorded by the abnormally recovered management node is greater than the local version number.
In this embodiment of the present invention, the judging module 202 is configured to execute step S102.
The first ignoring module 203 is configured to ignore the election request when the difference between the latest election time recorded by the abnormally recovered management node and the current system time is less than or equal to a preset threshold.
In the embodiment of the present invention, the first ignoring module 203 is configured to execute step S103.
The first election module 204 is configured to accept the election request to start an election when the difference between the latest election time recorded by the abnormally recovered management node and the current system time is greater than a preset threshold.
In the embodiment of the present invention, the first election module 204 is configured to execute step S104.
An obtaining module 205, configured to obtain the current system time, the number of management nodes currently providing services, and the total number of management nodes when the election version number recorded by the abnormally recovered management node is smaller than the local version number.
In this embodiment of the present invention, the obtaining module 205 is configured to execute step S105.
A second ignoring module 206, configured to determine whether to ignore the election request according to the current system time and the latest election time recorded by the abnormally recovered management node when the number of the currently serving management nodes is greater than half of the total number of the management nodes and smaller than the total number of the management nodes.
In the embodiment of the present invention, the second ignoring module 206 is configured to execute step S106.
In this embodiment of the present invention, the second ignoring module 206 is specifically configured to:
when the difference between the latest election time recorded by the abnormally recovered management node and the current system time is less than or equal to a preset threshold value, ignoring the election request;
and when the difference between the latest election time recorded by the abnormally recovered management node and the current system time is greater than a preset threshold value, accepting the election request to start election.
A second election module 207, configured to accept the election request to start election when the number of the currently serving management nodes is less than or equal to half of the total number of the management nodes or equal to the total number of the management nodes.
In the embodiment of the present invention, the second election module 207 is configured to execute step S107.
The updating module 208 is configured to update the locally recorded latest election time according to the local system time after the election is finished.
In this embodiment of the present invention, the updating module 208 is configured to execute step S108.
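How the modules 201 to 208 fit together can be sketched in Python as a single handler. This is an illustrative sketch under stated assumptions, not the device of fig. 5: the class and field names are hypothetical, times and the preset threshold are assumed to be in seconds, and the case of equal version numbers, which this section does not describe, is returned as `None` rather than guessed.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Decision(Enum):
    ACCEPT = "accept"
    IGNORE = "ignore"


@dataclass
class ElectionRequest:
    # Contents received by the receiving module 201.
    election_version: int
    latest_election_time: float  # recorded by the recovered node, seconds


class ElectionRequestProcessor:
    """Hypothetical composition of modules 202 to 208."""

    def __init__(self, local_version, serving_nodes, total_nodes, threshold):
        self.local_version = local_version
        self.serving_nodes = serving_nodes
        self.total_nodes = total_nodes
        self.threshold = threshold
        self.latest_election_time: Optional[float] = None

    def _decide_by_time(self, request, now):
        # Shared rule of modules 203/204 and of module 206.
        if now - request.latest_election_time <= self.threshold:
            return Decision.IGNORE
        return Decision.ACCEPT

    def handle(self, request, now):
        if request.election_version > self.local_version:
            # Judging module 202 (steps S102 to S104).
            return self._decide_by_time(request, now)
        if request.election_version < self.local_version:
            # Obtaining module 205 (step S105).
            serving, total = self.serving_nodes, self.total_nodes
            if 2 * serving > total and serving < total:
                # Second ignoring module 206 (step S106).
                return self._decide_by_time(request, now)
            # Second election module 207 (step S107).
            return Decision.ACCEPT
        # Equal version numbers: not covered here, left undecided.
        return None

    def on_election_finished(self, local_now):
        # Updating module 208 (step S108).
        self.latest_election_time = local_now
```

A short usage example, with arbitrary numbers: `ElectionRequestProcessor(local_version=7, serving_nodes=4, total_nodes=5, threshold=10.0).handle(ElectionRequest(9, 95.0), now=100.0)` returns `Decision.IGNORE`, because the recovered node's last election ended only 5 seconds earlier.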
Also disclosed is a computer-readable storage medium having a computer program stored thereon, where the computer program is executed by the processor 103 to implement the election request processing method disclosed in the foregoing embodiments of the invention.
In summary, the election request processing method, apparatus, management node, and storage medium provided by the present invention are applied to a normal management node of a management cluster in a distributed storage system, where the normal management node stores a local version number and the management cluster further includes an abnormally recovered management node in communication with the normal management node. The election request processing method includes: receiving an election request sent by the abnormally recovered management node, wherein the election request comprises an election version number; when the election version number is larger than the local version number, processing the election request according to a first processing strategy; and when the election version number is smaller than the local version number, processing the election request according to a second processing strategy. Compared with the prior art, the present invention conditionally ignores election requests, thereby shielding and isolating problematic management nodes, reducing the election frequency of the management cluster, and reducing the influence of elections on normal services as much as possible, which ensures that the management cluster can provide service normally and further improves the reliability of the whole distributed storage system.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, the functional modules in the embodiments of the present invention may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes. It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention. It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
Claims (10)
1. An election request processing method applied to a management node of a management cluster in a distributed storage system, wherein the management node stores a local version number, and the management cluster further comprises an abnormally recovered management node in communication with the management node, the method comprising:
receiving an election request sent by the management node which is abnormally recovered, wherein the election request comprises an election version number recorded by the management node which is abnormally recovered and the latest election time recorded by the management node which is abnormally recovered;
when the election version number recorded by the abnormally recovered management node is larger than the local version number, judging whether to ignore the election request or not according to the current system time and the latest election time recorded by the abnormally recovered management node;
when the difference between the latest election time recorded by the abnormally recovered management node and the current system time is less than or equal to a preset threshold value, ignoring the election request;
and when the difference between the latest election time recorded by the abnormally recovered management node and the current system time is greater than a preset threshold value, accepting the election request to start election.
2. The election request processing method of claim 1, said method further comprising:
when the election version number recorded by the abnormally recovered management node is smaller than the local version number, acquiring the current system time, the number of the management nodes providing service currently and the total number of the management nodes;
when the number of the management nodes providing service currently is more than half of the total number of the management nodes and less than the total number of the management nodes, determining whether to ignore the election request according to the current system time and the latest election time recorded by the management nodes recovered abnormally;
and when the number of the management nodes which currently provide the service is less than or equal to half of the total number of the management nodes or equal to the total number of the management nodes, accepting the election request to start election.
3. The election request processing method according to claim 2, wherein said step of determining whether to ignore said election request based on said current system time and a last election time recorded by said abnormally recovered management node comprises:
when the difference between the latest election time recorded by the abnormally recovered management node and the current system time is smaller than or equal to a preset threshold value, ignoring the election request;
and when the difference between the latest election time recorded by the abnormally recovered management node and the current system time is greater than a preset threshold value, accepting the election request to start election.
4. The election request processing method of claim 1, said method further comprising:
and updating the latest election time of the local record according to the local system time after the election is finished.
5. An election request processing device applied to a management node of a management cluster in a distributed storage system, the management node storing a local version number, the management cluster further including an exception recovery management node in communication with the management node, the device comprising:
a receiving module, configured to receive an election request sent by the abnormally recovered management node, where the election request includes an election version number recorded by the abnormally recovered management node and a latest election time recorded by the abnormally recovered management node;
the judging module is used for judging whether to ignore the election request according to the current system time and the latest election time recorded by the abnormally recovered management node when the election version number recorded by the abnormally recovered management node is larger than the local version number;
a first ignoring module, configured to ignore the election request when a difference between a latest election time recorded by the abnormally recovered management node and the current system time is less than or equal to a preset threshold;
and the first election module is used for accepting the election request to start election when the difference between the latest election time recorded by the abnormally recovered management node and the current system time is greater than a preset threshold value.
6. The election request processing device of claim 5, said device further comprising:
the obtaining module is used for obtaining the current system time, the number of management nodes providing service currently and the total number of the management nodes when the election version number recorded by the abnormally recovered management node is smaller than the local version number;
a second ignoring module, configured to determine whether to ignore the election request according to the current system time and the latest election time recorded by the abnormally recovered management node when the number of the currently serving management nodes is greater than half of the total number of the management nodes and smaller than the total number of the management nodes;
and the second election module is used for accepting the election request to start election when the number of the management nodes which currently provide services is less than or equal to half of the total number of the management nodes or equal to the total number of the management nodes.
7. The election request processing device of claim 6, wherein said second ignoring module is specifically configured to:
when the difference between the latest election time recorded by the abnormally recovered management node and the current system time is smaller than or equal to a preset threshold value, ignoring the election request;
and when the difference between the latest election time recorded by the abnormally recovered management node and the current system time is greater than a preset threshold value, accepting the election request to start election.
8. The election request processing device of claim 5, said device further comprising:
and the updating module is used for updating the latest election time of the local record according to the local system time after the election is finished.
9. A management node, characterized in that the management node comprises:
one or more processors;
memory for storing one or more programs that, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-4.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810770164.0A CN109032854B (en) | 2018-07-13 | 2018-07-13 | Election request processing method and device, management node and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109032854A (en) | 2018-12-18 |
CN109032854B (en) | 2021-10-12 |
Family
ID=64642470
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810770164.0A Active CN109032854B (en) | 2018-07-13 | 2018-07-13 | Election request processing method and device, management node and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109032854B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111371572B (en) * | 2018-12-25 | 2021-09-10 | 大唐移动通信设备有限公司 | Network node election method and node equipment |
CN115378799B (en) * | 2022-10-21 | 2023-02-28 | 北京奥星贝斯科技有限公司 | Election method and device in equipment cluster based on PaxosLease algorithm |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5862348A (en) * | 1996-02-09 | 1999-01-19 | Citrix Systems, Inc. | Method and apparatus for connecting a client node to a server node based on load levels |
CN102929696B (en) * | 2012-09-28 | 2015-09-09 | 北京搜狐新媒体信息技术有限公司 | A kind of distributed system Centroid structure, submission, method for supervising and device |
CN105471995B (en) * | 2015-12-14 | 2016-08-31 | 山东省农业机械科学研究院 | Extensive Web service group of planes high availability implementation method based on SOA |
CN105915391B (en) * | 2016-06-08 | 2019-06-14 | 国电南瑞科技股份有限公司 | The distributed key assignments storage method of self-recovering function is submitted and had based on single phase |
CN107995029B (en) * | 2017-11-28 | 2019-12-13 | 新华三信息技术有限公司 | Election control method and device and election method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||