Detailed Description
In order to make those skilled in the art better understand the technical solutions in the present application, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without making creative efforts shall fall within the protection scope of the present application.
The embodiment of the application provides a distributed storage system and a leader node election method and device thereof, so that the reliability of the distributed storage system is improved.
Fig. 1 is a schematic structural diagram of a distributed storage system according to an embodiment of the present application, and as shown in fig. 1, the distributed storage system includes a plurality of data storage devices, such as data storage devices 11, 12, 13, 14, 15 and the like shown in fig. 1, the data storage devices are associated with corresponding master processes, such as master process 100 in fig. 1, the master processes are configured to divide the associated data storage devices to form a plurality of data storage nodes, such as data storage nodes 1001, 1002, 1003, 1004 and the like in fig. 1, and the master processes further configure corresponding control threads, such as control threads 10001, 10002, 10003, 10004 and the like in fig. 1, for each data storage node. In a particular embodiment, each data storage device in the distributed storage system is associated with a corresponding master process.
In the system shown in fig. 1, the data storage devices have corresponding data storage spaces, and the master process divides the associated data storage devices to form a plurality of data storage nodes.
Fig. 2a is a schematic diagram of a master process dividing an associated data storage device to form a plurality of data storage nodes according to an embodiment of the present disclosure, and fig. 2b is another schematic diagram of a master process dividing an associated data storage device to form a plurality of data storage nodes according to an embodiment of the present disclosure, as shown in fig. 2a and fig. 2b, the master process acquires a size of a data storage space of the associated data storage device, for example, 10T, and the master process divides the associated data storage device to form 5 data storage nodes in a manner that each 2T storage space corresponds to one data storage node.
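The division described above can be illustrated with a minimal sketch. The names `StorageNode` and `partition_device` are hypothetical and do not appear in the embodiment. The sketch assumes fixed-size nodes and assumes that any remainder of the storage space is left unassigned.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageNode:
    node_id: int    # hypothetical index of the node within its device
    offset_t: int   # start of the node's storage space, in T
    size_t: int     # size of the node's storage space, in T


def partition_device(capacity_t: int, node_size_t: int) -> list[StorageNode]:
    """Split a device's storage space into fixed-size data storage nodes."""
    if node_size_t <= 0:
        raise ValueError("node size must be positive")
    # Assumption: a remainder smaller than one node is left unassigned.
    count = capacity_t // node_size_t
    return [StorageNode(i, i * node_size_t, node_size_t) for i in range(count)]


# The 10T device from the example, divided at 2T per node.
nodes = partition_device(capacity_t=10, node_size_t=2)
print(len(nodes))  # 5
```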
In this embodiment, when the data storage device stores data, a data storage node corresponding to the data is determined, and the data is stored in a data storage space corresponding to the data storage node.
In this embodiment of the present application, data storage nodes located in different data storage devices are further selected and taken out to form a node group, a certain data storage node in the node group is a leader node (leader), the other data storage nodes are follower nodes (followers), and each data storage node in the node group is used to store the same data. Fig. 3 is a schematic structural diagram of a distributed storage system according to another embodiment of the present application, and as shown in fig. 3, taking three data storage devices as an example, all first data storage nodes in the three data storage devices are selected to form a first node group, and a certain first data storage node in the three data storage devices is determined to be a leader node, and all second data storage nodes in the three data storage devices are selected to form a second node group, and a certain second data storage node in the three data storage devices is determined to be a leader node, and all third data storage nodes in the three data storage devices are selected to form a third node group, and a certain third data storage node in the three data storage devices is determined to be a leader node. When data distributed storage is carried out, after the leader node acquires a data storage request, the leader node stores data in the corresponding data storage space and notifies other follower data storage nodes to store the same data in the corresponding data storage space, so that the purpose of distributed storage of a copy of data in different data storage devices is achieved.
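The grouping of fig. 3 can be sketched as follows. The function name `form_node_groups` and the device and node labels are hypothetical. The sketch assumes that the i-th node of every device is placed in the i-th node group, as in the three-device example; how the leader of each group is chosen is handled by the election and is not shown here.

```python
def form_node_groups(devices: dict[str, list[str]]) -> list[list[tuple[str, str]]]:
    """Place the i-th data storage node of every device into the i-th node group."""
    per_device = [[(dev, node) for node in nodes] for dev, nodes in devices.items()]
    # zip stops at the shortest device, so every group holds exactly one
    # node from each device and no two members share a device.
    return [list(group) for group in zip(*per_device)]


devices = {
    "device-1": ["node-1", "node-2", "node-3"],
    "device-2": ["node-1", "node-2", "node-3"],
    "device-3": ["node-1", "node-2", "node-3"],
}
groups = form_node_groups(devices)
for group in groups:
    print(group)
```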
In the distributed storage system, each data storage node is configured with a corresponding control thread by the main control process, the main control process and the configured control thread share a memory, and the control thread configured by the main control process can be regarded as a thread object generated by the main control process. The control thread is used for maintaining leader node election work of the node group, and the control thread can determine the node group to which the corresponding data storage node belongs and maintain the leader node election work of the node group when the leader node does not exist in the node group.
Based on the above-described distributed data storage system, fig. 4 is a schematic flowchart of a leader node election method according to an embodiment of the present application, and as shown in fig. 4, the flow includes the following steps:
step S402, if the data storage node is not a leader node and the corresponding control thread does not receive, within the time threshold, a heartbeat message sent by the leader node in the node group to which the data storage node belongs, determining the data storage node as a candidate node in the node group to which the data storage node belongs, wherein each data storage node in the node group is used for storing the same data and the data storage nodes in the node group are respectively located in different data storage devices;
step S404, sending first election information to the control threads corresponding to the other data storage nodes in the node group through the control thread corresponding to the data storage node, so as to instruct the control threads corresponding to the other data storage nodes in the node group to feed back corresponding first response information according to the first election information, and determining whether the data storage node is selected as a leader node of the node group according to the fed back first response information.
In this embodiment, RPC (Remote Procedure Call Protocol) communication may be performed between the master processes in each data storage device, and when a leader node exists in the node group, the master process corresponding to the leader node may send a heartbeat message to the master processes corresponding to other data storage nodes in the node group periodically through RPC communication.
In this embodiment, if the data storage node is not a leader node and the control thread corresponding to the data storage node does not acquire, within the time threshold, a heartbeat message sent by the leader node in the node group to which the data storage node belongs, the control thread corresponding to the data storage node determines that the data storage node is a candidate node in the node group to which the data storage node belongs.
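The candidate condition of step S402 can be expressed as a small check. This is a sketch only: the function name and its parameters are hypothetical, timestamps are plain seconds, and treating "elapsed time strictly greater than the threshold" as a timeout is an assumption, since the embodiment does not state how the boundary case is handled.

```python
def should_become_candidate(is_leader: bool, last_heartbeat: float,
                            now: float, threshold: float) -> bool:
    """Return True when a non-leader node has gone too long without a heartbeat."""
    if is_leader:
        return False  # a leader sends heartbeats; it does not wait for them
    # Assumption: the boundary case (elapsed == threshold) is not a timeout.
    return (now - last_heartbeat) > threshold


# A follower that last heard from the leader 6 s ago, with a 5 s threshold.
print(should_become_candidate(False, last_heartbeat=100.0, now=106.0, threshold=5.0))  # True
```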
The control thread corresponding to the data storage node may send first election information to the control threads corresponding to other data storage nodes in the node group, where the first election information includes the number of received data storage requests and election weight corresponding to the data storage node, the number of received data storage requests corresponding to the data storage node is the number of accepted data storage requests of the control thread corresponding to the data storage node, the control thread corresponding to the data storage node further instructs the control threads corresponding to the other data storage nodes in the node group to feed back corresponding first response information according to the first election information, and the first response information includes first consent sub-information or first rejection sub-information.
Instructing the control threads corresponding to the other data storage nodes in the node group to feed back corresponding first response information according to the first election information specifically comprises: instructing each control thread corresponding to another data storage node in the node group to feed back first consent sub-information to the control thread corresponding to the data storage node when it determines that its own number of received data storage requests is smaller than the number of received data storage requests corresponding to the data storage node and that the election weight of its own data storage node is smaller than the election weight of the data storage node, and otherwise to feed back first rejection sub-information to the control thread corresponding to the data storage node.
Specifically, after receiving the first election information, the control thread corresponding to each other data storage node compares, according to the first election information, the number of received data storage requests of the control thread that sent the first election information with its own number of received data storage requests, and compares the election weight of the data storage node corresponding to the control thread that sent the first election information with the election weight of its own data storage node. If the number of received data storage requests of the sending control thread is greater than its own number, and the election weight of the data storage node corresponding to the sending control thread is greater than the election weight of its own data storage node, the control thread returns first consent sub-information to the control thread that sent the first election information; otherwise, it returns first rejection sub-information. The first consent sub-information indicates consent to the corresponding data storage node being elected as the leader node, and the first rejection sub-information indicates refusal of the corresponding data storage node being elected as the leader node.
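The response rule for the first election information can be sketched as below. `ElectionInfo`, `respond`, and the string values "consent" and "reject" are hypothetical stand-ins for the first election information and the first consent and rejection sub-information. Consent is returned only when both the request count and the election weight of the receiving node are strictly smaller than those of the sender.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElectionInfo:
    request_count: int    # number of received data storage requests
    election_weight: int  # election weight of the data storage node


def respond(own: ElectionInfo, received: ElectionInfo) -> str:
    """Decide the response of a receiving control thread to election information."""
    if (own.request_count < received.request_count
            and own.election_weight < received.election_weight):
        return "consent"
    return "reject"


candidate = ElectionInfo(request_count=120, election_weight=4)
print(respond(ElectionInfo(100, 3), candidate))  # consent
print(respond(ElectionInfo(130, 3), candidate))  # reject
```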
In this embodiment, each time data is stored, the control thread corresponding to the leader node receives the data storage request first, and then sends a data storage request to the control threads corresponding to the other data storage nodes in its node group according to the received request, thereby implementing distributed storage of data. The control threads corresponding to the other data storage nodes in the node group may fail to receive the data storage request sent by the control thread corresponding to the leader node because of a communication failure, which reduces the reliability of data storage. Therefore, to ensure that a relatively reliable data storage node is selected as the leader node, the number of received data storage requests is used as one of the factors for selecting the leader node, so that the communication condition of the selected leader node is relatively stable and the selected leader node is relatively reliable.
In this embodiment, the election weight indicates the weight with which a data storage node participates in the election of the leader node. For a given control thread, whenever the control thread receives, through its corresponding master process, a notification message sent by the control thread corresponding to another data storage node in the same node group indicating that the other data storage node has become a candidate node, the control thread increases the election weight of its corresponding data storage node by 1; and whenever the control thread determines that its own corresponding data storage node has become a candidate node, the control thread likewise increases the election weight of its corresponding data storage node by 1.
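The bookkeeping of the election weight can be sketched as a small counter. The class and method names are hypothetical, and the initial value of 0 is an assumption, because the embodiment does not state the starting weight.

```python
class ElectionWeight:
    """Tracks the election weight of one data storage node."""

    def __init__(self, initial: int = 0) -> None:
        # Assumption: the starting value is not given in the embodiment.
        self.value = initial

    def on_peer_became_candidate(self) -> None:
        """A peer in the same node group announced that it became a candidate."""
        self.value += 1

    def on_self_became_candidate(self) -> None:
        """This node itself became a candidate."""
        self.value += 1


weight = ElectionWeight()
weight.on_peer_became_candidate()
weight.on_self_became_candidate()
print(weight.value)  # 2
```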
In this embodiment, the control thread receives the first response information returned by the control threads corresponding to the other data storage nodes in the node group based on the first election information. The control thread may determine, according to the fed-back first response information, whether the corresponding data storage node is selected as the leader node of the node group. Specifically, if the control thread determines from the fed-back first response information that the number of received first consent sub-information is not less than the number threshold, it determines that the corresponding data storage node is selected as the leader node of the node group; otherwise, it determines that the corresponding data storage node is not selected as the leader node of the node group. The number threshold may be half of the number of nodes in the node group to which the data storage node belongs.
Specifically, when the number of first consent sub-information received by the control thread is not less than the number threshold, this indicates that at least the threshold number of data storage nodes agree to elect the data storage node corresponding to the control thread as the leader node, and therefore the control thread determines that the corresponding data storage node is selected as the leader node of the node group. Otherwise, if the number of first consent sub-information is less than the number threshold, the control thread determines that the corresponding data storage node is not selected as the leader node of the node group.
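The tally can be sketched as follows, taking the number threshold as half of the number of nodes in the node group. The function name and the "consent"/"reject" strings are hypothetical. The sketch counts only the consents received from other nodes; whether the candidate also counts itself is not stated in the embodiment and is not assumed here.

```python
def is_elected(responses: list[str], group_size: int) -> bool:
    """Return True when the received consents reach the number threshold."""
    threshold = group_size / 2  # half of the nodes in the node group
    consents = sum(1 for r in responses if r == "consent")
    return consents >= threshold  # "not less than the number threshold"


# A three-node group: the candidate hears back from the two other nodes.
print(is_elected(["consent", "consent"], group_size=3))  # True
print(is_elected(["consent", "reject"], group_size=3))   # False
```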
Therefore, in the embodiment of the application, the election process of the leader node can be maintained by the control thread corresponding to each data storage node, so that the effect of putting the election work of the leader node down to each data storage device is achieved, and after the data storage device where the leader node is located is down, other data storage devices can autonomously elect the leader node, so that the data storage process of the distributed storage system is not affected, and the reliability of the distributed storage system is improved.
In this embodiment, RPC communication may be performed between the main control processes in each data storage device, and the control thread may send the first election information to the control threads corresponding to the other data storage nodes in the node group as follows: the control thread sends the first election information to the control threads corresponding to the other data storage nodes in the node group through the communication relation between the main control process corresponding to its data storage node and the main control processes corresponding to the other data storage nodes in the node group. In this embodiment, the control thread corresponding to the data storage node may further receive, through the communication relation, the first response information returned by the control threads corresponding to the other data storage nodes in the node group.
Specifically, fig. 5a is a schematic diagram of a leader node election process provided in an embodiment of the present application, and as shown in fig. 5a, a master process corresponding to a control thread sends first election information to master processes corresponding to other data storage nodes in an affiliated node group through RPC communication, and receives first response information returned by the master processes corresponding to the other data storage nodes in the affiliated node group through RPC communication.
Because the main control process and the control thread configured by the main control process share the memory, the main control process can acquire from the memory the first election information that the control thread needs to send, and the control thread can acquire from the memory the first response information received by the main control process. Similarly, the main control processes corresponding to the other data storage nodes may acquire the first response information that needs to be sent, and the control threads corresponding to the other data storage nodes may acquire the first election information received by their corresponding main control processes.
In this embodiment, after another data storage node in the node group becomes a candidate node, if the control thread corresponding to the data storage node receives second election information sent by the control thread corresponding to that other data storage node, the control thread corresponding to the data storage node returns second response information to the control thread that sent the second election information according to the second election information, so as to indicate whether it agrees that the data storage node sending the second election information be elected as the leader node. The process by which the other data storage nodes become candidate nodes is the same as the process, described earlier, by which the data storage node becomes a candidate node.
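The hand-off through shared memory can be illustrated with two in-memory queues. This is only a sketch: the queues `outbox` and `inbox`, the message dictionaries, and `fake_rpc_send` are hypothetical, and `fake_rpc_send` stands in for the RPC exchange between main control processes, which is not modeled. The main thread plays the role of the main control process.

```python
import queue
import threading

outbox = queue.Queue()  # election information the control thread wants sent
inbox = queue.Queue()   # response information received by the main control process


def control_thread(result: list) -> None:
    """Place election information in shared memory, then wait for the response."""
    outbox.put({"type": "first_election", "request_count": 12, "election_weight": 3})
    result.append(inbox.get(timeout=5))


def fake_rpc_send(message: dict) -> dict:
    """Hypothetical stand-in for the RPC call to a peer main control process."""
    return {"type": "first_response", "vote": "consent"}


result: list = []
worker = threading.Thread(target=control_thread, args=(result,))
worker.start()
message = outbox.get(timeout=5)    # main control process reads from shared memory
inbox.put(fake_rpc_send(message))  # and hands the peer's response back
worker.join()
print(result[0]["vote"])  # consent
```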
Specifically, the second election information includes the number of data storage requests received by the control thread that sends out the second election information and the corresponding election weight, the election weight corresponding to the control thread that sends out the second election information is the election weight of the data storage node corresponding to the control thread that sends out the second election information, the second response information includes the second consent sub-information or the second rejection sub-information, and the control thread corresponding to the data storage node returns the second response information to the control thread that sends out the second election information according to the second election information, including:
if the number of received data storage requests of the control thread sending the second election information is larger than the number of received data storage requests of the control thread corresponding to the data storage node, and the election weight of the data storage node corresponding to the control thread sending the second election information is larger than the election weight of the data storage node, returning second consent sub-information to the control thread sending the second election information, and otherwise, returning second rejection sub-information to the control thread sending the second election information.
Correspondingly, if the control thread sending out the second election information receives second consent sub-information returned by the control thread corresponding to the data storage nodes not smaller than the quantity threshold value in the node group, it is determined that the data storage node corresponding to the control thread sending out the second election information is elected as the leader node of the node group.
In order to avoid the situation that a plurality of candidate nodes exist in the node group, in this embodiment, if the control thread determines, according to the received second election information, that the election weight of the data storage node corresponding to the control thread that sent the second election information is greater than the election weight of its own corresponding data storage node, the control thread degrades its corresponding data storage node from a candidate node to a follower node, and adjusts the election weight of its corresponding data storage node to be the same as the election weight of the data storage node corresponding to the control thread that sent the second election information.
Similarly, if another control thread determines, according to the received first election information, that the election weight of its own corresponding data storage node is smaller than the election weight corresponding to the data storage node, that control thread degrades its corresponding data storage node to a follower node in the node group, and adjusts the election weight of its corresponding data storage node to be the same as the election weight of the data storage node corresponding to the control thread that sent the first election information.
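The step-down behavior can be sketched as a small state update. `NodeState` and `on_election_info` are hypothetical names. The sketch applies the rule to whatever role the receiving node currently holds; the embodiment describes it for candidates and for the other members of the node group, so the handling of any further case is an assumption.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    role: str             # "candidate" or "follower" in this sketch
    election_weight: int


def on_election_info(state: NodeState, sender_weight: int) -> NodeState:
    """Step down and adopt the sender's weight when the sender's weight is greater."""
    if sender_weight > state.election_weight:
        state.role = "follower"
        state.election_weight = sender_weight
    return state


state = on_election_info(NodeState(role="candidate", election_weight=2), sender_weight=4)
print(state.role, state.election_weight)  # follower 4
```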
In this embodiment, RPC communication may be performed between the master control processes in each data storage device, and the control thread may receive second election information sent by the control threads corresponding to other data storage nodes through RPC communication between the corresponding master control process and the master control processes corresponding to other data storage nodes, and return second response information to the control threads corresponding to other data storage nodes.
Specifically, fig. 5b is a schematic diagram of a leader node election process provided in an embodiment of the present application. As shown in fig. 5b, the master control process corresponding to a control thread receives, through RPC communication, second election information sent by the master control process corresponding to a candidate node, and sends second response information to the master control process corresponding to the candidate node through RPC communication. Because the master control process and the control thread configured by the master control process share the memory, the master control process can acquire from the memory the second response information that the control thread needs to send, and the control thread can acquire from the memory the second election information received by the master control process. Similarly, the master control process corresponding to the candidate node may acquire the second election information to be sent, and the control thread corresponding to the candidate node may acquire the second response information.
In this embodiment, if the data storage node is elected as the leader node of the node group, the master control process corresponding to the data storage node may also send a heartbeat message to master control processes corresponding to other data storage nodes except the leader node in the node group by periodically performing RPC communication, so that control threads corresponding to other data storage nodes receive the heartbeat message, and keep heartbeat connection between the leader node and each other data storage node.
In this embodiment, if the data storage node is not elected as the leader node of the node group, the master control process corresponding to the data storage node may also receive, periodically through RPC communication, a heartbeat message sent by the master control process corresponding to the leader node of the node group, thereby periodically obtaining the heartbeat message from the leader node.
Fig. 6 is a schematic structural diagram of a distributed storage system according to another embodiment of the present application, and as shown in fig. 6, a data storage device is further associated with corresponding data read/write processes, such as the data read/write processes 10005 and 10006 shown in fig. 6, where one data read/write process corresponds to one or more data storage nodes in the data storage device.
The data reading and writing process is used for receiving a first data storage request aiming at the data storage node after the data storage node is selected as a leader node of the node group, storing data in a storage space corresponding to the data storage node according to the first data storage request, and sending a second data storage request to data reading and writing processes corresponding to other data storage nodes in the node group according to the first data storage request so as to indicate the other data storage nodes in the node group to store the data in the corresponding storage space.
Specifically, the data read-write process is used to implement data storage and data reading in the distributed storage system. As shown in fig. 6, a plurality of data storage nodes may share one data read-write process, or each data storage node may correspond to its own data read-write process. In this embodiment, a control thread can obtain, from a data read-write process, the number of received data storage requests and the election level information corresponding to a data storage node through HTTP (HyperText Transfer Protocol) communication, where the data read-write process and the control thread corresponding to the data storage node served by that data read-write process share a memory. When storing data, a node used for storing data in the distributed storage system first determines the node group corresponding to the data to be stored, and then sends a data storage request to the data read-write process corresponding to each data storage node in the node group. After receiving the data storage request, the data read-write process corresponding to the leader node in the node group responds to the request by writing the data into the storage space corresponding to the leader node. The data read-write process corresponding to the leader node also sends a data storage request to the data read-write processes corresponding to the other nodes in the node group, so that those data read-write processes also store the same data in their corresponding storage spaces, thereby completing distributed storage of the data.
The data reading and writing process corresponding to the leader node sends the data storage request to the data reading and writing processes corresponding to other data storage nodes in the node group, and the data storage request can be realized through RPC communication between the master control processes in each data storage device.
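The write path through the leader can be sketched in memory as follows. `ReadWriteProcess` and its methods are hypothetical, a dictionary stands in for the storage space of a data storage node, and a direct method call stands in for the RPC communication between devices.

```python
class ReadWriteProcess:
    """In-memory stand-in for the data read-write process of one data storage node."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.storage: dict[str, bytes] = {}        # the node's storage space
        self.peers: list["ReadWriteProcess"] = []  # other nodes in the node group

    def handle_first_request(self, key: str, value: bytes) -> None:
        """Leader path: store locally, then send a second request to every peer."""
        self.storage[key] = value
        for peer in self.peers:
            # A direct call replaces the RPC between devices in this sketch.
            peer.handle_second_request(key, value)

    def handle_second_request(self, key: str, value: bytes) -> None:
        """Follower path: store the same data in the local storage space."""
        self.storage[key] = value


leader = ReadWriteProcess("device-1/node-1")
followers = [ReadWriteProcess("device-2/node-1"), ReadWriteProcess("device-3/node-1")]
leader.peers = followers
leader.handle_first_request("block-42", b"payload")
print(all(f.storage == leader.storage for f in followers))  # True
```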
In this embodiment, the data storage request may be used to store a plurality of pieces of data to be stored, and the data read-write process may store the pieces of data to be stored in parallel in the storage space corresponding to the corresponding data storage node when it determines that there is no dependency relationship among them. Whether a dependency relationship exists may be determined according to whether a piece of data to be stored carries a dependency identifier. For example, if a piece of data to be stored does not carry a dependency identifier, it is determined that the piece of data does not depend on other data to be stored; if a piece of data to be stored carries a dependency identifier, the data to be stored pointed to by the dependency identifier is taken as the data on which that piece of data depends.
Therefore, according to the embodiment, when the data to be stored does not have a dependency relationship, the data to be stored can be stored in parallel in the storage space corresponding to the corresponding data storage node, so that the data can be written in parallel, and the data storage efficiency is improved.
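The parallel write can be sketched with a thread pool. The function `store_items`, the item dictionaries, and the `depends_on` key used as the dependency identifier are hypothetical. Falling back to sequential storage in the given order when any item carries a dependency identifier is an assumption; the embodiment only describes the parallel case.

```python
from concurrent.futures import ThreadPoolExecutor


def store_items(items: list[dict], storage: dict) -> str:
    """Store items in parallel when none of them carries a dependency identifier."""

    def store(item: dict) -> None:
        storage[item["key"]] = item["value"]

    if any("depends_on" in item for item in items):
        # Assumption: dependent data is written one item at a time, in order.
        for item in items:
            store(item)
        return "sequential"
    with ThreadPoolExecutor() as pool:
        list(pool.map(store, items))  # consume the iterator to surface errors
    return "parallel"


storage: dict = {}
independent = [{"key": "a", "value": 1}, {"key": "b", "value": 2}]
mode = store_items(independent, storage)
print(mode, sorted(storage.items()))  # parallel [('a', 1), ('b', 2)]
```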
In summary, the distributed storage system in the embodiment of the present application has at least the following advantages:
(1) Each data storage device is provided with a plurality of control threads and an independent master control process, and the election process of the leader node can be maintained by the control threads corresponding to the data storage nodes, so that the election work of the leader node is delegated to each data storage device, decentralization of the distributed storage system is achieved, and the bottleneck caused by centralized management of the devices is avoided.
(2) When a large number of data storage devices where the leader nodes are located in the distributed storage system are down, other data storage devices can autonomously select the leader nodes, so that the data storage process of the distributed storage system is not affected, the efficiency and reliability of the distributed storage system are improved, and the fault recovery efficiency of the distributed storage system is improved.
(3) The master control process and the control threads are responsible for heartbeat communication among the data storage nodes and for election of the leader node, while the data read-write process is responsible for data storage and reading, so that separation of a control layer and a data layer in the distributed storage system is realized, and usability of the distributed storage system is improved.
(4) When the data to be stored do not have a dependency relationship, the data reading and writing process can store the data to be stored in parallel in the storage space corresponding to the corresponding data storage node, so that the data can be written in parallel, and the data storage efficiency is improved.
Based on the above description, an embodiment of the present application further provides a distributed storage system, including: a plurality of data storage devices having associated therewith a corresponding master process. And the main control process is used for dividing the associated data storage equipment to form a plurality of data storage nodes and configuring corresponding control threads for each data storage node. And the control thread is used for determining that the data storage node is a candidate node in the node group to which the data storage node belongs if the corresponding data storage node is not the leader node and does not receive the heartbeat message sent by the leader node in the node group to which the data storage node belongs within the time threshold, sending first election information to the control threads corresponding to other data storage nodes in the node group to indicate the control threads corresponding to the other data storage nodes in the node group to feed back corresponding first response information according to the first election information, and determining whether the data storage node is selected as the leader node of the node group according to the fed back first response information. And each data storage node in the node group is used for storing the same data and is respectively positioned in different data storage devices.
Optionally, the first election information includes the number of received data storage requests and election weight corresponding to the data storage node, the first response information includes first consent sub-information or first rejection sub-information, and the control thread is specifically configured to: and indicating the control threads corresponding to other data storage nodes in the node group, feeding back first consent sub-information to the control thread corresponding to the data storage node when determining that the number of the received data storage requests per se is smaller than the number of the received data storage requests corresponding to the data storage node and the election weight of the data storage node per se is smaller than the election weight of the data storage node, and otherwise, feeding back first rejection sub-information to the control thread corresponding to the data storage node.
Optionally, the control thread is specifically configured to: if it is determined from the fed-back first response information that the number of pieces of first consent sub-information received is not less than the number threshold, determine that the data storage node is elected as the leader node of the node group; otherwise, determine that the data storage node is not elected as the leader node of the node group.
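The tally described above amounts to counting consents against the number threshold. In the sketch below, the function name is_elected and the string encoding of the responses are assumptions; the value of the number threshold is not fixed by this passage and is passed in as a parameter.

```python
from typing import Iterable


def is_elected(responses: Iterable[str], number_threshold: int) -> bool:
    """Return True if the number of consents is not less than the number threshold."""
    consents = sum(1 for response in responses if response == "consent")
    return consents >= number_threshold
```

For example, with an illustrative threshold of 2, the responses "consent", "reject", "consent" would elect the candidate, while a single consent would not.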
Optionally, the control thread is further configured to: if second election information sent by a control thread corresponding to another data storage node in the node group is received, return second response information to the control thread that sent the second election information according to the second election information, the second response information indicating whether it is agreed that the data storage node that sent the second election information be elected as the leader node.
Optionally, the second election information includes the number of data storage requests received by the control thread that sent the second election information and the corresponding election weight, and the second response information includes second consent sub-information or second rejection sub-information. The control thread is further specifically configured to: if it is determined from the second election information that the number of data storage requests received by the control thread that sent the second election information is greater than the number of data storage requests received by the data storage node, and that the election weight corresponding to the control thread that sent the second election information is greater than the election weight of the data storage node, return second consent sub-information to the control thread that sent the second election information; otherwise, return second rejection sub-information to that control thread.
Optionally, the control thread is further configured to: if it is determined from the second election information that the election weight corresponding to the control thread that sent the second election information is greater than the election weight of the data storage node, demote the data storage node from a candidate node to a follower node.
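The demotion step can be sketched as a small state transition. The names Role and role_after_second_election are hypothetical, and the sketch assumes that only a node currently in the candidate role is affected.

```python
from enum import Enum


class Role(Enum):
    CANDIDATE = "candidate"
    FOLLOWER = "follower"


def role_after_second_election(role: Role, own_weight: int, sender_weight: int) -> Role:
    """Demote a candidate to follower when the sender's election weight is strictly greater."""
    if role is Role.CANDIDATE and sender_weight > own_weight:
        return Role.FOLLOWER
    return role
```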
Optionally, the control thread is specifically configured to send the first election information to the control threads corresponding to the other data storage nodes in the node group through the communication relation between the master process corresponding to the data storage node and the master processes corresponding to the other data storage nodes in the node group. The control thread is further configured to receive, through the same communication relation, the first response information returned by the control threads corresponding to the other data storage nodes in the node group.
Optionally, the data storage device is further associated with a corresponding data read-write process, where, within the data storage device, the data read-write process corresponds to a plurality of data storage nodes. The data read-write process is configured to: after the data storage node is elected as the leader node of the node group, receive a first data storage request for the data storage node and store data in the storage space corresponding to the data storage node according to the first data storage request; and send, according to the first data storage request, a second data storage request to the data read-write processes corresponding to the other data storage nodes in the node group, so as to instruct the other data storage nodes in the node group to store the data in their corresponding storage spaces.
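The write path described above can be sketched in memory as follows. This is an illustration under stated assumptions, not the implementation of the present application: the class ReadWriteProcess, the function handle_first_request, and the dictionary-based mapping of nodes to devices are all hypothetical, and the second data storage request is modelled as a direct method call rather than a network message.

```python
from typing import Dict, List


class ReadWriteProcess:
    """Hypothetical data read-write process serving several nodes of one device."""

    def __init__(self) -> None:
        self.spaces: Dict[str, List[bytes]] = {}  # node id -> data stored for that node

    def store(self, node_id: str, data: bytes) -> None:
        self.spaces.setdefault(node_id, []).append(data)


def handle_first_request(
    leader_id: str,
    data: bytes,
    group: Dict[str, str],                  # node id -> device id, one node group
    processes: Dict[str, ReadWriteProcess]  # device id -> read-write process
) -> None:
    # Store locally in the storage space of the leader node.
    processes[group[leader_id]].store(leader_id, data)
    # Stand-in for the second data storage request sent to the other nodes.
    for node_id, device_id in group.items():
        if node_id != leader_id:
            processes[device_id].store(node_id, data)
```

After the call, every data storage node in the node group holds the same data, each on a different device.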
Optionally, the first data storage request is used to store a plurality of pieces of data to be stored, and the data read-write process is specifically configured to: when it is determined that there is no dependency relationship among the pieces of data to be stored, store them in parallel in the storage space corresponding to the data storage node.
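The parallel storage step can be sketched with a thread pool. The function store_items, the depends_on mapping used to express dependency relationships, and the sequential fallback for dependent data are assumptions made for the example; the passage above only specifies the behaviour when no dependency exists.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Set


def store_items(
    items: Dict[str, bytes],
    depends_on: Dict[str, Set[str]],  # hypothetical: key -> keys it depends on
    storage: Dict[str, bytes],
) -> bool:
    """Store items in parallel when none of them has a dependency; return whether it did so."""
    independent = not any(depends_on.get(key) for key in items)
    if independent:
        with ThreadPoolExecutor() as pool:
            # list() forces completion and surfaces any exception raised by a worker.
            list(pool.map(lambda kv: storage.__setitem__(*kv), items.items()))
    else:
        # Assumed fallback: store in the order given when dependencies exist.
        for key, value in items.items():
            storage[key] = value
    return independent
```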
In this embodiment of the application, the election of the leader node is maintained by the control thread corresponding to each data storage node, so that the election work is delegated to the individual data storage devices. When the data storage device hosting the leader node goes down, the other data storage devices can elect a new leader node autonomously, so the data storage process of the distributed storage system is not affected and the reliability of the distributed storage system is improved.
For the specific process of this embodiment, reference may also be made to the description of the method section; the same beneficial effects are achieved and are not repeated here.
Further, an embodiment of the present application provides a leader node election device for a distributed storage system, where the distributed storage system includes a plurality of data storage devices, each data storage device is associated with a corresponding master process, and each master process is configured to divide the associated data storage device to form a plurality of data storage nodes, and configure corresponding control threads for each data storage node, where fig. 7 is a schematic structural diagram of the leader node election device provided in an embodiment of the present application, and as shown in fig. 7, the device includes: a node determination module 71 and a node election module 72.
The node determining module 71 is configured to determine that the data storage node is a candidate node in the node group to which the data storage node belongs if the data storage node is not a leader node and the corresponding control thread does not receive, within the time threshold, a heartbeat message sent by the leader node of that node group, where the data storage nodes in the node group store the same data and are located in different data storage devices.
The node election module 72 is configured to send first election information to the control threads corresponding to the other data storage nodes in the node group through the control thread corresponding to the data storage node, so as to instruct the control threads corresponding to the other data storage nodes in the node group to feed back corresponding first response information according to the first election information, and determine whether the data storage node is selected as a leader node of the node group according to the fed back first response information.
In this embodiment, the first election information includes the number of data storage requests received by the data storage node and the election weight of the data storage node, and the first response information includes first consent sub-information or first rejection sub-information. The node election module 72 is specifically configured to instruct each control thread corresponding to another data storage node in the node group to feed back first consent sub-information to the control thread corresponding to the data storage node when that other control thread determines that the number of data storage requests it has itself received is smaller than the number of data storage requests received by the data storage node and that the election weight of its own data storage node is smaller than the election weight of the data storage node, and otherwise to feed back first rejection sub-information to the control thread corresponding to the data storage node.
In this embodiment, the node election module 72 is specifically configured to: if it is determined from the fed-back first response information that the number of pieces of first consent sub-information received is not less than the number threshold, determine that the data storage node is elected as the leader node of the node group; otherwise, determine that the data storage node is not elected as the leader node of the node group.
In this embodiment, the device further includes an information sending module, configured to: if the control thread corresponding to the data storage node receives second election information sent by a control thread corresponding to another data storage node in the node group, return second response information to the control thread that sent the second election information according to the second election information, the second response information indicating whether it is agreed that the data storage node that sent the second election information be elected as the leader node.
In this embodiment, the second election information includes the number of data storage requests received by the control thread that sent the second election information and the corresponding election weight, and the second response information includes second consent sub-information or second rejection sub-information. The information sending module is specifically configured to: if it is determined from the second election information that the number of data storage requests received by the control thread that sent the second election information is greater than the number of data storage requests received by the data storage node, and that the election weight corresponding to the control thread that sent the second election information is greater than the election weight of the data storage node, return second consent sub-information to the control thread that sent the second election information; otherwise, return second rejection sub-information to that control thread.
In this embodiment, the device further includes a demotion module, configured to demote the data storage node from a candidate node to a follower node if it is determined from the second election information that the election weight corresponding to the control thread that sent the second election information is greater than the election weight of the data storage node.
In this embodiment, the node election module 72 is specifically configured to send the first election information to the control threads corresponding to the other data storage nodes in the node group through the communication relation between the master process corresponding to the data storage node and the master processes corresponding to the other data storage nodes in the node group. The device further includes an information receiving module, configured to receive, through the same communication relation, the first response information returned by the control threads corresponding to the other data storage nodes in the node group.
In this embodiment of the application, the election of the leader node is maintained by the control thread corresponding to each data storage node, so that the election work is delegated to the individual data storage devices. When the data storage device hosting the leader node goes down, the other data storage devices can elect a new leader node autonomously, so the data storage process of the distributed storage system is not affected and the reliability of the distributed storage system is improved.
For the specific process of this embodiment, reference may also be made to the description of the foregoing method section; the same beneficial effects are achieved and are not repeated here.
Further, an embodiment of the present application further provides a leader node election device for a distributed storage system. Fig. 8 is a schematic structural diagram of the leader node election device provided in an embodiment of the present application. As shown in fig. 8, the leader node election device may vary considerably depending on its configuration or performance, and may include one or more processors 901 and a memory 902, where one or more application programs or data may be stored in the memory 902. The memory 902 may be transient storage or persistent storage. An application program stored in the memory 902 may include one or more modules (not shown), each of which may include a series of computer-executable instructions for the leader node election device. Still further, the processor 901 may be configured to communicate with the memory 902 and to execute, on the leader node election device, the series of computer-executable instructions in the memory 902. The leader node election device may also include one or more power supplies 903, one or more wired or wireless network interfaces 904, one or more input-output interfaces 905, one or more keyboards 906, and the like.
In a particular embodiment, the leader node election device includes a memory and one or more programs, where the one or more programs are stored in the memory and may include one or more modules, each module may include a series of computer-executable instructions for the leader node election device, and the one or more programs are configured to be executed by one or more processors and include computer-executable instructions for:
if the data storage node is not a leader node and the corresponding control thread does not receive, within the time threshold, a heartbeat message sent by the leader node of the node group to which the data storage node belongs, determining that the data storage node is a candidate node in that node group, where the data storage nodes in the node group store the same data and are located in different data storage devices;
and sending first election information to control threads corresponding to other data storage nodes in the node group through the control thread corresponding to the data storage node to instruct the control threads corresponding to the other data storage nodes in the node group to feed back corresponding first response information according to the first election information, and determining whether the data storage node is selected as a leader node of the node group according to the fed back first response information.
Optionally, when the computer-executable instructions are executed, the first election information includes the number of data storage requests received by the data storage node and the election weight of the data storage node, and the first response information includes first consent sub-information or first rejection sub-information. Instructing the control threads corresponding to the other data storage nodes in the node group to feed back corresponding first response information according to the first election information includes: instructing each such control thread to feed back first consent sub-information to the control thread corresponding to the data storage node when it determines that the number of data storage requests it has itself received is smaller than the number of data storage requests received by the data storage node and that the election weight of its own data storage node is smaller than the election weight of the data storage node, and otherwise to feed back first rejection sub-information to the control thread corresponding to the data storage node.
Optionally, when the computer-executable instructions are executed, determining whether the data storage node is elected as the leader node of the node group according to the fed-back first response information includes: if it is determined from the fed-back first response information that the number of pieces of first consent sub-information received is not less than the number threshold, determining that the data storage node is elected as the leader node of the node group; otherwise, determining that the data storage node is not elected as the leader node of the node group.
Optionally, the computer-executable instructions, when executed, further implement: if the control thread corresponding to the data storage node receives second election information sent by a control thread corresponding to another data storage node in the node group, returning second response information to the control thread that sent the second election information according to the second election information, the second response information indicating whether it is agreed that the data storage node that sent the second election information be elected as the leader node.
Optionally, when the computer-executable instructions are executed, the second election information includes the number of data storage requests received by the control thread that sent the second election information and the corresponding election weight, and the second response information includes second consent sub-information or second rejection sub-information. Returning second response information to the control thread that sent the second election information according to the second election information includes: if it is determined from the second election information that the number of data storage requests received by the control thread that sent the second election information is greater than the number of data storage requests received by the data storage node, and that the election weight corresponding to the control thread that sent the second election information is greater than the election weight of the data storage node, returning second consent sub-information to the control thread that sent the second election information; otherwise, returning second rejection sub-information to that control thread.
Optionally, the computer-executable instructions, when executed, further implement: if it is determined from the second election information that the election weight corresponding to the control thread that sent the second election information is greater than the election weight of the data storage node, demoting the data storage node from a candidate node to a follower node.
Optionally, when the computer-executable instructions are executed, sending first election information to the control threads corresponding to the other data storage nodes in the node group includes: sending the first election information to those control threads through the communication relation between the master process corresponding to the data storage node and the master processes corresponding to the other data storage nodes in the node group. The instructions further implement: receiving, by the control thread corresponding to the data storage node and through the same communication relation, the first response information returned by the control threads corresponding to the other data storage nodes in the node group.
In this embodiment of the application, the election of the leader node is maintained by the control thread corresponding to each data storage node, so that the election work is delegated to the individual data storage devices. When the data storage device hosting the leader node goes down, the other data storage devices can elect a new leader node autonomously, so the data storage process of the distributed storage system is not affected and the reliability of the distributed storage system is improved.
For the specific process of this embodiment, reference may also be made to the description of the foregoing method section; the same beneficial effects are achieved and are not repeated here.
Further, an embodiment of the present application also provides a storage medium for storing computer-executable instructions. In a specific embodiment, the storage medium may be a USB flash drive, an optical disk, a hard disk, or the like, and the computer-executable instructions stored in the storage medium, when executed by a processor, implement the following process:
if the data storage node is not a leader node and the corresponding control thread does not receive, within the time threshold, a heartbeat message sent by the leader node of the node group to which the data storage node belongs, determining that the data storage node is a candidate node in that node group, where the data storage nodes in the node group store the same data and are located in different data storage devices;
and sending first election information to control threads corresponding to other data storage nodes in the node group through the control thread corresponding to the data storage node to instruct the control threads corresponding to the other data storage nodes in the node group to feed back corresponding first response information according to the first election information, and determining whether the data storage node is selected as a leader node of the node group according to the fed back first response information.
Optionally, when the computer-executable instructions are executed, the first election information includes the number of data storage requests received by the data storage node and the election weight of the data storage node, and the first response information includes first consent sub-information or first rejection sub-information. Instructing the control threads corresponding to the other data storage nodes in the node group to feed back corresponding first response information according to the first election information includes: instructing each such control thread to feed back first consent sub-information to the control thread corresponding to the data storage node when it determines that the number of data storage requests it has itself received is smaller than the number of data storage requests received by the data storage node and that the election weight of its own data storage node is smaller than the election weight of the data storage node, and otherwise to feed back first rejection sub-information to the control thread corresponding to the data storage node.
Optionally, when the computer-executable instructions are executed, determining whether the data storage node is elected as the leader node of the node group according to the fed-back first response information includes: if it is determined from the fed-back first response information that the number of pieces of first consent sub-information received is not less than the number threshold, determining that the data storage node is elected as the leader node of the node group; otherwise, determining that the data storage node is not elected as the leader node of the node group.
Optionally, the computer-executable instructions, when executed, further implement: if the control thread corresponding to the data storage node receives second election information sent by a control thread corresponding to another data storage node in the node group, returning second response information to the control thread that sent the second election information according to the second election information, the second response information indicating whether it is agreed that the data storage node that sent the second election information be elected as the leader node.
Optionally, when the computer-executable instructions are executed, the second election information includes the number of data storage requests received by the control thread that sent the second election information and the corresponding election weight, and the second response information includes second consent sub-information or second rejection sub-information. Returning second response information to the control thread that sent the second election information according to the second election information includes: if it is determined from the second election information that the number of data storage requests received by the control thread that sent the second election information is greater than the number of data storage requests received by the data storage node, and that the election weight corresponding to the control thread that sent the second election information is greater than the election weight of the data storage node, returning second consent sub-information to the control thread that sent the second election information; otherwise, returning second rejection sub-information to that control thread.
Optionally, the computer-executable instructions, when executed, further implement: if it is determined from the second election information that the election weight corresponding to the control thread that sent the second election information is greater than the election weight of the data storage node, demoting the data storage node from a candidate node to a follower node.
Optionally, when the computer-executable instructions are executed, sending first election information to the control threads corresponding to the other data storage nodes in the node group includes: sending the first election information to those control threads through the communication relation between the master process corresponding to the data storage node and the master processes corresponding to the other data storage nodes in the node group. The instructions further implement: receiving, by the control thread corresponding to the data storage node and through the same communication relation, the first response information returned by the control threads corresponding to the other data storage nodes in the node group.
In this embodiment of the application, the election of the leader node is maintained by the control thread corresponding to each data storage node, so that the election work is delegated to the individual data storage devices. When the data storage device hosting the leader node goes down, the other data storage devices can elect a new leader node autonomously, so the data storage process of the distributed storage system is not affected and the reliability of the distributed storage system is improved.
For the specific process of this embodiment, reference may also be made to the description of the foregoing method section; the same beneficial effects are achieved and are not repeated here.
In the 1990s, an improvement to a technology could be clearly distinguished as either a hardware improvement (for example, an improvement to a circuit structure such as a diode, a transistor, or a switch) or a software improvement (an improvement to a method flow). However, as technology has advanced, many of today's method-flow improvements can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement to a method flow cannot be implemented by a hardware entity module. For example, a programmable logic device (PLD), such as a field programmable gate array (FPGA), is an integrated circuit whose logic functions are determined by the user's programming of the device. A designer "integrates" a digital system onto a single PLD by programming it, without asking a chip manufacturer to design and fabricate an application-specific integrated circuit chip. Moreover, instead of manually fabricating integrated circuit chips, this programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development; the source code to be compiled must likewise be written in a specific programming language, called a hardware description language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used.
It will also be apparent to those skilled in the art that a hardware circuit implementing a logical method flow can be readily obtained simply by describing the method flow in one of the above hardware description languages and programming it into an integrated circuit.
The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of a memory. Those skilled in the art will also appreciate that, in addition to implementing the controller purely as computer-readable program code, the same functionality can be achieved by logically programming the method steps so that the controller takes the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may therefore be regarded as a hardware component, and the means included in it for performing the various functions may also be regarded as structures within the hardware component. The means for performing the various functions may even be regarded both as software modules implementing the method and as structures within the hardware component.
The systems, apparatuses, modules or units described in the above embodiments may be specifically implemented by a computer chip or an entity, or implemented by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being divided into various units by function, and are described separately. Of course, the functionality of the various elements may be implemented in the same one or more pieces of software and/or hardware in the practice of the present application.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of another identical element in a process, method, article, or apparatus that comprises the element.
The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The embodiments in the present specification are described in a progressive manner. For parts that are the same or similar among the embodiments, reference may be made from one embodiment to another, and each embodiment focuses on its differences from the other embodiments. In particular, because the system embodiment is substantially similar to the method embodiment, it is described relatively briefly, and reference may be made to the corresponding parts of the method embodiment for the relevant points.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.