CN117155930A - Node determining method, task processing method and related devices of distributed system - Google Patents

Publication number
CN117155930A
Authority
CN
China
Prior art keywords
node
voting
nodes
round
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311440542.6A
Other languages
Chinese (zh)
Other versions
CN117155930B (en)
Inventor
杨一迪
黄辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202311440542.6A
Publication of CN117155930A
Application granted
Publication of CN117155930B
Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/10 Active monitoring, e.g. heartbeat, ping or trace-route
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Cardiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Hardware Redundancy (AREA)

Abstract

The embodiments of the application provide a node determining method, a task processing method and related devices of a distributed system, relating to the technical field of distributed systems and also to the technical field of cloud technology. The method comprises the following steps: if a voting triggering condition is met, performing an Nth round of voting processing to obtain an Nth-round first voting result, and then determining, based on the Nth-round first voting result, a target node for processing tasks among a plurality of execution nodes. Each round of voting processing comprises: acquiring heartbeat information of each second node, determining the surviving nodes among the second nodes based on that heartbeat information, and voting based on the surviving-node determination result to obtain a first voting result. This addresses the technical problem that split-brain easily occurs in distributed technology and achieves the technical effect of reducing the occurrence of split-brain.

Description

Node determining method, task processing method and related devices of distributed system
Technical Field
The application relates to the technical field of distributed systems, in particular to a node determining method, a task processing method and a related device of a distributed system.
Background
In high-availability distributed technology, there are a plurality of execution nodes that back each other up. While the execution nodes are connected, one execution node processes tasks and the others serve as backup execution nodes, with the primary/standby roles negotiated over a heartbeat link. However, when the heartbeat link between the execution nodes fails, only one of the execution nodes should remain in a surviving state, that is, only one of them should process tasks. If several execution nodes remain in a surviving state, process tasks simultaneously and cannot synchronize data with each other, a split-brain has occurred, which can lead to confusion of task data.
Therefore, how to reduce the occurrence of split-brain has become an important research direction.
Disclosure of Invention
The embodiments of the application provide a node determining method, a task processing method and related devices of a distributed system, which address the technical problem that split-brain easily occurs in distributed technology and thereby achieve the technical effect of reducing the occurrence of split-brain.
In one aspect, an embodiment of the present application provides a method for determining a node of a distributed system, where the distributed system includes a plurality of execution nodes, the method is applied to any first node of the plurality of execution nodes, and the method includes:
if a voting triggering condition is met, performing an Nth round of voting processing to obtain an Nth-round first voting result, wherein the first voting result indicates the first candidate nodes for processing tasks among the plurality of execution nodes, and N is an integer not less than 1;
determining, based on the Nth-round first voting result, a target node for processing tasks among the plurality of execution nodes, wherein the target node comprises at least one of the first candidate nodes;
wherein each round of voting processing comprises:
obtaining heartbeat information of each second node, and determining the surviving nodes among the second nodes based on the heartbeat information of each second node, wherein each second node is a node other than the first node among the plurality of execution nodes, and each surviving node is a node for which the time interval between its latest heartbeat and the current time is smaller than a time interval threshold;
and voting based on the surviving-node determination result for the second nodes to obtain a first voting result.
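As an illustrative sketch only, the surviving-node determination in each round can be pictured as a filter over the latest heartbeat timestamps. The function name, the dictionary layout and the use of seconds as the time unit are assumptions made for this example and do not appear in the application.

```python
from typing import Dict, List


def surviving_second_nodes(
    last_heartbeat: Dict[str, float],
    self_id: str,
    now: float,
    threshold: float,
) -> List[str]:
    """Return the second nodes whose latest heartbeat is recent enough.

    A node counts as surviving when (now - latest heartbeat) is strictly
    smaller than the threshold, matching the definition in the text.
    """
    alive = []
    for node_id, beat_time in last_heartbeat.items():
        if node_id == self_id:
            continue  # second nodes are all execution nodes except the first node
        if now - beat_time < threshold:
            alive.append(node_id)
    return sorted(alive)
```

For example, with heartbeats `{"a": 100.0, "b": 98.0, "c": 80.0}`, first node `"a"`, current time `101.0` and threshold `5.0`, only `"b"` is treated as surviving.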
On the other hand, the embodiment of the application also provides a task processing method of a distributed system, the distributed system comprises a plurality of execution nodes, the method is applied to any first node in the plurality of execution nodes, and the method comprises the following steps:
determining a target node for processing a task in a plurality of execution nodes, wherein the target node is determined based on the method of any embodiment of the application;
If the target node comprises the first node, determining a first number of target tasks in the tasks to be processed, and processing the first number of target tasks.
On the other hand, the embodiment of the application also provides a node determining device of a distributed system, the distributed system comprises a plurality of execution nodes, the device is applied to any first node in the plurality of execution nodes, and the device comprises:
a decision module, configured to perform, if a voting triggering condition is met, an Nth round of voting processing to obtain an Nth-round first voting result, wherein the first voting result indicates the first candidate nodes for processing tasks among the plurality of execution nodes, and N is an integer not less than 1;
the decision module being further configured to determine, based on the Nth-round first voting result, a target node for processing tasks among the plurality of execution nodes, wherein the target node comprises at least one of the first candidate nodes;
wherein, in each round of voting processing, the decision module is configured to:
obtain heartbeat information of each second node, and determine the surviving nodes among the second nodes based on the heartbeat information of each second node, wherein each second node is a node other than the first node among the plurality of execution nodes, and each surviving node is a node for which the time interval between its latest heartbeat and the current time is smaller than a time interval threshold;
and vote based on the surviving-node determination result for the second nodes to obtain a first voting result.
Optionally, the voting triggering condition includes at least one of:
reaching the voting triggering time;
the time interval between the time of the last heartbeat of the first node and the current time is less than a time interval threshold;
the surviving nodes among the execution nodes differ from the target nodes determined in round N-1 of voting, where N is an integer not less than 2;
a change in surviving nodes in the second node is detected.
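The listed triggering conditions can be sketched as independent predicates. The text only says the triggering condition includes at least one of them and does not say how they are combined, so this hypothetical example evaluates each one separately and leaves the combination to the caller; `VoteContext` and all of its field names are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional


@dataclass(frozen=True)
class VoteContext:
    """Inputs for the trigger check; all field names are hypothetical."""
    self_id: str
    now: float
    next_vote_time: float
    own_last_heartbeat: float
    threshold: float
    alive_nodes: FrozenSet[str]                 # surviving execution nodes seen now
    previous_alive_second: FrozenSet[str]       # surviving second nodes at the last check
    previous_targets: Optional[FrozenSet[str]]  # targets of round N-1, None before round 1


def trigger_conditions(ctx: VoteContext) -> Dict[str, bool]:
    """Evaluate each listed condition on its own; combining them is up to the caller."""
    alive_second = ctx.alive_nodes - {ctx.self_id}
    return {
        "vote_time_reached": ctx.now >= ctx.next_vote_time,
        "own_heartbeat_fresh": ctx.now - ctx.own_last_heartbeat < ctx.threshold,
        "targets_outdated": ctx.previous_targets is not None
        and ctx.alive_nodes != ctx.previous_targets,
        "survivors_changed": alive_second != ctx.previous_alive_second,
    }
```

Returning a named result per condition keeps the sketch neutral about which subset of conditions a given deployment would enable.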
Optionally, the first node is in a working state or a paused state; a first node in the working state is allowed to process tasks, and a first node in the paused state is prohibited from processing tasks. The decision module is further configured to:
if the determined target node comprises the first node, set the state of the first node to the working state;
if the time interval between the latest heartbeat of the first node and the current time is greater than or equal to the time interval threshold, set the state of the first node to the paused state;
if the voting triggering condition is met and the target node has not yet been determined, set the state of the first node to the paused state.
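The three state rules above can be sketched as a small transition function. The order in which the rules are checked, and keeping the current state when none of them applies, are assumptions of this example; `NodeState` and `next_state` are hypothetical names.

```python
from enum import Enum


class NodeState(Enum):
    WORKING = "working"  # the node is allowed to process tasks
    PAUSED = "paused"    # the node is prohibited from processing tasks


def next_state(
    current: NodeState,
    *,
    heartbeat_age: float,
    threshold: float,
    voting_in_progress: bool,
    is_target: bool,
) -> NodeState:
    """Apply the three rules; the pause rules are checked first (an assumption)."""
    if heartbeat_age >= threshold:
        return NodeState.PAUSED   # own heartbeat is stale
    if voting_in_progress:
        return NodeState.PAUSED   # vote triggered, target not yet determined
    if is_target:
        return NodeState.WORKING  # determined target includes this node
    return current                # no rule applies: keep the current state
```

Checking the pause rules first means a node never processes tasks while its own heartbeat is stale or a vote is unresolved, which is the conservative choice for avoiding duplicate execution.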
Optionally, when determining the target node for processing tasks among the plurality of execution nodes based on the Nth-round first voting result, the decision module may be configured to:
query the Nth-round second voting results respectively corresponding to at least one second node, wherein a second voting result indicates the second candidate nodes for processing tasks among the plurality of execution nodes;
if the query for the Nth-round second voting results fails, determine the first candidate node indicated by the Nth-round first voting result as the target node;
and if the query for the Nth-round second voting results succeeds, determine the target node for processing tasks among the plurality of execution nodes based on the Nth-round first voting result and the Nth-round second voting results.
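A minimal sketch of the query-and-fallback step follows. Treating a raised `ConnectionError` or an empty result as a failed query is an assumption, as are the function name and the idea of passing the query and the combining rule in as callables.

```python
from typing import Callable, Dict, FrozenSet

Vote = FrozenSet[str]


def resolve_round(
    first_vote: Vote,
    query_second_votes: Callable[[], Dict[str, Vote]],
    combine: Callable[[Vote, Dict[str, Vote]], Vote],
) -> Vote:
    """Fall back to the node's own vote when no second voting result is available."""
    try:
        second_votes = query_second_votes()
    except ConnectionError:
        second_votes = {}  # assumption: an unreachable store counts as a failed query
    if not second_votes:
        return first_vote  # query failed: own candidates become the target
    return combine(first_vote, second_votes)
```

Passing `combine` as a parameter keeps this step independent of how the first and second voting results are reconciled when the query succeeds.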
Optionally, when determining the target node for processing tasks among the plurality of execution nodes based on the Nth-round first voting result and the Nth-round second voting results, the decision module may be configured to:
if the Nth-round first voting result and the Nth-round second voting results are determined to meet a voting ending condition, determine a target node set for processing tasks and determine any node in the target node set as a target node, wherein the target node set is the intersection or the union of the first candidate nodes indicated by the Nth-round first voting result and the second candidate nodes indicated by the Nth-round second voting results;
wherein the voting ending condition includes at least one of:
the Nth-round first voting result is consistent with the Nth-round second voting result corresponding to each second node;
the at least one second node for which an Nth-round second voting result was queried is consistent with the surviving nodes among the second nodes;
the decision module may be further configured to:
if the Nth-round first voting result and the Nth-round second voting results are determined not to meet the voting ending condition, perform the (N+1)th round of voting processing.
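The ending check and the construction of the target node set can be sketched as two small functions. The text says the ending condition includes at least one of the two listed conditions; requiring both here is an assumption of this example, and the names `voting_finished` and `target_set` are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable

Vote = FrozenSet[str]


def voting_finished(
    first_vote: Vote,
    second_votes: Dict[str, Vote],
    alive_second: Iterable[str],
) -> bool:
    """Check both listed ending conditions (requiring both is an assumption)."""
    votes_agree = all(vote == first_vote for vote in second_votes.values())
    all_survivors_voted = set(second_votes) == set(alive_second)
    return votes_agree and all_survivors_voted


def target_set(
    first_vote: Vote,
    second_votes: Dict[str, Vote],
    use_union: bool = False,
) -> Vote:
    """Intersect (default) or unite the first and second candidate nodes."""
    result = set(first_vote)
    for vote in second_votes.values():
        result = result | vote if use_union else result & vote
    return frozenset(result)
```

A caller would compute `target_set(...)` only when `voting_finished(...)` returns `True`, and otherwise start round N+1.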
Optionally, any one of the plurality of executing nodes is connected with a voting management database, and the voting management database is used for recording a voting result obtained by each round of voting processing of each executing node;
the decision module may be configured to, when querying the nth round of second voting results corresponding to the at least one second node respectively:
and inquiring the Nth round of second voting results respectively corresponding to at least one second node from the voting management database.
Optionally, the voting management database is further used for recording heartbeat information reported by each execution node; the decision module, when obtaining the heartbeat information of each second node, may be configured to:
and obtaining heartbeat information of each second node from the voting management database.
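One way to picture the voting management database is as two tables, one for heartbeats and one for per-round votes. The sketch below uses an in-memory SQLite database purely as a stand-in; the table names, column names and helper functions are hypothetical and are not taken from the application.

```python
import sqlite3
from typing import Dict, Iterable, List


def open_store() -> sqlite3.Connection:
    """Create an in-memory stand-in for the voting management database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE heartbeat (node_id TEXT PRIMARY KEY, beat_time REAL NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE vote (round_no INTEGER, node_id TEXT, candidates TEXT, "
        "PRIMARY KEY (round_no, node_id))"
    )
    return conn


def report_heartbeat(conn: sqlite3.Connection, node_id: str, beat_time: float) -> None:
    # Keep only the latest heartbeat per node.
    conn.execute("INSERT OR REPLACE INTO heartbeat VALUES (?, ?)", (node_id, beat_time))


def record_vote(conn: sqlite3.Connection, round_no: int, node_id: str,
                candidates: Iterable[str]) -> None:
    # Candidates are stored as a sorted, comma-separated string.
    conn.execute(
        "INSERT OR REPLACE INTO vote VALUES (?, ?, ?)",
        (round_no, node_id, ",".join(sorted(candidates))),
    )


def second_heartbeats(conn: sqlite3.Connection, self_id: str) -> Dict[str, float]:
    rows = conn.execute(
        "SELECT node_id, beat_time FROM heartbeat WHERE node_id != ?", (self_id,)
    )
    return dict(rows.fetchall())


def second_votes(conn: sqlite3.Connection, round_no: int, self_id: str) -> Dict[str, List[str]]:
    rows = conn.execute(
        "SELECT node_id, candidates FROM vote WHERE round_no = ? AND node_id != ?",
        (round_no, self_id),
    )
    return {node: cands.split(",") if cands else [] for node, cands in rows}
```

Because every execution node reads and writes the same store, no direct node-to-node heartbeat link is needed in this sketch.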
Optionally, when voting based on the surviving-node determination result for the second nodes to obtain the first voting result, the decision module may be configured to perform at least one of the following:
if the surviving-node determination result indicates that the surviving nodes among the second nodes were successfully determined, the first candidate nodes indicated by the first voting result comprise the first node and the surviving nodes among the second nodes;
if the surviving-node determination result indicates that determining the surviving nodes among the second nodes failed, the first candidate nodes indicated by the first voting result comprise the first node and do not comprise any surviving node among the second nodes.
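The two cases above can be sketched in a few lines. Representing a failed determination as `None` is an assumption made for this example, as is the function name.

```python
from typing import FrozenSet, Iterable, Optional


def first_vote(self_id: str, alive_second: Optional[Iterable[str]]) -> FrozenSet[str]:
    """Build the first voting result for one round.

    `alive_second` is None when determining the surviving second nodes failed
    (for example, the heartbeat records could not be read).
    """
    if alive_second is None:
        return frozenset({self_id})             # vote for the first node only
    return frozenset({self_id, *alive_second})  # first node plus surviving second nodes
```

Note that a successful determination that finds no surviving second node produces the same vote as a failed determination: the first node alone.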
On the other hand, the embodiment of the application also provides a task processing device of a distributed system, the distributed system comprises a plurality of execution nodes, the device is applied to any first node in the plurality of execution nodes, and the device comprises:
the decision module is used for determining a target node for processing tasks in a plurality of execution nodes, wherein the target node is determined based on the method of any embodiment of the application;
and the execution module is used for determining a first number of target tasks in the tasks to be processed if the target nodes comprise the first node, and processing the first number of target tasks.
Optionally, if there are at least two target nodes, the execution module, when determining the first number of target tasks among the tasks to be processed, may be configured to perform either of the following:
determining a second number, namely the number of target nodes, and determining the first number of tasks to be processed by the first node based on the total number of tasks to be processed and the second number, wherein the first number is negatively correlated with the second number and positively correlated with the total number of tasks;
obtaining processing capability indication information of each target node, determining a ratio coefficient for each target node based on the task processing capability indicated by that information, and determining the first number of tasks to be processed by the first node based on the total number of tasks to be processed and the ratio coefficient of the first node, wherein the ratio coefficient of each target node is positively correlated with that node's task processing capability, and the first number is positively correlated with both the ratio coefficient of the first node and the total number of tasks.
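Both ways of determining the first number can be sketched as share computations. The text only fixes the direction of the correlations; the integer rounding, the handling of leftover tasks and the use of sorted node identifiers so that every node computes the same split are assumptions of this example.

```python
from typing import Dict, Iterable


def equal_shares(total_tasks: int, targets: Iterable[str]) -> Dict[str, int]:
    """Split tasks evenly; earlier node ids (sorted) absorb the remainder."""
    ordered = sorted(targets)
    base, extra = divmod(total_tasks, len(ordered))
    return {node: base + (1 if i < extra else 0) for i, node in enumerate(ordered)}


def weighted_shares(total_tasks: int, capacity: Dict[str, int]) -> Dict[str, int]:
    """Split tasks in proportion to each node's (integer) processing capability."""
    total_capacity = sum(capacity.values())
    ordered = sorted(capacity)
    shares = {n: total_tasks * capacity[n] // total_capacity for n in ordered}
    leftover = total_tasks - sum(shares.values())
    # Hand leftover tasks to the most capable nodes first (ties broken by id order).
    for node in sorted(ordered, key=lambda n: -capacity[n])[:leftover]:
        shares[node] += 1
    return shares
```

In both functions the shares add up to the total number of tasks, so each node can compute the full split locally and take only its own entry as its first number.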
In another aspect, an embodiment of the present application further provides an execution node, including a memory, a processor, and a computer program stored on the memory, where the processor executes the computer program to implement a method according to any embodiment of the present application.
On the other hand, the embodiment of the application also provides a distributed system, which comprises a plurality of execution nodes, wherein any one of the plurality of execution nodes is used as a first node to execute the method of any embodiment of the application.
In another aspect, embodiments of the present application further provide a computer readable storage medium having a computer program stored thereon, which when executed by a processor implements a method according to any of the embodiments of the present application.
In another aspect, embodiments of the present application also provide a computer program product comprising a computer program which, when executed by a processor, implements a method according to any of the embodiments of the present application.
According to the technical solution of the embodiments, if the voting triggering condition is met, the Nth round of voting processing is performed to obtain an Nth-round first voting result, which indicates the first candidate nodes for processing tasks among the plurality of execution nodes, and the target node for processing tasks among the plurality of execution nodes is determined based on the Nth-round first voting result. Each round of voting processing comprises: obtaining heartbeat information of each second node and determining the surviving nodes among the second nodes based on that heartbeat information, wherein each second node is a node other than the first node among the plurality of execution nodes, and each surviving node is a node for which the time interval between its latest heartbeat and the current time is smaller than a time interval threshold; and voting based on the surviving-node determination result for the second nodes to obtain a first voting result. That is, in each round of voting, the target node for processing tasks can be determined based on the currently surviving nodes, so that the target node is continuously updated. This addresses the technical problem that split-brain easily occurs in distributed technology and achieves the technical effect of reducing the occurrence of split-brain.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings that are required to be used in the description of the embodiments of the present application will be briefly described below.
FIG. 1 is a schematic diagram of a framework for implementing an embodiment of the present application;
FIG. 2 is a schematic diagram of a module architecture of an execution node according to an embodiment of the present application;
FIG. 3 is a flow chart of a node determining method of a distributed system according to an embodiment of the present application;
FIG. 4 is a schematic flow chart of a node heartbeat update according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a node initiating voting in different states according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a status update flow of a node according to an embodiment of the present application;
FIG. 7 is a schematic flow chart of a task processing method of a distributed system according to an embodiment of the present application;
FIG. 8 is a schematic diagram illustrating comparison of CPU utilization of two execution nodes according to an embodiment of the present application;
FIG. 9 is a schematic structural diagram of a node determining device of a distributed system according to an embodiment of the present application;
FIG. 10 is a schematic diagram of a task processing device of a distributed system according to an embodiment of the present application;
FIG. 11 is a schematic structural diagram of an execution node according to an embodiment of the present application.
Detailed Description
Embodiments of the present application are described below with reference to the drawings in the present application. It should be understood that the embodiments described below with reference to the drawings are exemplary descriptions for explaining the technical solutions of the embodiments of the present application, and the technical solutions of the embodiments of the present application are not limited.
As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless expressly stated otherwise, as understood by those skilled in the art. It will be further understood that the terms "comprises" and "comprising," when used in this specification, specify the presence of stated features, information, data, steps, operations, elements, and/or components, but do not preclude the presence or addition of other features, information, data, steps, operations, elements, components, and/or groups thereof, all of which may be included in the present specification. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. The term "and/or" as used herein indicates at least one of the items defined by the term, e.g. "A and/or B" may be implemented as "A", as "B", or as "A and B". "Plurality" may mean at least two.
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the embodiments of the present application will be described in further detail with reference to the accompanying drawings.
First, several terms related to the present application are described and explained:
Task: a task to be executed according to the business scenario in the computer field.
Execution node: a unit that executes tasks, which may be a process on a computer.
Disaster recovery: the capability of coping with failures, for example, when there are a plurality of execution nodes, whether the remaining execution nodes can continue to execute tasks after one execution node fails (e.g., a computer power failure or a network failure).
Load balancing: the capability of distributing task execution evenly, for example, whether the number of tasks executed by each execution node per unit time is similar.
Split-brain: in a task management scenario, when the same task is executed by multiple execution nodes at the same time, the executions may affect each other, so that the execution result is unknown.
In the related art, for a distributed system, main schemes include, but are not limited to, the following:
Scheme one: a master execution node is elected based on a distributed protocol. The successfully elected master execution node executes tasks, while the other execution nodes remain idle in a disaster recovery role; when the master node becomes abnormal, the other execution nodes elect a new master, which then executes tasks.
Scheme two: message middleware is introduced so that multiple execution nodes consume task messages and execute tasks in a producer-consumer mode. The producers elect a master through a distributed protocol so that the producers run in a primary/standby mode; after the successfully elected producer sends the task set to the message middleware, multiple consumers compete to acquire task messages from the message middleware and execute them.
Scheme three: task scheduling is performed based on load detection. For an execution node whose load exceeds a threshold, the task scheduling module pauses task distribution to that node and resumes distributing tasks to it after its load returns to within the threshold.
Each of these related-art schemes has drawbacks:
Drawbacks of scheme one:
This scheme has disaster recovery capability but no load balancing capability. All tasks are executed on the successfully elected execution node, which can cause high load and even hit the bottleneck of computer resources, so that tasks cannot be completed in time. Meanwhile, the other execution nodes are idle, which wastes computer resources.
Drawbacks of scheme two:
1) The scheme is complex: message middleware is introduced, and users need a certain ability to maintain it;
2) The scheme incurs additional resource consumption. The message middleware handles the transfer of tasks between producers and consumers, which consumes computer resources; the consumption is more severe when the task volume is huge.
Drawbacks of scheme three:
This scheme cannot solve the split-brain problem of the same task being executed more than once at the same time. When a task execution times out, the task may in fact still be executing; if a retry is started on timeout, the same task is executed simultaneously, and split-brain occurs.
To address at least one technical problem or area needing improvement in the related art, the application provides a node determining method, a task processing method and related devices of a distributed system, which ensure, while providing disaster recovery and load balancing capabilities, that the same task is not executed by multiple execution nodes at the same time, thereby avoiding split-brain and the mutual interference it causes during task execution.
If the voting triggering condition is met, the Nth round of voting processing is performed to obtain an Nth-round first voting result, which indicates the first candidate nodes for processing tasks among the plurality of execution nodes, N being an integer not less than 1; the target node for processing tasks among the plurality of execution nodes is then determined based on the Nth-round first voting result. Each round of voting processing includes: obtaining heartbeat information of each second node and determining the surviving nodes among the second nodes based on that heartbeat information, wherein each second node is a node other than the first node among the plurality of execution nodes, and each surviving node is a node for which the time interval between its latest heartbeat and the current time is smaller than a time interval threshold; and voting based on the surviving-node determination result for the second nodes to obtain a first voting result. This addresses the technical problem that split-brain easily occurs in distributed technology and achieves the technical effect of reducing the occurrence of split-brain. In addition, compared with the related art, the scheme prevents parallel execution of the same task: whether a task is switched between nodes because of a component abnormality or a task execution times out, serial execution of the task is guaranteed, avoiding the mutual interference caused by parallel execution. Load balancing among the execution nodes is also achieved, together with the capability of horizontal scaling. Finally, the technical solution of the application is lightweight, and the algorithm used can be embedded into each service module as a static library (LIB).
Cloud technology (Cloud technology) refers to a hosting technology for integrating hardware, software, network and other series resources in a wide area network or a local area network to realize calculation, storage, processing and sharing of data.
Cloud technology is a general term for the network technology, information technology, integration technology, management platform technology, application technology and the like that are applied on the basis of the cloud computing business model; it can form a resource pool that is used on demand, flexibly and conveniently. Cloud computing technology will become an important support. Background services of technical network systems, such as video websites, picture websites and other portals, require a large amount of computing and storage resources. With the rapid development and application of the internet industry, each item may in the future have its own identification mark, which needs to be transmitted to a background system for logical processing; data of different levels will be processed separately, and all kinds of industry data need strong backend system support, which can only be realized through cloud computing.
In particular, cloud computing may be involved; for example, voting may be performed through cloud computing to obtain a voting result. Cloud computing is a computing model that distributes computing tasks across a resource pool formed by a large number of computers, enabling various application systems to acquire computing power, storage space and information services as needed. The network that provides the resources is referred to as the "cloud". From the user's point of view, resources in the cloud appear infinitely expandable and can be acquired at any time, used on demand, expanded at any time and paid for according to use. As a basic capability provider of cloud computing, a cloud computing resource pool (referred to as an IaaS (Infrastructure as a Service) platform for short) is established, and various types of virtual resources are deployed in the resource pool for external clients to select and use.
In particular, cloud storage may also be involved; for example, required information such as heartbeat information and voting results may be stored through a cloud storage scheme.
Cloud storage (cloud storage) is a new concept that extends and develops in the concept of cloud computing, and a distributed cloud storage system (hereinafter referred to as a storage system for short) refers to a storage system that integrates a large number of storage devices (storage devices are also referred to as storage nodes) of various types in a network to work cooperatively through application software or application interfaces through functions such as cluster application, grid technology, and a distributed storage file system, so as to provide data storage and service access functions for the outside.
At present, the storage method of the storage system is as follows: when creating logical volumes, each logical volume is allocated a physical storage space, which may be a disk composition of a certain storage device or of several storage devices. The client stores data on a certain logical volume, that is, the data is stored on a file system, the file system divides the data into a plurality of parts, each part is an object, the object not only contains the data but also contains additional information such as a data Identification (ID) and the like, the file system writes each object into a physical storage space of the logical volume, and the file system records storage position information of each object, so that when the client requests to access the data, the file system can enable the client to access the data according to the storage position information of each object.
The process of allocating physical storage space for the logical volume by the storage system specifically includes: physical storage space is divided into stripes in advance according to the set of capacity measures for objects stored on a logical volume (which measures tend to have a large margin with respect to the capacity of the object actually to be stored) and redundant array of independent disks (RAID, redundant Array of Independent Disk), and a logical volume can be understood as a stripe, whereby physical storage space is allocated for the logical volume.
The technical solutions of the embodiments of the present application and technical effects produced by the technical solutions of the present application are described below by describing several exemplary embodiments. It should be noted that the following embodiments may be referred to, or combined with each other, and the description will not be repeated for the same terms, similar features, similar implementation steps, and the like in different embodiments.
Referring to fig. 1, fig. 1 is a schematic diagram of a framework for implementing an embodiment of the present application. The scheme implementation framework shown in fig. 1 may include a plurality of execution nodes 110 (hereinafter also simply referred to as nodes).
Each execution node 110 is configured to process tasks (also referred to as executing tasks). Taking an order-dispatch task in a ride-hailing application as an example, each execution node 110 may be configured to process the dispatch task.
In this embodiment, any one of the plurality of execution nodes 110 may serve as the first node and execute the steps of the method in any embodiment of the present application. That is, in the implementation of this scheme, each execution node 110 may execute the steps of the method in any embodiment of the present application, so as to implement at least one of node determination or task processing of the distributed system. The execution node 110 may be a server or a cluster of servers with data processing capabilities.
Optionally, a plurality of executing nodes 110 may communicate with each other.
Optionally, the scheme implementation framework may further include a vote management database 120. Each execution node 110 may communicate with the vote management database 120 and store in it the information required for implementing the scheme, so that other nodes can obtain the required information from the vote management database 120; in this case, the execution nodes 110 need not communicate with one another. The vote management database 120 records task management metadata, which is used for recording the node heartbeat information, voting results, and task division decision information required by task management. The metadata must be highly available and consistent; a metadata store based on a distributed protocol may be used, or a relational database may be used (in which case the failover capability of the relational database must be provided by an additional management system).
Optionally, the scheme implementation framework may also include a business database 130, which records business metadata. The execution of tasks typically relies on business-related metadata; for example, tasks may originate from business metadata. The business metadata is independent of the task management metadata and may use a different storage scheme.
It should be noted that the database may be a relational database, and the metadata in the database may include heartbeat metadata and decision metadata of the execution nodes 110. The heartbeat metadata and decision metadata may be stored as data in the relational database.
It will be appreciated that the manner of communication between the plurality of nodes may be set as desired and is not limited thereto.
Referring to fig. 2, fig. 2 is a schematic block diagram of an execution node according to an embodiment of the present application.
The task management modules are embedded into an ordinary task execution node in the form of a library (LIB) and provide the capability of deciding which task shards the execution node should execute; the execution module executes the corresponding task shards according to the decision result. The modules are briefly described below:
Task management metadata: records the node heartbeat information and the task division decision information required by task management. The metadata must be highly available and consistent; a metadata store based on a distributed protocol may be used, or a relational database may be used (in which case the failover capability of the relational database must be provided by an additional management system). The following description uses a relational database.
Reporting module: responsible for periodically reporting at least one of the heartbeat information of the task execution node or a voting result.
Decision module: decides, according to the heartbeat information of each task execution node, which execution nodes survive and the task shard information that each node should execute, where the task shard information includes the total number of shards and the shard sequence number executed by the node.
Execution module: acquires the set of sharded tasks and processes the tasks according to the total number of shards and the shard sequence number of the node.
Business metadata: the processing of tasks typically relies on business-related metadata; for example, tasks may originate from business metadata. The business metadata is independent of the task management metadata and may use a different storage scheme.
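As an illustration of how the decision module and the execution module could cooperate, a minimal Python sketch follows. The mapping rule (sorting node identifiers to assign shard sequence numbers, and taking an integer task identifier modulo the total number of shards) and the function names `decide_shards` and `tasks_for_shard` are assumptions made for this example; the embodiment only specifies that the decision consists of the total number of shards and the shard sequence number of the node.

```python
from typing import Dict, Iterable, List, Tuple


def decide_shards(surviving_nodes: Iterable[str]) -> Dict[str, Tuple[int, int]]:
    """Map each surviving node to (total number of shards, shard sequence number).

    Assumption: sequence numbers follow the sorted order of node identifiers,
    so every node that sees the same surviving set derives the same mapping.
    """
    ordered = sorted(set(surviving_nodes))
    return {node: (len(ordered), index) for index, node in enumerate(ordered)}


def tasks_for_shard(task_ids: Iterable[int], total: int, index: int) -> List[int]:
    """Select the tasks of one shard (assumption: task id modulo total)."""
    return [task_id for task_id in task_ids if task_id % total == index]


shards = decide_shards(["9.0.0.2", "9.0.0.1"])
total, index = shards["9.0.0.2"]
print(total, index, tasks_for_shard(range(6), total, index))
```

Under this assumed rule the shards of different nodes are disjoint and together cover every task, which is the property the execution module relies on.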
Referring to fig. 3, fig. 3 is a flowchart illustrating a node determining method of a distributed system according to an embodiment of the present application. The method of this embodiment may be performed by any node of the distributed system. For ease of distinction, the node performing the method is referred to as the first node and the other nodes are referred to as second nodes. Because every node may perform the method of the embodiment of the present application, each node may be either a first node or a second node; whether a node is the first node or a second node is relative to the node performing the method, which is not limited herein.
The node determining method of the distributed system as shown in fig. 3 may include:
S310, if the voting trigger condition is met, performing the Nth round of voting processing to obtain an Nth-round first voting result, where the first voting result indicates the first candidate nodes for processing tasks among the plurality of execution nodes.
Here, N is an integer not less than 1. The voting trigger condition is a condition that triggers the first node to vote for the purpose of determining the target node for processing tasks among the plurality of execution nodes. Optionally, the voting trigger condition may be determined by combining a determination that the first node can serve as a target node with other factors, and the first candidate nodes may include the first node.
In this embodiment, each round of voting processing yields a first voting result, and the Nth round of voting processing correspondingly yields the Nth-round first voting result.
Wherein, each round of voting processing may include:
S311, obtaining heartbeat information of each second node, and determining the surviving nodes among the second nodes based on the heartbeat information of each second node.
A second node is a node other than the first node among the plurality of execution nodes. A surviving node is a node for which the time interval between its latest heartbeat and the current time is less than the time interval threshold. The heartbeat information may include the heartbeat time corresponding to each reported heartbeat. Specifically, in this embodiment, the entity that receives a reported heartbeat may record either the time at which the node reported the heartbeat or the time at which the entity received it as the heartbeat time of that heartbeat. The entity records a heartbeat time each time a node reports a heartbeat, thereby forming the heartbeat information corresponding to each node.
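The survival test described above can be sketched in Python as follows. The function name `surviving_nodes`, the use of plain floating-point seconds, and the sample values are assumptions for illustration only.

```python
from typing import Dict, Set


def surviving_nodes(last_heartbeat: Dict[str, float], now: float,
                    interval_threshold: float) -> Set[str]:
    """Return the nodes whose latest heartbeat is less than
    `interval_threshold` seconds older than `now`."""
    return {
        node
        for node, heartbeat_time in last_heartbeat.items()
        if now - heartbeat_time < interval_threshold
    }


# Hypothetical heartbeat times (seconds) for two second nodes.
heartbeats = {"9.0.0.2": 100.0, "9.0.0.3": 80.0}
print(sorted(surviving_nodes(heartbeats, now=105.0, interval_threshold=15.0)))
```

Note that the comparison is strict: a node whose interval equals the threshold is not treated as surviving, matching the "less than" wording of the definition.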
It should be noted that, communication may be performed between any two nodes, and the first node may obtain heartbeat information of each second node; in addition, communication between each node and the voting management database is also possible, so that heartbeat is reported to the voting management database, and the voting management database can store heartbeat information of each node.
Alternatively, the heartbeat information of each node may be stored in the same list, or may be stored in a different list, which is not limited herein.
Referring to table 1, table 1 is a schematic illustration of heartbeat information of a storage node according to an embodiment of the present application.
TABLE 1
Referring to fig. 4, fig. 4 is a schematic flow chart of node heartbeat update according to an embodiment of the present application.
As shown in fig. 4, the heartbeat is updated at regular intervals of n seconds, and after the update is successful, the processing deadline of the task is prolonged. Wherein task processing deadline = heartbeat update time + configured processing timeout time.
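The deadline rule can be written as a short Python sketch. The helper `may_process` and the sample numbers are assumptions added for illustration; only the formula itself comes from the description above.

```python
def task_processing_deadline(heartbeat_update_time: float,
                             processing_timeout: float) -> float:
    """Task processing deadline = heartbeat update time + configured timeout."""
    return heartbeat_update_time + processing_timeout


def may_process(now: float, deadline: float) -> bool:
    """Assumption: a node keeps processing tasks only before its deadline."""
    return now < deadline


# A successful heartbeat update at t = 100 s with a 10 s timeout
# extends the deadline to t = 110 s.
deadline = task_processing_deadline(100.0, 10.0)
print(deadline, may_process(105.0, deadline), may_process(110.0, deadline))
```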
S312, voting based on the determination result of the surviving nodes among the second nodes to obtain a first voting result.
In this embodiment, optionally, voting based on the determination result of the surviving nodes among the second nodes may mean determining the nodes that currently survive among the plurality of nodes as the first candidate nodes indicated by the first voting result.
In one possible implementation, voting based on the determination result of the surviving nodes among the second nodes to obtain a first voting result includes at least one of the following:
if the determination result is that surviving nodes among the second nodes were successfully determined, the first candidate nodes indicated by the first voting result include the first node and the surviving nodes among the second nodes;
if the determination result is that the determination of surviving nodes among the second nodes failed, the first candidate nodes indicated by the first voting result include the first node and do not include any surviving node among the second nodes.
Specifically, in this embodiment, a successful determination indicates that there are surviving nodes among the second nodes, so the first node together with the surviving nodes among the second nodes may be used as the first candidate nodes. A failed determination indicates that there is no surviving node among the second nodes, in which case the first node alone may be used as the first candidate node.
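The two cases can be sketched in Python as follows. The function name `first_voting_result` and the representation of a voting result as a `frozenset` of node identifiers are assumptions for illustration.

```python
from typing import FrozenSet, Iterable


def first_voting_result(first_node: str,
                        surviving_second_nodes: Iterable[str]) -> FrozenSet[str]:
    """Build the first candidate nodes for one round of voting.

    If surviving second nodes were found, the candidates are the first node
    plus those nodes; otherwise the first node is the only candidate.
    """
    survivors = frozenset(surviving_second_nodes)
    if survivors:
        return frozenset({first_node}) | survivors
    return frozenset({first_node})


print(sorted(first_voting_result("9.0.0.1", {"9.0.0.2"})))
print(sorted(first_voting_result("9.0.0.1", set())))
```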
S320, determining a target node for processing tasks in the plurality of execution nodes based on the Nth round of first voting results.
Wherein the target node may comprise at least one of the first candidate nodes. In this embodiment, after the target node is determined, the task to be processed may be processed by the target node.
According to the technical solution of this embodiment, if the voting trigger condition is met, the Nth round of voting processing is performed to obtain an Nth-round first voting result, where the first voting result indicates the first candidate nodes for processing tasks among the plurality of execution nodes, and the target node for processing tasks among the plurality of execution nodes is determined based on the Nth-round first voting result. Each round of voting processing includes: obtaining heartbeat information of each second node and determining the surviving nodes among the second nodes based on that heartbeat information, where a second node is a node other than the first node among the plurality of execution nodes, and a surviving node is a node for which the time interval between its latest heartbeat and the current time is less than the time interval threshold; and voting based on the determination result of the surviving nodes among the second nodes to obtain a first voting result. That is, in each round of voting, the target node for processing tasks is determined based on the currently surviving nodes, so that the target node is continuously updated. This addresses the technical problem that split brain easily occurs in distributed systems and achieves the technical effect of reducing the occurrence of split brain.
In one possible implementation, the voting trigger conditions include at least one of:
reaching the voting triggering time;
the time interval between the time of the last heartbeat of the first node and the current time is less than a time interval threshold;
the surviving nodes among the plurality of execution nodes are inconsistent with the target nodes determined by the (N-1)th round of voting, where N is an integer not less than 2;
a change in surviving nodes in the second node is detected.
For the case where the voting trigger condition includes reaching the voting trigger time, the voting trigger time may change continuously: a voting trigger interval may be preset, and the next voting trigger time is the previous voting trigger time plus the voting trigger interval. Voting is thus triggered periodically each time a voting trigger time arrives. The time for triggering the first vote may be the same for all nodes.
For the case where the voting trigger condition includes that the time interval between the latest heartbeat of the first node and the current time is less than the time interval threshold, the condition ensures that the first node is a surviving node, that is, the first node is relatively stable and can take part in node election or process tasks. Specifically, the first node in this embodiment may be in a working state or a suspended state; a first node in the working state is allowed to process tasks, and a first node in the suspended state is prohibited from processing tasks. For a first node in the working state, if the time interval between its latest heartbeat and the current time is less than the time interval threshold, the first node has remained online and can vote. For a first node in the suspended state, if the time interval between its latest heartbeat and the current time is less than the time interval threshold, the first node is online and can vote.
For the case where the voting trigger condition includes that the surviving nodes among the plurality of execution nodes are inconsistent with the target nodes determined by the (N-1)th round of voting, the latest surviving nodes differ from the target nodes determined by the previous round of voting, so the target nodes must be redetermined; otherwise there is a risk of split brain. Specifically, the inconsistency may mean that the surviving nodes among the plurality of execution nodes are fewer or more than the target nodes determined by the (N-1)th round of voting. When the surviving nodes among the plurality of execution nodes correspond one-to-one with the target nodes determined by the (N-1)th round of voting, the two are considered consistent.
For the case where the voting trigger condition includes detecting a change in the surviving nodes among the second nodes, the change indicates that the latest surviving nodes are inconsistent with the target nodes determined by the previous vote, so the target nodes must be redetermined; otherwise there is a risk of split brain.
One or more of the above conditions may be selected as voting trigger conditions as needed, and are not limited herein.
It can be appreciated that by selecting the above various conditions as the voting trigger conditions, the timing of triggering the voting can be determined from a plurality of angles, thereby improving the accuracy of the voting trigger and saving the computing resources required for the voting.
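A minimal Python sketch of evaluating the four trigger conditions is given below. The `VoteContext` fields, the condition names, and the policy in `should_vote` (every selected condition must hold) are assumptions made for illustration; the embodiment leaves open which conditions are selected and how they are combined.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable


@dataclass(frozen=True)
class VoteContext:
    now: float
    next_vote_time: float               # previous trigger time + trigger interval
    own_last_heartbeat: float           # latest heartbeat of the first node
    interval_threshold: float
    surviving: FrozenSet[str]           # surviving execution nodes seen now
    previous_targets: FrozenSet[str]    # target nodes decided in round N-1
    surviving_second: FrozenSet[str]    # surviving second nodes seen now
    previous_surviving_second: FrozenSet[str]


def trigger_conditions(ctx: VoteContext) -> Dict[str, bool]:
    """Evaluate each candidate trigger condition independently."""
    return {
        "time_reached": ctx.now >= ctx.next_vote_time,
        "self_alive": ctx.now - ctx.own_last_heartbeat < ctx.interval_threshold,
        "targets_outdated": ctx.surviving != ctx.previous_targets,
        "second_changed": ctx.surviving_second != ctx.previous_surviving_second,
    }


def should_vote(ctx: VoteContext, selected: Iterable[str]) -> bool:
    """Assumed policy: vote only if all selected conditions hold."""
    results = trigger_conditions(ctx)
    return all(results[name] for name in selected)


ctx = VoteContext(100.0, 90.0, 95.0, 15.0,
                  frozenset({"a", "b"}), frozenset({"a"}),
                  frozenset({"b"}), frozenset())
print(should_vote(ctx, ["self_alive", "targets_outdated"]))
```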
In one possible implementation, the first node is in a working state or a suspended state; a first node in the working state is allowed to process tasks, and a first node in the suspended state is prohibited from processing tasks. The method further comprises at least one of:
if the determined target nodes include the first node, setting the state of the first node to the working state;
if the time interval between the latest heartbeat of the first node and the current time is greater than or equal to the time interval threshold, setting the state of the first node to the suspended state;
if the voting trigger condition is met and the target node has not yet been determined, setting the state of the first node to the suspended state.
Specifically, if the determined target nodes include the first node, the first node can process tasks, so its state is set to the working state. If the time interval between the latest heartbeat of the first node and the current time is greater than or equal to the time interval threshold, the first node is running unstably and continuing to run carries a high risk of split brain, so its state is set to the suspended state to prohibit it from continuing to process tasks. If the voting trigger condition is met and the target node has not yet been determined, voting has started but the latest target node is still unknown; continuing to run carries a high risk of split brain, so the state of the first node is set to the suspended state.
Setting the state of the first node to the working state may mean either maintaining the working state of the first node or switching the first node from the suspended state to the working state, which is not limited herein. Similarly, setting the state of the first node to the suspended state may mean either maintaining the suspended state of the first node or switching the first node from the working state to the suspended state, which is not limited herein.
In this embodiment, a first node in either the suspended state or the working state may initiate voting. Fig. 5 is a schematic diagram of a node initiating voting in different states according to an embodiment of the present application.
As shown in fig. 5, a node can acquire and execute tasks only in the working state (running). When its heartbeat times out, the node enters the suspended state (pause) and no longer executes tasks. After entering the suspended state, the node attempts to initiate voting (voting), and when the determined target nodes include the first node, that is, when the node is among the nodes that participate in the work, it re-enters the working state.
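The state transitions can be sketched as a small Python state machine. The class and method names, and the choice of the suspended state as the initial state, are assumptions for illustration rather than part of the embodiment.

```python
from enum import Enum
from typing import Iterable


class NodeState(Enum):
    WORKING = "running"
    SUSPENDED = "pause"


class ExecutionNode:
    def __init__(self, name: str) -> None:
        self.name = name
        # Assumption: a node starts suspended until a vote admits it.
        self.state = NodeState.SUSPENDED

    def check_heartbeat(self, now: float, last_heartbeat: float,
                        interval_threshold: float) -> None:
        """Suspend the node when its own heartbeat has timed out."""
        if now - last_heartbeat >= interval_threshold:
            self.state = NodeState.SUSPENDED

    def vote_started(self) -> None:
        """Suspend the node while the target nodes are undetermined."""
        self.state = NodeState.SUSPENDED

    def targets_determined(self, targets: Iterable[str]) -> None:
        """Resume work only if this node is among the target nodes."""
        if self.name in set(targets):
            self.state = NodeState.WORKING

    def may_process_tasks(self) -> bool:
        return self.state is NodeState.WORKING


node = ExecutionNode("9.0.0.1")
node.targets_determined({"9.0.0.1", "9.0.0.2"})
print(node.state, node.may_process_tasks())
```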
According to this technical solution, the first node is set to the working state or the suspended state according to the circumstances described above, which further reduces the probability of split brain.
On the basis of any of the above embodiments, the following embodiments further describe how to determine the target node for processing tasks among the plurality of execution nodes based on the Nth-round first voting result.
In one possible implementation, determining the target node for processing tasks among the plurality of execution nodes based on the Nth-round first voting result may include:
determining the first candidate nodes indicated by the Nth-round first voting result as the target nodes.
In another possible implementation, determining the target node for processing tasks among the plurality of execution nodes based on the Nth-round first voting result includes:
querying the Nth-round second voting result corresponding to each of at least one second node, where a second voting result indicates the second candidate nodes for processing tasks among the plurality of execution nodes;
if the query for the Nth-round second voting result fails, determining the first candidate nodes indicated by the Nth-round first voting result as the target nodes;
if the query for the Nth-round second voting result succeeds, determining the target node for processing tasks among the plurality of execution nodes based on the Nth-round first voting result and the Nth-round second voting result.
Specifically, in this embodiment, each node may decide whether to trigger voting according to its own situation. If at least one second node has also triggered voting, the Nth-round second voting result corresponding to each such second node can be obtained; if no second node has triggered voting, no Nth-round second voting result can be obtained.
According to this technical solution, the first candidate nodes indicated by the Nth-round first voting result are determined as the target nodes when the query for the Nth-round second voting result fails, and the target nodes are determined based on both the Nth-round first voting result and the Nth-round second voting result when the query succeeds. This improves the accuracy with which the target nodes are determined and further reduces the probability of split brain.
In one possible implementation, determining the target node for processing tasks among the plurality of execution nodes based on the Nth-round first voting result and the Nth-round second voting result may include:
determining a target node set for processing tasks, and determining any node in the target node set as a target node, where the target node set is the intersection or the union of the first candidate nodes indicated by the Nth-round first voting result and the second candidate nodes indicated by the Nth-round second voting result.
In another possible implementation, determining the target node for processing tasks among the plurality of execution nodes based on the Nth-round first voting result and the Nth-round second voting result includes:
if it is determined, based on the Nth-round first voting result and the Nth-round second voting result, that the voting ending condition is met, determining a target node set for processing tasks, and determining any node in the target node set as a target node, where the target node set is the intersection or the union of the first candidate nodes indicated by the Nth-round first voting result and the second candidate nodes indicated by the Nth-round second voting result.
In this embodiment, the selection of the intersection or the union may be set as needed. Specifically, the number of nodes in the target node set obtained by selecting the intersection is smaller than or equal to the number of nodes in the target node set obtained by selecting the union.
It can be appreciated that selecting the intersection as the target node set can further reduce the split brain condition; and the union is selected as the target node set, so that the capability of processing tasks can be further improved.
Optionally, if the current number of tasks to be processed is less than the task number threshold, the intersection may be selected as the target node set, which reduces the occurrence of split brain; if the current number of tasks to be processed is greater than or equal to the task number threshold, the union may be selected as the target node set, which further improves the task processing capability.
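The optional selection rule can be sketched in Python as follows. The function name `target_node_set`, its parameters, and the sample values are assumptions for illustration.

```python
from typing import FrozenSet, Iterable


def target_node_set(first_vote: Iterable[str],
                    second_votes: Iterable[Iterable[str]],
                    pending_tasks: int,
                    task_number_threshold: int) -> FrozenSet[str]:
    """Combine the first voting result with the queried second voting results.

    Below the task number threshold the intersection is used (fewer nodes,
    more conservative); otherwise the union is used (more processing capacity).
    """
    votes = [frozenset(first_vote)] + [frozenset(v) for v in second_votes]
    if pending_tasks < task_number_threshold:
        return frozenset.intersection(*votes)
    return frozenset.union(*votes)


first = {"9.0.0.1", "9.0.0.2", "9.0.0.3"}
second = [{"9.0.0.1", "9.0.0.3"}]
print(sorted(target_node_set(first, second, pending_tasks=5, task_number_threshold=100)))
print(sorted(target_node_set(first, second, pending_tasks=500, task_number_threshold=100)))
```

As the description notes, the intersection never contains more nodes than the union.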
Wherein the voting ending condition includes at least one of:
the Nth-round first voting result is consistent with the Nth-round second voting result corresponding to each second node;
the at least one second node for which an Nth-round second voting result was queried is consistent with the surviving nodes among the second nodes.
In this embodiment, the Nth-round first voting result being consistent with the Nth-round second voting result corresponding to each second node may mean that the first candidate nodes indicated by the Nth-round first voting result are consistent with the second candidate nodes indicated by the Nth-round second voting result corresponding to each second node. In that case, the intersection and the union are the same and both equal the first candidate nodes indicated by the Nth-round first voting result, which is equivalent to determining the first candidate nodes indicated by the Nth-round first voting result as the target nodes.
If the at least one second node for which an Nth-round second voting result was queried is consistent with the surviving nodes among the second nodes, this indicates that every surviving node has initiated voting.
In this embodiment, the target nodes are determined only when the Nth-round first voting result is consistent with the Nth-round second voting result corresponding to each second node, and/or the at least one second node for which an Nth-round second voting result was queried is consistent with the surviving nodes among the second nodes. This improves the accuracy with which the target nodes are determined and further reduces the occurrence of split brain.
Optionally, the method of this embodiment may further include:
if it is determined, based on the Nth-round first voting result and the Nth-round second voting result, that the voting ending condition is not met, performing the (N+1)th round of voting processing.
In this embodiment, if it is determined based on the Nth-round first voting result and the Nth-round second voting result that the voting ending condition is not met, the target node cannot be determined at this point, so the (N+1)th round of voting processing is performed, and so on until the voting ending condition is met.
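A minimal Python sketch of checking the voting ending condition is given below. It requires both listed conditions to hold at once, which is one permitted combination chosen here as an assumption; the function name `voting_finished` and the data layout (a mapping from second-node identifier to its voting result) are likewise illustrative.

```python
from typing import Dict, FrozenSet, Iterable


def voting_finished(first_vote: FrozenSet[str],
                    second_votes: Dict[str, FrozenSet[str]],
                    surviving_second: Iterable[str]) -> bool:
    """Return True when the round can end.

    Condition 1: every queried second voting result equals the first result.
    Condition 2: the second nodes that voted are exactly the surviving
    second nodes.
    """
    results_agree = all(vote == first_vote for vote in second_votes.values())
    all_survivors_voted = set(second_votes) == set(surviving_second)
    return results_agree and all_survivors_voted


mine = frozenset({"9.0.0.1", "9.0.0.2", "9.0.0.3"})
others = {
    "9.0.0.2": frozenset({"9.0.0.1", "9.0.0.2", "9.0.0.3"}),
    "9.0.0.3": frozenset({"9.0.0.1", "9.0.0.3"}),  # disagrees with the others
}
if not voting_finished(mine, others, {"9.0.0.2", "9.0.0.3"}):
    print("ending condition not met: proceed to round N+1")
```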
For ease of understanding, the following embodiments provide examples for determining a target node to aid in the description.
Referring to table 2, table 2 is a schematic illustration of a target node participating in a processing task according to an embodiment of the present application.
TABLE 2
Here, ballot is the voting number, which indicates the round of voting; voter is the voting participant, identified in this example by its IP address; vote is the voting content, which is the list of target nodes that participate in processing tasks.
This example demonstrates one such decision process:
a) 9.0.0.1 starts, which triggers the first round of voting; the decision is that all tasks are performed by 9.0.0.1;
b) 9.0.0.2 starts, which triggers the second round of voting; the decision is that all tasks are performed by 9.0.0.1 and 9.0.0.2;
c) 9.0.0.3 starts, which triggers the third round of voting; the decision is that all tasks are performed by 9.0.0.1, 9.0.0.2, and 9.0.0.3;
d) 9.0.0.1 fails, which triggers the fourth round of voting; the decision is that all tasks are performed by 9.0.0.2 and 9.0.0.3.
Specifically, in the first round of voting, only the node with IP 9.0.0.1 survives, so the node with IP 9.0.0.1 votes in the first round, and the resulting first-round voting result is that 9.0.0.1 is the target node.
In the second round of voting, the nodes with IP 9.0.0.1 and IP 9.0.0.2 survive, so the second-round voting result of the node with IP 9.0.0.1 is 9.0.0.1 and 9.0.0.2, and the second-round voting result of the node with IP 9.0.0.2 is also 9.0.0.1 and 9.0.0.2. If the node with IP 9.0.0.1 can obtain the second-round voting result of the node with IP 9.0.0.2, the voting ending condition is met and the node with IP 9.0.0.1 stops voting. Likewise, if the node with IP 9.0.0.2 can obtain the second-round voting result of the node with IP 9.0.0.1, the node with IP 9.0.0.2 stops voting. Once every node has stopped voting, the nodes with IP 9.0.0.1 and 9.0.0.2 are the target nodes after the second round of voting.
In the third round of voting, the nodes with IP 9.0.0.1, IP 9.0.0.2, and IP 9.0.0.3 survive. The third-round voting result of each of these three nodes is 9.0.0.1, 9.0.0.2, and 9.0.0.3. Each of the three nodes can obtain the voting results of the other two nodes, so each of them stops voting, which means that the nodes with IP 9.0.0.1, IP 9.0.0.2, and IP 9.0.0.3 are the target nodes after the third round of voting.
By the fourth round of voting, 9.0.0.1 has failed and the nodes with IP 9.0.0.2 and IP 9.0.0.3 survive. The fourth-round voting result of the node with IP 9.0.0.2 is 9.0.0.2 and 9.0.0.3, and the fourth-round voting result of the node with IP 9.0.0.3 is also 9.0.0.2 and 9.0.0.3, which means that the nodes with IP 9.0.0.2 and 9.0.0.3 are the target nodes after the fourth round of voting.
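The four rounds described above can be replayed with a short Python sketch that derives, for each round, one record per surviving node consisting of the voting number, the voting participant, and the voting content. The function name `vote_rows` and the record layout are assumptions for illustration; the sketch reproduces the textual walkthrough and is not a copy of Table 2.

```python
from typing import List, Sequence, Tuple

# Surviving nodes in rounds 1 to 4, as described in the walkthrough.
ROUNDS: List[Sequence[str]] = [
    ["9.0.0.1"],
    ["9.0.0.1", "9.0.0.2"],
    ["9.0.0.1", "9.0.0.2", "9.0.0.3"],
    ["9.0.0.2", "9.0.0.3"],
]


def vote_rows(rounds: Sequence[Sequence[str]]) -> List[Tuple[int, str, str]]:
    """Every surviving node votes for the full list of surviving nodes."""
    rows = []
    for round_number, surviving in enumerate(rounds, start=1):
        content = ",".join(sorted(surviving))
        for participant in sorted(surviving):
            rows.append((round_number, participant, content))
    return rows


for row in vote_rows(ROUNDS):
    print(row)
```

Because all participants of a round vote for the same list, every round in this replay satisfies the voting ending condition.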
Referring to table 3, table 3 is a list of target nodes participating in a processing task according to an embodiment of the present application.
TABLE 3
In the process shown in Table 3, the difference is that in the third round of voting the voting result of node 9.0.0.3 is 9.0.0.1 and 9.0.0.3, which differs from the voting results of nodes 9.0.0.1 and 9.0.0.2. The third round of voting therefore does not meet the voting ending condition, that is, the third round cannot determine the target nodes, and a fourth round of voting is entered.
In one possible implementation, each of the plurality of execution nodes is connected to a vote management database, and the vote management database is used for recording the voting result obtained by each execution node in each round of voting processing.
Querying the Nth-round second voting result corresponding to each of at least one second node includes:
querying, from the vote management database, the Nth-round second voting result corresponding to each of the at least one second node.
In this embodiment, each of the plurality of execution nodes is connected to the vote management database, so a node can obtain the voting results of other nodes without direct communication between the task nodes, which reduces split brain caused by the loss of communication data between task nodes.
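A minimal sketch of recording and querying voting results through a shared database is given below, using Python's built-in `sqlite3` module with an in-memory database as a stand-in for the vote management database. The table name `votes`, its columns, the JSON encoding of the voting content, and the helper names are assumptions for illustration.

```python
import json
import sqlite3
from typing import Dict, FrozenSet, Iterable

conn = sqlite3.connect(":memory:")  # stand-in for the vote management database
conn.execute(
    "CREATE TABLE votes ("
    " round_no INTEGER, node TEXT, content TEXT,"
    " PRIMARY KEY (round_no, node))"
)


def record_vote(db: sqlite3.Connection, round_no: int, node: str,
                candidates: Iterable[str]) -> None:
    """Store (or overwrite) the voting result of one node for one round."""
    db.execute(
        "INSERT OR REPLACE INTO votes VALUES (?, ?, ?)",
        (round_no, node, json.dumps(sorted(candidates))),
    )


def query_second_votes(db: sqlite3.Connection, round_no: int,
                       first_node: str) -> Dict[str, FrozenSet[str]]:
    """Fetch the voting results of all other nodes for the given round.

    An empty dict corresponds to a failed query: no second node has voted.
    """
    cursor = db.execute(
        "SELECT node, content FROM votes WHERE round_no = ? AND node != ?",
        (round_no, first_node),
    )
    return {node: frozenset(json.loads(content)) for node, content in cursor}


record_vote(conn, 2, "9.0.0.1", ["9.0.0.1", "9.0.0.2"])
record_vote(conn, 2, "9.0.0.2", ["9.0.0.1", "9.0.0.2"])
print(query_second_votes(conn, 2, first_node="9.0.0.1"))
```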
In one possible implementation, the vote management database is further used for recording the heartbeat information reported by each execution node.
Obtaining heartbeat information of each second node includes:
obtaining the heartbeat information of each second node from the vote management database.
In this embodiment, each of the plurality of execution nodes is connected to the vote management database, so a node can obtain the heartbeat information of other nodes without direct communication between the task nodes, which reduces split brain caused by the loss of communication data between task nodes.
Note that there may be one or more vote management databases in this embodiment, which is not limited herein. Specifically, if there are multiple vote management databases, they may include a first vote management database and a second vote management database, where the first vote management database is used for recording the voting result obtained by each execution node in each round of voting processing, and the second vote management database is used for recording the heartbeat information reported by each execution node.
It can be understood that, because the first vote management database records the voting results and the second vote management database records the heartbeat information, part of the functionality remains available when one of the vote management databases becomes abnormal.
Referring to fig. 6, fig. 6 is a schematic diagram of a status update flow of a node according to an embodiment of the present application.
The core logic of the flow is as follows:
1) After each successful heartbeat update, the worker (node) may continue working for work_timeout (10 s);
2) If the heartbeat of a worker has not been updated for server_readline (15 s), the worker is regarded as no longer alive;
3) Each worker periodically checks whether the surviving workers it sees in the heartbeat table are consistent with the surviving workers of the last vote, and if not, initiates a new round of voting;
4) When the actual list of surviving workers differs from the last voting result, each worker re-initiates voting and stops executing tasks until the vote reaches agreement;
5) The vote reaches agreement when all the surviving workers named in the voting results have participated in the vote and their voting results are the same.
Updating the state machine of the execution node according to this flow may leave tasks unexecuted for a certain period of time, but it ensures that the tasks executed by the workers do not overlap at any moment. In addition, because each worker always takes the workers that are alive in the heartbeat table as its voting result, the voting results eventually converge.
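The flow above can be sketched as a small per-worker state machine. This is an illustrative sketch under stated assumptions: the class `Worker` and its method names are hypothetical, the two constants take the example values from the flow, and times are plain seconds supplied by the caller.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable

WORK_TIMEOUT = 10.0     # a worker may keep working this long after a successful heartbeat
SERVER_DEADLINE = 15.0  # a worker whose heartbeat is older than this is not alive


def is_alive(last_beat: float, now: float) -> bool:
    """Step 2: liveness as seen by the other workers."""
    return now - last_beat < SERVER_DEADLINE


@dataclass
class Worker:
    node_id: str
    last_heartbeat_ok: float = 0.0
    agreed: FrozenSet[str] = frozenset()  # result of the last agreed vote
    voting: bool = False
    round_no: int = 0

    def on_heartbeat_ok(self, now: float) -> None:
        """Step 1: a successful heartbeat update extends the working window."""
        self.last_heartbeat_ok = now

    def check_membership(self, alive: Iterable[str]) -> None:
        """Steps 3 and 4: start a new round if the surviving workers changed."""
        if frozenset(alive) != self.agreed:
            self.voting = True
            self.round_no += 1

    def on_agreement(self, agreed: Iterable[str]) -> None:
        """Step 5: the vote reached agreement."""
        self.agreed = frozenset(agreed)
        self.voting = False

    def may_execute(self, now: float) -> bool:
        """Tasks run only outside voting, inside the agreed set and the window."""
        return (not self.voting
                and self.node_id in self.agreed
                and now - self.last_heartbeat_ok < WORK_TIMEOUT)


w = Worker("9.0.0.1")
w.on_heartbeat_ok(0.0)
w.check_membership({"9.0.0.1", "9.0.0.2"})
print(w.may_execute(1.0))   # False: voting in progress
w.on_agreement({"9.0.0.1", "9.0.0.2"})
print(w.may_execute(9.0))   # True
print(w.may_execute(10.0))  # False: work_timeout elapsed without a new heartbeat
```

With these example values a worker stops working (10 s) before the other workers stop counting it as alive (15 s), which is one way to read the guarantee that executed tasks do not overlap.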
Referring to fig. 7, fig. 7 is a flowchart illustrating a task processing method of a distributed system according to an embodiment of the present application. The method as shown in fig. 7 may include:
S710, determining a target node for processing the task in a plurality of execution nodes.
The target node may be determined based on any method embodiment described above, which is not limited herein.
S720, if the target node comprises the first node, determining a first number of target tasks in the tasks to be processed, and processing the first number of target tasks.
According to the technical scheme of this embodiment, if the voting triggering condition is met, the Nth round of voting processing is performed to obtain an Nth-round first voting result, where the first voting result is used to indicate a first candidate node for processing tasks among the plurality of execution nodes, and a target node for processing tasks among the plurality of execution nodes is determined based on the Nth-round first voting result. Each round of voting processing includes: obtaining heartbeat information of each second node, and determining the surviving nodes among the second nodes based on that heartbeat information, where a second node is a node other than the first node among the plurality of execution nodes, and a surviving node is a node for which the time interval between the time of its latest heartbeat and the current time is smaller than a time interval threshold; and voting based on the result of determining the surviving nodes among the second nodes to obtain the first voting result. In other words, in each round of voting the target node for processing tasks is determined based on the nodes that are currently alive, so the target node is continuously updated. This addresses the technical problem that split-brain easily occurs in distributed systems and achieves the technical effect of reducing the occurrence of split-brain.
In one possible implementation manner, if the target nodes are at least two, determining a first number of target tasks in the tasks to be processed, including any one of the following:
determining a second number of target nodes, and determining a first number of tasks processed by the first node based on the total number of tasks to be processed and the second number, wherein the first number is inversely related to the second number, and the first number is positively related to the total number of tasks;
the method comprises the steps of obtaining processing capacity indication information of each target node, determining a ratio coefficient corresponding to each target node based on task processing capacity indicated by the processing capacity indication information, and determining a first number of tasks processed by a first node based on the total number of tasks to be processed and the ratio coefficient corresponding to the first node, wherein the ratio coefficient corresponding to each target node is positively correlated with the task processing capacity corresponding to each target node, and the first number is positively correlated with the ratio coefficient corresponding to the first node and the total number of tasks respectively.
Optionally, the ratio of the total number of tasks to the second number may be determined as the first number. Specifically, once the list of nodes participating in task processing has been decided, the total number of task shards equals the length of the node list, and the number of the shard executed by each node is the position of that node in the list sorted in a fixed order (for example, the lexicographic order of the node id). Assume that after a certain round of voting ends, the list of execution nodes is 9.0.0.2, 9.0.0.3. The total number of task shards is then 2; node 9.0.0.2 executes the task shard numbered 0 and node 9.0.0.3 executes the task shard numbered 1, where the task shard numbered 0 represents the set of all tasks satisfying id % 2 = 0.
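The sharding rule in this example can be sketched as follows. The function names `shard_index` and `tasks_for_node` are hypothetical, and the sketch assumes integer task ids and plain string sorting as the fixed order.

```python
from typing import Iterable, List, Tuple


def shard_index(node_id: str, target_nodes: Iterable[str]) -> Tuple[int, int]:
    """Return (shard number of node_id, total number of shards)."""
    ordered = sorted(target_nodes)  # fixed order: lexicographic order of node id
    return ordered.index(node_id), len(ordered)


def tasks_for_node(node_id: str, target_nodes: Iterable[str],
                   task_ids: Iterable[int]) -> List[int]:
    """Tasks whose id modulo the shard count equals the node's shard number."""
    index, total = shard_index(node_id, target_nodes)
    return [t for t in task_ids if t % total == index]


nodes = ["9.0.0.3", "9.0.0.2"]  # the order of the input list does not matter
print(tasks_for_node("9.0.0.2", nodes, range(6)))  # [0, 2, 4]
print(tasks_for_node("9.0.0.3", nodes, range(6)))  # [1, 3, 5]
```

Every node computes its own shard from the same agreed list, so the shards are disjoint and together cover all tasks without any further coordination.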
The processing capability indication information may be at least one of a Central Processing Unit (CPU) frequency or a core number.
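A capability-weighted split can be sketched as follows. The description only requires the ratio coefficient to be positively correlated with processing capacity; this sketch assumes, purely for illustration, that the coefficient is the node's share of the total core count and that leftover tasks go to the nodes with the largest fractional parts. The function name `weighted_counts` is hypothetical.

```python
from typing import Dict


def weighted_counts(total_tasks: int, capability: Dict[str, int]) -> Dict[str, int]:
    """Split total_tasks in proportion to each node's capability value."""
    total_cap = sum(capability.values())
    exact = {n: total_tasks * c / total_cap for n, c in capability.items()}
    counts = {n: int(v) for n, v in exact.items()}  # round down first
    leftover = total_tasks - sum(counts.values())
    # Hand the remaining tasks to the nodes with the largest fractional parts.
    by_fraction = sorted(exact, key=lambda n: (counts[n] - exact[n], n))
    for n in by_fraction[:leftover]:
        counts[n] += 1
    return counts


# Hypothetical core counts for two target nodes.
print(weighted_counts(10, {"9.0.0.2": 4, "9.0.0.3": 8}))
# {'9.0.0.2': 3, '9.0.0.3': 7}
```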
In the technical scheme of the embodiment, load balancing among the execution nodes can be realized, and the capacity of transverse expansion is realized.
Referring to fig. 8, fig. 8 is a schematic diagram illustrating a comparison of the CPU utilization of two execution nodes according to an embodiment of the present application. As shown in fig. 8, the CPU utilization of the two execution nodes is substantially the same.
Referring to fig. 9, fig. 9 is a schematic structural diagram of a node determining apparatus of a distributed system according to an embodiment of the present application. The apparatus as shown in fig. 9 may include a decision module 910. Wherein:
the decision module 910 is configured to perform an nth round of voting process to obtain an nth round of first voting result if the voting trigger condition is satisfied, where the first voting result is used to indicate a first candidate node of the processing tasks in the plurality of execution nodes, and N is an integer not less than 1;
the decision module 910 is further configured to determine a target node of the processing tasks in the plurality of execution nodes based on the nth round of the first voting result;
wherein, the decision module 910 is configured to, when performing each round of voting process:
obtaining heartbeat information of each second node, determining survival nodes in the second nodes based on the heartbeat information of each second node, wherein each second node is a node except the first node in a plurality of execution nodes, and each survival node is a node with a time interval between the time of the latest heartbeat and the current time smaller than a time interval threshold;
and voting based on the result of determining the surviving nodes among the second nodes to obtain the first voting result.
Optionally, the voting triggering condition includes at least one of:
reaching the voting triggering time;
the time interval between the time of the last heartbeat of the first node and the current time is less than a time interval threshold;
the survival nodes in the execution nodes are inconsistent with the target nodes determined by the N-1 round of voting, and N is an integer not less than 2;
a change in surviving nodes in the second node is detected.
Optionally, the first node is in a working state or a suspension state, the first node in the working state allows processing tasks, and the first node in the suspension state prohibits processing tasks; the decision module 910 is further configured to:
if the determined target node comprises the first node, setting the state of the first node as a working state;
if the time interval between the latest heartbeat time and the current time of the first node is greater than or equal to the time interval threshold value, setting the state of the first node as a pause state;
if the voting triggering condition is met and before the target node is determined, the state of the first node is set to be a pause state.
Optionally, the decision module 910 may be configured, when determining a target node of the processing tasks in the plurality of execution nodes based on the nth round of the first voting result, to:
querying the Nth round of second voting results respectively corresponding to at least one second node, wherein the second voting results are used for indicating second candidate nodes for processing tasks in a plurality of execution nodes;
if the query for the Nth-round second voting result fails, determining the first candidate node indicated by the Nth-round first voting result as a target node;
and if the query for the Nth-round second voting result succeeds, determining a target node for processing tasks in the plurality of execution nodes based on the Nth-round first voting result and the Nth-round second voting result.
Optionally, when determining the target node for processing tasks in the plurality of execution nodes based on the Nth-round first voting result and the Nth-round second voting result, the decision module 910 may be configured to:
if it is determined that the Nth-round first voting result and the Nth-round second voting result meet the voting ending condition, determining a target node set for processing tasks, and determining any node in the target node set as a target node, wherein the target node set is the intersection or the union of the first candidate node indicated by the Nth-round first voting result and the second candidate node indicated by the Nth-round second voting result;
wherein the voting ending condition includes at least one of:
the Nth-round first voting result is consistent with the Nth-round second voting result corresponding to each second node;
the at least one second node corresponding to the queried Nth-round second voting results is consistent with the surviving nodes among the second nodes;
the decision module 910 may be further configured to:
if it is determined that the Nth-round first voting result and the Nth-round second voting result do not meet the voting ending condition, perform the (N+1)th round of voting processing.
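The decision logic described above can be sketched as follows. The sketch makes several assumptions that the description leaves open: a failed query is represented by `None` or an empty result, both listed ending conditions are required together, and the intersection is used for the target node set. The function name `decide_target` is hypothetical.

```python
from typing import Dict, Optional, Set, Tuple


def decide_target(first_vote: Set[str],
                  second_votes: Optional[Dict[str, Set[str]]],
                  surviving_second: Set[str]) -> Tuple[Optional[Set[str]], bool]:
    """Return (target node set, finished). finished=False means round N+1 is needed."""
    if not second_votes:
        # Query failed: fall back to the candidates of the node's own vote.
        return set(first_vote), True
    same = all(v == first_vote for v in second_votes.values())
    all_voted = set(second_votes) == surviving_second
    if not (same and all_voted):
        return None, False
    target = set(first_vote)
    for v in second_votes.values():
        target &= v  # intersection; equals the union when all votes agree
    return target, True


mine = {"9.0.0.1", "9.0.0.2"}
print(decide_target(mine, None, {"9.0.0.2"}))                           # falls back to own vote
print(decide_target(mine, {"9.0.0.2": {"9.0.0.2"}}, {"9.0.0.2"}))       # (None, False)
print(decide_target(mine, {"9.0.0.2": set(mine)}, {"9.0.0.2"})[1])      # True
```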
Optionally, any one of the plurality of executing nodes is connected with a voting management database, and the voting management database is used for recording a voting result obtained by each round of voting processing of each executing node;
the decision module 910 may be configured to, when querying the nth round of second voting results corresponding to at least one second node respectively:
and inquiring the Nth round of second voting results respectively corresponding to at least one second node from the voting management database.
Optionally, the voting management database is further used for recording heartbeat information reported by each execution node; the decision module 910, when acquiring heartbeat information of each second node, may be configured to:
and obtaining heartbeat information of each second node from the voting management database.
Optionally, when voting based on the result of determining the surviving nodes among the second nodes to obtain the first voting result, the decision module 910 may be configured to perform at least one of the following:
if the surviving node determination result is that the surviving node determination in the second node is successful, the first candidate node indicated by the first voting result comprises the surviving nodes in the first node and the second node;
if the surviving node determination result is that the surviving node determination in the second node fails, the first candidate node indicated by the first voting result comprises the first node and does not comprise the surviving node in the second node.
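The two cases above can be sketched in a few lines. The function name `cast_vote` is hypothetical, and the sketch assumes that a failed survival determination is represented by `None`.

```python
from typing import FrozenSet, Iterable, Optional


def cast_vote(first_node: str,
              surviving_second: Optional[Iterable[str]]) -> FrozenSet[str]:
    """Build the first voting result from the survival determination."""
    if surviving_second is None:
        # Determination failed: the node votes only for itself.
        return frozenset({first_node})
    # Determination succeeded: the node votes for itself and the surviving second nodes.
    return frozenset({first_node, *surviving_second})


print(sorted(cast_vote("9.0.0.1", ["9.0.0.3"])))  # ['9.0.0.1', '9.0.0.3']
print(sorted(cast_vote("9.0.0.1", None)))         # ['9.0.0.1']
```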
Referring to fig. 10, fig. 10 is a schematic structural diagram of a task processing device of a distributed system according to an embodiment of the present application. The apparatus as shown in fig. 10 may include a decision module 910 and an execution module 920, where:
a decision module 910 for determining a target node for processing a task among a plurality of executing nodes, wherein the target node is determined based on the method as in any of the above embodiments;
the execution module 920 is configured to determine a first number of target tasks among the tasks to be processed if the target nodes include the first node, and process the first number of target tasks.
Optionally, if there are at least two target nodes, when determining a first number of target tasks among the tasks to be processed, the execution module 920 may be configured to perform any one of the following:
determining a second number of target nodes, and determining a first number of tasks processed by the first node based on the total number of tasks to be processed and the second number, wherein the first number is inversely related to the second number, and the first number is positively related to the total number of tasks;
the method comprises the steps of obtaining processing capacity indication information of each target node, determining a ratio coefficient corresponding to each target node based on task processing capacity indicated by the processing capacity indication information, and determining a first number of tasks processed by a first node based on the total number of tasks to be processed and the ratio coefficient corresponding to the first node, wherein the ratio coefficient corresponding to each target node is positively correlated with the task processing capacity corresponding to each target node, and the first number is positively correlated with the ratio coefficient corresponding to the first node and the total number of tasks respectively.
The device of the embodiments of the present application can perform the method provided by the embodiments of the present application, and its implementation principle is similar. The actions performed by the modules of the device correspond to the steps of the method of the embodiments of the present application; for a detailed functional description of each module, refer to the description of the corresponding method above, which is not repeated here.
An embodiment of the present application provides an execution node, including a memory, a processor, and a computer program stored on the memory, where the processor executes the computer program to implement the steps of the method of any embodiment.
In an optional embodiment, an execution node is provided. As shown in fig. 11, the execution node 1100 includes a processor 1101 and a memory 1103. The processor 1101 is coupled to the memory 1103, for example via a bus 1102. Optionally, the execution node 1100 may further include a transceiver 1104, which may be used for data interaction between this execution node and other execution nodes, such as sending and/or receiving data. It should be noted that, in practical applications, the number of transceivers 1104 is not limited to one, and the structure of the execution node 1100 does not constitute a limitation on the embodiments of the present application.
The processor 1101 may be a CPU (Central Processing Unit), a general-purpose processor, a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or execute the various exemplary logic blocks, modules and circuits described in connection with this disclosure. The processor 1101 may also be a combination that performs computing functions, such as a combination comprising one or more microprocessors, or a combination of a DSP and a microprocessor.
The bus 1102 may include a path that carries information between the above components. The bus 1102 may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus 1102 may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is shown in fig. 11, but this does not mean that there is only one bus or one type of bus.
The memory 1103 may be a ROM (Read Only Memory) or another type of static storage device that can store static information and instructions, a RAM (Random Access Memory) or another type of dynamic storage device that can store information and instructions, an EEPROM (Electrically Erasable Programmable Read Only Memory), a CD-ROM (Compact Disc Read Only Memory) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store a computer program and that can be read by a computer, without limitation.
The memory 1103 is used for storing a computer program for executing an embodiment of the present application, and is controlled to be executed by the processor 1101. The processor 1101 is configured to execute a computer program stored in the memory 1103 to implement the steps shown in the foregoing method embodiments.
Optionally, the execution node may include, but is not limited to, at least one of a personal computer, a notebook computer, a smart phone, a tablet computer, an internet of things device, a portable wearable device, or a server, where the internet of things device may be at least one of a smart speaker, a smart television, a smart air conditioner, or a smart vehicle device. The portable wearable device may be one of a smart watch, a smart bracelet, or a headset, etc. The server may be implemented as a stand-alone server or as a server cluster composed of a plurality of servers.
The embodiment of the application provides a distributed system which comprises a plurality of execution nodes, wherein any one of the plurality of execution nodes is used as a first node to execute the method of any one of the embodiments.
Embodiments of the present application provide a computer readable storage medium having a computer program stored thereon, which when executed by a processor, implements the steps of the foregoing method embodiments and corresponding content.
The embodiment of the application also provides a computer program product, which comprises a computer program, wherein the computer program can realize the steps and corresponding contents of the embodiment of the method when being executed by a processor.
It should be understood that, although various operation steps are indicated by arrows in the flowcharts of the embodiments of the present application, the order in which these steps are implemented is not limited to the order indicated by the arrows. In some implementations of embodiments of the application, the implementation steps in the flowcharts may be performed in other orders as desired, unless explicitly stated herein. Furthermore, some or all of the steps in the flowcharts may include multiple sub-steps or multiple stages based on the actual implementation scenario. Some or all of these sub-steps or phases may be performed at the same time, or each of these sub-steps or phases may be performed at different times, respectively. In the case of different execution time, the execution sequence of the sub-steps or stages can be flexibly configured according to the requirement, which is not limited by the embodiment of the present application.
The foregoing describes only optional implementations of some implementation scenarios of the present application. It should be noted that, for a person of ordinary skill in the art, other similar implementations adopted on the basis of the technical idea of the present application, without departing from the technical idea of the scheme of the present application, also fall within the protection scope of the embodiments of the present application.

Claims (15)

1. A method of determining nodes of a distributed system, the distributed system comprising a plurality of executing nodes, the method being applied to any first node of the plurality of executing nodes, the method comprising:
if the voting triggering condition is met, carrying out the N-th round of voting processing to obtain an N-th round of first voting result, wherein the first voting result is used for indicating a first candidate node for processing tasks in the plurality of execution nodes, and N is an integer not less than 1;
determining a target node of the processing task in the plurality of execution nodes based on the nth round first voting result, wherein the target node comprises at least one of the first candidate nodes;
wherein each round of voting process comprises:
acquiring heartbeat information of each second node, and determining a survival node in the second node based on the heartbeat information of each second node, wherein the second node is a node except the first node in the plurality of execution nodes, and the survival node is a node with a time interval between the time of the last heartbeat and the current time smaller than a time interval threshold;
and voting is carried out based on the determination result of the survival node in the second node, and a first voting result is obtained.
2. The method of claim 1, wherein the voting trigger conditions comprise at least one of:
reaching the voting triggering time;
the time interval between the time of the last heartbeat of the first node and the current time is smaller than a time interval threshold;
the survival nodes in the execution nodes are inconsistent with the target nodes determined by the N-1 round of voting, and N is an integer not less than 2;
a change in surviving nodes in the second node is detected.
3. The method of claim 1, wherein the first node is in an active state or a suspended state, the first node in the active state allowing processing tasks, the first node in the suspended state prohibiting processing tasks; the method further comprises at least one of:
if the determined target node comprises the first node, setting the state of the first node as a working state;
if the time interval between the time of the last heartbeat of the first node and the current time is greater than or equal to a time interval threshold value, setting the state of the first node to be a pause state;
and if the voting triggering condition is met and before the target node is determined, setting the state of the first node to be a pause state.
4. The method of claim 1, wherein the determining a target node of the processing task of the plurality of executing nodes based on the nth round first voting result comprises:
querying an Nth round of second voting results respectively corresponding to at least one second node, wherein the second voting results are used for indicating second candidate nodes for processing tasks in the plurality of execution nodes;
if the query for the Nth-round second voting result fails, determining the first candidate node indicated by the Nth-round first voting result as a target node;
and if the query for the Nth-round second voting result succeeds, determining a target node for processing tasks in the plurality of execution nodes based on the Nth-round first voting result and the Nth-round second voting result.
5. The method of claim 4, wherein the determining a target node of the processing tasks in the plurality of executing nodes based on the nth round of first voting results and the nth round of second voting results comprises:
if it is determined, based on the Nth-round first voting result and the Nth-round second voting result, that a voting ending condition is met, determining a target node set for processing tasks, and determining any node in the target node set as a target node, wherein the target node set is the intersection or the union of the first candidate node indicated by the Nth-round first voting result and the second candidate node indicated by the Nth-round second voting result;
wherein the voting ending condition comprises at least one of:
the Nth-round first voting result is consistent with the Nth-round second voting result corresponding to each second node;
the at least one second node corresponding to the queried Nth-round second voting results is consistent with the surviving nodes among the second nodes;
the method further comprises:
if it is determined that the Nth-round first voting result and the Nth-round second voting result do not meet the voting ending condition, performing the (N+1)th round of voting processing.
6. The method of claim 4, wherein any one of the plurality of executing nodes is connected to a vote management database, and the vote management database is configured to record a vote result obtained by each round of voting processing by each executing node;
the querying of the Nth-round second voting results respectively corresponding to at least one second node comprises:
querying, from the voting management database, the Nth-round second voting results respectively corresponding to the at least one second node.
7. The method of claim 6, wherein the vote management database is further configured to record heartbeat information reported by each executing node;
the obtaining of the heartbeat information of each second node comprises:
and obtaining heartbeat information of each second node from the voting management database.
8. The method according to any one of claims 1-7, wherein the voting based on the surviving node determination results in the second node results in a first voting result comprising at least one of the following:
if the surviving node determination result is that the surviving node in the second node is successfully determined, the first candidate node indicated by the first voting result comprises the surviving nodes in the first node and the second node;
and if the survival node determination result is that the survival node determination in the second node fails, the first candidate node indicated by the first voting result comprises the first node and does not comprise the survival node in the second node.
9. A method of task processing for a distributed system, the distributed system comprising a plurality of execution nodes, the method being applied to any first node of the plurality of execution nodes, the method comprising:
determining a target node of the plurality of executing nodes for processing tasks, wherein the target node is determined based on the method of any of claims 1-8;
and if the target node comprises the first node, determining a first number of target tasks in the tasks to be processed, and processing the first number of target tasks.
10. The method of claim 9, wherein if the target nodes are at least two, the determining a first number of target tasks among the tasks to be processed comprises any one of:
determining a second number of the target nodes, and determining a first number of tasks processed by the first node based on the total number of tasks to be processed and the second number, wherein the first number is inversely related to the second number, and the first number is positively related to the total number of tasks;
the method comprises the steps of obtaining processing capacity indication information of each target node, determining a ratio coefficient corresponding to each target node based on task processing capacity indicated by the processing capacity indication information, and determining a first number of tasks processed by a first node based on the total number of tasks to be processed and the ratio coefficient corresponding to the first node, wherein the ratio coefficient corresponding to each target node is positively correlated with the task processing capacity corresponding to each target node, and the first number is positively correlated with the ratio coefficient corresponding to the first node and the total number of tasks respectively.
11. A node determining apparatus of a distributed system, the distributed system comprising a plurality of execution nodes, the apparatus being applied to any first node of the plurality of execution nodes, the apparatus comprising:
the decision module is used for carrying out the N-th round of voting processing to obtain the N-th round of first voting result if the voting triggering condition is met, wherein the first voting result is used for indicating a first candidate node for processing tasks in the plurality of execution nodes, and N is an integer not less than 1;
the decision module is further configured to determine a target node of the processing task in the plurality of execution nodes based on the nth round of first voting results, where the target node includes at least one of the first candidate nodes;
wherein, the decision module is used for carrying out each round of voting processing:
acquiring heartbeat information of each second node, and determining a survival node in the second node based on the heartbeat information of each second node, wherein the second node is a node except the first node in the plurality of execution nodes, and the survival node is a node with a time interval between the time of the last heartbeat and the current time smaller than a time interval threshold;
and voting based on the result of determining the surviving nodes among the second nodes to obtain the first voting result.
12. A task processing device of a distributed system, wherein the distributed system comprises a plurality of execution nodes, the device being applied to any first node of the plurality of execution nodes, the device comprising:
a decision module for determining a target node of the plurality of execution nodes for processing a task, wherein the target node is determined based on the method of any of claims 1-8;
and the execution module is used for determining a first number of target tasks in the tasks to be processed if the target nodes comprise the first node, and processing the first number of target tasks.
13. An execution node comprising a memory, a processor, and a computer program stored in the memory, characterized in that the processor executes the computer program to implement the method of any one of claims 1-8 or the method of any one of claims 9-10.
14. A distributed system comprising a plurality of execution nodes, wherein any one of the plurality of execution nodes, acting as a first node, performs the method of any one of claims 1-8 or the method of any one of claims 9-10.
15. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the method of any one of claims 1-8 or the method of any one of claims 9-10.
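As an informal illustration of the per-round steps recited in claims 11 and 12, the following Python sketch shows one way a first node might derive the surviving second nodes from heartbeat timestamps, cast a vote, and then take a first number of pending tasks if it is a target node. It is a minimal sketch under stated assumptions, not the claimed implementation: the function names, the use of timestamps in seconds, the rule of voting for the smallest node ID, and the fixed `first_number` are all hypothetical, since the claims above do not specify how the vote is chosen or how the first number is computed.

```python
import time
from typing import Dict, List, Optional, Set


def surviving_nodes(last_heartbeat: Dict[str, float], threshold_s: float,
                    now: Optional[float] = None) -> List[str]:
    """Return the second nodes whose last heartbeat is recent enough.

    A node survives when (now - last heartbeat) is strictly smaller than
    the time interval threshold, matching the wording of claim 11.
    """
    now = time.time() if now is None else now
    return sorted(n for n, t in last_heartbeat.items() if now - t < threshold_s)


def cast_vote(first_node: str, last_heartbeat: Dict[str, float],
              threshold_s: float, now: Optional[float] = None) -> str:
    """One round of voting, seen from first_node.

    Assumption (not from the claims): the vote goes to the smallest node ID
    among the first node itself and the surviving second nodes.
    """
    alive = surviving_nodes(last_heartbeat, threshold_s, now)
    return min([first_node] + alive)


def claim_tasks(first_node: str, target_nodes: Set[str],
                pending: List[str], first_number: int) -> List[str]:
    """Return the target tasks first_node should process.

    Assumption: first_number is supplied by the caller; a node that is not
    a target node takes no tasks.
    """
    if first_node not in target_nodes:
        return []
    return pending[:first_number]


if __name__ == "__main__":
    # Hypothetical data: node-a was heard from 5 s ago, node-c 60 s ago.
    heartbeats = {"node-a": 95.0, "node-c": 40.0}
    candidate = cast_vote("node-b", heartbeats, threshold_s=10.0, now=100.0)
    print(candidate)
    print(claim_tasks("node-a", {candidate}, ["t1", "t2", "t3"], first_number=2))
```

Passing `now` explicitly keeps the liveness check deterministic for testing; a deployment would more likely rely on a monotonic clock and timestamps recorded locally when each heartbeat arrives.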

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311440542.6A CN117155930B (en) 2023-11-01 2023-11-01 Node determining method, task processing method and related devices of distributed system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311440542.6A CN117155930B (en) 2023-11-01 2023-11-01 Node determining method, task processing method and related devices of distributed system

Publications (2)

Publication Number Publication Date
CN117155930A true CN117155930A (en) 2023-12-01
CN117155930B CN117155930B (en) 2024-02-06

Family

ID=88906652

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311440542.6A Active CN117155930B (en) 2023-11-01 2023-11-01 Node determining method, task processing method and related devices of distributed system

Country Status (1)

Country Link
CN (1) CN117155930B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7664125B1 (en) * 2006-01-03 2010-02-16 Emc Corporation Indication forwarding in a distributed environment
CN108134712A (en) * 2017-12-19 2018-06-08 海能达通信股份有限公司 A kind of processing method, device and the equipment of distributed type assemblies fissure
US20190052520A1 (en) * 2017-08-14 2019-02-14 Nicira, Inc. Cooperative active-standby failover between network systems
CN113886129A (en) * 2021-10-21 2022-01-04 联想(北京)有限公司 Information processing method and device and electronic equipment
CN116055563A (en) * 2022-11-22 2023-05-02 北京明朝万达科技股份有限公司 Task scheduling method, system, electronic equipment and medium based on Raft protocol
CN116225655A (en) * 2023-03-07 2023-06-06 中国建设银行股份有限公司 Task scheduling method, device and storage medium
CN116346588A (en) * 2023-02-07 2023-06-27 浙江大华技术股份有限公司 Management node switching method, device, equipment and medium
WO2023147750A1 (en) * 2022-02-07 2023-08-10 上海哔哩哔哩科技有限公司 Edge cluster scheduling method and apparatus
CN116860421A (en) * 2023-09-05 2023-10-10 中信消费金融有限公司 Task processing method and task processing system

Also Published As

Publication number Publication date
CN117155930B (en) 2024-02-06

Similar Documents

Publication Publication Date Title
US10749954B2 (en) Cross-data center hierarchical consensus scheme with geo-aware leader election
US10482104B2 (en) Zero-data loss recovery for active-active sites configurations
US20180091588A1 (en) Balancing workload across nodes in a message brokering cluster
US20180091586A1 (en) Self-healing a message brokering cluster
US20060129615A1 (en) Performing scheduled backups of a backup node associated with a plurality of agent nodes
KR20140122240A (en) Managing partitions in a scalable environment
KR20140119090A (en) Dynamic load balancing in a scalable environment
CN106933672B (en) Distributed environment coordinated consumption queue method and device
CN112261135A (en) Node election method, system, device and equipment based on consistency protocol
US11102284B2 (en) Service processing methods and systems based on a consortium blockchain network
CN107038192B (en) Database disaster tolerance method and device
CN108462756B (en) Data writing method and device
CN113900598A (en) Block chain based data storage method, device, equipment and storage medium
CN111726388A (en) Cross-cluster high-availability implementation method, device, system and equipment
CN110740155A (en) Request processing method and device in distributed system
US20170235600A1 (en) System and method for running application processes
CN117155930B (en) Node determining method, task processing method and related devices of distributed system
CN112631756A (en) Distributed regulation and control method and device applied to space flight measurement and control software
WO2023244491A1 (en) Techniques for replication checkpointing during disaster recovery
US20230185631A1 (en) Embedded capacity-computer module for microservice load balancing and distribution
Lin et al. ReHRS: A hybrid redundant system for improving MapReduce reliability and availability
CN114296891A (en) Task scheduling method, system, computing device, storage medium and program product
US10949322B2 (en) Collecting performance metrics of a device
CN115967611A (en) Cross-domain switching processing method, device, equipment and storage medium
CN113434297A (en) Hot spot peak clipping method and device, storage server, client and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant