CN118034911A - Load balancing method and related device for message middleware Kafka

Load balancing method and related device for message middleware Kafka

Publication number
CN118034911A
Authority
CN
China
Prior art keywords
node
partition
nodes
partitions
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410031410.6A
Other languages
Chinese (zh)
Inventor
王超
刘墩建
李建伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hisense TransTech Co Ltd
Original Assignee
Hisense TransTech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hisense TransTech Co Ltd
Priority to CN202410031410.6A
Publication of CN118034911A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/546 Message passing systems or structures, e.g. queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/54 Indexing scheme relating to G06F9/54
    • G06F2209/547 Messaging middleware

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multi Processors (AREA)

Abstract

The application discloses a load balancing method and a related device for the message middleware Kafka, and relates to the technical field of data processing. In the method, when a topic in Kafka triggers a node update behavior, the average number of partitions held by the consumer nodes in the topic (the partition number average) is determined. Then, based on the node update behavior and the partition number average, a target partition that needs to be reassigned is determined from the plurality of partitions of the topic, and a target node for receiving the target partition is determined. Finally, the target partition is assigned to the target node to balance the load values of the consumer nodes under the topic, thereby achieving load balancing of the system. Because only the target partition is reassigned to the target node, and the partitions held by all consumer nodes need not be reclaimed and reassigned, the resource overhead and processing time of load balancing can be reduced.

Description

Load balancing method and related device for message middleware Kafka
Technical Field
The application relates to the technical field of data processing, and particularly discloses a load balancing method and a related device for message middleware Kafka.
Background
A distributed messaging system is a distributed architecture built on a message middleware mechanism, and Kafka is currently one of the most widely used message middleware. Kafka contains a plurality of topics, each corresponding to a service type. Each topic has a corresponding consumer group and a plurality of partitions that carry the message bodies generated in the topic, and each consumer group includes at least one consumer node (also called a consumer). Kafka assigns the partitions within a topic to the consumer nodes in the consumer group, so that each consumer node consumes the message bodies in the partitions it holds.
In practice, when a new consumer node joins a topic or a consumer node holding partitions leaves the topic, the partitions held by all consumer nodes under the topic are reclaimed together, and the reclaimed partitions are divided among the consumer nodes as evenly as possible according to the number of consumer nodes in the topic. This minimizes the difference in consumption time among the consumer nodes and achieves load balancing of the system.
However, this load balancing flow reclaims and reassigns the partitions held by all consumer nodes under the topic, which is time-consuming. In addition, it may reclaim and reassign partitions of consumer nodes that were already load-balanced; such redundant reassignment operations add unnecessary resource overhead.
Disclosure of Invention
The embodiment of the application provides a load balancing method and a related device for message middleware Kafka, which are used for reducing the resource overhead and the processing time consumption of load balancing.
In order to achieve the above object, the technical solution of the embodiment of the present application is as follows:
In a first aspect, an embodiment of the present application provides a load balancing method for the message middleware Kafka, the method including:
When a topic in Kafka triggers a node update behavior, determining the partition number average of the partitions held by the consumer nodes in the topic; wherein the topic is provided with a plurality of partitions, and each partition is assigned to a unique consumer node;
Determining a target partition from the plurality of partitions of the topic according to the node update behavior and the partition number average, and determining a target node from the consumer nodes;
and assigning the target partition to the target node to balance the load values of the consumer nodes.
In a second aspect, an embodiment of the present application provides a load balancing apparatus for the message middleware Kafka, the apparatus including:
A mean processing unit configured to: when a topic in Kafka triggers a node update behavior, determine the partition number average of the partitions held by the consumer nodes in the topic; wherein the topic is provided with a plurality of partitions, and each partition is assigned to a unique consumer node;
A partition node unit configured to: determine a target partition from the plurality of partitions of the topic according to the node update behavior and the partition number average, and determine a target node from the consumer nodes;
a load balancing unit configured to: assign the target partition to the target node to balance the load values of the consumer nodes.
In a third aspect, an embodiment of the present application further provides an electronic device, including a data detection unit and a processor:
the data detection unit is configured to: detect whether a topic in Kafka triggers a node update behavior;
The processor is configured to: when a topic in Kafka triggers a node update behavior, determine the partition number average of the partitions held by the consumer nodes in the topic; wherein the topic is provided with a plurality of partitions, and each partition is assigned to a unique consumer node;
determine a target partition from the plurality of partitions of the topic according to the node update behavior and the partition number average, and determine a target node from the consumer nodes;
and assign the target partition to the target node to balance the load values of the consumer nodes.
In a fourth aspect, embodiments of the present application also provide a computer readable storage medium having stored therein a computer program which, when executed by a processor, implements a method of any of the above first aspects.
In a fifth aspect, embodiments of the present application provide a computer program product comprising computer instructions stored in a computer readable storage medium; when the processor of the computer device reads the computer instructions from the computer readable storage medium, the processor executes the computer instructions, causing the computer device to perform the method of any of the above-mentioned first aspects.
In the embodiments of the application, when a topic in the message middleware Kafka triggers a node update behavior, the partition number average of the partitions held by the consumer nodes in the topic is determined. Then, according to the node update behavior actually triggered and the partition number average under the topic, a target partition that needs to be reassigned is determined from the plurality of partitions of the topic, and a target node for receiving the target partition is determined from the consumer nodes under the topic. Finally, the target partition is assigned to the target node to balance the load values of the consumer nodes under the topic, thereby achieving load balancing of the system. Because only the target partition is reassigned to the target node, and the partitions held by all consumer nodes need not be reclaimed and reassigned, partitions are not reassigned for consumer nodes whose load is already balanced, which saves resource overhead and effectively reduces the processing time of load balancing.
Additional features and advantages of the application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the application. The objectives and other advantages of the application will be realized and attained by the structure particularly pointed out in the written description and claims thereof as well as the appended drawings.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly described below, it will be apparent that the drawings in the following description are only some embodiments of the present application, and that other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a topic structure of Kafka according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a partition flow provided in an embodiment of the present application;
FIG. 3 is a schematic diagram of a repartitioning alignment according to an embodiment of the present application;
FIG. 4 is an overall flowchart of a load balancing method of a message middleware Kafka according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a partition number average calculation flow provided in an embodiment of the present application;
FIG. 6 is an overall flowchart of obtaining a target partition according to an embodiment of the present application;
FIG. 7 is an overall flowchart of obtaining a node to be processed according to an embodiment of the present application;
FIG. 8 is a schematic flow chart of acquiring a first node according to an embodiment of the present application;
FIG. 9 is a schematic flow chart of acquiring a second node according to an embodiment of the present application;
FIG. 10 is a flowchart of a method for acquiring a node to be processed according to an embodiment of the present application;
FIG. 11 is a schematic diagram of allocating target partitions to target nodes according to an embodiment of the present application;
FIG. 12 is an overall flowchart of obtaining a target node according to an embodiment of the present application;
FIG. 13 is a schematic flow chart of obtaining a third node according to an embodiment of the present application;
FIG. 14 is another schematic diagram of allocating target partitions to target nodes according to an embodiment of the present application;
FIG. 15 is a block diagram of a load balancing device of a message middleware Kafka according to an embodiment of the present application;
FIG. 16 is a block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the technical solutions of the present application, but not all embodiments. All other embodiments, based on the embodiments described in the present document, which can be obtained by a person skilled in the art without any creative effort, are within the scope of protection of the technical solutions of the present application.
In order to facilitate understanding of the technical scheme provided by the application, the following details are provided for the technical background:
With the continuous development of computer technology, the business scale of enterprises and the data volume involved in their business keep increasing. Because distributed messaging systems offer high throughput and low latency, they are now commonly deployed to meet the transmission requirements of massive service data.
Kafka is currently one of the most widely used message middleware in distributed messaging systems. Kafka contains a plurality of topics, each corresponding to a service type. Each topic has a corresponding consumer group and a plurality of partitions that carry the message bodies generated within the topic, and each consumer group includes at least one consumer node (Customer). Kafka assigns the partitions within a topic to the consumer nodes in the consumer group, so that each consumer node consumes the message bodies in the partitions it holds.
Specifically, as shown in FIG. 1, Topic0 includes 10 partitions in total, Partition0 to Partition9. The consumer group CustomerTeam corresponding to Topic0 includes a total of 4 consumer nodes, Customer0 to Customer3.
Kafka divides all the partitions under a topic among the consumer nodes as evenly as possible, so as to minimize the difference in consumption time among the consumer nodes and thereby achieve load balancing of the system. The even-division strategy may use conventional allocation schemes such as round-robin allocation or sequential allocation. For example, in FIG. 1, consumer node Customer0 holds partitions Partition0 to Partition2, consumer node Customer1 holds partitions Partition3 to Partition5, consumer node Customer2 holds partitions Partition6 and Partition7, and consumer node Customer3 holds partitions Partition8 and Partition9.
When a new consumer node joins the topic or a consumer node holding partitions leaves the topic, the partitions of all consumer nodes under the topic are reclaimed together, and the reclaimed partitions are divided among the consumer nodes as evenly as possible according to the number of consumer nodes in the topic.
Continuing the example of FIG. 1, as shown in FIG. 2: assuming that consumer node Customer4 newly joins Topic0, Kafka reclaims all the partitions held by every consumer node, and then redistributes the reclaimed partitions Partition0 to Partition9 among Customer0 to Customer4 as evenly as possible. For example, using a round-robin allocation scheme, partitions Partition0 and Partition5 are allocated to Customer0, partitions Partition1 and Partition6 are allocated to Customer1, partitions Partition2 and Partition7 are allocated to Customer2, partitions Partition3 and Partition8 are allocated to Customer3, and partitions Partition4 and Partition9 are allocated to Customer4, which has just joined the topic.
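The conventional full reassignment described above can be illustrated with a minimal Python sketch. The function name `round_robin_assign` and the use of plain strings for partitions and nodes are assumptions made for illustration; this is not Kafka's actual assignor code.

```python
def round_robin_assign(partitions, consumers):
    """Reassign every partition from scratch: partition i goes to
    consumer i mod len(consumers). Hypothetical helper for illustration."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# The FIG. 2 example: 10 partitions, 5 consumer nodes after Customer4 joins.
partitions = [f"Partition{i}" for i in range(10)]
consumers = [f"Customer{i}" for i in range(5)]
assignment = round_robin_assign(partitions, consumers)
print(assignment["Customer0"])  # ['Partition0', 'Partition5']
print(assignment["Customer4"])  # ['Partition4', 'Partition9']
```

Every partition passes through the reassignment here, including those of consumer nodes whose partition count does not change.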
However, this load balancing flow reclaims and reassigns the partitions held by all consumer nodes under the topic, which is time-consuming. In addition, it may reclaim and reassign partitions of consumer nodes that were already load-balanced; such redundant reassignment operations add unnecessary resource overhead.
Continuing the example of FIG. 2, as shown in FIG. 3: consumer nodes Customer2 and Customer3 each hold 2 partitions before the new consumer node Customer4 joins the topic, and each still holds 2 partitions after Customer4 joins the topic and the partitions are reassigned. This indicates that Customer2 and Customer3 were already load-balanced before the reassignment; the reassignment merely returns these two consumer nodes to the same load-balanced state, which is a redundant reassignment operation that adds resource overhead.
In order to solve the above problems, the inventive concept of the embodiments of the present application is as follows: when a topic in the message middleware Kafka triggers a node update behavior, the partition number average of the partitions held by the consumer nodes in the topic is determined. Then, according to the node update behavior actually triggered and the partition number average under the topic, a target partition that needs to be reassigned is determined from the plurality of partitions of the topic, and a target node for receiving the target partition is determined from the consumer nodes under the topic. Finally, the target partition is assigned to the target node to balance the load values of the consumer nodes under the topic, thereby achieving load balancing of the system. Because only the target partition is reassigned to the target node, and the partitions held by all consumer nodes need not be reclaimed and reassigned, partitions are not reassigned for consumer nodes whose load is already balanced, which saves resource overhead and effectively reduces the processing time of load balancing.
Next, FIG. 4 shows the overall flow of a load balancing method for the message middleware Kafka according to an embodiment of the present application, which specifically includes the following steps:
Step 401: when a topic in Kafka triggers a node update behavior, determining the partition number average of the partitions held by the consumer nodes in the topic; wherein the topic is provided with a plurality of partitions, and each partition is assigned to a unique consumer node;
The node update behavior in the embodiments of the application includes a new consumer node joining the topic or a consumer node holding partitions leaving the topic. When a topic that triggers a node update is detected, the partition number average of the partitions held by the consumer nodes in the topic is obtained.
For example, the topic shown in FIG. 5 is provided with 10 partitions in total, Partition0 to Partition9. Consumer node Customer0 holds Partition0 to Partition4, and consumer node Customer1 holds Partition5 to Partition9. When a new consumer node Customer2 joins the topic, the number of consumer nodes in the topic is updated to 3, and the average number of partitions held by each consumer node under the topic is (5+5+0)/3 = 10/3.
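The average in the FIG. 5 example can be reproduced with a short Python sketch. The function name `partition_average` and the dictionary layout are illustrative assumptions; an exact fraction is used so that 10/3 is not rounded.

```python
from fractions import Fraction


def partition_average(held_counts):
    """Average number of partitions per consumer node in a topic.
    held_counts maps node name -> number of partitions currently held;
    a newly joined node is included with a count of 0."""
    return Fraction(sum(held_counts.values()), len(held_counts))


# FIG. 5 example: Customer2 has just joined and holds nothing yet.
held = {"Customer0": 5, "Customer1": 5, "Customer2": 0}
average = partition_average(held)
print(average)  # 10/3
```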
Step 402: determining a target partition from a plurality of partitions of the subject according to the node updating behavior and the partition number average value, and determining a target node from the consumption nodes;
Step 403: and distributing the target partition to the target nodes to balance the load values of the consumption nodes.
As mentioned above, the node update behavior of the embodiment of the present application includes two behaviors, i.e., a new consuming node joining the theme or a consuming node already holding the partition leaving the theme.
In order to facilitate understanding of the technical solution of the present application, how to use the steps 402 to 403 to explain the load balancing process of Kafka when each behavior is triggered through the first scenario and the second scenario:
The method comprises the steps that firstly, a triggered node updating behavior represents a consumption node joining theme;
the target partition characterization in the embodiment of the application allocates unreasonable partitions after triggering node update behavior, and the target node is a node for receiving the target partition.
When executing step 402, if the triggered node update behavior characterizes that there is a consuming node to join the theme, since the consuming node of the newly joined theme has not already held a partition and is in an idle state, the consuming node of the newly joined theme is directly taken as a target node.
Next, through the flow shown in FIG. 6, the target partitions to be reassigned are found among the partitions under the topic, which specifically includes the following steps:
Step 601: detecting whether the total number of consumer nodes under the topic is greater than the total number of partitions in the topic;
Step 602: if the total number of nodes is greater than the total number of partitions, exiting the load balancing processing flow;
As mentioned above, Kafka divides all the partitions within a topic among the consumer nodes under the topic as evenly as possible. Therefore, after a new consumer node joins the topic, if the total number of consumer nodes in the topic is greater than the total number of partitions in the topic, each consumer node in the topic holds at most 1 partition. For example, if a topic has 10 partitions and 12 consumer nodes, only 10 consumer nodes are assigned a partition, and the other two consumer nodes wait. In this case, the consumer nodes under the topic that hold partitions all hold the same number of partitions, and there is no spare partition to assign, so all consumer nodes under the topic are in a load-balanced state.
When a new consumer node is added, the total number of nodes for the topic is updated from 12 to 13. At this point, the total number of nodes (13) is still greater than the total number of partitions (10) within the topic. Because no spare partition is available for assignment, the newly added consumer node has to wait, and all consumer nodes under the topic remain load-balanced, so no load balancing processing is needed.
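The early exit of Steps 601 and 602 amounts to a single comparison. The sketch below is a hedged illustration; the function name `needs_rebalance_on_join` is hypothetical, and the node count is assumed to already include the newly joined node.

```python
def needs_rebalance_on_join(total_nodes, total_partitions):
    """Steps 601-602: if there are more consumer nodes than partitions,
    every holder has at most one partition and the new node simply waits,
    so the load balancing flow is skipped."""
    return total_nodes <= total_partitions


# 13 nodes, 10 partitions: the new node waits, no rebalancing.
print(needs_rebalance_on_join(13, 10))  # False
# 3 nodes, 10 partitions: continue with Step 603.
print(needs_rebalance_on_join(3, 10))   # True
```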
Step 603: if the total number of nodes is not greater than the total number of partitions, determining the nodes to be processed from the consumer nodes according to the partition number average of the topic and the number of partitions held by each consumer node under the topic;
In implementation, the nodes to be processed can be determined from the consumer nodes through the flow shown in FIG. 7, which includes the following steps:
Step 701: detecting whether the total number of consumer nodes in the topic is greater than the expected partition number;
The expected partition number in the embodiments of the present application is the partition number average rounded down, and characterizes the number of partitions that need to be reassigned.
Step 702: if the total number of nodes is greater than the expected partition number, determining, from the consumer nodes, first nodes whose number of held partitions is greater than the partition number threshold; otherwise, jumping to step 706 below;
The partition number threshold in the embodiments of the application is the partition number average rounded up. As shown in FIG. 8, the topic is provided with 10 partitions in total, Partition0 to Partition9. Consumer node Customer0 holds partitions Partition0 to Partition3, consumer node Customer1 holds partitions Partition4 to Partition6, and consumer node Customer2 holds partitions Partition7 to Partition9.
When a new consumer node Customer3 joins the topic, the partition number average of the topic is calculated through step 401 above to be 10/4 = 2.5. The partition number average rounded up is then taken as the partition number threshold of the topic: ceil(2.5) = 3.
As mentioned above, the core idea of load balancing is to make the numbers of partitions held by the consumer nodes under the topic as equal as possible. Accordingly, in the embodiments of the present application, a consumer node whose number of held partitions is greater than the partition number threshold is taken as a first node, so that the target partitions to be assigned to the target node (i.e., the consumer node that has just joined the topic) are preferentially picked from the partitions held by the first node.
In the example of FIG. 8, when the new consumer node Customer3 joins the topic, the partition number threshold of the topic is 3. The only consumer node in the topic holding more partitions than the partition number threshold is Customer0 (4 partitions), i.e., Customer0 is the first node in the topic.
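As a worked illustration of the FIG. 8 numbers, the following Python sketch computes the partition number threshold (average rounded up), the expected partition number (average rounded down), and the first nodes. It follows Step 702 in comparing held counts against the partition number threshold; the function name `find_first_nodes` and its argument layout are assumptions.

```python
import math
from fractions import Fraction


def find_first_nodes(held_counts, total_nodes):
    """Return (threshold, expected, first_nodes) for a topic.
    held_counts: node name -> partitions held (holders only).
    total_nodes: node count including the newly joined node (assumption)."""
    average = Fraction(sum(held_counts.values()), total_nodes)
    threshold = math.ceil(average)   # partition number threshold
    expected = math.floor(average)   # expected partition number
    first_nodes = [n for n, count in held_counts.items() if count > threshold]
    return threshold, expected, first_nodes


# FIG. 8 example: 10 partitions, Customer3 has just joined (4 nodes).
held = {"Customer0": 4, "Customer1": 3, "Customer2": 3}
threshold, expected, first_nodes = find_first_nodes(held, total_nodes=4)
print(threshold, expected, first_nodes)  # 3 2 ['Customer0']
```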
Step 703: detecting whether the first number of first nodes is less than the expected partition number;
Step 704: if the first number is less than the expected partition number, selecting a second number of second nodes from the nodes to be tested, and taking the first nodes and the second nodes as the nodes to be processed;
Continuing the example of FIG. 8, as shown in FIG. 9: since there is only one first node within the topic, the first number of first nodes (1) is less than the expected partition number: floor(2.5) = 2.
In this case, nodes to be tested are selected from the consumer nodes of the topic, and a second number of second nodes are selected from the nodes to be tested. The nodes to be tested in the embodiments of the application are the consumer nodes that already hold partitions, excluding the first nodes. Within the topic, apart from the first node Customer0, the consumer nodes Customer1 and Customer2 already hold partitions, so they are the nodes to be tested. The second number is the difference between the expected partition number and the first number, i.e., the second number is 2-1 = 1.
In practical applications, each consumer node in Kafka carries a unique number (i.e., Customer0, Customer1, ..., CustomerN), assigned in ascending order of the timestamp at which the consumer node joined the topic; that is, the larger a consumer node's number, the later it joined the topic.
Since consumer nodes that join the topic later generally also leave the topic later, in order to avoid frequent load balancing processing, the second number of second nodes may be selected from the nodes to be tested in descending order of number. In the example of FIG. 8 described above, there are only two nodes to be tested, Customer1 and Customer2, so Customer2, which has the largest number, is taken as the second node.
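The selection of second nodes in descending order of node number can be sketched as follows. The helper `node_number`, which parses the numeric suffix of names such as Customer2, and the function `pick_second_nodes` are hypothetical names introduced only for this illustration.

```python
import re


def node_number(name):
    """Numeric suffix of a node name, e.g. 'Customer2' -> 2 (assumed format)."""
    return int(re.search(r"\d+$", name).group())


def pick_second_nodes(held_counts, first_nodes, expected):
    """Step 704: among partition holders that are not first nodes,
    pick (expected - len(first_nodes)) nodes, largest number first."""
    candidates = [n for n, count in held_counts.items()
                  if count > 0 and n not in first_nodes]
    candidates.sort(key=node_number, reverse=True)
    second_count = max(expected - len(first_nodes), 0)
    return candidates[:second_count]


# FIG. 9 example: one first node (Customer0), expected partition number 2.
held = {"Customer0": 4, "Customer1": 3, "Customer2": 3, "Customer3": 0}
second_nodes = pick_second_nodes(held, ["Customer0"], expected=2)
print(second_nodes)  # ['Customer2']
```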
Step 705: if the first number is not less than the expected partition number, taking the first nodes as the nodes to be processed;
If the first number of first nodes is not less than the expected partition number, load balancing can be achieved simply by selecting target partitions from the partitions held by the first nodes and reassigning them, so the first nodes are taken as the nodes to be processed.
Step 706: if the total number of nodes is not greater than the expected partition number, taking all consumer nodes that hold partitions as the nodes to be processed.
As shown in FIG. 10, assume that a topic is provided with 10 partitions in total, Partition0 to Partition9, and includes 2 consumer nodes in total, Customer0 and Customer1.
Consumer node Customer0 holds Partition0 to Partition4, and consumer node Customer1 holds Partition5 to Partition9. When a new consumer node Customer2 joins the topic, the expected partition number for the topic is floor(10/3) = 3.
At this point, since the total number of consumer nodes within the topic is not greater than the expected partition number, all consumer nodes within the topic that hold partitions (i.e., Customer0 and Customer1) are taken as the nodes to be processed.
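Steps 701 to 706 can be put together in one hedged Python sketch and checked against both the FIG. 8 and FIG. 10 examples. The names `select_pending_nodes` and `node_number` are hypothetical, and the total node count is assumed to include the newly joined node; both examples in the text give the same result under that reading.

```python
import math
import re
from fractions import Fraction


def node_number(name):
    """Numeric suffix of a node name, e.g. 'Customer1' -> 1 (assumed format)."""
    return int(re.search(r"\d+$", name).group())


def select_pending_nodes(held_counts, total_nodes):
    """Sketch of the FIG. 7 flow for a node-join event."""
    holders = {n: c for n, c in held_counts.items() if c > 0}
    average = Fraction(sum(holders.values()), total_nodes)
    expected = math.floor(average)    # expected partition number
    threshold = math.ceil(average)    # partition number threshold
    if total_nodes <= expected:       # Step 706: every holder is pending
        return sorted(holders, key=node_number)
    first = [n for n, c in holders.items() if c > threshold]  # Step 702
    if len(first) >= expected:        # Step 705
        return first
    rest = sorted((n for n in holders if n not in first),
                  key=node_number, reverse=True)
    return first + rest[:expected - len(first)]               # Step 704


# FIG. 8/9 example: Customer0 is the first node, Customer2 the second node.
print(select_pending_nodes({"Customer0": 4, "Customer1": 3, "Customer2": 3}, 4))
# ['Customer0', 'Customer2']
# FIG. 10 example: 3 nodes is not greater than floor(10/3) = 3.
print(select_pending_nodes({"Customer0": 5, "Customer1": 5}, 3))
# ['Customer0', 'Customer1']
```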
Step 604: releasing partitions held by the nodes to be processed to obtain the target partitions;
After the nodes to be processed have been selected from the consumer nodes through the flow shown in FIG. 7, partitions are released, according to the partition number threshold, from the partitions held by the nodes to be processed, and the released partitions are taken as the target partitions; the number of partitions held by each node to be processed after the release is not smaller than the partition number threshold.
As mentioned above, each consumer node carries a unique number (i.e., Customer0, Customer1, ..., CustomerN), assigned in ascending order of the timestamp at which the consumer node joined the topic; that is, the larger a consumer node's number, the later it joined the topic.
On this basis, a round-robin approach may be used: in descending order of number, one partition at a time is released from the partitions held by each node to be processed as a target partition, until the number of released partitions reaches the expected partition number.
Continuing the example of FIG. 10, as shown in FIG. 11, the expected partition number for the topic is 3, and the nodes to be processed are Customer0, which holds Partition0 to Partition4, and Customer1, which holds Partition5 to Partition9.
Partition9, the last partition held by the highest-numbered node Customer1, is released first; Partition4, the last partition held by Customer0, is released next; the polling then returns to Customer1, and its last remaining partition, Partition8, is released. Since the number of released partitions now reaches the expected partition number (3), the release operation stops.
The released partitions Partition4, Partition8 and Partition9 are the target partitions to be reassigned. The target partitions are then assigned, by executing step 403, to the target node Customer2 that has newly joined the topic.
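The polling release described above may be sketched in Python as follows. This is an illustrative sketch only: `node_number` and `release_round_robin` are hypothetical helper names, node numbers are assumed to be the trailing digits of the node name, and the check on how many partitions each node must still hold after the release is omitted.

```python
import re


def node_number(name: str) -> int:
    """Extract the trailing number of a node name such as 'Customer1'."""
    return int(re.search(r"(\d+)$", name).group(1))


def release_round_robin(holdings, expected):
    """Release one partition per node per round, highest-numbered node first,
    until `expected` partitions have been released."""
    order = sorted(holdings, key=node_number, reverse=True)
    remaining = {node: list(parts) for node, parts in holdings.items()}
    released = []
    while len(released) < expected:
        progressed = False
        for node in order:
            if len(released) == expected:
                break
            if remaining[node]:
                released.append(remaining[node].pop())  # last partition held
                progressed = True
        if not progressed:
            break  # no node has anything left to release
    return released, remaining


# Data of FIG. 10 and FIG. 11.
holdings = {
    "Customer0": [f"Partition{i}" for i in range(0, 5)],
    "Customer1": [f"Partition{i}" for i in range(5, 10)],
}
released, remaining = release_round_robin(holdings, expected=3)
print(released)  # release order: Partition9, Partition4, Partition8
```

With the data of FIG. 10, Customer0 is left holding four partitions and Customer1 three, and the three released partitions go to Customer2.
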
To further reduce resource overhead, after determining in step 401 that a node update behavior has been triggered, Kafka detects whether the total number of nodes under the topic is greater than the total number of partitions of the topic. If so, the newly added consumption node enters a waiting state; otherwise, the newly added consumption node is upgraded to a management node, which determines and controls the load balancing process through the foregoing flow.
Scenario two: the triggered node update behavior indicates that a consumption node leaves the topic;
When step 402 is executed, if the triggered node update behavior indicates that a consumption node leaves the topic, that node automatically releases the partitions it holds upon leaving, so the partitions it held before leaving can be taken directly as the target partitions. The target nodes are then determined according to the number of target partitions in the topic, the partition number average, and the number of partitions held by each consumption node.
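The admission check for a newly added consumption node can be illustrated with a minimal Python sketch. The function name and the returned labels are hypothetical, and it is an assumption here that the node total already counts the newly added node.

```python
def admit_new_node(total_nodes: int, total_partitions: int) -> str:
    """Decide what happens to a newly added consumption node.

    'wait'   : nodes outnumber partitions, so the new node waits.
    'manage' : the new node becomes the management node and runs the
               load balancing flow.
    """
    return "wait" if total_nodes > total_partitions else "manage"


print(admit_new_node(total_nodes=3, total_partitions=10))
print(admit_new_node(total_nodes=11, total_partitions=10))
```
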
Next, the target nodes for receiving the target partitions are found from the consumption nodes under the topic through the flow shown in FIG. 12, which includes the following steps:
Step 121: detecting whether an idle node exists among the consumption nodes under the topic;
As mentioned above, a partition carries message bodies that contain service data, and the consumption node holding a partition consumes those message bodies to complete the transmission of the service data. An idle node in the embodiments of the present application is a consumption node for which none of the held partitions contains a message body. That is, an idle node holds partitions but is not actually performing any consumption operation.
Step 122: if the idle node exists, taking the idle node as a target node;
Since the idle node does not perform the consuming operation, the idle node may be directly taken as the target node, and the target partition may be allocated to the target node by performing step 403.
If there is only one target node, all target partitions are allocated to the target node. If there are a plurality of target nodes, each target partition may be allocated to each target node in turn in order of the number of each target node from the largest to the smallest based on the polling allocation method described above.
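Steps 121 and 122, together with the polling allocation, may be sketched in Python as follows. The sketch is illustrative only: the per-partition backlog counts, the helper names and the sample data are hypothetical and are not taken from the figures.

```python
import re


def node_number(name: str) -> int:
    """Extract the trailing number of a node name such as 'Customer3'."""
    return int(re.search(r"(\d+)$", name).group(1))


def idle_nodes(backlog):
    """backlog maps node -> {partition: number of unconsumed message bodies}.
    A node is idle if it holds partitions and none of them carries a message body."""
    return [node for node, parts in backlog.items()
            if parts and all(count == 0 for count in parts.values())]


def assign_round_robin(target_partitions, target_nodes):
    """Hand out target partitions one by one, highest-numbered target node first."""
    order = sorted(target_nodes, key=node_number, reverse=True)
    assignment = {node: [] for node in order}
    if not order:
        return assignment
    for index, partition in enumerate(target_partitions):
        assignment[order[index % len(order)]].append(partition)
    return assignment


# Hypothetical state: Customer1 and Customer3 hold only empty partitions.
backlog = {
    "Customer1": {"Partition3": 0, "Partition4": 0},
    "Customer2": {"Partition5": 7},
    "Customer3": {"Partition6": 0},
}
idle = idle_nodes(backlog)
assignment = assign_round_robin(["Partition0", "Partition1", "Partition2"], idle)
print(idle, assignment)
```
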
Step 123: if no idle node exists, detecting whether a third node exists in each consumption node;
A third node in the embodiments of the present application is a consumption node whose number of held partitions is smaller than the partition number average. Similar to the logic for the first node, a number of held partitions below the partition number average indicates that the consumption node has less data to consume and a relatively low load value, so the target partitions can be allocated to it preferentially.
Step 124: if a third node exists, detecting whether the third number of third nodes is smaller than the number of target partitions; otherwise, jumping to step 127 below;
Step 125: if the third number is not smaller than the number of target partitions, taking the third nodes as the target nodes;
That is, if the third number of third nodes is not smaller than the number of target partitions, the third nodes alone can take up all of the target partitions. In this case, the third nodes are taken directly as the target nodes, and the target partitions are allocated to the target nodes by executing the foregoing step 403.
Step 126: if the third number is smaller than the number of target partitions, determining a fourth number of fourth nodes from the consumption nodes, and taking the third nodes and the fourth nodes as the target nodes;
A fourth node in the embodiments of the present application is a consumption node under the topic that holds partitions and is not a third node. The fourth number is the difference between the number of target partitions and the third number. For example, in FIG. 13, Topic0 contains 10 partitions in total, Partition0 to Partition9.
Consumption node Customer0 holds Partition0 to Partition2, Customer1 holds Partition3 to Partition5, Customer2 holds Partition6 and Partition7, and Customer3 holds Partition8 and Partition9.
When consumption node Customer0 completes its consumption operation and leaves the topic, Partition0 to Partition2, originally held by Customer0, are taken as the target partitions; that is, the number of target partitions under the topic is 3. The average number of partitions held by the remaining consumption nodes under the topic is 7/3.
Since Customer2 and Customer3 each hold 2 partitions, which is less than the partition number average of 7/3, Customer2 and Customer3 are the third nodes under the topic; that is, the third number of third nodes under the topic is 2.
Although not shown in the figure, since the third number (2) is smaller than the number of target partitions (3), the fourth number is calculated as the difference between the two: 3 - 2 = 1. Next, a fourth number of fourth nodes are selected from the consumption nodes under the topic that hold partitions and are not third nodes.
In implementation, the fourth nodes may be selected in descending order of consumption node number, taking the fourth number of highest-numbered candidates, and the third nodes and the fourth nodes are taken as the target nodes. In the example of FIG. 13, the third nodes Customer2 and Customer3 and the fourth node Customer1 are the target nodes.
Further, after the target nodes are found through the above flow, as shown in FIG. 14, the target partitions may be allocated to the target nodes in turn, in descending order of node number, based on the polling allocation method. That is, target partition Partition0 is first assigned to Customer3, target partition Partition1 is then assigned to Customer2, and target partition Partition2 is finally assigned to Customer1, which completes the load balancing of the consumption nodes.
Step 127: if no third node exists among the consumption nodes, taking all consumption nodes under the topic as the target nodes.
Since a third node is a consumption node whose number of held partitions is smaller than the partition number average, the absence of any third node under the topic can mean only one thing: every consumption node holds the same number of partitions, which equals the partition number average.
In this case, all consumption nodes under the topic can be taken as the target nodes, and the target partitions are allocated to the target nodes in turn, in descending order of node number, based on the polling allocation method.
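Steps 123 to 127 and the example of FIG. 13 and FIG. 14 may be sketched in Python as follows. The helper names are hypothetical, node numbers are assumed to be the trailing digits of the node name, and the idle-node check of steps 121 and 122 is assumed to have already found no idle node.

```python
import re
from fractions import Fraction


def node_number(name: str) -> int:
    """Extract the trailing number of a node name such as 'Customer3'."""
    return int(re.search(r"(\d+)$", name).group(1))


def select_target_nodes(holdings, target_count):
    """holdings maps each remaining node to the partitions it holds."""
    average = Fraction(sum(len(p) for p in holdings.values()), len(holdings))
    third = [n for n, p in holdings.items() if len(p) < average]
    if not third:                       # Step 127: all nodes hold the average
        return list(holdings)
    if len(third) >= target_count:      # Step 125: third nodes are enough
        return third
    fourth_count = target_count - len(third)          # Step 126
    candidates = sorted(
        (n for n, p in holdings.items() if p and n not in third),
        key=node_number, reverse=True)
    return third + candidates[:fourth_count]


def assign_round_robin(target_partitions, target_nodes):
    """Hand out target partitions one by one, highest-numbered target node first."""
    order = sorted(target_nodes, key=node_number, reverse=True)
    assignment = {node: [] for node in order}
    for index, partition in enumerate(target_partitions):
        assignment[order[index % len(order)]].append(partition)
    return assignment


# FIG. 13 after Customer0 has left; its partitions are the target partitions.
holdings = {
    "Customer1": ["Partition3", "Partition4", "Partition5"],
    "Customer2": ["Partition6", "Partition7"],
    "Customer3": ["Partition8", "Partition9"],
}
targets = ["Partition0", "Partition1", "Partition2"]
target_nodes = select_target_nodes(holdings, len(targets))
assignment = assign_round_robin(targets, target_nodes)
print(target_nodes, assignment)
```

Using `Fraction` keeps the comparison with the average 7/3 exact. After the assignment, each of the three remaining nodes holds three partitions.
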
To further reduce resource overhead, Kafka may upgrade the consumption node with the smallest load value under the topic to a monitoring node, which monitors changes in the number of consumption nodes under the topic.
After a node update behavior in which a consumption node leaves the topic is detected, if the monitoring node is currently an idle node, it directly receives all target partitions released by the consumption node that left the topic.
If the monitoring node is not an idle node, the monitoring node is upgraded to a management node, which determines and controls the load balancing process through the foregoing flow. In addition, if the consumption node that left the topic was the topic's original monitoring node, the consumption node with the smallest load is selected from the topic as the new monitoring node.
In the load balancing process of the embodiments of the present application, only the target partitions are reassigned to the target nodes, and the partitions held by all consumption nodes need not be reclaimed and reassigned. This avoids reassigning partitions for consumption nodes whose loads are already balanced, saves resource overhead, and effectively reduces the time consumed by load balancing.
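The monitoring-node handling described above may be illustrated with the following Python sketch. The function name, the load values and the action labels are hypothetical. The text does not say how a newly elected monitoring node treats the leave event that caused its election; the sketch assumes it handles that event in the same way.

```python
def on_node_leave(monitor, leaving, loads, idle):
    """Return (monitoring node after the event, action taken).

    loads : node -> load value (hypothetical numeric measure)
    idle  : set of nodes that are currently idle
    """
    remaining = {node: load for node, load in loads.items() if node != leaving}
    if leaving == monitor:
        # The monitoring node itself left: elect the least-loaded remaining node.
        monitor = min(remaining, key=remaining.get)
    if monitor in idle:
        return monitor, "monitor_takes_all_target_partitions"
    return monitor, "monitor_becomes_management_node"


loads = {"Customer0": 5, "Customer1": 0, "Customer2": 3}
print(on_node_leave("Customer1", "Customer0", loads, idle={"Customer1"}))
print(on_node_leave("Customer1", "Customer1", loads, idle={"Customer1"}))
```
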
Based on the same inventive concept, the embodiment of the application also provides a load balancing device of the message middleware Kafka, specifically as shown in fig. 15, the device comprises:
An average value processing unit 151 configured to: when a topic in Kafka triggers a node update behavior, determine the partition number average of the partitions held by the consumption nodes in the topic; wherein a plurality of partitions are provided in the topic, and each partition is assigned a unique consumption node;
A partition node unit 152 configured to: determine a target partition from the plurality of partitions of the topic, and a target node from the consumption nodes, according to the node update behavior and the partition number average;
A load balancing unit 153 configured to: allocate the target partition to the target node so as to balance the load values of the consumption nodes.
In some embodiments, performing the determining a target partition from the plurality of partitions of the topic and a target node from the consuming nodes according to the node update behavior and the partition number average, the partition node unit 152 is configured to:
When the node updating behavior characterizes that a consumption node joins a theme, determining whether the total number of nodes of each consumption node is larger than the total number of partitions in the theme;
If the total number of the nodes is larger than the total number of the partitions, determining a node to be processed according to the partition number average value and the number of the partitions held by each consumption node; otherwise, canceling the load balancing processing of each message node;
Releasing the partition held by the node to be processed to obtain the target partition, and taking the newly added consumption node in the theme as the target node.
In some embodiments, performing the determining the node to be processed according to the partition number average and the number of held partitions of each consuming node, the partition node unit 152 is configured to:
If the total number of nodes of each consumption node is larger than the expected partition number, determining a first node with the partition number larger than a partition number threshold value from the consumption nodes, and determining the node to be processed according to the first node; the estimated partition number is obtained by rounding down according to the partition number threshold, and the partition number threshold is obtained by rounding up according to the partition number average;
And if the total number of the nodes of each consumption node is not larger than the predicted partition number, taking the consumption nodes of the held partition as the nodes to be processed.
In some embodiments, performing the determining the pending node from the first node, the partition node unit 152 is configured to:
If the first number of first nodes is smaller than the expected partition number, selecting a second number of second nodes from the nodes to be detected, and taking the first nodes and the second nodes as the nodes to be processed; the nodes to be detected are the consumption nodes, other than the first nodes, that hold partitions; the second number is determined from the expected partition number and the first number;
And if the first number of first nodes is not smaller than the expected partition number, taking the first nodes as the nodes to be processed.
In some embodiments, performing the freeing of the partition held by the node to be processed results in the target partition, and the partition node unit 152 is configured to:
releasing the partitions with the estimated partition number from the partitions held by each node to be processed;
taking the released partition as the target partition; the number of the holding partitions of each node to be processed after releasing the partitions is not smaller than the number of the predicted partitions.
In some embodiments, performing the determining a target partition from the plurality of partitions of the topic and a target node from the consuming nodes according to the node update behavior and the partition number average, the partition node unit 152 is configured to:
when the node updating behavior characterizes that a consuming node leaves a theme, taking a partition held before the consuming node leaves the theme as the target partition;
and determining the target nodes according to the partition number of the target partitions in the theme, the partition number average value and the holding partition number of each consumption node.
In some embodiments, before determining the target node according to the number of target partitions within the topic, the partition number average, and the number of partitions held by each consumption node, the partition node unit 152 is further configured to:
determine that no idle node exists among the consumption nodes; wherein none of the partitions held by an idle node carries a message body;
the partition node unit 152 is further configured to: if an idle node exists among the consumption nodes, take the idle node as the target node.
In some embodiments, performing the determining the target node according to the number of partitions of the target partition within the subject, the partition number average, and the number of held partitions of the consuming nodes, the partition node unit 152 is configured to:
If a third node with the partition number smaller than the partition number average value exists in each consumption node, and the third number of the third node is not smaller than the partition number, the third node is taken as a target node;
if the third nodes exist in the consumption nodes and the third number is smaller than the partition number, determining a fourth node with a fourth number from the consumption nodes, and taking the third nodes and the fourth node as target nodes; wherein the fourth node is a consuming node holding a partition except the third node; the fourth number is determined from the number of partitions and the third number;
and if the third node does not exist in the consumption nodes, all consumption nodes under the theme are taken as target nodes.
Based on the same inventive concept, the embodiments of the present application further provide an electronic device which, as shown in FIG. 16, includes a data detection unit 161 and a processor 162:
The data detection unit 161 is configured to: detect whether a topic in Kafka triggers a node update behavior;
The processor 162 is configured to: when a topic in Kafka triggers a node update behavior, determine the partition number average of the partitions held by the consumption nodes in the topic; wherein a plurality of partitions are provided in the topic, and each partition is assigned a unique consumption node;
determine a target partition from the plurality of partitions of the topic, and a target node from the consumption nodes, according to the node update behavior and the partition number average;
and allocate the target partition to the target node so as to balance the load values of the consumption nodes.
In some embodiments, performing the determining a target partition from the plurality of partitions of the topic and a target node from the consuming nodes according to the node update behavior and the partition number average, the processor 162 is configured to:
When the node updating behavior characterizes that a consumption node joins a theme, determining whether the total number of nodes of each consumption node is larger than the total number of partitions in the theme;
If the total number of the nodes is larger than the total number of the partitions, determining a node to be processed according to the partition number average value and the number of the partitions held by each consumption node; otherwise, canceling the load balancing processing of each message node;
Releasing the partition held by the node to be processed to obtain the target partition, and taking the newly added consumption node in the theme as the target node.
In some embodiments, the processor 162 is configured to:
If the total number of nodes of each consumption node is larger than the expected partition number, determining a first node with the partition number larger than a partition number threshold value from the consumption nodes, and determining the node to be processed according to the first node; the estimated partition number is obtained by rounding down according to the partition number threshold, and the partition number threshold is obtained by rounding up according to the partition number average;
And if the total number of the nodes of each consumption node is not larger than the predicted partition number, taking the consumption nodes of the held partition as the nodes to be processed.
In some embodiments, performing the determining the node to be processed from the first node, the processor 162 is configured to:
If the first number of first nodes is smaller than the expected partition number, selecting a second number of second nodes from the nodes to be detected, and taking the first nodes and the second nodes as the nodes to be processed; the nodes to be detected are the consumption nodes, other than the first nodes, that hold partitions; the second number is determined from the expected partition number and the first number;
And if the first number of first nodes is not smaller than the expected partition number, taking the first nodes as the nodes to be processed.
In some embodiments, performing the freeing of the partition held by the node to be processed results in the target partition, the processor 162 being configured to:
releasing the partitions with the estimated partition number from the partitions held by each node to be processed;
taking the released partition as the target partition; the number of the holding partitions of each node to be processed after releasing the partitions is not smaller than the number of the predicted partitions.
In some embodiments, performing the determining a target partition from the plurality of partitions of the topic and a target node from the consuming nodes according to the node update behavior and the partition number average, the processor 162 is configured to:
when the node updating behavior characterizes that a consuming node leaves a theme, taking a partition held before the consuming node leaves the theme as the target partition;
and determining the target nodes according to the partition number of the target partitions in the theme, the partition number average value and the holding partition number of each consumption node.
In some embodiments, before determining the target node according to the number of target partitions within the topic, the partition number average, and the number of partitions held by each consumption node, the processor 162 is further configured to:
determine that no idle node exists among the consumption nodes; wherein none of the partitions held by an idle node carries a message body; the processor 162 is further configured to:
if an idle node exists among the consumption nodes, take the idle node as the target node.
In some embodiments, the processor 162 is configured to perform the determining the target node based on the number of partitions of the target partition within the topic, the partition number average, and the number of held partitions of the respective consuming nodes by:
If a third node with the partition number smaller than the partition number average value exists in each consumption node, and the third number of the third node is not smaller than the partition number, the third node is taken as a target node;
if the third nodes exist in the consumption nodes and the third number is smaller than the partition number, determining a fourth node with a fourth number from the consumption nodes, and taking the third nodes and the fourth node as target nodes; wherein the fourth node is a consuming node holding a partition except the third node; the fourth number is determined from the number of partitions and the third number;
and if the third node does not exist in the consumption nodes, all consumption nodes under the theme are taken as target nodes.
The embodiments of the present application also provide a computer storage medium storing computer program instructions which, when run on a computer, cause the computer to execute the steps of the above load balancing method for the message middleware Kafka.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present application without departing from the scope of the application. Thus, it is intended that the present application also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (10)

1. A load balancing method for the message middleware Kafka, the method comprising:
when a topic in Kafka triggers a node update behavior, determining the partition number average of the partitions held by the consumption nodes in the topic; wherein a plurality of partitions are provided in the topic, and each partition is assigned a unique consumption node;
determining a target partition from the plurality of partitions of the topic, and a target node from the consumption nodes, according to the node update behavior and the partition number average;
and allocating the target partition to the target node so as to balance the load values of the consumption nodes.
2. The method of claim 1, wherein the determining a target partition from the plurality of partitions of the topic and the target node from the consuming node based on the node update behavior and the partition number average comprises:
When the node updating behavior characterizes that a consumption node joins a theme, determining whether the total number of nodes of each consumption node is larger than the total number of partitions in the theme;
If the total number of the nodes is larger than the total number of the partitions, determining a node to be processed according to the partition number average value and the number of the partitions held by each consumption node; otherwise, canceling the load balancing processing of each message node;
Releasing the partition held by the node to be processed to obtain the target partition, and taking the newly added consumption node in the theme as the target node.
3. The method of claim 2, wherein the determining the node to be processed according to the partition number average and the number of held partitions of each consuming node comprises:
If the total number of nodes of each consumption node is larger than the expected partition number, determining a first node with the partition number larger than a partition number threshold value from the consumption nodes, and determining the node to be processed according to the first node; the estimated partition number is obtained by rounding down according to the partition number threshold, and the partition number threshold is obtained by rounding up according to the partition number average;
And if the total number of the nodes of each consumption node is not larger than the predicted partition number, taking the consumption nodes of the held partition as the nodes to be processed.
4. A method according to claim 3, wherein said determining the node to be processed from the first node comprises:
if the first number of first nodes is smaller than the expected partition number, selecting a second number of second nodes from the nodes to be detected, and taking the first nodes and the second nodes as the nodes to be processed; the nodes to be detected are the consumption nodes, other than the first nodes, that hold partitions; the second number is determined from the expected partition number and the first number;
and if the first number of first nodes is not smaller than the expected partition number, taking the first nodes as the nodes to be processed.
5. The method according to claim 3 or 4, wherein said releasing the partition held by the node to be processed to obtain the target partition comprises:
releasing the partitions with the estimated partition number from the partitions held by each node to be processed;
taking the released partition as the target partition; the number of the holding partitions of each node to be processed after releasing the partitions is not smaller than the number of the predicted partitions.
6. The method of claim 1, wherein the determining a target partition from the plurality of partitions of the topic and the target node from the consuming node based on the node update behavior and the partition number average comprises:
when the node updating behavior characterizes that a consuming node leaves a theme, taking a partition held before the consuming node leaves the theme as the target partition;
and determining the target nodes according to the partition number of the target partitions in the theme, the partition number average value and the holding partition number of each consumption node.
7. The method of claim 6, wherein before determining the target node according to the number of target partitions within the topic, the partition number average, and the number of partitions held by each consumption node, the method further comprises:
determining that no idle node exists among the consumption nodes; wherein none of the partitions held by an idle node carries a message body;
the method further comprises:
if an idle node exists among the consumption nodes, taking the idle node as the target node.
8. The method of claim 6, wherein the determining the target node based on the number of partitions of the target partition within the topic, the partition number average, and the number of partitions held by each consuming node comprises:
If a third node with the partition number smaller than the partition number average value exists in each consumption node, and the third number of the third node is not smaller than the partition number, the third node is taken as a target node;
if the third nodes exist in the consumption nodes and the third number is smaller than the partition number, determining a fourth node with a fourth number from the consumption nodes, and taking the third nodes and the fourth node as target nodes; wherein the fourth node is a consuming node holding a partition except the third node; the fourth number is determined from the number of partitions and the third number;
and if the third node does not exist in the consumption nodes, all consumption nodes under the theme are taken as target nodes.
9. An electronic device, comprising a data detection unit and a processor, wherein:
the data detection unit is configured to detect whether a topic that triggers a node update behavior exists in Kafka;
the processor is configured to: when a topic that triggers a node update behavior exists in Kafka, determine the partition number average of the partitions held by the consumption nodes in the topic; wherein a plurality of partitions are arranged in the topic, and each partition is assigned a unique consumption node;
determine a target partition from the plurality of partitions of the topic according to the node update behavior and the partition number average, and determine a target node from the consumption nodes;
and distribute the target partition to the target node to balance the load values of the consumption nodes.
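The first and last processor steps of claim 9 (computing the partition number average and distributing target partitions to target nodes) can be illustrated as follows. This is a hedged Python sketch under stated assumptions: the names `partition_number_average` and `distribute` and the `assignment` mapping are hypothetical, and the round-robin distribution order is an assumption, since the claim does not specify how several target partitions are spread over several target nodes.

```python
from collections import Counter
from itertools import cycle


def partition_number_average(assignment, nodes):
    """Average number of partitions held per consumption node in one topic.

    assignment: dict mapping a partition id to the single node that holds it.
    nodes: all consumption nodes of the topic, including nodes holding nothing.
    """
    counts = Counter(assignment.values())
    return sum(counts[node] for node in nodes) / len(nodes)


def distribute(assignment, target_partitions, target_nodes):
    """Return a new assignment with the target partitions moved to target nodes.

    ASSUMPTION: partitions are handed out round-robin over the target nodes.
    """
    new_assignment = dict(assignment)
    for partition, node in zip(target_partitions, cycle(target_nodes)):
        new_assignment[partition] = node
    return new_assignment
```

With `assignment = {0: "a", 1: "a", 2: "a", 3: "b"}` and nodes `["a", "b", "c"]`, the average is 4/3, and moving partition 2 to the newly joined node `"c"` brings node `"a"` closer to that average.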
10. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program comprising program instructions which, when executed by a computer, cause the computer to perform the method of any one of claims 1-8.
CN202410031410.6A 2024-01-08 2024-01-08 Load balancing method and related device for message middleware Kafka Pending CN118034911A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410031410.6A CN118034911A (en) 2024-01-08 2024-01-08 Load balancing method and related device for message middleware Kafka

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410031410.6A CN118034911A (en) 2024-01-08 2024-01-08 Load balancing method and related device for message middleware Kafka

Publications (1)

Publication Number Publication Date
CN118034911A (en) 2024-05-14

Family

ID=90999583

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410031410.6A Pending CN118034911A (en) 2024-01-08 2024-01-08 Load balancing method and related device for message middleware Kafka

Country Status (1)

Country Link
CN (1) CN118034911A (en)

Similar Documents

Publication Publication Date Title
JP3678414B2 (en) Multiprocessor system
CN110515704B (en) Resource scheduling method and device based on Kubernetes system
US6560628B1 (en) Apparatus, method, and recording medium for scheduling execution using time slot data
CN102281329B (en) Resource scheduling method and system for platform as a service (Paas) cloud platform
US8516492B2 (en) Soft partitions and load balancing
CN109766172B (en) Asynchronous task scheduling method and device
CN113608871A (en) Service processing method and device
CN107515781B (en) Deterministic task scheduling and load balancing system based on multiple processors
CN106775975B (en) Process scheduling method and device
US20230037293A1 (en) Systems and methods of hybrid centralized distributive scheduling on shared physical hosts
CN105824705B (en) A kind of method for allocating tasks and electronic equipment
CN108694083B (en) Data processing method and device for server
CN116010064A (en) DAG job scheduling and cluster management method, system and device
CN110413393B (en) Cluster resource management method and device, computer cluster and readable storage medium
CN114968601A (en) Scheduling method and scheduling system for AI training jobs with resources reserved according to proportion
CN111209098A (en) Intelligent rendering scheduling method, server, management node and storage medium
CN114995974A (en) Task scheduling method and device, storage medium and computer equipment
CN117971491A (en) In-process resource control method, device, equipment and storage medium
CN113553171A (en) Load balancing control method, device and computer readable storage medium
CN111143063B (en) Task resource reservation method and device
CN117724811A (en) Hierarchical multi-core real-time scheduler
CN110175078B (en) Service processing method and device
CN118034911A (en) Load balancing method and related device for message middleware Kafka
CN110516121A (en) Method for reading data and device
CN116260876A (en) AI application scheduling method and device based on K8s and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination