CN106878415B - Load balancing method and device for data consumption - Google Patents

Load balancing method and device for data consumption

Info

Publication number
CN106878415B
Authority
CN
China
Prior art keywords
message
distributed
node device
queues
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710081800.4A
Other languages
Chinese (zh)
Other versions
CN106878415A (en
Inventor
刘恒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Advanced New Technologies Co Ltd
Advantageous New Technologies Co Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd filed Critical Alibaba Group Holding Ltd
Priority to CN201710081800.4A priority Critical patent/CN106878415B/en
Publication of CN106878415A publication Critical patent/CN106878415A/en
Application granted granted Critical
Publication of CN106878415B publication Critical patent/CN106878415B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • H04L67/1001Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/50Network services
    • H04L67/56Provisioning of proxy services
    • H04L67/566Grouping or aggregating service requests, e.g. for unified processing

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer And Data Communications (AREA)

Abstract

The application provides a load balancing method for data consumption, comprising the following steps: calculating a hash value for each to-be-allocated message topic subscribed to by the cluster, and taking the position to which the hash value maps in the list of to-be-allocated node devices as the allocation start position for that message topic; based on the average number of message queues per node device, derived from the total number of message queues subscribed to by the cluster, allocating message queues evenly to the node devices in the list, starting from the node device at the allocation start position, while combining the message queues allocated to each node device according to their data volumes; and, once all message queues subscribed to by the cluster have been allocated, looking up the target message queues allocated to the present device and pulling message data from them for message processing. The method and device can balance, across the node devices, both the number of allocated message queues and the corresponding data volume.

Description

Load balancing method and device for data consumption
Technical Field
The present application relates to the field of computer applications, and in particular, to a load balancing method and apparatus for data consumption.
Background
Pull-mode (Pull) data consumption means that a producer of message data stores the generated message data, via message middleware, across different message queues in a message system; when a consumer needs to consume the message data in a message queue, it actively sends a data acquisition request to that queue and pulls the stored message data to its local side for message processing.
In the related art, a message system generally divides generated message data into different message topics (Topic) based on the type of the message data, and each message topic may be further divided into a plurality of message queues (queue). A consumer of the message data may subscribe to one or more message topics in the message system and then actively pull data, in pull mode, from the message queues under the subscribed message topics for message processing.
In practical applications, if a consumer of message data is a distributed cluster, since the distributed cluster may include a plurality of node devices as consumers, when the distributed cluster subscribes to a plurality of message topics in a message system and each of the message topics includes a plurality of message queues, the message queues are generally required to be respectively allocated to each node device according to a uniform allocation policy.
In the pull mode, each node device independently allocates message queues to all node devices based on a uniform allocation policy. In the existing allocation mechanism, to prevent the allocations computed independently by the node devices from overlapping, every node device uses the same node device list and the same list of subscribed message topics, and allocates the message queues to the node devices in the order given by those lists. Although this avoids allocating the same message queue twice, the allocation mechanically follows the list order, so the message queues cannot be allocated flexibly.
Disclosure of Invention
The application provides a load balancing method for data consumption, applied to any node device in a distributed cluster interfacing with a message system, wherein the distributed cluster subscribes to message data of at least one message topic in the message system, and the message topic comprises a plurality of message queues; the node device acquires message data corresponding to the message topic from the message system in pull mode for message processing; the method comprises the following steps:
calculating a hash value for each to-be-allocated message topic subscribed to by the cluster, and determining the position to which the hash value maps in the list of to-be-allocated node devices as the allocation start position for that message topic;
based on the average number of message queues per to-be-allocated node device, obtained from the total number of message queues subscribed to by the cluster, allocating message queues evenly to the node devices in the list of to-be-allocated node devices, starting from the node device at the allocation start position, and combining the message queues allocated to each node device according to their data volumes based on a preset load balancing policy;
when all message queues subscribed to by the cluster have been allocated, looking up the target message queues allocated to the present device, and acquiring message data from the target message queues in pull mode for message processing.
The application also provides a load balancing device for data consumption, applied to any node device in a distributed cluster interfacing with a message system, wherein the distributed cluster subscribes to message data of at least one message topic in the message system, and the message topic comprises a plurality of message queues; the node device acquires message data corresponding to the message topic from the message system in pull mode for message processing; the device comprises:
a computing module, configured to calculate a hash value for each to-be-allocated message topic subscribed to by the cluster, and to determine the position to which the hash value maps in the list of to-be-allocated node devices as the allocation start position for that message topic;
an allocation module, configured to allocate message queues evenly to the node devices in the list of to-be-allocated node devices, starting from the node device at the allocation start position, based on the average number of message queues per to-be-allocated node device obtained from the total number of message queues subscribed to by the cluster, and to combine the message queues allocated to each node device according to their data volumes based on a preset load balancing policy;
and an acquisition module, configured to look up the target message queues allocated to the present device when all message queues subscribed to by the cluster have been allocated, and to acquire message data from the target message queues in pull mode for message processing.
In the application, when each node device in the distributed cluster allocates the message queues under the to-be-allocated message topics in the cluster's subscribed message topic list to the node devices in the cluster, it determines, from the hash value of each to-be-allocated message topic, the allocation start position for that topic in the list of to-be-allocated node devices, and then allocates message queues evenly to the node devices, starting from the node device at the allocation start position, according to the average number of message queues per node device derived from the total number of message queues subscribed to by the cluster. During allocation, the message queues allocated to each node device can also be combined according to their data volumes based on a preset load balancing policy. After the message queues under all subscribed message topics have been allocated, each node device acquires message data in pull mode from the target message queues allocated to it for message processing. In this way, the message queues allocated to each node device are balanced, as far as possible, in both number and data volume, so that when the node devices pull and process data from their allocated message queues, the message processing load of the node devices in the cluster is approximately balanced and the overall processing efficiency of the distributed cluster can be optimized.
Drawings
FIG. 1 is a flow chart illustrating a method for load balancing of data consumption according to an embodiment of the present application;
FIG. 2 is a logic block diagram of a load balancing apparatus based on pull mode data consumption according to an embodiment of the present application;
FIG. 3 is a hardware structure diagram of a node device that carries the load balancing apparatus based on pull mode data consumption according to an embodiment of the present application.
Detailed Description
In practical applications, when a distributed cluster subscribes to one or more message topics in a message system and the subscribed message topics include a plurality of message queues, in a pull mode, each node device is usually configured to independently allocate a message queue to each node device based on a uniform allocation policy.
In the existing message distribution mechanism, in order to avoid repetition of message queues autonomously distributed by each node device, each node device may load a unified list of node devices to be distributed and a message topic list subscribed by a cluster in a memory, and then sequentially distribute the message queues under the message topics to be distributed to each node device according to the sequence of the message topics to be distributed in the message topic list and the sequence of the node devices to be distributed in the list of node devices to be distributed.
For example, assume that the message system contains 4 message topics, topic1 to topic4: topic1 has 3 queues, tp1.q1, tp1.q2 and tp1.q3; topic2 has 2 queues, tp2.q1 and tp2.q2; topic3 has 1 queue, tp3.q1; and topic4 has 2 queues, tp4.q1 and tp4.q2, for a total of 8 queues. The distributed cluster that subscribes to all 4 topics contains a total of 4 node devices, client1 to client4, acting as consumers.
Based on the existing message queue allocation mechanism, the final allocation result can be as shown in table 1 below:
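Node device    Allocated message queues
client1        tp1.q1, tp2.q1, tp3.q1, tp4.q1
client2        tp1.q2, tp2.q2, tp4.q2
client3        tp1.q3
client4        (none)
TABLE 1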
As shown in table 1, because the existing allocation mechanism mechanically follows the order in the lists, tp1.q1, tp2.q1, tp3.q1 and tp4.q1, the first queue under each topic in the message topic list, are all allocated to client1, the first node device in the list of to-be-allocated node devices; tp1.q2, tp2.q2 and tp4.q2, the second queue under each topic, are all allocated to client2, the second node device in the list; and so on.
As can be seen from Table 1, client1 is ultimately assigned 4 message queues, client2 is assigned 3, client3 is assigned 1 (tp1.q3), and client4 is assigned none.
Therefore, under the existing allocation mode, node devices near the front of the list of to-be-allocated node devices are preferentially allocated message queues, while node devices near the end of the list may be allocated none. In addition, the number of message queues ultimately allocated to each node device in the same cluster is unbalanced, so the message processing resources in the cluster cannot be fully utilized.
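The imbalance of the existing mechanism can be reproduced with a short sketch. It is only an illustration in Python of the list-order allocation described above; the function name `allocate_in_list_order` and the dictionary layout are assumptions made for this example, not part of the application.

```python
from collections import defaultdict

# Topics and queues from the example above; clients appear in list order.
TOPICS = {
    "topic1": ["tp1.q1", "tp1.q2", "tp1.q3"],
    "topic2": ["tp2.q1", "tp2.q2"],
    "topic3": ["tp3.q1"],
    "topic4": ["tp4.q1", "tp4.q2"],
}
CLIENTS = ["client1", "client2", "client3", "client4"]


def allocate_in_list_order(topics, clients):
    """Assign each topic's queues to clients, always starting at the first client."""
    assignment = defaultdict(list)
    for queues in topics.values():
        for index, queue in enumerate(queues):
            assignment[clients[index % len(clients)]].append(queue)
    # Include clients that received nothing so the imbalance is visible.
    return {client: assignment[client] for client in clients}


result = allocate_in_list_order(TOPICS, CLIENTS)
counts = {client: len(queues) for client, queues in result.items()}
print(counts)
```

Because every topic restarts at the first client, the queues pile up on the clients at the front of the list and the last client receives none.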
In view of this, the present application provides a load balancing method for message consumption by the node devices of a distributed cluster in pull mode. When each node device in the distributed cluster allocates the message queues under the to-be-allocated message topics in the cluster's subscribed message topic list to the node devices in the cluster, it determines, from the hash value of each to-be-allocated message topic, the allocation start position for that topic in the list of to-be-allocated node devices, and then allocates message queues evenly to the node devices, starting from the node device at the allocation start position, according to the average number of message queues per node device derived from the total number of message queues subscribed to by the cluster. During allocation, the message queues allocated to each node device can also be combined according to their data volumes based on a preset load balancing policy. After the message queues under all subscribed message topics have been allocated, each node device acquires message data in pull mode from the target message queues allocated to it for message processing. In this way, the message queues allocated to each node device are balanced, as far as possible, in both number and data volume, so that when the node devices pull and process data from their allocated message queues, the message processing load of the node devices in the cluster is approximately balanced and the overall processing efficiency of the distributed cluster can be optimized.
The present application is described below with reference to specific embodiments and specific application scenarios.
Referring to FIG. 1, FIG. 1 is a flow chart of a load balancing method based on pull mode data consumption according to an embodiment of the present application. The method is applied to any node device in a distributed cluster interfacing with a message system, where the distributed cluster subscribes to message data of at least one message topic in the message system, and the message topic includes a plurality of message queues; the node device acquires message data corresponding to the message topic from the message system in pull mode for message processing. The method performs the following steps:
Step 101, calculating a hash value for each to-be-allocated message topic subscribed to by the cluster, and determining the position to which the hash value maps in the list of to-be-allocated node devices as the allocation start position for that message topic;
The message system can be a distributed data center built on a server or a server cluster. The distributed cluster refers to a distributed device cluster that is connected with the message system and is composed of a plurality of node devices.
In practical applications, the distributed cluster may subscribe to one or more message topics in the message system. Each node device in the distributed cluster can autonomously complete the distribution of the message queue based on the completely same list of node devices to be distributed loaded in the respective memory and the message topic list subscribed by the cluster.
The list of node devices to be allocated may include node devices in the cluster that can currently perform message processing on message data under a message topic subscribed by the cluster. The message topic list may include message topics subscribed by the cluster and message queues under the message topics.
In this example, after starting the distribution of the message queues under the message topics in the message topic list, each node device may sequentially select each message topic as the message topic to be distributed according to the arrangement order of each message topic in the message topic list, and then sequentially distribute the message queues under each message topic to the node device according to the arrangement order of each message topic in the message topic list.
Based on the existing message queue allocation mechanism, when allocating a message queue under a message topic to be allocated to each node device, the message queue is allocated to each node device in sequence according to the sequence in the node device list to be allocated, usually starting from the first node device in the node device list to be allocated.
Although this allocation mode avoids allocating the same message queue twice, the number of message queues under each message topic may differ, so allocating the message queues to the node devices in strict list order can leave the node devices with unequal numbers of message queues, and node devices near the end of the list of to-be-allocated node devices may not be allocated any message queue.
In this example, to avoid the above problem, when each node device allocates the message queues under a to-be-allocated message topic to the node devices, the allocation no longer has to start from the node device in the first position of the list of to-be-allocated node devices.
Specifically, after each node device selects a message topic to be distributed, a hash value (hash value) of the message topic to be distributed may be calculated first.
When calculating the hash value of the to-be-allocated message topic, information that distinguishes the message topics from one another can be used, so that the hash values calculated for different message topics differ; for example, the hash value may be calculated from the name of each message topic.
After the hash value of the to-be-allocated message topic is calculated, it is mapped onto the list of to-be-allocated node devices, and the position to which it maps in the list is determined as the allocation start position for that message topic.
The method for mapping the calculated hash value onto the list of to-be-allocated node devices is not particularly limited in the present application.
In an illustrated embodiment, when mapping the calculated hash value onto the list of to-be-allocated node devices, a remainder (mod) operation may be performed on the calculated hash value and the total number of node devices in the list; the position in the list corresponding to the result of the remainder operation is then looked up, and that position is taken as the mapping position, that is, the allocation start position.
For example, assuming that the node device list contains 10 node devices acting as consumers and the calculated hash value is 1234567, the mod operation on 1234567 and 10 yields 7, so 7 is taken as the allocation start position and message queues are allocated to the node devices starting from the 7th node device in the list.
In this way, because different message topics yield different hash values, the allocation start positions in the node device list generally differ when each node device selects different message topics as the to-be-allocated message topic, so node devices near the end of the list also have a chance of being allocated message queues.
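A minimal sketch of the start-position calculation follows. The application does not name a hash function; CRC32 from Python's standard `zlib` module is used here purely as an assumption, chosen because it returns the same value in every process, whereas Python's built-in `hash()` for strings is randomized per process by default. The function name `start_position` and the 0-based indexing are also assumptions of this sketch.

```python
import zlib


def start_position(topic_name: str, node_count: int) -> int:
    """Map a topic name to a 0-based index in the node device list."""
    # CRC32 is an illustrative choice; it is stable across processes, so every
    # node device computes the same value for the same topic name.
    hash_value = zlib.crc32(topic_name.encode("utf-8"))
    return hash_value % node_count


# The arithmetic from the example above: 1234567 mod 10.
print(1234567 % 10)  # 7

nodes = ["client1", "client2", "client3", "client4"]
for topic in ["topic1", "topic2", "topic3", "topic4"]:
    print(topic, "->", nodes[start_position(topic, len(nodes))])
```

Two distinct hash values can still leave the same remainder, so two topics may occasionally share a start position; the quota-based allocation in step 102 is what keeps the final counts even in that case.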
Step 102, based on the average number of message queues per to-be-allocated node device, obtained from the total number of message queues subscribed to by the cluster, allocating message queues evenly to the node devices in the list of to-be-allocated node devices, starting from the node device at the allocation start position, and combining the message queues allocated to each node device according to their data volumes based on a preset load balancing policy.
Determining an allocation start position for each to-be-allocated message topic from its hash value ensures, to some extent, that node devices near the end of the node device list also have a chance of being allocated message queues. However, this alone does not take into account the total number of message queues currently subscribed to by the cluster, so the number of message queues finally allocated to each node device is still random.
In this example, each node device may locally store a list of the message queues under each message topic subscribed to by the cluster, so that each node device acting as a consumer holds a global list of all message queues subscribed to by the cluster.
For example, taking the example shown in table 1, each node device may locally maintain a global message queue list composed of tp1.q1, tp1.q2, tp1.q3, tp2.q1, tp2.q2, tp3.q1, tp4.q1 and tp4.q2.
In this case, after each node device determines, by calculating the hash value of the to-be-allocated message topic, the allocation start position for that topic in the list of to-be-allocated node devices, it may count the total number of message queues subscribed to by the cluster based on the global list, calculate the average number of message queues per node device in the list of to-be-allocated node devices from that total, and then, starting from the node device at the determined allocation start position, allocate message queues evenly to the node devices in the list according to the calculated average number.
For example, still taking the example shown in table 1, the cluster subscribes to a total of 8 message queues, which are evenly distributed over 4 consumers, so each consumer is allocated 2 message queues.
In this way, message queues are no longer allocated strictly according to the order of the node devices in the node device list, and the total number of message queues subscribed to by the cluster is taken into account during allocation, so that the message queues are allocated to the node devices as evenly as possible.
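One way to read the even allocation is as a per-node quota: each node device accepts queues until it holds the average number, and the allocation then moves on to the next node device. The sketch below follows that reading. The quota rule (rounding the average up), the function name `allocate_evenly` and the use of CRC32 as the hash are all assumptions for illustration; the application does not prescribe them.

```python
import math
import zlib


def allocate_evenly(topics, nodes):
    """Give each node at most ceil(total_queues / node_count) queues."""
    total = sum(len(queues) for queues in topics.values())
    quota = math.ceil(total / len(nodes))  # average number per node device
    assignment = {node: [] for node in nodes}
    for topic, queues in topics.items():
        # Start position from the topic's hash (CRC32 is an assumed choice).
        position = zlib.crc32(topic.encode("utf-8")) % len(nodes)
        for queue in queues:
            # Skip nodes that already hold their share; one always has room
            # because quota * node_count >= total.
            while len(assignment[nodes[position]]) >= quota:
                position = (position + 1) % len(nodes)
            assignment[nodes[position]].append(queue)
            position = (position + 1) % len(nodes)
    return assignment


topics = {
    "topic1": ["tp1.q1", "tp1.q2", "tp1.q3"],
    "topic2": ["tp2.q1", "tp2.q2"],
    "topic3": ["tp3.q1"],
    "topic4": ["tp4.q1", "tp4.q2"],
}
nodes = ["client1", "client2", "client3", "client4"]
result = allocate_evenly(topics, nodes)
print({node: len(queues) for node, queues in result.items()})
```

With 8 queues and 4 node devices the quota is 2, so every client ends up with exactly 2 queues whichever start positions the hash produces; which queues land on which client depends on the hash values.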
In addition, in practical applications, the data volumes of the message queues under the to-be-allocated message topics may differ. To balance, as far as possible, both the number of message queues allocated to each node device and the total data volume of those queues, each node device may, while allocating message queues evenly based on the calculated average number, further combine the message queues allocated to each node device according to their data volumes based on a preset load balancing policy.
In an illustrated embodiment, when the node devices evenly distribute the message queues to the node devices according to the calculated average number, the node devices may further count the data size of each message queue under the message topic to be distributed, sort the message queues according to the data size, and generate a sequence based on the sorted order.
When allocating message queues to each node device, message queues can be selected from the head end and the tail end of the sorted sequence respectively, so that the message queues allocated to each node device combine larger and smaller data volumes.
For example, assume that a cluster subscribes to two message topics, topic1 and topic2; there are two message queues, tp1.q1 and tp1.q2, under topic1, and two message queues, tp2.q1 and tp2.q2, under topic2; and 2 node devices in the cluster, client1 and client2, act as consumers. Assuming that tp1.q1 has the largest data volume and tp2.q2 the smallest, sorting the message queues by data volume yields the sequence tp1.q1 > tp1.q2 > tp2.q1 > tp2.q2. Because the average number of message queues per node device is 2, one message queue is selected from each of the head end and the tail end of the sequence when allocating to client1, so tp1.q1 and tp2.q2 are allocated to client1, and tp1.q2 and tp2.q1 are allocated to client2.
By the method, the quantity of the message queues distributed to each node device and the total data quantity of the message queues distributed to each node device can be ensured to be balanced to the greatest extent, so that the load of each node device can be balanced.
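The head-and-tail selection can be sketched as follows, assuming, as in the example, an average of two queues per node device. The function name `pair_head_and_tail` and the concrete data volumes are made up for illustration; only their ordering is taken from the example.

```python
def pair_head_and_tail(queue_sizes, nodes):
    """Pair the largest remaining queue with the smallest remaining one."""
    # Sort queue names by data volume, largest first.
    ordered = sorted(queue_sizes, key=queue_sizes.get, reverse=True)
    assignment = {node: [] for node in nodes}
    head, tail = 0, len(ordered) - 1
    node_index = 0
    while head <= tail:
        node = nodes[node_index % len(nodes)]
        assignment[node].append(ordered[head])
        if head != tail:  # avoid adding the middle queue twice
            assignment[node].append(ordered[tail])
        head += 1
        tail -= 1
        node_index += 1
    return assignment


# Data volumes are made-up numbers that respect the ordering in the example.
sizes = {"tp1.q1": 400, "tp1.q2": 300, "tp2.q1": 200, "tp2.q2": 100}
result = pair_head_and_tail(sizes, ["client1", "client2"])
print(result)
```

With these volumes client1 receives tp1.q1 and tp2.q2 and client2 receives tp1.q2 and tp2.q1, so each client holds two queues with a combined volume of 500.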
Allocating the message queues evenly from the determined allocation start position and combining them by data volume based on a preset load balancing policy brings the number of message queues allocated to each node device, and the corresponding total data volume, closer to balance. However, the types of message data in the message queues allocated to different node devices may differ, and the processing overhead incurred when a node device processes the message data in a queue may differ accordingly; therefore, when the node devices pull data from their allocated message queues for message processing, their loads may still be unbalanced.
In this case, after the message queues have been allocated to the node devices as described above, each node device may further adjust the allocation result based on the load weight value of each message queue.
In an embodiment shown, each node device may further calculate a corresponding load weight value for each message queue based on a processing overhead value of message data in each message queue and a data size of each message queue.
The processing overhead value may be a system overhead parameter when each node device processes or calculates message data in each message queue; for example, the overhead value may specifically be a time duration that the node device needs to process or calculate message data in each message queue, or other types of overhead parameters.
The load weight value may be a weight value that is calculated according to a certain weighting algorithm and in combination with a processing overhead value and a total data amount of message data in each message queue, and that can represent a load size of each message queue. However, the weighting algorithm is not particularly limited in this example, and those skilled in the art can flexibly select the weighting algorithm with reference to the description in the related art when implementing the technical solution described in the present application.
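Since the weighting algorithm is left open, the following is only one possible choice, shown as an assumption: the load weight is taken as the per-message processing time multiplied by the number of messages in the queue. The function name `load_weight` and all figures are hypothetical.

```python
def load_weight(overhead_ms_per_message: float, message_count: int) -> float:
    """Estimate a queue's load as per-message overhead times message count."""
    return overhead_ms_per_message * message_count


# Made-up figures: milliseconds of processing per message, messages waiting.
queues = {
    "tp1.q1": (2, 1000),
    "tp2.q1": (15, 200),
}
weights = {name: load_weight(cost, count) for name, (cost, count) in queues.items()}
print(weights)
```

In this made-up case the queue with fewer messages carries the larger weight, which is the situation the adjustment by load weight is meant to catch.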
After each node device has calculated a load weight value for each message queue, it may adjust the message queues allocated to each node device based on the calculated load weight values, so as to balance the load of the message queues allocated to the node devices.
When making this adjustment, the load weight values of the message queues allocated to each node device can be summed, the sums for the node devices compared, and the message queues allocated to the node devices then recombined based on the comparison, so that after adjustment the sums of the load weight values of the message queues allocated to the node devices are basically balanced.
for example, assume that a cluster focuses on two message topics, topic1 and topic2, there are two message queues, tp1.q1 and tp1.q2, under topic1, and two message queues, tp2.q1 and tp2.q2, under topic 2; the node devices in the cluster as consumers are 2, client1 and client 2. The message queues ultimately assigned to client1 are tp1.q1 and tp2. q2; the message queues assigned to client1 are tp1.q2 and tp2.q 1.
Assuming that the finally calculated load weight values of tp1.q1 and tp2.q2 are both 3, and the load weight values of tp1.q2 and tp2.q1 are both 2, according to the above allocation result, the load of client1 is 6, and the load of client1 is 4. It can be seen that, according to the above allocation results, although the number of message queues allocated to the client1 and the client2 and the data amount are approximately balanced, the actual loads of the client1 and the client2 are completely different.
Therefore, in this case, the above allocation result may be recombined based on the actual load weight values allocated to client1 and client2: the message queues allocated to client1 are adjusted to tp1.q1 and tp2.q1, and the message queues allocated to client2 are adjusted to tp1.q2 and tp2.q2. The loads of client1 and client2 after the adjustment are both 5, so that a load balanced state is achieved.
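The sum, compare, and recombine procedure in this example can be sketched as follows. The swap-based search and the function names are assumptions made for illustration; the application does not prescribe how the recombination is found.

```python
from itertools import product


def node_loads(assignment, weights):
    """Sum the load weight values of the queues assigned to each node."""
    return {node: sum(weights[q] for q in qs) for node, qs in assignment.items()}


def rebalance_by_swaps(assignment, weights):
    """Swap queues between the heaviest and lightest node while that narrows the gap."""
    result = {node: list(qs) for node, qs in assignment.items()}
    while True:
        loads = node_loads(result, weights)
        heavy = max(loads, key=loads.get)
        light = min(loads, key=loads.get)
        gap = loads[heavy] - loads[light]
        best = None
        for a, b in product(result[heavy], result[light]):
            # Swapping a and b moves (weights[a] - weights[b]) from heavy to light.
            new_gap = abs(gap - 2 * (weights[a] - weights[b]))
            if new_gap < gap and (best is None or new_gap < best[0]):
                best = (new_gap, a, b)
        if best is None:
            return result  # no swap improves the balance any further
        _, a, b = best
        result[heavy][result[heavy].index(a)] = b
        result[light][result[light].index(b)] = a


weights = {"tp1.q1": 3, "tp2.q2": 3, "tp1.q2": 2, "tp2.q1": 2}
before = {"client1": ["tp1.q1", "tp2.q2"], "client2": ["tp1.q2", "tp2.q1"]}
after = rebalance_by_swaps(before, weights)
```

Because swaps exchange one queue for another, the number of queues per node is preserved. The sketch may settle on a different pairing than the one named in the text, but any pairing it returns for this input gives both clients a load of 5.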
Of course, in practical applications, instead of each node device calculating a load weight value for each message queue based on the processing overhead value of the message data in the queue and the data amount of the queue, the load weight value of each message queue may also be manually configured by an administrator.
In this case, each node device may obtain a load weight value manually configured by the administrator for each message queue, and then adjust the data amount of the message queue allocated to each node device based on the load weight value manually configured by the administrator for each message queue, so as to balance the load of the message queue allocated to each node device, and a specific implementation process is not described again.
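A minimal sketch of how a node device might choose between the two sources of load weight values is shown below. The function name and the fallback to a computed weight when no manual value exists are assumptions for illustration, not something the application specifies.

```python
def resolve_weights(computed, configured=None):
    """Prefer an administrator-configured weight; otherwise keep the computed one."""
    configured = configured or {}
    return {queue: configured.get(queue, weight) for queue, weight in computed.items()}


computed = {"tp1.q1": 3, "tp1.q2": 2}
manual = {"tp1.q2": 4}  # hypothetical administrator override
effective = resolve_weights(computed, manual)
```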
Therefore, through the method, each node device averagely distributes the message queues to each node device from the determined distribution starting position, the data quantity of the message queues distributed to each node device is combined and matched based on the preset load balancing strategy in the distribution process, and after the message queues are distributed to each node device, each node device can further adjust the distribution result based on the load weight value of each message queue.
Step 103: when all the message queues subscribed to by the cluster have been allocated, search for the target message queue allocated to the present device, and acquire message data from the target message queue based on a pull mode to perform message processing.
In this example, after the message queues subscribed to by the cluster have been completely allocated in the manner described above, each node device may search for the target message queue allocated to itself, pull the message data from that target message queue of the message system based on the pull mode, and then process the messages locally. The specific data acquisition method in the pull mode is not described in detail in the present application, and those skilled in the art can refer to the description in the related art when implementing the technical solution of the present application.
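The lookup and pull steps can be pictured with the sketch below. `InMemoryMessageSystem` is a hypothetical stand-in for the message system, and `consume` and its batch size are likewise assumptions; a real deployment would use the message system's own client.

```python
from collections import deque


class InMemoryMessageSystem:
    """Stand-in for the message system; a real broker client would replace this."""

    def __init__(self, queues):
        self._queues = {name: deque(msgs) for name, msgs in queues.items()}

    def pull(self, queue_name, max_messages):
        """Return up to max_messages from the queue, or an empty list if drained."""
        queue = self._queues[queue_name]
        batch = []
        while queue and len(batch) < max_messages:
            batch.append(queue.popleft())
        return batch


def consume(node, assignment, system, handle, batch_size=2):
    """Look up the queues assigned to this node and pull until they are drained."""
    processed = 0
    for queue_name in assignment.get(node, []):
        while True:
            batch = system.pull(queue_name, batch_size)
            if not batch:
                break
            for message in batch:
                handle(queue_name, message)  # local message processing
                processed += 1
    return processed


system = InMemoryMessageSystem(
    {"tp1.q1": ["a", "b", "c"], "tp2.q1": ["d"], "tp1.q2": ["e"]}
)
assignment = {"client1": ["tp1.q1", "tp2.q1"], "client2": ["tp1.q2"]}
seen = []
count = consume("client1", assignment, system, lambda q, m: seen.append((q, m)))
```

The node only touches its own target queues; the queue assigned to client2 is left untouched for that node to pull.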
When distributing the message queues for each node device, the balance of the quantity, data volume and load of the distributed message queues is comprehensively considered, so when each node device locally processes the message data in the message queue distributed to itself, the load of each node device in the cluster is approximately equivalent, and the node devices are in a load balanced state.
Corresponding to the method embodiment, the application also provides an embodiment of the device.
Referring to fig. 2, the present application provides a load balancing apparatus 20 for data consumption, which is applied to any node device in a distributed cluster interfacing with a message system; referring to fig. 3, the hardware architecture related to the node device carrying the load balancing apparatus 20 based on pull mode data consumption generally includes a CPU, a memory, a nonvolatile memory, a network interface, an internal bus, and the like; taking a software implementation as an example, the load balancing apparatus 20 based on pull mode data consumption may be generally understood as a computer program loaded in a memory, and a logic apparatus formed by combining software and hardware after being executed by a CPU, where the apparatus 20 includes:
the calculation module 201 is configured to calculate a hash value corresponding to each message topic to be distributed subscribed by the cluster, and determine a mapping position of the hash value in the node device list to be distributed as an initial distribution position corresponding to each message topic to be distributed;
the distribution module 202, which is configured to evenly distribute the message queues to the node devices in the node device list to be distributed, starting from the node device corresponding to the distribution start position and based on the average number of message queues per node device, the average being obtained from the total number of message queues subscribed to by the cluster and the node devices in the node device list to be distributed, and to combine and match the data amounts of the message queues distributed to the node devices based on a preset load balancing policy;
the obtaining module 203 searches a target message queue allocated to the device when the distribution of each message queue subscribed by the cluster is completed, and obtains message data from the target message queue based on a pull mode to perform message processing.
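One way to picture the even distribution performed by the distribution module is a round-robin walk over the node device list that begins at the distribution start position. This is a sketch under that assumption; the function name and the round-robin rule are illustrative and are not taken from the application.

```python
def distribute_round_robin(queues, nodes, start):
    """Assign queues one by one, walking the node list from index `start`."""
    if not nodes:
        raise ValueError("node device list must not be empty")
    assignment = {node: [] for node in nodes}
    for i, queue in enumerate(queues):
        node = nodes[(start + i) % len(nodes)]
        assignment[node].append(queue)
    return assignment


nodes = ["client1", "client2", "client3"]
queues = ["tp1.q1", "tp1.q2", "tp1.q3", "tp1.q4", "tp1.q5"]
assignment = distribute_round_robin(queues, nodes, start=1)
```

With five queues and three nodes the average is not an integer, so in this sketch the nodes nearest the start position receive two queues and the remaining node receives one.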
In this example, the calculation module 201 is specifically configured for:
performing a remainder operation on the hash value and the total number of node devices in the node device list;
searching the node device list for the position corresponding to the result of the remainder operation;
and determining the found position as the distribution start position corresponding to each message topic to be distributed.
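The three steps above can be sketched as follows. The choice of CRC32 as the hash function and the function name are assumptions; the only requirement illustrated is that every node device computes the same value for the same topic and the same node device list.

```python
import zlib


def start_position(topic, nodes):
    """Map a topic name to an index in the node device list via hash modulo."""
    if not nodes:
        raise ValueError("node device list must not be empty")
    # zlib.crc32 gives the same value in every process, unlike the built-in
    # hash() for strings, which is randomized per interpreter run by default.
    hash_value = zlib.crc32(topic.encode("utf-8"))
    return hash_value % len(nodes)


nodes = ["client1", "client2", "client3"]
position = start_position("topic1", nodes)
start_node = nodes[position]
```

Because the start position depends on the topic name, different topics tend to begin their distribution at different node devices instead of always starting from the first entry in the list.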
In this example, the distribution module 202 is specifically configured for:
counting the data size of each message queue of the message topic to be distributed;
sorting the message queues of the message topic to be distributed by data size, and generating a sequence in the sorted order;
when distributing the message queues to the node devices, selecting message queues for each node device from the head end and the tail end of the sequence respectively, so as to combine and match the data amounts of the message queues distributed to each node device.
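The head-and-tail selection can be sketched as below: queues are sorted by data size, and each node device in turn receives the largest and the smallest queue still remaining. The function name, the pairing of exactly one head and one tail element per turn, and the sample sizes are assumptions for illustration.

```python
def pair_head_and_tail(queue_sizes, nodes, start=0):
    """Give each node in turn the largest and smallest remaining queue."""
    ordered = sorted(queue_sizes, key=queue_sizes.get)  # ascending by data size
    assignment = {node: [] for node in nodes}
    lo, hi = 0, len(ordered) - 1
    turn = 0
    while lo <= hi:
        node = nodes[(start + turn) % len(nodes)]
        assignment[node].append(ordered[hi])      # tail end: largest remaining
        if lo < hi:
            assignment[node].append(ordered[lo])  # head end: smallest remaining
        lo += 1
        hi -= 1
        turn += 1
    return assignment


sizes = {"tp1.q1": 10, "tp1.q2": 40, "tp1.q3": 20, "tp1.q4": 30}
nodes = ["client1", "client2"]
assignment = pair_head_and_tail(sizes, nodes)
totals = {n: sum(sizes[q] for q in qs) for n, qs in assignment.items()}
```

Pairing a large queue with a small one keeps the per-node data amounts close to each other even though every node holds the same number of queues.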
In this example, the distribution module 202 is further configured for:
respectively calculating corresponding load weight values for each message queue based on the processing overhead value of the message data in each message queue and the data volume of each message queue;
and adjusting the number of the message queues distributed to each node device based on the calculated load weight value so as to balance the load corresponding to the message queues distributed to each node device.
In this example, the distribution module 202 is further configured for:
acquiring a load weight value pre-configured for each message queue;
and adjusting the number of the message queues distributed to each node device based on the load weight values preconfigured for each message queue, so as to balance the load corresponding to the message queues distributed to each node device.
For the device embodiments, since they substantially correspond to the method embodiments, reference may be made to the partial description of the method embodiments for relevant points. The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the scheme of the application. One of ordinary skill in the art can understand and implement it without inventive effort.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. A typical implementation device is a computer, which may take the form of a personal computer, laptop computer, cellular telephone, camera phone, smart phone, personal digital assistant, media player, navigation device, email messaging device, game console, tablet computer, wearable device, or a combination of any of these devices.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It will be understood that the present application is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the application is limited only by the appended claims.
The above description is only exemplary of the present application and should not be taken as limiting the present application, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the scope of protection of the present application.

Claims (8)

1. A load balancing method for message processing, characterized in that the method is applied to any node device in a distributed cluster interfacing with a message system, the distributed cluster subscribes to message data of at least one message topic in the message system, and the message topic comprises a plurality of message queues; the node device acquires message data corresponding to the message topic from the message system based on a pull mode to process the message, and the method comprises the following steps:
calculating hash values corresponding to all message topics to be distributed subscribed by the cluster, and performing remainder operation on the hash values and the total number of node equipment in the node equipment list to be distributed to obtain operation results;
searching a mapping position corresponding to the operation result in the node device list to be distributed, and determining the mapping position as an initial distribution position corresponding to each message topic to be distributed;
based on the total number of message queues subscribed by the cluster, corresponding to the average number of each node device to be distributed in the node device list to be distributed, starting from the node device corresponding to the distribution starting position, averagely distributing the message queues for each node device in the node device list to be distributed, and combining and collocating the data volume of the message queues distributed to each node device based on a preset load balancing strategy;
when all the message queues subscribed by the cluster are completely distributed, searching a target message queue distributed to the device, and acquiring message data from the target message queue based on a pull mode to perform message processing.
2. The method according to claim 1, wherein the combining and matching the data amount of the message queue allocated to each node device based on the preset load balancing policy comprises:
counting the data size of each message queue of the message topic to be distributed;
sorting the message queues of the message topic to be distributed according to the data size, and generating a sequence based on the sorted order;
when distributing the message queues for each node device, the message queues are respectively selected for each node device from the head end and the tail end of the sequence so as to combine and match the data quantity of the message queues distributed to each node device.
3. The method of claim 2, further comprising:
respectively calculating corresponding load weight values for each message queue based on the processing overhead value of the message data in each message queue and the data volume of each message queue;
and adjusting the number of the message queues distributed to each node device based on the calculated load weight value so as to balance the load corresponding to the message queues distributed to each node device.
4. The method of claim 3, further comprising:
acquiring a load weight value pre-configured for each message queue;
and adjusting the number of the message queues distributed to each node device based on the load weight values preconfigured for each message queue, so as to balance the load corresponding to the message queues distributed to each node device.
5. A load balancing device for message processing, applied to any node device in a distributed cluster interfacing with a message system, wherein the distributed cluster subscribes to message data of at least one message topic in the message system, and the message topic comprises a plurality of message queues; the node device acquires message data corresponding to the message topic from the message system based on a pull mode to process the message, and the device comprises:
the computing module is used for computing hash values corresponding to all message topics to be distributed subscribed by the cluster, and performing remainder operation on the hash values and the total number of the node equipment in the node equipment list to be distributed to obtain an operation result;
searching a mapping position corresponding to the operation result in the node device list to be distributed, and determining the mapping position as an initial distribution position corresponding to each message topic to be distributed;
the distribution module is used for averagely distributing the message queues for the node devices in the node device list to be distributed from the node devices corresponding to the distribution starting positions on the basis of the total number of the message queues subscribed by the cluster, and combining and matching the data quantity of the message queues distributed to the node devices on the basis of a preset load balancing strategy, wherein the total number of the message queues corresponds to the average number of the node devices to be distributed in the node device list to be distributed;
and the acquisition module is used for searching a target message queue allocated to the equipment when all the message queues subscribed by the cluster are completely allocated, and acquiring message data from the target message queue based on a pull mode to perform message processing.
6. The apparatus of claim 5, wherein the distribution module:
counting the data size of each message queue of the message topic to be distributed;
sorting the message queues of the message topic to be distributed according to the data size, and generating a sequence based on the sorted order;
when distributing the message queues for each node device, the message queues are respectively selected for each node device from the head end and the tail end of the sequence so as to combine and match the data quantity of the message queues distributed to each node device.
7. The apparatus of claim 6, wherein the distribution module is further to:
respectively calculating corresponding load weight values for each message queue based on the processing overhead value of the message data in each message queue and the data volume of each message queue;
and adjusting the number of the message queues distributed to each node device based on the calculated load weight value so as to balance the load corresponding to the message queues distributed to each node device.
8. The apparatus of claim 6, wherein the distribution module is further to:
acquiring a load weight value pre-configured for each message queue;
and adjusting the number of the message queues distributed to each node device based on the load weight values preconfigured for each message queue, so as to balance the load corresponding to the message queues distributed to each node device.
CN201710081800.4A 2017-02-15 2017-02-15 Load balancing method and device for data consumption Active CN106878415B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710081800.4A CN106878415B (en) 2017-02-15 2017-02-15 Load balancing method and device for data consumption

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710081800.4A CN106878415B (en) 2017-02-15 2017-02-15 Load balancing method and device for data consumption

Publications (2)

Publication Number Publication Date
CN106878415A CN106878415A (en) 2017-06-20
CN106878415B true CN106878415B (en) 2020-09-01

Family

ID=59166020

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710081800.4A Active CN106878415B (en) 2017-02-15 2017-02-15 Load balancing method and device for data consumption

Country Status (1)

Country Link
CN (1) CN106878415B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109587692B (en) * 2017-09-29 2022-08-30 中国石油天然气股份有限公司 Oil well condition data acquisition method and device
CN108134814B (en) * 2017-11-27 2020-12-22 海尔优家智能科技(北京)有限公司 Service data processing method and device
CN110875935B (en) * 2018-08-30 2023-03-24 阿里巴巴集团控股有限公司 Message publishing, processing and subscribing method, device and system
CN112307037B (en) * 2019-07-26 2023-09-22 北京京东振世信息技术有限公司 Data synchronization method and device
CN110717132A (en) * 2019-09-05 2020-01-21 深圳平安通信科技有限公司 Data collection method and pushing method for full-link monitoring system and related equipment
CN110708312A (en) * 2019-09-30 2020-01-17 交控科技股份有限公司 Method and system for message transmission in ATS and ATS
CN111625366A (en) * 2020-06-02 2020-09-04 深圳市网是科技有限公司 Elastic expansion service method based on release and subscription model
CN112272217B (en) * 2020-10-16 2022-05-24 苏州浪潮智能科技有限公司 Kafka cluster load balancing method, system, equipment and medium
CN112333083B (en) * 2020-10-30 2023-04-28 平安付科技服务有限公司 Transaction information processing method, device, computer equipment and computer readable medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013094837A1 (en) * 2011-12-19 2013-06-27 주식회사 솔박스 Method for managing server load distribution by using hash function results, and apparatus for same
CN103383690A (en) * 2012-05-04 2013-11-06 深圳市腾讯计算机系统有限公司 Distributed data storage method and system
CN104811459A (en) * 2014-01-23 2015-07-29 阿里巴巴集团控股有限公司 Processing method, processing device and system for message services and message service system
CN104935622A (en) * 2014-03-21 2015-09-23 阿里巴巴集团控股有限公司 Method used for message distribution and consumption and apparatus thereof, and system used for message processing
CN105791431A (en) * 2016-04-26 2016-07-20 北京邮电大学 On-line distributed monitoring video processing task scheduling method and device
CN106293968A (en) * 2016-08-04 2017-01-04 华中科技大学 A kind of intercommunication system based on Kafka message-oriented middleware and method

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101383905B1 (en) * 2011-12-19 2014-04-17 주식회사 솔박스 method and apparatus for processing server load balancing with the result of hash function
US8560455B1 (en) * 2012-12-13 2013-10-15 Digiboo Llc System and method for operating multiple rental domains within a single credit card domain

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Multiple Path Selection Algorithm using Prime Number;Jaeyoung Kim et al;《 2006 10th IEEE Singapore International Conference on Communication Systems》;20061101;全文 *
云计算平台虚拟机调度策略研究;张琪;《中国优秀硕士学位论文全文数据库 信息科技辑》;20160415;全文 *

Also Published As

Publication number Publication date
CN106878415A (en) 2017-06-20

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20201016

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Patentee after: Innovative advanced technology Co.,Ltd.

Address before: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Patentee before: Advanced innovation technology Co.,Ltd.

Effective date of registration: 20201016

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Patentee after: Advanced innovation technology Co.,Ltd.

Address before: A four-storey 847 mailbox in Grand Cayman Capital Building, British Cayman Islands

Patentee before: Alibaba Group Holding Ltd.