CN117971486A - Load balancing method for distributed jobs, electronic device and storage medium - Google Patents

Load balancing method for distributed jobs, electronic device and storage medium

Info

Publication number
CN117971486A
Authority
CN
China
Prior art keywords
trigger
execution
node
target
load balancing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410167689.0A
Other languages
Chinese (zh)
Inventor
冯源
梁扬
雷琪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Dream Database Co ltd
Original Assignee
Wuhan Dream Database Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Dream Database Co ltd filed Critical Wuhan Dream Database Co ltd
Priority to CN202410167689.0A
Publication of CN117971486A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 - Arrangements for program control, e.g. control units
    • G06F9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 - Multiprogramming arrangements
    • G06F9/50 - Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083 - Techniques for rebalancing the load in a distributed system
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 - Arrangements for program control, e.g. control units
    • G06F9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 - Multiprogramming arrangements
    • G06F9/50 - Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 - Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 - Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505 - Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a load balancing method for distributed jobs, an electronic device and a storage medium. The load balancing method for distributed jobs is applied to a metadata service node and comprises the following steps: determining that a trigger and the default execution node of the trigger meet preset conditions, and acquiring the resource usage of all execution nodes; determining a target execution node among all the execution nodes according to the resource usage, and establishing dynamic distribution information for the trigger and the target execution node in a preset global trigger execution linked list; and sending the trigger identifier of the trigger to the target execution node according to the dynamic distribution information, so that the target execution node executes the job task corresponding to the trigger. Embodiments of the invention realize dynamic distribution of the execution nodes of triggers, prevent long job execution times caused by unbalanced load among execution nodes, and improve the execution efficiency of job tasks.

Description

Load balancing method for distributed jobs, electronic device and storage medium
Technical Field
The present invention relates to the field of computer technologies, and in particular to a load balancing method for distributed jobs, an electronic device, and a storage medium.
Background
A distributed processing cluster (Distributed Processing Cluster, DPC) is made up of three kinds of nodes: plan generation nodes (SQL Processor, SP), data storage nodes (Backend Processor, BP), and a metadata service node (Metadata Processor, MP). The SP receives client requests and generates execution plans; the BP stores user data, executes the scheduling instructions of the SP, and returns execution results to the SP; the MP stores metadata and provides metadata services to the SP and BP.
In a DPC system, an execution node (which may be any of the SP nodes) can be designated when a job is created, and the job is then executed on the designated execution node; if no execution node is designated, the available SP node with the smallest node number is selected for execution. If a job designates an execution node, the job is always executed on that fixed node once it is created; when the number of jobs is large, the server resources of certain nodes may be heavily used, causing unbalanced load and long job execution times. If no execution node is designated, more jobs are executed on the SP nodes with smaller node numbers, which likewise causes unbalanced load and longer job execution times. How to balance the load across execution nodes is therefore a problem to be solved.
Disclosure of Invention
The invention provides a load balancing method for distributed jobs, an electronic device and a storage medium, which are used to solve the problems of unbalanced load among execution nodes and long job execution times.
According to an aspect of the present invention, there is provided a load balancing method for distributed jobs, wherein the load balancing method is applied to a metadata service node, and includes:
determining that a trigger and default execution nodes of the trigger meet preset conditions, and acquiring resource use conditions of all execution nodes;
determining target execution nodes in all the execution nodes according to the resource use condition, and establishing dynamic distribution information of the trigger and the target execution nodes in a preset global trigger execution linked list;
and sending the trigger identification of the trigger to the target execution node according to the dynamic distribution information so that the target execution node executes the job task corresponding to the trigger.
According to another aspect of the present invention, there is provided a load balancing method of distributed jobs, applied to a target execution node, including:
Receiving a trigger identifier sent by a metadata service node, and adding the trigger identifier to a preset local trigger execution linked list;
and executing the job tasks corresponding to the triggers according to the arrangement sequence of the trigger identifications in the preset local trigger execution linked list.
According to another aspect of the present invention, there is provided a load balancing apparatus for distributed jobs, wherein the load balancing apparatus is applied to a metadata service node, comprising:
The condition acquisition module is used for determining that the trigger and default execution nodes of the trigger meet preset conditions and acquiring resource use conditions of all execution nodes;
The information generation module is used for determining target execution nodes in all the execution nodes according to the resource use condition, and establishing dynamic distribution information of the trigger and the target execution nodes in a preset global trigger execution linked list;
And the job distribution module is used for sending the trigger identification of the trigger to the target execution node according to the dynamic distribution information so that the target execution node executes the job task corresponding to the trigger.
According to another aspect of the present invention, there is provided a load balancing apparatus for distributed jobs, wherein the load balancing apparatus is applied to a target execution node, and includes:
The identifier receiving module is used for receiving a trigger identifier sent by a metadata service node and adding the trigger identifier to a preset local trigger execution linked list;
And the job execution module is used for executing the job tasks corresponding to the triggers according to the arrangement sequence of the trigger identifiers in the preset local trigger execution linked list.
According to another aspect of the present invention, there is provided an electronic device, including:
At least one processor; and
A memory communicatively coupled to the at least one processor; wherein,
The memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the load balancing method of distributed jobs as described in any of the embodiments of the present invention.
According to another aspect of the present invention, there is provided a computer readable storage medium storing computer instructions for causing a processor to execute a load balancing method for distributed jobs according to any of the embodiments of the present invention.
According to the technical scheme of the embodiment of the invention, it is determined that a trigger and the default execution node of the trigger meet preset conditions, and the resource usage of all execution nodes is obtained; the target execution node among all execution nodes is then determined according to the resource usage, the dynamic distribution information of the trigger and the target execution node is established in the preset global trigger execution linked list, and the trigger identifier of the trigger is sent to the target execution node according to the dynamic distribution information, so that the target execution node executes the job task corresponding to the trigger. In this way, dynamic distribution of the execution nodes of triggers is realized, the problem of long job execution times caused by unbalanced load among execution nodes is prevented, the execution efficiency of job tasks is improved, and the user experience is improved.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention or to delineate the scope of the invention. Other features of the present invention will become apparent from the description that follows.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method for load balancing of distributed jobs according to a first embodiment of the present invention;
FIG. 2 is a flow chart of another method for load balancing distributed jobs according to a second embodiment of the present invention;
FIG. 3 is a flow chart of a load balancing method for a distributed job according to a third embodiment of the present invention;
FIG. 4 is a flow chart of a load balancing method for a distributed job according to a fourth embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a load balancing device for distributed jobs according to a fifth embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a load balancing device for distributed jobs according to a sixth embodiment of the present invention;
FIG. 7 is a schematic structural diagram of an electronic device implementing a load balancing method for distributed jobs according to an embodiment of the present invention.
Detailed Description
In order that those skilled in the art will better understand the present invention, a technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in which it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.
It should be noted that the terms "default," "target," and the like in the description and claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be noted that, in a distributed computing cluster (DPC), only one MP node can be configured, and all metadata information is stored in the MP; multiple BP nodes can be configured, and different user data are stored on each BP node; multiple SP nodes can be configured, and a complete database service can be obtained by connecting any SP node.
In the database management work, for some works with fixed flow, such as regular database backup, regular data statistics report generation and the like, the work can be configured into a job, and the corresponding job is executed regularly through trigger control.
Example 1
Fig. 1 is a flowchart of a load balancing method for distributed jobs, which is applied to a metadata service node according to an embodiment of the present invention, where the method may be performed by a load balancing device for distributed jobs, where the load balancing device for distributed jobs may be implemented in hardware and/or software, and the load balancing device for distributed jobs may be configured in an electronic device. As shown in fig. 1, the method includes:
S110, determining that the trigger and default execution nodes of the trigger meet preset conditions, and acquiring resource use conditions of all execution nodes.
A trigger is a special stored procedure associated with a table event in a database; it is not invoked by a program or started manually, but is triggered by an event. For example, a trigger may be activated when a table in the database is operated on (insert, delete, or update). A trigger is a database object associated with a table; when a specified event occurs on the table where the trigger is located and the defined condition is met, the set of statements defined in the trigger is executed. Each trigger has a default execution node.
The default execution node refers to the execution node assigned to the trigger by default; in practice, the default execution node may be a plan generation node. An execution node refers to a node that executes the job task of a trigger and may be any plan generation node. There may be an association between the trigger and its default execution node; for example, the trigger identifier of the trigger may be stored in association with the node identifier of the default execution node, so that the corresponding default execution node can be queried according to the trigger identifier.
Resource usage may be understood as the utilization and resource occupancy of an execution node. By way of example, resource usage may include, but is not limited to, central processing unit (Central Processing Unit, CPU) utilization, memory consumption, and the like.
In an embodiment, determining that the trigger and the default execution node of the trigger satisfy the preset condition includes: the default execution node of the trigger is a load balancing permission node; the trigger is not assigned.
A load balancing permission node may be understood as a node that allows the job tasks of its associated triggers to be distributed to other execution nodes for execution. A load balancing permission node carries a special flag, e.g., "at_all_sp$". In one embodiment, the special flag may be set for the execution node during creation of a job task that designates an execution node, indicating that the job task is allowed to be dynamically distributed to any execution node for execution. When a job task designates the special flag, the execution node of the trigger created for that job task is automatically set to the special flag; alternatively, the special flag may be set directly on the designated execution node.
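For purposes of illustration only, the following Python sketch shows how a trigger might inherit the special flag from its job; the flag value "at_all_sp$" is taken from the description above, while the Job and Trigger structures and the create_trigger helper are hypothetical names introduced here and are not part of any actual DPC implementation.

from dataclasses import dataclass

AT_ALL_SP = "at_all_sp$"   # special flag: the job may run on any SP node

@dataclass
class Job:
    job_id: int
    execution_node: str    # a fixed SP node name, or the special flag

@dataclass
class Trigger:
    trigger_id: int
    job_id: int
    default_execution_node: str

def create_trigger(trigger_id: int, job: Job) -> Trigger:
    # When the job carries the special flag, the trigger created for it
    # automatically takes that flag as its default execution node.
    return Trigger(trigger_id, job.job_id, job.execution_node)

backup_job = Job(job_id=1, execution_node=AT_ALL_SP)
trigger = create_trigger(trigger_id=101, job=backup_job)
assert trigger.default_execution_node == AT_ALL_SP   # load balancing permitted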
In the embodiment of the invention, the default execution node of the trigger can be determined, the node type of the default execution node can be determined, and meanwhile, the allocation situation of the trigger can be determined. When the default execution node of the trigger is a load balancing permission node and the trigger is not allocated, it may be determined that the trigger and the default execution node of the trigger satisfy a preset condition. At this time, the resource use cases of all the execution nodes can be extracted.
In actual operation, after a trigger is allocated to an execution node, the trigger and the corresponding execution node may be stored in the preset global trigger execution linked list to mark that the trigger has been allocated. The preset global trigger execution linked list can therefore be queried to check whether the trigger identifier of the trigger exists. When the trigger identifier exists in the preset global trigger execution linked list, it is determined that the trigger does not meet the preset condition; when the trigger identifier does not exist in the preset global trigger execution linked list, it is determined that the trigger meets the preset condition. When the trigger and its default execution node meet the preset conditions, the resource usage of all execution nodes is acquired. In one embodiment, the CPU utilization and memory consumption of all execution nodes may be determined as the resource usage.
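A minimal Python sketch of this check is given below; the dictionary standing in for the preset global trigger execution linked list, the NodeStats structure, and the function names are assumptions made for illustration, not the embodiment's actual implementation.

from dataclasses import dataclass

AT_ALL_SP = "at_all_sp$"   # special flag of a load balancing permission node

@dataclass
class NodeStats:
    node_id: int
    cpu_usage: float       # CPU utilization, percent
    memory_usage: float    # memory consumption, percent

def meets_preset_conditions(trigger_id: int,
                            default_node: str,
                            global_exec_list: dict[int, int]) -> bool:
    # Condition 1: the default execution node is a load balancing permission node.
    if default_node != AT_ALL_SP:
        return False
    # Condition 2: the trigger has not been allocated yet, i.e. its identifier
    # is absent from the preset global trigger execution linked list.
    return trigger_id not in global_exec_list

def collect_resource_usage(sp_node_ids: list[int]) -> list[NodeStats]:
    # Placeholder monitoring step: in a real cluster the MP node would poll
    # every SP node for its CPU utilization and memory consumption.
    return [NodeStats(node_id, cpu_usage=0.0, memory_usage=0.0)
            for node_id in sp_node_ids]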
S120, determining target execution nodes in all execution nodes according to the resource use condition, and establishing dynamic distribution information of triggers and the target execution nodes in a preset global trigger execution linked list.
The preset global trigger execution linked list can be understood as a linked list which is created in advance and stores dynamic distribution information of all triggers and target execution nodes in the distributed computing cluster. The dynamic distribution information may be understood as information indicating that the trigger has been issued, and in an actual operation process, the dynamic distribution information may be an association relationship between the target execution node and the trigger. For example, a node identifier of the target execution node and a trigger identifier association relationship of the trigger may be established as the dynamic distribution information.
In the embodiment of the invention, the execution node with the lowest resource usage may be determined as the target execution node by comparing the resource usage of all execution nodes. In actual operation, the execution node whose CPU utilization and memory consumption values are the smallest among all execution nodes may be determined as the target execution node. Alternatively, a load balancing algorithm, such as a weighted random scheduling (Weighted Random Scheduling) algorithm or a processing capacity scheduling (Processing Capacity Scheduling) algorithm, may be used to select the optimal execution node as the target execution node. The node identifier of the target execution node is then determined, the association between the trigger identifier of the trigger and the node identifier is established as the dynamic distribution information, and the dynamic distribution information is stored in the preset global trigger execution linked list. In one embodiment, the trigger identifier may include a trigger identification number (Identity document, ID), and the node identifier may include a node number.
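The selection of the least-loaded node and the creation of the dynamic distribution information might look like the following sketch; the combined load score and the dictionary standing in for the preset global trigger execution linked list are illustrative assumptions only.

from dataclasses import dataclass

@dataclass
class NodeStats:
    node_id: int
    cpu_usage: float       # CPU utilization, percent
    memory_usage: float    # memory consumption, percent

def pick_target_node(stats: list[NodeStats]) -> int:
    # Choose the execution node whose monitored values are smallest; here the
    # CPU utilization and memory consumption are simply summed as a load score.
    return min(stats, key=lambda s: s.cpu_usage + s.memory_usage).node_id

def record_distribution(global_exec_list: dict[int, int],
                        trigger_id: int, node_id: int) -> None:
    # The dynamic distribution information is the association between the
    # trigger identifier and the node identifier of the target execution node.
    global_exec_list[trigger_id] = node_id

stats = [NodeStats(1, 70.0, 60.0), NodeStats(2, 20.0, 30.0), NodeStats(3, 55.0, 40.0)]
global_exec_list: dict[int, int] = {}
target = pick_target_node(stats)                 # node 2, the least loaded SP
record_distribution(global_exec_list, trigger_id=101, node_id=target)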
S130, sending the trigger identification of the trigger to the target execution node according to the dynamic distribution information so that the target execution node executes the job task corresponding to the trigger.
A job task is understood to be a task that is executed at scheduled times under the control of a trigger. By way of example, job tasks may include, but are not limited to, periodically backing up a database, periodically generating data statistics reports, and the like. Work with a fixed flow in the database can be configured as a job task.
In the embodiment of the invention, the dynamic distribution information associated with the trigger can be determined, the node identification in the dynamic distribution information is extracted, and the trigger identification is sent to the target execution node according to the node identification, so that the target execution node executes the job task corresponding to the trigger. In the actual operation process, corresponding dynamic distribution information can be determined according to the trigger identification of the trigger, the node identification of the target execution node is queried, and the trigger identification is sent to the target execution node according to the node identification.
In the embodiment of the invention, it is determined that the trigger and the default execution node of the trigger meet the preset conditions, and the resource usage of all execution nodes is acquired; the target execution node among all execution nodes is then determined according to the resource usage, the dynamic distribution information of the trigger and the target execution node is established in the preset global trigger execution linked list, and the trigger identifier of the trigger is sent to the target execution node according to the dynamic distribution information, so that the target execution node executes the job task corresponding to the trigger. Dynamic distribution of the execution nodes of triggers is thereby realized, the problem of long job execution times caused by unbalanced load among execution nodes is prevented, the execution efficiency of job tasks is improved, and the user experience is improved.
In one embodiment, after sending the trigger identification of the trigger to the target execution node according to the dynamic distribution information, the method further comprises:
receiving execution completion information fed back by a target execution node;
And removing the dynamic distribution information from a preset global trigger execution linked list according to the execution completion information.
The execution completion information may be understood as completion information fed back by the target execution node after the execution of the job task is completed, so as to notify the metadata service node that the job task is completed.
In the embodiment of the invention, after receiving the execution completion information fed back by the target execution node, the trigger identifier included in the execution completion information may be determined. Inquiring dynamic distribution information associated with the trigger identifier in a preset global trigger execution linked list, and removing the corresponding dynamic distribution information in the preset global trigger execution linked list.
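A short sketch of this cleanup step, again with a dictionary standing in for the preset global trigger execution linked list as an illustrative assumption, is:

def on_execution_complete(global_exec_list: dict[int, int],
                          completed_trigger_id: int) -> None:
    # The execution completion information carries the trigger identifier,
    # which selects the dynamic distribution entry to remove.
    global_exec_list.pop(completed_trigger_id, None)

global_exec_list = {101: 2, 102: 3}
on_execution_complete(global_exec_list, 101)
assert global_exec_list == {102: 3}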
In an embodiment, the load balancing method of the distributed job further includes:
And when the default execution node of the trigger is determined not to meet the load balancing condition, the trigger identification of the trigger is sent to the default execution node so that the default execution node executes the job task corresponding to the trigger.
In the embodiment of the invention, when the default execution node of the trigger is not the load balancing permission node, the job task corresponding to the trigger is considered not to be allowed to be distributed to other execution nodes, and the trigger identifier of the trigger can be sent to the default execution node, so that the default execution node executes the job task corresponding to the trigger.
In an embodiment, when the metadata service node fails, if the metadata service node is a single node, after the metadata service node is restarted, the preset global trigger execution linked list is re-created, the preset local trigger execution linked lists on all the execution nodes are collected, and all the trigger identifiers and the corresponding execution node identifiers are added into the preset global trigger execution linked list.
When the metadata service node is configured with the backup node, after the metadata service node is subjected to primary-backup switching, the new primary node can re-establish a preset global trigger execution linked list, collect preset local trigger execution linked lists on all execution nodes, and add all trigger identifiers and corresponding execution node identifiers into the preset global trigger execution linked list.
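The rebuild described above might be sketched as follows; the mapping of node identifiers to pending trigger identifiers is an assumed representation of the preset local trigger execution linked lists collected from the execution nodes, not the embodiment's actual data structure.

def rebuild_global_list(local_lists: dict[int, list[int]]) -> dict[int, int]:
    # local_lists maps each execution node identifier to the trigger identifiers
    # still pending in its preset local trigger execution linked list.
    global_exec_list: dict[int, int] = {}
    for node_id, trigger_ids in local_lists.items():
        for trigger_id in trigger_ids:
            global_exec_list[trigger_id] = node_id
    return global_exec_list

# Example: execution node 2 still holds triggers 101 and 103, node 3 holds 102.
rebuilt = rebuild_global_list({2: [101, 103], 3: [102]})
assert rebuilt == {101: 2, 103: 2, 102: 3}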
Example two
Fig. 2 is a flowchart of another load balancing method for distributed jobs according to a second embodiment of the present invention, where the embodiment is further optimized and expanded based on the foregoing embodiment, and may be combined with each optional technical solution in the foregoing embodiment, as shown in fig. 2, and the method includes:
s210, determining that the trigger and default execution nodes of the trigger meet preset conditions, and acquiring resource use conditions of all execution nodes.
S220, comparing the resource use conditions in all the execution nodes; the monitoring items of the resource usage condition at least comprise: CPU utilization and memory consumption.
In the embodiment of the invention, the memory of each execution node may be read to determine its resource usage. In actual operation, each monitoring item in the resource usage of every execution node, such as CPU utilization and memory consumption, may be determined separately, and the same monitoring item may then be compared across the execution nodes. For example, the CPU utilization and memory consumption of all execution nodes may be compared to determine the target execution node.
S230, taking the execution node with the minimum monitoring item value of the resource use condition in all the execution nodes as a target execution node.
In the embodiment of the invention, after the comparison of the resource usage conditions of all the execution nodes is completed, the execution node with the minimum value of the monitoring item of the resource usage condition can be used as the target execution node. In an embodiment, the execution node with the smallest central processing unit utilization rate and memory consumption condition value of each execution node can be used as a target execution node; or a weighted random balance scheduling algorithm, a processing capacity balance scheduling algorithm and the like can be adopted to select the optimal execution node as the target execution node.
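For the weighted random scheduling alternative mentioned here, one possible sketch is shown below; treating the remaining CPU and memory capacity as the selection weight is an assumption of this example, not a formula given in the embodiment.

import random
from dataclasses import dataclass

@dataclass
class NodeStats:
    node_id: int
    cpu_usage: float       # CPU utilization, percent (0-100)
    memory_usage: float    # memory consumption, percent (0-100)

def weighted_random_pick(stats: list[NodeStats]) -> int:
    # Lightly loaded nodes get proportionally larger weights, so they are
    # chosen more often while heavily loaded nodes are still occasionally used.
    weights = [max(200.0 - s.cpu_usage - s.memory_usage, 1.0) for s in stats]
    return random.choices([s.node_id for s in stats], weights=weights, k=1)[0]

stats = [NodeStats(1, 90.0, 90.0), NodeStats(2, 20.0, 30.0)]
target = weighted_random_pick(stats)   # node 2 with high probability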
S240, extracting the node identification of the target execution node and the trigger identification of the trigger.
The node identification may be understood as identification information distinguishing the executing nodes, and may include a node number, for example. The trigger identification may be understood as identification information indicating the trigger, and may include a trigger ID, for example.
In the embodiment of the invention, the corresponding node identifier and the trigger identifier can be respectively extracted from the memories of the target execution node and the trigger.
S250, establishing a node identifier and trigger identifier association relationship as dynamic distribution information, and storing the dynamic distribution information in a preset global trigger execution linked list.
In the embodiment of the invention, the association relation between the node identifier and the trigger identifier can be determined, and the association relation is used as dynamic distribution information to be stored in a preset global trigger execution linked list.
S260, reading the dynamic distribution information of the preset global trigger execution linked list.
S270, determining trigger identifiers and node identifiers stored in the dynamic distribution information.
In the embodiment of the invention, dynamic distribution information can be extracted, and trigger identifiers and node identifiers stored in the dynamic distribution information are determined.
And S280, sending the trigger identification to the target execution node according to the node identification, so that the target execution node executes the job task corresponding to the trigger.
In the embodiment of the invention, it is determined that the trigger and the default execution node of the trigger meet the preset conditions, the resource usage of all execution nodes is acquired and compared, and the execution node with the smallest monitoring-item values among all execution nodes is taken as the target execution node, thereby realizing the determination of the target execution node. By extracting the node identifier of the target execution node and the trigger identifier of the trigger, establishing the association between the node identifier and the trigger identifier as the dynamic distribution information, and storing the dynamic distribution information in the preset global trigger execution linked list, the distribution status of triggers can be checked more clearly in the preset global trigger execution linked list. By reading the dynamic distribution information of the preset global trigger execution linked list, the trigger identifier and node identifier stored in the dynamic distribution information are determined, and the trigger identifier is sent to the target execution node according to the node identifier, so that the job task corresponding to the trigger is executed by the target execution node and dynamic distribution of the execution nodes of triggers is realized.
Example III
Fig. 3 is a flowchart of a load balancing method of a distributed job according to a third embodiment of the present invention, which is applied to a target execution node. The embodiment may be applied to a case of load balancing a job task of a trigger, as shown in fig. 3, where the method includes:
S310, receiving a trigger identification sent by the metadata service node, and adding the trigger identification to a preset local trigger execution linked list.
The preset local trigger execution linked list can be understood as a trigger linked list which is created in advance and stored in a target execution node to execute a job task. Corresponding job tasks can be executed according to the arrangement sequence of trigger identifiers in a preset local trigger execution linked list.
In the embodiment of the invention, after the trigger identifier sent by the metadata service node is obtained, the trigger identifier may be added to a preset local trigger execution linked list to execute the job task corresponding to the trigger.
S320, executing the job tasks corresponding to the triggers according to the arrangement sequence of the trigger identifiers in the preset local trigger execution linked list.
In the embodiment of the invention, the arrangement sequence of the trigger identifications in the preset local trigger execution linked list can be determined, and the job tasks corresponding to the triggers are sequentially executed according to the arrangement sequence of the trigger identifications.
In an embodiment, the target execution node obtains the first trigger ID in the preset local trigger execution linked list and queries the specific trigger information according to the trigger ID; the trigger information may include the job identifier corresponding to the trigger, and the corresponding job task is then executed.
According to the embodiment of the invention, the trigger identification sent by the metadata service node is received, and the trigger identification is added to the preset local trigger execution linked list, so that the operation tasks corresponding to the triggers are executed according to the arrangement sequence of the trigger identification in the preset local trigger execution linked list, the orderly execution of the operation tasks is realized, and the execution efficiency of the operation tasks is improved.
In an embodiment, the load balancing method of the distributed job further includes:
After the execution of the job task corresponding to the trigger is completed, feeding back the execution completion information to the metadata service node, and removing the trigger identification from a preset local trigger execution linked list.
Wherein the execution completion information includes at least a trigger identification.
In the embodiment of the invention, after the execution of the job task corresponding to the trigger is completed, execution completion information carrying the trigger identifier may be generated and sent to the metadata service node to notify it that the job task corresponding to the trigger has been completed. Meanwhile, the trigger identifier is removed from the preset local trigger execution linked list to prevent the job task from being executed repeatedly.
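The execution-node side of this flow can be sketched as a simple loop; the callback names (lookup_trigger, run_job, notify_mp) are hypothetical stand-ins for the node's internal interfaces and are not taken from the embodiment.

from collections import deque
from typing import Callable

def process_local_list(local_list: deque,
                       lookup_trigger: Callable[[int], int],
                       run_job: Callable[[int], None],
                       notify_mp: Callable[[int], None]) -> None:
    # Triggers are handled in the order in which their identifiers were added
    # to the preset local trigger execution linked list.
    while local_list:
        trigger_id = local_list[0]           # first trigger identifier in the list
        job_id = lookup_trigger(trigger_id)  # trigger info -> corresponding job
        run_job(job_id)                      # execute the job task
        notify_mp(trigger_id)                # feed back execution completion info
        local_list.popleft()                 # remove so the task is not rerun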
In an embodiment, when the metadata service node fails, if the metadata service node is a single node, after the metadata service node is restarted, the preset global trigger execution linked list is re-created, the preset local trigger execution linked lists on all the execution nodes are collected, and all the trigger identifiers and the corresponding execution node identifiers are added into the preset global trigger execution linked list.
When the metadata service node is configured with the backup node, after the metadata service node is subjected to primary-backup switching, the new primary node can re-establish a preset global trigger execution linked list, collect preset local trigger execution linked lists on all execution nodes, and add all trigger identifiers and corresponding execution node identifiers into the preset global trigger execution linked list. In one embodiment, the DPC system supports configuration of multiple backup nodes for the metadata service node to ensure system reliability, since important metadata information is stored on the metadata service node. When the metadata service node main node fails, one node is selected from the metadata service node backup nodes to serve as a new main node to continue to provide service.
In an embodiment, when the metadata service node monitors that the execution node fails, the metadata service node removes all dynamic distribution information corresponding to the failed execution node from the preset global trigger execution linked list.
If the execution node fails and restarts quickly (the preset local trigger execution linked list on the execution node is empty after the restart), the execution node sends a message to the metadata service node to notify it that the execution node has restarted, and the metadata service node removes all dynamic distribution information corresponding to that execution node from the preset global trigger execution linked list (if no corresponding dynamic distribution information exists in the preset global trigger execution linked list, nothing is removed).
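A sketch of the corresponding cleanup on the metadata service node, with the usual dictionary standing in for the preset global trigger execution linked list as an assumption, could be:

def purge_node_entries(global_exec_list: dict[int, int], failed_node_id: int) -> None:
    # Drop every dynamic distribution entry that points at the failed or
    # restarted execution node.
    stale = [t for t, n in global_exec_list.items() if n == failed_node_id]
    for trigger_id in stale:
        del global_exec_list[trigger_id]

global_exec_list = {101: 2, 102: 3, 103: 2}
purge_node_entries(global_exec_list, failed_node_id=2)
assert global_exec_list == {102: 3}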
Example IV
Fig. 4 is a flowchart of a load balancing method for distributed jobs according to a fourth embodiment of the present invention. In this embodiment, based on the above embodiments, the load balancing method for distributed jobs is further explained by taking an MP node as the metadata service node, an SP node as the execution node, the trigger ID as the trigger identifier, the SP node number as the execution node identifier, and a target SP node as the target execution node. As shown in fig. 4, the method includes:
S410, the MP node checks whether the default execution node of the trigger carries the special flag; if yes, S420 is executed; if not, S430 is executed.
In an embodiment, when the default execution node has a special tag, the default execution node may be considered as a load balancing enabled node.
In one embodiment, when a job task is created to designate an executing node, the executing node may be set to a special flag, for example, the special flag may be "at_all_sp$", indicating that the job task is allowed to be dynamically distributed to any SP node for execution. If a job task specifies a special flag, then the execution node of the trigger is automatically set to that special flag when the trigger is created for the job task.
S420, when the MP node finds that the trigger ID does not exist in the preset global trigger execution linked list, the MP node monitors and collects the resource usage of each SP node and determines the target SP node according to the resource usage.
The resource use condition includes, but is not limited to, CPU use rate, memory consumption condition, and the like. The optimal SP node may be selected as the target SP node using a load balancing algorithm, such as a weighted random balanced scheduling (Weighted Random Scheduling) algorithm or a processing power balanced scheduling (Processing Capacity Scheduling) algorithm.
In an embodiment, when the trigger ID exists in the preset global trigger execution linked list, it is indicated that the trigger has been distributed, and the corresponding job task has not been executed to completion, and is therefore no longer distributed.
S430, sending the trigger ID of the trigger to a default execution node so that the default execution node executes the job task corresponding to the trigger.
S440, the MP node sends the trigger ID to the target SP node, adds a piece of dynamic distribution information to the preset global trigger execution linked list, and records the trigger ID and the SP node number of the target SP node.
S450, the SP node receives the trigger ID sent by the MP node, adds the trigger ID to the preset local trigger execution linked list, and feeds back execution completion information to the MP node after the job task is executed.
In an embodiment, the SP node sequentially executes the job tasks corresponding to the triggers in the preset local trigger execution linked list, and after each job task is completed it sends a message to the MP node to notify the MP node that the job task has been executed.
In an embodiment, the SP node obtains the first trigger ID in the preset local trigger execution linked list and queries the specific trigger information according to the trigger ID; the trigger information may include the job identifier corresponding to the trigger, and the corresponding job task is executed. After the job task is executed, the SP node sends execution completion information, which includes the trigger ID, to the MP node to notify it that the job has been executed. After the execution completion information is sent, the SP node removes the current trigger ID from the preset local trigger execution linked list and then obtains the next trigger ID to continue executing jobs.
S460, the MP node receives the message of the SP node, and after the completion of the task execution of the SP node operation is determined, the dynamic distribution information corresponding to the corresponding trigger ID is removed from the preset global trigger execution linked list.
In an embodiment, when the MP node fails, if the MP node is a single node, the preset global trigger execution linked list is recreated after the MP node restarts, the preset local trigger execution linked lists on all SP nodes are collected, and all the trigger IDs and the corresponding SP node numbers are added to the preset global trigger execution linked list.
When the MP node is configured with the backup node, after the MP node is subjected to primary-backup switching, the new master node can re-establish a preset global trigger execution linked list, collect preset local trigger execution linked lists on all SP nodes, and add all trigger IDs and corresponding SP node numbers into the preset global trigger execution linked list. Because important metadata information is stored on the MP node, the DPC system supports configuration of a plurality of backup nodes for the MP node so as to ensure the reliability of the system. When the MP master node fails, one node is selected from the MP backup nodes as a new master node to continue to provide service.
In an embodiment, when the MP node monitors that the SP node has a failure, the MP node removes all dynamic distribution information corresponding to the failed SP node from the preset global trigger execution linked list.
If the SP node fails and restarts quickly (the preset local trigger execution linked list on the SP node is empty after the restart), the SP node sends a message to the MP node to notify it that the SP node has restarted, and the MP node removes all dynamic distribution information corresponding to that SP node from the preset global trigger execution linked list (if no corresponding dynamic distribution information exists in the preset global trigger execution linked list, nothing is removed).
Example five
Fig. 5 is a schematic structural diagram of a load balancing device for distributed jobs, which is provided in a fifth embodiment of the present invention, and is applied to a metadata service node. As shown in fig. 5, the load balancing apparatus for distributed job includes: a situation acquisition module 51, an information generation module 52, and a job distribution module 53.
The condition obtaining module 51 is configured to determine that the trigger and a default execution node of the trigger meet preset conditions, and obtain resource usage conditions of all execution nodes;
The information generating module 52 is configured to determine a target execution node in all execution nodes according to a resource usage situation, and establish dynamic distribution information of a trigger and the target execution node in a preset global trigger execution linked list;
The job distributing module 53 is configured to send the trigger identifier of the trigger to the target executing node according to the dynamic distribution information, so that the target executing node executes the job task corresponding to the trigger.
According to the embodiment of the invention, the trigger and the default execution node of the trigger are determined to meet the preset condition through the condition acquisition module, the resource use condition of all the execution nodes is acquired, the information generation module determines the target execution node in all the execution nodes according to the resource use condition, the trigger and the dynamic distribution information of the target execution node are established in the preset global trigger execution linked list, the trigger identification of the trigger is sent to the target execution node by the job distribution module according to the dynamic distribution information, so that the target execution node executes the job task corresponding to the trigger, the dynamic distribution of the execution node of the trigger is realized, the problem of long job execution time caused by unbalanced load of the execution node is prevented, the execution efficiency of the job task is improved, and the use experience of a user is improved.
In one embodiment, the condition acquisition module 51 includes:
the node condition determining unit is used for determining that a default executing node of the trigger is a load balancing permission node;
and the trigger condition determining unit is used for determining that the trigger is not allocated.
In one embodiment, the information generation module 52 includes:
The data comparison unit is used for comparing the resource use conditions in all the execution nodes; the monitoring items of the resource usage condition at least comprise: the CPU utilization rate and the memory consumption condition;
The inter-node determining unit is used for taking the execution node with the minimum monitoring item value of the resource use condition in all the execution nodes as a target execution node;
The identification extraction unit is used for extracting the node identification of the target execution node and the trigger identification of the trigger;
the information establishing unit is used for establishing the association relation between the node identifier and the trigger identifier as dynamic distribution information, and storing the dynamic distribution information in a preset global trigger execution linked list.
In one embodiment, job distribution module 53 includes:
The information reading unit is used for reading the dynamic distribution information of the preset global trigger execution linked list; the identifier determining unit is used for determining the trigger identifier and node identifier stored in the dynamic distribution information;
And the job distributing unit is used for sending the trigger identification to the target executing node according to the node identification so as to enable the target executing node to execute the job task corresponding to the trigger.
In an embodiment, the load balancing device of the distributed job further includes:
The completion information receiving module is used for receiving the execution completion information fed back by the target execution node;
And the dynamic information removing module is used for removing the dynamic distribution information from a preset global trigger execution linked list according to the execution completion information.
In an embodiment, the load balancing device of the distributed job further includes:
And the task distribution module is used for sending the trigger identifier of the trigger to the default execution node when the default execution node of the trigger does not meet the load balancing condition, so that the default execution node executes the job task corresponding to the trigger.
The load balancing device for the distributed operation provided by the embodiment of the invention can execute the load balancing method for the distributed operation provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the executing method.
Example six
Fig. 6 is a schematic structural diagram of a load balancing device for distributed jobs, which is provided in a sixth embodiment of the present invention, and is applied to a target execution node. As shown in fig. 6, the load balancing apparatus for distributed job includes: an identification receiving module 61 and a job execution module 62.
The identifier receiving module 61 is configured to receive a trigger identifier sent by the metadata service node, and add the trigger identifier to a preset local trigger execution linked list;
And the job execution module 62 is used for executing the job tasks corresponding to the triggers according to the arrangement sequence of the trigger identifiers in the preset local trigger execution linked list.
According to the embodiment of the invention, the trigger identification sent by the metadata service node is received through the identification receiving module, the trigger identification is added to the preset local trigger execution linked list, and the job execution module executes the job tasks corresponding to the triggers according to the arrangement sequence of the trigger identification in the preset local trigger execution linked list, so that the orderly execution of the job tasks is realized, and the execution efficiency of the job tasks is improved.
In an embodiment, the load balancing device of the distributed job further includes:
The data removing module is used for feeding back execution completion information to the metadata service node after the execution of the job task corresponding to the trigger is completed, and removing the trigger identifier from the preset local trigger execution linked list; wherein the execution completion information includes at least the trigger identifier.
The load balancing device for the distributed operation provided by the embodiment of the invention can execute the load balancing method for the distributed operation provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the executing method.
Example seven
Fig. 7 is a schematic structural diagram of an electronic device implementing a load balancing method for distributed jobs according to an embodiment of the present invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic equipment may also represent various forms of mobile devices, such as personal digital processing, cellular telephones, smartphones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 7, the electronic device 10 includes at least one processor 11, and a memory, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, etc., communicatively connected to the at least one processor 11, in which the memory stores a computer program executable by the at least one processor, and the processor 11 may perform various appropriate actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from the storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data required for the operation of the electronic device 10 may also be stored. The processor 11, the ROM 12 and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
Various components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, etc.; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The processor 11 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, digital Signal Processors (DSPs), and any suitable processor, controller, microcontroller, etc. The processor 11 performs the various methods and processes described above, such as a load balancing method of distributed jobs.
In some embodiments, the load balancing method of distributed jobs may be implemented as a computer program tangibly embodied on a computer-readable storage medium, such as storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into RAM 13 and executed by processor 11, one or more steps of the load balancing method of distributed jobs described above may be performed. Alternatively, in other embodiments, processor 11 may be configured to perform the load balancing method of distributed jobs in any other suitable manner (e.g., by means of firmware).
Various implementations of the systems and techniques described here above may be implemented in digital electronic circuitry, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chips (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs, which may be executed and/or interpreted on a programmable system including at least one programmable processor; the programmable processor may be a special-purpose or general-purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for carrying out methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be implemented. The computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) through which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the Internet.
The computing system may include clients and servers. A client and a server are typically remote from each other and typically interact through a communication network. The client-server relationship arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server (also called a cloud computing server or cloud host), a host product in a cloud computing service system that addresses the drawbacks of high management difficulty and weak service scalability found in traditional physical hosts and VPS services.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of flows shown above. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, as long as the desired results of the technical solution of the present invention can be achieved; no limitation is imposed herein.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (10)

1. A method for load balancing of distributed jobs, applied to a metadata service node, comprising:
determining that a trigger and default execution nodes of the trigger meet preset conditions, and acquiring resource use conditions of all execution nodes;
determining target execution nodes in all the execution nodes according to the resource use condition, and establishing dynamic distribution information of the trigger and the target execution nodes in a preset global trigger execution linked list;
and sending the trigger identification of the trigger to the target execution node according to the dynamic distribution information so that the target execution node executes the job task corresponding to the trigger.
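By way of illustration only, the metadata-service-node flow of claim 1 could be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the names MetadataServiceNode, TriggerInfo, NodeStatus and global_trigger_list are hypothetical, the preset global trigger execution linked list is approximated by a dictionary, and delivery of the trigger identifier to the target node is reduced to a return value instead of a real message.

from dataclasses import dataclass
from typing import Dict, List


@dataclass
class NodeStatus:
    node_id: str
    cpu_usage: float     # CPU utilization, e.g. 0.0 - 1.0
    memory_usage: float  # memory consumption, e.g. 0.0 - 1.0


@dataclass
class TriggerInfo:
    trigger_id: str
    default_node: str
    load_balancing_allowed: bool = True  # default node permits load balancing
    assigned: bool = False               # trigger has not been assigned yet


class MetadataServiceNode:
    def __init__(self) -> None:
        # Stand-in for the "preset global trigger execution linked list":
        # trigger identifier -> target node identifier.
        self.global_trigger_list: Dict[str, str] = {}

    def dispatch(self, trigger: TriggerInfo, nodes: List[NodeStatus]) -> str:
        # Preset condition (cf. claim 2): the default execution node permits
        # load balancing and the trigger has not been assigned.
        if not trigger.load_balancing_allowed or trigger.assigned or not nodes:
            # Fallback (cf. claim 6): keep the job on the default node.
            return trigger.default_node

        # Pick the execution node with the lowest monitored resource usage.
        target = min(nodes, key=lambda n: (n.cpu_usage, n.memory_usage))

        # Record the dynamic distribution information.
        self.global_trigger_list[trigger.trigger_id] = target.node_id
        trigger.assigned = True

        # "Send" the trigger identifier to the chosen node; a real system
        # would use an RPC here, the sketch simply returns the node id.
        return target.node_id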
2. The method of claim 1, wherein determining that the trigger and the default execution node of the trigger satisfy the preset condition comprises determining that:
the default execution node of the trigger is a load balancing permission node; and
the trigger has not been assigned.
3. The method according to claim 1, wherein the determining the target execution node among all the execution nodes according to the resource usage situation, and establishing the dynamic distribution information of the trigger and the target execution node in a preset global trigger execution linked list, includes:
comparing the resource usage of all the execution nodes, wherein monitoring items of the resource usage at least comprise CPU utilization and memory consumption;
taking, as the target execution node, the execution node whose monitoring item values of the resource usage are the smallest among all the execution nodes;
extracting a node identifier of the target execution node and a trigger identifier of the trigger;
and establishing the association relation between the node identifier and the trigger identifier as the dynamic distribution information, and storing the dynamic distribution information in the preset global trigger execution linked list.
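A minimal sketch of the selection and bookkeeping steps of claim 3, assuming a dictionary keyed by node identifier as the monitoring input; how CPU utilization and memory consumption are combined into one comparison value is an assumption of the sketch, not something the claim prescribes.

def choose_target_node(usage_by_node):
    # usage_by_node: node_id -> {"cpu": float, "mem": float}. The node with
    # the smallest monitored values becomes the target execution node.
    def combined(item):
        _, metrics = item
        return metrics["cpu"] + metrics["mem"]

    node_id, _ = min(usage_by_node.items(), key=combined)
    return node_id


def record_distribution(global_trigger_list, trigger_id, node_id):
    # Associate the node identifier with the trigger identifier and store the
    # pair in the (dictionary stand-in for the) global trigger execution
    # linked list.
    global_trigger_list[trigger_id] = node_id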
4. The method according to claim 1, wherein the sending the trigger identifier of the trigger to the target execution node according to the dynamic distribution information, so that the target execution node executes the job task corresponding to the trigger, includes:
reading the dynamic distribution information of the preset global trigger execution linked list; determining a trigger identifier and a node identifier stored in the dynamic distribution information;
and sending the trigger identification to the target execution node according to the node identification so that the target execution node executes the job task corresponding to the trigger.
5. The method of claim 1, further comprising, after said sending the trigger identification of the trigger to the target execution node in accordance with the dynamic distribution information:
receiving execution completion information fed back by the target execution node;
and removing the dynamic distribution information from the preset global trigger execution linked list according to the execution completion information.
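An illustrative handler for the completion feedback of claim 5, reusing the dictionary stand-in for the global trigger execution linked list from the sketches above; the shape of completion_info is an assumption.

def on_execution_completed(global_trigger_list, completion_info):
    # completion_info is assumed to carry at least the trigger identifier
    # (cf. claim 8); remove the corresponding dynamic distribution entry.
    trigger_id = completion_info["trigger_id"]
    global_trigger_list.pop(trigger_id, None)  # idempotent removal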
6. The method of claim 1, wherein the load balancing method of the distributed job further comprises:
and when the default execution node of the trigger is determined not to meet the load balancing condition, sending the trigger identification of the trigger to the default execution node so that the default execution node executes the job task corresponding to the trigger.
7. A method for load balancing of distributed jobs, applied to a target execution node, comprising:
receiving a trigger identifier sent by a metadata service node, and adding the trigger identifier to a preset local trigger execution linked list;
and executing the job tasks corresponding to the triggers according to the order in which the trigger identifiers are arranged in the preset local trigger execution linked list.
8. The method of claim 7, wherein the load balancing method of the distributed job further comprises:
after the execution of the job task corresponding to the trigger is completed, feeding back execution completion information to the metadata service node, and removing the trigger identifier from the preset local trigger execution linked list;
wherein the execution completion information includes at least the trigger identifier.
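For the target-execution-node side (claims 7 and 8), a minimal sketch, assuming a deque as the local trigger execution list, a job lookup table, and a caller-supplied send_feedback callback in place of the real channel back to the metadata service node; all of these names are hypothetical.

from collections import deque


class ExecutionNode:
    def __init__(self, node_id, jobs, send_feedback):
        self.node_id = node_id
        self.jobs = jobs                    # trigger_id -> callable job task
        self.send_feedback = send_feedback  # callback toward the metadata node
        self.local_trigger_list = deque()   # local trigger execution list

    def receive_trigger(self, trigger_id):
        # Claim 7: append the received trigger identifier to the local list.
        self.local_trigger_list.append(trigger_id)

    def run_pending(self):
        # Execute job tasks in the order in which their trigger identifiers
        # were queued.
        while self.local_trigger_list:
            trigger_id = self.local_trigger_list[0]
            self.jobs[trigger_id]()  # run the job task for this trigger
            # Claim 8: after the job finishes, remove the identifier from the
            # local list and feed back completion information containing at
            # least the trigger identifier.
            self.local_trigger_list.popleft()
            self.send_feedback({"trigger_id": trigger_id,
                                "node_id": self.node_id})

Using a queue preserves the arrangement order of the received trigger identifiers, so job tasks run first-in, first-out, consistent with executing them "according to the order of the trigger identifiers" in the local list.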
9. An electronic device, the electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the load balancing method of distributed jobs of any one of claims 1-6 or claims 7-8.
10. A computer readable storage medium storing computer instructions for causing a processor to perform the load balancing method of distributed jobs according to any one of claims 1-6 or claims 7-8.
CN202410167689.0A 2024-02-06 2024-02-06 Load balancing method for distributed operation, electronic equipment and storage medium Pending CN117971486A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410167689.0A CN117971486A (en) 2024-02-06 2024-02-06 Load balancing method for distributed operation, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410167689.0A CN117971486A (en) 2024-02-06 2024-02-06 Load balancing method for distributed operation, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117971486A 2024-05-03

Family

ID=90863008

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410167689.0A Pending CN117971486A (en) 2024-02-06 2024-02-06 Load balancing method for distributed operation, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117971486A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination