CN114579278A - Distributed scheduling method, device and system and computer readable storage medium - Google Patents


Publication number
CN114579278A
Authority
CN
China
Prior art keywords
server
task
fragmentation
tasks
executed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210220960.3A
Other languages
Chinese (zh)
Inventor
刘哲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Avatr Technology Chongqing Co Ltd
Original Assignee
Avatr Technology Chongqing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Avatr Technology Chongqing Co Ltd
Priority to CN202210220960.3A
Publication of CN114579278A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5038 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Hardware Redundancy (AREA)

Abstract

The embodiment of the invention relates to the technical field of computer application, and discloses a distributed scheduling method, a device and a system and a computer readable storage medium, wherein the method comprises the following steps: dividing a task to be executed into at least two first fragmentation tasks according to a preset fragmentation strategy, and distributing the first fragmentation tasks to corresponding registration servers; acquiring the working states of all the registration servers; if the working state of at least one registration server is down and the tasks to be executed are not finished, dividing the unfinished parts in the tasks to be executed into at least two second fragmentation tasks according to the running server and a preset fragmentation strategy; and distributing the second fragmentation task to the running server. By applying the technical scheme of the invention, the problem of the upper limit of the computing capacity of a single server can be solved, the influence of partial task failure on the whole system is reduced while the task processing progress of each node is fully considered, and the flexible capacity expansion of task processing is realized.

Description

Distributed scheduling method, device and system and computer readable storage medium
Technical Field
The embodiment of the invention relates to the technical field of computer application, in particular to a distributed scheduling method, a distributed scheduling device, a distributed scheduling system and a computer readable storage medium.
Background
Timed task scheduling handles tasks that many application systems must execute periodically, for example: putting commodity SKUs (Stock Keeping Units) on the shelf at a set time, automatically cancelling overdue orders, auditing orders, synchronizing data on a schedule, and calculating the income of financial products.
At present, timed task scheduling is generally developed with the DelayQueue technology or the Spring Schedule technology. Whichever is used, at least two problems arise. First, the application service cannot be deployed on multiple nodes: blindly deploying it on multiple nodes to gain capacity may cause the same task to be executed repeatedly, which leads to errors in the system logic. Second, increasing the number of application nodes does not improve the efficiency of each execution, the task processing progress of each node is not fully considered, and horizontal expansion cannot be achieved.
Disclosure of Invention
In view of the foregoing problems, embodiments of the present invention provide a distributed scheduling method, apparatus, system and computer-readable storage medium, which are used to solve the problem that in the prior art, the task processing progress of each node is not fully considered in the task scheduling process, and elastic capacity expansion cannot be achieved.
According to an aspect of an embodiment of the present invention, a distributed scheduling method is provided, where the method includes:
dividing a task to be executed into at least two first fragmentation tasks according to a preset fragmentation strategy, and distributing the first fragmentation tasks to corresponding registration servers;
acquiring the working states of all the registration servers; the working state comprises a running state and a downtime state, and the registration servers comprise running servers in the running state and downtime servers in the downtime state;
if the working state of at least one registration server is down and the task to be executed is not completed, dividing the uncompleted part of the task to be executed into at least two second fragmentation tasks according to the running server and the preset fragmentation strategy;
and distributing the second fragmentation task to the running server.
In an optional manner, if the working state of at least one running server is the downtime state and the task to be executed is not completed, dividing the uncompleted part of the task to be executed into at least two second fragmentation tasks according to the running server and the preset fragmentation strategy includes:
if the working state of at least one running server is the downtime state, acquiring the processing progress of each first fragmentation task;
if any one of the first fragmentation tasks has started to be processed and is not finished, repackaging the first fragmentation tasks corresponding to the downtime server to form a first task package;
and dividing the first task package into at least two second fragmentation tasks according to the running server and the preset fragmentation strategy.
In an optional manner, after acquiring the processing progress of each first fragmentation task if the working state of at least one running server is the downtime state, the method further includes:
if none of the first fragmentation tasks has started to be processed, regenerating at least two second fragmentation tasks of the task to be executed according to the running server and the preset fragmentation strategy, and distributing the second fragmentation tasks to the running server.
In an optional manner, dividing the first task package into at least two second fragmentation tasks according to the running server and the preset fragmentation strategy includes:
determining whether the number of running servers meets the condition of the preset fragmentation strategy;
and if so, dividing the first task package into at least two second fragmentation tasks according to the preset fragmentation strategy.
In an optional manner, the registration servers further include a newly added server; after acquiring the working states of all the registration servers, the method further includes:
if a newly added server exists, acquiring the processing progress of each first fragmentation task;
if at least one first fragmentation task is not finished, maintaining the first fragmentation tasks until the task to be executed is finished;
and if all the first fragmentation tasks are completed, taking the newly added server as a new running server.
In an optional manner, dividing a task to be executed into at least two first fragmentation tasks according to a preset fragmentation strategy, and distributing the first fragmentation tasks to corresponding registration servers, includes:
acquiring the number of currently running servers;
determining the number of first fragmentation tasks according to the number of currently running servers;
generating the corresponding number of first fragmentation tasks of the task to be executed according to the preset fragmentation strategy and the number of first fragmentation tasks;
and distributing the first fragmentation tasks to the corresponding running servers for task processing.
In an optional manner, the preset fragmentation strategy is an average allocation algorithm strategy, a job name hash value odd-even algorithm strategy, a rotation fragmentation strategy, or a user-defined fragmentation strategy.
According to another aspect of the embodiments of the present invention, there is provided a distributed scheduling apparatus, including:
a first processing module, configured to divide a task to be executed into at least two first fragmentation tasks according to a preset fragmentation strategy and distribute the first fragmentation tasks to the corresponding registration servers;
an acquisition module, configured to acquire the working states of all the registration servers; the working state comprises a running state and a downtime state, and the registration servers comprise running servers in the running state and downtime servers in the downtime state;
the second processing module is used for dividing the unfinished part of the tasks to be executed into at least two second fragmentation tasks according to the running server and the preset fragmentation strategy if the working state of at least one registration server is in a downtime state and the tasks to be executed are unfinished;
and the operation module is used for distributing the second fragmentation task to the operation server.
According to another aspect of the embodiments of the present invention, there is provided a distributed scheduling system, including: a registration center and at least one registration server, wherein the registration servers comprise running servers in a running state and downtime servers in a downtime state;
the registration center is configured to: divide a task to be executed into at least two first fragmentation tasks according to a preset fragmentation strategy, and distribute the first fragmentation tasks to the corresponding registration servers;
the registration server is configured to: process the first fragmentation task;
the registration center is further configured to: acquire the working states of all the registration servers; if the working state of at least one registration server is the downtime state and the task to be executed is not completed, divide the uncompleted part of the task to be executed into at least two second fragmentation tasks according to the running servers and the preset fragmentation strategy, and distribute the second fragmentation tasks to the running servers;
the running server is configured to: process the second fragmentation task.
According to yet another aspect of the embodiments of the present invention, there is provided a computer-readable storage medium having at least one executable instruction stored therein, the executable instruction causing a distributed scheduling apparatus to perform the operations of the distributed scheduling method described above.
The embodiment of the invention uses data fragmentation to spread the task to be executed across different running servers. When the working state of a running server changes, new fragmentation tasks are generated and sent to the corresponding running servers for processing. This overcomes the upper limit on the computing capacity of a single server, reduces the influence of partial task failures on the whole system while fully considering the task processing progress of each node, and realizes elastic capacity expansion of task processing.
The foregoing is only an overview of the technical solutions of the embodiments of the present invention. So that the technical means of the embodiments can be understood more clearly and implemented according to the content of the description, and so that the foregoing and other objects, features, and advantages of the embodiments are easier to understand, specific embodiments of the present invention are described below.
Drawings
The drawings are only for purposes of illustrating embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
Fig. 1 is a schematic flowchart of a first embodiment of the distributed scheduling method provided by the present invention;
Fig. 2 is a schematic flowchart of step 110 in the first embodiment of the distributed scheduling method provided by the present invention;
Fig. 3 is a schematic flowchart of step 130 in the first embodiment of the distributed scheduling method provided by the present invention;
Fig. 4 is a schematic flowchart of a second embodiment of the distributed scheduling method provided by the present invention;
Fig. 5 is a schematic structural diagram of an embodiment of the distributed scheduling apparatus provided by the present invention;
Fig. 6 is a first structural diagram of an embodiment of the distributed scheduling system provided by the present invention;
Fig. 7 is a second structural diagram of an embodiment of the distributed scheduling system provided by the present invention.
Detailed Description
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention may be embodied in various forms and should not be limited to the embodiments set forth herein.
Fig. 1 shows a flowchart of a first embodiment of the distributed scheduling method provided by the present invention, which is performed by a registration center. As shown in Fig. 1, the method comprises the following steps:
Step 110: dividing a task to be executed into at least two first fragmentation tasks according to a preset fragmentation strategy, and distributing the first fragmentation tasks to corresponding registration servers.
The task to be executed is divided into at least two first fragmentation tasks according to the selected preset fragmentation strategy, and the number of first fragmentation tasks is the same as the number of running servers.
It should be noted that this embodiment addresses distributed scheduling across multiple running servers. When there is only one running server, that server processes the task to be executed directly, without the preset fragmentation strategy and without the method of this embodiment.
Each first fragmentation task initially corresponds to one running server. When any running server can no longer run normally, for example because of downtime, several fragmentation tasks may correspond to one running server.
For example, suppose 3 running servers are available for task processing and the preset fragmentation strategy selected by the user is the average allocation algorithm strategy. The registration center then divides the task to be executed equally into 3 first fragmentation tasks according to the average allocation algorithm strategy and distributes them to the corresponding running servers for processing.
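The 3-server example above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `average_allocation`, the server names, and the representation of a task as a list of integer work items are hypothetical and do not come from the patent text.

```python
from typing import Dict, List, Sequence


def average_allocation(items: Sequence[int], servers: Sequence[str]) -> Dict[str, List[int]]:
    """Split `items` into len(servers) contiguous shards of near-equal size."""
    if not servers:
        raise ValueError("at least one running server is required")
    base, extra = divmod(len(items), len(servers))
    shards: Dict[str, List[int]] = {}
    start = 0
    for index, server in enumerate(servers):
        # The first `extra` servers take one additional item each.
        size = base + (1 if index < extra else 0)
        shards[server] = list(items[start:start + size])
        start += size
    return shards


# Hypothetical task of 9 work items spread over 3 running servers.
first_shards = average_allocation(range(9), ["server-a", "server-b", "server-c"])
print(first_shards)
```

The number of shards always equals the number of running servers; only the size of each shard depends on how many work items the task contains.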
Step 120: acquiring the working states of all the registration servers; the working state comprises a running state and a downtime state, and the registration servers comprise running servers in the running state and downtime servers in the downtime state.
Specifically, the registration center determines whether each currently registered server is a running server in the running state or a downtime server in the downtime state.
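The text does not say how the registration center learns a server's working state. As one possible sketch, assuming (hypothetically) that each registration server reports a heartbeat timestamp and is treated as down once the heartbeat is older than a timeout, the classification could look like this. The names `classify_servers`, `last_heartbeat`, and `timeout` are illustrative only.

```python
from typing import Dict, List, Tuple


def classify_servers(last_heartbeat: Dict[str, float], now: float,
                     timeout: float = 10.0) -> Tuple[List[str], List[str]]:
    """Return (running, down) server lists based on heartbeat age in seconds."""
    running: List[str] = []
    down: List[str] = []
    for server, seen_at in last_heartbeat.items():
        # Assumption: a heartbeat older than `timeout` means the downtime state.
        if now - seen_at <= timeout:
            running.append(server)
        else:
            down.append(server)
    return running, down


running, down = classify_servers(
    {"server-a": 100.0, "server-b": 99.0, "server-c": 80.0}, now=102.0)
print(running, down)
```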
Step 130: if the working state of at least one registration server is the downtime state and the task to be executed is not completed, dividing the uncompleted part of the task to be executed into at least two second fragmentation tasks according to the running servers and the preset fragmentation strategy.
When the registration center finds that the working state of any registration server is the downtime state, it checks whether the task to be executed has been completed. If not, the first fragmentation tasks previously distributed to the downtime server are fragmented again, according to the number of running servers in the running state and the preset fragmentation strategy, into at least two second fragmentation tasks. The number of second fragmentation tasks is the same as the number of currently available registration servers (that is, the running servers).
The content of the second fragmentation is the uncompleted remainder of the task to be executed, that is, all first fragmentation tasks corresponding to the downtime servers in the downtime state.
It should be noted that if only 1 registration server is available, that server processes the uncompleted part of the task to be executed directly, without the preset fragmentation strategy.
For example, suppose there are 3 registration servers, and the working state of 1 of them changes to the downtime state before its first fragmentation task is completed. The first fragmentation task corresponding to the downtime server is then divided into 2 second fragmentation tasks according to the preset fragmentation strategy (such as the average allocation algorithm strategy).
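A minimal sketch of this 3-server example follows, assuming an even split and assuming each server's workload is kept as a list of shards. The function name `reassign_after_failure` and the data layout are hypothetical. After the failure, each surviving server holds its own first fragmentation task plus one second fragmentation task.

```python
from typing import Dict, List, Sequence


def reassign_after_failure(assignments: Dict[str, List[List[int]]],
                           down: Sequence[str]) -> Dict[str, List[List[int]]]:
    """Re-split the shards of down servers evenly across the surviving servers."""
    running = [server for server in assignments if server not in down]
    if not running:
        raise RuntimeError("no running server left to take over the work")
    # Collect every work item that belonged to a down server.
    orphaned = [item for server in down for shard in assignments[server] for item in shard]
    base, extra = divmod(len(orphaned), len(running))
    updated = {server: [list(shard) for shard in assignments[server]] for server in running}
    start = 0
    for index, server in enumerate(running):
        end = start + base + (1 if index < extra else 0)
        if end > start:
            # Each survivor receives one second fragmentation task.
            updated[server].append(orphaned[start:end])
        start = end
    return updated


before = {"server-a": [[1, 2, 3]], "server-b": [[4, 5, 6]], "server-c": [[7, 8, 9]]}
after = reassign_after_failure(before, down=["server-c"])
print(after)
```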
Step 140: distributing the second fragmentation tasks to the running servers.
The registration center distributes each second fragmentation task to the corresponding running server for processing, so as to complete the task to be executed.
The embodiment of the invention uses data fragmentation to spread the task to be executed across different running servers. When the working state of a running server changes, new fragmentation tasks are generated and sent to the corresponding running servers for processing. This overcomes the upper limit on the computing capacity of a single server, reduces the influence of partial task failures on the whole system while fully considering the task processing progress of each node, and realizes elastic capacity expansion of task processing.
Fig. 2 shows a detailed flowchart of step 110 in the first embodiment of the distributed scheduling method provided in the present invention. As shown in fig. 2, step 110 includes the following steps:
Step 111: acquiring the number of currently running servers.
The number of currently running servers is obtained through the registration center.
Step 112: determining the number of first fragmentation tasks according to the number of currently running servers.
The number of first fragmentation tasks is the same as the number of currently running servers.
Step 113: generating the corresponding number of first fragmentation tasks of the task to be executed according to the preset fragmentation strategy and the number of first fragmentation tasks.
The preset fragmentation strategy affects the size of each first fragmentation task but not the number of first fragmentation tasks.
For example, when the preset fragmentation strategy is the average allocation algorithm strategy and there are 3 currently running servers, the registration center divides the task to be executed equally into 3 first fragmentation tasks.
Step 114: distributing the first fragmentation tasks to the corresponding running servers for task processing.
The registration center distributes each first fragmentation task to the corresponding running server for processing according to the preset fragmentation strategy.
It should be noted that the preset fragmentation strategy may be an average allocation algorithm strategy, a job name hash value odd-even algorithm strategy, or a rotation fragmentation strategy; a user-defined fragmentation strategy may also be set according to the user's requirements, and no limitation is imposed here.
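The text names these strategies without defining them. The sketch below shows one plausible reading, stated as an assumption: the odd-even strategy orders the servers ascending or descending depending on the parity of the job name hash, and the rotation strategy rotates the server list by the hash, before both apply an even split. All function names are hypothetical, and `zlib.crc32` is used only as a stable stand-in for whatever hash a real system would use.

```python
import zlib
from typing import Dict, List, Sequence


def average_allocation(items: Sequence[int], servers: Sequence[str]) -> Dict[str, List[int]]:
    """Even split: the first (len(items) % len(servers)) servers get one extra item."""
    base, extra = divmod(len(items), len(servers))
    shards: Dict[str, List[int]] = {}
    start = 0
    for index, server in enumerate(servers):
        end = start + base + (1 if index < extra else 0)
        shards[server] = list(items[start:end])
        start = end
    return shards


def _stable_hash(job_name: str) -> int:
    # Assumption: any deterministic hash works; crc32 is stable across runs.
    return zlib.crc32(job_name.encode("utf-8"))


def odd_even_by_job_name(job_name: str, items: Sequence[int],
                         servers: Sequence[str]) -> Dict[str, List[int]]:
    """Odd hash: servers in ascending order; even hash: descending order."""
    ordered = sorted(servers)
    if _stable_hash(job_name) % 2 == 0:
        ordered.reverse()
    return average_allocation(items, ordered)


def rotate_by_job_name(job_name: str, items: Sequence[int],
                       servers: Sequence[str]) -> Dict[str, List[int]]:
    """Rotate the sorted server list by (hash mod server count), then split evenly."""
    ordered = sorted(servers)
    offset = _stable_hash(job_name) % len(ordered)
    return average_allocation(items, ordered[offset:] + ordered[:offset])


print(odd_even_by_job_name("order-timeout-job", list(range(7)), ["s2", "s1", "s3"]))
print(rotate_by_job_name("order-timeout-job", list(range(7)), ["s2", "s1", "s3"]))
```

Under this reading, every strategy produces the same number of shards; the hash-based variants only change which server receives the larger shards, so that different jobs do not always load the same server first.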
The embodiment of the invention further specifies that the number of fragmentation tasks is determined by the number of running servers, which overcomes the upper limit on the processing capacity of a single running server. For example, when a running server goes down, the first fragmentation tasks corresponding to the downtime server are redistributed, so that elastic capacity expansion limits the influence of a single server failure on the whole system.
Fig. 3 shows a detailed flowchart of step 130 in the first embodiment of the distributed scheduling method provided in the present invention. As shown in fig. 3, step 130 includes the steps of:
Step 131: if the working state of at least one running server is the downtime state, acquiring the processing progress of each first fragmentation task.
When the working state of any running server changes to the downtime state, the registration center obtains the progress of each running server in processing its first fragmentation task.
Furthermore, after step 131, if none of the first fragmentation tasks has started to be processed, at least two second fragmentation tasks of the task to be executed are regenerated according to the running servers and the preset fragmentation strategy, and the second fragmentation tasks are distributed to the running servers.
Specifically, when none of the first fragmentation tasks has started to be processed, the registration center regenerates second fragmentation tasks, equal in number to the currently running servers, according to the number of currently running servers and the preset fragmentation strategy, and distributes each second fragmentation task to the corresponding running server.
Step 132: if any one of the first fragmentation tasks has started to be processed and is not finished, repackaging the first fragmentation tasks corresponding to the downtime server to form a first task package.
When the registration center detects that any first fragmentation task has started to be processed and is not finished, it repackages the first fragmentation tasks corresponding to the downtime server to form a first task package. If there is only one downtime server, its first fragmentation task is taken as the first task package; if there are several downtime servers, all first fragmentation tasks corresponding to all of them are packaged together to form the first task package.
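The branching in steps 131 and 132 can be sketched as a small decision function. This is an illustration under assumptions: the `Progress` enum, the function name `plan_recovery`, and the returned action labels are hypothetical, and the whole shard of a downtime server is treated as unfinished.

```python
from enum import Enum
from typing import Dict, List, Sequence, Tuple


class Progress(Enum):
    NOT_STARTED = "not_started"
    IN_PROGRESS = "in_progress"
    DONE = "done"


def plan_recovery(shards: Dict[str, List[int]], progress: Dict[str, Progress],
                  down_servers: Sequence[str]) -> Tuple[str, List[int]]:
    """Decide what to re-fragment after a server goes down.

    Returns ("regenerate", all items) when no shard has started,
    ("repackage", items of the down servers) when work is under way,
    and ("nothing", []) when every shard is already done.
    """
    states = list(progress.values())
    if all(state is Progress.DONE for state in states):
        return "nothing", []
    if all(state is Progress.NOT_STARTED for state in states):
        # Nothing has run yet, so the whole task can be fragmented afresh.
        everything = sorted(item for items in shards.values() for item in items)
        return "regenerate", everything
    # Work is under way: only the down servers' shards form the first task package.
    package = [item for server in down_servers for item in shards[server]]
    return "repackage", package


shards = {"server-a": [0, 1], "server-b": [2, 3], "server-c": [4, 5]}
progress = {"server-a": Progress.IN_PROGRESS,
            "server-b": Progress.NOT_STARTED,
            "server-c": Progress.IN_PROGRESS}
print(plan_recovery(shards, progress, ["server-c"]))
```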
Step 133: dividing the first task package into at least two second fragmentation tasks according to the running servers and the preset fragmentation strategy.
Step 133 specifically includes:
Step 1331: determining whether the number of running servers meets the condition of the preset fragmentation strategy.
The condition of the preset fragmentation strategy may be that the number of running servers is at least two. If the number of running servers is not less than two, the condition is met and the subsequent work is executed; if it is less than two, the second fragmentation is not performed.
Specifically, the registration center detects the number of available running servers among all the registration servers.
Step 1332: if so, dividing the first task package into at least two second fragmentation tasks according to the preset fragmentation strategy.
If the condition is met, the corresponding number of second fragmentation tasks is generated according to the preset fragmentation strategy and the number of running servers.
If the condition is not met and there is only one running server, the tasks in the first task package may be sent to that running server for processing.
The condition of the preset fragmentation strategy can be other conditions, the preset fragmentation strategy is executed only when the condition is met, and a default surplus fragmentation strategy is executed when the condition is not met.
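Steps 1331 and 1332 amount to a guard in front of the split. A minimal sketch follows, assuming the condition is "at least two running servers" and an even split as the preset strategy; the names `split_task_package` and `MIN_RUNNING_SERVERS` are hypothetical.

```python
from typing import Dict, List, Sequence

# Assumed condition of the preset fragmentation strategy: at least two running servers.
MIN_RUNNING_SERVERS = 2


def split_task_package(package: Sequence[int],
                       running_servers: Sequence[str]) -> Dict[str, List[int]]:
    """Split the first task package only when enough running servers exist."""
    if not running_servers:
        raise ValueError("no running server available")
    if len(running_servers) < MIN_RUNNING_SERVERS:
        # Condition not met: the single running server takes the whole package.
        return {running_servers[0]: list(package)}
    base, extra = divmod(len(package), len(running_servers))
    second_shards: Dict[str, List[int]] = {}
    start = 0
    for index, server in enumerate(running_servers):
        end = start + base + (1 if index < extra else 0)
        second_shards[server] = list(package[start:end])
        start = end
    return second_shards


print(split_task_package([7, 8, 9], ["server-a", "server-b"]))
print(split_task_package([7, 8, 9], ["server-a"]))
```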
The embodiment of the invention further specifies that when the working state of a running server changes to the downtime state, the first fragmentation tasks corresponding to the downtime server are redistributed, so that elastic capacity expansion limits the influence of a single server failure on the whole system.
Fig. 4 shows a flowchart of a second embodiment of a distributed scheduling method provided by the present invention, where the method is executed by a registration center, and the registration server further includes a new server. As shown in fig. 4, the method comprises the steps of:
Step 210: dividing a task to be executed into at least two first fragmentation tasks according to a preset fragmentation strategy, and distributing the first fragmentation tasks to corresponding registration servers.
Step 220: acquiring the working states of all the registration servers; the working state comprises a running state, a downtime state and a newly added state, and the registration servers comprise running servers in the running state, downtime servers in the downtime state and newly added servers in the newly added state.
Step 230: if a newly added server exists, acquiring the processing progress of each first fragmentation task.
When the registration center finds that any registration server is a newly added server, it obtains the processing progress of each first fragmentation task.
Step 240: if at least one first fragmentation task is not finished, maintaining the first fragmentation tasks until the task to be executed is finished.
When the registration center detects that any first fragmentation task is not completed, the original fragmentation across the running servers is maintained until all the first fragmentation tasks are completed.
Step 250: if all the first fragmentation tasks are completed, taking the newly added server as a new running server.
When all the first fragmentation tasks are completed, the registration center takes the newly added server as a new running server for subsequent tasks to be executed.
For example, suppose there are originally 4 running servers and 1 server is added while the task to be executed is being processed. The newly added server does not take part in processing the current task; only after the current task has been completed is it used as a running server for the next task to be executed, so that 5 running servers process the new task.
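The deferred join in this example can be sketched as a small state holder. The class name `Registry`, its methods, and the shard counter are hypothetical; the sketch only models the rule that a server added mid-task waits until every first fragmentation task has finished.

```python
from typing import List, Sequence


class Registry:
    """Toy registration center that defers newly added servers until the task ends."""

    def __init__(self, running: Sequence[str]) -> None:
        self.running: List[str] = list(running)
        self.pending: List[str] = []  # servers in the "newly added" state
        self.unfinished_shards = 0

    def start_task(self) -> int:
        # One first fragmentation task per running server.
        self.unfinished_shards = len(self.running)
        return self.unfinished_shards

    def register(self, server: str) -> None:
        if self.unfinished_shards > 0:
            # A task is in flight: keep the current fragmentation unchanged.
            self.pending.append(server)
        else:
            self.running.append(server)

    def shard_finished(self) -> None:
        if self.unfinished_shards == 0:
            raise RuntimeError("no shard is in flight")
        self.unfinished_shards -= 1
        if self.unfinished_shards == 0:
            # All shards are done: promote the newly added servers.
            self.running.extend(self.pending)
            self.pending.clear()


registry = Registry(["s1", "s2", "s3", "s4"])
shard_count = registry.start_task()
registry.register("s5")  # joins while the task is still being processed
running_during_task = list(registry.running)
for _ in range(shard_count):
    registry.shard_finished()
print(running_during_task, registry.running, registry.start_task())
```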
In this embodiment, when a newly added server is detected while the task to be executed is being processed, it is determined whether the task to be executed has been completed; once it has, the newly added server is used as a new running server for task processing. This realizes elastic capacity expansion in task scheduling, further overcomes the upper limit on the computing capacity of a single running server, and improves task processing efficiency.
Fig. 5 shows a schematic structural diagram of an embodiment of the distributed scheduling apparatus provided by the present invention. As shown in Fig. 5, the apparatus 300 includes: a first processing module 310, an acquisition module 320, a second processing module 330, and an operation module 340.
The first processing module 310 is configured to divide a task to be executed into at least two first fragmentation tasks according to a preset fragmentation strategy, and distribute the first fragmentation tasks to the corresponding registration servers;
the acquisition module 320 is configured to acquire the working states of all the registration servers; the working state comprises a running state and a downtime state, and the registration servers comprise running servers in the running state and downtime servers in the downtime state;
the second processing module 330 is configured to, if the working state of at least one registration server is the downtime state and the task to be executed is not completed, divide the uncompleted part of the task to be executed into at least two second fragmentation tasks according to the running servers and the preset fragmentation strategy;
the operation module 340 is configured to distribute the second fragmentation tasks to the running servers.
In an optional manner, the second processing module 330 is specifically configured to:
if the working state of at least one running server is a downtime state, acquiring the processing progress of each first fragment task;
if any one of the first sliced tasks starts to be processed and is not finished, repacking the first sliced task corresponding to the downtime server to form a first task packet;
and dividing the first task packet into at least two second fragmentation tasks according to the running server and the preset fragmentation strategy.
In an optional manner, after the processing progress of each first fragmentation task is acquired because at least one running server is in the downtime state, the following is further performed:
if none of the first fragmentation tasks has started processing, regenerating at least two second fragmentation tasks of the task to be executed according to the running servers and the preset fragmentation policy, and allocating the second fragmentation tasks to the running servers.
In an optional manner, the dividing the first task package into at least two second sharded tasks according to the operating server and the preset sharding policy includes:
determining whether the number of the operating servers meets the condition of the preset fragmentation strategy;
and if so, dividing the first task packet into at least two second fragmentation tasks according to the preset fragmentation strategy.
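The check described above may be sketched in Python as follows. The embodiment does not state what condition the preset fragmentation policy imposes on the number of running servers; the sketch assumes a hypothetical minimum of two, chosen only because at least two second fragmentation tasks are required. The function name `split_first_task_packet`, the `min_servers` parameter, and the strided split are likewise assumptions.

```python
def split_first_task_packet(packet, running_servers, min_servers=2):
    """Split the packet across running servers, or return None if too few servers."""
    n = len(running_servers)
    if n < min_servers:
        # Condition not met; the source does not say what happens in this case.
        return None
    # Strided split: server i takes items i, i + n, i + 2n, ...
    return {server: packet[i::n] for i, server in enumerate(running_servers)}
```

Returning `None` leaves the decision to the caller, which could for example retry once more running servers are registered.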
In an optional manner, the registration servers further include a newly added server; after the acquisition module 320 acquires the working states of all registration servers, the following is further performed:
if a newly added server exists, acquiring the processing progress of each first fragmentation task;
if at least one first fragmentation task is not finished, keeping the current first fragmentation tasks unchanged until the task to be executed is finished;
and if all the first fragmentation tasks are completed, taking the newly added server as a new running server.
In an optional manner, the first processing module 310 is specifically configured to:
acquiring the number of currently operating servers;
determining the number of the first fragmentation tasks according to the number of the current running servers;
generating a corresponding number of first slicing tasks of the tasks to be executed according to the preset slicing strategy and the number of the first slicing tasks;
and distributing the first fragment task to a corresponding running server for task processing.
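The four steps above may be sketched in Python as follows. The sketch assumes one first fragmentation task per running server and assumes that each server selects its own work items by taking the item identifier modulo the fragment count; neither choice is stated in the embodiment, and the names `plan_first_shards` and `items_for` are hypothetical.

```python
def plan_first_shards(running_servers):
    """Give each running server a (shard_index, shard_total) pair."""
    total = len(running_servers)          # number of currently running servers
    if total < 2:
        raise ValueError("at least two first fragmentation tasks are required")
    # Sorting makes the assignment deterministic across registry instances.
    return {server: (index, total) for index, server in enumerate(sorted(running_servers))}


def items_for(shard, item_ids):
    """Select the work items that belong to one shard."""
    index, total = shard
    return [i for i in item_ids if i % total == index]
```

With three running servers, the server holding fragment `(1, 3)` would process items 1 and 4 out of the identifiers 0 to 6.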
In an optional manner, the preset fragmentation policy is an average allocation algorithm policy, a job-name hash-value odd-even algorithm policy, a rotation fragmentation policy, or a user-defined fragmentation policy.
The embodiment of the invention distributes the task to be executed across different running servers by means of data fragmentation. When the registration state of a running server changes, new fragmentation tasks are generated and sent to the corresponding running servers for processing. This overcomes the upper limit on the computing capacity of a single server, reduces the impact of a partial task failure on the whole system while fully taking the task processing progress of each node into account, and allows task processing capacity to be expanded flexibly.
Fig. 6 shows a first structural diagram of an embodiment of a distributed scheduling system 400 provided by the present invention. As shown in fig. 6, the distributed scheduling system 400 includes: a registry 410 and at least one registration server 420; the registration servers 420 comprise a running server 421 in a running state and a down server 422 in a downtime state;
the registry 410 is configured to: dividing a task to be executed into at least two first fragmentation tasks according to a preset fragmentation strategy, and distributing the first fragmentation tasks to corresponding registration servers 420;
the registration server 420 is configured to: performing task processing on the first fragmentation task;
the registry 410 is further configured to: acquiring the working states of all the registration servers 420; if the working state of at least one registration server 420 is a down state and the task to be executed is not completed, dividing the uncompleted part of the task to be executed into at least two second fragmentation tasks according to the running server 421 and the preset fragmentation policy, and distributing the second fragmentation tasks to the running server 421;
the running server 421 is configured to: process the second fragmentation tasks.
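The embodiment does not specify how the registry 410 acquires the working states of the registration servers 420. Purely as an assumption for illustration, the following Python sketch classifies servers by the age of their last heartbeat; the heartbeat mechanism, the `timeout` value, and the function name `classify_servers` are hypothetical and do not appear in the source.

```python
def classify_servers(last_heartbeat, now, timeout=10.0):
    """Split registered servers into running and down lists.

    last_heartbeat maps a server name to the time (in seconds) of its most
    recent heartbeat; a server silent for longer than `timeout` counts as down.
    """
    running = sorted(s for s, t in last_heartbeat.items() if now - t <= timeout)
    down = sorted(s for s, t in last_heartbeat.items() if now - t > timeout)
    return running, down
```

Any other detection mechanism that yields a running list and a down list would serve the same role in the flow described above.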
In an optional manner, the registry 410 is specifically configured to:
if the working state of at least one of the running servers 421 changes to the downtime state, acquiring the processing progress of each first fragmentation task;
if any first fragmentation task has started processing but is not completed, repackaging the first fragmentation tasks corresponding to the down server 422 to form a first task packet;
dividing the first task packet into at least two second fragmentation tasks according to the running servers 421 and the preset fragmentation policy, and allocating the second fragmentation tasks to the running servers 421.
In an optional manner, after the processing progress of each first fragmentation task is acquired because at least one running server 421 is in the downtime state, the following is further performed:
if none of the first fragmentation tasks has started processing, regenerating at least two second fragmentation tasks of the task to be executed according to the running servers 421 and the preset fragmentation policy, and allocating the second fragmentation tasks to the running servers 421.
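The choice between regenerating fragments for the whole task and repackaging only the fragments of the down server may be sketched in Python as follows. The `State` enumeration and the function name `choose_recovery` are hypothetical. The source covers only the cases "no fragment started" and "some fragment started but unfinished"; treating every other unfinished mix as a repackaging case is an assumption of this sketch.

```python
from enum import Enum


class State(Enum):
    NOT_STARTED = 0
    IN_PROGRESS = 1
    DONE = 2


def choose_recovery(states, any_server_down):
    """Return 'none', 'regenerate', or 'repack' for the given shard states."""
    if not any_server_down or all(s is State.DONE for s in states):
        return "none"          # nothing failed, or the whole task is finished
    if all(s is State.NOT_STARTED for s in states):
        return "regenerate"    # re-divide the entire task among running servers
    return "repack"            # re-divide only the down servers' shards
```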
In an optional manner, the dividing the first task package into at least two second sharding tasks according to the operating server 421 and the preset sharding policy includes:
determining whether the number of the running servers 421 meets the condition of the preset fragmentation policy;
and if so, dividing the first task packet into at least two second fragmentation tasks according to the preset fragmentation strategy.
Fig. 7 shows a second structural diagram of an embodiment of a distributed scheduling system 400 provided by the present invention. As shown in fig. 7, in an optional manner, the registration servers 420 further include a newly added server 423; after the working states of all the registration servers 420 are acquired, the following is further performed:
if a newly added server 423 exists, acquiring the processing progress of each first fragmentation task;
if at least one first fragmentation task is not finished, keeping the current first fragmentation tasks unchanged until the task to be executed is finished;
if all the first fragmentation tasks are completed, taking the newly added server 423 as a new running server 421.
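The handling of a newly added server 423 described above may be sketched in Python as follows. The function name `admit_new_servers` and the boolean completion map are assumptions for illustration; the point shown is only that a newly added server joins the running set after every first fragmentation task has finished, so that work in progress is not re-divided.

```python
def admit_new_servers(running, newly_added, shard_finished):
    """Promote newly added servers only once every first shard has finished.

    shard_finished maps a shard id to True when that shard is complete.
    Returns the updated (running, newly_added) lists.
    """
    if newly_added and all(shard_finished.values()):
        return running + newly_added, []
    # At least one shard is still unfinished: keep the current assignment.
    return running, newly_added
```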
In an optional manner, the registry 410 is specifically configured to:
acquiring the number of currently running servers 421;
determining the number of the first fragmentation tasks according to the number of the currently running servers 421;
generating a corresponding number of first slicing tasks of the tasks to be executed according to the preset slicing strategy and the number of the first slicing tasks;
and allocating the first fragmentation tasks to the corresponding running servers 421.
In an optional manner, the preset fragmentation policy is an average allocation algorithm policy, a job-name hash-value odd-even algorithm policy, a rotation fragmentation policy, or a user-defined fragmentation policy.
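The embodiment names these policies without defining them. The following Python sketch shows one plausible reading of the first three, in which each policy maps fragment indices to servers. All function names, the use of `zlib.crc32` as a stable hash of the job name, and the exact ordering rules (ascending server order for an odd hash, descending for an even hash, and rotation by the hash modulo the server count) are assumptions of this example, not details taken from the source.

```python
import zlib


def average_allocation(servers, shard_count):
    """Contiguous blocks; leftover shards go one each to the first servers."""
    base, extra = divmod(shard_count, len(servers))
    result, start = {}, 0
    for i, server in enumerate(servers):
        size = base + (1 if i < extra else 0)
        result[server] = list(range(start, start + size))
        start += size
    return result


def odd_even_by_job_name(servers, shard_count, job_name):
    """Odd job-name hash: ascending server order; even hash: descending order."""
    ordered = sorted(servers)
    if zlib.crc32(job_name.encode()) % 2 == 0:
        ordered.reverse()
    return average_allocation(ordered, shard_count)


def rotate_by_job_name(servers, shard_count, job_name):
    """Rotate the sorted server list by an offset derived from the job name."""
    ordered = sorted(servers)
    offset = zlib.crc32(job_name.encode()) % len(ordered)
    return average_allocation(ordered[offset:] + ordered[:offset], shard_count)
```

The second and third variants only change which servers receive the larger blocks, so that different jobs do not always load the same server most heavily. A user-defined policy would be any other function with the same inputs and output shape.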
The embodiment of the invention distributes the task to be executed across different running servers by means of data fragmentation. When the registration state of a running server changes, new fragmentation tasks are generated and sent to the corresponding running servers for processing. This overcomes the upper limit on the computing capacity of a single server, reduces the impact of a partial task failure on the whole system while fully taking the task processing progress of each node into account, and allows task processing capacity to be expanded flexibly.
An embodiment of the present invention provides a computer-readable storage medium, where the storage medium stores at least one executable instruction, and when the executable instruction runs on a distributed scheduling apparatus, the distributed scheduling apparatus is enabled to execute a distributed scheduling method in any of the above method embodiments.
The executable instructions may be specifically configured to cause the distributed scheduling apparatus to perform the following operations:
dividing a task to be executed into at least two first fragmentation tasks according to a preset fragmentation strategy, and distributing the first fragmentation tasks to corresponding registration servers;
acquiring the working states of all the registration servers; the working state comprises an operating state and a downtime state, and the registration server comprises an operating server in the operating state and a downtime server in the downtime state;
if the working state of at least one registration server is down and the task to be executed is not completed, dividing the uncompleted part of the task to be executed into at least two second fragmentation tasks according to the running server and the preset fragmentation strategy;
and distributing the second fragmentation task to the running server.
In an optional manner, if the working state of at least one running server is the downtime state and the task to be executed is not completed, dividing the uncompleted part of the task to be executed into at least two second fragmentation tasks according to the running servers and the preset fragmentation policy includes:
if the working state of at least one running server changes to the downtime state, acquiring the processing progress of each first fragmentation task;
if any first fragmentation task has started processing but is not finished, repackaging the first fragmentation tasks corresponding to the down server to form a first task packet;
and dividing the first task packet into at least two second fragmentation tasks according to the running servers and the preset fragmentation policy.
In an optional manner, after the processing progress of each first fragmentation task is acquired because at least one running server is in the downtime state, the operations further include:
if none of the first fragmentation tasks has started processing, regenerating at least two second fragmentation tasks of the task to be executed according to the running servers and the preset fragmentation policy, and allocating the second fragmentation tasks to the running servers.
In an optional manner, the dividing the first task package into at least two second sharded tasks according to the operating server and the preset sharding policy includes:
determining whether the number of the operating servers meets the condition of the preset fragmentation strategy;
and if so, dividing the first task packet into at least two second fragmentation tasks according to the preset fragmentation strategy.
In an optional manner, the registration servers further include a newly added server; after the working states of all the registration servers are acquired, the operations further include:
if a newly added server exists, acquiring the processing progress of each first fragmentation task;
if at least one first fragmentation task is not finished, keeping the current first fragmentation tasks unchanged until the task to be executed is finished;
and if all the first fragmentation tasks are completed, taking the newly added server as a new running server.
In an optional manner, the dividing a task to be executed into at least two first sharding tasks according to a preset sharding policy, and allocating the first sharding tasks to corresponding registration servers includes:
acquiring the number of currently operating servers;
determining the number of the first fragmentation tasks according to the number of the current running servers;
generating a corresponding number of first slicing tasks of the tasks to be executed according to the preset slicing strategy and the number of the first slicing tasks;
and distributing the first fragment task to a corresponding running server for task processing.
In an optional manner, the preset fragmentation policy is an average allocation algorithm policy, a job-name hash-value odd-even algorithm policy, a rotation fragmentation policy, or a user-defined fragmentation policy.
The embodiment of the invention distributes the task to be executed across different running servers by means of data fragmentation. When the registration state of a running server changes, new fragmentation tasks are generated and sent to the corresponding running servers for processing. This overcomes the upper limit on the computing capacity of a single server, reduces the impact of a partial task failure on the whole system while fully taking the task processing progress of each node into account, and allows task processing capacity to be expanded flexibly.
The algorithms or displays presented herein are not inherently related to any particular computer, virtual system, or other apparatus. In addition, embodiments of the present invention are not directed to any particular programming language.
In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. Similarly, in the above description of exemplary embodiments of the invention, various features of the embodiments are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. Accordingly, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art will appreciate that the modules in the device of an embodiment may be adaptively changed and disposed in one or more devices different from that embodiment. The modules, units, or components of the embodiments may be combined into one module, unit, or component, and may furthermore be divided into a plurality of sub-modules, sub-units, or sub-components. Such modules, units, or components may be combined in any combination, except where at least some of such features and/or processes or elements are mutually exclusive.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any ordering; these words may be interpreted as names. The steps in the above embodiments should not be construed as limiting the order of execution unless specified otherwise.

Claims (10)

1. A distributed scheduling method, comprising:
dividing a task to be executed into at least two first fragmentation tasks according to a preset fragmentation strategy, and distributing the first fragmentation tasks to corresponding registration servers;
acquiring the working states of all the registration servers; the working state comprises a running state and a downtime state, and the registration servers comprise a running server in the running state and a down server in the downtime state;
if the working state of at least one registration server is the downtime state and the task to be executed is not completed, dividing the uncompleted part of the task to be executed into at least two second fragmentation tasks according to the running server and the preset fragmentation strategy;
and distributing the second fragmentation tasks to the running server.
2. The method according to claim 1, wherein if the operating status of the at least one operating server is down and the task to be executed is not completed, dividing the portion of the task to be executed that is not completed into at least two second fragmented tasks according to the operating server and the preset fragmentation policy, including:
if the working state of at least one running server is a downtime state, acquiring the processing progress of each first sliced task;
if any one first slicing task starts to be processed and is not completed, repacking the first slicing task corresponding to the downtime server to form a first task packet;
and dividing the first task packet into at least two second fragmentation tasks according to the operation server and the preset fragmentation strategy.
3. The method according to claim 2, wherein after acquiring the processing progress of each of the first fragmented tasks if the operating status of the at least one operating server is down, the method further comprises:
and if all the first fragmentation tasks do not start to be processed, regenerating at least two second fragmentation tasks of the tasks to be executed according to the running server and the preset fragmentation strategy, and distributing the second fragmentation tasks to the running server.
4. The method of claim 2, wherein the partitioning the first task package into at least two second sharded tasks according to the operating server and the pre-set sharding policy comprises:
determining whether the number of the operating servers meets the condition of the preset fragmentation strategy;
and if so, dividing the first task packet into at least two second fragmentation tasks according to the preset fragmentation strategy.
5. The method of claim 1, wherein the registration server further comprises a newly added server; after the obtaining of the operating states of all the registration servers, the method further includes:
if the newly added server exists, acquiring the processing progress of each first fragment task;
if at least one first slicing task is not finished, maintaining the first slicing task until the task to be executed is finished;
and if all the first fragmentation tasks are completed, taking the newly added server as a new running server.
6. The method according to claim 1, wherein the dividing the task to be executed into at least two first sharding tasks according to a preset sharding policy and allocating the first sharding tasks to corresponding registration servers comprises:
acquiring the number of currently operating servers;
determining the number of the first fragmentation tasks according to the number of the current running servers;
generating a corresponding number of first slicing tasks of the tasks to be executed according to the preset slicing strategy and the number of the first slicing tasks;
and distributing the first fragmentation task to a corresponding running server for task processing.
7. The method of claim 1, wherein the preset fragmentation strategy is: an average allocation algorithm strategy, a job-name hash-value odd-even algorithm strategy, a rotation fragmentation strategy, or a user-defined fragmentation strategy.
8. A distributed scheduling apparatus, the apparatus comprising:
a first processing module, configured to divide a task to be executed into at least two first fragmentation tasks according to a preset fragmentation strategy and to distribute the first fragmentation tasks to the corresponding registration servers;
an acquisition module, configured to acquire the working states of all the registration servers; the working state comprises a running state and a downtime state, and the registration servers comprise a running server in the running state and a down server in the downtime state;
a second processing module, configured to divide the uncompleted part of the task to be executed into at least two second fragmentation tasks according to the running server and the preset fragmentation strategy if the working state of at least one registration server is the downtime state and the task to be executed is not completed;
and an execution module, configured to distribute the second fragmentation tasks to the running server.
9. A distributed scheduling system, the system comprising: the system comprises a registration center and at least one registration server, wherein the registration server comprises an operating server in an operating state and a downtime server in a downtime state;
the registry is configured to: dividing a task to be executed into at least two first fragmentation tasks according to a preset fragmentation strategy, and distributing the first fragmentation tasks to corresponding registration servers;
the registration server is configured to: performing task processing on the first fragmentation task;
the registry is further configured to: acquiring the working states of all the registration servers; if the working state of at least one registration server is a down state and the task to be executed is not completed, dividing the part which is not completed in the task to be executed into at least two second fragmentation tasks according to the running server and the preset fragmentation strategy, and distributing the second fragmentation tasks to the running server;
the running server is configured to: process the second fragmentation tasks.
10. A computer-readable storage medium having stored therein at least one executable instruction that, when executed on a distributed scheduling system, causes the distributed scheduling system to perform the operations of the distributed scheduling method of any one of claims 1-7.
CN202210220960.3A 2022-03-08 2022-03-08 Distributed scheduling method, device and system and computer readable storage medium Pending CN114579278A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210220960.3A CN114579278A (en) 2022-03-08 2022-03-08 Distributed scheduling method, device and system and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210220960.3A CN114579278A (en) 2022-03-08 2022-03-08 Distributed scheduling method, device and system and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN114579278A true CN114579278A (en) 2022-06-03

Family

ID=81778057

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210220960.3A Pending CN114579278A (en) 2022-03-08 2022-03-08 Distributed scheduling method, device and system and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN114579278A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190278684A1 (en) * 2018-03-09 2019-09-12 Toyota Motor Engineering & Manufacturing North America, Inc. Distributed Architecture for Fault Monitoring
CN111708627A (en) * 2020-06-22 2020-09-25 中国平安财产保险股份有限公司 Task scheduling method and device based on distributed scheduling framework
CN114090198A (en) * 2021-10-27 2022-02-25 青岛海尔科技有限公司 Distributed task scheduling method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination