WO2022222403A1 - Task distribution system, method, apparatus, computer device, and storage medium - Google Patents

Task distribution system, method, apparatus, computer device, and storage medium

Info

Publication number
WO2022222403A1
Authority
WO
WIPO (PCT)
Prior art keywords
task
distributed
distribution server
server
main distribution
Prior art date
Application number
PCT/CN2021/126624
Other languages
English (en)
French (fr)
Inventor
王欢
Original Assignee
上海商汤科技开发有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海商汤科技开发有限公司 filed Critical 上海商汤科技开发有限公司
Publication of WO2022222403A1 publication Critical patent/WO2022222403A1/zh

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0706 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
    • G06F11/0709 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a distributed system consisting of a plurality of standalone computer nodes, e.g. clusters, client-server systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083 Techniques for rebalancing the load in a distributed system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/546 Message passing systems or structures, e.g. queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/54 Indexing scheme relating to G06F9/54
    • G06F2209/548 Queue

Definitions

  • the present disclosure relates to the field of computer technology, and in particular, to a task distribution system, method, apparatus, computer device, and storage medium.
  • a user's task request is generally received through a server responsible for task distribution, and then the task is distributed to each server in the cluster that processes the task.
  • Embodiments of the present disclosure provide at least a task distribution system, method, apparatus, computer device, and storage medium.
  • an embodiment of the present disclosure provides a task distribution system, characterized by comprising a first main distribution server, at least one second main distribution server, and a plurality of execution servers, wherein:
  • the first main distribution server is used to: receive the tasks to be distributed sent by the client; determine, from the tasks to be distributed, a first task to be distributed by the first main distribution server and a second task to be distributed by the second main distribution server; send the first task to be distributed to the first execution server, and send the second task to be distributed to the second main distribution server; receive the first task execution result sent by the first execution server and the second task execution result sent by the second main distribution server, and send the first task execution result and the second task execution result to the corresponding clients respectively;
  • the second main distribution server is configured to receive the second task to be distributed sent by the first main distribution server and send the second task to be distributed to the second execution server; and to receive the second task execution result sent by the second execution server and send the second task execution result to the first main distribution server.
  • the system further includes sub-distribution servers, and the sub-distribution servers include a first sub-distribution server connected to the first main distribution server and the first execution server, and a second sub-distribution server connected to the second main distribution server and the second execution server;
  • the sub-distribution server is configured to forward the task to be distributed sent by the main distribution server to the corresponding execution server; forward the task execution result sent by the execution server to the corresponding main distribution server; wherein, the main distribution server includes the first master distribution server and the second master distribution server.
  • the first main distribution server after receiving the task to be distributed sent by the client, is further used for:
  • the sending the second task to be distributed to the second main distribution server includes:
  • the marked task queue is sent to the second master distribution server.
  • the continuity of the task distribution system during task execution can be improved; by marking the second task to be distributed in the task queue and sending the marked task queue to the second main distribution server, both the first main distribution server and the second main distribution server hold a complete task queue and can perform task distribution at the same time, which improves both the task processing efficiency and the fault tolerance of the task distribution system.
  • the first main distribution server when receiving the second task execution result sent by the second main distribution server, is used to:
  • the first task status queue includes the second tasks to be distributed that have been distributed and whose second task execution results have been received, the second tasks to be distributed that have been distributed but whose second task execution results have not been received, and the received second task execution results.
  • the task distribution progress of the second main distribution server can be sensed in time, so that when an abnormality occurs in the second main distribution server, the records made by the second main distribution server during task distribution are not lost, thus improving the fault tolerance of the task distribution system.
  • the first main distribution server is further used for:
  • the second task status queue is sent to the second main distribution server at preset time intervals.
  • the first main distribution server, after receiving and storing the first task status queue sent by the second main distribution server at preset time intervals, is further used for:
  • the assignment of tasks among the main distribution servers can be adjusted according to the task execution conditions, so that the loads of the first main distribution server and the second main distribution server during task distribution are balanced, improving the task distribution efficiency of the entire task distribution system.
  • the second main distribution server is further configured to:
  • by using the second main distribution server to redistribute the tasks of the abnormal first main distribution server, the probability that the system cannot run when a main distribution server is abnormal is reduced, and the fault tolerance of the task distribution system is improved.
  • the second main distribution server is further configured to:
  • the third task execution result sent by the second execution server is received, and the third task execution result is respectively sent to the corresponding client.
  • the second main distribution server is used to establish interaction with the client, so that subsequent task distribution can continue, thereby improving the fault tolerance of the task distribution system.
  • the first main distribution server is further used for:
  • the first main distribution server is also used for:
  • the first task to be distributed that has not received the first task execution result is distributed again.
  • the main distribution server can sense in time when the sub-distribution server is abnormal and redistribute the unfinished tasks to be distributed, thereby improving the fault tolerance of the task distribution system.
  • any main distribution server is used for:
  • the respective tasks to be distributed are distributed through at least one main distribution server, thereby reducing the probability of task loss caused by abnormal conditions and improving the fault tolerance of the task distribution system.
  • the sub-distribution servers include a third sub-distribution server connected to a plurality of main distribution servers, and both the first sub-distribution servers and the second sub-distribution servers include the third sub-distribution server;
  • the third sub-distribution server is used for:
  • the task execution result corresponding to the same task to be distributed is sent to the corresponding master distribution server with the largest number of targets.
  • the third sub-distribution server is provided with an exception handling mechanism, so that when it receives multiple identical tasks to be distributed at the same time, it can handle them through the corresponding processing mechanism; this reduces the probability of incorrect task execution results and of a collapse of the task distribution system caused by repeated task distribution, thereby improving the fault tolerance of the task distribution system.
  • an embodiment of the present disclosure further provides a task distribution method, which is applied to the first master distribution server, including:
  • an embodiment of the present disclosure further provides a task distribution device, including:
  • the receiving module is used to receive the task to be distributed sent by the client;
  • a determining module configured to determine, from the tasks to be distributed, a first task to be distributed to be distributed by the first main distribution server, and a second task to be distributed to be distributed by the second main distribution server;
  • a first sending module configured to send the first task to be distributed to a first execution server, and send the second task to be distributed to a second main distribution server;
  • the second sending module is configured to receive the first task execution result sent by the first execution server and the second task execution result sent by the second main distribution server, and send the first task execution result and the second task execution result to the corresponding clients respectively.
  • an embodiment of the present disclosure further provides a computer device, including: a processor, a memory, and a bus, where the memory stores machine-readable instructions executable by the processor; when the computer device runs, the processor and the memory communicate through the bus, and the machine-readable instructions, when executed by the processor, perform the steps in the second aspect.
  • an embodiment of the present disclosure further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and the computer program executes the steps in the second aspect when the computer program is run by a processor.
  • embodiments of the present disclosure further provide a computer program product, comprising computer-readable codes, or a non-volatile computer-readable storage medium carrying computer-readable codes; when the computer-readable codes run in a processor of an electronic device, the processor in the electronic device implements the above method.
  • a multi-channel task distribution system is constructed by using a first main distribution server, at least one second main distribution server, and multiple task execution servers, so that the distribution of multiple tasks can be completed through multiple main distribution servers at the same time.
  • compared with a single-channel task distribution system, the task distribution system provided by the embodiment of the present disclosure improves the task distribution efficiency and reduces the probability that the whole system is paralyzed by the failure of a certain link in the single-channel task distribution system.
  • FIG. 1a shows a schematic diagram of the architecture of a task distribution system provided by an embodiment of the present disclosure
  • Fig. 1b shows a schematic diagram of the architecture of another task distribution system provided by an embodiment of the present disclosure
  • FIG. 2 shows a flowchart of sending the second task to be distributed to the second main distribution server in the task distribution system provided by the embodiment of the present disclosure
  • FIG. 3 shows a flowchart of sending a second task status queue to the second main distribution server in the task distribution system provided by an embodiment of the present disclosure
  • FIG. 4 shows a flowchart of adjusting the distribution of the tasks to be distributed in the task distribution system provided by the embodiment of the present disclosure
  • FIG. 5 shows a flowchart of task distribution performed by a second main distribution server in the task distribution system provided by an embodiment of the present disclosure, in the case of an abnormality in the first main distribution server in the task distribution system;
  • FIG. 6 shows a flowchart of the communication between the second main distribution server and the client in the task distribution system provided by the embodiment of the present disclosure, when the first main distribution server in the task distribution system is abnormal;
  • FIG. 7 shows a flowchart of the task distribution performed by the first main distribution server in the task distribution system provided by the embodiment of the present disclosure, when any of the first sub-distribution servers is abnormal;
  • FIG. 8 shows a flowchart of task distribution performed by any main distribution server in the task distribution system provided by an embodiment of the present disclosure, in the case of abnormal communication between the main distribution servers in the task distribution system;
  • FIG. 9 shows a flowchart of task distribution and task execution result sending performed by a third sub-distribution server in the task distribution system provided by the embodiment of the present disclosure, in the case of a communication abnormality between the main distribution servers in the task distribution system;
  • FIG. 10 shows a flowchart of a task distribution method provided by an embodiment of the present disclosure
  • FIG. 11 shows a schematic diagram of a task distribution apparatus provided by an embodiment of the present disclosure
  • FIG. 12 shows a schematic diagram of a computer device provided by an embodiment of the present disclosure.
  • the user's task request is generally received through the server responsible for task distribution, and then the task is distributed to each server in the cluster that processes the task.
  • the present disclosure provides a task distribution system, method, device, computer equipment and storage medium.
  • a first main distribution server, at least one second main distribution server, and multiple task execution servers are used to construct a multi-channel task distribution system, which can complete the distribution of multiple tasks through multiple main distribution servers at the same time.
  • compared with the single-channel task distribution system, the task distribution system provided by the embodiments of the present disclosure improves the task distribution efficiency and reduces the probability that the whole system is paralyzed by the failure of a certain link in the single-channel task distribution system, thereby improving the reliability and fault tolerance of the system; the probability of task distribution confusion is reduced; and by using multiple main distribution servers to perform task distribution at the same time, the probability of wasting computing resources caused by keeping a main distribution server on cold standby is reduced, and the task distribution efficiency is improved.
  • the architecture of the task distribution system disclosed in the embodiment of the present disclosure is first introduced in detail, and the task distribution system is composed of a client and a server.
  • a schematic diagram of the architecture of a task distribution system provided by an embodiment of the present disclosure includes a first main distribution server 11, at least one second main distribution server 12, multiple execution servers 13, and a client 14, wherein:
  • the first main distribution server 11 is used to receive the tasks to be distributed sent by the client 14; determine, from the tasks to be distributed, the first task to be distributed by the first main distribution server 11 and the second task to be distributed by the second main distribution server 12; send the first task to be distributed to the first execution server 131, and send the second task to be distributed to the second main distribution server 12; receive the first task execution result sent by the first execution server 131 and the second task execution result sent by the second main distribution server 12, and send the first task execution result and the second task execution result to the corresponding client 14 respectively;
  • the client 14 may include multiple clients, and each of the multiple clients may send tasks to be distributed to the first main distribution server 11; sending the first task execution result and the second task execution result to the corresponding client 14 respectively can be understood as sending each task execution result to the client that requested execution of the task, that is, sending the first task execution result to the client that requested execution of the first task to be distributed, and sending the second task execution result to the client that requested execution of the second task to be distributed.
  • the second main distribution server 12 is configured to receive the second task to be distributed sent by the first main distribution server 11 and send the second task to be distributed to the second execution server 132; and to receive the second task execution result sent by the second execution server 132 and send the second task execution result to the first main distribution server 11.
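  • the interaction described above can be illustrated with a minimal in-process sketch. All names here (`Task`, `FirstMaster`, `SecondMaster`, `execute`) are hypothetical, plain function calls stand in for network communication, and the alternating split between the two main distribution servers is an assumption made only for illustration; the disclosure does not prescribe it.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Task:
    task_id: str
    client_id: str
    payload: int


def execute(task: Task) -> int:
    """Hypothetical stand-in for an execution server: squares the payload."""
    return task.payload * task.payload


@dataclass
class SecondMaster:
    """Receives second tasks from the first master and returns their results."""
    executor: Callable[[Task], int] = execute

    def distribute(self, tasks: List[Task]) -> Dict[str, int]:
        # Forward each task to the second execution server and collect results.
        return {t.task_id: self.executor(t) for t in tasks}


@dataclass
class FirstMaster:
    """Receives client tasks, splits them, and returns results to clients."""
    second: SecondMaster
    executor: Callable[[Task], int] = execute
    outbox: Dict[str, Dict[str, int]] = field(default_factory=dict)

    def handle(self, tasks: List[Task]) -> None:
        # Assumed split rule: alternate tasks between the two masters.
        first_tasks = tasks[0::2]
        second_tasks = tasks[1::2]
        results = {t.task_id: self.executor(t) for t in first_tasks}
        results.update(self.second.distribute(second_tasks))
        # Send every result back to the client that requested the task.
        for t in tasks:
            self.outbox.setdefault(t.client_id, {})[t.task_id] = results[t.task_id]


master = FirstMaster(second=SecondMaster())
master.handle([Task("t1", "alice", 2), Task("t2", "bob", 3), Task("t3", "alice", 4)])
```

  • in this sketch the client only ever talks to `FirstMaster`; results produced on the second path travel back through the first main distribution server before reaching the client, as in the description above.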
  • when each task to be distributed is sent to the corresponding execution server, in one possible implementation, the execution servers connected to each main distribution server can execute the same task types; therefore, tasks can be distributed according to the current task execution status of each execution server. In another possible implementation, the execution servers can execute different task types, and a task must be executed on a specific execution server; therefore, tasks can be distributed according to the task type of the task to be distributed.
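  • the two selection strategies can be sketched as follows. The function names, server names, and the use of a running-task count as the "current task execution status" are assumptions for illustration only.

```python
from typing import Dict, List, Set


def pick_by_load(load: Dict[str, int]) -> str:
    """Homogeneous servers: choose the one with the fewest running tasks.

    Names are sorted first so that ties are broken deterministically.
    """
    return min(sorted(load), key=lambda name: load[name])


def pick_by_type(task_type: str, capabilities: Dict[str, Set[str]]) -> List[str]:
    """Heterogeneous servers: keep only those able to run this task type."""
    return sorted(name for name, types in capabilities.items() if task_type in types)


# Hypothetical execution servers and their state.
load = {"exec-a": 3, "exec-b": 1, "exec-c": 1}
capabilities = {
    "exec-a": {"detect"},
    "exec-b": {"detect", "track"},
    "exec-c": {"track"},
}

chosen_by_load = pick_by_load(load)
chosen_by_type = pick_by_type("track", capabilities)
```

  • the two strategies can also be combined, for example by filtering on task type first and then choosing the least-loaded server among the candidates.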
  • the task distribution system needs to deploy devices with a transfer function.
  • on the one hand, forwarding the tasks distributed by the main distribution server to the execution servers through such devices reduces the device connection pressure on the main distribution server; on the other hand, the task processing capability of the entire task distribution system can be increased by adding transfer devices that have execution servers connected to them, thereby improving the task distribution performance of the task distribution system.
  • a schematic structural diagram of another task distribution system provided by an embodiment of the present disclosure includes a first main distribution server 11, at least one second main distribution server 12, multiple execution servers 13, and sub-distribution servers 15, where:
  • the first main distribution server 11 is used to receive the tasks to be distributed sent by the client 14; determine, from the tasks to be distributed, the first task to be distributed by the first main distribution server 11 and the second task to be distributed by the second main distribution server 12; send the first task to be distributed to the first sub-distribution server 151 connected to it, which forwards it to the first execution server 131, and send the second task to be distributed to the second main distribution server 12; receive the first task execution result sent by the first execution server 131 and the second task execution result sent by the second main distribution server 12, and send the first task execution result and the second task execution result to the corresponding client 14 respectively;
  • the second main distribution server 12 is configured to receive the second task to be distributed sent by the first main distribution server 11 and send the second task to be distributed to the second sub-distribution server 152 connected to it, which forwards it to the second execution server 132; and to receive the second task execution result sent by the second execution server 132 and send the second task execution result to the first main distribution server 11;
  • the sub-distribution server 15 is used to forward the task to be distributed sent by the main distribution server to the corresponding execution server 13, wherein the execution server 13 corresponding to the task to be distributed is an execution server 13 that can execute the task to be distributed; and to forward the task execution result sent by the execution server 13 to the corresponding main distribution server; wherein the main distribution server includes the first main distribution server 11 and the second main distribution server 12;
  • the sub-distribution server 15 includes a first sub-distribution server 151 connected to the first main distribution server 11 and the first execution server 131, and a second sub-distribution server 152 connected to the second main distribution server 12 and the second execution server 132.
  • “first main distribution server 11” is hereinafter referred to as “first main distribution server”; “second main distribution server 12” is hereinafter referred to as “second main distribution server”; “first execution server 131” is hereinafter referred to as “first execution server”; “second execution server 132” is hereinafter referred to as “second execution server”; “execution server 13” is hereinafter referred to as “execution server”; “client 14” is hereinafter referred to as “client”; “sub-distribution server 15” is hereinafter referred to as “sub-distribution server”; “first sub-distribution server 151” is hereinafter referred to as “first sub-distribution server”; “second sub-distribution server 152” is hereinafter referred to as “second sub-distribution server”.
  • the sub-distribution server and the execution servers connected to it are usually deployed in the same location (such as a computer room), so the sub-distribution server can communicate with the execution servers connected to it.
  • the execution servers connected to one sub-distribution server are often in a different location from other sub-distribution servers, so they cannot communicate with those other sub-distribution servers; the sub-distribution server is connected to the corresponding main distribution server through the network.
  • the same sub-distribution server can establish communication with multiple main distribution servers, so a sub-distribution server connected to multiple main distribution servers is both a first sub-distribution server and a second sub-distribution server.
  • the sub-distribution servers are divided according to which main distribution server they can be connected to: a sub-distribution server that can be connected to the first main distribution server is a first sub-distribution server, and a sub-distribution server that can be connected to the second main distribution server is a second sub-distribution server.
  • for example, the number of first sub-distribution servers is 6 and the number of second sub-distribution servers is 8; among them, 4 sub-distribution servers can be connected to both the first main distribution server and the second main distribution server. These four sub-distribution servers are third sub-distribution servers; the first sub-distribution servers include the third sub-distribution servers, and the second sub-distribution servers also include the third sub-distribution servers.
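  • the grouping in this example can be expressed with sets. The server names `sub-1` to `sub-10` are hypothetical; the total of 10 distinct sub-distribution servers is not stated in the text but follows from 6 + 8 - 4.

```python
# Hypothetical names: sub-1..sub-6 can reach the first main distribution server,
# sub-3..sub-10 can reach the second main distribution server.
first_sub = {f"sub-{i}" for i in range(1, 7)}     # 6 first sub-distribution servers
second_sub = {f"sub-{i}" for i in range(3, 11)}   # 8 second sub-distribution servers

# Third sub-distribution servers are those connected to both main servers.
third_sub = first_sub & second_sub

# Distinct sub-distribution servers overall (inclusion-exclusion: 6 + 8 - 4).
total_sub = len(first_sub | second_sub)
```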
  • the division standard of the execution server is similar to the division standard of the sub-distribution server in the above-mentioned FIG. 1b, and will not be described here.
  • the execution server includes a first execution server connected to the first main distribution server when performing task distribution, and a second execution server connected to the second main distribution server;
  • there may be one or more clients when performing task distribution, as long as each client is in the same communication network as the first main distribution server (that is, it can communicate with the first main distribution server);
  • the number of second main distribution servers can also be one or more, and can be set according to actual needs such as the order of magnitude of the tasks to be distributed and task time requirements; after the number of second main distribution servers is set, the corresponding task distribution logic is set.
  • the number of the second master distribution servers may be set to one.
  • the first main distribution server may send the second task to be distributed to the second main distribution server according to the following steps:
  • S201 Store the task to be distributed in a task queue.
  • the tasks stored in the task queue are undistributed tasks to be distributed, and when the task distribution is completed, they can be removed from the task queue accordingly.
  • the task queue may be as shown in Table 1 below:
  • task 1, task 2, and task 3 are undistributed tasks to be distributed.
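  • a minimal sketch of such a task queue, assuming a simple first-in-first-out order (the ordering policy and the helper name `distribute_next` are assumptions, not taken from the disclosure): a task stays in the queue until it has been sent, and is removed once distribution completes.

```python
from collections import deque

# The queue holds only tasks that have not been distributed yet.
task_queue = deque(["task 1", "task 2", "task 3"])


def distribute_next(queue, send):
    """Send the oldest undistributed task, then remove it from the queue."""
    task = queue[0]
    send(task)        # if sending raises, the task remains queued
    queue.popleft()   # distribution completed: drop it from the queue
    return task


sent = []
distributed = distribute_next(task_queue, sent.append)
```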
  • the second task to be distributed may be determined according to various distribution factors such as the execution authority of the task to be distributed, the connection between each distribution server and the execution server, and the load of each main distribution server.
  • the execution authority of the to-be-distributed task represents the authority used by the to-be-distributed task during execution.
  • a distribution server/execution server with a lower authority cannot distribute/execute the to-be-distributed task.
  • the tasks to be distributed may need to be executed by calling data in a specific database, and that database is connected to a particular execution server in the task distribution system, so the task must be executed by that specific execution server; it is therefore necessary to determine the connection between the distribution servers and the execution servers, because the to-be-distributed task can only be distributed, through the corresponding distribution server, to an execution server capable of executing it;
  • in order to coordinate the workload of the main distribution servers and reduce the probability that some main distribution servers are busy distributing tasks while others are idle, the tasks to be distributed can be allocated according to the load situation. For example, there are 1000 tasks to be distributed, and neither the first main distribution server A nor the second main distribution server B currently has any tasks to be distributed (the load is 0), so it can be determined that the number of second tasks to be distributed is 500, that is, A and B each distribute 500 tasks to be distributed, thus achieving load balance between the main distribution servers.
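  • the load-based allocation can be sketched as below. The rule used here (equalize the total number of tasks held by each main distribution server, giving the odd task to the first one) is an assumption that reproduces the 1000-task example; the disclosure only states that allocation follows the load situation.

```python
from typing import Tuple


def split_by_load(num_tasks: int, load_first: int, load_second: int) -> Tuple[int, int]:
    """Return (first_share, second_share) of new tasks for two main servers.

    Assumed rule: after the split, both servers should hold as close to the
    same total number of tasks as possible.
    """
    total = num_tasks + load_first + load_second
    target_first = total // 2 + total % 2          # first server takes the odd task
    first_share = min(max(target_first - load_first, 0), num_tasks)
    return first_share, num_tasks - first_share


even_split = split_by_load(1000, 0, 0)      # the example from the text
skewed_split = split_by_load(1000, 200, 0)  # first server already holds 200 tasks
```

  • with equal starting loads the split is 500 and 500, matching the example; when the first server already holds 200 tasks, this rule gives it 400 new tasks and the second server 600.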
  • the second task to be distributed that needs to be distributed by the second main distribution server is marked.
  • the marked task queue can be shown in Table 2 below:
  • the first row indicates that task 1 is distributed by the second main distribution server; the second row indicates that task 2 is distributed by the second main distribution server; the third row indicates that task 3 is distributed by the second main distribution server.
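  • marking can be sketched as attaching a distributor field to each queue entry. The field names, marker values, task names, and JSON encoding below are illustrative assumptions and do not reproduce Table 2.

```python
import json

FIRST, SECOND = "first_main", "second_main"  # hypothetical marker values


def mark_queue(tasks, second_tasks):
    """Attach to every queued task the main distribution server that distributes it."""
    return [
        {"task": t, "distributor": SECOND if t in second_tasks else FIRST}
        for t in tasks
    ]


marked = mark_queue(["task A", "task B", "task C"], {"task B"})

# The whole marked queue is serialized, so the receiver gets a complete copy.
wire = json.dumps(marked)

# The second main distribution server picks out the entries marked for it.
for_second = [e["task"] for e in json.loads(wire) if e["distributor"] == SECOND]
```

  • because the complete marked queue is transferred rather than only the second tasks, each main distribution server also knows which tasks belong to the other.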
  • S203 Send the marked task queue to the second main distribution server.
  • a data synchronization backup tool rsync (remote synchronize) can be used, so that the task queue can be sent safely and quickly.
  • after the second main distribution server distributes the tasks to be distributed according to the marked task queue, it can receive the second task execution result sent by the execution server; because the client communicates with only one main distribution server at a time (the first main distribution server), the second task execution result needs to be sent to the first main distribution server.
  • the communication can be performed through a virtual IP address, that is, the client sends the task to be distributed to a virtual IP address and receives the corresponding task execution result from the virtual IP address.
  • the virtual IP address can only be used by one main distribution server at a time; after communication with the client is established, the virtual IP address is generally used by the first main distribution server while the first main distribution server operates normally.
  • When the second main distribution server sends the second task execution result to the first main distribution server, the result may be sent through a task status queue.
  • The first main distribution server may receive and store the first task status queue sent by the second main distribution server at preset time intervals. The first task status queue includes the second tasks to be distributed that have been distributed and whose second task execution results have been received, the second tasks to be distributed that have been distributed but whose second task execution results have not been received, and the received second task execution results.
  • The two groups of distributed second tasks (with and without received execution results) represent the task distribution progress of the second tasks to be distributed, and so provide support for subsequent real-time load balancing according to the task distribution progress of each main distribution server.
  • The first task status queue can also be used for data backup, which reduces the probability that the first main distribution server cannot obtain the task status of the second tasks to be distributed when the second main distribution server becomes abnormal. Correspondingly, the second task status queue of the first main distribution server can also be used for data backup.
  • The data backup described here can be understood as follows. When the second main distribution server goes down, some tasks may have been distributed without their task execution results having been returned to the second main distribution server, so the first main distribution server cannot obtain the execution results of those tasks. Because the second main distribution server has synchronized the first task status queue to the first main distribution server, the first main distribution server can redistribute those tasks, which ensures that task execution results are not lost when the second main distribution server goes down.
  • the first main distribution server may also send the second task status queue to the second main distribution server through the following steps:
  • S301 Update the second task status queue based on the task status of the distributed first tasks to be distributed.
  • The task status is either "distributed, task execution result received" or "distributed, task execution result not yet received". The second task status queue is similar to the first task status queue: it includes the first tasks to be distributed that have been distributed and whose first task execution results have been received, and the first tasks to be distributed that have been distributed but whose first task execution results have not been received.
  • The second task status queue may further include the received first task execution results, which are thereby sent to the second main distribution server, so that the task execution results are not lost if the first main distribution server becomes abnormal. This completes the data backup of the first task execution results.
  • S302 Send the second task status queue to the second main distribution server according to a preset time interval.
  • the preset time interval may be a small time interval, so that the second task status queue is sent quickly and frequently.
  • the data synchronization backup tool rsync can be used for sending, so that the second task status queue can be sent safely and quickly.
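  • A task status queue of the kind described in S301 and S302 could be represented as in the following sketch. The class name `TaskStatusQueue`, its field names, and the JSON snapshot format are all assumptions made for illustration; the disclosure does not specify a data layout, and the transfer itself (for example with rsync) is not shown.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class TaskStatusQueue:
    """Hypothetical in-memory form of a task status queue."""
    awaiting_result: list = field(default_factory=list)   # distributed, no result yet
    result_received: list = field(default_factory=list)   # distributed, result received
    results: dict = field(default_factory=dict)           # task_id -> execution result

    def mark_distributed(self, task_id: str) -> None:
        self.awaiting_result.append(task_id)

    def mark_result(self, task_id: str, result) -> None:
        # Move the task from "awaiting" to "received" and keep the result for backup.
        self.awaiting_result.remove(task_id)
        self.result_received.append(task_id)
        self.results[task_id] = result

    def snapshot(self) -> str:
        """Serialize the queue; a periodic job could write this to a file for syncing."""
        return json.dumps(asdict(self), sort_keys=True)


queue = TaskStatusQueue()
queue.mark_distributed("t1")
queue.mark_distributed("t2")
queue.mark_result("t1", "ok")
print(queue.snapshot())
```

  • Keeping the received results inside the snapshot mirrors the backup role described above: the peer server holds a copy even if the sender fails afterwards.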
  • the to-be-distributed tasks allocated to different main distribution servers can be dynamically adjusted, so as to realize the load balancing of each main distribution server and improve the task distribution efficiency of the task distribution system.
  • the first main distribution server may adjust the distribution of the tasks to be distributed through the following steps:
  • S401 Based on the first task status queue and the marked task queue, determine the second task to be distributed that is not currently distributed by the second master distribution server.
  • The marked task queue contains the second tasks to be distributed that were previously allocated to the second main distribution server, and the first task status queue represents the status of the second tasks that have already been distributed. Therefore, based on the first task status queue and the marked task queue, the currently undistributed second tasks to be distributed can be determined.
  • For example, if the task queue includes 50 marked second tasks to be distributed, and the first task status queue shows that 20 of them have been distributed with their task execution results received and another 20 have been distributed without their task execution results received, then the currently undistributed second tasks to be distributed are the remaining 10.
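  • The determination in S401 amounts to a set difference between the marked task queue and the first task status queue. The sketch below reproduces the 50/20/20 example; the dictionary layouts and the function name `undistributed_second_tasks` are assumptions for illustration.

```python
def undistributed_second_tasks(marked_queue: dict, status_queue: dict) -> list:
    """Return second tasks that appear in the marked queue but not in the status queue.

    marked_queue: task_id -> "first" or "second" (hypothetical mark values).
    status_queue: task_id -> "result_received" or "awaiting_result";
                  only tasks that have already been distributed appear here.
    """
    return [
        task_id
        for task_id, owner in marked_queue.items()
        if owner == "second" and task_id not in status_queue
    ]


marked = {i: "second" for i in range(50)}
status = {i: "result_received" for i in range(20)}
status.update({i: "awaiting_result" for i in range(20, 40)})
print(len(undistributed_second_tasks(marked, status)))  # 10
```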
  • The first main distribution server may also receive, from the second main distribution server, a task queue containing the currently undistributed second tasks to be distributed. The task queue here may be the one that the first main distribution server sent to the second main distribution server; after distributing any second task to be distributed, the second main distribution server may delete that task from the task queue.
  • S402 Update the task queue and the flag in the task queue based on the currently undistributed first task to be distributed and the currently undistributed second task to be distributed.
  • Since the first tasks to be distributed are distributed by the first main distribution server itself, the currently undistributed first tasks to be distributed can be determined.
  • the task queue is also updated in real time as task distribution proceeds.
  • When updating the task queue, the distributed second tasks to be distributed may be cleared from the task queue according to the first task status queue, and the distributed first tasks to be distributed may be cleared from the task queue according to the second task status queue, which completes the update of the task queue. When updating the marks in the task queue, if the numbers of currently undistributed first and second tasks to be distributed differ substantially (for example, the difference is greater than a preset value, or the ratio is greater than a preset ratio), the marks in the task queue can be adjusted according to those numbers, so as to balance the tasks to be distributed among the main distribution servers.
  • For example, suppose there are currently 30 undistributed first tasks to be distributed and 10 undistributed second tasks to be distributed. The ratio is 3:1, which exceeds the preset ratio of 2:1, so 10 of the first tasks to be distributed can be reallocated to the second main distribution server (re-marked as second tasks to be distributed), leaving 20 tasks on each side and thereby balancing the load between the first main distribution server and the second main distribution server.
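  • The ratio check and re-marking in S402 can be sketched as follows. The rule used here (equalize the two counts once the ratio exceeds the preset ratio) is one possible reading of the example; the function name `rebalance` and the default `max_ratio` are assumptions.

```python
def rebalance(first_pending: int, second_pending: int, max_ratio: float = 2.0) -> tuple[int, int]:
    """Return new (first, second) counts of undistributed tasks after re-marking.

    Assumption: when larger/smaller exceeds max_ratio, half of the difference
    is re-marked so both servers hold (nearly) equal numbers of tasks.
    """
    larger = max(first_pending, second_pending)
    smaller = min(first_pending, second_pending)
    if larger == 0:
        return 0, 0
    if smaller > 0 and larger / smaller <= max_ratio:
        return first_pending, second_pending  # within tolerance, keep the marks
    move = (larger - smaller) // 2
    if first_pending >= second_pending:
        return first_pending - move, second_pending + move
    return first_pending + move, second_pending - move


# The example from the text: 30 vs 10 exceeds 2:1, so 10 tasks are re-marked.
print(rebalance(30, 10))  # (20, 20)
```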
  • S403 Send the updated task queue to the second main distribution server, so that the second main distribution server performs task distribution based on the updated task queue.
  • a data synchronization backup tool rsync can be used, so that the task queue can be sent safely and quickly.
  • the above-mentioned updating process may be performed every preset time period.
  • the preset duration can be set to be very short (millisecond level), so that the loads of the first main distribution server and the second main distribution server can be in a dynamic balance state.
  • The tasks to be distributed in the task status queue that have been distributed but whose task execution results have not been received can be re-added to the task queue once certain conditions are met, and then distributed again.
  • For example, a preset duration can be set for the tasks in each main distribution server's task status queue that have been distributed but have not received results. If no result arrives within the preset duration, the distribution or execution of the task is deemed to have failed (the task is regarded as not distributed); the task is re-added to the task queue and can be redistributed by the same main distribution server or by another main distribution server.
  • For example, the preset duration may be set to 5 minutes. If the first main distribution server A has not received the corresponding first task execution result 5 minutes after distributing a certain first task to be distributed, the task in the second task status queue may be re-added to the task queue and marked for redistribution by main distribution server A or B.
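  • The timeout rule above can be sketched as follows, assuming each distributed task records the time at which it was sent. The names `RESULT_TIMEOUT_S` and `requeue_timed_out` and the use of plain timestamps in seconds are assumptions for illustration.

```python
RESULT_TIMEOUT_S = 5 * 60  # the 5-minute preset duration from the example


def requeue_timed_out(status_queue: dict, task_queue: list, now: float) -> list:
    """Move tasks that have waited too long for a result back to the task queue.

    status_queue: task_id -> time (seconds) at which the task was distributed,
                  for tasks still awaiting a result (hypothetical layout).
    task_queue:   list of task_ids waiting to be distributed.
    Returns the task_ids that were re-queued.
    """
    expired = [t for t, sent_at in status_queue.items() if now - sent_at >= RESULT_TIMEOUT_S]
    for task_id in expired:
        del status_queue[task_id]      # treat the task as not distributed
        task_queue.append(task_id)     # eligible for redistribution by A or B
    return expired


awaiting = {"t1": 0.0, "t2": 200.0}
pending = []
print(requeue_timed_out(awaiting, pending, now=300.0))  # ['t1']
```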
  • the main distribution server is abnormal
  • the abnormality in the main distribution server includes the abnormality in the first main distribution server, the abnormality in the second main distribution server, and the abnormality in the first main distribution server and the second main distribution server at the same time.
  • The corresponding abnormal situation can be monitored by a preset monitoring tool and reported to the corresponding monitoring client. The monitoring tool can monitor, in real time, the working status of each device in the task distribution system and the connectivity of the network.
  • the abnormality may be a situation that the task distribution cannot be performed normally due to reasons such as server downtime.
  • the second main distribution server can continue to perform task distribution through the following steps:
  • S501 Based on the received second task status queue, determine an undistributed first task to be distributed and a first task to be distributed that has been distributed and has not received a first task execution result in the second task status queue.
  • When the first main distribution server is abnormal, the second main distribution server can sense this in time and, based on the information in the most recently received second task status queue, determine the tasks that it (the second main distribution server) now needs to distribute. These include the first tasks to be distributed that the first main distribution server has not distributed, and the first tasks to be distributed that have been distributed but whose execution results have not been received.
  • S502 Distribute the undistributed first task to be distributed and the first task to be distributed in the second task status queue that has been distributed and has not received the execution result of the first task to the execution server.
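  • The selection made in S501 can be sketched as follows: the second main distribution server walks the task queue and picks every first task that was either never distributed or is still awaiting a result according to the last synchronized status queue. The data layouts and the name `tasks_to_take_over` are assumptions.

```python
def tasks_to_take_over(task_queue: dict, peer_status_queue: dict) -> list:
    """Return the first-server tasks that the second server must now distribute.

    task_queue:        task_id -> "first" or "second" (hypothetical mark values).
    peer_status_queue: last synchronized status of the first server's tasks,
                       task_id -> "result_received" or "awaiting_result".
    """
    takeover = []
    for task_id, owner in task_queue.items():
        if owner != "first":
            continue
        state = peer_status_queue.get(task_id)  # None: never distributed
        if state is None or state == "awaiting_result":
            takeover.append(task_id)
    return takeover


queue = {"a": "first", "b": "first", "c": "first", "d": "second"}
last_sync = {"a": "result_received", "b": "awaiting_result"}
print(tasks_to_take_over(queue, last_sync))  # ['b', 'c']
```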
  • a second main distribution server may be used to replace the first main distribution server to communicate with the client.
  • the second main distribution server can also communicate with the client through the following steps:
  • S601 Receive a third task to be distributed sent by the client, and distribute the third task to be distributed to a second execution server.
  • The communication can be performed through a virtual IP address: the client sends the task to be distributed to the virtual IP address and receives the corresponding task execution result from that virtual IP address.
  • The virtual IP address can only be used by one main distribution server at a time.
  • When the first main distribution server is abnormal, the second main distribution server can take over the virtual IP address and use it to establish communication with the client.
  • the third task to be distributed is a new task to be distributed sent by the client after the first main distribution server is abnormal.
  • The second main distribution server can first determine the sub-distribution servers to which it can connect, and then distribute the third task to be distributed through those sub-distribution servers.
  • S602 Receive the third task execution result sent by the second execution server, and respectively send the third task execution result to the corresponding client.
  • The first main distribution server may determine, based on the received first task status queue, the undistributed second tasks to be distributed and the second tasks to be distributed in the first task status queue that have been distributed but whose second task execution results have not been received. It then distributes both groups of second tasks to the execution server.
  • The abnormality of the sub-distribution server includes the abnormality of the first sub-distribution server, the abnormality of the second sub-distribution server, and the abnormality of the first sub-distribution server and the second sub-distribution server at the same time.
  • The main distribution server connected to the sub-distribution server can report the corresponding abnormal situation to the corresponding client. If the sub-distribution server is connected to multiple main distribution servers at the same time, the abnormal situation can be reported to the corresponding monitoring client through any one of those main distribution servers and the preset monitoring tool.
  • When each main distribution server performs task distribution, in addition to distributing tasks and receiving task execution results, it also needs to monitor and store the running status of each connected sub-distribution server.
  • The running status of each sub-distribution server includes its load and whether it is alive, so that when a sub-distribution server becomes abnormal, corresponding measures can be taken in time to reduce the losses caused.
  • the abnormality may be a situation in which task forwarding cannot be performed normally due to reasons such as server downtime.
  • the first main distribution server may perform task distribution according to the following steps:
  • S701 Update the stored running status of each of the first sub-distribution servers.
  • a survival list may be set for the first sub-distribution server, and the survival list is used to represent whether the first sub-distribution server is alive.
  • the survival list may be as shown in Table 3 below:
  • sub-distributor 1: alive
  • sub-distributor 2: alive
  • sub-distributor 3: abnormal
  • the first row indicates that the child distribution server 1 is currently alive; the second row indicates that the child distribution server 2 is currently alive; the third row indicates that the child distribution server 3 is currently abnormal.
  • the survival list can be updated in real time according to monitoring conditions.
  • S702 Determine the first to-be-distributed task for which the first task execution result has not been received among the first to-be-distributed tasks distributed to the first sub-distribution server in which the abnormality occurs.
  • S703 Re-distribute the first task to be distributed for which the first task execution result has not been received.
  • the main distribution server senses abnormal sub-distribution servers in time, and redistributes the uncompleted tasks to be distributed, thereby improving the fault tolerance of the task distribution system.
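  • Steps S701 to S703 can be sketched as follows: given the survival list, find the tasks that were sent through an abnormal sub-distribution server and have no result yet, and assign them to sub-distribution servers that are still alive. The round-robin choice of the new server, the status strings, and the name `redistribute_from_failed` are assumptions; the disclosure does not say how the replacement server is chosen.

```python
def redistribute_from_failed(survival_list: dict, assignments: dict, results: set) -> dict:
    """Return task_id -> new sub-distribution server for tasks that must be resent.

    survival_list: sub_id -> "alive" or "abnormal" (as in Table 3).
    assignments:   task_id -> sub_id the task was originally sent through.
    results:       task_ids whose execution results have been received.
    """
    alive = sorted(s for s, state in survival_list.items() if state == "alive")
    if not alive:
        return {}  # nothing can be redistributed right now
    reassigned = {}
    next_index = 0
    for task_id, sub_id in assignments.items():
        if survival_list.get(sub_id) == "abnormal" and task_id not in results:
            reassigned[task_id] = alive[next_index % len(alive)]  # round-robin (assumption)
            next_index += 1
    return reassigned


survival = {"sub1": "alive", "sub2": "alive", "sub3": "abnormal"}
sent = {"t1": "sub3", "t2": "sub3", "t3": "sub1", "t4": "sub3"}
print(redistribute_from_failed(survival, sent, results={"t2"}))
# {'t1': 'sub1', 't4': 'sub2'}
```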
  • Similarly, the second main distribution server may update the stored running status of each of the second sub-distribution servers; determine, among the second tasks to be distributed that were sent to the abnormal second sub-distribution server, those whose second task execution results have not been received; and redistribute the second tasks to be distributed whose second task execution results have not been received.
  • Since the main distribution servers need to synchronize task status, they must synchronize the currently distributed and undistributed tasks with one another, so that task distribution proceeds in an orderly way.
  • When the communication between the main distribution servers is abnormal, the synchronization of the task status cannot continue, and each main distribution server cannot tell whether the other main distribution servers are alive.
  • In order to reduce the probability of missing tasks, each main distribution server needs to distribute all undistributed tasks according to its own task queue. As a result, multiple main distribution servers may distribute the same task repeatedly, which can lead to wrong task execution results and a crash of the task distribution system.
  • any main distribution server can perform task distribution through the following steps:
  • S801 Determine currently undistributed tasks to be distributed, and tasks to be distributed that have been distributed but have not received task execution results.
  • Each main distribution server can sense the communication abnormality in time and, based on the task status at the latest synchronization, determine the currently undistributed tasks to be distributed and the tasks to be distributed that have been distributed but whose task execution results have not been received.
  • S802 Distribute the currently undistributed tasks to be distributed and the tasks to be distributed that have been distributed but whose task execution results have not been received.
  • Each main distribution server should be allowed to distribute both the currently undistributed tasks to be distributed and the tasks to be distributed that have been distributed but whose task execution results have not been received.
  • The sub-distribution servers include a third sub-distribution server connected to multiple main distribution servers, and both the first sub-distribution servers and the second sub-distribution servers include the third sub-distribution server. Therefore, after the main distribution servers perform task distribution according to step S802, the third sub-distribution server may receive the same task to be distributed from multiple main distribution servers. If the third sub-distribution server distributes the same task multiple times, the task will be executed multiple times, which may cause confusion in task execution and errors in task execution results. An exception handling mechanism is therefore needed to deal with this situation.
  • The third sub-distribution server can perform task distribution and send task execution results through the following steps:
  • S901 Send the same task to be distributed to an execution server, and receive a task execution result sent by the execution server; and, determine the target number of sub-distribution servers connected to each of the multiple master distribution servers .
  • The target number can be determined by the third sub-distribution server from the survival list in the main distribution server to which it is connected. For example, if there are 10 surviving first sub-distribution servers in the survival list of the first main distribution server, the target number is 10.
  • The survival list may be sent by the main distribution server to the third sub-distribution server together with the task to be distributed; alternatively, after receiving the same task to be distributed, the third sub-distribution server may obtain the list by sending a request to the main distribution server to which it is connected.
  • S902 Send the task execution result corresponding to the same task to be distributed to the main distribution server with the largest corresponding target number.
  • If the main distribution server with the largest corresponding target number is not the one that previously established communication with the client, it can replace the main distribution server that previously communicated with the client, and so take over receiving new tasks to be distributed and sending task execution results.
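  • Steps S901 and S902 can be sketched as follows. The first helper forwards a duplicated task only once, which is one way to read "send the same task to be distributed to an execution server"; the second picks the main distribution server with the largest target number. The class and function names are assumptions, and ties are resolved in favor of the first server listed, which the disclosure does not specify.

```python
class DuplicateFilter:
    """Hypothetical helper for the third sub-distribution server."""

    def __init__(self) -> None:
        self._seen = set()

    def should_forward(self, task_id: str) -> bool:
        """Return True only the first time a given task_id arrives."""
        if task_id in self._seen:
            return False
        self._seen.add(task_id)
        return True


def pick_result_target(target_numbers: dict) -> str:
    """Return the main distribution server with the most alive sub-distribution servers.

    target_numbers: main_server_id -> count of alive sub-distribution servers.
    """
    return max(target_numbers, key=target_numbers.get)


dedup = DuplicateFilter()
print(dedup.should_forward("t1"), dedup.should_forward("t1"))  # True False
print(pick_result_target({"main_A": 10, "main_B": 7}))         # main_A
```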
  • the system running process includes the following steps:
  • Step 1 The client sends the task to be distributed to the first main distribution server.
  • Step 2 After receiving the to-be-distributed task, the first main distribution server adds the to-be-distributed task to the task list, performs task assignment, determines the first to-be-distributed task and the second to-be-distributed task in the task list, and A mark is added to the task list, and the marked task list is sent to the second main distribution server.
  • Step 3 The first main distribution server and the second main distribution server respectively send the first task to be distributed and the second task to be distributed to the corresponding sub-distribution servers, so that the sub-distribution servers send them to the corresponding execution servers.
  • Step 4 The first main distribution server and the second main distribution server receive the task execution results sent by the execution servers through the sub-distribution servers; the second main distribution server sends the received second task execution result to the first main distribution server, and the first main distribution server uniformly sends the results to the client.
  • the process when the system is running includes the following steps:
  • Step 1 The second master distribution server redistributes the tasks that have been distributed by the first master distribution server but have not received the execution result based on the received second task status queue.
  • Step 2 The second main distribution server distributes the undistributed first task to be distributed and the second task to be distributed based on the task queue.
  • Step 3 The second main distribution server establishes communication with the client, receives the task to be distributed newly sent by the client, and sends the execution result of the task to the client.
  • the process when the system runs includes the following steps:
  • Step 1 Based on the received first task status queue, the first master distribution server redistributes the tasks that have been distributed by the second master distribution server but have not received the execution result.
  • Step 2 The first main distribution server distributes the undistributed first task to be distributed and the second task to be distributed based on the task queue.
  • the process when the system runs includes the following steps:
  • Step 1 Each master distribution server updates the survival list of its respective child distribution server according to the abnormal situation of the child distribution server.
  • Step 2 Each master distribution server redistributes the tasks that were distributed to the abnormal sub-distribution server but whose task execution results have not been received.
  • the process when the system is running includes the following steps:
  • Step 1 Each primary distribution server performs task distribution according to the undistributed first task to be distributed and the second task to be distributed.
  • Step 2 The third sub-distribution server that receives the same task to be distributed determines the main distribution server with the largest number of connected sub-distribution servers.
  • Step 3 The third sub-distribution server sends the task execution result to the main distribution server with the largest number of connected sub-distribution servers.
  • the task distribution method provided by the embodiment of the present disclosure will be introduced by taking the execution subject as the first main distribution server as an example.
  • the steps and contents performed by the remaining servers/clients in the task distribution system refer to the above related descriptions. No further description will be given later.
  • a flowchart of a task distribution method provided by an embodiment of the present disclosure, applied to the first master distribution server includes:
  • S1001 Receive the task to be distributed sent by the client.
  • S1002 Determine, from the tasks to be distributed, a first task to be distributed to be distributed by the first main distribution server, and a second task to be distributed to be distributed by the second main distribution server.
  • S1003 Send the first task to be distributed to a first execution server, and send the second task to be distributed to a second main distribution server.
  • S1004 Receive the first task execution result sent by the first execution server and the second task execution result sent by the second main distribution server, and respectively send the first task execution result and the second task execution result to the corresponding user terminal.
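  • Steps S1001 to S1004 can be summarized in the following sketch, which replaces all network communication with plain function calls and runs synchronously. The even split in S1002, the tuple layout of a task, and the names `run_first_main`, `execute_first`, and `send_to_second_main` are assumptions made for illustration.

```python
def run_first_main(tasks, execute_first, send_to_second_main):
    """Simulate the first main distribution server for one batch of tasks.

    tasks:               list of (client_id, task_id, payload) tuples (S1001).
    execute_first:       callable(payload) -> result, standing in for the
                         first execution server.
    send_to_second_main: callable(list of tasks) -> {task_id: result}, standing
                         in for the round trip through the second main server.
    Returns {client_id: {task_id: result}}.
    """
    half = (len(tasks) + 1) // 2
    first_tasks, second_tasks = tasks[:half], tasks[half:]                 # S1002
    first_results = {tid: execute_first(p) for _, tid, p in first_tasks}   # S1003
    second_results = send_to_second_main(second_tasks)                     # S1003, S1004
    merged = {**first_results, **second_results}
    per_client = {}
    for client_id, task_id, _ in tasks:                                    # S1004
        per_client.setdefault(client_id, {})[task_id] = merged[task_id]
    return per_client


batch = [("c1", "t1", 1), ("c1", "t2", 2), ("c2", "t3", 3)]
print(run_first_main(
    batch,
    execute_first=lambda p: p * 10,
    send_to_second_main=lambda ts: {tid: p * 10 for _, tid, p in ts},
))
# {'c1': {'t1': 10, 't2': 20}, 'c2': {'t3': 30}}
```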
  • In the task distribution method provided by the embodiments of the present disclosure, a first main distribution server, at least one second main distribution server, and multiple execution servers are used to construct a multi-channel task distribution system, in which tasks can be distributed through multiple main distribution servers at the same time.
  • Compared with a single-channel task distribution system, the task distribution system provided by the embodiments of the present disclosure improves task distribution efficiency and reduces the impact of a failure in any single link.
  • Having the first main distribution server allocate tasks to the second main distribution server reduces the probability of task distribution confusion. Having multiple main distribution servers distribute tasks at the same time reduces the probability of wasting computing resources on a cold-standby main distribution server and improves task distribution efficiency.
  • The order in which the steps are written does not imply a strict execution order or constitute any limitation on the implementation process; the specific execution order of the steps should be determined by their functions and possible internal logic.
  • The embodiments of the present disclosure also provide a task distribution device corresponding to the task distribution method. For the implementation of the device, reference may be made to the implementation of the method; repeated descriptions are omitted.
  • the apparatus includes: a receiving module 1101, a determining module 1102, a first sending module 1103, and a second sending module 1104; wherein,
  • a determining module 1102 configured to determine, from the tasks to be distributed, a first task to be distributed to be distributed by the first main distribution server, and a second task to be distributed to be distributed by the second main distribution server;
  • a first sending module 1103, configured to send the first task to be distributed to the first execution server, and send the second task to be distributed to the second main distribution server;
  • the second sending module 1104 is configured to receive the first task execution result sent by the first execution server and the second task execution result sent by the second main distribution server, and to send the first task execution result and the second task execution result respectively to the corresponding client.
  • the task distribution apparatus provided by the embodiments of the present disclosure allocates tasks to the second main distribution server through the first main distribution server, thereby reducing the probability of task distribution confusion and improving the task distribution efficiency.
  • a schematic structural diagram of a computer device 1200 provided by an embodiment of the present disclosure includes a processor 1201 , a memory 1202 , and a bus 1203 .
  • The memory 1202 is used to store execution instructions and includes an internal memory 12021 and an external memory 12022. The internal memory 12021 temporarily stores operation data of the processor 1201 and data exchanged with the external memory 12022, such as a hard disk.
  • The processor 1201 exchanges data with the external memory 12022 through the internal memory 12021.
  • the processor 1201 and the memory 1202 communicate through the bus 1203, so that the processor 1201 executes the following instructions:
  • Embodiments of the present disclosure further provide a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is run by a processor, the steps of the task distribution method described in the foregoing method embodiments are executed.
  • the storage medium may be a volatile or non-volatile computer-readable storage medium.
  • Embodiments of the present disclosure also provide a computer program product, including computer-readable code, or a non-volatile computer-readable storage medium carrying computer-readable code. When the computer-readable code runs in a processor of an electronic device, the processor in the electronic device implements the above method.
  • The above-mentioned computer program product can be implemented by means of hardware, software, or a combination thereof.
  • In one optional embodiment, the computer program product is embodied as a computer storage medium; in another optional embodiment, the computer program product is embodied as a software product, such as a software development kit (SDK).
  • the units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the functions, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a processor-executable non-volatile computer-readable storage medium.
  • The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present disclosure.
  • The aforementioned storage medium includes a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disk, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Computer And Data Communications (AREA)

Abstract

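The present disclosure provides a task distribution system, method, and apparatus, a computer device, and a storage medium. The system includes: a first main distribution server, configured to receive tasks to be distributed sent by a client; determine a first task to be distributed and a second task to be distributed from among them; send the first task to be distributed to a first execution server, and send the second task to be distributed to a second main distribution server; and receive a first task execution result sent by the first execution server and a second task execution result sent by the second main distribution server, and send the two results to the corresponding clients; and a second main distribution server, configured to receive the second task to be distributed sent by the first main distribution server and send it to a second execution server; and receive the second task execution result sent by the second execution server and send it to the first main distribution server.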

Description

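Task distribution system, method, and apparatus, computer device, and storage medium
This application claims priority to Chinese patent application No. 202110430723.5, filed on April 21, 2021 and entitled "A task distribution system, method, and apparatus, computer device, and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of computer technology, and in particular to a task distribution system, method, and apparatus, a computer device, and a storage medium.
Background
In a large-scale cluster management system, a server responsible for task distribution generally receives task requests from users and then distributes the tasks to the servers in the cluster that process them.
As the cluster grows, the processing pressure on the server responsible for task distribution also increases, and distribution efficiency affects task processing efficiency.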
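Summary
Embodiments of the present disclosure provide at least a task distribution system, method, and apparatus, a computer device, and a storage medium.
In a first aspect, an embodiment of the present disclosure provides a task distribution system, including a first main distribution server, at least one second main distribution server, and a plurality of execution servers, wherein:
the first main distribution server is configured to: receive tasks to be distributed sent by a client; determine, from the tasks to be distributed, a first task to be distributed by the first main distribution server and a second task to be distributed by the second main distribution server; send the first task to be distributed to a first execution server, and send the second task to be distributed to the second main distribution server; and receive a first task execution result sent by the first execution server and a second task execution result sent by the second main distribution server, and send the first task execution result and the second task execution result to the corresponding clients;
the second main distribution server is configured to: receive the second task to be distributed sent by the first main distribution server, and send it to a second execution server; and receive the second task execution result sent by the second execution server, and send it to the first main distribution server.
In a possible implementation, the system further includes sub-distribution servers, which include a first sub-distribution server connected to the first main distribution server and the first execution server, and a second sub-distribution server connected to the second main distribution server and the second execution server;
the sub-distribution servers are configured to forward tasks to be distributed sent by a main distribution server to the corresponding execution server, and to forward task execution results sent by the execution server to the corresponding main distribution server, wherein the main distribution servers include the first main distribution server and the second main distribution server.
In this way, adding sub-distribution servers to the task distribution system to forward tasks increases the number of execution servers that the whole system can connect to, thereby improving the distribution efficiency and task processing efficiency of the system.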
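In a possible implementation, after receiving the tasks to be distributed sent by the client, the first main distribution server is further configured to:
store the tasks to be distributed in a task queue;
after the second task to be distributed is determined, mark the second task to be distributed in the task queue;
wherein sending the second task to be distributed to the second main distribution server includes:
sending the marked task queue to the second main distribution server.
In this way, storing tasks in a task queue improves the continuity of task execution in the task distribution system. Marking the second task to be distributed in the task queue and sending the queue to the second main distribution server lets both the first and the second main distribution server hold the complete task queue and distribute tasks at the same time, which improves task processing efficiency and also the fault tolerance of the task distribution system.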
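In a possible implementation, when receiving the second task execution result sent by the second main distribution server, the first main distribution server is configured to:
receive and store a first task status queue sent by the second main distribution server at a preset time interval;
wherein the first task status queue includes second tasks that have been distributed and whose second task execution results have been received, second tasks that have been distributed and whose second task execution results have not been received, and the second task execution results that have been received.
In this way, by receiving and storing the first task status queue, the first main distribution server can track the task distribution progress of the second main distribution server in a timely manner, so that when the second main distribution server becomes abnormal, its task distribution records are not lost, which improves the fault tolerance of the task distribution system.
In a possible implementation, the first main distribution server is further configured to:
update a second task status queue based on the task status of the first tasks that have been distributed;
send the second task status queue to the second main distribution server at a preset time interval.
In this way, because the second task status queue is sent to the second main distribution server, the task distribution records of the first main distribution server are not lost even if it becomes abnormal, which improves the fault tolerance of the task distribution system.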
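In a possible implementation, after receiving and storing the first task status queue sent by the second main distribution server at the preset time interval, the first main distribution server is further configured to:
determine, based on the first task status queue and the marked task queue, the second tasks that the second main distribution server has not yet distributed;
update the task queue and the marks in the task queue based on the first tasks that have not yet been distributed and the second tasks that have not yet been distributed;
send the updated task queue to the second main distribution server, so that the second main distribution server distributes tasks based on the updated task queue.
In this way, updating the task queue and its marks allows the allocation of tasks among the main distribution servers to be adjusted according to task execution progress, so that the load of the first and second main distribution servers is balanced during task distribution, which improves the task distribution efficiency of the whole system.
In a possible implementation, when the first main distribution server in the task distribution system is abnormal, the second main distribution server is further configured to:
determine, based on the received second task status queue, the first tasks that have not been distributed and the first tasks in the second task status queue that have been distributed but whose first task execution results have not been received;
distribute both groups of first tasks to the execution servers.
In this way, having the second main distribution server redistribute the tasks of the abnormal first main distribution server reduces the probability that the system cannot run when a main distribution server is abnormal, which improves the fault tolerance of the task distribution system.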
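In a possible implementation, when the first main distribution server in the task distribution system is abnormal, the second main distribution server is further configured to:
receive a third task to be distributed sent by the client, and distribute the third task to be distributed to the second execution server;
receive a third task execution result sent by the second execution server, and send the third task execution result to the corresponding client.
In this way, when the first main distribution server is abnormal, the second main distribution server establishes interaction with the client so that subsequent task distribution can continue, which improves the fault tolerance of the task distribution system.
In a possible implementation, the first main distribution server is further configured to:
monitor and store the running state of at least one first sub-distribution server;
when any first sub-distribution server is abnormal, the first main distribution server is further configured to:
update the stored running state of the at least one first sub-distribution server;
determine, among the first tasks distributed to the abnormal first sub-distribution server, the first tasks whose first task execution results have not been received;
redistribute the first tasks whose first task execution results have not been received.
In this way, by monitoring and storing the running state of the sub-distribution servers, the main distribution server can detect an abnormal sub-distribution server in a timely manner and redistribute its unfinished tasks, which improves the fault tolerance of the task distribution system.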
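In a possible implementation, when communication between the main distribution servers in the task distribution system is abnormal, any main distribution server is configured to:
determine the tasks that have not yet been distributed, and the tasks that have been distributed but whose task execution results have not been received;
distribute the tasks that have not yet been distributed and the tasks that have been distributed but whose task execution results have not been received.
In this way, when communication between the main distribution servers is abnormal, at least one main distribution server distributes its own tasks, which reduces the probability of task loss caused by the abnormality and improves the fault tolerance of the task distribution system.
In a possible implementation, the sub-distribution servers include a third sub-distribution server connected to a plurality of main distribution servers, and the first sub-distribution servers and the second sub-distribution servers both include the third sub-distribution server;
when communication between the main distribution servers in the task distribution system is abnormal, if the third sub-distribution server receives the same task to be distributed from the plurality of main distribution servers, the third sub-distribution server is configured to:
send the same task to an execution server and receive the task execution result sent by the execution server; and determine a target number, namely the number of sub-distribution servers connected to one or more of the plurality of main distribution servers;
send the task execution result corresponding to the same task to the main distribution server with the largest target number.
In this way, when communication between the main distribution servers is abnormal, the exception handling mechanism set for the third sub-distribution server lets it handle several copies of the same task received at the same time, which reduces the probability of wrong task execution results and of a crash of the task distribution system caused by repeated distribution, and thus improves the fault tolerance of the task distribution system.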
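In a second aspect, an embodiment of the present disclosure further provides a task distribution method, applied to a first main distribution server, including:
receiving tasks to be distributed sent by a client;
determining, from the tasks to be distributed, a first task to be distributed by the first main distribution server and a second task to be distributed by a second main distribution server;
sending the first task to be distributed to a first execution server, and sending the second task to be distributed to the second main distribution server;
receiving a first task execution result sent by the first execution server and a second task execution result sent by the second main distribution server, and sending the first task execution result and the second task execution result to the corresponding clients.
In a third aspect, an embodiment of the present disclosure further provides a task distribution apparatus, including:
a receiving module, configured to receive tasks to be distributed sent by a client;
a determining module, configured to determine, from the tasks to be distributed, a first task to be distributed by a first main distribution server and a second task to be distributed by a second main distribution server;
a first sending module, configured to send the first task to be distributed to a first execution server, and send the second task to be distributed to the second main distribution server;
a second sending module, configured to receive a first task execution result sent by the first execution server and a second task execution result sent by the second main distribution server, and send the first task execution result and the second task execution result to the corresponding clients.
In a fourth aspect, an embodiment of the present disclosure further provides a computer device, including a processor, a memory, and a bus, wherein the memory stores machine-readable instructions executable by the processor; when the computer device runs, the processor communicates with the memory through the bus, and the machine-readable instructions, when executed by the processor, perform the steps of the second aspect.
In a fifth aspect, an embodiment of the present disclosure further provides a computer-readable storage medium storing a computer program that, when run by a processor, performs the steps of the second aspect.
In a sixth aspect, an embodiment of the present disclosure further provides a computer program product, including computer-readable code, or a non-volatile computer-readable storage medium carrying computer-readable code; when the computer-readable code runs in a processor of an electronic device, the processor in the electronic device implements the above method.
The task distribution system, method, and apparatus, computer device, and storage medium provided by the embodiments of the present disclosure use a first main distribution server, at least one second main distribution server, and a plurality of task execution servers to build a multi-path task distribution system in which several main distribution servers distribute tasks at the same time. Compared with a single-path task distribution system, this improves task distribution efficiency and also reduces the probability that a failure at one point brings down the whole system, which improves the reliability and fault tolerance of the system. Because the first main distribution server allocates tasks to the second main distribution server, the probability of disordered task distribution is reduced. Because several main distribution servers distribute tasks at the same time, the waste of computing resources caused by keeping a main distribution server on cold standby is reduced, and task distribution efficiency is improved.
To make the present disclosure easier to understand, embodiments are described in detail below with reference to the accompanying drawings.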
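Brief Description of the Drawings
To describe the technical solutions of the embodiments of the present disclosure more clearly, the drawings used in the embodiments are briefly introduced below. The drawings are incorporated into and form part of this specification; they show embodiments consistent with the present disclosure and, together with the specification, serve to explain its technical solutions. It should be understood that the following drawings show only some embodiments of the present disclosure and should therefore not be regarded as limiting its scope; a person of ordinary skill in the art can derive other related drawings from them without creative effort.
Fig. 1a is a schematic diagram of the architecture of a task distribution system provided by an embodiment of the present disclosure;
Fig. 1b is a schematic diagram of the architecture of another task distribution system provided by an embodiment of the present disclosure;
Fig. 2 is a flowchart of sending the second task to be distributed to the second main distribution server in the task distribution system provided by an embodiment of the present disclosure;
Fig. 3 is a flowchart of sending the second task status queue to the second main distribution server in the task distribution system provided by an embodiment of the present disclosure;
Fig. 4 is a flowchart of adjusting the allocation of the tasks to be distributed in the task distribution system provided by an embodiment of the present disclosure;
Fig. 5 is a flowchart of task distribution by the second main distribution server when the first main distribution server in the task distribution system is abnormal;
Fig. 6 is a flowchart of communication between the second main distribution server and the client when the first main distribution server in the task distribution system is abnormal;
Fig. 7 is a flowchart of task distribution by the first main distribution server when any first sub-distribution server is abnormal;
Fig. 8 is a flowchart of task distribution by any main distribution server when communication between the main distribution servers in the task distribution system is abnormal;
Fig. 9 is a flowchart of task distribution and sending of task execution results by the third sub-distribution server when communication between the main distribution servers in the task distribution system is abnormal;
Fig. 10 is a flowchart of a task distribution method provided by an embodiment of the present disclosure;
Fig. 11 is a schematic diagram of a task distribution apparatus provided by an embodiment of the present disclosure;
Fig. 12 is a schematic diagram of a computer device provided by an embodiment of the present disclosure.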
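Detailed Description
To make the embodiments of the present disclosure clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present disclosure. The components of the embodiments described and shown in the drawings can generally be arranged and designed in various configurations. The following detailed description of the embodiments provided in the drawings is therefore not intended to limit the claimed scope of the present disclosure, but merely represents selected embodiments. All other embodiments obtained by those skilled in the art on the basis of these embodiments without creative effort fall within the protection scope of the present disclosure.
Note that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it need not be further defined or explained in subsequent drawings.
The term "and/or" herein merely describes an association and indicates that three relationships may exist; for example, "A and/or B" may mean A alone, both A and B, or B alone. In addition, the term "at least one" herein means any one of several items or any combination of at least two of them; for example, "including at least one of A, B, and C" may mean including any one or more elements selected from the set consisting of A, B, and C.
Research has found that, in a large-scale cluster management system, a server responsible for task distribution generally receives task requests from users and then distributes the tasks to the servers in the cluster that process them.
As the cluster grows, the processing pressure on the server responsible for task distribution also increases, and distribution efficiency affects task processing efficiency.
Based on the above research, the present disclosure provides a task distribution system, method, and apparatus, a computer device, and a storage medium, which use a first main distribution server, at least one second main distribution server, and a plurality of task execution servers to build a multi-path task distribution system in which several main distribution servers distribute tasks at the same time. Compared with a single-path task distribution system, this improves task distribution efficiency and also reduces the probability that a failure at one point brings down the whole system, which improves the reliability and fault tolerance of the system. Because the first main distribution server allocates tasks to the second main distribution server, the probability of disordered task distribution is reduced. Because several main distribution servers distribute tasks at the same time, the waste of computing resources caused by keeping a main distribution server on cold standby is reduced, and task distribution efficiency is improved.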
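To facilitate understanding of this embodiment, the architecture of the task distribution system disclosed in the embodiments is first described in detail; the task distribution system consists of clients and servers.
Fig. 1a is a schematic diagram of the architecture of a task distribution system provided by an embodiment of the present disclosure, which includes a first main distribution server 11, at least one second main distribution server 12, a plurality of execution servers 13, and a client 14, wherein:
The first main distribution server 11 is configured to: receive tasks to be distributed sent by the client 14; determine, from the tasks to be distributed, a first task to be distributed by the first main distribution server 11 and a second task to be distributed by the second main distribution server 12; send the first task to be distributed to a first execution server 131, and send the second task to be distributed to the second main distribution server 12; and receive a first task execution result sent by the first execution server 131 and a second task execution result sent by the second main distribution server 12, and send the first task execution result and the second task execution result to the corresponding clients 14;
Here, the client 14 may include a plurality of clients, each of which may send tasks to be distributed to the first main distribution server 11. Sending the first and second task execution results to the corresponding clients 14 means sending each task execution result to the client that requested the task: the first task execution result goes to the client that requested the first task, and the second task execution result goes to the client that requested the second task.
The second main distribution server 12 is configured to: receive the second task to be distributed sent by the first main distribution server 11, and send it to a second execution server 132; and receive the second task execution result sent by the second execution server 132, and send it to the first main distribution server 11.
When tasks are sent to the corresponding execution servers, in one possible implementation the execution servers connected to the main distribution servers can all execute the same task types, so tasks can be distributed according to the current task execution load of each execution server. In another possible implementation the execution servers can execute different task types, for example certain tasks can only run on specific execution servers, so tasks can be distributed according to their task type.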
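The two dispatch policies described above can be sketched in Python as follows. This is only an illustration under assumptions: the server names, the `load` counts, and the `capabilities` mapping are hypothetical and do not appear in the disclosure, which does not prescribe a concrete selection rule.

```python
from typing import Dict, Optional, Set


def pick_by_load(load: Dict[str, int]) -> str:
    """Policy 1: every execution server can run every task type,
    so choose the server with the fewest running tasks."""
    return min(load, key=lambda server: load[server])


def pick_by_type(task_type: str,
                 capabilities: Dict[str, Set[str]],
                 load: Dict[str, int]) -> Optional[str]:
    """Policy 2: only some execution servers can run a given task type,
    so filter by capability first and then prefer the least loaded one."""
    candidates = [s for s, types in capabilities.items() if task_type in types]
    if not candidates:
        return None  # no execution server can run this task type
    return min(candidates, key=lambda server: load[server])


# Hypothetical cluster state (assumed for illustration only).
load = {"exec-1": 4, "exec-2": 1, "exec-3": 2}
capabilities = {
    "exec-1": {"train"},
    "exec-2": {"render"},
    "exec-3": {"train", "render"},
}

chosen_any = pick_by_load(load)
chosen_train = pick_by_type("train", capabilities, load)
```

In this sketch the type-based policy still falls back to load among the capable servers; that tie-breaking choice is an assumption, not something the text specifies.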
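In practice, once the cluster management system reaches a certain scale, the number of devices (execution servers) that each main distribution server can connect to directly is limited. The task distribution system therefore needs to deploy relay devices that, after receiving tasks distributed by a main distribution server, forward them to the execution servers, which relieves the connection pressure on the main distribution server. In addition, adding such relay devices with execution servers attached increases the task processing capacity of the whole system and thus improves its task distribution performance.
Fig. 1b is a schematic diagram of the architecture of another task distribution system provided by an embodiment of the present disclosure, which includes a first main distribution server 11, at least one second main distribution server 12, a plurality of execution servers 13, and sub-distribution servers 15, wherein:
The first main distribution server 11 is configured to: receive tasks to be distributed sent by the client 14; determine, from the tasks to be distributed, a first task to be distributed by the first main distribution server 11 and a second task to be distributed by the second main distribution server 12; send the first task to be distributed to a first sub-distribution server 151 connected to it, which forwards the task to the first execution server 131, and send the second task to be distributed to the second main distribution server 12; and receive a first task execution result sent by the first execution server 131 and a second task execution result sent by the second main distribution server 12, and send the first task execution result and the second task execution result to the corresponding clients 14;
The second main distribution server 12 is configured to: receive the second task to be distributed sent by the first main distribution server 11, and send it to a second sub-distribution server 152 connected to it, which forwards the task to the second execution server 132; and receive the second task execution result sent by the second execution server 132, and send it to the first main distribution server 11;
The sub-distribution server 15 is configured to: forward tasks to be distributed sent by a main distribution server to the corresponding execution server 13, where the execution server 13 corresponding to a task is an execution server 13 that can execute that task; and forward task execution results sent by the execution server 13 to the corresponding main distribution server. The main distribution servers include the first main distribution server 11 and the second main distribution server 12. The sub-distribution servers 15 include the first sub-distribution server 151, connected to the first main distribution server 11 and the first execution server 131, and the second sub-distribution server 152, connected to the second main distribution server 12 and the second execution server 132.
Hereinafter, "first main distribution server 11" is abbreviated as "first main distribution server"; "second main distribution server 12" as "second main distribution server"; "first execution server 131" as "first execution server"; "second execution server 132" as "second execution server"; "execution server 13" as "execution server"; "client 14" as "client"; "sub-distribution server 15" as "sub-distribution server"; "first sub-distribution server 151" as "first sub-distribution server"; and "second sub-distribution server 152" as "second sub-distribution server".
In practice, a sub-distribution server and the execution servers connected to it are usually deployed at the same location (for example, the same machine room), so communication between them is available. Those execution servers are usually at a different location from the other sub-distribution servers and therefore cannot reach them. A sub-distribution server is connected to its main distribution server over a network. In practice the same sub-distribution server may establish communication with several main distribution servers, so a sub-distribution server connected to several main distribution servers is both a first sub-distribution server and a second sub-distribution server.
Here, sub-distribution servers are classified by which main distribution server they can connect to: those that can connect to the first main distribution server are first sub-distribution servers, and those that can connect to the second main distribution server are second sub-distribution servers. For example, suppose the task distribution system has 10 sub-distribution servers in total, 6 of which are connected to the first main distribution server and 8 of which are connected to the second main distribution server. Then there are 6 first sub-distribution servers and 8 second sub-distribution servers. 4 of them can connect to both the first and the second main distribution server, and these 4 are third sub-distribution servers. The first sub-distribution servers include the third sub-distribution servers, and so do the second sub-distribution servers.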
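The classification above is plain set arithmetic, which a short Python sketch can make concrete. The server names `sub1` to `sub10` and the particular way the 6 and 8 servers overlap are assumptions chosen to match the counts in the example.

```python
# Hypothetical names: sub1..sub10 are the ten sub-distribution servers.
first_connected = {f"sub{i}" for i in range(1, 7)}    # 6 reach the first main server
second_connected = {f"sub{i}" for i in range(3, 11)}  # 8 reach the second main server

# Third sub-distribution servers are those connected to both main servers.
third = first_connected & second_connected
all_subs = first_connected | second_connected

# Inclusion-exclusion: 6 + 8 - 10 = 4 servers must be shared.
shared_count = len(first_connected) + len(second_connected) - len(all_subs)
```

With 10 servers in total, the overlap of 4 follows from inclusion-exclusion, provided every sub-distribution server is connected to at least one main distribution server.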
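In addition, the execution servers in Fig. 1a are classified in a similar way to the sub-distribution servers in Fig. 1b, which is not elaborated here.
For simplicity and clarity, Fig. 1a and Fig. 1b only sketch the architecture of the task distribution system provided by the embodiments and emphasize its multi-path distribution architecture; for the specific connections between the parts, refer to the description above.
In the embodiments of the present disclosure, the execution servers include, during task distribution, a first execution server connected to the first main distribution server and a second execution server connected to the second main distribution server. There may be one or more clients, as long as they are in the same communication network as the first main distribution server (that is, can reach it). There may also be one or more second main distribution servers; the number can be set according to actual needs such as the order of magnitude of the tasks to distribute and time requirements, and once the number is set, the corresponding task distribution logic is configured. For example, the number of second main distribution servers may be set to 1.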
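In a possible implementation, as shown in Fig. 2, after receiving the tasks to be distributed sent by the client, the first main distribution server may send the second task to be distributed to the second main distribution server through the following steps:
S201: Store the tasks to be distributed in a task queue.
Here, the task queue stores tasks that have not yet been distributed; once a task has been distributed, it can be removed from the task queue.
For example, the task queue may be as shown in Table 1:
Table 1
Task 1
Task 2
Task 3
In Table 1, Task 1, Task 2, and Task 3 are tasks that have not yet been distributed.
S202: After the second task to be distributed is determined, mark it in the task queue.
Here, after the tasks are received, the second task to be distributed can be determined according to several allocation factors, such as the execution permission of each task, the connections between the distribution servers and the execution servers, and the load of each main distribution server.
The execution permission of a task indicates the permission it uses when executed; when the required permission is high, a distribution server or execution server holding a lower permission cannot distribute or execute the task. A task may be executed by calling data in a specific database, and that database is connected to a particular execution server in the task distribution system, so the task must be executed by that execution server; the connections between the distribution servers and the execution servers therefore need to be determined so that the task can be routed through the right distribution server to an execution server able to execute it. To coordinate the workload of the main distribution servers and reduce the waste of resources that arises when some main distribution servers distribute tasks while others are idle, tasks can be allocated according to load. For example, if there are 1000 tasks to be distributed and neither the first main distribution server A nor the second main distribution server B has any pending task (load 0), it can be determined that there are 500 second tasks to be distributed, that is, A and B each distribute 500 tasks, which balances the load between the main distribution servers.
Specifically, marking the second task to be distributed means marking the tasks that need to be distributed by the second main distribution server.
For example, the marked task queue may be as shown in Table 2:
Table 2
Task 1 2
Task 2 2
Task 3 2
In Table 2, the first row indicates that Task 1 is to be distributed by the second main distribution server; the second row indicates that Task 2 is to be distributed by the second main distribution server; the third row indicates that Task 3 is to be distributed by the second main distribution server.
S203: Send the marked task queue to the second main distribution server.
Here, the task queue can be sent with the data synchronization and backup tool rsync (remote synchronize), so that it is sent securely and quickly.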
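Steps S201 to S203 can be illustrated with a small Python sketch of a marked task queue. It is a sketch under assumptions: the `QueuedTask` type, the `owner` field, and the greedy "give the task to the less loaded server" rule are hypothetical, and only the load factor is modeled (permissions and connectivity are ignored).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class QueuedTask:
    name: str
    # Mark column of the task queue: None = not yet allocated,
    # 1 = first main distribution server, 2 = second main distribution server.
    owner: Optional[int] = None


def mark_tasks(queue: List[QueuedTask], load_first: int, load_second: int) -> None:
    """S202: mark each unallocated task for whichever main server has less load."""
    for task in queue:
        if task.owner is not None:
            continue
        if load_second < load_first:
            task.owner = 2
            load_second += 1
        else:
            task.owner = 1
            load_first += 1


# S201: store the received tasks in the task queue.
queue = [QueuedTask(f"task{i}") for i in range(1, 1001)]

# S202: both main servers start with load 0, as in the 1000-task example.
mark_tasks(queue, load_first=0, load_second=0)
second_tasks = [t for t in queue if t.owner == 2]

# S203 would then ship the whole marked queue to the second main server,
# for example as a file synchronized with rsync.
```

With equal starting loads the greedy rule alternates between the two servers, which reproduces the 500/500 split from the example above.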
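In practice, after the second main distribution server distributes tasks according to the marked task queue, it receives the second task execution results sent by the execution servers. Because the client communicates with only one main distribution server at a time (the first main distribution server), the second task execution results need to be sent to the first main distribution server.
Specifically, when the client communicates with the first main distribution server (sending tasks to be distributed and receiving task execution results), the communication can go through a virtual IP address: the client sends tasks to the virtual IP address and receives the corresponding task execution results from it. To keep the task distribution system running normally, the virtual IP address can be used by only one main distribution server at any moment to establish communication with the client; while the first main distribution server is running normally, the virtual IP address is generally used by the first main distribution server.
In a possible implementation, the second main distribution server may send the second task execution results to the first main distribution server through a task status queue. Specifically, the first main distribution server may receive and store a first task status queue sent by the second main distribution server at a preset time interval, where the first task status queue includes second tasks that have been distributed and whose second task execution results have been received, second tasks that have been distributed and whose second task execution results have not been received, and the second task execution results that have been received.
Specifically, in the first task status queue, the second tasks distributed with results received and the second tasks distributed without results received indicate the distribution progress of the second tasks, which supports subsequent real-time load balancing based on the distribution progress of each main distribution server.
In practice, the first task status queue can also serve as a data backup, which reduces the probability that the first main distribution server cannot obtain the status of the second tasks when the second main distribution server is abnormal; correspondingly, the second task status queue of the first main distribution server can also serve as a data backup.
Data backup here can be understood as follows: when the second main distribution server goes down, some tasks may have been distributed without their task execution results having been received, and an execution server that obtains such a result would return it only to the second main distribution server, so the first main distribution server cannot obtain the results of these tasks. Because the second main distribution server has already synchronized the first task status queue to the first main distribution server, the first main distribution server can redistribute these tasks, which ensures that task execution results are not lost when the second main distribution server is down.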
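A task status queue with the three parts described above can be modeled as a small data structure. The following Python sketch is illustrative only: the class name `TaskStatusQueue`, its field names, and the use of an in-memory deep copy in place of the periodic network synchronization are all assumptions.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class TaskStatusQueue:
    completed: Set[str] = field(default_factory=set)       # distributed, result received
    in_flight: Set[str] = field(default_factory=set)       # distributed, no result yet
    results: Dict[str, str] = field(default_factory=dict)  # received execution results

    def mark_distributed(self, task_id: str) -> None:
        self.in_flight.add(task_id)

    def record_result(self, task_id: str, result: str) -> None:
        self.in_flight.discard(task_id)
        self.completed.add(task_id)
        self.results[task_id] = result

    def snapshot(self) -> "TaskStatusQueue":
        """Stand-in for the copy sent to the peer at each preset interval."""
        return copy.deepcopy(self)


# The second main server distributes three tasks and gets one result back.
second_main_status = TaskStatusQueue()
for tid in ("t1", "t2", "t3"):
    second_main_status.mark_distributed(tid)
second_main_status.record_result("t1", "ok")

# Periodic sync: the first main server stores the latest snapshot.
backup_on_first_main = second_main_status.snapshot()

# If the second main server now goes down, the first main server can read
# the tasks still awaiting results from its backup and redistribute them.
to_redistribute = sorted(backup_on_first_main.in_flight)
```

Because the backup is a copy, it stays usable after the second main distribution server fails, and the result already received for `t1` is preserved as well.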
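In a possible implementation, as shown in Fig. 3, the first main distribution server may likewise send a second task status queue to the second main distribution server through the following steps:
S301: Update the second task status queue based on the task status of the first tasks that have been distributed.
Here, the task status is either "distributed, result received" or "distributed, result not received". The second task status queue is similar to the first task status queue and includes first tasks that have been distributed and whose first task execution results have been received, and first tasks that have been distributed and whose first task execution results have not been received.
In a possible implementation, the second task status queue may also include the first task execution results that have been received, so that they are sent to the second main distribution server and are not lost if the first main distribution server becomes abnormal, which completes the data backup of the first task execution results.
S302: Send the second task status queue to the second main distribution server at a preset time interval.
Here, the preset time interval may be very short, so that the second task status queue is sent quickly and frequently. Specifically, it can be sent with the data synchronization and backup tool rsync, so that the second task status queue is sent securely and quickly.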
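In practice, because several main distribution servers distribute tasks at the same time and factors such as server performance differ, their distribution progress often diverges. To improve distribution efficiency, the tasks allocated to different main distribution servers can therefore be adjusted dynamically to balance the load among the main distribution servers.
In a possible implementation, as shown in Fig. 4, the first main distribution server may adjust the allocation of the tasks to be distributed through the following steps:
S401: Determine, based on the first task status queue and the marked task queue, the second tasks that the second main distribution server has not yet distributed.
Here, the marked task queue contains the second tasks previously allocated to the second main distribution server, and the first task status queue indicates the status of the tasks that the second main distribution server has already distributed; the second tasks not yet distributed can therefore be determined from the two.
For example, if the task queue contains 50 marked second tasks and, according to the first task status queue, 20 of them have been distributed with results received and 20 have been distributed without results received, then the second tasks not yet distributed are the remaining 10.
Alternatively, in another possible implementation, the first main distribution server may receive from the second main distribution server a task queue containing the second tasks not yet distributed. This task queue may be the one that the first main distribution server sent to the second main distribution server; after distributing any second task, the second main distribution server can delete it from the task queue.
S402: Update the task queue and the marks in it based on the first tasks not yet distributed and the second tasks not yet distributed.
Here, because the first tasks are distributed by the first main distribution server itself, it can determine which first tasks have not yet been distributed.
Because the task queue stores undistributed tasks, it is updated in real time as distribution proceeds. Specifically, distributed second tasks can be removed from the task queue according to the first task status queue, and distributed first tasks can be removed according to the second task status queue, which completes the update of the task queue. The marks in the task queue can be updated when the numbers of undistributed first and second tasks differ considerably (for example, when their difference exceeds a preset value or their ratio exceeds a preset ratio); in that case the marks are adjusted according to the numbers of undistributed first and second tasks, so that the tasks are load-balanced among the main distribution servers.
For example, suppose there are 30 undistributed first tasks and 10 undistributed second tasks. Their ratio is 3:1, which exceeds the preset ratio of 2:1, so 10 of the first tasks can be reallocated to the second main distribution server (marked so that they become second tasks), leaving 20 undistributed tasks on each side and balancing the load between the first and second main distribution servers.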
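The bookkeeping in S401 and S402 can be sketched in Python using the numbers from the two examples. The function names and the "move half of the difference" rule are assumptions; the text only says that marks are adjusted when the ratio exceeds a preset value.

```python
from typing import List, Set, Tuple


def undistributed_second(marked: Set[str],
                         completed: Set[str],
                         in_flight: Set[str]) -> Set[str]:
    """S401: marked second tasks minus everything the peer already distributed."""
    return marked - completed - in_flight


def rebalance(first_pending: List[str],
              second_pending: List[str],
              max_ratio: float = 2.0) -> Tuple[List[str], List[str]]:
    """S402: if one side holds more than max_ratio times the other,
    move half of the difference to the shorter side."""
    first, second = list(first_pending), list(second_pending)
    longer, shorter = (first, second) if len(first) >= len(second) else (second, first)
    if len(longer) > max_ratio * len(shorter):
        for _ in range((len(longer) - len(shorter)) // 2):
            shorter.append(longer.pop())  # re-mark the task for the other server
    return first, second


# 50 marked second tasks: 20 completed, 20 in flight, so 10 remain.
marked = {f"s{i}" for i in range(50)}
completed = {f"s{i}" for i in range(20)}
in_flight = {f"s{i}" for i in range(20, 40)}
remaining = undistributed_second(marked, completed, in_flight)

# 30 undistributed first tasks against 10 second tasks: 3:1 exceeds 2:1.
first, second = rebalance([f"f{i}" for i in range(30)], sorted(remaining))
```

The comparison `len(longer) > max_ratio * len(shorter)` avoids a division, so the check also works when one side has no pending tasks at all.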
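S403: Send the updated task queue to the second main distribution server, so that the second main distribution server distributes tasks based on the updated task queue.
Here, the updated task queue can be sent with the data synchronization and backup tool rsync, so that it is sent securely and quickly. The update process can be repeated at a preset interval, which can be very short (on the order of milliseconds), so that the loads of the first and second main distribution servers stay in dynamic balance.
In a possible implementation, a task in a task status queue that has been distributed but whose task execution result has not been received can, once certain conditions are met, be added back to the task queue and redistributed.
For example, a preset duration can be set for the distributed tasks without results in each main distribution server's task status queue. If no result has been received when the preset duration expires, the task is regarded as having failed to be distributed or executed (that is, as not distributed) and is added back to the task queue, where it can be redistributed by the same main distribution server or by another one.
For example, the preset duration may be set to 5 minutes. If the first main distribution server A has not received the first task execution result 5 minutes after distributing a first task, the task can be moved from the second task status queue back to the task queue and marked for redistribution by main distribution server A or B.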
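The timeout rule can be sketched as follows. The function `requeue_expired`, the dictionary of dispatch timestamps, and the use of plain seconds are hypothetical; only the 5-minute duration comes from the example above.

```python
from typing import Dict, List

TIMEOUT_SECONDS = 5 * 60  # the 5-minute preset duration from the example


def requeue_expired(in_flight: Dict[str, float],
                    task_queue: List[str],
                    now: float,
                    timeout: float = TIMEOUT_SECONDS) -> List[str]:
    """Move tasks whose result is overdue from the status queue back to the
    task queue, where any main distribution server may pick them up again.

    in_flight maps a task id to the time (in seconds) it was distributed.
    """
    expired = [tid for tid, sent_at in in_flight.items() if now - sent_at > timeout]
    for tid in expired:
        del in_flight[tid]      # no longer counted as distributed
        task_queue.append(tid)  # treated as undistributed again
    return expired


# Hypothetical state: "a" was distributed at t=0 s, "b" at t=200 s.
in_flight = {"a": 0.0, "b": 200.0}
task_queue: List[str] = []
expired = requeue_expired(in_flight, task_queue, now=301.0)
```

A real deployment would take `now` from a monotonic clock; it is passed in explicitly here so that the behavior is easy to check.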
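In practice, devices or the network in the task distribution system may become abnormal and interrupt task distribution. Exception handling schemes therefore need to be set for the different kinds of abnormality, so that the system can still distribute tasks and collect task execution results. The schemes for the different exception types are described below.
Exception type 1: a main distribution server is abnormal
Here, this covers the first main distribution server being abnormal, the second main distribution server being abnormal, and both being abnormal at the same time. When both are abnormal at the same time, the task distribution system cannot distribute tasks at all; in that case a preset monitoring tool can report the abnormality to the corresponding monitoring client. The monitoring tool can monitor the working state of each device in the task distribution system and the connectivity of the network in real time.
The following focuses on the two cases in which the first main distribution server or the second main distribution server is abnormal. The abnormality may be any situation, such as a server crash, in which tasks cannot be distributed normally.
In a possible implementation, when the first main distribution server in the task distribution system is abnormal, as shown in Fig. 5, the second main distribution server may continue task distribution through the following steps:
S501: Determine, based on the received second task status queue, the first tasks that have not been distributed and the first tasks in the second task status queue that have been distributed but whose first task execution results have not been received.
Here, because task status queues are sent at very short intervals, when the first main distribution server becomes abnormal the second main distribution server can detect this promptly and, based on the most recently received second task status queue, determine the tasks that it now needs to distribute. These include the first tasks that the first main distribution server had not distributed before the abnormality and the first tasks that it had distributed but whose execution results had not been received.
S502: Distribute the undistributed first tasks and the first tasks in the second task status queue that have been distributed but whose first task execution results have not been received to the execution servers.
In this way, the distribution of tasks that the first main distribution server could not complete continues, which reduces the probability of losing tasks and task execution results and improves the fault tolerance of the system.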
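Steps S501 and S502 amount to combining two sources of information held by the surviving server. A minimal Python sketch, assuming the task queue is a list of `(task, owner)` pairs and the last synchronized status queue exposes its in-flight tasks as a set (both representations are hypothetical):

```python
from typing import Iterable, List, Tuple


def tasks_to_take_over(task_queue: List[Tuple[str, int]],
                       peer_in_flight: Iterable[str],
                       failed_owner: int) -> List[str]:
    """S501: collect what the failed main server still owed.

    task_queue holds (task, owner) pairs for tasks not yet distributed;
    peer_in_flight comes from the last status queue received from the
    failed server (distributed, result not received).
    """
    undistributed = [name for name, owner in task_queue if owner == failed_owner]
    return undistributed + sorted(peer_in_flight)


# State held by the second main server when the first one (owner 1) fails.
task_queue = [("task4", 1), ("task5", 2), ("task6", 1)]
second_status_in_flight = {"task2", "task1"}

# S502 would then distribute every task in this list to execution servers.
takeover = tasks_to_take_over(task_queue, second_status_in_flight, failed_owner=1)
```

Tasks dispatched after the last synchronization are not visible to the survivor in this sketch, which is why the text stresses a very short synchronization interval.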
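In practice, when the first main distribution server is abnormal, the second main distribution server can take over communication with the client from it, so that the task distribution system can keep receiving and distributing tasks and collecting execution results until maintenance staff have repaired the first main distribution server.
In a possible implementation, when the first main distribution server in the task distribution system is abnormal, as shown in Fig. 6, the second main distribution server may also communicate with the client through the following steps:
S601: Receive a third task to be distributed sent by the client, and distribute it to the second execution server.
Here, when the client communicates with the first main distribution server (sending tasks to be distributed and receiving task execution results), the communication can go through a virtual IP address: the client sends tasks to the virtual IP address and receives the corresponding task execution results from it. To keep the task distribution system running normally, the virtual IP address can be used by only one main distribution server at any moment to establish communication with the client. When the first main distribution server becomes abnormal, the second main distribution server can take over the virtual IP address and establish communication with the client. The third task to be distributed is a new task sent by the client after the first main distribution server became abnormal.
Further, when the architecture of the task distribution system is as shown in Fig. 1b and includes sub-distribution servers, the second main distribution server can, when the first main distribution server becomes abnormal, first determine which sub-distribution servers it can connect to and then distribute the third task through them.
S602: Receive the third task execution result sent by the second execution server, and send it to the corresponding client.
In this way, the second main distribution server establishes communication with the client so that subsequent task distribution can continue, which improves the fault tolerance of the task distribution system.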
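The "one holder at a time" rule for the virtual IP address can be modeled in a few lines of Python. This is a toy in-memory model, not a network configuration: the class `VirtualIP`, the server names, and the address `192.0.2.10` (taken from a range reserved for documentation) are assumptions, and the disclosure does not say which mechanism actually moves the address between servers.

```python
class VirtualIP:
    """Toy model of a virtual IP that exactly one main server may hold."""

    def __init__(self, address: str, holder: str) -> None:
        self.address = address
        self.holder = holder

    def take_over(self, candidate: str, holder_alive: bool) -> bool:
        """Hand the address to `candidate` only if the current holder is down."""
        if holder_alive or candidate == self.holder:
            return False
        self.holder = candidate
        return True


vip = VirtualIP("192.0.2.10", holder="main-1")

# While the first main server is healthy, the takeover is refused.
refused = vip.take_over("main-2", holder_alive=True)

# Once the first main server is detected as abnormal, the second takes over
# and clients keep using the same address.
granted = vip.take_over("main-2", holder_alive=False)
```

The client never learns which main distribution server answers; it only ever talks to the virtual address.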
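In another possible implementation, when the second main distribution server in the task distribution system is abnormal, the first main distribution server may determine, based on the received first task status queue, the second tasks that have not been distributed and the second tasks in the first task status queue that have been distributed but whose second task execution results have not been received, and then distribute both groups of second tasks to the execution servers.
Here, for the details of how the first main distribution server handles an abnormal second main distribution server, refer to the description above of the operations performed by the second main distribution server when the first main distribution server is abnormal, which is not repeated here. However, because the client communicates with the first main distribution server, the first main distribution server does not need to re-establish communication with the client when the second main distribution server is abnormal.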
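Exception type 2: a sub-distribution server is abnormal
Here, this covers a first sub-distribution server being abnormal, a second sub-distribution server being abnormal, and first and second sub-distribution servers being abnormal at the same time. When the first and second sub-distribution servers are abnormal at the same time, the task distribution system cannot distribute tasks at all; in that case the main distribution server connected to the sub-distribution server can report the abnormality to the corresponding client. If the sub-distribution server is connected to several main distribution servers, any one of them, together with the preset monitoring tool, can report the abnormality to the corresponding monitoring client.
In practice, when the task distribution system includes sub-distribution servers, each main distribution server, besides distributing tasks and receiving task execution results, also needs to monitor and store the running state of each sub-distribution server connected to it. The running state includes the load of the sub-distribution server and whether it is alive, so that measures can be taken promptly when a sub-distribution server becomes abnormal and the resulting loss is reduced.
The following focuses on the two cases in which a first sub-distribution server or a second sub-distribution server is abnormal. The abnormality may be any situation, such as a server crash, in which tasks cannot be forwarded normally.
In a possible implementation, when any first sub-distribution server is abnormal, as shown in Fig. 7, the first main distribution server may distribute tasks through the following steps:
S701: Update the stored running state of each first sub-distribution server.
Here, when the running state of each first sub-distribution server is stored, a liveness list can be kept for the first sub-distribution servers; the liveness list indicates whether each first sub-distribution server is alive.
For example, the liveness list may be as shown in Table 3:
Table 3
Sub-distribution server 1 Alive
Sub-distribution server 2 Alive
Sub-distribution server 3 Abnormal
In Table 3, the first row indicates that sub-distribution server 1 is currently alive; the second row indicates that sub-distribution server 2 is currently alive; the third row indicates that sub-distribution server 3 is currently abnormal. The liveness list can be updated in real time according to the monitoring results.
S702: Determine, among the first tasks distributed to the abnormal first sub-distribution server, the first tasks whose first task execution results have not been received.
Continuing the example above, when sub-distribution server 3 is detected to be abnormal, the first tasks that were previously forwarded through sub-distribution server 3 and whose execution results have not been received within a preset period can be determined.
S703: Redistribute the first tasks whose first task execution results have not been received.
In this way, the main distribution server detects the abnormal sub-distribution server promptly and redistributes its unfinished tasks, which improves the fault tolerance of the task distribution system.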
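Steps S701 to S703 can be sketched with a liveness dictionary mirroring Table 3. The names `liveness`, `routed`, and `received` are hypothetical bookkeeping structures on the first main distribution server; the preset waiting period from the example is left out for brevity.

```python
from typing import Dict, List, Set


def handle_sub_failure(failed: str,
                       liveness: Dict[str, str],
                       routed: Dict[str, str],
                       received: Set[str]) -> List[str]:
    """S701: record the failure; S702: find tasks that went through the
    failed sub-distribution server and still have no result."""
    liveness[failed] = "abnormal"
    return [task for task, sub in routed.items()
            if sub == failed and task not in received]


# Liveness list before the failure is detected.
liveness = {"sub1": "alive", "sub2": "alive", "sub3": "alive"}
# Which sub-distribution server forwarded each distributed first task.
routed = {"task1": "sub1", "task2": "sub3", "task3": "sub3"}
# Tasks whose first task execution result has already come back.
received = {"task2"}

# S703 would redistribute these through a sub-distribution server still alive.
to_redistribute = handle_sub_failure("sub3", liveness, routed, received)
alive_subs = [s for s, state in liveness.items() if state == "alive"]
```

Only `task3` needs to be sent again, because `task2` finished before the failure even though it used the same sub-distribution server.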
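In another possible implementation, when any second sub-distribution server is abnormal, the second main distribution server may update the stored running state of each second sub-distribution server; determine, among the second tasks distributed to the abnormal second sub-distribution server, the second tasks whose second task execution results have not been received; and redistribute those second tasks.
Here, for the details of how the second main distribution server handles an abnormal second sub-distribution server, refer to the description above of how the first main distribution server handles an abnormal first sub-distribution server, which is not repeated here.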
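Exception type 3: communication between the main distribution servers is abnormal
Here, the main distribution servers need to synchronize task status, that is, which tasks have been distributed and which have not, in order to coordinate orderly task distribution. When communication is abnormal, task status can no longer be synchronized, and each main distribution server cannot tell whether the others are alive. To reduce the probability of tasks being missed, each main distribution server needs to distribute all undistributed tasks in its own task queue. As a result, several main distribution servers may distribute the same task, which could lead to wrong task execution results and a crash of the task distribution system.
In a possible implementation, when communication between the main distribution servers in the task distribution system is abnormal, as shown in Fig. 8, any main distribution server may distribute tasks through the following steps:
S801: Determine the tasks that have not yet been distributed, and the tasks that have been distributed but whose task execution results have not been received.
Here, because task status is synchronized at very short intervals, when communication between the main distribution servers becomes abnormal each main distribution server can detect this promptly and, based on the task status at the most recent synchronization, determine the tasks that have not yet been distributed and the tasks that have been distributed but whose task execution results have not been received.
S802: Distribute the tasks that have not yet been distributed and the tasks that have been distributed but whose task execution results have not been received.
Here, to reduce the probability of tasks being missed, each main distribution server distributes these tasks even though the same task may be distributed more than once.
In a specific application, the sub-distribution servers include third sub-distribution servers connected to several main distribution servers, and both the first and the second sub-distribution servers include the third sub-distribution servers. Therefore, after the main distribution servers distribute tasks according to step S802, a third sub-distribution server may receive the same task from several main distribution servers. If the third sub-distribution server distributed that task several times, the task would be executed several times, which could cause confusion in task execution and wrong task execution results. An exception handling mechanism is therefore needed for this situation.
In a possible implementation, when communication between the main distribution servers in the task distribution system is abnormal, as shown in Fig. 9, the third sub-distribution server may distribute tasks and send task execution results through the following steps:
S901: Send the same task to an execution server and receive the task execution result sent by the execution server; and determine the target number, that is, the number of sub-distribution servers connected to each of the plurality of main distribution servers.
Here, the same task needs to be sent to the execution server only once, not repeatedly. To determine the target number for each main distribution server, the third sub-distribution server can use the liveness lists in the main distribution servers it is connected to. For example, if the liveness list in the first main distribution server contains 10 alive first sub-distribution servers, the target number is 10.
Specifically, a main distribution server may send its liveness list to the third sub-distribution server together with the task to be distributed; alternatively, the third sub-distribution server may obtain the list by sending a request to the main distribution servers it is connected to after receiving the same task.
S902: Send the task execution result corresponding to the same task to the main distribution server with the largest target number.
In this way, ensuring that the same task is distributed and executed only once, and that its task execution result is returned only once, reduces the probability of wrong task execution results and of a crash of the task distribution system caused by repeated distribution, which improves the fault tolerance of the task distribution system.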
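Steps S901 and S902 combine de-duplication with a "largest liveness list wins" rule. The following Python sketch illustrates both under assumptions: the class `ThirdSubDistributor`, the `alive_subs` argument (standing in for the size of the sender's liveness list), and the injected `execute` callback are hypothetical, and ties between main distribution servers are resolved in favor of the first sender, which the disclosure does not address.

```python
from typing import Callable, Dict, List


class ThirdSubDistributor:
    """Sub-distribution server connected to several main distribution servers."""

    def __init__(self, execute: Callable[[str], str]) -> None:
        self._execute = execute
        self._results: Dict[str, str] = {}
        # task id -> {main server id -> number of alive sub-distribution servers}
        self._senders: Dict[str, Dict[str, int]] = {}

    def receive(self, task_id: str, main_id: str, alive_subs: int) -> None:
        """S901: remember the sender, but forward each task only once."""
        self._senders.setdefault(task_id, {})[main_id] = alive_subs
        if task_id not in self._results:
            self._results[task_id] = self._execute(task_id)

    def result_target(self, task_id: str) -> str:
        """S902: the result goes to the sender with the largest target number."""
        counts = self._senders[task_id]
        return max(counts, key=lambda main_id: counts[main_id])

    def result(self, task_id: str) -> str:
        return self._results[task_id]


executions: List[str] = []


def fake_execute(task_id: str) -> str:
    """Stand-in for forwarding to an execution server and awaiting the result."""
    executions.append(task_id)
    return f"result-of-{task_id}"


sub = ThirdSubDistributor(fake_execute)
# Both main servers send the same task because they cannot see each other.
sub.receive("task7", "main-1", alive_subs=6)
sub.receive("task7", "main-2", alive_subs=8)
target = sub.result_target("task7")
```

The `executions` list records how often the task actually reached an execution server, which is the property the exception handling mechanism is meant to guarantee.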
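In practice, if the main distribution server with the largest target number is not the one that previously communicated with the client, it can take over from that server, so as to receive new tasks to be distributed and send task execution results.
In summary, when the whole task distribution system runs normally, the system flow includes the following steps:
Step 1: The client sends tasks to be distributed to the first main distribution server.
Step 2: After receiving the tasks, the first main distribution server adds them to the task queue, allocates them by determining the first and second tasks to be distributed in the task queue, marks the task queue, and sends the marked task queue to the second main distribution server.
Step 3: The first and second main distribution servers send the first and second tasks to be distributed, respectively, to the corresponding sub-distribution servers, which forward them to the corresponding execution servers.
Step 4: The first and second main distribution servers receive the task execution results that the execution servers send through the sub-distribution servers; the second main distribution server sends the second task execution results it receives to the first main distribution server, which sends all results to the client.
In addition, when the first main distribution server in the task distribution system is abnormal, the system flow includes the following steps:
Step 1: Based on the received second task status queue, the second main distribution server redistributes the tasks that the first main distribution server has distributed but whose execution results have not been received.
Step 2: Based on the task queue, the second main distribution server distributes the undistributed first and second tasks.
Step 3: The second main distribution server establishes communication with the client, receives new tasks to be distributed from the client, and sends the task execution results to the client.
In addition, when the second main distribution server in the task distribution system is abnormal, the system flow includes the following steps:
Step 1: Based on the received first task status queue, the first main distribution server redistributes the tasks that the second main distribution server has distributed but whose execution results have not been received.
Step 2: Based on the task queue, the first main distribution server distributes the undistributed first and second tasks.
In addition, when a sub-distribution server in the task distribution system is abnormal, the system flow includes the following steps:
Step 1: Each main distribution server updates its own liveness list of sub-distribution servers according to the abnormality.
Step 2: Each main distribution server redistributes the tasks that were distributed through the abnormal sub-distribution server and whose task execution results have not been received.
In addition, when communication between the main distribution servers in the task distribution system is abnormal, the system flow includes the following steps:
Step 1: Each main distribution server distributes the undistributed first and second tasks.
Step 2: A third sub-distribution server that receives the same task from several main distribution servers determines which main distribution server is connected to the most sub-distribution servers.
Step 3: The third sub-distribution server sends the task execution result to the main distribution server connected to the most sub-distribution servers.
For details of the above flows, refer to the relevant descriptions above, which are not repeated here.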
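The task distribution method provided by the embodiments of the present disclosure is described below, taking the first main distribution server as the executing entity; for the steps performed by the other servers and clients in the task distribution system, refer to the descriptions above, which are not repeated below.
Fig. 10 is a flowchart of a task distribution method provided by an embodiment of the present disclosure. The method is applied to the first main distribution server and includes:
S1001: Receive tasks to be distributed sent by a client.
S1002: Determine, from the tasks to be distributed, a first task to be distributed by the first main distribution server and a second task to be distributed by the second main distribution server.
S1003: Send the first task to be distributed to the first execution server, and send the second task to be distributed to the second main distribution server.
S1004: Receive the first task execution result sent by the first execution server and the second task execution result sent by the second main distribution server, and send the first task execution result and the second task execution result to the corresponding clients.
For details of each step, refer to the relevant descriptions above, which are not repeated here.
The task distribution system and method provided by the embodiments of the present disclosure use a first main distribution server, at least one second main distribution server, and a plurality of task execution servers to build a multi-path task distribution system in which several main distribution servers distribute tasks at the same time. Compared with a single-path task distribution system, this improves task distribution efficiency and also reduces the probability that a failure at one point brings down the whole system, which improves the reliability and fault tolerance of the system. Because the first main distribution server allocates tasks to the second main distribution server, the probability of disordered task distribution is reduced. Because several main distribution servers distribute tasks at the same time, the waste of computing resources caused by keeping a main distribution server on cold standby is reduced, and task distribution efficiency is improved.
Those skilled in the art will understand that, in the above method of the specific implementations, the order in which the steps are written does not imply a strict execution order or limit the implementation in any way; the specific execution order of the steps should be determined by their functions and possible internal logic.
Based on the same inventive concept, an embodiment of the present disclosure also provides a task distribution apparatus corresponding to the task distribution method. Because the apparatus solves the problem on a principle similar to that of the method, the implementation of the apparatus can refer to the implementation of the method, and repeated details are omitted.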
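Fig. 11 is a schematic diagram of a task distribution apparatus provided by an embodiment of the present disclosure. The apparatus includes a receiving module 1101, a determining module 1102, a first sending module 1103, and a second sending module 1104, wherein:
the receiving module 1101 is configured to receive tasks to be distributed sent by a client;
the determining module 1102 is configured to determine, from the tasks to be distributed, a first task to be distributed by a first main distribution server and a second task to be distributed by a second main distribution server;
the first sending module 1103 is configured to send the first task to be distributed to a first execution server, and send the second task to be distributed to the second main distribution server;
the second sending module 1104 is configured to receive a first task execution result sent by the first execution server and a second task execution result sent by the second main distribution server, and send the first task execution result and the second task execution result to the corresponding clients.
In the task distribution apparatus provided by the embodiments of the present disclosure, the first main distribution server allocates tasks to the second main distribution server, which reduces the probability of disordered task distribution and improves task distribution efficiency.
For the processing flow of each module in the apparatus and the interaction between the modules, refer to the relevant description in the method embodiments above, which is not detailed here.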
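Based on the same technical concept, an embodiment of the present disclosure also provides a computer device. Fig. 12 is a schematic structural diagram of a computer device 1200 provided by an embodiment of the present disclosure, which includes a processor 1201, a memory 1202, and a bus 1203. The memory 1202 stores execution instructions and includes an internal memory 12021 and an external memory 12022. The internal memory 12021 temporarily stores operation data of the processor 1201 and data exchanged with the external memory 12022, such as a hard disk; the processor 1201 exchanges data with the external memory 12022 through the internal memory 12021. When the computer device 1200 runs, the processor 1201 communicates with the memory 1202 through the bus 1203, so that the processor 1201 executes the following instructions:
receive tasks to be distributed sent by a client;
determine, from the tasks to be distributed, a first task to be distributed by the first main distribution server and a second task to be distributed by the second main distribution server;
send the first task to be distributed to a first execution server, and send the second task to be distributed to the second main distribution server;
receive a first task execution result sent by the first execution server and a second task execution result sent by the second main distribution server, and send the first task execution result and the second task execution result to the corresponding clients.
An embodiment of the present disclosure also provides a computer-readable storage medium storing a computer program that, when run by a processor, performs the steps of the task distribution method described in the method embodiments above. The storage medium may be a volatile or non-volatile computer-readable storage medium.
An embodiment of the present disclosure also provides a computer program product, including computer-readable code, or a non-volatile computer-readable storage medium carrying computer-readable code; when the computer-readable code runs in a processor of an electronic device, the processor in the electronic device implements the above method.
The above computer program product may be implemented in hardware, software, or a combination thereof. In one optional embodiment, the computer program product is embodied as a computer storage medium; in another optional embodiment, it is embodied as a software product, such as a software development kit (SDK).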
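Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the system and apparatus described above can refer to the corresponding processes in the method embodiments above, which are not repeated here. In the embodiments provided by the present disclosure, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation; as further examples, several units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through some communication interfaces, apparatuses, or units, and may be electrical, mechanical, or of another form.
The units described as separate components may or may not be physically separate, and components displayed as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
In addition, the functional units in the embodiments of the present disclosure may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
If the functions are implemented as software functional units and sold or used as stand-alone products, they may be stored in a processor-executable non-volatile computer-readable storage medium. Based on this understanding, the technical solution of the present disclosure, in essence or in the part that contributes to the prior art, or a part of the technical solution, may be embodied as a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present disclosure. The aforementioned storage medium includes a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc, and other media that can store program code.
Finally, it should be noted that the embodiments described above are only specific implementations of the present disclosure, intended to illustrate its technical solutions rather than limit them, and the protection scope of the present disclosure is not limited to them. Although the present disclosure has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that anyone familiar with the technical field can, within the technical scope disclosed herein, still modify the technical solutions recorded in the foregoing embodiments, readily conceive of changes to them, or make equivalent replacements of some of their technical features; such modifications, changes, or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present disclosure, and shall all fall within the protection scope of the present disclosure. The protection scope of the present disclosure shall therefore be subject to the protection scope of the claims.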

Claims (16)

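  1. A task distribution system, characterized by including a first main distribution server, at least one second main distribution server, and a plurality of execution servers, wherein:
    the first main distribution server is configured to: receive tasks to be distributed sent by a client; determine, from the tasks to be distributed, a first task to be distributed by the first main distribution server and a second task to be distributed by the second main distribution server; send the first task to be distributed to a first execution server, and send the second task to be distributed to the second main distribution server; and receive a first task execution result sent by the first execution server and a second task execution result sent by the second main distribution server, and send the first task execution result and the second task execution result to the corresponding clients;
    the second main distribution server is configured to: receive the second task to be distributed sent by the first main distribution server, and send it to a second execution server; and receive the second task execution result sent by the second execution server, and send it to the first main distribution server.
  2. The system according to claim 1, characterized in that the system further includes sub-distribution servers, which include a first sub-distribution server connected to the first main distribution server and the first execution server, and a second sub-distribution server connected to the second main distribution server and the second execution server;
    the sub-distribution servers are configured to forward tasks to be distributed sent by a main distribution server to the corresponding execution server, and to forward task execution results sent by the execution server to the corresponding main distribution server, wherein the main distribution servers include the first main distribution server and the second main distribution server.
  3. The system according to claim 1 or 2, characterized in that, after receiving the tasks to be distributed sent by the client, the first main distribution server is further configured to:
    store the tasks to be distributed in a task queue;
    after the second task to be distributed is determined, mark the second task to be distributed in the task queue;
    wherein sending the second task to be distributed to the second main distribution server includes:
    sending the marked task queue to the second main distribution server.
  4. The system according to any one of claims 1 to 3, characterized in that, when receiving the second task execution result sent by the second main distribution server, the first main distribution server is configured to:
    receive and store a first task status queue sent by the second main distribution server at a preset time interval;
    wherein the first task status queue includes second tasks that have been distributed and whose second task execution results have been received, second tasks that have been distributed and whose second task execution results have not been received, and the second task execution results that have been received.
  5. The system according to any one of claims 1 to 4, characterized in that the first main distribution server is further configured to:
    update a second task status queue based on the task status of the first tasks that have been distributed;
    send the second task status queue to the second main distribution server at a preset time interval.
  6. The system according to claim 4, characterized in that, after receiving and storing the first task status queue sent by the second main distribution server at the preset time interval, the first main distribution server is further configured to:
    determine, based on the first task status queue and the marked task queue, the second tasks that the second main distribution server has not yet distributed;
    update the task queue and the marks in the task queue based on the first tasks that have not yet been distributed and the second tasks that have not yet been distributed;
    send the updated task queue to the second main distribution server, so that the second main distribution server distributes tasks based on the updated task queue.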
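  7. The system according to any one of claims 1 to 6, characterized in that, when the first main distribution server in the task distribution system is abnormal, the second main distribution server is further configured to:
    determine, based on the received second task status queue, the first tasks that have not been distributed and the first tasks in the second task status queue that have been distributed but whose first task execution results have not been received;
    distribute the undistributed first tasks and the first tasks in the second task status queue that have been distributed but whose first task execution results have not been received to the execution servers.
  8. The system according to any one of claims 1 to 7, characterized in that, when the first main distribution server in the task distribution system is abnormal, the second main distribution server is further configured to:
    receive a third task to be distributed sent by the client, and distribute the third task to be distributed to the second execution server;
    receive a third task execution result sent by the second execution server, and send the third task execution result to the corresponding client.
  9. The system according to claim 2, characterized in that the first main distribution server is further configured to:
    monitor and store the running state of at least one first sub-distribution server;
    when any first sub-distribution server is abnormal, the first main distribution server is further configured to:
    update the stored running state of the at least one first sub-distribution server;
    determine, among the first tasks distributed to the abnormal first sub-distribution server, the first tasks whose first task execution results have not been received;
    redistribute the first tasks whose first task execution results have not been received.
  10. The system according to claim 2, characterized in that, when communication between the main distribution servers in the task distribution system is abnormal, any main distribution server is configured to:
    determine the tasks that have not yet been distributed, and the tasks that have been distributed but whose task execution results have not been received;
    distribute the tasks that have not yet been distributed and the tasks that have been distributed but whose task execution results have not been received.
  11. The system according to claim 2 or 10, characterized in that the sub-distribution servers include a third sub-distribution server connected to a plurality of main distribution servers, and the first sub-distribution servers and the second sub-distribution servers both include the third sub-distribution server;
    when communication between the main distribution servers in the task distribution system is abnormal, if the third sub-distribution server receives the same task to be distributed from the plurality of main distribution servers, the third sub-distribution server is configured to:
    send the same task to an execution server and receive the task execution result sent by the execution server; and determine a target number, namely the number of sub-distribution servers connected to one or more of the plurality of main distribution servers;
    send the task execution result corresponding to the same task to the main distribution server with the largest target number.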
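  12. A task distribution method, characterized by being applied to a first main distribution server and including:
    receiving tasks to be distributed sent by a client;
    determining, from the tasks to be distributed, a first task to be distributed by the first main distribution server and a second task to be distributed by a second main distribution server;
    sending the first task to be distributed to a first execution server, and sending the second task to be distributed to the second main distribution server;
    receiving a first task execution result sent by the first execution server and a second task execution result sent by the second main distribution server, and sending the first task execution result and the second task execution result to the corresponding clients.
  13. A task distribution apparatus, characterized by including:
    a receiving module, configured to receive tasks to be distributed sent by a client;
    a determining module, configured to determine, from the tasks to be distributed, a first task to be distributed by a first main distribution server and a second task to be distributed by a second main distribution server;
    a first sending module, configured to send the first task to be distributed to a first execution server, and send the second task to be distributed to the second main distribution server;
    a second sending module, configured to receive a first task execution result sent by the first execution server and a second task execution result sent by the second main distribution server, and send the first task execution result and the second task execution result to the corresponding clients.
  14. A computer device, characterized by including a processor, a memory, and a bus, wherein the memory stores machine-readable instructions executable by the processor; when the computer device runs, the processor communicates with the memory through the bus, and the machine-readable instructions, when executed by the processor, perform the steps of the task distribution method according to claim 12.
  15. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program that, when run by a processor, performs the steps of the task distribution method according to claim 12.
  16. A computer program product, including computer-readable code, or a non-volatile computer-readable storage medium carrying computer-readable code, wherein when the computer-readable code runs in a processor of an electronic device, the processor in the electronic device implements the method according to claim 12.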
PCT/CN2021/126624 2021-04-21 2021-10-27 Task distribution system, method, and apparatus, computer device, and storage medium WO2022222403A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110430723.5 2021-04-21
CN202110430723.5A CN113094177A (zh) 2021-04-21 2021-04-21 A task distribution system, method, and apparatus, computer device, and storage medium

Publications (1)

Publication Number Publication Date
WO2022222403A1 true WO2022222403A1 (zh) 2022-10-27

Family

ID=76679064

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/126624 WO2022222403A1 (zh) 2021-04-21 2021-10-27 Task distribution system, method, and apparatus, computer device, and storage medium

Country Status (2)

Country Link
CN (1) CN113094177A (zh)
WO (1) WO2022222403A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113094177A (zh) * 2021-04-21 2021-07-09 上海商汤科技开发有限公司 A task distribution system, method, and apparatus, computer device, and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103036800A (zh) * 2012-12-14 2013-04-10 北京高森明晨信息科技有限公司 虚拟机负载均衡系统、节点及方法
US20170351549A1 (en) * 2016-06-03 2017-12-07 International Business Machines Corporation Task queuing and dispatching mechanisms in a computational device
CN107992392A (zh) * 2017-11-21 2018-05-04 国家超级计算深圳中心(深圳云计算中心) 一种用于云渲染系统的自动监控修复系统和方法
CN108710543A (zh) * 2018-05-21 2018-10-26 苏州本乔信息技术有限公司 一种渲染任务的处理方法及设备
CN113094177A (zh) * 2021-04-21 2021-07-09 上海商汤科技开发有限公司 一种任务分发系统、方法、装置、计算机设备及存储介质

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
SE0200418D0 (sv) * 2002-02-13 2002-02-13 Ericsson Telefon Ab L M A method and apparatus for computer load sharing and data distribution
US20120259956A1 (en) * 2011-04-07 2012-10-11 Infosys Technologies, Ltd. System and method for implementing a dynamic change in server operating condition in a secured server network
JP5776339B2 (ja) * 2011-06-03 2015-09-09 富士通株式会社 ファイル配布方法、ファイル配布システム、マスタサーバ、及びファイル配布プログラム
CN103294533B (zh) * 2012-10-30 2016-09-07 北京安天电子设备有限公司 任务流控制方法及系统
JP6083290B2 (ja) * 2013-03-27 2017-02-22 日本電気株式会社 分散処理システム
JP6357807B2 (ja) * 2014-03-05 2018-07-18 富士通株式会社 タスク割当プログラム、タスク実行プログラム、マスタサーバ、スレーブサーバおよびタスク割当方法
JP6721820B2 (ja) * 2015-08-14 2020-07-15 富士通株式会社 異常対処決定プログラム、異常対処決定方法、及び、状態管理装置
CN107103009B (zh) * 2016-02-23 2020-04-10 杭州海康威视数字技术股份有限公司 一种数据处理方法及装置
JP2017211868A (ja) * 2016-05-26 2017-11-30 株式会社リコー 機器監視システム、中継装置、及び機器監視サービス用Proxyプログラム
CN107613025B (zh) * 2017-10-31 2021-01-08 武汉光迅科技股份有限公司 一种基于消息队列顺序回复的实现方法和装置
CN110928673A (zh) * 2018-09-20 2020-03-27 北京国双科技有限公司 任务的分配方法及装置
KR102598084B1 (ko) * 2018-11-06 2023-11-03 삼성전자주식회사 작업 의존성에 기초하여 컴퓨팅 작업을 서버에 스케줄링하는 방법 및 장치
CN109586969B (zh) * 2018-12-13 2022-02-11 平安科技(深圳)有限公司 内容分发网络容灾方法、装置、计算机设备及存储介质
US10996993B2 (en) * 2019-06-20 2021-05-04 Western Digital Technologies, Inc. Adaptive work distribution in distributed systems

Also Published As

Publication number Publication date
CN113094177A (zh) 2021-07-09

Similar Documents

Publication Publication Date Title
US10868840B1 (en) Multiple-master DNS system
TWI710915B (zh) 一種基於互聯網資料中心的資源處理方法、相關裝置以及通信系統
US8930316B2 (en) System and method for providing partition persistent state consistency in a distributed data grid
US10122595B2 (en) System and method for supporting service level quorum in a data grid cluster
US10983880B2 (en) Role designation in a high availability node
US9703610B2 (en) Extensible centralized dynamic resource distribution in a clustered data grid
US20070121490A1 (en) Cluster system, load balancer, node reassigning method and recording medium storing node reassigning program
CN106817408B (zh) 一种分布式服务器集群调度方法及装置
WO2017050254A1 (zh) 热备方法、装置及系统
CN108881512B (zh) Ctdb的虚拟ip均衡分配方法、装置、设备及介质
US10673936B2 (en) Self-organized retail source request routing and distributed load sharing systems and methods
WO2019210580A1 (zh) 访问请求处理方法、装置、计算机设备和存储介质
CN104811476A (zh) 一种面向应用服务的高可用部署实现方法
CN106775953A (zh) 实现OpenStack高可用的方法与系统
US9754032B2 (en) Distributed multi-system management
US9047126B2 (en) Continuous availability between sites at unlimited distances
CN112217847A (zh) 微服务平台及其实现方法、电子设备及存储介质
KR101586354B1 (ko) 병렬 연결식 서버시스템의 통신 장애 복구방법
WO2022222403A1 (zh) 任务分发系统、方法、装置、计算机设备及存储介质
CN107645396B (zh) 一种集群扩容方法及装置
KR101294268B1 (ko) 복수 개의 로그서버들을 이용한 로그 분산 처리 방법 및 로그 분산 처리 시스템
EP3346671B1 (en) Service processing method and equipment
US20240176762A1 (en) Geographically dispersed hybrid cloud cluster
US20240028611A1 (en) Granular Replica Healing for Distributed Databases
CN112910796A (zh) 流量管理方法、装置、设备、存储介质以及程序产品

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 2022531460

Country of ref document: JP

Kind code of ref document: A

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21937625

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21937625

Country of ref document: EP

Kind code of ref document: A1