WO2018059423A1 - Distributed resource scheduling method, scheduling node and access node - Google Patents

Distributed resource scheduling method, scheduling node and access node

Info

Publication number
WO2018059423A1
WO2018059423A1 (application PCT/CN2017/103606)
Authority
WO
WIPO (PCT)
Prior art keywords
node
scheduling
resource
service
request
Prior art date
Application number
PCT/CN2017/103606
Other languages
English (en)
French (fr)
Inventor
颜克洲
李雅卿
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Publication of WO2018059423A1
Priority to US16/201,606 (US10838777B2)

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 Allocation of resources to service a request
    • G06F 9/5027 Allocation of resources to service a request, the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F 9/5038 Allocation of resources to service a request, considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • G06F 9/5083 Techniques for rebalancing the load in a distributed system
    • G06F 9/54 Interprogram communication
    • G06F 9/546 Message passing systems or structures, e.g. queues
    • G06F 2209/00 Indexing scheme relating to G06F9/00
    • G06F 2209/54 Indexing scheme relating to G06F9/54
    • G06F 2209/548 Queue

Definitions

  • the present application relates to the field of computer processing, and in particular, to a distributed resource scheduling method, a scheduling node, and an access node.
  • Distributed computing is the decomposition of a complex task or application into many small parts that are distributed to multiple computers for parallel processing.
  • the scheduling node determines, according to the received request, a resource node that processes the request among the plurality of resource nodes.
  • the embodiment of the present application provides a distributed resource scheduling method, a scheduling node, and an access node, which can improve the efficiency of scheduling resource execution by the scheduling node and resource utilization of the resource node.
  • An embodiment of the present application provides a resource scheduling method, where the method includes:
  • Performing resource scheduling according to the service request message and the work queue status information, allocating at least one resource node to each service to be processed in the access node, generating a scheduling result, and sending the scheduling result to the access node, so that the access node sends at least one task request to the allocated at least one resource node according to the scheduling result.
  • the embodiment of the present application further provides a resource scheduling method, where the method includes:
  • the information performs resource scheduling, allocates at least one resource node for each service to be processed, and generates a scheduling result;
  • the embodiment of the present application further provides a scheduling node, including a processor and a memory, where the memory stores instructions executable by the processor, and when executing the instruction, the processor is configured to:
  • the application provides an access node including a processor and a memory, the memory storing instructions executable by the processor, when the instruction is executed, the processor is configured to:
  • Sending, to the scheduling node, a service request message indicating the at least one service, so that the scheduling node performs resource scheduling according to the service request message and the obtained work queue state information of the at least one resource node, allocates at least one resource node for each service to be processed, and generates a scheduling result;
  • FIG. 1 is a schematic structural diagram of a distributed resource scheduling system according to an embodiment of the present application.
  • FIG. 2 is a schematic flowchart of a resource scheduling method according to an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of a resource scheduling method according to another embodiment of the present application.
  • FIG. 4 is a schematic diagram of interaction of a resource scheduling method according to an embodiment of the present application.
  • FIG. 5 is a schematic diagram of interaction of a resource scheduling method according to another embodiment of the present application.
  • FIG. 6 is a schematic diagram of interaction of a resource scheduling method according to still another embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of a scheduling node according to an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of a scheduling node according to an embodiment of the present application.
  • FIG. 9 is a schematic structural diagram of an access node according to an embodiment of the present application.
  • FIG. 10 is a schematic structural diagram of an access node according to an embodiment of the present application.
  • The scheduling node allocates a corresponding resource node for each request by estimating the processing capability of each resource node in advance. If the estimate of a resource node's processing capability is not accurate enough, the scheduling of the entire system becomes unreasonable, and requests cannot be scheduled according to the real capability of the resource nodes. In addition, when the processing time required by each request differs and cannot be predicted in advance, the load among the resource nodes processing the requests easily becomes unbalanced. The resource allocation among resource nodes is therefore unreasonable, resulting in reduced resource utilization.
  • FIG. 1 is a schematic structural diagram of a distributed resource scheduling system according to an embodiment of the present application.
  • the distributed resource scheduling system 100 includes: access nodes 1101 to 110L, scheduling nodes 1201 to 120M, and resource nodes 1301 to 130N.
  • That is, there are L access nodes, M scheduling nodes, and N resource nodes.
  • N is much larger than L.
  • the distributed resource scheduling system 100 is used for distributed computing to process large computing tasks, such as transcoding a large video file.
  • Each of the three types of nodes included has different functions.
  • the access nodes 1101 to 110L are responsible for receiving a task request corresponding to a certain service from the upper layer, reporting the status of the request in each service to the scheduling nodes 1201 to 120M to apply for a resource, and controlling execution of the request with the resource node.
  • the access nodes 1101 to 110L may periodically forward the work queue state information of each of the resource nodes 1301 to 130N to the scheduling nodes 1201 to 120M.
  • The scheduling nodes 1201 to 120M are responsible for interacting with the access nodes 1101 to 110L and/or the resource nodes 1301 to 130N to maintain the work queue state information of each resource node, and for allocating resource nodes to the access nodes 1101 to 110L according to the request status reported by the access nodes.
  • the resource nodes 1301 to 130N are responsible for processing the task request sent by the access node, and reporting the work queue status information of the local node to the access nodes 1101 to 110L and the scheduling nodes 1201 to 120M.
  • FIG. 2 is a schematic flowchart of a resource scheduling method according to an embodiment of the present application. This method is applied to the scheduling node. Referring to Figure 2, the method includes:
  • Step 201 Receive a service request message sent by at least one access node.
  • a service request message from an access node indicates at least one service to be processed in the access node.
  • each access node receives a task request of various services from an upper-layer service management unit, and maintains a multi-service request queue.
  • the task request is used to indicate that the request processes a task for a certain service.
  • the access node writes the task request to the request queue for the service.
  • the service request message carries the number of task requests in the request queue of each service in the access node.
  • the service request message may also be referred to as a queue report packet, and is used by the access node to report the status of the request in each service to the scheduling node.
  • the access node can send a service request message at different times. In this way, the scheduling node can know the number of requests in the request queue of each service of each access node.
  • The scheduling node maintains a service request list according to the received service request messages of the access nodes, where the list stores the identifier (ID) of each access node, the ID of each service, and the number of task requests in each request queue.
  • Table 1 shows an example of a service request list.
  • There are a total of L access nodes, each of which includes multiple services; for example, the l-th access node includes S_l services, l = 1, ..., L.
  • the access node ID may also be represented by an IP address of the access node.
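As an illustration of how such a service request list might be maintained, the following sketch keys pending-request counts by access node ID and service ID. All names are assumptions made for illustration; the patent does not prescribe a data structure.

```python
# A minimal sketch of the service request list described above.
# All identifiers are illustrative, not taken from the patent.

def update_service_request_list(request_list, access_node_id, service_counts):
    """Record the latest queue report from one access node.

    request_list maps access node ID -> {service ID: number of pending
    task requests in that service's request queue}.
    """
    request_list[access_node_id] = dict(service_counts)
    return request_list

# Example: two access nodes report their per-service request queue counts.
service_request_list = {}
update_service_request_list(service_request_list, "access-1", {"svc-a": 3, "svc-b": 0})
update_service_request_list(service_request_list, "access-2", {"svc-a": 1})
```

Because only counts are stored, a later report from the same access node simply overwrites the previous entry.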
  • Step 202 Obtain working queue status information of at least one resource node.
  • the scheduling node may obtain working queue status information of at least one resource node in the following two manners.
  • the scheduling node sends a status monitoring message to each resource node every predetermined time interval, and then receives a status monitoring confirmation message fed back by each resource node, where the status monitoring confirmation message carries the working queue status information of the resource node.
  • the status monitoring message is a heartbeat packet
  • the scheduling node periodically sends a heartbeat packet to each resource node.
  • the predetermined time interval is 3 seconds.
  • After receiving the heartbeat packet sent by the scheduling node, the resource node returns a heartbeat acknowledgement (ACK) packet (i.e., the status monitoring confirmation message) to the scheduling node, generates the current work queue status information of the resource node in real time, and carries the work queue status information back to the scheduling node in the heartbeat ACK packet.
  • the scheduling node receives the status monitoring message sent by each access node, and the status monitoring message carries the working queue status information of the at least one resource node acquired by the access node.
  • the access node obtains the current working queue state information of the resource node when interacting with the resource node. Then, when the access node sends the status monitoring message to the scheduling node, the access node carries the working queue status information of the at least one resource node acquired by the access node.
  • the status monitoring message may also be a heartbeat packet.
  • Each resource node establishes a work queue according to the to-be-processed task requests received from the access nodes, where the work queue contains the task requests that the resource node is processing; the length of the work queue refers to the number of task requests being processed by the resource node.
  • the scheduling node records the received work queue status information of each resource node, for example, establishes a work queue status table, and maintains the work queue status table according to the received work queue status information.
  • Table 2 shows an example of a work queue status table established by a scheduling node. There are a total of K resource nodes.
  • The work queue status information of each resource node includes a timestamp t_k (including date and time) and a work queue length q_k.
  • When the scheduling node receives the work queue state information of a resource node, the recorded work queue state information of that resource node is updated according to the order of the timestamps. For example, suppose the work queue status information of the k-th resource node locally recorded by the scheduling node has timestamp t_k and queue length q_k, while the newly received information has timestamp t_g and queue length q_g; the record is updated only if t_g is later than t_k.
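A minimal sketch of this timestamp-ordered update of the work queue status table follows. The table layout and field names are assumptions; the patent only specifies that each entry holds a timestamp t_k and a work queue length q_k.

```python
# Sketch of the timestamp-ordered update of the work queue status table.
# A stale report (older timestamp) must not overwrite a newer record.

def update_work_queue_status(status_table, node_id, t_new, q_new):
    """Overwrite the recorded (t_k, q_k) only if the new report is more recent."""
    t_old, _ = status_table.get(node_id, (float("-inf"), 0))
    if t_new > t_old:
        status_table[node_id] = (t_new, q_new)
    return status_table

table = {"worker-1": (100, 5)}
update_work_queue_status(table, "worker-1", 103, 4)   # newer report: applied
update_work_queue_status(table, "worker-1", 101, 9)   # stale report: ignored
```

This matters because the scheduling node can receive the same resource node's status both directly (heartbeats) and indirectly (forwarded by access nodes), so reports may arrive out of order.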
  • Step 203 Perform resource scheduling according to the service request message and the work queue state information for each access node, allocate at least one resource node for each service to be processed in the access node, and generate a scheduling result.
  • the scheduling node includes the following steps when performing resource scheduling:
  • Step 2031 Determine an idle resource node from the at least one resource node according to the length of the work queue.
  • Step 2032: Determine the preset priority of each service in the access node.
  • the scheduling node prioritizes each service and pre-configures the priority P to which each service belongs.
  • P ∈ {0, 1, 2, 3}, where 0 represents the highest priority.
  • Step 2033 Assign at least one idle resource node to each service according to the priority.
  • each service of each access node is polled in turn, and at least one idle resource node is allocated for each service.
  • Specifically, the scheduling node first polls all the access nodes in turn for services with priority 0. If such a service exists, an idle resource node is allocated for each request in the service's queue, and the work queue length q_k of the allocated resource node is incremented by one. When no access node still has a service with the highest priority 0, the scheduling node starts scheduling services with priority 1, and so on.
  • If all the idle resource nodes have been allocated during the allocation process, the scheduling node stops the current round of scheduling; or, if all the services of all the access nodes have been allocated resources, the scheduling stops.
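The priority polling of steps 2031 to 2033 can be sketched as follows. This is an interpretation of the text, not the claimed implementation: the idle threshold, the tie-breaking by shortest queue, and all names are assumptions.

```python
# Sketch of priority-based polling allocation (steps 2031-2033), assumed
# semantics: services are served strictly by priority (0 = highest); within
# a priority level the access nodes are polled in turn, one request at a
# time; a resource node counts as "idle" while its work queue length is
# below a threshold; each allocation increments the chosen node's length.

def schedule(requests, queue_len, idle_threshold=1):
    """requests: list of (access_id, service_id, priority, pending_count).
    queue_len: dict resource node ID -> current work queue length (mutated).
    Returns {(access_id, service_id): [allocated resource node IDs]}."""
    result = {}
    for prio in sorted({r[2] for r in requests}):
        pending = [[a, s, n] for a, s, p, n in requests if p == prio]
        while any(n > 0 for _, _, n in pending):
            progress = False
            for entry in pending:                    # poll access nodes in turn
                a, s, n = entry
                if n == 0:
                    continue
                idle = [k for k, q in queue_len.items() if q < idle_threshold]
                if not idle:
                    return result                    # no idle node: stop round
                node = min(idle, key=queue_len.get)  # shortest queue first
                result.setdefault((a, s), []).append(node)
                queue_len[node] += 1                 # count the new assignment
                entry[2] -= 1
                progress = True
            if not progress:
                break
    return result

queues = {"w1": 0, "w2": 0, "w3": 0}
result = schedule([("a1", "s0", 0, 2), ("a2", "s1", 1, 2)], queues)
```

In this run the priority-0 service receives w1 and w2 before the priority-1 service is considered, and scheduling stops when no idle node remains.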
  • Step 204 Send a scheduling result to the access node, so that the access node sends at least one task request to the allocated at least one resource node according to the scheduling result.
  • After completing the resource scheduling, the scheduling node notifies the access node of the scheduling result.
  • the scheduling result will indicate the ID of the allocated resource node corresponding to each service ID, that is, for each service of the access node, specify which resource nodes each service is assigned to.
  • After scheduling, one request in a service corresponds to one resource node; that is, M_j resource nodes will be allocated for the j-th service.
  • According to the received scheduling result, the access node takes the corresponding number of task requests from each request queue and sends them to the resource nodes given in the scheduling result for processing. That is, M_j task requests are taken from the request queue of the j-th service, and one task request is sent to each of the M_j scheduled resource nodes.
  • From the perspective of a resource node, one resource node can process multiple task requests.
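The dispatch step above (one of the M_j requests to each of the M_j allocated nodes) might look like the following sketch; `send` stands in for the real transport and, like all names here, is an assumption.

```python
# Sketch of the dispatch step: for the j-th service, take M_j task requests
# from its request queue and send one to each allocated resource node.
from collections import deque

def dispatch(request_queues, scheduling_result, send):
    """request_queues: dict service ID -> deque of pending task requests.
    scheduling_result: dict service ID -> list of allocated resource node IDs."""
    for service_id, nodes in scheduling_result.items():
        queue = request_queues[service_id]
        for node in nodes:                  # one request per allocated node
            if not queue:
                break
            send(node, queue.popleft())

sent = []
queues = {"svc-a": deque(["req-1", "req-2", "req-3"])}
dispatch(queues, {"svc-a": ["w1", "w2"]}, lambda node, req: sent.append((node, req)))
```

After dispatch, requests not covered by this round's scheduling result (req-3 here) simply remain queued for the next round.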
  • In the embodiment of the present application, the scheduling node receives the service request messages sent by the at least one access node and obtains the work queue state information of the at least one resource node, so that the scheduling node does not need to manage each specific service request, and the processing capability of the resource nodes does not need to be estimated in advance. For each access node, resource scheduling is performed according to the service request message and the work queue state information, and at least one resource node is scheduled for each service of the access node, which avoids the situation where some resource nodes are fully loaded while other resource nodes are idle. Even if the processing time required by each service request differs and the processing capability of the resource nodes differs, the work queue lengths of the resource nodes can be kept at the same level. When the cluster pressure is high, all resource nodes can be guaranteed to run at full load, which improves the efficiency of resource scheduling and improves the processing efficiency and resource utilization of the scheduling nodes and resource nodes.
  • the upper layer service management unit receives the task request of the video transcoding service
  • The access node is specifically a video access node (access) for continuously receiving the task requests of the video transcoding service from the upper-layer service management unit.
  • the video scheduling node acquires work queue state information of a video resource node (or a video work machine worker), and performs resource scheduling in conjunction with the received queue report packet to allocate a resource node for each video transcoding service.
  • FIG. 3 is a schematic flowchart of a resource scheduling method according to another embodiment of the present application. This method is applied to an access node. Referring to Figure 3, the method includes:
  • Step 301 Receive a task request for at least one service to be processed.
  • The upper-layer service management unit sends task requests of various services to the access node.
  • the access node generates a request queue for at least one service and is responsible for maintaining a multi-service request queue. Whenever a task request for a service is received, the task request is written to the request queue of the service.
  • each service corresponds to a request queue, and each request queue includes one or more task requests.
  • the time at which each task request arrives is different, and the processing time required for each task request can also be different, such as a few seconds, or a few minutes.
  • Step 302 Select one scheduling node from at least one scheduling node, and send a service request message indicating at least one service to the scheduling node.
  • the service request message carries the number of task requests in the request queue of each service.
  • the scheduling node may perform resource scheduling according to the service request message and the obtained work queue state information of the at least one resource node, schedule at least one resource node for each service, and generate a scheduling result.
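Building the service request message of step 302 from the request queues might be sketched as follows; the message field names are assumptions made for illustration.

```python
# Sketch of how an access node might build the service request message
# (queue report packet): it carries only the per-service request counts,
# never the task requests themselves.
from collections import deque

def build_service_request_message(access_node_id, request_queues):
    """request_queues: dict service ID -> deque of pending task requests."""
    return {
        "access_node_id": access_node_id,
        "services": {sid: len(q) for sid, q in request_queues.items()},
    }

msg = build_service_request_message(
    "access-1", {"svc-a": deque(["r1", "r2"]), "svc-b": deque()}
)
```

Reporting only counts is what lets the scheduling node stay ignorant of individual task requests, as the summary later in this document emphasizes.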
  • (1) After receiving a task request and writing it to the request queue of the corresponding service, the access node immediately triggers the sending, that is, selects a scheduling node and sends a service request message to the scheduling node.
  • (2) The access node continuously receives task requests but triggers the sending of the service request message according to a predetermined period, for example, selecting one scheduling node every T seconds and sending a service request message to it.
  • The above method (2) is applicable when task requests arrive very frequently; reporting periodically reduces the processing load of the access node.
  • Step 303 Receive a scheduling result from the scheduling node, and send at least one task request to the allocated at least one resource node according to the scheduling result.
  • the access node extracts the same number of task requests from the request queue of each service according to the received scheduling result, and sends the same to the resource node given in the scheduling result for processing. That is, M j task requests are taken from the request queue of the jth service, and one task request is sent to each of the M j scheduled resource nodes. That is to say, after the scheduling is completed, one task request corresponds to one resource node, and one resource node can process multiple task requests.
  • In the embodiment of the present application, the access node generates a request queue for at least one service; each time a task request for a service is received, the task request is written into the request queue of that service, one scheduling node is selected from the at least one scheduling node, and a service request message for the request queue is sent to the selected scheduling node. As long as one scheduling node is online and working normally, the access node can send a service request message to it and perform normal resource scheduling, without requiring all scheduling nodes in the distributed system to be available.
  • The scheduling node only needs to receive the service request message reported by the access node, which carries only the number of task requests in the request queue of each service in the access node; therefore, the scheduling node does not need to manage the specific state of each task request, for example, the time at which the task request was received and the processing time required by the task request, thereby reducing the processing load of the scheduling node, reducing the failure rate of the scheduling node, and improving the resource utilization of the scheduling node.
  • FIG. 4 is a schematic diagram of interaction of a resource scheduling method according to an embodiment of the present application. Taking an access node as an example, the interaction between the access node, the at least one scheduling node, and the at least one resource node is involved. As shown in Figure 4, the following steps are included:
  • Step 401 The access node receives a task request for at least one service to be processed.
  • Step 402 The access node selects one scheduling node from the at least one scheduling node.
  • Step 403 The access node sends a service request message for the request queue to the selected scheduling node.
  • Step 403 is an intermediate step of implementing step 402 and is completed synchronously with it; that is, the service request message is sent to the scheduling node in the process of selecting the scheduling node.
  • Step 404 Each scheduling node acquires working queue state information of at least one resource node.
  • Step 405 Each scheduling node performs resource scheduling according to the received service request message and the working queue state information of the at least one resource node, and schedules at least one resource node for each service of each access node and generates a scheduling result.
  • Step 406 The scheduling node returns a corresponding scheduling result to the access node.
  • Step 407 The access node sends a task request to the allocated resource node.
  • Step 408 The allocated resource node performs corresponding processing according to the received task request.
  • FIG. 5 is a schematic diagram of interaction of a resource scheduling method according to another embodiment of the present application, and a specific implementation manner is provided for the foregoing steps 402 and 403. As shown in Figure 5, the following steps are included:
  • Step 501 The access node generates a request queue of at least one service, and each time a task request for a service is received, the task request is written into the request queue of the service.
  • Step 502 The access node sends a status monitoring message to each scheduling node at predetermined time intervals.
  • the access node pre-configures the IP addresses of all the scheduling nodes, and the status monitoring message may be a heartbeat packet.
  • the access node periodically sends a heartbeat packet to each scheduling node, for example, the predetermined time interval is T seconds.
  • Step 503 An online scheduling node feeds back a status monitoring confirmation message to the access node.
  • the online scheduling node immediately returns a heartbeat ACK packet to the access node after receiving the heartbeat packet sent by the access node.
  • Step 504: The access node determines, according to the received status monitoring confirmation messages, which scheduling nodes are online, and sorts all online scheduling nodes in chronological order according to the moments at which their status monitoring confirmation messages were received.
  • Then, the service request message is sent to each online scheduling node in turn, and the service request confirmation message fed back by the online scheduling node is monitored. That is, depending on whether the polled online scheduling node can work normally, steps 505 and 506 are executed cyclically, stopping when a service request confirmation message fed back by some scheduling node is received.
  • Step 505 Send a service request message to the i-th online scheduling node according to the sorting result.
  • The initial value of i is 1. That is, the access node first selects the first online scheduling node and sends a service request message (also called a queue report packet) to it.
  • Step 506 The i-th online and normal working scheduling node feeds back the service request confirmation message to the access node.
  • After receiving the queue report packet sent by the access node, the i-th scheduling node immediately responds to the access node with a queue report packet ACK packet, that is, the service request acknowledgement message.
  • Step 507 The access node determines, according to the service request confirmation message, the final scheduling node.
  • That is, the access node takes the scheduling node that fed back the service request confirmation message as the selected scheduling node, and receives the scheduling result from that scheduling node.
  • If no online scheduling node feeds back a service request confirmation message, the access node discards the scheduling request.
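The selection-and-failover behaviour of steps 504 to 507 might be sketched as follows; the boolean `try_send` stand-in for "send the queue report packet and wait for its ACK", and all other names, are assumptions.

```python
# Sketch of steps 504-507: sort the online scheduling nodes by the time
# their heartbeat ACK arrived, then try each in turn until one acknowledges
# the service request message.

def pick_scheduling_node(ack_times, try_send):
    """ack_times: dict scheduling node ID -> time its heartbeat ACK arrived.
    Returns the first node that confirms the request, or None (the access
    node would then discard this scheduling request)."""
    for node in sorted(ack_times, key=ack_times.get):
        if try_send(node):
            return node
    return None

acks = {"sched-b": 2.0, "sched-a": 1.0, "sched-c": 3.0}
# Suppose sched-a is online but not working normally; sched-b confirms.
chosen = pick_scheduling_node(acks, lambda n: n != "sched-a")
```

This is why only one working scheduling node is needed for the system to keep scheduling: a non-responding node is simply skipped.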
  • Step 508 Each scheduling node sends a status monitoring message to each resource node every predetermined time interval.
  • Step 509 Each resource node feeds back a status monitoring confirmation message to the corresponding scheduling node, where the status monitoring confirmation message carries the working queue status information of the resource node.
  • the scheduling node periodically sends a heartbeat packet to the resource node, and receives the feedback heartbeat ACK packet to obtain the working queue state information of the resource node.
  • Steps 508 and 509 can also be performed in parallel with steps 501-506.
  • Step 510 The scheduling node performs resource scheduling according to the received service request message and the working queue state information of each resource node, and schedules at least one resource node for each service of each access node and generates a scheduling result.
  • Step 511 The scheduling node returns a corresponding scheduling result to the access node that sends the service request message.
  • Step 512 The access node sends a task request to the allocated resource node.
  • Step 513 The allocated resource node performs corresponding processing according to the received task request.
  • FIG. 6 is a schematic diagram of interaction of a resource scheduling method according to still another embodiment of the present application.
  • The scheduling node may also obtain the work queue state information of the at least one resource node from the access node, as shown in Figure 6.
  • Step 601 The access node generates a request queue of at least one service, and each time a task request for a service is received, the task request is written into the request queue of the service.
  • Step 602 The access node periodically sends a heartbeat packet to each scheduling node.
  • After receiving the work queue status information fed back by the resource nodes (see step 612 and/or step 614), the access node carries, in the heartbeat packet, the work queue status information of the resource nodes that it has acquired within a certain time interval. After the transmission is completed, the access node clears the locally stored work queue status information of those resource nodes.
  • Upon receiving this information, the scheduling node may be triggered to perform step 608.
  • Step 603 An online scheduling node feeds back a heartbeat ACK packet to the access node.
  • Step 604 The access node determines, according to the received heartbeat ACK packet, that the scheduling node is online, and sorts all online scheduling nodes in chronological order according to the time when the heartbeat ACK packet is received.
  • The queue report packet is sent to each online scheduling node in turn, and the queue report packet ACK packet fed back by the online scheduling node is monitored.
  • Step 605 Send a queue report packet to the i-th online scheduling node according to the sorting result.
  • Step 606: The i-th online and normally working scheduling node feeds back a queue report packet ACK packet to the access node.
  • Otherwise, the access node sends the queue report packet to the next online scheduling node, and so on, until a queue report packet ACK packet fed back by some scheduling node is received.
  • Step 607 The access node determines the final scheduling node according to the received queue report packet ACK packet.
  • If none of the scheduling nodes is online, or none replies with a queue report packet ACK packet, the access node abandons this scheduling request.
  • Step 608 The scheduling node performs resource scheduling according to the received queue report packet and the work queue state information of each resource node, and schedules at least one resource node for each service of each access node and generates a scheduling result.
  • Step 609 The scheduling node returns a corresponding scheduling result to the access node that sends the service request message.
  • Step 611 The access node sends a task request to the allocated resource node.
  • Step 612 The allocated resource node returns a request receiving confirmation message carrying the working queue status information of the resource node to the access node before processing the received task request.
  • the request receipt confirmation message is used to indicate receipt of a task request.
  • Step 613 The allocated resource node performs corresponding processing according to the received task request.
  • Step 614 After processing the received task request, the allocated resource node returns a request processing confirmation message carrying the work queue status information of the resource node to the access node.
  • the request processing confirmation message is used to indicate that the received task request has been processed.
  • In the above embodiment, the resource node may report its work queue state information to the access node at two moments:
  • One is in step 612, before the received task request is processed; the timestamp is the moment at which the task request was received but not yet processed, and at this moment the work queue length is the current length q_k + 1;
  • The other is in step 614, after the received task request has been processed; the timestamp is the moment at which processing was completed, and at this moment the work queue length is the current length q_k - 1.
  • The work queue state information reported at these two moments is thus different, reflecting different working states of the resource node, and, after being forwarded to the scheduling node via the access node, can be used for resource scheduling.
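The two reporting moments can be condensed into a short Python sketch. This is illustrative only and not part of the application; the `ResourceNode` class, its method names, and the `QueueState` record are assumptions chosen to mirror steps 612 and 614:

```python
import time
from dataclasses import dataclass

@dataclass
class QueueState:
    node_id: int
    timestamp: float   # moment the state was generated (t_k)
    queue_length: int  # current work queue length (q_k)

class ResourceNode:
    """Illustrative resource node reporting at the two moments above."""

    def __init__(self, node_id: int):
        self.node_id = node_id
        self.queue_length = 0  # task requests currently being processed

    def on_task_received(self) -> QueueState:
        # Step 612: report before processing; length becomes q_k + 1.
        self.queue_length += 1
        return QueueState(self.node_id, time.time(), self.queue_length)

    def on_task_finished(self) -> QueueState:
        # Step 614: report after processing; length becomes q_k - 1.
        self.queue_length -= 1
        return QueueState(self.node_id, time.time(), self.queue_length)
```

The two states carry different queue lengths and timestamps, which is exactly what lets the scheduling node distinguish the two working states after forwarding.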
  • In addition, in the above embodiment, there are also two events that trigger the scheduling node to perform resource scheduling:
  • One is that, after step 602, when periodically sending heartbeat packets, the access node carries the acquired work queue state information of the resource nodes in the heartbeat packet sent to the scheduling node; this indicates that the state of the resource nodes' work queues may have changed, so the scheduling node can start resource scheduling.
  • The other is that, after step 607, the scheduling node receives a queue report packet sent by the access node; this indicates that new task requests have arrived, so the scheduling node can also start resource scheduling.
  • These two events respectively indicate the working state of the resource nodes and the request state of the access node, and either can trigger the scheduling node to perform resource scheduling again, so that idle resource nodes are fully utilized to serve new task requests.
  • FIG. 7 is a schematic structural diagram of a scheduling node according to an embodiment of the present application. As shown in FIG. 7, the scheduling node 700 includes:
  • the receiving module 710 is configured to receive a service request message sent by the at least one access node, where the service request message from one access node indicates at least one service to be processed in the access node;
  • the obtaining module 720 is configured to obtain working queue state information of the at least one resource node.
  • the resource scheduling module 730 is configured to, for each access node, perform resource scheduling according to the service request message received by the receiving module 710 and the work queue state information acquired by the obtaining module 720, allocate at least one resource node for each service to be processed in the access node, and generate a scheduling result;
  • the sending module 740 is configured to send, to each access node, the scheduling result obtained by the resource scheduling module 730, so that the access node sends at least one task request to the allocated at least one resource node according to the scheduling result.
  • the receiving module 710 is further configured to receive a status monitoring message from each access node every predetermined time interval;
  • the sending module 740 is further configured to: feed back a status monitoring acknowledgement message to the access node, so that the access node sends a service request message to the scheduling node after receiving the status monitoring acknowledgement message; and feed back a service request acknowledgement message to the access node.
  • the sending module 740 is further configured to send a status monitoring message to each resource node every predetermined time interval;
  • the obtaining module 720 is configured to receive a status monitoring acknowledgement message fed back by each resource node, where the status monitoring acknowledgement message carries the work queue status information of the resource node.
  • the obtaining module 720 is configured to receive a status monitoring message sent by each access node, where the status monitoring message carries the working queue status information of the at least one resource node acquired by the access node.
  • the obtaining module 720 is configured to record the work queue state information of each resource node, where the work queue state information includes a timestamp at which the resource node generated the work queue state information; when the work queue state information of a resource node is acquired, the recorded work queue state information of that resource node is updated according to the chronological order of the timestamps.
  • the work queue state information includes the length of a resource node's work queue; the resource scheduling module 730 is configured to: determine idle resource nodes from the at least one resource node according to the work queue lengths; preset the priority of each service in each access node; and allocate at least one idle resource node for each service according to the priorities.
  • the resource scheduling module 730 is further configured to poll each service of each access node in turn in descending order of priority and allocate M_j idle resource nodes for the j-th service, where M_j is the number of task requests of the j-th service and services with higher priority obtain resource nodes first.
  • FIG. 8 is a schematic structural diagram of a scheduling node according to an embodiment of the present application.
  • the scheduling node 800 includes a processor 810, a memory 820, a port 830, and a bus 840.
  • Processor 810 and memory 820 are interconnected by a bus 840.
  • Processor 810 can receive and transmit data through port 830, wherein:
  • the processor 810 is configured to execute a machine readable instruction module stored by the memory 820.
  • the memory 820 stores machine readable instruction modules executable by the processor 810.
  • the instruction modules executable by the processor 810 include a receiving module 821, an obtaining module 822, a resource scheduling module 823, and a sending module 824, wherein:
  • the receiving module 821, when executed by the processor 810, may: receive a service request message sent by at least one access node, where the service request message from one access node indicates at least one service to be processed in that access node;
  • the obtaining module 822, when executed by the processor 810, may: acquire work queue state information of at least one resource node;
  • the resource scheduling module 823, when executed by the processor 810, may: for each access node, perform resource scheduling according to the service request message received by the receiving module 821 and the work queue state information acquired by the obtaining module 822, allocate at least one resource node for each service to be processed in the access node, and generate a scheduling result;
  • the sending module 824, when executed by the processor 810, may: for each access node, send the scheduling result obtained by the resource scheduling module 823 to the access node, so that the access node sends at least one task request to the allocated at least one resource node according to the scheduling result.
  • FIG. 9 is a schematic structural diagram of an access node according to an embodiment of the present application.
  • the access node 900 includes: a receiving module 910, a selecting module 920, and a sending module 930, where
  • the receiving module 910 is configured to receive a task request for at least one service to be processed
  • the selecting module 920 is configured to select one scheduling node from the at least one scheduling node
  • the sending module 930 is configured to send, to the scheduling node selected by the selecting module 920, a service request message indicating the at least one service received by the receiving module 910, so that the scheduling node performs resource scheduling according to the service request message and the acquired work queue state information of at least one resource node, allocates at least one resource node for each service to be processed, and generates a scheduling result;
  • the receiving module 910 is further configured to receive the scheduling result from the scheduling node;
  • the sending module 930 is further configured to send at least one task request to the allocated at least one resource node according to the scheduling result received by the receiving module 910.
  • the sending module 930 is configured to send a status monitoring message to each scheduling node every predetermined time interval
  • the selecting module 920 is configured to: when the receiving module 910 receives a status monitoring acknowledgement message fed back by a scheduling node, determine that the scheduling node is online; sort all online scheduling nodes in chronological order of the moments at which their status monitoring acknowledgement messages were received; according to the sorting result, control the sending module 930 to send a service request message to each online scheduling node in turn, while monitoring whether the receiving module 910 receives a service request acknowledgement message fed back by that online scheduling node; and when the receiving module 910 receives a service request acknowledgement message fed back by an online scheduling node, take that online scheduling node as the selected scheduling node.
  • the access node 900 further includes:
  • the generating module 940 is configured to generate a request queue for each service; each time the receiving module 910 receives a task request for a service, the task request is written into the request queue of that service, where the service request message carries the number of task requests in the request queue of each service.
  • the receiving module 910 is further configured to: before the allocated one resource node processes the received task request, receive, from the resource node, a request receiving confirmation message that carries the working queue status information of the resource node;
  • the sending module 930 is further configured to send, to the scheduling node, a status monitoring message carrying the work queue state information of each resource node, so that the scheduling node acquires the work queue state information of the resource nodes according to the status monitoring message.
  • the receiving module 910 is further configured to: after the allocated one resource node processes the received task request, receive, from the resource node, a request processing confirmation message that carries the working queue status information of the resource node;
  • the sending module 930 is further configured to send, to the scheduling node, a status monitoring message that carries the working queue status information of each resource node, so that the scheduling node acquires the working queue status information of the resource node according to the status monitoring message.
  • FIG. 10 is a schematic structural diagram of an access node according to an embodiment of the present application.
  • the access node 1000 includes a processor 1010, a memory 1020, a port 1030, and a bus 1040.
  • the processor 1010 and the memory 1020 are interconnected by a bus 1040.
  • the processor 1010 can receive and transmit data through the port 1030, wherein:
  • the processor 1010 is configured to execute a machine readable instruction module stored by the memory 1020.
  • the memory 1020 stores machine readable instruction modules executable by the processor 1010.
  • the instruction modules executable by the processor 1010 include a receiving module 1021, a selecting module 1022, and a sending module 1023, wherein:
  • the receiving module 1021 when executed by the processor 1010, may be: receiving a task request for at least one service to be processed;
  • the selecting module 1022 is executed by the processor 1010, and may be: selecting one scheduling node from the at least one scheduling node;
  • the sending module 1023, when executed by the processor 1010, may: send, to the scheduling node selected by the selecting module 1022, a service request message indicating the at least one service received by the receiving module 1021, so that the scheduling node performs resource scheduling according to the service request message and the acquired work queue state information of at least one resource node, allocates at least one resource node for each service to be processed, and generates a scheduling result;
  • the receiving module 1021, when executed by the processor 1010, may further: receive the scheduling result from the scheduling node;
  • the sending module 1023, when executed by the processor 1010, may further: send at least one task request to the allocated at least one resource node according to the scheduling result received by the receiving module 1021.
  • the instruction modules executable by the processor 1010 further include a generating module 1024.
  • the generating module 1024, when executed by the processor 1010, may: generate a request queue for each service; each time the receiving module 1021 receives a task request for a service, the task request is written into the request queue of that service, where the service request message carries the number of task requests in the request queue of each service.
  • each functional module in each embodiment of the present application may be integrated into one processing unit, or each module may exist physically separately, or two or more modules may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
  • In addition, each embodiment of the present application can be implemented by a data processing program executed by a data processing device such as a computer. Obviously, the data processing program constitutes the present application.
  • Furthermore, a data processing program usually stored in a storage medium is executed by directly reading the program out of the storage medium or by installing or copying the program to a storage device (such as a hard disk and/or memory) of the data processing device. Therefore, such a storage medium also constitutes the present application.
  • the storage medium can use any type of recording method, such as paper storage medium (such as paper tape, etc.), magnetic storage medium (such as floppy disk, hard disk, flash memory, etc.), optical storage medium (such as CD-ROM, etc.), magneto-optical storage medium (such as MO, etc.).
  • the present application also discloses a storage medium in which is stored a data processing program for performing any of the above-described embodiments of the present application.

Abstract

The present application discloses a resource scheduling method, a scheduling node, and an access node. The method includes: receiving a service request message sent by at least one access node, where the service request message from one access node indicates at least one service to be processed in that access node; acquiring work queue state information of at least one resource node; and, for each access node, performing the following processing: performing resource scheduling according to the service request message and the work queue state information, allocating at least one resource node for each service to be processed in the access node and generating a scheduling result, and sending the scheduling result to the access node, so that the access node sends at least one task request to the allocated at least one resource node according to the scheduling result.

Description

Distributed resource scheduling method, scheduling node, and access node
This application claims priority to Chinese Patent Application No. 201610874061.X, entitled "Distributed resource scheduling method, scheduling node and access node", filed with the Chinese Patent Office on September 30, 2016, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of computer processing, and in particular to a distributed resource scheduling method, a scheduling node, and an access node.
Background
Distributed computing decomposes a complex task or application into many small parts that are distributed to multiple computers for parallel processing. During resource scheduling, a scheduling node determines, among multiple resource nodes and according to a received request, the resource node that will process the request.
Summary
In view of this, embodiments of the present application provide a distributed resource scheduling method, a scheduling node, and an access node, which can improve the efficiency with which the scheduling node performs resource scheduling and the resource utilization of the resource nodes.
The technical solutions of the embodiments of the present application are implemented as follows:
An embodiment of the present application provides a resource scheduling method, the method including:
receiving a service request message sent by at least one access node, where the service request message from one access node indicates at least one service to be processed in that access node;
acquiring work queue state information of at least one resource node; and,
for each access node, performing the following processing:
performing resource scheduling according to the service request message and the work queue state information, allocating at least one resource node for each service to be processed in the access node and generating a scheduling result, and sending the scheduling result to the access node, so that the access node sends at least one task request to the allocated at least one resource node according to the scheduling result.
An embodiment of the present application further provides a resource scheduling method, the method including:
receiving a task request for at least one service to be processed;
selecting one scheduling node from at least one scheduling node, and sending to that scheduling node a service request message indicating the at least one service, so that the scheduling node performs resource scheduling according to the service request message and acquired work queue state information of at least one resource node, allocates at least one resource node for each service to be processed, and generates a scheduling result; and,
receiving the scheduling result from that scheduling node, and sending at least one task request to the allocated at least one resource node according to the scheduling result.
An embodiment of the present application further provides a scheduling node, including a processor and a memory, the memory storing instructions executable by the processor, the processor being configured, when executing the instructions, to:
receive a service request message sent by at least one access node, where the service request message from one access node indicates at least one service to be processed in that access node;
acquire work queue state information of at least one resource node; and
for each access node, perform the following processing: performing resource scheduling according to the service request message and the work queue state information, allocating at least one resource node for each service to be processed in the access node and generating a scheduling result, and sending the scheduling result to the access node, so that the access node sends at least one task request to the allocated at least one resource node according to the scheduling result.
The present application provides an access node, including a processor and a memory, the memory storing instructions executable by the processor, the processor being configured, when executing the instructions, to:
receive a task request for at least one service to be processed;
select one scheduling node from at least one scheduling node;
send to that scheduling node a service request message indicating the at least one service, so that the scheduling node performs resource scheduling according to the service request message and acquired work queue state information of at least one resource node, allocates at least one resource node for each service to be processed, and generates a scheduling result;
receive the scheduling result from that scheduling node; and
send at least one task request to the allocated at least one resource node according to the scheduling result.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present application more clearly, the accompanying drawings required for describing the embodiments are briefly introduced below. Obviously, the accompanying drawings in the following description show merely some embodiments of the present application, and a person of ordinary skill in the art may derive other drawings from them without creative efforts. In the drawings:
FIG. 1 is a schematic structural diagram of a distributed resource scheduling system according to an embodiment of the present application;
FIG. 2 is a schematic flowchart of a resource scheduling method according to an embodiment of the present application;
FIG. 3 is a schematic flowchart of a resource scheduling method according to another embodiment of the present application;
FIG. 4 is a schematic interaction diagram of a resource scheduling method according to an embodiment of the present application;
FIG. 5 is a schematic interaction diagram of a resource scheduling method according to another embodiment of the present application;
FIG. 6 is a schematic interaction diagram of a resource scheduling method according to still another embodiment of the present application;
FIG. 7 is a schematic structural diagram of a scheduling node according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of a scheduling node according to an embodiment of the present application;
FIG. 9 is a schematic structural diagram of an access node according to an embodiment of the present application;
FIG. 10 is a schematic structural diagram of an access node according to an embodiment of the present application.
Detailed Description
For distributed computing, in existing resource scheduling methods the scheduling node allocates a resource node to each request by estimating the processing capability of each resource node in advance. If the estimate of a resource node's processing capability is inaccurate, scheduling of the whole system becomes unreasonable, and requests cannot be scheduled according to the true capability of the resource nodes. In addition, when different requests require different processing times that cannot be predicted in advance, the request-processing load across resource nodes easily becomes unbalanced. As a result, resource allocation among the resource nodes is unreasonable, and resource utilization drops.
The technical solutions in the embodiments of the present application are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are merely some rather than all of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative efforts shall fall within the protection scope of the present application.
FIG. 1 is a schematic structural diagram of a distributed resource scheduling system according to an embodiment of the present application. Referring to FIG. 1, the distributed resource scheduling system 100 includes access nodes 110_1 to 110_L, scheduling nodes 120_1 to 120_M, and resource nodes 130_1 to 130_N; that is, there are L access nodes, M scheduling nodes, and N resource nodes. Usually, N is much larger than L.
The distributed resource scheduling system 100 performs distributed computing to process large computing tasks, for example transcoding a large video file. The three kinds of nodes have different functions. Specifically, the access nodes 110_1 to 110_L receive task requests corresponding to particular services from an upper layer, report the request state of each service to the scheduling nodes 120_1 to 120_M to apply for resources, and control execution of the requests with the resource nodes. In addition, the access nodes 110_1 to 110_L may periodically forward the work queue state information of the resource nodes 130_1 to 130_N to the scheduling nodes 120_1 to 120_M.
The scheduling nodes 120_1 to 120_M maintain the work queue state information of each resource node through interaction with the access nodes 110_1 to 110_L and/or the resource nodes 130_1 to 130_N, and allocate resource nodes to the access nodes 110_1 to 110_L according to the request states reported by the access nodes.
The resource nodes 130_1 to 130_N process the task requests sent by the access nodes, and report their own work queue state information to the access nodes 110_1 to 110_L and the scheduling nodes 120_1 to 120_M.
FIG. 2 is a schematic flowchart of a resource scheduling method according to an embodiment of the present application. The method is applied to a scheduling node. Referring to FIG. 2, the method includes:
Step 201: Receive a service request message sent by at least one access node.
The service request message from one access node indicates at least one service to be processed in that access node. Specifically, each access node receives task requests of various services from an upper-layer service management unit and maintains a multi-service request queue. A task request indicates a request to process a task of a certain service. Each time a task request for a service is received, the access node writes the task request into the request queue of that service. The service request message carries the number of task requests in the request queue of each service in the access node. The service request message may also be called a queue report packet, used by the access node to report to the scheduling node the current request state of each service.
An access node may send service request messages at different moments. In this way, the scheduling node learns the current number of requests in the request queue of each service of each access node. The scheduling node maintains a service request list according to the service request messages received from the access nodes; the list stores the identifier (ID) of each access node, the ID of each service, and the number No. of task requests in its request queue.
Table 1 gives an example of a service request list. There are L access nodes in total, and each access node includes multiple services; for example, the l-th access node includes S_l services, l = 1, …, L. For access node ID = 1, there are S_1 services, and the request queues of the services contain different numbers of task requests: for service ID = 11, the number of task requests in the request queue is No. = 1; for service ID = 1S_1, the number is No. = 0, that is, no task request has currently been received. The access node ID may also be represented by the IP address of the access node.
[Table 1: service request list — image PCTCN2017103606-appb-000001]
Table 1. Service request list
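The service request list of Table 1 can be modeled as a nested mapping from access node ID to service ID to request count. The sketch below is illustrative only; the function names are assumptions and not taken from the application:

```python
# Illustrative service request list (cf. Table 1):
# access node ID -> service ID -> number of task requests (No.).
service_request_list = {}

def record_report(access_id, service_id, count):
    """Record the request-queue length carried in a service request message."""
    service_request_list.setdefault(access_id, {})[service_id] = count

def pending_requests(access_id, service_id):
    """Number of task requests currently recorded for one service."""
    return service_request_list.get(access_id, {}).get(service_id, 0)
```

With the Table 1 example, access node 1 would report count 1 for service ID 11 and count 0 for service ID 1S_1.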
Step 202: Acquire work queue state information of at least one resource node.
The scheduling node can obtain the work queue state information of at least one resource node in either of the following two manners.
Manner (1): from the resource nodes
The scheduling node sends a status monitoring message to each resource node every predetermined time interval, and then receives a status monitoring acknowledgement message fed back by each resource node; the status monitoring acknowledgement message carries the work queue state information of that resource node.
For example, the status monitoring message is a heartbeat packet, and the scheduling node periodically sends a heartbeat packet to each resource node, e.g. with a predetermined time interval of 3 seconds. After receiving the heartbeat packet from the scheduling node, a resource node returns a heartbeat acknowledgement (ACK) packet (i.e., the status monitoring acknowledgement message) to the scheduling node, generates in real time the work queue state information of the resource node at that moment, and returns the work queue state information to the scheduling node by carrying it in the heartbeat ACK packet.
Manner (2): from the access nodes
The scheduling node receives a status monitoring message sent by each access node; the status monitoring message carries the work queue state information of at least one resource node acquired by that access node. The access node can obtain the current work queue state information of a resource node when exchanging task requests with it. Then, when sending the status monitoring message to the scheduling node, the access node carries the work queue state information of the at least one resource node it has acquired. In this case, the status monitoring message may also be a heartbeat packet.
In the embodiments of the present application, each resource node builds a work queue according to the pending task requests received from access nodes. The work queue contains the task requests the resource node is processing, and the length of the work queue is the number of task requests the resource node is currently processing.
Specifically, the work queue state information includes a timestamp t_k at which the k-th resource node generated the work queue state information, and the work queue length q_k of that resource node, k = 1, …, K. The scheduling node records the received work queue state information of each resource node, for example by building a work queue state table and maintaining it according to the received work queue state information.
Table 2 is an example of a work queue state table built by a scheduling node. There are K resource nodes in total, and the work queue state information of each resource node includes the timestamp t_k (date and time) and the work queue length q_k.
[Table 2: work queue state table — images PCTCN2017103606-appb-000002 and PCTCN2017103606-appb-000003]
Table 2. Work queue state table
Each time the scheduling node receives the work queue state information of a resource node, it updates the recorded work queue state information of that resource node according to the chronological order of the timestamps. For example, suppose the locally recorded work queue state information of the k-th resource node has timestamp t_k and queue length q_k, and at some moment the scheduling node receives new work queue state information of the k-th resource node with timestamp t_g and queue length q_g. If t_g > t_k, that is, t_g is chronologically the more recent moment, the local work queue state information is updated by setting t_k = t_g and q_k = q_g; otherwise, this update is discarded and the previous record is kept unchanged.
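The timestamp-ordered update rule just described can be sketched as follows (an illustrative Python sketch; the names are assumptions). A report is applied only if its timestamp is newer than the recorded one; stale reports are discarded:

```python
# Illustrative work queue state table (cf. Table 2):
# resource node ID -> (timestamp t_k, queue length q_k).
state_table = {}

def update_state(node_id, timestamp, queue_length):
    """Apply a received report only if it is newer than the record."""
    recorded = state_table.get(node_id)
    if recorded is None or timestamp > recorded[0]:
        state_table[node_id] = (timestamp, queue_length)
        return True   # record updated
    return False      # stale report: keep the previous record
```

Because reports may arrive over two paths (directly from resource nodes and forwarded via access nodes), this comparison keeps the table from being overwritten by out-of-date information.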
Step 203: For each access node, perform resource scheduling according to the service request message and the work queue state information, allocate at least one resource node for each service to be processed in the access node, and generate a scheduling result.
When performing resource scheduling, the scheduling node carries out the following steps:
Step 2031: Determine idle resource nodes from the at least one resource node according to the work queue lengths.
Specifically, the scheduling node presets a threshold Q on the work queue length of a resource node. If the work queue length of a resource node satisfies q_k < Q, the resource node is considered idle; otherwise it is considered busy. For example, Q = 3.
Step 2032: Preset the priority of each service in the access node.
The scheduling node divides the services into priorities and locally preconfigures the priority P of each service, where services with higher priority obtain resource nodes first. For example, P = {0, 1, 2, 3}, where 0 represents the highest priority. As shown in Table 1, for access node ID = 1, service ID = 11 belongs to the highest priority 0, and service ID = 12 belongs to priority 1.
Step 2033: Allocate at least one idle resource node for each service according to the priorities.
Specifically: in descending order of priority, poll each service of each access node in turn, and allocate at least one idle resource node for each service.
The service request message carries the number of task requests in the request queue of each service of an access node. If the number of task requests of the j-th service is M_j, j = 1, …, J, where J is the total number of services, then when performing resource scheduling the scheduling node allocates M_j idle resource nodes for the j-th service.
During scheduling, the scheduling node polls all access nodes in turn for services of priority 0. If such a service exists, one idle resource node is allocated for each request in that service's queue, and the work queue length q_k of the allocated resource node is incremented by 1. Once no access node has any service of the highest priority 0 left, the scheduling node starts scheduling services of priority 1, and so on.
If an access node includes multiple services of the same priority, as shown in Table 1 for access node ID = 2, where both service ID = 21 and service ID = 22 belong to the highest priority 0, resource nodes may be allocated to all of these equal-priority services in a single round of polling, or to only one of them per round.
If the idle resource nodes are exhausted during allocation, this round of scheduling stops; likewise, once all services of all nodes have been allocated, this round of scheduling stops.
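Steps 2031 to 2033 can be condensed into the following sketch. It is illustrative only: it assumes Q = 3, and since the application does not prescribe which idle node serves a given request, it simply picks the first idle node found:

```python
def schedule(services, queue_len, Q=3):
    """Illustrative sketch of steps 2031-2033.

    services:  list of (priority, access_node_id, service_id, m_j) tuples,
               where priority 0 is the highest and m_j is the number of
               task requests of that service.
    queue_len: dict of resource node ID -> current work queue length q_k.
    Returns a dict (access_node_id, service_id) -> allocated node IDs.
    """
    result = {}
    # sorting on the leading priority field polls high-priority services first
    for _, acc, svc, m_j in sorted(services):
        for _ in range(m_j):  # one idle node per task request
            idle = [k for k, q in queue_len.items() if q < Q]
            if not idle:
                return result          # idle nodes exhausted: stop scheduling
            node = idle[0]
            queue_len[node] += 1       # allocated node's queue length + 1
            result.setdefault((acc, svc), []).append(node)
    return result
```

A node whose work queue reaches Q is no longer considered idle and is skipped; one node may still serve several requests, consistent with step 204.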
Step 204: Send the scheduling result to the access node, so that the access node sends at least one task request to the allocated at least one resource node according to the scheduling result.
After completing resource scheduling, the scheduling node notifies the access node of the scheduling result. The scheduling result indicates the IDs of the resource nodes allocated to each service ID, that is, for each service of the access node it specifies which resource nodes the service has been assigned to. Each request of a service corresponds to one resource node, so M_j resource nodes are allocated for the j-th service. Then, according to the received scheduling result, the access node takes the corresponding number of task requests out of each request queue and sends them to the resource nodes given in the scheduling result for processing; that is, it takes M_j task requests out of the request queue of the j-th service and sends one task request to each of the M_j scheduled resource nodes. After scheduling is completed, a single resource node may process multiple task requests.
In this embodiment, by receiving the service request messages sent by at least one access node and acquiring the work queue state information of at least one resource node, the scheduling node does not need to manage each individual service request, nor does it need to estimate the processing capability of each resource node in advance. For each access node, resource scheduling is performed according to the service request message and the work queue state information, and at least one resource node is scheduled for each service of the access node. This avoids situations in which some resource nodes are fully loaded while others are empty: even if different service requests require different processing times and resource nodes differ in processing capability, the work queue lengths of the resource nodes can be kept at roughly the same level, and under heavy cluster pressure all resource nodes can be kept running at full load. Resource scheduling efficiency is thereby improved, as are the processing efficiency and resource utilization of the scheduling nodes and resource nodes.
In a specific scenario, the upper-layer service management unit receives task requests of various video transcoding services. The access node is specifically a video access node ("access"), which continuously receives task requests for video transcoding services from the upper-layer service management unit, writes them into the request queues of the corresponding video transcoding services, and then sends queue report packets to an online and normally working video scheduling node (also called the video master). The video scheduling node acquires the work queue state information of the video resource nodes (also called video workers) and, in combination with the received queue report packets, performs resource scheduling, allocating resource nodes for each video transcoding service.
FIG. 3 is a schematic flowchart of a resource scheduling method according to another embodiment of the present application. The method is applied to an access node. Referring to FIG. 3, the method includes:
Step 301: Receive a task request for at least one service to be processed.
In this step, the upper-layer service management unit sends task requests of various services to the access node. The access node generates a request queue for at least one service and maintains a multi-service request queue. Each time a task request for a service is received, the task request is written into the request queue of that service. In this way, each service corresponds to one request queue, and each request queue contains one or more task requests. Task requests arrive at different moments, and the processing time required by each task request may also differ, for example a few seconds or a few minutes.
Step 302: Select one scheduling node from at least one scheduling node, and send to that scheduling node a service request message indicating the at least one service.
In this step, the service request message carries the number of task requests in the request queue of each service. For example, the access node has currently received requests for J services in total, and the number of task requests in the request queue of the j-th service is M_j, j = 1, …, J. The scheduling node can then perform resource scheduling according to the service request message and the acquired work queue state information of at least one resource node, schedule at least one resource node for each service, and generate a scheduling result.
In addition, the sending of the service request message by the access node may be triggered by either of two events:
(1) Immediately after receiving a task request and writing it into the request queue of the corresponding service, the access node triggers sending, that is, it selects a scheduling node and sends a service request message to it.
(2) The access node receives task requests continuously but triggers sending of the service request message on a predetermined period, for example selecting a scheduling node and sending it a service request message every T seconds.
Manner (2) above suits cases where task requests arrive too frequently; reporting periodically saves processing load on the access node.
Step 303: Receive the scheduling result from the scheduling node, and send at least one task request to the allocated at least one resource node according to the scheduling result.
Corresponding to the description of step 204, according to the received scheduling result the access node takes the corresponding number of task requests out of each service's request queue and sends them to the resource nodes given in the scheduling result for processing; that is, it takes M_j task requests out of the request queue of the j-th service and sends one task request to each of the M_j scheduled resource nodes. In other words, after scheduling is completed, each task request corresponds to one resource node, while one resource node may process multiple task requests.
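The dispatch in step 303 might be sketched as follows (illustrative only; the `send` callback stands in for the actual transmission of a task request to a resource node):

```python
from collections import deque

def dispatch(request_queues, scheduling_result, send):
    """Illustrative sketch of step 303: for the j-th service, take M_j task
    requests off its request queue and send one to each allocated node.

    request_queues:    dict of service ID -> deque of pending task requests.
    scheduling_result: dict of service ID -> allocated resource node IDs.
    send:              callable taking (node_id, task_request).
    """
    for service_id, nodes in scheduling_result.items():
        queue = request_queues[service_id]
        for node_id in nodes:          # one task request per allocated node
            if not queue:
                break
            send(node_id, queue.popleft())
```

Requests not covered by the current scheduling result simply stay in their queues and are reported again in the next service request message.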
In the above embodiment, the access node generates a request queue for at least one service, writes each received task request for a service into that service's request queue, selects one scheduling node from at least one scheduling node, and sends it a service request message for the request queues. As long as one scheduling node is online and working normally, the access node can send it service request messages and resource scheduling proceeds normally, without imposing requirements on the performance of all scheduling nodes in the distributed system.
Moreover, the scheduling node only needs to receive the service request messages reported by the access nodes, and a service request message carries only the number of task requests in the request queue of each service of the access node. The scheduling node therefore does not need to manage the specific state of each task request, such as the time the task request was received or the processing time it requires. This reduces the processing load on the scheduling node, lowers its failure rate, and improves its resource utilization.
FIG. 4 is a schematic interaction diagram of a resource scheduling method according to an embodiment of the present application, taking one access node as an example and involving interaction between the access node, at least one scheduling node, and at least one resource node. As shown in FIG. 4, the method includes the following steps:
Step 401: The access node receives a task request for at least one service to be processed.
Step 402: The access node selects one scheduling node from at least one scheduling node.
Step 403: The access node sends a service request message for its request queues to the selected scheduling node.
In actual execution, step 403 is an intermediate step in realizing step 402; the two steps are completed together, that is, the service request message is sent to the scheduling node during the process of selecting it. For details, see the embodiment shown in FIG. 5 below.
Step 404: Each scheduling node acquires work queue state information of at least one resource node.
Step 405: Each scheduling node performs resource scheduling according to the received service request message and the work queue state information of the at least one resource node, schedules at least one resource node for each service of each access node, and generates a scheduling result.
Step 406: The scheduling node returns the corresponding scheduling result to the access node.
Step 407: The access node sends task requests to the allocated resource nodes.
Step 408: The allocated resource nodes perform corresponding processing according to the received task requests.
FIG. 5 is a schematic interaction diagram of a resource scheduling method according to another embodiment of the present application, giving a specific implementation of steps 402 and 403 above. As shown in FIG. 5, the method includes the following steps:
Step 501: The access node generates a request queue for at least one service; each time a task request for a service is received, the task request is written into the request queue of that service.
Step 502: The access node sends a status monitoring message to each scheduling node every predetermined time interval.
In this step, the access node is preconfigured with the IP addresses of all scheduling nodes, and the status monitoring message may be a heartbeat packet. The access node periodically sends a heartbeat packet to every scheduling node, for example with a predetermined time interval of T seconds.
Step 503: An online scheduling node feeds back a status monitoring acknowledgement message to the access node.
For example, upon receiving the heartbeat packet from the access node, an online scheduling node immediately replies to the access node with a heartbeat ACK packet.
Step 504: The access node determines from the received status monitoring acknowledgement messages which scheduling nodes are online, and sorts all online scheduling nodes in chronological order of the moments at which their status monitoring acknowledgement messages were received.
Then, according to the sorting result, the access node sends the service request message to each online scheduling node in turn, while monitoring whether a service request acknowledgement message fed back by that online scheduling node is received. That is, depending on whether each online scheduling node is working normally, steps 505 and 506 are executed in a loop, stopping as soon as a service request acknowledgement message fed back by some scheduling node is received.
Step 505: Send the service request message to the i-th online scheduling node according to the sorting result.
Here, the initial value of i is 1; that is, the access node first selects the first online scheduling node and sends it the service request message, also called a queue report packet.
Step 506: The i-th online and normally working scheduling node feeds back a service request acknowledgement message to the access node.
After receiving the queue report packet from the access node, the i-th scheduling node, if working normally, immediately replies to the access node with a queue report packet ACK, i.e. the service request acknowledgement message.
If the access node does not receive the queue report packet ACK from the first online scheduling node, it considers that scheduling node unable to work normally or faulty, sets i = i + 1, and sends the queue report packet to the second online scheduling node, and so on.
Step 507: The access node determines the final scheduling node according to the service request acknowledgement message.
When the access node first receives a queue report packet ACK from an online scheduling node, it determines that scheduling node as the selected one and will receive the scheduling result from it.
If none of the scheduling nodes is online, or none replies with a queue report packet ACK, the access node abandons this scheduling request.
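The selection-with-failover loop of steps 504-507 can be sketched as below (illustrative only; the two callbacks stand in for the real queue report packet exchange):

```python
def choose_scheduler(online_nodes, send_report, recv_ack):
    """Illustrative sketch of steps 504-507.

    online_nodes: scheduling node IDs sorted by heartbeat-ACK arrival time.
    send_report:  callable sending the queue report packet to one node.
    recv_ack:     callable returning True if that node ACKs the packet.
    Returns the selected node ID, or None if the request is abandoned.
    """
    for node in online_nodes:
        send_report(node)
        if recv_ack(node):
            return node          # first ACK wins: the final scheduling node
    return None                  # no node answered: abandon this request
```

Because the loop stops at the first acknowledging node, scheduling works as long as any one scheduling node is online and working normally.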
Step 508: Each scheduling node sends a status monitoring message to each resource node every predetermined time interval.
Step 509: Each resource node feeds back a status monitoring acknowledgement message to the corresponding scheduling node; the status monitoring acknowledgement message carries the work queue state information of that resource node.
As described in manner (1) of step 202, the scheduling node periodically sends heartbeat packets to the resource nodes and obtains their work queue state information from the heartbeat ACK packets fed back.
Steps 508 and 509 may also be executed in parallel with steps 501 to 506.
Step 510: The scheduling node performs resource scheduling according to the received service request message and the work queue state information of each resource node, schedules at least one resource node for each service of each access node, and generates a scheduling result.
Step 511: The scheduling node returns the corresponding scheduling result to the access node that sent the service request message.
Step 512: The access node sends task requests to the allocated resource nodes.
Step 513: The allocated resource nodes perform corresponding processing according to the received task requests.
In the foregoing apparatus embodiments, the specific methods by which each module implements its own functions have been described in the method embodiments and are not repeated here.
The foregoing descriptions are merely preferred embodiments of the present application and are not intended to limit it. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present application shall fall within the protection scope of the present application.

Claims (25)

  1. 一种分布式资源调度方法,其特征在于,所述方法包括:
    接收至少一个接入节点发送的业务请求消息,其中,来自一个接入节点的所述业务请求消息指示该接入节点中待处理的至少一个业务;
    获取至少一个资源节点的工作队列状态信息;及,
    针对每个接入节点,执行如下处理:
    根据所述业务请求消息和所述工作队列状态信息执行资源调度,为该接入节点中待处理的每个业务分配至少一个资源节点并生成调度结果,向该接入节点发送所述调度结果,以使该接入节点根据所述调度结果向所分配的至少一个资源节点发送至少一个任务请求。
  2. 根据权利要求1所述的方法,进一步包括:
    每隔预定时间间隔从每个接入节点接收状态监测消息,并向该接入节点反馈状态监测确认消息,以使该接入节点在接收到所述状态监测确认消息之后,向所述调度节点发送所述业务请求消息;
    向该接入节点反馈业务请求确认消息。
  3. 根据权利要求1所述的方法,其中,所述获取至少一个资源节点的工作队列状态信息包括:
    每隔预定时间间隔向每个资源节点发送状态监测消息;
    接收每个资源节点反馈的状态监测确认消息,所述状态监测确认消息携带有该资源节点的所述工作队列状态信息。
  4. 根据权利要求1所述的方法,其中,所述获取至少一个资源节点的工作队列状态信息包括:
    接收每个接入节点发送的状态监测消息,所述状态监测消息携带有该接入节点获取到的至少一个资源节点的工作队列状态信息。
  5. 根据权利要求1至4中任一项所述的方法,所述方法进一步包括:
    记录每个资源节点的工作队列状态信息,其中,所述工作队列状态信息包括该资源节点生成该工作队列状态信息时的时间戳;
    当获取到一个资源节点的工作队列状态信息时,根据时间戳的先后顺序更新所记录的该资源节点的工作队列状态信息。
  6. The method according to any one of claims 1 to 4, wherein the working-queue state information includes a length of a working queue of a resource node, and the performing resource scheduling according to the service request message and the working-queue state information comprises:
    determining idle resource nodes among the at least one resource node according to the lengths of the working queues;
    presetting a priority for each service in each access node; and
    allocating at least one idle resource node to each service according to the priorities.
  7. The method according to claim 6, wherein the service request message carries the numbers of task requests in the request queues of J services of an access node, and if the number of task requests of the j-th service is Mj, j = 1, …, J, the allocating at least one idle resource node to each service according to the priorities comprises:
    polling the services of each access node in turn in descending order of priority, and allocating Mj idle resource nodes to the j-th service, wherein a service with a higher priority obtains resource nodes first.
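For illustration only, the allocation recited in claims 6 and 7 — idle resource nodes determined from working-queue lengths, then handed out to services in descending priority — might look like the following sketch; the data shapes and the idle threshold are assumptions, not part of the claims:

```python
def allocate(services, queue_lengths, idle_threshold=0):
    """services: list of (priority, service_id, m_j), m_j pending task requests.
    queue_lengths: dict of resource_node_id -> working-queue length.
    Returns a dict of service_id -> list of allocated idle resource nodes."""
    # A resource node is considered idle when its working queue is no
    # longer than the threshold (here: empty).
    idle = [n for n, length in sorted(queue_lengths.items())
            if length <= idle_threshold]
    allocation = {}
    # Poll services in descending priority: higher-priority services
    # obtain idle resource nodes first, Mj nodes for the j-th service.
    for priority, service_id, m_j in sorted(services, reverse=True):
        allocation[service_id] = idle[:m_j]
        del idle[:m_j]
    return allocation
```

With two idle nodes and a busy one, a high-priority service requesting one node is served before a low-priority service, which then receives whatever idle capacity remains.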
  8. A distributed resource scheduling method, the method comprising:
    receiving task requests for at least one service to be processed;
    selecting a scheduling node from at least one scheduling node, and sending to the scheduling node a service request message indicating the at least one service, so that the scheduling node performs resource scheduling according to the service request message and obtained working-queue state information of at least one resource node, allocates at least one resource node to each service to be processed, and generates a scheduling result; and
    receiving the scheduling result from the scheduling node, and sending at least one task request to the at least one allocated resource node according to the scheduling result.
  9. The method according to claim 8, wherein the selecting a scheduling node from at least one scheduling node and sending to the scheduling node a service request message indicating the at least one service comprises:
    sending a state monitoring message to each scheduling node at predetermined time intervals, and when a state monitoring acknowledgement message fed back by a scheduling node is received, determining that the scheduling node is online;
    sorting all online scheduling nodes in chronological order of the times at which their state monitoring acknowledgement messages were received, sending the service request message to each online scheduling node in turn according to the sorting result, while monitoring whether a service request acknowledgement message fed back by that online scheduling node is received; and
    when a service request acknowledgement message fed back by an online scheduling node is received, taking that online scheduling node as the selected scheduling node.
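Outside the claim language, the selection procedure of claim 9 can be sketched as below; `send_request` is a hypothetical callable standing in for sending the service request message and waiting for its acknowledgement:

```python
def select_scheduling_node(ack_times, send_request):
    """ack_times: dict of scheduling_node_id -> time its state-monitoring
    acknowledgement was received (only online nodes appear here).
    send_request: callable(node_id) -> bool, True when a service request
    acknowledgement comes back. Returns the first acknowledging node,
    or None if no online scheduling node acknowledges."""
    # Try online scheduling nodes in the chronological order in which
    # their state-monitoring acknowledgements arrived.
    for node_id in sorted(ack_times, key=ack_times.get):
        if send_request(node_id):
            return node_id
    return None
```

The effect is a simple failover: the access node works through the online scheduling nodes, earliest acknowledger first, and settles on the first one that confirms the service request.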
  10. The method according to claim 8, further comprising:
    generating a request queue for each service, and each time a task request for a service is received, writing the task request into the request queue of that service, wherein the service request message carries the number of task requests in the request queue of each service.
  11. The method according to claim 10, wherein the service request message carries the numbers of task requests in the request queues of J services, and if the number of task requests of the j-th service is Mj, j = 1, …, J, the scheduling result will indicate that Mj resource nodes are allocated to the j-th service, and the sending at least one task request to the at least one allocated resource node according to the scheduling result comprises:
    taking Mj task requests from the request queue of the j-th service, and sending one task request to each allocated resource node.
  12. The method according to any one of claims 8 to 11, further comprising:
    before an allocated resource node processes a received task request, receiving from the resource node a request reception acknowledgement message carrying the working-queue state information of that resource node; and
    sending to the scheduling node a state monitoring message carrying the working-queue state information of each resource node, so that the scheduling node obtains the working-queue state information of the resource nodes according to the state monitoring message.
  13. The method according to any one of claims 8 to 11, further comprising:
    after an allocated resource node finishes processing a received task request, receiving from the resource node a request processing acknowledgement message carrying the working-queue state information of that resource node; and
    sending to the scheduling node a state monitoring message carrying the working-queue state information of each resource node, so that the scheduling node obtains the working-queue state information of the resource nodes according to the state monitoring message.
  14. A scheduling node, comprising a processor and a memory, the memory storing instructions executable by the processor, wherein when executing the instructions, the processor is configured to:
    receive a service request message sent by at least one access node, wherein the service request message from one access node indicates at least one service to be processed in that access node;
    obtain working-queue state information of at least one resource node; and
    for each access node, perform the following processing: performing resource scheduling according to the service request message and the working-queue state information, allocating at least one resource node to each service to be processed in the access node, generating a scheduling result, and sending the scheduling result to the access node, so that the access node sends at least one task request to the at least one allocated resource node according to the scheduling result.
  15. The scheduling node according to claim 14, wherein when executing the instructions, the processor is further configured to:
    receive a state monitoring message from each access node at predetermined time intervals, and feed back a state monitoring acknowledgement message to the access node, so that the access node, after receiving the state monitoring acknowledgement message, sends the service request message to the scheduling node; and feed back a service request acknowledgement message to the access node.
  16. The scheduling node according to claim 14, wherein when executing the instructions, the processor is further configured to:
    send a state monitoring message to each resource node at predetermined time intervals; and receive a state monitoring acknowledgement message fed back by each resource node, the state monitoring acknowledgement message carrying the working-queue state information of that resource node.
  17. The scheduling node according to claim 14, wherein when executing the instructions, the processor is further configured to:
    receive a state monitoring message sent by each access node, the state monitoring message carrying working-queue state information of at least one resource node obtained by that access node.
  18. The scheduling node according to any one of claims 14 to 17, wherein when executing the instructions, the processor is further configured to:
    record the working-queue state information of each resource node, wherein the working-queue state information includes a timestamp indicating when the resource node generated that working-queue state information; and when working-queue state information of a resource node is obtained, update the recorded working-queue state information of that resource node according to the chronological order of the timestamps.
  19. The scheduling node according to any one of claims 14 to 17, wherein the working-queue state information includes a length of a working queue of a resource node, and when executing the instructions, the processor is further configured to:
    determine idle resource nodes among the at least one resource node according to the lengths of the working queues; preset a priority for each service in each access node; and allocate at least one idle resource node to each service according to the priorities.
  20. An access node, comprising a processor and a memory, the memory storing instructions executable by the processor, wherein when executing the instructions, the processor is configured to:
    receive task requests for at least one service to be processed;
    select a scheduling node from at least one scheduling node;
    send to the scheduling node a service request message indicating the at least one service, so that the scheduling node performs resource scheduling according to the service request message and obtained working-queue state information of at least one resource node, allocates at least one resource node to each service to be processed, and generates a scheduling result;
    receive the scheduling result from the scheduling node; and
    send at least one task request to the at least one allocated resource node according to the scheduling result.
  21. The access node according to claim 20, wherein when executing the instructions, the processor is further configured to:
    send a state monitoring message to each scheduling node at predetermined time intervals; and
    when a state monitoring acknowledgement message fed back by a scheduling node is received, determine that the scheduling node is online; sort all online scheduling nodes in chronological order of the times at which their state monitoring acknowledgement messages were received; send the service request message to each online scheduling node in turn according to the sorting result, while monitoring whether a service request acknowledgement message fed back by that online scheduling node is received; and when a service request acknowledgement message fed back by an online scheduling node is received, take that online scheduling node as the selected scheduling node.
  22. The access node according to claim 20, wherein when executing the instructions, the processor is further configured to:
    generate a request queue for each service, and each time a task request for a service is received, write the task request into the request queue of that service, wherein the service request message carries the number of task requests in the request queue of each service.
  23. The access node according to any one of claims 20 to 22, wherein when executing the instructions, the processor is further configured to:
    before an allocated resource node processes a received task request, receive from the resource node a request reception acknowledgement message carrying the working-queue state information of that resource node; and
    send to the scheduling node a state monitoring message carrying the working-queue state information of each resource node, so that the scheduling node obtains the working-queue state information of the resource nodes according to the state monitoring message.
  24. The access node according to any one of claims 20 to 22, wherein when executing the instructions, the processor is further configured to: after an allocated resource node finishes processing a received task request, receive from the resource node a request processing acknowledgement message carrying the working-queue state information of that resource node; and
    send to the scheduling node a state monitoring message carrying the working-queue state information of each resource node, so that the scheduling node obtains the working-queue state information of the resource nodes according to the state monitoring message.
  25. A computer-readable storage medium, storing computer-readable instructions capable of causing at least one processor to perform the method according to any one of claims 1 to 13.
PCT/CN2017/103606 2016-09-30 2017-09-27 Distributed resource scheduling method, scheduling node, and access node WO2018059423A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/201,606 US10838777B2 (en) 2016-09-30 2018-11-27 Distributed resource allocation method, allocation node, and access node

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610874061.XA CN107885594B (zh) 2016-09-30 2016-09-30 Distributed resource scheduling method, scheduling node, and access node
CN201610874061.X 2016-09-30

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/201,606 Continuation US10838777B2 (en) 2016-09-30 2018-11-27 Distributed resource allocation method, allocation node, and access node

Publications (1)

Publication Number Publication Date
WO2018059423A1 true WO2018059423A1 (zh) 2018-04-05

Family

ID=61763142

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/103606 WO2018059423A1 (zh) 2016-09-30 2017-09-27 Distributed resource scheduling method, scheduling node, and access node

Country Status (3)

Country Link
US (1) US10838777B2 (zh)
CN (1) CN107885594B (zh)
WO (1) WO2018059423A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111796950A (zh) * 2020-07-16 2020-10-20 网易(杭州)网络有限公司 Data processing method and system
CN112148468A (zh) * 2019-06-28 2020-12-29 杭州海康威视数字技术股份有限公司 Resource scheduling method and apparatus, electronic device, and storage medium

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109218452B (zh) * 2018-11-16 2020-11-24 京东数字科技控股有限公司 Method and apparatus for pushing node information
CN109788315A (zh) * 2019-01-31 2019-05-21 湖南快乐阳光互动娱乐传媒有限公司 Video transcoding method, apparatus, and system
CN110636120B (zh) * 2019-09-09 2022-02-08 广西东信易联科技有限公司 Service-request-based distributed resource coordination system and method
CN112887353B (zh) * 2019-11-29 2024-01-23 中国移动通信有限公司研究院 Information processing method, apparatus, terminal, and storage medium
CN110968411A (zh) * 2019-12-06 2020-04-07 北京明略软件系统有限公司 Timed task scheduling method, apparatus, server, and storage medium
CN111338797B (zh) * 2020-02-19 2023-09-05 望海康信(北京)科技股份公司 Task processing method, apparatus, electronic device, and computer-readable storage medium
CN111443870B (zh) * 2020-03-26 2021-08-03 腾讯科技(深圳)有限公司 Data processing method, device, and storage medium
CN111866159A (zh) * 2020-07-28 2020-10-30 阿戈斯智能科技(苏州)有限公司 Method, system, device, and storage medium for invoking artificial intelligence services
CN112054923B (zh) * 2020-08-24 2023-08-18 腾讯科技(深圳)有限公司 Service request detection method, device, and medium
CN112114971A (zh) * 2020-09-28 2020-12-22 中国建设银行股份有限公司 Task allocation method, apparatus, and device
CN112689007B (zh) * 2020-12-23 2023-05-05 江苏苏宁云计算有限公司 Resource allocation method, apparatus, computer device, and storage medium
CN112948111B (zh) * 2021-02-26 2023-07-14 北京奇艺世纪科技有限公司 Task allocation method, apparatus, device, and computer-readable medium
CN113448737B (zh) * 2021-07-26 2024-03-22 北京清博智能科技有限公司 High-speed balanced allocation method for use in multi-task systems
CN113807924A (zh) * 2021-09-24 2021-12-17 华院分析技术(上海)有限公司 Service processing allocation method, system, storage medium, and device based on a batch processing algorithm
CN114138500B (zh) * 2022-01-29 2022-07-08 阿里云计算有限公司 Resource scheduling system and method
CN114785794A (zh) * 2022-03-29 2022-07-22 北京字节跳动网络技术有限公司 Resource allocation method, apparatus, device, medium, program product, and system
CN114924877B (zh) * 2022-05-17 2023-10-17 江苏泰坦智慧科技有限公司 Data-flow-based dynamic allocation computing method, apparatus, and device
CN115996466B (zh) * 2023-03-23 2023-06-27 广州世炬网络科技有限公司 Node function switching control method and apparatus based on transmission link parameters
CN116149827A (zh) * 2023-04-04 2023-05-23 云粒智慧科技有限公司 Distributed task scheduling system and distributed task scheduling execution system
CN116909780B (zh) * 2023-09-12 2023-11-17 天津卓朗昆仑云软件技术有限公司 Memory-based local distributed queue plugin, system, and queue processing method
CN117519951B (zh) * 2024-01-04 2024-05-03 深圳博瑞天下科技有限公司 Real-time data processing method and system based on a message middle platform

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101957780A (zh) * 2010-08-17 2011-01-26 中国电子科技集团公司第二十八研究所 Grid task scheduling processor and method based on resource state information
CN102073546A (zh) * 2010-12-13 2011-05-25 北京航空航天大学 Dynamic task scheduling method for distributed computing in a cloud computing environment
CN103699445A (zh) * 2013-12-19 2014-04-02 北京奇艺世纪科技有限公司 Task scheduling method, apparatus, and system
CN104363300A (zh) * 2014-11-26 2015-02-18 浙江宇视科技有限公司 Distributed scheduling apparatus for computing tasks in a server cluster
CN104657214A (zh) * 2015-03-13 2015-05-27 华存数据信息技术有限公司 Big data task management system and method based on multiple queues and multiple priorities

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4325120A (en) * 1978-12-21 1982-04-13 Intel Corporation Data processing system
US4387427A (en) * 1978-12-21 1983-06-07 Intel Corporation Hardware scheduler/dispatcher for data processing system
US6213652B1 (en) * 1995-04-18 2001-04-10 Fuji Xerox Co., Ltd. Job scheduling system for print processing
US5787304A (en) * 1996-02-05 1998-07-28 International Business Machines Corporation Multipath I/O storage systems with multipath I/O request mechanisms
CN101202761B (zh) * 2007-12-04 2010-11-03 赵晓宇 Distributed resource scheduling system and method
US9377979B1 (en) * 2009-06-09 2016-06-28 Breezyprint Corporation Secure mobile printing from a third-party device with proximity-based device listing
JP2011192250A (ja) * 2010-02-22 2011-09-29 Canon Inc Cloud computing system and control method of cloud computing system
CN102033777B (zh) * 2010-09-17 2013-03-20 中国资源卫星应用中心 ICE-based distributed job scheduling engine
JP2012083845A (ja) * 2010-10-07 2012-04-26 Canon Inc Cloud computing system, information processing method, and program
US9367276B2 (en) * 2011-02-23 2016-06-14 Ricoh Company, Ltd. Resolution of conflicts between print jobs and printers in a print shop environment
CN103414657A (zh) * 2013-08-22 2013-11-27 浪潮(北京)电子信息产业有限公司 Cross-data-center resource scheduling method, super scheduling center, and system
KR102254099B1 (ko) * 2014-05-19 2021-05-20 삼성전자주식회사 Memory swapping method, and host device, storage device, and data processing system applying the same
US9965323B2 (en) * 2015-03-11 2018-05-08 Western Digital Technologies, Inc. Task queues
US10509675B2 (en) * 2018-02-02 2019-12-17 EMC IP Holding Company LLC Dynamic allocation of worker nodes for distributed replication


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112148468A (zh) * 2019-06-28 2020-12-29 杭州海康威视数字技术股份有限公司 Resource scheduling method and apparatus, electronic device, and storage medium
CN112148468B (zh) * 2019-06-28 2023-10-10 杭州海康威视数字技术股份有限公司 Resource scheduling method and apparatus, electronic device, and storage medium
CN111796950A (zh) * 2020-07-16 2020-10-20 网易(杭州)网络有限公司 Data processing method and system
CN111796950B (zh) * 2020-07-16 2023-06-30 网易(杭州)网络有限公司 Data processing method and system

Also Published As

Publication number Publication date
US10838777B2 (en) 2020-11-17
CN107885594B (zh) 2020-06-12
CN107885594A (zh) 2018-04-06
US20190108069A1 (en) 2019-04-11

Similar Documents

Publication Publication Date Title
WO2018059423A1 (zh) Distributed resource scheduling method, scheduling node, and access node
CN110297711B (zh) Batch data processing method and apparatus, computer device, and storage medium
CN101645022B (zh) Job scheduling management system and method for multiple clusters
CN107291547B (zh) Task scheduling processing method, apparatus, and system
WO2020147330A1 (zh) Data stream processing method and system
CN111897638B (zh) Distributed task scheduling method and system
CN108960773B (zh) Service management method, computer device, and storage medium
WO2022007552A1 (zh) Processing node management method, configuration method, and related apparatus
CN110383764B (zh) System and method for processing events in a serverless system using historical data
CN107426274B (zh) Time-series-based method and system for service application, monitoring, analysis, and scheduling
CN109343939B (zh) Distributed cluster and parallel computing task scheduling method
WO2019037626A1 (zh) Distributed system resource allocation method, apparatus, and system
WO2014194869A1 (zh) Request processing method, apparatus, and system
JP2012521607A (ja) Monitoring of distributed applications
CN111338791A (zh) Scheduling method, apparatus, and device for cluster queue resources, and storage medium
TWI484346B (zh) Network connector optimization with reduced interruptions
CN109634730A (zh) Task scheduling method and apparatus, computer device, and storage medium
US10606650B2 (en) Methods and nodes for scheduling data processing
US7707080B2 (en) Resource usage metering of network services
CN109117244B (zh) Method for implementing a virtual machine resource application queuing mechanism
US9607275B2 (en) Method and system for integration of systems management with project and portfolio management
JP6279816B2 (ja) Storage monitoring system and monitoring method thereof
CN111913784A (zh) Task scheduling method and apparatus, network element, and storage medium
CN110955504A (zh) Method, server, system, and storage medium for intelligently allocating rendering tasks
WO2022222975A1 (zh) Load processing method, computing node, computing node cluster, and related devices

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 17854873; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 17854873; Country of ref document: EP; Kind code of ref document: A1)