WO2021057514A1 - Task scheduling method, device, computer equipment, and computer-readable medium - Google Patents
Task scheduling method, device, computer equipment, and computer-readable medium
- Publication number
- WO2021057514A1 (PCT/CN2020/114800)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- task
- execution node
- scheduling device
- node
- scheduling
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5083—Techniques for rebalancing the load in a distributed system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/254—Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogram communication
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogram communication
- G06F9/542—Event management; Broadcasting; Multicasting; Notifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogram communication
- G06F9/546—Message passing systems or structures, e.g. queues
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/54—Indexing scheme relating to G06F9/54
- G06F2209/548—Queue
Definitions
- the embodiments of the present application relate to the field of computer network technology, and in particular to a task scheduling method, device, computer equipment, and computer-readable medium.
- a data warehouse is a collection of subject-oriented, integrated, time-related, and unmodifiable data.
- ETL (Extract-Transform-Load) refers to the extraction, transformation, and loading of data.
- An ETL node transforms the data extracted from multiple different data sources and loads it into the data warehouses of multiple local nodes.
- The traditional ETL task scheduling scheme is to manually assign tasks to the ETL execution nodes when the specific ETL tasks are created. As a result, some execution nodes carry too heavy an ETL task load while others sit idle, so the load is unbalanced among the execution nodes.
- the execution node works normally when the ETL task is created, but if the execution node fails when the ETL task is started, the ETL task on the execution node cannot be executed on time, and there is a single point of failure problem.
- the embodiments of the present application provide a task scheduling method, device, computer equipment, and computer-readable medium in response to the above-mentioned shortcomings in related technologies.
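- An embodiment of the present application provides a task scheduling method, which is applied to a first scheduling device configured as a main scheduling device in a cluster, and the method includes:
- determining an execution node for executing a task according to the minimum resource requirement unit of the task in the task queue and the total amount of resources and resource usage reported by each execution node in the cluster, where the tasks in the task queue meet the corresponding task start conditions; and
- taking the task out of the task queue and distributing it to the execution node.
- Determining the execution node for executing the task according to the minimum resource requirement unit of the task in the task queue and the total amount of resources and resource usage reported by each execution node in the cluster includes:
- calculating, according to the minimum resource requirement unit of the task in the task queue and the total amount of resources and resource usage reported by each execution node in the cluster, the number of minimum resource requirement units of the task that each execution node in the cluster can execute; and
- determining the execution node with the largest number and using that execution node as the execution node for executing the task.
- Determining the execution node with the largest number and using that execution node as the execution node for executing the task includes:
- if at least two execution nodes have the largest number, selecting from them an execution node whose node type does not correspond to the task type of the task as the execution node for executing the task.
- After the task is taken out of the task queue and distributed to the execution node, the method further includes:
- obtaining the status of the task from the execution node according to a preset first cycle; and if it is determined from the status of the task that the task failed to start, putting the task into the task queue.
- The task scheduling method further includes one or any combination of the following steps:
- recording the mapping relationship between the task information of the task and the node address, and synchronizing the mapping relationship to a second scheduling device, which is currently configured as a backup scheduling device;
- synchronizing the task queue to the second scheduling device;
- synchronizing the resource usage reported by each execution node in the cluster to the second scheduling device.
- The method also includes: when the task ends, deleting the mapping relationship between the task information corresponding to the task and the node address and/or the task in the task queue, and synchronously updating the mapping relationship and/or the task queue stored in the second scheduling device.
- The task scheduling method further includes: receiving a broadcast message that includes the address of the second scheduling device, and configuring this device as a backup scheduling device.
- After this device is configured as a backup scheduling device, the method further includes: obtaining the system state information of the second scheduling device according to a preset second cycle, where the second scheduling device is currently configured as the main scheduling device; and if it is determined from the system state information that the second scheduling device is working abnormally, broadcasting the address of this device in the cluster and configuring this device as the main scheduling device.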
- an embodiment of the present application also provides a task scheduling device, which can be configured as a main scheduling device in a cluster, and includes a node determination module and a task scheduling module;
- the node determining module is configured to determine the execution node used to execute the task according to the minimum resource requirement unit of the task in the task queue and the total amount of resources and resource usage reported by each execution node in the cluster.
- the task scheduling module is configured to take the task out of the task queue and distribute it to the execution node.
- An embodiment of the present application further provides a computer device, including: one or more processors and a storage device, where one or more programs are stored on the storage device, and when the one or more programs are executed by the one or more processors, the one or more processors implement the task scheduling method provided in the foregoing embodiments.
- the embodiments of the present application also provide a computer-readable medium on which a computer program is stored, wherein the computer program implements the task scheduling method provided in the foregoing embodiments when the computer program is executed.
- The first scheduling device, configured as the main scheduling device, determines the execution node in the cluster used to execute the task according to the minimum resource requirement unit of the task in the task queue and the total amount of resources and resource usage reported by each execution node in the cluster, then takes the task from the task queue and distributes it to the determined execution node to start the task, where the tasks in the task queue meet the corresponding task start conditions.
- The embodiment of the application schedules a task only when its task start condition is met, and schedules it according to the total amount of resources and resource usage of each execution node. This not only balances the load among the execution nodes but also ensures that the execution node to which a task is assigned is currently working normally, which avoids a single point of failure and improves system reliability.
- FIG. 1 is a system architecture diagram provided by an embodiment of the application;
- FIG. 2 is a flowchart of a task scheduling method provided by an embodiment of the application;
- FIG. 3 is a flowchart of determining a node for executing a task according to an embodiment of the application;
- FIG. 4 is a schematic diagram of data synchronization between a first scheduling device and a second scheduling device provided by an embodiment of the application;
- FIG. 5 is a flowchart of switching between the main and backup scheduling devices provided by an embodiment of the application;
- FIG. 6 is a schematic structural diagram of a scheduling device provided by an embodiment of the application.
- An embodiment of the present application provides a task scheduling method.
- the task scheduling method is applied to an ETL system, and is specifically applied to a first scheduling device in the ETL system.
- The ETL system includes a first scheduling device, a second scheduling device, and multiple execution nodes for executing tasks. At any given moment only one scheduling device may be configured as the main scheduling device, and the main scheduling device schedules tasks for each execution node.
- The following description takes as an example the case where the first scheduling device is configured as the main scheduling device at the current moment.
- the task scheduling method of the embodiment of the present application will be described in detail below with reference to FIG. 2. As shown in Figure 2, the method includes the following steps:
- Step 11 Determine an execution node for executing the task according to the minimum resource requirement unit of the task in the task queue and the total amount of resources and resource usage reported by each execution node in the cluster.
- The task queue stores records of the tasks to be started. A task to be started is a task that meets its task start condition; that is, every task in the task queue meets its corresponding task start condition. When the task start condition of a task is met, the task is placed at the end of the task queue.
- The task start condition may include: a start event occurs that triggers the task (for example, the task is started manually), or the start time of the task is reached (for example, a scheduled start).
- execution nodes are assigned to each task in sequence according to the order of the task queue.
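The queueing rule above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `Task` fields and the helper names `start_condition_met` and `enqueue_ready` are assumptions introduced here.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    task_id: str
    start_at: Optional[float] = None  # scheduled start time in seconds; None if not timed
    triggered: bool = False           # True once a start event (e.g. manual trigger) occurred


def start_condition_met(task: Task, now: float) -> bool:
    """A task may start if a start event occurred or its start time has been reached."""
    timed = task.start_at is not None and now >= task.start_at
    return task.triggered or timed


def enqueue_ready(pending: list, queue: deque, now: float) -> list:
    """Append every ready task to the tail of the queue; return the tasks still waiting."""
    still_pending = []
    for task in pending:
        if start_condition_met(task, now):
            queue.append(task)  # tail of the queue, so dispatch order is FIFO
        else:
            still_pending.append(task)
    return still_pending
```

Because ready tasks are appended at the tail and dispatched from the head, execution nodes are assigned in the order in which tasks became ready.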
- Each execution node in the cluster reports its total resource amount and resource usage to the first scheduling device (ie, the main scheduling device) according to a preset cycle.
- the resources include but are not limited to: memory resources, CPU computing power, and disk space.
- the first scheduling device records the total amount of resources and resource usage reported by each execution node, and generates and maintains a node resource table (Resource Table).
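One way to picture the node resource table is a map from node address to the latest report, sketched below. The class and field names are hypothetical; the source only states that totals and usage are reported periodically and recorded in a Resource Table.

```python
from dataclasses import dataclass


@dataclass
class NodeResources:
    mem_total: float       # total memory
    mem_used_rate: float   # memory usage rate, 0.0 to 1.0
    cpu_total: float       # overall CPU computing power
    cpu_used_rate: float   # CPU usage rate, 0.0 to 1.0
    disk_total: float      # total disk space
    disk_used_rate: float  # disk usage rate, 0.0 to 1.0


class ResourceTable:
    """Keeps only the most recent report of each execution node (an assumption)."""

    def __init__(self):
        self._rows = {}

    def report(self, node_addr: str, resources: NodeResources) -> None:
        # A newer periodic report simply overwrites the previous row.
        self._rows[node_addr] = resources

    def snapshot(self) -> dict:
        # Return a copy so callers cannot mutate the table by accident.
        return dict(self._rows)
```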
- the first scheduling device determines the execution node for executing the task according to the task scheduling strategy, and its specific implementation will be described in detail later with reference to FIG. 3.
- Step 12 Take the task out of the task queue and distribute it to the execution nodes.
- the first task in the task queue is dequeued, and the task is distributed to the execution node determined in step 11 to start the task.
- In the task scheduling method provided by the embodiment of the present application, the first scheduling device, configured as the main scheduling device, determines the execution node in the cluster used to execute the task according to the minimum resource requirement unit of the task in the task queue and the total amount of resources and resource usage reported by each execution node in the cluster, then takes the task out of the task queue and distributes it to the determined execution node to start the task, where the tasks in the task queue meet the corresponding task start conditions.
- The embodiment of the application schedules a task only when its task start condition is met, and schedules it according to the total amount of resources and resource usage of each execution node. This not only balances the load among the execution nodes but also ensures that the execution node to which a task is assigned is currently working normally, which avoids a single point of failure and improves system reliability.
- Determining the execution node used to execute the task according to the minimum resource requirement unit of the task in the task queue and the total amount of resources and resource usage reported by each execution node in the cluster (ie, step 11) specifically includes the following steps:
- Step 111 According to the minimum resource requirement unit of the task in the task queue and the total resource amount and resource usage reported by each execution node in the cluster, respectively calculate the number of the minimum resource requirement unit that each execution node in the cluster can execute the task.
- The number of minimum resource requirement units of a task that each execution node in the cluster can execute can be calculated according to the following formula (1):
- $$N_{ij} = \min\left(\frac{M_i\,(1 - M_i')}{M''_j},\ \frac{C_i\,(1 - C_i')}{C''_j},\ \frac{D_i\,(1 - D_i')}{D''_j}\right) \tag{1}$$
- $i$ is the node identifier;
- $j$ is the task identifier;
- $N_{ij}$ is the number of minimum resource requirement units of task $j$ that execution node $i$ can execute;
- $M_i$ is the total amount of memory of execution node $i$;
- $M_i'$ is the memory usage rate of execution node $i$;
- $C_i$ is the overall CPU computing power of execution node $i$;
- $C_i'$ is the CPU usage rate of execution node $i$;
- $D_i$ is the total disk space of execution node $i$;
- $D_i'$ is the disk space usage rate of execution node $i$.
- The minimum resource requirement unit of task $j$ is $RU_j = (M''_j, C''_j, D''_j)$, where $M''_j$ is the minimum memory requirement of task $j$, $C''_j$ is the minimum CPU computing power requirement of task $j$, and $D''_j$ is the minimum disk space requirement of task $j$.
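Formula (1) can be written as a small function, sketched below. The function and parameter names are assumptions introduced for illustration; usage rates are taken as fractions between 0 and 1.

```python
def ru_count(mem_total: float, mem_used_rate: float,
             cpu_total: float, cpu_used_rate: float,
             disk_total: float, disk_used_rate: float,
             ru_mem: float, ru_cpu: float, ru_disk: float) -> float:
    """Number of minimum resource requirement units (RUs) a node can still host.

    Each term is free capacity divided by the per-RU requirement; the scarcest
    resource bounds the result, hence the min().
    """
    by_mem = mem_total * (1.0 - mem_used_rate) / ru_mem
    by_cpu = cpu_total * (1.0 - cpu_used_rate) / ru_cpu
    by_disk = disk_total * (1.0 - disk_used_rate) / ru_disk
    return min(by_mem, by_cpu, by_disk)
```

For instance, with made-up node figures (32G memory at 50% usage, CPU power 50 at 20% usage, 500G disk at 40% usage) and an RU of (4G, 5, 20G), the three terms are 4, 8, and 15, so memory is the bottleneck and the node can host 4 RUs.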
- Step 112 Determine the execution node with the largest number, and use the execution node as the execution node for executing the task.
- the execution node with the largest number of RUs is selected as the execution node for executing the task.
- the cluster contains three execution nodes: node 1, node 2 and node 3.
- the resource situation of the three nodes at the current moment is shown in Table 1.
- Table 1 is the node resource list (Resource Table).
- the first task in the task queue is a, and the minimum resource requirement unit RU of task a is (memory is 4G, CPU computing power is 5, and disk capacity is 20G).
- $N_{2a} = 7.5$ RU
- $N_{3a} = 4.4$ RU. Since the execution node with the largest number of RUs is node 1, node 1 is selected as the execution node to execute task a.
- If at least two execution nodes have the largest number of RUs, a node whose node type does not correspond to the task type of the task is selected from them as the execution node for executing the task.
- The task types can be divided into the following types: memory dependent (menDependence), CPU dependent (cpuDependence), or disk dependent (diskDependence).
- The node types are divided into the following types: memory shortage (menShortage), CPU shortage (cpuShortage), or disk shortage (diskShortage).
- The correspondence between node type and task type means that a memory-dependent task corresponds to a memory-scarce node, a CPU-dependent task corresponds to a CPU-scarce node, and a disk-dependent task corresponds to a disk-scarce node.
- For example, if the task is memory-dependent, a non-memory-scarce execution node (ie, a CPU-scarce or disk-scarce execution node) is selected from the execution nodes with the same RU number as the node used to perform the task.
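The selection and tie-breaking rule can be sketched as below. The type strings are the identifiers quoted in the text; the function name, the exact-equality test for ties, and the fallback when every tied node corresponds to the task type are assumptions made for illustration.

```python
# Task type -> node type it "corresponds" to (identifiers as given in the text).
CORRESPONDS = {
    "menDependence": "menShortage",
    "cpuDependence": "cpuShortage",
    "diskDependence": "diskShortage",
}


def pick_node(ru_counts: dict, node_types: dict, task_type: str) -> str:
    """Pick the node with the most RUs; on a tie, avoid the corresponding scarce type."""
    best = max(ru_counts.values())
    # Assumption: a tie means exactly equal RU counts.
    tied = [node for node, count in ru_counts.items() if count == best]
    if len(tied) == 1:
        return tied[0]
    scarce_type = CORRESPONDS[task_type]
    for node in tied:
        if node_types.get(node) != scarce_type:
            return node
    # Assumption: if every tied node is scarce in the needed resource, take the first.
    return tied[0]
```

With hypothetical inputs where two nodes tie and one of them is memory-scarce, a memory-dependent task goes to the other node.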
- After the task is taken out of the task queue and distributed to the execution node (ie, step 12), the method may further include the following step: obtain the status of the task from the execution node according to the preset first cycle; if it is determined from the status of the task that the task failed to start, put the task into the task queue.
- The status of the task includes: start status (success or failure), running status, stop status, and end status (successful or failed run). If the first scheduling device (that is, the main scheduling device) determines that the task failed to start, the task is put into the task queue again so that it can be restarted.
- After the main scheduling device distributes tasks to specific ETL execution nodes, it also monitors the running status of the tasks and re-enqueues the tasks that failed to start, to ensure that the tasks can be started.
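The monitoring step can be sketched as one polling pass, to be run once per first cycle. The `get_status` callable, the `"start_failed"` status string, and the function name are hypothetical stand-ins for whatever the execution nodes actually expose.

```python
from collections import deque
from typing import Callable


def requeue_failed_starts(queue: deque, dispatched: dict,
                          get_status: Callable[[str, str], str]) -> list:
    """Poll each dispatched task once and re-enqueue those that failed to start.

    dispatched maps task_id -> node address; get_status(node_addr, task_id)
    is a hypothetical callable returning a status string.
    """
    requeued = []
    for task_id in list(dispatched):  # copy keys: the dict is mutated below
        node_addr = dispatched[task_id]
        if get_status(node_addr, task_id) == "start_failed":
            del dispatched[task_id]   # the task is no longer bound to that node
            queue.append(task_id)     # back to the tail so it is scheduled again
            requeued.append(task_id)
    return requeued
```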
- the task scheduling method may further include one or any combination of the following steps:
- Record the mapping relationship between the task information of the task and the node address, and synchronize the mapping relationship to the second scheduling device, which is currently configured as a backup scheduling device.
- the task information may include the task identifier and the task status
- the mapping relationship between the task information and the node address may be stored in the form of a mapping table (Mapping Table).
- The first scheduling device (ie, the main scheduling device) can synchronize the task queue to the task array (Task Array) of the second scheduling device (ie, the backup scheduling device). Specifically, the first scheduling device can synchronize the task queue to the second scheduling device via HTTP requests.
- The first scheduling device (ie, the main scheduling device) synchronizes the node resource list (Resource Table) to the second scheduling device (ie, the backup scheduling device).
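- The task scheduling method also includes the following step: when the task ends, the first scheduling device (that is, the main scheduling device) deletes the mapping relationship between the task information corresponding to the task and the node address and/or the task in the task queue, and synchronously updates the mapping relationship and/or the task queue stored in the second scheduling device (that is, the backup scheduling device).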
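The bookkeeping around the mapping table and its synchronization can be sketched in-process as below. The text describes synchronization over HTTP requests; here a plain copy between two state objects stands in for that transport, and all class and function names are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class SchedulerState:
    mapping: dict = field(default_factory=dict)  # task_id -> (task status, node address)
    queue: list = field(default_factory=list)    # task ids waiting to be dispatched


def sync(main: SchedulerState, backup: SchedulerState) -> None:
    """Stand-in for the HTTP synchronization: copy the main state to the backup."""
    backup.mapping = dict(main.mapping)
    backup.queue = list(main.queue)


def record_dispatch(main: SchedulerState, backup: SchedulerState,
                    task_id: str, node_addr: str) -> None:
    """Record which node a task went to, then mirror the change to the backup."""
    main.mapping[task_id] = ("started", node_addr)
    sync(main, backup)


def on_task_end(main: SchedulerState, backup: SchedulerState, task_id: str) -> None:
    """Drop the finished task from the mapping table and queue, then mirror it."""
    main.mapping.pop(task_id, None)
    if task_id in main.queue:
        main.queue.remove(task_id)
    sync(main, backup)
```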
- Each scheduling device is provided with an ETL task information database. The first scheduling device and the second scheduling device can each store the task queue, the mapping relationship between the task information and the node address, and the resource usage reported by each execution node in the cluster in the ETL task information database according to a preset period, to achieve data persistence.
- the task scheduling method may further include the following steps: receiving a broadcast message, where the broadcast message includes the address of the second scheduling device, and configuring the device as a backup scheduling device.
- If the first scheduling device receives a broadcast message that includes the address of the second scheduling device, it means that the second scheduling device has determined that the first scheduling device is working abnormally, has configured itself as the main scheduling device, and has broadcast its own IP address in the cluster. Therefore, the first scheduling device configures itself as a backup scheduling device, that is, the first scheduling device switches from the main scheduling device to the backup scheduling device.
- the task scheduling method may further include the following steps:
- Step 51 Obtain system state information of the second scheduling device according to a preset second cycle.
- the second scheduling device is currently configured as the master scheduling device.
- The first scheduling device (configured as a backup scheduling device at this time) sends HTTP heartbeat information to the second scheduling device (configured as the main scheduling device at this time) every 5 s to inform the second scheduling device of its own system status and to obtain the system status of the second scheduling device.
- Step 52 If it is determined that the second scheduling device is working abnormally according to the system status information, broadcast the address of the device in the cluster, and configure the device as the main scheduling device.
- If the first scheduling device (configured as a backup scheduling device at this time) fails to obtain the system status of the second scheduling device (configured as the main scheduling device at this time) three consecutive times, the second scheduling device is considered to be down and its service unavailable. The first scheduling device then configures itself as the main scheduling device, thereby switching its identity to the main scheduling device, and broadcasts its own IP address in the cluster, so that each execution node subsequently reports its total resource amount and resource usage to that IP address.
- The first (main) scheduling device synchronizes the task queue, the mapping relationship between the task information of the task and the node address, and the resource usage reported by each node in the cluster to the second (backup) scheduling device via HTTP requests.
- When the main scheduling device fails, the second (backup) scheduling device immediately broadcasts its own IP address to each ETL execution node in the cluster according to the node information and takes over the duties of the main scheduling device, thereby achieving disaster recovery.
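The main/backup switching logic can be sketched as a small state machine with the network calls abstracted away. The threshold of three consecutive failures comes from the text; the class name, the `broadcast` callable, and the role strings are assumptions. The caller is expected to run the heartbeat (every 5 s in the example above) and feed the result in.

```python
from typing import Callable


class FailoverMonitor:
    """Role switching for one scheduling device, without any real networking."""

    MAX_FAILURES = 3  # consecutive failed heartbeats before taking over

    def __init__(self, own_addr: str, broadcast: Callable[[str], None], role: str = "backup"):
        self.own_addr = own_addr
        self.broadcast = broadcast  # hypothetical callable that announces an address
        self.role = role            # "main" or "backup"
        self.failures = 0

    def on_heartbeat_result(self, ok: bool) -> None:
        """Called by the backup after each attempt to read the main device's status."""
        if self.role != "backup":
            return
        if ok:
            self.failures = 0       # failures must be consecutive
            return
        self.failures += 1
        if self.failures >= self.MAX_FAILURES:
            self.role = "main"
            self.broadcast(self.own_addr)  # execution nodes report here from now on

    def on_broadcast(self, sender_addr: str) -> None:
        """Another device announced itself as main, so this device steps down."""
        if sender_addr != self.own_addr:
            self.role = "backup"
            self.failures = 0
```

Resetting the counter on every successful heartbeat keeps a single dropped request from triggering a takeover.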
- The distributed ETL tasks are uniformly scheduled according to the resource utilization of each execution node.
- Each execution node in the cluster reports its own resource usage, and the main scheduling device calculates and selects execution nodes with low resource occupation, distributes tasks to them, and monitors the task status during the running cycle of the task.
- The scheduling devices (management nodes) responsible for the unified scheduling and management of tasks are set up in a main/backup mode.
- While it is healthy and working, the main scheduling device is responsible for task scheduling and monitoring, saves the task scheduling and running information, and periodically synchronizes the information to the backup scheduling device. Once the main scheduling device goes down, the backup scheduling device immediately takes over as the main scheduling device to ensure the normal scheduling and running of ETL tasks.
- an embodiment of the present application also provides a scheduling device.
- the scheduling device is configured as a main scheduling device in a cluster, and includes a node determination module 61 and a task scheduling module 62.
- The node determination module 61 is configured to determine the execution node used to execute the task according to the minimum resource requirement unit of the task in the task queue and the total amount of resources and resource usage reported by each execution node in the cluster, where the tasks in the task queue meet the corresponding task start conditions.
- the task scheduling module 62 is configured to take out tasks from the task queue and distribute them to the execution nodes.
- The node determination module 61 is configured to calculate, according to the minimum resource requirement unit of the task in the task queue and the total amount of resources and resource usage reported by each execution node in the cluster, the number of minimum resource requirement units of the task that each execution node in the cluster can execute; and to determine the execution node with the largest number and use that execution node as the execution node for executing the task.
- the node determining module 61 is configured to select an execution node whose node type does not correspond to the task type of the task as the execution node for executing the task when there are at least two execution nodes with the largest number.
- The scheduling device further includes a task queue maintenance module, which is configured to: after the task scheduling module takes the task out of the task queue and distributes it to the execution node, obtain the status of the task from the execution node according to the preset first cycle; and when it is determined from the status of the task that the task failed to start, put the task into the task queue.
- The scheduling device further includes a data update and synchronization module.
- The data update and synchronization module is configured to perform one or any combination of the following steps: record the mapping relationship between the task information of the task and the node address, and synchronize the mapping relationship to the second scheduling device, which is currently configured as a backup scheduling device; synchronize the task queue to the second scheduling device; synchronize the resource usage reported by each execution node in the cluster to the second scheduling device; and, when the task ends, delete the mapping relationship between the task information corresponding to the task and the node address and/or the task in the task queue, and synchronously update the mapping relationship between the task information and the node address and/or the task queue stored in the second scheduling device.
- The scheduling device further includes a main/backup switching module, which is configured to configure this device as a backup scheduling device when a broadcast message is received, where the broadcast message includes the address of the second scheduling device.
- The main/backup switching module is further configured to obtain the system state information of the second scheduling device according to a preset second cycle after this device is configured as a backup scheduling device, where the second scheduling device is currently configured as the main scheduling device; and, when it is determined from the system state information that the second scheduling device is working abnormally, to broadcast the address of this device in the cluster and configure this device as the main scheduling device.
- An embodiment of the present application also provides a computer device, which includes: one or more processors and a storage device, where one or more programs are stored on the storage device, and when the one or more programs are executed by the one or more processors, the one or more processors implement the task scheduling method provided in the foregoing embodiments.
- the embodiments of the present application also provide a computer-readable medium on which a computer program is stored, wherein the computer program implements the task scheduling method provided in the foregoing embodiments when the computer program is executed.
- the functional modules/units in the device can be implemented as software, firmware, hardware, and appropriate combinations thereof.
- The division between functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, a physical component may have multiple functions, or a function or step may be performed cooperatively by several physical components.
- Some or all of the physical components can be implemented as software executed by a processor, such as a central processing unit, a digital signal processor, or a microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit.
- Such software may be distributed on a computer-readable medium
- the computer-readable medium may include a computer storage medium (or non-transitory medium) and a communication medium (or transitory medium).
- The term computer storage medium includes volatile and non-volatile media implemented in any method or technology for storing information (such as computer-readable instructions, data structures, program modules, or other data).
- Computer storage media include but are not limited to RAM, ROM, EEPROM, flash memory or other memory technologies, CD-ROM, digital versatile disk (DVD) or other optical disk storage, magnetic cassettes, magnetic tapes, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer.
- Communication media usually contain computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transmission mechanism, and may include any information delivery media.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Multimedia (AREA)
- Hardware Redundancy (AREA)
Claims (10)
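- A task scheduling method, applied to a first scheduling device, the first scheduling device being configured as the master scheduling device in a cluster, the method comprising: determining an execution node for executing a task according to the minimum resource requirement unit of the task in a task queue and the total amount of resources and the resource usage reported by each execution node in the cluster, wherein the tasks in the task queue satisfy their corresponding task start conditions; and taking the task out of the task queue and distributing it to the execution node.
- The method according to claim 1, wherein determining the execution node for executing the task according to the minimum resource requirement unit of the task in the task queue and the total amount of resources and the resource usage reported by each execution node in the cluster comprises: calculating, for each execution node in the cluster, according to the minimum resource requirement unit of the task in the task queue and the total amount of resources and the resource usage reported by each execution node in the cluster, the number of minimum resource requirement units of the task that the execution node is able to execute; and determining the execution node with the largest number, and using that execution node as the execution node for executing the task.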
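- The method according to claim 2, wherein determining the execution node with the largest number and using that execution node as the execution node for executing the task comprises: if at least two execution nodes have the largest number, selecting from among them the execution node whose node type corresponds to the task type of the task as the execution node for executing the task.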
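- The method according to claim 1, further comprising, after taking the task out of the task queue and distributing it to the execution node: obtaining the status of the task from the execution node according to a preset first period; and if it is determined from the status of the task that the task failed to start, putting the task back into the task queue.
- The method according to claim 1, further comprising one or any combination of the following steps: recording a mapping relationship between task information of the task and a node address, and synchronizing the mapping relationship to a second scheduling device, the second scheduling device being currently configured as a backup scheduling device; synchronizing the task queue to the second scheduling device; and synchronizing the resource usage reported by each execution node in the cluster to the second scheduling device; the method further comprising: when the task ends, deleting the mapping relationship between the task information and the node address corresponding to the task and/or the task in the task queue, and synchronously updating the mapping relationship between task information and node addresses and/or the task queue stored by the second scheduling device.
- The method according to any one of claims 1 to 5, further comprising: receiving a broadcast message, the broadcast message including the address of a second scheduling device; and configuring the present device as a backup scheduling device.
- The method according to claim 6, further comprising, after configuring the present device as a backup scheduling device: obtaining system status information of the second scheduling device according to a preset second period, wherein the second scheduling device is currently configured as the master scheduling device; and if it is determined from the system status information that the second scheduling device is working abnormally, broadcasting the address of the present device in the cluster and configuring the present device as the master scheduling device.
- A scheduling device, capable of being configured as the master scheduling device in a cluster, comprising a node determination module and a task scheduling module; the node determination module is configured to determine an execution node for executing a task according to the minimum resource requirement unit of the task in a task queue and the total amount of resources and the resource usage reported by each execution node in the cluster, wherein the tasks in the task queue satisfy their corresponding task start conditions; and the task scheduling module is configured to take the task out of the task queue and distribute it to the execution node.
- A computer device, comprising: one or more processors; and a storage device storing one or more programs; wherein, when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the task scheduling method according to any one of claims 1 to 7.
- A computer-readable medium storing a computer program, wherein the program, when executed, implements the task scheduling method according to any one of claims 1 to 7.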
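The periodic status check of claim 4 and the master/backup switch of claims 6 and 7 can be sketched in Python as follows. This is an illustrative sketch, not the applicant's implementation: the status string `"START_FAILED"`, the `get_status` and `broadcast` callbacks, and the `Scheduler` class are all hypothetical stand-ins for whatever transport and state model a real system would use, and the timers for the first and second periods are left to the caller.

```python
from collections import deque
from typing import Callable


def requeue_failed(queue: deque, dispatched: dict,
                   get_status: Callable[[str, str], str]) -> list:
    """One polling round: put tasks whose start failed back into the task queue.

    `dispatched` maps task name -> node address (the task/node mapping).
    """
    requeued = []
    for task, node_addr in list(dispatched.items()):
        if get_status(node_addr, task) == "START_FAILED":  # hypothetical status value
            del dispatched[task]
            queue.append(task)
            requeued.append(task)
    return requeued


class Scheduler:
    """Scheduling device that can act as master or backup."""

    def __init__(self, address: str, role: str = "master"):
        self.address = address
        self.role = role
        self.master_address = address

    def on_broadcast(self, master_address: str) -> None:
        """On receiving another device's address, configure this device as backup."""
        self.master_address = master_address
        self.role = "backup"

    def check_master(self, master_is_healthy: bool,
                     broadcast: Callable[[str], None]) -> None:
        """One health-check round: take over if the master works abnormally."""
        if self.role == "backup" and not master_is_healthy:
            broadcast(self.address)  # announce this device's address in the cluster
            self.role = "master"
            self.master_address = self.address
```

In use, a caller would invoke `requeue_failed` once per first period and `check_master` once per second period; keeping the timing outside these functions makes each round easy to test in isolation.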
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910905458.4 | 2019-09-24 | | |
CN201910905458.4A CN112631764A (zh) | 2019-09-24 | 2019-09-24 | Task scheduling method and apparatus, computer device, and computer-readable medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021057514A1 (zh) | 2021-04-01 |
Family
ID=75166410
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/114800 WO2021057514A1 (zh) | Task scheduling method and apparatus, computer device, and computer-readable medium | 2019-09-24 | 2020-09-11 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN112631764A (zh) |
WO (1) | WO2021057514A1 (zh) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
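CN113590278A (zh) * | 2021-07-05 | 2021-11-02 | 杭州智家通科技有限公司 | Method, apparatus, device, and storage medium for eliminating repeated task execution |
CN114116178B (zh) * | 2021-12-06 | 2024-10-22 | 深圳市和讯华谷信息技术有限公司 | Cluster framework task management method and related apparatus |
CN114416346B (zh) * | 2021-12-23 | 2023-03-24 | 广州市玄武无线科技股份有限公司 | Multi-node task scheduling method, apparatus, device, and storage medium |
CN114185688B (zh) * | 2022-02-14 | 2023-03-10 | 维塔科技(北京)有限公司 | Method for correcting physical resource occupancy state, scheduler, and readable storage medium |
CN114546623B (zh) * | 2022-03-01 | 2022-12-27 | 淮安市第二人民医院 | Task scheduling method and system based on a big data system |
CN117112180B (zh) * | 2023-09-27 | 2024-03-29 | 广州有机云计算有限责任公司 | Task-based automated cluster control method |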
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101141315A (zh) * | 2007-10-11 | 2008-03-12 | 上海交通大学 | Network resource scheduling simulation system |
CN103259829A (zh) * | 2012-03-05 | 2013-08-21 | 合肥华云通信技术有限公司 | Method for improving backup efficiency of a cloud computing scheduling system |
US20140289733A1 (en) * | 2013-03-22 | 2014-09-25 | Palo Alto Research Center Incorporated | System and method for efficient task scheduling in heterogeneous, distributed compute infrastructures via pervasive diagnosis |
US20170279734A1 (en) * | 2016-03-28 | 2017-09-28 | The Travelers Indemnity Company | Systems and methods for dynamically allocating computing tasks to computer resources in a distributed processing environment |
CN108762910A (zh) * | 2018-06-06 | 2018-11-06 | 亚信科技(中国)有限公司 | Distributed task scheduling method and system |
CN109408236A (zh) * | 2018-10-22 | 2019-03-01 | 福建南威软件有限公司 | Task load balancing method for ETL on a cluster |
- 2019-09-24: CN application CN201910905458.4A, published as CN112631764A (zh); status: active, Pending
- 2020-09-11: WO application PCT/CN2020/114800, published as WO2021057514A1 (zh); status: active, Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN112631764A (zh) | 2021-04-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
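WO2021057514A1 (zh) | Task scheduling method and apparatus, computer device, and computer-readable medium | |
WO2019154394A1 (zh) | Distributed database cluster system, data synchronization method, and storage medium | |
TWI755417B (zh) | Computing task allocation method, stream computing task execution method, control server, stream computing center server cluster, stream computing system, and geo-distributed multi-active system | |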
US11068499B2 (en) | Method, device, and system for peer-to-peer data replication and method, device, and system for master node switching | |
CN108132837B (zh) | Distributed cluster scheduling system and method | |
CA3168286A1 (en) | Data flow processing method and system | |
CN106817408B (zh) | Distributed server cluster scheduling method and apparatus | |
US20150066849A1 (en) | System and method for supporting parallel asynchronous synchronization between clusters in a distributed data grid | |
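CN111427670A (zh) | Task scheduling method and system | |
CN108322358B (zh) | Geo-distributed multi-active distributed message sending, processing, and consumption method and apparatus | |
WO2019020081A1 (zh) | Distributed system and fault recovery method, apparatus, product, and storage medium therefor | |
CN106034137A (zh) | Intelligent scheduling method for a distributed system, and distributed service system | |
WO2017181430A1 (zh) | Database replication method and apparatus for a distributed system | |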
US10802896B2 (en) | Rest gateway for messaging | |
EP3087483A1 (en) | System and method for supporting asynchronous invocation in a distributed data grid | |
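CN104484228A (zh) | Distributed parallel task processing system based on Intelli-DSC | |
CN113765690A (zh) | Cluster switching method, system, apparatus, terminal, server, and storage medium | |
WO2017050177A1 (zh) | Data synchronization method and apparatus | |
CN108154343B (zh) | Emergency handling method and system for an enterprise-level information system | |
CN103973811A (zh) | High-availability cluster management method supporting dynamic migration | |
CN114116178A (zh) | Cluster framework task management method and related apparatus | |
CN105095248B (zh) | Database cluster system, recovery method therefor, and management node | |
CN106844021B (zh) | Computing environment resource management system and management method therefor | |
CN107679817B (zh) | Workflow execution method and related device | |
CN117331751A (zh) | Multi-node backup system and method for a database | |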
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 20869385; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 20869385; Country of ref document: EP; Kind code of ref document: A1 |
 | 32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 200223) |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 20869385; Country of ref document: EP; Kind code of ref document: A1 |