WO2017162075A1 - Task scheduling method and device - Google Patents

Task scheduling method and device

Info

Publication number
WO2017162075A1
WO2017162075A1 PCT/CN2017/076709 CN2017076709W WO2017162075A1 WO 2017162075 A1 WO2017162075 A1 WO 2017162075A1 CN 2017076709 W CN2017076709 W CN 2017076709W WO 2017162075 A1 WO2017162075 A1 WO 2017162075A1
Authority
WO
WIPO (PCT)
Prior art keywords
task
cluster
network
default
scheduling
Prior art date
Application number
PCT/CN2017/076709
Other languages
French (fr)
Chinese (zh)
Inventor
何乐
黄俨
史英杰
张�杰
张辰
Original Assignee
阿里巴巴集团控股有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司
Publication of WO2017162075A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5038 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083 Techniques for rebalancing the load in a distributed system
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • the present invention relates to computer technology, and in particular, to a task scheduling method and apparatus.
  • In order to improve the stability of the system and the data processing and service capabilities of the network center, cluster technology is usually adopted.
  • clustering technology enables servers to be connected to each other to form a cluster. Multiple clusters are interconnected to form a distributed system. Each cluster in the distributed system runs a series of common applications.
  • Within a distributed system, an application is divided into multiple tasks. Each task is assigned a cluster to run on; the assigned cluster serves as the default cluster of the task, runs the task, and stores the task data required for the task to run.
  • In this arrangement, when the running capacity required by the tasks does not match the running capability of the clusters, the load across the clusters becomes unbalanced.
  • To improve efficiency, the distributed system can schedule computing tasks based on the load of each cluster, and the cluster to which a task is scheduled runs it.
  • In actual operation, however, the bandwidth usage between clusters often becomes too high.
  • the invention provides a task scheduling method and device for solving the situation that the bandwidth occupation between clusters is too high in the prior art.
  • a task scheduling method is provided to determine network resources between a default cluster of a task and an idle target cluster; the default cluster is a cluster that stores task data required for the task to run;
  • the task is scheduled according to the network resource.
  • a task scheduling apparatus including:
  • a determining module configured to determine a network resource between a default cluster of the task and an idle target cluster; the default cluster is a cluster storing task data required for the task to run;
  • a scheduling module configured to schedule the task according to the network resource.
  • With the task scheduling method and device provided by the embodiments of the present invention, the network resources between the default cluster of a task and an idle target cluster are determined, and the task is then scheduled according to the determined network resources.
  • The default cluster is a cluster that stores the task data required for the task to run.
  • When tasks are scheduled based on cluster load, excessive bandwidth usage arises mainly because a task scheduled to run on the target cluster still needs to read the task data required for running from the default cluster.
  • Therefore, scheduling the task to the target cluster only when the network resources between the target cluster and the default cluster are relatively good solves the problem of excessive bandwidth usage between clusters in the prior art.
  • FIG. 1 is a schematic flowchart of a task scheduling method according to Embodiment 1 of the present invention.
  • FIG. 2 is a schematic structural diagram of a network
  • FIG. 3 is a schematic flowchart of a task scheduling method according to Embodiment 2 of the present invention.
  • FIG. 4 is a schematic structural diagram of a task scheduling apparatus according to Embodiment 3 of the present invention.
  • FIG. 5 is a schematic structural diagram of another task scheduling apparatus according to Embodiment 3 of the present invention.
  • FIG. 1 is a schematic flowchart of a task scheduling method according to Embodiment 1 of the present invention.
  • the method provided in this embodiment may be performed by a task manager in a distributed system. As shown in FIG. 1 , the method includes:
  • Step 101 Determine network resources between a default cluster of the task and an idle target cluster.
  • the default cluster is a cluster that stores task data required for the task to run.
  • the network resource includes at least one of network bandwidth and network bandwidth time delay product.
  • a network model can be established, which is used to distinguish network structure relationships between different clusters.
  • The network structure relationship mentioned here may be same core switch, same region, or offsite.
  • Same core switch means that the two clusters belong to the same core switch.
  • Same region means that the two clusters belong to the same region.
  • Offsite means that the two clusters belong to different regions.
  • First, an idle target cluster can be determined based on load balancing.
  • If the default cluster and the target cluster are under the same core switch, the level of the network resource is determined to be the first level, e.g. superior; if they are in the same region, the level is determined to be the second level, e.g. average; if they are offsite, the level is determined to be the third level, e.g. poor.
  • In the established network model, the inter-cluster distance can also be used to represent the network structure relationship between clusters: the shorter the distance, the closer the relationship, and the longer the distance, the more distant the relationship.
  • For example, an inter-cluster distance of 2^0 means the clusters are under the same core switch; a distance of 2^1 means they are in the same region; a distance of 2^2 means they are offsite.
  • FIG. 2 is a schematic structural diagram of a network.
  • In the network structure shown in FIG. 2, cluster 1 and cluster 2 belong to the same core switch, while cluster 3 and cluster 4 belong to different switches.
  • Clusters 1, 2, 3 and 4 all belong to region 1.
  • Cluster 5 belongs to region 2 and is offsite relative to clusters 1 to 4.
  • The network model established with cluster 1 as the default cluster is therefore:
  • Cluster 1 and cluster 2 are under the same core switch, with a network distance of 1;
  • Cluster 1 and cluster 3 are in the same region, with a network distance of 2;
  • Cluster 1 and cluster 4 are in the same region, with a network distance of 2;
  • Cluster 1 and cluster 5 are offsite, with a network distance of 4.
  • The region mentioned here is not an administrative region but a region in the network.
  • Step 102 Schedule the task according to the determined network resource.
  • Specifically, in descending order of network resources, the task is preferentially scheduled to the target cluster with the most network resources between it and the default cluster.
  • When tasks are scheduled based on cluster load, excessive bandwidth usage arises mainly because a task scheduled to run on a target cluster still needs to read the task data required for running from the default cluster.
  • Therefore, the task is scheduled to the target cluster only when the network resources between the target cluster and the default cluster are good enough to meet the needs of the task, which solves the problem of excessive bandwidth usage between clusters in the prior art.
  • As one possible implementation, on top of preferentially scheduling the task to the target cluster with the most network resources, the network resources can also be divided into levels.
  • Before the task is scheduled, the level of the network resource of the target cluster that currently has the most network resources is determined. If the level is superior, the task is scheduled to that target cluster; if the level is average, whether to schedule the task to the target cluster is determined according to the network resources that scheduling the task would occupy; if the level is poor, the task is scheduled to the default cluster rather than to that target cluster, unless the task needs to read dependency data from the target cluster.
  • The dependency data is the running result data, generated by other tasks, that the task requires in order to run.
  • This avoids the excessive network resource usage that scheduling the task to the target cluster would otherwise cause.
  • FIG. 3 is a schematic flowchart of a task scheduling method according to Embodiment 2 of the present invention. As shown in FIG. 3, the method includes:
  • Step 201 Query whether the load of the default cluster of the task is idle. If it is idle, go to step 202. Otherwise, go to step 203.
  • First, query whether the default cluster of the task is idle. If it is idle, the task is run by the default cluster. This is because, no matter which cluster the task runs on, it needs to read the task data required for running from the default cluster.
  • Therefore, running the task on the default cluster effectively avoids the bandwidth consumed by reading the task data, and thus avoids excessive bandwidth usage.
  • Step 202 Schedule the task to the default cluster, and the process ends.
  • the task is scheduled to run on the default cluster.
  • Step 203 Determine whether, among the clusters corresponding to the service unit to which the task belongs, there is a target cluster under the same core switch as the default cluster. If yes, go to step 204. Otherwise, go to step 202.
  • the network model of the distributed system may be established in advance, and the cluster corresponding to each service unit is recorded in the network model, so that each service unit performs tasks in the service unit by using the corresponding clusters, thereby facilitating management of the service.
  • Network distance is also used to describe the network relationship between clusters.
  • A network distance of 2^0 is recorded for clusters under the same core switch, 2^1 for clusters in the same region, and 2^2 for clusters that are offsite.
  • cluster 1 and cluster 2 belong to the same service unit 1
  • cluster 3 and cluster 4 belong to service unit 2
  • cluster 5 belongs to service unit 3.
  • First, the clusters corresponding to the service unit to which the task belongs are determined; among these clusters, those at a network distance of 2^0 from the default cluster are queried first when scheduling the task.
  • That is, the target cluster is selected from the distributed system in order of network distance from near to far, which ensures that the task is preferentially scheduled to a target cluster with better network resources.
  • Step 204 Determine whether the target cluster of the same core switch is idle. If yes, go to step 205. Otherwise, go to step 206.
  • Step 205 Schedule the task to a target cluster of the same core switch.
  • Step 206 Determine whether there is a target cluster in the same region as the default cluster in the cluster corresponding to the service unit to which the task belongs. If yes, go to step 207. Otherwise, go to step 202.
  • Step 207 Determine whether the target cluster in the same area is idle. If yes, execute step 208; otherwise, perform step 202.
  • If no target cluster in the same region is idle, the task is scheduled to the default cluster even though the default cluster is also overloaded. This is because, although an offsite target cluster may exist, scheduling the task to an offsite target cluster would occupy more network bandwidth; the task is therefore scheduled to the default cluster, which occupies fewer network resources, to avoid the higher network bandwidth usage.
  • Step 208 Determine whether the network bandwidth condition between the target cluster and the default cluster in the same area can meet the network overhead of the task. If yes, go to step 209; otherwise, go to step 202.
  • Suppose the task accesses only one piece of task data across clusters. The length of time the task takes to access the task data can be obtained from historical data, where the length of time equals the difference between the end time and the start time.
  • The network overhead caused by the task is the ratio of the data volume of the task data to this length of time.
  • The bandwidth between clusters is a fixed value. If only this task is running during the period in which it accesses the task data, that is, between the start time and the end time, the network overhead of the task can be satisfied as long as it is less than the bandwidth.
  • Step 209 Schedule the task to the target cluster in the same area, and the process ends.
  • The task is preferentially scheduled to the target cluster with the most network bandwidth, that is, a target cluster under the same core switch; if the target cluster under the same core switch is overloaded, the task is scheduled to the target cluster with the second most network bandwidth, that is, a target cluster in the same region.
  • In this way, while load balancing is performed, the network bandwidth occupied by the task is minimized, which solves the problem of excessive bandwidth usage between clusters in the prior art.
  • FIG. 4 is a schematic structural diagram of a task scheduling apparatus according to Embodiment 3 of the present invention. As shown in FIG. 4, the apparatus includes: a determining module 31 and a scheduling module 32.
  • the determining module 31 is configured to determine network resources between the default cluster of the task and the idle target cluster.
  • the default cluster is a cluster that stores task data required for the task to run.
  • the network resource includes at least one of network bandwidth and network bandwidth time delay product.
  • the scheduling module 32 is configured to schedule tasks according to network resources between the default cluster and the target cluster.
  • the scheduling module 32 is specifically configured to schedule the task to the target cluster with the most network resources.
  • FIG. 5 is a schematic structural diagram of another task scheduling apparatus according to Embodiment 3 of the present invention.
  • the determining module 31 includes: a relationship determining unit 311 and a resource determining unit 312.
  • the relationship determining unit 311 is configured to determine a network structure relationship between the default cluster and the target cluster.
  • the network structure relationship includes same core switch, same region, and offsite.
  • the resource determining unit 312 is configured to determine the network resource according to the network structure relationship.
  • the resource determining unit 312 is specifically configured to: if the default cluster and the target cluster are under the same core switch, determine that the level of the network resource is a first level; if the default cluster and the target cluster are in the same region, determine that the level of the network resource is a second level; if the default cluster and the target cluster are offsite, determine that the level of the network resource is a third level.
  • the scheduling module 32 includes: a first scheduling unit 321, a second scheduling unit 322, and a third scheduling unit 323.
  • the first scheduling unit 321 is configured to schedule the task to the target cluster if the level of the network resource between the default cluster and the target cluster is a first level.
  • the second scheduling unit 322 is configured to: if the level of the network resource between the default cluster and the target cluster is a second level, determine, according to the network resources that scheduling the task would occupy, whether to schedule the task to the default cluster or to the target cluster.
  • the second scheduling unit 322 is specifically configured to: obtain, from historical data, the length of time the task takes for a single read of the task data; calculate the ratio of the data amount of the task data to the length of time to obtain the network overhead of the task; if the network overhead of the task is less than the network bandwidth between the default cluster and the target cluster, schedule the task to the target cluster; if the network overhead of the task is not less than the network bandwidth between the default cluster and the target cluster, schedule the task to the default cluster.
  • the third scheduling unit 323 is configured to schedule the task to the default cluster if the level of the network resource between the default cluster and the target cluster is a third level.
  • the task scheduling device further includes:
  • the load balancing module 33 is configured to determine the target cluster based on a load balancing manner if the default cluster is in an overload state.
  • In this embodiment, after the network resources between the default cluster of the task and the idle target cluster are determined, the task is scheduled according to the determined network resources.
  • The default cluster is a cluster that stores the task data required for the task to run.
  • When tasks are scheduled based on cluster load, excessive bandwidth usage arises mainly because a task scheduled to run on the target cluster still needs to read the task data required for running from the default cluster.
  • Therefore, scheduling the task to the target cluster only when the network resources between the target cluster and the default cluster are relatively good solves the problem of excessive bandwidth usage between clusters in the prior art.
  • the aforementioned program can be stored in a computer readable storage medium.
  • the program when executed, performs the steps including the foregoing method embodiments; and the foregoing storage medium includes various media that can store program codes, such as a ROM, a RAM, a magnetic disk, or an optical disk.
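
The Embodiment 2 flow (steps 201 to 209), together with the network overhead check of step 208, could be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the `Cluster` class, the function names, the units, and the handling of several candidate clusters per tier are all hypothetical. Distances follow the 2^n convention (1 = same core switch, 2 = same region, 4 = offsite).

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    """A candidate target cluster in the task's service unit (hypothetical type)."""
    name: str
    idle: bool
    distance: int           # 1 = same core switch, 2 = same region, 4 = offsite
    bandwidth: float = 0.0  # bandwidth to the default cluster (assumed unit: MB/s)


def network_overhead(data_volume: float, start: float, end: float) -> float:
    """Network overhead = data volume of the task data / (end time - start time)."""
    return data_volume / (end - start)


def schedule(default_idle: bool, unit_clusters: list, overhead: float) -> str:
    """Return the name of the cluster chosen by steps 201 to 209."""
    if default_idle:                                    # step 201
        return "default"                                # step 202
    same_switch = [c for c in unit_clusters if c.distance == 1]
    if not same_switch:                                 # step 203: none exists
        return "default"                                # step 202
    for c in same_switch:                               # step 204
        if c.idle:
            return c.name                               # step 205
    same_region = [c for c in unit_clusters if c.distance == 2]
    for c in same_region:                               # steps 206 and 207
        if c.idle and overhead < c.bandwidth:           # step 208
            return c.name                               # step 209
    return "default"                                    # step 202


# Hypothetical example: the same-switch cluster is busy, the same-region one is idle.
clusters = [
    Cluster("A", idle=False, distance=1),
    Cluster("B", idle=True, distance=2, bandwidth=100.0),
]
overhead = network_overhead(data_volume=600.0, start=0.0, end=10.0)
print(schedule(False, clusters, overhead))
```

Offsite clusters are never selected in this sketch, which mirrors the text: when no suitable same-switch or same-region target exists, the task falls back to the default cluster.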

Abstract

Provided are a task scheduling method and device. Network resources between a default cluster of a task and an idle target cluster are determined, and the task is scheduled according to the determined network resources. The default cluster is a cluster storing task data required for the task to run. When a task is scheduled on the basis of a cluster load situation, excessive bandwidth occupation is mainly caused by the task being scheduled to a target cluster to run but still needing to read task data required for running from a default cluster. Therefore, using a means wherein a task is scheduled to a target cluster only when the network resources situation between the target cluster and a default cluster is relatively good solves inter-cluster excessive bandwidth occupation in the prior art.

Description

Task scheduling method and device
This application claims priority to Chinese Patent Application No. 201610180450.2, filed on March 25, 2016 and entitled "Task scheduling method and device", the entire contents of which are incorporated herein by reference.
Technical field
The present invention relates to computer technology, and in particular to a task scheduling method and device.
Background
To improve system stability and the data processing and service capabilities of a network center, cluster technology is usually adopted. Cluster technology allows servers to be connected to one another to form a cluster, and multiple clusters to be interconnected to form a distributed system; the clusters within the distributed system run a series of common applications.
Within a distributed system, an application is divided into multiple tasks. Each task is assigned a cluster to run on; the assigned cluster serves as the default cluster of the task, runs the task, and stores the task data required for the task to run. In this arrangement, when the running capacity required by the tasks does not match the running capability of the clusters, the load across the clusters becomes unbalanced.
To improve the operating efficiency of each cluster and thereby maximize the operating efficiency of the distributed system, the distributed system can schedule computing tasks based on the load of each cluster, and the cluster to which a task is scheduled runs it. In actual operation, however, the bandwidth usage between clusters often becomes too high.
Summary of the invention
The present invention provides a task scheduling method and device for solving the problem in the prior art that bandwidth usage between clusters is too high.
To achieve the above object, embodiments of the present invention adopt the following technical solutions:
In a first aspect, a task scheduling method is provided, including: determining network resources between a default cluster of a task and an idle target cluster, where the default cluster is a cluster that stores task data required for the task to run; and
scheduling the task according to the network resources.
In a second aspect, a task scheduling device is provided, including:
a determining module, configured to determine network resources between a default cluster of a task and an idle target cluster, where the default cluster is a cluster that stores task data required for the task to run; and
a scheduling module, configured to schedule the task according to the network resources.
With the task scheduling method and device provided by the embodiments of the present invention, the network resources between the default cluster of a task and an idle target cluster are determined, and the task is then scheduled according to the determined network resources. The default cluster is a cluster that stores the task data required for the task to run. When tasks are scheduled based on cluster load, excessive bandwidth usage arises mainly because a task scheduled to run on the target cluster still needs to read the task data required for running from the default cluster. Therefore, scheduling the task to the target cluster only when the network resources between the target cluster and the default cluster are relatively good solves the problem of excessive bandwidth usage between clusters in the prior art.
The above description is only an overview of the technical solutions of the present invention. So that the technical means of the present invention can be understood more clearly and implemented in accordance with the contents of the specification, and so that the above and other objects, features and advantages of the present invention are more readily apparent, specific embodiments of the invention are set forth below.
Brief description of the drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for the purpose of illustrating the preferred embodiments and are not to be construed as limiting the invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
FIG. 1 is a schematic flowchart of a task scheduling method according to Embodiment 1 of the present invention;
FIG. 2 is a schematic structural diagram of a network;
FIG. 3 is a schematic flowchart of a task scheduling method according to Embodiment 2 of the present invention;
FIG. 4 is a schematic structural diagram of a task scheduling device according to Embodiment 3 of the present invention;
FIG. 5 is a schematic structural diagram of another task scheduling device according to Embodiment 3 of the present invention.
Detailed description
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the present disclosure, it should be understood that the present disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the present disclosure can be understood more thoroughly and its scope can be fully conveyed to those skilled in the art.
The task scheduling method and device provided by the embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Embodiment 1
FIG. 1 is a schematic flowchart of a task scheduling method according to Embodiment 1 of the present invention. The method provided in this embodiment may be performed by a task manager in a distributed system. As shown in FIG. 1, the method includes:
Step 101: Determine the network resources between the default cluster of the task and an idle target cluster.
The default cluster is a cluster that stores the task data required for the task to run. The network resources include at least one of network bandwidth and the network bandwidth-delay product.
Specifically, a network model can be established to distinguish the network structure relationships between different clusters. The network structure relationships here may include same core switch, same region, and offsite. Same core switch means that the two clusters belong to the same core switch; same region means that the two clusters belong to the same region; offsite means that the two clusters belong to different regions. First, an idle target cluster can be determined based on load balancing. If the default cluster and the target cluster are under the same core switch, the level of the network resource is determined to be the first level, e.g. superior; if they are in the same region, the level is determined to be the second level, e.g. average; if they are offsite, the level is determined to be the third level, e.g. poor.
Further, in the established network model, the inter-cluster distance can also be used to represent the network structure relationship between clusters: the shorter the distance, the closer the relationship, and the longer the distance, the more distant the relationship. For example, an inter-cluster distance of 2^0 means the clusters are under the same core switch; a distance of 2^1 means they are in the same region; a distance of 2^2 means they are offsite.
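
The mapping from network structure relationship to resource level and inter-cluster distance can be sketched as below. This is a minimal illustration, assuming distances of 2^0, 2^1 and 2^2 for the three relationships; the names `Relationship`, `resource_level` and `network_distance` are hypothetical and do not come from the patent.

```python
from enum import Enum


class Relationship(Enum):
    """Network structure relationship between two clusters (hypothetical names)."""
    SAME_CORE_SWITCH = 0
    SAME_REGION = 1
    OFFSITE = 2


def resource_level(rel: Relationship) -> int:
    """Return the network resource level: 1 (superior), 2 (average), 3 (poor)."""
    return rel.value + 1


def network_distance(rel: Relationship) -> int:
    """Return the inter-cluster distance 2**n, with n = 0, 1, 2."""
    return 2 ** rel.value


for rel in Relationship:
    print(rel.name, resource_level(rel), network_distance(rel))
```

Storing the exponent n as the enum value keeps the level and the distance derived from a single source, so the two cannot drift apart.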
例如:图2为一种网络的结构示意图,针对如图2所示的网络结构,集群1和集群2同属于一个核心交换机,集群3和集群4分属于不同交换机,同时,集群1、集群2、集群3和集群4同属于地域1,另外,集群5属于地域2,与集群1-4为异地。For example, FIG. 2 is a schematic structural diagram of a network. For the network structure shown in FIG. 2, cluster 1 and cluster 2 belong to one core switch, and cluster 3 and cluster 4 belong to different switches, and cluster 1 and cluster 2 The cluster 3 and the cluster 4 belong to the area 1, and the cluster 5 belongs to the area 2 and is different from the cluster 1-4.
Therefore, the network model established with cluster 1 as the default cluster is:
Cluster 1 and cluster 2 share a core switch; the network distance is 1.
Cluster 1 and cluster 3 are in the same region; the network distance is 2.
Cluster 1 and cluster 4 are in the same region; the network distance is 2.
Cluster 1 and cluster 5 are cross-region; the network distance is 4.
Note that a region here is not an administrative region but a region in the network. The network distance can be calculated as 2^n, where n = 0 for same core switch, n = 1 for same region, and n = 2 for cross-region.
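The network model above can be illustrated with a short Python sketch. It is only a minimal illustration under stated assumptions: the `Cluster` fields, the switch and region identifiers, and the function names are hypothetical and do not come from the patent; only the three relationships and the 2^n distance rule do.

```python
from dataclasses import dataclass
from enum import Enum


class Relation(Enum):
    """Network structure relationship; the value is the exponent n in 2^n."""
    SAME_CORE_SWITCH = 0
    SAME_REGION = 1
    CROSS_REGION = 2


@dataclass(frozen=True)
class Cluster:
    name: str
    core_switch: str  # hypothetical core switch identifier
    region: str       # network region, not an administrative region


def relation(a: Cluster, b: Cluster) -> Relation:
    """Classify the relationship between two clusters."""
    if a.region != b.region:
        return Relation.CROSS_REGION
    if a.core_switch == b.core_switch:
        return Relation.SAME_CORE_SWITCH
    return Relation.SAME_REGION


def network_distance(a: Cluster, b: Cluster) -> int:
    """Network distance 2^n, with n given by the relationship."""
    return 2 ** relation(a, b).value


# Topology of FIG. 2 as described in the text (identifiers are made up).
c1 = Cluster("cluster1", "sw-a", "region1")
c2 = Cluster("cluster2", "sw-a", "region1")
c3 = Cluster("cluster3", "sw-b", "region1")
c4 = Cluster("cluster4", "sw-c", "region1")
c5 = Cluster("cluster5", "sw-d", "region2")

for other in (c2, c3, c4, c5):
    print(other.name, relation(c1, other).name, network_distance(c1, other))
```

With cluster 1 as the default cluster, the loop reproduces the distances listed for FIG. 2: 1, 2, 2 and 4.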
Step 102: Schedule the task according to the determined network resources.
Specifically, in descending order of network resources, the task is preferentially scheduled to the target cluster that has the most network resources between itself and the default cluster.
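As a minimal sketch of this selection rule, assume the network resources between the default cluster and each idle target cluster are already known as a single number per cluster (for example, available bandwidth). The function name and the dictionary layout are hypothetical.

```python
from typing import Dict, Optional


def pick_target(resources_to_default: Dict[str, float]) -> Optional[str]:
    """Return the idle target cluster with the most network resources
    between itself and the default cluster, or None if there is none."""
    if not resources_to_default:
        return None
    return max(resources_to_default, key=resources_to_default.get)


# Hypothetical bandwidth figures for three idle target clusters.
print(pick_target({"cluster2": 10.0, "cluster3": 4.0, "cluster5": 1.0}))
```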
When tasks are scheduled based on cluster load alone, excessive bandwidth usage arises mainly because a task, although scheduled to run on a target cluster, still has to read the task data it needs from the default cluster. Therefore, the task is scheduled to the target cluster only when the network resources between the target cluster and the default cluster are good enough to meet the task's needs, which resolves the excessive inter-cluster bandwidth usage of the prior art.
As a possible implementation, in addition to preferentially scheduling the task to the target cluster with the most network resources between itself and the default cluster, the network resources may also be divided into levels. Before the task is scheduled, the level of the network resources of the target cluster that currently has the most network resources is determined. If that level is excellent, the task is scheduled to that target cluster. If the level is average, whether to schedule the task to the target cluster is decided according to the network resources that scheduling the task would occupy. If the level is poor, the task is scheduled to the default cluster rather than to that target cluster, unless the task needs to read dependency data from that target cluster.
Here, dependency data is the result data produced by other tasks that this task needs in order to run.
In this way, when the target cluster that currently has the most network resources still cannot provide the network resources the task needs, for example when that target cluster and the default cluster are cross-region, the excessive network resource usage that scheduling the task to that target cluster would cause is avoided.
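The level-based decision described above can be sketched as follows. This is an illustration only: the `overhead_fits` flag stands in for the outcome of the "network resources occupied by the task" check, whose details are left open here, and all names are hypothetical.

```python
from enum import Enum


class Level(Enum):
    EXCELLENT = 1  # first level: same core switch
    AVERAGE = 2    # second level: same region
    POOR = 3       # third level: cross-region


def choose_cluster(level: Level, default: str, target: str,
                   overhead_fits: bool,
                   reads_dependency_from_target: bool = False) -> str:
    """Pick where to run the task, given the level of the best target cluster."""
    if level is Level.EXCELLENT:
        return target
    if level is Level.AVERAGE:
        # Assumed: the caller has already checked the task's network usage.
        return target if overhead_fits else default
    # POOR: stay on the default cluster unless the task must read
    # dependency data that lives on the target cluster.
    return target if reads_dependency_from_target else default


print(choose_cluster(Level.POOR, "cluster1", "cluster5", overhead_fits=True))
```

The dependency-data exception is kept as an explicit parameter so that the default behaviour for a poor level is always the default cluster.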
Embodiment 2
FIG. 3 is a schematic flowchart of a task scheduling method provided by Embodiment 2 of the present invention. As shown in FIG. 3, the method includes:
Step 201: Query whether the default cluster of the task is idle. If it is idle, perform step 202; otherwise, perform step 203.
Specifically, after the task to be scheduled is obtained, first query whether the default cluster of the task is idle; if so, the default cluster runs the task. This is because, whichever cluster the task runs on, it has to read the task data it needs from the default cluster. Running the task on the default cluster therefore avoids the bandwidth consumed by reading the task data and prevents excessive bandwidth usage.
Step 202: Schedule the task to the default cluster; the process ends.
Specifically, the task is scheduled to the default cluster, where it queues to run.
Step 203: Determine whether, among the clusters corresponding to the business unit to which the task belongs, there is a target cluster that shares a core switch with the default cluster. If so, perform step 204; otherwise, perform step 202.
Specifically, a network model of the distributed system may be established in advance. The network model records the clusters corresponding to each business unit, so that each business unit uses its own clusters to execute the tasks within that business unit, which makes the business easier to manage. The network model also uses network distance to describe the network relationship between clusters: clusters that share a core switch have a network distance of 2^0, clusters in the same region have a network distance of 2^1, and cross-region clusters have a network distance of 2^2. As shown in FIG. 2, cluster 1 and cluster 2 belong to business unit 1, cluster 3 and cluster 4 belong to business unit 2, and cluster 5 belongs to business unit 3.
Based on this pre-established network model, this step queries which clusters correspond to the business unit to which the task belongs, and then first looks among those clusters for a cluster whose network distance from the default cluster is 2^0, in order to schedule the task.
Thus, using the pre-established network model, target clusters are selected from the distributed system in order of increasing network distance, which ensures that the task is preferentially scheduled to a target cluster with better network resources.
Step 204: Determine whether the same-core-switch target cluster is idle. If so, perform step 205; otherwise, perform step 206.
Step 205: Schedule the task to the same-core-switch target cluster.
Step 206: Determine whether, among the clusters corresponding to the business unit to which the task belongs, there is a target cluster in the same region as the default cluster. If so, perform step 207; otherwise, perform step 202.
Specifically, based on the pre-established network model, query for a cluster whose network distance from the default cluster is 2.
Step 207: Determine whether the same-region target cluster is idle. If so, perform step 208; otherwise, perform step 202.
If the same-region target clusters are all overloaded, the task is scheduled to the default cluster, which is also overloaded. This is because, although a cross-region target cluster may exist, scheduling the task to it would consume considerable network bandwidth; the task therefore needs to be scheduled to the default cluster, which consumes few network resources, to avoid heavy network bandwidth usage.
Step 208: Determine whether the network bandwidth between the same-region target cluster and the default cluster can meet the network overhead of the task. If so, perform step 209; otherwise, perform step 202.
Specifically, assume that the task accesses only one copy of task data across clusters. The duration of a single access to the task data can be obtained from historical data, where the duration equals the difference between the end time and the start time. Assuming that the task reads data at a constant rate during this period, the network overhead caused by the task is the ratio of the amount of task data to the duration. The bandwidth between clusters is a fixed value. If this task is the only one running while it accesses the task data, that is, between the start time and the end time, then the bandwidth can meet the task's network overhead as long as the network overhead is less than the bandwidth.
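The overhead estimate can be written out as a small calculation. The sketch below assumes byte and second units and hypothetical function names; the formula itself (data amount divided by access duration, compared against a fixed inter-cluster bandwidth) follows the paragraph above.

```python
def network_overhead(data_bytes: float, start_s: float, end_s: float) -> float:
    """Network overhead = amount of task data / duration of one access."""
    duration = end_s - start_s
    if duration <= 0:
        raise ValueError("end time must be later than start time")
    return data_bytes / duration


def bandwidth_suffices(data_bytes: float, start_s: float, end_s: float,
                       bandwidth_bytes_per_s: float) -> bool:
    """True if the overhead is strictly less than the inter-cluster bandwidth."""
    return network_overhead(data_bytes, start_s, end_s) < bandwidth_bytes_per_s


# 600 bytes read between t=10 s and t=40 s gives 20 bytes/s of overhead.
print(network_overhead(600.0, 10.0, 40.0))
print(bandwidth_suffices(600.0, 10.0, 40.0, bandwidth_bytes_per_s=25.0))
```

The strict comparison mirrors the text: an overhead equal to the bandwidth does not count as satisfied.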
Step 209: Schedule the task to the same-region target cluster; the process ends.
In descending order of network bandwidth, the task is preferentially scheduled to the target cluster with the most network bandwidth, namely the same-core-switch target cluster; if the same-core-switch clusters are all overloaded, the task is then scheduled to the target cluster with the next most network bandwidth, namely the same-region target cluster. Load balancing is thus achieved while the task's network bandwidth usage is minimized, which resolves the excessive inter-cluster bandwidth usage of the prior art.
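Steps 201 to 209 can be sketched as one function. This is a hedged illustration, not the patented implementation: the `Cluster` fields, the `bandwidth_to_default` lookup and the handling of several candidate clusters per category are assumptions, and the branch targets follow the step descriptions literally (for example, step 203 falls back to the default cluster when no same-core-switch cluster exists).

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Cluster:
    name: str
    core_switch: str  # hypothetical identifier
    region: str       # network region
    idle: bool


def schedule(default: Cluster, unit_clusters: List[Cluster],
             overhead: float, bandwidth_to_default: Dict[str, float]) -> Cluster:
    """Return the cluster the task is scheduled to.

    unit_clusters: clusters of the business unit the task belongs to.
    overhead: the task's network overhead (data amount / access duration).
    bandwidth_to_default: bandwidth between each cluster and the default cluster.
    """
    others = [c for c in unit_clusters if c.name != default.name]

    # Steps 201-202: an idle default cluster runs the task itself.
    if default.idle:
        return default

    # Step 203: look for target clusters on the same core switch.
    same_switch = [c for c in others
                   if c.region == default.region
                   and c.core_switch == default.core_switch]
    if not same_switch:
        return default  # "otherwise" branch of step 203 goes to step 202

    # Steps 204-205: use an idle same-core-switch cluster if there is one.
    for c in same_switch:
        if c.idle:
            return c

    # Step 206: look for target clusters in the same region.
    same_region = [c for c in others
                   if c.region == default.region
                   and c.core_switch != default.core_switch]
    if not same_region:
        return default

    # Step 207: they must be idle.
    idle_same_region = [c for c in same_region if c.idle]
    if not idle_same_region:
        return default

    # Steps 208-209: the bandwidth must cover the task's network overhead.
    for c in idle_same_region:
        if overhead < bandwidth_to_default[c.name]:
            return c
    return default
```

Cross-region clusters are never chosen in this flow, which matches the explanation given for step 207.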
Embodiment 3
FIG. 4 is a schematic structural diagram of a task scheduling device provided by Embodiment 3 of the present invention. As shown in FIG. 4, the device includes a determining module 31 and a scheduling module 32.
The determining module 31 is configured to determine the network resources between the default cluster of a task and an idle target cluster.
The default cluster is the cluster that stores the task data required for the task to run, and the network resources include at least one of network bandwidth and network bandwidth-delay product.
The scheduling module 32 is configured to schedule the task according to the network resources between the default cluster and the target cluster.
Specifically, the scheduling module 32 is configured to schedule the task to the target cluster with the most network resources.
Further, FIG. 5 is a schematic structural diagram of another task scheduling device provided by Embodiment 3 of the present invention. As shown in FIG. 5, on the basis of the task scheduling device of FIG. 4, the determining module 31 includes a relationship determining unit 311 and a resource determining unit 312.
The relationship determining unit 311 is configured to determine the network structure relationship between the default cluster and the target cluster.
The network structure relationship includes same core switch, same region, and cross-region.
The resource determining unit 312 is configured to determine the network resources according to the network structure relationship.
Specifically, the resource determining unit 312 is configured to: determine that the level of the network resources is the first level if the default cluster and the target cluster share a core switch; determine that the level is the second level if the default cluster and the target cluster are in the same region; and determine that the level is the third level if the default cluster and the target cluster are cross-region.
Further, the scheduling module 32 includes a first scheduling unit 321, a second scheduling unit 322, and a third scheduling unit 323.
The first scheduling unit 321 is configured to schedule the task to the target cluster if the level of the network resources between the default cluster and the target cluster is the first level.
The second scheduling unit 322 is configured to, if the level of the network resources between the default cluster and the target cluster is the second level, decide whether to schedule the task to the default cluster or to the target cluster according to the network resources that scheduling the task would occupy.
If the network resource is network bandwidth, the second scheduling unit 322 is specifically configured to: obtain from history records the duration of a single read of the task data by the task; calculate the ratio of the amount of task data to that duration to obtain the network overhead of the task; schedule the task to the target cluster if the network overhead of the task is less than the network bandwidth between the default cluster and the target cluster; and schedule the task to the default cluster if the network overhead of the task is not less than that network bandwidth.
The third scheduling unit 323 is configured to schedule the task to the default cluster if the level of the network resources between the default cluster and the target cluster is the third level.
Further, the task scheduling device also includes:
A load balancing module 33, configured to determine the target cluster based on load balancing if the default cluster is overloaded.
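The module structure of this embodiment can be mirrored in code as three small classes. The sketch is illustrative only: the `load` metric, the 0.8 overload threshold and the "least loaded" selection rule are assumptions, since the description only states that the target cluster is chosen by load balancing.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cluster:
    name: str
    core_switch: str
    region: str
    load: float  # fraction of capacity in use, 0.0 to 1.0 (assumed metric)


class LoadBalancingModule:
    """Module 33: picks a target cluster when the default cluster is overloaded."""

    def __init__(self, overload_threshold: float = 0.8):  # assumed threshold
        self.overload_threshold = overload_threshold

    def overloaded(self, c: Cluster) -> bool:
        return c.load >= self.overload_threshold

    def pick_target(self, default: Cluster,
                    candidates: List[Cluster]) -> Optional[Cluster]:
        if not self.overloaded(default):
            return None  # the default cluster can run the task itself
        idle = [c for c in candidates if not self.overloaded(c)]
        return min(idle, key=lambda c: c.load) if idle else None


class DeterminingModule:
    """Module 31: relationship unit 311 plus resource unit 312."""

    def relation(self, default: Cluster, target: Cluster) -> str:  # unit 311
        if default.region != target.region:
            return "cross_region"
        if default.core_switch == target.core_switch:
            return "same_core_switch"
        return "same_region"

    def level(self, default: Cluster, target: Cluster) -> int:  # unit 312
        levels = {"same_core_switch": 1, "same_region": 2, "cross_region": 3}
        return levels[self.relation(default, target)]


class SchedulingModule:
    """Module 32: scheduling units 321, 322 and 323 as three branches."""

    def schedule(self, default: Cluster, target: Cluster, level: int,
                 overhead: float, bandwidth: float) -> Cluster:
        if level == 1:                      # unit 321
            return target
        if level == 2:                      # unit 322
            return target if overhead < bandwidth else default
        return default                      # unit 323
```

Keeping the level determination separate from the scheduling decision follows the split between modules 31 and 32.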
In this embodiment, the network resources between the default cluster of a task and an idle target cluster are determined, and the task is then scheduled according to the determined network resources. The default cluster is the cluster that stores the task data required for the task to run. When tasks are scheduled based on cluster load alone, excessive bandwidth usage arises mainly because a task, although scheduled to run on a target cluster, still has to read the task data it needs from the default cluster. Therefore, scheduling the task to the target cluster only when the network resources between the target cluster and the default cluster are good resolves the excessive inter-cluster bandwidth usage of the prior art.
A person of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be implemented by hardware executing program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs the steps of the above method embodiments. The storage medium includes any medium that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disc.
Finally, it should be noted that the above embodiments merely illustrate the technical solutions of the present invention and do not limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in those embodiments may still be modified, or some or all of their technical features may be replaced with equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (16)

  1. A task scheduling method, comprising:
    determining network resources between a default cluster of a task and an idle target cluster, the default cluster being a cluster that stores task data required for the task to run; and
    scheduling the task according to the network resources.
  2. The task scheduling method according to claim 1, wherein scheduling the task according to the network resources comprises:
    scheduling the task to the target cluster with the most network resources.
  3. The task scheduling method according to claim 1, wherein determining the network resources between the default cluster of the task and the idle target cluster comprises:
    determining a network structure relationship between the default cluster and the target cluster; and
    determining the network resources according to the network structure relationship.
  4. The task scheduling method according to claim 3, wherein the network structure relationship comprises same core switch, same region, and cross-region;
    and determining the network resources according to the network structure relationship comprises:
    if the default cluster and the target cluster share a core switch, determining that the level of the network resources is a first level;
    if the default cluster and the target cluster are in the same region, determining that the level of the network resources is a second level; and
    if the default cluster and the target cluster are cross-region, determining that the level of the network resources is a third level.
  5. The task scheduling method according to claim 4, wherein scheduling the task according to the network resources comprises:
    if the level of the network resources between the default cluster and the target cluster is the first level, scheduling the task to the target cluster;
    if the level of the network resources between the default cluster and the target cluster is the second level, determining, according to the network resources that scheduling the task would occupy, whether to schedule the task to the default cluster or to the target cluster; and
    if the level of the network resources between the default cluster and the target cluster is the third level, scheduling the task to the default cluster.
  6. The task scheduling method according to claim 5, wherein the network resource is network bandwidth, and determining, according to the network resources that scheduling the task would occupy, whether to schedule the task to the default cluster or to the target cluster comprises:
    obtaining, from history records, the duration of a single read of the task data by the task;
    calculating the ratio of the amount of task data to the duration to obtain the network overhead of the task;
    if the network overhead of the task is less than the network bandwidth between the default cluster and the target cluster, scheduling the task to the target cluster; and
    if the network overhead of the task is not less than the network bandwidth between the default cluster and the target cluster, scheduling the task to the default cluster.
  7. The task scheduling method according to claim 1, further comprising, before determining the network resources between the default cluster of the task and the idle target cluster:
    if the default cluster is overloaded, determining the target cluster based on load balancing.
  8. The task scheduling method according to any one of claims 1 to 5, wherein the network resources comprise at least one of network bandwidth and network bandwidth-delay product.
  9. A task scheduling device, comprising:
    a determining module, configured to determine network resources between a default cluster of a task and an idle target cluster, the default cluster being a cluster that stores task data required for the task to run; and
    a scheduling module, configured to schedule the task according to the network resources.
  10. The task scheduling device according to claim 9, wherein
    the scheduling module is specifically configured to schedule the task to the target cluster with the most network resources.
  11. The task scheduling device according to claim 9, wherein the determining module comprises:
    a relationship determining unit, configured to determine a network structure relationship between the default cluster and the target cluster; and
    a resource determining unit, configured to determine the network resources according to the network structure relationship.
  12. The task scheduling device according to claim 11, wherein the network structure relationship comprises same core switch, same region, and cross-region;
    the resource determining unit is specifically configured to: determine that the level of the network resources is a first level if the default cluster and the target cluster share a core switch; determine that the level of the network resources is a second level if the default cluster and the target cluster are in the same region; and determine that the level of the network resources is a third level if the default cluster and the target cluster are cross-region.
  13. The task scheduling device according to claim 12, wherein the scheduling module comprises:
    a first scheduling unit, configured to schedule the task to the target cluster if the level of the network resources between the default cluster and the target cluster is the first level;
    a second scheduling unit, configured to, if the level of the network resources between the default cluster and the target cluster is the second level, determine, according to the network resources that scheduling the task would occupy, whether to schedule the task to the default cluster or to the target cluster; and
    a third scheduling unit, configured to schedule the task to the default cluster if the level of the network resources between the default cluster and the target cluster is the third level.
  14. The task scheduling device according to claim 13, wherein the network resource is network bandwidth;
    the second scheduling unit is specifically configured to: obtain, from history records, the duration of a single read of the task data by the task; calculate the ratio of the amount of task data to the duration to obtain the network overhead of the task; schedule the task to the target cluster if the network overhead of the task is less than the network bandwidth between the default cluster and the target cluster; and schedule the task to the default cluster if the network overhead of the task is not less than the network bandwidth between the default cluster and the target cluster.
  15. The task scheduling device according to claim 9, further comprising:
    a load balancing module, configured to determine the target cluster based on load balancing if the default cluster is overloaded.
  16. The task scheduling device according to any one of claims 9 to 13, wherein the network resources comprise at least one of network bandwidth and network bandwidth-delay product.
PCT/CN2017/076709 2016-03-25 2017-03-15 Task scheduling method and device WO2017162075A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610180450.2 2016-03-25
CN201610180450.2A CN107229519B (en) 2016-03-25 2016-03-25 Task scheduling method and device

Publications (1)

Publication Number Publication Date
WO2017162075A1 true WO2017162075A1 (en) 2017-09-28

Family

ID=59899248

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/076709 WO2017162075A1 (en) 2016-03-25 2017-03-15 Task scheduling method and device

Country Status (3)

Country Link
CN (1) CN107229519B (en)
TW (1) TWI718252B (en)
WO (1) WO2017162075A1 (en)

Also Published As

Publication number Publication date
CN107229519B (en) 2021-04-23
TW201735596A (en) 2017-10-01
TWI718252B (en) 2021-02-11
CN107229519A (en) 2017-10-03

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17769352

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 17769352

Country of ref document: EP

Kind code of ref document: A1