WO2024045784A1 - Job scheduling method, scheduler and related equipment - Google Patents

Job scheduling method, scheduler and related equipment

Info

Publication number
WO2024045784A1
Authority
WO
WIPO (PCT)
Prior art keywords
data centers
resource
available resources
scheduler
relationship
Application number
PCT/CN2023/101231
Other languages
English (en)
French (fr)
Inventor
林雅婷
孔凡斌
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Publication of WO2024045784A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/54 Interprogram communication
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • This application relates to the field of computing power network technology, and in particular, to a job scheduling method, a scheduler, and related equipment.
  • A computing power network is a new information infrastructure that allocates and flexibly schedules computing, storage, and network resources on demand among clouds, edges, and terminals. It interconnects data centers in different regions, combining their heterogeneous computing resources into one computing power network. By coordinating and scheduling multi-dimensional resources such as network, storage, and computing, resources in different regions can be invoked in real time and on demand to process jobs.
  • In the computing power network, for a job submitted by a user, the scheduler usually schedules the job to a data center in the computing power network for execution based on algorithms such as first-come first-served (FCFS); when the job submitted by the user includes multiple sub-jobs, the scheduler schedules the multiple sub-jobs in order to different data centers in the computing power network, so that different data centers execute different sub-jobs.
  • During execution, data communication usually occurs between the different data centers, for example to transmit the execution results obtained by executing the sub-jobs. Data transmission between distant data centers can therefore easily cause a large communication overhead when executing the job submitted by the user.
  • This application provides a job scheduling method that schedules the multiple sub-jobs in a user-submitted job to a group of data centers with low mutual communication overhead for processing, effectively reducing the communication overhead incurred by executing the multiple sub-jobs (i.e., the user-submitted job).
  • this application also provides corresponding schedulers, computing devices, computer-readable storage media, and computer program products.
  • In a first aspect, this application provides a job scheduling method, which is applied to a computing power network and can be executed by a scheduler.
  • Specifically, the scheduler obtains the network topology relationship of the computing power network, where the network topology relationship is used to indicate the connection relationships and communication overheads between the multiple data centers included in the computing power network, and also obtains the available resource set of the multiple data centers.
  • The available resource set includes one or more available resource amounts, where an available resource amount indicates the resource statistics result of one data center with available resources among the multiple data centers, such as one or more of its available computing resources, storage resources, and network resources.
  • The scheduler then generates a resource topology relationship based on the network topology relationship and the available resource set.
  • The resource topology relationship is used to indicate the clustering results of the data centers with available resources among the multiple data centers, where the communication overhead between different data centers belonging to the same clustering result is less than the communication overhead between data centers belonging to different clustering results.
  • Based on the resource topology relationship, the scheduler schedules the multiple sub-jobs in the job to be scheduled to multiple data centers under the same clustering result.
  • Because the scheduler schedules multiple sub-jobs of the same job to multiple data centers belonging to the same clustering result, and intra-cluster communication overhead is lower than inter-cluster communication overhead, the communication overhead caused by data interaction between the data centers while they execute the multiple sub-jobs is small, thereby reducing the communication overhead incurred by executing the multiple sub-jobs.
  • In one possible implementation, when generating the resource topology relationship, the scheduler may extract, based on the network topology relationship and the available resource set, the structural features and communication features of the data centers with available resources among the multiple data centers: the structural features indicate the connection relationships between those data centers, and the communication features indicate the communication overheads between them. The scheduler can then generate the resource topology relationship from the structural features and communication features. In this way, the scheduler clusters data centers with similar structures and similar communication overheads into one category, producing a resource topology relationship that indicates the clustering results of the data centers with available resources, so that the multiple sub-jobs can later be scheduled based on it.
  • In another possible implementation, before generating the resource topology relationship, the scheduler may additionally extract the resource features of the data centers with available resources from the available resource set, and then generate the resource topology relationship from the structural features, communication features, and resource features together.
  • In this way, the scheduler not only clusters data centers with similar structures and communication overheads into one category, but also ensures that the available resource amounts of the data centers within one category are at a similar level, so that when the scheduler schedules multiple sub-jobs based on the resource topology relationship, the sub-jobs land on multiple data centers with similar resource conditions.
  • In another possible implementation, when generating the resource topology relationship, the scheduler may cluster the data centers with available resources among the multiple data centers according to the network topology relationship and the available resource set to obtain at least one clustering result, and then generate the resource topology relationship based on the available resource set and the at least one clustering result. In this way, the resource topology relationship can be generated through clustering.
  • The algorithm used to cluster the data centers may be, for example, the K-means clustering algorithm, the DBSCAN algorithm, the OPTICS algorithm, or another applicable clustering algorithm.
  • In another aspect, this application provides a scheduler, which includes the modules for executing the job scheduling method in the first aspect or any possible implementation of the first aspect.
  • In another aspect, the present application provides a processor, which can be connected to a memory and used to execute instructions stored in the memory, so that the processor performs the steps of the job scheduling method in the first aspect or any implementation of the first aspect.
  • In another aspect, this application provides a scheduler implemented as a computing device, where the computing device includes a processor and a memory.
  • the processor and the memory communicate with each other.
  • the processor is configured to execute instructions stored in the memory, so that the scheduler executes the job scheduling method in the first aspect or any implementation of the first aspect.
  • the memory can be integrated into the processor or independent of the processor.
  • the computing device may also include a bus. Among them, the processor is connected to the memory through a bus.
  • the memory may include read-only memory and random access memory.
  • In another aspect, the present application provides a computer-readable storage medium storing instructions that, when run on a computing device, cause the computing device to perform the steps of the job scheduling method described in the first aspect or any one of its implementations.
  • In another aspect, the present application provides a computer program product containing instructions that, when run on a computing device, cause the computing device to perform the steps of the job scheduling method described in the first aspect or any of its implementations.
  • Figure 1 is a schematic architectural diagram of an exemplary computing power network provided by this application.
  • Figure 2 is a schematic flow chart of a job scheduling method provided by this application.
  • Figure 3 is a schematic diagram of a network topology provided by this application.
  • Figure 4 is a schematic diagram provided by this application of clustering multiple data centers in the computing power network 100 to obtain three data center sets.
  • Figure 5 is a schematic diagram of an exemplary resource attribute configuration table provided by this application.
  • Figure 6 is a schematic structural diagram of a scheduler provided by this application.
  • Figure 7 is a schematic diagram of the hardware structure of a scheduler provided by this application.
  • To this end, this application provides a job scheduling method that schedules a distributed job to a group of data centers with small mutual communication overhead for execution, thereby reducing the communication overhead incurred by executing the distributed job.
  • the computing power network 100 may include a scheduler 200 and multiple data centers (specifically, data center 1 to data center 10).
  • the scheduler 200 is used to receive jobs submitted by users and schedule the jobs to the multiple data centers for execution.
  • FIG. 1 takes the data center as the granularity for job scheduling as an example.
  • the scheduler 200 can also perform job scheduling at the granularity of availability zone (AZ) or region (region).
  • each data center can include multiple devices, including computing devices, storage devices, network devices (such as network cards), etc.
  • Each AZ can include one data center or multiple geographically close data centers, and each region can include one or multiple AZs.
  • the number of data centers is not limited to the example shown in Figure 1.
  • The multiple data centers can be interconnected based on a central-distribution (hub-and-spoke) network architecture and a peer-to-peer (P2P) network architecture.
  • Data center 1 to data center 10 can be logically divided into area 1 to area 3, where area 1 includes data center 1 to data center 3, area 2 includes data center 4 to data center 7, and area 3 includes data center 8 to data center 10.
  • Different data centers within each area can be interconnected based on the central-distribution network architecture; for example, data center 1 in area 1 is connected to every other data center in area 1. Data centers belonging to different areas can be interconnected based on the P2P network architecture.
  • In other embodiments, the multiple data centers included in the computing power network 100 can also be interconnected based only on the P2P network architecture, based only on the central-distribution network architecture, or based on another applicable architecture, which is not limited here.
  • the job submitted by the user to the scheduler 200 may include multiple sub-jobs, and the multiple sub-jobs may be executed in parallel.
  • the job may be called a distributed job.
  • the distributed job submitted by the user can be a distributed training job.
  • the scheduler 200 can split the network architecture of the AI model into multiple modules for training respectively, or divide the training samples of the AI model into multiple subsets to train the AI model, etc.
  • a distributed job can also be a distributed computing job, a distributed search job, and other types of jobs.
  • each area can be configured with an agent component, such as the agent component 101, the agent component 102, the agent component 103, etc. in Figure 1, so that the scheduler 200 can schedule multiple sub-jobs to the corresponding agent components respectively.
  • The agent component then forwards the sub-jobs to the appropriate data centers.
  • the scheduler 200 can schedule sub-job 1 and sub-job 2 to the agent component 101, and schedule sub-job 3 and sub-job 4 to the agent component 103, and the agent component 101 schedules sub-job 1 to data center 1 and sub-job 2 to data center 2.
  • the agent component 103 schedules sub-job 3 to data center 8 and sub-job 4 to data center 9.
  • the agent component in each area can be deployed independently of the data center in the area.
  • For example, the agent component 102 can be deployed on a computing device independent of data center 4 and establish a wired or wireless connection with it.
  • the agent component in each area can be deployed in one of the data centers in the area.
  • the agent component 102 in Figure 1 can be deployed in the data center 4.
  • This application does not limit the deployment method of the agent component within an area.
  • When the scheduler 200 schedules the multiple sub-jobs included in a distributed job to different data centers in sequence according to scheduling algorithms such as FCFS or shortest-job-first, the sub-jobs can easily be scheduled to data centers that are geographically far apart. During execution of the distributed job, those data centers often interact, for example to exchange the execution results of their sub-jobs, and data interaction between distant data centers produces a large communication overhead, so a large communication overhead is required to execute the distributed job. Taking the execution of a distributed training job as an example, the scheduler 200 can divide the training samples into multiple subsets.
  • the task of using a subset to iteratively train the AI model can be a sub-job of the distributed job.
  • In each training round, the model parameters obtained by a data center from training the AI model on its subset are usually sent to the other data centers, so that each data center can compute the global model parameters through model-parameter exchange and accumulation algorithms, use the global model parameters to update its local AI model, and execute the next round of AI model training.
  • When the scheduler 200 schedules different sub-jobs to data centers that are geographically far apart, the frequent exchange of per-round model parameters between the multiple data centers during iterative training generates a large communication overhead, resulting in a large communication overhead for executing the distributed job.
  • this application provides a job scheduling method to reduce the communication overhead generated by executing distributed jobs.
  • the scheduler 200 obtains the network topology relationship of the computing power network 100 and the set of available resources of multiple data centers included in the computing power network 100.
  • The network topology relationship is used to indicate the connection relationships and communication overheads between the multiple data centers included in the computing power network 100.
  • The available resource set includes one or more available resource amounts, where an available resource amount indicates the resource statistics result of one data center with available resources among the multiple data centers.
  • The scheduler 200 generates, from the obtained network topology relationship and available resource set, a resource topology relationship indicating the clustering results of the data centers with available resources, where the communication overhead between different data centers belonging to the same clustering result is less than the communication overhead between data centers belonging to different clustering results. The scheduler 200 then schedules the multiple sub-jobs in the job to be scheduled to multiple data centers belonging to the same clustering result according to the resource topology relationship.
  • Because the scheduler 200 schedules multiple sub-jobs of the same job to multiple data centers belonging to the same clustering result according to the resource topology relationship, and the communication overhead between different data centers belonging to the same clustering result is less than that between data centers under different clustering results, the communication overhead caused by data interaction between the data centers while executing the multiple sub-jobs is small, thereby reducing the communication overhead incurred by executing the multiple sub-jobs.
  • scheduler 200 may be implemented in software or hardware.
  • the scheduler 200 may be implemented by, for example, at least one of a virtual machine, a container, and a computing engine.
  • the scheduler 200 may be implemented by a physical device including a processor, such as a server or the like.
  • The processor can be a CPU, an application-specific integrated circuit (ASIC), a programmable logic device (PLD), a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), a system on chip (SoC), a software-defined infrastructure (SDI) chip, an artificial intelligence (AI) chip, a data processing unit (DPU), another processor, or any combination thereof.
  • The number of processors included in the physical device may be one or more, and may be set according to the business requirements of the actual application, which is not limited in this embodiment.
  • computing power network 100 shown in Figure 1 is only used as an example. In actual application, the job scheduling method provided by the embodiment of the present application can also be applied to other applicable computing power networks.
  • FIG 2 is a schematic flowchart of a job scheduling method provided by an embodiment of the present application.
  • the job scheduling method shown in Figure 2 is applied to the computing power network 100 shown in Figure 1, or can be applied to other applicable computing power networks, which is not limited in this embodiment.
  • the computing power network shown in Figure 1 is used as an example for illustrative explanation.
  • the job scheduling method shown in Figure 2 may specifically include:
  • S201: The scheduler 200 obtains the network topology relationship of the computing power network 100.
  • the network topology relationship is used to indicate the connection relationship and communication overhead between multiple data centers included in the computing power network 100.
  • The computing power network 100 can be deployed with lightweight agent components, such as agent component 101 to agent component 103 in Figure 1, so that the scheduler 200 can instruct the agent components to collect, for each data center in the computing power network 100, its connection relationships with other data centers and the communication overhead generated by data interaction between different data centers, and report them to the scheduler 200.
  • The scheduler 200 can then generate the network topology relationship from the information reported by the agent components.
  • the network topology relationship can be, for example, a network topology diagram as shown in Figure 3.
  • The nodes in the network topology diagram uniquely identify the data centers in the computing power network 100.
  • the values on the edges connecting different nodes in the network topology diagram are used to indicate the communication overhead between the two data centers, and the larger the value, the greater the communication overhead between the two data centers.
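  • As a minimal illustrative sketch (the data-center names and overhead values below are assumptions, not taken from this publication), such a weighted network topology can be represented in Python with the networkx library, with one node per data center and the communication overhead stored on each edge:

```python
# Illustrative sketch: one way to represent the network topology
# relationship as a weighted graph, where nodes are data centers and
# edge weights are pairwise communication overheads.
import networkx as nx

topology = nx.Graph()
# Hypothetical overhead values; in practice they would come from the
# measurements reported by the agent components.
topology.add_edge("dc1", "dc2", overhead=1.0)
topology.add_edge("dc1", "dc3", overhead=1.2)
topology.add_edge("dc3", "dc8", overhead=7.5)  # cross-area link: larger overhead

for u, v, data in topology.edges(data=True):
    print(u, v, data["overhead"])
```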
  • the network topology relationship can also be implemented in other ways.
  • the communication overhead between different data centers is used to represent the cost of data communication between two data centers.
  • the communication overhead can be measured by the physical distance between the geographical locations of two data centers, and the greater the physical distance, the greater the communication overhead between the two data centers.
  • communication overhead can be measured by the amount of resources required to communicate a unit length of data between two data centers. In other embodiments, communication overhead may also be measured in other ways.
  • S202: The scheduler 200 obtains the available resource set of the multiple data centers.
  • The available resource set includes one or more available resource amounts, where an available resource amount indicates the resource statistics result of one data center with available resources among the multiple data centers.
  • For example, each agent component in the computing power network 100 can periodically report the current available resource amount of each data center to the scheduler 200 based on a heartbeat mechanism.
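  • A bounded sketch of such a heartbeat-style reporting loop is shown below; the payload shape, interval, and `send` callback are illustrative assumptions rather than the actual protocol:

```python
# Illustrative sketch of a heartbeat-style reporting loop an agent
# component might run (bounded here so the sketch terminates).
import time

def heartbeat(collect, send, interval_s=30.0, rounds=3):
    """Periodically collect and report the available resource amount."""
    for _ in range(rounds):
        send(collect())        # push the current statistics to the scheduler
        time.sleep(interval_s)

# Hypothetical usage: report a made-up resource snapshot three times.
heartbeat(lambda: {"dc": "dc4", "cpu": 32, "ssd": 512},
          print, interval_s=0.1)
```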
  • available resources which can also be called idle resources, refer to the resources in the data center that can be used to schedule jobs and execute the jobs.
  • the scheduler 200 can add the available resources reported by one or more agent components to the available resource set, so that the scheduler 200 can subsequently schedule the job to a data center with available resources in the available resource set.
  • The resource attributes that the agent component reports can be configured by users or technical personnel (such as administrators).
  • the scheduler 200 can obtain the resource attribute configuration table provided by the user or technician, and deliver the resource attribute configuration table to the agent component.
  • The resource attribute configuration table can be stored as a file in the YAML (YAML Ain't Markup Language) format, and it defines the available-resource information to be collected by the agent components.
  • As shown in Figure 5, the resource attribute configuration table can include computing resources: CPU, graphics processing unit (GPU), neural network processing unit (NPU), FPGA, ASIC, etc.; storage resources: hard disk drive (HDD), solid state drive (SSD), storage based on the non-volatile memory express (NVME) host controller interface specification, cassette tape (CASTAPE), IO rate, etc.; and network resources: uplink bandwidth (up_band), downlink bandwidth (dn_band), and network card type (NIC_type).
  • In actual application, the resource attribute configuration table may include more or fewer attributes. The agent component can then periodically collect, aggregate, and report the available resources of each data center based on the resource attributes defined in the table.
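  • The sketch below shows one plausible shape for such a YAML resource attribute table and how it could be parsed; the field names simply mirror the attributes of Figure 5 and are assumptions, not the actual file format used in this publication:

```python
# Illustrative sketch: parsing a hypothetical resource attribute
# configuration table stored as YAML (field names are assumptions).
import yaml  # PyYAML

CONFIG = """
computing_resources: [cpu, gpu, npu, fpga, asic]
storage_resources: [hdd, ssd, nvme, castape, io]
network_resources: [up_band, dn_band, nic_type]
"""

resource_attributes = yaml.safe_load(CONFIG)
# The agent component would collect exactly these attributes per data center.
for category, attrs in resource_attributes.items():
    print(category, attrs)
```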
  • the scheduler 200 may also process the amount of available resources reported by the agent component after receiving it.
  • For example, the scheduler 200 can perform data cleaning on the reported available resource amounts to remove abnormal values such as garbled characters, or, upon detecting such abnormal values, instruct the agent component to re-collect, re-aggregate, and re-report the available resources of each data center.
  • The scheduler 200 can also normalize the available resource amounts of each data center, to prevent large resource differences across data centers from producing a wide numeric range that would increase the complexity of subsequent calculations.
  • The scheduler 200 can additionally perform data conversion on resource information such as the network card type: the network card type can be converted into a corresponding numerical value, so that different values identify different types of network cards and subsequent job scheduling can be based on the converted values.
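  • The following sketch illustrates these three preprocessing steps (cleaning, min-max normalization, and categorical-to-numeric conversion of the NIC type) under assumed report shapes and an assumed NIC-type mapping:

```python
# Illustrative sketch of the preprocessing described above; all values
# and the NIC-type code table are hypothetical.
NIC_CODES = {"1GbE": 0, "10GbE": 1, "25GbE": 2, "RDMA": 3}  # assumed mapping

def clean(reports):
    """Keep only reports whose numeric fields are actually numeric."""
    return [r for r in reports
            if all(isinstance(r[k], (int, float)) for k in ("cpu", "ssd"))]

def normalize(values):
    """Min-max normalize a list of numbers to [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]

reports = [
    {"dc": "dc1", "cpu": 128, "ssd": 4096, "nic_type": "25GbE"},
    {"dc": "dc2", "cpu": 64,  "ssd": 1024, "nic_type": "10GbE"},
]
reports = clean(reports)
cpu_norm = normalize([r["cpu"] for r in reports])        # [1.0, 0.0]
nic_codes = [NIC_CODES[r["nic_type"]] for r in reports]  # [2, 1]
print(cpu_norm, nic_codes)
```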
  • S203: The scheduler 200 generates a resource topology relationship based on the network topology relationship and the available resource set.
  • The resource topology relationship is used to indicate the clustering results of the data centers with available resources among the multiple data centers, where the communication overhead between different data centers belonging to the same clustering result is less than the communication overhead between data centers belonging to different clustering results.
  • During specific implementation, the scheduler 200 can cluster the data centers with available resources by communication overhead, based on the network topology relationship and the available resource set, so that data centers that have available resources and small mutual communication overhead are grouped together.
  • Specifically, the scheduler 200 can determine, from the available resource set, which data centers in the computing power network 100 have available resources, and extract from the network topology relationship the structural features and communication features of those data centers. The structural features indicate the connection relationships between the data centers, and the communication features indicate the communication overheads of transmitting data between them.
  • As an implementation example, the scheduler 200 can determine the degree of each node in the network topology diagram, i.e., the number of edges connected to the node, and then, for each node, expand outward K layers with the node as the center, where K is the eccentricity of the node in the network topology graph, i.e., the number of layers to the node farthest from it; the K values of different nodes may differ. The scheduler 200 can then calculate the structural features of each node from its K-layer expansion result and the node degrees.
  • the structural features corresponding to each node can be quantified by the following formula (1) to obtain the structural feature vector of the node.
  • $v_{\mathrm{structure}} = (\delta_1, \delta_2, \ldots, \delta_i, \ldots, \delta_k) \quad (1)$
  • where $v_{\mathrm{structure}}$ is the structural feature vector, $\delta_i$ represents the sum of the degrees of the node's i-th-layer neighbors in the outward expansion, and $i$ is a positive integer not greater than $k$.
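  • A small sketch of formula (1), assuming the topology is held in a networkx graph: a breadth-first expansion from the node yields the layers, and each $\delta_i$ is accumulated as the sum of the degrees of the i-th-layer neighbors:

```python
# Illustrative sketch of formula (1): for a given node, expand outward
# layer by layer and record, per layer, the sum of neighbor degrees.
import networkx as nx

def structural_feature(graph, node):
    layers = dict(nx.single_source_shortest_path_length(graph, node))
    k = max(layers.values())  # eccentricity of the node
    delta = [0] * k
    for other, depth in layers.items():
        if depth >= 1:
            delta[depth - 1] += graph.degree(other)
    return delta  # (delta_1, ..., delta_k)

g = nx.Graph([("dc1", "dc2"), ("dc1", "dc3"), ("dc3", "dc8"), ("dc8", "dc9")])
print(structural_feature(g, "dc1"))  # [3, 2, 1] for this toy graph
```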
  • For the communication features, the scheduler 200 can determine, for each node, the communication overhead between that node and each of the other (n-1) nodes, and use these values to generate an n-dimensional vector that serves as the node's communication feature, where n is the number of nodes in the network topology diagram, i.e., the number of data centers included in the computing power network 100. Each data center in the computing power network 100 may have a unique identifier, and each data center corresponds to a unique node in the generated network topology.
  • the communication characteristics corresponding to each node can be quantified through the following formula (2) and formula (3) to obtain the communication characteristic vector of the node.
  • $v_{\mathrm{link}} = (\varepsilon_{s1}, \varepsilon_{s2}, \ldots, \varepsilon_{st}, \ldots, \varepsilon_{sn}) \quad (2)$
  • $\varepsilon_{st} = \min(e_{si} \cdot e_{ij} \cdots e_{lt}) \quad (3)$
  • where $v_{\mathrm{link}}$ is the communication feature vector of node s, $\varepsilon_{st}$ represents the communication overhead between node s and the t-th node in the network topology diagram, defined as the minimum, over all possible communication paths between the two nodes, of the product of the edge overheads along the path; $e_{si}$ represents the communication overhead between node s and node i on a communication path from node s to node t.
  • the scheduler 200 can also extract structural features and communication features of multiple data centers through other methods.
  • Alternatively, the communication feature vector corresponding to each node (that is, each data center) can be calculated based on the above formula (2) and the following formula (4).
  • $\varepsilon_{st} = \min(e_{si} + e_{ij} + \cdots + e_{lt}) \quad (4)$
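  • The sketch below illustrates formulas (3) and (4), assuming a networkx graph with positive edge overheads: the additive case is ordinary shortest-path search, and the multiplicative case reduces to it by taking logarithms:

```python
# Illustrative sketch: eps_st as the minimum path cost between two
# nodes, with either additive (formula 4) or multiplicative (formula 3)
# aggregation of edge overheads.
import math
import networkx as nx

g = nx.Graph()
g.add_edge("dc1", "dc2", overhead=2.0)
g.add_edge("dc2", "dc3", overhead=3.0)
g.add_edge("dc1", "dc3", overhead=8.0)

# Formula (4): additive path cost, plain Dijkstra.
eps_add = nx.dijkstra_path_length(g, "dc1", "dc3", weight="overhead")  # 5.0

# Formula (3): multiplicative path cost via a log transform
# (valid because all overheads are positive).
log_cost = nx.dijkstra_path_length(
    g, "dc1", "dc3",
    weight=lambda u, v, d: math.log(d["overhead"]))
eps_mul = math.exp(log_cost)  # min(2*3, 8) -> approx 6.0
print(eps_add, eps_mul)
```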
  • Further, the scheduler 200 clusters the data centers with available resources in the computing power network 100 according to their structural features and communication features to obtain multiple data center sets, each data center set being one clustering result.
  • During specific implementation, the scheduler 200 can splice the structural features and communication features of each data center into a single feature vector, and then use a clustering algorithm to cluster the multiple data centers into multiple data center sets. As shown in Figure 4, the scheduler 200 can cluster the multiple data centers in the computing power network 100 into three data center sets, each including multiple data centers.
  • the scheduler 200 can use the K-means clustering algorithm to cluster multiple data centers into multiple data center sets.
  • The objective function of the K-means clustering algorithm can be written as the following formula (5):
  • $J = \sum_{i=1}^{n} \min_{1 \le j \le k} \lVert x_i - u_j \rVert^2 \quad (5)$
  • where n is the number of data centers included in the computing power network 100, k is the number of cluster sets, $x_i$ represents the i-th data center (i ranges from 1 to n), and $u_j$ represents the centroid (center of mass) of the j-th set (j ranges from 1 to k).
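  • A minimal sketch of this clustering step using scikit-learn's K-means implementation; the feature values are hypothetical stand-ins for the spliced structural/communication vectors:

```python
# Illustrative sketch: clustering data-center feature vectors with
# K-means (scikit-learn). Values are hypothetical.
import numpy as np
from sklearn.cluster import KMeans

features = np.array([
    [0.9, 0.1, 0.2],   # dc1
    [0.8, 0.2, 0.1],   # dc2
    [0.1, 0.9, 0.8],   # dc8
    [0.2, 0.8, 0.9],   # dc9
])
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(features)
print(kmeans.labels_)  # e.g. [0 0 1 1]: two data center sets
```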
  • In other embodiments, the scheduler 200 may also apply the density-based spatial clustering of applications with noise (DBSCAN) algorithm or the ordering points to identify the clustering structure (OPTICS) algorithm to cluster the data centers, or use other applicable clustering algorithms, which is not limited in this embodiment.
  • Moreover, the scheduler 200 can store the generated feature vectors of each data center in a redis database for efficient access. After clustering produces multiple data center sets (that is, multiple clustering results), the scheduler 200 can also save them to a database, such as a redis database, so that distributed jobs subsequently submitted by users can be scheduled based on those sets.
  • the scheduler 200 may also divide multiple data centers into multiple data center sets in other ways. For example, the scheduler 200 may divide multiple data centers according to the configuration of the multiple data centers by technicians, etc. This embodiment does not limit this.
  • The scheduler 200 can also cluster the data centers with available resources based on more dimensions of information. For example, before clustering, the scheduler 200 can additionally obtain the available resource amounts of the data centers in the computing power network 100, and cluster the data centers according to the network topology diagram together with the available resource amount of each data center. In this way, within each data center set obtained by clustering, not only is the communication overhead between different data centers small, but the available resource amounts of the data centers are also at a similar level, giving them similar computing power, which helps improve load balancing within the data center set.
  • the scheduler 200 can extract the resource characteristics corresponding to each data center according to the amount of available resources in each data center.
  • The resource features can be represented in vector form. As an example, assuming that the agent components collect and report the available resource amounts of each data center based on the resource attribute configuration table shown in Figure 5, the scheduler 200 can quantify the available resources of each data center based on the following formula (6) to extract the resource features of each data center.
  • $v_{\mathrm{resource}} = (v_{\mathrm{cpu}}, v_{\mathrm{gpu}}, v_{\mathrm{npu}}, v_{\mathrm{fpga}}, v_{\mathrm{asic}}, v_{\mathrm{hdd}}, v_{\mathrm{ssd}}, v_{\mathrm{nvme}}, v_{\mathrm{castape}}, v_{\mathrm{io}}, v_{\mathrm{up\_band}}, v_{\mathrm{dn\_band}}, v_{\mathrm{nic\_type}}) \quad (6)$
  • v cpu represents the CPU resources of the data center
  • v gpu represents the GPU resources of the data center
  • v npu represents the NPU resources of the data center.
  • v fpga represents the FPGA resources of the data center
  • v asic represents the ASIC resources of the data center
  • v hdd represents the HDD resources of the data center
  • v ssd represents the SSD resources of the data center
  • v nvme represents the NVME resources of the data center
  • v castape represents the CASTAPE (cassette tape) resources of the data center
  • v io represents the IO resources of the data center
  • v up_band represents the uplink bandwidth of the data center
  • v dn_band represents the downlink bandwidth of the data center
  • v nic_type represents the network card type of the data center.
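  • The sketch below illustrates formula (6) and the feature-splicing step with hypothetical numbers: the resource vector is built per data center and concatenated with the structural and communication vectors to form one row of the clustering input:

```python
# Illustrative sketch of formula (6) plus feature splicing; all numbers
# are hypothetical.
import numpy as np

v_resource = np.array([128, 8, 0, 2, 0,      # cpu, gpu, npu, fpga, asic
                       500, 200, 100, 0, 5,  # hdd, ssd, nvme, castape, io
                       100, 100, 2])         # up_band, dn_band, nic_type code
v_structure = np.array([3, 2, 1])            # formula (1)
v_link = np.array([0.0, 2.0, 5.0, 6.0])      # formulas (2)-(4)

feature_vector = np.concatenate([v_structure, v_link, v_resource])
print(feature_vector.shape)  # one row of the clustering input matrix
```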
  • the scheduler 200 may also extract the resource characteristics of each data center through other formulas or other methods.
  • In this way, the scheduler 200 can cluster the data centers with available resources in the computing power network 100 based on the extracted resource features of each data center together with the structural and communication features, for example using the above K-means algorithm, thereby obtaining one or more data center sets, i.e., one or more clustering results.
  • Furthermore, the scheduler 200 can also take the data centers' tariff information (such as electricity prices) and energy consumption information into account when clustering, so that the data centers aggregated into each set have similar energy consumption and costs. After distributed jobs are scheduled to such a data center set, the energy consumption and cost of executing them can be reduced and the energy efficiency ratio of the computing power increased.
  • the scheduler 200 can also combine more other types of information to cluster multiple data centers, which is not limited in this embodiment.
  • the scheduler 200 can generate a resource topology indicating the clustering result of the data center with available resources based on the clustering result and the available resource amount set.
  • the resource topology relationship can be implemented through a topology diagram used to indicate resource distribution and data center clustering results, or can be implemented in other ways.
  • the scheduler 200 can schedule the job according to the resource topology relationship. Specifically, the scheduler 200 continues to perform the following steps:
  • S204: The scheduler 200 receives a job to be scheduled submitted by a user.
  • The job includes multiple sub-jobs and is hereinafter referred to as a distributed job.
  • the user can submit a distributed job to the computing power network 100 and request the computing power network 100 to process the distributed job.
  • the user can remotely log in to the scheduler 200 through a terminal or a client provided by the scheduler 200, and submit a distributed job on the terminal or client.
  • After the user submits a distributed job, the terminal or client can generate a data processing request containing the distributed job to be scheduled and send it to the scheduler 200; accordingly, the scheduler 200 can obtain the distributed job submitted by the user by parsing the received data processing request.
  • When submitting a job, the user can also attach a label marking it as a distributed job, so that the scheduler 200 can determine from the label that the currently submitted job is a distributed job.
  • a distributed job means that the job includes multiple sub-jobs that can be executed independently, and the multiple sub-jobs can be processed in parallel through multiple different data centers in the computing power network 100, thereby improving the processing efficiency of the distributed job.
  • a distributed job for example, may be a distributed training job, used to train a large-scale AI model.
  • The distributed training job may divide the training samples or the architecture of the AI model, so that the training job is split into multiple sub-jobs: each sub-job is used to train the AI model with a subset of the training samples, or to train a part of the AI model architecture with the training samples, or to train a part of the AI model architecture with a subset of the training samples, etc.; this embodiment does not limit this.
  • The distributed job may also be a distributed computing job or a distributed search job over multiple data items, which is not limited in this embodiment.
  • the multiple data centers in the computing power network 100 may be multiple data centers as shown in FIG. 1 , or may be multiple AZs, or may be multiple regions, etc., which is not limited in this embodiment.
  • S205: The scheduler 200 schedules the multiple sub-jobs in the distributed job to multiple data centers belonging to the same clustering result according to the resource topology relationship.
  • the scheduler 200 can schedule multiple sub-jobs in a distributed job to multiple data centers belonging to the same clustering result based on load balancing or random scheduling based on resource topology relationships.
  • the scheduler 200 can determine the first clustering result according to the resource topology relationship and the amount of resources required by the distributed job.
  • this embodiment provides the following implementation methods for determining the first clustering result.
  • Implementation 1: when submitting a distributed job, the user can specify the resource types and resource quantities required to execute it, so that the scheduler 200 can determine the first clustering result from the resource topology relationship according to the resource types and quantities specified by the user.
  • Specifically, the scheduler 200 can determine, from the resource types and quantities specified by the user, the amount of available resources required to execute each sub-job of the distributed job (rounding the amount up), and calculate the average amount of available resources of the data centers under each clustering result; specifically, it can calculate, according to the following formula (7), the ratio of the total available resources of a data center set to the number of data centers included in the set, to obtain the average resource amounts.
  • $\bar v^{\mathrm{comp}} = \frac{1}{m}\sum_{k=1}^{m} v_k^{\mathrm{comp}}, \qquad \bar v^{\mathrm{stor}} = \frac{1}{m}\sum_{k=1}^{m} v_k^{\mathrm{stor}}, \qquad \bar v^{\mathrm{net}} = \frac{1}{m}\sum_{k=1}^{m} v_k^{\mathrm{net}} \quad (7)$
  • where $\bar v^{\mathrm{comp}}$, $\bar v^{\mathrm{stor}}$ and $\bar v^{\mathrm{net}}$ characterize the average amounts of available computing, storage and network resources of the data center set (i.e., of the multiple data centers under the clustering result), m represents the number of data centers included in the data center set, and $v_k^{\mathrm{comp}}$, $v_k^{\mathrm{stor}}$ and $v_k^{\mathrm{net}}$ characterize the amounts of available computing, storage and network resources of the k-th data center in the set.
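  • A small sketch of formula (7) with hypothetical numbers, computing the per-attribute average available resources of one data center set:

```python
# Illustrative sketch of formula (7): per-attribute averages over the
# data centers of one set. All numbers are hypothetical.
import numpy as np

# Rows: data centers in one set; columns: available resource attributes.
available = np.array([
    [64, 4, 200],    # dc1: cpu cores, gpus, ssd (GB)
    [32, 2, 100],    # dc2
    [48, 2, 150],    # dc3
])
avg_resources = available.mean(axis=0)  # formula (7), per attribute
print(avg_resources)  # [48.  2.67 150.] approximately
```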
  • Then, based on the amount of available resources required by each sub-job and the calculated average available resources of each data center set, the scheduler 200 can determine the data center sets whose average resource amounts can satisfy the available resources required by a single sub-job.
  • For example, the scheduler 200 may calculate, according to the following formula (8), the difference Δ between the average available resources of a data center set and the available resources required by a single sub-job.
  • $\Delta = \sum_{i=1}^{x+y+z} \left( \bar v_i - v_i^{\mathrm{job}} \right) \quad (8)$
  • where x, y and z represent the numbers of attributes of computing, storage and network resources respectively, $\bar v_i$ is the average available amount of the i-th resource attribute in the data center set, and $v_i^{\mathrm{job}}$ is the amount of the i-th resource attribute required by a single sub-job.
  • When the difference Δ for a data center set indicates that its average available resources cannot satisfy a single sub-job (for example, when Δ is less than zero), the scheduler 200 does not schedule the distributed job to that data center set.
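  • The following sketch illustrates formula (8) and the filtering rule under the assumptions above; the surplus is computed attribute-wise here for clarity, and a set is kept only if no attribute falls short:

```python
# Illustrative sketch of formula (8) and the filtering rule; the
# demand values and the attribute-wise eligibility check are assumptions.
import numpy as np

def delta(avg_resources, subjob_demand):
    """Formula (8): surplus of a data center set over one sub-job's needs."""
    return avg_resources - subjob_demand

avg_resources = np.array([48.0, 2.67, 150.0])  # from formula (7)
subjob_demand = np.array([16.0, 1.0, 50.0])    # per-sub-job requirement

d = delta(avg_resources, subjob_demand)
eligible = bool((d >= 0).all())  # keep the set only if nothing falls short
print(d, eligible)
```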
  • the scheduler 200 can determine one or more data center sets that can be used to execute the distributed job, that is, determine one or more clustering results that can be used to execute the distributed job.
  • When only one clustering result can satisfy the resource requirements of the distributed job, the scheduler 200 can determine that clustering result as the first clustering result and schedule the distributed job to it. When the available resources of the data centers under multiple clustering results can all meet the needs of the distributed job, the scheduler 200 can, based on a load-balancing policy, select the clustering result with the lightest load from among them as the first clustering result, or randomly select one of them as the first clustering result, and then schedule the distributed job to be executed by the multiple data centers under the first clustering result.
  • the available resources of different data centers under the first clustering result may be different.
  • When the number of data centers with sufficient resources under the first clustering result is smaller than the number of sub-jobs included in the distributed job, it is difficult for the data centers under the first clustering result to execute all sub-jobs of the distributed job in parallel due to resource constraints.
  • In this case, the data centers under the first clustering result can instruct the scheduler 200 to reschedule the distributed job, so that the scheduler 200 can reschedule it to the multiple data centers under another clustering result that can execute all the sub-jobs in parallel.
  • Implementation 2: when submitting a distributed job, the user can specify the resource types and quantities required to execute it, and can also specify the number of data centers required to execute the distributed job or the number of sub-jobs it contains.
  • In this case, the scheduler 200 can determine the resource types and amounts required to execute each sub-job based on the number of data centers or sub-jobs specified by the user. Then, the scheduler 200 can traverse the clustering results indicated by the resource topology relationship and, for the currently traversed clustering result, count how many of its data centers have resource types and amounts that satisfy what is required to execute a sub-job.
  • When the counted number of data centers under the currently traversed clustering result is not less than the number of sub-jobs, the scheduler 200 can stop the traversal, determine that clustering result as the first clustering result, and schedule the distributed job to the multiple data centers under the first clustering result.
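  • A compact sketch of this traversal strategy, with assumed data structures (a list of clustering results, each a list of per-data-center available-resource dicts):

```python
# Illustrative sketch of the traversal strategy; data structures and
# demand values are assumptions.
def pick_first_cluster(clusters, subjob_demand, num_subjobs):
    """Return the index of the first clustering result that contains at
    least num_subjobs data centers satisfying subjob_demand."""
    for idx, centers in enumerate(clusters):
        satisfying = [c for c in centers
                      if all(c.get(k, 0) >= v for k, v in subjob_demand.items())]
        if len(satisfying) >= num_subjobs:
            return idx  # the "first clustering result"
    return None  # no set can run all sub-jobs in parallel

clusters = [
    [{"cpu": 8, "gpu": 0}, {"cpu": 32, "gpu": 2}],
    [{"cpu": 64, "gpu": 4}, {"cpu": 48, "gpu": 2}, {"cpu": 32, "gpu": 2}],
]
print(pick_first_cluster(clusters, {"cpu": 16, "gpu": 1}, num_subjobs=3))  # 1
```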
  • Implementation 3: the scheduler 200 can obtain in real time the total amount of available resources of the data centers under each clustering result and, after receiving the distributed job submitted by the user, select one or more data center sets that meet a resource condition from the multiple clustering results according to the resource types and amounts required by the distributed job.
  • The resource condition means that the difference between the total available resources of the multiple data centers under a clustering result and the available resources required by the distributed job exceeds a first threshold, or that the total available resources of the multiple data centers under the clustering result exceed a second threshold, which indicates that the data centers under the filtered clustering result(s) usually have enough available resources to execute the distributed job. The scheduler 200 can then determine the first clustering result from the one or more filtered clustering results and schedule the distributed job submitted by the user to the multiple data centers belonging to the first clustering result for execution.
  • When scheduling the distributed job, the scheduler 200 may specify some or all of the data centers used to execute it and specify the sub-job executed by each data center, so that the multiple data centers belonging to the first clustering result execute the multiple sub-jobs of the distributed job in parallel under the scheduling of the scheduler 200.
  • Moreover, the scheduler 200 may receive multiple distributed jobs in sequence. The scheduler 200 can generate resource topology relationships in real time and, based on them, schedule the received distributed jobs one by one to the multiple data centers under the corresponding clustering results for execution. For example, after receiving a distributed job submitted by another user, the scheduler 200 can schedule the new distributed job to the multiple data centers under a second clustering result among the multiple clustering results, so that those data centers execute the multiple sub-jobs of the new distributed job in parallel; this embodiment does not limit this.
  • In this embodiment, because the scheduler 200 schedules the multiple sub-jobs of the same job to multiple data centers belonging to the same clustering result according to the resource topology relationship, and the communication overhead between different data centers belonging to the same clustering result is less than the communication overhead between data centers belonging to different clustering results, the communication overhead caused by data interaction between the data centers during the execution of the multiple sub-jobs is small, which reduces the communication overhead incurred by executing the multiple sub-jobs.
  • Moreover, when the resource topology relationship is additionally obtained by clustering the data centers with available resources in the computing power network 100 based on their resource features, the differences in the resource types and amounts of available resources between the data centers under each clustering result indicated by the resource topology relationship are usually small.
  • Using the multiple data centers under one clustering result to execute the multiple sub-jobs of the distributed job in parallel can therefore reduce the generation of resource fragments in the data centers, reducing resource waste, and can also improve the load-balancing capability of the computing power network 100.
  • The execution order of the steps shown in Figure 2 is only illustrative and does not limit actual applications; the order of the steps executed by the scheduler 200 is not limited to the example shown in Figure 2.
  • the scheduler 200 may execute step S201 and step S202 in parallel, or the scheduler 200 may execute step S202 first, and then execute step S201; or the scheduler 200 may execute step S204 first, and then execute steps S201 to S203, etc., in this embodiment This is not limited.
  • the scheduler 600 includes:
  • Information acquisition module 601, used to obtain the network topology relationship of the computing power network, where the network topology relationship is used to indicate the connection relationships and communication overheads between the multiple data centers included in the computing power network; and to obtain the available resource set of the multiple data centers, where the available resource set includes one or more available resource amounts, an available resource amount indicating the resource statistics result of one data center with available resources among the multiple data centers;
  • Generating module 602, configured to generate a resource topology relationship according to the network topology relationship and the available resource set, where the resource topology relationship is used to indicate the clustering results of the data centers with available resources among the multiple data centers, and the communication overhead between different data centers belonging to the same clustering result is less than the communication overhead between data centers belonging to different clustering results;
  • the scheduling module 603 is configured to schedule multiple sub-jobs in the job to be scheduled to multiple data centers belonging to the same clustering result according to the resource topology relationship.
  • the generation module 602 is used to:
  • extract, based on the network topology relationship and the available resource set, the structural features and communication features of the data centers with available resources among the multiple data centers, where the structural features are used to indicate the connection relationships between the data centers with available resources, and the communication features are used to indicate the communication overheads between the data centers with available resources among the multiple data centers;
  • the resource topological relationship is generated according to the structural characteristics and the communication characteristics.
  • the generation module 602 is further configured to, before generating the resource topology relationship, extract the resource characteristics of a data center with available resources among the multiple data centers according to the set of available resources. ;
  • the generating module 602 is specifically configured to generate the resource topological relationship according to the structural characteristics, the communication characteristics and the resource characteristics.
  • the generation module 602 is used to:
  • according to the network topology relationship and the available resource set, cluster the data centers with available resources among the multiple data centers to obtain at least one clustering result;
  • the resource topological relationship is generated according to the set of available resource amounts and the at least one clustering result.
  • Since the scheduler 600 shown in Figure 6 corresponds to the method executed by the scheduler 200 shown in Figure 2, the specific implementation of the scheduler 600 and its technical effects can be found in the relevant descriptions of the foregoing embodiments and are not repeated here.
  • FIG. 7 is a schematic diagram of a scheduler 700 provided by this application.
  • the scheduler 700 can implement the functions of the scheduler 200 in the embodiment shown in FIG. 2 .
  • The scheduler 700 includes a processor 701, a memory 702, and a communication interface 703, which communicate through a bus 704; communication can also be achieved through other means such as wireless transmission.
  • The memory 702 is used to store instructions, and the processor 701 is used to execute the instructions stored in the memory 702. Further, the scheduler 700 may also include a memory unit 705, which can be connected to the processor 701, the memory 702, and the communication interface 703 through the bus 704.
  • the memory 702 stores program code, and the processor 701 can call the program code stored in the memory 702 to perform the following operations:
  • obtain the network topology relationship of the computing power network, where the network topology relationship is used to indicate the connection relationships and communication overheads between the multiple data centers included in the computing power network;
  • obtain the available resource set of the multiple data centers, where the available resource set includes one or more available resource amounts, and an available resource amount indicates the resource statistics result of one data center with available resources among the multiple data centers;
  • generate a resource topology relationship according to the network topology relationship and the available resource set, where the resource topology relationship is used to indicate the clustering results of the data centers with available resources among the multiple data centers, and the communication overhead between different data centers belonging to the same clustering result is less than the communication overhead between data centers belonging to different clustering results;
  • schedule the multiple sub-jobs in the job to be scheduled to multiple data centers belonging to the same clustering result according to the resource topology relationship.
  • the processor 701 may be a CPU, or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor or any conventional processor.
  • Memory 702 may include read-only memory and random access memory and provides instructions and data to processor 701 .
  • Memory 702 may also include non-volatile random access memory.
  • memory 702 may also store device type information.
  • Memory 702 may be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory.
  • the non-volatile memory can be a read-only memory (ROM), a programmable ROM (PROM), an erasable programmable read-only memory (erasable PROM, EPROM), an electrically erasable programmable read-only memory (electrically EPROM, EEPROM), or flash memory.
  • Volatile memory may be random access memory (RAM), which is used as an external cache. By way of example but not limitation, many forms of RAM are available, such as static random access memory (static RAM, SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (double data rate SDRAM, DDR SDRAM), enhanced synchronous dynamic random access memory (enhanced SDRAM, ESDRAM), synchlink dynamic random access memory (synchlink DRAM, SLDRAM), and direct rambus random access memory (direct rambus RAM, DR RAM).
  • the communication interface 703 is used to communicate with other devices connected to the scheduler 700.
  • in addition to a data bus, the bus 704 may also include a power bus, a control bus, a status signal bus, and the like. However, for clarity of illustration, the various buses are all labeled as bus 704 in the figure.
  • the scheduler 700 may correspond to the scheduler 600 in the embodiments of the present application, and may correspond to the scheduler 200 executing the method shown in Figure 2 according to the embodiments of the present application; the above and other operations and/or functions implemented by the scheduler 700 are respectively intended to implement the corresponding processes of the method in Figure 2 and, for brevity, are not described again here.
  • An embodiment of the present application also provides a processor, which is connected to a memory.
  • the processor is configured to execute the instructions in the memory, so that the processor performs the job scheduling method executed by the scheduler 200 in the embodiment shown in Figure 2.
  • An embodiment of the present application also provides a computer-readable storage medium.
  • the computer-readable storage medium may be any available medium that a computing device can store, or a data storage device such as a data center that contains one or more available media.
  • the available media may be magnetic media (e.g., floppy disk, hard disk, tape), optical media (e.g., DVD), or semiconductor media (e.g., solid state drive).
  • the computer-readable storage medium includes instructions that instruct the computing device to perform the above-described job scheduling method.
  • An embodiment of the present application also provides a computer program product.
  • the computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computing device, the processes or functions described in accordance with the embodiments of the present application are generated in whole or in part.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from a website, computer, or data center to another website, computer, or data center in a wired (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless (such as infrared, radio, or microwave) manner.
  • the computer program product may be a software installation package. If it is necessary to use any of the foregoing job scheduling methods, the computer program product may be downloaded and executed on the computing device.
  • the above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof.
  • the above-described embodiments may be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are generated in whole or in part.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from a website, computer, server, or data center to another website, computer, server, or data center in a wired (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless (such as infrared, radio, or microwave) manner.
  • the computer-readable storage medium may be any available medium that a computer can access, or a data storage device such as a server or a data center that contains one or more sets of available media.
  • the available media may be magnetic media (e.g., floppy disk, hard disk, tape), optical media (e.g., DVD), or semiconductor media.
  • the semiconductor medium may be a solid state drive.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

A job scheduling method applied to a computing power network. Specifically, a scheduler obtains the network topology relationship of the computing power network, which indicates the connection relationships and communication overheads between multiple data centers, and obtains a set of available resource amounts, in which one available resource amount indicates the resource statistics result of one data center with available resources. The scheduler then generates, from the obtained information, a resource topology relationship indicating clustering results, where the communication overhead between different data centers belonging to the same clustering result is less than the communication overhead between data centers belonging to different clustering results; the scheduler thereby schedules, according to the resource topology relationship, multiple sub-jobs to multiple data centers under the same clustering result. In this way, the communication overhead generated by data interaction between the different data centers is small, so that the communication overhead generated by executing the multiple sub-jobs can be reduced.

Description

Job scheduling method, scheduler, and related devices
This application claims priority to the Chinese patent application No. 202211041424.3, filed with the China National Intellectual Property Administration on August 29, 2022 and entitled "Job scheduling method, scheduler and related devices", which is incorporated herein by reference in its entirety.
TECHNICAL FIELD
This application relates to the field of computing power network technology, and in particular, to a job scheduling method, an apparatus, and related devices.
BACKGROUND
A computing power network (computing network) is a new type of information infrastructure that allocates computing resources, storage resources, and network resources on demand and schedules them flexibly among clouds, edges, and terminals. It can interconnect data centers with heterogeneous computing resources located in different regions into one computing network and, by coordinating the scheduling of multi-dimensional resources such as network, storage, and computing, can invoke resources in different regions to process jobs in real time and on demand.
In a computing power network, for a job submitted by a user, the scheduler usually schedules the job to a data center in the computing power network for execution based on an algorithm such as first-come first-served (FCFS). When the job submitted by the user includes multiple sub-jobs, the scheduler schedules these sub-jobs one by one to different data centers of the computing power network, so that different data centers execute different sub-jobs. However, while the sub-jobs are being executed, the data centers usually communicate with one another, for example to exchange the execution results each of them obtains for its sub-job; this data transmission between data centers easily makes the communication overhead of executing the user-submitted job large.
Therefore, how to reduce the communication overhead of executing a user-submitted job that includes multiple sub-jobs has become an important problem to be solved urgently.
SUMMARY
This application provides a job scheduling method that schedules the multiple sub-jobs in a user-submitted job to a group of data centers with small mutual communication overhead for processing, thereby effectively reducing the communication overhead of executing the multiple sub-jobs (that is, the user-submitted job). In addition, this application also provides a corresponding scheduler, computing device, computer-readable storage medium, and computer program product.
In a first aspect, this application provides a job scheduling method, which is applied to a computing power network and can be executed by a scheduler. Specifically, the scheduler obtains the network topology relationship of the computing power network, which indicates the connection relationships and communication overheads between the multiple data centers included in the computing power network, and obtains the set of available resources of the multiple data centers, which includes one or more available resource amounts, where an available resource amount indicates the resource statistics result of one data center with available resources among the multiple data centers, for example one or more of the available computing, storage, and network resources. The scheduler then generates a resource topology relationship according to the network topology relationship and the set of available resource amounts; the resource topology relationship indicates the clustering results of the data centers with available resources among the multiple data centers, where the communication overhead between different data centers belonging to the same clustering result is less than the communication overhead between data centers belonging to different clustering results. The scheduler thereby schedules, according to the resource topology relationship, the multiple sub-jobs of the job to be scheduled to multiple data centers belonging to the same clustering result.
Because the scheduler schedules the multiple sub-jobs of the same job to multiple data centers belonging to the same clustering result according to the resource topology relationship, and the communication overhead between different data centers belonging to the same clustering result is smaller than that between data centers belonging to different clustering results, the communication overhead generated by data interaction between the data centers while they execute the multiple sub-jobs is small, which reduces the communication overhead of executing the multiple sub-jobs.
In a possible implementation, when generating the resource topology relationship, the scheduler may extract, according to the network topology relationship and the set of available resource amounts, the structural characteristics and communication characteristics of the data centers with available resources among the multiple data centers; the structural characteristics indicate the connection relationships between the data centers with available resources, and the communication characteristics indicate the magnitude of the communication overhead between the data centers with available resources. The scheduler can then generate the resource topology relationship according to the structural characteristics and the communication characteristics. In this way, the scheduler can gather multiple data centers of the computing power network with similar structure and communication overhead into one category, and can thus generate a resource topology relationship indicating the clustering results of the data centers with available resources among the multiple data centers, for subsequently scheduling multiple sub-jobs according to that resource topology relationship.
In a possible implementation, before generating the resource topology relationship, the scheduler may extract the resource characteristics of the data centers with available resources among the multiple data centers according to the set of available resource amounts, so that the scheduler can generate the resource topology relationship according to the structural characteristics, communication characteristics, and resource characteristics of those data centers. In this way, the scheduler not only gathers multiple data centers of the computing power network with similar structure and communication overhead into one category, but the amounts of available resources of different data centers within the same category are also at a similar level, so that when scheduling multiple sub-jobs according to the resource topology relationship, the scheduler can schedule them to multiple data centers with similar resource conditions.
In a possible implementation, when generating the resource topology relationship, the scheduler may cluster the data centers with available resources among the multiple data centers according to the network topology relationship and the set of available resource amounts to obtain at least one clustering result, and then generate the resource topology relationship according to the set of available resource amounts and the at least one clustering result. In this way, the resource topology relationship can be generated by means of clustering.
Exemplarily, the algorithm used to cluster the data centers may be, for example, the K-means clustering algorithm, the DBSCAN algorithm, or the OPTICS algorithm, or another applicable clustering algorithm may be used.
In a second aspect, this application provides a scheduler that includes the modules for executing the job scheduling method in the first aspect or any possible implementation of the first aspect.
In a third aspect, this application provides a processor that can be connected to a memory and is configured to execute instructions stored in the memory, so that the processor performs the steps of the job scheduling method in the first aspect or any implementation of the first aspect.
In a fourth aspect, this application provides a scheduler that includes a processor and a memory, which communicate with each other. The processor is configured to execute instructions stored in the memory so that the scheduler executes the job scheduling method in the first aspect or any implementation of the first aspect. It should be noted that the memory may be integrated in the processor or be independent of the processor. The scheduler may also include a bus, through which the processor connects to the memory; the memory may include readable memory and random access memory.
In a fifth aspect, this application provides a computer-readable storage medium storing instructions that, when run on a computing device, cause the computing device to perform the operation steps of the job scheduling method described in the first aspect or any implementation of the first aspect.
In a sixth aspect, this application provides a computer program product containing instructions that, when run on a computing device, cause the computing device to perform the operation steps of the job scheduling method described in the first aspect or any implementation of the first aspect.
On the basis of the implementations provided in the above aspects, this application may further combine them to provide more implementations.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is a schematic architecture diagram of an exemplary computing power network provided by this application;
Figure 2 is a schematic flowchart of a job scheduling method provided by this application;
Figure 3 is a schematic diagram of a network topology graph provided by this application;
Figure 4 is a schematic diagram, provided by this application, of clustering the multiple data centers in the computing power network 100 into three data center sets;
Figure 5 is a schematic diagram of an exemplary resource attribute configuration table provided by this application;
Figure 6 is a schematic structural diagram of a scheduler provided by this application;
Figure 7 is a schematic diagram of the hardware structure of a scheduler provided by this application.
DETAILED DESCRIPTION
To solve the problem of the large communication overhead generated by executing distributed jobs in a computing power network, this application provides a job scheduling apparatus that schedules a distributed job to a group of data centers with small mutual communication overhead for execution, reducing the communication overhead generated by executing the distributed job.
The technical solutions of this application are described below with reference to the accompanying drawings of the embodiments of this application.
Referring to Figure 1, it is a schematic diagram of an exemplary computing power network provided by an embodiment of this application. As shown in Figure 1, the computing power network 100 may include a scheduler 200 and multiple data centers (specifically, data center 1 to data center 10). The scheduler 200 is configured to receive jobs submitted by users and schedule the jobs to the multiple data centers for execution.
Figure 1 takes job scheduling at the granularity of data centers as an example; in other possible embodiments, the scheduler 200 may also schedule jobs at granularities such as availability zones (AZ) or regions. Generally, each data center may include multiple devices, such as computing devices, storage devices, and network devices (such as network interface cards). Each AZ may include one data center or multiple geographically close data centers, and each region may include one or more AZs. Moreover, the number of data centers in the computing power network 100 is not limited to the example shown in Figure 1.
In the computing power network shown in Figure 1, the multiple data centers may be interconnected based on a hub-and-spoke network architecture together with a peer-to-peer (P2P) network architecture. Specifically, data center 1 to data center 10 may be logically divided into region 1 to region 3, where region 1 includes data centers 1 to 3, region 2 includes data centers 4 to 7, and region 3 includes data centers 8 to 10. The different data centers within each region may be interconnected based on the hub-and-spoke architecture; for example, data center 1 in region 1 is connected to all the other data centers in region 1. Data centers belonging to different regions may be interconnected based on the P2P architecture. In other possible computing power networks, the multiple data centers of the computing power network 100 may also be interconnected based only on a P2P architecture, only on a hub-and-spoke architecture, or on another applicable architecture, which is not limited here.
In practice, a job submitted by a user to the scheduler 200 may include multiple sub-jobs that can be executed in parallel; in this case the job may also be called a distributed job. Exemplarily, the distributed job submitted by the user may be a distributed training job: when an artificial intelligence (AI) model has many parameters and the training samples involve a large amount of data (for example, up to 200 billion parameters and 40 terabytes of data), the scheduler 200 may split the network architecture of the AI model into multiple modules to be trained separately, or split the training samples of the AI model into multiple subsets for training the AI model, and so on. Alternatively, the distributed job may also be another type of job such as a distributed computing job or a distributed search job.
The scheduler 200 then schedules the multiple sub-jobs to different data centers for execution, so that executing the multiple sub-jobs in parallel on multiple data centers improves the processing efficiency of the distributed job. In a specific implementation, each region may be configured with an agent component, such as agent components 101, 102, and 103 in Figure 1, so that the scheduler 200 can schedule the sub-jobs to the corresponding agent components, which then forward the sub-jobs to the corresponding data centers. For example, assuming the distributed job includes sub-jobs 1 to 4, the scheduler 200 may schedule sub-jobs 1 and 2 to agent component 101 and sub-jobs 3 and 4 to agent component 103; agent component 101 then dispatches sub-job 1 to data center 1 and sub-job 2 to data center 2, while agent component 103 dispatches sub-job 3 to data center 8 and sub-job 4 to data center 9. The agent component within each region may be deployed in that region independently of the data centers; for example, agent component 102 in Figure 1 may be deployed on a computing device independent of data center 4 and establish a wired or wireless connection with data center 4. Alternatively, the agent component within each region may be deployed in one of the data centers of that region; for example, agent component 102 in Figure 1 may be deployed in data center 4. This application does not limit how agent components are deployed within a region.
During sub-job scheduling, if the scheduler 200 schedules the multiple sub-jobs of the distributed job one by one to different data centers according to scheduling algorithms such as FCFS or shortest-job-first, the sub-jobs are easily scheduled to data centers that are geographically far apart. While a distributed job is being executed, different data centers often interact, for example exchanging the execution results each obtains for its sub-job, and the data interaction between data centers that are far apart generates large communication overhead, so that the communication overhead required to execute the distributed job is large. Taking the execution of a distributed training job as an example, the scheduler 200 may split the training samples into multiple subsets; the task of iteratively training the AI model with one subset is then one sub-job of the distributed job. While the distributed training job executes, the model parameters each data center obtains after training the AI model with its subset in each round are usually sent to the other data centers, so that each data center can compute the global model parameters via parameter-exchange and accumulation algorithms, update its local AI model with the global model parameters, and execute the next round of training the AI model. In this case, if the scheduler 200 schedules different sub-jobs to geographically distant data centers, the frequent exchange of the per-round model parameters between the multiple data centers during iterative training generates large communication overhead, so that the communication overhead generated by executing the distributed job is large.
To this end, this application provides a job scheduling method to reduce the communication overhead generated by executing distributed jobs. In a specific implementation, the scheduler 200 obtains the network topology relationship of the computing power network 100 and the set of available resource amounts of the multiple data centers included in the computing power network 100; the network topology relationship indicates the connection relationships and communication overheads between the multiple data centers included in the computing power network 100, and the set of available resource amounts includes one or more available resource amounts, each indicating the resource statistics result of one data center with available resources among the multiple data centers. The scheduler 200 then generates, according to the obtained network topology relationship and set of available resource amounts, a resource topology relationship indicating the clustering results of the data centers with available resources among the multiple data centers, where the communication overhead between different data centers belonging to the same clustering result is less than the communication overhead between data centers belonging to different clustering results; the scheduler 200 thereby schedules, according to the resource topology relationship, the multiple sub-jobs of the job to be scheduled to multiple data centers belonging to the same clustering result.
Because the scheduler 200 schedules the multiple sub-jobs of the same job to multiple data centers belonging to the same clustering result according to the resource topology relationship, and the communication overhead between different data centers belonging to the same clustering result is smaller than the communication overhead between data centers belonging to different clustering results, the communication overhead generated by data interaction between the data centers while they execute the multiple sub-jobs is small, which reduces the communication overhead of executing the multiple sub-jobs.
As some examples, the scheduler 200 may be implemented in software or hardware. When implemented in software, the scheduler 200 may be implemented by at least one of a virtual machine, a container, or a computing engine, for example. When implemented in hardware, the scheduler 200 may be implemented by a physical device including a processor, such as a server. The processor may be a CPU, an application-specific integrated circuit (ASIC), a programmable logic device (PLD), a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), a system on chip (SoC), a software-defined infrastructure (SDI) chip, an artificial intelligence (AI) chip, a data processing unit (DPU), any other processor, or any combination thereof. Moreover, the number of processors included in the physical device may be one or more, and can be set according to the service requirements of the actual application, which is not limited in this embodiment.
It is worth noting that the computing power network 100 shown in Figure 1 above is only an exemplary illustration; in practice, the job scheduling method provided by the embodiments of this application can also be applied to other applicable computing power networks.
Embodiments of the job scheduling method provided by this application are described next with reference to the accompanying drawings.
Referring to Figure 2, Figure 2 is a schematic flowchart of a job scheduling method provided by an embodiment of this application. The job scheduling method shown in Figure 2 is applied to the computing power network 100 shown in Figure 1, or may be applied to other applicable computing power networks, which is not limited in this embodiment. For ease of description, this embodiment uses application to the computing power network shown in Figure 1 as an example.
Based on the computing power network 100 shown in Figure 1, the job scheduling method shown in Figure 2 may specifically include:
S201: Obtain the network topology relationship of the computing power network 100, which indicates the connection relationships and communication overheads between the multiple data centers included in the computing power network 100.
In practice, the computing power network 100 may be deployed with lightweight agent components, such as agent components 101 to 103 in Figure 1, so that the scheduler 200 can instruct the agent components to collect the connection relationships between each data center and the other data centers in the computing power network 100, as well as the communication overheads generated by data interaction between different data centers, and report them to the scheduler 200. The scheduler 200 can then generate the network topology relationship from the information reported by the agent components. The network topology relationship may be, for example, a network topology graph such as the one shown in Figure 3, in which a node uniquely identifies a data center of the computing power network 100 and the value on an edge connecting two nodes indicates the communication overhead between the two data centers, with a larger value representing a larger communication overhead. In other possible implementations, the network topology relationship may also be realized in other ways.
The communication overhead between different data centers characterizes the cost that two data centers must pay for data communication. Exemplarily, the communication overhead may be measured by the physical distance between the geographic locations of the two data centers, with a larger physical distance implying a larger communication overhead between them; or the communication overhead may be measured by the amount of resources consumed to communicate a unit length of data between the two data centers. In other embodiments, the communication overhead may also be measured in other ways.
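To make this step concrete, the following is a minimal sketch, not the patent's implementation, of assembling the network topology graph from agent reports using the networkx library; the (source, destination, cost) report format and the function name are assumptions made here for illustration.

```python
# Hypothetical assembly of the network topology relationship from agent
# reports; each report names a pair of data centers and the measured
# communication overhead between them.
import networkx as nx

def build_topology(agent_reports):
    """agent_reports: iterable of (src_dc, dst_dc, comm_cost) tuples."""
    g = nx.Graph()
    for src, dst, cost in agent_reports:
        # Keep the cheapest observed overhead if an edge is reported twice.
        if g.has_edge(src, dst):
            cost = min(cost, g[src][dst]["cost"])
        g.add_edge(src, dst, cost=cost)
    return g

topology = build_topology([
    ("dc1", "dc2", 2.0), ("dc1", "dc3", 3.5), ("dc2", "dc8", 9.0),
])
```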
S202: The scheduler 200 obtains the set of available resource amounts of the multiple data centers; the set of available resource amounts includes one or more available resource amounts, each indicating the resource statistics result of one data center with available resources among the multiple data centers.
Specifically, each agent component in the computing power network 100 may, based on a heartbeat mechanism, periodically report to the scheduler 200 the current available resource amount of each data center, that is, the amount of its available resources. Available resources, which may also be called idle resources, are resources in a data center that can be allocated to a job and used to execute it. In this way, the scheduler 200 can add the available resource amounts reported by one or more agent components to the set of available resource amounts, so that the scheduler 200 can subsequently schedule jobs to the data centers with available resources in that set.
The attributes of the available resources reported by the agent components may be configured by users or technical personnel (such as administrators). For example, the scheduler 200 may obtain a resource attribute configuration table provided by a user or technician and deliver it to the agent components. The resource attribute configuration table may be stored as a file in YAML (YAML ain't a markup language) form and may define the available-resource information to be collected by the agent components. For example, the resource attribute configuration table may be as shown in Figure 5 and may include computing resources: CPU, graphics processing unit (GPU), neural-network processing unit (NPU), FPGA, ASIC, and so on; storage resources: hard disk drive (HDD), solid state drive (SSD), storage based on the non-volatile memory express (NVMe) specification, cassette tape (CASTAPE), IO rate, and so on; and network resources: uplink bandwidth (up_band), downlink bandwidth (dn_band), and network interface card type (NIC_type). In other embodiments, the resource attributes in the resource attribute configuration table may include more or fewer attributes. In this way, the agent components can periodically collect, count, and report the available resource amounts of each data center based on the resource attributes defined in the resource attribute configuration table.
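As a hedged illustration of such a table, the snippet below mirrors the attribute groups of Figure 5 and loads them with PyYAML; the exact schema and field layout are assumptions, since the text only fixes the attribute list, not the file format.

```python
# A hypothetical resource attribute configuration in YAML form; an agent
# would collect one value per listed attribute for its data center.
import yaml  # PyYAML

CONFIG = """
compute: [cpu, gpu, npu, fpga, asic]
storage: [hdd, ssd, nvme, castape, io]
network: [up_band, dn_band, nic_type]
"""

attrs = yaml.safe_load(CONFIG)
to_collect = [a for group in attrs.values() for a in group]
print(to_collect)  # the attributes the agent is asked to report
```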
In practice, after receiving the available resource amounts reported by the agent components, the scheduler 200 may further process them. For example, the scheduler 200 may clean the reported available resource amounts to remove abnormal values such as garbled characters, or, upon determining that abnormal values such as garbled characters exist, instruct the agent components to re-collect, re-count, and re-report the available resource amounts of the data centers. As another example, the scheduler 200 may normalize the available resource amounts of the data centers to avoid wide value ranges caused by large resource differences between data centers, which would increase the complexity of subsequent computation. As yet another example, the scheduler 200 may convert resource information such as the NIC type in the available resource amounts into numeric form, specifically converting the NIC type into a corresponding value so that different values identify different types of NICs, and subsequent job scheduling can be based on the converted values identifying the NIC types.
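The sketch below illustrates these pre-processing steps under stated assumptions: records are simple dicts, normalization is min-max over one attribute, and the NIC code table is invented for the example.

```python
# Hypothetical cleaning, normalization, and categorical-to-numeric
# conversion of agent-reported available resource amounts.
NIC_CODES = {"1GbE": 1, "10GbE": 2, "25GbE": 3, "RDMA": 4}  # made-up table

def preprocess(records):
    """records: list of dicts such as {"dc": "dc1", "cpu": 128.0, "nic_type": "RDMA"}."""
    # Data cleaning: drop records whose numeric fields are malformed.
    clean = [r for r in records if isinstance(r.get("cpu"), (int, float))]
    lo = min(r["cpu"] for r in clean)
    hi = max(r["cpu"] for r in clean)
    span = (hi - lo) or 1.0
    for r in clean:
        r["cpu"] = (r["cpu"] - lo) / span                 # normalize to [0, 1]
        r["nic_type"] = NIC_CODES.get(r["nic_type"], 0)   # NIC type -> numeric id
    return clean
```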
S203: The scheduler 200 generates a resource topology relationship according to the network topology relationship and the set of available resources; the resource topology relationship indicates the clustering results of the data centers with available resources among the multiple data centers, and the communication overhead between different data centers belonging to the same clustering result is less than the communication overhead between data centers belonging to different clustering results.
In this embodiment, the scheduler 200 may cluster the data centers with available resources by communication overhead according to the network topology relationship and the set of available resources, so as to gather into one group multiple data centers that have available resources and small mutual communication overhead.
In a possible implementation, the scheduler 200 may determine, from the set of available resource amounts, the data centers in the computing power network 100 that have available resources, and extract, according to the network topology relationship, the structural characteristics and communication characteristics of those data centers; the structural characteristics indicate the connection relationships between the multiple data centers, and the communication characteristics indicate the communication overhead of transmitting data between the multiple data centers.
In a specific implementation, taking the network topology relationship being a network topology graph as an example, the scheduler 200 may determine the degree of each node in the graph, that is, the number of edges connected to the node, and then, for each node, expand outward K layers with the node at the center. Here K is the diameter of the network topology graph centered at the node, that is, the layer number of the node farthest from it in the graph; the value of K differs between nodes. The scheduler 200 can then compute each node's structural characteristics from its K-layer expansion result and the node degrees.
Exemplarily, the structural characteristics of each node can be quantified by Formula (1) below to obtain the node's structural feature vector.
$v_{structure}=(\alpha_1,\alpha_2,\ldots,\alpha_i,\ldots,\alpha_k)$    Formula (1)
where $v_{structure}$ is the structural feature vector, $\alpha_i$ is the sum of the degrees of the neighboring nodes in the i-th outward layer from the node, and i is a positive integer not greater than k. In this way, the vectorized structural characteristics of each node (that is, each data center) can be computed based on Formula (1).
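A minimal sketch of Formula (1) follows; it expands outward from each node with breadth-first search and sums the degrees of the nodes in each layer. Padding the vectors to a common length for later clustering is an assumption made here, since k differs between nodes.

```python
# Structural feature vectors per Formula (1): alpha_i is the degree sum of
# the nodes in the i-th outward layer around each node.
import networkx as nx

def structural_features(g: nx.Graph):
    feats = {}
    for node in g:
        layers = nx.single_source_shortest_path_length(g, node)
        k = max(layers.values())                  # diameter centered at node
        vec = [0] * k
        for other, dist in layers.items():
            if dist >= 1:
                vec[dist - 1] += g.degree(other)  # accumulate alpha_dist
        feats[node] = vec
    width = max(len(v) for v in feats.values())
    return {n: v + [0] * (width - len(v)) for n, v in feats.items()}
```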
In addition, the scheduler 200 may determine, from the communication overheads of data transmission between nodes indicated by the network topology graph (that is, the communication overheads between data centers), the communication overheads between a given node and the other (n-1) nodes in the graph, and generate from them an n-dimensional vector that serves as the node's communication characteristics, where n is the number of nodes in the network topology graph, that is, the number of data centers included in the computing power network 100. Generally, each data center in the computing power network 100 has a unique identifier and, correspondingly, a unique node in the generated topology.
Exemplarily, the communication characteristics of each node can be quantified by Formulas (2) and (3) below to obtain the node's communication feature vector.
$v_{link}=(\beta_{s1},\beta_{s2},\ldots,\beta_{st},\ldots,\beta_{sn})$    Formula (2)
$\beta_{st}=\min(e_{si}\cdot e_{ij}\cdots e_{lt})$    Formula (3)
where $v_{link}$ is the communication feature vector, $\beta_{st}$ is the communication overhead between node s and the t-th node of the network topology graph, $\min(e_{si}\cdot e_{ij}\cdots e_{lt})$ is the minimum of the communication overheads produced by all possible communication paths between node s and the t-th node, and $e_{si}$ is the communication overhead between node s and node i on a communication path between node s and node t.
It is worth noting that the above processes for extracting the structural and communication characteristics of the multiple data centers are only examples; in other embodiments, the scheduler 200 may extract them in other ways. For example, when extracting a data center's communication feature vector, the communication feature vector of each node (that is, each data center) may be computed based on Formula (2) above and Formula (4) below.
$\beta_{st}=\min(e_{si}+e_{ij}+\cdots+e_{lt})$    Formula (4)
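A minimal sketch of the additive variant follows: with summed edge costs as in Formula (4), the minimum over all paths is exactly a shortest-path distance, so Dijkstra's algorithm over the "cost" edge weight yields each node's communication feature vector $v_{link}$ of Formula (2).

```python
# Communication feature vectors per Formulas (2) and (4): beta_st is the
# minimum summed overhead over all s-to-t paths, i.e. a shortest path.
import networkx as nx

def communication_features(g: nx.Graph):
    nodes = sorted(g)  # one fixed ordering of the n nodes
    feats = {}
    for s in nodes:
        dist = nx.single_source_dijkstra_path_length(g, s, weight="cost")
        feats[s] = [dist.get(t, float("inf")) for t in nodes]
    return nodes, feats
```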
Then, the scheduler 200 clusters the data centers with available resources in the computing power network 100 according to the structural and communication characteristics of the multiple data centers, obtaining multiple data center sets, each of which is one clustering result.
In a specific implementation, the scheduler 200 may concatenate each data center's structural feature vector and communication feature vector into a single feature vector for that data center, and then use a suitable clustering algorithm to cluster the multiple data centers into multiple data center sets. As shown in Figure 4, the scheduler 200 may cluster the multiple data centers of the computing power network 100 into three data center sets, each of which includes multiple data centers.
Exemplarily, the scheduler 200 may use the K-means clustering algorithm to cluster the multiple data centers into multiple data center sets; the objective function of K-means can be given as in Formula (5) below.
$J=\sum_{j=1}^{k}\sum_{x_i\in C_j}\left\|x_i-u_j\right\|^2$    Formula (5)
where n is the number of data centers included in the computing power network 100, k is the number of clustered sets, $x_i$ is the i-th data center with i ranging from 1 to n, $C_j$ is the j-th set, and $u_j$ is the centroid of the j-th set with j ranging from 1 to k.
In other examples, the scheduler 200 may also cluster the multiple data centers using the density-based spatial clustering of applications with noise (DBSCAN) algorithm or the ordering points to identify the clustering structure (OPTICS) algorithm, or other applicable clustering algorithms, which is not limited in this embodiment.
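The following is a minimal K-means sketch using scikit-learn; concatenating the two feature vectors and the choice k=3 (mirroring the three sets of Figure 4) are assumptions made here for illustration.

```python
# Cluster data centers on their concatenated structural and communication
# feature vectors; each resulting group is one clustering result.
import numpy as np
from sklearn.cluster import KMeans

def cluster_data_centers(struct_feats, comm_feats, nodes, k=3):
    x = np.array([struct_feats[n] + comm_feats[n] for n in nodes])
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(x)
    clusters = {}
    for node, label in zip(nodes, labels):
        clusters.setdefault(int(label), []).append(node)
    return clusters
```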
In practical scenarios, while clustering the multiple data centers of the computing power network 100, the scheduler 200 may store the generated feature vector of each data center in a redis database for efficient access by the scheduler 200; after clustering the data centers into multiple data center sets (that is, multiple clustering results), the scheduler 200 may save them to a database, such as a redis database, so that user-submitted distributed jobs can subsequently be scheduled according to the multiple data center sets.
In this way, within each data center set determined by the scheduler 200, the communication overhead between different data centers is usually smaller than the communication overhead between a data center in that set and data centers in other sets, so the scheduler 200 can partition data centers with small mutual communication overhead into the same set. In other embodiments, the scheduler 200 may also divide the multiple data centers into multiple data center sets in other ways, for example according to a configuration of the multiple data centers provided by technical personnel, which is not limited in this embodiment.
In a further possible implementation, the scheduler 200 may also cluster the multiple data centers with available resources based on information of more dimensions. For example, before clustering the multiple data centers, the scheduler 200 may also obtain the amounts of available resources corresponding to the multiple data centers in the computing power network 100 and cluster the multiple data centers according to the network topology graph together with each data center's amount of available resources. In this way, within each resulting data center set, not only is the communication overhead between different data centers small, but the amounts of available resources of different data centers are also at a similar level with comparable computing power, which helps improve the load balancing capability within that data center set.
In a specific implementation, the scheduler 200 may extract each data center's resource characteristics from its amount of available resources; the resource characteristics can be represented in vector form. As an example, assuming the agent components collect and report the amounts of available resources of the data centers based on the resource attribute configuration table shown in Figure 5, the scheduler 200 may quantify each data center's available resources based on Formula (6) below, thereby extracting each data center's resource characteristics.
$v_{resource}=(v_{cpu},v_{gpu},v_{npu},v_{fpga},v_{asic},v_{hdd},v_{ssd},v_{nvme},v_{castape},v_{io},v_{up\_band},v_{dn\_band},v_{nic\_type})$    Formula (6)
where $v_{cpu}$ denotes the data center's CPU resources, $v_{gpu}$ its GPU resources, $v_{npu}$ its NPU resources, $v_{fpga}$ its FPGA resources, $v_{asic}$ its ASIC resources, $v_{hdd}$ its HDD resources, $v_{ssd}$ its SSD resources, $v_{nvme}$ its NVMe resources, $v_{castape}$ its CASTAPE resources, $v_{io}$ its IO resources, $v_{up\_band}$ its uplink bandwidth, $v_{dn\_band}$ its downlink bandwidth, and $v_{nic\_type}$ its NIC type. In other embodiments, the scheduler 200 may also extract each data center's resource characteristics by other formulas or in other ways.
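As a small hedged sketch of Formula (6), the helper below packs one data center's reported amounts into a fixed-order vector; the attribute order follows Figure 5, and defaulting missing attributes to 0 is an assumption made here.

```python
# Resource feature vector per Formula (6), in the attribute order of Figure 5.
RESOURCE_ATTRS = ["cpu", "gpu", "npu", "fpga", "asic", "hdd", "ssd",
                  "nvme", "castape", "io", "up_band", "dn_band", "nic_type"]

def resource_feature(report: dict) -> list:
    return [float(report.get(attr, 0)) for attr in RESOURCE_ATTRS]
```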
Next, the scheduler 200 may cluster the multiple data centers with available resources in the computing power network 100 based on the extracted resource characteristics of the data centers together with the structural and communication characteristics extracted above, for example using the K-means algorithm described above, thereby obtaining one or more data center sets, that is, one or more clustering results.
Besides combining the resource characteristics of the data centers, the scheduler 200 may also combine tariff information (such as electricity prices) and energy consumption information of the data centers when clustering the multiple data centers, so that the data centers in each data center set have similar energy consumption and cost; after a distributed job is scheduled to such a data center set, the energy consumption and cost of the distributed job can be reduced and the energy efficiency of the computing power increased. In other embodiments, the scheduler 200 may also combine more other types of information when clustering the multiple data centers, which is not limited in this embodiment.
After obtaining at least one clustering result according to the network topology relationship and the set of available resources, the scheduler 200 may generate, according to the clustering result and the set of available resource amounts, a resource topology relationship indicating the clustering results of the data centers with available resources; the resource topology relationship may be realized as a topology graph indicating the resource distribution and the data center clustering results, or in other ways.
In this embodiment, after the scheduler 200 has generated the resource topology relationship, when a user-submitted job includes multiple sub-jobs, the scheduler 200 can schedule the job according to the resource topology relationship. Specifically, the scheduler 200 continues with the following steps:
S204: The scheduler 200 receives a job to be scheduled submitted by a user; the job includes multiple sub-jobs and is hereinafter called a distributed job.
In practice, the user may submit the distributed job to the computing power network 100 and request the computing power network 100 to process it. For example, the user may remotely log in to the scheduler 200 through a terminal or through a client provided externally by the scheduler 200 and submit the distributed job there; the terminal or client can generate a data processing request containing the distributed job to be processed and send it to the scheduler 200. Correspondingly, the scheduler 200 can parse the received data processing request to obtain the distributed job submitted by the user. Further, when submitting the job, the user may also attach a distributed-job label to it, so that the scheduler 200 can determine from the label that the currently submitted job is a distributed job.
A distributed job is a job that includes multiple independently executable sub-jobs that can be processed in parallel by multiple different data centers in the computing power network 100, thereby improving the processing efficiency of the distributed job. Exemplarily, a distributed job may be a distributed training job used to train a large AI model; by splitting the training samples or splitting the AI model architecture, the training job of the AI model can be divided into multiple sub-jobs, each of which trains the AI model with one subset of the training samples, trains part of the model architecture with the training samples, or trains part of the architecture with one subset of the samples, which is not limited in this embodiment. Alternatively, a distributed job may be a job that performs distributed computing or distributed search over multiple pieces of data, which is likewise not limited in this embodiment.
The multiple data centers in the computing power network 100 may be the multiple data centers shown in Figure 1, or multiple AZs, or multiple regions, which is not limited in this embodiment.
S205: The scheduler 200 schedules the multiple sub-jobs of the distributed job to multiple data centers belonging to the same clustering result according to the resource topology relationship.
For example, the scheduler 200 may schedule the multiple sub-jobs of the distributed job to multiple data centers belonging to the same clustering result based on load balancing or random scheduling according to the resource topology relationship.
In practice, because the amounts of available resources of the multiple data centers in the computing power network 100 usually change dynamically during operation, and the types and amounts of available resources may differ between data center sets, it can happen that, although the communication overhead between the data centers within each clustering result is small, the amounts of available resources of the data centers under some clustering results cannot meet the requirements of the distributed job. Therefore, the scheduler 200 may determine a first clustering result according to the resource topology relationship and the resource amounts required by the distributed job.
As some examples, this embodiment provides the following implementations for determining the first clustering result.
In a first implementation example, when submitting the distributed job, the user may specify the resource types and resource quantities required to execute it; the scheduler 200 can then traverse the clustering results of the data centers indicated by the resource topology relationship according to the user-specified resource types and quantities, and determine one or more clustering results that have the user-specified resource types with corresponding quantities meeting the user's requirements, that is, one or more data center sets that satisfy the resource demand of the distributed job. Specifically, the scheduler 200 may determine, from the user-specified resource types and quantities, the amount of available resources required to execute each sub-job of the distributed job (with the amount rounded up), and compute the average amount of available resources of the data centers under each clustering result; specifically, the average may be obtained, based on Formula (7) below, as the ratio of a data center set's amount of available resources to the number of data centers the set includes.
$\bar{v}_{com}=\frac{1}{m}\sum_{k=1}^{m}v_{com}^{k},\quad \bar{v}_{sto}=\frac{1}{m}\sum_{k=1}^{m}v_{sto}^{k},\quad \bar{v}_{net}=\frac{1}{m}\sum_{k=1}^{m}v_{net}^{k}$    Formula (7)
where $\bar{v}_{com}$ denotes the average amount of available computing resources in the data center set (that is, the multiple data centers under one clustering result), $\bar{v}_{sto}$ denotes the average amount of available storage resources in the set, $\bar{v}_{net}$ denotes the average amount of available network resources in the set, m denotes the number of data centers included in the set, and $v_{com}^{k}$, $v_{sto}^{k}$, and $v_{net}^{k}$ denote the amounts of available computing, storage, and network resources of the k-th data center in the set.
Then, the scheduler 200 may determine, from the amount of available resources required by each sub-job and the computed average amount of available resources of each data center set, the data center sets whose average amount can satisfy the amount of available resources required by a single sub-job. For example, the scheduler 200 may compute, based on Formula (8) below, the difference Δδ between a data center set's average amount of available resources and the amount of available resources required by a single sub-job.
$\Delta\delta=\sum_{i=1}^{x}\left(\bar{v}_{com,i}-r_{com,i}\right)+\sum_{j=1}^{y}\left(\bar{v}_{sto,j}-r_{sto,j}\right)+\sum_{l=1}^{z}\left(\bar{v}_{net,l}-r_{net,l}\right)$    Formula (8)
where $r_{com,i}$ denotes the amount of available computing resources required by a single sub-job, x denotes the number of computing resource attributes, $r_{sto,j}$ denotes the amount of available storage resources required by a single sub-job, y denotes the number of storage resource attributes, $r_{net,l}$ denotes the amount of available network resources required by a single sub-job, and z denotes the number of network resource attributes.
When the direction of the difference Δδ is positive, the average amount of available resources of the data center set can satisfy the amount of available resources required by a single sub-job, that is, the data center set can execute the distributed job. When the direction of the difference Δδ is negative, the set's average amount of available resources cannot satisfy a single sub-job's requirement, and in this case the scheduler 200 does not schedule the distributed job to that data center set.
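A minimal sketch of this check follows, assuming the per-attribute resource feature vectors of Formula (6); the averaging implements Formula (7) and the signed surplus implements Formula (8).

```python
# Feasibility check per Formulas (7) and (8): average the set's available
# resources per attribute, then test the sign of the surplus over one
# sub-job's demand.
import numpy as np

def surplus(cluster_feats, subjob_demand):
    """cluster_feats: m x d array (one resource vector per data center);
    subjob_demand: length-d vector of amounts one sub-job requires."""
    avg = np.asarray(cluster_feats, dtype=float).mean(axis=0)   # Formula (7)
    return float(np.sum(avg - np.asarray(subjob_demand, dtype=float)))

def feasible(cluster_feats, subjob_demand):
    return surplus(cluster_feats, subjob_demand) > 0  # positive direction
```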
Based on the above, the scheduler 200 can determine one or more data center sets that can be used to execute the distributed job, that is, one or more clustering results that can be used to execute it. When the available resources of the data centers under only one clustering result satisfy the requirements of the distributed job, the scheduler 200 may determine that clustering result as the first clustering result and schedule the distributed job to it. When the available resources of the data centers under multiple clustering results can all satisfy the requirements of the distributed job, the scheduler 200 may, based on a load balancing policy, select the clustering result with the lightest load from among the multiple clustering results as the first clustering result, or randomly select one clustering result from them as the first clustering result, and then schedule the distributed job to the multiple data centers under that first clustering result for execution.
In practice, the available resources of different data centers under the first clustering result may differ. In this case, after the distributed job has been scheduled to the multiple data centers under the first clustering result, when the number of data centers under the first clustering result whose amount of available resources can satisfy the amount required by a single sub-job is smaller than the number of sub-jobs included in the distributed job, the multiple data centers under the first clustering result cannot, due to resource limitations, execute all the sub-jobs of the distributed job in parallel; the data centers under the first clustering result may then instruct the scheduler 200 to reschedule the distributed job, so that the scheduler 200 can reschedule the distributed job to the multiple data centers under another clustering result that can execute all the sub-jobs in parallel.
In a second implementation example, when submitting the distributed job, the user may specify the resource types and amounts required to execute it, and may also specify the number of data centers required to execute it or the number of sub-jobs the job includes; the scheduler 200 can then determine, from the user-specified number of data centers or number of sub-jobs, the resource types and amounts required to execute each sub-job. The scheduler 200 may then traverse the clustering results indicated by the resource topology relationship and count, among the multiple data centers under the clustering result currently being traversed, the number of data centers whose resource types and amounts satisfy those required to execute each sub-job. When the counted number of data centers is greater than or equal to the user-specified number of data centers or the number of sub-jobs included in the distributed job, the scheduler 200 may stop traversing, determine that clustering result as the first clustering result, and schedule the distributed job to the multiple data centers under that first clustering result.
In a third implementation example, the scheduler 200 may obtain in real time the total amount of available resources of the multiple data centers under each clustering result and, after receiving a user-submitted distributed job, filter out one or more data center sets that satisfy a resource condition from the multiple clustering results according to the resource types and amounts required by the distributed job. The resource condition is that the difference between the total amount of available resources of the data centers under a clustering result and the amount of available resources required by the distributed job exceeds a first threshold, or that the total amount of available resources of the data centers under a clustering result exceeds a second threshold; this indicates that the data centers under the filtered clustering result(s) generally have enough available resources to execute the distributed job. The scheduler 200 can therefore determine the first clustering result from the filtered clustering result(s) and schedule the user-submitted distributed job to the multiple data centers belonging to that first clustering result for execution.
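A sketch of this filter under stated assumptions (totals are pre-aggregated scalars, and threshold1/threshold2 stand in for the first and second thresholds) could look as follows.

```python
# Keep the clustering results whose total available resources clear either
# threshold relative to the distributed job's demand.
def filter_clusters(cluster_totals, job_demand, threshold1, threshold2):
    """cluster_totals: dict mapping clustering-result id -> total amount."""
    return [cid for cid, total in cluster_totals.items()
            if total - job_demand > threshold1 or total > threshold2]
```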
This embodiment does not limit the way in which the scheduler 200 determines the first clustering result from the multiple clustering results.
When scheduling the distributed job to the multiple data centers under the first clustering result, the scheduler 200 may designate some or all of the data centers to execute the distributed job and specify the sub-job each data center executes, so that the multiple data centers belonging to the first clustering result execute the multiple sub-jobs of the distributed job in parallel under the scheduling of the scheduler 200.
It is worth noting that the above steps S201 to S205 use the scheduling of one distributed job by the scheduler 200 as an example. In practice, the scheduler 200 may receive multiple distributed jobs in sequence; it can then generate the resource topology relationship in real time and, according to the resource topology relationship generated in real time, schedule the received distributed jobs one by one to the multiple data centers under the corresponding clustering results for execution. For example, after receiving a distributed job submitted by another user, the scheduler 200 may schedule the new distributed job to the multiple data centers under a second clustering result among the multiple clustering results, so that the multiple data centers under the second clustering result execute the multiple sub-jobs of the new distributed job in parallel, which is not limited in this embodiment.
In this embodiment, because the scheduler 200 schedules the multiple sub-jobs of the same job to multiple data centers belonging to the same clustering result according to the resource topology relationship, and the communication overhead between different data centers belonging to the same clustering result is smaller than the communication overhead between data centers belonging to different clustering results, the communication overhead generated by data interaction between the data centers while they execute the multiple sub-jobs is small, which reduces the communication overhead of executing the multiple sub-jobs.
Further, when the resource topology relationship is also obtained by clustering the data centers with available resources in the computing power network 100 based on their resource characteristics, the differences in the types and amounts of available resources between the data centers under each clustering result indicated by the resource topology relationship are usually small; executing the multiple sub-jobs of a distributed job in parallel on the multiple data centers under one clustering result then reduces the generation of resource fragments on those data centers, reduces resource waste, and also improves the load balancing capability of the computing power network 100.
It is worth noting that the execution order of the steps shown in Figure 2 is only exemplary and does not limit the order of the steps executed by the scheduler 200 in practice to the example shown in Figure 2; for example, the scheduler 200 may execute steps S201 and S202 in parallel, or execute step S202 before step S201, or execute step S204 before steps S201 to S203, which is not limited in this embodiment.
It is worth noting that other reasonable step combinations that those skilled in the art can conceive based on the above description also fall within the protection scope of this application. Moreover, those skilled in the art should also be aware that the embodiments described in the specification are preferred embodiments, and the actions involved are not necessarily required by this application.
The job scheduling method provided by the embodiments of this application has been introduced above with reference to Figures 1 to 5; the functions of the scheduler provided by the embodiments of this application and the computing device implementing that scheduler are introduced next with reference to the accompanying drawings.
Referring to Figure 6, a schematic structural diagram of a scheduler is shown. The scheduler 600 includes:
an information obtaining module 601, configured to obtain the network topology relationship of the computing power network, where the network topology relationship indicates the connection relationships and communication overheads between the multiple data centers included in the computing power network, and to obtain the set of available resource amounts of the multiple data centers, where the set includes one or more available resource amounts, each indicating the resource statistics result of one data center with available resources among the multiple data centers;
a generation module 602, configured to generate a resource topology relationship according to the network topology relationship and the set of available resource amounts, where the resource topology relationship indicates the clustering results of the data centers with available resources among the multiple data centers, and the communication overhead between different data centers belonging to the same clustering result is less than the communication overhead between data centers belonging to different clustering results;
a scheduling module 603, configured to schedule, according to the resource topology relationship, multiple sub-jobs in the job to be scheduled to multiple data centers belonging to the same clustering result.
In a possible implementation, the generation module 602 is configured to:
extract, according to the network topology relationship and the set of available resource amounts, the structural characteristics and communication characteristics of the data centers with available resources among the multiple data centers, where the structural characteristics indicate the connection relationships between the data centers with available resources among the multiple data centers, and the communication characteristics indicate the magnitude of the communication overhead between the data centers with available resources among the multiple data centers;
generate the resource topology relationship according to the structural characteristics and the communication characteristics.
In a possible implementation, the generation module 602 is further configured to extract, before generating the resource topology relationship, the resource characteristics of the data centers with available resources among the multiple data centers according to the set of available resource amounts;
the generation module 602 is then specifically configured to generate the resource topology relationship according to the structural characteristics, the communication characteristics, and the resource characteristics.
In a possible implementation, the generation module 602 is configured to:
cluster, according to the network topology relationship and the set of available resource amounts, the data centers with available resources among the multiple data centers to obtain at least one clustering result;
generate the resource topology relationship according to the set of available resource amounts and the at least one clustering result.
Since the scheduler 600 shown in Figure 6 corresponds to the method executed by the scheduler 200 shown in Figure 2, the specific implementation of the scheduler 600 shown in Figure 6 and its technical effects can be found in the relevant descriptions of the foregoing embodiments and are not repeated here.
Figure 7 is a schematic diagram of a scheduler 700 provided by this application; the scheduler 700 can implement the functions of the scheduler 200 in the embodiment shown in Figure 2.
As shown in Figure 7, the scheduler 700 includes a processor 701, a memory 702, and a communication interface 703, which communicate through a bus 704; communication can also be achieved through other means such as wireless transmission. The memory 702 is used to store instructions, and the processor 701 is used to execute the instructions stored in the memory 702. Further, the scheduler 700 may also include a memory unit 705, which may be connected to the processor 701, the memory 702, and the communication interface 703 through the bus 704. The memory 702 stores program code, and the processor 701 can call the program code stored in the memory 702 to perform the following operations:
obtain the network topology relationship of the computing power network, where the network topology relationship indicates the connection relationships and communication overheads between the multiple data centers included in the computing power network;
obtain the set of available resource amounts of the multiple data centers, where the set includes one or more available resource amounts, each indicating the resource statistics result of one data center with available resources among the multiple data centers;
generate a resource topology relationship according to the network topology relationship and the set of available resource amounts, where the resource topology relationship indicates the clustering results of the data centers with available resources among the multiple data centers, and the communication overhead between different data centers belonging to the same clustering result is less than the communication overhead between data centers belonging to different clustering results;
schedule, according to the resource topology relationship, multiple sub-jobs in the job to be scheduled to multiple data centers belonging to the same clustering result.
It should be understood that, in the embodiments of this application, the processor 701 may be a CPU, or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor or any conventional processor.
The memory 702 may include read-only memory and random access memory, and provides instructions and data to the processor 701. The memory 702 may also include non-volatile random access memory; for example, the memory 702 may also store device type information.
The memory 702 may be a volatile memory or a non-volatile memory, or may include both. The non-volatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable programmable read-only memory (erasable PROM, EPROM), an electrically erasable programmable read-only memory (electrically EPROM, EEPROM), or flash memory. The volatile memory may be a random access memory (RAM), which is used as an external cache. By way of example but not limitation, many forms of RAM are available, such as static random access memory (static RAM, SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (double data rate SDRAM, DDR SDRAM), enhanced synchronous dynamic random access memory (enhanced SDRAM, ESDRAM), synchlink dynamic random access memory (synchlink DRAM, SLDRAM), and direct rambus random access memory (direct rambus RAM, DR RAM).
The communication interface 703 is used to communicate with other devices connected to the scheduler 700. In addition to a data bus, the bus 704 may also include a power bus, a control bus, a status signal bus, and the like; however, for clarity of illustration, the various buses are all labeled as bus 704 in the figure.
It should be understood that the scheduler 700 according to the embodiments of this application may correspond to the scheduler 600 in the embodiments of this application, and may correspond to the scheduler 200 executing the method shown in Figure 2 according to the embodiments of this application; the above and other operations and/or functions implemented by the scheduler 700 are respectively intended to implement the corresponding processes of the method in Figure 2 and, for brevity, are not described again here.
An embodiment of this application also provides a processor connected to a memory; the processor is configured to execute the instructions in the memory, so that the processor performs the job scheduling method executed by the scheduler 200 in the embodiment shown in Figure 2.
An embodiment of this application also provides a computer-readable storage medium. The computer-readable storage medium may be any available medium that a computing device can store, or a data storage device such as a data center containing one or more available media. The available media may be magnetic media (e.g., floppy disks, hard disks, magnetic tapes), optical media (e.g., DVDs), or semiconductor media (e.g., solid state drives). The computer-readable storage medium includes instructions that instruct a computing device to perform the job scheduling method described above.
An embodiment of this application also provides a computer program product. The computer program product includes one or more computer instructions; when the computer instructions are loaded and executed on a computing device, the processes or functions according to the embodiments of this application are produced in whole or in part.
The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, or data center to another website, computer, or data center in a wired (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless (such as infrared, radio, or microwave) manner.
The computer program product may be a software installation package; when any of the foregoing job scheduling methods needs to be used, the computer program product can be downloaded and executed on a computing device.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When software is used, the above embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions; when the computer program instructions are loaded or executed on a computer, the processes or functions according to the embodiments of this application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless (such as infrared, radio, or microwave) manner. The computer-readable storage medium may be any available medium that a computer can access, or a data storage device such as a server or a data center containing one or more sets of available media. The available media may be magnetic media (e.g., floppy disks, hard disks, magnetic tapes), optical media (e.g., DVDs), or semiconductor media; the semiconductor media may be solid state drives.
The above are only specific implementations of this application, but the protection scope of this application is not limited thereto. Any person skilled in the art can easily conceive of various equivalent modifications or substitutions within the technical scope disclosed in this application, and such modifications or substitutions shall all fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (10)

  1. A job scheduling method, wherein the method is applied to a computing power network and comprises:
    obtaining the network topology relationship of the computing power network, wherein the network topology relationship indicates the connection relationships and communication overheads between multiple data centers included in the computing power network;
    obtaining a set of available resource amounts of the multiple data centers, wherein the set of available resource amounts comprises one or more available resource amounts, and an available resource amount indicates the resource statistics result of one data center with available resources among the multiple data centers;
    generating a resource topology relationship according to the network topology relationship and the set of available resource amounts, wherein the resource topology relationship indicates the clustering results of the data centers with available resources among the multiple data centers, and the communication overhead between different data centers belonging to the same clustering result is less than the communication overhead between data centers belonging to different clustering results;
    scheduling, according to the resource topology relationship, multiple sub-jobs in a job to be scheduled to multiple data centers belonging to the same clustering result.
  2. The method according to claim 1, wherein generating the resource topology relationship according to the network topology relationship and the set of available resource amounts comprises:
    extracting, according to the network topology relationship and the set of available resource amounts, structural characteristics and communication characteristics of the data centers with available resources among the multiple data centers, wherein the structural characteristics indicate the connection relationships between the data centers with available resources among the multiple data centers, and the communication characteristics indicate the magnitude of the communication overhead between the data centers with available resources among the multiple data centers;
    generating the resource topology relationship according to the structural characteristics and the communication characteristics.
  3. The method according to claim 2, wherein before generating the resource topology relationship, the method further comprises:
    extracting, according to the set of available resource amounts, resource characteristics of the data centers with available resources among the multiple data centers;
    and generating the resource topology relationship according to the structural characteristics and the communication characteristics comprises:
    generating the resource topology relationship according to the structural characteristics, the communication characteristics, and the resource characteristics.
  4. The method according to any one of claims 1 to 3, wherein generating the resource topology relationship according to the network topology relationship and the set of available resource amounts comprises:
    clustering, according to the network topology relationship and the set of available resource amounts, the data centers with available resources among the multiple data centers to obtain at least one clustering result;
    generating the resource topology relationship according to the set of available resource amounts and the at least one clustering result.
  5. A scheduler, wherein the scheduler is applied to a computing power network and comprises:
    an information obtaining module, configured to obtain the network topology relationship of the computing power network, wherein the network topology relationship indicates the connection relationships and communication overheads between multiple data centers included in the computing power network, and to obtain a set of available resource amounts of the multiple data centers, wherein the set of available resource amounts comprises one or more available resource amounts, and an available resource amount indicates the resource statistics result of one data center with available resources among the multiple data centers;
    a generation module, configured to generate a resource topology relationship according to the network topology relationship and the set of available resource amounts, wherein the resource topology relationship indicates the clustering results of the data centers with available resources among the multiple data centers, and the communication overhead between different data centers belonging to the same clustering result is less than the communication overhead between data centers belonging to different clustering results;
    a scheduling module, configured to schedule, according to the resource topology relationship, multiple sub-jobs in a job to be scheduled to multiple data centers belonging to the same clustering result.
  6. The scheduler according to claim 5, wherein the generation module is configured to:
    extract, according to the network topology relationship and the set of available resource amounts, structural characteristics and communication characteristics of the data centers with available resources among the multiple data centers, wherein the structural characteristics indicate the connection relationships between the data centers with available resources among the multiple data centers, and the communication characteristics indicate the magnitude of the communication overhead between the data centers with available resources among the multiple data centers;
    generate the resource topology relationship according to the structural characteristics and the communication characteristics.
  7. The scheduler according to claim 6, wherein the generation module is further configured to extract, before generating the resource topology relationship, resource characteristics of the data centers with available resources among the multiple data centers according to the set of available resource amounts;
    and the generation module is specifically configured to generate the resource topology relationship according to the structural characteristics, the communication characteristics, and the resource characteristics.
  8. The scheduler according to any one of claims 5 to 7, wherein the generation module is configured to:
    cluster, according to the network topology relationship and the set of available resource amounts, the data centers with available resources among the multiple data centers to obtain at least one clustering result;
    generate the resource topology relationship according to the set of available resource amounts and the at least one clustering result.
  9. A processor, wherein the processor is connected to a memory, and the processor is configured to execute instructions stored in the memory, so that the processor performs the steps of the method according to any one of claims 1 to 4.
  10. A scheduler, comprising a processor and a memory;
    the processor is configured to execute instructions stored in the memory, so that the scheduler performs the steps of the method according to any one of claims 1 to 4.
PCT/CN2023/101231 2022-08-29 2023-06-20 Job scheduling method, scheduler, and related devices WO2024045784A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211041424.3A CN117667327A (zh) 2022-08-29 2022-08-29 Job scheduling method, scheduler, and related devices
CN202211041424.3 2022-08-29

Publications (1)

Publication Number Publication Date
WO2024045784A1 true WO2024045784A1 (zh) 2024-03-07

Family

ID=90071897

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/101231 WO2024045784A1 (zh) 2022-08-29 2023-06-20 Job scheduling method, scheduler, and related devices

Country Status (2)

Country Link
CN (1) CN117667327A (zh)
WO (1) WO2024045784A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118069374A (zh) * 2024-04-18 2024-05-24 Tsinghua University Method, apparatus, device, and medium for accelerating intelligent training simulation transactions in a data center

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113448721A (zh) * 2020-03-27 2021-09-28 China Mobile Communication Co., Ltd. Research Institute Network system for computing power processing and computing power processing method
CN113535399A (zh) * 2021-07-15 2021-10-22 University of Electronic Science and Technology of China NFV resource scheduling method, apparatus, and system
CN113590301A (zh) * 2021-09-30 2021-11-02 Suzhou Inspur Intelligent Technology Co., Ltd. Task scheduling method for deep learning services and related apparatus
CN113810977A (zh) * 2020-06-11 2021-12-17 China Mobile Communication Co., Ltd. Research Institute Method, system, node, and medium for generating a computing power topology
CN114039858A (zh) * 2021-10-25 2022-02-11 China United Network Communications Group Co., Ltd. Computing-network resource convergence method, apparatus, device, and storage medium
CN114500521A (zh) * 2020-11-13 2022-05-13 China Mobile Communication Co., Ltd. Research Institute Computing power scheduling method, apparatus, scheduling device, system, and storage medium
CN114756340A (zh) * 2022-03-17 2022-07-15 China United Network Communications Group Co., Ltd. Computing power scheduling system, method, apparatus, and storage medium

Also Published As

Publication number Publication date
CN117667327A (zh) 2024-03-08

Similar Documents

Publication Publication Date Title
Wang et al. Resource-efficient federated learning with hierarchical aggregation in edge computing
US11681547B2 (en) File operation task optimization
US11429566B2 (en) Approach for a controllable trade-off between cost and availability of indexed data in a cloud log aggregation solution such as splunk or sumo
WO2024045784A1 (zh) 作业调度方法、调度器及相关设备
WO2023125493A1 (zh) 资源管理方法、装置及资源管理平台
WO2022007781A1 (zh) 任务处理方法、边缘计算设备、计算机设备和介质
Jiang et al. Towards max-min fair resource allocation for stream big data analytics in shared clouds
WO2024114484A1 (zh) 一种服务器无感知计算自适应资源调度方法、系统及计算机设备
WO2023087658A1 (zh) 一种任务调度方法、装置、设备及可读存储介质
CN114679451B (zh) 面向边缘计算的服务调度系统及其调度方法
CN112235344A (zh) 一种面向分布式机器学习的稀疏通信模型的实现方法
CN113900810A (zh) 分布式图处理方法、系统及存储介质
Al-Sinayyid et al. Job scheduler for streaming applications in heterogeneous distributed processing systems
CN117076133A (zh) 云游戏平台异构资源分配方法、计算机装置及存储介质
CN110515716B (zh) 一种支持优先级和反亲和的云优化调度方法及系统
CN115562841B (zh) 一种云视频服务自适应资源调度系统和方法
WO2022161081A1 (zh) 集成学习模型的训练方法、装置、系统和相关设备
Shahid et al. Some new observations on slo-aware edge stream processing
CN114443293A (zh) 一种大数据平台的部署系统及方法
WO2023169408A1 (zh) 资源调度方法、装置及相关设备
Yuan et al. Fairness-aware scheduling algorithm for multiple DAGs based on task replication
Li et al. A Traffic Classification Method for Data Centers Based on Triangular Modulus Fusion Operator
CN117407464A (zh) 跨集群的数据库扩容方法、装置、电子设备及存储介质
Verma et al. A review: intelligent load prediction techniques for CloudIoT
Wen et al. Research of Grid Scheduling Algorithm Based on P2P_Grid Model

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23858823

Country of ref document: EP

Kind code of ref document: A1