WO2019179250A1 - Scheduling method, scheduler, storage medium and system - Google Patents

Scheduling method, scheduler, storage medium and system

Info

Publication number
WO2019179250A1
Authority
WO
WIPO (PCT)
Prior art keywords
job request
node
job
data
request
Prior art date
Application number
PCT/CN2019/074017
Other languages
English (en)
French (fr)
Inventor
徐聪
刘浏
张海波
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority to EP19770974.4A priority Critical patent/EP3770774B1/en
Publication of WO2019179250A1 publication Critical patent/WO2019179250A1/zh
Priority to US17/021,425 priority patent/US11190618B2/en

Links

Images

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/90 - Details of database functions independent of the retrieved data types
    • G06F 16/903 - Querying
    • G06F 16/9032 - Query formulation
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 - Network arrangements or protocols for supporting network services or applications
    • H04L 67/50 - Network services
    • H04L 67/60 - Scheduling or organising the servicing of application requests, e.g. requests for application data transmissions using the analysis and optimisation of the required network resources
    • H04L 67/63 - Routing a service request depending on the request content or context
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/90 - Details of database functions independent of the retrieved data types
    • G06F 16/903 - Querying
    • G06F 16/90335 - Query processing
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 17/00 - Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 - Arrangements for program control, e.g. control units
    • G06F 9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 - Multiprogramming arrangements
    • G06F 9/48 - Program initiating; Program switching, e.g. by interrupt
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 - Arrangements for program control, e.g. control units
    • G06F 9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 - Multiprogramming arrangements
    • G06F 9/48 - Program initiating; Program switching, e.g. by interrupt
    • G06F 9/4806 - Task transfer initiation or dispatching
    • G06F 9/4843 - Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F 9/4881 - Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues

Definitions

  • the present disclosure relates to the field of communications technologies, and in particular, to a scheduling method, a scheduler, a storage medium, and a system.
  • when a big data application is implemented, it is necessary to build a computing cluster of a certain scale and to dynamically adjust the computing resources inside the cluster according to demand.
  • big data applications are usually deployed in cloud computing systems, through which unified and more flexible management and scheduling can be realized.
  • while cloud computing systems bring many conveniences for big data applications, they also face many problems to be solved. Among these, users are most concerned about performance, that is, the overall execution time of jobs in the cloud computing system.
  • FIG. 1 is a schematic diagram of the architecture of a cloud computing system including a job layer and a platform layer. The job layer includes a scheduler, and the platform layer includes a plurality of servers, each of which can serve as a storage node for storing data or as a compute node that performs calculations based on the stored data.
  • the cloud computing system usually adopts a job-layer scheduling policy: a terminal submits job requests, and the job-layer scheduler adds the received job requests to a scheduling queue and sorts the job requests in the scheduling queue according to a preset sorting policy.
  • each job request is then dispatched in order to a server, and the scheduled server performs the calculation according to the job request and the data corresponding to that job request.
  • the storage location of the data is fixed.
  • that is, the cloud computing system uses a "fixed data, moving computation" scheduling mode.
  • the calculation process is therefore affected by the distribution of the data. For example, if the data required by a job request is distributed across different servers, a server performing the calculation according to that job request needs to fetch the required data from the other servers, which leads to cross-node data transfers.
  • since the storage location of the data is fixed in the above scheduling process, the scheduling effect depends entirely on the initial data distribution. If the initial data distribution is poor, there will be a large amount of cross-node data transmission during calculation, the execution time of jobs will be too long, and the calculation efficiency will be low, which degrades the performance of the cloud computing system.
  • the present disclosure provides a scheduling method, a scheduler, a storage medium, and a system to solve the above problems.
  • the technical solution is as follows:
  • a scheduling method for use in a scheduler of a cloud computing system, the cloud computing system including the scheduler and a plurality of servers, the plurality of servers being configured to store data, the method comprising:
  • association graph includes a plurality of nodes and at least one directed edge
  • each node refers to a piece of data
  • each directed edge has a source node and a destination node
  • each directed edge represents a job request that computes the data of its destination node based on the data of its source node
  • the data distribution information includes a server where each piece of data is located;
  • the job request corresponding to the node refers to a job request indicated by the directed edge of the node as the source node;
  • At least one job request represented by the at least one directed edge is sequentially scheduled to the located server.
  • the preset node sorting policy is a policy for sorting in descending order of degree, where the degree of a node refers to the number of directed edges connected to that node.
  • traversing the nodes in the association graph according to the preset node sorting policy based on the association graph and the data distribution information, and sequentially locating the job request corresponding to each traversed node to any server where the data referred to by the traversed node is located, including:
  • before the obtaining of the association graph and the data distribution information, the method further includes:
  • the scheduling of the at least one job request represented by the at least one directed edge to the located servers in sequence includes: sequentially scheduling the at least one job request to the located servers according to the execution round of each job request.
  • the method further includes:
  • each job request includes a map task, a shuffle task, and a reduce task, and the calculating of the job execution time of each job request under the current cluster size includes:
  • L_j(r) represents the job execution time of the j-th job request
  • r represents the number of racks currently allocated to the j-th job request
  • k represents the number of servers in one rack
  • λ_map represents the average processing efficiency of the map task
  • λ_reduce represents the average processing efficiency of the reduce task
  • the calculating of the job execution time of each job request under the current cluster size and the increasing of the cluster size of the job request with the longest job execution time continue iteratively.
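The iterative sizing loop described above can be sketched as follows. This is a minimal sketch under stated assumptions: `exec_time(job, r)` stands in for the L_j(r) model (the concrete formula is given in the disclosure but is not reproduced here), and exhausting a fixed rack budget `total_racks` is an assumed stopping condition, since the stopping criterion is truncated in this text.

```python
def adjust_cluster_sizes(jobs, total_racks, exec_time):
    """Greedily grow the rack allocation of the job with the longest
    estimated execution time until all racks are allocated.
    exec_time(job, r) is a caller-supplied model for L_j(r)."""
    alloc = {j: 1 for j in jobs}           # initialize each job to one rack
    while sum(alloc.values()) < total_racks:
        slowest = max(jobs, key=lambda j: exec_time(j, alloc[j]))
        alloc[slowest] += 1                # give the bottleneck job one more rack
    return alloc
```

Growing only the current bottleneck each step matches the text's "increase the cluster size of the job request with the longest job execution time" and tends to equalize per-job execution times.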
  • the method further includes:
  • the relationship chain is a set consisting of a first node and the nodes and directed edges passed on the path from any second node to the first node
  • the first node is a node that has no directed edge with itself as a source node
  • the second node is a node that has at least two directed edges with itself as a source node, or a node that has a directed edge with itself as a source node and no directed edge with itself as a destination node; the length of a relationship chain is determined by the number of directed edges the relationship chain contains;
  • the data corresponding to the specified job request is located to servers whose number matches the cluster size of the specified job request, and, according to the cluster sizes of the job requests represented by the other directed edges in the relationship chain with the largest current length, the data corresponding to those other job requests is located to servers among those located for the specified job request whose number matches the cluster sizes of the other job requests;
  • the method further includes:
  • the log record is updated, and the log record includes a job request record and a data processing record;
  • the association graph is updated according to the updated log record.
  • the number of requests for each directed edge in the association graph refers to the number of corresponding job requests
  • the weight of each directed edge refers to the execution frequency of the corresponding job request
  • the weight of each directed edge in the association graph is updated according to the updated execution frequency of each job request.
  • a scheduler for use in a cloud computing system, the cloud computing system including the scheduler and a plurality of servers, the plurality of servers for storing data, the scheduler comprising:
  • An obtaining module configured to acquire an association graph and data distribution information, where the association graph includes a plurality of nodes and at least one directed edge, each node refers to a piece of data, each directed edge has a source node and a destination node, each directed edge points from its source node to its destination node, and each directed edge represents a job request that computes the data of its destination node based on the data of its source node.
  • a request positioning module configured to traverse the nodes in the association graph according to the preset node sorting policy based on the association graph and the data distribution information, and to sequentially locate the job request corresponding to each traversed node to any server where the data referred to by the traversed node is located
  • the job request corresponding to the node refers to a job request represented by the directed edge of the node as the source node;
  • a scheduling module configured to sequentially schedule at least one job request represented by the at least one directed edge into the located server.
  • the preset node sorting policy is a policy for sorting in descending order of degree
  • the degree of a node refers to the number of directed edges connected to that node
  • the request positioning module includes:
  • a determining unit configured to traverse the nodes in the association graph according to the preset node sorting policy, and to determine the node with the largest degree in the association graph as the traversed node;
  • a positioning unit configured to locate a job request corresponding to the traversed node to any server where the data referred to by the traversed node is located;
  • a deleting unit configured to delete the directed edges that use the traversed node as a source node
  • the determining unit, the positioning unit, and the deleting unit are further configured to continue to traverse the nodes in the association graph according to the preset node sorting policy, determine the node with the largest degree in the association graph as the traversed node, perform the step of locating the job request corresponding to the traversed node, and perform the step of deleting the corresponding directed edges, until the degree of every node in the association graph is not greater than 1.
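The determine/locate/delete loop above can be sketched as a greedy traversal. This is an illustrative sketch, not the patent's implementation: `graph` maps each data node to the list of `(job_id, destination_node)` edges leaving it, `data_location` maps a data node to the servers holding it, and a `visited` set (an added safeguard not in the text) guarantees termination even when the highest-degree node has no outgoing edges.

```python
def schedule_job_requests(graph, data_location):
    """Repeatedly pick the node with the highest degree, pin the job
    requests whose edges leave it to a server holding its data, then
    delete those edges, until every remaining degree is <= 1."""
    placements = {}
    visited = set()

    def degree(node):
        # degree = outgoing edges plus incoming edges touching the node
        out = len(graph.get(node, []))
        inc = sum(1 for es in graph.values() for (_, d) in es if d == node)
        return out + inc

    while True:
        nodes = ({n for n in graph}
                 | {d for es in graph.values() for (_, d) in es}) - visited
        if not nodes:
            break
        top = max(sorted(nodes), key=degree)   # sorted() for deterministic ties
        if degree(top) <= 1:
            break                              # all remaining degrees are <= 1
        server = data_location[top][0]         # any server holding the data
        for job_id, _dst in graph.get(top, []):
            placements[job_id] = server        # locate the job where its input lives
        graph[top] = []                        # delete edges with top as source
        visited.add(top)
    return placements
```

Locating a job on a server that already holds its input data is what removes the cross-node transfer the background section complains about.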
  • the scheduler further includes:
  • a log obtaining module configured to acquire a log record, where the log record includes a job request record and a data processing record;
  • a model building module configured to construct a random queuing model according to the job request record, and determine input data and output data of each job request in the job request record based on the random queuing model;
  • a data association model building module configured to construct a data association model according to the data processing record, where the data association model includes a plurality of nodes, each node referring to one piece of data in the data processing record; determining the source node corresponding to the input data of each job request and the destination node corresponding to the output data, and adding, in the data association model, a directed edge pointing from the source node to the destination node, to obtain the association graph.
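The graph-construction step above can be sketched as follows. This assumes each log record has already been reduced (e.g., via the random queuing model mentioned in the text) to a job id plus its input and output data blocks; the field names are illustrative, not from the patent.

```python
def build_association_graph(job_records):
    """Build (nodes, edges) for the association graph: one node per data
    block, one directed edge per (input, output) pair, labelled with the
    job request that links them."""
    nodes, edges = set(), []
    for rec in job_records:
        for src in rec["inputs"]:
            nodes.add(src)
            for dst in rec["outputs"]:
                nodes.add(dst)
                edges.append((src, dst, rec["job_id"]))
    return nodes, edges
```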
  • the scheduler further includes:
  • a round number calculation module configured to calculate a number of execution rounds W required to complete the execution of the plurality of job requests in the job request record;
  • a priority calculation module configured to calculate a priority of each job request according to a job execution frequency and a job execution time of each job request, the priority being positively correlated with the job execution frequency and the job execution time;
  • a round determination module for sorting the plurality of job requests in descending order of priority, and determining the execution round of both the (nW+m)-th job request and the ((n+2)W+1-m)-th job request to be m, where m is a positive integer not greater than W and n is an integer;
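Reading n as ranging over even integers (the only reading under which the two index expressions agree), the assignment above is a serpentine sweep over priority-ordered positions. A minimal sketch under that assumption:

```python
def assign_rounds(num_jobs, W):
    """Jobs are indexed 1..num_jobs in descending priority order.
    Even blocks of W positions get rounds 1..W, odd blocks get W..1,
    so the (nW+m)-th and ((n+2)W+1-m)-th jobs both land in round m."""
    rounds = {}
    for pos in range(1, num_jobs + 1):
        block = (pos - 1) // W        # which consecutive block of W positions
        offset = (pos - 1) % W        # position within the block
        rounds[pos] = offset + 1 if block % 2 == 0 else W - offset
    return rounds
```

The serpentine order pairs a high-priority job with a low-priority one in each round, which is what keeps the per-round overall sizes approximately equal, as the advantage passages later claim.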
  • the scheduling module is further configured to sequentially schedule the at least one job request to the located server according to an execution round of each job request.
  • the scheduler further includes:
  • a cluster size setting module configured to initialize a cluster size of each job request, where the number of servers occupied by the corresponding data of the job request is positively related to the cluster size of the job request;
  • the cluster size adjustment module is used to calculate the job execution time of each job request under the current cluster size and to increase the cluster size of the job request with the longest job execution time, continuing the calculating and increasing iteratively.
  • each job request includes a map task, a shuffle task, and a reduce task
  • the cluster size adjustment module is further configured to calculate the job execution time of any job request at the current cluster size by using the following formula:
  • L_j(r) represents the job execution time of the j-th job request
  • r represents the number of racks currently allocated to the j-th job request
  • k represents the number of servers in one rack
  • λ_map represents the average processing efficiency of the map task
  • λ_reduce represents the average processing efficiency of the reduce task
  • the scheduler further includes:
  • a sorting module configured to determine the relationship chain with the largest current length in the association graph, where the relationship chain is a set consisting of a first node and the nodes and directed edges passed on the path from any second node to the first node
  • the first node is a node that has no directed edge with itself as a source node
  • the second node is a node that has at least two directed edges with itself as a source node, or a node that has a directed edge with itself as a source node and no directed edge with itself as a destination node; the length of each relationship chain is determined by the number of directed edges the relationship chain contains.
  • the job requests represented by the directed edges in the relationship chain with the largest current length are sorted in descending order of cluster size, and the job request with the largest cluster size is determined as the specified job request;
  • a data locating module configured to locate, according to the cluster size of the specified job request, the data corresponding to the specified job request to servers whose number matches the cluster size of the specified job request, and, according to the cluster sizes of the job requests represented by the other directed edges in the relationship chain with the largest current length, to locate the data corresponding to those other job requests to servers among those located for the specified job request whose number matches the cluster sizes of the other job requests;
  • a deleting module configured to delete the relationship chain with the largest current length
  • the sorting module, the data locating module, and the deleting module are further configured to continue to determine the relationship chain with the largest current length in the association graph and to perform the data locating step on the determined relationship chain, until the data locating corresponding to the job requests represented by all directed edges in the association graph is completed.
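The chain-by-chain placement loop can be sketched as follows. This is an illustrative sketch only: chain extraction from the graph is omitted, chains are given directly as lists of job requests, and server counts are assumed to equal cluster sizes; none of these names come from the patent.

```python
def locate_by_longest_chain(chains, cluster_size, servers):
    """Process relationship chains longest first; within a chain, place the
    job with the largest cluster size first, then co-locate the remaining
    jobs on a prefix of the servers the anchor job was given."""
    placement = {}
    for chain in sorted(chains.values(), key=len, reverse=True):
        jobs = sorted(chain, key=lambda j: cluster_size[j], reverse=True)
        anchor = jobs[0]                                # largest cluster size
        placement[anchor] = servers[:cluster_size[anchor]]
        for job in jobs[1:]:
            # co-locate on servers already located for the anchor job
            placement[job] = placement[anchor][:cluster_size[job]]
    return placement
```

Placing the largest job first and nesting the others inside its server set is what keeps the data of an entire chain on overlapping servers, reducing cross-node transfers along the chain.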
  • the scheduler further includes:
  • An update module configured to update a log record after the execution of the job request is completed, where the log record includes a job request record and a data processing record;
  • the update module is further configured to update the association relationship map according to the updated log record.
  • the number of requests for each directed edge in the association graph refers to the number of corresponding job requests
  • the weight of each directed edge refers to the execution frequency of the corresponding job request.
  • a first update unit configured to determine, according to the updated job request record, the updated plurality of job requests and the execution frequency after each job request is updated;
  • a second updating unit configured to update a directed edge in the association graph according to the updated multiple job requests, and update a number of requests for each directed edge
  • a third updating unit configured to update the weight of each directed edge in the association graph according to the updated execution frequency of each job request.
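The second and third updating units above can be sketched together. This is an assumed shape: the edge store maps `(src, dst)` to the jobs on that edge, and taking the weight as the summed execution frequency of those jobs is an interpretation of "the weight of each directed edge refers to the execution frequency of the corresponding job request", not a formula from the patent.

```python
def update_edges(edges, exec_freq):
    """Refresh each directed edge's request count and weight from the
    updated per-job execution frequencies."""
    for attrs in edges.values():
        attrs["requests"] = len(attrs["jobs"])          # number of job requests
        attrs["weight"] = sum(exec_freq[j] for j in attrs["jobs"])
    return edges
```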
  • a scheduler comprising at least one processor and a memory, wherein the memory stores at least one instruction that is loaded and executed by the at least one processor to implement the operations performed in the method described in the first aspect.
  • a fourth aspect: a computer readable storage medium having stored therein at least one instruction that is loaded and executed by a processor to implement the operations performed in the method described in the first aspect.
  • a computer program is provided, the computer program being executed by a processor or a computer to perform the operations performed in the method described in the first aspect.
  • a cloud computing system comprising a plurality of servers and a scheduler as described in the second aspect or the third aspect.
  • the method provided by the embodiments of the present disclosure expresses the association between pieces of data and the correspondence between job requests and data by using the association graph, represents the data distribution by the data distribution information, and traverses the nodes according to the association graph and the data distribution information.
  • the job request represented by a directed edge having the traversed node as its source node is located to any server where the data referred to by the traversed node is located, thereby realizing the positioning of job requests.
  • this makes full use of the characteristics of the platform layer and the job layer, breaks the information gap between cross-layer schedulers, strengthens the perception and association of cross-layer resources, reduces cross-node data transmission, shortens the overall job execution time, improves calculation efficiency and system performance, and optimally integrates all platform resources.
  • the association graph can represent the cross-layer dependency relationship between the job layer and the platform layer, and captures both the static characteristics of the data stream and the dynamic characteristics of the job stream.
  • based on the association graph, both the job requests and the data distribution can be fully considered, so that a reasonable scheduling policy can be formulated, the execution time of jobs reduced, the calculation efficiency improved, and the performance of the cloud computing system improved.
  • the association graph can be updated in time as the scheduling of job requests progresses, which ensures that the association graph matches changes in the job stream in time and that bursts of job requests can be responded to promptly.
  • the method can therefore be applied to scenarios with a relatively stable job flow as well as to scenarios with frequent job-flow bursts, such as node failures, network line changes, the joining of new types of applications, or sudden surges of job requests.
  • the data scheduling strategy is combined with the job scheduling strategy, which makes full use of the cross-layer relationship between the data and the job request, which can realize the collaborative scheduling of the job stream and the data stream, and realize the optimization and integration of the global resources. .
  • when the association relationships are more complicated and the relationship chains are longer, the complexity is higher and the influence of the data distribution on the overall performance of the system is greater. Therefore, the relationship chains are sorted by length and the relationship chain with the largest length is located first, which reduces the positioning complexity and avoids the influence of an unreasonable data distribution on subsequent data processing.
  • the plurality of job requests are sorted in descending order of priority, and the execution round of both the (nW+m)-th job request and the ((n+2)W+1-m)-th job request is determined to be m.
  • the at least one job request is sequentially scheduled to the located servers according to the execution round of each job request, which ensures that the priorities of job requests executed in different rounds do not differ much; after every 2W rounds of job requests are dispatched, the overall sizes of the job requests in each round are approximately equal, thereby approximately tiling the job requests across all racks of the cloud computing system.
  • the heuristic method is used to initialize the execution round of each job request, so that the job requests executed concurrently in each round can occupy the full cluster capacity, avoiding wasted computing power and improving execution efficiency.
  • FIG. 1 is a schematic structural diagram of a cloud computing system provided by related art
  • FIG. 2 is a schematic structural diagram of a cloud computing system according to an embodiment of the present disclosure
  • FIG. 3 is a schematic diagram of a scheduling policy provided by a related art
  • FIG. 4A is a schematic structural diagram of an end user portrait system according to an embodiment of the present disclosure.
  • 4B is a schematic structural diagram of a scheduler according to an embodiment of the present disclosure.
  • FIG. 5 is a flowchart of a scheduling method according to an embodiment of the present disclosure.
  • FIG. 6 is a schematic diagram of a process of creating an association diagram according to an embodiment of the present disclosure.
  • FIG. 7 is a schematic diagram of a process of generating and updating an association relationship diagram according to an embodiment of the present disclosure
  • FIG. 8 is a flowchart of a scheduling method according to an embodiment of the present disclosure.
  • FIG. 9 is a schematic diagram of relationship between cluster size and job execution time according to an embodiment of the present disclosure.
  • FIG. 10 is a flowchart of a scheduling method according to an embodiment of the present disclosure.
  • FIG. 11 is a schematic diagram of data positioning provided by an embodiment of the present disclosure.
  • FIG. 12 is a flowchart of a scheduling method according to an embodiment of the present disclosure.
  • FIG. 13 is a schematic diagram of job request positioning according to an embodiment of the present disclosure.
  • FIG. 14 is a schematic diagram of determining an execution round according to an embodiment of the present disclosure.
  • FIG. 15 is a flowchart of operations provided by an embodiment of the present disclosure.
  • 16 is a schematic diagram showing changes in a job request completion ratio of a terminal user portrait system in three scenarios provided by an embodiment of the present disclosure
  • 17 is a schematic diagram showing changes in the percentage of speed increase of the terminal user portrait system in the three schemes provided by the embodiments of the present disclosure.
  • FIG. 18 is a schematic structural diagram of a scheduler according to an embodiment of the present disclosure.
  • FIG. 19 is a schematic structural diagram of a scheduler according to an embodiment of the present disclosure.
  • the cloud computing system includes a job layer and a platform layer.
  • the job layer includes a scheduler 201
  • the platform layer includes a plurality of servers 202.
  • the plurality of servers 202 are used to store data as storage nodes, and are also used as calculation nodes to perform calculations based on data and job requests, and a plurality of servers 202 are connected to each other to perform data transmission.
  • the scheduler 201 is configured to schedule a received job request to one or more computing nodes (servers 202), so that the computing nodes perform calculations based on the data and the job request.
  • a job request can be submitted to the job layer, and the job request is used to perform calculation according to the corresponding data.
  • the scheduler 201 stores the received job request to the dispatch queue and dispatches it from the dispatch queue to the server 202, thereby performing calculations in the server 202 based on the scheduled job request and the data corresponding to the job request.
  • the scheduling strategies adopted include two types: a job-layer scheduling policy and a platform-layer scheduling policy.
  • the job-layer scheduling strategy adopts the "fixed data, moving computation" scheduling mode: the data storage location is fixed, the job-layer scheduler sorts the multiple job requests in the scheduling queue according to the preset sorting strategy and dispatches each job request to a server in turn in the sorted order, and the scheduled server performs the calculation according to the job request and the data corresponding to that job request.
  • the job layer scheduling strategy includes an association based ranking strategy or a weight based ranking strategy.
  • the job-layer scheduling strategy assumes that the distribution of data follows a default uniform distribution or a known distribution, and that the data distribution remains constant throughout the scheduling process; the real-time distribution of the underlying data is therefore not perceived during scheduling, and the scheduling effect is often affected by the initial distribution of the data. Moreover, because of the complex cross-layer relationship between data and job requests, the scheduling result is further affected: if the initial distribution of the data is poor, then even if the job-layer scheduling policy achieves load balancing across the servers, a large amount of cross-node data transmission still results, and the overall job execution efficiency still suffers. In a heterogeneous cluster environment, a pure job-layer scheduling strategy cannot achieve an ideal resource distribution.
  • the platform-layer scheduling strategy adopts the "fixed computation, moving data" scheduling mode: a data association relationship is introduced, it is determined which data will be read by the same job request at the same time during the execution of job requests, the associated data is identified, and the traditional evenly-distributed data scheme is changed so that associated data is deployed to the same server. This realizes dynamic adjustment of the data distribution, optimizes the degree of data locality under a specific workflow distribution, minimizes cross-node data transmission during the execution of job requests, and improves the performance of big data services.
  • the platform layer scheduling policy includes a scheduling policy based on data association, a scheduling policy based on data access frequency, or a scheduling policy based on device performance.
  • once the platform-layer scheduling policy determines the data distribution, the distribution is not adjusted frequently within a short period of time; this characteristic makes the platform-layer scheduling strategy inflexible.
  • the platform-layer scheduling policy is not aware of the state of the job stream, so it cannot be matched with job-stream changes in time, and the data distribution cannot be adjusted promptly when the job stream is bursty, which deteriorates the performance of big data services. It is therefore generally not applicable to scenarios where the workflow is bursty, but only to scenarios where user behavior, and hence the job flow, is relatively stable.
  • the embodiment of the present disclosure breaks the barrier between the platform layer and the scheduling policy of the job layer, and performs collaborative scheduling on the job stream and the data stream based on the cross-layer association relationship between the data and the job request, thereby further realizing the optimization and integration of the global resource. Improve execution efficiency.
  • compared with the job-layer scheduling policy, the embodiments of the present disclosure incorporate platform-layer factors and achieve better steady-state performance; compared with the platform-layer scheduling policy, the embodiments of the present disclosure consider the burst characteristics of the job stream and have better transient convergence.
  • the embodiment of the present disclosure makes full use of the cross-layer association relationship between the data and the job request, optimizes the data distribution, and reduces the communication overhead during the execution of the job.
  • the embodiments of the present disclosure are applied to an application scenario of a large-scale data center.
  • in a large-scale data center, multiple devices concurrently execute a big data service, multiple job requests are submitted, and data and jobs frequently interact with each other.
  • the scheduling policy of the embodiment of the present disclosure can uniformly plan and schedule the job request and the flow of data during the entire job execution process.
  • the process from the job request to the completion of the job request execution includes three stages: job request scheduling, data loading, and job request execution.
  • the work of the platform layer is in the data loading phase.
  • the research at the platform layer mainly provides the corresponding deployment strategy for the application data on the storage platform.
  • using the platform layer's optimized deployment scheme for data on the storage platform can improve the rate at which big data applications load data to different computing nodes and shorten the time span of the data loading phase.
  • the job layer is in the job request scheduling phase.
  • the job layer research mainly deals with job requests according to the performance requirements and constraints of different services.
  • the job request scheduling strategy of the job layer further plans the execution order of multiple job requests and the way the job requests are distributed to the data.
  • the embodiment of the present disclosure can be applied to an end user portrait system.
  • the user data needs to be stored in formats such as Hadoop Distributed File System (HDFS) files, Hive, and Redis (a key-value storage system) indexes, and data flow is very frequent.
  • the Spark SQL computing cluster (a Spark component used to process structured data) performs big data operations such as table item generation, label generation, and application recommendation; a complex relationship between applications and data is therefore formed.
  • the HDFS includes a NameNode (name node) and a DataNode (data node), and the Spark SQL computing cluster includes a Master node (a master node) and a Worker node (a working node).
  • the scenario is suitable for unified scheduling management of data streams and job streams by using the method provided by the embodiments of the present disclosure.
  • the scheduling method of the embodiment of the present disclosure can be implemented by extending a classic scheduler of an existing big data system, such as Apache Hadoop YARN or Apache Mesos: the job request scheduling function of the traditional scheduler is mainly retained as one sub-module of the new scheduler, characteristics of the data plane are additionally added, and a sub-module with a data deployment function is introduced.
  • the two sub-modules do not interfere with each other and work independently, and a cross-layer fine-grained association graph introduced at the bottom layer closely associates the information used by the two sub-modules, so that the operation of the two sub-modules can be planned uniformly, coordinated scheduling of the data stream and the job stream is realized, and globally optimal combined scheduling is achieved.
  • for example, the embodiment of the present disclosure can be implemented by modifying and extending the Apache Hadoop YARN scheduler.
  • the specific process of the embodiments of the present disclosure is detailed in the following examples.
  • FIG. 5 is a flowchart of a scheduling method according to an embodiment of the present disclosure. The flowchart is applied to the scheduler shown in the foregoing embodiment. Referring to Figure 5, the method includes:
  • the log record includes a job request record and a data processing record.
  • the job request record includes a job request that needs to be executed during the execution of the job.
  • the job request record may include different types of job requests, such as an application recommendation request, a webpage browsing request, and so on.
  • the job request record may further include a request frequency of each job request, an execution time of each job request, etc., and these parameters may be obtained by statistically calculating a history execution process of each job request.
  • the data processing record includes a plurality of pieces of data; according to the data processing record, not only can the previously processed data be determined, but that data can also be regarded as the data required during execution of the job.
  • the scheduler Each time the scheduler receives a job request, the scheduler stores the job request in a job request record, such as storing the received job request in a dispatch queue. Also, data processing records can be generated based on the data processed during job execution.
  • after reading the job request record, the scheduler can abstractly model the execution process of job requests according to the job request record and construct a random queuing model, which can simulate the process in which multiple job requests wait for execution. Based on the random queuing model, the input data and output data of each job request can be determined, where the input data refers to the data required for execution of the job request, and the output data refers to the data obtained by computing on the input data according to the job request.
  • the scheduler can treat each piece of data in the data processing record as a node, thereby constructing a data association model including a plurality of nodes; the data association model can represent the association relationships between the data. In the data association model, each job request corresponds to a directed edge from the node of its input data to the node of its output data, that is, a directed edge represents the job request by which the input data corresponding to the source node is calculated to obtain the output data corresponding to the destination node, so the directed edge can represent the relationship between the input data and the output data.
  • after the directed edges are added to the data association model, the association graph is obtained. The association graph includes a plurality of nodes and at least one directed edge; each node refers to a piece of data, each directed edge has a source node and a destination node, the directed edge points from its source node to its destination node, and each directed edge is used to represent the job request by which the data pointed to by the source node is calculated to obtain the data pointed to by the destination node. The association graph can thus integrate the data stream and the job stream and can represent the relationship between data and job requests.
  • the abstraction modeling can be performed according to the data processing record to obtain a Directed Acyclic Graph (DAG) model, and the association graph is constructed according to the DAG model and the job request record.
  • for each job request, the corresponding directed edge is added to the data association model; according to the topology of the nodes in the model, the multiple job requests are aggregated on the same nodes to form an initial association graph. A request count and a weight are then set for each directed edge in the association graph, where the request count refers to the execution frequency of the job request corresponding to the directed edge and can be determined according to the execution frequency of that job request in the job request record.
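  • The construction of the association graph described above can be sketched in a few lines. The sketch below is illustrative rather than taken from the patent: the record format (the `name`, `input_data`, `output_data`, `frequency`, and `exec_time` fields) and the choice of weight (frequency times execution time) are assumptions.

```python
def build_association_graph(job_requests):
    """Build the cross-layer association graph: each piece of data is a
    node, and each directed edge (src -> dst) represents the job request
    that computes the destination data from the source data, annotated
    with a request count (execution frequency) and a weight."""
    nodes, edges = set(), {}
    for job in job_requests:
        for src in job["input_data"]:
            for dst in job["output_data"]:
                nodes.update((src, dst))
                edge = edges.setdefault(
                    (src, dst), {"job": job["name"], "requests": 0, "weight": 0.0})
                edge["requests"] += job["frequency"]
                # assumed weight: execution frequency times execution time
                edge["weight"] += job["frequency"] * job["exec_time"]
    return nodes, edges
```

A job request record entry such as `{"name": "J1", "input_data": ["B1"], "output_data": ["B2"], ...}` then yields an edge B1 → B2 carrying J1's request count and weight.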
  • for example, a Markov chain is created according to the job request record, in which the different states of the scheduling queue are used as different nodes, the number in each node refers to the number of job requests in the scheduling queue, λ indicates the arrival rate of job requests, μ indicates the processing rate of job requests, and Δt indicates the time interval between different states; the Markov chain can represent the job stream.
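  • The scheduling-queue chain can be illustrated as a birth-death Markov chain. The following sketch is not from the patent; it assumes the usual discretization in which, over a small interval Δt, an arrival occurs with probability λ·Δt and a departure with probability μ·Δt (this requires (λ+μ)·Δt ≤ 1 for the probabilities to be valid).

```python
def queue_transition_probs(n, lam, mu, dt, max_len):
    """One-step transition probabilities from state n (number of job
    requests in the scheduling queue): arrival with probability lam*dt,
    departure with probability mu*dt (only for a non-empty queue),
    otherwise the state stays unchanged."""
    probs = {}
    if n < max_len:
        probs[n + 1] = lam * dt   # a job request arrives
    if n > 0:
        probs[n - 1] = mu * dt    # a job request finishes processing
    probs[n] = 1.0 - sum(probs.values())  # no state change
    return probs
```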
  • a data association graph is generated, wherein the data association graph uses different data as different nodes, and the edge between the node and the node represents a job request for calculating data corresponding to the destination node from data corresponding to the source node.
  • an association relationship diagram may be generated, in which different data are different nodes, and the directed edges corresponding to different job requests are connected on the same node, Each directed edge has a weight, and the weight indicates the execution efficiency of the corresponding job request.
  • the method provided by the embodiment of the present disclosure may generate an association relationship graph according to the job request record and the data processing record. The association graph can represent the cross-layer dependency relationship between the job layer and the platform layer, and carries both the static characteristics of the data stream and the dynamic characteristics of the job stream. According to the association graph, the scheduling of job requests and the data distribution can be fully considered together, so that a reasonable scheduling strategy can be formulated, job execution time reduced, computing efficiency improved, and the performance of the cloud computing system thereby improved.
  • the scheduler may update the log record and update the association diagram according to the updated log record.
  • the update process includes: determining, according to the updated job request record, the updated plurality of job requests and the updated execution frequency of each job request. The updated plurality of job requests may be the job requests remaining after removing those that have been executed, and the execution frequency of some remaining job requests may have changed during this job execution, so their frequencies are updated; alternatively, when the update is performed after the job execution process ends, the updated plurality of job requests may be the job requests that need to be executed in the next job execution process.
  • accordingly, the directed edges in the association graph are updated: the request count of each directed edge is updated, and the weight of each directed edge is updated according to the updated execution frequency of each job request.
  • the above update process may be performed periodically, or after each job execution process ends. For example, when the current cycle ends but the current job execution process has not yet ended and there are still uncompleted job requests, the job requests and execution frequencies in the job request record are updated, the job request record is re-read, the weights of the directed edges in the association graph are updated, and the updated association graph is applied in the remaining execution process; then, when the current job execution ends, the association graph is updated according to the execution status of each job request during the execution of the job.
  • the process of generating and updating the association graph may be as shown in FIG. 7. The association graph is periodically updated during the entire job execution period through the above update process, so that the directed-edge information of the cross-layer association graph can be updated in time as scheduling progresses, responding promptly to bursts of job requests.
  • the weight of a directed edge is valid for one update period; within that period, the scheduler considers the workflow of each application to remain unchanged.
  • FIG. 8 is a flowchart of a scheduling method according to an embodiment of the present disclosure.
  • the embodiment of the present disclosure describes the process of determining the cluster size of a job request.
  • this sizing process can be performed when data is uploaded to the cloud computing system. Referring to Figure 8, the method includes:
  • the number of servers occupied by the data corresponding to the job request is positively related to the cluster size of the job request.
  • the number of servers occupied by the data corresponding to the job request may be determined to be equal to the cluster size, that is, the cluster size is the number of servers occupied by the data corresponding to the job request; alternatively, the cluster size r may represent a number of racks, with each rack containing k servers, so that the number of servers occupied by the data corresponding to the job request is rk.
  • the data corresponding to each job request can constitute multiple copies of data and are deployed in multiple servers.
  • when the number of deployed servers is large, cross-node data transmission can be reduced during execution of the job request, thereby reducing the job execution time of the job request. Therefore, for each individual job request, the larger the cluster size, the shorter the job execution time; from a global perspective, however, the job execution time of every job request must be considered comprehensively in order to reasonably determine the cluster size of each job request.
  • the scheduler can first initialize the cluster size of each job request, and then adjust the cluster size of some job requests.
  • the cluster size of each job request in the job request record may be set to a first value when initializing, and the first value may be a positive integer, and may be 1 or other values.
  • the scheduler can first simulate the execution process of each job request according to the cluster size of each job request, and determine the job execution time of each job request under the current cluster size. For the job request with the longest job execution time, the cluster size of the job request is increased, and the job execution time of the job request can be reduced. When increasing the cluster size of the job request, you can increase the cluster size of the job request by 1 or add other values.
  • the job request may be divided into a map task, a shuffle task, and a reduce task; the execution process of the job request includes the execution of these three tasks, and the job execution time of the job request may be determined by the execution times of the three tasks.
  • the scheduler can establish a performance model based on which the job execution time of each job request at the current cluster size is calculated.
  • the parameters of the performance model are defined as follows:
  • J: the set of job requests;
  • R: the total number of racks;
  • k: the number of servers in a rack;
  • L_j(r): the job execution time of the j-th job request when executed on r racks;
  • r_j: the number of racks allocated to the j-th job request, that is, the size of its computing cluster;
  • (J_1j, J_2j, ..., J_Rj): the allocation of the j-th job request on the corresponding racks;
  • μ_map: the average processing efficiency of a single map task;
  • μ_reduce: the average processing efficiency of a single reduce task.
  • Each job request can be regarded as a MapReduce job, including a map task, a shuffle task, and a reduce task.
  • the execution time is determined by the execution time of the map task, the shuffle task, and the reduce task.
  • the execution time of each round in the execution phase of the shuffle task depends on the maximum time of intra-rack transmission and cross-rack transmission.
  • each compute node has a (r-1)/r ratio of data that needs to be transmitted across the rack, so the time required to transport across the rack is:
  • each compute node has a 1/r ratio of data that does not need to be transmitted across the rack, so the time required for the in-rack transfer process is:
  • the job execution time of the job request in the cluster is quantified as follows:
  • the scheduler can calculate the job execution time of any job request at the current cluster size using the following formula:
  • when the cluster size is one rack, the job execution time of the job request is maximized; when the cluster size is two racks, the job execution time is shorter, and in general, the larger the cluster size, the shorter the job execution time of the job request.
  • the preliminary conclusion is that considering the computing performance and communication performance of the MapReduce cluster, the larger the cluster size, the shorter the execution time span of MapReduce jobs. This conclusion is used in the optimization of the cluster size in the design of the subsequent collaborative scheduling method.
  • the result of the performance model is also used in the initial data distribution operation.
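  • The shape of such a performance model can be illustrated as follows. Since the patent's exact formulas are given as figures that are not reproduced here, this is only a plausible sketch under stated assumptions: the map and reduce phases are assumed to scale with the number of servers r·k, and the shuffle phase is bounded by the slower of cross-rack transfer (a (r-1)/r fraction of the data over an assumed bandwidth `bw_cross`) and intra-rack transfer (a 1/r fraction over `bw_intra`); all parameter names are assumptions.

```python
def job_execution_time(data_size, r, k, mu_map, mu_reduce, bw_intra, bw_cross):
    """Illustrative L_j(r) for a MapReduce-style job on r racks of k
    servers: map + shuffle + reduce, with the shuffle phase limited by
    the maximum of cross-rack and intra-rack transfer time."""
    servers = r * k
    t_map = data_size / (servers * mu_map)
    t_cross = (data_size * (r - 1) / r) / bw_cross  # cross-rack fraction
    t_intra = (data_size * 1.0 / r) / bw_intra      # in-rack fraction
    t_shuffle = max(t_cross, t_intra)
    t_reduce = data_size / (servers * mu_reduce)
    return t_map + t_shuffle + t_reduce
```

Under these assumptions, growing r shrinks the map and reduce terms, consistent with the preliminary conclusion above that a larger cluster yields a shorter execution time span.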
  • step 802 is then re-executed: the job execution time of that job request is recalculated under the increased cluster size and compared with the job execution times of the other job requests at their current cluster sizes, the job request with the longest job execution time is re-determined, and the cluster size of that job request is increased. This repeats until, after the cluster size of the job request with the currently longest job execution time increases, the number of servers corresponding to the cluster size equals the total number of servers in the cloud computing system.
  • when the cluster size is equal to the number of servers occupied by the data corresponding to the job request, adjustment of the cluster size stops when the cluster size reaches the total number of servers in the cloud computing system.
  • when the cluster size is the number of racks r occupied by the data corresponding to the job request and each rack contains k servers, the number of servers occupied by the data corresponding to the job request is rk; after the cluster size r of the job request with the longest job execution time increases, if rk equals the total number of servers in the cloud computing system, adjustment of the cluster size stops.
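  • The sizing steps above amount to a greedy loop, which can be sketched as below. This is an illustration, not the patent's code: `exec_time(job, r)` stands in for the performance model, and rack granularity is assumed.

```python
def size_clusters(jobs, total_racks, exec_time):
    """Greedy cluster sizing: every job request starts with a cluster
    of one rack; the job whose execution time is currently the longest
    has its cluster grown by one rack, repeating until that job's
    cluster already spans every rack (hence every server) in the system."""
    sizes = {job: 1 for job in jobs}
    while True:
        longest = max(jobs, key=lambda job: exec_time(job, sizes[job]))
        if sizes[longest] >= total_racks:
            break  # growing further would exceed the whole system
        sizes[longest] += 1
    return sizes
```

Each iteration grows only the currently slowest job request, matching the stop condition described above.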
  • in the embodiment of the present disclosure, the server is used as the minimum unit of a computing node, servers are deployed in racks, and the cluster size may be adjusted at server granularity or at rack granularity.
  • the server may be deployed in other manners, and the cluster size may be adjusted by using other granularities, which may be determined according to deployment requirements.
  • FIG. 10 is a flowchart of a scheduling method according to an embodiment of the present disclosure. The flowchart is applied to the scheduler shown in the foregoing embodiment and describes the process of locating data, which may be executed after the cluster size of each job request is determined. Referring to Figure 10, the method includes:
  • a plurality of relationship chains can be obtained according to the node connection relationships. A relationship chain is the set of nodes and directed edges by which a second node reaches a first node, where the second node is a node that has at least two directed edges with itself as the source node, or a node that has one directed edge with itself as the source node but no directed edge with itself as the destination node. The length of each relationship chain is determined by the number of directed edges it includes, and the relationship chain with the largest current length is determined accordingly.
  • the relationship chain with the largest current length contains not only multiple nodes, but also directed edges between different nodes. These directed edges correspond to job requests. It can be considered that there is at least one job request in the relationship chain with the largest current length.
  • the cluster size of each job request is sorted in descending order, and the job request with the largest cluster size is determined as the specified job request.
  • the data corresponding to the specified job request is located in a server whose quantity matches the cluster size of the specified job request.
  • the data positioning refers to determining a target server of the data, and storing the data in the target server.
  • after the data is stored in the target server, the target server may be used as a computing node to process the data, or the target server may send the data to another server serving as a computing node, so that the data is processed by other computing nodes.
  • when the cluster size is equal to the number of servers occupied by the data corresponding to the job request, the data corresponding to the specified job request is located in a number of servers equal to the cluster size.
  • when the cluster size is the number of racks occupied by the data corresponding to the job request and each rack contains k servers, the number of servers occupied by the data corresponding to the job request is rk; the data corresponding to the specified job request is then located in r racks, that is, in rk servers.
  • the data corresponding to the other job requests in the chain is preferentially located in the servers where the data of the specified job request is located, with the number of servers matching the cluster size of each of those other job requests.
  • the data corresponding to the job request indicated by the other directed edges in the relationship chain with the largest current length is located, and the data corresponding to each job request is located to the server whose number matches the cluster size.
  • during positioning, the data corresponding to each job request should preferentially be located in the servers where the data of the specified job request is located, and also preferentially in the servers where the data of the other job requests in the relationship chain is located, so as to ensure that the data corresponding to the job requests in the same relationship chain is located in the same servers as far as possible.
  • for example, the relationship chain with the largest length in the association graph is "-B7-B9-B10", and the three job requests in the chain are J1, J2, and J3: the cluster size r1 of J1 is the largest, the cluster size r2 of J2 is second, and the cluster size r3 of J3 is the smallest. The data corresponding to J1 is located in r1 racks, the data corresponding to J2 in r2 racks, and the data corresponding to J3 in r3 racks, where the r2 racks and the r3 racks belong to the previously selected r1 racks, and the r3 racks belong to the r2 racks. In this way, it can be ensured that the data corresponding to J1 and the data corresponding to J2 are located in the same r2 racks, and that the data corresponding to J1, J2, and J3 is located in the same r3 racks.
  • the relationship chain whose data positioning is complete may then be deleted from the association graph, and it is determined whether the graph after deletion still contains nodes corresponding to data to be located. If such nodes exist, the above steps are repeated on the remaining graph: the relationship chain with the largest current length is re-determined, and the data corresponding to the job requests indicated by the directed edges in that chain is located, until the data corresponding to every job request has been located. At that point no node corresponding to data to be located remains, that is, the server locations of all data have been determined.
  • this yields the final data distribution, and the data distribution information is obtained; the data distribution information includes the server where each piece of data is located, realizing initialization of the data distribution information.
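  • The nesting of racks within one relationship chain can be sketched as follows (illustrative, not from the patent; `cluster_size` is a callable giving each job request's rack count, and racks are identified by name):

```python
def locate_chain(chain_jobs, cluster_size, racks):
    """Locate the data of the job requests in one relationship chain:
    jobs are sorted by cluster size in descending order, and each job's
    data is placed on a prefix of the rack list, so the racks of a
    smaller job always nest inside those of every larger job."""
    order = sorted(chain_jobs, key=cluster_size, reverse=True)
    placement = {}
    for job in order:
        placement[job] = racks[:cluster_size(job)]
    return placement
```

Because every job takes a prefix of the same rack list, the racks of a smaller job are a subset of those of each larger job in the chain, as in the J1/J2/J3 example above.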
  • in this way, data used at the same time can be located in the same server, which facilitates subsequent data processing, avoids transmission of data between different servers as much as possible, optimizes communication overhead, and reduces waste of communication capability.
  • when the association graph contains multiple relationship chains, the association relationships are more complicated; the longer a relationship chain, the higher the complexity and the greater the influence of the data distribution on overall system performance. Therefore, sorting by the length of the relationship chains and prioritizing the chain with the largest length can reduce positioning complexity and avoid the impact of an unreasonable data distribution on subsequent data processing.
  • FIG. 12 is a flowchart of a scheduling method according to an embodiment of the present disclosure. The flowchart is applied to the scheduler shown in the foregoing embodiment and describes the process of scheduling a job request, which may be executed after the data distribution information is determined. Referring to Figure 12, the method includes:
  • the association diagram includes a plurality of nodes and at least one directed edge, each node refers to a piece of data, each directed edge has a source node and a destination node, and the directed edge is pointed by the source node of the directed edge.
  • each directed edge is used to represent a job request for calculating the data referred to by the destination node of the directed edge based on the data referred to by the source node of the directed edge.
  • the data distribution information includes a server where each piece of data is located.
  • the data distribution information may include a server identifier where each piece of data is located.
  • the server identifier is used to uniquely determine a corresponding server, and may be address information or sequence number of the server.
  • the job request corresponding to the node refers to the job request indicated by the directed edge of the node as the source node.
  • the scheduler can set a preset node sorting strategy, and the nodes in the association graph can be sorted according to the preset node sorting strategy, and the job requests corresponding to each node can be sequentially located according to the sorting.
  • the positioning of the job request refers to a computing node that determines a job request among a plurality of servers of the cloud computing system, and the computing node is configured to execute the job request, that is, according to the job request and according to the job request corresponding data. Process it.
  • traversal is performed according to the node order determined by the preset node sorting strategy, and the job request represented by a directed edge with the traversed node as its source node is the job request currently to be located. The data distribution information already indicates the servers where the data referred to by the traversed node is located, and the job request needs to use that data during execution; therefore, to avoid transmitting data between different servers as much as possible, the job request is located in any of the servers where the data is located.
  • the above positioning method can be used to determine the server to which each job request is located, that is, the computing node for each job request.
  • in a possible implementation, the preset node sorting strategy sorts nodes in descending order of degree, where the degree refers to the number of directed edges connected to a node. During positioning, the degree of each node in the association graph is determined, the nodes are sorted by degree, and the node with the largest degree is traversed first; the job requests corresponding to the traversed node are located in any server where the data referred to by that node is located. The directed edges with the traversed node as the source node are then deleted, and the above steps are repeated for the nodes whose degree is greater than 1: traversal continues to the next node in descending order of degree, the job requests represented by the directed edges with the traversed node as the source node are located in any server where the data referred to by that node is located, and those directed edges are deleted, and so on, until the degree of every node in the association graph is not greater than 1.
  • for example, if node B4 currently has the largest degree, the two job requests with node B4 as the source node are preferentially located in the servers where the data corresponding to node B4 is located, and then the job requests corresponding to the other nodes are located.
  • when a node has a larger degree, the association relationships are more complicated, and such nodes need to be positioned first so that as many job requests as possible are located in the servers where the data pointed to by the node is located. Sorting by node degree therefore allows the job requests corresponding to the node with the largest degree to be located preferentially, reduces positioning complexity, completes the one-to-one mapping between job requests and data, and avoids the impact of an unreasonable data distribution on subsequent data processing.
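  • The degree-based positioning loop can be sketched as below. This is an illustration under assumed data shapes: edges are (source data, destination data, job request) triples, and `data_location` maps each data item to the list of servers holding it.

```python
def locate_jobs(edges, data_location):
    """Repeatedly pick the remaining node with the highest degree that
    still has outgoing edges, place every job whose source is that node
    on a server holding that node's data, then remove those edges."""
    remaining = list(edges)  # (src_data, dst_data, job) triples
    placement = {}
    while remaining:
        # degree = number of remaining directed edges touching each node
        degree = {}
        for s, d, _ in remaining:
            degree[s] = degree.get(s, 0) + 1
            degree[d] = degree.get(d, 0) + 1
        # traverse nodes from highest degree, taking the first that
        # still has jobs (outgoing edges) left to locate
        for node in sorted(degree, key=degree.get, reverse=True):
            outgoing = [e for e in remaining if e[0] == node]
            if outgoing:
                break
        for s, d, job in outgoing:
            placement[job] = data_location[node][0]  # any server with the data
            remaining.remove((s, d, job))
    return placement
```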
  • each job request can be sequentially scheduled to the located server, thereby executing a corresponding job request in the scheduled server.
  • a plurality of policies may be used to determine the scheduling order of the job request, thereby determining the execution order of the job request.
  • for example, a throughput-oriented big data application uses the Join the Shortest Queue (JSQ) strategy, a response-time-oriented big data application uses the Shortest Expected Delay Routing (SEDR) strategy, and a big data application with bursty workloads uses the Myopic MaxWeight strategy.
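  • Join the Shortest Queue, for instance, reduces to a one-line dispatch rule; the sketch below (not from the patent) represents each server's queue as a list of pending jobs:

```python
def join_shortest_queue(queues):
    """JSQ: dispatch the next job request to the server whose queue
    currently holds the fewest pending jobs."""
    return min(queues, key=lambda server: len(queues[server]))
```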
  • parameters such as the arrival rate and execution rate of the job request can be fed back, the weight of the directed edge in the association graph is updated, and the connection relationship between the node and the directed edge is corrected according to the node topology structure in the association diagram.
  • the server set used in the subsequent scheduling process, or the period used in the scheduling process may be updated according to actual needs.
  • the method provided by the embodiment of the present disclosure represents the association between data and the correspondence between job requests and data by means of the association graph, and indicates the data distribution by means of the data distribution information. According to the association graph and the data distribution information, the job request with the current node as the source node is located in any server where the data pointed to by the current node is located, which can reduce cross-node data transmission, thereby shortening the overall execution time of job requests, improving computing efficiency, and improving system performance.
  • the scheduler may first determine the execution round of each job request when scheduling multiple job requests, and schedule at least one job request to the located server according to the execution round of each job request. So that multiple job requests can be executed in multiple rounds, with at least one job request being executed per round.
  • when determining the execution round of each job request, the scheduler can calculate the number of execution rounds W required for all the job requests in the job request record to finish, where W indicates how many rounds the servers need, in the ideal case, to complete all job requests. It then calculates the priority of each job request from its job execution frequency and job execution time, sorts the job requests in descending order of priority, and determines the execution round of the job request ranked nW+m and of the job request ranked (n+2)W+1−m as m; that is, the job request ranked nW+m and the job request ranked (n+2)W+1−m are both set to execute in the m-th round, where m is a positive integer no greater than W and n is an integer.
  • the priority of a job request is positively correlated with its job execution frequency and job execution time, and may be, for example, the product of the two.
  • a high priority ensures that the job request is executed first; even if the job request takes a long time, it can run in parallel with other job requests in other rounds, so the overall job execution time is not extended.
  • the priorities of J1, J2, …, Jj decrease in turn; the first W job requests J1 to JW are scheduled in turn to rounds 1 to W, and the job requests JW+1 to J2W are scheduled in turn to rounds W down to 1.
  • the overall size of the job requests in each round is approximately equal.
  • the execution round of each job request is initialized by a heuristic method, so that the job requests executed concurrently in each round can occupy the full cluster capacity, avoiding wasted computing power and improving execution efficiency.
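The nW+m / (n+2)W+1−m pairing above amounts to a zigzag assignment of priority-sorted job requests to rounds. A minimal sketch, assuming 1-based ranks and job requests given as hypothetical (id, priority) pairs:

```python
def assign_rounds(jobs, W):
    """jobs: list of (job_id, priority); W: number of execution rounds.
    Sort by priority (highest first), then assign rounds in a zigzag so that
    the jobs ranked nW+m and (n+2)W+1-m (1-based) both run in round m."""
    ordered = sorted(jobs, key=lambda j: j[1], reverse=True)
    rounds = {}
    for idx, (job_id, _) in enumerate(ordered):
        block, pos = divmod(idx, W)              # consecutive blocks of W jobs
        m = pos + 1 if block % 2 == 0 else W - pos  # reverse every other block
        rounds[job_id] = m
    return rounds
```

With W = 3 and six jobs, ranks 1–6 land in rounds 1, 2, 3, 3, 2, 1, so every round mixes one high-priority and one lower-priority job and the rounds stay roughly equal in size.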
  • the operation flowchart of the method provided by the embodiment of the present disclosure may be as shown in FIG. 15 .
  • the cluster size of each job request is determined, and the data distribution is determined accordingly.
  • according to the cluster size of each job request and the location of the corresponding data, the job request is located and scheduled for execution.
  • parameters such as the execution frequency and the job execution time observed during execution are fed back to update the association graph.
  • steps 1 and 4 can be performed throughout the entire scheduling process until scheduling ends; they mainly complete the real-time update of the association graph and the real-time scheduling of the job flow.
  • steps 2 and 3 are executed only once, at the beginning of the scheduling process, mainly to complete the scheduling of the data flow; since redirecting the data flow incurs serious communication overhead, steps 2 and 3 are not suitable for frequent execution.
  • step 5 is triggered at the end of job execution, mainly to provide corrective information for the next round of scheduling. The five steps work together to achieve coordinated scheduling of the job flow and the data flow for big data applications.
  • compared with the related-art schemes and with a simple combination of those schemes, the technical solution of the embodiments of the present disclosure introduces a cross-layer association between the platform layer and the job layer and promotes information integration between the data flow and the job flow.
  • the cooperative scheduling method of the embodiments of the present disclosure thus enables further performance optimization of big data applications.
  • FIG. 16 is a schematic diagram showing changes in the proportion of job request completion ratios of the terminal user portrait system in the three schemes provided by the embodiments of the present disclosure
  • FIG. 17 is a schematic diagram showing changes in the percentage of speed increase of the terminal user portrait system in the three schemes provided by the embodiments of the present disclosure.
  • the three schemes are: the scheme of the embodiments of the disclosure; a scheme in which the job layer adopts the Laddered Scheduling strategy and the platform layer adopts the ActCap scheme (a simple combination of job-layer and platform-layer scheduling strategies that does not perceive the relationship between job requests and data); and the ShuffleWatcher solution (which uses a platform-layer scheduling strategy).
  • the scheduling policy of the embodiments of the present disclosure performs better than the other two schemes and schedules the data flow and the job flow optimally, because the job flow and the data flow are considered simultaneously.
  • the scheduling strategy of the embodiments further introduces the cross-layer association, optimizes the cooperative integration of the job flow and the data flow, and utilizes global resources more reasonably, so that the platform resources accommodate more job requests.
  • the scheduling strategy of the embodiments further introduces the cross-layer association, optimizes the cooperative integration of the job flow and the data flow, enables big data jobs to obtain a better speedup, and further optimizes the integration of the platform's global resources, allowing the end-user profiling system to achieve optimal performance acceleration.
  • FIG. 18 is a schematic structural diagram of a scheduler according to an embodiment of the present disclosure. Referring to FIG. 18, the scheduler is applied to the cloud computing system shown in the foregoing embodiment, where the scheduler includes:
  • the obtaining module 1801 is configured to obtain an association graph and data distribution information, where the association graph includes multiple nodes and at least one directed edge, each node refers to a piece of data, and each directed edge has a source node and a destination node;
  • each directed edge points from its source node to its destination node and represents the job request that computes the data referred to by the edge's destination node from the data referred to by the edge's source node;
  • the data distribution information includes the server where each piece of data is located;
  • the request locating module 1802 is configured to traverse the nodes in the association graph according to a preset node ordering policy based on the association graph and the data distribution information, and to locate, in turn, the job request corresponding to each traversed node in any server where the data referred to by the traversed node is located;
  • the job request corresponding to a node is a job request represented by a directed edge whose source node is that node;
  • the scheduling module 1803 is configured to sequentially schedule at least one job request represented by the at least one directed edge into the located server.
  • the preset node ordering policy is a policy of sorting nodes in descending order of degree;
  • the degree of a node is the number of directed edges connected to the node;
  • the request positioning module 1802 includes:
  • a determining unit configured to traverse the nodes in the association diagram according to the preset node sorting strategy, and determine the node with the largest degree in the association graph as the traversed node;
  • a locating unit configured to locate the job request corresponding to the traversed node in any server where the data referred to by the traversed node is located;
  • a deleting unit configured to delete the directed edges whose source node is the traversed node;
  • the determining unit, the locating unit and the deleting unit are further configured to continue traversing the nodes in the association graph according to the preset node ordering policy, determine the node with the largest current degree in the graph as the traversed node, and perform the job-request locating step and the edge-deleting step on the traversed node, until the degree of every node in the graph is no greater than 1.
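The degree-greedy traversal performed by these units can be sketched as follows; the edge tuples, node names, and server names are hypothetical, and ties in degree are broken arbitrarily:

```python
def locate_jobs(edges, data_location):
    """edges: list of (source, destination, job_id) tuples from the
    association graph; data_location: node -> server holding that data.
    Repeatedly pick the highest-degree node, pin every job request whose
    source is that node to the server holding its data, then delete those
    outgoing edges, until every remaining degree is at most 1."""
    edges = list(edges)
    placement = {}

    def degree(n):
        return sum(1 for s, d, _ in edges if n in (s, d))

    while edges:
        sources = {s for s, _, _ in edges}
        top = max(sources, key=degree)      # node with the largest degree
        if degree(top) <= 1:                # all degrees are now <= 1
            break
        for s, _, job in edges:
            if s == top:
                placement[job] = data_location[top]
        edges = [e for e in edges if e[0] != top]
    return placement
```

Restricting the candidate set to source nodes guarantees the edge list shrinks on every iteration, so the loop terminates.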
  • the scheduler further includes:
  • a log obtaining module configured to acquire a log record, where the log record includes a job request record and a data processing record;
  • a model building module configured to construct a stochastic queuing model from the job request record and to determine, based on it, the input data and output data of each job request in the record;
  • a data association model building module configured to construct a data association model from the data processing record, where the model includes multiple nodes, each referring to a piece of data in the record; and to determine the source node corresponding to each job request's input data and the destination node corresponding to its output data, adding a directed edge from the source node to the destination node in the data association model to obtain the association graph.
  • the scheduler further includes:
  • a round number calculation module configured to calculate a number of execution rounds W required to complete the execution of the plurality of job requests in the job request record;
  • a priority calculation module configured to calculate a priority of each job request according to a job execution frequency and a job execution time of each job request, and the priority is positively correlated with the job execution frequency and the job execution time;
  • a round determination module configured to sort the multiple job requests in descending order of priority and to determine the execution round of the job request ranked nW+m and of the job request ranked (n+2)W+1−m as m, where m is a positive integer no greater than W and n is an integer;
  • the scheduling module is further configured to sequentially schedule at least one job request to the located server according to an execution round of each job request.
  • the scheduler further includes:
  • a cluster size setting module configured to initialize the cluster size of each job request, where the number of servers occupied by the data corresponding to a job request is positively correlated with the job request's cluster size;
  • a cluster size adjustment module configured to calculate the job execution time of each job request under the current cluster size and enlarge the cluster size of the job request with the longest job execution time, and to continue calculating each job request's execution time under the current cluster sizes and enlarging the cluster size of the currently slowest job request, until the number of servers corresponding to that cluster size equals the total number of servers in the cloud computing system;
  • each job request includes a map map task, a shuffle shuffle task, and a merge reduce task
  • the cluster size adjustment module is also configured to calculate the job execution time of any job request under the current cluster size using the following formula (reconstructed from the parameter definitions below; the original is provided as an image):
  • L_j(r) = S_j^in/(r·k·μ_map) + ⌈n_j^reduce/(r·k)⌉·((r−1)/r)·(S_j^shuffle·V)/(n_j^reduce·B) + S_j^out/(r·k·μ_reduce)
  • L_j(r) denotes the job execution time of the j-th job request;
  • r denotes the number of racks currently allocated to the j-th job request;
  • k denotes the number of servers in one rack;
  • S_j^in, S_j^out and S_j^shuffle denote the sizes of the j-th job request's input, output and shuffle files;
  • n_j^reduce denotes the number of the j-th job request's reduce tasks;
  • μ_map denotes the average single-server processing efficiency of map tasks;
  • μ_reduce denotes the average single-server processing efficiency of reduce tasks;
  • V denotes the oversubscription ratio and B denotes the server bandwidth.
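Assuming the three-stage map/shuffle/reduce decomposition described elsewhere in this document (the patent's exact formula is only available as an image, so the structure below is an inference from the listed parameters), a hedged sketch of how L_j(r) could be computed:

```python
import math

def job_execution_time(r, k, s_in, s_out, s_shuffle, n_reduce,
                       mu_map, mu_reduce, V, B):
    """Sketch of L_j(r) = map time + shuffle time + reduce time for a
    MapReduce-style job spread over r racks of k servers each."""
    t_map = s_in / (r * k * mu_map)                 # map work split over r*k servers
    rounds = math.ceil(n_reduce / (r * k))          # reduce tasks run in rounds
    # per round, a (r-1)/r fraction of each node's shuffle data crosses racks
    t_shuffle = rounds * ((r - 1) / r) * (s_shuffle / n_reduce) * (V / B)
    t_reduce = s_out / (r * k * mu_reduce)
    return t_map + t_shuffle + t_reduce
```

Under this sketch the execution time falls as r grows beyond two racks, which is the behavior the cluster-size adjustment module relies on.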
  • the scheduler further includes:
  • a sorting module configured to determine the relation chain with the largest current length in the association graph, where a relation chain is a set consisting of a first node and the nodes and directed edges passed through from any second node to the first node; the first node is a node that is not the source node of any directed edge, and the second node is a node that is the source node of at least two directed edges, or the source node of one directed edge and the destination node of none;
  • a data locating module configured to locate, according to the cluster size of the designated job request, the data corresponding to the designated job request in a number of servers matching that cluster size, and, according to the cluster sizes of the job requests represented by the other directed edges in the longest relation chain, to locate the data corresponding to those other job requests in servers among those where the designated job request is located, the number matching the other job requests' cluster sizes;
  • the sorting module, the data locating module and the deleting module are further configured to continue determining the relation chain with the largest current length in the association graph and to perform the data-locating step on the determined chain, until the data corresponding to the job request represented by every directed edge in the association graph has been located.
  • the scheduler further includes:
  • An update module configured to update the log record after the execution of the job request is completed, and the log record includes a job request record and a data processing record;
  • the update module is also used to update the association diagram based on the updated log records.
  • the request count of each directed edge in the association graph refers to the number of its corresponding job requests;
  • the weight of each directed edge refers to the execution frequency of its corresponding job request;
  • the update module includes:
  • a first update unit configured to determine, according to the updated job request record, the updated plurality of job requests and the execution frequency after each job request is updated;
  • a second update unit configured to update the directed edges in the association graph according to the updated multiple job requests, and update the number of requests for each directed edge
  • a third updating unit configured to update the weight of each directed edge in the association graph according to the updated execution frequency of each job request.
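A minimal sketch of the request-count and weight refresh performed by these update units, with a hypothetical edge and log representation (weight is taken to be count per period):

```python
from collections import Counter

def update_edge_stats(edges, request_record, period_seconds):
    """edges: job_id -> edge dict with 'requests' and 'weight' fields.
    request_record: job_ids observed in the updated job request record.
    Refresh each edge's request count and set its weight to the job's
    observed execution frequency over the update period."""
    counts = Counter(request_record)
    for job_id, edge in edges.items():
        edge["requests"] = counts.get(job_id, 0)
        edge["weight"] = counts.get(job_id, 0) / period_seconds
    return edges
```

After each refresh the weights are considered valid for one update period, matching the periodic-update behavior described for the association graph.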
  • FIG. 19 is a schematic structural diagram of a scheduler according to an embodiment of the present disclosure.
  • the scheduler is applied to the cloud computing system shown in the foregoing embodiments and includes a memory 1901 and a processor 1902 connected to each other; the memory 1901 stores at least one instruction, and the processor 1902 is configured to invoke the instruction to perform the operations performed by the scheduler in the above embodiments.
  • an embodiment of the present disclosure further provides a computer-readable storage medium storing at least one instruction which, when loaded and executed by a processor, causes the computer to perform the operations of the scheduler in the above embodiments.
  • a person skilled in the art may understand that all or part of the steps of the above embodiments may be completed by hardware, or by a program instructing the relevant hardware; the program may be stored in a computer-readable storage medium.
  • the storage medium mentioned may be a read-only memory, a magnetic disk, an optical disk, or the like.


Abstract

The present disclosure provides a scheduling method, scheduler, storage medium and system, belonging to the field of communication technology. The method includes: obtaining an association graph and data distribution information, where each node in the association graph refers to a piece of data, and each directed edge represents a job request that computes the data referred to by its destination node from the data referred to by its source node; traversing nodes according to a preset node ordering policy and locating, in turn, the job request corresponding to each traversed node in any server where the data referred to by the traversed node is located; and scheduling the at least one job request represented by the at least one directed edge to the located servers in turn. By using the association graph to represent the relationships among data and the correspondence between job requests and data, and the data distribution information to represent the data distribution, cross-node data transmission can be reduced, shortening the overall job execution time, improving computation efficiency, and improving system performance.

Description

Scheduling method, scheduler, storage medium and system
This application claims priority to Chinese Patent Application No. 201810244746.5, entitled "Scheduling method, scheduler, storage medium and system", filed on March 23, 2018, the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of communication technology, and in particular to a scheduling method, scheduler, storage medium and system.
Background
With the growth of data scale and the trend toward data diversity, data analysis and processing technology keeps improving, and a variety of data analysis and processing architectures have emerged, such as the Hadoop/MapReduce (distributed computing / map-reduce) architecture and Spark, an in-memory big data processing framework. Big data applications developed on these architectures improve the efficiency of big data analysis and processing and, to some extent, meet its real-time requirements.
Implementing a big data application requires building a compute cluster of a certain scale and dynamically adjusting the computing resources inside the cluster on demand. To meet these requirements, big data applications are usually deployed in a cloud computing system, which provides unified management and more flexible scheduling. However, while cloud computing brings many conveniences to big data applications, it also faces many problems that urgently need to be solved; the one users care about most is performance, that is, the overall job execution time of the cloud computing system.
Referring to FIG. 1, which shows the architecture of a cloud computing system, the system includes a job layer and a platform layer. The job layer includes a scheduler; the platform layer includes multiple servers, which can serve as storage nodes to store data and as compute nodes to perform computation on the stored data. Current cloud computing systems usually adopt a job-layer scheduling policy: terminals submit job requests, the scheduler of the job layer adds the received job requests to a scheduling queue, sorts the job requests in the queue according to a preset ordering policy, and dispatches them to servers one by one in that order; the scheduled server then performs the computation according to the job request and its corresponding data. During this scheduling process the storage location of the data remains fixed. In short, the cloud computing system adopts a fixed-data, moving-computation scheduling approach.
The computation process is affected by the data distribution. For example, if the data required by a job request is spread over different servers, a server executing that job request needs to fetch the required data from other servers, causing cross-node data transmission. Since the storage location of data is fixed during the above scheduling process, the scheduling effect depends entirely on the initial data distribution; a poor initial distribution leads to massive cross-node data transmission during computation, which in turn causes excessively long job execution times and low computation efficiency, degrading the performance of the cloud computing system.
Summary
The present disclosure provides a scheduling method, scheduler, storage medium and system to solve the above problems. The technical solutions are as follows:
In a first aspect, a scheduling method is provided, applied to a scheduler of a cloud computing system, where the cloud computing system includes the scheduler and multiple servers for storing data. The method includes:
obtaining an association graph and data distribution information, where the association graph includes multiple nodes and at least one directed edge, each node refers to a piece of data, each directed edge has a source node and a destination node and points from its source node to its destination node, each directed edge represents a job request that computes the data referred to by the edge's destination node from the data referred to by the edge's source node, and the data distribution information includes the server where each piece of data is located;
traversing the nodes in the association graph according to a preset node ordering policy based on the association graph and the data distribution information, and locating, in turn, the job request corresponding to each traversed node in any server where the data referred to by the traversed node is located, where the job request corresponding to a node is a job request represented by a directed edge whose source node is that node;
scheduling the at least one job request represented by the at least one directed edge to the located servers in turn.
In one possible implementation, the preset node ordering policy is a policy of sorting nodes in descending order of degree, where the degree of a node is the number of directed edges connected to it. Traversing the nodes in the association graph and locating the corresponding job requests includes:
traversing the nodes in the association graph according to the preset node ordering policy, and determining the node with the largest current degree in the graph as the traversed node;
locating the job request corresponding to the traversed node in any server where the data referred to by the traversed node is located;
deleting the directed edges whose source node is the traversed node;
continuing to traverse the nodes in the graph according to the preset node ordering policy, determining the node with the largest current degree as the traversed node, and performing the job-request locating step and the edge-deleting step on it, until the degree of every node in the graph is no greater than 1.
In another possible implementation, before obtaining the association graph and the data distribution information, the method further includes:
obtaining a log record, which includes a job request record and a data processing record;
constructing a stochastic queuing model from the job request record, and determining, based on the model, the input data and output data of each job request in the record;
constructing a data association model from the data processing record, where the model includes multiple nodes, each referring to a piece of data in the record;
determining the source node corresponding to the input data and the destination node corresponding to the output data of each job request, and adding a directed edge from the source node to the destination node in the data association model to obtain the association graph.
In another possible implementation, before obtaining the association graph and the data distribution information, the method further includes:
calculating the number of execution rounds W required for all the job requests in the job request record to finish;
calculating the priority of each job request from its job execution frequency and job execution time, the priority being positively correlated with both;
sorting the job requests in descending order of priority, and determining the execution round of the job request ranked nW+m and of the job request ranked (n+2)W+1−m as m, where m is a positive integer no greater than W and n is an integer.
Scheduling the at least one job request to the located servers then includes: scheduling the at least one job request to the located servers in turn according to the execution round of each job request.
In another possible implementation, the method further includes:
initializing the cluster size of each job request, where the number of servers occupied by the data corresponding to a job request is positively correlated with the job request's cluster size;
calculating the job execution time of each job request under the current cluster size, and enlarging the cluster size of the job request with the longest job execution time;
continuing to calculate each job request's execution time under the current cluster sizes and to enlarge the cluster size of the currently slowest job request, until the number of servers corresponding to the slowest job request's cluster size equals the total number of servers in the cloud computing system.
In another possible implementation, each job request includes a map task, a shuffle task and a reduce task, and calculating the job execution time of any job request under the current cluster size uses the following formula (reconstructed from the parameter definitions; the original is provided as an image):
L_j(r) = S_j^in/(r·k·μ_map) + ⌈n_j^reduce/(r·k)⌉·((r−1)/r)·(S_j^shuffle·V)/(n_j^reduce·B) + S_j^out/(r·k·μ_reduce)
where L_j(r) denotes the job execution time of the j-th job request, r denotes the number of racks currently allocated to it, k denotes the number of servers in one rack, S_j^in denotes the size of its input file, S_j^out the size of its output file, S_j^shuffle the size of its shuffle file, μ_map the average single-server processing efficiency of map tasks, μ_reduce the average single-server processing efficiency of reduce tasks, n_j^reduce the number of its reduce tasks, V the oversubscription ratio, and B the server bandwidth.
In another possible implementation, after the cluster sizes have been adjusted until the slowest job request's cluster size corresponds to the total number of servers in the cloud computing system, the method further includes:
determining the relation chain with the largest current length in the association graph, where a relation chain is a set consisting of a first node and the nodes and directed edges passed through from any second node to the first node; the first node is a node that is not the source node of any directed edge; the second node is a node that is the source node of at least two directed edges, or the source node of one directed edge and the destination node of none; and the length of a relation chain is determined by the number of directed edges it contains;
sorting the job requests represented by the directed edges in the longest relation chain in descending order of cluster size, and determining the job request with the largest cluster size as the designated job request;
locating, according to the cluster size of the designated job request, the data corresponding to it in a number of servers matching its cluster size, and locating, according to the cluster sizes of the job requests represented by the other directed edges in the chain, the data corresponding to those other job requests in servers among those where the designated job request is located, the number matching the other job requests' cluster sizes;
deleting the longest relation chain;
continuing to determine the relation chain with the largest current length in the graph and performing the data-locating step on it, until the data corresponding to the job request represented by every directed edge in the graph has been located.
In another possible implementation, the method further includes:
after a job request finishes executing, updating the log record, which includes a job request record and a data processing record;
updating the association graph according to the updated log record.
In another possible implementation, the request count of each directed edge in the association graph refers to the number of its corresponding job requests, and the weight of each directed edge refers to the execution frequency of its corresponding job request. Updating the association graph according to the updated log record includes:
determining, from the updated job request record, the updated job requests and the updated execution frequency of each job request;
updating the directed edges in the graph according to the updated job requests, and updating the request count of each directed edge;
updating the weight of each directed edge in the graph according to each job request's updated execution frequency.
In a second aspect, a scheduler is provided, applied to a cloud computing system that includes the scheduler and multiple servers for storing data. The scheduler includes:
an obtaining module configured to obtain an association graph and data distribution information, where the association graph includes multiple nodes and at least one directed edge, each node refers to a piece of data, each directed edge has a source node and a destination node and points from its source node to its destination node, each directed edge represents a job request that computes the data referred to by the edge's destination node from the data referred to by the edge's source node, and the data distribution information includes the server where each piece of data is located;
a request locating module configured to traverse the nodes in the association graph according to a preset node ordering policy based on the association graph and the data distribution information, and to locate, in turn, the job request corresponding to each traversed node in any server where the data referred to by the traversed node is located, where the job request corresponding to a node is a job request represented by a directed edge whose source node is that node;
a scheduling module configured to schedule the at least one job request represented by the at least one directed edge to the located servers in turn.
In one possible implementation, the preset node ordering policy is a policy of sorting nodes in descending order of degree, where the degree of a node is the number of directed edges connected to it, and the request locating module includes:
a determining unit configured to traverse the nodes in the association graph according to the preset node ordering policy, and to determine the node with the largest current degree in the graph as the traversed node;
a locating unit configured to locate the job request corresponding to the traversed node in any server where the data referred to by the traversed node is located;
a deleting unit configured to delete the directed edges whose source node is the traversed node;
the determining unit, the locating unit and the deleting unit being further configured to continue traversing the nodes in the graph according to the preset node ordering policy, determine the node with the largest current degree as the traversed node, and perform the job-request locating step and the edge-deleting step on the traversed node, until the degree of every node in the graph is no greater than 1.
In another possible implementation, the scheduler further includes:
a log obtaining module configured to obtain a log record, which includes a job request record and a data processing record;
a model building module configured to construct a stochastic queuing model from the job request record and to determine, based on the model, the input data and output data of each job request in the record;
a data association model building module configured to construct a data association model from the data processing record, where the model includes multiple nodes, each referring to a piece of data in the record; and to determine the source node corresponding to each job request's input data and the destination node corresponding to its output data, adding a directed edge from the source node to the destination node in the data association model to obtain the association graph.
In another possible implementation, the scheduler further includes:
a round number calculation module configured to calculate the number of execution rounds W required for all the job requests in the job request record to finish;
a priority calculation module configured to calculate the priority of each job request from its job execution frequency and job execution time, the priority being positively correlated with both;
a round determination module configured to sort the job requests in descending order of priority and to determine the execution round of the job request ranked nW+m and of the job request ranked (n+2)W+1−m as m, where m is a positive integer no greater than W and n is an integer;
the scheduling module being further configured to schedule the at least one job request to the located servers in turn according to the execution round of each job request.
In another possible implementation, the scheduler further includes:
a cluster size setting module configured to initialize the cluster size of each job request, where the number of servers occupied by the data corresponding to a job request is positively correlated with the job request's cluster size;
a cluster size adjustment module configured to calculate the job execution time of each job request under the current cluster size and enlarge the cluster size of the job request with the longest job execution time, and to continue calculating each job request's execution time under the current cluster sizes and enlarging the cluster size of the currently slowest job request, until the number of servers corresponding to that cluster size equals the total number of servers in the cloud computing system.
In another possible implementation, each job request includes a map task, a shuffle task and a reduce task, and the cluster size adjustment module is further configured to calculate the job execution time of any job request under the current cluster size using the following formula (reconstructed from the parameter definitions; the original is provided as an image):
L_j(r) = S_j^in/(r·k·μ_map) + ⌈n_j^reduce/(r·k)⌉·((r−1)/r)·(S_j^shuffle·V)/(n_j^reduce·B) + S_j^out/(r·k·μ_reduce)
where L_j(r) denotes the job execution time of the j-th job request, r denotes the number of racks currently allocated to it, k denotes the number of servers in one rack, S_j^in denotes the size of its input file, S_j^out the size of its output file, S_j^shuffle the size of its shuffle file, μ_map the average single-server processing efficiency of map tasks, μ_reduce the average single-server processing efficiency of reduce tasks, n_j^reduce the number of its reduce tasks, V the oversubscription ratio, and B the server bandwidth.
In another possible implementation, the scheduler further includes:
a sorting module configured to determine the relation chain with the largest current length in the association graph, where a relation chain is a set consisting of a first node and the nodes and directed edges passed through from any second node to the first node; the first node is a node that is not the source node of any directed edge; the second node is a node that is the source node of at least two directed edges, or the source node of one directed edge and the destination node of none; and the length of each relation chain is determined by the number of directed edges it contains; the sorting module being further configured to sort the job requests represented by the directed edges in the longest relation chain in descending order of cluster size and to determine the job request with the largest cluster size as the designated job request;
a data locating module configured to locate, according to the cluster size of the designated job request, the data corresponding to it in a number of servers matching its cluster size, and, according to the cluster sizes of the job requests represented by the other directed edges in the longest chain, to locate the data corresponding to those other job requests in servers among those where the designated job request is located, the number matching the other job requests' cluster sizes;
a deleting module configured to delete the longest relation chain;
the sorting module, the data locating module and the deleting module being further configured to continue determining the relation chain with the largest current length in the graph and performing the data-locating step on it, until the data corresponding to the job request represented by every directed edge in the graph has been located.
In another possible implementation, the scheduler further includes:
an update module configured to update the log record after a job request finishes executing, the log record including a job request record and a data processing record;
the update module being further configured to update the association graph according to the updated log record.
In another possible implementation, the request count of each directed edge in the association graph refers to the number of its corresponding job requests, the weight of each directed edge refers to the execution frequency of its corresponding job request, and the update module includes:
a first updating unit configured to determine, from the updated job request record, the updated job requests and the updated execution frequency of each job request;
a second updating unit configured to update the directed edges in the graph according to the updated job requests, and to update the request count of each directed edge;
a third updating unit configured to update the weight of each directed edge in the graph according to each job request's updated execution frequency.
In a third aspect, a scheduler is provided, including at least one processor and a memory storing at least one instruction, the instruction being loaded and executed by the at least one processor to implement the operations performed in the method of the first aspect.
In a fourth aspect, a computer-readable storage medium is provided, storing at least one instruction that is loaded and executed by a processor to implement the operations performed in the method of the first aspect.
In a fifth aspect, a computer program is provided, executed by a processor or a computer to implement the operations performed in the method of the first aspect.
In a sixth aspect, a cloud computing system is provided, including multiple servers and the scheduler of the second or third aspect.
The beneficial effects achieved by the embodiments of the present disclosure are as follows:
In the method provided by the embodiments, the association graph represents the relationships among data and the correspondence between job requests and data, and the data distribution information represents the data distribution. By traversing each node according to the graph and the distribution information, and locating the job requests represented by the directed edges whose source node is the traversed node in any server where the data referred to by that node is located, job requests are located in a way that makes full use of the characteristics of both the platform layer and the job layer, breaks the information barrier between cross-layer schedulers, and strengthens cross-layer resource awareness and association. This reduces cross-node data transmission, shortens the overall job execution time, improves computation efficiency and system performance, and optimizes the integration of platform-wide resources.
Moreover, the association graph is generated from the job request record and the data processing record; it expresses the cross-layer dependency between the job layer and the platform layer and captures both the static characteristics of the data flow and the dynamic characteristics of the job flow. Scheduling based on it can fully take into account the ordering policy of job requests and the data distribution, so that a reasonable scheduling policy can be formulated, reducing job execution time, improving computation efficiency, and thus improving the performance of the cloud computing system.
Moreover, updating the association graph keeps it in step with the ongoing scheduling of job requests, ensuring it matches changes in the job flow in time and responds more promptly to bursts of job requests. The method therefore suits both scenarios with a stable job flow and scenarios with frequent job-flow bursts, such as node failures, network line changes, the arrival of new application types, or instantaneous bursts of job requests, improving burst resilience.
Moreover, initializing each job request's cluster size and then adjusting it according to job execution time continually enlarges the cluster size of the currently slowest job request, increasing its concurrency and reducing its execution time, thereby unifying the execution times of multiple job requests.
Moreover, since data in the same relation chain are related and are likely to be used together during job execution, exploiting these relationships to place data that may be used together in the same servers facilitates subsequent processing, achieves a highly precise optimal data distribution, avoids data transmission between servers as much as possible, optimizes communication overhead, reduces wasted communication capacity, and optimizes the computing and communication capabilities of big data services. Combining the data scheduling policy with the job scheduling policy through the association graph makes full use of the cross-layer relationships between data and job requests, enabling coordinated scheduling of the job flow and the data flow and optimized integration of global resources. In addition, since the association graph contains multiple relation chains with complex relationships, and the longer a chain, the higher the complexity and the greater the influence of the data distribution on overall system performance, sorting by chain length and locating the longest chain first reduces locating complexity and avoids, as far as possible, the impact of an unreasonable data distribution on subsequent data processing.
Moreover, since a node with a larger degree corresponds to more job requests and more complex relationships, it must be located first so that as many job requests as possible are located in the servers where the data it refers to resides. Sorting nodes by degree and locating the job requests of the largest-degree node first reduces locating complexity, completes the one-to-one mapping between job requests and data, and avoids, as far as possible, the impact of an unreasonable data distribution on subsequent data processing.
Moreover, sorting the job requests in descending order of priority, determining the execution round of the requests ranked nW+m and (n+2)W+1−m as m, and scheduling the requests to the located servers by round guarantees that the priorities of the requests executed in different rounds differ little; after every 2W rounds of scheduling, the overall size of the job requests in each round is approximately equal, approximately tiling the job requests over all racks of the cloud computing system.
Moreover, initializing each job request's execution round with a heuristic method lets the concurrently executed job requests of each round occupy the full cluster capacity, avoiding wasted computing power and improving execution efficiency.
Brief Description of the Drawings
FIG. 1 is a schematic architecture diagram of a cloud computing system provided by the related art;
FIG. 2 is a schematic architecture diagram of a cloud computing system provided by an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of scheduling policies in the related art;
FIG. 4A is a schematic architecture diagram of an end-user profiling system provided by an embodiment of the present disclosure;
FIG. 4B is a schematic structural diagram of a scheduler provided by an embodiment of the present disclosure;
FIG. 5 is a flowchart of a scheduling method provided by an embodiment of the present disclosure;
FIG. 6 is a schematic diagram of an association graph creation procedure provided by an embodiment of the present disclosure;
FIG. 7 is a schematic diagram of the generation and update process of an association graph provided by an embodiment of the present disclosure;
FIG. 8 is a flowchart of a scheduling method provided by an embodiment of the present disclosure;
FIG. 9 is a schematic diagram of the relationship between cluster size and job execution time provided by an embodiment of the present disclosure;
FIG. 10 is a flowchart of a scheduling method provided by an embodiment of the present disclosure;
FIG. 11 is a schematic diagram of data locating provided by an embodiment of the present disclosure;
FIG. 12 is a flowchart of a scheduling method provided by an embodiment of the present disclosure;
FIG. 13 is a schematic diagram of job request locating provided by an embodiment of the present disclosure;
FIG. 14 is a schematic diagram of determining execution rounds provided by an embodiment of the present disclosure;
FIG. 15 is an operation flowchart provided by an embodiment of the present disclosure;
FIG. 16 is a schematic diagram of the change in the job request completion ratio of the end-user profiling system under the three schemes provided by an embodiment of the present disclosure;
FIG. 17 is a schematic diagram of the change in the speedup percentage of the end-user profiling system under the three schemes provided by an embodiment of the present disclosure;
FIG. 18 is a schematic structural diagram of a scheduler provided by an embodiment of the present disclosure;
FIG. 19 is a schematic structural diagram of a scheduler provided by an embodiment of the present disclosure.
Detailed Description
To make the objectives, technical solutions and advantages of the present disclosure clearer, embodiments of the present disclosure are described in further detail below with reference to the accompanying drawings.
FIG. 2 is a schematic architecture diagram of a cloud computing system provided by an embodiment of the present disclosure. Referring to FIG. 2, the cloud computing system includes a job layer and a platform layer; the job layer includes a scheduler 201, and the platform layer includes multiple servers 202.
The multiple servers 202 serve as storage nodes for storing data and as compute nodes for performing computation according to data and job requests; the servers 202 are interconnected and can transfer data among themselves.
The scheduler 201 schedules received job requests to one or more compute nodes (servers 202), where computation is performed according to the data and the job requests.
Thus, after user equipment, application servers and other devices access the job layer, they can submit job requests, each used to perform computation on corresponding data. The scheduler 201 stores received job requests in a scheduling queue and dispatches them from the queue to servers 202, where computation is performed according to the scheduled job request and its corresponding data.
In related-art scheduling schemes, data scheduling at the platform layer and job request scheduling at the job layer are mutual black boxes, unaware of each other. Referring to FIG. 3, two kinds of scheduling policies are used: job-layer scheduling policies and platform-layer scheduling policies.
A job-layer scheduling policy adopts fixed-data, moving-computation scheduling: the storage location of data is fixed, the job-layer scheduler sorts the job requests in the scheduling queue according to a preset ordering policy and dispatches them to servers one by one in that order, and the scheduled server computes according to the job request and its corresponding data. Job-layer scheduling policies include association-based ordering policies, weight-based ordering policies, and the like.
A job-layer scheduling policy assumes that data follows a default uniform or known distribution and that the distribution remains constant throughout scheduling; it is therefore unaware of the real-time distribution of the underlying data, and its effect is often constrained by the initial data distribution. Because of the complex cross-layer relationships between data and job requests, which further affect scheduling results, a poor initial distribution causes massive cross-node data transmission even if the job-layer policy balances load across servers, and overall job execution efficiency still suffers. In strongly heterogeneous cluster environments, a pure job-layer policy cannot achieve a perfect resource distribution.
A platform-layer scheduling policy adopts fixed-computation, moving-data scheduling: it introduces data association relationships, judges during job execution which data will be read simultaneously by the same job requests, associates such data, and then, departing from the traditional uniform data placement, deploys associated data on the same servers. Data distribution is thus adjusted dynamically, data locality is optimized under a particular job-flow distribution, cross-node data transmission during job execution is minimized, and the performance of big data services improves. Platform-layer scheduling policies include policies based on data association, on data access frequency, on device performance, and the like.
Since moving data costs more than moving computation, once a platform-layer policy has determined the data distribution it will not adjust it frequently in the short term, which makes platform-layer policies inflexible. Furthermore, a platform-layer policy cannot perceive the job flow, so it cannot match job-flow changes in time or adjust the data distribution for bursts, degrading the performance of big data services; it is therefore usually unsuitable for scenarios with frequent job-flow bursts and suits only scenarios where user behavior, and hence the job flow, is relatively stable.
The embodiments of the present disclosure break the barrier between the platform-layer and job-layer scheduling policies and, based on the cross-layer relationships between data and job requests, schedule the job flow and the data flow cooperatively, further optimizing the integration of global resources and improving execution efficiency. Compared with job-layer policies, the embodiments incorporate platform-layer factors and achieve better steady-state performance; compared with platform-layer policies, they consider the bursty nature of the job flow and converge faster in transients. Compared with a simple combination of the two policies, the embodiments make full use of the cross-layer relationships between data and job requests, optimize the data distribution, and reduce communication overhead during job execution.
The embodiments of the present disclosure apply to large-scale data centers, where many devices concurrently execute big data services, submit many job requests, and interact frequently with data; the scheduling policy of the embodiments can plan and schedule the flow of job requests and data throughout the whole job execution process in a unified manner.
When a big data application is invoked in a cloud computing system, the process from the arrival of a job request to its completion includes three phases: job request scheduling, data loading, and job request execution. The time consumed by these three phases determines the overall performance of the big data application.
The platform layer works in the data loading phase; platform-layer research mainly provides deployment strategies for application data on the storage platform. An optimized deployment of data on the storage platform increases the rate at which a big data application's data is loaded to different compute nodes and shortens the data loading phase.
The job layer works in the job request scheduling phase; job-layer research mainly schedules job requests according to the performance requirements and constraints of different services. On top of the data distribution optimized by the platform layer, the job-layer scheduling policy further plans the execution order of multiple job requests and the way job requests are directed to data.
For example, referring to FIG. 4A, the embodiments of the present disclosure can be applied to an end-user profiling system. In its operation, user data must be stored in different formats such as Hadoop Distributed File System (HDFS) files, Hive tables (Hive is a data warehouse tool based on Hadoop), and Redis indexes (Redis is a key-value store), and data flows very frequently. Meanwhile, data of different formats must all be read by a Spark SQL compute cluster (Spark SQL is a Spark component for processing structured data) for different types of big data computation, such as table generation, label generation, and application recommendation, so complex relationships form between applications and data. HDFS includes a NameNode and DataNodes; the Spark SQL compute cluster includes a Master node and Worker nodes. This scenario is well suited to the method provided by the embodiments of the present disclosure, which schedules and manages the data flow and the job flow in a unified manner.
Referring to FIG. 4B, the scheduling method of the embodiments of the present disclosure can be implemented by extending a classic scheduler of an existing big data system, such as Apache Hadoop YARN or Apache Mesos: the job request scheduling function of the traditional scheduler is retained as one submodule of the new scheduler, data-plane features are added through a submodule with data deployment capability, and the two submodules work independently without interfering with each other. A fine-grained cross-layer association graph introduced at the bottom tightly links the information used by the two submodules, so their operation can be planned in a unified way, achieving coordinated scheduling of the data flow and the job flow and a globally optimal combined schedule. For example, if the underlying environment of the whole cloud computing system is the HDFS file system, the embodiments can be implemented by modifying and extending the Apache Hadoop YARN scheduler. The specific procedures of the embodiments are detailed in the following embodiments.
FIG. 5 is a flowchart of a scheduling method provided by an embodiment of the present disclosure, applied to the scheduler shown in the above embodiments; this embodiment describes the process of obtaining the association graph. Referring to FIG. 5, the method includes:
501. Obtain a log record, which includes a job request record and a data processing record.
The job request record includes the job requests to be executed in the current job execution process; depending on the different types of functionality a big data application provides, it may include different types of job requests, such as application recommendation requests and web browsing requests. It may also include each job request's request frequency and execution time, parameters that can be obtained by statistics over each job request's execution history.
The data processing record includes multiple pieces of data; from it one can determine not only which data has been processed before, but also that this data can be regarded as the data needed in the current job execution process.
Each time the scheduler receives a job request, it can store the job request in the job request record, for example by storing it in the scheduling queue; it also generates the data processing record from the data processed during job execution.
502. Construct a stochastic queuing model from the job request record, and determine, based on the model, the input data and output data of each job request in the record.
After reading the job request record, the scheduler can abstract and model the execution of job requests, constructing a stochastic queuing model that simulates multiple job requests queuing for execution. Based on this model the input data and output data of every job request can be determined: the input data is the data needed when the job request executes, and the output data is the data obtained by computing on the input data according to the job request.
503. Construct a data association model from the data processing record; the model includes multiple nodes, each referring to a piece of data in the record.
504. Determine the source node corresponding to each job request's input data and the destination node corresponding to its output data, and add a directed edge from the source node to the destination node in the data association model to obtain the association graph.
The scheduler can treat each piece of data in the data processing record as a node, constructing a data association model of multiple nodes that expresses the relationships among data. In this model, each job request has a source node corresponding to its input data and a destination node corresponding to its output data; a directed edge from the source node to the destination node is therefore added for each job request, representing the job request that computes the destination node's output data from the source node's input data, i.e., the edge expresses the relationship between the input data and the output data.
Once the directed edges have been added to the data association model, the association graph is obtained. It includes multiple nodes and at least one directed edge; each node refers to a piece of data, each directed edge has a source node and a destination node and points from its source node to its destination node, and each directed edge represents the job request that computes the data referred to by its destination node from the data referred to by its source node. The graph integrates the data flow and the job flow and can express the relationships between data and job requests.
The data processing record can be abstracted and modeled as a Directed Acyclic Graph (DAG) model, and the association graph is then constructed from the DAG model and the job request record.
In one possible implementation, since multiple job requests may share the same input data and output data and would therefore add identical directed edges to the data association model, multiple job requests can be aggregated on the same node according to the node topology of the model, forming the initial association graph. A request count and a weight are then set for each directed edge in the graph: the request count refers to the number of job requests corresponding to the edge, and the weight refers to the execution frequency of the corresponding job requests, determined from the execution frequencies in the job request record.
For example, referring to FIG. 6, a Markov chain is created from the job request record; its nodes are the different states of the scheduling queue, the number in each node refers to the number of job requests in the queue, λ denotes the arrival rate of job requests, μ denotes the processing rate of job requests, and Δt denotes the time interval between states; the Markov chain can represent the job flow. A data association diagram is also generated, with different data as different nodes and an edge between nodes representing the job request that computes the destination node's data from the source node's data. From the Markov chain and the data association diagram the association graph can be generated: different data are different nodes, the directed edges corresponding to different job requests are attached to the same node, and each directed edge has a weight representing the execution frequency of the corresponding job request.
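A minimal sketch of the aggregation described above, with job records given as hypothetical (job_id, input, output, frequency) tuples:

```python
def build_association_graph(job_records):
    """job_records: list of (job_id, input_data, output_data, frequency).
    Nodes are data items; each job adds a directed edge input -> output.
    Jobs sharing the same edge are aggregated: 'requests' counts them and
    'weight' sums their execution frequencies."""
    nodes, edges = set(), {}
    for job_id, src, dst, freq in job_records:
        nodes.update((src, dst))
        e = edges.setdefault((src, dst),
                             {"jobs": [], "requests": 0, "weight": 0.0})
        e["jobs"].append(job_id)
        e["requests"] += 1
        e["weight"] += freq
    return nodes, edges
```

Two jobs that both turn d1 into d2 thus share one edge with request count 2, matching the aggregation of identical directed edges described above.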
The method provided by this embodiment generates the association graph from the job request record and the data processing record. The graph expresses the cross-layer dependency between the job layer and the platform layer and captures both the static characteristics of the data flow and the dynamic characteristics of the job flow; scheduling based on it can fully account for the job request ordering policy and the data distribution, yielding a reasonable scheduling policy, shorter job execution times, higher computation efficiency, and better cloud computing system performance.
It should be noted that in this embodiment the scheduler can update the log record and update the association graph according to the updated log record. The update process includes: determining, from the updated job request record, the updated job requests and the updated execution frequency of each. The updated job requests may be the requests remaining after removing those already completed, and some remaining requests' execution frequencies may have changed because they were executed during the current job execution process, so the frequencies must be updated; alternatively, after the current job execution process ends, the updated job requests may be the requests to be executed in the next job execution process. The directed edges of the graph are then updated according to the updated job requests, the request count of each edge is updated, and each edge's weight is updated according to each job request's updated execution frequency.
The update process can be performed periodically or at the end of each job execution process. For example, when the current period ends but the current job execution process has not, and unfinished job requests remain, the job requests and execution frequencies in the job request record are updated, the record is re-read, the weights of the directed edges in the graph are updated, and the updated graph is applied for the remainder of the execution; when the current job execution process ends, the graph is updated according to the execution of each job request during it.
The generation and update of the association graph can be as shown in FIG. 7. Through the above update process, the graph is updated periodically throughout job execution, so that the directed-edge information of the cross-layer association graph is refreshed in time as scheduling proceeds, responding more promptly to bursts of job requests. After each update the edge weights are valid for one update period, during which the scheduler considers the job flow of each application to be unchanged.
FIG. 8 is a flowchart of a scheduling method provided by an embodiment of the present disclosure, applied to the scheduler shown in the above embodiments; this embodiment describes the process of determining the cluster size of job requests, which can be performed when data is uploaded to the cloud computing system. Referring to FIG. 8, the method includes:
801. Initialize the cluster size of each job request.
The number of servers occupied by the data corresponding to a job request is positively correlated with the job request's cluster size. For example, the number of occupied servers can be set equal to the cluster size, i.e., the cluster size is the number of servers occupied by the job request's data; or the cluster size r can denote the number of racks, each containing k servers, in which case the number of servers occupied by the job request's data is rk.
The data corresponding to each job request can form multiple replicas deployed on multiple servers. When more servers are used, cross-node data transmission during execution of the job request decreases, reducing its execution time. So for an individual job request, a larger cluster size means a shorter execution time; globally, however, the execution times of all job requests must be considered together so that each cluster size is set reasonably.
To this end, the scheduler can first initialize every job request's cluster size and adjust some of them later. For example, at initialization the cluster size of every job request in the job request record can be set to a first value, a positive integer such as 1.
802. Calculate each job request's execution time under the current cluster size, and enlarge the cluster size of the job request with the longest execution time.
The scheduler can first simulate each job request's execution under its current cluster size to determine its execution time. For the job request with the longest execution time, enlarging its cluster size reduces its execution time; the enlargement can add 1 or some other value to the cluster size.
In one possible implementation, a job request can be divided into a map task, a shuffle task and a reduce task; its execution consists of the execution of these three tasks, and its execution time is determined by theirs. The scheduler can build a performance model and use it to calculate each job request's execution time under the current cluster size.
The parameters of the performance model are defined as follows:
S: total size of the input files, including the input files of multiple job requests;
J: the set of job requests;
R: total number of racks;
k: number of servers in one rack;
B: server bandwidth;
V: oversubscription ratio, used for cross-rack transmission;
L_j(r): execution time of the j-th job request (executed on r racks);
r: number of racks allocated to the j-th job request, i.e., the scale of its compute cluster;
J_ij: whether the j-th job request is deployed to execute on the i-th rack;
R_j: the assignment of the j-th job request to racks, i.e., the set (J_1j, J_2j, …, J_Rj);
S_j^in: size of the j-th job request's input file;
S_j^shuffle: size of the j-th job request's shuffle file;
S_j^out: size of the j-th job request's output file;
n_j^map: number of the j-th job request's map tasks;
n_j^reduce: number of the j-th job request's reduce tasks;
μ_map: average single-server processing efficiency of map tasks;
μ_reduce: average single-server processing efficiency of reduce tasks.
每个作业请求可以看作是一个MapReduce作业,包括map任务、shuffle任务和reduce任务,执行时间由map任务、shuffle任务和reduce任务三阶段的执行时间决定,即:

L_j(r) = L_j^map(r) + L_j^shuffle(r) + L_j^reduce(r)

首先,在map任务的执行阶段,假设第j个作业请求中,map任务的轮数为W_j^map,有:

W_j^map = ⌈N_j^map/(rk)⌉

其中⌈·⌉表示上取整运算;

假设第j个作业请求中,reduce任务的轮数为W_j^reduce,则有:

W_j^reduce = ⌈N_j^reduce/(rk)⌉

而shuffle任务的执行阶段中每轮的执行时间取决于机架内传输和跨机架传输的最大时间,有:

L_j^shuffle(r) = W_j^reduce · max(T_intra, T_cross)

跨机架传输场景中,每个计算节点有(r-1)/r比例的数据需要跨机架传输,因此跨机架传输过程所需的时间为:

T_cross = (r-1)·V·S_j^shuffle/(r·N_j^reduce·B)

机架内传输场景中,每个计算节点有1/r比例的数据不需要跨机架传输,因此机架内传输过程所需的时间为:

T_intra = S_j^shuffle/(r·N_j^reduce·B)

因此shuffle任务执行阶段跨机架传输时间占主导的情形为:

(r-1)·V ≥ 1

即,当集群规模大于1个机架时,跨机架传输成为MapReduce作业请求中shuffle任务的执行阶段的瓶颈。

结合节点的计算性能,作业请求在集群中的作业执行时间量化如下:

L_j(r) = S_j^input/(r·k·μ_map) + ⌈N_j^reduce/(rk)⌉·(r-1)·V·S_j^shuffle/(r·N_j^reduce·B) + S_j^output/(r·k·μ_reduce)

也即是,调度器可以采用以下公式,计算任一作业请求在当前的集群规模下的作业执行时间:

L_j(r) = S_j^input/(r·k·μ_map) + ⌈N_j^reduce/(rk)⌉·(r-1)·V·S_j^shuffle/(r·N_j^reduce·B) + S_j^output/(r·k·μ_reduce)

当作业执行时间取得极值时,对L_j(r)关于r求导并令导数为零,在通信开销占主导的情况下可解得集群规模为:

r = 2
即,集群规模为2个机架时,作业请求的作业执行时间取得极大值,之后集群规模越大,作业请求的作业执行时间越短。根据性能建模结果得到的初步结论是:综合考虑MapReduce集群的计算性能和通信性能,集群规模越大,MapReduce作业的执行时间越短。此结论在后续协同调度方法设计时,会用于集群规模的优化调整操作;另外,在数据初始分布操作中,也会使用该性能模型的结果。
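上述性能模型可以用如下Python函数做数值验证。这是一个示意性草图:函数名与各参数取值均为假设值,仅用于观察L_j(r)随集群规模r的变化趋势,并非本公开限定的实现。

```python
import math

def job_exec_time(r, k, s_input, s_shuffle, s_output,
                  n_reduce, mu_map, mu_reduce, V, B):
    """按性能模型计算第j个作业请求在r个机架下的作业执行时间L_j(r)。"""
    t_map = s_input / (r * k * mu_map)                    # map阶段计算时间
    rounds = math.ceil(n_reduce / (r * k))                # reduce任务的轮数
    t_shuffle = rounds * (r - 1) * V * s_shuffle / (r * n_reduce * B)  # 跨机架传输时间
    t_reduce = s_output / (r * k * mu_reduce)             # reduce阶段计算时间
    return t_map + t_shuffle + t_reduce

# 假设取值:每机架10台服务器,通信开销占主导
params = dict(k=10, s_input=1e4, s_shuffle=1e4, s_output=1e3,
              n_reduce=10000, mu_map=10.0, mu_reduce=10.0, V=4.0, B=1.0)
times = {r: job_exec_time(r, **params) for r in range(1, 9)}
```

在该组假设参数下可以观察到r=2时执行时间出现极大值,此后r越大执行时间越短,与上文的建模结论一致。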
803、继续计算每个作业请求在当前的集群规模下的作业执行时间,并增大作业执行时间最长的作业请求的集群规模,直至作业执行时间最长的作业请求的集群规模对应的服务器数量等于所述云计算系统中的服务器总数量时为止。
在增大作业执行时间最长的作业请求的集群规模之后,该作业请求的作业执行时间减少,此时重新执行步骤802,也即是重新计算该作业请求在增大后的集群规模下的作业执行时间,并与其他的作业请求在当前的集群规模下的作业执行时间进行对比,重新确定作业执行时间最长的作业请求,并增大当前作业执行时间最长的作业请求的集群规模,以此类推,直至当前作业执行时间最长的作业请求的集群规模增大之后,该集群规模所对应的服务器数量等于云计算系统中的服务器总数量时为止。
在第一种可能实现方式中,集群规模等于作业请求对应的数据占用的服务器数量时,当前作业执行时间最长的作业请求的集群规模增大之后,该集群规模等于云计算系统中的服务器总数量时停止调整集群规模。
在第二种可能实现方式中,集群规模为作业请求对应的数据占用的机架数量r,每个机架包含的服务器数量均为k,作业请求对应的数据占用的服务器数量为rk,则当前作业执行时间最长的作业请求的集群规模r增大之后,rk等于云计算系统中的服务器总数量时停止调整集群规模。
需要说明的是,本公开实施例中以服务器作为计算节点的最小单位,服务器部署于机架中,则以服务器为粒度进行集群规模的调整或者以机架为粒度进行集群规模的调整,而在其他实施例中服务器也可以采用其他方式进行部署,则可以采用其他粒度进行集群规模的调整,具体可以根据部署需求确定。
通过采用上述调整集群规模的方式,持续调整当前作业执行时间最长的作业请求的集群规模,可以增加该作业请求的并发程度,减小该作业请求的作业执行时间,重复执行之后可以保证所有作业请求的作业执行时间被调整到大致相同的水平,如图9所示。
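步骤801至803的迭代调整过程可以勾勒为如下Python草图。其中exec_time为返回"作业在r个机架下的执行时间"的假设接口,玩具模型中执行时间随机架数线性下降,均非本公开限定的实现。

```python
def balance_cluster_scales(jobs, total_servers, k, exec_time):
    """反复增大当前作业执行时间最长的作业请求的集群规模(机架数),
    直至该集群规模对应的服务器数量达到服务器总数量为止。"""
    scales = {job: 1 for job in jobs}                # 801: 集群规模初始化为第一数值1
    while True:
        times = {job: exec_time(job, scales[job]) for job in jobs}  # 802: 计算执行时间
        longest = max(times, key=times.get)          # 作业执行时间最长的作业请求
        scales[longest] += 1                         # 增大其集群规模
        if scales[longest] * k >= total_servers:     # 803: rk达到服务器总数则停止
            return scales

# 示例:两个作业,执行时间随机架数线性下降(假设的玩具模型)
base = {"a": 100.0, "b": 50.0}
result = balance_cluster_scales(["a", "b"], total_servers=40, k=10,
                                exec_time=lambda job, r: base[job] / r)
```

可以看到,迭代过程会在两个作业之间交替增大集群规模,使二者的执行时间逐步趋于接近。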
图10是本公开实施例提供的一种调度方法的流程图,应用于上述实施例所示的调度器中,本公开实施例对定位数据的过程进行说明,该定位数据的过程可以在确定作业请求的集群规模之后执行。参见图10,该方法包括:
1001、确定关联关系图中当前长度最大的关系链,按照集群规模从大到小的顺序,对当前长度最大的关系链中的有向边表示的作业请求进行排序,将集群规模最大的作业请求确定为指定作业请求。
在获取到关联关系图,并且确定每个作业请求的集群规模之后,可以确定每条数据要定位的服务器,也即是每条数据要部署的位置。
关联关系图中,根据节点连接关系可以获取到多条关系链,关系链为一个第一节点以及由任一第二节点到达第一节点所经过的节点及有向边组成的集合,第一节点为不存在以第一节点为源节点的有向边的节点,第二节点为存在至少两条以第二节点为源节点的有向边的节点,或者为存在一条以第二节点为源节点的有向边且不存在以第二节点为目的节点的有向边的节点,每条关系链的长度由关系链中包含的有向边个数确定。当确定关联关系图中的第一节点和第二节点后,可以从第二节点处进行分割,得到多条关系链,确定关联关系图中每条关系链的长度,即每条关系链中包含的有向边的个数,并确定当前长度最大的关系链。当前长度最大的关系链不仅包含多个节点,还会包含不同节点之间的有向边,这些有向边对应于作业请求,可以认为在当前长度最大的关系链中存在至少一个作业请求,则按照每个作业请求的集群规模从大到小的顺序进行排序,将集群规模最大的作业请求确定为指定作业请求。
1002、按照指定作业请求的集群规模,将指定作业请求对应的数据定位到数量与指定作业请求的集群规模匹配的服务器中。
确定指定作业请求后,确定该指定作业请求的集群规模匹配的服务器数量,则将指定作业请求对应的数据定位到数量与该匹配的服务器数量相等的服务器中。
其中,数据定位是指确定数据的目标服务器,将数据存储至目标服务器中,后续数据处理过程中该目标服务器可以作为计算节点,对该数据进行处理,或者该目标服务器可以将该数据发送给其他的计算节点,由其他的计算节点对该数据进行处理。
在第一种可能实现方式中,集群规模等于作业请求对应的数据占用的服务器数量,则将指定作业请求对应的数据定位到数量与该集群规模相等的服务器中。
在第二种可能实现方式中,集群规模为作业请求对应的数据占用的机架数量r,每个机架包含的服务器数量均为k,作业请求对应的数据占用的服务器数量为rk,则将指定作业请求对应的数据定位到r个机架中,也即是定位到rk个服务器中。
1003、按照当前长度最大的关系链中其他有向边表示的作业请求的集群规模,将其他作业请求对应的数据定位到指定作业请求所定位的且数量与其他作业请求的集群规模匹配的服务器中。
指定作业请求对应的数据定位完成后,再定位当前长度最大的关系链中其他有向边表示的作业请求对应的数据,且定位时每个作业请求对应的数据定位到数量与集群规模匹配的服务器中,并且每个作业请求对应的数据应当优先定位到该指定作业请求所定位的服务器。在一种可能实现方式中,每个作业请求对应的数据应当优先定位到所在关系链中其他已定位的作业请求所定位的服务器中,这样可以保证同一关系链中的作业请求对应的数据尽可能定位到相同服务器中。
举例来说,参见图11,关联关系图中长度最大的关系链为“-B7-B9-B10”,该关系链中的三个作业请求依次为J1、J2、J3,其中J1的集群规模r1最大,J2的集群规模r2次之,J3的集群规模r3最小,则将J1对应的数据定位到r1个机架中,将J2对应的数据定位到r2个机架中,将J3对应的数据定位到r3个机架中,且这r2个机架和这r3个机架均属于之前定位的r1个机架的部分机架,且这r3个机架属于这r2个机架的部分机架,这样可以保证J1对应的数据和J2对应的数据可以位于相同的r2个机架上,J1对应的数据、J2对应的数据和J3对应的数据可以位于相同的r3个机架中。
1004、将当前长度最大的关系链删除。
1005、继续确定关联关系图中当前长度最大的关系链,对确定的关系链进行所述定位数据的步骤,直至关联关系图中的每条有向边表示的作业请求对应的数据定位完成。
当前长度最大的关系链中的有向边表示的作业请求对应的数据定位完成后,即可将该数据定位完成的关系链删除,之后判断删除之后的关联关系图中是否还有需要定位的数据对应的节点,如果还有需要定位的数据对应的节点,则根据删除之后的关联关系图继续重复上述步骤进行定位,也即是重新确定关联关系图中当前长度最大的关系链,对确定的关系链中的有向边表示的作业请求对应的数据进行定位,直至每个作业请求对应的数据定位完成,此时没有需要定位的数据对应的节点,即已确定所有数据定位的服务器位置,已经完成最终的数据分布,得到了数据分布信息,该数据分布信息包括每条数据所在的服务器,实现了数据分布信息的初始化。
本公开实施例中,考虑到同一关系链中的数据之间具有关联关系,在作业执行过程中很可能会同时使用,因此利用数据之间的关联关系,将可能会同时使用的数据定位到相同的服务器中,便于后续的数据处理,尽可能避免了数据在不同服务器之间的传输,优化了通信开销,减少了通信能力的浪费。且考虑到关联关系图中包含多条关系链,关联关系较为复杂,且关系链的长度越大,复杂度越高,数据分布情况对系统整体性能的影响越大,因此,以关系链的长度进行排序,优先对长度最大的关系链进行定位,可以减小定位复杂度,并尽可能避免数据分布不合理对后续的数据处理过程造成的影响。
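图10所示的按关系链定位数据的流程可以勾勒为如下Python草图。其中chains、scales的输入格式为假设的简化表示,机架以编号0..r-1示意,并非本公开限定的实现。

```python
def place_data_by_chains(chains, scales):
    """按关系链长度从大到小处理;链内按集群规模从大到小排序,
    集群规模最大的作业为指定作业请求,其余作业对应的数据嵌套定位到
    指定作业请求已定位机架的子集中,保证同链数据尽可能落在相同机架上。"""
    placement = {}
    for chain in sorted(chains, key=len, reverse=True):     # 1001: 先处理最长关系链
        jobs = sorted(chain, key=lambda j: scales[j], reverse=True)
        racks = list(range(scales[jobs[0]]))                # 1002: 指定作业请求先定位
        placement[jobs[0]] = racks
        for job in jobs[1:]:                                # 1003: 其余作业嵌套定位
            racks = racks[:scales[job]]                     # 取已定位机架的子集
            placement[job] = racks
        # 1004-1005: 当前关系链处理完毕,循环继续处理下一条最长关系链
    return placement

chains = [["J1", "J2", "J3"], ["J4"]]
scales = {"J1": 4, "J2": 2, "J3": 1, "J4": 3}
placement = place_data_by_chains(chains, scales)
```

与图11的例子对应:J3的机架是J2机架的子集,J2的机架又是J1机架的子集。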
图12是本公开实施例提供的一种调度方法的流程图,应用于上述实施例所示的调度器中,本公开实施例对调度作业请求的过程进行说明,该调度作业请求的过程在数据分布信息确定之后执行。参见图12,该方法包括:
1201、获取关联关系图和数据分布信息。
其中,关联关系图包括多个节点以及至少一条有向边,每个节点指代一条数据,每条有向边具有一个源节点和一个目的节点,有向边由有向边的源节点指向有向边的目的节点,每条有向边用于表示根据有向边的源节点指代的数据进行计算得到有向边的目的节点指代的数据的作业请求。
其中,数据分布信息包括每条数据所在的服务器,例如数据分布信息可以包括每条数据所在的服务器标识,该服务器标识用于唯一确定对应的服务器,可以为服务器的地址信息或者顺序编号等。
1202、根据关联关系图和数据分布信息,按照预设节点排序策略遍历关联关系图中的节点,依次将遍历到的节点对应的作业请求定位到遍历到的节点指代的数据所在的任一服务器中,节点对应的作业请求是指以节点作为源节点的有向边表示的作业请求。
调度器可设置预设节点排序策略,根据预设节点排序策略可以对关联关系图中的节点进行排序,根据排序可以依次对每个节点对应的作业请求进行定位。其中,对作业请求进行定位是指在云计算系统的多个服务器中确定作业请求的计算节点,该计算节点用于执行该作业请求,也即是按照该作业请求并根据该作业请求对应的数据进行处理。
进行定位时,会按照预设节点排序策略确定的节点排列顺序进行遍历,以每次遍历到的节点作为源节点的有向边表示的作业请求即为当前要定位的作业请求,此时由于数据分布信息中已经表明遍历到的节点指代的数据所在的服务器,且该作业请求在执行过程中需要使用该数据,则为了尽可能避免数据在不同服务器之间的传输,将该作业请求定位到该数据所在的任一服务器中。采用上述定位方式可以确定每个作业请求所定位的服务器,即每个作业请求的计算节点。
在一种可能实现方式中,预设节点排序策略为按照度数从大到小的顺序进行排序的策略,度数是指节点连接的有向边的数量,则进行定位时,先确定关联关系图中每个节点的度数,并按照度数从大到小的顺序进行排序,确定当前度数最大的节点,即为遍历到的节点,将遍历到的节点对应的作业请求定位到遍历到的节点指代的数据所在的任一服务器中,将以遍历到的节点作为源节点的有向边删除,之后重复执行上述步骤,针对度数大于1的节点所对应的作业请求继续进行定位,也即是继续按照度数从大到小的顺序遍历到下一个节点,将遍历到的节点作为源节点的有向边表示的作业请求定位到遍历到的节点指代的数据所在的任一服务器中,将以遍历到的节点作为源节点的有向边删除,以此类推,直至关联关系图中所有节点的度数均不大于1为止。
例如,参见图13,关联关系图中节点B4的度数最大,则优先将以节点B4为源节点的两个作业请求定位到节点B4对应的数据所在的服务器中,再定位其他节点对应的作业请求。
考虑到度数较大的节点对应着较多的作业请求,关联关系较为复杂,需要优先进行定位才能将尽可能多的作业请求定位到该节点指代的数据所在的服务器中,因此按照节点的度数大小进行排序,可以优先对度数最大的节点对应的作业请求进行定位,减小定位复杂度,完成了作业请求与数据的一一映射,并尽可能避免数据分布不合理对后续的数据处理过程的影响。
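按节点度数遍历并定位作业请求的过程(步骤1202)可以勾勒为如下Python草图。注意其中做了一个工程上的假设:每次在存在出边的节点中选取度数最大者,以保证循环终止;edges与data_location的格式亦为假设的简化表示。

```python
from collections import Counter

def locate_jobs_by_degree(edges, data_location):
    """edges: (源节点, 目的节点, 作业请求) 三元组列表;
    data_location: {节点: 服务器}。每次取当前度数最大且存在出边的节点,
    把以其为源节点的有向边表示的作业请求定位到该节点指代的数据所在服务器,
    再删除这些有向边,直至所有有向边处理完毕。"""
    remaining = list(edges)
    placement = {}
    while remaining:
        degree = Counter()                          # 度数 = 节点连接的有向边数量
        for src, dst, _ in remaining:
            degree[src] += 1
            degree[dst] += 1
        sources = sorted({e[0] for e in remaining})
        node = max(sources, key=lambda n: degree[n])    # 当前度数最大的源节点
        for src, _, job in [e for e in remaining if e[0] == node]:
            placement[job] = data_location[src]         # 定位到数据所在服务器
        remaining = [e for e in remaining if e[0] != node]  # 删除对应有向边
    return placement

# 示例:节点B4度数最大,其对应的两个作业请求优先定位
edges = [("B4", "B5", "Ja"), ("B4", "B6", "Jb"),
         ("B1", "B4", "Jc"), ("B2", "B3", "Jd")]
servers = {"B1": "s1", "B2": "s2", "B4": "s4"}
placement = locate_jobs_by_degree(edges, servers)
```

与图13的例子对应:以B4为源节点的两个作业请求先被定位到B4指代的数据所在的服务器。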
1203、将至少一条有向边表示的至少一个作业请求依次调度至所定位的服务器中。
定位完成之后,即可将每个作业请求依次调度到所定位的服务器中,从而在调度到的服务器中执行相应的作业请求。
其中,调度时可以采用多种策略来确定作业请求的调度顺序,进而确定作业请求的执行顺序。例如针对吞吐率优先型的大数据应用采用最短队列分配(Join the Shortest Queue,简称JSQ)策略、针对响应时间优先型的大数据应用采用最小期望延时路由(Shortest Expected Delay Routing,简称SEDR)策略、针对作业突发性强的大数据应用采用Myopic Max Weight策略。
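以最短队列分配(JSQ)策略为例,其核心逻辑可以用几行Python示意(队列内容为假设数据,仅作说明):

```python
def join_shortest_queue(queues):
    """JSQ:把新到达的作业请求分配给当前待处理队列最短的服务器。"""
    return min(queues, key=lambda server: len(queues[server]))

queues = {"s1": ["j1", "j2"], "s2": ["j3"], "s3": ["j4", "j5", "j6"]}
target = join_shortest_queue(queues)   # 队列最短的服务器
queues[target].append("j7")            # 新作业请求入队到最短队列
```

SEDR与Myopic Max Weight可以在同一框架下替换选取服务器的key函数,此处不再展开。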
任一作业请求执行完成之后,如果作业请求记录中的所有作业请求均执行完毕,则作业流调度完成,而如果作业请求记录中还存在未执行完毕的作业请求,则对关联关系图进行更新,将更新后的关联关系图应用到后续调度过程中继续进行调度和执行。
更新时可以对作业请求的到达速率和执行速率等参数进行反馈,更新关联关系图中有向边的权重,并根据关联关系图中的节点拓扑结构,修正节点与有向边的连接关系。并且,还可以根据实际需求更新后续调度过程中所用的服务器集合,或者调度过程中采用的周期等。
本公开实施例提供的方法,通过以关联关系图来表示数据之间的关联关系以及作业请求与数据的对应关系,以数据分布信息来表示数据分布情况,根据关联关系图和数据分布信息,将以当前节点作为源节点的作业请求定位到当前节点指代的数据所在的任一服务器中,可以减少跨节点的数据传输,进而缩短整体作业请求的作业执行时间,提高计算效率,提升系统性能。
在一种可能实现方式中,考虑到存在多个作业请求时,需要执行多轮才能将多个作业请求执行完成,如果每轮内执行的至少两个作业请求需要由同一服务器执行,会导致该服务器的负载较大,而其他服务器的负载较小,这反而会造成计算能力的浪费,导致执行效率降低。因此,作业请求的执行轮次也会影响执行效率,进而影响系统性能。
为此,调度器在对多个作业请求进行调度时,可以先确定每个作业请求的执行轮次,根据每个作业请求的执行轮次,将至少一个作业请求依次调度至所定位的服务器中,以使多个作业请求可以分多轮执行,每轮执行至少一个作业请求。
而确定每个作业请求的执行轮次时,可以计算作业请求记录中多个作业请求均执行结束所需的执行轮数W,该执行轮数W表示了在理想情况下所有作业请求执行完成需要服务器执行多少轮,之后根据每个作业请求的作业执行频率和作业执行时间,计算每个作业请求的优先度,按照优先度从高到低的顺序对多个作业请求进行排序,将排名第nW+m位的作业请求和排名第(n+2)W+1-m位的作业请求的执行轮次确定为m,也即是将排名第nW+m位的作业请求和排名第(n+2)W+1-m位的作业请求设置为在第m轮执行,m为正整数,且m不大于W,n为整数。
其中,作业请求的优先度与作业执行频率和作业执行时间正相关,例如可以为作业执行频率与作业执行时间的乘积。作业请求的作业执行频率越高,优先度越高,可以优先执行;而作业请求的作业执行时间越长,表示需要花费较长的时间才能处理完成,此时将作业请求的优先度设置为较高的数值可以保证该作业请求优先执行,即使该作业请求花费的时间较长,后续仍可以与多轮的其他作业请求并行执行,以免延长整体的作业执行时间。
通过上述方式,可以保证不同轮次中所执行的作业请求的优先度相差不大,每2W个作业请求调度结束后,每轮的作业请求整体规模近似相等,从而近似实现作业请求在云计算系统的所有机架上的平铺。
参见图14,J1、J2……Jj的优先度依次降低,则将前W个作业请求J1至JW依次调度到第1至W轮,将作业请求JW+1至J2W依次调度到第W至1轮(逆向调度),以此类推,每2W个作业请求调度结束后,每轮作业请求的整体规模近似相等。
本公开实施例中,通过启发式方法对每个作业请求的执行轮次进行初始化,使得每轮并发执行的作业请求可以占用全部的集群能力,避免计算能力的浪费,提高执行效率。
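上述执行轮次的"之字形"分配可以勾勒为如下Python草图:每2W个作业中,前W个依次分到第1..W轮,后W个逆向分到第W..1轮,可理解为把排名第nW+m位(n取偶数)与第(n+2)W+1-m位的作业请求都确定为第m轮执行。示例中的作业与优先度均为假设数据。

```python
def assign_rounds(jobs, priorities, W):
    """按优先度从高到低排序后做之字形轮次分配,使各轮作业整体规模近似相等。"""
    ordered = sorted(jobs, key=lambda job: priorities[job], reverse=True)
    rounds = {}
    for idx, job in enumerate(ordered):
        pos = idx % (2 * W)                                # 在当前2W分组内的位置(从0起)
        rounds[job] = pos + 1 if pos < W else 2 * W - pos  # 正向/逆向交替分配
    return rounds

# 示例:6个作业,W=2,优先度依次降低
priorities = {"J1": 6, "J2": 5, "J3": 4, "J4": 3, "J5": 2, "J6": 1}
rounds = assign_rounds(list(priorities), priorities, W=2)
```

与图14的描述对应:前W个作业正向分配到第1至W轮,随后W个作业逆向分配到第W至1轮。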
本公开实施例提供的方法可以包括如下几个步骤:
1、关联关系图的生成与更新;
2、作业请求的集群规模的确定;
3、数据分布情况的调整;
4、作业请求的定位和调度;
5、作业与数据的反馈。
本公开实施例提供的方法的操作流程图可以如图15所示,参见图15,针对已到达的作业请求,生成关联关系图后,确定每个作业请求的集群规模,并确定数据分布情况,之后根据关联关系图、每个作业请求的集群规模和对应数据的所在位置,对作业请求进行定位和调度,从而执行作业请求。之后将执行过程中的执行频率、作业执行时间等参数进行反馈,从而更新关联关系图。
其中,上述步骤1和步骤4可以在整个调度过程中执行,直至调度结束,主要完成关联关系图的实时更新和作业流的实时调度。而步骤2和步骤3仅在调度开始阶段执行一次,主要完成数据流的调度,由于数据流的重定向会带来严重的通信开销,因此步骤2和3不宜频繁执行。步骤5在作业执行结束时触发,主要为下一轮调度修正信息。五个步骤相互协作,最终实现对于大数据应用的作业流及数据流的协同调度。
本公开实施例的技术方案引入了平台层与作业层的跨层关联关系,促进了数据流与作业流之间的信息整合,相比于单独的相关技术的方案以及相关技术的方案的简单结合,本公开实施例的协同调度方法会使大数据应用得到进一步的性能优化。
图16是本公开实施例提供的三种方案下终端用户画像系统的作业请求完成比例的变化示意图,图17是本公开实施例提供的三种方案下终端用户画像系统的提速百分比的变化示意图。
其中,这三种方案包括:本公开实施例的方案、作业层采用延迟调度(Delayed Scheduling)策略平台层采用ActCap的方案(作业层调度策略与平台层调度策略简单结合,不感知作业请求与数据之间的关联关系)以及Shuffle Watcher方案(采用平台层调度策略)。
由图16可以发现,由于同时考虑了作业流与数据流,本公开实施例的调度策略与其他两种方案相比,作业请求完成情况相对较好,数据流和作业流都得到了优化调度;而本公开实施例的调度策略进一步引入了跨层关联关系,优化了作业流与数据流的协同整合,更合理地利用了全局资源,因此使得有限的平台资源容纳了更多的作业请求。
由图17可以发现,本公开实施例的调度策略进一步引入了跨层关联关系,优化了作业流与数据流的协同整合,使得大数据作业获得更优的提速,进一步优化了平台的全局资源整合,使得终端用户画像系统得到了最优的性能提速。
图18是本公开实施例提供的一种调度器的结构示意图,参见图18,该调度器应用于上述实施例示出的云计算系统中,该调度器包括:
获取模块1801,用于获取关联关系图和数据分布信息,关联关系图包括多个节点以及至少一条有向边,每个节点指代一条数据,每条有向边具有一个源节点和一个目的节点,有向边由有向边的源节点指向有向边的目的节点,有向边用于表示根据有向边的源节点指代的数据进行计算得到有向边的目的节点指代的数据的作业请求,数据分布信息包括每条数据所在的服务器;
请求定位模块1802,用于根据关联关系图和数据分布信息,按照预设节点排序策略遍历关联关系图中的节点,依次将遍历到的节点对应的作业请求定位到遍历到的节点指代的数据所在的任一服务器中,节点对应的作业请求是指以节点作为源节点的有向边表示的作业请求;
调度模块1803,用于将至少一条有向边表示的至少一个作业请求依次调度至所定位的服务器中。
在一种可能实现方式中,预设节点排序策略为按照度数从大到小的顺序进行排序的策略,度数是指节点连接的有向边的数量,请求定位模块1802,包括:
确定单元,用于按照预设节点排序策略遍历关联关系图中的节点,将关联关系图中当前度数最大的节点确定为遍历到的节点;
定位单元,用于将遍历到的节点对应的作业请求定位到遍历到的节点指代的数据所在的任一服务器中;
删除单元,用于将以遍历到的节点作为源节点的有向边删除;
确定单元、定位单元和删除单元还用于继续按照预设节点排序策略遍历关联关系图中的节点,将关联关系图中当前度数最大的节点确定为遍历到的节点,对遍历到的节点进行定位作业请求的步骤和删除对应有向边的步骤,直至关联关系图中所有节点的度数均不大于1为止。
在另一种可能实现方式中,调度器还包括:
日志获取模块,用于获取日志记录,日志记录包括作业请求记录和数据处理记录;
模型构建模块,用于根据作业请求记录构建随机排队模型,基于随机排队模型确定作业请求记录中的每个作业请求的输入数据和输出数据;
数据关联模型构建模块,用于根据数据处理记录构建数据关联模型,数据关联模型中包括多个节点,每个节点指代数据处理记录中的一条数据;确定每个作业请求的输入数据对应的源节点和输出数据对应的目的节点,在数据关联模型中添加由源节点指向目的节点的有向边,得到关联关系图。
在另一种可能实现方式中,调度器还包括:
轮数计算模块,用于计算作业请求记录中多个作业请求均执行结束所需的执行轮数W;
优先度计算模块,用于根据每个作业请求的作业执行频率和作业执行时间,计算每个作业请求的优先度,优先度与作业执行频率和作业执行时间正相关;
轮次确定模块,用于按照优先度从高到低的顺序对多个作业请求进行排序,将排名第nW+m位的作业请求和排名第(n+2)W+1-m位的作业请求的执行轮次确定为m,m为正整数,且m不大于W,n为整数;
调度模块,还用于根据每个作业请求的执行轮次,将至少一个作业请求依次调度至所定位的服务器中。
在另一种可能实现方式中,调度器还包括:
集群规模设置模块,用于初始化每个作业请求的集群规模,作业请求对应的数据占用的服务器数量与作业请求的集群规模正相关;
集群规模调整模块,用于计算每个作业请求在当前的集群规模下的作业执行时间,并增大作业执行时间最长的作业请求的集群规模;继续计算每个作业请求在当前的集群规模下的作业执行时间,并增大作业执行时间最长的作业请求的集群规模,直至作业执行时间最长的作业请求的集群规模对应的服务器数量等于云计算系统中的服务器总数量时为止。
在另一种可能实现方式中,每个作业请求包括映射map任务、混洗shuffle任务和归并reduce任务,集群规模调整模块,还用于采用以下公式,计算任一作业请求在当前的集群规模下的作业执行时间:
L_j(r) = S_j^input/(r·k·μ_map) + ⌈N_j^reduce/(rk)⌉·(r-1)·V·S_j^shuffle/(r·N_j^reduce·B) + S_j^output/(r·k·μ_reduce)
其中,L_j(r)表示第j个作业请求的作业执行时间,r表示第j个作业请求当前分配的机架数量,k表示一个机架中的服务器数量,S_j^input表示第j个作业请求输入文件的大小,S_j^output表示第j个作业请求输出文件的大小,S_j^shuffle表示第j个作业请求的shuffle文件的大小,μ_map表示map任务的单机平均处理效率,μ_reduce表示reduce任务的单机平均处理效率,N_j^reduce表示第j个作业请求的reduce任务的数量,V表示过载率,B表示服务器的带宽。
在另一种可能实现方式中,调度器还包括:
排序模块,用于确定关联关系图中当前长度最大的关系链,关系链为一个第一节点以及由任一第二节点到达第一节点所经过的节点及有向边组成的集合,第一节点为不存在以第一节点为源节点的有向边的节点,第二节点为存在至少两条以第二节点为源节点的有向边的节点,或者为存在一条以第二节点为源节点的有向边且不存在以第二节点为目的节点的有向边的节点,每条关系链的长度由关系链中包含的有向边个数确定;按照集群规模从大到小的顺序,对当前长度最大的关系链中的有向边表示的作业请求进行排序,将集群规模最大的作业请求确定为指定作业请求;
数据定位模块,用于按照指定作业请求的集群规模,将指定作业请求对应的数据定位到数量与指定作业请求的集群规模匹配的服务器中,并按照当前长度最大的关系链中其他有向边表示的作业请求的集群规模,将其他作业请求对应的数据定位到指定作业请求所定位的且数量与其他作业请求的集群规模匹配的服务器中;
删除模块,用于将当前长度最大的关系链删除;
排序模块、数据定位模块和删除模块还用于继续确定关联关系图中当前长度最大的关系链,对确定的关系链进行定位数据的步骤,直至关联关系图中的每条有向边表示的作业请求对应的数据定位完成。
在另一种可能实现方式中,调度器还包括:
更新模块,用于当作业请求执行结束后,对日志记录进行更新,日志记录包括作业请求记录和数据处理记录;
更新模块,还用于根据更新后的日志记录,对关联关系图进行更新。
在另一种可能实现方式中,关联关系图中每条有向边的请求数量指代对应的作业请求的个数,每条有向边的权重指代对应作业请求的执行频率,更新模块,包括:
第一更新单元,用于根据更新后的作业请求记录,确定更新后的多个作业请求以及每个作业请求更新后的执行频率;
第二更新单元,用于根据更新后的多个作业请求,对关联关系图中的有向边进行更新,并对每条有向边的请求数量进行更新;
第三更新单元,用于根据每个作业请求更新后的执行频率,对关联关系图中每条有向边的权重进行更新。
图19是本公开实施例提供的一种调度器的结构示意图,参见图19,该调度器应用于上述实施例示出的云计算系统中,该调度器包括:存储器1901和处理器1902,存储器1901与处理器1902连接,存储器1901存储有至少一条指令,处理器1902用于调用指令,执行上述实施例中调度器所执行的操作。
本公开实施例还提供了一种计算机可读存储介质,计算机可读存储介质中存储有至少一条指令,该指令由处理器加载并执行时,使得计算机执行如上述实施例中的调度器所执行的操作。
本领域普通技术人员可以理解实现上述实施例的全部或部分步骤可以通过硬件来完成,也可以通过程序来指令相关的硬件完成,所述的程序可以存储于一种计算机可读存储介质中,上述提到的存储介质可以是只读存储器,磁盘或光盘等。
以上所述仅为本公开的可选实施例,并不用以限制本公开,凡在本公开的精神和原则之内,所作的任何修改、等同替换、改进等,均应包含在本公开的保护范围之内。

Claims (21)

  1. 一种调度方法,其特征在于,所述方法包括:
    获取关联关系图和数据分布信息,所述关联关系图包括多个节点以及至少一条有向边,每个节点指代一条数据,每条有向边具有一个源节点和一个目的节点,所述有向边由所述有向边的源节点指向所述有向边的目的节点,所述有向边用于表示根据所述有向边的源节点指代的数据进行计算得到所述有向边的目的节点指代的数据的作业请求,所述数据分布信息包括每条数据所在的服务器;
    根据所述关联关系图和所述数据分布信息,按照预设节点排序策略遍历所述关联关系图中的节点,依次将遍历到的节点对应的作业请求定位到所述遍历到的节点指代的数据所在的任一服务器中,所述节点对应的作业请求是指以所述节点作为源节点的有向边表示的作业请求;
    将所述至少一条有向边表示的至少一个作业请求依次调度至所定位的服务器中。
  2. 根据权利要求1所述的方法,其特征在于,所述预设节点排序策略为按照度数从大到小的顺序进行排序的策略,所述度数是指节点连接的有向边的数量,所述根据所述关联关系图和所述数据分布信息,按照预设节点排序策略遍历所述关联关系图中的节点,依次将遍历到的节点对应的作业请求定位到所述遍历到的节点指代的数据所在的任一服务器中,包括:
    按照所述预设节点排序策略遍历所述关联关系图中的节点,将所述关联关系图中当前度数最大的节点确定为所述遍历到的节点;
    将所述遍历到的节点对应的作业请求定位到所述遍历到的节点指代的数据所在的任一服务器中;
    将以所述遍历到的节点作为源节点的有向边删除;
    继续按照所述预设节点排序策略遍历所述关联关系图中的节点,将所述关联关系图中当前度数最大的节点确定为所述遍历到的节点,对所述遍历到的节点进行所述定位作业请求的步骤和所述删除对应有向边的步骤,直至所述关联关系图中所有节点的度数均不大于1为止。
  3. 根据权利要求1所述的方法,其特征在于,所述获取关联关系图和数据分布信息之前,所述方法还包括:
    获取日志记录,所述日志记录包括作业请求记录和数据处理记录;
    根据所述作业请求记录构建随机排队模型,基于所述随机排队模型确定所述作业请求记录中的每个作业请求的输入数据和输出数据;
    根据所述数据处理记录构建数据关联模型,所述数据关联模型中包括多个节点,每个节点指代所述数据处理记录中的一条数据;
    确定所述每个作业请求的输入数据对应的源节点和输出数据对应的目的节点,在所述数据关联模型中添加由所述源节点指向所述目的节点的有向边,得到所述关联关系图。
  4. 根据权利要求1所述的方法,其特征在于,所述获取关联关系图和数据分布信息之前,所述方法还包括:
    计算作业请求记录中多个作业请求均执行结束所需的执行轮数W;
    根据每个作业请求的作业执行频率和作业执行时间,计算每个作业请求的优先度,所述优先度与所述作业执行频率和所述作业执行时间正相关;
    按照优先度从高到低的顺序对所述多个作业请求进行排序,将排名第nW+m位的作业请求和排名第(n+2)W+1-m位的作业请求的执行轮次确定为m,m为正整数,且m不大于W,n为整数;
    所述将所述至少一条有向边表示的至少一个作业请求依次调度至所定位的服务器中,包括:根据每个作业请求的执行轮次,将所述至少一个作业请求依次调度至所定位的服务器中。
  5. 根据权利要求1所述的方法,其特征在于,所述方法还包括:
    初始化每个作业请求的集群规模,所述作业请求对应的数据占用的服务器数量与所述作业请求的集群规模正相关;
    计算每个作业请求在当前的集群规模下的作业执行时间,并增大作业执行时间最长的作业请求的集群规模;
    继续计算每个作业请求在当前的集群规模下的作业执行时间,并增大作业执行时间最长的作业请求的集群规模,直至作业执行时间最长的作业请求的集群规模对应的服务器数量等于所述云计算系统中的服务器总数量时为止。
  6. 根据权利要求5所述的方法,其特征在于,每个作业请求包括映射map任务、混洗shuffle任务和归并reduce任务,所述计算每个作业请求在当前的集群规模下的作业执行时间,包括:
    采用以下公式,计算任一作业请求在当前的集群规模下的作业执行时间:
    L_j(r) = S_j^input/(r·k·μ_map) + ⌈N_j^reduce/(rk)⌉·(r-1)·V·S_j^shuffle/(r·N_j^reduce·B) + S_j^output/(r·k·μ_reduce)
    其中,L_j(r)表示第j个作业请求的作业执行时间,r表示第j个作业请求当前分配的机架数量,k表示一个机架中的服务器数量,S_j^input表示第j个作业请求输入文件的大小,S_j^output表示第j个作业请求输出文件的大小,S_j^shuffle表示第j个作业请求的shuffle文件的大小,μ_map表示map任务的单机平均处理效率,μ_reduce表示reduce任务的单机平均处理效率,N_j^reduce表示第j个作业请求的reduce任务的数量,V表示过载率,B表示服务器的带宽。
  7. 根据权利要求5所述的方法,其特征在于,所述继续计算每个作业请求在当前的集群规模下的作业执行时间,并增大作业执行时间最长的作业请求的集群规模,直至作业执行时间最长的作业请求的集群规模对应的服务器数量等于所述云计算系统中的服务器总数量时为止之后,所述方法还包括:
    确定所述关联关系图中当前长度最大的关系链,所述关系链为一个第一节点以及由任一第二节点到达所述第一节点所经过的节点及有向边组成的集合,所述第一节点为不存在以所述第一节点为源节点的有向边的节点,所述第二节点为存在至少两条以所述第二节点为源节点的有向边的节点,或者为存在一条以所述第二节点为源节点的有向边且不存在以所述第二节点为目的节点的有向边的节点,所述关系链的长度由所述关系链中包含的有向边个数确定;
    按照集群规模从大到小的顺序,对所述当前长度最大的关系链中的有向边表示的作业请求进行排序,将集群规模最大的作业请求确定为指定作业请求;
    按照所述指定作业请求的集群规模,将所述指定作业请求对应的数据定位到数量与所述指定作业请求的集群规模匹配的服务器中,并按照所述当前长度最大的关系链中其他有向边表示的作业请求的集群规模,将所述其他作业请求对应的数据定位到所述指定作业请求所定位的且数量与所述其他作业请求的集群规模匹配的服务器中;
    将所述当前长度最大的关系链删除;
    继续确定所述关联关系图中当前长度最大的关系链,对确定的关系链进行所述定位数据的步骤,直至所述关联关系图中的每条有向边表示的作业请求对应的数据定位完成。
  8. 根据权利要求1-7任一项所述的方法,其特征在于,所述方法还包括:
    当作业请求执行结束后,对日志记录进行更新,所述日志记录包括作业请求记录和数据处理记录;
    根据更新后的日志记录,对所述关联关系图进行更新。
  9. 根据权利要求8所述的方法,其特征在于,所述关联关系图中每条有向边的请求数量指代对应的作业请求的个数,每条有向边的权重指代对应作业请求的执行频率,根据更新后的日志记录,对所述关联关系图进行更新,包括:
    根据更新后的作业请求记录,确定更新后的多个作业请求以及每个作业请求更新后的执行频率;
    根据所述更新后的多个作业请求,对所述关联关系图中的有向边进行更新,并对每条有向边的请求数量进行更新;
    根据所述每个作业请求更新后的执行频率,对所述关联关系图中每条有向边的权重进行更新。
  10. 一种调度器,其特征在于,所述调度器包括:
    获取模块,用于获取关联关系图和数据分布信息,所述关联关系图包括多个节点以及至少一条有向边,每个节点指代一条数据,每条有向边具有一个源节点和一个目的节点,所述有向边由所述有向边的源节点指向所述有向边的目的节点,所述有向边用于表示根据所述有向边的源节点指代的数据进行计算得到所述有向边的目的节点指代的数据的作业请求,所述数据分布信息包括每条数据所在的服务器;
    请求定位模块,用于根据所述关联关系图和所述数据分布信息,按照预设节点排序策略遍历所述关联关系图中的节点,依次将遍历到的节点对应的作业请求定位到所述遍历到的节点指代的数据所在的任一服务器中,所述节点对应的作业请求是指以所述节点作为源节点的有向边表示的作业请求;
    调度模块,用于将所述至少一条有向边表示的至少一个作业请求依次调度至所定位的服务器中。
  11. 根据权利要求10所述的调度器,其特征在于,所述预设节点排序策略为按照度数从大到小的顺序进行排序的策略,所述度数是指节点连接的有向边的数量,所述请求定位模块,包括:
    确定单元,用于按照所述预设节点排序策略遍历所述关联关系图中的节点,将所述关联关系图中当前度数最大的节点确定为所述遍历到的节点;
    定位单元,用于将所述遍历到的节点对应的作业请求定位到所述遍历到的节点指代的数据所在的任一服务器中;
    删除单元,用于将以所述遍历到的节点作为源节点的有向边删除;
    所述确定单元、所述定位单元和所述删除单元还用于继续按照所述预设节点排序策略遍历所述关联关系图中的节点,将所述关联关系图中当前度数最大的节点确定为所述遍历到的节点,对所述遍历到的节点进行所述定位作业请求的步骤和所述删除对应有向边的步骤,直至所述关联关系图中所有节点的度数均不大于1为止。
  12. 根据权利要求10所述的调度器,其特征在于,所述调度器还包括:
    日志获取模块,用于获取日志记录,所述日志记录包括作业请求记录和数据处理记录;
    模型构建模块,用于根据所述作业请求记录构建随机排队模型,基于所述随机排队模型确定所述作业请求记录中的每个作业请求的输入数据和输出数据;
    数据关联模型构建模块,用于根据所述数据处理记录构建数据关联模型,所述数据关联模型中包括多个节点,每个节点指代所述数据处理记录中的一条数据;确定所述每个作业请求的输入数据对应的源节点和输出数据对应的目的节点,在所述数据关联模型中添加由所述源节点指向所述目的节点的有向边,得到所述关联关系图。
  13. 根据权利要求10所述的调度器,其特征在于,所述调度器还包括:
    轮数计算模块,用于计算作业请求记录中多个作业请求均执行结束所需的执行轮数W;
    优先度计算模块,用于根据每个作业请求的作业执行频率和作业执行时间,计算每个作业请求的优先度,所述优先度与所述作业执行频率和所述作业执行时间正相关;
    轮次确定模块,用于按照优先度从高到低的顺序对所述多个作业请求进行排序,将排名第nW+m位的作业请求和排名第(n+2)W+1-m位的作业请求的执行轮次确定为m,m为正整数,且m不大于W,n为整数;
    所述调度模块,还用于根据每个作业请求的执行轮次,将所述至少一个作业请求依次调度至所定位的服务器中。
  14. 根据权利要求10所述的调度器,其特征在于,所述调度器还包括:
    集群规模设置模块,用于初始化每个作业请求的集群规模,所述作业请求对应的数据占用的服务器数量与所述作业请求的集群规模正相关;
    集群规模调整模块,用于计算每个作业请求在当前的集群规模下的作业执行时间,并增大作业执行时间最长的作业请求的集群规模;继续计算每个作业请求在当前的集群规模下的作业执行时间,并增大作业执行时间最长的作业请求的集群规模,直至作业执行时间最长的作业请求的集群规模对应的服务器数量等于所述云计算系统中的服务器总数量时为止。
  15. 根据权利要求14所述的调度器,其特征在于,每个作业请求包括映射map任务、混洗shuffle任务和归并reduce任务,所述集群规模调整模块,还用于采用以下公式,计算任一作业请求在当前的集群规模下的作业执行时间:
    L_j(r) = S_j^input/(r·k·μ_map) + ⌈N_j^reduce/(rk)⌉·(r-1)·V·S_j^shuffle/(r·N_j^reduce·B) + S_j^output/(r·k·μ_reduce)
    其中,L_j(r)表示第j个作业请求的作业执行时间,r表示第j个作业请求当前分配的机架数量,k表示一个机架中的服务器数量,S_j^input表示第j个作业请求输入文件的大小,S_j^output表示第j个作业请求输出文件的大小,S_j^shuffle表示第j个作业请求的shuffle文件的大小,μ_map表示map任务的单机平均处理效率,μ_reduce表示reduce任务的单机平均处理效率,N_j^reduce表示第j个作业请求的reduce任务的数量,V表示过载率,B表示服务器的带宽。
  16. 根据权利要求14所述的调度器,其特征在于,所述调度器还包括:
    排序模块,用于确定所述关联关系图中当前长度最大的关系链,所述关系链为一个第一节点以及由任一第二节点到达所述第一节点所经过的节点及有向边组成的集合,所述第一节点为不存在以所述第一节点为源节点的有向边的节点,所述第二节点为存在至少两条以所述第二节点为源节点的有向边的节点,或者为存在一条以所述第二节点为源节点的有向边且不存在以所述第二节点为目的节点的有向边的节点,所述关系链的长度由所述关系链中包含的有向边个数确定;按照集群规模从大到小的顺序,对所述当前长度最大的关系链中的有向边表示的作业请求进行排序,将集群规模最大的作业请求确定为指定作业请求;
    数据定位模块,用于按照所述指定作业请求的集群规模,将所述指定作业请求对应的数据定位到数量与所述指定作业请求的集群规模匹配的服务器中,并按照所述当前长度最大的关系链中其他有向边表示的作业请求的集群规模,将所述其他作业请求对应的数据定位到所述指定作业请求所定位的且数量与所述其他作业请求的集群规模匹配的服务器中;
    删除模块,用于将所述当前长度最大的关系链删除;
    所述排序模块、所述数据定位模块和所述删除模块还用于继续确定所述关联关系图中当前长度最大的关系链,对确定的关系链进行所述定位数据的步骤,直至所述关联关系图中的每条有向边表示的作业请求对应的数据定位完成。
  17. 根据权利要求10-16任一项所述的调度器,其特征在于,所述调度器还包括:
    更新模块,用于当作业请求执行结束后,对日志记录进行更新,所述日志记录包括作业请求记录和数据处理记录;
    所述更新模块,还用于根据更新后的日志记录,对所述关联关系图进行更新。
  18. 根据权利要求17所述的调度器,其特征在于,所述关联关系图中每条有向边的请求数量指代对应的作业请求的个数,每条有向边的权重指代对应作业请求的执行频率,所述更新模块,包括:
    第一更新单元,用于根据更新后的作业请求记录,确定更新后的多个作业请求以及每个作业请求更新后的执行频率;
    第二更新单元,用于根据所述更新后的多个作业请求,对所述关联关系图中的有向边进行更新,并对每条有向边的请求数量进行更新;
    第三更新单元,用于根据所述每个作业请求更新后的执行频率,对所述关联关系图中每条有向边的权重进行更新。
  19. 一种调度器,其特征在于,所述调度器包括:处理器和存储器,所述存储器中存储有至少一条指令,所述指令由所述处理器加载并执行以实现如权利要求1至9任一项所述的方法中所执行的操作。
  20. 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质中存储有至少一条指令,所述指令由处理器加载并执行以实现如权利要求1至9任一项所述的方法中所执行的操作。
  21. 一种云计算系统,其特征在于,所述云计算系统包括多个服务器和如权利要求10至18任一项所述的调度器。
PCT/CN2019/074017 2018-03-23 2019-01-30 调度方法、调度器、存储介质及系统 WO2019179250A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP19770974.4A EP3770774B1 (en) 2018-03-23 2019-01-30 Scheduling method, scheduler, storage medium, and system
US17/021,425 US11190618B2 (en) 2018-03-23 2020-09-15 Scheduling method, scheduler, storage medium, and system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810244746.5A CN110297699B (zh) 2018-03-23 2018-03-23 调度方法、调度器、存储介质及系统
CN201810244746.5 2018-03-23

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/021,425 Continuation US11190618B2 (en) 2018-03-23 2020-09-15 Scheduling method, scheduler, storage medium, and system

Publications (1)

Publication Number Publication Date
WO2019179250A1 true WO2019179250A1 (zh) 2019-09-26

Family

ID=67986645

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/074017 WO2019179250A1 (zh) 2018-03-23 2019-01-30 调度方法、调度器、存储介质及系统

Country Status (4)

Country Link
US (1) US11190618B2 (zh)
EP (1) EP3770774B1 (zh)
CN (1) CN110297699B (zh)
WO (1) WO2019179250A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112615846A (zh) * 2020-12-14 2021-04-06 重庆邮电大学 一种基于dag的区块链系统认证门限的更新方法
CN115913975A (zh) * 2022-11-07 2023-04-04 奇安信网神信息技术(北京)股份有限公司 有向拓扑图布局方法、装置、电子设备及存储介质
CN116562054A (zh) * 2023-07-06 2023-08-08 西安羚控电子科技有限公司 一种多实体协同实时仿真系统的构建方法及装置

Families Citing this family (12)

Publication number Priority date Publication date Assignee Title
CN110297699B (zh) * 2018-03-23 2021-09-14 华为技术有限公司 调度方法、调度器、存储介质及系统
US11533268B2 (en) * 2018-03-30 2022-12-20 Intel Corporation Methods and apparatus to schedule service requests in a network computing system using hardware queue managers
US10820057B2 (en) 2018-11-07 2020-10-27 Nvidia Corp. Scalable light-weight protocols for wire-speed packet ordering
US11108704B2 (en) * 2018-12-04 2021-08-31 Nvidia Corp. Use of stashing buffers to improve the efficiency of crossbar switches
CN111953614B (zh) * 2020-08-07 2023-10-24 腾讯科技(深圳)有限公司 数据传输方法、装置、处理设备及介质
US20220197707A1 (en) * 2020-12-17 2022-06-23 EMC IP Holding Company LLC System and method for efficient data collection based on data access pattern for reporting in large scale multi tenancy environment
CN112817770B (zh) * 2021-03-09 2024-01-30 皖西学院 一种提高分布式边缘计算系统负载均衡的方法
CN113377540B (zh) * 2021-06-15 2024-08-09 上海商汤科技开发有限公司 集群资源调度方法及装置、电子设备和存储介质
WO2023069384A1 (en) * 2021-10-19 2023-04-27 Google Llc Large-scale accelerator system energy performance optimization
CN114168198B (zh) * 2022-02-10 2022-04-26 北京创新乐知网络技术有限公司 线上处理流程调整方法、系统及配置中心、服务端
US11770215B2 (en) 2022-02-17 2023-09-26 Nvidia Corp. Transceiver system with end-to-end reliability and ordering protocols
CN114816715B (zh) * 2022-05-20 2022-11-22 中国地质大学(北京) 一种面向跨地域的流计算延迟优化方法及装置

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120120920A1 (en) * 2003-11-05 2012-05-17 Interdigital Technology Corporation Method and wireless transmit/receive unit for supporting an enhanced uplink dedicated channel inter-node-b serving cell change
CN106201356A (zh) * 2016-07-14 2016-12-07 北京理工大学 一种基于链路可用带宽状态的动态数据调度方法
CN106610866A (zh) * 2016-06-17 2017-05-03 四川用联信息技术有限公司 云存储环境下一种服务价值约束的任务调度算法
CN106663075A (zh) * 2014-09-02 2017-05-10 起元科技有限公司 执行基于图的程序规范

Family Cites Families (24)

Publication number Priority date Publication date Assignee Title
US7496912B2 (en) * 2004-02-27 2009-02-24 International Business Machines Corporation Methods and arrangements for ordering changes in computing systems
US7870556B2 (en) * 2006-05-16 2011-01-11 Ab Initio Technology Llc Managing computing resources in graph-based computations
US9424315B2 (en) * 2007-08-27 2016-08-23 Teradata Us, Inc. Methods and systems for run-time scheduling database operations that are executed in hardware
AU2009322602B2 (en) * 2008-12-02 2015-06-25 Ab Initio Technology Llc Mapping instances of a dataset within a data management system
EP2399192A4 (en) * 2009-02-13 2016-09-07 Ab Initio Technology Llc COMMUNICATION WITH DATA STORAGE SYSTEMS
US10673952B1 (en) * 2014-11-10 2020-06-02 Turbonomic, Inc. Systems, apparatus, and methods for managing computer workload availability and performance
CN101710286A (zh) * 2009-12-23 2010-05-19 天津大学 面向dag数据驱动型应用的并行编程模型系统和实现方法
CN101916162B (zh) * 2010-08-05 2012-11-28 中国工商银行股份有限公司 一种基于有向图的动态界面生成方法、服务器及系统
CN102591712B (zh) 2011-12-30 2013-11-20 大连理工大学 一种云计算中依赖任务的解耦并行调度方法
CN104714753A (zh) * 2013-12-12 2015-06-17 中兴通讯股份有限公司 一种数据访问存储方法及装置
US9838478B2 (en) * 2014-06-30 2017-12-05 International Business Machines Corporation Identifying a task execution resource of a dispersed storage network
CN104636204B (zh) * 2014-12-04 2018-06-01 中国联合网络通信集团有限公司 一种任务调度方法与装置
US20170329645A1 (en) * 2014-12-19 2017-11-16 Intel Corporation Apparatus and method for adding nodes to a computing cluster
CN106325756B (zh) * 2015-06-15 2020-04-24 阿里巴巴集团控股有限公司 一种数据存储、数据计算方法和设备
US9934071B2 (en) * 2015-12-30 2018-04-03 Palo Alto Research Center Incorporated Job scheduler for distributed systems using pervasive state estimation with modeling of capabilities of compute nodes
CN105912383A (zh) 2016-05-05 2016-08-31 中国人民解放军国防科学技术大学 一种高可靠性的依赖任务调度与资源配置方法
CN107688488B (zh) * 2016-08-03 2020-10-20 中国移动通信集团湖北有限公司 一种基于元数据的任务调度的优化方法及装置
CN106331150B (zh) * 2016-09-18 2018-05-18 北京百度网讯科技有限公司 用于调度云服务器的方法和装置
US10261837B2 (en) * 2017-06-30 2019-04-16 Sas Institute Inc. Two-part job scheduling with capacity constraints and preferences
CN109218355B (zh) * 2017-06-30 2021-06-15 华为技术有限公司 负载均衡引擎,客户端,分布式计算系统以及负载均衡方法
CN111079942B (zh) * 2017-08-30 2023-03-24 第四范式(北京)技术有限公司 执行机器学习的分布式系统及其方法
CN107590254B (zh) * 2017-09-19 2020-03-17 华南理工大学 具有合并处理方法的大数据支撑平台
US10606640B2 (en) * 2017-12-23 2020-03-31 International Business Machines Corporation Rescheduling high performance computing jobs based on personalized sanity checks and job problem resolution classification
CN110297699B (zh) * 2018-03-23 2021-09-14 华为技术有限公司 调度方法、调度器、存储介质及系统

Patent Citations (4)

Publication number Priority date Publication date Assignee Title
US20120120920A1 (en) * 2003-11-05 2012-05-17 Interdigital Technology Corporation Method and wireless transmit/receive unit for supporting an enhanced uplink dedicated channel inter-node-b serving cell change
CN106663075A (zh) * 2014-09-02 2017-05-10 起元科技有限公司 执行基于图的程序规范
CN106610866A (zh) * 2016-06-17 2017-05-03 四川用联信息技术有限公司 云存储环境下一种服务价值约束的任务调度算法
CN106201356A (zh) * 2016-07-14 2016-12-07 北京理工大学 一种基于链路可用带宽状态的动态数据调度方法

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3770774A4

Cited By (5)

Publication number Priority date Publication date Assignee Title
CN112615846A (zh) * 2020-12-14 2021-04-06 重庆邮电大学 一种基于dag的区块链系统认证门限的更新方法
CN112615846B (zh) * 2020-12-14 2022-03-22 重庆邮电大学 一种基于dag的区块链系统认证门限的更新方法
CN115913975A (zh) * 2022-11-07 2023-04-04 奇安信网神信息技术(北京)股份有限公司 有向拓扑图布局方法、装置、电子设备及存储介质
CN116562054A (zh) * 2023-07-06 2023-08-08 西安羚控电子科技有限公司 一种多实体协同实时仿真系统的构建方法及装置
CN116562054B (zh) * 2023-07-06 2023-10-13 西安羚控电子科技有限公司 一种多实体协同实时仿真系统的构建方法及装置

Also Published As

Publication number Publication date
EP3770774A4 (en) 2021-05-26
EP3770774A1 (en) 2021-01-27
EP3770774B1 (en) 2022-09-28
CN110297699B (zh) 2021-09-14
US20200412835A1 (en) 2020-12-31
CN110297699A (zh) 2019-10-01
US11190618B2 (en) 2021-11-30

Similar Documents

Publication Publication Date Title
WO2019179250A1 (zh) 调度方法、调度器、存储介质及系统
Ge et al. GA-based task scheduler for the cloud computing systems
CN102170396B (zh) 一种基于区分服务的云存储系统QoS控制方法
CN114138486A (zh) 面向云边异构环境的容器化微服务编排方法、系统及介质
US8843929B1 (en) Scheduling in computer clusters
Tantalaki et al. Pipeline-based linear scheduling of big data streams in the cloud
CN114610474B (zh) 一种异构超算环境下多策略的作业调度方法及系统
CN111782627B (zh) 面向广域高性能计算环境的任务与数据协同调度方法
WO2024016596A1 (zh) 容器集群调度的方法、装置、设备及存储介质
Khalifa¹ et al. Collaborative autonomic resource management system for mobile cloud computing
Gabi et al. Systematic review on existing load balancing techniques in cloud computing
CN112306642B (zh) 一种基于稳定匹配博弈理论的工作流调度方法
Qureshi et al. Grid resource allocation for real-time data-intensive tasks
Tao et al. Congestion-aware traffic allocation for geo-distributed data centers
Zhang et al. Employ AI to improve AI services: Q-learning based holistic traffic control for distributed co-inference in deep learning
CN108304253A (zh) 基于缓存感知和数据本地性的map任务调度方法
Abdalkafor et al. A hybrid approach for scheduling applications in cloud computing environment
CN117909061A (zh) 基于gpu混合集群的模型任务处理系统和资源调度方法
Ru et al. An efficient deadline constrained and data locality aware dynamic scheduling framework for multitenancy clouds
Patel et al. An improved approach for load balancing among heterogeneous resources in computational grids
Goswami et al. Deadline stringency based job scheduling in computational grid environment
Loganathan et al. Job scheduling with efficient resource monitoring in cloud datacenter
CN112698911B (zh) 一种基于深度强化学习的云作业调度方法
Setia et al. Literature survey on various scheduling approaches in grid computing environment
Yan et al. Cloud computing workflow framework with resource scheduling mechanism

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19770974

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 2019770974

Country of ref document: EP