CN117640770A - Application scheduling method, cloud service platform and related equipment - Google Patents

Application scheduling method, cloud service platform and related equipment

Info

Publication number
CN117640770A
CN117640770A
Authority
CN
China
Prior art keywords
application
scheduling
information
cloud service
service platform
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310181241.XA
Other languages
Chinese (zh)
Inventor
张嘉伟
王雷博
黄毽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Cloud Computing Technologies Co Ltd
Original Assignee
Huawei Cloud Computing Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Cloud Computing Technologies Co Ltd filed Critical Huawei Cloud Computing Technologies Co Ltd
Priority to PCT/CN2023/104403 priority Critical patent/WO2024032239A1/en
Publication of CN117640770A publication Critical patent/CN117640770A/en
Pending legal-status Critical Current


Abstract

The embodiments of the present application disclose an application scheduling method, a cloud service platform and related equipment, which reduce the difficulty of resource purchase and application deployment for tenants while maximizing resource utilization. The method is applied to a cloud service platform that manages an infrastructure, where the infrastructure comprises a plurality of data centers arranged in different regions, each data center being provided with a plurality of servers. The method includes obtaining a tenant-input application scheduling target for a target application deployed in a distributed manner across at least one data center of the infrastructure; determining an application topology of the target application and resource information representing the resource usage and/or distribution in the infrastructure of each micro-service in the application topology; determining, according to the application topology and the resource information, a first scheduling policy that meets the application scheduling target; and adjusting the resource usage and/or distribution in the infrastructure of at least one micro-service in the application topology based on the first scheduling policy, so as to meet the application scheduling target.

Description

Application scheduling method, cloud service platform and related equipment
The present application claims priority from Chinese patent application No. 202210968594.X, filed on 12 August 2022 and entitled "a data processing method and related apparatus", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of cloud computing, and in particular, to an application scheduling method, a cloud service platform, and related devices.
Background
In recent years, with the release of the national "East Data, West Computing" strategy and the growing demand from tenants to schedule applications across multiple regions to meet feature requirements, distributed cloud and multi-cloud services that integrate cloud services from multiple regions have emerged. For ease of presentation, distributed cloud and multi-cloud services are hereinafter collectively referred to as distributed cloud services; they allow tenants to manage and use the cloud resources of different regions on one interface.
In existing cloud resource processing methods, a tenant using a distributed cloud service must purchase resources manually; after purchase, the tenant either specifies the mapping itself or relies on basic platform scheduling to map the purchased resources to the tasks included in the application, thereby completing the scheduling of the application.
In this method, because the purchased resources depend on the tenant's own decisions, application scheduling likewise depends on the tenant's decisions or on the platform's basic scheduling capability. Moreover, the performance and price of different resources vary widely in a distributed cloud scenario, and application scheduling must fully account for the resource characteristics of each region. The threshold for a tenant to manually complete resource purchase and application scheduling is therefore high; that is, it is difficult to maximize resource utilization while meeting the tenant's demands.
Disclosure of Invention
An application scheduling method, a cloud service platform and related equipment are provided. In the application scheduling method, the cloud service platform can acquire a tenant-input application scheduling target of a target application deployed in a distributed manner across at least one data center of the infrastructure managed by the cloud service platform, and then determine a first scheduling policy that meets the application scheduling target according to the application topology and resource information of the target application. The tenant is not required to consider factors at each layer, which reduces the difficulty of resource purchase and application deployment. Meanwhile, the first scheduling policy, determined by combining the application topology and the resource information, can both meet the requirements and utilize resources to the maximum extent.
A first aspect of the present application provides an application scheduling method applied to a cloud service platform, where the cloud service platform manages an infrastructure, the infrastructure comprises a plurality of data centers disposed in different regions, and each data center is provided with a plurality of servers. The cloud service platform may obtain an application scheduling target, input by a tenant, of a target application that is deployed in a distributed manner across at least one data center of the infrastructure. The cloud service platform may acquire the application scheduling target through man-machine interaction, for example by displaying an application deployment interface in which the tenant inputs the application scheduling target of the target application. That is, after logging in to the cloud service platform, the tenant may open the application deployment interface and input the application scheduling target of the target application there. Alternatively, the cloud service platform may obtain the application scheduling target input by the tenant through an application programming interface (API), a template-upload interface for a scripting language, and the like. The application scheduling target refers to the scheduling effect the tenant wants to achieve; this effect covers scheduling cost, scheduling performance, scheduling quality and other aspects, and is not limited in this embodiment. Alternatively, the application scheduling target may reflect various metrics entered by the tenant.
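To make the shape of such a tenant-input scheduling target concrete, the following is a minimal illustrative sketch. The class and field names are assumptions for illustration only; the patent does not specify a schema.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical representation of the tenant-input application scheduling
# target: a priority mode plus optional acceptable bounds on the other metrics.
@dataclass
class SchedulingTarget:
    priority: str                                        # "cost" | "performance" | "quality"
    max_end_to_end_latency_ms: Optional[float] = None    # acceptable end-to-end latency
    max_outage_probability: Optional[float] = None       # tolerable outage probability
    max_total_price: Optional[float] = None              # budget over the application life cycle

    def validate(self) -> bool:
        # A well-formed target names one of the three priority modes.
        return self.priority in {"cost", "performance", "quality"}

# A target as a tenant might submit it through a deployment interface or API:
target = SchedulingTarget(priority="cost",
                          max_end_to_end_latency_ms=200.0,
                          max_outage_probability=0.001)
```

Whether collected from an interactive interface or an API, the platform would validate such a target before computing a scheduling policy against it.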
The cloud service platform also determines the application topology and resource information corresponding to the target application. The application topology indicates the task characteristics of the target application, namely the calling relations among the tasks it includes; the resource information represents the resource usage and/or distribution in the infrastructure of each micro-service in the application topology. The cloud service platform determines a first scheduling policy that meets the application scheduling target based on the application topology and the resource information, and adjusts the resource usage and/or distribution in the infrastructure of at least one micro-service in the application topology based on the first scheduling policy, so as to meet the application scheduling target.
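One simple way to picture the two inputs just described is a call graph between micro-services plus a per-service placement map. The sketch below is illustrative only; all identifiers are assumptions, not the patented data structures.

```python
from collections import defaultdict

# Hypothetical application topology: calling relations among micro-services.
class ApplicationTopology:
    def __init__(self):
        self.calls = defaultdict(list)   # caller -> list of callees

    def add_call(self, caller: str, callee: str):
        self.calls[caller].append(callee)

    def downstream(self, service: str):
        # Services directly called by the given micro-service.
        return list(self.calls[service])

# Hypothetical resource information: micro-service -> {data center: replicas},
# i.e. the current resource usage/distribution of each service in the infrastructure.
topology = ApplicationTopology()
topology.add_call("frontend", "cart")
topology.add_call("cart", "inventory")
resource_info = {"frontend":  {"dc-region1": 2},
                 "cart":      {"dc-region1": 1, "dc-region2": 1},
                 "inventory": {"dc-region2": 3}}
```

A scheduling policy would then amount to rewriting entries of such a map, e.g. moving `inventory` replicas between regions.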
In the present application, the cloud service platform can acquire a tenant-input application scheduling target of a target application deployed in a distributed manner across at least one data center of the infrastructure managed by the cloud service platform, and then determine a first scheduling policy that meets the application scheduling target according to the application topology and resource information of the target application. The tenant is not required to consider factors at each layer, which reduces the difficulty of resource purchase and application deployment. Meanwhile, the first scheduling policy determined by combining the application topology and the resource information can both meet the requirements and utilize resources to the maximum extent.
In a possible implementation of the first aspect, the cloud service platform determines the application topology of the target application according to target application information, where the target application information reflects attribute information of the target application. The target application information is input by the tenant, and/or obtained by the cloud service platform by detecting the running state of the target application. The tenant may input the target application information in several ways: in the application deployment interface, or through the API. In other words, from the perspective of the cloud service platform, the target application information is obtained either in response to the tenant's operation on the application deployment interface or through the API.
In the present application, the cloud service platform can obtain the target application information in several ways, which enriches the implementations of the technical solution, allows it to be applied flexibly to different scenarios, and improves its practicability.
In a possible implementation of the first aspect, after acquiring the resource information, the cloud service platform determines a resource model of the cloud service platform according to the resource information, and then processes the application topology and the resource model with a decision algorithm to determine the first scheduling policy. The decision algorithm determines the resource usage and/or distribution mode corresponding to the target application, that is, it determines the resource scheduling policy.
In a possible implementation of the first aspect, the decision algorithm includes multiple parallel algorithms of different performance, or multiple parallel algorithms of different performance together with an artificial intelligence (AI) algorithm. The parallel algorithms each process the application topology and the resource model to determine the first scheduling policy, and the AI algorithm corrects the latency errors of the target application information and the resource information.
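The idea of running several decision algorithms of different performance in parallel and keeping the best feasible result can be sketched as follows. The solver functions are stand-ins, assumed for illustration; they are not the patented algorithms.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in solvers of different cost/accuracy; each returns a candidate
# scheduling policy with a quality score (higher is better).
def greedy_solver(topology, model):
    return {"policy": "greedy", "score": 0.7}

def ilp_solver(topology, model):
    return {"policy": "ilp", "score": 0.9}

def decide(topology, model, solvers):
    # Run all candidate algorithms in parallel on the same topology and
    # resource model, then keep the highest-scoring candidate policy.
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(s, topology, model) for s in solvers]
        results = [f.result() for f in futures]
    return max(results, key=lambda r: r["score"])

best = decide({}, {}, [greedy_solver, ilp_solver])
```

In this sketch the cheap greedy result is still computed, so a platform could fall back on it if the slower solver times out.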
In the present application, the decision method has several forms. Where it comprises only multiple parallel algorithms of different performance, the computational load is small, saving computing resources; where it comprises multiple parallel algorithms plus an AI algorithm, the latency errors can be corrected, improving the accuracy of the calculation.
In a possible implementation of the first aspect, the cloud service platform may determine the first scheduling policy that meets the application scheduling target in response to a tenant operation. Specifically, the cloud service platform obtains a selection instruction input by the tenant, where the selection instruction indicates a first scheduling policy among at least one scheduling policy, each of which meets the application scheduling target; the cloud service platform then determines the first scheduling policy from the at least one scheduling policy in response to the selection instruction. The selection instruction may be acquired in various ways, for example through a displayed scheduling-effect interface or through the API, which is not limited here. It can be understood that the cloud service platform can also display the scheduling effects of the different scheduling policies through the scheduling-effect interface, providing a reference for the tenant's selection.
In a possible implementation of the first aspect, if the cloud service platform computes only one scheduling policy that meets the application scheduling target, it may skip displaying the scheduling-effect interface and take that scheduling policy as the first scheduling policy by default.
In the present application, the cloud service platform may determine multiple scheduling policies that meet the application scheduling target; in that case the first scheduling policy can be determined in response to the tenant's selection instruction, so that the final scheduling effect matches the tenant's actual requirements and the tenant's experience is improved.
In a possible implementation of the first aspect, the application scheduling target may be expressed in general terms as one of several effects: application scheduling cost priority, application scheduling performance priority, or application scheduling quality priority. Cost priority means minimizing cost while keeping end-to-end latency and outage probability within acceptable ranges. Performance priority means minimizing end-to-end latency while keeping cost and outage probability within acceptable ranges. Quality priority means minimizing outage probability while keeping end-to-end latency and cost within acceptable ranges.
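Each priority mode above is a constrained optimization: minimize one metric subject to bounds on the other two. A minimal sketch of that selection logic, with assumed policy records and bound values for illustration:

```python
def feasible(policy, bounds):
    # A candidate policy is acceptable only if every metric stays in range.
    return (policy["latency_ms"] <= bounds["latency_ms"]
            and policy["outage_prob"] <= bounds["outage_prob"]
            and policy["cost"] <= bounds["cost"])

def pick(policies, priority, bounds):
    # Minimize the metric named by the priority mode over feasible policies.
    key = {"cost": "cost",
           "performance": "latency_ms",
           "quality": "outage_prob"}[priority]
    ok = [p for p in policies if feasible(p, bounds)]
    return min(ok, key=lambda p: p[key]) if ok else None

policies = [
    {"name": "A", "cost": 10, "latency_ms": 120, "outage_prob": 0.01},
    {"name": "B", "cost": 15, "latency_ms": 80,  "outage_prob": 0.005},
]
bounds = {"cost": 20, "latency_ms": 150, "outage_prob": 0.02}
```

With these numbers, cost priority selects policy A while performance or quality priority selects policy B, which is exactly the trade-off the three modes express.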
In a possible implementation of the first aspect, the application scheduling target may reflect various indexes input by the tenant, including but not limited to: 1) end-to-end application latency, namely the latency from task submission to result return as perceived by the tenant; 2) the latency of a single task or of part of a task chain; 3) the outage probability the application can tolerate; 4) the total price of the scheduled resources over the life cycle of the application.
In the present application, the application scheduling target can reflect the application scheduling effect in multiple dimensions, which enriches the implementations of the technical solution and improves the flexibility of the scheme.
In a possible implementation of the first aspect, the scheduling policy provided by the cloud service platform may be updated. Specifically, after the first scheduling policy meeting the application scheduling target has been determined, the target application information and the resource information of the cloud service platform may change as the target application and the platform run; the first scheduling policy may then no longer be applicable, or a better policy that still meets the application scheduling target may exist. The cloud service platform can update the target application information and the resource information in time and then determine a second scheduling policy corresponding to the target application according to the updated information; the second scheduling policy indicates the resources and deployment mode corresponding to the target application and meets the application scheduling target.
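The refresh logic just described can be sketched as a small check: recompute a candidate policy from updated information and propose it as the second scheduling policy only if it differs meaningfully from the first. `compute_policy`, the score field, and the tolerance are assumptions for illustration.

```python
def maybe_update(current_policy, app_info, resource_info, compute_policy,
                 tolerance=0.05):
    # Recompute a candidate policy from the refreshed application and
    # resource information.
    candidate = compute_policy(app_info, resource_info)
    # Propose the candidate as a second scheduling policy only when the
    # improvement (or degradation) exceeds the tolerance; otherwise keep
    # the first scheduling policy to avoid churning deployments.
    if abs(candidate["score"] - current_policy["score"]) > tolerance:
        return candidate
    return current_policy

current = {"name": "first", "score": 0.80}
better = maybe_update(current, {}, {},
                      lambda a, r: {"name": "second", "score": 0.90})
```

A tolerance threshold like this keeps the platform from re-migrating micro-services for negligible gains while still reacting to real drift.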
In the present application, the cloud service platform can update the scheduling policy in real time, so the target application runs more smoothly, improving the practicability of the technical solution.
In a possible implementation of the first aspect, after determining the second scheduling policy corresponding to the target application, the cloud service platform may update the scheduling policy in response to a tenant operation. Specifically, the cloud service platform may display a policy-update interface containing an update control, where the update control corresponds to the second scheduling policy and reminds the tenant to decide whether to update the scheduling policy. In response to an operation instruction on the update control, the scheduling policy of the target application is then updated to the second scheduling policy.
In a possible implementation of the first aspect, the target application information includes the current application information of the target application, or both its current and historical application information; the resource information further includes the idle resources of the cloud service platform.
In a possible implementation of the first aspect, the target application information includes one or more of the following: the task computation amount and memory requirement corresponding to the target application, task model, concurrent-call information, task attribution information, task type, task execution sequence, task number, task data amount, and life cycle.
In a possible implementation of the first aspect, the resource information includes one or more of the following: node attribute information corresponding to the cloud service platform, transmission information between nodes and between nodes and tenants, node scheduling cost, and configuration information of the control plane corresponding to the nodes.
In the present application, the target application information and the resource information have multiple possible forms and can describe the attributes of the target application and the resource attributes of the cloud service platform from multiple angles. In practice, the corresponding information can be selected as needed, flexibly meeting different scenario requirements and further improving the practicability and flexibility of the technical solution.
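The information fields listed in the implementations above could be grouped roughly as follows. This is a hypothetical grouping for illustration; the patent specifies the categories of information, not a schema, so every field name here is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical container for target application information: per-task
# compute/memory demands, execution order, and life cycle.
@dataclass
class TargetApplicationInfo:
    task_compute: Dict[str, float] = field(default_factory=dict)  # task -> compute demand
    task_memory: Dict[str, float] = field(default_factory=dict)   # task -> memory demand
    call_order: List[str] = field(default_factory=list)           # task execution sequence
    lifecycle_days: int = 0                                       # application life cycle

# Hypothetical container for resource information: node attributes,
# inter-node transmission latency, and per-node scheduling cost.
@dataclass
class ResourceInfo:
    node_attributes: Dict[str, dict] = field(default_factory=dict)
    link_latency_ms: Dict[Tuple[str, str], float] = field(default_factory=dict)
    node_cost: Dict[str, float] = field(default_factory=dict)

info = TargetApplicationInfo(task_compute={"t1": 2.0},
                             task_memory={"t1": 4.0},
                             call_order=["t1"],
                             lifecycle_days=30)
```

Keeping the two inputs in separate structures mirrors the text's split between application-side and platform-side information, so either can be refreshed independently.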
A second aspect of the present application provides a cloud service platform for managing an infrastructure, the infrastructure including a plurality of data centers disposed in different regions, each data center being provided with a plurality of servers. The cloud service platform comprises: an acquisition unit, configured to acquire a tenant-input application scheduling target of a target application deployed in a distributed manner across at least one data center of the infrastructure; and a processing unit, configured to determine the application topology corresponding to the target application and resource information representing the resource usage and/or distribution in the infrastructure of each micro-service in the application topology, determine a first scheduling policy meeting the application scheduling target according to the application topology and the resource information, and adjust the resource usage and/or distribution in the infrastructure of at least one micro-service in the application topology based on the first scheduling policy, so as to meet the application scheduling target.
It should be noted that the second aspect and any of its implementations are the device counterparts of the first aspect and its implementations; the descriptions in the first aspect and any of its implementations apply to the second aspect and its implementations, and are not repeated here.
A third aspect of the present application provides a cluster of computing devices, comprising at least one computing device, each computing device comprising a processor and a memory; the processor of the at least one computing device is configured to execute instructions stored in the memory of the at least one computing device to cause the cluster of computing devices to implement the method disclosed in the first aspect and any one of the possible implementations of the first aspect.
A fourth aspect of the present application provides a computer program product comprising instructions which, when executed by a cluster of computing devices, cause the cluster of computing devices to implement the method disclosed in the first aspect and any one of the possible implementations of the first aspect.
A fifth aspect of the present application provides a computer readable storage medium comprising computer program instructions which, when executed by a cluster of computing devices, cause the cluster of computing devices to perform the method disclosed in the first aspect and any one of the possible implementations of the first aspect.
The advantageous effects of the second to fifth aspects are similar to those of the first aspect and any of its possible implementations, and are not described here again.
Drawings
Fig. 1 is a schematic architecture diagram according to an embodiment of the present application;
Fig. 2a is a schematic architecture diagram of a scheduling system according to an embodiment of the present application;
Fig. 2b is another schematic architecture diagram of a scheduling system according to an embodiment of the present application;
Fig. 2c is another schematic diagram of a scheduling system according to an embodiment of the present application;
Fig. 3 is a schematic flow chart of an application scheduling method according to an embodiment of the present application;
Fig. 4a is a schematic diagram of an application deployment interface according to an embodiment of the present application;
Fig. 4b is another schematic diagram of an application deployment interface according to an embodiment of the present application;
Fig. 5 is a schematic framework diagram of an application topology and platform model generation subsystem according to an embodiment of the present application;
Fig. 6 is a schematic diagram of an algorithm architecture according to an embodiment of the present application;
Fig. 7 is a schematic diagram of a scheduling-effect interface according to an embodiment of the present application;
Fig. 8 is a schematic diagram of a policy-update interface according to an embodiment of the present application;
Fig. 9 is another flow chart of an application scheduling method according to an embodiment of the present application;
Fig. 10 is another flow chart of an application scheduling method according to an embodiment of the present application;
Fig. 11 is another flow chart of an application scheduling method according to an embodiment of the present application;
Fig. 12 is a schematic structural diagram of a cloud service platform according to an embodiment of the present application;
Fig. 13 is a schematic structural diagram of a computing device according to an embodiment of the present application;
Fig. 14 is a schematic structural diagram of a computing device cluster according to an embodiment of the present application.
Detailed Description
An application scheduling method, a cloud service platform and related equipment are provided. In the application scheduling method, a tenant-input application scheduling target of a target application deployed in a distributed manner across at least one data center of the infrastructure managed by the cloud service platform can be obtained, and a first scheduling policy meeting the application scheduling target is then determined according to the application topology and resource information of the target application. The tenant is not required to consider factors at each layer, which reduces the difficulty of resource purchase and application deployment. Meanwhile, the first scheduling policy determined by combining the application topology and the resource information can both meet the requirements and utilize resources to the maximum extent.
Embodiments of the present application are described below with reference to the accompanying drawings. As one of ordinary skill in the art can appreciate, with the development of technology and the emergence of new scenarios, the technical solutions provided in the embodiments of the present application are also applicable to similar technical problems.
The terms "first", "second" and the like in the description, the claims, and the above figures are used to distinguish between similar objects and not necessarily to describe a particular sequence or chronological order. It is to be understood that terms so used are interchangeable under appropriate circumstances and merely distinguish objects of the same nature when describing the embodiments of the application. Furthermore, the terms "comprises", "comprising", and "having", and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of elements is not necessarily limited to those elements, but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. In addition, "at least one" means one or more, and "a plurality" means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may indicate: A alone, both A and B, or B alone, where A and B may be singular or plural. The character "/" generally indicates an "or" relationship between the associated objects. "At least one of" and the like means any combination of the listed items, including any combination of single or plural items. For example, "at least one of a, b, or c" may represent: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c may be single or plural.
First, concepts and related terms that may be referred to in the present application will be described.
Service instance: the type of a service instance comprises one or any combination of a virtual machine, a container, and a dedicated host.
Virtual machine (VM): a complete computer system simulated by software, with full hardware-system functions, running in a fully isolated environment. When a virtual machine is created on a server, part of the physical machine's hard-disk and memory capacity is used as the virtual machine's hard disk and memory; each virtual machine has an independent hard disk and operating system, and the virtual machine's tenant can operate it as if it were a server.
Container: a virtualization technology in computer operating systems that lets a process run in a relatively independent, isolated environment (comprising an independent file system, namespace, resource view, and so on). It simplifies the software deployment workflow, enhances the portability and security of software, and improves the utilization of system resources.
Dedicated host (DeH): a physical host whose resources are held exclusively by one tenant. By creating cloud servers on a dedicated host, the tenant can meet higher requirements of cloud servers for isolation, security, and performance; the tenant of a dedicated host does not need to share the host's physical resources with other tenants.
Cloud management platform and infrastructure: the cloud management platform manages a cloud vendor's infrastructure, which consists of multiple cloud data centers arranged in different regions, with at least one cloud data center per region. The cloud management platform provides interfaces related to cloud computing services, such as configuration pages (interfaces) or APIs, for tenants to access cloud services. A tenant can log in to the cloud management platform with a pre-registered account and password and, after a successful login, select and purchase cloud services provided by the cloud data centers in the chosen regions, such as object storage services, virtual machine services, container services, or other known cloud services.
Tenant: the top-level object that uses and manages cloud services and/or cloud resources on the cloud management platform. A tenant registers a tenant account and sets a tenant password on the cloud management platform through a local client (such as a browser); the local client then remotely logs in to the cloud management platform with that account and password. The cloud management platform provides a configuration interface or API for the tenant to configure and use the cloud services, which are specifically provided by the infrastructure managed by the cloud management platform.
Next, referring to fig. 1, fig. 1 is a schematic diagram of an architecture provided in an embodiment of the present application.
As shown in fig. 1, a tenant uses the client 10 to log in to the cloud service platform 30 via the internet 20 with an account and password registered on the cloud service platform 30. The cloud service platform 30 manages an infrastructure comprising a plurality of data centers disposed in different regions; for example, region 1 in fig. 1 includes cloud data center 1 and cloud data center 2, and region 2 includes cloud data center 3 and cloud data center 4. Each cloud data center is provided with a plurality of servers, on which service instances (comprising at least one of virtual machines, containers, and dedicated hosts) run.
In this embodiment, an application scheduling service is deployed in a service instance. A tenant purchases the cloud service on the cloud service platform 30 through the client, and sends a call request to the cloud service platform 30 to request the cloud service. The cloud service specifically determines, for the tenant, the resource usage and/or distribution of at least one micro-service in the application topology corresponding to the target application, so that the target application can run normally.
Next, an application scheduling method provided in the embodiment of the present application will be described in terms of a scheduling system.
Referring to fig. 2a, fig. 2a is a schematic architecture diagram of an application scheduling system according to an embodiment of the present application.
As shown in fig. 2a, the cloud service platform may be divided by logical function into a tenant console, control-plane functions, and data-plane functions. The tenant console provides an interface or API to interact with the tenant and thereby obtain the application scheduling target. The control-plane functions determine a scheduling policy that meets the application scheduling target. The data-plane functions adjust, based on the scheduling policy, the resource usage and/or distribution in the infrastructure of at least one micro-service in the application topology of the target application.
As shown in fig. 2a, the application scheduling system provided in the embodiment of the present application may form a distributed cloud global scheduling system architecture. The distributed cloud infrastructure is spread across different regions, and the specifications of the regions differ: for example, the server models and their numbers may vary, and some regions may be located at the network edge. Illustratively, in the embodiment shown in fig. 2a, each region includes a different number of servers.
The technical solution of the present application may be implemented as a software system on a distributed cloud infrastructure, which comprises the following subsystems: 1) The index acquisition subsystem collects the resource information and application information of the cloud service platform. 2) The application and platform model generation subsystem constructs the application topology and the resource model of the cloud service platform based on the acquired information. 3) The decision algorithm calculates a scheduling policy based on the models output by the application and platform model generation subsystem. 4) The adaptive elasticity subsystem obtains an updated resource purchasing scheme. 5) The global unified scheduling subsystem calculates an application scheduling scheme. 6) The intelligent tuning subsystem adjusts the resource purchasing and application scheduling policies, that is, the scheduling policy of the application, according to historical resource purchases and application execution. The scheduling policy obtained by the system can then be passed to the control plane of the cloud service platform for execution. Alternatively, the subsystems may be deployed centrally on a central control plane, or distributed across the regions, which is not limited herein.
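The six subsystems above form a pipeline from metric collection to a tuned scheduling plan. The following minimal sketch illustrates that data flow; every function body is a placeholder stub, and all names are hypothetical rather than the platform's real interfaces:

```python
# Hypothetical sketch of the six-subsystem pipeline; each stub stands in
# for the corresponding real subsystem described above.
def collect_metrics(app):                       # 1) index acquisition
    return {"app": app, "cpu": 0.5}

def build_models(metrics, history):             # 2) model generation
    topo = {"tasks": [metrics["app"]]}
    resources = {"nodes": 3, "history": len(history)}
    return topo, resources

def decide(topo, resources):                    # 3) decision algorithm
    return {"buy": resources["nodes"], "place": topo["tasks"]}

def adapt_and_schedule(decision):               # 4) adaptive elasticity + 5) unified scheduling
    return {"purchase": decision["buy"], "placement": decision["place"]}

def tune(plan, history):                        # 6) intelligent tuning
    plan["tuned"] = bool(history)
    return plan

def run_cycle(app, history):
    topo, res = build_models(collect_metrics(app), history)
    return tune(adapt_and_schedule(decide(topo, res)), history)
```

The output of `run_cycle` would then be handed to the control plane for execution.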
The relation between the decision algorithm used in the application scheduling process and each subsystem is as follows: 1) The decision algorithm takes the platform resource model and the application model from the application and platform model generation subsystem as inputs; if multiple groups of data exist, such as a current model, a predicted model and a statistical model, the corresponding calculations are executed in parallel. 2) The decision algorithm is also the underlying driver of the adaptive elasticity subsystem and the global unified scheduling subsystem; that is, the decision result includes a resource usage plan and an application scheduling scheme. On first deployment, or when the change in the calculation result exceeds a threshold, the adaptive elasticity subsystem determines the resources that should be elastically scaled according to the resource usage plan obtained by the algorithm, and the global unified scheduling subsystem adjusts the current deployment of each task instance according to the application scheduling scheme.
In addition, after the application has run for a long time, besides generating predicted and statistical resource and application models and improving the training accuracy of the AI model in the decision algorithm, the intelligent tuning subsystem may also make adjustments based on historically collected data, including but not limited to the following: 1) Choosing whether to adopt the calculation result of the current, predicted or statistical resource and application model. 2) Choosing which algorithm's calculation result to adopt. 3) Adjusting the parameters of the algorithms. 4) Deciding the gray-release strategy for resource elasticity and application deployment adjustments.
It should be noted that, the application scheduling system (i.e. the cloud service platform) provided in the embodiment of the present application may be deployed not only on a distributed cloud platform, but also on an edge cloud platform, which is described below.
Referring to fig. 2b and fig. 2c, fig. 2b and fig. 2c are schematic diagrams of an application scheduling system according to an embodiment of the present application.
As shown in fig. 2b, consider a distributed cloud container platform that includes multiple Regions, each having several availability zones with container clusters deployed in them. In order to manage the clusters and acquire infrastructure as a service (infrastructure as a service, IaaS) resource information, a control plane is deployed in each Region, and a control plane of the distributed cloud is also deployed in one of the Regions to manage all the Regions and acquire their information. In deployment, the application scheduling module and the distributed cloud control plane can be designed to be deployed together and connected with each other, and both access the console to interact with the tenant.
As shown in fig. 2c, consider a cloud-edge container platform that includes a single Region with container clusters deployed in its availability zone and an edge availability zone, and a control plane that manages the clusters and obtains IaaS resource information. The cloud service platform is deployed together with the Region's control plane and connected with it, and both access the console to interact with the tenant.
It should be noted that the application scheduling method provided in the present application may be applied not only to the cross-Region distributed cloud scenario shown in fig. 2b and the single-Region cloud-edge combined platform shown in fig. 2c, but also to other scenarios, such as hybrid cloud, virtual machine, container, bare metal and other platforms, which is not specifically limited in this application.
The application scheduling method is applied to a cloud service platform, wherein the cloud service platform is used for managing an infrastructure, the infrastructure comprises a plurality of data centers arranged in different areas, and each data center is provided with a plurality of servers.
Referring to fig. 3, fig. 3 is a flowchart of an application scheduling method according to an embodiment of the present application, including the following steps:
301. Obtain an application scheduling target, input by the tenant, of a target application distributedly deployed in at least one data center of the infrastructure.
In this application, the target application is deployed in a distributed manner in at least one data center of the infrastructure managed by the cloud service platform. The application scheduling target reflects the application scheduling effect that the tenant wants to achieve; this effect may be expressed in general terms covering multiple aspects, or embodied as various indexes input by the tenant, and is not specifically limited here.
The effects of the various aspects can be understood as optimizing a certain index while the other indexes still meet their requirements. Specifically, the application scheduling effect includes aspects such as scheduling cost, scheduling performance or scheduling quality, and is not limited herein. That is, application scheduling effects include, but are not limited to: 1) Cost priority: the cost is minimized while the end-to-end latency and outage probability remain within acceptable limits. 2) Performance priority: the end-to-end latency is minimized while the cost and outage probability remain within acceptable limits. 3) Quality priority: the outage probability is minimized while the end-to-end latency and cost remain within acceptable limits.
The various indexes include, but are not limited to: 1) The end-to-end delay of the application, that is, the delay from task submission to result return as perceived by the tenant. 2) The delay of a single task or a part of the task chain. 3) The outage probability that the application can tolerate. 4) The total price of the scheduled resources over the life cycle of the application.
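As a sketch of how the priority modes and index limits above can drive policy selection, consider the following; the `Policy` fields and the `pick` helper are illustrative assumptions, not interfaces defined by this application:

```python
# Hypothetical sketch: choose among candidate scheduling policies by the
# tenant's priority mode, keeping the non-optimized indexes within limits.
from dataclasses import dataclass

@dataclass
class Policy:
    cost: float            # total resource price over the application life cycle
    latency_ms: float      # end-to-end delay
    outage_prob: float     # tolerated outage probability

def pick(policies, mode, max_cost, max_latency_ms, max_outage):
    # Keep only policies whose indexes all stay within the acceptable limits.
    ok = [p for p in policies
          if p.cost <= max_cost and p.latency_ms <= max_latency_ms
          and p.outage_prob <= max_outage]
    if not ok:
        return None
    key = {"cost": lambda p: p.cost,              # cost priority
           "performance": lambda p: p.latency_ms,  # performance priority
           "quality": lambda p: p.outage_prob}[mode]
    return min(ok, key=key)
```

For example, with two feasible candidates, "cost" mode returns the cheaper one and "performance" mode returns the lower-latency one.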
Optionally, the application scheduling effect may further include scalability, security, and so on; these are attributes of the cloud service platform itself and are not further described in this application.
The cloud service platform may acquire the application scheduling target input by the tenant in various ways, for example through an interactive interface or through an API, which is not limited herein.
The following description will take the example of acquisition through an interactive interface, with reference to a schematic diagram. Referring to fig. 4a, fig. 4a is a schematic diagram of an application deployment interface according to an embodiment of the present application.
After the tenant logs in the cloud service platform, an application deployment interface shown in fig. 4a can be opened, and an application scheduling target of the target application is input in the application deployment interface. In other words, for the cloud service platform, the cloud service platform obtains the application scheduling target input by the tenant by displaying the application deployment interface.
Illustratively, fig. 4a is an example of tenant input indicating various metrics of an application scheduling target. As shown in fig. 4a, the application deployment interface includes an input box 401, in which a tenant may input an application scheduling target. The tenant experience time delay is the time delay from the application end to the end, and the comprehensive cost is the total price of the scheduling resource in the application life cycle. It should be noted that fig. 4a only takes these two indexes as an example, and in practical application, the tenant may also input or select a greater or lesser number of indexes, which is not limited herein.
It should be noted that in the embodiment shown in fig. 4a, the tenant experiences delay and comprehensive cost, requiring the tenant to input specific values. However, in actual application, the cloud service platform may also obtain the application scheduling target of the target application input by the tenant in other manners. The cloud service platform may preset a plurality of values corresponding to the indexes, and display the values on an application deployment interface, from which the tenant selects.
It should be noted that if the application deployment interface displays a plurality of index options but the tenant sets only some of them, the cloud service platform needs to meet the indexes specified by the tenant when scheduling, and need not satisfy the unselected indexes.
302. Determine an application topology of the target application and resource information representing the resource usage and/or distribution of each micro-service in the application topology in the infrastructure.
Firstly, a process of determining an application topology of a target application by a cloud service platform is described:
Generally, the cloud service platform obtains target application information indicating the attribute information of the target application, and determines the application topology according to that information.
In practical applications, the tenant may be calling the target application on the cloud service platform for the first time, or may have called it before, and the target application information acquired by the cloud service platform differs between the two cases. Specifically, if the cloud service platform is scheduling the target application for the first time, the target application information includes the current application information of the target application; if not, the target application information includes both the current application information and the historical application information of the target application. It will be appreciated that both current and historical application information can reflect the attribute information of the target application, and may include, but is not limited to, one or more of the following: the task calculation amount and memory requirement corresponding to the target application, the task model, concurrent call information, task attribution information, task type, task execution order, task number, task data amount and life cycle. That is, the target application information is not limited to the information listed above; other attribute information that can reflect the target application also belongs to the target application information, which is not limited herein.
The task calculation amount includes the calculation amount of the request corresponding to the processing task and the processor (central processing unit, CPU) speed consumed when idle. The task model indicates whether a task needs a special model, such as an advanced reduced instruction set processor (advanced RISC machines, ARM), a graphics processor (graphics processing unit, GPU) or a network processor (NPU), which is not limited herein. The concurrent call information indicates whether processing the target application calls peripheral services, and the concurrency of those calls. Calling peripheral services includes retrieving the data to be processed from a database, querying infrastructure as a service (infrastructure as a service, IaaS) information, and the like, which is not limited herein. The task attribution information may indicate the affinity of a task with a scheduling location. The task execution order can reflect the dependency relationships between different tasks in the application. For example, a live video application scheduled by the tenant includes tasks such as video transcoding, rendering, special-effect generation, tenant authentication and usage analysis: video transcoding has a large calculation amount and needs low delay to the tenant, the rendering task needs a GPU and must follow transcoding, and the usage-analysis task needs to collect the state of the tasks scheduled in each location. The dependencies between tasks may be represented by a topological graph composed of the application and its transmission links, typically a directed acyclic graph (directed acyclic graph, DAG). The task number includes the number of requests processed by the application per unit time, including the lowest number, the highest number, the average number, the instantaneous number, and the like.
The task data amount indicates the data transmission amount between tasks, that is, the data that needs to be transmitted per processed request, and the amount of data output to the tenant when the application completes.
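The task execution order described above can be captured as a directed acyclic graph. A minimal sketch using Python's standard `graphlib` with the live-video example (the specific task names and dependency edges are illustrative):

```python
# Sketch: representing a task dependency DAG and deriving a valid execution order.
from graphlib import TopologicalSorter  # Python 3.9+

# Each task maps to the set of tasks it depends on (its parents in the DAG).
deps = {
    "transcoding": {"auth"},           # transcode only after tenant authentication
    "rendering": {"transcoding"},      # rendering needs the transcoded stream
    "effects": {"rendering"},
    "usage_analysis": {"effects"},     # illustrative: analysis runs last
}
order = list(TopologicalSorter(deps).static_order())
```

`static_order()` yields every task after all of its parents, which is exactly the precedence relation the application topology encodes.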
There are various possible ways for the cloud service platform to obtain the target application information: it can obtain target application information input by the tenant, or obtain it by detecting the running state of the target application; the two modes can also be combined, that is, part of the target application information is input by the tenant and the other part is obtained by detecting the running state of the target application, which is not limited herein. The first two acquisition modes are described below:
1) The cloud service platform acquires target application information input by a tenant:
the tenant may input the target application information through an interactive interface or an API. The following takes as an example the cloud service platform responding to an operation instruction on the application deployment interface to acquire the target application information. The corresponding process is: the cloud service platform displays the application deployment interface, the tenant inputs or selects the target application information on this interface, and the cloud service platform acquires the target application information in response to the operation.
For clarity of illustration, please refer to fig. 4b, fig. 4b is a schematic diagram of an application deployment interface provided in an embodiment of the present application.
As shown in fig. 4b, assuming that the target application includes the 4 tasks shown in fig. 4b, the tenant clicks the block representing task 3 and the application deployment interface displays an information box 402; by responding to the tenant's operation instruction corresponding to the information box 402, the cloud service platform can obtain the information related to task 3 included in the target application information.
Alternatively, the information related to task 3 may be divided into necessary options and optional options. In the embodiment shown in fig. 4b, the model requirement and the peripheral services belong to the necessary options, since information reflecting tenant preferences or attributes inherent to the task is input more accurately by the tenant. The optional options include the calculation amount, the number of requests, and the like; these can be input by the tenant or acquired by the cloud service platform while the target application runs. If the tenant inputs the optional options, the overall deployment efficiency can be improved and the convergence speed optimized.
It should be noted that the necessary options may differ between tasks. For a video processing application, the necessary options are, for example, the model (e.g., a specification requiring a processing device with a network processor (NPU)) and the affinity between the task corresponding to the video source and the scheduling location. For a big data processing application, the necessary options include the calling relationships with peripheral services (e.g., the locations where the data is stored).
In addition, in the embodiment shown in fig. 4b, the tenant may set the affinity relationship between a task and a scheduling area by dragging, that is, set the task attribution information of the target application. In fig. 4b the affinity is set at the country level; in practical applications it may also be set at the province level, city level, and so on, which is not limited herein.
It should be noted that fig. 4b is only an example of obtaining the target application information, and in practical applications, the application deployment interface may also display other contents, so long as the obtaining of the target application information can be indicated, which is not limited herein.
2) The cloud service platform acquires target application information by detecting the running state of the target application:
the cloud service platform can acquire the target application information through a cloud native detection mode. For example, the resource usage may be obtained through a container management platform (e.g., kubernetes, etc.), the call chain information may be obtained through a service grid, the resource usage of each task may be obtained through a container detection scheme (e.g., promethaus, etc.), and the call relationship between tasks may be obtained through an eBPF-based container monitoring scheme (e.g., pixie, etc.), which is not limited herein.
In this application, the target application information takes many possible forms and can describe the attributes of the target application from many angles. In addition, the cloud service platform can obtain the target application information in various ways and flexibly adapt to different scenarios; in practical applications the corresponding information can be selected as needed, further improving the practicability and flexibility of the technical solution.
After acquiring the target application information, the cloud service platform determines the application topology corresponding to the target application through the application and platform model generation subsystem. The application topology indicates the task characteristics included in the target application, that is, the scheduling relationships between the tasks included in the target application.
Referring to fig. 5, fig. 5 is a schematic diagram of the application and platform model generation subsystem provided in the present application.
It should be noted that fig. 5 takes as an example the case where the cloud service platform is not scheduling the target application for the first time, so the target application information includes both the current application information and the historical application information of the target application.
As shown in the left part of fig. 5, the current application information, as input, may be collected in real time by the index acquisition subsystem and input by the tenant. The index acquisition subsystem may obtain it by means of cloud-native detection, for example through the container management platform, service mesh, container monitoring service, and so on, which is not limited herein. The current application information is fed into a real-time topology calculator, and the current application topology is obtained through a topology analysis algorithm or similar approaches.
Historical application information may originate from a variety of detection components: for example, call-chain information acquired through a service mesh, with the resource usage of each task acquired through a container monitoring scheme (such as Prometheus); or the call relationships between tasks acquired through an eBPF-based container monitoring scheme (such as Pixie), with the resource usage acquired through a container management platform (such as Kubernetes), which is not limited herein.
The historical application information and the current application topology are input into a statistics and prediction device; inaccuracy caused by the delay of information collection is corrected using methods such as graph convolutional neural networks (graph convolutional neural networks, GCN), and statistical methods can yield the topology of the application over long-term operation.
Next, a process of acquiring resource information is described. The resource information indicates the resource usage and/or distribution of each micro-service in the application topology in the infrastructure.
From the perspective of resource usage, the resource information includes purchased resources of the tenant and idle resources of the cloud service platform. The purchased resources refer to resources purchased by the tenant, and the idle resources refer to resources which can be used in the cloud service platform and are in an idle state. From the perspective of the type of resource, the resource information of the cloud service platform includes, but is not limited to, one or more of the following: the cloud service platform comprises node attribute information corresponding to the cloud service platform, transmission information among nodes and tenants, node scheduling cost and configuration information of a control surface corresponding to the nodes. That is, the resource information of the cloud service platform is not limited to the above listed information, and other information related to the resource of the cloud service platform also belongs to the resource information of the cloud service platform, and is not limited herein.
The node attribute information corresponding to the cloud service platform includes: the different types of computing nodes in each region, the fixed usage quota of each type for the tenant, the fixed computing speed (CPU frequency and number of cores) and memory capacity of each node, whether the node is a spot (bidding) instance and whether it has been preempted, whether the node is a special model (such as ARM), and the delay for the node's region to call a certain peripheral service (the deployment point closest to the node is always selected). The transmission information between nodes includes the bandwidth of data transmission between two nodes within a single region, the transmission startup delay and the cost of transmitting unit data. The transmission information between a node and the tenant includes the bandwidth, transmission startup delay and cost of transmitting unit data between the node and the tenant. The node scheduling fee includes the unit price of the node. The configuration information of the control plane corresponding to a node includes whether a control plane is deployed in the node's region and the configuration cost of the control plane.
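The node and link attributes above can be captured in a simple data model. A hedged sketch (all field names are illustrative; `transfer_time_ms` simply combines the bandwidth and startup-delay fields — amount divided by bandwidth plus the startup delay):

```python
# Illustrative data model for the platform resource information.
from dataclasses import dataclass

@dataclass
class Node:
    region: str
    cpu_ghz: float        # fixed computing speed
    cores: int
    mem_gb: float
    is_spot: bool         # bidding instance that may be preempted
    model: str            # e.g. "x86", "ARM", "GPU"
    unit_price: float     # node scheduling fee per unit time

@dataclass
class Link:
    bandwidth_mbps: float
    startup_delay_ms: float
    cost_per_gb: float    # cost of transmitting unit data

def transfer_time_ms(link: Link, data_mb: float) -> float:
    # transmission time = amount / bandwidth + startup delay
    return data_mb * 8 / link.bandwidth_mbps * 1000 + link.startup_delay_ms
```

Instances of `Node` and `Link` would be populated from the underlying service APIs and fed into the real-time resource analyzer.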
Alternatively, the cloud service platform may obtain the resource information by calling an underlying service application programming interface (API).
In this application, the resource information of the cloud service platform takes many possible forms and can describe the resource attributes of the cloud service platform from multiple angles. Meanwhile, in practical applications, the corresponding information can be selected as needed, flexibly meeting different scenario requirements and further improving the practicability and flexibility of the technical solution.
303. Determine, according to the application topology and the resource information, a first scheduling policy that meets the application scheduling target.
Generally, after acquiring the resource information, the cloud service platform determines the resource model of the cloud service platform according to the resource information, then processes the application topology and the resource model based on the decision algorithm to determine the first scheduling policy. The decision algorithm is used for determining the resource usage and/or distribution mode corresponding to the target application, that is, determining the resource scheduling policy.
Specifically, after acquiring its resource information, the cloud service platform determines its resource model through the application and platform model generation subsystem.
As shown in the right part of fig. 5, the resource information may be obtained by calling the underlying service APIs. The purchased-resource and available-resource information from the index acquisition subsystem is input into a real-time resource analyzer to obtain the current resource situation; for example, the information obtained through the APIs is filled into the real-time resource analyzer by statistical methods (such as averaging or taking the median) or by directly taking the latest value. This information, together with the resource information constructed in real time, is then input into a resource statistics and predictor that also draws on historical data; inaccuracy caused by the information collection delay is corrected using methods such as a fully connected deep neural network (fully connected deep neural network, FC-DNN), finally yielding the resource model of the cloud service platform.
In the process of determining the resource model of the cloud service platform based on the resource information, the tenant's purchased resources are considered first. If the tenant's purchased resources cannot meet the application scheduling target, the idle resources of the cloud service platform are added, thereby maximizing the utilization of the tenant's purchased resources and reducing the tenant's resource purchasing cost.
In this application, the application execution process may be described as a parallel computing problem. When the cloud service platform processes the application topology and the resource model based on the decision algorithm to determine the scheduling policy, the relevant quantities may be calculated according to the following principles:
1) The calculation time for a task to process one request is the unit calculation amount divided by the node speed minus the idle consumption speed, plus the time consumed calling peripheral services; if the memory requirement for processing the request cannot be met after the idle memory consumption is deducted, the calculation time is considered infinite. 2) The transmission time between tasks is the transmission amount divided by the bandwidth, plus the startup delay. 3) For a certain task in a request, the start time is the latest time at which its parent tasks complete calculation and data transmission. A parent task is a task that immediately precedes the task in the directed acyclic graph. 4) The cost within the application life cycle includes the total price of the purchased nodes (including tenant service nodes and control plane nodes) and the cost of the bandwidth used (including transmission between tasks and transmission of the application's output to the tenant). 5) The end-to-end delay is the time interval from the start of the first task to the completion of the last task's transmission to the tenant. 6) The application outage probability is the probability that any node hosting a task fails, that is, 1 minus the probability that all nodes are available.
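The principles above translate directly into code. A minimal sketch (function and parameter names are illustrative; time units are arbitrary but must be consistent):

```python
# Sketch of computation principles 1), 3)/5) and 6) above.
import math

def task_time(calc_units, node_speed, idle_speed, peripheral_s, need_mem, free_mem):
    # Principle 1): infinite if the remaining memory cannot hold the request.
    if free_mem < need_mem:
        return math.inf
    return calc_units / (node_speed - idle_speed) + peripheral_s

def end_to_end_delay(tasks, parents, compute, transfer):
    # Principles 3) and 5): a task starts when every parent has finished
    # computing and transmitting; the e2e delay ends at the last completion.
    finish = {}
    for t in tasks:                      # tasks given in topological order
        start = max((finish[p] + transfer[(p, t)] for p in parents[t]), default=0.0)
        finish[t] = start + compute[t]
    return max(finish.values())

def outage_probability(node_availabilities):
    # Principle 6): 1 minus the probability that all hosting nodes are available.
    prob_all_up = 1.0
    for a in node_availabilities:
        prob_all_up *= a
    return 1.0 - prob_all_up
```

For a two-task chain A→B with compute times 1.0 and 2.0 and transfer time 0.5, the end-to-end delay is 3.5; two nodes at 99% availability give an application outage probability of about 1.99%.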
In addition to the above, the application scheduling process needs to respect the following constraints: 1) Each instance of a task is executed by one and only one node. This is because an instance is the smallest unit of a task and cannot be split. 2) The computing resources of each node are not overloaded after it takes on its tasks. 3) The model requirements of the scheduled tasks are satisfied. 4) The number of usable nodes is bounded by the tenant quota. 5) For the same application, the difference between two consecutive schedules should be within a certain range.
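A hedged sketch of checking constraints 1) through 4) for a candidate placement follows (constraint 5, schedule stability, would need the previous schedule and is omitted; all names and data shapes are illustrative):

```python
# Sketch: feasibility check of a task-to-node assignment against constraints 1)-4).
def feasible(assign, tasks, nodes, quota):
    # Constraint 1): every task instance is placed on exactly one node.
    if set(assign) != set(tasks):
        return False
    load = {}
    for task, node in assign.items():
        # Constraint 3): the node must offer the model the task requires.
        if tasks[task]["model"] not in nodes[node]["models"]:
            return False
        load[node] = load.get(node, 0.0) + tasks[task]["cpu"]
    # Constraint 4): the number of nodes used stays within the tenant quota.
    if len(load) > quota:
        return False
    # Constraint 2): no node's computing resources are overloaded.
    return all(load[n] <= nodes[n]["capacity"] for n in load)
```

A decision algorithm would discard any candidate scheduling policy for which `feasible` returns `False` before comparing the remaining candidates.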
After acquiring the application topology and the resource model, the cloud service platform processes them with the decision algorithm and determines a scheduling policy that meets the application scheduling target while satisfying the above constraints. The decision algorithm may include a plurality of parallel algorithms with different performance, or a plurality of such parallel algorithms together with an AI algorithm, which is not limited herein. The parallel algorithms with different performance each process the application topology and the resource model to determine a plurality of scheduling policies meeting the application scheduling target, while the artificial-intelligence algorithm is used to correct the delay errors of the target application information and the resource information.
In the following, the decision algorithm is described taking as an example a decision algorithm that includes a plurality of parallel algorithms with different performance together with an AI algorithm. Referring to fig. 6, fig. 6 is a schematic diagram of an algorithm architecture according to an embodiment of the present application.
It will be appreciated that the resource purchase and task scheduling problem (i.e., the problem of determining the application deployment policy) is a special case of the parallel scheduling problem. For the general parallel scheduling part, a number of excellent algorithms are available in the parallel computing field, which take the application topology and resource model as inputs and output an application scheduling policy. On this basis, the available resource combinations can first be enumerated, scheduling can then be performed for each resource combination, and each resource combination together with its corresponding scheduling scheme is obtained. In addition, because the performance of the general parallel scheduling algorithms differs, several algorithms can be adopted and executed simultaneously.
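The enumeration idea above — try every usable resource combination, run each parallel scheduling algorithm on it, and keep the resulting pairs — can be sketched as follows (the round-robin scheduler is just a stand-in for a real parallel scheduling algorithm, and `fits` stands in for the feasibility constraints):

```python
# Sketch: enumerate resource combinations and run every scheduler on each one.
from itertools import combinations

def round_robin(tasks, combo):
    # Trivial stand-in scheduler: spread tasks over the node combination.
    return {t: combo[i % len(combo)] for i, t in enumerate(tasks)}

def enumerate_plans(nodes, tasks, schedulers, fits):
    plans = []
    for r in range(1, len(nodes) + 1):
        for combo in combinations(nodes, r):
            for schedule in schedulers:
                placement = schedule(tasks, combo)
                if fits(placement, combo):
                    plans.append((combo, placement))
    return plans
```

Each `(combo, placement)` pair is one candidate scheduling policy; the decision algorithm then compares the candidates against the application scheduling target.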
The AI algorithm is used to improve decision accuracy; in the absence of historical data, deep reinforcement learning (deep reinforcement learning, DRL) can be used as the basic framework. Alternatively, for stability reasons, a DRL framework of the actor-critic class may be employed here, such as the asynchronous advantage actor-critic algorithm (asynchronous advantage actor critic, A3C) or the deep deterministic policy gradient algorithm (deep deterministic policy gradient, DDPG), where the policy (actor) and value (critic) networks may employ deep neural networks (deep neural network, DNN) capable of capturing application and platform characteristics, such as convolutional neural networks (convolutional neural network, CNN), GCN, etc., without limitation herein.
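A minimal actor-critic sketch, for illustration only: a softmax policy over a few candidate placements (the actor) plus a scalar value baseline (the critic), updated with the advantage. A real system would use the DNN/CNN/GCN networks described above; the reward values here are invented placeholders (e.g. negative deployment cost).

```python
import numpy as np

n_actions = 3                          # e.g. three candidate placements (assumed)
theta = np.zeros(n_actions)            # actor parameters (action preferences)
v = 0.0                                # critic: state-value baseline
rewards = np.array([0.2, 1.0, 0.5])    # assumed reward per placement
alpha = 0.1                            # learning rate

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

for _ in range(500):
    probs = softmax(theta)
    for a in range(n_actions):
        advantage = rewards[a] - v             # critic's advantage estimate
        grad = -probs.copy(); grad[a] += 1.0   # gradient of log pi(a)
        theta += alpha * probs[a] * advantage * grad   # expected actor update
    v += alpha * (probs @ rewards - v)         # critic tracks expected reward

best = int(np.argmax(theta))           # policy concentrates on the best placement
```

The expected (rather than sampled) update is used here so the sketch is deterministic; a production DRL agent would sample actions and learn from observed scheduling outcomes.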
Alternatively, other types of AI tools may be used, provided that sufficient historical resource and application scheduling data have been collected. In addition, if the actual scheduling environment lacks training data, a simulation system that reproduces the resource purchase and application scheduling process can be constructed to train the AI model.
In the present application, the decision method admits multiple configurations: when it comprises only a plurality of parallel algorithms with different performance characteristics, the computation amount is small, so that computing resources are saved; when it comprises both the parallel algorithms and the AI algorithm, the time delay error can be corrected, improving the accuracy of the calculation.
In the present application, since the decision algorithm comprises a plurality of parallel algorithms with different performance characteristics, each algorithm can produce a scheduling policy meeting the application scheduling target; that is, the cloud service platform can obtain a plurality of scheduling policies. In actual application, one of these scheduling policies is selected as the first scheduling policy. The selection may be made by the cloud service platform itself, or the first scheduling policy may be determined from the plurality of scheduling policies according to a selection instruction input by the tenant, which is not limited herein. The selection instruction may be input by the tenant in various manners, for example, through an interactive interface or through an API, which is not limited herein.
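The two selection modes just described can be sketched as follows. The policy records and their cost field are hypothetical placeholders, not data structures specified by the patent.

```python
policies = [
    {"id": 1, "cost": 120.0, "meets_target": True},
    {"id": 2, "cost": 95.0,  "meets_target": True},
]

def platform_select(candidates):
    """Cloud service platform selects a policy itself, e.g. the cheapest valid one."""
    valid = [p for p in candidates if p["meets_target"]]
    return min(valid, key=lambda p: p["cost"])

def tenant_select(candidates, selection_instruction):
    """Tenant's selection instruction (via interface or API) names a policy by id."""
    return next(p for p in candidates if p["id"] == selection_instruction)

first_policy = platform_select(policies)                       # platform's own choice
also_first = tenant_select(policies, selection_instruction=2)  # tenant's choice
```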
The process of determining the first scheduling policy based on tenant selection is described below in conjunction with the schematic diagram. Referring to fig. 7, fig. 7 is a schematic diagram of a scheduling effect interface provided in an embodiment of the present application. It can be understood that the scheduling effect interface shown in fig. 7 is an interaction interface where a tenant interacts with the cloud service platform.
In the application, the cloud service platform can acquire a selection instruction input by a tenant through displaying the scheduling effect interface, wherein the selection instruction indicates a first scheduling policy in at least one scheduling policy included in the scheduling effect interface, and the at least one scheduling policy meets an application scheduling target. And then the cloud service platform responds to the selection instruction to determine a first scheduling strategy.
As shown in fig. 7, the scheduling effect interface shows two scheduling policies, scheduling policy (1) and scheduling policy (2). Although both scheduling policies meet the application scheduling target, the price of scheduling policy (2) is lower, i.e., its deployment cost is lower. The tenant can click the selection box in front of scheduling policy (2) to send a selection instruction to the cloud service platform. The cloud service platform responds to the selection instruction and determines scheduling policy (2) as the first scheduling policy.
It should be noted that fig. 7 is only a schematic illustration of the scheduling effect interface, and in practical applications, the scheduling effect interface may also include other contents, as long as the scheduling effect of each scheduling policy can be represented. The method for selecting the first scheduling policy is not limited to the method shown in fig. 7, and other methods are also possible, for example, inputting the serial number of the first scheduling policy by the tenant, double clicking an icon indicating the first scheduling policy, and the like, and the method is not limited in detail here.
In the application, the cloud service platform may determine a plurality of scheduling strategies meeting the application scheduling targets, and in this case, the first scheduling strategy can be determined in response to the selection instruction of the tenant, so that the final scheduling effect can meet the actual requirements of the tenant, and the use experience of the tenant is improved.
304. Resource usage and/or distribution of at least one micro-service in the application topology in the infrastructure is adjusted based on the first scheduling policy to meet the application scheduling objective.
After the cloud service platform determines the first scheduling policy, the target application is processed based on the first scheduling policy. In particular, the resource usage and/or distribution of at least one micro-service in the application topology in the infrastructure is adjusted according to the first scheduling policy so as to meet the application scheduling objective.
Based on the above description, in the present application, the cloud service platform may obtain an application scheduling target, input by a tenant, of a target application distributed and deployed in at least one data center of the infrastructure managed by the cloud service platform, and then determine a first scheduling policy that meets the application scheduling target according to the application topology and resource information of the target application. In this method, the tenant is not required to consider factors at each layer, so the difficulty of resource purchase and application deployment is reduced. Meanwhile, the first scheduling policy, determined by combining the application topology and the resource information, also satisfies the constraints and utilizes the resources to the maximum extent.
In some alternative embodiments, after step 304, i.e., after the first scheduling policy meeting the application scheduling target has been determined, the scheduling policy provided by the cloud service platform may be updated. Specifically, as the target application and the cloud service platform run, the target application information and the resource information of the cloud service platform may change, so the first scheduling policy may no longer be applicable, or a better scheduling policy that still meets the application scheduling target may exist. The cloud service platform can update the target application information and the resource information in a timely manner and then determine a second scheduling policy corresponding to the target application according to the updated target application information and resource information, where the second scheduling policy indicates the resources and deployment mode corresponding to the target application and meets the application scheduling target.
The specific process by which the cloud service platform determines the scheduling policy corresponding to the target application according to the updated target application information and resource information is similar to the manner of determining the first scheduling policy described above, and is not repeated here. After the cloud service platform determines new scheduling policies, it may select one of them, or select the one with the best scheduling effect, as the second scheduling policy; alternatively, the second scheduling policy may be determined from the new scheduling policies according to an update instruction input by the tenant, which is not limited herein. The tenant may input the update instruction through the interactive interface or through the API, which is not limited herein.
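A hedged sketch of the re-scheduling check: after the target application has run for a while, candidate policies are re-derived from the updated information, and a second scheduling policy is proposed only when it improves enough on the current one. The 10% improvement threshold and the policy fields are assumptions for illustration, not values from the patent.

```python
def propose_second_policy(current, candidates, improvement_threshold=0.10):
    """Return a better policy than `current`, or None if no update is worthwhile."""
    valid = [p for p in candidates if p["meets_target"]]
    if not valid:
        return None
    best = min(valid, key=lambda p: p["cost"])
    # Only propose an update when the cost drop exceeds the threshold.
    if best["cost"] <= current["cost"] * (1 - improvement_threshold):
        return best
    return None

current = {"id": 1, "cost": 100.0, "meets_target": True}
updated_candidates = [
    {"id": 1, "cost": 100.0, "meets_target": True},
    {"id": 3, "cost": 70.0,  "meets_target": True},   # cheaper after load changed
]
second = propose_second_policy(current, updated_candidates)
```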
The process of determining the second scheduling policy based on tenant selection is described below in conjunction with the schematic diagram. Referring to fig. 8, fig. 8 is a schematic diagram of a policy update interface according to an embodiment of the present application. It can be appreciated that the policy update interface shown in fig. 8 is an interaction interface where a tenant interacts with the cloud service platform.
As shown in fig. 8, the scheduling effect interface may present the overall application scheduling effect to the tenant and interactively improve the scheduling together with the tenant. Fig. 8 shows the location where each task is scheduled and the calling relationships between locations, as well as information such as the time delay of application calls and the characteristics of each regional resource. The interface may also show the execution of a task after the tenant selects it (not shown in fig. 8).
When the cloud service platform, by running the scheduling algorithm, determines that the application deployment mode needs to be adjusted, it displays a prompt. For example, a policy update interface 403 is displayed, and the policy update interface 403 is used to ask the tenant whether to update. The policy update interface 403 includes an update control 4031, and the update control 4031 corresponds to a second scheduling policy. When the tenant clicks the update control 4031, the cloud service platform responds to the operation instruction and updates the scheduling policy of the target application to the second scheduling policy, that is, the new scheduling policy shown in fig. 8.
It should be noted that fig. 8 is only an illustration of the policy update interface, and in practical application, the policy update interface may further include other contents, which is not limited herein. In addition, the manner of determining the second scheduling policy is not limited to the manner shown in fig. 8, and there may be other manners, for example, the tenant inputs the sequence number of the second scheduling policy, a list including a plurality of scheduling policies is displayed on the policy update interface, or the tenant double clicks one of the scheduling policies to confirm the second scheduling policy, or the tenant double clicks an icon indicating the second scheduling policy, etc., which is not limited herein.
In the present application, the cloud service platform can update the scheduling policy in real time, so that the operation of the target application is smoother, improving the practicability of the technical solution. In addition, multiple modes of selecting the updated scheduling policy are possible, so the method can be flexibly adapted to different scenarios, enriching the implementation modes of the technical solution.
In general, the application scheduling method provided in the embodiments of the present application may be summarized as fig. 9, and referring to fig. 9, fig. 9 is a flowchart of the application scheduling method provided in the embodiments of the present application.
As shown in fig. 9, when the system is started, the tenant inputs current application information and the index acquisition subsystem acquires platform resource information; if this is not the first scheduling, historical application information may also be acquired. Optionally, the current application information and the historical application information are collected by the index acquisition subsystem, which is not limited herein.
Platform resource and application modeling is performed on the collected information, which is then input into the decision algorithm to calculate the mapping relationship between resources and application instances, i.e., to determine a scheduling policy. If this is the first application scheduling, or the application scheduling target or cost change exceeds a threshold, the calculation result is transmitted to the adaptive elastic subsystem and the global unified scheduling subsystem to calculate a resource elastic scaling scheme and a corresponding application deployment adjustment scheme; otherwise, the detection of system resources is carried out again. If the application has been deployed and has run for a long time, the intelligent tuning subsystem adjusts the decisions of the adaptive elastic subsystem and the global unified scheduling subsystem, and after the calculation is completed the flow iterates back to the resource checking step, and these steps are repeated.
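One iteration of the loop above can be sketched as follows. The subsystem behavior is reduced to hypothetical stand-ins, and the 20% change threshold is an assumption for the example, not a value specified in the text.

```python
def decide(state):
    """Stand-in for the decision algorithm mapping resources to app instances."""
    return {"policy_for": state}

def scheduling_loop_step(state, first_time, change_ratio, threshold=0.2):
    """One iteration: decide, then either apply the elastic/deployment adjustment
    (first scheduling, or target/cost change above threshold) or keep monitoring."""
    policy = decide(state)
    if first_time or change_ratio > threshold:
        return "apply", policy      # adaptive-elastic + global unified scheduling
    return "monitor", policy        # re-detect system resources and iterate

action, _ = scheduling_loop_step({"app": "svc"}, first_time=True, change_ratio=0.0)
```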
In some alternative embodiments, the application scheduling method may be as shown in fig. 10, based on the architecture shown in fig. 2 b. Referring to fig. 10, fig. 10 is a flowchart of an application scheduling method according to an embodiment of the present application.
As shown in fig. 10, from the time of system scheduling, the cloud service platform may synchronize platform resource information to the distributed cloud control plane at a fixed frequency. For a new application scheduling request, the cloud service platform can calculate a recommended resource purchase scheme and a corresponding application scheduling scheme (i.e., application scheduling policy), and deliver them via the distributed cloud control plane to the corresponding Region to perform resource purchase and application scheduling. For an already-scheduled application, the cloud service platform continuously calculates suitable resources and scheduling schemes based on the latest resource and application scheduling information; once the current scheduling cannot guarantee the application scheduling target, or the cost can be greatly reduced, a change of resources and scheduling is triggered through the distributed cloud control plane, and the tenant is notified of the change. Alternatively, the update of the scheduling policy may be performed directly without notifying the tenant; or, after triggering the update of the scheduling policy, an update reminder may be sent to the tenant, and the scheduling policy is updated after the tenant confirms, which is not limited herein.
In some alternative embodiments, the application scheduling method may be as shown in fig. 11, based on the architecture shown in fig. 2 c. Referring to fig. 11, fig. 11 is a flowchart illustrating an application scheduling method according to an embodiment of the present application.
As shown in fig. 11, at the time of system scheduling, the cloud service platform may synchronize platform resource information to the Region control plane at a fixed frequency. For a new application scheduling request, the cloud service platform can calculate a recommended resource purchase scheme and a corresponding application scheduling scheme (i.e., application scheduling policy), and inform the tenant through the tenant console. The tenant then purchases the corresponding resources and schedules the application through the Region control plane. For an existing, already-deployed application, the cloud service platform continuously calculates suitable resources and scheduling schemes based on the latest resource and application information; once the current scheduling cannot guarantee the application scheduling target, or the cost can be greatly reduced, the tenant is informed through the tenant console, and the tenant modifies the application scheduling policy through the Region control plane. Optionally, the cloud service platform may directly update the scheduling policy without prompting the tenant, which is not limited herein.
Based on the application scheduling method above, an embodiment of the present application further discloses the internal structure of the cloud service platform, described in detail below:
Referring to fig. 12, fig. 12 is a schematic structural diagram of a cloud service platform according to an embodiment of the present application. The cloud service platform 1200 is used to manage an infrastructure including a plurality of data centers disposed in different areas, each of the data centers being provided with a plurality of servers.
As shown in fig. 12, the cloud service platform 1200 includes: an acquisition unit 1201 and a processing unit 1202.
An obtaining unit 1201 is configured to obtain an application scheduling target of a target application distributed and deployed in at least one data center of an infrastructure, where the target application is input by a tenant.
A processing unit 1202 is configured to confirm an application topology of the target application and resource information representing the resource usage and/or distribution of each micro-service in the application topology in the infrastructure, determine a first scheduling policy meeting the application scheduling target according to the application topology and the resource information, and adjust the resource usage and/or distribution of at least one micro-service in the application topology in the infrastructure based on the first scheduling policy so as to meet the application scheduling target.
In some alternative embodiments, the processing unit 1202 is specifically configured to confirm the application topology according to the target application information. The target application information is obtained by the obtaining unit 1201 acquiring application information input by the tenant, and/or by detecting the operation state of the target application.
In some optional embodiments, the processing unit 1202 is specifically configured to determine a resource model of the cloud service platform according to the resource information. And processing the application topology and the resource model based on a decision algorithm, and determining a first scheduling strategy, wherein the decision algorithm is used for determining the resource usage and/or distribution corresponding to the target application.
In some alternative embodiments, the decision algorithm comprises a plurality of parallel algorithms with different performance characteristics, or comprises such parallel algorithms together with an artificial intelligence algorithm. The parallel algorithms each process the application topology and the resource model to determine the first scheduling policy; the artificial intelligence algorithm is used for correcting the time delay errors of the target application information and the resource information.
In some optional embodiments, the obtaining unit 1201 is specifically configured to obtain a selection instruction input by the tenant, where the selection instruction indicates the first scheduling policy among at least one scheduling policy, and each of the at least one scheduling policy satisfies the application scheduling target.
The processing unit 1202 is specifically configured to determine a first scheduling policy in response to the selection instruction.
In some alternative embodiments, the processing unit 1202 is further configured to update the target application information and the resource information. And determining a second scheduling strategy corresponding to the target application according to the updated target application information and the updated resource information, wherein the second scheduling strategy indicates resources and deployment modes corresponding to the target application, and the second scheduling strategy meets the application scheduling target.
In some alternative embodiments, the processing unit 1202 is further configured to display a policy update interface, where the policy update interface includes an update control, and the update control corresponds to the second scheduling policy. And in response to the operation instruction aiming at the update control, updating the scheduling strategy of the target application into a second scheduling strategy.
In some alternative embodiments, the application scheduling objectives include: application scheduling cost priority, or application scheduling performance priority, or application scheduling quality priority.
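The three objective types listed above can be represented as a simple enumeration that then parameterizes how a candidate policy is scored. The scoring rule and the policy fields below are invented placeholders for illustration only.

```python
from enum import Enum

class SchedulingObjective(Enum):
    COST_PRIORITY = "cost"
    PERFORMANCE_PRIORITY = "performance"
    QUALITY_PRIORITY = "quality"

def score(policy, objective):
    """Higher score is better under the chosen objective."""
    if objective is SchedulingObjective.COST_PRIORITY:
        return -policy["cost"]            # cheaper is better
    if objective is SchedulingObjective.PERFORMANCE_PRIORITY:
        return -policy["latency_ms"]      # lower latency is better
    return policy["availability"]         # higher service quality is better

candidate = {"cost": 80.0, "latency_ms": 12.0, "availability": 0.999}
```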
In some alternative embodiments, the target application information includes current application information of the target application, or current application information and historical application information of the target application. The resource information of the cloud service platform further comprises idle resources of the cloud service platform.
In some alternative embodiments, the target application information includes one or more of the following: task calculation amount and memory requirement corresponding to target application, task model, concurrent call information, task attribution information, task type, task execution sequence, task number, task data amount and life cycle.
In some optional embodiments, the resource information of the cloud service platform includes one or more of: the cloud service platform comprises node attribute information corresponding to the cloud service platform, transmission information among nodes and tenants, node scheduling cost and configuration information of a control surface corresponding to the nodes.
It should be noted that the acquiring unit 1201 and the processing unit 1202 may be implemented by software or by hardware. By way of example, the implementation of the processing unit 1202 is described next; the implementation of the acquiring unit 1201 may refer to that of the processing unit 1202.
When implemented in software, the processing unit 1202 may be an application or block of code running on a computer device. The computer device may be at least one of a physical host, a virtual machine, a container, and the like. Further, the computer device may be one or more. For example, the processing unit 1202 may be an application running on multiple hosts/virtual machines/containers. It should be noted that a plurality of hosts/virtual machines/containers for running the application may be distributed in the same availability area (availability zone, AZ) or may be distributed in different AZs. Multiple hosts/virtual machines/containers for running the application may be distributed in the same region (region) or may be distributed in different regions, and is not limited herein.
Also, multiple hosts/virtual machines/containers for running the application may be distributed in the same virtual private cloud (virtual private cloud, VPC) or in multiple VPCs. Where typically a region may comprise multiple VPCs and a VPC may comprise multiple AZs.
When implemented in hardware, the processing unit 1202 may include at least one computing device, such as a server. Alternatively, the processing unit 1202 may be a device implemented using an application-specific integrated circuit (ASIC) or a programmable logic device (programmable logic device, PLD), etc. The PLD may be a complex programmable logic device (complex programmable logical device, CPLD), a field-programmable gate array (FPGA), generic array logic (generic array logic, GAL), or any combination thereof.
The processing unit 1202 may include multiple computing devices distributed among the same AZ or among different AZ. The processing unit 1202 may include multiple computing devices distributed in the same region or in different regions. Also, multiple computing devices included in processing unit 1202 may be distributed across the same VPC or across multiple VPCs. Wherein the plurality of computing devices may be any combination of computing devices such as servers, ASIC, PLD, CPLD, FPGA, and GAL.
The cloud service platform 1200 may perform the operations performed by the cloud service platform in the embodiments shown in fig. 1 to 11, which are not described herein.
Referring to fig. 13, fig. 13 is a schematic structural diagram of a computing device according to an embodiment of the present application.
As shown in fig. 13, a computing device 1300 includes: a bus 1303, a memory 1304, a processor 1305, and a communication interface 1306. The processor 1305, the memory 1304, and the communication interface 1306 communicate over the bus 1303. The computing device 1300 may be a server or a terminal device. It should be appreciated that the number of processors and memories in the computing device 1300 is not limited herein.
Bus 1303 may be a peripheral component interconnect standard (peripheral component interconnect, PCI) bus, or an extended industry standard architecture (extended industry standard architecture, EISA) bus, among others. The buses may be divided into address buses, data buses, control buses, etc. For ease of illustration, only one line is shown in fig. 13, but not only one bus or one type of bus. The bus 1303 may include a path to transfer information between various components of the computing device 1300 (e.g., the memory 1304, the processor 1305, the communication interface 1306).
The processor 1305 may include any one or more of a central processing unit (central processing unit, CPU), a graphics processor (graphics processing unit, GPU), a Microprocessor (MP), or a digital signal processor (digital signal processor, DSP).
The memory 1304 may include volatile memory, such as random access memory (random access memory, RAM). The memory 1304 may also include non-volatile memory, such as read-only memory (read-only memory, ROM), flash memory, a mechanical hard disk drive (HDD), or a solid state drive (solid state drive, SSD).
The memory 1304 stores executable program code, which the processor 1305 executes to realize the functions of the acquisition unit 1301 (corresponding to the aforementioned acquisition unit 1201) and the processing unit 1302 (corresponding to the aforementioned processing unit 1202), respectively, thereby realizing the application scheduling method. That is, the memory 1304 stores the instructions with which the cloud service platform executes the application scheduling method provided in the embodiments of the present application.
Communication interface 1306 enables communication between computing device 1300 and other devices or communication networks using a transceiver unit such as, but not limited to, a network interface card, transceiver, or the like.
The embodiment of the invention also provides a computing device cluster. The cluster of computing devices includes at least one computing device. The computing device may be a server, such as a central server, an edge server, or a local server in a local data center. In some embodiments, the computing device may also be a terminal device such as a desktop, notebook, or smart phone.
Referring to fig. 14, fig. 14 is a schematic structural diagram of a computing device cluster according to an embodiment of the present application.
As shown in fig. 14, the computing device cluster includes at least one computing device 1300. The memory 1304 in one or more computing devices 1300 in the computing device cluster may have stored therein instructions of the same cloud service platform for executing the application scheduling method provided in the embodiments of the present application.
It should be noted that the memory 1304 in different computing devices 1300 in the computing device cluster may store different instructions for performing part of the functions of the cloud service platform. That is, the instructions stored in the memory 1304 of different computing devices 1300 may implement the functions of one or more of the acquisition unit 1301 and the processing unit 1302.
Embodiments of the present invention also provide a computer program product comprising instructions. The computer program product may be software or a program product containing instructions, capable of running on a computing device or stored in any usable medium. The computer program product, when run on at least one computing device, causes the at least one computing device to perform the above-described application scheduling method applied to a cloud service platform.
The embodiment of the invention also provides a computer-readable storage medium. The computer-readable storage medium may be any usable medium that a computing device can store, or a data storage device such as a data center containing one or more usable media. The usable medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state drive), etc. The computer-readable storage medium includes instructions that instruct a computing device to perform the above-described application scheduling method applied to a cloud service platform.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be embodied in essence or a part contributing to the prior art or all or part of the technical solution in the form of a software product stored in a storage medium, including several instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a read-only memory (ROM), a random access memory (random access memory, RAM), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or substitutions do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (20)

1. An application scheduling method, wherein the method is applied to a cloud service platform, the cloud service platform is used for managing an infrastructure, the infrastructure comprises a plurality of data centers arranged in different areas, each data center is provided with a plurality of servers, and the method comprises:
acquiring an application scheduling target, input by a tenant, for a target application that is distributed and deployed in at least one data center of the infrastructure;
confirming an application topology of the target application and resource information for representing resource usage and/or distribution of each micro-service in the application topology in the infrastructure;
determining a first scheduling policy that meets the application scheduling target according to the application topology and the resource information;
adjusting resource usage and/or distribution of at least one micro-service in the application topology in the infrastructure based on the first scheduling policy, so as to meet the application scheduling target.
2. The method of claim 1, wherein the confirming an application topology of the target application comprises:
confirming the application topology according to the target application information, wherein the application topology indicates the calling relationship among tasks included in the target application;
wherein the target application information is input by the tenant and/or is obtained by detecting the running state of the target application.
3. The method according to claim 1 or 2, wherein the determining a first scheduling policy that meets the application scheduling target according to the application topology and the resource information comprises:
determining a resource model of the cloud service platform according to the resource information;
and processing the application topology and the resource model based on a decision algorithm, wherein the decision algorithm is used for determining the resource usage and/or distribution corresponding to the target application.
4. The method according to claim 3, wherein the decision algorithm comprises a plurality of parallel algorithms of different performance, or the decision algorithm comprises the plurality of parallel algorithms of different performance and an artificial intelligence algorithm;
the plurality of parallel algorithms of different performance are used for respectively processing the application topology and the resource model to determine the first scheduling policy, and the artificial intelligence algorithm is used for correcting time-delay errors in the target application information and the resource information.
5. The method according to any one of claims 1 to 4, wherein the determining a first scheduling policy that meets the application scheduling target comprises:
acquiring a selection instruction input by the tenant, wherein the selection instruction indicates a first scheduling policy in at least one scheduling policy, and the at least one scheduling policy meets the application scheduling target;
and determining the first scheduling policy in response to the selection instruction.
6. The method according to any one of claims 1 to 5, wherein the application scheduling target comprises:
application scheduling cost priority, application scheduling performance priority, or application scheduling quality priority.
7. The method according to any one of claims 2 to 6, wherein the target application information comprises current application information of the target application, or current application information and historical application information of the target application;
the resource information further comprises idle resources of the cloud service platform.
8. The method of any of claims 2 to 7, wherein the target application information comprises one or more of:
a task computation amount and a memory requirement corresponding to the target application, a task model, concurrent call information, task attribute information, a task type, a task execution sequence, a task quantity, a task data amount, and a life cycle.
9. The method of any one of claims 1 to 8, wherein the resource information comprises one or more of:
node attribute information corresponding to the cloud service platform, transmission information between nodes, transmission information between the nodes and tenants, node scheduling costs, and configuration information of a control plane corresponding to the nodes.
10. A cloud service platform for managing an infrastructure including a plurality of data centers disposed in different areas, each data center being provided with a plurality of servers, the cloud service platform comprising:
an acquisition unit, configured to acquire an application scheduling target, input by a tenant, for a target application that is distributed and deployed in at least one data center of the infrastructure;
a processing unit, configured to:
confirming an application topology corresponding to the target application and resource information for representing the resource usage and/or distribution of each micro-service in the application topology in the infrastructure;
determine a first scheduling policy that meets the application scheduling target according to the application topology and the resource information; and
adjust resource usage and/or distribution of at least one micro-service in the application topology in the infrastructure based on the first scheduling policy, so as to meet the application scheduling target.
11. The cloud service platform according to claim 10, wherein the processing unit is specifically configured to confirm the application topology according to target application information, wherein the application topology indicates a call relationship between tasks included in the target application;
the target application information is input by the tenant, and/or is obtained by detecting the running state of the target application.
12. The cloud service platform according to claim 10 or 11, wherein the processing unit is specifically configured to:
determining a resource model of the cloud service platform according to the resource information;
and processing the application topology and the resource model based on a decision algorithm, wherein the decision algorithm is used for determining the resource usage and/or distribution corresponding to the target application.
13. The cloud service platform of claim 12, wherein the decision algorithm comprises a plurality of parallel algorithms of different performance, or the decision algorithm comprises the plurality of parallel algorithms of different performance and an artificial intelligence algorithm;
the plurality of parallel algorithms of different performance are used for respectively processing the application topology and the resource model to determine the first scheduling policy, and the artificial intelligence algorithm is used for correcting time-delay errors in the target application information and the resource information.
14. The cloud service platform according to any one of claims 10 to 13, wherein the acquisition unit is specifically configured to acquire a selection instruction input by the tenant, wherein the selection instruction indicates a first scheduling policy in at least one scheduling policy, and each of the at least one scheduling policy meets the application scheduling target;
the processing unit is specifically configured to determine the first scheduling policy in response to the selection instruction.
15. The cloud service platform of any of claims 11 to 14, wherein the target application information comprises current application information of the target application, or current application information and historical application information of the target application;
the resource information further comprises idle resources of the cloud service platform.
16. The cloud service platform of any of claims 11-15, wherein the target application information comprises one or more of:
a task computation amount and a memory requirement corresponding to the target application, a task model, concurrent call information, task attribute information, a task type, a task execution sequence, a task quantity, a task data amount, and a life cycle.
17. The cloud service platform of any of claims 10 to 16, wherein the resource information comprises one or more of:
node attribute information corresponding to the cloud service platform, transmission information between nodes, transmission information between the nodes and tenants, node scheduling costs, and configuration information of a control plane corresponding to the nodes.
18. A cluster of computing devices, comprising at least one computing device, each computing device comprising a processor and a memory;
the processor of the at least one computing device is configured to execute instructions stored in the memory of the at least one computing device to cause the cluster of computing devices to perform the method of any one of claims 1 to 9.
19. A computer program product containing instructions that, when executed by a cluster of computing devices, cause the cluster of computing devices to perform the method of any one of claims 1 to 9.
20. A computer-readable storage medium comprising computer program instructions that, when executed by a cluster of computing devices, cause the cluster of computing devices to perform the method of any one of claims 1 to 9.
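The claims above describe the scheduling flow purely in functional terms. As an informal illustration only, and not as the claimed implementation, the flow of claims 1, 4, and 6 (gather the tenant's scheduling target, generate candidate policies from the application topology and resource model via several parallel algorithms, then pick the policy matching the target) might be sketched as follows. Every name here, including `Microservice`, `SchedulingPolicy`, `candidate_policies`, `choose_policy`, and the toy cost and latency figures, is hypothetical and does not come from the patent:

```python
from dataclasses import dataclass, field

@dataclass
class Microservice:
    name: str
    calls: list = field(default_factory=list)  # call relationships form the application topology

@dataclass
class SchedulingPolicy:
    placement: dict     # micro-service name -> data-center name
    cost: float         # relative scheduling cost (hypothetical units)
    latency_ms: float   # expected end-to-end latency (hypothetical)

def candidate_policies(topology, resource_info):
    """Stand-in for the 'plurality of parallel algorithms of different
    performance' of claim 4: each branch proposes one candidate policy."""
    dcs = sorted(resource_info, key=lambda dc: resource_info[dc]["cost"])
    cheap = SchedulingPolicy({m.name: dcs[0] for m in topology},
                             cost=1.0, latency_ms=80.0)
    fast = SchedulingPolicy({m.name: dcs[-1] for m in topology},
                            cost=3.0, latency_ms=20.0)
    return [cheap, fast]

def choose_policy(topology, resource_info, objective):
    """Pick the first scheduling policy matching the tenant's application
    scheduling target (claim 6: cost priority or performance priority)."""
    policies = candidate_policies(topology, resource_info)
    if objective == "cost":
        return min(policies, key=lambda p: p.cost)
    if objective == "performance":
        return min(policies, key=lambda p: p.latency_ms)
    raise ValueError(f"unknown scheduling objective: {objective}")

# Toy topology (frontend calls backend) and two data centers with different cost.
topo = [Microservice("frontend", calls=["backend"]), Microservice("backend")]
resources = {"dc-east": {"cost": 1.0}, "dc-west": {"cost": 3.0}}

p = choose_policy(topo, resources, "cost")
print(p.placement)  # every micro-service placed in the cheaper data center
```

In the patent's terms, adjusting the deployment to match `p.placement` would correspond to the final step of claim 1; a real decision algorithm would of course optimize over far richer resource models than this two-branch sketch.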
CN202310181241.XA 2022-08-12 2023-02-28 Application scheduling method, cloud service platform and related equipment Pending CN117640770A (en)

Priority Applications (1)

Application number PCT/CN2023/104403 (WO2024032239A1); priority date 2022-08-12; filing date 2023-06-30; title: Application scheduling method, cloud service platform, and related device

Applications Claiming Priority (2)

Application number CN202210968594; priority date 2022-08-12
Application number CN202210968594X; priority date 2022-08-12

Publications (1)

Publication number: CN117640770A (en)

Family

Family ID: 90017076



Legal Events

Date Code Title Description
PB01 Publication