WO2023011157A1 - Service processing method and apparatus, server, storage medium, and computer program product - Google Patents


Info

Publication number
WO2023011157A1
WO2023011157A1 · PCT/CN2022/106367 · CN2022106367W
Authority
WO
WIPO (PCT)
Prior art keywords
edge
server
edge server
computing
offline
Prior art date
Application number
PCT/CN2022/106367
Other languages
English (en)
French (fr)
Inventor
徐士立
付亚彬
钟炳武
胡玉林
陆燕慧
马啸虎
Original Assignee
Tencent Technology (Shenzhen) Company Limited (腾讯科技(深圳)有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology (Shenzhen) Company Limited
Priority to EP22851871.8A (published as EP4383074A4)
Publication of WO2023011157A1
Priority to US18/462,164 (published as US20230418670A1)

Classifications

    • G06F9/5066: Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs
    • G06F9/4893: Scheduling strategies for dispatcher (e.g. round robin, multi-level priority queues) taking into account power or heat criteria
    • G06F9/5055: Allocation of resources to service a request, the resource being a machine (e.g. CPUs, servers, terminals), considering software capabilities
    • G06F9/3877: Concurrent instruction execution using a slave processor, e.g. coprocessor
    • G06F9/505: Allocation of resources to service a request, the resource being a machine, considering the load
    • G06F9/5072: Grid computing
    • G06F2209/501: Performance criteria
    • G06F2209/5017: Task decomposition
    • G06F2209/503: Resource availability

Definitions

  • The present application relates to the field of computer technology, and in particular to a service processing method, apparatus, server, storage medium, and computer program product.
  • A cloud application is a new type of application that replaces the traditional model of local installation and local computing with out-of-the-box services: the application connects to and controls remote server clusters over the Internet or a LAN, which complete its logic and computing tasks.
  • In other words, the operation and computation of a cloud application depend on execution in cloud servers, and the terminal is responsible only for displaying the picture.
  • Cloud games are a typical cloud application.
  • A cloud game runs on cloud servers using cloud computing technology, so the terminal does not need to download or install the game and its hardware configuration is no longer a constraint; this solves the problem that terminals with insufficient performance cannot run demanding games.
  • A service processing method, apparatus, server, storage medium and computer program product are provided.
  • A service processing method, comprising:
  • determining a first computing power resource required to perform an offline task;
  • determining N edge servers for performing the offline task, where cloud applications are running in the N edge servers, and the idle computing power resources of the N edge servers are greater than the first computing power resource; the idle computing power resources of the N edge servers refer to the sum of the idle computing power resources of each of the N edge servers, and N is an integer greater than or equal to 1; and
  • distributing and dispatching the offline task to the N edge servers, so that each of the N edge servers executes the offline task using its own idle computing power resources while its cloud application runs normally.
  • A service processing method, executed by one of N edge servers used to perform an offline task, where cloud applications run in the N edge servers, the method comprising:
  • receiving a distributed offline task dispatched by a management server, where the distributed offline task is either the offline task received by the management server, or the one of N subtasks that matches the edge server, the N subtasks being obtained by dividing the offline task based on the idle computing power resources of each of the N edge servers; and
  • executing the distributed offline task using the idle computing power resources of the edge server.
  • A service processing apparatus, comprising:
  • a determining unit, configured to determine a first computing power resource required to perform an offline task;
  • the determining unit being further configured to determine N edge servers for performing the offline task, where cloud applications are running in the N edge servers, and the idle computing power resources of the N edge servers are greater than the first computing power resource; the idle computing power resources of the N edge servers refer to the sum of the idle computing power resources of each of the N edge servers, and N is an integer greater than or equal to 1; and
  • a dispatching unit, configured to distribute and dispatch the offline task to the N edge servers, so that each of the N edge servers executes the offline task using its own idle computing power resources while its cloud application runs normally.
  • Another service processing apparatus, comprising:
  • a receiving unit, configured to receive a distributed offline task dispatched by a management server, where the distributed offline task is either the offline task received by the management server, or the one of N subtasks that matches the edge server, the N subtasks being obtained by dividing the offline task based on the idle computing power resources of each of the N edge servers; the N edge servers are used to execute the offline task, and cloud applications are running in the N edge servers; and
  • an execution unit, configured to execute the distributed offline task using the idle computing power resources of the edge server while the normal execution of the target cloud application is guaranteed.
  • A server, comprising:
  • a processor adapted to implement one or more computer-readable instructions; and a computer storage medium storing one or more computer-readable instructions adapted to be loaded and executed by the processor to perform:
  • determining a first computing power resource required to perform an offline task; and
  • determining N edge servers for performing the offline task, where cloud applications are running in the N edge servers, and the idle computing power resources of the N edge servers are greater than the first computing power resource; the idle computing power resources of the N edge servers refer to the sum of the idle computing power resources of each of the N edge servers, and N is an integer greater than or equal to 1.
  • Another server, comprising:
  • a processor adapted to implement one or more computer-readable instructions; and a computer storage medium storing one or more computer-readable instructions adapted to be loaded and executed by the processor to perform:
  • receiving a distributed offline task dispatched by a management server, where the distributed offline task is either the offline task received by the management server, or the one of N subtasks that matches the edge server, the N subtasks being obtained by dividing the offline task based on the idle computing power resources of each of the N edge servers; and
  • executing the distributed offline task using the idle computing power resources of the edge server.
  • A computer storage medium storing computer-readable instructions that, when executed by a processor, cause the processor to perform:
  • determining a first computing power resource required to perform an offline task; and determining N edge servers for performing the offline task, where cloud applications are running in the N edge servers, the idle computing power resources of the N edge servers are greater than the first computing power resource, the idle computing power resources of the N edge servers refer to the sum of the idle computing power resources of each of the N edge servers, and N is an integer greater than or equal to 1;
  • or: receiving a distributed offline task dispatched by a management server, where the distributed offline task is either the offline task received by the management server, or the one of N subtasks that matches the edge server, the N subtasks being obtained by dividing the offline task based on the idle computing power resources of each of the N edge servers; and
  • executing the distributed offline task using the idle computing power resources of the edge server.
  • A computer program product, comprising computer-readable instructions stored in a computer storage medium; a processor of a server reads the computer-readable instructions from the computer storage medium and executes them, causing the server to perform:
  • determining a first computing power resource required to perform an offline task; and determining N edge servers for performing the offline task, where cloud applications are running in the N edge servers, the idle computing power resources of the N edge servers are greater than the first computing power resource, the idle computing power resources of the N edge servers refer to the sum of the idle computing power resources of each of the N edge servers, and N is an integer greater than or equal to 1.
  • Alternatively, the processor of the server reads the computer-readable instructions from the computer storage medium and executes them, causing the server to perform:
  • receiving a distributed offline task dispatched by a management server, where the distributed offline task is either the offline task received by the management server, or the one of N subtasks that matches the edge server, the N subtasks being obtained by dividing the offline task based on the idle computing power resources of each of the N edge servers; and
  • executing the distributed offline task using the idle computing power resources of the edge server.
  • FIG. 1 is a schematic structural diagram of a cloud application management system provided by an embodiment of the present application.
  • FIG. 2 is a schematic flowchart of a service processing method provided by an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of another service processing method provided by an embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of a service processing system provided by an embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of a service processing apparatus provided by an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of another service processing apparatus provided by an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of a server provided by an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of another server provided by an embodiment of the present application.
  • The embodiment of the present application provides a service processing solution that makes full use of edge server resources, improves resource utilization, and reduces the operating cost of the edge servers of a cloud application.
  • When the management server receives an offline task to be executed, it first evaluates the first computing power resource required to execute the offline task; it then determines N edge servers for executing the offline task, whose combined idle computing power resources are greater than the first computing power resource; finally, it distributes and dispatches the offline task to the N edge servers, which execute it using their respective idle computing power resources while ensuring the normal operation of their cloud applications.
  • The value of N can be greater than or equal to 1.
  • When N equals 1, distributed scheduling refers to assigning the offline task to that single edge server for separate execution; when N is greater than 1, it refers to assigning the offline task to multiple edge servers for joint execution.
  • In the latter case, the offline task can be divided into several subtasks, with each edge server assigned one subtask; this shares the load among the edge servers and ensures the normal operation of the cloud application in each of them. Alternatively, the whole offline task can be assigned to every edge server, so that multiple edge servers perform the same offline task, improving its execution rate.
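The subtask division described above can be sketched in a few lines. This is a hypothetical illustration, not the patent's actual algorithm: the task's required compute (the "first computing power resource") is split across the N edge servers in proportion to each server's idle capacity, and all names and numbers are assumptions.

```python
def split_task(total_required, idle_per_server):
    """Divide `total_required` units of work among servers in
    proportion to each server's idle capacity; returns one share
    per server, in the same order as `idle_per_server`."""
    total_idle = sum(idle_per_server)
    if total_idle < total_required:
        raise ValueError("combined idle capacity is insufficient")
    return [total_required * idle / total_idle for idle in idle_per_server]

# Example: 120 units of work, three servers with idle capacities 60/40/20.
print(split_task(120, [60, 40, 20]))  # → [60.0, 40.0, 20.0]
```

A proportional split keeps every server's share below its idle capacity whenever the combined idle capacity covers the task, which matches the requirement that the cloud application on each server keeps running normally.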
  • In this way, the idle computing power resources of each edge server are used to perform offline tasks while the normal operation of the cloud application is ensured; the waste of computing power resources is avoided, their utilization is improved, and the operating cost of the edge servers is reduced.
  • FIG. 1 is a schematic structural diagram of a cloud application management system provided by an embodiment of the present application.
  • The cloud application management system shown in FIG. 1 includes at least one edge server 101, which may be used to run cloud applications; the edge servers 101 may run the same cloud application or different ones.
  • Common cloud applications include cloud games, cloud education, cloud meetings, and cloud social networking.
  • The edge servers 101 can be assigned to multiple edge computing nodes. An edge computing node is a node that performs edge computing, that is, an open platform close to the source of objects or data that integrates core network, computing, storage, and application capabilities and provides services nearby. Because applications are initiated on the edge side, network services respond faster, meeting basic industry needs in real-time business, application intelligence, and security and privacy protection.
  • One or more edge servers run in each edge computing node; these edge servers have graphics processing capability, and each may be called a computing node.
  • For example, edge computing node 1 includes four edge servers, and edge computing node 2 may also include four edge servers.
  • The cloud application management system shown in FIG. 1 may further include a cloud application server 102 connected to the at least one edge server 101; the cloud application server 102 may provide each edge server 101 with the running data of a cloud application, so that each edge server 101 can run the cloud application based on that data.
  • The cloud application management system shown in FIG. 1 may further include a terminal 103 connected to the at least one edge server 101; the terminal 103 is used to receive and display the cloud application pictures rendered by the edge server 101.
  • For example, the terminal 103 displays the game screen rendered by the edge server 101.
  • The terminal 103 may be a mobile smart terminal: a device with rich human-computer interaction methods, Internet access, various operating systems, and strong processing capability.
  • The terminal 103 may include a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, a vehicle-mounted terminal, a smart TV, and the like.
  • The cloud application management system shown in FIG. 1 may further include a management server 104, connected to the terminal 103 and the at least one edge server 101.
  • The management server 104 manages and schedules the at least one edge server 101. For example, when it detects that a cloud application has been started in a terminal, the management server 104 can select one or more suitable edge servers to run that cloud application according to the current load and idle computing power resources of each edge server 101.
  • When an offline task arrives, the management server 104 schedules it to one or more edge servers 101 according to the idle computing power resources of each edge server, and the assigned edge servers 101 use their idle computing power resources to execute the task while ensuring the normal operation of their respective cloud applications. In this way, the normal operation of cloud applications is ensured, the waste of idle computing power resources is avoided, the resource utilization of each edge server is improved, and the operating cost of edge servers is reduced.
  • FIG. 2 is a schematic flowchart of a service processing method provided by an embodiment of the present application.
  • The service processing method shown in FIG. 2 can be executed by a management server, specifically by a processor of the management server.
  • The service processing method shown in FIG. 2 may include the following steps:
  • Step S201: Determine the first computing power resource required to execute the offline task.
  • Offline tasks are tasks that do not need to be completed online in real time, such as offline rendering of video special effects or offline training of artificial intelligence models.
  • The first computing power resource differs according to the type of the main load of the offline task.
  • If the main load of the offline task belongs to the graphics processing unit (GPU) type, that is, the load is concentrated on the graphics processor, the first computing power resource may include any one or more of the following: network bandwidth, memory, the floating-point operations per second (FLOPS) performed by the graphics processor, the operations per second (OPS) performed by the graphics processor, and throughput.
  • If the main load of the offline task belongs to the central processing unit (CPU) type, the first computing power resource may include any one or more of the following: memory, network bandwidth, the FLOPS performed by the central processing unit, and the OPS performed by the central processing unit. If the main load is mixed, that is, the offline task requires both CPU and GPU, the first computing power resource is the combination of the two types above.
  • Floating-point operations per second (FLOPS) can be divided into half precision, single precision, and double precision: for example, the number of half-precision, single-precision, or double-precision floating-point operations performed by the graphics processor per second.
  • When FLOPS is used to measure computing power, common units include teraFLOPS (TFLOPS, trillions of floating-point operations per second), gigaFLOPS (GFLOPS, billions), megaFLOPS (MFLOPS, millions), and petaFLOPS (PFLOPS, quadrillions).
  • OPS: operations per second; GOPS: giga (10^9) operations per second; TOPS: tera (10^12) operations per second.
  • The first computing power resource required to execute the offline task may be estimated based on the computing power resources used in executing historical offline tasks similar to it.
  • Specifically, determining the first computing power resource may include: determining the computational complexity corresponding to the task type of the offline task based on a correspondence between task types and computational complexities; finding at least one matching historical offline task among the historical offline tasks, where the computational complexity of each matching historical offline task matches the determined one; and estimating the computing power resource required by the offline task based on the computing power resources used to execute each matching historical offline task, obtaining the first computing power resource.
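The estimation steps above can be sketched as follows. This is a hypothetical illustration under stated assumptions: the complexity table, the tolerance for "matching", and the single scalar resource value are all simplifications introduced for the example, not details from the patent.

```python
# Illustrative table mapping task types to computational complexity scores.
COMPLEXITY_BY_TYPE = {"video_rendering": 8.0, "model_training": 9.5}

def estimate_first_resource(task_type, history, tolerance=1.0):
    """Estimate the first computing power resource for a task.

    history: list of (complexity, resources_used) tuples for completed
    historical offline tasks. Tasks whose complexity is within
    `tolerance` of the target are "matching"; their resource usage
    is averaged to produce the estimate."""
    target = COMPLEXITY_BY_TYPE[task_type]
    matched = [res for c, res in history if abs(c - target) < tolerance]
    if not matched:
        raise LookupError("no matching historical offline task")
    return sum(matched) / len(matched)

history = [(8.2, 100.0), (7.5, 80.0), (9.6, 300.0)]
print(estimate_first_resource("video_rendering", history))  # → 90.0
```

Note that matching is done on complexity rather than on task type, so the third historical task (a much more complex one) is excluded even though more tasks of other types could in principle match, as the description below explains.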
  • Tasks can be classified by content: if the task content is rendering offline videos, the task type can be video rendering; if the task content is training models, the task type can be model training.
  • Computational complexity can also be called algorithmic complexity: the resources required to run the executable program an algorithm is written into, including time resources and memory resources. Time resources can be measured by the FLOPS and OPS mentioned above. Executing an offline task in the embodiment of the present application essentially means executing the executable program the offline task is written into.
  • The correspondence between task types and computational complexities can be determined from the computational complexity of executing historical offline tasks, such as the complexity of executing model training tasks or offline video rendering tasks.
  • The computational complexity corresponding to a task type reflects the order of complexity of executing tasks of that type.
  • Searching for matching historical offline tasks according to the determined computational complexity may include: finding, among the historical offline tasks, those whose computational complexity matches the determined one, and taking them as the matching historical offline tasks.
  • Two computational complexities match when the difference between them is less than a specified value. Note that among completed historical offline tasks, not only those with the same task type as the offline task may have a matching computational complexity; historical offline tasks of different task types may match as well.
  • Because the matching historical offline tasks are selected by computational complexity rather than by task type, more of them can be selected from the historical offline tasks, and the first computing power resource estimated from them is more accurate.
  • The embodiment of the present application does not specifically limit the type of computing power resource, which can be flexibly selected according to actual needs.
  • The first computing power resource may include any one or more of graphics processor computing power, central processing unit computing power, memory, network bandwidth, and network throughput.
  • Estimating the computing power resource required by the offline task may be done per resource type, based on the corresponding resource used to execute each matching historical offline task: for example, the graphics computing power required by the offline task is estimated from the graphics computing power used by each matching historical task, and the memory required is estimated from the memory used by each matching historical task.
  • In one approach, the computing power resources used by the matching historical offline tasks are averaged, and the result is taken as the first computing power resource required to execute the offline task.
  • In another approach, a weight value is assigned to each matching historical offline task according to the relationship between its task type and that of the offline task; a weighted average is then computed over the matching historical offline tasks based on these weights, and the result is taken as the first computing power resource.
  • If a matching historical offline task has the same task type as the offline task, it can be assigned a higher weight value; if the two belong to different task types, it can be assigned a lower weight value.
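The weighted-average variant can be sketched as below. The weight values (2.0 for a same-type historical task, 1.0 otherwise) are illustrative assumptions for this example only; the patent does not fix concrete weights.

```python
def weighted_estimate(task_type, matched):
    """Weighted average over complexity-matched historical tasks.

    matched: list of (task_type, resources_used) tuples. Historical
    tasks of the same type as the new task get a higher weight."""
    weights = [2.0 if t == task_type else 1.0 for t, _ in matched]
    total = sum(w * r for w, (_, r) in zip(weights, matched))
    return total / sum(weights)

matched = [("video_rendering", 100.0), ("model_training", 40.0)]
print(weighted_estimate("video_rendering", matched))  # → 80.0
```

Compared with the plain average (70.0 here), the weighted result leans toward the same-type historical task, which is presumably the better predictor of the new task's cost.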
  • An offline task may correspond to an execution time threshold; in that case the first computing power resource specifically refers to the resource required to complete the offline task within the threshold, and different thresholds may yield different first computing power resources.
  • Step S202: Determine N edge servers for performing the offline task, where cloud applications are running on the N edge servers and the idle computing power resources of the N edge servers are greater than the first computing power resource.
  • Suppose the cloud application is deployed to run in M edge servers, that is, M edge servers participate in running the cloud application; the N edge servers in step S202 are selected from these M edge servers.
  • The selection can be made directly based on the idle computing power resource of each of the M edge servers and the first computing power resource.
  • the idle computing power resources of each edge server may be determined based on the second computing power resources required for running the cloud application in the edge server and the total computing power resources of the edge server.
  • the second computing resource required for running the cloud application in each edge server may also be estimated and determined by the management server based on historical computing resources used for running the cloud application.
  • the management server can obtain the computing power resources used by running the cloud application multiple times in history, and then perform an average calculation on these computing power resources to estimate the second computing power resources required by the edge server to run the cloud application.
  • Computing power resources may include multiple types.
  • Each type of computing power resource required to run the cloud application is averaged separately, and the per-type averages together give the computing power resources the edge server requires to run the cloud application.
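The per-type averaging described above can be sketched as follows. This is an illustrative Python sketch only; the record field names (cpu_flops, gpu_flops, memory_gb) are assumptions, not terms from the embodiment.

```python
def estimate_second_resource(history):
    """Estimate the second computing power resource by averaging each
    resource type over historical runs of the cloud application."""
    if not history:
        raise ValueError("no historical records")
    keys = history[0].keys()
    return {k: sum(run[k] for run in history) / len(history) for k in keys}

# Hypothetical historical usage records for two past runs.
history = [
    {"cpu_flops": 4.0, "gpu_flops": 10.0, "memory_gb": 8.0},
    {"cpu_flops": 6.0, "gpu_flops": 14.0, "memory_gb": 8.0},
]
print(estimate_second_resource(history))
# {'cpu_flops': 5.0, 'gpu_flops': 12.0, 'memory_gb': 8.0}
```

Each resource type is averaged independently, consistent with the requirement that comparisons and arithmetic only happen between resources of the same type.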
  • N is an integer greater than or equal to 1.
  • any edge server with idle computing power resources greater than the first computing power resource may be selected from M edge servers as the edge server for performing offline tasks.
  • Determining N edge servers for performing the offline task includes: comparing the idle computing power resource of each of the M edge servers with the first computing power resource, and determining the N edge servers whose idle computing power resources are greater than the first computing power resource as the N edge servers for performing the offline task.
  • The idle computing power resource of each edge server can be determined from its total computing power resource and the second computing power resource required to run the cloud application. For example, the second computing power resource is subtracted from the total computing power resource of each edge server to obtain its idle computing power resource; as another example, the second computing power resource required to run the cloud application is first added to a reserved computing power, and that sum is then subtracted from the total computing power resource of each edge server to obtain its idle computing power resource.
  • The purpose of the reservation is to set aside part of the computing power resources for the operation of the cloud application, so that when the computing power the cloud application requires suddenly increases, the edge server can still respond in time and the running speed and response efficiency of the cloud application are not reduced.
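The idle-resource computation with a reservation, and the comparison against the first computing power resource, can be sketched as follows; all server names and figures are hypothetical.

```python
def idle_resource(total, second, reserved=0.0):
    """Idle computing power = total - (cloud-app requirement + reservation)."""
    return total - (second + reserved)

def select_servers(servers, first, reserved=0.0):
    """Keep the servers whose idle computing power exceeds the first resource."""
    return [s["name"] for s in servers
            if idle_resource(s["total"], s["second"], reserved) > first]

servers = [
    {"name": "edge-1", "total": 100.0, "second": 30.0},  # idle = 60 with reserve
    {"name": "edge-2", "total": 100.0, "second": 80.0},  # idle = 10 with reserve
]
print(select_servers(servers, first=40.0, reserved=10.0))  # ['edge-1']
```

The reservation simply shrinks what is reported as idle, which is what prevents a sudden spike in the cloud application's demand from starving it.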
  • One way to determine the N edge servers is as follows: among the M edge servers, all edge servers whose idle computing power resources are greater than the first computing power resource are taken as the N edge servers for performing the offline task. Although the idle computing power resource of each of these N edge servers is sufficient for it to execute the offline task on its own, in the embodiment of the present invention the offline task can still be distributed across the N edge servers for joint execution. The first computing power resource required by the offline task is thereby spread over different edge servers, and each edge server keeps some redundant computing power in reserve, so that when the computing power required by the cloud application on an edge server increases, that server can allocate the reserved computing power to the cloud application in time without suspending the execution of the offline task.
  • Determining N edge servers for performing the offline task includes: comparing the idle computing power resource of each of the M edge servers with the first computing power resource; if no edge server's idle computing power resource is greater than the first computing power resource, combining the M edge servers to obtain multiple combinations, each combination including at least two edge servers; calculating the sum of the idle computing power resources of each combination; and determining the edge servers included in a combination whose sum of idle computing power resources is greater than the first computing power resource as the N edge servers for performing the offline task. That is to say, when no edge server among the M edge servers has an idle computing power resource greater than the first computing power resource, the sum of the idle computing power resources of the selected N edge servers is greater than the first computing power resource.
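The combination search described above could look like the following sketch, which returns the first combination of at least two servers whose idle sum exceeds the first computing power resource, trying smaller combinations before larger ones; the data are hypothetical and the embodiment does not prescribe a search order.

```python
from itertools import combinations

def find_combination(idle, first):
    """Find a combination of >= 2 servers whose idle resources sum to
    more than `first`; smaller combinations are tried first."""
    names = sorted(idle)
    for size in range(2, len(names) + 1):
        for combo in combinations(names, size):
            if sum(idle[n] for n in combo) > first:
                return list(combo)
    return None  # no combination is sufficient

idle = {"edge-1": 15.0, "edge-2": 20.0, "edge-3": 25.0}
print(find_combination(idle, first=40.0))  # ['edge-2', 'edge-3']
```

No single server here exceeds 40.0, but edge-2 and edge-3 together (45.0) do, so their combination is selected.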
  • The M edge servers in the embodiment of the present invention can be allocated to P edge computing nodes, each edge computing node including one or more edge servers. For example, if the P edge computing nodes include edge computing node 1 and edge computing node 2, edge computing node 1 may include 5 edge servers and edge computing node 2 may include M-5 edge servers.
  • L edge computing nodes can first be determined from the P edge computing nodes such that the node idle computing power resources of these L edge computing nodes are greater than the first computing power resource, and the N edge servers are then selected from the determined L edge computing nodes.
  • determining N edge servers for performing offline tasks includes the following steps:
  • S1: Select L edge computing nodes from the P edge computing nodes, where the node idle computing power resources of the L edge computing nodes are greater than the first computing power resource.
  • The node idle computing power resources of the L edge computing nodes are the sum of the node idle computing power resources of each edge computing node among the L edge computing nodes. That this quantity is greater than the first computing power resource may cover any of the following situations: the node idle computing power resource of every one of the L edge computing nodes is greater than the first computing power resource; the node idle computing power resources of some of the L edge computing nodes are greater than the first computing power resource, and the sum of the node idle computing power resources of the remaining edge computing nodes is also greater than the first computing power resource; or the node idle computing power resource of every one of the L edge computing nodes is less than the first computing power resource, but the sum of the node idle computing power resources of the L edge computing nodes is greater than the first computing power resource.
  • When selecting the L edge computing nodes from the P edge computing nodes, it is possible to select only edge computing nodes whose node idle computing power resources are individually greater than the first computing power resource; or to select both edge computing nodes whose node idle computing power resources are individually greater than the first computing power resource and edge computing nodes whose node idle computing power resources together sum to more than the first computing power resource; or, if none of the P edge computing nodes has a node idle computing power resource greater than the first computing power resource, to select only edge computing nodes whose node idle computing power resources sum to more than the first computing power resource.
  • The node idle computing power resource of each edge computing node is determined from the idle computing power resources of the edge servers it includes. For example, the idle computing power resources of the multiple edge servers included in an edge computing node may be added together to obtain the node idle computing power resource of that edge computing node; as another example, they may be averaged to obtain the node idle computing power resource. It should be noted that the embodiment of the present invention only lists two ways of calculating node idle computing power resources; in a specific application, any method may be used according to actual needs, and this is not limited here.
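Both listed ways of deriving a node's idle computing power from its member servers can be sketched as follows (illustrative only; server names are hypothetical).

```python
def node_idle(server_idle, mode="sum"):
    """Node idle computing power from its member servers' idle resources.
    The embodiment mentions summing or averaging; both are sketched."""
    values = list(server_idle.values())
    return sum(values) if mode == "sum" else sum(values) / len(values)

node1 = {"edge-1": 10.0, "edge-2": 30.0}
print(node_idle(node1))          # 40.0  (sum of member servers)
print(node_idle(node1, "avg"))   # 20.0  (average of member servers)
```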
  • S2 Based on the attribute information of each edge server included in the L edge computing nodes, determine at least one candidate edge server from the edge servers included in the L edge computing nodes;
  • the attribute information of each edge server may include the working state of each edge server, and the working state may include an idle state or a busy state.
  • Regarding the working state: when the load of an edge server exceeds its load upper limit, the working state of that edge server is determined to be the busy state; conversely, when the load of an edge server is below the load upper limit, its working state is the idle state.
  • Edge servers in a busy state are not scheduled to perform offline tasks, and edge servers in an idle state can be scheduled for offline tasks.
  • Determining candidate edge servers from the edge servers included in the L edge computing nodes includes: determining, among the edge servers included in the L edge computing nodes, the edge servers whose working state is the idle state as candidate edge servers.
  • The attribute information of each edge server includes the server type group to which it belongs. Server type groups include a preset whitelist group and an ordinary group. The edge servers in the preset whitelist group are used to run high-priority, uninterruptible real-time cloud applications, so they are not scheduled offline tasks; edge servers in the ordinary group can be scheduled offline tasks.
  • The membership of the preset whitelist group changes dynamically. When an edge server in the preset whitelist group no longer performs high-priority, uninterruptible real-time tasks, it is removed from the preset whitelist group and can be transferred to the ordinary group.
  • Determining candidate edge servers from the edge servers included in the L edge computing nodes includes: determining, among the edge servers included in the L edge computing nodes, the edge servers whose server type group is the ordinary group as candidate edge servers.
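The two filters of step S2 (working state and server type group) can be combined in a sketch like the following; the field names and server data are assumptions for illustration.

```python
def candidates(servers):
    """Keep servers that are in the idle state and in the ordinary group,
    i.e. not in the preset whitelist group reserved for high-priority,
    uninterruptible real-time cloud applications."""
    return [s["name"] for s in servers
            if s["state"] == "idle" and s["group"] == "ordinary"]

servers = [
    {"name": "edge-1", "state": "idle", "group": "ordinary"},
    {"name": "edge-2", "state": "busy", "group": "ordinary"},    # busy: excluded
    {"name": "edge-3", "state": "idle", "group": "whitelist"},   # whitelisted: excluded
]
print(candidates(servers))  # ['edge-1']
```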
  • S3 Determine N edge servers from the at least one candidate edge server according to the idle computing resource and the first computing resource of each edge server in the at least one candidate edge server.
  • N edge servers are selected from at least one candidate edge server.
  • the idle computing power resources of each candidate edge server may be compared with the first computing power resource; edge servers whose idle computing power resources are greater than the first computing power resource are determined as N edge servers.
  • Alternatively, the edge servers whose idle computing power resources are individually greater than the first computing power resource, together with the edge servers whose idle computing power resources sum to more than the first computing power resource, are used as the N edge servers.
  • For example, if the at least one candidate edge server includes edge server 1, edge server 2, and edge server 3, the idle computing power resource of edge server 1 is greater than the first computing power resource, neither the idle computing power resource of edge server 2 nor that of edge server 3 is greater than the first computing power resource, but the sum of the idle computing power resources of edge server 2 and edge server 3 is greater than the first computing power resource, then edge server 1, edge server 2, and edge server 3 all serve as the N edge servers for performing the offline task.
  • Alternatively, if there are multiple edge servers among the candidate edge servers whose idle computing power resources are greater than the first computing power resource, those multiple edge servers are taken as the N edge servers.
  • The first computing power resource may include at least one of a CPU computing power resource or a GPU computing power resource.
  • The idle computing power resource of each edge server may include any one or more of CPU computing power resources, GPU computing power resources, network bandwidth, throughput, and memory. Whenever computing power resources are compared, added, or averaged, the operation is performed between computing power resources of the same type.
  • For example, the first computing power resource includes a GPU computing power resource.
  • The GPU computing power resource includes the number of half-precision floating-point operations performed by the GPU per second and the number of single-precision floating-point operations performed by the GPU per second.
  • Comparing the idle computing power resource of any edge server with the first computing power resource then specifically means comparing the two counts of half-precision floating-point operations per second with each other, and likewise the two counts of single-precision floating-point operations per second.
  • Step S203: dispatch the offline task to the N edge servers in a distributed manner, so that each of the N edge servers uses its own idle computing power resources to perform the offline task while the normal operation of the cloud application is ensured.
  • the distributed scheduling of offline tasks to N edge servers may include: dividing the offline tasks into N subtasks based on the idle computing resources of each of the N edge servers, Each subtask in the N subtasks is matched with an edge server; the idle computing power resources of the edge server matched with each subtask are greater than the computing power resources required to execute each subtask; The tasks are respectively assigned to the matching edge servers of each subtask, so that each edge server executes the matching subtasks.
  • Dividing the offline task into N subtasks may include dividing the offline task into N equal subtasks, in which case the computing power resource required to execute each subtask can equal the first computing power resource divided by N. For example, if the first computing power resource is x and the offline task is divided equally into 5 subtasks, the computing power resource required to execute each subtask equals x/5. The N subtasks are then dispatched to the N edge servers respectively.
  • matching each subtask with an edge server may mean that a subtask matches any edge server.
  • N edge servers include edge server 1, edge server 2, and edge server 3.
  • the idle computing resource of edge server 1 is equal to x1
  • the idle computing resource of edge server 2 is equal to x2
  • the idle computing resource of edge server 3 is equal to x3.
  • Subtask 1 matches edge server 1, and the computing power resource required to execute subtask 1 is less than or equal to x1; subtask 2 matches edge server 2, and the computing power resource required to execute subtask 2 is less than or equal to x2; subtask 3 matches edge server 3, and the computing power resource required to execute subtask 3 is less than or equal to x3.
  • If the N edge servers include both edge servers whose idle computing power resources are greater than the first computing power resource and edge servers whose idle computing power resources are not greater than the first computing power resource, then dividing the offline task into N subtasks based on the idle computing power resource of each of the N edge servers may include: first, according to the idle computing power resource of each edge server whose idle computing power resource is not greater than the first computing power resource, dividing out of the offline task the subtasks that match each of those edge servers; then dividing the remaining part of the offline task evenly and distributing it to the other edge servers, whose idle computing power resources are greater than the first computing power resource.
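One way to realize a division that respects each server's idle resources is a proportional split, sketched below; this is an illustration under assumed names and units, not the embodiment's prescribed division rule.

```python
def split_task(task_size, idle):
    """Split an offline task into subtasks proportional to each server's
    idle computing power, so no subtask exceeds what its server can spare."""
    total_idle = sum(idle.values())
    return {name: task_size * share / total_idle for name, share in idle.items()}

idle = {"edge-1": 50.0, "edge-2": 30.0, "edge-3": 20.0}
print(split_task(100.0, idle))
# {'edge-1': 50.0, 'edge-2': 30.0, 'edge-3': 20.0}
```

A server with more idle computing power receives a proportionally larger subtask, so the subtask assigned to each server never exceeds that server's idle share.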
  • the management server can monitor the performance of each edge server on the matching subtask; if any edge server in the N edge servers is detected When an exception occurs while executing a matching subtask, an edge server is reselected to execute the matching subtask of any edge server.
  • The management server monitors each edge server's execution of its matching subtask based on the task execution status reported by that edge server.
  • Detecting that any of the N edge servers encounters an exception while executing its matching subtask may include: the task execution status reported by that edge server to the management server indicates an exception during execution; or the management server does not receive the task execution status reported by that edge server.
  • an offline task corresponds to an execution time threshold
  • N edge servers need to complete the offline task within the execution time threshold. Since the offline task is divided into N subtasks, each subtask also corresponds to an execution time threshold. The execution time threshold of each subtask may be equal to the execution time threshold corresponding to the offline task.
  • the timeout prompt information is used to instruct the management server to reassign a new edge server to execute the subtask matching the edge server reporting the timeout prompt information.
  • the idle computing power resource of the edge server is greater than the first computing power resource required to execute the offline task, and the idle computing power resource of N edge servers refers to the sum of the idle computing power resources of each edge server.
  • N can take a value of 1 or greater. When N takes a value of 1, it can ensure the centralized execution of offline tasks and facilitate the execution management of offline tasks. When N takes a value greater than 1, the distribution of offline tasks can be realized. This distributed execution method can not only ensure the execution progress of offline tasks, but also share the load of each edge server, so as to ensure the normal operation of cloud applications in each edge server.
  • the embodiment of the present invention provides another service processing method.
  • FIG. 3 it is a schematic flowchart of another service processing method provided by the embodiment of the present invention.
  • the business processing method shown in FIG. 3 may be executed by an edge server among the N edge servers, specifically, may be executed by a processor of the edge server.
  • The edge server may be any one of the N edge servers, and a cloud application runs on the edge server.
  • the business processing method shown in Figure 3 may include the following steps:
  • Step S301 receiving a distributed offline task distributed and scheduled by the management server.
  • The distributed offline task may be an offline task received by the management server, or may be any one of the N subtasks into which that offline task is divided.
  • the idle computing power resources of the edge server are greater than the computing power resources required to execute distributed offline tasks.
  • the N subtasks are obtained by dividing and processing the offline task based on the idle computing resources of each of the N edge servers executing the offline task.
  • For details, please refer to the description of the relevant content in step S203 in the embodiment of FIG. 2, which will not be repeated here.
  • the edge server may count the idle computing resources of the edge server, and report the idle computing resources of the edge server to the management server.
  • the idle computing power resource of the edge server may be determined based on the total computing power resource of the edge server and the second computing power resource required for running the cloud application.
  • The second computing power resource can be subtracted from the total computing power resource of the edge server, and the result of the subtraction taken as the idle computing power resource of the edge server; that is to say, the idle computing power resource of the edge server may refer to the computing power remaining in the edge server after the second computing power resource used to run the cloud application is excluded.
  • Alternatively, some reserved computing power resources can be set, and both the reserved computing power resources and the second computing power resource required to run the cloud application are subtracted from the total computing power resource of the edge server; the remaining computing power is the idle computing power resource of the edge server. In this way, if the computing power required to run the cloud application suddenly increases, part of the reserved computing power can be used to run the cloud application without interrupting the execution of the distributed offline task.
  • The second computing power resource may include any one or more of CPU computing power resources, GPU computing power resources, memory, network bandwidth, and throughput.
  • CPU computing power resources may generally include at least one of the number of floating-point operations performed by the CPU per second or the number of operations performed by the CPU per second.
  • GPU computing power resources may include at least one of the number of floating-point operations performed by the GPU per second or the number of operations performed by the GPU per second. Since both a single edge server and the edge computing node to which it belongs affect the idle computing power resources, the network bandwidth is determined from the bandwidth of the internal network and the bandwidth of the external network; specifically, the smaller of the two can be taken as the network bandwidth of the edge server.
  • Step S302 under the condition that the normal operation of the cloud application is ensured, the idle computing resources of the edge server are used to execute distributed offline tasks.
  • Both the offline task and each subtask obtained by dividing it correspond to an execution time threshold, so the distributed offline task also corresponds to an execution time threshold.
  • Using the idle computing power resources in the edge server to execute the distributed offline task includes: determining, based on the idle computing power resource of the edge server, the time required to execute the distributed offline task; and if the required time is less than the execution time threshold corresponding to the distributed offline task, using the idle computing power resources in the edge server to execute the distributed offline task.
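The feasibility check described above (required time versus execution time threshold) can be sketched as follows; the FLOPs-style units and all figures are assumptions for illustration.

```python
def can_accept(task_work, idle_rate, time_threshold):
    """Accept a distributed offline task only if the local idle computing
    power can finish it within the task's execution time threshold."""
    required_time = task_work / idle_rate  # e.g. total FLOPs / idle FLOPS
    return required_time <= time_threshold

print(can_accept(task_work=600.0, idle_rate=10.0, time_threshold=120.0))  # True
print(can_accept(task_work=600.0, idle_rate=4.0, time_threshold=120.0))   # False
```

If the check fails, the edge server would refuse the task so the management server can schedule it elsewhere, matching the rescheduling behavior described later.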
  • the idle computing resource in each edge server may refer to the remaining computing resource in the edge server except the second computing resource required to run the cloud application.
  • While the edge server is executing the distributed offline task, if it is detected that the resources required to run the cloud application on the edge server suddenly increase beyond the second computing power resource, then, to ensure the normal operation of the cloud application, the edge server may need to perform a computing power release operation.
  • the residence time of the distributed offline task in the edge server can be obtained; according to the relationship between the residence time and the execution time threshold corresponding to the distributed offline task, the computing power release operation is performed.
  • The computing power release operation may include suspending the execution of the distributed offline task, so that execution can be restarted later, once the computing power resource required by the cloud application is again less than or equal to the second computing power resource.
  • the computing power release operation may include terminating the execution of the distributed offline task.
  • After performing the computing power release operation, the edge server can periodically check its idle computing power resources. If the idle computing power resource of the edge server is detected to be greater than the first computing power resource, execution of the distributed offline task is started again; if the idle computing power resource of the edge server is less than the first computing power resource, and the difference between the time the distributed offline task has stayed in the edge server and the execution time threshold is less than the time difference threshold, execution of the distributed offline task is terminated.
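The suspend-or-terminate choice, driven by the difference between the execution time threshold and the task's residence time, can be sketched as follows (the threshold values are hypothetical).

```python
def release_action(residence_time, time_threshold, time_diff_threshold):
    """When the cloud application suddenly needs more computing power, decide
    whether to suspend the offline task (enough slack remains to finish it
    later) or terminate it (too little slack remains)."""
    remaining_slack = time_threshold - residence_time
    return "suspend" if remaining_slack >= time_diff_threshold else "terminate"

print(release_action(residence_time=20.0, time_threshold=120.0,
                     time_diff_threshold=30.0))   # suspend
print(release_action(residence_time=100.0, time_threshold=120.0,
                     time_diff_threshold=30.0))   # terminate
```

A terminated task is handed back to the management server for redistribution, while a suspended one waits for the cloud application's demand to drop back below the second computing power resource.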
  • the edge server can only give up continuing the distributed offline task, and inform the management server to reschedule a new edge server to execute the distributed offline task.
  • The edge server may send a timeout prompt message to the management server; the timeout prompt message is used to notify the management server that the time the edge server requires to complete the distributed offline task is greater than the execution time threshold corresponding to the distributed offline task, and that the management server should reassign a new edge server to execute the distributed offline task.
  • the edge server receives a distributed offline task distributed by the management server.
  • The distributed offline task may be an offline task received by the management server, or the one of the N subtasks that matches the edge server.
  • The N subtasks can be obtained by dividing the offline task based on the idle computing power resources of each of the N edge servers used to execute it; under the condition that the normal operation of the cloud application is ensured, the idle computing power resources of the edge server are used to execute the distributed offline task.
  • In this way, the idle computing power resources in the edge server can also be used to execute distributed offline tasks while the normal operation of the cloud application is guaranteed, avoiding the waste of computing power resources in the edge server, improving the utilization rate of computing power resources, and thereby reducing the operating cost of edge servers.
  • FIG. 4 it is a schematic structural diagram of a business processing system provided by an embodiment of the present invention.
  • The business processing system shown in FIG. 4 may include a management server 401 for managing edge computing nodes and at least one edge server 402; the at least one edge server 402 is allocated to P edge computing nodes 403, and each edge computing node 403 includes one or more edge servers.
  • each edge server 402 includes a core function module, which is mainly used to realize the core functions of cloud applications.
  • Taking a cloud game as an example of the cloud application, the core function module is used for functions such as game rendering and game logic computation.
  • The computing power resource requirement of this module is set to the highest priority; that is to say, no matter what offline tasks an edge server is performing, if the module is found to need more computing power resources, sufficient computing power resources are allocated to the module first.
  • Each edge server 402 may also include a computing power management module, which is used to manage the computing power resources of the edge server to ensure that the requirements of all real-time online tasks on the local machine do not exceed its physical computing power limit.
  • the main functions of the computing power management module can include:
  • Real-time collection and reporting of the local machine's idle computing power resources; the reported data can serve as the basis for subsequent offline task scheduling. It should be understood that the local machine may report its idle computing power resources directly to the management server 401, or it may report its occupied computing power resources and available total computing power resources to the management server 401, which then derives the edge server's idle computing power resources from the occupied computing power resources and the available total computing power resources;
  • the edge server 402 may also include an offline task scheduling module.
  • the offline task scheduling module is mainly used to schedule the offline tasks distributed to the local machine by the management server.
  • the main functions may include:
  • After receiving an offline task issued by the management server (the offline task here can be a complete offline task, or a subtask obtained by dividing a complete offline task), calculate whether the local machine's idle computing power resources satisfy the demand, mainly by determining whether the idle computing power resources of the local machine can complete the offline task within the execution time threshold; if so, start the offline task.
  • the offline task scheduling module also regularly checks the execution status of local offline tasks, and re-checks the current idle computing resources for offline tasks that are suspended due to insufficient idle computing power.
  • When the computing power management module detects that the computing power required for the current real-time online task is insufficient, it notifies this module to perform the computing power release operation, which may specifically include suspending the execution of the offline task (for tasks that still have sufficient time to complete, that is, as mentioned above, tasks for which the difference between the offline task's residence time on the local machine and the execution time threshold is greater than the time difference threshold), or terminating the execution of the offline task (for tasks that must be completed immediately, that is, tasks for which the difference between the offline task's residence time on the local machine and the execution time threshold is less than the time difference threshold);
  • Terminating the offline task: when the local machine's idle computing power resources cannot complete the offline task within the execution time threshold, or the local machine needs to be shut down for maintenance, etc., the execution of the offline task is terminated, and the offline task is redistributed to other edge servers through the management server.
  • The management server 401 may include an idle capacity prediction module, which is mainly used to calculate, from the computing power resource data reported by each edge server, the idle computing power resources of each edge server and the node idle computing power resources of each edge computing node.
  • the management server 401 may also include an offline task management module, and the main functions of the offline task management module may include:
  • Task reception Receive the offline task uploaded by the user, and classify the offline task.
  • The main classification dimensions can include whether the offline task has timeliness requirements, whether the main load of the offline task is CPU-type or GPU-type, and whether the offline task can be executed in a distributed manner;
  • Task process management: receive the task execution status reported by the edge server running the cloud application. If the offline task is judged to have been completed according to the task execution status, verify the execution result and feed it back to the user; if the offline task execution status is judged to be abnormal, or the task execution status reported by the edge server is not received, reschedule the offline task to a new edge server.
  • the management server 401 also includes a policy management module, and the main functions of the policy management module may include:
  • the working state of the edge server includes a busy state and an idle state.
  • an edge server reports that the current edge server has exceeded the load limit, it needs to set the working status of the edge server to busy state to avoid assigning new offline tasks and cloud application instances to the edge server.
  • the edge server informs the management server that it is currently idle, the working state of the edge server is changed to an idle state, and the idle computing resources of the edge server can be recalculated.
  • the management server 401 may also include a distributed scheduling module, and the main functions of the distributed scheduling module include:
  • Task scheduling: split a large offline task into multiple subtasks according to the idle computing power resources of each edge server used to execute the offline task, and assign each subtask to an edge server for execution; after each edge server completes its subtask, it reports the execution result to the management server 401, and the management server summarizes the reported execution results to obtain the final execution result;
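The split-and-summarize step above can be sketched as follows. Splitting work in proportion to idle capacity is one plausible policy, not the patent's mandated algorithm, and the function names are illustrative:

```python
# Sketch of the distributed scheduling step: work units are split in
# proportion to each server's idle computing power, and the reported
# per-server results are summarized into the final result.

def split_subtasks(num_units: int, idle: list[float]) -> list[int]:
    """Split num_units of work proportionally to idle capacity; leftover
    units go to the servers with the most idle capacity."""
    total = sum(idle)
    shares = [int(num_units * c / total) for c in idle]
    leftover = num_units - sum(shares)
    for i in sorted(range(len(idle)), key=lambda i: -idle[i])[:leftover]:
        shares[i] += 1
    return shares

def summarize(partial_results: list[list]) -> list:
    """Merge the execution results reported by each edge server."""
    return [item for part in partial_results for item in part]
```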
  • the management server 401 may also include a cloud application instance scheduling module, which is used to dynamically allocate cloud application instances according to the computing power resources of each edge server and the node computing power resources of each edge computing node, so as to avoid overloading a single edge server.
  • when the management server receives an offline task to be executed, it first evaluates the first computing power resource required for executing the offline task, and then determines N edge servers for executing the offline task,
  • the idle computing power resources of the N edge servers are greater than the first computing power resources required for performing offline tasks, and the idle computing power resources of the N edge servers refer to the sum of the idle computing power resources of each edge server.
  • the offline task is distributed and dispatched to N edge servers.
  • after any edge server among the N edge servers receives the offline task distributed and scheduled by the management server, it uses its idle computing power resources to execute the offline task while ensuring the normal operation of the cloud application. In this way, whether in the peak period or the off-peak period of the cloud application, the idle computing power resources in the edge server can be used to execute distributed offline tasks while the cloud application is guaranteed to run normally, avoiding the waste of computing power resources in the edge server, improving the utilization rate of computing power resources, and thereby reducing the operating cost of edge servers.
  • N can take a value of 1 or a value greater than 1.
  • when the value of N is 1, the centralized execution of the offline task is ensured, which facilitates the execution management of the offline task; when the value of N is greater than 1, distributed execution of the offline task can be realized.
  • this distributed execution method can not only ensure the execution progress of offline tasks, but also share the load among the edge servers, so as to ensure the normal operation of the cloud applications in each edge server.
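One possible way to pick the N edge servers, sketched below, is a greedy choice by largest idle capacity. This selection rule is an assumption for illustration; the patent only requires that the combined idle resources exceed the first computing power resource:

```python
# Pick edge servers (largest idle capacity first) until their combined idle
# computing power exceeds the first computing power resource required by
# the offline task.

def select_servers(idle_by_server: dict[str, float], required: float) -> list[str]:
    chosen: list[str] = []
    total = 0.0
    for sid, idle in sorted(idle_by_server.items(), key=lambda kv: -kv[1]):
        chosen.append(sid)
        total += idle
        if total > required:  # sum of idle resources must exceed `required`
            return chosen
    return []  # not enough idle capacity across all edge servers
```

With this greedy choice, N naturally becomes 1 when a single server has enough idle capacity, matching the centralized case discussed above.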
  • an embodiment of the present invention provides a service processing device.
  • FIG. 5 it is a schematic structural diagram of a service processing device provided by an embodiment of the present invention.
  • the business processing device shown in Figure 5 can run the following units:
  • a determining unit 501 configured to determine a first computing power resource required to execute an offline task
  • the determining unit 501 is further configured to determine N edge servers for performing the offline task, wherein cloud applications are running in the N edge servers; idle computing resources of the N edge servers are greater than the first A computing power resource, the idle computing power resource of the N edge servers refers to the sum of the idle computing power resources of each edge server in the N edge servers, and N is an integer greater than or equal to 1;
  • the scheduling unit 502 is configured to distribute and schedule the offline task to the N edge servers, so that each edge server in the N edge servers uses its own idle computing power resources to execute the offline task while ensuring the normal operation of the cloud application.
  • when the scheduling unit 502 distributes and schedules the offline task to the N edge servers, it performs the following steps:
  • divide the offline task into N subtasks, where each of the N subtasks matches an edge server; the idle computing power resources of the edge server matching each subtask are greater than the computing power resources required to execute that subtask; assign each subtask to its matching edge server, so that each edge server executes its matching subtask.
  • the cloud application is deployed to be executed in M edge servers, and the M edge servers are allocated to P edge computing nodes, and one or more edge servers are deployed in each edge computing node , M and P are both integers greater than or equal to 1; when the determining unit 501 determines N edge servers for performing the offline task, perform the following steps:
  • the node idle computing power resources of the L edge computing nodes are greater than the first computing power resources, and the node idle computing power resources of the L edge computing nodes refer to the sum of the node idle computing power resources of each edge computing node; the node idle computing power resources of each edge computing node are obtained according to the idle computing power resources of the edge servers deployed in that edge computing node;
  • based on the attribute information of each edge server included in the L edge computing nodes, determine at least one candidate edge server from the edge servers included in the L edge computing nodes; according to the idle computing power resources of each edge server in the at least one candidate edge server and the first computing power resource, determine N edge servers from the at least one candidate edge server.
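The two-stage lookup above can be sketched as follows. The names and the node-first greedy choice are illustrative assumptions: first pick L edge computing nodes whose combined node idle capacity exceeds the first computing power resource, then treat the servers they contain as the candidate pool from which the N servers are chosen:

```python
# Stage 1: choose edge computing nodes by node idle capacity.
# Stage 2: the edge servers inside the chosen nodes become candidates.

def pick_nodes(nodes: dict[str, dict[str, float]], required: float):
    """nodes maps node_id -> {server_id: idle capacity}."""
    node_idle = {n: sum(s.values()) for n, s in nodes.items()}
    chosen, total = [], 0.0
    for n, idle in sorted(node_idle.items(), key=lambda kv: -kv[1]):
        chosen.append(n)
        total += idle
        if total > required:
            break
    # candidate edge servers come from the chosen nodes only
    candidates = {sid: idle for n in chosen for sid, idle in nodes[n].items()}
    return chosen, candidates
```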
  • the attribute information of each edge server includes the working status of each edge server, and the working status includes an idle state or a busy state
  • based on the attribute information of each edge server included in the L edge computing nodes, the determining unit 501 performs the following steps when determining at least one candidate edge server from the edge servers included in the L edge computing nodes:
  • an edge server whose working state is idle is determined as a candidate edge server.
  • the attribute information of each edge server includes the server type group to which each edge server belongs, and the server type group includes a preset whitelist group and a common group; based on the attribute information of each edge server included in the L edge computing nodes, the determining unit 501 performs the following steps when determining at least one candidate edge server from the edge servers included in the L edge computing nodes:
  • among the edge servers included in the L edge computing nodes, the edge servers whose server type belongs to the common group are determined as candidate edge servers.
  • the service processing device further includes a processing unit 503, and the processing unit 503 is configured to: monitor each edge server's execution of its matching subtask during the execution of the matching subtask; if it is detected that any one of the N edge servers is abnormal when executing its matching subtask, re-select an edge server to execute that edge server's matching subtask.
  • a subtask corresponds to an execution time threshold
  • the service processing device further includes a receiving unit 504, which is configured to receive timeout prompt information reported by any edge server; the timeout prompt information is used to indicate that the time required for that edge server to complete its matching subtask is longer than the execution time threshold corresponding to the matching subtask, and that a new edge server needs to be reassigned to execute the subtask matching that edge server.
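The rescheduling rule spread across the two mechanisms above can be condensed into one predicate. The function name and the encoding of a missing report as `None` are assumptions:

```python
# A subtask is moved to a new edge server if its status report is abnormal,
# no status was reported at all, or a timeout prompt predicts that completing
# the subtask will exceed its execution time threshold.

def needs_reschedule(status, predicted_time: float, time_threshold: float) -> bool:
    if status is None or status == "abnormal":
        return True
    return predicted_time > time_threshold
```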
  • the first computing power resources include any one or more of the following: graphics processor (GPU) computing power resources, central processing unit (CPU) computing power resources, memory, network bandwidth, and network throughput; wherein the GPU computing power resources include at least one of the following: the number of floating-point operations performed by the GPU per second and the number of operations performed by the GPU per second; the CPU computing power resources include at least one of the following: the number of floating-point operations performed by the CPU per second and the number of operations performed by the CPU per second.
  • the determining unit performs the following steps when determining the first computing resource required to execute the offline task:
  • based on the correspondence between task types and computational complexity, determine the computational complexity corresponding to the task type of the offline task; find at least one matching historical offline task from the historical offline tasks according to the determined computational complexity, where the computational complexity corresponding to each matching historical offline task matches the determined computational complexity; estimate the computing power resources required by the offline task based on the computing power resources used to execute each matching historical offline task, to obtain the first computing power resource required to execute the offline task.
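The history-based estimation above can be sketched as follows. The task-type-to-complexity table, and averaging the usage of matching historical tasks, are illustrative assumptions (the patent does not fix the aggregation rule):

```python
# Hypothetical task-type -> computational complexity mapping.
COMPLEXITY_BY_TYPE = {"video-transcode": 3, "model-train": 5}

def estimate_first_resource(task_type: str,
                            history: list[tuple[int, float]]) -> float:
    """history holds (complexity, resources_used) pairs for past offline
    tasks; the estimate averages the usage of historical tasks whose
    complexity matches the new task's type."""
    complexity = COMPLEXITY_BY_TYPE[task_type]
    matches = [used for c, used in history if c == complexity]
    return sum(matches) / len(matches)
```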
  • each step involved in the service processing method shown in FIG. 2 may be executed by each unit in the service processing device shown in FIG. 5 .
  • steps S201 and S202 described in FIG. 2 may be performed by the determining unit 501 in the service processing device shown in FIG. 5
  • step S203 may be performed by the scheduling unit 502 in the service processing device shown in FIG. 5 .
  • each unit in the service processing device shown in Fig. 5 can be separately or wholly combined into one or several other units, or one (or some) of the units can be further divided into a plurality of functionally smaller units; this can achieve the same operation without affecting the realization of the technical effects of the embodiments of the present invention.
  • the above-mentioned units are divided based on logical functions. In practical applications, the function of one unit may also be realized by multiple units, or the functions of multiple units may be realized by one unit. In other embodiments of the present invention, the service processing device may also include other units. In practical applications, these functions may also be implemented with the assistance of other units, and may be implemented cooperatively by multiple units.
  • the above units may be run on a general-purpose computing device, such as a computer, which includes processing elements such as a central processing unit (CPU) and storage elements such as a random access memory (RAM) and a read-only memory (ROM).
  • computer-readable instructions capable of executing the steps involved in the corresponding method shown in Figure 2 are run on the computing device, so as to construct the service processing device shown in Figure 5 and to realize the service processing method of the embodiments of the present invention.
  • the computer-readable instructions may be recorded in, for example, a computer storage medium, loaded into the above-mentioned computing device via the computer storage medium, and run therein.
  • the idle computing power resources of the N edge servers are greater than the first computing power resource required to execute the offline task, and the idle computing power resources of the N edge servers refer to the sum of the idle computing power resources of each edge server.
  • N can take a value of 1 or a value greater than 1.
  • when the value of N is 1, the centralized execution of the offline task is ensured, which facilitates the execution management of the offline task; when the value of N is greater than 1, distributed execution of the offline task can be realized.
  • this distributed execution method can not only ensure the execution progress of offline tasks, but also share the load among the edge servers, so as to ensure the normal operation of the cloud applications in each edge server.
  • FIG. 6 it is a schematic structural diagram of another service processing device provided by an embodiment of the present invention.
  • the business processing device shown in Figure 6 can run the following units:
  • the receiving unit 601 is configured to receive a distributed offline task distributed by the management server, where the distributed offline task includes the offline task received by the management server, or the distributed offline task includes a subtask among N subtasks that matches the edge server; the N subtasks are obtained by dividing the offline task based on the idle computing power resources of each edge server in the N edge servers; the N edge servers are used to execute the offline task, and cloud applications are running in the N edge servers;
  • the execution unit 602 is configured to execute the distributed offline task using idle computing resources in the edge server under the condition that the normal execution of the target cloud application is guaranteed.
  • the distributed offline task corresponds to an execution time threshold
  • the execution unit 602 performs the following steps when executing the distributed offline task using idle computing power resources in the edge server:
  • the idle computing power resources of the edge server refer to the remaining computing power resources of the edge server other than the second computing power resources required to run the cloud application; the service processing device further includes an acquiring unit 603;
  • the acquiring unit 603 is configured to, during the execution of the distributed offline task, acquire the dwell time of the distributed offline task in the edge server;
  • the execution unit 602 is configured to perform a computing power release operation according to the relationship between the dwell time and the execution time threshold; wherein the computing power release operation includes suspending the execution of the distributed offline task or terminating the execution of the distributed offline task; if the time difference between the dwell time and the execution time threshold is greater than the time difference threshold, the computing power release operation includes suspending the execution of the distributed offline task; if the time difference between the dwell time and the execution time threshold is less than the time difference threshold, the computing power release operation includes terminating the execution of the distributed offline task.
  • the execution unit 602 is further configured to: periodically detect the idle computing power resources of the edge server; if the idle computing power resources are greater than the first computing power resource, start executing the distributed offline task; if the idle computing power resources of the edge server are smaller than the first computing power resource, and the difference between the dwell time of the distributed offline task in the edge server and the execution time threshold is smaller than the time difference threshold, terminate the execution of the distributed offline task.
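The computing-power release rule above can be condensed into one decision function. The function name and the reading of "time difference" as an absolute gap are interpretive assumptions: while the dwell time is still far from the execution time threshold the task is merely suspended (it can resume once idle capacity returns), and when the dwell time gets close to the threshold the task is terminated so capacity is freed for the cloud application:

```python
# Decide the computing power release operation from the dwell time of the
# distributed offline task relative to its execution time threshold.

def release_action(dwell: float, exec_threshold: float,
                   diff_threshold: float) -> str:
    diff = abs(exec_threshold - dwell)
    return "suspend" if diff > diff_threshold else "terminate"
```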
  • the service processing device further includes a sending unit 604, configured to, during the execution of the distributed offline task by the edge server, send timeout prompt information to the management server if it is predicted that the time required for the edge server to complete the distributed offline task is greater than the execution time threshold; the timeout prompt information is used to indicate that the time required for the edge server to complete the distributed offline task is greater than the execution time threshold, and that the management server needs to reassign a new edge server to execute the distributed offline task.
  • each step involved in the service processing method shown in FIG. 3 may be executed by each unit in the service processing device shown in FIG. 6 .
  • step S301 described in FIG. 3 may be performed by the receiving unit 601 in the service processing device shown in FIG. 6
  • step S302 may be performed by the execution unit 602 in the service processing device shown in FIG. 6 .
  • each unit in the service processing device shown in Fig. 6 can be separately or wholly combined into one or several other units, or one (or some) of the units can be further divided into a plurality of functionally smaller units; this can achieve the same operation without affecting the realization of the technical effects of the embodiments of the present invention.
  • the above-mentioned units are divided based on logical functions. In practical applications, the function of one unit may also be realized by multiple units, or the functions of multiple units may be realized by one unit. In other embodiments of the present invention, the service processing device may also include other units. In practical applications, these functions may also be implemented with the assistance of other units, and may be implemented cooperatively by multiple units.
  • the above units may be run on a general-purpose computing device, such as a computer, which includes processing elements such as a central processing unit (CPU) and storage elements such as a random access memory (RAM) and a read-only memory (ROM).
  • computer-readable instructions capable of executing the steps involved in the corresponding method shown in Figure 3 are run on the computing device, so as to construct the service processing device shown in Figure 6 and to implement the service processing method of the embodiments of the present invention.
  • the computer-readable instructions may be recorded in, for example, a computer storage medium, loaded into the above-mentioned computing device via the computer storage medium, and run therein.
  • the edge server receives a distributed offline task distributed by the management server.
  • the distributed offline task may be the offline task received by the management server, or a subtask among N subtasks that matches the edge server.
  • the N subtasks can be obtained by dividing the offline task based on the idle computing power resources of the N edge servers used to execute the offline task; while ensuring the normal operation of the cloud application, the edge server uses its idle computing power resources to execute the distributed offline task.
  • in this way, whether in the peak period or the off-peak period of the cloud application, the idle computing power resources in the edge server can be used to execute distributed offline tasks while the cloud application is guaranteed to run normally, avoiding the waste of computing power resources in the edge server, improving the utilization rate of computing power resources, and thereby reducing the operating cost of edge servers.
  • the embodiment of the present invention provides a server.
  • FIG. 7 it is a schematic structural diagram of a server provided by the embodiment of the present invention.
  • the server shown in FIG. 7 may correspond to the aforementioned management server, and the server shown in FIG. 7 may include a processor 701 , an input interface 702 , an output interface 703 , and a computer storage medium 704 .
  • the processor 701, the input interface 702, the output interface 703, and the computer storage medium 704 may be connected through a bus or in other ways.
  • the computer storage medium 704 may be stored in the memory of the server, the computer storage medium 704 is used for storing computer-readable instructions, and the processor 701 is used for executing the computer-readable instructions stored in the computer storage medium 704.
  • Processor 701 (or CPU (Central Processing Unit, central processing unit)) is the computing core and control core of the server, which is suitable for implementing one or more computer-readable instructions, specifically for loading and executing:
  • determining the first computing power resource required to execute the offline task; determining N edge servers for executing the offline task, wherein cloud applications are running in the N edge servers; the idle computing power resources of the N edge servers are greater than the first computing power resource, the idle computing power resources of the N edge servers refer to the sum of the idle computing power resources of each edge server in the N edge servers, and N is an integer greater than or equal to 1;
  • the offline task is distributed and scheduled to the N edge servers, so that each edge server in the N edge servers uses its own idle computing power resources to execute the offline task under the condition that the cloud application is guaranteed to run normally.
  • the idle computing power resources of the N edge servers are greater than the first computing power resource required to execute the offline task, and the idle computing power resources of the N edge servers refer to the sum of the idle computing power resources of each edge server.
  • N can take a value of 1 or a value greater than 1.
  • when the value of N is 1, the centralized execution of the offline task is ensured, which facilitates the execution management of the offline task; when the value of N is greater than 1, distributed execution of the offline task can be realized.
  • this distributed execution method can not only ensure the execution progress of offline tasks, but also share the load among the edge servers, so as to ensure the normal operation of the cloud applications in each edge server.
  • the embodiment of the present invention provides another server.
  • FIG. 8 it is a schematic structural diagram of another server provided by the embodiment of the present invention.
  • the server shown in FIG. 8 may correspond to the aforementioned edge server.
  • the server shown in FIG. 8 may include a processor 801 , an input interface 802 , an output interface 803 , and a computer storage medium 804 .
  • the processor 801, the input interface 802, the output interface 803, and the computer storage medium 804 may be connected through a bus or in other ways.
  • the computer storage medium 804 may be stored in the memory of the terminal, the computer storage medium 804 is used for storing computer-readable instructions, and the processor 801 is used for executing the computer-readable instructions stored in the computer storage medium 804.
  • the processor 801 (or called CPU (Central Processing Unit, central processing unit)) is the calculation core and control core of the terminal, which is suitable for implementing one or more computer-readable instructions, and is specifically suitable for loading and executing:
  • receive a distributed offline task distributed by the management server, where the distributed offline task includes the offline task received by the management server; or, the distributed offline task includes a subtask among N subtasks that matches the edge server, where the N subtasks are obtained by dividing the offline task based on the idle computing power resources of each edge server in the N edge servers;
  • the distributed offline task is executed by using the idle computing resources of the edge server.
  • the edge server receives a distributed offline task distributed by the management server.
  • the distributed offline task may be the offline task received by the management server, or a subtask among N subtasks that matches the edge server.
  • the N subtasks can be obtained by dividing the offline task based on the idle computing power resources of the N edge servers used to execute the offline task; while ensuring the normal operation of the cloud application, the edge server uses its idle computing power resources to execute the distributed offline task.
  • in this way, whether in the peak period or the off-peak period of the cloud application, the idle computing power resources in the edge server can be used to execute distributed offline tasks while the cloud application is guaranteed to run normally, avoiding the waste of computing power resources in the edge server, improving the utilization rate of computing power resources, and thereby reducing the operating cost of edge servers.
  • the embodiment of the present invention also provides a computer storage medium (Memory).
  • the computer storage medium is a storage device of a server and is used for storing programs and data. It can be understood that the computer storage medium here may include a built-in storage medium of the server, and certainly may include an extended storage medium supported by the server.
  • the computer storage medium provides storage space, and the storage space stores the operating system of the server.
  • computer-readable instructions adapted to be loaded and executed by the processor 801 or the processor 901 are also stored in the storage space.
  • the computer storage medium here can be a high-speed RAM memory, or a non-volatile memory (non-volatile memory), such as at least one disk memory; optionally, it can also be at least one computer storage medium located away from the aforementioned processor.
  • the computer-readable instructions stored in the computer storage medium can be loaded and executed by the processor 801:
  • determining the first computing power resource required to execute the offline task; determining N edge servers for executing the offline task, wherein cloud applications are running in the N edge servers; the idle computing power resources of the N edge servers are greater than the first computing power resource, the idle computing power resources of the N edge servers refer to the sum of the idle computing power resources of each edge server in the N edge servers, and N is an integer greater than or equal to 1;
  • the offline task is distributed and scheduled to the N edge servers, so that each edge server in the N edge servers uses its own idle computing power resources to execute the offline task under the condition that the cloud application is guaranteed to run normally.
  • when the processor 801 distributes and schedules the offline task to the N edge servers, it performs the following steps:
  • divide the offline task into N subtasks, where each of the N subtasks matches an edge server; the idle computing power resources of the edge server matching each subtask are greater than the computing power resources required to execute that subtask; assign each subtask to its matching edge server, so that each edge server executes its matching subtask.
  • the cloud application is deployed to be executed in M edge servers, and the M edge servers are allocated to P edge computing nodes, and one or more edge servers are deployed in each edge computing node , M and P are both integers greater than or equal to 1; when the processor 801 determines N edge servers for performing the offline task, perform the following steps:
  • the node idle computing power resources of the L edge computing nodes are greater than the first computing power resources, and the node idle computing power resources of the L edge computing nodes refer to the sum of the node idle computing power resources of each edge computing node; the node idle computing power resources of each edge computing node are obtained according to the idle computing power resources of the edge servers deployed in that edge computing node;
  • the attribute information of each edge server includes the working state of each edge server, and the working state includes an idle state or a busy state; based on the attribute information of each edge server included in the L edge computing nodes, the processor 801 performs the following steps when determining at least one candidate edge server from the edge servers included in the L edge computing nodes: among the edge servers included in the L edge computing nodes, the edge server whose working state is the idle state is determined as a candidate edge server.
  • the attribute information of each edge server includes the server type group to which each edge server belongs, and the server type group includes a preset whitelist group and a common group; based on the attribute information of each edge server included in the L edge computing nodes, the processor 801 performs the following steps when determining at least one candidate edge server from the edge servers included in the L edge computing nodes: among the edge servers included in the L edge computing nodes, the edge servers whose server type belongs to the common group are determined as candidate edge servers.
  • the processor 801 is further configured to: monitor each edge server's execution of its matching subtask during the execution of the matching subtask; if it is detected that any one of the N edge servers is abnormal when executing its matching subtask, select another edge server to execute that edge server's matching subtask.
  • a subtask corresponds to an execution time threshold
  • the processor 801 is further configured to: receive timeout prompt information reported by any edge server during the execution of its matching subtask; the timeout prompt information is used to indicate that the time required for that edge server to complete its matching subtask is longer than the execution time threshold corresponding to the matching subtask, and that a new edge server needs to be reassigned to execute that edge server's matching subtask.
  • the first computing power resources include any one or more of the following: graphics processor (GPU) computing power resources, central processing unit (CPU) computing power resources, memory, network bandwidth, and network throughput; wherein the GPU computing power resources include at least one of the following: the number of floating-point operations performed by the GPU per second and the number of operations performed by the GPU per second; the CPU computing power resources include at least one of the following: the number of floating-point operations performed by the CPU per second and the number of operations performed by the CPU per second.
  • when the processor 801 determines the first computing power resource required to execute the offline task, it performs the following steps: based on the correspondence between task types and computational complexity, determine the computational complexity corresponding to the task type of the offline task; find at least one matching historical offline task from the historical offline tasks according to the determined computational complexity, where the computational complexity corresponding to each matching historical offline task matches the determined computational complexity; estimate the computing power resources required by the offline task based on the computing power resources used to execute each matching historical offline task, to obtain the first computing power resource required to execute the offline task.
  • the idle computing power resources of the N edge servers are greater than the first computing power resource required to execute the offline task, and the idle computing power resources of the N edge servers refer to the sum of the idle computing power resources of each edge server.
  • N can take a value of 1 or a value greater than 1.
  • when the value of N is 1, the centralized execution of the offline task is ensured, which facilitates the execution management of the offline task; when the value of N is greater than 1, distributed execution of the offline task can be realized.
  • this distributed execution method can not only ensure the execution progress of offline tasks, but also share the load among the edge servers, so as to ensure the normal operation of the cloud applications in each edge server.
  • the computer-readable instructions stored in the computer storage medium can be loaded and executed by the processor 901:
  • the distributed offline task includes the offline task received by the management server; or, the distributed offline task includes the subtask among N subtasks that matches the edge server, the N subtasks being obtained by splitting the offline task based on the idle computing power resources of each of the N edge servers; while ensuring normal execution of the cloud application, the idle computing power resources of the edge server are used to execute the distributed offline task.
  • the distributed offline task corresponds to an execution time threshold
  • the processor 901 performs the following steps when executing the distributed offline task using the idle computing power resources of the edge server:
  • the idle computing power resources of the edge server refer to the remaining computing power resources of the edge server other than the second computing power resource required to run the cloud application; the processor 901 is further configured to implement:
  • the computing power release operation includes suspending or terminating execution of the distributed offline task; if the time difference between the dwell time and the execution time threshold is greater than the time difference threshold, the computing power release operation includes suspending execution of the distributed offline task; if the time difference between the dwell time and the execution time threshold is less than the time difference threshold, the computing power release operation includes terminating execution of the distributed offline task.
  • the computing power release operation includes suspending execution of the distributed offline task, and after performing the computing power release operation, the processor 901 is further configured to execute:
  • the processor 901 is further configured to execute: during the process of the edge server executing the distributed offline task, if it is predicted that the time the edge server needs to complete the distributed offline task is greater than the execution time threshold, sending timeout prompt information to the management server, the timeout prompt information indicating that the time required for the edge server to complete the distributed offline task is greater than the execution time threshold and that the management server needs to reassign a new edge server to execute the distributed offline task.
  • the edge server receives a distributed offline task distributed by the management server.
  • the distributed offline task may be the offline task received by the management server, or the subtask among N subtasks that matches the edge server.
  • the N subtasks can be obtained by splitting the offline task based on the idle computing power resources of the N edge servers used to execute it; while ensuring normal operation of the cloud application, the idle computing power resources of the edge server are used to execute the distributed offline task.
  • in this way, the idle computing power resources in the edge server can be used to execute distributed offline tasks while guaranteeing normal operation of the cloud application, avoiding the waste of computing power resources in the edge server, improving the utilization of computing power resources, and thereby reducing the operating cost of edge servers.
  • an embodiment of the present invention further provides a computer program product, where the computer program product includes computer-readable instructions stored in a computer storage medium.
  • the processor 801 reads the computer-readable instructions from the computer storage medium, so that the server loads and executes: determining the first computing power resource required to execute the offline task; determining N edge servers used to execute the offline task, where a cloud application runs in the N edge servers, the idle computing power resources of the N edge servers are greater than the first computing power resource,
  • the idle computing power resources of the N edge servers refer to the sum of the idle computing power resources of each of the N edge servers, and N is an integer greater than or equal to 1; and distributing the offline task to the N edge servers, so that each of the N edge servers uses its idle computing power resources to execute the offline task while ensuring normal operation of the cloud application.
  • the idle computing power resources of the N edge servers are greater than the first computing power resource required to execute the offline task, where the idle computing power resources of the N edge servers refer to the sum of the idle computing power resources of each edge server.
  • N may take a value of 1 or greater than 1.
  • when N is 1, centralized execution of the offline task is ensured, which facilitates its execution management; when N is greater than 1, distributed execution of the offline task is realized.
  • this distributed execution approach not only ensures the execution progress of the offline task, but also shares the load among the edge servers, thereby ensuring the normal operation of the cloud application on each edge server.
  • the processor 901 reads the computer-readable instructions from the computer storage medium and executes them, so that the server executes:
  • the distributed offline task includes the offline task received by the management server; or, the distributed offline task includes the subtask among N subtasks that matches the edge server, the N subtasks being obtained by splitting the offline task based on the idle computing power resources of each of the N edge servers; while ensuring normal execution of the cloud application, the idle computing power resources of the edge server are used to execute the distributed offline task.
  • the edge server receives a distributed offline task distributed by the management server.
  • the distributed offline task may be the offline task received by the management server, or the subtask among N subtasks that matches the edge server.
  • the N subtasks can be obtained by splitting the offline task based on the idle computing power resources of the N edge servers used to execute it; while ensuring normal operation of the cloud application, the idle computing power resources of the edge server are used to execute the distributed offline task.
  • in this way, the idle computing power resources in the edge server can be used to execute distributed offline tasks while guaranteeing normal operation of the cloud application, avoiding the waste of computing power resources in the edge server, improving the utilization of computing power resources, and thereby reducing the operating cost of edge servers.


Abstract

Embodiments of the present invention disclose a service processing method, apparatus, server, storage medium, and computer program product, relating to resource scheduling in cloud technology. The method includes: determining a first computing power resource required to execute an offline task; determining N edge servers for executing the offline task, a cloud application running in the N edge servers; the idle computing power resources of the N edge servers being greater than the first computing power resource, the idle computing power resources of the N edge servers referring to the sum of the idle computing power resources of each of the N edge servers; and distributing the offline task to the N edge servers, so that each of the N edge servers uses its idle computing power resources to execute the offline task while ensuring normal operation of the cloud application. The embodiments of the present invention improve the resource utilization of edge servers.

Description

Service processing method, apparatus, server, storage medium, and computer program product
This application claims priority to Chinese patent application No. 202110884435.7, entitled "Service Processing Method and Apparatus", filed with the China National Intellectual Property Administration on August 2, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of computer technology, and in particular to a service processing method, apparatus, server, storage medium, and computer program product.
Background
A cloud application is a new type of application that turns the traditional locally-installed, locally-computed way of using software into a ready-to-use service: it connects to and controls remote server clusters over the Internet or a local area network to complete logic or computing tasks. Simply put, a cloud application is one whose execution and computation depend on a cloud server, while the terminal is only responsible for displaying the picture. Cloud gaming is a typical example: based on cloud computing technology, the game runs on a remote server, and the terminal neither downloads nor installs it, nor needs to consider the terminal configuration, completely solving the problem that terminals with insufficient performance cannot run heavyweight games.
It follows that running cloud applications requires extremely low network latency and extremely high network stability, which traditional network environments clearly cannot satisfy. Therefore, to provide users with a more stable network environment, edge servers are generally deployed at large scale so that the cloud application's servers are closer to users. Usually, the online presence of cloud application users shows an obvious tidal pattern: for cloud gaming, for example, many game users are online during peak periods such as the evening, while relatively few are online during off-peak periods such as the morning or noon. However, to provide a better cloud application experience, resources are always provisioned for the maximum number of online users, so that part of the resources sit idle during off-peak periods, wasting resources. Therefore, in the field of cloud applications, how to avoid resource waste and improve resource utilization has become one of the hot research topics today.
Summary
According to various embodiments of this application, a service processing method, apparatus, server, storage medium, and computer program product are provided.
A service processing method, including:
determining a first computing power resource required to execute an offline task;
determining N edge servers for executing the offline task, a cloud application running in the N edge servers; the idle computing power resources of the N edge servers being greater than the first computing power resource, the idle computing power resources of the N edge servers referring to the sum of the idle computing power resources of each of the N edge servers, N being an integer greater than or equal to 1; and
distributing the offline task to the N edge servers, so that each of the N edge servers uses its idle computing power resources to execute the offline task while ensuring normal operation of the cloud application.
A service processing method, executed by one edge server among N edge servers for executing an offline task, a cloud application running in the N edge servers, the service processing method including:
receiving a distributed offline task scheduled by a management server, the distributed offline task including the offline task received by the management server; or, the distributed offline task including the subtask among N subtasks that matches the edge server, the N subtasks being obtained by splitting the offline task based on the idle computing power resources of each of the N edge servers; and
executing the distributed offline task using the idle computing power resources of the edge server while ensuring normal execution of the cloud application.
A service processing apparatus, including:
a determining unit, configured to determine a first computing power resource required to execute an offline task;
the determining unit being further configured to determine N edge servers for executing the offline task, a cloud application running in the N edge servers; the idle computing power resources of the N edge servers being greater than the first computing power resource, the idle computing power resources of the N edge servers referring to the sum of the idle computing power resources of each of the N edge servers, N being an integer greater than or equal to 1; and
a scheduling unit, configured to distribute the offline task to the N edge servers, so that each of the N edge servers uses its idle computing power resources to execute the offline task while ensuring normal operation of the cloud application.
A service processing apparatus, including:
a receiving unit, configured to receive a distributed offline task scheduled by a management server, the distributed offline task including the offline task received by the management server, or including the subtask among N subtasks that matches an edge server, the N subtasks being obtained by splitting the offline task based on the idle computing power resources of each of the N edge servers; the N edge servers being used to execute the offline task, a cloud application running in the N edge servers; and
an executing unit, configured to execute the distributed offline task using the idle computing power resources of the edge server while ensuring normal execution of the target cloud application.
A server, including:
a processor adapted to implement one or more computer-readable instructions; and a computer storage medium storing one or more computer-readable instructions adapted to be loaded by the processor to execute:
determining a first computing power resource required to execute an offline task;
determining N edge servers for executing the offline task, a cloud application running in the N edge servers; the idle computing power resources of the N edge servers being greater than the first computing power resource, the idle computing power resources of the N edge servers referring to the sum of the idle computing power resources of each of the N edge servers, N being an integer greater than or equal to 1; and
distributing the offline task to the N edge servers, so that each of the N edge servers uses its idle computing power resources to execute the offline task while ensuring normal operation of the cloud application.
A server, including:
a processor adapted to implement one or more computer-readable instructions; and a computer storage medium storing one or more computer-readable instructions adapted to be loaded by the processor to execute:
receiving a distributed offline task scheduled by a management server, the distributed offline task including the offline task received by the management server; or, the distributed offline task including the subtask among N subtasks that matches the edge server, the N subtasks being obtained by splitting the offline task based on the idle computing power resources of each of the N edge servers; and
executing the distributed offline task using the idle computing power resources of the edge server while ensuring normal execution of the cloud application.
A computer storage medium storing computer-readable instructions which, when executed by a processor, are used to perform:
determining a first computing power resource required to execute an offline task;
determining N edge servers for executing the offline task, a cloud application running in the N edge servers; the idle computing power resources of the N edge servers being greater than the first computing power resource, the idle computing power resources of the N edge servers referring to the sum of the idle computing power resources of each of the N edge servers, N being an integer greater than or equal to 1; and
distributing the offline task to the N edge servers, so that each of the N edge servers uses its idle computing power resources to execute the offline task while ensuring normal operation of the cloud application.
Or, when executed by a processor, the computer-readable instructions are used to perform:
receiving a distributed offline task scheduled by a management server, the distributed offline task including the offline task received by the management server; or, the distributed offline task including the subtask among N subtasks that matches the edge server, the N subtasks being obtained by splitting the offline task based on the idle computing power resources of each of the N edge servers; and
executing the distributed offline task using the idle computing power resources of the edge server while ensuring normal execution of the cloud application.
A computer program product or computer-readable instructions, the computer program product including computer-readable instructions stored in a computer storage medium; a processor of a server reads the computer-readable instructions from the computer storage medium and executes them, so that the server performs:
determining a first computing power resource required to execute an offline task;
determining N edge servers for executing the offline task, a cloud application running in the N edge servers; the idle computing power resources of the N edge servers being greater than the first computing power resource, the idle computing power resources of the N edge servers referring to the sum of the idle computing power resources of each of the N edge servers, N being an integer greater than or equal to 1; and
distributing the offline task to the N edge servers, so that each of the N edge servers uses its idle computing power resources to execute the offline task while ensuring normal operation of the cloud application.
Or, the processor of the server reads the computer-readable instructions from the computer storage medium and executes them, so that the server performs:
receiving a distributed offline task scheduled by a management server, the distributed offline task including the offline task received by the management server; or, the distributed offline task including the subtask among N subtasks that matches the edge server, the N subtasks being obtained by splitting the offline task based on the idle computing power resources of each of the N edge servers; and
executing the distributed offline task using the idle computing power resources of the edge server while ensuring normal execution of the cloud application.
Brief Description of the Drawings
To explain the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description illustrate some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic structural diagram of a cloud application management system provided by an embodiment of the present invention;
Fig. 2 is a schematic flowchart of a service processing method provided by an embodiment of the present invention;
Fig. 3 is a schematic flowchart of another service processing method provided by an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a service processing system provided by an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a service processing apparatus provided by an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of another service processing apparatus provided by an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a server provided by an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of another server provided by an embodiment of the present invention.
Detailed Description
To make the objectives, technical solutions, and advantages of this application clearer, this application is further described in detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain this application and are not intended to limit it.
Embodiments of the present invention provide a service processing scheme that can make full use of edge server resources, improve resource utilization, and reduce the operating cost of a cloud application's edge servers. In this scheme, when the management server receives an offline task to be executed, it first evaluates the first computing power resource required to execute the task; it then determines N edge servers for executing the task, noting that the idle computing power resources of these N edge servers are greater than the first computing power resource; and it then distributes the offline task to the N edge servers, which execute it using their respective idle computing power resources while ensuring normal operation of the cloud application. N may be greater than or equal to 1. When N is 1, distributed scheduling means assigning the offline task to that single edge server for standalone execution. When N is greater than 1, distributed scheduling means assigning the offline task to multiple edge servers for joint execution: the task may be split into several subtasks, each edge server being assigned one subtask, which shares the load of each edge server and ensures normal operation of the cloud application on each; alternatively, the offline task may be assigned to each edge server separately, with multiple edge servers executing the same offline task to improve its execution rate. With the service processing scheme provided by the embodiments of the present invention, whether during peak or off-peak periods of the cloud application, the idle computing power resources of each edge server can be used to execute offline tasks while ensuring normal operation of the cloud application, avoiding the waste of computing power resources in each edge server, improving the utilization of computing power resources, and thereby reducing the operating cost of edge servers.
Based on the above service processing scheme, an embodiment of the present invention provides a cloud application management system. Referring to Fig. 1, a schematic structural diagram of a cloud application management system provided by an embodiment of the present invention, the system shown in Fig. 1 includes at least one edge server 101, which can be used to run cloud applications; it should be noted that the same or different cloud applications may run in the at least one edge server 101. Common cloud applications include cloud gaming, cloud education, cloud conferencing, cloud social networking, and so on.
Optionally, the at least one edge server 101 may be allocated to multiple edge computing nodes. An edge computing node can be regarded as a node for performing edge computing; edge computing refers to an open platform that integrates network, computing, storage, and application core capabilities on the side close to the source of objects or data, providing nearest-end services nearby, with applications initiated at the edge side to produce faster network service responses and satisfy the industry's basic needs in real-time services, application intelligence, security, and privacy protection. One or more edge servers run in each edge computing node; these edge servers have graphics processing computing capabilities, and each edge server can be called a computing node. For example, in Fig. 1, edge computing node 1 includes four edge servers, and edge computing node 2 may also include four edge servers.
In one embodiment, the cloud application management system shown in Fig. 1 may also include a cloud application server 102 connected to the at least one edge server 101; the cloud application server 102 can provide each edge server 101 with the running data of the cloud application, so that each edge server 101 can run the cloud application based on the running data provided by the cloud application server.
In one embodiment, the cloud application management system shown in Fig. 1 may also include a terminal 103, which may be connected to the at least one edge server 101; the terminal 103 is used to receive and display the pictures that the edge server 101 renders for the cloud application, for example displaying the game picture rendered by the edge server 101. The terminal 103 may be a mobile smart terminal, a class of devices with rich human-computer interaction, Internet access capability, various operating systems, and strong processing capability. The terminal 103 may include a smartphone, tablet computer, notebook computer, desktop computer, smart speaker, smart watch, in-vehicle terminal, smart TV, and the like.
In one embodiment, the cloud application management system shown in Fig. 1 may also include a management server 104 connected to the terminal 103 and the at least one edge server 101. The management server 104 can be used to manage and schedule the at least one edge server 101: for example, when it detects that any cloud application is started in any terminal, the management server 104 can select one or more suitable edge servers to execute that cloud application according to the current load and idle computing power resources of each edge server 101. For another example, when a user submits an offline task to the management server 104, the management server 104 determines, according to the idle computing power resources of each edge server, to schedule the offline task to one or more edge servers 101; the edge server(s) 101 assigned the offline task execute it using their idle computing power resources while ensuring normal operation of their own cloud applications. In this way, normal operation of the cloud application is guaranteed, the waste of idle computing power resources on the servers running the cloud application is avoided, the resource utilization of each edge server is improved, and the operating cost of the edge servers can thus be reduced.
Based on the above service processing system, an embodiment of the present invention provides a service processing method. Referring to Fig. 2, a schematic flowchart of a service processing method provided by an embodiment of the present invention, the method shown in Fig. 2 can be executed by a management server, specifically by the processor of the management server, and may include the following steps:
Step S201: Determine the first computing power resource required to execute an offline task.
An offline task is a task that does not need to be completed online in real time, such as offline rendering of video effects or offline training of an artificial intelligence model.
Optionally, the first computing power resource differs depending on the type of the main workload of the offline task. Specifically, if the main workload of the offline task is of the graphics processing unit (GPU) type, that is, the workload is concentrated on the graphics processor, the first computing power resource may include any one or more of: network bandwidth, memory, the number of floating-point operations per second (FLOPS) performed by the graphics processor, the number of operations per second (OPS) performed by the graphics processor, and throughput.
If the main workload of the offline task is of the central processing unit (CPU) type, that is, the workload is concentrated on the central processor, the first computing power resource may include any one or more of: memory, network bandwidth, the FLOPS performed by the central processor, and the OPS performed by the central processor. If the main workload of the offline task is of a mixed type, that is, executing the task requires both the CPU and the GPU, the first computing power resource is the combination of the first computing power resources of the above two types.
The floating-point operations per second can be divided into half precision, single precision, and double precision. When calculating the FLOPS of a graphics processor, the half-precision, single-precision, and double-precision floating-point operations per second performed by the graphics processor need to be calculated separately. Similarly, when calculating the FLOPS of a central processor, the half-precision, single-precision, and double-precision floating-point operations per second performed by the central processor also need to be calculated separately. When FLOPS is used as a standard to measure computing power, common scales include teraFLOPS (TFLOPS), gigaFLOPS (GFLOPS), megaFLOPS (MFLOPS), and petaFLOPS (PFLOPS).
When OPS is used as a standard to measure computing power, common scales include million operations per second (MOPS), giga operations per second (GOPS), and tera operations per second (TOPS).
In one embodiment, the first computing power resource required to execute the offline task can be estimated based on the computing power resources used when executing historical offline tasks similar to the offline task. In a specific implementation, determining the first computing power resource required to execute the offline task may include: determining, based on the correspondence between task types and computational complexity, the computational complexity corresponding to the task type of the offline task; finding at least one matching historical offline task from the historical offline tasks according to the determined computational complexity, where the computational complexity of each matching historical offline task matches the determined computational complexity; and estimating the computing power resources required by the offline task based on the computing power resources used to execute each matching historical offline task, to obtain the first computing power resource required to execute the offline task.
Tasks can be classified according to their content: for example, if the task content is rendering an offline video, the task type may be video rendering; if the task content is training a model, the task type may be model training. Computational complexity may also be called algorithmic complexity, which refers to the resources an algorithm needs at runtime after being written as an executable program, including time resources and memory resources; time resources can be measured by the FLOPS and OPS described above. In the embodiments of the present invention, executing an offline task essentially means executing the executable program into which the offline task has been written.
The correspondence between task types and computational complexity may be determined from the computational complexity of executing historical offline tasks, such as the computational complexity of executing model-training tasks or of executing offline video-rendering tasks. The computational complexity corresponding to a task type can reflect the order of magnitude of the complexity of executing that task type.
Optionally, finding at least one matching historical offline task from the historical offline tasks according to the determined computational complexity may include: finding, among the historical offline tasks, those whose computational complexity matches the determined computational complexity, and determining them as matching historical offline tasks. Two computational complexities match when the complexity difference between them is less than a specified value. It should be noted that among completed historical offline tasks, not only may those with the same task type as the offline task have a matching computational complexity, but those with a different task type may as well. Therefore, matching historical offline tasks are selected according to the computational complexity corresponding to the offline task rather than the task type it belongs to; in this way more matching historical offline tasks can be selected from the historical offline tasks, and the first computing power resource estimated from the computing power resources they used is more accurate.
Of course, in practical applications, historical offline tasks with the same task type as the offline task may also be selected as matching historical offline tasks, and the first computing power resource estimated from the computing power resources they used; this is not specifically limited in the embodiments of the present invention and can be chosen flexibly according to actual needs.
As noted above, the first computing power resource may include any one or more of graphics processor computing power resources, central processor computing power resources, memory, network bandwidth, and network throughput. Estimating the computing power resources required by the offline task based on the computing power resources used to execute each matching historical offline task may include: estimating each corresponding resource of the offline task based on each kind of computing power resource used when executing each matching historical offline task, for example estimating the graphics computing power required by the offline task from the graphics computing power used by each matching historical task, or estimating the memory required by the offline task from the memory used by each matching historical task.
In a specific implementation, the estimation may include: averaging the computing power resources used to execute each matching historical offline task, and taking the result as the first computing power resource required to execute the offline task.
In other embodiments, the estimation includes: assigning a weight to each matching historical offline task according to the relationship between its task type and the task type of the offline task, performing a weighted average over the at least one matching historical offline task based on these weights, and taking the result as the first computing power resource required to execute the offline task. For example, a matching historical offline task with the same task type as the offline task may be given a somewhat higher weight, while one belonging to a different task type may be given a lower weight.
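The weighted-average estimation described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the dict-based data model, the resource names, and the concrete weight values (higher weight for same-type tasks) are all assumptions made for the example.

```python
def estimate_first_resource(matched_tasks, offline_task_type):
    """Estimate each kind of required resource as a weighted average over
    matching historical offline tasks. Each task is a dict with a
    'task_type' string and a 'resources' mapping (names are illustrative)."""
    totals, weight_sum = {}, 0.0
    for task in matched_tasks:
        # A task of the same type as the offline task gets a higher weight
        # (the 2:1 ratio here is an assumption, not from the patent).
        w = 2.0 if task["task_type"] == offline_task_type else 1.0
        weight_sum += w
        for name, amount in task["resources"].items():
            totals[name] = totals.get(name, 0.0) + w * amount
    # Average each resource kind separately, as same-type resources
    # are always combined with each other.
    return {name: total / weight_sum for name, total in totals.items()}
```

Setting all weights equal reduces this to the plain averaging strategy mentioned first.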
In one embodiment, the offline task may correspond to an execution time threshold; the first computing power resource required to execute the offline task may then specifically refer to the first computing power resource required to complete the offline task within the execution time threshold. Different execution time thresholds may yield different first computing power resources.
Step S202: Determine N edge servers for executing the offline task, a cloud application running in the N edge servers, the idle computing power resources of the N edge servers being greater than the first computing power resource.
It should be noted that the cloud application is deployed to run in M edge servers, that is, M edge servers participate in running the cloud application, and the N edge servers in step S202 are selected from these M edge servers. When selecting the N edge servers for executing the offline task from the M edge servers, the selection may be based directly on the idle computing power resources of each of the M edge servers and the first computing power resource. The idle computing power resources of each edge server may be determined based on the second computing power resource required to run the cloud application on that server and the server's total computing power resources. The second computing power resource required to run the cloud application on each edge server may also be estimated by the management server based on the computing power resources historically used to run the cloud application: for example, the management server may obtain the computing power resources used in multiple historical runs of the cloud application and average them to estimate the second computing power resource required by the edge server to run the cloud application. It should be understood that, as noted above, computing power resources may include multiple kinds; when determining each kind, the corresponding kind of resource required by each run of the cloud application is averaged to obtain that kind of resource needed by the edge server to run the cloud application.
Optionally, N is an integer greater than or equal to 1. When N equals 1, any edge server among the M edge servers whose idle computing power resources are greater than the first computing power resource may be selected as the edge server to execute the offline task.
When N is greater than 1, in one embodiment, determining the N edge servers for executing the offline task includes: comparing the idle computing power resources of each of the M edge servers with the first computing power resource, and determining the N edge servers whose idle computing power resources are greater than the first computing power resource as the N edge servers for executing the offline task. The idle computing power resources of each edge server may be determined based on its total computing power resources and the second computing power resource required to run the cloud application: for example, subtracting the second computing power resource from each edge server's total computing power resources to obtain its idle computing power resources; or adding a reserved computing power to the second computing power resource required to run the cloud application in each edge server and subtracting the sum from the server's total computing power resources to obtain its idle computing power resources. The purpose of the latter is to reserve part of the computing power resources for running the cloud application, so that when the computing power resources the cloud application needs suddenly increase, the edge server can still respond in time rather than reducing the cloud application's running speed and response efficiency.
Simply put, the above method of determining the N edge servers is: taking all edge servers among the M whose idle resources are greater than the first computing power resource as the N edge servers for executing the offline task. Although the idle computing power resources of each of these N edge servers are sufficient for each to execute the offline task alone, in the embodiments of the present invention the offline task can be distributed to these N edge servers for joint execution, so that the first computing power resource required by the offline task is spread across different edge servers and each can set aside some surplus computing power; thus, when the computing power resources needed by the cloud application on some edge server increase, the reserved computing power can be allocated to the cloud application in time without suspending execution of the offline task.
In another embodiment, determining the N edge servers for executing the offline task includes: comparing the idle computing power resources of each of the M edge servers with the first computing power resource; if no edge server's computing power resources are greater than the first computing power resource, combining the M edge servers into multiple combinations, each combination including at least two edge servers; calculating the sum of the idle computing power resources of each combination, and determining the edge servers in a combination whose sum is greater than the first computing power resource as the N edge servers for executing the offline task. That is, when no edge server among the M has idle computing power resources greater than the first computing power resource, the sum of the idle computing power resources of the N selected edge servers is greater than the first computing power resource.
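The two selection strategies above (single servers first, then combinations of servers whose idle resources sum past the requirement) can be sketched as follows. This is a simplified illustration under assumptions: the resource is reduced to one scalar (the patent compares each resource type separately), and the brute-force combination search is just one possible realization.

```python
from itertools import combinations

def select_servers(idle, required):
    """Pick edge servers to run the offline task.
    idle: {server_id: idle resource amount}; required: the first
    computing power resource, simplified to a single scalar.
    Returns server ids whose combined idle resources exceed `required`,
    preferring servers that can each run the task alone."""
    singles = [s for s, r in idle.items() if r > required]
    if singles:
        return singles  # each of these can execute the task by itself
    # No single server suffices: search combinations of growing size
    # until one whose idle resources sum past the requirement is found.
    for k in range(2, len(idle) + 1):
        for combo in combinations(sorted(idle), k):
            if sum(idle[s] for s in combo) > required:
                return list(combo)
    return []  # even all servers together cannot cover the task
```

In practice an exhaustive search over combinations would be replaced by a cheaper heuristic (e.g. greedily adding the least-loaded servers), which the patent leaves open.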
Optionally, the M edge servers in the embodiments of the present invention may be allocated to P edge computing nodes, each edge computing node including one or more edge servers; for example, the P edge computing nodes include edge computing node 1 and edge computing node 2, where node 1 may include 5 edge servers and node 2 may include M-5 edge servers.
On this basis, when determining the N edge servers for executing the offline task from the M edge servers, L edge computing nodes may first be determined from the P edge computing nodes, the node idle computing power resources of these L edge computing nodes being greater than the first computing power resource, and the N edge servers then selected from the determined L edge computing nodes. In a specific implementation, determining the N edge servers for executing the offline task includes the following steps:
S1: Select L edge computing nodes from the P edge computing nodes, the node idle computing power resources of the L edge computing nodes being greater than the first computing power resource.
The node idle computing power resources of the L edge computing nodes are the sum of the node idle computing power resources of each of the L edge computing nodes. The condition that they are greater than the first computing power resource may cover any of the following cases: the node idle computing power of each of the L edge computing nodes is greater than the first computing power resource; some of the L edge computing nodes have node idle computing power greater than the first computing power resource while the sum of the remaining nodes' idle computing power resources is greater than the first computing power resource; or the node idle computing power resources of each of the L edge computing nodes are less than the first computing power resource, but their sum is greater than it.
Simply put, when selecting the L edge computing nodes from the P edge computing nodes, one may select only edge computing nodes whose node idle computing power resources exceed the first computing power resource; or select both such nodes and a group of nodes whose combined node idle computing power resources exceed the first computing power resource; or, if no edge computing node among the P has node idle computing power resources greater than the first computing power resource, select only edge computing nodes whose combined node idle computing power resources exceed it.
The node idle computing power resources of each edge computing node are determined based on the idle computing power resources of the edge servers it includes. For example, the idle computing power resources of the multiple edge servers in an edge computing node may be summed to obtain that node's node idle computing power resources; or, they may be averaged to obtain the node's node idle computing power resources. It should be noted that the embodiments of the present invention list only two ways of computing node idle computing power resources; in specific applications, any method may be used according to actual needs, without specific limitation in the embodiments of the present invention.
S2: Determine at least one candidate edge server from the edge servers included in the L edge computing nodes, based on the attribute information of each edge server included in the L edge computing nodes.
In one embodiment, the attribute information of each edge server may include its working state, which may be an idle state or a busy state: when an edge server's load exceeds its load ceiling, its working state is determined to be busy; conversely, when its load is below the load ceiling, its working state is idle. Edge servers in the busy state are not scheduled to execute offline tasks, while those in the idle state may be scheduled offline tasks. Therefore, determining multiple candidate edge servers from the edge servers included in the L edge computing nodes based on their attribute information includes: determining as candidate edge servers those edge servers included in the L edge computing nodes whose working state is the idle state.
In another embodiment, the attribute information of each edge server includes the server type group to which it belongs, the server type groups including a preset whitelist group and a normal group. Edge servers in the preset whitelist group are used to run high-priority, non-interruptible real-time cloud applications, so they are not scheduled offline tasks; edge servers in the normal group may be scheduled offline tasks. The preset whitelist group changes dynamically: when an edge server in the preset whitelist group no longer executes a high-priority, non-interruptible real-time task, it is removed from the preset whitelist group and may be moved to the normal group. Therefore, determining multiple candidate edge servers from the edge servers included in the L edge computing nodes based on their attribute information includes: determining as candidate edge servers those edge servers included in the L edge computing nodes whose server type group is the normal group.
S3: Determine N edge servers from the at least one candidate edge server according to the idle computing power resources of each candidate edge server and the first computing power resource.
After some candidate edge servers are obtained through step S2, N edge servers are selected from the at least one candidate edge server. In one embodiment, the idle computing power resources of each candidate edge server may be compared with the first computing power resource, and the edge servers whose idle computing power resources are greater than the first computing power resource determined as the N edge servers.
In another embodiment, the N edge servers consist of the candidate edge servers whose idle computing power resources individually exceed the first computing power resource together with multiple candidate edge servers whose combined idle computing power resources exceed it. For example, suppose the candidates include edge server 1, edge server 2, and edge server 3; the idle computing power resources of edge server 1 are greater than the first computing power resource, those of edge servers 2 and 3 individually are not, but the sum of the idle computing power resources of edge servers 2 and 3 is greater than the first computing power resource; then edge servers 1, 2, and 3 may all serve as the N edge servers for executing the offline task.
In yet another embodiment, if no candidate edge server's idle computing power exceeds the first computing power resource, multiple edge servers whose combined idle computing power exceeds the first computing power resource serve as the N edge servers.
It should be noted that the first computing power resource may include at least one of CPU computing power resources and GPU computing power resources, and the idle computing power resources of each edge server may include any one or more of CPU computing power resources, GPU computing power resources, network bandwidth, throughput, and memory. All of the comparisons, additions, and averaging of computing power resources mentioned above are performed between resources of the same type. For example, suppose the first computing power resource includes GPU computing power resources comprising the half-precision, single-precision, and double-precision floating-point operations per second performed by the GPU, and the idle computing power resources of any edge server likewise include GPU computing power resources comprising those three figures; when comparing an edge server's idle computing power resources with the first computing power resource, what is specifically compared is the magnitude relation between the two half-precision FLOPS figures, between the two single-precision FLOPS figures, and between the two double-precision FLOPS figures.
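The same-type, component-by-component comparison just described can be sketched as follows; a minimal illustration, with resource names chosen for the example (the patent does not prescribe a data model).

```python
def resource_covers(idle, required):
    """Return True only if, for every resource type in `required`, the
    idle amount of that same type is strictly greater. Half-, single-,
    and double-precision FLOPS are separate types compared independently;
    a type missing from `idle` counts as zero."""
    return all(idle.get(name, 0) > amount for name, amount in required.items())
```

Summing or averaging idle resources across servers would likewise be done per resource type before applying this check.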
Step S203: Distribute the offline task to the N edge servers, so that each of the N edge servers uses its idle computing power resources to execute the offline task while ensuring normal operation of the cloud application.
In one embodiment, distributing the offline task to the N edge servers may include: splitting the offline task into N subtasks based on the idle computing power resources of each of the N edge servers, each of the N subtasks matching one edge server, the idle computing power resources of each subtask's matched edge server being greater than the computing power resources required to execute that subtask; and assigning each subtask to its matched edge server, so that each edge server executes its matched subtask.
As an optional implementation, if the idle computing power resources of each of the N edge servers are all greater than the first computing power resource, splitting the offline task into N subtasks based on each edge server's idle computing power resources includes: splitting the offline task evenly into N subtasks, the computing power resources required to execute each subtask being equal to the first computing power resource divided by N; for example, if the first computing power resource is x and the offline task is split evenly into 5 subtasks, the resources required to execute each subtask equal x/5; the N subtasks are then scheduled to the N edge servers. In this case, each subtask matching one edge server may mean that any subtask matches any edge server.
As another implementation, if no edge server among the N has idle computing power resources greater than the first computing power resource, but the sum of the N edge servers' idle computing power resources is greater than the first computing power resource, then splitting the offline task into N subtasks based on each edge server's idle computing power resources includes: assigning each edge server a subtask whose required computing power resources are less than that edge server's idle computing power resources, each subtask then corresponding to a fixed edge server. For example, suppose the N edge servers include edge server 1, edge server 2, and edge server 3, with idle computing power resources x1, x2, and x3 respectively; the offline task is split into 3 subtasks, where subtask 1 matches edge server 1 and requires at most x1 to execute, subtask 2 matches edge server 2 and requires at most x2, and subtask 3 matches edge server 3 and requires at most x3.
As yet another optional implementation, if the N edge servers include some whose idle computing power resources exceed the first computing power resource and some whose do not, then splitting the offline task into N subtasks based on each of the N edge servers' idle computing power resources may include: first splitting off, according to the idle computing power resources of each edge server whose idle resources do not exceed the first computing power resource, the subtasks matched to those edge servers; then splitting the remaining part of the offline task evenly and assigning it to the edge servers whose idle computing power resources exceed the first computing power resource.
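One simple way to realize the capacity-aware split above is to give each server a share of the workload proportional to its idle resources, so no share exceeds its server's capacity. This is a sketch under assumptions: workload and resources are collapsed to one scalar each, and the proportional rule is only one of the strategies the embodiments allow (even splitting is another).

```python
def split_task(total_work, idle):
    """Split an offline task's total workload into per-server subtask
    shares proportional to each server's idle resources.
    idle: {server_id: idle resource amount}."""
    capacity = sum(idle.values())
    # The N servers are chosen so their combined idle resources exceed
    # the requirement; a proportional share then fits each server.
    assert capacity > total_work, "combined idle resources must exceed the need"
    return {s: total_work * r / capacity for s, r in idle.items()}
```

Each share is strictly below its server's idle amount (since `total_work < capacity`), matching the condition that a matched server's idle resources exceed its subtask's needs.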
The above are only some of the implementations of distributing the offline task to N edge servers listed in the embodiments of the present invention; in practical applications, other ways of distributing the offline task to the N edge servers may be chosen according to specific needs, without specific limitation in the embodiments of the present invention.
In one embodiment, while each edge server executes its matched subtask, the management server may monitor each edge server's execution of its matched subtask; if any of the N edge servers is detected to have an exception while executing its matched subtask, another edge server is reselected to execute that edge server's matched subtask. Optionally, the management server monitors each edge server's execution based on the task execution state each edge server reports; detecting that any of the N edge servers has an exception while executing its matched subtask may include: the task execution state reported by that edge server to the management server indicates an exception during execution of the subtask; or the management server has not received a task execution state reported by that edge server for a long time.
As noted above, the offline task corresponds to an execution time threshold within which the N edge servers must complete it; since the offline task is split into N subtasks, each subtask also corresponds to an execution time threshold, which may equal the execution time threshold of the offline task. While each edge server executes its matched subtask, if any edge server finds that it cannot finish the subtask within the subtask's execution time threshold, that edge server must report timeout prompt information to the management server, the timeout prompt information instructing the management server to reassign a new edge server to execute the matched subtask of the edge server that reported it.
In the embodiments of the present invention, after the offline task to be executed is received, the first computing power resource required to execute it is first evaluated; further, N edge servers for executing the offline task are obtained, the idle computing power resources of these N edge servers being greater than the first computing power resource required to execute the offline task, where the idle computing power resources of the N edge servers refer to the sum of the idle computing power resources of each edge server. The offline task is distributed to the N edge servers, so that each of the N edge servers uses its idle computing power resources to execute the offline task while ensuring normal operation of the cloud application. In this way, whether during peak or off-peak periods of the cloud application, the idle computing power resources of each edge server can be used to execute offline tasks while ensuring normal operation of the cloud application, avoiding the waste of computing power resources in each edge server, improving the utilization of computing power resources, and thereby reducing the operating cost of edge servers. In addition, N may be 1 or greater than 1: when N is 1, centralized execution of the offline task is ensured, facilitating its execution management; when N is greater than 1, distributed execution of the offline task is realized, which not only ensures the execution progress of the offline task but also shares the load of each edge server, thereby ensuring that the cloud application in each edge server can run normally.
Based on the above embodiment of the service processing method, an embodiment of the present invention provides another service processing method. Referring to Fig. 3, a schematic flowchart of another service processing method provided by an embodiment of the present invention, the method shown in Fig. 3 may be executed by an edge server among the N edge servers, specifically by the processor of the edge server; the edge server may be any one of the N edge servers, and a cloud application runs in the N edge servers. The method shown in Fig. 3 may include the following steps:
Step S301: Receive a distributed offline task scheduled by the management server.
The distributed offline task may be the offline task received by the management server, or any one of the N subtasks into which the offline task has been split. The idle computing power resources of the edge server are greater than the computing power resources required to execute the distributed offline task.
The N subtasks are obtained by splitting the offline task based on the idle computing power resources of each of the N edge servers executing it; for the specific implementation see the description of step S203 in the embodiment of Fig. 2, which is not repeated here.
Optionally, before receiving the distributed offline task scheduled by the management server, the edge server may tally its idle computing power resources and report them to the management server. The edge server's idle computing power resources may be determined based on the edge server's total computing power resources and the second computing power resource required to run the cloud application.
As an optional implementation, the second computing power resource may be subtracted from the edge server's total computing power resources, the result of the subtraction serving as the edge server's idle computing power resources; that is, the edge server's idle computing power resources may refer to the remaining computing power resources of the edge server other than the second computing power resource for running the cloud application.
As another optional implementation, some reserved computing power resources may be set: the reserved computing power resources and the second computing power resource required to run the cloud application are subtracted from the edge server's total computing power resources, and the remaining computing power resources are the edge server's idle computing power resources. In this way, if the computing power resources required to run the cloud application suddenly increase, part of the reserved computing power resources can be used to run the cloud application without interrupting the execution of the distributed offline task.
In practical applications, different cloud applications, and different scenes of the same cloud application, have different computing power requirements. When calculating the second computing power resource required to run the cloud application, the computing power resources required by each scene of the cloud application may be calculated, and then the minimum across the different scenes taken as the second computing power resource the edge server needs to run the cloud application; or the maximum across the different scenes may be taken; or the average of the resources required by the different scenes may be taken as the second computing power resource. The second computing power resource may include any one or more of CPU computing power resources, GPU computing power resources, memory, network bandwidth, and throughput; CPU computing power resources may generally include at least one of the floating-point operations per second performed by the CPU and the operations per second performed by the CPU, and GPU computing power resources may include at least one of the floating-point operations per second performed by the GPU and the operations per second performed by the GPU. Since both the single edge server and the edge computing node it belongs to affect the idle computing power resources, the network bandwidth is determined from the intranet bandwidth and the extranet bandwidth; specifically, the smaller of the intranet bandwidth and the extranet bandwidth may be taken as the edge server's network bandwidth.
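The per-scene estimation of the second computing power resource, together with the min(intranet, extranet) bandwidth rule, can be sketched as follows; a minimal illustration, with the function name, the scalar per-scene cost, and the `mode` parameter being assumptions for the example.

```python
def second_resource(scenario_costs, intranet_bw, extranet_bw, mode="avg"):
    """Estimate the second computing power resource needed to run the
    cloud application. scenario_costs: compute cost of each scene of the
    application (one scalar per scene); mode selects one of the three
    strategies described (minimum, maximum, or average across scenes).
    Network bandwidth is the smaller of intranet and extranet bandwidth."""
    if mode == "min":
        compute = min(scenario_costs)
    elif mode == "max":
        compute = max(scenario_costs)
    else:  # "avg"
        compute = sum(scenario_costs) / len(scenario_costs)
    return {"compute": compute, "bandwidth": min(intranet_bw, extranet_bw)}
```

Choosing "max" is the conservative option (the cloud application never starves), while "min" or "avg" frees more idle resources for offline tasks at the cost of more frequent release operations.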
Step S302: Execute the distributed offline task using the edge server's idle computing power resources while ensuring normal operation of the cloud application.
As noted above, the offline task and each subtask obtained by splitting it correspond to an execution time threshold, so the distributed offline task also corresponds to an execution time threshold. Optionally, executing the distributed offline task using the idle computing power resources of the edge server includes: determining, based on the edge server's idle computing power resources, the time required to complete the distributed offline task; if the required time is less than the execution time threshold corresponding to the distributed offline task, executing the distributed offline task using the edge server's idle computing power resources.
In one embodiment, the idle computing power resources of each edge server may refer to the remaining computing power resources of the edge server other than the second computing power resource required to run the cloud application. While the edge server is executing the distributed offline task, if it detects that the resources required to run the cloud application in the edge server suddenly increase beyond the second computing power resource, the edge server may need to perform a computing power release operation to ensure normal operation of the cloud application. In a specific implementation, the dwell time of the distributed offline task in the edge server may be obtained, and the computing power release operation performed according to the relationship between the dwell time and the execution time threshold corresponding to the distributed offline task.
If the time difference between the dwell time and the execution time threshold is greater than the time difference threshold, there is still ample time to execute the distributed offline task; in this case the computing power release operation may include suspending execution of the distributed offline task, so that it can be restarted later when the computing power resources required by the cloud application fall back to or below the second computing power resource. If the time difference between the dwell time and the execution time threshold is less than the time difference threshold, little time remains for executing the distributed offline task; it may then not be possible to wait for the edge server to recover sufficient idle resources before resuming execution, and execution of the distributed offline task may need to be terminated, with the management server informed so it can reselect an edge server with sufficient idle computing power resources to execute the distributed offline task. Therefore, if the time difference between the dwell time and the execution time threshold is less than the time difference threshold, the computing power release operation may include terminating execution of the distributed offline task.
Optionally, if the computing power release operation is suspending execution of the distributed offline task, then after performing the computing power release operation the edge server may periodically check its idle computing power resources; if it detects that its idle computing power resources have become greater than the first computing power resource, it resumes executing the distributed offline task; if the edge server's idle computing power resources are less than the first computing power resource and the difference between the distributed offline task's dwell time in the edge server and the execution time threshold is less than the time difference threshold, it terminates execution of the distributed offline task. That is, if during periodic checks the edge server finds that its idle computing power resources are insufficient to execute the distributed offline task while little time remains before the execution time threshold, the edge server can only abandon continuing the distributed offline task and inform the management server to reschedule it to a new edge server.
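The pause-versus-terminate decision above reduces to one comparison, sketched here; the function name and the scalar time units are assumptions, and only the decision rule itself comes from the description above.

```python
def release_action(dwell, deadline, gap_threshold):
    """Decide the computing power release operation when the cloud
    application needs its resources back. dwell: how long the offline
    task has stayed on this edge server; deadline: its execution time
    threshold; gap_threshold: the time difference threshold."""
    remaining = deadline - dwell
    # Ample time left: suspend and hope to resume; otherwise terminate
    # so the management server can reschedule to another edge server.
    return "pause" if remaining > gap_threshold else "terminate"
```

A suspended task is re-evaluated with the same rule on each periodic check, so a long suspension eventually degrades into termination.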
Optionally, during execution of the distributed offline task, if the edge server predicts that it cannot complete the distributed offline task within the execution time threshold, it may send timeout prompt information to the management server, the timeout prompt information indicating that the time required for the edge server to complete the distributed offline task is greater than the task's execution time threshold and that the management server should reassign a new edge server to execute the distributed offline task.
In the embodiments of the present invention, the edge server receives a distributed offline task scheduled by the management server; the distributed offline task may be the offline task received by the management server, or the subtask among the N subtasks that matches the edge server, the N subtasks being obtained by splitting the offline task based on the idle computing power resources of the N edge servers executing it. While ensuring normal operation of the cloud application, the edge server executes the distributed offline task using its idle computing power resources. In this way, whether during peak or off-peak periods of the cloud application, the idle computing power resources of the edge server can be used to execute distributed offline tasks while ensuring normal operation of the cloud application, avoiding the waste of computing power resources in the edge server, improving the utilization of computing power resources, and thereby reducing the operating cost of edge servers.
Based on the above embodiments of the service processing method, an embodiment of the present invention provides a service processing system. Referring to Fig. 4, a schematic structural diagram of a service processing system provided by an embodiment of the present invention, the system shown in Fig. 4 may include a management server 401 that manages the edge computing nodes, and at least one edge server 402, the at least one edge server 402 being allocated to P edge computing nodes 403, each edge computing node 403 including one or more edge servers.
In one embodiment, each edge server 402 includes a core function module, mainly used to implement the core functions of the cloud application; for cloud gaming, for example, this module handles game rendering, game logic computation, and similar functions. In the embodiments of the present invention, the computing power demand of this module is set to the highest priority; that is, no matter what offline task an edge server is executing, if this module is found to need more computing power resources, sufficient computing power resources are preferentially allocated to it.
In one embodiment, each edge server 402 may also include a computing power management module, used to manage the edge server's computing power resources and ensure that the demands of all real-time online tasks on the machine do not exceed the physical computing power ceiling. Its main functions may include:
(1) Collecting and reporting the machine's real-time idle computing power resources; the data is reported to the management server and can serve as the basis for subsequent offline task scheduling. It should be understood that either the machine's idle computing power resources may be reported directly to the management server 401, or the machine's occupied computing power resources and total available computing power resources may be reported, with the management server 401 computing the edge server's idle computing power resources from them;
(2) Local computing power management: when the computing power required by real-time online tasks (in the embodiments of the present invention, mainly the cloud application) exceeds the machine's currently available computing power ceiling, if an offline task is running, the offline task scheduling module is notified to release computing power; if there is no offline task, or the ceiling is still exceeded after all offline tasks have been suspended, the management server is notified to schedule some cloud application instances to other edge servers;
In one embodiment, the edge server 402 may also include an offline task scheduling module, mainly used to schedule the offline tasks the management server distributes to the machine. Its main functions may include:
(1) Starting tasks. Upon receiving an offline task issued by the management server (here, an offline task may be a complete offline task or a subtask split from a complete offline task), the module computes whether the machine's idle computing power resources meet the requirement, mainly by determining whether they can complete the offline task within the execution time threshold; if so, it starts the offline task. The offline task scheduling module also periodically checks the execution state of the machine's offline tasks; for tasks suspended due to insufficient idle computing power, it rechecks the current idle computing power resources, and if they are still insufficient, it checks the task's dwell time on the machine: if the time difference between the dwell time and the execution time threshold is less than the time difference threshold, or the dwell time exceeds the execution time threshold, it terminates execution of the offline task;
(2) Releasing computing power. When the computing power management module detects that the computing power required by the current real-time online tasks is insufficient, it notifies this module to perform the computing power release operation, which may specifically include suspending execution of the offline task (for tasks that still have ample completion time, that is, as described above, the time difference between the task's dwell time on the machine and the execution time threshold is greater than the time difference threshold), or terminating execution of the offline task (for tasks that must be completed soon, that is, as mentioned above, the time difference between the task's dwell time on the machine and the execution time threshold is less than the time difference threshold);
(3) Suspending offline tasks. Suspending an offline task according to the instructions from the computing power release function;
(4) Completing offline tasks. When an offline task is completed, cleaning up the machine's temporary data and reporting the completed computation to the management server;
(5) Terminating offline tasks. When the machine's idle computing power resources cannot complete the offline task within the execution time threshold, or exceptional situations occur such as the machine needing shutdown maintenance, terminating execution of the offline task and having the management server assign it to other edge servers.
In one embodiment, the management server 401 may include an idle capacity prediction module, mainly used to compute the idle computing power resources of each edge server and the node idle computing power resources of each edge computing node from the computing power resource data reported by each edge server.
In one embodiment, the management server 401 may also include an offline task management module, whose main functions may include:
(1) Task reception. Receiving offline tasks uploaded by users and classifying each offline task, mainly by whether the task has timeliness requirements, whether its main workload is CPU-type or GPU-type, and whether it can be executed in a distributed manner;
(2) Task assignment. Matching the computing power resources required to execute the offline task against the idle computing power resources of each edge server, to assign the offline task to a suitable edge server; if a single edge server cannot complete the offline task alone, distributing the task to the distributed scheduling module for assignment;
(3) Task process management. Receiving the task execution state reported by the edge servers running the cloud application; if the task execution state indicates the offline task has been completed, verifying the execution result and feeding it back to the user; if the task execution state indicates an execution exception, or no task execution state has been received from the edge server for a long time, rescheduling the offline task to a new edge server.
In one embodiment, the management server 401 also includes a policy management module, whose main functions may include:
(1) Whitelist management. Some edge servers need to run high-priority, non-interruptible cloud applications; these edge servers are added to a preset whitelist to ensure they are not scheduled offline tasks. These edge servers do not need to stay on the preset whitelist permanently: after the high-priority cloud application ends, they are removed from the preset whitelist;
(2) Edge server state management. An edge server's working state is either busy or idle. When an edge server reports that it has exceeded its load ceiling, its working state is set to busy, preventing new offline tasks and cloud application instances from being assigned to it. When the edge server informs the management server that it is currently idle, its working state is changed to idle, and its idle computing power resources may be recomputed.
In one embodiment, the management server 401 may also include a distributed scheduling module, whose main functions include:
(1) Task scheduling. Splitting a large offline task into multiple subtasks according to the idle computing power resources of the edge servers executing it, and assigning each subtask to an edge server for execution; when each edge server completes the execution of its subtask, it reports the execution result to the management server 401, which aggregates the reported execution results to obtain the final execution result;
(2) Abnormal edge server management. When an edge server encounters an exceptional situation such as a real-time computing anomaly, going offline, or a computation timeout, discovering it promptly and scheduling the offline task to other edge servers for execution.
In one embodiment, the management server 401 may also include a cloud application instance scheduling module, used to dynamically allocate cloud application instances according to the computing power resources of each edge server and the node computing power resources of each edge computing node, avoiding overload of any single edge server.
In the above service processing system, after the management server receives an offline task to be executed, it first evaluates the first computing power resource required to execute it; further, it obtains N edge servers for executing the offline task, the idle computing power resources of these N edge servers being greater than the first computing power resource required to execute the offline task, where the idle computing power resources of the N edge servers refer to the sum of the idle computing power resources of each edge server; it then distributes the offline task to the N edge servers.
After any edge server among the N receives the offline task distributed by the management server, it executes the offline task using its idle computing power resources while ensuring normal operation of the cloud application. In this way, whether during peak or off-peak periods of the cloud application, the idle computing power resources of the edge server can be used to execute distributed offline tasks while ensuring normal operation of the cloud application, avoiding the waste of computing power resources in the edge server, improving the utilization of computing power resources, and thereby reducing the operating cost of edge servers.
In addition, N may take a value of 1 or greater than 1: when N is 1, centralized execution of the offline task is ensured, facilitating its execution management; when N is greater than 1, distributed execution of the offline task is realized, which not only ensures the execution progress of the offline task but also shares the load of each edge server, thereby ensuring that the cloud application in each edge server can run normally.
It should be understood that although the steps in the flowcharts of the above embodiments are displayed in sequence as indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, there is no strict order restriction on their execution, and they may be executed in other orders. Moreover, at least some of the steps in the above embodiments may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different moments; their execution order is not necessarily sequential, and they may be executed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
Based on the above embodiments of the service processing method, an embodiment of the present invention provides a service processing apparatus. Referring to Fig. 5, a schematic structural diagram of a service processing apparatus provided by an embodiment of the present invention, the apparatus shown in Fig. 5 may run the following units:
a determining unit 501, configured to determine the first computing power resource required to execute an offline task;
the determining unit 501 being further configured to determine N edge servers for executing the offline task, a cloud application running in the N edge servers; the idle computing power resources of the N edge servers being greater than the first computing power resource, the idle computing power resources of the N edge servers referring to the sum of the idle computing power resources of each of the N edge servers, N being an integer greater than or equal to 1;
a scheduling unit 502, configured to distribute the offline task to the N edge servers, so that each of the N edge servers uses its idle computing power resources to execute the offline task while ensuring normal operation of the cloud application.
In one embodiment, when distributing the offline task to the N edge servers, the scheduling unit 502 performs the following steps:
splitting the offline task into N subtasks based on the idle computing power resources of each of the N edge servers, each of the N subtasks matching one edge server, the idle computing power resources of each subtask's matched edge server being greater than the computing power resources required to execute that subtask; and assigning each subtask to its matched edge server, so that each edge server executes its matched subtask.
In one embodiment, the cloud application is deployed to execute in M edge servers, the M edge servers are allocated to P edge computing nodes, one or more edge servers are deployed in each edge computing node, and M and P are both integers greater than or equal to 1; when determining the N edge servers for executing the offline task, the determining unit 501 performs the following steps:
selecting L edge computing nodes from the P edge computing nodes, the node idle computing power resources of the L edge computing nodes being greater than the first computing power resource, the node idle computing power resources of the L edge computing nodes referring to the sum of the node idle computing power resources of each edge computing node; the node idle computing power resources of each edge node being obtained from the idle computing power resources of the edge servers deployed in that edge node;
determining at least one candidate edge server from the edge servers included in the L edge computing nodes, based on the attribute information of each edge server included in the L edge computing nodes; and determining N edge servers from the at least one candidate edge server according to the idle computing power resources of each candidate edge server and the first computing power resource.
In one embodiment, the attribute information of each edge server includes its working state, the working state including an idle state or a busy state; when determining at least one candidate edge server from the edge servers included in the L edge computing nodes, the determining unit 501 performs the following step:
determining as candidate edge servers those edge servers included in the L edge computing nodes whose working state is the idle state.
In one embodiment, the attribute information of each edge server includes the server type group to which it belongs, the server type groups including a preset whitelist group and a normal group; when determining at least one candidate edge server from the edge servers included in the L edge computing nodes based on the attribute information of each edge server included in the L edge computing nodes, the determining unit 501 performs the following step:
determining as candidate edge servers those edge servers included in the L edge computing nodes whose server type group is the normal group.
In one embodiment, the service processing apparatus further includes a processing unit 503, configured to: monitor, while each edge server executes its matched subtask, each edge server's execution of its matched subtask; and, if any of the N edge servers is detected to have an exception while executing its matched subtask, reselect an edge server to execute that edge server's matched subtask.
In one embodiment, one subtask corresponds to one execution time threshold, and the service processing apparatus further includes a receiving unit 504, configured to receive, while each edge server executes its matched subtask, timeout prompt information reported by any edge server, the timeout prompt information indicating that the time required for that edge server to complete its matched subtask is greater than the subtask's execution time threshold and that a new edge server needs to be reassigned to execute that edge server's matched subtask.
In one embodiment, the first computing power resource includes any one or more of: graphics processor computing power resources, central processor computing power resources, memory, network bandwidth, and network throughput; the graphics processor computing power resources include at least one of the floating-point operations per second performed by the graphics processor and the operations per second performed by the graphics processor; the central processor computing power resources include at least one of the floating-point operations per second performed by the central processor and the operations per second performed by the central processor.
In one embodiment, when determining the first computing power resource required to execute the offline task, the determining unit performs the following steps:
determining, based on the correspondence between task types and computational complexity, the computational complexity corresponding to the task type of the offline task; finding at least one matching historical offline task from historical offline tasks according to the determined computational complexity, the computational complexity of each matching historical offline task matching the determined computational complexity; and estimating the computing power resources required by the offline task based on the computing power resources used to execute each matching historical offline task, to obtain the first computing power resource required to execute the offline task.
According to one embodiment of the present invention, the steps involved in the service processing method shown in Fig. 2 may be executed by the units of the service processing apparatus shown in Fig. 5. For example, steps S201 and S202 in Fig. 2 may be executed by the determining unit 501 of the apparatus shown in Fig. 5, and step S203 by the scheduling unit 502 of the apparatus shown in Fig. 5.
According to another embodiment of the present invention, the units of the service processing apparatus shown in Fig. 5 may be combined, separately or entirely, into one or several other units, or some unit(s) among them may be further split into multiple functionally smaller units, which can achieve the same operations without affecting the technical effects of the embodiments of the present invention. The above units are divided based on logical functions; in practical applications, the function of one unit may also be realized by multiple units, or the functions of multiple units by one unit. In other embodiments of the present invention, the service processing apparatus may also include other units; in practical applications these functions may be assisted by other units and realized cooperatively by multiple units.
According to another embodiment of the present invention, the service processing apparatus shown in Fig. 5 may be constructed, and the service processing method of the embodiments of the present invention implemented, by running computer-readable instructions (including program code) capable of executing the steps of the corresponding method shown in Fig. 2 on a general-purpose computing device, such as a computer, that includes processing elements and storage elements such as a central processing unit (CPU), a random access storage medium (RAM), and a read-only storage medium (ROM). The computer-readable instructions may be recorded on, for example, a computer storage medium, loaded into the above node device via the computer storage medium, and run there.
In the embodiments of the present invention, after the offline task to be executed is received, the first computing power resource required to execute it is first evaluated; further, N edge servers for executing the offline task are obtained, the idle computing power resources of these N edge servers being greater than the first computing power resource required to execute the offline task, where the idle computing power resources of the N edge servers refer to the sum of the idle computing power resources of each edge server. The offline task is distributed to the N edge servers, so that each of the N edge servers uses its idle computing power resources to execute the offline task while ensuring normal operation of the cloud application. In this way, whether during peak or off-peak periods of the cloud application, the idle computing power resources of each edge server can be used to execute offline tasks while ensuring normal operation of the cloud application, avoiding the waste of computing power resources in each edge server, improving the utilization of computing power resources, and thereby reducing the operating cost of edge servers. In addition, N may take a value of 1 or greater than 1: when N is 1, centralized execution of the offline task is ensured, facilitating its execution management; when N is greater than 1, distributed execution of the offline task is realized, which not only ensures the execution progress of the offline task but also shares the load of each edge server, thereby ensuring that the cloud application in each edge server can run normally.
基于上述业务处理方法的实施例以及业务处理装置的实施例,本发明实施例还提供了另一种业务处理装置。参见图6,为本发明实施例提供的另一种业务处理装置的结构示意图。图6所示的业务处理装置可运行如下单元:
接收单元601,用于接收管理服务器分布式调度的分布式离线任务,所述分布式离线任务包括所述管理服务器接收到的离线任务,或者所述分布式离线任务包括N个子任务中与边缘服务器相匹配的子任务,所述N个子任务是基于N个边缘服务器中每个边缘服务器的空闲算力资源对所述离线任务进行分割处理得到的;所述N个边缘服务器用于执行所述离线任务,所述N个边缘服务器中运行有云应用;
执行单元602,用于在保证所述云应用正常执行的情况下,采用所述边缘服务器中的空闲算力资源执行所述分布式离线任务。
在一个实施例中,所述分布式离线任务对应一个执行时长阈值,所述执行单元602在采用所述边缘服务器中的空闲算力资源执行所述分布式离线任务时,执行如下步骤:
基于所述边缘服务器的空闲算力资源确定执行完成所述分布式离线任务所需时长;如果所需时长小于所述分布式离线任务对应的执行时长阈值,则采用所述边缘服务器的空闲算力资源执行所述分布式离线任务。
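该“先估算所需时长、小于执行时长阈值才执行”的判断可用如下Python草图示意(仅为示例;以“工作量/空闲算力”线性估算所需时长是本示例的简化假设):

```python
def can_execute(workload: float, idle_resource: float,
                exec_time_threshold: float) -> bool:
    """基于边缘服务器的空闲算力资源估算执行完成分布式离线任务
    所需时长,仅当所需时长小于执行时长阈值时才执行(示意实现)。"""
    required_time = workload / idle_resource  # 所需时长 ≈ 工作量 / 空闲算力
    return required_time < exec_time_threshold
```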
在一个实施例中,所述边缘服务器的空闲算力资源是指所述边缘服务器除运行所述云应用所需的第二算力资源外的剩余算力资源;所述业务处理装置还包括获取单元603;
所述获取单元603,用于在执行所述分布式离线任务的过程中,当监测到所述边缘服务器中运行所述云应用所需的资源大于所述第二算力资源时,获取所述分布式离线任务在所述边缘服务器中的停留时长;
所述执行单元602,用于根据所述停留时长与所述执行时长阈值之间的关系,执行算力释放操作;其中,所述算力释放操作包括暂停执行所述分布式离线任务或者终止执行所述分布式离线任务;如果所述停留时长与所述执行时长阈值之间的时间差大于时间差阈值,则所述算力释放操作包括暂停执行所述分布式离线任务;如果所述停留时长与所述执行时长阈值之间的时间差小于时间差阈值,则所述算力释放操作包括终止执行所述分布式离线任务。
在一个实施例中,若所述算力释放操作包括暂停执行所述分布式离线任务,所述执行单元602还用于:定期检测所述边缘服务器的空闲算力资源;若所述边缘服务器的空闲算力资源大于所述第一算力资源,则启动执行所述分布式离线任务;若所述边缘服务器的空闲算力资源小于所述第一算力资源,且所述分布式离线任务在所述边缘服务器中的停留时长与所述执行时长阈值之间的差值小于时间差阈值,则终止执行所述分布式离线任务。
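其中“暂停执行还是终止执行”的算力释放决策可用如下Python草图示意(仅为示例;将“停留时长与执行时长阈值之间的时间差”理解为剩余时间裕量是本示例的假设):

```python
def release_action(dwell_time: float, exec_threshold: float,
                   diff_threshold: float) -> str:
    """根据分布式离线任务在边缘服务器中的停留时长与执行时长阈值
    之间的差值决定算力释放操作(示意实现):
    差值大于时间差阈值 -> 暂停执行(之后可等待空闲算力恢复再启动);
    差值小于等于时间差阈值 -> 终止执行。"""
    if exec_threshold - dwell_time > diff_threshold:
        return "暂停执行"
    return "终止执行"
```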
在一个实施例中,所述业务处理装置还包括发送单元604,所述发送单元604,用于在所述边缘服务器执行所述分布式离线任务的过程中,如果预测到所述边缘服务器执行完成所述分布式离线任务所需时长大于所述执行时长阈值,则向所述管理服务器发送超时提示信息,所述超时提示信息用于指示所述边缘服务器执行完成所述分布式离线任务所需时长大于所述执行时长阈值,所述管理服务器需重新分配新的边缘服务器执行所述分布式离线任务。
根据本发明的一个实施例,图3所示的业务处理方法所涉及各个步骤可以是由图6所示的业务处理装置中的各个单元来执行的。例如,图3所述的步骤S301可由图6所示的业务处理装置中的接收单元601来执行,步骤S302可由图6所示的业务处理装置中执行单元602来执行。
根据本发明的另一个实施例,图6所示的业务处理装置中的各个单元可以分别或全部合并为一个或若干个另外的单元来构成,或者其中的某个(些)单元还可以再拆分为功能上更小的多个单元来构成,这可以实现同样的操作,而不影响本发明的实施例的技术效果的实现。上述单元是基于逻辑功能划分的,在实际应用中,一个单元的功能也可以由多个单元来实现,或者多个单元的功能由一个单元实现。在本发明的其它实施例中,业务处理装置也可以包括其它单元,在实际应用中,这些功能也可以由其它单元协助实现,并且可以由多个单元协作实现。
根据本发明的另一个实施例,可以通过在包括中央处理单元(CPU)、随机存取存储介质(RAM)、只读存储介质(ROM)等处理元件和存储元件的例如计算机的通用计算设备上运行能够执行如图3所示的相应方法所涉及的各步骤的计算机可读指令(包括程序代码),来构造如图6中所示的业务处理装置,以及来实现本发明实施例的业务处理方法。所述计算机可读指令可以记载于例如计算机存储介质上,并通过计算机存储介质装载于上述节点设备中,并在其中运行。
本发明实施例中,边缘服务器接收管理服务器分布式调度的分布式离线任务,该分布式离线任务可以是管理服务器接收到的离线任务,也可以是N个子任务中与边缘服务器相匹配的子任务,这N个子任务可以是基于用于执行离线任务的N个边缘服务器的空闲算力资源对离线任务进行分割处理得到的;在保证云应用正常运行的情况下,采用该边缘服务器的空闲算力资源执行分布式离线任务。这样一来,无论在云应用的高峰期还是非高峰期,均能在保证云应用正常运行的情况下,利用该边缘服务器中的空闲算力资源来执行分布式离线任务,避免了边缘服务器中算力资源的浪费,提高了算力资源的利用率,从而也降低了边缘服务器的运营成本。
基于上述的方法实施例以及装置实施例,本发明实施例提供了一种服务器,参见图7,为本发明实施例提供的一种服务器的结构示意图。图7所示的服务器可以对应于前述的管理服务器,图7所示的服务器可以包括处理器701、输入接口702、输出接口703以及计算机存储介质704。其中,处理器701、输入接口702、输出接口703以及计算机存储介质704可通过总线或其他方式连接。
计算机存储介质704可以存储在服务器的存储器中,所述计算机存储介质704用于存储计算机可读指令,所述处理器701用于执行所述计算机存储介质704存储的计算机可读指令。处理器701(或称CPU(Central Processing Unit,中央处理器))是服务器的计算核心以及控制核心,其适于实现一条或多条计算机可读指令,具体适于加载并执行:
确定执行离线任务所需的第一算力资源;确定用于执行所述离线任务的N个边缘服务器,所述N个边缘服务器中运行有云应用;所述N个边缘服务器的空闲算力资源大于所述第一算力资源,所述N个边缘服务器的空闲算力资源是指所述N个边缘服务器中各个边缘服务器的空闲算力资源之和,N为大于或等于1的整数;将所述离线任务分布式调度至所述N个边缘服务器,以使所述N个边缘服务器中每个边缘服务器在保证所述云应用正常运行的情况下,使用所述每个边缘服务器的空闲算力资源执行所述离线任务。
本发明实施例中,当接收到待执行的离线任务后,先评估执行该离线任务所需的第一算力资源,进一步地,获取用于执行该离线任务的N个边缘服务器,这N个边缘服务器的空闲算力资源大于执行离线任务所需的第一算力资源,N个边缘服务器的空闲算力资源是指每个边缘服务器的空闲算力资源之和。将该离线任务分布式调度至N个边缘服务器,以使N个边缘服务器中的每个边缘服务器在保证云应用正常运行的情况下,采用每个边缘服务器的空闲算力资源执行该离线任务。这样一来,无论在云应用的高峰期还是非高峰期,均能在保证云应用正常运行的情况下,利用每个边缘服务器中的空闲算力资源来执行离线任务,避免了每个边缘服务器中算力资源的浪费,提高了算力资源的利用率,从而也降低了边缘服务器的运营成本。另外,N可以取值为1或大于1,当N取值为1时可以保证离线任务的集中执行,为离线任务的执行管理提供便利;当N取值大于1时,实现了离线任务的分布式执行,这种分布式执行方式不仅能够保证离线任务的执行进度,还可以分担各个边缘服务器的负载,从而保证各个边缘服务器中云应用能够正常运行。
基于上述的方法实施例以及装置实施例,本发明实施例提供了另一种服务器,参见图8,为本发明实施例提供的另一种服务器的结构示意图,图8所示的服务器可以对应于前述的边缘服务器。图8所示的服务器可以包括处理器801、输入接口802、输出接口803以及计算机存储介质804。其中,处理器801、输入接口802、输出接口803以及计算机存储介质804可通过总线或其他方式连接。
计算机存储介质804可以存储在服务器的存储器中,所述计算机存储介质804用于存储计算机可读指令,所述处理器801用于执行所述计算机存储介质804存储的计算机可读指令。处理器801(或称CPU(Central Processing Unit,中央处理器))是服务器的计算核心以及控制核心,其适于实现一条或多条计算机可读指令,具体适于加载并执行:
接收管理服务器分布式调度的分布式离线任务,所述分布式离线任务包括所述管理服务器接收到的所述离线任务;或者,所述分布式离线任务包括N个子任务中与所述边缘服务器相匹配的子任务,所述N个子任务是基于N个边缘服务器中每个边缘服务器的空闲算力资源对所述离线任务进行分割处理得到;
在保证所述云应用正常执行的情况下,采用所述边缘服务器的空闲算力资源执行所述分布式离线任务。
本发明实施例中,边缘服务器接收管理服务器分布式调度的分布式离线任务,该分布式离线任务可以是管理服务器接收到的离线任务,也可以是N个子任务中与边缘服务器相匹配的子任务,这N个子任务可以是基于用于执行离线任务的N个边缘服务器的空闲算力资源对离线任务进行分割处理得到的;在保证云应用正常运行的情况下,采用该边缘服务器的空闲算力资源执行分布式离线任务。这样一来,无论在云应用的高峰期还是非高峰期,均能在保证云应用正常运行的情况下,利用该边缘服务器中的空闲算力资源来执行分布式离线任务,避免了边缘服务器中算力资源的浪费,提高了算力资源的利用率,从而也降低了边缘服务器的运营成本。
本发明实施例还提供了一种计算机存储介质(Memory),所述计算机存储介质是服务器的记忆设备,用于存放程序和数据。可以理解的是,此处的计算机存储介质既可以包括服务器的内置存储介质,当然也可以包括服务器所支持的扩展存储介质。计算机存储介质提供存储空间,该存储空间存储了服务器的操作系统。并且,在该存储空间中还存放了适于被处理器701或者处理器801加载并执行的计算机可读指令。需要说明的是,此处的计算机存储介质可以是高速RAM存储器,也可以是非易失性存储器(non-volatile memory),例如至少一个磁盘存储器;可选的还可以是至少一个位于远离前述处理器的计算机存储介质。
在一个实施例中,所述计算机存储介质中存储的计算机可读指令可由处理器701加载并执行:
确定执行离线任务所需的第一算力资源;确定用于执行所述离线任务的N个边缘服务器,所述N个边缘服务器中运行有云应用;所述N个边缘服务器的空闲算力资源大于所述第一算力资源,所述N个边缘服务器的空闲算力资源是指所述N个边缘服务器中各个边缘服务器的空闲算力资源之和,N为大于或等于1的整数;将所述离线任务分布式调度至所述N个边缘服务器,以使所述N个边缘服务器中每个边缘服务器在保证所述云应用正常运行的情况下,使用所述每个边缘服务器的空闲算力资源执行所述离线任务。
在一个实施例中,所述处理器701在将所述离线任务分布式调度至所述N个边缘服务器时,执行如下步骤:
基于所述N个边缘服务器中每个边缘服务器的空闲算力资源,将所述离线任务分割为N个子任务,所述N个子任务中每个子任务与一个边缘服务器相匹配;所述每个子任务相匹配的边缘服务器的空闲算力资源大于执行所述每个子任务所需的算力资源;将所述每个子任务分别分配至所述每个子任务相匹配的边缘服务器,以使每个边缘服务器执行相匹配的子任务。
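上述按各边缘服务器空闲算力资源分割离线任务的步骤可用如下Python草图示意(仅为示例;以“工作量”表示任务规模、按空闲算力比例分割均为本示例的简化假设):

```python
def split_offline_task(total_workload: float, idle_resources: list) -> list:
    """基于 N 个边缘服务器中每个边缘服务器的空闲算力资源,
    按比例将离线任务分割为 N 个子任务(以工作量表示),
    使每个子任务与一个边缘服务器相匹配(示意实现)。"""
    total_idle = sum(idle_resources)
    # 空闲算力越大的边缘服务器分得越大的子任务
    return [total_workload * r / total_idle for r in idle_resources]
```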
在一个实施例中,所述云应用被部署至M个边缘服务器中执行,所述M个边缘服务器被分配至P个边缘计算节点中,每个边缘计算节点中部署有一个或多个边缘服务器,M和P均为大于或等于1的整数;所述处理器701在确定用于执行所述离线任务的N个边缘服务器时,执行如下步骤:
从所述P个边缘计算节点中选择L个边缘计算节点,所述L个边缘计算节点的节点空闲算力资源大于所述第一算力资源,所述L个边缘计算节点的节点空闲算力资源是指所述L个边缘计算节点中每个边缘计算节点的节点空闲算力资源之和;所述每个边缘计算节点的节点空闲算力资源是根据所述每个边缘计算节点中部署的边缘服务器的空闲算力资源得到的;
基于所述L个边缘计算节点包括的每个边缘服务器的属性信息,从所述L个边缘计算节点包括的边缘服务器中确定至少一个候选的边缘服务器;
根据所述至少一个候选的边缘服务器中每个边缘服务器的空闲算力资源和所述第一算力资源,从所述至少一个候选的边缘服务器中确定N个边缘服务器。
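上述“先选边缘计算节点、再按属性信息筛候选、最后确定N个边缘服务器”的三步流程可用如下Python草图示意(仅为示例;nodes 的数据结构、按空闲算力从大到小贪心累加的选择策略均为本示例的假设):

```python
def select_servers(nodes: dict, first_resource: float) -> list:
    """nodes: {节点名: [(空闲算力, 工作状态, 服务器类型分组), ...]}。"""
    # 步骤1:选出节点空闲算力之和大于第一算力资源的 L 个边缘计算节点
    selected_nodes, acc = [], 0.0
    for name, servers in sorted(nodes.items(),
                                key=lambda kv: -sum(s[0] for s in kv[1])):
        selected_nodes.append(name)
        acc += sum(s[0] for s in servers)
        if acc > first_resource:
            break
    # 步骤2:基于属性信息(工作状态为空闲、分组为普通组)确定候选边缘服务器
    candidates = [s for n in selected_nodes for s in nodes[n]
                  if s[1] == "idle" and s[2] == "normal"]
    # 步骤3:按空闲算力从大到小累加,确定空闲算力之和大于第一算力资源的 N 个服务器
    chosen, acc = [], 0.0
    for s in sorted(candidates, key=lambda s: -s[0]):
        chosen.append(s)
        acc += s[0]
        if acc > first_resource:
            return chosen
    raise RuntimeError("候选边缘服务器的空闲算力资源不足")
```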
在一个实施例中,所述每个边缘服务器的属性信息包括每个边缘服务器的工作状态,所述工作状态包括空闲状态或者忙碌状态;所述处理器701在基于所述L个边缘计算节点包括的每个边缘服务器的属性信息,从所述L个边缘计算节点包括的边缘服务器中确定至少一个候选的边缘服务器时,执行如下步骤:将所述L个边缘计算节点包括的边缘服务器中,工作状态为空闲状态的边缘服务器确定为候选的边缘服务器。
在一个实施例中,所述每个边缘服务器的属性信息包括每个边缘服务器所属的服务器类型分组,所述服务器类型分组包括预设白名单组和普通组;所述处理器701在基于所述L个边缘计算节点包括的每个边缘服务器的属性信息,从所述L个边缘计算节点包括的边缘服务器中确定至少一个候选的边缘服务器时,执行如下步骤:将所述L个边缘计算节点包括的边缘服务器中,所属的服务器类型分组为普通组的边缘服务器确定为候选的边缘服务器。
在一个实施例中,所述处理器701还用于:在每个边缘服务器执行相匹配的子任务过程中,监测每个边缘服务器对相匹配的子任务的执行情况;若监测到N个边缘服务器中任意一个边缘服务器在执行相匹配的子任务时发生异常,则重新选择一个边缘服务器执行所述任意一个边缘服务器相匹配的子任务。
在一个实施例中,一个子任务对应一个执行时长阈值,所述处理器701还用于:在每个边缘服务器执行相匹配的子任务过程中,接收任意一个边缘服务器上报的超时提示信息,所述超时提示信息用于指示所述任意一个边缘服务器执行完成相匹配的子任务所需时长大于相匹配的子任务对应的执行时长阈值,需重新分配新的边缘服务器执行所述任意一个边缘服务器相匹配的子任务。
在一个实施例中,所述第一算力资源包括以下任意一项或多项:图形处理器算力资源、中央处理器算力资源、内存、网络带宽以及网络吞吐量;其中,所述图形处理器算力资源包括如下至少一种:图形处理器每秒所执行的浮点运算次数以及图形处理器每次所执行的操作次数;所述中央处理器算力资源包括如下至少一种:中央处理器每秒所执行的浮点运算次数以及中央处理器每秒所执行的操作次数。
在一个实施例中,所述处理器701在确定执行离线任务所需的第一算力资源时,执行如下步骤:基于任务类型和计算复杂度之间的对应关系,确定与所述离线任务的任务类型对应的计算复杂度;根据确定的计算复杂度从历史离线任务中查找至少一个匹配历史离线任务,每个匹配历史离线任务对应的计算复杂度与确定的计算复杂度相匹配;基于执行每个匹配历史离线任务所使用的算力资源对所述离线任务所需的算力资源进行估算,得到执行所述离线任务所需的第一算力资源。
本发明实施例中,当接收到待执行的离线任务后,先评估执行该离线任务所需的第一算力资源,进一步地,获取用于执行该离线任务的N个边缘服务器,这N个边缘服务器的空闲算力资源大于执行离线任务所需的第一算力资源,N个边缘服务器的空闲算力资源是指每个边缘服务器的空闲算力资源之和。将该离线任务分布式调度至N个边缘服务器,以使N个边缘服务器中的每个边缘服务器在保证云应用正常运行的情况下,采用每个边缘服务器的空闲算力资源执行该离线任务。这样一来,无论在云应用的高峰期还是非高峰期,均能在保证云应用正常运行的情况下,利用每个边缘服务器中的空闲算力资源来执行离线任务,避免了每个边缘服务器中算力资源的浪费,提高了算力资源的利用率,从而也降低了边缘服务器的运营成本。另外,N可以取值为1或大于1,当N取值为1时可以保证离线任务的集中执行,为离线任务的执行管理提供便利;当N取值大于1时,实现了离线任务的分布式执行,这种分布式执行方式不仅能够保证离线任务的执行进度,还可以分担各个边缘服务器的负载,从而保证各个边缘服务器中云应用能够正常运行。
在一个实施例中,所述计算机存储介质中存储的计算机可读指令可由处理器801加载并执行:
接收管理服务器分布式调度的分布式离线任务,所述分布式离线任务包括所述管理服务器接收到的所述离线任务;或者,所述分布式离线任务包括N个子任务中与所述边缘服务器相匹配的子任务,所述N个子任务是基于N个边缘服务器中每个边缘服务器的空闲算力资源对所述离线任务进行分割处理得到;在保证所述云应用正常执行的情况下,采用所述边缘服务器的空闲算力资源执行所述分布式离线任务。
在一个实施例中,所述分布式离线任务对应一个执行时长阈值,所述处理器801在采用所述边缘服务器的空闲算力资源执行所述分布式离线任务时,执行如下步骤:
基于所述边缘服务器的空闲算力资源确定执行完成所述分布式离线任务所需时长;如果所需时长小于所述分布式离线任务对应的执行时长阈值,则采用所述边缘服务器的空闲算力资源执行所述分布式离线任务。
在一个实施例中,所述边缘服务器的空闲算力资源是指所述边缘服务器除运行所述云应用所需的第二算力资源外的剩余算力资源;所述处理器801还用于执行:
在执行所述分布式离线任务的过程中,当监测到所述边缘服务器中运行所述云应用所需的资源大于所述第二算力资源时,获取所述分布式离线任务在所述边缘服务器中的停留时长;
根据所述停留时长与所述执行时长阈值之间的关系,执行算力释放操作;其中,所述算力释放操作包括暂停执行所述分布式离线任务或者终止执行所述分布式离线任务;如果所述停留时长与所述执行时长阈值之间的时间差大于时间差阈值,则所述算力释放操作包括暂停执行所述分布式离线任务;如果所述停留时长与所述执行时长阈值之间的时间差小于时间差阈值,则所述算力释放操作包括终止执行所述分布式离线任务。
在一个实施例中,所述算力释放操作包括暂停执行所述分布式离线任务,在执行算力释放操作后,所述处理器801还用于执行:
定期检测所述边缘服务器的空闲算力资源;若所述边缘服务器的空闲算力资源大于所述第一算力资源,则启动执行所述分布式离线任务;若所述边缘服务器的空闲算力资源小于所述第一算力资源,且所述分布式离线任务在所述边缘服务器中的停留时长与所述执行时长阈值之间的差值小于时间差阈值,则终止执行所述分布式离线任务。
在一个实施例中,所述处理器801还用于执行:在所述边缘服务器执行所述分布式离线任务的过程中,如果预测到所述边缘服务器执行完成所述分布式离线任务所需时长大于所述执行时长阈值,则向所述管理服务器发送超时提示信息,所述超时提示信息用于指示所述边缘服务器执行完成所述分布式离线任务所需时长大于所述执行时长阈值,所述管理服务器需重新分配新的边缘服务器执行所述分布式离线任务。
本发明实施例中,边缘服务器接收管理服务器分布式调度的分布式离线任务,该分布式离线任务可以是管理服务器接收到的离线任务,也可以是N个子任务中与边缘服务器相匹配的子任务,这N个子任务可以是基于用于执行离线任务的N个边缘服务器的空闲算力资源对离线任务进行分割处理得到的;在保证云应用正常运行的情况下,采用该边缘服务器的空闲算力资源执行分布式离线任务。这样一来,无论在云应用的高峰期还是非高峰期,均能在保证云应用正常运行的情况下,利用该边缘服务器中的空闲算力资源来执行分布式离线任务,避免了边缘服务器中算力资源的浪费,提高了算力资源的利用率,从而也降低了边缘服务器的运营成本。
根据本申请的一个方面,本发明实施例还提供了一种计算机程序产品,该计算机程序产品包括计算机可读指令,所述计算机可读指令存储在计算机存储介质中。
可选的,处理器701从计算机存储介质中读取计算机可读指令,使得服务器加载并执行:确定执行离线任务所需的第一算力资源;确定用于执行所述离线任务的N个边缘服务器,所述N个边缘服务器中运行有云应用;所述N个边缘服务器的空闲算力资源大于所述第一算力资源,所述N个边缘服务器的空闲算力资源是指所述N个边缘服务器中各个边缘服务器的空闲算力资源之和,N为大于或等于1的整数;将所述离线任务分布式调度至所述N个边缘服务器,以使所述N个边缘服务器中每个边缘服务器在保证所述云应用正常运行的情况下,使用所述每个边缘服务器的空闲算力资源执行所述离线任务。
本发明实施例中,当接收到待执行的离线任务后,先评估执行该离线任务所需的第一算力资源,进一步地,获取用于执行该离线任务的N个边缘服务器,这N个边缘服务器的空闲算力资源大于执行离线任务所需的第一算力资源,N个边缘服务器的空闲算力资源是指每个边缘服务器的空闲算力资源之和。将该离线任务分布式调度至N个边缘服务器,以使N个边缘服务器中的每个边缘服务器在保证云应用正常运行的情况下,采用每个边缘服务器的空闲算力资源执行该离线任务。这样一来,无论在云应用的高峰期还是非高峰期,均能在保证云应用正常运行的情况下,利用每个边缘服务器中的空闲算力资源来执行离线任务,避免了每个边缘服务器中算力资源的浪费,提高了算力资源的利用率,从而也降低了边缘服务器的运营成本。另外,N可以取值为1或大于1,当N取值为1时可以保证离线任务的集中执行,为离线任务的执行管理提供便利;当N取值大于1时,实现了离线任务的分布式执行,这种分布式执行方式不仅能够保证离线任务的执行进度,还可以分担各个边缘服务器的负载,从而保证各个边缘服务器中云应用能够正常运行。
可选的,处理器801从计算机存储介质中读取计算机可读指令,处理器801执行该计算机可读指令,使得服务器执行:
接收管理服务器分布式调度的分布式离线任务,所述分布式离线任务包括所述管理服务器接收到的所述离线任务;或者,所述分布式离线任务包括N个子任务中与所述边缘服务器相匹配的子任务,所述N个子任务是基于N个边缘服务器中每个边缘服务器的空闲算力资源对所述离线任务进行分割处理得到;在保证所述云应用正常执行的情况下,采用所述边缘服务器的空闲算力资源执行所述分布式离线任务。
本发明实施例中,边缘服务器接收管理服务器分布式调度的分布式离线任务,该分布式离线任务可以是管理服务器接收到的离线任务,也可以是N个子任务中与边缘服务器相匹配的子任务,这N个子任务可以是基于用于执行离线任务的N个边缘服务器的空闲算力资源对离线任务进行分割处理得到的;在保证云应用正常运行的情况下,采用该边缘服务器的空闲算力资源执行分布式离线任务。这样一来,无论在云应用的高峰期还是非高峰期,均能在保证云应用正常运行的情况下,利用该边缘服务器中的空闲算力资源来执行分布式离线任务,避免了边缘服务器中算力资源的浪费,提高了算力资源的利用率,从而也降低了边缘服务器的运营成本。

Claims (20)

  1. 一种业务处理方法,由管理服务器执行,其特征在于,包括:
    确定执行离线任务所需的第一算力资源;
    确定用于执行所述离线任务的N个边缘服务器,所述N个边缘服务器中运行有云应用;所述N个边缘服务器的空闲算力资源大于所述第一算力资源,所述N个边缘服务器的空闲算力资源是指所述N个边缘服务器中各个边缘服务器的空闲算力资源之和,N为大于或等于1的整数;
    将所述离线任务分布式调度至所述N个边缘服务器,以使所述N个边缘服务器中每个边缘服务器在保证所述云应用正常运行的情况下,使用所述每个边缘服务器的空闲算力资源执行所述离线任务。
  2. 如权利要求1所述的方法,其特征在于,所述将所述离线任务分布式调度至所述N个边缘服务器,包括:
    基于所述N个边缘服务器中每个边缘服务器的空闲算力资源,将所述离线任务分割为N个子任务,所述N个子任务中每个子任务与一个边缘服务器相匹配;所述每个子任务相匹配的边缘服务器的空闲算力资源大于执行所述每个子任务所需的算力资源;
    将所述每个子任务分别分配至所述每个子任务相匹配的边缘服务器,以使每个边缘服务器执行相匹配的子任务。
  3. 如权利要求1所述的方法,其特征在于,所述云应用被部署至M个边缘服务器中执行,所述M个边缘服务器被分配至P个边缘计算节点中,每个边缘计算节点中部署有一个或多个边缘服务器,M和P均为大于或等于1的整数;所述确定用于执行所述离线任务的N个边缘服务器,包括:
    从所述P个边缘计算节点中选择L个边缘计算节点,所述L个边缘计算节点的节点空闲算力资源大于所述第一算力资源,所述L个边缘计算节点的节点空闲算力资源是指所述L个边缘计算节点中每个边缘计算节点的节点空闲算力资源之和;所述每个边缘计算节点的节点空闲算力资源是根据所述每个边缘计算节点中部署的边缘服务器的空闲算力资源得到的;
    基于所述L个边缘计算节点包括的每个边缘服务器的属性信息,从所述L个边缘计算节点包括的边缘服务器中确定至少一个候选的边缘服务器;
    根据所述至少一个候选的边缘服务器中每个边缘服务器的空闲算力资源和所述第一算力资源,从所述至少一个候选的边缘服务器中确定N个边缘服务器。
  4. 如权利要求3所述的方法,其特征在于,所述每个边缘服务器的属性信息包括每个边缘服务器的工作状态,所述工作状态包括空闲状态或者忙碌状态,所述基于所述L个边缘计算节点包括的每个边缘服务器的属性信息,从所述L个边缘计算节点包括的边缘服务器中确定至少一个候选的边缘服务器,包括:
    将所述L个边缘计算节点包括的边缘服务器中,工作状态为空闲状态的边缘服务器确定为候选的边缘服务器。
  5. 如权利要求3所述的方法,其特征在于,所述每个边缘服务器的属性信息包括每个边缘服务器所属的服务器类型分组,所述服务器类型分组包括预设白名单组和普通组,所述基于所述L个边缘计算节点包括的每个边缘服务器的属性信息,从所述L个边缘计算节点包括的边缘服务器中确定至少一个候选的边缘服务器,包括:
    将所述L个边缘计算节点包括的边缘服务器中,所属的服务器类型分组为普通组的边缘服务器确定为候选的边缘服务器。
  6. 如权利要求2所述的方法,其特征在于,所述方法还包括:
    在每个边缘服务器执行相匹配的子任务过程中,监测每个边缘服务器对相匹配的子任务的执行情况;
    若监测到N个边缘服务器中任意一个边缘服务器在执行相匹配的子任务时发生异常,则重新选择一个边缘服务器,执行发生异常的所述任意一个边缘服务器相匹配的子任务。
  7. 如权利要求2所述的方法,其特征在于,一个子任务对应一个执行时长阈值,所述方法还包括:
    在每个边缘服务器执行相匹配的子任务过程中,接收任意一个边缘服务器上报的超时提示信息,所述超时提示信息用于指示所述任意一个边缘服务器执行完成相匹配的子任务所需时长大于相匹配的子任务对应的执行时长阈值,需重新分配新的边缘服务器执行所述任意一个边缘服务器相匹配的子任务。
  8. 如权利要求1所述的方法,其特征在于,所述第一算力资源包括以下任意一项或多项:图形处理器算力资源、中央处理器算力资源、内存、网络带宽以及网络吞吐量;其中,所述图形处理器算力资源包括如下至少一种:所述图形处理器每秒所执行的浮点运算次数以及所述图形处理器每次所执行的操作次数;所述中央处理器算力资源包括如下至少一种:所述中央处理器每秒所执行的浮点运算次数以及所述中央处理器每秒所执行的操作次数。
  9. 如权利要求1所述的方法,所述确定执行离线任务所需的第一算力资源,包括:
    基于任务类型和计算复杂度之间的对应关系,确定与所述离线任务的任务类型对应的计算复杂度;
    根据确定的计算复杂度从历史离线任务中查找至少一个匹配历史离线任务,每个匹配历史离线任务对应的计算复杂度与确定的计算复杂度相匹配;
    基于执行每个匹配历史离线任务所使用的算力资源对所述离线任务所需的算力资源进行估算,得到执行所述离线任务所需的第一算力资源。
  10. 一种业务处理方法,其特征在于,所述业务处理方法由用于执行离线任务的N个边缘服务器中的一个边缘服务器执行,所述N个边缘服务器中运行有云应用,所述方法包括:
    接收管理服务器分布式调度的分布式离线任务,所述分布式离线任务包括所述管理服务器接收到的所述离线任务;或者,所述分布式离线任务包括N个子任务中与所述边缘服务器相匹配的子任务,所述N个子任务是基于N个边缘服务器中每个边缘服务器的空闲算力资源对所述离线任务进行分割处理得到;
    在保证所述云应用正常执行的情况下,采用所述边缘服务器的空闲算力资源执行所述分布式离线任务。
  11. 如权利要求10所述的方法,其特征在于,所述分布式离线任务对应一个执行时长阈值,所述采用所述边缘服务器的空闲算力资源执行所述分布式离线任务,包括:
    基于所述边缘服务器的空闲算力资源确定执行完成所述分布式离线任务所需时长;
    如果所需时长小于所述分布式离线任务对应的执行时长阈值,则采用所述边缘服务器的空闲算力资源执行所述分布式离线任务。
  12. 如权利要求11所述的方法,其特征在于,所述边缘服务器的空闲算力资源是指所述边缘服务器除运行所述云应用所需的第二算力资源外的剩余算力资源,所述方法还包括:
    在执行所述分布式离线任务的过程中,当监测到所述边缘服务器中运行所述云应用所需的资源大于所述第二算力资源时,获取所述分布式离线任务在所述边缘服务器中的停留时长;
    根据所述停留时长与所述执行时长阈值之间的关系,执行算力释放操作;其中,所述算力释放操作包括暂停执行所述分布式离线任务或者终止执行所述分布式离线任务;如果所述停留时长与所述执行时长阈值之间的时间差大于时间差阈值,则所述算力释放操作包括暂停执行所述分布式离线任务;如果所述停留时长与所述执行时长阈值之间的时间差小于时间差阈值,则所述算力释放操作包括终止执行所述分布式离线任务。
  13. 如权利要求12所述的方法,其特征在于,所述算力释放操作包括暂停执行所述分布式离线任务,在执行算力释放操作后,所述方法还包括:
    定期检测所述边缘服务器的空闲算力资源;
    若所述边缘服务器的空闲算力资源大于所述第一算力资源,则启动执行所述分布式离线任务;
    若所述边缘服务器的空闲算力资源小于所述第一算力资源,且所述分布式离线任务在所述边缘服务器中的停留时长与所述执行时长阈值之间的差值小于时间差阈值,则终止执行所述分布式离线任务。
  14. 如权利要求11所述的方法,其特征在于,所述方法还包括:
    在所述边缘服务器执行所述分布式离线任务的过程中,如果预测到所述边缘服务器执行完成所述分布式离线任务所需时长大于所述执行时长阈值,则向所述管理服务器发送超时提示信息,所述超时提示信息用于指示所述边缘服务器执行完成所述分布式离线任务所需时长大于所述执行时长阈值,所述管理服务器需重新分配新的边缘服务器执行所述分布式离线任务。
  15. 一种业务处理装置,其特征在于,包括:
    确定单元,用于确定执行离线任务所需的第一算力资源;
    所述确定单元,还用于确定用于执行所述离线任务的N个边缘服务器,所述N个边缘服务器中运行有云应用;所述N个边缘服务器的空闲算力资源大于所述第一算力资源,所述N个边缘服务器的空闲算力资源是指所述N个边缘服务器中各个边缘服务器的空闲算力资源之和,N为大于或等于1的整数;
    调度单元,用于将所述离线任务分布式调度至所述N个边缘服务器,以使所述N个边缘服务器中每个边缘服务器在保证所述云应用正常运行的情况下,使用每个边缘服务器中的空闲算力资源执行所述离线任务。
  16. 如权利要求15所述的装置,其特征在于,所述调度单元,还用于:
    基于所述N个边缘服务器中每个边缘服务器的空闲算力资源,将所述离线任务分割为N个子任务,所述N个子任务中每个子任务与一个边缘服务器相匹配;所述每个子任务相匹配的边缘服务器的空闲算力资源大于执行所述每个子任务所需的算力资源;
    将所述每个子任务分别分配至所述每个子任务相匹配的边缘服务器,以使每个边缘服务器执行相匹配的子任务。
  17. 一种业务处理装置,其特征在于,包括:
    接收单元,用于接收管理服务器分布式调度的分布式离线任务,所述分布式离线任务包括所述管理服务器接收到的离线任务,或者所述分布式离线任务包括N个子任务中与边缘服务器相匹配的子任务,所述N个子任务是基于N个边缘服务器中每个边缘服务器的空闲算力资源对所述离线任务进行分割处理得到的;所述N个边缘服务器用于执行所述离线任务,所述N个边缘服务器中运行有云应用;
    执行单元,用于在保证所述云应用正常执行的情况下,采用所述边缘服务器中的空闲算力资源执行所述分布式离线任务。
  18. 一种服务器,包括存储器和处理器,所述存储器存储有计算机可读指令,其特征在于,所述处理器执行所述计算机可读指令时实现权利要求1-9任一项所述的业务处理方法;或者处理器执行所述计算机可读指令时实现权利要求10-14任一项所述的业务处理方法。
  19. 一种计算机存储介质,其特征在于,所述计算机存储介质存储有计算机可读指令,所述计算机可读指令被处理器执行时用于执行如权利要求1-9任一项所述的业务处理方法;或者,所述计算机可读指令被处理器执行时用于执行如权利要求10-14任一项所述的业务处理方法。
  20. 一种计算机程序产品,包括计算机可读指令,其特征在于,所述计算机可读指令被处理器执行时实现权利要求1至9中任一项所述的业务处理方法;或者,所述计算机可读指令被处理器执行时实现权利要求10至14中任一项所述的业务处理方法。
PCT/CN2022/106367 2021-08-02 2022-07-19 业务处理方法、装置、服务器、存储介质和计算机程序产品 WO2023011157A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP22851871.8A EP4383074A4 (en) 2021-08-02 2022-07-19 SERVICE PROCESSING METHOD AND APPARATUS, SERVER, STORAGE MEDIUM AND COMPUTER PROGRAM PRODUCT
US18/462,164 US20230418670A1 (en) 2021-08-02 2023-09-06 Service processing method and apparatus, server, storage medium and computer program product

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110884435.7 2021-08-02
CN202110884435.7A CN113608871A (zh) 2021-08-02 2021-08-02 业务处理方法及装置

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/462,164 Continuation US20230418670A1 (en) 2021-08-02 2023-09-06 Service processing method and apparatus, server, storage medium and computer program product

Publications (1)

Publication Number Publication Date
WO2023011157A1 true WO2023011157A1 (zh) 2023-02-09

Family

ID=78306580

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/106367 WO2023011157A1 (zh) 2021-08-02 2022-07-19 业务处理方法、装置、服务器、存储介质和计算机程序产品

Country Status (4)

Country Link
US (1) US20230418670A1 (zh)
EP (1) EP4383074A4 (zh)
CN (1) CN113608871A (zh)
WO (1) WO2023011157A1 (zh)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116566696A (zh) * 2023-05-22 2023-08-08 天津施维科技开发有限公司 一种基于云计算的安全性评估系统及方法
CN116665423A (zh) * 2023-07-27 2023-08-29 国网山东省电力公司滨州市滨城区供电公司 一种电缆通道施工监测预警系统及方法
CN116680086A (zh) * 2023-07-25 2023-09-01 联通沃音乐文化有限公司 一种基于离线渲染引擎的调度管理系统
CN116955342A (zh) * 2023-09-20 2023-10-27 彩讯科技股份有限公司 业务数据一致率校验方法和装置
CN118364410A (zh) * 2024-06-17 2024-07-19 湖南绿禾环保科技有限公司 基于多传感器的催化燃烧异常数据处理方法

Families Citing this family (4)

Publication number Priority date Publication date Assignee Title
CN113608871A (zh) * 2021-08-02 2021-11-05 腾讯科技(深圳)有限公司 业务处理方法及装置
CN115269198A (zh) * 2022-08-10 2022-11-01 抖音视界有限公司 基于服务器集群的访问请求处理方法及相关设备
CN117032911A (zh) * 2023-04-23 2023-11-10 北京远舢智能科技有限公司 一种基于分布式调度引擎的任务调度方法及装置
CN118118531B (zh) * 2024-04-30 2024-07-16 山东浪潮智慧建筑科技有限公司 一种建筑智能化物联网控制方法及系统

Citations (5)

Publication number Priority date Publication date Assignee Title
US20210011765A1 (en) * 2020-09-22 2021-01-14 Kshitij Arun Doshi Adaptive limited-duration edge resource management
CN112559182A (zh) * 2020-12-16 2021-03-26 北京百度网讯科技有限公司 资源分配方法、装置、设备及存储介质
CN113018871A (zh) * 2021-04-19 2021-06-25 腾讯科技(深圳)有限公司 业务处理方法、装置及存储介质
CN113157418A (zh) * 2021-04-25 2021-07-23 腾讯科技(深圳)有限公司 服务器资源分配方法和装置、存储介质及电子设备
CN113608871A (zh) * 2021-08-02 2021-11-05 腾讯科技(深圳)有限公司 业务处理方法及装置

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
US8381015B2 (en) * 2010-06-30 2013-02-19 International Business Machines Corporation Fault tolerance for map/reduce computing
US10761900B1 (en) * 2015-04-30 2020-09-01 V2Com S.A. System and method for secure distributed processing across networks of heterogeneous processing nodes
CN109947565B (zh) * 2019-03-08 2021-10-15 北京百度网讯科技有限公司 用于分配计算任务的方法和装置

Patent Citations (5)

Publication number Priority date Publication date Assignee Title
US20210011765A1 (en) * 2020-09-22 2021-01-14 Kshitij Arun Doshi Adaptive limited-duration edge resource management
CN112559182A (zh) * 2020-12-16 2021-03-26 北京百度网讯科技有限公司 资源分配方法、装置、设备及存储介质
CN113018871A (zh) * 2021-04-19 2021-06-25 腾讯科技(深圳)有限公司 业务处理方法、装置及存储介质
CN113157418A (zh) * 2021-04-25 2021-07-23 腾讯科技(深圳)有限公司 服务器资源分配方法和装置、存储介质及电子设备
CN113608871A (zh) * 2021-08-02 2021-11-05 腾讯科技(深圳)有限公司 业务处理方法及装置

Non-Patent Citations (1)

Title
See also references of EP4383074A4 *

Cited By (9)

Publication number Priority date Publication date Assignee Title
CN116566696A (zh) * 2023-05-22 2023-08-08 天津施维科技开发有限公司 一种基于云计算的安全性评估系统及方法
CN116566696B (zh) * 2023-05-22 2024-03-29 深圳市众志天成科技有限公司 一种基于云计算的安全性评估系统及方法
CN116680086A (zh) * 2023-07-25 2023-09-01 联通沃音乐文化有限公司 一种基于离线渲染引擎的调度管理系统
CN116680086B (zh) * 2023-07-25 2024-04-02 联通沃音乐文化有限公司 一种基于离线渲染引擎的调度管理系统
CN116665423A (zh) * 2023-07-27 2023-08-29 国网山东省电力公司滨州市滨城区供电公司 一种电缆通道施工监测预警系统及方法
CN116665423B (zh) * 2023-07-27 2023-10-31 国网山东省电力公司滨州市滨城区供电公司 一种电缆通道施工监测预警系统及方法
CN116955342A (zh) * 2023-09-20 2023-10-27 彩讯科技股份有限公司 业务数据一致率校验方法和装置
CN116955342B (zh) * 2023-09-20 2023-12-15 彩讯科技股份有限公司 业务数据一致率校验方法和装置
CN118364410A (zh) * 2024-06-17 2024-07-19 湖南绿禾环保科技有限公司 基于多传感器的催化燃烧异常数据处理方法

Also Published As

Publication number Publication date
CN113608871A (zh) 2021-11-05
EP4383074A4 (en) 2024-08-14
EP4383074A1 (en) 2024-06-12
US20230418670A1 (en) 2023-12-28

Similar Documents

Publication Publication Date Title
WO2023011157A1 (zh) 业务处理方法、装置、服务器、存储介质和计算机程序产品
US20210250249A1 (en) System and Method for Providing Dynamic Provisioning Within a Compute Environment
Wang et al. A three-phases scheduling in a hierarchical cloud computing network
Rajguru et al. A comparative performance analysis of load balancing algorithms in distributed system using qualitative parameters
Ebadifard et al. A dynamic task scheduling algorithm improved by load balancing in cloud computing
Xu et al. Adaptive task scheduling strategy based on dynamic workload adjustment for heterogeneous Hadoop clusters
CN103927225A (zh) 一种多核心架构的互联网信息处理优化方法
CN104657221A (zh) 一种云计算中基于任务分类的多队列错峰调度模型及方法
Liu et al. A survey on virtual machine scheduling in cloud computing
CN107977271B (zh) 一种数据中心综合管理系统负载均衡方法
El Khoury et al. Energy-aware placement and scheduling of network traffic flows with deadlines on virtual network functions
Baikerikar et al. Comparison of load balancing algorithms in a grid
CN111209098A (zh) 一种智能渲染调度方法、服务器、管理节点及存储介质
CN114546646A (zh) 处理方法和处理装置
Huang et al. Multi-resource packing for job scheduling in virtual machine based cloud environment
Guo Ant colony optimization computing resource allocation algorithm based on cloud computing environment
CN112988363B (zh) 资源调度方法、装置、服务器和存储介质
CN113419842B (zh) 一种基于JavaScript构建边缘计算微服务的方法、装置
CN114546647A (zh) 调度方法和调度装置
Bagga et al. Moldable load scheduling using demand adjustable policies
CN114489978A (zh) 资源调度方法、装置、设备及存储介质
Moussa et al. Service management in the edge cloud for stream processing of IoT data
Brar et al. A survey of load balancing algorithms in cloud computing
CN104506452A (zh) 一种报文处理方法及装置
Ramesh et al. Load Balancing Technique in Cloud Computing Environment

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22851871

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2022851871

Country of ref document: EP

Effective date: 20240304