CN114281556B - Method and apparatus for adaptively optimizing online resource allocation - Google Patents


Publication number: CN114281556B (application CN202210218555.8A)
Authority: CN (China)
Prior art keywords: algorithm, virtual, synchronizing, processes, algorithm state
Legal status: Active (granted)
Other versions: CN114281556A (Chinese)
Inventors: 方丰斌, 方叶青, 郭宇梁, 杜荣, 解承莹, 杨霖, 朱文豪, 薛涛, 王煜, 王明
Current assignee: Ant Yunchuang Digital Technology Beijing Co ltd
Original assignee: Beijing Ant Cloud Financial Information Service Co ltd (applicant)

Landscapes

  • Hardware Redundancy (AREA)

Abstract

The present disclosure provides a method and apparatus for adaptively optimizing online resource allocation. The method comprises the following steps: receiving an online resource allocation request from a virtual routing unit; processing the received online resource allocation request; generating a processing result; sharing the generated processing result among a plurality of virtual work units; and synchronizing algorithm states among the plurality of virtual work units based on the shared processing results.

Description

Method and apparatus for adaptively optimizing online resource allocation
Technical Field
The present disclosure relates to the field of online resource allocation, and in particular, to a method and apparatus for adaptively optimizing online resource allocation.
Background
In recommendation, search, marketing, and advertising systems, online decision making must on the one hand account for preference metrics such as click-through rate and conversion rate, and on the other hand may be constrained by resources such as funds, cost, and traffic. Maximizing the overall revenue of resource allocation under such limited resources is known as the Online Resource Allocation problem.
In online resource allocation, assume there are K resource constraints $B_k$ ($k = 1, \ldots, K$) on resources such as red packets, coupons, electronic tickets, and consumption vouchers (e.g., the total number of coupons consumed or the total credit amount). For each online decision concerning a service subject $i$ (e.g., each user to whom resources are allocated), the candidate decision set may be discretized into $J$ choices (e.g., $J$ online resource allocation schemes); subject $i$ earns revenue $r_{ij}$ from choice $j$ and consumes $c_{ijk}$ of the $k$-th constrained resource. Online resource allocation can thus be modeled as the Linear Program (LP) that solves for the 0-1 decision variables $x_{ij}$ maximizing overall revenue under the global resource constraints:

$$\max_{x_{ij} \in \{0,1\}} \sum_i \sum_j r_{ij}\, x_{ij} \quad \text{s.t.} \quad \sum_i \sum_j c_{ijk}\, x_{ij} \le B_k \;\; (k = 1, \ldots, K), \qquad \sum_j x_{ij} = 1 \;\; \text{for each } i.$$
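As a concrete illustration of this 0-1 program, the following sketch enumerates a toy instance by brute force (two subjects, two choices, one resource constraint). The instance data and function names are illustrative assumptions, not part of the patent; real deployments would use an LP solver or the approximation algorithms discussed below rather than enumeration.

```python
from itertools import product

# Toy instance of the 0-1 program above (illustrative values only):
# two subjects i, two choices j, one resource constraint (K = 1).
revenue = [[3.0, 5.0],   # r[i][j]: revenue if subject i takes choice j
           [4.0, 6.0]]
cost = [[1.0, 3.0],      # c[i][j]: resource consumed by choice j of subject i
        [2.0, 4.0]]
budget = 5.0             # B: the global resource constraint

def solve_bruteforce(revenue, cost, budget):
    """Enumerate every assignment with exactly one choice per subject and
    keep the feasible assignment with the largest total revenue."""
    n, m = len(revenue), len(revenue[0])
    best_value, best_choice = float("-inf"), None
    for choice in product(range(m), repeat=n):   # choice[i] = selected j
        total_cost = sum(cost[i][j] for i, j in enumerate(choice))
        if total_cost <= budget:                 # resource constraint
            value = sum(revenue[i][j] for i, j in enumerate(choice))
            if value > best_value:
                best_value, best_choice = value, choice
    return best_value, best_choice
```

On this toy instance two assignments attain the optimal revenue of 9 within the budget of 5; the sketch returns the first one found. Enumeration is exponential in the number of subjects, which is exactly why the schemes below resort to planning or statistical approximation.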
Current implementations of online resource allocation generally fall into three categories. (1) Offline model training combined with online scoring: constraint information can be incorporated when the model is trained offline, but this approach is limited by the model update frequency (e.g., the model may be updated only once every tens of minutes), so its timeliness is poor. (2) Real-time planning combined with online serving: this scheme solves for decision variables by real-time planning over the resource allocation results; the decision variables are controllable and can drive online serving. Real-time planning and solving typically takes on the order of minutes, faster than the first scheme, but both schemes lag behind incoming online resource allocation requests and therefore lose actual algorithm performance. (3) Statistics-based approximation: taking the Near Optimal Fast Approximation algorithm as an example, the worst-case performance of the algorithm can be estimated using variance, and near-optimal revenue is obtained by reducing the probability of algorithm failure during per-request decision making. Such algorithms can be distributed, make locally optimal decisions on online requests, and lose little actual algorithm performance.
However, these approaches can only perform local optimization based on local resource conditions and cannot achieve global optimization of resources. In view of this, a new solution is needed to overcome these drawbacks.
Disclosure of Invention
In view of the above, the present disclosure proposes techniques for implementing adaptively optimized online resource allocation based on a statistical approximation method to achieve maximization of global operating efficiency of online resources.
According to an aspect of the present disclosure, there is provided a method for adaptively optimizing online resource allocation, comprising: receiving an online resource allocation request from a virtual routing unit; processing the received online resource allocation request; generating a processing result; sharing the generated processing result among a plurality of virtual work units; and synchronizing algorithm states among the plurality of virtual work units based on the shared processing results.
Optionally, in one example of the above aspect, synchronizing the algorithm state among the plurality of virtual work units further comprises one or more of: for each process in a plurality of processes, synchronizing the algorithm state in the process among a plurality of virtual working units in the same process; and synchronizing the algorithm state among the plurality of processes based on the synchronized in-process algorithm state for each process.
Optionally, in an example of the above aspect, synchronizing the in-process algorithm state between the plurality of virtual work units within the same process is implemented by communicating between the plurality of virtual work units using a memory queue, and synchronizing the algorithm state between the plurality of processes is implemented by communicating between the plurality of processes using a UNIX domain socket.
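The intra-process half of this arrangement can be sketched as follows: threads stand in for virtual work units in the same process, a `queue.Queue` plays the role of the memory queue over which processing results are shared, and a synchronization step folds the shared results into a single in-process algorithm state. The names and the dict-based state layout are illustrative assumptions, not the patent's concrete design.

```python
import queue
import threading

# Two "virtual work units" (threads) in one process publish processing
# results to a shared memory queue; a synchronizer merges the results
# into a single in-process algorithm state.
result_queue = queue.Queue()
state_lock = threading.Lock()
algorithm_state = {"revenue": 0.0, "consumed": 0.0}   # shared in-process state

def work_unit(unit_id, requests):
    # Process each request and share the result over the memory queue.
    for revenue, consumed in requests:
        result_queue.put({"unit": unit_id, "revenue": revenue,
                          "consumed": consumed})

def synchronize(expected_results):
    # Drain the queue and fold every shared result into the state.
    for _ in range(expected_results):
        result = result_queue.get()
        with state_lock:
            algorithm_state["revenue"] += result["revenue"]
            algorithm_state["consumed"] += result["consumed"]

units = [threading.Thread(target=work_unit, args=(i, [(1.0, 0.5)] * 3))
         for i in range(2)]
for unit in units:
    unit.start()
for unit in units:
    unit.join()
synchronize(expected_results=6)   # 2 units x 3 requests each
```

Because the queue lives in process memory, no serialization or network hop is involved, which is the stated advantage of the memory-queue mode over socket-based communication.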
Optionally, in an example of the above aspect, synchronizing the algorithm state among the plurality of virtual work units further comprises one or more of: for each of a plurality of container groups, wherein each container group comprises one or more processes, synchronizing states of algorithms within the container group among the processes within the same container group; and synchronizing the algorithm state among the plurality of container groups based on the synchronized intra-container group algorithm state for each container group.
Optionally, in an example of the above aspect, synchronizing the algorithmic state within a container group among a plurality of processes within the same container group is implemented by communicating among the plurality of processes using UNIX domain sockets, and synchronizing the algorithmic state among the plurality of container groups is implemented by communicating among the plurality of container groups using sockets.
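The intra-Pod case can likewise be sketched with a connected pair of UNIX domain datagram sockets standing in for two processes in the same container group: each side publishes its local algorithm state and merges what it receives from its peer, so both ends converge on the same synchronized state. This is a minimal Linux-oriented sketch with an assumed JSON state encoding, not the patent's wire format.

```python
import json
import socket

# A connected pair of UNIX domain datagram sockets stands in for two
# processes in the same Pod; each "process" sends its local algorithm
# state and merges the state it receives from its peer.
def merge_states(local, remote):
    return {key: local[key] + remote[key] for key in local}

side_a, side_b = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)

state_a = {"revenue": 4.0, "consumed": 2.0}   # local state of "process A"
state_b = {"revenue": 6.0, "consumed": 3.0}   # local state of "process B"

side_a.send(json.dumps(state_a).encode())     # A publishes its state
side_b.send(json.dumps(state_b).encode())     # B publishes its state

# Each side receives the peer's state and merges it with its own,
# so both arrive at the same synchronized algorithm state.
merged_at_a = merge_states(state_a, json.loads(side_a.recv(4096)))
merged_at_b = merge_states(state_b, json.loads(side_b.recv(4096)))

side_a.close()
side_b.close()
```

Across Pods the same exchange would run over ordinary (network) sockets instead of the UNIX domain variety, at the cost of network overhead.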
Optionally, in one example of the above aspect, the algorithm state comprises one or more of: real-time information for each virtual unit of work, first order information for each virtual unit of work, and second order information for each virtual unit of work, wherein the real-time information includes information related to current resource consumption, the first order information includes information related to resource consumption for processing each task, and the second order information includes information related to resource consumption for a target task.
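One hypothetical way to hold the three kinds of information named above in a single structure, with a merge rule for combining the states of two virtual work units, is sketched below; the field names and the averaging rule for first-order information are assumptions for illustration.

```python
from dataclasses import dataclass

# Hypothetical container for the three kinds of algorithm-state
# information; the field names and the merge rule are assumptions.
@dataclass
class AlgorithmState:
    consumed: float = 0.0         # real-time info: current resource consumption
    per_task_cost: float = 0.0    # first-order info: consumption per task
    target_variance: float = 0.0  # second-order info: variance on the target task

    def merge(self, other):
        # Combine the states of two virtual work units into one view:
        # consumption and variance add; per-task cost is averaged.
        return AlgorithmState(
            consumed=self.consumed + other.consumed,
            per_task_cost=(self.per_task_cost + other.per_task_cost) / 2,
            target_variance=self.target_variance + other.target_variance,
        )
```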
According to another aspect of the present disclosure, there is provided an apparatus for adaptively optimizing online resource allocation, comprising: a virtual routing unit configured to: receiving an online resource allocation request, and forwarding the received online resource allocation request to a virtual work unit; a virtual work unit configured to: processing the received online resource allocation request, generating a processing result and sharing the generated processing result among a plurality of virtual working units; and an algorithm state synchronization unit configured to synchronize an algorithm state among the plurality of virtual work units based on the shared processing result.
According to another aspect of the present disclosure, there is provided a computing device comprising: at least one processor, and a memory coupled with the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform a method for adaptively optimizing online resource allocation as described above.
According to another aspect of the present disclosure, there is provided a machine-readable medium storing executable instructions that when executed cause the machine to perform the method for adaptively optimizing online resource allocation as described above.
In embodiments of the present disclosure, the roles of the service router and the algorithm work unit may be abstracted or virtualized so that they are no longer bound to a specific process and/or thread structure. With an abstracted algorithm master control unit, the deployment form, communication mode, and process parameters of the virtualized service routers and algorithm work units can be optimized along the global dimension, thereby maximizing global operating efficiency.
Drawings
A further understanding of the nature and advantages of the present disclosure may be realized by reference to the following drawings. In the drawings, similar components or features may have the same reference numerals. The accompanying drawings, which are included to provide a further understanding of the embodiments of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the detailed description serve to explain the embodiments of the disclosure without limiting the embodiments of the disclosure.
Fig. 1 shows an architecture diagram of a local optimization scheme in current resource allocation.
FIG. 2 shows a schematic architecture diagram of algorithm state synchronization in an online resource allocation scheme according to an embodiment of the present disclosure.
FIG. 3 shows a schematic architecture diagram of another algorithm state synchronization in an online resource allocation scheme according to an embodiment of the present disclosure.
FIG. 4 shows a schematic flow chart diagram of a method for adaptively optimizing online resource allocation according to an embodiment of the present disclosure.
FIG. 5 shows a schematic block diagram of an apparatus for adaptively optimizing online resource allocation according to an embodiment of the present disclosure.
FIG. 6 illustrates a schematic hardware block diagram of a computing device for adaptively optimizing online resource allocation, according to an embodiment of the present disclosure.
Detailed Description
The subject matter described herein will now be discussed with reference to example embodiments. It should be understood that these embodiments are discussed only to enable those skilled in the art to better understand and thereby implement the subject matter described herein, and are not intended to limit the scope, applicability, or examples set forth in the disclosure. Changes may be made in the function and arrangement of elements discussed without departing from the scope of the disclosure. Various examples may omit, substitute, or add various procedures or components as needed. For example, the described methods may be performed in an order different from that described, and various steps may be added, omitted, or combined. In addition, features described with respect to some examples may also be combined in other examples.
As used herein, the term "include" and its variants are open-ended, meaning "including, but not limited to". The term "based on" means "based at least in part on". The terms "one embodiment" and "an embodiment" mean "at least one embodiment". The term "another embodiment" means "at least one other embodiment". The terms "first," "second," and the like may refer to different or the same objects. Other definitions, whether explicit or implicit, may be included below. The definition of a term is consistent throughout the specification unless the context clearly dictates otherwise.
As used herein, the term "online resource allocation" refers to the allocation of resources, such as electronic coupons, consumer coupons, red packs, and the like, to a requesting user by a resource allocation system, such as an e-commerce platform issuing consumer coupons to users and the like.
As used herein, the term "virtual routing unit" refers to a service router for routing or forwarding requests, which may serve as an entry for an online service, receive online requests from external users at the data plane using virtual IP, and forward or route the received requests to virtual work units mounted in a process, such as currently idle virtual work units. In some examples, requests from external users are sent uniformly in the data plane to service routers that utilize the virtual IP.
As used herein, the term "virtual work unit" refers to an "algorithmic work unit" or "algorithmic instance" configured to receive online requests forwarded by a service router for solution or processing, and to generate processing results such as decision variable information. In embodiments of the present disclosure, service routers and algorithm work units are virtualized or abstracted into logical basic units and are not tied to a specific process or thread structure.
As used herein, the term "control unit" refers to an "algorithm master" or "algorithm master control unit", which is responsible for the lifecycle management of algorithms, including, for example, dynamic adjustment of the deployment form of the algorithm work units, the communication modes of the service routers and algorithm work units, the number of routing units and/or algorithm work units, and the like. In the present disclosure, a deployment form may specify, for example, whether several algorithm work units are deployed in one process or across multiple processes; the communication modes may include, for example, communication between work units within the same process, between work units in different processes, and between work units on different machines or container groups; and the process parameters may include, for example, the total resource size of the machine and the size of the resources occupied by each algorithm work unit.
FIG. 1 shows an architectural diagram 100 of a local optimization scheme in current online resource allocation.
As shown in FIG. 1, cluster 104 includes multiple Pods (container groups) 106-1, 106-2, where each Pod includes one or more processes, such as processes 108-1, 108-2, 108-3. In some examples, each Pod may be implemented by, for example, one machine. There is one service router in each process (e.g., service routers 110-1, 110-2, 110-3 shown in FIG. 1). The service routers 110-1, 110-2, 110-3 each start one or more algorithm work unit threads (shown as shaded ovals in FIG. 1) within the corresponding process, and those threads receive requests from outside, e.g., from the data plane 102, through the corresponding service router.
The service routers 110-1, 110-2, 110-3 forward the received requests to the corresponding algorithm work unit threads, which then compute and solve the received requests. In FIG. 1, the algorithm work unit threads within a process share one algorithm state, that is, the algorithm state of all algorithm work unit threads in a process is the same. The service router may adjust the number of algorithm work unit threads within its process based on the amount of traffic or requests. In the control plane 112 (indicated by the dashed line in FIG. 1), the algorithm states of the plurality of processes may be synchronized, for example by a synchronization unit, to obtain a synchronized algorithm state across processes. However, the deployment scheme and synchronization method shown in FIG. 1 couple the algorithm implementation with the online service, which makes the algorithm structure difficult to reuse and extend, and the resulting optimization can only be local, based on per-process traffic conditions, rather than global.
In view of this, the present disclosure provides a globally optimized online resource allocation method and apparatus, which adjust the deployment form, communication mode, and process parameters of a service router and an algorithm working unit according to the actual operating conditions of the service router and the algorithm working unit, thereby maximizing the operating efficiency of an online resource allocation system.
A method and apparatus for adaptively optimizing online resource allocation in accordance with embodiments of the present disclosure will now be described with reference to the accompanying drawings.
FIG. 2 illustrates an exemplary architecture diagram 200 for algorithm state synchronization in an online resource allocation scheme according to the present disclosure. In the example shown in FIG. 2, the algorithm state synchronization includes algorithm state synchronization between algorithm work units across processes.
As shown in FIG. 2, the cluster 204 includes a plurality of Pods 206-1, 206-2, 206-3, where each Pod includes one or more processes. For example, Pod 206-1 includes process 208-1 and process 208-2, Pod 206-2 includes process 208-3, and Pod 206-3 includes process 208-4. In embodiments of the present disclosure, a statistical-based approximation method may be employed for online resource allocation. In some implementations, the algorithm unit may be decomposed into multiple algorithm instances for distributed deployment, where each algorithm instance processes the online resource allocation request after receiving it, and updates its own algorithm state after generating a processing result. In implementations of the present disclosure, algorithm states may be synchronized between these different algorithm instances.
In process 208-4, there is a control unit, such as the algorithm master control unit 216 shown in FIG. 2, which may be configured to be responsible for the lifecycle management of the algorithm, such as dynamically setting or adjusting the deployment form, communication mode, number, etc. of the service routers and algorithm work units. In some implementations, the algorithm master control unit 216 collects the processing capabilities and operating conditions of each service router 210-1, 210-2, 210-3 and each algorithm work unit 212-1, 212-2, 212-3, 212-4, and adjusts the deployment form, communication mode, number, etc. of each service router and/or algorithm work unit accordingly. In embodiments of the present disclosure, the processing capability of a service router may include, but is not limited to, its capacity for receiving requests, e.g., how many requests per second it can receive; the processing capability of an algorithm work unit may include, but is not limited to, its capacity for processing requests, e.g., how many requests per second it can process. The operating condition of a service router may include, but is not limited to, whether received requests are queuing at the service router; the operating condition of an algorithm work unit may include, but is not limited to, how long the algorithm work unit takes to process each request or task, and the like.
In embodiments of the present disclosure, the number of algorithm work units coupled to each service router may be adjusted according to the processing capability of the service router and/or the number of requests received. For example, when the processing capability of the service router is N times that of an algorithm work unit, the algorithm master control unit 216 may deploy one service router unit and N algorithm work units within one process (e.g., process 208-1 in Pod 206-1 shown in FIG. 2). In this case, the algorithm master control unit 216 may set the message communication mode among the N algorithm work units to a memory queue, so that no network overhead is incurred. In another example, when one process cannot host N algorithm work units alongside the service router (i.e., the amount of resources within one process cannot satisfy N algorithm work units), the algorithm master control unit 216 may move some algorithm work units to other processes within the same Pod (e.g., the different processes 208-1, 208-2 within Pod 206-1 shown in FIG. 2) or to processes within different Pods (e.g., the different Pods 206-1 and 206-2 shown in FIG. 2). When moving algorithm work units to other processes within the same Pod, the algorithm master control unit 216 may set the message communication mode between algorithm work units in different processes of the local Pod to UNIX domain sockets. When moving algorithm work units to processes within different Pods, the algorithm master control unit 216 may set the message communication mode between algorithm work units on the local and remote machines (or the local Pod and other Pods) to sockets.
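The placement rule described in this paragraph can be condensed into a small sketch: pair one service router with N work units, and pick the cheapest communication mode the chosen placement allows (memory queue in-process, UNIX domain socket across processes in a Pod, plain socket across Pods). The capacity model and all names are illustrative assumptions.

```python
# Capacity-driven placement: one service router is paired with N work
# units, and the communication mode follows from where the units land.
def plan_deployment(router_qps, unit_qps, process_capacity):
    """Return (units_per_router, placement, communication_mode)."""
    n = max(1, router_qps // unit_qps)   # router can feed N work units
    if n <= process_capacity:
        # All N units fit in the router's process: a memory queue
        # avoids any network overhead.
        return n, "same_process", "memory_queue"
    # Otherwise spill work units into other processes of the same Pod.
    return n, "other_process_same_pod", "unix_domain_socket"

def comm_mode_for(placement):
    # Communication mode implied by the relative placement of two units.
    return {
        "same_process": "memory_queue",
        "other_process_same_pod": "unix_domain_socket",
        "other_pod": "socket",
    }[placement]
```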
In embodiments of the present disclosure, the deployment form and/or communication mode of the service routers and algorithm work units is adjusted dynamically according to their processing capabilities and/or operating conditions. In some examples, the processing capability of a service router includes, but is not limited to, its capacity for receiving requests (e.g., how many requests per second it can receive), and the processing capability of an algorithm work unit includes, but is not limited to, its capacity for processing requests (e.g., how many requests per second it can process). In other examples, the operating condition of a service router includes, but is not limited to, whether received requests are queuing at the service router, and the operating condition of an algorithm work unit may include, but is not limited to, how long it takes to process each request or task, and so on. In one example of the present disclosure, the deployment form of a service router may include, but is not limited to, the number of algorithm work units deployed under the service router; and the deployment form of the algorithm work units may include whether multiple algorithm work units are deployed in one process or in different processes, for example, different processes in the same Pod or in different Pods.
In one example of the present disclosure, the communication manner of the service router may include, but is not limited to, a communication manner between service routers in different processes in the same Pod, a communication manner between service routers in different processes in different pods, and the like; the communication mode of the algorithm work unit may include, but is not limited to, a communication mode of the algorithm work unit in the same process, a communication mode between algorithm work units of different processes in the same Pod or different pods, and the like.
In an embodiment of the present disclosure, the algorithm master control unit 216 may monitor resource usage within a process and adjust the process parameters according to the monitored usage. For example, when resource usage within a process is low, the algorithm master control unit 216 may decrease the maximum resource amount threshold for that process; when resource usage within a process is high, it may increase that threshold.
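A minimal sketch of this monitoring rule follows; the utilization bounds and step factor are assumed example values, since the disclosure does not specify concrete numbers.

```python
# The master unit compares observed usage against a process's maximum
# resource threshold and nudges the threshold toward actual demand.
# The utilization bounds and step factor are assumed example values.
def adjust_threshold(usage, threshold, low=0.5, high=0.9, step=1.25):
    utilization = usage / threshold
    if utilization < low:        # over-provisioned: shrink the cap
        return threshold / step
    if utilization > high:       # near the cap: grow it
        return threshold * step
    return threshold             # within bounds: leave unchanged
```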
In embodiments of the present disclosure, after the process parameters of a process are adjusted by the algorithm master control unit 216, the algorithm master control unit 216 may further dynamically adjust the lifecycles of the service routers and algorithm work units in that process according to the adjusted process parameters, for example, their deployment forms and/or communication modes and/or numbers. For example, when the maximum resource amount threshold of a process increases, the number of algorithm work units within the process may be increased, or algorithm work units in other processes of the same Pod or of different Pods may be moved into the process; when the maximum resource amount threshold of a process decreases, the number of algorithm work units within the process may be decreased, or some may be moved to a different process in the same Pod or to processes in a different Pod. In addition, the algorithm master control unit 216 may also adjust the lifecycles of the service routers and algorithm work units in a process according to information reported by the algorithm work units, including, but not limited to, queries per second (QPS), request processing response time, garbage collection (GC) conditions, memory usage, and the like.
In the example shown in FIG. 2, the algorithm states are synchronized among all of the algorithm work units 212-1, 212-2, 212-3, and 212-4 in the cluster 204 in the control plane 214, with each algorithm work unit having its own algorithm state. In embodiments of the present disclosure, the algorithm states may be synchronized in any suitable manner, including but not limited to the following: a centralized role acts as a synchronization unit that receives the algorithm state from each algorithm work unit, determines the overall algorithm state after integrating the states of all algorithm work units, and synchronizes that overall state back to each algorithm work unit; or each algorithm work unit reports or stores its algorithm state to a designated storage area, and so on.
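The centralized variant named first can be sketched as follows: a single synchronization role collects each work unit's state, folds the states into one overall state, and hands that overall state back to every unit. The dict-based state layout is an assumption for illustration.

```python
# A single synchronization role gathers every work unit's local state,
# folds the states into one overall state, and hands the overall state
# back to each unit.
def synchronize_global(unit_states):
    total = {"revenue": 0.0, "consumed": 0.0}
    for state in unit_states.values():
        for key in total:
            total[key] += state[key]
    # Every work unit continues from the same synchronized state.
    return {unit: dict(total) for unit in unit_states}
```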
As shown in FIG. 2, the algorithm states of the algorithm work units within the same process, such as algorithm work units 212-1 and 212-2 in process 208-1, may be synchronized first. After the algorithm states within the same process are synchronized, the algorithm states of different processes in the same Pod may be synchronized, such as those of processes 208-1 and 208-2 within Pod 206-1. Herein, the algorithm state may include, but is not limited to, real-time information, first-order information, and second-order information, where the real-time information includes information related to current resource consumption (e.g., how much revenue has been generated and/or how many resources have been consumed so far); the first-order information includes information related to the resource consumption of processing each task (e.g., how much revenue is generated and/or how many resources are consumed per task or request); and the second-order information includes information related to the resource consumption of a target task (e.g., the variance, on the target task or its consumed resources, induced by task randomness).
FIG. 3 shows an exemplary architecture diagram 300 for another algorithm state synchronization in an online resource allocation scheme according to an embodiment of the disclosure. In the example shown in FIG. 3, the algorithm state synchronization includes algorithm state synchronization between different algorithm work units across Pod or multiple machines.
As shown in FIG. 3, the cluster 304 includes multiple Pods 306-1, 306-2, 306-3, where each Pod includes one or more processes. For example, Pod 306-1 includes process 308-1 and process 308-2, Pod 306-2 includes process 308-3, and Pod 306-3 includes process 308-4. In embodiments of the present disclosure, a statistical-based approximation method may be employed for online resource allocation. In some implementations, an algorithm unit can be decomposed into multiple algorithm instances for distributed deployment, where each algorithm instance processes an online request after receiving it and updates its own algorithm state after generating a processing result. In implementations of the present disclosure, algorithm states may be synchronized between these different algorithm instances.
In process 308-4, there is a control unit, such as algorithm master unit 316 shown in FIG. 3, which may be used to take care of the lifecycle management of the algorithms. In some implementations, the algorithm master unit 316 collects the processing capabilities and operating conditions of each service router 310-1, 310-2, 310-3 and each algorithm work unit 312-1, 312-2, 312-3, 312-4, and adjusts the deployment configuration, communication manner, number, etc. of each service router and/or algorithm work unit according to the collected processing capabilities and operating conditions.
In embodiments of the present disclosure, the number of algorithm work units coupled to each service router may be adjusted according to the processing capability of the service router and/or the number of requests received. For example, when the processing capability of the service router is N times that of an algorithm work unit, the algorithm master control unit 316 may deploy one service router unit and N algorithm work units in one process (e.g., process 308-1 in Pod 306-1 and the algorithm work units 312-1, 312-2 therein, as shown in FIG. 3). In another example, when one process cannot host N algorithm work units alongside the service router (i.e., the amount of resources in one process cannot satisfy N algorithm work units), the algorithm master control unit 316 may move some algorithm work units to other processes in the same Pod.
In an embodiment of the present disclosure, the algorithm master control unit 316 may monitor resource usage in a process and adjust process parameters of the process according to the monitored resource usage. When the process parameter of a process is adjusted by the algorithm main control unit 316, the algorithm main control unit 316 may further dynamically adjust the life cycle of the service router and the algorithm working unit in the process according to the adjusted process parameter, for example, the deployment form and/or the communication manner and/or the number of the service router and the algorithm working unit. In some examples herein, the operation or function of algorithm master unit 316 is similar to the operation or function of algorithm master unit 216 shown in fig. 2 and, therefore, for the sake of brevity, will not be described in detail herein.
In the example shown in FIG. 3, the algorithm states are synchronized among all of the algorithm work units 312-1, 312-2, 312-3 in the Pod 306-1 in the control plane 314, where each algorithm work unit has its own algorithm state (e.g., block state). In one example, algorithm state synchronization may be performed directly among all algorithm work units (e.g., algorithm work units 312-1, 312-2, 312-3) within the same Pod (e.g., Pod 306-1) based on the states of those algorithm work units. In another example, algorithm state synchronization may be performed first among multiple algorithm work units (e.g., algorithm work units 312-1, 312-2) within the same process (e.g., process 308-1) and then among different processes (e.g., processes 308-1, 308-2) in the same Pod (e.g., Pod 306-1). After synchronizing the algorithm states among all of the algorithm work units 312-1, 312-2, 312-3 in the Pod 306-1, a local global state 1 for the Pod 306-1 may be generated. Similarly, a local global state 2 for the Pod 306-2 may be generated. In this example, one algorithm state may be shared among algorithm work units across different processes of the same Pod; within a single machine (or the same Pod), the algorithm states of multiple processes may be aggregated at the control plane 314 by a parameter server (not shown in the figure); and across multiple machines (or different Pods), the algorithm states of the parameter servers (e.g., the local global states) are synchronized at the control plane 314' at machine (or Pod) granularity. It should be understood that although the control planes 314 and 314' are shown as two dashed boxes in FIG. 3, this illustration is merely for convenience of describing different synchronization stages of the algorithm states, not a limitation on the number or deployment of control planes; in fact, the control planes 314 and 314' shown separately in FIG. 3 may be merged into the same control plane in an implementation.
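The three-stage aggregation above (within a process, then across processes in a Pod to form a "local global state", then across Pods) can be sketched as a repeated merge. This is a minimal sketch under assumed state fields (total consumption and remaining resources); the class and function names are not from the patent.

```python
from dataclasses import dataclass

@dataclass
class AlgoState:
    consumed: float = 0.0   # resources consumed so far
    remaining: float = 0.0  # resources still available

def merge(states) -> AlgoState:
    """Combine several algorithm states into one aggregated state."""
    states = list(states)
    return AlgoState(sum(s.consumed for s in states),
                     sum(s.remaining for s in states))

# Pods -> processes -> per-work-unit states (toy numbers)
pods = [
    [[AlgoState(1, 9), AlgoState(2, 8)], [AlgoState(3, 7)]],  # Pod 1
    [[AlgoState(4, 6)]],                                      # Pod 2
]

# Stage 1 + 2: merge within each process, then across processes,
# yielding each Pod's local global state.
pod_states = [merge(merge(proc) for proc in pod) for pod in pods]
# Stage 3: merge local global states across Pods.
global_state = merge(pod_states)
```

With the toy numbers above, the global state totals 10 consumed and 30 remaining, which every work unit could then read back.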
It should be appreciated that the number of clusters, the number of Pods contained within each cluster, the number of processes contained within each Pod, and the number of service routers and algorithm work units contained within each process, as shown in FIGS. 2 and 3 and described above, are exemplary. In other examples or practical applications, there may be any number of clusters, each cluster may include any number of Pods, each Pod may include any number of processes, and each process may include any number of service routers and algorithm work units.
FIG. 4 shows a schematic flow chart diagram of a method 400 for adaptively optimizing online resource allocation in accordance with an embodiment of the present disclosure.
As shown in fig. 4, in operation 402, an online resource allocation request is received from a virtual routing unit. For example, online resource allocation requests received from users outside the cluster are forwarded by the virtual routing unit to a mounted idle virtual work unit, so that the idle virtual work unit receives the request.
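A routing rule of this kind — prefer an idle mounted work unit, otherwise fall back to the least-loaded one — might be sketched as follows. The dictionary fields and the fallback policy are illustrative assumptions, not taken from the patent.

```python
def route(request, work_units):
    """Forward a request to an idle mounted virtual work unit;
    if none is idle, pick the unit with the fewest in-flight requests."""
    idle = [w for w in work_units if w["in_flight"] == 0]
    target = idle[0] if idle else min(work_units, key=lambda w: w["in_flight"])
    target["in_flight"] += 1
    target["requests"].append(request)
    return target["name"]
```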
In operation 404, the online resource allocation request is processed. In embodiments of the present disclosure, received online resource allocation requests are processed by a virtual work unit.
In operation 406, a processing result for the online resource allocation request is generated.
In operation 408, the generated processing results are shared among the plurality of virtual work units so that each virtual work unit can learn the working conditions of the other virtual work units.
In operation 410, algorithm states are synchronized among the plurality of virtual work units based on the shared processing results. In embodiments of the present disclosure, algorithm states are synchronized among the virtual work units based on the processing results of the respective virtual work units (e.g., including the local algorithm states of the respective virtual work units), such that each virtual work unit may obtain a global algorithm state for the plurality of virtual work units, e.g., a total resource consumption, a total resource remaining, etc., for the plurality of virtual work units. In one example, the algorithm state includes one or more of: real-time information of each virtual work unit, first-order information of each virtual work unit, and second-order information of each virtual work unit. Herein, the real-time information of a virtual work unit indicates how much revenue and/or resources the virtual work unit has currently generated and/or consumed; the first-order information indicates how much revenue and/or resources the virtual work unit generates and/or consumes for processing each task or request; and the second-order information indicates the variance, caused by task randomness, in the revenue generated or resources consumed by the virtual work unit for a target task.
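One way the three state components could be maintained is with a running total (real-time), a running mean (first-order), and a running variance (second-order) updated per request, e.g. via Welford's method. This is a hedged sketch of one plausible bookkeeping scheme, not the patent's actual state layout; all names are hypothetical.

```python
class WorkUnitState:
    """Illustrative per-work-unit state: real-time total, first-order
    per-request mean, and second-order variance of consumption."""

    def __init__(self):
        self.total = 0.0  # real-time: resources consumed so far
        self.n = 0        # number of requests processed
        self.mean = 0.0   # first-order: mean consumption per request
        self.m2 = 0.0     # running sum of squared deviations (Welford)

    def record(self, consumed: float) -> None:
        self.total += consumed
        self.n += 1
        delta = consumed - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (consumed - self.mean)

    @property
    def variance(self) -> float:
        # second-order: population variance of per-request consumption
        return self.m2 / self.n if self.n else 0.0
```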
In one example of the present disclosure, synchronizing the algorithmic state among the plurality of virtual work units comprises one or more of: for each process in a plurality of processes, synchronizing the algorithm state in the process among a plurality of virtual working units in the same process; and synchronizing the algorithm state among the plurality of processes based on the synchronized in-process algorithm state for each process.
In one example of the present disclosure, synchronizing in-process algorithm state among multiple virtual work units within the same process is accomplished by using memory queues to communicate among the multiple virtual work units, and synchronizing the algorithm state among the multiple processes is accomplished by using UNIX domain sockets to communicate among the multiple processes.
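The in-process memory-queue exchange might look like the following toy sketch, with work units modeled as threads pushing their local state onto a shared queue for aggregation. The thread model and field choice are assumptions for illustration.

```python
import queue
import threading

# Memory queue shared by the virtual work units within one process.
q: "queue.Queue[float]" = queue.Queue()

def work_unit(local_consumed: float) -> None:
    # Each work unit publishes its local algorithm state to the queue.
    q.put(local_consumed)

threads = [threading.Thread(target=work_unit, args=(c,)) for c in (1.0, 2.0, 3.0)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Aggregate the published states into the in-process algorithm state.
in_process_total = sum(q.get() for _ in range(3))
```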
Further, in one example of the present disclosure, synchronizing the algorithm state among the plurality of virtual work units further comprises one or more of: for each of a plurality of container groups, wherein each container group comprises one or more processes, synchronizing states of algorithms within the container group among the processes within the same container group; and synchronizing the algorithm state among the plurality of container groups based on the synchronized intra-container group algorithm state for each container group.
In one example of the present disclosure, synchronizing an algorithmic state within a container group among a plurality of processes within the same container group is accomplished by communicating among the plurality of processes using UNIX domain sockets, and synchronizing the algorithmic state among the plurality of container groups is accomplished by communicating among the plurality of container groups using sockets.
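The inter-process UNIX domain socket exchange can be illustrated with `socket.socketpair()`, which returns two already-connected AF_UNIX stream sockets (a convenient stand-in here for a socket created at a filesystem path). The JSON payload shape is an assumption for illustration.

```python
import json
import socket

# Two connected UNIX domain sockets, standing in for two processes.
parent, child = socket.socketpair()

# "Child process" side: send its local algorithm state.
child.sendall(json.dumps({"consumed": 3.0, "remaining": 7.0}).encode())
child.close()

# "Parent process" side: receive and decode the state.
state = json.loads(parent.recv(4096).decode())
parent.close()
```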
Fig. 5 is a block diagram of an apparatus for adaptively optimizing online resource allocation (hereinafter, simply referred to as an online resource allocation apparatus) 500 according to an embodiment of the present disclosure. As shown in fig. 5, the online resource allocation apparatus 500 includes a virtual routing unit 510, a virtual work unit 520, and an algorithm state synchronization unit 530.
The virtual routing unit 510 is configured to receive online resource allocation requests and forward the received requests to the virtual work units. In some examples, the virtual routing unit 510 acts as an entry point for an online service: it receives online resource allocation requests from outside in the data plane and routes the requests to the mounted algorithm work units.
The virtual work unit 520 may be configured to process the received online resource allocation request, generate a processing result, and share the generated processing result among the plurality of virtual work units. In some examples, the virtual work unit 520 may receive an online resource allocation request forwarded by a virtual routing unit, such as a service router, make an optimization decision on the received request, generate decision variable information, and share the generated information among the plurality of virtual work units in any suitable manner.
The algorithm state synchronization unit 530 may be configured to synchronize the algorithm state among the plurality of virtual work units based on processing results shared by the respective virtual work units. In embodiments of the present disclosure, synchronization may be based on the current algorithm state of each of all of the virtual work units. In another embodiment of the present disclosure, algorithm state synchronization may be performed between the virtual workcells in the same process first, and then algorithm state synchronization may be performed between the processes in the same Pod, and/or optionally algorithm state synchronization may be performed between the pods in the same cluster. In one example of the present disclosure, synchronizing in-process algorithm states among a plurality of virtual units of work within the same process is accomplished using memory queues to communicate among the plurality of virtual units of work, and synchronizing the algorithm states among the plurality of processes is accomplished using UNIX domain sockets to communicate among the plurality of processes.
Furthermore, in one example of the present disclosure, the algorithm state synchronization unit 530 may be further configured to perform one or more of the following: for each of a plurality of container groups (Pod), wherein each container group includes one or more processes, synchronizing algorithm states within the container group among the processes within the same container group; and synchronizing the algorithm state among the plurality of container groups based on the synchronized intra-container group algorithm state for each container group. In one example of the present disclosure, synchronizing an algorithmic state within a container group among a plurality of processes within the same container group is accomplished by communicating among the plurality of processes using UNIX domain sockets, and synchronizing the algorithmic state among the plurality of container groups is accomplished by communicating among the plurality of container groups using sockets.
Embodiments of a method and apparatus for adaptively optimizing online resource allocation according to embodiments of the present disclosure are described above with reference to FIGS. 1 through 5. The above apparatus for adaptively optimizing online resource allocation may be implemented in hardware, in software, or in a combination of hardware and software.
FIG. 6 illustrates a hardware block diagram of a computing device 600 for adaptively optimizing online resource allocation, according to an embodiment of the present disclosure. As shown in fig. 6, computing device 600 may include at least one processor 610, non-volatile storage 620, memory 630, and communication interface 640, which are connected together via a bus 660. The at least one processor 610 executes at least one computer-readable instruction (i.e., the elements described above as being implemented in software) stored or encoded in the memory 630.
In one embodiment, computer-executable instructions are stored in the memory 630 that, when executed, cause the at least one processor 610 to: receive an online resource allocation request from a virtual routing unit; process the received online resource allocation request; generate a processing result; share the generated processing result among a plurality of virtual work units; and synchronize algorithm states among the plurality of virtual work units based on the shared processing results.
It should be understood that the computer-executable instructions stored in the memory, when executed, cause the at least one processor 610 to perform the various operations and functions described in connection with fig. 1-5 in the various embodiments of the present disclosure.
In the present disclosure, computing device 600 may include, but is not limited to: personal computers, server computers, workstations, desktop computers, laptop computers, notebook computers, mobile computing devices, smart phones, tablet computers, cellular phones, Personal Digital Assistants (PDAs), handheld devices, messaging devices, wearable computing devices, consumer electronics, and so forth.
According to one embodiment, a program product, such as a machine-readable medium, is provided. A machine-readable medium may store executable instructions (i.e., elements described above as being implemented in software) that, when executed by a machine, cause the machine to perform various operations and functions described in connection with fig. 1-5 in various embodiments of the present disclosure. In particular, a system or apparatus may be provided which is configured with a readable storage medium on which software program code implementing the functionality of any of the embodiments described above is stored and which causes a computer or processor of the system or apparatus to read out and execute instructions stored in the readable storage medium.
In this case, the program code itself read from the readable medium can realize the functions of any of the above-described embodiments, and thus the machine-readable code and the readable storage medium storing the machine-readable code form part of the present invention.
Examples of the readable storage medium include floppy disks, hard disks, magneto-optical disks, optical disks (e.g., CD-ROMs, CD-R, CD-RWs, DVD-ROMs, DVD-RAMs, DVD-RWs), magnetic tapes, nonvolatile memory cards, and ROMs. Alternatively, the program code may be downloaded from a server computer or the cloud by a communication network.
It will be understood by those skilled in the art that various changes and modifications may be made in the above-disclosed embodiments without departing from the spirit of the invention.
It should be noted that not all steps and units in the above flows and system structure diagrams are necessary, and some steps or units may be omitted according to actual needs. The execution order of the steps is not fixed, and can be determined as required. The apparatus structures described in the foregoing embodiments may be physical structures or logical structures, that is, some units may be implemented by the same physical entity, or some units may be implemented by multiple physical entities separately, or some units may be implemented by some components in multiple independent devices together.
In the above embodiments, the hardware units or modules may be implemented mechanically or electrically. For example, a hardware unit, module or processor may comprise permanently dedicated circuitry or logic (such as a dedicated processor, FPGA or ASIC) to perform the corresponding operations. The hardware units or processors may also include programmable logic or circuitry (e.g., a general purpose processor or other programmable processor) that may be temporarily configured by software to perform the corresponding operations. The specific implementation (mechanical, or dedicated permanent, or temporarily set) may be determined based on cost and time considerations.
The detailed description set forth above in connection with the appended drawings describes example embodiments, but is not intended to represent all embodiments that may be practiced or that fall within the scope of the disclosure. The term "exemplary" used throughout this specification means "serving as an example, instance, or illustration," and does not mean "preferred" or "advantageous" over other embodiments. The detailed description includes specific details for the purpose of providing an understanding of the described technology. However, the techniques may be practiced without these specific details. In some instances, well-known structures and devices are shown in block diagram form in order to avoid obscuring the concepts of the described embodiments.
The previous description of the disclosure is provided to enable any person skilled in the art to make or use the disclosure. Various modifications to the disclosure will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other variations without departing from the scope of the disclosure. Thus, the disclosure is not intended to be limited to the examples and designs described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (13)

1. A method for adaptively optimizing online resource allocation, comprising:
receiving an online resource allocation request from a virtual routing unit;
processing the received online resource allocation request;
generating a processing result;
sharing the generated processing result among a plurality of virtual work units; and
synchronizing an algorithm state among the plurality of virtual work units based on the shared processing results, wherein synchronizing the algorithm state among the plurality of virtual work units comprises: based on the processing results of the respective virtual work units, each of the plurality of virtual work units obtains a global algorithm state for the plurality of virtual work units, the global algorithm state comprising at least one of: a total resource consumption amount, a total remaining amount of resources for the plurality of virtual work units.
2. The method of claim 1, wherein synchronizing algorithm state among the plurality of virtual work units further comprises one or more of: for each process in a plurality of processes, synchronizing the algorithm state in the process among a plurality of virtual working units in the same process; and synchronizing the algorithm state among the plurality of processes based on the synchronized in-process algorithm state for each process.
3. The method of claim 2, wherein synchronizing the in-process algorithm state among the plurality of virtual work units within the same process is accomplished using a memory queue to communicate among the plurality of virtual work units, and synchronizing the algorithm state among the plurality of processes is accomplished using a UNIX domain socket to communicate among the plurality of processes.
4. The method of claim 1, wherein synchronizing algorithm state among the plurality of virtual work units further comprises one or more of: for each of a plurality of container groups, wherein each container group comprises one or more processes, synchronizing states of algorithms within the container group among the processes within the same container group; and synchronizing the algorithm state among the plurality of container groups based on the synchronized intra-container group algorithm state for each container group.
5. The method of claim 4, wherein synchronizing the algorithmic state within a container group among a plurality of processes within the same container group is accomplished by communicating among the plurality of processes using UNIX domain sockets, and synchronizing the algorithmic state among the plurality of container groups is accomplished by communicating among the plurality of container groups using sockets.
6. The method of claim 1, wherein the algorithm state comprises one or more of: real-time information for each virtual unit of work, first order information for each virtual unit of work, and second order information for each virtual unit of work, wherein the real-time information includes information related to current resource consumption, the first order information includes information related to resource consumption for processing each task, and the second order information includes information related to resource consumption for a target task.
7. An apparatus for adaptively optimizing online resource allocation, comprising:
a virtual routing unit configured to: receiving an online resource allocation request, and forwarding the received online resource allocation request to a virtual work unit;
a virtual work unit configured to: processing the received online resource allocation request, generating a processing result and sharing the generated processing result among a plurality of virtual working units; and
an algorithm state synchronization unit configured to synchronize algorithm states among the plurality of virtual work units based on the shared processing result, wherein synchronizing the algorithm states among the plurality of virtual work units comprises: based on the processing results of the respective virtual work units, each of the plurality of virtual work units obtains a global algorithm state for the plurality of virtual work units, the global algorithm state including at least one of: a total resource consumption amount, a total remaining amount of resources for the plurality of virtual work units.
8. The apparatus of claim 7, wherein the algorithm state synchronization unit is further configured to perform one or more of: for each process in a plurality of processes, synchronizing algorithm states in the process among a plurality of virtual work units in the same process; and synchronizing the algorithm state among the plurality of processes based on the synchronized in-process algorithm state for each process.
9. The apparatus of claim 8, wherein synchronizing in-process algorithm state between multiple virtual units of work within a same process is accomplished using memory queues to communicate between the multiple virtual units of work, and wherein synchronizing the algorithm state between the multiple processes is accomplished using UNIX domain sockets to communicate between the multiple processes.
10. The apparatus of claim 7, wherein the algorithm state synchronization unit is further configured to perform one or more of: for each of a plurality of container groups, wherein each container group comprises one or more processes, synchronizing states of algorithms within the container group among the processes within the same container group; and synchronizing the algorithm state among the plurality of container groups based on the synchronized intra-container group algorithm state for each container group.
11. The apparatus of claim 10, wherein synchronizing the algorithmic state within a container group among a plurality of processes within the same container group is accomplished by communicating among the plurality of processes using UNIX domain sockets, and synchronizing the algorithmic state among the plurality of container groups is accomplished by communicating among the plurality of container groups using sockets.
12. A computing device, comprising:
at least one processor, and
a memory coupled with the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform the method of any of claims 1-6.
13. A machine-readable medium storing executable instructions that when executed cause the machine to perform the method of any one of claims 1 to 6.
CN202210218555.8A 2022-03-08 2022-03-08 Method and apparatus for adaptively optimizing online resource allocation Active CN114281556B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210218555.8A CN114281556B (en) 2022-03-08 2022-03-08 Method and apparatus for adaptively optimizing online resource allocation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210218555.8A CN114281556B (en) 2022-03-08 2022-03-08 Method and apparatus for adaptively optimizing online resource allocation

Publications (2)

Publication Number Publication Date
CN114281556A (en) 2022-04-05
CN114281556B (en) 2022-07-05

Family

ID=80882295

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210218555.8A Active CN114281556B (en) 2022-03-08 2022-03-08 Method and apparatus for adaptively optimizing online resource allocation

Country Status (1)

Country Link
CN (1) CN114281556B (en)

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103870332B (en) * 2012-12-13 2017-08-25 中国电信股份有限公司 Method of adjustment, device and the dummy machine system of virtual machine processor resource
US9635103B2 (en) * 2014-09-11 2017-04-25 Amazon Technologies, Inc. Dynamic virtual resource request rate control for utilizing physical resources
US10320699B1 (en) * 2015-10-22 2019-06-11 VCE IP Holding Company LLC. Computer implemented system and method, and a computer program product, for allocating virtualized resources across an enterprise
WO2019139515A1 (en) * 2018-01-15 2019-07-18 Telefonaktiebolaget Lm Ericsson (Publ) Management of dynamic sharing of central processing units
CN109039954B (en) * 2018-07-25 2021-03-23 广东石油化工学院 Self-adaptive scheduling method and system for virtual computing resources of multi-tenant container cloud platform
CN112104723B (en) * 2020-09-07 2024-03-15 腾讯科技(深圳)有限公司 Multi-cluster data processing system and method

Also Published As

Publication number Publication date
CN114281556A (en) 2022-04-05

Similar Documents

Publication Publication Date Title
Feng et al. Computation offloading in mobile edge computing networks: A survey
Sharma et al. A novel four-tier architecture for delay aware scheduling and load balancing in fog environment
Liu et al. Solving the multi-objective problem of IoT service placement in fog computing using cuckoo search algorithm
Guo et al. Electricity cost saving strategy in data centers by using energy storage
Guo et al. Energy and network aware workload management for sustainable data centers with thermal storage
US9092209B2 (en) Wireless cloud-based computing for rural and developing areas
Qiao et al. Online learning and optimization for computation offloading in D2D edge computing and networks
Tripathi et al. Non-cooperative power and latency aware load balancing in distributed data centers
Zhu et al. Drl-based deadline-driven advance reservation allocation in eons for cloud–edge computing
Wang et al. Achieving energy efficiency in data centers using an artificial intelligence abstraction model
WO2021046774A1 (en) Resource scheduling method and information prediction method, device, system, and storage medium
Yuan et al. Cost-aware request routing in multi-geography cloud data centres using software-defined networking
Grasso et al. Smart zero-touch management of uav-based edge network
Alqarni et al. A survey of computational offloading in cloud/edge-based architectures: strategies, optimization models and challenges
Bali et al. An effective technique to schedule priority aware tasks to offload data on edge and cloud servers
CN114281556B (en) Method and apparatus for adaptively optimizing online resource allocation
Li et al. Data analysis-oriented stochastic scheduling for cost efficient resource allocation in NFV based MEC network
Zhao et al. On minimizing energy cost in internet-scale systems with dynamic data
Xia et al. Data locality-aware big data query evaluation in distributed clouds
Guo et al. Optimal power and workload management for green data centers with thermal storage
Pahlevan et al. Exploiting CPU-load and data correlations in multi-objective VM placement for geo-distributed data centers
Zhu et al. Deep reinforcement learning-based edge computing offloading algorithm for software-defined IoT
Anastasopoulos et al. Optical wireless network convergence in support of energy-efficient mobile cloud services
Fang et al. Latency aware online tasks scheduling policy for edge computing system
Fang et al. A Scheduling Strategy for Reduced Power Consumption in Mobile Edge Computing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: 100192 No. 306, 3 / F, building 28, Baosheng Beili West District, Haidian District, Beijing

Patentee after: Ant yunchuang digital technology (Beijing) Co.,Ltd.

Address before: 100192 No. 306, 3 / F, building 28, Baosheng Beili West District, Haidian District, Beijing

Patentee before: Beijing ant cloud Financial Information Service Co.,Ltd.