CN110825520A - Cluster top-speed elastic expansion method for realizing efficient resource utilization - Google Patents


Info

Publication number
CN110825520A
CN110825520A (application CN201910994328.2A)
Authority
CN
China
Prior art keywords
resource
pod
resources
task
executing
Prior art date
Legal status
Granted
Application number
CN201910994328.2A
Other languages
Chinese (zh)
Other versions
CN110825520B (en)
Inventor
单朋荣
杨美红
赵志刚
陈静
厉承轩
刘凯
于焕焕
Current Assignee
Shandong Computer Science Center National Super Computing Center in Jinan
Original Assignee
Shandong Computer Science Center National Super Computing Center in Jinan
Priority date
Filing date
Publication date
Application filed by Shandong Computer Science Center National Super Computing Center in Jinan filed Critical Shandong Computer Science Center National Super Computing Center in Jinan
Priority to CN201910994328.2A priority Critical patent/CN110825520B/en
Publication of CN110825520A publication Critical patent/CN110825520A/en
Application granted granted Critical
Publication of CN110825520B publication Critical patent/CN110825520B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 — Arrangements for program control, e.g. control units
    • G06F 9/06 — Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 — Multiprogramming arrangements
    • G06F 9/50 — Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 — Allocation of resources to service a request
    • G06F 9/5027 — the resource being a machine, e.g. CPUs, servers, terminals
    • G06F 9/5038 — considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04L — TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 41/00 — Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L 41/08 — Configuration management of networks or network elements
    • H04L 41/0896 — Bandwidth or capacity management, i.e. automatically increasing or decreasing capacities
    • Y — GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC
    • Y02 — TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D — CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT]
    • Y02D 10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a cluster top-speed elastic scaling method for realizing efficient resource utilization, comprising the following steps. Step 1: create an intelligent elastic scaling compensation module (IACM) and a compensation queue. Step 2: judge whether the service cluster will need additional nodes in a coming period. Step 3: add Pod resources to the compensation queue. Step 4: set the priority of the compensation queue. Step 5: perform recovery operations on Pod resources. Step 6: maintain Pod resources. Step 7: repeat the judgment periodically. With this method, when the resource demand of the service cluster grows, the required new replica already exists in the compensation queue and can be added to the cluster by pulling it up directly, saving the time of building a new replica and achieving extremely fast expansion of cluster node resources. When the resource pool runs short, resources are adjusted; tasks that occupy a large share of resources and cannot be adjusted, or that have low priority, can be evicted or killed to achieve resource scheduling and resource expansion.

Description

Cluster top-speed elastic expansion method for realizing efficient resource utilization
Technical Field
The invention relates to a cluster top-speed elastic expansion method, in particular to a cluster top-speed elastic expansion method for realizing efficient resource utilization, and belongs to the technical field of big data and cloud computing platforms.
Background
At present, methods for automatically scaling a Kubernetes container cluster based on a resource threshold or a user-defined metric must resolve the contradiction between rapidly bringing up nodes and avoiding cluster oscillation. Because newly created nodes take a long time to join the cluster, these methods cannot respond to scaling demands in time. The intelligent scaling compensation technique proposed here effectively shortens the time for a newly created node to join the cluster and provide service, accelerates the scaling response, and guarantees quality of service.
In addition, in scheduling and placing automatically scaled-out nodes, the existing scheduling strategy supports only the dimension of resource requirements and gives insufficient support to the behavioral characteristics of the application. The present method considers both aspects and proposes a "hybrid scheduling soft segmentation" process that makes full use of resources and thereby minimizes cost.
Disclosure of Invention
To overcome the defects described above, the invention provides a cluster top-speed elastic scaling method for realizing efficient resource utilization.
The invention discloses a cluster top-speed elastic scaling method for realizing efficient resource utilization, which comprises an intelligent scaling compensation process and a hybrid scheduling soft segmentation process; the intelligent scaling compensation process is realized through the following steps:
Step 1: the service cluster creates an intelligent elastic scaling compensation module (IACM) as a custom resource in Kubernetes, and defines a compensation queue inside the IACM; the compensation queue is maintained by a DaemonSet that the IACM deploys on each node;
Step 2: judge whether the service cluster will need additional nodes in a coming period; if so, create a Pod and execute step 3; if no nodes need to be added, enter the next prediction period. A Pod is an aggregation of containers sharing a namespace and network space, and is the basic unit of cluster scheduling;
Step 3: add the Pod resources generated by the prediction algorithm of step 2 to the compensation queue, and execute step 4;
Step 4: set the priority of the compensation queue, and execute step 5;
Step 5: when the length of the compensation queue exceeds a set threshold, periodically clean and recycle Pod resources in the compensation queue, and execute step 6;
Step 6: maintain the enqueuing and dequeuing of Pod resources and their cleaning from the compensation queue, and execute step 7;
Step 7: periodically perform the judgment of step 2.
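Steps 1 to 7 above can be sketched as a periodic control loop. The following Python is a minimal illustration only; the class and function names, and the clean-up threshold, are ours rather than the patent's:

```python
from collections import deque

class CompensationQueue:
    """Illustrative compensation queue maintained by the IACM."""

    def __init__(self, max_len=8):
        self.max_len = max_len          # step 5: clean-up threshold
        self.items = deque()

    def enqueue(self, pod):             # step 3: add a pre-created Pod
        self.items.append(pod)

    def pull(self):                     # dequeue a prepared replica, if any
        return self.items.popleft() if self.items else None

    def clean(self):                    # step 5: recycle Pods over the threshold
        while len(self.items) > self.max_len:
            self.items.pop()

def iacm_cycle(queue, predict_need_node, new_pod):
    """One prediction period: step 2 (judge), step 3 (enqueue), steps 5-6."""
    if predict_need_node():
        queue.enqueue(new_pod())
    queue.clean()
```

Step 7 then corresponds to invoking `iacm_cycle` once per prediction period.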
The invention discloses a cluster top-speed elastic expansion method for realizing efficient resource utilization, wherein the prediction algorithm logic in the step 2 is realized through the following steps:
Step 2-1: fit the real-time monitoring data and the historical load data to judge the scaling demand; when nodes will need to be added, create a new Pod resource and add it to the compensation queue; when no nodes will be needed, do not create one;
Step 2-2: judge whether there are Pod resources to be recycled, then run the prediction algorithm; if the algorithm predicts those Pod resources will be needed in a coming period, add them to the compensation queue; if not, discard them;
Step 2-3: analyze the Pod resource demands submitted by users for the coming period, and create new Pod resources according to those demands and add them to the compensation queue.
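Steps 2-1 and 2-2 can be illustrated with a simple weighted fit. The patent leaves the concrete fitting model open, so the 50/50 blend and the capacity test below are assumptions for illustration only:

```python
def fit_load(history, realtime, w=0.5):
    """Step 2-1: blend historical load and real-time load (illustrative
    weighted average; the patent's actual fitting model is unspecified)."""
    return w * sum(history) / len(history) + (1 - w) * realtime

def plan_pods(history, realtime, capacity, recyclable):
    """Steps 2-1/2-2: returns (need_new_pod, keep_recyclable_pod).

    A Pod is pre-created when the fitted load exceeds capacity, and a
    recyclable Pod is kept in the queue only if it will be needed again."""
    predicted = fit_load(history, realtime)
    need_new = predicted > capacity
    keep = need_new and recyclable
    return need_new, keep
```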
The invention discloses a cluster top-speed elastic expansion method for realizing efficient resource utilization, wherein the priority of a compensation queue in step 4 is obtained through the following steps:
Step 4-1: compute the priority of a Pod resource in the compensation queue according to formula (1):

P_n = W_0·F_n + W_1·P_0 + W_2·C_n    (1)

where P_n is the priority of the newly enqueued Pod resource; W_0, W_1, W_2 are weight ratios with 0 < W_0, W_1, W_2 < 1 and W_0 + W_1 + W_2 = 1; F_n is the fitting factor, a value representing the degree of fit and the likelihood of the Pod being scheduled in the future; P_0 is the service's preset priority; and C_n is the credit factor.
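Formula (1) is straightforward to compute; the weight values in this sketch are illustrative examples, not values given by the patent:

```python
def pod_priority(f_n, p0, c_n, w=(0.4, 0.3, 0.3)):
    """Formula (1): P_n = W0*F_n + W1*P0 + W2*C_n, weights summing to 1.
    The default weights are illustrative placeholders."""
    w0, w1, w2 = w
    assert abs(w0 + w1 + w2 - 1.0) < 1e-9, "weights must sum to 1"
    return w0 * f_n + w1 * p0 + w2 * c_n
```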
Step 4-2: compute the fitting factor F_n in formula (1) by formula (2). Formula (2) appears as an image in the original; consistent with the surrounding definitions it reads:

F_{i+1} = ( Σ_{m=1..i′} OE_m ) / ( Σ_{m=1..i} IE_m ),  {IE, OE} ∈ S    (2)

where S denotes the service cluster; i is the total number of Pod-enqueue events of the cluster in history, and i′ the total number of Pod dequeues in the service cluster; OE_m is the Pod resource of the m-th dequeue from the compensation queue, and IE_m the Pod of the m-th enqueue; {IE, OE} ∈ S means the enqueued Pods IE_m and dequeued Pods OE_m belong to the same service cluster. F_{i+1}, the fitting factor of the newly added Pod resource, is thus judged from the historical ratio between the Pods of the same service cluster that left the compensation queue (the numerator) and all Pods it added to the compensation queue (the denominator).
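With formula (2) read as the ratio of historically dequeued to enqueued Pods of the same service cluster, the fitting factor can be computed as follows (a sketch under that reading; the per-Pod "resource" values are taken as simple counts):

```python
def fitting_factor(enqueued, dequeued):
    """Formula (2) as reconstructed: sum of OE_m over sum of IE_m for one
    service cluster. Returns 0.0 when the cluster has no enqueue history."""
    total_in = sum(enqueued)    # Σ IE_m, m = 1..i
    total_out = sum(dequeued)   # Σ OE_m, m = 1..i′
    return total_out / total_in if total_in else 0.0
```

A cluster whose prepared Pods are usually pulled up (high ratio) is thus predicted as more likely to be scheduled again.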
Step 4-3: the credit factor C_n in formula (1) is obtained through the following steps:
Step 4-3-1: obtain the historical resource-quota utilization. Suppose the service cluster has added Pod resources i times in history and n types of resources in a Pod participate in the calculation; the resource-quota utilizations of the i historical Pod additions are written U_1(t_1, d_1), …, U_m(t_m, d_m), …, U_i(t_i, d_i), where:

U_m(t_m, d_m) = ( u_{m,1}(t_m, d_m), …, u_{m,k}(t_m, d_m), …, u_{m,n}(t_m, d_m) )

U_m(t_m, d_m) is the resource utilization of the m-th added Pod; u_{m,k}(t_m, d_m) is the resource-quota utilization of the k-th resource in the m-th added Pod, k = 1, 2, …, n; t_m is the time at which the Pod starts providing service, and d_m the duration of the service it provides;
Step 4-3-2: compute the weights. First compute, by formula (3), the weight w_{m,k} occupied by each resource-quota utilization u_{m,k}(t_m, d_m) in the added Pod. [Formula (3) appears as an image in the original; it defines w_{m,k} in terms of u_{m,k}(t_m, d_m), a decay factor α applied to d_m with 0 < α < 1, and a window D whose size equals the period T of the periodic dynamic resource-quota adjustment, m = 1, 2, …, i.] The weights of the resource-quota utilizations in the m-th added Pod are written as:

W_m = ( w_{m,1}, …, w_{m,k}, …, w_{m,n} )
Step 4-3-3: compute the credit C of the cluster the Pod belongs to. For the various resources in a Pod, after i historical Pod additions in the same service cluster, the credit factor of each added Pod resource is computed by formula (4). [Formulas (4) and (5) appear as images in the original.] For all resource types, after i resource-quota applications, when a new resource quota is allocated for the (i+1)-th time, the credit factor of the service cluster is the value C given by formula (5), so the credit factor of the (i+1)-th added Pod resource is c_n = c_{i+1} = c.
The cluster top-speed elastic expansion method for realizing the efficient utilization of resources is characterized in that the hybrid scheduling soft segmentation process is realized by the following steps:
a) divide the Pod resources created in the intelligent scaling compensation process into online-task resources and offline-task resources, and obtain the online-task resource ratio R_x and the offline-task resource ratio R_y through formula (6), where x and y are intermediate variables solved by formula (7). [Formulas (6) and (7) appear as images in the original.] α_1, α_2, …, α_n are the weights of the online tasks and α_1′, α_2′, …, α_n′ the weights of the offline tasks over the n resource types k_1, k_2, …, k_n; α_i + α_i′ = 1, and α_i, α_i′ are determined by the ratio of the respective sums of each type of resource of the online and offline tasks, i = 1, 2, …, n;
b) when the remaining rate of online-task resources is no more than 10%, or the task dequeue rate falls below a set threshold, resources are judged to be short and need adjustment; if adjustment is needed, execute step c); if not, execute step f);
c) judge whether the offline-task service priority P_y > P_0, where P_0 is the eviction threshold of offline tasks; if yes, the offline task cannot be evicted, and step d) is executed; if not, the offline task can be evicted to release resources, and step i) is executed;
Let the new resource-ratio variables of the online and offline tasks be R′_x and R′_y respectively, with R′_x + R′_y = 1; then:

R′_x = R_x + λ    (8)
R_x = R′_x, R_y = R′_y    (9)

where λ is the resource adjustment factor, given by formula (10). [Formula (10) appears as an image in the original; it expresses λ in terms of the resources k_1′, k_2′, …, k_n′ newly released by the offline tasks.]
d) judge whether the utilization of the resources allocated to online tasks is at least 100%; if yes, execute step e); if not, execute step f);
e) judge whether the online task can only be scheduled to the current node; if yes, execute step i); if not, execute step h);
f) judge whether the remaining rate of offline-task resources is below 10%; if yes, encroach on online-task resources, by at most 50% of the total, and execute step j); if not, execute step k);
g) encroach on online-task resources, by at most 50% of the online tasks' total resources, and judge whether the encroachment succeeded; if yes, execute step j); if not, execute step k);
h) schedule to another node or a newly added node and execute step a); if scheduling fails, execute step i);
i) forcibly evict the lowest-priority offline tasks according to offline-task priority, and execute step j);
j) dynamically adjust system resources according to the resource ratios R′_x, R′_y, and execute step k);
k) if the resources meet the demand, end the loop; otherwise execute steps b) to j) cyclically with period T.
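The shortage test of step b), the eviction decision of steps c) and i), and the ratio adjustment of formulas (8) and (9) can be sketched as follows. The thresholds are illustrative, and since formula (10) defining λ is an image in the original, λ is taken here simply as the fraction of resources released by the offline side:

```python
def need_adjustment(online_remaining, wait_period, t0=5):
    """Step b): resources are short when the online remainder is <= 10%
    or a task has waited longer than the threshold T0 (t0 illustrative)."""
    return online_remaining <= 0.10 or wait_period > t0

def shortage_action(offline_priority, p0):
    """Steps c)/i): evict the offline task only if its priority does not
    exceed the eviction threshold P0; otherwise fall through to steps d)-h)."""
    if offline_priority > p0:
        return "reschedule-or-encroach"   # cannot evict: steps d)-h)
    return "evict-offline"                # step i)

def adjust_ratios(r_x, lam):
    """Formulas (8)-(9): R'_x = R_x + lambda, with R'_x + R'_y = 1."""
    r_x_new = min(1.0, r_x + lam)
    return r_x_new, 1.0 - r_x_new
```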
In step c), the online-task priority P_x and the offline-task priority P_y are solved through formula (11). [Formula (11) appears as an image in the original; consistent with the variables defined here, a plausible reading is P_x = ω_0·P_s + ω_1·R_t + ξ·E_t and P_y = ω_0·P_s + ω_1·R_t + ξ′·E_t.] Here ω_0 and ω_1 are weights with ω_0 + ω_1 + ξ = 1 and ω_0 + ω_1 + ξ′ = 1; P_s is the task's system priority, R_t the task's running time, E_t the task's deadline, and ξ, ξ′ are period impact factors.
The cluster top-speed elastic scaling method for realizing efficient resource utilization is characterized in that, in step b), whether the task dequeue rate has fallen below the set threshold is measured by the task waiting period: when a task's waiting period exceeds the threshold T_0, resources are judged to be short and in need of adjustment.
When dynamic resource expansion is triggered, a new Pod resource is not started directly; instead the compensation queue is traversed first to check whether a prepared Pod resource exists. If one exists, it is pulled up directly; if not, the intelligent scaling compensation process is executed. When dynamic scale-in is triggered, the removed Pod is not killed directly after a cooling period but is placed in the compensation queue to wait, and is destroyed later by the intelligent compensation process. Because of the intelligent scaling compensation process, the original cooling period can be shortened appropriately, yielding a faster and finer-grained dynamic scaling scheme.
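The pull-from-queue-first behaviour just described can be sketched in a few lines (function names are illustrative):

```python
def scale_up(queue, create_pod):
    """On scale-out, traverse the compensation queue first and pull a
    prepared Pod; only fall back to creating one from scratch."""
    return queue.pop(0) if queue else create_pod()

def scale_down(queue, pod):
    """On scale-in, park the replica in the compensation queue instead of
    killing it immediately; the IACM clean-up destroys it later."""
    queue.append(pod)
```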
The beneficial effects of the invention are as follows. In the intelligent scaling compensation process, the method predicts the service's resource demand for the coming period from historical and real-time load information, pre-creates Pod resources, and adds them to the compensation queue; when the service cluster's resource demand grows, the required new replica already exists in the compensation queue and can be pulled up directly, saving the preparation time of creating a replica from scratch and achieving extremely fast expansion of cluster node resources. In the hybrid scheduling soft segmentation process, services are first divided into online and offline task types and the priorities of the offline and online tasks are computed; when resources cannot meet demand, the size of the resource pool is adjusted first, and when the resource pool itself runs short, tasks that occupy large amounts of resources and can hardly be adjusted, or that have low priority, can be evicted or killed to achieve resource scheduling and resource expansion.
Drawings
FIG. 1 is a schematic diagram of an architecture of a conventional cluster scaling method;
FIG. 2 is a schematic diagram of an architecture of the cluster top-speed elastic scaling method of the present invention;
fig. 3 is a flow chart of the hybrid scheduling soft segmentation process in the present invention.
Detailed Description
The invention is further described with reference to the following figures and examples.
The automatic scaling architecture can be roughly divided into three parts: metric collection, the scaling component, and resource scheduling; fig. 1 gives a schematic diagram of the existing cluster scaling method. Adding the "intelligent scaling compensation" and "hybrid scheduling soft segmentation" processes forms a new scheduling architecture, shown in fig. 2, the schematic of the cluster top-speed elastic scaling method. The system forms a more complete automatic scaling flow, improves the efficiency of scaling nodes and the quality of service, makes full use of cluster resources, and saves cost.
The intelligent scaling compensation process consists mainly of two parts: the compensation queue and the control logic. The control logic incorporates a prediction-algorithm analysis step that predicts the service's resource demand for a coming period from historical and real-time load information, pre-creates resources, and adds them to the compensation queue. Because the intelligent scaling process has its own control logic and life cycle, it is designed as a Controller and is managed and maintained by the Controller Manager of Kubernetes.
The prediction-algorithm model of the "control logic" part can be established according to the characteristics of the data to be predicted, chiefly: correlation with time and service, volatility, and burstiness (discrete points). A prediction algorithm, a self-built algorithm, or a combined algorithm is established from these data characteristics together with the cluster's features and the application's service target; the candidates are then tested experimentally to determine the best one. Algorithms with good results at present include grey prediction, exponential smoothing, BP neural networks, and regression and autoregressive algorithms.
The service cluster and main logic of the intelligent scaling compensation process are as follows. The services the process faces are stateless application clusters created and maintained by a replica-control policy, and the main logic is designed around the life-cycle changes of a queue, chiefly the actions of enqueuing, dequeuing, and clearing. Enqueuing: what is enqueued is a Pod resource (a Pod can be regarded as an aggregation of containers sharing a namespace, network space, and so on, and is the basic unit of cluster scheduling) predicted by the prediction algorithm and generated by the replica-control policy; the prediction algorithm forecasts the change in the service cluster's resource demand over a coming period, and if the cluster needs more Pods, new Pod resources are built and enqueued. Dequeuing: when the cluster's demand falls within the effective prediction window, that is, the predicted Pods of the cluster have not yet been cleared from the queue and the cluster size needs to grow, the queue is traversed and the dequeue action executed. Clearing: when memory is short and resources must be recycled, or when a period ends, the queue is cleared in priority order.
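The clearing action, which drops the lowest-priority prepared Pods first, can be sketched as follows (the function and its `keep` parameter are illustrative):

```python
def clear_queue(queue, priority, keep):
    """Queue clearing: when memory is short or a period ends, retain only
    the `keep` highest-priority prepared Pods, dropping the rest."""
    ranked = sorted(queue, key=lambda p: priority[p], reverse=True)
    return ranked[:keep]
```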
The constraints of the algorithm are: the service clusters the system faces are stateless application clusters generated under replica-controller control, whose Pods come from the same deployment template and provide the same service type externally; in addition, for each stateless application service cluster facing prediction, the constraint on newly generated replicas is that only one may be created and enqueued at a time, and only after that replica has been scheduled or cleaned may a new resource be created and enqueued according to the prediction algorithm.
The advantage of the algorithm: when the service cluster's resource demand grows, the required new replicas already exist in the compensation queue and can be pulled up directly to join the cluster, saving the preparation time of creating replicas from scratch and achieving extremely fast expansion of cluster node resources. The disadvantage: Pod resources held in the queue consume memory. The time saved lies mainly in the Pod's life cycle: if cluster resources are sufficient and the running state is good, most of the time consumed in creating a replica is spent in the pending state, since downloading images and preparing the Pod's base environment take long and that state accounts for a large share of the Pod's life cycle; the design thus trades space for gains in time. The state required of a Pod joining the queue is: base environment ready and in the unbound, to-be-scheduled state, where base-environment preparation covers the basic isolation environment, file system, network, storage, and container-image download. Because the replica controller must maintain and control the number of cluster nodes, and the Pods added to the "compensation queue" are generated by the replica controller under the same service cluster with resources ready but not yet in use, a special flag can be set in a field of the resource manifest for Pods of the compensation queue to distinguish this state from existing ones, facilitating unified management by Kubernetes. In summary, when the cluster must expand, directly pulling up a Pod from the queue saves considerable time.
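The special flag distinguishing compensation-queue Pods could, for illustration, be a label on the Pod manifest. The label key below is an assumption for the sketch, not an actual Kubernetes convention or a key named by the patent:

```python
def mark_compensation_pod(pod_manifest):
    """Set an illustrative label marking a prepared-but-unbound Pod that
    sits in the compensation queue (label key is hypothetical)."""
    labels = pod_manifest.setdefault("metadata", {}).setdefault("labels", {})
    labels["iacm/compensation-queue"] = "true"
    return pod_manifest
```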
The intelligent expansion compensation process is realized by the following steps:
step 1: the service cluster creates an intelligent elastic expansion compensation module IACM of a user-defined resource in Kubernets, and defines a compensation queue in the intelligent elastic expansion compensation module IACM, wherein the compensation queue is maintained by a thread DaemonSet which is deployed on each node by the IACM;
step 2: judging whether the service cluster needs to add nodes in a future period of time, if so, creating Pod and executing the step 3); if the node does not need to be added, entering the next prediction period; pod is an aggregation of containers sharing a name space and a network space, and is a basic unit of cluster scheduling;
and step 3: adding the Pod resources predicted and generated by the prediction algorithm in the step 2) into a compensation queue, and executing the step 4);
and 4, step 4: setting the priority of the compensation queue, and executing the step 5;
and 5: when the length of the compensation queue exceeds a set threshold value, cleaning and recycling operation is performed on Pod resources in the compensation queue periodically, and step 6 is performed;
step 6: maintaining the operations of entering and exiting the compensation queue of the Pod resources and cleaning the Pod resources from the compensation queue, and executing the step 7;
and 7: the judgment process of step 2 is performed periodically.
The prediction algorithm logic described in step 2 is implemented by:
step 2-1: fitting the real-time monitoring data and the historical load data, judging the expansion requirement, newly building a Pod resource and adding the Pod resource into a compensation queue when nodes need to be added in the future, and not executing the operation of newly building the Pod resource when the nodes do not need to be added in the future;
step 2-2: judging whether Pod resources needing to be recycled exist or not, then executing a prediction algorithm, if the algorithm predicts that the Pod resources are needed in a future period of time, adding the Pod resources needing to be recycled into a compensation queue, and if the Pod resources are needed in the future period of time, discarding the Pod resources needing to be recycled;
step 2-3: and analyzing the Pod resource requirements submitted by the user in the future period of time, and establishing a new Pod resource according to the requirements submitted by the user and adding the new Pod resource to the compensation queue.
The priority of the compensation queue in step 4 is found by the following steps:
step 4-1: and (3) solving the priority of the Pod resources in the compensation queue according to the formula (1):
Pn=W0*Fn+W1*P0+W2*Cn(1)
wherein, PnIndicating the priority, W, of the newly enqueued Pod resource0、W1、W2Denotes a weight ratio, FnThe fitting factor is a value representing the fitting degree and representing the possibility of being scheduled in the future; p0Representing service preset priority, CnRepresenting a credit factor; w is more than 00,W1,W2<1,W0+W1+W2=1;
Step 4-2: calculating the fitting factor F in the formula (1) by using the formula (2)n
Wherein S represents a service cluster, i represents the total i newly added Pod events of the cluster in history, i' represents the total pop resource out-queue number in the service cluster, and OEmPod resource, IE, representing mth out-of-compensation queuemPod representing the m-th enqueue; fi+1The fitting factor representing the newly added Pod resource is judged according to the ratio of the Pod resource which belongs to the same service cluster to the compensation queue in history;
Figure BDA0002239274910000102
representing the sum of Pod resources historically enqueued and dequeued by a certain service cluster,
Figure BDA0002239274910000111
representing the sum of Pod resources historically added to a compensation queue by a certain service cluster; { IE, OE }. epsilon.S denotes a queue Pod resource IE in the setmAnd dequeue Pod resource OEmBelong to a service cluster;
step 4-3: signalling factor C in equation (1)nThe method is realized by the following steps:
step 4-3-1: obtaining historical resource quota utilization rate, setting that the service cluster has i times of newly-added Pod resources in history, and has n types of resources participating in calculation in Pod, wherein the resource quota utilization rates in the i times of newly-added Pod resources in the service cluster in history are respectively expressed as U1(t1,d1),...,Um(tm,dm),...,Ui(ti,di) Wherein:
Um(tm,dm)=(um,1(tm,dm),...,um,k(tm,dm),...,um,n(tm,dm))
Um(tm,dm) Indicates the resource utilization rate of the mth newly added Pod, um,k(tm,dm) Representing the resource quota utilization rate of the kth resource in the mth newly-added Pod, wherein k is 1, 2. t is tmIndicating the time at which the Pod starts providing service, dmIndicating the service duration provided by the Pod;
step 4-3-2: calculate the weights. First, the weight w_{m,k} occupied by each resource-quota utilization u_{m,k}(t_m, d_m) in a newly added Pod is computed by formula (3).
[Formula (3) appears only as an image in the original and is not reproduced here.]
w_{m,k} is the weight occupied by each resource-quota utilization u_{m,k}(t_m, d_m) in the newly added Pod; α is a coefficient applied to d_m, with 0 < α < 1; D equals the cycle time T of the periodic dynamic resource-quota adjustment; m = 1, 2, ..., i.
The weights of the resource-quota utilizations in the m-th newly added Pod are written as:
W_m = (w_{m,1}, ..., w_{m,k}, ..., w_{m,n})
step 4-3-3: calculate the credit factor C of the cluster where the Pod is located; for the various resources in a Pod, after the same service cluster has added Pods i times in history, the credit factor of each newly added Pod resource is computed by formula (4):
[Formula (4) appears only as an image in the original and is not reproduced here.]
For all resource types, after resource quotas have been applied for i times, when a new resource quota is allocated to the user for the (i+1)-th time, the credit factor C of the service cluster is:
[Formula (5) appears only as an image in the original and is not reproduced here.]
Hence the credit factor of the (i+1)-th newly added Pod resource is C_n = c_{i+1} = C.
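Formula (1) (stated explicitly in claim 3 as P_n = W0*F_n + W1*P_0 + W2*C_n, with 0 < W0, W1, W2 < 1 and W0 + W1 + W2 = 1) combines the three factors derived above. A minimal sketch; the default weights here are illustrative, not values from the patent:

```python
def pod_priority(f_n, p_0, c_n, w=(0.4, 0.3, 0.3)):
    """Priority of a newly enqueued Pod per formula (1):
    P_n = W0*F_n + W1*P_0 + W2*C_n

    f_n: fitting factor (likelihood of being scheduled again)
    p_0: service preset priority
    c_n: credit factor of the service cluster
    w:   (W0, W1, W2); each in (0, 1), summing to 1
    """
    w0, w1, w2 = w
    assert all(0 < x < 1 for x in w) and abs(w0 + w1 + w2 - 1.0) < 1e-9
    return w0 * f_n + w1 * p_0 + w2 * c_n
```

The convex combination keeps P_n on the same scale as its inputs, so Pods from different service clusters can be ranked directly within one compensation queue.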
Background of the hybrid scheduling soft-segmentation process: the basic scheduling unit on Kubernetes is the Pod. In a model-training scenario, the resources required by the parameter-server cluster and the training cluster must be allocated in one pass; if only part of the resources are allocated, or resources that were successfully scheduled are occupied by other clusters, the cluster starts incompletely or fails to start. Many open-source projects propose gang-scheduling schemes (e.g., Volcano), but the difficulty of allocating large blocks of resources remains hard to solve. The "hybrid scheduling soft segmentation" process proposed in this patent addresses this problem directly. The basic idea is to partition the resource pool, placing online services (with the Pod as scheduling granularity) and offline batch jobs separately; giving batch jobs an independent resource space reduces the problem that small jobs occupy resources so that large jobs can never be scheduled. When a batch job releases its resources, they are released as a whole block and are easily reallocated when jobs reuse them.
In addition, the module can dynamically adjust the resource pool: the resource-partition water line is moved when resource demand changes. This, too, embodies the "soft slicing" idea. Definitions: the macroscopic characteristic behavior dimensions of an application mainly include difference, timeliness, and basic planning. Difference is mainly reflected in the clearly different requirements that different application types place on idle container (e.g., Pod) resources, time sensitivity, cost planning, and so on; timeliness means an application behaves differently in different time periods; basic planning is the part agreed in advance by humans, beyond which an unpredictable application load may sharply exceed the capacity plan.
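The "soft slicing" idea described above, one physical resource pool with a movable division water line between the online and offline shares, can be sketched as a small data structure. The class name, fields, and clamping policy are illustrative assumptions, not part of the patent:

```python
class SoftPartitionedPool:
    """A resource pool split between online and offline tasks by a
    movable water line, per the 'soft slicing' idea."""

    def __init__(self, total, online_ratio):
        assert 0.0 <= online_ratio <= 1.0
        self.total = total              # total pool capacity
        self.online_ratio = online_ratio  # current water line Rx

    @property
    def online_share(self):
        return self.total * self.online_ratio

    @property
    def offline_share(self):
        return self.total * (1.0 - self.online_ratio)

    def move_water_line(self, delta):
        # Shift the division when demand changes; clamp to [0, 1]
        # so the two shares always cover exactly the whole pool.
        self.online_ratio = min(1.0, max(0.0, self.online_ratio + delta))
```

Because the split is a ratio rather than a hard node assignment, adjusting it is a cheap metadata change; actual Pod movement only happens when the later steps of the process decide to evict or reschedule.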
As shown in fig. 3, which gives a flow chart of the hybrid scheduling soft-segmentation process, the process is specifically implemented by the following steps:
a) Pod resources established in the intelligent scaling compensation process are divided into online-task resources and offline-task resources, and the online-task resource ratio Rx and the offline-task resource ratio Ry are obtained through formula (6):
[Formula (6) appears only as an image in the original and is not reproduced here.]
where x, y are intermediate variables, solved by formula (7):
[Formula (7) appears only as an image in the original and is not reproduced here.]
α1, α2, ..., αn denote the weights of online tasks and α1', α2', ..., αn' the weights of offline tasks; there are n resource types, denoted k1, k2, ..., kn; αi + αi' = 1; αi and αi' are determined by the ratio of the respective sums of each resource type for online and offline tasks, i = 1, 2, ..., n;
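Formulas (6) and (7) are only available as images; a plausible reading of the surrounding text is that x and y are the weighted sums of the n resource types under the online weights αi and the offline weights αi' = 1 − αi, and that Rx and Ry normalize them. A sketch under that assumption (not the patent's exact formulas):

```python
def resource_ratios(k, alpha):
    """Compute (Rx, Ry): the online and offline resource shares.

    k:     amounts of the n resource types k1..kn
    alpha: online-task weights a1..an; the offline weight of type i
           is 1 - alpha[i], since ai + ai' = 1.

    x, y are intermediate weighted sums (assumed form of formula (7));
    Rx = x/(x+y), Ry = y/(x+y) (assumed form of formula (6)).
    """
    x = sum(a * ki for a, ki in zip(alpha, k))         # online weighted sum
    y = sum((1.0 - a) * ki for a, ki in zip(alpha, k))  # offline weighted sum
    s = x + y
    return (x / s, y / s) if s else (0.0, 0.0)
```

Under this reading Rx + Ry = 1 by construction, which is consistent with the later constraint R'x + R'y = 1 on the adjusted ratios.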
b) when the remaining rate of online-task resources is less than or equal to 10%, or the task dequeue rate falls below a set threshold, the resources are judged insufficient and need adjustment; if adjustment is needed, execute step c); if not, execute step f);
c) judge whether the offline-task service priority Py > P0; if yes, the offline task cannot be evicted, so execute step d); if not, the offline task can be evicted to release resources, so execute step i); P0 is the eviction threshold for offline tasks;
Let the new resource-ratio variables of the online and offline tasks be R'x and R'y respectively, with R'x + R'y = 1; then:
R'x = λ + Rx (8)
Rx = R'x, Ry = R'y (9)
where λ is the resource adjustment factor, given by formula (10):
[Formula (10) appears only as an image in the original and is not reproduced here.]
where k1', k2', ..., kn' are the resources newly released by the offline task;
d) judge whether the utilization of the resources allocated to online tasks is greater than or equal to 100%; if yes, execute step e); if not, execute step f);
e) judge whether the online task can only be scheduled to the current node; if yes, execute step i); if not, execute step h);
f) judge whether the remaining rate of offline-task resources is below 10%; if yes, encroach on online-task resources (by at most 50% of the total) and execute step j); if not, execute step k);
g) encroach on online-task resources (by at most 50% of the online tasks' total resources) and judge whether the encroachment succeeded; if yes, execute step j); if not, execute step k);
h) schedule to another node or a newly added node and execute step a); if scheduling fails, execute step i);
i) forcibly evict the low-priority offline tasks according to offline-task priority, and execute step j);
j) dynamically adjust system resources according to the resource ratios R'x, R'y, and execute step k);
k) if the resources meet the requirement, the loop ends; otherwise, steps b) to j) are executed cyclically with period T.
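Steps b) through k) form a periodic control loop. Its core, the shortage check of step b) and the ratio update of formulas (8)-(9), can be sketched as follows; the 10% threshold comes from step b), while treating λ as a given input sidesteps formula (10), which is only available as an image:

```python
def adjust_ratios(rx, remaining_rate, lam, low=0.10):
    """One iteration of the shortage check (step b) and the ratio
    update of formulas (8)-(9).

    rx:             current online share Rx (with Ry = 1 - Rx)
    remaining_rate: fraction of online resources still free
    lam:            resource adjustment factor lambda (formula (10))

    Returns the new (Rx, Ry); unchanged when no shortage is detected.
    """
    if remaining_rate <= low:            # step b): online resources tight
        rx_new = min(1.0, rx + lam)      # formula (8): R'x = lambda + Rx
        return rx_new, 1.0 - rx_new      # formula (9): adopt the new split
    return rx, 1.0 - rx
```

In the full process this function would run once per period T, with the eviction branches (steps c, i) deciding whether λ worth of offline resources can actually be reclaimed before the split is moved.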
In step c), the online-task priority Px and the offline-task priority Py are obtained through formula (11):
[Formula (11) appears only as an image in the original and is not reproduced here.]
where ω0, ω1 are weight values with ω0 + ω1 + ξ = 1 and ω0 + ω1 + ξ' = 1; Ps is the task's system priority, Rt the task's running time, Et the task's deadline, and ξ, ξ' are period impact factors.
The task dequeue rate in step b) falling below the set threshold is characterized by the task waiting period: when the task waiting period exceeds the threshold T0, the resources are judged to be tight and need adjustment.
When dynamic scale-out is triggered, a new Pod is not started directly; instead, the compensation queue of the intelligent scaling compensation process is traversed first to check whether a prepared Pod resource exists. If yes, it is pulled up directly; if not, the original scaling strategy is executed. When dynamic scale-in is triggered, the scaled-down Pod is not killed directly after the cooling period, but is placed in the compensation queue for a certain period, after which the intelligent compensation module calls the corresponding component to destroy it. Because of the intelligent scaling compensation module, the original cooling period can be appropriately shortened, realizing a faster and finer-grained dynamic scaling scheme.
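The scale-out/scale-in behavior described here, reuse a prepared Pod from the compensation queue before starting a new one, and park scaled-down Pods in the queue instead of killing them, can be sketched as below. The class, its method names, and the string Pod identifiers are illustrative stand-ins for the IACM's actual components:

```python
from collections import deque

class CompensationScaler:
    """Scale-out reuses prepared Pods from the compensation queue;
    scale-in parks Pods there instead of destroying them."""

    def __init__(self):
        self.queue = deque()   # prepared / parked Pod resources
        self.created = 0       # Pods started from scratch

    def scale_out(self):
        if self.queue:                     # prepared Pod available:
            return self.queue.popleft()    # pull it up directly
        self.created += 1                  # else fall back to the
        return f"new-pod-{self.created}"   # original scaling strategy

    def scale_in(self, pod):
        self.queue.append(pod)             # park instead of killing

    def reap(self, max_len):
        # Periodic cleanup once the queue exceeds a set threshold,
        # mirroring step 5 of the compensation process.
        while len(self.queue) > max_len:
            self.queue.pop()               # destroy the excess Pod
```

Because a parked Pod is already scheduled and warmed up, pulling it from the queue is much faster than a cold start, which is what lets the cooling period be shortened.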

Claims (7)

1. A cluster top-speed elastic scaling method for realizing efficient resource utilization, comprising an intelligent scaling compensation process and a hybrid scheduling soft-segmentation process; the intelligent scaling compensation process is characterized by being realized through the following steps:
step 1: the service cluster creates an intelligent elastic scaling compensation module IACM as a custom resource in Kubernetes, and defines a compensation queue in the IACM, the compensation queue being maintained by a DaemonSet that the IACM deploys on each node;
step 2: judge whether the service cluster needs to add nodes in a future period of time; if nodes need to be added, create Pods and execute step 3; if not, enter the next prediction period; a Pod is an aggregation of containers sharing a namespace and network space, and is the basic unit of cluster scheduling;
step 3: add the Pod resources predicted and generated by the prediction algorithm of step 2 to the compensation queue, and execute step 4;
step 4: set the priority of the compensation queue, and execute step 5;
step 5: when the length of the compensation queue exceeds a set threshold, periodically clean up and recycle the Pod resources in the compensation queue, and execute step 6;
step 6: maintain the enqueue and dequeue operations of Pod resources on the compensation queue and the cleanup of Pod resources from the compensation queue, and execute step 7;
step 7: periodically execute the judgment process of step 2.
2. The cluster top-speed elastic scaling method for efficient resource utilization according to claim 1, wherein the prediction-algorithm logic of step 2 is implemented by the following steps:
step 2-1: fit the real-time monitoring data and the historical load data to judge the scaling demand; when nodes need to be added in the future, create a new Pod resource and add it to the compensation queue; when no nodes need to be added, do not create a new Pod resource;
step 2-2: judge whether there are Pod resources to be recycled, then execute the prediction algorithm; if the algorithm predicts that the Pod resources will be needed in a future period of time, add the Pod resources to be recycled to the compensation queue; if they will not be needed, discard the Pod resources to be recycled;
step 2-3: analyze the Pod resource demands submitted by users for the future period of time, and create new Pod resources according to the demands submitted by the users and add them to the compensation queue.
3. The cluster top-speed elastic scaling method for efficient resource utilization according to claim 1 or 2, wherein the priority of the compensation queue in step 4 is obtained by the following steps:
step 4-1: compute the priority of a Pod resource in the compensation queue according to formula (1):
P_n = W0*F_n + W1*P_0 + W2*C_n (1)
where P_n denotes the priority of the newly enqueued Pod resource; W0, W1, W2 denote weight ratios; F_n is the fitting factor, a value representing the degree of fit, i.e., the likelihood of being scheduled in the future; P_0 denotes the service's preset priority; C_n denotes the credit factor; 0 < W0, W1, W2 < 1 and W0 + W1 + W2 = 1;
Step 4-2: calculate the fitting factor F_n in formula (1) using formula (2), where S denotes a service cluster, i denotes the total number of Pod-addition events of the cluster in history, i' denotes the total number of Pod resources dequeued in the service cluster, OE_m denotes the Pod resource of the m-th dequeue from the compensation queue, and IE_m denotes the Pod of the m-th enqueue; F_{i+1}, the fitting factor of the newly added Pod resource, is determined by the historical ratio at which Pod resources belonging to the same service cluster entered the compensation queue;
[Formula (2) appears only as an image in the original and is not reproduced here. It relates the sum of Pod resources the service cluster has historically enqueued and dequeued to the sum of Pod resources it has historically added to the compensation queue; {IE, OE} ∈ S denotes that the enqueued Pod resource IE_m and the dequeued Pod resource OE_m belong to the same service cluster.]
step 4-3: signalling factor C in equation (1)nThe method is realized by the following steps:
step 4-3-1: obtaining historical resource quota utilization rate, setting that the service cluster has i times of newly-added Pod resources in history, and has n types of resources participating in calculation in Pod, wherein the resource quota utilization rates in the i times of newly-added Pod resources in the service cluster in history are respectively expressed as U1(t1,d1),...,Um(tm,dm),...,Ui(ti,di) Wherein:
Um(tm,dm)=(um,1(tm,dm),...,um,k(tm,dm),...,um,n(tm,dm))
Um(tm,dm) Indicates the resource utilization rate of the mth newly added Pod, um,k(tm,dm) Representing the resource quota utilization rate of the kth resource in the mth newly-added Pod, wherein k is 1, 2. t is tmIndicating the time at which the Pod starts providing service, dmIndicating the service duration provided by the Pod;
step 4-3-2: calculating the weight; firstly, calculating the utilization rate u of each resource quota in the newly added Pod by a formula (3)m,k(tm,dm) Occupied weight wm,k
Figure FDA0002239274900000031
wm,kFor each resource quota utilization rate u in newly-increased Podm,k(tm,dm) The weight occupied, α, is for dm0 < α < 1, and the size of D is equal to the period time T of the periodic dynamic adjustment resource quota, m is 1, 2.. times.i;
the weight of the usage rate of each resource quota in the mth newly added Pod is represented as:
Wm=(wm,1,...,wm,k,...,wm,m)
step 4-3-3: calculate the credit factor C of the cluster where the Pod is located; for the various resources in a Pod, after the same service cluster has added Pods i times in history, the credit factor of each newly added Pod resource is computed by formula (4) (rendered as an image in the original);
for all resource types, after resource quotas have been applied for i times, when a new resource quota is allocated to the user for the (i+1)-th time, the credit factor C of the service cluster is:
[Formula (5) appears only as an image in the original and is not reproduced here.]
Hence the credit factor of the (i+1)-th newly added Pod resource is C_n = c_{i+1} = C.
4. The cluster top-speed elastic scaling method for efficient resource utilization according to claim 1, wherein the hybrid scheduling soft-segmentation process is specifically realized by the following steps:
a) Pod resources established in the intelligent scaling compensation process are divided into online-task resources and offline-task resources, and the online-task resource ratio Rx and the offline-task resource ratio Ry are obtained through formula (6) (rendered as an image in the original), where x, y are intermediate variables, solved by formula (7):
[Formula (7) appears only as an image in the original and is not reproduced here.]
α1, α2, ..., αn denote the weights of online tasks and α1', α2', ..., αn' the weights of offline tasks; there are n resource types, denoted k1, k2, ..., kn; αi + αi' = 1; αi and αi' are determined by the ratio of the respective sums of each resource type for online and offline tasks, i = 1, 2, ..., n;
b) when the remaining rate of online-task resources is less than or equal to 10%, or the task dequeue rate falls below a set threshold, the resources are judged insufficient and need adjustment; if adjustment is needed, execute step c); if not, execute step f);
c) judge whether the offline-task service priority Py > P0; if yes, the offline task cannot be evicted, so execute step d); if not, the offline task can be evicted to release resources, so execute step i); P0 is the eviction threshold for offline tasks;
let the new resource-ratio variables of the online and offline tasks be R'x and R'y respectively, with R'x + R'y = 1; then:
R'x = λ + Rx (8)
Rx = R'x, Ry = R'y (9)
where λ is the resource adjustment factor, given by formula (10):
[Formula (10) appears only as an image in the original and is not reproduced here.]
where k1', k2', ..., kn' are the resources newly released by the offline task;
d) judge whether the utilization of the resources allocated to online tasks is greater than or equal to 100%; if yes, execute step e); if not, execute step f);
e) judge whether the online task can only be scheduled to the current node; if yes, execute step i); if not, execute step h);
f) judge whether the remaining rate of offline-task resources is below 10%; if yes, encroach on online-task resources (by at most 50% of the total) and execute step j); if not, execute step k);
g) encroach on online-task resources (by at most 50% of the online tasks' total resources) and judge whether the encroachment succeeded; if yes, execute step j); if not, execute step k);
h) schedule to another node or a newly added node and execute step a); if scheduling fails, execute step i);
i) forcibly evict the low-priority offline tasks according to offline-task priority, and execute step j);
j) dynamically adjust system resources according to the resource ratios R'x, R'y, and execute step k);
k) if the resources meet the requirement, the loop ends; otherwise, steps b) to j) are executed cyclically with period T.
5. The method according to claim 4, wherein in step c) the online-task priority Px and the offline-task priority Py are obtained through formula (11):
[Formula (11) appears only as an image in the original and is not reproduced here.]
where ω0, ω1 are weight values with ω0 + ω1 + ξ = 1 and ω0 + ω1 + ξ' = 1; Ps is the task's system priority, Rt the task's running time, Et the task's deadline, and ξ, ξ' are period impact factors.
6. The method according to claim 4, wherein the task dequeue rate in step b) falling below the set threshold is characterized by a task waiting period; when the task waiting period exceeds a threshold T0, the resources are judged to be tight and need adjustment.
7. The method according to claim 4, wherein when dynamic resource scale-out is triggered, a new Pod resource is not started directly; instead, the compensation queue is traversed first to check whether a prepared Pod resource exists; if yes, the Pod resource is pulled up directly, and if not, the intelligent scaling compensation process is executed; when dynamic scale-in is triggered, the scaled-down Pod is not killed directly after the cooling period, but is placed in the compensation queue to wait, and is later destroyed in the intelligent compensation process; because of the intelligent scaling compensation process, the original cooling period can be appropriately shortened, realizing a faster, finer-grained dynamic scaling scheme.
CN201910994328.2A 2019-10-18 2019-10-18 Cluster extremely-fast elastic telescoping method for realizing efficient resource utilization Active CN110825520B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910994328.2A CN110825520B (en) 2019-10-18 2019-10-18 Cluster extremely-fast elastic telescoping method for realizing efficient resource utilization

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910994328.2A CN110825520B (en) 2019-10-18 2019-10-18 Cluster extremely-fast elastic telescoping method for realizing efficient resource utilization

Publications (2)

Publication Number Publication Date
CN110825520A true CN110825520A (en) 2020-02-21
CN110825520B CN110825520B (en) 2023-08-29

Family

ID=69549553

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910994328.2A Active CN110825520B (en) 2019-10-18 2019-10-18 Cluster extremely-fast elastic telescoping method for realizing efficient resource utilization

Country Status (1)

Country Link
CN (1) CN110825520B (en)


Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008101366A1 (en) * 2007-02-17 2008-08-28 Zte Corporation A packet dispatching method in wireless communication system
CN101951395A (en) * 2010-08-30 2011-01-19 中国科学院声学研究所 Access prediction-based data cache strategy for P2P Video-on-Demand (VoD) system server
CN103425535A (en) * 2013-06-05 2013-12-04 浙江大学 Agile elastic telescoping method in cloud environment
US20140237477A1 (en) * 2013-01-18 2014-08-21 Nec Laboratories America, Inc. Simultaneous scheduling of processes and offloading computation on many-core coprocessors
US20170011327A1 (en) * 2015-07-12 2017-01-12 Spotted, Inc Method of computing an estimated queuing delay
CN108874542A (en) * 2018-06-07 2018-11-23 桂林电子科技大学 Kubernetes method for optimizing scheduling neural network based
CN109117265A (en) * 2018-07-12 2019-01-01 北京百度网讯科技有限公司 The method, apparatus, equipment and storage medium of schedule job in the cluster
CN109960591A (en) * 2019-03-29 2019-07-02 神州数码信息系统有限公司 A method of the cloud application resource dynamic dispatching occupied towards tenant's resource
CN110096349A (en) * 2019-04-10 2019-08-06 山东科技大学 A kind of job scheduling method based on the prediction of clustered node load condition
CN110287003A (en) * 2019-06-28 2019-09-27 北京九章云极科技有限公司 The management method and management system of resource


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
程振京; 李海波; 黄秋兰; 程耀东; 陈刚: "Elastic computing resource management mechanism in a high-energy physics cloud platform" (高能物理云平台中的弹性计算资源管理机制) *

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111352717B (en) * 2020-03-24 2023-04-07 广西梯度科技股份有限公司 Method for realizing kubernets self-defined scheduler
CN111352717A (en) * 2020-03-24 2020-06-30 广西梯度科技有限公司 Method for realizing kubernets self-defined scheduler
CN111399989A (en) * 2020-04-10 2020-07-10 中国人民解放军国防科技大学 Task preemption scheduling method and system for container cloud
CN113765949A (en) * 2020-06-02 2021-12-07 华为技术有限公司 Resource allocation method and device
CN113961335A (en) * 2020-07-01 2022-01-21 中兴通讯股份有限公司 Resource scheduling method, resource scheduling system and equipment
CN112104486A (en) * 2020-08-31 2020-12-18 中国—东盟信息港股份有限公司 Kubernetes container-based network endpoint slicing method and system
CN112199194A (en) * 2020-10-14 2021-01-08 广州虎牙科技有限公司 Container cluster-based resource scheduling method, device, equipment and storage medium
CN112199194B (en) * 2020-10-14 2024-04-19 广州虎牙科技有限公司 Resource scheduling method, device, equipment and storage medium based on container cluster
CN112698947B (en) * 2020-12-31 2022-03-29 山东省计算中心(国家超级计算济南中心) GPU resource flexible scheduling method based on heterogeneous application platform
CN112698947A (en) * 2020-12-31 2021-04-23 山东省计算中心(国家超级计算济南中心) GPU resource flexible scheduling method based on heterogeneous application platform
CN113419831A (en) * 2021-06-23 2021-09-21 上海观安信息技术股份有限公司 Sandbox task scheduling method and system
CN113419831B (en) * 2021-06-23 2023-04-11 上海观安信息技术股份有限公司 Sandbox task scheduling method and system
CN113961361A (en) * 2021-11-10 2022-01-21 重庆紫光华山智安科技有限公司 Control method and system for cache resources
CN113961361B (en) * 2021-11-10 2024-04-16 重庆紫光华山智安科技有限公司 Control method and system for cache resources
TWI831159B (en) * 2022-03-22 2024-02-01 新加坡商鴻運科股份有限公司 Storage expansion method, apparatus, storage media and electronic device
CN114513530A (en) * 2022-04-19 2022-05-17 山东省计算中心(国家超级计算济南中心) Cross-domain storage space bidirectional supply method and system
CN114968601A (en) * 2022-07-28 2022-08-30 合肥中科类脑智能技术有限公司 Scheduling method and scheduling system for AI training jobs with resources reserved according to proportion
CN116610534B (en) * 2023-07-18 2023-10-03 贵州海誉科技股份有限公司 Improved predictive elastic telescoping method based on Kubernetes cluster resources
CN116610534A (en) * 2023-07-18 2023-08-18 贵州海誉科技股份有限公司 Improved predictive elastic telescoping method based on Kubernetes cluster resources
CN118550716A (en) * 2024-07-30 2024-08-27 杭州老板电器股份有限公司 Big data task scheduling method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN110825520B (en) 2023-08-29

Similar Documents

Publication Publication Date Title
CN110825520A (en) Cluster top-speed elastic expansion method for realizing efficient resource utilization
CN103617062B (en) The render farm Dynamic Deployment System of a kind of flexibility and method
CN104951372B (en) A kind of Map/Reduce data processing platform (DPP) memory source dynamic allocation methods based on prediction
CN109788046B (en) Multi-strategy edge computing resource scheduling method based on improved bee colony algorithm
CN112269641B (en) Scheduling method, scheduling device, electronic equipment and storage medium
US20140165061A1 (en) Statistical packing of resource requirements in data centers
CN107911478A (en) Multi-user based on chemical reaction optimization algorithm calculates discharging method and device
CN110489217A (en) A kind of method for scheduling task and system
CN111427679A (en) Computing task scheduling method, system and device facing edge computing
CN111427675B (en) Data processing method and device and computer readable storage medium
CN113806018B (en) Kubernetes cluster resource mixed scheduling method based on neural network and distributed cache
CN113138860B (en) Message queue management method and device
CN113867959A (en) Training task resource scheduling method, device, equipment and medium
CN112732444A (en) Distributed machine learning-oriented data partitioning method
CN112486642B (en) Resource scheduling method, device, electronic equipment and computer readable storage medium
CN113641417A (en) Vehicle safety task unloading method based on branch-and-bound method
Yang et al. A novel distributed task scheduling framework for supporting vehicular edge intelligence
CN116069512A (en) Serverless efficient resource allocation method and system based on reinforcement learning
CN109428950B (en) Automatic scheduling method and system for IP address pool
Balla et al. Reliability enhancement in cloud computing via optimized job scheduling implementing reinforcement learning algorithm and queuing theory
CN116069496A (en) GPU resource scheduling method and device
CN113766037B (en) Task unloading control method and system for large-scale edge computing system
CN114416355A (en) Resource scheduling method, device, system, electronic equipment and medium
CN116302578B (en) QoS (quality of service) constraint stream application delay ensuring method and system
Li et al. Communications satellite multi-satellite multi-task scheduling

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant