CN117785457A - Resource management method, device, equipment and storage medium - Google Patents

Resource management method, device, equipment and storage medium

Info

Publication number: CN117785457A
Application number: CN202311798418.7A
Authority: CN (China)
Other languages: Chinese (zh)
Prior art keywords: micro-service, request frequency, delay, tail delay
Legal status: Pending
Inventors: 叶可江, 罗树添, 李想, 徐敏贤, 须成忠
Applicant and assignee: Shenzhen Institute of Advanced Technology of CAS
Classification: Data Exchanges In Wide-Area Networks
Abstract

The invention discloses a resource management method, a device, equipment and a storage medium, wherein the method comprises the following steps: when a host has already deployed offline services, if an instruction to deploy an online service is received, reducing the resource allocation of the offline services, and provisioning surplus resources for each micro-service of the online service in the current time period; predicting the request frequency of each micro-service in the next preset time period with a pre-trained prediction model, and pre-allocating resources for each micro-service in the next preset time period according to the predicted request frequency; acquiring the real-time tail delay of each micro-service at a preset interval, and acquiring the tail delay target generated for each micro-service in an offline analysis stage; and adjusting the resource allocation of the micro-services and the offline services according to the real-time tail delay and the tail delay target. The invention can effectively handle multiple online and multiple offline services co-located on a host, guarantee the end-to-end response latency of online services, and improve resource utilization.

Description

Resource management method, device, equipment and storage medium
Technical Field
The present disclosure relates to the field of cloud computing technologies, and in particular, to a method, an apparatus, a device, and a storage medium for resource management.
Background
Cloud computing is a model for providing computing resources and services over a network, and can support a wide range of applications such as online applications and batch jobs. To guarantee application quality of service, resource management systems in cloud computing often over-allocate resources to applications, but this lowers the overall resource utilization of the cloud computing system. Microservices, a commonly used architecture for online applications, typically consist of many microservice components with complex calling relationships among them.
In recent years, the micro-service architecture has developed rapidly and is widely applied in cloud computing. Compared with a traditional monolithic architecture that runs all service components in one application, a micro-service system decouples the application into multiple components that are easier to manage, maintain and update. Owing to the lightweight and loosely coupled nature of micro-services, the resource manager can locate an overloaded individual micro-service and scale that micro-service independently as load increases, rather than scaling the entire application.
With the widespread use of the micro-service architecture, micro-service resource management faces new challenges. Although the architecture is flexible, hundreds of micro-service requests must be processed while service level agreements are guaranteed. The components form a complex dependency graph, so fine-grained resource management that improves resource utilization while ensuring end-to-end response latency becomes very difficult. Furthermore, micro-service containers typically run on the same physical machine as batch applications, which can lead to performance imbalance between containers of the same micro-service, especially under heavy workloads where resource interference occurs. Most importantly, when multiple online services and multiple offline services are co-located on a host, reasonable resource allocation is particularly important, yet existing resource scheduling schemes allocate resources solely for online services or solely for offline services; there is as yet no resource scheduling scheme that can effectively coordinate co-located online and offline services.
Disclosure of Invention
In view of this, the present application provides a resource management method, apparatus, device and storage medium, so as to solve the problem of unreasonable resource scheduling when online services and offline services are co-located.
In order to solve the above technical problems, one technical scheme adopted by the present application is to provide a resource management method, comprising: when the host has deployed offline services, if an instruction to deploy an online service is received, reducing the resource allocation of the offline services, and provisioning surplus resources for each micro-service of the online service in the current time period; predicting the request frequency of each micro-service in the next preset time period based on a pre-trained prediction model, and pre-allocating resources for each micro-service in the next preset time period according to the predicted request frequency; acquiring the real-time tail delay of each micro-service at a preset interval, and acquiring the tail delay target generated for each micro-service in the offline analysis stage; and adjusting the resource allocation of the micro-services and the offline services according to the real-time tail delay and the tail delay target.
As a further improvement of the present application, predicting the request frequency of each micro-service in the next preset time period based on a pre-trained prediction model, and pre-allocating resources for each micro-service in the next preset time period according to the predicted request frequency, includes: after the online service is deployed, obtaining a pre-constructed tail delay model taking the request frequency as an independent variable and a corresponding resource utilization model for each micro-service, wherein the tail delay model is constructed from the tail delays of each micro-service under a plurality of preset request frequencies, and the resource utilization model is constructed from the resource utilization of each micro-service under the same preset request frequencies; acquiring the current request frequency of each micro-service in the current time period; inputting the current time period and the current request frequency of each micro-service into a pre-trained time series prediction model, and predicting the request frequency of each micro-service in the next preset time period; and pre-allocating resources of the next preset time period for each micro-service according to the current request frequency, the predicted request frequency, the tail delay model of each micro-service and the resource utilization model.
As a further improvement of the present application, pre-allocating resources of the next preset time period for each micro-service according to the current request frequency, the predicted request frequency, and the tail delay model and resource utilization model of each micro-service includes: calculating a first request frequency reference value according to a first preset rule using the current request frequency, the number of copies of the micro-service obtained in advance and a first preset error coefficient; comparing the predicted request frequency with the first request frequency reference value; if the predicted request frequency is greater than or equal to the first request frequency reference value, performing capacity expansion processing on the resources occupied by the micro-service; and if the predicted request frequency is smaller than the first request frequency reference value, performing capacity reduction processing on the resources occupied by the micro-service.
As a further improvement of the present application, performing capacity expansion processing on the resources occupied by the micro-service includes: confirming the minimum request frequency corresponding to the tail delay target according to the tail delay target and the tail delay model; calculating a second request frequency reference value according to a second preset rule from the minimum request frequency, the number of copies of the micro-service and the first preset error coefficient; comparing the predicted request frequency with the second request frequency reference value; if the predicted request frequency is smaller than the second request frequency reference value, vertically expanding the resources of the micro-service, and calculating the resource utilization after vertical expansion according to a third preset rule using the predicted request frequency, the number of copies, the resource utilization model and a second preset error coefficient; if the predicted request frequency is greater than or equal to the second request frequency reference value, horizontally expanding the resources of the micro-service and then adjusting them vertically, calculating the number of copies after horizontal expansion according to a fourth preset rule using the predicted request frequency, the tail delay target and the first preset error coefficient, and calculating the maximum resource utilization of each vertically adjusted copy according to the third preset rule using the predicted request frequency, the number of copies, the resource utilization model and the second preset error coefficient.
As a further improvement of the present application, performing capacity reduction processing on the resources occupied by the micro-service includes: calculating a copy-number reference value according to the fourth preset rule using the predicted request frequency, the tail delay target and the first preset error coefficient; comparing the copy-number reference value with the number of copies; if the copy-number reference value is equal to the number of copies, vertically shrinking the resources of the micro-service, and calculating the resource utilization after vertical shrinkage according to the third preset rule using the predicted request frequency, the number of copies, the resource utilization model and the second preset error coefficient; if the copy-number reference value is smaller than the number of copies, horizontally shrinking the resources of the micro-service and then adjusting them vertically, wherein the number of copies after horizontal shrinkage equals the copy-number reference value, and the maximum resource utilization of each vertically adjusted copy is calculated according to the third preset rule using the predicted request frequency, the number of copies, the resource utilization model and the second preset error coefficient.
As a further improvement of the present application, generating the tail delay target of each micro-service in the offline analysis stage specifically includes: in the offline analysis stage, acquiring the response latency and the call dependency graph of each micro-service, wherein the call dependency graph is generated from the call links of the micro-services and includes a plurality of nodes, each node corresponding to one micro-service; confirming the average response latency of each node in the call dependency graph according to the response latency of each micro-service; starting from each node with an in-degree of 0, traversing all paths in the graph and accumulating the average response latencies of the nodes each path passes through to obtain the average response latency of each path; dividing the average response latency of each node by the average response latency of the path it lies on to obtain the delay proportion of the node; and multiplying the delay proportion by a pre-specified tail delay to obtain the tail delay target of each node, selecting the smallest tail delay target as the final tail delay target when a node has multiple tail delay targets.
As a further improvement of the present application, adjusting resource allocation of micro services and offline services according to real-time tail delay and tail delay targets includes: if the real-time tail delay of the target micro-service is larger than the tail delay target corresponding to the target micro-service, reducing the resources of the deployed offline service by half; if the real-time tail delay of the target micro-service is not greater than the tail delay target corresponding to the target micro-service, deploying a new offline service or increasing the resources of the deployed offline service.
In order to solve the above technical problem, another technical scheme adopted by the present application is to provide a resource management device, including: an online service deployment module, configured to, when the host has deployed offline services, reduce the resource allocation of the offline services and provision surplus resources for each micro-service of the online service in the current time period if an instruction to deploy the online service is received; a pre-allocation module, configured to predict the request frequency of each micro-service in the next preset time period based on a pre-trained prediction model, and pre-allocate resources for each micro-service in the next preset time period according to the predicted request frequency; an acquisition module, configured to acquire the real-time tail delay of each micro-service at a preset interval, and acquire the tail delay target generated for each micro-service in the offline analysis stage; and an adjustment module, configured to adjust the resource allocation of the micro-services and the offline services according to the real-time tail delay and the tail delay target.
In order to solve the above technical problem, a further technical scheme adopted by the present application is to provide a computer device comprising a processor and a memory coupled to the processor, the memory storing program instructions which, when executed by the processor, cause the processor to perform the steps of any one of the above resource management methods.
In order to solve the technical problem, a further technical scheme adopted by the application is as follows: there is provided a storage medium storing program instructions capable of implementing any one of the above resource management methods.
The beneficial effects of the present application are as follows: with the resource management method of the present application, after offline services are deployed, when an online service needs to be deployed, the resources of the offline services are first reduced and the online service is then deployed; the end-to-end response latency of the online service is guaranteed by predicting and pre-allocating the resources of each micro-service for the next preset time period; meanwhile, the tail delay of each micro-service is monitored, and the resources occupied by each micro-service and by the offline services are dynamically adjusted in combination with the preset tail delay target of each micro-service, so that resource utilization is improved while the end-to-end response latency of the online service is guaranteed, ensuring efficient operation of co-located online and offline services.
Drawings
FIG. 1 is a flow chart of a resource management method according to an embodiment of the invention;
FIG. 2 is a schematic diagram of functional modules of a resource management device according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a computer device according to an embodiment of the present invention;
fig. 4 is a schematic structural view of a storage medium according to an embodiment of the present invention.
Detailed Description
The following description of the technical solutions in the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
The terms "first," "second," "third," and the like in this application are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first", "a second", and "a third" may explicitly or implicitly include at least one such feature. In the description of the present application, the meaning of "plurality" means at least two, for example, two, three, etc., unless specifically defined otherwise. All directional indications (such as up, down, left, right, front, back … …) in the embodiments of the present application are merely used to explain the relative positional relationship, movement, etc. between the components in a particular gesture (as shown in the drawings), and if the particular gesture changes, the directional indication changes accordingly. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed steps or elements but may include other steps or elements not listed or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
Fig. 1 is a flow chart of a resource management method according to an embodiment of the present invention. It should be noted that, if there are substantially the same results, the method of the present invention is not limited to the flow sequence shown in fig. 1. As shown in fig. 1, the resource management method includes the steps of:
step S101: when the host computer deploys the offline service, if an instruction for deploying the online service is received, reducing the resource allocation of the offline service, and configuring excessive resources for each micro-service of the online service in the current time period.
It should be noted that online services include, but are not limited to, services that run for a long time, are latency-sensitive and have high stability requirements, whose instability is immediately perceived by users and causes losses, and whose load shows obvious peaks and valleys (e.g., high traffic in the daytime and low traffic late at night), such as advertisement and search services. Offline services include, but are not limited to, services that are not latency-sensitive, can be retried, and typically run for a short time on the order of tens of minutes, such as big-data computing and machine-learning services.
Specifically, in this embodiment, when deciding to deploy an online service, the resource allocation of all deployed offline services is first reduced to ensure that sufficient resources can be allocated to the online service. One way to reduce the offline service resource allocation is to cut it to half of its original value. When the online service is deployed, surplus resources are first allocated to its micro-services to ensure normal deployment, and the resource allocation of the micro-services is then adjusted.
Step S102: predicting the request frequency of each micro-service in the next preset time period based on a pre-trained prediction model, and pre-allocating resources for each micro-service in the next preset time period according to the predicted request frequency.
Since the change in the request frequency of the online service generally has a certain periodicity, for example, a period of one day or one week, the request frequency of the online service can be predicted by using the time series prediction model.
In this embodiment, the time axis is first divided according to the variation pattern of the request frequency of each micro-service of the online service, yielding a plurality of time periods; for example, a day is divided into 24 time periods of one hour each. The time series prediction model is then trained with historical request frequency data of the micro-services collected in advance. After the online service is deployed, each micro-service uses its corresponding time series prediction model and the request frequency of the current preset time period to predict the request frequency of its next preset time period; resources for each micro-service in the next preset time period are then pre-allocated according to that prediction, so that once the next preset time period begins, the resource allocation of the micro-service can be adjusted directly to the pre-allocated amount. In this way reasonable resources are allocated to each micro-service, avoiding over-allocation or under-allocation.
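To make the prediction step concrete, the sketch below shows one way such a per-period predictor could look. The patent only requires "a pre-trained time series prediction model"; the seasonal-average model, the class name SeasonalFrequencyPredictor and the 50/50 blending weight are illustrative assumptions, not the claimed model.

```python
import numpy as np

class SeasonalFrequencyPredictor:
    """Per-microservice request-frequency predictor (illustrative stand-in)."""

    def __init__(self, periods_per_day: int = 24):
        self.periods_per_day = periods_per_day
        # history[period index] -> observed request frequencies (req/s)
        self.history = {p: [] for p in range(periods_per_day)}

    def fit(self, observations):
        """observations: iterable of (period_index, request_frequency)."""
        for period, freq in observations:
            self.history[period % self.periods_per_day].append(freq)

    def predict_next(self, current_period: int, current_freq: float) -> float:
        """Predicted request frequency for the next preset time period."""
        nxt = (current_period + 1) % self.periods_per_day
        past = self.history[nxt]
        if not past:
            return current_freq  # no history yet: fall back to persistence
        # Blend the seasonal mean with the live observation to track drift.
        return 0.5 * float(np.mean(past)) + 0.5 * current_freq

# One predictor per micro-service, trained on collected history.
predictor = SeasonalFrequencyPredictor()
predictor.fit([(9, 120.0), (10, 150.0), (9, 110.0), (10, 160.0)])
r_next = predictor.predict_next(current_period=9, current_freq=130.0)
```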
Further, step S102 specifically includes:
1. After the online service is deployed, a pre-constructed tail delay model taking the request frequency as an independent variable and a corresponding resource utilization model are obtained for each micro-service; the tail delay model is built from the tail delays of each micro-service under a plurality of preset request frequencies, and the resource utilization model is built from the resource utilization of each micro-service under the same preset request frequencies.
In this embodiment, a request frequency-tail delay model and a request frequency-resource utilization model need to be built for each micro-service; both models take the request frequency as the independent variable. The model construction process is as follows:
first, the request frequency is initially set to a minimum level (e.g., 10 requests per second), the application is accessed at the minimum level request frequency, and the tail delay (e.g., response delay at 95% of the minutes) and resource utilization of each micro-service at that time are recorded.
Next, the request frequency is increased step by step, one minimum level at a time (e.g., with a minimum level of 10 requests per second, the increased frequencies are 20 requests per second, 30 requests per second, and so on), and the tail delay (e.g., the 95th-percentile response latency) and resource utilization of each micro-service at each request frequency are recorded.
The collected data is then organized into two groups. In the first group, the request frequencies are arranged from small to large, with the tail delay corresponding to each request frequency listed. In the second group, the request frequencies are likewise arranged from small to large, with the corresponding resource utilization at each request frequency listed.
The specific format is as follows:
First group:
[request frequency 1, request frequency 2, request frequency 3, ..., request frequency n];
[tail delay 1, tail delay 2, tail delay 3, ..., tail delay n];
Second group:
[request frequency 1, request frequency 2, request frequency 3, ..., request frequency n];
[resource utilization 1, resource utilization 2, resource utilization 3, ..., resource utilization n];
Finally, a request frequency-tail delay model and a request frequency-resource utilization model are established for each micro-service by curve fitting over the two organized groups of data.
Here, resource utilization refers specifically to CPU utilization.
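The profiling procedure above lends itself to a short sketch. A polynomial fit is used here as one possible "fitting curve"; the patent does not fix the curve family, and the sample measurements are invented for illustration.

```python
import numpy as np

# Stepped-load measurements (illustrative): frequencies in req/s, p95 tail
# delay in ms, CPU utilization in %.
request_freqs = np.array([10, 20, 30, 40, 50, 60], dtype=float)
tail_delays = np.array([8, 9, 11, 14, 19, 27], dtype=float)
cpu_utils = np.array([15, 28, 40, 55, 68, 82], dtype=float)

# Request frequency -> tail delay model (quadratic fit as one choice).
tail_model = np.poly1d(np.polyfit(request_freqs, tail_delays, deg=2))
# Request frequency -> CPU utilization model (near-linear in this data).
util_model = np.poly1d(np.polyfit(request_freqs, cpu_utils, deg=1))

def freq_for_tail_target(target_ms: float) -> float:
    """Request frequency R_k at which the fitted tail delay reaches the
    target -- used later when sizing copies against the tail delay target."""
    grid = np.linspace(request_freqs.min(), request_freqs.max(), 500)
    within = grid[tail_model(grid) <= target_ms]
    return float(within.max()) if within.size else float(request_freqs.min())

print(tail_model(45.0), util_model(45.0), freq_for_tail_target(20.0))
```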
2. The current request frequency of each micro-service in the current time period is obtained.
3. Inputting the current time period and the current request frequency of each micro-service into a pre-trained time series prediction model, and predicting the request frequency of each micro-service in the next preset time period.
Specifically, after the current request frequency of the micro-service is obtained, it is input together with the current time period into the pre-trained time series prediction model, which predicts the request frequency of the next preset time period.
4. Pre-allocating resources of the next preset time period for each micro-service according to the current request frequency, the predicted request frequency, the tail delay model of each micro-service and the resource utilization model.
Specifically, after the predicted request frequency of the next preset time period is obtained, resources are pre-allocated for the micro-service according to the predicted request frequency, the pre-constructed tail delay model and the resource utilization model.
Further, pre-allocating resources of the next preset time period for each micro-service according to the current request frequency, the predicted request frequency, and the tail delay model and resource utilization model of each micro-service includes:
and 4.1, calculating a first request frequency reference value according to a first preset rule by using the current request frequency, the number of copies of the micro-service obtained in advance and a first preset error coefficient.
Specifically, a resource pre-allocation procedure of a micro service is described as an example: assuming that the number of copies of the micro-service is n, the average request frequency of the current copy is R c (i.e. the current request frequency), the predicted request frequency of the micro-service in the next preset time period predicted according to the time sequence prediction model is R n
First, the current request frequency R is utilized c Calculating a first request frequency reference value R from the number n of copies of the micro-service obtained in advance and a first preset error coefficient alpha 1 The calculation process is as follows:
R 1 =R c *n*α;
the first preset error coefficient α is set to eliminate the influence of the model prediction error as much as possible, and the value range is 0-1, and 0.7 is usually preferable.
4.2. Comparing the predicted request frequency with the first request frequency reference value.
4.3. If the predicted request frequency is greater than or equal to the first request frequency reference value, performing capacity expansion processing on the resources occupied by the micro-service.
Specifically, when the predicted request frequency is greater than or equal to the first request frequency reference value, the resources needed by the micro-service in the next preset time period will increase, so capacity expansion processing needs to be performed on the resources occupied by the micro-service.
Further, when performing capacity expansion, it is necessary to determine whether the micro-service should be expanded vertically or horizontally. The capacity expansion processing of the resources occupied by the micro-service therefore specifically includes:
4.3.1. Confirming the minimum request frequency corresponding to the tail delay target according to the tail delay target and the tail delay model.
4.3.2. Calculating a second request frequency reference value according to a second preset rule from the minimum request frequency, the number of copies of the micro-service and the first preset error coefficient.
Specifically, the tail delay target L_k preset for each micro-service is first acquired, and the minimum request frequency R_k corresponding to the tail delay target is confirmed by combining the tail delay target with the tail delay model.
Then, the second request frequency reference value R_2 is calculated from the minimum request frequency R_k, the number n of copies obtained in advance and the first preset error coefficient α:
R_2 = R_k * n * α.
4.3.3. Comparing the predicted request frequency with the second request frequency reference value.
4.3.4. If the predicted request frequency is smaller than the second request frequency reference value, vertically expanding the resources of the micro-service, and calculating the resource utilization after vertical expansion according to a third preset rule using the predicted request frequency, the number of copies, the resource utilization model and a second preset error coefficient.
Specifically, when the predicted request frequency is smaller than the second request frequency reference value, vertical capacity expansion is performed, and the resource utilization C_d after vertical expansion is calculated as follows: according to the resource utilization model, the resource utilization C_n at request frequency R_n/n is confirmed, and the post-expansion resource utilization is then calculated according to the third preset rule: C_d = C_n * β, where β is the second preset error coefficient; to eliminate the influence of model prediction error as far as possible, its value ranges from 1 to 2, and 1.4 is usually preferred. The final maximum resource utilization of the micro-service is thus limited to C_d.
4.3.5. If the predicted request frequency is greater than or equal to the second request frequency reference value, horizontally expanding the resources of the micro-service and then adjusting them vertically; the number of copies after horizontal expansion is calculated according to a fourth preset rule using the predicted request frequency, the tail delay target and the first preset error coefficient, and the maximum resource utilization of each vertically adjusted copy is calculated according to the third preset rule using the predicted request frequency, the number of copies, the resource utilization model and the second preset error coefficient.
Specifically, in the case of horizontal expansion, the number of copies after expansion n_d is calculated as n_d = ceil(R_n / (R_k * α)), where ceil() rounds a decimal up to the nearest integer; the maximum resource utilization of each copy is then adjusted vertically to C_d.
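The capacity expansion branch (steps 4.3.1 to 4.3.5) can be summarized in code. The sketch below follows the formulas R_2 = R_k * n * α, C_d = C_n * β and n_d = ceil(R_n / (R_k * α)); the helper names tail_model_inverse and util_model, and the toy stand-in models in the usage example, are assumptions standing in for the fitted models described earlier.

```python
import math

def scale_up(r_n, n, alpha, beta, tail_target, tail_model_inverse, util_model):
    """Return (copies, per-copy CPU utilization cap) after capacity expansion.

    r_n:   predicted request frequency for the next preset time period
    n:     current number of copies
    alpha: first preset error coefficient, 0 < alpha <= 1 (0.7 typical)
    beta:  second preset error coefficient, 1 <= beta <= 2 (1.4 typical)
    """
    r_k = tail_model_inverse(tail_target)   # min frequency meeting the target
    r_2 = r_k * n * alpha                   # second request frequency reference
    if r_n < r_2:
        # Vertical expansion only: raise the CPU cap, keep the copy count.
        c_d = util_model(r_n / n) * beta    # third preset rule: C_d = C_n * beta
        return n, c_d
    # Horizontal expansion first (fourth preset rule), then vertical adjustment;
    # the per-copy load after expansion is assumed to be r_n / n_d.
    n_d = math.ceil(r_n / (r_k * alpha))
    c_d = util_model(r_n / n_d) * beta
    return n_d, c_d

# Usage with toy stand-in models (assumed values, not profiled data):
copies, cpu_cap = scale_up(
    r_n=500.0, n=3, alpha=0.7, beta=1.4, tail_target=30.0,
    tail_model_inverse=lambda target_ms: 60.0,  # 60 req/s per copy within target
    util_model=lambda freq: 1.2 * freq,         # ~1.2% CPU per req/s
)
```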
4.4. If the predicted request frequency is smaller than the first request frequency reference value, performing capacity reduction processing on the resources occupied by the micro-service.
Specifically, when the predicted request frequency is smaller than the first request frequency reference value, the resources needed by the micro-service in the next preset time period will decrease, so capacity reduction processing needs to be performed on the resources occupied by the micro-service.
Further, the capacity reduction processing of the resources occupied by the micro-service includes:
4.4.1. Calculating the copy-number reference value according to the fourth preset rule using the predicted request frequency, the tail delay target and the first preset error coefficient.
Specifically, the copy-number reference value n_1 is calculated in the same way as the post-expansion copy number during capacity expansion, namely n_1 = ceil(R_n / (R_k * α)).
4.4.2. Comparing the copy-number reference value with the number of copies.
4.4.3. If the copy-number reference value is equal to the number of copies, vertically shrinking the resources of the micro-service, and calculating the resource utilization after vertical shrinkage according to the third preset rule using the predicted request frequency, the number of copies, the resource utilization model and the second preset error coefficient.
Specifically, when the copy-number reference value n_1 is equal to the number of copies, vertical shrinkage is performed, and the resource utilization after shrinkage is C_d.
4.4.4. If the copy-number reference value is smaller than the number of copies, horizontally shrinking the resources of the micro-service and then adjusting them vertically; the number of copies after horizontal shrinkage equals the copy-number reference value, and the maximum resource utilization of each vertically adjusted copy is calculated according to the third preset rule using the predicted request frequency, the number of copies, the resource utilization model and the second preset error coefficient.
Specifically, when the copy-number reference value n_1 is smaller than the number of copies, the copies are first shrunk horizontally and then adjusted vertically; the number of copies after horizontal shrinkage is n_s = ceil(R_n / (R_k * α)), and the maximum resource utilization of each copy is adjusted vertically to C_d.
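A companion sketch for the capacity reduction branch (steps 4.4.1 to 4.4.4), mirroring the expansion logic; r_k, util_model and the coefficients carry the same illustrative meanings as in the expansion sketch above.

```python
import math

def scale_down(r_n, n, alpha, beta, r_k, util_model):
    """Return (copies, per-copy CPU utilization cap) after capacity reduction."""
    n_1 = math.ceil(r_n / (r_k * alpha))    # fourth preset rule: reference count
    if n_1 == n:
        # Vertical shrinkage only: lower the CPU cap on the existing copies.
        return n, util_model(r_n / n) * beta
    if n_1 < n:
        # Horizontal shrinkage to n_1 copies, then vertical adjustment.
        return n_1, util_model(r_n / n_1) * beta
    # n_1 > n cannot occur here: the earlier comparison with the first
    # reference value already routed growing load to capacity expansion.
    raise ValueError("scale_down called although more copies are needed")

copies, cpu_cap = scale_down(
    r_n=90.0, n=3, alpha=0.7, beta=1.4,
    r_k=60.0,                               # assumed per-copy capacity
    util_model=lambda freq: 1.2 * freq,     # assumed utilization curve
)
```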
It should be noted that the above shrinkage and expansion are performed at specific time points: the capacity expansion operation takes place before the time series prediction model makes its next prediction (e.g., 1 minute before the prediction), and the capacity reduction operation takes place after the time series prediction model completes the next prediction (e.g., 1 minute after the prediction).
Step S103: acquiring the real-time tail delay of each micro-service at a preset interval, and acquiring the tail delay target generated for each micro-service in the offline analysis stage.
Specifically, in this embodiment, the response latency of each micro-service is sampled periodically (e.g., every 5 seconds) with a link tracing tool (e.g., Jaeger), and the change in its tail delay is monitored.
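A minimal sketch of this sampling step, with the tracer query abstracted away: in practice the latency window would come from a link tracing tool such as Jaeger, and the sample values here are invented.

```python
import numpy as np

def sample_tail_delay(latencies_ms, percentile: float = 95.0) -> float:
    """Real-time tail delay (e.g. p95) over one sampling window."""
    return float(np.percentile(np.asarray(latencies_ms, dtype=float), percentile))

# One 5-second window of per-request response latencies (ms), invented:
window = [12.1, 9.8, 14.3, 11.0, 35.6, 10.2, 13.9]
p95 = sample_tail_delay(window)
```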
Further, generating a tail delay target of each micro-service in an offline analysis stage specifically includes:
1. In the offline analysis stage, the response latency and the call dependency graph of each micro-service are acquired; the call dependency graph is generated from the call links of the micro-services and includes a plurality of nodes, each node corresponding to one micro-service.
Specifically, the link tracing tool can analyze the call relationships from the recorded call data and then generate a dependency graph for the application, which reveals the call relationships among the application components. Nodes in the call dependency graph represent components, arrows represent call relationships, and the numbers on the edges represent call counts.
2. Confirming the average response latency of each node in the call dependency graph according to the response latency of each micro-service.
3. Starting from each node with an in-degree of 0, traversing all paths in the graph and accumulating the average response latencies of the nodes each path passes through to obtain the average response latency of each path.
4. Dividing the average response latency of each node by the average response latency of the path it lies on to obtain the delay proportion of the node.
5. Multiplying the delay proportion by a pre-specified tail delay to obtain the tail delay target of each node; when a node has multiple tail delay targets, the smallest is selected as its final tail delay target.
It should be noted that the user needs to pre-specify a total end-to-end tail delay target for the service (e.g., the 95th-percentile delay should be within 30 ms); the tail delay of each path is then set to this total tail delay target.
It should be understood that a node may appear in multiple paths and thus have multiple corresponding delay proportions, so multiple tail delay targets may be calculated for it; this embodiment takes the smallest of them as the node's final tail delay target.
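The whole offline derivation (steps 1 to 5) fits in a short sketch. The four-node graph, the average delays and the 30 ms end-to-end target are invented for illustration; only the path traversal and the min-over-paths rule come from the description above.

```python
from collections import defaultdict

# Call dependency graph as a DAG; edges point from caller to callee.
graph = {"gateway": ["auth", "search"], "auth": [], "search": ["rank"], "rank": []}
avg_delay_ms = {"gateway": 2.0, "auth": 5.0, "search": 8.0, "rank": 10.0}
total_tail_target_ms = 30.0  # user-specified end-to-end p95 target

# Roots are the nodes with in-degree 0.
in_deg = defaultdict(int)
for callees in graph.values():
    for callee in callees:
        in_deg[callee] += 1
roots = [node for node in graph if in_deg[node] == 0]

# Enumerate every root-to-leaf path.
paths = []
def walk(node, path):
    path = path + [node]
    if not graph[node]:
        paths.append(path)
    for nxt in graph[node]:
        walk(nxt, path)
for root in roots:
    walk(root, [])

# Each node's target = its delay proportion on a path times the total target;
# a node on several paths keeps its smallest (strictest) target.
tail_target = {}
for path in paths:
    path_delay = sum(avg_delay_ms[node] for node in path)
    for node in path:
        candidate = avg_delay_ms[node] / path_delay * total_tail_target_ms
        tail_target[node] = min(tail_target.get(node, float("inf")), candidate)
# e.g. tail_target["rank"] == 10 / 20 * 30 = 15.0 ms
```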
Step S104: adjusting the resource allocation of the micro-services and the offline services according to the real-time tail delay and the tail delay target.
Specifically, corresponding adjustment measures are taken according to whether the real-time tail delay of a micro-service meets its tail delay target, such as increasing the resource allocation of the micro-service, increasing the resource allocation of the offline services, deploying a new offline service, or suspending the execution of offline services.
Further, step S104 specifically includes:
1. If the real-time tail delay of the target micro-service is larger than the tail delay target corresponding to the target micro-service, the resources of the deployed offline services are reduced by half.
2. If the real-time tail delay of the target micro-service is not greater than the tail delay target corresponding to the target micro-service, deploying a new offline service or increasing the resources of the deployed offline service.
Further, to facilitate the allocation of offline service resources, this embodiment divides the offline service resources into a plurality of gears, for example 10: a CPU utilization cap of 20% is the lowest gear (gear 1), each subsequent gear adds 20% on top of the previous one, and gear 10 is the highest, with a CPU utilization cap of 200%.
When the resources of the deployed offline services need to be reduced by half, the adjustment strategy is as follows: change the resource allocation of every deployed offline service to half of its original value; a service in an odd gear first drops one gear and is then halved. For example, an offline service in gear 6 is changed to gear 3; an offline service in gear 3 first drops to gear 2 and then becomes half of that, i.e., gear 1. Offline services already in the lowest gear (gear 1) are suspended. Meanwhile, the resource limits of the micro-services are temporarily lifted to ensure they have sufficient resources to run.
When a new offline service needs to be deployed or the resources of the deployed offline services need to be increased, the adjustment strategy is as follows: first, randomly select one of the deployed offline services and raise its resource allocation by one gear. If all deployed offline services are already in the highest gear and their allocation cannot be raised further, a new offline service can be selected for deployment, with its resource allocation set to the lowest gear.
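A sketch of this gear mechanism, assuming gears 1 to 10 map to CPU caps of 20% to 200% as described; OfflineJob and spawn_job are illustrative stand-ins for real job handles.

```python
import random
from dataclasses import dataclass

@dataclass
class OfflineJob:
    name: str
    gear: int  # 1..10; CPU utilization cap = gear * 20%

def halve_offline_resources(jobs):
    """Applied when some micro-service misses its tail delay target."""
    survivors = []
    for job in jobs:
        if job.gear == 1:
            continue                          # lowest gear: suspend the job
        gear = job.gear - 1 if job.gear % 2 else job.gear
        job.gear = gear // 2                  # odd gears drop one gear, then halve
        survivors.append(job)
    return survivors

def grow_offline_resources(jobs, spawn_job):
    """Applied when all tail delay targets are met; spawn_job() would deploy
    a new offline job at the lowest gear (hypothetical helper)."""
    candidates = [job for job in jobs if job.gear < 10]
    if candidates:
        random.choice(candidates).gear += 1   # one random job, one gear up
    else:
        jobs.append(spawn_job())              # everything at top gear already

jobs = [OfflineJob("spark-etl", 6), OfflineJob("training", 3), OfflineJob("backup", 1)]
jobs = halve_offline_resources(jobs)          # gears become 3 and 1; "backup" suspended
```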
With the resource management method of this embodiment, after offline services are deployed, when an online service needs to be deployed, the resources of the offline services are first reduced and the online service is then deployed; the end-to-end response latency of the online service is guaranteed by predicting and pre-allocating the resources of each micro-service for the next preset time period; meanwhile, the tail delay of each micro-service is monitored, and the resources occupied by each micro-service and by the offline services are dynamically adjusted in combination with the preset tail delay target of each micro-service, so that resource utilization is improved while the end-to-end response latency of the online service is guaranteed, ensuring efficient operation of co-located online and offline services.
Fig. 2 is a schematic functional block diagram of a resource management device according to an embodiment of the present invention. As shown in fig. 2, the resource management device 20 includes an online service deployment module 21, a pre-allocation module 22, an acquisition module 23, and an adjustment module 24.
The online service deployment module 21 is configured to, when the host has deployed offline services, reduce the resource allocation of the offline services and provision surplus resources for each micro-service of the online service in the current time period if an instruction to deploy the online service is received;
a pre-allocation module 22, configured to predict a predicted request frequency of each micro-service in a next preset time period based on a pre-trained prediction model, and pre-allocate resources of each micro-service in the next preset time period according to the predicted request frequency;
the acquisition module 23 is configured to acquire real-time tail delay of each micro service at intervals of a preset period, and acquire a tail delay target generated by each micro service in an offline analysis stage;
an adjustment module 24 is configured to adjust resource allocation of the micro services and the offline services according to the real-time tail delay and the tail delay target.
Optionally, the pre-allocation module 22 performs the operation of predicting the request frequency of each micro-service in the next preset time period based on a pre-trained prediction model and pre-allocating resources for each micro-service in the next preset time period according to the predicted request frequency, specifically including: after the online service is deployed, obtaining a pre-constructed tail delay model taking the request frequency as an independent variable and a corresponding resource utilization model for each micro-service, wherein the tail delay model is constructed from the tail delays of each micro-service under a plurality of preset request frequencies, and the resource utilization model is constructed from the resource utilization of each micro-service under the same preset request frequencies; acquiring the current request frequency of each micro-service in the current time period; inputting the current time period and the current request frequency of each micro-service into a pre-trained time series prediction model, and predicting the request frequency of each micro-service in the next preset time period; and pre-allocating resources of the next preset time period for each micro-service according to the current request frequency, the predicted request frequency, the tail delay model of each micro-service and the resource utilization model.
Optionally, the pre-allocation module 22 performs the operation of pre-allocating resources of the next preset time period for each micro-service according to the current request frequency, the predicted request frequency, and the tail delay model and resource utilization model of each micro-service, specifically including: calculating a first request frequency reference value according to a first preset rule using the current request frequency, the number of copies of the micro-service obtained in advance and a first preset error coefficient; comparing the predicted request frequency with the first request frequency reference value; if the predicted request frequency is greater than or equal to the first request frequency reference value, performing capacity expansion processing on the resources occupied by the micro-service; and if the predicted request frequency is smaller than the first request frequency reference value, performing capacity reduction processing on the resources occupied by the micro-service.
Optionally, the pre-allocation module 22 performs the capacity expansion processing on the resources occupied by the micro-service, including: confirming the minimum request frequency corresponding to the tail delay target according to the tail delay target and the tail delay model; calculating a second request frequency reference value according to a second preset rule from the minimum request frequency, the number of copies of the micro-service and the first preset error coefficient; comparing the predicted request frequency with the second request frequency reference value; if the predicted request frequency is smaller than the second request frequency reference value, vertically expanding the resources of the micro-service, and calculating the resource utilization after vertical expansion according to a third preset rule using the predicted request frequency, the number of copies, the resource utilization model and the second preset error coefficient; if the predicted request frequency is greater than or equal to the second request frequency reference value, horizontally expanding the resources of the micro-service and then adjusting them vertically, calculating the number of copies after horizontal expansion according to a fourth preset rule using the predicted request frequency, the tail delay target and the first preset error coefficient, and calculating the maximum resource utilization of each vertically adjusted copy according to the third preset rule using the predicted request frequency, the number of copies, the resource utilization model and the second preset error coefficient.
Optionally, the pre-allocation module 22 performs the capacity reduction processing on the resources occupied by the micro-service, including: calculating a copy-number reference value according to the fourth preset rule using the predicted request frequency, the tail delay target and the first preset error coefficient; comparing the copy-number reference value with the number of copies; if the copy-number reference value is equal to the number of copies, vertically shrinking the resources of the micro-service, and calculating the resource utilization after vertical shrinkage according to the third preset rule using the predicted request frequency, the number of copies, the resource utilization model and the second preset error coefficient; if the copy-number reference value is smaller than the number of copies, horizontally shrinking the resources of the micro-service and then adjusting them vertically, wherein the number of copies after horizontal shrinkage equals the copy-number reference value, and the maximum resource utilization of each vertically adjusted copy is calculated according to the third preset rule using the predicted request frequency, the number of copies, the resource utilization model and the second preset error coefficient.
Optionally, the acquisition module 23 is further configured to generate the tail delay target of each micro-service in the offline analysis stage, specifically including: in the offline analysis stage, acquiring the response latency and the call dependency graph of each micro-service, wherein the call dependency graph is generated from the call links of the micro-services and includes a plurality of nodes, each node corresponding to one micro-service; confirming the average response latency of each node in the call dependency graph according to the response latency of each micro-service; starting from each node with an in-degree of 0, traversing all paths in the graph and accumulating the average response latencies of the nodes each path passes through to obtain the average response latency of each path; dividing the average response latency of each node by the average response latency of the path it lies on to obtain the delay proportion of the node; and multiplying the delay proportion by a pre-specified tail delay to obtain the tail delay target of each node, selecting the smallest tail delay target as the final tail delay target when a node has multiple tail delay targets.
Optionally, the adjustment module 24 performs an operation of adjusting resource allocation of the micro service and the offline service according to the real-time tail delay and the tail delay target, including in particular: if the real-time tail delay of the target micro-service is larger than the tail delay target corresponding to the target micro-service, reducing the resources of the deployed offline service by half; if the real-time tail delay of the target micro-service is not greater than the tail delay target corresponding to the target micro-service, deploying a new offline service or increasing the resources of the deployed offline service.
For other details of the implementation of the foregoing embodiments of the resource management device by each module, reference may be made to the description of the resource management method in the foregoing embodiments, which is not repeated herein.
It should be noted that, in this specification, the embodiments are described in a progressive manner, each embodiment focusing on its differences from the others; for identical and similar parts, the embodiments may be referred to one another. The device embodiments are described relatively simply since they are substantially similar to the method embodiments; for relevant points, refer to the description of the method embodiments.
Referring to fig. 3, fig. 3 is a schematic structural diagram of a computer device according to an embodiment of the invention. As shown in fig. 3, the computer device 30 includes a processor 31 and a memory 32 coupled to the processor 31, where the memory 32 stores program instructions that, when executed by the processor 31, cause the processor 31 to perform the steps of the resource management method according to any of the embodiments described above.
The processor 31 may also be referred to as a CPU (Central Processing Unit). The processor 31 may be an integrated circuit chip with signal processing capabilities. The processor 31 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
Referring to fig. 4, fig. 4 is a schematic structural diagram of a storage medium according to an embodiment of the present invention. The storage medium of the embodiment of the present invention stores program instructions 41 capable of implementing the above resource management method. The program instructions 41 may be stored in the storage medium in the form of a software product and include several instructions to cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk, or a computer device such as a computer, a server, a mobile phone or a tablet.
In the several embodiments provided in this application, it should be understood that the disclosed computer apparatus, device, and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of elements is merely a logical functional division, and there may be additional divisions of actual implementation, e.g., multiple elements or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
In addition, each functional unit in the embodiments of the present invention may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated units may be implemented in hardware or as software functional units. The foregoing description covers only the embodiments of the present application and does not thereby limit its patent scope; all equivalent structures or equivalent processes made using the contents of the specification and the accompanying drawings of the present application, whether applied directly or indirectly in other related technical fields, are likewise included in the patent protection scope of the present application.

Claims (10)

1. A method of resource management, comprising:
when the host has deployed offline services, if an instruction to deploy an online service is received, reducing the resource allocation of the offline services, and provisioning surplus resources for each micro-service of the online service in the current time period;
predicting the predicted request frequency of each micro-service in a next preset time period based on a pre-trained prediction model, and pre-distributing resources of each micro-service in the next preset time period according to the predicted request frequency;
acquiring real-time tail delay of each micro-service at intervals of a preset period, and acquiring tail delay targets generated by each micro-service in an offline analysis stage;
and adjusting the resource allocation of the micro service and the offline service according to the real-time tail delay and the tail delay target.
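Taken together, the four steps of claim 1 form a periodic control loop. The following sketch (Python) is illustrative only: every name in it is hypothetical, and the helper bodies are toy stand-ins (random probes, fixed scaling factors) for the predictor, monitor, and preset rules elaborated in claims 2-7.

    import random
    from dataclasses import dataclass

    @dataclass
    class Service:
        name: str
        cpu_cores: float  # current CPU allocation

    # Toy stand-ins for the trained predictor and the live tail-delay monitor.
    def predict_request_frequency(svc: Service) -> float:
        return random.uniform(50.0, 200.0)  # requests/s expected next window

    def measure_tail_latency(svc: Service) -> float:
        return random.uniform(10.0, 120.0)  # observed P99 latency in ms

    def preallocate(svc: Service, predicted_qps: float) -> None:
        svc.cpu_cores = max(0.5, predicted_qps / 100.0)  # toy sizing rule

    def control_loop(online: list[Service], offline: list[Service],
                     targets: dict[str, float], iterations: int = 3) -> None:
        # Step 1: squeeze offline jobs and over-provision the online
        # micro-services for the current time period.
        for job in offline:
            job.cpu_cores *= 0.5
        for svc in online:
            svc.cpu_cores *= 2.0
        for _ in range(iterations):  # one pass per preset period
            # Step 2: predict next window's request frequency, pre-allocate.
            for svc in online:
                preallocate(svc, predict_request_frequency(svc))
            # Steps 3-4: compare real-time tail delay with the target from
            # offline analysis and rebalance against offline jobs (claim 7).
            for svc in online:
                if measure_tail_latency(svc) > targets[svc.name]:
                    for job in offline:
                        job.cpu_cores *= 0.5

    online = [Service("frontend", 2.0), Service("cart", 1.0)]
    offline = [Service("batch-etl", 4.0)]
    control_loop(online, offline, targets={"frontend": 50.0, "cart": 80.0})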
2. The resource management method according to claim 1, wherein predicting a predicted request frequency of each micro service for a next preset time period based on a pre-trained prediction model, and pre-allocating resources of each micro service for the next preset time period according to the predicted request frequency, comprises:
after the online service is deployed, a pre-constructed tail delay model taking the request frequency as an independent variable and a resource utilization rate model corresponding to each micro service are obtained, wherein the tail delay model is constructed according to tail delays of each micro service under a plurality of preset request frequencies, and the resource utilization rate model is constructed according to resource utilization rates of each micro service under a plurality of preset request frequencies;
acquiring the current request frequency of each micro service in the current time period;
inputting the current time period and the current request frequency of each micro service into a pre-trained time sequence prediction model, and predicting to obtain the predicted request frequency of the next preset time period of each micro service;
and pre-distributing resources of a next preset time period for each micro-service according to the current request frequency, the predicted request frequency, the tail delay model of each micro-service and the resource utilization rate model.
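A minimal sketch of claim 2's moving parts, under stated assumptions: the offline-profiled tail-delay and resource-utilization models are represented here as piecewise-linear interpolation over measurements taken at preset frequencies, and a deliberately simple trend predictor stands in for the pre-trained time-series model, whose architecture the claim does not fix.

    from bisect import bisect_left

    class FrequencyModel:
        """Piecewise-linear interpolation over offline measurements at
        several preset request frequencies; used for both the tail-delay
        model and the resource-utilization model of claim 2."""
        def __init__(self, freqs: list[float], values: list[float]):
            self.freqs, self.values = list(freqs), list(values)

        def __call__(self, f: float) -> float:
            i = bisect_left(self.freqs, f)
            if i == 0:
                return self.values[0]
            if i == len(self.freqs):
                return self.values[-1]
            f0, f1 = self.freqs[i - 1], self.freqs[i]
            v0, v1 = self.values[i - 1], self.values[i]
            return v0 + (v1 - v0) * (f - f0) / (f1 - f0)

    def predict_next_frequency(history: list[float]) -> float:
        """Stand-in for the pre-trained time-series predictor: recent
        average plus last-step trend. Any real model (ARIMA, LSTM, ...)
        would slot in here."""
        if len(history) < 2:
            return history[-1]
        window = history[-3:]
        trend = history[-1] - history[-2]
        return max(0.0, sum(window) / len(window) + trend)

    # Illustrative offline profile of one micro-service.
    tail_delay_model = FrequencyModel([50, 100, 200], [12.0, 20.0, 55.0])   # P99 ms
    utilization_model = FrequencyModel([50, 100, 200], [0.20, 0.45, 0.90])  # CPU cores

    predicted = predict_next_frequency([80.0, 95.0, 110.0])
    print(predicted, tail_delay_model(predicted), utilization_model(predicted))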
3. The resource management method according to claim 2, wherein the pre-allocating resources for each micro-service for a next preset period of time according to the current request frequency, the predicted request frequency, and the tail delay model and the resource utilization rate model of each micro-service, comprises:
calculating a first request frequency reference value according to a first preset rule by using the current request frequency, the number of copies of the micro-service obtained in advance and a first preset error coefficient;
comparing the predicted request frequency with the first request frequency reference value;
if the predicted request frequency is greater than or equal to the first request frequency reference value, performing capacity expansion processing on resources occupied by the micro service;
and if the predicted request frequency is smaller than the first request frequency reference value, performing capacity reduction processing on the resources occupied by the micro service.
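Claim 3 names a "first preset rule" without disclosing it, so the concrete reference-value formula below is purely an assumption for illustration; only the comparison structure comes from the claim.

    def first_reference(current_qps: float, copies: int, eps: float = 0.1) -> float:
        # Assumed form of the undisclosed 'first preset rule': spread the
        # current load over the copies, pad each share by the error
        # coefficient eps, and total it back up.
        per_copy = current_qps / copies
        return per_copy * copies * (1.0 + eps)

    def plan_capacity(predicted_qps: float, current_qps: float, copies: int) -> str:
        # The comparison step of claim 3: choose between the expansion
        # path (claim 4) and the reduction path (claim 5).
        if predicted_qps >= first_reference(current_qps, copies):
            return "expand"
        return "shrink"

    print(plan_capacity(predicted_qps=140.0, current_qps=110.0, copies=2))  # expand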
4. The method for resource management according to claim 3, wherein said performing a capacity expansion process on the resources occupied by the micro service includes:
confirming the minimum request frequency corresponding to the tail delay target according to the tail delay target and the tail delay model;
calculating a second request frequency reference value according to a second preset rule according to the minimum request frequency, the number of copies of the micro-service and the first preset error coefficient;
comparing the predicted request frequency with the second request frequency reference value;
if the predicted request frequency is smaller than the second request frequency reference value, longitudinally expanding the resources of the micro service, wherein the resource utilization rate after the longitudinal expansion is calculated according to a third preset rule by using the predicted request frequency, the copy number, the resource utilization rate model and a second preset error coefficient;
and if the predicted request frequency is greater than or equal to the second request frequency reference value, transversely expanding the resources of the micro service, and then longitudinally adjusting the resources of the micro service, wherein the number of copies of the transverse expansion is calculated according to a fourth preset rule by using the predicted request frequency, the tail delay target and the first preset error coefficient, and the maximum resource utilization rate of each copy after the longitudinal adjustment is calculated according to a third preset rule by using the predicted request frequency, the number of copies, the resource utilization rate model and the second preset error coefficient.
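A sketch of the expansion branch. The claim's "second", "third", and "fourth preset rules" are not disclosed, so the formulas marked as assumed below are illustrative; the decision structure (invert the tail-delay model, compare against a reference value, then scale vertically, or horizontally followed by a vertical adjustment) follows the claim.

    import math
    from dataclasses import dataclass
    from typing import Callable

    @dataclass
    class Microservice:
        copies: int
        cpu_limit: float                             # per-copy CPU cap (cores)
        tail_delay_model: Callable[[float], float]   # per-copy qps -> P99 ms
        utilization_model: Callable[[float], float]  # per-copy qps -> CPU cores

    def freq_cap(ms: Microservice, target_ms: float,
                 lo: float = 0.0, hi: float = 1000.0) -> float:
        # Invert the monotone tail-delay model by bisection: the largest
        # per-copy request frequency whose modelled tail delay still meets
        # the tail delay target.
        for _ in range(50):
            mid = (lo + hi) / 2.0
            if ms.tail_delay_model(mid) <= target_ms:
                lo = mid
            else:
                hi = mid
        return lo

    def expand(ms: Microservice, predicted_qps: float, target_ms: float,
               eps1: float = 0.1, eps2: float = 0.05) -> None:
        cap = freq_cap(ms, target_ms)
        second_ref = cap * ms.copies * (1.0 - eps1)  # assumed 'second preset rule'
        if predicted_qps < second_ref:
            # Longitudinal (vertical) expansion only: resize each copy to
            # the modelled utilization at its share of the load, padded by
            # eps2 (assumed 'third preset rule').
            ms.cpu_limit = ms.utilization_model(predicted_qps / ms.copies) * (1.0 + eps2)
        else:
            # Transverse (horizontal) expansion first (assumed 'fourth
            # preset rule'), then the same longitudinal adjustment per copy.
            ms.copies = math.ceil(predicted_qps * (1.0 + eps1) / cap)
            ms.cpu_limit = ms.utilization_model(predicted_qps / ms.copies) * (1.0 + eps2)

    ms = Microservice(copies=2, cpu_limit=1.0,
                      tail_delay_model=lambda f: 10.0 + 0.2 * f,
                      utilization_model=lambda f: 0.01 * f)
    expand(ms, predicted_qps=400.0, target_ms=40.0)
    print(ms.copies, round(ms.cpu_limit, 2))  # 3 copies, ~1.4 cores each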
5. The method for resource management according to claim 4, wherein said performing a capacity reduction process on the resources occupied by the micro service comprises:
calculating to obtain a copy number reference value according to a fourth preset rule by using the predicted request frequency, the tail delay target and the first preset error coefficient;
comparing the copy number reference value with the copy number;
if the reference value of the number of the copies is equal to the number of the copies, carrying out longitudinal capacity reduction on the resource of the micro service, and calculating the resource utilization rate after the longitudinal capacity reduction according to a third preset rule by utilizing the predicted request frequency, the number of the copies, the resource utilization rate model and a second preset error coefficient;
and if the copy number reference value is smaller than the copy number, transversely shrinking the resource of the micro service, then longitudinally adjusting the resource, wherein the copy number after transversely shrinking is the copy number reference value, and the maximum resource utilization rate of each copy after longitudinally adjusting is calculated by using the prediction request frequency, the copy number, the resource utilization rate model and a second preset error coefficient according to a third preset rule.
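The reduction branch mirrors the expansion sketch above, reusing its Microservice type, freq_cap helper, and the same assumed preset rules.

    def shrink(ms: Microservice, predicted_qps: float, target_ms: float,
               eps1: float = 0.1, eps2: float = 0.05) -> None:
        cap = freq_cap(ms, target_ms)
        # Copy-number reference value (assumed 'fourth preset rule').
        ref_copies = max(1, math.ceil(predicted_qps * (1.0 + eps1) / cap))
        if ref_copies < ms.copies:
            ms.copies = ref_copies  # transverse (horizontal) shrink first
        # Longitudinal resize applies in both branches of the claim
        # (assumed 'third preset rule').
        ms.cpu_limit = ms.utilization_model(predicted_qps / ms.copies) * (1.0 + eps2)

    ms = Microservice(copies=3, cpu_limit=1.4,
                      tail_delay_model=lambda f: 10.0 + 0.2 * f,
                      utilization_model=lambda f: 0.01 * f)
    shrink(ms, predicted_qps=120.0, target_ms=40.0)
    print(ms.copies, round(ms.cpu_limit, 2))  # 1 copy, ~1.26 cores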
6. The resource management method according to claim 1, wherein generating the tail delay target for each micro-service during the offline analysis phase comprises:
in an offline analysis stage, acquiring response time delay and a call dependency graph of each micro-service, wherein the call dependency graph is generated according to a call link of each micro-service, and comprises a plurality of nodes, and each node corresponds to one micro-service;
confirming the average response time delay of each node in the call dependency graph according to the response time delay of each micro service;
starting from each node with an in-degree of 0, traversing all paths in the graph, and accumulating the average response time delays of the nodes through which each path passes to obtain the average response time delay of each path;
dividing the average response time delay of each node by the average response time delay of the path where the node is located to obtain the delay proportion of the node;
and multiplying the delay proportion by a pre-assigned tail delay to obtain a tail delay target for each node, and, when a node has a plurality of tail delay targets, selecting the smallest as its final tail delay target.
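Claim 6 specifies this procedure completely enough to code directly: average delays per node, enumeration of all root-to-leaf paths, per-path delay proportions, and the minimum across paths. A self-contained sketch follows; the graph representation and numbers are illustrative.

    from collections import defaultdict

    def tail_delay_targets(edges: dict[str, list[str]],
                           avg_delay: dict[str, float],
                           slo_ms: float) -> dict[str, float]:
        # In-degrees identify the entry nodes of the call dependency graph.
        indeg = defaultdict(int)
        for callees in edges.values():
            for v in callees:
                indeg[v] += 1

        # Enumerate every path from an in-degree-0 node to a leaf.
        paths: list[list[str]] = []
        def dfs(node: str, path: list[str]) -> None:
            path = path + [node]
            children = edges.get(node, [])
            if not children:
                paths.append(path)
            for child in children:
                dfs(child, path)
        for root in (n for n in avg_delay if indeg[n] == 0):
            dfs(root, [])

        # Target = the node's delay proportion on the path times the
        # pre-assigned tail delay; a node on several paths keeps its
        # smallest (strictest) target.
        targets: dict[str, float] = {}
        for path in paths:
            path_delay = sum(avg_delay[n] for n in path)
            for n in path:
                candidate = slo_ms * avg_delay[n] / path_delay
                targets[n] = min(targets.get(n, float("inf")), candidate)
        return targets

    # Example: two call paths A->B->D and A->C->D, end-to-end SLO of 100 ms.
    edges = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
    avg = {"A": 5.0, "B": 20.0, "C": 10.0, "D": 15.0}
    print(tail_delay_targets(edges, avg, slo_ms=100.0))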
7. The method of claim 1, wherein said adjusting the resource allocation of said micro services and said offline services according to said real-time tail delay and said tail delay target comprises:
if the real-time tail delay of the target micro-service is larger than the tail delay target corresponding to the target micro-service, reducing the resources of the deployed offline service by half;
if the real-time tail delay of the target micro-service is not greater than the tail delay target corresponding to the target micro-service, deploying a new offline service or increasing the resources of the deployed offline service.
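A sketch of claim 7's rebalancing step. The halving factor comes from the claim; checking every online micro-service in one pass, the choice between deploying pending offline work and growing already-deployed jobs, and the 1.25 growth step are illustrative assumptions.

    import random
    from dataclasses import dataclass

    @dataclass
    class Workload:
        name: str
        cpu_cores: float

    def measure_tail_latency(svc: Workload) -> float:
        return random.uniform(10.0, 120.0)  # stand-in for a live P99 probe (ms)

    def adjust(online: list[Workload], offline: list[Workload],
               targets: dict[str, float], pending: list[Workload]) -> None:
        violated = any(measure_tail_latency(s) > targets[s.name] for s in online)
        if violated:
            for job in offline:
                job.cpu_cores *= 0.5               # halve offline resources
        elif pending:
            offline.append(pending.pop(0))         # deploy a new offline service
        else:
            for job in offline:
                job.cpu_cores *= 1.25              # grow deployed offline services

    online = [Workload("frontend", 2.0)]
    offline = [Workload("etl", 4.0)]
    adjust(online, offline, {"frontend": 60.0}, pending=[Workload("report", 1.0)])
    print([(j.name, j.cpu_cores) for j in offline])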
8. A resource management device, comprising:
the online service deployment module is used for reducing the resource allocation of the offline service and configuring excessive resources for each micro service of the online service in the current time period if an instruction for deploying the online service is received when the host computer deploys the offline service;
the pre-allocation module is used for predicting the predicted request frequency of each micro-service in the next preset time period based on a pre-trained prediction model, and pre-allocating the resources of each micro-service in the next preset time period according to the predicted request frequency;
the acquisition module is used for acquiring the real-time tail delay of each micro service at intervals of a preset period and acquiring a tail delay target generated by each micro service in an offline analysis stage;
and the adjustment module is used for adjusting the resource allocation of the micro service and the offline service according to the real-time tail delay and the tail delay target.
9. A computer device comprising a processor and a memory coupled to the processor, the memory storing program instructions that, when executed by the processor, cause the processor to perform the steps of the resource management method of any of claims 1-7.
10. A storage medium storing program instructions enabling the implementation of the resource management method according to any one of claims 1-7.
CN202311798418.7A 2023-12-25 2023-12-25 Resource management method, device, equipment and storage medium Pending CN117785457A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311798418.7A CN117785457A (en) 2023-12-25 2023-12-25 Resource management method, device, equipment and storage medium


Publications (1)

Publication Number Publication Date
CN117785457A (en)

Family

ID=90397572

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311798418.7A Pending CN117785457A (en) 2023-12-25 2023-12-25 Resource management method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117785457A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination