CN117785457A - Resource management method, device, equipment and storage medium - Google Patents

Resource management method, device, equipment and storage medium

Info

Publication number: CN117785457A
Application number: CN202311798418.7A
Authority: CN (China)
Other languages: Chinese (zh)
Prior art keywords: micro-service, request frequency, delay, tail delay
Legal status: Pending
Inventors: 叶可江, 罗树添, 李想, 徐敏贤, 须成忠
Applicant and assignee: Shenzhen Institute of Advanced Technology of CAS
Classification: Data Exchanges In Wide-Area Networks
Abstract

The invention discloses a resource management method, a device, equipment and a storage medium, wherein the method comprises the following steps: when a host has already deployed offline services, if an instruction to deploy an online service is received, reducing the resource allocation of the offline services, and provisioning surplus resources for each micro-service of the online service in the current time period; predicting the request frequency of each micro-service in the next preset time period with a pre-trained prediction model, and pre-allocating resources for each micro-service in the next preset time period according to the predicted request frequency; acquiring the real-time tail delay of each micro-service at a preset interval, and acquiring the tail delay target generated for each micro-service in an offline analysis stage; and adjusting the resource allocation of the micro-services and the offline services according to the real-time tail delay and the tail delay target. The invention can effectively handle multiple online and multiple offline services co-located on a host, guarantee the end-to-end response latency of online services, and improve resource utilization.

Description

Resource management method, device, equipment and storage medium
Technical Field
The present disclosure relates to the field of cloud computing technologies, and in particular, to a method, an apparatus, a device, and a storage medium for resource management.
Background
Cloud computing is a model for providing computing resources and services over a network, and can support a wide range of applications such as online applications and batch jobs. To guarantee application quality of service, resource management systems in cloud computing often over-allocate resources to applications, but this lowers the overall resource utilization of the cloud computing system. Microservices, a commonly used architecture for online applications, typically consist of many microservice components with complex calling relationships among them.
In recent years, the micro-service architecture has developed rapidly and is widely applied in cloud computing. Compared with a traditional monolithic architecture that runs all service components in one application, a micro-service system decouples the application into multiple components that are easier to manage, maintain and update. Owing to the lightweight and loosely coupled nature of micro-services, the resource manager can locate an overloaded individual micro-service and scale that micro-service independently as load increases, rather than scaling the entire application.
With the widespread use of the micro-service architecture, micro-service resource management faces new challenges. Although the architecture is flexible, hundreds of micro-service requests must be processed while service level agreements are guaranteed. The components form a complex dependency graph, so fine-grained resource management that improves resource utilization while ensuring end-to-end response latency becomes very difficult. Furthermore, micro-service containers typically run on the same physical machine as batch applications, which can lead to performance imbalance between containers of the same micro-service, especially under heavy workloads where resource interference occurs. Most importantly, when multiple online services and multiple offline services are co-located on a host, reasonable resource allocation is particularly important, yet existing resource scheduling schemes allocate resources solely for online services or solely for offline services; there is as yet no resource scheduling scheme that can effectively coordinate co-located online and offline services.
Disclosure of Invention
In view of this, the present application provides a resource management method, apparatus, device and storage medium, so as to solve the problem of unreasonable resource scheduling when online services and offline services are co-located.
In order to solve the above technical problems, one technical scheme adopted by the present application is to provide a resource management method, comprising: when the host has deployed offline services, if an instruction to deploy an online service is received, reducing the resource allocation of the offline services, and provisioning surplus resources for each micro-service of the online service in the current time period; predicting the request frequency of each micro-service in the next preset time period based on a pre-trained prediction model, and pre-allocating resources for each micro-service in the next preset time period according to the predicted request frequency; acquiring the real-time tail delay of each micro-service at a preset interval, and acquiring the tail delay target generated for each micro-service in the offline analysis stage; and adjusting the resource allocation of the micro-services and the offline services according to the real-time tail delay and the tail delay target.
As a further improvement of the present application, predicting the request frequency of each micro-service in the next preset time period based on a pre-trained prediction model, and pre-allocating resources for each micro-service in the next preset time period according to the predicted request frequency, includes: after the online service is deployed, obtaining a pre-constructed tail delay model taking the request frequency as an independent variable and a corresponding resource utilization model for each micro-service, wherein the tail delay model is constructed from the tail delays of each micro-service under a plurality of preset request frequencies, and the resource utilization model is constructed from the resource utilization of each micro-service under the same preset request frequencies; acquiring the current request frequency of each micro-service in the current time period; inputting the current time period and the current request frequency of each micro-service into a pre-trained time series prediction model, and predicting the request frequency of each micro-service in the next preset time period; and pre-allocating resources of the next preset time period for each micro-service according to the current request frequency, the predicted request frequency, the tail delay model of each micro-service and the resource utilization model.
As a further improvement of the present application, pre-allocating resources of the next preset time period for each micro-service according to the current request frequency, the predicted request frequency, and the tail delay model and resource utilization model of each micro-service includes: calculating a first request frequency reference value according to a first preset rule using the current request frequency, the number of copies of the micro-service obtained in advance and a first preset error coefficient; comparing the predicted request frequency with the first request frequency reference value; if the predicted request frequency is greater than or equal to the first request frequency reference value, performing capacity expansion processing on the resources occupied by the micro-service; and if the predicted request frequency is smaller than the first request frequency reference value, performing capacity reduction processing on the resources occupied by the micro-service.
As a further improvement of the present application, performing capacity expansion processing on the resources occupied by the micro-service includes: confirming the minimum request frequency corresponding to the tail delay target according to the tail delay target and the tail delay model; calculating a second request frequency reference value according to a second preset rule from the minimum request frequency, the number of copies of the micro-service and the first preset error coefficient; comparing the predicted request frequency with the second request frequency reference value; if the predicted request frequency is smaller than the second request frequency reference value, vertically expanding the resources of the micro-service, and calculating the resource utilization after vertical expansion according to a third preset rule using the predicted request frequency, the number of copies, the resource utilization model and a second preset error coefficient; if the predicted request frequency is greater than or equal to the second request frequency reference value, horizontally expanding the resources of the micro-service and then adjusting them vertically, calculating the number of copies after horizontal expansion according to a fourth preset rule using the predicted request frequency, the tail delay target and the first preset error coefficient, and calculating the maximum resource utilization of each vertically adjusted copy according to the third preset rule using the predicted request frequency, the number of copies, the resource utilization model and the second preset error coefficient.
As a further improvement of the present application, performing capacity reduction processing on the resources occupied by the micro-service includes: calculating a copy-number reference value according to the fourth preset rule using the predicted request frequency, the tail delay target and the first preset error coefficient; comparing the copy-number reference value with the number of copies; if the copy-number reference value is equal to the number of copies, vertically shrinking the resources of the micro-service, and calculating the resource utilization after vertical shrinkage according to the third preset rule using the predicted request frequency, the number of copies, the resource utilization model and the second preset error coefficient; if the copy-number reference value is smaller than the number of copies, horizontally shrinking the resources of the micro-service and then adjusting them vertically, wherein the number of copies after horizontal shrinkage equals the copy-number reference value, and the maximum resource utilization of each vertically adjusted copy is calculated according to the third preset rule using the predicted request frequency, the number of copies, the resource utilization model and the second preset error coefficient.
As a further improvement of the present application, generating the tail delay target of each micro-service in the offline analysis stage specifically includes: in the offline analysis stage, acquiring the response latency and the call dependency graph of each micro-service, wherein the call dependency graph is generated from the call links of the micro-services and includes a plurality of nodes, each node corresponding to one micro-service; confirming the average response latency of each node in the call dependency graph according to the response latency of each micro-service; starting from each node with an in-degree of 0, traversing all paths in the graph and accumulating the average response latencies of the nodes each path passes through to obtain the average response latency of each path; dividing the average response latency of each node by the average response latency of the path it lies on to obtain the delay proportion of the node; and multiplying the delay proportion by a pre-specified tail delay to obtain the tail delay target of each node, selecting the smallest tail delay target as the final tail delay target when a node has multiple tail delay targets.
As a further improvement of the present application, adjusting resource allocation of micro services and offline services according to real-time tail delay and tail delay targets includes: if the real-time tail delay of the target micro-service is larger than the tail delay target corresponding to the target micro-service, reducing the resources of the deployed offline service by half; if the real-time tail delay of the target micro-service is not greater than the tail delay target corresponding to the target micro-service, deploying a new offline service or increasing the resources of the deployed offline service.
In order to solve the above technical problem, another technical scheme adopted by the present application is to provide a resource management device, including: an online service deployment module, configured to, when the host has deployed offline services, reduce the resource allocation of the offline services and provision surplus resources for each micro-service of the online service in the current time period if an instruction to deploy the online service is received; a pre-allocation module, configured to predict the request frequency of each micro-service in the next preset time period based on a pre-trained prediction model, and pre-allocate resources for each micro-service in the next preset time period according to the predicted request frequency; an acquisition module, configured to acquire the real-time tail delay of each micro-service at a preset interval, and acquire the tail delay target generated for each micro-service in the offline analysis stage; and an adjustment module, configured to adjust the resource allocation of the micro-services and the offline services according to the real-time tail delay and the tail delay target.
In order to solve the above technical problem, a further technical scheme adopted by the present application is to provide a computer device comprising a processor and a memory coupled to the processor, the memory storing program instructions which, when executed by the processor, cause the processor to perform the steps of any one of the above resource management methods.
In order to solve the technical problem, a further technical scheme adopted by the application is as follows: there is provided a storage medium storing program instructions capable of implementing any one of the above resource management methods.
The beneficial effects of the present application are as follows: with the resource management method of the present application, after offline services are deployed, when an online service needs to be deployed, the resources of the offline services are first reduced and the online service is then deployed; the end-to-end response latency of the online service is guaranteed by predicting and pre-allocating the resources of each micro-service for the next preset time period; meanwhile, the tail delay of each micro-service is monitored, and the resources occupied by each micro-service and by the offline services are dynamically adjusted in combination with the preset tail delay target of each micro-service, so that resource utilization is improved while the end-to-end response latency of the online service is guaranteed, ensuring efficient operation of co-located online and offline services.
Drawings
FIG. 1 is a flow chart of a resource management method according to an embodiment of the invention;
FIG. 2 is a schematic diagram of functional modules of a resource management device according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a computer device according to an embodiment of the present invention;
fig. 4 is a schematic structural view of a storage medium according to an embodiment of the present invention.
Detailed Description
The following description of the technical solutions in the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
The terms "first," "second," "third," and the like in this application are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first", "a second", and "a third" may explicitly or implicitly include at least one such feature. In the description of the present application, the meaning of "plurality" means at least two, for example, two, three, etc., unless specifically defined otherwise. All directional indications (such as up, down, left, right, front, back … …) in the embodiments of the present application are merely used to explain the relative positional relationship, movement, etc. between the components in a particular gesture (as shown in the drawings), and if the particular gesture changes, the directional indication changes accordingly. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed steps or elements but may include other steps or elements not listed or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
Fig. 1 is a flow chart of a resource management method according to an embodiment of the present invention. It should be noted that, if there are substantially the same results, the method of the present invention is not limited to the flow sequence shown in fig. 1. As shown in fig. 1, the resource management method includes the steps of:
step S101: when the host computer deploys the offline service, if an instruction for deploying the online service is received, reducing the resource allocation of the offline service, and configuring excessive resources for each micro-service of the online service in the current time period.
It should be noted that online services include, but are not limited to, services that run for a long time, are latency-sensitive and have high stability requirements, whose instability is immediately perceived by users and causes losses, and whose load shows obvious peaks and valleys (e.g., high traffic in the daytime and low traffic late at night), such as advertisement and search services. Offline services include, but are not limited to, services that are not latency-sensitive, can be retried, and typically run for a short time on the order of tens of minutes, such as big-data computing and machine-learning services.
Specifically, in this embodiment, when deciding to deploy an online service, the resource allocation of all deployed offline services is first reduced to ensure that sufficient resources can be allocated to the online service. One way to reduce the offline service resource allocation is to cut it to half of its original value. When the online service is deployed, surplus resources are first allocated to its micro-services to ensure normal deployment, and the resource allocation of the micro-services is then adjusted.
Step S102: predicting the request frequency of each micro-service in the next preset time period based on a pre-trained prediction model, and pre-allocating resources for each micro-service in the next preset time period according to the predicted request frequency.
Since the change in the request frequency of the online service generally has a certain periodicity, for example, a period of one day or one week, the request frequency of the online service can be predicted by using the time series prediction model.
In this embodiment, the time axis is first divided according to the variation pattern of the request frequency of each micro-service of the online service, yielding a plurality of time periods; for example, a day is divided into 24 time periods of one hour each. The time series prediction model is then trained with historical request frequency data of the micro-services collected in advance. After the online service is deployed, each micro-service uses its corresponding time series prediction model and the request frequency of the current preset time period to predict the request frequency of its next preset time period; resources for each micro-service in the next preset time period are then pre-allocated according to that prediction, so that once the next preset time period begins, the resource allocation of the micro-service can be adjusted directly to the pre-allocated amount. In this way reasonable resources are allocated to each micro-service, avoiding over-allocation or under-allocation.
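To make the prediction step concrete, the sketch below shows one way such a per-period predictor could look. The patent only requires "a pre-trained time series prediction model"; the seasonal-average model, the class name SeasonalFrequencyPredictor and the 50/50 blending weight are illustrative assumptions, not the claimed model.

```python
import numpy as np

class SeasonalFrequencyPredictor:
    """Per-microservice request-frequency predictor (illustrative stand-in)."""

    def __init__(self, periods_per_day: int = 24):
        self.periods_per_day = periods_per_day
        # history[period index] -> observed request frequencies (req/s)
        self.history = {p: [] for p in range(periods_per_day)}

    def fit(self, observations):
        """observations: iterable of (period_index, request_frequency)."""
        for period, freq in observations:
            self.history[period % self.periods_per_day].append(freq)

    def predict_next(self, current_period: int, current_freq: float) -> float:
        """Predicted request frequency for the next preset time period."""
        nxt = (current_period + 1) % self.periods_per_day
        past = self.history[nxt]
        if not past:
            return current_freq  # no history yet: fall back to persistence
        # Blend the seasonal mean with the live observation to track drift.
        return 0.5 * float(np.mean(past)) + 0.5 * current_freq

# One predictor per micro-service, trained on collected history.
predictor = SeasonalFrequencyPredictor()
predictor.fit([(9, 120.0), (10, 150.0), (9, 110.0), (10, 160.0)])
r_next = predictor.predict_next(current_period=9, current_freq=130.0)
```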
Further, step S102 specifically includes:
1. After the online service is deployed, a pre-constructed tail delay model taking the request frequency as an independent variable and a corresponding resource utilization model are obtained for each micro-service; the tail delay model is built from the tail delays of each micro-service under a plurality of preset request frequencies, and the resource utilization model is built from the resource utilization of each micro-service under the same preset request frequencies.
In this embodiment, a request frequency-tail delay model and a request frequency-resource utilization model need to be built for each micro-service; both models take the request frequency as the independent variable. The model construction process is as follows:
first, the request frequency is initially set to a minimum level (e.g., 10 requests per second), the application is accessed at the minimum level request frequency, and the tail delay (e.g., response delay at 95% of the minutes) and resource utilization of each micro-service at that time are recorded.
Next, the request frequency is increased step by step, one minimum level at a time (e.g., with a minimum level of 10 requests per second, the increased frequencies are 20 requests per second, 30 requests per second, and so on), and the tail delay (e.g., the 95th-percentile response latency) and resource utilization of each micro-service at each request frequency are recorded.
The collected data is then organized into two groups. In the first group, the request frequencies are arranged from small to large, with the tail delay corresponding to each request frequency listed. In the second group, the request frequencies are likewise arranged from small to large, with the corresponding resource utilization at each request frequency listed.
The specific format is as follows:
First group:
[request frequency 1, request frequency 2, request frequency 3, ..., request frequency n];
[tail delay 1, tail delay 2, tail delay 3, ..., tail delay n];
Second group:
[request frequency 1, request frequency 2, request frequency 3, ..., request frequency n];
[resource utilization 1, resource utilization 2, resource utilization 3, ..., resource utilization n];
Finally, a request frequency-tail delay model and a request frequency-resource utilization model are established for each micro-service by curve fitting over the two organized groups of data.
Here, resource utilization refers specifically to CPU utilization.
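The profiling procedure above lends itself to a short sketch. A polynomial fit is used here as one possible "fitting curve"; the patent does not fix the curve family, and the sample measurements are invented for illustration.

```python
import numpy as np

# Stepped-load measurements (illustrative): frequencies in req/s, p95 tail
# delay in ms, CPU utilization in %.
request_freqs = np.array([10, 20, 30, 40, 50, 60], dtype=float)
tail_delays = np.array([8, 9, 11, 14, 19, 27], dtype=float)
cpu_utils = np.array([15, 28, 40, 55, 68, 82], dtype=float)

# Request frequency -> tail delay model (quadratic fit as one choice).
tail_model = np.poly1d(np.polyfit(request_freqs, tail_delays, deg=2))
# Request frequency -> CPU utilization model (near-linear in this data).
util_model = np.poly1d(np.polyfit(request_freqs, cpu_utils, deg=1))

def freq_for_tail_target(target_ms: float) -> float:
    """Request frequency R_k at which the fitted tail delay reaches the
    target -- used later when sizing copies against the tail delay target."""
    grid = np.linspace(request_freqs.min(), request_freqs.max(), 500)
    within = grid[tail_model(grid) <= target_ms]
    return float(within.max()) if within.size else float(request_freqs.min())

print(tail_model(45.0), util_model(45.0), freq_for_tail_target(20.0))
```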
2. The current request frequency of each micro-service in the current time period is obtained.
3. Inputting the current time period and the current request frequency of each micro-service into a pre-trained time series prediction model, and predicting the request frequency of each micro-service in the next preset time period.
Specifically, after the current request frequency of the micro-service is obtained, it is input together with the current time period into the pre-trained time series prediction model, which predicts the request frequency of the next preset time period.
4. Pre-allocating resources of the next preset time period for each micro-service according to the current request frequency, the predicted request frequency, the tail delay model of each micro-service and the resource utilization model.
Specifically, after the predicted request frequency of the next preset time period is obtained, resources are pre-allocated for the micro-service according to the predicted request frequency, the pre-constructed tail delay model and the resource utilization model.
Further, pre-allocating resources of the next preset time period for each micro-service according to the current request frequency, the predicted request frequency, and the tail delay model and resource utilization model of each micro-service includes:
and 4.1, calculating a first request frequency reference value according to a first preset rule by using the current request frequency, the number of copies of the micro-service obtained in advance and a first preset error coefficient.
Specifically, a resource pre-allocation procedure of a micro service is described as an example: assuming that the number of copies of the micro-service is n, the average request frequency of the current copy is R c (i.e. the current request frequency), the predicted request frequency of the micro-service in the next preset time period predicted according to the time sequence prediction model is R n
First, the current request frequency R is utilized c Calculating a first request frequency reference value R from the number n of copies of the micro-service obtained in advance and a first preset error coefficient alpha 1 The calculation process is as follows:
R 1 =R c *n*α;
the first preset error coefficient α is set to eliminate the influence of the model prediction error as much as possible, and the value range is 0-1, and 0.7 is usually preferable.
4.2. Comparing the predicted request frequency with the first request frequency reference value.
4.3. If the predicted request frequency is greater than or equal to the first request frequency reference value, performing capacity expansion processing on the resources occupied by the micro-service.
Specifically, when the predicted request frequency is greater than or equal to the first request frequency reference value, the resources needed by the micro-service in the next preset time period will increase, so capacity expansion processing needs to be performed on the resources occupied by the micro-service.
Further, when performing capacity expansion, it is necessary to determine whether the micro-service should be expanded vertically or horizontally. The capacity expansion processing of the resources occupied by the micro-service therefore specifically includes:
4.3.1. Confirming the minimum request frequency corresponding to the tail delay target according to the tail delay target and the tail delay model.
4.3.2. Calculating a second request frequency reference value according to a second preset rule from the minimum request frequency, the number of copies of the micro-service and the first preset error coefficient.
Specifically, the tail delay target L_k preset for each micro-service is first acquired, and the minimum request frequency R_k corresponding to the tail delay target is confirmed by combining the tail delay target with the tail delay model.
Then, the second request frequency reference value R_2 is calculated from the minimum request frequency R_k, the number n of copies obtained in advance and the first preset error coefficient α:
R_2 = R_k * n * α.
4.3.3. Comparing the predicted request frequency with the second request frequency reference value.
4.3.4. If the predicted request frequency is smaller than the second request frequency reference value, vertically expanding the resources of the micro-service, and calculating the resource utilization after vertical expansion according to a third preset rule using the predicted request frequency, the number of copies, the resource utilization model and a second preset error coefficient.
Specifically, when the predicted request frequency is smaller than the second request frequency reference value, vertical capacity expansion is performed, and the resource utilization C_d after vertical expansion is calculated as follows: according to the resource utilization model, the resource utilization C_n at request frequency R_n/n is confirmed, and the post-expansion resource utilization is then calculated according to the third preset rule: C_d = C_n * β, where β is the second preset error coefficient; to eliminate the influence of model prediction error as far as possible, its value ranges from 1 to 2, and 1.4 is usually preferred. The final maximum resource utilization of the micro-service is thus limited to C_d.
4.3.5. If the predicted request frequency is greater than or equal to the second request frequency reference value, horizontally expanding the resources of the micro-service and then adjusting them vertically; the number of copies after horizontal expansion is calculated according to a fourth preset rule using the predicted request frequency, the tail delay target and the first preset error coefficient, and the maximum resource utilization of each vertically adjusted copy is calculated according to the third preset rule using the predicted request frequency, the number of copies, the resource utilization model and the second preset error coefficient.
Specifically, in the case of horizontal expansion, the number of copies after expansion n_d is calculated as n_d = ceil(R_n / (R_k * α)), where ceil() rounds a decimal up to the nearest integer; the maximum resource utilization of each copy is then adjusted vertically to C_d.
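The capacity expansion branch (steps 4.3.1 to 4.3.5) can be summarized in code. The sketch below follows the formulas R_2 = R_k * n * α, C_d = C_n * β and n_d = ceil(R_n / (R_k * α)); the helper names tail_model_inverse and util_model, and the toy stand-in models in the usage example, are assumptions standing in for the fitted models described earlier.

```python
import math

def scale_up(r_n, n, alpha, beta, tail_target, tail_model_inverse, util_model):
    """Return (copies, per-copy CPU utilization cap) after capacity expansion.

    r_n:   predicted request frequency for the next preset time period
    n:     current number of copies
    alpha: first preset error coefficient, 0 < alpha <= 1 (0.7 typical)
    beta:  second preset error coefficient, 1 <= beta <= 2 (1.4 typical)
    """
    r_k = tail_model_inverse(tail_target)   # min frequency meeting the target
    r_2 = r_k * n * alpha                   # second request frequency reference
    if r_n < r_2:
        # Vertical expansion only: raise the CPU cap, keep the copy count.
        c_d = util_model(r_n / n) * beta    # third preset rule: C_d = C_n * beta
        return n, c_d
    # Horizontal expansion first (fourth preset rule), then vertical adjustment;
    # the per-copy load after expansion is assumed to be r_n / n_d.
    n_d = math.ceil(r_n / (r_k * alpha))
    c_d = util_model(r_n / n_d) * beta
    return n_d, c_d

# Usage with toy stand-in models (assumed values, not profiled data):
copies, cpu_cap = scale_up(
    r_n=500.0, n=3, alpha=0.7, beta=1.4, tail_target=30.0,
    tail_model_inverse=lambda target_ms: 60.0,  # 60 req/s per copy within target
    util_model=lambda freq: 1.2 * freq,         # ~1.2% CPU per req/s
)
```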
4.4. If the predicted request frequency is smaller than the first request frequency reference value, performing capacity reduction processing on the resources occupied by the micro-service.
Specifically, when the predicted request frequency is smaller than the first request frequency reference value, the resources needed by the micro-service in the next preset time period will decrease, so capacity reduction processing needs to be performed on the resources occupied by the micro-service.
Further, the capacity reduction processing of the resources occupied by the micro-service includes:
4.4.1. Calculating the copy-number reference value according to the fourth preset rule using the predicted request frequency, the tail delay target and the first preset error coefficient.
Specifically, the copy-number reference value n_1 is calculated in the same way as the post-expansion copy number during capacity expansion, namely n_1 = ceil(R_n / (R_k * α)).
4.4.2. Comparing the copy-number reference value with the number of copies.
4.4.3. If the copy-number reference value is equal to the number of copies, vertically shrinking the resources of the micro-service, and calculating the resource utilization after vertical shrinkage according to the third preset rule using the predicted request frequency, the number of copies, the resource utilization model and the second preset error coefficient.
Specifically, when the copy-number reference value n_1 is equal to the number of copies, vertical shrinkage is performed, and the resource utilization after shrinkage is C_d.
4.4.4. If the copy-number reference value is smaller than the number of copies, horizontally shrinking the resources of the micro-service and then adjusting them vertically; the number of copies after horizontal shrinkage equals the copy-number reference value, and the maximum resource utilization of each vertically adjusted copy is calculated according to the third preset rule using the predicted request frequency, the number of copies, the resource utilization model and the second preset error coefficient.
Specifically, when the copy-number reference value n_1 is smaller than the number of copies, the copies are first shrunk horizontally and then adjusted vertically; the number of copies after horizontal shrinkage is n_s = ceil(R_n / (R_k * α)), and the maximum resource utilization of each copy is adjusted vertically to C_d.
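A companion sketch for the capacity reduction branch (steps 4.4.1 to 4.4.4), mirroring the expansion logic; r_k, util_model and the coefficients carry the same illustrative meanings as in the expansion sketch above.

```python
import math

def scale_down(r_n, n, alpha, beta, r_k, util_model):
    """Return (copies, per-copy CPU utilization cap) after capacity reduction."""
    n_1 = math.ceil(r_n / (r_k * alpha))    # fourth preset rule: reference count
    if n_1 == n:
        # Vertical shrinkage only: lower the CPU cap on the existing copies.
        return n, util_model(r_n / n) * beta
    if n_1 < n:
        # Horizontal shrinkage to n_1 copies, then vertical adjustment.
        return n_1, util_model(r_n / n_1) * beta
    # n_1 > n cannot occur here: the earlier comparison with the first
    # reference value already routed growing load to capacity expansion.
    raise ValueError("scale_down called although more copies are needed")

copies, cpu_cap = scale_down(
    r_n=90.0, n=3, alpha=0.7, beta=1.4,
    r_k=60.0,                               # assumed per-copy capacity
    util_model=lambda freq: 1.2 * freq,     # assumed utilization curve
)
```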
It should be noted that the above shrinkage and expansion are performed at specific time points: the capacity expansion operation takes place before the time series prediction model makes its next prediction (e.g., 1 minute before the prediction), and the capacity reduction operation takes place after the time series prediction model completes the next prediction (e.g., 1 minute after the prediction).
Step S103: acquiring the real-time tail delay of each micro-service at a preset interval, and acquiring the tail delay target generated for each micro-service in the offline analysis stage.
Specifically, in this embodiment, the response latency of each micro-service is sampled periodically (e.g., every 5 seconds) with a link tracing tool (e.g., Jaeger), and the change in its tail delay is monitored.
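A minimal sketch of this sampling step, with the tracer query abstracted away: in practice the latency window would come from a link tracing tool such as Jaeger, and the sample values here are invented.

```python
import numpy as np

def sample_tail_delay(latencies_ms, percentile: float = 95.0) -> float:
    """Real-time tail delay (e.g. p95) over one sampling window."""
    return float(np.percentile(np.asarray(latencies_ms, dtype=float), percentile))

# One 5-second window of per-request response latencies (ms), invented:
window = [12.1, 9.8, 14.3, 11.0, 35.6, 10.2, 13.9]
p95 = sample_tail_delay(window)
```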
Further, generating a tail delay target of each micro-service in an offline analysis stage specifically includes:
1. In the offline analysis stage, the response latency and the call dependency graph of each micro-service are acquired; the call dependency graph is generated from the call links of the micro-services and includes a plurality of nodes, each node corresponding to one micro-service.
Specifically, the link tracing tool can analyze the call relationships from the recorded call data and then generate a dependency graph for the application, which reveals the call relationships among the application components. Nodes in the call dependency graph represent components, arrows represent call relationships, and the numbers on the edges represent call counts.
2. Confirming the average response latency of each node in the call dependency graph according to the response latency of each micro-service.
3. Starting from each node with an in-degree of 0, traversing all paths in the graph and accumulating the average response latencies of the nodes each path passes through to obtain the average response latency of each path.
4. Dividing the average response latency of each node by the average response latency of the path it lies on to obtain the delay proportion of the node.
5. Multiplying the delay proportion by a pre-specified tail delay to obtain the tail delay target of each node; when a node has multiple tail delay targets, the smallest is selected as its final tail delay target.
It should be noted that the user needs to pre-specify a total end-to-end tail delay target for the service (e.g., the 95th-percentile delay should be within 30 ms); the tail delay of each path is then set to this total tail delay target.
It should be understood that a node may appear in multiple paths and thus have multiple corresponding delay proportions, so multiple tail delay targets may be calculated for it; this embodiment takes the smallest of them as the node's final tail delay target.
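The whole offline derivation (steps 1 to 5) fits in a short sketch. The four-node graph, the average delays and the 30 ms end-to-end target are invented for illustration; only the path traversal and the min-over-paths rule come from the description above.

```python
from collections import defaultdict

# Call dependency graph as a DAG; edges point from caller to callee.
graph = {"gateway": ["auth", "search"], "auth": [], "search": ["rank"], "rank": []}
avg_delay_ms = {"gateway": 2.0, "auth": 5.0, "search": 8.0, "rank": 10.0}
total_tail_target_ms = 30.0  # user-specified end-to-end p95 target

# Roots are the nodes with in-degree 0.
in_deg = defaultdict(int)
for callees in graph.values():
    for callee in callees:
        in_deg[callee] += 1
roots = [node for node in graph if in_deg[node] == 0]

# Enumerate every root-to-leaf path.
paths = []
def walk(node, path):
    path = path + [node]
    if not graph[node]:
        paths.append(path)
    for nxt in graph[node]:
        walk(nxt, path)
for root in roots:
    walk(root, [])

# Each node's target = its delay proportion on a path times the total target;
# a node on several paths keeps its smallest (strictest) target.
tail_target = {}
for path in paths:
    path_delay = sum(avg_delay_ms[node] for node in path)
    for node in path:
        candidate = avg_delay_ms[node] / path_delay * total_tail_target_ms
        tail_target[node] = min(tail_target.get(node, float("inf")), candidate)
# e.g. tail_target["rank"] == 10 / 20 * 30 = 15.0 ms
```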
Step S104: adjusting the resource allocation of the micro-services and the offline services according to the real-time tail delay and the tail delay target.
Specifically, corresponding adjustment measures are taken according to whether the real-time tail delay of a micro-service meets its tail delay target, such as increasing the resource allocation of the micro-service, increasing the resource allocation of the offline services, deploying a new offline service, or suspending the execution of offline services.
Further, step S104 specifically includes:
1. If the real-time tail delay of the target micro-service is larger than the tail delay target corresponding to the target micro-service, the resources of the deployed offline services are reduced by half.
2. If the real-time tail delay of the target micro-service is not greater than the tail delay target corresponding to the target micro-service, deploying a new offline service or increasing the resources of the deployed offline service.
Further, to facilitate the allocation of offline service resources, this embodiment divides the offline service resources into a plurality of gears, for example 10: a CPU utilization cap of 20% is the lowest gear (gear 1), each subsequent gear adds 20% on top of the previous one, and gear 10 is the highest, with a CPU utilization cap of 200%.
When the resources of the deployed offline services need to be reduced by half, the adjustment strategy is as follows: change the resource allocation of every deployed offline service to half of its original value; a service in an odd gear first drops one gear and is then halved. For example, an offline service in gear 6 is changed to gear 3; an offline service in gear 3 first drops to gear 2 and then becomes half of that, i.e., gear 1. Offline services already in the lowest gear (gear 1) are suspended. Meanwhile, the resource limits of the micro-services are temporarily lifted to ensure they have sufficient resources to run.
When a new offline service needs to be deployed or the resources of the deployed offline services need to be increased, the adjustment strategy is as follows: first, randomly select one of the deployed offline services and raise its resource allocation by one gear. If all deployed offline services are already in the highest gear and their allocation cannot be raised further, a new offline service can be selected for deployment, with its resource allocation set to the lowest gear.
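A sketch of this gear mechanism, assuming gears 1 to 10 map to CPU caps of 20% to 200% as described; OfflineJob and spawn_job are illustrative stand-ins for real job handles.

```python
import random
from dataclasses import dataclass

@dataclass
class OfflineJob:
    name: str
    gear: int  # 1..10; CPU utilization cap = gear * 20%

def halve_offline_resources(jobs):
    """Applied when some micro-service misses its tail delay target."""
    survivors = []
    for job in jobs:
        if job.gear == 1:
            continue                          # lowest gear: suspend the job
        gear = job.gear - 1 if job.gear % 2 else job.gear
        job.gear = gear // 2                  # odd gears drop one gear, then halve
        survivors.append(job)
    return survivors

def grow_offline_resources(jobs, spawn_job):
    """Applied when all tail delay targets are met; spawn_job() would deploy
    a new offline job at the lowest gear (hypothetical helper)."""
    candidates = [job for job in jobs if job.gear < 10]
    if candidates:
        random.choice(candidates).gear += 1   # one random job, one gear up
    else:
        jobs.append(spawn_job())              # everything at top gear already

jobs = [OfflineJob("spark-etl", 6), OfflineJob("training", 3), OfflineJob("backup", 1)]
jobs = halve_offline_resources(jobs)          # gears become 3 and 1; "backup" suspended
```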
With the resource management method of this embodiment, after offline services are deployed, when an online service needs to be deployed, the resources of the offline services are first reduced and the online service is then deployed; the end-to-end response latency of the online service is guaranteed by predicting and pre-allocating the resources of each micro-service for the next preset time period; meanwhile, the tail delay of each micro-service is monitored, and the resources occupied by each micro-service and by the offline services are dynamically adjusted in combination with the preset tail delay target of each micro-service, so that resource utilization is improved while the end-to-end response latency of the online service is guaranteed, ensuring efficient operation of co-located online and offline services.
Fig. 2 is a schematic functional block diagram of a resource management device according to an embodiment of the present invention. As shown in fig. 2, the resource management device 20 includes an online service deployment module 21, a pre-allocation module 22, an acquisition module 23, and an adjustment module 24.
The online service deployment module 21 is configured to, when the host has deployed offline services, reduce the resource allocation of the offline services and provision surplus resources for each micro-service of the online service in the current time period if an instruction to deploy the online service is received;
a pre-allocation module 22, configured to predict a predicted request frequency of each micro-service in a next preset time period based on a pre-trained prediction model, and pre-allocate resources of each micro-service in the next preset time period according to the predicted request frequency;
the acquisition module 23 is configured to acquire real-time tail delay of each micro service at intervals of a preset period, and acquire a tail delay target generated by each micro service in an offline analysis stage;
an adjustment module 24 is configured to adjust resource allocation of the micro services and the offline services according to the real-time tail delay and the tail delay target.
Optionally, the pre-allocation module 22 performs the operation of predicting the request frequency of each micro-service in the next preset time period based on a pre-trained prediction model and pre-allocating resources for each micro-service in the next preset time period according to the predicted request frequency, specifically including: after the online service is deployed, obtaining a pre-constructed tail delay model taking the request frequency as an independent variable and a corresponding resource utilization model for each micro-service, wherein the tail delay model is constructed from the tail delays of each micro-service under a plurality of preset request frequencies, and the resource utilization model is constructed from the resource utilization of each micro-service under the same preset request frequencies; acquiring the current request frequency of each micro-service in the current time period; inputting the current time period and the current request frequency of each micro-service into a pre-trained time series prediction model, and predicting the request frequency of each micro-service in the next preset time period; and pre-allocating resources of the next preset time period for each micro-service according to the current request frequency, the predicted request frequency, the tail delay model of each micro-service and the resource utilization model.
Optionally, the pre-allocation module 22 performs the operation of pre-allocating resources of the next preset time period for each micro-service according to the current request frequency, the predicted request frequency, and the tail delay model and resource utilization model of each micro-service, specifically including: calculating a first request frequency reference value according to a first preset rule using the current request frequency, the number of copies of the micro-service obtained in advance and a first preset error coefficient; comparing the predicted request frequency with the first request frequency reference value; if the predicted request frequency is greater than or equal to the first request frequency reference value, performing capacity expansion processing on the resources occupied by the micro-service; and if the predicted request frequency is smaller than the first request frequency reference value, performing capacity reduction processing on the resources occupied by the micro-service.
Optionally, the pre-allocation module 22 performs the capacity expansion processing on the resources occupied by the micro-service, including: confirming the minimum request frequency corresponding to the tail delay target according to the tail delay target and the tail delay model; calculating a second request frequency reference value according to a second preset rule from the minimum request frequency, the number of copies of the micro-service and the first preset error coefficient; comparing the predicted request frequency with the second request frequency reference value; if the predicted request frequency is smaller than the second request frequency reference value, vertically expanding the resources of the micro-service, and calculating the resource utilization after vertical expansion according to a third preset rule using the predicted request frequency, the number of copies, the resource utilization model and the second preset error coefficient; if the predicted request frequency is greater than or equal to the second request frequency reference value, horizontally expanding the resources of the micro-service and then adjusting them vertically, calculating the number of copies after horizontal expansion according to a fourth preset rule using the predicted request frequency, the tail delay target and the first preset error coefficient, and calculating the maximum resource utilization of each vertically adjusted copy according to the third preset rule using the predicted request frequency, the number of copies, the resource utilization model and the second preset error coefficient.
Optionally, the pre-allocation module 22 performs the capacity reduction processing on the resources occupied by the micro-service, including: calculating a copy-number reference value according to the fourth preset rule using the predicted request frequency, the tail delay target and the first preset error coefficient; comparing the copy-number reference value with the number of copies; if the copy-number reference value is equal to the number of copies, vertically shrinking the resources of the micro-service, and calculating the resource utilization after vertical shrinkage according to the third preset rule using the predicted request frequency, the number of copies, the resource utilization model and the second preset error coefficient; if the copy-number reference value is smaller than the number of copies, horizontally shrinking the resources of the micro-service and then adjusting them vertically, wherein the number of copies after horizontal shrinkage equals the copy-number reference value, and the maximum resource utilization of each vertically adjusted copy is calculated according to the third preset rule using the predicted request frequency, the number of copies, the resource utilization model and the second preset error coefficient.
Optionally, the acquisition module 23 is further configured to generate the tail delay target of each micro-service in the offline analysis stage, specifically including: in the offline analysis stage, acquiring the response latency and the call dependency graph of each micro-service, wherein the call dependency graph is generated from the call links of the micro-services and includes a plurality of nodes, each node corresponding to one micro-service; confirming the average response latency of each node in the call dependency graph according to the response latency of each micro-service; starting from each node with an in-degree of 0, traversing all paths in the graph and accumulating the average response latencies of the nodes each path passes through to obtain the average response latency of each path; dividing the average response latency of each node by the average response latency of the path it lies on to obtain the delay proportion of the node; and multiplying the delay proportion by a pre-specified tail delay to obtain the tail delay target of each node, selecting the smallest tail delay target as the final tail delay target when a node has multiple tail delay targets.
Optionally, the adjustment module 24 performs an operation of adjusting resource allocation of the micro service and the offline service according to the real-time tail delay and the tail delay target, including in particular: if the real-time tail delay of the target micro-service is larger than the tail delay target corresponding to the target micro-service, reducing the resources of the deployed offline service by half; if the real-time tail delay of the target micro-service is not greater than the tail delay target corresponding to the target micro-service, deploying a new offline service or increasing the resources of the deployed offline service.
For other details of the implementation of the foregoing embodiments of the resource management device by each module, reference may be made to the description of the resource management method in the foregoing embodiments, which is not repeated herein.
It should be noted that, in this specification, the embodiments are described in a progressive manner, each embodiment focusing on its differences from the others; for identical and similar parts, the embodiments may be referred to one another. The device embodiments are described relatively simply since they are substantially similar to the method embodiments; for relevant points, refer to the description of the method embodiments.
Referring to fig. 3, fig. 3 is a schematic structural diagram of a computer device according to an embodiment of the invention. As shown in fig. 3, the computer device 30 includes a processor 31 and a memory 32 coupled to the processor 31, where the memory 32 stores program instructions that, when executed by the processor 31, cause the processor 31 to perform the steps of the resource management method according to any of the embodiments described above.
The processor 31 may also be referred to as a CPU (Central Processing Unit). The processor 31 may be an integrated circuit chip with signal processing capabilities. The processor 31 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
Referring to fig. 4, fig. 4 is a schematic structural diagram of a storage medium according to an embodiment of the present invention. The storage medium of the embodiment of the present invention stores program instructions 41 capable of implementing the above resource management method. The program instructions 41 may be stored in the storage medium in the form of a software product and include several instructions to cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk, or a computer device such as a computer, a server, a mobile phone or a tablet.
In the several embodiments provided in this application, it should be understood that the disclosed computer apparatus, device, and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of elements is merely a logical functional division, and there may be additional divisions of actual implementation, e.g., multiple elements or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
In addition, each functional unit in the embodiments of the present invention may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated units may be implemented in hardware or as software functional units. The foregoing description covers only the embodiments of the present application and does not thereby limit its patent scope; all equivalent structures or equivalent processes made using the contents of the specification and the accompanying drawings of the present application, whether applied directly or indirectly in other related technical fields, are likewise included in the patent protection scope of the present application.

Claims (10)

1. A method of resource management, comprising:
when the host has deployed offline services, if an instruction to deploy an online service is received, reducing the resource allocation of the offline services, and provisioning surplus resources for each micro-service of the online service in the current time period;
predicting the predicted request frequency of each micro-service in a next preset time period based on a pre-trained prediction model, and pre-distributing resources of each micro-service in the next preset time period according to the predicted request frequency;
acquiring real-time tail delay of each micro-service at intervals of a preset period, and acquiring tail delay targets generated by each micro-service in an offline analysis stage;
and adjusting the resource allocation of the micro service and the offline service according to the real-time tail delay and the tail delay target.
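Taken together, the four steps of claim 1 form a periodic control loop. The following sketch (Python) is illustrative only: every name in it is hypothetical, and the helper bodies are toy stand-ins (random probes, fixed scaling factors) for the predictor, monitor, and preset rules elaborated in claims 2-7.

    import random
    from dataclasses import dataclass

    @dataclass
    class Service:
        name: str
        cpu_cores: float  # current CPU allocation

    # Toy stand-ins for the trained predictor and the live tail-delay monitor.
    def predict_request_frequency(svc: Service) -> float:
        return random.uniform(50.0, 200.0)  # requests/s expected next window

    def measure_tail_latency(svc: Service) -> float:
        return random.uniform(10.0, 120.0)  # observed P99 latency in ms

    def preallocate(svc: Service, predicted_qps: float) -> None:
        svc.cpu_cores = max(0.5, predicted_qps / 100.0)  # toy sizing rule

    def control_loop(online: list[Service], offline: list[Service],
                     targets: dict[str, float], iterations: int = 3) -> None:
        # Step 1: squeeze offline jobs and over-provision the online
        # micro-services for the current time period.
        for job in offline:
            job.cpu_cores *= 0.5
        for svc in online:
            svc.cpu_cores *= 2.0
        for _ in range(iterations):  # one pass per preset period
            # Step 2: predict next window's request frequency, pre-allocate.
            for svc in online:
                preallocate(svc, predict_request_frequency(svc))
            # Steps 3-4: compare real-time tail delay with the target from
            # offline analysis and rebalance against offline jobs (claim 7).
            for svc in online:
                if measure_tail_latency(svc) > targets[svc.name]:
                    for job in offline:
                        job.cpu_cores *= 0.5

    online = [Service("frontend", 2.0), Service("cart", 1.0)]
    offline = [Service("batch-etl", 4.0)]
    control_loop(online, offline, targets={"frontend": 50.0, "cart": 80.0})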
2. The resource management method according to claim 1, wherein predicting a predicted request frequency of each micro service for a next preset time period based on a pre-trained prediction model, and pre-allocating resources of each micro service for the next preset time period according to the predicted request frequency, comprises:
after the online service is deployed, a pre-constructed tail delay model taking the request frequency as an independent variable and a resource utilization rate model corresponding to each micro service are obtained, wherein the tail delay model is constructed according to tail delays of each micro service under a plurality of preset request frequencies, and the resource utilization rate model is constructed according to resource utilization rates of each micro service under a plurality of preset request frequencies;
acquiring the current request frequency of each micro service in the current time period;
inputting the current time period and the current request frequency of each micro service into a pre-trained time sequence prediction model, and predicting to obtain the predicted request frequency of the next preset time period of each micro service;
and pre-distributing resources of a next preset time period for each micro-service according to the current request frequency, the predicted request frequency, the tail delay model of each micro-service and the resource utilization rate model.
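A minimal sketch of claim 2's moving parts, under stated assumptions: the offline-profiled tail-delay and resource-utilization models are represented here as piecewise-linear interpolation over measurements taken at preset frequencies, and a deliberately simple trend predictor stands in for the pre-trained time-series model, whose architecture the claim does not fix.

    from bisect import bisect_left

    class FrequencyModel:
        """Piecewise-linear interpolation over offline measurements at
        several preset request frequencies; used for both the tail-delay
        model and the resource-utilization model of claim 2."""
        def __init__(self, freqs: list[float], values: list[float]):
            self.freqs, self.values = list(freqs), list(values)

        def __call__(self, f: float) -> float:
            i = bisect_left(self.freqs, f)
            if i == 0:
                return self.values[0]
            if i == len(self.freqs):
                return self.values[-1]
            f0, f1 = self.freqs[i - 1], self.freqs[i]
            v0, v1 = self.values[i - 1], self.values[i]
            return v0 + (v1 - v0) * (f - f0) / (f1 - f0)

    def predict_next_frequency(history: list[float]) -> float:
        """Stand-in for the pre-trained time-series predictor: recent
        average plus last-step trend. Any real model (ARIMA, LSTM, ...)
        would slot in here."""
        if len(history) < 2:
            return history[-1]
        window = history[-3:]
        trend = history[-1] - history[-2]
        return max(0.0, sum(window) / len(window) + trend)

    # Illustrative offline profile of one micro-service.
    tail_delay_model = FrequencyModel([50, 100, 200], [12.0, 20.0, 55.0])   # P99 ms
    utilization_model = FrequencyModel([50, 100, 200], [0.20, 0.45, 0.90])  # CPU cores

    predicted = predict_next_frequency([80.0, 95.0, 110.0])
    print(predicted, tail_delay_model(predicted), utilization_model(predicted))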
3. The resource management method according to claim 2, wherein the pre-allocating resources for each micro-service for a next preset period of time according to the current request frequency, the predicted request frequency, and the tail delay model and the resource utilization rate model of each micro-service, comprises:
calculating a first request frequency reference value according to a first preset rule by using the current request frequency, the number of copies of the micro-service obtained in advance and a first preset error coefficient;
comparing the predicted request frequency with the first request frequency reference value;
if the predicted request frequency is greater than or equal to the first request frequency reference value, performing capacity expansion processing on resources occupied by the micro service;
and if the predicted request frequency is smaller than the first request frequency reference value, performing capacity reduction processing on the resources occupied by the micro service.
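Claim 3 names a "first preset rule" without disclosing it, so the concrete reference-value formula below is purely an assumption for illustration; only the comparison structure comes from the claim.

    def first_reference(current_qps: float, copies: int, eps: float = 0.1) -> float:
        # Assumed form of the undisclosed 'first preset rule': spread the
        # current load over the copies, pad each share by the error
        # coefficient eps, and total it back up.
        per_copy = current_qps / copies
        return per_copy * copies * (1.0 + eps)

    def plan_capacity(predicted_qps: float, current_qps: float, copies: int) -> str:
        # The comparison step of claim 3: choose between the expansion
        # path (claim 4) and the reduction path (claim 5).
        if predicted_qps >= first_reference(current_qps, copies):
            return "expand"
        return "shrink"

    print(plan_capacity(predicted_qps=140.0, current_qps=110.0, copies=2))  # expand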
4. The method for resource management according to claim 3, wherein said performing a capacity expansion process on the resources occupied by the micro service includes:
confirming the minimum request frequency corresponding to the tail delay target according to the tail delay target and the tail delay model;
calculating a second request frequency reference value according to a second preset rule according to the minimum request frequency, the number of copies of the micro-service and the first preset error coefficient;
comparing the predicted request frequency with the second request frequency reference value;
if the predicted request frequency is smaller than the second request frequency reference value, longitudinally expanding the resources of the micro service, wherein the resource utilization rate after the longitudinal expansion is calculated according to a third preset rule by using the predicted request frequency, the copy number, the resource utilization rate model and a second preset error coefficient;
and if the predicted request frequency is greater than or equal to the second request frequency reference value, transversely expanding the resources of the micro service, and then longitudinally adjusting the resources of the micro service, wherein the number of copies of the transverse expansion is calculated according to a fourth preset rule by using the predicted request frequency, the tail delay target and the first preset error coefficient, and the maximum resource utilization rate of each copy after the longitudinal adjustment is calculated according to a third preset rule by using the predicted request frequency, the number of copies, the resource utilization rate model and the second preset error coefficient.
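A sketch of the expansion branch. The claim's "second", "third", and "fourth preset rules" are not disclosed, so the formulas marked as assumed below are illustrative; the decision structure (invert the tail-delay model, compare against a reference value, then scale vertically, or horizontally followed by a vertical adjustment) follows the claim.

    import math
    from dataclasses import dataclass
    from typing import Callable

    @dataclass
    class Microservice:
        copies: int
        cpu_limit: float                             # per-copy CPU cap (cores)
        tail_delay_model: Callable[[float], float]   # per-copy qps -> P99 ms
        utilization_model: Callable[[float], float]  # per-copy qps -> CPU cores

    def freq_cap(ms: Microservice, target_ms: float,
                 lo: float = 0.0, hi: float = 1000.0) -> float:
        # Invert the monotone tail-delay model by bisection: the largest
        # per-copy request frequency whose modelled tail delay still meets
        # the tail delay target.
        for _ in range(50):
            mid = (lo + hi) / 2.0
            if ms.tail_delay_model(mid) <= target_ms:
                lo = mid
            else:
                hi = mid
        return lo

    def expand(ms: Microservice, predicted_qps: float, target_ms: float,
               eps1: float = 0.1, eps2: float = 0.05) -> None:
        cap = freq_cap(ms, target_ms)
        second_ref = cap * ms.copies * (1.0 - eps1)  # assumed 'second preset rule'
        if predicted_qps < second_ref:
            # Longitudinal (vertical) expansion only: resize each copy to
            # the modelled utilization at its share of the load, padded by
            # eps2 (assumed 'third preset rule').
            ms.cpu_limit = ms.utilization_model(predicted_qps / ms.copies) * (1.0 + eps2)
        else:
            # Transverse (horizontal) expansion first (assumed 'fourth
            # preset rule'), then the same longitudinal adjustment per copy.
            ms.copies = math.ceil(predicted_qps * (1.0 + eps1) / cap)
            ms.cpu_limit = ms.utilization_model(predicted_qps / ms.copies) * (1.0 + eps2)

    ms = Microservice(copies=2, cpu_limit=1.0,
                      tail_delay_model=lambda f: 10.0 + 0.2 * f,
                      utilization_model=lambda f: 0.01 * f)
    expand(ms, predicted_qps=400.0, target_ms=40.0)
    print(ms.copies, round(ms.cpu_limit, 2))  # 3 copies, ~1.4 cores each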
5. The method for resource management according to claim 4, wherein said performing a capacity reduction process on the resources occupied by the micro service comprises:
calculating to obtain a copy number reference value according to a fourth preset rule by using the predicted request frequency, the tail delay target and the first preset error coefficient;
comparing the copy number reference value with the copy number;
if the reference value of the number of the copies is equal to the number of the copies, carrying out longitudinal capacity reduction on the resource of the micro service, and calculating the resource utilization rate after the longitudinal capacity reduction according to a third preset rule by utilizing the predicted request frequency, the number of the copies, the resource utilization rate model and a second preset error coefficient;
and if the copy number reference value is smaller than the copy number, transversely shrinking the resource of the micro service, then longitudinally adjusting the resource, wherein the copy number after transversely shrinking is the copy number reference value, and the maximum resource utilization rate of each copy after longitudinally adjusting is calculated by using the prediction request frequency, the copy number, the resource utilization rate model and a second preset error coefficient according to a third preset rule.
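The reduction branch mirrors the expansion sketch above, reusing its Microservice type, freq_cap helper, and the same assumed preset rules.

    def shrink(ms: Microservice, predicted_qps: float, target_ms: float,
               eps1: float = 0.1, eps2: float = 0.05) -> None:
        cap = freq_cap(ms, target_ms)
        # Copy-number reference value (assumed 'fourth preset rule').
        ref_copies = max(1, math.ceil(predicted_qps * (1.0 + eps1) / cap))
        if ref_copies < ms.copies:
            ms.copies = ref_copies  # transverse (horizontal) shrink first
        # Longitudinal resize applies in both branches of the claim
        # (assumed 'third preset rule').
        ms.cpu_limit = ms.utilization_model(predicted_qps / ms.copies) * (1.0 + eps2)

    ms = Microservice(copies=3, cpu_limit=1.4,
                      tail_delay_model=lambda f: 10.0 + 0.2 * f,
                      utilization_model=lambda f: 0.01 * f)
    shrink(ms, predicted_qps=120.0, target_ms=40.0)
    print(ms.copies, round(ms.cpu_limit, 2))  # 1 copy, ~1.26 cores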
6. The resource management method according to claim 1, wherein generating the tail delay target for each micro-service during the offline analysis phase comprises:
in an offline analysis stage, acquiring response time delay and a call dependency graph of each micro-service, wherein the call dependency graph is generated according to a call link of each micro-service, and comprises a plurality of nodes, and each node corresponds to one micro-service;
confirming the average response time delay of each node in the call dependency graph according to the response time delay of each micro service;
starting from each node with an in-degree of 0, traversing all paths in the graph, and accumulating the average response time delays of the nodes through which each path passes to obtain the average response time delay of each path;
dividing the average response time delay of each node by the average response time delay of the path where the node is located to obtain the delay proportion of the node;
and multiplying the delay proportion by a pre-assigned tail delay to obtain a tail delay target for each node, and, when a node has a plurality of tail delay targets, selecting the smallest as its final tail delay target.
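Claim 6 specifies this procedure completely enough to code directly: average delays per node, enumeration of all root-to-leaf paths, per-path delay proportions, and the minimum across paths. A self-contained sketch follows; the graph representation and numbers are illustrative.

    from collections import defaultdict

    def tail_delay_targets(edges: dict[str, list[str]],
                           avg_delay: dict[str, float],
                           slo_ms: float) -> dict[str, float]:
        # In-degrees identify the entry nodes of the call dependency graph.
        indeg = defaultdict(int)
        for callees in edges.values():
            for v in callees:
                indeg[v] += 1

        # Enumerate every path from an in-degree-0 node to a leaf.
        paths: list[list[str]] = []
        def dfs(node: str, path: list[str]) -> None:
            path = path + [node]
            children = edges.get(node, [])
            if not children:
                paths.append(path)
            for child in children:
                dfs(child, path)
        for root in (n for n in avg_delay if indeg[n] == 0):
            dfs(root, [])

        # Target = the node's delay proportion on the path times the
        # pre-assigned tail delay; a node on several paths keeps its
        # smallest (strictest) target.
        targets: dict[str, float] = {}
        for path in paths:
            path_delay = sum(avg_delay[n] for n in path)
            for n in path:
                candidate = slo_ms * avg_delay[n] / path_delay
                targets[n] = min(targets.get(n, float("inf")), candidate)
        return targets

    # Example: two call paths A->B->D and A->C->D, end-to-end SLO of 100 ms.
    edges = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
    avg = {"A": 5.0, "B": 20.0, "C": 10.0, "D": 15.0}
    print(tail_delay_targets(edges, avg, slo_ms=100.0))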
7. The method of claim 1, wherein said adjusting the resource allocation of said micro services and said offline services according to said real-time tail delay and said tail delay target comprises:
if the real-time tail delay of the target micro-service is larger than the tail delay target corresponding to the target micro-service, reducing the resources of the deployed offline service by half;
if the real-time tail delay of the target micro-service is not greater than the tail delay target corresponding to the target micro-service, deploying a new offline service or increasing the resources of the deployed offline service.
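A sketch of claim 7's rebalancing step. The halving factor comes from the claim; checking every online micro-service in one pass, the choice between deploying pending offline work and growing already-deployed jobs, and the 1.25 growth step are illustrative assumptions.

    import random
    from dataclasses import dataclass

    @dataclass
    class Workload:
        name: str
        cpu_cores: float

    def measure_tail_latency(svc: Workload) -> float:
        return random.uniform(10.0, 120.0)  # stand-in for a live P99 probe (ms)

    def adjust(online: list[Workload], offline: list[Workload],
               targets: dict[str, float], pending: list[Workload]) -> None:
        violated = any(measure_tail_latency(s) > targets[s.name] for s in online)
        if violated:
            for job in offline:
                job.cpu_cores *= 0.5               # halve offline resources
        elif pending:
            offline.append(pending.pop(0))         # deploy a new offline service
        else:
            for job in offline:
                job.cpu_cores *= 1.25              # grow deployed offline services

    online = [Workload("frontend", 2.0)]
    offline = [Workload("etl", 4.0)]
    adjust(online, offline, {"frontend": 60.0}, pending=[Workload("report", 1.0)])
    print([(j.name, j.cpu_cores) for j in offline])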
8. A resource management device, comprising:
the online service deployment module is used for reducing the resource allocation of the offline service and configuring excessive resources for each micro service of the online service in the current time period if an instruction for deploying the online service is received when the host computer deploys the offline service;
the pre-allocation module is used for predicting the predicted request frequency of each micro-service in the next preset time period based on a pre-trained prediction model, and pre-allocating the resources of each micro-service in the next preset time period according to the predicted request frequency;
the acquisition module is used for acquiring the real-time tail delay of each micro service at intervals of a preset period and acquiring a tail delay target generated by each micro service in an offline analysis stage;
and the adjustment module is used for adjusting the resource allocation of the micro service and the offline service according to the real-time tail delay and the tail delay target.
9. A computer device comprising a processor and a memory coupled to the processor, the memory storing program instructions that, when executed by the processor, cause the processor to perform the steps of the resource management method of any of claims 1-7.
10. A storage medium storing program instructions enabling the implementation of the resource management method according to any one of claims 1-7.
CN202311798418.7A 2023-12-25 2023-12-25 Resource management method, device, equipment and storage medium Pending CN117785457A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311798418.7A CN117785457A (en) 2023-12-25 2023-12-25 Resource management method, device, equipment and storage medium


Publications (1)

Publication Number Publication Date
CN117785457A (en)

Family

ID=90397572

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311798418.7A Pending CN117785457A (en) 2023-12-25 2023-12-25 Resource management method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117785457A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination