CN114020326A - Micro-service response time prediction method and system based on graph neural network - Google Patents

Micro-service response time prediction method and system based on graph neural network

Info

Publication number
CN114020326A
CN114020326A (application CN202111297775.6A)
Authority
CN
China
Prior art keywords: service, micro, response time, neural network, software system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111297775.6A
Other languages
Chinese (zh)
Inventor
陈昕
郑伟平
闫雪梅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lijian Defense Technology Xinjiang Co ltd
Original Assignee
Lijian Defense Technology Xinjiang Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lijian Defense Technology Xinjiang Co ltd filed Critical Lijian Defense Technology Xinjiang Co ltd
Priority to CN202111297775.6A priority Critical patent/CN114020326A/en
Publication of CN114020326A publication Critical patent/CN114020326A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/22: Microcontrol or microprogram arrangements
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00: Error detection; Error correction; Monitoring
    • G06F 11/30: Monitoring
    • G06F 11/3003: Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F 11/302: Monitoring arrangements specially adapted to the computing system or computing system component being monitored, where the computing system component is a software system
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00: Error detection; Error correction; Monitoring
    • G06F 11/30: Monitoring
    • G06F 11/34: Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F 11/3409: Recording or statistical evaluation of computer activity, for performance assessment
    • G06F 11/3433: Recording or statistical evaluation of computer activity, for performance assessment, for load management
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods

Abstract

The embodiment of the invention provides a micro-service response time prediction method and system based on a graph neural network, in which a response time prediction model is established. To collect training data for the network model, the whole software system must be deployed, tested, and its behavior recorded. The system is tested under different external load conditions and different resource deployment states, and the service request count, service response time, resource configuration information, and the like are recorded per deployed micro-service. The collected data serve as the training data set; the service request count, container template information, deployed-container count information, and the call-relation graph among micro-services serve as input; the response time of each micro-service over a future period S serves as output. Model training finally yields a response time prediction model of the micro-services. This model can support resource scheduling and service resource expansion of the micro-service system, thereby optimizing the overall performance of the software system.

Description

Micro-service response time prediction method and system based on graph neural network
Technical Field
The embodiment of the invention relates to the technical field of software application development, in particular to a micro-service response time prediction method and system based on a graph neural network.
Background
Currently, in the field of software application development, micro-services have become a popular and widely used technology. Micro-service technology splits software functionality into multiple autonomous service entities, enabling flexible deployment and scaling to meet increasingly complex application requirements. Combined with cloud platforms and lightweight container technology, and supported by runtime application management, performance monitoring and resource-level scaling of specific micro-services can be realized, enabling fine-grained management of the overall performance of a software system.
On the other hand, because micro-services are deployed independently, the system has complex service call chains, and its performance differs greatly under different external loads and resource configuration states, which poses a challenge for overall performance optimization. For example, when a micro-service is detected to be a performance bottleneck and service resources are expanded for it, another micro-service on the call chain may become a new performance bottleneck, so the overall performance of the system still fails to improve. Therefore, in the complex distributed computing environments where micro-services are deployed, a performance modeling approach aimed at whole-system optimization (particularly for a software system composed of multiple micro-services) is needed to provide model support for resource scheduling and resource expansion of the micro-service system.
Disclosure of Invention
The embodiment of the invention provides a micro-service response time prediction method and system based on a graph neural network.
In a first aspect, an embodiment of the present invention provides a micro-service response time prediction method based on a graph neural network, including:
the method comprises the steps of running a software system to be modeled, and recording running data of each micro-service in the software system to be modeled, wherein the running data comprises service request quantity, container template information, deployment container quantity information, service request response time and a calling relation graph among the micro-services;
and inputting the operation data into a response time prediction model trained in advance to predict the micro-service response time in a set period in the future.
Before prediction, training the graph neural network, specifically comprising:
configuring at least one container template for each micro service of a software system to be modeled, and configuring operation resources and an operation environment for the container template based on the operation environment requirement of the micro service;
generating a preset external access load, loading the preset external access load to a software system to be modeled, testing the preset external access load within a set time period, and recording the running data of each microservice in the software system to be modeled so as to generate a training data set; the operation data also comprises micro service id, recording time, service response time, container template resource configuration information and container quantity information; the container template resource configuration information comprises a CPU, a memory and a network bandwidth;
constructing a graph neural network, regarding each micro service as a node in the graph neural network, and setting a state value h for each micro service ii
Obtaining state value after R iteration rounds based on iteration method
Figure BDA0003336993680000021
Defining a feedforward fully-connected neural network, and iterating the state values of the micro-service i after R rounds
Figure BDA0003336993680000022
Substituting into the fully-connected neural network to obtain an output value Oi(ii) a And determining a loss function, and carrying out graph neural network training based on the loss function and the training data set to obtain a response time prediction model of the micro-service.
Preferably, the operating resources comprise computing power $CPU_i$, memory capacity $MEM_i$, and network bandwidth $BAND_i$.
Preferably, after generating the training data set, the method further includes:
checking whether the response time of each micro service violates a service level agreement, wherein the service level agreement is a preset response time threshold;
if the response time of any micro service does not violate the service level agreement, linearly increasing the preset external access load amount based on a preset proportion;
if the micro-services violating the service level agreement are detected, expanding the service resources of 1 container template for each micro-service violating the service level agreement;
and if the loaded external access load quantity reaches T times of the highest access quantity of the software system to be modeled, terminating the test.
And in the operation process, continuously recording the operation data of each micro-service in the software system to be modeled so as to generate a training data set.
Preferably, the initial state of micro-service $i$ is $h_i^{(0)}$, and the initialization of $h_i^{(0)}$ comprises:
converting the service request count, the container template's CPU measure, memory measure, and network bandwidth measure, and the deployed-container count into binary bit strings, concatenating the bit strings in series, padding zeros at the tail, and assigning the resulting $d$-dimensional vector to $h_i^{(0)}$.
Preferably, obtaining the state value $h_i^{(R)}$ after $R$ iteration rounds based on an iterative method specifically comprises:
on the call-relation graph among micro-services, sampling nodes from the neighbors of node $i$ with probability $p$ to form a neighbor node set $N(i)$, and if $N(i)$ is empty, re-sampling until $N(i)$ contains at least 1 node;
taking the weighted average of the states of the nodes in the neighbor set to obtain $h_{N(i)}^{(j)}$:

$$h_{N(i)}^{(j)} = \sum_{k \in N(i)} \alpha_k^{(j)} \, h_k^{(j-1)}$$

where $j = 1, 2, \ldots, R$ denotes the $j$-th iteration and the $\alpha_k^{(j)}$ are trainable weight coefficients;
computing the value of $h_i^{(j)}$:

$$h_i^{(j)} = W^{(j)} \left[ h_i^{(j-1)} \,\|\, h_{N(i)}^{(j)} \right] + b^{(j)}$$

where $[\cdot \,\|\, \cdot]$ denotes the concatenation of two state vectors, $W^{(j)}$ is a trainable weight coefficient matrix, and $b^{(j)}$ is a bias term vector;
normalizing $h_i^{(j)}$:

$$h_i^{(j)} \leftarrow \frac{h_i^{(j)}}{\left\| h_i^{(j)} \right\|_2}$$
preferably, the loss function is:
LOSS=∑i∈X||Oi-Yi||+Lreg
in the above formula, X is a node set of the whole network, i.e., a set of all microservices in the software system; y isiResponding to the time tag value for the micro service i; l isregIs a regular term of L2.
In a second aspect, an embodiment of the present invention provides a micro service response time prediction system based on a graph neural network, including:
the acquisition module is used for operating the software system to be modeled and recording the operating data of each micro-service in the software system to be modeled, wherein the operating data comprises service request quantity, container template information, deployment container quantity information, service request response time and a calling relation graph among the micro-services;
and the prediction module is used for inputting the operation data into a response time prediction model trained in advance so as to predict the micro-service response time in a set period in the future.
In a third aspect, an embodiment of the present invention provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the computer program to implement the steps of the method for predicting response time of a micro-service based on a graph neural network according to the embodiment of the first aspect of the present invention.
In a fourth aspect, an embodiment of the present invention provides a non-transitory computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the graph neural network-based micro-service response time prediction method according to an embodiment of the first aspect of the present invention.
The micro-service response time prediction method and system based on a graph neural network provided by the embodiments of the invention establish a response time prediction model. To collect training data for the network model, the whole software system is deployed, tested, and recorded: the system is tested under different external load conditions and different resource deployment states, and the service request count, service response time, resource configuration information (including measures of CPU, memory, network bandwidth, and the like), etc. are recorded per deployed micro-service. The collected data serve as the training data set. The service request counts, container template information, and deployed-container count information of all micro-services in the software system, together with the call-relation graph among the micro-services, are taken as input; the response time of each micro-service over a future period S is taken as output. Model training finally yields a response time prediction model of the micro-services, which provides support for resource scheduling and service resource expansion of the micro-service system, thereby optimizing the overall performance of the software system.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a block diagram of a method for predicting response time of microservice based on graph neural networks according to an embodiment of the present invention;
fig. 2 is a schematic physical structure diagram according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the embodiments of the present application, the term "and/or" merely describes an association relationship between objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone.
The terms "first" and "second" in the embodiments of the present application are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, the terms "comprise" and "have", as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a system, product or apparatus that comprises a list of elements or components is not limited to only those elements or components but may alternatively include other elements or components not expressly listed or inherent to such product or apparatus. In the description of the present application, "plurality" means at least two, e.g., two, three, etc., unless explicitly specifically limited otherwise.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments. The following description and description will proceed with reference being made to various embodiments.
Fig. 1 provides a method for predicting microservice response time based on a graph neural network according to an embodiment of the present invention, including:
the method comprises the steps of running a software system to be modeled, and recording running data of each micro service in the software system to be modeled, wherein the running data comprises service request quantity, container template information, deployment container quantity information, service request response time and a calling relation graph among the micro services;
and inputting the operation data into a response time prediction model trained in advance to predict the micro-service response time in a set period in the future.
Specifically, before response time prediction can be performed, training data must be collected and the graph neural network constructed and trained; to collect data for training the network model, the whole software system needs to be deployed, tested, and recorded. The system is tested under different external load conditions and different resource deployment states, and the service request count, service response time, resource configuration information (including measures of CPU, memory, network bandwidth, and the like), etc. are recorded per deployed micro-service. The collected data serve as the training data set, and a prediction model of micro-service response time is established through training.
At least one container template is configured for each micro-service of the software system to be modeled, and operating resources and an operating environment are configured for the container template based on the micro-service's runtime requirements. First, the deployment code and data of the software system and the related operating configuration description are obtained. For the M micro-service modules in the system, M container templates are prepared: the resource configuration of container template $i$ is determined according to the actual situation of the micro-service, including computing power $CPU_i$, memory capacity $MEM_i$, and network bandwidth $BAND_i$. Besides operating resources, the runtime environment the micro-service depends on must be installed in the container template. To simplify the performance modeling process, it is assumed that when resources are expanded, the micro-service is scaled in units of container templates; that is, for micro-service $i$, the system provides it with $n$ (an integer, $n \geq 1$) instances of container template $i$. It is further assumed that the micro-service framework uses a Ribbon load balancing strategy. Second, the response time threshold of user requests, i.e., the quality-of-service level promised by the software system, is determined.
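As a concrete illustration, the per-service container template can be captured in a small data structure. The following Python sketch is illustrative only; the class, field names, and values are assumptions rather than anything specified by the patent.

```python
# Minimal sketch of a per-micro-service container template as described above.
from dataclasses import dataclass

@dataclass
class ContainerTemplate:
    service_id: int     # micro-service id i
    cpu: float          # computing power CPU_i (e.g., cores)
    mem_mb: int         # memory capacity MEM_i
    band_mbps: int      # network bandwidth BAND_i
    replicas: int = 1   # n >= 1 instances of template i (scaled in whole units)

# One template per micro-service module; resources here are placeholder values.
templates = [ContainerTemplate(i, cpu=1.0, mem_mb=512, band_mbps=100) for i in range(8)]
```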
To generate the preset external access load, the system is first deployed at the scale of one container template per micro-service, and a small external access load L is generated.
The generated external access load is applied to the software system to be modeled and the test runs for a set time period (duration S), during which the running data of each micro-service in the system is recorded to build the training data set. The running data further comprise the micro-service id, recording time, service response time, container template resource configuration information, and container count information; the container template resource configuration information comprises CPU, memory, and network bandwidth. After each period, the response time of each micro-service is checked against the service level agreement, where the service level agreement is a preset response time threshold.
If no micro-service's response time violates the service level agreement, the external access load is increased linearly by a preset proportion, specifically by 1/K of the original access load each time; the test then returns to applying the external access load to the software system and running for another period S.
If micro-services violating the service level agreement are detected, the service resources of each violating micro-service are expanded by 1 container template; if several micro-services violate the agreement simultaneously, resources are expanded for all of them at once, each by 1 instance of its corresponding container template; the test then returns to applying the external access load to the software system and running for another period S.
The test terminates once the applied external access load reaches T times the highest access volume of the software system to be modeled. Throughout the run, the running data of each micro-service in the software system to be modeled is continuously recorded to generate the training data set.
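The data-collection procedure above amounts to a simple driver loop. The sketch below shows one way it could look; the four injected callables stand in for cluster tooling the patent does not specify, and all names and parameters are assumptions.

```python
# Sketch of the load-test loop: deploy, apply load for a period S, record metrics,
# expand SLA violators by one template, otherwise raise the load by L0/K,
# terminating at T times the peak access volume.
import time

def collect_training_data(deploy, apply_load, collect_metrics, expand,
                          services, L0, K, S, T, max_load, sla_ms):
    deploy(services, replicas=1)             # start with 1 container template each
    load, records = L0, []
    while load <= T * max_load:              # stop at T times the highest access volume
        apply_load(load)
        time.sleep(S)                        # run the test for one period S
        metrics = collect_metrics(services)  # requests, response times, resources
        records.extend(metrics)
        violators = [m for m in metrics if m["resp_ms"] > sla_ms]
        if violators:                        # expand each violator by 1 template
            for m in violators:
                expand(m["service_id"], extra_replicas=1)
        else:                                # SLA met: linear increase by 1/K of L0
            load += L0 / K
    return records                           # becomes the training data set
```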
The embodiment of the invention provides a response time prediction model of a microservice constructed by using a graph neural network. Specifically, the input of the model is the service request number, container template information, deployment container number information of all micro-services in the software system, and a call relation graph among all the micro-services; the output of the model is a response time prediction for each microservice over a future period of time S.
A graph neural network is constructed, each micro-service is regarded as a node in the graph neural network, and a state value $h_i$ is set for each micro-service $i$. The initial state of micro-service $i$ is $h_i^{(0)}$, and its initialization is as follows: the service request count, the container template's CPU measure, memory measure, and network bandwidth measure, and the deployed-container count are converted into binary bit strings; the bit strings are concatenated in series and zeros are padded at the tail; the resulting $d$-dimensional vector is assigned to $h_i^{(0)}$.
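For concreteness, the bit-string initialization could be implemented along these lines; the field order follows the text, while the bit width and the dimension $d$ are assumed values.

```python
# Sketch of the h_i^(0) encoding: each measurement becomes a fixed-width binary
# string, the strings are concatenated, and the result is zero-padded to d dims.
import numpy as np

def init_state(requests, cpu, mem, band, containers, d=64, width=12):
    fields = (requests, cpu, mem, band, containers)
    bits = "".join(format(int(v), f"0{width}b") for v in fields)
    assert len(bits) <= d, "d must be large enough to hold all fields"
    bits = bits.ljust(d, "0")                 # pad zeros at the tail to d dims
    return np.array([float(b) for b in bits], dtype=np.float32)

h0 = init_state(requests=350, cpu=2, mem=512, band=100, containers=3)
```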
In the $j$-th iteration ($j = 1, 2, \ldots, R$), the state of micro-service (node) $i$ is computed as follows:
on the call-relation graph among micro-services, nodes are sampled from the neighbors of node $i$ with probability $p$ to form the neighbor node set $N(i)$; if $N(i)$ is empty, re-sample until $N(i)$ contains at least 1 node;
the states of the nodes in the neighbor set are weighted-averaged to obtain $h_{N(i)}^{(j)}$:

$$h_{N(i)}^{(j)} = \sum_{k \in N(i)} \alpha_k^{(j)} \, h_k^{(j-1)}$$

where $j = 1, 2, \ldots, R$ denotes the $j$-th iteration and the $\alpha_k^{(j)}$ are trainable weight coefficients;
the value of $h_i^{(j)}$ is computed:

$$h_i^{(j)} = W^{(j)} \left[ h_i^{(j-1)} \,\|\, h_{N(i)}^{(j)} \right] + b^{(j)}$$

where $[\cdot \,\|\, \cdot]$ denotes the concatenation of two state vectors, $W^{(j)}$ is a trainable weight coefficient matrix, and $b^{(j)}$ is a bias term vector;
$h_i^{(j)}$ is then normalized:

$$h_i^{(j)} \leftarrow \frac{h_i^{(j)}}{\left\| h_i^{(j)} \right\|_2}$$
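A minimal sketch of one such iteration, under simplifying assumptions: uniform weights in place of the trainable $\alpha_k^{(j)}$, plain numpy arrays in place of a trainable parameterization, and no operation beyond the affine map and normalization given in the formulas above.

```python
# One message-passing round: sample neighbors with probability p, weighted-average
# their states, concatenate with the node's own state, apply W^(j)[. || .] + b^(j),
# then L2-normalize. Assumes every node has at least one neighbor in the call graph.
import numpy as np

rng = np.random.default_rng(0)

def iterate_once(h_prev, adj, W, b, p=0.5):
    # h_prev: (n, d) states; adj: neighbor lists; W: (d, 2d); b: (d,)
    h_next = np.empty_like(h_prev)
    for i in range(len(adj)):
        nbrs = [k for k in adj[i] if rng.random() < p]
        while not nbrs:                          # re-sample until N(i) is non-empty
            nbrs = [k for k in adj[i] if rng.random() < p]
        alpha = np.full(len(nbrs), 1.0 / len(nbrs))  # trainable in the real model
        h_nbr = alpha @ h_prev[nbrs]             # weighted average of neighbor states
        z = W @ np.concatenate([h_prev[i], h_nbr]) + b
        h_next[i] = z / np.linalg.norm(z)        # normalization step
    return h_next
```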
obtaining state value after R iteration rounds based on iteration method
Figure BDA00033369936800000712
Defining a feedforward fully-connected neural network, and iterating the state values of the micro-service i after R rounds
Figure BDA00033369936800000713
Substituting into the network to obtain output value Oi
Figure BDA00033369936800000714
F represents a feed-forward fully-connected neural network.
And determining a loss function, and carrying out graph neural network training based on the loss function and the training data set to obtain a response time prediction model of the micro-service.
The loss function is computed as:

$$\mathrm{LOSS} = \sum_{i \in X} \left\| O_i - Y_i \right\| + L_{reg}$$

where $X$ is the node set of the whole network, i.e., the set of all micro-services in the software system; $Y_i$ is the response time label value of micro-service $i$; and $L_{reg}$ is an L2 regularization term.
The model is trained with the data set recorded during the performance test stage via a gradient descent algorithm, finally yielding the response time prediction model of the micro-services.
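Putting the readout and objective together, a minimal training-step sketch might look as follows; the layer sizes, optimizer choice, and use of weight decay to realize the L2 term $L_{reg}$ are assumptions for illustration.

```python
# Sketch of the readout F and the gradient-descent objective described above.
import torch
import torch.nn as nn

d = 64                                    # state dimension from the encoding step
F = nn.Sequential(nn.Linear(d, 32), nn.ReLU(), nn.Linear(32, 1))   # readout F
opt = torch.optim.SGD(F.parameters(), lr=1e-3, weight_decay=1e-4)  # L2 as L_reg

def train_step(h_R, y):                   # h_R: (n, d) final states; y: (n,) labels
    opt.zero_grad()
    O = F(h_R).squeeze(-1)                # O_i = F(h_i^(R))
    loss = (O - y).abs().sum()            # sum_i ||O_i - Y_i|| over the node set X
    loss.backward()
    opt.step()                            # one gradient descent update
    return loss.item()
```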
In a software system deployed in micro-service mode, specific micro-services can be independently deployed, performance-monitored, and resource-expanded. When expanding resources, the micro-service response time prediction model can guide the expansion process. In particular, there are several possible applications, as follows.
First, resource expansion. When the response time of a micro-service violates the service level constraint, its resources need to be expanded. To simplify the expansion process, the invention assumes expansion in units of the micro-service's container template: each time, 1 or more additional container instances are deployed from the original template, with a Ribbon load balancing strategy. During expansion, the number of container instances to add must be determined, and the micro-service response time prediction model can be used for this. Specifically, input data for the prediction model is generated from the state information of each micro-service in the system (including service request counts, deployed resource amounts, etc.); the container count of the micro-service to be expanded is adjusted, the predicted response time is obtained from the model, and the prediction is compared against the agreed service level to determine the number of container instances to expand.
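A sketch of this adjust-predict-check decision, where predict stands in for the trained prediction model and state for the monitoring data; both interfaces are hypothetical.

```python
# Increase the container count one template instance at a time until the model
# predicts the service meets its SLA, or a cap is reached.
def containers_needed(predict, state, service_id, sla_ms, max_extra=10):
    for extra in range(1, max_extra + 1):
        state[service_id]["containers"] += 1   # adjust: add one template instance
        pred = predict(state)                  # predict: response time per service
        if pred[service_id] <= sla_ms:         # check against the agreed SLA
            return extra
    return max_extra                           # cap if the SLA is still violated
```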
Similarly, response time prediction models for microservices can also be utilized to assist in determining the amount of resources to deploy when resource contraction needs to be performed.
Second, global expansion. Because of the call-chain relationships among micro-services, expanding resources for one micro-service may sharply increase the request volume of its downstream micro-services, turning one of them into a new performance bottleneck that in turn requires expansion. Such frequent, piecemeal expansion has a large impact on overall system performance. With the micro-service response time prediction model provided by the invention, when resources are expanded for a micro-service, the model's predictions can be checked to see whether the response times of other micro-services would violate the service level agreement; if so, the container counts of the new bottlenecks are adjusted as well. Through this "adjust-predict-check-adjust" cycle, resources can be reasonably adjusted for multiple micro-services at once, avoiding the drawbacks of frequent piecemeal expansion.
Third, access throttling. Using the micro-service response time prediction model, an estimate of the maximum system access load can be obtained, so that access traffic can be throttled without changing the deployed resources, avoiding degradation of the service level.
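One plausible way to obtain that maximum-load estimate is to bisect on the external load using the model's predictions; the helper predict below is again a hypothetical stand-in for the trained model, returning each micro-service's predicted response time at a given load.

```python
# Bisect on the external load to estimate the largest load the current
# deployment is predicted to serve within the SLA.
def max_admissible_load(predict, sla_ms, lo=0.0, hi=1e6, iters=30):
    for _ in range(iters):
        mid = (lo + hi) / 2
        if max(predict(mid).values()) <= sla_ms:
            lo = mid          # SLA still met: the system can admit more load
        else:
            hi = mid          # SLA violated: back off
    return lo                 # throttle incoming traffic above this estimate
```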
An embodiment of the present invention further provides a micro-service response time prediction system based on a graph neural network, which performs the graph-neural-network-based micro-service response time prediction method of the above embodiments and includes:
the acquisition module is used for operating the software system to be modeled and recording the operating data of each micro-service in the software system to be modeled, wherein the operating data comprises service request quantity, container template information, deployment container quantity information, service request response time and a calling relation graph among the micro-services;
and the prediction module is used for inputting the operation data into a response time prediction model trained in advance so as to predict the micro-service response time in a set period in the future.
Based on the same concept, an embodiment of the present invention further provides a schematic physical structure, as shown in fig. 2. The server may include: a processor 810, a communication interface 820, a memory 830, and a communication bus 840, where the processor 810, the communication interface 820, and the memory 830 communicate with one another via the communication bus 840. The processor 810 may invoke logic instructions in the memory 830 to perform the steps of the graph-neural-network-based micro-service response time prediction method described in the embodiments above. For example:
the method comprises the steps of running a software system to be modeled, and recording running data of each micro service in the software system to be modeled, wherein the running data comprises service request quantity, container template information, deployment container quantity information, service request response time and a calling relation graph among the micro services;
and inputting the operation data into a response time prediction model trained in advance to predict the micro-service response time in a set period in the future.
In addition, the logic instructions in the memory 830 may be implemented as software functional units and, when sold or used as an independent product, stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
Based on the same concept, embodiments of the present invention further provide a non-transitory computer-readable storage medium storing a computer program, where the computer program includes at least one code, and the at least one code is executable by a master device to control the master device to implement the steps of the method for predicting response time of a micro-service based on a graph neural network according to the embodiments. Examples include:
the method comprises the steps of running a software system to be modeled, and recording running data of each micro-service in the software system to be modeled, wherein the running data comprises service request quantity, container template information, deployment container quantity information, service request response time and a calling relation graph among the micro-services;
and inputting the operation data into a response time prediction model trained in advance to predict the micro-service response time in a set period in the future.
Based on the same technical concept, the embodiment of the present application further provides a computer program, which is used to implement the above method embodiment when the computer program is executed by the main control device.
The program may be stored in whole or in part on a storage medium packaged with the processor, or in part or in whole on a memory not packaged with the processor.
Based on the same technical concept, the embodiment of the present application further provides a processor, and the processor is configured to implement the above method embodiment. The processor may be a chip.
The embodiments of the present invention can be arbitrarily combined to achieve different technical effects.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions described in accordance with the present application are generated, in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored in a computer readable storage medium or transmitted from one computer readable storage medium to another, for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, fiber optic, digital subscriber line) or wirelessly (e.g., infrared, wireless, microwave, etc.). The computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device, such as a server, a data center, etc., that incorporates one or more of the available media. The usable medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid state disk), among others.
One of ordinary skill in the art will appreciate that all or part of the processes in the methods of the above embodiments may be implemented by hardware related to instructions of a computer program, which may be stored in a computer-readable storage medium, and when executed, may include the processes of the above method embodiments. And the aforementioned storage medium includes: various media capable of storing program codes, such as ROM or RAM, magnetic or optical disks, etc.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A micro-service response time prediction method based on a graph neural network is characterized by comprising the following steps:
the method comprises the steps of running a software system to be modeled, and recording running data of each micro-service in the software system to be modeled, wherein the running data comprises service request quantity, container template information, deployment container quantity information, service request response time and a calling relation graph among the micro-services;
and inputting the operation data into a response time prediction model trained in advance to predict the micro-service response time in a set period in the future.
2. The graph neural network-based microservice response time prediction method according to claim 1, further comprising training the graph neural network, specifically comprising:
configuring at least one container template for each micro service of a software system to be modeled, and configuring operation resources and an operation environment for the container template based on the operation environment requirement of the micro service;
generating a preset external access load, loading the preset external access load to a software system to be modeled, testing the preset external access load within a set time period, and recording the running data of each microservice in the software system to be modeled so as to generate a training data set; the operation data also comprises micro service id, recording time, service response time, container template resource configuration information and container quantity information; the container template resource configuration information comprises a CPU, a memory and a network bandwidth;
constructing a graph neural network, regarding each micro-service as a node in the graph neural network, and setting a state value $h_i$ for each micro-service $i$;
obtaining the state value $h_i^{(R)}$ after $R$ iteration rounds based on an iterative method;
defining a feedforward fully-connected neural network, and substituting the state value $h_i^{(R)}$ of micro-service $i$ after $R$ iteration rounds into the fully-connected network to obtain an output value $O_i$; and determining a loss function, and performing graph neural network training based on the loss function and the training data set to obtain a response time prediction model of the micro-service.
3. The graph neural network-based micro-service response time prediction method of claim 2, wherein the operating resources comprise computing power $CPU_i$, memory capacity $MEM_i$, and network bandwidth $BAND_i$.
4. The method for predicting response time of microservice based on graph neural network of claim 2, further comprising, after generating the training data set:
checking whether the response time of each micro service violates a service level agreement, wherein the service level agreement is a preset response time threshold;
if the response time of any micro service does not violate the service level agreement, linearly increasing the external access load amount based on a preset proportion;
if the micro-services violating the service level agreement are detected, expanding the service resources of 1 container template for each micro-service violating the service level agreement;
and if the loaded external access load quantity reaches T times of the highest access quantity of the software system to be modeled, terminating the test.
And in the operation process, continuously recording the operation data of each micro-service in the software system to be modeled so as to generate a training data set.
5. The method of claim 2, wherein the initial state of micro-service $i$ is $h_i^{(0)}$, and the initialization of $h_i^{(0)}$ comprises:
converting the service request count, the container template's CPU measure, memory measure, and network bandwidth measure, and the deployed-container count into binary bit strings, concatenating the bit strings in series, padding zeros at the tail, and assigning the resulting $d$-dimensional vector to $h_i^{(0)}$.
6. The method of claim 5, wherein obtaining the state value $h_i^{(R)}$ after $R$ iteration rounds based on an iterative method specifically comprises:
on the call-relation graph among micro-services, sampling nodes from the neighbors of node $i$ with probability $p$ to form a neighbor node set $N(i)$, and if $N(i)$ is empty, re-sampling until $N(i)$ contains at least 1 node;
taking the weighted average of the states of the nodes in the neighbor set to obtain $h_{N(i)}^{(j)}$:

$$h_{N(i)}^{(j)} = \sum_{k \in N(i)} \alpha_k^{(j)} \, h_k^{(j-1)}$$

where $j = 1, 2, \ldots, R$ denotes the $j$-th iteration and the $\alpha_k^{(j)}$ are trainable weight coefficients;
computing the value of $h_i^{(j)}$:

$$h_i^{(j)} = W^{(j)} \left[ h_i^{(j-1)} \,\|\, h_{N(i)}^{(j)} \right] + b^{(j)}$$

where $[\cdot \,\|\, \cdot]$ denotes the concatenation of two state vectors, $W^{(j)}$ is a trainable weight coefficient matrix, and $b^{(j)}$ is a bias term vector;
normalizing $h_i^{(j)}$:

$$h_i^{(j)} \leftarrow \frac{h_i^{(j)}}{\left\| h_i^{(j)} \right\|_2}$$
7. The graph neural network-based micro-service response time prediction method of claim 6, wherein the loss function is:

$$\mathrm{LOSS} = \sum_{i \in X} \left\| O_i - Y_i \right\| + L_{reg}$$

where $X$ is the node set of the whole network, i.e., the set of all micro-services in the software system; $Y_i$ is the service response time label value of micro-service $i$; and $L_{reg}$ is an L2 regularization term.
8. A microservice response time prediction system based on a graph neural network, comprising:
the acquisition module is used for operating the software system to be modeled and recording the operating data of each micro-service in the software system to be modeled, wherein the operating data comprises service request quantity, container template information, deployment container quantity information, service request response time and a calling relation graph among the micro-services;
and the prediction module is used for inputting the operation data into a response time prediction model trained in advance so as to predict the micro-service response time in a set period in the future.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the program implements the steps of the graph neural network-based microservice response time prediction method of any of claims 1 to 7.
10. A non-transitory computer readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the graph neural network-based microservice response time prediction method according to any one of claims 1 to 7.
CN202111297775.6A 2021-11-04 2021-11-04 Micro-service response time prediction method and system based on graph neural network Pending CN114020326A (en)

Priority Applications (1)

Application CN202111297775.6A (priority date 2021-11-04, filing date 2021-11-04): CN114020326A (en), Micro-service response time prediction method and system based on graph neural network

Publications (1)

Publication Number: CN114020326A
Publication Date: 2022-02-08

Family

ID=80060831

Family Applications (1)

Application CN202111297775.6A (CN114020326A, pending): Micro-service response time prediction method and system based on graph neural network, priority and filing date 2021-11-04

Country Status (1)

Country: CN; Link: CN114020326A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114721860A (en) * 2022-05-23 2022-07-08 北京航空航天大学 Micro-service system fault positioning method based on graph neural network
CN114969209A (en) * 2022-06-15 2022-08-30 支付宝(杭州)信息技术有限公司 Training method and device, and method and device for predicting resource consumption
CN115499511A (en) * 2022-11-18 2022-12-20 合肥综合性国家科学中心人工智能研究院(安徽省人工智能实验室) Micro-service active scaling method based on space-time diagram neural network load prediction
CN115499511B (en) * 2022-11-18 2023-03-24 合肥综合性国家科学中心人工智能研究院(安徽省人工智能实验室) Micro-service active scaling method based on space-time diagram neural network load prediction
CN116611749A (en) * 2023-02-13 2023-08-18 国家电投集团数字科技有限公司 Intelligent electric power data input method and system based on micro-service architecture
CN116611749B (en) * 2023-02-13 2023-10-20 国家电投集团数字科技有限公司 Intelligent electric power data input method and system based on micro-service architecture
CN116932225A (en) * 2023-09-13 2023-10-24 北京中科智媒融媒体技术有限公司 Micro-service resource scheduling method, micro-service resource scheduling device, electronic equipment and computer readable medium
CN116932225B (en) * 2023-09-13 2023-12-08 北京中科智媒融媒体技术有限公司 Micro-service resource scheduling method, micro-service resource scheduling device, electronic equipment and computer readable medium

Similar Documents

Publication Publication Date Title
CN114020326A (en) Micro-service response time prediction method and system based on graph neural network
US11018979B2 (en) System and method for network slicing for service-oriented networks
US11521090B2 (en) Collaborative distributed machine learning
Kang et al. Neurosurgeon: Collaborative intelligence between the cloud and mobile edge
TWI620075B (en) Server and cloud computing resource optimization method thereof for cloud big data computing architecture
US11016673B2 (en) Optimizing serverless computing using a distributed computing framework
JP2011504254A (en) Finding optimal system configurations using distributed probability-based active sampling
CN109788489B (en) Base station planning method and device
US20210209481A1 (en) Methods and systems for dynamic service performance prediction using transfer learning
US20170026305A1 (en) System to place virtual machines onto servers based upon backup runtime constraints
CN113110914A (en) Internet of things platform construction method based on micro-service architecture
CN108347377B (en) Data forwarding method and device
CN113485792A (en) Pod scheduling method in kubernets cluster, terminal equipment and storage medium
CN113364626B (en) Service placement and bandwidth allocation method for video analysis application facing edge environment
US20240095529A1 (en) Neural Network Optimization Method and Apparatus
US11201789B1 (en) Coordinated device grouping in fog computing
WO2020107264A1 (en) Neural network architecture search method and apparatus
KR20230061423A (en) Distributed resource-aware training of machine learning pipelines
Taherizadeh et al. Incremental learning from multi-level monitoring data and its application to component based software engineering
CN117041330A (en) Edge micro-service fine granularity deployment method and system based on reinforcement learning
JP7021132B2 (en) Learning equipment, learning methods and programs
US11410023B2 (en) Lexicographic deep reinforcement learning using state constraints and conditional policies
CN113050955A (en) Self-adaptive AI model deployment method
CN115525394A (en) Method and device for adjusting number of containers
CN108933834A (en) A kind of dispatching method and dispatching device

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination