CN113518000A - Method and device for adjusting number of instances of online service and electronic equipment - Google Patents

Method and device for adjusting number of instances of online service and electronic equipment

Info

Publication number
CN113518000A
CN113518000A (application CN202110518066.XA / CN202110518066A)
Authority
CN
China
Prior art keywords
data
flow data
index
preset
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110518066.XA
Other languages
Chinese (zh)
Other versions
CN113518000B (en)
Inventor
张磊 (Zhang Lei)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing QIYI Century Science and Technology Co Ltd filed Critical Beijing QIYI Century Science and Technology Co Ltd
Priority to CN202110518066.XA priority Critical patent/CN113518000B/en
Publication of CN113518000A publication Critical patent/CN113518000A/en
Application granted granted Critical
Publication of CN113518000B publication Critical patent/CN113518000B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/147 Network analysis or design for predicting network behaviour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/145 Network analysis or design involving simulating, designing, planning or modelling of a network

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application relates to a method and a device for adjusting the number of instances of an online service, and to an electronic device. The method comprises the following steps: first, target index data is obtained; the target index data is then input into the currently deployed prediction model for prediction, and predicted flow data of the online service output by the currently deployed prediction model for a preset time period is obtained; the number of instances of the online service is then adjusted according to the predicted flow data; actual flow data of the online service in the preset time period after the number of instances is adjusted is obtained, and it is judged whether the actual flow data and the predicted flow data satisfy a first preset condition; if so, the currently deployed prediction model is updated to the latest prediction model obtained by training on the data of all indexes collected in real time together with the predicted flow data obtained each time, so that the currently deployed prediction model remains suitable, in real time, for an online service whose number of instances is continuously adjusted.

Description

Method and device for adjusting number of instances of online service and electronic equipment
Technical Field
The present application relates to the technical field of service expansion and contraction, and in particular, to a method and an apparatus for adjusting the number of instances of an online service, and an electronic device.
Background
An online service is a service that must respond to client requests in real time, and it generally deploys a plurality of instances to ensure that large batches of client requests can be responded to in time. The request traffic of an online service typically shows a tidal pattern: daytime and nighttime traffic differ greatly, peak traffic is usually several times valley traffic, and traffic rises steeply and falls slowly around hot-spot events.
To cope with changes in request traffic, the related art performs service expansion and contraction (adjustment of the number of instances of the service) by means of timing, threshold monitoring, model prediction, and the like. However, the timing manner has difficulty coping with the steep-rise, slow-fall pattern caused by hot-spot events; expansion and contraction usually need a certain amount of time to complete, so the threshold monitoring manner also cannot cope with that pattern in time; and current model prediction requires a long training time, while the structure of the online service changes as expansion and contraction are carried out, so the model must be trained again before it can predict the request traffic of the online service with the changed structure.
Disclosure of Invention
In order to solve the problems in the related art that model prediction requires a long training time and cannot meet the demand for request-traffic prediction of an online service whose capacity is frequently scaled, the present application provides a method and a device for adjusting the number of instances of an online service, an electronic device, and a storage medium.
According to a first aspect of the present application, there is provided a method for adjusting the number of instances of an online service, including:
acquiring target index data, wherein the target index data is data of pre-marked indexes required by a currently deployed prediction model for prediction;
inputting the target index data into the currently deployed prediction model for prediction, and acquiring predicted flow data of the online service output by the currently deployed prediction model for a preset time period;
adjusting the number of instances of the online service according to the predicted flow data;
acquiring actual flow data of the online service in the preset time period after the number of the instances is adjusted, and judging whether the actual flow data and the predicted flow data meet a first preset condition;
and if so, updating the currently deployed prediction model to the latest prediction model obtained by training on the data of all indexes collected in real time together with the predicted flow data acquired each time.
In an optional embodiment, the obtaining target index data includes:
calling a target index corresponding to a currently deployed prediction model, wherein the target index comprises a gradient index;
and screening data corresponding to the target index from the data of all indexes collected in real time to obtain target index data.
In an optional embodiment, the predicted flow data includes a predicted flow value corresponding to each time in the preset time period;
the adjusting the number of instances of the online service according to the predicted traffic data comprises:
determining a first difference value of predicted flow values corresponding to adjacent moments;
if a first difference value larger than a first threshold value exists, increasing the number of the instances of the online service at a target time, wherein the target time is a time before the preset time period and is away from the starting time of the preset time period by a preset time length;
and if a first difference value smaller than a second threshold value exists, reducing the number of the instances of the online service at the target moment.
In an optional embodiment, the actual flow data includes an actual flow value corresponding to each time in the preset time period, and the predicted flow data includes a predicted flow value corresponding to each time in the preset time period;
the determining whether the actual flow data and the predicted flow data satisfy a first preset condition includes:
determining a first average value of all actual flow values and a second average value of all predicted flow values, and determining a second difference value of the first average value and the second average value;
if the second difference is larger than a third threshold, judging that the actual flow data and the predicted flow data meet a first preset condition;
and if the second difference is not larger than a third threshold, judging that the actual flow data and the predicted flow data do not meet a first preset condition.
In an optional embodiment, the method further comprises:
under the condition that the predicted flow data is obtained every time, determining training sample data according to data of all indexes collected in real time and the predicted flow data;
training a prediction model according to the training sample data to obtain an updated parameter;
the updating of the currently deployed prediction model to the latest prediction model obtained by training the prediction flow data acquired each time according to the data of all the indexes acquired in real time comprises the following steps:
calling the update parameters of the current latest prediction model;
and updating the model parameters of the currently deployed prediction model into the updated parameters so as to update the currently deployed prediction model into a prediction model obtained by training according to the data of all indexes acquired in real time and the predicted flow data acquired each time.
In an optional embodiment, the actual flow data is data belonging to a preset index in all index data collected in real time;
the method for determining training sample data according to the data of all indexes collected in real time and the predicted flow data under the condition that the predicted flow data is obtained each time comprises the following steps:
determining the correlation coefficients of other indexes except the preset index and the preset index in all the indexes based on the data of all the indexes collected in real time;
determining an index of which the correlation coefficient with a preset index is greater than a preset threshold value as a contribution index;
determining data of a time sequence gradient index of the actual flow data based on the actual flow data;
under the condition that predicted flow data are obtained each time, determining data of an error gradient index between the actual flow data and the predicted flow data based on the actual flow data and the predicted flow data;
determining the data of the contribution index, the data of the time sequence gradient index and the data of the error gradient index as the training sample data;
and marking the contribution index, the time sequence gradient index and the error gradient index as target indexes of the currently trained prediction model.
In an optional embodiment, the actual flow data includes an actual flow value corresponding to each time in the preset time period;
the determining data of a time series gradient index of the actual flow data based on the actual flow data comprises:
determining a third difference value of the actual flow numerical values corresponding to every two adjacent moments;
and determining the data of the time sequence gradient index of the actual flow data according to all the determined third difference values.
In an optional embodiment, the actual flow data includes an actual flow value corresponding to each time in the preset time period, and the predicted flow data includes a predicted flow value corresponding to each time in the preset time period;
the determining data of an error gradient indicator between the actual flow data and the predicted flow data based on the actual flow data and the predicted flow data comprises:
determining, for each moment within the preset time period, a fourth difference value between the actual flow value and the predicted flow value corresponding to that moment;
and determining data of an error gradient index between the actual flow data and the predicted flow data according to all the determined fourth difference values.
In an optional embodiment, the determining, based on the data of all the indexes collected in real time, a correlation coefficient between each index other than the preset index and the preset index includes:
for any index, determining a correlation coefficient of the index and the preset index from the data of the index by using a preset correlation statistical algorithm.
according to a second aspect of the present application, there is provided an apparatus for online service instance adjustment, the apparatus comprising:
an acquisition module, configured to acquire target index data, wherein the target index data is data of pre-marked indexes required by the currently deployed prediction model for prediction;
a prediction module, configured to input the target index data into the currently deployed prediction model for prediction, and obtain predicted traffic data of the online service output by the currently deployed prediction model for a preset time period;
The adjusting module is used for adjusting the number of the instances of the online service according to the predicted flow data;
the judging module is used for acquiring actual flow data of the online service in the preset time period after the number of the instances is adjusted, and judging whether the actual flow data and the predicted flow data meet a first preset condition or not;
and an updating module, configured to, if the first preset condition is satisfied, update the currently deployed prediction model to the latest prediction model obtained by training on the data of all indexes collected in real time together with the predicted flow data acquired each time.
In an optional embodiment, the obtaining module includes:
the first retrieval unit is used for retrieving a target index corresponding to a currently deployed prediction model, and the target index comprises a gradient index;
and the screening unit is used for screening the data corresponding to the target index from the data of all the indexes collected in real time to obtain the target index data.
In an optional embodiment, the predicted flow data includes a predicted flow value corresponding to each time in the preset time period;
the adjustment module includes:
the first determining unit is used for determining a first difference value of the predicted flow numerical values corresponding to adjacent moments;
an increasing unit, configured to increase, at a target time, the number of instances of the online service if a first difference greater than a first threshold exists, where the target time is a time before the preset time period and is a time that is a preset time length away from a starting time of the preset time period;
and the reducing unit is used for reducing the number of the instances of the online service at the target moment if a first difference value smaller than a second threshold value exists.
In an optional embodiment, the actual flow data includes an actual flow value corresponding to each time in the preset time period, and the predicted flow data includes a predicted flow value corresponding to each time in the preset time period;
the judging module comprises:
the second determining unit is used for determining a first average value of all actual flow values and a second average value of all predicted flow values, and determining a second difference value of the first average value and the second average value;
the first judging unit is used for judging that the actual flow data and the predicted flow data meet a first preset condition if the second difference value is larger than a third threshold value;
and the second judging unit is used for judging that the actual flow data and the predicted flow data do not meet the first preset condition if the second difference value is not larger than a third threshold value.
In an optional embodiment, the apparatus further comprises:
the third determining unit is used for determining training sample data according to the data of all indexes acquired in real time and the predicted flow data under the condition that the predicted flow data is acquired each time;
the training unit is used for training the prediction model according to the training sample data to obtain an updated parameter;
the update module includes:
the second calling unit is used for calling the update parameters of the current latest prediction model;
and the updating unit is used for updating the model parameters of the currently deployed prediction model into the updated parameters so as to update the currently deployed prediction model into the prediction model obtained by training according to the data of all indexes acquired in real time and the predicted flow data acquired each time.
In an optional embodiment, the actual flow data is data belonging to a preset index in all index data collected in real time;
the third determination unit includes:
the first determining subunit is used for determining the correlation coefficients of other indexes except the preset index in all the indexes and the preset index based on the data of all the indexes collected in real time;
a second determining subunit, configured to determine, as a contribution index, an index whose correlation coefficient with the preset index is greater than a preset threshold;
a third determining subunit, configured to determine, based on the actual flow data, data of a time-series gradient index of the actual flow data;
a fourth determining subunit, configured to determine, based on the actual flow data and the predicted flow data, data of an error gradient index between the actual flow data and the predicted flow data each time predicted flow data is obtained;
a fifth determining subunit, configured to determine, as the training sample data, the data of the contribution indicator, the data of the timing gradient indicator, and the data of the error gradient indicator;
and the marking subunit is used for marking the contribution index, the time sequence gradient index and the error gradient index as target indexes of the currently trained prediction model.
In an optional embodiment, the actual flow data includes an actual flow value corresponding to each time in the preset time period;
the third determining subunit includes:
a sixth determining subunit, configured to determine a third difference between actual flow values corresponding to every two adjacent time instants;
and the seventh determining subunit is configured to determine, according to all the determined third difference values, data of a time-series gradient index of the actual flow data.
In an optional embodiment, the actual flow data includes an actual flow value corresponding to each time in the preset time period, and the predicted flow data includes a predicted flow value corresponding to each time in the preset time period;
the fourth determining subunit includes:
the eighth determining subunit is configured to determine, for each moment of the preset time period, a fourth difference between the actual flow value and the predicted flow value corresponding to that moment;
a ninth determining subunit, configured to determine, according to all the fourth determined differences, data of an error gradient index between the actual flow data and the predicted flow data.
In an optional embodiment, the first determining subunit comprises:
and the tenth determining subunit is configured to determine, for any index, a correlation coefficient between the index and the preset index according to the data of the index by using a preset correlation statistical algorithm.
According to a third aspect of the present application, there is provided an electronic device comprising: at least one processor and memory;
the processor is configured to execute the program for adjusting the number of instances of the online service stored in the memory, so as to implement the method for adjusting the number of instances of the online service according to the first aspect of the present application.
According to a fourth aspect of the present application, there is provided a storage medium storing one or more programs which, when executed, implement the method for adjusting the number of instances of an online service according to the first aspect of the present application.
The technical solution provided by the application can have the following beneficial effects. Target index data is obtained first, where the target index data is data of pre-marked indexes required by the currently deployed prediction model for prediction; the target index data is then input into the currently deployed prediction model for prediction, and predicted flow data of the online service output by the currently deployed prediction model for a preset time period is obtained; the number of instances of the online service is then adjusted according to the predicted flow data, the actual flow data of the online service in the preset time period after the number of instances is adjusted is obtained, and it is judged whether the actual flow data and the predicted flow data satisfy a first preset condition; if so, the currently deployed prediction model is updated to the latest prediction model obtained by training on the data of all indexes collected in real time together with the predicted flow data acquired each time. Based on this, actual flow data within the preset time period of the online service is obtained each time after the number of instances is adjusted, and when the actual flow data and the predicted flow data satisfy the first preset condition, this indicates that the currently deployed prediction model is no longer suitable for the online service with the current number of instances; the currently deployed prediction model is therefore updated to the latest prediction model, so that it remains suitable, in real time, for an online service whose number of instances is continuously adjusted.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application.
FIG. 1 is a schematic diagram of an architecture for adjusting the number of instances of an online service provided by an embodiment of the present application;
FIG. 2 is a flow chart illustrating a method for adjusting the number of instances of an online service according to another embodiment of the present application;
fig. 3 is a schematic flowchart of a process for obtaining target index data according to another embodiment of the present application;
FIG. 4 is a detailed flow diagram of adjusting the number of instances of an online service provided by another embodiment of the present application;
fig. 5 is a schematic flow chart illustrating a process of determining whether the actual flow data and the predicted flow data satisfy the first preset condition according to another embodiment of the present application;
FIG. 6 is a schematic flow chart of training a predictive model according to another embodiment of the present application;
FIG. 7 is a schematic flow chart for determining training sample data according to another embodiment of the present application;
FIG. 8 is a schematic diagram of an LSTM prediction model according to another embodiment of the present application;
FIG. 9 is a schematic structural diagram of an apparatus for online service instance adjustment according to another embodiment of the present application;
fig. 10 is a schematic structural diagram of an electronic device according to another embodiment of the present application.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.
Referring to fig. 1, fig. 1 is a schematic diagram illustrating an architecture for adjusting the number of instances of an online service according to an embodiment of the present application.
As shown in fig. 1, in this embodiment the scheme is divided into two parts: model training and model prediction. The model training part mainly adjusts the model parameters of a prediction model continuously according to the predicted traffic data and the actual traffic data of the same time period, and when the error between the predicted traffic data and the actual traffic data reaches a certain threshold, the current latest model parameters from model training overwrite those of the currently deployed prediction model, so that the currently deployed prediction model effectively adapts to an online service whose number of instances changes continuously.
In addition, the currently deployed prediction model refers to a prediction model deployed in an operating system of the online service, and used for predicting predicted traffic data of the online service for a future period of time according to actual traffic data of the online service.
Specifically, the on-line index collection in fig. 1 collects data of all on-line indexes of the on-line service. Automatic feature screening then screens out the on-line indexes strongly related to the flow of the on-line service, and the screened on-line index data is input, as actual flow data, into the currently deployed prediction model; on-line flow prediction then yields the predicted flow data of the on-line service for a preset time period. The number of application instances is adjusted according to the predicted flow data, and the actual flow data of the on-line service for the same preset time period after the adjustment is obtained. On one hand, this actual flow data is returned to the model training part, where gradient features are calculated from it and the predicted flow data, and the calculated gradient features, the actual flow data and the predicted flow data are used as sample data to train a model with the structure of the currently deployed prediction model, yielding new target model parameters; on the other hand, on-line prediction error statistics are computed for the predicted flow data, and when the error statistics satisfy a first preset condition, the new target model parameters are copied into the currently deployed prediction model.
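Purely as an illustration of the above architecture (the object interfaces, the function name run_adjustment_cycle, and the use of Python are assumptions of this sketch, not part of the patent), one pass through the loop of fig. 1 might look as follows:

from statistics import mean

def run_adjustment_cycle(deployed_model, trainer, collector, scaler, third_threshold):
    """One pass through the architecture of Fig. 1 (all interfaces are assumed for this sketch)."""
    all_metrics = collector.collect_all_indexes()          # on-line index collection
    target_data = deployed_model.screen_target_data(all_metrics)   # automatic feature screening
    predicted = deployed_model.predict(target_data)        # on-line flow prediction for the preset period
    scaler.adjust_instances(predicted)                     # adjust the number of instances ahead of time

    actual = collector.actual_flow_for_period()            # actual flow observed after scaling
    trainer.train(all_metrics, predicted, actual)          # training side keeps producing update parameters

    if mean(actual) - mean(predicted) > third_threshold:   # on-line prediction error statistics
        # first preset condition met: the deployed model no longer fits the current
        # number of instances, so overlay the newest update parameters onto it
        deployed_model.load_parameters(trainer.latest_update_parameters())

The training side runs continuously, while the deployed model is only overwritten when the error statistics cross the threshold, mirroring the separation between the model-training part and the model-prediction part described above.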
Based on the architecture shown in fig. 1, the present application provides a method for adjusting the number of instances of an online service, and the following description is provided by way of example.
Referring to fig. 2, fig. 2 is a flowchart illustrating a method for adjusting the number of instances of an online service according to another embodiment of the present application.
As shown in fig. 2, the method for adjusting the number of instances of the online service provided by this embodiment may include:
step S201, target index data is obtained, wherein the target index data is data of indexes required by prediction of a pre-marked currently deployed prediction model.
It should be noted that the prediction model in this embodiment is continuously and iteratively trained. In each round of training, the contribution indexes whose correlation coefficient with a preset index is greater than a preset threshold are determined, and these contribution indexes are marked as the indexes required by the prediction model obtained in that round of training.
Of course, in order to enable the prediction model to better sense the steep increase and the steep decrease of the flow data, the embodiment further provides a time sequence gradient feature, and in order to further improve the accuracy of prediction of the prediction model, the embodiment further provides an error gradient feature. In this embodiment, the aforementioned contribution index, the time sequence gradient feature, and the error gradient feature are marked as target indexes required by the currently trained prediction model.
It should be noted that, referring to fig. 3, a process of acquiring target index data in the present embodiment may be shown, and fig. 3 is a schematic flow chart of acquiring target index data according to another embodiment of the present application.
As shown in fig. 3, the process of acquiring target index data provided in this embodiment includes:
step S301, a target index corresponding to the currently deployed prediction model is called, wherein the target index comprises a gradient index.
The currently deployed prediction model has pre-marked target indexes, and the mark may be a stored identifier of each target index. The calling in this step may be calling, by that identifier, the corresponding target indexes from all the indexes collected by the on-line index collection module in fig. 1.
Step S302, screening data corresponding to the target index from the data of all indexes collected in real time to obtain target index data.
In this step, the data of all the indexes collected in real time may be the data of all the indexes collected by the on-line index collection module in fig. 1. Because the target indexes contain a gradient index whose data needs to be further calculated, in this step the data of the gradient index may either be calculated directly by the on-line index collection module, so that it can be obtained by direct screening, or the data required for calculating the gradient index may be called first and the data of the corresponding gradient index then calculated from it. Specifically, reference may be made to the following description of the calculation of the gradient index, which is not repeated here.
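As a hedged sketch of steps S301 and S302 (the dictionary representation of the collected metrics, the function name get_target_index_data, and the "_gradient" naming convention are assumptions made here for illustration only), the screening could be implemented as:

def first_differences(series):
    """Gradient data: difference between each value and the previous one."""
    return [b - a for a, b in zip(series, series[1:])]

def get_target_index_data(collected_metrics, target_indexes, gradient_source="qps"):
    """Screen the data of the marked target indexes out of all collected metrics.

    collected_metrics: dict mapping index name -> list of values over time.
    target_indexes: index names marked for the currently deployed model; a name
    ending in "_gradient" is treated here as a gradient index derived from the
    raw series named by gradient_source (an assumption of this sketch).
    """
    target_data = {}
    for name in target_indexes:
        if name in collected_metrics:
            target_data[name] = collected_metrics[name]
        elif name.endswith("_gradient"):
            target_data[name] = first_differences(collected_metrics[gradient_source])
    return target_data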
Step S202, the target index data is input into the currently deployed prediction model for prediction, and predicted flow data of the online service output by the currently deployed prediction model for a preset time period is obtained.
In this step, the currently deployed prediction model is deployed in an operating system of the online service, and is used for predicting predicted traffic data of the online service for a period of time in the future according to actual traffic data of the online service.
In addition, the preset time period refers to a future time period, specifically a period of a preset time length starting from the current moment. In a specific example, if the current time is 8:00 on January 1, 2021 and the preset time length is one week, the preset time period may be from 8:00 on January 1, 2021 to 8:00 on January 8, 2021. Of course, the preset time period can be adjusted according to specific requirements, for example to 2 hours, 1 day, or one month.
The predicted traffic data refers to data corresponding to an online indicator that can represent a traffic trend of an online service, and may be, for example, data corresponding to a request traffic (QPS) or a delay (latency) online indicator.
And step S203, adjusting the number of the instances of the online service according to the predicted flow data.
In this step, when the number of instances of the online service is adjusted according to the predicted traffic data, the adjustment can be carried out a certain amount of time in advance, thereby solving the problem that expansion and contraction do not respond in time. For example, if the predicted flow data indicates a fast flow increase at 20:00 on a certain day in January 2021, the number of instances may be increased at 19:50 on that day.
Specifically, referring to fig. 4, the process of adjusting the number of instances of the online service in this step is shown, where fig. 4 is a schematic specific flowchart of adjusting the number of instances of the online service according to another embodiment of the present application. As shown in fig. 4, the specific process for adjusting the number of instances of the online service provided by this embodiment may include:
step S401, a first difference value of the predicted flow values corresponding to the adjacent moments is determined.
It should be noted that the predicted flow data includes a predicted flow value corresponding to each moment in the preset time period, where the moments may be spaced one second apart or two seconds apart, and the spacing may be adjusted according to requirements.
In this step, adjacent moments are two neighbouring moments within the preset time period. For example, if moment A and moment B are adjacent, moment A corresponds to predicted flow value a and moment B to predicted flow value b, then one first difference is b - a; if there is also a moment C such that the order from earliest to latest is A, B, C, then C and B are adjacent moments, and with C corresponding to predicted flow value c, another first difference is c - b. In this embodiment, a first difference is obtained by subtracting the predicted flow value at the earlier of two adjacent moments from the predicted flow value at the later one.
Taking a spacing of 1 minute between moments as an example, the predicted flow data for 5 minutes can be as shown in Table 1.
Time of day        | Predicted flow value
2021-01-01 8:01    | 200
2021-01-01 8:02    | 210
2021-01-01 8:03    | 205
2021-01-01 8:04    | 500
2021-01-01 8:05    | 700
TABLE 1
According to the data in Table 1, 4 first differences are obtained in this step, which are, in order, 210 - 200 = 10, 205 - 210 = -5, 500 - 205 = 295, and 700 - 500 = 200.
Step S402, if a first difference value larger than a first threshold value exists, increasing the number of instances of the online service at a target time, wherein the target time is a time before a preset time period and is away from the starting time of the preset time period by a preset time length.
In the foregoing step, a plurality of first difference values are obtained, so that in this step, it is necessary to sequentially determine whether each first difference value is greater than a first threshold, and as long as the first difference value greater than the first threshold occurs, it is determined that the first difference value greater than the first threshold exists, and then the number of instances of the online service is increased at the target time.
In this embodiment, the target time may be a time before the preset time period, separated from the starting time of the preset time period by a preset time length. Taking the preset time period of Table 1 (8:01 to 8:05 on January 1, 2021) as an example, if the preset time length is 6 minutes, the target time is 7:55 on January 1, 2021.
Because the length of the preset time period can be set as required and may be one week or even longer, the target time can also be determined more precisely, for example as the time that is a preset time length before the earliest moment corresponding to a first difference larger than the first threshold. For example, if the first threshold is 200, the qualifying first differences are 295 and 200; 295 corresponds to the interval from 8:03 to 8:04 on January 1, 2021, and 200 corresponds to the interval from 8:04 to 8:05 on January 1, 2021. The earliest of these moments is 8:03 on January 1, 2021, so if the preset time length is 6 minutes, the target time is 7:57 on January 1, 2021.
Certainly, within one preset time period there may be several stages with large flow changes. When determining target times, adjacent first differences can be compared: all first differences are first arranged in the chronological order of their corresponding moments and then compared two by two; if the earlier first difference is smaller than the first threshold while the next one is larger than the first threshold, the time that is a preset time length before the earlier moment corresponding to the first difference larger than the first threshold is taken as a target time.
And S403, if the first difference value smaller than the second threshold value exists, reducing the number of the instances of the online service at the target moment.
The process of determining the target time in this step is similar to that in step S402, and is not described here again.
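A minimal Python sketch of steps S401 to S403 is given below, assuming the predicted flow data arrives as chronologically ordered (moment, predicted flow value) pairs; the function name plan_scaling, the 6-minute lead time and the threshold values are illustrative assumptions, not requirements of the method:

from datetime import datetime, timedelta

def plan_scaling(predicted, first_threshold, second_threshold, lead_time=timedelta(minutes=6)):
    """predicted: list of (datetime, predicted flow value) pairs ordered in time.

    Returns (target_time, action) tuples. Each target time is taken a preset lead
    time before the earlier of the two adjacent moments whose first difference
    crosses a threshold, so the scaling completes before the predicted change.
    """
    actions = []
    for (t_prev, v_prev), (t_next, v_next) in zip(predicted, predicted[1:]):
        first_diff = v_next - v_prev           # later value minus earlier value
        if first_diff > first_threshold:
            actions.append((t_prev - lead_time, "scale_out"))
        elif first_diff < second_threshold:
            actions.append((t_prev - lead_time, "scale_in"))
    return actions

# Data of Table 1: with a first threshold of 200, the only first difference strictly
# above the threshold is 295 (8:03 -> 8:04), so a scale-out is planned for 7:57.
table1 = [
    (datetime(2021, 1, 1, 8, 1), 200),
    (datetime(2021, 1, 1, 8, 2), 210),
    (datetime(2021, 1, 1, 8, 3), 205),
    (datetime(2021, 1, 1, 8, 4), 500),
    (datetime(2021, 1, 1, 8, 5), 700),
]
print(plan_scaling(table1, first_threshold=200, second_threshold=-200))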
In addition, the number of instances increased or decreased in the above process may be determined according to the threshold range in which the predicted flow value is located, for example, a mapping relation table between the threshold range and the number of instances is preset, which may specifically refer to table 2.
Threshold range | Number of instances
[400, 500)      | 2
[500, 600)      | 6
[600, 700)      | 10
TABLE 2
When there are multiple predicted flow values, the predicted flow value at each peak location may be selected and looked up in Table 2.
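The mapping of Table 2 can be applied, for example, to the peak predicted flow value; the following sketch simply mirrors Table 2, and the data structure and function name are illustrative assumptions:

# Instance counts per threshold range, mirroring Table 2.
INSTANCE_TABLE = [
    ((400, 500), 2),
    ((500, 600), 6),
    ((600, 700), 10),
]

def instances_for_peak(predicted_values):
    """Look up the instance count for the peak predicted flow value."""
    peak = max(predicted_values)
    for (low, high), count in INSTANCE_TABLE:
        if low <= peak < high:
            return count
    return None  # peak lies outside the configured ranges

print(instances_for_peak([480, 520]))  # peak 520 falls in [500, 600) -> 6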
Step S204, acquiring actual flow data of the online service in a preset time period after the number of the instances is adjusted, and judging whether the actual flow data and the predicted flow data meet a first preset condition.
It should be noted that the preset time period in this step and the preset time period in step S202 are the same time period. The actual traffic data and the predicted traffic data may be data of a certain preset index, where the preset index is an index preset in this embodiment that best represents the service traffic, such as the request traffic (QPS) or delay (latency) on-line index.
Specifically, determining whether the actual flow data and the predicted flow data satisfy the first preset condition serves to determine whether the currently deployed prediction model is still suitable for the online service with the current number of instances, so the determination may be made by checking whether the error between the actual flow data and the predicted flow data is within an acceptable range. Reference may be made to fig. 5, which is a schematic flow diagram, provided by another embodiment of the present application, of determining whether the actual flow data and the predicted flow data satisfy the first preset condition.
As shown in fig. 5, the process of determining whether the actual flow data and the predicted flow data satisfy the first preset condition may include:
step S501, determining a first average value of all actual flow values and a second average value of all predicted flow values, and determining a second difference value of the first average value and the second average value.
Because the actual flow data includes an actual flow value corresponding to each moment within the preset time period, and the predicted flow data includes a predicted flow value corresponding to each moment within the preset time period, in this step the average of all actual flow values (the first average value) and the average of all predicted flow values (the second average value) can be taken.
Step S502, if a second difference value between the first average value and the second average value is larger than a third threshold value, the actual flow data and the predicted flow data are judged to meet a first preset condition.
Step S503, if the second difference between the first average value and the second average value is not greater than the third threshold, determining that the actual flow data and the predicted flow data do not satisfy the first preset condition.
In steps S502 and S503, a difference between the first average value and the second average value, that is, a second difference value, is calculated, and when the second difference value is greater than a third threshold value, it is determined that the actual flow data and the predicted flow data satisfy a first preset condition. If the second difference is greater than the third threshold, it indicates that the difference between the actual flow data and the predicted flow data is too large, that is, the currently deployed prediction model is no longer suitable for the online services of the current number of instances.
Of course, the manner of determining whether the currently deployed prediction model is still suitable for the online service with the current number of instances is not limited to the scheme shown in fig. 5. Alternatively, the absolute value of the difference between the actual flow value and the predicted flow value at each moment may be obtained, the average of all these absolute values determined, and it is then judged whether that average is greater than a fourth threshold; if so, the first preset condition is satisfied, and if not, it is not. In this way, the influence of the signs of the individual differences on the error statistics in the previous example is avoided.
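Both the mean-difference criterion of steps S501 to S503 and the alternative mean-absolute-error criterion mentioned above can be expressed compactly; the sketch below is illustrative only and the threshold parameters are placeholders:

from statistics import mean

def needs_model_update(actual_values, predicted_values, third_threshold, fourth_threshold=None):
    """First preset condition check (steps S501-S503).

    Returns True when the deployed model no longer fits the current number of
    instances. If fourth_threshold is given, the alternative criterion based on
    the mean absolute error is used instead, avoiding sign cancellation.
    """
    if fourth_threshold is not None:
        mae = mean(abs(a - p) for a, p in zip(actual_values, predicted_values))
        return mae > fourth_threshold
    second_difference = mean(actual_values) - mean(predicted_values)
    return second_difference > third_threshold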
And step S205, if the first preset condition is satisfied, the currently deployed prediction model is updated to the latest prediction model obtained by training on the data of all indexes collected in real time together with the predicted flow data acquired each time.
It should be noted that the present embodiment may further include a process of training the prediction model, specifically, refer to fig. 6, where fig. 6 is a schematic flowchart of a process of training the prediction model according to another embodiment of the present application.
As shown in fig. 6, the process of training the prediction model in this embodiment may include:
step S601, under the condition that the predicted flow data is obtained each time, determining training sample data according to the data of all indexes and the predicted flow data which are collected in real time.
As described above, in each round of training, the contribution indexes whose correlation coefficient with the preset index is greater than the preset threshold are determined, and these contribution indexes are marked as the indexes required by the prediction model obtained in that round of training; the required indexes are the target indexes mentioned in step S201.
Therefore, when determining the training sample data, the contribution index and the gradient index may be determined first, specifically refer to fig. 7, where fig. 7 is a schematic flowchart of a process of determining the training sample data according to another embodiment of the present application. As shown in fig. 7, the process of determining training sample data provided in this embodiment may include:
step S701, determining the correlation coefficients of other indexes except the preset index and the preset index in all the indexes based on the data of all the indexes collected in real time.
In this step, all indexes corresponding to the online service are mainly classified into three categories, namely, application indexes, system indexes and other indexes, including but not limited to the following indexes:
The application indexes are as follows: QPS (service requests per second), Latency-P50 (P50 quantile latency), Latency-P90 (P90 quantile latency), Latency-P99 (P99 quantile latency), and Type (service type index, e.g. batch processing or deep learning).
The system indexes are as follows: CPUUtils (CPU utilization), MEMUtils (memory utilization), NetIn/Out (network ingress/egress traffic), DiskIOPS (storage read/write operations).
Other indexes are as follows: year (index collection Year information), Month (index collection Month information), Day (index collection Day information), Hour (index collection Hour information), Min (index collection minute information), Holiday (Holiday information), Activity (promotional campaign information).
In this embodiment, the output of the prediction model is generally data of indexes such as the request amount or the delay. The various indexes have different degrees of correlation with the index corresponding to the model's output data, and indexes with weaker correlations contribute little to training the prediction model and may even be misleading.
In this step, a correlation coefficient between all the indexes and a preset index is calculated, where the preset index is an index determined in advance from all indexes corresponding to the online service, and is generally an index capable of representing a flow change trend, such as a request amount or a delay.
When calculating the correlation coefficient of two indexes (any index and a preset index), the calculation may be performed with reference to the following formula:
ρ = E[(y1 - μ1)(y2 - μ2)] / (σ1 σ2)
where E is the expectation operator, μ1 and μ2 are the means of the indexes y1 and y2, σ1 and σ2 are the standard deviations of y1 and y2, and ρ is the correlation coefficient of y1 and y2.
Step S702, determining the index with the correlation coefficient with the preset index larger than the preset threshold value as the contribution index.
In this step, the preset threshold may be ρ_threshold, whose value may lie in the interval [0.5, 0.8]. If y1 is the preset index, y2 is some other index, and ρ > ρ_threshold, then y2 and y1 are considered strongly correlated and y2 can be used as a contribution index.
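For illustration, the correlation statistic of the formula above and the contribution-index screening of step S702 could be computed as follows (the preset index name "qps" and the threshold value 0.6 are assumptions chosen from the [0.5, 0.8] interval mentioned above):

from math import sqrt
from statistics import mean

def correlation(y1, y2):
    """Pearson correlation coefficient rho between two equally long series."""
    mu1, mu2 = mean(y1), mean(y2)
    cov = mean((a - mu1) * (b - mu2) for a, b in zip(y1, y2))
    sigma1 = sqrt(mean((a - mu1) ** 2 for a in y1))
    sigma2 = sqrt(mean((b - mu2) ** 2 for b in y2))
    return cov / (sigma1 * sigma2)

def contribution_indexes(all_metrics, preset_index="qps", rho_threshold=0.6):
    """Indexes whose correlation with the preset index exceeds the threshold.

    all_metrics: dict of index name -> list of values collected in real time.
    """
    reference = all_metrics[preset_index]
    return [name for name, series in all_metrics.items()
            if name != preset_index and correlation(series, reference) > rho_threshold]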
Step S703 is to determine the time-series gradient index data of the actual flow data based on the actual flow data.
In this step, the time-series gradient feature may be determined based on the actual flow data, and may specifically be calculated by the following formula: Δ(y_t) = y_t - y_{t-1}, where y_t represents the actual flow value at moment t and y_{t-1} represents the actual flow value at moment t-1. That is, the third difference of the actual flow values corresponding to every two adjacent moments is determined, and the data of the time-series gradient index of the actual flow data is then determined from all the third differences so obtained.
Step S704, in the case of acquiring the predicted flow data each time, determining data of an error gradient index between the actual flow data and the predicted flow data based on the actual flow data and the predicted flow data.
In this step, the error gradient feature may be determined based on the actual flow data and the predicted flow data. Specifically, for each moment of the preset time period, a fourth difference between the actual flow value and the predicted flow value corresponding to that moment may be determined, and the data of the error gradient index between the actual flow data and the predicted flow data may then be determined from all the fourth differences so obtained.
In one specific example, the fourth difference may be calculated using the following formula: Δy_t = y_t - y'_t, where y_t represents the actual flow value at moment t and y'_t represents the predicted flow value at moment t.
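The two gradient features can be computed directly from the formulas above; the following sketch is illustrative and the function names are not taken from the patent:

def timing_gradient(actual_values):
    """Time-series gradient: delta(y_t) = y_t - y_{t-1} over the actual flow values."""
    return [curr - prev for prev, curr in zip(actual_values, actual_values[1:])]

def error_gradient(actual_values, predicted_values):
    """Error gradient: delta_y_t = y_t - y'_t at each moment of the preset period."""
    return [a - p for a, p in zip(actual_values, predicted_values)]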
Step S705, determining data of the contribution index, data of the timing gradient index, and data of the error gradient index as training sample data.
In this step, after the training sample data is obtained, the prediction model can be trained based on the training sample data.
Step S706, the contribution index, the time sequence gradient index and the error gradient index are marked as target indexes of the currently trained prediction model.
The contribution index, the time sequence gradient index and the error gradient index can be marked as the target indexes of the currently trained prediction model, and after the currently trained prediction model goes online, these target indexes correspond to the currently deployed prediction model.
Step S602, training the prediction model according to the training sample data to obtain an updated parameter.
In this step, a gradient descent algorithm may be used to determine the target model parameters that would minimize the predetermined loss function of the prediction model.
Gradient descent is an iterative method: when solving for the minimum of the loss function, the solution can be approached step by step, yielding the minimized loss function and the corresponding model parameter values. That is, using the gradient descent algorithm and taking the actual flow data and the predicted flow data as the basis, the target model parameters at which the preset loss function reaches its minimum are obtained iteratively.
The prediction model may be an LSTM prediction model, whose structure may be as shown in fig. 8, a schematic structural diagram of an LSTM prediction model provided by another embodiment of the present application. It contains 6 computing units (LSTM cells), each with its own computing parameters; the computing parameters of all the computing units together form the model parameters of the entire model. In fig. 8, X(t-2), X(t-1), X(t+1), and X(t+2) are inputs, and Y(t+1), Y(t+2), and Y(t+3) are outputs.
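As an illustrative sketch only (the patent does not name a framework; PyTorch, the layer sizes, the Adam optimizer as the gradient-descent variant, and the single-layer LSTM followed by a linear head are all assumptions of this example, and the structure is simplified relative to the six-cell diagram of fig. 8), training that produces the update parameters and the later overlay onto the deployed model could look as follows:

import torch
from torch import nn

class FlowLSTM(nn.Module):
    """Small LSTM forecaster used as a stand-in for the prediction model of Fig. 8."""
    def __init__(self, n_features, hidden_size=32, horizon=3):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, horizon)      # Y(t+1) .. Y(t+horizon)

    def forward(self, x):                                # x: (batch, seq_len, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :])                  # predict from the last time step

def train_update_parameters(model, samples, targets, epochs=50, lr=1e-3):
    """Gradient-descent training that yields the 'update parameters'."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()                               # preset loss function
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(samples), targets)
        loss.backward()
        optimizer.step()
    return model.state_dict()                            # parameters to overlay later

# Overlaying the newest parameters onto the currently deployed model:
#   deployed_model.load_state_dict(update_parameters)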
In this embodiment, target index data is obtained first, where the target index data is data of pre-marked indexes required by the currently deployed prediction model for prediction. The target index data is then input into the currently deployed prediction model for prediction, and predicted flow data of the online service output by the currently deployed prediction model for a preset time period is obtained. The number of instances of the online service is then adjusted according to the predicted flow data, the actual flow data of the online service in the preset time period after the number of instances is adjusted is obtained, and it is judged whether the actual flow data and the predicted flow data satisfy the first preset condition; if so, the currently deployed prediction model is updated to the latest prediction model obtained by training on the data of all indexes collected in real time together with the predicted flow data acquired each time. Based on this, actual flow data within the preset time period of the online service is obtained each time after the number of instances is adjusted, and when the actual flow data and the predicted flow data satisfy the first preset condition, this indicates that the currently deployed prediction model is no longer suitable for the online service with the current number of instances; the currently deployed prediction model is therefore updated to the latest prediction model obtained by training on the data of all indexes collected in real time together with the predicted flow data acquired each time, so that the currently deployed prediction model remains suitable, in real time, for an online service whose number of instances is continuously adjusted.
Referring to fig. 9, fig. 9 is a schematic structural diagram of an apparatus for online service instance adjustment according to another embodiment of the present application.
As shown in fig. 9, the apparatus for adjusting an online service instance provided by this embodiment may include:
an obtaining module 901, configured to obtain target index data, where the target index data is data of pre-marked indexes required by the currently deployed prediction model for prediction;
a prediction module 902, configured to input the target index data into the currently deployed prediction model for prediction, and obtain predicted traffic data of the online service output by the currently deployed prediction model for a preset time period;
An adjusting module 903, configured to adjust the number of instances of the online service according to the predicted traffic data;
the judging module 904 is configured to obtain actual traffic data of the online service in a preset time period after the number of the instances is adjusted, and judge whether the actual traffic data and the predicted traffic data meet a first preset condition;
and an updating module 905, configured to, if the first preset condition is satisfied, update the currently deployed prediction model to the latest prediction model obtained by training on the data of all indexes collected in real time together with the predicted flow data acquired each time.
In an optional embodiment, the obtaining module comprises:
the first calling unit is used for calling a target index corresponding to the currently deployed prediction model, and the target index comprises a gradient index;
and the screening unit is used for screening the data corresponding to the target index from the data of all the indexes collected in real time to obtain the target index data.
In an optional embodiment, the predicted flow data includes a predicted flow value corresponding to each time in a preset time period;
the adjustment module includes:
the first determining unit is used for determining a first difference value of the predicted flow numerical values corresponding to adjacent moments;
the online service monitoring device comprises an increasing unit, a judging unit and a judging unit, wherein the increasing unit is used for increasing the number of instances of online service at a target time if a first difference value larger than a first threshold value exists, and the target time is a time which is before a preset time period and is away from the starting time of the preset time period by a preset time length;
and the reducing unit is used for reducing the number of the instances of the online service at the target moment if a first difference value smaller than a second threshold value exists.
In an optional embodiment, the actual flow data includes an actual flow value corresponding to each time in a preset time period, and the predicted flow data includes a predicted flow value corresponding to each time in the preset time period;
the judging module comprises:
the second determining unit is used for determining a first average value of all actual flow values and a second average value of all predicted flow values, and determining a second difference value of the first average value and the second average value;
the first judging unit is used for judging that the actual flow data and the predicted flow data meet a first preset condition if the second difference value is larger than a third threshold value;
and the second judging unit is used for judging that the actual flow data and the predicted flow data do not meet the first preset condition if the second difference value is not larger than the third threshold value.
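A sketch of the first preset condition follows, assuming the second difference is taken as the signed difference between the two averages (the embodiment does not state whether it is signed or absolute):

import numpy as np

def first_condition_met(actual_values, predicted_values, third_threshold):
    first_average = float(np.mean(actual_values))       # average of actual flow values
    second_average = float(np.mean(predicted_values))   # average of predicted flow values
    second_difference = first_average - second_average
    return second_difference > third_threshold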
In an alternative embodiment, the apparatus further comprises: the third determining unit is used for determining training sample data according to the data of all indexes acquired in real time and the predicted flow data under the condition that the predicted flow data is acquired each time; the training unit is used for training the prediction model according to the training sample data to obtain an updated parameter;
the update module includes: the second calling unit is used for calling the update parameters of the current latest prediction model; and the updating unit is used for updating the model parameters of the currently deployed prediction model into updated parameters so as to update the currently deployed prediction model into a prediction model obtained by training according to the data of all indexes acquired in real time and the predicted flow data acquired each time.
In an optional embodiment, the actual flow data is data belonging to a preset index in all index data collected in real time;
the third determination unit includes: the first determining subunit is used for determining the correlation coefficients of other indexes except the preset index and the preset index in all the indexes based on the data of all the indexes collected in real time; a second determining subunit, configured to determine, as a contribution index, an index whose correlation coefficient with the preset index is greater than a preset threshold; the third determining subunit is used for determining the data of the time sequence gradient index of the actual flow data based on the actual flow data; the fourth determining subunit is configured to determine, based on the actual flow data and the predicted flow data, data of an error gradient index between the actual flow data and the predicted flow data each time the predicted flow data is obtained; a fifth determining subunit, configured to determine data of the contribution index, data of the time sequence gradient index, and data of the error gradient index as training sample data; and the marking subunit is used for marking the contribution index, the time sequence gradient index and the error gradient index as target indexes of the currently trained prediction model.
In an optional embodiment, the actual flow data includes an actual flow value corresponding to each time in a preset time period;
the third determining subunit includes: a sixth determining subunit, configured to determine a third difference between actual flow values corresponding to every two adjacent time instants; and the seventh determining subunit is used for determining the data of the time sequence gradient index of the actual flow data according to all the determined third difference values.
In an optional embodiment, the actual flow data includes an actual flow value corresponding to each time in a preset time period, and the predicted flow data includes a predicted flow value corresponding to each time in the preset time period;
the fourth determining subunit includes: the eighth determining subunit is configured to determine, for the same time of the preset time period, a fourth difference between the actual flow value and the preset flow value corresponding to the same time; a ninth determining subunit, configured to determine data of an error gradient index between the actual flow data and the predicted flow data, based on all the fourth difference values determined.
In an alternative embodiment, the first determining subunit includes: and the tenth determining subunit is used for determining a correlation coefficient between the index and the preset index according to the data of the index by using a preset correlation statistical algorithm for any index.
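The Pearson correlation coefficient is one possible choice of the preset correlation statistical algorithm (this application does not name a specific one); a sketch:

import numpy as np

def correlation_with_preset_index(index_series, preset_index_series):
    # Pearson correlation between one collected index and the preset (traffic) index.
    return float(np.corrcoef(index_series, preset_index_series)[0, 1])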
Referring to fig. 10, fig. 10 is a schematic structural diagram of an electronic device according to another embodiment of the present application.
As shown in fig. 10, the electronic device provided in this embodiment includes: at least one processor 1001, a memory 1002, at least one network interface 1003, and other user interfaces 1004. The various components of the electronic device 1000 are coupled together by a bus system 1005. It is understood that the bus system 1005 is used to enable communication among the connected components. In addition to a data bus, the bus system 1005 includes a power bus, a control bus, and a status signal bus; for clarity, however, the various buses are all labeled in fig. 10 as the bus system 1005. The user interface 1004 may include, among other things, a display, a keyboard, or a pointing device (e.g., a mouse, a trackball, a touch pad, or a touch screen).
It is to be understood that the memory 1002 in the embodiments of the present invention may be volatile memory or non-volatile memory, or may include both. The non-volatile memory may be a Read-Only Memory (ROM), a Programmable ROM (PROM), an Erasable PROM (EPROM), an Electrically Erasable PROM (EEPROM), or a flash memory. The volatile memory may be a Random Access Memory (RAM), which is used as an external cache. By way of illustration and not limitation, many forms of RAM are available, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced Synchronous DRAM (ESDRAM), Synchlink DRAM (SLDRAM), and Direct Rambus RAM (DRRAM). The memory 1002 described herein is intended to comprise, without being limited to, these and any other suitable types of memory.
In some embodiments, the memory 1002 stores the following elements, executable units or data structures, or a subset or an expanded set thereof: an operating system 10021 and an application program 10022. The operating system 10021 includes various system programs, such as a framework layer, a core library layer, and a driver layer, for implementing various basic services and processing hardware-based tasks. The application program 10022 includes various application programs, such as a Media Player and a Browser, for implementing various application services. A program implementing the method of the embodiment of the present invention may be included in the application program 10022. In the embodiment of the present invention, the processor 1001 is configured to execute the method steps provided by the foregoing method embodiments by calling a program or instructions stored in the memory 1002, specifically a program or instructions stored in the application program 10022.
The method disclosed in the embodiments of the present invention may be applied to, or implemented by, the processor 1001. The processor 1001 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the processor 1001 or by instructions in the form of software. The processor 1001 may be a general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components, and may implement or perform the methods, steps and logic blocks disclosed in the embodiments of the present invention. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the method disclosed in connection with the embodiments of the present invention may be directly embodied as being performed by a hardware decoding processor, or performed by a combination of hardware and software elements in a decoding processor. The software elements may be located in RAM, flash memory, ROM, PROM or EEPROM, registers, or other storage media well known in the art. The storage medium is located in the memory 1002, and the processor 1001 reads the information in the memory 1002 and completes the steps of the method in combination with its hardware.
It is to be understood that the embodiments described herein may be implemented in hardware, software, firmware, middleware, microcode, or any combination thereof. For a hardware implementation, the Processing units may be implemented in one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), general purpose processors, controllers, micro-controllers, microprocessors, other electronic units configured to perform the functions of the present Application, or a combination thereof.
For a software implementation, the techniques herein may be implemented by means of units performing the functions herein. The software codes may be stored in a memory and executed by a processor. The memory may be implemented within the processor or external to the processor.
The embodiment of the invention also provides a storage medium (a computer-readable storage medium). The storage medium stores one or more programs. The storage medium may include volatile memory, such as random access memory; it may also include non-volatile memory, such as read-only memory, flash memory, a hard disk, or a solid state disk; it may also include a combination of the above kinds of memory. When the one or more programs in the storage medium are executed by one or more processors, the method for adjusting the number of instances of the online service executed on the electronic device side is implemented. That is, the processor executes the program, stored in the memory, for adjusting the number of instances of the online service, so as to implement the steps of the method provided by the foregoing embodiments and executed on the electronic device side.
With regard to the apparatus in the above embodiment, the specific manner in which each module performs its operations has been described in detail in the embodiments related to the method and will not be elaborated here. It is understood that the same or similar parts among the above embodiments may refer to one another, and content not described in detail in one embodiment may refer to the same or similar content in other embodiments.

It should be noted that, in the description of the present application, the terms "first", "second", and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. Further, in the description of the present application, "a plurality" means at least two unless otherwise specified.

Any process or method description in a flow chart, or otherwise described herein, may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing specific logical functions or steps of the process. The scope of the preferred embodiments of the present application also includes implementations in which functions are executed out of the order shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those skilled in the art of the present application.
It should be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware related to instructions of a program, which may be stored in a computer readable storage medium, and when the program is executed, the program includes one or a combination of the steps of the method embodiments. In addition, functional units in the embodiments of the present application may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disk, or the like.

In the description herein, reference to the terms "one embodiment", "some embodiments", "an example", "a specific example", or "some examples" means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of these terms do not necessarily refer to the same embodiment or example, and the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

Although embodiments of the present application have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present application, and that variations, modifications, substitutions and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the present application.

Claims (12)

1. A method for adjusting a number of instances of an online service, comprising:
acquiring target index data, wherein the target index data is data of indexes, marked in advance, that are required by a currently deployed prediction model for prediction;
inputting the target index data into the currently deployed prediction model for prediction, and acquiring predicted flow data of the online service output by the currently deployed prediction model within a preset time period;
adjusting the number of instances of the online service according to the predicted flow data;
acquiring actual flow data of the online service in the preset time period after the number of the instances is adjusted, and judging whether the actual flow data and the predicted flow data meet a first preset condition;
and if so, updating the currently deployed prediction model to the latest prediction model obtained by training according to the data of all indexes collected in real time and the predicted flow data acquired each time.
2. The method of claim 1, wherein the obtaining target metric data comprises:
calling a target index corresponding to a currently deployed prediction model, wherein the target index comprises a gradient index;
and screening data corresponding to the target index from the data of all indexes collected in real time to obtain target index data.
3. The method of claim 1, wherein the predicted flow data comprises a predicted flow value corresponding to each time in the preset time period;
the adjusting the number of instances of the online service according to the predicted traffic data comprises:
determining a first difference value of predicted flow values corresponding to adjacent moments;
if a first difference value larger than a first threshold value exists, increasing the number of the instances of the online service at a target time, wherein the target time is a time before the preset time period and is away from the starting time of the preset time period by a preset time length;
and if a first difference value smaller than a second threshold value exists, reducing the number of the instances of the online service at the target moment.
4. The method according to any one of claims 1 to 3, wherein the actual flow data includes an actual flow value corresponding to each time in the preset time period, and the predicted flow data includes a predicted flow value corresponding to each time in the preset time period;
the determining whether the actual flow data and the predicted flow data satisfy a first preset condition includes:
determining a first average value of all actual flow values and a second average value of all predicted flow values, and determining a second difference value of the first average value and the second average value;
if the second difference is larger than a third threshold, judging that the actual flow data and the predicted flow data meet a first preset condition;
and if the second difference is not larger than a third threshold, judging that the actual flow data and the predicted flow data do not meet a first preset condition.
5. The method of claim 1, further comprising:
under the condition that the predicted flow data is obtained every time, determining training sample data according to data of all indexes collected in real time and the predicted flow data;
training a prediction model according to the training sample data to obtain an updated parameter;
the updating the currently deployed prediction model to the latest prediction model obtained by training according to the data of all indexes collected in real time and the predicted flow data acquired each time comprises:
calling the update parameters of the current latest prediction model;
and updating the model parameters of the currently deployed prediction model into the updated parameters so as to update the currently deployed prediction model into a prediction model obtained by training according to the data of all indexes acquired in real time and the predicted flow data acquired each time.
6. The method according to claim 5, wherein the actual flow data is data belonging to a preset index among data of all indexes collected in real time;
the method for determining training sample data according to the data of all indexes collected in real time and the predicted flow data under the condition that the predicted flow data is obtained each time comprises the following steps:
determining the correlation coefficients of other indexes except the preset index and the preset index in all the indexes based on the data of all the indexes collected in real time;
determining an index of which the correlation coefficient with a preset index is greater than a preset threshold value as a contribution index;
determining data of a time sequence gradient index of the actual flow data based on the actual flow data;
under the condition that predicted flow data are obtained each time, determining data of an error gradient index between the actual flow data and the predicted flow data based on the actual flow data and the predicted flow data;
determining the data of the contribution index, the data of the time sequence gradient index and the data of the error gradient index as the training sample data;
and marking the contribution index, the time sequence gradient index and the error gradient index as target indexes of the currently trained prediction model.
7. The method according to claim 6, wherein the actual flow data includes an actual flow value corresponding to each time in the preset time period;
the determining data of a time series gradient index of the actual flow data based on the actual flow data comprises:
determining a third difference value of the actual flow numerical values corresponding to every two adjacent moments;
and determining the data of the time sequence gradient index of the actual flow data according to all the determined third difference values.
8. The method according to claim 6, wherein the actual flow data includes an actual flow value corresponding to each time in the preset time period, and the predicted flow data includes a predicted flow value corresponding to each time in the preset time period;
the determining data of an error gradient indicator between the actual flow data and the predicted flow data based on the actual flow data and the predicted flow data comprises:
for the same moment in the preset time period, determining a fourth difference value between the actual flow value and the predicted flow value corresponding to that moment;
and determining data of an error gradient index between the actual flow data and the predicted flow data according to all the determined fourth difference values.
9. The method according to claim 6, wherein the determining the correlation coefficient between each of the other indexes except the preset index and the preset index based on the data of all indexes collected in real time comprises:
and for any index, determining a correlation coefficient of the index and the preset index according to the data of the index by using a preset correlation statistical algorithm.
10. An apparatus for adjusting the number of instances of an online service, the apparatus comprising:
an acquiring module, configured to acquire target index data, wherein the target index data is data of indexes, marked in advance, that are required by a currently deployed prediction model for prediction;
a prediction module, configured to input the target index data into the currently deployed prediction model for prediction, and to acquire predicted traffic data of the online service output by the currently deployed prediction model within a preset time period;
The adjusting module is used for adjusting the number of the instances of the online service according to the predicted flow data;
the judging module is used for acquiring actual flow data of the online service in the preset time period after the number of the instances is adjusted, and judging whether the actual flow data and the predicted flow data meet a first preset condition or not;
and an updating module, configured to, if the first preset condition is satisfied, update the currently deployed prediction model to the latest prediction model obtained by training according to the data of all indexes collected in real time and the predicted traffic data acquired each time.
11. An electronic device, comprising: at least one processor and memory;
the processor is configured to execute the program for adjusting the number of instances of the online service stored in the memory to implement the method for adjusting the number of instances of the online service according to any one of claims 1 to 9.
12. A storage medium storing one or more programs which, when executed, implement the method for adjusting the number of instances of an online service of any one of claims 1 to 9.
CN202110518066.XA 2021-05-12 2021-05-12 Method and device for adjusting number of instances of online service and electronic equipment Active CN113518000B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110518066.XA CN113518000B (en) 2021-05-12 2021-05-12 Method and device for adjusting number of instances of online service and electronic equipment


Publications (2)

Publication Number Publication Date
CN113518000A true CN113518000A (en) 2021-10-19
CN113518000B CN113518000B (en) 2023-04-07

Family

ID=78064216

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110518066.XA Active CN113518000B (en) 2021-05-12 2021-05-12 Method and device for adjusting number of instances of online service and electronic equipment

Country Status (1)

Country Link
CN (1) CN113518000B (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108632365A (en) * 2018-04-13 2018-10-09 腾讯科技(深圳)有限公司 Service Source method of adjustment, relevant apparatus and equipment
CN110377031A (en) * 2019-06-28 2019-10-25 炬星科技(深圳)有限公司 Motion model update method, device, electronic equipment and storage medium
CN110619423A (en) * 2019-08-06 2019-12-27 平安科技(深圳)有限公司 Multitask prediction method and device, electronic equipment and storage medium
CN111104299A (en) * 2019-11-29 2020-05-05 山东英信计算机技术有限公司 Server performance prediction method and device, electronic equipment and storage medium
CN111030869A (en) * 2019-12-20 2020-04-17 锐捷网络股份有限公司 Network traffic prediction method and prediction device
CN111198808A (en) * 2019-12-25 2020-05-26 东软集团股份有限公司 Method, device, storage medium and electronic equipment for predicting performance index

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114115915A (en) * 2021-11-25 2022-03-01 马上消费金融股份有限公司 Deployment method and device of application instance
CN114443283A (en) * 2021-12-29 2022-05-06 苏州浪潮智能科技有限公司 Application instance scaling method and device
CN114443283B (en) * 2021-12-29 2023-11-17 苏州浪潮智能科技有限公司 Method and device for stretching application instance

Also Published As

Publication number Publication date
CN113518000B (en) 2023-04-07


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant