CN114443310A - Resource scheduling method, device, equipment, medium and program product

Info

Publication number: CN114443310A
Application number: CN202210266825.2A
Authority: CN (China)
Prior art keywords: function, call, amount, time, calling
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 谢伟, 王磊, 周文泽, 刘慕雨
Current Assignee: Industrial and Commercial Bank of China Ltd (ICBC)
Original Assignee: Industrial and Commercial Bank of China Ltd (ICBC)
Application filed by Industrial and Commercial Bank of China Ltd (ICBC); priority to CN202210266825.2A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F 9/5011 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F 9/5016 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q 40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q 40/06 Asset management; Financial planning or analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 Protocols
    • H04L 67/10 Protocols in which an application is distributed across nodes in the network
    • H04L 67/1095 Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • General Physics & Mathematics (AREA)
  • Finance (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • Development Economics (AREA)
  • Marketing (AREA)
  • General Engineering & Computer Science (AREA)
  • General Business, Economics & Management (AREA)
  • Technology Law (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Human Resources & Organizations (AREA)
  • Operations Research (AREA)
  • Artificial Intelligence (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present disclosure provides a resource scheduling method based on function computation. The method comprises: predicting a second call amount for calling a first function at a second time based on a first call amount for calling the first function at a first time, wherein the first time is before the second time; and predicting the memory usage of the first function at the second time based on the second call amount. Before the second time arrives, the method further comprises: changing a predetermined number of function instances based on the second call amount, wherein the function instances are used for running the first function; and, in the case that the predetermined number of function instances is increased, allocating memory resources to each of the predetermined number of function instances based on the memory usage. The present disclosure also provides a resource scheduling apparatus, a device, a storage medium and a program product.

Description

Resource scheduling method, device, equipment, medium and program product
Technical Field
The present disclosure relates to the field of cloud computing, and more particularly, to a method, an apparatus, a device, a medium, and a program product for resource scheduling based on function computing.
Background
Function computation is a cloud computing technology based on a serverless architecture. It supports dynamically adjusting the required computing resources as the business transaction volume (and hence the function call volume) changes: function instances are automatically added during business peak periods to serve highly concurrent business requests, and automatically reduced during business valley periods so that computing resources can be allocated to other business functions that need them more.
In carrying out the inventive concept of the present disclosure, the inventors discovered that, in the related art, there is a certain delay in the resource scheduling process, and the memory size allocated when a function instance is created cannot be well adapted to the call volume of the function.
Disclosure of Invention
In view of the foregoing, the present disclosure provides methods, apparatuses, devices, media, and program products for improving resource scheduling efficiency and saving memory resources.
In one aspect of the embodiments of the present disclosure, a method for resource scheduling based on function computation is provided, including: predicting a second call amount for calling a first function at a second time based on a first call amount for calling the first function at a first time, wherein the first time is before the second time; and predicting the memory usage of the first function at the second time based on the second call amount. Before the second time arrives, the method further includes: changing a predetermined number of function instances based on the second call amount, wherein the function instances are used for running the first function; and, in the case that the predetermined number of function instances is increased, allocating memory resources to each of the predetermined number of function instances based on the memory usage.
According to an embodiment of the present disclosure, the predicting, based on the second call amount, the memory usage amount of the first function at the second time includes: inputting the second call quantity into a memory prediction model obtained in advance; and obtaining the memory usage amount output by the memory prediction model.
According to an embodiment of the present disclosure, the method further includes obtaining the memory prediction model, specifically including: obtaining a third calling amount for calling the first function at each third moment in the first time sequence; obtaining the memory usage amount of a single function instance of the first function at each third moment; and obtaining the memory prediction model according to the third calling amount and the memory usage amount of the single function instance.
According to an embodiment of the present disclosure, obtaining the memory prediction model according to the third call amount and the memory usage amount of the single function instance includes: obtaining a first fitting curve according to the third calling amount and the memory usage amount of the single function instance; determining parameters of the memory prediction model based on the first fitted curve.
According to an embodiment of the present disclosure, predicting a second call volume to call a first function at a second time based on a first call volume to call the first function at the first time comprises: inputting the first call quantity into a call prediction model obtained in advance; and obtaining the second calling amount output by the calling prediction model.
According to an embodiment of the present disclosure, the method further includes obtaining the call prediction model, specifically including: obtaining a fourth calling amount for calling the first function at each fourth moment in the second time sequence; and obtaining the calling prediction model according to each fourth moment and the corresponding fourth calling amount.
According to an embodiment of the present disclosure, the calling prediction model includes a time series model and a poisson distribution model, and obtaining the calling prediction model according to each fourth time and the corresponding fourth calling amount includes: and respectively determining parameters of the time series model and parameters of the Poisson distribution model according to each fourth moment and the corresponding fourth call quantity.
According to an embodiment of the present disclosure, after the time series model and the Poisson distribution model are obtained, predicting the second call amount for calling the first function at the second time includes: inputting the first call amount into the time series model and the Poisson distribution model respectively; and determining the second call amount based on the candidate call amounts respectively output by the time series model and the Poisson distribution model.
According to an embodiment of the present disclosure, obtaining the fourth call volume includes: obtaining a running log of the first function in the second time series; cleaning data in the running log according to the service keywords to obtain target log data; and obtaining the fourth calling amount according to the target log data.
According to an embodiment of the present disclosure, before changing the predetermined number of function instances based on the second call volume, the method further includes determining the predetermined number, specifically including: determining a second number of function instances at the second time based on the second call volume; the predetermined number is determined based on a difference between the second number and the first number.
According to an embodiment of the present disclosure, the method includes: in the case that the second number is greater than the first number, changing the predetermined number of function instances means increasing the function instances by the predetermined number; in the case that the second number is less than or equal to the first number, changing the predetermined number of function instances means decreasing the function instances by the predetermined number.
Another aspect of the embodiments of the present disclosure provides a resource scheduling apparatus, including: the device comprises a first prediction module, a second prediction module and a first function module, wherein the first prediction module is used for predicting a second calling amount for calling a first function at a second moment based on a first calling amount for calling the first function at the first moment, and the first moment is before the second moment; the second prediction module is used for predicting the memory usage of the first function at the second moment based on the second call amount; an instance changing module, configured to change a predetermined number of function instances based on the second call volume before the second time comes, where the function instances are used to run the first function; and the memory allocation module is used for allocating memory resources to each function instance in the function instances with the preset number based on the memory usage amount under the condition of increasing the function instances with the preset number.
Another aspect of the disclosed embodiments provides an electronic device, including: one or more processors; a storage device to store one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method as described above.
Yet another aspect of the embodiments of the present disclosure provides a computer-readable storage medium having stored thereon executable instructions, which when executed by a processor, cause the processor to perform the method as described above.
Yet another aspect of the disclosed embodiments provides a computer program product comprising a computer program that when executed by a processor implements the method as described above.
One or more of the above embodiments have the following advantageous effects: they can at least partially solve the problems that resource scheduling is delayed and that the memory size of a function instance cannot be well adapted to the function call volume. The function call amount at the second time is predicted in advance at the first time, and the predetermined number of function instances is changed in advance, so that the demand of the function call amount can be met at the second time; the memory usage is further predicted from the predicted second call amount, and when function instances are created, their initial memory size is allocated according to that memory usage, thereby saving memory resources and maximizing the overall resource scheduling of the function computing platform.
Drawings
The foregoing and other objects, features and advantages of the disclosure will be apparent from the following description of embodiments of the disclosure, which proceeds with reference to the accompanying drawings, in which:
fig. 1 schematically shows an application scenario diagram of a resource scheduling method according to an embodiment of the present disclosure;
FIG. 2 schematically shows a flow chart of a resource scheduling method according to an embodiment of the present disclosure;
FIG. 3 schematically illustrates a flow chart for determining a predetermined number according to an embodiment of the present disclosure;
FIG. 4 schematically illustrates a flow chart for predicting a second call amount according to an embodiment of the present disclosure;
FIG. 5 schematically illustrates a flow diagram for obtaining a call prediction model according to an embodiment of the present disclosure;
FIG. 6 schematically illustrates a flow chart for obtaining a fourth call amount according to an embodiment of the disclosure;
FIG. 7 schematically illustrates a flow diagram for predicting a second call amount, according to another embodiment of the present disclosure;
FIG. 8 schematically illustrates a flow chart of predicting memory usage in accordance with an embodiment of the present disclosure;
FIG. 9 schematically illustrates a flow diagram for obtaining a memory prediction model according to an embodiment of the disclosure;
FIG. 10 schematically illustrates a flow chart for determining parameters of a memory prediction model according to an embodiment of the present disclosure;
fig. 11 schematically shows a block diagram of a resource scheduling apparatus according to an embodiment of the present disclosure;
FIG. 12 schematically shows a block diagram of a log processing module according to an embodiment of the present disclosure;
FIG. 13 schematically shows a block diagram of a temporal sequence analysis module according to an embodiment of the present disclosure;
FIG. 14 schematically illustrates a block diagram of a Poisson distribution analysis module according to an embodiment of the present disclosure;
FIG. 15 schematically illustrates a block diagram of a function computation core engine module according to an embodiment of the present disclosure;
FIG. 16 is a flow chart of resource scheduling based on a resource scheduling apparatus according to an embodiment of the disclosure;
fig. 17 schematically shows a block diagram of an electronic device adapted to implement the resource scheduling method according to an embodiment of the present disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is illustrative only and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present disclosure.
Serverless computing does not mean that servers are no longer used; rather, the background servers are transparent to users, who no longer need to attend to complex deployment and maintenance problems. Based on the serverless architecture, a user-oriented function computing platform is provided, which takes the function as the granularity of service and offers functions for users to develop or use.
Function computation further abstracts and encapsulates the infrastructure and the software environment. Taking business application development as an example, management work at the underlying server level no longer needs attention, so development can focus on the application layer of the business logic; business development work can thereby be more specialized, supporting rapid iteration of business products.
Function computation can dynamically scale function instances. The current dynamic scaling approach measures the business transaction volume and the resource occupancy of functions in real time and then adjusts computing resources against preset thresholds (such as a transaction volume threshold or a function resource occupancy threshold). Such dynamic scaling carries a certain delay, so for business functions that face end users and are sensitive to transaction latency, the user experience is poor.
Taking the delay in dynamic capacity expansion as an example: on the one hand, a function instance has a cold-start problem; because the instance does not yet exist in the system, a function must be started, a runtime environment installed, and code deployed, and these steps take time. On the other hand, when resources are scheduled, idle function instances may have to be cleared and their resources allocated to functions that need expansion, which also takes time. Taking the delay in capacity reduction as an example: by the time the system detects that the transaction volume has decreased, some function instances have already been idle for a period of time, occupying resources throughout and causing waste.
Taking the allocation of memory resources during dynamic capacity expansion as an example, the preset memory size of a function instance is generally not changed: regardless of the business function, and regardless of how a given function is called at different times, a function instance is allocated a preset, fixed memory size at creation. In practice, the memory usage of function instances may differ as the function call volume differs. If the initial memory allocation is large but only a small portion is actually used, resources are wasted; moreover, with each function instance occupying a large amount of memory, insufficient memory resources may limit the number of function instances the whole function computing platform can host.
The resource scheduling method based on function computation provided by the embodiments of the present disclosure can at least partially solve the problems that resource scheduling is delayed and that the memory size of a function instance cannot be well adapted to the transaction volume. The function call amount at the second time is predicted in advance at the first time, and the predetermined number of function instances is changed in advance, so that the demand of the function call amount can be met at the second time; the memory usage is further predicted from the predicted second call amount, and when function instances are created, their initial memory size is allocated according to that memory usage, thereby saving memory resources and maximizing the overall resource scheduling of the function computing platform.
It should be noted that the resource scheduling method, apparatus, device, medium, and program product based on function computation according to the embodiments of the present disclosure may be used in cloud-computing-related aspects of the financial field, and may also be used in any field other than the financial field.
Fig. 1 schematically shows an application scenario diagram of a resource scheduling method according to an embodiment of the present disclosure.
As shown in fig. 1, the application scenario 100 according to this embodiment may include terminal devices 101, 102, 103, a cloud service 104, and a server 105. A network provides the medium for communication links between the terminal devices 101, 102, 103, the cloud service 104, and the server 105. The network may include various connection types, such as wired or wireless communication links, or fiber-optic cables.
A user may use the terminal devices 101, 102, 103 to interact with the server 105 through the network using the cloud service 104 to receive or send messages, etc. The terminal devices 101, 102, 103 may have installed thereon various communication client applications, such as shopping-like applications, web browser applications, search-like applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only).
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The cloud service 104 may be a cloud computing service implementing a function computing platform based on a serverless architecture, which may have function computing related services deployed or communicatively connected thereto. The server 105 may be a hardware resource such as a server providing various services, for example a background management server (for example only) providing support for websites browsed by users using the terminal devices 101, 102, 103. The background management server may analyze and perform other processing on the received data such as the user request, and feed back a processing result (e.g., a webpage, information, or data obtained or generated according to the user request) to the terminal device.
It should be understood that the number of terminal devices, cloud services, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as an implementation requires.
The resource scheduling method based on function calculation according to the embodiment of the present disclosure will be described in detail below with reference to fig. 2 to 10 based on the scenario described in fig. 1.
Fig. 2 schematically shows a flow chart of a resource scheduling method according to an embodiment of the present disclosure.
As shown in fig. 2, the resource scheduling method of this embodiment includes operations S210 to S240.
In operation S210, a second call amount to call the first function at a second time is predicted based on a first call amount to call the first function at the first time, wherein the first time is prior to the second time.
Illustratively, the first time may be any time point. For example, at a certain time point before a traffic peak arrives, the current actual number of function calls is obtained. The first function may be any function on the function computing platform. Taking financial business as an example, the function computing platform may provide services such as intelligent investment advisory, batch reconciliation, and account inquiry; the first function may belong to the intelligent investment advisory class and provide a fund backtesting service. The number of times the first function will be called at a future second time (i.e., the second call amount) is predicted from the number of times one or more users call the first function when using the fund backtesting service at the first time.
In operation S220, memory usage of the first function at a second time is predicted based on the second call amount.
Illustratively, the memory usage may be the memory usage of all function instances, or may be the memory usage of a single function instance. For example, if the memory usage of all function instances is predicted, the memory usage of a single function instance can be obtained in an average manner.
In an alternative embodiment, predictions may be made based on historical data. For example, if 100 function instances are typically turned on at 3 pm each day during peak traffic hours, the first time may be 2:30 pm and the second time 3 pm (for example only). The second call amount and the memory usage are predicted in advance from the call counts and memory usage observed at 3 pm each day in the historical data.
In another alternative embodiment, the prediction may be based on expert experience. For example, prediction rules for function instances are set according to the experience of function computing technologists. Specifically, given a preset call volume threshold or function resource occupancy threshold, several indicators may be defined from expert experience, such as a time indicator, a current idle-function indicator, an indicator of the regions where current users of the first function are located, or an indicator of the number of users currently on the platform. The second call amount and the memory usage are then predicted from changes in these indicators.
In yet another alternative embodiment, the prediction may be achieved by prediction models. A prediction model can be regarded as a mathematical function: the call prediction model takes the first call amount as input and outputs the second call amount; the predicted second call amount is then input into the memory usage prediction model, which outputs the memory usage.
Before the second time comes, the method further includes operations S230 to S240:
assuming that the second time is a service peak time, in the related art, the capacity expansion is performed after the monitored service transaction amount or the function transaction amount reaches a preset threshold, and there may be a delay. The function instances with the number not meeting the requirement are not operated at the second moment, and the user experience is poor for some service functions which are sensitive to the terminal user and the transaction delay. The capacity expansion or capacity reduction operation may be started within a predetermined time period before the second time comes after the second call amount and/or the memory usage amount are obtained. The predetermined period of time may be obtained from the time of a cold start or a shut down of an idle function, etc.
In some embodiments, before the second time arrives, a scale-up capacity schedule may be automatically formulated based on the predicted values. The scaling operations performed at various times during future peak time periods (including the second time) may be included in the schedule.
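As a minimal illustration of this lead-time idea, the following Python sketch computes when a scaling action should be triggered so that the instances are ready when the second time arrives; all names and durations here are assumptions for illustration, not values fixed by this disclosure:

```python
from datetime import datetime, timedelta

# Assumed durations: a cold start must start the function, install the
# runtime environment, and deploy code; reclaiming an idle instance is faster.
COLD_START = timedelta(seconds=90)
IDLE_SHUTDOWN = timedelta(seconds=30)

def scaling_trigger_time(second_time: datetime, instance_delta: int) -> datetime:
    """When to begin changing instances so the change completes by second_time."""
    lead = COLD_START if instance_delta > 0 else IDLE_SHUTDOWN
    return second_time - lead
```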
In operation S230, a predetermined number of function instances for executing the first function are changed based on the second call volume.
In some embodiments, the lifecycle of the first function begins with writing code and defining a configuration file, followed by compiling the code. Then a runtime environment is built; depending on the system design, the result may be a code binary, a program package, a container, or an image. Finally, the function is deployed, forming a function instance for users to call.
In operation S240, in the case of increasing the predetermined number of function instances, memory resources are allocated to each function instance of the predetermined number of function instances based on the memory usage amount.
Illustratively, the changing operation in operation S230 may be an increase, no change, or a decrease. The memory usage may be the size of the memory resources occupied by a function instance while running the first function in response to users' access requests. Therefore, after the second call amount is predicted, the predicted memory usage can be adapted to the second call amount.
For example, a certain number of function instances may be determined, based on the second call amount, to support operation. If the memory resources of a function instance are too large, idle memory is wasted; if too small, concurrent execution of the first function may not be supported, or the function may even fail to run. In the related art, however, a function instance is allocated a fixed memory size, which may be too large or too small relative to the second call amount.
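A minimal sketch of this idea follows, sizing each new instance's initial memory from the predicted per-instance usage rather than a platform-wide fixed value; the headroom factor and the memory floor are illustrative assumptions, not values from this disclosure:

```python
# Hypothetical sizing rule: give each instance the predicted usage plus some
# headroom, but never less than a platform minimum.
def initial_memory_mb(predicted_usage_mb: float,
                      headroom: float = 1.2,
                      floor_mb: int = 128) -> int:
    return max(floor_mb, int(predicted_usage_mb * headroom))
```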
According to the embodiments of the present disclosure, the function call amount at the second time is predicted in advance at the first time, and the predetermined number of function instances is changed in advance, so that the demand of the function call amount at the second time can be met; the memory usage is further predicted from the predicted second call amount, and when function instances are created, their initial memory size is allocated according to that memory usage, thereby saving memory resources and maximizing the overall resource scheduling of the function computing platform.
Fig. 3 schematically illustrates a flow chart for determining a predetermined number according to an embodiment of the present disclosure.
Before changing the predetermined number of function instances based on the second call volume in operation S230, the method further includes determining the predetermined number, which includes operations S310 to S320, as shown in fig. 3.
In operation S310, a second number of function instances at a second time is determined based on a second call volume.
Illustratively, the determination may be based on the concurrent call volume that each function instance can support: the product of the per-instance concurrent call volume and the second number must be greater than or equal to the second call amount. In some embodiments, the second number may be determined from the magnitude of the second call amount; for example, for 0 to 1000 calls the second number is 20 to 30, and for 1001 to 5000 calls the second number is 30 to 100 (examples only).
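Under the stated assumption that the product of the per-instance concurrency and the second number must cover the second call amount, operation S310 reduces to a ceiling division, as in this sketch:

```python
import math

def second_number(second_call_volume: int, per_instance_concurrency: int) -> int:
    # concurrency * instances >= call volume  =>  instances = ceil(volume / concurrency)
    return math.ceil(second_call_volume / per_instance_concurrency)

# e.g. 4800 predicted calls with 50 concurrent calls per instance -> 96 instances
```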
In operation S320, a predetermined number is determined based on a difference between the second number and the first number.
In other embodiments, the determination may be based on the ratio of the first number to the second number. For example, a pre-assigned scaling coefficient is determined based on the magnitude of the ratio, and the predetermined number is further determined based on the first number and the scaling coefficient.
According to the embodiments of the present disclosure, with the predetermined number determined from the currently known first number and the predicted second number, function instances meeting the demand can be computed and created in advance, so that capacity expansion or reduction is completed before the second time.
According to an embodiment of the present disclosure, in the case that the second number is greater than the first number, changing the predetermined number of function instances means increasing the function instances by the predetermined number; in the case that the second number is less than or equal to the first number, it means decreasing the function instances by the predetermined number.
Fig. 4 schematically illustrates a flowchart of predicting a second call amount in operation S210 according to an embodiment of the present disclosure.
As shown in fig. 4, predicting the second call volume to call the first function at the second time in operation S210 includes operations S410 to S420.
In operation S410, a first call amount is input to a call prediction model obtained in advance.
In some embodiments, the calling prediction model may be obtained by machine learning, using a training data set for training. The training samples in the training data set may include historical invocation data for the first function in the production environment.
In operation S420, a second call amount that calls the output of the prediction model is obtained.
According to the embodiments of the present disclosure, the second call amount is predicted in advance using the call prediction model; compared with threshold-based approaches, capacity expansion or reduction gains more lead time, improving resource scheduling efficiency. In addition, the call prediction model can learn the patterns of historical calls from historical call data, and compared with expert rules it is more automated and more accurate.
FIG. 5 schematically shows a flow diagram for obtaining a call prediction model according to an embodiment of the disclosure.
As shown in fig. 5, obtaining the call prediction model of this embodiment includes operations S510 to S520.
In operation S510, a fourth call amount for calling the first function at each fourth time in the second time series is obtained.
The second time series may be a time period within which the function call amount at each time can be obtained. The time period may be a traffic peak period, or may further include time windows around the peak period. The samples can therefore serve as training data from which the call prediction model learns accurate calling patterns during, before, and after business peaks.
FIG. 6 schematically shows a flow chart for obtaining a fourth call amount according to an embodiment of the disclosure.
As shown in fig. 6, obtaining the fourth call amount of this embodiment includes operations S610 to S630.
In operation S610, a running log of the first function in the second time series is obtained.
In some embodiments, a business function whose transaction volume shows obvious peaks and valleys in production is screened out as the first function. Such a function demonstrably needs scaling when it runs; if its peaks and valleys differ little, the delay of existing scaling approaches is unlikely to harm the user experience noticeably.
In operation S620, data in the running log is cleaned according to the service keyword, and target log data is obtained.
In some embodiments, the detailed run logs of the business function are cleaned according to business keywords, and useful log data is screened out. A business keyword may be a term associated with a business or a function; it is used to determine associated call records from one or more run logs of the first function. The association may be the same call, different calls of the same business, and so on. Taking financial business as an example, the business keywords may include function instance, transaction type, user account number, transaction amount, transaction channel, call time, and the like.
Illustratively, the process of data cleansing may be a process of removing useless data, preserving target log data.
In operation S630, a fourth call amount is obtained according to the target log data.
Illustratively, whether a call meets the requirements may be determined from the call sequence number and the parameters carried in the call. The parameters may include transaction type, user account number, transaction amount, transaction channel, call time, and the like. When the requirements are met, the record is counted as one call of the first function. In some embodiments, the call sequence number together with the call time may uniquely identify a call of the first function.
According to the embodiments of the present disclosure, cleaning the log data by keywords to obtain the target log data yields higher-quality training samples, and because the training samples are real call data collected in the production environment, the call prediction model trained on them is more accurate.
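A simplified sketch of this cleaning-and-counting procedure (operations S610 to S630) is given below; the JSON log format and field names such as call_id and call_time are assumptions for illustration, not a format fixed by the disclosure:

```python
import json

# Business keywords used to keep a log record (assumed field names).
KEYWORDS = {"transaction_type", "user_account", "transaction_amount",
            "transaction_channel", "call_time"}

def fourth_call_volume(log_lines):
    """Count distinct calls of the first function per minute from raw log lines."""
    counts, seen = {}, set()
    for line in log_lines:
        record = json.loads(line)
        if not KEYWORDS & record.keys():
            continue                       # cleaning: drop useless data
        call_id = record.get("call_id")
        if call_id in seen:
            continue                       # sequence number makes each call unique
        seen.add(call_id)
        minute = record["call_time"][:16]  # e.g. "2022-03-17 14:55"
        counts[minute] = counts.get(minute, 0) + 1
    return counts
```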
In some embodiments, the target log data may be used directly as training samples. Specifically, the frequencies of the transaction type, user account number, transaction amount, transaction channel, call time, and so on in the call records are determined from the target log data, thereby establishing the relationship between each keyword and the call volume and obtaining the call prediction model. In operation S410, the first call amount at the first time and the frequencies of the business keywords under the first call amount (representing the business indicators handled by users) may be input into the call prediction model.
In operation S520, a call prediction model is obtained according to each fourth time and the corresponding fourth call volume.
According to the embodiment of the disclosure, each fourth time in the second time series and the fourth call volume of the corresponding time can be used as training samples, and the call prediction model is obtained by learning the change trend of the call volume at different times.
According to an embodiment of the present disclosure, the invoking prediction model includes a time series model and a poisson distribution model, and the obtaining the invoking prediction model according to each fourth time and the corresponding fourth invocation amount in operation S520 includes: and respectively determining parameters of the time series model and parameters of the Poisson distribution model according to each fourth moment and the corresponding fourth call quantity.
In some embodiments, each fourth time and the corresponding fourth call volume may be analyzed automatically using time series analysis techniques, such as an AR (autoregressive) model, an MA (moving average) model, an ARIMA (autoregressive integrated moving average) model, or an LSTM (long short-term memory network) model. After the parameters are determined, the time series model is obtained and used to predict future changes in the function call volume.
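As one possible realization of the time-series branch (the disclosure names AR, MA, ARIMA, and LSTM without fixing a library or a model order), an ARIMA fit with statsmodels might look like this sketch:

```python
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

def fit_call_series(times, call_volumes, order=(2, 1, 2)):
    """Fit an ARIMA model to (fourth time, fourth call volume) samples."""
    series = pd.Series(call_volumes, index=pd.DatetimeIndex(times))
    return ARIMA(series, order=order).fit()

def predict_next_call_volume(fitted) -> float:
    """One-step-ahead forecast of the function call volume."""
    return float(fitted.forecast(steps=1).iloc[0])
```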
In some embodiments, each fourth time and the corresponding fourth call volume are analyzed using the mathematics of the Poisson distribution, and a Poisson distribution model conforming to the calling pattern is constructed. Specifically, the probability mass function of the Poisson distribution model is shown in formula 1:

P(X = k) = λ^k e^(-λ) / k! (formula 1)

where k is the call volume value to be predicted and λ is the expected value.
Illustratively, the solution process is as follows: let x be a loop variable; starting from x = 0, accumulate the probability mass P(X = x) while incrementing x by 1 each step. When the accumulated value reaches a sample u drawn from the uniform distribution U[0, 1], the corresponding x is the value of the inverse Poisson distribution function at u. Because the inverse distribution function evaluated at a uniform random variable has the same distribution as the original random variable, this x is the desired Poisson-distributed random value, which serves as the predicted value.
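The procedure just described is ordinary inverse-transform sampling of the Poisson distribution; a direct transcription in Python:

```python
import math
import random

def poisson_sample(lam: float) -> int:
    """Accumulate P(X = x) while incrementing x until the cumulative mass
    reaches a uniform sample u ~ U[0, 1]; that x is the predicted value."""
    u = random.random()
    x, cumulative = 0, 0.0
    while True:
        cumulative += (lam ** x) * math.exp(-lam) / math.factorial(x)
        if cumulative >= u:
            return x
        x += 1
```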
Fig. 7 schematically illustrates a flow diagram for predicting a second call amount according to another embodiment of the present disclosure.
As shown in fig. 7, predicting the second call amount of this embodiment includes operations S710 to S720.
In operation S710, the first call volume is input into the time series model and the poisson distribution model, respectively.
In operation S720, the second call amount is determined based on the candidate call amounts respectively output by the time series model and the Poisson distribution model.
Illustratively, the time series model outputs one candidate call amount from the first call amount, and the Poisson distribution model outputs another. Any one of the maximum value, the minimum value, or the average value of the two candidates may be used as the second call amount.
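A sketch of this combination step (operation S720); which of the three statistics to use is a design choice, the maximum being the conservative option for capacity planning:

```python
def second_call_volume(ts_candidate: float, poisson_candidate: float,
                       mode: str = "max") -> float:
    if mode == "max":      # conservative: never under-provision
        return max(ts_candidate, poisson_candidate)
    if mode == "min":      # frugal: never over-provision
        return min(ts_candidate, poisson_candidate)
    return (ts_candidate + poisson_candidate) / 2  # balanced
```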
According to the embodiments of the present disclosure, by applying time series analysis and Poisson distribution analysis together, both the long-term trend of the business transaction volume (the time series model) and its short-term variation (the Poisson distribution model) can be predicted, so that appropriate computing resources are scheduled and the number of function instances is adjusted to fit the business transaction volume, balancing computing resource usage against the business user experience.
Fig. 8 schematically illustrates a flowchart of predicting the memory usage amount in operation S220 according to an embodiment of the present disclosure.
As shown in fig. 8, predicting the memory usage of the first function at the second time based on the second call amount in operation S220 includes operations S810 to S820.
In operation S810, a second call amount is input to a pre-obtained memory prediction model.
In some embodiments, the memory prediction model may also be obtained by training with a training data set in a machine learning manner. The training samples in the training dataset may include memory usage of a single function instance of the first function at runtime in the production environment.
In operation S820, the memory usage amount output by the memory prediction model is obtained.
According to the embodiments of the present disclosure, the memory usage at the second time is predicted in advance using the memory prediction model; the adaptation between the second call amount and the memory usage can thus be considered comprehensively, and the initial memory size of a single function instance is allocated according to the predicted memory usage, thereby saving memory resources and maximizing the number of function instances the whole platform can host.
FIG. 9 schematically shows a flow chart for obtaining a memory prediction model according to an embodiment of the disclosure.
As shown in fig. 9, obtaining the memory prediction model according to this embodiment includes operations S910 to S930.
In operation S910, a third call amount for calling the first function at each third time in the first time sequence is obtained.
In operation S920, the memory usage amount of the single function instance of the first function at each third time is obtained.
For example, if the first function corresponds to 5 function instances at a certain historical time (for example only), the memory usage of a single function instance may be taken as the maximum over the 5 instances, or alternatively the average or the minimum, and so on.
In some embodiments, the first time series may be the same as or different from the second time series. When they are the same, each fourth time coincides with a third time, and the fourth call amount at each time equals the third call amount. In this case the training samples of the memory prediction model and of the call prediction model are correlated, and in actual prediction the output of the call prediction model is used as the input of the memory prediction model; the accuracy of the memory prediction model can therefore be improved.
According to the embodiment of the disclosure, the running log of the first function in the first time sequence can be obtained, and the data in the running log is cleaned according to the service key words to obtain the target log data. And obtaining a third call amount and a memory usage amount according to the target log data. In some embodiments, the service function detailed operation log is cleaned according to the service key words, and useful log data are screened out. Taking financial services as an example, the service keywords may include function instance names, memory usage amounts, transaction types, user account numbers, transaction amounts, transaction channels or call times, and the like.
In some embodiments, the target log data may be used directly as training samples. Specifically, the frequencies of the transaction type, user account number, transaction amount, transaction channel, call time, and so on in the call records are determined from the target log data, thereby establishing the relationship between each keyword and the memory usage and obtaining the memory prediction model. In operation S810, the predicted frequencies of the keywords (representing the business indicators handled by users) may be input into the memory prediction model together with the second call amount.
In operation S930, a memory prediction model is obtained according to the third call amount and the memory usage amount of the single function instance.
According to the embodiments of the present disclosure, because the third call amount and the memory usage in the first time series are real call data obtained in the production environment, the memory prediction model obtained from them can learn the relationship between call volume and memory usage and has higher accuracy.
FIG. 10 schematically illustrates a flow chart for determining parameters of a memory prediction model according to an embodiment of the disclosure.
As shown in fig. 10, the obtaining the memory prediction model according to the third call amount and the memory usage amount of the single function instance in operation S930 includes operations S1010 to S1020.
In operation S1010, a first fitting curve is obtained according to the third call amount and the memory usage amount of the single function instance.
Illustratively, in a two-dimensional x-y coordinate system with the third call amount as the abscissa and the memory usage of a single function instance as the ordinate, linear regression (for example, by the least squares method) or polynomial regression may be used to obtain the first fitting curve.
In operation S1020, parameters of the memory prediction model are determined based on the first fitted curve.
In some embodiments, the parameters of the memory prediction model may be determined based on the fitting quality of the first fitted curve. The fitting quality may be measured, for example, by the squared error or the root mean square error.
For example, a simple linear regression model is used on the training data set to model the relationship between the function call volume (input variable) and the memory usage of a single function instance (output variable). The linear regression model is shown in formula 2:

y = β0 + β1x (formula 2)

where y is the predicted single-instance memory usage, x is the function call volume, and β0, β1 are the model parameters. In a specific implementation, the least squares method is used to solve for the parameter estimates so that the model best fits the data.
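A minimal sketch of this fit; numpy.polyfit with degree 1 is one of several ways to obtain the least-squares estimates of β0 and β1:

```python
import numpy as np

def fit_memory_model(call_volumes, memory_usages):
    """Least-squares fit of y = beta0 + beta1 * x (formula 2)."""
    beta1, beta0 = np.polyfit(call_volumes, memory_usages, deg=1)  # highest degree first
    return beta0, beta1

def predict_memory(beta0: float, beta1: float, call_volume: float) -> float:
    """Predicted single-instance memory usage for a given call volume."""
    return beta0 + beta1 * call_volume
```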
Based on the resource scheduling method, the disclosure also provides a resource scheduling device. The apparatus will be described in detail below with reference to fig. 11 to 16.
Fig. 11 schematically shows a block diagram of a resource scheduling apparatus 1100 according to an embodiment of the present disclosure.
As shown in fig. 11, the resource scheduling apparatus 1100 of this embodiment includes a first prediction module 1110, a second prediction module 1120, an instance changing module 1130, and a memory allocation module 1140.
The first prediction module 1110 may perform operation S210 for predicting a second call amount for calling the first function at a second time based on a first call amount for calling the first function at the first time, wherein the first time is prior to the second time.
The second prediction module 1120 may perform operation S220 for predicting the memory usage of the first function at the second time based on the second call amount.
The instance changing module 1130 may perform operation S230 for changing a predetermined number of function instances based on the second call amount before the second time arrives, wherein the function instances are used for running the first function.
The memory allocation module 1140 may perform operation S240 for allocating memory resources to each of the predetermined number of function instances based on the memory usage, in the case that the predetermined number of function instances is increased.
According to an embodiment of the present disclosure, the resource scheduling apparatus 1100 may further include a predetermined number determining module, which is configured to perform operations S310 to S320, and is not described herein again.
According to an embodiment of the disclosure, the first prediction module 1110 may further perform operations S410 to S420, and operations S510 to S520, which are not described herein again.
According to an embodiment of the present disclosure, the resource scheduling apparatus 1100 may further include a fourth call volume obtaining module, configured to perform operations S610 to S630, which are not described herein again.
Fig. 12 schematically shows a block diagram of the log processing module 1150 according to an embodiment of the present disclosure.
As shown in fig. 12, the log processing module 1150 of this embodiment includes a log caching unit 1151, a log persistence unit 1152, and a query and export unit 1153.
The log cache unit 1151 is configured to cache the system logs and business run logs of a recent period (e.g., three days or one week) using flash storage and high-performance, high-availability caching technologies such as Redis and Memcached, so that operations such as querying and exporting can be performed efficiently.
The log persistence unit 1152 is configured to store log data using technologies such as Elasticsearch, object storage, and distributed file systems, to guarantee that logs remain persistently available, and to support automatic clearing of expired logs according to rules (since disk space is limited).
The query and export unit 1153 is configured to provide log query and export functions using technologies such as Elasticsearch, Vue, and Node.js.
According to an embodiment of the present disclosure, the resource scheduling apparatus 1100 may further include a second call volume obtaining module, configured to perform operations S710 to S720, which are not described herein.
Fig. 13 schematically illustrates a block diagram of the temporal sequence analysis module 1160, according to an embodiment of the present disclosure.
As shown in fig. 13, the resource scheduling apparatus 1100 of this embodiment may further include a time series analysis module 1160 including a first preprocessing unit 1161, a model configuration unit 1162, and a first model optimization unit 1163.
The first preprocessing unit 1161 is configured to clean the operation log, screen out data that is helpful for time series modeling, and store the data in a local cache for standby.
The model configuration unit 1162 is used to configure and enable the time series analysis model, and supports multiple models, such as AR, MA, ARIMA, and the like.
The first model optimizing unit 1163 is configured to apply the enabled model, perform adjustment using the screened log, and establish a time series model conforming to log data, so as to predict a subsequent function call amount.
Fig. 14 schematically shows a block diagram of a poisson distribution analysis module 1170 according to an embodiment of the present disclosure.
As shown in fig. 14, the resource scheduling apparatus 1100 of this embodiment may further include a poisson distribution analysis module 1170, which includes a second preprocessing unit 1171 and a second model optimization unit 1172.
The second preprocessing unit 1171 is configured to clean the transaction log of the business function, screen out data that is helpful for poisson distribution modeling, and store the data in a local cache for later use.
The second model optimization unit 1172 is configured to apply a poisson distribution mathematical model, perform adjustment using the screened log, and establish a poisson distribution model that conforms to log data, for predicting a subsequent function call amount.
Fig. 15 schematically shows a block diagram of the structure of the function calculation core engine module 1180 according to an embodiment of the present disclosure.
As shown in fig. 15, the resource scheduling apparatus 1100 of this embodiment may further include a function computation core engine module 1180, which includes a transaction monitoring unit 1181, a model calculation unit 1182, and a function instance scheduling unit 1183.
The transaction monitoring unit 1181 is used for monitoring the transaction amount of the business function in real time, and is used for model calculation and function instance scheduling reference.
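A sliding-window counter is one simple way to obtain the real-time transaction volume this unit feeds to model calculation and scheduling (the window length is an assumption):

import time
from collections import deque

class SlidingWindowVolume:
    """Counts calls observed in the trailing window_s seconds."""

    def __init__(self, window_s: int = 60):
        self.window_s = window_s
        self.stamps = deque()

    def record_call(self) -> None:
        self.stamps.append(time.monotonic())

    def current_volume(self) -> int:
        cutoff = time.monotonic() - self.window_s
        while self.stamps and self.stamps[0] < cutoff:
            self.stamps.popleft()
        return len(self.stamps)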
The model calculation unit 1182 is configured to support the first prediction module 1110 and the second prediction module 1120, calculating the near-term function call volume (for example, within the next half hour) by combining the time series model and the Poisson distribution model of the business function.
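How the two model outputs are merged is not fixed by this embodiment; one conservative policy, sketched here purely as an assumption, is to provision for the larger of the two predictions:

def combined_prediction(ts_forecast: float, poisson_forecast: float) -> int:
    """Merge the time series and Poisson forecasts for the next interval.

    Taking the maximum biases toward over-provisioning, trading some idle
    capacity for low latency (one possible policy among several)."""
    return int(max(ts_forecast, poisson_forecast))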
The function instance scheduling unit 1183 is configured to support the instance changing module 1130, predicting the transaction volume from the current function call volume and the models and scheduling function instances accordingly: capacity is expanded when the transaction volume is expected to rise, and reduced when it clearly falls (an expected trough).
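The expand/shrink decision can then reduce to arithmetic over a per-instance capacity figure; the capacity constant and the hysteresis threshold below are assumptions:

import math

CALLS_PER_INSTANCE = 200  # assumed per-instance capacity per interval
SCALE_IN_HEADROOM = 0.8   # shrink only when clearly below capacity (hysteresis)

def plan_instances(predicted_calls: int, current_instances: int) -> int:
    """Return the instance delta: positive = expand, negative = shrink."""
    needed = max(1, math.ceil(predicted_calls / CALLS_PER_INSTANCE))
    if needed > current_instances:
        return needed - current_instances  # expected rise: expand ahead of time
    if needed < current_instances * SCALE_IN_HEADROOM:
        return needed - current_instances  # expected trough: shrink
    return 0                               # within headroom: hold steady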
It should be noted that the implementation, solved technical problems, implemented functions, and achieved technical effects of each module/unit/subunit and the like in the apparatus part embodiment are respectively the same as or similar to the implementation, solved technical problems, implemented functions, and achieved technical effects of each corresponding step in the method part embodiment, and are not described herein again.
According to the embodiments of the present disclosure, any plurality of the modules 1110 to 1180 in the resource scheduling apparatus 1100 may be combined and implemented in one module, or any one of them may be split into multiple modules. Alternatively, at least part of the functionality of one or more of these modules may be combined with at least part of the functionality of other modules and implemented in one module.
According to an embodiment of the present disclosure, at least one of the modules 1110 to 1180 in the resource scheduling apparatus 1100 may be implemented at least partially as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system in a package, or an Application Specific Integrated Circuit (ASIC); by hardware or firmware in any other reasonable manner of integrating or packaging a circuit; or by any one of, or a suitable combination of, the three implementations of software, hardware, and firmware. Alternatively, at least one of the modules 1110 to 1180 of the resource scheduling apparatus 1100 may be at least partly implemented as a computer program module which, when executed, performs the corresponding function.
Fig. 16 schematically shows a resource scheduling flowchart based on the resource scheduling apparatus 1100 according to an embodiment of the present disclosure.
As shown in fig. 16, the resource scheduling flow of this embodiment includes operations S1601 to S1607.
Operation S1601: start the system.
Operation S1602: calculate and predict the service peak time and call volume using the function core engine scheduling model.
Operation S1603: the function core engine schedules computing-resource expansion according to the expected call volume and adds an appropriate number of function instances.
Operation S1604: the call volume of the business function rises according to the Poisson distribution and reaches a peak; because sufficient function instances have been prepared in advance, a low-latency user experience is maintained.
Operation S1605: the call volume of the business function falls according to the Poisson distribution and is expected to enter a trough.
Operation S1606: the function core engine schedules computing-resource reduction according to the expected service call volume and removes an appropriate number of function instances.
Operation S1607: developers evaluate how the intelligent scaling mechanism performs in production and, where it falls short, continue to tune the time series model and the Poisson distribution model.
According to the embodiments of the present disclosure, function computing resources are scaled intelligently based on time series analysis and the Poisson distribution. Statistical learning models can be built from the historical call volume of a business function using time series analysis techniques and the mathematics of the Poisson distribution, and dynamically expanding and shrinking function computing resources with these models during production can deliver a state-of-the-art (SOTA) user experience.
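Tying operations S1601 to S1607 together, the runtime behavior is a periodic predict-and-scale loop. The sketch below is schematic: monitor, ts_model, poisson_model, and scaler are hypothetical objects standing in for the units sketched earlier, and combined_prediction and plan_instances are the assumed helpers defined above:

import time

def scheduling_loop(monitor, ts_model, poisson_model, scaler, interval_s=300):
    """Periodically predict near-term call volume and adjust function instances."""
    while True:
        predicted = combined_prediction(ts_model.forecast(),      # S1602: predict
                                        poisson_model.forecast())
        delta = plan_instances(predicted, scaler.instance_count())
        if delta > 0:
            scaler.scale_out(delta)    # S1603: expand before the peak arrives
        elif delta < 0:
            scaler.scale_in(-delta)    # S1606: shrink into the expected trough
        time.sleep(interval_s)         # S1604/S1605 unfold between iterations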
Fig. 17 schematically shows a block diagram of an electronic device adapted to implement the resource scheduling method according to an embodiment of the present disclosure.
As shown in fig. 17, the electronic device 1700 according to an embodiment of the present disclosure includes a processor 1701, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 1702 or a program loaded from a storage portion 1708 into a Random Access Memory (RAM) 1703. The processor 1701 may include, for example, a general-purpose microprocessor (e.g., a CPU), an instruction set processor and/or an associated chipset, and/or a special-purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)). The processor 1701 may also include on-board memory for caching purposes, and may include a single processing unit or multiple processing units for performing the different actions of the method flow according to embodiments of the present disclosure.
In the RAM 1703, various programs and data necessary for the operation of the electronic device 1700 are stored. The processor 1701, the ROM 1702, and the RAM 1703 are connected to one another by a bus 1704. The processor 1701 performs various operations of the method flow according to the embodiments of the present disclosure by executing programs in the ROM 1702 and/or the RAM 1703. Note that the programs may also be stored in one or more memories other than the ROM 1702 and the RAM 1703; the processor 1701 may likewise perform these operations by executing programs stored in such memories.
According to an embodiment of the present disclosure, the electronic device 1700 may also include an input/output (I/O) interface 1705, which is likewise connected to the bus 1704. The electronic device 1700 may also include one or more of the following components connected to the I/O interface 1705: an input portion 1706 including a keyboard, a mouse, and the like; an output portion 1707 including a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), a speaker, and the like; a storage portion 1708 including a hard disk and the like; and a communication portion 1709 including a network interface card such as a LAN card or a modem. The communication portion 1709 performs communication processing via a network such as the Internet. A drive 1710 is also connected to the I/O interface 1705 as needed. A removable medium 1711, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 1710 as needed, so that a computer program read therefrom can be installed into the storage portion 1708 as needed.
The present disclosure also provides a computer-readable storage medium, which may be included in the device/apparatus/system described in the above embodiments, or may exist separately without being assembled into such a device/apparatus/system. The computer-readable storage medium carries one or more programs which, when executed, implement the method according to the embodiments of the present disclosure.
According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example, but is not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM or flash memory), a portable Compact Disc Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by, or in connection with, an instruction execution system, apparatus, or device. For example, according to embodiments of the present disclosure, the computer-readable storage medium may include the ROM 1702 and/or the RAM 1703 described above and/or one or more memories other than the ROM 1702 and the RAM 1703.
Embodiments of the present disclosure also include a computer program product comprising a computer program containing program code for performing the method illustrated in the flow chart. When the computer program product runs in a computer system, the program code is used for causing the computer system to realize the method provided by the embodiment of the disclosure.
The computer program performs the above-described functions defined in the system/apparatus of the embodiment of the present disclosure when executed by the processor 1701. The above described systems, devices, modules, units, etc. may be implemented by computer program modules according to embodiments of the present disclosure.
In one embodiment, the computer program may be hosted on a tangible storage medium such as an optical storage device, a magnetic storage device, or the like. In another embodiment, the computer program may also be transmitted, distributed as a signal on a network medium, downloaded and installed via the communication portion 1709, and/or installed from the removable medium 1711. The computer program containing program code may be transmitted using any suitable network medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.
In accordance with embodiments of the present disclosure, program code for carrying out the computer programs provided by embodiments of the present disclosure may be written in any combination of one or more programming languages; in particular, these computer programs may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. The programming languages include, but are not limited to, Java, C++, Python, and C. The program code may execute entirely on the user computing device, partly on the user device, partly on a remote computing device, or entirely on the remote computing device or server. In the latter case, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (for example, through the Internet using an Internet service provider).
The embodiments of the present disclosure have been described above. However, these examples are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described separately above, this does not mean that the measures in the embodiments cannot be used advantageously in combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be devised by those skilled in the art without departing from the scope of the present disclosure, and such alternatives and modifications are intended to be within the scope of the present disclosure.

Claims (15)

1. A resource scheduling method based on function computation, comprising:
predicting a second call amount for calling a first function at a second time based on a first call amount for calling the first function at a first time, wherein the first time is before the second time; and
predicting the memory usage amount of the first function at the second time based on the second call amount;
wherein, before the second time arrives, the method further comprises:
changing a predetermined number of function instances based on the second call amount, wherein the function instances are used for running the first function; and
in the case that the predetermined number of function instances are increased, allocating memory resources to each of the predetermined number of function instances based on the memory usage amount.
2. The method of claim 1, wherein predicting the memory usage amount of the first function at the second time based on the second call amount comprises:
inputting the second call amount into a memory prediction model obtained in advance; and
obtaining the memory usage amount output by the memory prediction model.
3. The method according to claim 2, wherein the method further comprises obtaining the memory prediction model, specifically comprising:
obtaining a third call amount for calling the first function at each third moment in a first time series;
obtaining the memory usage amount of a single function instance of the first function at each third moment; and
obtaining the memory prediction model according to the third call amount and the memory usage amount of the single function instance.
4. The method of claim 3, wherein obtaining the memory prediction model according to the third call amount and the memory usage amount of the single function instance comprises:
obtaining a first fitted curve according to the third call amount and the memory usage amount of the single function instance; and
determining parameters of the memory prediction model based on the first fitted curve.
5. The method of claim 1, wherein predicting a second call amount for calling the first function at a second time based on a first call amount for calling the first function at the first time comprises:
inputting the first call amount into a call prediction model obtained in advance; and
obtaining the second call amount output by the call prediction model.
6. The method according to claim 5, wherein the method further comprises obtaining the call prediction model, specifically comprising:
obtaining a fourth call amount for calling the first function at each fourth moment in a second time series; and
obtaining the call prediction model according to each fourth moment and the corresponding fourth call amount.
7. The method of claim 6, wherein the call prediction model comprises a time series model and a Poisson distribution model, and obtaining the call prediction model according to each fourth moment and the corresponding fourth call amount comprises:
determining parameters of the time series model and parameters of the Poisson distribution model respectively according to each fourth moment and the corresponding fourth call amount.
8. The method of claim 7, wherein, after the time series model and the Poisson distribution model are obtained, predicting the second call amount for calling the first function at the second time comprises:
inputting the first call amount into the time series model and the Poisson distribution model respectively; and
determining the second call amount based on the candidate call amounts respectively output by the time series model and the Poisson distribution model.
9. The method of claim 6, wherein obtaining the fourth call amount comprises:
obtaining a running log of the first function in the second time series;
cleaning the data in the running log according to service keywords to obtain target log data; and
obtaining the fourth call amount according to the target log data.
10. The method according to claim 1, wherein before changing the predetermined number of function instances based on the second call amount, the method further comprises determining the predetermined number, specifically comprising:
determining a second number of function instances at the second time based on the second call amount; and
determining the predetermined number based on a difference between the second number and a first number, wherein the first number is the number of function instances at the first time.
11. The method of claim 10, wherein:
in the case that the second number is greater than the first number, changing the predetermined number of function instances comprises increasing the function instances by the predetermined number; and
in the case that the second number is less than or equal to the first number, changing the predetermined number of function instances comprises decreasing the function instances by the predetermined number.
12. A resource scheduling apparatus based on function computation, comprising:
the device comprises a first prediction module, a second prediction module and a third prediction module, wherein the first prediction module is used for predicting a second calling amount of a first function called at a second moment based on a first calling amount of the first function called at the first moment, and the first moment is before the second moment;
the second prediction module is used for predicting the memory usage of the first function at the second moment based on the second call amount;
an instance changing module, configured to change a predetermined number of function instances based on the second call volume before the second time comes, where the function instances are used to run the first function; and
and the memory allocation module is used for allocating memory resources to each function instance in the predetermined number of function instances on the basis of the memory usage amount under the condition that the predetermined number of function instances are increased.
13. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method of any of claims 1-11.
14. A computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform the method of any one of claims 1 to 11.
15. A computer program product comprising a computer program which, when executed by a processor, implements a method according to any one of claims 1 to 11.
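Purely as an illustration of the memory prediction model recited in claims 2 to 4, and not as part of the claims: fitting the first fitted curve from call volume to per-instance memory usage can be as simple as a least-squares polynomial; the degree-2 choice and the sample figures are assumptions:

import numpy as np

def fit_memory_model(third_call_amounts, per_instance_memory_mb, degree: int = 2):
    """Fit a curve of per-instance memory usage versus call volume and
    return it as a callable predictor (the parameters of the model)."""
    coeffs = np.polyfit(third_call_amounts, per_instance_memory_mb, deg=degree)
    return np.poly1d(coeffs)

# predict = fit_memory_model([100, 500, 1000], [128, 210, 400])  # assumed data
# predict(750)  # estimated MB per instance at a call volume of 750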
CN202210266825.2A 2022-03-17 2022-03-17 Resource scheduling method, device, equipment, medium and program product Pending CN114443310A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210266825.2A CN114443310A (en) 2022-03-17 2022-03-17 Resource scheduling method, device, equipment, medium and program product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210266825.2A CN114443310A (en) 2022-03-17 2022-03-17 Resource scheduling method, device, equipment, medium and program product

Publications (1)

Publication Number Publication Date
CN114443310A (en) 2022-05-06

Family

ID=81359619

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210266825.2A Pending CN114443310A (en) 2022-03-17 2022-03-17 Resource scheduling method, device, equipment, medium and program product

Country Status (1)

Country Link
CN (1) CN114443310A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115629858A (en) * 2022-10-17 2023-01-20 南京航空航天大学 Self-adaptive method for number of function examples in server-free background and application
CN116450361A (en) * 2023-05-23 2023-07-18 南京芯驰半导体科技有限公司 Memory prediction method, device and storage medium
CN116450361B (en) * 2023-05-23 2023-09-29 南京芯驰半导体科技有限公司 Memory prediction method, device and storage medium

Similar Documents

Publication Publication Date Title
JP7202432B2 (en) Correlation between thread strength and heap usage to identify stack traces hoarding the heap
Bhattacharjee et al. Barista: Efficient and scalable serverless serving system for deep learning prediction services
US8826277B2 (en) Cloud provisioning accelerator
CN114443310A (en) Resource scheduling method, device, equipment, medium and program product
US11720565B2 (en) Automated query predicate selectivity prediction using machine learning models
US20210120092A1 (en) Adaptive data fetching from network storage
CN115086189A (en) Server-free computing oriented service resource elastic expansion method and system
US20230252024A1 (en) Machine-learning-based, adaptive updating of quantitative data in database system
CN117290093A (en) Resource scheduling decision method, device, equipment, medium and program product
WO2017196749A1 (en) Correlation of stack segment intensity in emergent relationships
EP4254217A1 (en) Framework for workload prediction and physical database design
US20190182343A1 (en) Method and system for tracking application activity data from remote devices and generating a corrective action data structure for the remote devices
CN114637809A (en) Method, device, electronic equipment and medium for dynamic configuration of synchronous delay time
CN114520773A (en) Service request response method, device, server and storage medium
Yu et al. Integrating clustering and regression for workload estimation in the cloud
US12028271B2 (en) Prioritizing messages for server processing based on monitoring and predicting server resource utilization
US11822984B1 (en) Autonomous concurrency for serverless functions
US20240305523A1 (en) Method to auto correct the default resource allocation of services in a migration environment
CN116974747A (en) Resource allocation method, device, equipment, medium and program product
CN116701451A (en) Query method, device and system
CN115544055A (en) Calculation engine determination method and device
CN113014412A (en) Method and system for predicting delay time of downtime fault service
CN116860423A (en) Method, apparatus, electronic device and computer readable storage medium for computing resource allocation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination