CN105631196A - Microservice-oriented container level flexible resource supply system and method - Google Patents


Info

Publication number: CN105631196A (granted as CN105631196B)
Application number: CN201510974291.9A
Authority: CN (China)
Original language: Chinese (zh)
Prior art keywords: response time, container, resource, load, equation
Legal status: Granted; Active
Inventors: 吴恒, 郝庭毅, 宋云奎, 张文博
Assignee (original and current): Institute of Software of CAS

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16Z: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS, NOT OTHERWISE PROVIDED FOR
    • G16Z 99/00: Subject matter not provided for in other main groups of this subclass


Abstract

The invention relates to a microservice-oriented container-level elastic resource provisioning system and method. The system comprises a data acquisition unit, a performance modeler, a feedforward controller, a response time predictor, and a container scheduler. The system and method remedy two shortcomings of existing approaches: difficulty adapting to bursty load patterns and difficulty guaranteeing user quality of service.

Description

A container-level elastic resource provisioning system and method for microservice architectures
Technical field
The present invention relates to a container-level elastic resource provisioning system and method for microservice architectures, belonging to the field of Internet resource management, in particular elastic container resource provisioning under bursty load scenarios.
Background technology
The microservice architecture (Microservice) embodies the design philosophy of Internet applications. Its core ideas are fine-grained module division, service-oriented interface encapsulation, and lightweight communication, which yield two advantages: (1) strong module autonomy, which satisfies Internet applications' need for rapid change and independent module upgrades; and (2) good module scalability, which satisfies their need for dynamic resource allocation when user volume is hard to predict. The former concerns application development and design; the latter concerns application operation and maintenance, which is the focus here. According to a Gartner report, microservices, with their inherent scalability, are becoming the mainstream architectural style for building Internet applications such as Netflix, Facebook, and Twitter. From the operations perspective, however, coping with typical Internet flash-crowd load scenarios while guaranteeing application quality of service (QoS) remains a challenge. (Quality of service refers to how well the application software meets its timing requirements; response time is one of its key metrics. For example, a user QoS of 5 seconds means the interval from a user's request to its response must not exceed 5 seconds.)
In recent years, lightweight container technology has emerged. Its essence is to simulate a process running environment; containers occupy few resources and start applications quickly, and they are gradually becoming the mainstream platform for running microservices. Containers provide second-level resource provisioning and can well satisfy Internet applications' demand for real-time resource supply under changing load. Existing methods, however, are either limited by the inflexibility of physical machines, making elastic resource supply difficult, or limited by the minute-level provisioning latency of virtual machines, making them suitable only for periodically varying (time-of-day) load patterns. Despite the rapid development and wide adoption of Internet technology, coping with flash-crowd loads remains a challenge: when JD.com adopted containers for its 2015 "618" flash-sale event, the growth rate of user visits exceeded expectations, causing some application components to become unresponsive or sluggish.
Existing work mainly targets elastic resource provisioning for physical machines and virtual machines. For physical resources, which are hard to provision quickly, some work adopts capacity planning, taking quality of service as a constraint to estimate an application's peak resource demand; other work adopts admission control, inferring from the resource supply the peak load the application can bear and guaranteeing quality of service by rejecting excess requests. For example, Ludmila Cherkasova et al. proposed an admission-control mechanism that builds a loss model of sessions and throughput to infer resource demand and supply; Robertsson A. et al. proposed a system of linear resource-demand equations that estimates peak resource demand in advance under a quality-of-service constraint, achieving on-demand resource supply. Other work accounts for virtual-machine performance overhead, using model-driven methods to characterize how an application's resource demand changes in a virtualized environment and taking that as the basis for resource provisioning. These methods typically train model parameters with mechanisms such as reinforcement learning or statistical learning. For example, Karlsson M. et al. proposed a performance-isolation-based method that analyzes the resource demand of each service instance and builds a separate performance-change model per service, enabling adaptive resource supply. Some work considers performance interference between co-located virtual machines, using methods such as statistical machine learning, fuzzy control, or probability theory to characterize the impact of inter-VM interference on resource provisioning and taking that as the basis for application resource supply. For example, the machine-learning approach proposed by P et al. trains the performance parameters of the model on historical data sets and adjusts resources dynamically according to the learned provisioning rules. However, because virtual-machine provisioning takes minutes, these methods are usually applicable only to loads that vary periodically over time (time-of-day patterns).
Summary of the invention
The technical problem solved by the present invention: overcoming the deficiencies of the prior art by providing a container-level resource provisioning method for microservice architectures, addressing the inability of existing methods to adapt to bursty load patterns and their failure to guarantee user quality of service.
The technical solution of the present invention: a container-level elastic resource provisioning system for microservice architectures, comprising a data acquisition unit, a performance modeler, a response time predictor, a feedforward controller, and a container scheduler, wherein:
The data acquisition unit periodically monitors container system resources (CPU, memory, disk I/O, and network I/O) and the number of user requests per second. It is also responsible for building the "load-preference resource utilization" relation, because containers differ in resource preference: an I/O-intensive container may have higher disk and network utilization than CPU utilization, while a compute-intensive container may use CPU and memory more heavily than disk I/O. Specifically, the data acquisition unit periodically takes the peak among the four parameters CPU utilization, memory utilization, disk I/O, and network I/O, defines it as the preference resource utilization, and then constructs the "load-preference resource utilization" relation;
The performance modeler, using the data obtained by the data acquisition unit, builds a model from the performance and flow equations of a Jackson open queueing network. Given the preference resource utilization, it constructs the "load-response time" association equation under the microservice architecture;
The response time predictor: the "load-response time" association equation contains unknown variables, and only once these are determined can the response time be computed. Kalman filtering is therefore adopted, with the unknown variables as the prediction matrix and the known (observable) variables as the observation matrix; by assigning values to the unknown variables, the response time is estimated, and the variance and mean of the difference between the estimated and monitored values are passed to the feedforward controller;
The feedforward controller analyzes the variance and mean of the residual and obtains, via fuzzy logic, calibration values for the filter parameters (the values of the unknown variables in the "load-response time" association equation); its purpose is to correct the unknown variables so that the estimated response time is more accurate;
The container scheduler performs on-demand supply of container-level resources according to whether the predicted response time violates the application's quality of service.
In the performance modeler, the "load-response time" association equation under the microservice architecture is built from the Jackson open queueing network as follows:
u_j = u_0j + α_j · Σ_i (λ_ji · T_ji)    (1)

B = d + Σ_j [ T_j / (1 − u_j) ]    (2)
where j indexes application components and i indexes application component instances; their relation is: a user request flows through multiple application components j1...jn, each application component j contains multiple instances i1...im, and each instance runs in a container. u_j ∈ [0, 1) is the preference resource utilization of application component j; u_0j is the preference resource utilization of component j under zero load; λ_ji is the concurrency of the i-th container of component j, i.e. the number of requests arriving per second, assumed to follow a Poisson arrival process; T_ji is the service processing time of the i-th container of component j; T_j is the average service processing time of component j; d is the total service delay time; B is the overall service response time; α_j is the correlation coefficient between load and resource utilization for service j. u_j, u_0j, λ_ji, and B are obtained by monitoring; α_j is an empirical value derived from historical data;
Equation (1) states that the preference resource utilization equals the component's own baseline overhead plus the overhead added under load; equation (2) states that the response time equals the total delay plus the total service processing time, where each service processing time equals the component's average service time divided by the unused fraction of the preference resource utilization. The "load-response time" performance equation is thus obtained from the "load-preference resource utilization" relation.
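As an illustration, equations (1) and (2) can be evaluated directly; the component names, loads, and coefficients below are hypothetical, chosen only to show the computation.

```python
# Direct evaluation of equations (1) and (2); all names and numbers are
# hypothetical, for illustration only.

def preference_utilization(u0, alpha, arrivals_and_times):
    """Equation (1): u_j = u_0j + alpha_j * sum_i(lambda_ji * T_ji)."""
    return u0 + alpha * sum(lam * t for lam, t in arrivals_and_times)

def response_time(d, components):
    """Equation (2): B = d + sum_j T_j / (1 - u_j)."""
    return d + sum(t_avg / (1.0 - u) for t_avg, u in components)

# Two components, each with two container instances given as (lambda, T) pairs.
u_web = preference_utilization(u0=0.10, alpha=0.002,
                               arrivals_and_times=[(50, 0.5), (50, 0.5)])
u_db = preference_utilization(u0=0.05, alpha=0.004,
                              arrivals_and_times=[(40, 0.8), (40, 0.8)])
B = response_time(d=0.05, components=[(0.5, u_web), (0.8, u_db)])
```

Note how a small increase in utilization u_j inflates the T_j / (1 − u_j) term nonlinearly, which is why response time degrades sharply as a component saturates.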
The detailed process of the response time predictor is as follows:
(1) In equations (1) and (2), the total delay and the average service processing time of each application component are difficult to predict and are treated as unknowns; the preference resource utilization, the components' own resource overhead, and the load can be observed and are treated as knowns;
(2) Kalman filtering is applied with the unknowns as the prediction matrix and the knowns as the observation matrix;
(3) By assigning values to the unknowns, the predicted response time is obtained.
The Kalman filter equations are as follows:
X(k) = H(k) · X(k−1) + Q_k    (3)
Z(k) = H(k) · X(k) + R_k    (4)
where X(k) is the prediction matrix, the matrix of service processing times and total service-flow delay; Z(k) is the observation matrix, the matrix of application-instance resource utilizations, loads, and response times; Q_k is the process excitation noise covariance matrix, satisfying Q_k ~ N(0, Q); and R_k is the measurement noise covariance matrix, satisfying the Gaussian distribution R_k ~ N(0, R).
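For intuition, a minimal scalar Kalman filter with a random-walk state model can estimate one hidden quantity (such as a service processing time) from noisy observations. This is a generic sketch, not the patent's exact matrix formulation, and all numeric values are illustrative; q and r play the roles of the process and measurement noise covariances in equations (3) and (4).

```python
# Scalar Kalman filter with a random-walk state model, estimating one
# hidden quantity (e.g. a service processing time) from noisy observations.
# All numeric values are illustrative.

def kalman_step(x_est, p_est, z, q, r):
    # Predict: random walk, so the state prediction is the previous estimate.
    x_pred = x_est
    p_pred = p_est + q
    # Update: the residual (innovation) drives the correction.
    residual = z - x_pred
    gain = p_pred / (p_pred + r)
    x_new = x_pred + gain * residual
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new, residual

x, p = 0.0, 1.0                             # initial guess and uncertainty
residuals = []
for z in [0.48, 0.52, 0.50, 0.49, 0.51]:    # noisy observations near 0.5
    x, p, res = kalman_step(x, p, z, q=1e-4, r=1e-2)
    residuals.append(res)
```

The residual sequence collected here is exactly the signal the feedforward controller below analyzes: a residual that stops looking like zero-mean noise indicates the noise covariances need adjusting.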
The detailed process of the feedforward controller, which corrects the unknowns in the Kalman filter and improves the accuracy of response time prediction, is as follows:
(1) Because bursty load is uncertain, the noise covariance matrices of the Kalman filter must be adjusted.
(2) The noise covariance matrices follow Gaussian distributions and can be represented by first-order equations of the standard deviation. The residual in the filtering process (the difference between the actual and predicted values of the observation matrix) indicates how much the filter depends on measured values: when it grows, the load is changing abruptly and the filter's prediction accuracy is declining. Since the residual is related to the standard deviations of the noise covariance matrices, analyzing the variance and mean of the residual and adjusting the noise covariance matrices accordingly preserves the filter's prediction accuracy.
Triangular membership functions are built with Takagi-Sugeno (TS) fuzzy logic, and the resulting error curves are compared horizontally against those of a conventional Kalman filter to assess the feasibility and effectiveness of each data group. Each experimental result is then compared longitudinally against the previous one to determine the better-performing linear combination. After 100 groups of simulation experiments, the rule output data of the fuzzy logic adaptive controller (FLAC) were determined; only two important FLAC rules are listed here:
If and only if the residual variance is "small" and the residual mean is "zero": T = P(r) × 0.3 + 0.8, U = −P(r) × 0.2 + 1.9
If and only if the residual variance is "big" and the residual mean is "small": T = −P(r) × 0.5 + 0.6, U = P(r) × 0.1 + 1.4
The detailed process of the container scheduler is as follows:
(1) Whether the response time violates the application's quality of service is the criterion for container scheduling;
(2) When the response time violates the agreement, the application component whose service time accounts for the highest share of the predicted response time is expanded. When the response time does not violate the agreement and the total resource utilization of a host exceeds 75%, the containers with higher resource occupancy on it are migrated. When the response time does not violate the agreement and the resource utilization of some application component's instances is below the rated value, the component is contracted.
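The three-way decision in (2) can be sketched as a single function. The 75% host threshold follows the text; the SLA value, the rated utilization, and all names are hypothetical.

```python
# Decision logic of the container scheduler: expand on a QoS violation,
# migrate when a host is saturated, contract when a component is
# over-provisioned. The 75% host threshold follows the text; the rated
# utilization default and all names are hypothetical.

def schedule_action(predicted_rt, sla_rt, host_util, min_component_util,
                    rated_util=0.2, host_thresh=0.75):
    if predicted_rt > sla_rt:
        return "expand"       # QoS violated: scale out the bottleneck
    if host_util > host_thresh:
        return "migrate"      # host saturated: move a heavy container
    if min_component_util < rated_util:
        return "contract"     # under-utilized: shrink an instance
    return "none"
```

The ordering matters: a QoS violation always takes precedence, so migration and contraction only run when response time is within the agreement.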
The container resource provisioning method of the present invention comprises the following steps:
Step S01: periodically monitor and collect each container's system resource usage parameters (CPU, memory, disk I/O, network I/O) and the number of user requests per second. Each container's resource preference differs with its service type. The data acquisition phase finally outputs "load-preference resource utilization" data pairs;
Step S02: using the data pairs produced in step S01, build a model from the performance and flow equations of a Jackson open queueing network, characterizing the "load-response time" association under the microservice architecture;
Step S03: take the unknowns in the "load-response time" relation equation of step S02 as the prediction values of a Kalman filter and the knowns as its observations, and build the prediction and observation equations respectively;
Step S04: the Kalman filtering process of step S03 produces residuals between predicted and observed values; the residual is the key parameter for tuning the filter's prediction accuracy. The mean and variance of the residual serve as inputs to the TS fuzzy function, whose outputs are the noise matrices of the Kalman filter;
Step S05: schedule containers according to whether the response time predicted in step S03 violates the application's quality of service, thereby guaranteeing the quality of service experienced by users.
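As a rough sketch, steps S01-S05 can be wired into one control loop. Every callable here is a stub standing in for the corresponding device, so all names and values are illustrative assumptions, not the patent's implementation.

```python
# Steps S01-S05 wired into one control loop; each callable is a stub
# standing in for the corresponding device. All names are illustrative.

def control_loop(monitor, model, predictor, flac, scheduler, sla_rt):
    load, pref_util = monitor()                      # S01: data acquisition
    params = model(load, pref_util)                  # S02: performance modeling
    rt_pred, res_mean, res_var = predictor(params)   # S03: Kalman prediction
    flac(res_mean, res_var)                          # S04: fuzzy feedforward
    return scheduler(rt_pred, sla_rt)                # S05: container scheduling

# One iteration with trivial stubs: a predicted 6 s against a 5 s QoS target.
action = control_loop(
    monitor=lambda: (100, 0.6),
    model=lambda load, u: {"load": load, "u": u},
    predictor=lambda p: (6.0, 0.0, 0.01),
    flac=lambda m, v: None,
    scheduler=lambda rt, sla: "expand" if rt > sla else "none",
    sla_rt=5.0,
)
```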
Compared with the prior art, the advantages of the present invention are:
(1) The present invention proposes an elastic application provisioning framework based on containers, exploiting the lightweight nature of containers to improve the timeliness of resource provisioning.
(2) The present invention exploits the fast convergence of fuzzy adaptive Kalman filtering to improve prediction accuracy under bursty load and the effectiveness of resource provisioning, solving the inability of existing methods to guarantee user quality of service under bursty load scenarios.
Brief description of the drawings
Fig. 1 is a flowchart of the method of the present invention.
Fig. 2 is a flowchart of the container scheduling strategy in the method of the present invention.
Detailed description of the invention
To make the present invention easier to understand, it is further elaborated with an example; this example does not limit the invention in any way.
As shown in Fig. 1, the concrete implementation steps of the present invention are as follows:
Step S01: the preference resource is defined as the resource among CPU, memory, disk read/write, and network communication whose share is relatively high when the container runs. The data acquisition unit in this embodiment collects the preference resource and the number of user requests for each container and outputs the resulting "preference resource-load" data. Because each container differs in service type, its preferred resource type also differs: for a compute-intensive container the preference resource is likely CPU or memory utilization, while for an I/O-intensive container it is likely disk I/O or network I/O.
Step S02: using the "preference resource-load" data from step S01 as input, characterize the association between load and response time via the performance and flow equations of a Jackson open queueing network;
Here, the correspondence between the Jackson open queueing network and the microservice architecture is: (1) under the microservice architecture, application components are independent of one another and communicate through a message bus without shared state, matching the Jackson queueing model's assumption that nodes (application components) are mutually independent with exponentially distributed service; (2) components interact through messages, matching the assumption that the Jackson network is open and that node inputs follow a Poisson distribution; (3) after processing a request, an application component may pass it to the next node or let it leave the network.
A user request hops between nodes, is processed by the relevant application components, and finally returns a response to the user. When an application component has multiple instances, Round-Robin scheduling is adopted. To distinguish the instances of the same component, define: j is an application component and i is an application component instance. Their relation is that a user request flows through multiple application components j1...jn, each application component j containing multiple instances i1...im, and each instance running in a container. Because application components differ in resource preference (e.g. CPU-intensive, I/O-intensive), the containers' preference resources differ; the preference resource is defined as the resource among container CPU, memory, and disk I/O with the highest utilization. u_j ∈ [0, 1) is the preference resource utilization of application component j; u_0j is the preference resource utilization of component j under zero load; λ_ji is the concurrency of the i-th container of component j, i.e. the number of requests arriving per second, following a Poisson arrival process; T_ji is the service processing time of the i-th container of component j; T_j is the average service processing time of component j; d is the total network transmission time of the user request flow f; B is the response time of service flow f; α_j is the correlation coefficient between load and resource utilization for service j. From the Jackson network flow and performance equations:
u_j = u_0j + α_j · Σ_i (λ_ji · T_ji)    (1)

B = d + Σ_j [ T_j / (1 − u_j) ]    (2)
where u_j, u_0j, λ_ji, and B are obtained by monitoring, and α_j is an empirical value derived from historical data; T_ji and d are difficult to monitor and must be estimated by prediction. "Elastic supply" means determining the application's resource demand under the premise that the response time B stays within a relatively fixed interval; T_ji and d are therefore the key elements for elastic scaling.
Step S03: take the unknowns in the "load-response time" relation equation of step S02 as the prediction values of a Kalman filter and the knowns as its observations, and build the prediction and observation equations respectively;
Here, the original Kalman filter equations are:
X(k) = H(k) · X(k−1) + Q_k    (3)
Z(k) = H(k) · X(k) + R_k    (4)
where X(k) is the prediction matrix, the matrix of service processing times and total service-flow delay; Z(k) is the observation matrix, the matrix of application-instance resource utilizations, loads, and response times; Q_k is the process excitation noise covariance matrix and R_k is the measurement noise covariance matrix, satisfying the Gaussian distribution R_k ~ N(0, R). Both noise matrices are normally taken as zero-mean white noise, but load change is often uncertain, with sudden load peaks appearing; so for the system's elastic resource scheduling to be timely, the process excitation and measurement noise covariance matrices should adapt over time. The noise matrices are therefore set to:
Qk=TQ (5)
Rk=UR (6)
Wherein, T, U is the adjusted value of time-varying, can obtain predictive equation as follows:
X ‾ ( k | k - 1 ) = H ( k ) X ‾ ( k - 1 ) - - - ( 7 )
Z ‾ ( k ) = H ( k ) X ‾ ( k ) - - - ( 8 )
WhereinIt is the predictive value of each service processing time and request total delay time,It is the predictive value of resource utilization lamp observed parameter, it is possible to willPredictive value bring formula (1) and (2) into, draw the wave filter predictive value to response time.It is defined as residual error r, representative system model relies on the degree of measured value, the more big then system model of its value is more big to the dependence of measured value, at this moment illustrate that system load is likely to be to jump or the state of bust, service processing time and total delay cannot be carried out Accurate Prediction by wave filter, it is necessary to change Filtering Model.
Step S04: the Kalman filtering process of step S03 produces residuals between predicted and observed values; the mean and variance of the residual serve as inputs to the TS fuzzy function, whose outputs are the noise matrices of the Kalman filter;
Whether the filter needs updating is judged by monitoring the residual. Ideally the residual is zero-mean white noise, i.e. the filter adapts perfectly; if the residual is not zero-mean white noise, the filter's predictions are in error. Since the residual variance and mean are related to Q and R, the values of U and T can be adjusted by estimating the residual variance and mean and then applying fuzzy reasoning, so that the Kalman filtering algorithm adapts to the time-varying structure.
Accordingly, a fuzzy logic adaptive controller (FLAC) is designed to continuously monitor changes in the residual's variance and mean, and then adjust T and U according to fuzzy rules to change the noise matrices, thereby tuning the Kalman filter's mean-square-error matrix so that it always performs optimal estimation and meets the time-varying demand.
A TS fuzzy logic system is adopted, building triangular membership functions and fuzzy rules over the residual variance and mean. For example, if the residual variance keeps increasing and the mean drifts away from zero, the process excitation noise T should be reduced and the measurement noise U increased. A fuzzy rule table is thus established, in which "zero" means T and U need not change, "little" means increase T and decrease U, "big" means decrease T and increase U, and "middle" means increase both T and U.
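A rough rendering of the triangular memberships and the qualitative rule table above: only the adjustment directions for each label follow the text, while the membership breakpoints, the 0.5 firing threshold, and the step size are assumptions made for illustration.

```python
# Triangular memberships plus the qualitative rule table: "zero" leaves
# T and U alone, "little" raises T and lowers U, "big" lowers T and raises
# U, "middle" raises both. Breakpoints, thresholds, and step size are
# assumptions; only the adjustment directions follow the text.

def tri(x, a, b, c):
    """Triangular membership rising on [a, b] and falling on [b, c]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def adjust_TU(T, U, res_var, res_mean, step=0.1):
    var_small = tri(res_var, -0.5, 0.0, 0.5)   # membership of "small variance"
    mean_zero = tri(res_mean, -0.2, 0.0, 0.2)  # membership of "zero mean"
    if var_small > 0.5 and mean_zero > 0.5:
        return T, U                    # "zero": no change needed
    if var_small > 0.5:
        return T + step, U - step      # "little": raise T, lower U
    if mean_zero <= 0.5:
        return T - step, U + step      # "big": lower T, raise U
    return T + step, U + step          # "middle": raise both

T0, U0 = adjust_TU(1.0, 1.0, res_var=0.1, res_mean=0.05)  # healthy residual
T1, U1 = adjust_TU(1.0, 1.0, res_var=1.0, res_mean=0.5)   # bursty load
```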
Following the principle above, the error curves are compared horizontally against those of a conventional Kalman filter to assess the feasibility and effectiveness of each data group; each experimental result is then compared longitudinally against the previous one to determine the better-performing linear combination. After 100 groups of simulation experiments, the FLAC rule output data were determined; only two important FLAC rules are listed here:
If and only if the residual variance is "small" and the residual mean is "zero": T = P(r) × 0.3 + 0.8, U = −P(r) × 0.2 + 1.9
If and only if the residual variance is "big" and the residual mean is "small": T = −P(r) × 0.5 + 0.6, U = P(r) × 0.1 + 1.4
The filter's parameters are dynamically adjusted according to the FLAC rules above to guarantee the validity of the prediction results.
Step S05: schedule containers according to whether the response time predicted in step S03 violates the application's quality of service, thereby guaranteeing the quality of service experienced by users;
Containers are scheduled in real time according to load and resource usage so that the output response time stays smooth. Define the noise parameters of the Kalman filtering algorithm as global variables T and U; define the maximum and minimum resource consumption and the maximum response time of a service instance; define the Kalman filtering function EKF for predicting the response time; define FLAC as the fuzzy function that feeds forward information to T and U, adaptively adjusting the model parameters of EKF; define the ResponseTime function, which takes the prediction output of EKF as its parameter and computes the response time; and define the container migration function Migrate, the container expansion function Expand, and the container contraction function Contract. First the response time prediction and the residual mean and variance are computed with the Kalman filtering function; these two values are then fed into the fuzzy logic function, which feedforward-adjusts the Kalman filter parameters; finally the application's response time is computed. Three container scheduling strategies follow from whether the response time violates the agreement, as shown in Fig. 2:
(1) Container migration. Its trigger is that a container's own resource usage has reached its resource limit while the host's total resources are approaching the upper threshold, so expansion is not needed; some containers merely need to migrate to other nodes. The migration process is roughly as follows: the container is first persisted as an image, which preserves the current application state; a container is generated from the image on another cluster node; and the controller deletes the original container. During migration this strategy moves the container with the lowest resource utilization and checks whether the host's resource utilization has fallen below the threshold; if it is still above, the lowest-utilization container is migrated again until the condition is satisfied.
(2) Container expansion. Its trigger is that the response time violates the quality of service, and migration alone cannot solve the problem; to guarantee this container's average response time, it must be expanded. The expansion process is roughly as follows: the container is persisted as an image, containers are generated from the image on other nodes, and the extension is completed through load balancing and similar means. During expansion, the containers with higher average service time are expanded to 2 times their count, and the instance counts of the remaining containers increase by one.
(3) Container contraction. Its trigger is that the resource usage of each of an application's instances is below the estimated value, so the application's instance count must be cut. During contraction, the instance count of containers below the threshold is reduced by one; if there is only one container instance, nothing is done.
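The three strategies of Fig. 2 can be sketched as follows. Containers are plain dicts; the 75% host threshold and the 2x expansion factor follow the text, while the data layout and all names are illustrative assumptions.

```python
# The three scheduling strategies of Fig. 2: migrate until the host is
# below threshold, expand the slowest group, contract but keep at least
# one instance. Data layout and names are illustrative.

def migrate(host_util, containers, host_thresh=0.75):
    """Move off the lowest-utilization containers until the host drops
    below the threshold; returns the migrated names and remaining load."""
    moved = []
    remaining = sorted(containers, key=lambda c: c["util"])
    while host_util > host_thresh and remaining:
        victim = remaining.pop(0)
        host_util -= victim["util"]
        moved.append(victim["name"])
    return moved, host_util

def expand(groups):
    """Double the instance count of the slowest group; others grow by one."""
    slowest = max(groups, key=lambda g: g["avg_service_time"])
    return {g["name"]: g["count"] * 2 if g is slowest else g["count"] + 1
            for g in groups}

def contract(count):
    """Drop one instance, but never go below a single instance."""
    return count - 1 if count > 1 else count
```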
The parts of the present invention not elaborated here belong to techniques well known in the art.

Claims (6)

1. A container-level elastic resource provisioning system for microservice architectures, characterized by comprising: a data acquisition unit, a performance modeler, a feedforward controller, a response time predictor, and a container scheduler, wherein:
the data acquisition unit periodically monitors container system resources (CPU, memory, disk I/O, and network I/O) and the number of user requests per second; it is also responsible for building the "load-preference resource utilization" relation: because containers differ in resource preference, the data acquisition unit periodically takes the peak among the four parameters CPU utilization, memory utilization, disk I/O, and network I/O, defines that peak as the preference resource utilization, and then constructs the "load-preference resource utilization" relation;
the performance modeler, using the data obtained by the data acquisition unit, builds a model from the performance and flow equations of a Jackson open queueing network and, given the preference resource utilization, constructs the "load-response time" association equation under the microservice architecture;
the response time predictor adopts Kalman filtering, taking the unknown variables in the "load-response time" association equation under the microservice architecture as the prediction matrix and the known variables as the observation matrix; by assigning values to the unknown variables it estimates the response time, and it passes the variance and mean of the estimated and observed values to the feedforward controller;
the feedforward controller analyzes the variance and mean of the residual and obtains, via fuzzy logic, calibration values for the filter parameters (the values of the unknown variables in the "load-response time" association equation), the purpose being to correct the unknown variables so that the estimated response time is more accurate;
the container scheduler performs on-demand supply of container-level resources according to whether the predicted response time violates the application's quality of service.
2. The container-level elastic resource provisioning system for a microservice architecture according to claim 1, characterized in that the performance modeler builds the "load-response time" association equation under the microservice architecture as follows:

u_j = u_0j + α_j · Σ_i (λ_ji · T_ji)    (1)

B = d + Σ_j T_j / (1 − u_j)    (2)

where j denotes an application component and i an application component instance; their relation is that a user request flows through multiple application components j_1…j_n, each application component j contains multiple instances i_1…i_m, and each instance runs in a container; u_j ∈ [0, 1) is the preferred resource utilization of application component j; u_0j is the preferred resource utilization of component j under zero load; λ_ji is the concurrency of the i-th container of component j, i.e. the number of requests arriving per second, which follows a Poisson arrival process; T_ji is the service processing time of the i-th container of component j; T_j is the average service processing time of component j; d is the total service delay; B is the total service response time; α_j is the correlation coefficient between the load and the resource utilization of component j; u_j, u_0j, λ_ji, and B are obtained by monitoring, while α_j is an empirical value derived from historical data;

equation (1) states that the preferred resource utilization equals the component's own baseline overhead plus the overhead added under load; equation (2) states that the response time equals the total delay plus the total service processing time, where each service processing time equals the component's average service time divided by the unused fraction of the preferred resource utilization; the "load-response time" performance equation can thus be obtained from the "load-preferred resource utilization" relation.
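A small numeric illustration of equations (1) and (2); all component values (baseline utilizations, correlation coefficients, arrival rates, and service times) are made-up assumptions, not measurements from the patent.

```python
# Equations (1) and (2) evaluated on invented sample data.

def preferred_utilization(u0, alpha, arrivals_times):
    """Equation (1): u_j = u_0j + alpha_j * sum_i(lambda_ji * T_ji)."""
    return u0 + alpha * sum(lam * t for lam, t in arrivals_times)

def response_time(d, components):
    """Equation (2): B = d + sum_j T_j / (1 - u_j)."""
    return d + sum(t / (1.0 - u) for t, u in components)

# Two components, each with two container instances (lambda, T) pairs.
u1 = preferred_utilization(0.05, 0.01, [(20, 0.010), (30, 0.012)])
u2 = preferred_utilization(0.10, 0.02, [(25, 0.020), (25, 0.020)])
B = response_time(0.005, [(0.011, u1), (0.020, u2)])
```

Note that as u_j approaches 1 the term T_j / (1 − u_j) blows up, which is exactly the saturation behavior the scheduler reacts to.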
3. The container-level elastic resource provisioning system for a microservice architecture according to claim 1, characterized in that the Kalman filter built in the response time predictor is as follows:

X(k) = H(k) · X(k−1) + Q_k    (3)

Z(k) = H(k) · X(k) + R_k    (4)

where X(k) is the prediction matrix, representing the service processing times and the total service flow delay; Z(k) is the observation matrix, representing the resource utilization, load, and response time of the application instances; Q_k is the process noise covariance matrix, and R_k is the measurement noise covariance matrix.
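A minimal scalar Kalman filter in the spirit of equations (3) and (4), with the state standing in for the service-time estimate and the observation for the measured response time. The noise values q and r, the transition/observation factor h, and the measurement sequence are illustrative assumptions, not the patent's matrices.

```python
# One-dimensional Kalman predict/update cycle; q and r play the role
# of the process and measurement noise covariances Q(k) and R(k).

def kalman_step(x, p, z, q=1e-4, r=1e-2, h=1.0):
    """Return the updated state estimate and error covariance."""
    # predict: X(k) = H(k) * X(k-1), with process noise q inflating p
    x_pred = h * x
    p_pred = h * p * h + q
    # update using the residual (innovation) z - h * x_pred
    k = p_pred * h / (h * p_pred * h + r)
    x_new = x_pred + k * (z - h * x_pred)
    p_new = (1.0 - k * h) * p_pred
    return x_new, p_new

# Track a response-time estimate over a few noisy observations.
x, p = 0.05, 1.0
for z in [0.052, 0.049, 0.055, 0.051]:
    x, p = kalman_step(x, p, z)
```

The residual z − h·x_pred computed here is exactly the quantity whose mean and variance the feedforward controller feeds into the fuzzy calibration.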
4. The container-level elastic resource provisioning system for a microservice architecture according to claim 1, characterized in that the process by which the feedforward controller obtains calibrated values of the Kalman filter parameters through fuzzy logic is: triangular membership functions are built with Takagi-Sugeno (TS) fuzzy logic; the resulting error curve is compared horizontally against the error curve of a conventional Kalman filter to judge the feasibility and effectiveness of that group of data; each experimental result is then compared longitudinally with the preceding one to determine the linear combination with the better effect; finally, after up to hundreds of simulation experiments, the rule output data of the fuzzy logic controller, i.e. the calibrated values of the Kalman filter parameters, are determined.
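As a toy sketch of this TS-style calibration, the residual mean and variance can be fuzzified with triangular membership functions and the rule outputs combined into a scalar correction for the filter's noise parameter. Every breakpoint and rule constant below is invented; the patent's actual rule base is determined by the simulation experiments described in the claim.

```python
# Toy Takagi-Sugeno calibration: small residuals keep the measurement
# noise R(k) as-is, large residuals inflate it. All constants invented.

def tri(x, a, b, c):
    """Triangular membership function peaking at b on the support [a, c]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def calibrate(residual_mean, residual_var):
    # degree to which the residual statistics are "small" or "large"
    small = tri(abs(residual_mean), -0.01, 0.0, 0.02) * tri(residual_var, -0.01, 0.0, 0.05)
    large = tri(abs(residual_mean), 0.01, 0.05, 0.1)
    # TS rule consequents: multiplicative corrections for R(k)
    weights = [(small, 1.0), (large, 5.0)]
    num = sum(w * out for w, out in weights)
    den = sum(w for w, _ in weights) or 1.0
    return num / den
```

The weighted-average defuzzification (num / den) is the standard TS output form; here it interpolates between "leave R(k) alone" (1.0) and "inflate R(k) fivefold" (5.0).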
5. The container-level elastic resource provisioning system for a microservice architecture according to claim 1, characterized in that the container scheduler operates as follows:
when the predicted response time violates the service agreement, the component whose service time accounts for the highest proportion of the predicted response time is expanded; when the predicted response time does not violate the agreement and the total resource utilization of the host exceeds 75%, the container with the highest resource occupancy on it is migrated; when the predicted response time does not violate the agreement and the resource utilization of the instances of an application component falls below the rated value, the component is shrunk.
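The three mutually exclusive rules above can be written as one decision function. The 75% host threshold comes from the claim; the 0.2 rated utilization is an assumed placeholder.

```python
# Scheduler decision logic from claim 5, checked in priority order.

def decide(predicted_violation, host_utilization, instance_utilization,
           rated=0.2, host_threshold=0.75):
    if predicted_violation:
        return "expand"   # scale the component with the largest share of B
    if host_utilization > host_threshold:
        return "migrate"  # move the highest-occupancy container away
    if instance_utilization < rated:
        return "shrink"   # drop one instance (never below one)
    return "none"
```

Ordering matters: an SLA violation always takes priority, so a loaded host is only relieved by migration when the response time itself is still acceptable.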
6. A container-level elastic resource provisioning method for a microservice architecture, characterized by comprising the following steps:
Step S01: periodically collect the CPU, memory, disk I/O, and network I/O resource usage parameters of each container, together with the number of user requests per second; containers differ in their resource preferences because of their different service types; the data collection phase ultimately generates "load-preferred resource utilization" data pairs;
Step S02: using the data pairs produced in step S01, model with the performance and flow equations of a Jackson open queueing network and build the "load-response time" association under the microservice architecture;
Step S03: take the unknowns in the "load-response time" association equation of step S02 as the prediction values of a Kalman filter and the knowns as its observations, and build the prediction and observation equations respectively;
Step S04: the Kalman filter computation in step S03 produces residuals between the prediction values and the observations, and these residuals are the key parameters for adjusting the prediction accuracy of the Kalman filter; take the mean and variance of the residuals as the inputs of a TS fuzzy function, whose output is the noise matrix of the Kalman filter;
Step S05: take whether the predicted response time obtained in step S03 violates the application's quality of service as the basis for managing container resources, thereby guaranteeing the quality of service of the user's application.
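Steps S01-S05 form a feedback control loop. The schematic below wires stand-in callables together; every function body is an assumption supplied by the caller, and only the step ordering follows the claim.

```python
# One iteration of the S01-S05 control loop with pluggable stages.

def control_loop(collect, model, predict, calibrate, schedule, sla):
    metrics = collect()                      # S01: load / preferred-utilization pairs
    equation = model(metrics)                # S02: "load-response time" relation
    estimate, residuals = predict(equation)  # S03: Kalman prediction + residuals
    noise_params = calibrate(residuals)      # S04: fuzzy calibration of the noise matrix
    if estimate > sla:                       # S05: act only on a predicted violation
        schedule("expand")
    return estimate, noise_params
```

In a real deployment this loop would run on a fixed monitoring period, with the calibrated noise parameters fed back into the next iteration's predictor.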
CN201510974291.9A 2015-12-22 2015-12-22 A kind of container levels flexible resource feed system and method towards micro services framework Active CN105631196B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510974291.9A CN105631196B (en) 2015-12-22 2015-12-22 A kind of container levels flexible resource feed system and method towards micro services framework


Publications (2)

Publication Number Publication Date
CN105631196A true CN105631196A (en) 2016-06-01
CN105631196B CN105631196B (en) 2018-04-17

Family

ID=56046125

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510974291.9A Active CN105631196B (en) 2015-12-22 2015-12-22 A kind of container levels flexible resource feed system and method towards micro services framework

Country Status (1)

Country Link
CN (1) CN105631196B (en)

Cited By (36)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106227605A (en) * 2016-07-26 2016-12-14 北京北森云计算股份有限公司 The dynamic micro services expansion method of a kind of multilingual cloud compiling and device
CN106330576A (en) * 2016-11-18 2017-01-11 北京红马传媒文化发展有限公司 Automatic scaling and migration scheduling method, system and device for containerization micro-service
CN106533929A (en) * 2016-12-30 2017-03-22 北京中电普华信息技术有限公司 Micro-service development platform, generation method, deployment method and device
CN107231291A (en) * 2017-04-24 2017-10-03 全球能源互联网研究院 A kind of micro services partition method and device suitable for electric network information physical system
CN107992353A (en) * 2017-07-31 2018-05-04 南京邮电大学 A kind of container dynamic migration method and system based on minimum transition amount
CN108206852A (en) * 2016-12-20 2018-06-26 杭州华为数字技术有限公司 A kind of dialogue-based Service Instance management method and equipment under micro services frame
CN108228347A (en) * 2017-12-21 2018-06-29 上海电机学院 The Docker self-adapting dispatching systems that a kind of task perceives
CN108234589A (en) * 2016-12-22 2018-06-29 瞻博网络公司 The auto zoom of micro services application
CN108521352A (en) * 2018-03-26 2018-09-11 天津大学 Online cloud service tail delay prediction method based on stochastic reward net
CN108664378A (en) * 2018-05-10 2018-10-16 西安电子科技大学 A kind of most short optimization method for executing the time of micro services
CN108737224A (en) * 2017-09-28 2018-11-02 新华三技术有限公司 A kind of message processing method and device based on micro services framework
CN108804154A (en) * 2018-05-31 2018-11-13 天津大学 A kind of automatic Synergistic methods of APP based on client micro services
CN109167826A (en) * 2018-08-20 2019-01-08 中软信息系统工程有限公司 The restocking method, apparatus and system of WEB application
CN109246167A (en) * 2017-07-11 2019-01-18 阿里巴巴集团控股有限公司 A kind of container dispatching method and device
CN109417564A (en) * 2016-07-22 2019-03-01 英特尔公司 Technology for load of being assigned the job based on the utilization of resources stage
CN109478146A (en) * 2016-07-07 2019-03-15 思科技术公司 System and method for application container of stretching in cloud environment
CN109561134A (en) * 2018-10-26 2019-04-02 平安科技(深圳)有限公司 Electronic device, distributed type assemblies service distribution method and storage medium
CN109617738A (en) * 2018-12-28 2019-04-12 优刻得科技股份有限公司 Method, system and the non-volatile memory medium of the scalable appearance of user service
US10409664B2 (en) 2017-07-27 2019-09-10 International Business Machines Corporation Optimized incident management using hierarchical clusters of metrics
CN110837408A (en) * 2019-09-16 2020-02-25 中国科学院软件研究所 High-performance server-free computing method and system based on resource cache
CN111130908A (en) * 2019-12-31 2020-05-08 中信百信银行股份有限公司 Micro-service dynamic aggregation and splitting system based on calling flow analysis and prediction
CN111181773A (en) * 2019-12-13 2020-05-19 西安交通大学 Delay prediction method for multi-component application of heterogeneous border cloud collaborative intelligent system
CN111274111A (en) * 2020-01-20 2020-06-12 西安交通大学 Prediction and anti-aging method for microservice aging
CN111343219A (en) * 2018-12-18 2020-06-26 同方威视技术股份有限公司 Computing service cloud platform
CN111431925A (en) * 2020-04-02 2020-07-17 中国工商银行股份有限公司 Message data processing method and device
CN111522664A (en) * 2020-04-24 2020-08-11 中国工商银行股份有限公司 Service resource management and control method and device based on distributed service
CN111611073A (en) * 2020-05-08 2020-09-01 暨南大学 Container placement method based on flow sensing under containerized data center
CN111638959A (en) * 2020-06-02 2020-09-08 山东汇贸电子口岸有限公司 Elastic expansion method based on load regression prediction in cloud environment and readable storage medium
CN112181664A (en) * 2020-10-15 2021-01-05 网易(杭州)网络有限公司 Load balancing method and device, computer readable storage medium and electronic equipment
WO2021073085A1 (en) * 2019-10-14 2021-04-22 上海交通大学 Microservice-oriented nanosecond-level power resource allocation method and system
CN113064712A (en) * 2021-04-16 2021-07-02 上海交通大学 Micro-service optimization deployment control method, system and cluster based on cloud edge environment
CN113450096A (en) * 2021-06-25 2021-09-28 未鲲(上海)科技服务有限公司 Resource transfer data processing method and device, electronic equipment and medium
WO2022111156A1 (en) * 2020-11-24 2022-06-02 International Business Machines Corporation Automated orchestration of containers by assessing microservices
CN115022311A (en) * 2022-03-23 2022-09-06 北京邮电大学 Selection method and device of micro-service container instances
CN116627660A (en) * 2023-07-24 2023-08-22 湖北省楚天云有限公司 Micro-service resource allocation method based on cloud data center
CN116932233A (en) * 2023-09-19 2023-10-24 金网络(北京)数字科技有限公司 Micro-service architecture of intelligent contract

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101382912A (en) * 2008-09-02 2009-03-11 中国科学院软件研究所 Method for optimizing resource sensitive property orienting application server and system thereof
CN102130938A (en) * 2010-12-03 2011-07-20 中国科学院软件研究所 Resource supply method oriented to Web application host platform
US20120291088A1 (en) * 2011-05-10 2012-11-15 Sybase, Inc. Elastic resource provisioning in an asymmetric cluster environment
CN103064820A (en) * 2012-12-26 2013-04-24 无锡江南计算技术研究所 Cluster calculating system based on reconfigurable micro-server
CN103076849A (en) * 2012-12-26 2013-05-01 无锡江南计算技术研究所 Reconfigurable micro server system
CN105162884A (en) * 2015-09-25 2015-12-16 浪潮(北京)电子信息产业有限公司 Cloud management platform based on micro-service architecture


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Guo Dong et al.: "A Novel Cloudware Based on Microservice Architecture", Technology Research *




Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant