CN109617738B - Method, system and non-volatile storage medium for user service scaling - Google Patents


Info

Publication number
CN109617738B
Authority
CN
China
Prior art keywords
service
classification model
data
training
unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811619760.5A
Other languages
Chinese (zh)
Other versions
CN109617738A (en)
Inventor
王远大
宋翔
郑健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ucloud Technology Co ltd
Original Assignee
Ucloud Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ucloud Technology Co ltd
Priority to CN201811619760.5A
Publication of CN109617738A
Application granted
Publication of CN109617738B
Legal status: Active

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L 41/08: Configuration management of networks or network elements
    • H04L 41/0896: Bandwidth or capacity management, i.e. automatically increasing or decreasing capacities
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/24: Classification techniques
    • G06F 18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2413: Classification techniques relating to the classification model, based on distances to training or reference patterns
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 43/00: Arrangements for monitoring or testing data switching networks
    • H04L 43/08: Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L 43/0805: Monitoring or testing based on specific metrics, by checking availability
    • H04L 43/0817: Monitoring or testing based on specific metrics, by checking availability by checking functioning
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Environmental & Geological Engineering (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention provides a method, a system and a non-volatile storage medium for service scaling. The method comprises the following steps: a request forwarding step of forwarding a service request from a user; a data acquisition step of receiving the service request, calculating service performance data from the service request, and collecting hardware data; a conversion step of performing format conversion on the service performance data and the hardware data to obtain training data; a decision step of acquiring a trained current classification model and inputting the training data into the current classification model to obtain a scaling decision; a scaling step of performing service scaling according to the scaling decision; and a training step of training the current classification model with the training data to obtain an updated classification model, which then serves as the current classification model.

Description

Method, system and non-volatile storage medium for user service scaling
Technical Field
The invention relates to a method, a system and a non-volatile storage medium for service scaling.
Background
In order to attract more users, major cloud providers have each built their own API (application programming interface) service marketplaces, such as the Alibaba Cloud API market and the Baidu API store. The automatic scaling service that cloud providers currently offer mainly increases or decreases the number of allocated machines according to hardware resource usage, such as CPU and memory utilization, so as to reduce the waste of hardware resources. The resource demand of a user's API service must be estimated by the user and allocated in advance. In addition, the user must monitor resource utilization through an alarm system and manually add resources when they become insufficient.
The current scaling schemes of cloud providers mainly add and remove hardware resources in order to improve resource utilization; the availability and performance of the user's API service must be ensured by the user. API services differ in performance characteristics, and it can happen that resource utilization is low while service response time grows, in which case parallel expansion of the service, rather than contraction, is usually required. Therefore, high availability and high performance of user services cannot be guaranteed simply from hardware resource usage such as CPU occupancy and memory consumption. The user side must add its own monitoring and manual expansion, which raises the user's integration cost and lowers the degree of automation.
In addition, existing scaling policies are too simple to make optimal decisions. There are many kinds of hardware resources and many quality-of-service metrics; if scaling is driven by simple logical rules over only a few of these indicators, an optimal policy is hard to guarantee and resource waste is inevitable.
Meanwhile, existing scaling policies are static: once the system is designed, the policy remains unchanged and cannot adapt to the current situation.
Disclosure of Invention
The invention provides a method for service scaling, the method comprising:
a request forwarding step of forwarding a service request from a user;
a data acquisition step of receiving the service request, calculating service performance data from the service request, and collecting hardware data;
a conversion step of performing format conversion on the service performance data and the hardware data to obtain training data;
a decision step of acquiring a trained current classification model and inputting the training data into the current classification model to obtain a scaling decision;
a scaling step of performing service scaling according to the scaling decision;
and a training step of training the current classification model with the training data to obtain an updated classification model, which then serves as the current classification model.
In the data acquisition step, a user service unit receives the service request and calculates the service performance data from it.
The training data conforms to the format used for training in the training step.
The updated classification model is obtained the next time the decision step is performed.
In the request forwarding step, a routing unit forwards the service request, and service scaling means scaling the routing unit and/or the user service unit.
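Taken together, these steps form a closed loop: measure, convert, classify, scale, retrain. A minimal runnable sketch of that loop follows; all function and field names are hypothetical illustrations, not taken from the patent.

```python
def compute_performance_data(requests):
    """Data acquisition step: derive QPS, mean response time and failure
    rate from one second's worth of service requests."""
    n = len(requests)
    return {
        "qps": n,
        "resp_time": sum(r["latency"] for r in requests) / n,
        "fail_rate": sum(r["failed"] for r in requests) / n,
    }

def convert(perf, hw):
    """Conversion step: fixed-order feature vector in the training format."""
    return [perf["qps"], perf["resp_time"], perf["fail_rate"],
            hw["cpu"], hw["mem"]]

def one_cycle(requests, hw, model):
    """Decision step: run one pass of the loop and return the scaling
    decision together with the training row fed back to the trainer."""
    features = convert(compute_performance_data(requests), hw)
    return model(features), features
```

Here `model` stands in for the trained classification model of the decision step; the training step would consume the returned feature row to produce the updated model.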
The present invention also provides a system for service scaling, the system comprising:
a routing unit for forwarding a service request from a user;
a data acquisition unit for receiving the service request, calculating service performance data from the service request, and collecting hardware data;
a conversion unit for performing format conversion on the service performance data and the hardware data to obtain training data;
a decision unit for acquiring a trained current classification model and inputting the training data into the current classification model to obtain a scaling decision;
a scaling unit for performing service scaling according to the scaling decision;
and a training unit for training the current classification model with the training data to obtain an updated classification model, which then serves as the current classification model.
The data acquisition unit further comprises:
a user service unit, which receives the service request and calculates the service performance data from it;
and a collection unit, which collects the hardware data.
The training data conforms to the format used for training in the training unit.
The updated classification model is obtained when the decision unit makes its next decision.
Service scaling means scaling at least one routing unit and/or user service unit.
The present invention further provides a non-volatile storage medium on which a program for service scaling is stored, the program being executed by a computer to implement the method of service scaling, the program comprising:
a request forwarding instruction for forwarding a service request from a user;
a data acquisition instruction for receiving the service request, calculating service performance data from the service request, and collecting hardware data;
a conversion instruction for performing format conversion on the service performance data and the hardware data to obtain training data;
a decision instruction for acquiring a trained current classification model and inputting the training data into the current classification model to obtain a scaling decision;
a scaling instruction for performing service scaling according to the scaling decision;
and a training instruction for training the current classification model with the training data to obtain an updated classification model, which then serves as the current classification model.
The invention ensures high availability and high performance of user services, reduces the user's integration cost and difficulty, raises the degree of automation, keeps the scaling policy close to optimal, and prevents waste of hardware resources.
Drawings
Fig. 1 is a block diagram of a system for service scaling according to an embodiment of the present invention;
Fig. 2 is a flow diagram of a method for service scaling according to an embodiment of the present invention.
Detailed Description
Embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
Fig. 1 is a block diagram of a system 1 for service scaling according to an embodiment of the present invention. As shown in Fig. 1, the system 1 includes a routing unit 11, a data acquisition unit 12, a conversion unit 13, a training unit 14, a decision unit 15, and a scaling unit 16. The system 1 may include one or more routing units 11 and one or more data acquisition units 12; for simplicity, only one of each is shown in Fig. 1.
Fig. 2 is a flow diagram of a method for service scaling according to an embodiment of the present invention. An embodiment of the present invention will be described in detail below with reference to fig. 1 and 2.
As shown in Fig. 2, in the request forwarding step S21, the routing unit 11 forwards the service request from the user. The routing unit 11 forwards the service request to the corresponding data acquisition unit 12 according to the content of the request.
In the data acquisition step S22, the corresponding data acquisition unit 12 receives the service request, calculates service performance data from the service request, and collects hardware data.
The data acquisition unit 12 includes a user service unit 121 and a collection unit 122. The user service unit 121 receives the service request and calculates service performance data from it, for example the service's queries per second (QPS), its response time, and its failure rate. The user service unit 121 transmits the calculated service performance data to the conversion unit 13.
Each data acquisition unit 12 includes a user service unit 121 and a collection unit 122 corresponding to it, and the two are located on the same physical machine. The collection unit 122 collects hardware data of that physical machine, i.e. the usage of its hardware resources, such as CPU, GPU (graphics processing unit) and memory utilization, network bandwidth, and so on. The collection unit 122 transmits the currently collected hardware data to the conversion unit 13.
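A collection unit of this kind can be approximated with ordinary operating-system interfaces. The sketch below samples CPU load and memory usage on Linux; it is a hedged illustration only, since the patent does not prescribe how the hardware data are gathered.

```python
import os
import time

def collect_hardware_data():
    """Sample hardware usage of the local machine (Linux only: reads
    /proc/meminfo and the 1-minute load average)."""
    load1, _, _ = os.getloadavg()
    meminfo = {}
    with open("/proc/meminfo") as f:
        for line in f:
            key, value = line.split(":", 1)
            meminfo[key] = int(value.split()[0])  # values are in kB
    mem_used = 1.0 - meminfo["MemAvailable"] / meminfo["MemTotal"]
    return {"load1": load1, "mem_used_frac": mem_used, "ts": time.time()}
```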
In the conversion step S23, the conversion unit 13 performs format conversion on the service performance data received from the user service unit 121 and the hardware data received from the collection unit 122 to obtain training data, whose format conforms to the format used for training in the training unit 14. The conversion unit 13 passes the training data to the decision unit 15 and the training unit 14.
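The conversion unit's job reduces to flattening the two metric dictionaries into the fixed field order that the training unit expects. A minimal sketch, with hypothetical field names:

```python
# Field order must match the format the training unit was built for.
FEATURE_ORDER = ["qps", "resp_time", "fail_rate", "cpu", "mem"]

def to_training_row(perf, hw):
    """Merge performance and hardware metrics into one ordered float row,
    rejecting input that does not match the training format."""
    merged = {**perf, **hw}
    missing = [k for k in FEATURE_ORDER if k not in merged]
    if missing:
        raise ValueError(f"metrics missing fields: {missing}")
    return [float(merged[k]) for k in FEATURE_ORDER]
```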
In the decision step S24, the decision unit 15 obtains the trained current classification model (e.g., classification model A) from the training unit 14 and inputs the training data into it, yielding a scaling decision.
The current classification model is a model trained with a machine learning algorithm on historical data that the training unit 14 received before the current training data was acquired. This historical training data may be training data previously acquired and converted from the various data acquisition units 12 in the system 1.
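The patent does not name a specific algorithm. As one concrete possibility in the spirit of the distance-based classification class cited in the header (G06F 18/2413), a nearest-centroid classifier over historical feature rows could look like the following sketch; the labels and feature shapes are hypothetical.

```python
def train_centroids(samples):
    """samples: list of (feature_vector, label) pairs, e.g. label in
    {'scale_out', 'hold', 'scale_in'}. Returns a label -> centroid map."""
    sums, counts = {}, {}
    for vec, label in samples:
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, x in enumerate(vec):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {lab: [x / counts[lab] for x in acc] for lab, acc in sums.items()}

def predict(centroids, vec):
    """Scaling decision = label of the nearest centroid (squared Euclidean
    distance to each class's reference pattern)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(c, vec))
    return min(centroids, key=lambda lab: dist2(centroids[lab]))
```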
Next, in the scaling step S25, the scaling unit 16 performs service scaling according to the scaling decision, i.e. it scales the routing unit 11 and/or the user service unit 121. Specifically, the scaling unit 16 determines from the scaling decision whether to add or remove routing units 11 and/or user service units 121 in the system 1.
In the training step S26, the training unit 14 trains classification model A using the training data transmitted from the conversion unit 13 to obtain an updated classification model (for example, classification model B), and sets classification model B as the current classification model.
Before the training data is received, classification model A, trained on historical data with the machine learning algorithm, is stored in the training unit 14. At this point the decision unit 15 obtains classification model A and derives the scaling decision described above.
After receiving the training data, the training unit 14 uses the machine learning algorithm to train (adapt) classification model A again with the training data, for example yielding a better-optimized classification model B. Since training takes time, the decision unit 15 uses classification model A for the current decision, and obtains and uses the optimized classification model B for the next decision. In this way, the current classification model is continually retrained on training data received in real time, yielding a continuously optimized model and keeping the decisions made by the decision unit 15 close to optimal.
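The "decide with model A now, switch to the retrained model B next time" behaviour can be sketched as a small holder object. This is a hedged illustration; the retraining function here is a placeholder for whatever algorithm the training unit actually runs.

```python
class ModelManager:
    """Holds the current classification model; each cycle decides with the
    current model, then swaps in the freshly retrained one for next time."""

    def __init__(self, initial_model):
        self.current = initial_model  # "classification model A"

    def decide_and_retrain(self, features, retrain):
        decision = self.current(features)               # decide with A
        self.current = retrain(self.current, features)  # B used next cycle
        return decision
```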
In the invention, service performance data are derived in real time from current service requests, hardware resource usage is collected in real time, and both are fed into a pre-trained machine learning classification model that decides whether to scale and by how much. Introducing service performance data ensures high availability and high performance of user services, which reduces the user's integration cost and difficulty and raises the degree of automation.
In addition, after receiving new training data, the training unit 14 retrains the current classification model with it to obtain an optimized, updated classification model, which is then used for the next scaling decision. The classification model of the invention is therefore not fixed: it is optimized (adapted) in real time from new training data, which keeps the scaling policy close to optimal and prevents waste of hardware resources.
Preferably, the system 1 of the invention can be implemented as a container cluster managed by Kubernetes (K8s, an existing container management system), and the scaling unit can be implemented by the API server in K8s.
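When the scaling unit is realized through the Kubernetes API server, a scaling decision ultimately becomes a PATCH of a workload's Scale subresource. A minimal sketch of that mapping follows; the deployment name and the +/-1 policy are hypothetical, while the `{"spec": {"replicas": N}}` body is the standard Scale subresource shape.

```python
import json

def target_replicas(current, decision):
    """Map a classifier decision to a replica count (simple +/-1 policy)."""
    delta = {"scale_out": 1, "hold": 0, "scale_in": -1}[decision]
    return max(1, current + delta)  # never scale below one replica

def scale_patch(replicas):
    """JSON body for e.g.
    PATCH /apis/apps/v1/namespaces/default/deployments/user-service/scale
    (namespace and deployment name are hypothetical)."""
    return json.dumps({"spec": {"replicas": replicas}})
```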
The present invention also provides a non-volatile storage medium on which a program for service scaling is stored, the program being executed by a computer to implement the method of service scaling, the program comprising:
a request forwarding instruction for forwarding a service request from a user;
a data acquisition instruction for receiving the service request, calculating service performance data from the service request, and collecting hardware data;
a conversion instruction for performing format conversion on the service performance data and the hardware data to obtain training data;
a decision instruction for acquiring a trained current classification model and inputting the training data into the current classification model to obtain a scaling decision;
a scaling instruction for performing service scaling according to the scaling decision;
and a training instruction for training the current classification model with the training data to obtain an updated classification model, which then serves as the current classification model.
While the present invention has been described in conjunction with specific embodiments, it is evident that many alternatives, modifications, and variations will be apparent to those skilled in the art in light of the foregoing description. Accordingly, it is intended that such alternatives, modifications, and variations be included within the spirit and scope of the appended claims.

Claims (11)

1. A method for API service scaling, the method comprising:
a request forwarding step of forwarding a service request from a user;
a data acquisition step, namely receiving the service request, calculating service performance data according to the service request, and collecting hardware data;
a conversion step, performing format conversion on the service performance data and the hardware data to obtain training data;
a decision step, namely acquiring a trained current classification model, and inputting the training data into the current classification model to obtain a scaling decision;
a capacity expansion and reduction step, namely utilizing the capacity expansion and reduction decision to perform service capacity expansion and reduction;
and a training step, namely training the current classification model by using the training data to obtain an updated classification model, and taking the updated classification model as the current classification model.
2. The method of claim 1, wherein in the data obtaining step, the user service unit receives the service request and calculates service performance data according to the service request.
3. The method of claim 2, wherein the training data conforms to a format in which training is performed in the training step.
4. The method of claim 3, wherein the updated classification model is obtained the next time the decision step is performed.
5. The method according to any of claims 2-4, wherein in the request forwarding step, a routing unit forwards the service request,
wherein the service scaling is scaling the routing unit and/or the user service unit.
6. A system for API service scaling, the system comprising:
a routing unit for forwarding a service request from a user;
the data acquisition unit is used for receiving the service request, calculating service performance data according to the service request and collecting hardware data;
the conversion unit is used for carrying out format conversion on the service performance data and the hardware data to obtain training data;
the decision unit is used for acquiring a trained current classification model and inputting the training data into the current classification model to obtain a scaling decision;
the capacity expansion and reduction unit is used for carrying out service capacity expansion and reduction by utilizing the capacity expansion and reduction decision;
and the training unit is used for training the current classification model by using the training data to obtain an updated classification model, and taking the updated classification model as the current classification model.
7. The system of claim 6, wherein the data acquisition unit further comprises:
the user service unit receives the service request and calculates the service performance data according to the service request;
and the collecting unit is used for collecting the hardware data.
8. The system of claim 7, wherein the training data conforms to a format for training in the training unit.
9. The system of claim 8, wherein the updated classification model is obtained when the next decision is made by the decision unit.
10. A system according to any of claims 7-9, characterized in that said service scaling is scaling at least one routing unit and/or user service unit.
11. A non-volatile storage medium on which a program for API service scaling is stored, the program being executed by a computer to implement a method of service scaling, characterized in that the program comprises:
a request forwarding instruction for forwarding a service request from a user;
a data acquisition instruction, receiving the service request, calculating service performance data according to the service request, and collecting hardware data;
a conversion instruction, which is used for carrying out format conversion on the service performance data and the hardware data to obtain training data;
a decision instruction, namely acquiring a trained current classification model, and inputting the training data into the current classification model to obtain a scaling decision;
a capacity expansion and reduction instruction, which utilizes the capacity expansion and reduction decision to carry out service capacity expansion and reduction;
and training the current classification model by using the training data to obtain an updated classification model, and taking the updated classification model as the current classification model.
CN201811619760.5A 2018-12-28 2018-12-28 Method, system and non-volatile storage medium for user service scaling Active CN109617738B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811619760.5A CN109617738B (en) 2018-12-28 2018-12-28 Method, system and non-volatile storage medium for user service scaling

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811619760.5A CN109617738B (en) 2018-12-28 2018-12-28 Method, system and non-volatile storage medium for user service scaling

Publications (2)

Publication Number Publication Date
CN109617738A (en) 2019-04-12
CN109617738B (en) 2022-05-31

Family

ID=66011995

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811619760.5A Active CN109617738B (en) 2018-12-28 2018-12-28 Method, system and non-volatile storage medium for user service scaling

Country Status (1)

Country Link
CN (1) CN109617738B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112311578B (en) * 2019-07-31 2023-04-07 中国移动通信集团浙江有限公司 VNF scheduling method and device based on deep reinforcement learning
CN113032157B (en) * 2021-05-31 2021-08-24 睿至科技集团有限公司 Automatic intelligent server capacity expansion and reduction method and system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106959894A (en) * 2016-01-11 2017-07-18 北京京东尚科信息技术有限公司 Resource allocation methods and device
CN107038377A (en) * 2016-02-03 2017-08-11 阿里巴巴集团控股有限公司 A kind of site certificate method and device, website credit method and device
CN107404523A (en) * 2017-07-21 2017-11-28 中国石油大学(华东) Cloud platform adaptive resource dispatches system and method
CN108259225A (en) * 2017-12-20 2018-07-06 中国联合网络通信集团有限公司 A kind of network capacity extension appraisal procedure, device and server
CN108847956A (en) * 2018-05-08 2018-11-20 国家计算机网络与信息安全管理中心 A kind of scalable appearance method and system of multi-dimensional intelligent of telecommunications network safety service VNF

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106470219A (en) * 2015-08-17 2017-03-01 阿里巴巴集团控股有限公司 The dilatation of computer cluster and capacity reduction method and equipment
CN105631196B (en) * 2015-12-22 2018-04-17 中国科学院软件研究所 A kind of container levels flexible resource feed system and method towards micro services framework
US10338968B2 (en) * 2016-02-05 2019-07-02 Sas Institute Inc. Distributed neuromorphic processing performance accountability
CN108173698B (en) * 2018-01-17 2021-03-30 创新先进技术有限公司 Network service management method, device, server and storage medium


Also Published As

Publication number Publication date
CN109617738A (en) 2019-04-12

Similar Documents

Publication Publication Date Title
CN109218355B (en) Load balancing engine, client, distributed computing system and load balancing method
CN111966289B (en) Partition optimization method and system based on Kafka cluster
CN111741073B (en) Electric power data transmission system based on 5G communication network
CN110545258B (en) Streaming media server resource allocation method and device and server
CN109617738B (en) Method, system and non-volatile storage medium for user service scaling
CN105703927A (en) Resource allocation method, network device and network system
CN112365366B (en) Micro-grid management method and system based on intelligent 5G slice
CN103986766A (en) Self-adaptation load balancing job task scheduling method and device
CN110881199A (en) Dynamic allocation method, device and system for network slice resources
CN102158346A (en) Information acquisition system and method based on cloud computing
CN104901989A (en) Field service providing system and method
EP3981111B1 (en) Allocating cloud resources in accordance with predicted deployment growth
CN103747077A (en) Transmission mechanism adjusting method, server side and client
CN112313997B (en) Dynamic allocation processing among nodes in a wireless mesh network
CN102480502B (en) I/O load equilibrium method and I/O server
CN115714814B (en) Edge cache replacement method based on multi-agent reinforcement learning
CN114780244A (en) Container cloud resource elastic allocation method and device, computer equipment and medium
CN110557679B (en) Video content identification method, device, medium and system
CN113840330A (en) Method for establishing connection, gateway equipment, network system and scheduling center
CN109067840A (en) Method, system and the storage medium of artificial intelligence online service
CN116235529A (en) Method for implementing an ad hoc network of a plurality of access network devices and electronic device for implementing the method
CN115460124B (en) Method, device, equipment and storage medium for optimizing transmission link across machine room
CN109474696B (en) Network service method, device, electronic equipment and readable storage medium
CN111913660A (en) Dotting data processing method and system
CN116860823A (en) Dynamic equipment fault diagnosis difficult sample bidirectional mining method based on joint knowledge distillation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant