CN113010262B - Memory optimization method based on cloud computing

Info

Publication number: CN113010262B
Application number: CN202110197878.9A
Authority: CN
Inventors: 樊馨
Original assignee: Zhongke Panyun Beijing Technology Co ltd
Current assignee: Zhongke Panyun Beijing Technology Co ltd
Priority date: 2021-02-22
Filing date: 2021-02-22
Publication date: 2022-05-10
Anticipated expiration: 2041-02-22
Also published as: CN113010262A

Abstract

The application discloses a memory optimization method based on cloud computing. A cloud server determines a plurality of Java Virtual Machines (JVMs) running a plurality of first application programs and acquires the memory size and the processing resource occupancy rate of the JVMs; the cloud server performs a weighted average operation on the memory sizes and on the processing resource occupancy rates of the JVMs to obtain a memory mean value and a processing resource occupancy rate mean value, sets a first processing resource occupancy rate upper limit value based on the processing resource occupancy rate mean value, and establishes a first template JVM based on the memory mean value and the first processing resource occupancy rate upper limit value; a plurality of first running times of the plurality of first application programs in the plurality of JVMs are acquired; after the memory and the processing resource occupancy rate upper limit of the JVMs are modified to those of the first template JVM, a plurality of second running times of the plurality of first application programs in the plurality of JVMs are acquired; and the memory and the processing resource occupancy rate upper limit value of the first template JVM are optimized according to the ratio of the first running time to the second running time to generate a plurality of optimized template JVMs, wherein the plurality of optimized template JVMs correspond to the plurality of first application programs.

Description

Memory optimization method based on cloud computing

Technical Field

The application relates to the technical field of cloud computing, in particular to a memory optimization method based on cloud computing.

Background

Cloud computing is a type of distributed computing in which a large data processing task is decomposed, through a network "cloud", into many small subtasks; these subtasks are processed and analyzed by a system consisting of a plurality of servers, and the results are returned to the user. In its early stage, cloud computing was simple distributed computing that handled task distribution and merged the computing results. For this reason, cloud computing has also been called grid computing. With this technology, tens of thousands of data items can be processed in a short time (several seconds), enabling powerful network services. Today, cloud services are no longer just distributed computing, but the result of the hybrid evolution of computer technologies such as distributed computing, utility computing, load balancing, parallel computing, network storage, hot backup redundancy, and virtualization.

In a cloud computing system, different application programs run on different hardware hosts through virtualization technology. The virtual machine is the basis on which a program runs, and its actual memory usage and processing resource occupancy rate directly affect how smoothly the program runs.

Disclosure of Invention

The embodiment of the application provides a memory optimization method based on cloud computing, which is used for solving the problem of unreasonable memory allocation caused by the lack of an optimal memory allocation rule in the prior art.

The embodiment of the invention provides a memory optimization method based on cloud computing, which comprises the following steps:

the method comprises the steps that a cloud server determines a plurality of Java Virtual Machines (JVMs) running a plurality of first application programs, and the memory size and the processing resource occupancy rate of the JVMs are obtained;

the cloud server acquires the calling frequency of the plurality of first application programs in a unit time period, and allocates different weights of the plurality of first application programs based on the calling frequency;

respectively carrying out weighted average operation on the memory size and the processing resource occupancy rate of the JVMs based on the distributed weight to obtain a memory mean value and a processing resource occupancy rate mean value, setting a first processing resource occupancy rate upper limit value based on the processing resource occupancy rate mean value, and establishing a first template JVM based on the memory mean value and the processing resource occupancy rate upper limit value, wherein the memory size of the first template JVM is the memory mean value, and the processing resource occupancy rate upper limit of the first template JVM is the first processing resource occupancy rate upper limit value;

acquiring a plurality of first running times of the plurality of first application programs in the plurality of JVMs;

modifying all the memories of the JVMs into the memory value of the first template JVM, modifying all the processing resource occupancy rate upper limits of the JVMs into the processing resource occupancy rate upper limit of the first template JVM, and acquiring a plurality of second running times of the first applications in the JVMs;

optimizing the occupancy rate upper limit value of the memory and processing resource of the first template JVM according to the ratio of the first running time to the second running time to generate a plurality of optimized template JVMs, wherein the optimized template JVMs correspond to the first applications;

when a second application requests a JVM from the cloud server, the cloud server compares the second application with the operating parameters of the plurality of first applications,

if the difference value of at least two types of operation parameters of the second application program and one of the first application programs is in a preset range, allocating an optimized template JVM corresponding to the one of the first application programs to the second application program,

and if no first application program has at least two types of operating parameters whose difference values from those of the second application program are within the preset range, allocating the first template JVM to the second application program.

Optionally, the operating parameters of the second application include a maximum memory size, a thread number, a thread stack size, a supervision proportion, and a user priority.

Optionally, the method further comprises:

allocating the plurality of optimized template JVMs to the corresponding plurality of first applications to optimize runtime of the plurality of first applications, wherein the plurality of optimized template JVMs are stored in a storage space of the cloud server;

and dynamically allocating an extensible storage space for the storage space of the cloud server to optimize the utilization rate of the storage space.

Optionally, the size of the dynamically allocated scalable storage space is determined by the following formula:

wherein F is the size of the expandable storage space, T is the running time of a single program, M is the memory value of the single program, alpha is the weight, and beta and lambda are correction coefficients.

Optionally, the first runtime is the time T3 when the program ends running, minus the time T2 when the program starts running, minus the time T1 consumed by compiling a single program.

Optionally, the time T1 consumed by program compilation is recorded using the jstat -compiler command.

Optionally, a script is started through a shell command, all memories of the JVMs are modified into memory values of the first template JVM through a JAR command, and all processing resource occupancy upper limits of the JVMs are modified into processing resource occupancy upper limits of the first template JVM.

According to the memory allocation method, the mode of setting the template JVM and optimizing the template JVM is adopted, so that memory allocation and allocation of processing resource occupancy rate upper limits for different programs are more reasonable, the running efficiency of the programs is effectively improved, the running time of the programs is shortened, and the resource utilization rate of a cloud computing system is improved.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings used in the description of the embodiments will be briefly introduced below.

FIG. 1 is a diagram of a cloud computing network architecture in one embodiment;

FIG. 2 is a schematic diagram illustrating a cloud-computing-based memory optimization process according to an embodiment;

FIG. 3 is a schematic diagram of memory optimization in one embodiment.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

It is also to be understood that the terminology used in the description of the present application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in the specification of the present application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.

It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.

As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when", "upon" or "in response to a determination" or "in response to a detection". Similarly, the phrase "if it is determined" or "if a [ described condition or event ] is detected" may be interpreted contextually to mean "upon determining" or "in response to determining" or "upon detecting [ described condition or event ]" or "in response to detecting [ described condition or event ]".

Fig. 1 is a cloud computing system architecture diagram of an embodiment of the present invention. As shown in fig. 1, the system includes a cloud 10, a memory optimization engine 11, and a terminal 12. The cloud 10 is composed of a plurality of cloud servers, each cloud server includes a plurality of Java virtual machines (JVMs), and each JVM can run one or more applications; by means of virtual machine technology, a single application may actually be run by a plurality of physical hosts at the same time. The memory optimization engine 11 can be built into a cloud server; it is essentially the set of algorithms and rules that accelerates memory optimization of the cloud JVMs, so that the resource usage of each JVM is good enough to improve the running efficiency of the application programs and to effectively reduce the waste of JVM resources. The terminal 12 may be a mobile phone, a personal computer, or another common device; it initiates a call request for an application program to the cloud, and the cloud hosts the application program in a JVM for the terminal to use.

Fig. 2 is a flowchart of a memory optimization method based on cloud computing in an embodiment. The method in the embodiment comprises the following steps:

S101, a cloud server determines a plurality of Java Virtual Machines (JVMs) running a plurality of first application programs, and acquires the memory size and the processing resource occupancy rate of the JVMs;

A virtual machine is an abstract computer, realized by simulating various computer functions on an actual computer. The JVM has its own complete hardware architecture, such as a processor, a stack, and registers, and has a corresponding instruction set. The JVM hides information related to the specific operating system platform, so that a Java program only needs to be compiled into object code (bytecode) that runs on the JVM, and can then run on a variety of platforms without modification.

The JVM specification defines an abstract machine or processor rather than an actual one. The specification describes an instruction set, a set of registers, a stack, a garbage-collected heap, and a method area. Once a JVM has been implemented on a given platform, any Java program (in its compiled form, called bytecode) can run on that platform. The JVM may interpret the bytecode one instruction at a time (mapping it to actual processor instructions), or the bytecode may be further compiled into native instructions of the actual processor by a just-in-time (JIT) compiler.

In this embodiment of the present invention, there are a plurality of first applications distributed across a plurality of different JVMs. For example, the first applications A, B, and C run in JVM A, JVM B, and JVM C respectively, and while the three applications are running, the memory size and the processing resource (CPU) occupancy rate of each JVM are obtained. For example, JVM A occupies 16GB of memory with a resource occupancy rate of 85%, JVM B occupies 10GB of memory with a resource occupancy rate of 75%, and JVM C occupies 8GB of memory with a resource occupancy rate of 65%.

S102, the cloud server obtains the calling frequency of the first application programs in a unit time period, and different weights of the first application programs are distributed based on the calling frequency;

In the embodiment of the present invention, a higher calling frequency indicates that the application program is used more often per unit time, so the memory occupied by its JVM should weigh more heavily in the later memory optimization design, and its weight is therefore larger. Accordingly, the calling frequencies of the different applications are obtained and their weights are set adaptively. For example, if the calling frequencies of applications A, B, and C decrease in that order, their weights decrease in the same order.
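The weight allocation in S102 can be sketched as follows. This is only an illustration: the source does not state how a calling frequency is mapped to a weight, so the sketch assumes the weight is the call count normalized by the total. The function name `weights_from_call_frequency` and the call counts are hypothetical.

```python
def weights_from_call_frequency(calls_per_period):
    """Map each application's call count in a unit time period to a weight.

    Assumption (not from the source): weight = call count / total call count,
    so a more frequently called application gets a proportionally larger weight.
    """
    total = sum(calls_per_period.values())
    if total <= 0:
        raise ValueError("at least one application must have been called")
    return {app: count / total for app, count in calls_per_period.items()}


# Hypothetical call counts: A is called most often, C least often.
weights = weights_from_call_frequency({"A": 50, "B": 30, "C": 20})
print(weights)
```

Any monotonically increasing mapping would satisfy the description; normalization is chosen here only because it makes the later weighted average straightforward.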

S103, respectively carrying out weighted average operation on the memory size and the processing resource occupancy rate of the JVMs based on the distributed weight to obtain a memory mean value and a processing resource occupancy rate mean value, setting a first processing resource occupancy rate upper limit value based on the processing resource occupancy rate mean value, and establishing a first template JVM based on the memory mean value and the processing resource occupancy rate upper limit value, wherein the memory size of the first template JVM is the memory mean value, and the processing resource occupancy rate upper limit of the first template JVM is the first processing resource occupancy rate upper limit value;

The purpose of S103 is to design a general template JVM that is derived from the commonly used programs currently running (the higher the calling frequency, the larger the weight; the lower the frequency, the smaller the weight) and that can be used to tune the memory and the CPU resource occupancy upper limit (a cap on the highest permitted CPU occupancy rate) for subsequent program runs.

Therefore, in S103, the memory sizes of the JVMs are weighted and averaged to obtain a memory mean value. The CPU resource occupancy rate is usually variable and may rise or fall by different amounts in different time periods, so its weighted average is measured over a certain time period or at a certain time point, and a threshold is added on top of the measured mean to obtain the CPU resource occupancy upper limit. For example, if the current CPU resource occupancy rates of A, B, and C are 85%, 75%, and 65%, and the weighted average is 72%, then the value obtained by floating upward by 15 percentage points (i.e., 87%) is the resource occupancy upper limit. The advantage of this upper limit is that a certain usable range remains allocated to the JVM while the range does not need to be too high; floating an empirical value above the mean allows most application programs to run smoothly while saving CPU resources.
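The construction of the first template JVM can be sketched as below, using the 16/10/8 GB and 85/75/65% figures from the example. The source does not state the weights it used; the integer weights 2:3:5 are hypothetical and were chosen only because they reproduce the 72% mean quoted in the text. The names `TemplateJvm`, `weighted_mean`, and `build_first_template` are likewise illustrative, and capping the limit at 100% is an added assumption.

```python
from dataclasses import dataclass


@dataclass
class TemplateJvm:
    memory_gb: float      # memory size of the template JVM
    cpu_limit_pct: float  # upper limit of CPU occupancy, in percent


def weighted_mean(values, weights):
    """Weighted average; the weights need not be normalized."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def build_first_template(memory_gb, cpu_pct, weights, float_pct=15.0):
    """Build the first template JVM from per-JVM measurements.

    float_pct is the margin, in percentage points, added to the mean CPU
    occupancy; 15 matches the example in the text. The cap at 100 is an
    assumption added for this sketch.
    """
    mem_mean = weighted_mean(memory_gb, weights)
    cpu_mean = weighted_mean(cpu_pct, weights)
    return TemplateJvm(mem_mean, min(cpu_mean + float_pct, 100.0))


# JVM A, B, C from the example; hypothetical weights 2:3:5.
template = build_first_template([16, 10, 8], [85, 75, 65], [2, 3, 5])
print(template)
```

With these assumed weights the template has a 10.2 GB memory value and an 87% CPU upper limit, which is close to the 10G / 87% template used in the later probing example.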

S104, acquiring a plurality of first running times of the plurality of first application programs in the plurality of JVMs;

The first running time T is the time T3 when the program finishes running, minus the time T2 when the program starts running, minus the time T1 consumed by compiling the single program, namely T = T3 - T2 - T1. Optionally, the time T1 consumed by program compilation is recorded using the jstat -compiler command.
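The calculation T = T3 - T2 - T1 can be sketched as follows. The sketch assumes that `jstat -compiler <vmid>` prints one header line and one data line, with the compilation time in seconds under the `Time` header; the sample output string and the timestamps are hypothetical, and the helper names are illustrative.

```python
def compilation_time_seconds(jstat_output):
    """Read the 'Time' column from `jstat -compiler <vmid>` output.

    Assumes a header line followed by one data line, with the time spent
    on compilation (in seconds) under the 'Time' header.
    """
    lines = [ln for ln in jstat_output.strip().splitlines() if ln.strip()]
    header, row = lines[0].split(), lines[1].split()
    return float(row[header.index("Time")])


def first_running_time(t3_end, t2_start, t1_compile):
    """T = T3 - T2 - T1, with all values in seconds."""
    return t3_end - t2_start - t1_compile


# Hypothetical jstat output, for illustration only.
sample = """Compiled Failed Invalid   Time   FailedType FailedMethod
    2573      0       0     5.00          0
"""
t1 = compilation_time_seconds(sample)
t = first_running_time(t3_end=130.0, t2_start=10.0, t1_compile=t1)
print(t1, t)
```

Subtracting T1 removes the JIT compilation overhead from the measurement, so that the comparison between first and second running times reflects the memory and CPU settings rather than compilation cost.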

S105, modifying all memories of the JVMs into memory values of the first template JVM, modifying all processing resource occupancy rate upper limits of the JVMs into processing resource occupancy rate upper limits of the first template JVM, and acquiring a plurality of second running times of the first applications in the JVMs;

In the embodiment of the invention, the memory optimization is mainly embodied in "probing": through repeated probing or testing, it is determined which memory value and which CPU resource occupancy upper limit value are optimal, so that the use of the program is not affected and resources are not wasted. Therefore, in S104 the first running time is obtained with the existing configuration of each first application, and in S105 all the JVMs are set to the memory value and the CPU occupancy upper limit value of the first template JVM, that is, to a unified memory value and a unified CPU occupancy upper limit value, and the running time of each application under this configuration, namely the second running time, is measured.

Optionally, a script may be started through a shell command, and a JAR command is used to modify all memories of the JVMs into a memory value of the first template JVM, and modify all processing resource occupancy rate upper limits of the JVMs into a processing resource occupancy rate upper limit of the first template JVM.
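One possible reading of this step is that each application is relaunched from its JAR with the heap size fixed at the template value. The sketch below builds such a command line under that assumption. `-Xms` and `-Xmx` are the standard HotSpot options for initial and maximum heap size; the source does not say how the CPU occupancy limit is applied, and no JVM flag is assumed for it here. The function name and JAR file names are hypothetical.

```python
def java_command(jar_path, template_memory_gb):
    """Build a `java` command line that fixes the heap at the template value.

    -Xms / -Xmx set the initial and maximum heap size. The CPU occupancy
    limit is deliberately not set: enforcing it (for example through the
    host or container scheduler) is outside this sketch.
    """
    if template_memory_gb <= 0:
        raise ValueError("memory must be positive")
    heap_mb = int(template_memory_gb * 1024)
    return ["java", f"-Xms{heap_mb}m", f"-Xmx{heap_mb}m", "-jar", jar_path]


# Hypothetical JAR names for the three first applications.
commands = [java_command(jar, 10) for jar in ("app-a.jar", "app-b.jar", "app-c.jar")]
for cmd in commands:
    print(" ".join(cmd))
```

A shell script could then run these command lines in turn, which would match the "script started through a shell command" wording.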

S106, optimizing the occupancy rate upper limit value of the memory and processing resource of the first template JVM according to the ratio of the first running time to the second running time, and generating a plurality of optimized template JVMs, wherein the optimized template JVMs correspond to the first applications;

If the ratio of the second running time to the first running time is lower than 1, the second running time is shorter than the first running time, which means that the memory value and the CPU resource occupancy rate upper limit set by the template JVM are better than those set by the original JVM; the memory value and the CPU resource occupancy rate upper limit of the template JVM are then preferentially adopted as the JVM parameter values for this type of program;

If the ratio of the second running time to the first running time is higher than 1, the second running time is longer than the first running time, which means that the memory value and the CPU resource occupancy rate upper limit set by the template JVM are worse than those set by the original JVM. In that case, the weighted average is recomputed in the next time period, the template parameters are updated accordingly, and a new second running time is measured and compared with the first running time; this is repeated until the ratio of the second running time to the first running time is lower than 1, at which point the updated memory value and CPU resource occupancy upper limit are recorded as the parameters of the optimized template JVM.

In addition, in the embodiment of the present invention, if the ratio of the second running time to the first running time is lower than 1, a +1 or -1 operation may further be performed on the memory value and the CPU resource occupancy upper limit value of the template JVM to obtain the optimal parameters of the template JVM. For example, if the memory value of the template JVM is 10G and the CPU resource occupancy upper limit is 87%, the memory value is kept at 10G, the upper limit is reduced to 86%, and the second running time of the program in the JVM is tested again (a second time). If this second measurement is shorter than the first measurement of the second running time, the 10G memory and the 86% upper limit are better parameters, and they are stored to form an optimized JVM template for the application program. The +1 or -1 operations are repeated in this way, testing the second running time each time, until the running time is the shortest and the CPU occupancy upper limit is the lowest; the memory value and the CPU occupancy upper limit at that point are the optimal parameters.
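The +1/-1 probing can be sketched as a greedy search. This is an illustration under stated assumptions, not the patented procedure itself: the acceptance rule (a neighbour is kept if it runs faster, or equally fast with a lower CPU limit) is one reading of "running time is the shortest and the CPU occupancy upper limit is the lowest", and `simulated_runtime` is a hypothetical stand-in for actually running the program and measuring the second running time.

```python
def template_is_better(first_runtime, second_runtime):
    """Ratio test from S106: the template wins if second / first < 1."""
    return second_runtime / first_runtime < 1


def probe_template(memory_gb, cpu_limit_pct, measure, max_rounds=100):
    """Greedy +1/-1 probing around the template parameters.

    measure(memory_gb, cpu_limit_pct) returns the second running time in
    seconds. Candidates are compared as (runtime, cpu limit, memory), so a
    neighbour wins if it is faster, or equally fast with a lower CPU limit.
    """
    best = (measure(memory_gb, cpu_limit_pct), cpu_limit_pct, memory_gb)
    for _ in range(max_rounds):
        mem, cpu = best[2], best[1]
        neighbours = [(mem + 1, cpu), (mem - 1, cpu), (mem, cpu + 1), (mem, cpu - 1)]
        candidates = [
            (measure(m, c), c, m)
            for m, c in neighbours
            if m > 0 and 0 < c <= 100
        ]
        challenger = min(candidates)
        if challenger >= best:
            break  # no neighbour improves on the current parameters
        best = challenger
    runtime, cpu, mem = best
    return mem, cpu, runtime


def simulated_runtime(memory_gb, cpu_limit_pct):
    # Hypothetical cost model: fastest at 12 GB, no gain above an 80% limit.
    return abs(memory_gb - 12) * 3.0 + max(0, 80 - cpu_limit_pct) * 2.0 + 100.0


mem, cpu, runtime = probe_template(10, 87, simulated_runtime)
print(mem, cpu, runtime)
```

Starting from the 10G / 87% template, the search under this toy cost model first raises memory and then lowers the CPU limit one step at a time until neither change helps.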

S107, when a second application program requests JVM from the cloud server, the cloud server compares the second application program with the operating parameters of the plurality of first application programs,

The operation parameters of the second application program include a maximum memory size, a thread number, a thread stack size, a supervision proportion and a user priority. Table 1 lists possible application operating parameters:

TABLE 1

In S107, if the difference values of at least two types of operating parameters between the second application and one of the first applications are within a preset range (which may be set according to empirical values), the cloud server may consider that the second application and that first application are similar applications or applications of the same type. For example, WeChat and WhatsApp have the same or substantially the same resource occupation, thread number, and call interfaces, so the optimized template JVM applicable to WeChat is preferentially used to configure the JVM for WhatsApp. As another example, two game applications both have high GPU rendering rates and CPU occupancy rates, so the optimized template JVM of one game application is preferentially applied to the other.
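The template selection in S107 can be sketched as follows. The parameter names, tolerance values, and application data are all hypothetical; the source gives only the rule that at least two types of operating parameters must differ by no more than a preset range. Returning the first matching application's template is an assumption, since the source does not say how ties between several similar first applications are resolved.

```python
def matching_parameter_count(candidate, reference, tolerances):
    """Count parameter types whose absolute difference is within tolerance."""
    return sum(
        1
        for name, tol in tolerances.items()
        if name in candidate and name in reference
        and abs(candidate[name] - reference[name]) <= tol
    )


def select_template(second_app, first_apps, optimized_templates, first_template, tolerances):
    """Pick the optimized template of the first sufficiently similar
    first application, or fall back to the first template JVM."""
    for app_name, params in first_apps.items():
        if matching_parameter_count(second_app, params, tolerances) >= 2:
            return optimized_templates[app_name]
    return first_template


# Hypothetical operating parameters and preset ranges.
first_apps = {
    "A": {"max_memory_mb": 4096, "thread_count": 200, "thread_stack_kb": 1024},
    "B": {"max_memory_mb": 1024, "thread_count": 20, "thread_stack_kb": 512},
}
tolerances = {"max_memory_mb": 512, "thread_count": 20, "thread_stack_kb": 0}
optimized = {"A": "optimized-A", "B": "optimized-B"}

similar = {"max_memory_mb": 1200, "thread_count": 30, "thread_stack_kb": 1024}
unrelated = {"max_memory_mb": 16000, "thread_count": 1000, "thread_stack_kb": 256}

print(select_template(similar, first_apps, optimized, "first-template", tolerances))
print(select_template(unrelated, first_apps, optimized, "first-template", tolerances))
```

Here the `similar` application matches B on memory and thread count, so it receives B's optimized template, while `unrelated` matches no first application on two parameters and falls back to the first template JVM.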

For convenience of description, fig. 3 exemplarily shows the change in the storage space of the cloud server before and after memory optimization. Fig. 3 contains three illustrative memory instances: the left diagram is an example of excess memory space, the right diagram is an example of insufficient memory space, and the middle diagram is an example of optimized memory. In the left diagram, the memory allocated to each JVM is too high and the reserved storage space is more than sufficient, resulting in redundancy. In the right diagram, the memory allocated to each JVM is too low and the reserved storage space is insufficient, so the programs run slowly, which seriously affects operation. The optimized scheme in the middle causes neither excess nor unnecessary waste.

In addition, in the embodiment of the present invention, the cloud server may further set a dynamic extensible storage space to temporarily provide an extended memory for different JVMs, which is specifically as follows:

the cloud server allocates the plurality of optimized template JVMs to the corresponding plurality of first applications to enable the plurality of first applications to optimize runtime, wherein the plurality of optimized template JVMs are stored in a storage space of the cloud server;

Wherein the dynamically allocated scalable storage space size is determined by the following formula:

[Formula image not reproduced in the source text]

where F is the size of the scalable storage space, T_i is the running time of a single program, M_i is the memory value of a single program, alpha_i is the weight, beta and lambda are correction coefficients, N is a positive integer, and i is a positive integer ranging from 1 to N.

The size of the dynamically allocated scalable storage space of the cloud server is positively correlated with the running time and negatively correlated with the memory size: the longer the running time (that is, the slower the program runs), the more scalable space is required, and the larger the JVM memory, the less scalable space is required. In fig. 3, the dynamic scalable space in the left diagram is large although so much is not actually needed, while the dynamic scalable space in the right diagram is small although a large scalable space is actually needed to make up for the insufficient memory of individual JVMs. The scalable space therefore needs to be sized rationally so that it supplements JVM memory without causing unnecessary memory waste.

The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the present application, and these modifications or substitutions should be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims

1. A memory optimization method based on cloud computing is applied to a cloud computing system and is characterized by comprising the following steps:

and if no first application program has at least two types of operating parameters whose difference values from those of the second application program are within the preset range, allocating the first template JVM to the second application program.

2. The method of claim 1, wherein the operating parameters of the second application include a maximum memory size, a thread count, a thread stack size, a supervisory proportion, and a user priority.

3. The method of claim 1, further comprising:

4. The method of claim 1, wherein the first runtime is a time T3 when the program ends running, minus a time T2 when the program starts running, minus a time T1 spent by compiling a single program.

5. The method of claim 4, wherein the jstat -compiler command is used to record the time T1 consumed by program compilation.

6. The method of claim 1, wherein a script is started by a shell command, and a JAR command is used to modify all memories of the JVMs into memory values of the first template JVM, and modify all processing resource occupancy upper limits of the JVMs into processing resource occupancy upper limits of the first template JVM.