CN112328355A - Self-adaptive optimal memory reservation estimation method for long-life container - Google Patents


Info

Publication number
CN112328355A
CN112328355A (application CN202011073505.2A)
Authority
CN
China
Prior art keywords
memory
actor
reservation
stage
memory reservation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011073505.2A
Other languages
Chinese (zh)
Other versions
CN112328355B (en)
Inventor
刘芳
林嘉韵
蔡振华
黄志杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Sun Yat Sen University
Original Assignee
National Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National Sun Yat Sen University filed Critical National Sun Yat Sen University
Priority to CN202011073505.2A
Publication of CN112328355A
Application granted
Publication of CN112328355B
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/44: Arrangements for executing specific programs
    • G06F 9/455: Emulation; interpretation; software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F 9/45533: Hypervisors; virtual machine monitors
    • G06F 9/45558: Hypervisor-specific management and integration aspects
    • G06F 2009/45587: Isolation or security of virtual machine instances
    • G06F 2009/45595: Network integration; enabling network access in virtual machine instances
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT]
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses an adaptive optimal memory reservation estimation method for long-life containers, applied at the different life-cycle stages of a Spark distributed cluster in a data center, comprising the following steps. S1: in the initial stage of the server cluster, execute the MEER+ strategy, collect historical data of application runs on the servers, and estimate the cluster's optimal memory reservation from that historical data. S2: in the stable stage of the server cluster, execute the DEEP-MEER strategy, build an optimal memory reservation model for the stable stage from the historical data, and estimate the current optimal memory reservation with that model. The method applies different optimal memory reservation estimation strategies to the Spark distributed cluster at different life-cycle stages: in the initial stage it approaches the optimal value by refining the step length, improving estimation accuracy; in the stable stage it builds a reinforcement learning model from the abundant historical data, ensuring stable application performance.

Description

Self-adaptive optimal memory reservation estimation method for long-life container
Technical Field
The invention relates to the technical field of computers, in particular to a self-adaptive optimal memory reservation estimation method for a long-life container.
Background
With the development of big data, more and more in-memory computing workloads, including machine learning, stream processing, interactive queries, and graph computing, are deployed on shared clusters in data centers. Such workloads process large amounts of data and are called Long-Running Applications (LRAs). Public enterprise cluster traces show that LRAs have become the major workload of everyday online services in today's data centers.
Existing big data processing systems, such as Spark and Flink, rely primarily on resource managers, such as YARN, Mesos, Omega, Borg, and Kubernetes, to allocate resources for applications. These managers schedule resources by packing CPU and memory into containers. Unlike conventional short-lived containers used to process batch jobs, the containers of LRAs remain active until the application finishes executing, and are therefore referred to as long-lived containers.
Repeatedly executing the same application on different data is particularly common in data centers. In this working mode, only the data content changes; the resource occupation pattern does not. It is therefore of great significance to explore the resource occupation pattern and find the optimal resource allocation strategy. If the resource manager reserves too much memory for an application, unnecessary waste results: the application occupies only part of the memory, and because the container is fixed before the application executes, the remaining memory can neither be used by the application nor be reallocated to other applications; until the container is released, it remains an idle memory fragment. Conversely, if the memory reservation is too small, the application's performance cannot be guaranteed, and in the worst case the application crashes or fails to complete at all. It is therefore meaningful to estimate an optimal memory reservation that guarantees the application's performance without wasting resources.
In the prior art, publication CN110187967A, published on 30 August 2019, discloses a memory prediction method and device for a dependency analysis tool. The method extracts source code files from a Java package file; parses the extracted source code files to generate abstract syntax trees, and obtains the number of instance objects of each class of node in the abstract syntax tree generated by each source code file; calculates the memory occupied by the node class instance objects of each class; calculates the memory occupied by the abstract syntax tree generated by each source code file; and calculates the memory required by the whole Java package. That method only calculates the memory occupied by the Java package itself; in-memory computing additionally needs memory to cache input data and intermediate variables, whose volume depends on the size of the input data set and cannot be obtained from source code analysis. It is therefore unsuitable for predicting the memory of an entire long-life container.
Disclosure of Invention
The invention provides an adaptive optimal memory reservation estimation method for long-life containers, aiming to overcome the defect of the prior art that, for long-life containers in a data center, a resource manager cannot accurately and effectively estimate the optimal memory reservation from an application's historical run data.
The primary objective of the present invention is to solve the above technical problems, and the technical solution of the present invention is as follows:
an adaptive optimal memory reservation estimation method for a long-life container, which is applied to different stages of a data center Spark distributed server cluster, comprises the following steps:
s1: executing a MEER + strategy at the initial stage of the Spark distributed cluster, collecting historical data of application program operation in a server, and estimating the optimal memory reservation of the Spark distributed cluster at the initial stage by using the historical data;
s2: and executing a DEEP-MEER strategy in a stable stage of the Spark distributed cluster, obtaining an optimal memory reservation model in the stable stage by using known historical data, and estimating the optimal memory reservation in the current stage by using the model.
In this scheme, the MEER+ policy execution flow includes three processes: a trial run stage, an iterative search stage, and an approach stage.
The trial run stage means that, when the application is submitted for the first time, it runs under an excess reservation. The trial run produces initial memory occupancy and program running data, which are recorded by the history server and the measurement system; based on the memory occupancy data, the expected value of the memory usage is calculated with the histogram analysis model, and the expected value is then passed to the resource manager;
the iterative search phase refers to the resource manager taking the last evaluation M when the application is resubmittedn-1As the memory reservation of the current operation, the MEER records the memory occupation amount and the program operation time in the program operation process, and calculates the memory occupation expectation MnAnd evaluating the performance, terminating the search if the performance satisfies any one of the termination conditions, Mn-2Namely the optimal reservation which is finally estimated; otherwise, MEER will calculate the expected value MnAs the new memory reserved value of the application program, applying to the next execution; the termination conditions are three, one: the execution time is too long, and the condition two is as follows: and (3) the garbage is too time-consuming to recycle, and the condition is three: the memory utilization rate reaches the expected target, except for the third condition, the application program has to bear one time of time-consuming and inefficient operation for terminating the iterative search;
the approach stage means that the optimal memory reservation estimation has two branches according to different termination conditions; if the termination condition met in the iterative search stage is the condition one or the condition two, the memory reservation calculation formula of the MEER + is as follows:
Mn=Mn-1+Mf,where Mf<Mt-1-Mt, (1)
wherein M isfIs an increment or decrement added to correct the estimation result, MtIs an estimated memory reservation that meets the termination condition; the approximation stage is terminated when no termination condition is met any more, and the final optimal memory is reserved as the estimation result M of the last iterationn-1(ii) a If the termination condition met in the iterative search stage is condition three, the MEER + executes the outer process, and the memory reservation calculation formula is as follows:
Mn=Mn-1-Mf,where Mf<Mt-M, (2)
and stopping until the condition I or the condition II is met, and reserving the final optimal memory as a memory reserved value used in the last iteration, namely an estimation result of the last iteration.
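The iterative search loop described above can be sketched as a short Python function. This is a minimal illustration, not the patented implementation: run_app, the thresholds, and the return format are hypothetical stand-ins for the history server and measurement system.

```python
def meer_plus_search(run_app, m_initial, time_limit, gc_limit, util_target):
    """Sketch of the MEER+ iterative search stage.

    run_app(reservation) is a hypothetical probe that submits the application
    with the given memory reservation and returns a tuple
    (occupancy_expectation, run_time, gc_time, utilization).
    """
    history = [m_initial]                     # the excess trial-run reservation
    while True:
        expectation, run_time, gc_time, util = run_app(history[-1])
        cond_one = run_time > time_limit      # condition one: run too slow
        cond_two = gc_time > gc_limit         # condition two: GC too costly
        cond_three = util >= util_target      # condition three: target reached
        if cond_one or cond_two or cond_three:
            # per the text, M_{n-2} (here history[-2], when it exists) is the
            # finally estimated optimal reservation
            return history, (cond_one, cond_two, cond_three)
        history.append(expectation)           # M_n becomes the next reservation
```

The subsequent approach stage (not shown) would then refine the result with a smaller correction step M_f, per formulas (1) and (2).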
In this scheme, the optimal memory reservation estimation model used when executing the MEER+ strategy is a histogram analysis model. For each run of the application, the measurement system records the current memory occupancy once per second and draws a corresponding histogram. The two endpoints of each rectangle on the horizontal axis represent a memory usage interval, and the height of each rectangle represents the frequency, i.e. the number of times the memory occupancy fell between the two endpoints. Probability density estimation is performed on the memory occupancy with the histogram analysis method, and the memory occupancy expectation is calculated.
In this scheme, the probability that the memory occupancy at a given moment falls within the interval whose mean value is x_i is defined as:
P(x_i) = Freq(x_i) / Count, (3)
where Count is the sum of all rectangle heights, i.e. of all frequencies, and Freq(x_i) denotes the frequency of the interval with mean x_i.
In this scheme, the memory occupancy expectation is calculated as:
E = Σ_{i=1}^{N} x_i · P(x_i), (4)
where N is the number of rectangles in the frequency distribution histogram.
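Formulas (3) and (4) amount to a standard histogram-based expectation. A small Python sketch (the one-second sampling and bin count are assumptions for illustration):

```python
import numpy as np

def memory_expectation(samples, n_bins=10):
    """Histogram-based estimate of the expected memory occupancy.

    Implements P(x_i) = Freq(x_i) / Count and E = sum_i x_i * P(x_i),
    where x_i is the mid-point (mean of the two endpoints) of bin i and
    samples are per-second memory occupancy readings.
    """
    freq, edges = np.histogram(samples, bins=n_bins)
    mids = (edges[:-1] + edges[1:]) / 2   # interval means x_i
    prob = freq / freq.sum()              # formula (3)
    return float(np.sum(mids * prob))     # formula (4)
```

For example, samples [1, 1, 1, 3] GB split into two bins give mid-points 1.5 and 2.5 with probabilities 0.75 and 0.25, hence an expectation of 1.75 GB.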
In this scheme, the DEEP-MEER strategy is executed as follows: when the application is submitted, the resource manager adopts the last optimal memory reservation estimate M_{n-1} as the memory reservation of the current run; the history server and the measurement system record the memory occupancy data of the run, the expected value M_n of the memory usage is calculated with the Actor-Critic model, and the expected value is then passed to the resource manager.
In this scheme, the optimal memory reservation estimation model used by the DEEP-MEER strategy is an Actor-Critic model, which comprises an Actor unit, a Critic unit, and a cluster running the applications.
The Actor unit is responsible for providing the policy: it selects an action according to the probabilities and modifies the probability of each action being selected according to the score provided by the Critic unit. The Actor unit implements a stochastic policy that maps the system state to the corresponding actions, and comprises three layers of neurons: an input layer, a hidden layer, and an output layer. The input layer takes the environment state as input, the hidden layer's activation function is ReLU(), and the neurons of the output layer each correspond to an action that sets a certain memory reservation for the current application. The output of the output layer is finally converted by a Softmax() function into values between 0 and 1 whose sum equals 1.
The Critic unit is a value function that evaluates the policy: it evaluates the action selected by the Actor unit and provides feedback that helps the Actor unit adjust the policy. The output layer of the Critic unit has only one neuron, whose value is the score given by the Critic unit. Once the Critic unit has calculated the score, the score is combined with the reward returned by the environment, and finally a loss value is calculated that guides the parameter updates of the Actor and Critic units.
The cluster running the applications interacts with the Actor and the Critic. Its functions are: first, to execute the action selected by the Actor unit, i.e. to run the submitted application with the memory reservation specified by the Actor; second, to return the state changed by performing the action and to measure the reward value gained from that action.
In this scheme, the Actor-Critic model calculation process includes the following steps:
S1: initialize state S_0;
S2: execute the Critic unit and calculate V(S_0);
S3: let i = 1;
S4: execute the Actor unit, calculate the probability P_{i-1} of each action based on S_{i-1}, and determine the action A_{i-1} with the highest probability;
S5: perform action A_{i-1}, obtaining S_i and R_i;
S6: execute the Critic unit and calculate V(S_i);
S7: calculate the TD error δ_{i-1};
S8: calculate the loss value Loss(δ_{i-1});
S9: update the parameters ω of the Actor and Critic under the guidance of the loss value;
S10: if i ≤ N, set i = i + 1 and return to S4.
Initializing S_0 means running the application with an excess memory reservation to obtain the initial environment state; V refers to the value function.
In this embodiment, the TD error is defined as:
δ_t = R_{t+1} + γ·V(S_{t+1}) - V(S_t). (5)
The loss function can be chosen freely according to requirements. Parameter updating means that the neural network performs backpropagation according to the chain rule: it computes the derivatives of the composite function, propagates the gradient from the output neurons back to the input neurons, and adjusts the learnable parameters of the network according to the computed gradients. N is the set number of iterations.
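Formula (5) is a one-line computation. Since the patent leaves the loss function free, the sketch below pairs it with one common actor-critic choice (squared TD error for the Critic, negative log-probability weighted by the TD error for the Actor); the numeric inputs are hypothetical.

```python
import numpy as np

def td_error(r_next, v_next, v_now, gamma=0.9):
    """Formula (5): delta_t = R_{t+1} + gamma * V(S_{t+1}) - V(S_t)."""
    return r_next + gamma * v_next - v_now

# Hypothetical values: reward 1.0, V(S_{t+1}) = 2.0, V(S_t) = 2.5
delta = td_error(r_next=1.0, v_next=2.0, v_now=2.5)   # 1.0 + 0.9*2.0 - 2.5 = 0.3

# One common (assumed, not patent-specified) loss pairing:
critic_loss = delta ** 2                   # push V(S_t) toward the TD target
actor_loss = -np.log(0.25) * delta         # pi(a|s) = 0.25 assumed for the chosen action
```

A positive δ_t means the action turned out better than the Critic expected, so the Actor's probability for that action is pushed up.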
the Actor-Critic model defines model parameters, and the model parameters comprise: a reward value, a state parameter and an action, the reward value being used to determine a benefit of performing a given action in a given state, the reward value at time t being defined by the formula:
Figure BDA0002715984530000051
wherein M istAnd TtReferring to reserved memory and program running time at time t, respectively, each neuron of the output layer of the Actor unit corresponds toSetting a specific memory;
the state parameters include: Δ t represents increased program run time compared to the trial run;
e represents the expected value of the memory occupation;
max1~maxnrepresenting n values with highest frequency occupied by a memory in histogram analysis;
p1~pndenotes max1~maxnThe corresponding frequency.
Compared with the prior art, the technical scheme of the invention has the beneficial effects that:
the invention adopts different optimal memory reservation estimation strategies for the server cluster in the data center at different life cycle stages, and improves the estimation accuracy by thinning the step length to approach the optimal value, thereby ensuring the stability of the application program performance.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Fig. 2 is a schematic diagram of the architecture and workflow of the MEER + policy of the present invention.
FIG. 3 is a frequency domain histogram of memory usage in the present invention.
FIG. 4 is a schematic diagram of the architecture and workflow of the DEEP-MEER strategy of the present invention.
FIG. 5 is a schematic structural diagram of an Actor-Critic model according to the present invention.
Fig. 6 is a schematic diagram of the average relative error of the MEER strategy, the MEER + strategy and the Deep-MEER strategy over four workloads in the embodiment of the present invention.
FIG. 7 is a schematic diagram of relative estimation errors of the MEER strategy, the MEER + strategy and the Deep-MEER strategy on four workloads in the embodiment of the present invention.
Fig. 8 is a schematic diagram illustrating changes in memory utilization during the Page Rank operation process in the embodiment of the present invention.
FIG. 9 is a diagram illustrating the variation of the runtime of an application with the number of repeated executions when the MEER policy, the MEER + policy, and the Deep-MEER policy are applied to a benchmark workload according to an embodiment of the present invention.
Detailed Description
In order that the above objects, features and advantages of the present invention can be more clearly understood, a more particular description of the invention will be rendered by reference to the appended drawings. It should be noted that the embodiments and features of the embodiments of the present application may be combined with each other without conflict.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced in other ways than those specifically described herein, and therefore the scope of the present invention is not limited by the specific embodiments disclosed below.
Example 1
As shown in fig. 1, an adaptive optimal memory reservation estimation method for a long-life container, which is applied to different stages of a Spark distributed cluster, includes the following steps:
s1: executing a MEER + strategy at the initial stage of the Spark distributed cluster, collecting historical data of application program operation in a server, and obtaining the optimal memory reservation of the Spark distributed cluster at the initial stage by using the historical data;
it should be noted that Spark distributed cluster is in the initial stage. When the Spark distributed cluster just starts to work, although no program running experience exists and no estimation basis exists, the MEER + strategy can be adopted to sacrifice some application program performance for preliminary estimation. Meanwhile, the system generates a large amount of historical data of application program operation, so that original data is provided for model training, and the estimation model based on reinforcement learning can be trained.
S2: executing a DEEP-MEER strategy in a stable stage of the Spark distributed cluster, obtaining an optimal memory reservation model in the stable stage by using known historical data, and estimating the optimal memory reservation in the current stage by using the model;
Fig. 2 is a schematic diagram illustrating the architecture and workflow of the MEER+ policy. In this scheme, the MEER+ policy execution flow includes three processes: a trial run stage, an iterative search stage, and an approach stage.
The trial run stage means that, when the application is submitted for the first time, it runs under an excess reservation. The trial run produces initial memory occupancy and program running data, which are recorded by the history server and the measurement system; based on the memory occupancy data, the expected value of the memory usage is calculated with the histogram analysis model, and the expected value is then passed to the resource manager;
the iterative search phase refers to the resource manager taking the last evaluation M when the application is resubmittedn-1As the memory reservation of the current operation, the MEER records the memory occupation amount and the program operation time in the program operation process, and calculates the memory occupation expectation MnAnd evaluating the performance, terminating the search if the performance satisfies any one of the termination conditions, Mn-2Namely the optimal reservation which is finally estimated; otherwise, MEER will calculate the expected value MnAnd the new memory reserved value of the application program is used for the next execution. The termination conditions are three, one: the execution time is too long, and the condition two is as follows: and (3) the garbage is too time-consuming to recycle, and the condition is three: memory utilization achieves the desired goal. Obviously, except for the third condition, to terminate the iterative search, the application program must undergo a time-consuming and inefficient run;
the approach stage means that the optimal memory reservation estimation has two branches according to different termination conditions; if the termination condition met in the iterative search stage is the condition one or the condition two, the internal side execution process of the MEER + is carried out, and the memory reservation calculation formula is as follows:
Mn=Mn-1+Mf,where Mf<Mt-1-Mt, (1)
wherein M isfIs an increment or decrement added to correct the estimation result, MtIs an estimated memory reservation that meets the termination condition; the approximation stage is terminated when no termination condition is met any more, and the final optimal memory is reserved as the estimation result M of the last iterationn-1(ii) a If the termination condition met in the iterative search stage is condition three, the MEER + executes the outer process, and the memory reservation calculation formula is as follows:
Mn=Mn-1-Mf,where Mf<Mt-M, (2)
stopping until the condition one or the condition two is met, and finally reserving the optimal memory as a memory reserved value M used in the last iterationn-2I.e. the estimation result of the previous iteration.
In this scheme, the optimal memory reservation model used when executing the MEER+ strategy is a histogram analysis model. For each run of the application, the measurement system records the current memory occupancy once per second and draws a corresponding histogram, as shown in fig. 3. The two endpoints of each rectangle on the horizontal axis represent a memory usage interval; for ease of calculation, the mean of the two endpoints is marked on the horizontal axis. The height of each rectangle represents the frequency, i.e. the number of times the memory occupancy fell between the two endpoints. Probability density estimation is performed on the memory occupancy with the histogram analysis method, and the memory occupancy expectation is calculated.
In this scheme, the probability that the memory occupancy at a given moment falls within the interval whose mean value is x_i may be defined as:
P(x_i) = Freq(x_i) / Count, (3)
where Count is the sum of all rectangle heights, i.e. of all frequencies, and Freq(x_i) denotes the frequency of the interval with mean x_i.
In this scheme, the memory occupancy expectation is calculated as:
E = Σ_{i=1}^{N} x_i · P(x_i), (4)
where N is the number of rectangles in the frequency distribution histogram.
It should be noted that the expectation reflects the likely average cost of future application runs on the server cluster; because it captures most of the memory demand, it has a very high reference value for estimating the reserved memory. With suitable parameters added, a functional relation is formed with the memory occupancy expectation as the independent variable and the reserved memory as the dependent variable, constituting a model for estimating the optimal memory reservation.
As shown in fig. 4, in the stable stage of the server, the DEEP-MEER policy in this scheme is executed as follows: when the application is submitted, the resource manager adopts the last optimal memory reservation estimate M_{n-1} as the memory reservation of the current run; the history server and the measurement system record the memory occupancy data of the run, the expected value M_n of the memory usage is calculated with the Actor-Critic model, and the expected value is then passed to the resource manager.
In this scheme, the optimal memory reservation estimation model used by the DEEP-MEER strategy is an Actor-Critic model, which comprises an Actor unit, a Critic unit, and a cluster running the applications.
The Actor unit executes a stochastic policy that maps the system state to probability values of the corresponding actions; the Actor unit selects an action according to the probabilities and then modifies the probability of each action being selected according to the feedback score provided by the Critic unit. The Actor unit comprises an input layer, a hidden layer, and an output layer; as shown in fig. 5, each circle represents a neuron, and each line between two neurons represents a weight. Let ω be the set of these weight parameters. The value of each neuron is a weighted sum of the neurons of the previous layer, except that the values of the input neurons are provided by the environment. The input layer takes the state of the cluster running the application as input, the hidden layer's activation function is ReLU(), and the neurons of the output layer each correspond to an action that sets a certain memory reservation for the application. The output of the Actor unit's output layer is converted by a Softmax() function into values between 0 and 1 whose sum equals 1; each output value represents the probability that the corresponding action should be selected.
The Critic unit is a value function that evaluates the policy: it evaluates the action selected by the Actor unit and provides feedback that helps the Actor unit adjust the policy. The only difference between the basic structure and parameter form of the Critic unit and those of the Actor unit is that its output layer has only one neuron, whose value is the score given by the Critic unit. Once the Critic unit has calculated the score, it combines the score with the reward returned by the environment, and finally calculates a loss value that guides the parameter updates of the Actor and Critic units.
The cluster running the applications interacts with the Actor and the Critic. Its functions are: first, to execute the action selected by the Actor unit, i.e. to run the submitted application with the memory reservation specified by the Actor; second, to return the state changed by performing the action and to measure the reward value gained from that action.
In this scheme, the Actor-Critic model calculation process follows steps S1 to S10 listed above.
Note that initializing S_0 means running the application with an excess memory reservation to obtain the initial environment state; V refers to the value function;
at initialization S0And reserving excessive memory, wherein the TD error is a common error for adjusting the strategy, and is defined as:
δ_t = R_{t+1} + γ·V(S_{t+1}) − V(S_t). (5)
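A quick numeric check of equation (5); the discount factor γ, the reward, and the value estimates below are made-up numbers for illustration only:

```python
# TD error from equation (5): delta_t = R_{t+1} + gamma*V(S_{t+1}) - V(S_t).
# All numbers are illustrative assumptions, not values from the patent.
gamma = 0.9      # discount factor (assumed)
R_next = 1.0     # reward returned by the environment at t+1
V_s = 0.5        # Critic's value estimate for S_t
V_s_next = 0.8   # Critic's value estimate for S_{t+1}

td_error = R_next + gamma * V_s_next - V_s
print(round(td_error, 2))  # 1.22
```

A positive TD error means the outcome was better than the Critic predicted, so the probability of the selected action should be increased.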
The loss function used to calculate the loss value can be chosen freely according to need. Parameter updating means that the neural network performs backpropagation according to the chain rule: it calculates the derivatives of the composite function, propagates the gradients of the output neurons back to the input neurons, and adjusts the learnable parameters of the network according to the calculated gradients.
The Actor-Critic model defines model parameters, including a reward value, state parameters, and actions. The objective of the Actor-Critic model is to save memory while maintaining good program running performance. Therefore, the more memory is saved and the less time the application program needs to complete, the higher the reward; the reward value at time t is defined by the formula:
[Reward formula (6), rendered as an image in the original: the reward at time t is expressed as a function of the reserved memory M_t and the program running time T_t.]
where M_t and T_t denote the reserved memory and the program running time at time t, respectively, and each neuron of the Actor unit's output layer corresponds to a specific memory setting;
The state parameters include: Δt, the increase in program running time compared with the trial run; E, the expected value of memory occupancy; max_1 ~ max_n, the n memory-occupancy values with the highest frequency in the histogram analysis; and p_1 ~ p_n, the frequencies corresponding to max_1 ~ max_n.
The actions are exemplified by the following settings: each neuron of the Actor unit's output layer corresponds to a particular memory setting. If the first neuron obtains the highest probability, the action of setting the reserved memory to 0.5 GB is selected; if the second neuron obtains the highest probability, the action of setting the reserved memory to 1 GB is selected; and the actions represented by the remaining neurons follow by analogy.
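The neuron-to-action mapping above can be sketched as follows; the list of candidate memory sizes is a hypothetical extrapolation of the 0.5 GB and 1 GB examples in the text:

```python
# Each output neuron corresponds to a fixed memory-reservation size; the
# neuron with the highest probability determines the action taken.
# The action set below is an illustrative assumption.
MEMORY_ACTIONS_GB = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]

def decode_action(probs):
    """Return the memory reservation (GB) for the most probable action."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return MEMORY_ACTIONS_GB[best]

probs = [0.05, 0.60, 0.15, 0.10, 0.05, 0.05]
print(decode_action(probs))  # 1.0 -> reserve 1 GB
```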
In the following embodiment, the present invention is verified and analyzed in terms of estimation accuracy, generalization capability, memory utilization, and application performance.
First, estimation accuracy
The relative error of the final result of each estimation method over four typical workloads is shown in fig. 6. Both the MEER+ strategy and the Deep-MEER strategy show better estimation accuracy than reference [1] on most workloads ([1] Xu, G., Xu, C.: MEER: Online Estimation of Optimal Memory Reservations for Long-lived Containers in In-Memory Cluster Computing. In: 39th IEEE International Conference on Distributed Computing Systems (ICDCS), pp. 23-34 (2019)). FIG. 7 illustrates the average relative error of each strategy across all workloads. It can be seen that the precision of MEER+ is the highest, with the Deep-MEER strategy second; both perform better than the estimation strategy proposed in reference [1].
Second, generalization ability
The experimental results in FIG. 6 also show that the Deep-MEER strategy of the present invention has good generalization ability. In the experiment, the Actor-Critic model is trained by using data of workload Page Rank and Triangle Count, but when the trained model is used for estimating Shortest Paths and SVD + +, the estimation result still performs well. This result illustrates that once model training is complete, it can be used for optimal memory reservation estimation for any workload.
Third, memory utilization
The invention helps save memory resources and improve memory utilization. FIG. 8 is a graph of memory footprint as a function of running time. The memory utilization of the MEER+ policy and the Deep-MEER policy is higher than that of reference [1], and both their average and peak memory utilization are superior to those of reference [1].
Fourth, application program performance
The invention ensures that the application program does not experience sharp performance fluctuations in the stable stage, which helps improve user experience. As shown in fig. 9, both reference [1] and the MEER+ strategy used in the present invention for the preliminary estimation cause performance jitter, whereas the application maintains satisfactory performance throughout when the Deep-MEER strategy is used.
It should be understood that the above-described embodiments of the present invention are merely examples for clearly illustrating the present invention and are not intended to limit its embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description; it is neither necessary nor possible to exhaustively list all embodiments here. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall be included in the protection scope of the claims of the present invention.

Claims (10)

1. An adaptive optimal memory reservation estimation method for long-life containers, which is applied to different stages of a Spark distributed cluster of a data center, is characterized by comprising the following steps:
s1: executing a MEER + strategy at the initial stage of the Spark distributed cluster, collecting historical data of application program operation in a server, and estimating the optimal memory reservation of the Spark distributed cluster at the initial stage by using the historical data;
s2: and executing a DEEP-MEER strategy in a stable stage of the Spark distributed cluster, obtaining an optimal memory reservation model in the stable stage by using known historical data, and estimating the optimal memory reservation in the current stage by using the model.
2. The adaptive optimal memory reservation estimation method for long-life containers as claimed in claim 1, wherein the MEER + policy execution flow includes three processes: a trial run stage, an iterative search stage, an approximation stage,
the trial run stage means that when the application program is submitted for the first time, it runs under an excess reservation; the trial run produces initial memory occupancy and program running data, which are recorded by a history server and a measurement system; an expected value of memory usage is calculated from the memory occupancy data using a histogram analysis model, and the expected value is then passed to the resource manager;
the iterative search stage means that when the application program is resubmitted, the resource manager takes the last estimate M_{n-1} as the memory reservation for the current run; MEER records the memory occupancy and program running time during the run, calculates the memory occupancy expectation M_n, and evaluates the performance; if the performance satisfies any one of the termination conditions, the search terminates and M_{n-2} is the final estimated optimal reservation; otherwise, MEER takes the calculated expected value M_n as the new memory reservation of the application program for the next execution; there are three termination conditions, condition one: the execution time is too long; condition two: garbage collection is too time-consuming; condition three: the memory utilization reaches the expected target;
the approximation stage means that the optimal memory reservation estimation has two branches according to which termination condition was met; if the termination condition met in the iterative search stage is condition one or condition two, the memory reservation calculation formula of MEER+ is:
M_n = M_{n-1} + M_f, where M_f < M_{t-1} − M_t, (1)
where M_f is an increment or decrement added to correct the estimation result, and M_t is an estimated memory reservation that meets the termination condition; the approximation stage terminates when no termination condition is met any more, and the final optimal memory reservation is the estimation result M_{n-1} of the last iteration; if the termination condition met in the iterative search stage is condition three, MEER+ executes the outer process, and the memory reservation calculation formula is:
M_n = M_{n-1} − M_f, where M_f < M_t − M, (2)
which continues until condition one or condition two is met; the final optimal memory reservation is the memory reservation value M_{n-2} used in the last iteration, i.e., the estimation result of the previous iteration.
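The iterative search and its M_{n-2} rollback described in claim 2 can be sketched schematically as follows; run_application, the shrink factor, and the termination threshold are hypothetical stand-ins for the history server, measurement system, and the three termination conditions of the patent:

```python
def iterative_search(run_application, m_initial, max_iters=10):
    """Repeatedly re-run the application with the last expected memory
    occupancy as the new reservation; when a termination condition fires,
    return the estimate from two runs back (M_{n-2}), per claim 2."""
    history = [m_initial]
    for _ in range(max_iters):
        # run under the most recent reservation M_{n-1}
        m_next, terminated = run_application(history[-1])
        if terminated:
            # a termination condition was met: roll back to M_{n-2}
            return history[-2] if len(history) >= 2 else history[-1]
        history.append(m_next)  # M_n becomes the next run's reservation
    return history[-1]

# Toy environment (assumed): expected occupancy shrinks by 30% each run,
# and the run "fails" (e.g. execution time too long) below 2.0 GB.
def fake_run(reservation_gb):
    m_expected = reservation_gb * 0.7
    return m_expected, reservation_gb < 2.0

print(round(iterative_search(fake_run, 8.0), 3))  # 2.744
```

The rollback to the second-to-last estimate is the key detail: the last reservation is the one that triggered the termination condition, so the estimate before it is kept.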
3. The method as claimed in claim 2, wherein the optimal memory reservation estimation model used in the MEER+ policy is a histogram analysis model; for each run of the application program, the measurement system records the current memory occupancy every second and a corresponding histogram is drawn; the two endpoints of each rectangle on the horizontal axis represent a memory-usage interval, and the height of each rectangle represents the frequency of occurrence, that is, how often the memory occupancy falls between the two endpoints; the memory occupancy is estimated using a probability density estimation method, and the memory occupancy expectation is calculated.
4. The adaptive optimal memory reservation estimation method for long-life containers of claim 3, wherein the probability that the memory occupancy at a certain moment falls within the interval whose average value is x_i is defined in the histogram analysis as:
P(x_i) = Freq(x_i) / Count, (3)
where Count is the sum of all rectangle heights, i.e., the total frequency, and Freq(x_i) denotes the frequency of the mean value x_i.
5. The adaptive optimal memory reservation estimation method for long-life containers as claimed in claim 4, wherein the memory usage expectation calculation formula is as follows:
E = Σ_{i=1}^{N} x_i · P(x_i), (4)
where N is the number of rectangles in the frequency distribution histogram.
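Equations (3) and (4) together amount to a frequency-weighted mean over the histogram bins, sketched below; the bin means and frequencies are made-up sample data:

```python
def histogram_expectation(bin_means, freqs):
    """E = sum_i x_i * Freq(x_i) / Count over the N histogram rectangles."""
    count = sum(freqs)                  # Count: sum of all rectangle heights
    probs = [f / count for f in freqs]  # P(x_i), equation (3)
    return sum(x * p for x, p in zip(bin_means, probs))  # E, equation (4)

# e.g. memory occupancy (GB) sampled once per second, bucketed (assumed data):
bin_means = [0.5, 1.5, 2.5, 3.5]
freqs     = [10, 30, 50, 10]
print(round(histogram_expectation(bin_means, freqs), 1))  # 2.1
```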
6. The adaptive optimal memory reservation estimation method for long-life containers according to claim 4, wherein the DEEP-MEER policy is implemented as follows: when the application program is submitted, the resource manager adopts the last optimal memory reservation estimate M_{n-1} as the memory reservation for the current run; the history server and the measurement system record the memory occupancy data of the current run, the Actor-Critic model calculates the expected value M_n of the memory usage, and the expected value is then communicated to the resource manager.
7. The adaptive optimal memory reservation estimation method for the long-life container according to claim 6, wherein the optimal memory reservation estimation model used by the DEEP-MEER strategy is an Actor-Critic model, and the Actor-Critic model comprises: an Actor unit, a Critic unit, and a cluster running the application programs;
the Actor unit is responsible for providing strategies, selects an action according to the probability, modifies the probability of the action to be selected according to the score provided by the Critic unit, and realizes a random strategy which maps the system state to the corresponding action;
the Critic unit is a value function of an evaluation strategy, evaluates the action selected by the Actor unit and provides feedback to help the Actor unit to adjust the strategy, an output layer of the Critic unit is only provided with one neuron, the neuron is a score given by the Critic unit, once the Critic unit calculates the score, the score is combined with a reward returned by an environment, and finally a loss value is calculated and used for guiding the Actor unit and the Critic unit to update parameters;
the cluster running the application program interacts with the Actor and the Critic, and the functions of the cluster comprise: firstly, executing an action selected by an Actor unit, namely, reserving a memory appointed by the Actor to run and submit an application program; second, return the state changed by performing the action and measure the size of the reward value that benefits from the action.
8. The adaptive optimal memory reservation estimation method for long-lived containers of claim 7, wherein the Actor unit comprises three layers of neurons: the input layer, the hidden layer and the output layer, wherein the input of the input layer is from an environment state, the activation function of the hidden layer is Relu (), the neurons of the output layer respectively correspond to how much memory reservation action is set for the current application program, the output result of the output layer is finally converted into a value between 0 and 1 through a Softmax () function, and the sum of all output values is equal to 1.
9. The adaptive optimal memory reservation estimation method for long-life containers according to claim 8, wherein the Actor-Critic model calculation procedure comprises the following steps:
S1: initialize state S_0;
S2: execute the Critic unit to calculate V(S_0);
S3: let i = 1;
S4: execute the Actor unit: based on S_{i-1}, calculate the probability P_{i-1} of each action and determine the action A_{i-1} with the maximum probability;
S5: perform action A_{i-1} to obtain S_i and R_i;
S6: execute the Critic unit to calculate V(S_i);
S7: calculate the TD error δ_{i-1};
S8: calculate the loss value Loss(δ_{i-1});
S9: update the parameters ω of the Actor and Critic under the guidance of the loss value;
S10: if i ≤ N, set i = i + 1 and go back to S4;
initializing S_0 means running the application program under an excess memory reservation to obtain the initial environment state; V represents the value function.
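A condensed, toy-scale sketch of the S1-S10 loop in claim 9; the tabular Critic, the random stand-in Actor, and the toy environment are placeholders for the neural networks and Spark cluster of the patent, and only the Critic's TD update is shown:

```python
import random
random.seed(0)

N_STEPS = 5
GAMMA = 0.9   # discount factor (assumed)
ALPHA = 0.1   # learning rate (assumed)

V = {}                   # Critic: state -> value estimate (tabular stand-in)
def critic(s):           # S2/S6: compute V(S)
    return V.get(s, 0.0)

def actor(s):            # S4: choose an action (random stand-in policy)
    return random.choice([0, 1])

def env_step(s, a):      # S5: perform A, obtain S' and R (toy environment)
    s_next = (s + a + 1) % 3
    reward = 1.0 if s_next == 0 else 0.0
    return s_next, reward

s = 0                    # S1: initialize S_0
for i in range(1, N_STEPS + 1):
    a = actor(s)
    s_next, r = env_step(s, a)
    delta = r + GAMMA * critic(s_next) - critic(s)  # S7: TD error, eq. (5)
    V[s] = critic(s) + ALPHA * delta                # S8/S9: update (Critic only here)
    s = s_next                                      # S10: continue while i <= N
```

In the patent both the Actor's weights and the Critic's weights are updated from the loss derived from the TD error; this sketch updates only the value table to keep the loop readable.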
10. The adaptive optimal memory reservation estimation method for long-lived containers as claimed in claim 9, wherein the TD error is defined as:
δ_t = R_{t+1} + γ·V(S_{t+1}) − V(S_t). (5)
the loss function is freely selected according to the requirement; the parameter updating means that the neural network carries out back propagation according to a chain rule, calculates the derivative of the composite function, then propagates the gradient of the output neuron back to the input neuron, and adjusts the learnable parameters of the network according to the calculated gradient; n is the set iteration number;
the Actor-Critic model defines model parameters, and the model parameters comprise: a reward value, a state parameter and an action, the reward value being used to determine a benefit of performing a given action in a given state, the reward value at time t being defined by the formula:
[Reward formula (6), rendered as an image in the original: the reward at time t is expressed as a function of the reserved memory M_t and the program running time T_t.]
where M_t and T_t denote the reserved memory and the program running time at time t, respectively, and each neuron of the Actor unit's output layer corresponds to a specific memory setting;
The state parameters include: Δt, the increase in program running time compared with the trial run; E, the expected value of memory occupancy; max_1 ~ max_n, the n memory-occupancy values with the highest frequency in the histogram analysis; and p_1 ~ p_n, the frequencies corresponding to max_1 ~ max_n.
CN202011073505.2A 2020-10-09 2020-10-09 Adaptive optimal memory reservation estimation method for long-life container Active CN112328355B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011073505.2A CN112328355B (en) 2020-10-09 2020-10-09 Adaptive optimal memory reservation estimation method for long-life container


Publications (2)

Publication Number Publication Date
CN112328355A true CN112328355A (en) 2021-02-05
CN112328355B CN112328355B (en) 2024-04-23

Family

ID=74314814

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011073505.2A Active CN112328355B (en) 2020-10-09 2020-10-09 Adaptive optimal memory reservation estimation method for long-life container

Country Status (1)

Country Link
CN (1) CN112328355B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160371108A1 (en) * 2015-06-16 2016-12-22 Vmware, Inc. Reservation for a multi-machine application
CN108415776A (en) * 2018-03-06 2018-08-17 华中科技大学 A kind of memory in distributed data processing system estimates the method with configuration optimization
CN110390345A (en) * 2018-04-20 2019-10-29 复旦大学 A kind of big data cluster adaptive resource dispatching method based on cloud platform
CN111176832A (en) * 2019-12-06 2020-05-19 重庆邮电大学 Performance optimization and parameter configuration method based on memory computing framework Spark
CN111666149A (en) * 2020-05-06 2020-09-15 西北工业大学 Ultra-dense edge computing network mobility management method based on deep reinforcement learning


Non-Patent Citations (1)

Title
Meng Hongtao et al.: "Research on Spark Memory Management and Caching Strategy", Computer Science, vol. 44, no. 6, pp. 31-36 *

Also Published As

Publication number Publication date
CN112328355B (en) 2024-04-23

Similar Documents

Publication Publication Date Title
CN110737529B (en) Short-time multi-variable-size data job cluster scheduling adaptive configuration method
US20220300812A1 (en) Workflow optimization
US9262216B2 (en) Computing cluster with latency control
Gupta et al. PQR: Predicting query execution times for autonomous workload management
CN110390345B (en) Cloud platform-based big data cluster self-adaptive resource scheduling method
CN110321222B (en) Decision tree prediction-based data parallel operation resource allocation method
US11614978B2 (en) Deep reinforcement learning for workflow optimization using provenance-based simulation
US20150363226A1 (en) Run time estimation system optimization
CA3090095C (en) Methods and systems to determine and optimize reservoir simulator performance in a cloud computing environment
CN111738434A (en) Method for executing deep neural network on heterogeneous processing unit
CN111314120A (en) Cloud software service resource self-adaptive management framework based on iterative QoS model
CN113157421A (en) Distributed cluster resource scheduling method based on user operation process
CN109710372B (en) Calculation intensive cloud workflow scheduling method based on owl search algorithm
CN112084035B (en) Task scheduling method and system based on ant colony algorithm
CN115168027A (en) Calculation power resource measurement method based on deep reinforcement learning
Li et al. Weighted double deep Q-network based reinforcement learning for bi-objective multi-workflow scheduling in the cloud
Nascimento et al. A reinforcement learning scheduling strategy for parallel cloud-based workflows
CN113641445B (en) Cloud resource self-adaptive configuration method and system based on depth deterministic strategy
Baheri Mars: Multi-scalable actor-critic reinforcement learning scheduler
CN112328355B (en) Adaptive optimal memory reservation estimation method for long-life container
Ye et al. Parameters tuning of multi-model database based on deep reinforcement learning
US20230004870A1 (en) Machine learning model determination system and machine learning model determination method
Gao et al. DBN based cloud service response time prediction method
CN115827225A (en) Distribution method of heterogeneous operation, model training method, device, chip, equipment and medium
Sen et al. Predictive price-performance optimization for serverless query processing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant