CN110427263B - Spark big data application program performance modeling method and device for Docker container and storage device

Info

Publication number
CN110427263B
CN110427263B
Authority
CN
China
Prior art keywords
resource allocation
big data
application program
allocation model
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810401153.5A
Other languages
Chinese (zh)
Other versions
CN110427263A (en)
Inventor
扣彦敏
叶可江
须成忠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Institute of Advanced Technology of CAS
Original Assignee
Shenzhen Institute of Advanced Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Institute of Advanced Technology of CAS filed Critical Shenzhen Institute of Advanced Technology of CAS
Priority to CN201810401153.5A priority Critical patent/CN110427263B/en
Publication of CN110427263A publication Critical patent/CN110427263A/en
Application granted granted Critical
Publication of CN110427263B publication Critical patent/CN110427263B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/44 Arrangements for executing specific programs
    • G06F 9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F 9/45533 Hypervisors; Virtual machine monitors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/44 Arrangements for executing specific programs
    • G06F 9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F 9/45533 Hypervisors; Virtual machine monitors
    • G06F 9/45558 Hypervisor-specific management and integration aspects
    • G06F 2009/45562 Creating, deleting, cloning virtual machine instances

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Stored Programmes (AREA)

Abstract

The embodiment of the invention discloses a Spark big data application program performance modeling method for a Docker container, as well as a corresponding device and storage device. In the embodiment of the invention, key parameters affecting the performance of the Docker container and the Spark big data application program, together with corresponding experimental data, are obtained; the experimental data are then fed into machine learning for model training to obtain corresponding resource allocation models, and an optimal resource allocation model is obtained according to the resource allocation models and input test data. The method jointly optimizes the resource parameters of the Docker container and the Spark big data application program, finds the correspondence between the resource parameters of the Spark big data application program and those of the Docker container, and sets the optimal resource allocation parameter values of the Spark big data application program according to the size of the Docker container resource parameters, so that a Spark big data application program based on a Docker container runs more stably.

Description

Spark big data application program performance modeling method and device for Docker container and storage device
Technical Field
The invention relates to the field of cloud computing and container technology, in particular to a Spark big data application program performance modeling method, device and storage device for a Docker container.
Background
With the continued development of cloud computing technology, more and more enterprises migrate complex IT applications to the cloud. The cloud platform uses virtualization technology to manage and elastically scale large-scale underlying physical resources. For a long time, virtual machines have served as the backbone of the cloud platform infrastructure layer, providing isolation and control of physical resources. However, the additional virtualization layer imposes extra performance overhead on the cloud platform, and the traditional mode of using the virtual machine as the minimum resource scheduling unit suffers from a series of problems such as low resource utilization and complex configuration. Thanks to the widespread popularity of Docker in recent years, container technology plays an increasingly important role in cloud computing. A container is a similar but lighter-weight solution: compared with a traditional virtual machine it incurs less resource overhead and shorter startup and destruction times, and it is considered a better solution for releasing and deploying applications on the cloud platform. This is both an opportunity for cloud platforms to adopt containers and a challenge for container integration technology.
Container-based cloud computing systems have advantages over traditional virtual-machine-based cloud computing systems in terms of boot speed, resource consumption and so on, so in recent years many major companies have adopted container technology to build new cloud computing systems. However, because of the complex relationship between Docker resource allocation and application performance, the performance of big data applications (e.g. Spark) running in Docker containers is unstable, and the advantages of such cloud computing systems cannot be fully realized. Therefore, determining the impact of different Docker container resource allocations on big data applications (Spark) is a challenge.
Disclosure of Invention
The technical problem to be solved by the embodiment of the invention is to provide a Spark big data application program performance modeling method, device and storage device for a Docker container, in which the resource parameters of the Spark big data application program and the Docker container are jointly optimized, the correspondence between the resource parameters of the Spark big data application program and those of the Docker container is found, and the optimal Spark big data application program resource allocation parameter values are set according to the size of the Docker container resource parameters.
In order to solve the technical problems, one technical scheme adopted by the embodiment of the invention is as follows:
the Spark big data application program performance modeling method facing the Docker container comprises the following steps:
acquiring key parameters affecting the performance of the Docker container and the Spark big data application program, and acquiring corresponding experimental data;
putting the experimental data into machine learning for model training, and obtaining a corresponding resource allocation model;
and acquiring an optimal resource allocation model according to the resource allocation model and the input test data.
Further, the step of obtaining the optimal resource allocation model according to the resource allocation model and the input test data includes:
inputting the test data into the obtained resource allocation model to perform performance prediction, and obtaining a predicted value;
comparing the predicted value with a true value to obtain an error rate;
and evaluating each resource allocation model according to the error rate, and finding out an optimal resource allocation model.
Further, the step of obtaining key parameters affecting the performance of the Docker container and Spark big data application program includes:
acquiring key parameters of a Docker container, wherein the key parameters comprise CPU, memory and disk parameters;
and under the condition of limiting a certain resource usage amount of the Docker container, adjusting resource allocation of the Spark big data application program, and acquiring key parameters affecting the execution performance of the Spark big data application program.
Further, the step of putting the experimental data into machine learning to perform model training and obtaining a corresponding resource allocation model includes:
dividing the experimental data into a plurality of groups of data to be used as a model training data set;
and inputting the model training data set into an R machine learning library or a Python machine learning library for model training, and obtaining a resource allocation model of the R machine learning library or the Python machine learning library.
In order to solve the technical problems, a second technical scheme adopted by the embodiment of the invention is as follows:
provided is a Spark big data application program performance modeling device facing a Docker container, which comprises:
the key data acquisition module is used for acquiring key parameters affecting the performance of the Docker container and the Spark big data application program;
the experimental data acquisition module is used for acquiring corresponding experimental data;
the resource allocation model acquisition module is used for putting the experimental data acquired by the experimental data acquisition module into machine learning for model training and acquiring a corresponding resource allocation model;
and the optimal resource allocation model acquisition module is used for acquiring the optimal resource allocation model according to the resource allocation model acquired by the resource allocation model acquisition module and the input test data.
Further, the optimal resource allocation model obtaining module includes:
the predicted value acquisition unit is used for inputting the test data into the acquired resource allocation model to perform performance prediction so as to acquire a predicted value;
an error rate comparing unit for comparing the predicted value with the true value to obtain an error rate;
and the resource allocation model evaluation unit is used for evaluating each resource allocation model according to the error rate and finding out the optimal resource allocation model.
Further, the key data acquisition module includes:
the Docker container parameter acquisition unit is used for acquiring key parameters of the Docker container, wherein the key parameters comprise CPU, memory and disk parameters;
the Spark big data application program parameter obtaining unit is used for adjusting resource allocation of the Spark big data application program under the condition of limiting a certain resource usage amount of the Docker container, and obtaining key parameters affecting the execution performance of the Spark big data application program.
Further, the resource allocation model acquisition module includes:
the model training data set acquisition unit is used for dividing the experimental data into a plurality of groups of data to be used as a model training data set;
the resource allocation model acquisition unit is used for inputting the model training data set into an R machine learning library or a Python machine learning library for model training, and acquiring a corresponding R machine learning library resource allocation model or a corresponding Python machine learning library resource allocation model.
In order to solve the above technical problems, a third technical solution adopted in the embodiment of the present invention is: there is provided a storage device storing a computer program which, when executed by a processor, implements the steps of the method of any of claims 1 to 4.
In order to solve the above technical problems, a fourth technical solution adopted in the embodiment of the present invention is: there is provided a Spark big data application performance modeling apparatus for a Docker container comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the method according to any of claims 1 to 4 when the computer program is executed.
The beneficial effects of the embodiment of the invention are as follows: unlike the prior art, the embodiment of the invention provides a modeling technique that uses machine learning to build a container resource allocation model and to implement and evaluate the running performance of Spark big data applications. Key parameters influencing the performance of the Docker container and the Spark big data application program, together with corresponding experimental data, are acquired; the experimental data are then fed into machine learning for model training to obtain corresponding resource allocation models, and an optimal resource allocation model is obtained according to the resource allocation models and input test data. In other words, by jointly optimizing the resource parameters of the Docker container and the Spark big data application program, the correspondence between the resource parameters of the Spark big data application program and those of the Docker container is found, and the optimal resource allocation parameter values of the Spark big data application program are set according to the size of the Docker container resource parameters, thereby making a Spark big data application program based on a Docker container run more stably.
Drawings
FIG. 1 is a data flow diagram of a Spark big data application performance modeling method for a Docker container according to an embodiment of the present invention;
FIG. 2 is another data flow diagram of a Spark big data application performance modeling method for a Docker container according to an embodiment of the present invention;
FIG. 3 is another data flow diagram of a Spark big data application performance modeling method for a Docker container according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a logic structure of a Spark big data application performance modeling device facing a Docker container according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of another logic structure of a Spark big data application performance modeling device for a Docker container according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of another logic structure of a Spark big data application performance modeling device for a Docker container according to an embodiment of the present invention.
Detailed Description
Embodiment one, referring to fig. 1, an embodiment of a Spark big data application performance modeling method for a Docker container according to the present invention includes:
101. acquiring key parameters affecting the performance of the Docker container and the Spark big data application program, and acquiring corresponding experimental data;
the key parameters affecting the performance of the Docker container and the Spark big data application program are obtained, corresponding experimental data are collected, and specifically, the key parameters affecting the distribution of the resources of the Docker container and the distribution of the Spark resources affecting the performance of the Spark big data application program on the Docker container can be determined by jointly adjusting the distribution of the resources of the CPU, the memory and the I/O of the Spark big data application program on the typical Docker container, and corresponding experimental data are collected based on the Spark big data application program. Specific:
in order to study the impact of different resource configurations on a cluster of containers and Spark big data applications running on top of the cluster, it is first necessary to perform resource control on the Docker containers. The resource use of the Docker container is limited by inserting different parameter options in the command for starting the Docker container, and key parameters affecting the performance are acquired. Table 1 lists key parameters affecting the performance of the Docker container, namely, key parameter options related to CPU, memory and disk:
table 1: docker key parameter options
Then, the resource allocation of the Spark big data application is adjusted. The efficiency of resource use is optimized by adjusting various parameters of the Spark big data application program, thereby improving the execution performance of Spark big data application jobs. Allocating resources for a Spark big data application mainly involves adjusting parameters such as num-executors, executor-cores, executor-memory and driver-memory. Specifically, the corresponding parameters may be set when the Spark big data application job is submitted using the spark-submit shell script.
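For example, a job with these four resource parameters could be submitted as in the minimal Python sketch below, which wraps the standard spark-submit options; the jar path and class name are placeholders:

    import subprocess

    def submit_spark_job(app_jar, main_class, num_executors,
                         executor_cores, executor_memory_gb, driver_memory_gb):
        """Submit a Spark job on Yarn with explicit resource-allocation parameters."""
        cmd = [
            "spark-submit",
            "--master", "yarn",
            "--class", main_class,
            "--num-executors", str(num_executors),
            "--executor-cores", str(executor_cores),
            "--executor-memory", f"{executor_memory_gb}g",
            "--driver-memory", f"{driver_memory_gb}g",
            app_jar,
        ]
        subprocess.run(cmd, check=True)

    submit_spark_job("kmeans.jar", "org.example.KMeans",
                     num_executors=4, executor_cores=2,
                     executor_memory_gb=4, driver_memory_gb=2)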
Specifically, after the Spark big data application program has been run on the Docker container cluster, all Docker containers are removed, a new Docker container cluster is created, and the usage of one remaining resource is limited. The key parameters affecting the resource allocation of the Docker container and the resource allocation of the Spark big data application program on the Docker container are found by comparing experimental results, and the corresponding experimental data are collected as data for the training and testing models. It should be noted that, since the Docker container and the corresponding Spark big data application resources are limited on only one resource at a time during the experiments, the default resource configuration is filled in at the default positions in the collected sample data.
Table 2 lists the key parameters affecting Spark big data applications; each parameter corresponds to a particular part of how a job executes.
Table 2: primary resource parameters in Spark
102. Putting experimental data into machine learning for model training, and obtaining a corresponding resource allocation model;
and (3) inputting the experimental data obtained in the step (101) into machine learning to perform model training by applying a machine learning technology, and obtaining a corresponding resource allocation model. Specifically, model training may be performed for at least two machine learning, and the specific machine learning technology may be an R machine learning library or a Python machine learning library, which is not limited herein. And (3) selecting a plurality of groups of data from the experimental result in the step (101) as a model training set, calling corresponding codes in an R or Python machine learning library to perform model training, and acquiring a corresponding resource allocation model. It should be noted that, when model training is performed by machine learning, attention should be paid to setting parameters according to the features of each machine learning algorithm.
103. Acquiring an optimal resource allocation model according to the resource allocation model and the input test data set;
and step 102, after training the resource allocation model, predicting an input test data set based on the resource allocation model, finding out an error rate compared with a true value, evaluating the resource allocation model, and finding out an optimal model.
Specifically, experimental comparisons are made in three respects in this embodiment: 1. comparing different models, to evaluate the overall accuracy of the modeling technique across the whole workload; 2. comparing the prediction results of different Spark big data applications; 3. comparing the accuracy of performance predictions, i.e. comparing the prediction accuracy of multiple models by comparing predicted results with true values. The optimal resource allocation model is obtained by combining these three comparisons.
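Continuing the sketch above, model selection by error rate might look as follows; the use of mean relative error as the "error rate" is an assumption, since the text only specifies comparing predicted values with true values:

    import numpy as np

    def mean_relative_error(model, X_test, y_test):
        """Mean relative deviation of predicted from true running time."""
        pred = model.predict(X_test)
        return float(np.mean(np.abs(pred - y_test) / y_test))

    errors = {name: mean_relative_error(m, X_test, y_test)
              for name, m in models.items()}
    best_model = min(errors, key=errors.get)  # lowest error rate wins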
After key parameters affecting the performance of the Docker container and the Spark big data application program are obtained, the resource parameters of the Docker container and the Spark big data application program are subjected to joint optimization, the corresponding relation between the resource parameters of the Spark big data application program and the resource parameters of the Docker container is found, and the optimal resource allocation parameter value of the Spark big data application program is set according to the size of the resource parameters of the Docker container.
In this embodiment, key parameters affecting the performance of the Docker container and the Spark big data application program and corresponding experimental data are obtained; the experimental data are then fed into machine learning for model training to obtain corresponding resource allocation models; and an optimal resource allocation model is obtained according to the resource allocation models and input test data. The embodiment of the invention uses machine learning to build a container resource allocation model and to implement and evaluate the running performance of Spark big data applications, and can make a Spark big data application program based on a Docker container run more stably.
Embodiment two, referring to FIG. 2, an embodiment of the Spark big data application performance modeling method for a Docker container of the present invention includes:
201. acquiring key parameters of a Docker container;
key parameters of the Docker container are obtained. Specifically, the resource use of the Docker container is limited by inserting different parameter options into a command for starting the Docker container, and key parameters affecting the performance, mainly key parameters related to a CPU, a memory and a disk, are obtained.
202. Acquiring key parameters affecting the execution performance of the Spark big data application program;
and under the condition of limiting a certain resource usage amount of the Docker container, adjusting resource allocation of the Spark big data application program, and acquiring key parameters affecting the execution performance of the Spark big data application program.
203. Collecting experimental data;
when the key parameters affecting the execution performance of the Docker container and the Spark big data application program are acquired, the data of each parameter are saved as experimental data.
204. Dividing experimental data into a plurality of groups of data to be used as a model training data set;
in step 203, after the experimental data is acquired, the experimental data is divided into a plurality of sets of data as a model training data set.
205. Inputting the model training data set into an R machine learning library or a Python machine learning library for model training, and obtaining a resource allocation model of the R machine learning library or the Python machine learning library;
and (3) inputting the model training data set into machine learning to perform model training by applying a machine learning technology, and acquiring a corresponding resource allocation model. Specifically, model training may be performed for at least two machine learning, and the specific machine learning technology may be an R machine learning library or a Python machine learning library, which is not limited herein. And selecting a plurality of groups of data as a model training set, calling corresponding codes in an R or Python machine learning library to perform model training, and acquiring a corresponding resource allocation model. It should be noted that, when model training is performed by machine learning, attention should be paid to setting parameters according to the features of each machine learning algorithm.
206. Inputting the test data set into the obtained resource allocation model to perform performance prediction, and obtaining a predicted value;
after the corresponding resource allocation model is obtained, the test data set is input into the resource allocation model for prediction, and a predicted value is obtained.
207. Comparing the predicted value with the true value to obtain an error rate;
208. evaluating each resource allocation model according to the error rate, and finding out an optimal resource allocation model;
the predicted values are compared with the true values to obtain error rates, each resource allocation model is evaluated, and the optimal model is found. Specifically, experimental comparisons are made in three respects in this embodiment: 1. comparing different models, to evaluate the overall accuracy of the modeling technique across the whole workload; 2. comparing the prediction results of different Spark big data applications; 3. comparing the accuracy of performance predictions, i.e. comparing the prediction accuracy of multiple models by comparing predicted results with true values. The optimal resource allocation model is obtained through these three comparisons.
In this embodiment, key parameters affecting the performance of the Docker container and the Spark big data application program and corresponding experimental data are obtained; the experimental data are then fed into machine learning for model training to obtain corresponding resource allocation models; an optimal resource allocation model is obtained according to the resource allocation models and input test data; the correspondence between the resource parameters of the Spark big data application program and the resource parameters of the Docker container is found; and the optimal resource allocation parameter values of the Spark big data application program are set according to the size of the Docker container resource parameters. The embodiment of the invention uses machine learning to build a container resource allocation model and to implement and evaluate the running performance of Spark big data applications, and can make a Spark big data application program based on a Docker container run more stably.
Embodiment three, the present invention will be specifically described below by way of a specific application example:
step one, parameter mediation, namely acquiring key parameters affecting the performances of a Docker container and Spark big data application program, and acquiring corresponding experimental data, specifically:
A big data cluster based on Docker containers needs to be deployed, the values of the Docker container resource allocation parameters are set, and the corresponding values of the Spark big data application resource allocation parameters are set, so that experimental tests can be performed. The Docker container resource allocation parameters are detailed in Table 1 and the Spark big data application resource allocation parameters in Table 2, and are not repeated here.
In this application example, we choose the HDFS file system and the Yarn resource manager from the Hadoop ecosystem, as well as the Spark distributed computing framework. Six containers were created using Docker: 1 master and 5 slaves, where the master is responsible for resource management and task scheduling. HDFS, Yarn and Spark are deployed on the cluster in turn. In addition, we select four typical Spark big data applications, namely KMeans, PageRank, WordCount and TeraSort, for experimental tests; the resource allocation parameters of the Docker container and the Spark resource allocation parameters are set to carry out the corresponding experimental tests, where each experiment limits the usage of only one resource and leaves the rest unlimited. The resource allocation parameters of the Docker container are shown in Table 1 and the resource allocation parameters of the Spark big data application in Table 2, and are not repeated here.
Selecting relevant parameters of CPU resources: we first limit the CPU resources of each container to 1 and correspondingly set the CPU resource size of Spark to 1, ensuring that num-executors is not larger than the total number of CPU cores available in the containers; the other parameters are set to the default configuration, and then HDFS, Yarn and Spark are started to measure the running time of the corresponding application. We then increase the container's CPU resource size step by step and, under each setting, iterate over the CPU resource sizes of the corresponding Spark application, ensuring that 1 ≤ num-executors × executor-cores ≤ the total number of CPU cores available to the containers, and measure the running time of each corresponding application. Following this method, the CPU-related parameters are tested in turn, and the performance impact of different CPU resources on typical Spark big data applications based on Docker containers is obtained. The experimental data are saved.
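A minimal Python sketch of this CPU sweep is given below; the run_experiment callback (which would recreate the container cluster with the given CPU limit and submit the Spark job) is hypothetical, and the constraint follows the reconstruction above:

    import itertools

    def sweep_cpu(run_experiment, container_cores=(1, 2, 4, 8)):
        """Enumerate container CPU limits and, for each, all Spark settings
        with 1 <= num_executors * executor_cores <= available container cores."""
        results = []
        for total in container_cores:
            for num_exec, cores in itertools.product(range(1, total + 1), repeat=2):
                if num_exec * cores > total:
                    continue  # constraint: 1 <= num_exec * cores <= total
                runtime = run_experiment(total, num_exec, cores)
                results.append((total, num_exec, cores, runtime))
        return results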
Selecting related parameters of memory resources: when limiting the memory usage of the container, i.e. when using the --memory command option, it is necessary to set the corresponding Yarn memory values. According to the memory of the server used and the number of containers, the container memory value is set in the range [4, 12] (GB), and according to the container memory size the Spark memory is set so that 1 ≤ num-executors × executor-memory ≤ (total available container memory minus 2), reserving 2 GB of memory for the daemons of HDFS, Yarn and Spark. Following this method, the memory-related parameters are tested in turn, and the performance impact of different memory resources on Spark big data applications based on Docker containers is obtained. The experimental data are saved.
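An analogous sketch for the memory sweep, under the same assumptions (hypothetical run_experiment callback; the constraint is the reconstruction given above, with 2 GB reserved for the HDFS, Yarn and Spark daemons):

    def sweep_memory(run_experiment, container_mem_gb=range(4, 13)):
        """For each container memory limit, enumerate Spark settings with
        1 <= num_executors * executor_memory <= (container memory - 2) GB."""
        results = []
        for total_gb in container_mem_gb:
            budget_gb = total_gb - 2  # reserve 2 GB for the daemons
            for num_exec in range(1, budget_gb + 1):
                for exec_mem_gb in range(1, budget_gb // num_exec + 1):
                    runtime = run_experiment(total_gb, num_exec, exec_mem_gb)
                    results.append((total_gb, num_exec, exec_mem_gb, runtime))
        return results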
Further, regarding the selection of the related parameters of the I/O resources, similar to the selection of the related parameters of the CPU resources, the I/O resources of the container are limited, and the I/O resources of Spark are correspondingly set in a certain interval. According to the method, the I/O related parameters are sequentially subjected to experiments, and the performance influence results of typical Spark big data application based on a Docker container under different I/O resources are obtained. And save the experimental data.
Step two, model training: the experimental data are put into machine learning for model training and the corresponding resource allocation models are obtained, specifically:
In this application example, regarding the application of machine learning techniques, the performance of the Spark big data application running in the Docker container is predicted according to the key parameters determined in the previous step, and the corresponding resource allocation models are trained on the experimental data generated in the previous step.
In this application example, model training is performed with two machine learning techniques, a support vector machine (SVM) and an artificial neural network (ANN): several groups of data are selected from the experimental data as a model training set, the corresponding routines in the R machine learning library are called for model training, and the corresponding performance models are obtained.
In this application example we train the ANN using a neural network package in R. The input and output variables of the ANN can be separated by multiple layers, each layer having a configurable number of hidden neurons. The choice of these parameters depends on the number of input and output variables and the complexity of their interrelationships. We search for the best parameters by incrementing the hidden layer count and hidden neuron sizes during training and comparing the results. We provide a number of different artificial neural network models for simulating the performance of various applications running in the container, varying the number n of hidden layers and the number s_i of neurons in each hidden layer (i denotes the index of the hidden layer, 1 ≤ i ≤ n): the first model has two hidden layers, with 5 hidden neurons in the first layer and 3 in the second; the second model also has two hidden layers, but with 4 and 2 neurons per layer respectively; the third and fourth models each have only one hidden layer, with 4 neurons and 2 neurons respectively.
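The four architectures can be written down compactly; the sketch below uses scikit-learn's MLPRegressor as an equivalent of the R neural network package actually used in this example, with X_train and y_train as in the earlier training sketch:

    from sklearn.neural_network import MLPRegressor

    # The four candidate ANN models described above.
    ann_models = {
        "ann1": MLPRegressor(hidden_layer_sizes=(5, 3), max_iter=5000, random_state=0),
        "ann2": MLPRegressor(hidden_layer_sizes=(4, 2), max_iter=5000, random_state=0),
        "ann3": MLPRegressor(hidden_layer_sizes=(4,), max_iter=5000, random_state=0),
        "ann4": MLPRegressor(hidden_layer_sizes=(2,), max_iter=5000, random_state=0),
    }
    for model in ann_models.values():
        model.fit(X_train, y_train)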
The SVM-based regression technique works by first mapping the input to a multidimensional vector space using some form of nonlinear function and then performing a linear regression in the mapped space. We use the e1071 package as an interface to libsvm in R and adjust the kernel, gamma and cost parameters accordingly to refine the SVM model, since the accuracy of the SVM model depends largely on their choice. For example, if the cost is too large, we may store many support vectors and overfit; in contrast, a very small cost value may result in underfitting. In this case we use four different kernel functions (linear, polynomial, radial and sigmoid) for modeling.
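A sketch of the corresponding kernel/gamma/cost search is shown below; it uses scikit-learn's SVR, which, like R's e1071, wraps libsvm, so the four kernels match ("rbf" is the radial kernel). The grid values are assumptions:

    from sklearn.model_selection import GridSearchCV
    from sklearn.svm import SVR

    param_grid = {
        "kernel": ["linear", "poly", "rbf", "sigmoid"],
        "gamma": [0.01, 0.1, 1.0],
        "C": [0.1, 1.0, 10.0, 100.0],  # cost: too large overfits, too small underfits
    }
    search = GridSearchCV(SVR(), param_grid,
                          scoring="neg_mean_absolute_error", cv=5)
    search.fit(X_train, y_train)
    svm_model = search.best_estimator_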
Thirdly, model selection: in the previous step the experimental data were put into machine learning for model training and the corresponding resource allocation models were obtained; now the test data set is predicted based on each resource allocation model, the error rate relative to the true values is found, each resource allocation model is evaluated, and the corresponding optimal model is found. Specifically:
the application example performs experimental comparison from three aspects:
1. comparison of different models, evaluating the overall accuracy of the modeling technique across the whole workload. Taking running the different models with KMeans as the workload as an example: the models generated in step two are tested in turn with KMeans as the workload, and the overall accuracy of the models on KMeans is compared; the results show that the SVM model with the radial kernel function has the highest accuracy.
2. comparison of the prediction results of different big data applications. According to the different characteristics of big data applications, four typical Spark big data applications, KMeans, PageRank, WordCount and TeraSort, are selected; the models generated in step two are run on the four Spark big data applications, and the running time of each model on the four big data applications is compared; the experimental results show that the second ANN model has the highest prediction accuracy.
3. comparison of the accuracy of performance predictions: by comparing with the true values, the prediction accuracy of multiple models is compared. The running times of the big data applications predicted by the models generated in step two are compared with the running times of the corresponding big data applications from step one; the experimental results show that the average performance of the SVM models is higher than that of the ANN models.
To obtain the best performance, the resource allocation of a Spark big data application based on a Docker container must be constrained by the resource allocation of the Docker container. Existing related techniques do not jointly optimize the Spark big data application program and the Docker container: resources are allocated to the Spark big data application program and the Docker container independently, so a holistic view is missing and performance cannot be optimized. In this application example, the resource parameters of the Spark big data application program and the Docker container are jointly optimized, the correspondence between the resource parameters of the Spark big data application program and those of the Docker container is found, and the optimal resource allocation parameter values of the Spark big data application program are set according to the size of the Docker container resource parameters.
Embodiment four, referring to fig. 4 and 5, an embodiment of a Spark big data application performance modeling apparatus for a Docker container according to the present invention includes:
The key data acquisition module 301 is configured to acquire key parameters that affect the performance of the Docker container and the Spark big data application. Specifically, the resource use of the Docker container is limited by inserting different parameter options into the command for starting the Docker container, and the key parameters affecting performance, mainly those related to CPU, memory and disk, are obtained; and, under the condition of limiting a certain resource usage amount of the Docker container, the resource allocation of the Spark big data application program is adjusted and the key parameters affecting the execution performance of the Spark big data application program are acquired.
The experimental data acquisition module 302 is configured to acquire corresponding experimental data;
The resource allocation model acquisition module 303 is configured to put the experimental data acquired by the experimental data acquisition module 302 into machine learning for model training and to acquire the corresponding resource allocation models. Specifically, using machine learning techniques, a model training data set is input into machine learning for model training and the corresponding resource allocation models are obtained. Model training may be performed with at least two machine learning techniques; the specific technique may come from an R machine learning library or a Python machine learning library, which is not limited here. Several groups of data are selected as a model training set, the corresponding routines in the R or Python machine learning library are called to perform model training, and the corresponding resource allocation models are obtained. It should be noted that, when performing model training with machine learning, attention should be paid to setting the parameters according to the characteristics of each machine learning algorithm.
An optimal resource allocation model obtaining module 304 is configured to obtain an optimal resource allocation model according to the resource allocation models obtained by the resource allocation model obtaining module 303 and the input test data. Specifically, experimental comparisons are made in three respects in this embodiment: 1. comparing different models, to evaluate the overall accuracy of the modeling technique across the whole workload; 2. comparing the prediction results of different Spark big data applications; 3. comparing the accuracy of performance predictions, i.e. comparing the prediction accuracy of multiple models by comparing predicted results with true values. The optimal resource allocation model is obtained through these three comparisons.
Further, the optimal resource allocation model obtaining module 304 includes:
the predicted value obtaining unit 3041 is configured to input test data into the obtained resource allocation model to perform performance prediction, so as to obtain a predicted value;
an error rate comparing unit 3042 for comparing the predicted value with a true value to obtain an error rate;
a resource allocation model evaluation unit 3043 configured to evaluate each of the resource allocation models according to the error rate, and find an optimal resource allocation model.
Further, the key data obtaining module 301 includes:
the Docker container parameter obtaining unit 3011 is configured to obtain key parameters of the Docker container, where the key parameters include CPU, memory and disk parameters;
and the Spark big data application program parameter obtaining unit 3012 is configured to adjust resource allocation of the Spark big data application program and obtain key parameters affecting the execution performance of the Spark big data application program under the condition that a certain resource usage amount of the Docker container is limited.
Further, the resource allocation model obtaining module 303 includes:
a model training data set obtaining unit 3031, configured to divide the experimental data into a plurality of groups of data as a model training data set;
the resource allocation model obtaining unit 3032 is configured to input the model training data set into an R machine learning library or a Python machine learning library for model training, and obtain a corresponding R machine learning library resource allocation model or Python machine learning library resource allocation model.
In this embodiment, key parameters affecting the performance of the Docker container and the Spark big data application program and corresponding experimental data are obtained; the experimental data are then fed into machine learning for model training to obtain corresponding resource allocation models; an optimal resource allocation model is obtained according to the resource allocation models and input test data; the correspondence between the resource parameters of the Spark big data application program and the resource parameters of the Docker container is found; and the optimal resource allocation parameter values of the Spark big data application program are set according to the size of the Docker container resource parameters. The embodiment of the invention uses machine learning to build a container resource allocation model and to implement and evaluate the running performance of Spark big data applications, and can make a Spark big data application program based on a Docker container run more stably.
Embodiment five of the present invention provides a storage device storing a computer program; when the computer program is executed by a processor, it implements the steps of the methods in the first, second and third embodiments. The steps are described in detail in those embodiments and are not repeated here.
Embodiment six, referring to fig. 6: a Spark big data application program performance modeling device for a Docker container according to the present invention includes a memory, a processor, and a computer program stored in the memory and executable on the processor. The processor implements the steps of the methods in the first, second and third embodiments when executing the computer program in the memory; the steps are described in detail in those embodiments and are not repeated here.
The foregoing description is only of embodiments of the present invention and is not intended to limit the patent scope of the invention; all equivalent structures or equivalent process transformations made using the description and drawings of the present invention, applied directly or indirectly in other related technical fields, are likewise included within the patent protection scope of the present invention.

Claims (4)

1. A Spark big data application program performance modeling method facing a Docker container is characterized by comprising the following steps:
acquiring key parameters affecting the performance of the Docker container and the Spark big data application program, and acquiring corresponding experimental data; comprising the following steps:
acquiring key parameters of a Docker container, wherein the key parameters comprise CPU, memory and disk parameters;
under the condition of limiting a certain resource usage amount of the Docker container, adjusting resource allocation of the Spark big data application program, and acquiring key parameters affecting execution performance of the Spark big data application program;
putting the experimental data into machine learning for model training, and obtaining a corresponding resource allocation model; comprising the following steps:
dividing the experimental data into a plurality of groups of data to be used as a model training data set;
inputting the model training data set into an R machine learning library or a Python machine learning library for model training, and obtaining a resource allocation model of the R machine learning library or the Python machine learning library;
acquiring an optimal resource allocation model according to the resource allocation model and the input test data; comprising the following steps:
inputting the test data into the obtained resource allocation model to perform performance prediction, and obtaining a predicted value;
comparing the predicted value with a true value to obtain an error rate;
and evaluating each resource allocation model according to the error rate, and finding out an optimal resource allocation model.
2. A Spark big data application performance modeling device for a Docker container, comprising:
the key data acquisition module is used for acquiring key parameters affecting the performance of the Docker container and the Spark big data application program;
the experimental data acquisition module is used for acquiring corresponding experimental data;
the resource allocation model acquisition module is used for putting the experimental data acquired by the experimental data acquisition module into machine learning for model training and acquiring a corresponding resource allocation model;
the optimal resource allocation model acquisition module is used for acquiring an optimal resource allocation model according to the resource allocation model acquired by the resource allocation model acquisition module and the input test data;
the optimal resource allocation model acquisition module comprises:
the predicted value acquisition unit is used for inputting the test data into the acquired resource allocation model to perform performance prediction so as to acquire a predicted value;
an error rate comparing unit for comparing the predicted value with the true value to obtain an error rate;
the resource allocation model evaluation unit is used for evaluating each resource allocation model according to the error rate and finding out an optimal resource allocation model;
the key data acquisition module comprises:
the Docker container parameter acquisition unit is used for acquiring key parameters of the Docker container, wherein the key parameters comprise CPU, memory and disk parameters;
the Spark big data application program parameter acquisition unit is used for adjusting resource allocation of the Spark big data application program and acquiring key parameters affecting the execution performance of the Spark big data application program under the condition of limiting a certain resource usage amount of the Docker container;
the resource allocation model acquisition module includes:
the model training data set acquisition unit is used for dividing the experimental data into a plurality of groups of data to be used as a model training data set;
the resource allocation model acquisition unit is used for inputting the model training data set into an R machine learning library or a Python machine learning library for model training, and acquiring a corresponding R machine learning library resource allocation model or a corresponding Python machine learning library resource allocation model.
3. A storage device storing a computer program which, when executed by a processor, performs the steps of the method of claim 1.
4. A Spark big data application performance modeling device for a Docker container, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the method according to claim 1 when executing the computer program.
CN201810401153.5A 2018-04-28 2018-04-28 Spark big data application program performance modeling method and device for Docker container and storage device Active CN110427263B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810401153.5A CN110427263B (en) 2018-04-28 2018-04-28 Spark big data application program performance modeling method and device for Docker container and storage device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810401153.5A CN110427263B (en) 2018-04-28 2018-04-28 Spark big data application program performance modeling method and device for Docker container and storage device

Publications (2)

Publication Number Publication Date
CN110427263A CN110427263A (en) 2019-11-08
CN110427263B true CN110427263B (en) 2024-03-19

Family

ID=68407143

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810401153.5A Active CN110427263B (en) 2018-04-28 2018-04-28 Spark big data application program performance modeling method and device for Docker container and storage device

Country Status (1)

Country Link
CN (1) CN110427263B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111629048B (en) * 2020-05-22 2023-04-07 浪潮电子信息产业股份有限公司 spark cluster optimal configuration parameter determination method, device and equipment
CN112463389B (en) * 2020-12-10 2024-06-18 中国科学院深圳先进技术研究院 Resource management method and device for distributed machine learning task
US11729279B2 (en) * 2021-01-11 2023-08-15 Dell Products, L.P. System and method for remote assisted optimization of native services

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105868019A (en) * 2016-02-01 2016-08-17 中国科学院大学 Automatic optimization method for performance of Spark platform
CN106648654A (en) * 2016-12-20 2017-05-10 深圳先进技术研究院 Data sensing-based Spark configuration parameter automatic optimization method
WO2018045541A1 (en) * 2016-09-08 2018-03-15 华为技术有限公司 Optimization method for container allocation and processing device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120221373A1 (en) * 2011-02-28 2012-08-30 Manish Marwah Estimating Business Service Responsiveness

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105868019A (en) * 2016-02-01 2016-08-17 中国科学院大学 Automatic optimization method for performance of Spark platform
WO2018045541A1 (en) * 2016-09-08 2018-03-15 华为技术有限公司 Optimization method for container allocation and processing device
CN106648654A (en) * 2016-12-20 2017-05-10 深圳先进技术研究院 Data sensing-based Spark configuration parameter automatic optimization method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Learning-based Spark performance monitoring and analysis in a container environment (基于学习的容器环境Spark性能监控与分析); 皮艾迪 et al.; Journal of Computer Applications (计算机应用); 2017-12-10; Vol. 37, No. 12; pp. 3586-3591 *

Also Published As

Publication number Publication date
CN110427263A (en) 2019-11-08

Similar Documents

Publication Publication Date Title
EP3754496B1 (en) Data processing method and related products
CN109947567B (en) Multi-agent reinforcement learning scheduling method and system and electronic equipment
WO2022262167A1 (en) Cluster resource scheduling method and apparatus, electronic device and storage medium
CN105956021B (en) A kind of automation task suitable for distributed machines study parallel method and its system
CN110674936A (en) Neural network processing method and device, computer equipment and storage medium
US9785472B2 (en) Computing cluster performance simulation using a genetic algorithm solution
CN110633153A (en) Method for realizing neural network model splitting by using multi-core processor and related product
CN110427263B (en) Spark big data application program performance modeling method and device for Docker container and storage device
CN112416585B (en) Deep learning-oriented GPU resource management and intelligent scheduling method
Han et al. Signal processing and networking for big data applications
CN110689121A (en) Method for realizing neural network model splitting by using multi-core processor and related product
Cheong et al. SCARL: Attentive reinforcement learning-based scheduling in a multi-resource heterogeneous cluster
Gong et al. Improving hw/sw adaptability for accelerating cnns on fpgas through a dynamic/static co-reconfiguration approach
CN110309911A (en) Neural network model verification method, device, computer equipment and storage medium
CN106528171B (en) Method of interface, apparatus and system between a kind of heterogeneous computing platforms subsystem
US20210390405A1 (en) Microservice-based training systems in heterogeneous graphic processor unit (gpu) cluster and operating method thereof
US12079734B1 (en) Compilation time reduction for memory and compute bound neural networks
CN102831102A (en) Method and system for carrying out matrix product operation on computer cluster
CN113255873A (en) Clustering longicorn herd optimization method, system, computer equipment and storage medium
Qian et al. R-cnn object detection inference with deep learning accelerator
CN114943885A (en) Synchronous cache acceleration method and system based on training task
Maste et al. Intelligent dynamic time quantum allocation in mlfq scheduling
Zhou et al. Training and Serving System of Foundation Models: A Comprehensive Survey
CN114217688B (en) NPU power consumption optimization system and method based on neural network structure
CN110415162B (en) Adaptive graph partitioning method facing heterogeneous fusion processor in big data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant