CN113704220A - Ceph parameter tuning method based on LSTM and genetic algorithm - Google Patents
- Publication number: CN113704220A
- Application number: CN202111021786.1A
- Authority: CN (China)
- Prior art keywords: conf, ceph, cache, lstm, size
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3447—Performance evaluation by modeling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/12—Computing arrangements based on biological models using genetic models
- G06N3/126—Evolutionary algorithms, e.g. genetic algorithms or genetic programming
Abstract
The invention belongs to the technical field of parameter tuning, and particularly relates to a Ceph parameter tuning method based on LSTM and a genetic algorithm, which comprises the following steps: collecting a data set; proving the nonlinear relationship of the data set; constructing a performance prediction model using LSTM; and optimizing using EGA. The data set is collected as follows: 8 configuration parameters of Ceph are randomly sampled within their adjustable ranges, where the value range of the i-th parameter $conf_i$ is set to $[lb_i, ub_i]$ and $conf_i = random(lb_i, ub_i)$, $i = 1, 2, \dots, 8$; the parameter combination $config = \{conf_1, conf_2, \dots, conf_8\}$ is applied to the Ceph system, and the read-write performance of the corresponding Ceph block storage system is tested; each parameter combination $config_i$ and its corresponding $iops_i$ form a data item $(config_i, iops_i)$, and all collected data items are taken as the data set for constructing the Ceph performance prediction model. According to the invention, an accurate and reliable Ceph performance prediction model is constructed using LSTM, the predicted value of the model is used as the fitness of individuals in the population, and the optimal parameter configuration is found through EGA, so that system performance is optimized.
Description
Technical Field
The invention belongs to the technical field of parameter tuning, and particularly relates to a Ceph parameter tuning method based on LSTM and a genetic algorithm.
Background
Research on performance optimization of the Ceph system, in China and abroad, falls mainly into three areas: optimization for specific hardware environments, optimization for application scenarios, and optimization of internal mechanisms. In terms of specific hardware environments, with the advent of NVDIMM (Non-Volatile Dual In-line Memory Module) products, byte-addressable non-volatile memory can provide I/O performance similar to that of main memory. In simulations that use NVDIMMs as the underlying medium of a Ceph system and map all content to NVDIMMs, single-node throughput improves by more than 100%.
In terms of application scenarios, Ceph is not the most suitable storage system for high-performance computing. Files accessed by data-intensive applications in high-performance computing are classified as read-intensive, write-intensive or read-write-intensive, and these read-write characteristics are used to make file placement decisions that balance the high-performance computing workload. In the field of cloud computing, a new storage engine operates on the data logs of data objects while preserving write atomicity and reliability; experimental results show that the capacity provided by the new storage engine is more than 3 times the original capacity.
There has also been some progress on performance optimization of the internal mechanisms of Ceph storage systems. An existing divide-and-conquer strategy based on MapReduce uses a mixed integer linear programming algorithm to solve for the optimal data placement strategy of Ceph in a heterogeneous environment. Experimental results show that, compared with the original strategy implemented in Ceph, the algorithm improves the read-write performance of the system by 25.6%. An existing multi-attribute decision-making method for Ceph storage selection collects and effectively combines the I/O performance of OSDs (Object Storage Devices), and distinguishes application scenarios by marking application priority, improving overall read-write performance by 13.7%. Other prior work describes in detail the parameters that need to be adjusted in an all-flash environment, including the kernel, file system, disk cache and RBD, but gives no performance comparison before and after adjustment. An existing black-box optimization technique applied to storage systems selects the next parameter configuration according to the previous result, but the method requires a large data set and is difficult to implement in a practical environment. An existing automatic tuning method for Ceph configuration parameters based on Random Forest (RF) and Genetic Algorithm (GA) uses RF to construct a performance prediction model; compared with the black-box optimization technique, it predicts Ceph system performance more quickly and saves a large amount of time and system resources.
However, the amount of data used in that work is too small, so RF may not produce good regression results; in addition, RF cannot make predictions beyond the range of the training data, and it may overfit when modeling data with certain kinds of noise.
Although the hardware-specific and application-oriented optimization methods have made some progress in improving performance, they do not consider general environments and ignore the room for improvement offered by tuning the internal parameters of Ceph. The internal-mechanism optimization methods above are generally applicable for improving performance, but they do not fully account for the nonlinear relationships among parameters.
Disclosure of Invention
To address the technical problems that the default parameters of Ceph cannot fully exploit the read-write performance of the system, and that manual parameter adjustment is inefficient and wastes a large amount of system resources, the invention provides a Ceph parameter tuning method based on LSTM and a genetic algorithm that offers strong applicability, a large performance improvement and high efficiency.
In order to solve the technical problems, the invention adopts the technical scheme that:
A Ceph parameter tuning method based on LSTM and genetic algorithm comprises the following steps:
S1, collecting a data set;
S2, proving the nonlinear relationship of the data set;
S3, constructing a performance prediction model using LSTM;
S4, optimizing using EGA to obtain a set of optimal parameters.
The method for collecting the data set in S1 is as follows:
S1.1, randomly sampling 8 configuration parameters of Ceph within their adjustable ranges: the value range of the i-th parameter $conf_i$ is set to $[lb_i, ub_i]$, and $conf_i = random(lb_i, ub_i)$, $i = 1, 2, \dots, 8$;
S1.2, applying the parameter combination $config = \{conf_1, conf_2, \dots, conf_8\}$ to the Ceph system, and testing the read-write performance of the corresponding Ceph block storage system;
S1.3, forming a data item $(config_i, iops_i)$ from each parameter combination $config_i$ and its corresponding $iops_i$, and taking all collected data items as the data set for constructing the Ceph performance prediction model.
The 8 parameters of Ceph are bluestore_cache_size_ssd, bluestore_cache_size_hdd, bluestore_cache_meta_ratio, bluestore_cache_kv_ratio, osd_max_write_size, osd_map_cache_size, rbd_cache_size and rbd_cache_max_dirty; bluestore_cache_size_ssd and bluestore_cache_size_hdd are of type integer, bluestore_cache_meta_ratio and bluestore_cache_kv_ratio are of type float, and osd_max_write_size, osd_map_cache_size, rbd_cache_size and rbd_cache_max_dirty are of type integer.
The method for proving the nonlinear relationship of the data set in S2 is as follows:
A multiple linear regression model is built, that is, a prediction function formed as a linear combination of the parameters:
$$f(config) = \omega_1 conf_1 + \omega_2 conf_2 + \dots + \omega_8 conf_8 + b$$
where $b$ is a constant and $\omega_1$ to $\omega_8$ are the coefficients. If the variables have a linear relationship, there must exist a set of coefficients and a constant such that the true values are constrained within the range of the predicted values given by the formula; if no such set exists, the relationship is nonlinear.
The method for constructing the performance prediction model using LSTM in S3 is as follows:
An error metric, Error, is defined to reflect the difference between the true values and the predicted values,
where $Actual_i$ is the true value measured on the Ceph block storage system, $Forecast_i$ is the value predicted by the LSTM model, and $n$ is the number of samples.
The method for optimizing using EGA in S4 is as follows: the population size is set to $M$ and the maximum number of iterations to $T$; a parameter combination $config = \{conf_1, conf_2, \dots, conf_8\}$ is taken as one individual in the population, each parameter represents one gene of the individual, and $P(t)$ denotes the population of generation $t$; the EGA finds the individual with the maximum fitness in the population before the genetic operations (elitist), stores its information, and after the genetic operations replaces the individual with the minimum fitness in the new population with elitist, so that elitist is retained in the next-generation population.
Compared with the prior art, the invention has the following beneficial effects:
According to the invention, an accurate and reliable Ceph performance prediction model is constructed using LSTM, the predicted value of the model is used as the fitness of individuals in the population, and the optimal parameter configuration is found through EGA, so that system performance is optimized.
Drawings
FIG. 1 is a schematic diagram of the overall framework of Ceph parameter tuning according to the present invention;
FIG. 2 is a graph showing the effect of the batch size on model accuracy according to the present invention;
FIG. 3 is a graph comparing the predicted effects of the LSTM and RF models of the present invention;
FIG. 4 is a comparison of the effect of the method of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the description of the present invention, it is to be noted that, unless otherwise explicitly specified or limited, the terms "connected" and "connected" are to be interpreted broadly, e.g., as being fixed or detachable or integrally connected; can be mechanically or electrically connected; they may be connected directly or indirectly through intervening media, or they may be interconnected between two elements. The specific meanings of the above terms in the present invention can be understood in specific cases to those skilled in the art.
A Ceph parameter tuning method based on LSTM and genetic algorithm comprises the following steps:
S1, collecting a data set;
S2, proving the nonlinear relationship of the data set;
S3, constructing a performance prediction model using LSTM;
S4, optimizing using EGA to obtain a set of optimal parameters.
Further, the method for collecting the data set in S1 is as follows:
S1.1, randomly sampling 8 configuration parameters of Ceph within their adjustable ranges: the value range of the i-th parameter $conf_i$ is set to $[lb_i, ub_i]$, and $conf_i = random(lb_i, ub_i)$, $i = 1, 2, \dots, 8$;
S1.2, applying the parameter combination $config = \{conf_1, conf_2, \dots, conf_8\}$ to the Ceph system, and testing the read-write performance of the corresponding Ceph block storage system;
S1.3, forming a data item $(config_i, iops_i)$ from each parameter combination $config_i$ and its corresponding $iops_i$, and taking all collected data items as the data set for constructing the Ceph performance prediction model.
Further, the 8 parameters of Ceph are bluestore_cache_size_ssd, bluestore_cache_size_hdd, bluestore_cache_meta_ratio, bluestore_cache_kv_ratio, osd_max_write_size, osd_map_cache_size, rbd_cache_size and rbd_cache_max_dirty; bluestore_cache_size_ssd and bluestore_cache_size_hdd are of type integer, bluestore_cache_meta_ratio and bluestore_cache_kv_ratio are of type float, and osd_max_write_size, osd_map_cache_size, rbd_cache_size and rbd_cache_max_dirty are of type integer. The 8 parameters of Ceph are shown in Table 1.
Parameter name | Type | Value range |
---|---|---|
bluestore_cache_size_ssd | integer | 1GB~10GB |
bluestore_cache_size_hdd | integer | 1GB~ |
bluestore_cache_meta_ratio | float | 0~1 |
bluestore_cache_kv_ratio | float | 0~1 |
osd_max_write_size | integer | 4~2000 |
osd_map_cache_size | integer | 64~1024 |
rbd_cache_size | integer | 1MB~64MB |
rbd_cache_max_dirty | integer | 1MB~64MB |
TABLE 1
The 8 parameters of Ceph in S1.1 are randomly sampled as follows: each of the 8 configuration parameters is drawn within its adjustable range, where the value range of the i-th parameter $conf_i$ is set to $[lb_i, ub_i]$ and $conf_i = random(lb_i, ub_i)$, $i = 1, 2, \dots, 8$; the parameter combination $config = \{conf_1, conf_2, \dots, conf_8\}$ is applied to the Ceph system, and the read-write performance of the corresponding Ceph block storage system is tested; each parameter combination $config_i$ and its corresponding $iops_i$ form a data item $(config_i, iops_i)$, and all collected data items are taken as the data set for constructing the Ceph performance prediction model.
The IOPS value in S1.2 is divided into 6 indices: random read IOPS, random write IOPS, sequential read IOPS, sequential write IOPS, mixed sequential read IOPS and mixed random read IOPS.
Further, the IOPS values corresponding to the parameters in S1.2 are collected as follows:
Step 1: use $random(lb_i, ub_i)$, $i = 1, 2, \dots, 8$, to draw 8 random parameter values;
Step 2: synchronize the modified configuration parameters to the whole Ceph cluster using the cluster management tool Ansible;
Step 3: measure the performance of the block storage system using the fio + rbd test tool;
Step 4: execute the test task periodically using the crontab tool, repeating steps 1-3, and collect the parameter combinations and corresponding IOPS values.
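Step 1 can be sketched in Python as follows. This is a minimal illustration, not the inventors' tooling: the byte conversions, the helper names `sample_config` and `to_conf_lines`, and the 10 GB upper bound for bluestore_cache_size_hdd (which is truncated in Table 1) are all assumptions. Steps 2 to 4 (Ansible, fio and crontab) are outside the sketch.

```python
import random

GB = 1024 ** 3
MB = 1024 ** 2

# (lower bound, upper bound, type) per parameter, following Table 1.
# Sizes are expressed in bytes here, which is an assumption of this sketch.
BOUNDS = {
    "bluestore_cache_size_ssd": (1 * GB, 10 * GB, int),
    # The upper bound is truncated in Table 1; 10 GB is a placeholder assumption.
    "bluestore_cache_size_hdd": (1 * GB, 10 * GB, int),
    "bluestore_cache_meta_ratio": (0.0, 1.0, float),
    "bluestore_cache_kv_ratio": (0.0, 1.0, float),
    "osd_max_write_size": (4, 2000, int),
    "osd_map_cache_size": (64, 1024, int),
    "rbd_cache_size": (1 * MB, 64 * MB, int),
    "rbd_cache_max_dirty": (1 * MB, 64 * MB, int),
}


def sample_config(rng):
    """Draw conf_i = random(lb_i, ub_i) for each of the 8 parameters."""
    config = {}
    for name, (lb, ub, kind) in BOUNDS.items():
        if kind is int:
            config[name] = rng.randint(lb, ub)   # inclusive integer range
        else:
            config[name] = rng.uniform(lb, ub)   # float in [lb, ub]
    return config


def to_conf_lines(config):
    """Render a sampled configuration as 'name = value' lines (hypothetical helper)."""
    return [f"{name} = {value}" for name, value in config.items()]


if __name__ == "__main__":
    for line in to_conf_lines(sample_config(random.Random(42))):
        print(line)
```

Passing a seeded `random.Random` instance keeps each sampled configuration reproducible, which helps when a measured IOPS value has to be traced back to the configuration that produced it.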
Further, the method for proving the nonlinear relationship of the data set in S2 is as follows:
A multiple linear regression model is built, that is, a prediction function formed as a linear combination of the parameters:
$$f(config) = \omega_1 conf_1 + \omega_2 conf_2 + \dots + \omega_8 conf_8 + b$$
where $b$ is a constant and $\omega_1$ to $\omega_8$ are the coefficients. If the variables have a linear relationship, there must exist a set of coefficients and a constant such that the true values are constrained within the range of the predicted values given by the formula; if no such set exists, the relationship is nonlinear.
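One way to make this linearity check concrete is an ordinary least-squares fit followed by a goodness-of-fit measure. The source does not state how the quality of the fit is judged, so the coefficient of determination $R^2$ below is an assumption used only for illustration; $n$ is the number of collected data items, $\hat{f}$ is the fitted model and $\overline{iops}$ is the mean of the measured values.

```latex
(\hat{\omega}, \hat{b}) = \arg\min_{\omega, b} \sum_{j=1}^{n} \bigl( iops_j - f(config_j) \bigr)^2,
\qquad
R^2 = 1 - \frac{\sum_{j=1}^{n} \bigl( iops_j - \hat{f}(config_j) \bigr)^2}
               {\sum_{j=1}^{n} \bigl( iops_j - \overline{iops} \bigr)^2}
```

An $R^2$ close to 1 would indicate that a linear combination explains the data; a value far from 1 for the best-fitting coefficients would support the nonlinear relationship.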
Further, the method for constructing the performance prediction model using LSTM in S3 is as follows:
An error metric, Error, is defined to reflect the difference between the true values and the predicted values,
where $Actual_i$ is the true value measured on the Ceph block storage system, $Forecast_i$ is the value predicted by the LSTM model, and $n$ is the number of samples.
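The Error formula itself does not survive in this text. Since the Error values are reported as percentages, one plausible reading is a mean relative error over the $n$ samples; the sketch below implements that reading purely as an assumption, and the function name `mean_relative_error` is hypothetical.

```python
def mean_relative_error(actual, forecast):
    """Mean of |Actual_i - Forecast_i| / |Actual_i| over n samples, in percent.

    Assumed form only: the source names Actual_i, Forecast_i and n but the
    formula is not reproduced in the text.
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and of equal length")
    total = sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast))
    return total / len(actual) * 100.0


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from the patent.
    print(mean_relative_error([100.0, 200.0], [99.0, 202.0]))
```

A relative measure has the advantage that errors on low-IOPS and high-IOPS configurations are weighted comparably, but it is undefined when a true value is zero.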
Further, the method for optimizing using EGA in S4 is as follows: the population size is set to $M$ and the maximum number of iterations to $T$; a parameter combination $config = \{conf_1, conf_2, \dots, conf_8\}$ is taken as one individual in the population, each parameter represents one gene of the individual, and $P(t)$ denotes the population of generation $t$; the EGA finds the individual with the maximum fitness in the population before the genetic operations (elitist), stores its information, and after the genetic operations replaces the individual with the minimum fitness in the new population with elitist, so that elitist is retained in the next-generation population.
The pseudo code of the EGA algorithm is as follows.
Where lines 9 and 16 represent finding elite individuals elitist. Line 15 represents the replacement of the least adaptable individual in the new population with elitist.
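The pseudocode listing is not reproduced in this text. As a rough illustration of the elitist-retention scheme described above, the following Python sketch may help; it is not the inventors' listing, so the line numbers mentioned above do not apply to it. Binary tournament selection, single-point crossover and uniform-reset mutation are assumptions, all genes are treated as floats, and the `fitness` callable stands in for the IOPS predicted by the LSTM model.

```python
import random


def ega(fitness, bounds, pop_size=20, max_iter=100, pc=0.8, pm=0.1, seed=0):
    """Elitist GA sketch: maximise fitness over box-bounded real genes.

    bounds is a list of (lb, ub) pairs with at least two genes.
    Returns the best individual and the per-generation elitist fitness.
    """
    rng = random.Random(seed)
    pop = [[rng.uniform(lb, ub) for lb, ub in bounds] for _ in range(pop_size)]
    history = []
    for _ in range(max_iter):
        scores = [fitness(ind) for ind in pop]
        # Save the fittest individual before any genetic operation.
        elitist = list(pop[scores.index(max(scores))])
        history.append(max(scores))

        def pick():
            # Binary tournament selection (assumption of this sketch).
            a, b = rng.randrange(pop_size), rng.randrange(pop_size)
            return pop[a] if scores[a] >= scores[b] else pop[b]

        new_pop = []
        while len(new_pop) < pop_size:
            p1, p2 = list(pick()), list(pick())
            if rng.random() < pc:
                # Single-point crossover: swap the gene tails.
                cut = rng.randrange(1, len(bounds))
                p1[cut:], p2[cut:] = p2[cut:], p1[cut:]
            for child in (p1, p2):
                for g, (lb, ub) in enumerate(bounds):
                    if rng.random() < pm:
                        child[g] = rng.uniform(lb, ub)  # uniform-reset mutation
                new_pop.append(child)
        new_pop = new_pop[:pop_size]

        # Replace the least fit individual of the new population with elitist.
        new_scores = [fitness(ind) for ind in new_pop]
        new_pop[new_scores.index(min(new_scores))] = elitist
        pop = new_pop

    scores = [fitness(ind) for ind in pop]
    return pop[scores.index(max(scores))], history


if __name__ == "__main__":
    # Toy fitness standing in for the LSTM-predicted IOPS.
    toy = lambda c: -sum((x - 0.5) ** 2 for x in c)
    best, hist = ega(toy, [(0.0, 1.0)] * 8, seed=1)
    print(hist[0], hist[-1])
```

Because the saved elitist is always reinserted, the best fitness in the population can never decrease from one generation to the next, which is the property that distinguishes this scheme from a plain GA.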
Analysis of the Experimental results of the invention
First, predicting model accuracy
In order to improve the accuracy of the Ceph performance prediction model and reduce the training time, the batch size of the LSTM model needs to be adjusted. The batch size is the number of samples selected for one training step of the neural network. It affects how well and how quickly the model is optimized, and directly affects GPU and memory usage. If the batch size is too small, the gradient fluctuates strongly and the network does not converge easily; if it is too large, memory consumption is too high, the gradient is inaccurate, and training takes a long time. The batch size is determined experimentally, as shown in FIG. 2.
As can be seen from FIG. 2, the accuracy of the model increases with the batch size up to 32, where it reaches its maximum. When the batch size is greater than 32, the model accuracy decreases and the training time gradually increases. According to the experimental results, a batch size of 32 achieves the best training effect.
After determining the parameters of the LSTM model, to verify the accuracy of the performance prediction model, the present invention uses the 3000 data items collected in step S1 as the data set of the performance prediction model, of which 80% is used as the training set, 10% as the validation set, and 10% as the test set.
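The 80/10/10 split can be sketched as follows. The source does not say whether the data items are shuffled before splitting, so the seeded shuffle here is an assumption, and `split_dataset` is a hypothetical helper name.

```python
import random


def split_dataset(items, train=0.8, val=0.1, seed=0):
    """Shuffle items and split them into train, validation and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # shuffling is an assumption
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],  # remainder becomes the test set
    )


if __name__ == "__main__":
    train_set, val_set, test_set = split_dataset(range(3000))
    print(len(train_set), len(val_set), len(test_set))
```

With 3000 data items this yields 2400 training, 300 validation and 300 test items.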
In order to verify the advantages and disadvantages of the Ceph performance prediction model, the performance prediction models are respectively established for the Ceph system by using the LSTM and the RF, and the accuracy of the two performance prediction models is analyzed and compared. The results of the experiment are shown in FIG. 3.
FIG. 3 shows the predicted values of LSTM and RF against the true values. The abscissa represents different parameter configurations and the ordinate represents IOPS values. LSTM and RF denote the predicted values of the performance models built with LSTM and RF, respectively. In terms of the overall trend, the predicted values of both LSTM and RF promptly reflect the performance fluctuation caused by parameter changes, but the prediction curve obtained by RF differs visibly from the true-value curve and deviates considerably from the true values for some configurations. To compare the precision of the two models quantitatively, the invention uses Error to evaluate model precision. In the comparative experiments, the Error values of the RF and LSTM models were calculated to be 0.56% and 0.28%, respectively. It follows that the prediction accuracy of LSTM is better than that of the prior RF method.
Second, performance comparison analysis
In order to evaluate the effect of the method on the adjustment of the Ceph system performance, the method is compared with an LSTM + GA method and an RF and GA based automatic Ceph adjustment method.
Before the experiment, the initial parameters of the EGA are set: mutation probability $P_m$, crossover probability $P_c$, population size $M$ and maximum number of iterations $T$. Mutation is essentially a deep search of the parameter configuration space; if $P_m$ is too large, the genetic algorithm degenerates into a random search, and because of the excessive randomness the EGA spends more time searching. The crossover probability $P_c$ affects how quickly configuration schemes are exchanged, and a higher crossover probability makes the algorithm more efficient. A larger population size $M$ and maximum number of iterations $T$ enlarge the search scale and improve search accuracy, but also increase the time overhead and reduce search efficiency. After multiple experimental tests, the EGA parameters are set as shown in Table 2.
Parameter | Value |
---|---|
Maximum number of iterations $T$ | 100 |
Population size $M$ | 20 |
Crossover probability $P_c$ | 0.8 |
Mutation probability $P_m$ | 0.1 |
TABLE 2
An iteration trend graph of the LSTM + GA method, the LSTM + EGA method and the RF + GA method is shown in figure 4, wherein the abscissa represents the iteration times of the genetic algorithm, and the ordinate represents the read-write performance of the Ceph block storage system. In order to obtain more accurate experimental results, each method respectively takes the average value of 5 times of algorithm operation as the final experimental result.
It can be seen from fig. 4 that the read-write performance of the Ceph block storage system after parameter optimization is about 6750. The LSTM + GA method and the RF + GA method did not differ much in the first 20 generations, both methods reached a plateau around 60 generations, but RF + GA slightly lags behind LSTM + GA. The LSTM + EGA method can reach a steady state in about 40 generations, which shows that the convergence rate of the LSTM + EGA is faster than that of the LSTM + GA and the RF + GA. And the optimal value obtained by the LSTM + EGA method is superior to that obtained by the LSTM + GA method and the RF + GA method, which shows that the convergence precision of the LSTM + EGA is higher.
The obtained optimal parameter combination was applied to a Ceph system in a real environment. The measured mean IOPS of the block storage system was 6612, which differs little from the predicted value and is within an acceptable range. The default parameter configuration reaches only 3971, so the tuned performance is about 1.7 times that of the default configuration.
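The two comparisons can be checked directly from the reported figures: 3971 for the default configuration, 6612 measured after tuning, and the predicted value of about 6750 read from FIG. 4. Because the predicted value is only approximate, the second ratio is indicative rather than exact.

```latex
\frac{6612}{3971} \approx 1.67,
\qquad
\frac{6750 - 6612}{6750} \approx 2.0\%
```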
Although only the preferred embodiments of the present invention have been described in detail, the present invention is not limited to the above embodiments, and various changes can be made without departing from the spirit of the present invention within the knowledge of those skilled in the art, and all changes are encompassed in the scope of the present invention.
Claims (6)
1. A Ceph parameter tuning method based on LSTM and genetic algorithm, characterized by comprising the following steps:
S1, collecting a data set;
S2, proving the nonlinear relationship of the data set, thereby proving the complexity of Ceph tuning;
S3, constructing a performance prediction model using LSTM;
S4, optimizing using EGA to obtain a set of optimal parameters.
2. The Ceph parameter tuning method based on LSTM and genetic algorithm as claimed in claim 1, wherein the method for collecting the data set in S1 is as follows:
S1.1, randomly sampling 8 configuration parameters of Ceph within their adjustable ranges: the value range of the i-th parameter $conf_i$ is set to $[lb_i, ub_i]$, and $conf_i = random(lb_i, ub_i)$, $i = 1, 2, \dots, 8$;
S1.2, applying the parameter combination $config = \{conf_1, conf_2, \dots, conf_8\}$ to the Ceph system, and testing the read-write performance of the corresponding Ceph block storage system;
S1.3, forming a data item $(config_i, iops_i)$ from each parameter combination $config_i$ and its corresponding $iops_i$, and taking all collected data items as the data set for constructing the Ceph performance prediction model.
3. The Ceph parameter tuning method based on LSTM and genetic algorithm as claimed in claim 2, wherein the 8 parameters of Ceph are bluestore_cache_size_ssd, bluestore_cache_size_hdd, bluestore_cache_meta_ratio, bluestore_cache_kv_ratio, osd_max_write_size, osd_map_cache_size, rbd_cache_size and rbd_cache_max_dirty; bluestore_cache_size_ssd and bluestore_cache_size_hdd are of type integer, bluestore_cache_meta_ratio and bluestore_cache_kv_ratio are of type float, and osd_max_write_size, osd_map_cache_size, rbd_cache_size and rbd_cache_max_dirty are of type integer.
4. The Ceph parameter tuning method based on LSTM and genetic algorithm as claimed in claim 1, wherein the method for proving the nonlinear relationship of the data set in S2 is as follows:
a multiple linear regression model is built, that is, a prediction function formed as a linear combination of the parameters:
$$f(config) = \omega_1 conf_1 + \omega_2 conf_2 + \dots + \omega_8 conf_8 + b$$
where $b$ is a constant and $\omega_1$ to $\omega_8$ are the coefficients; if the variables have a linear relationship, there must exist a set of coefficients and a constant such that the true values are constrained within the range of the predicted values given by the formula.
5. The Ceph parameter tuning method based on LSTM and genetic algorithm as claimed in claim 1, wherein the method for constructing the performance prediction model using LSTM in S3 is as follows:
an error metric, Error, is defined to reflect the difference between the true values and the predicted values,
where $Actual_i$ is the true value measured on the Ceph block storage system, $Forecast_i$ is the value predicted by the LSTM model, and $n$ is the number of samples.
6. The Ceph parameter tuning method based on LSTM and genetic algorithm as claimed in claim 1, wherein the method for optimizing using EGA in S4 is as follows: the population size is set to $M$ and the maximum number of iterations to $T$; a parameter combination $config = \{conf_1, conf_2, \dots, conf_8\}$ is taken as one individual in the population, each parameter represents one gene of the individual, and $P(t)$ denotes the population of generation $t$; the EGA finds the individual with the maximum fitness in the population before the genetic operations (elitist), stores its information, and after the genetic operations replaces the individual with the minimum fitness in the new population with elitist, so that elitist is retained in the next-generation population.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111021786.1A CN113704220A (en) | 2021-09-01 | 2021-09-01 | Ceph parameter tuning method based on LSTM and genetic algorithm |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113704220A true CN113704220A (en) | 2021-11-26 |
Family
ID=78658813
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111021786.1A (Pending), CN113704220A (en) | Ceph parameter tuning method based on LSTM and genetic algorithm | 2021-09-01 | 2021-09-01 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113704220A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20230108213A1 (en) * | 2021-10-05 | 2023-04-06 | Softiron Limited | Ceph Failure and Verification |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108764568A (en) * | 2018-05-28 | 2018-11-06 | Harbin Institute of Technology | Data prediction model tuning method and device based on LSTM networks |
CN109243172A (en) * | 2018-07-25 | 2019-01-18 | South China University of Technology | Traffic flow forecasting method based on genetic algorithm optimization LSTM neural network |
CN109634924A (en) * | 2018-11-02 | 2019-04-16 | South China Normal University | File system parameter automated tuning method and system based on machine learning |
CN110766237A (en) * | 2019-10-31 | 2020-02-07 | Inner Mongolia University of Technology | Bus passenger flow prediction method and system based on SPGAPSO-SVM algorithm |
US20200125945A1 (en) * | 2018-10-18 | 2020-04-23 | Drvision Technologies Llc | Automated hyper-parameterization for image-based deep model learning |
- 2021-09-01: CN application CN202111021786.1A (publication CN113704220A), status: active, pending
Non-Patent Citations (2)
Title |
---|
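Wen Huiying et al.: "Application of the GA-LSTM model in highway traffic flow prediction", Journal of Harbin Institute of Technology * |
Chen Yu et al.: "Automatic Ceph parameter tuning based on random forest and genetic algorithm", Journal of Computer Applications * |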
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Mahgoub et al. | {OPTIMUSCLOUD}: Heterogeneous configuration optimization for distributed databases in the cloud | |
US9811781B2 (en) | Time-series data prediction device of observation value, time-series data prediction method of observation value, and program | |
CN110968272B (en) | Time sequence prediction-based method and system for optimizing storage performance of mass small files | |
CN109582758B (en) | Optimization method for Elasticsearch index shards | |
CN108009260B (en) | Copy placement method combining node load and distance under big data storage | |
CN110289994B (en) | Cluster capacity adjusting method and device | |
CN108462605B (en) | Data prediction method and device | |
CN112926635B (en) | Target clustering method based on iterative self-adaptive neighbor propagation algorithm | |
WO2018129500A1 (en) | Optimized navigable key-value store | |
Bausch et al. | Making cost-based query optimization asymmetry-aware | |
CN109471847B (en) | I/O congestion control method and control system | |
CN112398700B (en) | Service degradation method and device, storage medium and computer equipment | |
CN108363643A (en) | A kind of HDFS copy management methods based on file access temperature | |
CN111738477A (en) | Deep feature combination-based power grid new energy consumption capability prediction method | |
CN112884236B (en) | Short-term load prediction method and system based on VDM decomposition and LSTM improvement | |
CN114880806A (en) | New energy automobile sales prediction model parameter optimization method based on particle swarm optimization | |
CN106776370A (en) | Cloud storage method and device based on the assessment of object relevance | |
CN113704220A (en) | Ceph parameter tuning method based on LSTM and genetic algorithm | |
CN116401954A (en) | Prediction method, prediction device, equipment and medium for cycle life of lithium battery | |
CN115941696A (en) | Heterogeneous Big Data Distributed Cluster Storage Optimization Method | |
JP7098204B2 (en) | VOD service cache replacement method based on random forest algorithm in edge network environment | |
CN116822360A (en) | Power system frequency track prediction method, device, medium and equipment | |
CN117129875A (en) | Method, system, equipment and medium for training and predicting battery capacity prediction model | |
CN113282241B (en) | Hard disk weight optimization method and device based on Ceph distributed storage | |
CN112990603B (en) | Air conditioner cold load prediction method and system considering frequency domain decomposed data characteristics |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20211126 |