CN113704220A - Ceph parameter tuning method based on LSTM and genetic algorithm - Google Patents

Info

Publication number
CN113704220A
Authority
CN
China
Prior art keywords
conf
ceph
cache
lstm
size
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111021786.1A
Other languages
Chinese (zh)
Inventor
李雷孝
牛铁铭
李�杰
李少旭
林浩
马志强
万剑雄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inner Mongolia University of Technology
Original Assignee
Inner Mongolia University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inner Mongolia University of Technology filed Critical Inner Mongolia University of Technology
Priority to CN202111021786.1A priority Critical patent/CN113704220A/en
Publication of CN113704220A publication Critical patent/CN113704220A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10: File systems; File servers
    • G06F16/18: File system types
    • G06F16/182: Distributed file systems
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/30: Monitoring
    • G06F11/34: Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3447: Performance evaluation by modeling
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/044: Recurrent networks, e.g. Hopfield networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/12: Computing arrangements based on biological models using genetic models
    • G06N3/126: Evolutionary algorithms, e.g. genetic algorithms or genetic programming

Abstract

The invention belongs to the technical field of parameter tuning, and particularly relates to a Ceph parameter tuning method based on LSTM and a genetic algorithm, which comprises the following steps: S1, collecting a data set; S2, proving the nonlinear relationship of the data set; S3, constructing a performance prediction model by using LSTM; S4, optimizing by using EGA. The method for collecting the data set is as follows: the 8 configuration parameters of Ceph are randomly assigned values within their adjustable ranges, where the i-th parameter $conf_i$ has the value range $[lb_i, ub_i]$ and $conf_i = random(lb_i, ub_i)$, $i = 1, 2, …, 8$; the parameter combination $config = \{conf_1, conf_2, conf_3, conf_4, conf_5, conf_6, conf_7, conf_8\}$ is applied to the Ceph system, and the read-write performance of the corresponding Ceph block storage system is tested; each parameter combination $config_i$ and its corresponding $iops_i$ form a data item $(config_i, iops_i)$, and all collected data items are taken as the data set for constructing the Ceph performance prediction model. According to the invention, an accurate and reliable Ceph performance prediction model is constructed by using LSTM, the predicted value of the performance prediction model is used as the fitness of the individuals in the population, and the optimal parameter configuration is found through EGA, so that the system performance is optimal.

Description

Ceph parameter tuning method based on LSTM and genetic algorithm
Technical Field
The invention belongs to the technical field of parameter tuning, and particularly relates to a Ceph parameter tuning method based on LSTM and a genetic algorithm.
Background
The performance optimization work of domestic and foreign researchers on the Ceph system falls mainly into three areas: optimization for a specific hardware environment, optimization for application scenarios, and optimization of internal mechanisms. In terms of specific hardware environment optimization, with the advent of NVDIMM (Non-Volatile Dual In-line Memory Module) products, byte-addressable non-volatile memory will provide IO performance similar to that of memory. Simulations of NVDIMMs as the underlying media in a Ceph system show that, when all content is mapped to NVDIMMs, the throughput of a single node can be improved by more than 100%.
In application-oriented scenario optimization, Ceph is not the most suitable storage system in the field of high-performance computing. Files accessed by data-intensive applications in high-performance computing are classified as read-intensive, write-intensive or read-write-intensive. The read-write characteristics of these files are used to make file placement decisions that balance the high-performance computing workload. In the field of cloud computing, a new storage engine that operates on the data logs of data objects while preserving write atomicity and reliability has been proposed, and experimental results show that the capacity it provides is more than 3 times the original capacity.
There has also been some progress in research on performance optimization of the internal mechanisms of Ceph storage systems. An existing divide-and-conquer strategy based on MapReduce solves the optimal data placement strategy of Ceph in a heterogeneous environment by using a mixed integer linear programming algorithm. The experimental results show that, compared with the original strategy implemented in Ceph, the algorithm improves the read-write performance of the system by 25.6%. An existing multi-attribute decision-making method for Ceph storage selection collects and effectively combines the IO performance of each OSD (Object Storage Device), and distinguishes different application scenarios by marking application priority, so that the overall read-write performance is improved by 13.7%. Other prior art describes in detail the parameters that need to be adjusted in an all-flash environment, including the kernel, file system, disk cache and RBD, but gives no performance comparison before and after adjustment. An existing black-box optimization technique applied to storage systems selects the next parameter configuration to try according to the previous results, but the method needs a large data set for support and is difficult to implement in a practical environment. An existing automatic tuning method for Ceph configuration parameters based on Random Forest (RF) and Genetic Algorithm (GA) uses RF to construct a performance prediction model; compared with the black-box optimization technique, it predicts Ceph system performance more quickly and saves a large amount of time and system resources.
However, the amount of data used in that work is too small, RF may not produce good regression results, and RF cannot make predictions beyond the range of the training data, which may result in overfitting when modeling data with certain specific noise.
Although the specific hardware environment optimization and application-oriented scene optimization methods have a certain progress on performance improvement, the general environment is not considered, and the performance improvement space caused by adjustment and optimization of the internal parameters of the Ceph is ignored. In the above internal mechanism optimization method research, the method has general applicability to performance improvement, but the nonlinear relationship of parameters cannot be considered completely.
Disclosure of Invention
Aiming at the technical problems that the read-write performance of the system cannot be fully exerted by the default parameters of the Ceph, the manual parameter adjustment efficiency is low, and a large amount of system resources are wasted, the invention provides the Ceph parameter adjusting and optimizing method based on the LSTM and the genetic algorithm, which has strong applicability, large performance improvement and high efficiency.
In order to solve the technical problems, the invention adopts the technical scheme that:
a Ceph parameter tuning method based on LSTM and genetic algorithm comprises the following steps:
S1, collecting a data set;
S2, proving the nonlinear relationship of the data set;
S3, constructing a performance prediction model by using LSTM;
S4, optimizing by using EGA to obtain a set of optimal parameters.
The method for collecting the data set in S1 includes:
S1.1, randomly assigning values to the 8 configuration parameters of Ceph within their adjustable ranges, where the i-th parameter $conf_i$ has the value range $[lb_i, ub_i]$ and $conf_i = random(lb_i, ub_i)$, $i = 1, 2, …, 8$;
S1.2, applying the parameter combination $config = \{conf_1, conf_2, conf_3, conf_4, conf_5, conf_6, conf_7, conf_8\}$ to the Ceph system, and testing the read-write performance of the corresponding Ceph block storage system;
S1.3, composing each parameter combination $config_i$ and its corresponding $iops_i$ into a data item $(config_i, iops_i)$, and taking all collected data items as the data set for constructing the Ceph performance prediction model.
The 8 parameters of Ceph are bluestore_cache_size_ssd, bluestore_cache_size_hdd, bluestore_cache_meta_ratio, bluestore_cache_kv_ratio, osd_max_write_size, osd_map_cache_size, rbd_cache_size and rbd_cache_max_dirty, respectively; bluestore_cache_size_ssd and bluestore_cache_size_hdd are of type integer, bluestore_cache_meta_ratio and bluestore_cache_kv_ratio are of type float, and osd_max_write_size, osd_map_cache_size, rbd_cache_size and rbd_cache_max_dirty are of type integer.
The method for proving the nonlinear relationship of the data set in S2 is as follows:
a multiple linear regression model, which predicts through a linear combination of the parameters, is established and fitted to the data set; the multiple linear regression model is:
$$f(config) = \omega_1 conf_1 + \omega_2 conf_2 + \dots + \omega_8 conf_8 + b$$
where $b$ is a constant and $\omega_1$ to $\omega_8$ are the coefficients. If the variables had a linear relationship, there would have to exist a set of coefficients and a constant such that the true values are constrained within the range of the predicted values obtained from the formula.
The method for constructing the performance prediction model by using the LSTM in the S3 comprises the following steps:
an error metric Error is defined to reflect the difference between the true values and the predicted values,
[Formula image BDA0003241729300000031: definition of Error]
where $Actual_i$ is the true value measured on the Ceph block storage system, $Forecast_i$ is the predicted value of the LSTM model, and $n$ is the number of samples.
The method for optimizing by using EGA in S4 is as follows: the population size is set to M and the maximum number of iterations to T; a parameter combination $config = \{conf_1, conf_2, conf_3, conf_4, conf_5, conf_6, conf_7, conf_8\}$ is taken as one individual in the population, each parameter represents a gene of the individual, and P(t) denotes the population of the t-th generation; the EGA algorithm finds the individual elitist with the maximum fitness in the population before the genetic operations and stores its information, and after the genetic operations replaces the individual with the minimum fitness in the new population with elitist, so that elitist is retained in the next-generation population.
Compared with the prior art, the invention has the following beneficial effects:
according to the invention, an accurate and reliable Ceph performance prediction model is constructed by using LSTM, the predicted value of the performance prediction model is used as the fitness of population individuals, and the optimal parameter configuration is found through EGA, so that the system performance is optimal.
Drawings
FIG. 1 is a schematic diagram of the overall framework of Ceph parameter tuning according to the present invention;
FIG. 2 is a graph showing the effect of the batch size on model accuracy according to the present invention;
FIG. 3 is a graph comparing the predicted effects of the LSTM and RF models of the present invention;
FIG. 4 is a comparison of the effect of the method of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the description of the present invention, it is to be noted that, unless otherwise explicitly specified or limited, the terms "connected" and "connected" are to be interpreted broadly, e.g., as being fixed or detachable or integrally connected; can be mechanically or electrically connected; they may be connected directly or indirectly through intervening media, or they may be interconnected between two elements. The specific meanings of the above terms in the present invention can be understood in specific cases to those skilled in the art.
A Ceph parameter tuning method based on LSTM and genetic algorithm comprises the following steps:
S1, collecting a data set;
S2, proving the nonlinear relationship of the data set;
S3, constructing a performance prediction model by using LSTM;
S4, optimizing by using EGA to obtain a set of optimal parameters.
Further, the method for collecting the data set in S1 is as follows:
S1.1, randomly assigning values to the 8 configuration parameters of Ceph within their adjustable ranges, where the i-th parameter $conf_i$ has the value range $[lb_i, ub_i]$ and $conf_i = random(lb_i, ub_i)$, $i = 1, 2, …, 8$;
S1.2, applying the parameter combination $config = \{conf_1, conf_2, conf_3, conf_4, conf_5, conf_6, conf_7, conf_8\}$ to the Ceph system, and testing the read-write performance of the corresponding Ceph block storage system;
S1.3, composing each parameter combination $config_i$ and its corresponding $iops_i$ into a data item $(config_i, iops_i)$, and taking all collected data items as the data set for constructing the Ceph performance prediction model.
Further, the 8 parameters of Ceph are bluestore_cache_size_ssd, bluestore_cache_size_hdd, bluestore_cache_meta_ratio, bluestore_cache_kv_ratio, osd_max_write_size, osd_map_cache_size, rbd_cache_size and rbd_cache_max_dirty, respectively; bluestore_cache_size_ssd and bluestore_cache_size_hdd are of type integer, bluestore_cache_meta_ratio and bluestore_cache_kv_ratio are of type float, and osd_max_write_size, osd_map_cache_size, rbd_cache_size and rbd_cache_max_dirty are of type integer. The 8 parameters of Ceph are shown in Table 1.
Parameter name               Type     Value range
bluestore_cache_size_ssd     integer  1GB~10GB
bluestore_cache_size_hdd     integer  1GB~10GB
bluestore_cache_meta_ratio   float    0~1
bluestore_cache_kv_ratio     float    0~1
osd_max_write_size           integer  4~2000
osd_map_cache_size           integer  64~1024
rbd_cache_size               integer  1MB~64MB
rbd_cache_max_dirty          integer  1MB~64MB
TABLE 1
The method for randomly assigning values to the 8 parameters of Ceph in S1.1 is as follows: the 8 configuration parameters of Ceph are randomly assigned values within their adjustable ranges, where the i-th parameter $conf_i$ has the value range $[lb_i, ub_i]$ and $conf_i = random(lb_i, ub_i)$, $i = 1, 2, …, 8$; the parameter combination $config = \{conf_1, conf_2, conf_3, conf_4, conf_5, conf_6, conf_7, conf_8\}$ is applied to the Ceph system, and the read-write performance of the corresponding Ceph block storage system is tested; each parameter combination $config_i$ and its corresponding $iops_i$ form a data item $(config_i, iops_i)$, and all collected data items are taken as the data set for constructing the Ceph performance prediction model.
The IOPS value in S1.2 is divided into 6 indices: random read IOPS, random write IOPS, sequential read IOPS, sequential write IOPS, mixed sequential read IOPS and mixed random read IOPS.
Further, the method for collecting the IOPS value corresponding to the parameters in S1.2 includes:
Step 1: using $random(lb_i, ub_i)$, $i = 1, 2, …, 8$, to randomly generate the 8 parameter values;
Step 2: synchronizing the modified configuration parameters to the whole Ceph cluster by using the cluster management tool Ansible;
Step 3: measuring the performance of the block storage system by using the fio + rbd test tool;
Step 4: periodically executing the test task by using the crontab tool, repeating steps 1-3, and collecting the parameter combinations and the corresponding IOPS values.
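The sampling and bookkeeping part of steps 1 to 4 can be sketched as follows. This is only an illustration under stated assumptions: the ranges are copied from Table 1, the sizes are expressed in bytes with 1 GB taken as 1024^3 bytes, and `measure_iops` is a hypothetical callback standing in for the Ansible, fio + rbd and crontab tooling, which is not reproduced here.

```python
import random

GB = 1024 ** 3  # assumption: sizes in Table 1 are expressed here in bytes
MB = 1024 ** 2

# (type, lower bound lb_i, upper bound ub_i) for each of the 8 parameters in Table 1
PARAM_RANGES = {
    "bluestore_cache_size_ssd": (int, 1 * GB, 10 * GB),
    "bluestore_cache_size_hdd": (int, 1 * GB, 10 * GB),
    "bluestore_cache_meta_ratio": (float, 0.0, 1.0),
    "bluestore_cache_kv_ratio": (float, 0.0, 1.0),
    "osd_max_write_size": (int, 4, 2000),
    "osd_map_cache_size": (int, 64, 1024),
    "rbd_cache_size": (int, 1 * MB, 64 * MB),
    "rbd_cache_max_dirty": (int, 1 * MB, 64 * MB),
}


def sample_config(rng):
    """Draw conf_i = random(lb_i, ub_i) for every parameter."""
    config = {}
    for name, (kind, lb, ub) in PARAM_RANGES.items():
        config[name] = rng.randint(lb, ub) if kind is int else rng.uniform(lb, ub)
    return config


def collect_dataset(n_items, measure_iops, seed=0):
    """Build the list of (config_i, iops_i) data items.

    measure_iops is a hypothetical callable that applies a configuration to
    the cluster and returns the measured IOPS; it is not part of the source.
    """
    rng = random.Random(seed)
    dataset = []
    for _ in range(n_items):
        config = sample_config(rng)
        dataset.append((config, measure_iops(config)))
    return dataset
```

In a dry run, `collect_dataset(5, lambda cfg: 0.0)` returns five items whose values all lie inside the Table 1 ranges.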
Further, the method for proving the nonlinear relationship of the data set in S2 is as follows:
a multiple linear regression model, which predicts through a linear combination of the parameters, is established and fitted to the data set; the multiple linear regression model is:
$$f(config) = \omega_1 conf_1 + \omega_2 conf_2 + \dots + \omega_8 conf_8 + b$$
where $b$ is a constant and $\omega_1$ to $\omega_8$ are the coefficients. If the variables had a linear relationship, there would have to exist a set of coefficients and a constant such that the true values are constrained within the range of the predicted values obtained from the formula.
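One way to carry out such a check is to fit the multiple linear regression by least squares and see how much of the variance it explains. The sketch below is an assumption-laden illustration, not the patent's procedure: it uses synthetic data instead of measured IOPS, solves the normal equations with plain Gauss-Jordan elimination, and uses the coefficient of determination R² as the goodness-of-fit measure, which the source does not name.

```python
import random


def fit_linear(X, y):
    """Least-squares fit of y ~ w . x + b via the normal equations."""
    rows = [list(x) + [1.0] for x in X]  # append a bias column for b
    d = len(rows[0])
    # Augmented matrix [A^T A | A^T y]
    M = [[sum(r[i] * r[j] for r in rows) for j in range(d)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(d)]
    for c in range(d):
        p = max(range(c, d), key=lambda r: abs(M[r][c]))  # partial pivoting
        M[c], M[p] = M[p], M[c]
        for r in range(d):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * v for a, v in zip(M[r], M[c])]
    coef = [M[i][d] / M[i][i] for i in range(d)]
    return coef[:-1], coef[-1]  # (omega_1..omega_8, b)


def r_squared(X, y, w, b):
    """Fraction of the variance of y explained by the linear model."""
    pred = [sum(wi * xi for wi, xi in zip(w, x)) + b for x in X]
    mean = sum(y) / len(y)
    ss_res = sum((t - p) ** 2 for t, p in zip(y, pred))
    ss_tot = sum((t - mean) ** 2 for t in y)
    return 1.0 - ss_res / ss_tot


# Synthetic demonstration data (not measurements from a Ceph cluster)
rng = random.Random(0)
X = [[rng.random() for _ in range(8)] for _ in range(200)]
y_linear = [sum((i + 1) * v for i, v in enumerate(x)) + 3.0 for x in X]
y_nonlinear = [sum((v - 0.5) ** 2 for v in x) for x in X]

w, b = fit_linear(X, y_linear)
r2_linear = r_squared(X, y_linear, w, b)        # close to 1
w, b = fit_linear(X, y_nonlinear)
r2_nonlinear = r_squared(X, y_nonlinear, w, b)  # far below 1
```

A linear target is reproduced almost exactly, while the quadratic target leaves most of the variance unexplained; a similarly poor fit on the collected (config, iops) items would be the evidence of nonlinearity that step S2 looks for.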
Further, the method for constructing the performance prediction model by using the LSTM in S3 includes:
an error metric Error is defined to reflect the difference between the true values and the predicted values,
[Formula image BDA0003241729300000071: definition of Error]
where $Actual_i$ is the true value measured on the Ceph block storage system, $Forecast_i$ is the predicted value of the LSTM model, and $n$ is the number of samples.
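The Error formula itself is only available as an image and is not reproduced in this text. As a purely illustrative stand-in, the sketch below computes the mean absolute percentage error over n samples, which uses the same three quantities (Actual, Forecast, n) and yields a percentage like the values reported in the experiments. Whether this matches the patent's exact definition is an assumption.

```python
def mean_abs_pct_error(actual, forecast):
    """Mean absolute percentage error, in percent.

    Assumed stand-in for the patent's Error metric; the original formula
    is an image that is not available in the text.
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and equally long")
    total = sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)
```

For example, `mean_abs_pct_error([100, 200], [99, 202])` averages two 1% deviations.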
Further, the optimization method using EGA in S4 is as follows: the population size is set to M and the maximum number of iterations to T; a parameter combination $config = \{conf_1, conf_2, conf_3, conf_4, conf_5, conf_6, conf_7, conf_8\}$ is taken as one individual in the population, each parameter represents a gene of the individual, and P(t) denotes the population of the t-th generation; the EGA algorithm finds the individual elitist with the maximum fitness in the population before the genetic operations and stores its information, and after the genetic operations replaces the individual with the minimum fitness in the new population with elitist, so that elitist is retained in the next-generation population.
The pseudo code of the EGA algorithm is as follows.
[Pseudocode images BDA0003241729300000072 and BDA0003241729300000081: EGA algorithm listing]
In the listing, lines 9 and 16 find the elite individual elitist, and line 15 replaces the individual with the lowest fitness in the new population with elitist.
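Because the pseudocode is only available as images, the following is an independent minimal sketch of a genetic algorithm with elitist retention, not a transcription of the patent's listing. The selection scheme (binary tournament), single-point crossover and resampling mutation are assumptions, the default values M = 20, T = 100, Pc = 0.8 and Pm = 0.1 are the ones the document reports in Table 2, and `fitness` is any callable; in the patent's method it would be the LSTM model's predicted IOPS.

```python
import random


def tournament(pop, scores, rng):
    """Binary tournament selection (assumed; the source does not name a scheme)."""
    a, b = rng.randrange(len(pop)), rng.randrange(len(pop))
    return list(pop[a] if scores[a] >= scores[b] else pop[b])


def ega(fitness, bounds, M=20, T=100, pc=0.8, pm=0.1, seed=0):
    """Genetic algorithm that carries the best individual into each new generation."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(M)]
    history = []
    for _ in range(T):
        scores = [fitness(ind) for ind in pop]
        best_idx = scores.index(max(scores))
        elitist = list(pop[best_idx])  # saved before the genetic operations
        history.append(scores[best_idx])
        new_pop = []
        while len(new_pop) < M:
            p1 = tournament(pop, scores, rng)
            p2 = tournament(pop, scores, rng)
            if rng.random() < pc:  # single-point crossover
                cut = rng.randrange(1, len(bounds))
                p1[cut:], p2[cut:] = p2[cut:], p1[cut:]
            for child in (p1, p2):
                for g, (lo, hi) in enumerate(bounds):
                    if rng.random() < pm:  # mutation: resample the gene
                        child[g] = rng.uniform(lo, hi)
                new_pop.append(child)
        new_pop = new_pop[:M]
        new_scores = [fitness(ind) for ind in new_pop]
        # elitist replaces the least fit individual of the new population
        new_pop[new_scores.index(min(new_scores))] = elitist
        pop = new_pop
    scores = [fitness(ind) for ind in pop]
    best_idx = scores.index(max(scores))
    return pop[best_idx], scores[best_idx], history


def toy_fitness(ind):
    """Toy fitness (assumption): peak at 0.3 in every gene, stands in for predicted IOPS."""
    return -sum((g - 0.3) ** 2 for g in ind)


best, best_score, history = ega(toy_fitness, [(0.0, 1.0)] * 8)
```

Since elitist is always reinserted, the best fitness recorded in `history` never decreases from one generation to the next, which is the property elitist retention is meant to guarantee.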
Analysis of the Experimental results of the invention
First, predicting model accuracy
In order to improve the accuracy of the Ceph performance prediction model and reduce the training time, the batch size of the LSTM model needs to be adjusted. The batch size is the number of samples used in one training step of the neural network. It affects how well and how fast the model is optimized, and directly affects GPU and memory usage. If the batch size is too small, the gradient fluctuates strongly and the network does not converge easily; if it is too large, memory usage is too high, the gradient is inaccurate, and training takes a long time. The batch size is determined experimentally, as shown in Fig. 2.
As can be seen from Fig. 2, the accuracy of the model increases as the batch size increases up to 32, where the model accuracy is maximized. When the batch size is greater than 32, the model accuracy decreases and the training time gradually increases. According to the experimental results, a batch size of 32 achieves the best training effect.
After determining the parameters of the LSTM model, to verify the accuracy of the performance prediction model, the present invention uses the 3000 sets of data acquired in step S1 as the data set of the performance prediction model, of which 80% is used as the training set, 10% as the validation set and 10% as the test set.
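The 80/10/10 partition can be sketched as below. Shuffling before the split and the fixed seed are assumptions, since the source does not say how the items were assigned to the three sets.

```python
import random


def split_dataset(items, train=0.8, val=0.1, seed=0):
    """Split items into training, validation and test sets (80/10/10 by default)."""
    items = list(items)
    random.Random(seed).shuffle(items)  # assumption: random assignment
    n_train = round(len(items) * train)
    n_val = round(len(items) * val)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])


train_set, val_set, test_set = split_dataset(range(3000))
```

With 3000 data items this gives 2400 training, 300 validation and 300 test items.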
In order to verify the advantages and disadvantages of the Ceph performance prediction model, the performance prediction models are respectively established for the Ceph system by using the LSTM and the RF, and the accuracy of the two performance prediction models is analyzed and compared. The results of the experiment are shown in FIG. 3.
In fig. 3, predicted versus true values for LSTM and RF are shown. Where the abscissa represents different parameter configurations and the ordinate represents IOPS values. LSTM and RF represent predicted values of the performance model established by LSTM and RF, respectively. From the overall trend, the predicted values obtained by adopting the LSTM and the RF can reflect the performance fluctuation caused by the parameter change in time, but the predicted curve and the true value curve obtained by the RF have obvious difference and have larger deviation with the true value at certain moments. In order to visually compare the precision difference of the two models, the invention uses Error to evaluate the precision of the models. The Error values for the RF and LSTM models were calculated to be 0.56% and 0.28%, respectively, by comparative experiments. It follows that the prediction accuracy of LSTM is better than the prior method RF.
Second, performance comparison analysis
In order to evaluate the effect of the method on the adjustment of the Ceph system performance, the method is compared with an LSTM + GA method and an RF and GA based automatic Ceph adjustment method.
Before the experiment, the initial parameters of EGA are set: mutation probability Pm, crossover probability Pc, population size M and maximum number of iterations T. Mutation is essentially a deep search of the parameter configuration value space; if the mutation probability Pm is too large, the genetic algorithm degenerates into a random search, and because of the excessive randomness EGA spends more time searching. The crossover probability Pc affects how quickly configuration schemes are exchanged, and a higher crossover probability makes the algorithm more efficient. A larger population size M and maximum number of iterations T enlarge the search scale and improve search accuracy, but they also increase the time overhead and reduce search efficiency. After a number of experimental tests, the EGA parameters were set as shown in Table 2.
Parameter                       Value
Maximum number of iterations T  100
Population size M               20
Crossover probability Pc        0.8
Mutation probability Pm         0.1
TABLE 2
An iteration trend graph of the LSTM + GA method, the LSTM + EGA method and the RF + GA method is shown in figure 4, wherein the abscissa represents the iteration times of the genetic algorithm, and the ordinate represents the read-write performance of the Ceph block storage system. In order to obtain more accurate experimental results, each method respectively takes the average value of 5 times of algorithm operation as the final experimental result.
It can be seen from fig. 4 that the read-write performance of the Ceph block storage system after parameter optimization is about 6750. The LSTM + GA method and the RF + GA method did not differ much in the first 20 generations, both methods reached a plateau around 60 generations, but RF + GA slightly lags behind LSTM + GA. The LSTM + EGA method can reach a steady state in about 40 generations, which shows that the convergence rate of the LSTM + EGA is faster than that of the LSTM + GA and the RF + GA. And the optimal value obtained by the LSTM + EGA method is superior to that obtained by the LSTM + GA method and the RF + GA method, which shows that the convergence precision of the LSTM + EGA is higher.
The obtained optimal parameter combination was applied to a Ceph system in a real environment, and the measured mean IOPS of the block storage system was 6612, which differs little from the predicted value and is within an acceptable range. The default parameter configuration reaches only 3971, so the tuned performance is about 1.7 times that of the default configuration.
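The two comparisons in this paragraph can be checked with simple arithmetic, taking the predicted optimum as the approximate value of 6750 read from Fig. 4 (an approximation, so the second figure is only indicative):

```latex
\frac{6612}{3971} \approx 1.67,
\qquad
\frac{6750 - 6612}{6750} \approx 2.0\%
```

The first ratio is the speed-up over the default configuration, and the second is the relative gap between the predicted and the measured IOPS.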
Although only the preferred embodiments of the present invention have been described in detail, the present invention is not limited to the above embodiments, and various changes can be made without departing from the spirit of the present invention within the knowledge of those skilled in the art, and all changes are encompassed in the scope of the present invention.

Claims (6)

1. A Ceph parameter tuning method based on LSTM and genetic algorithm is characterized in that: comprises the following steps:
S1, collecting a data set;
S2, proving the nonlinear relationship of the data set, thereby proving the complexity of Ceph tuning;
S3, constructing a performance prediction model by using LSTM;
S4, optimizing by using EGA to obtain a set of optimal parameters.
2. The Ceph parameter tuning method based on LSTM and genetic algorithm as claimed in claim 1, wherein: the method for collecting the data set in S1 includes:
S1.1, randomly assigning values to the 8 configuration parameters of Ceph within their adjustable ranges, where the i-th parameter $conf_i$ has the value range $[lb_i, ub_i]$ and $conf_i = random(lb_i, ub_i)$, $i = 1, 2, …, 8$;
S1.2, applying the parameter combination $config = \{conf_1, conf_2, conf_3, conf_4, conf_5, conf_6, conf_7, conf_8\}$ to the Ceph system, and testing the read-write performance of the corresponding Ceph block storage system;
S1.3, composing each parameter combination $config_i$ and its corresponding $iops_i$ into a data item $(config_i, iops_i)$, and taking all collected data items as the data set for constructing the Ceph performance prediction model.
3. The Ceph parameter tuning method based on LSTM and genetic algorithm as claimed in claim 2, wherein: the 8 parameters of Ceph are bluestore_cache_size_ssd, bluestore_cache_size_hdd, bluestore_cache_meta_ratio, bluestore_cache_kv_ratio, osd_max_write_size, osd_map_cache_size, rbd_cache_size and rbd_cache_max_dirty, respectively; bluestore_cache_size_ssd and bluestore_cache_size_hdd are of type integer, bluestore_cache_meta_ratio and bluestore_cache_kv_ratio are of type float, and osd_max_write_size, osd_map_cache_size, rbd_cache_size and rbd_cache_max_dirty are of type integer.
4. The Ceph parameter tuning method based on LSTM and genetic algorithm as claimed in claim 1, wherein: the method for proving the nonlinear relationship of the data set in S2 is as follows:
a multiple linear regression model, which predicts through a linear combination of the parameters, is established and fitted to the data set; the multiple linear regression model is:
$$f(config) = \omega_1 conf_1 + \omega_2 conf_2 + \dots + \omega_8 conf_8 + b$$
where $b$ is a constant and $\omega_1$ to $\omega_8$ are the coefficients. If the variables had a linear relationship, there would have to exist a set of coefficients and a constant such that the true values are constrained within the range of the predicted values obtained from the formula.
5. The Ceph parameter tuning method based on LSTM and genetic algorithm as claimed in claim 1, wherein: the method for constructing the performance prediction model by using the LSTM in the S3 comprises the following steps:
an error metric Error is defined to reflect the difference between the true values and the predicted values,
[Formula image FDA0003241729290000021: definition of Error]
where $Actual_i$ is the true value measured on the Ceph block storage system, $Forecast_i$ is the predicted value of the LSTM model, and $n$ is the number of samples.
6. The Ceph parameter tuning method based on LSTM and genetic algorithm as claimed in claim 1, wherein: the method for optimizing by using EGA in S4 is as follows: the population size is set to M and the maximum number of iterations to T; a parameter combination $config = \{conf_1, conf_2, conf_3, conf_4, conf_5, conf_6, conf_7, conf_8\}$ is taken as one individual in the population, each parameter represents a gene of the individual, and P(t) denotes the population of the t-th generation; the EGA algorithm finds the individual elitist with the maximum fitness in the population before the genetic operations and stores its information, and after the genetic operations replaces the individual with the minimum fitness in the new population with elitist, so that elitist is retained in the next-generation population.
Application CN202111021786.1A (priority date 2021-09-01, filing date 2021-09-01): Ceph parameter tuning method based on LSTM and genetic algorithm. Status: Pending. Publication: CN113704220A (en).

Publications (1)

Publication Number Publication Date
CN113704220A 2021-11-26

Family ID: 78658813

Country Status (1)

Country Link
CN (1) CN113704220A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230108213A1 (en) * 2021-10-05 2023-04-06 Softiron Limited Ceph Failure and Verification

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108764568A (en) * 2018-05-28 2018-11-06 哈尔滨工业大学 A kind of data prediction model tuning method and device based on LSTM networks
CN109243172A (en) * 2018-07-25 2019-01-18 华南理工大学 Traffic flow forecasting method based on genetic algorithm optimization LSTM neural network
CN109634924A (en) * 2018-11-02 2019-04-16 华南师范大学 File system parameter automated tuning method and system based on machine learning
CN110766237A (en) * 2019-10-31 2020-02-07 内蒙古工业大学 Bus passenger flow prediction method and system based on SPGAPSO-SVM algorithm
US20200125945A1 (en) * 2018-10-18 2020-04-23 Drvision Technologies Llc Automated hyper-parameterization for image-based deep model learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
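Wen Huiying et al.: "Application of the GA-LSTM model in highway traffic flow prediction", Journal of Harbin Institute of Technology *
Chen Yu et al.: "Automatic Ceph parameter tuning based on random forest and genetic algorithm", Journal of Computer Applications *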

Similar Documents

Publication Publication Date Title
Mahgoub et al. OPTIMUSCLOUD: Heterogeneous configuration optimization for distributed databases in the cloud
US9811781B2 (en) Time-series data prediction device of observation value, time-series data prediction method of observation value, and program
CN110968272B (en) Time sequence prediction-based method and system for optimizing storage performance of mass small files
CN109582758B (en) Optimization method for Elasticsearch index shards
CN108009260B (en) Copy placement method combining node load and distance under big data storage
CN110289994B (en) Cluster capacity adjusting method and device
CN108462605B (en) Data prediction method and device
CN112926635B (en) Target clustering method based on iterative self-adaptive neighbor propagation algorithm
WO2018129500A1 (en) Optimized navigable key-value store
Bausch et al. Making cost-based query optimization asymmetry-aware
CN109471847B (en) I/O congestion control method and control system
CN112398700B (en) Service degradation method and device, storage medium and computer equipment
CN108363643A (en) A kind of HDFS copy management methods based on file access temperature
CN111738477A (en) Deep feature combination-based power grid new energy consumption capability prediction method
CN112884236B (en) Short-term load prediction method and system based on VDM decomposition and LSTM improvement
CN114880806A (en) New energy automobile sales prediction model parameter optimization method based on particle swarm optimization
CN106776370A (en) Cloud storage method and device based on the assessment of object relevance
CN113704220A (en) Ceph parameter tuning method based on LSTM and genetic algorithm
CN116401954A (en) Prediction method, prediction device, equipment and medium for cycle life of lithium battery
CN115941696A (en) Heterogeneous Big Data Distributed Cluster Storage Optimization Method
JP7098204B2 (en) VOD service cache replacement method based on random forest algorithm in edge network environment
CN116822360A (en) Power system frequency track prediction method, device, medium and equipment
CN117129875A (en) Method, system, equipment and medium for training and predicting battery capacity prediction model
CN113282241B (en) Hard disk weight optimization method and device based on Ceph distributed storage
CN112990603B (en) Air conditioner cold load prediction method and system considering frequency domain decomposed data characteristics

Legal Events

Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20211126)