CN110554838A - Thermal data prediction method based on joint optimization echo state network


Publication number
CN110554838A
Authority
CN
China
Prior art keywords: particle, storage layer, value, adaptive, echo state
Prior art date
Legal status
Granted
Application number
CN201910566123.4A
Other languages
Chinese (zh)
Other versions
CN110554838B (en)
Inventor
罗旗舞 (Luo Qiwu)
王玥童 (Wang Yuetong)
阳春华 (Yang Chunhua)
桂卫华 (Gui Weihua)
周灿 (Zhou Can)
Current Assignee
Central South University
Original Assignee
Central South University
Priority date
Filing date
Publication date
Application filed by Central South University
Priority to CN201910566123.4A
Publication of CN110554838A
Priority to PCT/CN2020/097950 (WO2020259543A1)
Application granted
Publication of CN110554838B
Legal status: Active
Anticipated expiration


Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 — Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 — Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 — Interfaces specially adapted for storage systems
    • G06F 3/0602 — Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F 3/0614 — Improving the reliability of storage systems
    • G06F 3/0628 — Interfaces specially adapted for storage systems making use of a particular technique
    • G06F 3/0638 — Organizing or formatting or addressing of data
    • G06F 3/064 — Management of blocks
    • G06F 3/0668 — Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F 3/0671 — In-line storage system
    • G06F 3/0673 — Single storage device
    • G06F 3/0679 — Non-volatile semiconductor memory device, e.g. flash memory, one time programmable memory [OTP]


Abstract

The invention discloses a thermal data prediction method based on a jointly optimized echo state network. Quantum particle swarm optimization is innovatively used to search the storage layer parameters of the echo state network: during each particle-position update, an echo state network with an L2 + adaptive L1/2 regularization constraint computes the output weight and the global optimal adaptive value, and when the iteration ends the particle position corresponding to the global optimal adaptive value is taken as the optimal storage layer parameters. Based on these optimal parameters, the echo state network with the L2 + adaptive L1/2 regularization constraint then computes the final output weight. Thermal (hot) data is predicted from this final output weight and the logical block addresses of the input historical hot data; the data at the predicted logical block address is predicted to be hot data.

Description

Thermal data prediction method based on joint optimization echo state network
Technical Field
The invention belongs to the technical field of chaotic time sequence prediction, and particularly relates to a thermal data prediction method based on a joint optimization echo state network.
Background
As a non-volatile storage technology, NAND flash memory is widely used in communication systems and consumer electronics, and offers higher access speed and power efficiency than hard disk drives. In NAND-flash-based consumer electronics, a large number of applications rely on NAND flash for data exchange, file storage and video storage. NAND flash memory is mainly used for storing large-capacity data; the NAND structure provides extremely high cell density and achieves high storage density, high write speed and high erase speed. NAND flash memories are therefore widely used for large-capacity data storage, such as solid-state disks. Demand for NAND flash memory will continue to increase, mainly driven by cloud computing, the Internet of Things, data centers and similar fields.
However, NAND flash memory faces at least two challenges that limit its large-scale application: out-of-place updating and limited endurance. NAND flash memory cannot overwrite in place, i.e. a new write operation cannot be performed on a flash page before that page is erased. Improper updates generate many invalid and dead pages, which reduces efficiency and performance. Furthermore, NAND flash memory has a limited lifetime because a flash block can only withstand a limited number of erasures; if the number of erasures of a block exceeds its maximum, the block becomes unusable. Garbage Collection (GC) and Wear Leveling (WL), design ideas that allocate frequently written data (i.e. hot data) to blocks with few erasures and least recently used data (i.e. cold data) to blocks with many erasures, play an important role in addressing these two challenges, while the efficiency and performance of GC and WL depend largely on Hot Data Identification (HDI). The essence of HDI is to understand the access behavior of hot data well enough to intelligently assign different data to appropriate blocks, but conventional HDI has two problems. The first is large memory overhead: at present, most hot data identification mechanisms identify hot pages in the NAND flash memory, and their core is a page counter, which records the number of read and write operations on the logical pages corresponding to NAND flash pages within a certain time; if the read/write count is greater than a set threshold, the requested page is judged to be a hot page, otherwise it is a cold page.
The other serious problem is that identification accuracy is not high. Bloom-filter-based hot data identification mechanisms are widely applied to cold/hot data identification in SSDs, but the Bloom filter has an inherent false-positive defect: data not belonging to the set may be erroneously judged to be in the set. In addition, existing hot data identification schemes consider only a single feature, such as request size or access pattern, and therefore cannot fully and comprehensively account for the locality characteristics of the workload, so the accuracy of hot data identification is not high.
Disclosure of Invention
The invention aims to provide a hot data prediction method based on a jointly optimized echo state network, which innovatively replaces traditional hot data identification with hot data prediction and constructs a jointly optimized echo state network, so that the predicted hot data has better real-time performance and reliability.
The invention provides a hot data prediction method based on a jointly optimized echo state network, comprising the following steps:
s1: initializing parameters required by a quantum particle swarm algorithm and position information of each particle;
the position information of the particles comprises initial positions and position ranges of the particles, and the position of each particle is represented by storage layer parameters in an echo state network;
s2: iterative optimization is carried out by utilizing a quantum particle group algorithm to determine the optimal storage layer parameters;
Updating the positions of the particles by adopting a quantum particle swarm algorithm based on the position range of each particle; in each update, the echo state network with the L2 + adaptive L1/2 regularization constraint computes an output weight to obtain a global optimal adaptive value, and when the iteration is finished the particle position corresponding to the global optimal adaptive value is taken as the optimal storage layer parameters;
S3: optimal storage layer parameter adoption L in network based on echo state2+ adaptive L1/2the regularized and constrained echo state network calculates the final output weight;
S4: Predicting the hot data by using the final output weight and the logical block address of the input historical hot data, wherein the prediction formula is:
y = x * W_out
Wherein y represents the predicted logical block address, and the data at the predicted logical block address is hot data; x is the logical block address of the input historical hot data; W_out denotes the output weight. The logical block addresses of the historical hot data are used in the echo state network training of steps S2 and S3.
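As a sketch, the prediction step in S4 is a single matrix product once W_out has been trained; the helper name, shapes and values below are illustrative, not taken from the patent:

```python
import numpy as np

def predict_hot_lba(x, W_out):
    """Predict the next logical block address (LBA) expected to hold hot data.

    x     : 1-D feature vector built from the historical hot-data LBAs
            (after whatever state expansion was used during training).
    W_out : trained output weight matrix of the echo state network.
    Returns y = x * W_out, the predicted LBA (patent step S4).
    """
    return np.asarray(x) @ np.asarray(W_out)

# toy usage: 3 features -> 1 output
W_out = np.array([[0.5], [0.25], [0.25]])
x = np.array([4.0, 8.0, 8.0])
y = predict_hot_lba(x, W_out)   # 0.5*4 + 0.25*8 + 0.25*8 = 6.0
```

The data stored at the predicted address y would then be treated as hot data by the flash translation layer.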
The method adopts the solid state disk architecture but innovatively replaces the hot data identification module in the flash translation layer with hot data prediction, predicting hot data with a jointly optimized echo state network. The joint optimization comprises two parts: first, a quantum particle swarm algorithm iteratively determines the optimal storage layer parameters of the echo state network; second, the ESN with the L2 + adaptive L1/2 regularization constraint yields a highly sparse output weight. By combining quantum particle swarm iterative optimization with the L2 + adaptive L1/2 regularization constraint to obtain the optimal storage layer parameters, the jointly optimized echo state network used for prediction achieves better real-time performance and reliability. The invention trains the echo state network on the logical block addresses of historical hot data to obtain the final output weight, and then uses this final output weight to predict the logical block address where hot data will reside.
Further preferably, the iterative optimization for determining the optimal storage layer parameters in step S2 is performed as follows:
S21: the position of each particle is sequentially used as a storage layer parameter in an echo state network, and L is respectively adopted2+ adaptive L1/2The echo state network of regularization constraint calculates the corresponding output weight of each particle;
The current position of each particle is sequentially used as a storage layer parameter in the echo state network and an output weight is calculated;
S22: calculating an adaptive value of each particle by using the output weight corresponding to each particle;
S23: selecting an individual optimal adaptive value, an individual optimal parameter, a global optimal adaptive value and a global optimal parameter of each particle according to the adaptive value of each particle based on a minimum adaptive value principle;
Wherein, the position of the particle selected as the global optimum adaptive value is the global optimum parameter;
S24: updating the position of each particle in the position range of the particle, recalculating the adaptive value of each particle based on the updated position of each particle, and updating the individual optimal adaptive value, the individual optimal parameter, the global optimal adaptive value and the global optimal parameter of each particle based on the minimum adaptive value principle;
S25: Judging whether the iteration number has reached the maximum iteration number; if not, returning to step S24 for the next iteration; otherwise, taking the current global optimal parameters as the optimal storage layer parameters.
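The bookkeeping of steps S22-S24 under the minimum-adaptive-value principle can be sketched as follows; the fitness values here are arbitrary stand-ins and all names are illustrative:

```python
import numpy as np

def update_bests(positions, fitness, sbest, fsbest, gbest, fgbest):
    """Update per-particle and global bests under the minimum-fitness rule."""
    for j in range(len(positions)):
        if fitness[j] < fsbest[j]:          # better individual adaptive value
            fsbest[j] = fitness[j]
            sbest[j] = positions[j].copy()
        if fitness[j] < fgbest:             # better global adaptive value
            fgbest = fitness[j]
            gbest = positions[j].copy()
    return sbest, fsbest, gbest, fgbest

# toy run: 3 particles, 4-dimensional positions (the 4 storage layer parameters)
rng = np.random.default_rng(0)
pos = rng.uniform(0.0, 1.0, size=(3, 4))
fit = np.array([0.9, 0.2, 0.5])
sbest, fsbest, gbest, fgbest = update_bests(
    pos, fit, sbest=pos.copy(), fsbest=np.full(3, np.inf),
    gbest=pos[0].copy(), fgbest=np.inf)
```

After this pass the global best corresponds to the particle with the smallest adaptive value (here the second one).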
Further preferably, the position of any particle j is updated according to the following formula:

P_j(t+1) = p_j ± β · |mbest − P_j(t)| · ln(1/u_j)

wherein

p_j = φ · sbest_j + (1 − φ) · gbest,
mbest = (1/N) · Σ_{i=1}^{N} sbest_i,
β = ω_min + (ω_max − ω_min) · (iter_max − iter)/iter_max

In the formulas, P_j(t+1) and P_j(t) represent the positions of particle j after and before the update respectively (the sign is taken as + or − with equal probability); φ and u_j are both random numbers in (0,1); sbest_j and sbest_i are the individual optimal parameters of the j-th and i-th particles, gbest is the global optimal parameter, and mbest is the mean of the current individual optimal parameters of all particles; iter and iter_max are the current and maximum iteration numbers respectively; ω_max and ω_min are the inertia factors; and N is the total number of particles.
Further preferably, the adaptive value of any particle j is calculated as:

Fitness = ||Y − X·W_out||² + λ1·||W_out||² + λ2·Σ|w|^(1/2)

Where Fitness denotes the adaptive value of the current particle j, λ1 and λ2 are both regularization coefficients, W_out is the output weight corresponding to the current particle j, and the last sum runs over all elements w of W_out; Y represents the rear section of the logical block addresses of the historical hot data used for network training, X represents the storage layer state information updated from the front section of those addresses, and X·W_out is the prediction corresponding to the rear section of the logical block addresses of the historical hot data.
Further preferably, the process of computing the output weight by the echo state network with the L2 + adaptive L1/2 regularization constraint is as follows:
U401: Acquire the input layer-to-storage layer weight matrix and the storage layer internal connection weight matrix of the echo state network; the front section of the logical block addresses of the historical hot data is used as the input variable U, and the rear section as the actual result Y;
Wherein the input layer-to-storage layer weight matrix and the storage layer internal connection weight matrix are determined by the storage layer parameters of the echo state network;
U402: Update the state information X of the storage layer based on the input variable U, where X is composed of the state node information X(t):

X(t) = logsig(U(t)·W_in + X(t−1)·W_x)

Wherein U(t) represents the t-th datum of the input variable U; X(t) and X(t−1) represent the t-th and (t−1)-th state node information respectively; the maximum value T of t is determined by the data length of the input variable U; W_in and W_x denote the input layer-to-storage layer weight matrix and the storage layer internal connection weight matrix respectively; and logsig(·) denotes the activation function;
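The state update above can be sketched directly from the formula; logsig is the logistic sigmoid, and the weight shapes and random values below are illustrative assumptions:

```python
import numpy as np

def logsig(z):
    """Logistic sigmoid activation, the logsig(.) of the formula."""
    return 1.0 / (1.0 + np.exp(-z))

def reservoir_states(U, W_in, W_x):
    """Run X(t) = logsig(U(t)·W_in + X(t-1)·W_x) over an input sequence U.

    U    : (T, d) input sequence (front section of historical hot-data LBAs)
    W_in : (d, n) input-to-storage-layer weights
    W_x  : (n, n) storage layer internal connection weights
    Returns the (T, n) matrix of storage layer states X.
    """
    T, n = U.shape[0], W_x.shape[0]
    X = np.zeros((T, n))
    x_prev = np.zeros(n)            # X(0) initialized to zero
    for t in range(T):
        x_prev = logsig(U[t] @ W_in + x_prev @ W_x)
        X[t] = x_prev
    return X

rng = np.random.default_rng(1)
U = rng.uniform(-1, 1, size=(10, 1))              # toy 1-D input, 10 steps
W_in = rng.uniform(-0.5, 0.5, size=(1, 8))
W_x = rng.uniform(-0.5, 0.5, size=(8, 8)) * 0.2   # scaled-down internal weights
X = reservoir_states(U, W_in, W_x)
```

In practice W_in and W_x would be generated from the four storage layer parameters (spectral radius, scale, input scaling, sparsity) being optimized by the particle swarm.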
U402: based on L2+ adaptive L1/2loss function under regularization constraintobtaining an output weight value under the minimum value of the loss function;
in the formula, E represents a loss function, λ1、λ2are all regularization coefficients.
Further preferably, the acquiring process of step U402 is: simplify the loss function, and compute the output weight by a coordinate descent algorithm;
Wherein the simplified loss function is expressed as

E = ||Y' − X'·W'_out||² + λ2·Σ|w'|^(1/2)

with the augmented quantities

X' = [X; √λ1·I],  Y' = [Y; 0]

wherein I is an identity matrix;
The matrix W'_out is solved element by element: for the element W'_out(m,k) in the m-th row and k-th column, a partial residual Y'_k(t) − Σ_{j≠m} X'_j(t)·W'_out(j,k) is formed, a one-dimensional least-squares step is taken over it, and the L1/2 thresholding operator is applied; here Y'_k(t) denotes the t-th element of the k-th column of Y', X'_j(t) denotes the t-th element of the j-th column of X', and W'_out(j,k) denotes the element in the j-th row and k-th column of W'_out, taken as zero when j > m and not yet updated in the current sweep; L is the number of output layer nodes, and n is the number of storage layer nodes.
Finally the output weight W_out is recovered from the relationship between W'_out and W_out.
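The element-wise solve relies on an L1/2 thresholding step. As a hedged sketch (the half-thresholding operator and its constants follow the common L1/2 thresholding literature and are an assumption, not the patent's exact formula), cyclic coordinate descent for ||Y − X·w||² + λ·Σ|w_i|^(1/2) can look like:

```python
import numpy as np

def half_threshold(z, lam):
    """Half-thresholding operator for the L1/2 penalty (assumed form):
    returns 0 below a threshold, otherwise a mildly shrunken z."""
    thr = (54 ** (1.0 / 3.0) / 4.0) * lam ** (2.0 / 3.0)
    if abs(z) <= thr:
        return 0.0
    phi = np.arccos((lam / 8.0) * (abs(z) / 3.0) ** (-1.5))
    return (2.0 / 3.0) * z * (1.0 + np.cos(2.0 * np.pi / 3.0 - 2.0 * phi / 3.0))

def coordinate_descent_l12(X, Y, lam, sweeps=50):
    """Minimise ||Y - X w||^2 + lam * sum_i |w_i|^(1/2) coordinate by coordinate."""
    T, n = X.shape
    w = np.zeros(n)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(sweeps):
        for m in range(n):
            r = Y - X @ w + X[:, m] * w[m]      # residual excluding coordinate m
            z = (X[:, m] @ r) / col_sq[m]       # unpenalised 1-D least-squares step
            w[m] = half_threshold(z, lam / col_sq[m])
    return w

# toy problem: only the first feature matters, so a sparse solution is expected
rng = np.random.default_rng(2)
X = rng.normal(size=(60, 4))
Y = 2.0 * X[:, 0] + 0.05 * rng.normal(size=60)
w = coordinate_descent_l12(X, Y, lam=1.0)
```

The qualitative behaviour matches the text: small coordinates are driven exactly to zero (high sparsity) while large ones are only slightly shrunk.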
Further preferably, the method further includes performing adaptive optimization on the output weight obtained in step U402, where the optimization process is as follows:
Convert the loss function, compute the weight W''_out by the coordinate descent algorithm, and then calculate the optimized output weight from W''_out;
The converted loss function introduces adaptive coefficients, derived from the output weight obtained in step U402, into the L1/2 term, so that each element of the weight matrix is penalized individually:

E = ||Y − X·W''_out||² + λ1·||W''_out||² + λ2·Σ v_jk·|W''_out(j,k)|^(1/2)

where the adaptive coefficients v_jk are computed from the output weight obtained in step U402; the relationship between W''_out and the output weight W_out then yields the final W_out; K is the number of output layer nodes.
Further preferably, the storage layer parameters in the echo state network comprise four key parameters: the internal connection spectral radius, the storage layer scale, the input layer scaling coefficient, and the storage layer sparsity.
Further preferably, the parameters required for initializing the quantum particle swarm algorithm in step S1 include the particle swarm size N, the maximum iteration number iter_max, and the inertia factors ω_max and ω_min.
Further preferably, when the particle position is updated, if the particle movement distance exceeds the position range corresponding to the particle, the particle position parameter is set to the corresponding boundary value of the position range.
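This boundary rule is a simple clamp to the position range; a minimal sketch (the ranges for the four storage layer parameters are illustrative assumptions):

```python
import numpy as np

# position ranges for [spectral radius, storage layer scale,
# input scaling coefficient, storage layer sparsity] -- illustrative values
lower = np.array([0.1, 50.0, 0.1, 0.01])
upper = np.array([0.99, 500.0, 1.0, 0.2])

def clamp_position(p):
    """If a particle moves outside its range, set it to the violated boundary."""
    return np.clip(p, lower, upper)

p = np.array([1.3, 25.0, 0.5, 0.05])    # first two components out of range
clamped = clamp_position(p)             # first two snap to their boundaries
```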
Advantageous effects
1. The invention innovatively replaces traditional hot data identification with hot data prediction. The disclosed hot data prediction technique can predict the nature of the next data one or even several beats in advance from historical access behavior, and actively allocates and stores the next data to the corresponding block (hot/cold data block) of the solid state disk (SSD). Meanwhile, the neural network method retains more feature information from the input and classifies the hot data more comprehensively.
2. The invention performs joint optimization on the echo state network. L2 regularization achieves good generalization ability through a trade-off between model bias and prediction variance and yields continuously shrunken weights, but it does not produce sparse solutions; adaptive L1/2 regularization can produce very sparse solutions, but when the predictor variables are highly correlated, L1/2 alone cannot regulate well. The invention therefore trains the least-squares solution with the combined L2 + adaptive L1/2 regularization, obtaining the advantages of both regularizers and further improving the prediction accuracy of hot data. In addition, the storage layer parameters of the echo state network are optimized with the QPSO algorithm, which solves the problem that the storage layer parameters cannot be determined when building the model. Compared with the traditional PSO algorithm, QPSO removes the velocity information of the particles based on wave-particle duality and retains only the position information, which effectively reduces the computational complexity while obtaining storage layer parameters adapted to the model, further improving the prediction accuracy. Furthermore, the invention combines the L2 + adaptive L1/2 regularization with the QPSO algorithm to obtain the optimal storage layer parameters, improving the prediction accuracy.
drawings
FIG. 1 is a typical architecture of a NAND flash memory system;
FIG. 2 is a flow chart of a method for hot data prediction based on joint optimization echo state network according to an embodiment of the present invention;
FIG. 3 is a flow chart of the iterative optimization algorithm of the quantum-behaved particle swarm optimization of the present invention;
FIG. 4 is a flow chart of the algorithm for computing the output weight by the echo state network with the L2 + adaptive L1/2 regularization constraint in the present invention.
FIG. 5 is a graph comparing the performance of four actual workloads according to one embodiment of the present invention.
Detailed Description
The present invention will be further described with reference to the following examples.
The invention provides a hot data prediction method based on a jointly optimized echo state network, mainly applied to NAND flash memory systems. FIG. 1 shows a typical architecture of a NAND flash memory system, comprising module B101 (user operations), module B102 (file system) and module B103 (solid state disk). The actual operations of the user affect the solid state disk through the file system. The solid state disk further comprises a flash translation layer, a flash controller and a NAND flash array, and the flash translation layer comprises an address allocation unit, a garbage collection unit, a wear leveling unit and a hot data prediction unit. The invention innovatively replaces the traditional hot data identification unit with the hot data prediction unit. Traditional hot data identification passively analyzes user access behavior and allocates the corresponding data to the corresponding blocks (hot/cold data blocks) of the solid state disk (SSD) through the flash translation layer (FTL); it suffers from higher rates of missed detection or false alarms for hot data when responding to requests with complex access behavior. The hot data prediction technique disclosed by the invention can predict the nature of the next data one or even several beats in advance from historical access behavior, actively allocates and stores the data to the corresponding block (hot/cold data block) of the solid state disk (SSD), and is compatible with secondary verification by a traditional hot data identification scheme. Accordingly, the hot data prediction method proposed by the invention is essentially "predictive hot data identification". The finally obtained predicted logical block address information is used for garbage collection and wear leveling.
In this process, wear leveling and garbage collection have a large influence on the solid state disk. Traditional hot data identification aims to distinguish accurately and efficiently which data is hot. The invention provides a hot data prediction method based on a jointly optimized echo state network, which replaces hot data identification with hot data prediction and achieves high-precision prediction; it specifically comprises the following steps:
S1: initializing parameters required by the quantum particle swarm algorithm and position information of each particle.
The position information of the particles comprises the initial position and position range of each particle. The position of each particle is represented by the storage layer parameters of an Echo State Network (ESN), which include the internal connection spectral radius, the storage layer scale, the input layer scaling coefficient and the storage layer sparsity; in this example the dimension of each particle is therefore initialized to 4, i.e. each particle is a 1 × 4 matrix representing the 4 ESN storage layer parameters. The ranges of the 4 parameters are determined and defined as the position range of all particles; at initialization each particle is randomly assigned a value within the position range. During the subsequent update process the particles are considered to move continuously toward the optimum within the specified range, and if a particle moves outside the specified range its position information is updated to the boundary value. Each particle position represents a specific set of values of the ESN storage layer parameters.
The parameters required by the quantum particle swarm algorithm include the particle swarm size N, the maximum iteration number iter_max, and the inertia factors ω_max and ω_min (used in the subsequent update of particle position information).
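The S1 initialization above can be sketched as N particles, each a 1 × 4 vector drawn uniformly inside the parameter ranges; the concrete ranges and swarm settings below are illustrative assumptions, not values from the patent:

```python
import numpy as np

rng = np.random.default_rng(42)

# [internal connection spectral radius, storage layer scale,
#  input layer scaling coefficient, storage layer sparsity] -- illustrative ranges
lower = np.array([0.1, 50.0, 0.1, 0.01])
upper = np.array([0.99, 500.0, 1.0, 0.2])

N = 20                              # particle swarm size
iter_max = 100                      # maximum iteration number
omega_max, omega_min = 1.0, 0.5     # inertia factors

# each row is one particle's position, i.e. one candidate ESN parameter set
positions = lower + (upper - lower) * rng.random((N, 4))
```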
S2: iterative optimization is carried out by utilizing a quantum particle group algorithm to determine the optimal storage layer parameters;
Updating the positions of the particles by adopting a quantum particle swarm algorithm based on the position range of each particle; in each update, the echo state network with the L2 + adaptive L1/2 regularization constraint computes the output weight to obtain the global optimal adaptive value, and when the iteration is finished the particle position corresponding to the global optimal adaptive value is taken as the optimal storage layer parameters. The specific process comprises the following steps:
S21: taking the position of each particle as a storage layer parameter in an echo state network, and respectively adopting L2+ adaptive L1/2The echo state network of regularization constraint calculates the corresponding output weight of each particle;
The current position of each particle is sequentially used as a storage layer parameter in the echo state network and an output weight is calculated;
S22: calculating an adaptive value of each particle by using the output weight corresponding to each particle;
S23: selecting an individual optimal adaptive value, an individual optimal parameter, a global optimal adaptive value and a global optimal parameter of each particle according to the adaptive value of each particle based on a minimum adaptive value principle;
wherein, the position of the particle selected as the global optimum adaptive value is the global optimum parameter;
S24: updating the position of each particle in the position range of the particle, recalculating the adaptive value of each particle based on the updated position of each particle, and updating the individual optimal adaptive value, the individual optimal parameter, the global optimal adaptive value and the global optimal parameter of each particle based on the minimum adaptive value principle;
S25: judging whether the iteration times reach the maximum iteration times, if not, returning to the step S24 for the next iteration calculation; otherwise, the current global optimum parameter is taken as the optimum storage layer parameter.
Based on the above logic, an embodiment of the present invention provides an example flowchart as shown in fig. 3, which includes the following steps:
U301: and (4) iteration initialization, setting the current iteration number iter to be 1, and setting the particle label j to be 1.
u302: setting the position of the jth particle as ESN storage layer parameter, and utilizing L2+ adaptive L1/2least square calculation in regularization constraint training to obtain output weight W with higher sparsityout. By L2+ adaptive L1/2RegularizationESN calculation output weight W of chemical constraintoutThe detailed steps are shown in fig. 4 and described in detail below.
u303: output weight W corresponding to jth particleoutcalculating an adaptive value corresponding to the jth particle according to the following calculation formula:
In the formula, λ1、λ2Are all regularization coefficients, Woutthe output weight value corresponding to the current particle j is obtained; y represents the rear section of the logic block address where the historical thermal data used for network training is located, X represents the state information of the storage layer updated on the basis of the front section of the logic block address where the historical thermal data used for network training is located, and X WoutAnd the prediction result corresponding to the rear section of the logical block address of the historical thermal data is shown.
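A sketch of the adaptive-value computation in U303 (shapes and values illustrative; the formula follows the reconstruction above):

```python
import numpy as np

def fitness(Y, X, W_out, lam1, lam2):
    """Adaptive value of a particle: squared prediction error plus the
    L2 and L1/2 regularisation terms on the output weight."""
    err = np.sum((Y - X @ W_out) ** 2)          # ||Y - X*W_out||^2
    l2 = lam1 * np.sum(W_out ** 2)              # lam1 * ||W_out||^2
    l12 = lam2 * np.sum(np.abs(W_out) ** 0.5)   # lam2 * sum |w|^(1/2)
    return err + l2 + l12

X = np.array([[1.0, 0.0], [0.0, 1.0]])
W_out = np.array([[4.0], [1.0]])
Y = X @ W_out                   # perfect fit, so only the penalty terms remain
f = fitness(Y, X, W_out, lam1=0.1, lam2=0.1)
# penalties: 0.1*(16+1) + 0.1*(2+1) = 1.7 + 0.3 = 2.0
```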
D301: and judging whether all the particles finish the calculation of the adaptive values, if not, adding 1 to j, returning to the step U302 to calculate the adaptive value of the next particle, and if all the particles finish the calculation of the adaptive values, performing the step U304.
U304: and selecting the individual optimal adaptive value, the individual optimal parameter, the global optimal adaptive value and the global optimal parameter of each particle according to the adaptive value of each particle based on the minimum adaptive value principle. After all the particles calculate the adaptive values, comparing and judging, recording the adaptive value of each particle as an individual optimal adaptive value fsbest, and recording the position of each particle as an individual optimal parameter sbest; and recording the minimum particle adaptive value in all the particle adaptive values as a global optimal adaptive value fgbest, and the corresponding position of the minimum particle adaptive value is a global optimal parameter gbest. These parameters obtained will be used for subsequent iteration optimization.
U305: the iteration starts and the particle index j is reset to 1.
U306: Calculate the mbest corresponding to the j-th particle according to:

mbest = (1/N) · Σ_{i=1}^{N} sbest_i

In the formula, sbest_i denotes the individual optimal parameters of the i-th particle, and mbest is the mean of the current individual optimal parameters of all particles, i.e. the average over all particles is taken in each dimension and used for updating the particle position information.
U307: Update the position information of the j-th particle according to:

P_j(t+1) = p_j ± β · |mbest − P_j(t)| · ln(1/u_j),  p_j = φ · sbest_j + (1 − φ) · gbest

Wherein P_j(t+1) and P_j(t) represent the positions of particle j after and before the update respectively, and φ and u_j are random numbers in (0,1); the calculation formula of β is:

β = ω_min + (ω_max − ω_min) · (iter_max − iter)/iter_max

As can be seen from this formula, in the early stage of the iteration the parameter β, which characterizes the step length of the particle movement, is larger, so the particles can move toward the optimal position more quickly; in the later stage β is smaller, meaning the step length shrinks near the optimal position and each movement approaches the optimum more precisely.
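Steps U306-U307 can be sketched together as one vectorized update; the local attractor p_j = φ·sbest_j + (1 − φ)·gbest and the decreasing β schedule are standard QPSO choices consistent with the text, and all names are illustrative:

```python
import numpy as np

def qpso_update(P, sbest, gbest, it, iter_max, omega_max=1.0, omega_min=0.5,
                rng=np.random.default_rng()):
    """One QPSO position update for the whole swarm (no velocity term).

    P     : (N, d) current particle positions
    sbest : (N, d) individual optimal parameters
    gbest : (d,)   global optimal parameters
    """
    N, d = P.shape
    beta = omega_min + (omega_max - omega_min) * (iter_max - it) / iter_max
    mbest = sbest.mean(axis=0)                       # mean of individual bests
    phi = rng.random((N, d))
    u = rng.random((N, d))
    attractor = phi * sbest + (1.0 - phi) * gbest    # local attractor per particle
    sign = np.where(rng.random((N, d)) < 0.5, 1.0, -1.0)   # random +/- branch
    return attractor + sign * beta * np.abs(mbest - P) * np.log(1.0 / u)

rng = np.random.default_rng(7)
P = rng.random((5, 4))
sbest = P.copy()
gbest = P[0]
P_new = qpso_update(P, sbest, gbest, it=1, iter_max=100, rng=rng)
```

In the full algorithm, P_new would then be clamped to the position ranges and its adaptive values recomputed.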
After the position information is updated, the newly obtained storage layer parameters are again used in the echo state network with the L2 + adaptive L1/2 regularization constraint to recalculate the adaptive value, so D302, D303, U308 and U309 are: update the individual optimum and the global optimum according to the newly calculated adaptive value. If the newly calculated adaptive value is smaller than the particle's individual optimal adaptive value, update the individual optimal adaptive value to the newly calculated value and the individual optimal parameter to the current particle's parameters; if the newly calculated adaptive value is also smaller than the global optimal adaptive value, update the global optimal adaptive value to this value and the global optimal parameter to this particle's parameters.
D304: judge whether all particles have been updated. If not, set j = j + 1, return to U306, recalculate mbest with the updated particle parameters and update the position information of the next particle; if all particles have been updated, proceed to D305.
D305: judge whether the number of iterations has reached the maximum number of iterations. If not, add 1 to iter and return to U305 for the next iteration; if the maximum number of iterations has been reached, output the final global optimal parameter, which is subsequently used to train the jointly optimized echo state network for predicting logical block addresses.
S3: based on the optimal storage layer parameters in the echo state network, calculate the final output weight with the L2 + adaptive L1/2 regularization-constrained echo state network. The process of calculating the final output weight is shown in fig. 4 and is described in detail below.
S4: predict the hot data using the final output weight and the logical block address of the input historical hot data, with the prediction formula:
y = x·Wout
where y denotes the obtained predicted logical block address (the data at the predicted logical block address is hot data), x is the logical block address of the input historical hot data, and Wout denotes the output weight. Here y is the predicted access address; note that x and Wout may both be multidimensional variables, while the obtained y is one-dimensional. The data at the obtained logical block address is classified as hot data for garbage collection and wear leveling.
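The prediction step S4 is a single matrix product. A toy sketch with illustrative numbers (the reservoir state row x and the trained weights Wout are made up for demonstration):

```python
import numpy as np

# x: 1 x n state row built from the input logical block addresses;
# Wout: n x 1 trained output weights; y: one-dimensional predicted address.
x = np.array([[0.12, 0.40, 0.33]])
Wout = np.array([[0.5], [1.0], [2.0]])
y = x @ Wout  # 0.12*0.5 + 0.40*1.0 + 0.33*2.0 = 1.12
print(round(float(y), 6))
```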
When the output weight is calculated, a group of storage layer parameters, namely the internal connection spectral radius, the storage layer scale, the input layer scaling coefficient and the storage layer sparsity, has already been determined. As shown in fig. 4, the process of calculating the output weight with the L2 + adaptive L1/2 regularization-constrained echo state network used in the present invention is as follows:
U401: acquire the input layer-storage layer weight matrix and the storage layer internal connection weight matrix in the echo state network; the front section of the logical block addresses of the historical hot data is used as the input variable U, and the rear section as the actual result Y.
Specifically, an Echo State Network (ESN) is a low-complexity, fast-converging computation scheme suitable for temporal data classification and prediction tasks. The ESN architecture includes three layers: an input layer, a storage layer and an output layer, where the input layer-storage layer weight is Win, the storage layer internal connection weight is Wx, and the storage layer-output layer weight is Wout. The numbers of nodes in the input layer, storage layer and output layer are initialized as K, n and L, and the number of storage layer nodes n is determined by the storage layer scale among the storage layer parameters. The input layer-storage layer weight Win ∈ R^(n×K) is initialized by random assignment. The storage layer internal connection weight Wx ∈ R^(n×n) is initialized by taking n × n × (storage layer sparsity) non-zero entries, whose positions and values in Wx are randomly assigned, while all other entries are zero; the larger the storage layer sparsity, the stronger the nonlinear approximation capability. Then the internal connection spectral radius is used to determine the maximum eigenvalue of Wx; network stability is guaranteed only if the internal connection spectral radius is less than 1. Thus the input layer-storage layer weight Win ∈ R^(n×K) and the storage layer internal connection weight Wx ∈ R^(n×n) are determined from the storage layer parameters.
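The initialization described above can be sketched as follows. The function name and default values are illustrative; the patent does not prescribe the random distributions, so uniform sampling is an assumption:

```python
import numpy as np

def init_reservoir(K, n, sparsity=0.1, spectral_radius=0.9, seed=0):
    """Initialize Win (n x K) and Wx (n x n): Win random; Wx with about
    n*n*sparsity non-zero entries at random positions, rescaled so that
    its largest-magnitude eigenvalue equals spectral_radius (< 1 for
    stability, as the patent requires)."""
    rng = np.random.default_rng(seed)
    Win = rng.uniform(-1, 1, size=(n, K))          # random assignment
    Wx = rng.uniform(-1, 1, size=(n, n))
    Wx[rng.random((n, n)) > sparsity] = 0.0        # enforce sparsity
    rho = max(abs(np.linalg.eigvals(Wx)))          # current spectral radius
    if rho > 0:
        Wx *= spectral_radius / rho                # rescale to target radius
    return Win, Wx

Win, Wx = init_reservoir(K=3, n=50)
print(round(float(max(abs(np.linalg.eigvals(Wx)))), 6))  # 0.9
```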
In this embodiment, the L1/2 and L2 coefficients are also initialized as λ1 = 5×10^-7 and λ2 = 1×10^-5 for the regularization calculation. The input variable U is constructed from the front 2/3 of the logical block addresses of the input historical hot data, and the actual result Y from the rear 1/3. In this embodiment of the invention, the selected logical block addresses are those of the historical hot data recorded by the user; in other feasible embodiments the selected lengths may differ and are not specifically limited here. The general idea is to predict the rear-section addresses from the front-section addresses and compare the prediction with the actual addresses to adjust the network; this part is an original characteristic of the echo state network and is not described in detail here.
U402: updating state information X of the storage layer based on the input variable U, wherein the state information X is composed of state node information X (t);
X(t)=logsig(U(t)Win+X(t-1)Wx)
where U(t) denotes the t-th datum in the input variable U, X(t) and X(t-1) denote the t-th and (t-1)-th state node information respectively, and the number of nodes is determined by the data length of the input variable U. Win and Wx denote the input layer-storage layer weight matrix and the storage layer internal connection weight matrix of the echo state network, respectively, and logsig(·) is the activation function, which enables the neural network to approximate arbitrary nonlinear functions so that it can be applied to nonlinear models. When the activation function is used, the input is multiplied directly by the input layer scaling coefficient to map it into the activation function's working range. Since the calculation proceeds by inputting data in sequence, t can be understood as a time step.
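The recurrent state update above can be sketched as follows (function names are illustrative; the matrix orientation, with Win of shape n × K applied to each input row, is an assumption consistent with the dimensions given earlier):

```python
import numpy as np

def logsig(z):
    """logsig activation: 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def run_reservoir(U, Win, Wx, input_scale=1.0):
    """Collect storage-layer states X(t) = logsig(scale*U(t)*Win + X(t-1)*Wx).

    U: (T, K) input sequence; Win: (n, K); Wx: (n, n). Returns X of
    shape (T, n). The input layer scaling coefficient multiplies the
    input before the activation, as described for U402."""
    T, n = U.shape[0], Win.shape[0]
    X = np.zeros((T, n))
    x = np.zeros(n)                       # X(0): zero initial state
    for t in range(T):
        x = logsig(Win @ (input_scale * U[t]) + Wx @ x)
        X[t] = x
    return X
```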
U402: based on the loss function under the L2 + adaptive L1/2 regularization constraint, obtain the output weight at the minimum of the loss function:
E = ||X·Wout − Y||² + λ2·||Wout||₂² + λ1·Σ|w_i|^(1/2)
where E denotes the loss function and λ1, λ2 are the regularization coefficients.
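A numerical sketch of such a loss. The patent's exact formula is an omitted image, so the specific L2 (ridge) and L1/2 penalty forms below are the standard ones, taken as assumptions; λ defaults match this embodiment's values:

```python
import numpy as np

def loss(Wout, X, Y, lam1=5e-7, lam2=1e-5):
    """L2 + L1/2 regularized squared-error loss for the ESN readout.

    X: (T, n) collected states; Wout: (n, L) output weights; Y: (T, L)
    targets. lam1 weights the L1/2 term, lam2 the L2 term."""
    resid = X @ Wout - Y
    return (np.sum(resid ** 2)
            + lam2 * np.sum(Wout ** 2)              # L2 (ridge) penalty
            + lam1 * np.sum(np.abs(Wout) ** 0.5))   # L1/2 sparsity penalty
```

Minimizing this loss is what the coordinate descent procedure in the following steps carries out; the L1/2 term drives small readout weights toward zero while the L2 term keeps the problem well conditioned.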
To carry out the calculation, the loss function E is first simplified, and a coordinate descent algorithm is then adopted to compute the output weight;
wherein the simplified loss function is represented as:
where I is an identity matrix.
The matrix W'out is solved element by element; the value of the element in the m-th row and k-th column of W'out is given by:
where Y'k(t) denotes the t-th element of the k-th column of Y', X'j(t) denotes the t-th element of the j-th column of X', and (W'out)jk denotes the element in the j-th row and k-th column of matrix W'out; the term with j = m is excluded from the sum (taken as zero).
Finally, the output weight Wout is calculated from the relationship between the matrix W'out and the output weight Wout. In this embodiment, the method further comprises adaptively optimizing the output weight obtained in step U402; the optimization is step U404:
U404: convert the loss function and calculate the weight W''out with the coordinate descent algorithm; then use the weight W''out to calculate the optimized output weight;
The loss function after conversion is:
The relationship between the weight W''out and the output weight Wout is:
where n is the number of storage layer nodes and K is the number of input layer nodes.
To verify the reliability of the method, which creatively replaces hot data identification with hot data prediction and improves the accuracy of hot data discrimination, we used four actual workloads for objective evaluation. Financial1 is a write-intensive trace file, MSR is a common workload of large enterprise servers, Distilled represents typical usage patterns of personal computers, and MillSSD was collected from an industrial automated optical inspection instrument with a Runcore RCS-V-T25 SSD (512 GB, SATA2), an Intel X27400 and 2 GB DDR3. MillSSD is also a write-intensive trace file because it performs substantial image backups. The performance comparison results of this embodiment are shown in fig. 5. From the test results it can be seen that, taking WDAC as the baseline, the HOESN hot-ratio curve almost overlaps that of WDAC in most cases; this trend is clear on all four workloads, especially the more write-intensive MSR and MillSSD. Clearly, on the four workloads our HOESN has the lowest FIR, followed by DL-MBF_s. Although MBF shows a relatively high FIR, it is still a good HDI scheme for SSDs, and the subsequently proposed WDAC has become a classical benchmark for later studies. Notably, among the four workloads, the improvement of HOESN is most impressive on MillSSD (from 4.08% to 2.23%). These preliminary tests also support our original idea that learning the access behaviour of hot data on NAND flash can be treated as a time-series prediction problem, for which HOESN was proposed. The results show that our prediction method captures the access behaviour of the disk workload well, which is the basic premise for providing reliable service for GC and WL.
It should be emphasized that the examples described herein are illustrative and not restrictive, and thus the invention is not limited to the examples described in the specific embodiments, but rather, other embodiments may be devised by those skilled in the art without departing from the spirit and scope of the present invention, and it is intended to cover all modifications, alterations, and equivalents included within the scope of the present invention.

Claims (10)

1. A thermal data prediction method based on a jointly optimized echo state network, characterized by comprising the following steps:
S1: initializing parameters required by a quantum particle swarm algorithm and position information of each particle;
the position information of the particles comprises initial positions and position ranges of the particles, and the position of each particle is represented by storage layer parameters in an echo state network;
S2: iterative optimization is carried out by utilizing a quantum particle group algorithm to determine the optimal storage layer parameters;
updating the positions of the particles by a quantum particle swarm algorithm based on the position range of each particle, wherein in each update the echo state network with the L2 + adaptive L1/2 regularization constraint is used to calculate the output weight and the global optimal adaptive value; when the iteration ends, the particle position corresponding to the global optimal adaptive value is taken as the optimal storage layer parameter;
S3: based on the optimal storage layer parameters in the echo state network, calculating the final output weight with the L2 + adaptive L1/2 regularization-constrained echo state network;
S4: and predicting the hot data by using the final output weight and the address of the logic block where the input historical hot data is located, wherein the prediction formula is as follows:
y=x*Wout
wherein y represents the obtained predicted logical block address, the data at the predicted logical block address being thermal data; x is the logical block address of the input historical thermal data; Wout denotes the output weight; the logical block addresses of the historical thermal data are used in the echo state network training process in steps S2 and S3.
2. The method of claim 1, wherein: the iterative optimization for determining the optimal storage layer parameters in step S2 is performed as follows:
S21: taking the position of each particle in turn as the storage layer parameters of the echo state network, and calculating the output weight corresponding to each particle with the L2 + adaptive L1/2 regularization-constrained echo state network;
the current position of each particle is sequentially used as a storage layer parameter in the echo state network and an output weight is calculated;
S22: calculating an adaptive value of each particle by using the output weight corresponding to each particle;
S23: selecting an individual optimal adaptive value, an individual optimal parameter, a global optimal adaptive value and a global optimal parameter of each particle according to the adaptive value of each particle based on a minimum adaptive value principle;
wherein, the position of the particle selected as the global optimum adaptive value is the global optimum parameter;
S24: updating the position of each particle in the position range of the particle, recalculating the adaptive value of each particle based on the updated position of each particle, and updating the individual optimal adaptive value, the individual optimal parameter, the global optimal adaptive value and the global optimal parameter of each particle based on the minimum adaptive value principle;
S25: judging whether the iteration times reach the maximum iteration times, if not, returning to the step S24 for the next iteration calculation; otherwise, the current global optimum parameter is taken as the optimum storage layer parameter.
3. The method of claim 2, wherein the position of any particle j is updated according to the following formula:
P_j(t+1) = p_j ± β·|mbest − P_j(t)|·ln(1/u_j), with p_j = φ_j·sbest_j + (1 − φ_j)·gbest and β = ω_min + (ω_max − ω_min)·(iter_max − iter)/iter_max
where P_j(t+1) and P_j(t) denote the positions of particle j after and before the update respectively, φ_j and u_j are random numbers, sbest_j and sbest_i denote the individual optimal parameters of the j-th and i-th particles, mbest is the mean of the current individual optimal parameters of all particles, iter and iter_max are the current and maximum iteration numbers respectively, ω_max and ω_min are inertia factors, and N is the total number of particles.
4. The method of claim 2, wherein: the calculation formula of the adaptive value of any particle j is as follows:
where Fitness denotes the adaptive value of the current particle j, λ1 and λ2 are the regularization coefficients, and Wout is the output weight corresponding to the current particle j; Y represents the rear section of the logical block addresses of the historical thermal data used for network training, X represents the state information of the storage layer updated on the basis of the front section of those logical block addresses, and X·Wout represents the prediction corresponding to the rear section of the logical block addresses of the historical thermal data.
5. The method of claim 1, wherein the process of calculating the output weight with the L2 + adaptive L1/2 regularization-constrained echo state network is as follows:
U401: acquiring the input layer-storage layer weight matrix and the storage layer internal connection weight matrix in the echo state network, with the front section of the logical block addresses of the historical thermal data as the input variable U and the rear section as the actual result Y;
Wherein, the input layer-storage layer weight matrix and the storage layer internal connection weight matrix are related to the storage layer parameters in the echo state network;
u402: updating state information X of the storage layer based on the input variable U, wherein the state information X is composed of state node information X (t);
X(t)=logsig(U(t)Win+X(t-1)Wx)
wherein U(t) represents the t-th datum in the input variable U, X(t) and X(t-1) represent the t-th and (t-1)-th state node information respectively, the maximum value T of t is determined by the data length of the input variable U, Win and Wx respectively represent the input layer-storage layer weight matrix and the storage layer internal connection weight matrix in the echo state network, and logsig(·) represents the activation function;
U402: based on the loss function under the L2 + adaptive L1/2 regularization constraint, obtaining the output weight at the minimum of the loss function;
where E represents the loss function and λ1, λ2 are the regularization coefficients.
6. The method of claim 5, wherein the process of step U402 is: simplifying the loss function, and calculating the output weight with a coordinate descent algorithm;
Wherein the simplified loss function is represented as:
where I is an identity matrix;
solving the matrix W'out is done element by element; the value of the element in the m-th row and k-th column of W'out is:
where Y'k(t) denotes the t-th element of the k-th column of Y', X'j(t) denotes the t-th element of the j-th column of X', and (W'out)jk denotes the element in the j-th row and k-th column of matrix W'out, the term with j = m being excluded from the sum (taken as zero); L is the number of output layer nodes, and n is the number of storage layer nodes.
7. The method of claim 6, further comprising adaptively optimizing the output weight obtained in step U402, the optimization process being:
converting the loss function and calculating the weight W''out with a coordinate descent algorithm, then calculating the optimized output weight;
The loss function after conversion is:
The relationship between the weight W''out and the output weight Wout is:
where K is the number of input layer nodes.
8. The method of claim 1, wherein the storage layer parameters in the echo state network comprise four key parameters: the internal connection spectral radius, the storage layer scale, the input layer scaling coefficient and the storage layer sparsity.
9. The method of claim 1, wherein the parameters required to initialize the quantum particle swarm algorithm in step S1 include the particle swarm size N, the maximum iteration number iter_max, and the inertia factors ω_max and ω_min.
10. The method of claim 1, wherein, when the particle position is updated, if a particle's moving distance exceeds the position range corresponding to that particle, the particle position parameter is set to the boundary value of the position range that was exceeded.
CN201910566123.4A 2019-06-27 2019-06-27 Thermal data prediction method based on joint optimization echo state network Active CN110554838B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910566123.4A CN110554838B (en) 2019-06-27 2019-06-27 Thermal data prediction method based on joint optimization echo state network
PCT/CN2020/097950 WO2020259543A1 (en) 2019-06-27 2020-06-24 Hot data prediction method based on joint optimization of echo state network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910566123.4A CN110554838B (en) 2019-06-27 2019-06-27 Thermal data prediction method based on joint optimization echo state network

Publications (2)

Publication Number Publication Date
CN110554838A true CN110554838A (en) 2019-12-10
CN110554838B CN110554838B (en) 2020-08-14

Family

ID=68735438

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910566123.4A Active CN110554838B (en) 2019-06-27 2019-06-27 Thermal data prediction method based on joint optimization echo state network

Country Status (2)

Country Link
CN (1) CN110554838B (en)
WO (1) WO2020259543A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020259543A1 (en) * 2019-06-27 2020-12-30 中南大学 Hot data prediction method based on joint optimization of echo state network
CN112448697A (en) * 2020-10-30 2021-03-05 合肥工业大学 Active filter optimization method and system based on quantum particle swarm optimization
CN112731019A (en) * 2020-12-21 2021-04-30 合肥工业大学 Fault diagnosis method for ANPC three-level inverter
WO2024077642A1 (en) * 2022-10-12 2024-04-18 大连理工大学 Method for constructing quantum echo state network model for aero-engine fault early-warning

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103959291A (en) * 2011-04-20 2014-07-30 诺沃—诺迪斯克有限公司 Glucose predictor based on regularization networks with adaptively chosen kernels and regularization parameters
US20170046079A1 (en) * 2015-08-11 2017-02-16 International Business Machines Corporation Read distribution in a three-dimensional stacked memory based on thermal profiles
CN109104388A (en) * 2017-06-20 2018-12-28 希捷科技有限公司 The devices, systems, and methods adaptive for regularization parameter
CN109656485A (en) * 2018-12-24 2019-04-19 合肥兆芯电子有限公司 The method for distinguishing dsc data and cold data
CN109726858A (en) * 2018-12-21 2019-05-07 新奥数能科技有限公司 Heat load prediction method and device based on dynamic time warping
CN109901800A (en) * 2019-03-14 2019-06-18 重庆大学 A kind of mixing memory system and its operating method

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103020434A (en) * 2012-11-30 2013-04-03 南京航空航天大学 Particle swarm optimization-based least square support vector machine combined predicting method
CN110554838B (en) * 2019-06-27 2020-08-14 中南大学 Thermal data prediction method based on joint optimization echo state network


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
MEILING XU; SHUHUI ZHANG; MIN HAN: "Multivariate time series modeling and prediction based on reservoir independent components", 2015 Sixth International Conference on Intelligent Control and Information Processing (ICICIP) *



Similar Documents

Publication Publication Date Title
CN110554838B (en) Thermal data prediction method based on joint optimization echo state network
KR102645142B1 (en) Storage devices, methods and non-volatile memory devices for performing garbage collection using estimated valid pages
WO2021008220A1 (en) Systems and methods for data storage system
CN110414031B (en) Method and device for predicting time sequence based on volterra series model, electronic equipment and computer readable storage medium
US20180067697A1 (en) Storage Devices Including Nonvolatile Memory Devices and Access Methods for Nonvolatile Memory Devices
CN109817267B (en) Deep learning-based flash memory life prediction method and system and computer-readable access medium
CN110968272B (en) Time sequence prediction-based method and system for optimizing storage performance of mass small files
Ma et al. RBER-aware lifetime prediction scheme for 3D-TLC NAND flash memory
CN106233265A (en) Access frequency hierarchical structure is used for evicting from the selection of target
EP3651024B1 (en) Method of operating storage device, storage device performing the same and storage system including the same
Zhang et al. Crftl: cache reallocation-based page-level flash translation layer for smartphones
Luo et al. Self-learning hot data prediction: Where echo state network meets NAND flash memories
Gupta et al. Relevance feedback based online learning model for resource bottleneck prediction in cloud servers
He et al. Information-aware attention dynamic synergetic network for multivariate time series long-term forecasting
Nguyen et al. Recurrent conditional heteroskedasticity
Heidenreich et al. Transfer learning of recurrent neural network‐based plasticity models
Ahmed et al. Bitcoin Price Prediction using the Hybrid Convolutional Recurrent Model Architecture
Poczeta et al. Analysis of fuzzy cognitive maps with multi-step learning algorithms in valuation of owner-occupied homes
Feng et al. Using disturbance compensation and data clustering (DC) 2 to improve reliability and performance of 3D MLC flash memory
CN110705631A (en) SVM-based bulk cargo ship equipment state detection method
Ha et al. Dynamic hot data identification using a stack distance approximation
Reddy et al. Analysis of Stock Market Value Prediction using Simple Novel Long Short Term Memory Algorithm in Comparison with Back Propagation Algorithm for Increased Accuracy Rate
CN113822583A (en) Power distribution network investment demand prediction method and device, terminal equipment and medium
Khan et al. DaCapo: An On-Device Learning Scheme for Memory-Constrained Embedded Systems
Kargar et al. E2-NVM: A Memory-Aware Write Scheme to Improve Energy Efficiency and Write Endurance of NVMs using Variational Autoencoders.

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant