CN110554838B - Thermal data prediction method based on joint optimization echo state network

Publication number: CN110554838B (application CN201910566123.4A; earlier publication CN110554838A)
Authority: CN (China)
Legal status: Active (granted)
Original language: Chinese (zh)
Inventors: 罗旗舞, 王玥童, 阳春华, 桂卫华, 周灿
Original and current assignee: Central South University
Application filed by Central South University; international application PCT/CN2020/097950 (WO2020259543A1)

Classifications

    • G06F3/0614 Improving the reliability of storage systems
    • G06F3/064 Management of blocks
    • G06F3/0679 Non-volatile semiconductor memory device, e.g. flash memory, one time programmable memory [OTP]


Abstract

The invention discloses a thermal data prediction method based on a joint optimization echo state network. The method innovatively uses quantum particle swarm optimization to calculate the storage layer parameters of the echo state network and obtain the optimal storage layer parameters: in the particle position updating process, the output weight is computed by the echo state network constrained with L2 + adaptive L1/2 regularization and the global optimal adaptive value is calculated, and when iteration finishes the particle position corresponding to the global optimal adaptive value is taken as the optimal storage layer parameter. Finally, based on the optimal storage layer parameters, the echo state network constrained with L2 + adaptive L1/2 regularization calculates the final output weight, and thermal data are predicted from the final output weight and the logical block addresses of the input historical thermal data; the data at the predicted logical block address are the thermal data.

Description

Thermal data prediction method based on joint optimization echo state network
Technical Field
The invention belongs to the technical field of chaotic time sequence prediction, and particularly relates to a thermal data prediction method based on a joint optimization echo state network.
Background
As a non-volatile storage technology, NAND flash memory is widely used in communication systems and consumer electronics, offering higher access speed and power efficiency than hard disk drives. In NAND flash based consumer electronics, a large number of applications rely on NAND flash for data exchange, file storage and video storage. NAND flash is mainly used for storing large-capacity data: the NAND structure provides extremely high cell density and achieves high storage density, high write speed and high erase speed. NAND flash is therefore widely used for large-capacity data storage, such as solid-state disks. Demand for NAND flash will continue to grow, mainly driven by cloud computing, the Internet of Things, data centers and similar fields.
However, NAND flash memory faces at least two challenges that limit its large-scale application: out-of-place updating and limited endurance. NAND flash cannot perform an overwrite operation, i.e., a new write cannot be performed on a flash page before that page is erased. Out-of-place updates generate many invalid pages, which reduces efficiency and performance. Furthermore, NAND flash has a limited lifetime because a flash block can only withstand a limited number of erasures; if the erase count of a block exceeds its maximum, the block becomes unusable. Garbage Collection (GC) and Wear Leveling (WL) follow the design idea of allocating frequently written data (i.e., hot data) to blocks with low erase counts and the least recently used data (i.e., cold data) to blocks with high erase counts, and are important in addressing these two challenges, while the efficiency and performance of GC and WL depend to a large extent on Hot Data Identification (HDI). The essence of HDI is to understand the access behavior of hot data well enough to intelligently assign different data to appropriate blocks, but conventional HDI has two problems, the first being large memory overhead. At present, most hot data identification mechanisms identify hot pages in the NAND flash; their core is a page counter that records the number of read/write operations on the logical page corresponding to each NAND flash page within a certain time. If the read/write count exceeds a set threshold, the requested page is judged to be a hot page; otherwise it is a cold page.
Another serious problem is low identification accuracy. The Bloom filter-based hot data identification mechanism is widely applied to cold/hot data identification in SSDs, but the Bloom filter has the inherent defect of false positives, that is, data not belonging to the set can be erroneously judged to be in the set. In addition, existing hot data identification schemes consider relatively single features, such as request size or access pattern, and cannot fully and comprehensively capture the locality characteristics of the workload, so the accuracy of hot data identification is not high.
Disclosure of Invention
The invention aims to provide a thermal data prediction method based on a joint optimization echo state network, which creatively proposes to replace the traditional thermal data identification with thermal data prediction and constructs the joint optimization echo state network, thereby ensuring that the predicted thermal data has more real-time property and reliability.
The invention provides a thermal data prediction method based on a joint optimization echo state network, which comprises the following steps:
s1: initializing parameters required by a quantum particle swarm algorithm and position information of each particle;
the position information of the particles comprises initial positions and position ranges of the particles, and the position of each particle is represented by storage layer parameters in an echo state network;
s2: iterative optimization is carried out by utilizing a quantum particle group algorithm to determine the optimal storage layer parameters;
updating the positions of the particles by the quantum particle swarm algorithm based on the position range of each particle; in each updating process, the output weight is computed by the echo state network constrained with L2 + adaptive L1/2 regularization to obtain the global optimal adaptive value, and when iteration finishes the particle position corresponding to the global optimal adaptive value is taken as the optimal storage layer parameter;
s3: based on the optimal storage layer parameters, the echo state network constrained with L2 + adaptive L1/2 regularization calculates the final output weight;
s4: predicting the hot data by using the final output weight and the logical block address of the input historical hot data, wherein the prediction formula is as follows:
y = x · W_out    (1)
wherein y represents the predicted logical block address, the data at the predicted logical block address being the thermal data; x is the logical block address of the input historical thermal data; and W_out denotes the final output weight. The logical block addresses of the historical thermal data are used in the echo state network training process of steps S2 and S3.
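As a shape-level sketch of formula (1), the prediction step reduces to a single matrix product. The dimensions below are illustrative assumptions, not values from the patent:

```python
import numpy as np

# A sketch of formula (1), y = x * W_out.  The readout width is an assumed
# value: a trained output weight maps the feature vector derived from
# historical logical block addresses to one predicted address.
rng = np.random.default_rng(0)
n_features = 8                                 # assumed readout width
W_out = rng.standard_normal((n_features, 1))   # stand-in for the trained weight
x = rng.standard_normal((1, n_features))       # stand-in input from historical LBAs
y = x @ W_out                                  # predicted logical block address
```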
While retaining the solid state disk structural framework, the invention innovatively replaces the hot data identification module in the flash translation layer with thermal data prediction by a joint optimization echo state network. The joint optimization comprises two parts: the first part uses the quantum particle swarm algorithm to iteratively determine the optimal storage layer parameters of the echo state network, and the second part uses the ESN constrained with L2 + adaptive L1/2 regularization to obtain a highly sparse output weight. By combining quantum particle swarm iterative optimization with the L2 + adaptive L1/2 regularization constraint to obtain the optimal storage layer parameters, the joint optimization echo state network used for prediction has high real-time performance and reliability. The invention trains the echo state network with the logical block addresses of historical thermal data to obtain the final output weight, and then uses the final output weight to predict the logical block address of the thermal data.
Further preferably, the iterative optimization for determining the optimal storage layer parameters in step S2 is performed as follows:
s21: the current position of each particle is used in turn as the storage layer parameters of the echo state network, and the process output weight corresponding to each particle is calculated by the echo state network constrained with L2 + adaptive L1/2 regularization;
s22: calculating the adaptive value of each particle by using the process output weight corresponding to each particle;
s23: selecting an individual optimal adaptive value, an individual optimal parameter, a global optimal adaptive value and a global optimal parameter of each particle according to the adaptive value of each particle based on a minimum adaptive value principle;
wherein, the position of the particle selected as the global optimum adaptive value is the global optimum parameter;
s24: updating the position of each particle in the position range of the particle, recalculating the adaptive value of each particle based on the updated position of each particle, and updating the individual optimal adaptive value, the individual optimal parameter, the global optimal adaptive value and the global optimal parameter of each particle based on the minimum adaptive value principle;
s25: judging whether the iteration times reach the maximum iteration times, if not, returning to the step S24 for the next iteration calculation; otherwise, the current global optimum parameter is taken as the optimum storage layer parameter.
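The iteration of steps S21 to S25 can be sketched as follows. A toy quadratic fitness stands in for the regularization-constrained ESN adaptive value, and all numeric settings (ranges, particle count, inertia factors) are illustrative assumptions:

```python
import numpy as np

# A runnable sketch of steps S21-S25 with a placeholder fitness function.
rng = np.random.default_rng(1)
N, DIM, ITER_MAX = 20, 4, 50              # particles, parameter dimensions, iterations
lo, hi = np.zeros(DIM), np.ones(DIM)      # assumed position range per dimension
W_MAX, W_MIN = 1.0, 0.5                   # maximum / minimum inertia factors

def fitness(p):                           # placeholder for the ESN adaptive value
    return float(np.sum((p - 0.3) ** 2))

P = rng.uniform(lo, hi, (N, DIM))         # S1: random initial positions in range
sbest = P.copy()                          # individual optimal parameters
fsbest = np.array([fitness(p) for p in P])
g = int(np.argmin(fsbest))                # S23: minimum-adaptive-value rule
gbest, fgbest = sbest[g].copy(), fsbest[g]

for it in range(1, ITER_MAX + 1):         # S24-S25: iterate to the maximum count
    beta = W_MIN + (W_MAX - W_MIN) * (ITER_MAX - it) / ITER_MAX
    mbest = sbest.mean(axis=0)            # mean of all individual best positions
    for j in range(N):
        phi = rng.random(DIM)
        u = 1.0 - rng.random(DIM)         # in (0, 1], avoids log(0)
        attract = phi * sbest[j] + (1.0 - phi) * gbest
        step = beta * np.abs(mbest - P[j]) * np.log(1.0 / u)
        sign = np.where(rng.random(DIM) < 0.5, 1.0, -1.0)
        P[j] = np.clip(attract + sign * step, lo, hi)   # boundary rule
        f = fitness(P[j])
        if f < fsbest[j]:                 # update individual best
            fsbest[j], sbest[j] = f, P[j].copy()
        if f < fgbest:                    # update global best
            fgbest, gbest = f, P[j].copy()
```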
Further preferably, the position of any particle j is updated according to the following formula:
P_j(t+1) = p_j ± β · |mbest - P_j(t)| · ln(1/u_j)
wherein the local attractor is
p_j = φ_j · sbest_j + (1 - φ_j) · gbest
and
β = ω_min + (ω_max - ω_min) · (iter_max - iter) / iter_max
In the formulas, P_j(t+1) and P_j(t) represent the positions of particle j after and before the update respectively; φ_j and u_j are both random numbers in (0,1), the sign ± being chosen with equal probability; sbest_j and sbest_i denote the individual optimal parameters of the j-th and i-th particles, and gbest the global optimal parameter; mbest = (1/N) · Σ_{i=1..N} sbest_i is the mean of the current individual optimal parameters of all particles; iter and iter_max are respectively the current iteration number and the maximum iteration number; ω_max and ω_min are respectively the maximum inertia factor and the minimum inertia factor; and N is the total number of particles.
Further preferably, the calculation formula of the adaptive value of any particle j is:
Fitness = ||Y - X·W_out(2)||_2^2 + λ1·||W_out(2)||_2^2 + λ2·Σ_k |w_k|^(1/2)
where Fitness denotes the adaptive value of the current particle j; λ1 and λ2 are both regularization coefficients; W_out(2) is the process output weight corresponding to the current particle j, with elements w_k; Y represents the rear section of the logical block addresses of the historical thermal data used for network training; X represents the state information of the storage layer updated from the front section of those logical block addresses; and X·W_out(2) is the prediction corresponding to the rear section of the logical block addresses of the historical thermal data.
Further preferably, the process of calculating the final output weight or the process output weight by the echo state network constrained with L2 + adaptive L1/2 regularization is as follows:
u401: acquiring the input layer-storage layer weight matrix and the storage layer internal connection weight matrix of the echo state network, and taking the front section of the logical block addresses of the historical thermal data as the input variable U and the rear section as the actual result Y;
wherein the input layer-storage layer weight matrix and the storage layer internal connection weight matrix are determined by the storage layer parameters of the echo state network;
u402: updating the state information X of the storage layer based on the input variable U, wherein X is composed of the state node information X(t):
X(t) = logsig(U(t)·W_in + X(t-1)·W_x)
wherein U(t) represents the t-th datum in the input variable U; X(t) and X(t-1) represent the t-th and (t-1)-th state node information respectively; the maximum value T of t is determined by the data length of the input variable U; W_in and W_x respectively denote the input layer-storage layer weight matrix and the storage layer internal connection weight matrix of the echo state network; and logsig(·) denotes the activation function;
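The state update of step U402 can be sketched as follows, with toy data standing in for the logical block address sections; the reservoir size, weight scales and spectral radius are illustrative assumptions:

```python
import numpy as np

# A sketch of step U402: each state node vector is
# X(t) = logsig(U(t)*W_in + X(t-1)*W_x), with logsig the logistic sigmoid.
def logsig(z):                             # logistic sigmoid activation
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(2)
n_in, n_res, T = 1, 8, 50                  # input width, reservoir size, length
W_in = rng.uniform(-0.5, 0.5, (n_in, n_res))
W_x = rng.uniform(-0.5, 0.5, (n_res, n_res))
W_x *= 0.9 / np.max(np.abs(np.linalg.eigvals(W_x)))  # assumed spectral radius 0.9

U = rng.standard_normal((T, n_in))         # stand-in for the front LBA section
X = np.zeros((T, n_res))                   # collected storage layer states
x_prev = np.zeros(n_res)                   # X(0)
for t in range(T):
    x_prev = logsig(U[t] @ W_in + x_prev @ W_x)
    X[t] = x_prev
```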
u403: obtaining the output weight at the minimum of the loss function under the L2 + adaptive L1/2 regularization constraint:
E = ||Y - X·W_out||_2^2 + λ1·||W_out||_2^2 + λ2·Σ_k |w_k|^(1/2)
In the formula, E represents the loss function, and λ1 and λ2 are both regularization coefficients. If the process output weight of step S2 is being calculated, the output weight W_out equals the process output weight; if the final output weight of step S3 is being calculated, the output weight W_out equals the final output weight.
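Evaluating this loss for a candidate output weight is straightforward. The sketch below uses toy data and treats the L1/2 term without adaptive weighting:

```python
import numpy as np

# A sketch evaluating the loss E of step U403 for a candidate output weight,
# E = ||Y - X*W||_2^2 + lambda1*||W||_2^2 + lambda2*sum(|w|^(1/2)).
def loss(X, Y, W, lam1, lam2):
    fit = np.sum((Y - X @ W) ** 2)            # squared prediction error
    l2 = lam1 * np.sum(W ** 2)                # L2 (ridge) penalty
    l_half = lam2 * np.sum(np.abs(W) ** 0.5)  # L1/2 penalty
    return fit + l2 + l_half

rng = np.random.default_rng(6)
X = rng.standard_normal((20, 4))              # storage layer states (toy)
Y = rng.standard_normal((20, 1))              # rear LBA section (toy)
W_zero = np.zeros((4, 1))                     # with W = 0 the loss is ||Y||^2
```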
Further preferably, the acquisition process in step U403 is: simplify the loss function, and compute the output weight with a coordinate descent algorithm;
wherein the simplified loss function is expressed as:
E = ||Y' - X'·W'_out||_2^2 + λ2·Σ |w'|^(1/2)
with the augmented quantities
X' = (1 + λ1)^(-1/2) · [X; √λ1 · I],  Y' = [Y; 0],  W'_out = √(1 + λ1) · W_out
wherein I is an identity matrix, so that the L2 term is folded into the least-squares part;
the matrix W'_out is solved element by element: in each coordinate descent sweep, the value of the k-th element of the m-th row of W'_out is updated as
w'_mk = H_λ2( r_mk / z_m )
wherein
r_mk = Σ_t X'_m(t) · ( Y'_k(t) - Σ_{j≠m} X'_j(t) · w'_jk ),  z_m = Σ_t X'_m(t)^2
Here Y'_k(t) denotes the t-th element of the k-th column of Y', X'_j(t) denotes the t-th element of the j-th column of X', and w'_jk denotes the element in the j-th row and k-th column of W'_out, elements not yet updated in the current sweep keeping their previous values (initialized to zero); H_λ2(·) is the thresholding operator induced by the L1/2 penalty, which sets elements whose argument falls below the half-threshold to zero; l is the number of output layer nodes, and n is the number of storage layer nodes.
Finally the output weight W_out is calculated from the matrix W'_out according to the relationship W_out = W'_out / √(1 + λ1).
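The element-wise coordinate descent can be sketched as below on the augmented system X' = [X; sqrt(lambda1)·I], Y' = [Y; 0] (dropping the constant rescaling factor for simplicity). Note that the L1/2 half-thresholding rule is replaced here by the plain L1 soft-thresholding rule, so this is a stand-in for, not a reproduction of, the patent's exact update:

```python
import numpy as np

# Coordinate descent on the augmented least-squares problem; the L2 penalty is
# folded into the data by stacking sqrt(lam1)*I under X and zeros under Y.
# Soft-thresholding (an L1 rule) stands in for the L1/2 half-thresholding.
def coordinate_descent(X, Y, lam1, lam2, sweeps=100):
    n = X.shape[1]
    Xa = np.vstack([X, np.sqrt(lam1) * np.eye(n)])    # X' = [X; sqrt(lam1) I]
    Ya = np.vstack([Y, np.zeros((n, Y.shape[1]))])    # Y' = [Y; 0]
    W = np.zeros((n, Y.shape[1]))
    col_sq = (Xa ** 2).sum(axis=0)                    # z_m = sum_t X'_m(t)^2
    for _ in range(sweeps):
        for m in range(n):                            # update row m of W
            r = Ya - Xa @ W + np.outer(Xa[:, m], W[m])   # residual without row m
            z = Xa[:, m] @ r / col_sq[m]
            W[m] = np.sign(z) * np.maximum(np.abs(z) - lam2 / col_sq[m], 0.0)
    return W

rng = np.random.default_rng(3)
X = rng.standard_normal((40, 6))
true_W = np.zeros((6, 1)); true_W[1, 0] = 2.0         # sparse ground truth
Y = X @ true_W + 0.01 * rng.standard_normal((40, 1))
W = coordinate_descent(X, Y, lam1=0.1, lam2=0.5)      # recovers a sparse weight
```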
Further preferably, the method further includes performing adaptive optimization of the output weight obtained in step U403, the optimization process being as follows:
the loss function is converted so that the L1/2 penalty is weighted coefficient-wise by the magnitudes of the output weight obtained in step U403, the weight W''_out is calculated for the converted loss by the coordinate descent algorithm, and the optimized output weight is then calculated from W''_out;
the optimized output weight W_out is recovered from W''_out through the same coefficient-wise weighting relationship, where k is the number of output layer nodes.
Further preferably, the storage layer parameters in the echo state network comprise four key parameters of internal connection spectrum radius, storage layer scale, input layer proportion coefficient and storage layer sparsity.
Further preferably, the parameters required for initializing the quantum particle swarm algorithm in step S1 include the total number of particles N, the maximum iteration number iter_max, the maximum inertia factor ω_max and the minimum inertia factor ω_min.
Further preferably, when the particle position is updated, if the particle movement distance exceeds the position range corresponding to the particle, the particle position parameter is set to the corresponding boundary value of the position range.
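This boundary rule amounts to a clamp; the four parameter ranges below are illustrative assumptions:

```python
import numpy as np

# A one-line sketch of the boundary rule: a particle whose movement leaves its
# admissible position range is set to the violated boundary value.
pos_min = np.array([0.10, 10.0, 0.01, 0.01])  # spectral radius, layer scale,
pos_max = np.array([0.99, 500.0, 1.00, 0.20]) # input scaling, sparsity (assumed)
p = np.array([1.2, 5.0, 0.5, 0.05])           # first two entries out of range
p = np.clip(p, pos_min, pos_max)              # clamped to the boundary values
```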
Advantageous effects
1. The invention innovatively proposes replacing traditional hot data identification with hot data prediction. The disclosed hot data prediction technique can predict the property of the next data one or even several beats in advance from historical access behavior, and proactively allocates and stores that data to the corresponding block (hot/cold data block) of the Solid State Disk (SSD). Meanwhile, the neural network method retains more feature information from the input and classifies thermal data more comprehensively.
2. The invention performs joint optimization on the echo state network. L2 regularization achieves good generalization ability through a trade-off between model bias and prediction variance, obtaining continuously shrunk weights, but it does not produce sparse solutions; adaptive L1/2 regularization can generate very sparse solutions, but when there is high correlation between the predictor variables, L1/2 cannot play its regulating role well. The invention comprehensively adopts L2 + adaptive L1/2 regularized least squares training, obtaining the advantages of both regularizations and further improving the prediction precision of thermal data. In addition, the echo state network storage layer parameters are optimized by the QPSO algorithm, which solves the problem that the storage layer parameters cannot be determined when the model is built. Compared with the traditional PSO algorithm, this algorithm removes the velocity information of particles based on wave-particle duality and retains only the position information, which effectively reduces computational complexity while obtaining storage layer parameters adapted to the model, further improving prediction precision. Furthermore, the invention combines L2 + adaptive L1/2 regularization with the QPSO algorithm to obtain the optimal storage layer parameters and improve prediction precision.
Drawings
FIG. 1 is a typical architecture of a NAND flash memory system;
FIG. 2 is a flow chart of a method for hot data prediction based on joint optimization echo state network according to an embodiment of the present invention;
FIG. 3 is a flow chart of the iterative optimization algorithm of the quantum-behaved particle swarm optimization of the present invention;
FIG. 4 is a flow chart of the specific algorithm for computing the output weight by the echo state network constrained with L2 + adaptive L1/2 regularization.
FIG. 5 is a graph comparing the performance of four actual workloads according to one embodiment of the present invention.
Detailed Description
The present invention will be further described with reference to the following examples.
The invention provides a hot data prediction method based on a joint optimization echo state network, mainly applied to a NAND flash memory system. As shown in FIG. 1, a typical NAND flash memory system architecture comprises module B101 (user operations), module B102 (file system) and module B103 (solid state disk). The actual operations of the user affect the solid state disk through the file system. The solid state disk further comprises a flash translation layer, a flash controller and a NAND flash array, wherein the flash translation layer comprises an address allocation unit, a garbage collection unit, a wear leveling unit and a hot data prediction unit. The invention innovatively replaces the traditional hot data identification unit with the hot data prediction unit. The traditional hot data identification method passively analyzes user access behavior and allocates the corresponding data to the corresponding blocks (hot/cold data blocks) of the Solid State Disk (SSD) through the flash translation layer (FTL); when responding to requests with complex access behavior, it suffers higher hot data miss or false alarm rates. The hot data prediction technique disclosed by the invention can predict the property of the next data one or even several beats in advance from historical access behavior, proactively allocates and stores the data to the corresponding block (hot/cold data block) of the Solid State Disk (SSD), and is compatible with secondary verification by a traditional hot data identification scheme. Accordingly, the thermal data prediction method proposed by the invention is essentially "predictive hot data identification". The finally obtained predicted logical block address information is used for garbage collection and wear leveling.
In this process, wear leveling and garbage collection have a large influence on the solid state disk. Traditional hot data identification aims to distinguish accurately and efficiently which data are valid. The invention provides a hot data prediction method based on a joint optimization echo state network, which replaces hot data identification with high-precision hot data prediction and specifically comprises the following steps:
s1: initializing parameters required by the quantum particle swarm algorithm and position information of each particle.
The position information of the particles comprises the initial position and the position range of each particle, and the position of each particle is represented by the storage layer parameters of the Echo State Network (ESN). The storage layer parameters of the ESN comprise the internal connection spectral radius, the storage layer scale, the input layer scaling coefficient and the storage layer sparsity, so in this example the dimension of each particle is initialized to 4; that is, each particle is a 1 × 4 matrix representing the 4 ESN storage layer parameters. The ranges of the 4 parameters are determined and defined as the position range of all particles; during initialization each particle is randomly assigned a value within the position range. In the subsequent updating process the particles continuously move toward the optimum within the specified range, and if a particle exceeds the specified range during movement, its position information is updated to the boundary value. Each particle position represents a specific set of values of the ESN storage layer parameters.
The parameters required by the quantum particle swarm algorithm comprise the total number of particles N, the maximum iteration number iter_max, the maximum inertia factor ω_max and the minimum inertia factor ω_min (used for subsequently updating the particle position information).
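The initialization of step S1 can be sketched as follows; the particle count, iteration limit, inertia factors and parameter ranges are illustrative assumptions:

```python
import numpy as np

# A sketch of step S1: each particle is a 1 x 4 vector holding the four
# storage layer parameters (internal connection spectral radius, storage layer
# scale, input scaling coefficient, storage layer sparsity).
rng = np.random.default_rng(4)
N, ITER_MAX = 30, 100                         # total particles, max iterations
OMEGA_MAX, OMEGA_MIN = 1.0, 0.5               # maximum / minimum inertia factors
pos_min = np.array([0.10, 50.0, 0.01, 0.01])  # assumed lower bounds
pos_max = np.array([0.99, 500.0, 1.00, 0.20]) # assumed upper bounds
particles = rng.uniform(pos_min, pos_max, (N, 4))   # random positions in range
```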
S2: iterative optimization is carried out by utilizing a quantum particle group algorithm to determine the optimal storage layer parameters;
updating the positions of the particles by the quantum particle swarm algorithm based on the position range of each particle; in each updating process, the output weight is computed by the echo state network constrained with L2 + adaptive L1/2 regularization to obtain the global optimal adaptive value, and when iteration finishes the particle position corresponding to the global optimal adaptive value is taken as the optimal storage layer parameter. The specific process comprises the following steps:
S21: the current position of each particle is used in turn as the storage layer parameters of the echo state network, and the process output weight corresponding to each particle is calculated by the echo state network constrained with L2 + adaptive L1/2 regularization;
s22: calculating the adaptive value of each particle by using the process output weight corresponding to each particle;
s23: selecting an individual optimal adaptive value, an individual optimal parameter, a global optimal adaptive value and a global optimal parameter of each particle according to the adaptive value of each particle based on a minimum adaptive value principle;
wherein, the position of the particle selected as the global optimum adaptive value is the global optimum parameter;
s24: updating the position of each particle in the position range of the particle, recalculating the adaptive value of each particle based on the updated position of each particle, and updating the individual optimal adaptive value, the individual optimal parameter, the global optimal adaptive value and the global optimal parameter of each particle based on the minimum adaptive value principle;
s25: judging whether the iteration times reach the maximum iteration times, if not, returning to the step S24 for the next iteration calculation; otherwise, the current global optimum parameter is taken as the optimum storage layer parameter.
Based on the above logic, an embodiment of the present invention provides an example flowchart as shown in fig. 3, which includes the following steps:
u301: iteration initialization: set the current iteration number iter = 1 and the particle index j = 1.
U302: set the position of the j-th particle as the ESN storage layer parameters, and obtain the process output weight W_out(2) with higher sparsity through the least squares calculation occurring in the training constrained with L2 + adaptive L1/2 regularization. The detailed steps of computing the output weight W_out by the ESN constrained with L2 + adaptive L1/2 regularization are shown in FIG. 4 and described in more detail below.
U303: based on the process output weight W_out(2) of the j-th particle, calculate the adaptive value corresponding to the j-th particle, wherein the calculation formula is:
Fitness = ||Y - X·W_out(2)||_2^2 + λ1·||W_out(2)||_2^2 + λ2·Σ_k |w_k|^(1/2)
In the formula, λ1 and λ2 are both regularization coefficients; W_out(2) is the process output weight corresponding to the current particle j, with elements w_k; Y represents the rear section of the logical block addresses of the historical thermal data used for network training; X represents the state information of the storage layer updated from the front section of those logical block addresses; and X·W_out(2) is the prediction corresponding to the rear section of the logical block addresses of the historical thermal data.
D301: judge whether all particles have finished calculating their adaptive values; if not, add 1 to j and return to step U302 to calculate the adaptive value of the next particle; if all particles have finished, proceed to step U304.
U304: select the individual optimal adaptive value, individual optimal parameter, global optimal adaptive value and global optimal parameter of each particle according to the adaptive value of each particle, based on the minimum-adaptive-value principle. After all particles have calculated their adaptive values, comparison is performed: the adaptive value of each particle is recorded as its individual optimal adaptive value fsbest, and the position of each particle as its individual optimal parameter sbest; the minimum among all particle adaptive values is recorded as the global optimal adaptive value fgbest, and its corresponding position is the global optimal parameter gbest. These parameters are used for subsequent iterative optimization.
U305: the iteration starts and the particle index j is reset to 1.
U306: calculate the mbest used for updating the jth particle, with the formula:

mbest = (1/N) · Σ_{i=1}^{N} sbest_i

in the formula, sbest_i represents the individual optimal parameter of the ith particle and N is the total number of particles; mbest is therefore the mean of the current individual optimal parameters of all particles, taken separately in each dimension, and is used to update the particle position information.
U307: update the position information of the jth particle, with the formula:

P_j(t+1) = p_j ± β · |mbest − P_j(t)| · ln(1/u_j),  where  p_j = φ_j · sbest_j + (1 − φ_j) · gbest

wherein P_j(t+1) and P_j(t) represent the positions of particle j after and before the update respectively, and φ_j and u_j are random numbers in (0,1). The formula for β is:

β = ω_min + (ω_max − ω_min) · (iter_max − iter) / iter_max
As the formula for β shows, early in the iteration the parameter β, which represents the step length of particle movement, is large, so particles can move toward the optimal position quickly; late in the iteration β is small, meaning the step length shrinks near the optimal position and each move approaches it more precisely.
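Under the assumption that β decays linearly from ω_max to ω_min with the iteration count (consistent with the behaviour described above), one whole-swarm position update can be sketched as follows; the ± branch is chosen uniformly at random per dimension:

```python
import numpy as np

def qpso_update(P, sbest, gbest, it, it_max, w_max=1.0, w_min=0.5, rng=None):
    """One quantum particle swarm position update.
    P, sbest: (N, D) current positions and individual optimal parameters;
    gbest: (D,) global optimal parameter. Returns the updated positions."""
    if rng is None:
        rng = np.random.default_rng()
    N, D = P.shape
    mbest = sbest.mean(axis=0)                        # mean of all individual bests
    beta = w_min + (w_max - w_min) * (it_max - it) / it_max  # step-length parameter
    phi = rng.random((N, D))                          # random in (0,1)
    u = np.clip(rng.random((N, D)), 1e-12, None)      # avoid log(1/0)
    p = phi * sbest + (1.0 - phi) * gbest             # local attractor per particle
    sign = np.where(rng.random((N, D)) < 0.5, 1.0, -1.0)
    return p + sign * beta * np.abs(mbest - P) * np.log(1.0 / u)
```

After this update each new position would be clipped to the particle's position range, as claim 10 describes.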
After the position information is updated, the new storage-layer parameters are again fed into the L2 + adaptive L1/2 regularized ESN to recompute the adaptive value. Steps D302, D303, U308 and U309 accordingly update the individual and global optima from the newly computed adaptive value: if it is smaller than the particle's individual optimal adaptive value, that value is replaced by the new one and the particle's individual optimal parameter is updated to its current position; and if the new adaptive value is also smaller than the global optimal adaptive value, it becomes the global optimal adaptive value and the global optimal parameter is updated to that particle's position.
D304: check whether all particles have been updated. If not, increment j by 1 and return to U306, recomputing mbest with the updated particle parameters before updating the position of the next particle; once all particles are updated, proceed to D305.
D305: check whether the iteration count has reached the maximum number of iterations. If not, increment iter by 1 and return to U305 for the next iteration; once the maximum is reached, output the final global optimal parameters for the subsequent training of the jointly optimized echo state network to predict logical block addresses.
S3: based on the optimal storage-layer parameters, calculate the final output weight with the L2 + adaptive L1/2 regularized echo state network. The process of calculating the final output weight is shown in Fig. 4 and described in detail below.
S4: and predicting the hot data by using the final output weight and the address of the logic block where the input historical hot data is located, wherein the prediction formula is as follows:
y = x * W_out^(1)
wherein y represents the predicted logical block address (the data at that address are thermal data), x is the logical block address where the input historical thermal data are located, and W_out^(1) represents the final output weight. Note that x and W_out^(1) may both be multidimensional while the obtained y is one-dimensional; the data at the obtained logical block address are classified as hot data and used for garbage collection and wear-leveling processing.
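The prediction step itself is a single inner product. A sketch (NumPy assumed), illustrating the dimensionality note above — a multidimensional x and W_out^(1) collapse to a single predicted address:

```python
import numpy as np

def predict_hot_lba(x, W_out):
    """S4: y = x * W_out(1). x holds the logical block addresses of the input
    historical hot data (possibly multidimensional); the scalar result y is
    the predicted logical block address, whose data are treated as hot for
    garbage collection and wear leveling."""
    return float(np.dot(np.ravel(np.asarray(x, dtype=float)),
                        np.ravel(np.asarray(W_out, dtype=float))))
```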
Computing the output weight requires a fixed set of storage-layer parameters: the internal connection spectral radius, the storage-layer scale, the input-layer proportionality coefficient, and the storage-layer sparsity. As shown in Fig. 4, the procedure used in the present invention for computing the output weight with the L2 + adaptive L1/2 regularized echo state network is as follows:
U401: acquire the input layer-storage layer weight matrix and the storage-layer internal connection weight matrix of the echo state network, and take the front segment of the logical block addresses where the historical thermal data are located as the input variable U and the rear segment as the actual result Y.
Specifically, an echo state network (ESN) is a low-complexity, fast-converging computational scheme well suited to temporal classification and prediction tasks. Its architecture has three layers: an input layer, a storage layer and an output layer, with input layer-storage layer weights W_in, storage-layer internal connection weights W_x, and storage layer-output layer weights W_out. The node counts of the input, storage and output layers are initialized as K, n and L, where the storage-layer node count n is the storage-layer scale from the storage-layer parameters. The input layer-storage layer weights W_in ∈ R^(n×K) are initialized by random assignment. The internal connection weights W_x ∈ R^(n×n) are initialized by choosing n × n × (storage-layer sparsity) non-zero entries, randomly assigning their positions and values, and setting all other entries to zero; a larger storage-layer sparsity gives stronger nonlinear approximation capability. The internal connection spectral radius then fixes the largest eigenvalue magnitude of W_x; the network is stable only if this spectral radius is less than 1. In this way the input layer-storage layer weights W_in ∈ R^(n×K) and the internal connection weights W_x ∈ R^(n×n) are determined by the storage-layer parameters.
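A sketch of this initialization (NumPy assumed; the uniform(−1, 1) sampling is an assumption — the text only says the non-zero entries are assigned randomly):

```python
import numpy as np

def init_reservoir(K, n, sparsity, spectral_radius, seed=0):
    """Build W_in (n x K) and the sparse internal matrix W_x (n x n) from the
    storage-layer parameters, then rescale W_x so its largest eigenvalue
    magnitude equals spectral_radius (kept below 1 for stability)."""
    rng = np.random.default_rng(seed)
    W_in = rng.uniform(-1.0, 1.0, (n, K))             # random input weights
    W_x = np.zeros((n, n))
    nnz = int(n * n * sparsity)                       # n * n * sparsity non-zeros
    pos = rng.choice(n * n, size=nnz, replace=False)  # random positions
    W_x.flat[pos] = rng.uniform(-1.0, 1.0, nnz)       # random magnitudes
    rho = np.max(np.abs(np.linalg.eigvals(W_x)))      # current spectral radius
    if rho > 0:
        W_x *= spectral_radius / rho                  # enforce the target radius
    return W_in, W_x
```

Rescaling by the ratio of the target to the current spectral radius is the standard way to satisfy the echo state property mentioned in the text.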
In this embodiment the regularization coefficients are also initialized: λ1 = 5×10⁻⁷ and λ2 = 1×10⁻⁵. The front 2/3 of the logical block addresses where the input historical thermal data are located is used to construct the input variable U, and the rear 1/3 to construct the actual result Y. In other feasible embodiments other split lengths may be chosen; the invention does not limit them specifically. The general idea is to predict the addresses of the next segment from those of the previous segment and then compare the prediction with the actual addresses to adjust the network, which is the native training scheme of the echo state network and is not elaborated further here.
U402: update the state information X of the storage layer based on the input variable U, the state information X being composed of the state node information X(t):

X(t) = logsig(U(t)·W_in + X(t−1)·W_x)

wherein U(t) represents the t-th datum of the input variable U; X(t) and X(t−1) represent the t-th and (t−1)-th state node information, the number of nodes being determined by the data length of U; and W_in, W_x represent the input layer-storage layer weight matrix and the storage-layer internal connection weight matrix respectively. logsig(·) is the activation function; it allows the neural network to approximate arbitrary nonlinear functions, so the network can be applied to nonlinear models. When this activation function is used, the input quantity is first multiplied by the input-layer proportionality coefficient to map it into the activation function's working range. Since the data are fed in sequentially, t can be understood as a time index.
U403: based on L2+ adaptive L1/2Obtaining an output weight value under the minimum value of the loss function by the loss function under the regularization constraint;
Figure GDA0002554424440000101
in the formula, E represents a loss function, λ1、λ2Are all regularization coefficients.
In order to make the computation feasible, the loss function E is first simplified, and the output weight is then computed with a coordinate descent algorithm.

The simplified loss function is expressed as:

E = ||Y′ − X′·W′_out||² + λ1·Σ|w|^(1/2)  (the sum running over the elements w of W′_out)

with the augmented quantities:

X′ = [X; √λ2·I],  Y′ = [Y; 0]

wherein I is an identity matrix.
The matrix W′_out is solved element by element. The value (W′_out)_mk of the kth element of its mth row is given by a half-thresholding expression [equation image], built from two auxiliary quantities [equation images]: a partial residual that removes the contribution of all rows other than m, and the squared norm of the mth row of X′. In these expressions, Y′_k(t) denotes the t-th element of the kth line of Y′ and X′_j(t) the t-th element of the jth line of X′; (W′_out)_jk denotes the kth element of the jth row of W′_out and, when j > m, is taken as zero.
Finally, the output weight W_out is calculated from the matrix W′_out through the relationship between W′_out and W_out. In this embodiment, the output weight obtained in step U403 is additionally optimized adaptively; the optimization is step U404:
U404: convert the loss function, compute the weight W″_out with the coordinate descent algorithm, and then use W″_out to calculate the optimized output weight.

The converted loss function [equation image] and the relationship between the weight W″_out and the output weight W_out [equation images] are given in the original; in them, n is the number of storage-layer nodes and K is the number of input-layer nodes.
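The full coordinate-descent solution with half-thresholding is specified by the equation images above and is not reproduced here. As a simplified, hedged sketch: when the L1/2 term is dropped (λ1 = 0), the loss E reduces to ridge regression, whose output weight has the familiar closed form used below; the L1/2 term in the actual method additionally sparsifies W_out.

```python
import numpy as np

def ridge_readout(X, Y, lam2=1e-5):
    """Closed-form output weight for the lambda1 = 0 limit of the loss E:
    W_out = (X^T X + lam2 * I)^-1 X^T Y. This covers only the L2 part of the
    L2 + adaptive L1/2 scheme described in the text."""
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam2 * np.eye(n), X.T @ Y)
```

This limit is useful as a correctness baseline when implementing the full coordinate-descent readout.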
To verify the reliability of the method, which creatively replaces hot data identification with hot data prediction and improves the accuracy of hot data discrimination, we used four actual workloads for objective evaluation. Financial1 is a write-intensive trace file; MSR is a common workload of large enterprise servers; Distilled represents typical usage patterns of a personal computer; and MillSSD was collected from an industrial automated optical inspection instrument with a Runcore RCS-V-T25 SSD (512 GB, SATA2), Intel X27400 and 2 GB DDR3. MillSSD is also a write-intensive trace file because it performs substantial image backups. The performance comparison of this embodiment is shown in Fig. 5. The test results show that the HOESN hot-ratio curve almost overlaps WDAC in most cases; this main trend is clearly visible on all four workloads, especially on the more write-intensive MSR and MillSSD. On all four workloads our HOESN has the lowest FIR, followed by DL-MBF_s. Although MBF shows a relatively high FIR, it remains a good hot data identification (HDI) scheme for SSDs; WDAC, proposed in that line of work, has become the classical benchmark of subsequent studies. Notably, among the four workloads, the improvement of HOESN is most impressive on MillSSD (from 4.08% to 2.23%). These preliminary tests also confirm our original idea that learning the access behaviour of hot data on NAND flash can be treated as a time-series prediction problem, for which HOESN was proposed. The results show that our prediction method captures the access behaviour of disk workloads well, which is the basic premise of reliable service for GC and WL.
It should be emphasized that the embodiments described herein are illustrative rather than restrictive. The invention is therefore not limited to the embodiments of the detailed description; other embodiments derived by those skilled in the art from the technical solutions of the present invention, as well as modifications, alterations and substitutions that do not depart from its spirit and scope, fall equally within its protection.

Claims (10)

1. A thermal data prediction method based on a jointly optimized echo state network, characterized by comprising the following steps:
S1: initializing the parameters required by the quantum particle swarm algorithm and the position information of each particle;
wherein the position information of a particle comprises its initial position and position range, and the position of each particle is represented by the storage-layer parameters of an echo state network;
S2: performing iterative optimization with the quantum particle swarm algorithm to determine the optimal storage-layer parameters;
wherein the particle positions are updated with the quantum particle swarm algorithm within the position range of each particle; in each update, the L2 + adaptive L1/2 regularized echo state network is used to calculate the output weight and the global optimal adaptive value, and when the iteration finishes, the particle position corresponding to the global optimal adaptive value is taken as the optimal storage-layer parameters;
S3: based on the optimal storage-layer parameters, calculating the final output weight with the L2 + adaptive L1/2 regularized echo state network;
S4: predicting the thermal data with the final output weight and the logical block addresses where the input historical thermal data are located, the prediction formula being:
y = x * W_out^(1)
wherein y represents the predicted logical block address, the data at the predicted logical block address being thermal data; x is the logical block address where the input historical thermal data are located; W_out^(1) represents the final output weight; and the logical block addresses where the historical thermal data are located are used in the echo state network training processes of steps S2 and S3.
2. The method of claim 1, wherein: the iterative optimization for determining the optimal storage layer parameters in step S2 is performed as follows:
S21: sequentially taking the position of each particle as the storage-layer parameters of the echo state network, and calculating the process output weight corresponding to each particle with the L2 + adaptive L1/2 regularized echo state network;
wherein the current position of each particle is sequentially used as the storage-layer parameters of the echo state network to calculate its process output weight;
S22: calculating the adaptive value of each particle with its process output weight;
S23: selecting, based on the minimum-adaptive-value principle and according to the adaptive value of each particle, each particle's individual optimal adaptive value and individual optimal parameter as well as the global optimal adaptive value and global optimal parameter;
wherein the position of the particle whose adaptive value is selected as the global optimal adaptive value is the global optimal parameter;
S24: updating the position of each particle within its position range, recalculating the adaptive value of each particle based on its updated position, and updating, based on the minimum-adaptive-value principle, each particle's individual optimal adaptive value and individual optimal parameter and the global optimal adaptive value and global optimal parameter;
S25: judging whether the iteration number has reached the maximum iteration number; if not, returning to step S24 for the next iteration; otherwise, taking the current global optimal parameter as the optimal storage-layer parameters.
3. The method of claim 2, wherein: the position of any particle j is updated according to the following formulas:

P_j(t+1) = p_j ± β · |mbest − P_j(t)| · ln(1/u_j)

wherein

p_j = φ_j · sbest_j + (1 − φ_j) · gbest,  mbest = (1/N) · Σ_{i=1}^{N} sbest_i,  β = ω_min + (ω_max − ω_min) · (iter_max − iter) / iter_max

in the formulas, P_j(t+1) and P_j(t) represent the positions of particle j after and before the update respectively; φ_j and u_j are random numbers in (0,1); sbest_j and sbest_i denote the individual optimal parameters of the jth and ith particles; mbest is the mean of the current individual optimal parameters of all particles; iter and iter_max are the current and maximum iteration numbers respectively; ω_max and ω_min are the maximum and minimum inertia factors respectively; N is the total number of particles; gbest represents the global optimal parameter; and β represents the step-size parameter of the particle movement.
4. The method of claim 2, wherein: the adaptive value of any particle j is calculated as:

Fitness = ||Y − X·W_out^(2)||² + λ1·Σ|w|^(1/2) + λ2·||W_out^(2)||²

where Fitness denotes the adaptive value of the current particle j; λ1 and λ2 are the regularization coefficients, the sum Σ|w|^(1/2) running over the elements w of W_out^(2); W_out^(2) is the process output weight corresponding to the current particle j; Y represents the rear segment of the logical block addresses where the historical thermal data used for network training are located; X represents the storage-layer state information updated from the front segment of those addresses; and X·W_out^(2) is the prediction corresponding to the rear segment of the logical block addresses of the historical thermal data.
5. The method of claim 1, wherein: the process of calculating the final output weight or a process output weight with the L2 + adaptive L1/2 regularized echo state network is as follows:
U401: acquiring the input layer-storage layer weight matrix and the storage-layer internal connection weight matrix of the echo state network, and taking the front segment of the logical block addresses where the historical thermal data are located as the input variable U and the rear segment as the actual result Y;
wherein the input layer-storage layer weight matrix and the storage-layer internal connection weight matrix are determined by the storage-layer parameters of the echo state network;
U402: updating the state information X of the storage layer based on the input variable U, the state information X being composed of the state node information X(t):

X(t) = logsig(U(t)·W_in + X(t−1)·W_x)

wherein U(t) represents the t-th datum of the input variable U; X(t) and X(t−1) represent the t-th and (t−1)-th state node information, the maximum value T of t being determined by the data length of the input variable U; W_in and W_x respectively represent the input layer-storage layer weight matrix and the storage-layer internal connection weight matrix of the echo state network; and logsig(·) represents the activation function;
U403: obtaining the output weight at the minimum of the loss function under the L2 + adaptive L1/2 regularization constraint:

E = ||Y − X·W_out||² + λ1·Σ|w|^(1/2) + λ2·||W_out||²

in the formula, E represents the loss function; λ1 and λ2 are the regularization coefficients, the sum Σ|w|^(1/2) running over the elements w of W_out; W_out represents the output weight; and X·W_out represents the prediction, based on the output weight W_out, of the rear segment of the logical block addresses where the historical thermal data are located;
wherein, when the process output weight of step S2 is calculated, the output weight W_out equals the process output weight; when the final output weight of step S3 is calculated, the output weight W_out equals the final output weight.
6. The method of claim 5, wherein the process of step U403 is: simplifying the loss function and calculating the output weight with a coordinate descent algorithm;
wherein the simplified loss function is expressed as:

E = ||Y′ − X′·W′_out||² + λ1·Σ|w|^(1/2)  (the sum running over the elements w of W′_out)

with the augmented quantities X′ = [X; √λ2·I] and Y′ = [Y; 0], wherein I is an identity matrix;
the matrix W′_out is solved element by element: the value (W′_out)_mk of the kth element of its mth row is given by a half-thresholding expression [equation image], built from a partial residual over the other rows and the squared norm of the mth row of X′ [equation images], wherein Y′_k(t) denotes the t-th element of the kth line of Y′; X′_j(t) and X′_m(t) denote the t-th elements of the jth and mth lines of X′; (W′_out)_jk denotes the kth element of the jth row of W′_out and, when j > m, is taken as zero; L is the number of output-layer nodes and n is the number of storage-layer nodes.
7. The method of claim 6, further comprising adaptively optimizing the output weight obtained in step U403, the optimization process being: converting the loss function, calculating the weight W″_out with the coordinate descent algorithm, and then calculating the optimized output weight;
wherein the converted loss function [equation image] and the relationship between the weight W″_out and the output weight W_out [equation images] are as given in the original, in which K is the number of input-layer nodes.
8. The method of claim 1, wherein: the storage-layer parameters of the echo state network comprise four key parameters: the internal connection spectral radius, the storage-layer scale, the input-layer proportionality coefficient and the storage-layer sparsity.
9. The method of claim 1, wherein: the parameters required by the quantum particle swarm algorithm initialized in step S1 comprise the total number of particles N, the maximum iteration number iter_max, the maximum inertia factor ω_max and the minimum inertia factor ω_min.
10. The method of claim 1, wherein: when a particle position is updated, if the particle's moving distance exceeds the position range corresponding to the particle, the particle position parameter is set to the boundary value of the position range that was exceeded.
CN201910566123.4A 2019-06-27 2019-06-27 Thermal data prediction method based on joint optimization echo state network Active CN110554838B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910566123.4A CN110554838B (en) 2019-06-27 2019-06-27 Thermal data prediction method based on joint optimization echo state network
PCT/CN2020/097950 WO2020259543A1 (en) 2019-06-27 2020-06-24 Hot data prediction method based on joint optimization of echo state network


Publications (2)

Publication Number Publication Date
CN110554838A CN110554838A (en) 2019-12-10
CN110554838B true CN110554838B (en) 2020-08-14

Family

ID=68735438

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910566123.4A Active CN110554838B (en) 2019-06-27 2019-06-27 Thermal data prediction method based on joint optimization echo state network

Country Status (2)

Country Link
CN (1) CN110554838B (en)
WO (1) WO2020259543A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110554838B (en) * 2019-06-27 2020-08-14 Central South University Thermal data prediction method based on joint optimization echo state network
CN112448697A (en) * 2020-10-30 2021-03-05 Hefei University of Technology Active filter optimization method and system based on quantum particle swarm optimization
CN112731019B (en) * 2020-12-21 2022-10-14 Hefei University of Technology Fault diagnosis method for ANPC three-level inverter
CN116192640A (en) * 2021-11-25 2023-05-30 China Mobile (Suzhou) Software Technology Co., Ltd. Network slice resource allocation method and device, SDN controller and storage medium
CN115841067A (en) * 2022-10-12 2023-03-24 Dalian University of Technology Quantum echo state network model construction method for aircraft engine fault early warning

Citations (2)

Publication number Priority date Publication date Assignee Title
CN109726858A (en) * 2018-12-21 2019-05-07 ENN Digital Energy Technology Co., Ltd. Heat load prediction method and device based on dynamic time warping
CN109901800A (en) * 2019-03-14 2019-06-18 Chongqing University Hybrid memory system and operating method thereof

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
CN103959291B (en) * 2011-04-20 2018-05-11 诺沃—诺迪斯克有限公司 Glucose predictions device based on the regularization network with adaptively selected core and regularization parameter
CN103020434A (en) * 2012-11-30 2013-04-03 南京航空航天大学 Particle swarm optimization-based least square support vector machine combined predicting method
US9575671B1 (en) * 2015-08-11 2017-02-21 International Business Machines Corporation Read distribution in a three-dimensional stacked memory based on thermal profiles
US10014026B1 (en) * 2017-06-20 2018-07-03 Seagate Technology Llc Head delay calibration and tracking in MSMR systems
CN109656485A (en) * 2018-12-24 2019-04-19 合肥兆芯电子有限公司 The method for distinguishing dsc data and cold data
CN110554838B (en) * 2019-06-27 2020-08-14 中南大学 Thermal data prediction method based on joint optimization echo state network


Non-Patent Citations (1)

Title
Meiling Xu; Shuhui Zhang; Min Han. "Multivariate time series modeling and prediction based on reservoir independent components." 2015 Sixth International Conference on Intelligent Control and Information Processing (ICICIP), 2015, pp. 325-330. *

Also Published As

Publication number Publication date
WO2020259543A1 (en) 2020-12-30
CN110554838A (en) 2019-12-10


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant