CN110554838A - Thermal data prediction method based on joint optimization echo state network


Publication number
CN110554838A
Authority
CN
China
Prior art keywords: particle, storage layer, value, adaptive, echo state
Prior art date
Legal status
Granted
Application number
CN201910566123.4A
Other languages
Chinese (zh)
Other versions
CN110554838B (en)
Inventor
罗旗舞 (Luo Qiwu)
王玥童 (Wang Yuetong)
阳春华 (Yang Chunhua)
桂卫华 (Gui Weihua)
周灿 (Zhou Can)
Current Assignee
Central South University
Original Assignee
Central South University
Priority date
Filing date
Publication date
Application filed by Central South University
Priority to CN201910566123.4A
Publication of CN110554838A
Priority to PCT/CN2020/097950 (WO2020259543A1)
Application granted
Publication of CN110554838B
Legal status: Active
Anticipated expiration


Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 — Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 — Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 — Interfaces specially adapted for storage systems
    • G06F 3/0602 — Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F 3/0614 — Improving the reliability of storage systems
    • G06F 3/0628 — Interfaces specially adapted for storage systems making use of a particular technique
    • G06F 3/0638 — Organizing or formatting or addressing of data
    • G06F 3/064 — Management of blocks
    • G06F 3/0668 — Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F 3/0671 — In-line storage system
    • G06F 3/0673 — Single storage device
    • G06F 3/0679 — Non-volatile semiconductor memory device, e.g. flash memory, one time programmable memory [OTP]


Abstract

The invention discloses a thermal data prediction method based on a jointly optimized echo state network. Quantum particle swarm optimization is innovatively used to search the storage layer parameters of the echo state network: during each particle-position update, an echo state network with an L2 + adaptive L1/2 regularization constraint computes the output weight and the global optimal adaptive value, and when the iteration ends the particle position corresponding to the global optimal adaptive value is taken as the optimal storage layer parameters. Based on these optimal parameters, the echo state network with the L2 + adaptive L1/2 regularization constraint then computes the final output weight. Thermal (hot) data is predicted from this final output weight and the logical block addresses of the input historical hot data; the data at the predicted logical block address is predicted to be hot data.

Description

Thermal data prediction method based on joint optimization echo state network
Technical Field
The invention belongs to the technical field of chaotic time sequence prediction, and particularly relates to a thermal data prediction method based on a joint optimization echo state network.
Background
As a non-volatile storage technology, NAND flash memory is widely used in communication systems and consumer electronics, and offers higher access speed and power efficiency than hard disk drives. In NAND-flash-based consumer electronics, a large number of applications rely on NAND flash for data exchange, file storage and video storage. NAND flash memory is mainly used for storing large-capacity data; the NAND structure provides extremely high cell density and achieves high storage density, high write speed and high erase speed. NAND flash memories are therefore widely used for large-capacity data storage, such as solid-state disks. Demand for NAND flash memory will continue to increase, mainly driven by cloud computing, the Internet of Things, data centers and similar fields.
However, NAND flash memory faces at least two challenges that limit its large-scale application: out-of-place updating and limited endurance. NAND flash memory cannot overwrite in place, i.e. a new write operation cannot be performed on a flash page before that page is erased. Improper updates generate many invalid and dead pages, which reduces efficiency and performance. Furthermore, NAND flash memory has a limited lifetime because a flash block can only withstand a limited number of erasures; if the number of erasures of a block exceeds its maximum, the block becomes unusable. Garbage Collection (GC) and Wear Leveling (WL), design ideas that allocate frequently written data (i.e. hot data) to blocks with few erasures and least recently used data (i.e. cold data) to blocks with many erasures, play an important role in addressing these two challenges, while the efficiency and performance of GC and WL depend largely on Hot Data Identification (HDI). The essence of HDI is to understand the access behavior of hot data well enough to intelligently assign different data to appropriate blocks, but conventional HDI has two problems. The first is large memory overhead: at present, most hot data identification mechanisms identify hot pages in the NAND flash memory, and their core is a page counter, which records the number of read and write operations on the logical pages corresponding to NAND flash pages within a certain time; if the read/write count is greater than a set threshold, the requested page is judged to be a hot page, otherwise it is a cold page.
The other serious problem is that identification accuracy is not high. Bloom-filter-based hot data identification mechanisms are widely applied to cold/hot data identification in SSDs, but the Bloom filter has an inherent false-positive defect: data not belonging to the set may be erroneously judged to be in the set. In addition, existing hot data identification schemes consider only a single feature, such as request size or access pattern, and therefore cannot fully and comprehensively account for the locality characteristics of the workload, so the accuracy of hot data identification is not high.
Disclosure of Invention
The invention aims to provide a hot data prediction method based on a jointly optimized echo state network, which innovatively replaces traditional hot data identification with hot data prediction and constructs a jointly optimized echo state network, so that the predicted hot data has better real-time performance and reliability.
The invention provides a hot data prediction method based on a jointly optimized echo state network, comprising the following steps:
s1: initializing parameters required by a quantum particle swarm algorithm and position information of each particle;
the position information of the particles comprises initial positions and position ranges of the particles, and the position of each particle is represented by storage layer parameters in an echo state network;
s2: iterative optimization is carried out by utilizing a quantum particle group algorithm to determine the optimal storage layer parameters;
Updating the positions of the particles by adopting a quantum particle swarm algorithm based on the position range of each particle; in each update, the echo state network with the L2 + adaptive L1/2 regularization constraint computes an output weight to obtain a global optimal adaptive value, and when the iteration is finished the particle position corresponding to the global optimal adaptive value is taken as the optimal storage layer parameters;
S3: optimal storage layer parameter adoption L in network based on echo state2+ adaptive L1/2the regularized and constrained echo state network calculates the final output weight;
S4: Predicting the hot data by using the final output weight and the logical block address of the input historical hot data, wherein the prediction formula is:
y = x * W_out
Wherein y represents the predicted logical block address, and the data at the predicted logical block address is hot data; x is the logical block address of the input historical hot data; W_out denotes the output weight. The logical block addresses of the historical hot data are used in the echo state network training of steps S2 and S3.
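As a sketch, the prediction step in S4 is a single matrix product once W_out has been trained; the helper name, shapes and values below are illustrative, not taken from the patent:

```python
import numpy as np

def predict_hot_lba(x, W_out):
    """Predict the next logical block address (LBA) expected to hold hot data.

    x     : 1-D feature vector built from the historical hot-data LBAs
            (after whatever state expansion was used during training).
    W_out : trained output weight matrix of the echo state network.
    Returns y = x * W_out, the predicted LBA (patent step S4).
    """
    return np.asarray(x) @ np.asarray(W_out)

# toy usage: 3 features -> 1 output
W_out = np.array([[0.5], [0.25], [0.25]])
x = np.array([4.0, 8.0, 8.0])
y = predict_hot_lba(x, W_out)   # 0.5*4 + 0.25*8 + 0.25*8 = 6.0
```

The data stored at the predicted address y would then be treated as hot data by the flash translation layer.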
The method adopts the solid state disk architecture but innovatively replaces the hot data identification module in the flash translation layer with hot data prediction, predicting hot data with a jointly optimized echo state network. The joint optimization comprises two parts: first, a quantum particle swarm algorithm iteratively determines the optimal storage layer parameters of the echo state network; second, the ESN with the L2 + adaptive L1/2 regularization constraint yields a highly sparse output weight. By combining quantum particle swarm iterative optimization with the L2 + adaptive L1/2 regularization constraint to obtain the optimal storage layer parameters, the jointly optimized echo state network used for prediction achieves better real-time performance and reliability. The invention trains the echo state network on the logical block addresses of historical hot data to obtain the final output weight, and then uses this final output weight to predict the logical block address where hot data will reside.
Further preferably, the iterative optimization for determining the optimal storage layer parameters in step S2 is performed as follows:
S21: the position of each particle is sequentially used as a storage layer parameter in an echo state network, and L is respectively adopted2+ adaptive L1/2The echo state network of regularization constraint calculates the corresponding output weight of each particle;
The current position of each particle is sequentially used as a storage layer parameter in the echo state network and an output weight is calculated;
S22: calculating an adaptive value of each particle by using the output weight corresponding to each particle;
S23: selecting an individual optimal adaptive value, an individual optimal parameter, a global optimal adaptive value and a global optimal parameter of each particle according to the adaptive value of each particle based on a minimum adaptive value principle;
Wherein, the position of the particle selected as the global optimum adaptive value is the global optimum parameter;
S24: updating the position of each particle in the position range of the particle, recalculating the adaptive value of each particle based on the updated position of each particle, and updating the individual optimal adaptive value, the individual optimal parameter, the global optimal adaptive value and the global optimal parameter of each particle based on the minimum adaptive value principle;
S25: Judging whether the iteration number has reached the maximum iteration number; if not, returning to step S24 for the next iteration; otherwise, taking the current global optimal parameters as the optimal storage layer parameters.
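The bookkeeping of steps S22-S24 under the minimum-adaptive-value principle can be sketched as follows; the fitness values here are arbitrary stand-ins and all names are illustrative:

```python
import numpy as np

def update_bests(positions, fitness, sbest, fsbest, gbest, fgbest):
    """Update per-particle and global bests under the minimum-fitness rule."""
    for j in range(len(positions)):
        if fitness[j] < fsbest[j]:          # better individual adaptive value
            fsbest[j] = fitness[j]
            sbest[j] = positions[j].copy()
        if fitness[j] < fgbest:             # better global adaptive value
            fgbest = fitness[j]
            gbest = positions[j].copy()
    return sbest, fsbest, gbest, fgbest

# toy run: 3 particles, 4-dimensional positions (the 4 storage layer parameters)
rng = np.random.default_rng(0)
pos = rng.uniform(0.0, 1.0, size=(3, 4))
fit = np.array([0.9, 0.2, 0.5])
sbest, fsbest, gbest, fgbest = update_bests(
    pos, fit, sbest=pos.copy(), fsbest=np.full(3, np.inf),
    gbest=pos[0].copy(), fgbest=np.inf)
```

After this pass the global best corresponds to the particle with the smallest adaptive value (here the second one).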
Further preferably, the position of any particle j is updated according to the following formula:

P_j(t+1) = p_j ± β · |mbest − P_j(t)| · ln(1/u_j)

wherein

p_j = φ · sbest_j + (1 − φ) · gbest,
mbest = (1/N) · Σ_{i=1}^{N} sbest_i,
β = ω_min + (ω_max − ω_min) · (iter_max − iter)/iter_max

In the formulas, P_j(t+1) and P_j(t) represent the positions of particle j after and before the update respectively (the sign is taken as + or − with equal probability); φ and u_j are both random numbers in (0,1); sbest_j and sbest_i are the individual optimal parameters of the j-th and i-th particles, gbest is the global optimal parameter, and mbest is the mean of the current individual optimal parameters of all particles; iter and iter_max are the current and maximum iteration numbers respectively; ω_max and ω_min are the inertia factors; and N is the total number of particles.
Further preferably, the adaptive value of any particle j is calculated as:

Fitness = ||Y − X·W_out||² + λ1·||W_out||² + λ2·Σ|w|^(1/2)

Where Fitness denotes the adaptive value of the current particle j, λ1 and λ2 are both regularization coefficients, W_out is the output weight corresponding to the current particle j, and the last sum runs over all elements w of W_out; Y represents the rear section of the logical block addresses of the historical hot data used for network training, X represents the storage layer state information updated from the front section of those addresses, and X·W_out is the prediction corresponding to the rear section of the logical block addresses of the historical hot data.
Further preferably, the process of computing the output weight by the echo state network with the L2 + adaptive L1/2 regularization constraint is as follows:
U401: Acquire the input layer-to-storage layer weight matrix and the storage layer internal connection weight matrix of the echo state network; the front section of the logical block addresses of the historical hot data is used as the input variable U, and the rear section as the actual result Y;
Wherein the input layer-to-storage layer weight matrix and the storage layer internal connection weight matrix are determined by the storage layer parameters of the echo state network;
U402: Update the state information X of the storage layer based on the input variable U, where X is composed of the state node information X(t):

X(t) = logsig(U(t)·W_in + X(t−1)·W_x)

Wherein U(t) represents the t-th datum of the input variable U; X(t) and X(t−1) represent the t-th and (t−1)-th state node information respectively; the maximum value T of t is determined by the data length of the input variable U; W_in and W_x denote the input layer-to-storage layer weight matrix and the storage layer internal connection weight matrix respectively; and logsig(·) denotes the activation function;
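The state update above can be sketched directly from the formula; logsig is the logistic sigmoid, and the weight shapes and random values below are illustrative assumptions:

```python
import numpy as np

def logsig(z):
    """Logistic sigmoid activation, the logsig(.) of the formula."""
    return 1.0 / (1.0 + np.exp(-z))

def reservoir_states(U, W_in, W_x):
    """Run X(t) = logsig(U(t)·W_in + X(t-1)·W_x) over an input sequence U.

    U    : (T, d) input sequence (front section of historical hot-data LBAs)
    W_in : (d, n) input-to-storage-layer weights
    W_x  : (n, n) storage layer internal connection weights
    Returns the (T, n) matrix of storage layer states X.
    """
    T, n = U.shape[0], W_x.shape[0]
    X = np.zeros((T, n))
    x_prev = np.zeros(n)            # X(0) initialized to zero
    for t in range(T):
        x_prev = logsig(U[t] @ W_in + x_prev @ W_x)
        X[t] = x_prev
    return X

rng = np.random.default_rng(1)
U = rng.uniform(-1, 1, size=(10, 1))              # toy 1-D input, 10 steps
W_in = rng.uniform(-0.5, 0.5, size=(1, 8))
W_x = rng.uniform(-0.5, 0.5, size=(8, 8)) * 0.2   # scaled-down internal weights
X = reservoir_states(U, W_in, W_x)
```

In practice W_in and W_x would be generated from the four storage layer parameters (spectral radius, scale, input scaling, sparsity) being optimized by the particle swarm.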
U402: based on L2+ adaptive L1/2loss function under regularization constraintobtaining an output weight value under the minimum value of the loss function;
in the formula, E represents a loss function, λ1、λ2are all regularization coefficients.
Further preferably, the acquiring process of step U402 is: simplify the loss function, and compute the output weight by a coordinate descent algorithm;
Wherein the simplified loss function is expressed as

E = ||Y' − X'·W'_out||² + λ2·Σ|w'|^(1/2)

with the augmented quantities

X' = [X; √λ1·I],  Y' = [Y; 0]

wherein I is an identity matrix;
The matrix W'_out is solved element by element: for the element W'_out(m,k) in the m-th row and k-th column, a partial residual Y'_k(t) − Σ_{j≠m} X'_j(t)·W'_out(j,k) is formed, a one-dimensional least-squares step is taken over it, and the L1/2 thresholding operator is applied; here Y'_k(t) denotes the t-th element of the k-th column of Y', X'_j(t) denotes the t-th element of the j-th column of X', and W'_out(j,k) denotes the element in the j-th row and k-th column of W'_out, taken as zero when j > m and not yet updated in the current sweep; L is the number of output layer nodes, and n is the number of storage layer nodes.
Finally the output weight W_out is recovered from the relationship between W'_out and W_out.
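The element-wise solve relies on an L1/2 thresholding step. As a hedged sketch (the half-thresholding operator and its constants follow the common L1/2 thresholding literature and are an assumption, not the patent's exact formula), cyclic coordinate descent for ||Y − X·w||² + λ·Σ|w_i|^(1/2) can look like:

```python
import numpy as np

def half_threshold(z, lam):
    """Half-thresholding operator for the L1/2 penalty (assumed form):
    returns 0 below a threshold, otherwise a mildly shrunken z."""
    thr = (54 ** (1.0 / 3.0) / 4.0) * lam ** (2.0 / 3.0)
    if abs(z) <= thr:
        return 0.0
    phi = np.arccos((lam / 8.0) * (abs(z) / 3.0) ** (-1.5))
    return (2.0 / 3.0) * z * (1.0 + np.cos(2.0 * np.pi / 3.0 - 2.0 * phi / 3.0))

def coordinate_descent_l12(X, Y, lam, sweeps=50):
    """Minimise ||Y - X w||^2 + lam * sum_i |w_i|^(1/2) coordinate by coordinate."""
    T, n = X.shape
    w = np.zeros(n)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(sweeps):
        for m in range(n):
            r = Y - X @ w + X[:, m] * w[m]      # residual excluding coordinate m
            z = (X[:, m] @ r) / col_sq[m]       # unpenalised 1-D least-squares step
            w[m] = half_threshold(z, lam / col_sq[m])
    return w

# toy problem: only the first feature matters, so a sparse solution is expected
rng = np.random.default_rng(2)
X = rng.normal(size=(60, 4))
Y = 2.0 * X[:, 0] + 0.05 * rng.normal(size=60)
w = coordinate_descent_l12(X, Y, lam=1.0)
```

The qualitative behaviour matches the text: small coordinates are driven exactly to zero (high sparsity) while large ones are only slightly shrunk.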
Further preferably, the method further includes performing adaptive optimization on the output weight obtained in step U402, where the optimization process is as follows:
Convert the loss function, compute the weight W''_out by the coordinate descent algorithm, and then calculate the optimized output weight from W''_out;
The converted loss function introduces adaptive coefficients, derived from the output weight obtained in step U402, into the L1/2 term, so that each element of the weight matrix is penalized individually:

E = ||Y − X·W''_out||² + λ1·||W''_out||² + λ2·Σ v_jk·|W''_out(j,k)|^(1/2)

where the adaptive coefficients v_jk are computed from the output weight obtained in step U402; the relationship between W''_out and the output weight W_out then yields the final W_out; K is the number of output layer nodes.
Further preferably, the storage layer parameters in the echo state network comprise four key parameters: the internal connection spectral radius, the storage layer scale, the input layer scaling coefficient, and the storage layer sparsity.
Further preferably, the parameters required for initializing the quantum particle swarm algorithm in step S1 include the particle swarm size N, the maximum iteration number iter_max, and the inertia factors ω_max and ω_min.
Further preferably, when the particle position is updated, if the particle movement distance exceeds the position range corresponding to the particle, the particle position parameter is set to the corresponding boundary value of the position range.
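This boundary rule is a simple clamp to the position range; a minimal sketch (the ranges for the four storage layer parameters are illustrative assumptions):

```python
import numpy as np

# position ranges for [spectral radius, storage layer scale,
# input scaling coefficient, storage layer sparsity] -- illustrative values
lower = np.array([0.1, 50.0, 0.1, 0.01])
upper = np.array([0.99, 500.0, 1.0, 0.2])

def clamp_position(p):
    """If a particle moves outside its range, set it to the violated boundary."""
    return np.clip(p, lower, upper)

p = np.array([1.3, 25.0, 0.5, 0.05])    # first two components out of range
clamped = clamp_position(p)             # first two snap to their boundaries
```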
Advantageous effects
1. The invention innovatively replaces traditional hot data identification with hot data prediction. The disclosed hot data prediction technique can predict the nature of the next data one or even several beats in advance from historical access behavior, and actively allocates and stores the next data to the corresponding block (hot/cold data block) of the solid state disk (SSD). Meanwhile, the neural network method retains more feature information from the input and classifies the hot data more comprehensively.
2. The invention performs joint optimization on the echo state network. L2 regularization achieves good generalization ability through a trade-off between model bias and prediction variance and yields continuously shrunken weights, but it does not produce sparse solutions; adaptive L1/2 regularization can produce very sparse solutions, but when the predictor variables are highly correlated, L1/2 alone cannot regulate well. The invention therefore trains the least-squares solution with the combined L2 + adaptive L1/2 regularization, obtaining the advantages of both regularizers and further improving the prediction accuracy of hot data. In addition, the storage layer parameters of the echo state network are optimized with the QPSO algorithm, which solves the problem that the storage layer parameters cannot be determined when building the model. Compared with the traditional PSO algorithm, QPSO removes the velocity information of the particles based on wave-particle duality and retains only the position information, which effectively reduces the computational complexity while obtaining storage layer parameters adapted to the model, further improving the prediction accuracy. Furthermore, the invention combines the L2 + adaptive L1/2 regularization with the QPSO algorithm to obtain the optimal storage layer parameters, improving the prediction accuracy.
drawings
FIG. 1 is a typical architecture of a NAND flash memory system;
FIG. 2 is a flow chart of a method for hot data prediction based on joint optimization echo state network according to an embodiment of the present invention;
FIG. 3 is a flow chart of the iterative optimization algorithm of the quantum-behaved particle swarm optimization of the present invention;
FIG. 4 is a flow chart of the algorithm for computing the output weight by the echo state network with the L2 + adaptive L1/2 regularization constraint in the present invention.
FIG. 5 is a graph comparing the performance of four actual workloads according to one embodiment of the present invention.
Detailed Description
The present invention will be further described with reference to the following examples.
The invention provides a hot data prediction method based on a jointly optimized echo state network, mainly applied to NAND flash memory systems. FIG. 1 shows a typical architecture of a NAND flash memory system, comprising module B101 (user operations), module B102 (file system) and module B103 (solid state disk). The actual operations of the user affect the solid state disk through the file system. The solid state disk further comprises a flash translation layer, a flash controller and a NAND flash array, and the flash translation layer comprises an address allocation unit, a garbage collection unit, a wear leveling unit and a hot data prediction unit. The invention innovatively replaces the traditional hot data identification unit with the hot data prediction unit. Traditional hot data identification passively analyzes user access behavior and allocates the corresponding data to the corresponding blocks (hot/cold data blocks) of the solid state disk (SSD) through the flash translation layer (FTL); it suffers from higher rates of missed detection or false alarms for hot data when responding to requests with complex access behavior. The hot data prediction technique disclosed by the invention can predict the nature of the next data one or even several beats in advance from historical access behavior, actively allocates and stores the data to the corresponding block (hot/cold data block) of the solid state disk (SSD), and is compatible with secondary verification by a traditional hot data identification scheme. Accordingly, the hot data prediction method proposed by the invention is essentially "predictive hot data identification". The finally obtained predicted logical block address information is used for garbage collection and wear leveling.
In this process, wear leveling and garbage collection have a large influence on the solid state disk. Traditional hot data identification aims to distinguish accurately and efficiently which data is hot. The invention provides a hot data prediction method based on a jointly optimized echo state network, which replaces hot data identification with hot data prediction and achieves high-precision prediction; it specifically comprises the following steps:
S1: initializing parameters required by the quantum particle swarm algorithm and position information of each particle.
The position information of the particles comprises the initial position and position range of each particle. The position of each particle is represented by the storage layer parameters of an Echo State Network (ESN), which include the internal connection spectral radius, the storage layer scale, the input layer scaling coefficient and the storage layer sparsity; in this example the dimension of each particle is therefore initialized to 4, i.e. each particle is a 1 × 4 matrix representing the 4 ESN storage layer parameters. The ranges of the 4 parameters are determined and defined as the position range of all particles; at initialization each particle is randomly assigned a value within the position range. During the subsequent update process the particles are considered to move continuously toward the optimum within the specified range, and if a particle moves outside the specified range its position information is updated to the boundary value. Each particle position represents a specific set of values of the ESN storage layer parameters.
The parameters required by the quantum particle swarm algorithm include the particle swarm size N, the maximum iteration number iter_max, and the inertia factors ω_max and ω_min (used in the subsequent update of particle position information).
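The S1 initialization above can be sketched as N particles, each a 1 × 4 vector drawn uniformly inside the parameter ranges; the concrete ranges and swarm settings below are illustrative assumptions, not values from the patent:

```python
import numpy as np

rng = np.random.default_rng(42)

# [internal connection spectral radius, storage layer scale,
#  input layer scaling coefficient, storage layer sparsity] -- illustrative ranges
lower = np.array([0.1, 50.0, 0.1, 0.01])
upper = np.array([0.99, 500.0, 1.0, 0.2])

N = 20                              # particle swarm size
iter_max = 100                      # maximum iteration number
omega_max, omega_min = 1.0, 0.5     # inertia factors

# each row is one particle's position, i.e. one candidate ESN parameter set
positions = lower + (upper - lower) * rng.random((N, 4))
```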
S2: iterative optimization is carried out by utilizing a quantum particle group algorithm to determine the optimal storage layer parameters;
Updating the positions of the particles by adopting a quantum particle swarm algorithm based on the position range of each particle; in each update, the echo state network with the L2 + adaptive L1/2 regularization constraint computes the output weight to obtain the global optimal adaptive value, and when the iteration is finished the particle position corresponding to the global optimal adaptive value is taken as the optimal storage layer parameters. The specific process comprises the following steps:
S21: taking the position of each particle as a storage layer parameter in an echo state network, and respectively adopting L2+ adaptive L1/2The echo state network of regularization constraint calculates the corresponding output weight of each particle;
The current position of each particle is sequentially used as a storage layer parameter in the echo state network and an output weight is calculated;
S22: calculating an adaptive value of each particle by using the output weight corresponding to each particle;
S23: selecting an individual optimal adaptive value, an individual optimal parameter, a global optimal adaptive value and a global optimal parameter of each particle according to the adaptive value of each particle based on a minimum adaptive value principle;
wherein, the position of the particle selected as the global optimum adaptive value is the global optimum parameter;
S24: updating the position of each particle in the position range of the particle, recalculating the adaptive value of each particle based on the updated position of each particle, and updating the individual optimal adaptive value, the individual optimal parameter, the global optimal adaptive value and the global optimal parameter of each particle based on the minimum adaptive value principle;
S25: judging whether the iteration times reach the maximum iteration times, if not, returning to the step S24 for the next iteration calculation; otherwise, the current global optimum parameter is taken as the optimum storage layer parameter.
Based on the above logic, an embodiment of the present invention provides an example flowchart as shown in fig. 3, which includes the following steps:
U301: and (4) iteration initialization, setting the current iteration number iter to be 1, and setting the particle label j to be 1.
u302: setting the position of the jth particle as ESN storage layer parameter, and utilizing L2+ adaptive L1/2least square calculation in regularization constraint training to obtain output weight W with higher sparsityout. By L2+ adaptive L1/2RegularizationESN calculation output weight W of chemical constraintoutThe detailed steps are shown in fig. 4 and described in detail below.
u303: output weight W corresponding to jth particleoutcalculating an adaptive value corresponding to the jth particle according to the following calculation formula:
In the formula, λ1、λ2Are all regularization coefficients, Woutthe output weight value corresponding to the current particle j is obtained; y represents the rear section of the logic block address where the historical thermal data used for network training is located, X represents the state information of the storage layer updated on the basis of the front section of the logic block address where the historical thermal data used for network training is located, and X WoutAnd the prediction result corresponding to the rear section of the logical block address of the historical thermal data is shown.
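A sketch of the adaptive-value computation in U303 (shapes and values illustrative; the formula follows the reconstruction above):

```python
import numpy as np

def fitness(Y, X, W_out, lam1, lam2):
    """Adaptive value of a particle: squared prediction error plus the
    L2 and L1/2 regularisation terms on the output weight."""
    err = np.sum((Y - X @ W_out) ** 2)          # ||Y - X*W_out||^2
    l2 = lam1 * np.sum(W_out ** 2)              # lam1 * ||W_out||^2
    l12 = lam2 * np.sum(np.abs(W_out) ** 0.5)   # lam2 * sum |w|^(1/2)
    return err + l2 + l12

X = np.array([[1.0, 0.0], [0.0, 1.0]])
W_out = np.array([[4.0], [1.0]])
Y = X @ W_out                   # perfect fit, so only the penalty terms remain
f = fitness(Y, X, W_out, lam1=0.1, lam2=0.1)
# penalties: 0.1*(16+1) + 0.1*(2+1) = 1.7 + 0.3 = 2.0
```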
D301: and judging whether all the particles finish the calculation of the adaptive values, if not, adding 1 to j, returning to the step U302 to calculate the adaptive value of the next particle, and if all the particles finish the calculation of the adaptive values, performing the step U304.
U304: and selecting the individual optimal adaptive value, the individual optimal parameter, the global optimal adaptive value and the global optimal parameter of each particle according to the adaptive value of each particle based on the minimum adaptive value principle. After all the particles calculate the adaptive values, comparing and judging, recording the adaptive value of each particle as an individual optimal adaptive value fsbest, and recording the position of each particle as an individual optimal parameter sbest; and recording the minimum particle adaptive value in all the particle adaptive values as a global optimal adaptive value fgbest, and the corresponding position of the minimum particle adaptive value is a global optimal parameter gbest. These parameters obtained will be used for subsequent iteration optimization.
U305: the iteration starts and the particle index j is reset to 1.
U306: Calculate the mbest corresponding to the j-th particle according to:

mbest = (1/N) · Σ_{i=1}^{N} sbest_i

In the formula, sbest_i denotes the individual optimal parameters of the i-th particle, and mbest is the mean of the current individual optimal parameters of all particles, i.e. the average over all particles is taken in each dimension and used for updating the particle position information.
U307: Update the position information of the j-th particle according to:

P_j(t+1) = p_j ± β · |mbest − P_j(t)| · ln(1/u_j),  p_j = φ · sbest_j + (1 − φ) · gbest

Wherein P_j(t+1) and P_j(t) represent the positions of particle j after and before the update respectively, and φ and u_j are random numbers in (0,1); the calculation formula of β is:

β = ω_min + (ω_max − ω_min) · (iter_max − iter)/iter_max

As can be seen from this formula, in the early stage of the iteration the parameter β, which characterizes the step length of the particle movement, is larger, so the particles can move toward the optimal position more quickly; in the later stage β is smaller, meaning the step length shrinks near the optimal position and each movement approaches the optimum more precisely.
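Steps U306-U307 can be sketched together as one vectorized update; the local attractor p_j = φ·sbest_j + (1 − φ)·gbest and the decreasing β schedule are standard QPSO choices consistent with the text, and all names are illustrative:

```python
import numpy as np

def qpso_update(P, sbest, gbest, it, iter_max, omega_max=1.0, omega_min=0.5,
                rng=np.random.default_rng()):
    """One QPSO position update for the whole swarm (no velocity term).

    P     : (N, d) current particle positions
    sbest : (N, d) individual optimal parameters
    gbest : (d,)   global optimal parameters
    """
    N, d = P.shape
    beta = omega_min + (omega_max - omega_min) * (iter_max - it) / iter_max
    mbest = sbest.mean(axis=0)                       # mean of individual bests
    phi = rng.random((N, d))
    u = rng.random((N, d))
    attractor = phi * sbest + (1.0 - phi) * gbest    # local attractor per particle
    sign = np.where(rng.random((N, d)) < 0.5, 1.0, -1.0)   # random +/- branch
    return attractor + sign * beta * np.abs(mbest - P) * np.log(1.0 / u)

rng = np.random.default_rng(7)
P = rng.random((5, 4))
sbest = P.copy()
gbest = P[0]
P_new = qpso_update(P, sbest, gbest, it=1, iter_max=100, rng=rng)
```

In the full algorithm, P_new would then be clamped to the position ranges and its adaptive values recomputed.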
After the position information is updated, the newly obtained storage layer parameters are again used in the echo state network with the L2 + adaptive L1/2 regularization constraint to recalculate the adaptive value, so D302, D303, U308 and U309 are: update the individual optimum and the global optimum according to the newly calculated adaptive value. If the newly calculated adaptive value is smaller than the particle's individual optimal adaptive value, update the individual optimal adaptive value to the newly calculated value and the individual optimal parameter to the current particle's parameters; if the newly calculated adaptive value is also smaller than the global optimal adaptive value, update the global optimal adaptive value to this value and the global optimal parameter to this particle's parameters.
D304: judge whether all particles have been updated. If not, set j = j + 1, return to U306, recalculate mbest with the updated particle parameters and update the position information of the next particle; if all particles have been updated, proceed to D305.
D305: judge whether the number of iterations has reached the maximum number of iterations. If not, add 1 to iter and return to U305 for the next iteration; if the maximum number of iterations has been reached, output the final global optimal parameter, which is subsequently used to train the jointly optimized echo state network for predicting logical block addresses.
S3: based on the optimal storage layer parameters in the echo state network, calculate the final output weight with the L2 + adaptive L1/2 regularization-constrained echo state network. The process of calculating the final output weight is shown in fig. 4 and is described in detail below.
S4: predict the hot data using the final output weight and the logical block address of the input historical hot data, with the prediction formula:
y = x·Wout
where y denotes the obtained predicted logical block address (the data at the predicted logical block address is hot data), x is the logical block address of the input historical hot data, and Wout denotes the output weight. Here y is the predicted access address; note that x and Wout may both be multidimensional variables, while the obtained y is one-dimensional. The data at the obtained logical block address is classified as hot data for garbage collection and wear leveling.
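The prediction step S4 is a single matrix product. A toy sketch with illustrative numbers (the reservoir state row x and the trained weights Wout are made up for demonstration):

```python
import numpy as np

# x: 1 x n state row built from the input logical block addresses;
# Wout: n x 1 trained output weights; y: one-dimensional predicted address.
x = np.array([[0.12, 0.40, 0.33]])
Wout = np.array([[0.5], [1.0], [2.0]])
y = x @ Wout  # 0.12*0.5 + 0.40*1.0 + 0.33*2.0 = 1.12
print(round(float(y), 6))
```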
When the output weight is calculated, a group of storage layer parameters, namely the internal connection spectral radius, the storage layer scale, the input layer scaling coefficient and the storage layer sparsity, has already been determined. As shown in fig. 4, the process of calculating the output weight with the L2 + adaptive L1/2 regularization-constrained echo state network used in the present invention is as follows:
U401: acquire the input layer-storage layer weight matrix and the storage layer internal connection weight matrix in the echo state network; the front section of the logical block addresses of the historical hot data is used as the input variable U, and the rear section as the actual result Y.
Specifically, an Echo State Network (ESN) is a low-complexity, fast-converging computation scheme suitable for temporal data classification and prediction tasks. The ESN architecture includes three layers: an input layer, a storage layer and an output layer, where the input layer-storage layer weight is Win, the storage layer internal connection weight is Wx, and the storage layer-output layer weight is Wout. The numbers of nodes in the input layer, storage layer and output layer are initialized as K, n and L, and the number of storage layer nodes n is determined by the storage layer scale among the storage layer parameters. The input layer-storage layer weight Win ∈ R^(n×K) is initialized by random assignment. The storage layer internal connection weight Wx ∈ R^(n×n) is initialized by taking n × n × (storage layer sparsity) non-zero entries, whose positions and values in Wx are randomly assigned, while all other entries are zero; the larger the storage layer sparsity, the stronger the nonlinear approximation capability. Then the internal connection spectral radius is used to determine the maximum eigenvalue of Wx; network stability is guaranteed only if the internal connection spectral radius is less than 1. Thus the input layer-storage layer weight Win ∈ R^(n×K) and the storage layer internal connection weight Wx ∈ R^(n×n) are determined from the storage layer parameters.
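The initialization described above can be sketched as follows. The function name and default values are illustrative; the patent does not prescribe the random distributions, so uniform sampling is an assumption:

```python
import numpy as np

def init_reservoir(K, n, sparsity=0.1, spectral_radius=0.9, seed=0):
    """Initialize Win (n x K) and Wx (n x n): Win random; Wx with about
    n*n*sparsity non-zero entries at random positions, rescaled so that
    its largest-magnitude eigenvalue equals spectral_radius (< 1 for
    stability, as the patent requires)."""
    rng = np.random.default_rng(seed)
    Win = rng.uniform(-1, 1, size=(n, K))          # random assignment
    Wx = rng.uniform(-1, 1, size=(n, n))
    Wx[rng.random((n, n)) > sparsity] = 0.0        # enforce sparsity
    rho = max(abs(np.linalg.eigvals(Wx)))          # current spectral radius
    if rho > 0:
        Wx *= spectral_radius / rho                # rescale to target radius
    return Win, Wx

Win, Wx = init_reservoir(K=3, n=50)
print(round(float(max(abs(np.linalg.eigvals(Wx)))), 6))  # 0.9
```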
In this embodiment, the L1/2 and L2 coefficients are also initialized as λ1 = 5×10^-7 and λ2 = 1×10^-5 for the regularization calculation. The input variable U is constructed from the front 2/3 of the logical block addresses of the input historical hot data, and the actual result Y from the rear 1/3. In this embodiment of the invention, the selected logical block addresses are those of the historical hot data recorded by the user; in other feasible embodiments the selected lengths may differ and are not specifically limited here. The general idea is to predict the rear-section addresses from the front-section addresses and compare the prediction with the actual addresses to adjust the network; this part is an original characteristic of the echo state network and is not described in detail here.
U402: updating state information X of the storage layer based on the input variable U, wherein the state information X is composed of state node information X (t);
X(t)=logsig(U(t)Win+X(t-1)Wx)
where U(t) denotes the t-th datum in the input variable U, X(t) and X(t-1) denote the t-th and (t-1)-th state node information respectively, and the number of nodes is determined by the data length of the input variable U. Win and Wx denote the input layer-storage layer weight matrix and the storage layer internal connection weight matrix of the echo state network, respectively, and logsig(·) is the activation function, which enables the neural network to approximate arbitrary nonlinear functions so that it can be applied to nonlinear models. When the activation function is used, the input is multiplied directly by the input layer scaling coefficient to map it into the activation function's working range. Since the calculation proceeds by inputting data in sequence, t can be understood as a time step.
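The recurrent state update above can be sketched as follows (function names are illustrative; the matrix orientation, with Win of shape n × K applied to each input row, is an assumption consistent with the dimensions given earlier):

```python
import numpy as np

def logsig(z):
    """logsig activation: 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def run_reservoir(U, Win, Wx, input_scale=1.0):
    """Collect storage-layer states X(t) = logsig(scale*U(t)*Win + X(t-1)*Wx).

    U: (T, K) input sequence; Win: (n, K); Wx: (n, n). Returns X of
    shape (T, n). The input layer scaling coefficient multiplies the
    input before the activation, as described for U402."""
    T, n = U.shape[0], Win.shape[0]
    X = np.zeros((T, n))
    x = np.zeros(n)                       # X(0): zero initial state
    for t in range(T):
        x = logsig(Win @ (input_scale * U[t]) + Wx @ x)
        X[t] = x
    return X
```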
U402: based on the loss function under the L2 + adaptive L1/2 regularization constraint, obtain the output weight at the minimum of the loss function:
E = ||X·Wout − Y||² + λ2·||Wout||₂² + λ1·Σ|w_i|^(1/2)
where E denotes the loss function and λ1, λ2 are the regularization coefficients.
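A numerical sketch of such a loss. The patent's exact formula is an omitted image, so the specific L2 (ridge) and L1/2 penalty forms below are the standard ones, taken as assumptions; λ defaults match this embodiment's values:

```python
import numpy as np

def loss(Wout, X, Y, lam1=5e-7, lam2=1e-5):
    """L2 + L1/2 regularized squared-error loss for the ESN readout.

    X: (T, n) collected states; Wout: (n, L) output weights; Y: (T, L)
    targets. lam1 weights the L1/2 term, lam2 the L2 term."""
    resid = X @ Wout - Y
    return (np.sum(resid ** 2)
            + lam2 * np.sum(Wout ** 2)              # L2 (ridge) penalty
            + lam1 * np.sum(np.abs(Wout) ** 0.5))   # L1/2 sparsity penalty
```

Minimizing this loss is what the coordinate descent procedure in the following steps carries out; the L1/2 term drives small readout weights toward zero while the L2 term keeps the problem well conditioned.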
To carry out the calculation, the loss function E is first simplified, and a coordinate descent algorithm is then adopted to compute the output weight;
wherein the simplified loss function is represented as:
where I is an identity matrix.
The matrix W'out is solved element by element; the value of the element in the m-th row and k-th column of W'out is given by:
where Y'k(t) denotes the t-th element of the k-th column of Y', X'j(t) denotes the t-th element of the j-th column of X', and (W'out)jk denotes the element in the j-th row and k-th column of matrix W'out; the term with j = m is excluded from the sum (taken as zero).
Finally, the output weight Wout is calculated from the relationship between the matrix W'out and the output weight Wout. In this embodiment, the method further comprises adaptively optimizing the output weight obtained in step U402; the optimization is step U404:
U404: convert the loss function and calculate the weight W''out with the coordinate descent algorithm; then use the weight W''out to calculate the optimized output weight;
The loss function after conversion is:
The relationship between the weight W''out and the output weight Wout is:
where n is the number of storage layer nodes and K is the number of input layer nodes.
To verify the reliability of the method, which creatively replaces hot data identification with hot data prediction and improves the accuracy of hot data discrimination, we used four actual workloads for objective evaluation. Financial1 is a write-intensive trace file, MSR is a common workload of large enterprise servers, Distilled represents typical usage patterns of personal computers, and MillSSD was collected from an industrial automated optical inspection instrument with a Runcore RCS-V-T25 SSD (512 GB, SATA2), an Intel X27400 and 2 GB DDR3. MillSSD is also a write-intensive trace file because it performs substantial image backups. The performance comparison results of this embodiment are shown in fig. 5. From the test results it can be seen that, taking WDAC as the baseline, the HOESN hot-ratio curve almost overlaps that of WDAC in most cases; this trend is clear on all four workloads, especially the more write-intensive MSR and MillSSD. Clearly, on the four workloads our HOESN has the lowest FIR, followed by DL-MBF_s. Although MBF shows a relatively high FIR, it is still a good HDI scheme for SSDs, and the subsequently proposed WDAC has become a classical benchmark for later studies. Notably, among the four workloads, the improvement of HOESN is most impressive on MillSSD (from 4.08% to 2.23%). These preliminary tests also support our original idea that learning the access behaviour of hot data on NAND flash can be treated as a time-series prediction problem, for which HOESN was proposed. The results show that our prediction method captures the access behaviour of the disk workload well, which is the basic premise for providing reliable service for GC and WL.
It should be emphasized that the examples described herein are illustrative and not restrictive, and thus the invention is not limited to the examples described in the specific embodiments, but rather, other embodiments may be devised by those skilled in the art without departing from the spirit and scope of the present invention, and it is intended to cover all modifications, alterations, and equivalents included within the scope of the present invention.

Claims (10)

1. A thermal data prediction method based on a jointly optimized echo state network, characterized by comprising the following steps:
S1: initializing parameters required by a quantum particle swarm algorithm and position information of each particle;
the position information of the particles comprises initial positions and position ranges of the particles, and the position of each particle is represented by storage layer parameters in an echo state network;
S2: iterative optimization is carried out by utilizing a quantum particle group algorithm to determine the optimal storage layer parameters;
updating the positions of the particles by a quantum particle swarm algorithm based on the position range of each particle, wherein in each update the echo state network with the L2 + adaptive L1/2 regularization constraint is used to calculate the output weight and the global optimal adaptive value; when the iteration ends, the particle position corresponding to the global optimal adaptive value is taken as the optimal storage layer parameter;
S3: based on the optimal storage layer parameters in the echo state network, calculating the final output weight with the L2 + adaptive L1/2 regularization-constrained echo state network;
S4: and predicting the hot data by using the final output weight and the address of the logic block where the input historical hot data is located, wherein the prediction formula is as follows:
y=x*Wout
wherein y represents the obtained predicted logical block address, the data at the predicted logical block address being thermal data; x is the logical block address of the input historical thermal data; Wout denotes the output weight; the logical block addresses of the historical thermal data are used in the echo state network training process in steps S2 and S3.
2. The method of claim 1, wherein: the iterative optimization for determining the optimal storage layer parameters in step S2 is performed as follows:
S21: taking the position of each particle in turn as the storage layer parameters of the echo state network, and calculating the output weight corresponding to each particle with the L2 + adaptive L1/2 regularization-constrained echo state network;
the current position of each particle is sequentially used as a storage layer parameter in the echo state network and an output weight is calculated;
S22: calculating an adaptive value of each particle by using the output weight corresponding to each particle;
S23: selecting an individual optimal adaptive value, an individual optimal parameter, a global optimal adaptive value and a global optimal parameter of each particle according to the adaptive value of each particle based on a minimum adaptive value principle;
wherein, the position of the particle selected as the global optimum adaptive value is the global optimum parameter;
S24: updating the position of each particle in the position range of the particle, recalculating the adaptive value of each particle based on the updated position of each particle, and updating the individual optimal adaptive value, the individual optimal parameter, the global optimal adaptive value and the global optimal parameter of each particle based on the minimum adaptive value principle;
S25: judging whether the iteration times reach the maximum iteration times, if not, returning to the step S24 for the next iteration calculation; otherwise, the current global optimum parameter is taken as the optimum storage layer parameter.
3. The method of claim 2, wherein the position of any particle j is updated according to the following formula:
P_j(t+1) = p_j ± β·|mbest − P_j(t)|·ln(1/u_j), with p_j = φ_j·sbest_j + (1 − φ_j)·gbest and β = ω_min + (ω_max − ω_min)·(iter_max − iter)/iter_max
where P_j(t+1) and P_j(t) denote the positions of particle j after and before the update respectively, φ_j and u_j are random numbers, sbest_j and sbest_i denote the individual optimal parameters of the j-th and i-th particles, mbest is the mean of the current individual optimal parameters of all particles, iter and iter_max are the current and maximum iteration numbers respectively, ω_max and ω_min are inertia factors, and N is the total number of particles.
4. The method of claim 2, wherein: the calculation formula of the adaptive value of any particle j is as follows:
where Fitness denotes the adaptive value of the current particle j, λ1 and λ2 are the regularization coefficients, and Wout is the output weight corresponding to the current particle j; Y represents the rear section of the logical block addresses of the historical thermal data used for network training, X represents the state information of the storage layer updated on the basis of the front section of those logical block addresses, and X·Wout represents the prediction corresponding to the rear section of the logical block addresses of the historical thermal data.
5. The method of claim 1, wherein the process of calculating the output weight with the L2 + adaptive L1/2 regularization-constrained echo state network is as follows:
U401: acquiring the input layer-storage layer weight matrix and the storage layer internal connection weight matrix in the echo state network, with the front section of the logical block addresses of the historical thermal data as the input variable U and the rear section as the actual result Y;
Wherein, the input layer-storage layer weight matrix and the storage layer internal connection weight matrix are related to the storage layer parameters in the echo state network;
u402: updating state information X of the storage layer based on the input variable U, wherein the state information X is composed of state node information X (t);
X(t)=logsig(U(t)Win+X(t-1)Wx)
wherein U(t) represents the t-th datum in the input variable U, X(t) and X(t-1) represent the t-th and (t-1)-th state node information respectively, the maximum value T of t is determined by the data length of the input variable U, Win and Wx respectively represent the input layer-storage layer weight matrix and the storage layer internal connection weight matrix in the echo state network, and logsig(·) represents the activation function;
U402: based on the loss function under the L2 + adaptive L1/2 regularization constraint, obtaining the output weight at the minimum of the loss function;
where E represents the loss function and λ1, λ2 are the regularization coefficients.
6. The method of claim 5, wherein the process of step U402 is: simplifying the loss function, and calculating the output weight with a coordinate descent algorithm;
Wherein the simplified loss function is represented as:
where I is an identity matrix;
solving the matrix W'out is done element by element; the value of the element in the m-th row and k-th column of W'out is:
where Y'k(t) denotes the t-th element of the k-th column of Y', X'j(t) denotes the t-th element of the j-th column of X', and (W'out)jk denotes the element in the j-th row and k-th column of matrix W'out, the term with j = m being excluded from the sum (taken as zero); L is the number of output layer nodes, and n is the number of storage layer nodes.
7. The method of claim 6, further comprising adaptively optimizing the output weight obtained in step U402, the optimization process being:
converting the loss function and calculating the weight W''out with a coordinate descent algorithm, then calculating the optimized output weight;
The loss function after conversion is:
The relationship between the weight W''out and the output weight Wout is:
where K is the number of input layer nodes.
8. The method of claim 1, wherein the storage layer parameters in the echo state network comprise four key parameters: the internal connection spectral radius, the storage layer scale, the input layer scaling coefficient and the storage layer sparsity.
9. The method of claim 1, wherein the parameters required to initialize the quantum particle swarm algorithm in step S1 include the particle swarm size N, the maximum iteration number iter_max, and the inertia factors ω_max and ω_min.
10. The method of claim 1, wherein, when the particle position is updated, if a particle's moving distance exceeds the position range corresponding to that particle, the particle position parameter is set to the boundary value of the position range that was exceeded.
CN201910566123.4A 2019-06-27 2019-06-27 Thermal data prediction method based on joint optimization echo state network Active CN110554838B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910566123.4A CN110554838B (en) 2019-06-27 2019-06-27 Thermal data prediction method based on joint optimization echo state network
PCT/CN2020/097950 WO2020259543A1 (en) 2019-06-27 2020-06-24 Hot data prediction method based on joint optimization of echo state network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910566123.4A CN110554838B (en) 2019-06-27 2019-06-27 Thermal data prediction method based on joint optimization echo state network

Publications (2)

Publication Number Publication Date
CN110554838A true CN110554838A (en) 2019-12-10
CN110554838B CN110554838B (en) 2020-08-14

Family

ID=68735438

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910566123.4A Active CN110554838B (en) 2019-06-27 2019-06-27 Thermal data prediction method based on joint optimization echo state network

Country Status (2)

Country Link
CN (1) CN110554838B (en)
WO (1) WO2020259543A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020259543A1 (en) * 2019-06-27 2020-12-30 中南大学 Hot data prediction method based on joint optimization of echo state network
CN112448697A (en) * 2020-10-30 2021-03-05 合肥工业大学 Active filter optimization method and system based on quantum particle swarm optimization
CN112731019A (en) * 2020-12-21 2021-04-30 合肥工业大学 Fault diagnosis method for ANPC three-level inverter
WO2024077642A1 (en) * 2022-10-12 2024-04-18 大连理工大学 Method for constructing quantum echo state network model for aero-engine fault early-warning

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103959291A (en) * 2011-04-20 2014-07-30 诺沃—诺迪斯克有限公司 Glucose predictor based on regularization networks with adaptively chosen kernels and regularization parameters
US20170046079A1 (en) * 2015-08-11 2017-02-16 International Business Machines Corporation Read distribution in a three-dimensional stacked memory based on thermal profiles
CN109104388A (en) * 2017-06-20 2018-12-28 希捷科技有限公司 The devices, systems, and methods adaptive for regularization parameter
CN109656485A (en) * 2018-12-24 2019-04-19 合肥兆芯电子有限公司 The method for distinguishing dsc data and cold data
CN109726858A (en) * 2018-12-21 2019-05-07 新奥数能科技有限公司 Heat load prediction method and device based on dynamic time warping
CN109901800A (en) * 2019-03-14 2019-06-18 重庆大学 A kind of mixing memory system and its operating method

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103020434A (en) * 2012-11-30 2013-04-03 南京航空航天大学 Particle swarm optimization-based least square support vector machine combined predicting method
CN110554838B (en) * 2019-06-27 2020-08-14 中南大学 Thermal data prediction method based on joint optimization echo state network


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
MEILING XU; SHUHUI ZHANG; MIN HAN: "Multivariate time series modeling and prediction based on reservoir independent components", 2015 Sixth International Conference on Intelligent Control and Information Processing (ICICIP) *



Similar Documents

Publication Publication Date Title
CN110554838B (en) Thermal data prediction method based on joint optimization echo state network
KR102645142B1 (en) Storage devices, methods and non-volatile memory devices for performing garbage collection using estimated valid pages
WO2021008220A1 (en) Systems and methods for data storage system
CN110414031B (en) Method and device for predicting time sequence based on volterra series model, electronic equipment and computer readable storage medium
US20180067697A1 (en) Storage Devices Including Nonvolatile Memory Devices and Access Methods for Nonvolatile Memory Devices
CN109817267B (en) Deep learning-based flash memory life prediction method and system and computer-readable access medium
CN110968272B (en) Time sequence prediction-based method and system for optimizing storage performance of mass small files
Ma et al. RBER-aware lifetime prediction scheme for 3D-TLC NAND flash memory
CN106233265A (en) Access frequency hierarchical structure is used for evicting from the selection of target
EP3651024B1 (en) Method of operating storage device, storage device performing the same and storage system including the same
Zhang et al. Crftl: cache reallocation-based page-level flash translation layer for smartphones
Luo et al. Self-learning hot data prediction: Where echo state network meets NAND flash memories
Gupta et al. Relevance feedback based online learning model for resource bottleneck prediction in cloud servers
He et al. Information-aware attention dynamic synergetic network for multivariate time series long-term forecasting
Nguyen et al. Recurrent conditional heteroskedasticity
Heidenreich et al. Transfer learning of recurrent neural network‐based plasticity models
Ahmed et al. Bitcoin Price Prediction using the Hybrid Convolutional Recurrent Model Architecture
Poczeta et al. Analysis of fuzzy cognitive maps with multi-step learning algorithms in valuation of owner-occupied homes
Feng et al. Using disturbance compensation and data clustering (DC) 2 to improve reliability and performance of 3D MLC flash memory
CN110705631A (en) SVM-based bulk cargo ship equipment state detection method
Ha et al. Dynamic hot data identification using a stack distance approximation
Reddy et al. Analysis of Stock Market Value Prediction using Simple Novel Long Short Term Memory Algorithm in Comparison with Back Propagation Algorithm for Increased Accuracy Rate
CN113822583A (en) Power distribution network investment demand prediction method and device, terminal equipment and medium
Khan et al. DaCapo: An On-Device Learning Scheme for Memory-Constrained Embedded Systems
Kargar et al. E2-NVM: A Memory-Aware Write Scheme to Improve Energy Efficiency and Write Endurance of NVMs using Variational Autoencoders.

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant