CN117251830A - Flash memory life prediction method and device, readable storage medium and electronic equipment - Google Patents

Publication number
CN117251830A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311510147.0A
Other languages
Chinese (zh)
Inventor
徐永刚
孙成思
何瀚
王灿
谭尚庚
刘昆奇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Statan Testing Technology Co ltd
Original Assignee
Chengdu Statan Testing Technology Co ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/27 Regression, e.g. linear or logistic regression
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24147 Distances to closest patterns, e.g. nearest neighbour classification
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a flash memory life prediction method and device, a readable storage medium and electronic equipment. A flash memory sample, the flash memory features corresponding to the sample and its flash memory life are obtained. The flash memory features undergo secondary processing with a correlation-based feature selection algorithm to obtain a feature subset related to the flash memory life. Based on the feature subset, a K-nearest-neighbor algorithm performs classification to obtain classified data, and a preset multiple linear regression model is trained with the classified data and the flash memory life to obtain a trained multiple linear regression model. Feature information of a flash memory to be predicted is then obtained, and the trained model predicts the life of that flash memory from the feature information. Combining the correlation-based feature selection algorithm with the K-nearest-neighbor algorithm improves the precision of the final prediction model, so that flash memory life is predicted efficiently and accurately.

Description

Flash memory life prediction method and device, readable storage medium and electronic equipment
Technical Field
The invention relates to the technical field of flash life prediction, in particular to a flash life prediction method and device, a readable storage medium and electronic equipment.
Background
NAND flash memory is a storage device with clear advantages over hard disk drives: its non-volatility, high storage capacity, low manufacturing cost and short erase time make it a preferred choice in the market. Its applications range from small devices such as memory cards, USB drives and mobile phones to large ones such as household appliances, automobiles, cloud computing and other enterprise fields, and its wide use in automatic control, communications, industry, aerospace, precision manufacturing, computers and other fields has further accelerated its development.
At present, with the continuous progress of semiconductor manufacturing processes, three-dimensional flash memory has entered an era of ultra-high density and ultra-large capacity, and the reliability degradation caused by shrinking feature sizes and a growing number of stacked layers has become an urgent problem. Reliability is the most basic requirement of data storage: if it cannot be guaranteed, a flash memory is of no use however large its capacity, high its density or good its performance. Among the reliability parameters, flash memory life is particularly important; it represents the number of operations the flash memory can perform before it fails (loses its read/write function). Predicting the flash memory life effectively avoids the losses, such as data loss and economic loss, caused by flash memory device failure, and it also allows an effective usage strategy to be formulated from the predicted life so that the data function of the flash memory product is exploited to the fullest. Existing flash memory life prediction techniques apply machine-learning classification training, such as decision trees and support vector machines, to collected flash memory feature values, but their generalization capability is weak: if a new data set is added to the original data set, they struggle to give a reasonable response. Moreover, the feature values are not screened; a large number of raw and secondarily processed features are fed directly into the classification algorithm, which easily leads to computational problems such as overfitting and dimension explosion, and the larger amount of calculation also reduces the efficiency and accuracy of flash memory life prediction.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a flash memory life prediction method and device, a readable storage medium and electronic equipment that enable efficient and accurate flash memory life prediction.
In order to solve the technical problems, the invention adopts a technical scheme that:
a flash memory life prediction method comprises the following steps:
acquiring a flash memory sample, flash memory characteristics corresponding to the flash memory sample and flash memory service life;
performing secondary processing on the flash memory features by using a feature selection algorithm based on correlation to obtain a feature subset related to the flash memory life;
classifying by using a K nearest neighbor algorithm based on the feature subset to obtain classified data, and training a preset multiple linear regression model by using the classified data and the flash memory life to obtain a trained multiple linear regression model;
acquiring characteristic information of a flash memory to be predicted, and predicting the flash memory to be predicted according to the characteristic information by using the trained multiple linear regression model to obtain the service life of the flash memory to be predicted;
wherein classifying with the K-nearest-neighbor algorithm based on the feature subset to obtain the classified data comprises:
randomly grouping the feature subset to obtain a training group and a test group;
calculating the Euclidean distance between each training feature in the training group and all the test features in the test group to obtain distance values;
sorting the distance values to obtain sorted distance values;
determining a preset K value, and selecting the nearest-neighbor sample points in order from the sorted distance values according to the preset K value;
and determining the target class that accounts for the largest proportion among the nearest-neighbor sample points, and taking the target class as the predicted class of the training feature to obtain the classified data.
In order to solve the technical problems, the invention adopts another technical scheme that:
a flash life prediction apparatus, comprising:
the data acquisition module is used for acquiring a flash memory sample, flash memory characteristics corresponding to the flash memory sample and flash memory service life;
the data processing module is used for carrying out secondary processing on the flash memory characteristics by using a characteristic selection algorithm based on correlation to obtain a characteristic subset related to the service life of the flash memory;
the model training module is used for classifying by using a K nearest neighbor algorithm based on the feature subset to obtain classified data, and training a preset multiple linear regression model by using the classified data and the flash memory life to obtain a trained multiple linear regression model;
the prediction module is used for obtaining the characteristic information of the flash memory to be predicted, and predicting the flash memory to be predicted according to the characteristic information by using the trained multiple linear regression model to obtain the service life of the flash memory to be predicted;
wherein classifying with the K-nearest-neighbor algorithm based on the feature subset to obtain the classified data comprises:
randomly grouping the feature subset to obtain a training group and a test group;
calculating the Euclidean distance between each training feature in the training group and all the test features in the test group to obtain distance values;
sorting the distance values to obtain sorted distance values;
determining a preset K value, and selecting the nearest-neighbor sample points in order from the sorted distance values according to the preset K value;
and determining the target class that accounts for the largest proportion among the nearest-neighbor sample points, and taking the target class as the predicted class of the training feature to obtain the classified data.
In order to solve the technical problems, the invention adopts another technical scheme that:
a computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of a flash lifetime prediction method as described above.
In order to solve the technical problems, the invention adopts another technical scheme that:
an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of a flash lifetime prediction method as described above when the computer program is executed.
The invention has the following beneficial effects. The flash memory features undergo secondary processing with a correlation-based feature selection algorithm to obtain a feature subset related to the flash memory life; a K-nearest-neighbor algorithm classifies based on the feature subset to obtain classified data; a preset multiple linear regression model is trained with the classified data and the flash memory life; and the trained multiple linear regression model is used for life prediction. The correlation-based feature selection algorithm effectively filters out useless flash memory features and finds the features highly related to the flash memory life, while the K-nearest-neighbor algorithm accurately mines the mapping between the flash memory life and the features, so the classified data is highly accurate. Even when new data is added, computational problems such as overfitting and dimension explosion are avoided. Combining the two algorithms improves the precision of the final prediction model and reduces training time, thereby achieving efficient and accurate flash memory life prediction.
Drawings
FIG. 1 is a flowchart of a flash memory life prediction method according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a flash memory life prediction device according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the best-first search in the flash memory life prediction method according to an embodiment of the present invention;
FIG. 5 shows the linear function calculation result of life prediction in the flash memory life prediction method according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of the life regression fit in the flash memory life prediction method according to an embodiment of the present invention.
Detailed Description
In order to describe the technical contents, the achieved objects and effects of the present invention in detail, the following description will be made with reference to the embodiments in conjunction with the accompanying drawings.
Referring to fig. 1, an embodiment of the present invention provides a method for predicting a lifetime of a flash memory, including the steps of:
acquiring a flash memory sample, flash memory characteristics corresponding to the flash memory sample and flash memory service life;
performing secondary processing on the flash memory features by using a feature selection algorithm based on correlation to obtain a feature subset related to the flash memory life;
classifying by using a K nearest neighbor algorithm based on the feature subset to obtain classified data, and training a preset multiple linear regression model by using the classified data and the flash memory life to obtain a trained multiple linear regression model;
acquiring characteristic information of a flash memory to be predicted, and predicting the flash memory to be predicted according to the characteristic information by using the trained multiple linear regression model to obtain the service life of the flash memory to be predicted;
wherein classifying with the K-nearest-neighbor algorithm based on the feature subset to obtain the classified data comprises:
randomly grouping the feature subset to obtain a training group and a test group;
calculating the Euclidean distance between each training feature in the training group and all the test features in the test group to obtain distance values;
sorting the distance values to obtain sorted distance values;
determining a preset K value, and selecting the nearest-neighbor sample points in order from the sorted distance values according to the preset K value;
and determining the target class that accounts for the largest proportion among the nearest-neighbor sample points, and taking the target class as the predicted class of the training feature to obtain the classified data.
From the above description, the beneficial effects of the invention are as follows. The flash memory features undergo secondary processing with a correlation-based feature selection algorithm to obtain a feature subset related to the flash memory life; a K-nearest-neighbor algorithm classifies based on the feature subset to obtain classified data; a preset multiple linear regression model is trained with the classified data and the flash memory life; and the trained multiple linear regression model is used for life prediction. The correlation-based feature selection algorithm effectively filters out useless flash memory features and finds the features highly related to the flash memory life, while the K-nearest-neighbor algorithm accurately mines the mapping between the flash memory life and the features, so the classified data is highly accurate. Even when new data is added, computational problems such as overfitting and dimension explosion are avoided. Combining the two algorithms improves the precision of the final prediction model and reduces training time, thereby achieving efficient and accurate flash memory life prediction.
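The K-nearest-neighbor steps listed above (distance calculation, sorting, picking K neighbors, majority vote) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `knn_predict`, the feature vectors and the class labels "short"/"long" are hypothetical, and the sketch uses the conventional orientation in which labelled reference points vote on the class of a query point, with the random grouping step left out.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority class among the k labelled points nearest to query."""
    # Euclidean distance from the query to every labelled point.
    distances = [(math.dist(x, query), label) for x, label in zip(train_x, train_y)]
    # Sort ascending and keep the k nearest neighbours.
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    # The class with the largest share among the neighbours is the prediction.
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical two-dimensional feature vectors and life classes.
train_x = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (3.0, 3.1), (3.2, 2.9), (2.9, 3.0)]
train_y = ["short", "short", "short", "long", "long", "long"]
predicted = knn_predict(train_x, train_y, (1.1, 1.0), k=3)
```

An odd K is a common choice for two classes because it rules out tied votes; the patent itself only speaks of a preset K value.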
Further, the obtaining a flash memory sample, a flash memory feature corresponding to the flash memory sample, and a flash memory lifetime includes:
obtaining a flash memory sample by using a random extraction mode;
performing read-write test on a storage block of the flash memory sample, and acquiring flash memory characteristics corresponding to the flash memory sample in the read-write test process;
repeatedly erasing the storage block of the flash memory sample until the flash memory sample fails, and acquiring the number of erase operation cycles at the time of failure;
and determining the number of erase operation cycles as the flash memory life.
As can be seen from the above description, the flash memory sample is obtained by random extraction, and the flash memory characteristics and the flash memory lifetime of the flash memory sample are obtained, which is used as the data base for the subsequent model training, so that the randomness of the data is ensured, and the accurate flash memory lifetime prediction is realized.
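A minimal sketch of the random extraction step might look like the following. The function name `draw_flash_samples`, the `seed` parameter and the default ratio of 0.02 are illustrative assumptions, not part of the described method.

```python
import random


def draw_flash_samples(flash_ids, ratio=0.02, seed=None):
    """Randomly draw a preset proportion of the flash population."""
    pool = list(flash_ids)
    rng = random.Random(seed)  # a seed makes the draw reproducible
    # Draw at least one unit, but never more than the pool holds.
    count = min(len(pool), max(1, round(len(pool) * ratio)))
    return rng.sample(pool, count)


# Hypothetical population of 1000 flash chips identified by index.
samples = draw_flash_samples(range(1000), ratio=0.02, seed=0)
```

Sampling without replacement, as `random.sample` does, keeps any chip from being tested twice.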
Further, the flash memory characteristics comprise flash memory storage block programming time, flash memory storage block erasing time, flash memory storage block error rate and residual currents after a plurality of different waiting time values after the flash memory performs erasing operation;
the step of performing a read-write test on the storage block of the flash memory sample, and the step of obtaining the flash memory characteristics corresponding to the flash memory sample in the read-write test process includes:
determining a test node in a storage block of the flash memory sample;
writing operation is carried out on a storage block of the flash memory sample at the test node, and writing time corresponding to the writing operation and the cycle number of stopping recording of return data corresponding to the writing operation are obtained;
multiplying the writing time by the cycle number to obtain the programming time of the flash memory storage block;
performing an erasing operation on a storage block of the flash memory sample, and acquiring an erasing time corresponding to the erasing operation and a continuous period number of the erasing operation;
multiplying the erasing time by the continuous period number to obtain the erasing time of the flash memory storage block;
reading data corresponding to the writing operation from a storage block of the flash memory sample, and determining the error rate of the flash memory storage block according to the read data and the writing data of the writing operation;
waiting for a preset time after the erasing operation, and applying a preset control voltage to a control gate of the flash memory sample;
and after the preset control voltage is applied, acquiring the current value between the drain and the source of the flash memory sample, the current values acquired for a plurality of different waiting times being determined as the residual currents after the erase operation of the flash memory.
As can be seen from the above description, the flash memory characteristics include a flash memory block programming time, a flash memory block erasing time, a flash memory block error rate, and residual currents after a plurality of different waiting time values after the flash memory is erased, and the flash memory characteristics obtained by the method are characteristics possibly related to the flash memory life, which is beneficial to training of a flash memory life prediction model.
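The arithmetic behind three of these features can be sketched as below. The function names are hypothetical, and the error rate is assumed here to be counted per bit; the text only states that it is derived by comparing the read data with the written data.

```python
def block_program_time(write_time, cycle_count):
    """Programming time of a block: write time multiplied by the cycle number."""
    return write_time * cycle_count


def block_erase_time(erase_time, cycle_count):
    """Erase time of a block: erase time multiplied by the number of cycles."""
    return erase_time * cycle_count


def block_error_rate(written: bytes, read_back: bytes) -> float:
    """Fraction of bits that differ between written and read-back data."""
    if len(written) != len(read_back):
        raise ValueError("written and read-back data must have the same length")
    if not written:
        return 0.0
    # XOR leaves a 1 in every bit position where the two bytes disagree.
    differing_bits = sum(bin(w ^ r).count("1") for w, r in zip(written, read_back))
    return differing_bits / (8 * len(written))
```

The residual current feature is a direct hardware measurement and is therefore not modelled in this sketch.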
Further, the secondary processing of the flash memory features using a correlation-based feature selection algorithm to obtain a feature subset related to the flash memory lifetime includes:
evaluating, with a heuristic equation, an estimated value of each flash memory feature for each class;
and screening the flash memory features based on the estimated value to obtain a feature subset related to the flash memory service life.
From the above description, it can be seen that the correlation between each flash memory feature and the life of the flash memory can be accurately and effectively estimated by using a heuristic equation, and the higher the estimated value is, the higher the correlation is, so that the feature subset screened out is highly correlated with the life of the flash memory, but the features are not correlated with each other, thereby being beneficial to improving the accuracy of the model which is finally trained.
Further, the evaluating the estimate of each classification for each of the flash memory features using a heuristic equation includes:
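$$\mathrm{Merit}_S = \frac{k\,\overline{r_{cf}}}{\sqrt{k + k(k-1)\,\overline{r_{ff}}}}$$
In the formula, $\mathrm{Merit}_S$ represents the heuristic merit of a feature subset $S$ comprising $k$ features, $\overline{r_{cf}}$ represents the average correlation of a feature to a class, and $\overline{r_{ff}}$ represents the average correlation of features to features.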
From the above description, it can be known that the correlation between the flash memory features and the flash memory lifetime can be accurately described according to the heuristic advantages obtained by calculating the average correlation between the features and the class, so that the selected feature subset can be used for training the multiple linear regression model.
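As a small numerical illustration, the heuristic can be evaluated as follows, assuming it takes the standard CFS form Merit_S = k * r_cf / sqrt(k + k * (k - 1) * r_ff), with r_cf the average feature-class correlation and r_ff the average feature-feature correlation. The function name `cfs_merit` is hypothetical.

```python
import math


def cfs_merit(k, mean_feature_class_corr, mean_feature_feature_corr):
    """Merit of a k-feature subset from its two average correlations."""
    if k <= 0:
        return 0.0  # the empty subset has no merit
    denominator = math.sqrt(k + k * (k - 1) * mean_feature_feature_corr)
    return k * mean_feature_class_corr / denominator


# Four features with the same class correlation: mutually uncorrelated
# features score higher than fully redundant ones.
independent = cfs_merit(4, 0.5, 0.0)
redundant = cfs_merit(4, 0.5, 1.0)
```

With r_ff = 0 the merit grows with the square root of k, whereas with r_ff = 1 adding features brings no gain, which is why the selected subset favors features that are relevant to the class but not to each other.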
Further, before the preset multiple linear regression model is trained with the classified data and the flash memory life to obtain the trained multiple linear regression model, the method comprises:
and calculating regression coefficients by using a least square method or a gradient descent method, and constructing a preset multiple linear regression model according to the regression coefficients.
From the above description, it is known that the regression coefficient is calculated using the least square method or the gradient descent method, so that the relationship between the input and output of the model can be generated, thereby realizing the prediction of the flash lifetime.
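One way to obtain the regression coefficients by least squares is to solve the normal equations directly. The sketch below does this with Gauss-Jordan elimination in plain Python; the function names `fit_least_squares` and `predict` and the sample data are hypothetical, and a gradient descent variant would be an equally valid reading of the text.

```python
def fit_least_squares(rows, targets):
    """Solve (X^T X) b = X^T y; returns [intercept, coef_1, ..., coef_n]."""
    design = [[1.0] + list(row) for row in rows]  # prepend the intercept column
    n = len(design[0])
    # Augmented matrix [X^T X | X^T y].
    aug = [
        [sum(r[i] * r[j] for r in design) for j in range(n)]
        + [sum(r[i] * t for r, t in zip(design, targets))]
        for i in range(n)
    ]
    for col in range(n):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("features are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def predict(coefficients, row):
    """Predicted life for one feature vector."""
    return coefficients[0] + sum(c * x for c, x in zip(coefficients[1:], row))


# Hypothetical data generated from life = 1000 + 200 * x1 + 300 * x2.
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
life = [1000 + 200 * a + 300 * b for a, b in rows]
coefficients = fit_least_squares(rows, life)
```

The direct solution suits a small feature subset such as the one CFS produces; gradient descent becomes more attractive when the number of features or samples is large.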
Referring to fig. 2, another embodiment of the present invention provides a flash lifetime prediction apparatus, including:
the data acquisition module is used for acquiring a flash memory sample, flash memory characteristics corresponding to the flash memory sample and flash memory service life;
the data processing module is used for carrying out secondary processing on the flash memory characteristics by using a characteristic selection algorithm based on correlation to obtain a characteristic subset related to the service life of the flash memory;
the model training module is used for classifying by using a K nearest neighbor algorithm based on the feature subset to obtain classified data, and training a preset multiple linear regression model by using the classified data and the flash memory life to obtain a trained multiple linear regression model;
the prediction module is used for obtaining the characteristic information of the flash memory to be predicted, and predicting the flash memory to be predicted according to the characteristic information by using the trained multiple linear regression model to obtain the service life of the flash memory to be predicted;
wherein classifying with the K-nearest-neighbor algorithm based on the feature subset to obtain the classified data comprises:
randomly grouping the feature subset to obtain a training group and a test group;
calculating the Euclidean distance between each training feature in the training group and all the test features in the test group to obtain distance values;
sorting the distance values to obtain sorted distance values;
determining a preset K value, and selecting the nearest-neighbor sample points in order from the sorted distance values according to the preset K value;
and determining the target class that accounts for the largest proportion among the nearest-neighbor sample points, and taking the target class as the predicted class of the training feature to obtain the classified data.
Another embodiment of the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of a flash lifetime prediction method as described above.
Referring to fig. 3, another embodiment of the present invention provides an electronic device, including a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where the processor implements the steps of the above-mentioned method for predicting the lifetime of a flash memory when executing the computer program.
The flash memory life prediction method and device, readable storage medium and electronic equipment described above can be applied to life prediction for different types of flash memory, such as single-level cell (SLC), multi-level cell (MLC) and triple-level cell (TLC) flash memory. Specific embodiments are described below:
Example 1
Referring to FIG. 1 and FIGS. 4-6, the flash memory life prediction method of this embodiment comprises the steps of:
S1, acquiring a flash memory sample, flash memory features corresponding to the flash memory sample and a flash memory life, which specifically comprises:
S11, acquiring a flash memory sample by random extraction;
in an alternative embodiment, flash memory samples are randomly extracted from flash memories manufactured at different times and in different batches at the same manufacturing process level, wherein the number of flash memory samples may be a preset proportion of the total number of flash memories from which they are extracted;
in an alternative embodiment, the preset proportion is 2%, and it may be set according to the machine learning accuracy requirement.
S12, performing a read-write test on a storage block of the flash memory sample, and acquiring flash memory characteristics corresponding to the flash memory sample in the read-write test process, wherein the method specifically comprises the following steps:
in an alternative embodiment, the flash memory characteristics include a flash memory block program time, a flash memory block erase time, a flash memory block error rate, and a margin current after a plurality of different latency values after an erase operation of the flash memory;
in another alternative embodiment, the flash memory features further include, but are not limited to, flash memory block read time, operating frequency, chip power consumption, threshold voltage distribution, memory block number, memory page number, number of program/erase cycles currently experienced by the flash memory chip, number of conditional error pages, number of conditional error blocks, number of error bits, and error rate;
S121, determining a test node in a storage block of the flash memory sample;
S122, performing a write operation on the storage block of the flash memory sample at the test node, and acquiring the write time corresponding to the write operation and the cycle number at which recording of the return data corresponding to the write operation stops;
S123, multiplying the write time by the cycle number to obtain the flash memory block programming time;
S124, performing an erase operation on the storage block of the flash memory sample, and acquiring the erase time corresponding to the erase operation and the number of cycles for which the erase operation lasts;
S125, multiplying the erase time by the number of cycles to obtain the flash memory block erase time;
S126, reading the data corresponding to the write operation from the storage block of the flash memory sample, and determining the flash memory block error rate from the read data and the written data of the write operation; the flash memory block error rate is the ratio of the number of differences between the read data and the written data to the total amount of data operated on.
S127, waiting for a preset time after the erase operation, and applying a preset control voltage to the control gate of the flash memory sample;
S128, after the preset control voltage is applied, acquiring the current value between the drain and the source of the flash memory sample, the current values acquired for a plurality of different waiting times being determined as the residual currents after the erase operation of the flash memory.
S13, repeatedly erasing the storage block of the flash memory sample until the flash memory sample fails, and acquiring the number of erase operation cycles at the time of failure;
S14, determining the number of erase operation cycles as the flash memory life.
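The repeated-erase measurement that yields the flash memory life label can be sketched as a simple counting loop. Everything hardware-related here is a stand-in: `erase_and_check` is a hypothetical callback that performs one erase operation and reports whether the block still works, `SimulatedBlock` is a toy replacement for a real device, and `max_cycles` is an assumed safety limit.

```python
def measure_flash_life(erase_and_check, max_cycles=1_000_000):
    """Count erase cycles until the block fails."""
    cycles = 0
    while cycles < max_cycles:
        cycles += 1
        if not erase_and_check():
            return cycles  # cycle number at which the failure was observed
    return None  # no failure observed within max_cycles


class SimulatedBlock:
    """Toy stand-in for real hardware: fails on a fixed erase cycle."""

    def __init__(self, fails_at):
        self.fails_at = fails_at
        self.count = 0

    def erase(self):
        self.count += 1
        return self.count < self.fails_at


block = SimulatedBlock(fails_at=3000)
life = measure_flash_life(block.erase)
```

Returning `None` when no failure is seen keeps an unfinished test from being mistaken for a measured life.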
S2, performing secondary processing on the flash memory features by using a feature selection algorithm (CFS) based on correlation to obtain a feature subset related to the flash memory service life, wherein the method specifically comprises the following steps:
S21, evaluating, with a heuristic equation, an estimated value of each flash memory feature for each class;
specifically:
$$\mathrm{Merit}_S = \frac{k\,\overline{r_{cf}}}{\sqrt{k + k(k-1)\,\overline{r_{ff}}}}$$
In the formula, $\mathrm{Merit}_S$ represents the heuristic merit of a feature subset $S$ comprising $k$ features, $\overline{r_{cf}}$ represents the average correlation of a feature to a class, and $\overline{r_{ff}}$ represents the average correlation of features to features.
Here, the class is the specific flash memory life.
The core of CFS is to use a correlation-based heuristic to estimate and rank the value of feature subsets rather than of individual features. It rests on the assumption that a good feature subset contains features that are highly correlated with the class but uncorrelated with each other.
In an alternative embodiment, before S21, the method further comprises:
S20, performing preliminary processing on the flash memory features to obtain preliminarily processed flash memory features;
for example, calculating the mean, variance, skewness and the like of the flash memory features to obtain the processed flash memory features;
the secondary processing of S21 is then performed on the preliminarily processed flash memory features, so as to improve processing efficiency.
S22, screening the flash memory features based on the estimated value to obtain the feature subset related to the flash memory life.
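The preliminary processing mentioned in S20 (mean, variance, skewness) might look like the following sketch. It assumes that each raw flash memory feature is a list of repeated measurements and that population statistics are wanted; the function name `summarize_feature` and the sample values are hypothetical.

```python
import math
from statistics import fmean, pvariance


def summarize_feature(raw):
    """Reduce repeated measurements of one feature to mean, variance and skewness."""
    mean = fmean(raw)
    variance = pvariance(raw, mu=mean)  # population variance
    std = math.sqrt(variance)
    # Skewness is the mean cubed z-score; defined as 0 for constant data.
    skewness = 0.0 if std == 0 else fmean(((x - mean) / std) ** 3 for x in raw)
    return {"mean": mean, "variance": variance, "skewness": skewness}


# Hypothetical erase-time measurements (arbitrary units) for one storage block.
erase_times = [3.1, 3.0, 3.3, 3.2, 4.0]
print(summarize_feature(erase_times))
```

Replacing a long measurement series with a few summary values shrinks the input that the correlation matrices of the secondary processing have to cover, which is consistent with the efficiency motivation stated for S20.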
CFS first computes the feature-class and feature-feature correlation matrices from the flash memory features, and then searches the feature subset space with a best-first search. The search starts from an empty feature subset, with no feature selected, and generates all possible single features. The estimated value of each is calculated with the heuristic equation, and the feature with the largest estimated value enters the feature subset. The feature with the next largest estimated value is then added; if the estimated value of the resulting two-feature subset is smaller than the previous estimated value, that second feature is removed from the subset. The search continues in this way until the feature combination with the largest estimated value is found, which gives the feature subset related to the flash memory life, as shown in fig. 4. The result of the CFS-based best-first search feature filtering is shown in table 1.
TABLE 1 Results of feature filtering with the CFS-based best-first search
It can be seen that the search starts with an empty feature set, whose merit is 0.0. Each round of the search extends the current subset by adding a single feature, and the highest-scoring subset is selected and expanded in the same manner, i.e., a better feature set is sought. The results show that after a fourth feature is added, the merit of the feature subset is smaller than that of the subset with the three best features, so the best-first search ends and the feature subset related to the flash memory life is obtained.
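The merit computation and the stopping behaviour described above can be sketched as follows. This is a simplified greedy forward selection, not a full best-first search with backtracking, and it assumes that the absolute Pearson coefficient is used as the correlation measure; the source does not name the measure. The names `pearson`, `merit` and `cfs_forward_select`, the tolerance, and the toy data are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def merit(subset, r_cf, r_ff):
    """CFS merit k*mean(r_cf) / sqrt(k + k*(k-1)*mean(r_ff)); 0.0 for the empty subset."""
    k = len(subset)
    if k == 0:
        return 0.0
    mean_cf = sum(r_cf[i] for i in subset) / k
    pairs = [(i, j) for pos, i in enumerate(subset) for j in subset[pos + 1:]]
    mean_ff = sum(r_ff[i][j] for i, j in pairs) / len(pairs) if pairs else 0.0
    return k * mean_cf / math.sqrt(k + k * (k - 1) * mean_ff)


def cfs_forward_select(columns, target, tolerance=1e-12):
    """Add the feature that raises the merit most; stop when no feature raises it."""
    n = len(columns)
    r_cf = [abs(pearson(col, target)) for col in columns]
    r_ff = [[abs(pearson(a, b)) for b in columns] for a in columns]
    selected, best = [], 0.0
    while len(selected) < n:
        score, j = max((merit(selected + [j], r_cf, r_ff), j)
                       for j in range(n) if j not in selected)
        if score <= best + tolerance:
            break  # the extra feature does not improve the subset
        selected.append(j)
        best = score
    return selected, best


# Toy data: features 0 and 1 are independent, feature 2 duplicates feature 0.
columns = [[1, -1, 1, -1], [1, 1, -1, -1], [2, -2, 2, -2]]
target = [2, 0, 0, -2]
selected, best = cfs_forward_select(columns, target)
print(sorted(selected), round(best, 3))
```

In the toy data the redundant duplicate lowers the merit when it is added as a third feature, so the search stops with two features, mirroring the behaviour reported for the fourth feature in table 1.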
In an alternative embodiment, after the feature subset related to the flash memory life is obtained, the method may further include: adding each unselected feature outside the feature subset to the classifier (i.e., the K nearest neighbor algorithm used later); if the classification accuracy of the classifier increases, the feature is useful for global prediction and is added to the feature subset; otherwise, the feature is not useful for global prediction and is not added to the feature subset.
During training, the classes are labeled directly; during testing, the K nearest neighbor algorithm is applied to determine the classes, which are then compared with the labeled values in the library to obtain the classification accuracy.
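This optional refinement amounts to a wrapper step around the classifier. A minimal sketch follows, assuming that unselected features are tried one at a time and that each accepted feature stays in the subset for later trials; the source does not state the order or whether acceptance is cumulative. `augment_subset` and the `accuracy_of` callback are hypothetical names; in practice the callback would train and test the K nearest neighbor classifier on the given features.

```python
def augment_subset(selected, unselected, accuracy_of):
    """Keep an unselected feature only if it raises the classifier accuracy."""
    chosen = list(selected)
    best = accuracy_of(chosen)
    for feature in unselected:
        accuracy = accuracy_of(chosen + [feature])
        if accuracy > best:
            chosen.append(feature)  # useful for global prediction
            best = accuracy
    return chosen, best


# Stand-in accuracy function with made-up numbers: feature 3 helps, feature 4 hurts.
def toy_accuracy(features):
    return 0.90 + 0.05 * (3 in features) - 0.02 * (4 in features)


chosen, accuracy = augment_subset([0, 1], [3, 4], toy_accuracy)
print(chosen)
```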
S3, classifying by using a K nearest neighbor algorithm based on the feature subset to obtain classified data, training a preset multiple linear regression model by using the classified data and the flash memory life to obtain a trained multiple linear regression model, wherein the method specifically comprises the following steps of:
S31, randomly grouping the feature subset to obtain a training group and a test group;
for example, assuming the feature subset contains 1000 samples in total, it is randomly divided into a training group of 800 samples and a test group of 200 samples.
S32, calculating Euclidean distances between each training feature in the training group and all test features in the test group to obtain a distance value;
S33, sorting the distance values to obtain sorted distance values;
S34, determining a preset K value, and sequentially selecting the nearest neighbor sample points from the sorted distance values according to the preset K value;
In an alternative embodiment, the preset K value is 3.
Specifically, each training feature in the training group is placed in a coordinate system whose axes are the individual feature values, all test features in the test group are then placed in the same coordinate system, and the several nearest sample points are found by calculating the Euclidean distance between each training feature in the training group and all test features in the test group.
S35, determining the target class that accounts for the largest proportion among the nearest neighbor sample points, and taking the target class as the predicted class of the training feature to obtain the classified data.
S36, calculating a regression coefficient by using a least square method or a gradient descent method, and constructing a preset multiple linear regression model according to the regression coefficient.
Specifically, whether the least squares method or the gradient descent method is used, an input/output function must first be established. The least squares method takes the feature subset as the input matrix and the flash memory life as the output matrix, and obtains the regression coefficient of each input by matrix calculation; the gradient descent method additionally formulates a cost function of the difference between the actual and predicted values, and obtains the regression coefficients by minimizing that cost function.
S37, training the preset multiple linear regression model by using the classified data and the flash memory life to obtain the trained multiple linear regression model.
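Steps S31 to S35 can be sketched as below. The sketch follows the conventional K nearest neighbor formulation, in which a query sample is classified by majority vote among its K closest labeled samples; which group is queried against which is ambiguous in the translated text, so this is one reading, not a statement of the claimed method. The names `split_samples` and `knn_predict`, the class labels and the sample points are hypothetical.

```python
import math
import random
from collections import Counter


def split_samples(samples, train_fraction=0.8, seed=0):
    """Randomly divide samples into two groups (S31)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def knn_predict(labeled, query, k=3):
    """Classify `query` by majority vote among its k nearest labeled samples."""
    # S32-S33: Euclidean distance to every labeled sample, sorted ascending.
    ranked = sorted(labeled, key=lambda item: math.dist(item[0], query))
    # S34-S35: take the k nearest points and return the most frequent class.
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-feature samples with two life classes.
labeled = [
    ((0.0, 0.0), "short"), ((0.0, 1.0), "short"), ((1.0, 0.0), "short"),
    ((5.0, 5.0), "long"), ((5.0, 6.0), "long"), ((6.0, 5.0), "long"),
]
print(knn_predict(labeled, (0.2, 0.1)))
print(knn_predict(labeled, (5.5, 5.2)))

first_group, second_group = split_samples(range(1000))
print(len(first_group), len(second_group))
```

With K = 3, as in the alternative embodiment, a two-class vote cannot tie, which is one practical reason to prefer an odd K.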
S4, obtaining characteristic information of the flash memory to be predicted, and predicting the flash memory to be predicted according to the characteristic information by using the trained multiple linear regression model to obtain the service life of the flash memory to be predicted;
Specifically, the characteristic information of the flash memory to be predicted is obtained and input into the trained multiple linear regression model, which outputs the predicted life of the flash memory.
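The two ways of obtaining regression coefficients in S36, and the prediction in S4, can be illustrated with a small sketch. It assumes an intercept term, the normal equations for the least squares case, and a mean-squared-error cost for gradient descent; the learning rate, step count, function names and data are hypothetical and are not taken from the source.

```python
def _design(X):
    """Prepend a constant 1.0 to each row so that w[0] acts as the intercept."""
    return [[1.0] + list(row) for row in X]


def _solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_least_squares(X, y):
    """Regression coefficients from the normal equations (X^T X) w = X^T y."""
    rows = _design(X)
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    return _solve(xtx, xty)


def fit_gradient_descent(X, y, learning_rate=0.1, steps=5000):
    """Regression coefficients found by minimizing the mean squared error."""
    rows = _design(X)
    n, p = len(rows), len(rows[0])
    w = [0.0] * p
    for _ in range(steps):
        errors = [sum(wi * xi for wi, xi in zip(w, r)) - t for r, t in zip(rows, y)]
        grad = [2.0 / n * sum(e * r[i] for e, r in zip(errors, rows)) for i in range(p)]
        w = [wi - learning_rate * g for wi, g in zip(w, grad)]
    return w


def predict(w, features):
    """Predicted life for one feature vector."""
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], features))


# Made-up data generated from life = 100 + 3*x1 - 2*x2.
X = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
y = [100, 103, 98, 101, 104]
w_ls = fit_least_squares(X, y)
w_gd = fit_gradient_descent(X, y)
print(round(predict(w_ls, (3, 2)), 6), round(predict(w_gd, (3, 2)), 6))
```

Both routes should arrive at essentially the same coefficients on well-conditioned data; the closed form is direct, while gradient descent trades exactness for the ability to handle larger inputs iteratively.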
Classifiers were first constructed with KNN (K nearest neighbor), DT (decision tree) and SVM (support vector machine) on the original dataset (56 features). All 56 features were then filtered with the CFS algorithm, and the classifiers were tested again with the filtered optimal feature set. The results are shown in tables 2 and 3, which compare the classification accuracy and run time on the original dataset and the screened dataset. The accuracy of the classification models built with CFS is significantly higher than on the original dataset, and classifiers using the feature-selected dataset generally achieve higher accuracy than classifiers using the raw data directly. The results in table 2 also show that the CFS-based KNN model of the present invention achieves the best correct classification rate, up to 98%, in this binary classification problem.
TABLE 2 accuracy of different classification models
TABLE 3 runtime of different classification models
In summary, the CFS-KNN model of the present invention is the most effective. The regression model is analyzed further in fig. 5, which shows the linear function obtained for life prediction: almost all samples lie along this function, and the error is small. Finally, the predicted life is compared with the actual life in fig. 6. The experiments show that the invention predicts the flash memory life well: the goodness of fit R² is 0.998, indicating high model accuracy.
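For reference, the goodness of fit R² quoted above is conventionally computed as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below assumes that standard definition; the function name and the sample values are hypothetical and are not the experimental data behind the 0.998 figure.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical actual and predicted lives, in erase cycles.
actual_life = [3000.0, 3500.0, 4000.0, 4500.0]
predicted_life = [3010.0, 3480.0, 4020.0, 4490.0]
print(r_squared(actual_life, predicted_life))
```

A value close to 1 means the predictions explain almost all of the variation in the measured life, which is the sense in which the reported figure indicates a close fit.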
Example two
Referring to fig. 2, a flash lifetime prediction apparatus of the present embodiment includes:
the data acquisition module is used for acquiring a flash memory sample, flash memory characteristics corresponding to the flash memory sample and flash memory service life;
the data processing module is used for carrying out secondary processing on the flash memory characteristics by using a characteristic selection algorithm based on correlation to obtain a characteristic subset related to the service life of the flash memory;
the model training module is used for classifying by using a K nearest neighbor algorithm based on the feature subset to obtain classified data, and training a preset multiple linear regression model by using the classified data and the flash memory life to obtain a trained multiple linear regression model;
the prediction module is used for obtaining the characteristic information of the flash memory to be predicted, and predicting the flash memory to be predicted according to the characteristic information by using the trained multiple linear regression model to obtain the service life of the flash memory to be predicted;
wherein the classifying by using the K nearest neighbor algorithm based on the feature subset to obtain the classified data comprises the following steps:
randomly grouping the feature subset to obtain a training group and a test group;
calculating the Euclidean distance between each training feature in the training group and all the test features in the test group to obtain distance values;
sorting the distance values to obtain sorted distance values;
determining a preset K value, and sequentially selecting the nearest neighbor sample points from the sorted distance values according to the preset K value;
determining the target class that accounts for the largest proportion among the nearest neighbor sample points, and taking the target class as the predicted class of the training feature to obtain the classified data.
Example three
A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the flash lifetime prediction method of embodiment one.
Example four
Referring to fig. 3, an electronic device includes a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the steps of the flash lifetime prediction method in the first embodiment when executing the computer program.
In summary, the flash memory life prediction method and device, readable storage medium and electronic equipment provided by the invention acquire a flash memory sample, the flash memory features corresponding to the flash memory sample, and the flash memory life; perform secondary processing on the flash memory features by using a correlation-based feature selection algorithm to obtain a feature subset related to the flash memory life; classify by using a K nearest neighbor algorithm based on the feature subset to obtain classified data, and train a preset multiple linear regression model by using the classified data and the flash memory life to obtain a trained multiple linear regression model; and obtain characteristic information of a flash memory to be predicted and predict its life from that information with the trained multiple linear regression model. Specifically, regression coefficients are calculated by the least squares method or the gradient descent method, and the preset multiple linear regression model is constructed from them, establishing the relation between the input and the output of the model. The correlation-based feature selection algorithm effectively filters out useless flash memory features and finds the features highly correlated with the flash memory life, and the K nearest neighbor algorithm accurately mines the mapping between the features and the flash memory life. Combining the two improves the precision of the final prediction model and reduces training time, thereby achieving efficient and accurate flash memory life prediction.
In the foregoing embodiments provided by the present application, it should be understood that the disclosed method, apparatus, computer readable storage medium and electronic device may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative: the division of the modules is merely a division by logical function, and other divisions are possible in an actual implementation; for instance, multiple components or modules may be combined or integrated into another apparatus, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection via some interfaces, devices, components or modules, and may be in electrical, mechanical, or other forms.
The components illustrated as separate components may or may not be physically separate, and components shown as components may or may not be physical modules, i.e., may be located in one place, or may be distributed over multiple network modules. Some or all of the components may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional module in each embodiment of the present invention may be integrated into one processing module, or each component may exist alone physically, or two or more modules may be integrated into one module. The integrated modules may be implemented in hardware or in software functional modules.
The integrated modules, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention, in essence or in whole or in part, may be embodied in the form of a software product stored in a storage medium, including instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
It should be noted that, for the sake of simplicity of description, the foregoing method embodiments are all expressed as a series of combinations of actions, but it should be understood by those skilled in the art that the present invention is not limited by the order of actions described, as some steps may be performed in other order or simultaneously in accordance with the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily all required for the present invention.
In the foregoing embodiments, the description of each embodiment has its own emphasis; for parts not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
The foregoing description is only illustrative of the present invention and is not intended to limit the scope of the invention, and all equivalent changes made by the specification and drawings of the present invention, or direct or indirect application in the relevant art, are included in the scope of the present invention.

Claims (9)

1. A flash memory life prediction method is characterized by comprising the following steps:
acquiring a flash memory sample, flash memory characteristics corresponding to the flash memory sample and flash memory service life;
performing secondary processing on the flash memory features by using a feature selection algorithm based on correlation to obtain a feature subset related to the flash memory life;
classifying by using a K nearest neighbor algorithm based on the feature subset to obtain classified data, and training a preset multiple linear regression model by using the classified data and the flash memory life to obtain a trained multiple linear regression model;
acquiring characteristic information of a flash memory to be predicted, and predicting the flash memory to be predicted according to the characteristic information by using the trained multiple linear regression model to obtain the service life of the flash memory to be predicted;
wherein the classifying by using the K nearest neighbor algorithm based on the feature subset to obtain the classified data comprises the following steps:
randomly grouping the feature subset to obtain a training group and a test group;
calculating the Euclidean distance between each training feature in the training group and all the test features in the test group to obtain distance values;
sorting the distance values to obtain sorted distance values;
determining a preset K value, and sequentially selecting the nearest neighbor sample points from the sorted distance values according to the preset K value;
determining the target class that accounts for the largest proportion among the nearest neighbor sample points, and taking the target class as the predicted class of the training feature to obtain the classified data.
2. The method of claim 1, wherein the obtaining a flash memory sample, a flash memory feature corresponding to the flash memory sample, and a flash memory lifetime comprises:
obtaining a flash memory sample by using a random extraction mode;
performing read-write test on a storage block of the flash memory sample, and acquiring flash memory characteristics corresponding to the flash memory sample in the read-write test process;
repeatedly erasing the storage block of the flash memory sample until the flash memory sample fails, and acquiring the number of erase cycles completed when the flash memory sample fails;
determining that number of erase cycles as the flash memory life.
3. The method of claim 2, wherein the flash memory features include a flash memory block programming time, a flash memory block erase time, a flash memory block error rate, and a residual current after each of a plurality of different waiting times following an erase operation of the flash memory;
performing the read-write test on the storage block of the flash memory sample and acquiring the flash memory features corresponding to the flash memory sample during the read-write test comprises:
determining a test node in the storage block of the flash memory sample;
performing a write operation on the storage block of the flash memory sample at the test node, and acquiring the write time corresponding to the write operation and the number of cycles recorded when the return data of the write operation stops;
multiplying the write time by the number of cycles to obtain the flash memory block programming time;
performing an erase operation on the storage block of the flash memory sample, and acquiring the erase time corresponding to the erase operation and the number of cycles for which the erase operation lasts;
multiplying the erase time by that number of cycles to obtain the flash memory block erase time;
reading the data corresponding to the write operation from the storage block of the flash memory sample, and determining the flash memory block error rate from the read data and the written data of the write operation;
waiting for a preset time after the erase operation, and applying a preset control voltage to the control gate of the flash memory sample;
after the preset control voltage is applied, acquiring the current value between the drain and the source of the flash memory sample, and determining the current value as the residual current after each of the plurality of different waiting times following the erase operation of the flash memory.
4. The method of claim 1, wherein said performing secondary processing on said flash memory features using a correlation-based feature selection algorithm to obtain a subset of features associated with said flash memory lifetime comprises:
evaluating an estimated value of each flash memory feature for each class by using a heuristic equation;
screening the flash memory features based on the estimated value to obtain the feature subset related to the flash memory life.
5. The method of claim 4, wherein the heuristic equation used to evaluate the estimated value of each flash memory feature for each class is:
$$Merit_S = \frac{k\,\overline{r_{cf}}}{\sqrt{k + k(k-1)\,\overline{r_{ff}}}}$$
where $Merit_S$ denotes the heuristic merit of a feature subset $S$ containing $k$ features, $\overline{r_{cf}}$ denotes the average feature-class correlation, and $\overline{r_{ff}}$ denotes the average feature-feature correlation.
6. The method of claim 1, wherein, before training the preset multiple linear regression model by using the classified data and the flash memory life to obtain the trained multiple linear regression model, the method further comprises:
calculating regression coefficients by using a least squares method or a gradient descent method, and constructing the preset multiple linear regression model according to the regression coefficients.
7. A flash life prediction apparatus, comprising:
the data acquisition module is used for acquiring a flash memory sample, flash memory characteristics corresponding to the flash memory sample and flash memory service life;
the data processing module is used for carrying out secondary processing on the flash memory characteristics by using a characteristic selection algorithm based on correlation to obtain a characteristic subset related to the service life of the flash memory;
the model training module is used for classifying by using a K nearest neighbor algorithm based on the feature subset to obtain classified data, and training a preset multiple linear regression model by using the classified data and the flash memory life to obtain a trained multiple linear regression model;
the prediction module is used for obtaining the characteristic information of the flash memory to be predicted, and predicting the flash memory to be predicted according to the characteristic information by using the trained multiple linear regression model to obtain the service life of the flash memory to be predicted;
wherein the classifying by using the K nearest neighbor algorithm based on the feature subset to obtain the classified data comprises the following steps:
randomly grouping the feature subset to obtain a training group and a test group;
calculating the Euclidean distance between each training feature in the training group and all the test features in the test group to obtain distance values;
sorting the distance values to obtain sorted distance values;
determining a preset K value, and sequentially selecting the nearest neighbor sample points from the sorted distance values according to the preset K value;
determining the target class that accounts for the largest proportion among the nearest neighbor sample points, and taking the target class as the predicted class of the training feature to obtain the classified data.
8. A computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the flash memory life prediction method according to any one of claims 1 to 6.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the flash memory life prediction method according to any one of claims 1 to 6.
CN202311510147.0A 2023-11-14 2023-11-14 Flash memory life prediction method and device, readable storage medium and electronic equipment Pending CN117251830A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311510147.0A CN117251830A (en) 2023-11-14 2023-11-14 Flash memory life prediction method and device, readable storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311510147.0A CN117251830A (en) 2023-11-14 2023-11-14 Flash memory life prediction method and device, readable storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN117251830A true CN117251830A (en) 2023-12-19

Family

ID=89137158

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311510147.0A Pending CN117251830A (en) 2023-11-14 2023-11-14 Flash memory life prediction method and device, readable storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN117251830A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination