CN109407972A - Storage device that performs data error correction using a temperature difference equalization method - Google Patents

Storage device that performs data error correction using a temperature difference equalization method

Info

Publication number
CN109407972A
Authority
CN
China
Prior art keywords
storage
temperature
data
write
error
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811081439.6A
Other languages
Chinese (zh)
Other versions
CN109407972B (en)
Inventor
弗兰克·陈
颜巍
高兰娟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhiyu Technology Co ltd
Original Assignee
To Reputation Technology (wuhan) Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by To Reputation Technology (wuhan) Co Ltd
Priority to CN201811081439.6A
Publication of CN109407972A
Application granted
Publication of CN109407972B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0673 Single storage device
    • G06F3/0679 Non-volatile semiconductor memory device, e.g. flash memory, one time programmable memory [OTP]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614 Improving the reliability of storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/0644 Management of space entities, e.g. partitions, extents, pools

Abstract

A storage device that performs data error correction using a temperature difference equalization method has a storage chip, a control chip, and an interface connected to an external data source. The control program implements the following: when a storage cell is written full, the temperature T1 at that moment is recorded and stored as metadata, so that a write temperature T1 is obtained for each storage cell, and the overall average write temperature T is calculated; after writing is complete, the real-time ambient temperature T2 of the storage chip is monitored; whether the difference between the ambient temperature T2 and the average write temperature T exceeds a threshold t is judged, and if so, the next step is entered; whether the ambient temperature T2 has reached the set ideal write temperature is then judged, and once it has, the sum of the differences between the write temperature T1 of every storage cell in a storage region and the ambient temperature T2 is calculated, giving the difference sum of every storage region; the storage regions are sorted by difference sum, and the data in all storage regions are checked and corrected in order from largest to smallest.

Description

Storage device that performs data error correction using a temperature difference equalization method
Technical field
The present invention relates to data storage devices and belongs to the field of data storage control.
Background art
As flash memory generations advance, the data stored in flash memory become increasingly sensitive to temperature changes. For example, data written at high temperature are considerably more likely to be read incorrectly at low temperature, and vice versa.
However, more and more flash memory, such as SSDs (solid state drives) or storage arrays, is used in industrial settings, where the ambient temperature varies over a very wide range, from -40 °C to +85 °C. In a working environment with such large temperature differences, the data written to flash memory can change as the ambient temperature changes; that is, the 0 and 1 states in the storage chip change. Once the change reaches a certain extent, the stored data and files are damaged or lost.
The prior art offers no good solution. The usual industry practice is to make backups, that is, to set up a server or storage center in a dedicated constant-temperature environment to back up and preserve critical data. This approach clearly does not suit all users, and it is particularly unsuitable for individual consumers and small and medium-sized enterprises.
Summary of the invention
The present invention addresses the defects that, under the wide temperature differences described above, the prior art can neither read and write data reliably nor correct itself. Its object is to provide a storage device that performs data error correction using a temperature difference equalization method.
The present invention provides a storage device that performs data error correction using a temperature difference equalization method, having a storage chip, a control chip, and an interface connected to an external data source, wherein a control program is stored in the control chip and implements the following steps:
The storage chip is divided into multiple storage regions, each of which includes multiple storage cells;
As data are written to the storage cells, whenever a storage cell is written full, the temperature T1 at that moment is recorded and stored as metadata, so that a write temperature T1 is obtained for each storage cell, and the overall average write temperature T over the multiple writes performed until the entire storage chip is full is calculated;
After writing is complete, the real-time ambient temperature T2 of the storage chip is monitored;
Whether the difference between the ambient temperature T2 and the average write temperature T exceeds the threshold t is judged, and if so, the next step is entered;
When the difference exceeds the threshold t, whether the ambient temperature T2 has reached the set ideal write temperature is further judged; once T2 reaches the set ideal write temperature, the sum of the differences between the write temperature T1 of every storage cell in a storage region and the ambient temperature T2 is calculated, giving the difference sum of every storage region;
The storage regions are sorted by difference sum, and the data in all storage regions are checked and corrected in order from largest to smallest.
The storage device provided by the present invention may further have the following feature: the storage cell is a memory block or a memory page.
The storage device provided by the present invention may further have the following feature: the process of checking the data includes:
Within a storage region, the differences between the ambient temperature and the write temperature T1 of all storage cells are compared and ordered by magnitude; the storage cells are then scanned in order from largest to smallest difference, and the check yields the bit error rate.
The storage device provided by the present invention may further have the following feature: the data error correction process includes:
When the data bit error rate reaches the threshold but is below the hardware ECC error correction capability of the storage chip itself, hardware ECC error correction is started, the fully recovered data are written back to a new storage cell, and the write temperature T1 at the time of this write is recorded;
When the data bit error rate exceeds the hardware ECC error correction capability of the storage chip itself, an additional data recovery mechanism is started to recover and correct the data.
The storage device provided by the present invention may further have the following feature: the hardware ECC error correction capability of the storage chip itself is set to a bit error rate of 70-85%.
The storage device provided by the present invention may further have the following feature: the additional data recovery mechanism includes RAID data recovery.
The storage device provided by the present invention may further have the following feature: the set value of the ideal write temperature is determined according to factors such as the product specification of the data storage device, the actual use environment, the flash type, and the life cycle.
The storage device provided by the present invention may further have the following feature: the write temperatures T1 of all storage cells of a storage region are stored, ordered by value, in a specific storage region.
The storage device provided by the present invention may further have the following feature: the storage device is an SSD (solid state drive).
The storage device provided by the present invention may further have the following feature: the storage chip of the SSD is a storage chip made of SLC, MLC, TLC, or QLC flash memory.
Function and effect of the invention: in the storage device that performs data error correction using a temperature difference equalization method according to the present invention, a control program is stored in the control chip. The program first divides the storage chip into multiple storage regions, each of which includes multiple storage cells. As data are written to the storage cells, whenever a storage cell is written full, the temperature T1 at that moment is recorded and stored as metadata, so that a write temperature T1 is obtained for each storage cell, and the overall average write temperature T over the multiple writes performed until the entire storage chip is full is calculated. After writing is complete, the real-time ambient temperature T2 of the storage chip is monitored, and whether the difference between T2 and the average write temperature T exceeds the threshold t is judged in real time. Once the difference exceeds the threshold t, whether T2 has reached the set ideal write temperature is further judged; when it has, the sum of the differences between the write temperature T1 of every storage cell in a storage region and the ambient temperature T2 is calculated, giving the difference sum of every storage region. Finally, the storage regions are sorted by difference sum, and the data in all storage regions are checked and corrected in order from largest to smallest. The invention can therefore promptly obtain the temperature of the storage chip after data are written, and it triggers the temperature-difference-equalizing data rebuild only when the difference between the ambient temperature of the storage chip and the overall average write temperature exceeds the set threshold. During error correction, data checking and recovery begin only once the ambient temperature has reached the set optimal write temperature. The storage regions are then processed in descending order of difference sum, with a bit error rate check and data recovery performed on the data in each. This not only corrects and recovers data at risk of loss, but also writes the corrected data at the optimal write temperature, which consolidates the corrected data well.
Brief description of the drawings
Fig. 1 is a structural schematic diagram of the storage device that performs data error correction using a temperature difference equalization method in an embodiment of the present invention; and
Fig. 2 is a schematic diagram of the steps of the method of data error correction using temperature difference equalization that corresponds to the control program stored in the control chip in an embodiment of the present invention.
Detailed description of the embodiments
To make the technical means, creative features, aims, and effects of the present invention easy to understand, the following embodiment describes the storage device that performs data error correction using a temperature difference equalization method in detail with reference to the drawings.
Embodiment 1
Fig. 1 is a structural schematic diagram of the storage device that performs data error correction using a temperature difference equalization method in an embodiment of the present invention.
As shown in Fig. 1, the storage device 100 that performs data error correction using a temperature difference equalization method has a storage chip 10, a control chip 20, an interface 30 connected to an external data source, a PCB 40, and a temperature sensor 50; a control program is stored in the control chip.
The storage chip 10 is a storage chip made of SLC, MLC, TLC, or QLC flash memory.
In this embodiment, the storage chip is a NAND flash chip, specifically one made of SLC, MLC, TLC, or QLC flash memory. In principle, other types of storage chip, such as NOR flash, ROM, PROM, EPROM, EEPROM, Flash ROM, FRAM, MRAM, RRAM, or PCRAM, can also serve as the storage chip of the present invention.
SLC (Single-Level Cell), 1 bit/cell: fast, long-lived, and very expensive (about three times the price of MLC or more), with about 100,000 program/erase cycles.
MLC (Multi-Level Cell), 2 bits/cell: moderate speed, lifetime, and price, with about 3,000 to 10,000 program/erase cycles.
TLC (Trinary-Level Cell), 3 bits/cell, which some flash manufacturers call 8LC: slower, relatively short-lived, and cheap, with about 500 program/erase cycles.
QLC (Quad-Level Cell), 4 bits/cell, supporting 16 charge levels: the slowest and the shortest-lived.
Comparing the first three of these NAND flash structures briefly: SLC has the best performance and a very high price, and is typically used in enterprise products or by high-end enthusiasts. MLC has adequate performance at moderate cost and is the mainstream for consumer SSDs. TLC has the lowest overall performance and the lowest price, but its performance can be compensated for and improved by a high-performance controller and controller algorithms.
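The relation between bits per cell and charge levels mentioned above (16 levels for QLC, "8LC" for TLC) can be illustrated with a minimal sketch. The table and function names are illustrative assumptions and are not part of the patent.

```python
# Bits stored per cell for each flash type, as listed in the text above.
BITS_PER_CELL = {"SLC": 1, "MLC": 2, "TLC": 3, "QLC": 4}


def charge_levels(cell_type: str) -> int:
    """Number of distinct charge levels needed to encode the bits of one cell."""
    return 2 ** BITS_PER_CELL[cell_type]


for name in BITS_PER_CELL:
    print(name, charge_levels(name))
```

More levels per cell leave less voltage margin between adjacent levels, which is consistent with the text's remark that denser cell types are slower and shorter-lived.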
The control chip 20 is a general-purpose, commercially available SSD controller. For a SATA3 controller, for example, the 88SS1074 or 88SS1079 controller from Marvell of the United States (known in Chinese as Maiwei Technology Group Co., Ltd., since renamed Meiman) may be selected, which suits the SATA data interface;
For an NVMe controller, the 88SS1093 or 88SS1092 controller from Marvell may be selected, which suits the PCIe data interface under the NVMe protocol.
Marvell is cited here only as an example; in practice an SSD controller from any manufacturer on the market can be used, and the invention is not limited to Marvell.
The interface 30 connects to the data source; the interfaces used include PCIe and SATA.
The PCB 40 serves as the circuit carrier for the hardware above; the storage chip 10, the control chip 20, and the interface 30 are all mounted on the PCB 40. The PCB carries a temperature sensor 50 for detecting the temperature of the storage chip 10.
Fig. 2 is a schematic diagram of the steps of the method of data error correction using temperature difference equalization that corresponds to the control program stored in the control chip in an embodiment of the present invention.
A control program is stored in the control chip 20 and implements the following data error correction using temperature difference equalization. As shown in Fig. 2, the method comprises the following steps:
Step S1: the storage chip is divided into multiple storage regions, each of which includes multiple storage cells.
The storage cell is a memory block (block) or a memory page (page). In general, the capacity of one basic storage cell is 16 KB; the exact figure varies with the manufacturer of the flash memory.
Step S2: as data are written to the storage cells, whenever a storage cell is written full, the temperature T1 at that moment is recorded and stored as metadata, so that a write temperature T1 is obtained for each storage cell, and the overall average write temperature T over the multiple writes performed until the entire storage chip is full is calculated.
The write temperatures T1 of all storage cells of each storage region are stored, ordered by value, in a specific storage region.
The write temperatures of all storage cells are summed and then divided by the number of write temperatures to give the average temperature T.
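Step S2 can be sketched as follows: a minimal illustration, assuming a simple in-memory log in place of the flash metadata area. The names `CellRecord`, `record_write`, `mean_write_temperature`, and `sorted_by_temperature` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from statistics import fmean


@dataclass
class CellRecord:
    cell_id: int
    write_temp_c: float  # T1, sampled when the cell was written full


def record_write(log: list, cell_id: int, sensor_temp_c: float) -> None:
    """Store T1 for a cell that has just been written full (metadata stand-in)."""
    log.append(CellRecord(cell_id, sensor_temp_c))


def mean_write_temperature(log: list) -> float:
    """Average write temperature T: sum of all T1 divided by their count."""
    if not log:
        raise ValueError("no write temperatures recorded yet")
    return fmean(r.write_temp_c for r in log)


def sorted_by_temperature(log: list) -> list:
    """T1 records ordered by value, as the text says they are stored."""
    return sorted(log, key=lambda r: r.write_temp_c)
```

For example, three cells written at 20 °C, 40 °C, and 60 °C give an average write temperature T of 40 °C.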
When data are written, the control chip of the storage device (which comprises the control chip and the storage chip) applies ECC protection to them.
ECC is short for "Error Correcting Code"; the term is also expanded as "error checking and correcting" or, for the hardware, "error correction circuit". ECC is a technique for detecting and correcting errors, and ECC protection is the use of this technique to protect stored data: when data are stored, the control chip generally writes the corresponding ECC code into the storage chip alongside them, which allows the data stored in the storage chip to be recovered in hardware. It is a mature data protection and recovery mechanism used in data storage devices.
Step S3: after writing is complete, the real-time ambient temperature T2 of the storage chip is monitored.
Because data may be written continuously or intermittently, and in practice writing, reading, and waiting are interleaved, "writing is complete" here may refer either to a single storage cell or to a storage region that has been defined.
The ambient temperature T2 is sampled once every 1 to 30 seconds. The frequency depends on the working environment of the storage chip and on how often it is read and written: if the temperature of the working environment varies widely and the chip is read and written frequently, its temperature will change more sharply, and the sampling frequency should be correspondingly higher.
Step S4: whether the difference between the ambient temperature T2 and the average write temperature T exceeds the threshold t is judged, and if so, the next step is entered.
Specifically, for each storage cell, the difference between the real-time ambient temperature T2 and the average write temperature T is calculated in real time, and whether this difference exceeds the set threshold t is judged.
As the background section explains, data written to the storage chip are sensitive to temperature: once the difference between the write temperature and the ambient temperature reaches a certain value, namely the threshold t, the risk of data loss becomes very high. The threshold t evidently depends on the type, manufacturing process, and manufacturer of the storage chip.
The use environment also affects the setting of the threshold t. In particular, data differ in importance: for especially important and valuable data, the difference threshold t is set smaller, so that a small temperature change triggers the checking operations of the following steps and the integrity of these valuable data is better protected.
The type of storage chip matters as well. For example, SLC NAND flash is more stable and reliable than MLC NAND flash and resists temperature change better; even when the temperature changes, its data are more stable than those of MLC or TLC. The difference threshold t for a storage chip using SLC flash can therefore be set relatively large.
Likewise, different manufacturing processes and manufacturers lead to different temperature difference thresholds t for storage chips. The applicant suggests a threshold range of 20-80 °C.
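The comparison in step S4 can be sketched as below. This is only an illustration: the text speaks of "the difference" without fixing a sign, so taking the absolute value is an assumption (motivated by the background, where both hot-to-cold and cold-to-hot shifts are harmful), and the function name is hypothetical.

```python
def exceeds_threshold(ambient_t2_c: float, mean_write_t_c: float,
                      threshold_t_c: float) -> bool:
    """True when |T2 - T| is strictly greater than the threshold t.

    Assumption: the difference is taken as an absolute value, and reaching
    the threshold exactly does not count as exceeding it.
    """
    return abs(ambient_t2_c - mean_write_t_c) > threshold_t_c
```

With an average write temperature of 45 °C and a threshold of 40 °C (a value inside the suggested 20-80 °C range), an ambient temperature of -20 °C triggers the next step, while 30 °C does not.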
Step S5: when the difference exceeds the threshold t, whether the ambient temperature T2 has reached the set ideal write temperature is further judged; once T2 reaches the set ideal write temperature, the sum of the differences between the write temperature T1 of every storage cell in a storage region and the ambient temperature T2 is calculated, giving the difference sum of every storage region.
The set value of the ideal write temperature is determined according to factors such as the product specification of the data storage device, the actual use environment, the flash type, and the life cycle. Data written when the ambient temperature has reached the set ideal write temperature are more reliable and more robust, less prone to errors or loss, and better able to tolerate external temperature changes in the subsequent storage environment.
That is, when the ambient temperature T2 reaches the set ideal write temperature, the sum of the differences between the write temperature T1 of every storage cell and the ambient temperature is calculated for each storage region, giving the difference sum of every storage region.
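Step S5 can be sketched as follows. Two points are assumptions rather than statements of the patent: "reaching" the ideal write temperature is modeled as being within a small tolerance of it, and each per-cell difference is taken as an absolute value. All names are hypothetical.

```python
def at_ideal_write_temperature(ambient_t2_c: float, ideal_c: float,
                               tolerance_c: float = 2.0) -> bool:
    """Assumed test for 'T2 has reached the ideal write temperature'."""
    return abs(ambient_t2_c - ideal_c) <= tolerance_c


def region_difference_sums(regions: dict, ambient_t2_c: float) -> dict:
    """Map each region id to the sum of |T1 - T2| over its storage cells.

    `regions` maps a region id to the list of write temperatures T1 of the
    storage cells in that region.
    """
    return {
        region_id: sum(abs(t1 - ambient_t2_c) for t1 in write_temps)
        for region_id, write_temps in regions.items()
    }
```

For instance, with T2 = 25 °C, a region whose cells were written at 60 °C and 70 °C has a difference sum of 80, while a region written at 20 °C and 30 °C has a difference sum of 10.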
Step S6: the storage regions are sorted by difference sum, and the bit error rate check and the data recovery are performed on the data in all storage regions in order from largest to smallest.
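The ordering in step S6 amounts to a descending sort on the per-region difference sums. A minimal sketch, with a hypothetical function name and a plain dictionary standing in for the controller's bookkeeping:

```python
def regions_by_priority(difference_sums: dict) -> list:
    """Region ids ordered from the largest difference sum to the smallest."""
    return sorted(difference_sums, key=difference_sums.get, reverse=True)
```

Regions whose data were written furthest from the current temperature are therefore checked and repaired first.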
The bit error rate check on the data includes the following steps:
Within a storage region, the differences between the ambient temperature and the write temperature T1 of all storage cells are compared and ordered by magnitude; the storage cells are then scanned in order from largest to smallest difference, and the check yields the bit error rate.
Whether the bit error rate of the data reaches the bit error rate threshold is judged, and data recovery is carried out if it does.
Processing then depends on how the bit error rate of the data stored in a storage cell compares with the threshold and with the bit error rate attainable by the hardware ECC error correction capability of the storage chip.
While correcting data, the hardware ECC protection mechanism reports the number of bits it can correct per unit of data (usually between 512 bytes and 4 KB). A relative percentage threshold is then set according to the error correction capability of the hardware ECC; this peak bit error rate is the bit error rate attainable by the hardware ECC error correction capability of the storage chip, and it is referred to as the error correction capability of the hardware ECC. In this embodiment, the error correction capability of the hardware ECC can be set to a bit error rate of 70-85%, according to the life cycle of the storage device and the flash characteristics.
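The check described above can be sketched as follows. The reading of the 70-85% figure as a fraction of the maximum number of bits the hardware ECC can correct per data unit is an interpretation, not something the text states outright; the absolute-value difference and all names are likewise assumptions.

```python
def scan_order(write_temps: dict, ambient_t2_c: float) -> list:
    """Cell ids of one region, ordered from largest |T1 - T2| to smallest."""
    return sorted(write_temps,
                  key=lambda cell_id: abs(write_temps[cell_id] - ambient_t2_c),
                  reverse=True)


def bit_error_rate(error_bits: int, total_bits: int) -> float:
    """Fraction of scanned bits found to be in error."""
    if total_bits <= 0:
        raise ValueError("total_bits must be positive")
    return error_bits / total_bits


def ecc_limit_bits(max_correctable_bits: int, fraction: float = 0.75) -> int:
    """Assumed ECC capability limit: a fraction (0.70-0.85 in the text) of
    the bits the hardware ECC reports it can correct per data unit."""
    return int(max_correctable_bits * fraction)
```

Setting the limit below the hardware maximum leaves headroom, so that data are rewritten while the hardware ECC can still recover them completely.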
The data recovery includes the following steps:
When the data bit error rate does not reach the threshold, the data are reliable, no loss has occurred, and no processing is needed. This threshold is called the data reliability threshold; it is usually determined by the control chip of the storage device, and its value ranges from 10^-5 to 10^-9.
When the data bit error rate reaches the threshold but is below the hardware ECC error correction capability of the storage chip itself, hardware ECC error correction is started, the fully recovered data are written back to a new storage cell, and the write temperature T1 at the time of this new write is recorded.
When the data bit error rate exceeds the hardware ECC error correction capability of the storage chip itself, an additional data recovery mechanism is started to recover the data. The additional data recovery mechanisms include RAID data recovery, re-try, soft-retry data recovery, and so on.
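The three cases above form a simple decision rule, sketched here under stated assumptions: both thresholds are expressed as bit error rates on the same scale, and the boundary case in which the bit error rate equals the ECC capability, which the text leaves open, is sent to the additional recovery path. The names are hypothetical.

```python
from enum import Enum


class Action(Enum):
    NONE = "data reliable, no processing"
    HARDWARE_ECC = "hardware ECC, rewrite to a new cell, record new T1"
    ADDITIONAL = "additional recovery (RAID, re-try, soft-retry)"


def choose_action(ber: float, reliability_threshold: float,
                  ecc_capability: float) -> Action:
    """Pick the recovery path for one storage cell from its bit error rate."""
    if ber < reliability_threshold:
        return Action.NONE
    if ber < ecc_capability:
        return Action.HARDWARE_ECC
    # Assumption: ber == ecc_capability is treated conservatively.
    return Action.ADDITIONAL
```

The numeric values passed in are set by the controller and the flash in use; any values used when trying this sketch are illustrative only.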
RAID stands for Redundant Array of Inexpensive Disks. It is a disk redundancy technique proposed in 1988 by Professor David Patterson and colleagues at the University of California, Berkeley. Since then, disk array technology has developed quickly and gradually matured.
Disk array technology is now widely understood. It is subdivided into RAID levels 0 to 5, and newer levels such as RAID Level 10, 30, and 50 have since been developed. Put simply, the benefits of RAID are high safety, high speed, and very large data capacity. Certain RAID levels can raise speed to 400% of that of a single hard disk drive. A disk array links multiple hard disk drives to work together, which greatly increases speed while raising the reliability of the disk system to nearly error-free. These "fault-tolerant" systems are very fast and highly reliable.
The RAID recovery method uses parity across multiple dies (or storage cells) to recover the data when an entire die (or storage cell) fails or loses its data.
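The parity-based recovery can be illustrated with byte-wise XOR parity, the scheme used by RAID levels such as RAID 5. The patent only says "parity", so XOR is an assumption here, and the function names are hypothetical.

```python
from functools import reduce
from operator import xor


def xor_chunks(chunks: list) -> bytes:
    """Byte-wise XOR of equally sized chunks (one chunk per die)."""
    if not chunks:
        raise ValueError("at least one chunk is required")
    if len({len(c) for c in chunks}) != 1:
        raise ValueError("all chunks must have the same length")
    return bytes(reduce(xor, column) for column in zip(*chunks))


def recover_lost_chunk(surviving: list, parity: bytes) -> bytes:
    """Rebuild the single missing chunk from the survivors and the parity."""
    return xor_chunks(surviving + [parity])
```

Because XOR is its own inverse, XOR-ing the parity with every surviving chunk cancels them out and leaves the lost chunk. This recovers at most one lost chunk per parity group.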
Re-try, i.e. read retry: the voltage distributions of the several data states of MLC or TLC (or SLC) flash may shift, but as long as the distributions do not overlap, the data can be recovered. Read retry attempts to read the data with different reference voltages until they are read out correctly.
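The read retry loop can be sketched as below. The callable `read_at` is a hypothetical stand-in for a controller routine that reads a page at a given reference voltage offset and returns the data only if ECC decoding succeeds; real controllers use vendor-specific retry tables, which are not described in the text.

```python
from typing import Callable, Iterable, Optional


def read_with_retry(read_at: Callable[[int], Optional[bytes]],
                    offsets_mv: Iterable[int]) -> Optional[bytes]:
    """Try each reference voltage offset in turn until a read succeeds.

    Returns the data from the first successful read, or None when every
    offset fails, in which case a further mechanism would be needed.
    """
    for offset in offsets_mv:
        data = read_at(offset)
        if data is not None:
            return data
    return None
```

A caller would typically list the default offset first, so that healthy pages cost only one read.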
Soft-retry reads with soft information: several groups of data are read at different reference voltages and then combined to obtain the final data. This requires a stronger ECC error correction capability, such as LDPC (Low Density Parity Check Code, first proposed by Gallager in his doctoral thesis in the early 1960s).
If the data cannot be recovered by any mechanism, the corresponding data is marked as damaged. A later read of that storage unit returns an error status indicating that the data in that area is unreadable.
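How several reads become soft information can be sketched in simplified form. The sketch assumes a plain per-bit vote with a confidence value; a real controller would pass log-likelihood ratios to an LDPC decoder instead, and the function name `soft_combine` is hypothetical.

```python
def soft_combine(reads):
    """Combine several reads of the same page taken at different reference voltages.

    reads: list of equal-length bit lists.
    Returns (bits, confidences): the majority bit per position and the
    fraction of reads that agree with it.
    """
    n = len(reads)
    ones = [sum(column) for column in zip(*reads)]
    bits = [1 if 2 * count > n else 0 for count in ones]
    confidences = [max(count, n - count) / n for count in ones]
    return bits, confidences


# Three reads of one page; the last two bit positions are unstable.
reads = [
    [0, 1, 1, 1],
    [0, 1, 0, 1],
    [0, 1, 0, 0],
]
bits, confidences = soft_combine(reads)
```

Positions where the reads disagree get a lower confidence, which is the kind of information a soft-decision decoder uses to decide which bits to distrust.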
The action and effect of the present embodiment are as follows. In the method for reliable data reading and writing according to the present embodiment, the write temperature T1 is recorded when data is written to the storage chip, and after the write completes the ambient temperature T2 of the storage chip is monitored in real time. Once the ambient temperature has been collected, it is determined whether the difference between the ambient temperature T2 and the write temperature T1 exceeds the threshold t. If it does, the data stored under that environment may have changed significantly and needs to be checked, so the stored data is checked and its bit error rate is obtained. Depending on the bit error rate, data recovery is carried out promptly using different repair modes. The method provided by the invention, running in the storage device, can therefore promptly obtain the temperature of the storage chip after data is written and compare it in real time with the temperature at write time. When the difference between the write temperature and the ambient temperature exceeds the set threshold, the data stored after the write may have suffered loss; a further check determines the specific bit error rate, and a repair mode is selected according to its magnitude.
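The temperature comparison described above can be sketched as follows. The use of the absolute difference is an assumption (the text only says the difference must exceed t), and the function name and the sample temperatures are hypothetical.

```python
def needs_check(write_temp_t1, ambient_temp_t2, threshold_t):
    """Return True when the T2 - T1 gap is large enough to trigger a data check.

    Assumption: drift in either direction counts, so the absolute difference is used.
    """
    return abs(ambient_temp_t2 - write_temp_t1) > threshold_t


# Illustrative values in degrees Celsius; the patent gives no concrete numbers.
hot_now = needs_check(write_temp_t1=25.0, ambient_temp_t2=70.0, threshold_t=30.0)
mild_now = needs_check(write_temp_t1=25.0, ambient_temp_t2=40.0, threshold_t=30.0)
```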
Further, the write temperature T1 is recorded as follows. The storage chip is first divided into multiple storage regions, each containing multiple storage units. As data is written into a storage unit, the temperature T1 at the moment the unit becomes full is recorded and stored as metadata, which yields the write temperature T1 of each storage unit. Every basic storage unit therefore has its own write temperature held as metadata, so each basic storage unit can be monitored and checked, and recovered when data loss occurs.
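A minimal sketch of recording T1 as per-unit metadata is given below. The `StorageUnit` class, its fields and the `read_temperature` callback are hypothetical; the patent does not describe the metadata layout.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class StorageUnit:
    """One basic storage unit (a block or page) with its write-temperature metadata."""
    capacity: int
    used: int = 0
    write_temp_t1: Optional[float] = None  # set once, when the unit becomes full

    def write(self, nbytes: int, read_temperature: Callable[[], float]) -> int:
        """Accept up to nbytes and record T1 at the moment the unit fills.

        Returns the number of bytes actually accepted.
        """
        accepted = min(nbytes, self.capacity - self.used)
        self.used += accepted
        if accepted and self.used == self.capacity:
            self.write_temp_t1 = read_temperature()
        return accepted


unit = StorageUnit(capacity=4096)
unit.write(3000, lambda: 41.0)       # not full yet, nothing recorded
temp_before_full = unit.write_temp_t1
unit.write(2000, lambda: 43.5)       # fills the unit, so T1 is recorded now
```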
Further, the process of checking the stored data has two operating modes:
Mode one includes the following operation:
The differences between the ambient temperature T2 and the write temperature T1 of all storage units are compared and ordered by magnitude. Storage units with larger differences are scanned and read first, and the scan yields the bit error rate of the data stored in each storage unit.
This mode is suited to the case where the storage chip is busy, that is, while data reads or writes are in progress, because a full-disk scan cannot be carried out then. In that situation the present embodiment sorts the storage units by the size of the difference and checks those with the largest differences first. This is comparable to battlefield triage, in which the severely wounded are treated first. Operating in this way, the present embodiment achieves the best data checking and recovery effect available while the storage chip is busy.
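The prioritisation in mode one can be sketched as a sort on the temperature difference. The mapping from unit identifiers to T1, the identifiers themselves and the use of the absolute difference are assumptions made for illustration.

```python
def scan_order(write_temps, ambient_t2):
    """Order storage units so that the largest |T2 - T1| is scanned first.

    write_temps: dict mapping a unit identifier to its recorded write temperature T1.
    """
    return sorted(
        write_temps,
        key=lambda unit_id: abs(ambient_t2 - write_temps[unit_id]),
        reverse=True,
    )


# Illustrative write temperatures in degrees Celsius.
order = scan_order({"blk0": 30.0, "blk1": 65.0, "blk2": 10.0}, ambient_t2=60.0)
```

A busy controller could then scan only the first few entries of `order` in each idle slot and leave the rest for later.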
Further, the present embodiment handles the data according to how the bit error rate of the data stored in a storage unit compares with the threshold and with the bit error rate that the hardware ECC of the storage chip is able to correct:
When the bit error rate of the data has not reached the threshold, the data is reliable and no loss has occurred, so no processing is needed;
When the bit error rate of the data reaches the threshold but is still within the hardware ECC error correction capability of the storage chip itself, hardware ECC error correction is started, the fully recovered data is written back to a new storage unit, and the write temperature T1 at the time of the rewrite is recorded;
When the bit error rate of the data exceeds the hardware ECC error correction capability of the storage chip itself, an additional data recovery mechanism is started to recover the data. The additional data recovery mechanisms include RAID data recovery, re-try, soft-retry data recovery and the like;
If the data cannot be recovered by any mechanism, the corresponding data is marked as damaged. A later read of that storage unit returns an error status indicating that the data in that area is unreadable.
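The tiers listed above can be sketched as a small decision function. The action names, the numeric limits and the treatment of a bit error rate exactly equal to the ECC limit are assumptions; the patent only fixes the ordering of the cases.

```python
def choose_action(bit_error_rate, check_threshold, ecc_limit):
    """Map a measured bit error rate to one of the handling tiers."""
    if bit_error_rate < check_threshold:
        return "no_action"
    if bit_error_rate <= ecc_limit:  # assumption: the boundary case is still correctable
        return "hardware_ecc_rewrite"
    return "additional_recovery"


def run_additional_recovery(mechanisms):
    """Try each recovery mechanism in turn.

    mechanisms: callables returning recovered data, or None on failure.
    Returns the data, or None when every mechanism fails and the unit
    should be marked as damaged.
    """
    for mechanism in mechanisms:
        data = mechanism()
        if data is not None:
            return data
    return None


# Hypothetical stand-ins for RAID recovery, read retry and soft retry.
result = run_additional_recovery([lambda: None, lambda: b"payload", lambda: None])
damaged = run_additional_recovery([lambda: None, lambda: None]) is None
```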
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system or a computer program product. The present invention may therefore take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage and optical storage) that contain computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of the method, device (system) and computer program product according to embodiments of the invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and any combination of flows and/or blocks, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is executed on the computer or other programmable device to produce computer-implemented processing. The instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
The above embodiment is a preferred example of the invention and is not intended to limit the scope of protection of the invention.

Claims (10)

1. A storage device that corrects data errors using a temperature difference equalization method, having a storage chip, a control chip and an interface for connection to an external data source, wherein a control program is stored in the control chip and, when executed, performs the following steps:
dividing the storage chip into multiple storage regions, each region containing multiple storage units;
as data is written into the storage units, recording the temperature T1 at the moment a storage unit becomes full and storing it as metadata, thereby obtaining the write temperature T1 of each storage unit, and calculating, from the multiple write temperatures recorded by the time the entire storage chip is full, the average overall write temperature T;
after the write completes, monitoring in real time the ambient temperature T2 of the storage chip;
determining whether the difference between the ambient temperature T2 and the average write temperature T exceeds the threshold t and, if so, proceeding to the next step;
when the difference exceeds the threshold t, further determining whether the ambient temperature T2 has reached the set ideal write temperature and, when it has, calculating for each storage region the sum of the differences between the write temperatures T1 of all storage units in that region and the ambient temperature T2, thereby obtaining the difference sums of all storage regions;
ordering the storage regions by the size of their difference sums and checking and correcting the data in all storage regions in turn, from largest to smallest.
2. The storage device that corrects data errors using a temperature difference equalization method according to claim 1, characterized in that:
the storage unit is a memory block or a memory page.
3. The storage device that corrects data errors using a temperature difference equalization method according to claim 1, characterized in that:
the process of checking the data includes:
within a storage region, comparing the differences between the ambient temperature and the write temperature T1 of all storage units, ordering them by magnitude, and scanning the storage units in turn from largest to smallest, the check yielding the bit error rate.
4. The storage device that corrects data errors using a temperature difference equalization method according to claim 3, characterized in that:
the data error correction process includes:
when the bit error rate of the data reaches the threshold but is still within the hardware ECC error correction capability of the storage chip itself, starting hardware ECC error correction, writing the fully recovered data back to a new storage unit, and recording the write temperature T1 at the time of the write;
when the bit error rate of the data exceeds the hardware ECC error correction capability of the storage chip itself, starting an additional data recovery mechanism to recover and correct the data.
5. The storage device that corrects data errors using a temperature difference equalization method according to claim 4, characterized in that:
the hardware ECC error correction capability of the storage chip itself is set to a bit error rate of 70-85%.
6. The storage device that corrects data errors using a temperature difference equalization method according to claim 4, characterized in that:
the additional data recovery mechanism includes RAID data recovery.
7. The storage device that corrects data errors using a temperature difference equalization method according to claim 1, characterized in that:
the set value of the ideal write temperature is determined according to factors such as the product specification of the data storage device, the actual usage environment, the flash memory type and the life cycle.
8. The storage device that corrects data errors using a temperature difference equalization method according to claim 1, characterized in that:
the write temperatures T1 of all storage units of a storage region are stored, ordered by value, in a specific storage area.
9. The storage device that corrects data errors using a temperature difference equalization method according to claim 1, characterized in that:
the storage device is an SSD (solid state drive).
10. The storage device that corrects data errors using a temperature difference equalization method according to claim 9, characterized in that:
the storage chip of the SSD is a storage chip with SLC, MLC, TLC or QLC flash memory.
CN201811081439.6A 2018-09-17 2018-09-17 Storage device for correcting data error by using temperature difference equalization method Active CN109407972B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811081439.6A CN109407972B (en) 2018-09-17 2018-09-17 Storage device for correcting data error by using temperature difference equalization method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811081439.6A CN109407972B (en) 2018-09-17 2018-09-17 Storage device for correcting data error by using temperature difference equalization method

Publications (2)

Publication Number Publication Date
CN109407972A true CN109407972A (en) 2019-03-01
CN109407972B CN109407972B (en) 2021-08-06

Family

ID=65464211

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811081439.6A Active CN109407972B (en) 2018-09-17 2018-09-17 Storage device for correcting data error by using temperature difference equalization method

Country Status (1)

Country Link
CN (1) CN109407972B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1677563A (en) * 2004-03-30 2005-10-05 尔必达存储器股份有限公司 Semiconductor device and testing method for same
US20120144270A1 (en) * 2007-09-06 2012-06-07 Western Digital Technologies, Inc. Storage subsystem capable of adjusting ecc settings based on monitored conditions
CN105009088A (en) * 2012-12-03 2015-10-28 西部数据技术公司 Methods, solid state drive controllers and data storage devices having a runtime variable RAID protection scheme
US20170024163A1 (en) * 2015-07-24 2017-01-26 Sk Hynix Memory Solutions Inc. Data temperature profiling by smart counter
TW201732821A (en) * 2016-03-02 2017-09-16 群聯電子股份有限公司 Data transmitting method, memory control circuit unit and memory storage device
CN108052414A (en) * 2017-12-28 2018-05-18 湖南国科微电子股份有限公司 A kind of method and system for promoting SSD operating temperature ranges

Also Published As

Publication number Publication date
CN109407972B (en) 2021-08-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Storage device for data error correction using temperature difference equalization method

Effective date of registration: 20220120

Granted publication date: 20210806

Pledgee: Wuhan area branch of Hubei pilot free trade zone of Bank of China Ltd.

Pledgor: EXASCEND TECHNOLOGY (WUHAN) CO.,LTD.

Registration number: Y2022420000020

CP03 Change of name, title or address

Address after: 430000 west of 2-3 / F, No.2 factory building, Guannan Industrial Park, No.1 Gaoxin 2nd Road, Wuhan Donghu New Technology Development Zone, Wuhan City, Hubei Province

Patentee after: Zhiyu Technology Co.,Ltd.

Address before: 430223 west of building 2-3, Guannan Industrial Park, No. 1, Gaoxin 2nd Road, Wuhan East Lake New Technology Development Zone, Wuhan, Hubei Province

Patentee before: EXASCEND TECHNOLOGY (WUHAN) CO.,LTD.
