CN109375869A - Method, system, and storage medium for reliable data reading and writing - Google Patents

Method, system, and storage medium for reliable data reading and writing

Info

Publication number
CN109375869A
Authority
CN
China
Prior art keywords
data
write
storage
temperature
error rate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811082866.6A
Other languages
Chinese (zh)
Inventor
弗兰克·陈
颜巍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
To Reputation Technology (wuhan) Co Ltd
Original Assignee
To Reputation Technology (wuhan) Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by To Reputation Technology (wuhan) Co Ltd filed Critical To Reputation Technology (wuhan) Co Ltd
Priority to CN201811082866.6A priority Critical patent/CN109375869A/en
Publication of CN109375869A publication Critical patent/CN109375869A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614 Improving the reliability of storage systems
    • G06F3/0619 Improving the reliability of storage systems in relation to data integrity, e.g. data losses, bit errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1008 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices
    • G06F11/1068 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices in sector programmable memories, e.g. flash disk
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3058 Monitoring arrangements for monitoring environmental properties or parameters of the computing system or of the computing system component, e.g. monitoring of power, currents, temperature, humidity, position, vibrations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0673 Single storage device
    • G06F3/0679 Non-volatile semiconductor memory device, e.g. flash memory, one time programmable memory [OTP]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0683 Plurality of storage devices
    • G06F3/0689 Disk arrays, e.g. RAID, JBOD

Abstract

The present invention provides a method, a system, and a storage medium for reliable data reading and writing. It relates to data storage devices and methods and belongs to the field of data storage methods. The method comprises the following steps: when data is written to a storage chip, recording the write temperature T1; after the write completes, monitoring in real time the ambient temperature T2 of the storage chip; determining whether the difference between the ambient temperature T2 and the write temperature T1 exceeds a threshold t and, if so, proceeding to the next step; when the difference exceeds the threshold t, checking the stored data and obtaining its bit error rate; and determining whether the bit error rate reaches a bit error rate threshold and, if it does, performing data recovery. A corresponding system and storage medium are also provided.

Description

Method, system, and storage medium for reliable data reading and writing
Technical field
The present invention relates to data storage devices and methods, and belongs to the field of data storage methods.
Background
As flash memory moves from one generation to the next, the data stored in it becomes increasingly sensitive to temperature changes. For example, data written at a high temperature is much more likely to be read back with errors at a low temperature, and vice versa.
However, more and more flash-based devices, such as SSDs and storage arrays, are used in industrial settings where the ambient temperature varies widely, from -40 °C to +85 °C. In a working environment with such large temperature swings, the data written to flash changes as the ambient temperature changes; that is, the 0 and 1 states held in the storage chip flip. Once enough of them have changed, the stored data and files are damaged or lost.
The prior art offers no good solution. The usual industry practice is to keep backups, that is, to set up servers or a storage center in a dedicated constant-temperature environment to back up and preserve critical data. This approach clearly does not suit all users, and is especially unsuitable for individual consumers and small and medium-sized enterprises.
Summary of the invention
The present invention addresses the problem that, under the wide temperature differences described above, the prior art cannot read and write data reliably. Its purpose is to provide a method and a system for reliable data reading and writing; applied to a flash-based storage device, the method solves the problem of reliable data reading and writing across wide temperature differences.
The present invention provides a method for reliable data reading and writing, comprising the following steps:
when data is written to a storage chip, recording the write temperature T1 at the time the write completes;
monitoring in real time the ambient temperature T2 of the storage chip;
determining whether the difference between the ambient temperature T2 and the write temperature T1 exceeds a threshold t;
when the difference exceeds the threshold t, checking the stored data and obtaining its bit error rate;
determining whether the bit error rate of the data reaches a bit error rate threshold and, if it does, performing data recovery.
The method for reliable data reading and writing provided by the present invention may also have the following feature, wherein the process of recording the write temperature T1 comprises:
dividing the storage chip into multiple storage regions, each storage region comprising multiple storage units;
when a storage region has been completely written, recording the temperature T1 at that moment and writing it as metadata to a specific area of the storage chip, thereby obtaining the write temperature T1 corresponding to each storage unit.
The method for reliable data reading and writing provided by the present invention may also have the following feature, wherein the storage unit is a memory block or a memory page.
The method for reliable data reading and writing provided by the present invention may also have the following feature, wherein the ambient temperature T2 in step 2 is monitored in real time at a sampling rate of once every 1 to 30 seconds.
The method for reliable data reading and writing provided by the present invention may also have the following feature, wherein the process of checking the stored data comprises:
comparing, for all storage units, the difference between the ambient temperature T2 and the write temperature T1, ordering the units by the size of the difference, and scanning the data of the storage units with the largest differences first.
The method for reliable data reading and writing provided by the present invention may also have the following feature, wherein the process of checking the stored data comprises:
when no data is being written, scanning the data of all storage units of the entire storage chip.
The method for reliable data reading and writing provided by the present invention may also have the following feature, wherein, when the bit error rate of the data reaches the threshold but is below the hardware ECC error-correction capability of the storage chip itself, hardware ECC error correction is started, the fully recovered data is written back to a new storage unit, and the write temperature T1 at the time of that write is recorded;
when the bit error rate of the data exceeds the hardware ECC error-correction capability of the storage chip itself, an additional data recovery mechanism is started to recover the data.
The method for reliable data reading and writing provided by the present invention may also have the following feature, wherein the hardware ECC error-correction capability of the storage chip itself is set to a bit error rate of 70-85%.
The method for reliable data reading and writing provided by the present invention may also have the following feature, wherein the additional data recovery mechanism includes RAID data recovery.
The present invention also provides a system for reliable data reading and writing, comprising:
a write temperature recording module, configured to record the write temperature T1 when data is written to a storage chip;
an ambient temperature detection module, configured to monitor in real time the ambient temperature T2 of the storage chip after the write completes;
a determination module, configured to determine whether the difference between the ambient temperature T2 and the write temperature T1 exceeds a threshold t;
a check module, configured to check the stored data and obtain its bit error rate when the difference exceeds the threshold t;
a data recovery module, configured to determine whether the bit error rate of the data reaches a bit error rate threshold and, if it does, to perform data recovery.
The present invention also provides a storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the above method for reliable data reading and writing.
Function and effect of the invention: in the method for reliable data reading and writing according to the present invention, the write temperature T1 is recorded when data is written to the storage chip, and the ambient temperature T2 of the storage chip is monitored in real time after the write completes. Once the ambient temperature has been collected, the method determines whether the difference between T2 and T1 exceeds the threshold t; if it does, the data stored in that environment may have changed significantly and needs to be checked. When the difference exceeds the threshold t, the stored data is checked and its bit error rate obtained, and a repair mode chosen according to the bit error rate is applied promptly to recover the data. Running in a storage device, the method can therefore obtain the temperature of the storage chip after data is written, compare it in real time with the temperature at write time, treat a difference beyond the set threshold as a sign that the stored data may have been lost, check further to learn the actual bit error rate, and select a repair mode according to its size.
Brief description of the drawings
Fig. 1 is a schematic diagram of the steps of the method for reliable data reading and writing in an embodiment of the present invention;
Fig. 2 is a schematic diagram of the process of recording the write temperature; and
Fig. 3 is a block diagram of the modules of the system for reliable data reading and writing provided in an embodiment of the present invention.
Detailed description of the embodiments
To make the technical means, inventive features, objectives, and effects of the present invention easy to understand, the following embodiments describe the method, the system, and the storage medium for reliable data reading and writing in detail with reference to the drawings.
Embodiment 1
Fig. 1 is a schematic diagram of the steps of the method for reliable data reading and writing in an embodiment of the present invention.
As shown in Fig. 1, the method for reliable data reading and writing provided in this embodiment comprises the following steps:
Step S1: when data is written to the storage chip, record the write temperature T1 at the time the write completes.
In this embodiment, the storage chip is a NAND flash chip, specifically one built from SLC, MLC, TLC, or QLC flash dies. In principle other types of storage chip can also be used; for example, NOR flash, ROM, PROM, EPROM, EEPROM, Flash ROM, FRAM, MRAM, RRAM, and PCRAM can all serve as the storage chip of the present invention.
SLC (Single-Level Cell) stores 1 bit per cell. It is fast and long-lived but very expensive (about three times the price of MLC or more), with an endurance of about 100,000 program/erase cycles.
MLC (Multi-Level Cell) stores 2 bits per cell. Its speed, lifetime, and price are all average, with an endurance of about 3,000 to 10,000 program/erase cycles.
TLC (Trinary-Level Cell) stores 3 bits per cell; some flash manufacturers also call it 8LC. It is relatively slow, relatively short-lived, and cheap, with an endurance of about 500 program/erase cycles.
QLC (Quad-Level Cell) stores 4 bits per cell and supports 16 charge levels. It is the slowest and has the shortest lifetime.
Of the first three types of NAND flash chip, SLC in brief has the best performance and a very high price, and is typically used in enterprise products or by high-end enthusiasts. MLC offers adequate performance at moderate cost and is the mainstream choice for consumer SSDs. TLC has the lowest overall performance and the lowest price, although a high-performance controller and controller algorithms can compensate for and improve TLC performance.
Fig. 2 is a schematic diagram of the process of recording the write temperature.
Specifically, the process of recording the write temperature T1 comprises:
Step S1-1: divide the storage chip into multiple storage regions, each comprising multiple storage units.
Here the storage unit is a memory block or a memory page. In general, the capacity of one basic storage unit is 16 KB; the exact figure varies with the manufacturer of the flash dies.
Step S1-2: when a storage region has been completely written, record the temperature T1 at that moment and write it as metadata to a specific area of the storage chip, thereby obtaining the write temperature T1 for each storage unit. The specific area is a region reserved exclusively for storing this data.
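Steps S1-1 and S1-2 can be illustrated with a minimal Python sketch. It is not the patented implementation: the class names, the dictionary standing in for the reserved metadata area, and the sensor readings are all hypothetical, and the sketch only shows the bookkeeping of recording one T1 per region when that region fills up.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """A storage region holding a fixed number of storage units (hypothetical model)."""
    region_id: int
    units_per_region: int
    written_units: int = 0


@dataclass
class TemperatureMetadata:
    """Stand-in for the reserved metadata area: region id -> write temperature T1."""
    write_temp_by_region: dict = field(default_factory=dict)

    def write_temp_for_unit(self, unit_index: int, units_per_region: int):
        # Every unit inherits the T1 recorded for the region that contains it.
        return self.write_temp_by_region.get(unit_index // units_per_region)


def write_unit(region: Region, metadata: TemperatureMetadata, current_temp_c: float) -> None:
    """Account for one written unit; record T1 once the region becomes full."""
    if region.written_units >= region.units_per_region:
        raise ValueError("region is already full")
    region.written_units += 1
    if region.written_units == region.units_per_region:
        metadata.write_temp_by_region[region.region_id] = current_temp_c


meta = TemperatureMetadata()
region0 = Region(region_id=0, units_per_region=4)
for temp in (61.0, 62.5, 63.0, 64.0):  # hypothetical sensor readings during the writes
    write_unit(region0, meta, temp)
print(meta.write_temp_for_unit(2, 4))  # unit 2 lies in region 0
```

Recording one temperature per region rather than per unit keeps the metadata small; the trade-off is that all units in a region share the temperature observed when the region filled.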
When data is written, it is ECC-protected by the control chip of the storage device (which comprises a control chip and a storage chip).
ECC is short for "Error Correcting Code"; the term is also expanded as "error checking and correcting" or "error correction circuit". It is a technique for checking and correcting errors, and ECC protection means applying that technique to protect stored data: in general, when data is stored, the control chip writes the corresponding ECC code into the storage chip alongside it, which allows the data in the storage chip to be recovered in hardware. ECC is a mature data protection and recovery mechanism used in data storage devices.
Step S2: after the write completes, monitor in real time the ambient temperature T2 of the storage chip.
Because data writes may be continuous or intermittent, and in practice the write, read, and wait processes are interleaved, "write completion" here may refer either to a single storage unit or to one of the defined storage regions.
The ambient temperature T2 is sampled once every 1 to 30 seconds. The rate depends on the working environment of the storage chip and on how often it is read and written: if the temperature of the working environment varies widely and the storage chip is read and written frequently, the temperature of the storage chip will change relatively sharply, and the sampling rate should be correspondingly higher.
Step S3: determine whether the difference between the ambient temperature T2 and the write temperature T1 exceeds the threshold t; if it does, proceed to step S4.
Specifically, for each storage unit, the difference between the real-time ambient temperature T2 and the write temperature T1 at which that unit's data was written is calculated in real time, and it is determined whether the difference exceeds the set threshold t.
As the background section explains, data written to a storage chip is sensitive to temperature. If the difference between the write temperature and the ambient temperature reaches a certain value, namely the threshold t, the risk of data loss becomes very high. The threshold t clearly depends on the type, manufacturing process, and manufacturer of the storage chip.
The usage environment also affects the setting of the threshold t. Data differs in importance: for especially important and valuable data, the threshold t is set smaller, so that even a small temperature change triggers the check in step S4 and the integrity of the valuable data is better protected.
The type of storage chip matters as well. For example, SLC NAND flash is more stable and reliable than MLC NAND flash and more resistant to temperature variation; even when the temperature changes, its data remains more stable than that of MLC or TLC. The threshold t for a storage chip built from SLC flash can therefore be set relatively large.
Likewise, different manufacturing processes and manufacturers lead to different thresholds t for the temperature difference. The applicant suggests a threshold in the range of 20-80 °C.
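The comparison in step S3 can be sketched as follows. This is only an illustration under stated assumptions: the function names and sample temperatures are hypothetical, and taking the absolute value of T2 - T1 is an assumption made here because the background describes errors in both directions (written hot and read cold, or the reverse).

```python
def exceeds_threshold(write_temp_c: float, ambient_temp_c: float, threshold_c: float) -> bool:
    """Step S3 for one storage unit: is |T2 - T1| greater than t? (absolute value assumed)"""
    if threshold_c < 0:
        raise ValueError("threshold must be non-negative")
    return abs(ambient_temp_c - write_temp_c) > threshold_c


def units_to_check(write_temps: dict, ambient_temp_c: float, threshold_c: float) -> list:
    """Return the ids of all storage units whose temperature difference exceeds t."""
    return [unit for unit, t1 in write_temps.items()
            if exceeds_threshold(t1, ambient_temp_c, threshold_c)]


# Hypothetical write temperatures T1 (degrees C) for four storage units.
write_temps = {0: 70.0, 1: 25.0, 2: -20.0, 3: 40.0}

# A larger threshold (for example for SLC) flags fewer units than a smaller one
# (for example for especially valuable data).
print(units_to_check(write_temps, ambient_temp_c=-5.0, threshold_c=40.0))
print(units_to_check(write_temps, ambient_temp_c=-5.0, threshold_c=20.0))
```

Both thresholds used in the example fall inside the 20-80 °C range suggested by the applicant.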
Step S4: when the difference exceeds the threshold t, check the stored data and obtain its bit error rate.
The stored data can be checked in two modes.
Mode one comprises the following operations:
Compare, for all storage units, the difference between the ambient temperature T2 and the write temperature T1, order the units by the size of the difference, and scan and read the data of the storage units with the largest differences first; the scan yields the bit error rate of the data stored in each such unit.
This mode suits the case where the storage chip is busy, that is, while data is being read or written, because a full scan of the device cannot be performed then. In this situation the embodiment sorts the units by temperature difference and checks those with the largest difference first, much like triage on a battlefield, where the most seriously wounded are treated first. Operating this way, the embodiment achieves the best data checking and recovery results available while the storage chip is busy.
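The prioritisation in mode one amounts to a descending sort on the temperature difference. The sketch below is a hypothetical illustration: the function name, the sample temperatures, the use of the absolute difference, and the tie-breaking by unit id are all assumptions, not details given in the source.

```python
def scan_order(write_temps: dict, ambient_temp_c: float, threshold_c: float) -> list:
    """Order the storage units that need checking, largest temperature difference first."""
    candidates = []
    for unit, t1 in write_temps.items():
        diff = abs(ambient_temp_c - t1)   # absolute difference is an assumption
        if diff > threshold_c:            # only units past the threshold are scanned
            candidates.append((diff, unit))
    # Sort by descending difference; ties are broken by unit id for determinism.
    candidates.sort(key=lambda item: (-item[0], item[1]))
    return [unit for _, unit in candidates]


# Hypothetical write temperatures T1 (degrees C) per storage unit.
write_temps = {0: 25.0, 1: 80.0, 2: -30.0, 3: 10.0}
print(scan_order(write_temps, ambient_temp_c=-10.0, threshold_c=30.0))
```

In this example units 2 and 3 differ from the ambient temperature by only 20 °C and are left out, while unit 1 (90 °C difference) is scanned before unit 0 (35 °C difference).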
Mode two comprises the following operation:
When no data is being written, scan the data of all storage units of the entire storage chip.
When the storage chip as a whole is not being read or written, a full scan can be started directly to obtain the bit error rate of the data in every storage unit.
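One way to picture how a scan yields a bit error rate is to count the bits that differ between the raw data read from a unit and the same data after error correction, then divide by the total number of bits. The sketch below is an assumption made for illustration; the source does not specify how the bit error rate is computed, and the function names and sample bytes are hypothetical.

```python
def count_bit_errors(raw: bytes, corrected: bytes) -> int:
    """Count the bit positions at which the raw read differs from the corrected data."""
    if len(raw) != len(corrected):
        raise ValueError("buffers must have the same length")
    return sum(bin(a ^ b).count("1") for a, b in zip(raw, corrected))


def bit_error_rate(raw: bytes, corrected: bytes) -> float:
    """Bit error rate = flipped bits / total bits in the storage unit."""
    total_bits = len(raw) * 8
    if total_bits == 0:
        raise ValueError("cannot compute a bit error rate for empty data")
    return count_bit_errors(raw, corrected) / total_bits


# Hypothetical 2-byte unit: two bits of the first byte have flipped.
raw_read = bytes([0b10110000, 0xFF])
after_ecc = bytes([0b10100001, 0xFF])
print(bit_error_rate(raw_read, after_ecc))
```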
Step S5: determine whether the bit error rate of the data reaches the bit error rate threshold and, if it does, perform data recovery.
The handling depends on how the bit error rate of the data stored in a storage unit compares with the threshold and with the bit error rate that the hardware ECC error-correction capability of the storage chip can handle:
During error correction, the hardware ECC mechanism reports the number of bits it was able to correct in each unit of data (typically between 512 bytes and 4 KB). A relative percentage threshold is then set according to the error-correction capability of the hardware ECC; this threshold is the peak bit error rate, that is, the bit error rate that the hardware ECC error-correction capability of the storage chip can handle, and it is referred to as the error-correction capability of the hardware ECC. In this embodiment, depending on the life cycle of the storage device and the flash characteristics, the error-correction capability of the hardware ECC can be set to a bit error rate in the range of 70-85%.
When the bit error rate of the data has not reached the threshold, the data is reliable, no loss has occurred, and no action is needed. This threshold is called the data reliability threshold; it is usually determined by the control chip of the storage device and ranges from 10^-5 to 10^-9.
When the bit error rate of the data reaches the threshold but is below the hardware ECC error-correction capability of the storage chip itself, hardware ECC error correction is started, the fully recovered data is written back to a new storage unit, and the write temperature T1 at the time of that write is recorded again.
When the bit error rate of the data exceeds the hardware ECC error-correction capability of the storage chip itself, an additional data recovery mechanism is started to recover the data. The additional data recovery mechanism includes RAID data recovery, read retry (re-try), soft retry (soft-retry), and the like.
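The three cases of step S5 can be summarised as a small decision function. This is a sketch under stated assumptions rather than the patented logic: the enum, the function name, and the numeric values are hypothetical, and treating a bit error rate exactly equal to the ECC capability as still correctable is a choice made here because the source does not address that boundary.

```python
from enum import Enum


class Action(Enum):
    NONE = "data is reliable, no action"
    ECC_REWRITE = "hardware ECC correction, rewrite to a new unit, record new T1"
    EXTRA_RECOVERY = "additional recovery (RAID, read retry, soft retry)"


def choose_action(ber: float, reliability_threshold: float, ecc_capability: float) -> Action:
    """Map a measured bit error rate to one of the three cases of step S5."""
    if reliability_threshold > ecc_capability:
        raise ValueError("reliability threshold must not exceed the ECC capability")
    if ber < reliability_threshold:
        return Action.NONE
    if ber <= ecc_capability:   # equality treated as correctable (assumption)
        return Action.ECC_REWRITE
    return Action.EXTRA_RECOVERY


# Hypothetical limits: 1e-5 lies in the 10^-5 to 10^-9 range given for the
# data reliability threshold; the ECC capability value is purely illustrative.
RELIABILITY = 1e-5
ECC_LIMIT = 1e-3

for ber in (1e-7, 2e-4, 5e-2):
    print(ber, choose_action(ber, RELIABILITY, ECC_LIMIT).name)
```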
RAID stands for Redundant Array of Inexpensive Disks. It is a disk redundancy technique proposed in 1988 by Professor David Patterson and colleagues at the University of California, Berkeley. Disk array technology has developed rapidly since then and has gradually matured.
Disk array technology is now widely recognized. It is divided into levels RAID 0 through RAID 5, and newer levels such as RAID 10, 30, and 50 have since been developed. Put simply, the benefits of RAID are high safety, high speed, and very large capacity. Some RAID levels can raise speed to 400% of that of a single hard disk drive. A disk array connects multiple hard disk drives so that they work together, which greatly increases speed while raising the reliability of the disk system to a nearly error-free level. Such fault-tolerant systems are both very fast and highly reliable.
The RAID recovery method uses parity across multiple dies (or storage units) to recover data, and can survive the failure of, or loss of data on, an entire die.
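Parity-based recovery across dies can be illustrated with simple XOR parity, as used in RAID 5 style schemes. The sketch below assumes single XOR parity over equally sized chunks; the source does not say which parity scheme is used, and the function names and sample data are hypothetical.

```python
def xor_chunks(chunks) -> bytes:
    """XOR equally sized byte chunks together, byte by byte."""
    chunks = list(chunks)
    if not chunks:
        raise ValueError("at least one chunk is required")
    length = len(chunks[0])
    if any(len(chunk) != length for chunk in chunks):
        raise ValueError("all chunks must have the same length")
    result = bytearray(length)
    for chunk in chunks:
        for i, value in enumerate(chunk):
            result[i] ^= value
    return bytes(result)


def make_parity(data_chunks) -> bytes:
    """Parity chunk stored alongside the data (one per stripe)."""
    return xor_chunks(data_chunks)


def rebuild_missing(surviving_chunks, parity: bytes) -> bytes:
    """Recover the single lost chunk from the survivors and the parity."""
    return xor_chunks(list(surviving_chunks) + [parity])


# Hypothetical stripe spread over three dies.
dies = [b"\x01\x02", b"\x10\x20", b"\xaa\x55"]
parity = make_parity(dies)

# Suppose die 1 is lost entirely; rebuild it from dies 0 and 2 plus the parity.
recovered = rebuild_missing([dies[0], dies[2]], parity)
print(recovered == dies[1])
```

Single XOR parity can rebuild only one lost chunk per stripe; tolerating more simultaneous failures would need a stronger code.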
Read retry (re-try): the voltage distributions of the data states in MLC, TLC, or SLC flash may shift, but as long as the distributions do not overlap the data can be recovered. Read retry attempts to read the data with different reference voltages until the read succeeds.
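The read retry loop can be sketched as trying a list of reference-voltage offsets in turn until one read decodes. Everything concrete here is hypothetical: the callback, the offsets in millivolts, and the simulated page that decodes only within a shifted window are stand-ins, since real retry tables are specific to each flash vendor.

```python
def read_with_retry(read_page, reference_offsets_mv):
    """Try each reference-voltage offset in order; return (data, offset) on first success.

    read_page is a hypothetical callback that returns the decoded bytes, or None
    when the read at that offset cannot be corrected.
    """
    for offset in reference_offsets_mv:
        data = read_page(offset)
        if data is not None:
            return data, offset
    return None, None  # every offset failed; fall back to another recovery mechanism


def simulated_page(offset_mv):
    """Simulated page whose voltage distribution has shifted downward (illustrative only)."""
    return b"payload" if -60 <= offset_mv <= -20 else None


# Start at the default reference voltage, then widen the search in both directions.
offsets = [0, 20, -20, 40, -40]
print(read_with_retry(simulated_page, offsets))
```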
Soft retry (soft-retry) reads with soft information: several groups of data are read at different reference voltages and combined to obtain the final data. This requires more powerful ECC, such as LDPC (Low-Density Parity-Check code), first proposed by Gallager in his doctoral thesis in the early 1960s.
If the data cannot be recovered by any mechanism, it is marked as damaged; reading the data of that storage unit then returns an error status indicating that the area is unreadable.
Embodiment 2
Fig. 3 is a block diagram of the modules of the system for reliable data reading and writing provided in an embodiment of the present invention.
This embodiment provides a system for reliable data reading and writing, comprising a write temperature recording module 101, an ambient temperature detection module 102, a determination module 103, a check module 104, and a data recovery module 105.
The write temperature recording module 101 is configured to record, when data is written to the storage chip, the write temperature T1 at the time the write completes.
The ambient temperature detection module 102 is configured to monitor in real time the ambient temperature T2 of the storage chip.
The determination module 103 is configured to determine whether the difference between the ambient temperature T2 and the write temperature T1 exceeds the threshold t.
The check module 104 is configured to check the stored data and obtain its bit error rate when the difference exceeds the threshold t.
The data recovery module 105 is configured to determine whether the bit error rate of the data reaches the bit error rate threshold and, if it does, to perform data recovery.
Specifically, as optimizations:
Preferably, the process of recording the write temperature T1 comprises:
dividing the storage chip into multiple storage regions, each storage region comprising multiple storage units;
when a storage region has been completely written, recording the temperature T1 at that moment and writing it as metadata to a specific area of the storage chip, thereby obtaining the write temperature T1 corresponding to each storage unit.
Preferably, the process of checking the stored data comprises:
comparing, for all storage units, the difference between the ambient temperature T2 and the write temperature T1, ordering the units by the size of the difference, and scanning the data of the storage units with the largest differences first; or
the process of checking the stored data comprises:
when no data is being written, scanning the data of all storage units of the entire storage chip.
Preferably, when the bit error rate of the data reaches the threshold but is below the hardware ECC error-correction capability of the storage chip itself, hardware ECC error correction is started, the fully recovered data is written back to a new storage unit, and the write temperature T1 at the time of that write is recorded.
Preferably, when the bit error rate of the data exceeds the hardware ECC error-correction capability of the storage chip itself, an additional data recovery mechanism is started to recover the data.
Preferably, the hardware ECC error-correction capability of the storage chip itself is set to a bit error rate of 70-85%.
Preferably, the additional data recovery mechanism includes RAID data recovery.
Embodiment 3
This embodiment provides a storage medium on which a computer program is stored; when executed by a processor, the computer program implements the following method for reliable data reading and writing:
Step S1: when data is written to the storage chip, record the write temperature T1 at the time the write completes.
Specifically, step S1 further includes:
Step S1-1: divide the storage chip into multiple storage regions, each region containing multiple storage cells.
Step S1-2: when a storage region is fully written, record the temperature T1 at that moment and write it as metadata to a specific area of the storage chip, thereby obtaining the write temperature T1 corresponding to each storage cell.
Step S2: after the write completes, monitor in real time the ambient temperature T2 of the storage chip.
Step S3: determine whether the difference between the ambient temperature T2 and the write temperature T1 exceeds the threshold t; if it does, proceed to step S4.
Step S4: when the difference exceeds the threshold t, check the stored data and obtain its bit error rate.
Step S5: determine whether the bit error rate of the data reaches the bit error rate threshold; if it does, perform data recovery.
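Steps S3 to S5 can be sketched for a single storage cell as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name check_cell, the callbacks measure_ber and recover, and the use of the absolute value of the temperature difference are all hypothetical choices made for this sketch. "Exceeds" is read as a strict comparison and "reaches" as greater-than-or-equal.

```python
def check_cell(t1, t2, temp_threshold, measure_ber, ber_threshold, recover):
    """Run steps S3-S5 for one storage cell and return a short status string.

    t1: write temperature recorded for the cell (step S1).
    t2: current ambient temperature of the storage chip (step S2).
    measure_ber / recover: hypothetical callbacks supplied by the caller.
    """
    # S3: compare the ambient temperature with the recorded write temperature.
    # Using the absolute difference is an assumption of this sketch.
    if abs(t2 - t1) <= temp_threshold:
        return "skipped"

    # S4: the difference exceeds the threshold, so check the stored data
    # and obtain its bit error rate.
    ber = measure_ber()

    # S5: recover only when the bit error rate reaches the threshold.
    if ber >= ber_threshold:
        recover()
        return "recovered"
    return "ok"
```

Passing the measurement and recovery routines as callbacks keeps the decision logic separate from any device-specific read or repair code, which the source does not specify.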
Clearly, the storage medium may be a CD, flash disk, magnetic disk, floppy disk, DVD, hard disk, flash memory, CF card, SD card, MMC card, SM card, Memory Stick, xD card, magnetic tape, magneto-optical disk, or the like. The computer program corresponding to the above method is stored or burned onto the storage medium; after obtaining the storage medium, a user can install or run the program to execute the method of Embodiment 1 of the present invention on the corresponding storage device.
The action and effect of this embodiment are as follows. In the method for reliable data reading and writing according to this embodiment, the write temperature T1 is recorded when data is written to the storage chip, and after the write completes the ambient temperature T2 of the storage chip is monitored in real time. Once the ambient temperature has been collected, the method determines whether the difference between the ambient temperature T2 and the write temperature T1 exceeds the threshold t. If it does, the data stored under those conditions may have changed significantly and needs to be checked, so the stored data is checked and its bit error rate is obtained. Data recovery is then carried out promptly, using different repair methods for different bit error rates. Running in a storage device, the method provided by the invention can therefore obtain the temperature of the storage chip promptly after data is written and compare it in real time with the temperature at write time. When the difference between the write temperature and the ambient temperature exceeds the set threshold, the stored data may have suffered loss; a further check determines the actual bit error rate, and a repair method is selected according to its magnitude.
Further, the write temperature T1 is recorded by first dividing the storage chip into multiple storage regions, each containing multiple storage cells. As data is written into the storage cells, the temperature T1 at the moment a storage cell is fully written is recorded and stored as metadata, yielding the write temperature T1 corresponding to each storage cell. Each basic storage cell therefore has a write temperature associated with it as metadata, so that every basic storage cell can be monitored and checked, and recovered when data loss occurs.
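The bookkeeping of write temperatures can be illustrated with a small in-memory sketch. It follows the wording of step S1-2, in which T1 is captured once a whole storage region is fully written and then attributed to every storage cell in that region. The class names Region and WriteTemperatureLog, the read_temperature callback, and the use of a Python dictionary in place of on-chip metadata are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """A storage region made up of several storage cells (blocks or pages)."""
    cell_ids: list
    written: set = field(default_factory=set)

    def is_full(self):
        # The region is full once every one of its cells has been written.
        return len(self.written) == len(self.cell_ids)


class WriteTemperatureLog:
    """Keeps the write temperature T1 per storage cell (hypothetical helper)."""

    def __init__(self, regions, read_temperature):
        self.regions = regions
        self.read_temperature = read_temperature  # hypothetical sensor callback
        self.t1_by_cell = {}  # stands in for metadata stored on the chip
        self._region_of = {c: r for r in regions for c in r.cell_ids}

    def on_cell_written(self, cell_id):
        region = self._region_of[cell_id]
        region.written.add(cell_id)
        if region.is_full():
            # Record the temperature at the moment the region fills and
            # attribute it to every cell in that region.
            t1 = self.read_temperature()
            for c in region.cell_ids:
                self.t1_by_cell[c] = t1
```

In this sketch a cell has no T1 entry until its region fills, so a caller would need to decide how to treat partially written regions; the source does not address that case.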
Further, the process of checking the stored data has two operating modes:
Mode one includes the following operation:
The difference between the ambient temperature T2 and the write temperature T1 is compared for all storage cells, and the storage cells are ranked by the magnitude of that difference. Storage cells with larger differences are scanned and read first, and the scan yields the bit error rate of the data stored in each scanned storage cell.
This mode suits the case where the storage chip is busy, that is, while data is being read or written, because a full scan of the whole device cannot be carried out in that situation. In that case, this embodiment sorts the storage cells by the size of the difference and checks those with larger differences first. This is comparable to battlefield triage, in which the most seriously wounded are treated first, and it lets the embodiment achieve the best available data checking and recovery while the storage chip is busy.
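The prioritisation in mode one amounts to sorting storage cells by their temperature difference in descending order. The following sketch assumes a single ambient reading t2 for the whole chip, a dictionary t1_by_cell mapping each cell to its recorded write temperature, and an optional budget limiting how many cells can be scanned while the chip is busy; these names and the absolute-value comparison are illustrative assumptions, not details given in the source.

```python
def scan_order(t1_by_cell, t2, budget=None):
    """Return cell ids ordered so the largest |T2 - T1| is scanned first.

    budget: optional cap on the number of cells to scan while the chip is
    busy (a hypothetical parameter; None means return every cell).
    """
    ranked = sorted(
        t1_by_cell,
        key=lambda cell: abs(t2 - t1_by_cell[cell]),
        reverse=True,
    )
    return ranked if budget is None else ranked[:budget]


# Usage: cell "b" was written at -10 degrees, so it is checked first at 30.
order = scan_order({"a": 25, "b": -10, "c": 60}, t2=30)
```

With the budget left unset, the same function returns every cell, which corresponds to a full scan ordered by urgency.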
Further, according to the bit error rate of the data stored in a given storage cell, this embodiment handles the data based on how that bit error rate compares with the threshold and with the bit error rate that the hardware ECC error correction of the storage chip can handle:
When the bit error rate of the data does not reach the threshold, the data is reliable and no loss has occurred, so no processing is needed;
When the bit error rate of the data reaches the threshold but is below the hardware ECC error-correction capability of the storage chip itself, hardware ECC error correction is started, the fully recovered data is written back to a new storage cell, and the write temperature T1 of that write is recorded again;
When the bit error rate of the data exceeds the hardware ECC error-correction capability of the storage chip itself, an additional data recovery mechanism is started to recover the data. The additional data recovery mechanism includes RAID data recovery, re-try, soft-retry data recovery, and the like;
If the data cannot be recovered by any mechanism, the corresponding data is marked as corrupted; a subsequent read of that storage cell returns an error status indicating that the data in that area is unreadable.
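The tiered handling above can be summarised as a small decision function. This is a sketch under assumptions: the names Action and choose_action are hypothetical, and the source does not say what happens when the bit error rate exactly equals the hardware ECC capability, so this sketch conservatively routes that case to the additional recovery mechanism. Marking data as corrupted after all recovery attempts fail is left outside the sketch.

```python
from enum import Enum


class Action(Enum):
    NONE = "no processing needed"
    HARDWARE_ECC = "hardware ECC, rewrite to a new cell, re-record T1"
    EXTRA_RECOVERY = "additional recovery mechanism (for example RAID)"


def choose_action(ber, ber_threshold, ecc_limit):
    """Pick a handling tier from the measured bit error rate.

    ber_threshold: bit error rate at which recovery becomes necessary.
    ecc_limit: highest bit error rate the hardware ECC can correct.
    """
    if ber < ber_threshold:
        # Data is considered reliable.
        return Action.NONE
    if ber < ecc_limit:
        # Within the hardware ECC capability.
        return Action.HARDWARE_ECC
    # At or beyond the ECC capability (the equality case is an assumption).
    return Action.EXTRA_RECOVERY
```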
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage and optical storage) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
The above embodiments are preferred examples of the present invention and are not intended to limit its scope of protection.

Claims (10)

1. A method for reliable data reading and writing, characterized by comprising the following steps:
when data is written to a storage chip, recording the write temperature T1 at the time the write completes;
monitoring in real time the ambient temperature T2 of the storage chip;
determining whether the difference between the ambient temperature T2 and the write temperature T1 exceeds a threshold t;
when the difference exceeds the threshold t, checking the stored data and obtaining the bit error rate of the data;
determining whether the bit error rate of the data reaches a bit error rate threshold, and performing data recovery if it does.
2. The method for reliable data reading and writing according to claim 1, characterized in that:
the process of recording the write temperature T1 includes:
dividing the storage chip into multiple storage regions, each storage region containing multiple storage cells;
when a storage region is fully written, recording the temperature T1 at that moment and writing it as metadata to a specific area of the storage chip, thereby obtaining the write temperature T1 corresponding to each storage cell.
3. The method for reliable data reading and writing according to claim 1, characterized in that:
the storage cell is a memory block or a memory page.
4. The method for reliable data reading and writing according to claim 1, characterized in that:
the process of checking the stored data includes:
comparing the difference between the ambient temperature T2 and the write temperature T1 for all storage cells, ranking the storage cells by the magnitude of that difference, and scanning the data of the storage cells with the largest differences first.
5. The method for reliable data reading and writing according to claim 1, characterized in that:
the process of checking the stored data includes:
when no data is being written, scanning the data of all storage cells of the entire storage chip.
6. The method for reliable data reading and writing according to claim 1, characterized in that:
wherein:
when the bit error rate of the data reaches the threshold but is below the hardware ECC error-correction capability of the storage chip itself, hardware ECC error correction is started, the fully recovered data is written back to a new storage cell, and the write temperature T1 at the time of that write is recorded;
when the bit error rate of the data exceeds the hardware ECC error-correction capability of the storage chip itself, an additional data recovery mechanism is started to recover the data.
7. The method for reliable data reading and writing according to claim 6, characterized in that:
the hardware ECC error-correction capability of the storage chip itself is set to a bit error rate of 70-85%.
8. The method for reliable data reading and writing according to claim 6, characterized in that:
the additional data recovery mechanism includes RAID data recovery.
9. A system for reliable data reading and writing, characterized by comprising:
a write temperature recording module, configured to record the write temperature T1 when data is written to the storage chip;
an ambient temperature detection module, configured to monitor in real time, after the write completes, the ambient temperature T2 of the storage chip;
a judgment module, configured to determine whether the difference between the ambient temperature T2 and the write temperature T1 exceeds the threshold t;
a checking module, configured to check the stored data and obtain the bit error rate of the data when the difference exceeds the threshold t;
a data recovery module, configured to determine whether the bit error rate of the data reaches the bit error rate threshold and to perform data recovery if it does.
10. A storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the method of any one of claims 1 to 8.
CN201811082866.6A 2018-09-17 2018-09-17 Realize the method and system, storage medium of data reliable read write Pending CN109375869A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811082866.6A CN109375869A (en) 2018-09-17 2018-09-17 Realize the method and system, storage medium of data reliable read write

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811082866.6A CN109375869A (en) 2018-09-17 2018-09-17 Realize the method and system, storage medium of data reliable read write

Publications (1)

Publication Number Publication Date
CN109375869A true CN109375869A (en) 2019-02-22

Family

ID=65405447

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811082866.6A Pending CN109375869A (en) 2018-09-17 2018-09-17 Realize the method and system, storage medium of data reliable read write

Country Status (1)

Country Link
CN (1) CN109375869A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110010163A (en) * 2019-04-16 2019-07-12 苏州浪潮智能科技有限公司 A kind of data in magnetic disk holding capacity test method and relevant apparatus
CN112035060A (en) * 2020-08-17 2020-12-04 合肥康芯威存储技术有限公司 Error detection method and system for storage medium and storage system
CN112035060B (en) * 2020-08-17 2024-04-26 合肥康芯威存储技术有限公司 Error detection method and system for storage medium and storage system

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101901169A (en) * 2010-03-23 2010-12-01 成都市华为赛门铁克科技有限公司 Scanner and method
CN102708019A (en) * 2012-04-28 2012-10-03 华为技术有限公司 Method, device and system for hard disk data recovery
CN104346232A (en) * 2013-08-06 2015-02-11 慧荣科技股份有限公司 Data storage device and access limiting method thereof
CN104731522A (en) * 2013-12-20 2015-06-24 慧荣科技股份有限公司 Data storage device and data maintenance method thereof
CN107818025A (en) * 2017-10-31 2018-03-20 郑州云海信息技术有限公司 Hard disk cold data method of calibration, device, equipment and computer-readable recording medium
US9971530B1 (en) * 2016-11-09 2018-05-15 Sandisk Technologies Llc Storage system and method for temperature throttling for block reading
CN108052414A (en) * 2017-12-28 2018-05-18 湖南国科微电子股份有限公司 A kind of method and system for promoting SSD operating temperature ranges
CN108228371A (en) * 2016-12-15 2018-06-29 发那科株式会社 Machine learning device and method, life predication apparatus, numerical control device

Similar Documents

Publication Publication Date Title
KR101608679B1 (en) Torn write mitigation
US10248515B2 (en) Identifying a failing group of memory cells in a multi-plane storage operation
JP2011521397A5 (en)
US11430540B2 (en) Defective memory unit screening in a memory system
CN102708019A (en) Method, device and system for hard disk data recovery
TW201044409A (en) Data recovery in a solid state storage system
CN111124758A (en) Data recovery method for failed hard disk
CN107799150A (en) The error mitigation of 3D nand flash memories
CN103019894B (en) Reconstruction method for redundant array of independent disks
CN109445982A (en) Realize the data storage device of data reliable read write
US20130166991A1 (en) Non-Volatile Semiconductor Memory Device Using Mats with Error Detection and Correction and Methods of Managing the Same
CN112632643A (en) Method for preventing flash memory data loss, solid state disk controller and solid state disk
CN107885620B (en) Method and system for improving performance and reliability of solid-state disk array
CN102929740A (en) Method and device for detecting bad block of storage equipment
US20060215456A1 (en) Disk array data protective system and method
CN103176859A (en) Flash data backup/recovery method, equipment and signal source
CN110444243A (en) Store test method, system and the storage medium of equipment read error error correcting capability
CN109284201A (en) Temperature equalization data reconstruction method and system, storage medium
CN109358984A (en) The storage device of data recovery is carried out using temperature equalization data reconstruction method
US20150255174A1 (en) Memory testing method and apparatus
CN109375869A (en) Realize the method and system, storage medium of data reliable read write
CN109460316A (en) Data reconstruction method and system, storage medium based on temperature difference equilibrium
CN113190179B (en) Method for prolonging service life of mechanical hard disk, storage device and system
US11593242B2 (en) Method of operating storage device for improving reliability, storage device performing the same and method of operating storage using the same
CN109358979A (en) Application, system and storage medium of the temperature difference equalization methods in correcting data error

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190222