CN101599305B - Storage system with data repair function and data repair method thereof - Google Patents

Storage system with data repair function and data repair method thereof

Info

Publication number
CN101599305B
Authority
CN
China
Prior art keywords
data
memory
block
test
memory region
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN 200810109903
Other languages
Chinese (zh)
Other versions
CN101599305A (en)
Inventor
陈明达
林传生
谢祥安
张惠能
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
A Data Technology Co Ltd
Original Assignee
A Data Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by A Data Technology Co Ltd
Priority to CN 200810109903
Publication of CN101599305A
Application granted
Publication of CN101599305B
Legal status: Active
Anticipated expiration

Abstract

The invention provides a storage system with a data repair function and a data repair method thereof. One or more rounds of testing and repair reduce the errors in a memory medium to a range that the commonly used error checking and correction (ECC) function can repair, which ensures that data are read correctly and effectively improves data reliability. In the preferred mode of the data repair step, a test data generator in the storage system provides test data, the test data are written into the memory block in which a data error occurred, and the data are read back to find the error bits; a repair procedure then brings the error bits within the range that ECC can repair. If the number of tests exceeds an upper limit and the error bits still cannot be found, or the errors cannot be reduced to the range that ECC can repair, the memory block is marked as a bad block.

Description

Storage system with data repair function and data repair method thereof
Technical field
The present invention relates to a storage system with a data repair function and a data repair method thereof, and in particular to a method that writes test data one or more times to the location storing erroneous data in order to detect the faulty memory bits and correct the data.
Background art
Because flash memory offers fast access, low power consumption, small size and shock resistance, all advantages over conventional hard disks, it is increasingly widely used in information storage devices.
Because of its structure, the data stored in flash memory are susceptible to high-voltage disturbance and to the aging or damage of memory cells, so that the stored data become erroneous. For example, a memory cell originally at a high potential is read by the controller as being at a low potential, or a memory cell originally at a low potential is read as being at a high potential.
To prevent errors in the data stored in flash memory and to improve the reliability of stored data, the prior art mainly uses error checking and correction (ECC) technology to detect and correct erroneous data.
The ECC technique in brief:
Before data are written to flash memory, they first pass through the ECC unit in the memory controller, which computes an ECC code for the data; the ECC code is then stored in the flash memory together with the data. When data are read, the controller reads out the data and their ECC code together, and the ECC unit first performs the error detection and correction computation. If no error bit is found, the data are output. If error bits are found and they are within the range that ECC can correct, the data are corrected and then output. If the detected error bits exceed the range that ECC can correct, the controller reports a data read error.
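As an illustration of the read path just described, the following minimal Python sketch uses an extended Hamming (8,4) code, which corrects one flipped bit and detects two. The patent does not say which ECC code is used; the choice of code and the names `encode`, `decode` and `syndrome_of` are assumptions made only for this example.

```python
# Toy ECC: extended Hamming (8,4). The choice of code is an assumption
# for illustration; the source does not specify the ECC algorithm.
DATA_POSITIONS = (3, 5, 6, 7)   # Hamming positions that carry data bits


def syndrome_of(code):
    """XOR of the positions (1..7) whose bit is set."""
    syndrome = 0
    for position in range(1, 8):
        if code[position]:
            syndrome ^= position
    return syndrome


def encode(nibble):
    """Return an 8-bit codeword (list of 0/1) for a 4-bit value."""
    code = [0] * 8                       # index 0 holds the overall parity
    for i, position in enumerate(DATA_POSITIONS):
        code[position] = (nibble >> i) & 1
    syndrome = syndrome_of(code)
    for parity_position in (1, 2, 4):    # set parity bits so the syndrome becomes 0
        code[parity_position] = 1 if syndrome & parity_position else 0
    code[0] = sum(code[1:]) % 2          # make the total number of 1s even
    return code


def decode(code):
    """Return (status, nibble); nibble is None when uncorrectable."""
    code = list(code)
    syndrome = syndrome_of(code)
    parity_even = sum(code) % 2 == 0
    if syndrome == 0 and parity_even:
        status = "ok"                    # no error bit found: output the data
    elif not parity_even:
        code[syndrome] ^= 1              # one error; syndrome 0 means the parity bit itself
        status = "corrected"
    else:
        return "read_error", None        # two errors: beyond the correctable range
    nibble = sum(code[p] << i for i, p in enumerate(DATA_POSITIONS))
    return status, nibble
```

The three return states correspond to the three outcomes in the text: output as is, output after correction, and a reported read error when the errors exceed the correctable range.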
For prior cases related to detecting and repairing data stored in flash memory, see U.S. Patent Publication No. 20040230879, "Apparatus and method for responding to data retention loss in a non-volatile memory unit using error checking and correction techniques" (published November 18, 2004), and U.S. Patent No. 6,785,856, "Internal self-test circuit for a memory array" (issued August 31, 2004). The former provides an error detection and repair method for the case where the error bits of the flash memory are within the correction capability of ECC; the latter provides a test circuit in the storage system to detect the error bits of the memory, as shown in Fig. 1.
Fig. 1 shows the memory device 100 with a self-tester 104 proposed in U.S. Patent No. 6,785,856. The memory unit 102 is connected to the self-tester 104, which comprises an error detection and correction (ECC) circuit 106, a self-test circuit 108 and a register 110. The memory unit 102 is organized as a plurality of memory pages. The self-tester 104 is responsible for detecting whether the memory contains errors and for correcting them. The ECC circuit 106 checks each memory page group by group, for example with a cyclic redundancy check (CRC) method using the Reed-Solomon algorithm, to detect errors in the memory unit; the self-test circuit 108 counts the errors and stores the count in the register 110. Different data can further be used to test the writable state of each memory in order to correct data.
However, with the techniques proposed in the above prior art, if the error bits exceed the range that ECC can correct, a data read error still occurs.
Summary of the invention
The technical problem to be solved by the present invention is to overcome the limitation of the prior art, which relies on error checking and correction (ECC) to detect and correct data in a storage medium. To this end, the invention provides a data repair technique for storage media that corrects errors in the medium through one or more test passes and can reduce the error bits to the range that ECC can correct. If the number of tests exceeds an upper limit, the faulty memory region is marked as a bad block, which improves the data reliability of the storage medium.
The storage system with a data repair function provided by the invention mainly comprises a control module and a memory unit; the control module comprises a test data generator, a comparison unit, a repair unit, a data buffer and an ECC unit.
The test data generator provides test data. The comparison unit compares the test data written to the memory unit by the test data generator with the data read back from the memory unit, and thereby determines whether the storage area contains faulty bits. The data buffer temporarily holds the erroneous data from the memory unit that ECC cannot correct. The repair unit corrects the bits in the data buffer that correspond to the error bits, according to the error-bit information provided by the comparison unit. The ECC unit, besides checking and correcting data during normal read and write operations, also performs error checking and correction on the data in the data buffer.
In a preferred embodiment of the data repair method, when one or more memory pages of a first memory region in the memory unit contain erroneous data that ECC cannot correct, the data of the first memory region are first copied to a redundant second memory region and the first memory region is erased. The test data generator in the control module then provides test data, which are written into the first memory region in which the data error occurred; the data are read back from the memory pages in which the error occurred, and the two sets of data are checked for differences.
If no error bit is found, the method proceeds to the next test pass: the test data generator produces a different set of test data and a second test is performed to detect the error bits on the memory pages of the first memory region in which the data error occurred. If, after repeated tests exceeding an upper limit, the error bits still cannot be located, the first memory region is marked as a bad block. If error bits are found in the first test or after several test passes, the data at the corresponding positions in the data buffer are corrected so as to bring the errors within the range that ECC can correct. If the data still cannot be repaired by ECC at that point, the next test pass is carried out to correct the erroneous data in the data buffer further, with the aim of reducing the error bits to the range that ECC can correct.
Compared with the prior art, the present invention has the following beneficial effects:
The invention writes test data to the location storing erroneous data in order to detect the error bits and correct the data, reducing the number of error bits to the range that ECC can repair. When ECC cannot repair the erroneous data, the errors in the storage medium are corrected through one or more test passes; if the number of tests exceeds an upper limit, the faulty memory region is marked as a bad block. This ensures that data are read correctly and effectively improves data reliability.
Description of drawings
Fig. 1 is a circuit block diagram of a prior-art memory device with self-repair capability;
Fig. 2 is a functional block diagram of the storage system of the present invention;
Fig. 3A to Fig. 3I show the interaction between the memory blocks of the present invention;
Fig. 4 shows the steps of the data repair flow of a preferred embodiment of the present invention;
Fig. 5 shows the steps of the data repair flow of another preferred embodiment of the present invention;
Fig. 6 shows the steps of carrying out a further test pass during data repair;
Fig. 7A to Fig. 7H show another interaction between the memory blocks of the present invention.
Reference numerals:
Memory device 100; memory unit 102
Self-tester 104; ECC circuit 106
Self-test circuit 108; register 110
Storage system 20; control module 22
Memory unit 24; test data generator 221
Comparison unit 223; repair unit 225
Data buffer 227; ECC unit 229
Detailed description
The invention provides a storage system with a data repair function and a data repair method thereof. In particular, when the commonly used error checking and correction (ECC) technique cannot repair the error bits in stored data, the storage system of the invention can detect the error bits and then repair them, which ensures that data are read correctly and effectively improves data reliability.
To ensure data reliability, a prior-art control module uses the ECC function to check and repair erroneous data when reading data from memory. However, the error-correction capability of ECC is limited; if the error bits in the data exceed the range that ECC can correct, a data read error still occurs. For erroneous data that ECC cannot correct, the storage system of the invention therefore writes test data to the location storing the erroneous data in order to detect the error bits and correct the data, reducing the number of error bits to the range that ECC can repair. When ECC cannot repair the erroneous data, the repair method of the invention can thus still read out the correct data.
Referring to Fig. 2, the functional block diagram of the storage system of the invention, the storage system 20 comprises a control module 22 and a non-volatile memory unit 24, which are electrically connected. The control module 22 comprises a test data generator 221, a comparison unit 223, a repair unit 225, a data buffer 227 and an error checking and correction (ECC) unit 229.
The test data generator 221 produces test data and writes them into the memory unit. The test data may be all "0" (such as 0x00), all "1" (such as 0xFF), alternating "0" and "1" (such as 0x55 or 0xAA), or randomly generated data. The comparison unit 223 compares the test data written by the test data generator 221 into the memory unit 24 (which in an embodiment may be a non-volatile memory such as flash memory) with the data read back from the memory unit 24, determines whether the memory unit 24 contains faulty bits, and determines error-bit information such as the addresses of the error bits in the memory unit.
For example, when the test data written to the memory unit 24 are all "0" and one bit of the data read back from the memory unit 24 is "1", that bit is a faulty bit. The data buffer 227 temporarily holds the erroneous data from the memory unit 24 that ECC cannot correct. The repair unit 225 corrects the bits in the data buffer 227 that correspond to the error bits, according to the error-bit information provided by the comparison unit 223. Specifically, the data buffer 227 may be chosen from memories such as random access memory (RAM), non-volatile memory, phase-change memory, ferroelectric random access memory (FeRAM) and magnetic RAM (MRAM).
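The test patterns and the comparison step described above can be sketched in a few lines of Python. This is only an illustration: the names `PATTERNS`, `make_pattern_page` and `find_error_bits` are hypothetical, and bit 0 is taken to be the most significant bit of each byte.

```python
# Byte patterns named in the text: all 0, all 1, and the two alternating ones.
PATTERNS = (0x00, 0xFF, 0x55, 0xAA)


def make_pattern_page(pattern, page_size):
    """Build one page of test data filled with a single byte pattern."""
    return bytes([pattern]) * page_size


def find_error_bits(written, read_back):
    """Compare written and read-back data; return (byte index, bit index) pairs.

    Bit index 0 is the most significant bit of the byte (an assumption).
    """
    errors = []
    for byte_index, (w, r) in enumerate(zip(written, read_back)):
        difference = w ^ r               # 1 wherever the two bytes disagree
        for bit in range(8):
            if difference & (0x80 >> bit):
                errors.append((byte_index, bit))
    return errors
```

For instance, writing an all-zero page and reading back `00 20 00 01` (hex) reports two faulty bits, one in byte 1 and one in byte 3.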
An ECC unit 229 is also provided to check and correct the data in the memory unit 24. Besides checking and correcting data during normal read and write operations, the ECC unit 229 also performs error checking and correction on the data in the data buffer 227.
Fig. 3A to Fig. 3I illustrate the interaction between the memory blocks of the invention, mainly showing the flow in which error bits are checked and corrected by writing test data.
At the start, erroneous data are found. Fig. 3A shows three memory regions (blocks) in the memory unit: block A, block B and block C. When the control module reads block A, it finds a memory page (assumed to be the first page, marked "defective" in the figure) containing erroneous data that ECC cannot correct. The control module then performs a copy, copying the data in block A to block B. The data of the first page, marked "defective", are not processed by ECC and are copied intact to the first page of block B; the other pages are copied from block A to block B through the ECC function as usual.
Next, the data are copied and the block is erased. As shown in Fig. 3B, suppose that during the copy, after the second page (marked "original data") has been copied successfully, the third page of block A is also found to contain erroneous data that ECC cannot repair and is marked "defective"; the data of the third page of block A are likewise copied intact to the third page of block B. The fourth page is then copied. After the fourth page has been copied successfully, the error bits of the fifth page are again found to exceed the range that ECC can correct when that page is copied, so it is marked "defective" and its data are likewise copied intact to the fifth page of block B. Following this copy rule, all data in block A are copied to block B, and block A is then erased.
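The copy rule of Fig. 3A and Fig. 3B can be summarized in a short Python sketch. The function `copy_block` and its `ecc_correct` callback are hypothetical stand-ins for the control module and the ECC unit; the callback is assumed to return the corrected page, or `None` when the errors exceed what ECC can correct.

```python
def copy_block(source_pages, ecc_correct):
    """Copy the pages of a faulty block to a spare block.

    ecc_correct is a hypothetical callback: it returns the corrected
    page, or None when the errors exceed what ECC can correct.
    """
    spare_pages = []
    defective = []                       # indices of pages copied without ECC
    for index, raw in enumerate(source_pages):
        corrected = ecc_correct(raw)
        if corrected is None:
            spare_pages.append(raw)      # bypass ECC: keep the data unchanged
            defective.append(index)
        else:
            spare_pages.append(corrected)
    return spare_pages, defective
```

In the example of the figures, the returned list of defective pages would hold the first, third and fifth pages, which are the ones later examined with test data.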
Test data are then written to detect the error bits. After block A has been erased, the test data generator provides first test data, which are written into block A. As shown in Fig. 3C, after the first test data have been written into every page of block A, each page is marked "sample 1". The control module reads the first, third and fifth pages of block A and compares the data read with the first test data written to block A to detect whether there are faulty bits. If error bits are found in a memory page, the data of the corresponding page of block B are copied into the buffer of the control module and corrected according to the positions of the error bits. After the erroneous data have been corrected, the control module finds a block in the flash memory, such as block C, and stores the corrected data from the buffer, together with the other data that need no correction, in block C.
As shown in Fig. 3D, the control module first compares the data of the first page of block A. If error bits are found (marked "defective"), the data of the first page of block B are read back into the data buffer and corrected there according to the error-bit information; the corrected data in the buffer are then written to block C and marked "revised". The second page of block B holds correct data, marked "original data", and is copied directly to block C.
The control module then compares the data of the third page of block A (because that page had a data error). If error bits are found, the data of the third page of block B are read back into the data buffer and the error bits are corrected in the buffer; the corrected data are then written to block C, shown as the third page marked "revised".
The fourth page of block B holds correct data and is copied directly to block C, marked "original data". The data of the fifth page of block A are then compared; if error bits are found, the data of the fifth page of block B are likewise read back into the data buffer, the error bits are corrected in the buffer, and the corrected data are written to block C. Following this repair method, the originally correct data and the corrected data of block B are written to block C.
Then, as shown in Fig. 3E, after all data have been written to block C, block B is erased. The data of the first, third and fifth pages of block C have all been corrected, and the error bits of each page have dropped to the range that ECC can correct; block A is then marked as a bad block.
However, when the first test data are written to block A, it is possible that no error bit is found, or that error bits are found but, even after correction, cannot be reduced to the range that ECC can correct.
Fig. 3F shows block A holding the first test data, marked "sample 1"; block B holding the erroneous data marked "defective" and the original data marked "original data"; and block C holding corrected pages (marked "revised"), original data (marked "original data"), and data that after correction still do not reach the range that ECC can correct, or in which no error bit was found and which are therefore still erroneous (marked "defective").
When the control module writes the first test data into block A, suppose that error bits are detected successfully in the first page of block A. After the data of the first page of block B are read back and corrected, the error bits drop to the range that ECC can repair, and the corrected data are written to block C. The data of the second page of block B are then copied to block C.
When the first test data are used to detect the error bits of the third page of block A, the comparison unit either finds no faulty bit, or finds error bits but the number of error bits still cannot be reduced by correction to the range that ECC can repair. In the former case (no error bit found), the control module copies the data of the third page of block A to block C; in the latter case (error bits found), the repaired data in the data buffer of the control module are copied to block C, as in the page marked "defective" in block C. The data of the fourth page of block B are then copied to block C.
The error bits of the fifth page of block A are then detected. Suppose the erroneous data of the fifth page of block B can be repaired so that the error bits drop to the range that ECC can correct: the data of the fifth page of block B are read into the data buffer and corrected, and the repaired data in the data buffer are written to block C. The subsequent pages of block B are then copied to block C.
After all data have been written to block C as in Fig. 3F, the data of block A and block B are erased. Referring to Fig. 3G, the test data generator produces second test data (marked "sample 2") and writes them into block A in order to test the error bits of the third page of block A. As shown in Fig. 3H, the data of the first and second pages of block C are copied directly to another block in the flash memory, block D. The control module then detects the error bits of the third page of block A with the second test data. If error bits are found, the data of the third page of block C are read back into the data buffer and corrected according to the error bits; supposing this correction reduces the error bits to the range that ECC can correct, the corrected data in the data buffer are written to block D. The subsequent pages of block C are then copied to block D. Finally, as shown in Fig. 3I, after all data have been written to block D, block C is erased and block A is marked as a bad block.
Based on the storage system architecture described above and the interaction of the memory regions when the data repair function is applied, the data repair method comprises at least the following. First, a memory region is detected in which an error has occurred that error checking and correction (ECC) cannot correct; the invention reduces this error to the range that ECC can handle. The data in the memory region are copied to a temporary storage space, such as another redundant memory region as used in an embodiment of the invention or a random access memory (RAM), and the memory region is erased. The positions of the faulty bits are then located by a test pass, for example by comparing the test data written with the data read back. The data at the corresponding positions in the temporary storage space are corrected according to the error-bit information, and it is judged whether the corrected data still contain errors that ECC cannot repair; if so, one or more further test passes reduce the number of error bits until they are within the range that ECC can handle, and error checking and correction are performed. Finally, the memory region in which the error occurred is marked as a bad block so that errors do not recur there, improving the read efficiency of the storage medium. A preferred embodiment of the flow is shown in Fig. 4.
The flow begins at step S401: when the control module in the storage system reads a physical block of the memory unit (such as flash memory), for example a first memory region, it detects that one or more memory pages contain erroneous data that ECC cannot correct. The data in the original memory region must then be copied to a temporary storage space (such as a data buffer implemented as the above random access memory, non-volatile memory, phase-change memory, ferroelectric random access memory or magnetic RAM). In step S403, for example, the control module uses an unused redundant block, assumed to be a second memory region, and copies the data of the first memory region to this redundant second memory region; the first memory region is then erased (step S405). Notably, while the data of the first memory region are copied to the second memory region, the control module makes the ECC unit stop performing error detection and repair on the erroneous data pages, so that the data transfer is not affected by ECC and the data of those pages of the first memory region are copied to the second memory region completely unchanged.
A test pass is then carried out. The test data generator in the control module produces test data (the first test data, if this is the first test pass) and writes them into the first memory region in which the data error occurred (step S407), like the data "sample 1" in block A of Fig. 3C. The control module then reads the data in the memory pages of the first memory region in which the data error occurred (step S409) and passes them to the comparison unit, which compares the first test data provided by the test data generator with the data read from the faulty memory pages of the first memory region (step S411) and checks whether the two differ.
In step S413, the comparison result is used to judge whether there are faulty bits, that is, faulty hardware addresses. If corresponding bits of the two sets of data hold different values, the memory address at that position of the memory page is a faulty bit; if no corresponding bits differ, no faulty bit has been found.
If the comparison unit finds no error bit, the method proceeds to another test pass (step S415): the test data generator produces a different set of test data (second test data) and a second test is performed. The second test data are likewise written into the first memory region, including the memory pages in which the data error occurred, and detection of the error bits on those pages continues.
When carrying out the above test loop, the invention sets an upper limit (greater than 1) on the number of tests. If the error bits still cannot be located after that number of tests, the control module reports a read error and marks the first memory region as a bad block.
If, however, error bits are indeed found in the first test pass or after several test passes (comprising steps S405, S407, S409, S411 and S413), the error-bit information is sent to the repair unit, and the control module copies the data of the corresponding faulty memory page from the second memory region to the data buffer (step S417).
The repair unit then corrects the data at the corresponding error-bit positions in the data buffer according to the error-bit information (step S419). For example, if the faulty bits are the first and third bits and the data of this memory page in the data buffer are 11110101, each of those bits is changed from 1 to 0 or from 0 to 1, giving 01010101. After the data in the data buffer have been corrected, it is judged whether the ECC unit in the control module can perform error detection and repair on the data in the data buffer, that is, whether the errors are within the range that the ECC unit can correct (step S421). If the number of erroneous bits in the data buffer still exceeds the range that the ECC unit can repair, so that the ECC unit still cannot repair the erroneous data, the test data generator produces another set of test data and writes them into the memory pages of the first memory region in which the data error occurred (step S415). Detection of the error bits in those pages continues so that the erroneous data in the data buffer can be corrected further, with the aim of reducing the error bits to the range that ECC can correct.
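The bit correction of step S419 amounts to inverting the buffered bits at the reported positions. A minimal Python sketch of the worked example follows; the function name `repair_bits` is hypothetical, and positions are counted from 1 starting at the leftmost (most significant) bit, which matches the example in the text.

```python
def repair_bits(value, error_positions, width=8):
    """Invert the bits of value at the given 1-based positions.

    Position 1 is the leftmost (most significant) bit of a width-bit value.
    """
    for position in error_positions:
        value ^= 1 << (width - position)
    return value


# Worked example from the text: bits 1 and 3 of 11110101 are faulty.
repaired = repair_bits(0b11110101, [1, 3])
print(format(repaired, "08b"))  # prints 01010101
```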
Likewise, if after tests with multiple sets of test data, exceeding the upper limit set above, the error bits in the data buffer still cannot be reduced to the range that ECC can correct, the control module reports a read error and marks the faulty first memory region as a bad block.
If, after repair, the number of erroneous bits in the data buffer has dropped to the range that the ECC unit can repair, in other words the ECC unit can repair the erroneous data in the data buffer, the control module finds another redundant block in the memory unit, assumed to be a third memory region (such as block C of Fig. 3E). The corrected data in the data buffer are written to the position in the third memory region corresponding to the faulty memory page, and the data of the other memory pages are copied to the third memory region from the corresponding pages of the second memory region, which holds the copied data (step S423). When all the data have been copied to the third memory region, the control module erases the second memory region (step S425) and marks the first memory region as a bad block (step S427).
According to this flow, even if a memory region contains several memory pages with erroneous data that ECC cannot repair, the storage system proposed by the invention can still handle them by writing test data to detect and repair the error bits.
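The test loop of Fig. 4 can be simulated in Python for a single faulty page. This is a sketch under stated assumptions, not the patented implementation: `SimulatedPage`, `make_patterns`, `make_ecc_check` and `repair_page` are hypothetical names; faulty cells are modeled as stuck at a fixed value; each position is inverted at most once; and the ECC decision is replaced by a stand-in that compares against the known original data, which a real ECC unit cannot do.

```python
from itertools import islice


class SimulatedPage:
    """Toy flash page; faulty cells are stuck at a fixed value."""

    def __init__(self, n_bits, stuck):
        self.stuck = dict(stuck)         # {bit index: stuck value}
        self.bits = [0] * n_bits

    def write(self, bits):
        self.bits = list(bits)

    def read(self):
        bits = list(self.bits)
        for index, value in self.stuck.items():
            bits[index] = value          # a stuck cell ignores what was written
        return bits


def make_patterns(n_bits):
    """All 0, all 1, then the two alternating patterns (0x55 and 0xAA style)."""
    yield [0] * n_bits
    yield [1] * n_bits
    yield [i % 2 for i in range(n_bits)]
    yield [(i + 1) % 2 for i in range(n_bits)]


def make_ecc_check(true_bits, capacity):
    """Simulation stand-in: a real ECC unit uses stored check bits,
    not the original data, to decide whether a page is correctable."""
    def ecc_can_correct(bits):
        return sum(a != b for a, b in zip(bits, true_bits)) <= capacity
    return ecc_can_correct


def repair_page(page, saved_bits, ecc_can_correct, max_tests=4):
    """Return (status, buffer, tests_used) for one faulty page."""
    buffer = list(saved_bits)            # copy of the uncorrectable data
    fixed = set()                        # assumption: invert each position once
    tests_used = 0
    for pattern in islice(make_patterns(len(buffer)), max_tests):
        tests_used += 1
        page.write(pattern)              # S407: write test data
        read_back = page.read()          # S409: read it back
        errors = [i for i, (w, r) in enumerate(zip(pattern, read_back))
                  if w != r and i not in fixed]   # S411/S413: compare
        if not errors:
            continue                     # S415: try the next pattern
        for i in errors:                 # S419: correct the buffered data
            buffer[i] ^= 1
            fixed.add(i)
        if ecc_can_correct(buffer):      # S421: within the ECC range?
            return "repaired", buffer, tests_used
    return "read_error", None, tests_used
```

With cells stuck at 1, the all-zero pattern exposes them in the first pass; cells stuck at 0 are only exposed by the all-one pattern in the second pass. In both outcomes the text marks the first memory region as a bad block; the sketch leaves that bookkeeping to the caller.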
Another embodiment, data recovery method flow chart of steps as shown in Figure 5 then are provided again.In this example, when control module when reading a wherein physical blocks of mnemon (such as flash memory), be assumed to be the first memory region, can't the correct mistakes situation (step S501) of data of ECC occurs in a certain memory page that reads entity the first memory region, at this moment, need the data Replica in the former memory region in a temporarily providing room, this example then is by control module the data in the first memory region all to be copied to (step S503) in the data buffer, such as a memory space in the stocking system, or the internal memory of the computer system that connects of stocking system, the first memory region (step S505) of then erasing.Wherein, at the data Replica of the first memory region in the process of data buffer, control module will be controlled the ECC unit for can't the correct mistakes memory page of data of ECC, stop the function of execution error detection and reparation, to guarantee that data transfer is not affected by ECC, make the data in the memory page of the first memory region generation error in data can fully constant copying in the data buffer.
The test flow is then carried out. In step S507 the test data generator provides test data, which is written to the first memory block, including the page in which the error occurred. The control unit then reads the data of the erroneous page of the first memory block (step S509) and passes it to the comparing unit for comparison (step S511). The comparing unit compares the test data provided by the test data generator with the data read back from that page of the first memory block and checks whether the two differ. In step S513 it determines whether there is an error bit, which gives two cases:
Case one:
If corresponding positions in the two sets of data hold different values, the first memory block is judged to contain error bits. The recorded data must then be revised, that is, the data written to the temporary storage area in step S503.
Case two:
If no corresponding positions in the two sets of data differ, no error bit has been found and the next test flow must be carried out (step S515). The test data generator produces another set of test data and writes it to the erroneous page of the first memory block to continue detecting error bits on that page, which means repeating steps S505, S507, S509, S511 and S513.
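The comparison in steps S511 and S513 amounts to a bitwise difference between the pattern that was written and the data read back. A minimal sketch, assuming page images are byte strings and that bit 0 is the most significant bit of byte 0 (both are conventions chosen for the example, as is the name `find_error_bits`):

```python
from typing import List


def find_error_bits(written: bytes, read_back: bytes) -> List[int]:
    """Return the bit positions at which read_back differs from written."""
    if len(written) != len(read_back):
        raise ValueError("page images must have the same length")
    positions: List[int] = []
    for byte_index, (w, r) in enumerate(zip(written, read_back)):
        diff = w ^ r                      # set bits mark mismatching cells
        for bit in range(8):
            if diff & (0x80 >> bit):
                positions.append(byte_index * 8 + bit)
    return positions
```

An empty result corresponds to case two (no error bit found with this pattern); a non-empty result corresponds to case one.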
In case two, if after several repetitions of the above steps the number of test data sets provided by the test data generator has exceeded a preset upper limit and the error bits of the page in the first memory block still cannot be found, the control unit finds a spare block in the memory unit, namely a second memory block, and writes the block data in the data buffer to it. The control unit then reports a read error and marks the first memory block as a bad block. These steps are described in Fig. 6.
In case one, when the comparing unit finds error bits, it sends the information about them to the repair unit. Based on that information, the repair unit revises the corresponding bits of the affected page of the block data held in the data buffer (step S517). After the data in the data buffer has been revised, the ECC unit runs error detection and correction on that page in the data buffer and determines whether its errors fall within the ECC-correctable range (step S519). If the number of erroneous bits in the buffered page (the page in which the error occurred in the original first memory block) still exceeds what the ECC unit can correct, the test data generator produces another set of test data and writes it to that page of the first memory block, and the detection and revision procedure described above continues, as in step S515.
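Steps S517 and S519 can be illustrated with the sketch below. The description does not say how a buffered bit is "revised"; this example assumes it is inverted, which is only one possible reading. The names `revise_buffer_page`, `repair_step` and `ecc_can_correct` are hypothetical, and the ECC check is passed in as a callable because the actual ECC code is not specified. Bit 0 is taken to be the most significant bit of byte 0.

```python
from typing import Callable, Iterable


def revise_buffer_page(page: bytearray, error_bits: Iterable[int]) -> None:
    """Invert the buffered bits at the reported positions (assumed reading of S517)."""
    for pos in error_bits:
        page[pos // 8] ^= 0x80 >> (pos % 8)


def repair_step(page: bytearray,
                error_bits: Iterable[int],
                ecc_can_correct: Callable[[bytes], bool]) -> bool:
    """Revise the page in place, then ask ECC whether it is now correctable (S519)."""
    revise_buffer_page(page, error_bits)
    return ecc_can_correct(bytes(page))
```

A `False` return corresponds to the branch in which another set of test data is written and the procedure repeats.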
If, after a predetermined number of test passes, the erroneous bits of the page in the data buffer still cannot be reduced to the range that ECC can correct, the control unit finds a spare block in the memory unit, assumed here to be a second memory block, and writes the block data in the data buffer to it. The control unit then reports a read error and marks the first memory block, in which the error occurred, as a bad block. These steps are described in Fig. 6.
If, after repair, the number of erroneous bits in the data in the data buffer has fallen within the range the ECC unit can correct, the control unit finds a spare block in the memory unit, assumed here to be a second memory block, and writes the data in the data buffer to it (step S521). Once all data has been copied to the second memory block, the control unit marks the first memory block, in which the data error occurred, as a bad block (step S523).
If the storage system encounters a block in which several pages have errors that ECC cannot correct, the storage system of the invention copies all data in the block to the data buffer and then carries out the error detection and repair procedure described above, one page at a time, for each page of the block that ECC cannot correct. Once the error bits of every such page in the data buffer have been reduced to the range that ECC can correct, the data in the data buffer is written back to the memory unit, after which the remaining errors can be corrected by the ECC unit, and the block in which the ECC-uncorrectable errors occurred is marked as a bad block.
In the flows described at step S415 of Fig. 4 and step S515 of Fig. 5, when the comparing unit finds no error bit, the invention repeats the test flow: the test data generator produces the next, different set of test data and one or more further test passes are carried out to detect error bits on the erroneous page of the memory block, until an error bit is found.
When carrying out this test loop, the invention sets an upper limit (greater than 1) on the number of tests. When no error bit has been found, or when the data has been repaired but ECC still cannot correct it (step S601), the next test flow is due, but it must first be determined whether the upper limit on the number of tests would be exceeded (step S603). If the error bits still cannot be located after that many tests, that is, the test flows carried out would exceed the preset number, step S611 is performed: the control unit reports a read error to the storage system and marks the memory block currently under test as a bad block (step S613). If the test flow about to be carried out is still below the preset number or exactly equal to it, the next test flow proceeds (step S605); for the test flow itself, see Fig. 4 or Fig. 5. The test result is then checked to see whether an error bit was found (step S607) or, after repair, whether ECC can correct the data (step S609). If no error bit was found, or ECC still cannot correct the data after repair, the steps of Fig. 6 are repeated.
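The bounded retry logic of Fig. 6 can be sketched as a simple loop. The names `Outcome`, `run_test_loop` and `test_pass` are illustrative assumptions; `test_pass` stands for one complete test flow of Fig. 4 or Fig. 5 and is expected to return `True` only when an error bit was found and the buffered data has become ECC-correctable.

```python
from enum import Enum
from typing import Callable, Tuple


class Outcome(Enum):
    REPAIRED = "repaired"
    READ_ERROR = "read_error"   # report a read error and mark the block bad


def run_test_loop(test_pass: Callable[[int], bool],
                  max_tests: int) -> Tuple[Outcome, int]:
    """Repeat the test flow until it succeeds or the test limit is reached."""
    if max_tests <= 1:
        raise ValueError("the upper limit on tests must be greater than 1")
    done = 0
    while done + 1 <= max_tests:      # S603: would the next test exceed the limit?
        done += 1                     # S605: carry out the next test flow
        if test_pass(done):           # S607 / S609
            return Outcome.REPAIRED, done
    return Outcome.READ_ERROR, done   # S611 and S613
```

A test whose number equals the limit is still carried out, matching the statement that the flow proceeds when the count is below or exactly equal to the preset number.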
According to the embodiments above, the storage system with data repair function proposed by the invention exchanges data through a built-in temporary memory, in particular one used for data repair. Figs. 7A to 7H instead show the random access memory of a computer system being used as the temporary area for holding and repairing data, rather than a memory block in the memory unit of the storage system. This applies in particular when the storage system in this example is connected to a computer system, such as a desktop computer, notebook computer or portable computer system, whose random access memory is large enough to serve as temporary storage for the storage system.
Refer first to Figs. 7A to 7C. When data is read and a page turns out to have errors that ECC cannot correct, such as the first page of block A in Fig. 7A (the page marked "defective"), the control unit copies the data of block A to the data buffer; in this example the data of block A is copied to a RAM scratch block. Suppose that during copying the third and fifth pages are also found to have errors that ECC cannot correct. The data of the first, third and fifth pages, together with the pages in which no ECC-uncorrectable error occurred, is then copied directly from block A to the RAM scratch block, as shown in Fig. 7B.
After the data has been moved, the original data of block A is erased and first test data is written to block A, marked "sample 1" in Fig. 7C. Then, as shown in Fig. 7D, the error bits of the first, third and fifth pages of block A are checked against the first test data written to each page. If error bits are found, the data of the corresponding pages in the RAM scratch block shown in the figure is revised directly according to them. Once the number of erroneous bits of the first, third and fifth pages in the data buffer has fallen within the range ECC can correct (the pages marked "revised" in the RAM scratch block of Fig. 7D; the rest are "raw data"), the control unit finds a block in the memory unit, assumed here to be block B, copies the data in the RAM scratch block to block B, and finally marks block A as a bad block.
However, as shown in Fig. 7E, when the control unit tries to use the error bits detected in the third page of block A to revise the data of the third page in the data buffer, two situations can arise. In the first, the test flow using the first test data ("sample 1") still fails to detect any error bit in the third page of block A, shown as the third page marked "defective" in the figure. In the second, error bits are detected, but the number of erroneous bits in the revised data still exceeds the range ECC can correct.
In the second situation, where error bits are detected, the control unit still revises the data of the third page in the data buffer according to those error bits, then reads the data of the fifth page of block A, compares it, and revises the error bits in the fifth page of the RAM scratch block. In the first situation, where no error bit can be detected, the data of the third page in the RAM scratch block is left unrevised (the page marked "defective" in the RAM scratch block of Fig. 7E), and the control unit goes on to read and compare the data of the fifth page of block A in order to revise the error bits in the fifth page of the temporary storage area.
Then, as shown in Fig. 7F, once all pages have been tested with the first test data, block A is erased and second test data (marked "sample 2") is written to the first, third and fifth pages of block A. Referring to Figs. 7G and 7H, after the second test data has been written to block A, the control unit reads the data of the third page of block A and compares it with the second test data to find error bits, then revises the data of the third page in the RAM scratch block according to them, shown as the third page marked "revised" in the RAM scratch block of Fig. 7G.
If the second test data still does not allow the data of the third page in the data buffer to be repaired, further test data can be written to block A and the above procedure repeated to repair that page. If the next test would exceed the preset upper limit on the number of tests and the error bits still cannot be reduced to the range ECC can correct, the control unit outputs read error information, stores the data in the RAM scratch block of this example back into another block of the memory unit, and marks block A as a bad block, as indicated by the "bad area" label on block A in Fig. 7H.
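Why a second pattern ("sample 2") can expose error bits that the first one missed can be shown with a toy model. The sketch assumes defective cells are stuck at a fixed level and uses all-ones and all-zeros patterns; neither assumption comes from the description, which leaves the failure mode and the patterns open. `SimulatedPage` and `detect_error_bits` are hypothetical names, bit 0 is the least significant bit here, and the erase before each write is omitted.

```python
from typing import Dict, Set


class SimulatedPage:
    """Toy page whose defective cells always read back a fixed level."""

    def __init__(self, n_bits: int, stuck: Dict[int, int]) -> None:
        self.n_bits = n_bits
        self.stuck = dict(stuck)   # bit position -> level the cell is stuck at
        self.cells = 0

    def program(self, value: int) -> None:
        value &= (1 << self.n_bits) - 1
        for pos, level in self.stuck.items():
            if level:
                value |= 1 << pos      # cell cannot hold a 0
            else:
                value &= ~(1 << pos)   # cell cannot hold a 1
        self.cells = value

    def read(self) -> int:
        return self.cells


def detect_error_bits(page: SimulatedPage, pattern: int) -> Set[int]:
    """Write a test pattern, read it back and return the mismatching positions."""
    pattern &= (1 << page.n_bits) - 1
    page.program(pattern)
    diff = page.read() ^ pattern
    return {pos for pos in range(page.n_bits) if (diff >> pos) & 1}


page3 = SimulatedPage(8, {1: 0, 6: 1})
found_sample1 = detect_error_bits(page3, 0xFF)   # exposes only the stuck-at-0 cell
found_sample2 = detect_error_bits(page3, 0x00)   # exposes only the stuck-at-1 cell
```

In this model a cell is only detected when the test pattern asks it to hold the level it cannot hold, which is one plausible reason a single pattern may find no error bit on a page that is in fact defective.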
In the other situation, the data of the first, third and fifth pages in the RAM scratch block has all been revised so that the number of erroneous bits on each of those pages falls within the range ECC can correct. The control unit then finds a block B in the memory unit, writes the data in the data buffer to block B, and marks block A as a bad block.
Thus, when a block develops errors that ECC cannot correct and data can no longer be read correctly, the data repair method of the invention can reduce the erroneous bits of the block's data to the range that ECC can correct. The repaired block data is stored in another block, and the block with the ECC-uncorrectable errors is marked as a bad block so that it is not used again, which effectively improves the reliability of the data in the storage system.
In summary, the invention discloses a storage system with data repair function and a data repair method thereof. One or more repeated test and repair passes reduce the errors in the memory medium to the range that the commonly used error detection and correction (ECC) function can correct, ensuring that data is read correctly and effectively improving data reliability.
The above is only a preferred embodiment of the invention. It should be noted that those skilled in the art can make further improvements and modifications without departing from the principle of the invention, and such improvements and modifications shall also be regarded as falling within the scope of protection of the invention.

Claims (11)

1. A storage system with data repair function, characterized in that the system comprises:
a non-volatile memory unit; and
a control unit, the control unit comprising:
a test data generator for producing test data and writing the test data to the location of the memory unit that stores erroneous data;
a comparing unit for comparing the test data read from the memory unit with the test data originally written, to determine the addresses of the error bits of the memory unit; and
an error detection and correction unit for performing error detection and correction on the data in the memory unit, and for performing error checking and correction on the data in a data buffer;
wherein, when the control unit detects that an error which the error detection and correction unit cannot correct has occurred in a memory block of the non-volatile memory unit, it copies the data in the memory block to a temporary storage space and erases the data of the memory block; the test data generator carries out a test flow by producing test data and writing it to the location in the memory block that stores the erroneous data; and the comparing unit compares the test data read from the memory block with the test data originally written to obtain the positions of the error bits, which are then repaired;
the repair further comprises revising the error bits of the data stored in the temporary storage space, and determining whether, after the revision, erroneous data remains that the error detection and correction unit cannot correct; if the number of erroneous bits in the data stored in the temporary storage space still exceeds the range the error detection and correction unit can correct, the next test flow is carried out; if the number of erroneous bits in the data stored in the temporary storage space has fallen within the range the error detection and correction unit can correct, error detection and correction is performed; and the erroneous memory block is marked as a bad block.
2. The storage system with data repair function as claimed in claim 1, characterized in that the system further comprises a repair unit, which revises the error bits of the data stored in the data buffer according to the error-bit information provided by the comparing unit.
3. The storage system with data repair function as claimed in claim 2, characterized in that the data buffer is used to temporarily store data of the memory unit that the error detection and correction unit cannot correct.
4. The storage system with data repair function as claimed in claim 3, characterized in that the data buffer is selected from random access memory, non-volatile memory, phase-change memory, ferroelectric random access memory and magnetic random access memory.
5. The storage system with data repair function as claimed in claim 1, characterized in that the test data is two randomly generated random-number data.
6. A data repair method applied to a storage system, characterized in that the method comprises:
detecting that an error which an error detection and correction unit cannot correct has occurred in a memory block;
copying the data in the memory block to a temporary storage space, and erasing the data of the memory block;
carrying out a test flow, comprising producing test data and writing it to the location in the memory block that stores the erroneous data;
comparing the test data read from the memory block with the test data originally written to obtain the positions of the error bits, and repairing them,
wherein the repair process further comprises:
revising the error bits of the data stored in the temporary storage space;
determining whether, after the revision, erroneous data remains that the error detection and correction unit cannot correct; if the number of erroneous bits in the data stored in the temporary storage space still exceeds the range the error detection and correction unit can correct, carrying out the next test flow; if the number of erroneous bits in the data stored in the temporary storage space has fallen within the range the error detection and correction unit can correct,
performing error detection and correction; and
marking the memory block as a bad block.
7. The data repair method applied to a storage system as claimed in claim 6, characterized in that the temporary storage space is selected from random access memory, non-volatile memory, phase-change memory, ferroelectric random access memory and magnetic random access memory.
8. The data repair method applied to a storage system as claimed in claim 6, characterized in that, in the step of copying the data of the memory block to the temporary storage space, the original error detection and correction function of the storage system is suspended.
9. The data repair method applied to a storage system as claimed in claim 6, characterized in that the test data produced is two randomly generated random-number data.
10. The data repair method applied to a storage system as claimed in claim 6, characterized in that, when no error bit is found, it is determined whether the number of the next test exceeds an upper limit on the number of tests.
11. The data repair method applied to a storage system as claimed in claim 10, characterized in that, if the number of the next test exceeds the upper limit on the number of tests, a control unit reports a read error and the memory block is marked as a bad block.
CN 200810109903 2008-06-04 2008-06-04 Storage system with data repair function and data repair method thereof Active CN101599305B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 200810109903 CN101599305B (en) 2008-06-04 2008-06-04 Storage system with data repair function and data repair method thereof

Publications (2)

Publication Number Publication Date
CN101599305A CN101599305A (en) 2009-12-09
CN101599305B true CN101599305B (en) 2013-03-27

Family

ID=41420709

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200810109903 Active CN101599305B (en) 2008-06-04 2008-06-04 Storage system with data repair function and data repair method thereof

Country Status (1)

Country Link
CN (1) CN101599305B (en)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8453043B2 (en) * 2010-09-13 2013-05-28 Taiwan Semiconductor Manufacturing Company, Ltd. Built-in bit error rate test circuit
CN102543210B (en) * 2012-02-10 2016-12-14 上海华虹宏力半导体制造有限公司 Flash memory error checking and correction repairing method
JP5965076B2 (en) * 2012-09-25 2016-08-03 ヒューレット−パッカード デベロップメント カンパニー エル.ピー.Hewlett‐Packard Development Company, L.P. Uncorrectable memory error processing method and its readable medium
US9348748B2 (en) * 2013-12-24 2016-05-24 Macronix International Co., Ltd. Heal leveling
TWI569279B (en) 2015-10-15 2017-02-01 財團法人工業技術研究院 Memory protection device and method
TWI601148B (en) * 2016-05-05 2017-10-01 慧榮科技股份有限公司 Method for selecting bad columns and data storage device with? bad column summary table
CN106445725A (en) * 2016-09-20 2017-02-22 华中科技大学 Test method for error mode of flash memory and system
CN106502821A (en) * 2016-10-26 2017-03-15 武汉迅存科技有限公司 A kind of method and system for obtaining flash memory antithesis page false correlations
CN106847341A (en) * 2016-12-23 2017-06-13 鸿秦(北京)科技有限公司 The memory bank self-checking unit and method of a kind of pure electric automobile integrated information storage device
KR20180096845A (en) * 2017-02-20 2018-08-30 에스케이하이닉스 주식회사 Memory system and operation method of the same
CN109753374B (en) * 2017-11-01 2022-05-03 珠海兴芯存储科技有限公司 Memory bit level repair method
CN110660445B (en) * 2018-06-29 2021-07-30 华邦电子股份有限公司 Method for repairing outlier and memory device
WO2022198491A1 (en) * 2021-03-24 2022-09-29 Yangtze Memory Technologies Co., Ltd. Memory device with failed main bank repair using redundant bank

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1750172A (en) * 2004-09-08 2006-03-22 三星电子株式会社 Non-volatile memory device and method of testing thereof with test data buffers

Also Published As

Publication number Publication date
CN101599305A (en) 2009-12-09

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant