CN104881370B - Method for constructing a reliable flash storage system using erasure codes and error-correcting codes cooperatively - Google Patents


Publication number
CN104881370B
Authority
CN
China
Prior art keywords
page
request
bit
errors
error correcting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510236451.XA
Other languages
Chinese (zh)
Other versions
CN104881370A (en
Inventor
肖侬
陈志广
卢宇彤
周恩强
张伟
董勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology filed Critical National University of Defense Technology
Priority to CN201510236451.XA priority Critical patent/CN104881370B/en
Publication of CN104881370A publication Critical patent/CN104881370A/en
Application granted granted Critical
Publication of CN104881370B publication Critical patent/CN104881370B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Techniques For Improving Reliability Of Storages (AREA)

Abstract

The invention discloses a method for constructing a reliable flash storage system that cooperatively uses erasure codes and error-correcting codes. The steps include: receive an I/O request R and determine whether it is a read or a write. For a write request, use the erasure code to expand the s user data pages of each stripe into s+k pages to be written, and write them to the storage device together with the checksum and error-correcting code of each page. For a read request, split the request into sub-requests that belong to different stripes; for each sub-request, read each page together with its checksum and error-correcting code, compute the checksum of each page to identify bit errors, and find the page with the most bit errors. If that page's number of bit errors does not exceed T, the maximum number of erroneous bits that the error-correcting code in use can correct, correct the bit errors in the sub-request with the error-correcting code; otherwise correct them with the erasure code. Then return the data of the sub-request. The invention has low computational overhead, fast I/O, and a significant flash-lifetime extension effect.

Description

Method for constructing a reliable flash storage system using erasure codes and error-correcting codes cooperatively
Technical field
The present invention relates to the field of computer storage systems, and in particular to a method for constructing a reliable flash storage system that cooperatively uses erasure codes and error-correcting codes.
Background
Flash memory is widely deployed in large-scale storage systems because of its superior performance, but its limited lifetime hinders its adoption under write-intensive workloads to some extent. The lifetime of flash is the number of program/erase (P/E) cycles that each memory cell can endure. When a manufacturer releases a flash chip, it specifies a rated lifetime for that chip. As long as the P/E count of each memory cell stays within the rated lifetime, the bit error rate of the flash is low and the data stored on it is considered reliable. In fact, over a wide range beyond the rated lifetime, flash can still hold data; the bit error rate merely grows gradually, and data loss may occur. As the P/E count increases further, however, the bit error rate grows exponentially and the flash eventually becomes unusable. This growth trend shows that flash remains usable until the bit error rate begins to grow exponentially with the P/E count: if the bit errors that occur in flash are corrected by appropriate means, the data stored in flash remains reliable even after the P/E count of each memory cell exceeds the rated lifetime.
ECC (Error Correction Code) is a fault-tolerance mechanism commonly used in storage devices. When a storage device writes a page of user data to the storage medium, it also generates a certain amount of error-correcting code for that page and stores the code in the medium together with the user data. When serving a read request from an upper-layer application, the storage device fetches both the requested user data and the corresponding error-correcting code from the medium, and uses the code to detect and correct bit errors in the user data, thereby ensuring that the data is intact and reliable. Error-correcting codes have very low space overhead: a dozen or so bits of information can protect several hundred bytes of data. Their error-recovery capability is limited, however, so they suit only scenarios with a low error probability. While the P/E count of every flash cell is within the rated lifetime, the bit error rate is low and an error-correcting code can keep the data reliable. Once the P/E count exceeds the rated lifetime, the bit error rate keeps rising, and the error-correcting code must use more check bits, together with more complex computation, to correct the bit errors in the user data. Because error-correcting-code computation lies on the I/O critical path, excessive computational overhead severely degrades I/O performance. Therefore, once the actual service life of flash exceeds its rated lifetime, an error-correcting code alone cannot guarantee the reliability of a flash storage system at low computational overhead.
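To make the idea of an error-correcting code concrete, the following minimal sketch uses a Hamming(7,4) code, which corrects a single flipped bit per 7-bit codeword (so T = 1 for one codeword). The choice of Hamming(7,4) is an assumption made purely for illustration: the text does not name a specific error-correcting code, and flash controllers typically use stronger codes such as BCH or LDPC. The function names are hypothetical.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    # Codeword layout (positions 1..7): p1 p2 d1 p3 d2 d3 d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The syndrome is the 1-based position of the flipped bit, or 0 if none.
    syndrome = s1 + 2 * s2 + 4 * s3
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]
```

Three parity bits protect four data bits here; with two or more flipped bits in one codeword the decoder returns wrong data, which is the limited correction capability the paragraph describes.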
An erasure code is a fault-tolerance mechanism commonly used at the system level; it ensures data reliability mainly by storing a large amount of redundant data. For an erasure code abstracted as an (n, k) tuple: when serving a write request from an upper-layer application, it generates k pieces of redundant data from n pieces of user data; the n+k pieces are all the same size and are written to the storage system together. When serving a read request, all user data can be recovered correctly as long as n of the n+k pieces are still intact. Erasure codes have strong fault tolerance, which grows with k, so they suit scenarios with a high bit error rate. When the P/E count of flash cells exceeds the rated lifetime and the bit error rate is high, an erasure code with a sufficiently large k can still keep the data on flash intact and reliable, at the cost of some computational and space overhead.
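The (n, k) abstraction can be illustrated with the simplest erasure code, a single XOR parity piece (k = 1). This is a stand-in chosen for brevity, not the code the text prescribes: with k = 1 it tolerates the loss of any one of the n+1 pieces, and a larger k would require a code such as Reed-Solomon. The function names are hypothetical, and all pieces are assumed to be the same size.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode_stripe(user_pieces):
    """Return the n user pieces followed by one XOR parity piece (k = 1)."""
    parity = reduce(xor_bytes, user_pieces)
    return list(user_pieces) + [parity]


def recover_stripe(pieces):
    """Recover the n user pieces from n+1 pieces, at most one of which is None."""
    missing = [i for i, p in enumerate(pieces) if p is None]
    if len(missing) > 1:
        raise ValueError("single parity can rebuild at most one lost piece")
    pieces = list(pieces)
    if missing:
        survivors = [p for p in pieces if p is not None]
        # XOR of all surviving pieces equals the lost piece.
        pieces[missing[0]] = reduce(xor_bytes, survivors)
    return pieces[:-1]
```

Any n of the n+1 pieces suffice to rebuild the user data, which matches the recovery condition stated above for k = 1.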
Comparing the two fault-tolerance mechanisms described above: an error-correcting code suits only scenarios with a low error rate; when the error rate is high, its computational overhead rises sharply and it may even fail to guarantee data correctness. An erasure code suits scenarios with a high error rate; its strong fault tolerance lets it correct a large number of bit errors at a constant computational overhead, but when bit errors are few, that overhead is higher than that of an error-correcting code. With either mechanism used alone, it is difficult to extend the life of flash beyond the manufacturer's rated lifetime while preserving I/O performance. How to use error-correcting codes and erasure codes cooperatively, so that data stored in a flash storage system remains reliable after the actual service life of the flash exceeds its rated lifetime, without harming performance, and so that the actual service life of flash is extended, has therefore become a technical problem in urgent need of a solution.
Summary of the invention
The technical problem to be solved by the present invention is, in view of the above problems of the prior art, to provide a method for constructing a reliable flash storage system that cooperatively uses erasure codes and error-correcting codes, with low computational overhead and fast I/O, that keeps data stored in the flash storage system reliable after the actual service life of the flash exceeds its rated lifetime and that significantly extends flash lifetime.
To solve the above technical problem, the present invention adopts the following technical solution:
A method for constructing a reliable flash storage system that cooperatively uses erasure codes and error-correcting codes, comprising the steps of:
1) initialize the buffer that receives I/O requests;
2) receive an I/O request R and determine its type; if it is a write request, go to step 3); otherwise, if it is a read request, go to step 4);
3) select the write data of I/O request R in units of stripes; for each selected stripe, use the erasure code to generate k redundant data pages from its s user data pages; compute the checksum and error-correcting code of each of the s+k pages formed by the s user data pages and the k redundant data pages; and write the s+k pages to the storage device together with the checksum and error-correcting code of each page;
4) split I/O request R into sub-requests that each belong to a different stripe; for each sub-request, read each page it contains together with that page's checksum and error-correcting code; compute the checksum of each page and identify the bit errors of each page; find the page with the most bit errors and determine whether its number of bit errors exceeds T, the maximum number of erroneous bits that the error-correcting code in use can correct; if the number does not exceed T, correct the bit errors in each page of the sub-request with the error-correcting code; if it exceeds T, correct them with the erasure code; then return the data contained in the pages of the sub-request.
Preferably, step 1) further includes initializing to 0 a write request counter Count_w, which records the total number of pages waiting to be written to the storage device, and the detailed steps of step 3) include:
3.1) add the number of pages contained in write request R to the write request counter Count_w;
3.2) determine whether Count_w exceeds a preset threshold h, where h is an integer greater than n, the number of pages in one complete stripe; if Count_w exceeds h, go to step 3.3); if Count_w does not exceed h, go to step 2);
3.3) sort the Count_w pending write data pages in ascending order of page number and divide them into stripes such that the page numbered x is assigned to stripe ⌊x/n⌋, where n is the number of pages in one complete stripe; select one complete stripe from the Count_w pending write data pages; if a complete stripe is found, go to step 3.4); otherwise, select the incomplete stripe that contains the most user data pages and go to step 3.4);
3.4) use the erasure code to generate k redundant pages from the s user data pages of the selected complete or incomplete stripe, giving s+k pages; compute the checksum and error-correcting code of each of the s+k pages; write the s+k pages to the storage device together with the checksum and error-correcting code of each page; finally, subtract from Count_w the number s of user data pages just written, and go to step 3.2).
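The buffering logic of steps 3.1) to 3.4) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class name `WriteBuffer` and its fields are hypothetical, page numbers are assumed to be distinct so that Count_w equals the size of the pending set, and the erasure-code encoding and device write of step 3.4) are reduced to returning the selected stripe.

```python
from collections import defaultdict


class WriteBuffer:
    """Accumulates page numbers and releases one stripe at a time."""

    def __init__(self, n, h):
        if h <= n:
            raise ValueError("threshold h must be greater than stripe size n")
        self.n = n            # user data pages per complete stripe
        self.h = h            # flush threshold
        self.pending = set()  # Count_w is len(self.pending)

    def add(self, page_numbers):
        """Buffer a write request; return the stripes selected for writing."""
        self.pending.update(page_numbers)            # step 3.1
        flushed = []
        while len(self.pending) > self.h:            # step 3.2
            stripes = defaultdict(list)              # step 3.3
            for x in sorted(self.pending):
                stripes[x // self.n].append(x)
            # A complete stripe holds n pages, the maximum possible, so the
            # fullest stripe is a complete one whenever one exists; otherwise
            # it is the incomplete stripe with the most user pages.
            stripe_id = max(stripes, key=lambda s: (len(stripes[s]), -s))
            pages = stripes[stripe_id]
            # Step 3.4 would encode these s pages into s+k pages and write them.
            flushed.append((stripe_id, pages))
            self.pending.difference_update(pages)    # Count_w -= s
        return flushed
```

For example, with `n=4` and `h=6`, adding pages 0 to 3 and then pages 8 to 10 pushes the pending count to 7, so the complete stripe 0 is released and pages 8 to 10 stay buffered.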
Preferably, the threshold h in step 3.2) is 10 times n, the number of pages in one complete stripe.
Preferably, the detailed steps of step 3.3) include:
3.3.1) select one complete stripe from the Count_w pending write data pages; if a complete stripe is found, go to step 3.4); otherwise, go to step 3.3.2);
3.3.2) determine whether I/O request R is a sequential write request; if it is, go to step 2); otherwise, if it is not a sequential write request, go to step 3.3.3);
3.3.3) select the incomplete stripe that contains the most user data pages, and go to step 3.4).
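The three-way decision in steps 3.3.1) to 3.3.3) reduces to a small function. The sketch below assumes the caller already knows whether a complete stripe is buffered and whether the current request is a sequential (continuous) write; how sequentiality is detected is not specified in the text. The function name and the returned labels are hypothetical.

```python
def stripe_selection_action(has_complete_stripe: bool,
                            request_is_sequential: bool) -> str:
    """Decide what to do once the write buffer has crossed its threshold."""
    if has_complete_stripe:
        return "write_complete_stripe"        # step 3.3.1 -> step 3.4
    if request_is_sequential:
        # More pages of the same stripe are likely to follow, so keep waiting.
        return "wait_for_more_requests"       # step 3.3.2 -> step 2
    return "write_fullest_incomplete_stripe"  # step 3.3.3 -> step 3.4
```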
Preferably, when the s+k pages are written to the storage device together with the error-correcting code and checksum of each page in step 3.4), the storage device allocates one free physical page for each of the s+k pages to be written; the data of each page is written to the data area of its allocated physical page, and the error-correcting code and checksum of each page are written to the additional area of that physical page.
Preferably, computing the checksum of each of the s+k pages in step 3.4) specifically means dividing the bit stream of the page into fixed-size words of 64 bits each, and taking the result of XOR-ing these words together as the checksum of the page.
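The XOR checksum over 64-bit words can be sketched as below. The little-endian word layout is an assumption, as are the function names. The text does not spell out how a bit-error count is derived from the checksum comparison; counting the bits in which the stored and recomputed checksums differ is one plausible reading, and it undercounts whenever two errors hit the same bit offset in different words.

```python
import struct

WORD_BYTES = 8  # each word holds 64 bits


def xor_checksum(page: bytes) -> int:
    """XOR all 64-bit words of a page together (little-endian, assumed)."""
    if len(page) % WORD_BYTES:
        raise ValueError("page length must be a multiple of 8 bytes")
    checksum = 0
    for (word,) in struct.iter_unpack("<Q", page):
        checksum ^= word
    return checksum


def estimated_bit_errors(page: bytes, stored_checksum: int) -> int:
    """Count the bit positions where the recomputed checksum differs
    from the checksum stored at write time (a lower-bound estimate)."""
    return bin(xor_checksum(page) ^ stored_checksum).count("1")
```

Because the computation is a single pass of XOR operations, it is cheap enough to run on every read before deciding which correction mechanism to invoke.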
Preferably, the detailed steps of step 4) include:
4.1) split I/O request R into m sub-requests that each belong to a different stripe;
4.2) initialize the sub-request counter i to 0;
4.3) read each page contained in sub-request R_i, together with the error-correcting code and checksum of each page;
4.4) compute the checksum of each page contained in sub-request R_i, and compare the computed checksum with the checksum read in step 4.3) to identify the bit errors in each page of R_i;
4.5) find the page with the most erroneous bits among the pages of sub-request R_i, and compare its number of erroneous bits with T, the maximum number of erroneous bits that the error-correcting code in use can correct; if the number is not greater than T, go to step 4.6); otherwise, if it is greater than T, go to step 4.7);
4.6) correct the bit errors of each page contained in sub-request R_i with the error-correcting code; if correction fails, go to step 4.7); otherwise, if correction succeeds, go to step 4.8);
4.7) correct the bit errors of each page contained in sub-request R_i with the erasure code;
4.8) return the user data contained in sub-request R_i to the upper-layer application;
4.9) increment the sub-request counter i by 1;
4.10) if the sub-request counter i is less than the number of sub-requests m, some sub-requests remain unprocessed; go to step 4.3); if i equals m, all sub-requests have been processed; go to step 2).
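The control flow of steps 4.2) to 4.10), including the fallback from the error-correcting code to the erasure code, can be sketched as a loop over sub-requests. The three callables are hypothetical hooks supplied by the caller; the sketch assumes that `ecc_correct` signals failure by returning `None`, which is a convention chosen here rather than anything stated in the text.

```python
def serve_read(sub_requests, T, count_errors, ecc_correct, erasure_correct):
    """Process sub-requests in order and return their corrected data.

    sub_requests    -- list of page lists, one list per stripe
    T               -- maximum bit errors per page the ECC can correct
    count_errors    -- hook: page -> bit errors indicated by its checksum
    ecc_correct     -- hook: pages -> corrected data, or None on failure
    erasure_correct -- hook: pages -> corrected data
    """
    results = []
    for pages in sub_requests:                       # steps 4.2, 4.9, 4.10
        worst = max((count_errors(p) for p in pages), default=0)  # 4.4, 4.5
        data = None
        if worst <= T:
            data = ecc_correct(pages)                # step 4.6
        if data is None:
            data = erasure_correct(pages)            # step 4.7
        results.append(data)                         # step 4.8
    return results
```

The erasure code is reached in two ways: directly, when the worst page exceeds T, and as a fallback, when the error-correcting code fails even though the checksum suggested few errors.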
Preferably, when I/O request R is split in step 4.1) into m sub-requests that each belong to a different stripe, the value of m is (Offset_R + Size_R + n - 1)/n - Offset_R/n, where each division is an integer division, n is the number of user data pages in one complete stripe, Offset_R is the start address of I/O request R, and Size_R is the number of pages in I/O request R; that is, I/O request R consists of Size_R consecutive pages beginning at start address Offset_R. The pages contained in the i-th sub-request R_i of the m sub-requests are given by formula (1);
[Formula (1) is not reproduced in this text.]
In formula (1), R_i is the i-th sub-request, n is the number of pages in one complete stripe, Offset_R is the start address of I/O request R, and Size_R is the number of pages in I/O request R.
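Formula (1) itself is not available in this text, so the sketch below is one reading that is consistent with the surrounding definitions rather than the original formula: it assumes that page x belongs to stripe x // n and that sub-request R_i holds the requested pages falling in the i-th stripe touched by the request. The function name is hypothetical.

```python
def split_into_sub_requests(offset, size, n):
    """Split pages offset .. offset+size-1 into one page list per stripe."""
    if size <= 0 or n <= 0:
        raise ValueError("size and n must be positive")
    # Number of stripes touched, as given in the text (integer division).
    m = (offset + size + n - 1) // n - offset // n
    first_stripe = offset // n
    end = offset + size
    sub_requests = []
    for i in range(m):
        lo = max(offset, (first_stripe + i) * n)      # first requested page in stripe
        hi = min(end, (first_stripe + i + 1) * n)     # one past the last one
        sub_requests.append(list(range(lo, hi)))
    return sub_requests
```

With `offset=5`, `size=10`, and `n=4`, the request covers pages 5 to 14 and m = 18 // 4 - 5 // 4 = 3, giving the sub-requests [5, 6, 7], [8, 9, 10, 11], and [12, 13, 14].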
Preferably, the detailed steps of step 4.7) include:
4.7.1) initialize the data recovery counter to 0; given that the v pages of sub-request R_i, which belong to the same stripe, have already been read, read n-v further pieces of data from the storage device, so as to obtain the at least n pieces of data that the erasure code needs in order to recover the stripe;
4.7.2) using the n pieces of data read in total, correct the bit errors of each page contained in sub-request R_i with the erasure code; if the bit errors are corrected successfully, go to step 4.8); otherwise, increment the data recovery counter by 1 and go to step 4.7.3);
4.7.3) determine whether the value of the data recovery counter equals the retry bound [bound not reproduced in this text]; if the value is less than the bound, read another set of the at least n pieces of data that the erasure code needs to recover the stripe, and go to step 4.7.2); otherwise, if the value equals the bound, conclude that correcting the bit errors of the pages contained in sub-request R_i with the erasure code has failed.
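The retry loop of steps 4.7.1) to 4.7.3) can be sketched as follows. Because the retry bound in step 4.7.3) is missing from this text, it is exposed here as the parameter `max_attempts` rather than guessed. The text also does not say how each alternative set of pieces is chosen; enumerating combinations of the remaining pieces is an assumption, and `try_decode` is a hypothetical hook that returns `None` when decoding fails.

```python
from itertools import combinations, islice


def recover_with_retries(read_pieces, other_pieces, n, try_decode, max_attempts):
    """Try erasure-code recovery with different sets of n pieces.

    read_pieces  -- ids of the v pieces already read for the sub-request
    other_pieces -- ids of the remaining pieces of the same stripe
    n            -- pieces the erasure code needs for one recovery attempt
    try_decode   -- hook: list of n piece ids -> data, or None on failure
    max_attempts -- retry bound (its expression is missing from the text)
    """
    need = n - len(read_pieces)                      # the n - v extra pieces
    candidates = combinations(other_pieces, need)
    for extra in islice(candidates, max_attempts):   # counter < bound
        data = try_decode(list(read_pieces) + list(extra))
        if data is not None:
            return data                              # continue at step 4.8
    return None                                      # recovery failed
```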
The method of the present invention for constructing a reliable flash storage system that cooperatively uses erasure codes and error-correcting codes has the following advantages:
1. The present invention addresses the limited lifetime of flash, whose bit error rate grows progressively as the P/E count increases, by using an erasure code to correct the bit errors that occur in flash, thereby extending the actual service life of flash beyond the manufacturer's rated lifetime. Because erasure codes have strong error-correction capability, the invention can extend flash lifetime by a factor of tens and can significantly improve the reliability of the flash storage system.
2. When the average P/E count of flash cells is small and the bit error rate is correspondingly low, correcting bit errors with an erasure code has a relatively large computational overhead. To reduce this overhead, the invention first uses checksums to pre-assess the bit errors in flash pages. When the checksums indicate few bit errors, the error-correcting code is used to correct the bit errors in the pages. An error-correcting code has relatively weak correction capability and suits only low bit error rates, but its computational overhead is low and it has no significant impact on I/O performance. By using the erasure code and the error-correcting code flexibly, the invention both significantly extends flash lifetime and keeps computational overhead low.
3. The computational overhead of the invention is low and has no significant negative effect on I/O performance. When data is recovered with the erasure code, the high concurrency of flash is fully exploited: the read requests involved in data recovery are dispatched across multiple concurrent channels, giving good I/O performance.
Brief description of the drawings
Fig. 1 is a flowchart of the basic implementation of the embodiment of the present invention.
Fig. 2 is a flowchart of the detailed implementation of the embodiment of the present invention.
Detailed description of the embodiments
As shown in Fig. 1, the method of this embodiment for constructing a reliable flash storage system that cooperatively uses erasure codes and error-correcting codes includes the following steps:
1) initialize the buffer that receives I/O requests; this means allocating a region of memory to hold the read and write requests sent by upper-layer applications;
2) receive an I/O request R and determine its type; if it is a write request, go to step 3); otherwise, if it is a read request, go to step 4);
3) select the write data of I/O request R in units of stripes; for each selected stripe, use the erasure code to generate k redundant data pages from its s user data pages; compute the checksum and error-correcting code of each of the s+k pages formed by the s user data pages and the k redundant data pages; and write the s+k pages to the storage device (i.e. the flash storage system) together with the checksum and error-correcting code of each page;
4) split I/O request R into sub-requests that each belong to a different stripe; for each sub-request, read each page it contains together with that page's checksum and error-correcting code; compute the checksum of each page and identify the bit errors of each page; find the page with the most bit errors and determine whether its number of bit errors exceeds T, the maximum number of erroneous bits that the error-correcting code in use can correct; if the number does not exceed T, correct the bit errors in each page of the sub-request with the error-correcting code; if it exceeds T, correct them with the erasure code; then return the data contained in the pages of the sub-request.
I/O request R can be expressed as (Type_R, Offset_R, Size_R), where Type_R is the type (read or write) of I/O request R, Offset_R is its start address, and Size_R is the number of pages it contains; determining the type of I/O request R in step 2) of this embodiment therefore means examining the value of Type_R. This embodiment uses two fault-tolerance mechanisms cooperatively, an error-correcting code and an erasure code, to correct the bit errors that occur in flash. It first uses checksums to make a preliminary count of the bit errors in the data. When the bit error rate is low, errors are corrected with the error-correcting code, which avoids any adverse effect on performance; when the bit error rate is high, errors are corrected with the erasure code. This keeps data stored in the flash storage system reliable after the actual service life of the flash exceeds its rated lifetime, with low computational overhead, fast I/O, and a significant flash-lifetime extension effect.
As shown in Fig. 2, step 1) of this embodiment further includes initializing to 0 a write request counter Count_w, which records the total number of pages waiting to be written to the storage device, that is, the pages not yet written. The detailed steps of step 3) in this embodiment include:
3.1) add the number of pages contained in write request R to the write request counter Count_w;
3.2) determine whether Count_w exceeds a preset threshold h, where h is an integer greater than n, the number of pages in one complete stripe; if Count_w exceeds h, go to step 3.3); if Count_w does not exceed h, go to step 2);
3.3) sort the Count_w pending write data pages (Count_w here denotes the value of the write request counter; likewise below) in ascending order of page number and divide them into stripes such that the page numbered x is assigned to stripe ⌊x/n⌋, where n is the number of pages in one complete stripe; select one complete stripe from the Count_w pending write data pages; if a complete stripe is found, go to step 3.4); otherwise, select the incomplete stripe that contains the most user data pages and go to step 3.4);
3.4) use the erasure code to generate k redundant pages from the s user data pages of the selected complete or incomplete stripe, giving s+k pages; compute the checksum and error-correcting code of each of the s+k pages; write the s+k pages to the storage device together with the checksum and error-correcting code of each page; finally, subtract from Count_w the number s of user data pages just written, and go to step 3.2).
An erasure code can be abstracted as an (n, k) tuple, meaning that it generates k pieces of parity information from n pieces of user data. These n+k pieces of data form one stripe and are written together to the underlying storage device. In a storage system that uses an erasure code, each write operation therefore targets a stripe of n+k pages rather than a single page. Consequently, when writing data to the storage device, the system should, as far as possible, wait until all the data of a stripe has arrived before issuing the write to the storage system. In this embodiment the write request counter Count_w is used to accumulate pages: once the amount of pending data recorded by Count_w exceeds h pages, part of the data is written to the storage device; if Count_w does not exceed h, the method goes to step 2) and continues to receive new read and write requests until the pending data exceeds h pages. This makes it more likely that complete stripes can be found among the accumulated pages for the erasure code, which reduces write wear on the flash and extends the service life of the storage device.
In this embodiment, the threshold h in step 3.2) is 10 times n, the number of pages in one complete stripe.
In this embodiment, the detailed steps of step 3.3) include:
3.3.1) select one complete stripe from the Count_w pending write data pages; if a complete stripe is found, go to step 3.4); otherwise, go to step 3.3.2);
3.3.2) determine whether I/O request R is a sequential write request; if it is, go to step 2); otherwise, if it is not a sequential write request, go to step 3.3.3);
3.3.3) select the incomplete stripe that contains the most user data pages, and go to step 3.4).
Steps 3.3.1) to 3.3.3) avoid, as far as possible, the situation in which updating some of the user pages of a stripe forces all of the stripe's parity data to be updated, and thereby optimize for sequential write requests: when facing sequential write requests, if no complete stripe is currently found, the method goes to step 2) and continues to receive subsequent sequential write requests until the pending data contains a complete stripe or a non-sequential write request arrives. This further reduces write wear on the flash and extends the service life of the storage device.
In this embodiment, step 3.4) uses the erasure code to generate k redundant pages from the s user data pages of the selected complete or incomplete stripe, giving s+k pages. For a complete stripe, the number of user data pages s equals n, the number of pages in a complete stripe; for an incomplete stripe, s is less than n. In either case, given that the stripe has s user pages to be written to the storage device, k new pieces of parity information (k pages) are generated from these s pages, giving s+k pages.
A physical page of flash has two parts: a data area that holds data and an additional area that holds metadata. In this embodiment, when the s+k pages are written to the storage device together with the error-correcting code and checksum of each page in step 3.4), the storage device allocates one free physical page for each of the s+k pages to be written; the data of each page is written to the data area of its allocated physical page, and the error-correcting code and checksum of each page are written to the additional area of that physical page. Suppose the storage device allocates physical page Page_physical for the data page Page_data to be written; then the data of Page_data is written to the data area of Page_physical, and the error-correcting code and checksum of Page_data are written to the additional area of Page_physical. Storing the error-correcting code and checksum of each page in the additional area allows them to be kept without changing the storage structure of the storage device, and provides the basic information needed for subsequent correction with the error-correcting code and the erasure code.
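The split between data area and additional area can be sketched with a small record type. The layout of the additional area (an 8-byte checksum followed by a 16-byte error-correcting code, little-endian) and all names are assumptions made for illustration; the text does not give field sizes or ordering.

```python
import struct
from dataclasses import dataclass

# Assumed layout of the additional area: 64-bit checksum, then 16 bytes of ECC.
SPARE_FORMAT = "<Q16s"


@dataclass
class PhysicalPage:
    data: bytes   # data area: the page contents
    spare: bytes  # additional area: checksum and error-correcting code


def program_page(data: bytes, checksum: int, ecc: bytes) -> PhysicalPage:
    """Place page data in the data area and its metadata in the additional area."""
    return PhysicalPage(data=data, spare=struct.pack(SPARE_FORMAT, checksum, ecc))


def read_page(page: PhysicalPage):
    """Return (data, checksum, ecc) as needed by step 4.3) of the read path."""
    checksum, ecc = struct.unpack(SPARE_FORMAT, page.spare)
    return page.data, checksum, ecc
```

Keeping the metadata next to the data means a single physical page read returns everything the read path needs to check and correct that page.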
In this embodiment, computing the checksum of each of the s+k pages in step 3.4) means dividing the bit stream of the page into fixed-size words of 64 bits each and XORing all the words together; the result is the checksum of the page. The checksum of each page in step 4) is computed in exactly the same way. Checksums are a widely used verification technique, and any common checksum algorithm could be used instead as needed; the XOR-based checksum used in this embodiment has low computational cost.
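The XOR checksum described above can be sketched as follows. The byte order used to turn 8 bytes into a word (little-endian) and the name `xor_checksum` are assumptions, since the text does not specify them.

```python
WORD_BYTES = 8  # 64-bit words, as described in the text


def xor_checksum(page: bytes) -> int:
    """XOR all 64-bit words of the page into a single 64-bit checksum."""
    if len(page) % WORD_BYTES != 0:
        raise ValueError("page size must be a multiple of 8 bytes")
    checksum = 0
    for i in range(0, len(page), WORD_BYTES):
        checksum ^= int.from_bytes(page[i:i + WORD_BYTES], "little")
    return checksum
```

A single flipped bit in the page flips exactly one bit of the checksum, which is what lets the read path use the checksum difference as a rough error count. Two flips at the same bit position in different words cancel out, so the count is only a lower-bound estimate.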
As shown in Fig. 2, the detailed steps of step 4) in this embodiment are:
4.1) Divide I/O request R into m sub-requests R_0, R_1, R_2, …, R_{m-1}, each belonging to a different stripe. An erasure code protects data in units of stripes; for an erasure code abstracted as the two-tuple (n, k), one stripe contains n+k pages of data. When responding to a read request from an upper-layer application, the erasure code must first locate the stripes that hold the requested data, and the data may be spread over several stripes. In this embodiment, the m sub-requests R_0, R_1, R_2, …, R_{m-1} together make up all the data requested by the user, but each belongs to a different stripe; the stripes are read in turn and the requested data is extracted from them. The pages contained in the i-th sub-request R_i are given by formula (1):
$$
R_i =
\begin{cases}
[\,\mathit{Offset}_R,\ (\mathit{Offset}_R + n - 1)/n \times n\,) & i = 0; \\
[\,\mathit{Offset}_R/n \times n + i \times n,\ \mathit{Offset}_R/n \times n + i \times n + n\,) & 0 < i < m-1; \\
[\,(\mathit{Offset}_R + \mathit{Size}_R)/n \times n,\ \mathit{Offset}_R + \mathit{Size}_R\,) & i = m-1;
\end{cases}
\qquad (1)
$$
In formula (1), R_i is the i-th sub-request, n is the number of user data pages in a complete stripe, Offset_R is the start address of I/O request R, and Size_R is the number of pages that R contains; the division is integer division.
4.2) Initialize the sub-request counter i to 0;
4.3) Read each page contained in sub-request R_i, together with the error correcting code and checksum of each page. Because flash memory and flash-based storage devices are inherently parallel, the pages of R_i can be read concurrently, which gives the reliable storage system built in this embodiment good read performance;
4.4) Compute the checksum of each page contained in sub-request R_i, and compare it with the checksum read in step 4.3) to identify the bit errors in each page. As in step 3.4), the checksum is computed by dividing the bit stream of the page into fixed-size 64-bit words and XORing all the words together. The computed checksum is then XORed with the checksum read in step 4.3); the number of bits in which the two checksums differ is denoted BE_j and is taken as an approximation of the number of bit errors in the page;
4.5) Find the page in sub-request R_i with the most bit errors, and compare its number of bit errors with T, the maximum number of bit errors that the error correcting code in use can correct. If the number of bit errors in that page does not exceed T (the bit error rate of the pages in R_i is low, and the error correcting code alone can correct the errors in every page), go to step 4.6). Otherwise, if the number of bit errors in that page exceeds T (the bit error rate of the pages in R_i is high and exceeds the capability of the error correcting code, so the error correcting code cannot recover the user data and the erasure code must be used), go to step 4.7). The number of bit errors in the page with the most errors is given by formula (2):
$$
\max_{j}\; BE_j, \quad j \text{ ranging over the } P \text{ pages of } R_i \qquad (2)
$$
In formula (2), P is the total number of pages in sub-request R_i, and BE_j is the number of bits in which the two checksums of the j-th page of R_i differ.
4.6) Use the error correcting code to correct the bit errors in each page contained in sub-request R_i; if correction fails, go to step 4.7); if it succeeds, go to step 4.8). The bit errors detected by the checksum may be fewer than the bit errors actually present in a page. If the actual bit errors exceed the capability of the error correcting code, the error correcting code cannot recover the user data, and step 4.7) is executed to correct the page with the erasure code. If the error correcting code corrects the bit errors in all pages of R_i, step 4.8) is executed. This fallback improves the bit-error correction capability of the embodiment;
4.7) Use the erasure code to correct the bit errors in each page contained in sub-request R_i;
4.8) Return the user data contained in sub-request R_i to the upper-layer application;
4.9) Increment the sub-request counter i;
4.10) If the sub-request counter i is less than the number of sub-requests m, unprocessed sub-requests remain; go to step 4.3). If i equals m, all sub-requests have been processed; go to step 2).
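The choice between the two codes in steps 4.4) to 4.7) can be sketched as below. The callbacks `ecc_correct` and `erasure_correct` are hypothetical stand-ins for the real decoders, which the text does not specify, and the `"failed"` result is added only so the sketch has a defined outcome when both decoders fail.

```python
from typing import Callable, Sequence


def estimated_bit_errors(stored: int, computed: int) -> int:
    """BE_j: the number of bits in which the two checksums differ."""
    return bin(stored ^ computed).count("1")


def choose_and_correct(
    stored: Sequence[int],              # checksums read from the spare areas
    computed: Sequence[int],            # checksums recomputed from the page data
    t: int,                             # T: most bit errors the ECC can correct
    ecc_correct: Callable[[], bool],    # hypothetical ECC decoder for the sub-request
    erasure_correct: Callable[[], bool],  # hypothetical erasure-code decoder
) -> str:
    """Return which path recovered the sub-request: 'ecc', 'erasure' or 'failed'."""
    worst = max(estimated_bit_errors(s, c) for s, c in zip(stored, computed))
    # Steps 4.5/4.6: try the cheap ECC only when the estimate is within T.
    if worst <= t and ecc_correct():
        return "ecc"
    # Step 4.7: the estimate exceeded T, or the ECC failed because the
    # checksum underestimated the real number of bit errors.
    return "erasure" if erasure_correct() else "failed"
```

Because `and` short-circuits, the ECC decoder is never invoked when the estimate already exceeds T, which mirrors the intent of avoiding a decode attempt that is expected to fail.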
In this embodiment, when step 4.1) divides I/O request R into m sub-requests belonging to different stripes, the value of m is (Offset_R + Size_R + n - 1)/n - Offset_R/n, where n is the number of user data pages in a complete stripe, Offset_R is the start address of I/O request R, and Size_R is the number of pages that R contains. Read request R consists of Size_R consecutive pages starting at address Offset_R, and these user data pages may be spread over different stripes. Because each stripe contains n pages of user data, the user data page numbered l belongs to stripe l/n. Computed in this way, all user data pages of R fall into m different stripes, and the sub-requests falling into these m stripes are denoted R_0, R_1, R_2, …, R_{m-1}.
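The split can be sketched as follows. The function computes m with the expression given above, then clamps each stripe to the pages the request actually covers; this clamping is an assumption about the intent of formula (1) rather than a literal transcription of it, and the name `split_request` is hypothetical.

```python
from typing import List


def split_request(offset: int, size: int, n: int) -> List[range]:
    """Split pages [offset, offset + size) into one sub-request per stripe."""
    if size <= 0 or n <= 0:
        raise ValueError("size and n must be positive")
    end = offset + size
    # m as given in the text, using integer division.
    m = (end + n - 1) // n - offset // n
    first_stripe = offset // n
    subrequests = []
    for i in range(m):
        stripe_start = (first_stripe + i) * n
        # Keep only the part of this stripe that the request covers.
        subrequests.append(range(max(offset, stripe_start), min(end, stripe_start + n)))
    return subrequests
```

For example, with n = 4, a request for 10 pages starting at page 5 touches stripes 1, 2 and 3, so m = 3 and the sub-requests cover pages 5 to 7, 8 to 11, and 12 to 14.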
In this embodiment, the detailed steps of step 4.7) are:
4.7.1) Initialize the data-recovery counter to 0. The v pages of sub-request R_i, which belong to the same stripe, have already been read; read a further n-v pieces of data from the storage device to obtain the n pieces of data that the erasure code needs at minimum to recover the stripe;
4.7.2) Using the n pieces of data read, apply the erasure code to correct the bit errors in each page contained in sub-request R_i. If the bit errors are corrected successfully, go to step 4.8); otherwise, add 1 to the data-recovery counter and go to step 4.7.3);
4.7.3) Check whether the data-recovery counter has reached its upper limit. If the counter is below the limit, read a different set of n pieces of data, the minimum the erasure code needs to recover the stripe, and go to step 4.7.2); otherwise, if the counter equals the limit, the attempt to correct the bit errors in the pages of sub-request R_i with the erasure code is judged to have failed.
For a storage system that uses erasure codes, when responding to a read request from an upper-layer application, all user data can be recovered correctly as long as n of the n+k pieces of data are still intact and reliable. With steps 4.7.1) to 4.7.3), this embodiment ensures that bit errors can be corrected whenever n intact, correct, and reliable pieces exist among the n+k pieces of data.
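The retry loop of steps 4.7.1) to 4.7.3) can be sketched as below. The upper limit on attempts is not reproduced in the text, so it is left as the caller-supplied parameter `max_attempts`. Enumerating subsets with `itertools.combinations`, trying first the subsets that reuse the most pages already read, and the `decode` callback (which returns the recovered data, or `None` when decoding or verification fails) are all assumptions for illustration.

```python
from itertools import combinations
from typing import Callable, Optional, Sequence, Tuple


def recover_with_retries(
    total: int,                    # n + k pieces in the stripe
    n: int,                        # pieces needed to decode
    already_read: Sequence[int],   # indices of the v pages of R_i already read
    decode: Callable[[Tuple[int, ...]], Optional[bytes]],  # hypothetical decoder
    max_attempts: int,             # assumed cap on recovery attempts
) -> Optional[bytes]:
    """Try different sets of n pieces until decoding succeeds or the cap is hit."""
    already = set(already_read)
    # Prefer subsets that reuse pages already read (fewer extra reads);
    # sorted() is stable, so ties keep their lexicographic order.
    subsets = sorted(
        combinations(range(total), n),
        key=lambda subset: -len(already.intersection(subset)),
    )
    for subset in subsets[:max_attempts]:
        recovered = decode(subset)
        if recovered is not None:
            return recovered  # corresponds to jumping to step 4.8)
    return None  # corresponds to the failure outcome of step 4.7.3)
```

Materializing every subset is only reasonable for small stripes; a real implementation would likely generate candidate subsets lazily.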
In summary, this embodiment combines erasure codes and error correcting codes to recover bit errors in flash pages, thereby substantially extending flash lifetime and improving the reliability of the flash storage system. When an upper-layer application issues a write request, the erasure-code redundancy is computed for the pages to be written, the error correcting code and checksum of each page are computed, and the redundancy, error correcting codes, and checksums are stored together on the storage device. When an upper-layer application reads data, the checksum is first used to estimate the number of bit errors in each page. If there are few bit errors, the data is corrected with the error correcting code; if there are many, the data is recovered with the erasure code. This approach keeps strong error-correction capability while using the cheaper method wherever possible to correct bit errors in user data, so it has low computational overhead and good I/O performance. Because erasure codes have strong error-correction capability, they can still recover the bit errors in user data even when each flash cell has undergone many program/erase cycles and the error rate is very high. The embodiment can therefore significantly increase the number of program/erase cycles that each flash cell can endure, which extends flash lifetime.
The above is only a preferred embodiment of the present invention. The scope of protection of the invention is not limited to the above embodiment; all technical solutions that fall within the concept of the invention belong to its scope of protection. It should be noted that, for those of ordinary skill in the art, improvements and refinements made without departing from the principle of the invention shall also be regarded as within the scope of protection of the invention.

Claims (9)

  1. A method for constructing a reliable flash memory storage system that cooperatively uses erasure codes and error correcting codes, characterized in that the steps comprise:
    1) initializing a buffer for receiving I/O requests;
    2) receiving an I/O request R and determining its read/write type; if it is a write request, going to step 3); if it is a read request, going to step 4);
    3) selecting the write data of I/O request R in units of stripes; for each selected stripe, generating k redundant data pages from its s user data pages using the erasure code; computing the checksum and error correcting code of each of the s+k pages formed by the s user data pages and the k redundant data pages; and writing the s+k pages together with the checksum and error correcting code of each page to the storage device;
    4) dividing I/O request R into sub-requests each belonging to a different stripe; for each sub-request, reading each page it contains together with the checksum and error correcting code of that page; computing the checksum of each page contained in the sub-request and identifying the bit errors in each page; finding the page with the most bit errors and determining whether its number of bit errors exceeds T, the maximum number of bit errors that the error correcting code in use can correct; if the number of bit errors in the page with the most bit errors does not exceed T, using the error correcting code to correct the bit errors in each page of the sub-request; if it exceeds T, using the erasure code to correct the bit errors in each page of the sub-request; and returning the data contained in the pages of the sub-request.
  2. The method for constructing a reliable flash memory storage system that cooperatively uses erasure codes and error correcting codes according to claim 1, characterized in that step 1) further comprises initializing to 0 a write request counter Count_w that records the total number of pages to be written to the storage device, and the detailed steps of step 3) comprise:
    3.1) adding the number of pages contained in write request R to the write request counter Count_w;
    3.2) determining whether the write request counter Count_w exceeds a preset threshold h, where h is an integer greater than the number of pages n contained in one complete stripe; if Count_w exceeds h, going to step 3.3); if Count_w does not exceed h, going to step 2);
    3.3) sorting the Count_w data pages to be written in ascending order of page number, and dividing them into different stripes so that the page numbered x is assigned to stripe x/n, where n is the number of pages contained in one complete stripe; selecting one complete stripe from the Count_w data pages to be written; if a complete stripe is selected successfully, going to step 3.4); otherwise, selecting the incomplete stripe that contains the most user data pages and going to step 3.4);
    3.4) generating k redundancy pages from the s user data pages in the selected complete or incomplete stripe using the erasure code to obtain s+k pages; computing the checksum and error correcting code of each of the s+k pages; writing the s+k pages together with the error correcting code and checksum of each page to the storage device; and finally subtracting from the write request counter Count_w the number s of user data pages written to the storage device this time, and going to step 3.2).
  3. The method for constructing a reliable flash memory storage system that cooperatively uses erasure codes and error correcting codes according to claim 2, characterized in that the threshold h in step 3.2) is 10 times the number of pages n contained in one complete stripe.
  4. The method for constructing a reliable flash memory storage system that cooperatively uses erasure codes and error correcting codes according to claim 3, characterized in that the detailed steps of step 3.3) comprise:
    3.3.1) selecting one complete stripe from the Count_w data pages to be written; if a complete stripe is selected successfully, going to step 3.4); otherwise, going to step 3.3.2);
    3.3.2) determining whether I/O request R is a continuous write request; if I/O request R is a continuous write request, going to step 2); otherwise, if I/O request R is not a continuous write request, going to step 3.3.3);
    3.3.3) selecting the incomplete stripe that contains the most user data pages, and going to step 3.4).
  5. The method for constructing a reliable flash memory storage system that cooperatively uses erasure codes and error correcting codes according to claim 4, characterized in that, when the s+k pages are written to the storage device together with the error correcting code and checksum of each page in step 3.4), the storage device allocates a free physical page for each of the s+k pages to be written, the data of each of the s+k pages is written to the data area of its allocated physical page, and the error correcting code and checksum of each of the s+k pages are written to the spare area of its allocated physical page.
  6. The method for constructing a reliable flash memory storage system that cooperatively uses erasure codes and error correcting codes according to claim 5, characterized in that computing the checksum of each of the s+k pages in step 3.4) specifically means dividing the bit stream of the page into fixed-size words, each word containing 64 bits, and then XORing these words together, the result being the checksum of the page.
  7. The method for constructing a reliable flash memory storage system that cooperatively uses erasure codes and error correcting codes according to any one of claims 1 to 6, characterized in that the detailed steps of step 4) comprise:
    4.1) dividing I/O request R into m sub-requests each belonging to a different stripe;
    4.2) initializing the sub-request counter i to 0;
    4.3) reading each page contained in sub-request R_i, together with the error correcting code and checksum corresponding to each page;
    4.4) computing the checksum of each page contained in sub-request R_i, and comparing it with the checksum read in step 4.3) to identify the bit errors in each page contained in R_i;
    4.5) finding the page in sub-request R_i with the most bit errors, and comparing its number of bit errors with T, the maximum number of bit errors that the error correcting code in use can correct; if the number of bit errors in that page does not exceed T, going to step 4.6); otherwise, if it exceeds T, going to step 4.7);
    4.6) using the error correcting code to correct the bit errors in each page contained in sub-request R_i; if correction fails, going to step 4.7); otherwise, if correction succeeds, going to step 4.8);
    4.7) using the erasure code to correct the bit errors in each page contained in sub-request R_i;
    4.8) returning the user data contained in sub-request R_i to the upper-layer application;
    4.9) incrementing the sub-request counter i;
    4.10) if the sub-request counter i is less than the number of sub-requests m, determining that unprocessed sub-requests remain and going to step 4.3); if i equals m, determining that all sub-requests have been processed and going to step 2).
  8. The method for constructing a reliable flash memory storage system that cooperatively uses erasure codes and error correcting codes according to claim 7, characterized in that, when I/O request R is divided in step 4.1) into m sub-requests each belonging to a different stripe, the value of m is (Offset_R + Size_R + n - 1)/n - Offset_R/n, where n is the number of user data pages contained in one complete stripe, Offset_R is the start address of I/O request R, Size_R is the number of pages that I/O request R contains, and I/O request R consists of Size_R consecutive pages starting at address Offset_R; the pages contained in the i-th sub-request R_i of the m sub-requests are given by formula (1):
    $$
    R_i =
    \begin{cases}
    [\,\mathit{Offset}_R,\ (\mathit{Offset}_R + n - 1)/n \times n\,) & i = 0; \\
    [\,\mathit{Offset}_R/n \times n + i \times n,\ \mathit{Offset}_R/n \times n + i \times n + n\,) & 0 < i < m-1; \\
    [\,(\mathit{Offset}_R + \mathit{Size}_R)/n \times n,\ \mathit{Offset}_R + \mathit{Size}_R\,) & i = m-1;
    \end{cases}
    \qquad (1)
    $$
    In formula (1), R_i is the i-th sub-request, n is the number of pages in one complete stripe, Offset_R is the start address of I/O request R, and Size_R is the number of pages that I/O request R contains.
  9. The method for constructing a reliable flash memory storage system that cooperatively uses erasure codes and error correcting codes according to claim 8, characterized in that the detailed steps of step 4.7) comprise:
    4.7.1) initializing the data-recovery counter to 0; the v pages of sub-request R_i that belong to the same stripe having already been read, reading a further n-v pieces of data from the storage device to obtain the n pieces of data that the erasure code needs at minimum to recover the stripe;
    4.7.2) using the n pieces of data read, applying the erasure code to correct the bit errors in each page contained in sub-request R_i; if the bit errors are corrected successfully, going to step 4.8); otherwise, adding 1 to the data-recovery counter and going to step 4.7.3);
    4.7.3) checking whether the data-recovery counter has reached its upper limit; if the counter is below the limit, reading a different set of n pieces of data, the minimum the erasure code needs to recover the stripe, and going to step 4.7.2); otherwise, if the counter equals the limit, determining that correcting the bit errors in the pages of sub-request R_i with the erasure code has failed.
CN201510236451.XA 2015-05-11 2015-05-11 Method for constructing a reliable flash memory storage system by cooperatively using erasure codes and error correcting codes Active CN104881370B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510236451.XA CN104881370B (en) 2015-05-11 2015-05-11 Method for constructing a reliable flash memory storage system by cooperatively using erasure codes and error correcting codes

Publications (2)

Publication Number Publication Date
CN104881370A CN104881370A (en) 2015-09-02
CN104881370B true CN104881370B (en) 2018-01-12

Family

ID=53948870

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510236451.XA Active CN104881370B (en) 2015-05-11 2015-05-11 Method for constructing a reliable flash memory storage system by cooperatively using erasure codes and error correcting codes

Country Status (1)

Country Link
CN (1) CN104881370B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107544862B (en) * 2016-06-29 2022-03-25 中兴通讯股份有限公司 Stored data reconstruction method and device based on erasure codes and storage node
CN106445726A (en) * 2016-09-28 2017-02-22 上海爱数信息技术股份有限公司 Data repairing method for distributed erasure code storage system
CN109426622B (en) * 2017-08-31 2020-11-24 香港理工大学深圳研究院 Method for prolonging service life of flash memory solid-state disk and long-service-life flash memory solid-state disk
CN111435286B (en) * 2019-01-14 2023-12-05 深圳市茁壮网络股份有限公司 Data storage method, device and system
CN111007987A (en) * 2019-11-08 2020-04-14 苏州浪潮智能科技有限公司 Memory management method, system, terminal and storage medium for raid io
CN110896415B (en) * 2019-11-22 2022-05-24 浪潮电子信息产业股份有限公司 Data readdir method, system, equipment and computer medium
CN111708742B (en) * 2020-05-24 2022-11-29 苏州浪潮智能科技有限公司 Input/output pre-reading method and device for distributed file system
CN113890687B (en) * 2021-11-15 2024-08-13 杭州叙简未兰电子有限公司 High-reliability audio transmission method and device based on error correction code and erasure code mixing
CN114093409A (en) * 2022-01-21 2022-02-25 苏州浪潮智能科技有限公司 Flash memory verification method and device, computer equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101627444A (en) * 2007-10-03 2010-01-13 株式会社东芝 Semiconductor memory device
CN102034548A (en) * 2009-09-25 2011-04-27 三星电子株式会社 Nonvolatile memory device and system, and method of programming a nonvolatile memory device
CN102934093A (en) * 2010-06-29 2013-02-13 英特尔公司 Method and system to improve the performance and/or reliability of a solid-state drive

Also Published As

Publication number Publication date
CN104881370A (en) 2015-09-02

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
EXSB Decision made by sipo to initiate substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant