WO2015027700A1 - Method and device for repairing erroneous data - Google Patents

Method and device for repairing erroneous data

Info

Publication number
WO2015027700A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
page
storage
storage location
storage block
Prior art date
Application number
PCT/CN2014/073234
Other languages
English (en)
French (fr)
Inventor
鲍慧强
王大勇
王荣生
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to EP14772049.4A (EP2857971B1)
Priority to US14/501,368 (US9280301B2)
Publication of WO2015027700A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1008 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices
    • G06F11/1048 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices using arrangements adapted for a specific error detection or correction feature
    • G06F11/1076 Parity data used in redundant arrays of independent storages, e.g. in RAID systems
    • G06F11/108 Parity data distribution in semiconductor storages, e.g. in SSD
    • G06F11/1088 Reconstruction on already foreseen single or plurality of spare disks

Definitions

  • SSD: Solid State Disk
  • The storage unit of an SSD is composed of flash chips. Owing to process and cost factors, flash chips have a certain failure rate, and when a flash chip fails the stored data is damaged. Methods for repairing erroneous data have therefore attracted wide attention.
  • The method for repairing erroneous data is specifically as follows. When data is read from a storage block of an SSD flash chip, the data in a given page of the storage block is checked. If the number of erroneous data items does not exceed a preset first threshold, ECC (Error Correcting Code) repair is performed on the data in the page and the correct data is returned. If the number of erroneous data items exceeds the preset first threshold, the storage block is marked as a bad block and is no longer used; pages with the same page identifier are then selected from a preset number of storage blocks, and the data in the selected pages is used to determine whether the page can be repaired by RAID (Redundant Arrays of Inexpensive Disks). If it can, the page is repaired by RAID and the correct data is returned. Otherwise, the read fails.
  • RAID: Redundant Arrays of Inexpensive Disks
  • A method of repairing erroneous data, comprising:
  • if the first number is greater than the preset first threshold, acquiring data from the spare space according to the storage location of the erroneous data in the page and the fixed entry corresponding to the storage block, and replacing the erroneous data in the page with the acquired data, where the fixed entry includes the storage location of each data item stored in the spare space.
  • The method further includes: performing a second error check on the data in the page to obtain the erroneous data in the page;
  • After the first error check is performed on the data in a page of the storage block to obtain the erroneous data in the page, the method further includes:
  • selecting the storage locations with the largest number of errors, as many as the first preset value, and storing the selected storage locations and the page identifier of the page in the temporary entry corresponding to the storage block.
  • After the first error check is performed on the data in a page of the storage block to obtain the erroneous data in the page, the method further includes:
  • storing the selected storage locations and their corresponding page identifiers in the temporary entry corresponding to the storage block.
  • The method further includes:
  • an apparatus for repairing erroneous data comprising:
  • a first obtaining module, configured to perform, when data in a storage block of the solid state disk is read, a first error check on the data in a page of the storage block to obtain the erroneous data in the page;
  • a first repairing module, configured to perform error checking and correction (ECC) repair on the data in the page if the first number of erroneous data items in the page is less than or equal to a preset first threshold;
  • The device further includes: a second acquiring module, configured to perform a second error check on the data in the page to obtain the erroneous data in the page;
  • a second repairing module, configured to perform ECC repair on the data in the page if the second number of erroneous data items in the page is less than or equal to the preset first threshold;
  • a marking module, configured to mark the storage block as a bad block if the second number is greater than the preset first threshold, and to acquire data from a preset number of storage blocks according to the page identifier of the page;
  • a third repairing module, configured to determine, according to the acquired data, whether RAID repair can be performed on the data in the page, and if so, to perform RAID repair on the data in the page.
  • The device further includes: a third acquiring module, configured to acquire the storage locations of the erroneous data in each page of the storage block;
  • a first statistic module, configured to find the erroneous data with the same storage location and count the number of errors for each such storage location;
  • a first storage module, configured to select the storage locations with the largest number of errors, as many as the first preset value, and store the selected storage locations in a temporary entry corresponding to the storage block.
  • The device further includes: a fourth acquiring module, configured to acquire the data that has been erroneous at each storage location in the page;
  • a second statistic module, configured to count, from that data, the number of errors at each storage location in the page;
  • a second storage module, configured to select the storage locations with the largest number of errors, as many as the first preset value, and store the selected storage locations and the page identifier of the page in a temporary entry corresponding to the storage block.
  • The device further includes: a fifth acquiring module, configured to acquire the storage locations of the erroneous data in each page of the storage block;
  • a third statistic module, configured to find the erroneous data with the same storage location and count the number of errors for each such storage location;
  • the device further includes:
  • a moving module configured to acquire a free storage block in the SSD, and move data in the storage block to the free storage block according to a fixed entry corresponding to the free storage block.
  • the device further includes:
  • an apparatus for repairing erroneous data comprising a memory and a processor for performing the method of repairing erroneous data.
  • A first error check is performed on the data of a page in the storage block, and the erroneous data is replaced with the data stored in the spare space of the page.
  • Data with a high probability of error is stored in the spare space within the OOB space of each page. This not only makes full use of the OOB space; replacing the erroneous data with the data stored in the spare space of the page also greatly reduces the probability of uncorrectable faults, so that the storage block is not readily marked as a bad block.
  • FIG. 1 is a flowchart of a method for repairing erroneous data according to an embodiment of the present invention.
  • FIG. 2 is a flowchart of another method for repairing erroneous data according to an embodiment of the present invention.
  • FIG. 3 is a flowchart of another method for repairing erroneous data according to an embodiment of the present invention.
  • FIG. 4 is a flowchart of another method for repairing erroneous data according to an embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of an apparatus for repairing erroneous data according to an embodiment of the present invention.
  • FIG. 6 is a schematic structural diagram of another apparatus for repairing erroneous data according to an embodiment of the present invention.
  • An embodiment of the present invention provides a method for repairing erroneous data. Referring to FIG. 1, the method includes:
  • Step 101: When reading data in a storage block of a solid state disk, perform a first error check on the data in a page of the storage block to obtain the erroneous data in the page;
  • Step 102: If the first number of erroneous data items in the page is less than or equal to the preset first threshold, perform error checking and correction (ECC) repair on the data in the page;
  • Step 103: If the first number is greater than the preset first threshold, acquire data from the spare space according to the storage location of the erroneous data in the page and the fixed entry corresponding to the storage block, and replace the erroneous data in the page with the acquired data, where the fixed entry includes the storage location of each data item stored in the spare space.
  • A first error check is performed on the data of a page in the storage block, and the erroneous data is replaced with the data stored in the spare space of the page.
  • Storing the data with a high probability of error in the spare space within the OOB space of each page not only makes full use of the OOB space; replacing the erroneous data with the data stored in the spare space of the page also greatly reduces the probability of uncorrectable faults, so that the storage block is not readily marked as a bad block.
  • Embodiments of the present invention provide a method for repairing erroneous data.
  • The inside of the Flash chip is divided into a plurality of storage blocks, each composed of a plurality of pages. Each page includes a main data space and an OOB (Out of Band Data) space, and the OOB space is used for storing ECC data.
  • Because the ECC data stored in the OOB space is smaller than the OOB space, each page of a storage block has, in addition to the main data space and the space actually used for ECC data, a certain amount of spare space, in which the data of the storage locations with the highest failure rate can be stored.
  • The fixed entry corresponding to the storage block stores the storage location of each data item held in the spare space of the storage block. When data is backed up to the spare space of each page in the storage block, data is obtained from each page according to the storage locations stored in the fixed entry and stored in that page's spare space.
  • the method includes:
  • Step 201: When data is written into the storage block for the first time, randomly select as many storage locations as the first preset value from a page of the storage block, and store the selected storage locations in order in the fixed entry corresponding to the storage block. Because every page in the storage block has the same amount of storage space, the storage locations for the whole storage block can be selected from any one page; that is, the storage locations of the data held in the spare space are the same for every page in the storage block.
  • For example, suppose the storage block includes 3 pages and each page stores 100 bits of data, so that the storage locations in each page are 1 to 100. If the first preset value is 4 and the selected storage locations are the 2nd, 45th, 70th, and 90th bits, the selected storage locations 2, 45, 70, and 90 are stored in order in the fixed entry shown in Table 1.
  • Table 1
  • Storage location: 2, 45, 70, 90
  • Step 202: After the data of each page is written, obtain the corresponding data from each page according to the storage locations in the fixed entry corresponding to the storage block, and write the obtained data to the spare space of that page;
  • The corresponding data is obtained from the data of each page according to the storage locations stored in the fixed entry corresponding to the storage block, and the obtained data is written into the spare space of each page in the order in which the storage locations are stored in the fixed entry.
  • The spare space of every page has the same size, equal to the first preset value, and the storage locations stored in the fixed entry correspond one to one with the data stored in the spare space.
  • For example, in the data of the first page, the data at the 2nd bit is 0, the data at the 45th bit is 1, the data at the 70th bit is 0, and the data at the 90th bit is 0, so the four obtained bits 0100 are stored in the spare space of the first page.
  • In the data of the second page, the data at the 2nd bit is 0 and the data at the 45th bit is 0; together with the data at the 70th and 90th bits, the four obtained bits 0001 are stored in the spare space of the second page.
  • In the data of the third page, the data at the 2nd bit is 1, the data at the 45th bit is 1, the data at the 70th bit is 0, and the data at the 90th bit is 0, so the four obtained bits 1100 are stored in the spare space of the third page.
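The backup in Steps 201 and 202 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: pages are modelled as strings of '0'/'1' characters, `backup_to_spare` and `make_page` are hypothetical helper names, and the page contents are invented so that the bits at locations 2, 45, 70, and 90 match the example values above.

```python
def backup_to_spare(page_bits, fixed_entry):
    """Copy the bits at the 1-based storage locations listed in the
    fixed entry, in the order in which the fixed entry stores them."""
    return "".join(page_bits[pos - 1] for pos in fixed_entry)


def make_page(size, ones):
    """Hypothetical helper: a page of `size` bits that is all 0 except
    at the 1-based locations in `ones`."""
    return "".join("1" if i in ones else "0" for i in range(1, size + 1))


fixed_entry = [2, 45, 70, 90]      # Table 1
pages = [
    make_page(100, {45}),          # first page
    make_page(100, {90}),          # second page
    make_page(100, {2, 45}),       # third page
]

# One spare-space value per page, each as long as the first preset value (4).
spare = [backup_to_spare(page, fixed_entry) for page in pages]
print(spare)  # ['0100', '0001', '1100']
```

Because the fixed entry is shared by the whole storage block, the same four locations are sampled from every page.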
  • After the data is written into the storage block, it can be read from the storage block the next time the user needs to use it.
  • Step 203: When reading data in the storage block, perform a first error check on the data in each page of the storage block to obtain the erroneous data in each page;
  • When the data in the storage block is read, the first error check is performed on the data stored in the main data space of each page of the storage block to obtain the erroneous data in each page.
  • Step 204: Obtain the storage locations of the erroneous data in each page of the storage block, and count the number of errors for erroneous data with the same storage location;
  • The storage locations of the erroneous data in each page of the storage block are acquired, the erroneous data with the same storage location is found, and the number of errors for each such storage location is counted.
  • For example, the erroneous data in the first page is at storage locations 2, 30, 65, and 70; the erroneous data in the second page is at storage locations 2, 34, 45, and 78; the erroneous data in the third page is at storage locations 2, 65, 70, and 90; and the erroneous data in the fourth page is at storage locations 10, 45, 65, and 70.
  • Counting over the storage block, the number of errors at storage location 2 is 3, at storage location 10 is 1, at storage location 30 is 1, at storage location 34 is 1, at storage location 45 is 2, at storage location 65 is 3, at storage location 70 is 3, at storage location 78 is 1, and at storage location 90 is 1.
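The per-location count in Step 204 amounts to a frequency table. The following is a minimal sketch, assuming the error check already yields a list of erroneous storage locations for each page; the function name `count_errors_by_location` is hypothetical.

```python
from collections import Counter

# Erroneous storage locations reported for each page in the example above.
error_locations_per_page = [
    [2, 30, 65, 70],   # first page
    [2, 34, 45, 78],   # second page
    [2, 65, 70, 90],   # third page
    [10, 45, 65, 70],  # fourth page
]


def count_errors_by_location(pages):
    """Count, over all pages of a storage block, how often each
    storage location held erroneous data."""
    counts = Counter()
    for locations in pages:
        counts.update(locations)
    return counts


error_counts = count_errors_by_location(error_locations_per_page)
print(sorted(error_counts.items()))
# [(2, 3), (10, 1), (30, 1), (34, 1), (45, 2), (65, 3), (70, 3), (78, 1), (90, 1)]
```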
  • Step 205: Select the storage locations with the largest number of errors, as many as the first preset value, and store the selected storage locations in the temporary entry corresponding to the storage block;
  • For example, the four storage locations with the largest number of errors are 2, 45, 65, and 70, and the selected storage locations 2, 45, 65, and 70 are stored in order in the temporary entry shown in Table 2.
  • Table 2
  • Storage location: 2, 45, 65, 70
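Selecting the entries for the temporary entry in Step 205 can be sketched as a sort on the error counts. This is an illustrative sketch only: `select_worst_locations` is a hypothetical name, and breaking ties in favour of the lower storage location is an assumption, since the text does not say how ties are resolved.

```python
def select_worst_locations(error_counts, first_preset_value):
    """Return the storage locations with the most errors, as many as the
    first preset value, in ascending order of location.
    Ties are broken by lower location (an assumption)."""
    ranked = sorted(error_counts, key=lambda loc: (-error_counts[loc], loc))
    return sorted(ranked[:first_preset_value])


# Error counts from the Step 204 example.
error_counts = {2: 3, 10: 1, 30: 1, 34: 1, 45: 2, 65: 3, 70: 3, 78: 1, 90: 1}
temporary_entry = select_worst_locations(error_counts, 4)
print(temporary_entry)  # [2, 45, 65, 70]
```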
  • Step 206: For a page in the storage block, count the first number of erroneous data items in the page. If the first number is less than or equal to the preset first threshold, perform ECC repair on the data in the page and end the operation;
  • The first number of erroneous data items in the page is counted and compared with the preset first threshold. If the first number is less than or equal to the preset first threshold, it is determined that the data of the page is ECC-correctable, ECC repair is performed on the data in the page according to the ECC data stored in the OOB space of the page, and the operation ends.
  • An ECC check is performed on the data stored in the main data space of each page of the storage block to obtain the ECC data corresponding to each page, and the ECC data of each page is stored in that page's OOB space.
  • Step 207: If the first number is greater than the preset first threshold, obtain data from the spare space of the page according to the storage location of the erroneous data of the page and the fixed entry corresponding to the storage block, and replace the erroneous data in the page with the obtained data;
  • Specifically, if the first number is greater than the preset first threshold, it is determined that the data of the page is ECC-uncorrectable. The position of each erroneous storage location within the fixed entry is looked up, the data at that position is obtained from the spare space, and the erroneous data in the page is replaced with the obtained data.
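The replacement in Step 207 can be illustrated as follows. This is a sketch under assumptions rather than the patented implementation: pages and spare space are modelled as bit strings, the erroneous storage locations are taken as given, the spare space itself is assumed to be intact, and `replace_from_spare` is a hypothetical name.

```python
def replace_from_spare(page_bits, error_locations, fixed_entry, spare_bits):
    """Replace erroneous bits that are covered by the fixed entry with
    their backup from the spare space.

    Returns the patched page and the erroneous locations that are not
    covered by the fixed entry (these are left for the second check)."""
    bits = list(page_bits)
    remaining = []
    for loc in error_locations:
        if loc in fixed_entry:
            order = fixed_entry.index(loc)   # position order in the fixed entry
            bits[loc - 1] = spare_bits[order]
        else:
            remaining.append(loc)
    return "".join(bits), remaining


fixed_entry = [2, 45, 70, 90]
original = ["0"] * 100
original[45 - 1] = "1"
spare_bits = "".join(original[p - 1] for p in fixed_entry)   # "0100"

# Flip three bits to simulate errors at locations 2, 30, and 45.
corrupted = list(original)
for loc in (2, 30, 45):
    corrupted[loc - 1] = "1" if corrupted[loc - 1] == "0" else "0"

patched, remaining = replace_from_spare("".join(corrupted), [2, 30, 45],
                                        fixed_entry, spare_bits)
print(remaining)  # [30]
```

In this example two of the three errors are removed, which is what gives the second error check in Step 208 a chance to fall back under the ECC threshold.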
  • When the SSD first comes into use, the probability of ECC-uncorrectable faults in the data of each page of a storage block is relatively small; the probability increases as time passes. Once the spare space function is enabled, every write to a page of a storage block requires looking up the fixed entry corresponding to the storage block in order to back up data to the spare space, which introduces a certain delay. The spare space function can therefore be turned off when the SSD starts to be used, and the system is notified to enable it only when the first number of erroneous data items detected in the first ECC check of a page is greater than the first preset threshold.
  • The first preset threshold is smaller than the preset first threshold (the ECC repair limit used in Step 206), so the spare space function is enabled when the first number of erroneous data items in a page is greater than the first preset threshold and less than or equal to the preset first threshold.
  • The next time the data in the storage block is read, if the first number of erroneous data items in the storage block is greater than the second preset threshold and less than or equal to the preset first threshold, the data of the storage block is ECC-repaired and then moved to an idle storage block, where it is backed up to the spare space according to the fixed entry corresponding to the idle storage block.
  • When the spare space function is enabled and data is written to a free storage block in the flash, the data with a high failure rate can be backed up directly in the spare space of that storage block according to the fixed entry corresponding to the free storage block.
  • The first preset threshold is smaller than the second preset threshold.
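The relationship between the three thresholds can be made concrete with a small sketch. The parameter names `enable_threshold` (the first preset threshold), `move_threshold` (the second preset threshold), and `ecc_limit` (the preset first threshold) are hypothetical, as are the numeric values other than the ECC limit of 5 used in a later example. The sketch also evaluates each read independently, whereas the text applies the move check on the read that follows enabling.

```python
def wear_actions(first_number, enable_threshold, move_threshold, ecc_limit):
    """Return the actions suggested by one read of a page, given the
    number of erroneous data items found by the first error check."""
    if not enable_threshold < move_threshold < ecc_limit:
        raise ValueError("expected enable_threshold < move_threshold < ecc_limit")
    if first_number > ecc_limit:
        return ["ecc_uncorrectable"]              # handled by Step 207 onward
    actions = ["ecc_repair"]
    if first_number > enable_threshold:
        actions.append("enable_spare_space")      # start backing up on writes
    if first_number > move_threshold:
        actions.append("move_to_free_block")      # relocate and back up
    return actions


for n in (1, 3, 5, 6):
    print(n, wear_actions(n, enable_threshold=2, move_threshold=4, ecc_limit=5))
```

Keeping the enabling threshold below the ECC limit means the backup starts while pages are still correctable, before the spare data is actually needed.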
  • Step 208: Perform a second error check on the data of the page to obtain the erroneous data of the page;
  • A second error check is performed on the data stored in the main data space of the page to obtain the erroneous data of the page.
  • Step 209: Count the second number of erroneous data items in the page. If the second number is less than or equal to the preset first threshold, perform ECC repair on the data in the page.
  • The second number of erroneous data items in the page is counted and compared with the preset first threshold. If the second number is less than or equal to the preset first threshold, it is determined that the data of the page is ECC-correctable, and ECC repair is performed on the data in the page according to the ECC data stored in the OOB space of the page.
  • Step 210: Acquire a free storage block in the SSD, move the data in the storage block to the acquired free storage block according to the fixed entry corresponding to the free storage block, and end the operation;
  • A free storage block in the solid state disk is acquired, and the data stored in the main data space of the storage block and the ECC data stored in the OOB space are moved to it. Then, according to the storage locations in the fixed entry corresponding to the free storage block, the data at those storage locations is obtained from the moved data of each page of the free storage block and stored in the spare space of that page, and the operation ends.
  • The fixed entry corresponding to the storage block is replaced with the temporary entry corresponding to the storage block, so that the next time data is written into the storage block, data is backed up to the spare space of each page according to the storage locations stored in the replaced fixed entry.
  • When the data of the page is ECC-correctable on the second check and ECC repair has been performed on it, the system should be notified to move the repaired data of the storage block to another free storage block when idle. This prevents further data errors in the storage block from making the data ECC-uncorrectable on the second check.
  • The probability that the moved data is ECC-uncorrectable on the first check is extremely small, so the moved data needs to be read only once, which greatly improves efficiency.
  • Step 211: If the second number is greater than the preset first threshold, mark the storage block as a bad block, and obtain data from a preset number of storage blocks according to the page identifier of the page;
  • The storage block is marked as a bad block, the pages whose page identifier is the same as that of the page are obtained from the preset number of storage blocks, and data is read from the obtained pages.
  • Step 212: According to the obtained data, determine whether the data in the page can be repaired by RAID. If so, perform RAID repair on the data of the page and end the operation.
  • Error checking is performed on the acquired data and on the data in the page with the same page identifier in the RAID redundant storage block. If the data of that page in the RAID redundant storage block is ECC-correctable, the number of ECC-uncorrectable pages among the pages with that page identifier in the preset number of storage blocks is counted. If that number is less than or equal to the preset second threshold, it is determined that the data in the page can be repaired by RAID, RAID repair is performed on the page in the storage block according to the data of the page in the RAID redundant storage block, and the operation ends.
  • A RAID check is performed on the data in the pages with the same page identifier in the preset number of storage blocks, and the check result is stored in the page with that page identifier in the RAID redundant storage block corresponding to the preset number of storage blocks.
  • If the number of ECC-uncorrectable pages is greater than the preset second threshold, it is determined that the data in the page cannot be repaired by RAID, the read of the data in the page fails, and the operation ends.
  • For example, if the second number of erroneous data items in the second page of the storage block is 20 and the preset first threshold is 5, the second number is greater than the preset first threshold and the storage block is marked as a bad block. If the preset number is 16 and the preset second threshold is 3, the data of the second page of each of the 16 storage blocks is acquired and error-checked, and the data of the second page in the RAID redundant storage block is also error-checked. If the data of the second page in the RAID redundant storage block is ECC-correctable and the number of ECC-uncorrectable second pages among the 16 storage blocks is 2, which is less than the preset second threshold 3, it is determined that the data of the page can be repaired by RAID. RAID repair is then performed on the second page of the storage block according to the data of the second page in the RAID redundant storage block, and the correct data is returned.
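The repairability test in Step 212 reduces to two conditions, which the following sketch encodes. The function name and the boolean-flag representation of the ECC results are hypothetical, and returning False when the redundant page is itself uncorrectable is an assumption, since the text only describes the correctable case.

```python
def raid_repairable(redundant_page_correctable, page_correctable_flags,
                    second_threshold):
    """Decide whether a page can be repaired by RAID.

    redundant_page_correctable: ECC result for the page with the same
        identifier in the RAID redundant storage block.
    page_correctable_flags: one ECC result per page with the same
        identifier in the preset number of storage blocks.
    second_threshold: the preset second threshold."""
    if not redundant_page_correctable:
        return False  # assumption: parity data that cannot be trusted rules out repair
    uncorrectable = sum(1 for ok in page_correctable_flags if not ok)
    return uncorrectable <= second_threshold


# Example from the text: 16 storage blocks, 2 uncorrectable pages, threshold 3.
flags = [True] * 14 + [False] * 2
print(raid_repairable(True, flags, second_threshold=3))  # True
```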
  • Corresponding data is obtained from each page of the storage block and written into the spare space of that page according to the fixed entry corresponding to the storage block.
  • A first error check is performed on the data of a page in the storage block. When the data in the page is ECC-uncorrectable on the first check, the erroneous data is replaced with the data stored in the spare space of the page, and a second error check is performed on the data in the page; RAID repair is performed when the data in the page is again uncorrectable.
  • Data with a high probability of error is stored in the spare space within the OOB space of each page. This not only makes full use of the OOB space; replacing the erroneous data with the data stored in the spare space of each page and performing the second error check also greatly reduces the probability of uncorrectable faults, so that the storage block is not readily marked as a bad block.
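The overall read path of Steps 206 to 212 can be summarised as a decision sequence. The sketch below only names the actions in order; the error counts and the RAID result are passed in as plain values, and the function and action names are hypothetical labels rather than anything defined by the patent.

```python
def repair_page(first_number, second_number, raid_ok, first_threshold):
    """Return the ordered list of actions for one page read.

    first_number / second_number: erroneous data items found by the
        first and second error checks.
    raid_ok: result of the RAID repairability determination.
    first_threshold: the preset first threshold (ECC repair limit)."""
    if first_number <= first_threshold:
        return ["ecc_repair"]                                   # Step 206
    steps = ["replace_from_spare", "second_error_check"]        # Steps 207-208
    if second_number <= first_threshold:
        return steps + ["ecc_repair", "move_to_free_block"]     # Steps 209-210
    steps.append("mark_bad_block")                              # Step 211
    steps.append("raid_repair" if raid_ok else "read_error")    # Step 212
    return steps


print(repair_page(3, 0, False, first_threshold=5))
print(repair_page(8, 2, False, first_threshold=5))
print(repair_page(20, 20, True, first_threshold=5))
```

The bad-block branch is reached only after both the ECC repair and the spare-space replacement have failed, which is the point of the scheme.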
  • Embodiments of the present invention provide a method for repairing erroneous data.
  • The inside of the Flash chip is divided into a plurality of storage blocks, each composed of a plurality of pages. Each page includes a main data space and an OOB space, and the OOB space is used for storing ECC data. Because the ECC data stored in the OOB space is smaller than the OOB space, each page of a storage block has, in addition to the main data space and the space actually used for ECC data, a certain amount of spare space, in which the data of the storage locations with the highest failure rate can be stored.
  • The fixed entry corresponding to the storage block stores the correspondence between storage locations and page identifiers. The number of storage locations corresponding to each page identifier is equal to the size of the spare space of each page, and the storage locations corresponding to a page identifier are the storage locations with the largest number of errors in that page. When data is backed up to the spare space of a page in the storage block, the corresponding storage locations are obtained from the fixed entry according to the page identifier of the page, and the data at those storage locations is obtained from the page and stored in its spare space.
  • Step 301: When data is written into the storage block for the first time, randomly select as many storage locations as the first preset value from each page in the storage block, and store the selected storage locations together with the page identifier of each page in the fixed entry corresponding to the storage block;
  • The storage locations corresponding to each page identifier in the fixed entry may be the same or different.
  • For example, suppose the storage block includes 3 pages and each page stores 100 bits of data, so that the storage locations in each page are 1 to 100. If the first preset value is 4, the storage locations selected from the first page are the 2nd, 45th, 70th, and 90th bits; the storage locations selected from the second page are the 2nd, 50th, 56th, and 80th bits; and the storage locations selected from the third page are the 5th, 45th, 80th, and 90th bits.
  • The storage locations 2, 45, 70, and 90 of the first page are stored with the page identifier Name1 of the first page, the storage locations 2, 50, 56, and 80 of the second page with the page identifier Name2 of the second page, and the storage locations 5, 45, 80, and 90 of the third page with the page identifier Name3 of the third page, in the fixed entry shown in Table 1.
  • Table 1
  • Page identifier Name1, storage location: 2, 45, 70, 90
  • Page identifier Name2, storage location: 2, 50, 56, 80
  • Page identifier Name3, storage location: 5, 45, 80, 90
  • Step 302: After the data of each page is written, obtain the corresponding data from the data of each page according to the storage locations in the fixed entry corresponding to the storage block, and write the acquired data to each page for backup.
  • Specifically, after the data of each page is written, the corresponding data is obtained from the data of each page according to the storage locations stored in the fixed entry corresponding to the storage block, and is written sequentially into the spare space of each page in the order in which the storage locations are stored in that fixed entry.
  • The spare space of each page is equal in size, and that size equals the first preset value. The storage locations stored in the fixed entry are in one-to-one correspondence with the data stored in the spare space.
  • For example, in the data of the first page, the data at the 2nd bit is 0, the data at the 45th bit is 1, the data at the 70th bit is 0, and the data at the 90th bit is 0, so the four acquired bits 0100 are stored in the spare space of the first page.
  • In the data of the second page, the data at the 2nd bit is 0, the data at the 50th bit is 0, the data at the 56th bit is 0, and the data at the 80th bit is 1, so the four acquired bits 0001 are stored in the spare space of the second page.
  • In the data of the third page, the data at the 5th bit is 1, the data at the 45th bit is 1, the data at the 80th bit is 0, and the data at the 90th bit is 0, so the four acquired bits 1100 are stored in the spare space of the third page.
  • After the data is written into the storage block, it can be read from the storage block the next time the user needs to use the data in the storage block.
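  • The backup of step 302 can be sketched as follows. The helpers `make_page` and `backup_bits` are hypothetical names, and representing a page as a string of '0' and '1' characters is an assumption made for readability; the bit values are those of the first-page example above.

```python
def make_page(bits_at, size=100, fill="0"):
    """Build a page of `size` bits with selected 1-based positions set."""
    page = [fill] * size
    for location, bit in bits_at.items():
        page[location - 1] = bit
    return "".join(page)

def backup_bits(page_data, locations):
    """Return the bits at the given 1-based locations, in fixed-entry order."""
    return "".join(page_data[location - 1] for location in locations)

# First page of the example: bits 2, 45, 70, 90 hold 0, 1, 0, 0.
first_page = make_page({2: "0", 45: "1", 70: "0", 90: "0"})
spare = backup_bits(first_page, [2, 45, 70, 90])
print(spare)  # 0100
```

  • Keeping the spare-space bits in the same order as the storage locations in the fixed entry is what later allows a bit to be found by its position order alone.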
  • Step 303: When the data in the storage block is read, perform a first error check on the data in each page in the storage block to obtain the erroneous data of each page in the storage block at the current time.
  • Specifically, the data in the storage block is read, and a first error check is performed separately on the data stored in the main data space of each page in the storage block to obtain the erroneous data of each page in the storage block at the current time.
  • Step 304: For a given page in the storage block, acquire the data that has been erroneous at each storage location in the page.
  • The data that has been erroneous at each storage location in every other page of the storage block may be obtained in the same way according to step 304.
  • Step 305: According to the data that has been erroneous at each storage location in the page, count the number of errors of the data at each storage location in the page.
  • The number of errors of the data that has been erroneous in every other page of the storage block may be counted in the same way according to step 305.
  • For example, in the first page, the data that has been erroneous at the 2nd-bit storage location is 0, 1, 1, and 0; at the 30th-bit storage location, 1, 1, 0, 1, and 0; at the 45th-bit storage location, 1, 1, and 0; at the 70th-bit storage location, 1, 1, 0, 1, and 0; at the 75th-bit storage location, 1, 0, 1, and 0; and at the 90th-bit storage location, 0, 1, and 0. No other storage location has erroneous data. The number of errors is therefore 4 for the 2nd bit, 5 for the 30th bit, 3 for the 45th bit, 5 for the 70th bit, 4 for the 75th bit, and 3 for the 90th bit.
  • In the second page, the data that has been erroneous at the 2nd-bit storage location is 1, 1, 1, and 0; at the 45th-bit storage location, 1, 1, 0, 1, and 0; at the 50th-bit storage location, 1, 1, and 0; at the 56th-bit storage location, 1, 1, 0, 1, and 0; at the 75th-bit storage location, 1, 0, 1, and 0; and at the 80th-bit storage location, 0, 1, and 0. No other storage location has erroneous data. The number of errors is therefore 4 for the 2nd bit, 5 for the 45th bit, 3 for the 50th bit, 5 for the 56th bit, 4 for the 75th bit, and 3 for the 80th bit.
  • In the third page, the data that has been erroneous at the 5th-bit storage location is 0, 1, 1, and 0; at the 30th-bit storage location, 1, 1, 0, 1, and 0; at the 45th-bit storage location, 1, 1, and 0; at the 70th-bit storage location, 1, 1, 0, 1, and 0; at the 85th-bit storage location, 1, 0, 1, and 0; and at the 90th-bit storage location, 0, 1, and 0. No other storage location has erroneous data. The number of errors is therefore 4 for the 5th bit, 5 for the 30th bit, 3 for the 45th bit, 5 for the 70th bit, 4 for the 85th bit, and 3 for the 90th bit.
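  • The counting in step 305 amounts to taking the length of each location's error history. The sketch below assumes a hypothetical layout in which the history of a page is a dict from storage location to the list of erroneous values observed there; the values are those given for the first page.

```python
def count_errors(error_history):
    """Map each storage location to how many times its data has been erroneous."""
    return {location: len(values) for location, values in error_history.items()}

# Error history of the first page, as listed in the example.
first_page_history = {
    2: [0, 1, 1, 0],
    30: [1, 1, 0, 1, 0],
    45: [1, 1, 0],
    70: [1, 1, 0, 1, 0],
    75: [1, 0, 1, 0],
    90: [0, 1, 0],
}
counts = count_errors(first_page_history)
print(counts)  # {2: 4, 30: 5, 45: 3, 70: 5, 75: 4, 90: 3}
```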
  • Step 306: Select, from the page, the first preset value of storage locations with the largest number of errors, and store the selected storage locations and the identifier of the page in the temporary entry corresponding to the storage block.
  • The temporary entry stores the correspondence between page identifiers and storage locations.
  • For example, the four storage locations with the largest number of errors selected from the first page are 2, 30, 70, and 75; those selected from the second page are 2, 45, 56, and 75; and those selected from the third page are 5, 30, 70, and 85. The page identifier Name1 of the first page and the selected storage locations 2, 30, 70, and 75 are stored in the temporary entry shown in Table 2 below; the page identifier Name2 of the second page and the selected storage locations 2, 45, 56, and 75 are stored in the temporary entry shown in Table 2 below; and the page identifier Name3 of the third page and the selected storage locations 5, 30, 70, and 85 are stored in the temporary entry shown in Table 2 below.
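  • The selection in step 306 can be sketched as a sort by error count. The function name `select_top_locations` is hypothetical, and breaking ties in favour of the lower storage location is an assumption; the source does not say how ties are resolved, and none occurs at the cut-off in this example.

```python
def select_top_locations(error_counts, n):
    """Return the n storage locations with the most errors, in ascending order."""
    ranked = sorted(error_counts, key=lambda loc: (-error_counts[loc], loc))
    return sorted(ranked[:n])

# Per-page error counts taken from the step 305 example.
page_counts = {
    "Name1": {2: 4, 30: 5, 45: 3, 70: 5, 75: 4, 90: 3},
    "Name2": {2: 4, 45: 5, 50: 3, 56: 5, 75: 4, 80: 3},
    "Name3": {5: 4, 30: 5, 45: 3, 70: 5, 85: 4, 90: 3},
}
temporary_entry = {
    page_id: select_top_locations(counts, 4)
    for page_id, counts in page_counts.items()
}
print(temporary_entry)
```

  • With the counts above, the sketch yields the same locations as the example: 2, 30, 70, 75 for Name1; 2, 45, 56, 75 for Name2; and 5, 30, 70, 85 for Name3.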
  • Step 307: Count the first number of erroneous data in the page at the current time. If the first number is less than or equal to the preset first threshold, perform ECC repair on the data in the page and end the operation.
  • The ECC data is obtained by performing an ECC check on the data stored in the main data space of each page of the storage block; the ECC data corresponding to each page is stored in the OOB space of that page.
  • Step 308: If the first number is greater than the preset first threshold, acquire data from the spare space of the page according to the storage locations of the erroneous data of the page and the fixed entry corresponding to the storage block, and replace the erroneous data in the page with the acquired data.
  • Specifically, if the first number is greater than the preset first threshold, it is determined that the data of the page is ECC-uncorrectable. The position order, within the fixed entry, of the storage locations of the erroneous data is obtained according to those storage locations and the page identifier of the page. The corresponding data is then obtained from the spare space of the page according to that position order, and the erroneous data in the page is replaced with the acquired data.
  • The position order of the storage locations of the erroneous data in the fixed entry is obtained as follows: the storage locations corresponding to the page identifier of the page are obtained from the fixed entry corresponding to the storage block, and the position order of each storage location of the erroneous data is determined within the obtained storage locations.
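  • The replacement in step 308 can be sketched as an index lookup. The function name `restore_from_spare` and the string representation of pages are assumptions; erroneous locations that are not recorded in the fixed entry are simply left unchanged here, since the spare space holds no backup for them.

```python
def restore_from_spare(page_data, spare, fixed_locations, error_locations):
    """Replace erroneous bits with their backups from the page's spare space.

    The position of a location inside fixed_locations is its position order,
    which is also the index of its backup bit inside `spare`.
    """
    bits = list(page_data)
    for location in error_locations:
        if location in fixed_locations:
            order = fixed_locations.index(location)
            bits[location - 1] = spare[order]
    return "".join(bits)

# A 100-bit page whose 45th bit was read as 0, although the backup says 1.
corrupted = "0" * 100
repaired = restore_from_spare(
    corrupted, spare="0100",
    fixed_locations=[2, 45, 70, 90], error_locations=[30, 45],
)
print(repaired[44])  # 1
```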
  • When the solid state disk (SSD) is first put into use, the probability of an ECC-uncorrectable fault in the data of each page of the storage block is relatively small; as time passes, this probability increases. Once the spare space function is enabled, writing data to each page of a storage block requires looking up the fixed entry corresponding to the storage block in order to back up data in the spare space, which introduces a certain delay. The spare space function can therefore be turned off when the SSD is first used. Only when the first number of erroneous data detected by the first ECC check on some page is greater than the first preset threshold is the system notified to enable the spare space function.
  • The first preset threshold is smaller than the preset first threshold, so the spare space function is enabled when the first number of erroneous data of a page is greater than the first preset threshold and less than or equal to the preset first threshold.
  • Further, the next time the data in the storage block is read, if the first number of erroneous data in the storage block is greater than the second preset threshold and less than or equal to the preset first threshold, the data of the storage block is ECC-repaired and then moved to a free storage block, and data is backed up in the spare space according to the fixed entry corresponding to the free storage block.
  • When the spare space function is enabled and data is written into a free storage block in the flash, the data with a high failure rate can be backed up directly in the spare space of that storage block according to the fixed entry corresponding to the free storage block.
  • The first preset threshold is smaller than the second preset threshold.
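  • The relationship between the three thresholds can be made concrete with a small sketch. The parameter names `enable_threshold` (the first preset threshold), `migrate_threshold` (the second preset threshold), and `ecc_threshold` (the preset first threshold) are hypothetical labels, and the numeric values used below are invented purely for illustration.

```python
def spare_space_policy(first_number, enable_threshold, migrate_threshold,
                       ecc_threshold):
    """Decide, from the first error count of a page, what the system is told.

    Assumes enable_threshold < migrate_threshold <= ecc_threshold.
    """
    return {
        # The spare space function is switched on once errors exceed the
        # first preset threshold.
        "enable_spare_space": first_number > enable_threshold,
        # Between the second preset threshold and the preset first threshold
        # the data is ECC-repaired and then moved to a free storage block.
        "move_after_ecc_repair": migrate_threshold < first_number <= ecc_threshold,
    }

# Illustrative thresholds only: 2 (enable), 4 (move), 8 (ECC limit).
for errors in (1, 3, 5, 9):
    print(errors, spare_space_policy(errors, 2, 4, 8))
```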
  • Step 309: Perform a second error check on the data of the page to obtain the erroneous data of the page.
  • Specifically, a second error check is performed on the data stored in the main data space of the page and on the ECC data, and the erroneous data of the page is obtained.
  • Step 310: Count the second number of erroneous data in the page. If the second number is less than or equal to the preset first threshold, perform ECC repair on the data in the page.
  • Specifically, the second number of erroneous data in the page is counted and compared with the preset first threshold. If the second number is less than or equal to the preset first threshold, it is determined that the data of the page is ECC-correctable, and ECC repair is performed on the data in the page according to the ECC data stored in the OOB space of the page.
  • Step 311: Acquire a free storage block in the SSD, move the data in the storage block to the acquired free storage block according to the fixed entry corresponding to the free storage block, and end the operation.
  • Specifically, a free storage block in the solid state disk is acquired, and the data stored in the main data space and the ECC data stored in the OOB space of the storage block are moved to the acquired free storage block. According to the storage locations in the fixed entry corresponding to the free storage block, the data corresponding to those storage locations is obtained from the data moved into each page of the free storage block and stored in the spare space of each page of the free storage block, and the operation ends.
  • Further, the fixed entry corresponding to the storage block is replaced with the temporary entry corresponding to the storage block, so that the next time data is written into the storage block, data is backed up in the spare space of each page of the storage block according to the storage locations stored in the replaced fixed entry.
  • If the data of the page is ECC-correctable at the second check and ECC repair is performed on it, the system should be notified to move the repaired data of the storage block to another free storage block when idle, to prevent more data errors from occurring in the storage block and making the data ECC-uncorrectable at the second check.
  • The probability that the moved data is ECC-uncorrectable at the first check is extremely small, so the moved data needs to be read only once, which greatly improves efficiency.
  • Step 312: If the second number is greater than the preset first threshold, mark the storage block as a bad block, and acquire data from a preset number of storage blocks according to the page identifier of the page.
  • Specifically, if the second number is greater than the preset first threshold, the storage block is marked as a bad block, the pages whose page identifier is the same as that of the page are obtained from the preset number of storage blocks, and data is read from the obtained pages.
  • Step 313: According to the acquired data, determine whether the data in the page can be repaired by RAID. If so, perform RAID repair on the data of the page and end the operation.
  • Specifically, error checking is performed on the acquired data and on the data in the page with the same page identifier in the RAID redundant storage block. If the data in the page corresponding to the page identifier in the RAID redundant storage block is ECC-correctable, the number of pages corresponding to the page identifier in the preset number of storage blocks whose data is ECC-uncorrectable is counted. If this number is less than or equal to the preset second threshold, it is determined that the data in the page can be repaired by RAID, RAID repair is performed on the data of the page in the storage block according to the data of the page in the RAID redundant storage block, and the operation ends.
  • The data in the RAID redundant storage block is obtained by performing a RAID check on the data in the pages with the same page identifier in the preset number of storage blocks; the check result is stored in the page corresponding to that page identifier in the RAID redundant storage block corresponding to the preset number of storage blocks.
  • Further, if the number of ECC-uncorrectable pages is greater than the preset second threshold, it is determined that the data in the page cannot be repaired by RAID, reading the data in the page fails, and the operation ends.
  • For example, suppose the second number of erroneous data in the second page of the storage block is 20 and the preset first threshold is 5. Because the second number is greater than the preset first threshold, the storage block is marked as a bad block. If the preset number is 16 and the preset second threshold is 3, the data of the second page of each of the 16 storage blocks is acquired.
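  • Steps 307 to 313 form a decision chain that can be summarised in a short sketch. The function `read_page_action`, its parameters, and the returned labels are hypothetical. Treating an ECC-uncorrectable RAID redundant page as a read failure is an assumption, because the source only describes the case in which that page is correctable.

```python
def read_page_action(first_number, second_number, uncorrectable_peers,
                     first_threshold, second_threshold,
                     parity_correctable=True):
    """Return which repair path a page takes when it is read."""
    if first_number <= first_threshold:
        return "ecc_repair"                  # step 307
    # Step 308 replaces erroneous bits from the spare space and step 309
    # re-checks the page, producing second_number.
    if second_number <= first_threshold:
        return "ecc_repair_then_move_block"  # steps 310 and 311
    # Step 312: the storage block is marked as a bad block at this point.
    if parity_correctable and uncorrectable_peers <= second_threshold:
        return "raid_repair"                 # step 313
    return "read_error"

# Thresholds from the example: first threshold 5, second threshold 3.
# The peer count of 2 is an invented value for illustration.
print(read_page_action(20, 20, 2, first_threshold=5, second_threshold=3))
```

  • The ordering reflects the design described above: the cheap ECC path is tried first, the spare-space replacement gives ECC a second chance, and RAID repair is reserved for pages that remain uncorrectable.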
  • In the embodiment of the present invention, the fixed entry corresponding to the storage block stores, for each page, the first preset value of storage locations whose data has had the largest number of errors, for use when the storage block is read.
  • The storage locations stored in the fixed entry are obtained according to the historical number of data errors at each storage location, which improves the validity of the data stored in the spare space within the OOB space of each page.
  • Embodiments of the present invention provide a method for repairing erroneous data.
  • The inside of a Flash chip is divided into a plurality of storage blocks, each storage block is composed of a plurality of pages, and each page includes a main data space and an OOB space, the OOB space being used for storing ECC data. Because the ECC data stored in the OOB space is smaller than the OOB space, each page of the storage block has, in addition to the main data space and the space that actually stores the ECC data, a certain amount of spare space, and the spare space can store the data of the storage locations with the highest failure rate.
  • The fixed entry corresponding to the storage block stores the correspondence between storage locations and page identifiers. The number of storage locations corresponding to each page identifier is not necessarily equal, and the number of all storage locations stored in the fixed entry is equal to the total size of all the spare space of the storage block.
  • Step 401: When data is written into the storage block for the first time, a second preset value of storage locations is randomly selected from the storage block, and the selected storage locations and their corresponding page identifiers are stored in the fixed entry corresponding to the storage block.
  • The fixed entry stores the correspondence between page identifiers and storage locations.
  • Erroneous data does not necessarily appear in every page of the storage block, and the spare space corresponding to a page without erroneous data may then be wasted. The embodiment of the present invention therefore selects the second preset value of storage locations from the storage block as a whole, which avoids wasting the spare space of any page.
  • For example, suppose the storage block includes three pages and each page stores 100 bits of data, so that the storage locations of the data stored in each page are 1 to 100, and the first preset value is 4. When data is written into the storage block for the first time, two storage locations are randomly selected from the first page of the storage block, namely the 2nd bit and the 45th bit; six storage locations are randomly selected from the second page, namely the 2nd bit, the 5th bit, the 34th bit, the 56th bit, the 80th bit, and the 90th bit; and four storage locations are randomly selected from the third page, namely the 2nd bit, the 34th bit, the 60th bit, and the 90th bit.
  • The page identifier Name1 of the first page and the locations 2 and 45 selected from the first page are stored in the fixed entry shown in Table 1 below; the page identifier Name2 of the second page and the locations 2, 5, 34, 56, 80, and 90 selected from the second page are stored in the fixed entry shown in Table 1 below; and the page identifier Name3 of the third page and the locations 2, 34, 60, and 90 selected from the third page are stored in the fixed entry shown in Table 1 below.
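  • Step 401 differs from step 301 in that the selection is made across the whole storage block. A minimal sketch follows, in which `build_block_fixed_entry` is a hypothetical name and the total of 12 locations is derived from three pages with 4 bits of spare space each.

```python
import random

def build_block_fixed_entry(page_ids, page_bits, total_locations, rng=None):
    """Pick total_locations distinct (page, location) pairs over the block."""
    rng = rng or random.Random()
    candidates = [(page_id, location)
                  for page_id in page_ids
                  for location in range(1, page_bits + 1)]
    entry = {page_id: [] for page_id in page_ids}
    for page_id, location in rng.sample(candidates, total_locations):
        entry[page_id].append(location)
    for locations in entry.values():
        locations.sort()
    return entry

block_entry = build_block_fixed_entry(
    ["Name1", "Name2", "Name3"], page_bits=100, total_locations=12,
    rng=random.Random(1),  # seeded only to make the sketch repeatable
)
print({page_id: len(locs) for page_id, locs in block_entry.items()})
```

  • The per-page counts printed at the end need not be equal, which is the property this embodiment relies on.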
  • Step 402: After the data of each page is written, obtain the corresponding data from the data of the storage block according to the storage locations in the fixed entry corresponding to the storage block, and write the acquired data into the storage block.
  • Specifically, after the data of each page is written, the corresponding data is obtained from the data of the storage block according to the storage locations stored in the fixed entry corresponding to the storage block, and the acquired data is written sequentially into the spare space of the storage block in the order in which the storage locations are stored in that fixed entry.
  • The second preset value is the sum of the sizes of the spare spaces of all pages in the storage block, and the storage locations stored in the fixed entry are in one-to-one correspondence with the data stored in the spare spaces.
  • For example, in the data of the first page, the data at the 2nd bit is 0 and the data at the 45th bit is 1. In the data of the second page, the data at the 2nd bit is 1, the data at the 5th bit is 1, the data at the 34th bit is 0, the data at the 56th bit is 1, the data at the 80th bit is 0, and the data at the 90th bit is 1. In the data of the third page, the data at the 2nd bit is 0, the data at the 34th bit is 1, the data at the 60th bit is 0, and the data at the 90th bit is 1.
  • The two bits 01 obtained from the first page and the first two bits 11 obtained from the second page are stored in the spare space of the first page; the last four bits 0101 obtained from the data of the second page are stored in the spare space of the second page; and the four bits 0101 obtained from the data of the third page are stored in the spare space of the third page.
  • After the data is written into the storage block, it can be read from the storage block the next time the user needs to use the data in the storage block.
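  • The packing in step 402 can be sketched by concatenating the backup bits in fixed-entry order and cutting the result into spare-space-sized pieces. The helper names and the string representation of pages are assumptions; the bit values are those of the example above.

```python
def make_page(bits_at, size=100, fill="0"):
    """Build a page of `size` bits with selected 1-based positions set."""
    page = [fill] * size
    for location, bit in bits_at.items():
        page[location - 1] = bit
    return "".join(page)

def pack_spare_spaces(pages, fixed_entry, spare_size):
    """Collect backup bits in fixed-entry order and split them per page."""
    stream = "".join(pages[page_id][location - 1]
                     for page_id, locations in fixed_entry.items()
                     for location in locations)
    return [stream[i:i + spare_size] for i in range(0, len(stream), spare_size)]

fixed_entry = {
    "Name1": [2, 45],
    "Name2": [2, 5, 34, 56, 80, 90],
    "Name3": [2, 34, 60, 90],
}
pages = {
    "Name1": make_page({2: "0", 45: "1"}),
    "Name2": make_page({2: "1", 5: "1", 34: "0", 56: "1", 80: "0", 90: "1"}),
    "Name3": make_page({2: "0", 34: "1", 60: "0", 90: "1"}),
}
print(pack_spare_spaces(pages, fixed_entry, spare_size=4))
# ['0111', '0101', '0101']
```

  • The first spare space holds two bits from the first page followed by two bits from the second page, exactly as in the example.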
  • Step 403: When the data in the storage block is read, perform a first error check on the data in each page in the storage block to obtain the erroneous data in the storage block.
  • Specifically, the data in the storage block is read, and a first error check is performed on the data stored in the main data space of each page in the storage block to obtain the erroneous data in the storage block.
  • Step 404: Obtain the storage locations of the erroneous data in each page in the storage block, and count the number of errors of the erroneous data that has the same storage location.
  • Specifically, the storage locations of the erroneous data in each page in the storage block are acquired, the erroneous data with the same storage location is identified, and the number of errors of the erroneous data with the same storage location is counted.
  • For example, the storage locations of the erroneous data in the first page are 2, 5, 30, 45, and 60; the storage locations of the erroneous data in the second page are 2, 5, 30, 45, 60, 80, and 90; and the storage locations of the erroneous data in the third page are 2, 5, 34, and 56. The counted number of errors is therefore 3 for storage location 2, 3 for storage location 5, 2 for storage location 30, 1 for storage location 34, 2 for storage location 45, 1 for storage location 56, 2 for storage location 60, 1 for storage location 80, and 1 for storage location 90.
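  • The counting in step 404 can be sketched with `collections.Counter` from the Python standard library. The dict `errors_by_page` is a hypothetical representation of the erroneous storage locations listed in the example.

```python
from collections import Counter

# Storage locations of erroneous data per page, from the example.
errors_by_page = {
    "Name1": [2, 5, 30, 45, 60],
    "Name2": [2, 5, 30, 45, 60, 80, 90],
    "Name3": [2, 5, 34, 56],
}

# Count how many pages report an error at each storage location.
location_counts = Counter(
    location for locations in errors_by_page.values() for location in locations
)
for location in sorted(location_counts):
    print(location, location_counts[location])
```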
  • Step 405: According to the counted number of errors, select the second preset value of storage locations from the storage locations corresponding to the erroneous data in the storage block, and store the selected storage locations and their corresponding page identifiers in the temporary entry corresponding to the storage block.
  • Specifically, the storage locations corresponding to the erroneous data in the storage block are sorted according to the counted number of errors to obtain their order, and the second preset value of storage locations is selected according to that order.
  • The storage locations corresponding to the erroneous data in the storage block may be sorted by number of errors from largest to smallest.
  • For example, sorting the storage locations corresponding to the erroneous data in the storage block by the counted number of errors gives the order 2, 5, 30, 45, 60, 34, 56, 80, 90. The top 12 storage locations are then the 2nd bit in the first page, the 2nd bit in the second page, the 2nd bit in the third page, the 5th bit in the first page, the 5th bit in the second page, the 5th bit in the third page, the 30th bit in the first page, the 30th bit in the second page, the 45th bit in the first page, the 45th bit in the second page, the 60th bit in the first page, and the 60th bit in the second page. The page identifier Name1 of the first page and the storage locations 2, 5, 30, 45, and 60 in the first page are stored in the temporary entry shown in Table 2 below; the page identifier Name2 of the second page and the storage locations 2, 5, 30, 45, and 60 in the second page are stored in the temporary entry shown in Table 2 below; and the page identifier Name3 of the third page and the storage locations 2 and 5 in the third page are stored in the temporary entry shown in Table 2 below.
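  • The block-wide selection of step 405 can be sketched as follows. The function name `select_block_locations` is hypothetical, and two choices are assumptions not stated in the source: ties between storage locations are broken in favour of the lower location, and pages are visited in the order in which they are listed.

```python
from collections import Counter

def select_block_locations(errors_by_page, total):
    """Pick `total` (page, location) pairs, most frequently failing first."""
    counts = Counter(loc for locs in errors_by_page.values() for loc in locs)
    ranked = sorted(counts, key=lambda loc: (-counts[loc], loc))
    entry = {page_id: [] for page_id in errors_by_page}
    remaining = total
    for location in ranked:
        for page_id, locations in errors_by_page.items():
            if remaining == 0:
                return entry
            if location in locations:
                entry[page_id].append(location)
                remaining -= 1
    return entry

errors_by_page = {
    "Name1": [2, 5, 30, 45, 60],
    "Name2": [2, 5, 30, 45, 60, 80, 90],
    "Name3": [2, 5, 34, 56],
}
temporary_entry = select_block_locations(errors_by_page, total=12)
print(temporary_entry)
```

  • With the example data, the twelve slots are filled by locations 2, 5, 30, 45, and 60 of the first and second pages and locations 2 and 5 of the third page.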
  • Step 406: For a page in the storage block, count the first number of erroneous data in the page. If the first number is less than or equal to the preset first threshold, perform ECC repair on the data in the page and end the operation.
  • Specifically, the first number of erroneous data in the page is counted and compared with the preset first threshold. If the first number is less than or equal to the preset first threshold, it is determined that the data of the page is ECC-correctable, ECC repair is performed on the data in the page according to the ECC data stored in the OOB space of the page, and the operation ends.
  • The ECC data corresponding to each page of the storage block is obtained, and the ECC data corresponding to each page is stored in the OOB space of that page.
  • Step 407: If the first number is greater than the preset first threshold, acquire data from the spare space of the storage block according to the storage locations of the erroneous data of the page and the fixed entry corresponding to the storage block, and replace the erroneous data in the page with the acquired data.
  • Specifically, the position order, within the fixed entry, of the storage locations of the erroneous data is obtained, the corresponding data is acquired from the spare space of the storage block according to that position order, and the erroneous data in the page is replaced with the acquired data.
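  • Because the spare spaces of this embodiment are filled block-wide, the position order in step 407 runs over the whole fixed entry rather than over one page. The sketch below uses the hypothetical function `lookup_backup_bit` and the fixed entry and spare-space contents from the step 402 example.

```python
def lookup_backup_bit(fixed_entry, spare_spaces, page_id, location):
    """Return the backed-up bit for (page_id, location), or None if absent.

    The position order is the index of the pair in the flattened fixed
    entry, which is also the index into the concatenated spare spaces.
    """
    stream = "".join(spare_spaces)
    order = 0
    for entry_page, locations in fixed_entry.items():
        for entry_location in locations:
            if (entry_page, entry_location) == (page_id, location):
                return stream[order]
            order += 1
    return None

fixed_entry = {
    "Name1": [2, 45],
    "Name2": [2, 5, 34, 56, 80, 90],
    "Name3": [2, 34, 60, 90],
}
spare_spaces = ["0111", "0101", "0101"]
print(lookup_backup_bit(fixed_entry, spare_spaces, "Name2", 34))  # 0
```

  • A page's backup bits may therefore sit in the spare space of a different page, as with the first two bits of the second page in the example.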
  • When the solid state disk (SSD) is first put into use, the probability of an ECC-uncorrectable fault in the data of each page of the storage block is relatively small; as time passes, this probability increases. Once the spare space function is enabled, writing data to each page of a storage block requires looking up the fixed entry corresponding to the storage block in order to back up data in the spare space, which introduces a certain delay. The spare space function can therefore be turned off when the SSD is first used. Only when the first number of erroneous data detected by the first ECC check on some page is greater than the first preset threshold is the system notified to enable the spare space function.
  • The first preset threshold is smaller than the preset first threshold, so the spare space function is enabled when the first number of erroneous data of a page is greater than the first preset threshold and less than or equal to the preset first threshold.
  • Further, the next time the data in the storage block is read, if the first number of erroneous data in the storage block is greater than the second preset threshold and less than or equal to the preset first threshold, the data of the storage block is ECC-repaired and then moved to a free storage block, and data is backed up in the spare space according to the fixed entry corresponding to the free storage block.
  • When the spare space function is enabled and data is written into a free storage block in the flash, the data with a high failure rate can be backed up directly in the spare space of that storage block according to the fixed entry corresponding to the free storage block.
  • The first preset threshold is smaller than the second preset threshold.
  • Step 408: Perform a second error check on the data of the page to obtain the erroneous data of the page.
  • Specifically, a second error check is performed on the data stored in the main data space of the page to obtain the erroneous data of the page.
  • Step 409: Count the second number of erroneous data in the page. If the second number is less than or equal to the preset first threshold, perform ECC repair on the data in the page.
  • Specifically, the second number of erroneous data in the page is counted and compared with the preset first threshold. If the second number is less than or equal to the preset first threshold, it is determined that the data of the page is ECC-correctable, and ECC repair is performed on the data in the page according to the ECC data stored in the OOB space of the page.
  • Step 410: Acquire a free storage block in the SSD, move the data in the storage block to the acquired free storage block according to the fixed entry corresponding to the free storage block, and end the operation.
  • Specifically, a free storage block in the solid state disk is acquired, and the data stored in the main data space and the ECC data stored in the OOB space of the storage block are moved to the acquired free storage block. According to the storage locations in the fixed entry corresponding to the free storage block, the data corresponding to those storage locations is obtained from the data moved into the free storage block and stored in the spare space of the free storage block, and the operation ends.
  • Further, the fixed entry corresponding to the storage block is replaced with the temporary entry corresponding to the storage block, so that the next time data is written into the storage block, data is backed up in the spare space of the storage block according to the storage locations stored in the replaced fixed entry.
  • If the data of the page is ECC-correctable at the second check and ECC repair is performed on it, the system should be notified to move the repaired data of the storage block to another free storage block when idle, to prevent more data errors from occurring in the storage block and making the data ECC-uncorrectable at the second check.
  • The probability that the moved data is ECC-uncorrectable at the first check is extremely small, so the moved data needs to be read only once, which greatly improves efficiency.
  • Step 411: If the second number is greater than the preset first threshold, mark the storage block as a bad block, and acquire data from a preset number of storage blocks according to the page identifier of the page.
  • Specifically, if the second number is greater than the preset first threshold, the storage block is marked as a bad block, the pages whose page identifier is the same as that of the page are obtained from the preset number of storage blocks, and data is read from the obtained pages.
  • Step 412: According to the acquired data, determine whether the data in the page can be repaired by RAID. If so, perform RAID repair on the data of the page and end the operation.
  • Specifically, error checking is performed on the acquired data and on the data in the page with the same page identifier in the RAID redundant storage block. If the data in the page corresponding to the page identifier in the RAID redundant storage block is ECC-correctable, the number of pages corresponding to the page identifier in the preset number of storage blocks whose data is ECC-uncorrectable is counted. If this number is less than or equal to the preset second threshold, it is determined that the data in the page can be repaired by RAID, RAID repair is performed on the data of the page in the storage block according to the data of the page in the RAID redundant storage block, and the operation ends.
  • The data in the RAID redundant storage block is obtained by performing a RAID check on the data in the pages with the same page identifier in the preset number of storage blocks; the check result is stored in the page corresponding to that page identifier in the RAID redundant storage block corresponding to the preset number of storage blocks. Further, if the number of ECC-uncorrectable pages is greater than the preset second threshold, it is determined that the data in the page cannot be repaired by RAID, reading the data in the page fails, and the operation ends.
  • For example, suppose the second number of erroneous data in the second page of the storage block is 20 and the preset first threshold is 5. Because the second number is greater than the preset first threshold, the storage block is marked as a bad block. If the preset number is 16 and the preset second threshold is 3, the data of the second page of each of the 16 storage blocks is acquired.
  • a first error check is performed on the data in a page of the storage block.
  • the data in the page is ECC-uncorrectable.
  • the erroneous data is replaced with the data stored in the spare space of the storage block, a second error check is performed on the data in the page, and RAID repair is performed when the data in the page is ECC-uncorrectable again.
  • data with a high probability of error is stored in the spare space within each page's OOB space, which not only makes full use of the OOB space but also allows the erroneous data to be replaced with the data stored in each page's spare space before the second error check.
  • an embodiment of the present invention provides an apparatus for repairing erroneous data, where the apparatus includes: a first obtaining module 501, configured to: when data in a storage block included in a solid state hard disk is read, perform a first error check on the data in a page of the storage block and obtain the erroneous data in the page;
  • the first repair module 502 is configured to perform error checking and correction (ECC) repair on the data in the page if the first number of erroneous data in the page is less than or equal to the preset first threshold;
  • the first replacement module 503 is configured to: if the first number is greater than the preset first threshold, obtain data from the spare space according to the storage locations of the erroneous data in the page and the fixed entry corresponding to the storage block, and replace the erroneous data in the page with the acquired data, the fixed entry including the storage location of each piece of data stored in the spare space.
  • the device also includes:
  • a second obtaining module configured to perform a second error check on the data in the page to obtain the erroneous data in the page;
  • a second repairing module configured to perform ECC repair on the data in the page if the second number of erroneous data in the page is less than or equal to the preset first threshold;
  • a marking module configured to mark the storage block as a bad block if the second number is greater than the preset first threshold, and obtain data from a preset number of storage blocks according to the page identifier of the page;
  • the third repairing module is configured to determine, according to the acquired data, whether to perform redundant array of independent disks (RAID) repair on the data in the page, and if yes, perform RAID repair on the data in the page.
  • the device further includes:
  • a third obtaining module configured to acquire a storage location of data that is erroneous in each page in the storage block;
  • a first statistic module configured to acquire the erroneous data that share the same storage location, and count the number of errors for each such storage location;
  • the first storage module is configured to select the storage locations with the largest error counts, as many as the first preset value, and store the selected storage locations in a temporary entry corresponding to the storage block.
  • the device further includes:
  • a fourth obtaining module configured to acquire the data that has been erroneous at each storage location in the page;
  • a second statistic module configured to count, according to the data that has been erroneous at each storage location in the page, the number of errors at each storage location in the page;
  • the second storage module is configured to select the storage locations with the largest error counts, as many as the first preset value, and store the selected storage locations and the page identifier of the page in a temporary entry corresponding to the storage block.
  • the device further includes:
  • a fifth obtaining module configured to acquire the storage locations of the erroneous data in each page in the storage block; and a third statistic module, configured to acquire the erroneous data that share the same storage location, and count the number of errors for each such storage location;
  • a third storage module configured to select, according to the counted numbers of errors, as many storage locations as a second preset value from the storage locations of the erroneous data in the storage block, and store the selected storage locations and their corresponding page identifiers in the temporary entry corresponding to the storage block.
  • the device further includes:
  • the moving module is configured to obtain a free storage block in the solid state hard disk, and move the data in the storage block to the free storage block according to the fixed entry corresponding to the free storage block.
  • the device further includes:
  • the second replacement module is configured to replace the fixed entry corresponding to the storage block with the temporary entry corresponding to the storage block.
  • a first error check is performed on the data of a page in the storage block.
  • the erroneous data is replaced with the data stored in the spare space of the page.
  • data with a high probability of error is stored in the spare space within the OOB space of each page, which not only makes full use of the OOB space; replacing the erroneous data with the data stored in the spare space of the page can also greatly reduce the probability of an uncorrectable fault, so that the storage block is not marked as a bad block too readily.
  • an embodiment of the present invention provides an apparatus for repairing erroneous data, where the apparatus includes: a memory 601 and a processor 602, configured to perform the following method for repairing erroneous data:
  • if the first number is greater than the preset first threshold, acquiring data from the spare space according to the storage locations of the erroneous data in the page and the fixed entry corresponding to the storage block, and replacing the erroneous data in the page with the acquired data, where the fixed entry includes the storage location of each piece of data stored in the spare space.
  • the method further includes:
  • the method further includes:
  • the method further includes:
  • the storage locations with the largest error counts, as many as the first preset value, are selected, and the selected storage locations and the page identifier of the page are stored in the temporary entry corresponding to the storage block.
  • the method further includes:
  • the selected storage locations and their corresponding page identifiers are stored in the temporary entry corresponding to the storage block.
  • the method further includes:
  • the method further includes:
  • the fixed entry corresponding to the storage block is replaced with the temporary entry corresponding to the storage block.
  • a first error check is performed on the data of a page in the storage block.
  • the erroneous data is replaced with the data stored in the spare space of the page.
  • data with a high probability of error is stored in the spare space within the OOB space of each page, which not only makes full use of the OOB space; replacing the erroneous data with the data stored in the spare space of the page can also greatly reduce the probability of an uncorrectable fault, so that the storage block is not marked as a bad block too readily.
  • when the device for repairing erroneous data provided in the foregoing embodiment repairs erroneous data, the division into the foregoing functional modules is used only as an example.
  • in practice, the functions may be assigned to different functional modules as required;
  • that is, the internal structure of the device may be divided into different functional modules to perform all or part of the functions described above.
  • the device for repairing erroneous data provided by the foregoing embodiment and the method for repairing erroneous data belong to the same concept; the specific implementation process is described in the method embodiment and is not repeated here.
  • the above-mentioned storage medium may be a read only memory, a magnetic disk or an optical disk or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Techniques For Improving Reliability Of Storages (AREA)

Abstract

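The present invention discloses a method and a device for repairing erroneous data, and belongs to the computer field. The method includes: when data in a storage block included in a solid state disk is read, performing a first error check on the data in a page of the storage block to obtain the erroneous data in the page; if a first number of erroneous data in the page is less than or equal to a preset first threshold, performing error checking and correction (ECC) repair on the data in the page; and if the first number is greater than the preset first threshold, acquiring data from a spare space according to the storage locations of the erroneous data in the page and a fixed entry corresponding to the storage block, and replacing the erroneous data in the page with the acquired data, where the fixed entry includes the storage location of each piece of data stored in the spare space. The device includes a first obtaining module, a first repair module, and a first replacement module. The present invention can greatly reduce the probability of an uncorrectable fault, so that the storage block is not marked as a bad block too readily.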

Description

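Method and device for repairing erroneous data
This application claims priority to Chinese Patent Application No. 201310381426.1, filed with the Chinese Patent Office on August 28, 2013 and entitled "Method and device for repairing erroneous data", which is incorporated herein by reference in its entirety.
Technical Field
The present invention relates to the computer field, and in particular to a method and a device for repairing erroneous data.
Background
An SSD (Solid State Disk) is a hard disk built from an array of solid-state electronic storage chips and consists of a control unit and a storage unit. The storage unit is made up of Flash chips. Owing to process and cost factors, Flash chips have a certain failure rate, and a failed Flash chip corrupts the data stored on it. Methods for repairing erroneous data have therefore attracted wide attention.
At present, erroneous data is repaired as follows. When data in a storage block of a Flash chip in the SSD is read, a page of the storage block is checked for erroneous data. If the number of erroneous data does not exceed a preset first threshold, ECC (Error Correcting Code, error checking and correction) repair is performed on the erroneous data in the page, and the correct data is returned. If the number of erroneous data exceeds the preset first threshold, the storage block is marked as a bad block and is no longer used; at the same time, the pages with the same page identifier as the page are selected from a preset number of storage blocks, and, according to the data in the selected pages, it is determined whether the data of the page can be repaired by RAID (Redundant Arrays of Inexpensive Disks). If so, RAID repair is performed on the page and the correct data is returned; otherwise, the data read fails.
In the course of implementing the present invention, the inventors found that the prior art has at least the following problems:
When the number of erroneous data in a page exceeds the preset first threshold, the only option is RAID repair of the storage block. RAID repair relies on the data in a preset number of storage blocks, so repairing the data takes a long time and is inefficient. In addition, once the whole storage block is marked as a bad block, the capacity of the SSD shrinks and only the storage blocks not marked as bad can be used afterwards; these blocks are then used more frequently, which reduces the lifetime and performance of the SSD.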
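Summary
To solve the problems of the prior art, embodiments of the present invention provide a method and a device for repairing erroneous data. The technical solutions are as follows:
According to a first aspect, a method for repairing erroneous data is provided, the method including:
when data in a storage block included in a solid state disk is read, performing a first error check on the data in a page of the storage block to obtain the erroneous data in the page;
if a first number of erroneous data in the page is less than or equal to a preset first threshold, performing error checking and correction (ECC) repair on the data in the page; and
if the first number is greater than the preset first threshold, acquiring data from a spare space according to the storage locations of the erroneous data in the page and a fixed entry corresponding to the storage block, and replacing the erroneous data in the page with the acquired data, where the fixed entry includes the storage location of each piece of data stored in the spare space.
With reference to the first aspect, in a first possible implementation of the first aspect, after the acquiring data from the spare space according to the storage locations of the erroneous data in the page and the stored fixed entry and replacing the erroneous data in the page with the acquired data if the first number is greater than the preset first threshold, the method further includes:
performing a second error check on the data in the page to obtain the erroneous data in the page;
if a second number of erroneous data in the page is less than or equal to the preset first threshold, performing ECC repair on the data in the page;
if the second number is greater than the preset first threshold, marking the storage block as a bad block, and acquiring data from a preset number of storage blocks according to the page identifier of the page; and
determining, according to the acquired data, whether to perform redundant array of independent disks (RAID) repair on the data in the page, and if so, performing RAID repair on the data in the page.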
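The two-stage cascade described in the first aspect and its first possible implementation can be summarized as a small decision function. The following is only a sketch: the function name, the action labels, and the idea of passing both counts in as plain integers are illustrative assumptions, not part of the claimed method.

```python
def repair_actions(first_number, second_number, ecc_limit):
    """List the repair actions for one page read (hypothetical helper).

    first_number:  erroneous-data count from the first error check.
    second_number: count from the second error check, made after the
                   spare-space replacement (ignored if the first check
                   is already ECC-correctable).
    ecc_limit:     the preset first threshold.
    """
    if first_number <= ecc_limit:
        # First check is ECC-correctable: plain ECC repair is enough.
        return ["ecc_repair"]
    # First check is not correctable: patch from spare space and re-check.
    actions = ["replace_from_spare", "second_error_check"]
    if second_number <= ecc_limit:
        actions.append("ecc_repair")
    else:
        actions += ["mark_bad_block", "try_raid_repair"]
    return actions


print(repair_actions(3, 0, 5))
print(repair_actions(8, 4, 5))
print(repair_actions(20, 20, 5))
```

The point of the cascade is that the bad-block marking and the slower RAID path are reached only when the second check also fails.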
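With reference to the first aspect, in a second possible implementation of the first aspect, after the performing a first error check on the data in a page of the storage block to obtain the erroneous data in the page, the method further includes:
acquiring the storage locations of the erroneous data in each page of the storage block;
acquiring the erroneous data that share the same storage location, and counting the number of errors for each such storage location; and
selecting the storage locations with the largest error counts, as many as a first preset value, and storing the selected storage locations in a temporary entry corresponding to the storage block.
With reference to the first aspect, in a third possible implementation of the first aspect, after the performing a first error check on the data in a page of the storage block to obtain the erroneous data in the page, the method further includes:
acquiring the data that has been erroneous at each storage location in the page;
counting, according to the data that has been erroneous at each storage location in the page, the number of errors at each storage location in the page; and
selecting the storage locations with the largest error counts, as many as the first preset value, and storing the selected storage locations and the page identifier of the page in the temporary entry corresponding to the storage block.
With reference to the first aspect, in a fourth possible implementation of the first aspect, after the performing a first error check on the data in a page of the storage block to obtain the erroneous data in the page, the method further includes:
acquiring the storage locations of the erroneous data in each page of the storage block;
acquiring the erroneous data that share the same storage location, and counting the number of errors for each such storage location; and
selecting, according to the counted numbers of errors, as many storage locations as a second preset value from the storage locations of the erroneous data in the storage block, and storing the selected storage locations and their corresponding page identifiers in the temporary entry corresponding to the storage block.
With reference to the first aspect or any one of the first to fourth possible implementations of the first aspect, in a fifth possible implementation of the first aspect, after the performing ECC repair on the data in the page if the second number of erroneous data in the page is less than or equal to the preset first threshold, the method further includes:
obtaining a free storage block in the solid state disk, and moving the data in the storage block to the free storage block according to the fixed entry corresponding to the free storage block.
With reference to the fifth possible implementation of the first aspect, in a sixth possible implementation of the first aspect, the method further includes:
replacing the fixed entry corresponding to the storage block with the temporary entry corresponding to the storage block.
According to a second aspect, an apparatus for repairing erroneous data is provided, the apparatus including: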
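a first obtaining module, configured to: when data in a storage block included in a solid state disk is read, perform a first error check on the data in a page of the storage block to obtain the erroneous data in the page;
a first repair module, configured to perform error checking and correction (ECC) repair on the data in the page if a first number of erroneous data in the page is less than or equal to a preset first threshold; and
a first replacement module, configured to: if the first number is greater than the preset first threshold, acquire data from a spare space according to the storage locations of the erroneous data in the page and a fixed entry corresponding to the storage block, and replace the erroneous data in the page with the acquired data, where the fixed entry includes the storage location of each piece of data stored in the spare space.
With reference to the second aspect, in a first possible implementation of the second aspect, the apparatus further includes:
a second obtaining module, configured to perform a second error check on the data in the page to obtain the erroneous data in the page;
a second repair module, configured to perform ECC repair on the data in the page if a second number of erroneous data in the page is less than or equal to the preset first threshold;
a marking module, configured to: if the second number is greater than the preset first threshold, mark the storage block as a bad block, and acquire data from a preset number of storage blocks according to the page identifier of the page; and
a third repair module, configured to determine, according to the acquired data, whether to perform redundant array of independent disks (RAID) repair on the data in the page, and if so, perform RAID repair on the data in the page.
With reference to the second aspect, in a second possible implementation of the second aspect, the apparatus further includes:
a third obtaining module, configured to acquire the storage locations of the erroneous data in each page of the storage block;
a first statistics module, configured to acquire the erroneous data that share the same storage location, and count the number of errors for each such storage location; and
a first storage module, configured to select the storage locations with the largest error counts, as many as a first preset value, and store the selected storage locations in a temporary entry corresponding to the storage block.
With reference to the second aspect, in a third possible implementation of the second aspect, the apparatus further includes:
a fourth obtaining module, configured to acquire the data that has been erroneous at each storage location in the page;
a second statistics module, configured to count, according to the data that has been erroneous at each storage location in the page, the number of errors at each storage location in the page; and
a second storage module, configured to select the storage locations with the largest error counts, as many as the first preset value, and store the selected storage locations and the page identifier of the page in the temporary entry corresponding to the storage block.
With reference to the second aspect, in a fourth possible implementation of the second aspect, the apparatus further includes:
a fifth obtaining module, configured to acquire the storage locations of the erroneous data in each page of the storage block;
a third statistics module, configured to acquire the erroneous data that share the same storage location, and count the number of errors for each such storage location; and
a third storage module, configured to select, according to the counted numbers of errors, as many storage locations as a second preset value from the storage locations of the erroneous data in the storage block, and store the selected storage locations and their corresponding page identifiers in the temporary entry corresponding to the storage block.
With reference to the second aspect or any one of the first to fourth possible implementations of the second aspect, in a fifth possible implementation of the second aspect, the apparatus further includes:
a moving module, configured to obtain a free storage block in the solid state disk, and move the data in the storage block to the free storage block according to the fixed entry corresponding to the free storage block.
With reference to the fifth possible implementation of the second aspect, in a sixth possible implementation of the second aspect, the apparatus further includes:
a second replacement module, configured to replace the fixed entry corresponding to the storage block with the temporary entry corresponding to the storage block.
According to a third aspect, a device for repairing erroneous data is provided, the device including a memory and a processor and being configured to perform the foregoing method for repairing erroneous data.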
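In the embodiments of the present invention, a first error check is performed on the data of a page in the storage block, and when the data in the page is ECC-uncorrectable on the first check, the erroneous data is replaced with the data stored in the spare space of the page. Data with a high probability of error is stored in the spare space within each page's OOB space. This not only makes full use of the OOB space; replacing the erroneous data with the data stored in the page's spare space also greatly reduces the probability of an uncorrectable fault, so that the storage block is not marked as a bad block too readily.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings needed for describing the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a flowchart of a method for repairing erroneous data according to an embodiment of the present invention;
FIG. 2 is a flowchart of another method for repairing erroneous data according to an embodiment of the present invention;
FIG. 3 is a flowchart of another method for repairing erroneous data according to an embodiment of the present invention;
FIG. 4 is a flowchart of another method for repairing erroneous data according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an apparatus for repairing erroneous data according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of another apparatus for repairing erroneous data according to an embodiment of the present invention.
Detailed Description
To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
An embodiment of the present invention provides a method for repairing erroneous data. Referring to FIG. 1, the method includes:
Step 101: When data in a storage block included in a solid state disk is read, perform a first error check on the data in a page of the storage block to obtain the erroneous data in the page.
Step 102: If a first number of erroneous data in the page is less than or equal to a preset first threshold, perform error checking and correction (ECC) repair on the data in the page.
Step 103: If the first number is greater than the preset first threshold, acquire data from the spare space according to the storage locations of the erroneous data in the page and the fixed entry corresponding to the storage block, and replace the erroneous data in the page with the acquired data, where the fixed entry includes the storage location of each piece of data stored in the spare space.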
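In this embodiment of the present invention, a first error check is performed on the data of a page in the storage block, and when the data in the page is ECC-uncorrectable on the first check, the erroneous data is replaced with the data stored in the spare space of the page. Data with a high probability of error is stored in the spare space within each page's OOB space. This not only makes full use of the OOB space; replacing the erroneous data with the data stored in the page's spare space also greatly reduces the probability of an uncorrectable fault, so that the storage block is not marked as a bad block too readily.
An embodiment of the present invention provides a method for repairing erroneous data. The inside of a Flash chip is divided into multiple storage blocks, each storage block consists of multiple pages, and each page includes a main data space and an OOB (Out of Band Data, redundant space) space. The OOB space is used to store ECC data. Because the ECC data stored in the OOB space is smaller than the OOB space, each page of a storage block has, besides the main data space and the space that actually holds ECC data, a certain amount of spare space, which can store the data of the storage locations with the highest failure rate. In this embodiment, the fixed entry corresponding to a storage block stores the storage locations of the data held in the spare space of the storage block. When data is backed up to the spare space of each page of the storage block, data is taken from each page according to the storage locations stored in the fixed entry and stored in that page's spare space. Referring to FIG. 2, the method includes:
Step 201: When data is written to the storage block for the first time, randomly select as many storage locations as a first preset value from a page of the storage block, and store the selected storage locations in order in the fixed entry corresponding to the storage block.
Every page of a storage block has the same amount of storage space, so for the whole storage block the storage locations can be randomly selected from any one page of the storage block; that is, the data stored in the spare space of every page of the storage block corresponds to the same storage locations.
For example, the storage block includes 3 pages and each page stores 100 bits of data, so the storage locations in each page are 1 to 100. Suppose the first preset value is 4, and the first selected storage location is bit 2, the second is bit 45, the third is bit 70, and the fourth is bit 90. The selected storage locations 2, 45, 70, and 90 are stored in order in the fixed entry shown in Table 1.
Table 1
Storage locations: 2, 45, 70, 90
Step 202: After the data of each page is written, acquire the corresponding data from the data of each page according to the storage locations in the fixed entry corresponding to the storage block, and write the acquired data into the spare space of each page.
Specifically, after the data of each page is written, the corresponding data is acquired from the data of each page according to the storage locations stored in the fixed entry corresponding to the storage block, and the acquired data is written into the spare space of each page in the order in which the storage locations are stored in the fixed entry.
The spare space of every page has the same size, equal to the first preset value, and the storage locations stored in the fixed entry correspond one-to-one to the data stored in the spare space.
For example, in the first page the data at bit 2 is 0, at bit 45 is 1, at bit 70 is 0, and at bit 90 is 0, so the 4 acquired bits 0100 are stored in the spare space of the first page. In the second page the data at bit 2 is 0, at bit 45 is 0, at bit 70 is 0, and at bit 90 is 1, so the 4 acquired bits 0001 are stored in the spare space of the second page. In the third page the data at bit 2 is 1, at bit 45 is 1, at bit 70 is 0, and at bit 90 is 0, so the 4 acquired bits 1100 are stored in the spare space of the third page.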
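The backup in steps 201 and 202 can be reproduced with a few lines of Python. This is a minimal sketch that models a page as a string of '0'/'1' characters with 1-based storage locations, as in the example; the helper names `make_page` and `backup_to_spare` are hypothetical, and the bits not mentioned in the example are simply assumed to be 0.

```python
def make_page(bits_at, size=100):
    """Build a page of `size` bits (all '0') and set the given 1-based locations."""
    page = ["0"] * size
    for loc, bit in bits_at.items():
        page[loc - 1] = bit
    return "".join(page)


def backup_to_spare(page_bits, fixed_entry):
    """Copy the bits at the fixed-entry locations into spare space, in entry order."""
    return "".join(page_bits[loc - 1] for loc in fixed_entry)


fixed_entry = [2, 45, 70, 90]  # Table 1
pages = [
    make_page({2: "0", 45: "1", 70: "0", 90: "0"}),
    make_page({2: "0", 45: "0", 70: "0", 90: "1"}),
    make_page({2: "1", 45: "1", 70: "0", 90: "0"}),
]
spares = [backup_to_spare(page, fixed_entry) for page in pages]
print(spares)  # ['0100', '0001', '1100']
```

Because one fixed entry serves the whole storage block, the same four locations are copied out of every page.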
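After data has been written to the storage block, the data can be read from the storage block the next time the user needs it.
Step 203: When data in the storage block is read, perform a first error check on the data in each page of the storage block to obtain the erroneous data in each page.
Specifically, the data in the storage block is read, and a first error check is performed on the data stored in the main data space of each page of the storage block to obtain the erroneous data in each page.
Step 204: Acquire the storage locations of the erroneous data in each page of the storage block, and count the number of errors for each storage location shared by erroneous data.
Specifically, the storage locations of the erroneous data in each page of the storage block are acquired, the erroneous data that share the same storage location are acquired, and the number of errors is counted for each such storage location.
For example, the erroneous data in the first page is at storage locations 2, 30, 65, and 70; in the second page at 2, 34, 45, and 78; in the third page at 2, 65, 70, and 90; and in the fourth page at 10, 45, 65, and 70. The error counts for the storage block are then 3 for storage location 2, 1 for 10, 1 for 30, 1 for 34, 2 for 45, 3 for 65, 3 for 70, 1 for 78, and 1 for 90.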
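The tally in step 204, together with the selection that step 205 applies to it, can be sketched with `collections.Counter`. The function names are hypothetical, and breaking ties in favor of the lower storage location is an assumption made only to keep the result deterministic; the text does not specify a tie-breaking rule.

```python
from collections import Counter


def count_errors_by_location(error_locations_per_page):
    """Count, per storage location, how many pages have an error there."""
    counts = Counter()
    for locations in error_locations_per_page:
        counts.update(set(locations))
    return counts


def top_locations(counts, n):
    """Pick the n locations with the largest counts (ties: lower location first)."""
    ranked = sorted(counts, key=lambda loc: (-counts[loc], loc))
    return sorted(ranked[:n])


errors = [
    [2, 30, 65, 70],   # first page
    [2, 34, 45, 78],   # second page
    [2, 65, 70, 90],   # third page
    [10, 45, 65, 70],  # fourth page
]
counts = count_errors_by_location(errors)
selected = top_locations(counts, 4)
print(dict(counts))
print(selected)  # [2, 45, 65, 70]
```

With the example data the fourth slot is unambiguous, since location 45 is the only one with a count of 2.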
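Step 205: Select the storage locations with the largest error counts, as many as the first preset value, and store the selected storage locations in the temporary entry corresponding to the storage block.
For example, the 4 storage locations with the largest error counts are 2, 45, 65, and 70, and they are stored in order in the temporary entry shown in Table 2.
Table 2
Storage locations: 2, 45, 65, 70
Step 206: For a page in the storage block, count the first number of erroneous data in the page; if the first number is less than or equal to the preset first threshold, perform ECC repair on the data in the page and end the operation.
Specifically, for a page in the storage block, the first number of erroneous data in the page is counted and compared with the preset first threshold. If the first number is less than or equal to the preset first threshold, it is determined that the data of the page is ECC-correctable, ECC repair is performed on the data in the page according to the ECC data stored in the page's OOB space, and the operation ends.
When data is written to the storage block, an ECC check is computed over the data stored in the main data space of each page of the storage block to obtain the ECC data for each page, and the ECC data of each page is stored in that page's OOB space.
Step 207: If the first number is greater than the preset first threshold, acquire data from the spare space of the page according to the storage locations of the erroneous data in the page and the fixed entry corresponding to the storage block, and replace the erroneous data in the page with the acquired data.
Specifically, if the first number is greater than the preset first threshold, it is determined that the data of the page is ECC-uncorrectable. According to the storage locations of the erroneous data in the page, the position order of those storage locations within the fixed entry is obtained; according to that position order, the corresponding data is acquired from the spare space, and the erroneous data in the page is replaced with the acquired data.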
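The lookup in step 207 (storage location, then its position order in the fixed entry, then the matching spare-space bit) can be sketched as follows. The names are hypothetical, the error locations are taken as given just as in the text, and the sketch assumes the spare-space bits themselves are intact. Locations that are not covered by the fixed entry are returned so that they can be counted again in the second error check.

```python
def replace_from_spare(page_bits, error_locations, fixed_entry, spare_bits):
    """Patch erroneous bits from spare space; return (patched page, uncovered locations)."""
    patched = list(page_bits)
    remaining = []
    for loc in error_locations:
        if loc in fixed_entry:
            order = fixed_entry.index(loc)   # position order within the fixed entry
            patched[loc - 1] = spare_bits[order]
        else:
            remaining.append(loc)            # still wrong after the replacement
    return "".join(patched), remaining


def flip(bits, locations):
    """Simulate bit errors by inverting the bits at the given 1-based locations."""
    out = list(bits)
    for loc in locations:
        out[loc - 1] = "1" if out[loc - 1] == "0" else "0"
    return "".join(out)


original = "0" * 44 + "1" + "0" * 55        # 100 bits, bit 45 set
fixed_entry = [2, 45, 70, 90]
spare = "0100"                               # backup taken at write time
bad = [2, 30, 31, 33, 45, 70]                # 6 errors, above a threshold of 5
corrupted = flip(original, bad)

patched, remaining = replace_from_spare(corrupted, bad, fixed_entry, spare)
print(remaining)                             # [30, 31, 33]
print(flip(patched, remaining) == original)  # True
```

In this illustration three of the six errors sit at backed-up locations, so the second check sees only three errors, which an ECC limit of 5 could handle.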
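When an SSD is first put into use, the probability that the data in a page of a storage block suffers an ECC-uncorrectable fault is small, and it grows over time. Once the spare-space function is enabled, every write to a page of the storage block has to look up the fixed entry corresponding to the storage block in order to back up data to the spare space, which introduces some latency. The spare-space function can therefore be disabled when the SSD is first put into use, and the system is notified to enable it only when, after the first ECC check of a page, the first number of erroneous data is found to be greater than a first preset threshold.
The first preset threshold is less than the preset first threshold. Thus, when the first number of erroneous data in a page is greater than the first preset threshold and less than or equal to the preset first threshold, the spare-space function is enabled. The next time data in the storage block is read, if the first number of erroneous data in the storage block is greater than a second preset threshold and less than or equal to the preset first threshold, the data of the storage block is ECC-repaired and then moved to a free storage block, and data is backed up to the spare space according to the fixed entry corresponding to that free storage block.
Once the spare-space function is enabled, when data is written to a free storage block in the Flash, the data with a high failure rate can be backed up directly in the spare space of that storage block according to the fixed entry corresponding to the free storage block.
The first preset threshold is less than the second preset threshold.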
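The relationship between the three values (first preset threshold, second preset threshold, preset first threshold) is easy to misread, so here is one possible reading as code. Everything in it is an assumption for illustration: the function and parameter names, the action labels, the concrete numbers, and in particular the use of a `spare_enabled` flag to model "the next time the block is read".

```python
def spare_space_policy(first_number, spare_enabled,
                       enable_threshold, migrate_threshold, ecc_limit):
    """Decide the follow-up action after the first ECC check of a page.

    enable_threshold  ~ first preset threshold
    migrate_threshold ~ second preset threshold
    ecc_limit         ~ preset first threshold
    """
    if not enable_threshold < migrate_threshold < ecc_limit:
        raise ValueError("expected enable < migrate < ecc_limit")
    if first_number > ecc_limit:
        return "not_ecc_correctable"       # handled by the spare-space replacement
    if not spare_enabled:
        if first_number > enable_threshold:
            return "enable_spare_space"
        return "none"
    if first_number > migrate_threshold:
        return "ecc_repair_then_migrate"   # move to a free block, rebuild spare space
    return "none"


print(spare_space_policy(1, False, 2, 3, 5))  # none
print(spare_space_policy(3, False, 2, 3, 5))  # enable_spare_space
print(spare_space_policy(4, True, 2, 3, 5))   # ecc_repair_then_migrate
print(spare_space_policy(6, True, 2, 3, 5))   # not_ecc_correctable
```

The design intent described in the text is to pay the write-time lookup cost only once a block starts to show wear.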
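Step 208: Perform a second error check on the data of the page to obtain the erroneous data in the page.
Specifically, a second error check is performed on the data stored in the main data space of the page to obtain the erroneous data in the page.
Step 209: Count the second number of erroneous data in the page; if the second number is less than or equal to the preset first threshold, perform ECC repair on the data in the page.
Specifically, the second number of erroneous data in the page is counted and compared with the preset first threshold. If the second number is less than or equal to the preset first threshold, it is determined that the data of the page is ECC-correctable, and ECC repair is performed on the data in the page according to the ECC data stored in the page's OOB space.
Step 210: Obtain a free storage block in the solid state disk, move the data in the storage block to the obtained free storage block according to the fixed entry corresponding to the free storage block, and end the operation.
Specifically, a free storage block in the solid state disk is obtained, and the data stored in the main data space of the storage block and the ECC data stored in the OOB space are moved to the obtained free storage block. Then, according to the storage locations in the fixed entry corresponding to the free storage block, the data at those storage locations is acquired from the data moved to each page of the free storage block and stored in the spare space of each page of the free storage block, and the operation ends.
Further, the fixed entry corresponding to the storage block is replaced with the temporary entry corresponding to the storage block. The next time data is written to the storage block, data is backed up to the spare space of each page of the storage block according to the storage locations stored in the replaced fixed entry.
When the data of a page is ECC-uncorrectable on the first check but, after the erroneous data is replaced with the data stored in the spare space, is ECC-correctable on the second check and has been ECC-repaired, the system should be notified to move the repaired data of the storage block to another free storage block when idle. This prevents more data in the storage block from going bad and making the second ECC check uncorrectable as well. Moreover, after the data of the storage block has been moved, the probability that the moved data is ECC-uncorrectable on the first check is very small, so reading the moved data needs only one ECC pass, which can greatly improve efficiency.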
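Step 211: If the second number is greater than the preset first threshold, mark the storage block as a bad block, and acquire data from a preset number of storage blocks according to the page identifier of the page.
Specifically, if the second number is greater than the preset first threshold, the storage block is marked as a bad block, the pages with the same page identifier as the page are obtained from the preset number of storage blocks according to the page identifier of the page, and data is read from the obtained pages.
Step 212: Determine, according to the acquired data, whether to perform RAID repair on the data in the page; if so, perform RAID repair on the data of the page and end the operation.
Specifically, an error check is performed on the acquired data and on the data in the page of the RAID redundant storage block that has the same page identifier as the page. If the data in the page corresponding to the page identifier in the RAID redundant storage block is ECC-correctable, the number of ECC-uncorrectable pages among the pages corresponding to the page identifier in the preset number of storage blocks is counted. If that number is less than or equal to a preset second threshold, it is determined that the data in the page can be repaired by RAID, the data of the page in the storage block is RAID-repaired according to the data of the page in the RAID redundant storage block, and the operation ends.
A RAID check is computed in advance over the data in the pages with the same page identifier in the preset number of storage blocks, and the result is stored in the page corresponding to that page identifier in the RAID redundant storage block corresponding to the preset number of storage blocks.
Further, if the number of ECC-uncorrectable pages is greater than the preset second threshold, it is determined that the data in the page cannot be repaired by RAID, the read of the data in the page fails, and the operation ends.
For example, the second number of erroneous data in the second page of the storage block is 20 and the preset first threshold is 5. Because the second number is greater than the preset first threshold, the storage block is marked as a bad block. Suppose the preset number is 16 and the preset second threshold is 3. The data of the second page of each of the 16 storage blocks is acquired and error-checked, and the data of the second page of the RAID redundant storage block is error-checked as well. If the data of the second page of the RAID redundant storage block is ECC-correctable, and the number of ECC-uncorrectable second pages among the 16 storage blocks is 2, which is less than the preset second threshold of 3, it is determined that the data of the page can be repaired by RAID; the data of the second page of the storage block is RAID-repaired according to the data of the second page of the RAID redundant storage block, and the correct data is returned.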
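The RAID feasibility test in step 212 reduces to two conditions, which the sketch below encodes. The function name and the boolean-list input are hypothetical. The text does not say what happens when the page in the RAID redundant storage block is itself ECC-uncorrectable; returning False in that case is an assumption.

```python
def can_raid_repair(parity_page_correctable, member_uncorrectable, second_threshold):
    """Return True if the page can be rebuilt by RAID.

    parity_page_correctable: ECC result for the same-numbered page in the
        RAID redundant storage block.
    member_uncorrectable: one boolean per storage block in the group, True if
        that block's same-numbered page is ECC-uncorrectable.
    second_threshold: the preset second threshold.
    """
    if not parity_page_correctable:
        return False  # assumption: no usable redundancy, so no RAID repair
    return sum(member_uncorrectable) <= second_threshold


# Worked example: 16 blocks, 2 uncorrectable pages, preset second threshold 3.
flags = [True, True] + [False] * 14
print(can_raid_repair(True, flags, 3))  # True
```

How the page is actually rebuilt from the redundant block depends on the RAID scheme in use, which the text leaves open.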
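In this embodiment of the present invention, when data is written to the storage block, the corresponding data is acquired from each page of the storage block according to the fixed entry corresponding to the storage block and written into the spare space of each page. When data in the storage block is read, a first error check is performed on the data of a page in the storage block; when the data in the page is ECC-uncorrectable on the first check, the erroneous data is replaced with the data stored in the spare space of the page and a second error check is performed on the data in the page; when the data in the page is ECC-uncorrectable again, RAID repair is performed. Data with a high probability of error is stored in the spare space within each page's OOB space. This not only makes full use of the OOB space; replacing the erroneous data with the data stored in each page's spare space and performing a second error check also greatly reduces the probability of an uncorrectable fault, so that the storage block is not marked as a bad block too readily.
An embodiment of the present invention provides a method for repairing erroneous data. The inside of a Flash chip is divided into multiple storage blocks, each storage block consists of multiple pages, and each page includes a main data space and an OOB space. The OOB space is used to store ECC data. Because the ECC data stored in the OOB space is smaller than the OOB space, each page of a storage block has, besides the main data space and the space that actually holds ECC data, a certain amount of spare space, which can store the data of the storage locations with the highest failure rate. In this embodiment, the fixed entry corresponding to a storage block stores the correspondence between storage locations and page identifiers. The number of storage locations corresponding to each page identifier equals the size of each page's spare space, and the storage locations corresponding to a page identifier are the storage locations in that page whose data has been in error most often. When data is backed up to the spare space of each page of the storage block, the corresponding storage locations are obtained from the fixed entry according to the page identifier of the page, the corresponding data is acquired from the data of the page according to the obtained storage locations, and the data is stored in the spare space of the page. Referring to FIG. 3, the method includes:
Step 301: When data is written to the storage block for the first time, randomly select as many storage locations as a first preset value from each page of the storage block, and store the selected storage locations and the page identifier of each page in the fixed entry corresponding to the storage block.
The fixed entry stores the correspondence between page identifiers and storage locations.
The storage locations corresponding to the page identifiers in the fixed entry may be the same or may differ from page to page.
For example, the storage block includes 3 pages and each page stores 100 bits of data, so the storage locations in each page are 1 to 100. Suppose the first preset value is 4. The storage locations selected from the first page are bit 2, bit 45, bit 70, and bit 90; those selected from the second page are bit 2, bit 50, bit 56, and bit 80; and those selected from the third page are bit 5, bit 45, bit 80, and bit 90. The storage locations 2, 45, 70, and 90 of the first page are stored with the page identifier Name1 of the first page, the storage locations 2, 50, 56, and 80 of the second page are stored with the page identifier Name2 of the second page, and the storage locations 5, 45, 80, and 90 of the third page are stored with the page identifier Name3 of the third page, all in the fixed entry shown in Table 1.
Table 1
Page identifier    Storage locations
Name1              2, 45, 70, 90
Name2              2, 50, 56, 80
Name3              5, 45, 80, 90
Step 302: After the data of each page is written, acquire the corresponding data from the data of each page according to the storage locations in the fixed entry corresponding to the storage block, and write the acquired data into the spare space of each page.
Specifically, after the data of each page is written, the corresponding data is acquired from the data of each page according to the storage locations stored in the fixed entry corresponding to the storage block, and the acquired data is written into the spare space of each page in the order in which the storage locations are stored in the fixed entry.
The spare space of every page has the same size, equal to the first preset value, and the storage locations stored in the fixed entry correspond one-to-one to the data stored in the spare space.
For example, in the first page the data at bit 2 is 0, at bit 45 is 1, at bit 70 is 0, and at bit 90 is 0, so the 4 acquired bits 0100 are stored in the spare space of the first page. In the second page the data at bit 2 is 0, at bit 50 is 0, at bit 56 is 0, and at bit 80 is 1, so the 4 acquired bits 0001 are stored in the spare space of the second page. In the third page the data at bit 5 is 1, at bit 45 is 1, at bit 80 is 0, and at bit 90 is 0, so the 4 acquired bits 1100 are stored in the spare space of the third page.
After data has been written to the storage block, the data can be read from the storage block the next time the user needs it.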
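With a per-page fixed entry, the backup in step 302 becomes a dictionary lookup by page identifier. The sketch below replays the example; the dictionary layout and the helper names are illustrative assumptions, and bits not mentioned in the example are taken to be 0.

```python
def make_page(bits_at, size=100):
    """Build a page of `size` bits (all '0') and set the given 1-based locations."""
    page = ["0"] * size
    for loc, bit in bits_at.items():
        page[loc - 1] = bit
    return "".join(page)


def backup_page(page_id, page_bits, fixed_entry):
    """Copy this page's own backed-up locations into its spare space."""
    return "".join(page_bits[loc - 1] for loc in fixed_entry[page_id])


# Fixed entry keyed by page identifier (Table 1 of this embodiment).
fixed_entry = {
    "Name1": [2, 45, 70, 90],
    "Name2": [2, 50, 56, 80],
    "Name3": [5, 45, 80, 90],
}
pages = {
    "Name1": make_page({2: "0", 45: "1", 70: "0", 90: "0"}),
    "Name2": make_page({2: "0", 50: "0", 56: "0", 80: "1"}),
    "Name3": make_page({5: "1", 45: "1", 80: "0", 90: "0"}),
}
spares = {pid: backup_page(pid, bits, fixed_entry) for pid, bits in pages.items()}
print(spares)  # {'Name1': '0100', 'Name2': '0001', 'Name3': '1100'}
```

Compared with a single block-wide entry, this costs more table space but lets each page protect its own weakest locations.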
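Step 303: When data in the storage block is read, perform a first error check on the data in each page of the storage block to obtain the data that is currently erroneous in each page of the storage block.
Specifically, the data in the storage block is read, and a first error check is performed on the data stored in the main data space of each page of the storage block to obtain the data that is currently erroneous in each page of the storage block.
Step 304: For a page in the storage block, acquire the data that has been erroneous at each storage location in the page.
The data that has been erroneous at each storage location in the other pages of the storage block can be acquired in the same way as in step 304.
Step 305: Count, according to the data that has been erroneous at each storage location in the page, the number of errors at each storage location in the page.
The number of errors at each storage location in the other pages of the storage block can be counted in the same way as in step 305.
For example, in the first page, the data that has been erroneous at bit 2 is 0, 1, 1, and 0; at bit 30 it is 1, 1, 0, 1, and 0; at bit 45 it is 1, 1, and 0; at bit 70 it is 1, 1, 0, 1, and 0; at bit 75 it is 1, 0, 1, and 0; and at bit 90 it is 0, 1, and 0; no other storage location has had erroneous data. The error counts in the first page are therefore 4 at bit 2, 5 at bit 30, 3 at bit 45, 5 at bit 70, 4 at bit 75, and 3 at bit 90.
In the second page, the data that has been erroneous at bit 2 is 1, 1, 1, and 0; at bit 45 it is 1, 1, 0, 1, and 0; at bit 50 it is 1, 1, and 0; at bit 56 it is 1, 1, 0, 1, and 0; at bit 75 it is 1, 0, 1, and 0; and at bit 80 it is 0, 1, and 0; no other storage location has had erroneous data. The error counts in the second page are therefore 4 at bit 2, 5 at bit 45, 3 at bit 50, 5 at bit 56, 4 at bit 75, and 3 at bit 80.
In the third page, the data that has been erroneous at bit 5 is 0, 1, 1, and 0; at bit 30 it is 1, 1, 0, 1, and 0; at bit 45 it is 1, 1, and 0; at bit 70 it is 1, 1, 0, 1, and 0; at bit 85 it is 1, 0, 1, and 0; and at bit 90 it is 0, 1, and 0; no other storage location has had erroneous data. The error counts in the third page are therefore 4 at bit 5, 5 at bit 30, 3 at bit 45, 5 at bit 70, 4 at bit 85, and 3 at bit 90.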
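Step 306: Select the storage locations with the largest error counts, as many as the first preset value, and store the selected storage locations and the identifier of the page in the temporary entry corresponding to the storage block.
The temporary entry stores the correspondence between page identifiers and location numbers.
For example, the 4 storage locations with the largest error counts are 2, 30, 70, and 75 in the first page, 2, 45, 56, and 75 in the second page, and 5, 30, 70, and 85 in the third page. The page identifier Name1 of the first page is stored with the selected storage locations 2, 30, 70, and 75, the page identifier Name2 of the second page with 2, 45, 56, and 75, and the page identifier Name3 of the third page with 5, 30, 70, and 85, all in the temporary entry shown in Table 2.
Table 2
Page identifier    Storage locations
Name1              2, 30, 70, 75
Name2              2, 45, 56, 75
Name3              5, 30, 70, 85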
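Steps 304 to 306 can be sketched as a per-page ranking over historical error counts. The input below is the list of counts from the worked example; the function name is hypothetical, and breaking ties by the lower storage location is an assumption that the text does not state (the example happens to have no tie at the cut-off).

```python
def build_temporary_entry(history, n):
    """For each page, keep the n locations with the highest historical error counts."""
    entry = {}
    for page_id, counts in history.items():
        ranked = sorted(counts, key=lambda loc: (-counts[loc], loc))
        entry[page_id] = sorted(ranked[:n])
    return entry


# Historical error counts per storage location, per page (from the example).
history = {
    "Name1": {2: 4, 30: 5, 45: 3, 70: 5, 75: 4, 90: 3},
    "Name2": {2: 4, 45: 5, 50: 3, 56: 5, 75: 4, 80: 3},
    "Name3": {5: 4, 30: 5, 45: 3, 70: 5, 85: 4, 90: 3},
}
temporary_entry = build_temporary_entry(history, 4)
for page_id, locations in temporary_entry.items():
    print(page_id, locations)
```

The printed rows match Table 2: Name1 keeps 2, 30, 70, 75; Name2 keeps 2, 45, 56, 75; Name3 keeps 5, 30, 70, 85.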
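Step 307: Count the first number of data that is currently erroneous in the page; if the first number is less than or equal to the preset first threshold, perform ECC repair on the data in the page and end the operation.
Specifically, the first number of erroneous data in the page is counted and compared with the preset first threshold. If the first number is less than or equal to the preset first threshold, it is determined that the data of the page is ECC-correctable, ECC repair is performed on the data in the page according to the ECC data stored in the page's OOB space, and the operation ends.
When data is written to the storage block, an ECC check is computed over the data stored in the main data space of each page of the storage block to obtain the ECC data for each page, and the ECC data of each page is stored in that page's OOB space.
Step 308: If the first number is greater than the preset first threshold, acquire data from the spare space of the page according to the storage locations of the erroneous data in the page and the fixed entry corresponding to the storage block, and replace the erroneous data in the page with the acquired data.
Specifically, if the first number is greater than the preset first threshold, it is determined that the data of the page is ECC-uncorrectable. According to the storage locations of the erroneous data in the page and the page identifier of the page, the position order of those storage locations within the fixed entry is obtained; according to that position order, the corresponding data is acquired from the spare space of the page, and the erroneous data in the page is replaced with the acquired data.
The position order is obtained as follows: according to the page identifier of the page, the corresponding storage locations are obtained from the fixed entry corresponding to the storage block; then, according to the storage locations of the erroneous data in the page, the position order of those storage locations among the obtained storage locations is determined.
When an SSD is first put into use, the probability that the data in a page of a storage block suffers an ECC-uncorrectable fault is small, and it grows over time. Once the spare-space function is enabled, every write to a page of the storage block has to look up the fixed entry corresponding to the storage block in order to back up data to the spare space, which introduces some latency. The spare-space function can therefore be disabled when the SSD is first put into use, and the system is notified to enable it only when, after the first ECC check of a page, the first number of erroneous data is found to be greater than a first preset threshold.
The first preset threshold is less than the preset first threshold. Thus, when the first number of erroneous data in a page is greater than the first preset threshold and less than or equal to the preset first threshold, the spare-space function is enabled. The next time data in the storage block is read, if the first number of erroneous data in the storage block is greater than a second preset threshold and less than or equal to the preset first threshold, the data of the storage block is ECC-repaired and then moved to a free storage block, and data is backed up to the spare space according to the fixed entry corresponding to that free storage block.
Once the spare-space function is enabled, when data is written to a free storage block in the Flash, the data with a high failure rate can be backed up directly in the spare space of that storage block according to the fixed entry corresponding to the free storage block.
The first preset threshold is less than the second preset threshold.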
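Step 309: Perform a second error check on the data of the page to obtain the erroneous data in the page.
Specifically, a second error check is performed on the data stored in the main data space of the page and on the ECC data to obtain the erroneous data in the page.
Step 310: Count the second number of erroneous data in the page; if the second number is less than or equal to the preset first threshold, perform ECC repair on the data in the page.
Specifically, the second number of erroneous data in the page is counted and compared with the preset first threshold. If the second number is less than or equal to the preset first threshold, it is determined that the data of the page is ECC-correctable, and ECC repair is performed on the data in the page according to the ECC data stored in the page's OOB space.
Step 311: Obtain a free storage block in the solid state disk, move the data in the storage block to the obtained free storage block according to the fixed entry corresponding to the free storage block, and end the operation.
Specifically, a free storage block in the solid state disk is obtained, and the data stored in the main data space of the storage block and the ECC data stored in the OOB space are moved to the obtained free storage block. Then, according to the storage locations in the fixed entry corresponding to the free storage block, the data at those storage locations is acquired from the data moved to each page of the free storage block and stored in the spare space of each page of the free storage block, and the operation ends.
Further, the fixed entry corresponding to the storage block is replaced with the temporary entry corresponding to the storage block. The next time data is written to the storage block, data is backed up to the spare space of each page of the storage block according to the storage locations stored in the replaced fixed entry.
When the data of a page is ECC-uncorrectable on the first check but, after the erroneous data is replaced with the data stored in the spare space, is ECC-correctable on the second check and has been ECC-repaired, the system should be notified to move the repaired data of the storage block to another free storage block when idle. This prevents more data in the storage block from going bad and making the second ECC check uncorrectable as well. Moreover, after the data of the storage block has been moved, the probability that the moved data is ECC-uncorrectable on the first check is very small, so reading the moved data needs only one ECC pass, which can greatly improve efficiency.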
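Step 312: If the second number is greater than the preset first threshold, mark the storage block as a bad block, and acquire data from a preset number of storage blocks according to the page identifier of the page.
Specifically, if the second number is greater than the preset first threshold, the storage block is marked as a bad block, the pages with the same page identifier as the page are obtained from the preset number of storage blocks according to the page identifier of the page, and data is read from the obtained pages.
Step 313: Determine, according to the acquired data, whether to perform RAID repair on the data in the page; if so, perform RAID repair on the data of the page and end the operation.
Specifically, an error check is performed on the acquired data and on the data in the page of the RAID redundant storage block that has the same page identifier as the page. If the data in the page corresponding to the page identifier in the RAID redundant storage block is ECC-correctable, the number of ECC-uncorrectable pages among the pages corresponding to the page identifier in the preset number of storage blocks is counted. If that number is less than or equal to a preset second threshold, it is determined that the data in the page can be repaired by RAID, the data of the page in the storage block is RAID-repaired according to the data of the page in the RAID redundant storage block, and the operation ends.
A RAID check is computed in advance over the data in the pages with the same page identifier in the preset number of storage blocks, and the result is stored in the page corresponding to that page identifier in the RAID redundant storage block corresponding to the preset number of storage blocks.
Further, if the number of ECC-uncorrectable pages is greater than the preset second threshold, it is determined that the data in the page cannot be repaired by RAID, the read of the data in the page fails, and the operation ends.
For example, the second number of erroneous data in the second page of the storage block is 20 and the preset first threshold is 5. Because the second number is greater than the preset first threshold, the storage block is marked as a bad block. Suppose the preset number is 16 and the preset second threshold is 3. The data of the second page of each of the 16 storage blocks is acquired and error-checked, and the data of the second page of the RAID redundant storage block is error-checked as well. If the data of the second page of the RAID redundant storage block is ECC-correctable, and the number of ECC-uncorrectable second pages among the 16 storage blocks is 2, which is less than the preset second threshold of 3, it is determined that the data of the page can be repaired by RAID; the data of the second page of the storage block is RAID-repaired according to the data of the second page of the RAID redundant storage block, and the correct data is returned.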
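In this embodiment of the present invention, the fixed entry corresponding to the storage block stores, for each page, the storage locations whose stored data has been in error most often, as many as the first preset value. When data in the storage block is read, a first error check is performed on the data in a page of the storage block; when the data in the page is ECC-uncorrectable on the first check, the erroneous data is replaced with the data stored in the spare space of the page and a second error check is performed on the data in the page; when the data in the page is ECC-uncorrectable again, RAID repair is performed. Because the storage locations stored in the fixed entry are obtained from the historical error count of the data at each storage location, the data stored in the spare space is more effective. In addition, data with a high probability of error is stored in the spare space within each page's OOB space, which not only makes full use of the OOB space; replacing the erroneous data with the data stored in each page's spare space and performing a second error check also greatly reduces the probability of an uncorrectable fault, so that the storage block is not marked as a bad block too readily.
An embodiment of the present invention provides a method for repairing erroneous data. The inside of a Flash chip is divided into multiple storage blocks, each storage block consists of multiple pages, and each page includes a main data space and an OOB space. The OOB space is used to store ECC data. Because the ECC data stored in the OOB space is smaller than the OOB space, each page of a storage block has, besides the main data space and the space that actually holds ECC data, a certain amount of spare space, which can store the data of the storage locations with the highest failure rate. In this embodiment, the fixed entry corresponding to a storage block stores the correspondence between storage locations and page identifiers. The number of storage locations corresponding to each page identifier is not necessarily the same, and the total number of storage locations stored in the fixed entry equals the total size of all the spare space of the storage block. When data is backed up to the spare space of the storage block, the corresponding storage locations are obtained from the fixed entry according to the page identifier of a page, the corresponding data is acquired from the data of the page according to the obtained storage locations, and the data is stored in the spare space of the storage block. Referring to FIG. 4, the method includes:
Step 401: When data is written to the storage block for the first time, randomly select as many storage locations as a second preset value from the storage block, and store the selected storage locations and their corresponding page identifiers in the fixed entry corresponding to the storage block.
The fixed entry stores the correspondence between page identifiers and location numbers.
For a given storage block, erroneous data does not necessarily appear in every page. When a page has no erroneous data, the spare space of that page may be wasted. This embodiment therefore selects the storage locations from the storage block as a whole, as many as the second preset value, so that no page's spare space is wasted.
For example, the storage block includes 3 pages and each page stores 100 bits of data, so the storage locations in each page are 1 to 100. Suppose the first preset value is 4, that is, the spare space of each page holds 4 bits and the second preset value is 12. When data is written to the storage block for the first time, 2 storage locations are randomly selected from the first page, namely bit 2 and bit 45; 6 storage locations are randomly selected from the second page, namely bit 2, bit 5, bit 34, bit 56, bit 80, and bit 90; and 4 storage locations are randomly selected from the third page, namely bit 2, bit 34, bit 60, and bit 90. The page identifier Name1 of the first page is stored with the locations 2 and 45 selected from the first page, the page identifier Name2 of the second page with the locations 2, 5, 34, 56, 80, and 90 selected from the second page, and the page identifier Name3 of the third page with the locations 2, 34, 60, and 90 selected from the third page, all in the fixed entry shown in Table 1.
Table 1
Page identifier    Storage locations
Name1              2, 45
Name2              2, 5, 34, 56, 80, 90
Name3              2, 34, 60, 90
Step 402: After the data of each page is written, acquire the corresponding data from the data of the storage block according to the storage locations in the fixed entry corresponding to the storage block, and write the acquired data into the spare space of the storage block.
Specifically, after the data of each page is written, the corresponding data is acquired from the data of the storage block according to the storage locations stored in the fixed entry corresponding to the storage block, and the acquired data is written into the spare space of the storage block in the order in which the storage locations are stored in the fixed entry.
The second preset value is the sum of the sizes of the spare space of all pages of the storage block, and the storage locations stored in the fixed entry correspond one-to-one to the data stored in the spare space.
For example, in the first page the data at bit 2 is 0 and at bit 45 is 1. In the second page the data at bit 2 is 1, at bit 5 is 1, at bit 34 is 0, at bit 56 is 1, at bit 80 is 0, and at bit 90 is 1. In the third page the data at bit 2 is 0, at bit 34 is 1, at bit 60 is 0, and at bit 90 is 1. The 2 bits 01 acquired from the first page and the first 2 bits 11 acquired from the second page are stored in the spare space of the first page; the last 4 bits 0101 acquired from the second page are stored in the spare space of the second page; and the 4 bits 0101 acquired from the third page are stored in the spare space of the third page.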
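In this embodiment the spare space of all pages is treated as one pool: the backed-up bits are concatenated in fixed-entry order and then cut into page-sized slices. The sketch below replays the example; the list-of-pairs layout for the fixed entry and the function names are illustrative assumptions, and unspecified bits are taken to be 0.

```python
def make_page(bits_at, size=100):
    """Build a page of `size` bits (all '0') and set the given 1-based locations."""
    page = ["0"] * size
    for loc, bit in bits_at.items():
        page[loc - 1] = bit
    return "".join(page)


def pooled_backup(pages, fixed_entry, spare_size):
    """Concatenate backed-up bits in fixed-entry order, then slice per page spare space."""
    stream = "".join(
        pages[page_id][loc - 1]
        for page_id, locations in fixed_entry
        for loc in locations
    )
    return [stream[i:i + spare_size] for i in range(0, len(stream), spare_size)]


fixed_entry = [
    ("Name1", [2, 45]),
    ("Name2", [2, 5, 34, 56, 80, 90]),
    ("Name3", [2, 34, 60, 90]),
]
pages = {
    "Name1": make_page({2: "0", 45: "1"}),
    "Name2": make_page({2: "1", 5: "1", 34: "0", 56: "1", 80: "0", 90: "1"}),
    "Name3": make_page({2: "0", 34: "1", 60: "0", 90: "1"}),
}
spares = pooled_backup(pages, fixed_entry, spare_size=4)
print(spares)  # ['0111', '0101', '0101']
```

The first page's spare space ends up holding two of its own bits followed by the first two bits backed up from the second page, exactly as in the example.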
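After data has been written to the storage block, the data can be read from the storage block the next time the user needs it.
Step 403: When data in the storage block is read, perform a first error check on the data in each page of the storage block to obtain the erroneous data in the storage block.
Specifically, the data in the storage block is read, and a first error check is performed on the data stored in the main data space of each page of the storage block to obtain the erroneous data in the storage block.
Step 404: Acquire the storage locations of the erroneous data in each page of the storage block, and count the number of errors for each storage location shared by erroneous data.
Specifically, the storage locations of the erroneous data in each page of the storage block are acquired, the erroneous data that share the same storage location are acquired, and the number of errors is counted for each such storage location.
For example, the erroneous data in the first page is at storage locations 2, 5, 30, 45, and 60; in the second page at 2, 5, 30, 45, 60, 80, and 90; and in the third page at 2, 5, 34, and 56. The error counts are then 3 for storage location 2, 3 for 5, 2 for 30, 1 for 34, 2 for 45, 1 for 56, 2 for 60, 1 for 80, and 1 for 90.
Step 405: According to the counted numbers of errors, select as many storage locations as the second preset value from the storage locations of the erroneous data in the storage block, and store the selected storage locations and their corresponding page identifiers in the temporary entry corresponding to the storage block.
Specifically, the storage locations of the erroneous data in the storage block are sorted according to the counted numbers of errors to obtain an ordering of those storage locations, and the storage locations are selected, as many as the second preset value, according to that ordering.
The storage locations of the erroneous data in the storage block may be sorted in descending order of error count.
For example, sorting the storage locations of the erroneous data in the storage block by the counted numbers of errors gives the order 2, 5, 30, 45, 60, 34, 80, 90. The first 12 storage locations in this order are bit 2 in the first page, bit 2 in the second page, bit 2 in the third page, bit 5 in the first page, bit 5 in the second page, bit 5 in the third page, bit 30 in the first page, bit 30 in the second page, bit 45 in the first page, bit 45 in the second page, bit 60 in the first page, and bit 60 in the second page. The page identifier Name1 of the first page is stored with the storage locations 2, 5, 30, 45, and 60 of the first page, the page identifier Name2 of the second page with the storage locations 2, 5, 30, 45, and 60 of the second page, and the page identifier Name3 of the third page with the storage locations 2 and 5 of the third page, all in the temporary entry shown in Table 2.
Table 2
Page identifier    Storage locations
Name1              2, 5, 30, 45, 60
Name2              2, 5, 30, 45, 60
Name3              2, 5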
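Steps 404 and 405 rank storage locations by block-wide error count and then fill a fixed budget of (page, location) slots. The sketch below reproduces Table 2 from the example data. The function name is hypothetical, and both the tie-break (lower location first) and the page order within one location are assumptions that the text leaves open.

```python
from collections import Counter


def build_block_temporary_entry(errors_by_page, total_slots):
    """Fill up to total_slots (page, location) pairs, most error-prone locations first."""
    counts = Counter(
        loc for locations in errors_by_page.values() for loc in set(locations)
    )
    ranked = sorted(counts, key=lambda loc: (-counts[loc], loc))
    entry = {page_id: [] for page_id in errors_by_page}
    used = 0
    for loc in ranked:
        for page_id, locations in errors_by_page.items():
            if loc in locations and used < total_slots:
                entry[page_id].append(loc)
                used += 1
    return entry


errors_by_page = {
    "Name1": [2, 5, 30, 45, 60],
    "Name2": [2, 5, 30, 45, 60, 80, 90],
    "Name3": [2, 5, 34, 56],
}
temporary_entry = build_block_temporary_entry(errors_by_page, total_slots=12)
for page_id, locations in temporary_entry.items():
    print(page_id, locations)
```

With 12 slots, locations 2 and 5 take three slots each and locations 30, 45, and 60 take two each, which exhausts the budget before any location with a single error is reached.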
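Step 406: For a page in the storage block, count the first number of erroneous data in the page; if the first number is less than or equal to the preset first threshold, perform ECC repair on the data in the page and end the operation.
Specifically, for a page in the storage block, the first number of erroneous data in the page is counted and compared with the preset first threshold. If the first number is less than or equal to the preset first threshold, it is determined that the data of the page is ECC-correctable, ECC repair is performed on the data in the page according to the ECC data stored in the page's OOB space, and the operation ends.
When data is written to the storage block, an ECC check is computed over the data stored in the main data space of each page of the storage block to obtain the ECC data for each page, and the ECC data of each page is stored in that page's OOB space.
Step 407: If the first number is greater than the preset first threshold, acquire data from the spare space of the storage block according to the storage locations of the erroneous data in the page and the fixed entry corresponding to the storage block, and replace the erroneous data in the page with the acquired data.
Specifically, if the first number is greater than the preset first threshold, it is determined that the data of the page is ECC-uncorrectable. According to the storage locations of the erroneous data in the page, the position order of those storage locations within the fixed entry is obtained; according to that position order, the corresponding data is acquired from the spare space of the storage block, and the erroneous data in the page is replaced with the acquired data.
When an SSD is first put into use, the probability that the data in a page of a storage block suffers an ECC-uncorrectable fault is small, and it grows over time. Once the spare-space function is enabled, every write to a page of the storage block has to look up the fixed entry corresponding to the storage block in order to back up data to the spare space, which introduces some latency. The spare-space function can therefore be disabled when the SSD is first put into use, and the system is notified to enable it only when, after the first ECC check of a page, the first number of erroneous data is found to be greater than a first preset threshold.
The first preset threshold is less than the preset first threshold. Thus, when the first number of erroneous data in a page is greater than the first preset threshold and less than or equal to the preset first threshold, the spare-space function is enabled. The next time data in the storage block is read, if the first number of erroneous data in the storage block is greater than a second preset threshold and less than or equal to the preset first threshold, the data of the storage block is ECC-repaired and then moved to a free storage block, and data is backed up to the spare space according to the fixed entry corresponding to that free storage block.
Once the spare-space function is enabled, when data is written to a free storage block in the Flash, the data with a high failure rate can be backed up directly in the spare space of that storage block according to the fixed entry corresponding to the free storage block.
The first preset threshold is less than the second preset threshold.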
步骤 408 : 对该页的数据进行第二次错误检查, 获取该页出错的数据;
具体地, 对该页中的主数据空间中存储的数据进行第二次错误检查, 获取该页的出 错的数据。
步骤 409: 统计该页出错的数据的第二个数, 如果该第二个数小于或等于预设第一 门限, 则对该页中的数据进行 ECC修复;
具体地,统计该页出错的数据的第二个数,将该第二个数和预设第一门限进行比较, 如果该第二个数小于或等于预设第一门限,则确定该页的数据 ECC可纠,根据该页的 00B 空间中存储的 ECC数据, 对该页中的数据进行 ECC修复。
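Step 410: Obtain a free storage block in the solid-state drive, migrate the data in the storage block to the obtained free storage block according to the fixed entry corresponding to the free storage block, and end the operation.
Specifically, a free storage block in the solid-state drive is obtained, and the data stored in the main data space of the storage block and the ECC data stored in its OOB space are migrated to the obtained free storage block. According to the storage locations in the fixed entry corresponding to the free storage block, the data at those storage locations is obtained from the migrated data and stored in the spare space of the free storage block, and the operation ends.
Further, the fixed entry corresponding to the storage block is replaced with the temporary entry corresponding to the storage block. The next time data is written to the storage block, data is backed up to the spare space of the storage block according to the storage locations stored in the replaced fixed entry.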
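The migration in step 410 and the subsequent entry replacement can be sketched as follows. The dictionary layout of pages and entries and the names `migrate_block` and `promote_temp_entry` are hypothetical; the sketch only mirrors the order of operations described above.

```python
def migrate_block(src_pages, free_fixed_entry):
    """Copy main data and ECC data to a free block, then fill its spare space.

    src_pages: dict page_id -> {"main": list of bits, "ecc": bytes}
    free_fixed_entry: dict page_id -> storage locations to back up
    """
    dst_pages = {}
    for page_id, page in src_pages.items():
        main = list(page["main"])
        positions = free_fixed_entry.get(page_id, [])
        dst_pages[page_id] = {
            "main": main,
            "ecc": page["ecc"],
            # Back up the bits named by the free block's fixed entry.
            "spare": [main[pos] for pos in positions],
        }
    return dst_pages

def promote_temp_entry(block_entries):
    """Replace the block's fixed entry with its temporary entry."""
    block_entries["fixed"] = block_entries["temp"]
    return block_entries

src = {"Name1": {"main": [1, 0, 1, 1, 0, 0, 1, 0], "ecc": b"\x5a"}}
dst = migrate_block(src, {"Name1": [2, 5]})
entries = promote_temp_entry({"fixed": {"Name1": [2, 5]},
                              "temp": {"Name1": [2, 5, 6]}})
```

After the promotion, the next write to the old block would back up the locations learned from the latest error statistics rather than the ones recorded earlier.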
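When the data of a page is ECC-uncorrectable the first time, but becomes ECC-correctable on the second attempt after the erroneous data is replaced with the data stored in the spare space and is then ECC-repaired, the system should be notified to migrate the repaired data of the storage block to another free storage block when idle. This prevents more data in the storage block from going bad and making the second ECC attempt fail as well. In addition, after the data in the storage block is migrated, the probability that the migrated data is ECC-uncorrectable on the first attempt is very small, so reading the migrated data requires only one ECC pass, which greatly improves efficiency.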
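Step 411: If the second count is greater than the preset first threshold, mark the storage block as a bad block, and obtain data from a preset number of storage blocks according to the page identifier of the page.
Specifically, if the second count is greater than the preset first threshold, the storage block is marked as a bad block, the pages whose page identifier is the same as that of the page are obtained from the preset number of storage blocks, and data is read from the obtained pages.
Step 412: According to the obtained data, determine whether to perform RAID repair on the data in the page; if so, perform RAID repair on the data of the page and end the operation.
Specifically, an error check is performed on the obtained data and on the data in the page of the RAID redundancy storage block that has the same page identifier as the page. If the data in the page corresponding to that page identifier in the RAID redundancy storage block is ECC-correctable, the number of pages corresponding to that page identifier in the preset number of storage blocks whose data is ECC-uncorrectable is counted. If that number is less than or equal to a preset second threshold, it is determined that the data in the page can be RAID-repaired, the data of the page in the storage block is RAID-repaired according to the data of the page in the RAID redundancy storage block, and the operation ends.
A RAID check is performed in advance on the data in the pages with the same page identifier in the preset number of storage blocks, and the check result is stored in the page corresponding to that page identifier in the RAID redundancy storage block corresponding to the preset number of storage blocks. Further, if the number of ECC-uncorrectable pages is greater than the preset second threshold, it is determined that the data in the page cannot be RAID-repaired, the data in the page is read incorrectly, and the operation ends.
For example, suppose the second count of erroneous data in the second page of the storage block is 20 and the preset first threshold is 5. Because the second count is greater than the preset first threshold, the storage block is marked as a bad block. Suppose the preset number is 16 and the preset second threshold is 3. The data of the second page in each of the 16 storage blocks is obtained and error-checked, and the data of the second page in the RAID redundancy storage block is error-checked as well. If the data of the second page in the RAID redundancy storage block is ECC-correctable, and the number of second pages among the 16 storage blocks whose data is ECC-uncorrectable is 2, which is less than the preset second threshold 3, it is determined that the data of the page can be RAID-repaired. The data of the second page in the storage block is then RAID-repaired according to the data of the second page in the RAID redundancy storage block, and the correct data is returned.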
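The decision in step 412 reduces to two checks, sketched below with the numbers from the example. The function name `can_raid_repair` and the boolean-list representation of the per-block check results are assumptions; the RAID reconstruction itself is not shown because the publication does not specify the RAID code used.

```python
def can_raid_repair(redundancy_page_ok, peer_pages_ok, second_threshold):
    """Return True if the page may be rebuilt from the RAID group.

    redundancy_page_ok: whether the same-identifier page in the RAID
        redundancy storage block is ECC-correctable.
    peer_pages_ok: one flag per storage block in the group, True when
        that block's same-identifier page is ECC-correctable.
    """
    if not redundancy_page_ok:
        return False
    uncorrectable = sum(1 for ok in peer_pages_ok if not ok)
    return uncorrectable <= second_threshold

# Example from the text: 16 blocks, 2 uncorrectable pages, second threshold 3.
peers = [True] * 14 + [False] * 2
repairable = can_raid_repair(True, peers, second_threshold=3)
```

If the redundancy page is itself uncorrectable, or more than the preset second threshold of peer pages are uncorrectable, the sketch reports that RAID repair is not possible, which corresponds to the read-error outcome described above.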
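In this embodiment of the present invention, when data in the storage block is read, a first error check is performed on the data in a page of the storage block. When the data in the page is ECC-uncorrectable the first time, the erroneous data is replaced with the data stored in the spare space of the storage block, and a second error check is performed on the data in the page. When the data in the page is ECC-uncorrectable again, RAID repair is performed. Storing the data with a high error probability in the spare space within the OOB space of each page makes full use of the OOB space. Moreover, replacing the erroneous data with the data stored in the spare space of each page and performing the second error check greatly reduces the probability of an uncorrectable fault, which avoids marking the storage block as a bad block too readily.
Referring to FIG. 5, an embodiment of the present invention provides an apparatus for repairing erroneous data. The apparatus includes:
a first obtaining module 501, configured to: when data in a storage block included in a solid-state drive is read, perform a first error check on the data in a page of the storage block to obtain the erroneous data in the page;
a first repair module 502, configured to: if a first count of erroneous data in the page is less than or equal to a preset first threshold, perform error checking and correction (ECC) repair on the data in the page; and
a first replacement module 503, configured to: if the first count is greater than the preset first threshold, obtain data from the spare space according to the storage locations of the erroneous data in the page and the fixed entry corresponding to the storage block, and replace the erroneous data in the page with the obtained data, where the fixed entry includes the storage locations of the data stored in the spare space.
The device further includes:
a second obtaining module, configured to perform a second error check on the data in the page to obtain the erroneous data in the page;
a second repair module, configured to: if a second count of erroneous data in the page is less than or equal to the preset first threshold, perform ECC repair on the data in the page;
a marking module, configured to: if the second count is greater than the preset first threshold, mark the storage block as a bad block, and obtain data from a preset number of storage blocks according to the page identifier of the page; and
a third repair module, configured to determine, according to the obtained data, whether to perform redundant array of independent disks (RAID) repair on the data in the page, and if so, perform RAID repair on the data in the page.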
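Further, the device also includes:
a third obtaining module, configured to obtain the storage locations of the erroneous data in each page of the storage block;
a first counting module, configured to obtain the erroneous data with the same storage location, and count the error count of the erroneous data with the same storage location; and
a first storage module, configured to select a first preset number of storage locations with the largest error counts, and store the selected storage locations in the temporary entry corresponding to the storage block.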
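Further, the device also includes:
a fourth obtaining module, configured to obtain the data that has been erroneous at each storage location in the page;
a second counting module, configured to count, according to the data that has been erroneous at each storage location in the page, the error count of the data that has been erroneous at each storage location in the page; and
a second storage module, configured to select a first preset number of storage locations with the largest error counts, and store the selected storage locations and the page identifier of the page in the temporary entry corresponding to the storage block.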
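The per-page statistics kept by the fourth obtaining module, second counting module, and second storage module can be sketched as a small history object. The class name `PageErrorHistory`, its methods, and the tie-breaking rule are hypothetical; the sketch assumes that each read of the page reports the bit locations found erroneous.

```python
from collections import Counter

class PageErrorHistory:
    """Accumulates, per storage location, how often a page's data was erroneous."""

    def __init__(self, page_id):
        self.page_id = page_id
        self.counts = Counter()

    def record(self, error_positions):
        # One read of the page: each erroneous location counts once.
        self.counts.update(set(error_positions))

    def temp_entry(self, n):
        # The n locations with the largest error counts, stored with the
        # page identifier; ties broken by ascending location (assumption).
        ranked = sorted(self.counts, key=lambda pos: (-self.counts[pos], pos))
        return {self.page_id: ranked[:n]}

history = PageErrorHistory("Name1")
history.record([2, 5, 30])
history.record([2, 5])
history.record([2, 45])
entry = history.temp_entry(3)
```

Here location 2 was erroneous in three reads and location 5 in two, so both are kept; location 30 wins the tie with location 45 only because of the assumed tie-breaking rule.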
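Further, the device also includes:
a fifth obtaining module, configured to obtain the storage locations of the erroneous data in each page of the storage block;
a third counting module, configured to obtain the erroneous data with the same storage location, and count the error count of the erroneous data with the same storage location; and
a third storage module, configured to select, according to the counted error counts, a second preset number of storage locations from the storage locations of the erroneous data in the storage block, and store the selected storage locations and their corresponding page identifiers in the temporary entry corresponding to the storage block.
Further, the device also includes:
a migration module, configured to obtain a free storage block in the solid-state drive, and migrate the data in the storage block to the free storage block according to the fixed entry corresponding to the free storage block.
Further, the device also includes:
a second replacement module, configured to replace the fixed entry corresponding to the storage block with the temporary entry corresponding to the storage block.
In this embodiment of the present invention, a first error check is performed on the data of a page in the storage block, and when the data in the page is ECC-uncorrectable the first time, the erroneous data is replaced with the data stored in the spare space of the page. Storing the data with a high error probability in the spare space within the OOB space of each page makes full use of the OOB space, and replacing the erroneous data with the data stored in the spare space of the page greatly reduces the probability of an uncorrectable fault, which avoids marking the storage block as a bad block too readily.
Referring to FIG. 6, an embodiment of the present invention provides a device for repairing erroneous data. The device includes a memory 601 and a processor 602, and is configured to perform the following method for repairing erroneous data: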
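when data in a storage block included in a solid-state drive is read, performing a first error check on the data in a page of the storage block to obtain the erroneous data in the page;
if a first count of erroneous data in the page is less than or equal to a preset first threshold, performing error checking and correction (ECC) repair on the data in the page; and
if the first count is greater than the preset first threshold, obtaining data from the spare space according to the storage locations of the erroneous data in the page and the fixed entry corresponding to the storage block, and replacing the erroneous data in the page with the obtained data, where the fixed entry includes the storage locations of the data stored in the spare space.
Further, after the obtaining, if the first count is greater than the preset first threshold, of data from the spare space according to the storage locations of the erroneous data in the page and the stored fixed entry, and the replacing of the erroneous data in the page with the obtained data, the method also includes:
performing a second error check on the data in the page to obtain the erroneous data in the page;
if a second count of erroneous data in the page is less than or equal to the preset first threshold, performing ECC repair on the data in the page;
if the second count is greater than the preset first threshold, marking the storage block as a bad block, and obtaining data from a preset number of storage blocks according to the page identifier of the page; and
determining, according to the obtained data, whether to perform redundant array of independent disks (RAID) repair on the data in the page, and if so, performing RAID repair on the data in the page.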
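The overall escalation (ECC, then spare-space replacement plus a second check, then RAID) can be summarized in a compact sketch. It is a simplification under stated assumptions: errors are modeled as sets of bit locations, the backup copies in the spare space are assumed to be intact, and the function name `repair_path` and the returned labels are hypothetical.

```python
def repair_path(first_errors, fixed_entry, first_threshold, raid_repairable):
    """Return which repair route a page would take.

    first_errors: locations flagged by the first error check.
    fixed_entry: locations that have a backup copy in the spare space.
    """
    if len(first_errors) <= first_threshold:
        return "ecc"
    # Replace from the spare space, then check again: only locations
    # without a backup remain erroneous (assumes backups are intact).
    second_errors = set(first_errors) - set(fixed_entry)
    if len(second_errors) <= first_threshold:
        return "spare+ecc+migrate"
    if raid_repairable:
        return "bad_block+raid"
    return "bad_block+read_error"

light = repair_path({2, 5}, [2, 5, 30], 5, True)
medium = repair_path({2, 5, 30, 45, 60, 34, 80}, [2, 5, 30, 45, 60], 5, True)
heavy = repair_path(set(range(20)), [2, 5], 5, True)
```

The middle case shows the intended benefit: seven errors exceed the ECC limit of five, but after five of them are restored from the spare space only two remain, so ECC succeeds and the block does not have to be marked bad.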
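Further, after the performing of the first error check on the data in a page of the storage block to obtain the erroneous data in the page, the method also includes:
obtaining the storage locations of the erroneous data in each page of the storage block;
obtaining the erroneous data with the same storage location, and counting the error count of the erroneous data with the same storage location; and
selecting a first preset number of storage locations with the largest error counts, and storing the selected storage locations in the temporary entry corresponding to the storage block.
Further, after the performing of the first error check on the data in a page of the storage block to obtain the erroneous data in the page, the method also includes:
obtaining the data that has been erroneous at each storage location in the page;
counting, according to the data that has been erroneous at each storage location in the page, the error count of the data that has been erroneous at each storage location in the page; and
selecting a first preset number of storage locations with the largest error counts, and storing the selected storage locations and the page identifier of the page in the temporary entry corresponding to the storage block.
Further, after the performing of the first error check on the data in a page of the storage block to obtain the erroneous data in the page, the method also includes:
obtaining the storage locations of the erroneous data in each page of the storage block;
obtaining the erroneous data with the same storage location, and counting the error count of the erroneous data with the same storage location; and
selecting, according to the counted error counts, a second preset number of storage locations from the storage locations of the erroneous data in the storage block, and storing the selected storage locations and their corresponding page identifiers in the temporary entry corresponding to the storage block.
Further, after the performing of ECC repair on the data in the page if the second count of erroneous data in the page is less than or equal to the preset first threshold, the method also includes:
obtaining a free storage block in the solid-state drive, and migrating the data in the storage block to the free storage block according to the fixed entry corresponding to the free storage block.
Further, the method also includes:
replacing the fixed entry corresponding to the storage block with the temporary entry corresponding to the storage block.
In this embodiment of the present invention, a first error check is performed on the data of a page in the storage block, and when the data in the page is ECC-uncorrectable the first time, the erroneous data is replaced with the data stored in the spare space of the page. Storing the data with a high error probability in the spare space within the OOB space of each page makes full use of the OOB space, and replacing the erroneous data with the data stored in the spare space of the page greatly reduces the probability of an uncorrectable fault, which avoids marking the storage block as a bad block too readily.
It should be noted that, when the device for repairing erroneous data provided in the foregoing embodiments repairs erroneous data, the division into the foregoing functional modules is used only as an example. In practice, the foregoing functions may be allocated to different functional modules as required; that is, the internal structure of the device may be divided into different functional modules to complete all or some of the functions described above. In addition, the device for repairing erroneous data provided in the foregoing embodiments and the method embodiments for repairing erroneous data belong to the same concept; for the specific implementation process, see the method embodiments. Details are not repeated here.
The sequence numbers of the foregoing embodiments of the present invention are for description only and do not indicate that one embodiment is better than another.
A person of ordinary skill in the art may understand that all or some of the steps of the foregoing embodiments may be implemented by hardware, or by a program instructing related hardware. The program may be stored in a computer-readable storage medium, which may be a read-only memory, a magnetic disk, an optical disc, or the like.
The foregoing descriptions are merely preferred embodiments of the present invention and are not intended to limit the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.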

Claims

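1. A method for repairing erroneous data, wherein the method comprises:
when data in a storage block included in a solid-state drive is read, performing a first error check on the data in a page of the storage block to obtain the erroneous data in the page;
if a first count of erroneous data in the page is less than or equal to a preset first threshold, performing error checking and correction (ECC) repair on the data in the page; and
if the first count is greater than the preset first threshold, obtaining data from a spare space according to the storage locations of the erroneous data in the page and a fixed entry corresponding to the storage block, and replacing the erroneous data in the page with the obtained data, wherein the fixed entry comprises the storage locations of the data stored in the spare space.
2. The method according to claim 1, wherein after the obtaining, if the first count is greater than the preset first threshold, of data from the spare space according to the storage locations of the erroneous data in the page and the stored fixed entry, and the replacing of the erroneous data in the page with the obtained data, the method further comprises:
performing a second error check on the data in the page to obtain the erroneous data in the page;
if a second count of erroneous data in the page is less than or equal to the preset first threshold, performing ECC repair on the data in the page;
if the second count is greater than the preset first threshold, marking the storage block as a bad block, and obtaining data from a preset number of storage blocks according to the page identifier of the page; and
determining, according to the obtained data, whether to perform redundant array of independent disks (RAID) repair on the data in the page, and if so, performing RAID repair on the data in the page.
3. The method according to claim 1, wherein after the performing of the first error check on the data in a page of the storage block to obtain the erroneous data in the page, the method further comprises:
obtaining the storage locations of the erroneous data in each page of the storage block;
obtaining the erroneous data with the same storage location, and counting the error count of the erroneous data with the same storage location; and
selecting a first preset number of storage locations with the largest error counts, and storing the selected storage locations in a temporary entry corresponding to the storage block.
4. The method according to claim 1, wherein after the performing of the first error check on the data in a page of the storage block to obtain the erroneous data in the page, the method further comprises:
obtaining the data that has been erroneous at each storage location in the page;
counting, according to the data that has been erroneous at each storage location in the page, the error count of the data that has been erroneous at each storage location in the page; and
selecting a first preset number of storage locations with the largest error counts, and storing the selected storage locations and the page identifier of the page in a temporary entry corresponding to the storage block.
5. The method according to claim 1, wherein after the performing of the first error check on the data in a page of the storage block to obtain the erroneous data in the page, the method further comprises:
obtaining the storage locations of the erroneous data in each page of the storage block;
obtaining the erroneous data with the same storage location, and counting the error count of the erroneous data with the same storage location; and
selecting, according to the counted error counts, a second preset number of storage locations from the storage locations of the erroneous data in the storage block, and storing the selected storage locations and their corresponding page identifiers in a temporary entry corresponding to the storage block.
6. The method according to any one of claims 1 to 5, wherein after the performing of ECC repair on the data in the page if the second count of erroneous data in the page is less than or equal to the preset first threshold, the method further comprises:
obtaining a free storage block in the solid-state drive, and migrating the data in the storage block to the free storage block according to the fixed entry corresponding to the free storage block.
7. The method according to claim 6, wherein the method further comprises:
replacing the fixed entry corresponding to the storage block with the temporary entry corresponding to the storage block.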
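8. A device for repairing erroneous data, wherein the device comprises:
a first obtaining module, configured to: when data in a storage block included in a solid-state drive is read, perform a first error check on the data in a page of the storage block to obtain the erroneous data in the page;
a first repair module, configured to: if a first count of erroneous data in the page is less than or equal to a preset first threshold, perform error checking and correction (ECC) repair on the data in the page; and
a first replacement module, configured to: if the first count is greater than the preset first threshold, obtain data from a spare space according to the storage locations of the erroneous data in the page and a fixed entry corresponding to the storage block, and replace the erroneous data in the page with the obtained data, wherein the fixed entry comprises the storage locations of the data stored in the spare space.
9. The device according to claim 8, wherein the device further comprises:
a second obtaining module, configured to perform a second error check on the data in the page to obtain the erroneous data in the page;
a second repair module, configured to: if a second count of erroneous data in the page is less than or equal to the preset first threshold, perform ECC repair on the data in the page;
a marking module, configured to: if the second count is greater than the preset first threshold, mark the storage block as a bad block, and obtain data from a preset number of storage blocks according to the page identifier of the page; and
a third repair module, configured to determine, according to the obtained data, whether to perform redundant array of independent disks (RAID) repair on the data in the page, and if so, perform RAID repair on the data in the page.
10. The device according to claim 8, wherein the device further comprises:
a third obtaining module, configured to obtain the storage locations of the erroneous data in each page of the storage block;
a first counting module, configured to obtain the erroneous data with the same storage location, and count the error count of the erroneous data with the same storage location; and
a first storage module, configured to select a first preset number of storage locations with the largest error counts, and store the selected storage locations in a temporary entry corresponding to the storage block.
11. The device according to claim 8, wherein the device further comprises:
a fourth obtaining module, configured to obtain the data that has been erroneous at each storage location in the page;
a second counting module, configured to count, according to the data that has been erroneous at each storage location in the page, the error count of the data that has been erroneous at each storage location in the page; and
a second storage module, configured to select a first preset number of storage locations with the largest error counts, and store the selected storage locations and the page identifier of the page in a temporary entry corresponding to the storage block.
12. The device according to claim 8, wherein the device further comprises:
a fifth obtaining module, configured to obtain the storage locations of the erroneous data in each page of the storage block;
a third counting module, configured to obtain the erroneous data with the same storage location, and count the error count of the erroneous data with the same storage location; and
a third storage module, configured to select, according to the counted error counts, a second preset number of storage locations from the storage locations of the erroneous data in the storage block, and store the selected storage locations and their corresponding page identifiers in a temporary entry corresponding to the storage block.
13. The device according to any one of claims 8 to 12, wherein the device further comprises:
a migration module, configured to obtain a free storage block in the solid-state drive, and migrate the data in the storage block to the free storage block according to the fixed entry corresponding to the free storage block.
14. The device according to claim 13, wherein the device further comprises:
a second replacement module, configured to replace the fixed entry corresponding to the storage block with the temporary entry corresponding to the storage block.
15. A device for repairing erroneous data, wherein the device comprises a memory and a processor, and is configured to perform the method for repairing erroneous data according to any one of claims 1 to 7.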
PCT/CN2014/073234 2013-08-28 2014-03-11 Method and device for repairing erroneous data WO2015027700A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP14772049.4A EP2857971B1 (en) 2013-08-28 2014-03-11 Method and device for repairing error data
US14/501,368 US9280301B2 (en) 2013-08-28 2014-09-30 Method and device for recovering erroneous data

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201310381426.1A CN103455386B (zh) 2013-08-28 2013-08-28 Method and device for repairing erroneous data
CN201310381426.1 2013-08-28

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US14/501,368 Continuation US9280301B2 (en) 2013-08-28 2014-09-30 Method and device for recovering erroneous data

Publications (1)

Publication Number Publication Date
WO2015027700A1 true WO2015027700A1 (zh) 2015-03-05

Family

ID=49737789

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/073234 WO2015027700A1 (zh) 2013-08-28 2014-03-11 Method and device for repairing erroneous data

Country Status (3)

Country Link
EP (1) EP2857971B1 (zh)
CN (1) CN103455386B (zh)
WO (1) WO2015027700A1 (zh)

Also Published As

Publication number Publication date
EP2857971A1 (en) 2015-04-08
CN103455386B (zh) 2016-11-23
CN103455386A (zh) 2013-12-18
EP2857971A4 (en) 2015-05-13
EP2857971B1 (en) 2016-08-17

Legal Events

Date Code Title Description
REEP Request for entry into the european phase

Ref document number: 2014772049

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2014772049

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE