WO2021196046A1 - Method, device and storage medium for managing a data storage array - Google Patents

Method, device and storage medium for managing a data storage array

Info

Publication number
WO2021196046A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
uncorrectable
data block
erasing
current
Prior art date
Application number
PCT/CN2020/082627
Other languages
English (en)
Chinese (zh)
Inventor
伦志远
褚艳旭
单明星
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to CN202080092988.2A (CN114930299A)
Priority to PCT/CN2020/082627 (WO2021196046A1)
Publication of WO2021196046A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's

Definitions

  • This application relates to the field of data processing technology, and in particular to a method, device and storage medium for managing a data storage array.
  • A NAND Flash chip consists mainly of a data storage array and a peripheral circuit area. As shown in Figure 1, different process steps in NAND Flash manufacturing may introduce defects in the data storage array and the peripheral circuit area; in addition, as the NAND Flash is repeatedly erased and written, further defects develop in the data storage array. Such defects affect the reliability and function of the NAND Flash chip to some extent, for example by causing data loss.
  • The smallest unit of NAND Flash data storage is a page. Physically, a certain number of consecutive pages form a data block (Block), a certain number of blocks form a physical plane (Plane), and the physical planes together form the overall storage array, as shown in Figure 2.
  • The data block is the smallest unit of the NAND Flash erase operation and is generally used as the basic unit of data management.
  • SSD: Solid State Drive
  • The block as a whole is marked as a bad block and eliminated, that is, it is no longer used for subsequent data storage.
  • Because redundant blocks exist, if some data becomes uncorrectable (UNC) data, the other data in the stripe can still be used to recover the data at the UNC location.
  • The number of redundant blocks is limited. Once a certain number of blocks have been eliminated, the SSD device reaches the end of its useful life. Eliminating bad blocks also reduces the internal capacity of the SSD device, which increases cost.
  • The embodiments of the present application provide a method, device, and storage medium for managing a data storage array, which manage the data storage array based on the risk level of each data block and thereby improve the reliability of data storage.
  • A first aspect of the embodiments of the present application provides a method for managing a data storage array, including: reading data stored in a first data block in the data storage array; when the data of a first page of the first data block is uncorrectable, acquiring current uncorrectable information of the first data block, where the first data block includes N pages, the first page is any one of the N pages, and N is an integer greater than 1; determining the risk level of the first data block according to historical uncorrectable information of the first data block and the current uncorrectable information of the first data block; and managing the data storage array according to the risk level of the first data block.
  • The risk level of a data block is thus determined from both its current uncorrectable information and its historical uncorrectable information, and the data storage array is then managed according to that risk level.
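  • The four steps of the first aspect (read, detect an uncorrectable page, assess risk, manage) can be sketched as follows. This is a minimal illustration and not the patented implementation: the function name `process_block`, the dictionary keys, the action strings and the pluggable `is_above_reference` rule are all hypothetical.

```python
from typing import Callable, Dict, Optional, Sequence


def process_block(
    page_correctable: Sequence[bool],
    history: Dict[str, int],
    is_above_reference: Callable[[Dict[str, int], Dict[str, int]], bool],
) -> Optional[str]:
    """Return a management action for one block, or None if every page decoded.

    page_correctable[i] is True when page i was read and corrected successfully
    (a hypothetical stand-in for the result of error-correction decoding).
    """
    # Step 1: read the block and look for pages whose data is uncorrectable.
    bad_pages = [i for i, ok in enumerate(page_correctable) if not ok]
    if not bad_pages:
        return None  # no uncorrectable page, so no risk assessment is triggered

    # Step 2: gather current uncorrectable information (hypothetical content).
    current = {"unc_page_count": len(bad_pages)}

    # Step 3: combine historical and current information into a risk decision.
    # Step 4: manage the array according to that decision.
    if is_above_reference(history, current):
        return "eliminate"
    return "keep"
```

  • Passing the rule in as a callable keeps the flow separate from whichever specific criterion (erase/write counts, operation times, page numbers, retention time) is used to judge the block.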
  • The historical uncorrectable information includes the total number of erase/write counts at which data has historically been uncorrectable, the current uncorrectable information includes the number of erase/write counts at which data is currently uncorrectable, and the risk level of the first data block is determined according to these two quantities.
  • The historical uncorrectable information of the first data block includes the total number of erase/write counts at which data has historically been uncorrectable, and the current uncorrectable information of the first data block includes the number of erase/write counts at which data is currently uncorrectable. Determining the risk level of the first data block according to the historical uncorrectable information and the current uncorrectable information of the first data block includes: obtaining, from the historical total and the current number, the total number of erase/write counts at which data in the first data block has been uncorrectable; and, when that total is higher than a first preset threshold, confirming that the risk level of the first data block is higher than a reference level, where the reference level is set according to the risk-level threshold at which a data block is eliminated.
  • The number of erase/write counts at which data in the first data block is currently uncorrectable is acquired, and the risk level of the first data block is determined according to that number together with the historical total number of erase/write counts at which data in the first data block was uncorrectable.
  • The risk level of the data block is determined from its current and historical uncorrectable information, and the data storage array is then managed accordingly. Data can therefore be stored selectively according to the risk level of each data block, which reduces the chance of data loss and improves the reliability of data storage.
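  • The count-based rule above can be sketched as follows. The function and parameter names are hypothetical, and combining the historical total and the current number by simple addition is an assumption; the text only says the total is obtained "according to" both.

```python
def unc_count_above_reference(
    historical_total: int,
    current_number: int,
    first_threshold: int,
) -> bool:
    """Return True when the block's risk level is judged higher than the reference.

    historical_total: erase/write counts at which data was uncorrectable before.
    current_number:   erase/write counts at which data is uncorrectable now.
    """
    # Assumption: the block-wide total is the sum of the two quantities.
    total = historical_total + current_number
    return total > first_threshold
```

  • With a threshold of 3, a block with three earlier occurrences and one new occurrence is flagged, while a block with two earlier occurrences and one new occurrence is not.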
  • When the total number of erase/write counts at which data in the first data block has been uncorrectable is higher than a second preset threshold and not higher than the first preset threshold, the historical uncorrectable information of the first data block also includes the erase/write count at which data was first uncorrectable, and the current uncorrectable information also includes the erase/write count at which data is currently uncorrectable. Determining the risk level of the first data block according to the historical uncorrectable information and the current uncorrectable information of the first data block then includes: confirming whether the difference between the erase/write count at which data in the first data block is currently uncorrectable and the erase/write count at which data in the first data block was first uncorrectable is less than a third preset threshold; and, if it is less than the third preset threshold, confirming that the risk level of the first data block is higher than the reference level.
  • The number of erase/write counts at which data in the first data block is currently uncorrectable is acquired, and the risk level of the first data block is determined according to the historical total number of erase/write counts at which data was uncorrectable, the erase/write count at which data was first uncorrectable, and the erase/write count at which data in the first data block is currently uncorrectable.
  • Determining the risk level of the first data block according to the historical uncorrectable information and the current uncorrectable information of the first data block includes: confirming whether the difference between the erase/write count at which data in the first data block is currently uncorrectable and the erase/write count at which data in the first data block was first uncorrectable exceeds a fifth preset threshold, where the fifth preset threshold is greater than the third preset threshold; and, if the fifth preset threshold is exceeded, confirming that the risk level of the first data block is lower than the reference level.
  • The number of erase/write counts at which data in the first data block is currently uncorrectable is acquired, and the risk level of the first data block is determined according to the historical total number of erase/write counts at which data was uncorrectable, the erase/write count at which data was first uncorrectable, the erase/write count at which data in the first data block is currently uncorrectable, and the number of erase/write counts at which data is currently uncorrectable.
  • If the difference between the erase/write count at which data in the first data block is currently uncorrectable and the erase/write count at which data in the first data block was first uncorrectable exceeds the fifth preset threshold, it is confirmed that the risk level of the first data block is lower than the reference level.
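  • The two gap tests described above can be sketched together as follows. This is a simplified illustration under stated assumptions: the names are hypothetical, and the sketch applies both tests directly, ignoring the preconditions on the total number of uncorrectable erase/write counts (the text ties the third-threshold test to a total between the second and first thresholds, and does not fully state the precondition for the fifth-threshold test).

```python
from typing import Optional


def gap_verdict(
    first_unc_count: int,
    current_unc_count: int,
    third_threshold: int,
    fifth_threshold: int,
) -> Optional[str]:
    """Classify a block by how soon uncorrectable data recurred.

    Returns "higher" or "lower" relative to the reference level, or None when
    the gap falls between the two thresholds (a case this sketch leaves open).
    """
    if fifth_threshold <= third_threshold:
        raise ValueError("the fifth threshold must be greater than the third")

    # Erase/write cycles elapsed between the first and the current occurrence.
    gap = current_unc_count - first_unc_count

    if gap < third_threshold:
        return "higher"  # errors recur within a short span of erase/write cycles
    if gap > fifth_threshold:
        return "lower"   # errors are far apart in erase/write cycles
    return None
```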
  • The historical uncorrectable information of the first data block includes a historical maximum erase operation time, and the current uncorrectable information of the first data block includes a current erase operation time. Determining the risk level of the first data block according to the historical uncorrectable information and the current uncorrectable information of the first data block includes: determining the larger of the historical maximum erase operation time and the current erase operation time as the maximum erase operation time of the first data block; and, when the maximum erase operation time of the first data block is higher than a first preset threshold, confirming that the risk level of the first data block is higher than the reference level.
  • The first preset threshold here may differ from the first preset threshold mentioned above.
  • The current erase operation time of the first data block is obtained, and the risk level of the first data block is determined according to the historical maximum erase operation time and the current erase operation time of the first data block.
  • The risk level of the data block is determined from its current and historical uncorrectable information, so that data can be stored selectively according to the risk level of each data block, which reduces the probability of data loss and improves the reliability of data storage.
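  • The erase-time rule can be sketched as follows. The function name and the use of microseconds are assumptions made for illustration; the text does not specify units or threshold values.

```python
from typing import Tuple


def erase_time_check(
    historical_max_us: float,
    current_us: float,
    threshold_us: float,
) -> Tuple[float, bool]:
    """Return (block maximum erase time, whether risk exceeds the reference)."""
    # The block-wide maximum is the larger of the stored maximum and the new value.
    block_max = max(historical_max_us, current_us)
    return block_max, block_max > threshold_us
```

  • Returning the updated maximum alongside the verdict lets the caller store it as the new historical value for the next assessment.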
  • The historical uncorrectable information of the first data block includes a historical minimum programming operation time, and the current uncorrectable information of the first data block includes a current minimum programming operation time. Determining the risk level of the first data block according to the historical uncorrectable information and the current uncorrectable information of the first data block includes: determining the smaller of the historical minimum programming operation time and the current minimum programming operation time as the minimum programming operation time of the first data block; and, when the minimum programming operation time of the first data block is less than the first preset threshold, confirming that the risk level of the first data block is higher than the reference level.
  • The current minimum programming operation time of the first data block is obtained, and the risk level of the first data block is determined according to the historical minimum programming operation time and the current minimum programming operation time of the first data block.
  • The risk level of the data block is determined from its current and historical uncorrectable information, so that data can be stored selectively according to the risk level of each data block, which reduces the probability of data loss and improves the reliability of data storage.
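  • The programming-time rule mirrors the erase-time rule with the comparison reversed: a block is flagged when its shortest programming time falls below the threshold. In the sketch below the function name and the microsecond unit are assumptions.

```python
from typing import Tuple


def program_time_check(
    historical_min_us: float,
    current_min_us: float,
    threshold_us: float,
) -> Tuple[float, bool]:
    """Return (block minimum programming time, whether risk exceeds the reference)."""
    # The block-wide minimum is the smaller of the stored minimum and the new value.
    block_min = min(historical_min_us, current_min_us)
    return block_min, block_min < threshold_us
```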
  • Obtaining the current uncorrectable information of the first data block includes: separately detecting the other N-1 pages in the first data block, other than the first page, to obtain the current uncorrectable information of the first data block.
  • That is, the current uncorrectable information of the first data block also includes the information obtained by separately detecting the N-1 pages other than the first page. This makes the acquired information more comprehensive, so that the risk level of the data block is determined from fuller information and therefore more reliably.
  • The historical uncorrectable information of the first data block includes the maximum and minimum page numbers at which data has historically been uncorrectable, and the current uncorrectable information of the first data block includes the maximum and minimum page numbers at which data is currently uncorrectable. Determining the risk level of the first data block according to the historical uncorrectable information and the current uncorrectable information of the first data block includes: determining the larger of the current maximum page number and the historical maximum page number as the maximum page number at which data in the first data block has been uncorrectable; determining the smaller of the current minimum page number and the historical minimum page number as the minimum page number at which data in the first data block has been uncorrectable; obtaining the difference between the maximum and minimum page numbers at which data in the first data block has been uncorrectable; and, when the difference exceeds a first preset threshold, confirming that the risk level of the first data block is higher than the reference level.
  • Each page in the first data block other than the first page is detected separately to obtain the maximum and minimum page numbers at which data in the first data block is currently uncorrectable; the risk level of the first data block is then determined according to the historical maximum and minimum uncorrectable page numbers of the first data block and the current maximum and minimum uncorrectable page numbers.
  • The risk level of the data block is determined from its current and historical uncorrectable information, so that data can be stored selectively according to the risk level of each data block, which reduces the probability of data loss and improves the reliability of data storage.
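  • The page-number rule measures how widely uncorrectable pages are spread across the block. A minimal sketch follows; the function name and the representation of the current scan result as a list of page numbers are assumptions.

```python
from typing import Sequence, Tuple


def page_span_check(
    historical_min_page: int,
    historical_max_page: int,
    current_unc_pages: Sequence[int],
    threshold: int,
) -> Tuple[int, bool]:
    """Return (page-number span, whether risk exceeds the reference).

    current_unc_pages must be non-empty: the check runs only after at least
    one page in the block was found uncorrectable.
    """
    # Merge the historical extremes with the extremes seen in the current scan.
    block_min = min(historical_min_page, min(current_unc_pages))
    block_max = max(historical_max_page, max(current_unc_pages))
    span = block_max - block_min
    return span, span > threshold
```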
  • The historical uncorrectable information of the first data block includes the maximum data retention time at which data has historically been uncorrectable and the total number of pages in which data has historically been uncorrectable. The current uncorrectable information of the first data block includes the data retention time at which data is currently uncorrectable, the erase/write count at which data is currently uncorrectable, and the total number of pages in which data is currently uncorrectable. Determining the risk level of the first data block according to the historical uncorrectable information and the current uncorrectable information of the first data block includes: determining the larger of the historical maximum data retention time and the current data retention time as the maximum data retention time at which data in the first data block has been uncorrectable; obtaining the total number of pages in which data in the first data block has been uncorrectable from the historical total number of pages and the current total number of pages; and, when the maximum data retention time is lower than the first preset threshold, the erase/write count at which data in the first data block is currently uncorrectable is lower than the second preset threshold, and the total number of pages in which data in the first data block has been uncorrectable is greater than the third preset threshold, confirming that the risk level of the first data block is higher than the reference level.
  • Each page in the first data block other than the first page is detected separately to obtain the current uncorrectable information of the first data block.
  • The risk level of the data block is determined from its current and historical uncorrectable information, so that data can be stored selectively according to the risk level of each data block, which reduces the probability of data loss and improves the reliability of data storage.
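  • The three-condition rule above flags a block that fails on many pages even though the data is young and the block is lightly worn. A minimal sketch follows; the names and the use of hours are assumptions, as is obtaining the page total by addition.

```python
def retention_rule_above_reference(
    historical_max_retention_h: float,
    current_retention_h: float,
    current_erase_count: int,
    historical_unc_pages: int,
    current_unc_pages: int,
    retention_threshold_h: float,
    erase_count_threshold: int,
    page_total_threshold: int,
) -> bool:
    """Return True when all three conditions for elevated risk hold."""
    # Longest retention time at which the block has produced uncorrectable data.
    max_retention = max(historical_max_retention_h, current_retention_h)
    # Assumption: the block-wide page total is the sum of both page counts.
    total_unc_pages = historical_unc_pages + current_unc_pages
    return (
        max_retention < retention_threshold_h
        and current_erase_count < erase_count_threshold
        and total_unc_pages > page_total_threshold
    )
```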
  • Managing the first data block according to the risk level of the first data block includes: if the risk level of the first data block is higher than a reference level, deleting the first data block from the data storage array.
  • If the risk level of the first data block is not higher than the reference level, the first data block in the data storage array is used for data storage.
  • Using the first data block in the data storage array for data storage may include: confirming whether the risk level of the first data block is lower than a first preset level, where the first preset level is set according to the threshold of the data importance level; and, if the risk level of the first data block is lower than the first preset level, storing first data in the first data block, where the first data is important data. That is, when the risk level of the first data block is not higher than the reference level, the data to be stored can be chosen according to the risk level of the data block. For example, a data block whose risk level is relatively high but still below the reference level is used to store general data, while a data block whose risk level is lower than the first preset level is used to store important data.
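  • The management policy above can be sketched as a three-way decision. Representing risk levels as integers on a common scale is an assumption made for illustration, as are the function name and the action strings.

```python
def manage_block(risk_level: int, reference_level: int, first_preset_level: int) -> str:
    """Choose what to do with a block given its risk level.

    Assumes first_preset_level <= reference_level on a shared numeric scale
    where a larger number means a riskier block.
    """
    if risk_level > reference_level:
        return "eliminate"        # remove the block from the storage array
    if risk_level < first_preset_level:
        return "store_important"  # lowest-risk blocks hold important data
    return "store_general"        # remaining usable blocks hold general data
```

  • Tiering the usable blocks this way keeps capacity that outright elimination would discard, while steering important data away from blocks that have shown signs of weakness.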
  • A second aspect of the embodiments of the present application provides an apparatus for managing a data storage array, which includes: a reading module, configured to read data stored in a first data block in the data storage array; an obtaining module, configured to acquire current uncorrectable information of the first data block when it is detected that the data of a first page of the first data block is uncorrectable, where the first data block includes N pages, the first page is any one of the N pages, and N is an integer greater than 1; a determining module, configured to determine the risk level of the first data block according to historical uncorrectable information of the first data block and the current uncorrectable information of the first data block; and a management module, configured to manage the data storage array according to the risk level of the first data block.
  • The historical uncorrectable information of the first data block includes the total number of erase/write counts at which data has historically been uncorrectable, and the current uncorrectable information of the first data block includes the number of erase/write counts at which data is currently uncorrectable. The determining module is specifically configured to: obtain, from the historical total and the current number, the total number of erase/write counts at which data in the first data block has been uncorrectable; and, when that total is higher than the first preset threshold, confirm that the risk level of the first data block is higher than the reference level.
  • The historical uncorrectable information of the first data block includes the total number of erase/write counts at which data has historically been uncorrectable and the erase/write count at which data was first uncorrectable, and the current uncorrectable information of the first data block includes the number of erase/write counts at which data is currently uncorrectable and the erase/write count at which data is currently uncorrectable. The determining module is specifically configured to: obtain, from the current number and the historical total, the total number of erase/write counts at which data in the first data block has been uncorrectable; and, when that total is higher than the second preset threshold and not higher than the first preset threshold, and the difference between the erase/write count at which data in the first data block is currently uncorrectable and the erase/write count at which data in the first data block was first uncorrectable is less than the third preset threshold, confirm that the risk level of the first data block is higher than the reference level.
  • The historical uncorrectable information of the first data block includes the total number of erase/write counts at which data has historically been uncorrectable and the erase/write count at which data was first uncorrectable, and the current uncorrectable information of the first data block includes the erase/write count at which data is currently uncorrectable and the number of erase/write counts at which data is currently uncorrectable. The determining module is specifically configured to: obtain, from the current number and the historical total, the total number of erase/write counts at which data in the first data block has been uncorrectable; and, when the difference between the erase/write count at which data in the first data block is currently uncorrectable and the erase/write count at which data in the first data block was first uncorrectable exceeds the fifth preset threshold, confirm that the risk level of the first data block is lower than the reference level.
  • The historical uncorrectable information of the first data block includes the historical maximum erase operation time, and the current uncorrectable information of the first data block includes the current erase operation time. The determining module is specifically configured to: determine the larger of the historical maximum erase operation time and the current erase operation time as the maximum erase operation time of the first data block; and, when the maximum erase operation time of the first data block is higher than the first preset threshold, confirm that the risk level of the first data block is higher than the reference level.
  • The historical uncorrectable information of the first data block includes a historical minimum programming operation time, and the current uncorrectable information of the first data block includes a current minimum programming operation time. The determining module is specifically configured to: determine the smaller of the historical minimum programming operation time and the current minimum programming operation time as the minimum programming operation time of the first data block; and, when the minimum programming operation time of the first data block is less than the first preset threshold, confirm that the risk level of the first data block is higher than the reference level.
  • The obtaining module is specifically configured to: separately detect the other N-1 pages in the first data block, other than the first page, to acquire the current uncorrectable information of the first data block.
  • The historical uncorrectable information of the first data block includes the maximum and minimum page numbers at which data has historically been uncorrectable, and the current uncorrectable information of the first data block includes the maximum and minimum page numbers at which data is currently uncorrectable. The determining module is specifically configured to: determine the larger of the current maximum page number and the historical maximum page number as the maximum page number at which data in the first data block has been uncorrectable; determine the smaller of the current minimum page number and the historical minimum page number as the minimum page number at which data in the first data block has been uncorrectable; obtain the difference between the maximum and minimum page numbers at which data in the first data block has been uncorrectable; and, when the difference exceeds the first preset threshold, confirm that the risk level of the first data block is higher than the reference level.
  • The historical uncorrectable information of the first data block includes the maximum data retention time at which data has historically been uncorrectable and the total number of pages in which data has historically been uncorrectable. The current uncorrectable information of the first data block includes the data retention time at which data is currently uncorrectable, the erase/write count at which data is currently uncorrectable, and the total number of pages in which data is currently uncorrectable. The determining module is specifically configured to: determine the larger of the historical maximum data retention time and the current data retention time as the maximum data retention time at which data in the first data block has been uncorrectable; obtain the total number of pages in which data in the first data block has been uncorrectable from the historical total number of pages and the current total number of pages; and, when the maximum data retention time is lower than the first preset threshold, the erase/write count at which data in the first data block is currently uncorrectable is lower than the second preset threshold, and the total number of pages in which data in the first data block has been uncorrectable is greater than the third preset threshold, confirm that the risk level of the first data block is higher than the reference level.
  • The management module is specifically configured to: if the risk level of the first data block is higher than the reference level, delete the first data block from the data storage array.
  • The management module is specifically configured to: if the risk level of the first data block is not higher than the reference level, use the first data block in the data storage array for data storage.
  • A third aspect of the embodiments of the present application provides a device for managing data blocks, including a processor and a NAND Flash management module. The processor is configured to read data stored in a first data block in a data storage array. The NAND Flash management module is configured to obtain current uncorrectable information of the first data block when the data of a first page of the first data block is uncorrectable, where the first data block includes N pages, the first page is any one of the N pages, and N is an integer greater than 1. The NAND Flash management module is further configured to determine the risk level of the first data block according to historical uncorrectable information of the first data block and the current uncorrectable information of the first data block. The processor is further configured to manage the first data block according to the risk level of the first data block.
  • The historical uncorrectable information includes the total number of erase/write counts at which data has historically been uncorrectable, the current uncorrectable information includes the number of erase/write counts at which data is currently uncorrectable, and the risk level of the first data block is determined according to these two quantities.
  • The NAND Flash management module is specifically configured to: obtain, from the historical total number of erase/write counts at which data was uncorrectable and the number of erase/write counts at which data is currently uncorrectable, the total number of erase/write counts at which data in the first data block has been uncorrectable; and, when that total is higher than the first preset threshold, confirm that the risk level of the first data block is higher than the reference level, where the reference level is set according to the risk-level threshold at which a data block is eliminated.
  • When the total number of erase/write counts at which data in the first data block has been uncorrectable is higher than a second preset threshold and not higher than the first preset threshold, the historical uncorrectable information of the first data block also includes the erase/write count at which data was first uncorrectable, and the current uncorrectable information also includes the erase/write count at which data is currently uncorrectable. The NAND Flash management module is specifically configured to: confirm whether the difference between the erase/write count at which data in the first data block is currently uncorrectable and the erase/write count at which data in the first data block was first uncorrectable is less than the third preset threshold; and, if it is less than the third preset threshold, confirm that the risk level of the first data block is higher than the reference level.
  • the NAND Flash management module is specifically configured to: confirm whether the difference between the erase/write count at which uncorrectable data currently occurs in the first data block and the erase/write count at which uncorrectable data first occurred in the first data block exceeds a fifth preset threshold, wherein the fifth preset threshold is greater than the third preset threshold; if the fifth preset threshold is exceeded, confirm that the risk level of the first data block is lower than the reference level.
  • the historical uncorrectable information includes the historical maximum erase operation time
  • the current uncorrectable information includes the current erase operation time
  • the NAND Flash management module is specifically configured to: determine the larger of the historical maximum erase operation time and the current erase operation time as the maximum erase operation time of the first data block; when the maximum erase operation time of the first data block is higher than the first preset threshold, confirm that the risk level of the first data block is higher than the reference level.
  • the historical uncorrectable information includes the historical minimum programming operation time
  • the current uncorrectable information includes the current minimum programming operation time
  • the NAND Flash management module is specifically configured to: determine the smaller of the historical minimum programming operation time and the current minimum programming operation time as the minimum programming operation time of the first data block; when the minimum programming operation time of the first data block is less than a first preset threshold, confirm that the risk level of the first data block is higher than the reference level.
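  • By way of illustration only, the programming-time rule above can be sketched as follows. The function name, the microsecond unit, and the threshold value are assumptions made for this example and do not come from the application.

```python
def min_program_time_risk(hist_min_us, cur_min_us, threshold_us):
    """Return (block_min_us, higher_than_reference).

    hist_min_us  -- historical minimum programming operation time
    cur_min_us   -- current minimum programming operation time
    threshold_us -- the "first preset threshold" (value is hypothetical)
    """
    # The block's minimum is the smaller of the historical and current minima.
    block_min_us = min(hist_min_us, cur_min_us)
    # A minimum below the threshold is treated as "higher than the reference level".
    return block_min_us, block_min_us < threshold_us


if __name__ == "__main__":
    print(min_program_time_risk(900, 700, 800))
    print(min_program_time_risk(900, 850, 800))
```

The sketch returns the updated minimum together with the decision so that the caller can store the minimum as the new historical value.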
  • the processor is further configured to: separately detect the other N-1 pages in the first data block, other than the first page, so as to obtain the current uncorrectable information of the first data block.
  • the historical uncorrectable information includes the largest and the smallest page numbers at which uncorrectable data has historically occurred
  • the current uncorrectable information includes the largest and the smallest page numbers at which uncorrectable data currently occurs.
  • the NAND Flash management module is specifically configured to: determine the larger of the largest page number at which uncorrectable data currently occurs and the largest page number at which uncorrectable data has historically occurred as the largest page number with uncorrectable data in the first data block; determine the smaller of the smallest page number at which uncorrectable data currently occurs and the smallest page number at which uncorrectable data has historically occurred as the smallest page number with uncorrectable data in the first data block; obtain the difference between the largest and the smallest page numbers with uncorrectable data in the first data block; when the difference exceeds the first preset threshold, confirm that the risk level of the first data block is higher than the reference level, where the reference level is set according to the risk-level threshold at which a data block is eliminated.
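  • By way of illustration only, the page-number rule above can be sketched as follows. The function name and the example values are assumptions made for this example; the application does not give concrete page numbers or a concrete threshold.

```python
def page_span_risk(hist_max_page, hist_min_page,
                   cur_max_page, cur_min_page, span_threshold):
    """Return (span, higher_than_reference) for one data block."""
    # Largest page number with uncorrectable data, historical or current.
    max_page = max(cur_max_page, hist_max_page)
    # Smallest page number with uncorrectable data, historical or current.
    min_page = min(cur_min_page, hist_min_page)
    # A wide span means the failing pages are spread across the block.
    span = max_page - min_page
    return span, span > span_threshold


if __name__ == "__main__":
    # Historical failures on pages 12..40, current failures on pages 180..200.
    print(page_span_risk(40, 12, 200, 180, span_threshold=100))
```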
  • the historical uncorrectable information includes the maximum data retention time of historically occurring data that cannot be corrected, and the total number of pages of historically uncorrectable data
  • the current uncorrectable information includes the current uncorrectable data.
  • the NAND Flash management module is specifically configured to: determine the larger of the historical maximum data retention time and the data retention time of the currently occurring uncorrectable data as the maximum data retention time at which uncorrectable data occurs in the first data block; obtain the total number of pages with uncorrectable data in the first data block from the historical total number of pages with uncorrectable data and the total number of pages with currently occurring uncorrectable data; when the maximum data retention time at which uncorrectable data occurs in the first data block is lower than a first preset threshold, the erase/write count at which uncorrectable data currently occurs in the first data block is lower than a second preset threshold, and the total number of pages with uncorrectable data in the first data block is greater than a third preset threshold, confirm that the risk level of the first data block is higher than the reference level, where the reference level is set according to the risk-level threshold at which a data block is eliminated.
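  • By way of illustration only, the three-condition rule above can be sketched as follows. The function and parameter names, the units, and all numeric values are assumptions made for this example, and the total page count is assumed here to be the sum of the historical and current page counts.

```python
def retention_rule(hist_max_retention_h, cur_retention_h,
                   hist_unc_pages, cur_unc_pages, cur_cycle, *,
                   retention_thr_h, cycle_thr, pages_thr):
    """Return True when the block is judged higher than the reference level."""
    # Longest data retention time at which uncorrectable data was seen.
    max_retention_h = max(hist_max_retention_h, cur_retention_h)
    # Assumption: historical and current page counts are simply added.
    total_unc_pages = hist_unc_pages + cur_unc_pages
    # Short retention, few erase/write cycles, yet many failing pages:
    # the errors are unlikely to be explained by retention or wear alone.
    return (max_retention_h < retention_thr_h
            and cur_cycle < cycle_thr
            and total_unc_pages > pages_thr)


if __name__ == "__main__":
    print(retention_rule(48, 12, 5, 4, 50,
                         retention_thr_h=72, cycle_thr=1000, pages_thr=8))
```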
  • the processor is specifically configured to: if the risk level of the first data block is higher than the reference level, delete the first data block in the data storage array.
  • the processor is further configured to: if the risk level of the first data block is not higher than the reference level, use the first data block in the data storage array for data storage.
  • the fourth aspect of the embodiments of the present application provides a computer-readable storage medium, the computer-readable storage medium stores a computer program, and the computer program is executed by a processor to implement the method.
  • Figure 1 is a schematic diagram of the structure of a NAND Flash chip in the prior art
  • FIG. 2 is a schematic diagram of the structure of a data storage array of a NAND Flash chip in the prior art
  • FIG. 3 is a schematic flowchart of a method for managing a data storage array provided by an embodiment of the application
  • FIG. 4 is a schematic flowchart of a method for managing a data storage array provided by an embodiment of the application
  • FIG. 5 is a schematic flowchart of a method for managing a data storage array provided by an embodiment of the application
  • FIG. 6 is a schematic flowchart of a method for managing a data storage array provided by an embodiment of the application.
  • FIG. 7 is a schematic structural diagram of a system for managing a data storage array provided by an embodiment of the application.
  • FIG. 8 is a schematic structural diagram of a block classification module provided by an embodiment of the application.
  • a single failure to decode read data may have multiple causes; for example, a page is susceptible to bit errors caused by the data retention time (DR) and by the number of read disturbs (Read Disturb, RD).
  • in such cases the block does not need to be eliminated, because there are no internal defects in the block.
  • UNC may occur probabilistically. Because RAID protection exists, the block does not need to be marked as a bad block immediately; doing so may cause premature loss of device capacity.
  • embodiments of the present application provide a method, device, and storage medium for managing a data storage array, including: reading data stored in a first data block in the data storage array; when it is detected that the data of the first page of the first data block is uncorrectable, obtaining the current uncorrectable information of the first data block, where the first data block includes N pages, the first page is any one of the N pages, and N is an integer greater than 1.
  • the risk level of the first data block is determined according to the historical uncorrectable information of the first data block and the current uncorrectable information of the first data block; the data storage array is managed according to the risk level of the first data block.
  • the risk level of the data block is determined based on the current uncorrectable information and the historical uncorrectable information of the data block, and then the data storage array is managed according to the risk level of the data block.
  • the confirmation of the risk level is obtained based on the current uncorrectable information and historical uncorrectable information of the data block.
  • when the solution manages the data storage array, quantifying the risk of each data block makes it possible to store data selectively according to the risk level of the data block, which reduces the probability of data loss and in turn improves the reliability of data storage.
  • a method for managing a data storage array is provided in an embodiment of this application.
  • the method includes steps 301-304, which are specifically as follows:
  • the SSD device when receiving the read data request sent by the host, the SSD device can read the data stored on the NAND Flash chip.
  • the first data block may be any data block in the aforementioned data storage array.
  • the NAND Flash chip contains multiple data blocks, and each data block contains multiple pages.
  • when the SSD device performs a read operation on the data of the first page of the first data block and detects that the data of the first page is uncorrectable, the SSD device obtains the current uncorrectable information of the first data block.
  • the data of the first page being uncorrectable may mean that, when the data stored on the first page is read, the total number of flipped bits on the page exceeds a preset number. For example, if the data stored on the page is 010101... and the data read out is 101010..., and the total number of erroneous bits exceeds the preset number, the data on the page cannot be corrected.
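  • By way of illustration only, the bit-flip criterion above can be sketched as follows. The function names and the preset number are assumptions made for this example. The sketch compares the read-out data against a known reference copy purely to make the counting visible; an actual controller normally has no such copy and learns that a page is uncorrectable when its ECC decoder fails.

```python
def count_bit_flips(written: bytes, read_back: bytes) -> int:
    """Count the bit positions that differ between two equal-length buffers."""
    if len(written) != len(read_back):
        raise ValueError("buffers must have the same length")
    # XOR leaves a 1 in every flipped bit position; count those 1 bits.
    return sum(bin(w ^ r).count("1") for w, r in zip(written, read_back))


def is_uncorrectable(written: bytes, read_back: bytes,
                     max_correctable_bits: int) -> bool:
    """True when the number of flipped bits exceeds the preset number."""
    return count_bit_flips(written, read_back) > max_correctable_bits


if __name__ == "__main__":
    stored = bytes([0b01010101])
    read_out = bytes([0b10101010])
    print(count_bit_flips(stored, read_out))
    print(is_uncorrectable(stored, read_out, max_correctable_bits=4))
```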
  • the current uncorrectable information of the first data block may include the number of the page where the current uncorrectable data appears.
  • the N pages of the data block correspond to N different page numbers in a one-to-one correspondence, for example, the page numbers are from 1 to N.
  • the page number at which uncorrectable data currently occurs is the page number of the page on which uncorrectable data is currently detected.
  • the current uncorrectable information of the first data block may also include the erase/write count at which uncorrectable data currently occurs. When every page in the data block is full of data, the data block must be erased before data can be written again.
  • the erase/write count at which uncorrectable data currently occurs is the number of erase/write operations the data block has undergone when the uncorrectable data occurs. For example, if uncorrectable data currently occurs after the data block has undergone 50 erase/write operations, the erase/write count at which uncorrectable data currently occurs is 50.
  • the current uncorrectable information of the first data block may also include the count corresponding to the erase/write cycle in which uncorrectable data currently occurs. If the data block is currently being read after 50 erase/write operations and uncorrectable data currently occurs, this count is 1; if no uncorrectable data currently occurs, this count is 0. Because the embodiments of the present application address the case in which uncorrectable data currently occurs, this count is 1.
  • the current uncorrectable information of the first data block may also include the current erasing operation time.
  • the current erase operation time is the time taken by the erase operation of the current erase/write cycle of the data block, that is, the time taken to erase all the data in the data block in that erase operation. If the data block has undergone 50 erase/write operations and uncorrectable data currently occurs, the current erase operation time is the time taken to erase all the data in the data block for the 50th time.
  • the current uncorrectable information of the first data block may also include the data retention time of the currently occurring uncorrectable data. If the data block is currently being read after 50 erase/write operations and uncorrectable data currently occurs, the interval between the 50th erase/write operation and the 51st erase/write operation is the data retention time of the currently occurring uncorrectable data.
  • the current uncorrectable information of the data block may also include any other information, which is not specifically limited in this solution.
  • the SSD device may pre-store the historical uncorrectable information of the first data block.
  • the historical uncorrectable information of the first data block may include the total number of erase/write cycles in which uncorrectable data has historically occurred. If the data block is currently being read after 50 erase/write operations and uncorrectable data currently occurs, this historical total is the number of erase/write operations, among the 49 operations before the 50th, in which uncorrectable data occurred.
  • for example, if uncorrectable data occurred in three of those 49 erase/write operations, the historical total is 3.
  • the historical uncorrectable information of the first data block may also include the number of erasing and writing times when the data is uncorrectable for the first time.
  • the erase/write count at which uncorrectable data first occurred is the number of erase/write operations the data block had undergone when uncorrectable data occurred for the first time. For example, if uncorrectable data first occurred after the 6th erase/write operation of the data block, the erase/write count at which uncorrectable data first occurred is 6.
  • the historical uncorrectable information of the first data block may also include the historical maximum erase operation time. If the data block is currently being read after 50 erase/write operations and uncorrectable data currently occurs, the historical maximum erase operation time is the longest erase operation time among those of the 49 erase/write operations before the 50th in which uncorrectable data occurred. For example, if uncorrectable data occurred after the 6th erase/write operation, whose erase operation time was 5 ms, after the 36th, whose erase operation time was 3 ms, and after the 49th, whose erase operation time was 6 ms, the historical maximum erase operation time is 6 ms.
  • the historical uncorrectable information of the first data block may also include the maximum data retention time at which uncorrectable data has historically occurred. If the data block is currently being read after 50 erase/write operations and uncorrectable data currently occurs, this is the longest data retention time among those of the 49 erase/write operations before the 50th in which uncorrectable data occurred. For example, if uncorrectable data occurred after the 6th erase/write operation, with an interval of 48 hours between the 6th and 7th operations, after the 36th, with an interval of 5 hours between the 36th and 37th operations, and after the 49th, with an interval of 12 hours between the 49th and 50th operations, the maximum data retention time at which uncorrectable data has historically occurred is 48 hours.
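  • By way of illustration only, the worked examples above (uncorrectable data after the 6th, 36th, and 49th erase/write operations) can be reproduced with the following sketch. The `UncEvent` record and the `summarize_history` function are hypothetical names introduced for this example; the application does not prescribe how the history is stored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UncEvent:
    """One past erase/write cycle in which uncorrectable data occurred."""
    cycle: int              # erase/write count at which it occurred
    erase_time_ms: float    # erase operation time of that cycle
    retention_hours: float  # interval until the next erase/write operation


def summarize_history(events):
    """Derive the historical uncorrectable information from past events."""
    if not events:
        return {"total_unc_cycles": 0, "first_unc_cycle": None,
                "max_erase_time_ms": None, "max_retention_hours": None}
    return {
        "total_unc_cycles": len({e.cycle for e in events}),
        "first_unc_cycle": min(e.cycle for e in events),
        "max_erase_time_ms": max(e.erase_time_ms for e in events),
        "max_retention_hours": max(e.retention_hours for e in events),
    }


if __name__ == "__main__":
    history = [UncEvent(6, 5, 48), UncEvent(36, 3, 5), UncEvent(49, 6, 12)]
    print(summarize_history(history))
```

With the three events from the text, the summary gives a total of 3, a first occurrence at cycle 6, a maximum erase time of 6 ms, and a maximum retention time of 48 hours.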
  • the historical uncorrectable information of the data block may also include other arbitrary information, which is not specifically limited in this solution.
  • the risk level of the first data block is determined according to the historical uncorrectable information of the first data block and the current uncorrectable information of the first data block.
  • the historical uncorrectable information of the first data block can be updated according to the current uncorrectable information of the first data block to obtain the uncorrectable information of the first data block, and the risk level of the first data block is then determined according to the uncorrectable information of the first data block.
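  • By way of illustration only, updating the historical information with the current information can be sketched as follows. The `BlockUncInfo` record, its field names, and the `merge_current` function are assumptions made for this example; only the update rules (add the current count, keep the first occurrence, take the maxima) follow the description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BlockUncInfo:
    """Uncorrectable information kept per data block (hypothetical layout)."""
    total_unc_cycles: int = 0
    first_unc_cycle: Optional[int] = None
    max_erase_time_ms: float = 0.0
    max_retention_hours: float = 0.0


def merge_current(history: BlockUncInfo, *, cycle: int,
                  erase_time_ms: float, retention_hours: float) -> BlockUncInfo:
    """Fold one current uncorrectable event into the historical record."""
    return BlockUncInfo(
        # The current event contributes a count of 1.
        total_unc_cycles=history.total_unc_cycles + 1,
        # Keep the earliest cycle at which uncorrectable data occurred.
        first_unc_cycle=(cycle if history.first_unc_cycle is None
                         else history.first_unc_cycle),
        max_erase_time_ms=max(history.max_erase_time_ms, erase_time_ms),
        max_retention_hours=max(history.max_retention_hours, retention_hours),
    )


if __name__ == "__main__":
    past = BlockUncInfo(3, 6, 6.0, 48.0)
    print(merge_current(past, cycle=50, erase_time_ms=4.0, retention_hours=10.0))
```

Returning a new record rather than mutating the old one keeps the previous historical state available until the risk decision has been made.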
  • the data storage array is further managed according to the risk level of the first data block obtained above.
  • the management of the data storage array in this solution is based on the risk level of the data block.
  • the risk level of a data block can be high risk, low risk, or no risk; or it can be classified according to preset levels, such as a first level, a second level, a third level, and so on, which is not limited here.
  • determining the risk level as described here can also mean determining how it compares with the reference level; for example, determining that the risk level of the first data block is higher than the reference level, or that it is lower than the reference level, and so on.
  • the reference level can be set arbitrarily; for example, it can be set according to the risk-level threshold at which a data block is eliminated, that is, when the risk level of a data block is higher than the reference level, the data block is deleted.
  • a data block with a high risk level can be directly marked as a bad block or deleted, so that no data is stored in it; a data block with a low risk level can be used mainly to store important data.
  • for example, it is determined whether the risk level of the first data block is lower than a first preset level, where the first preset level is set according to a threshold of data importance; if the risk level of the first data block is lower than the first preset level, first data is stored in the first data block, where the first data is important data. That is to say, when the risk level of the first data block is not higher than the reference level, the data to be stored in a data block can be chosen according to the risk level of that block. For example, a data block whose risk level is relatively high but still lower than the reference level is used to store general data, and a data block whose risk level is lower than the first preset level is used to store important data.
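  • By way of illustration only, the selective-storage policy above can be sketched as follows. Representing risk levels as numbers, the `Placement` names, and the example values are assumptions made for this example; the application leaves the level scheme open.

```python
from enum import Enum


class Placement(Enum):
    RETIRE = "mark as bad block or delete"
    GENERAL_DATA = "store general data"
    IMPORTANT_DATA = "store important data"


def choose_placement(risk_level, reference_level, first_preset_level):
    """Map a block's risk level to a usage decision.

    Assumes numeric levels with first_preset_level <= reference_level.
    """
    if risk_level > reference_level:
        # Higher than the reference level: stop using the block.
        return Placement.RETIRE
    if risk_level < first_preset_level:
        # Lowest-risk blocks are reserved for important data.
        return Placement.IMPORTANT_DATA
    # Usable, but not trusted with important data.
    return Placement.GENERAL_DATA


if __name__ == "__main__":
    for level in (9, 5, 1):
        print(level, choose_placement(level, reference_level=7,
                                      first_preset_level=3))
```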
  • This method can realize selective storage according to the risk level of the data block during data storage, reduce the probability of data loss, and further improve the reliability of data storage.
  • the risk level of the data block is determined based on the current uncorrectable information and the historical uncorrectable information of the data block, and then the data storage array is managed according to the risk level of the data block.
  • the confirmation of the risk level is obtained based on the current uncorrectable information and historical uncorrectable information of the data block.
  • the historical uncorrectable information includes the total number of erase/write cycles in which uncorrectable data has historically occurred
  • the current uncorrectable information includes the count corresponding to the erase/write cycle in which uncorrectable data currently occurs.
  • the risk level of the first data block is determined according to that historical total and that current count.
  • a method for managing a data storage array provided by an embodiment of this application.
  • the method includes steps 401-404, which are specifically as follows:
  • when it is detected that the data of the first page of the first data block is uncorrectable, acquire the count corresponding to the erase/write cycle in which uncorrectable data currently occurs in the first data block, wherein the first data block includes N pages, the first page is any one of the N pages, and N is an integer greater than 1;
  • the current uncorrectable information of the first data block in the embodiment of the present application is the count corresponding to the erase/write cycle in which uncorrectable data currently occurs.
  • the data of the first page of the first data block being uncorrectable means that uncorrectable data currently occurs in the first data block, so this count is 1.
  • the historical uncorrectable information of the first data block in the embodiment of the present application is the total number of erase/write cycles in which uncorrectable data has historically occurred. If the data block is currently being read after 50 erase/write operations and uncorrectable data currently occurs, this historical total is the number of erase/write operations, among the 49 operations before the 50th, in which uncorrectable data occurred.
  • for example, if uncorrectable data occurred in three of those 49 erase/write operations, the historical total is 3.
  • the total number of erase/write cycles with uncorrectable data in the first data block is then obtained. If the count corresponding to the erase/write cycle in which uncorrectable data currently occurs is 1, and the total number of erase/write cycles in which uncorrectable data has historically occurred is 3, the total for the first data block is 3 + 1, that is, 4.
  • the total number of erase/write cycles with uncorrectable data in the first data block may also be calculated according to preset weights. For example, the historical total is preset to carry a weight of 70% and the current count a weight of 30%. Each value is multiplied by its weight, and the products are added to obtain the total for the first data block.
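  • By way of illustration only, the plain sum and the weighted sum described above can be sketched as follows, using the values from the text (historical total 3, current count 1, weights 70% and 30%). The function names are assumptions made for this example, and weights are expressed as integer percentages so that the arithmetic stays exact until the final division.

```python
def total_unc_cycles(historical_total, current_count):
    """Plain sum of the historical total and the current count."""
    return historical_total + current_count


def weighted_unc_cycles(historical_total, current_count,
                        hist_weight_pct=70, cur_weight_pct=30):
    """Weighted sum, with weights given as integer percentages."""
    return (hist_weight_pct * historical_total
            + cur_weight_pct * current_count) / 100


if __name__ == "__main__":
    print(total_unc_cycles(3, 1))      # 3 + 1
    print(weighted_unc_cycles(3, 1))   # 0.7 * 3 + 0.3 * 1
```

With these inputs the plain sum is 4 and the weighted sum is 2.4, so any threshold compared against the weighted value would have to be chosen on the same scale.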
  • the reference level is set according to the threshold of the corresponding risk level when the data block is eliminated;
  • the risk level of the first data block is high, that is, it is higher than the reference level.
  • the SSD device can directly delete the data block, so that subsequent data writing is not performed on the data block.
  • the count corresponding to the erase/write cycle in which uncorrectable data currently occurs in the first data block is acquired; the risk level of the first data block is then determined according to the total number of erase/write cycles in which uncorrectable data has historically occurred in the first data block and that current count.
  • the risk level of the data block is determined based on the current uncorrectable information and the historical uncorrectable information of the data block, and the data storage array is then managed accordingly, which makes it possible to store data selectively according to the risk level of the data block, reduces the chance of data loss, and improves the reliability of data storage.
  • the historical uncorrectable information of the first data block also includes the erase/write count at which uncorrectable data first occurred, and the current uncorrectable information also includes the erase/write count at which uncorrectable data currently occurs.
  • the method further includes:
  • the historical uncorrectable information of the first data block in the embodiment of the present application is the total number of erasing times corresponding to the uncorrectable erasing and writing of the historical data and the erasing times corresponding to the uncorrectable erasing of the data for the first time.
  • when the total number of erase/write cycles with uncorrectable data in the first data block is higher than the second preset threshold and not higher than the first preset threshold, and at the same time the difference between the erase/write count at which uncorrectable data currently occurs in the first data block and the erase/write count at which uncorrectable data first occurred in the first data block is less than the third preset threshold, the risk level of the first data block is higher than the reference level.
  • for example, if the erase/write count at which uncorrectable data currently occurs is 50 and the erase/write count at which uncorrectable data first occurred is 6, the difference between the two is 44. If the total number of erase/write cycles with uncorrectable data in the first data block is 6, and 6 is higher than the second preset threshold and not higher than the first preset threshold, and 44 is less than the third preset threshold, the risk level of the first data block is higher than the reference level.
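  • By way of illustration only, the combined check above can be sketched as follows with the values from the text (total 6, current count 50, first occurrence at 6). The function name and the three threshold values are assumptions made for this example; the application does not state concrete thresholds.

```python
def clustered_unc_risk(total_unc_cycles, current_cycle, first_unc_cycle, *,
                       first_thr, second_thr, third_thr):
    """True when the block is judged higher than the reference level.

    total_unc_cycles -- total erase/write cycles with uncorrectable data
    current_cycle    -- erase/write count at which it currently occurs
    first_unc_cycle  -- erase/write count at which it first occurred
    """
    # The total must lie above the second and not above the first threshold.
    in_band = second_thr < total_unc_cycles <= first_thr
    # A small gap means the failures are packed into few erase/write cycles.
    gap = current_cycle - first_unc_cycle
    return in_band and gap < third_thr


if __name__ == "__main__":
    # Gap is 50 - 6 = 44; thresholds below are hypothetical.
    print(clustered_unc_risk(6, 50, 6, first_thr=10, second_thr=3, third_thr=100))
```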
  • the SSD device can directly delete the data block, so that subsequent data writing is not performed on the data block.
  • the count corresponding to the erase/write cycle in which uncorrectable data currently occurs in the first data block is acquired; the risk level of the first data block is then determined according to the total number of erase/write cycles in which uncorrectable data has historically occurred in the first data block, the erase/write count at which uncorrectable data first occurred, and the erase/write count at which uncorrectable data currently occurs in the first data block.
  • the method further includes :
  • the current uncorrectable information of the first data block in the embodiment of the present application is the erase/write count at which uncorrectable data currently occurs and the count corresponding to the erase/write cycle in which uncorrectable data currently occurs.
  • the erase/write count at which uncorrectable data currently occurs is the number of erase/write operations the data block has undergone when the uncorrectable data occurs. For example, if uncorrectable data currently occurs after the data block has undergone 50 erase/write operations, the erase/write count at which uncorrectable data currently occurs is 50.
  • the historical uncorrectable information of the first data block in the embodiment of the present application is the total number of erase/write cycles in which uncorrectable data has historically occurred and the erase/write count at which uncorrectable data first occurred.
  • Data blocks with low risk levels can be used to store data.
  • the count corresponding to the erase/write cycle in which uncorrectable data currently occurs in the first data block is acquired; the risk level of the first data block is then determined according to the total number of erase/write cycles in which uncorrectable data has historically occurred in the first data block, the erase/write count at which uncorrectable data first occurred, the erase/write count at which uncorrectable data currently occurs in the first data block, and the count corresponding to the erase/write cycle in which uncorrectable data currently occurs.
  • if it is confirmed that the difference between the erase/write count at which uncorrectable data currently occurs in the first data block and the erase/write count at which uncorrectable data first occurred in the first data block exceeds the fifth preset threshold, it is confirmed that the risk level of the first data block is lower than the reference level.
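  • By way of illustration only, the fifth-threshold check above can be sketched as follows. The function name and the threshold value are assumptions made for this example, and the sketch covers only the difference test itself, not any further condition the embodiment may place on the total count.

```python
def is_lower_than_reference(current_cycle, first_unc_cycle, fifth_thr):
    """True when uncorrectable events are spread over many erase/write cycles.

    current_cycle   -- erase/write count at which uncorrectable data currently occurs
    first_unc_cycle -- erase/write count at which it first occurred
    fifth_thr       -- the "fifth preset threshold" (value is hypothetical)
    """
    gap = current_cycle - first_unc_cycle
    # A large gap suggests occasional errors rather than a defective block.
    return gap > fifth_thr


if __name__ == "__main__":
    print(is_lower_than_reference(900, 6, fifth_thr=500))
    print(is_lower_than_reference(50, 6, fifth_thr=500))
```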
  • an embodiment of the present application also provides a method for managing a data storage array.
  • the method includes the following steps:
  • the current uncorrectable information of the first data block in the embodiment of the present application is the current erasing operation time.
  • the current erase operation time is the time taken by the erase operation of the current erase/write cycle of the data block, that is, the time taken to erase all the data in the data block in that erase operation. If the data block has undergone 50 erase/write operations and uncorrectable data currently occurs, the current erase operation time is the time taken to erase all the data in the data block for the 50th time.
  • the historical uncorrectable information of the first data block in the embodiment of the present application is the historical maximum erase operation time. If the data block is currently being read after 50 erase/write operations and uncorrectable data currently occurs, the historical maximum erase operation time is the longest erase operation time among those of the 49 erase/write operations before the 50th in which uncorrectable data occurred. For example, if uncorrectable data occurred after the 6th erase/write operation, whose erase operation time was 5 ms, after the 36th, whose erase operation time was 3 ms, and after the 49th, whose erase operation time was 6 ms, the historical maximum erase operation time is 6 ms.
  • the maximum value is used as the maximum erasing operation time of the first data block.
  • if the maximum erasing operation time of the first data block is higher than the first preset threshold, the erasing operation time is too long, which indicates that the data block is abnormal, and it is determined that the risk level of the first data block is higher than the reference level.
  • the SSD device can directly delete the data block, so that subsequent data writing is not performed on the data block.
  • the current erasing operation time of the first data block is obtained; the risk level of the first data block is then determined according to the historical maximum erasing operation time of the first data block and the current erasing operation time of the first data block.
  • the risk level of the data block is determined based on the current uncorrectable information and the historical uncorrectable information of the data block, so that data can be stored selectively according to the risk level of the data block, which reduces the probability of data loss and thereby improves the reliability of data storage.
  • the maximum erasing operation time in the embodiment of the present application is selected from among the erasing and writing operations in which uncorrectable data occurred.
  • this solution can also select the maximum erasing operation time from among all erasing and writing operations. This solution does not specifically limit this.
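The erase-time rule of this embodiment can be sketched as below, using the 5 ms, 3 ms and 6 ms history from the example. The function name, the current erase time of 4 ms and the threshold of 5.5 ms are assumptions made only for illustration.

```python
from typing import Iterable


def erase_time_risk_is_high(historical_unc_erase_ms: Iterable[float],
                            current_erase_ms: float,
                            threshold_ms: float) -> bool:
    """Return True when the block's maximum erase time exceeds the threshold."""
    # Historical maximum over the cycles in which uncorrectable data occurred.
    historical_max = max(historical_unc_erase_ms, default=0.0)
    # The larger of the historical maximum and the current erase time is
    # taken as the maximum erasing operation time of the block.
    block_max = max(historical_max, current_erase_ms)
    return block_max > threshold_ms


history_ms = [5.0, 3.0, 6.0]  # erase times from the example in the text
print(max(history_ms))        # historical maximum erasing operation time
print(erase_time_risk_is_high(history_ms, current_erase_ms=4.0, threshold_ms=5.5))
```

Passing the erase times of all cycles instead of only the cycles with uncorrectable data gives the variant mentioned at the end of the embodiment.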
  • an embodiment of the present application also provides a method for managing a data storage array.
  • the method includes the following steps:
  • the current uncorrectable information of the first data block in the embodiment of the present application is the current minimum programming operation time.
  • the current minimum programming operation time refers to the minimum value of the programming operation times of the pages of the first data block in the current erasing and writing operation, that is, the write time of the page that takes the shortest time when each page of data of the first data block is written in the current erasing and writing operation.
  • the programming operation time of each page can be centrally stored in the preset module.
  • the preset module is not specifically limited here.
  • the historical uncorrectable information of the first data block in the embodiment of the present application is the historical minimum programming operation time. If the data block is currently being read, has undergone 50 erasing and writing operations, and uncorrectable data currently occurs, the historical minimum programming operation time is the minimum programming operation time among those of the 49 erasing and writing operations before the 50th in which uncorrectable data occurred.
  • for example, uncorrectable data occurred after the 6th erasing and writing operation, in which the minimum programming operation time is 0.2 ms, for writing the 3rd page; uncorrectable data occurred after the 36th erasing and writing operation, in which the minimum programming operation time is 0.05 ms, for writing the 18th page; and uncorrectable data occurred after the 49th erasing and writing operation, in which the minimum programming operation time is 0.1 ms, for writing the 8th page.
  • the corresponding historical minimum programming operation time is therefore 0.05 ms.
  • the minimum value is determined to be the minimum programming operation time of the first data block.
  • if the minimum programming operation time of the first data block is less than the first preset threshold, the programming operation time is too short, which indicates that the data block is abnormal, and it is determined that the risk level of the first data block is higher than the reference level.
  • the SSD device can directly delete the data block, so that subsequent data writing is not performed on the data block.
  • the current minimum programming operation time of the first data block is obtained; the risk level of the first data block is then determined according to the historical minimum programming operation time of the first data block and the current minimum programming operation time of the first data block.
  • the risk level of the data block is determined based on the current uncorrectable information and the historical uncorrectable information of the data block, so that data can be stored selectively according to the risk level of the data block, which reduces the probability of data loss and thereby improves the reliability of data storage.
  • the minimum programming operation time in the embodiment of the present application is selected from among the erasing and writing operations in which uncorrectable data occurred.
  • this solution can also select the minimum programming operation time from among all erasing and writing operations. This solution does not specifically limit this.
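The programming-time rule can be sketched in the same way, using the 0.2 ms, 0.05 ms and 0.1 ms history from the example. The per-page times of the current cycle and the 0.08 ms threshold are hypothetical values, and the function name is an assumption.

```python
from typing import Iterable, Sequence


def program_time_risk_is_high(historical_unc_min_ms: Iterable[float],
                              current_page_times_ms: Sequence[float],
                              threshold_ms: float) -> bool:
    """Return True when the block's minimum page program time is below the threshold."""
    # Current minimum: the fastest page write in the current erase/write cycle.
    current_min = min(current_page_times_ms)
    # Historical minimum over cycles with uncorrectable data; fall back to the
    # current minimum when the block has no history yet.
    historical_min = min(historical_unc_min_ms, default=current_min)
    block_min = min(historical_min, current_min)
    return block_min < threshold_ms


history_ms = [0.2, 0.05, 0.1]        # per-cycle minimums from the example
current_pages_ms = [0.3, 0.25, 0.4]  # hypothetical page program times
print(min(history_ms))               # historical minimum programming operation time
print(program_time_risk_is_high(history_ms, current_pages_ms, threshold_ms=0.08))
```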
  • Each of the foregoing embodiments manages the data storage array when it is detected that the data of the first page of the data block cannot be corrected. Further, an embodiment of the present application also provides a method for managing a data storage array that additionally detects each page other than the first page separately and then manages the data storage array.
  • the method for managing a data storage array includes steps 501-506, which are specifically as follows:
  • if the data of the first page of the first data block is uncorrectable, separately detect the other N-1 pages in the first data block except the first page to obtain the current uncorrectable information of the first data block.
  • in the embodiment of the present application, when the data of the first page of the first data block cannot be corrected, the pages of the first data block other than the first page are detected separately, and the current uncorrectable information of the first data block is then obtained.
  • the current uncorrectable information is the maximum and minimum page numbers of the pages on which uncorrectable data currently occurs.
  • the largest number and the smallest number can be obtained.
  • the historical uncorrectable information of the first data block in the embodiment of the present application is the maximum and minimum page numbers of the pages on which uncorrectable data historically occurred.
  • these are the maximum and minimum values among the page numbers of the pages with uncorrectable data in the erasing and writing operations in which uncorrectable data historically occurred.
  • the first preset threshold here and any of the first preset thresholds mentioned above may be the same value or different values.
  • the SSD device can directly delete the data block, so that subsequent data writing is not performed on the data block.
  • each page in the first data block except the first page is detected separately to obtain the maximum and minimum page numbers of the pages on which uncorrectable data currently occurs in the first data block; the risk level of the first data block is then determined according to the maximum and minimum page numbers of the pages on which uncorrectable data historically occurred in the first data block and the maximum and minimum page numbers of the pages on which uncorrectable data currently occurs.
  • the risk level of the data block is determined based on the current uncorrectable information and the historical uncorrectable information of the data block, so that data can be stored selectively according to the risk level of the data block, which reduces the probability of data loss and thereby improves the reliability of data storage.
  • the method includes steps 601-605, which are specifically as follows:
  • if the data of the first page of the first data block is uncorrectable, separately detect the other N-1 pages in the first data block except the first page to obtain the current uncorrectable information of the first data block.
  • the first page is any one of the N pages, and N is an integer greater than 1.
  • when the data of the first page of the first data block cannot be corrected, each page of the first data block other than the first page is detected separately, and the current uncorrectable information of the first data block is then obtained.
  • the current uncorrectable information is the data retention time at which uncorrectable data currently occurs, the number of erasing and writing times at which uncorrectable data currently occurs, and the total number of pages on which uncorrectable data currently occurs.
  • the interval between the 50th erasing and writing operation and the 51st erasing and writing operation is the data retention time of the currently uncorrectable data.
  • the total number of pages on which uncorrectable data currently occurs is the total number of pages with uncorrectable data in the data block for the current erasing and writing operation. For example, if the data block is currently being read, has been erased and written 50 times, and uncorrectable data currently occurs, and detection finds that the data of the 1st, 5th, 8th and 23rd pages is uncorrectable, the total number of pages with currently uncorrectable data is 4.
  • the maximum data retention time of historically uncorrectable data is the maximum data retention time among the erasing and writing operations with uncorrectable data in the 49 erasing and writing operations before the 50th erasing and writing operation.
  • for example, uncorrectable data occurred after the 6th erasing and writing operation, and the interval from the 6th to the 7th erasing and writing operation is 48 hours; uncorrectable data occurred after the 36th erasing and writing operation, and the interval from the 36th to the 37th erasing and writing operation is 5 hours; uncorrectable data occurred after the 49th erasing and writing operation, and the interval from the 49th to the 50th erasing and writing operation is 12 hours. Therefore, the maximum data retention time of historically uncorrectable data is 48 hours.
  • the maximum value is determined to be the maximum data retention time of the data that cannot be corrected in the first data block.
  • the historical total number of pages with uncorrectable data can be understood as follows: if the data block is currently being read, has been erased and written 50 times, and uncorrectable data currently occurs, the historical total is the total number of pages with uncorrectable data across the 49 erasing and writing operations before the 50th erasing and writing operation. For example, uncorrectable data occurred after the 6th erasing and writing operation on a total of 3 pages; uncorrectable data occurred after the 36th erasing and writing operation on a total of 8 pages; and uncorrectable data occurred after the 49th erasing and writing operation on a total of 20 pages. The historical total number of pages with uncorrectable data is therefore 3+8+20, that is, 31 pages.
  • the total number of pages with uncorrectable data currently present is obtained.
  • the historical uncorrectable information of the first data block in the embodiment of the present application is the maximum data retention time of historically occurring data that is uncorrectable and the total number of pages of historically uncorrectable data.
  • when the maximum data retention time of uncorrectable data in the first data block is lower than the first preset threshold, the number of erasing and writing times at which uncorrectable data currently occurs in the first data block is lower than the second preset threshold, and the total number of pages with uncorrectable data in the first data block is greater than the third preset threshold, the risk level of the first data block is high.
  • the SSD device can directly delete the data block, so that subsequent data writing is not performed on the data block.
  • each page in the first data block except the first page is detected separately to obtain the current uncorrectable information of the first data block.
  • the risk level of the data block is determined based on the current uncorrectable information and the historical uncorrectable information of the data block, so that data can be stored selectively according to the risk level of the data block, which reduces the probability of data loss and thereby improves the reliability of data storage.
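The three-threshold rule of this embodiment can be sketched as follows, reusing the retention times (48, 5 and 12 hours) and page counts (3, 8 and 20) from the examples. Two points are assumptions rather than statements from the application: that the block-level maximum retention time is the larger of the historical maximum and the current value, and that the block-level page total is the historical total plus the current total. The record type, the current retention time of 10 hours and all threshold values are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UncEvent:
    pe_count: int           # erase/write count at which the event occurred
    retention_hours: float  # data retention time for that cycle
    unc_pages: int          # number of pages with uncorrectable data


def retention_risk_is_high(history: List[UncEvent], current: UncEvent,
                           t1_hours: float, t2_pe: int, t3_pages: int) -> bool:
    """High risk: short retention, low erase/write count, many uncorrectable pages."""
    max_retention = max([e.retention_hours for e in history] + [current.retention_hours])
    total_pages = sum(e.unc_pages for e in history) + current.unc_pages
    return (max_retention < t1_hours
            and current.pe_count < t2_pe
            and total_pages > t3_pages)


history = [UncEvent(6, 48, 3), UncEvent(36, 5, 8), UncEvent(49, 12, 20)]
current = UncEvent(50, 10, 4)  # 4 pages as in the text; 10 hours is hypothetical
print(sum(e.unc_pages for e in history))  # historical page total from the example
print(retention_risk_is_high(history, current, t1_hours=72, t2_pe=100, t3_pages=30))
```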
  • the foregoing embodiments only introduce a method for managing a data storage array based on part of historical uncorrectable information and part of current uncorrectable information.
  • the embodiment of the present application does not limit the foregoing historical uncorrectable information and current uncorrectable information. It may also be any other historical uncorrectable information and other current uncorrectable information.
  • the multiple first preset thresholds in the embodiment of the present application may be equal or unequal. There is no specific limitation here.
  • the system may include a host and an SSD device, where the SSD device includes an SSD controller and a multi-channel NAND Flash.
  • the SSD controller is connected to a host (such as a server) through a variety of protocol interfaces such as NVMe/SAS/PCIe/UFS/eMMC, so as to receive read and write requests sent by the host.
  • the SSD controller also accesses and controls the NAND Flash chip on the channel through the NAND Flash interface.
  • the SSD controller includes a processor, a data cache area, and a NAND Flash management module.
  • the block classification module is used to manage the data storage array.
  • the SSD device can perform the method described in any of the embodiments in FIG. 3 to FIG. 6 to determine the risk level of the data block.
  • the Block grading module includes a Block-UNC management table storage unit, a UNC information extraction unit, and a Block grading calculation unit.
  • when the SSD device reads UNC data from a certain block, the address of the block is obtained.
  • the UNC information extraction unit initiates data reading of the other pages in the block and judges whether the data of those pages is UNC, so as to obtain the total number of UNC pages at this time, as well as the maximum page number and minimum page number of all UNC pages in the block.
  • the UNC information extraction unit searches the Block-UNC management table stored in the Block-UNC management table storage unit for the historical information of the block according to the address of the block, and updates the content of the Block-UNC management table in combination with the parameters obtained above.
  • the block classification calculation unit calculates the updated UNC information to obtain the risk level of the block.
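The extraction and update flow described above can be sketched as follows. The table layout is an assumption: the field names (`unc_events`, `total_unc_pages`, `min_unc_page`, `max_unc_page`) are hypothetical and are not the columns of the Block-UNC management table, and `page_is_unc` stands in for whatever read-and-decode check the controller actually performs.

```python
from typing import Callable, Dict, Optional


def scan_block_for_unc(num_pages: int,
                       page_is_unc: Callable[[int], bool]) -> Dict[str, Optional[int]]:
    """Read every page of a block and summarise where UNC data occurs."""
    unc_pages = [p for p in range(num_pages) if page_is_unc(p)]
    if not unc_pages:
        return {"unc_page_count": 0, "min_unc_page": None, "max_unc_page": None}
    return {"unc_page_count": len(unc_pages),
            "min_unc_page": min(unc_pages),
            "max_unc_page": max(unc_pages)}


def update_block_unc_table(table: Dict[int, dict], block_addr: int,
                           scan: Dict[str, Optional[int]]) -> dict:
    """Merge the current scan into the block's historical record, keyed by address."""
    entry = table.setdefault(block_addr, {"unc_events": 0, "total_unc_pages": 0,
                                          "min_unc_page": None, "max_unc_page": None})
    entry["unc_events"] += 1
    entry["total_unc_pages"] += scan["unc_page_count"]
    if scan["min_unc_page"] is not None:
        lo, hi = entry["min_unc_page"], entry["max_unc_page"]
        entry["min_unc_page"] = scan["min_unc_page"] if lo is None else min(lo, scan["min_unc_page"])
        entry["max_unc_page"] = scan["max_unc_page"] if hi is None else max(hi, scan["max_unc_page"])
    return entry


bad_pages = {1, 5, 8, 23}  # hypothetical UNC pages in a 64-page block
table: Dict[int, dict] = {}
scan = scan_block_for_unc(64, lambda p: p in bad_pages)
print(update_block_unc_table(table, block_addr=0x2A, scan=scan))
```

The updated entry is what a grading step would then compare against its thresholds.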
  • Block-UNC management table can refer to the following table 1:
  • the block classification calculation unit performs calculations according to the updated UNC information to obtain the risk level of the block. When the UNC information of a block meets at least one of the following conditions, the risk level of the block is high, and the block can be marked as a bad block and no longer used for data storage.
  • CNT_PE > a: that is, when the number of erasing and writing (PE) operations of the block in which UNC occurred is higher than the threshold a, the risk level of the block is higher than the reference level.
  • the SSD controller can protect the data by periodically checking the data.
  • if the block does not meet any of conditions 1) to 4) above, it is determined that the block is a normal block and no other processing is required.
  • a, b, c, d, e, and f are all configurable parameters, and these parameters can be determined by the measured data of NAND Flash.
  • the values of b and e are less than a; the value of f is greater than c.
  • different parameter values affect the standard of risk judgment within a certain range.
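The ordering constraints on the configurable parameters, together with the one condition that is fully stated in this passage (CNT_PE > a), can be checked as in the sketch below. The numeric values are hypothetical, the function names are assumptions, and the parameter d and the conditions involving b through f are left out because the text does not give their full form.

```python
def thresholds_are_consistent(a: float, b: float, c: float,
                              e: float, f: float) -> bool:
    """Check the constraints stated in the text: b < a, e < a, and f > c."""
    return b < a and e < a and f > c


def cnt_pe_condition(cnt_pe: int, a: int) -> bool:
    """Condition CNT_PE > a: too many erase/write cycles with UNC data."""
    return cnt_pe > a


# Hypothetical parameter values, as might be derived from measured NAND Flash data.
params = {"a": 10, "b": 4, "c": 20, "e": 6, "f": 50}
print(thresholds_are_consistent(**params))
print(cnt_pe_condition(cnt_pe=12, a=params["a"]))
```

Validating the parameter set once at configuration time avoids applying thresholds that contradict the stated ordering.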
  • Block-UNC management table can refer to the following table two:
  • when the UNC information of a block meets at least one of the following conditions, the risk level of the block is higher than the reference level, and the block can be marked as a bad block and no longer used for data storage.
  • tERASE_Max > g: that is, when the maximum erase operation time of the block is too long, the risk level of the block is higher than the reference level.
  • tPROG_min < k: that is, when the minimum programming operation time of the block is too short, the risk level of the block is higher than the reference level.
  • g, h, i, j, and k are all configurable parameters, and these parameters can be determined by the actual measurement data of NAND Flash.
  • An embodiment of the present application also provides an apparatus for managing a data storage array, including:
  • the reading module is used to read the data stored in the first data block in the data storage array
  • the acquiring module is configured to acquire the current uncorrectable information of the first data block when it is detected that the data of the first page of the first data block is not correctable, wherein the first data block includes N pages, the first page is any one of the N pages, and N is an integer greater than 1;
  • a determining module configured to determine the risk level of the first data block according to the historical uncorrectable information of the first data block and the current uncorrectable information of the first data block;
  • the management module is configured to manage the data storage array according to the risk level of the first data block.
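How the four modules cooperate can be sketched as a single class. This is an illustration under stated assumptions, not the apparatus itself: the class and method names are hypothetical, `page_is_uncorrectable` stands in for the real read-and-decode path, and the risk rule used here (the total number of uncorrectable events exceeding a threshold) is just one simple choice among the criteria the embodiments describe.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set


@dataclass
class DataStorageArrayManager:
    page_is_uncorrectable: Callable[[int, int], bool]  # (block, page) -> UNC?
    unc_count_threshold: int                           # hypothetical threshold
    history: Dict[int, int] = field(default_factory=dict)  # block -> past UNC events
    retired_blocks: Set[int] = field(default_factory=set)

    def handle_read(self, block: int, page: int) -> str:
        # Reading module: read the page and learn whether it could be corrected.
        if not self.page_is_uncorrectable(block, page):
            return "ok"
        # Acquiring module: current uncorrectable information (one new event).
        current_count = 1
        # Determining module: combine historical and current information.
        total = self.history.get(block, 0) + current_count
        self.history[block] = total
        risk_is_high = total > self.unc_count_threshold
        # Management module: stop using high-risk blocks for data storage.
        if risk_is_high:
            self.retired_blocks.add(block)
            return "retired"
        return "kept"


manager = DataStorageArrayManager(lambda block, page: block == 7, unc_count_threshold=2)
print([manager.handle_read(7, 0) for _ in range(3)])
print(manager.retired_blocks)
```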
  • the embodiment of the application adopts the above methods to determine the risk level of the data block based on the current uncorrectable information and the historical uncorrectable information of the data block, so that data can be stored selectively according to the risk level of the data block, which reduces the probability of data loss and thereby improves the reliability of data storage.
  • the historical uncorrectable information of the first data block includes the total count corresponding to the numbers of erasing and writing times at which uncorrectable data historically occurred, the current uncorrectable information of the first data block includes the count corresponding to the number of erasing and writing times at which uncorrectable data currently occurs, and the determining module is specifically configured to:
  • the risk level of the first data block is higher than the reference level.
  • the count corresponding to the number of erasing and writing times at which uncorrectable data currently occurs in the first data block is acquired; the risk level of the first data block is then determined according to the total count corresponding to the numbers of erasing and writing times at which uncorrectable data historically occurred in the first data block and the count corresponding to the number of erasing and writing times at which uncorrectable data currently occurs in the first data block.
  • the risk level of the data block is determined based on the current uncorrectable information and the historical uncorrectable information of the data block, so that data can be stored selectively according to the risk level of the data block, which reduces the probability of data loss and thereby improves the reliability of data storage.
  • the historical uncorrectable information of the first data block includes the total count corresponding to the numbers of erasing and writing times at which uncorrectable data historically occurred and the number of erasing and writing times at which uncorrectable data first appeared in the first data block, the current uncorrectable information of the first data block includes the count corresponding to the number of erasing and writing times at which uncorrectable data currently occurs, and the determining module is specifically configured to:
  • the number corresponding to the number of erasing and writing times for which the currently occurring data is not correctable and the total number corresponding to the number of erasing and writing times for which the historically-occurring data is not correctable, obtain the data that is not correctable for the first data block.
  • the count corresponding to the number of erasing and writing times at which uncorrectable data currently occurs in the first data block is acquired; the risk level of the first data block is then determined according to the total count corresponding to the numbers of erasing and writing times at which uncorrectable data historically occurred in the first data block, the number of erasing and writing times at which uncorrectable data first appeared, and the number of erasing and writing times at which uncorrectable data currently appears in the first data block.
  • the risk level of the data block is determined based on the current uncorrectable information and the historical uncorrectable information of the data block, so that data can be stored selectively according to the risk level of the data block, which reduces the probability of data loss and thereby improves the reliability of data storage.
  • the historical uncorrectable information of the first data block includes the total count corresponding to the numbers of erasing and writing times at which uncorrectable data historically occurred and the number of erasing and writing times at which uncorrectable data first appeared in the first data block, the current uncorrectable information of the first data block includes the number of erasing and writing times at which uncorrectable data currently occurs and the count corresponding to that number of erasing and writing times, and the determining module is specifically configured to:
  • the number corresponding to the number of erasing and writing times for which the currently occurring data is not correctable and the total number corresponding to the number of erasing and writing times for which the historically-occurring data is not correctable, obtain the data that is not correctable for the first data block.
  • the count corresponding to the number of erasing and writing times at which uncorrectable data currently occurs in the first data block is acquired; the risk level of the first data block is then determined according to (i) the total count corresponding to the numbers of erasing and writing times at which uncorrectable data historically occurred in the first data block, (ii) the number of erasing and writing times at which uncorrectable data first appeared, (iii) the number of erasing and writing times at which uncorrectable data currently appears in the first data block, and (iv) the count corresponding to the number of erasing and writing times at which uncorrectable data currently occurs.
  • the risk level of the data block is determined based on the current uncorrectable information and the historical uncorrectable information of the data block, so that data can be stored selectively according to the risk level of the data block, which reduces the probability of data loss and thereby improves the reliability of data storage.
  • the historical uncorrectable information of the first data block includes the historical maximum erasing operation time, the current uncorrectable information of the first data block includes the current erasing operation time, and the determining module is specifically configured to:
  • the current erasing operation time of the first data block is obtained; the risk level of the first data block is then determined according to the historical maximum erasing operation time of the first data block and the current erasing operation time of the first data block.
  • the risk level of the data block is determined based on the current uncorrectable information and the historical uncorrectable information of the data block, so that data can be stored selectively according to the risk level of the data block, which reduces the probability of data loss and thereby improves the reliability of data storage.
  • the historical uncorrectable information of the first data block includes a historical minimum programming operation time, the current uncorrectable information of the first data block includes a current minimum programming operation time, and the determining module is specifically configured to:
  • the current minimum programming operation time of the first data block is obtained; the risk level of the first data block is then determined according to the historical minimum programming operation time of the first data block and the current minimum programming operation time of the first data block.
  • the risk level of the data block is determined based on the current uncorrectable information and the historical uncorrectable information of the data block, so that data can be stored selectively according to the risk level of the data block, which reduces the probability of data loss and thereby improves the reliability of data storage.
  • the acquisition module is specifically used for:
  • the current uncorrectable information of the first data block also includes information obtained by separately detecting N-1 pages other than the first page. Using this method makes information acquisition more comprehensive, helps to determine the risk level of the data block based on more comprehensive information, and improves the reliability of determining the risk level of the data block.
  • the historical uncorrectable information of the first data block includes the maximum and minimum page numbers of the pages on which uncorrectable data historically occurred, the current uncorrectable information of the first data block includes the maximum and minimum page numbers of the pages on which uncorrectable data currently occurs, and the determining module is specifically configured to:
  • the historical uncorrectable information of the first data block includes the maximum data retention time of historically uncorrectable data and the historical total number of pages with uncorrectable data, the current uncorrectable information of the first data block includes the data retention time at which uncorrectable data currently occurs, the number of erasing and writing times at which uncorrectable data currently occurs, and the total number of pages on which uncorrectable data currently occurs, and the determining module is specifically configured to:
  • each page in the first data block except the first page is detected separately to obtain the current uncorrectable information of the first data block.
  • the risk level of the data block is determined based on the current uncorrectable information and the historical uncorrectable information of the data block, so that data can be stored selectively according to the risk level of the data block, which reduces the probability of data loss and thereby improves the reliability of data storage.
  • the management module is specifically configured to: if the risk level of the first data block is higher than the reference level, delete the first data block in the data storage array.
  • the management module is specifically configured to: if the risk level of the first data block is not higher than the reference level, use the first data block in the data storage array for data storage.
  • the embodiment of the present application also provides a device for managing data blocks, including a processor and a NAND Flash management module, where:
  • the processor is configured to read data stored in the first data block in the data storage array
  • the NAND Flash management module is configured to obtain the current uncorrectable information of the first data block when the data of the first page of the first data block cannot be corrected, wherein the first data block includes N pages, the first page is any one of the N pages, and N is an integer greater than 1;
  • the NAND Flash management module is further configured to determine the risk level of the first data block according to the historical uncorrectable information of the first data block and the current uncorrectable information of the first data block;
  • the processor is further configured to manage the first data block according to the risk level of the first data block.
  • the embodiments of the present application also provide a computer-readable storage medium, the computer-readable storage medium stores a computer program, and the computer program is executed by a processor to implement the method.
  • the disclosed device may be implemented in other ways.
  • the device embodiments described above are merely illustrative.
  • the division of the units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • each functional unit in each embodiment may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or in the form of software program modules.
  • if the integrated unit is implemented in the form of a software program module and sold or used as an independent product, it can be stored in a computer-readable memory.
  • the technical solution of the present application essentially or the part that contributes to the existing technology or all or part of the technical solution can be embodied in the form of a software product, and the computer software product is stored in a memory.
  • a number of instructions are included to enable a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods described in the various embodiments of the present application.
  • the foregoing memory includes: U disk, read-only memory (read-only memory, ROM), random access memory (random access memory, RAM), mobile hard disk, magnetic disk or optical disk and other media that can store program codes.
  • the program can be stored in a computer-readable memory, and the memory can include: a flash disk , Read-only memory, random access device, magnetic or optical disk, etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Techniques For Improving Reliability Of Storages (AREA)

Abstract

Embodiments of the present application provide a method, device and storage medium for managing a data storage array. The method includes: reading data stored in a first data block of a data storage array; upon detecting an uncorrectable codeword in a first page of the first data block, obtaining current uncorrectable codeword information of the first data block, where the first data block includes N pages, the first page is any one of the N pages, and N is an integer greater than 1; and determining a risk level of the first data block according to historical uncorrectable codeword information of the first data block and the current uncorrectable codeword information thereof. This approach determines the risk level of a data block from the block's current and historical uncorrectable codeword information, and allows data to be stored selectively according to the risk level of the data block during data storage, which reduces the probability of data loss and thereby improves the reliability of data storage.
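
The flow summarized in the abstract can be illustrated with a minimal Python sketch. The abstract does not state how the historical and current uncorrectable codeword information are combined, so the classification rule below is purely an assumption chosen for illustration, and the names RiskLevel, UncorrectableInfo, BlockRecord, assess_block and pick_block are hypothetical and do not come from the application.

```python
from dataclasses import dataclass, field
from enum import Enum


class RiskLevel(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class UncorrectableInfo:
    """One uncorrectable-codeword event (hypothetical record layout)."""
    page: int         # index of the page within the block, 0 <= page < N
    erase_count: int  # erase cycles of the block when the event occurred


@dataclass
class BlockRecord:
    """Per-block history of uncorrectable-codeword events."""
    num_pages: int                                # N, an integer greater than 1
    history: list = field(default_factory=list)  # past UncorrectableInfo events


def assess_block(block: BlockRecord, current: UncorrectableInfo) -> RiskLevel:
    """Combine historical and current information into a risk level.

    Assumed rule (not the claimed method): no prior event gives LOW,
    prior events only on other pages give MEDIUM, and a prior event on
    the same page gives HIGH.
    """
    if not 0 <= current.page < block.num_pages:
        raise ValueError("page index out of range")
    if any(event.page == current.page for event in block.history):
        level = RiskLevel.HIGH
    elif block.history:
        level = RiskLevel.MEDIUM
    else:
        level = RiskLevel.LOW
    # The current information becomes part of the history for later reads.
    block.history.append(current)
    return level


def pick_block(levels: dict) -> int:
    """Return the id of the block with the lowest risk level for the next write."""
    return min(levels, key=lambda block_id: levels[block_id].value)
```

In this sketch, the selective storage mentioned in the abstract corresponds to pick_block, which simply prefers the block with the lowest assessed risk; a real controller would presumably also weigh wear and free space, which are outside what the abstract describes.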
PCT/CN2020/082627 2020-03-31 2020-03-31 Procédé de gestion d'ensemble de stockage de données, dispositif et support de stockage WO2021196046A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202080092988.2A CN114930299A (zh) 2020-03-31 2020-03-31 一种管理数据存储阵列的方法、装置及存储介质
PCT/CN2020/082627 WO2021196046A1 (fr) 2020-03-31 2020-03-31 Procédé de gestion d'ensemble de stockage de données, dispositif et support de stockage

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/082627 WO2021196046A1 (fr) 2020-03-31 2020-03-31 Procédé de gestion d'ensemble de stockage de données, dispositif et support de stockage

Publications (1)

Publication Number Publication Date
WO2021196046A1 (fr) 2021-10-07

Family

ID=77927214

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/082627 WO2021196046A1 (fr) 2020-03-31 2020-03-31 Procédé de gestion d'ensemble de stockage de données, dispositif et support de stockage

Country Status (2)

Country Link
CN (1) CN114930299A (fr)
WO (1) WO2021196046A1 (fr)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115509466B (zh) * 2022-11-17 2023-03-28 苏州浪潮智能科技有限公司 一种数据管理方法、装置及电子设备和存储介质

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140337687A1 (en) * 2011-09-22 2014-11-13 Violin Memory Inc. System and method for correcting errors in data using a compound code
CN103269230A (zh) * 2013-05-28 2013-08-28 中国科学院自动化研究所 一种自适应调整纠错码的容错系统及方法
CN106776109A (zh) * 2016-12-26 2017-05-31 湖南国科微电子股份有限公司 固态硬盘读错误检测装置及读不可纠错误原因的检测方法
CN107391300A (zh) * 2017-07-26 2017-11-24 湖南国科微电子股份有限公司 一种提高闪存数据存储可靠性的方法及系统

Also Published As

Publication number Publication date
CN114930299A (zh) 2022-08-19

Similar Documents

Publication Publication Date Title
CN106776109B (zh) 固态硬盘读错误检测装置及读不可纠错误原因的检测方法
US9910748B2 (en) Rebuilding process for storage array
US9952795B2 (en) Page retirement in a NAND flash memory system
CN106448737B (zh) 读取闪存数据的方法、装置以及固态驱动器
US9959059B2 (en) Storage error management
CN108052414B (zh) 一种提升ssd工作温度范围的方法及系统
US10936391B2 (en) Memory management method and storage controller
US8560922B2 (en) Bad block management for flash memory
TWI623878B (zh) 資料讀取方法以及儲存控制器
US10372382B2 (en) Methods and apparatus for read disturb detection based on logical domain
US10204003B2 (en) Memory device and storage apparatus
JP2008287404A (ja) 読み出しによる非アクセスメモリセルのデータ破壊を検出及び回復する装置、及びその方法
US11048601B2 (en) Disk data reading/writing method and device
TWI610169B (zh) 檔案系統的日誌子系統寫入方法、錯誤追蹤方法及處理器
US20150378800A1 (en) Storage device and storage device control method
US11042432B1 (en) Data storage device with dynamic stripe length manager
US20190095276A1 (en) Method for Processing Data Stored in a Memory Device and a Data Storage Device Utilizing the Same
US10593421B2 (en) Method and apparatus for logically removing defective pages in non-volatile memory storage device
US10324648B1 (en) Wear-based access optimization
US20200258582A1 (en) Pre-Program Read to Counter Wordline Failures
CN113272905A (zh) 具有时变位错误率的存储器中的缺陷检测
CN109801668A (zh) 数据储存装置及应用于其上的操作方法
JP2018163707A (ja) 半導体記憶装置及びそのリード制御方法
WO2021196046A1 (fr) Procédé de gestion d'ensemble de stockage de données, dispositif et support de stockage
US20200057702A9 (en) Page retirement in a nand flash memory system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20929173

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20929173

Country of ref document: EP

Kind code of ref document: A1