WO2020186524A1 - Storage verification method and device - Google Patents

Storage verification method and device

Info

Publication number
WO2020186524A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
chips
check
ecc
carry
Prior art date
Application number
PCT/CN2019/079118
Inventor
孙亚萍
王岩松
李挺
Original Assignee
华为技术有限公司
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to PCT/CN2019/079118 (WO2020186524A1, zh)
Priority to CN201980091806.7A (CN113424262B, zh)
Publication of WO2020186524A1 (zh)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11C STATIC STORES
    • G11C29/00 Checking stores for correct operation; Subsequent repair; Testing stores during standby or offline operation
    • G11C29/04 Detection or location of defective memory elements, e.g. cell construction details, timing of test signals
    • G11C29/08 Functional testing, e.g. testing during refresh, power-on self testing [POST] or distributed testing
    • G11C29/12 Built-in arrangements for testing, e.g. built-in self testing [BIST] or interconnection details
    • G11C29/38 Response verification devices
    • G11C29/42 Response verification devices using error correcting codes [ECC] or parity check

Definitions

  • This application relates to the storage field, and in particular to a storage verification method and device.
  • Intel and Micron launched a new mainstream memory chip technology, 3D XPoint.
  • Compared with NAND flash, 3D XPoint offers 10 times lower latency, 3 times the write endurance, 4 times the writes per second, 3 times the reads per second, and 30% of the power consumption.
  • 3D XPoint helps computers acquire and process, at higher speed, the large amounts of data generated by networked devices.
  • A 3D XPoint memory includes multiple chips (dies), each die includes multiple partitions, and each partition includes multiple data pages. The data page is the smallest write and read granularity.
  • Figure 1 shows the storage structure of a die in 3D XPoint memory chip technology. As shown in Figure 1, a die includes 16 partitions, denoted by P0, P1, ..., P15.
  • A row of a die is a stripe. The storage space of a stripe consists of multiple data pages, and each smallest rectangular box in Figure 1 represents one data page.
  • User data and the corresponding ECC check data are carried on one stripe of a die, and the ECC check data on a die is used to check the user data on that same die.
  • For example, the user data size is 4 kB, the storage space of one stripe of a die is 256 bytes, and the storage space of each data page is 16 bytes.
  • The 4 kB of user data can be scattered across 17 dies, for example die0 to die16.
  • The 16 data pages of one stripe in each of die0 to die16 carry part of the user data and the corresponding ECC check data. For example, the part of the user data occupies 240 bytes and the ECC check data occupies 16 bytes.
  • The RAID/EC parity data is carried on die17.
  • This storage and verification method is more suitable for user data with larger granularity, such as 4kB or 2kB. How to store and verify user data with a small granularity (such as 256 bytes) is a technical problem that needs to be solved.
  • This application provides a storage verification method and device for storing and verifying user data with a relatively small granularity.
  • In a first aspect, a storage verification method is provided. The method is implemented by the following steps: performing error checking and correcting (ECC) on user data to obtain ECC check data, where the user data includes m data slices and m is an integer greater than 2; performing a redundant array of independent disks (RAID) check or an erasure code (EC) check on the user data and the ECC check data to obtain parity data; and storing the m data slices, the ECC check data, and the parity data on multiple chips.
  • Each of m chips among the multiple chips carries one data slice, and different chips carry different data slices. n first chips other than the m chips carry the ECC check data, and p second chips other than the m chips carry the parity data, where n and p are positive integers.
  • With this method, the data slices are scattered across m dies, and the ECC check data is stored on n dies different from the m dies.
  • The ECC check data can check the multiple data slices, so storage verification of small-granularity user data is achieved. Storing the parity data on p dies different from the m and n dies further ensures data reliability.
  • With the existing storage verification method, the ECC check data on a die checks the user data on that same die. When user data of small granularity is stored, the user data and the ECC check data may occupy multiple stripes, which causes read delay. With the method of the embodiments of the present application, the ECC check data can check the data slices on multiple dies, which avoids that read delay.
  • The parity data is generated independently from a single group consisting of the user data (made up of one group of data slices) and its ECC check data, so no additional write amplification is introduced.
  • In existing methods, if one set of parity data is generated from multiple groups of data, the parity data must be regenerated whenever any one group changes, which leads to a large volume of parity writes.
  • In the embodiments of the present application, the user data composed of multiple data slices, together with the ECC check data, is treated as one group of data, and the parity data is generated from that group alone.
  • In one possible design, any one of the multiple chips includes multiple partitions; one or more partitions on any one of the m chips carry one data slice; one or more partitions on the first chip carry the ECC check data; and one or more partitions on the second chip carry the parity data.
  • In one possible design, the multiple partitions are on one stripe, where a stripe is one row of a die.
  • In one possible design, any one of the multiple chips includes multiple partitions, and a partition on any one of the multiple chips includes multiple data pages; one or more data pages on any one of the m chips carry one data slice; one or more data pages on the first chip carry the ECC check data; and one or more data pages on the second chip carry the parity data.
  • In one possible design, the multiple data pages are on one stripe, where a stripe is one row of a die.
  • In a second aspect, a storage verification method is provided, which can be implemented by the following steps: reading multiple chips, where the multiple chips store user data, ECC check data, and parity data, the user data includes multiple data slices, each of m chips among the multiple chips carries one data slice, different chips carry different data slices, n first chips other than the m chips carry the ECC check data, p second chips other than the m chips carry the parity data, and n and p are positive integers; if the data stored on p third chips among the multiple chips is lost, performing a RAID check or an EC check for the p third chips to obtain the recovered data of the p third chips; and performing an ECC check on the data of the m chips using the ECC check data to obtain the user data, where, if the p third chips include any of the m chips, the data of the m chips includes the recovered data.
  • With this method, user data of small granularity is sliced, the data slices are scattered across m dies, and the ECC check data is stored on n dies different from the m dies. The ECC check data can check the multiple data slices, so storage verification of small-granularity user data is achieved.
  • With the existing storage verification method, the ECC check data on a die checks the user data on that same die, so small-granularity user data and its ECC check data may occupy multiple stripes, which causes read delay.
  • With the method of the embodiments of the present application, the ECC check data can check the data slices on multiple dies, which avoids that read delay.
  • The user data composed of multiple data slices, together with the ECC check data, is treated as one group of data, and the parity data is generated independently from that group, so no additional write amplification is introduced.
  • In one possible design, the maximum number of bits that the ECC check data can correct is not less than the number of error bits in the data of the m chips. This ensures reliability when die data is lost.
  • A storage verification device is provided, which has the function of implementing the first aspect, the second aspect, any possible design of the first aspect, or any possible design of the second aspect.
  • The function can be realized by hardware, or by hardware executing corresponding software.
  • The hardware or software includes one or more modules corresponding to the above-mentioned functions.
  • In a fourth aspect, a storage system is provided. The storage system includes a processing device and a storage device, and the storage device can be used to implement the functions of the first aspect, the second aspect, any possible design of the first aspect, or any possible design of the second aspect.
  • A computer storage medium is provided, which stores a computer program, and the computer program includes instructions for executing the methods of the above aspects and of any possible design of those aspects.
  • A computer program product containing instructions is provided, which, when run on a computer, causes the computer to execute the methods of the above aspects and of any possible design of those aspects.
  • Figure 1 is a schematic diagram of the structure of a 3D XPoint memory chip in the prior art
  • FIG. 2 is a schematic diagram of the storage system architecture in an embodiment of the application
  • FIG. 3 is a schematic flowchart of a storage verification method in an embodiment of the application.
  • FIG. 4 is a schematic diagram of the storage structure in an embodiment of the application.
  • Figure 5 is a schematic diagram of a verification method in an embodiment of the application.
  • FIG. 6 is one of the schematic structural diagrams of the storage verification device in an embodiment of the application.
  • FIG. 7 is the second structural diagram of the storage verification device in the embodiment of the application.
  • FIG. 8 is the third structural diagram of the storage verification device in an embodiment of the application.
  • the embodiments of the present application provide a storage verification method and device for storing and verifying user data with a relatively small granularity.
  • The method and the device are based on the same concept. Because they solve the problem on similar principles, the implementations of the device and the method can refer to each other, and repeated content is not described again.
  • "And/or" describes the association relationship of the associated objects and indicates that three relationships are possible. For example, "A and/or B" can mean: A alone exists, both A and B exist, or B alone exists.
  • The character "/" generally indicates that the associated objects are in an "or" relationship.
  • "At least one" in this application means one or more; "multiple" means two or more.
  • Words such as "first" and "second" are used only to distinguish the items described, and are not to be understood as indicating or implying relative importance or order.
  • FIG. 2 shows the architecture of a possible storage system to which the storage verification method provided in the embodiment of the present application is applicable.
  • the storage system 200 includes a processing device 201 and a storage device 202.
  • the processing device 201 may be a computer host, for example.
  • the storage device 202 may include storage servers and interface servers.
  • The storage device 202 has its own interface and protocol. It is connected to the processing device 201 through coaxial cables, network cables, optical fibers, or the like, and acts as a data storage center that provides storage services for the processing device 201.
  • the storage device 202 in this application is mainly used to store user data and verification data.
  • one I/O operation is divided into two aspects: writing and reading.
  • the processing device 201 initiates a write request through a network such as a SAN, and the storage device 202 receives a write request from the processing device 201.
  • the write request is used to request to write user data to the storage device 202.
  • For reading, the storage device 202 receives a read request from the processing device 201 and reads the user data and the verification data. When the user data matches the verification data, the storage device 202 returns the user data to the processing device 201.
  • the method provided in the embodiments of this application can be applied to any application scenario that uses ECC and RAID/EC verification methods.
  • This application takes 3D XPoint storage media as an example for introduction, but it is not limited to this and can also be applied to other storage media.
  • The method provided by the embodiments of the present application can store and verify user data with a relatively small granularity, for example user data whose size is 128 bytes, 256 bytes, or 512 bytes.
  • The embodiments of this application target user data with relatively small granularity. Given the storage structure of a die, if the storage size of one row (that is, one stripe) of a die is not enough to hold both the user data and its ECC check data, the method provided in the embodiments of this application is used for storage verification.
  • the user data size is 256 bytes.
  • A die may include 16 partitions, the size of a data page in a partition is 16 bytes, and the storage size of one stripe of a die is 256 bytes. Therefore, if the prior-art method is adopted and one stripe of a die stores the 256 bytes of user data, the ECC check data of that user data can only be stored in another stripe of the die, which increases read latency.
  • the method provided by the embodiments of the present application can help solve the problem of read delay.
  • S301 Perform ECC verification on user data to obtain ECC verification data.
  • the user data includes multiple pieces of data.
  • user data can be fragmented first to obtain multiple fragmented data.
  • a slice of data can occupy part or all of the data pages in a strip of a die. Determine the number of fragments according to the size of user data and the storage space of data pages. For example, if the user data size is 256byte, and the storage space of a data page is 16byte, it can be fragmented into 16 fragments of data, and one fragment of data occupies exactly one data page.
  • the user data can be fragmented so that the fragmented data occupies two or more data pages.
  • S302 Perform a RAID check or an EC check on the user data and the ECC check data to obtain parity data.
  • S303: Store the obtained data slices, the ECC check data, and the parity data on multiple chips.
  • The multiple chips may include m chips, n chips, and p chips.
  • The m chips, n chips, and p chips are different chips among the multiple chips: the m chips carry the m data slices, the n chips carry the ECC check data, and the p chips carry the parity data.
  • m is an integer greater than 1
  • both n and p are positive integers.
  • The value of p is related to the number of dies that can be recovered.
  • For example, the user data and the check data can be stored on (m+2) dies, where the check data includes the ECC check data and the parity data.
  • Each of the m dies in the (m+2) die carries one piece of data, and different dies carry different pieces of data.
  • The m data slices into which the user data is divided are numbered in sequence, and the m data slices may be placed in order on the dies with the corresponding numbers.
  • ECC check data and parity check data are respectively carried on two dies other than m dies.
  • the ECC check data is carried on the first chip other than m dies.
  • the second chip carries parity data.
  • any one of the (m+2) dies includes multiple partitions.
  • one or more partitions on any one of the m dies carry one piece of data.
  • the number of partitions carrying fragmented data is determined according to the size of user data and the storage space of data pages.
  • one or more partitions on the first chip carry ECC verification data.
  • One or more partitions on the second chip carry the parity data.
  • the number of partitions carrying ECC check data is determined according to the size of the ECC check data and the storage space of the data page.
  • the number of partitions carrying parity data is determined according to the size of the parity data and the storage space of the data page.
  • the size of the ECC check data is determined by the size of the user data
  • the size of the parity check data is determined by the size of the user data and the ECC check data.
  • When multiple partitions carry one data slice, the ECC check data, or the parity data, those partitions can be on one stripe, which can reduce the read latency.
  • Any die in the (m+2) dies includes multiple partitions, one partition includes multiple data pages, and one data page is the smallest read-write unit. One or more data pages on any one of the m dies then carry one data slice. The number of data pages carrying a data slice is determined according to the size of the user data and the storage space of a data page. Similarly, one or more data pages on the first chip carry the ECC check data, and one or more data pages on the second chip carry the parity data. The number of data pages carrying the ECC check data is determined according to the size of the ECC check data and the storage space of a data page. The number of data pages carrying the parity data is determined according to the size of the parity data and the storage space of a data page.
  • the size of the ECC check data is determined by the size of the user data
  • the size of the parity check data is determined by the size of the user data and the ECC check data.
  • When multiple data pages carry one data slice, the multiple data pages can be on one stripe; when multiple data pages carry the ECC check data, they can be on one stripe; and when multiple data pages carry the parity data, they can be on one stripe. This can reduce the read delay.
  • the user data is 256byte and the storage size of a data page in die is 16byte, then the user data can be divided into 16 pieces of data, and the size of one piece of data is 16byte.
  • The entire user data occupies 16 dies. Together, the user data and the check data occupy 18 dies. As shown in Figure 4, assume that the 18 dies are denoted die0, die1, ..., die17. The 16 data slices can be stored on die0 to die15 respectively, and each data slice is denoted by the letter d. The ECC check data is stored on die16, and the parity data is stored on die17.
  • the storage location is only an example, and the embodiment of the present application may also have other storage methods, for example, two or more data pages of a strip are occupied on a die to store fragmented data.
  • user data composed of multiple pieces of data and ECC check data may be equivalent to a set of data, and parity data is generated independently by combining user data composed of a group of pieces of data and ECC check data. No additional write amplification is introduced.
  • The user data and the check data are scattered across multiple dies, and the user data or check data carried on each die does not exceed one stripe, so the dies can be accessed concurrently on a read, which reduces the read delay.
  • The p chips whose data is lost can be any p chips among the multiple chips.
  • For example, the lost data may be on a chip that stores a data slice, on the first chip that stores the ECC check data, or on the second chip that stores the parity data.
  • The p chips that have lost data are called the p third chips.
  • The parity data stored on the second chip is used for the RAID check.
  • If the parity data on the second chip is itself lost, the parity data is recalculated from the data slices and the ECC check data, which restores it.
  • When the RAID check is used, the parity data stored on the second chip is used to perform the RAID check and obtain the recovered data of the third chip.
  • When the EC check is used, the parity data stored on the second chip is used to perform the EC check and obtain the recovered data of the third chip.
  • In the embodiments of the present application, the ECC check is performed on the multiple data slices and the resulting ECC check data is stored on one die, such as the first chip; the RAID/EC check is performed on the data slices and the ECC check data, and the resulting parity data is stored on another die, such as the second chip.
  • On a read, the RAID/EC check is used first to restore the data of the third chip and obtain the recovered data.
  • The ECC check is then applied jointly to the recovered data of the third chip and to the data slices on the other chips that store data slices, so as to obtain the correct user data.
  • The number of error bits in the recovered data obtained by the RAID/EC check is at most the sum of the numbers of error bits in the data slices on the other chips. Therefore, for the ECC check, the maximum number of bits that the ECC check data can correct must be not less than the total number of error bits in the data on the m chips, where the data on the m chips includes the recovered data of the third chip.
  • For example, the user data is 256 bytes, and the storage size of a data page in a die is 16 bytes.
  • The user data is sliced into 16 data slices, which can be stored on die0 to die15.
  • An ECC check is performed on the user data composed of the 16 data slices to obtain the ECC check data, and the ECC check data is stored on die16.
  • A RAID/EC check is performed on the user data and the ECC check data to obtain the parity data, and the parity data is stored on die17.
  • In data storage, error bits are bound to occur. Assume that both the user data and the check data have error bits, and that die2 is lost during reading. First, the parity data is used to perform a RAID/EC check on the data slices on die0, die1, die3, ..., die15 and the ECC check data on die16, which yields the recovered data of die2. Because the stored data contains error bits, the recovered data of die2 will also contain error bits. Assuming that there are x error bits in the data slices on the dies other than die2, there are at most x error bits in the recovered data of die2, and therefore at most 2x error bits in the entire user data.
  • The ECC check data on die16 is then used to check the data slices on die0, die1, and die3 to die15 together with the recovered data of die2, in order to obtain the data slice that was lost on die2. If the user data contains 2x error bits, then when the error correction capability of the ECC is greater than or equal to 2x bits, the data of die2 can be read correctly, and so can the entire user data.
  • An error correction capability greater than or equal to 2x bits means that the number of error bits the ECC can correct is greater than or equal to 2x. This ensures reliability when the data of a die is lost.
  • More generally, when p dies are lost, the recovered data of the p dies contains at most p times x error bits, so the entire user data contains at most (p+1) times x error bits.
  • The error correction capability of the ECC is then greater than or equal to (p+1) times x bits, that is, the number of error bits that the ECC can correct is greater than or equal to (p+1) times x. This ensures reliability when die data is lost.
  • An embodiment of the present application further provides a storage verification device 600, which is configured to perform the write-side operations of the foregoing storage verification method.
  • The storage verification device 600 includes a verification unit 601 and a storage unit 602, where:
  • The verification unit 601 is configured to perform error checking and correcting (ECC) on user data to obtain ECC check data, where the user data includes m data slices and m is an integer greater than 2; and to perform a redundant array of independent disks (RAID) check or an erasure code (EC) check on the user data and the ECC check data to obtain parity data.
  • The storage unit 602 is configured to store the m data slices, the ECC check data, and the parity data on (m+2) chips, where each of m chips among the (m+2) chips carries one data slice, and different chips carry different data slices.
  • n first chips other than the m chips carry the ECC check data.
  • p second chips other than the m chips carry the parity data.
  • n and p are positive integers.
  • Any one of the (m+2) chips includes multiple partitions.
  • One or more partitions on any one of the m chips carry one data slice.
  • One or more partitions on the first chip carry the ECC check data.
  • One or more partitions on the second chip carry the parity data.
  • The multiple partitions are on one stripe.
  • Any one of the (m+2) chips includes multiple partitions, and one partition on any one of the (m+2) chips includes multiple data pages.
  • One or more data pages on any one of the m chips carry one data slice.
  • One or more data pages on the first chip carry the ECC check data.
  • One or more data pages on the second chip carry the parity data.
  • The multiple data pages are on one stripe.
  • An embodiment of the present application also provides a storage verification device 700, which is configured to perform the read-side operations of the foregoing storage verification method.
  • The storage verification device 700 includes a reading unit 701 and a verification unit 702, where:
  • The reading unit 701 is configured to read (m+2) chips. The (m+2) chips store user data, ECC check data, and parity data, and the user data includes multiple data slices. Each of m chips among the (m+2) chips carries one data slice, and different chips carry different data slices.
  • A first chip among the (m+2) chips, other than the m chips, carries the ECC check data, and a second chip among the (m+2) chips, other than the m chips, carries the parity data.
  • The verification unit 702 is configured to, when the data slice stored on a third chip among the m chips is lost, use the parity data to perform a redundant array of independent disks (RAID) check or an erasure code (EC) check to obtain the recovered data of the third chip, and to use the ECC check data to perform an ECC check on the data of the m chips to obtain the user data, where the data of the m chips includes the recovered data.
  • The maximum number of bits that the ECC check data can correct is not less than the number of error bits in the data of the m chips.
  • An embodiment of the present application further provides a storage verification device 800, which is configured to execute the foregoing storage verification method.
  • The storage verification device 800 includes a processor 801 and a memory 802, where:
  • The processor 801 is configured to receive a write request for writing user data into the chips and a read request for reading the chips, and to perform the verification operations described in the foregoing method embodiments.
  • The memory 802 may be equivalent to the chips in the foregoing method embodiments and is used to store the user data and the check data.
  • the processor 801 in the storage verification apparatus 800 may also be used to perform other operations in the foregoing method embodiments, and details are not described herein again.
  • the embodiments of the present application can be provided as methods, systems, or computer program products. Therefore, the present application may adopt the form of a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware. Moreover, this application may adopt the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program codes.
  • These computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing equipment to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture that includes an instruction device.
  • The instruction device implements the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing equipment, so that a series of operation steps is executed on the computer or other programmable equipment to produce computer-implemented processing.
  • The instructions executed on the computer or other programmable equipment thereby provide steps for implementing the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.
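The error budget stated earlier for the die-loss case (at most x error bits in the surviving slices, at most p times x more in the data rebuilt for p lost dies) can be written as a one-line check. This is only an illustrative sketch; the function names and the sample numbers are assumptions and do not come from the application.

```python
def required_correction_bits(x, p):
    """Worst-case error bits in the user data after rebuilding p lost dies.

    x: error bits assumed present in the slices on the surviving dies.
    p: number of lost dies whose data is rebuilt by the RAID/EC check.
    """
    return (p + 1) * x


def readable(ecc_capability_bits, x, p):
    """True when the ECC can correct the worst-case number of error bits."""
    return ecc_capability_bits >= required_correction_bits(x, p)


# Hypothetical numbers: 3 error bits in the surviving slices, one lost die.
print(required_correction_bits(3, 1))
print(readable(8, 3, 1), readable(5, 3, 1))
```

With one lost die the requirement reduces to the 2x bound described above.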

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Techniques For Improving Reliability Of Storages (AREA)

Abstract

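A storage verification method and device, for storing and verifying user data of relatively small granularity. The method includes: performing error checking and correcting (ECC) on user data to obtain ECC check data, where the user data includes m data slices and m is an integer greater than 2; performing a redundant array of independent disks (RAID) check or an erasure code (EC) check on the user data and the ECC check data to obtain parity data; and storing the m data slices, the ECC check data, and the parity data on multiple chips, where each of m chips among the multiple chips carries one data slice, different chips carry different data slices, n first chips among the multiple chips other than the m chips carry the ECC check data, and p second chips among the multiple chips other than the m chips carry the parity data.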

Description

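Storage verification method and device
Technical field
This application relates to the storage field, and in particular to a storage verification method and device.
Background
With the rapid development of computer technology, the volume of user data keeps growing. A single hard disk inside the computer can no longer meet the storage needs of user data, and an independent storage system is needed to store it. While user data is stored and transmitted inside the storage system, it may be damaged by hardware failures, software failures, or disk errors of the storage system. In the prior art, data protection techniques are usually used to improve the security and reliability of data storage, for example parity, redundant arrays of independent drives (RAID), erasure coding (EC), or error correcting code (ECC). Some storage systems combine several data protection techniques, for example ECC with EC, or ECC with RAID. ECC is used to correct bit errors, and RAID/EC is used to recover data slices.
Intel and Micron launched a new mainstream memory chip technology, 3D XPoint. Compared with NAND flash, 3D XPoint offers 10 times lower latency, 3 times the write endurance, 4 times the writes per second, 3 times the reads per second, and 30% of the power consumption. 3D XPoint helps computers acquire and process, at higher speed, the large amounts of data generated by networked devices. A 3D XPoint memory includes multiple chips (dies), each die includes multiple partitions, and each partition includes multiple data pages; the data page is the smallest write and read granularity. Figure 1 shows the storage structure of a die in 3D XPoint memory chip technology. As shown in Figure 1, a die includes 16 partitions, denoted by P0, P1, ..., P15. A row of a die is a stripe, the storage space of a stripe consists of multiple data pages, and each smallest rectangular box represents one data page. User data and the corresponding ECC check data are carried on one stripe of a die, and the ECC check data on a die is used to check the user data on that die. For example, the user data size is 4 kB, the storage space of one stripe of a die is 256 bytes, and the storage space of each data page is 16 bytes. The 4 kB of user data can be scattered across 17 dies, for example die0 to die16. The 16 data pages of one stripe in each of die0 to die16 carry part of the user data and the corresponding ECC check data; for example, the part of the user data occupies 240 bytes and the ECC check data occupies 16 bytes. The RAID/EC parity data is carried on die17.
This storage and verification scheme suits user data of larger granularity, such as 4 kB or 2 kB. How to store and verify user data of small granularity (such as 256 bytes) is a technical problem that needs to be solved.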
发明内容
本申请提供一种存储校验方法及装置,用以存储和校验粒度比较小的用户数据。
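According to a first aspect, a storage verification method is provided. The method is implemented through the following steps: performing an error checking and correction (ECC) check on user data to obtain ECC check data, where the user data includes m data shards and m is an integer greater than 2; performing a redundant array of independent disks (RAID) check or an erasure coding (EC) check on the user data and the ECC check data to obtain parity data; and storing the m data shards, the ECC check data, and the parity data on a plurality of chips. Each of m chips among the plurality of chips carries one data shard, and different chips carry different data shards; n first chips among the plurality of chips, other than the m chips, carry the ECC check data; p second chips among the plurality of chips, other than the m chips, carry the parity data; and n and p are positive integers. With this method, the data shards are spread across m dies, and the ECC check data is stored on n dies different from the m dies. Because this ECC check data can check all of the data shards, storage and verification of small-granularity user data can be achieved. Storing the parity data on p dies different from the m dies and the n dies further ensures data reliability. With the existing storage verification method, the ECC check data on a die checks the user data on the same die; when user data of small granularity is stored, the user data and the ECC check data may occupy multiple stripes, which causes read latency. With the method of the embodiments of this application, the ECC check data can check data shards on multiple dies, which avoids that read latency. In existing methods, if multiple groups of data are used to generate one group of parity data, the parity data must be regenerated whenever one group of data changes, which results in a large volume of parity writes. In the embodiments of this application, the user data formed by the data shards, together with the ECC check data, can be regarded as one group of data; generating parity data independently for that group introduces no additional write amplification.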
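In a possible design, any one of the plurality of chips includes multiple partitions; one or more partitions on any one of the m chips carry one data shard; one or more partitions on the first chip carry the ECC check data; and one or more partitions on the second chip carry the parity data.
In a possible design, the multiple partitions are on one stripe, where a stripe is one row of the dies.
In a possible design, any one of the plurality of chips includes multiple partitions, and one partition on any one of the plurality of chips includes multiple data pages; one or more data pages on any one of the m chips carry one data shard; one or more data pages on the first chip carry the ECC check data; and one or more data pages on the second chip carry the parity data.
In a possible design, the multiple data pages are on one stripe, where a stripe is one row of the dies.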
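According to a second aspect, a storage verification method is provided. The method can be implemented through the following steps: reading a plurality of chips, where the plurality of chips store user data, ECC check data, and parity data; the user data includes multiple data shards; each of m chips among the plurality of chips carries one data shard, and different chips carry different data shards; n first chips among the plurality of chips, other than the m chips, carry the ECC check data; p second chips among the plurality of chips, other than the m chips, carry the parity data; and n and p are positive integers. If the data stored on p third chips among the plurality of chips is lost: performing a RAID check or an EC check for the p third chips to obtain recovered data for the p third chips; and performing an ECC check on the data of the m chips by using the ECC check data to obtain the user data, where, if the p third chips include any of the m chips, the data of the m chips includes the recovered data. With this method, user data of relatively small granularity is sharded, the data shards are spread across m dies, and the ECC check data is stored on n dies different from the m dies. Because this ECC check data can check all of the data shards, storage and verification of small-granularity user data can be achieved. Storing the parity data on p dies different from the m dies and the n dies further ensures data reliability. With the existing storage verification method, the ECC check data on a die checks the user data on the same die; when user data of small granularity is stored, the user data and the ECC check data may occupy multiple stripes, which causes read latency. With the method of the embodiments of this application, the ECC check data can check data shards on multiple dies, which avoids that read latency. In existing methods, if multiple groups of data are used to generate one group of parity data, the parity data must be regenerated whenever one group of data changes, which results in a large volume of parity writes. In the embodiments of this application, the user data formed by the data shards, together with the ECC check data, can be regarded as one group of data; generating parity data independently for that group introduces no additional write amplification.
In a possible design, the maximum number of bits that the ECC check data can correct is not less than the number of erroneous bits present in the data of the m chips. This ensures reliability when die data is lost.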
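According to a third aspect, a storage verification apparatus is provided. The apparatus has the function of implementing the first aspect, the second aspect, any possible design of the first aspect, or any possible design of the second aspect. The function may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the function.
According to a fourth aspect, a storage system is provided. The storage system includes a processing device and a storage device, and the storage device can be used to perform the function of the first aspect, the second aspect, any possible design of the first aspect, or any possible design of the second aspect.
According to a fifth aspect, a computer storage medium is provided that stores a computer program, where the computer program includes instructions for performing the methods in the foregoing aspects and in any possible design of those aspects.
According to a sixth aspect, a computer program product containing instructions is provided. When the product runs on a computer, the computer performs the methods described in the foregoing aspects and in any possible design of those aspects.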
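Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of a 3D XPoint memory chip in the prior art;
FIG. 2 is a schematic diagram of a storage system architecture according to an embodiment of this application;
FIG. 3 is a schematic flowchart of a storage verification method according to an embodiment of this application;
FIG. 4 is a schematic diagram of a storage structure according to an embodiment of this application;
FIG. 5 is a schematic diagram of a verification manner according to an embodiment of this application;
FIG. 6 is a first schematic structural diagram of a storage verification apparatus according to an embodiment of this application;
FIG. 7 is a second schematic structural diagram of a storage verification apparatus according to an embodiment of this application;
FIG. 8 is a third schematic structural diagram of a storage verification apparatus according to an embodiment of this application.
Detailed Description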
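The embodiments of this application provide a storage verification method and apparatus for storing and verifying user data of relatively small granularity. The method and the apparatus are based on the same concept. Because they solve the problem on similar principles, the implementations of the apparatus and the method may refer to each other, and repeated content is not described again. In the description of the embodiments of this application, "and/or" describes an association between associated objects and indicates that three relationships may exist. For example, "A and/or B" may indicate three cases: A alone, both A and B, and B alone. The character "/" generally indicates an "or" relationship between the associated objects. In this application, "at least one" means one or more, and "multiple" means two or more. In addition, terms such as "first" and "second" in the description of this application are used only to distinguish between descriptions, and are not to be understood as indicating or implying relative importance or order.
FIG. 2 shows the architecture of a possible storage system to which the storage verification method provided in the embodiments of this application is applicable. As shown in FIG. 2, a storage system 200 includes a processing device 201 and a storage device 202. The processing device 201 may be, for example, a computer host. The storage device 202 may include devices such as a storage server and an interface server. The storage device 202 has its own interfaces and protocols, is connected to the processing device 201 through a coaxial cable, a network cable, an optical fiber, or the like, and serves as the data storage center that provides storage services for the processing device 201. In this application, the storage device 202 is mainly used to store user data and check data. When the storage system 200 is used, an I/O operation has a write side and a read side. On the write side, the processing device 201 initiates a write request over a network such as a SAN, and the storage device 202 receives the write request from the processing device 201; the write request asks for user data to be written to the storage device 202. On the read side, the storage device 202 receives a read request from the processing device 201 and reads the user data and the check data; when the user data matches the check data, the storage device 202 returns the user data to the processing device 201.
The method provided in the embodiments of this application is applicable to any application scenario that uses ECC together with RAID/EC verification. This application uses a 3D XPoint storage medium as an example, but the method is not limited to it and can also be applied to other storage media. The method can store and verify user data of relatively small granularity, for example user data of 256 bytes, 128 bytes, or 512 bytes. For such user data, and given the storage structure of a die, the method is used when the storage size of one row of a die (that is, one stripe) is not enough to hold both the user data and its ECC check data. For example, the user data is 256 bytes. As shown in FIG. 1, one die may include 16 partitions, one data page in a partition is 16 bytes, and one stripe of one die holds 256 bytes. With the prior-art method, one stripe in one die stores the 256 bytes of user data, so the ECC check data for that user data can only be stored on another stripe in the same die, which increases read latency. The method provided in the embodiments of this application helps solve this read latency problem.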
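The stripe-capacity condition in the paragraph above can be written as a one-line check. This is only an illustrative sketch: the function name is hypothetical, and the 16-byte ECC size used for the 256-byte case is an assumption borrowed from the 240-byte/16-byte prior-art example, since the source does not state the ECC size for 256 bytes of user data.

```python
def fits_in_one_stripe(user_bytes: int, ecc_bytes: int, stripe_bytes: int = 256) -> bool:
    """Return True if user data and its ECC check data fit in a single die stripe."""
    return user_bytes + ecc_bytes <= stripe_bytes


# Prior-art layout from the background: 240 B of user data + 16 B of ECC per stripe.
print(fits_in_one_stripe(240, 16))

# Small-granularity case: 256 B of user data already fills the stripe,
# so any ECC check data (16 B assumed here) would spill onto a second stripe.
print(fits_in_one_stripe(256, 16))
```

When the check returns False, the scheme described here applies: the user data is sharded across dies rather than placed on a second stripe of the same die.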
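The storage verification method of this application is described in detail below with reference to FIG. 3.
S301. Perform an ECC check on the user data to obtain ECC check data.
The user data includes multiple data shards. For example, the user data may first be sharded to obtain the data shards. To reduce read latency, one data shard may occupy some or all of the data pages in one stripe of one die. The number of shards is determined from the size of the user data and the storage space of a data page. For example, if the user data is 256 bytes and one data page holds 16 bytes, the user data can be split into 16 data shards, each of which occupies exactly one data page. Depending on the size of the user data, the user data may also be sharded so that each data shard occupies two or more data pages.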
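The sharding step can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `split_into_shards` and the zero-padding of a short final shard are assumptions, since the source only states that the shard count follows from the user data size and the data page size.

```python
def split_into_shards(user_data: bytes, page_size: int = 16) -> list[bytes]:
    """Split user data into page-sized shards (one shard per data page)."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    shards = []
    for offset in range(0, len(user_data), page_size):
        chunk = user_data[offset:offset + page_size]
        # Assumption: pad a short final chunk with zeros so every shard fills a page.
        shards.append(chunk.ljust(page_size, b"\x00"))
    return shards


user_data = bytes(range(256))          # 256 bytes of example user data
shards = split_into_shards(user_data)  # 16-byte data pages, as in the example above
print(len(shards), len(shards[0]))
```

Passing a larger `page_size` (for example 32) models the case where one shard occupies two data pages of a stripe.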
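S302. Perform a RAID check or an EC check on the user data and the ECC check data to obtain parity data.
It should be noted that, for the specific methods of the ECC check, the RAID check, and the EC check in the embodiments of this application, reference may be made to the prior art; details are not described here.
S303. Store the obtained data shards, the ECC check data, and the parity data on a plurality of chips.
If the user data is divided into m data shards, the plurality of chips may include m chips, n chips, and p chips. The m chips, the n chips, and the p chips are different chips among the plurality of chips: the m chips carry the m data shards, the n chips carry the ECC check data, and the p chips carry the parity data. m is an integer greater than 1, and n and p are positive integers. In general, the value of p is related to the number of dies that can be recovered. The embodiments of this application use n = 1 and p = 1 as an example. It can be understood that applying the example method to scenarios in which n and p are integers greater than 1 also falls within the protection scope of this application.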
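If the user data is divided into m data shards, the user data and the check data can be stored on (m+2) dies, where the check data includes the ECC check data and the parity data. Each of m dies among the (m+2) dies carries one data shard, and different dies carry different data shards. Optionally, the m data shards are numbered in sequence and placed in turn on the dies with the corresponding numbers. The ECC check data and the parity data are carried on the two dies other than the m dies: for example, the ECC check data on a first chip other than the m dies, and the parity data on a second chip other than the m dies. Assuming that the (m+2) dies are denoted die0, die1, ..., die(m+1), the m different data shards can be carried on die0, die1, ..., die(m-1) respectively, the ECC check data on die m, and the parity data on die(m+1).
Any one of the (m+2) dies includes multiple partitions. One or more partitions on any one of the m dies carry one data shard. The number of partitions that carry a data shard is determined by the size of the user data and the storage space of a data page. Similarly, one or more partitions on the first chip carry the ECC check data, and one or more partitions on the second chip carry the parity data. The number of partitions that carry the ECC check data is determined by the size of the ECC check data and the storage space of a data page. The number of partitions that carry the parity data is determined by the size of the parity data and the storage space of a data page. In general, the size of the ECC check data is determined by the size of the user data, and the size of the parity data is determined by the sizes of the user data and the ECC check data. Further, when multiple partitions carry one data shard, those partitions may be on one stripe; similarly, when multiple partitions carry the ECC check data, those partitions may be on one stripe; and when multiple partitions carry the parity data, those partitions may be on one stripe. This reduces read latency.
Any one of the (m+2) dies includes multiple partitions, one partition includes multiple data pages, and a data page is the minimum read/write unit. In that case, one or more data pages on any one of the m dies carry one data shard. The number of data pages that carry a data shard is determined by the size of the user data and the storage space of a data page. Similarly, one or more data pages on the first chip carry the ECC check data, and one or more data pages on the second chip carry the parity data. The number of data pages that carry the ECC check data is determined by the size of the ECC check data and the storage space of a data page. The number of data pages that carry the parity data is determined by the size of the parity data and the storage space of a data page. In general, the size of the ECC check data is determined by the size of the user data, and the size of the parity data is determined by the sizes of the user data and the ECC check data. Further, when multiple data pages carry one data shard, those data pages may be on one stripe; similarly, when multiple data pages carry the ECC check data, those data pages may be on one stripe; and when multiple data pages carry the parity data, those data pages may be on one stripe. This reduces read latency.
For example, assume that the user data is 256 bytes and one data page in a die holds 16 bytes. The user data can then be divided into 16 data shards of 16 bytes each, so the user data as a whole occupies 16 dies. Together, the user data and the check data occupy 18 dies. As shown in FIG. 4, the 18 dies are denoted die0, die1, ..., die17. A different one of the 16 data shards can be stored on each of die0 to die15; the data shards are denoted by the letter d. The ECC check data is stored on die16, and the parity data is stored on die17. These storage locations are only an example. Other storage arrangements are possible in the embodiments of this application, for example using two or more data pages of one stripe on a die to store a data shard.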
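The die assignment for the n = 1, p = 1 case can be sketched as a small mapping, following the die0, die1, ..., die(m+1) numbering given earlier (shards on die0 to die(m-1), ECC check data on die m, parity data on die(m+1)). The function name and the dictionary layout are illustrative assumptions, not part of the source.

```python
def assign_dies(m: int) -> dict[str, list[int]]:
    """Map each role to die indices for m data shards, with n = 1 and p = 1."""
    if m < 2:
        raise ValueError("the user data must be split into at least two shards")
    return {
        "data": list(range(m)),  # die0 .. die(m-1): one data shard per die
        "ecc": [m],              # die m: ECC check data (the first chip)
        "parity": [m + 1],       # die(m+1): RAID/EC parity data (the second chip)
    }


layout = assign_dies(16)  # 256-byte user data, 16-byte data pages
print(layout["data"][0], layout["data"][-1], layout["ecc"], layout["parity"])
```

With m = 16 this gives 18 dies in total, each holding at most one stripe's worth of data, so all dies can be read concurrently.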
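If multiple groups of data are used to generate one group of parity data, the parity data must be regenerated whenever one group of data changes, which results in a large volume of parity writes. In the embodiments of this application, the user data formed by the data shards, together with the ECC check data, can be regarded as one group of data; generating parity data independently for that group introduces no additional write amplification.
S304. Read the plurality of chips.
Because user data of small granularity is sharded and the user data and the check data are spread across multiple dies, with the user data or check data on each die not exceeding one stripe, the dies can be accessed concurrently during a read, which reduces read latency.
If the data stored on p chips among the plurality of chips is lost, S305 and S306 are performed. The p chips may be any p chips among the plurality of chips: the lost data may be on a chip that stores a data shard, on the first chip that stores the ECC check data, or on the second chip that stores the parity data. The p chips that lost data are referred to as the p third chips.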
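S305. Perform a RAID check or an EC check by using the parity data, to obtain the recovered data for the p third chips.
If the chips that lost data are chips that store data shards and ECC check data, the parity data stored on the second chip is used for the RAID check. If the parity data on the second chip is also lost, the parity data is first recomputed from the data shards and the ECC check data, and data recovery is then performed.
If a RAID check was performed on the user data and the ECC check data in S302, the parity data stored on the second chip is used to perform a RAID check, to obtain the recovered data for the third chip.
If an EC check was performed on the user data and the ECC check data in S302, the parity data stored on the second chip is used to perform an EC check, to obtain the recovered data for the third chip.
S306. Perform an ECC check on the data on the m chips by using the ECC check data stored on the first chip. If data was lost on the m chips, the data shards stored on the lost chips are obtained, and the complete user data is then obtained. The data on the m chips includes the recovered data obtained in S305.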
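The recovery step S305 can be illustrated with single-parity XOR, as in RAID 5. This is a sketch under stated assumptions: the source leaves the concrete RAID/EC and ECC algorithms to the prior art, so XOR parity is chosen here only for illustration, the helper names are hypothetical, and the ECC check data is treated as an opaque 16-byte block rather than computed by a real ECC code.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(shards: list[bytes], ecc_block: bytes) -> bytes:
    """Parity over the data shards and the ECC block (S302, assuming XOR parity, p = 1)."""
    return xor_blocks(shards + [ecc_block])


def recover_lost_die(dies: list[bytes], lost_index: int) -> bytes:
    """Rebuild the content of one lost die from all surviving dies (S305)."""
    survivors = [block for i, block in enumerate(dies) if i != lost_index]
    return xor_blocks(survivors)


# 16 shards on die0..die15, opaque ECC block on die16, parity on die17.
shards = [bytes([i] * 16) for i in range(16)]
ecc_block = bytes([0xEC] * 16)  # placeholder bytes, not a real ECC computation
dies = shards + [ecc_block, make_parity(shards, ecc_block)]

recovered = recover_lost_die(dies, lost_index=2)
print(recovered == shards[2])
```

Because the XOR of all 18 blocks is zero, the same helper also rebuilds the ECC die or the parity die if that is the one lost. The ECC check of S306 would then run over the m data dies, including the rebuilt block.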
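In this application, an ECC check is performed on the data shards and the resulting ECC check data is stored on one die, for example the first chip. A RAID/EC check is performed on the data shards and the ECC check data to obtain parity data, which is stored on another die, for example the second chip. When data is read, if the data shard on a third chip is lost, the data is recovered by performing the RAID/EC check first and the ECC check second. That is, the RAID/EC check is first used to recover the data on the third chip and obtain recovered data. The ECC check is then applied jointly to the recovered data of the third chip and the data shards on the other chips that store data shards, to obtain the correct user data. The number of erroneous bits in the recovered data obtained by the RAID/EC check is at most the sum of the numbers of erroneous bits in the data shards on the other chips. Therefore, for the ECC check, the maximum number of bits that the ECC check data can correct must be not less than the total number of erroneous bits in the data on the m chips, where the data on the m chips includes the recovered data of the third chip.
The data recovery process is described in detail below by way of example.
As shown in FIG. 5, the foregoing example is used again. The user data is 256 bytes, and one data page in a die holds 16 bytes. The user data is sharded into 16 data shards, which can be stored on die0 to die15. An ECC check is performed on the user data formed by the 16 data shards to obtain ECC check data, which is stored on die16. A RAID/EC check is performed on the user data and the ECC check data to obtain parity data, which is stored on die17.
Bit errors inevitably occur when data is stored. Assume that both the user data and the check data contain erroneous bits, and that die2 is lost during a read. The parity data is first used to perform a RAID/EC check on the data shards on die0, die1, and die3 to die15 and on the ECC check data on die16, to obtain the recovered data of die2. Because the stored user data contains erroneous bits, the recovered data of die2 obtained by the RAID/EC check also contains erroneous bits. Assume that the data shards on the dies other than die2 contain x erroneous bits in total; the recovered data of die2 then contains at most x erroneous bits, so the whole user data contains at most 2x erroneous bits. Next, the ECC check data on die16 is used to check the data shards on die0, die1, and die3 to die15 together with the recovered data of die2, to obtain the data shard lost from die2. When the user data contains 2x erroneous bits and the error-correction capability of the ECC is greater than or equal to 2x bits, the data on die2 can be read correctly, and therefore the whole user data can be read correctly. An error-correction capability greater than or equal to 2x bits means that the number of erroneous bits that the ECC can correct is greater than or equal to 2x. This ensures reliability when die data is lost.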
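The way bit errors on the surviving dies carry over into the rebuilt die can be reproduced with a small experiment. As before, this is a sketch that assumes XOR parity (the source does not fix the RAID/EC algorithm) and an opaque placeholder ECC block; the helper names and the chosen bit positions are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def flip_bit(block: bytes, bit_index: int) -> bytes:
    """Return a copy of block with one bit inverted, modelling a stored bit error."""
    out = bytearray(block)
    out[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(out)


def count_bit_diff(a: bytes, b: bytes) -> int:
    """Number of bit positions in which two equally sized blocks differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Correct content: 16 shards, a placeholder ECC block, and XOR parity over both.
shards = [bytes([i] * 16) for i in range(16)]
ecc_block = bytes([0xEC] * 16)
parity = xor_blocks(shards + [ecc_block])

# What is actually read back: die2 is lost, and x = 3 bit errors sit on other data dies.
stored = shards + [ecc_block, parity]
stored[0] = flip_bit(stored[0], 3)
stored[5] = flip_bit(stored[5], 40)
stored[9] = flip_bit(stored[9], 100)

recovered_die2 = xor_blocks([blk for i, blk in enumerate(stored) if i != 2])

x = sum(count_bit_diff(stored[i], shards[i]) for i in range(16) if i != 2)
errors_in_recovered = count_bit_diff(recovered_die2, shards[2])
print(x, errors_in_recovered, x + errors_in_recovered)
```

Here the three injected errors fall on distinct bit positions, so the rebuilt die2 inherits all of them and the user data presented to the ECC check carries 2x erroneous bits. Errors that share a bit position cancel under XOR, which is why x is an upper bound for the rebuilt die rather than an exact count.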
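Assume that p dies in total lose data during a read, and that the dies other than those p dies contain x erroneous bits in total. The recovered data of the p dies then contains at most p times x erroneous bits, so the whole user data contains at most (p+1) times x erroneous bits. Similarly, the error-correction capability of the ECC must be greater than or equal to (p+1) times x bits; that is, the number of erroneous bits that the ECC can correct must be greater than or equal to (p+1) times x. This ensures reliability when die data is lost.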
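The bound can be stated compactly. In the sketch below, x is the total number of erroneous bits on the dies that were not lost and p is the number of lost dies, as in the text; the symbols E_total (erroneous bits presented to the ECC check) and t_ECC (maximum number of bits the ECC check data can correct) are notation introduced here for illustration only.

```latex
E_{\text{total}} \;\le\; \underbrace{x}_{\text{surviving dies}} \;+\; \underbrace{p \cdot x}_{\text{recovered dies}} \;=\; (p+1)\,x,
\qquad
t_{\text{ECC}} \;\ge\; (p+1)\,x .
```

Setting p = 1 gives the 2x requirement of the single-die example.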
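Based on the same concept as the foregoing method embodiments, and as shown in FIG. 6, an embodiment of this application further provides a storage verification apparatus 600, which is configured to perform the write-side operations of the foregoing storage verification method. The storage verification apparatus 600 includes a check unit 601 and a storage unit 602, where:
the check unit 601 is configured to perform an error checking and correction (ECC) check on user data to obtain ECC check data, where the user data includes m data shards and m is an integer greater than 2; and to perform a redundant array of independent disks (RAID) check or an erasure coding (EC) check on the user data and the ECC check data to obtain parity data;
the storage unit 602 is configured to store the m data shards, the ECC check data, and the parity data on (m+2) chips, where each of m chips among the (m+2) chips carries one data shard, and different chips carry different data shards; n first chips among the (m+2) chips, other than the m chips, carry the ECC check data; p second chips among the (m+2) chips, other than the m chips, carry the parity data; and n and p are positive integers.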
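Optionally, any one of the (m+2) chips includes multiple partitions;
one or more partitions on any one of the m chips carry one data shard;
one or more partitions on the first chip carry the ECC check data;
one or more partitions on the second chip carry the parity data.
Optionally, the multiple partitions are on one stripe.
Optionally, any one of the (m+2) chips includes multiple partitions, and one partition on any one of the (m+2) chips includes multiple data pages;
one or more data pages on any one of the m chips carry one data shard;
one or more data pages on the first chip carry the ECC check data;
one or more data pages on the second chip carry the parity data.
Optionally, the multiple data pages are on one stripe.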
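Based on the same concept as the foregoing method embodiments, and as shown in FIG. 7, an embodiment of this application further provides a storage verification apparatus 700, which is configured to perform the read-side operations of the foregoing storage verification method. The storage verification apparatus 700 includes a reading unit 701 and a check unit 702, where:
the reading unit 701 is configured to read (m+2) chips, where the (m+2) chips store user data, ECC check data, and parity data; the user data includes multiple data shards; each of m chips among the (m+2) chips carries one data shard, and different chips carry different data shards; a first chip among the (m+2) chips, other than the m chips, carries the ECC check data; and a second chip among the (m+2) chips, other than the m chips, carries the parity data;
the check unit 702 is configured to: when the data shard stored on a third chip among the m chips is lost, perform a redundant array of independent disks (RAID) check or an erasure coding (EC) check by using the parity data, to obtain the recovered data of the third chip; and perform an ECC check on the data of the m chips by using the ECC check data to obtain the user data, where the data of the m chips includes the recovered data.
Optionally, the maximum number of bits that the ECC check data can correct is not less than the number of erroneous bits present in the data of the m chips.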
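Based on the same concept as the foregoing method embodiments, and as shown in FIG. 8, an embodiment of this application further provides a storage verification apparatus 800, which is configured to perform the foregoing storage verification method. The storage verification apparatus 800 includes a processor 801 and a memory 802, where:
the processor 801 is configured to receive write requests for writing user data to the chips and read requests for reading the chips, and to perform the verification operations described in the foregoing method embodiments. The memory 802 may correspond to the chips in the foregoing method embodiments and is used to store the user data and the check data. The processor 801 in the storage verification apparatus 800 may also be used to perform the other operations in the foregoing method embodiments; details are not described here again.
Those skilled in the art should understand that the embodiments of this application may be provided as a method, a system, or a computer program product. Therefore, this application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, this application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
This application is described with reference to flowcharts and/or block diagrams of the method, the device (system), and the computer program product according to the embodiments of this application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing equipment to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing equipment produce an apparatus for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing equipment to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture that includes an instruction apparatus, where the instruction apparatus implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing equipment, so that a series of operation steps are executed on the computer or other programmable equipment to produce computer-implemented processing, and the instructions executed on the computer or other programmable equipment thereby provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
Although preferred embodiments of this application have been described, those skilled in the art can make further changes and modifications to these embodiments once they learn the basic inventive concept. Therefore, the appended claims are intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of this application.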

Claims (14)

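  1. A storage verification method, comprising:
    performing an error checking and correction (ECC) check on user data to obtain ECC check data, wherein the user data comprises m data shards and m is an integer greater than 2;
    performing a redundant array of independent disks (RAID) check or an erasure coding (EC) check on the user data and the ECC check data to obtain parity data; and
    storing the m data shards, the ECC check data, and the parity data on a plurality of chips, wherein each of m chips among the plurality of chips carries one data shard, different chips carry different data shards, n first chips among the plurality of chips other than the m chips carry the ECC check data, p second chips among the plurality of chips other than the m chips carry the parity data, and n and p are positive integers.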
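  2. The method according to claim 1, wherein any one of the plurality of chips comprises multiple partitions;
    one or more partitions on any one of the m chips carry one data shard;
    one or more partitions on the first chip carry the ECC check data; and
    one or more partitions on the second chip carry the parity data.
  3. The method according to claim 2, wherein the multiple partitions are on one stripe.
  4. The method according to claim 1, wherein any one of the plurality of chips comprises multiple partitions, and one partition on any one of the plurality of chips comprises multiple data pages;
    one or more data pages on any one of the m chips carry one data shard;
    one or more data pages on the first chip carry the ECC check data; and
    one or more data pages on the second chip carry the parity data.
  5. The method according to claim 4, wherein the multiple data pages are on one stripe.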
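  6. A storage verification method, comprising:
    reading a plurality of chips, wherein the plurality of chips store user data, ECC check data, and parity data, the user data comprises multiple data shards, each of m chips among the plurality of chips carries one data shard, different chips carry different data shards, n first chips among the plurality of chips other than the m chips carry the ECC check data, p second chips among the plurality of chips other than the m chips carry the parity data, and n and p are positive integers; and
    if data stored on p third chips among the plurality of chips is lost:
    performing a redundant array of independent disks (RAID) check or an erasure coding (EC) check for the p third chips to obtain recovered data of the p third chips; and
    performing an ECC check on data of the m chips by using the ECC check data to obtain the user data.
  7. The method according to claim 6, wherein the maximum number of bits that the ECC check data can correct is not less than the number of erroneous bits present in the data of the m chips.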
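  8. A storage verification apparatus, comprising:
    a check unit, configured to perform an error checking and correction (ECC) check on user data to obtain ECC check data, wherein the user data comprises m data shards and m is an integer greater than 2; and to perform a redundant array of independent disks (RAID) check or an erasure coding (EC) check on the user data and the ECC check data to obtain parity data; and
    a storage unit, configured to store the m data shards, the ECC check data, and the parity data on a plurality of chips, wherein each of m chips among the plurality of chips carries one data shard, different chips carry different data shards, n first chips among the plurality of chips other than the m chips carry the ECC check data, p second chips among the plurality of chips other than the m chips carry the parity data, and n and p are positive integers.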
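  9. The apparatus according to claim 8, wherein any one of the plurality of chips comprises multiple partitions;
    one or more partitions on any one of the m chips carry one data shard;
    one or more partitions on the first chip carry the ECC check data; and
    one or more partitions on the second chip carry the parity data.
  10. The apparatus according to claim 9, wherein the multiple partitions are on one stripe.
  11. The apparatus according to claim 9, wherein any one of the plurality of chips comprises multiple partitions, and one partition on any one of the plurality of chips comprises multiple data pages;
    one or more data pages on any one of the m chips carry one data shard;
    one or more data pages on the first chip carry the ECC check data; and
    one or more data pages on the second chip carry the parity data.
  12. The apparatus according to claim 11, wherein the multiple data pages are on one stripe.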
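  13. A storage verification apparatus, comprising:
    a reading unit, configured to read a plurality of chips, wherein the plurality of chips store user data, ECC check data, and parity data, the user data comprises multiple data shards, each of m chips among the plurality of chips carries one data shard, different chips carry different data shards, n first chips among the plurality of chips other than the m chips carry the ECC check data, p second chips among the plurality of chips other than the m chips carry the parity data, and n and p are positive integers; and
    a check unit, configured to: when data stored on p third chips among the plurality of chips is lost, perform a redundant array of independent disks (RAID) check or an erasure coding (EC) check for the p third chips to obtain recovered data of the p third chips; and perform an ECC check on data of the m chips by using the ECC check data to obtain the user data.
  14. The apparatus according to claim 13, wherein the maximum number of bits that the ECC check data can correct is not less than the number of erroneous bits present in the data of the m chips.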
PCT/CN2019/079118 2019-03-21 2019-03-21 Storage verification method and apparatus WO2020186524A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2019/079118 WO2020186524A1 (zh) 2019-03-21 2019-03-21 Storage verification method and apparatus
CN201980091806.7A CN113424262B (zh) 2019-03-21 2019-03-21 Storage verification method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/079118 WO2020186524A1 (zh) 2019-03-21 2019-03-21 Storage verification method and apparatus

Publications (1)

Publication Number Publication Date
WO2020186524A1 true WO2020186524A1 (zh) 2020-09-24

Family

ID=72519441

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/079118 WO2020186524A1 (zh) 2019-03-21 2019-03-21 Storage verification method and apparatus

Country Status (2)

Country Link
CN (1) CN113424262B (zh)
WO (1) WO2020186524A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116662063B (zh) * 2023-05-10 2024-02-23 珠海妙存科技有限公司 Error correction configuration method, error correction method, system, device, and medium for flash memory

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
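CN103902403A (zh) * 2012-12-27 2014-07-02 LSI Corporation Non-volatile memory program failure recovery via redundant arrays
CN104246898A (zh) * 2012-05-31 2014-12-24 Hewlett-Packard Development Company, L.P. Local error detection and global error correction
CN105359218A (zh) * 2013-01-25 2016-02-24 SanDisk Technologies Inc. Non-volatile memory programming data preservation
CN107408019A (zh) * 2015-03-27 2017-11-28 Intel Corporation Method and apparatus for improving immunity to defects in non-volatile memory
CN107943609A (zh) * 2016-10-12 2018-04-20 Samsung Electronics Co., Ltd. Memory module, memory controller and system, and corresponding operating methods thereof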

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7392458B2 (en) * 2004-11-19 2008-06-24 International Business Machines Corporation Method and system for enhanced error identification with disk array parity checking

Also Published As

Publication number Publication date
CN113424262A (zh) 2021-09-21
CN113424262B (zh) 2024-01-02

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19919773

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19919773

Country of ref document: EP

Kind code of ref document: A1