CN116490853A - Distributed ECC scheme in a memory controller - Google Patents

Publication number
CN116490853A
Authority
CN
China
Prior art keywords
data
nand flash
decoder
error correction
flash memory
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202080107284.8A
Other languages
Chinese (zh)
Inventor
胡潮红
刘淳
廖鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/07: Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08: Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10: Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1008: Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices
    • G06F11/1048: Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices using arrangements adapted for a specific error detection or correction feature
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

Methods and systems are provided in which: a first NAND flash memory device receives, from a flash controller, a request to read data, wherein the data is stored on NAND flash memory of the first NAND flash memory device and has been encoded using an error correction code (ECC); the data and the ECC are retrieved from the NAND flash memory of the first NAND flash memory device; it is determined whether a parameter specified in the request meets an error correction criterion for decoding the data using a first decoder implemented on the first NAND flash memory device; if the parameter meets the error correction criterion, the data is decoded using the first decoder implemented on the first NAND flash memory device; and if the parameter does not meet the error correction criterion, the retrieved data and the ECC are transmitted to the flash controller for decoding using a second decoder implemented by the flash controller.

Description

Distributed ECC scheme in a memory controller
Technical Field
The present disclosure relates generally to SSD (solid state disk) controllers, and in particular, to methods of using NAND flash SRAM (static random access memory) in SSD controllers.
Background
SSDs store data in solid state devices, rather than in magnetic or optical media. A typical SSD includes a controller and a solid state memory device. The host device performs read and write operations on the SSD. In response, the SSD acknowledges receipt of the data, stores the data, and then retrieves the data. Reading and storing data on SSDs is prone to errors. Typical SSDs perform error correction when reading data through a host interface or memory controller.
Disclosure of Invention
Various examples are now described, some of which are introduced in simplified form, and which are further described in the detailed description. This summary is not intended to identify key or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
In some aspects, an error correction method for a Solid State Disk (SSD) including a plurality of Not-AND (NAND) flash memory devices is provided, the method comprising: receiving, by a first NAND flash memory device of the plurality of NAND flash memory devices, a request to read data stored on NAND flash memory of the first NAND flash memory device, wherein the data stored on the NAND flash memory has been encoded using an error correction code (ECC), and the request is received by the first NAND flash memory device from a flash controller of the SSD over a first channel associated with the first NAND flash memory device; retrieving, from the NAND flash memory of the first NAND flash memory device, the data and the ECC used to encode the data; determining, by the first NAND flash memory device, whether a parameter specified in the request to read data meets an error correction criterion for decoding the data encoded using the ECC using a first decoder of a plurality of decoders, wherein the first decoder is implemented on the first NAND flash memory device; and in response to determining that the parameter meets the error correction criterion, performing the following: decoding the data using the first decoder implemented on the first NAND flash memory device to correct one or more errors in the retrieved data according to the ECC used to encode the data; and transmitting the decoded data to the flash controller of the SSD through the first channel.
In some aspects, in response to determining that the parameter does not meet the error correction criterion, the following is performed: bypassing the first decoder implemented on the first NAND flash memory device; and transmitting the retrieved data and the ECC used to encode the data over the first channel to the flash controller of the SSD, the flash controller decoding the data using a second decoder implemented by the flash controller according to the ECC used to encode the data to correct one or more errors in the retrieved data.
In some aspects, the error correction criteria include at least one of a plurality of priorities, the priorities including a delay parameter, a balance parameter, or an energy consumption parameter.
In some aspects, the method comprises: determining that the first decoder is busy performing other operations; in response to determining that the first decoder is busy performing other operations, the retrieved data and the ECC used to encode the data are transmitted over the first channel to a flash controller of the SSD, which decodes the data using a second decoder implemented by the flash controller in accordance with the ECC used to encode the data to correct one or more errors in the retrieved data.
In some aspects, the method comprises: determining that the first decoder is busy performing other operations; in response to determining that the first decoder is busy performing other operations, the retrieved data and the ECC used to encode the data are transmitted over the first channel to a second NAND flash device, which decodes the data using a second decoder implemented by the second NAND flash device in accordance with the ECC used to encode the data to correct one or more errors in the retrieved data.
In some aspects, the method comprises: the flash controller of the SSD receives the decoded data from the second NAND flash device over a second channel associated with the second NAND flash device.
In some aspects, the retrieved data and the ECC are transferred to the second NAND flash device by a flash controller of the SSD.
In some aspects, the method comprises: determining that the error correction criteria corresponds to prioritizing delay parameters to reduce error correction delay; transmitting the retrieved data and the ECC used to encode the data to a flash controller of the SSD over the first channel, the flash controller decoding the data using a second decoder implemented by the flash controller in accordance with the ECC used to encode the data while decoding the data using the first decoder in accordance with the ECC used to encode the data to correct the one or more errors in the retrieved data.
In some aspects, the method comprises: accessing the decoded data from whichever of the first decoder and the second decoder completes decoding the data first.
In some aspects, the method comprises: determining that the error correction criteria corresponds to a prioritized balance parameter; partially decoding the data using a first decoder implemented on the first NAND flash memory device according to an ECC used to encode the data; transmitting the partially decoded data and ECC used to encode the data to a flash controller of the SSD over the first channel, the flash controller completing decoding of the partially decoded data using a second decoder implemented by the flash controller according to the ECC used to encode the data.
In some aspects, the first decoder is to perform weak error correction including at least one of hard sensing of the data or a first number of iterations; the second decoder is to perform strong error correction including at least one of soft sensing and hard sensing of the data or a second number of iterations, the second number of iterations being greater than the first number of iterations.
In some aspects, the first decoder and the second decoder include different resource characteristics and different delays.
In some aspects, the NAND flash memory device comprises a 3D or 4D flash memory device.
In some aspects, the method comprises: generating an error correction result in the first decoder; determining that uncorrectable errors exist in the error correction result; in response to determining that the uncorrectable error exists in the error correction result, sending a data packet to a flash controller of the SSD, the data packet including the data and error correction information from the first NAND flash memory device.
In some aspects, the ECC comprises a block code.
In some aspects, a system for performing error correction in a Solid State Disk (SSD) is provided, the system comprising: a plurality of Not-AND (NAND) flash memory devices, each NAND flash memory device of the plurality of NAND flash memory devices having an on-chip NAND flash memory and a respective decoder of a plurality of decoders, a first NAND flash memory device of the plurality of NAND flash memory devices performing operations comprising: receiving a request to read data stored on NAND flash memory of the first NAND flash memory device, wherein the data stored on the NAND flash memory has been encoded using an error correction code (ECC), and the request is received by the first NAND flash memory device from a flash controller of the SSD over a first channel associated with the first NAND flash memory device; retrieving, from the NAND flash memory of the first NAND flash memory device, the data and the ECC used to encode the data; determining whether a parameter specified in the request to read data meets an error correction criterion for decoding the data encoded using the ECC using a first decoder of the plurality of decoders, wherein the first decoder is implemented on the first NAND flash memory device; and in response to determining that the parameter meets the error correction criterion, performing the following: decoding the data using the first decoder implemented on the first NAND flash memory device to correct one or more errors in the retrieved data according to the ECC used to encode the data; and transmitting the decoded data to the flash controller of the SSD through the first channel.
In some aspects, the operations further comprise: in response to determining that the parameter does not meet the error correction criteria, performing the following: bypassing a first decoder implemented on the first NAND flash memory device; transmitting the retrieved data and the ECC used to encode the data over the first channel to a flash controller of the SSD, the flash controller decoding the data using a second decoder implemented by the flash controller according to the ECC used to encode the data to correct one or more errors in the retrieved data.
In some aspects, the error correction criteria include at least one of a plurality of priorities, the priorities including a delay parameter, a balance parameter, or an energy consumption parameter.
In some aspects, an apparatus for a Solid State Disk (SSD) is provided, the SSD comprising a plurality of Not-AND (NAND) flash memory devices, the apparatus comprising: means for receiving, by a first NAND flash memory device of the plurality of NAND flash memory devices, a request to read data stored on NAND flash memory of the first NAND flash memory device, wherein the data stored on the NAND flash memory has been encoded using an error correction code (ECC), and the request is received by the first NAND flash memory device from a flash controller of the SSD over a first channel associated with the first NAND flash memory device; means for retrieving, from the NAND flash memory of the first NAND flash memory device, the data and the ECC used to encode the data; means for determining, by the first NAND flash memory device, whether a parameter specified in the request to read data meets an error correction criterion for decoding the data encoded using the ECC using a first decoder of a plurality of decoders, wherein the first decoder is implemented on the first NAND flash memory device; and means for, in response to determining that the parameter meets the error correction criterion, performing the following: decoding the data using the first decoder implemented on the first NAND flash memory device to correct one or more errors in the retrieved data according to the ECC used to encode the data; and transmitting the decoded data to the flash controller of the SSD through the first channel.
The explanations of the various aspects and implementations thereof apply equally to the other aspects and their corresponding implementations. The different embodiments may be implemented in hardware, software or any combination thereof. Moreover, any of the examples above may be combined with any one or more of the other examples above to create new embodiments within the scope of the present disclosure.
Drawings
In the drawings, which are not necessarily drawn to scale, like numerals may describe like components in different views. The drawings illustrate generally, by way of example and not by way of limitation, the various embodiments discussed herein.
FIG. 1 is a schematic diagram of a NAND flash SSD according to some embodiments;
FIG. 2 is a schematic diagram of a NAND flash memory device of the SSD shown in FIG. 1, according to some embodiments;
FIGS. 3A-3D illustrate schematic diagrams of distributed error correction schemes according to some embodiments;
FIGS. 4A, 4B, 5, 6, and 7 illustrate flowcharts of performing distributed error correction according to some embodiments;
FIG. 8 is a block diagram illustrating circuitry in the form of a processing system implementing a system and method for performing distributed error correction in accordance with some embodiments.
Detailed Description
It should be understood at the outset that although an illustrative implementation of one or more embodiments is provided below, the disclosed systems and/or methods described in connection with FIGS. 1-8 may be implemented using any number of techniques, whether currently known or not yet in existence. The disclosure should in no way be limited to the illustrative implementations, drawings, and techniques illustrated below, including the exemplary designs and implementations illustrated and described herein, but may be modified within the scope of the appended claims along with their full scope of equivalents.
Newly developed NAND flash memory chips include Static Random Access Memory (SRAM) on the chip. Such a chip may be a so-called 3D NAND chip or a 4D NAND chip. In the present disclosure, these two types are collectively referred to as "NAND chip with on-chip SRAM". Some such NAND chips provide 1MB (megabyte) of on-chip SRAM, but other chips provide greater or less than 1MB of on-chip SRAM. The physical layout of such 3D NAND chips and 4D NAND chips provides greater memory storage space and additional physical space for additional processing devices (e.g., encoders and decoders). These processing devices, known as back-end memory controllers, may be used to distribute error correction operations, such as decoding data stored by a 3D or 4D NAND chip on the chip itself. These decoding operations may supplement or replace decoding operations typically performed by a flash memory controller (referred to as a front-end memory controller). The present disclosure proposes a new process of performing error correction (e.g., data decoding) using an on-chip decoder of such a NAND chip.
Fig. 1 is a schematic diagram of a NAND flash SSD 100. SSD 100 includes a main CPU 102 and a NAND flash interface (NAND Flash Interface, NFI) CPU 108. The main CPU 102 includes a front-end CPU 104 and a back-end CPU 106. The front-end CPU 104 implements a handler for commands received from host device 130 via a peripheral component interconnect express (PCIe) bus, a serial attached SCSI (small computer system interface) (SAS) bus, or another suitable interface. The front-end CPU 104 also implements a scheduler for Back End (BE) commands issued in response to received host commands. The back-end CPU 106 implements back-end Firmware (FW) to perform flash translation layer (Flash Translation Layer, FTL) functions, mapping, and other back-end functions.
NFI CPU 108 controls and manages channels 122. Each channel 122 transmits data and commands to a subset of NAND flash chips in NAND flash device 210 (described in more detail in connection with fig. 2). In other SSDs, main CPU 102 and/or NFI CPU 108 may be implemented with other numbers or types of CPUs and/or other functional distributions.
SSD 100 also includes dynamic random access memory (Dynamic Random Access Memory, DRAM) 112, SRAM 114, hardware (HW) accelerator 116, and other peripherals 118. DRAM 112 is 32 Gigabytes (GB), but may be larger or smaller in other SSDs. SRAM 114 is 10 Megabytes (MB), but may be larger or smaller in other SSDs.
The HW accelerator 116 includes an Exclusive-OR (XOR) engine, a buffer manager, and a HW garbage collection (Garbage Collection, GC) engine, and may include other HW circuitry designed to independently handle specific, limited functions for the main CPU 102 and NFI CPU 108. Other peripherals 118 may include serial peripheral interface (Serial Peripheral Interface, SPI) circuitry, general purpose input/output (General Purpose Input/Output, GPIO) circuitry, inter-integrated circuit (Inter-Integrated Circuit, I2C) bus interfaces, universal asynchronous receiver/transmitter (Universal Asynchronous Receiver/Transmitter, UART) circuitry, and other interface circuitry.
SSD 100 also includes flash subsystem 120, which may include low density parity check (Low Density Parity Check, LDPC) error correction circuitry or other error correction circuitry (e.g., a decoder), randomizer circuitry, flash signal processing circuitry, and other circuitry that handles processes related to writing data to and reading data from the NAND flash memory device 210. In some cases, flash subsystem 120 is referred to herein as a front-end memory controller. In accordance with the disclosed technology, decoders may be implemented both on the front-end memory controller and on the NAND flash memory array 150.
When digital data is stored in a nonvolatile memory, it is important to have a mechanism capable of detecting and correcting a certain number of errors. Error Correction Codes (ECC) provide this mechanism: the data is encoded so that a decoder can identify and correct errors in it. Typically, a data string is encoded by adding a plurality of redundancy bits to the data string. The decoder examines the encoded message to check whether it contains any errors and, after correcting them, reconstructs the original data. There are various types of ECC decoders, including block code decoders and convolutional code decoders. A block code decoder operates on (n, k) codes, in which a block of k data bits is encoded into a block of n bits called a codeword. In a block code, a codeword does not depend on previously encoded messages. Block codes include linear codes and nonlinear codes, and both types may be systematic codes. Linear codes include repetition codes, parity check codes, Hamming codes, and cyclic codes. A convolutional code decoder operates on codewords that depend on both the current data message and a given number of previously encoded messages; the encoder changes state as each message is processed.
LDPC is a linear block error correcting code that is typically decoded iteratively.
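As a concrete illustration of a block code in which k data bits are encoded into an n-bit codeword, the following Python sketch implements the classic Hamming (7, 4) code, which corrects any single flipped bit. It is offered only to make the idea of redundancy bits tangible; it is not the code used by the disclosed decoders, and the function names are illustrative.

```python
# Illustrative (n, k) = (7, 4) Hamming code: 4 data bits -> 7-bit codeword.
# Codeword layout (1-indexed positions): p1 p2 d1 p3 d2 d3 d4.

def hamming74_encode(d):
    """Encode a list of 4 data bits into a 7-bit codeword."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4   # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_decode(c):
    """Correct at most one flipped bit; return (data_bits, error_position)."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    pos = s1 + 2 * s2 + 4 * s3   # syndrome; 0 means no error detected
    if pos:
        c[pos - 1] ^= 1          # flip the bit the syndrome points at
    return [c[2], c[4], c[5], c[6]], pos

codeword = hamming74_encode([1, 0, 1, 1])
corrupted = list(codeword)
corrupted[4] ^= 1                # simulate a single-bit storage error
data, error_pos = hamming74_decode(corrupted)
```

Here three redundancy bits protect four data bits, so the stored block is 75% larger than the payload; practical flash ECC uses much longer codewords with a far smaller relative overhead.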
Typically, data is read from the NAND flash array 150 by the flash subsystem 120. Flash subsystem 120 performs error correction to detect and correct memory errors and bus errors, always using a decoder implemented by the front-end memory controller. While such methods are generally effective, using the same decoder to handle all error correction operations creates a bottleneck and a single point of failure for reading, recovering, and correcting data. This process also consumes bandwidth on the channel used to receive data from the NAND flash memory array 150, because the ECC information must be transmitted through the channel in addition to the underlying data. This slows down the process of reading and decoding data from the NAND flash array 150.
As the technology of NAND flash array 150 improves, additional physical space becomes available on NAND flash array 150. This is because NAND memory cells are becoming smaller and are physically arranged in a stacked and layered fashion, which frees up physical space on a chip of the same size. This physical space may be used to include additional decoders on the NAND flash array 150 itself. Adding these decoders makes it possible to distribute error correction work between the front-end memory controller and the back-end memory controller. In accordance with the disclosed embodiments, a distributed scheme for performing error correction on a NAND flash memory array 150 is provided, in particular for performing error correction on a 3D or 4D memory device. Utilizing the available space in the 3D and 4D memory logic chips enables near-data computation, such as data replication, data searching, or any other data processing function. In some cases, the decoder and processing circuits or devices on the NAND flash array 150 itself are referred to as a back-end memory controller.
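The channel cost described above can be put in rough numbers. The sketch below compares the bytes that cross the channel for one read when the ECC travels with the data versus when decoding happens on the NAND device and only decoded data is sent. The payload and parity sizes are hypothetical values chosen for illustration; the disclosure does not specify them.

```python
def channel_bytes(payload_bytes, parity_bytes, decoded_on_device):
    """Bytes crossing the channel for one read of one codeword."""
    if decoded_on_device:
        return payload_bytes               # only the decoded data is sent
    return payload_bytes + parity_bytes    # raw data plus its ECC is sent

# Hypothetical sizes, not taken from the disclosure.
PAYLOAD = 4096
PARITY = 512

controller_decode = channel_bytes(PAYLOAD, PARITY, decoded_on_device=False)
device_decode = channel_bytes(PAYLOAD, PARITY, decoded_on_device=True)
saved_fraction = 1 - device_decode / controller_decode
```

Under these assumed sizes, decoding on the device removes one ninth of the channel traffic for that read.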
In some embodiments, the distributed ECC scheme includes an ECC manager in the front-end memory controller that manages ECC resources and distributes error correction operations. The ECC manager determines when and where to perform the ECC function or a portion of the ECC function. Specifically, the ECC manager coordinates the ECC operations such that error correction is performed in one of three ways: only on the front-end memory controller, only on the decoders implemented on one or more back-end memory controllers, or partially on the front-end memory controller and partially on one or more back-end memory controllers. In controlling where and when ECC operations are performed in the distributed scheme, the ECC manager considers data traffic, ECC resource characteristics, performance/energy priority, and various other factors.
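One way to picture the ECC manager's choice among these three placements is as a small policy function. The sketch below is a hypothetical policy written for illustration: the input names, the 0.8 utilization threshold, and the ordering of the rules are assumptions, since the text lists the factors considered but not how they are weighed.

```python
from enum import Enum

class Placement(Enum):
    FRONT_END_ONLY = "front-end controller decodes"
    BACK_END_ONLY = "NAND device decodes"
    SPLIT = "device starts, controller finishes"

def choose_placement(channel_utilization, front_decoder_busy, balance_requested):
    """Hypothetical placement policy; thresholds are illustrative only."""
    if front_decoder_busy:
        return Placement.BACK_END_ONLY   # avoid queueing behind the shared decoder
    if channel_utilization > 0.8:
        return Placement.BACK_END_ONLY   # avoid sending ECC over a loaded channel
    if balance_requested:
        return Placement.SPLIT           # share the work between both decoders
    return Placement.FRONT_END_ONLY

decision = choose_placement(channel_utilization=0.9,
                            front_decoder_busy=False,
                            balance_requested=False)
```

A real manager would presumably also weigh energy and latency targets; those inputs are omitted here to keep the sketch short.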
The ECC manager provides parameters in a request to read data sent to a given back-end memory controller. These parameters control whether the back-end memory controller decodes the data using a decoder implemented on the NAND flash memory device or bypasses that decoder so that the front-end memory controller performs the decoding. That is, the NAND flash memory device determines whether a parameter specified in the request to read data satisfies an error correction criterion for decoding the ECC-encoded data using the decoder implemented on the NAND flash memory device, or whether error correction is to be performed by the front-end memory controller. The error correction criteria may include one or more priorities, such as a delay parameter, a balance parameter, or an energy consumption parameter. In accordance with a determination that the parameter meets the error correction criteria, the NAND flash device decodes the data using the decoder implemented on the NAND flash device in accordance with the ECC associated with the data to correct one or more errors in the retrieved data, and transmits the decoded data over the channel to a flash controller of the SSD.
In some embodiments, the NAND flash memory device determines that the parameter does not satisfy the error correction criteria. In such cases, the decoder implemented on the NAND flash device is bypassed, and the raw data is routed out along with its ECC. The encoded data (raw data and ECC) is transmitted over a channel to a flash controller of the SSD, which decodes the data according to the associated ECC using a decoder implemented by the flash controller to correct one or more errors in the retrieved data.
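The device-side decision can be sketched as a single dispatch on the request parameter. In the Python sketch below, the `ReadRequest` fields, the boolean encoding of the parameter, and the dictionary-shaped response are all assumptions made for illustration; the disclosure does not define a message format.

```python
from dataclasses import dataclass

@dataclass
class ReadRequest:
    address: int
    decode_on_device: bool   # hypothetical encoding of the request parameter

def handle_read(request, cells, decoder):
    """Device-side handling of one read; `cells` maps address -> (raw, ecc)."""
    raw, ecc = cells[request.address]
    if request.decode_on_device:
        # Parameter meets the criterion: decode locally, send decoded data only.
        return {"decoded": True, "data": decoder(raw, ecc)}
    # Otherwise bypass the on-device decoder and forward raw data plus ECC.
    return {"decoded": False, "data": raw, "ecc": ecc}

def placeholder_decoder(raw, ecc):
    """Stand-in for a real ECC decoder; returns the data unchanged."""
    return bytes(raw)

cells = {0x10: (b"raw-bits", b"parity")}
local = handle_read(ReadRequest(0x10, True), cells, placeholder_decoder)
bypass = handle_read(ReadRequest(0x10, False), cells, placeholder_decoder)
```

Note that the bypass response carries the ECC and the local response does not, which is the channel-traffic difference the scheme is meant to exploit.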
In some embodiments, it is determined that the decoder of the NAND flash memory device is busy performing other operations (e.g., it is decoding data for another NAND flash device, or is still decoding data from a previous read operation). In response, the retrieved data and the ECC associated with the data are transmitted over the channel to a second NAND flash device. The second NAND flash device decodes the data according to the associated ECC using a decoder implemented by the second NAND flash device to correct one or more errors in the retrieved data. The flash controller of the SSD receives the decoded data from the second NAND flash memory device over a channel associated with the second NAND flash memory device.
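The busy-decoder case amounts to choosing which decoder handles the read. The sketch below is a minimal model under stated assumptions: the `NandDevice` class, its `busy` flag, and the fallback to the controller when both devices are busy are illustrative choices, and the decoder itself is a placeholder.

```python
class NandDevice:
    """Minimal model of a NAND device with an on-device decoder."""

    def __init__(self, name, busy=False):
        self.name = name
        self.busy = busy      # True while the decoder serves another operation

    def decode(self, raw, ecc):
        return bytes(raw)     # placeholder for a real ECC decoder

def read_with_forwarding(first, second, raw, ecc):
    """Return (where decoding happened, payload delivered to the controller)."""
    if not first.busy:
        return first.name, first.decode(raw, ecc)
    if not second.busy:
        # First decoder is busy: hand raw data and ECC to the second device.
        return second.name, second.decode(raw, ecc)
    # Assumed fallback: both devices busy, so the controller decodes.
    return "controller", (raw, ecc)

lun0 = NandDevice("LUN0", busy=True)
lun1 = NandDevice("LUN1")
where, payload = read_with_forwarding(lun0, lun1, b"raw", b"ecc")
```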
In some embodiments, the NAND flash memory device determines that the error correction criteria correspond to a prioritized balance parameter. In response, the NAND flash memory device partially decodes the data using a decoder implemented on the NAND flash memory device according to the ECC associated with the data. The partially decoded data and the ECC associated with the partially decoded data are transmitted over a channel to a flash controller of the SSD, which completes decoding of the partially decoded data using a decoder implemented by the flash controller in accordance with the associated ECC. As an example, a decoder of the NAND flash memory device may be used to perform weak error correction (e.g., by decoding according to at least one of hard sensing of the data or a first number of iterations). In such cases, the decoder of the front-end memory controller performs strong error correction (e.g., by decoding the data according to at least one of soft sensing and hard sensing of the data or a second number of iterations, wherein the second number of iterations is greater than the first number of iterations). That is, the NAND flash memory device may perform a first number of iterations of LDPC decoding on the encoded data, and then transmit the partially decoded data and corresponding ECC information to the front-end memory controller, which performs the remaining iterations to complete decoding of the data.
In some embodiments, the NAND flash memory device attempts to decode data read from the NAND flash memory device using a decoder of the NAND flash memory device according to a first error correction scheme. The NAND flash memory device determines that there is an uncorrectable error. In response, the NAND flash device transmits the ECC along with the raw data to the front-end memory controller, together with an indication that there is an uncorrectable error. The front-end memory controller then attempts to recover the data using a more advanced decoder and error correction scheme.
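The hand-off of a partially decoded block can be illustrated with a toy iterative decoder whose state is just the current bit estimates. The sketch below uses a simple bit-flipping decoder on a 9-bit row/column parity code rather than real LDPC belief propagation; the code, the iteration budgets, and the function names are assumptions for illustration. The toy code only guarantees correction of a single error, and the two-error pattern used here was picked because it happens to converge.

```python
# Toy parity-check code: 9 bits in a 3x3 grid; every row and column has even parity.
CHECKS = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # row checks
          (0, 3, 6), (1, 4, 7), (2, 5, 8)]   # column checks

def unsatisfied(bits):
    """Return the parity checks that currently fail."""
    return [chk for chk in CHECKS if sum(bits[i] for i in chk) % 2]

def bit_flip(bits, max_iters):
    """Run up to max_iters bit-flipping iterations; return the updated bits."""
    bits = list(bits)
    for _ in range(max_iters):
        failing = unsatisfied(bits)
        if not failing:
            break
        votes = [sum(i in chk for chk in failing) for i in range(len(bits))]
        bits[votes.index(max(votes))] ^= 1   # flip the most suspicious bit
    return bits

def split_decode(bits, device_iters, controller_iters):
    """Device runs a small budget; the controller resumes from its output."""
    partial = bit_flip(bits, device_iters)
    if not unsatisfied(partial):
        return partial, "device"             # weak decoder was sufficient
    return bit_flip(partial, controller_iters), "controller"

received = [1, 0, 0, 0, 1, 0, 0, 0, 0]       # all-zero codeword with two errors
after_device = bit_flip(received, 1)
decoded, finished_by = split_decode(received, device_iters=1, controller_iters=5)
```

Because the decoder state is only the bit vector, running one iteration on the device and the rest on the controller gives the same result as running all iterations in one place, which is what makes the split possible in this toy model.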
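This fallback path can be sketched as a two-stage attempt. In the sketch below, the `UncorrectableError` exception, the packet layout with a `status` field, and the toy decoders are hypothetical names and shapes introduced for illustration only.

```python
class UncorrectableError(Exception):
    """Raised by a decoder that cannot correct the errors it detects."""

def device_read(raw, ecc, weak_decode):
    """Device side: try the on-device decoder, else flag an uncorrectable error."""
    try:
        return {"status": "ok", "data": weak_decode(raw, ecc)}
    except UncorrectableError:
        # Send everything the controller needs to retry with a stronger scheme.
        return {"status": "ue", "data": raw, "ecc": ecc}

def controller_receive(packet, strong_decode):
    """Controller side: accept decoded data, or retry decoding on a UE packet."""
    if packet["status"] == "ok":
        return packet["data"]
    return strong_decode(packet["data"], packet["ecc"])

def toy_weak_decode(raw, ecc):
    if b"?" in raw:                      # pretend '?' marks damage too heavy to fix
        raise UncorrectableError
    return raw

def toy_strong_decode(raw, ecc):
    return raw.replace(b"?", b"a")       # pretend the stronger decoder recovers it

packet = device_read(b"d?ta", b"parity", toy_weak_decode)
recovered = controller_receive(packet, toy_strong_decode)
```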
Fig. 2 is a schematic diagram of a NAND flash memory device 210 of the SSD 100 of fig. 1. Each channel 122 transmits data and commands from the flash subsystem 120 to a subset of the NAND flash chips in the NAND flash device 210. The 16 channels (CH0, CH1, …, CH15) are coupled to subsets (226a, 226b, …, 226p) of the NAND flash memory device 210, respectively. There are 16 NAND flash memory devices in each subset, identified as logical units (LUN0, LUN1, …, LUN15). The terms "NAND flash memory device" and "LUN" are used interchangeably herein. In other SSDs, fewer or more channels may be used. Similarly, in other SSDs, fewer or more NAND flash memory devices may be provided for each channel.
According to some embodiments, each of the subsets (226a, 226b, …, 226p) of NAND flash memory devices 210 implements a respective decoder on its back-end memory controller. In this way, one decoder may be implemented on a flash controller, such as flash subsystem 120, and one or more additional decoder instances may be implemented on each of the subsets (226a, 226b, …, 226p) of NAND flash memory device 210. In some embodiments, one subset 226a may communicate with any one or more of the other subsets 226b through 226p over the channels 122. These communications may be made directly between subsets and/or through processing devices in flash subsystem 120.
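The topology of fig. 2 gives 16 × 16 = 256 addressable NAND flash memory devices. The sketch below shows one possible way to map a flat device index onto a (channel, LUN) pair; the row-major ordering and the function names are assumptions, since the text does not specify an addressing scheme.

```python
NUM_CHANNELS = 16       # CH0 .. CH15, per the described example
LUNS_PER_CHANNEL = 16   # LUN0 .. LUN15 in each subset

def locate(device_index):
    """Map a flat device index to (channel, lun), assuming row-major order."""
    if not 0 <= device_index < NUM_CHANNELS * LUNS_PER_CHANNEL:
        raise ValueError("device index out of range")
    return divmod(device_index, LUNS_PER_CHANNEL)

def flatten(channel, lun):
    """Inverse of locate()."""
    return channel * LUNS_PER_CHANNEL + lun

total_devices = NUM_CHANNELS * LUNS_PER_CHANNEL
example = locate(17)
```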
The decoder implemented by the back-end memory controller may be different from the decoder implemented by the front-end memory controller. For example, the decoder implemented by the back-end memory controller may be used to perform weak error correction (e.g., error correction comprising at least one of hard sensing of data or a first number of iterations), and the decoder implemented by the front-end memory controller may be used to perform strong error correction (e.g., error correction comprising at least one of soft sensing and hard sensing of data or a second number of iterations, the second number of iterations being greater than the first number of iterations). In some implementations, the decoder implemented by the back-end memory controller may have different resource characteristics and a different delay than the decoder implemented by the front-end memory controller.
The front-end memory controller may include a manager component for selecting how, when, and where error correction is performed on data read from a given subset 226a through 226p of the NAND flash memory device 210. The different ways in which the front-end memory controller can distribute error correction, and their configurations, are discussed in connection with fig. 3A through 3D. In some cases, the manager component configures the error correction distribution according to one or more criteria (e.g., balancing priorities with respect to data traffic, energy consumption, processing resource availability, and/or latency).
Fig. 3A-3D are diagrams 300-303 of distributed error correction schemes according to some embodiments. As shown in fig. 3A-3D, the distributed error correction scheme includes a flash memory controller 310 (e.g., a front-end memory controller) and one or more memory devices 320 (e.g., a back-end memory controller). Flash controller 310 may be implemented, at least in part, by flash subsystem 120, and one or more memory devices 320 may be implemented, at least in part, by subsets (226 a, 226b, … …, 226 p) of NAND flash devices 210, respectively. The memory controller 310 may communicate with one or more memory devices 320 over channels (e.g., over respective channels 122).
Flash controller 310 includes a first decoder 312, an ECC manager 318, and an interface 314. Flash controller 310 communicates with a host interface to receive requests to read data from memory device 320. In response, ECC manager 318 analyzes various parameters to select a distributed error correction scheme. For example, ECC manager 318 may analyze the data traffic pattern, the power consumption, the priority or latency associated with the request received from the host, and whether first decoder 312 of flash controller 310 is in a busy state. Based on this analysis, ECC manager 318 determines where and when decoding is performed on the data read from memory device 320.
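The selection performed by ECC manager 318 can be illustrated with a minimal sketch. The names `Scheme`, `ReadContext`, and `select_scheme`, the boolean inputs, and the order in which the conditions are tested are all assumptions made for illustration; the description above only lists the kinds of parameters that may be analyzed.

```python
from dataclasses import dataclass
from enum import Enum


class Scheme(Enum):
    DEVICE_ONLY = "decode on memory device 320 only"
    CONTROLLER_ONLY = "decode on flash controller 310 only"
    SPLIT = "partial decode on device, remainder on controller"


@dataclass
class ReadContext:
    # Hypothetical summary of the conditions the ECC manager may analyze.
    low_latency: bool              # the host request has a tight latency target
    save_power: bool               # power consumption is the dominant concern
    controller_decoder_busy: bool  # first decoder 312 is currently occupied


def select_scheme(ctx: ReadContext) -> Scheme:
    """Pick where decoding happens; the priority order is an assumption."""
    if ctx.controller_decoder_busy:
        return Scheme.DEVICE_ONLY
    if ctx.low_latency:
        return Scheme.CONTROLLER_ONLY
    if ctx.save_power:
        return Scheme.DEVICE_ONLY
    return Scheme.SPLIT
```

A real manager would likely weigh these inputs rather than test them in a fixed order; the fixed order here only keeps the sketch short.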
As shown in schematic diagram 300 in fig. 3A, ECC manager 318 determines that decoding is performed only by second decoder 322 implemented on memory device 320. This is illustrated by the solid line around the second decoder 322 and the dashed line around the first decoder 312. The ECC manager 318 inserts a parameter in the message sent to the memory device 320 to read the data that causes the memory device 320 to locally decode the data using the second decoder 322 implemented by the memory device 320 before providing the data back to the flash controller 310. Flash controller 310 sends a request to read data and parameters that control where decoding is performed over channel 122 associated with memory device 320.
Memory device 320 receives requests over channel 122 via interface 324 of memory device 320. The memory device 320 reads data and ECC associated with the data from one or more memory cells 326 (e.g., memory cells 326 implemented by the NAND flash memory device 210). The memory device 320 retrieves the parameter from the request and determines that the parameter meets an error correction criterion for performing error correction using the second decoder 322 of the memory device 320. In response, the memory device 320 decodes the data using the second decoder 322 according to the ECC associated with the data. Memory device 320 transmits data packets including the decoded data back to flash controller 310 over channel 122 via interface 324. In some cases, the second decoder 322 detects an uncorrectable error (UE). In such cases, the memory device 320 transmits a data packet including the ECC, the raw data, and an indication of the uncorrectable error to the flash controller 310. At this time, the flash controller 310 may attempt to perform additional error decoding using the first decoder 312 or indicate to the host device that there is a UE in the data read from the memory device 320.
As shown in schematic 301 in fig. 3B, ECC manager 318 determines that decoding is performed only by first decoder 312 implemented on flash controller 310. This is illustrated by the solid line around the first decoder 312 and the dashed line around the second decoder 322. The ECC manager 318 inserts a parameter in the message sent to the memory device 320 to read the data that causes the memory device 320 to bypass the second decoder 322 when returning the data to the flash controller 310. Flash controller 310 sends a request to read data and parameters that control where decoding is performed over channel 122 associated with memory device 320.
Memory device 320 receives requests over channel 122 via interface 324 of memory device 320. The memory device 320 reads data and ECC associated with the data from one or more memory cells 326 (e.g., memory cells 326 implemented by the NAND flash memory device 210). The memory device 320 retrieves the parameter from the request and determines that the parameter does not meet the error correction criteria for performing error correction using the decoder 322 of the memory device 320. In response, the memory device 320 bypasses the second decoder 322 and routes the raw data and ECC information 330 directly to the interface 324 for transmission to the flash controller 310. Memory device 320 transmits data packets including raw data and ECC back to flash controller 310 over channel 122 via interface 324. Flash controller 310 uses first decoder 312 to decode data according to the ECC received from memory device 320.
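The device-side handling of Fig. 3A and Fig. 3B can be sketched as follows. The message layouts (`ReadRequest`, `ReadResponse`), the boolean encoding of the parameter, and the `Decoder` callable that returns `None` on an uncorrectable error are hypothetical; the description does not define a wire format.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

# Stand-in for second decoder 322: returns corrected data, or None on a UE.
Decoder = Callable[[bytes, bytes], Optional[bytes]]


@dataclass
class ReadRequest:
    address: int
    decode_on_device: bool  # hypothetical encoding of the parameter


@dataclass
class ReadResponse:
    data: bytes
    ecc: Optional[bytes]
    decoded: bool
    uncorrectable: bool = False


def handle_read(req: ReadRequest,
                cells: Dict[int, Tuple[bytes, bytes]],
                decoder: Decoder) -> ReadResponse:
    """Read data and ECC, then decode locally or bypass the local decoder."""
    raw, ecc = cells[req.address]
    if not req.decode_on_device:
        # Fig. 3B: bypass the local decoder and return raw data plus ECC.
        return ReadResponse(raw, ecc, decoded=False)
    corrected = decoder(raw, ecc)
    if corrected is None:
        # Fig. 3A, UE case: return raw data, ECC, and a UE indication.
        return ReadResponse(raw, ecc, decoded=False, uncorrectable=True)
    # Fig. 3A, normal case: return the decoded data.
    return ReadResponse(corrected, None, decoded=True)
```

Passing the decoder in as a callable keeps the control flow independent of the code actually used (LDPC or otherwise).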
As shown in schematic 302 in fig. 3C, ECC manager 318 determines that decoding is performed by both first decoder 312 implemented on flash controller 310 and second decoder 322 implemented on memory device 320. This is illustrated by the solid line around the first decoder 312 and the solid line around the second decoder 322. The ECC manager 318 inserts a parameter in a message sent to the memory device 320 to read the data that causes the memory device 320 to decode the data using the second decoder 322 when returning the data to the flash controller 310. Flash controller 310 sends a request to read data and parameters that control where decoding is performed over channel 122 associated with memory device 320. The parameter may specify a level of decoding performed by the second decoder 322 relative to a level of decoding performed by the first decoder 312. In this implementation, partial decoding is performed on memory device 320 and residual decoding is performed on flash controller 310. This divides the decoding effort between the two devices, thereby increasing the overall efficiency and speed of reading data from memory.
Memory device 320 receives requests over channel 122 via interface 324 of memory device 320. The memory device 320 reads the raw data and the ECC associated with the data from one or more memory cells 326 (e.g., memory cells 326 implemented by the NAND flash memory device 210). The memory device 320 retrieves the parameter from the request and determines that the parameter meets an error correction criterion for performing error correction using the second decoder 322 of the memory device 320. In response, the memory device 320 passes the raw data read from the memory cells 326 and the ECC 330 to the second decoder 322 to perform initial decoding of the data according to the ECC of the data. For example, memory device 320 decodes the data using a first number of LDPC iterations and/or weak decoding operations (e.g., using only hard sensing of the data). If the initial decoding (e.g., weak decoding by the second decoder 322) is successful, the memory device 320 provides a data packet including the partial decoding result (e.g., data decoded using the first number of iterations of the LDPC code) and the originally read ECC information back to the flash controller 310. The flash controller 310 completes decoding of the data using the first decoder 312 according to the ECC information and the partially decoded data. As an example, the first decoder 312 may process the ECC and the data generated by the second decoder 322 to perform the remaining second number of iterations of the LDPC code to complete decoding of the data, and/or may perform stronger decoding techniques (e.g., using hard and soft sensing of the data) to decode the data. In some cases, the first decoder 312 processes the partially decoded data from the second decoder 322 and detects no further errors. In such cases, the first decoder 312 passes the partially decoded data received from the second decoder 322 to the requesting host.
In some cases, the initial decoding by the second decoder 322 is unsuccessful. In such cases, the memory device 320 provides a data packet including the raw data and the originally read ECC information back to the flash controller 310. The flash controller 310 attempts to decode the raw data using the first decoder 312 according to the ECC information. As an example, the first decoder 312 may process the data and ECC information read from the memory cells using a strong decoding technique (e.g., using hard and soft sensing of the data).
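The split of an iteration budget between the two decoders can be illustrated with a toy example. The sketch below does not use an LDPC code: it swaps in serial bit flipping on a small two-dimensional parity code (every row and column has even parity) so that the example stays short. The rule that the partial result is forwarded only if it reduced the number of failed parity checks is an assumption about what a "successful" initial decoding means, and all function names are hypothetical.

```python
from typing import List, Tuple

Grid = List[List[int]]  # bits laid out so every row and column has even parity


def failed_checks(grid: Grid) -> Tuple[List[int], List[int]]:
    """Return 1 for every row/column parity check that currently fails."""
    rows = [sum(row) % 2 for row in grid]
    cols = [sum(col) % 2 for col in zip(*grid)]
    return rows, cols


def count_failed(grid: Grid) -> int:
    rows, cols = failed_checks(grid)
    return sum(rows) + sum(cols)


def bit_flip_decode(grid: Grid, max_iters: int) -> Tuple[Grid, bool]:
    """Run at most max_iters serial bit-flipping iterations on a copy."""
    grid = [row[:] for row in grid]
    for _ in range(max_iters):
        rows, cols = failed_checks(grid)
        if not any(rows) and not any(cols):
            return grid, True
        # Flip the bit involved in the most failed checks
        # (ties are broken by taking the first bit in row-major order).
        _, neg_i, neg_j = max(
            (rows[i] + cols[j], -i, -j)
            for i in range(len(grid))
            for j in range(len(grid[0]))
        )
        grid[-neg_i][-neg_j] ^= 1
    return grid, count_failed(grid) == 0


def split_decode(raw: Grid, device_iters: int,
                 controller_iters: int) -> Tuple[Grid, bool]:
    """Device runs the first iterations; the controller runs the rest."""
    partial, done = bit_flip_decode(raw, device_iters)  # second decoder 322
    if done:
        return partial, True
    # Assumption: the initial decoding counts as successful if it reduced the
    # number of failed checks; otherwise the controller restarts from raw data.
    start = partial if count_failed(partial) < count_failed(raw) else raw
    return bit_flip_decode(start, controller_iters)     # first decoder 312
```

For instance, `split_decode(word, 1, 4)` gives the device a budget of one iteration and the controller a budget of four; the controller stops early once all parity checks pass.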
As shown in schematic 303 in fig. 3D, ECC manager 318 instructs memory device 320 to perform local decoding of the data using second decoder 322. The memory device 320 determines that the second decoder 322 is currently busy performing other operations and cannot complete the request to perform local decoding. In response, the memory device 320 communicates with the second memory device 340 to perform decoding operations. That is, the memory device 320 may send a data packet including the raw data read from the memory 326 and the ECC 330 to the second memory device 340, together with instructions for the second memory device 340 to perform decoding using the third decoder 342 implemented by the second memory device 340. The second memory device 340 may be implemented as another instance of a subset (226a, 226b, ..., 226p) of the NAND flash memory devices 210. That is, the memory device 320 may be a first subset 226a and the second memory device 340 may be a second subset 226b.
In one embodiment, memory device 320 provides data packets including data and ECC 330 directly to second memory device 340 through interface 324 of memory device 320 and interface 344 of second memory device 340. For example, the memory device 320 may communicate directly with the second memory device 340 over the channel 122 without passing information or messages through the flash controller 310. In such cases, the second memory device 340 decodes the data according to the ECC information using a third decoder 342 implemented on the second memory device 340. After decoding the data, the second memory device 340 may return the decoded data to the memory device 320. At this time, the memory device 320 transmits the data decoded by the second memory device 340 to the flash controller 310 through a channel associated with the memory device 320. In other embodiments, the second memory device 340 directly transmits the decoded data back to the flash controller 310 through a channel associated with the second memory device 340.
In another embodiment, the memory device 320 provides a data packet including data and ECC 330 to the second memory device 340 through the flash controller 310. Specifically, the memory device 320 provides a data packet including data and ECC information to the flash controller 310. The ECC manager 318 in the flash controller 310 finds a memory device that is not busy (e.g., the second memory device 340), or selects one memory device randomly or in a round robin fashion. Flash controller 310 provides data and the ECC associated with the data to selected memory device 340, as well as instructions for decoding the data using the decoder of the selected memory device. As an example, the second memory device 340 decodes the data using the third decoder 342 and returns the decoded data to the flash controller 310. The second memory device 340 may be used to fully decode data read from the memory 326 of the memory device 320 or to partially or initially decode data read from the memory 326 of the memory device 320. In the case of partially decoding the data, the second memory device 340 returns the partially decoded data as well as the ECC information to the flash controller so that the first decoder 312 of the flash controller 310 completes decoding the data.
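The helper-device selection described above (prefer a device whose decoder is not busy, otherwise fall back to round robin) can be sketched as follows. The `Device` record and `HelperSelector` class are hypothetical, and random selection, which the description also permits, is omitted for brevity.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Device:
    name: str           # e.g. a subset label such as "226b"
    decoder_busy: bool  # whether the device's on-board decoder is occupied


class HelperSelector:
    """Chooses which memory device decodes on behalf of a busy device."""

    def __init__(self, devices: List[Device]) -> None:
        self.devices = devices
        self._next = 0  # round-robin cursor, used only when all are busy

    def pick(self, requester: str) -> Device:
        candidates = [d for d in self.devices if d.name != requester]
        if not candidates:
            raise ValueError("no other memory device is available")
        for device in candidates:
            if not device.decoder_busy:
                return device
        # Every candidate is busy: rotate through them in turn.
        choice = candidates[self._next % len(candidates)]
        self._next += 1
        return choice
```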
Fig. 4A, 4B, 5, 6, and 7 illustrate a flowchart of performing distributed error correction according to some embodiments. The processes 400, 410, 500, 600, and 700 may be embodied in computer readable instructions for execution by one or more processors or one or more servers, front-end memory controllers, back-end memory controllers, or a combination thereof; accordingly, processes 400, 410, 500, 600, and 700 are described below in connection with examples thereof. However, in other embodiments, at least some of the operations of processes 400, 410, 500, 600, and 700 may be deployed on a variety of other hardware configurations. Some or all of the operations of processes 400, 410, 500, 600, and 700 may be performed in parallel or out of order, or may be omitted entirely.
In operation 401, an ECC decoder result is received. For example, memory device 320 receives a request to read data from memory 326. The memory device 320 determines that the parameters specified in the request meet the criteria for performing error correction using the second decoder 322 implemented on the memory device 320 (fig. 3A). The memory device 320 routes the data from the memory 326 to the second decoder 322 along with the ECC for the data. The second decoder 322 decodes the data according to the ECC of the data to generate decoder results.
In operation 402, it is determined whether the decoder result includes an Uncorrectable Error (UE). If so, the process passes to operation 403; otherwise, the process passes to operation 404. For example, if the second decoder 322 does not detect an uncorrectable error, the process passes to operation 404. If the second decoder 322 detects an uncorrectable error, the process passes to operation 403.
In operation 404, the corrected data and error correction information are transmitted to the front-end memory controller. For example, the memory device 320 provides the decoded data and (optionally) the ECC information to the flash controller 310. Specifically, if the second decoder 322 performs only partial decoding, the ECC information is provided to the flash controller 310 to complete decoding of the partially decoded data.
In operation 403, the raw data and error correction information are transmitted to the front-end memory controller. For example, the ECC information and raw data are provided to the flash controller 310 along with an indication of the UE so that the first decoder 312 of the flash controller 310 may be used to decode the data read from the memory 326. After performing operation 403, the process passes to operation 405 discussed in connection with FIG. 4B.
In operation 405, data and error correction information are received from a memory device. For example, the second decoder 322 provides the partially decoded data to the flash controller 310 along with ECC information. As another example, in the event that the decoder of the memory device 320 did not successfully decode the data and the UE was detected, the flash controller 310 receives the raw data and ECC information from the memory device 320.
In operation 406, error correction is performed with the front-end memory controller decoder. For example, the first decoder 312 is used to complete decoding of data that has been partially decoded by the second decoder 322 of the memory device 320, or to correct data in which the second decoder 322 detected a UE.
In operation 407, it is determined whether the decoder result includes an Uncorrectable Error (UE). If so, the process passes to operation 409; otherwise, the process passes to operation 408. For example, the first decoder 312 of the flash controller 310 determines whether the data can be successfully decoded (e.g., no UE is present).
In operation 408, the corrected data is transmitted to the host interface.
In operation 409, an uncorrectable error flag is generated. For example, the host interface receives a notification: the data read from the memory is not successfully decoded.
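Operations 401 through 409 can be summarized in a short sketch. The `Packet` layout and the convention that a decoder returns `None` when it detects a UE are assumptions; the decoders themselves are passed in as callables and are not modeled.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Stand-in decoder type: returns corrected data, or None on an uncorrectable error.
Decoder = Callable[[bytes, bytes], Optional[bytes]]


@dataclass
class Packet:
    data: bytes  # corrected (or partially corrected) data, or raw data on a UE
    ecc: bytes
    ue: bool     # True when the device decoder reported an uncorrectable error


def device_side(raw: bytes, ecc: bytes, second_decoder: Decoder) -> Packet:
    result = second_decoder(raw, ecc)      # operation 401: decoder result
    if result is None:                     # operation 402: UE detected?
        return Packet(raw, ecc, ue=True)   # operation 403: raw data and ECC
    return Packet(result, ecc, ue=False)   # operation 404: corrected data and ECC


def controller_side(packet: Packet,
                    first_decoder: Decoder) -> Tuple[Optional[bytes], bool]:
    # Operation 405: the packet has been received from the memory device.
    result = first_decoder(packet.data, packet.ecc)  # operation 406
    if result is None:                               # operation 407
        return None, True                            # operation 409: UE flag
    return result, False                             # operation 408: to host
```

The second element of the tuple returned by `controller_side` plays the role of the uncorrectable error flag reported to the host interface.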
Process 500 illustrates operations for identifying and using an alternative decoder when the decoder of the memory device from which data is read is busy or unavailable. In operation 501, a read request is received from a front-end memory controller on a first memory device. For example, memory device 320 receives a request to read data from memory 326. The memory device 320 determines that the parameters specified in the request meet the criteria for performing error correction using the second decoder 322 implemented on the memory device 320 (fig. 3D).
In operation 502, it is determined whether the decoder of the first memory device is in a busy state. If so, the process passes to operation 504; otherwise, the process goes to operation 503. For example, memory device 320 determines that a request to read data from memory device 320 needs to be completed with minimal delay and there is insufficient time to wait for the decoder of memory device 320 to complete other operations (e.g., decoding data from a previous read request).
In operation 503, error correction is performed with a back-end memory controller of the first memory device. The second decoder 322 of the memory device 320 is used to decode data read from the memory 326 in the event that the decoder of the memory device 320 is not in a busy state.
In operation 504, it is determined whether the error correction device of the front-end controller is in a busy state. If so, the process passes to operation 506; otherwise, the process goes to operation 505. If the decoder of the memory device 320 is in a busy state, the memory device 320 communicates with the flash controller 310 to determine whether the first decoder 312 of the flash controller 310 is available to perform a decoding operation.
In operation 505, error correction is performed with the front-end memory controller. For example, the memory device 320 bypasses the second decoder 322 of the memory device 320 and transmits the raw data and associated ECC information to the flash controller 310. The flash controller 310 decodes data read from the memory device 320 using the first decoder 312.
In operation 506, error correction is performed with a back-end memory controller of the second memory device. For example, the memory device 320 transfers the raw data and associated ECC information to the second memory device 340 (either directly without passing through the flash controller 310 or indirectly through the flash controller 310). The second memory device 340 decodes the data read from the memory device 320 using the third decoder 342 and provides the decoding result back to the flash controller 310 directly or through the memory device 320. In some cases, memory device 320 selects memory device 340 randomly or in a round robin fashion. In some cases, memory device 320 communicates with ECC manager 318 to identify memory device 340 from a set of available memory devices.
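The decision path of operations 502 through 506 reduces to two busy checks, as in the following sketch. The enum `Target` and the function `choose_decoder` are hypothetical names; the enum values simply reuse the operation numbers for readability.

```python
from enum import Enum


class Target(Enum):
    FIRST_DEVICE = 503   # back-end decoder of the first memory device
    FRONT_END = 505      # decoder of the front-end memory controller
    SECOND_DEVICE = 506  # back-end decoder of a second memory device


def choose_decoder(device_decoder_busy: bool,
                   controller_decoder_busy: bool) -> Target:
    """Follow operations 502 and 504 to decide where decoding is performed."""
    if not device_decoder_busy:       # operation 502
        return Target.FIRST_DEVICE
    if not controller_decoder_busy:   # operation 504
        return Target.FRONT_END
    return Target.SECOND_DEVICE
```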
Process 600 illustrates operations for identifying and using decoders of a memory device according to error correction criteria. Error correction criteria may include delay parameters, balance parameters, power saving parameters, bandwidth parameters, and various other conditions or parameters.
In operation 601, a read request is received from a front-end memory controller on a first memory device. For example, memory device 320 receives a request to read data from memory 326. The memory device 320 determines that the parameters specified in the request meet the criteria for performing error correction using the second decoder 322 implemented on the memory device 320 (fig. 3D).
In operation 602, it is determined whether the distributed error correction criteria and parameters specified in the request satisfy a condition for performing error correction using a decoder of the memory device on which the data is stored. If the distributed error correction criteria include a delay parameter (e.g., delay takes precedence over other criteria), the process passes to operation 603. If the distributed error correction criteria include a balancing parameter (e.g., balancing between delay and energy takes precedence over other criteria), the process passes to operation 606. If the distributed error correction criteria include an energy parameter (e.g., energy conservation takes precedence over other criteria), the process passes to operation 607. For example, ECC manager 318 may analyze various conditions and operations to select parameters for performing distributed error correction. The ECC manager 318 specifies parameters (e.g., energy, delay, balance, etc.) to be used by the memory device 320 in determining whether to perform error correction using a decoder of the memory device 320.
In operation 603, the fastest decoder performs error correction. For example, the first decoder 312 implemented by the flash controller 310 may be more complex and have higher processing and decoding capability than the decoders implemented by the memory devices (e.g., the second decoder 322 and the third decoder 342). In such cases, the ECC manager specifies parameters that cause the memory device 320 to bypass the second decoder 322 and provide the raw data and associated ECC back to the flash controller 310 to perform decoding using the first decoder 312.
As another example, both the first decoder 312 implemented by the flash controller 310 and the second decoder 322 implemented by the memory device 320 are instructed to decode the data read from the memory device 320 in parallel. In this case, the raw data and the ECC are provided to both the first decoder 312 and the second decoder 322. The ECC manager 318 monitors the decoding operations to determine which of the decoders completes decoding the data first. If the first decoder 312 completes decoding the data first, the flash controller 310 provides the data back to the host interface using the decoding result from the first decoder 312. If the second decoder 322 completes decoding the data before the first decoder 312, the decoded data is transmitted back to the flash controller 310 to be provided to the host interface.
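The parallel case, in which whichever decoder finishes first supplies the result, can be sketched with threads standing in for the two hardware decoders. The `race` function and the two stand-in decoders, whose sleep times merely simulate different decoding latencies, are assumptions for illustration.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from typing import Callable, Dict, Tuple

Decoder = Callable[[bytes, bytes], bytes]


def slow_decoder(raw: bytes, ecc: bytes) -> bytes:
    time.sleep(0.3)   # stand-in for a decoder that takes longer
    return raw


def fast_decoder(raw: bytes, ecc: bytes) -> bytes:
    time.sleep(0.01)  # stand-in for a decoder that finishes sooner
    return raw


def race(decoders: Dict[str, Decoder], raw: bytes, ecc: bytes) -> Tuple[str, bytes]:
    """Start every decoder on the same input and use the first result."""
    pool = ThreadPoolExecutor(max_workers=len(decoders))
    futures = {pool.submit(fn, raw, ecc): name for name, fn in decoders.items()}
    done, _pending = wait(futures, return_when=FIRST_COMPLETED)
    winner = next(iter(done))
    pool.shutdown(wait=False)  # do not block on the decoder that lost the race
    return futures[winner], winner.result()
```

In this sketch the losing decoder is left to finish in the background and its result is discarded; a hardware implementation would more likely abort it to save power.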
In operation 604, it is determined whether the fastest decoder is in a busy state. If so, the process passes to operation 605; otherwise, the fastest error correction device performs error correction. For example, the flash controller 310 may receive raw data and ECC from the memory device 320 and may determine that the first decoder 312 is busy decoding previously read data.
In operation 605, an alternative error correction device is searched for and identified. For example, the flash controller 310 may send the raw data and ECC to another flash controller 310, to a host device, or to the second memory device 340, or may send them back to the memory device 320. That is, the ECC manager 318 may store an index of the available decoders and their processing speeds. If the decoder implemented by flash controller 310 is at the top of the list as the fastest decoder but is busy, the ECC manager 318 selects the next decoder on the list. In some cases, the decoder of the second memory device 340 may be faster than the decoder of the memory device 320. In such cases, the flash controller 310 provides the ECC information and the data read from the memory device 320 to the second memory device 340 to perform decoding, for example, in the manner described in connection with fig. 3D.
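The decoder index consulted in operations 604 and 605 can be sketched as a list ordered by speed, from which the fastest decoder that is not busy is taken. The `DecoderEntry` record, its `throughput` field (a relative speed figure), and the function name are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DecoderEntry:
    name: str
    throughput: float  # hypothetical relative speed; higher means faster
    busy: bool = False


def next_fastest(index: List[DecoderEntry]) -> Optional[DecoderEntry]:
    """Return the fastest decoder that is not busy, or None if all are busy."""
    for entry in sorted(index, key=lambda e: e.throughput, reverse=True):
        if not entry.busy:
            return entry
    return None
```

Returning `None` when every decoder is busy leaves the fallback policy (wait, retry, or report) to the caller, since the description does not specify one.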
In operation 606, error correction is performed with a decoder of the first memory device and a decoder of the front-end memory controller. For example, the second decoder 322 may perform an initial or partial decoding of the data and provide the partially decoded data to the first decoder 312 to complete decoding of the data, as described above in connection with fig. 3C.
In operation 607, error correction is performed with the power saving decoder. For example, the second decoder 322 implemented by the memory device 320 is less complex and consumes less power than the decoder 312 implemented by the flash controller 310. In such cases, the ECC manager specifies parameters that cause the memory device 320 to decode data read from the memory 326 using the second decoder 322 and provide the decoded data back to the flash controller 310.
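The branch taken in operation 602 can be written as a simple dispatch on the prioritized criterion. The sketch assumes, as in the examples above, that the first decoder 312 is the fastest decoder and the second decoder 322 is the most power-efficient one; the `Criterion` enum and `decoders_for` function are hypothetical names.

```python
from enum import Enum
from typing import List


class Criterion(Enum):
    DELAY = "delay"
    BALANCE = "balance"
    ENERGY = "energy"


def decoders_for(criterion: Criterion) -> List[str]:
    """Return the decoders used, in order, for the prioritized criterion."""
    if criterion is Criterion.DELAY:
        # Operation 603: the fastest decoder (assumed to be first decoder 312).
        return ["first_decoder_312"]
    if criterion is Criterion.BALANCE:
        # Operation 606: partial decoding on the device, then the controller.
        return ["second_decoder_322", "first_decoder_312"]
    # Operation 607: the power-saving decoder (assumed to be second decoder 322).
    return ["second_decoder_322"]
```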
Process 700 illustrates operations for decoding data using an on-board decoder of a NAND flash memory device in a NAND flash memory array. In operation 701, a request to read data stored on a NAND flash memory is received.
In operation 702, the data and an ECC associated with the data are retrieved from the NAND flash memory.
In operation 703, it is determined whether the parameter specified in the request to read the data meets an error correction criterion for decoding the data using a decoder implemented on the NAND flash memory.
In operation 704, the data is decoded using a decoder implemented on the NAND flash memory.
In operation 705, the decoded data is transmitted to the flash memory controller through a channel.
Fig. 8 is a block diagram illustrating circuitry in the form of a processing system implementing a system and method for performing distributed error correction as described above in connection with fig. 1-7, in accordance with some embodiments. Not all components need be used in embodiments. An exemplary computing device in the form of a computer 800 may include a processing unit 802, memory 803, cache 807, removable storage 811, and non-removable storage 822. While an exemplary computing device is illustrated and described as computer 800, in different embodiments, the computing device may take different forms. Smart phones, tablets, and smart watches are commonly referred to collectively as mobile devices or user devices. Further, while the various data storage elements are illustrated as part of computer 800, the storage devices may also or alternatively comprise cloud-based storage or server-based storage accessible via a network such as the internet.
The memory 803 may include volatile memory 814 and nonvolatile memory 808. The computer 800 may also include or access a computing environment that includes a variety of computer-readable media, such as volatile memory 814, nonvolatile memory 808, removable storage 811, and non-removable storage 822. Computer storage includes random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical disc storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium capable of storing computer-readable instructions.
Computer 800 may include or access a computing environment that includes an input interface 826, an output interface 824, and a communication interface 816. The output interface 824 may include a display device, such as a touch screen, that may also function as an input device. The input interface 826 may include one or more of the following: a touch screen, a touch pad, a mouse, a keyboard, a camera, one or more device-specific buttons, one or more sensors integrated within the computer 800 or coupled to the computer 800 through a wired or wireless data connection, and other input devices. The computer 800 may be connected to one or more remote computers through a communication connection to operate in a networked environment. The one or more remote computers may include personal computers (PCs), servers, routers, network PCs, peer devices or other public DFD network switches, and the like. The communication connection may include a local area network (LAN), wide area network (WAN), cellular network, Wi-Fi network, Bluetooth network, or other network. Various components of computer 800 are connected to a system bus 820, according to one embodiment.
Computer readable instructions (e.g., program 818) stored on a computer readable medium may be executed by the processing unit 802 of the computer 800. In some embodiments, program 818 includes software that, when executed by processing unit 802, performs task distribution operations in accordance with any of the embodiments contained herein. Hard disk drives, CD-ROMs, and RAMs are some examples of components comprising non-transitory computer readable media, such as storage devices. The terms "computer-readable medium" and "storage device" do not include carrier waves, to the extent that carrier waves are regarded as transitory. Storage may also include networked storage, such as a storage area network (SAN). Computer program 818 may also include instruction modules that, when processed, cause processing unit 802 to perform one or more of the methods or algorithms described herein.
Although a few embodiments have been described in detail above, other modifications are possible. For example, the logic flows depicted in the figures do not require the particular order shown, or sequential order, to achieve desirable results. Other steps may be provided in, or steps may be eliminated from, the described flows, and other components may be added to, or removed from, the described systems. Other embodiments may be within the scope of the following claims.
It should also be appreciated that software comprising one or more computer-executable instructions that facilitate the processing and operation of any or all of the steps described above with respect to the present disclosure may be installed on and sold with one or more computing devices consistent with the present disclosure. Alternatively, the software may be obtained and loaded into one or more computing devices, including obtaining the software through a physical medium or distribution system, including for example, obtaining the software from a server owned by the software creator or from a server not owned but used by the software creator. For example, the software may be stored in a server for distribution over the internet.
Furthermore, it will be appreciated by those skilled in the art that the present disclosure is not limited in its application to the details of construction and the arrangement of components set forth in the description or illustrated in the drawings. The embodiments herein are capable of other embodiments and of being practiced or of being carried out in various ways. Also, it is to be understood that the phraseology and terminology used herein is for the purpose of description and should not be regarded as limiting. The use of "including," "comprising," or "having" and variations thereof herein is meant to encompass the items listed thereafter and equivalents thereof as well as additional items. The terms "connected," "coupled," and "mounted," and variations thereof, as used herein, are used broadly and encompass both direct and indirect connections, couplings, and mountings, unless otherwise limited. Furthermore, the terms "connected" and "coupled" and their variants are not limited to physical or mechanical connections or couplings.
The components of the illustrative devices, systems, and methods employed in accordance with the illustrated embodiments may be implemented at least in part in digital electronic circuitry, analog electronic circuitry, computer hardware, firmware, software, or in combinations of them. For example, these components may be implemented as a computer program product, such as a computer program, program code, or computer instructions tangibly embodied in an information carrier or in a machine-readable storage device for execution by, or to control the operation of, data processing apparatus, such as a programmable processor, a computer, or multiple computers.
A computer program can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program can be deployed to be run on one computer or on multiple computers at one site or distributed across multiple sites and interconnected by a communication network. Furthermore, functional programs, codes, and code segments for accomplishing the techniques described herein may be easily construed by one skilled in the art to which the techniques described herein pertain to be within the scope of the claims. Method steps associated with the illustrative embodiments may be performed by one or more programmable processors executing a computer program, code, or instructions to perform functions (e.g., by operating on input data and/or generating output). Method steps may also be performed by, and apparatus for performing the methods may be implemented as, special purpose logic circuitry, e.g., a field-programmable gate array (FPGA) or an application-specific integrated circuit (ASIC).
The various illustrative logical blocks, modules, and circuits described in connection with the embodiments disclosed herein may be implemented or performed with a general purpose processor, a digital signal processor (DSP), an ASIC, an FPGA or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read-only memory or a random access memory or both. Elements of a computer system include a processor for executing instructions and one or more memory devices for storing instructions and data. Typically, a computer will also include, or be operatively coupled to receive data from and/or transfer data to, one or more mass storage devices for storing data, e.g., magnetic disks, magneto-optical disks, or optical disks. Information carriers suitable for embodying computer program instructions and data include all forms of non-volatile memory, including by way of example semiconductor memory devices, e.g., erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), and flash memory devices, and data storage disks (e.g., magnetic disks, internal hard disks, removable disks, magneto-optical disks, CD-ROMs, and DVD-ROMs). The processor and the memory may be supplemented by, or incorporated in, special purpose logic circuitry.
Those of skill in the art would understand that information and signals may be represented using any of a variety of different technologies and techniques. For example, data, instructions, commands, information, signals, bits, symbols, and chips that may be referenced throughout the above description may be represented by voltages, currents, electromagnetic waves, magnetic fields or particles, optical fields or particles, or any combination thereof.
"machine-readable medium" as used herein refers to a device capable of temporarily or permanently storing instructions and data, and may include, but is not limited to, random Access Memory (RAM), read Only Memory (ROM), buffer memory, flash memory, optical media, magnetic media, cache memory, other types of storage (e.g., erasable programmable read-only memory (EPROM)), and/or any suitable combination thereof. The term "machine-readable medium" shall be taken to include a single medium or multiple media (e.g., a centralized or distributed database, or associated caches and servers) that are capable of storing the processor instructions. The term "machine-readable medium" shall also be taken to include any medium or combination of media that is capable of storing instructions for execution by one or more processors (e.g., processing unit 802) such that the instructions, when executed by the one or more processors, cause the one or more processors to perform any one or more of the methodologies described herein. Thus, a "machine-readable medium" refers to a single storage device or apparatus, and a "cloud-based" storage system comprising multiple storage devices or apparatus.
In addition, techniques, systems, subsystems, and methods described and illustrated in the various embodiments as discrete or separate may be combined or integrated with other systems, modules, techniques, or methods without departing from the scope of the present disclosure. Other items shown or described as coupled or directly coupled or communicating with each other may be indirectly coupled or communicating through some interface, device, or intermediate component, whether electrically, mechanically, or otherwise. Other examples of changes, substitutions, and alterations are ascertainable by one skilled in the art and could be made without departing from the scope of the subject matter disclosed herein.
Although the present disclosure has been described with reference to specific features and embodiments thereof, it will be apparent that various modifications and combinations of the specific features and embodiments of the disclosure can be made without departing from the scope of the disclosure. Accordingly, the specification and drawings are to be regarded only as illustrative of the present disclosure as defined in the appended claims, and are intended to cover any and all modifications, variations, combinations, or equivalents that fall within the scope of the present disclosure.

Claims (20)

1. An error correction method for a Solid State Disk (SSD), wherein the SSD includes a plurality of NAND flash memory devices, each NAND flash memory device of the plurality of NAND flash memory devices having an on-chip NAND flash memory and a respective decoder of a plurality of decoders, the method comprising:
a first NAND flash memory device of the plurality of NAND flash memory devices receives a request to read data, the data being stored on the NAND flash memory of the first NAND flash memory device:
wherein data stored on the NAND flash memory has been encoded using an Error Correction Code (ECC);
the request is received by the first NAND flash device from a flash controller of the SSD over a first channel associated with the first NAND flash device;
retrieving, from the NAND flash memory of the first NAND flash memory device, the data and the ECC used to encode the data;
the first NAND flash device determining whether a parameter specified in the request to read data meets an error correction criterion for decoding the data encoded using the ECC using a first decoder of the plurality of decoders, wherein the first decoder is implemented on the first NAND flash device; and
in response to determining that the parameter meets the error correction criteria, performing the following:
decoding the data using a first decoder implemented on the first NAND flash memory device to correct one or more errors in the retrieved data according to an ECC used to encode the data;
and transmitting the decoded data to a flash memory controller of the SSD through the first channel.
2. The error correction method of claim 1, further comprising:
in response to determining that the parameter does not meet the error correction criteria, performing the following:
bypassing a first decoder implemented on the first NAND flash memory device;
transmitting the retrieved data and the ECC used to encode the data over the first channel to a flash controller of the SSD, the flash controller decoding the data using a second decoder implemented by the flash controller according to the ECC used to encode the data to correct one or more errors in the retrieved data.
3. The error correction method of claim 1, wherein the error correction criteria comprises at least one of a plurality of priority issues including a delay parameter, a balance parameter, or an energy consumption parameter.
4. The error correction method of claim 1, further comprising:
determining that the first decoder is busy performing other operations;
in response to determining that the first decoder is busy performing other operations, the retrieved data and the ECC used to encode the data are transmitted over the first channel to a flash controller of the SSD, which decodes the data using a second decoder implemented by the flash controller in accordance with the ECC used to encode the data to correct one or more errors in the retrieved data.
5. The error correction method of claim 1, further comprising:
determining that the first decoder is busy performing other operations;
in response to determining that the first decoder is busy performing other operations, the retrieved data and the ECC used to encode the data are transmitted over the first channel to a second NAND flash device, which decodes the data using a second decoder implemented by the second NAND flash device in accordance with the ECC used to encode the data to correct one or more errors in the retrieved data.
6. The error correction method of claim 5, further comprising:
the flash controller of the SSD receives the decoded data from the second NAND flash device over a second channel associated with the second NAND flash device.
7. The error correction method of claim 5, wherein the retrieved data and the ECC are transferred to the second NAND flash device by a flash controller of the SSD.
8. The error correction method of claim 5, further comprising:
determining that the error correction criteria corresponds to a prioritized delay parameter to reduce error correction delay;
transmitting the retrieved data and the ECC used to encode the data to a flash controller of the SSD over the first channel, the flash controller decoding the data using a second decoder implemented by the flash controller in accordance with the ECC used to encode the data while decoding the data using the first decoder in accordance with the ECC used to encode the data to correct one or more errors in the retrieved data.
9. The error correction method of claim 8, further comprising: accessing the decoded data from whichever of the first decoder and the second decoder first completes decoding the data.
10. The error correction method of claim 1, further comprising:
determining that the error correction criteria corresponds to a prioritized balance parameter;
partially decoding the data using a first decoder implemented on the first NAND flash memory device according to an ECC used to encode the data;
transmitting the partially decoded data and ECC used to encode the data to a flash controller of the SSD over the first channel, the flash controller completing decoding of the partially decoded data using a second decoder implemented by the flash controller according to the ECC used to encode the data.
11. The error correction method of claim 10, wherein the first decoder is to perform weak error correction comprising at least one of hard sensing of the data or a first number of iterations; and the second decoder is to perform strong error correction comprising at least one of soft sensing and hard sensing of the data or a second number of iterations, the second number of iterations being greater than the first number of iterations.
12. The error correction method of claim 10, wherein the first decoder and the second decoder comprise different resource characteristics and different delays.
13. The error correction method of claim 1, wherein the NAND flash memory device comprises a 3D flash memory device or a 4D flash memory device.
14. The error correction method of claim 1, further comprising:
generating an error correction result in the first decoder;
determining that uncorrectable errors exist in the error correction result;
in response to determining that the uncorrectable error exists in the error correction result, sending a data packet to a flash controller of the SSD, the data packet including the data and error correction information from the first NAND flash memory device.
15. The error correction method of claim 1, wherein the ECC comprises a block code.
16. The error correction method of claim 1, wherein the ECC comprises a convolutional code.
17. A system for performing error correction in a Solid State Disk (SSD), the system comprising:
a plurality of NAND flash memory devices, each NAND flash memory device of the plurality of NAND flash memory devices having an on-chip NAND flash memory and a respective decoder of a plurality of decoders, a first NAND flash memory device of the plurality of NAND flash memory devices performing operations comprising:
receiving a request to read data, the data being stored on the NAND flash memory of the first NAND flash memory device:
wherein data stored on the NAND flash memory has been encoded using an Error Correction Code (ECC);
the request is received by the first NAND flash device from a flash controller of the SSD over a first channel associated with the first NAND flash device;
retrieving, from the NAND flash memory of the first NAND flash memory device, the data and the ECC used to encode the data;
determining whether a parameter specified in the request to read data meets an error correction criterion for decoding the data encoded using the ECC using a first decoder of the plurality of decoders, wherein the first decoder is implemented on the first NAND flash memory device; and
in response to determining that the parameter meets the error correction criteria, performing the following:
decoding the data using the first decoder implemented on the first NAND flash memory device to correct one or more errors in the retrieved data according to an ECC used to encode the data;
and transmitting the decoded data to a flash memory controller of the SSD through the first channel.
18. The system of claim 17, wherein the operations further comprise:
in response to determining that the parameter does not meet the error correction criteria, performing the following:
bypassing a first decoder implemented on the first NAND flash memory device;
transmitting the retrieved data and the ECC used to encode the data over the first channel to a flash controller of the SSD, the flash controller decoding the data using a second decoder implemented by the flash controller according to the ECC used to encode the data to correct one or more errors in the retrieved data.
19. The system of claim 17, wherein the error correction criteria comprises at least one of a plurality of priority issues including a delay parameter, a balance parameter, or an energy consumption parameter.
20. An apparatus for a Solid State Disk (SSD), wherein the SSD includes a plurality of NAND flash memory devices, each NAND flash memory device of the plurality of NAND flash memory devices having an on-chip NAND flash memory and a respective decoder of a plurality of decoders, the apparatus comprising:
means for receiving, by a first NAND flash memory device of the plurality of NAND flash memory devices, a request to read data stored on NAND flash memory of the first NAND flash memory device:
wherein data stored on the NAND flash memory has been encoded using an Error Correction Code (ECC);
the request is received by the first NAND flash device from a flash controller of the SSD over a first channel associated with the first NAND flash device;
means for retrieving, from the NAND flash memory of the first NAND flash memory device, the data and the ECC used to encode the data;
means for determining, by the first NAND flash memory device, whether a parameter specified in the request to read data meets an error correction criterion for decoding the data encoded using the ECC using a first decoder of the plurality of decoders, wherein the first decoder is implemented on the first NAND flash memory device; and
means for, in response to determining that the parameter meets the error correction criteria, performing the following:
decoding the data using a first decoder implemented on the first NAND flash memory device to correct one or more errors in the retrieved data according to an ECC used to encode the data;
and transmitting the decoded data to a flash memory controller of the SSD through the first channel.
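
By way of illustration only, and without limiting the claims, the decoder selection recited in claims 1, 2, 4, 8, and 10 may be sketched in Python as follows. The names ReadRequest, Priority, Route, and choose_route, the boolean inputs meets_criterion and on_die_busy, and the order in which the conditions are tested are assumptions made for this sketch and do not appear in the disclosure. The claims do not recite how an energy consumption parameter affects routing, so the sketch simply leaves such requests on the default on-die path.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class Priority(Enum):
    """Hypothetical encoding of the parameter carried in a read request."""
    LATENCY = auto()
    BALANCE = auto()
    ENERGY = auto()


class Route(Enum):
    """Where the retrieved data is decoded (names are illustrative)."""
    ON_DIE = auto()      # claim 1: first decoder on the NAND flash memory device
    CONTROLLER = auto()  # claims 2 and 4: bypass, second decoder in the controller
    PARALLEL = auto()    # claim 8: both decoders run, first to finish is used
    SPLIT = auto()       # claim 10: partial decode on die, controller completes


@dataclass(frozen=True)
class ReadRequest:
    address: int
    priority: Optional[Priority] = None  # assumed optional request field


def choose_route(req: ReadRequest, meets_criterion: bool, on_die_busy: bool) -> Route:
    """Pick a decode path; the test order below is an assumption of this sketch."""
    if not meets_criterion:
        # Claim 2: criterion not met, so the on-die decoder is bypassed.
        return Route.CONTROLLER
    if on_die_busy:
        # Claim 4: on-die decoder busy, so data and ECC go to the controller.
        return Route.CONTROLLER
    if req.priority is Priority.LATENCY:
        # Claim 8: prioritized delay parameter, decode in both places at once.
        return Route.PARALLEL
    if req.priority is Priority.BALANCE:
        # Claim 10: prioritized balance parameter, split the decoding work.
        return Route.SPLIT
    # Claim 1: decode on the device and return decoded data over the channel.
    return Route.ON_DIE
```

For example, choose_route(ReadRequest(address=0, priority=Priority.BALANCE), meets_criterion=True, on_die_busy=False) selects the split path, whereas the same request with on_die_busy=True is routed to the flash controller.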
CN202080107284.8A 2020-12-10 2020-12-10 Distributed ECC scheme in a memory controller Pending CN116490853A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/US2020/064310 WO2022125101A1 (en) 2020-12-10 2020-12-10 Distributed ecc scheme in memory controllers

Publications (1)

Publication Number Publication Date
CN116490853A true CN116490853A (en) 2023-07-25

Family

ID=74183507

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202080107284.8A Pending CN116490853A (en) 2020-12-10 2020-12-10 Distributed ECC scheme in a memory controller

Country Status (2)

Country Link
CN (1) CN116490853A (en)
WO (1) WO2022125101A1 (en)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8495467B1 (en) * 2009-06-30 2013-07-23 Micron Technology, Inc. Switchable on-die memory error correcting engine
US10146482B2 (en) * 2014-08-01 2018-12-04 Toshiba Memory Corporation Global error recovery system
US10599515B2 (en) * 2017-12-21 2020-03-24 Intel Corporation Transfer of encoded data stored in non-volatile memory for decoding by a controller of a memory device

Also Published As

Publication number Publication date
WO2022125101A1 (en) 2022-06-16

Similar Documents

Publication Publication Date Title
CN111433732B (en) Storage device and computer-implemented method performed by the storage device
KR101912596B1 (en) Non-volatile memory program failure recovery via redundant arrays
US20200042223A1 (en) System and method for facilitating a high-density storage device with improved performance and endurance
JP6387231B2 (en) Management of nonvolatile memory writing and area selection
US20140379959A1 (en) Map recycling acceleration
KR20150020136A (en) Translation layer partitioned between host and controller
KR20120113853A (en) Memory controller, data processing method thereof, memory system having the same
US11372564B2 (en) Apparatus and method for dynamically allocating data paths in response to resource usage in data processing system
US11294834B2 (en) Data processing system allocating memory area in host as extension of memory and operating method thereof
KR20220001222A (en) Memory system for handling a bad block and operation method thereof
US11870463B2 (en) Data reliability for extreme temperature usage conditions in data storage
US11355213B2 (en) Apparatus and method for verifying reliability of data read from memory device through clock modulation, and memory system including the same
WO2014144043A1 (en) Apparatus and method for generating descriptors to reaccess a non-volatile semiconductor memory of a storage drive due to an error
EP4020244A1 (en) Memory system architecture for heterogeneous memory technologies
KR20220045343A (en) Apparatus and method for correcting an error in data transmission of a data processing system
CN113010098A (en) Apparatus and method for improving input/output throughput of memory system
CN115480707A (en) Data storage method and device
KR20210121654A (en) Apparatus and method for recovering a data error in a memory system
CN111435334A (en) Apparatus and method for checking valid data in memory system
JP6342013B2 (en) Method, system and computer program for operating a data storage system including a non-volatile memory array
KR20210131058A (en) Apparatus and method for protecting data in a memory system
CN112599170A (en) Apparatus and method for providing multi-stream operation in a memory system
CN116490853A (en) Distributed ECC scheme in a memory controller
CN116775368A (en) Multi-layer code rate architecture for copyback between partitions at different code rates
TW202316259A (en) Apparatus and method for controlling a shared memory in a data processing system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination