WO2011019794A2 - Method and apparatus for addressing predicted or actual failures in a flash memory system - Google Patents

Publication number
WO2011019794A2
Authority
WO
WIPO (PCT)
Prior art keywords
data
flash memory
page
stripe
stored
Application number
PCT/US2010/045129
Other languages
English (en)
Other versions
WO2011019794A3 (fr)
Inventor
Holloway H. Frost
Charles J. Camp
James A. Fuxa
Original Assignee
Texas Memory Systems, Inc.
Priority claimed from US12/554,891 (US7856528B1)
Application filed by Texas Memory Systems, Inc.
Publication of WO2011019794A2
Publication of WO2011019794A3

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1004 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's to protect a block of data words, e.g. CRC or checksum
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1008 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices
    • G06F11/1068 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices in sector programmable memories, e.g. flash disk
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16 Error detection or correction of the data by redundancy in hardware
    • G06F11/1666 Error detection or correction of the data by redundancy in hardware where the redundant component is memory or memory area

Definitions

  • 12/554,891 entitled “Method and Apparatus for Protecting Data Using Variable Size Page Stripes in a FLASH-Based Storage System,” filed September 5, 2009; and (4) U.S. Non-Provisional Application Serial No. 12/554,892, entitled “FLASH-based Memory System with Variable Length Page Stripes Including Data Protection Information,” filed September 5, 2009.
  • This disclosure relates generally to methods and apparatus for improving the ability of a memory storage system to efficiently and effectively protect, store and retrieve data stored in multiple storage locations.
  • data is stored in multiple storage locations.
  • multiple individual hard disks or memory chips are used to store data and the data stored in one or more of the storage devices is associated with data stored in other storage devices in such a manner that data errors in one or more storage devices can be detected and possibly corrected.
  • One such approach is to store a given quantity of data across multiple storage locations by dividing the data into data portions of equal length - the individual data portions sometimes being referred to as "data pages" - and then storing the data pages in multiple storage locations such that one data page is stored in each storage device.
  • a further storage device may be used to store a page of data protection information, where a given page of data protection information is associated with a specific set of data pages stored in the multiple storage locations.
  • the set of data pages in the multiple locations that is used to store associated data is referred to as a "data stripe" or "Page Stripe."
  • In conventional systems, the length of all of the data stripes used in the system is the same. Thus, in such systems, all of the data stored in the system is divided into data stripes of the same length, with each data stripe consisting of the same number of pages, and with each data stripe being stored in the same number of memory locations. Also, in such systems, each data stripe conventionally utilizes the same form of data protection and the data protection information for each data stripe is determined in the same way.
  • the data protection information for a given data stripe can often be used to reconstruct the data in the data page that was stored in the failed memory location. Using the reconstructed data, the data for the entire data stripe may be reconstructed.
  • the reconstructed data is stored in a reserve or back-up storage location that takes the place of the failed storage location within the system such that the data stripe that was associated with the failed memory location is reconstructed in substantially the same form.
  • the reconstructed data stripe consists of the same number of pages, is stored in the same number of memory locations, and utilizes the same form of data protection as the data stripe that was associated with the failed storage location.
  • the disclosed embodiments are directed to methods and apparatuses for providing efficient and enhanced protection of data stored in a FLASH memory system.
  • the methods and apparatuses involve a system controller for a plurality of FLASH memory devices in the FLASH memory system that is capable of adapting to the failure of one or more of the FLASH memory devices.
  • the system controller is configured to store data in the FLASH memory devices in the form of page stripes, with each page stripe composed of a plurality of data pages, and each data page being stored in a FLASH memory device that is different from each of the FLASH memory devices in which the other data pages of the page stripe are stored.
  • the system controller is also configured to detect failure of a FLASH memory device in which a data page of a particular page stripe is stored, reconstruct the data that was stored within the data page of that page stripe, and store the reconstructed data page as a data page within a new page stripe, where the number of data pages in the new page stripe is less than the number of data pages in the particular page stripe, and where no page of the new page stripe is stored in a memory location within the failed FLASH memory device.
  • the system controller is configured to write data to the FLASH memory devices in a striped fashion using data stripes, with each data stripe including a group of data collections.
  • the system controller writes the data in a manner such that each data collection within a group of data collections is written into a FLASH memory device that differs from the FLASH memory devices into which the other data collections within the group of data collections are written, and the number of data collections used to form each data stripe is based, at least in part, on failure information associated with the FLASH memory devices such that the controller can adjust the number of data collections used for one or more page stripes in response to information indicating that all or part of one or more FLASH memory devices has failed.
  • the system controller is configured to receive WRITE requests from an external host device, each WRITE request including a data item and a logical memory address associated with the data item. For each WRITE request, the system controller translates the logical memory address to a physical memory address and writes the data item to a physical memory location corresponding to the physical memory address. The system controller then associates a number of data items received in a plurality of WRITE requests with each other to form a group of received data items, generates data protection information for each group of data items, writes the data protection information to a physical memory location, translates the received logical addresses for the data items in the group, and selects the physical memory location for storage of the data protection information.
  • the storage is performed by the system controller in such a way that each of the data items is stored in a physical memory location within a FLASH memory device that is different from the FLASH memory devices in which the other data items and the data protection information for the group are stored.
  • the system controller can also adjust the number of data items used to form each group in response to information indicating the actual or predicted failure of all or part of one or more FLASH memory devices, such that the number of data items in one group of received data items stored during a time when all of the FLASH memory devices are operable can differ from the number of data items in a second group of received data items stored at a time after the predicted or actual failure of all or part of one or more FLASH memory devices.
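  • As a minimal sketch of the failure-adaptive behaviour described above, the following Python fragment chooses the devices for a new page stripe so that no page lands on a failed device, which makes the stripe narrower once a device has failed. The function name `plan_stripe`, the use of integers as device identifiers, and the convention of placing the protection page on the last chosen device are illustrative assumptions, not details taken from the disclosure.

```python
def plan_stripe(devices, failed, max_data_pages=9):
    """Pick devices for one page stripe, skipping failed devices.

    Returns (data_devices, protection_device). Every page of the stripe is
    placed on a different device, and no page is placed on a failed device.
    The names and the "last device holds protection" rule are hypothetical.
    """
    usable = [d for d in devices if d not in failed]
    if len(usable) < 2:
        raise RuntimeError("need at least one data device and one protection device")
    width = min(len(usable), max_data_pages + 1)  # data pages plus one protection page
    chosen = usable[:width]
    return chosen[:-1], chosen[-1]
```

  • With ten devices and none failed, this sketch yields nine data pages and one protection page; after one device is marked failed, the next stripe carries eight data pages, so a reconstructed page can be rewritten into a shorter stripe that avoids the failed device.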
  • the disclosed embodiments are also directed to methods and apparatuses for providing efficient and enhanced protection of data stored in a FLASH memory system.
  • the methods and apparatuses involve a system controller for a plurality of FLASH memory devices in the FLASH memory system that is capable of using variable size page stripes for the FLASH memory devices.
  • the system controller is configured to store data such that each page stripe comprises a plurality of pages of data, with each page of data in the page stripe being stored in a FLASH memory device that is different from the FLASH memory devices in which the other pages of data in the page stripe are stored.
  • the system controller is also configured to maintain one or more buffers containing information reflecting blocks of memory within the FLASH memory devices that have been erased and are available for storage of information, and to dynamically determine the number of pages to be included in a page stripe based on the information contained in the one or more buffers such that a first page stripe can have a first number of pages of data and a second page stripe can have a second number of pages of data, where the first number is different from the second number.
  • the system controller is configured to write data to the FLASH memory devices in a striped fashion using data stripes, with each data stripe including a group of data collections.
  • the system controller writes the data in a manner such that each data stripe includes a group of data collections and an associated set of data protection information, each data collection is written into a different FLASH memory device from the other data collections and the data protection information in the data stripe, and the controller can, on a group by group basis, adjust the number of data collections within each data stripe.
  • the system controller is configured to receive WRITE requests from an external host device, each WRITE request including a data item and a logical memory address associated with the data item. For each WRITE request, the system controller translates the logical memory address to a physical memory address and writes the data item to a physical memory location corresponding to the physical memory address. The system controller then associates a number of data items received in a plurality of WRITE requests with each other to form a group of received data items, generates data protection information for each group of data items, writes the data protection information to a physical memory location, and translates the received logical memory addresses for the data items in the group.
  • the controller thereafter selects the physical memory location for storage of the data protection information such that each of the data items is stored in a physical memory location within a different FLASH memory device from the FLASH memory devices in which the other data items and the data protection information for the group of received data items are stored.
  • the system controller can also dynamically select the number of data items used to form each group such that the number of data items in one group of received data items can differ from the number of data items in a second group of received data items.
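  • The buffer-driven choice of stripe size described above can be sketched as follows. This is an illustration under stated assumptions: the dictionary of per-device queues, the function name `allocate_stripe`, and the simplification that each buffer entry stands for one available page location are all hypothetical and are not taken from the disclosure.

```python
from collections import deque


def allocate_stripe(free_locations, max_pages=10):
    """Build one page stripe from per-device buffers of erased locations.

    free_locations maps a device identifier to a deque of available
    locations (hypothetical structure). The stripe takes one location from
    each device that still has one, so its length follows what is available.
    """
    lanes = [lane for lane, queue in free_locations.items() if queue][:max_pages]
    if len(lanes) < 2:
        raise RuntimeError("not enough devices with erased locations")
    return {lane: free_locations[lane].popleft() for lane in lanes}
```

  • Calling the function twice on buffers where some devices hold fewer erased locations than others produces a first stripe and a second stripe with different numbers of pages, which is the variable-size behaviour the text describes.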
  • the disclosed embodiments are further directed to methods and apparatuses for providing efficient and enhanced protection of data stored in a FLASH memory system.
  • the methods and apparatuses involve a system controller for a plurality of FLASH memory devices in the FLASH memory system that is capable of protecting data using different size page stripes.
  • the system controller is configured to store data in the FLASH memory devices in the form of page stripes, each page stripe comprising a plurality of pages of information, each page of information being stored in a FLASH memory device that is different from each of the FLASH memory devices in which the other pages of information within the page stripe are stored.
  • the system controller stores the data in a manner such that the pages making up each page stripe include a plurality of data pages and at least one data protection page containing data protection information that may be used to reconstruct data stored in a data page within the page stripe that becomes corrupted or unavailable, with the data protection information for a given page stripe being obtained by performing a bitwise logical operation on the information within the data pages for the given page stripe.
  • the page stripes stored by the system controller include a first page stripe having N data pages and one data protection page, and a second page stripe having M data pages and one data protection page, where N is an integer greater than three and M is an integer less than N.
  • the system controller is configured to write data to the FLASH memory devices in a striped fashion using data stripes, where each data stripe includes a group of data collections and an associated set of data protection information.
  • the system controller writes the data in a manner such that each data collection within a group of data collections is written into a FLASH memory device that differs from the FLASH memory devices into which the other data collections within the group of data collections and the data protection information associated with the group of data collections are written, each data collection includes error correction code data generated from the data stored in the data collection, and each set of data protection information is generated by performing a bitwise exclusive-or (XOR) of the data within the data collections associated with the set of data protection information.
  • the system controller may then identify an error within the data collection using the error correction code data within each data collection and reconstruct the data for a data collection associated with the set of data protection information using the set of data protection information, where the number of data collections within the group of data collections can be varied.
  • the system controller is configured to write information to the FLASH memory devices using data stripes such that a first data stripe has M data pages and a data protection page, where M is an integer greater than three and where information stored in the data protection page for the first data stripe was generated through the performance of a given operation on information stored within the M data pages of the first data stripe.
  • the controller is further configured to write a second data stripe having N data pages and a data protection page in the FLASH memory devices, where N is an integer less than M, and wherein information stored in the data protection page for the second data stripe was generated from the performance of the given operation on information stored within the N data pages of the second stripe.
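  • The bitwise XOR protection scheme described above can be illustrated with a short Python sketch. The function names and the small demonstration page size are assumptions made for illustration; the same operation applies unchanged to stripes with different numbers of data pages.

```python
def xor_pages(pages):
    """Return the bitwise XOR of a non-empty list of equal-length byte strings."""
    if not pages:
        raise ValueError("at least one page is required")
    length = len(pages[0])
    if any(len(p) != length for p in pages):
        raise ValueError("all pages must have the same length")
    acc = 0
    for p in pages:
        acc ^= int.from_bytes(p, "big")
    return acc.to_bytes(length, "big")


def make_protection_page(data_pages):
    """Data protection page: XOR of every data page in the stripe."""
    return xor_pages(data_pages)


def reconstruct_page(surviving_data_pages, protection_page):
    """Rebuild the single missing data page from the survivors and the protection page."""
    return xor_pages(list(surviving_data_pages) + [protection_page])


def demo_pages(count, page_size=16):
    """Deterministic sample pages (hypothetical helper for the usage example)."""
    return [bytes((i * 31 + j) % 256 for j in range(page_size)) for i in range(count)]
```

  • Because XOR is its own inverse, combining the protection page with all but one data page yields the missing page, whether the stripe holds nine data pages or four. This sketch recovers only one lost page per stripe, matching the single protection page described in the text.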
  • Figure 1 illustrates an exemplary FLASH memory storage system in accordance with the present disclosure
  • Figures 2A and 2B illustrate an exemplary arrangement of physical memory within a FLASH memory chip in accordance with the present disclosure
  • Figures 3A-3F illustrate exemplary implementations of Page Stripes in accordance with the present disclosure
  • Figure 4 illustrates an exemplary Data Page in accordance with the present disclosure
  • Figure 5 illustrates an exemplary Data Protection Page in accordance with the present disclosure
  • Figure 6 illustrates an exemplary circuit that can be used to produce a Data Protection Page in accordance with the present disclosure
  • Figures 7A and 7B illustrate an exemplary Page Stripe and an exemplary storage arrangement for the Page Stripe in accordance with the present disclosure
  • Figures 8A and 8B illustrate another exemplary Page Stripe and another exemplary storage arrangement therefor in accordance with the present disclosure
  • Figures 9A-9D illustrate additional exemplary Page Stripes and additional exemplary storage arrangements therefor in accordance with the present disclosure
  • Figures 10A-10D illustrate further exemplary Page Stripes and further exemplary storage arrangements therefor in accordance with the present disclosure
  • Figure 11 illustrates an exemplary arrangement of Data Pages within groups of Blocks in accordance with the present disclosure
  • Figure 12 illustrates an exemplary arrangement of Data Pages within groups of Blocks where data pages that already contain data are indicated as unavailable in accordance with the present disclosure
  • Figure 13 illustrates an exemplary Ready-to-Erase buffer in accordance with the present disclosure
  • Figures 14A-14D illustrate another exemplary FLASH memory storage system and exemplary storage arrangement where memory chips that have failed are indicated as unavailable in accordance with the present disclosure.
  • Figures 15A and 15B illustrate an exemplary Logical-to-Physical Translation Table having Data Identifiers therein in accordance with the present disclosure.
  • In Figure 1, a memory storage system 100 in accordance with certain teachings of the present disclosure is illustrated. While it can be constructed in various ways, in the example of Figure 1, the memory storage system is constructed on a single multi-layer printed circuit board.
  • the exemplary illustrated memory storage system 100 includes: a FLASH controller 10; FLASH controller memory 11; a CPU 15; CPU memory 17; an external communication bus 12 used to communicate information to the FLASH controller 10; a FLASH memory storage array 14; and an internal communication bus 16 that enables communications between the FLASH controller 10 and the FLASH memory storage array 14.
  • the components of the memory storage system 100 are mounted to the same printed circuit board. Such mounting may be accomplished through, for example, surface mounting techniques, through-hole techniques, through the use of sockets and socket-mounts and/or other mounting techniques.
  • the FLASH controller 10 may take many forms.
  • the FLASH controller 10 is a field programmable gate array (FPGA) that, during start-up of the system is programmed and configured by the CPU 15.
  • the controller memory 11 may take many forms. In the exemplary embodiment of Figure 1, the controller memory 11 takes the form of random access memory and in particular DDR2 RAM memory.
  • the communication bus 12 can be any acceptable data bus for communicating memory access requests between a host device (such as a personal computer, a router, etc.) and the memory system 100.
  • the communication bus 12 can also use any acceptable data communications protocols.
  • the FLASH controller 10 receives requests via communication bus 12 to read data stored in the FLASH memory storage array 14 and/or to store data in the FLASH memory storage array 14.
  • the FLASH controller 10 responds to these requests either by accessing the FLASH memory storage array 14 to read or write the requested data from or into the storage array 14 in accordance with the request, by accessing a memory cache (not illustrated) associated with the storage array 14, or by performing a read or write operation through the use of a Data Identifier as described in more detail below.
  • the FLASH memory storage array 14 may take many forms.
  • the FLASH memory storage array 14 is formed from twenty individually addressable FLASH memory storage devices divided into groups of two (0a, 0b), (1a, 1b), (2a, 2b) through (9a, 9b).
  • each of the FLASH memory storage devices 0a-9b takes the form of a board-mounted FLASH memory chip, such as, for example, a 64 Gigabit (Gb) Single Level Cell (SLC) NAND flash memory chip.
  • the internal communication bus 16 can take any form that enables the communications described herein.
  • this bus 16 is formed from ten individual eight-bit communication buses 0-9 (not individually illustrated), each arranged to enable communication between the systems controller 10 and each of the groups of two memory storage devices 0a-9b.
  • communication bus 0 enables communications between the FLASH controller 10 and the group comprising memory devices 0a and 0b
  • communication bus 4 enables communications between the systems controller 10 and the memory devices 4a and 4b.
  • an on-board ultra-capacitor 18 may also be provided and configured to receive charge during intervals when power is supplied to the FLASH memory system 100 and to provide power for a limited time to the components making up the FLASH memory system 100 whenever applied power is removed or drops below the power level provided by the ultra-capacitor.
  • the purpose of the ultra-capacitor is to provide power for limited operation of the FLASH memory system 100 upon the failure of power to the system. In the event of a power loss, the ultra-capacitor will automatically engage and provide power to most or all components of the FLASH memory system 100.
  • the ultra-capacitor is sized to provide adequate power to allow the system to store into the FLASH memory array 14 any data that may be retained in the RAM storage device 11 at the time of power loss or power failure, as well as any other volatile information that may be necessary or useful for proper board operation.
  • the overall FLASH system 100 acts as a non-volatile memory system, even though it utilizes various volatile memory components.
  • multiple ultra-capacitors at various distributed locations across the printed circuit board and/or a single ultra-capacitor bank is used to provide the described back-up power.
  • the term ultra-capacitor refers to any capacitor that has sufficiently high capacitance to provide the back-up power required to perform the functions described above and that is adequately sized to fit on a printed circuit board and be used in a system such as system 100.
  • the system 100 uses an addressing scheme to allow the FLASH controller 10 to access specific memory locations within the memory array 14.
  • this addressing scheme will be discussed in the context of a WRITE request, although it will be understood that the same addressing scheme is and can be used for other requests, such as READ requests.
  • the FLASH controller 10 will receive a WRITE request from a host device that contains both: (i) data to be stored in the memory system 100 and (ii) an indication of the memory address where the host device would like for the data to be stored.
  • the WRITE request may also include an indication of the amount (or size) of the data to be transferred.
  • the system is constructed such that the amount of data (or the size of each WRITE request) is fixed at the size of a single FLASH memory page. In the exemplary embodiment of Figure 1, this corresponds to 4KBytes of information.
  • the address provided by the host device can correspond to the address of a Page within a logical address space.
  • the address received by the FLASH controller 10 does not refer to an actual physical location within the memory array 14. Instead, the address received by the Flash Controller 10 from the host device is a Logical Block Address (or "LBA") because it refers to a logical address, rather than to any specific physical location within the memory array 14.
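  • A minimal sketch of the logical-to-physical indirection described above is given below. The class and field names (`PhysicalAddress`, `TranslationTable`, `record_write`, `resolve`) are hypothetical, and the in-memory dictionary is an assumption chosen for brevity rather than the table structure used by the disclosed controller.

```python
from typing import Dict, NamedTuple, Optional


class PhysicalAddress(NamedTuple):
    """Hypothetical physical location of one Page within the memory array."""
    lane: int
    chip: str
    ce: int
    die: int
    plane: int
    block: int
    page: int


class TranslationTable:
    """Maps host-visible Logical Block Addresses to physical Page locations."""

    def __init__(self) -> None:
        self._entries: Dict[int, PhysicalAddress] = {}

    def record_write(self, lba: int, location: PhysicalAddress) -> None:
        # The controller, not the host, decides where the data actually lives.
        self._entries[lba] = location

    def resolve(self, lba: int) -> Optional[PhysicalAddress]:
        # Returns None for an LBA that has never been written.
        return self._entries.get(lba)
```

  • The host only ever supplies the LBA; the table is what lets the controller place the same LBA at a different physical location on a later write.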
  • the memory array 14 comprises a collection of individual FLASH memory storage chips.
  • a specific physical addressing scheme is used to allow access to the various physical memory locations within the FLASH memory chips 0a-9b. In the embodiment of Figure 1, this physical addressing scheme is based on the physical organization and layout of the memory array 14.
  • each group of two chips forms a "Lane," also sometimes referred to as a "Channel," such that there are ten Lanes or Channels within the memory array 14 (LANE0-LANE9).
  • LANE0 corresponds to chips 0a and 0b; LANE1 to chips 1a and 1b and so on, with LANE9 corresponding to chips 9a and 9b.
  • each of the individual Lanes has associated with it one of the individual eight-bit buses 0-9 mentioned earlier to enable the FLASH controller 10 to communicate information across the Lane.
  • the FLASH controller 10 can direct its communications to one of the Lanes of memory chips. Because each communication bus 0-9 for a given Lane is independent of the communication buses for the other Lanes, the controller 10 can issue commands and send or receive data across the various communication buses at the same time such that the system controller can access the memory chips corresponding to the individual Lanes at, or very nearly at, the same time.
  • each Lane enables communications with one of two physical memory chips at any given time.
  • data provided across communication bus 0 can enable communications with either chip 0a or chip 0b.
  • the FLASH controller 10 controls eight individual chip enable lines (four for chip 0a and four for chip 0b) so that each chip and its corresponding internal hardware resources may be addressed individually. The assertion of a single chip enable line results in communications with one chip and one chip enable ("CE") resource within that chip.
  • the physical memory locations within each of the FLASH memory chips are divided into physical locations that can be addressed and/or identified through the use of one or more of: Chip Enables ("CEs", generally described above); Dice (multiple individual die); Planes; Blocks; and Pages.
  • Figures 2A and 2B generally illustrate the physical memory 200 within each of the individual FLASH memory chips 0a-9b of Figure 1.
  • the physical memory 200 within the device may be divided into four high level groupings, where each grouping has associated with it an individual Chip Enable (or "CE") line.
  • the physical memory 200 of each FLASH chip is divided into four groupings of Chip Enables (CE0, CE1, CE2 and CE3) and each Chip Enable would have a separate CE line.
  • each CE group of memory locations is further divided into Dice (multiple individual die), Pages, Blocks and Planes.
  • each Chip Enable includes two Dice (DIE0 and DIE1) which are illustrated for CE0-CE3.
  • a Page is the smallest individually addressable data unit.
  • each Page of data has a specific length which in the example is a data length corresponding to 4KB of data plus 128 additional bytes used as described in more detail below.
  • data is written into or read from the memory array 14 on a Page-by-Page basis.
  • Pages are grouped together to form "Blocks".
  • a Block is a collection of pages that are associated with one another, typically in a physical manner. The physical association is such that the Block is the smallest group of FLASH memory locations that can be erased at any given time.
  • each Block includes 64 Pages of data. This is reflected generally in Figure 2B.
  • an ERASE operation involves the placement of all of the memory locations that are subject to the erase operation in a particular logical state, corresponding to a specific physical state of the memory locations.
  • the ERASE operation is performed on a Block-by-Block basis and the performance of an ERASE operation of a given block places all of the memory locations within the Block into a logical "1" state, corresponding to a state where there is no or relatively low charge stored within the storage devices associated with each memory location.
  • the memory locations can be erased only on a Block-by-Block basis in the embodiment shown.
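  • The Page-by-Page write and Block-by-Block erase granularity described above can be modelled with a small Python sketch. The `Block` class and its method names are hypothetical. The rule that a Page cannot be programmed again until its Block is erased is an assumption based on common NAND behaviour and is not stated in the passage above.

```python
PAGES_PER_BLOCK = 64
PAGE_BYTES = 4096 + 128  # stored data plus the additional bytes per Page


class Block:
    """Toy model of one Block: written a Page at a time, erased as a whole."""

    def __init__(self):
        self.erase()

    def erase(self):
        # ERASE places every memory location in the Block in the logical "1" state.
        self.pages = [b"\xff" * PAGE_BYTES for _ in range(PAGES_PER_BLOCK)]
        self.programmed = [False] * PAGES_PER_BLOCK

    def program(self, index, data):
        if len(data) != PAGE_BYTES:
            raise ValueError("data must fill exactly one Page")
        if self.programmed[index]:
            # Assumption: re-programming requires erasing the whole Block first.
            raise RuntimeError("Page already programmed; erase the Block first")
        self.pages[index] = bytes(data)
        self.programmed[index] = True

    def read(self, index):
        return self.pages[index]
```

  • Under this model, reclaiming one stale Page means erasing all 64 Pages of its Block, which is why the controller benefits from tracking which Blocks are erased and available.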
  • each Plane represents a collection of Blocks that, because of the physical layout of the FLASH memory chips, are physically associated with one another and that utilize common circuitry for the performance of various operations.
  • each Die includes two Planes and each Plane comprises 2048 Blocks of data.
  • the Blocks within the Planes are illustrated for CE3.
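  • The geometry figures given above can be checked with a short calculation. The constants come from the description (four Chip Enables per chip, two Dice per Chip Enable, two Planes per Die, 2048 Blocks per Plane, 64 Pages per Block, 4096 data bytes per Page); the function names and the linear page-numbering order are illustrative assumptions.

```python
CES_PER_CHIP = 4
DICE_PER_CE = 2
PLANES_PER_DIE = 2
BLOCKS_PER_PLANE = 2048
PAGES_PER_BLOCK = 64
DATA_BYTES_PER_PAGE = 4096  # excludes the 128 additional bytes per Page


def pages_per_chip():
    return (CES_PER_CHIP * DICE_PER_CE * PLANES_PER_DIE
            * BLOCKS_PER_PLANE * PAGES_PER_BLOCK)


def chip_data_bits():
    return pages_per_chip() * DATA_BYTES_PER_PAGE * 8


def page_index(ce, die, plane, block, page):
    """Flatten a (CE, Die, Plane, Block, Page) tuple into one index (hypothetical ordering)."""
    index = ce
    index = index * DICE_PER_CE + die
    index = index * PLANES_PER_DIE + plane
    index = index * BLOCKS_PER_PLANE + block
    index = index * PAGES_PER_BLOCK + page
    return index
```

  • Multiplying the figures out gives 2,097,152 Pages per chip and 64 × 2^30 data bits, which is consistent with the 64 Gigabit chip capacity mentioned for the exemplary embodiment.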
  • each of the Pages of Data within an exemplary Plane (e.g., PLANE0 of DIE0 of CE3) will be associated with some specific input/output circuitry that includes an Input/Output (I/O) Buffer.
  • the I/O Buffer is a buffer that is sized to store at least one Page of data.
  • the Page of data is first retrieved from the specific Page to be accessed and placed in the I/O Buffer for the Plane in which the accessed Page resides. If the data was requested in a manner where it would be accessible outside the FLASH chip 200, the data is delivered from the I/O Buffer in the associated Plane to the System Controller 10.
  • the memory system 100 of Figure 1 does not generally allow devices external to the system to directly address and access the physical memory locations within the FLASH memory storage array. Instead, the memory system 100 is generally configured to present a single contiguous logical address space to the external devices that may request READ or WRITE access to data stored in the memory array 14.
  • the use of this logical address space allows the system 100 to present a logical address space external to the system 100, such that a host device can write data to or read data from logical addresses within the address space - thus allowing easy access and use of the memory system 100 - but also allows the FLASH controller 10 and CPU 15 to control where the data that is associated with the various logical addresses is actually stored in the physical memory locations that make up memory array 14 such that the performance of the system is optimized.
  • the system 100 isolates the logical address space made available to host devices from the physical memory within the array 14, it is not necessary that the size of the physical memory array 14 be equal to the size of the logical address space presented external to the system. In some embodiments it is beneficial to present a logical address space that is less than the total available address space. Such an approach ensures that there is available raw physical memory for system operation, even if data is written to each presented logical address space. For example, in the embodiment of Figure 1 , where the FLASH memory array 14 is formed using 64 Gb FLASH memory chips providing a raw physical memory space of 640Gb of storage, the system could present a logical address space corresponding to approximately 448Gb of data storage.
  • A Page Stripe represents a grouping of associated information, stored in a particular manner within the memory array 14.
  • Page Stripes Information Content
  • each Page Stripe includes a number of Pages of stored data (typically provided by a host device) and one Page of data used to protect the stored data. While the actual size of a Page Stripe may vary, for purposes of the following discussion an exemplary Page Stripe consisting of nine pages of stored data and one page of data protection information is described.
  • Figure 3A illustrates an exemplary Page Stripe 300 in accordance with the teachings of the present disclosure.
  • the exemplary Page Stripe consists of nine pages of data, each referred to herein as a "Data Page" (DPAGE0, DPAGE1, DPAGE2 . . . DPAGE8 in the example) and one page of data protection information, referred to herein as a "Data Protection Page" (PPAGE9 in the example).
  • Figure 4 generally illustrates the format used for each Data Page within the Page Stripe 300.
  • an exemplary Data Page 410 is illustrated.
  • the illustrated Data Page 410 includes 4096 bytes of stored data and 128 bytes of additional information that, in the illustrated example, includes a number of bits that provide the Logical Block Address (LBA) corresponding to the specific Data Page at issue; a number of bits that reflect a cyclic redundancy check (CRC) of the combination of the stored data and the stored LBA; and a number of Error Correction Code (ECC) bits calculated, in the illustrated example, using the combination of the stored data bytes, the LBA bits and the CRC bits.
  • bits of data reflecting the status of the Block in which the illustrated Page is found may also be stored within the Data Page.
  • the LBA information is in the form of four bytes of data, although the length of the LBA address is not critical and can vary.
  • the CRC data can take many forms and be of variable length and various techniques may be used to determine the CRC data associated with the LBA address stored in the Data Page.
  • the CRC data comprises a 64-bit value formed by a hashing technique that performs a hash operation on the 4096 data bytes plus the four LBA data bytes to produce a 64-bit CRC hash value.
  • the ECC data associated with the stored data and LBA information is calculated using a beneficial technique in which the ECC data stored in the Data Page comprises thirty-three sixteen-bit ECC segments, with each of thirty-two of the ECC segments being associated with 128 unique bytes of the 4KByte data area, and a thirty-third ECC segment being associated with the LBA and CRC fields.
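  • The field sizes given above for the illustrated Data Page can be checked with a little arithmetic. The following is a minimal sketch only; the constant names are hypothetical, and the use of the bytes left over after the LBA, CRC and ECC fields is not specified here beyond the Block status bits mentioned above.

```python
# Field sizes taken from the illustrated Data Page; all names are hypothetical.
DATA_BYTES = 4096              # stored data
EXTRA_BYTES = 128              # additional information per Data Page
LBA_BYTES = 4                  # four-byte LBA in the illustrated example
CRC_BYTES = 8                  # 64-bit CRC value
ECC_SEGMENTS = 33              # thirty-three ECC segments
ECC_SEGMENT_BYTES = 2          # sixteen bits per segment
ECC_DATA_SEGMENTS = 32         # segments that cover the data area
BYTES_PER_DATA_SEGMENT = 128   # data bytes covered by each such segment

ecc_bytes = ECC_SEGMENTS * ECC_SEGMENT_BYTES
used_extra = LBA_BYTES + CRC_BYTES + ecc_bytes
remaining_extra = EXTRA_BYTES - used_extra            # bytes not taken by LBA, CRC or ECC
covered = ECC_DATA_SEGMENTS * BYTES_PER_DATA_SEGMENT  # data bytes covered by ECC
page_bytes = DATA_BYTES + EXTRA_BYTES                 # total bytes per Data Page
```

  • Under these figures, the thirty-two data segments together cover exactly the 4096 data bytes, and the LBA, CRC and ECC fields fit within the 128 bytes of additional information.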
  • Figure 5 generally illustrates the form of the information stored in the Data Protection Page of the exemplary Page Stripe 300.
  • an exemplary Data Protection Page 500 is illustrated.
  • the data and LBA fields of the Data Protection Page 500 simply contain the bit-by-bit Exclusive Or (XOR) of the corresponding fields in one or more of the associated Data Pages (PAGE0, PAGE1, PAGE2 . . . PAGE8).
  • the ECC and CRC fields for the Data Protection Page 500 are recalculated for the Data Protection Page 500 in a manner identical to that used in the corresponding Data Pages.
  • the XOR calculation used to produce the Data Protection Page can be accomplished using the apparatus of Figure 6 and/or a software approach.
  • XOR circuitry 600 includes an input memory buffer 60, an addressable XOR memory buffer 61 , a multi-bit XOR circuit/buffer 63 and a multiplexer (MUX) 64.
  • ECC and CRC calculation logic 65 is also illustrated, as is the physical FLASH memory array 66.
  • each of the input buffer 60, XOR buffer 61 , XOR circuit 63 and MUX 64 operate on a Page of information.
  • the circuitry 600 of Figure 6 operates as follows: All data destined for the FLASH memory 66 passes first through input memory buffer 60. If this data is the first Page of a new Page Stripe, the data is copied directly into the addressable XOR memory buffer 61 as it flows into the downstream ECC and CRC calculation logic 65. For the second and subsequent Pages of a Page Stripe, previous data in the addressable XOR memory buffer 61 is unloaded and XORed with new data as the new data is unloaded from the input memory buffer 60. The result is then written back into the addressable XOR memory buffer 61, yielding the XOR of all Data Pages up to and including the current one.
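  • The running XOR described above can also be pictured in software. The following is a minimal sketch, assuming Pages are equal-length byte strings; the class and method names are hypothetical and are not part of the disclosed circuitry.

```python
class XorAccumulator:
    """Software stand-in for the addressable XOR memory buffer (hypothetical names)."""

    def __init__(self, page_size: int):
        self.page_size = page_size
        self.buffer = None      # None means no Page Stripe is in progress
        self.pages_seen = 0

    def add_page(self, page: bytes) -> None:
        if len(page) != self.page_size:
            raise ValueError("page has the wrong size")
        if self.buffer is None:
            # First Page of a new Page Stripe: copy it directly into the buffer.
            self.buffer = bytearray(page)
        else:
            # Later Pages: XOR the new data into the previous buffer contents.
            for i, value in enumerate(page):
                self.buffer[i] ^= value
        self.pages_seen += 1

    def finish(self) -> bytes:
        """Return the XOR of all Pages added so far and reset for the next stripe."""
        if self.buffer is None:
            raise ValueError("no Page Stripe in progress")
        result = bytes(self.buffer)
        self.buffer = None
        self.pages_seen = 0
        return result
```

  • Because the buffer always holds the XOR of every Page added so far, `finish()` can be called after any number of Pages, not only after a full stripe.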
  • the XOR operation may alternately be performed through the use of software or firmware.
  • the data and LBA information from the failing Page may be reconstructed from the other Pages (including the XOR Data Protection Page) within the same Page Stripe using the information in the Data Protection Page for the Page Stripe.
  • the XOR Data Protection Page for each Page Stripe employs the same local protection mechanisms (ECC and CRC) as every other Data Page within the Page Stripe.
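  • The reconstruction of a failing Page from the other Pages of its Page Stripe follows directly from the properties of XOR. The following is a minimal sketch, assuming each Page is reduced to an equal-length byte string and that exactly one Page has failed; the function names are hypothetical.

```python
from functools import reduce


def xor_pages(pages):
    """Bit-by-bit XOR of a list of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*pages))


def reconstruct(stripe, failed_index):
    """Rebuild the Page at failed_index from every other Page in the stripe,
    the Data Protection Page included."""
    survivors = [page for i, page in enumerate(stripe) if i != failed_index]
    return xor_pages(survivors)


# Nine Data Pages followed by one Data Protection Page, as in Page Stripe 300.
data_pages = [bytes([n] * 8) for n in range(1, 10)]
stripe = data_pages + [xor_pages(data_pages)]
recovered = reconstruct(stripe, 4)   # pretend the fifth Data Page failed
```

  • The same call rebuilds the Data Protection Page itself if it is the Page that failed, since it is simply the XOR of the remaining Pages.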
  • Page Stripe 300 of Figure 3A is but one example of a Page Stripe in accordance with the teachings of this disclosure. Page Stripes of different sizes and constructions can also be used.
  • One such alternate Page Stripe is reflected in the embodiment of Figure 3B.
  • Figure 3B illustrates an alternate Page Stripe 340 that includes only nine total Pages of data with eight of the Pages (DPAGE0-DPAGE7) being Data Pages and one of the Pages (PPAGE8) being a Data Protection Page.
  • the individual Data Pages (DPAGE0-DPAGE7) are constructed in accordance with the Data Page format of Figure 4 and the Data Protection Page is of the form reflected in Figure 5. Because the Page Stripe 340 includes only eight Data Pages, however, the Data Protection Page (PPAGE8) will include the XOR of only eight Data Pages, as opposed to the nine Data Pages that would be used for the Page Stripe 300 of Figure 3A.
  • Page Stripe 350 includes only eight total pages, with seven of the Pages (DPAGE0-DPAGE6) being Data Pages and one of the Pages (PPAGE7) being a Data Protection Page.
  • Figure 3D illustrates a Page Stripe 360 that is formed from a total of ten Pages of information, where the Data Protection Page is located at the PPAGE4 location.
  • Figure 3E illustrates a Page Stripe 370 with ten Pages of information including nine Data Pages and a Data Protection Page at the PPAGE7 location.
  • Figure 3F illustrates yet another example, depicting a Page Stripe 380 having eight Pages, including seven Data Pages and one Data Protection Page at the PPAGE0 location.
  • While the memory locations in which the Pages of data within a Page Stripe can be stored may vary within memory array 14, in one embodiment, the Pages that make up a given Page Stripe are stored in physical memory locations selected in such a manner that the overall operation of the memory system 100 is optimized. In this embodiment, the physical memory locations in which the data in each Page Stripe is stored are such that the physical Lane associated with each Page of data within the Page Stripe is different from the Lanes associated with the other Pages that make up the Page Stripe.
  • this embodiment allows for efficient writing and reading of a Page Stripe to the memory array since it allows all of the Pages of data that make up the Page Stripe to be written to the memory array 14 simultaneously or near-simultaneously by having the FLASH controller 10 issue commands to the various Lanes at, or close to, the same time.
  • Figure 7A illustrates an exemplary Page Stripe 700 consisting of nine Data Pages 70a, 70b, 70c through 70i and one Data Protection Page 70j.
  • Figure 7B illustrates the manner in which this Page Stripe 700 can be stored in the memory array 14 of Figure 1.
  • the first Data Page 70a is stored in a physical memory location within LANE0; the second Data Page 70b is stored in a physical memory location within LANE1; the third Data Page 70c is stored in a physical memory location within LANE2, and so on until the ninth Data Page 70i is stored in a physical memory location within LANE8.
  • the Data Protection Page 70j is stored in a physical location within LANE9.
  • The arrangement of Figures 7A and 7B is but one example of how a Page Stripe can be stored within the physical memory array.
  • Figures 8A and 8B illustrate an alternate arrangement.
  • Figure 8A illustrates an exemplary Page Stripe 800 that includes eight Data Pages 80a-80h and a single Data Protection Page 80i.
  • Figure 8B illustrates an example of how the Pages making up Page Stripe 800 can be stored in the memory array 14.
  • the first Data Page 80a is stored in a physical location associated with LANE0, the second Data Page 80b in a physical location associated with LANE1, and the third Data Page 80c in a physical location within LANE2.
  • the fourth through eighth Data Pages (80d-80h) are then stored in physical locations within LANE4-LANE8, respectively, and the Data Protection Page 80i is stored within a location in LANE9.
  • the Pages that make up the exemplary Page Stripes are stored sequentially across the Lanes, such that the Lane designations for the memory locations associated with the Pages within the Page Stripe are sequential as one considers the Page Stripe from the first Data Page to the second Data Page, continuing to the Data Protection Page. While this approach is not critical to the disclosed embodiments, it is beneficial in that it can simplify the implementation of the disclosed subject matter.
  • Page Stripes are stored such that the Pages associated with the Page Stripe are written sequentially across the Lanes, but with the first Data Page of the Page Stripe written into a physical location associated with a Lane other than LANE0.
  • Figures 9A-9D illustrate examples of how an exemplary Page Stripe 900 containing nine Data Pages 90a-90i and a single Data Protection Page 90j can be written sequentially across Lanes within memory array 14 with the first Data Page being stored in a location associated with a Lane other than LANE0.
  • Page Stripe 900 is stored sequentially with the first Data Page stored at an address associated with LANE3 and the Page Stripe sequentially "wrapping around" such that the Data Protection Page 90j is stored in an address associated with LANE2.
  • Figure 9C illustrates storage with the first Data Page 90a in an address associated with LANE4
  • Figure 9D illustrates storage with the first Data Page 90a in an address associated with LANE5.
  • Figures 10A-10D illustrate still further examples of how a Page Stripe 1000 including eight Data Pages and a single Data Protection Page can be written into memory array 14.
  • Pages within a particular Page Stripe may be written to various Lanes, in any order, so long as no two Pages of the same Page Stripe occupy the same Lane.
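  • The sequential, wrap-around placement and the one-Page-per-Lane rule described above can be sketched as follows. This is an illustration only: the function names are hypothetical, and the default of ten Lanes is an assumption drawn from the LANE0-LANE9 examples.

```python
def assign_lanes(num_pages, start_lane=0, num_lanes=10):
    """Lane for each Page of a Page Stripe, written sequentially with wrap-around."""
    if num_pages > num_lanes:
        raise ValueError("a Page Stripe cannot have more Pages than Lanes")
    return [(start_lane + i) % num_lanes for i in range(num_pages)]


def is_valid_placement(lanes):
    """True when no two Pages of the same Page Stripe occupy the same Lane."""
    return len(set(lanes)) == len(lanes)


# Ten-Page stripe whose first Data Page starts at LANE3: the Data Protection
# Page (the last entry) wraps around and lands in LANE2.
lanes = assign_lanes(10, start_lane=3)
```

  • A non-sequential placement, such as eight Data Pages in LANE0-LANE2 and LANE4-LANE8 with the Data Protection Page in LANE9, also passes `is_valid_placement`, since the only requirement checked is that no Lane is used twice.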
  • Memory System 100 Exemplary Operations
  • the exemplary system of Figure 1 may perform WRITE operations through a number of steps including:
  • the step of receiving, from a host device, data to be stored and an LBA where the host device would like for the data to be stored is relatively straightforward.
  • the data and the LBA supplied by the host are typically provided to the System Controller 10 over the communication bus 12.
  • the step of determining whether the LBA for the received data was previously associated with one or more different physical memory Pages and, if so, changing the status of the previous Page of memory to an indication that the data is no longer valid involves the FLASH controller 10 comparing the received LBA to the LBA entries in the Logical-to-Physical conversion tables. If the comparison indicates that the LBA provided by the host device for the current WRITE operation was previously associated with another physical memory location, then the system will know that the previously stored data is no longer valid. Accordingly, the system will change a status indicator for the physical Pages of data associated with the previously stored data to indicate that they are DIRTY, or no longer valid.
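  • The lookup-and-invalidate step can be pictured with a small table. The following is a minimal sketch, assuming a dictionary-backed Logical-to-Physical table and a physical location expressed as a hypothetical (Lane, Block, Page) tuple. For brevity it folds the later table update into the same call, whereas the description treats that update as a separate step after the data is written.

```python
VALID, DIRTY = "VALID", "DIRTY"


class LogicalToPhysicalTable:
    """Toy Logical-to-Physical conversion table (all names are hypothetical)."""

    def __init__(self):
        self.lba_to_page = {}    # LBA -> physical Page location
        self.page_status = {}    # physical Page location -> VALID or DIRTY

    def record_write(self, lba, physical_page):
        """Associate lba with physical_page; mark any earlier location DIRTY.

        Returns the previous physical location, or None if the LBA is new.
        """
        previous = self.lba_to_page.get(lba)
        if previous is not None:
            # The LBA was written before, so the old data is no longer valid.
            self.page_status[previous] = DIRTY
        self.lba_to_page[lba] = physical_page
        self.page_status[physical_page] = VALID
        return previous
```

  • A second WRITE to the same LBA therefore leaves the older physical Page marked DIRTY, so that it can be counted later when deciding which Block Stripes to reclaim.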
  • the step of identifying one or more available Pages where the received data can be stored can be implemented in a variety of ways.
  • the FLASH controller will already be in possession of information that identifies a specific group of associated Blocks in physical memory that are available to store data.
  • the FLASH controller 10 will then have an internal count indicating which Pages within the group of Blocks already have data stored therein and will use the next available group of Pages as a source for a Page within a Page Stripe for the data to be stored. This process is illustrated generally in Figure 11.
  • Figure 11 generally illustrates the selection of a Page Stripe location in instances where the FLASH controller 10 is already in possession of information identifying a group of blocks in physical memory where data may be stored. Because the group of Blocks is intended for the storage of Page Stripes, and because there is a general one-to-one correspondence between the number of Blocks in the group of Blocks and the number of Pages in the Page Stripes that are stored in the Blocks, the group of Blocks is referred to herein as a Block Stripe. In the example of Figure 11, the Block Stripe is sized to have ten Blocks such that the Page Stripes stored within the Block Stripe have nine Data Pages and one Data Protection Page.
  • the FLASH controller 10 will not be aware of a Block Stripe in which data can be stored. This condition can occur, for example, just after the FLASH controller has written a Page Stripe to the last available page locations of a previously available Block Stripe. Under these conditions, the FLASH controller needs a mechanism for identifying another available Block Stripe to store data.
  • the mechanism for identifying available Block Stripes involves having the FLASH controller 10 pull data identifying an available (or free) Block Stripe from a buffer in which locations of Free Block Stripes are stored.
  • This buffer referred to herein as the Free Block Stripe Buffer, is a buffer that contains, for each entry, information that identifies a group of Blocks into which data can be stored in a Page Stripe manner.
  • the entries in the Free Block Stripe Buffer are such that all of the Blocks corresponding to an entry have been previously erased and are, therefore, available for the immediate storage of data.
  • the Free Block Stripe Buffer may also contain specific information for each entry, or for a group of entries, indicating the format of the Page Stripes that can be stored in the buffer. For example, such entries may indicate that the Block Stripe corresponding to one particular entry of the Free Block Stripes buffer can store Page Stripes having nine Data Pages and one Data Protection Page and that the Block Stripe for a different entry can store Page Stripes having eight Data Pages and one Data Protection Page. This formatting information can be stored as part of the Free Block Stripe Buffer or could be stored in a different buffer.
  • Free Block Stripe Buffers could be maintained with each one storing Block Stripes capable of storing Page Stripes of different formats.
  • the FLASH controller 10 can intelligently decide to select the entry in the Free Block Stripe Buffer that would optimize overall performance of the memory system 100. For example, if the FLASH controller 10 was aware that the host device was attempting multiple WRITE operations to the system and each WRITE operation was associated with data sufficient to store nine Data Pages of data, or if the controller 10 was attempting to move only nine pages of data, the FLASH controller could select the Free Block Stripe Buffer entry corresponding to a Block Stripe of adequate size to store a Page Stripe with nine Data Pages (and one Data Protection Page).
  • the FLASH controller 10 could select an entry from the Free Block Stripe Buffer corresponding to a different Page Stripe format (such as a Page Stripe with eight Data Pages and one Data Protection Page). (Move operations are discussed in more detail below.) In this manner, the overall operation of the system could be optimized.
  • the FLASH controller 10 could select and have available for storage multiple Block Stripes.
  • the FLASH controller could select Block Stripes sufficient to store Page Stripes with that number of data pages.
  • the FLASH controller 10 could select a Free Block Stripe from the Free Block Stripe Buffers that was of a size appropriate to the amount of data to be stored. This approach could improve the overall performance of the system because, in the absence of such a step, it may be necessary to add dummy data (in the form of appended logical 0s or 1s) to received data to "fill" out a Page Stripe.
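  • One way to picture the selection of a suitably sized Free Block Stripe is a set of FIFO buffers keyed by Page Stripe format. The following is a minimal sketch; the class and method names are hypothetical, and the "smallest format that fits" policy is an assumption made for illustration rather than a rule stated in this disclosure.

```python
from collections import deque


class FreeBlockStripeBuffers:
    """One FIFO of free Block Stripes per Page Stripe format (hypothetical names)."""

    def __init__(self):
        self.by_data_pages = {}   # Data Pages per Page Stripe -> deque of stripe ids

    def add(self, data_pages, block_stripe_id):
        self.by_data_pages.setdefault(data_pages, deque()).append(block_stripe_id)

    def select(self, pages_to_store):
        """Pick a free Block Stripe whose format needs the least dummy data.

        Returns (data_pages, block_stripe_id, dummy_pages), or None if no free
        Block Stripe can hold pages_to_store Data Pages.
        """
        candidates = sorted(
            n for n, queue in self.by_data_pages.items()
            if queue and n >= pages_to_store
        )
        if not candidates:
            return None
        n = candidates[0]
        return n, self.by_data_pages[n].popleft(), n - pages_to_store
```

  • With free Block Stripes available for both nine-Data-Page and eight-Data-Page formats, a request to store eight Pages is matched to the eight-Data-Page format and needs no dummy data.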
  • the FLASH controller 10 will, in some embodiments, configure the data received during the WRITE operation so that it will "fit" into the selected Page Stripe location on a Page-aligned basis. This step will involve the Flash Controller 10 breaking up the received data into data groups appropriate for storage in a Page Stripe, generating the data to be stored in each Data Page of the Page Stripe (including any LBA data, CRC and/or ECC data as discussed above) and also generating the data for the Data Protection Page for the Page Stripe (as discussed above).
  • the FLASH controller 10 may append logical 1's or 0's (or any other data) to the data to be stored so that a complete Page Stripe of information can be written to the physical Page Stripe location.
  • the configuration step could be used to identify the amount of data that was to be stored in the Page Stripe which could enable the FLASH controller 10 to select the available Page Stripe location that would minimize or eliminate the need to append data bits to the stored data to fill out the Data Pages for the Page Stripe. Since such appended data bits do not constitute actual host device stored data, the reduction of the extent of the appended bits can enhance overall system performance.
  • the configured Page Stripe is written to physical memory.
  • This step involves the FLASH controller 10 issuing the appropriate commands across the communication bus 16 to indicate to the memory storage devices that write operations will occur, to indicate the specific Page locations where the write operations will occur and to provide the data for those operations.
  • the write operation may occur simultaneously or near-simultaneously for all of the Pages that make up the Page Stripe being stored.
  • the FLASH controller 10 will update the Logical-to-Physical conversion table to associate the LBA provided by the host device with the actual physical locations where the data provided by the host device for storage at that LBA was stored.
  • the FLASH controller 10 will write data to the memory array 14 on a Page-by-Page basis as data is received from a host device.
  • the FLASH controller will write the data to the next Page in the current Page Stripe.
  • a READ operation could be requested of a Page before the Page Stripe containing that Page is "filled-out" and before the Data Protection Page for the Page Stripe containing the Page is stored to physical memory.
  • the FLASH controller can retrieve the data for the requested Page and, assuming that the ECC and CRC data confirms that the Page has valid data and/or identifies an error that can be corrected through use of the ECC data within the Page, provide the requested Page of data to the host device. In such a circumstance, there is no need for early completion of the Page Stripe containing the page and the memory system 100 can merely await the receipt of adequate information to complete the Page Stripe.
  • the FLASH controller 10 could: (i) take the accumulated XOR data for the "incomplete" Page Stripe; (ii) modify the format for the Page Stripe at issue so that the modified format includes only the received data as of that time (e.g., the modified Page Stripe format would have seven Data Pages and one Data Protection Page); and (iii) write the then-accumulated XOR data to the Data Protection Page for the reformatted Page Stripe.
  • the system could then use the then-completed, modified Page Stripe to recreate the data for the Page that was corrupted.
  • the next WRITE operation received by the system would then be to a different Page Stripe.
  • This approach would, therefore, allow the system to modify and "complete" a Page Stripe and use the Data Protection Page information for that Page Stripe to regenerate data from a lost or corrupted page without having to either: (a) wait until a Page Stripe of nine Data Pages and one Data Protection Page is completed or (b) complete a ten-Page Page Stripe through the writing of dummy data (e.g., all 0's, 1's, or other dummy data).
  • one step of the WRITE operation can involve the FLASH controller 10 pulling Free Block Stripe information from one or more Free Block Stripe Buffers.
  • the following discusses the manner in which the Free Block Stripe Buffer (or Buffers) can be populated.
  • the Free Block Stripe Buffer(s) is/are populated through the use of apparatus and methods that:
  • a particular Page within a FLASH memory device must be completely erased before any data can be written to that Page.
  • the ERASE operation typically involves the setting of all of the bits in a particular Block of data to a logical 1 state or a logical 0 state. After a Block of FLASH memory has been erased, data can be written into the Pages within that Block.
  • the system maintains one or more tables that track the "DIRTY" status of various pages within the system.
  • one or more tables are maintained that track, for each Block Stripe, the number of DIRTY pages within the Block Stripe.
  • a Block Stripe State Table can be maintained, where each entry in the table corresponds to a given Block Stripe. Whenever the table indicates that a Block Stripe is sufficiently dirty, the remaining valid data in the Block Stripe could be written into alternate physical memory locations through a move operation and the LPT table updated to reflect the move.
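  • The per-Block-Stripe DIRTY count can be sketched as a simple counter table. This is an illustration only: the names are hypothetical, and because no specific measure of "sufficiently dirty" is given here, the sketch assumes a fixed threshold expressed as a number of DIRTY Pages.

```python
class BlockStripeStateTable:
    """Tracks DIRTY Page counts per Block Stripe (hypothetical names)."""

    def __init__(self, pages_per_block_stripe, dirty_threshold):
        self.pages_per_block_stripe = pages_per_block_stripe
        self.dirty_threshold = dirty_threshold   # assumed: a count of DIRTY Pages
        self.dirty_counts = {}                   # Block Stripe id -> DIRTY Page count

    def mark_dirty(self, block_stripe_id, count=1):
        """Record newly DIRTY Pages; return True once the stripe should be reclaimed."""
        total = self.dirty_counts.get(block_stripe_id, 0) + count
        if total > self.pages_per_block_stripe:
            raise ValueError("more DIRTY Pages than the Block Stripe holds")
        self.dirty_counts[block_stripe_id] = total
        return self.is_sufficiently_dirty(block_stripe_id)

    def is_sufficiently_dirty(self, block_stripe_id):
        return self.dirty_counts.get(block_stripe_id, 0) >= self.dirty_threshold
```

  • When `mark_dirty` returns True, the remaining valid Pages would be moved elsewhere and the table mapping logical to physical addresses updated, as described above.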
  • a previously erased Block Stripe will be directly placed in the Free Block Stripe Buffer.
  • the Block Stripe that was erased cannot be used.
  • new Block Stripes can be assembled from the Blocks of such Block Stripes using one or more Ready-to-Erase Buffers that contain information about Blocks within such Block Stripes.
  • the memory system 100 maintains one or more of a number of related Ready-to-Erase (or "RTE") buffers in which information identifying one or more Blocks of physical memory that are ready to be erased is maintained, and the system follows a process of using the data in the RTE buffers to select Blocks for efficient ERASE operations.
  • Figure 13 illustrates one exemplary set of RTE buffers 1300 that may be utilized with the memory system 100 of Figure 1.
  • the illustrated set of buffers is for a given Chip Enable.
  • the RTE buffers within the set 1300 can be maintained as individual buffers, a large arrayed buffer, or a collection of arrayed buffers. The arrangement is not critical as long as the Blocks within the RTE buffer set 1300 can be associated with one another on a per Lane and per Plane basis.
  • the buffers within set 1300 may be maintained by CPU 15 and stored within a memory location utilized by CPU 15.
  • the buffers within the set 1300 may be first-in first-out (or FIFO) buffers.
  • the RTE buffers are maintained on a per Lane and per Plane basis such that the set 1300 of RTE buffers identifies, at any given time, Blocks of memory that are ready to be erased and, for each such Block, the specific Lane and Plane associated with that Block. Because of this organization, the memory system 100 can use the RTE buffers to efficiently perform ERASE operations to optimize the overall performance of the system 100.
  • the CPU 15 within the memory system 100 monitors the information in the RTE buffer set 1300 to identify groups of Blocks within the RTE buffer that are associated with memory locations that can be used to efficiently store a Page Stripe of data.
  • the CPU 15 will execute instructions to: (1) cause an ERASE operation to be performed on the Blocks within the identified group, and (2) cause one or more indications to be provided that: (a) associate the Blocks in the identified group with one another so that memory locations within the Blocks can be used to store Page Stripes of data, and (b) indicate that the Blocks that make up the identified group are free and available to store data.
  • One of the benefits of having all of the Pages of a Page Stripe within the same Plane is that it allows for the use of faster and potentially more efficient operations to move data within the physical memory array.
  • the act of moving data from one physical Page location to another Page location can be accomplished in a variety of ways.
  • One approach for such a movement of data would be to read the data from the original Page into a buffer external to the FLASH chip where the data originally resided and then WRITE the data into a Page within the same or a different FLASH chip.
  • the disclosed system enhances the ability of the system to ensure that most or all of the movements of the data within a Page Stripe (e.g., a move required by a subsequent WRITE to a Page location within a Page Stripe containing data) are intra-Plane moves that can utilize the faster and more efficient approach(s) that can be used to implement intra-Plane data transfers. This is because it would be difficult for the system to identify destination locations that would allow for each Page of the Page Stripe to be moved via an intra-Plane operation if the Pages within the Page Stripe were from different Planes.
  • one approach for identifying a suitable group of Blocks within the RTE buffer set 1300 to obtain the advantages described above would be to monitor the Blocks in the buffer set 1300 to determine when groups of Blocks can be identified where: all of the Blocks within the candidate group are associated with physical addresses in different Lanes and where all of the Blocks within the candidate group are associated with the corresponding Planes.
  • the system CPU 15 would execute instructions that associate all of the Blocks within the candidate group with one another and that cause an ERASE operation to be performed on all of the Blocks within the candidate group.
  • the system 100 may first look for groups of Blocks within the RTE buffer set 1300 such that: (i) each Block is associated with a different Lane; (ii) each Block is associated with the same corresponding Plane; and (iii) the number of Blocks is equal to the number of Lanes.
  • the population of the Blocks in the RTE buffer set 1300 may be such that it is difficult or impossible for the system to readily identify a candidate group of Blocks meeting the preferred criteria described above.
  • This condition could exist, for example, when one or more of the FLASH memory chips that make up the memory array 14 fail. While failures are not common and not expected, they can occur. Thus, it is possible that, for a given memory array 14, one or both of the FLASH memory chips associated with a given Lane could fail.
  • the failure of the FLASH chips would ensure that no Blocks associated with that Lane are placed in the RTE buffer.
  • the absence of Blocks associated with the Lane associated with the failed FLASH chips would ensure that the preferred conditions (where there is a Block associated with each Lane) would not occur.
  • partial chip failures could create conditions under which it would be difficult to identify candidate groups within the RTE Buffer set 1300 that meet the preferred conditions.
  • While complete FLASH chip failure is relatively rare, it is not uncommon for given Blocks within a chip, given Planes within a chip, or given CEs within a chip either to fail during operation or to be inoperative upon initial use of the chip.
  • these failures can significantly reduce the number of Blocks that are placed within the RTE buffer set 1300 for a given Lane and/or given Plane.
  • the failure of a chip or the failure of a portion of a chip can include both the actual failure of a chip or the occurrence of a situation indicating an anticipated or predicted failure of a chip or a portion of a chip.
  • the manner in which data is written to and/or read from the memory array can create conditions under which it is difficult to identify groups of Blocks in the RTE buffer set 1300 meeting the preferred conditions.
  • the memory system 100 may operate to select groups of Blocks that, while not meeting the preferred conditions, meet a first reduced set of conditions that are appropriate for the operation of the system. For example, if the population of Blocks within the RTE buffer set 1300 is such that the system cannot, after a given amount of time or operational cycles, identify a group of Blocks meeting the preferred conditions, the system may determine whether a group of Blocks meeting another set of conditions can be identified.
  • the system may determine whether a group of N Blocks can be identified from different Lanes, where N is one less than the total number of available Lanes. If such a group of Blocks can be identified that meets this first reduced set of conditions, the system can then associate that group of Blocks together as a location for storing Page Stripes, where the number of Pages in such Page Stripes is one less than the total number of Lanes in the system, and ensure that ERASE operations are performed on the Blocks within that group.
  • the system could attempt to identify blocks meeting a second set of reduced conditions such as, for example, conditions where there are N' Blocks that can be identified, where N' is two less than the number of available Lanes.
  • the operations using this second set of reduced conditions could follow those described above in connection with the first set of reduced conditions.
  • the system could look for groups meeting other sets of reduced conditions, if an inadequate number of groups of Blocks meeting the already presented sets of reduced conditions were identified.
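  • The search through the RTE buffer set, first under the preferred conditions and then under progressively reduced conditions, can be sketched as follows. This is a simplified illustration: the function name and the dictionary layout are hypothetical, the sketch keeps the same-Plane requirement in the reduced cases as an assumption, and it falls back immediately rather than after a given amount of time or number of operational cycles.

```python
def find_block_group(rte, num_lanes, max_missing=2):
    """Select Blocks to erase and group into a Block Stripe.

    rte maps (lane, plane) -> list of Block ids that are ready to erase, oldest
    first. The preferred group has one Block from every Lane, all in the same
    Plane; if none exists, groups with one fewer Lane, then two fewer, are tried.
    Returns (plane, {lane: block_id}) or None. Selected Blocks are removed.
    """
    planes = sorted({plane for (_, plane) in rte})
    for missing in range(max_missing + 1):
        wanted = num_lanes - missing
        for plane in planes:
            # Lanes that currently have at least one ready Block in this Plane.
            lanes = [lane for lane in range(num_lanes) if rte.get((lane, plane))]
            if len(lanes) >= wanted:
                chosen = lanes[:wanted]
                return plane, {lane: rte[(lane, plane)].pop(0) for lane in chosen}
    return None
```

  • For example, if no Blocks from one Lane ever reach the RTE buffers because the chips in that Lane have failed, the preferred search finds nothing and the first reduced search returns a nine-Block group drawn from the remaining Lanes.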
  • the operation of the system in terms of accepting and using groups of Blocks in the RTE buffer set 1300 meeting conditions other than the preferred conditions can be static or can vary depending on the operational state of the memory system 100.
  • the system 100 could operate under conditions where it waits to identify groups of Blocks meeting the preferred conditions before taking action.
  • the system could more readily process groups of Blocks meeting reduced criteria.
  • system 100 would be willing to accept groups meeting reduced criteria until a desired inventory of available Page Stripe locations were assembled and thereafter, as long as the inventory was at or near the desired inventory, utilize the preferred criteria.
  • the desired inventory count could be static or variable depending on the write activity of the system 100.
  • a READ operation is performed when the FLASH controller 10 receives a READ request from an external host device.
  • the READ request will comprise a request from a host device to READ a Page of data associated with a particular LBA provided by the host device.
  • the Flash Controller will, in one embodiment:
  • (iii) validate and, if necessary, correct or reconstruct the requested data using the ECC data and/or the information in the Data Protection Page for the Page Stripe corresponding to the requested LBA;
  • this reading of data is done on a Page specific basis, where the Page of data that is retrieved corresponds to the Page of data associated with the LBA provided by the host device.
  • If the Page of data retrieved as a result of the READ operation is determined to be corrupted to a point that it cannot be corrected through intra-Page ECC and/or CRC (or if the Page is determined to have failed or be unreadable for any reason), then all of the Data Pages and the Data Protection Page for the Page Stripe in which that Page resides may be read and used to reconstruct the data within the Page associated with the LBA provided by the host device.
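The stripe-level reconstruction described above can be illustrated with a short sketch. It assumes that the Data Protection Page holds the byte-wise XOR of the Data Pages in the Page Stripe; that choice of code is an assumption made for illustration, and the function names `xor_pages` and `reconstruct_page` are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def xor_pages(pages: List[bytes]) -> bytes:
    """Byte-wise XOR of equally sized pages."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*pages))


def reconstruct_page(stripe: List[Optional[bytes]]) -> bytes:
    """Rebuild the single unreadable page (marked None) in a Page Stripe.

    `stripe` holds every Data Page plus the Data Protection Page.
    With XOR parity, the missing page equals the XOR of all survivors.
    """
    missing = [i for i, page in enumerate(stripe) if page is None]
    if len(missing) != 1:
        raise ValueError("XOR parity can rebuild exactly one missing page")
    survivors = [page for page in stripe if page is not None]
    return xor_pages(survivors)
```

For example, with Data Pages `b"\x01\x02"` and `b"\x0f\xf0"`, the protection page is `b"\x0e\xf2"`, and XOR-ing it with the first Data Page recovers the second.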
  • a hard error is a corruption of one or multiple bits of data that is caused by a physical aspect of the memory storage device.
  • Hard errors can be caused by a variety of factors including, but not limited to, the physical failure of components within a given memory chip (such as the failure of a charge pump), the physical failure of an entire memory chip or the external support structures for that chip (e.g., the breaking of a power line or an address line to a chip); the physical failure of all or part of a chip as a result of excessive temperature, magnetic field, humidity, etc.
  • Because hard errors are related to the physical structure of a memory system, hard errors are uniquely associated with a particular collection of memory chips, a particular memory chip, or specific physical regions within a chip (such as a Chip Enable region, Plane or Block).
  • data errors can be detected during a READ operation through the use of the ECC and CRC data for each Page.
  • identified data errors can be corrected through the use of ECC algorithms and/or through the use of the Data Protection information (in the event that a single Page exhibits an uncorrectable error).
  • the ECC or Data Protection information can be used to recreate the corrupted data bit or bits; the recreated data can be placed within a new Page Stripe along with other Pages from the original stripe; and the new Page Stripe can be written back to the physical memory using the corrected data.
  • the memory system 100 will maintain records of the identified data errors and the physical structure associated with those errors.
  • the memory system 100 will maintain records reflecting the number of errors associated with the various Blocks, Planes and, potentially, Chip Enables and Chips within the system.
  • If these counts show that the number of errors associated with a given Block, Plane, Chip Enable or Chip is above a predetermined threshold, they can indicate that there has been a failure of a given memory chip or of a given region within the chip (i.e., a given Chip Enable, Plane or Block within a chip).
  • the memory system 100 can designate the Chip (or intra-chip) region as bad or failed by designating the Blocks within the chip or region as bad.
  • the Blocks that are identified as bad will no longer be used by the memory system for the storage of data. This can be accomplished by, for example: (i) not placing the bad Blocks into the RTE Buffer, such that they are not used in the construction of Free Block Stripes and, therefore, would not be used in a Page Stripe for the storage of data or (ii) continuing to place the bad Blocks into the RTE buffer, but doing so under conditions where the blocks are identified as bad.
  • an indication would be provided so that the system 100 could use that information when assembling Free Block Stripes. For example, if there were ten blocks that were in the RTE buffer that meet the conditions for being grouped together as a Block Stripe but one of the Blocks was a bad block, the system could then proceed to form a Block Stripe from the identified Blocks that would have ten Blocks, but would provide an indication as to the bad Block such that the Page Stripe format for that Block Stripe would only utilize the nine good Blocks.
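The error bookkeeping and bad-Block indication described above might look like the following sketch. It is an illustration under assumptions rather than the disclosed design: Blocks are addressed by a hypothetical `(lane, block_index)` tuple, errors are counted per Block only (not per Plane, Chip Enable or Chip), and the class name `ErrorTracker` is invented for this example.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Set, Tuple

BlockId = Tuple[int, int]  # (lane, block index): hypothetical addressing


@dataclass
class ErrorTracker:
    threshold: int
    counts: Counter = field(default_factory=Counter)
    bad_blocks: Set[BlockId] = field(default_factory=set)

    def record_error(self, block: BlockId) -> None:
        """Count one data error against a Block; mark it bad above threshold."""
        self.counts[block] += 1
        if self.counts[block] > self.threshold:
            self.bad_blocks.add(block)

    def usable_blocks(self, block_stripe: List[BlockId]) -> List[BlockId]:
        """Blocks of a Block Stripe that the Page Stripe format may use."""
        return [b for b in block_stripe if b not in self.bad_blocks]
```

With a ten-Block stripe in which one Block has exceeded the threshold, `usable_blocks` returns the nine good Blocks, which corresponds to option (ii) above: the bad Block stays in the stripe but is flagged so that the Page Stripe format skips it.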
  • the memory system of Figure 14A includes a FLASH controller 10, a CPU 15, and a memory array that includes ten Lanes, with each Lane including two memory chips. Assuming that all of the blocks within all of the chips are "good" blocks, the system could store data in the memory array using Page Stripes that are formatted such that each Page Stripe, or at least the majority of Page Stripes, includes a Page stored in each of the ten Lanes (e.g., a Page Stripe having nine Data Pages and one Data Protection Page). This is generally reflected in Figure 14B which shows the standard Page Stripe format for the embodiment of Figure 14A.
  • the failure of the chips in LANE5 would be detected and the system 100 could change the format of the Page Stripes that are used so that, as the system reads, writes and moves data, the data that was previously stored in physical locations across chips in all ten Lanes using a Page Stripe format with ten pages, is now stored across chips in only nine Lanes using a Page Stripe format with nine pages as reflected in Figure 14D.
  • no data stored in the memory system 100 was lost, and the memory system 100 can self-adapt to the failure and continue to perform and operate by processing READ and WRITE requests from host devices.
  • each READ or WRITE request issued by a host device will typically result in the performance of a READ or WRITE operation on locations within the physical memory array. While such operations can fulfill the operational goals of the memory system 100, they may not be optimal because: (i) the actual access of the physical memory array takes some amount of time (thus introducing some delay into the overall system operation) and (ii) the multiple accesses to the memory array tend to degrade the overall lifespan of chips that make up the physical array since FLASH memory chips used to form the physical memory array can be subjected to only a finite number of ERASE operations and the repeated access will result in increased ERASE operations.
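The change from a ten-Page to a nine-Page stripe format after the failure of one Lane can be sketched as below. The sketch makes several assumptions that are not taken from the text: pages are represented by string labels, the Data Protection Page is placed in the Lane following the last Data Page, and the function name `layout_stripes` is hypothetical.

```python
from typing import Dict, List


def layout_stripes(data_pages: List[str],
                   good_lanes: List[int]) -> List[Dict[int, str]]:
    """Lay Data Pages out across the usable Lanes, one Page per Lane.

    Each stripe holds up to len(good_lanes) - 1 Data Pages plus one
    Data Protection Page, so a failed Lane simply narrows the stripe.
    """
    if len(good_lanes) < 2:
        raise ValueError("need at least two Lanes for data plus protection")
    data_per_stripe = len(good_lanes) - 1
    stripes = []
    for start in range(0, len(data_pages), data_per_stripe):
        chunk = data_pages[start:start + data_per_stripe]
        stripe = dict(zip(good_lanes, chunk))
        # Next unused Lane in this stripe carries the protection page.
        stripe[good_lanes[len(chunk)]] = "PROTECTION"
        stripes.append(stripe)
    return stripes
```

Calling the function with all ten Lanes yields stripes of nine Data Pages plus one Data Protection Page; calling it again with Lane 5 removed yields stripes of at most nine Pages, none of which touches the failed Lane.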
  • An alternate embodiment of the memory system 100 of Figure 1 utilizes methods and apparatus to improve the overall performance and lifespan of the system. This is accomplished by having the system monitor the incoming WRITE requests to assess the specific data that the host device seeks to write to the memory system.
  • the embodiment described herein utilizes hardware or a software process that first considers, for each WRITE request, whether the data associated with that WRITE request meets one of a number of predefined criteria. For example, the system could use hardware to determine if the data associated with the WRITE request consisted of all logical 1's or all logical 0's. If it were determined that the data associated with the WRITE request was within one of these predetermined categories, then the memory system would not write the data to the memory array, but would rather take an alternate course as described below.
  • the memory system 100 would create an entry in the Logical-to-Physical Translation table (LPT) that associated the LBA provided by the host device with a specific Data Identifier.
  • the Data Identifier would: (a) have the general format of the physical memory address identifier stored in the LPT when the LBA in the table is associated with data actually stored in memory, but (b) would not correspond to any specific physical address in the physical memory array.
  • the Data Identifier would be associated by the system with a specific data string such that, for a given LBA entry, the presence of the Data Identifier would convey the data associated with the LBA, even though such data was not actually stored in a physical location within the memory array, and even though there was no actual physical memory location in the array associated with the LBA.
  • This aspect of the present disclosure is generally identified in Figures 15A-15B.
  • the Data Identifier FFFFF is associated with a data string of all logical 0's; the Data Identifier FFFFE with all logical 1's; and the Data Identifier FFFFD with alternating logical 0's and 1's (beginning with a logical 1). This is reflected in the Table in Figure 15A.
  • Figure 15B illustrates an exemplary LPT that has multiple entries, each entry being associated with a specific LBA.
  • the addressing of the table is such that an LPT entry is associated with each LBA address presented by the memory system.
  • Figure 15B illustrates the situation that would exist if a WRITE operation is requested where the data associated with the request is all logical 0's and the WRITE request was directed to the LBA address 55.
  • the system would, before executing the WRITE request, analyze the data associated with the request, and determine that it was all logical 0's. This could be done through software analysis of the data or through the use of a hardware component, such as a comparator or large AND or OR device.
  • the system would - instead of actually storing data in the memory array - discard the data provided by the host device and store the Data Identifier associated with that data string in the LPT location that would normally store the physical address where the data associated with the corresponding LBA was located.
  • Figure 15B illustrates the situation that would exist if a subsequent WRITE operation occurred where the WRITE was directed to LBA 500 with the data being all logical 0's.
  • the system would, using the approaches described above, determine that the data was all 0's, discard the data provided by the host device, and write the Data Identifier associated with the all 0's string to the entry in the LPT associated with the LBA 500.
  • the entries for both the LBA 55 and LBA 500 would have the same Data Identifier.
  • the same process would be followed for WRITE operations associated with data strings corresponding to other predefined Data Identifiers.
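The WRITE and READ paths with predefined Data Identifiers, as walked through above for LBA 55 and LBA 500, can be sketched as follows. The identifier values FFFFF, FFFFE and FFFFD come from the text; everything else is an assumption for illustration: the 4096-byte `PAGE_SIZE`, the `Controller` class, the use of a dictionary as the FLASH array, and the encoding of "alternating, beginning with a logical 1" as the byte 0xAA (most significant bit first).

```python
from typing import Dict

PAGE_SIZE = 4096  # hypothetical page size in bytes

# Predefined data strings mapped to Data Identifiers (Figure 15A).
PREDEFINED: Dict[bytes, int] = {
    bytes([0x00]) * PAGE_SIZE: 0xFFFFF,  # all logical 0's
    bytes([0xFF]) * PAGE_SIZE: 0xFFFFE,  # all logical 1's
    bytes([0xAA]) * PAGE_SIZE: 0xFFFFD,  # alternating, starting with 1 (assumed MSB first)
}
IDENTIFIER_TO_DATA = {ident: data for data, ident in PREDEFINED.items()}


class Controller:
    def __init__(self) -> None:
        self.lpt: Dict[int, int] = {}      # LBA -> physical address or Data Identifier
        self.flash: Dict[int, bytes] = {}  # stand-in for the physical memory array
        # Assumes physical addresses never reach the identifier range.
        self.next_physical = 0

    def write(self, lba: int, data: bytes) -> bool:
        """Store data for an LBA; return True only if FLASH was written."""
        identifier = PREDEFINED.get(data)
        if identifier is not None:
            self.lpt[lba] = identifier  # discard data, keep identifier only
            return False
        address = self.next_physical
        self.next_physical += 1
        self.flash[address] = data
        self.lpt[lba] = address
        return True

    def read(self, lba: int) -> bytes:
        entry = self.lpt[lba]
        if entry in IDENTIFIER_TO_DATA:
            return IDENTIFIER_TO_DATA[entry]  # regenerate without touching FLASH
        return self.flash[entry]
```

Writing all-zero pages to LBA 55 and LBA 500 leaves the simulated FLASH array empty and gives both LPT entries the value FFFFF, matching the situation shown in Figure 15B.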
  • the use of the Data Identifiers as described above is beneficial because it does not result in the actual writing of data to the physical memory array and does not suffer the write overhead (time delay) that would occur if an actual write operation occurred.
  • the LPT table is stored in RAM memory and, in particular, DDR2 RAM memory.
  • the access times required for RAM memory access are faster than those required for FLASH memory access.
  • the use of Data Identifiers can substantially reduce the time seen by the host device for the performance of a write operation.
  • the total number of ERASE operations can be reduced and the lifespan of the memory array increased.
  • the Data Identifiers were predefined to correspond to specific anticipated data strings. Alternate embodiments are envisioned where some of the Data Identifiers are not predefined to be associated with specific data strings, but are rather constructed by the system 100 in response to the actual operation of the system.
  • the system 100 can include a process that runs in the background during relatively idle time, where the data actually stored in the memory array is considered.
  • the system would then define a Data Identifier as being associated with that specific data string and would modify the corresponding LPT entries. This process not only could speed up READ and WRITE requests as described above, it could also free up memory space within the memory array that would otherwise be used to store such repetitive data, thus providing more available physical memory and improving the overall operation of the system.
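One way to picture the background process described above is the sketch below, which scans the stored data for strings held at several LBAs, defines a new Data Identifier for each, and frees the physical locations. The data structures, the `min_copies` threshold, the descending allocation of identifier values and the function name `deduplicate` are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def deduplicate(lpt: Dict[int, int],
                flash: Dict[int, bytes],
                identifiers: Dict[int, bytes],
                next_identifier: int,
                min_copies: int = 2) -> int:
    """Replace repeated stored data strings with system-defined identifiers.

    lpt maps LBA -> physical address or Data Identifier; flash maps
    physical address -> stored data. Returns the next unused identifier.
    Identifier values are assumed never to collide with physical addresses.
    """
    # Group LBAs by the data string currently stored for them in FLASH.
    lbas_by_data: Dict[bytes, List[int]] = defaultdict(list)
    for lba, entry in lpt.items():
        if entry in flash:
            lbas_by_data[flash[entry]].append(lba)

    for data, lbas in lbas_by_data.items():
        if len(lbas) < min_copies:
            continue
        identifiers[next_identifier] = data
        for lba in lbas:
            flash.pop(lpt[lba], None)    # free the physical location
            lpt[lba] = next_identifier   # LPT entry now conveys the data
        next_identifier -= 1
    return next_identifier
```

In this sketch the freed entries in `flash` stand in for the physical memory space that becomes available once repetitive data is represented only by an identifier in the LPT.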
  • the system 100 can include a running Data String Cache memory that associates a Data Identifier with each of the most recent data strings associated with the last N number of WRITE operations (where N is a predefined number).
  • the Data Identifier will be used for that entry.
  • a count can be maintained of the number of times a hit occurs for the entries in the Data String Cache.
  • the particular entry can be deleted from the cache, the corresponding data string can actually be stored in physical memory with a physical memory location recorded for each of the corresponding LBAs in the LPT table, and another data string entry can be placed in the Data String Cache.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Techniques For Improving Reliability Of Storages (AREA)

Abstract

The present invention relates to methods and apparatuses for improving the protection of data stored in a FLASH memory system, involving a controller capable of adapting to the failure of one or more FLASH memory devices in the memory system. The controller stores data in the form of page stripes, each page stripe being composed of data pages, and each data page being stored in a different FLASH memory device. The controller also detects the failure of a FLASH memory device in which a data page of a particular page stripe is stored, and stores the reconstructed data page in a new page stripe. The number of data pages in the new page stripe is less than the number of data pages in the particular page stripe, and no page of the new page stripe is stored in a memory location of the failed FLASH memory device.
PCT/US2010/045129 2009-08-11 2010-08-11 Method and apparatus for addressing predicted or actual failures in a flash memory system WO2011019794A2 (fr)

Applications Claiming Priority (8)

Application Number Priority Date Filing Date Title
US23291309P 2009-08-11 2009-08-11
US61/232,913 2009-08-11
US12/554,892 2009-09-05
US12/554,891 US7856528B1 (en) 2009-08-11 2009-09-05 Method and apparatus for protecting data using variable size page stripes in a FLASH-based storage system
US12/554,888 2009-09-05
US12/554,892 US8176284B2 (en) 2009-08-11 2009-09-05 FLASH-based memory system with variable length page stripes including data protection information
US12/554,891 2009-09-05
US12/554,888 US8176360B2 (en) 2009-08-11 2009-09-05 Method and apparatus for addressing actual or predicted failures in a FLASH-based storage system

Publications (2)

Publication Number Publication Date
WO2011019794A2 true WO2011019794A2 (fr) 2011-02-17
WO2011019794A3 WO2011019794A3 (fr) 2011-04-28

Family

ID=43586799

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2010/045129 WO2011019794A2 (fr) 2009-08-11 2010-08-11 Method and apparatus for addressing predicted or actual failures in a flash memory system

Country Status (1)

Country Link
WO (1) WO2011019794A2 (fr)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6041423A (en) * 1996-11-08 2000-03-21 Oracle Corporation Method and apparatus for using undo/redo logging to perform asynchronous updates of parity and data pages in a redundant array data storage environment
US20070294570A1 (en) * 2006-05-04 2007-12-20 Dell Products L.P. Method and System for Bad Block Management in RAID Arrays
US20090172335A1 (en) * 2007-12-31 2009-07-02 Anand Krishnamurthi Kulkarni Flash devices with raid

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
'Proceedings of the 25th Workshop on Hot Topics in System Dependability', June 2009 article GREENAN, K. ET AL.: 'Building Flexible, Fault-Tolerant Flash-based Storage Systems.' *

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101560077B1 (ko) * 2011-05-31 2015-10-13 Micron Technology, Inc. Method and apparatus for providing data integrity
CN103620565A (zh) * 2011-05-31 2014-03-05 Micron Technology, Inc. Apparatus and methods for providing data integrity
EP2715550A2 (fr) * 2011-05-31 2014-04-09 Micron Technology, INC. Apparatuses and methods for providing data integrity
WO2012166725A2 (fr) 2011-05-31 2012-12-06 Micron Technology, Inc. Apparatuses and methods for providing data integrity
EP2715550A4 (fr) * 2011-05-31 2014-11-12 Micron Technology Inc Apparatuses and methods for providing data integrity
US9152512B2 (en) 2011-05-31 2015-10-06 Micron Technology, Inc. Apparatus and methods for providing data integrity
US9779816B2 (en) 2011-08-15 2017-10-03 Micron Technology, Inc. Apparatus and methods including source gates
US11211126B2 (en) 2011-08-15 2021-12-28 Micron Technology, Inc. Apparatus and methods including source gates
US9378839B2 (en) 2011-08-15 2016-06-28 Micron Technology, Inc. Apparatus and methods including source gates
US10783967B2 (en) 2011-08-15 2020-09-22 Micron Technology, Inc. Apparatus and methods including source gates
US8797806B2 (en) 2011-08-15 2014-08-05 Micron Technology, Inc. Apparatus and methods including source gates
US10170189B2 (en) 2011-08-15 2019-01-01 Micron Technology, Inc. Apparatus and methods including source gates
US10541029B2 (en) 2012-08-01 2020-01-21 Micron Technology, Inc. Partial block memory operations
US11626162B2 (en) 2012-08-01 2023-04-11 Micron Technology, Inc. Partial block memory operations
US9653171B2 (en) 2012-10-26 2017-05-16 Micron Technology, Inc. Partial page memory operations
US9318199B2 (en) 2012-10-26 2016-04-19 Micron Technology, Inc. Partial page memory operations
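CN110990606A (zh) * 2019-12-11 2020-04-10 TCL Mobile Communication Technology (Ningbo) Co., Ltd. Picture storage method and apparatus, storage medium, and electronic device
CN110990606B (zh) * 2019-12-11 2023-10-03 TCL Mobile Communication Technology (Ningbo) Co., Ltd. Picture storage method and apparatus, storage medium, and electronic device
WO2021216128A1 (fr) * 2020-04-24 2021-10-28 Western Digital Technologies, Inc. Data parking for SSDs with zones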
US11409459B2 (en) 2020-04-24 2022-08-09 Western Digital Technologies, Inc. Data parking for SSDs with zones
US11847337B2 (en) 2020-04-24 2023-12-19 Western Digital Technologies, Inc. Data parking for ZNS devices

Also Published As

Publication number Publication date
WO2011019794A3 (fr) 2011-04-28

Similar Documents

Publication Publication Date Title
US9983927B2 (en) Memory system with variable length page stripes including data protection information
US7941696B2 (en) Flash-based memory system with static or variable length page stripes including data protection information and auxiliary protection stripes
US10884914B2 (en) Regrouping data during relocation to facilitate write amplification reduction
KR101660150B1 (ko) Physical page, logical page, and codeword correspondence
US9547589B2 (en) Endurance translation layer (ETL) and diversion of temp files for reduced flash wear of a super-endurance solid-state drive
JP6855102B2 (ja) Recovery from multi-page failure in a non-volatile memory system
US8959280B2 (en) Super-endurance solid-state drive with endurance translation layer (ETL) and diversion of temp files for reduced flash wear
US9405621B2 (en) Green eMMC device (GeD) controller with DRAM data persistence, data-type splitting, meta-page grouping, and diversion of temp files for enhanced flash endurance
US8850114B2 (en) Storage array controller for flash-based storage devices
US8788876B2 (en) Stripe-based memory operation
EP2732373B1 (fr) Method and apparatus for flexible RAID in SSD
US8910017B2 (en) Flash memory with random partition
US9058288B2 (en) Redundant storage in non-volatile memory by storing redundancy information in volatile memory
WO2011019794A2 (fr) Method and apparatus for addressing predicted or actual failures in a flash memory system
US9430375B2 (en) Techniques for storing data in bandwidth optimized or coding rate optimized code words based on data access frequency
US11789643B2 (en) Memory system and control method
US12135895B2 (en) Hot data management in a data storage system
CN115686366A (zh) RAID-based write data cache acceleration method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 10808676

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase in:

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 10808676

Country of ref document: EP

Kind code of ref document: A2