US20220405532A1 - Non-volatile memory die with on-chip data augmentation components for use with machine learning - Google Patents
- Publication number
- US20220405532A1 (application US17/897,028)
- Authority
- US
- United States
- Prior art keywords
- data
- nvm
- images
- die
- machine learning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G06K9/6257—
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0706—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
- G06F11/0727—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a storage system, e.g. in a DASD or network based storage system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/08—Error detection or correction by redundancy in data representation, e.g. by using checking codes
- G06F11/10—Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
- G06F11/1008—Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices
- G06F11/1048—Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices using arrangements adapted for a specific error detection or correction feature
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/2148—Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the process organisation or structure, e.g. boosting cascade
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G06T3/0006—
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/02—Affine transformations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/60—Rotation of whole images or parts thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/772—Determining representative reference patterns, e.g. averaging or distorting patterns; Generating dictionaries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/94—Hardware or software architectures specially adapted for image or video understanding
- G06V10/955—Hardware or software architectures specially adapted for image or video understanding using specific electronic processors
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11C—STATIC STORES
- G11C11/00—Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor
- G11C11/54—Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using elements simulating biological cells, e.g. neuron
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11C—STATIC STORES
- G11C16/00—Erasable programmable read-only memories
- G11C16/02—Erasable programmable read-only memories electrically programmable
- G11C16/04—Erasable programmable read-only memories electrically programmable using variable threshold transistors, e.g. FAMOS
- G11C16/0483—Erasable programmable read-only memories electrically programmable using variable threshold transistors, e.g. FAMOS comprising cells having several storage transistors connected in series
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11C—STATIC STORES
- G11C16/00—Erasable programmable read-only memories
- G11C16/02—Erasable programmable read-only memories electrically programmable
- G11C16/06—Auxiliary circuits, e.g. for writing into memory
- G11C16/26—Sensing or reading circuits; Data output circuits
Definitions
- the disclosure relates, in some embodiments, to non-volatile memory (NVM) dies. More specifically, but not exclusively, the disclosure relates to methods and apparatus for implementing data augmentation within an NVM die for use with machine learning.
- Machine learning generally relates to the use of artificial intelligence to perform tasks without explicit instructions and instead relying on patterns and inference.
- Deep learning (which also may be referred to as deep structured learning or hierarchical learning) relates to machine learning methods based on learning data representations or architectures, such as deep neural networks (DNNs), rather than to task-specific procedures or algorithms.
- Deep learning is applied to such fields as speech recognition, computer vision, and self-driving vehicles. Deep learning may be accomplished by, or facilitated by, deep learning accelerators (DLAs), e.g., microprocessor devices designed to accelerate the generation of useful neural networks to implement deep learning.
- a DLA or other machine learning system may need to be trained using initial training data, such as an initial set of images that have been tagged or labeled for use in training an image recognition system.
- Data augmentation includes procedures for expanding an initial set of images in a realistic but randomized manner to increase the variety of data for use during training. For example, a small set of input images may be altered slightly (by, e.g., rotating or skewing the images) to create a larger set of images (i.e. an augmented image set) for use in training the system. That is, data augmentation allows re-using tagged or labeled data in multiple training instances in order to increase the size of the training data set.
- One embodiment of the disclosure provides an apparatus that includes: a die with non-volatile memory (NVM) elements; and a data augmentation controller formed in the die and configured to augment machine learning data stored within the NVM elements with augmented machine learning data.
- Another embodiment of the disclosure provides a method for use with a die having an NVM array, the method including: storing machine learning data within the NVM array of the die; and generating augmented machine learning data using data augmentation circuitry formed in the die.
- Yet another embodiment of the disclosure provides an apparatus with a die having an NVM array where the apparatus includes: means formed in the die for storing machine learning data within the NVM array of the die; and means formed in the die for generating at least one augmented version of the machine learning data.
- FIG. 1 illustrates a schematic block diagram configuration for an exemplary solid state device (SSD) having one or more non-volatile memory (NVM) array dies, where the dies have on-chip data augmentation components.
- FIG. 2 illustrates an example of an NVM die having on-chip under-the-array or next-to-the-array components configured for data augmentation processing.
- FIG. 3 illustrates a flow chart of an exemplary method according to aspects of the present disclosure for performing various types of on-chip data augmentation of image data.
- FIG. 4 illustrates a NAND array of an NVM die for storing image data and also schematically illustrates the various on-chip data augmentation procedures of FIG. 3.
- FIG. 5 illustrates a flow chart of an exemplary method according to aspects of the present disclosure for performing on-chip data augmentation of image data by deactivating or at least reducing the use of on-chip error correction procedures so as to obtain noisy images.
- FIG. 6 illustrates a NAND array of an NVM die for storing image data and also schematically illustrates the error correction-based data augmentation procedures of FIG. 5.
- FIG. 7 illustrates a flow chart of an exemplary method according to aspects of the present disclosure for use in systems where error correction procedures are instead performed by a separate device controller.
- FIG. 8 illustrates a flow chart of an exemplary method according to aspects of the present disclosure for performing on-chip data augmentation of images by adjusting read voltages during data reads so as to obtain noisy images.
- FIG. 9 illustrates a NAND array of an NVM die for storing image data and also schematically illustrates the read voltage-based data augmentation procedures of FIG. 8.
- FIG. 10 illustrates a flow chart of an exemplary method according to aspects of the present disclosure for performing on-chip data augmentation of images by storing and then reading image data within worn regions of the NVM die so as to obtain noisy images.
- FIG. 11 illustrates a NAND array of an NVM die for storing image data and also schematically illustrates the worn region-based data augmentation procedures of FIG. 10.
- FIG. 12 illustrates a flow chart that summarizes exemplary on-chip data augmentation operations performed by an NVM die.
- FIG. 13 illustrates a schematic block diagram configuration for an exemplary NVM apparatus such as a NAND die.
- FIG. 14 illustrates a schematic block diagram providing further details of an exemplary NVM die and its on-chip components.
- the examples herein relate to NVM arrays and to data storage devices or apparatus for controlling the NVM arrays, such as a controller of a data storage device (such as an SSD), and in particular to NAND flash memory storage devices (herein “NANDs”).
- a NAND is a type of non-volatile storage technology that does not require power to retain data. It exploits negative-AND, i.e. NAND, logic.
- an SSD having one or more NAND dies will be used below in the description of various embodiments. It is understood that at least some aspects described herein may be applicable to other forms of data storage devices as well.
- other applicable NVM types include phase-change memory (PCM) arrays, magneto-resistive random access memory (MRAM) arrays, and resistive random access memory (ReRAM) arrays.
- the various embodiments may be used in various machine learning devices which may include some combination of processing elements and memory/data storage elements, including the NVM arrays constructed/configured in accordance with the described embodiments.
- machine learning may be accomplished by, or facilitated by, deep learning accelerators (DLAs), e.g., microprocessor devices designed to accelerate the generation of deep neural networks (DNNs) to implement machine learning.
- a DLA may need to be trained using initial training data, such as an initial set of images for training an image recognition system having a DLA.
- Data augmentation is a process of modifying an initial set of images (in, e.g., a realistic but randomized manner) to increase the variety or variance of data for use during training.
- a set of input images may be altered (by, e.g., rotating or skewing the images) to create a larger set of images (an augmented image set) for use in training the system.
- Data augmentation may be defined more generally as a regularization technique for avoiding overfitting when training a machine learning system, such as a machine learning network or algorithm. Regularization is the process of adding information in order to solve an ill-posed problem or to prevent overfitting during machine learning. For example, regularization may make slight modifications to a learning model so the learning model generalizes more effectively from training data.
- the term data augmentation is defined as generating at least one modified version of data to avoid or reduce the risk of overfitting during training of a machine learning system using the data.
- the data may be, for example, a data vector, data array, data object or data representation of any number of dimensions, such as a 2-D data object containing one or more patterns. Examples of such data include images or audio segments or other types of numerical data, categorical data, time series data, or text.
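- The definition above can be illustrated with a minimal Python sketch. It assumes a 1-D numeric data vector and a small random jitter as the modification; the names `augment`, `copies`, and `jitter` are hypothetical and do not come from the disclosure. Each modified version keeps the label of the sample it was derived from, so labeled data is re-used across multiple training instances.

```python
import random

def augment(sample, label, copies=3, jitter=0.05, seed=0):
    """Return `copies` modified versions of a 1-D sample, each keeping its label."""
    rng = random.Random(seed)
    augmented = []
    for _ in range(copies):
        # Perturb every element slightly; the label is unchanged.
        modified = [x + rng.uniform(-jitter, jitter) for x in sample]
        augmented.append((modified, label))
    return augmented

# One labeled sample grows into a training set of four entries.
initial = [([0.1, 0.5, 0.9], "cat")]
training_set = list(initial)
for sample, label in initial:
    training_set.extend(augment(sample, label))
print(len(training_set))  # 4
```

- The same pattern would apply to other data types named above (images, audio segments, time series) by swapping the jitter for a modification suited to that type.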
- Deep learning or machine learning may be implemented using processing components that are integrated with the memory components where the data to be processed is stored, i.e. using “near memory” computing, so as to reduce the need to transfer large quantities of data from one component to another.
- this contrasts with architectures that pair standalone processing units, such as graphics processing units (GPUs) or central processing units (CPUs), with stand-alone memory units such as dynamic random-access memory (DRAM).
- a near memory computing architecture is disclosed herein for data augmentation.
- a DNN is an example of an artificial neural network that has multiple layers between input and output layers.
- a DNN operates to determine a mathematical computation or manipulation to convert the input into the output, which might be a linear or non-linear computation.
- the DNN may work through its layers by calculating a probability of each output.
- Each mathematical manipulation may be considered a layer.
- Networks that have many layers are referred to as having “deep” layers, hence the term DNN.
- the DNN might be configured to identify a person within an input image by processing the bits of the input image to identify the particular person, i.e. the output of the DNN is a value that identifies the particular person.
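- The layer-by-layer computation described above can be sketched as follows. This is a generic feed-forward pass, not the DNN of the disclosure: the ReLU and softmax functions, the layer sizes, and all weight values are illustrative assumptions. Each `(weights, biases)` pair is one layer, and the final softmax yields a probability for each output.

```python
import math

def dense(inputs, weights, biases):
    """One layer: a weighted sum per output unit (one weight row per unit)."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    return [max(0.0, v) for v in values]

def softmax(values):
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def forward(inputs, layers):
    """Apply each layer in turn and return one probability per output."""
    activation = inputs
    for index, (weights, biases) in enumerate(layers):
        activation = dense(activation, weights, biases)
        if index < len(layers) - 1:
            activation = relu(activation)  # non-linear step between layers
    return softmax(activation)

# Made-up weights: 3 inputs -> 2 hidden units -> 3 outputs.
layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),
    ([[1.0, -1.0], [-1.0, 1.0], [0.5, 0.5]], [0.0, 0.0, 0.0]),
]
probs = forward([1.0, 0.5, -1.0], layers)
print(probs.index(max(probs)))  # index of the most probable output
```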
- the DNN may need to be trained.
- the data augmentation procedures and apparatus described herein may be used to augment an initial set of training data, such as an initial set of labeled images (where labeled images are images containing known data, such as an image that has already been identified as corresponding to a particular type of object).
- the die may also be configured for near memory DNN processing by, for example, providing a DLA on the die as well as data augmentation circuits.
- An advantage of at least some of the exemplary methods and apparatus described herein is that only the final result of a data augmented training procedure is transferred to the controller and host, thus avoiding the transference of large amounts of training data, such as augmented sets of training images that might include thousands of augmented images.
- the data augmentation machine learning dies described herein may be different from GPUs in that a GPU typically transfers calculated data from its NVM to a volatile RAM/DRAM, whereas the augmentations described in various examples herein are done by the NAND dies.
- the die includes extra-array logic for performing the augmentation, storing the results, and performing other machine learning operations, such as the actual training of a DLA based on the augmented data.
- an NVM architecture is disclosed that offloads data augmentation from host devices or other devices and instead performs the augmentation within the NVM die.
- at least some of the methods and apparatus disclosed herein exploit die parallelism and inherent features of an NVM (such as inherent noise features). This can facilitate the implementation of machine learning edge computing application training on-chip.
- FIG. 1 is a block diagram of a system 100 including an exemplary SSD having an NVM with on-chip machine learning data augmentation components.
- the system 100 includes a host 102 and an SSD 104 coupled to the host 102.
- the host 102 provides commands to the SSD 104 for transferring data between the host 102 and the SSD 104 .
- the host 102 may provide a write command to the SSD 104 for writing data to the SSD 104 or a read command to the SSD 104 for reading data from the SSD 104.
- the host 102 may be any system or device having a need for data storage or retrieval and a compatible interface for communicating with the SSD 104 .
- the host 102 may be a computing device, a personal computer, a portable computer, a workstation, a server, a personal digital assistant, a digital camera, or a digital phone as merely a few examples.
- the host 102 may be a system or device having a need for neural network processing, such as speech recognition, computer vision, and self-driving vehicles.
- the host 102 may be a component of a self-driving system of a vehicle.
- the SSD 104 includes a host interface 106 , a controller 108 , a memory 110 (such as RAM), an NVM interface 112 (which may be referred to as a flash interface), and an NVM 114 , such as one or more NAND dies configured with on-chip machine learning data augmentation components.
- the host interface 106 is coupled to the controller 108 and facilitates communication between the host 102 and the controller 108 .
- the controller 108 is coupled to the memory 110 as well as to the NVM 114 via the NVM interface 112 .
- the host interface 106 may be any suitable communication interface, such as an Integrated Drive Electronics (IDE) interface, a Universal Serial Bus (USB) interface, a Serial Peripheral (SP) interface, an Advanced Technology Attachment (ATA) or Serial Advanced Technology Attachment (SATA) interface, a Small Computer System Interface (SCSI), an IEEE 1394 (Firewire) interface, or the like.
- the host 102 includes the SSD 104 .
- the SSD 104 is remote from the host 102 or is contained in a remote computing system communicatively coupled with the host 102 .
- the host 102 may communicate with the SSD 104 through a wireless communication link.
- the controller 108 controls operation of the SSD 104 .
- the controller 108 receives commands from the host 102 through the host interface 106 and performs the commands to transfer data between the host 102 and the NVM 114 .
- the controller 108 may manage reading from and writing to memory 110 for performing the various functions effected by the controller and to maintain and manage cached information stored in memory 110 .
- the controller 108 may include any type of processing device, such as a microprocessor, a microcontroller, an embedded controller, a logic circuit, software, firmware, or the like, for controlling operation of the SSD 104 .
- some or all of the functions described herein as being performed by the controller 108 may instead be performed by another element of the SSD 104 .
- the SSD 104 may include a microprocessor, a microcontroller, an embedded controller, a logic circuit, software, firmware, or any kind of processing device, for performing one or more of the functions described herein as being performed by the controller 108 .
- one or more of the functions described herein as being performed by the controller 108 are instead performed by the host 102 .
- some or all of the functions described herein as being performed by the controller 108 may instead be performed by another element such as a controller in a hybrid drive including both non-volatile memory elements and magnetic storage elements.
- the memory 110 may be any suitable memory, computing device, or system capable of storing data.
- the memory 110 may be ordinary RAM, DRAM, double data rate (DDR) RAM, static RAM (SRAM), synchronous dynamic RAM (SDRAM), flash storage, an erasable programmable read-only memory (EPROM), an electrically erasable programmable ROM (EEPROM), or the like.
- the controller 108 uses the memory 110 , or a portion thereof, to store data during the transfer of data between the host 102 and the NVM 114 .
- the memory 110 or a portion of the memory 110 may be a cache memory.
- the NVM 114 receives data from the controller 108 via the NVM interface 112 and stores the data.
- the NVM 114 may be any suitable type of non-volatile memory, such as a NAND-type flash memory or the like.
- the controller 108 may include hardware, firmware, software, or any combinations thereof that provide a machine learning controller 116 for use with the NVM array 114 (where the machine learning controller, in some examples, may include at least some off-chip data augmentation components such as components that control data augmentation based on controlling off-chip error correction).
- Although FIG. 1 shows an example SSD, and an SSD is generally used as an illustrative example in the description throughout, the various disclosed embodiments are not necessarily limited to an SSD application/implementation.
- the disclosed NVM die and associated processing components can be implemented as part of a package that includes other processing circuitry and/or components.
- a processor may include, or otherwise be coupled with, embedded NVM and associated circuitry and/or components for machine learning that are described herein.
- the processor could, as one example, off-load certain machine learning tasks to the NVM and associated circuitry and/or components.
- the controller 108 may be a controller in another type of device and still include the machine learning controller 116 and perform some or all of the functions described herein.
- FIG. 2 illustrates a block diagram of an exemplary NVM die 200 that includes NVM storage array components 202 and under-the-array or next-to-the-array (or other extra-array) processing components 204 . Not all circuit or memory components that might be used in a practical NVM die are illustrated in the figure, such as input and output components, voltage regulation components, clocks and timing components, etc. Rather only some components and circuits are shown, summarized as block or schematic diagrams.
- the exemplary NVM array components 202 include: NVM storage 206 for storing machine learning training data such as input image data and augmented image data; and NVM storage 208 configured for storing other data such as DNN synaptic weights, bias values, etc., or other types of user data or system data.
- the NVM extra-array processing components 204 include data augmentation components 210 configured to perform or control data augmentation operations.
- the exemplary data augmentation components 210 include: one or more noise addition components 212 configured to generate augmented machine learning data by adding noise to initial machine learning data, such as by adding uncorrelated noise to each of an initial set of labeled training images; one or more skew components 214 configured to generate augmented machine learning data by skewing initial machine learning data, such as by skewing each of an initial set of labeled training images in a different manner; one or more crop components 216 configured to generate augmented machine learning data by cropping initial machine learning data, such as by cropping each of an initial set of labeled training images in a different manner; one or more flip/rotate/translate components 218 configured to generate augmented machine learning data by flipping, rotating and/or translating initial machine learning data, such as by flipping, rotating and/or translating each of an initial set of labeled training images in a different manner.
- multiple instances of each augmentation component (212, 214, 216, and 218) are shown since, in some examples, a plurality of such devices may operate in parallel.
- N noise addition components 212 may be provided to concurrently process N different input training images to generate a set of augmented images from each of the N different input training images.
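- The concurrent arrangement described above can be sketched in software. This is only an analogy for on-die hardware: one worker thread per input image stands in for each of the N noise addition components, images are assumed to be binary (lists of rows of 0/1), and the names `add_noise`, `augment_in_parallel`, and `flip_prob` are hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def add_noise(image, variants=4, flip_prob=0.05, seed=0):
    """Produce `variants` noisy copies of a binary image by flipping random bits."""
    rng = random.Random(seed)
    return [
        [[bit ^ 1 if rng.random() < flip_prob else bit for bit in row]
         for row in image]
        for _ in range(variants)
    ]

def augment_in_parallel(images, variants=4):
    """Process N input images concurrently, one worker per image."""
    with ThreadPoolExecutor(max_workers=max(1, len(images))) as pool:
        futures = [pool.submit(add_noise, image, variants, 0.05, seed)
                   for seed, image in enumerate(images)]
        # Results come back in input order: one augmented set per input image.
        return [future.result() for future in futures]

images = [[[0, 1], [1, 0]] for _ in range(3)]
augmented_sets = augment_in_parallel(images)
print(len(augmented_sets), len(augmented_sets[0]))  # 3 4
```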
- only a single instance of each component may be provided.
- only one or a few of the illustrated components are provided such as only the noise addition components 212 or only the skew components 214 .
- other augmentation components are additionally or alternatively provided, which serve to augment the initial data set in other manners.
- the exemplary components of FIG. 2 primarily relate to the augmentation of image data. For examples where the data is not image data but, for example, audio data, different augmentation components may be provided that are appropriate to the type of data.
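- The image operations attributed to components 214, 216, and 218 can be sketched as plain functions on a 2-D list of pixel values. These are software stand-ins written under simple assumptions (a 90-degree clockwise rotation, a horizontal shear of one pixel per row, zero fill for exposed pixels); the disclosure does not specify these particular parameters, and the function names are hypothetical.

```python
def flip_horizontal(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]

def rotate_90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(column) for column in zip(*image[::-1])]

def crop(image, top, left, height, width):
    """Keep a height x width window whose corner is at (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]

def translate(image, dx, fill=0):
    """Shift every row right by dx pixels (dx >= 0), padding with `fill`."""
    return [([fill] * dx + row)[:len(row)] for row in image]

def skew(image, fill=0):
    """Horizontal shear: row r is shifted right by r pixels; the width grows."""
    height = len(image)
    return [[fill] * r + row + [fill] * (height - 1 - r)
            for r, row in enumerate(image)]

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
# One labeled input yields several differently modified training images.
variants = [flip_horizontal(image), rotate_90(image),
            crop(image, 0, 1, 2, 2), translate(image, 1), skew(image)]
print(len(variants))  # 5
```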
- the NVM extra-array processing components 204 of FIG. 2 also include various other components including: a machine learning value storage controller 226 configured to store machine learning data in the NVM storage 206 ; a machine learning value read controller 228 configured to read previously-stored machine learning data from the NVM storage 206 ; and an on-chip error correction code (ECC) controller 230 configured to control any on-chip ECC applied to data as it is read from the NVM array components 202 to address a bit error rate (BER).
- certain types of data augmentation can be performed by adjusting ECC or, in some cases, deactivating ECC so as to increase the BER to selectively add noise into images.
- a data augmentation controller may be configured to generate augmented data by reducing an amount of error correction performed by the error correction components compared to an average amount of error correction that would otherwise be employed to read data not subject to data augmentation, and then reading stored data from the NVM elements with the reduced error correction.
- ECC may be reduced by examining only two bytes of the ECC data.
- ECC is instead performed by a device controller that is separate from the die (such as controller 108 of FIG. 1 ).
- the die itself does not control ECC and hence cannot directly adjust the ECC.
- all images read by the die may be “noisy” images suitable for use as augmented images in on-chip training.
- the BER for an NVM block may vary as a function of underlying conditions and memory type and so read controller 228 may be programmed or configured to take such information into account when selecting a target location for storing data. For example, for data augmentation purposes, write parameters may be selected or modified to increase the BER so that any augmentation requirements are satisfied.
- one technique for modifying write parameters is to modify the location where image data is written so as to store the data in worn regions of the NVM array 202 to thereby increase storage errors, so as to inject noise into the stored/retrieved image data.
- FIG. 2 also illustrates a machine learning controller 232 , which may be, e.g., a DLA, DNN, pattern recognition controller, image recognition controller, etc., configured to perform some form of machine learning using augmented data.
- the augmented data is stored in the NVM arrays 202 for later use.
- augmented data is held in other memory within the die, such as within data latches (not shown in FIG. 2 ), for immediate use by training components, then erased or overwritten. That is, in some examples, the augmented data may be transient data that is saved only as long as it is needed to train a machine learning system (e.g. a DNN) and then discarded.
- FIG. 3 illustrates an exemplary method 300 for data augmentation for use with image recognition according to aspects of the present disclosure where any of the aforementioned forms of data augmentation may be applied (e.g. skewing, rotating, etc.).
- input circuitry of an NVM die inputs an initial set of labeled (or tagged) input images for use with image recognition training (or for use with other forms of deep learning or machine learning) and stores the initial set of images within a NAND NVM array of the die.
- read circuitry of the NVM die reads one or more of the labeled images from the NAND NVM array.
- data augmentation circuitry of the NVM die generates a set of altered versions of the labeled images by, e.g., rotating, translating, skewing, cropping, flipping, and/or adding noise to the labeled images read from the NAND NVM array to provide an augmented image set.
- machine learning circuitry of the NVM die performs machine learning, such as DLA learning, using the augmented image set to, for example, train an image recognition system to recognize images within the augmented image set, and then output a set of trained parameters.
- the parameters may include synaptic weights, bias values, etc., for use with a DNN configured for image recognition.
- the image recognition system itself may be configured, e.g., within the extra-array circuitry of the die or may be embodied elsewhere, such as within an SSD controller, a host system, or a remote server.
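The read, augment, and train flow of method 300 can be illustrated with a minimal sketch of the augmentation step. Images are modeled as 2-D lists of pixel values paired with a label; the function names, the particular set of variants, and the fill value are assumptions made for illustration and are not taken from the disclosure.

```python
def flip_horizontal(img):
    """Mirror each row left to right."""
    return [list(reversed(row)) for row in img]


def rotate_90(img):
    """Rotate the image clockwise by 90 degrees."""
    return [list(row) for row in zip(*img[::-1])]


def translate_right(img, shift, fill=0):
    """Shift pixels right by `shift` columns (shift >= 0), padding with `fill`."""
    width = len(img[0])
    shift = min(shift, width)
    return [[fill] * shift + list(row[: width - shift]) for row in img]


def crop(img, top, left, height, width):
    """Keep a height x width window whose upper-left corner is (top, left)."""
    return [list(row[left : left + width]) for row in img[top : top + height]]


def augment(labeled_images):
    """Return altered copies of each (label, image) pair; labels are preserved."""
    augmented = []
    for label, img in labeled_images:
        h, w = len(img), len(img[0])
        variants = [
            flip_horizontal(img),
            rotate_90(img),
            translate_right(img, 1),
            crop(img, 0, 0, max(1, h - 1), max(1, w - 1)),
        ]
        augmented.extend((label, v) for v in variants)
    return augmented
```

Each variant keeps the label of the image it was derived from, which is what allows the augmented set to be used directly for supervised training.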
- FIG. 4 illustrates a NAND array 400 of an NVM die (such as the die of FIG. 2 ) for storing image data and various procedures that manipulate and process the data using the methods of FIG. 3 .
- a host device or other external system provides labeled training images, such as labeled images of particular individuals to be identified by an image recognition system or particular types of objects or animals to be detected.
- circuitry of the NVM die inputs and stores the labeled images within a first portion or first region 406 of the NAND array 400 .
- circuitry of the NAND die reads the labeled images at 408 , modifies the images to generate an augmented set of labeled images at 410 , and then stores the augmented set of images at 412 into a second region or second portion 414 of the NAND array 400 for subsequent use in training an image recognition system, such as for training the DLA of an on-chip image recognition system.
- the circuitry of the NAND die uses the augmented set of images (substantially) immediately to train an image recognition system, such as by directly applying the augmented set of images to an on-chip DLA.
- FIG. 5 illustrates an exemplary method 500 for data augmentation for use with image recognition according to aspects of the present disclosure where data augmentation is performed by deactivating or at least reducing the use of on-chip ECC (or other on-chip error correction systems or procedures). Note that, herein, deactivating ECC is one example of reducing the use of ECC.
- input circuitry of the NVM die inputs an initial set of labeled input images for use with image recognition training (or other forms of deep learning or machine learning) and stores the initial set of labeled input images within a NAND NVM array of the die, where the die is configured to apply on-chip error correction to data read from the NAND NVM array.
- control circuitry of the NVM die deactivates on-chip error detection and correction procedures or otherwise reduces the amount (or effectiveness) of on-chip error correction applied by the NVM to data read from the array to selectively increase the effective BER.
- read circuitry of the NVM die repeatedly reads an image from the NAND NVM array without on-chip error correction or with reduced on-chip error correction to generate a set of augmented labeled images that differ from one another and from the initial image due to differing noise artifacts caused by the lack of on-chip error correction or the reduced error correction.
- the noise artifacts may be, e.g., different noise vectors.
- machine learning circuitry of the NVM die performs machine learning, such as DLA learning, using the augmented labeled image set to, e.g., train an image recognition system to recognize images within the set, and then output a set of trained parameters.
- the image recognition system may be configured, e.g., within the extra-array circuitry of the die or may be embodied elsewhere, such as within an SSD controller, a host system, or a remote server.
- FIG. 6 illustrates a NAND array 600 of an NVM die (such as the die of FIG. 2 ) for storing image data and various procedures that manipulate and process the data using the methods of FIG. 5 .
- a host device or other external system provides labeled training images.
- circuitry of the NVM die inputs and stores the labeled images within a first portion or first region 606 of the NAND array 600 .
- later, when a data augmentation procedure is initiated, circuitry of the NAND die: repeatedly reads the labeled images at 608 with on-chip ECC deactivated or reduced so as to provide or retain noise within the read images and thus generate an augmented set of labeled images; and then stores the augmented set of images at 610 into a second region or second portion 612 of the NAND array 600 for subsequent use in training an image recognition system. Additionally or alternatively, at 614 , the circuitry of the NAND die uses the augmented set of images (substantially) immediately to train an image recognition system.
- in FIG. 6 , multiple arrows are shown leading from the first array portion 606 to emphasize that individual images stored therein can be repeatedly read.
- Each separate read from the NAND array, performed either without on-chip ECC or with reduced on-chip ECC, will generally result in different noise artifacts in the read-out images, where the noise artifacts are uncorrelated with one another, thus providing noise-based data augmentation or noise-augmented data sets.
- the read operations may be performed repeatedly until a training system that uses the augmented data set is satisfied that a sufficient number of samples of each particular image has been collected, such as by comparing the number of samples against a suitable threshold value or by verifying that the system is sufficiently trained.
- a read channel or NVM device controller that is separate from the NVM die may be configured to perform at least some of the procedures or operations of FIGS. 5 and 6 , for example if ECC is performed by a device or component that is separate from the die.
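The repeated-read procedure of FIGS. 5 and 6 can be approximated in software by the minimal simulation below. It assumes, purely for illustration, that a read with ECC deactivated flips each stored bit independently with a probability equal to the raw BER; real NAND error behavior is more structured. The names `raw_read` and `collect_noisy_reads` and the sample-count threshold are hypothetical.

```python
import random


def raw_read(stored: bytes, ber: float, rng: random.Random) -> bytes:
    """Simulate one read with ECC off: each bit flips independently with probability `ber`."""
    out = bytearray(stored)
    for i in range(len(out)):
        for bit in range(8):
            if rng.random() < ber:
                out[i] ^= 1 << bit
    return bytes(out)


def collect_noisy_reads(stored, label, ber, min_samples, rng=None):
    """Re-read the same stored image until `min_samples` noisy copies are collected."""
    rng = rng or random.Random()
    samples = []
    while len(samples) < min_samples:
        # Every pass draws fresh random flips, so the copies differ from one another.
        samples.append((label, raw_read(stored, ber, rng)))
    return samples
```

Here the stopping rule is the simple threshold comparison mentioned above; a check that the training system has converged could be substituted for the `len(samples) < min_samples` condition.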
- FIG. 7 summarizes a method that may be performed by the die.
- the die reads stored (target) data from a NAND block (which might be image data for use in DLA training or might be other data).
- the die determines whether a data augmentation mode is ON. If the data augmentation mode is ON, then at block 706 , the die uses the read data in DLA training or other machine learning training. If data augmentation mode is OFF, the read data at block 708 is instead sent to a controller (such as separate device controller 108 of FIG. 1 or a controller formed on the NAND) to perform ECC decoding on the data, so that the data can then be processed normally.
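The mode check of FIG. 7 amounts to a two-way branch, sketched minimally below. The 3x repetition code with majority voting is only a toy stand-in for ECC decoding (the disclosure does not specify the code, and NAND devices typically use far stronger codes); `route_read`, `majority_decode`, and `training_buffer` are hypothetical names.

```python
def majority_decode(raw_bits):
    """Toy ECC stand-in: decode a 3x repetition code by majority vote.

    Assumes len(raw_bits) is a multiple of 3.
    """
    return [
        1 if sum(raw_bits[i : i + 3]) >= 2 else 0
        for i in range(0, len(raw_bits), 3)
    ]


def route_read(raw_bits, augmentation_mode_on, training_buffer):
    """Route data read from a block according to the data augmentation mode."""
    if augmentation_mode_on:
        # Mode ON: keep the uncorrected (noisy) data for machine learning training.
        training_buffer.append(list(raw_bits))
        return None
    # Mode OFF: perform ECC decoding so the data can be processed normally.
    return majority_decode(raw_bits)
```

For example, with the raw bits `[1, 1, 0, 0, 0, 1]`, the OFF path returns the corrected payload `[1, 0]`, while the ON path appends the six raw bits unchanged to the training buffer.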
- FIG. 8 illustrates an exemplary method 800 for data augmentation for use with image recognition according to aspects of the present disclosure where data augmentation is performed by modifying read voltages to inject noise into read images (or otherwise obtain a greater amount of read errors).
- input circuitry of the NVM die inputs an initial set of labeled input images for use with image recognition training (or other forms of deep learning or machine learning) and stores the initial set of labeled input images within a NAND NVM array of the die.
- control circuitry of the NVM die identifies a read voltage for reading data from the NVM array with minimal read errors (e.g. a normal read voltage set to achieve a low BER).
- the control circuitry of the NVM die modifies the read voltages applied to its NVM elements (as compared to read voltages that would otherwise be employed to read images not subject to data augmentation, e.g. the voltages with minimal read errors identified at block 804 ). And so, in one example, if data is ordinarily read using an average threshold voltage of X volts, the modified read voltage might be 0.9X volts.
- read circuitry of the NVM die applies the modified read voltages to the NVM elements while reading one or more of the initial labeled images from the NVM elements to generate a set of augmented labeled images that differ from one another and from the initial images due to differing noise artifacts caused by the modified read voltages.
- machine learning circuitry of the NVM die performs machine learning, such as DLA learning, using the augmented labeled image set to, e.g., train an image recognition system to recognize images within the set, and then output a set of trained parameters.
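The effect of a modified read voltage can be illustrated with a minimal single-level-cell model. The sketch assumes that an erased cell stores bit 1 with a low threshold voltage, a programmed cell stores bit 0 with a high threshold voltage, and a cell reads as 1 when its threshold voltage is below the read voltage. The voltage values, the Gaussian spread, and all function names are illustrative assumptions, not device parameters from the disclosure.

```python
import random


def program_cells(bits, rng, v_erased=1.0, v_programmed=3.0, sigma=0.3):
    """Assign each cell a threshold voltage drawn around its nominal level."""
    return [rng.gauss(v_erased if b else v_programmed, sigma) for b in bits]


def read_cells(thresholds, v_read):
    """A cell reads as 1 when its threshold voltage is below the read voltage."""
    return [1 if vt < v_read else 0 for vt in thresholds]


def count_errors(bits, read_back):
    """Number of positions where the read value differs from the stored bit."""
    return sum(a != b for a, b in zip(bits, read_back))


def augmented_reads(thresholds, v_nominal, scales):
    """Read the same cells once per scale factor, e.g. 0.9 for a read at 0.9X volts."""
    return [read_cells(thresholds, s * v_nominal) for s in scales]
```

In this model a cell whose threshold voltage lies between 0.9X and X reads correctly at the nominal voltage X but flips at 0.9X, which is the mechanism by which the shifted read injects noise. Sweeping several scale factors yields copies that differ from one another.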
- FIG. 9 illustrates a NAND array 900 of an NVM die (such as the die of FIG. 2 ) for storing image data and various procedures that manipulate and process the data using the methods of FIG. 8 .
- a host device or other external system provides labeled training images.
- circuitry of the NVM die inputs and stores the labeled images within a first portion or first region 906 of the NAND array 900 .
- later, when a data augmentation procedure is initiated, circuitry of the NAND die: repeatedly reads the labeled images at 908 with the modified read voltages so as to provide or retain noise within the read images to thereby generate an augmented set of labeled images; and then stores the augmented set of images into a second region or second portion 912 of the NAND array 900 for subsequent use in training an image recognition system. Additionally or alternatively, at 914 , the circuitry of the NAND die uses the augmented set of images (substantially) immediately to train an image recognition system. In FIG. 9 , multiple arrows are shown leading from the first array portion 906 to emphasize that individual images stored therein can be repeatedly read with potentially different read voltages. Each separate read from the NAND array will generally result in different noise artifacts, where the noise artifacts are uncorrelated with one another, thus providing noise-based data augmentation or noise-augmented data sets.
- FIG. 10 illustrates an exemplary method 1000 for data augmentation for use with image recognition according to aspects of the present disclosure where data augmentation is performed by repeatedly writing (initially un-augmented) data to worn regions of the NVM and then reading the data from the worn regions of the NVM to thereby inject noise into the images.
- input circuitry of the NVM die inputs an initial set of labeled input images for use with image recognition training (or other forms of deep learning or machine learning) and stores the initial set of labeled input images within a NAND NVM array of the die, where the die has regions affected by differing amounts of wear.
- control circuitry of the NVM die identifies worn regions of the NVM array that are subject to storage errors.
- any suitable technique can be used to identify worn areas of the NVM array, such as by tracking the BER of data read from various blocks.
- read circuitry of the NVM die reads labeled images from an initial storage region of the NVM array and, at block 1008 , write (program) circuitry of the NVM die re-stores the labeled images in the worn regions of the NVM subject to storage errors.
- read circuitry of the NVM die re-reads the labeled images from the worn regions of the NVM array to thereby obtain noise-augmented versions of the labeled images where the noise is caused by storing/reading from the worn regions of the NVM array that have high BER.
- machine learning circuitry of the NVM die performs machine learning, such as DLA learning, using the augmented labeled image set to, e.g., train an image recognition system to recognize images, and then output a set of trained parameters.
- FIG. 11 illustrates a NAND array 1100 of an NVM die (such as the die of FIG. 2 ) for storing image data and various procedures that manipulate and process the data using the methods of FIG. 10 .
- a host device or other external system provides labeled training images.
- circuitry of the NVM die inputs and stores the labeled images within a first (non-worn) region 1106 of the NAND array 1100 . Later, at 1108 , when a data augmentation procedure is initiated, circuitry of the NVM die re-stores the labeled images in a worn region 1110 of the NAND array.
- read circuitry of the NVM die repeatedly reads the labeled images from the worn region 1110 so as to thereby generate an augmented set of labeled images exploiting un-corrected read errors.
- circuitry of the NVM die stores the augmented set of images into another region 1116 of the NAND array 1100 for subsequent use in training an image recognition system. Additionally or alternatively, at 1118 , the circuitry of the NAND die uses the augmented set of images (substantially) immediately to train an image recognition system. In FIG. 11 , multiple arrows are shown leading from worn array portion 1110 to emphasize that individual images stored therein can be repeatedly read. Each separate read from the NAND array will generally result in different noise artifacts due to the worn characteristics of the array region 1110 , where the noise artifacts are uncorrelated with one another, thus providing for noise-augmented data sets.
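Identifying worn regions by tracking per-block BER, as in FIGS. 10 and 11, might look like the minimal sketch below. The exponential moving average, the smoothing factor, the threshold, and the function names are all assumptions chosen for illustration; the disclosure states only that any suitable technique, such as tracking the BER of data read from various blocks, may be used.

```python
def update_ber(ber_by_block, block, bit_errors, bits_read, alpha=0.2):
    """Fold one read's observed error rate into a running per-block BER estimate."""
    observed = bit_errors / bits_read
    previous = ber_by_block.get(block, observed)  # first observation seeds the estimate
    ber_by_block[block] = (1 - alpha) * previous + alpha * observed


def pick_worn_blocks(ber_by_block, ber_threshold):
    """Return ids of blocks whose tracked BER meets the threshold, worst first."""
    worn = [b for b, ber in ber_by_block.items() if ber >= ber_threshold]
    return sorted(worn, key=lambda b: ber_by_block[b], reverse=True)
```

Blocks returned by `pick_worn_blocks` would be the candidate targets for re-storing the labeled images, with subsequent reads from those blocks supplying the noise.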
- FIG. 12 broadly illustrates a process 1200 in accordance with some aspects of the disclosure.
- the process 1200 may take place within any suitable apparatus or device having a die capable of performing the operations, such as a NAND die.
- the die stores machine learning data within the NVM array of a die.
- the die generates augmented machine learning data using data augmentation circuitry formed in the die or using components of a memory controller. Examples are described above.
- ECC components of the memory controller may be configured or controlled to permit or facilitate the creation of augmented data sets by deactivating or reducing ECC.
- FIG. 13 broadly illustrates an embodiment of an apparatus 1300 configured according to one or more aspects of the disclosure.
- the apparatus 1300 or components thereof, could embody or be implemented within a NAND die or some other type of NVM device that supports data storage.
- the apparatus 1300 includes NVM elements 1302 and a data augmentation controller 1304 configured to augment machine learning data stored within the NVM elements 1302 with augmented machine learning data. Examples of the apparatus are described above. Additional examples are described below.
- at least some data augmentation components may be separate from the die, such as ECC components of a NAND device controller.
- FIG. 14 illustrates an embodiment of an apparatus 1400 configured according to one or more aspects of the disclosure.
- the apparatus 1400 or components thereof, could embody or be implemented within a NAND die or some other type of NVM device that supports data storage.
- the apparatus 1400 or components thereof, could be a component of a processor, a controller, a computing device, a personal computer, a portable device, a workstation, a server, a personal digital assistant, a digital camera, a digital phone, an entertainment device, a medical device, a self-driving vehicle control device, or any other electronic device that stores, processes or uses neural data.
- the apparatus 1400 includes a communication interface 1402 , a physical memory array (e.g., NAND blocks) 1404 , and extra-array processing circuits 1410 , 1411 (e.g. under-the-array or next-to-the-array circuits). These components can be coupled to and/or placed in electrical communication with one another via suitable components, represented generally by the connection lines in FIG. 14 . Although not shown, other circuits such as timing sources, peripherals, voltage regulators, and power management circuits may be provided, which are well known in the art, and therefore, will not be described any further.
- the communication interface 1402 provides a means for communicating with other apparatuses over a transmission medium.
- the communication interface 1402 includes circuitry and/or programming (e.g., a program) adapted to facilitate the communication of information bi-directionally with respect to one or more devices in a system.
- the communication interface 1402 may be configured for wire-based communication.
- the communication interface 1402 could be a bus interface, a send/receive interface, or some other type of signal interface including circuitry for outputting and/or obtaining signals (e.g., outputting signals from and/or receiving signals into an SSD).
- the communication interface 1402 serves as one example of a means for receiving and/or a means for transmitting.
- the physical memory array 1404 may represent one or more NAND blocks.
- the physical memory array 1404 may be used for storing data such as images that are manipulated by the circuits 1410 , 1411 or some other component of the apparatus 1400 .
- the physical memory array 1404 may be coupled to the circuits 1410 , 1411 such that the circuits 1410 , 1411 can read or sense information from, and write or program information to, the physical memory array 1404 . That is, the physical memory array 1404 can be coupled to the circuits 1410 , 1411 so that the physical memory array 1404 is accessible by the circuits 1410 , 1411 .
- the circuits 1410 , 1411 are arranged or configured to obtain, process and/or send data, control data access and storage, issue or respond to commands, and control other desired operations.
- the circuits 1410 , 1411 may be implemented as one or more processors, one or more controllers, and/or other structures configured to perform functions.
- the circuits 1410 , 1411 may be adapted to perform any or all of the extra-array features, processes, functions, operations and/or routines described herein.
- the circuits 1410 may be configured to perform any of the steps, functions, and/or processes described with respect to FIGS. 2 - 13 .
- the term “adapted” in relation to the processing circuits 1410 , 1411 may refer to the circuits being one or more of configured, employed, implemented, and/or programmed to perform a particular process, function, operation and/or routine according to various features described herein.
- the circuits may include a specialized processor, such as an application specific integrated circuit (ASIC) that serves as a means for (e.g., structure for) carrying out any one of the operations described in conjunction with FIGS. 2 - 13 .
- the circuits serve as an example of a means for processing.
- the circuits may provide and/or incorporate, at least in part, functionality described above for the components 204 of FIG. 2 .
- the processing circuits 1410 , 1411 may include one or more of: circuits/modules 1420 configured for storing images or other machine learning data in the NAND blocks; circuits/modules 1422 configured for reading images or other machine learning data from the NAND blocks; circuits/modules 1424 configured for controlling the augmentation of images or other machine learning data; circuits/modules 1426 configured for skewing images; circuits/modules 1428 configured for cropping images; circuits/modules 1430 configured for flipping/rotating/translating images; circuits/modules 1432 configured for controlling augmentation via noise; circuits/modules 1433 configured for performing ECC; circuits/modules 1434 configured for deactivating ECC; circuits/modules 1436 configured for reducing ECC; circuits/modules 1437 configured for controlling read voltages; circuits/modules 1438 configured for adjusting read voltages to inject noise; circuits/modules 1439 configured for controlling machine learning
- the physical memory array 1404 may include one or more of: blocks 1440 for storing machine learning data, such as input labeled images; blocks 1442 for storing augmented versions of the machine learning data; blocks 1444 that are worn regions; and blocks 1446 for storing other user data or system data (e.g. data pertaining to the overall control of operations of the NAND die).
- means may be provided for performing the functions illustrated in FIG. 14 and/or other functions illustrated or described herein.
- the means may include one or more of: means, such as circuit/module 1420 , for storing images or other machine learning data in the NAND blocks; means, such as circuits/modules 1422 , for reading images or other machine learning data from the NAND blocks; means, such as circuits/modules 1424 , for controlling the augmentation of images or other machine learning data; means, such as circuits/modules 1426 , for skewing images; means, such as circuits/modules 1428 , for cropping images; means, such as circuits/modules 1430 , for flipping/rotating/translating images; means, such as circuits/modules 1432 , for controlling augmentation via noise; means, such as circuits/modules 1433 , for performing ECC; means, such as circuits/modules 1434 , for deactivating ECC; means, such as circuits/modules 14
- means such as NVM elements 1202 of FIG. 2 that are formed in a die, are provided for storing machine learning data within the NVM array of the die; and means, such as data augmentation controller 204 of FIG. 2 that are also formed in the die, are provided for generating at least one augmented version of the machine learning data.
- processing circuits described herein may be generally adapted for processing, including the execution of programming code stored on a storage medium.
- “code” or “programming” shall be construed broadly to include without limitation instructions, instruction sets, data, code, code segments, program code, programs, programming, subprograms, software modules, applications, software applications, software packages, routines, subroutines, objects, executables, threads of execution, procedures, functions, etc., whether referred to as software, firmware, middleware, microcode, hardware description language, or otherwise.
- processing circuits described herein may be arranged to obtain, process and/or send data, control data access and storage, issue commands, and control other desired operations.
- the processing circuits may include circuitry configured to implement desired programming provided by appropriate media in at least one example.
- the processing circuits may be implemented as one or more processors, one or more controllers, and/or other structure configured to execute executable programming.
- Examples of processing circuits may include a general purpose processor, a digital signal processor (DSP), an ASIC, a field programmable gate array (FPGA) or other programmable logic component, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein.
- a general purpose processor may include a microprocessor, as well as any conventional processor, controller, microcontroller, or state machine. At least some of the processing circuits may also be implemented as a combination of computing components, such as a combination of a controller and a microprocessor, a number of microprocessors, one or more microprocessors in conjunction with an ASIC and a microprocessor, or any other number of varying configurations.
- the various examples of processing circuits noted herein are for illustration and other suitable configurations within the scope of the disclosure are also contemplated.
- NAND flash memory such as 3D NAND flash memory.
- Semiconductor memory devices include volatile memory devices, such as DRAM or SRAM devices, NVM devices, such as ReRAM, EEPROM, flash memory (which can also be considered a subset of EEPROM), ferroelectric random access memory (FRAM), and MRAM, and other semiconductor elements capable of storing information.
- Each type of memory device may have different configurations.
- flash memory devices may be configured in a NAND or a NOR configuration.
- some features described herein are specific to NAND-based devices, such as the NAND-based on-chip copy with update.
- the memory devices can be formed from passive and/or active elements, in any combinations.
- passive semiconductor memory elements include ReRAM device elements, which in some embodiments include a resistivity switching storage element, such as an anti-fuse, phase change material, etc., and optionally a steering element, such as a diode, etc.
- active semiconductor memory elements include EEPROM and flash memory device elements, which in some embodiments include elements containing a charge storage region, such as a floating gate, conductive nanoparticles, or a charge storage dielectric material.
- Multiple memory elements may be configured so that they are connected in series or so that each element is individually accessible.
- flash memory devices in a NAND configuration typically contain memory elements connected in series.
- a NAND memory array may be configured so that the array is composed of multiple strings of memory in which a string is composed of multiple memory elements sharing a single bit line and accessed as a group.
- memory elements may be configured so that each element is individually accessible, e.g., a NOR memory array.
- NAND and NOR memory configurations are exemplary, and memory elements may be otherwise configured.
- the semiconductor memory elements located within and/or over a substrate may be arranged in two or three dimensions, such as a two dimensional memory structure or a three dimensional memory structure.
- the semiconductor memory elements are arranged in a single plane or a single memory device level.
- memory elements are arranged in a plane (e.g., in an x-y direction plane) which extends substantially parallel to a major surface of a substrate that supports the memory elements.
- the substrate may be a wafer over or in which the layer of the memory elements is formed or it may be a carrier substrate which is attached to the memory elements after they are formed.
- the substrate may include a semiconductor such as silicon.
- the memory elements may be arranged in the single memory device level in an ordered array, such as in a plurality of rows and/or columns. However, the memory elements may be arrayed in non-regular or non-orthogonal configurations.
- the memory elements may each have two or more electrodes or contact lines, such as bit lines and word lines.
- a three dimensional memory array is arranged so that memory elements occupy multiple planes or multiple memory device levels, thereby forming a structure in three dimensions (i.e., in the x, y and z directions, where the z direction is substantially perpendicular and the x and y directions are substantially parallel to the major surface of the substrate).
- a three dimensional memory structure may be vertically arranged as a stack of multiple two dimensional memory device levels.
- a three dimensional memory array may be arranged as multiple vertical columns (e.g., columns extending substantially perpendicular to the major surface of the substrate, i.e., in the z direction) with each column having multiple memory elements in each column.
- the columns may be arranged in a two dimensional configuration, e.g., in an x-y plane, resulting in a three dimensional arrangement of memory elements with elements on multiple vertically stacked memory planes.
- Other configurations of memory elements in three dimensions can also constitute a three dimensional memory array.
- the memory elements may be coupled together to form a NAND string within a single horizontal (e.g., x-y) memory device level.
- the memory elements may be coupled together to form a vertical NAND string that traverses across multiple horizontal memory device levels.
- Other three dimensional configurations can be envisioned wherein some NAND strings contain memory elements in a single memory level while other strings contain memory elements which span through multiple memory levels.
- Three dimensional memory arrays may also be designed in a NOR configuration and in a ReRAM configuration.
- in a monolithic three dimensional memory array, typically, one or more memory device levels are formed above a single substrate.
- the monolithic three dimensional memory array may also have one or more memory layers at least partially within the single substrate.
- the substrate may include a semiconductor such as silicon.
- the layers constituting each memory device level of the array are typically formed on the layers of the underlying memory device levels of the array.
- layers of adjacent memory device levels of a monolithic three dimensional memory array may be shared or have intervening layers between memory device levels.
- non-monolithic stacked memories can be constructed by forming memory levels on separate substrates and then stacking the memory levels atop each other. The substrates may be thinned or removed from the memory device levels before stacking, but as the memory device levels are initially formed over separate substrates, the resulting memory arrays are not monolithic three dimensional memory arrays. Further, multiple two dimensional memory arrays or three dimensional memory arrays (monolithic or non-monolithic) may be formed on separate chips and then packaged together to form a stacked-chip memory device.
- Associated circuitry is typically required for operation of the memory elements and for communication with the memory elements.
- memory devices may have circuitry used for controlling and driving memory elements to accomplish functions such as programming and reading.
- This associated circuitry may be on the same substrate as the memory elements and/or on a separate substrate.
- a controller for memory read-write operations may be located on a separate controller chip and/or on the same substrate as the memory elements.
- the subject matter described herein may be implemented in hardware, software, firmware, or any combination thereof.
- the terms “function,” “module,” and the like as used herein may refer to hardware, which may also include software and/or firmware components, for implementing the feature being described.
- the subject matter described herein may be implemented using a computer readable medium having stored thereon computer executable instructions that when executed by a computer (e.g., a processor) control the computer to perform the functionality described herein.
- Examples of computer readable media suitable for implementing the subject matter described herein include non-transitory computer-readable media, such as disk memory devices, chip memory devices, programmable logic devices, and application specific integrated circuits.
- a computer readable medium that implements the subject matter described herein may be located on a single device or computing platform or may be distributed across multiple devices or computing platforms.
- any reference to an element herein using a designation such as “first,” “second,” and so forth does not generally limit the quantity or order of those elements. Rather, these designations may be used herein as a convenient method of distinguishing between two or more elements or instances of an element. Thus, a reference to first and second elements does not mean that only two elements may be used there or that the first element must precede the second element in some manner. Also, unless stated otherwise a set of elements may include one or more elements.
- terminology of the form “at least one of A, B, or C” or “A, B, C, or any combination thereof” used in the description or the claims means “A or B or C or any combination of these elements.”
- this terminology may include A, or B, or C, or A and B, or A and C, or A and B and C, or 2A, or 2B, or 2C, or 2A and B, and so on.
- “at least one of: A, B, or C” is intended to cover A, B, C, A-B, A-C, B-C, and A-B-C, as well as multiples of the same members (e.g., any lists that include AA, BB, or CC).
- “at least one of: A, B, and C” is intended to cover A, B, C, A-B, A-C, B-C, and A-B-C, as well as multiples of the same members.
- a phrase referring to a list of items linked with “and/or” refers to any combination of the items.
- “A and/or B” is intended to cover A alone, B alone, or A and B together.
- “A, B and/or C” is intended to cover A alone, B alone, C alone, A and B together, A and C together, B and C together, or A, B, and C together.
- The term “determining” encompasses a wide variety of actions. For example, “determining” may include calculating, computing, processing, deriving, investigating, looking up (e.g., looking up in a table, a database or another data structure), ascertaining, and the like. Also, “determining” may include receiving (e.g., receiving information), accessing (e.g., accessing data in a memory), and the like. Also, “determining” may include resolving, selecting, choosing, establishing, and the like.
Abstract
Description
- This application is a continuation of U.S. patent application Ser. No. 16/447,619, filed Jun. 20, 2019, having Attorney Docket No. WDT-1342 (WDA-4383-US), entitled “NON-VOLATILE MEMORY DIE WITH ON-CHIP DATA AUGMENTATION COMPONENTS FOR USE WITH MACHINE LEARNING,” the entire content of which is incorporated herein by reference.
- The disclosure relates, in some embodiments, to non-volatile memory (NVM) dies. More specifically, but not exclusively, the disclosure relates to methods and apparatus for implementing data augmentation within an NVM die for use with machine learning.
- Machine learning generally relates to the use of artificial intelligence to perform tasks without explicit instructions and instead relying on patterns and inference. Deep learning (which also may be referred to as deep structured learning or hierarchical learning) relates to machine learning methods based on learning data representations or architectures, such as deep neural networks (DNNs), rather than to task-specific procedures or algorithms. Deep learning is applied to such fields as speech recognition, computer vision, and self-driving vehicles. Deep learning may be accomplished by, or facilitated by, deep learning accelerators (DLAs), e.g., microprocessor devices designed to accelerate the generation of useful neural networks to implement deep learning.
- A DLA or other machine learning system may need to be trained using initial training data, such as an initial set of images that have been tagged or labeled for use in training an image recognition system. Data augmentation includes procedures for expanding an initial set of images in a realistic but randomized manner to increase the variety of data for use during training. For example, a small set of input images may be altered slightly (by, e.g., rotating or skewing the images) to create a larger set of images (i.e. an augmented image set) for use in training the system. That is, data augmentation allows re-using tagged or labeled data in multiple training instances in order to increase the size of the training data set.
- The following presents a simplified summary of some aspects of the disclosure to provide a basic understanding of such aspects. This summary is not an extensive overview of all contemplated features of the disclosure, and is intended neither to identify key or critical elements of all aspects of the disclosure nor to delineate the scope of any or all aspects of the disclosure. Its sole purpose is to present various concepts of some aspects of the disclosure in a simplified form as a prelude to the more detailed description that is presented later.
- One embodiment of the disclosure provides an apparatus that includes: a die with non-volatile memory (NVM) elements; and a data augmentation controller formed in the die and configured to augment machine learning data stored within the NVM elements with augmented machine learning data.
- Another embodiment of the disclosure provides a method for use with a die having an NVM array, the method including: storing machine learning data within the NVM array of the die; and generating augmented machine learning data using data augmentation circuitry formed in the die.
- Yet another embodiment of the disclosure provides an apparatus with a die having an NVM array where the apparatus includes: means formed in the die for storing machine learning data within the NVM array of the die; and means formed in the die for generating at least one augmented version of the machine learning data.
- FIG. 1 illustrates a schematic block diagram configuration for an exemplary solid state device (SSD) having one or more non-volatile memory (NVM) array dies, where the dies have on-chip data augmentation components.
- FIG. 2 illustrates an example of an NVM die having on-chip under-the-array or next-to-the-array components configured for data augmentation processing.
- FIG. 3 illustrates a flow chart of an exemplary method according to aspects of the present disclosure for performing various types of on-chip data augmentation of image data.
- FIG. 4 illustrates a NAND array of an NVM die for storing image data and also schematically illustrates the various on-chip data augmentation procedures of FIG. 3.
- FIG. 5 illustrates a flow chart of an exemplary method according to aspects of the present disclosure for performing on-chip data augmentation of image data by deactivating or at least reducing the use of on-chip error correction procedures so as to obtain noisy images.
- FIG. 6 illustrates a NAND array of an NVM die for storing image data and also schematically illustrates the error correction-based data augmentation procedures of FIG. 5.
- FIG. 7 illustrates a flow chart of an exemplary method according to aspects of the present disclosure for use in systems where error correction procedures are instead performed by a separate device controller.
- FIG. 8 illustrates a flow chart of an exemplary method according to aspects of the present disclosure for performing on-chip data augmentation of images by adjusting read voltages during data reads so as to obtain noisy images.
- FIG. 9 illustrates a NAND array of an NVM die for storing image data and also schematically illustrates the read voltage-based data augmentation procedures of FIG. 8.
- FIG. 10 illustrates a flow chart of an exemplary method according to aspects of the present disclosure for performing on-chip data augmentation of images by storing and then reading image data within worn regions of the NVM die so as to obtain noisy images.
- FIG. 11 illustrates a NAND array of an NVM die for storing image data and also schematically illustrates the worn region-based data augmentation procedures of FIG. 10.
- FIG. 12 illustrates a flow chart that summarizes exemplary on-chip data augmentation operations performed by an NVM die.
- FIG. 13 illustrates a schematic block diagram configuration for an exemplary NVM apparatus such as a NAND die.
- FIG. 14 illustrates a schematic block diagram providing further details of an exemplary NVM die and its on-chip components.
- In the following detailed description, reference is made to the accompanying drawings, which form a part thereof. In addition to the illustrative aspects, embodiments, and features described above, further aspects, embodiments, and features will become apparent by reference to the drawings and the following detailed description. The description of elements in each figure may refer to elements of preceding figures. Like numbers may refer to like elements in the figures, including alternate embodiments of like elements.
- The examples herein relate to non-volatile memory (NVM) arrays, and to data storage devices or apparatus for controlling the NVM arrays, such as a controller of a data storage device (such as an SSD), and in particular to NAND flash memory storage devices (herein “NANDs”). (A NAND is a type of non-volatile storage technology that does not require power to retain data. It exploits negative-AND, i.e. NAND, logic.) For the sake of brevity, an SSD having one or more NAND dies will be used below in the description of various embodiments. It is understood that at least some aspects described herein may be applicable to other forms of data storage devices as well. For example, at least some aspects described herein may be applicable to phase-change memory (PCM) arrays, magneto-resistive random access memory (MRAM) arrays and resistive random access memory (ReRAM) arrays. In addition, the various embodiments may be used in various machine learning devices which may include some combination of processing elements and memory/data storage elements, including the NVM arrays constructed/configured in accordance with the described embodiments.
- As noted above, machine learning may be accomplished by, or facilitated by, deep learning accelerators (DLAs), e.g., microprocessor devices designed to accelerate the generation of deep neural networks (DNNs) to implement machine learning. These neural networks may also be referred to as learning networks. A DLA may need to be trained using initial training data, such as an initial set of images for training an image recognition system having a DLA. Data augmentation is a process of modifying an initial set of images (in, e.g., a realistic but randomized manner) to increase the variety or variance of data for use during training. For example, a set of input images may be altered (by, e.g., rotating or skewing the images) to create a larger set of images (an augmented image set) for use in training the system. Data augmentation may be defined more generally as a regularization technique for avoiding overfitting when training a machine learning system, such as a machine learning network or algorithm. Regularization is the process of adding information in order to solve an ill-posed problem or to prevent overfitting during machine learning. For example, regularization may make slight modifications to a learning model so the learning model generalizes more effectively from training data. Herein, the term data augmentation is defined as generating at least one modified version of data to avoid or reduce the risk of overfitting during training of a machine learning system using the data. The data may be, for example, a data vector, data array, data object or data representation of any number of dimensions, such as a 2-D data object containing one or more patterns. Examples of such data include images or audio segments or other types of numerical data, categorical data, time series data, or text.
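- As a rough software illustration of the definition above (the disclosure performs augmentation in on-die circuitry, not in Python), the following sketch generates several modified versions of one labeled 2-D data object while preserving its label. The names `rotate90`, `flip_horizontal`, and `augment`, the label "cat", and the toy 2x3 image are assumptions introduced here for illustration only.

```python
from typing import List, Tuple

Image = List[List[int]]  # 2-D data object: rows of pixel values


def rotate90(img: Image) -> Image:
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def flip_horizontal(img: Image) -> Image:
    """Mirror the image left-to-right."""
    return [row[::-1] for row in img]


def augment(img: Image, label: str) -> List[Tuple[Image, str]]:
    """Return the original plus modified versions, all sharing one label."""
    variants = [img, flip_horizontal(img), rotate90(img), rotate90(rotate90(img))]
    return [(v, label) for v in variants]


# Hypothetical toy input: one labeled 2x3 "image".
initial = [[1, 2, 3],
           [4, 5, 6]]
augmented_set = augment(initial, "cat")
```

- Each entry of `augmented_set` reuses the original label, which is the sense in which augmentation re-uses labeled data across multiple training instances.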
- Deep learning or machine learning may be implemented using processing components that are integrated with the memory components where the data to be processed is stored, i.e. using “near memory” computing, so as to reduce the need to transfer large quantities of data from one component to another. (The alternative, i.e. using standalone processing units such as graphics processing units (GPUs), central processing units (CPUs), etc., and stand-alone memory units such as dynamic random-access-memory (DRAM), can require transference of large quantities of data from one component to another.)
- Herein, methods and apparatus are disclosed for implementing data augmentation for use with near memory machine learning systems such as DNNs employing DLAs where the data augmentation is performed within the die of an NVM using, for example, under-the-array data augmentation components or next-to-the-array components or is performed using components of an off-chip memory controller coupled to the die. That is, a near memory computing architecture is disclosed herein for data augmentation.
- Note that a DNN is an example of an artificial neural network that has multiple layers between input and output layers. A DNN operates to determine a mathematical computation or manipulation to convert the input into the output, which might be a linear or non-linear computation. For example, the DNN may work through its layers by calculating a probability of each output. Each mathematical manipulation may be considered a layer. Networks that have many layers are referred to as having “deep” layers, hence the term DNN. In one particular example, the DNN might be configured to identify a person within an input image by processing the bits of the input image to identify the particular person, i.e. the output of the DNN is a value that identifies the particular person. The DNN may need to be trained. The data augmentation procedures and apparatus described herein may be used to augment an initial set of training data, such as an initial set of labeled images (where labeled images are images containing known data, such as an image that has already been identified as corresponding to a particular type of object). In addition to configuring an NVM die for near memory data augmentation, the die may also be configured for near memory DNN processing by, for example, providing a DLA on the die as well as data augmentation circuits.
- An advantage of at least some of the exemplary methods and apparatus described herein is that only the final result of a data augmented training procedure is transferred to the controller and host, thus avoiding the transference of large amounts of training data, such as augmented sets of training images that might include thousands of augmented images.
- Note also that the data augmentation machine learning dies described herein may be different from GPUs in that a GPU typically transfers calculated data from its NVM to a volatile RAM/DRAM, whereas the augmentations described in various examples herein are done by the NAND dies. As noted, in some examples, the die includes extra-array logic for performing the augmentation, storing the results, and performing other machine learning operations, such as the actual training of a DLA based on the augmented data. Thus, in some aspects, an NVM architecture is disclosed that offloads data augmentation from host devices or other devices and instead performs the augmentation within the NVM die. Moreover, at least some of the methods and apparatus disclosed herein exploit die parallelism and inherent features of an NVM (such as inherent noise features). This can facilitate on-chip training for machine learning edge computing applications.
- The data augmentation methods and apparatus described herein may be used in conjunction with on-chip DLA features and other features described in U.S. patent application Ser. No. 16/212,586 and in U.S. patent application Ser. No. 16/212,596, both entitled “NON-VOLATILE MEMORY DIE WITH DEEP LEARNING NEURAL NETWORK,” and both filed Dec. 6, 2018, both of which are assigned to the assignee of the present application.
- FIG. 1 is a block diagram of a system 100 including an exemplary SSD having an NVM with on-chip machine learning data augmentation components. The system 100 includes a host 102 and an SSD 104 coupled to the host 102. The host 102 provides commands to the SSD 104 for transferring data between the host 102 and the SSD 104. For example, the host 102 may provide a write command to the SSD 104 for writing data to the SSD 104 or a read command to the SSD 104 for reading data from the SSD 104. The host 102 may be any system or device having a need for data storage or retrieval and a compatible interface for communicating with the SSD 104. For example, the host 102 may be a computing device, a personal computer, a portable computer, a workstation, a server, a personal digital assistant, a digital camera, or a digital phone, as merely a few examples. Additionally or alternatively, the host 102 may be a system or device having a need for neural network processing, such as speech recognition, computer vision, and self-driving vehicles. For example, the host 102 may be a component of a self-driving system of a vehicle.
- The SSD 104 includes a host interface 106, a controller 108, a memory 110 (such as RAM), an NVM interface 112 (which may be referred to as a flash interface), and an NVM 114, such as one or more NAND dies configured with on-chip machine learning data augmentation components. The host interface 106 is coupled to the controller 108 and facilitates communication between the host 102 and the controller 108. The controller 108 is coupled to the memory 110 as well as to the NVM 114 via the NVM interface 112. The host interface 106 may be any suitable communication interface, such as an Integrated Drive Electronics (IDE) interface, a Universal Serial Bus (USB) interface, a Serial Peripheral (SP) interface, an Advanced Technology Attachment (ATA) or Serial Advanced Technology Attachment (SATA) interface, a Small Computer System Interface (SCSI), an IEEE 1394 (Firewire) interface, or the like. In some embodiments, the host 102 includes the SSD 104. In other embodiments, the SSD 104 is remote from the host 102 or is contained in a remote computing system communicatively coupled with the host 102. For example, the host 102 may communicate with the SSD 104 through a wireless communication link.
- The controller 108 controls operation of the SSD 104. In various aspects, the controller 108 receives commands from the host 102 through the host interface 106 and performs the commands to transfer data between the host 102 and the NVM 114. Furthermore, the controller 108 may manage reading from and writing to memory 110 for performing the various functions effected by the controller and to maintain and manage cached information stored in memory 110.
- The controller 108 may include any type of processing device, such as a microprocessor, a microcontroller, an embedded controller, a logic circuit, software, firmware, or the like, for controlling operation of the SSD 104. In some aspects, some or all of the functions described herein as being performed by the controller 108 may instead be performed by another element of the SSD 104. For example, the SSD 104 may include a microprocessor, a microcontroller, an embedded controller, a logic circuit, software, firmware, or any kind of processing device, for performing one or more of the functions described herein as being performed by the controller 108. According to other aspects, one or more of the functions described herein as being performed by the controller 108 are instead performed by the host 102. In still further aspects, some or all of the functions described herein as being performed by the controller 108 may instead be performed by another element such as a controller in a hybrid drive including both non-volatile memory elements and magnetic storage elements.
- The memory 110 may be any suitable memory, computing device, or system capable of storing data. For example, the memory 110 may be ordinary RAM, DRAM, double data rate (DDR) RAM (DRAM), static RAM (SRAM), synchronous dynamic RAM (SDRAM), a flash storage, an erasable programmable read-only-memory (EPROM), an electrically erasable programmable ROM (EEPROM), or the like. In various embodiments, the controller 108 uses the memory 110, or a portion thereof, to store data during the transfer of data between the host 102 and the NVM 114. For example, the memory 110 or a portion of the memory 110 may be a cache memory. The NVM 114 receives data from the controller 108 via the NVM interface 112 and stores the data. The NVM 114 may be any suitable type of non-volatile memory, such as a NAND-type flash memory or the like.
- In the example of FIG. 1, the controller 108 may include hardware, firmware, software, or any combinations thereof that provide a machine learning controller 116 for use with the NVM array 114 (where the machine learning controller, in some examples, may include at least some off-chip data augmentation components such as components that control data augmentation based on controlling off-chip error correction). Although FIG. 1 shows an example SSD and an SSD is generally used as an illustrative example in the description throughout, the various disclosed embodiments are not necessarily limited to an SSD application/implementation. As an example, the disclosed NVM die and associated processing components can be implemented as part of a package that includes other processing circuitry and/or components. For example, a processor may include, or otherwise be coupled with, embedded NVM and associated circuitry and/or components for machine learning that are described herein. The processor could, as one example, off-load certain machine learning tasks to the NVM and associated circuitry and/or components. As another example, the controller 108 may be a controller in another type of device and still include the machine learning controller 116 and perform some or all of the functions described herein.
- FIG. 2 illustrates a block diagram of an exemplary NVM die 200 that includes NVM storage array components 202 and under-the-array or next-to-the-array (or other extra-array) processing components 204. Not all circuit or memory components that might be used in a practical NVM die are illustrated in the figure, such as input and output components, voltage regulation components, clocks and timing components, etc. Rather, only some components and circuits are shown, summarized as block or schematic diagrams. The exemplary NVM array components 202 include: NVM storage 206 for storing machine learning training data such as input image data and augmented image data; and NVM storage 208 configured for storing other data such as DNN synaptic weights, bias values, etc., or other types of user data or system data.
- The NVM extra-array processing components 204 include data augmentation components 210 configured to perform or control data augmentation operations. In the example of FIG. 2, the exemplary data augmentation components 210 include: one or more noise addition components 212 configured to generate augmented machine learning data by adding noise to initial machine learning data, such as by adding uncorrelated noise to each of an initial set of labeled training images; one or more skew components 214 configured to generate augmented machine learning data by skewing initial machine learning data, such as by skewing each of an initial set of labeled training images in a different manner; one or more crop components 216 configured to generate augmented machine learning data by cropping initial machine learning data, such as by cropping each of an initial set of labeled training images in a different manner; and one or more flip/rotate/translate components 218 configured to generate augmented machine learning data by flipping, rotating and/or translating initial machine learning data, such as by flipping, rotating and/or translating each of an initial set of labeled training images in a different manner.
- Multiple instances of each augmentation component (212, 214, 216, and 218) are shown since, in some examples, a plurality of such devices may operate in parallel. For example, N noise addition components 212 may be provided to concurrently process N different input training images to generate a set of augmented images from each of the N different input training images. In other examples, only a single instance of each component may be provided. In still other examples, only one or a few of the illustrated components are provided, such as only the noise addition components 212 or only the skew components 214. In yet other examples, other augmentation components are additionally or alternatively provided, which serve to augment the initial data set in other manners. Note also that the exemplary components of FIG. 2 primarily relate to the augmentation of image data. For examples where the data is not image data but, for example, audio data, different augmentation components may be provided that are appropriate to the type of data.
- The NVM extra-array processing components 204 of FIG. 2 also include various other components including: a machine learning value storage controller 226 configured to store machine learning data in the NVM storage 206; a machine learning value read controller 228 configured to read previously-stored machine learning data from the NVM storage 206; and an on-chip error correction code (ECC) controller 230 configured to control any on-chip ECC applied to data as it is read from the NVM array components 202 to address a bit error rate (BER). As will be explained, certain types of data augmentation can be performed by adjusting ECC or, in some cases, deactivating ECC so as to increase the BER to selectively add noise into images. That is, a data augmentation controller may be configured to generate augmented data by reducing an amount of error correction performed by the error correction components compared to an average amount of error correction that would otherwise be employed to read data not subject to data augmentation, and then reading stored data from the NVM elements with the reduced error correction. And so, in one example, if the device ordinarily examines three bytes of ECC data within 512 bytes of data, ECC may be reduced by examining only two bytes of the ECC data. Thus, if the die itself is equipped for performing ECC procedures, those procedures can be deactivated or modified to increase the noise in the data read from the NAND arrays to provide augmented image data. It is noted that, in many systems, ECC is instead performed by a device controller that is separate from the die (such as controller 108 of FIG. 1). In such implementations, the die itself does not control ECC and hence cannot directly adjust the ECC. In such systems, because the ECC is performed by the controller, all images read by the die may be “noisy” images suitable for use as augmented images in on-chip training. Also, note that the BER for an NVM block may vary as a function of underlying conditions and memory type and so read controller 228 may be programmed or configured to take such information into account when selecting a target location for storing data. For example, for data augmentation purposes, write parameters may be selected or modified to increase the BER so that any augmentation requirements are satisfied. This is in contrast to the conventional desire to reduce BER so that ECC decoding is easier. As will be explained below, one technique for modifying write parameters is to modify the location where image data is written so as to store the data in worn regions of the NVM array 202 to thereby increase storage errors, so as to inject noise into the stored/retrieved image data.
- FIG. 2 also illustrates a machine learning controller 232, which may be, e.g., a DLA, DNN, pattern recognition controller, image recognition controller, etc., configured to perform some form of machine learning using augmented data. In some examples, the augmented data is stored in the NVM arrays 202 for later use. In other examples, augmented data is held in other memory within the die, such as within data latches (not shown in FIG. 2), for immediate use by training components, then erased or overwritten. That is, in some examples, the augmented data may be transient data that is saved only as long as it is needed to train a machine learning system (e.g. a DNN) and then discarded.
- In the following, various exemplary data augmentation systems and procedures are described where data is stored in a NAND array and where the data augmentation is used to train image or pattern recognition systems. As already explained, other types of NVM arrays may be used and the data augmentation may be applied to other types of machine learning. Hence, the following descriptions provide illustrative and non-limiting examples.
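- The parallel operation of the augmentation components of FIG. 2 can be loosely modeled in software. The sketch below is an assumption-laden analogy, not the die circuitry: `add_noise`, `crop`, and `augment_batch` are hypothetical names, images are plain lists of 8-bit pixel values, and a thread pool stands in for N hardware noise addition components working on N images at once.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from typing import List

Image = List[List[int]]  # 8-bit grayscale pixels, row-major


def add_noise(img: Image, seed: int, amplitude: int = 8) -> Image:
    """Add pseudo-random noise to every pixel, clamped to 0..255."""
    rng = random.Random(seed)  # per-image seed: each image gets its own noise sequence
    return [[min(255, max(0, p + rng.randint(-amplitude, amplitude))) for p in row]
            for row in img]


def crop(img: Image, top: int, left: int, height: int, width: int) -> Image:
    """Return the height x width window whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]


def augment_batch(images: List[Image]) -> List[Image]:
    """Process N input images concurrently, one worker per image (N >= 1)."""
    with ThreadPoolExecutor(max_workers=len(images)) as pool:
        return list(pool.map(add_noise, images, range(len(images))))


# Hypothetical input: three flat-gray 4x4 images.
images = [[[100] * 4 for _ in range(4)] for _ in range(3)]
noisy = augment_batch(images)
cropped = crop(images[0], 1, 1, 2, 2)
```

- A skew or translate component would follow the same pattern: a pure function from one image to another that can be mapped over the input set.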
-
FIG. 3 illustrates an exemplary method 300 for data augmentation for use with image recognition according to aspects of the present disclosure where any of the aforementioned forms of data augmentation may be applied (e.g. skewing, rotating, etc.). Beginning at block 302, input circuitry of an NVM die inputs an initial set of labeled (or tagged) input images for use with image recognition training (or for use with other forms of deep learning or machine learning) and stores the initial set of images within a NAND NVM array of the die. At 304, read circuitry of the NVM die reads one or more of the labeled images from the NAND NVM array. At 306, data augmentation circuitry of the NVM die generates a set of altered versions of the labeled images by, e.g., rotating, translating, skewing, cropping, flipping, and/or adding noise to the labeled images read from the NAND NVM array to provide an augmented image set. At 308, machine learning circuitry of the NVM die performs machine learning, such as DLA learning, using the augmented image set to, for example, train an image recognition system to recognize images within the augmented image set, and then output a set of trained parameters. In some examples, the parameters may include synaptic weights, bias values, etc., for use with a DNN configured for image recognition. The image recognition system itself may be configured, e.g., within the extra-array circuitry of the die or may be embodied elsewhere, such as within an SSD controller, a host system, or a remote server.
- Insofar as flipping is concerned, when using a DLA, images often need to be stored in a parsed format (rather than a compressed format like JPEG). With parsed images, flipping of an image can be achieved by reversing the order of read pixels. Flipping on a different axis may be performed by the die if the size and parameters of the image are stored in the NAND memory (as would often be the case with an on-chip DLA) and hence the parameters are available to the die logic circuitry for use in flipping. Note also that noise can be added to an image by omitting every other bit of the image or every other row or column of the image, or by performing other relatively straightforward adjustments to an image to generate a “noisy” version of the image.
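The flipping and row-omission operations described above can be sketched in software as follows. This is a minimal illustration rather than the die's actual logic: it assumes a parsed image held as a flat, row-major list of pixel values with a known width, and all function names are hypothetical.

```python
from typing import List


def split_rows(pixels: List[int], width: int) -> List[List[int]]:
    """Split a flat, row-major pixel list into rows (assumed storage layout)."""
    if width <= 0 or len(pixels) % width != 0:
        raise ValueError("pixel count must be a positive multiple of width")
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]


def flip_horizontal(pixels: List[int], width: int) -> List[int]:
    """Mirror left-to-right by reversing the read order within each row."""
    return [p for row in split_rows(pixels, width) for p in reversed(row)]


def flip_vertical(pixels: List[int], width: int) -> List[int]:
    """Mirror top-to-bottom by reversing the order in which rows are read."""
    return [p for row in reversed(split_rows(pixels, width)) for p in row]


def drop_every_other_row(pixels: List[int], width: int) -> List[int]:
    """Keep rows 0, 2, 4, ... to produce a degraded ("noisy") version."""
    return [p for row in split_rows(pixels, width)[::2] for p in row]


# A small example image (width 3, height 2), stored row by row.
image = [1, 2, 3,
         4, 5, 6]
mirrored = flip_horizontal(image, width=3)       # [3, 2, 1, 6, 5, 4]
upside_down = flip_vertical(image, width=3)      # [4, 5, 6, 1, 2, 3]
thinned = drop_every_other_row(image, width=3)   # [1, 2, 3]
```

Reversing the entire flat pixel stream (`image[::-1]`) flips the image on both axes at once, which is why the stored image width is needed to flip on only one axis.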
-
FIG. 4 illustrates a NAND array 400 of an NVM die (such as the die of FIG. 2) for storing image data and various procedures that manipulate and process the data using the methods of FIG. 3. At 402, a host device or other external system provides labeled training images, such as labeled images of particular individuals to be identified by an image recognition system or particular types of objects or animals to be detected. At 404, circuitry of the NVM die inputs and stores the labeled images within a first portion or first region 406 of the NAND array 400. Later, when a data augmentation procedure is initiated, circuitry of the NAND die reads the labeled images at 408, modifies the images to generate an augmented set of labeled images at 410, and then stores the augmented set of images at 412 into a second region or second portion 414 of the NAND array 400 for subsequent use in training an image recognition system, such as for training the DLA of an on-chip image recognition system. Additionally or alternatively, at 416, the circuitry of the NAND die uses the augmented set of images (substantially) immediately to train an image recognition system, such as by directly applying the augmented set of images to an on-chip DLA.
-
FIG. 5 illustrates an exemplary method 500 for data augmentation for use with image recognition according to aspects of the present disclosure where data augmentation is performed by deactivating or at least reducing the use of on-chip ECC (or other on-chip error correction systems or procedures). Note that, herein, deactivating ECC is one example of reducing the use of ECC. Beginning at block 502, input circuitry of the NVM die inputs an initial set of labeled input images for use with image recognition training (or other forms of deep learning or machine learning) and stores the initial set of labeled input images within a NAND NVM array of the die, where the die is configured to apply on-chip error correction to data read from the NAND NVM array. At 504, control circuitry of the NVM die deactivates on-chip error detection and correction procedures or otherwise reduces the amount (or effectiveness) of on-chip error correction applied by the NVM to data read from the array to selectively increase the effective BER. By deactivating ECC, data is read “as is,” i.e. without ECC-based decoding. This reduces latency and saves power while also yielding noisy images for data augmentation. At 506, read circuitry of the NVM die repeatedly reads an image from the NAND NVM array without on-chip error correction or with reduced on-chip error correction to generate a set of augmented labeled images that differ from one another and from the initial image due to differing noise artifacts caused by the lack of on-chip error correction or the reduced error correction. In this manner, inherent noise associated with the natural BER of the die can be exploited to generate an augmented data set having uncorrelated noise artifacts (e.g. different noise vectors). At 508, machine learning circuitry of the NVM die performs machine learning, such as DLA learning, using the augmented labeled image set to, e.g., train an image recognition system to recognize images within the set, and then output a set of trained parameters. The image recognition system may be configured, e.g., within the extra-array circuitry of the die or may be embodied elsewhere, such as within an SSD controller, a host system, or a remote server.
-
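The repeated uncorrected reads of method 500 can be modeled in software as a rough sketch. The model below is an assumption made for illustration only: it treats each stored bit as flipping independently with a fixed probability on every read, whereas the bit errors of a real die are physical and need not follow this model. The function names, the BER value, and the label are hypothetical.

```python
import random
from typing import List, Tuple


def read_without_ecc(stored: bytes, ber: float, rng: random.Random) -> bytes:
    """Model one uncorrected read: each bit flips independently with probability `ber`."""
    out = bytearray(stored)
    for i in range(len(out)):
        for bit in range(8):
            if rng.random() < ber:
                out[i] ^= 1 << bit
    return bytes(out)


def augment_by_repeated_reads(stored: bytes, label: str, n_reads: int,
                              ber: float, seed: int = 0) -> List[Tuple[bytes, str]]:
    """Read the same labeled image `n_reads` times; every copy keeps the original label."""
    rng = random.Random(seed)
    return [(read_without_ecc(stored, ber, rng), label) for _ in range(n_reads)]


def noise_vector(stored: bytes, read: bytes) -> bytes:
    """XOR of stored and read data: set bits mark the positions that were misread."""
    return bytes(a ^ b for a, b in zip(stored, read))


image = bytes(range(64))  # stand-in for parsed image data
augmented = augment_by_repeated_reads(image, "cat", n_reads=4, ber=0.01)
```

Because each simulated read draws fresh random bit flips, the noise vectors of different reads are independent of one another, which mirrors the uncorrelated noise artifacts described for block 506.
-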
FIG. 6 illustrates a NAND array 600 of an NVM die (such as the die of FIG. 2) for storing image data and various procedures that manipulate and process the data using the methods of FIG. 5. At 602, a host device or other external system provides labeled training images. At 604, circuitry of the NVM die inputs and stores the labeled images within a first portion or first region 606 of the NAND array 600. Later, when a data augmentation procedure is initiated, circuitry of the NAND die: repeatedly reads the labeled images at 608 with on-chip ECC deactivated or reduced so as to provide or retain noise within the read images and thus generate an augmented set of labeled images; and then stores the augmented set of images at 610 into a second region or second portion 612 of the NAND array 600 for subsequent use in training an image recognition system. Additionally or alternatively, at 614, the circuitry of the NAND die uses the augmented set of images (substantially) immediately to train an image recognition system.
- In FIG. 6, multiple arrows are shown leading from the first array portion 606 to emphasize that individual images stored therein can be repeatedly read. Each separate read from the NAND array, performed either without on-chip ECC or with reduced on-chip ECC, will generally result in different noise artifacts in the read-out images, where the noise artifacts are uncorrelated with one another, thus providing noise-based data augmentation or noise-augmented data sets. The read operations may be performed repeatedly until a training system that uses the augmented data set is satisfied that a sufficient number of samples of each particular image has been collected, such as by comparing the number of samples against a suitable threshold value or by verifying that the system is sufficiently trained. In some examples, a read channel or NVM device controller that is separate from the NVM die (i.e. off-chip) may be configured to perform at least some of the procedures or operations of FIGS. 5 and 6, for example if ECC is performed by a device or component that is separate from the die.
- As noted, in some systems, ECC is performed by a device controller that is separate from the NAND die (such as controller 108 of FIG. 1). FIG. 7 summarizes a method that may be performed by the die. Briefly, at block 702, the die reads stored (target) data from a NAND block (which might be image data for use in DLA training or might be other data). At decision block 704, the die determines whether a data augmentation mode is ON. If the data augmentation mode is ON, then at block 706, the die uses the read data in DLA training or other machine learning training. If data augmentation mode is OFF, the read data at block 708 is instead sent to a controller (such as separate device controller 108 of FIG. 1 or a controller formed on the NAND) to perform ECC decoding on the data, so that the data can then be processed normally.
-
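The decision flow of FIG. 7 can be summarized with a short sketch. The `train` and `ecc_decode` callables are hypothetical stand-ins for the on-die training component and the controller's ECC decoder; they are assumptions made for illustration and are not interfaces named in the disclosure.

```python
from typing import Callable, Optional


def handle_read(raw_data: bytes,
                augmentation_mode_on: bool,
                train: Callable[[bytes], None],
                ecc_decode: Callable[[bytes], bytes]) -> Optional[bytes]:
    """Route data read at block 702 according to the decision at block 704."""
    if augmentation_mode_on:
        # Block 706: use the raw (possibly noisy) data for training.
        train(raw_data)
        return None
    # Block 708: hand the data to a controller for ECC decoding and normal use.
    return ecc_decode(raw_data)


# Placeholder components: a list that collects training inputs, and a dummy
# "decoder" that merely upper-cases the bytes (it performs no real ECC).
training_inputs = []
decoded = handle_read(b"raw", False, training_inputs.append, lambda d: d.upper())
handle_read(b"raw", True, training_inputs.append, lambda d: d.upper())
```

In this sketch the raw data bypasses decoding entirely whenever the augmentation mode is ON, so any read errors are preserved for the training component.
-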
FIG. 8 illustrates an exemplary method 800 for data augmentation for use with image recognition according to aspects of the present disclosure where data augmentation is performed by modifying read voltages to inject noise into read images (or otherwise obtain a greater amount of read errors). Beginning at block 802, input circuitry of the NVM die inputs an initial set of labeled input images for use with image recognition training (or other forms of deep learning or machine learning) and stores the initial set of labeled input images within a NAND NVM array of the die. At block 804, control circuitry of the NVM die identifies a read voltage for reading data from the NVM array with minimal read errors (e.g. a normal read voltage set to achieve a low BER). At block 806, the control circuitry of the NVM die modifies the read voltages applied to its NVM elements (as compared to read voltages that would otherwise be employed to read images not subject to data augmentation, e.g. the voltages with minimal read errors identified at block 804). And so, in one example, if data is ordinarily read using an average threshold voltage of X volts, the modified read voltage might be 0.9X volts. At block 808, read circuitry of the NVM die applies the modified read voltages to the NVM elements while reading one or more of the initial labeled images from the NVM elements to generate a set of augmented labeled images that differ from one another and from the initial images due to differing noise artifacts caused by the modified read voltages. At 810, machine learning circuitry of the NVM die performs machine learning, such as DLA learning, using the augmented labeled image set to, e.g., train an image recognition system to recognize images within the set, and then output a set of trained parameters.
-
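The effect of a modified read voltage can be illustrated with a simplified single-level-cell model. The model is an assumption made for illustration: a cell is taken to read as 1 when its threshold voltage lies below the applied read voltage and as 0 otherwise, and the threshold voltages, the nominal voltage, and the scale factors are hypothetical values. The function names are likewise hypothetical.

```python
from typing import List, Sequence


def read_cells(cell_vts: Sequence[float], read_voltage: float) -> List[int]:
    """Simplified SLC read: 1 if the cell's threshold voltage is below the read voltage."""
    return [1 if vt < read_voltage else 0 for vt in cell_vts]


def count_bit_errors(expected: Sequence[int], actual: Sequence[int]) -> int:
    """Number of positions where the read bit differs from the stored bit."""
    return sum(e != a for e, a in zip(expected, actual))


def augment_with_read_voltages(cell_vts: Sequence[float], nominal_voltage: float,
                               scales: Sequence[float]) -> List[List[int]]:
    """Read the same cells once per scale factor (e.g. 0.9 for a 0.9X read)."""
    return [read_cells(cell_vts, nominal_voltage * s) for s in scales]


# Hypothetical threshold voltages (volts) for cells storing the bits 1, 1, 0, 0.
stored_bits = [1, 1, 0, 0]
cell_vts = [1.0, 1.9, 2.1, 3.0]
nominal = 2.0  # "X" in the text

reads = augment_with_read_voltages(cell_vts, nominal, scales=[1.0, 0.9, 1.1])
errors = [count_bit_errors(stored_bits, r) for r in reads]  # [0, 1, 1]
```

In this toy example the nominal read recovers the stored bits, while the lowered and raised read voltages each misread a different cell whose threshold voltage lies close to the nominal level, yielding two distinct noisy versions of the same data.
-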
FIG. 9 illustrates a NAND array 900 of an NVM die (such as the die of FIG. 2) for storing image data and various procedures that manipulate and process the data using the methods of FIG. 8. At 902, a host device or other external system provides labeled training images. At 904, circuitry of the NVM die inputs and stores the labeled images within a first portion or first region 906 of the NAND array 900. Later, when a data augmentation procedure is initiated, circuitry of the NAND die: repeatedly reads the labeled images at 908 with the modified read voltages so as to provide or retain noise within the read images to thereby generate an augmented set of labeled images; and then stores the augmented set of images into a second region or second portion 912 of the NAND array 900 for subsequent use in training an image recognition system. Additionally or alternatively, at 914, the circuitry of the NAND die uses the augmented set of images (substantially) immediately to train an image recognition system. In FIG. 9, multiple arrows are shown leading from the first array portion 906 to emphasize that individual images stored therein can be repeatedly read with potentially different read voltages. Each separate read from the NAND array will generally result in different noise artifacts, where the noise artifacts are uncorrelated with one another, thus providing noise-based data augmentation or noise-augmented data sets.
-
FIG. 10 illustrates an exemplary method 1000 for data augmentation for use with image recognition according to aspects of the present disclosure where data augmentation is performed by repeatedly writing (initially un-augmented) data to worn regions of the NVM and then reading the data from the worn regions of the NVM to thereby inject noise into the images. Beginning at block 1002, input circuitry of the NVM die inputs an initial set of labeled input images for use with image recognition training (or other forms of deep learning or machine learning) and stores the initial set of labeled input images within a NAND NVM array of the die, where the die has regions affected by differing amounts of wear. At 1004, control circuitry of the NVM die identifies worn regions of the NVM array that are subject to storage errors. Any suitable technique can be used to identify worn areas of the NVM array, such as by tracking the BER of data read from various blocks. At block 1006, read circuitry of the NVM die reads labeled images from an initial storage region of the NVM array and, at block 1008, write (program) circuitry of the NVM die re-stores the labeled images in the worn regions of the NVM subject to storage errors. At block 1010, read circuitry of the NVM die re-reads the labeled images from the worn regions of the NVM array to thereby obtain noise-augmented versions of the labeled images where the noise is caused by storing/reading from the worn regions of the NVM array that have high BER. At 1012, machine learning circuitry of the NVM die performs machine learning, such as DLA learning, using the augmented labeled image set to, e.g., train an image recognition system to recognize images, and then output a set of trained parameters.
-
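One way to identify worn regions by tracking per-block BER, as mentioned for block 1004, is sketched below. The per-block counters, the threshold value, and all names are hypothetical; the disclosure states only that any suitable technique may be used.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class BlockStats:
    block_id: int
    bits_read: int
    bit_errors: int

    @property
    def ber(self) -> float:
        """Observed bit error rate for this block (0.0 if nothing was read)."""
        return self.bit_errors / self.bits_read if self.bits_read else 0.0


def find_worn_blocks(stats: Iterable[BlockStats], ber_threshold: float) -> List[int]:
    """Return IDs of blocks whose observed BER meets the threshold, worst first."""
    worn = [s for s in stats if s.ber >= ber_threshold]
    return [s.block_id for s in sorted(worn, key=lambda s: s.ber, reverse=True)]


# Hypothetical per-block counters gathered during earlier reads.
history = [
    BlockStats(block_id=0, bits_read=1_000_000, bit_errors=10),
    BlockStats(block_id=1, bits_read=1_000_000, bit_errors=5_000),
    BlockStats(block_id=2, bits_read=1_000_000, bit_errors=800),
    BlockStats(block_id=3, bits_read=0, bit_errors=0),
]
worn_blocks = find_worn_blocks(history, ber_threshold=5e-4)  # [1, 2]
```

The labeled images would then be re-stored in the returned blocks (block 1008) and re-read from them (block 1010) to obtain the noise-augmented versions.
-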
FIG. 11 illustrates a NAND array 1100 of an NVM die (such as the die of FIG. 2) for storing image data and various procedures that manipulate and process the data using the methods of FIG. 10. At 1102, a host device or other external system provides labeled training images. At 1104, circuitry of the NVM die inputs and stores the labeled images within a first (non-worn) region 1106 of the NAND array 1100. Later, at 1108, when a data augmentation procedure is initiated, circuitry of the NVM die re-stores the labeled images in a worn region 1110 of the NAND array. At 1112, read circuitry of the NVM die repeatedly reads the labeled images from the worn region 1110 so as to thereby generate an augmented set of labeled images exploiting un-corrected read errors. At 1114, circuitry of the NVM die stores the augmented set of images into another region 1116 of the NAND array 1100 for subsequent use in training an image recognition system. Additionally or alternatively, at 1118, the circuitry of the NAND die uses the augmented set of images (substantially) immediately to train an image recognition system. In FIG. 11, multiple arrows are shown leading from worn array portion 1110 to emphasize that individual images stored therein can be repeatedly read. Each separate read from the NAND array will generally result in different noise artifacts due to the worn characteristics of the array region 1110, where the noise artifacts are uncorrelated with one another, thus providing for noise-augmented data sets.
- In the following, various general exemplary procedures and systems are described.
-
FIG. 12 broadly illustrates a process 1200 in accordance with some aspects of the disclosure. The process 1200 may take place within any suitable apparatus or device having a die capable of performing the operations, such as a NAND die. At block 1202, the die (e.g. a suitably-configured NAND die) stores machine learning data within the NVM array of a die. At block 1204, the die generates augmented machine learning data using data augmentation circuitry formed in the die or using components of a memory controller. Examples are described above. Insofar as using components of a memory controller is concerned, by way of example, ECC components of the memory controller may be configured or controlled to permit or facilitate the creation of augmented data sets by deactivating or reducing ECC.
-
FIG. 13 broadly illustrates an embodiment of an apparatus 1300 configured according to one or more aspects of the disclosure. The apparatus 1300, or components thereof, could embody or be implemented within a NAND die or some other type of NVM device that supports data storage. The apparatus 1300 includes NVM elements 1302 and a data augmentation controller 1304 configured to augment machine learning data stored within the NVM elements 1302 with augmented machine learning data. Examples of the apparatus are described above. Additional examples are described below. As noted, at least some data augmentation components may be separate from the die, such as ECC components of a NAND device controller.
-
FIG. 14 illustrates an embodiment of an apparatus 1400 configured according to one or more aspects of the disclosure. The apparatus 1400, or components thereof, could embody or be implemented within a NAND die or some other type of NVM device that supports data storage. In various implementations, the apparatus 1400, or components thereof, could be a component of a processor, a controller, a computing device, a personal computer, a portable device, or workstation, a server, a personal digital assistant, a digital camera, a digital phone, an entertainment device, a medical device, a self-driving vehicle control device, or any other electronic device that stores, processes or uses neural data.
- The apparatus 1400 includes a communication interface 1402, a physical memory array (e.g., NAND blocks) 1404, and extra-array processing circuits 1410, 1411 (e.g. under-the-array or next-to-the-array circuits). These components can be coupled to and/or placed in electrical communication with one another via suitable components, represented generally by the connection lines in FIG. 14. Although not shown, other circuits such as timing sources, peripherals, voltage regulators, and power management circuits may be provided, which are well known in the art, and therefore, will not be described any further.
- The communication interface 1402 provides a means for communicating with other apparatuses over a transmission medium. In some implementations, the communication interface 1402 includes circuitry and/or programming (e.g., a program) adapted to facilitate the communication of information bi-directionally with respect to one or more devices in a system. In some implementations, the communication interface 1402 may be configured for wire-based communication. For example, the communication interface 1402 could be a bus interface, a send/receive interface, or some other type of signal interface including circuitry for outputting and/or obtaining signals (e.g., outputting signals from and/or receiving signals into an SSD). The communication interface 1402 serves as one example of a means for receiving and/or a means for transmitting.
- The
physical memory array 1404 may represent one or more NAND blocks. The physical memory array 1404 may be used for storing data, such as images, that are manipulated by the circuits 1410, 1411 of the apparatus 1400. The physical memory array 1404 may be coupled to the circuits 1410, 1411 such that the circuits 1410, 1411 can read data from, and write data to, the physical memory array 1404. That is, the physical memory array 1404 can be coupled to the circuits 1410, 1411 so that the physical memory array 1404 is accessible by the circuits 1410, 1411.
- The
circuits 1410, 1411 may be configured to perform any of the steps, functions, and/or processes described with respect to FIGS. 2-13. As used herein, the term “adapted” in relation to the processing circuits 1410, 1411 refers to the circuits being configured to carry out the operations described in conjunction with FIGS. 2-13. The circuits serve as an example of a means for processing. In various implementations, the circuits may provide and/or incorporate, at least in part, functionality described above for the components 204 of FIG. 2.
- According to at least one example of the
apparatus 1400, the processing circuits 1410, 1411 may include one or more of: circuits/modules 1420 configured for storing images or other machine learning data in the NAND blocks; circuits/modules 1422 configured for reading images or other machine learning data from the NAND blocks; circuits/modules 1424 configured for controlling the augmentation of images or other machine learning data; circuits/modules 1426 configured for skewing images; circuits/modules 1428 configured for cropping images; circuits/modules 1430 configured for flipping/rotating/translating images; circuits/modules 1432 configured for controlling augmentation via noise; circuits/modules 1433 configured for performing ECC; circuits/modules 1434 configured for deactivating ECC; circuits/modules 1436 configured for reducing ECC; circuits/modules 1437 configured for controlling read voltages; circuits/modules 1438 configured for adjusting read voltages to inject noise; circuits/modules 1439 configured for controlling machine learning with initial data and augmented data; circuits/modules 1441 configured for identifying a worn NVM region; and circuits/modules 1443 configured for storing data to and/or reading data from a worn NVM region storage/read component.
- As shown in FIG. 14, the physical memory array 1404 may include one or more of: blocks 1440 for storing machine learning data, such as input labeled images; blocks 1442 for storing augmented versions of the machine learning data; blocks 1444 that are worn regions; and blocks 1446 for storing other user data or system data (e.g. data pertaining to the overall control of operations of the NAND die).
- In at least some examples, means may be provided for performing the functions illustrated in FIG. 14 and/or other functions illustrated or described herein. For example, the means may include one or more of: means, such as circuits/modules 1420, for storing images or other machine learning data in the NAND blocks; means, such as circuits/modules 1422, for reading images or other machine learning data from the NAND blocks; means, such as circuits/modules 1424, for controlling the augmentation of images or other machine learning data; means, such as circuits/modules 1426, for skewing images; means, such as circuits/modules 1428, for cropping images; means, such as circuits/modules 1430, for flipping/rotating/translating images; means, such as circuits/modules 1432, for controlling augmentation via noise; means, such as circuits/modules 1433, for performing ECC; means, such as circuits/modules 1434, for deactivating ECC; means, such as circuits/modules 1436, for reducing ECC; means, such as circuits/modules 1437, for controlling read voltages; means, such as circuits/modules 1438, for adjusting read voltages to inject noise; means, such as circuits/modules 1439, for controlling machine learning with initial data and augmented data; means, such as circuits/modules 1441, for identifying a worn NVM region; means, such as circuits/modules 1443, for storing data to and/or reading data from a worn NVM region storage/read component; means, such as NAND blocks 1440, for storing machine learning data; means, such as NAND blocks 1442, for storing augmented versions of the machine learning data; and means, such as NAND blocks 1446, for storing other user data or system data (e.g. data pertaining to the overall control of operations of the NAND die).
- In other examples, means, such as NVM elements 1202 of FIG. 2 that are formed in a die, are provided for storing machine learning data within the NVM array of the die; and means, such as data augmentation controller 204 of FIG. 2 that are also formed in the die, are provided for generating at least one augmented version of the machine learning data.
- At least some of the processing circuits described herein may be generally adapted for processing, including the execution of programming code stored on a storage medium. As used herein, the terms “code” or “programming” shall be construed broadly to include without limitation instructions, instruction sets, data, code, code segments, program code, programs, programming, subprograms, software modules, applications, software applications, software packages, routines, subroutines, objects, executables, threads of execution, procedures, functions, etc., whether referred to as software, firmware, middleware, microcode, hardware description language, or otherwise.
- At least some of the processing circuits described herein may be arranged to obtain, process and/or send data, control data access and storage, issue commands, and control other desired operations. The processing circuits may include circuitry configured to implement desired programming provided by appropriate media in at least one example. For example, the processing circuits may be implemented as one or more processors, one or more controllers, and/or other structure configured to execute executable programming. Examples of processing circuits may include a general purpose processor, a digital signal processor (DSP), an ASIC, a field programmable gate array (FPGA) or other programmable logic component, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general purpose processor may include a microprocessor, as well as any conventional processor, controller, microcontroller, or state machine. At least some of the processing circuits may also be implemented as a combination of computing components, such as a combination of a controller and a microprocessor, a number of microprocessors, one or more microprocessors in conjunction with an ASIC and a microprocessor, or any other number of varying configurations. The various examples of processing circuits noted herein are for illustration and other suitable configurations within the scope of the disclosure are also contemplated.
- Aspects of the subject matter described herein can be implemented in any suitable NAND flash memory, such as 3D NAND flash memory. Semiconductor memory devices include volatile memory devices, such as DRAM or SRAM devices, NVM devices, such as ReRAM, EEPROM, flash memory (which can also be considered a subset of EEPROM), ferroelectric random access memory (FRAM), and MRAM, and other semiconductor elements capable of storing information. Each type of memory device may have different configurations. For example, flash memory devices may be configured in a NAND or a NOR configuration. As noted, some features described herein are specific to NAND-based devices, such as the NAND-based on-chip copy with update.
- The memory devices can be formed from passive and/or active elements, in any combinations. By way of non-limiting example, passive semiconductor memory elements include ReRAM device elements, which in some embodiments include a resistivity switching storage element, such as an anti-fuse, phase change material, etc., and optionally a steering element, such as a diode, etc. Further by way of non-limiting example, active semiconductor memory elements include EEPROM and flash memory device elements, which in some embodiments include elements containing a charge storage region, such as a floating gate, conductive nanoparticles, or a charge storage dielectric material.
- Multiple memory elements may be configured so that they are connected in series or so that each element is individually accessible. By way of non-limiting example, flash memory devices in a NAND configuration (NAND memory) typically contain memory elements connected in series. A NAND memory array may be configured so that the array is composed of multiple strings of memory in which a string is composed of multiple memory elements sharing a single bit line and accessed as a group. Alternatively, memory elements may be configured so that each element is individually accessible, e.g., a NOR memory array. NAND and NOR memory configurations are exemplary, and memory elements may be otherwise configured. The semiconductor memory elements located within and/or over a substrate may be arranged in two or three dimensions, such as a two dimensional memory structure or a three dimensional memory structure.
- In a two dimensional memory structure, the semiconductor memory elements are arranged in a single plane or a single memory device level. Typically, in a two dimensional memory structure, memory elements are arranged in a plane (e.g., in an x-y direction plane) which extends substantially parallel to a major surface of a substrate that supports the memory elements. The substrate may be a wafer over or in which the layer of the memory elements is formed or it may be a carrier substrate which is attached to the memory elements after they are formed. As a non-limiting example, the substrate may include a semiconductor such as silicon. The memory elements may be arranged in the single memory device level in an ordered array, such as in a plurality of rows and/or columns. However, the memory elements may be arrayed in non-regular or non-orthogonal configurations. The memory elements may each have two or more electrodes or contact lines, such as bit lines and word lines.
- A three dimensional memory array is arranged so that memory elements occupy multiple planes or multiple memory device levels, thereby forming a structure in three dimensions (i.e., in the x, y and z directions, where the z direction is substantially perpendicular and the x and y directions are substantially parallel to the major surface of the substrate). As a non-limiting example, a three dimensional memory structure may be vertically arranged as a stack of multiple two dimensional memory device levels. As another non-limiting example, a three dimensional memory array may be arranged as multiple vertical columns (e.g., columns extending substantially perpendicular to the major surface of the substrate, i.e., in the z direction) with each column having multiple memory elements. The columns may be arranged in a two dimensional configuration, e.g., in an x-y plane, resulting in a three dimensional arrangement of memory elements with elements on multiple vertically stacked memory planes. Other configurations of memory elements in three dimensions can also constitute a three dimensional memory array.
- By way of non-limiting example, in a three dimensional NAND memory array, the memory elements may be coupled together to form a NAND string within a single horizontal (e.g., x-y) memory device level. Alternatively, the memory elements may be coupled together to form a vertical NAND string that traverses across multiple horizontal memory device levels. Other three dimensional configurations can be envisioned wherein some NAND strings contain memory elements in a single memory level while other strings contain memory elements which span through multiple memory levels. Three dimensional memory arrays may also be designed in a NOR configuration and in a ReRAM configuration.
- Typically, in a monolithic three dimensional memory array, one or more memory device levels are formed above a single substrate. Optionally, the monolithic three dimensional memory array may also have one or more memory layers at least partially within the single substrate. As a non-limiting example, the substrate may include a semiconductor such as silicon. In a monolithic three dimensional array, the layers constituting each memory device level of the array are typically formed on the layers of the underlying memory device levels of the array. However, layers of adjacent memory device levels of a monolithic three dimensional memory array may be shared or have intervening layers between memory device levels.
- Then again, two dimensional arrays may be formed separately and then packaged together to form a non-monolithic memory device having multiple layers of memory. For example, non-monolithic stacked memories can be constructed by forming memory levels on separate substrates and then stacking the memory levels atop each other. The substrates may be thinned or removed from the memory device levels before stacking, but as the memory device levels are initially formed over separate substrates, the resulting memory arrays are not monolithic three dimensional memory arrays. Further, multiple two dimensional memory arrays or three dimensional memory arrays (monolithic or non-monolithic) may be formed on separate chips and then packaged together to form a stacked-chip memory device.
- Associated circuitry is typically required for operation of the memory elements and for communication with the memory elements. As non-limiting examples, memory devices may have circuitry used for controlling and driving memory elements to accomplish functions such as programming and reading. This associated circuitry may be on the same substrate as the memory elements and/or on a separate substrate. For example, a controller for memory read-write operations may be located on a separate controller chip and/or on the same substrate as the memory elements. One of skill in the art will recognize that the subject matter described herein is not limited to the two dimensional and three dimensional exemplary structures described but covers all relevant memory structures within the spirit and scope of the subject matter as described herein and as understood by one of skill in the art.
- The examples set forth herein are provided to illustrate certain concepts of the disclosure. The apparatus, devices, or components illustrated above may be configured to perform one or more of the methods, features, or steps described herein. Those of ordinary skill in the art will comprehend that these are merely illustrative in nature, and other examples may fall within the scope of the disclosure and the appended claims. Based on the teachings herein those skilled in the art should appreciate that an aspect disclosed herein may be implemented independently of any other aspects and that two or more of these aspects may be combined in various ways. For example, an apparatus may be implemented or a method may be practiced using any number of the aspects set forth herein. In addition, such an apparatus may be implemented or such a method may be practiced using other structure, functionality, or structure and functionality in addition to or other than one or more of the aspects set forth herein.
- Aspects of the present disclosure have been described above with reference to schematic flowchart diagrams and/or schematic block diagrams of methods, apparatus, systems, and computer program products according to embodiments of the disclosure. It will be understood that each block of the schematic flowchart diagrams and/or schematic block diagrams, and combinations of blocks in the schematic flowchart diagrams and/or schematic block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a computer or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor or other programmable data processing apparatus, create means for implementing the functions and/or acts specified in the schematic flowchart diagrams and/or schematic block diagrams block or blocks.
- The subject matter described herein may be implemented in hardware, software, firmware, or any combination thereof. As such, the terms “function,” “module,” and the like as used herein may refer to hardware, which may also include software and/or firmware components, for implementing the feature being described. In one example implementation, the subject matter described herein may be implemented using a computer readable medium having stored thereon computer executable instructions that when executed by a computer (e.g., a processor) control the computer to perform the functionality described herein. Examples of computer readable media suitable for implementing the subject matter described herein include non-transitory computer-readable media, such as disk memory devices, chip memory devices, programmable logic devices, and application specific integrated circuits. In addition, a computer readable medium that implements the subject matter described herein may be located on a single device or computing platform or may be distributed across multiple devices or computing platforms.
- It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. Other steps and methods may be conceived that are equivalent in function, logic, or effect to one or more blocks, or portions thereof, of the illustrated figures. Although various arrow types and line types may be employed in the flowchart and/or block diagrams, they are understood not to limit the scope of the corresponding embodiments. For instance, an arrow may indicate a waiting or monitoring period of unspecified duration between enumerated steps of the depicted embodiment.
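- As a minimal sketch of the ordering point above, the following Python example runs two independent blocks in the order shown, in the reverse order, and substantially concurrently, and each schedule yields the same results. The names `block_a`, `block_b`, and `IMAGE` are hypothetical and are introduced only for illustration; they do not come from the disclosure.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical 2x3 "image" used only for illustration.
IMAGE = [[1, 2, 3], [4, 5, 6]]


def block_a(image):
    """Hypothetical block A: mirror each row of the image."""
    return [row[::-1] for row in image]


def block_b(image):
    """Hypothetical block B: compute the mean pixel value."""
    flat = [pixel for row in image for pixel in row]
    return sum(flat) / len(flat)


def run_in_order(image):
    """Execute block A, then block B, as drawn in a flowchart."""
    a = block_a(image)
    b = block_b(image)
    return a, b


def run_reversed(image):
    """Execute block B first, then block A."""
    b = block_b(image)
    a = block_a(image)
    return a, b


def run_concurrently(image):
    """Submit both blocks to a thread pool so they may overlap in time."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        future_a = pool.submit(block_a, image)
        future_b = pool.submit(block_b, image)
        return future_a.result(), future_b.result()
```

Because neither block reads the other's output or modifies the shared input, the three schedules are interchangeable. Blocks with a data dependency would not be.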
- The various features and processes described above may be used independently of one another, or may be combined in various ways. All possible combinations and sub-combinations are intended to fall within the scope of this disclosure. In addition, certain method, event, state or process blocks may be omitted in some implementations. The methods and processes described herein are also not limited to any particular sequence, and the blocks or states relating thereto can be performed in other sequences that are appropriate. For example, described tasks or events may be performed in an order other than that specifically disclosed, or multiple tasks or events may be combined in a single block or state. The example tasks or events may be performed in serial, in parallel, or in some other suitable manner. Tasks or events may be added to or removed from the disclosed example embodiments. The example systems and components described herein may be configured differently than described. For example, elements may be added to, removed from, or rearranged compared to the disclosed example embodiments.
- Those of skill in the art will appreciate that information and signals may be represented using any of a variety of different technologies and techniques. For example, data, instructions, commands, information, signals, bits, symbols, and chips that may be referenced throughout the above description may be represented by voltages, currents, electromagnetic waves, magnetic fields or particles, optical fields or particles, or any combination thereof.
- The word “exemplary” is used herein to mean “serving as an example, instance, or illustration.” Any aspect described herein as “exemplary” is not necessarily to be construed as preferred or advantageous over other aspects. Likewise, the term “aspects” does not require that all aspects include the discussed feature, advantage or mode of operation.
- While the above descriptions contain many specific embodiments of the invention, these should not be construed as limitations on the scope of the invention, but rather as examples of specific embodiments thereof. Accordingly, the scope of the invention should be determined not by the embodiments illustrated, but by the appended claims and their equivalents. Moreover, reference throughout this specification to “one embodiment,” “an embodiment,” or similar language means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. Thus, appearances of the phrases “in one embodiment,” “in an embodiment,” and similar language throughout this specification may, but do not necessarily, all refer to the same embodiment, but mean “one or more but not all embodiments” unless expressly specified otherwise.
- The terminology used herein is for the purpose of describing particular aspects only and is not intended to be limiting of the aspects. As used herein, the singular forms “a,” “an” and “the” are intended to include the plural forms as well (i.e., one or more), unless the context clearly indicates otherwise. An enumerated listing of items does not imply that any or all of the items are mutually exclusive and/or mutually inclusive, unless expressly specified otherwise. It will be further understood that the terms “comprises,” “comprising,” “includes,” “including,” “having,” and variations thereof when used herein mean “including but not limited to” unless expressly specified otherwise. That is, these terms may specify the presence of stated features, integers, steps, operations, elements, or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, or groups thereof. Moreover, it is understood that the word “or” has the same meaning as the Boolean operator “OR,” that is, it encompasses the possibilities of “either” and “both” and is not limited to “exclusive or” (“XOR”), unless expressly stated otherwise. It is also understood that the symbol “/” between two adjacent words has the same meaning as “or” unless expressly stated otherwise. Moreover, phrases such as “connected to,” “coupled to” or “in communication with” are not limited to direct connections unless expressly stated otherwise.
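- The distinction drawn above between the Boolean “OR” and “exclusive or” (“XOR”) can be illustrated with a small truth table. This is a sketch only; the function names `inclusive_or`, `exclusive_or`, and `truth_table` are hypothetical and are not part of the disclosure.

```python
from itertools import product


def inclusive_or(a: bool, b: bool) -> bool:
    """Boolean OR: true when either operand, or both, is true."""
    return a or b


def exclusive_or(a: bool, b: bool) -> bool:
    """XOR: true when exactly one operand is true."""
    return a != b


def truth_table():
    """Return rows of (a, b, a OR b, a XOR b) for every input pair."""
    return [
        (a, b, inclusive_or(a, b), exclusive_or(a, b))
        for a, b in product([False, True], repeat=2)
    ]


if __name__ == "__main__":
    for row in truth_table():
        print(row)
```

The two operators differ only on the row where both inputs are true: inclusive OR covers the “both” case, whereas XOR excludes it.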
- Any reference to an element herein using a designation such as “first,” “second,” and so forth does not generally limit the quantity or order of those elements. Rather, these designations may be used herein as a convenient method of distinguishing between two or more elements or instances of an element. Thus, a reference to first and second elements does not mean that only two elements may be used there or that the first element must precede the second element in some manner. Also, unless stated otherwise a set of elements may include one or more elements. In addition, terminology of the form “at least one of A, B, or C” or “A, B, C, or any combination thereof” used in the description or the claims means “A or B or C or any combination of these elements.” For example, this terminology may include A, or B, or C, or A and B, or A and C, or A and B and C, or 2A, or 2B, or 2C, or 2A and B, and so on. As a further example, “at least one of: A, B, or C” is intended to cover A, B, C, A-B, A-C, B-C, and A-B-C, as well as multiples of the same members (e.g., any lists that include AA, BB, or CC). Likewise, “at least one of: A, B, and C” is intended to cover A, B, C, A-B, A-C, B-C, and A-B-C, as well as multiples of the same members. Similarly, as used herein, a phrase referring to a list of items linked with “and/or” refers to any combination of the items. As an example, “A and/or B” is intended to cover A alone, B alone, or A and B together. As another example, “A, B and/or C” is intended to cover A alone, B alone, C alone, A and B together, A and C together, B and C together, or A, B, and C together.
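- The combinations covered by “at least one of A, B, or C” can be enumerated mechanically. The following sketch lists every non-empty combination of distinct members; it deliberately does not generate multiples of the same member (e.g., 2A or AA), which the paragraph above also treats as covered. The function name `at_least_one_of` is hypothetical.

```python
from itertools import combinations


def at_least_one_of(items):
    """Return every non-empty combination of distinct items, smallest first."""
    return [
        combo
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


if __name__ == "__main__":
    # Join each combination with "-" to mirror the A-B, A-C notation.
    print(["-".join(combo) for combo in at_least_one_of(["A", "B", "C"])])
```

For three items this yields seven combinations (A, B, C, A-B, A-C, B-C, and A-B-C), and for the two-item phrase “A and/or B” it yields A alone, B alone, and A and B together.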
- As used herein, the term “determining” encompasses a wide variety of actions. For example, “determining” may include calculating, computing, processing, deriving, investigating, looking up (e.g., looking up in a table, a database or another data structure), ascertaining, and the like. Also, “determining” may include receiving (e.g., receiving information), accessing (e.g., accessing data in a memory), and the like. Also, “determining” may include resolving, selecting, choosing, establishing, and the like.
Claims (20)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/897,028 US20220405532A1 (en) | 2019-06-20 | 2022-08-26 | Non-volatile memory die with on-chip data augmentation components for use with machine learning |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/447,619 US11501109B2 (en) | 2019-06-20 | 2019-06-20 | Non-volatile memory die with on-chip data augmentation components for use with machine learning |
US17/897,028 US20220405532A1 (en) | 2019-06-20 | 2022-08-26 | Non-volatile memory die with on-chip data augmentation components for use with machine learning |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/447,619 Continuation US11501109B2 (en) | 2019-06-20 | 2019-06-20 | Non-volatile memory die with on-chip data augmentation components for use with machine learning |
Publications (1)
Publication Number | Publication Date |
---|---|
US20220405532A1 (en) | 2022-12-22 |
Family
ID=74038074
Family Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/447,619 Active 2039-07-07 US11501109B2 (en) | 2019-06-20 | 2019-06-20 | Non-volatile memory die with on-chip data augmentation components for use with machine learning |
US17/897,028 Pending US20220405532A1 (en) | 2019-06-20 | 2022-08-26 | Non-volatile memory die with on-chip data augmentation components for use with machine learning |
Family Applications Before (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/447,619 Active 2039-07-07 US11501109B2 (en) | 2019-06-20 | 2019-06-20 | Non-volatile memory die with on-chip data augmentation components for use with machine learning |
Country Status (1)
Country | Link |
---|---|
US (2) | US11501109B2 (en) |
Families Citing this family (29)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018176000A1 (en) | 2017-03-23 | 2018-09-27 | DeepScale, Inc. | Data synthesis for autonomous control systems |
US11157441B2 (en) | 2017-07-24 | 2021-10-26 | Tesla, Inc. | Computational array microprocessor system using non-consecutive data formatting |
US11893393B2 (en) | 2017-07-24 | 2024-02-06 | Tesla, Inc. | Computational array microprocessor system with hardware arbiter managing memory requests |
US11409692B2 (en) | 2017-07-24 | 2022-08-09 | Tesla, Inc. | Vector computational unit |
US10671349B2 (en) | 2017-07-24 | 2020-06-02 | Tesla, Inc. | Accelerated mathematical engine |
US11561791B2 (en) | 2018-02-01 | 2023-01-24 | Tesla, Inc. | Vector computational unit receiving data elements in parallel from a last row of a computational array |
US11215999B2 (en) | 2018-06-20 | 2022-01-04 | Tesla, Inc. | Data pipeline and deep learning system for autonomous driving |
US11361457B2 (en) | 2018-07-20 | 2022-06-14 | Tesla, Inc. | Annotation cross-labeling for autonomous control systems |
US11636333B2 (en) | 2018-07-26 | 2023-04-25 | Tesla, Inc. | Optimizing neural network structures for embedded systems |
US11562231B2 (en) | 2018-09-03 | 2023-01-24 | Tesla, Inc. | Neural networks for embedded devices |
KR20210072048A (en) | 2018-10-11 | 2021-06-16 | 테슬라, 인크. | Systems and methods for training machine models with augmented data |
US11196678B2 (en) | 2018-10-25 | 2021-12-07 | Tesla, Inc. | QOS manager for system on a chip communications |
US11816585B2 (en) | 2018-12-03 | 2023-11-14 | Tesla, Inc. | Machine learning models operating at different frequencies for autonomous vehicles |
US11537811B2 (en) | 2018-12-04 | 2022-12-27 | Tesla, Inc. | Enhanced object detection for autonomous vehicles based on field view |
US11610117B2 (en) | 2018-12-27 | 2023-03-21 | Tesla, Inc. | System and method for adapting a neural network model on a hardware platform |
US10997461B2 (en) | 2019-02-01 | 2021-05-04 | Tesla, Inc. | Generating ground truth for machine learning from time series elements |
US11150664B2 (en) | 2019-02-01 | 2021-10-19 | Tesla, Inc. | Predicting three-dimensional features for autonomous driving |
US11567514B2 (en) | 2019-02-11 | 2023-01-31 | Tesla, Inc. | Autonomous and user controlled vehicle summon to a target |
US10956755B2 (en) | 2019-02-19 | 2021-03-23 | Tesla, Inc. | Estimating object properties using visual image data |
US11983853B1 (en) | 2019-10-31 | 2024-05-14 | Meta Platforms, Inc. | Techniques for generating training data for machine learning enabled image enhancement |
US11328095B2 (en) * | 2020-01-07 | 2022-05-10 | Attestiv Inc. | Peceptual video fingerprinting |
US11907571B2 (en) | 2020-07-13 | 2024-02-20 | SK Hynix Inc. | Read threshold optimization systems and methods using domain transformation |
CN113094531B (en) * | 2021-03-22 | 2022-05-20 | 华中科技大学 | In-memory image retrieval method and retrieval system |
US11749354B2 (en) | 2021-07-13 | 2023-09-05 | SK Hynix Inc. | Systems and methods for non-parametric PV-level modeling and read threshold voltage estimation |
US11769556B2 (en) * | 2021-07-27 | 2023-09-26 | SK Hynix Inc. | Systems and methods for modeless read threshold voltage estimation |
US11769555B2 (en) | 2021-07-27 | 2023-09-26 | SK Hynix Inc. | Read threshold voltage estimation systems and methods for parametric PV-level modeling |
US11854629B2 (en) | 2021-11-22 | 2023-12-26 | SK Hynix Inc. | System and method for non-parametric optimal read threshold estimation using deep neural network |
CN116430427B (en) * | 2023-06-12 | 2023-08-25 | 湖南中车时代通信信号有限公司 | Automatic acquisition method, system and storage medium for coordinates of railway interval signal equipment |
US12112536B1 (en) * | 2023-10-19 | 2024-10-08 | Tartan Aerial Sense Tech Private Limited | Camera apparatus and method of training and operating neural network model for enhanced foliage detection |
Family Cites Families (44)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6545907B1 (en) | 2001-10-30 | 2003-04-08 | Ovonyx, Inc. | Technique and apparatus for performing write operations to a phase change material memory device |
EP1489622B1 (en) | 2003-06-16 | 2007-08-15 | STMicroelectronics S.r.l. | Writing circuit for a phase change memory device |
KR100574975B1 (en) | 2004-03-05 | 2006-05-02 | 삼성전자주식회사 | Set programming method of phase-change memory array and writing driver circuit |
US7460389B2 (en) | 2005-07-29 | 2008-12-02 | International Business Machines Corporation | Write operations for phase-change-material memory |
US7965546B2 (en) | 2007-04-26 | 2011-06-21 | Super Talent Electronics, Inc. | Synchronous page-mode phase-change memory with ECC and RAM cache |
US7990642B2 (en) | 2009-04-17 | 2011-08-02 | Lsi Corporation | Systems and methods for storage channel testing |
US8495467B1 (en) | 2009-06-30 | 2013-07-23 | Micron Technology, Inc. | Switchable on-die memory error correcting engine |
US8199566B1 (en) | 2009-11-23 | 2012-06-12 | Micron Technology, Inc. | Write performance of phase change memory using set-pulse shaping |
US8725935B2 (en) | 2009-12-18 | 2014-05-13 | Sandisk Technologies Inc. | Balanced performance for on-chip folding of non-volatile memories |
US20120311262A1 (en) | 2011-06-01 | 2012-12-06 | International Business Machines Corporation | Memory cell presetting for improved memory performance |
US20160026912A1 (en) | 2014-07-22 | 2016-01-28 | Intel Corporation | Weight-shifting mechanism for convolutional neural networks |
FR3025344B1 (en) | 2014-08-28 | 2017-11-24 | Commissariat Energie Atomique | NETWORK OF CONVOLUTIONAL NEURONS |
US20160064409A1 (en) | 2014-08-29 | 2016-03-03 | Kabushiki Kaisha Toshiba | Non-volatile semiconductor storage device |
US9678832B2 (en) | 2014-09-18 | 2017-06-13 | Sandisk Technologies Llc | Storage module and method for on-chip copy gather |
US9778863B2 (en) | 2014-09-30 | 2017-10-03 | Sandisk Technologies Llc | System and method for folding partial blocks into multi-level cell memory blocks |
US9880760B2 (en) | 2014-10-30 | 2018-01-30 | Sandisk Technologies Llc | Managing data stored in a nonvolatile storage device |
US20160345009A1 (en) * | 2015-05-19 | 2016-11-24 | ScaleFlux | Accelerating image analysis and machine learning through in-flash image preparation and pre-processing |
US9767565B2 (en) | 2015-08-26 | 2017-09-19 | Digitalglobe, Inc. | Synthesizing training data for broad area geospatial object detection |
US20170068451A1 (en) | 2015-09-08 | 2017-03-09 | Sandisk Technologies Inc. | Storage Device and Method for Detecting and Handling Burst Operations |
US9530491B1 (en) | 2015-11-16 | 2016-12-27 | Sandisk Technologies Llc | System and method for direct write to MLC memory |
WO2017162129A1 (en) | 2016-03-21 | 2017-09-28 | 成都海存艾匹科技有限公司 | Integrated neuroprocessor comprising three-dimensional memory array |
US20190156202A1 (en) | 2016-05-02 | 2019-05-23 | Scopito Aps | Model construction in a neural network for object detection |
US11308383B2 (en) | 2016-05-17 | 2022-04-19 | Silicon Storage Technology, Inc. | Deep learning neural network classifier using non-volatile memory array |
US10387303B2 (en) | 2016-08-16 | 2019-08-20 | Western Digital Technologies, Inc. | Non-volatile storage system with compute engine to accelerate big data applications |
US11501130B2 (en) | 2016-09-09 | 2022-11-15 | SK Hynix Inc. | Neural network hardware accelerator architectures and operating method thereof |
US9646243B1 (en) | 2016-09-12 | 2017-05-09 | International Business Machines Corporation | Convolutional neural networks using resistive processing unit array |
US10176092B2 (en) | 2016-09-21 | 2019-01-08 | Ngd Systems, Inc. | System and method for executing data processing tasks using resilient distributed datasets (RDDs) in a storage device |
CN106485317A (en) | 2016-09-26 | 2017-03-08 | 上海新储集成电路有限公司 | A kind of neutral net accelerator and the implementation method of neural network model |
US10943148B2 (en) * | 2016-12-02 | 2021-03-09 | Apple Inc. | Inspection neural network for assessing neural network reliability |
US20180232508A1 (en) | 2017-02-10 | 2018-08-16 | The Trustees Of Columbia University In The City Of New York | Learning engines for authentication and autonomous applications |
US10909449B2 (en) | 2017-04-14 | 2021-02-02 | Samsung Electronics Co., Ltd. | Monolithic multi-bit weight cell for neuromorphic computing |
EP3610612B1 (en) | 2017-04-17 | 2022-09-28 | Cerebras Systems Inc. | Dataflow triggered tasks for accelerated deep learning |
CN107301455B (en) | 2017-05-05 | 2020-11-03 | 中国科学院计算技术研究所 | Hybrid cube storage system for convolutional neural network and accelerated computing method |
KR20200028330A (en) | 2017-05-09 | 2020-03-16 | 뉴럴라 인코포레이티드 | Systems and methods that enable continuous memory-based learning in deep learning and artificial intelligence to continuously run applications across network compute edges |
US10460817B2 (en) | 2017-07-13 | 2019-10-29 | Qualcomm Incorporated | Multiple (multi-) level cell (MLC) non-volatile (NV) memory (NVM) matrix circuits for performing matrix computations with multi-bit input vectors |
KR102534917B1 (en) * | 2017-08-16 | 2023-05-19 | 에스케이하이닉스 주식회사 | Memory device comprising neural network processor and memory system including the same |
US10394706B2 (en) * | 2017-11-02 | 2019-08-27 | Western Digital Technologies, Inc. | Non-volatile storage with adaptive command prediction |
US20190147320A1 (en) * | 2017-11-15 | 2019-05-16 | Uber Technologies, Inc. | "Matching Adversarial Networks" |
CN108985344A (en) | 2018-06-26 | 2018-12-11 | 四川斐讯信息技术有限公司 | A kind of the training set optimization method and system of neural network model |
US10776263B2 (en) * | 2018-06-27 | 2020-09-15 | Seagate Technology Llc | Non-deterministic window scheduling for data storage systems |
US11170290B2 (en) | 2019-03-28 | 2021-11-09 | Sandisk Technologies Llc | Realization of neural networks with ternary inputs and binary weights in NAND memory arrays |
US10930365B2 (en) * | 2019-02-21 | 2021-02-23 | Intel Corporation | Artificial intelligence based monitoring of solid state drives and dual in-line memory modules |
US11361505B2 (en) * | 2019-06-06 | 2022-06-14 | Qualcomm Technologies, Inc. | Model retrieval for objects in images using field descriptors |
US11520521B2 (en) * | 2019-06-20 | 2022-12-06 | Western Digital Technologies, Inc. | Storage controller having data augmentation components for use with non-volatile memory die |
- 2019-06-20: US application US16/447,619 (US11501109B2, en), status Active
- 2022-08-26: US application US17/897,028 (US20220405532A1, en), status Pending
Also Published As
Publication number | Publication date |
---|---|
US20200401850A1 (en) | 2020-12-24 |
US11501109B2 (en) | 2022-11-15 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US20220405532A1 (en) | Non-volatile memory die with on-chip data augmentation components for use with machine learning | |
US11520521B2 (en) | Storage controller having data augmentation components for use with non-volatile memory die | |
US11507843B2 (en) | Separate storage and control of static and dynamic neural network data within a non-volatile memory array | |
US11705191B2 (en) | Non-volatile memory die with deep learning neural network | |
US11662904B2 (en) | Non-volatile memory with on-chip principal component analysis for generating low dimensional outputs for machine learning | |
US8788908B2 (en) | Data storage system having multi-bit memory device and on-chip buffer program method thereof | |
US11114174B2 (en) | Memory system processing request based on inference and operating method of the same | |
US20200184335A1 (en) | Non-volatile memory die with deep learning neural network | |
EP3756186A2 (en) | Non-volatile memory die with deep learning neural network | |
US8638603B2 (en) | Data storage system having multi-level memory device and operating method thereof | |
US11893244B2 (en) | Hybrid memory management of non-volatile memory (NVM) devices for use with recurrent neural networks | |
US20220171992A1 (en) | Super-sparse image compression using cross-bar non-volatile memory device | |
US11251812B2 (en) | Encoding and decoding of hamming distance-based binary representations of numbers | |
US11216696B2 (en) | Training data sample selection for use with non-volatile memory and machine learning processor | |
US20240311023A1 (en) | Data storage device with noise injection | |
WO2021223528A1 (en) | Processing device and method for executing convolutional neural network operation | |
US12051482B2 (en) | Data storage device with noise injection | |
US11755208B2 (en) | Hybrid memory management of non-volatile memory (NVM) devices for use with recurrent neural networks | |
KR20220127168A (en) | De-noising using multiple threshold-expert machine learning models | |
US20220138541A1 (en) | Systems and methods for use with recurrent neural networks |
Legal Events
Code | Title | Description |
---|---|---|
AS | Assignment | Owner name: WESTERN DIGITAL TECHNOLOGIES, INC., CALIFORNIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: BAZARSKY, ALEXANDER; NAVON, ARIEL; REEL/FRAME: 060917/0612. Effective date: 20190620 |
STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
AS | Assignment | Owner name: JPMORGAN CHASE BANK, N.A., ILLINOIS. Free format text: PATENT COLLATERAL AGREEMENT - A&R LOAN AGREEMENT; ASSIGNOR: WESTERN DIGITAL TECHNOLOGIES, INC.; REEL/FRAME: 064715/0001. Effective date: 20230818 |
AS | Assignment | Owner name: JPMORGAN CHASE BANK, N.A., ILLINOIS. Free format text: PATENT COLLATERAL AGREEMENT - DDTL LOAN AGREEMENT; ASSIGNOR: WESTERN DIGITAL TECHNOLOGIES, INC.; REEL/FRAME: 067045/0156. Effective date: 20230818 |
AS | Assignment | Owner name: SANDISK TECHNOLOGIES, INC., CALIFORNIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: WESTERN DIGITAL TECHNOLOGIES, INC.; REEL/FRAME: 067567/0682. Effective date: 20240503 |
AS | Assignment | Owner name: SANDISK TECHNOLOGIES, INC., CALIFORNIA. Free format text: CHANGE OF NAME; ASSIGNOR: SANDISK TECHNOLOGIES, INC.; REEL/FRAME: 067982/0032. Effective date: 20240621 |
AS | Assignment | Owner name: JPMORGAN CHASE BANK, N.A., AS THE AGENT, ILLINOIS. Free format text: PATENT COLLATERAL AGREEMENT; ASSIGNOR: SANDISK TECHNOLOGIES, INC.; REEL/FRAME: 068762/0494. Effective date: 20240820 |