US20230208449A1 - Neural networks and systems for decoding encoded data - Google Patents
Neural networks and systems for decoding encoded data
- Publication number
- US20230208449A1 (application US18/179,317)
- Authority
- US
- United States
- Prior art keywords
- data
- encoded
- neural network
- weights
- output
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M13/00—Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
- H03M13/65—Purpose and implementation aspects
- H03M13/6597—Implementations using analogue techniques for coding or decoding, e.g. analogue Viterbi decoder
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/18—Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M13/00—Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
- H03M13/03—Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words
- H03M13/05—Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words using block codes, i.e. a predetermined number of check bits joined to a predetermined number of information bits
- H03M13/11—Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words using block codes, i.e. a predetermined number of check bits joined to a predetermined number of information bits using multiple parity bits
- H03M13/1102—Codes on graphs and decoding on graphs, e.g. low-density parity check [LDPC] codes
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M13/00—Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
- H03M13/03—Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words
- H03M13/05—Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words using block codes, i.e. a predetermined number of check bits joined to a predetermined number of information bits
- H03M13/11—Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words using block codes, i.e. a predetermined number of check bits joined to a predetermined number of information bits using multiple parity bits
- H03M13/1102—Codes on graphs and decoding on graphs, e.g. low-density parity check [LDPC] codes
- H03M13/1105—Decoding
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M13/00—Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
- H03M13/03—Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words
- H03M13/05—Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words using block codes, i.e. a predetermined number of check bits joined to a predetermined number of information bits
- H03M13/13—Linear codes
- H03M13/15—Cyclic codes, i.e. cyclic shifts of codewords produce other codewords, e.g. codes defined by a generator polynomial, Bose-Chaudhuri-Hocquenghem [BCH] codes
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M13/00—Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
- H03M13/03—Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words
- H03M13/05—Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words using block codes, i.e. a predetermined number of check bits joined to a predetermined number of information bits
- H03M13/13—Linear codes
- H03M13/15—Cyclic codes, i.e. cyclic shifts of codewords produce other codewords, e.g. codes defined by a generator polynomial, Bose-Chaudhuri-Hocquenghem [BCH] codes
- H03M13/151—Cyclic codes, i.e. cyclic shifts of codewords produce other codewords, e.g. codes defined by a generator polynomial, Bose-Chaudhuri-Hocquenghem [BCH] codes using error location or error correction polynomials
- H03M13/1515—Reed-Solomon codes
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M13/00—Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
- H03M13/03—Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words
- H03M13/05—Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words using block codes, i.e. a predetermined number of check bits joined to a predetermined number of information bits
- H03M13/13—Linear codes
- H03M13/15—Cyclic codes, i.e. cyclic shifts of codewords produce other codewords, e.g. codes defined by a generator polynomial, Bose-Chaudhuri-Hocquenghem [BCH] codes
- H03M13/151—Cyclic codes, i.e. cyclic shifts of codewords produce other codewords, e.g. codes defined by a generator polynomial, Bose-Chaudhuri-Hocquenghem [BCH] codes using error location or error correction polynomials
- H03M13/152—Bose-Chaudhuri-Hocquenghem [BCH] codes
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M13/00—Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
- H03M13/65—Purpose and implementation aspects
- H03M13/6508—Flexibility, adaptability, parametrability and configurability of the implementation
- H03M13/6513—Support of multiple code types, e.g. unified decoder for LDPC and turbo codes
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L1/00—Arrangements for detecting or preventing errors in the information received
- H04L1/004—Arrangements for detecting or preventing errors in the information received by using forward error control
- H04L1/0045—Arrangements at the receiver end
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M13/00—Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
- H03M13/03—Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words
- H03M13/05—Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words using block codes, i.e. a predetermined number of check bits joined to a predetermined number of information bits
- H03M13/13—Linear codes
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M13/00—Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
- H03M13/29—Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes combining two or more codes or code structures, e.g. product codes, generalised product codes, concatenated codes, inner and outer codes
- H03M13/2906—Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes combining two or more codes or code structures, e.g. product codes, generalised product codes, concatenated codes, inner and outer codes using block codes
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L1/00—Arrangements for detecting or preventing errors in the information received
- H04L1/004—Arrangements for detecting or preventing errors in the information received by using forward error control
- H04L1/0056—Systems characterized by the type of code used
- H04L1/0057—Block codes
Definitions
- Examples described herein relate to neural networks for use in decoding encoded data. Examples of neural networks are described which may be used with error-correcting coding (ECC) memory, where a neural network may be used to decode encoded data.
- Error correction coding may be used in a variety of applications, such as memory devices or wireless baseband circuitry.
- error correction coding techniques may encode original data with additional bits to describe the original bits which are intended to be stored, retrieved, and/or transmitted.
- the additional bits may be stored together with the original bits. Accordingly, there may be L bits of original data to be stored and/or transmitted.
- An encoder may provide N-L additional bits, such that the encoded data may be N bits worth of data.
- the original bits may be stored as the original bits, or may be changed by the encoder to form the encoded N bits of stored data.
- a decoder may decode the N bits to retrieve and/or estimate the original L bits, which may be corrected in some examples in accordance with the ECC technique.
- FIG. 1 is a schematic illustration of a neural network arranged in accordance with examples described herein.
- FIG. 2 is a schematic illustration of a hardware implementation of a neural network arranged in accordance with examples described herein.
- FIG. 3 is a schematic illustration of an apparatus arranged in accordance with examples described herein.
- FIG. 4 is a flowchart of a method arranged in accordance with examples described herein.
- Multi-layer neural networks may be used to decode encoded data (e.g., data encoded using one or more encoding techniques).
- the neural networks may have nonlinear mapping and distributed processing capabilities which may be advantageous in many systems employing the neural network decoders.
- neural networks described herein may be used to implement error code correction (ECC) decoders.
- An encoder may have L bits of input data (a1, a2, . . . aL).
- the encoder may encode the input data in accordance with an encoding technique to provide N bits of encoded data (b1, b2, . . . bN).
- the encoded data may be stored and/or transmitted, or some other action taken with the encoded data, which may introduce noise into the data.
- a decoder may receive a version of the N bits of encoded data (x1, x2, . . . xN).
- the decoder may decode the received encoded data into an estimate of the L bits original data (y1, y2, . . . yL).
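The flow from original bits a, to encoded bits b, to received bits x, to the estimate y can be sketched with a toy code. This is a minimal illustration only: the repetition code, the majority-vote decoder, and the function names `encode`, `channel`, and `decode` are assumptions chosen for brevity and are not techniques named in this document.

```python
import random

def encode(a, copies=3):
    """Toy systematic code: the L original bits followed by extra copies.

    The result has N = copies * L bits, of which N - L are additional bits.
    """
    return list(a) * copies

def channel(b, flip_prob, rng):
    """Binary symmetric channel: flip each bit independently with flip_prob."""
    return [(bit ^ 1) if rng.random() < flip_prob else bit for bit in b]

def decode(x, L):
    """Estimate each original bit by majority vote over its copies."""
    copies = len(x) // L
    return [1 if 2 * sum(x[i + k * L] for k in range(copies)) > copies else 0
            for i in range(L)]

a = [1, 0, 1, 1]      # L = 4 original bits (a1 .. aL)
b = encode(a)         # N = 12 encoded bits (b1 .. bN)
x = list(b)
x[1] ^= 1             # one bit corrupted in storage or transmission
y = decode(x, len(a)) # estimate of the original bits (y1 .. yL)
```

With a single flipped bit, the majority vote recovers the original four bits. A neural network decoder as described here would replace `decode` with a trained mapping from x to y.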
- Examples of wireless baseband circuitry may utilize error correction coding (such as low density parity check coding, LDPC).
- An encoder may add particularly selected N-L bits into an original data of L bits, which may allow a decoder to decode the data and reduce and/or minimize errors introduced by noise, interference and other practical factors in the data storage and transmission.
- error correction coding techniques including low density parity check coding (LDPC), Reed-Solomon coding, Bose-Chaudhuri-Hocquenghem (BCH), and Polar coding.
- the decoder may be one of the processing blocks that cost the most computational resources in wireless baseband circuitry and/or memory controllers, which may reduce the desirability of existing decoding schemes in many emerging applications such as Internet of Things (IoT) and/or tactile internet where ultra-low power consumption and ultra-low latency are highly desirable.
- Examples described herein utilize multi-layer neural networks to decode encoded data (e.g., data encoded using one or more encoding techniques).
- the neural networks have nonlinear mapping and distributed processing capabilities which may be advantageous in many systems employing the neural network decoders.
- FIG. 1 is a schematic illustration of a neural network arranged in accordance with examples described herein.
- the neural network 100 includes three stages (e.g., layers). While three stages are shown in FIG. 1, any number of stages may be used in other examples.
- a first stage of neural network 100 includes node 118 , node 120 , node 122 , and node 124 .
- a second stage of neural network 100 includes combiner 102 , combiner 104 , combiner 106 , and combiner 108 .
- a third stage of neural network 100 includes combiner 110 , combiner 112 , combiner 114 , and combiner 116 . Additional, fewer, and/or different components may be used in other examples.
- a neural network may be used including multiple stages of nodes.
- the nodes may be implemented using processing elements which may execute one or more functions on inputs received from a previous stage and provide the output of the functions to the next stage of the neural network.
- the processing elements may be implemented using, for example, one or more processors, controllers, and/or custom circuitry, such as an application specific integrated circuit (ASIC) and/or a field programmable gate array (FPGA).
- the processing elements may be implemented as combiners and/or summers and/or any other structure for performing functions allocated to the processing element.
- certain of the processing elements of neural networks described herein perform weighted sums, e.g., may be implemented using one or more multiplication/accumulation units, which may be implemented using processor(s) and/or other circuitry.
- the neural network 100 may have an input layer, which may be a first stage of the neural network including node 118 , node 120 , node 122 , and node 124 .
- node 118, node 120, node 122, and node 124 may implement a linear function which may provide the input signals (e.g., x1(n), x2(n), . . . xN(n)) to another stage of the neural network (e.g., a ‘hidden stage’ or ‘hidden layer’).
- N bits of input data may be provided to an input stage (e.g., an input layer) of a neural network during operation.
- the input data may be data encoded in accordance with an encoding technique (e.g., low density parity check coding (LDPC), Reed-Solomon coding, Bose-Chaudhuri-Hocquenghem (BCH), and/or Polar coding).
- the N bits of input data may be output by the first stage of the neural network 100 to a next stage of the neural network 100 .
- the connection between the first stage and the second stage of the neural network 100 may not be weighted—e.g., processing elements in the second stage may receive bits unaltered from the first stage in some examples.
- Each of the input data bits may be provided to multiple ones of the processing elements in the next stage. While an input layer is shown, in some examples, the input layer may not be present.
- the neural network 100 may have a next layer, which may be referred to as a ‘hidden layer’ in some examples.
- the next layer may include combiner 102 , combiner 104 , combiner 106 , and combiner 108 , although any number of elements may be used.
- While the processing elements in the second stage of the neural network 100 are referred to as combiners, generally the processing elements in the second stage may perform a nonlinear activation function using the input data bits received at the processing element. Any number of nonlinear activation functions may be used. Examples of functions which may be used include Gaussian functions.
- Examples of functions which may be used also include piece-wise linear functions.
- In these functions, σ represents a real parameter (e.g., a scaling parameter) and r is the distance between the input vector and the current vector.
- the distance may be measured using any of a variety of metrics, including the Euclidean norm.
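A hidden-layer activation of this kind takes a distance r and a scaling parameter and returns a scalar. The sketch below is illustrative: the exact functional forms (`exp(-r^2 / sigma^2)` for the Gaussian and `0.5 * (|r + 1| - |r - 1|)` for the piece-wise linear function) are common textbook choices assumed here, and the function names are hypothetical.

```python
import math

def euclidean_distance(x, c):
    """Distance r between an input vector x and a reference vector c."""
    return math.sqrt(sum((xi - ci) ** 2 for xi, ci in zip(x, c)))

def gaussian(r, sigma):
    """Gaussian activation; assumed form exp(-r^2 / sigma^2)."""
    return math.exp(-(r * r) / (sigma * sigma))

def piecewise_linear(r):
    """Piece-wise linear activation; assumed form 0.5 * (|r + 1| - |r - 1|).

    This equals r for -1 <= r <= 1 and saturates at -1 or +1 outside that range.
    """
    return 0.5 * (abs(r + 1) - abs(r - 1))

r = euclidean_distance([0.0, 0.0], [3.0, 4.0])  # r = 5.0
activation = gaussian(r, sigma=5.0)             # exp(-1)
```

The Gaussian is largest when the input coincides with the reference vector (r = 0) and decays as the input moves away, at a rate set by the scaling parameter.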
- Each element in the ‘hidden layer’ may receive as inputs selected bits (e.g., some or all) of the input data.
- each element in the ‘hidden layer’ may receive as inputs the outputs of multiple selected elements (e.g., some or all elements) in the input layer.
- the combiner 102 may receive as inputs the output of node 118 , node 120 , node 122 , and node 124 . While a single ‘hidden layer’ is shown by way of example in FIG. 1 , any number of ‘hidden layers’ may be present and may be connected in series.
- the hidden layer may evaluate at least one non-linear function using combinations of the data received at the hidden layer node (e.g., element). In this manner, the hidden layer may provide intermediate data at an output of one or more hidden layers.
- the neural network 100 may have an output layer.
- the output layer in the example of FIG. 1 may include combiner 110, combiner 112, combiner 114, and combiner 116, although any number of elements may be used. While the processing elements in the output stage of the neural network 100 are referred to as combiners, generally the processing elements in the output stage may perform any combination or other operation using data bits received from a last ‘hidden layer’ in the neural network. Each element in the output layer may receive as inputs selected bits (e.g., some or all) of the data provided by a last ‘hidden layer’.
- the combiner 110 may receive as inputs the outputs of combiner 102, combiner 104, combiner 106, and combiner 108.
- the connections between the hidden layer and the output layer may be weighted.
- a set of weights W may be specified.
- Other distributions of weights may also be used.
- the weights may be multiplied with the output of the hidden layer before the output is provided to the output layer.
- the output layer may perform a sum of weighted inputs.
- an output of the neural network 100 (e.g., the outputs of the output layer) may accordingly be a weighted sum.
- the output layer may accordingly combine intermediate data received from one or more hidden layers using weights to provide output data.
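The three stages described above can be sketched as a forward pass: unweighted inputs, a hidden layer of nonlinear elements, and an output layer that forms weighted sums. This is a sketch under assumptions: the Gaussian form of the hidden-layer function, the per-element reference vectors in `centers`, and all names and numeric values are illustrative.

```python
import math

def hidden_outputs(x, centers, sigma):
    """Hidden layer: one nonlinear element per reference vector in `centers`."""
    outs = []
    for c in centers:
        r2 = sum((xi - ci) ** 2 for xi, ci in zip(x, c))  # squared distance
        outs.append(math.exp(-r2 / (sigma * sigma)))      # assumed Gaussian form
    return outs

def output_layer(h, W):
    """Output layer: element i computes the weighted sum of hidden outputs."""
    return [sum(w_ij * h_j for w_ij, h_j in zip(row, h)) for row in W]

def forward(x, centers, sigma, W):
    """N input values -> H hidden outputs -> L output values."""
    return output_layer(hidden_outputs(x, centers, sigma), W)

centers = [[0, 0, 0], [1, 1, 1]]  # H = 2 hidden elements, N = 3 inputs
W = [[0.0, 1.0]]                  # L = 1 output element, made-up weights
y = forward([1, 1, 1], centers, 1.0, W)
```

Here the input matches the second reference vector exactly, so the second hidden element outputs 1.0 and the single output equals the weight attached to it.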
- the neural network 100 may be used to provide L output bits which represent decoded data corresponding to N input bits.
- N input bits are shown (x1(n), x2(n), . . . xN(n)) and L output bits are provided (y1(n), y2(n), . . . yL(n)).
- the neural network 100 may be trained such that the weights W used and/or the functions provided by the elements of the hidden layers cause the neural network 100 to provide output bits which represent the decoded data corresponding to the N encoded input bits.
- the input bits may have been encoded with an encoding technique, and the weights and/or functions provided by the elements of the hidden layers may be selected in accordance with the encoding technique. Accordingly, the neural network 100 may be trained multiple times—once for each encoding technique that may be used to provide the neural network 100 with input data.
- Examples of neural networks may be trained. Training generally refers to the process of determining weights, functions, and/or other attributes to be utilized by a neural network to create a desired transformation of input data to output data.
- neural networks described herein may be trained to transform encoded input data to decoded data (e.g., an estimate of the decoded data).
- neural networks described herein may be trained to transform noisy encoded input data to decoded data (e.g., an estimate of the decoded data). In this manner, neural networks may be used to reduce and/or correct errors which may be introduced by noise present in the input data.
- neural networks described herein may be trained to transform noisy encoded input data to encoded data with reduced noise.
- the encoded data with reduced noise may then be provided to any decoder (e.g., a neural network and/or other decoder) for decoding of the encoded data.
- neural networks may be used to reduce and/or correct errors which may be introduced by noise.
- Training as described herein may be supervised or un-supervised in various examples.
- training may occur using known pairs of anticipated input and desired output data.
- training may utilize known encoded data and decoded data pairs to train a neural network to decode subsequent encoded data into decoded data.
- training may utilize known noisy encoded data and decoded data pairs to train a neural network to decode subsequent noisy encoded data into decoded data.
- training may utilize known noisy encoded data and encoded data pairs to train a neural network to provide encoded data having less noise than the input noisy encoded data. Examples of training may include determining weights to be used by a neural network, such as neural network 100 of FIG. 1.
- the same neural network hardware is used during training as will be used during operation. In some examples, however, different neural network hardware may be used during training, and the weights, functions, or other attributes determined during training may be stored for use by other neural network hardware during operation.
- $X(n) = [x_1(n), x_2(n), \ldots, x_N(n)]^T$
- connections between a last hidden layer and the output layer may be weighted.
- Each element in the output layer may have a linear input-output relationship such that it may perform a summation (e.g., a weighted summation). Accordingly, an output of the i'th element in the output layer at time n may be written as:
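The hidden-layer and output-layer relationships referred to elsewhere as Equations 1 and 2 can be sketched as follows. This is a reconstruction from the surrounding definitions, not a quotation: it assumes H hidden elements, where hidden element j has a center vector $C_j$ and a transfer function $f_j$, and the index ranges are assumptions.

```latex
\begin{align}
h_j(n) &= f_j\left(\lVert X(n) - C_j \rVert\right), & j &= 1, \ldots, H \tag{1} \\
y_i(n) &= \sum_{j=1}^{H} W_{ij}\, h_j(n)
        = \sum_{j=1}^{H} W_{ij}\, f_j\left(\lVert X(n) - C_j \rVert\right), & i &= 1, \ldots, L \tag{2}
\end{align}
```

Under this reading, $W_{ij}$ is the connection weight between hidden element j and output element i, which matches the indices used in the weight-update rule later in the text.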
- a neural network architecture may include a number of elements and may have center vectors which are distributed in the input domain such that the neural network may approximate nonlinear multidimensional functions and therefore may approximate forward mapping and inverse mapping between two code words (e.g., from an N-bit input to an L-bit output).
- the choice of transfer function used by elements in the hidden layer may not affect the mapping performance of the neural network, and accordingly, a function may be used which may be implemented conveniently in hardware in some examples. For example, a thin-plate-spline function and/or a Gaussian function may be used in various examples and may both provide adequate approximation capabilities. Other functions may also be used.
- Examples of neural networks may accordingly be specified by attributes (e.g., parameters).
- two sets of parameters may be used to specify a neural network: connection weights and center vectors (e.g., thresholds).
- the parameters may be determined from selected input data (e.g., encoded input data) by solving an optimization function.
- An example optimization function may be given as:
- M is the number of training input vectors (e.g., encoded data inputs used for training)
- Y(n) is an output vector computed from the sample input vector using Equations 1 and 2 above, and $\hat{Y}(n)$ is the corresponding desired (e.g., known) output vector.
- the output vector Y(n) may be written as:
- $Y(n) = [y_1(n), y_2(n), \ldots, y_L(n)]^T$
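An objective of this kind compares each computed output vector with its desired counterpart over the M training vectors. The sketch below assumes the objective is the sum of squared differences, which is a common choice and is consistent with the least-squares formulation later in the text; the function name is hypothetical.

```python
def total_squared_error(computed, desired):
    """Sum over the M samples of the squared distance between Y(n) and its target.

    `computed` and `desired` are lists of M vectors, each of length L.
    """
    return sum((y - d) ** 2
               for Y, D in zip(computed, desired)
               for y, d in zip(Y, D))

computed = [[1, 0], [0.5, 1]]  # M = 2 computed output vectors, L = 2
desired = [[1, 1], [0, 1]]     # corresponding known output vectors
error = total_squared_error(computed, desired)
```

Training then amounts to choosing center vectors, the scaling parameter, and connection weights that make this value small.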
- Various methods may be used to solve the optimization function.
- the center vectors may be chosen from a subset of available sample vectors.
- the number of elements in the hidden layer(s) may be relatively large to cover the entire input domain.
- k-means cluster algorithms distribute the center vectors according to the natural measure of the attractor (e.g., if the density of the data points is high, so is the density of the centers).
- k-means cluster algorithms may find a set of cluster centers and partition the training samples into subsets.
- Each cluster center may be associated with one of the H hidden layer elements in this network.
- the data may be partitioned in such a way that the training points are assigned to the cluster with the nearest center.
- each cluster center may correspond to one of the minima of an optimization function.
- An example optimization function for use with a k-means cluster algorithm may be given as:
- $B_{jn}$ is the cluster partition or membership function forming an H × M matrix.
- Each column may represent an available sample vector (e.g., known input data) and each row may represent a cluster.
- Each column may include a single ‘1’ in the row corresponding to the cluster nearest to that training point, and zeros elsewhere.
- each cluster may be initialized to a different randomly chosen training point. Then each training example may be assigned to the cluster center nearest to it. When all training points have been assigned, the average position of the training points for each cluster may be found and the cluster center is moved to that point. The clusters may become the desired centers of the hidden layer elements.
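The clustering procedure described above (random initialization from training points, nearest-center assignment, and moving each center to the mean of its members) can be sketched as follows. The fixed iteration count, the seed, and the handling of empty clusters are assumptions added to make the example deterministic and runnable.

```python
import random

def squared_distance(x, c):
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))

def nearest(x, centers):
    """Index of the cluster center closest to training point x."""
    return min(range(len(centers)), key=lambda j: squared_distance(x, centers[j]))

def k_means(samples, H, iterations=20, seed=0):
    """Return H center vectors for the hidden layer elements."""
    rng = random.Random(seed)
    # Initialize each cluster to a different randomly chosen training point.
    centers = [list(s) for s in rng.sample(samples, H)]
    for _ in range(iterations):
        clusters = [[] for _ in range(H)]
        for x in samples:
            clusters[nearest(x, centers)].append(x)
        for j, members in enumerate(clusters):
            if members:  # assumption: an empty cluster keeps its old center
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers

samples = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
centers = sorted(k_means(samples, H=2))
```

For these two well-separated groups the centers settle at the mean of each group. The assignment computed by `nearest` plays the role of the membership matrix: column n has a single 1 in the row of the nearest cluster.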
- the scaling factor σ may be determined, and may be determined before determining the connection weights.
- the scaling factor may be selected to cover the training points to allow a smooth fit of the desired network outputs. Generally, this means that any point within the convex hull of the processing element centers may significantly activate more than one element. To achieve this goal, each hidden layer element may activate at least one other hidden layer element to a significant degree.
- An appropriate method to determine the scaling parameter σ may be based on the P-nearest neighbor heuristic.
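One common form of the P-nearest neighbor heuristic sets the scaling parameter of each hidden element to the root-mean-square distance from its center to its P nearest neighboring centers. The sketch below assumes that form; the per-element parameter list and the function name are illustrative.

```python
import math

def p_nearest_sigma(centers, P):
    """Scaling parameter per center: RMS distance to its P nearest other centers."""
    sigmas = []
    for j, cj in enumerate(centers):
        # Squared distances from center j to every other center, smallest first.
        d2 = sorted(sum((a - b) ** 2 for a, b in zip(cj, ci))
                    for i, ci in enumerate(centers) if i != j)
        sigmas.append(math.sqrt(sum(d2[:P]) / P))
    return sigmas

centers = [[0, 0], [3, 4], [6, 8]]
sigmas = p_nearest_sigma(centers, P=1)
```

Because each scaling parameter is tied to the spacing of neighboring centers, every hidden element overlaps at least its closest neighbor, which is the coverage property described above.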
- connection weights may additionally or instead be determined during training.
- the optimization function of Equation 3 may become a linear least-squares problem once the center vectors and the scaling parameter have been determined.
- the linear least-squares problem may be written as
- F is an H × M matrix of the outputs of the hidden layer processing elements, whose matrix elements are computed using $F_{jn} = f_j(\lVert X(n) - C_j \rVert)$ ($j = 1, \ldots, H$; $n = 1, \ldots, M$).
- $\hat{Y} = [\hat{Y}(1), \hat{Y}(2), \ldots, \hat{Y}(M)]$ is the L × M matrix of the desired (e.g., known) outputs.
- the connection weight matrix W may be found from Equation 5 and may be written as follows:
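Once the hidden-layer outputs are fixed, finding the weights is an ordinary linear least-squares problem. The sketch below solves it through regularized normal equations; this particular solution route, the small regularization constant `alpha`, and the function names are assumptions, and other solvers (for example a pseudo-inverse) would serve equally well.

```python
def solve(A, b):
    """Solve A w = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    aug = [list(row) + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def least_squares_weights(F, Y_hat, alpha=1e-9):
    """F: H x M hidden outputs; Y_hat: L x M desired outputs; returns L x H weights.

    Each output row i solves (F F^T + alpha I) w_i = F y_i.
    """
    H = len(F)
    A = [[sum(fa * fb for fa, fb in zip(F[i], F[j])) + (alpha if i == j else 0.0)
          for j in range(H)] for i in range(H)]
    return [solve(A, [sum(y * f for y, f in zip(y_row, F[j])) for j in range(H)])
            for y_row in Y_hat]

F = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]  # H = 2 hidden elements, M = 3 samples
Y_hat = [[2.0, -1.0, 1.0]]              # L = 1 output, generated by weights [2, -1]
W = least_squares_weights(F, Y_hat)
```

In this made-up example the desired outputs were produced by weights 2 and -1, and the solver recovers them to within the effect of the regularization term.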
- connection weights may be determined as follows.
- connection weights may be initialized to any value (e.g., random values may be used).
- the output vector Y(n) may be computed using Equation 2.
- the error term $e_i(n)$ of each output element in the output layer may be computed as follows:
- connection weights may then be adjusted based on the error term, for example as follows:
- $W_{ij}(n+1) = W_{ij}(n) + \gamma\, e_i(n)\, f_j(\lVert X(n) - C_j \rVert)$
- γ is the learning-rate parameter, which may be fixed or time-varying.
- the total error may be computed based on the output from the output layer and the desired (known) data:
- the process may be iterated by again calculating a new output vector, error term, and again adjusting the connection weights. The process may continue until weights are identified which reduce the error to equal to or less than a threshold error.
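The iterative procedure (compute outputs, compute error terms, adjust weights, repeat until the error falls to a threshold) can be sketched as below. Two points are assumptions: the error term is taken as desired minus computed output, so that the additive update reduces the error, and the weights start at zero, which is one permitted choice of initial value. The function name, learning rate, and threshold are illustrative.

```python
def train_weights(hidden, desired, rate=0.1, threshold=1e-6, max_epochs=10000):
    """hidden: M hidden-output vectors (length H); desired: M target vectors (length L)."""
    H, L = len(hidden[0]), len(desired[0])
    W = [[0.0] * H for _ in range(L)]
    for _ in range(max_epochs):
        total_error = 0.0
        for h, d in zip(hidden, desired):
            # Output layer: weighted sum of hidden outputs.
            y = [sum(w * hj for w, hj in zip(row, h)) for row in W]
            # Error term per output element (assumed sign: desired - computed).
            e = [di - yi for di, yi in zip(d, y)]
            for i in range(L):
                for j in range(H):
                    W[i][j] += rate * e[i] * h[j]
            total_error += sum(ei * ei for ei in e)
        if total_error <= threshold:
            break
    return W

hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
desired = [[2.0], [-1.0], [1.0]]  # consistent with weights [2, -1]
W = train_weights(hidden, desired)
```

The loop stops as soon as the accumulated squared error over one pass is at or below the threshold, mirroring the stopping rule described above.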
- the neural network 100 of FIG. 1 may be trained to determine parameters (e.g., weights) for use by the neural network 100 to perform a particular mapping between input and output data. For example, training the neural network 100 may provide one set of parameters to use when decoding encoded data that had been encoded with a particular encoding technique (e.g., low density parity check coding (LDPC), Reed-Solomon coding, Bose-Chaudhuri-Hocquenghem (BCH), and/or Polar coding).
- a different set of weights may be determined for each of multiple encoding techniques—e.g., one set of weights may be determined for use with decoding LDPC encoded data and another set of weights may be determined for use with decoding BCH encoded data.
- neural network 100 of FIG. 1 is provided by way of example only. Other multilayer neural network structures may be used in other examples. Moreover, the training procedures described herein are also provided by way of example. Other training techniques (e.g., learning algorithms) may be used, for example, to solve the local minimum problem and/or vanishing gradient problem. Determined weights and/or vectors for each decoder may be obtained by an off-line learning mode of the neural network, which may advantageously provide more resources and data.
- the input training samples [x1(n), x2(n), . . . , xN(n)] may be generated by passing the encoded samples [b1(n), b2(n), . . . , bN(n)] through some noisy channels and/or adding noise.
- the supervised output samples may be the corresponding original code [a1(n), a2(n), . . . , aL(n)] which may be used to generate [b1(n), b2(n), . . . , bN(n)] by the encoder.
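- The generation of training pairs described above can be illustrated with a short sketch. The encoder below is a hypothetical stand-in (a systematic code that appends one even-parity bit) rather than any of the coding techniques named in this document, and additive Gaussian noise is assumed as the channel model.

```python
import numpy as np

def toy_encoder(a):
    """Hypothetical systematic encoder: append one even-parity bit (N = L + 1)."""
    parity = np.sum(a, axis=-1, keepdims=True) % 2
    return np.concatenate([a, parity], axis=-1)

def make_training_pairs(num_samples, L, noise_std, seed=0):
    rng = np.random.default_rng(seed)
    a = rng.integers(0, 2, size=(num_samples, L))        # original code a(n)
    b = toy_encoder(a)                                   # encoded samples b(n)
    x = b + rng.normal(scale=noise_std, size=b.shape)    # noisy inputs x(n)
    return x, a

# x_train is the network input; a_train is the supervised (desired) output.
x_train, a_train = make_training_pairs(num_samples=1000, L=3, noise_std=0.1)
```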
- the desired decoded code-word can be obtained from input data utilizing the neural network (e.g., computing equation 2), which may avoid complex iterations and feedback decisions used in traditional error-correcting decoding algorithms.
- neural networks described herein may provide a reduction in processing complexity and/or latency, because some complexity has been transferred to an off-line training process which is used to determine the weights and/or functions which will be used.
- the same neural network e.g., the neural network 100 of FIG. 1
- neural networks may serve as a universal decoder for multiple encoder types.
- FIG. 2 is a schematic diagram of hardware implementation of a neural network 200 .
- the hardware implementation of the neural network 200 may be used, for example, to implement one or more neural networks, such as the neural network 100 of FIG. 1 .
- the hardware implementation of the neural network 200 includes a processing unit 230 having two stages. The first stage may include multiplication/accumulation unit 204, table look-up 216, multiplication/accumulation unit 206, table look-up 218, multiplication/accumulation unit 208, and table look-up 220.
- the processing unit 230 includes a second stage, coupled to the first stage in series, which includes multiplication/accumulation unit 210, table look-up 222, multiplication/accumulation unit 212, table look-up 224, multiplication/accumulation unit 214, and table look-up 226.
- the hardware implementation of the neural network 200 further includes a mode configuration control 202 and a weight memory 228.
- the processing unit 230 may receive input data (e.g., x1(n), x2(n), . . . , xN(n)) from a memory device, communication transceiver and/or other component.
- the input data may be encoded in accordance with an encoding technique.
- the processing unit 230 may function to process the encoded input data to provide output data (e.g., y1(n), y2(n), . . . , yL(n)).
- the output data may be the decoded data (e.g., an estimate of the decoded data) corresponding to the encoded input data in some examples.
- the output data may be the data corresponding to the encoded input data, but having reduced and/or modified noise.
- While two stages are shown in FIG. 2 (a first stage including N multiplication/accumulation units and N table look-ups, and a second stage including L multiplication/accumulation units and L table look-ups), any number of stages, and any number of elements in each stage, may be used.
- one multiplication/accumulation unit followed by one table look-up may be used to implement a processing element of FIG. 1 .
- the multiplication/accumulation unit 204 coupled in series to the table look-up 216 may be used to implement the combiner 102 of FIG. 1 .
- each multiplication/accumulation unit of FIG. 2 may be implemented using one or more multipliers and one or more adders.
- Each of the multiplication/accumulation units of FIG. 2 may include multiple multipliers, multiple accumulation units, and/or multiple adders.
- Any one of the multiplication/accumulation units of FIG. 2 may be implemented using an arithmetic logic unit (ALU).
- any one of the multiplication/accumulation units may include one multiplier and one adder that each perform, respectively, multiple multiplications and multiple additions.
- the input-output relationship of an example multiplication/accumulation unit may be written as:
- I is the number of multiplications to be performed by the unit
- $W_i$ refers to the coefficients to be used in the multiplications
- $Z_{in}(i)$ is a factor for multiplication which may be, for example, input to the system and/or stored in one or more of the table look-ups.
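- The input-output expression is not reproduced in this text. One plausible reading of the definitions above is a sum of I products of coefficients and input factors. The sketch below assumes that form; the function name and argument names are hypothetical.

```python
def multiply_accumulate(weights, z_in):
    """Assumed form: Z_out = sum over i of W_i * Z_in(i), for i = 1..I."""
    if len(weights) != len(z_in):
        raise ValueError("weights and z_in must have the same length I")
    acc = 0.0
    for w, z in zip(weights, z_in):
        acc += w * z   # one multiplication followed by one accumulation
    return acc

z_out = multiply_accumulate([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

- A single multiplier and a single adder reused across the I iterations of the loop correspond to the case in which one multiplier and one adder each perform multiple operations.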
- the table-lookups shown in FIG. 2 may generally perform a predetermined nonlinear mapping from input to output.
- the table-lookups may be used to evaluate at least one non-linear function.
- the contents and size of the various table look-ups depicted may be different and may be predetermined.
- one or more of the table look-ups shown in FIG. 2 may be replaced by a single consolidated table look-up.
- nonlinear mappings which may be performed by the table look-ups include Gaussian functions, piece-wise linear functions, sigmoid functions, thin-plate-spline functions, multiquadratic functions, cubic approximations, and inverse multi-quadratic functions.
- selected table look-ups may be by-passed and/or may be de-activated, which may allow a table look-up and its associated multiplication/accumulation unit to be considered a unity gain element.
- the table-lookups may be implemented using one or more look-up tables (e.g., stored in one or more memory device(s)), which may associate an input with the output of a non-linear function utilizing the input.
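- As a hedged illustration of a table look-up that associates an input with the output of a non-linear function, the sketch below precomputes a function on a uniform grid and returns the nearest stored value. The grid range, table size, nearest-entry rounding, and saturation outside the range are all assumptions made for illustration; the class and variable names are hypothetical.

```python
import numpy as np

class TableLookup:
    """Precomputed nonlinear mapping evaluated by nearest-entry look-up."""

    def __init__(self, func, lo, hi, size):
        self.lo, self.hi = lo, hi
        self.table = func(np.linspace(lo, hi, size))   # predetermined contents

    def __call__(self, value):
        # Inputs outside [lo, hi] saturate to the first or last table entry.
        clipped = np.clip(np.asarray(value, dtype=float), self.lo, self.hi)
        scale = (len(self.table) - 1) / (self.hi - self.lo)
        index = np.rint((clipped - self.lo) * scale).astype(int)
        return self.table[index]

# Example: a sigmoid stored in 1025 entries covering [-8, 8].
sigmoid_lut = TableLookup(lambda v: 1.0 / (1.0 + np.exp(-v)), -8.0, 8.0, 1025)
```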
- the hardware implementation of neural network 200 may be used to convert an input code word (e.g., x1(n), x2(n), . . . , xN(n)) to an output code word (e.g., y1(n), y2(n), . . . , yL(n)). Examples of the conversion have been described herein with reference to FIG. 1.
- the input code word may correspond to noisy encoded input data.
- the hardware implementation of the neural network 200 may utilize multiplication with corresponding weights (e.g., weights obtained during training) and look up tables to provide the output code word.
- the output code word may correspond to the decoded data and/or to a version of the encoded input data having reduced and/or changed noise.
- the mode configuration control 202 may be implemented using circuitry (e.g., logic), one or more processor(s), microcontroller(s), controller(s), or other elements.
- the mode configuration control 202 may select certain weights and/or other parameters from weight memory 228 and provide those weights and/or other parameters to one or more of the multiplication/accumulation units and/or table look-ups of FIG. 2 .
- weights and/or other parameters stored in weight memory 228 may be associated with particular encoding techniques.
- the mode configuration control 202 may be used to select weights and/or other parameters in weight memory 228 associated with a particular encoding technique (e.g., Reed-Solomon coding, BCH coding, LDPC coding, and/or Polar coding).
- the hardware implementation of neural network 200 may then utilize the selected weights and/or other parameters to function as a decoder for data encoded with that encoding technique.
- the mode configuration control 202 may select different weights and/or other parameters stored in weight memory 228 which are associated with a different encoding technique to alter the operation of the hardware implementation of neural network 200 to serve as a decoder for the different encoding technique. In this manner, the hardware implementation of neural network 200 may flexibly function as a decoder for multiple encoding techniques.
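- The selection of weights by encoding technique can be sketched in software as a keyed store plus a selector. This is only an illustrative model of the roles of the weight memory 228 and the mode configuration control 202; the class names, method names, and the placeholder weight values are hypothetical and do not come from the disclosure.

```python
import numpy as np

class WeightMemory:
    """Hypothetical stand-in for a weight memory: one weight set per technique."""

    def __init__(self):
        self._sets = {}

    def store(self, technique, weights):
        self._sets[technique] = np.asarray(weights, dtype=float)

    def load(self, technique):
        if technique not in self._sets:
            raise KeyError(f"no weights stored for {technique!r}")
        return self._sets[technique]

class ModeConfigurationControl:
    """Selects the weight set associated with a particular encoding technique."""

    def __init__(self, weight_memory):
        self.weight_memory = weight_memory
        self.active_technique = None
        self.active_weights = None

    def configure(self, technique):
        self.active_weights = self.weight_memory.load(technique)
        self.active_technique = technique
        return self.active_weights

memory = WeightMemory()
memory.store("LDPC", [[0.2, 0.8], [0.5, 0.5]])   # placeholder values, not trained
memory.store("BCH", [[1.0, 0.0], [0.0, 1.0]])    # placeholder values, not trained
control = ModeConfigurationControl(memory)
```

- Calling `control.configure("BCH")` and later `control.configure("LDPC")` switches the same processing structure between decoders without changing anything but the selected parameters.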
- FIG. 3 is a schematic illustration of apparatus 300 (e.g., an integrated circuit, a memory device, a memory system, an electronic device or system, a smart phone, a tablet, a computer, a server, an appliance, a vehicle, etc.) according to an embodiment of the disclosure.
- the apparatus 300 may generally include a host 302 and a memory system 304 .
- the host 302 may be a host system such as a personal laptop computer, a desktop computer, a digital camera, a mobile telephone, or a memory card reader, among various other types of hosts.
- the host 302 may include a number of memory access devices (e.g., a number of processors).
- the host 302 may also be a memory controller, such as where memory system 304 is a memory device (e.g., a memory device having an on-die controller).
- the memory system 304 may be a solid state drive (SSD) or other type of memory and may include a host interface 306, a controller 308 (e.g., a processor and/or other control circuitry), and a number of memory device(s) 314.
- the memory system 304, the controller 308, and/or the memory device(s) 314 may also be separately considered an “apparatus.”
- the memory device(s) 314 may include a number of solid state memory devices such as NAND flash devices, which may provide a storage volume for the memory system 304. Other types of memory may also be used.
- the controller 308 may be coupled to the host interface 306 and to the memory device(s) 314 via a plurality of channels to transfer data between the memory system 304 and the host 302.
- the interface 306 may be in the form of a standardized interface.
- the interface 306 may be a serial advanced technology attachment (SATA), peripheral component interconnect express (PCIe), or a universal serial bus (USB), among other connectors and interfaces.
- interface 306 provides an interface for passing control, address, data, and other signals between the memory system 304 and the host 302 having compatible receptors for the interface 306 .
- the controller 308 may communicate with the memory device(s) 314 (which in some embodiments can include a number of memory arrays on a single die) to control data read, write, and erase operations, among other operations.
- the controller 308 may include a discrete memory channel controller for each channel (not shown in FIG. 3 ) coupling the controller 308 to the memory device(s) 314 .
- the controller 308 may include a number of components in the form of hardware and/or firmware (e.g., one or more integrated circuits) and/or software for controlling access to the memory device(s) 314 and/or for facilitating data transfer between the host 302 and memory device(s) 314 .
- the controller 308 may include an ECC encoder 310 for encoding data bits written to the memory device(s) 314 using one or more encoding techniques.
- the ECC encoder 310 may include a single parity check (SPC) encoder, and/or an algebraic error correction circuit such as one of the group including a Bose-Chaudhuri-Hocquenghem (BCH) ECC encoder and/or a Reed Solomon ECC encoder, among other types of error correction circuits.
- the controller 308 may further include an ECC decoder 312 for decoding encoded data, which may include identifying erroneous cells, converting erroneous cells to erasures, and/or correcting the erasures.
- the memory device(s) 314 may, for example, include one or more output buffers which may read selected data from memory cells of the memory device(s) 314 .
- the output buffers may provide output data, which may be provided as encoded input data to the ECC decoder 312 .
- the neural network 100 of FIG. 1 and/or the hardware implementation of neural network 200 of FIG. 2 may be used to implement the ECC decoder 312 of FIG. 3 , for example.
- the ECC decoder 312 may be capable of decoding data for each type of encoder in the ECC encoder 310 .
- the ECC encoder 310 may store parameters associated with multiple encoding techniques which may be used by the ECC encoder 310 , such that the hardware implementation of neural network 200 may be used as a ‘universal decoder’ to decode input data encoded by the ECC encoder 310 , using any of multiple encoding techniques available to the ECC encoder.
- the ECC encoder 310 and the ECC decoder 312 may each be implemented using discrete components such as an application specific integrated circuit (ASIC) or other circuitry, or the components may reflect functionality provided by circuitry within the controller 308 that does not necessarily have a discrete physical form separate from other portions of the controller 308 . Although illustrated as components within the controller 308 in FIG. 3 , each of the ECC encoder 310 and ECC decoder 312 may be external to the controller 308 or have a number of components located within the controller 308 and a number of components located external to the controller 308 .
- the memory device(s) 314 may include a number of arrays of memory cells (e.g., non-volatile memory cells).
- the arrays can be flash arrays with a NAND architecture, for example. However, embodiments are not limited to a particular type of memory array or array architecture. Floating-gate type flash memory cells in a NAND architecture may be used, but embodiments are not so limited.
- the cells may be multi-level cells (MLC) such as triple level cells (TLC) which store three data bits per cell.
- the memory cells can be grouped, for instance, into a number of blocks including a number of physical pages. A number of blocks can be included in a plane of memory cells and an array can include a number of planes.
- a memory device may be configured to store 8 KB (kilobytes) of user data per page, 128 pages of user data per block, 2048 blocks per plane, and 16 planes per device.
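- The example geometry above implies a user-data capacity that can be worked out directly. The short calculation below assumes 1 KB means 1024 bytes; the variable names are illustrative only.

```python
KIB = 1024                          # assumption: 1 KB = 1024 bytes

page_bytes = 8 * KIB                # 8 KB of user data per page
block_bytes = page_bytes * 128      # 128 pages per block
plane_bytes = block_bytes * 2048    # 2048 blocks per plane
device_bytes = plane_bytes * 16     # 16 planes per device

device_gib = device_bytes // KIB ** 3
```

- Under that assumption each block holds 1 MiB, each plane holds 2 GiB, and the device holds 32 GiB of user data.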
- controller 308 may control encoding of a number of received data bits according to the ECC encoder 310 that allows for later identification of erroneous bits and the conversion of those erroneous bits to erasures.
- the controller 308 may also control programming the encoded number of received data bits to a group of memory cells in memory device(s) 314 .
- the apparatus shown in FIG. 3 may be implemented in any of a variety of products employing processors and memory including for example cameras, phones, wireless devices, displays, chip sets, set top boxes, gaming systems, vehicles, and appliances. Resulting devices employing the memory system may benefit from examples of neural networks described herein to perform their ultimate user function.
- FIG. 4 is a flowchart of a method arranged in accordance with examples described herein.
- the example method may include block 402 , which may be followed by block 404 , which may be followed by block 406 , which may be followed by block 408 , which may be followed by block 410 . Additional, fewer, and/or different blocks may be used in other examples, and the order of the blocks may be different in other examples.
- Block 402 recites “receive known encoded and decoded data pairs, the encoded data encoded with a particular encoding technique.”
- the known encoded and decoded data pairs may be received by a computing device that includes a neural network, such as the neural network 100 of FIG. 1 and/or the neural network 200 of FIG. 2 and/or the ECC decoder 312 of FIG. 3 .
- Signaling indicative of the set of data pairs may be provided to the computing device.
- Block 404 may follow block 402 .
- Block 404 recites “determine a set of weights for a neural network to decode data encoded with the particular encoding technique.”
- a neural network e.g., any of the neural networks described herein
- the weights may be numerical values which, when used by the neural network, allow the neural network to output decoded data corresponding to encoded input data encoded with a particular encoding technique.
- the weights may be stored, for example, in the weight memory 228 of FIG. 2 .
- training may not be performed, and an initial set of weights may simply be provided to a neural network, e.g., based on training of another neural network.
- multiple sets of data pairs may be received (e.g., in block 402 ), with each set corresponding to data encoded with a different encoding technique. Accordingly, multiple sets of weights may be determined (e.g., in block 404 ), each set corresponding to a different encoding technique. For example, one set of weights may be determined which may be used to decode data encoded in accordance with LDPC coding while another set of weights may be determined which may be used to decode data encoded with BCH coding.
- Block 406 may follow block 404 .
- Block 406 recites “receive data encoded with the particular encoding technique.”
- data e.g., signaling indicative of data
- data may be retrieved from a memory of a computing device and/or received using a wireless communications receiver. Any of a variety of encoding techniques may have been used to encode the data.
- Block 408 may follow block 406 .
- Block 408 recites “decode the data using the set of weights.”
- the decoded data may be determined.
- any neural network described herein may be used to decode the encoded data (e.g., the neural network 100 of FIG. 1 , neural network 200 of FIG. 2 , and/or the ECC decoder 312 of FIG. 3 ).
- a set of weights may be selected that is associated with the particular encoding technique used to encode the data received in block 406 .
- the set of weights may be selected from among multiple available sets of weights, each for use in decoding data encoded with a different encoding technique.
- Block 410 may follow block 408 .
- Block 410 recites “writing the decoded data to or reading the decoded data from memory.”
- data decoded in block 408 may be written to a memory, such as the memory 314 of FIG. 3 .
- the decoded data may be transmitted to another device (e.g., using wireless communication techniques).
- While block 410 recites memory, any storage device may be used in some examples.
- blocks 406 - 410 may be repeated for data encoded with different encoding techniques.
- data may be received in block 406 , encoded with one particular encoding technique (e.g., LDPC coding).
- a set of weights may be selected that is for use with LDPC coding and provided to a neural network for decoding in block 408 .
- the decoded data may be obtained in block 410 .
- Data may then be received in block 406 , encoded with a different encoding technique (e.g., BCH coding).
- Another set of weights may be selected that is for use with BCH coding and provided to a neural network for decoding in block 408 .
- the decoded data may be obtained in block 410 . In this manner, one neural network may be used to decode data that had been encoded with multiple encoding techniques.
- Examples described herein may refer to various components as “coupled” or signals as being “provided to” or “received from” certain components. It is to be understood that in some examples the components are directly coupled one to another, while in other examples the components are coupled with intervening components disposed between them. Similarly, signal may be provided directly to and/or received directly from the recited components without intervening components, but also may be provided to and/or received from the certain components through intervening components.
Abstract
Examples described herein utilize multi-layer neural networks to decode encoded data (e.g., data encoded using one or more encoding techniques). The neural networks may have nonlinear mapping and distributed processing capabilities which may be advantageous in many systems employing the neural network decoders. In this manner, neural networks described herein may be used to implement error code correction (ECC) decoders.
Description
- This application is a continuation of pending U.S. patent application Ser. No. 16/233,576 filed Dec. 27, 2018. The aforementioned application is incorporated herein by reference, in its entirety, for any purpose.
- Examples described herein relate to neural networks for use in decoding encoded data. Examples of neural networks are described which may be used with error-correcting coding (ECC) memory, where a neural network may be used to decode encoded data.
- Error correction coding (ECC) may be used in a variety of applications, such as memory devices or wireless baseband circuitry. Generally, error correction coding techniques may encode original data with additional bits to describe the original bits which are intended to be stored, retrieved, and/or transmitted. The additional bits may be stored together with the original bits. Accordingly, there may be L bits of original data to be stored and/or transmitted. An encoder may provide N-L additional bits, such that the encoded data may be N bits worth of data. The original bits may be stored as the original bits, or may be changed by the encoder to form the encoded N bits of stored data. A decoder may decode the N bits to retrieve and/or estimate the original L bits, which may be corrected in some examples in accordance with the ECC technique.
- FIG. 1 is a schematic illustration of a neural network arranged in accordance with examples described herein.
- FIG. 2 is a schematic illustration of a hardware implementation of a neural network arranged in accordance with examples described herein.
- FIG. 3 is a schematic illustration of an apparatus arranged in accordance with examples described herein.
- FIG. 4 is a flowchart of a method arranged in accordance with examples described herein.
- Multi-layer neural networks may be used to decode encoded data (e.g., data encoded using one or more encoding techniques). The neural networks may have nonlinear mapping and distributed processing capabilities which may be advantageous in many systems employing the neural network decoders. In this manner, neural networks described herein may be used to implement error code correction (ECC) decoders.
- An encoder may have L bits of input data (a1, a2, . . . aL). The encoder may encode the input data in accordance with an encoding technique to provide N bits of encoded data (b1, b2, . . . bN). The encoded data may be stored and/or transmitted, or some other action taken with the encoded data, which may introduce noise into the data. Accordingly, a decoder may receive a version of the N bits of encoded data (x1, x2, . . . xN). The decoder may decode the received encoded data into an estimate of the L bits original data (y1, y2, . . . yL).
- Examples of wireless baseband circuitry may utilize error correction coding (such as low density parity check coding, LDPC). An encoder may add particularly selected N-L bits into an original data of L bits, which may allow a decoder to decode the data and reduce and/or minimize errors introduced by noise, interferences and other practical factors in the data storage and transmission.
- There are a variety of particular error correction coding techniques, including low density parity check coding (LDPC), Reed-Solomon coding, Bose-Chaudhuri-Hocquenghem (BCH), and Polar coding. The use of these coding techniques, however, may come at the cost of reduced frequency, channel, and/or storage resource usage efficiency and increased processing complexity. For example, the use of coding techniques may increase the amount of data which may be stored and/or transmitted. Moreover, processing resources may be necessary to implement the encoding and decoding. In some examples, the decoder may be one of the processing blocks that consume the most computational resources in wireless baseband circuitry and/or memory controllers, which may reduce the desirability of existing decoding schemes in many emerging applications such as Internet of Things (IoT) and/or tactile internet where ultra-low power consumption and ultra-low latency are highly desirable.
- Examples described herein utilize multi-layer neural networks to decode encoded data (e.g., data encoded using one or more encoding techniques). The neural networks have nonlinear mapping and distributed processing capabilities which may be advantageous in many systems employing the neural network decoders.
- FIG. 1 is a schematic illustration of a neural network arranged in accordance with examples described herein. The neural network 100 includes three stages (e.g., layers). While three stages are shown in FIG. 1, any number of stages may be used in other examples. A first stage of neural network 100 includes node 118, node 120, node 122, and node 124. A second stage of neural network 100 includes combiner 102, combiner 104, combiner 106, and combiner 108. A third stage of neural network 100 includes combiner 110, combiner 112, combiner 114, and combiner 116. Additional, fewer, and/or different components may be used in other examples.
- Generally, a neural network may be used including multiple stages of nodes. The nodes may be implemented using processing elements which may execute one or more functions on inputs received from a previous stage and provide the output of the functions to the next stage of the neural network. The processing elements may be implemented using, for example, one or more processors, controllers, and/or custom circuitry, such as an application specific integrated circuit (ASIC) and/or a field programmable gate array (FPGA). The processing elements may be implemented as combiners and/or summers and/or any other structure for performing functions allocated to the processing element. In some examples, certain of the processing elements of neural networks described herein perform weighted sums, e.g., may be implemented using one or more multiplication/accumulation units, which may be implemented using processor(s) and/or other circuitry.
- In the example of FIG. 1, the neural network 100 may have an input layer, which may be a first stage of the neural network including node 118, node 120, node 122, and node 124. Node 118, node 120, node 122, and node 124 may implement a linear function which may provide the input signals (e.g., x1(n), x2(n), . . . xN(n)) to another stage of the neural network (e.g., a ‘hidden stage’ or ‘hidden layer’). Accordingly, in the example of FIG. 1, N bits of input data may be provided to an input stage (e.g., an input layer) of a neural network during operation. In some examples, the input data may be data encoded in accordance with an encoding technique (e.g., low density parity check coding (LDPC), Reed-Solomon coding, Bose-Chaudhuri-Hocquenghem (BCH), and/or Polar coding). The N bits of input data may be output by the first stage of the neural network 100 to a next stage of the neural network 100. In some examples, the connection between the first stage and the second stage of the neural network 100 may not be weighted; e.g., processing elements in the second stage may receive bits unaltered from the first stage. Each of the input data bits may be provided to multiple ones of the processing elements in the next stage. While an input layer is shown, in some examples, the input layer may not be present.
- The neural network 100 may have a next layer, which may be referred to as a ‘hidden layer’ in some examples. The next layer may include combiner 102, combiner 104, combiner 106, and combiner 108, although any number of elements may be used. While the processing elements in the second stage of the neural network 100 are referred to as combiners, generally the processing elements in the second stage may perform a nonlinear activation function using the input data bits received at the processing element. Any number of nonlinear activation functions may be used. Examples of functions which may be used include Gaussian functions.
- Examples of functions which may be used include multi-quadratic functions, such as $f(r) = (r^2 + \sigma^2)^{1/2}$. Examples of functions which may be used include inverse multi-quadratic functions, such as $f(r) = (r^2 + \sigma^2)^{-1/2}$. Examples of functions which may be used include thin-plate-spline functions, such as $f(r) = r^2 \log(r)$. Examples of functions which may be used also include piece-wise linear functions and cubic approximation functions.
- In these example functions, σ represents a real parameter (e.g., a scaling parameter) and r is the distance between the input vector and the current vector. The distance may be measured using any of a variety of metrics, including the Euclidean norm.
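- The example functions whose expressions appear in this text can be written down directly, as in the sketch below. The Gaussian form shown is an assumption, since its expression is not reproduced here; it is one common choice and is labeled as such in the code. The Euclidean norm is used for the distance r, and all function names are hypothetical.

```python
import numpy as np

def distance(x, c):
    """r: Euclidean distance between an input vector x and a center vector c."""
    return np.linalg.norm(np.asarray(x, dtype=float) - np.asarray(c, dtype=float))

def multi_quadratic(r, sigma):
    return np.sqrt(r ** 2 + sigma ** 2)

def inverse_multi_quadratic(r, sigma):
    return 1.0 / np.sqrt(r ** 2 + sigma ** 2)

def thin_plate_spline(r):
    r = np.asarray(r, dtype=float)
    # r^2 * log(r) tends to 0 as r -> 0, so log is evaluated at 1 where r is 0.
    safe_r = np.where(r > 0, r, 1.0)
    return r ** 2 * np.log(safe_r)

def gaussian(r, sigma):
    # Assumed form (not taken from this text): exp(-r^2 / sigma^2).
    return np.exp(-(r ** 2) / sigma ** 2)
```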
- Each element in the ‘hidden layer’ may receive as inputs selected bits (e.g., some or all) of the input data. For example, each element in the ‘hidden layer’ may receive as inputs from the output of multiple selected elements (e.g., some or all elements) in the input layer. For example, the
combiner 102 may receive as inputs the output ofnode 118,node 120,node 122, andnode 124. While a single ‘hidden layer’ is shown by way of example inFIG. 1 , any number of ‘hidden layers’ may be present and may be connected in series. While four elements are shown in the ‘hidden layer’, any number may be used, and they may be the same or different in number than the number of nodes in the input layer and/or the number of nodes in any other hidden layer. The nodes in the hidden layer may evaluate at least one non-linear function using combinations of the data received at the hidden layer node (e.g., element). In this manner, the hidden layer may provide intermediate data at an output of one or more hidden layers. - The
neural network 100 may have an output layer. The output layer in the example ofFIG. 1 may includecombiner 110,combiner 112,combiner 114, andcombiner 116, although any number of elements may be used. While the processing element in the output stage of theneural network 100 are referred to as combiners, generally the processing elements in the output may perform any combination or other operation using data bits received from a last ‘hidden layer’ in the neural network. Each element in the output layer may receive as inputs selected bits (e.g., some or all) of the data provided by a last ‘hidden layer’. For example, thecombiner 110 may receive as inputs from the outputs ofcombiner 102,combiner 104,combiner 106, andcombiner 108. The connections between the hidden layer and the output layer may be weighted. For example, a set of weights W may be specified. There may be one weight for each connection between a hidden layer node and an output layer node in some examples. In some examples, there may be one weight for each hidden layer node that may be applied to the data provided by that node to each connected output node. Other distributions of weights may also be used. The weights may be multiplied with the output of the hidden layer before the output is provided to the output layer. In this manner, the output layer may perform a sum of weighted inputs. Accordingly, an output of the neural network 100 (e.g., the outputs of the output layer) may be referred to as a weighted sum. The output layer may accordingly combine intermediate data received from one or more hidden layers using weights to provide output data. - In some examples, the
neural network 100 may be used to provide L output bits which represent decoded data corresponding to N input bits. For example, in the example of FIG. 1, N input bits are shown (x1(n), x2(n), . . . xN(n)) and L output bits are provided (y1(n), y2(n), . . . yL(n)). The neural network 100 may be trained such that the weights W used and/or the functions provided by the elements of the hidden layers cause the neural network 100 to provide output bits which represent the decoded data corresponding to the N encoded input bits. The input bits may have been encoded with an encoding technique, and the weights and/or functions provided by the elements of the hidden layers may be selected in accordance with the encoding technique. Accordingly, the neural network 100 may be trained multiple times, once for each encoding technique that may be used to provide the neural network 100 with input data.
- Examples of neural networks may be trained. Training generally refers to the process of determining weights, functions, and/or other attributes to be utilized by a neural network to create a desired transformation of input data to output data. In some examples, neural networks described herein may be trained to transform encoded input data to decoded data (e.g., an estimate of the decoded data). In some examples, neural networks described herein may be trained to transform noisy encoded input data to decoded data (e.g., an estimate of the decoded data). In this manner, neural networks may be used to reduce and/or correct errors which may be introduced by noise present in the input data. In some examples, neural networks described herein may be trained to transform noisy encoded input data to encoded data with reduced noise. The encoded data with reduced noise may then be provided to any decoder (e.g., a neural network and/or other decoder) for decoding of the encoded data. In this manner, neural networks may be used to reduce and/or correct errors which may be introduced by noise.
- Training as described herein may be supervised or unsupervised in various examples. In some examples, training may occur using known pairs of anticipated input and desired output data. For example, training may utilize known encoded data and decoded data pairs to train a neural network to decode subsequent encoded data into decoded data. In some examples, training may utilize known noisy encoded data and decoded data pairs to train a neural network to decode subsequent noisy encoded data into decoded data. In some examples, training may utilize known noisy encoded data and encoded data pairs to train a neural network to provide encoded data having less noise than the input noisy encoded data. Examples of training may include determining weights to be used by a neural network, such as
neural network 100 of FIG. 1. In some examples, the same neural network hardware is used during training as will be used during operation. In some examples, however, different neural network hardware may be used during training, and the weights, functions, or other attributes determined during training may be stored for use by other neural network hardware during operation.
- Examples of training can be described mathematically. For example, consider input data at a time instant (n), given as:
$$X(n) = [x_1(n), x_2(n), \ldots, x_N(n)]^T$$
- The center vector for each element in the hidden layer(s) of the neural network (e.g.,
combiner 102, combiner 104, combiner 106, and combiner 108 of FIG. 1) may be denoted as Ci (for i=1, 2, . . . , H, where H is the number of elements in the hidden layer).
- The output of each element in a hidden layer may then be given as:
$$h_i(n) = f_i(\lVert X(n) - C_i \rVert) \quad \text{for } i = 1, 2, \ldots, H \tag{1}$$
- The connections between a last hidden layer and the output layer may be weighted. Each element in the output layer may have a linear input-output relationship such that it may perform a summation (e.g., a weighted summation). Accordingly, an output of the i'th element in the output layer at time n may be written as:
$$y_i(n) = \sum_{j=1}^{H} W_{ij}\, h_j(n) = \sum_{j=1}^{H} W_{ij}\, f_j(\lVert X(n) - C_j \rVert) \tag{2}$$
for (i=1, 2, . . . , L), where L is the number of elements in the output layer and Wij is the connection weight between the j'th element in the hidden layer and the i'th element in the output layer.
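As a minimal sketch of Equations 1 and 2, the Python below computes the hidden layer outputs and the weighted sums of the output layer. The Gaussian form exp(-r²/σ²), the parameter `sigma`, and all function names are assumptions made for illustration; the description leaves the transfer function f open.

```python
import math

def gaussian(r, sigma=1.0):
    # Assumed transfer function f(r) = exp(-r^2 / sigma^2); other functions may be used.
    return math.exp(-(r * r) / (sigma * sigma))

def distance(x, c):
    # Euclidean norm ||X(n) - C_i||.
    return math.sqrt(sum((xi - ci) ** 2 for xi, ci in zip(x, c)))

def hidden_outputs(x, centers, sigma=1.0):
    # Equation 1: h_i(n) = f_i(||X(n) - C_i||) for each of the H hidden elements.
    return [gaussian(distance(x, c), sigma) for c in centers]

def network_output(x, centers, weights, sigma=1.0):
    # Equation 2: y_i(n) = sum_j W_ij * h_j(n), with weights given as an L x H list of rows.
    h = hidden_outputs(x, centers, sigma)
    return [sum(w_ij * h_j for w_ij, h_j in zip(row, h)) for row in weights]
```

With identity weights, each output simply reproduces the activation of one hidden element, which makes the two stages easy to inspect separately.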
- Generally, a neural network architecture (e.g., the
neural network 100 of FIG. 1) may include a number of elements and may have center vectors which are distributed in the input domain such that the neural network may approximate nonlinear multidimensional functions and therefore may approximate a forward mapping and an inverse mapping between two code words (e.g., from an N-bit input to an L-bit output). Generally, the choice of transfer function used by elements in the hidden layer may not affect the mapping performance of the neural network, and accordingly, a function may be used which may be implemented conveniently in hardware in some examples. For example, a thin-plate-spline function and/or a Gaussian function may be used in various examples and may both provide adequate approximation capabilities. Other functions may also be used.
- Examples of neural networks may accordingly be specified by attributes (e.g., parameters). In some examples, two sets of parameters may be used to specify a neural network: connection weights and center vectors (e.g., thresholds). The parameters may be determined from selected input data (e.g., encoded input data) by solving an optimization function. An example optimization function may be given as:
-
-
$$Y(n) = [y_1(n), y_2(n), \ldots, y_L(n)]^T$$
- Various methods (e.g., gradient descent procedures) may be used to solve the optimization function. However, in some examples, another approach may be used to determine the parameters of a neural network, which may generally include two steps: (1) determining center vectors Ci (i=1, 2, . . . , H) and (2) determining the weights.
- In some examples, the center vectors may be chosen from a subset of available sample vectors. In such examples, the number of elements in the hidden layer(s) may be relatively large to cover the entire input domain. Accordingly, in some examples, it may be desirable to apply k-means cluster algorithms. Generally, k-means cluster algorithms distribute the center vectors according to the natural measure of the attractor (e.g., if the density of the data points is high, so is the density of the centers). k-means cluster algorithms may find a set of cluster centers and partition the training samples into subsets. Each cluster center may be associated with one of the H hidden layer elements in this network. The data may be partitioned in such a way that the training points are assigned to the cluster with the nearest center. Each cluster center may correspond to one of the minima of an optimization function. An example optimization function for use with a k-means cluster algorithm may be given as:
$$E_{k\text{-means}} = \sum_{j=1}^{H} \sum_{n=1}^{M} B_{jn}\, \lVert X(n) - C_j \rVert^2 \tag{4}$$
where Bjn is the cluster partition or membership function forming an H×M matrix. Each column may represent an available sample vector (e.g., known input data) and each row may represent a cluster. Each column may include a single ‘1’ in the row corresponding to the cluster nearest to that training point, and zeros elsewhere.
- The center of each cluster may be initialized to a different randomly chosen training point. Then each training example may be assigned to the element nearest to it. When all training points have been assigned, the average position of the training points in each cluster may be found and the cluster center is moved to that point. The cluster centers may become the desired centers of the hidden layer elements.
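The clustering procedure just described can be sketched as follows. This is an illustrative sketch only: the fixed iteration count, the seeded random generator, the handling of empty clusters, and the function names are assumptions rather than details from the description.

```python
import random

def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def k_means(samples, num_centers, iterations=20, seed=0):
    rng = random.Random(seed)
    # Initialize each center to a different randomly chosen training point.
    centers = [list(p) for p in rng.sample(samples, num_centers)]
    assignment = [0] * len(samples)
    for _ in range(iterations):
        # Assign every training point to the nearest center.
        for n, x in enumerate(samples):
            assignment[n] = min(range(num_centers),
                                key=lambda j: squared_distance(x, centers[j]))
        # Move each center to the average position of its assigned points.
        for j in range(num_centers):
            members = [samples[n] for n in range(len(samples)) if assignment[n] == j]
            if members:  # assumption: an empty cluster keeps its previous center
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, assignment

def k_means_energy(samples, centers, assignment):
    # Equation 4, with B_jn equal to 1 only for the cluster assigned to sample n.
    return sum(squared_distance(x, centers[assignment[n]])
               for n, x in enumerate(samples))
```

The returned centers would then serve as the center vectors Cj of the hidden layer elements.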
- In some examples, for some transfer functions (e.g., the Gaussian function), the scaling factor σ may be determined, and may be determined before determining the connection weights. The scaling factor may be selected to cover the training points to allow a smooth fit of the desired network outputs. Generally, this means that any point within the convex hull of the processing element centers may significantly activate more than one element. To achieve this goal, each hidden layer element may activate at least one other hidden layer element to a significant degree. An appropriate method to determine the scaling parameter σ may be based on the P-nearest neighbor heuristic, which may be given as:
-
- where Cj (for i=1,2, . . . , H) are the P-nearest neighbors of Ci.
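The expression for the heuristic is not reproduced in this text. As a hedged illustration, the sketch below uses one commonly used form of the P-nearest neighbor heuristic, in which the scaling factor of each center is the root-mean-square distance to its P nearest neighboring centers. That form, and the function name `p_nearest_sigma`, are assumptions and may differ from the expression intended here.

```python
import math

def p_nearest_sigma(centers, p):
    # Assumed form: sigma_i = sqrt((1/P) * sum over the P nearest C_j of ||C_j - C_i||^2).
    if not 1 <= p < len(centers):
        raise ValueError("p must be between 1 and len(centers) - 1")
    sigmas = []
    for i, ci in enumerate(centers):
        # Squared distances from center i to every other center, smallest first.
        d2 = sorted(sum((a - b) ** 2 for a, b in zip(ci, cj))
                    for j, cj in enumerate(centers) if j != i)
        sigmas.append(math.sqrt(sum(d2[:p]) / p))
    return sigmas
```

A larger P widens each element's response, so more hidden layer elements are activated by any given input.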
- The connection weights may additionally or instead be determined during training. In an example of a neural network, such as
neural network 100 of FIG. 1, having one hidden layer of weighted connections and output elements which are summation units, the optimization function of Equation 3 may become a linear least-squares problem once the center vectors and the scaling parameter have been determined. The linear least-squares problem may be written as
-
- where W={Wij} is the L×H matrix of the connection weights, F is an H×M matrix of the outputs of the hidden layer processing elements and whose matrix elements are computed using
$$F_{in} = f_i(\lVert X(n) - C_i \rVert) \quad (i = 1, 2, \ldots, H;\ n = 1, 2, \ldots, M)$$
-
-
- where F+ is the pseudo-inverse of F. In this manner, the above may provide a batch-processing method for determining the connection weights of a neural network. It may be applied, for example, where all input sample sets are available at one time. In some examples, new sample sets may become available one at a time, as in recursive-least-squares (RLS) algorithms. In such cases, the connection weights may be determined as follows.
- First, connection weights may be initialized to any value (e.g., random values may be used).
- The output vector Y(n) may be computed using Equation 2. The error term ei(n) of each output element in the output layer may be computed as follows:
- The connection weights may then be adjusted based on the error term, for example as follows:
$$W_{ij}(n+1) = W_{ij}(n) + \gamma\, e_i(n)\, f_j(\lVert X(n) - C_j \rVert) \quad (i = 1, 2, \ldots, L;\ j = 1, 2, \ldots, H)$$
- where γ is the learning-rate parameter which may be fixed or time-varying.
- The total error may be computed based on the output from the output layer and the desired (known) data:
- The process may be iterated by again calculating a new output vector and error term and again adjusting the connection weights. The process may continue until weights are identified which reduce the error to a value equal to or less than a threshold error.
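The iterative procedure above can be sketched as below. Because the error expressions are not reproduced in this text, the sketch assumes the error term is the desired output minus the computed output and the total error is a sum of squared differences. It takes precomputed hidden layer outputs as input, and the names `train_weights` and `total_error` are hypothetical.

```python
def total_error(weights, hidden, desired):
    # Assumed total error: sum of squared differences over all samples and outputs.
    err = 0.0
    for h, target in zip(hidden, desired):
        for row, t in zip(weights, target):
            err += (t - sum(w * hj for w, hj in zip(row, h))) ** 2
    return err

def train_weights(hidden, desired, gamma=0.5, threshold=1e-6,
                  max_passes=1000, weights=None):
    num_out, num_hidden = len(desired[0]), len(hidden[0])
    if weights is None:
        # Weights may start at any value; zeros are used here for repeatability.
        weights = [[0.0] * num_hidden for _ in range(num_out)]
    err = total_error(weights, hidden, desired)
    for _ in range(max_passes):
        if err <= threshold:
            break
        for h, target in zip(hidden, desired):
            # Output vector from the weighted sum of hidden outputs (Equation 2).
            y = [sum(w * hj for w, hj in zip(row, h)) for row in weights]
            for i in range(num_out):
                e_i = target[i] - y[i]  # assumed sign of the error term
                for j in range(num_hidden):
                    weights[i][j] += gamma * e_i * h[j]  # weight adjustment step
        err = total_error(weights, hidden, desired)
    return weights, err
```

The learning rate `gamma` is fixed in this sketch; a time-varying rate could be substituted without changing the structure of the loop.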
- Accordingly, the
neural network 100 of FIG. 1 may be trained to determine parameters (e.g., weights) for use by the neural network 100 to perform a particular mapping between input and output data. For example, training the neural network 100 may provide one set of parameters to use when decoding encoded data that had been encoded with a particular encoding technique (e.g., low density parity check coding (LDPC), Reed-Solomon coding, Bose-Chaudhuri-Hocquenghem (BCH), and/or Polar coding). The neural network 100 (and/or another neural network) may be trained multiple times, using different known input/output data pairs, for example. Multiple trainings may result in multiple sets of connection weights. For example, a different set of weights may be determined for each of multiple encoding techniques (e.g., one set of weights may be determined for use with decoding LDPC encoded data and another set of weights may be determined for use with decoding BCH encoded data).
- Recall that the structure of
neural network 100 of FIG. 1 is provided by way of example only. Other multilayer neural network structures may be used in other examples. Moreover, the training procedures described herein are also provided by way of example. Other training techniques (e.g., learning algorithms) may be used, for example, to solve the local minimum problem and/or vanishing gradient problem. Determined weights and/or vectors for each decoder may be obtained by an off-line learning mode of the neural network, which may advantageously provide more resources and data.
- In examples of supervised learning, the input training samples [x1(n), x2(n), . . . , xN(n)] may be generated by passing the encoded samples [b1(n), b2(n), . . . , bN(n)] through some noisy channels and/or adding noise. The supervised output samples may be the corresponding original code [a1(n), a2(n), . . . , aL(n)] which may be used to generate [b1(n), b2(n), . . . , bN(n)] by the encoder. Once these parameters are determined in offline mode, the desired decoded code-word can be obtained from input data utilizing the neural network (e.g., computing Equation 2), which may avoid complex iterations and feedback decisions used in traditional error-correcting decoding algorithms. In this manner, neural networks described herein may provide a reduction in processing complexity and/or latency, because some complexity has been transferred to an off-line training process which is used to determine the weights and/or functions which will be used. Further, the same neural network (e.g., the
neural network 100 of FIG. 1) can be used to decode an input code-word encoded by any of multiple error correction encoders by selecting different weights that were obtained by the training for the particular error correction technique employed. In this manner, neural networks may serve as a universal decoder for multiple encoder types.
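The generation of supervised training samples described above can be sketched as follows. The single parity check encoder, the additive Gaussian noise model, and the names `spc_encode` and `make_training_pairs` are stand-ins chosen for brevity, not the encoders or channels of any particular example.

```python
import random

def spc_encode(bits):
    # Stand-in encoder: append one even-parity bit, so N = L + 1.
    return list(bits) + [sum(bits) % 2]

def make_training_pairs(count, message_len, noise_std=0.1, seed=0):
    rng = random.Random(seed)
    pairs = []
    for _ in range(count):
        a = [rng.randint(0, 1) for _ in range(message_len)]   # original code a(n)
        b = spc_encode(a)                                      # encoded samples b(n)
        x = [bit + rng.gauss(0.0, noise_std) for bit in b]     # noisy input x(n)
        pairs.append((x, a))                                   # (input, desired output)
    return pairs
```

Each pair holds a noisy encoded vector and the original data it should decode to, which is the form of known data pair used for supervised training.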
FIG. 2 is a schematic diagram of a hardware implementation of a neural network 200. The hardware implementation of the neural network 200 may be used, for example, to implement one or more neural networks, such as the neural network 100 of FIG. 1. The hardware implementation of the neural network 200 includes a processing unit 230 having two stages. A first stage may include multiplication/accumulation unit 204, table look-up 216, multiplication/accumulation unit 206, table look-up 218, multiplication/accumulation unit 208, and table look-up 220. The processing unit 230 includes a second stage, coupled to the first stage in series, which includes multiplication/accumulation unit 210, table look-up 222, multiplication/accumulation unit 212, table look-up 224, multiplication/accumulation unit 214, and table look-up 226. The hardware implementation of the neural network 200 further includes a mode configurable control 202 and a weight memory 228.
- The
processing unit 230 may receive input data (e.g., x1(n), x2(n), . . . , xN(n)) from a memory device, communication transceiver and/or other component. In some examples, the input data may be encoded in accordance with an encoding technique. The processing unit 230 may function to process the encoded input data to provide output data (e.g., y1(n), y2(n), . . . , yL(n)). The output data may be the decoded data (e.g., an estimate of the decoded data) corresponding to the encoded input data in some examples. The output data may be the data corresponding to the encoded input data, but having reduced and/or modified noise.
- While two stages are shown in
FIG. 2 (a first stage including N multiplication/accumulation units and N table look-ups, and a second stage including L multiplication/accumulation units and L table look-ups), any number of stages, and any number of elements in each stage, may be used. In the example of FIG. 2, one multiplication/accumulation unit followed by one table look-up may be used to implement a processing element of FIG. 1. For example, the multiplication/accumulation unit 204 coupled in series to the table look-up 216 may be used to implement the combiner 102 of FIG. 1.
- Generally, each multiplication/accumulation unit of
FIG. 2 may be implemented using one or more multipliers and one or more adders. Each of the multiplication/accumulation units of FIG. 2 may include multiple multipliers, multiple accumulation units, and/or multiple adders. Any one of the multiplication/accumulation units of FIG. 2 may be implemented using an arithmetic logic unit (ALU). In some examples, any one of the multiplication/accumulation units may include one multiplier and one adder that each perform, respectively, multiple multiplications and multiple additions. The input-output relationship of an example multiplication/accumulation unit may be written as:
$$Z_{out} = \sum_{i=1}^{I} W_i \, Z_{in}(i) \tag{6}$$
where “I” is the number of multiplications to be performed by the unit, Wi refers to the coefficients to be used in the multiplications, and Zin(i) is a factor for multiplication which may be, for example, input to the system and/or stored in one or more of the table look-ups.
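Equation 6 amounts to a multiply-and-accumulate loop, sketched below in software. The function name `multiply_accumulate` and the length check are illustrative assumptions; a hardware unit would realize the same relationship with multipliers and adders.

```python
def multiply_accumulate(coefficients, factors):
    # Equation 6: Z_out = sum over i = 1..I of W_i * Z_in(i).
    if len(coefficients) != len(factors):
        raise ValueError("expected one coefficient per input factor")
    z_out = 0
    for w_i, z_in in zip(coefficients, factors):
        z_out += w_i * z_in  # one multiplication and one addition per term
    return z_out
```

For instance, coefficients (1, 2, 3) applied to factors (4, 5, 6) accumulate 4 + 10 + 18 = 32.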
- The table-lookups shown in
FIG. 2 may generally perform a predetermined nonlinear mapping from input to output. For example, the table-lookups may be used to evaluate at least one non-linear function. In some examples, the contents and size of the various table look-ups depicted may be different and may be predetermined. In some examples, one or more of the table look-ups shown in FIG. 2 may be replaced by a single consolidated table look-up. Examples of nonlinear mappings (e.g., functions) which may be performed by the table look-ups include Gaussian functions, piece-wise linear functions, sigmoid functions, thin-plate-spline functions, multiquadratic functions, cubic approximations, and inverse multi-quadratic functions. Examples of functions have been described with reference also to FIG. 1. In some examples, selected table look-ups may be by-passed and/or may be de-activated, which may allow a table look-up and its associated multiplication/accumulation unit to be considered a unity gain element. Generally, the table-lookups may be implemented using one or more look-up tables (e.g., stored in one or more memory device(s)), which may associate an input with the output of a non-linear function utilizing the input.
- Accordingly, the hardware implementation of
neural network 200 may be used to convert an input code word (e.g., x1(n), x2(n), . . . , xN(n)) to an output code word (e.g., y1(n), y2(n), . . . , yL(n)). Examples of the conversion have been described herein with reference to FIG. 1. For example, the input code word may correspond to noisy encoded input data. The hardware implementation of the neural network 200 may utilize multiplication with corresponding weights (e.g., weights obtained during training) and look up tables to provide the output code word. The output code word may correspond to the decoded data and/or to a version of the encoded input data having reduced and/or changed noise.
- The
mode configuration control 202 may be implemented using circuitry (e.g., logic), one or more processor(s), microcontroller(s), controller(s), or other elements. The mode configuration control 202 may select certain weights and/or other parameters from weight memory 228 and provide those weights and/or other parameters to one or more of the multiplication/accumulation units and/or table look-ups of FIG. 2. In some examples, weights and/or other parameters stored in weight memory 228 may be associated with particular encoding techniques. During operation, the mode configuration control 202 may be used to select weights and/or other parameters in weight memory 228 associated with a particular encoding technique (e.g., Reed-Solomon coding, BCH coding, LDPC coding, and/or Polar coding). The hardware implementation of neural network 200 may then utilize the selected weights and/or other parameters to function as a decoder for data encoded with that encoding technique. The mode configuration control 202 may select different weights and/or other parameters stored in weight memory 228 which are associated with a different encoding technique to alter the operation of the hardware implementation of neural network 200 to serve as a decoder for the different encoding technique. In this manner, the hardware implementation of neural network 200 may flexibly function as a decoder for multiple encoding techniques.
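As a software analogy to one stage of FIG. 2, the sketch below pairs a multiply/accumulate step with a table look-up and selects the weight set by encoding technique. The table resolution, the sigmoid function, the dictionary `WEIGHT_MEMORY`, and its placeholder numbers are all assumptions for illustration; they are not trained weights or details of the hardware.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def build_lut(fn, lo, hi, entries):
    # Tabulate fn at evenly spaced points between lo and hi (inclusive).
    step = (hi - lo) / (entries - 1)
    return {"lo": lo, "step": step,
            "values": [fn(lo + k * step) for k in range(entries)]}

def lookup(lut, z):
    # Return the stored value for the entry nearest to z, clamped to the table range.
    k = round((z - lut["lo"]) / lut["step"])
    k = max(0, min(len(lut["values"]) - 1, k))
    return lut["values"][k]

# Hypothetical weight memory: one weight set per encoding technique.
# The numbers are placeholders, not trained values.
WEIGHT_MEMORY = {
    "LDPC": [[0.5, -0.5], [1.0, 1.0]],
    "BCH": [[1.0, 0.0], [0.0, 1.0]],
}

def run_stage(inputs, technique, lut):
    weights = WEIGHT_MEMORY[technique]  # selection role of the mode configuration control
    sums = [sum(w * x for w, x in zip(row, inputs)) for row in weights]  # multiply/accumulate
    return [lookup(lut, s) for s in sums]  # non-linear mapping by table look-up
```

Switching the `technique` argument changes only which stored weights are applied, so the same stage code serves either mode.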
FIG. 3 is a schematic illustration of apparatus 300 (e.g., an integrated circuit, a memory device, a memory system, an electronic device or system, a smart phone, a tablet, a computer, a server, an appliance, a vehicle, etc.) according to an embodiment of the disclosure. The apparatus 300 may generally include a host 302 and a memory system 304.
- The
host 302 may be a host system such as a personal laptop computer, a desktop computer, a digital camera, a mobile telephone, or a memory card reader, among various other types of hosts. The host 302 may include a number of memory access devices (e.g., a number of processors). The host 302 may also be a memory controller, such as where memory system 304 is a memory device (e.g., a memory device having an on-die controller).
- The
memory system 304 may be a solid state drive (SSD) or other type of memory and may include a host interface 306, a controller 308 (e.g., a processor and/or other control circuitry), and a number of memory device(s) 314. The memory system 304, the controller 308, and/or the memory device(s) 314 may also be separately considered an “apparatus.” The memory device(s) 314 may include a number of solid state memory devices such as NAND flash devices, which may provide a storage volume for the memory system 304. Other types of memory may also be used.
- The
controller 308 may be coupled to the host interface 306 and to the memory device(s) 314 via a plurality of channels to transfer data between the memory system 304 and the host 302. The interface 306 may be in the form of a standardized interface. For example, when the memory system 304 is used for data storage in the apparatus 300, the interface 306 may be a serial advanced technology attachment (SATA), peripheral component interconnect express (PCIe), or a universal serial bus (USB), among other connectors and interfaces. In general, interface 306 provides an interface for passing control, address, data, and other signals between the memory system 304 and the host 302 having compatible receptors for the interface 306.
- The
controller 308 may communicate with the memory device(s) 314 (which in some embodiments can include a number of memory arrays on a single die) to control data read, write, and erase operations, among other operations. The controller 308 may include a discrete memory channel controller for each channel (not shown in FIG. 3) coupling the controller 308 to the memory device(s) 314. The controller 308 may include a number of components in the form of hardware and/or firmware (e.g., one or more integrated circuits) and/or software for controlling access to the memory device(s) 314 and/or for facilitating data transfer between the host 302 and memory device(s) 314.
- The
controller 308 may include an ECC encoder 310 for encoding data bits written to the memory device(s) 314 using one or more encoding techniques. The ECC encoder 310 may include a single parity check (SPC) encoder, and/or an algebraic error correction circuit such as one of the group including a Bose-Chaudhuri-Hocquenghem (BCH) ECC encoder and/or a Reed Solomon ECC encoder, among other types of error correction circuits. The controller 308 may further include an ECC decoder 312 for decoding encoded data, which may include identifying erroneous cells, converting erroneous cells to erasures, and/or correcting the erasures. The memory device(s) 314 may, for example, include one or more output buffers which may read selected data from memory cells of the memory device(s) 314. The output buffers may provide output data, which may be provided as encoded input data to the ECC decoder 312. The neural network 100 of FIG. 1 and/or the hardware implementation of neural network 200 of FIG. 2 may be used to implement the ECC decoder 312 of FIG. 3, for example. In various embodiments, the ECC decoder 312 may be capable of decoding data for each type of encoder in the ECC encoder 310. For example, the weight memory 228 of FIG. 2 may store parameters associated with multiple encoding techniques which may be used by the ECC encoder 310, such that the hardware implementation of neural network 200 may be used as a ‘universal decoder’ to decode input data encoded by the ECC encoder 310, using any of multiple encoding techniques available to the ECC encoder.
- The
ECC encoder 310 and the ECC decoder 312 may each be implemented using discrete components such as an application specific integrated circuit (ASIC) or other circuitry, or the components may reflect functionality provided by circuitry within the controller 308 that does not necessarily have a discrete physical form separate from other portions of the controller 308. Although illustrated as components within the controller 308 in FIG. 3, each of the ECC encoder 310 and ECC decoder 312 may be external to the controller 308 or have a number of components located within the controller 308 and a number of components located external to the controller 308.
- The memory device(s) 314 may include a number of arrays of memory cells (e.g., non-volatile memory cells). The arrays can be flash arrays with a NAND architecture, for example. However, embodiments are not limited to a particular type of memory array or array architecture. Floating-gate type flash memory cells in a NAND architecture may be used, but embodiments are not so limited. The cells may be multi-level cells (MLC) such as triple level cells (TLC) which store three data bits per cell. The memory cells can be grouped, for instance, into a number of blocks including a number of physical pages. A number of blocks can be included in a plane of memory cells and an array can include a number of planes. As one example, a memory device may be configured to store 8 KB (kilobytes) of user data per page, 128 pages of user data per block, 2048 blocks per plane, and 16 planes per device.
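The example geometry above implies a total user-data capacity that can be worked out directly. The short calculation below assumes binary kilobytes (1 KB = 1024 bytes), which the description does not specify; the function name is illustrative.

```python
KB = 1024  # assumption: binary kilobytes

def device_capacity_bytes(page_kb=8, pages_per_block=128,
                          blocks_per_plane=2048, planes=16):
    # bytes per page x pages per block x blocks per plane x planes per device
    return page_kb * KB * pages_per_block * blocks_per_plane * planes
```

Under that assumption each block holds 1 MiB of user data, each plane 2 GiB, and the device 32 GiB.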
- According to a number of embodiments,
controller 308 may control encoding of a number of received data bits according to the ECC encoder 310 that allows for later identification of erroneous bits and the conversion of those erroneous bits to erasures. The controller 308 may also control programming the encoded number of received data bits to a group of memory cells in memory device(s) 314.
- The apparatus shown in
FIG. 3 may be implemented in any of a variety of products employing processors and memory including for example cameras, phones, wireless devices, displays, chip sets, set top boxes, gaming systems, vehicles, and appliances. Resulting devices employing the memory system may benefit from examples of neural networks described herein to perform their ultimate user function.
- From the foregoing it will be appreciated that, although specific embodiments have been described herein for purposes of illustration, various modifications may be made while remaining within the scope of the claimed technology. Certain details are set forth herein to provide an understanding of described embodiments of technology. However, other examples may be practiced without various of these particular details. In some instances, well-known circuits, control signals, timing protocols, neural network structures, algorithms, and/or software operations have not been shown in detail in order to avoid unnecessarily obscuring the described embodiments. Other embodiments may be utilized, and other changes may be made, without departing from the spirit or scope of the subject matter presented here.
-
FIG. 4 is a flowchart of a method arranged in accordance with examples described herein. The example method may include block 402, which may be followed by block 404, which may be followed by block 406, which may be followed by block 408, which may be followed by block 410. Additional, fewer, and/or different blocks may be used in other examples, and the order of the blocks may be different in other examples.
-
Block 402 recites “receive known encoded and decoded data pairs, the encoded data encoded with a particular encoding technique.” The known encoded and decoded data pairs may be received by a computing device that includes a neural network, such as the neural network 100 of FIG. 1 and/or the neural network 200 of FIG. 2 and/or the ECC decoder 312 of FIG. 3. Signaling indicative of the set of data pairs may be provided to the computing device.
-
Block 404 may follow block 402. Block 404 recites “determine a set of weights for a neural network to decode data encoded with the particular encoding technique.” For example, a neural network (e.g., any of the neural networks described herein) may be trained using the encoded and decoded data pairs received in block 402. The weights may be numerical values, which, when used by the neural network, allow the neural network to output decoded data corresponding to encoded input data encoded with a particular encoding technique. The weights may be stored, for example, in the weight memory 228 of FIG. 2. In some examples, training may not be performed, and an initial set of weights may simply be provided to a neural network, e.g., based on training of another neural network.
- In some examples, multiple sets of data pairs may be received (e.g., in block 402), with each set corresponding to data encoded with a different encoding technique. Accordingly, multiple sets of weights may be determined (e.g., in block 404), each set corresponding to a different encoding technique. For example, one set of weights may be determined which may be used to decode data encoded in accordance with LDPC coding while another set of weights may be determined which may be used to decode data encoded with BCH coding.
-
Block 406 may follow block 404. Block 406 recites “receive data encoded with the particular encoding technique.” For example, data (e.g., signaling indicative of data) encoded with the particular encoding technique may be retrieved from a memory of a computing device and/or received using a wireless communications receiver. Any of a variety of encoding techniques may have been used to encode the data.
-
Block 408 may follow block 406. Block 408 recites “decode the data using the set of weights.” By processing the encoded data received in block 406 using the weights, which may have been determined in block 404, the decoded data may be determined. For example, any neural network described herein may be used to decode the encoded data (e.g., the neural network 100 of FIG. 1, neural network 200 of FIG. 2, and/or the ECC decoder 312 of FIG. 3). In some examples, a set of weights may be selected that is associated with the particular encoding technique used to encode the data received in block 406. The set of weights may be selected from among multiple available sets of weights, each for use in decoding data encoded with a different encoding technique.
-
Block 410 may follow block 408. Block 410 recites “writing the decoded data to or reading the decoded data from memory.” For example, data decoded in block 408 may be written to a memory, such as the memory 314 of FIG. 3. In some examples, instead of or in addition to writing the data to memory, the decoded data may be transmitted to another device (e.g., using wireless communication techniques). While block 410 recites memory, in some examples any storage device may be used.
- In some examples, blocks 406-410 may be repeated for data encoded with different encoding techniques. For example, data may be received in
block 406, encoded with one particular encoding technique (e.g., LDPC coding). A set of weights may be selected that is for use with LDPC coding and provided to a neural network for decoding in block 408. The decoded data may be obtained in block 410. Data may then be received in block 406, encoded with a different encoding technique (e.g., BCH coding). Another set of weights may be selected that is for use with BCH coding and provided to a neural network for decoding in block 408. The decoded data may be obtained in block 410. In this manner, one neural network may be used to decode data that had been encoded with multiple encoding techniques.
- Examples described herein may refer to various components as “coupled” or signals as being “provided to” or “received from” certain components. It is to be understood that in some examples the components are directly coupled one to another, while in other examples the components are coupled with intervening components disposed between them. Similarly, signals may be provided directly to and/or received directly from the recited components without intervening components, but also may be provided to and/or received from the certain components through intervening components.
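The flow of FIG. 4, including the repetition of blocks 406-410 for differently encoded data, can be outlined in code. This is a control-flow sketch only: `toy_train` and `toy_decode` are hypothetical stand-ins built around simple repetition codes, and the technique names "rep3" and "rep5" are invented for the example rather than taken from the description.

```python
def toy_train(pairs):
    # Stand-in for block 404: equal weights sized to the code word length.
    n = len(pairs[0][0])
    return [1.0 / n] * n

def toy_decode(encoded, weights):
    # Stand-in for block 408: weighted sum followed by a threshold, giving one bit.
    return [1 if sum(w * x for w, x in zip(weights, encoded)) >= 0.5 else 0]

def run_decoding_method(training_sets, incoming, train_fn, decode_fn):
    # Blocks 402-404: receive known data pairs and determine one weight set per technique.
    weight_sets = {technique: train_fn(pairs)
                   for technique, pairs in training_sets.items()}
    memory = []
    for technique, encoded in incoming:              # block 406: receive encoded data
        weights = weight_sets[technique]             # select the matching weight set
        memory.append(decode_fn(encoded, weights))   # blocks 408-410: decode and store
    return memory
```

A single call can therefore process a stream in which consecutive code words were produced by different encoders, as long as a weight set exists for each technique.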
Claims (20)
1. An apparatus comprising:
a first stage of combiners configured to receive encoded input data and further configured to implement a first function to provide first intermediate data; and
a second stage of combiners configured to receive the first intermediate data and further configured to combine the first intermediate data using a set of predetermined weights to provide the encoded data with reduced noise.
2. The apparatus of claim 1, further comprising:
a third stage of combiners configured to receive the first intermediate data and implement a second function to provide second intermediate data to the second stage of combiners.
3. The apparatus of claim 1, wherein the first function is a nonlinear function.
4. The apparatus of claim 1, wherein the first stage of combiners and second stage of combiners comprises a first plurality of multiplication/accumulation units, the first plurality of multiplication/accumulation units each configured to multiply at least one bit of the encoded input data with at least one of the set of predetermined weights and sum multiple weighted bits of the encoded input data.
5. The apparatus of claim 4, wherein the first stage of combiners further comprises a first plurality of table look-ups, the first plurality of table look-ups each configured to look-up at least one intermediate data value corresponding to an output of a respective one of the first plurality of multiplication/accumulation units based on at least one non-linear function.
6. The apparatus of claim 5, wherein the at least one non-linear function comprises a Gaussian function, a piece-wise linear function, a sigmoid function, a thin-plate-spline function, a multiquadratic function, a cubic approximation, an inverse multi-quadratic function, or combinations thereof.
7. The apparatus of claim 1, wherein the set of predetermined weights is based at least in part on an encoding technique associated with the encoded input data.
8. The apparatus of claim 7, wherein the encoding technique comprises Reed-Solomon coding, Bose-Chaudhuri-Hocquenghem (BCH) coding, low-density parity check (LDPC) coding, Polar coding, or combinations thereof.
9. The apparatus of claim 1, wherein the set of predetermined weights are based on training of a neural network using known noisy encoded data and encoded data pairs.
10. The apparatus of claim 1, further comprising:
an encoder configured to encode the input data using encoded bits in accordance with an encoding technique and to provide the encoded input data; and
a memory configured to receive the encoded input data from the encoder and configured to store the encoded input data, wherein, in storing the encoded input data, noise is introduced into the encoded input data.
11. A method comprising:
transmitting signaling, from a memory of a computing device, indicative of data encoded with an encoding technique; and
modifying, at a neural network of the computing device, the data encoded with an encoding technique using a set of weights to provide encoded data with reduced noise.
12. The method of claim 11, further comprising:
receiving, at the computing device, signaling indicative of a set of encoded data pairs comprising encoded data; and
determining, for the neural network, the set of weights that modifies the encoded data using the signaling indicative of the set of encoded data pairs.
13. The method of claim 12, wherein determining the set of weights comprises selecting weights resulting in a minimized value of an error function between an output of the neural network and known noisy encoded data.
14. The method of claim 11, wherein modifying the data using the neural network using the set of weights to provide encoded data with reduced noise comprises:
combining the data encoded with the encoding technique among the set of weights to provide the encoded data with reduced noise using a plurality of layers of the neural network, comprising an input layer, a hidden layer, an output layer, or combinations thereof.
15. The method of claim 11, wherein the encoded data with reduced noise is an estimate of the encoded data relative to output of an encoder associated with the encoding technique.
16. The method of claim 11, wherein the encoding technique comprises Reed-Solomon coding, Bose-Chaudhuri-Hocquenghem (BCH) coding, low-density parity check (LDPC) coding, Polar coding, or combinations thereof.
17. The method of claim 16, wherein the neural network is trained multiple times, once for each encoding technique used.
18. A memory system comprising:
one or more output buffers configured to transmit noisy output data; and
a neural network configured to receive the noisy output data, and configured to utilize initial weights selected based on encoded data pairs to provide an estimate of encoded data with reduced noise.
19. The memory system of claim 18, wherein the noise is introduced in transmitting the output data from the output buffers.
20. The memory system of claim 18, wherein the neural network is configured to use multiple stages of nodes to provide the estimate of the encoded data, the multiple stages of nodes comprising an input stage, a hidden stage, an output stage, or combinations thereof.
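As an illustration only, and not as a construction of any claim, the arrangement of multiplication/accumulation units and table look-ups recited above may be sketched in Python as follows. The sketch assumes a sigmoid as the non-linear function and a table sampled on [-8, 8] in steps of 1/16; the names `mac`, `lookup`, and `two_stage`, the table range, the step size, and the weight values are all hypothetical.

```python
import math


def mac(values, weights):
    """Multiply each input by its weight and accumulate the sum."""
    accumulator = 0.0
    for value, weight in zip(values, weights):
        accumulator += value * weight
    return accumulator


# Assumed table parameters: a sigmoid sampled on [-8, 8] with a step of 1/16.
TABLE_MIN, TABLE_MAX, STEP = -8.0, 8.0, 0.0625
SIGMOID_TABLE = [
    1.0 / (1.0 + math.exp(-(TABLE_MIN + i * STEP)))
    for i in range(int((TABLE_MAX - TABLE_MIN) / STEP) + 1)
]


def lookup(x):
    """Return the tabulated sigmoid value nearest to x, clamping x to the table range."""
    clamped = min(max(x, TABLE_MIN), TABLE_MAX)
    return SIGMOID_TABLE[round((clamped - TABLE_MIN) / STEP)]


def two_stage(bits, first_weights, second_weights):
    """First stage: MAC followed by table look-up. Second stage: MAC over the results."""
    intermediate = [lookup(mac(bits, row)) for row in first_weights]
    return [mac(intermediate, row) for row in second_weights]


# Hypothetical weights: both first-stage sums are 0, so each look-up returns 0.5.
result = two_stage([1, 0, 1], [[1.0, 0.0, -1.0], [0.0, 0.0, 0.0]], [[2.0, 2.0]])
```

Replacing the run-time evaluation of the non-linear function with a precomputed table trades a small quantization error for a fixed-cost look-up, which is one reason a table may be preferred in hardware.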
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US18/179,317 US20230208449A1 (en) | 2018-12-27 | 2023-03-06 | Neural networks and systems for decoding encoded data |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/233,576 US11599773B2 (en) | 2018-12-27 | 2018-12-27 | Neural networks and systems for decoding encoded data |
US18/179,317 US20230208449A1 (en) | 2018-12-27 | 2023-03-06 | Neural networks and systems for decoding encoded data |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/233,576 Continuation US11599773B2 (en) | 2018-12-27 | 2018-12-27 | Neural networks and systems for decoding encoded data |
Publications (1)
Publication Number | Publication Date |
---|---|
US20230208449A1 (en) | 2023-06-29 |
Family
ID=71122023
Family Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/233,576 Active 2041-09-10 US11599773B2 (en) | 2018-12-27 | 2018-12-27 | Neural networks and systems for decoding encoded data |
US16/839,447 Active 2039-10-09 US11416735B2 (en) | 2018-12-27 | 2020-04-03 | Neural networks and systems for decoding encoded data |
US18/179,317 Pending US20230208449A1 (en) | 2018-12-27 | 2023-03-06 | Neural networks and systems for decoding encoded data |
Family Applications Before (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/233,576 Active 2041-09-10 US11599773B2 (en) | 2018-12-27 | 2018-12-27 | Neural networks and systems for decoding encoded data |
US16/839,447 Active 2039-10-09 US11416735B2 (en) | 2018-12-27 | 2020-04-03 | Neural networks and systems for decoding encoded data |
Country Status (5)
Country | Link |
---|---|
US (3) | US11599773B2 (en) |
EP (1) | EP3903238A4 (en) |
KR (1) | KR20210096679A (en) |
CN (1) | CN113168560A (en) |
WO (1) | WO2020139976A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11973513B2 (en) | 2021-04-27 | 2024-04-30 | Micron Technology, Inc. | Decoders and systems for decoding encoded data using neural networks |
Families Citing this family (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11599773B2 (en) | 2018-12-27 | 2023-03-07 | Micron Technology, Inc. | Neural networks and systems for decoding encoded data |
US11424764B2 (en) | 2019-11-13 | 2022-08-23 | Micron Technology, Inc. | Recurrent neural networks and systems for decoding encoded data |
KR20210064723A (en) * | 2019-11-26 | 2021-06-03 | 에스케이하이닉스 주식회사 | Electronic device and operating method thereof |
US11356305B2 (en) * | 2020-02-24 | 2022-06-07 | Qualcomm Incorporated | Method to convey the TX waveform distortion to the receiver |
US11546000B2 (en) * | 2020-05-04 | 2023-01-03 | Samsung Electronics Co., Ltd. | Mobile data storage |
WO2022071642A1 (en) * | 2020-09-29 | 2022-04-07 | 엘지전자 주식회사 | Method and apparatus for performing channel coding of ue and base station in wireless communication system |
US11556790B2 (en) * | 2020-09-30 | 2023-01-17 | Micron Technology, Inc. | Artificial neural network training in memory |
US11563449B2 (en) * | 2021-04-27 | 2023-01-24 | Micron Technology, Inc. | Systems for error reduction of encoded data using neural networks |
EP4329202A1 (en) * | 2021-05-25 | 2024-02-28 | Samsung Electronics Co., Ltd. | Neural network-based self-correcting min-sum decoder and electronic device comprising same |
US11528037B1 (en) * | 2021-06-17 | 2022-12-13 | Samsung Electronics Co., Ltd. | Hardware architecture for local erasure correction in SSD/UFS via maximally recoverable codes |
US11755408B2 (en) * | 2021-10-07 | 2023-09-12 | Micron Technology, Inc. | Systems for estimating bit error rate (BER) of encoded data using neural networks |
CN114943291A (en) * | 2022-05-25 | 2022-08-26 | 北京地平线机器人技术研发有限公司 | Training method and device of multi-task model |
US20230418738A1 (en) * | 2022-06-22 | 2023-12-28 | Western Digital Technologies, Inc. | Memory device with latch-based neural network weight parity detection and trimming |
Family Cites Families (40)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7006881B1 (en) | 1991-12-23 | 2006-02-28 | Steven Hoffberg | Media recording device with remote graphic user interface |
AU2001295591A1 (en) | 2000-10-13 | 2002-04-22 | Fraunhofer-Gesellschaft Zur Forderung Der Angewandten Forschung E.V. | A method for supervised teaching of a recurrent artificial neural network |
WO2004008647A2 (en) | 2002-07-16 | 2004-01-22 | In Kwan Hwang | Multistage adaptive parallel interference canceller |
BRPI0515814A (en) | 2004-12-10 | 2008-08-05 | Matsushita Electric Ind Co Ltd | wideband encoding device, wideband lsp prediction device, scalable band encoding device, wideband encoding method |
CN102414498B (en) | 2009-02-25 | 2015-09-30 | 乔治麦尔有限公司 | For the improvement terminator terminating junctor of high pressure reinforced rubber hose |
US8392789B2 (en) | 2009-07-28 | 2013-03-05 | Texas Instruments Incorporated | Method and system for decoding low density parity check codes |
EP3104564B1 (en) | 2015-06-12 | 2020-05-06 | Institut Mines-Télécom | Anticipated termination for sequential decoders |
KR102124714B1 (en) | 2015-09-03 | 2020-06-19 | 미디어텍 인크. | Method and apparatus for neural network based processing in video coding |
US9742593B2 (en) | 2015-12-16 | 2017-08-22 | Kumu Networks, Inc. | Systems and methods for adaptively-tuned digital self-interference cancellation |
US10891540B2 (en) | 2015-12-18 | 2021-01-12 | National Technology & Engineering Solutions Of Sandia, Llc | Adaptive neural network management system |
WO2017136070A1 (en) | 2016-02-03 | 2017-08-10 | Google Inc. | Compressed recurrent neural network models |
US10176802B1 (en) * | 2016-03-21 | 2019-01-08 | Amazon Technologies, Inc. | Lattice encoding using recurrent neural networks |
WO2018016051A1 (en) | 2016-07-21 | 2018-01-25 | 日産ライトトラック株式会社 | Frame for vehicles |
US10698657B2 (en) | 2016-08-12 | 2020-06-30 | Xilinx, Inc. | Hardware accelerator for compressed RNN on FPGA |
US10552738B2 (en) | 2016-12-15 | 2020-02-04 | Google Llc | Adaptive channel coding using machine-learned models |
EP3619656A4 (en) * | 2017-05-03 | 2021-01-20 | Virginia Tech Intellectual Properties, Inc. | Learning and deployment of adaptive wireless communications |
US10491243B2 (en) | 2017-05-26 | 2019-11-26 | SK Hynix Inc. | Deep learning for low-density parity-check (LDPC) decoding |
US20180357530A1 (en) | 2017-06-13 | 2018-12-13 | Ramot At Tel-Aviv University Ltd. | Deep learning decoding of error correcting codes |
US10749594B1 (en) | 2017-08-18 | 2020-08-18 | DeepSig Inc. | Learning-based space communications systems |
CN117768643A (en) | 2017-10-13 | 2024-03-26 | 弗劳恩霍夫应用研究促进协会 | Intra prediction mode concept for block-wise slice coding |
US20190197549A1 (en) * | 2017-12-21 | 2019-06-27 | Paypal, Inc. | Robust features generation architecture for fraud modeling |
KR20200004195A (en) | 2018-07-03 | 2020-01-13 | 에스케이하이닉스 주식회사 | Memory controller and operating method thereof |
CN110737758B (en) * | 2018-07-03 | 2022-07-05 | 百度在线网络技术(北京)有限公司 | Method and apparatus for generating a model |
GB2576499A (en) | 2018-08-15 | 2020-02-26 | Imperial College Sci Tech & Medicine | Joint source channel coding for noisy channels using neural networks |
GB2576500A (en) | 2018-08-15 | 2020-02-26 | Imperial College Sci Tech & Medicine | Joint source channel coding based on channel capacity using neural networks |
US11403511B2 (en) * | 2018-08-23 | 2022-08-02 | Apple Inc. | Unsupervised annotation using dual network system with pre-defined structure |
US10812449B1 (en) | 2018-09-19 | 2020-10-20 | Verisign | Method for generating a domain name using a learned information-rich latent space |
KR20200059703A (en) * | 2018-11-21 | 2020-05-29 | 삼성전자주식회사 | Voice recognizing method and voice recognizing appratus |
CN113841165A (en) | 2018-12-17 | 2021-12-24 | 芯成半导体(开曼)有限公司 | System and method for training artificial neural networks |
US11599773B2 (en) | 2018-12-27 | 2023-03-07 | Micron Technology, Inc. | Neural networks and systems for decoding encoded data |
US11844100B2 (en) | 2019-03-12 | 2023-12-12 | Nec Corporation | Virtual radio access network control |
KR102273153B1 (en) | 2019-04-24 | 2021-07-05 | 경희대학교 산학협력단 | Memory controller storing data in approximate momory device based on priority-based ecc, non-transitory computer-readable medium storing program code, and electronic device comprising approximate momory device and memory controller |
US10861562B1 (en) | 2019-06-24 | 2020-12-08 | SK Hynix Inc. | Deep learning based regression framework for read thresholds in a NAND flash memory |
US11088712B2 (en) | 2019-11-05 | 2021-08-10 | Western Digital Technologies, Inc. | Iterative decoder performance prediction using machine learning |
US11424764B2 (en) | 2019-11-13 | 2022-08-23 | Micron Technology, Inc. | Recurrent neural networks and systems for decoding encoded data |
US11936452B2 (en) | 2020-02-28 | 2024-03-19 | Qualcomm Incorporated | Neural network based channel state information feedback |
US20210287074A1 (en) | 2020-03-12 | 2021-09-16 | Semiconductor Components Industries, Llc | Neural network weight encoding |
US11507843B2 (en) | 2020-03-30 | 2022-11-22 | Western Digital Technologies, Inc. | Separate storage and control of static and dynamic neural network data within a non-volatile memory array |
KR20210131114A (en) | 2020-04-23 | 2021-11-02 | 한국전자통신연구원 | Method and apparatus for generating secret key based on neural network synchronization |
US20220019900A1 (en) | 2020-07-15 | 2022-01-20 | Robert Bosch Gmbh | Method and system for learning perturbation sets in machine learning |
Date | Country | Application | Publication | Status |
---|---|---|---|---|
2018-12-27 | US | US16/233,576 | US11599773B2 | Active |
2019-12-26 | KR | KR1020217023009A | KR20210096679A | Unknown |
2019-12-26 | WO | PCT/US2019/068616 | WO2020139976A1 | Unknown |
2019-12-26 | EP | EP19902559.4A | EP3903238A4 | Withdrawn |
2019-12-26 | CN | CN201980081811.XA | CN113168560A | Pending |
2020-04-03 | US | US16/839,447 | US11416735B2 | Active |
2023-03-06 | US | US18/179,317 | US20230208449A1 | Pending |
Also Published As
Publication number | Publication date |
---|---|
US20200234103A1 (en) | 2020-07-23 |
EP3903238A4 (en) | 2022-11-30 |
WO2020139976A1 (en) | 2020-07-02 |
CN113168560A (en) | 2021-07-23 |
US20200210816A1 (en) | 2020-07-02 |
US11599773B2 (en) | 2023-03-07 |
KR20210096679A (en) | 2021-08-05 |
EP3903238A1 (en) | 2021-11-03 |
US11416735B2 (en) | 2022-08-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20230208449A1 (en) | Neural networks and systems for decoding encoded data | |
US11424764B2 (en) | Recurrent neural networks and systems for decoding encoded data | |
US11973513B2 (en) | Decoders and systems for decoding encoded data using neural networks | |
CN109284822B (en) | Neural network operation device and method | |
US20200134460A1 (en) | Processing method and accelerating device | |
CN113032178B (en) | Memory controller and access method of flash memory | |
US20230163788A1 (en) | Systems for error reduction of encoded data using neural networks | |
US9251000B2 (en) | Apparatuses and methods for combining error coding and modulation schemes | |
US10379945B2 (en) | Asymmetric error correction and flash-memory rewriting using polar codes | |
CN118056355A (en) | System for estimating Bit Error Rate (BER) of encoded data using neural network | |
US9236886B1 (en) | Universal and reconfigurable QC-LDPC encoder | |
CN108347300A (en) | A kind of method, apparatus and device for encoding and decoding of adjustment Polar codes | |
WO2023124371A1 (en) | Data processing apparatus and method, and chip, computer device and storage medium | |
US20190286522A1 (en) | Ldpc decoding device, memory system including the same and method thereof | |
US11513897B2 (en) | Error correction on length-compatible polar codes for memory systems | |
CN101436864B (en) | Method and apparatus for decoding low density parity check code | |
CN110730006A (en) | LDPC code error correction method and error correction module for MCU | |
CN208691219U (en) | Ldpc decoder, storage equipment and wireless telecom equipment | |
JP2021015510A (en) | Processor for neural network, processing method for neural network, and program | |
CN108322225A (en) | LDPC interpretation methods based on mutual information and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | AS | Assignment | Owner name: MICRON TECHNOLOGY, INC., IDAHO. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: LUO, FA-LONG; CUMMINS, JAIME; SCHMITZ, TAMARA; SIGNING DATES FROM 20181220 TO 20181221; REEL/FRAME: 062897/0300 |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |